Recommender Systems for Banking and Financial Services
                    Andrea Gigli                                    Fabrizio Lillo                              Daniele Regoli
               MPS Capital Services                             Università di Bologna                      Scuola Normale Superiore
                 Viale Mazzini, 23                             Viale Quirico Filopanti 5                     Piazza dei Cavalieri 7
                 Siena, Italy 53100                              Bologna, Italy 40126                           Pisa, Italy 56126
               andrea.gigli@mpscs.it                            fabrizio.lillo@unibo.it                      daniele.regoli@sns.it

ABSTRACT
In this work we demonstrate the usefulness of Recommender Systems in the financial domain. Specifically, we investigate a dataset, made available by a major European bank, containing the purchases of a large set of investment assets by 200k investors. We also present some preliminary results of the application of network analysis via statistical validation to identify clusters of investment assets.

KEYWORDS
Finance, Collaborative Filtering, Networks, Statistical Validation

ACM Reference format:
Andrea Gigli, Fabrizio Lillo, and Daniele Regoli. 2017. Recommender Systems for Banking and Financial Services. In Proceedings of RecSys 2017 Posters, Como, Italy, August 27-31, 2 pages. Copyrights held by the authors.

RecSys 2017 Poster Proceedings, August 27-31, Como, Italy.
Copyrights held by the authors.

Introduction
Banking and Financial Services, whether provided by incumbent banks or by FinTech companies, are looking seriously at the machine learning and information retrieval fields in order to leverage the data at their disposal to provide tailored services and customized experiences to their customers.
   One of the fields of computer science that can support this effort is Recommender Systems (RecSys), which has been heavily investigated in recent years by the research community as well as by the most promising companies in the e-commerce and entertainment fields.
   In this work we show the usefulness of some RecSys algorithms in suggesting investment assets to a large panel of investors. This is done by using a large dataset provided by a major European bank and comparing the performance of three different RecSys algorithms against two baseline models in the task of suggesting investment assets.

The Dataset
The recommender system implementation and analysis were carried out on a dataset of financial investment information, made available to us by a European bank during a research collaboration program. It contains 224,885 clients, 1,288,315 transactions and information related to 7 different asset types, 23 rating levels, 6 order channels, 12 industrial sectors, 8 maturity buckets, 5 coupon types and 2 product complexity levels.
   The records span a period of twelve months and all data entries are properly hashed, anonymized and organized as a table, where each record represents a purchase defined by: execution date, user data (client, branch and account identifiers) and traded item data (type of asset, transaction currency, asset country, time to maturity, complexity, industrial sector, industrial group, industrial sub-group, rating, coupon type, trading channel, buy-sell type).
   Having no information on the traded volume per transaction nor on the client's total wealth at the time of each trade, we model the recommendation problem on the basis of the binary information purchased/not purchased item.

Implicit feedback recommender for financial investments
To better capture clients’ preferences we compare three different RecSys algorithms. All of them required testing different combinations of the features at our disposal in order to define the user and item entities. After some trials and analysis, we defined the user as a combination of client ID and bank branch, and the item as a combination of asset type, country, time to maturity, coupon type, industrial sector and rating. The results for other aggregations are qualitatively analogous.
   The first algorithm we tested is the Bayesian Personalized Ranking algorithm [3], where we use a matrix factorization method maximizing the posterior probability of the user preference structure, and tune the model’s parameters via 5-fold cross-validation. The second one is the Alternating Least Squares algorithm [1], using 30 latent factors and a regularization factor equal to 0.01. The third one is an adaptation of the Word2Vec algorithm [2], which we call Asset Embedding in the following. In this case we treated the clients’ portfolios as if they were documents and each asset as a word, and represented each asset as a vector based on the portfolios it belongs to, via continuous bag of words in a 300-dimensional space.
   The RecSys algorithms mentioned above are evaluated against two benchmark algorithms, based on the most popular items by number of users (POP.u) or by number of transactions (POP.trans), using the following metrics:
   (1) Average Accuracy of the user preference structure (see [3]);
   (2) Expected percentile ranking, as defined in [1] (the lower, the better);
   (3) the Area Under the ROC curve.
Other metrics (e.g. novelty and coverage) have been calculated but are left out for lack of space. Different train/test sampling methodologies were used:
   (1) leave-one-out: randomly removing from the training set one purchased asset for each user (who has at least 5 purchases);
   (2) leave-last-out: removing from the training set the last (in time) asset purchased by each user;
   (3) 20% level sampling: randomly removing from the training set 20% of interactions.
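
The leave-last-out methodology listed above can be illustrated with a minimal sketch. It is not the authors' implementation: the record layout `(user, item, date)`, the composite item keys and the handling of users with a single purchase are all assumptions made for illustration.

```python
from collections import defaultdict


def leave_last_out(purchases):
    """Split (user, item, date) purchase records into train and test pairs.

    For each user the most recent purchase goes to the test set and all
    earlier ones stay in train. Users with a single purchase are kept in
    train only (an assumption; the text does not say how they are handled).
    """
    by_user = defaultdict(list)
    for user, item, date in purchases:
        by_user[user].append((date, item))

    train, test = [], []
    for user, records in by_user.items():
        records.sort(key=lambda r: r[0])  # chronological order (ISO dates sort as strings)
        if len(records) < 2:
            train.extend((user, item) for _, item in records)
            continue
        *earlier, (_, last_item) = records
        train.extend((user, item) for _, item in earlier)
        test.append((user, last_item))
    return train, test


# Toy records: user = (client ID, branch); item keys are hypothetical composites.
purchases = [
    (("c1", "b1"), "bond|IT|5y", "2016-01-10"),
    (("c1", "b1"), "equity|US", "2016-03-02"),
    (("c1", "b1"), "bond|DE|2y", "2016-02-15"),
    (("c2", "b7"), "fund|EU", "2016-05-20"),
]
train, test = leave_last_out(purchases)
print(test)  # [(('c1', 'b1'), 'equity|US')]
```

With binary purchased/not purchased feedback, an item bought more than once by the same user would end up in both sets; a real pipeline would probably need to deduplicate, which this sketch leaves out.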
Figure 1: Communities (pink regions) of assets detected on the statistically filtered asset graph projection. Color denotes sector attribute.

Table 1: Evaluation metrics for leave-last-out test methodology, with variable number of most purchased items excluded from test set.

      most purchased   recommender       Average
      excluded items   system            Accuracy   Rank     AUC
            0          BPR-MF            0.961       3.878   0.970
            0          Asset Embedding   0.951       4.975   0.950
            0          ALS               0.954       4.590   0.954
            0          POP.u             0.949       5.119   0.958
            0          POP.trans         0.950       5.044   0.958
            20         BPR-MF            0.941       5.919   0.951
            20         Asset Embedding   0.906       9.389   0.903
            20         ALS               0.909       9.080   0.906
            20         POP.u             0.916       8.404   0.926
            20         POP.trans         0.917       8.286   0.927
            50         BPR-MF            0.885      11.537   0.914
            50         Asset Embedding   0.859      14.105   0.874
            50         ALS               0.917       8.259   0.913
            50         POP.u             0.825      17.472   0.819
            50         POP.trans         0.820      18.032   0.817
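
As a rough illustration of how the Rank column of Table 1 could be obtained, the sketch below computes an expected percentile ranking in the spirit of [1], specialised to binary feedback, and optionally drops the most popular items from the test set. The function names, the data layout (`scores` as nested dictionaries) and the choice of counting distinct buyers to define popularity are assumptions, not the authors' code.

```python
def expected_percentile_rank(scores, test_pairs, excluded=frozenset()):
    """Mean percentile rank (0 = top of the list, 100 = bottom) of held-out items.

    scores:     dict user -> dict item -> predicted preference (higher is better)
    test_pairs: iterable of (user, item) held-out purchases
    excluded:   items dropped from the test set (e.g. the n most popular ones)

    Assumes every user is scored on at least two items.
    """
    total, count = 0.0, 0
    for user, item in test_pairs:
        if item in excluded:
            continue
        ranked = sorted(scores[user], key=scores[user].get, reverse=True)
        position = ranked.index(item)  # 0-based position of the held-out item
        total += 100.0 * position / (len(ranked) - 1)
        count += 1
    if count == 0:
        raise ValueError("no test pairs left after exclusion")
    return total / count


def most_popular(train_pairs, n):
    """The n items bought by the largest number of distinct users."""
    buyers = {}
    for user, item in train_pairs:
        buyers.setdefault(item, set()).add(user)
    ranked = sorted(buyers, key=lambda i: len(buyers[i]), reverse=True)
    return frozenset(ranked[:n])


# Toy example with three items and two users.
scores = {
    "u1": {"A": 0.9, "B": 0.5, "C": 0.1},
    "u2": {"A": 0.8, "B": 0.2, "C": 0.6},
}
train_pairs = [("u1", "A"), ("u2", "A"), ("u1", "C")]
test_pairs = [("u1", "B"), ("u2", "A")]

print(expected_percentile_rank(scores, test_pairs))  # 25.0
print(expected_percentile_rank(scores, test_pairs, most_popular(train_pairs, 1)))  # 50.0
```

A lower value is better, and a random ordering would be expected to give about 50. In the toy data the score worsens once the most popular item is excluded, which is the kind of degradation the exclusion test is meant to expose.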

Due to the limited amount of space, we report here the results for the leave-last-out case only.
   Given that a good RecSys should give suggestions that are relevant and specific to the user and expand the user’s taste into neighboring areas, we ran the above tests after removing the n = {0, 20, 50} most popular items from the test set. In this way, if a RecSys performs well with 0 popular items removed and poorly with 50, it is reasonable to deduce that it may just be good at suggesting popular items but not items related to the specific interests of the user.
   Table 1 displays the results of our study for the leave-last-out train/test sampling case. It shows that all the RecSys we propose perform extremely well on the dataset at our disposal, in terms of both average accuracy and ranking structure (expected percentile ranking, labelled Rank, and AUC). BPR-MF is the best performer when no popular items are excluded from the test set and its advantage doesn’t reduce when we increase the number of popular purchased items removed from the test set. ALS performs similarly well, while Asset Embedding performs better than the POPs when the number of popular items excluded from the test set is at least 50, but it never beats BPR-MF and ALS.

Toward a network-based RecSys for banking and financial services
Besides the training of the recommender systems shown above and the detailed tests previously mentioned, we performed an analysis of the dataset seen as a bipartite network users → items. We implemented a statistical validation procedure [4] to get a statistically significant projection onto the item side of the network. We used this statistically filtered network:
    • to identify items’ communities: each community represents the set of items that are purchased together by users in a statistically over-expressed way with respect to a random rewiring of the bipartite network keeping fixed the assets’ degrees;
    • to identify the features that are statistically over-expressed (or under-expressed) inside communities;
    • to compute, for each user, a ranking of communities of items based on the p-values of the hyper-geometric distribution of the number of purchased objects in the different communities.
   As an example of the possible network analysis, Figure 1 shows the statistically filtered network derived by applying the validation algorithm to the bipartite network with the same specification of users and items as in Table 1, for a 1% confidence threshold. There are 4 big connected communities, 2 smaller ones (but still connected) and 6 small isolated communities. The color of nodes (i.e. of assets) refers to the different values of the sector attribute. As an example, the light-blue sector, Governmental assets, turns out to be statistically over-expressed in the rightmost community, and under-expressed in the leftmost and in the bottom one. This evidence indicates that statistically filtered investors’ decisions could be used to cluster assets: a promising starting point to build a statistically guided algorithm for recommendations. This is part of a work in progress for future publication.

ACKNOWLEDGMENTS
The authors would like to thank MPS Bank for supporting the collaboration, Francesco Mainieri (MPS) for essential support in extracting the data and Franco Maria Nardini (CNR, Pisa) for useful comments. FL and DR acknowledge support by the European Community’s H2020 Program under the scheme INFRAIA-1-2014-2015: Research Infrastructures, Grant Agreement No. 654024 SoBigData: Social Mining & Big Data Ecosystem.

REFERENCES
[1] Yifan Hu, Yehuda Koren, and Chris Volinsky. 2008. Collaborative filtering for implicit feedback datasets. In Data Mining, 2008. ICDM’08. Eighth IEEE International Conference on. IEEE, 263–272.
[2] Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient Estimation of Word Representations in Vector Space. CoRR abs/1301.3781 (2013). http://arxiv.org/abs/1301.3781
[3] Steffen Rendle, Christoph Freudenthaler, Zeno Gantner, and Lars Schmidt-Thieme. 2009. BPR: Bayesian personalized ranking from implicit feedback. In Proceedings of the twenty-fifth conference on uncertainty in artificial intelligence. AUAI Press, 452–461.
[4] Michele Tumminello, Salvatore Micciche, Fabrizio Lillo, Jyrki Piilo, and Rosario N Mantegna. 2011. Statistically validated networks in bipartite complex systems. PloS one 6, 3 (2011), e17994.