Multi Cross Domain Recommendation Using Item Embedding And Canonical Correlation Analysis

Masahiro Kazama
Recruit Technologies Co., Ltd.
Tokyo, Japan
masahiro_kazama@r.recruit.co.jp

István Varga
Recruit Technologies Co., Ltd.
Tokyo, Japan
vistvan@r.recruit.co.jp

ABSTRACT
In a multi-service environment it is crucial to be able to leverage user behavior from one or more domains to create personalized recommendations in another domain. In this paper, we present a robust transfer learning approach that captures user behavior across multiple domains. First, we vectorize users and items in each domain independently. Second, using a handful of common users across domain pairs, we project each domain vector space into a common vector space using canonical correlation analysis (CCA). Recommendations can then be made by recommending the items, from any domain, that are closest to the user's vector in the common space. We also examine which kinds of domain combinations work well.

KEYWORDS
Recommender Systems, Canonical Correlation Analysis, Transfer Learning, Item Embedding

ACM Reference format:
Masahiro Kazama and István Varga. 2017. Multi Cross Domain Recommendation Using Item Embedding And Canonical Correlation Analysis. RecSys '17 Poster Proceedings, Como, Italy, August 27–31, 2017, 2 pages.

1 INTRODUCTION
In recent years the ever-increasing ubiquity of e-commerce has made it possible to purchase virtually any product or service we could desire. Alongside this growth, e-commerce businesses are increasingly interconnected by means of common IDs. With such common IDs, services gain visibility, and consumers can receive personalized product recommendations in a domain that is new to them. In this paper we propose a simple and robust transfer learning method for cross domain recommendation that leverages canonical correlation analysis (CCA) to represent multiple domains in a single vector space. All users and items are represented as vectors in the common space; therefore, items from any domain can be recommended to a user by calculating the similarity between the user's vector and the items' vectors in the common space. Figure 1 shows the overview of our research.

Figure 1: Overview of this paper: Users and items are vectorized in each domain independently and those vectors are mapped into a common space by canonical correlation analysis (CCA). An object in the figure denotes a user or an item.

Our contributions in this paper are as follows:
• We applied an item embedding technique and CCA to multi cross domain recommendation in a simple and robust way.
• We examined which kinds of domain combinations work well.

2 RELATED WORK
There are two main research areas related to our proposal. CCA [2] is actively explored in the field of multimodal representation. Numerous studies use CCA to project embeddings of various types of data (e.g., image, text), obtained by deep learning and word2vec, into a common vector space in which subsequent tasks can be performed [1, 4]. The other relevant area is recommender systems, where some studies have proposed recommendation methods using CCA [5]. Our proposal retains its simplicity and robustness with three or more domains.

3 PROPOSED METHOD
Our approach consists of two steps. First, we calculate the vector representation of each domain using word2vec. Word2vec is a natural language processing method that generates semantic representations of words [3], but it can also be applied to rating data in the following way: by treating an item as a word and the sequence of items evaluated favorably by a user (i.e., rated with 4 or 5 stars) as a sentence, the resulting sentences can be fed into word2vec to obtain item embeddings. We used the skip-gram model with hierarchical softmax. Next, we define the user vector as the average of the vectors of the items favorably evaluated by the user. In this step, the vector representations of each domain are calculated independently, so the computation is easy to parallelize.

As a second step, we project each vector space into a common space using CCA. Let us illustrate this projection with three domains. Note that it can be easily applied to N (> 3) domains. Let the vectors of a user and an item in domain $i$ (= 1, 2, 3) be $x_i$ and $y_i$. Let $C_{ij}$ be the covariance matrix of $x_i$ and $x_j$. Note that it is not required to have users who are active in all domains: CCA can be calculated as long as there are some common users in each domain pair. By solving the following eigenvalue equation, the transformation vector $w = (w_1^T, w_2^T, w_3^T)^T$ can be obtained.

\[
\begin{pmatrix}
0 & C_{12} & C_{13} \\
C_{21} & 0 & C_{23} \\
C_{31} & C_{32} & 0
\end{pmatrix} w
= \lambda
\begin{pmatrix}
C_{11} & 0 & 0 \\
0 & C_{22} & 0 \\
0 & 0 & C_{33}
\end{pmatrix} w
\]

Using the transformation vector $w$, both items and users from each domain can be projected into the common vector space. As a result, personalized recommendations can be made by simply recommending the items closest to the user's vector in the common vector space. Even if the user is active in only one domain, we can recommend items from the other domains. We used cosine similarity as the similarity measure.

4 EXPERIMENTS
We investigated which domain characteristics can be used to improve recommendation performance. As a baseline, two domains are projected into one common vector space using CCA, fitted on 80% of the common users. The remaining 20% of the common users were held out for testing: based on a user's actions in one domain, we predict items in the other domain and compare the recommended items with the user's actual actions.

Next we added a third domain and calculated the common vector space with these three domains using CCA. Our hypothesis is that if the new domain correlates highly with at least one of the already existing domains, the new domain will enrich the common vector space, thus improving performance. However, if the new domain is less similar to the already existing ones, it will mostly introduce noise, meaning performance will either drop or not improve significantly.

For the experiments we used the Yelp rating dataset¹, in which users rated various items. After a basic sanity check (i.e., a removal of multiple categories), we conducted experiments using the top 5 categories: Restaurants, Shopping, Food, Beauty&Spas, and Health&Medical.

Table 1: Result of recall@50. The baseline uses Shopping and Food only. We ran each experiment 5 times and report the average and standard deviation.

Model                 Recall@50 (%)
Baseline              12.8 ± 0.4
Add Restaurants       14.8 ± 0.7
Add Beauty&Spas       12.9 ± 0.7
Add Health&Medical    11.3 ± 0.6

Table 1 shows the item prediction performance in the Food category, based on user behavior from the Shopping category, when each of the additional categories (i.e., Beauty&Spas, Health&Medical, and Restaurants) was added. Baseline performance (recall@50 = 12.8%) remained essentially unchanged with the addition of Beauty&Spas (12.9%) and decreased by 1.5 points with the addition of Health&Medical (11.3%). However, with the addition of Restaurants, we achieved an improvement of 2.0 points (14.8%), showing that information from this new domain strengthened the relationship between our original two categories, Food and Shopping.

Table 2: Correlation between each domain. Correlation is the average of the top 5 canonical correlations between the users' vectors in one domain and the other domain.

            Restaurants   Beauty&Spas   Health&Medical
Shopping    0.92          0.74          0.76
Food        0.96          0.80          0.69

Table 2 shows the similarity between our original two domains and the additional domains. Similarity is calculated as the average of the top 5 canonical correlations obtained with CCA. We can observe that the original domains have a relatively low correlation with Beauty&Spas and Health&Medical. On the other hand, Restaurants displays a high correlation with both Food and Shopping.

As a result, we can assume that recommendation performance can be increased by adding a domain that is highly correlated with an already participating domain, but that there is likely to be an adverse effect if a new domain that does not correlate well is introduced.

¹ https://www.yelp.com/dataset_challenge

5 CONCLUSIONS
We proposed a simple and robust recommendation method that works with multiple domains. We examined which kinds of domain combinations increase recommendation performance. As a result, we found that if we add a domain that is highly correlated (e.g., based on the top N canonical correlations calculated using CCA) with an already added domain, the recommendation performance increases.

REFERENCES
[1] Ruka Funaki and Hideki Nakayama. 2015. Image-Mediated Learning for Zero-Shot Cross-Lingual Document Retrieval. In Proceedings of EMNLP 2015. 585–590. http://aclweb.org/anthology/D/D15/D15-1070.pdf
[2] Jon R Kettenring. 1971. Canonical analysis of several sets of variables. Biometrika 58, 3 (1971), 433–451.
[3] Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S Corrado, and Jeff Dean. 2013. Distributed representations of words and phrases and their compositionality. In Proceedings of NIPS 2013. 3111–3119.
[4] Nikhil Rasiwasia, Jose Costa Pereira, Emanuele Coviello, Gabriel Doyle, Gert RG Lanckriet, Roger Levy, and Nuno Vasconcelos. 2010. A new approach to cross-modal multimedia retrieval. In Proceedings of ACMMM 2010. ACM, 251–260.
[5] Shaghayegh Sahebi and Peter Brusilovsky. 2015. It Takes Two to Tango: An Exploration of Domain Pairs for Cross-Domain Collaborative Filtering. In Proceedings of RecSys 2015. 131–138. https://doi.org/10.1145/2792838.2800188
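
The projection and recommendation steps of the proposed method (Section 3) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `multiset_cca` and `recommend`, the ridge parameter `reg`, and the synthetic data are all hypothetical. For brevity the sketch assumes every view has its rows aligned to the same users, whereas the paper only requires common users per domain pair, and the word2vec embedding step is replaced by random synthetic vectors.

```python
import numpy as np


def multiset_cca(views, n_components=2, reg=1e-3):
    """Solve A w = lambda B w for row-aligned user views; return per-domain maps."""
    centered = [v - v.mean(axis=0) for v in views]
    n_users = centered[0].shape[0]
    dims = [v.shape[1] for v in centered]
    offsets = np.concatenate(([0], np.cumsum(dims)))
    total = int(offsets[-1])
    A = np.zeros((total, total))  # off-diagonal blocks C_ij (i != j)
    B = np.zeros((total, total))  # diagonal blocks C_ii
    for i, xi in enumerate(centered):
        for j, xj in enumerate(centered):
            cov = xi.T @ xj / (n_users - 1)
            rows = slice(offsets[i], offsets[i + 1])
            cols = slice(offsets[j], offsets[j + 1])
            if i == j:
                # Small ridge term for numerical stability (an assumption,
                # not stated in the paper).
                B[rows, cols] = cov + reg * np.eye(dims[i])
            else:
                A[rows, cols] = cov
    # Whiten with B^{-1/2} so the generalized problem becomes a symmetric one.
    b_vals, b_vecs = np.linalg.eigh(B)
    b_inv_sqrt = b_vecs @ np.diag(b_vals ** -0.5) @ b_vecs.T
    vals, vecs = np.linalg.eigh(b_inv_sqrt @ A @ b_inv_sqrt)
    top = np.argsort(vals)[::-1][:n_components]
    W = b_inv_sqrt @ vecs[:, top]
    maps = [W[offsets[i]:offsets[i + 1]] for i in range(len(views))]
    return maps, vals[top]


def recommend(user_vec, user_map, item_vecs, item_map, k=3):
    """Rank target-domain items by cosine similarity in the common space."""
    u = user_vec @ user_map
    items = item_vecs @ item_map
    norms = np.linalg.norm(items, axis=1) * np.linalg.norm(u) + 1e-12
    return np.argsort(-(items @ u) / norms)[:k]


# Synthetic stand-in for three domains that share a 2-d latent "taste" space.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixes = [rng.normal(size=(2, d)) for d in (4, 5, 3)]
views = [latent @ m + 0.05 * rng.normal(size=(200, m.shape[1])) for m in mixes]
maps, canon = multiset_cca(views, n_components=2)

# A user observed only in domain 0 receives a ranking of items from domain 2.
taste = np.array([1.0, 0.5])
user_vec = taste @ mixes[0]
item_latent = np.array([taste, -taste, [-0.5, 1.0]])
item_vecs = item_latent @ mixes[2]
ranking = recommend(user_vec, maps[0], item_vecs, maps[2], k=3)
print(ranking, canon)
```

In this sketch the block matrices `A` and `B` correspond to the left-hand and right-hand sides of the eigenvalue equation, and each entry of `maps` plays the role of one block $w_i$ of the transformation vector. Because items and users of a domain live in the same embedding space, the same per-domain map is applied to both, which is what allows a user from one domain to be compared with items from another.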