Multi Cross Domain Recommendation Using Item Embedding
                 And Canonical Correlation Analysis
                               Masahiro Kazama                                                   István Varga
                       Recruit Technologies Co., Ltd.                                   Recruit Technologies Co., Ltd.
                               Tokyo, Japan                                                     Tokyo, Japan
                      masahiro_kazama@r.recruit.co.jp                                      vistvan@r.recruit.co.jp

ABSTRACT
In a multi-service environment it is crucial to be able to leverage
user behavior from one or more domains to create personalized
recommendations in another domain. In this paper, we present a
robust transfer learning approach that captures user behavior
across multiple domains. First, we vectorize users and items in
each domain independently. Second, using a handful of common users
across domain pairs, we project each domain’s vector space into a
common vector space using canonical correlation analysis (CCA).
Recommendations can then be made by selecting the items, from any
domain, that are closest to the user’s vector in the common space.
We also examine which kinds of domain combinations work well.

KEYWORDS
Recommender Systems, Canonical Correlation Analysis, Transfer
Learning, Item Embedding

ACM Reference format:
Masahiro Kazama and István Varga. 2017. Multi Cross Domain
Recommendation Using Item Embedding And Canonical Correlation
Analysis. RecSys ’17 Poster Proceedings, Como, Italy, August 27–31,
2017, 2 pages.

1    INTRODUCTION
In recent years the ever-increasing ubiquity of e-commerce is
allowing us to purchase virtually every product or service that we
could desire. With this growth, one can also observe a trend in the
interconnection of e-commerce businesses by means of common IDs.
With such common IDs, services are not only getting increased
visibility, but consumers are also receiving personalized product
recommendations in a new domain. In this paper we propose a simple
and robust transfer learning method for cross domain recommendation
that leverages canonical correlation analysis (CCA) to represent
multiple domains in a single vector space. All users and items are
represented as vectors in the common space; therefore, items from
any domain can be recommended to users by calculating the similarity
between the users’ vectors and the items’ vectors in the common
space. Figure 1 shows the overview of our research.
   Our contributions in this paper are as follows:
     • We applied an item embedding technique and CCA to multi
       cross domain recommendation in a simple and robust way.
     • We examined which kinds of domain combinations work well.

Figure 1: Overview of this paper: Users and items are vectorized in
each domain independently and those vectors are mapped into a
common space by canonical correlation analysis (CCA). An object in
the figure denotes a user or an item.

2    RELATED WORK
There are two main research areas related to our proposal.
   CCA [2] is actively explored in the field of multimodal
representation. In numerous studies, embeddings of various types of
data (e.g., image, text), obtained by deep learning or word2vec, are
projected with CCA into a common vector space, where various
subsequent tasks can be performed [1, 4].
   Another relevant area is recommender systems, where some studies
have proposed recommendation methods using CCA [5]. Our proposal
retains its simplicity and robustness even with three or more
domains.

3    PROPOSED METHOD
Our approach consists of two steps. First, we calculate the vector
representation of each domain using word2vec. Word2vec is a natural
language processing method that generates semantic representations
of words [3], but it can also be used for rating data in the
following way: by considering an item as a word and the sequence of
items evaluated favorably by a user (i.e., rated with 4 or 5 stars)
as a sentence, the resulting sentences can be fed into word2vec to
obtain item embeddings. We used the skip-gram model with
hierarchical softmax. Next, we define the user vector as the average
of the vectors of the items favorably evaluated by the user. In this
step, the vector representations of each domain are calculated
independently, so it is easy to parallelize them.
   As a second step, we project each vector space into a common
space using CCA. Let us illustrate this projection with three
domains. Note that it can be easily applied to N (> 3) domains. Let the
vector of user and item in domain $i$ ($i = 1, 2, 3$) be $x_i$ and
$y_i$. Let $C_{ij}$ be the covariance matrix of $x_i$ and $x_j$.
Note that it is not required to have users who are active in all
domains. CCA can be calculated as long as each domain pair has some
common users. By solving the following eigenvalue equation, the
transformation vector $w = (w_1^T, w_2^T, w_3^T)^T$ can be obtained.

\[
\begin{pmatrix}
0      & C_{12} & C_{13} \\
C_{21} & 0      & C_{23} \\
C_{31} & C_{32} & 0
\end{pmatrix} w
= \lambda
\begin{pmatrix}
C_{11} & 0      & 0      \\
0      & C_{22} & 0      \\
0      & 0      & C_{33}
\end{pmatrix} w
\]

   Using the transformation vector $w$, both items and users from
each domain can be projected to the common vector space. As a
result, personalized recommendations can be performed by simply
recommending the items closest to the user’s vector in the common
vector space. Even if the user is active in only one domain, we can
recommend the other domains’ items. We employed cosine similarity
as the similarity measure.

Table 1: Recall@50 results. The baseline uses Shopping and Food
only. We ran each experiment 5 times and report the average and
standard deviation.

    Model                   Recall@50 (%)
    Baseline                12.8 ± 0.4
    Add Restaurants         14.8 ± 0.7
    Add Beauty&Spas         12.9 ± 0.7
    Add Health&Medical      11.3 ± 0.6

Table 2: Correlation between domains. The correlation is the average
of the top 5 canonical correlations between the users’ vectors in
one domain and those in the other domain.

                  Restaurants    Beauty&Spas    Health&Medical
    Shopping      0.92           0.74           0.76
    Food          0.96           0.80           0.69
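
The projection and recommendation step of Section 3 can be sketched in Python with NumPy as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the function names (`fit_multiset_cca`, `project`, `recommend`), the small ridge term `reg`, the mean-centering, and the use of several leading eigenvectors as projection directions are illustrative choices. For brevity the sketch also assumes that the same common users appear as rows in every domain, whereas the method itself only requires common users for each domain pair.

```python
import numpy as np


def fit_multiset_cca(user_views, n_components=2, reg=1e-6):
    """Solve A w = lambda * B w, where A holds the cross-domain covariances
    C_ij (i != j) and B holds the within-domain covariances C_ii.

    Assumption: row r of every array in `user_views` is the same common user.
    """
    means = [v.mean(axis=0) for v in user_views]
    centered = [v - m for v, m in zip(user_views, means)]
    n_users = centered[0].shape[0]
    dims = [v.shape[1] for v in centered]
    offsets = np.concatenate(([0], np.cumsum(dims)))
    total = int(offsets[-1])

    a = np.zeros((total, total))
    b = np.zeros((total, total))
    for i, x_i in enumerate(centered):
        rows = slice(offsets[i], offsets[i + 1])
        for j, x_j in enumerate(centered):
            cols = slice(offsets[j], offsets[j + 1])
            cov = x_i.T @ x_j / (n_users - 1)
            if i == j:
                # Ridge term (an assumption) keeps B positive definite.
                b[rows, cols] = cov + reg * np.eye(dims[i])
            else:
                a[rows, cols] = cov

    # Reduce the generalized problem to a symmetric one via B = L L^T.
    chol_inv = np.linalg.inv(np.linalg.cholesky(b))
    m = chol_inv @ a @ chol_inv.T
    eigvals, eigvecs = np.linalg.eigh((m + m.T) / 2.0)
    top = np.argsort(eigvals)[::-1][:n_components]
    w = chol_inv.T @ eigvecs[:, top]
    # Split w into one projection block per domain.
    weights = [w[offsets[i]:offsets[i + 1]] for i in range(len(dims))]
    return weights, means


def project(vectors, domain, weights, means):
    """Map user or item vectors of one domain into the common space."""
    return (np.atleast_2d(vectors) - means[domain]) @ weights[domain]


def recommend(user_vec, user_domain, item_vecs, item_domain,
              weights, means, k=5):
    """Return indices of the k items closest to the user by cosine similarity."""
    u = project(user_vec, user_domain, weights, means)[0]
    items = project(item_vecs, item_domain, weights, means)
    norms = np.linalg.norm(items, axis=1) * np.linalg.norm(u) + 1e-12
    sims = items @ u / norms
    return np.argsort(-sims)[:k]
```

With per-domain user vectors stacked as `[users_shopping, users_food, users_restaurants]` (hypothetical names), `fit_multiset_cca` returns one projection block per domain, and `recommend` can then rank the items of any domain for a user who is active in only one of them.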

4    EXPERIMENTS
We investigated the domain characteristics that can be used to
improve recommendation performance. As a baseline, two domains are
projected into one common vector space using CCA fitted on 80% of
the common users. The remaining 20% of common users were held out
for testing: based on user actions in one domain, we predict the
items in the other domain and compare the recommended items with
the actual actions.
   Next we added a third domain and calculated the common vector
space with these three domains using CCA. Our hypothesis is that if
the new domain correlates highly with at least one of the already
existing domains, the new domain will enrich the common vector
space, thus improving performance. However, if the new domain is
less similar to the existing ones, it will mostly introduce noise,
meaning performance will either drop or not improve significantly.
   For the experiments we used the Yelp rating dataset (footnote 1),
in which users rated various items. After a basic sanity check
(i.e., a removal of multiple categories), we conducted experiments
using the top 5 categories: Restaurants, Shopping, Food,
Beauty&Spas, and Health&Medical.
   Table 1 shows the item prediction performance in the Food
category, based on user behavior from the Shopping category, when
the additional categories (i.e., Beauty&Spas, Health&Medical, and
Restaurants) were added. Baseline performance (recall@50 = 12.8%)
did not increase meaningfully with the addition of Beauty&Spas
(12.9%) and even decreased by 1.5 points with the addition of
Health&Medical (11.3%). However, with the addition of Restaurants,
we achieved an improvement of 2.0 points (14.8%), indicating that
information from this new domain strengthened the relationship
between our original two categories, Food and Shopping.
   Table 2 shows the similarity between our original two domains
and the additional domains. Similarity is calculated as the average
of the top 5 canonical correlations obtained with CCA. We can
observe that the original domains have a relatively low correlation
with Beauty&Spas and Health&Medical. On the other hand, Restaurants
displays a high correlation with both Food and Shopping.
   As a result, we can assume that recommendation performance can
be increased by adding a domain that is highly correlated with an
already participating domain, but there is likely to be an adverse
effect if a new domain that does not correlate well is introduced.

5    CONCLUSIONS
We proposed a simple and robust recommendation method that works
with multiple domains. We examined which kinds of domain
combinations increase recommendation performance. As a result, we
found that if we add a domain that is highly correlated (e.g., based
on the top N canonical correlations calculated using CCA) with an
already added domain, the recommendation performance increases.

REFERENCES
[1] Ruka Funaki and Hideki Nakayama. 2015. Image-Mediated Learning
    for Zero-Shot Cross-Lingual Document Retrieval. In Proceedings
    of EMNLP 2015. 585–590.
    http://aclweb.org/anthology/D/D15/D15-1070.pdf
[2] Jon R Kettenring. 1971. Canonical analysis of several sets of
    variables. Biometrika 58, 3 (1971), 433–451.
[3] Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S Corrado, and
    Jeff Dean. 2013. Distributed representations of words and
    phrases and their compositionality. In Proceedings of NIPS
    2013. 3111–3119.
[4] Nikhil Rasiwasia, Jose Costa Pereira, Emanuele Coviello, Gabriel
    Doyle, Gert RG Lanckriet, Roger Levy, and Nuno Vasconcelos.
    2010. A new approach to cross-modal multimedia retrieval. In
    Proceedings of ACMMM 2010. ACM, 251–260.
[5] Shaghayegh Sahebi and Peter Brusilovsky. 2015. It Takes Two to
    Tango: An Exploration of Domain Pairs for Cross-Domain
    Collaborative Filtering. In Proceedings of RecSys 2015. 131–138.
    https://doi.org/10.1145/2792838.2800188
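
The domain similarity reported in Table 2 (the average of the top 5 canonical correlations between the users’ vectors of two domains) can be sketched as follows. This is one standard way to obtain canonical correlations, namely whitening each domain with a Cholesky factor and taking singular values. The function names, the ridge term `reg`, and the assumption that row r of both arrays belongs to the same common user are illustrative, not details stated in the paper.

```python
import numpy as np


def canonical_correlations(x, y, reg=1e-6):
    """Canonical correlations between two sets of user vectors.

    Assumption: row r of `x` and row r of `y` describe the same common user.
    """
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    n_users = x.shape[0]
    c_xx = x.T @ x / (n_users - 1) + reg * np.eye(x.shape[1])
    c_yy = y.T @ y / (n_users - 1) + reg * np.eye(y.shape[1])
    c_xy = x.T @ y / (n_users - 1)

    # Whiten both sides: T = Lx^{-1} C_xy Ly^{-T}, with C = L L^T.
    l_x = np.linalg.cholesky(c_xx)
    l_y = np.linalg.cholesky(c_yy)
    t = np.linalg.solve(l_x, c_xy) @ np.linalg.inv(l_y).T

    # Singular values of T are the canonical correlations (descending).
    singular_values = np.linalg.svd(t, compute_uv=False)
    return np.clip(singular_values, 0.0, 1.0)


def domain_similarity(x, y, top=5):
    """Average of the `top` largest canonical correlations."""
    return float(canonical_correlations(x, y)[:top].mean())
```

A candidate third domain could be screened with `domain_similarity` against each existing domain before it is added to the common space, in line with the observation that highly correlated domains help and weakly correlated ones may hurt.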

1 https://www.yelp.com/dataset_challenge
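
The recall@50 metric of Table 1 can be sketched in plain Python as follows. The paper does not spell out its exact definition, so this sketch assumes the common one: the fraction of a held-out user’s actual items in the target domain that appear among the top k recommended items, averaged over test users. The function names and the dictionary-based inputs are hypothetical.

```python
def recall_at_k(recommended, relevant, k=50):
    """Fraction of `relevant` items found in the first k recommendations."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = len(set(recommended[:k]) & relevant)
    return hits / len(relevant)


def mean_recall_at_k(recs_by_user, truth_by_user, k=50):
    """Average recall@k over held-out users that have at least one true item."""
    scores = [
        recall_at_k(recs_by_user.get(user, []), items, k)
        for user, items in truth_by_user.items()
        if items
    ]
    return sum(scores) / len(scores) if scores else 0.0
```

Here `recs_by_user` maps each held-out user to a ranked list of recommended item IDs in the target domain, and `truth_by_user` maps the same user to the items they actually acted on.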