=Paper=
{{Paper
|id=Vol-2449/paper4
|storemode=property
|title=Review-Based Cross-Domain Collaborative Filtering: A Neural Framework
|pdfUrl=https://ceur-ws.org/Vol-2449/paper4.pdf
|volume=Vol-2449
|authors=Thanh-Nam Doan,Shaghayegh Sahebi
|dblpUrl=https://dblp.org/rec/conf/recsys/DoanS19
}}
==Review-Based Cross-Domain Collaborative Filtering: A Neural Framework==
Thanh-Nam Doan, Shaghayegh Sahebi. University at Albany - SUNY. {tdoan,ssahebi}@albany.edu

ComplexRec 2019, 20 September 2019, Copenhagen, Denmark. Copyright ©2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

===ABSTRACT===
Cross-domain collaborative filtering recommenders exploit data from other domains (e.g., movie ratings) to predict users’ interests in a different target domain (e.g., suggest music). Most current cross-domain recommenders focus on modeling user ratings but pay limited attention to user reviews. Additionally, due to the complexity of these recommender systems, they cannot provide any information to users to support user decisions. To address these challenges, we propose the Deep Hybrid Cross Domain (DHCD) model, a cross-domain neural framework that can simultaneously predict user ratings and provide useful information to strengthen the suggestions and support user decisions across multiple domains. Specifically, DHCD enhances the predicted ratings by jointly modeling two crucial facets of users’ product assessment: ratings and reviews. To support decisions, it models and provides natural review-like sentences across domains according to user interests and item features. This model is robust in integrating user rating and review information from more than two domains. Our extensive experiments show that DHCD can significantly outperform advanced baselines in rating prediction and review generation tasks. For rating prediction tasks, it outperforms cross-domain and single-domain collaborative filtering as well as hybrid recommender systems. Furthermore, our review generation experiments suggest an improved perplexity score and transfer of review information in DHCD.

===CCS CONCEPTS===
* Information systems → Recommender systems
* Computing methodologies → Neural networks

===KEYWORDS===
Cross-domain collaborative filtering, neural network, hybrid collaborative filtering

===1 INTRODUCTION===
Nowadays, users are overwhelmed by the number of choices online. Recommender systems are increasingly used as an essential tool to alleviate this problem. Despite improvements in recommender systems, many of them still suffer from problems, including cold-start [21] and difficulty in explaining their suggestions [26]. Moreover, collaborative filtering recommenders [11] cannot use obvious feature-based relations between users and items. Content-based approaches cannot capture deeper social or semantic similarities between users and items, nor can they suggest novel items (outside the scope of user profile features) to users [17].

Two major approaches to address some of these problems are hybrid [18] and cross-domain [4] recommender systems. Hybrid recommender systems merge content-based and collaborative filtering approaches to provide higher-quality recommendations. Some hybrid recommender systems jointly model user ratings and reviews to introduce a more sophisticated view of user interests and item features, which leads to improved recommendation results [18].

The idea behind cross-domain recommendation systems is to share useful information across two or more domains to improve recommendation results [4]. They work by transferring information from one or more source or auxiliary domains to suggest useful items in a target domain. In particular, when user history in the target domain (e.g., books) does not provide enough information about user interests, user preferences in another source domain (e.g., movies) can provide useful insights that can lead to more accurate or novel recommendations.<sup>1</sup> In addition to improving recommendation results, cross-domain recommender algorithms provide a solution to problems, such as cold-start or user profiling, in single-domain recommenders.

<sup>1</sup> While other definitions of domain exist in the literature, e.g., time-based domains, in this paper we focus on item domains (e.g., item type or category).

Both hybrid and cross-domain recommender systems have been shown to be successful in the current literature. However, a combination of the two has rarely been studied. Additionally, the problem of providing more information to users to support their decisions in cross-domain recommender systems has not been studied. Most of the current research in cross-domain recommenders focuses on collaborative filtering cross-domain approaches [19]. These approaches incorporate users’ explicit (e.g., rating) or implicit (e.g., purchase) feedback in the auxiliary domain to recommend items in the target domain. Many of these algorithms jointly model multiple domains by sharing common user latent representations across them. Collaborative filtering cross-domain recommenders, similar to their single-domain counterparts, suffer from ignoring content information. Having advanced models that are built on users’ rating or binary feedback complicates the reasoning of why a specific user may be interested in an item. Moreover, these recommender algorithms lose the explicit user-item similarities by ignoring an important source of information: user reviews.

To further enhance the performance and transparency of cross-domain recommendation systems, we propose to combine hybrid and cross-domain approaches. With this fusion, we can benefit from the strengths of both hybrid and cross-domain recommender systems: cross-domain modeling will enhance user latent features by providing extra information from other domains (especially in sparser ones), while reviews will bring another dimension for enriching user and item latent features and offer insights to increase the recommendation transparency. Therefore, merging the two will enrich content features by using review information across domains as well as enhance prediction performance.

Accordingly, we propose the Deep Hybrid Cross Domain (DHCD) recommender, which models various types of user feedback (both ratings and reviews) across multiple domains under a neural network framework. We use a neural network as a natural choice to model reviews due to its success in natural language processing and generating natural language sentences [5, 26]. In addition to using reviews for producing better-quality suggestions, DHCD can generate natural and useful reviews to support user decisions for suggested cross-domain items. By generating a review that is based on the specific user’s interests across domains and other reviews, we can help clarify why a specific item is recommended to the user. Our model shares information across domains at two levels: by sharing users’ latent representations, and by cascading them into reviews’ latent representations. It can capture non-linear user-item relationships by having a neural network framework [5]. The results and findings of this research are summarized as follows:

* We propose a neural network framework named the Deep Hybrid Cross Domain (DHCD) model, which unifies ratings and reviews of users and items across multiple domains.
* To the best of our knowledge, DHCD is the first framework that is able to automatically generate cross-domain reviews that in turn can provide decision support for cross-domain recommendations.
* We design and implement multiple experiments to evaluate DHCD’s performance on three real-world datasets. Our evaluation is performed via two main tasks, rating prediction and review generation, to answer four research questions.

===2 RELATED WORKS===
Here, we briefly review the literature on cross-domain recommendations and neural network-based collaborative filtering.

'''Cross-Domain Recommendation''' focuses on learning user preferences from data across multiple domains [4]. There are two focuses in cross-domain recommendation: collaborative filtering [3] and content-based methods [20]. In this work, we focus on collaborative filtering cross-domain recommendations. Similar to single-domain collaborative filtering, research work on cross-domain recommendation usually uses matrix factorization. For example, Pan et al. [16] propose a cross-domain recommendation system based on matrix factorization by using a coordinate system transfer method. Elkahky et al. [3] use a deep learning framework to improve the performance of cross-domain recommendation and also provide a scalable method to handle large datasets. However, not considering the reviews of items is the main limitation of these methods.

Xin et al. [24] proposed the first review-based cross-domain recommender model. They proposed a graphical model to capture the user ratings and item reviews across domains, but reviews are not used to model user latent features. Later, Song et al. proposed a joint tensor factorization model to capture both user reviews and implicit feedback on items to provide cross-domain recommendations [22]. However, it does not capture non-linearities across domains, nor does it model reviews as natural sequences of words. None of the above works generate reviews.

'''Neural Frameworks for Collaborative Filtering.''' Due to their ability to approximate non-linear relations between users and items, neural networks are rapidly growing in recommendation systems [25]. He et al. [7] propose a fusion model that combines matrix factorization and a multi-layer perceptron. Despite the efficiency of their proposed model, it does not consider reviews and is not extended to cross-domain recommendation systems. Collaborative Deep Learning (CDL) [23] overcomes the sparsity of ratings by using auxiliary information such as reviews. Treating reviews as a set of words, their model outperforms baselines, but not considering the sequential nature of words in reviews is a limitation.

'''Review Generation.''' Ni et al. [14] presented one of the first works that focuses on generating reviews along with preference prediction. Ni and McAuley [15] propose a neural network based upon an attention model to assist users in writing reviews of items. However, these works and others [1, 8] do not model the preference between users and items, nor are they extendable to cross-domain recommenders.

===3 PROPOSED FRAMEWORK===
In this section, we describe the architecture of the Deep Hybrid Cross Domain (DHCD) recommendation system in detail.

====3.1 Architecture====
DHCD predicts user ratings on items and generates user reviews on them using two main components: the rating regression component and the review generation component. In the rating regression component (RRC), user ratings on items of each domain are modeled as a function of user and item latent representations. For each user, this component learns a shared latent representation across all domains. Moreover, the shared representations of users act as a gate to transfer information across domains. The shared user latent representations, in combination with domain-specific latent item representations, predict user ratings on items. The review generation component (RGC) generates user reviews on items according to user, item, and word latent representations. In this component, the user and item representations from the rating regression component work as a guide to learn review word embeddings per user-item review. This guidance helps share word embedding information across domains. Figure 1 illustrates the architecture of our model. In the following, we present our model in more detail.

Figure 1: An overview of Deep Hybrid Cross Domain (DHCD) recommendation system. [Diagram labels: Domain 1, Domain 2; user, item; Layer 1, Layer 2, Layer Q; Rating Prediction; Rating Regression Component; Review Generation Component; example reviews "Good comic book" and "Nice jazz song".]

'''Notations.''' We model the system to include a set of users <math>U</math> and a set of item domains <math>D</math>. Each of these domains includes a set of items <math>I^d</math>, <math>d \in D</math>. For a user <math>u \in U</math> and each item <math>i \in I^d</math>, the training data may include the user’s rating on that item (<math>r_{ui}^d</math>) and the user’s review on that item (<math>s_{ui}^d</math>). Accordingly, we model training data in domain <math>d</math> as a set of tuples <math>T^d = \{(u, i, s, r) \mid u \in U, i \in I^d, s \in S^d, r \in R^d\}</math>. Given training data in all domains <math>T</math>, our goal is to simultaneously estimate user <math>u</math>’s missing rating on item <math>i</math> in domain <math>d</math> (<math>\hat{r}_{ui}^d</math>) and generate user <math>u</math>’s missing textual review on that item (<math>\hat{s}_{ui}^d</math>).

projecting from the output of <math>h_d^{q-1}</math>; i.e., <math>h_d^q = \mathrm{ReLU}(W_q^d h_d^{q-1} + b_q^d)</math>, where <math>W_q^d</math> and <math>b_q^d</math> are parameters of <math>H^d</math>’s <math>q</math>-th layer for domain <math>d</math> and <math>\mathrm{ReLU}(x) = \max(0, x)</math>. For the first layer, <math>h_d^0 = x_{ui}^d</math> is the input. We ensure full connectivity between each two adjacent hidden layers <math>h_d^q</math> and <math>h_d^{q-1}</math>. We use regression to map the output vector <math>\hat{y}_Q^d</math> of the final layer to the prediction value <math>\hat{r}_{ui}^d</math>, i.e., <math>\hat{r}_{ui}^d = w_y^d \hat{y}_Q^d + b_y^d</math>, where <math>\hat{r}_{ui}^d</math> is the predicted rating value of user <math>u</math> and item <math>i</math> in domain <math>d</math>. <math>w_y^d \in \mathbb{R}^r</math> and <math>b_y^d \in \mathbb{R}</math> are regression parameters.
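The layer recursion and linear readout described above can be illustrated with a minimal, dependency-free sketch for a single domain. It is not the authors' implementation: the function names (`relu`, `affine`, `rrc_forward`), the toy parameter values, and the choice to treat the last hidden layer's output as the vector passed to the regression step are all assumptions made for illustration, and the construction of the input vector is not specified here, so it is simply taken as given.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Matrix = List[List[float]]


def relu(x: Vector) -> Vector:
    """Element-wise ReLU(x) = max(0, x)."""
    return [max(0.0, v) for v in x]


def affine(W: Matrix, h: Vector, b: Vector) -> Vector:
    """Compute W h + b, with W stored as a list of rows."""
    return [sum(w * v for w, v in zip(row, h)) + bias for row, bias in zip(W, b)]


def rrc_forward(x: Vector, layers: Sequence[Tuple[Matrix, Vector]],
                w_y: Vector, b_y: float) -> float:
    """Apply h^q = ReLU(W_q h^{q-1} + b_q) for q = 1..Q, then a linear readout.

    Assumption: the final hidden output plays the role of the vector fed to
    the regression step r_hat = w_y . y + b_y.
    """
    h = x  # h^0 is the input vector
    for W, b in layers:
        h = relu(affine(W, h, b))
    return sum(w * v for w, v in zip(w_y, h)) + b_y


# Hypothetical hand-picked parameters for one domain (illustrative only).
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 1.0]),  # layer 1: 2 inputs -> 2 units
    ([[1.0, 1.0]], [-0.5]),                   # layer 2: 2 inputs -> 1 unit
]
prediction = rrc_forward([2.0, 1.0], layers, w_y=[2.0], b_y=0.5)
print(prediction)
```

In a cross-domain setting one would keep one such stack of layers and one readout per domain; how the parameters are learned and shared is outside the scope of this sketch.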
====3.2 Rating Regression Component (RRC)====
The main purpose of this component is to form a structure to infer user and item representations using observed user feedback on items across all domains. To do this, we model each user <math>u</math>’s interests as latent factors <math>v_u</math> and item <math>i</math>’s representation (in domain <math>d</math>) as latent factors <math>v_i^d</math>. Then, user <math>u</math>’s predicted rating on item <math>i</math>, <math>\hat{r}_{ui}^d</math>, is calculated as a function <math>g_r(\cdot)</math> of <math>v_u</math> and <math>v_i^d</math>. Formally, we have:

<math>\hat{r}_{ui}^d = g_r(v_u, v_i^d) \qquad (1)</math>

In many single-domain factorization-based recommender systems, <math>g_r</math> is modeled as the vector dot product of these latent factors plus some bias <math>b</math> [11]. Namely, <math>\hat{r}_{ui} = v_u^T v_i^d + b</math>. This specification has some limitations that make it inappropriate for our cross-domain problem. First, the simple factorization formulation is not fit for a cross-domain problem, as it does not transfer information across domains. Also, the predicted ratings in this model are assumed to be a linear combination of user and item latent factors. However, recent work suggests that using a non-linear model can enhance the representation ability of users and items, and lead to more accurate results [5, 12]. More specifically, in cross-domain recommenders, Xin et al. have shown that user ratings across different domains can have non-linear relationships with each other [24]. Finally, the above formulation requires a shared latent space between users and items. This assumption can restrict the expressive capacity of the model since it (i) limits the user and item latent vectors to have

To learn the parameters of RRC, we optimize the following regression loss function:

<math>L_r = \sum_{d \in D} \sum_{u \in U,\, i \in I^d} (r_{ui}^d - \hat{r}_{ui}^d)^2 \qquad (2)</math>

where <math>r_{ui}^d</math> is the observed rating of user <math>u</math> on item <math>i</math> in domain <math>d</math>.

====3.3 Review Generation Component (RGC)====
This component models and generates reviews for user-item pairs in a cross-domain setting. Here, we model user, item, and review word latent factors to generate natural language sentences.

Recently, recurrent neural networks with components such as long short-term memory (LSTM) and gated recurrent units (GRU) have shown high performance in natural language processing-related tasks such as image captioning and Q&A systems [5]. Inspired by their success, we adapt LSTM as a component for our review generation process.

As shown in Figure 1, for each domain <math>d</math>, we construct a separate LSTM model <math>\bar{H}^d</math> that can connect to the rating regression component. Assume <math>s_{ui}^d</math>, user <math>u</math>’s review on item <math>i</math> in domain <math>d</math>, is a sequence of words <math>t_j</math> where <math>j \le J_{ui}</math> (<math>J_{ui}</math> is the number of words in this review). Given a text sequence <math>t_1, t_2, \ldots, t_{J_{ui}}</math>, the LSTM network will update its hidden state parameters (<math>\bar{h}_j^d</math>), in step <math>j</math>, according to <math>t_j</math> and the previous step’s hidden state (<math>\bar{h}_{j-1}^d</math>). Subsequently, the network will predict <math>t_{j+1}</math>, step <math>(j + 1)</math>’s word, using all of its previous words (t