Reshaping Graph Recommendation with Edge Graph Collaborative Filtering and Customer Reviews

Vito Walter Anelli1, Yashar Deldjoo1, Tommaso Di Noia1, Eugenio Di Sciascio1, Antonio Ferrara1, Daniele Malitesta1,*, and Claudio Pomo1,*
1 Politecnico di Bari, via Orabona, 4, 70126 Bari, Italy

Abstract
Graph collaborative filtering approaches learn refined users' and items' node representations by iteratively aggregating the informative content (called messages) coming from neighbor nodes into each ego node. Unfortunately, not all interactions (i.e., graph edges) may be equally important to the users and items involved. As this indiscriminate message aggregation leads to multi-hop representation errors, recent strategies have used attention mechanisms to weight neighbors' importance to the ego node. Despite their success, such solutions seem to disregard the potentially critical role users' reviews may play in this weighting process. Reviews convey the multi-faceted user's opinion about items and provide a fundamental tool to group like-minded customers. In this work, we first formally show the causes of the node representation error in graph collaborative filtering and demonstrate how existing neighborhood weighting procedures (e.g., attention mechanisms) may alleviate the issue at the expense of limited hop exploration. Second, we correct the representation error through an additional graph network where we enrich graph edge embeddings with opinion-aware review embeddings to smooth each neighbor node's importance on its ego node. We call our solution Edge Graph Collaborative Filtering (EGCF). Extensive experiments on three e-commerce datasets show that EGCF competes successfully with traditional, graph-based, and review-based approaches on accuracy and beyond-accuracy objectives, while a study on the number of explored hops justifies the adopted configuration for EGCF. Code and datasets are available at: https://github.com/sisinflab/Edge-Graph-Collaborative-Filtering.
Keywords
Collaborative Filtering, Recommendation, Graph Convolutional Networks, Reviews

DL4SR'22: Workshop on Deep Learning for Search and Recommendation, co-located with the 31st ACM International Conference on Information and Knowledge Management (CIKM), October 17-21, 2022, Atlanta, USA.
* Authors are listed in alphabetical order. Corresponding authors: Daniele Malitesta (daniele.malitesta@poliba.it) and Claudio Pomo (claudio.pomo@poliba.it).
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073.

1. Introduction

Recommender systems constitute the backbone of several online platforms (e.g., Amazon), offering consumers lists of products that might meet their needs and tastes. Recommendation algorithms are traditionally designed and trained to find preference patterns in recorded user-item interactions. Optionally, this learning process may be enriched with additional informative data constantly updated on those platforms, which may draw the customer's attention towards items' characteristics (e.g., product images) or provide a tool to share opinions about purchased items and guide other customers in their decision-making process (e.g., reviews).

[Figure 1: A subset of users, items, and reviews users wrote about items, along with the expressed ratings (in the range 1-5). Despite being connected to the same items, users u1-u2 and users u1-u3 do not share similar opinions about the interacted items. Sample reviews: "Very comfortable. They also wear well for an active lifestyle. Love them." / "They were too narrow and hurt my feet so I returned them." / "Nothing really wrong with the belt just wider and thicker than I like. Good quality." / "Great belt, nice color and holding up very well."]

Collaborative filtering (CF) [1], one of the most prominent recommendation paradigms in recent years, promotes the intuition of similar users interacting with similar items. CF-based models usually map users and items to embeddings in the latent space, and learn to predict user interactions by optimizing an objective function that combines these embeddings linearly (e.g., inner product [2]) or non-linearly (e.g., neural networks [3] and probabilistic models [4]). While focusing on improving the user-item prediction step, such techniques have long underestimated the importance of deriving informative features to describe users and items suitably.

Recently, graph convolutional networks (GCNs) [5] have taken over CF-based recommendation, thanks to their capability of mining high-order user-item relationships. Unlike prior techniques, these models explicitly incorporate user and item relationships into their embedding representations. Concretely, the embedding of each node (defined as the ego node) is refined by aggregating its neighbors' node embeddings (whose contributions are called messages). This step is repeated iteratively to propagate the collaborative signal over multiple hops. These models are becoming the de facto standard in personalized recommendation, reaching remarkable performance as in the pioneering works [6, 7] and, more recently, in the solutions [8, 9, 10].

Despite its success, the message-passing pattern may, by design, still present some limitations. An argument could be made that not all user-item interactions (i.e., graph edges) have the same relative importance. To clarify this, consider the motivating scenario in Figure 1, where we depict a subset of users and items from a real-world e-commerce platform (i.e., the Amazon catalog) and enrich their interactions with ratings and reviews. Both users u1 and u2 interacted with item i1, suggesting that they might share similar interests and preferences. However, a careful analysis of the corresponding reviews reveals that their opinions about item i1 are opposite (the expressed ratings are 5 and 2, respectively). Following a similar reasoning schema, users u1 and u3 have both interacted with item i2, but their comments, while generally similar (the item is rated 3 and 5, respectively), show slight shades of disagreement (i.e., u1 is not completely satisfied with the belt size). As the message-passing pattern indiscriminately aggregates the neighbor nodes at multiple hops, the node representation of u1 is ultimately influenced by the representations of both u2 and u3 after two propagation hops. In the long term, such behavior may lead to what we define as a node representation error.

Weighting the importance of the neighborhood while aggregating the incoming messages into the ego node is among the most prominent solutions to the abovementioned issue. Following the direction of [11], other popular and recent works in recommendation such as [12, 13, 14, 15] leverage attention mechanisms (i.e., a neural network) to perform the weighting procedure. Even if these models have widely demonstrated superior recommendation accuracy, they are still affected by oversmoothing, the phenomenon by which node embedded representations tend to get closer and closer in the latent space after multiple propagation hops, thus flattening the existing differences in the neighborhood [16, 17]. For this reason, attention-based approaches usually propagate messages for only one or two hops, but this does not help access wider portions of the user-item graph.

In this respect, we believe attention-based techniques generally disregard other potential sources of information (e.g., user-generated reviews) whose contribution may positively impact the neighborhood weighting process. Opinions and comments about interacted items constitute the basis on which like-minded users gather on online platforms, as they promote the discovery of novel and diverse items from the catalog.

In this work, we first formally define the problem of the node representation error in graph collaborative filtering. After that, we show how existing weighting techniques (such as attention mechanisms) may alleviate the described issue at the expense of limiting the hop exploration depth to reduce the effect of oversmoothing. Thus, to address such a drawback, we propose a lighter weighting procedure that exploits the informative content extracted from reviews (i.e., opinions and comments about interacted items) to enhance graph edge representation. Such edge-enriched features are eventually used to derive the similarity between the ego node and its neighbors, which we re-interpret as the importance of the neighbor node on the ego node. Our proposed weighting procedure is applied to a GCN acting as the correction to another traditional (but error-affected) GCN. We call our solution Edge Graph Collaborative Filtering (EGCF).

After formalizing the theoretical basis for EGCF and its rationale, we assess its efficacy on three popular product categories from the Amazon catalog [18]. Given their similar intuitions and rationale to EGCF, we compare the method with four families of CF-based recommendation, i.e., traditional, review-based [19, 20], and graph-based approaches (both leveraging attention mechanisms and not). We seek to answer the following research questions about our proposed approach:

• RQ1. Can the correction to the node representation error help EGCF produce more accurate recommendations than state-of-the-art baselines?
• RQ2. Considering the high impact that novel and diverse recommendation lists may have on both users and companies, how effective is EGCF when evaluated on beyond-accuracy metrics, given its strategy for neighborhood exploration?
• RQ3. What is the effect of changing the number of exploration hops on recommendation performance, and how can we justify such behaviors for the adopted architecture?

The extensive experimental evaluation shows that the correction to the node representation error and the possibility of propagating messages across multiple hops permit EGCF to outperform state-of-the-art baselines on accuracy and beyond-accuracy metrics. Finally, the study on the number of propagation hops proves the soundness of our proposed architectural configuration while suggesting interesting directions for future work.

2. Related Work

Graph-based recommendation. The approach proposed in [21] is the first attempt to address the recommendation task through a graph-based architecture. The authors implement a graph autoencoder that labels its edges with users' ratings to perform link prediction. Ying et al. [6] design a graph convolutional network for web-scale recommendation to produce high-quality image recommendations for the Pinterest platform, efficiently exploiting random walks and items' multimodal side information.
Wang et al. [7] present neural graph collaborative filtering (NGCF), whose propagation layer aggregates the messages from the neighborhood considering the similarity between each neighbor node and its ego node. While outperforming previous state-of-the-art solutions, NGCF (and GCNs more generally) shows limitations later addressed by He et al. [8]. Their idea is to lighten the GCN's traditional layer structure and reach superior accuracy by removing non-linearities and node embedding transformations from the propagation layer (LightGCN). The latest approaches try to take a step beyond the LightGCN strategy by allowing theoretically unlimited propagation layers [9] and by revisiting the concepts of graph convolution for recommendation and node embedding smoothness under the lens of graph signal processing [10].

While aggregating messages from neighbor nodes into the ego node, not all received contributions have the same importance. The pioneering work by Velickovic et al. [11], called graph attention network (GAT), takes advantage of attention mechanisms to weight the different influences of neighbor nodes on the ego node. Inspired by this rationale, several recent works in recommendation seek to assess the relative importance of interacted items on the users involved in those interactions. In the last few years, recommendation tasks such as session-based recommendation [22, 23, 12] and sequential recommendation [13, 24] have been widely addressed by using attention mechanisms on graphs. Attention mechanisms may also be beneficial when the informative content conveyed by the bipartite user-item graph is enhanced with additional side information, like knowledge graphs [25], heterogeneous information networks [14], or multimodal items' content [26]. Exploiting attention to disentangle the aspects underlying node interactions may represent a fundamental step toward explainability [27]. Following this direction, the work by Wang et al. [15], named disentangled graph collaborative filtering (DGCF), and the method presented by Wu et al. [28] propose to disentangle user-item connections into possible user intents.

State-of-the-art attention-based approaches provide an efficient neighborhood weighting strategy. However, their multi-hop exploration is usually limited to prevent nodes in the neighborhood from getting too similar in the latent space (see Section 3.2). Conversely, EGCF leverages additional information (i.e., reviews) whose extracted opinion-aware features do not flatten differences among nodes while easing the weighting process. Moreover, in contrast to prior works, EGCF enriches edges by representing them through the extracted embeddings.

Review-based recommendation. Reviews convey a rich source of information to access users' multi-faceted opinions about interacted items. For this reason, several existing works propose to extract valuable knowledge from them to produce better-tailored recommendations [19, 20]. Among the pioneering works, Wang et al. [29] adopt a stacked denoising autoencoder to approximate the user-item rating matrix starting from textual reviews, Almahairi et al. [30] introduce two neural network-based approaches built upon bag-of-words and recurrent neural networks, and Kim et al. [31] present convolutional matrix factorization (ConvMF), where a convolutional neural network is merged with probabilistic matrix factorization to learn the context of review documents.

Reviews are textual documents composed of words, which may further be grouped into sentences. To exploit such a hierarchical structure, Zheng et al. [32] design a convolutional neural network on top of a factorization machine prediction model to extract a unique embedded representation for users and items from review words. The adoption of attention mechanisms may help refine each review component's importance on the recommendation profile of users and items. In this respect, Liu et al. [33] improve the previous approach by weighting the importance of convolutionally-embedded reviews for both users and items for the sake of explanation. Similarly, Lu et al. [34] learn users' and items' attention features by exploring different review components such as words, sentences, and topics via a GRU-based network, while Liu et al. [33] (building upon the solution described in [35]) augment users' and items' collaborative latent factors with features extracted from their generated ratings and reviews. Wang et al. [36] leverage common review properties (e.g., how helpful the reviews were for other users) to assess their importance on users and items.

Only recently have a few works injected the informative content of reviews into graph-based networks for recommendation. Wu et al. [37] propose a model named reviews meet graphs (RMG), a multi-view framework that learns users' and items' representations by considering the word and sentence levels of reviews and by exploring two hops of the user-item graph to also access user-user and item-item relations. Gao et al. [38] present a three-structured architecture that captures short- and long-term user preferences and item features, along with the collaborative information encoded in the bipartite user-item graph. Shi et al. [39] introduce a dual GCN model, where one network extracts and propagates review aspects, and the other reuses the aspects for the graph.

Despite addressing recommendation through different strategies, the presented algorithms generally work by grouping reviews on both user and item profiles but, in fact, limit the exploration of user and item neighbors to one hop (i.e., the nearest neighborhood). Conversely, our proposed approach exploits reviews as edge side information to describe user-item interactions and propagates their informative content over multiple hops to overcome theoretical issues in the way graph-based recommender systems are usually designed (see later).
3. Methodology

This section presents and motivates our proposed method, Edge Graph Collaborative Filtering (EGCF). We first introduce some notation and preliminaries on graph models for collaborative filtering. Then, we highlight a potentially critical issue in the message-passing schema. Even if weighting the importance of each neighbor node may alleviate the problem, we discuss its limitations and propose an enhanced application of the importance weighting.

3.1. Notation and preliminaries

Let $\mathcal{U} = \{u_1, u_2, \ldots, u_N\}$ and $\mathcal{I} = \{i_1, i_2, \ldots, i_M\}$ be the sets of $N$ users and $M$ items in the system, respectively. Then, let us consider a bipartite and undirected user-item graph that connects pairs of nodes when there exists a recorded interaction between them. User and item nodes are represented through embeddings in the latent space, i.e., $\mathbf{e}_u \in \mathbb{R}^d, \forall u \in \mathcal{U}$ and $\mathbf{e}_i \in \mathbb{R}^d, \forall i \in \mathcal{I}$.

Inspired by popular approaches [5], current graph-based recommender systems refine users' and items' node embeddings by exploring their multi-hop interconnections represented in the graph. Let $u$ and $i$ be the nodes for a user and an item to be updated (i.e., the ego nodes), and let $\mathcal{N}(u)$ and $\mathcal{N}(i)$ be the sets of nodes at one hop from $u$ and $i$, respectively (i.e., their neighborhoods). The ego node embeddings $\mathbf{e}_u$ and $\mathbf{e}_i$ are updated by aggregating their neighborhoods (i.e., messages):

$$\mathbf{e}_u^{(1)} = \omega(\{\mathbf{e}_i, \forall i \in \mathcal{N}(u)\}), \quad \mathbf{e}_i^{(1)} = \omega(\{\mathbf{e}_u, \forall u \in \mathcal{N}(i)\}) \quad (1)$$

where $\mathbf{e}_u^{(1)}$ and $\mathbf{e}_i^{(1)}$ are the refined embedding versions of user $u$ and item $i$ after one hop, while $\omega(\cdot)$ indicates the aggregation function. This message-passing pattern may be iterated $L$ times, thus exploring wider and wider neighborhoods of the ego nodes. After two hops, the refined embeddings of user $u$ and item $i$ are:

$$\mathbf{e}_u^{(2)} = \omega(\{\mathbf{e}_i^{(1)}, \forall i \in \mathcal{N}(u)\}), \quad \mathbf{e}_i^{(2)} = \omega(\{\mathbf{e}_u^{(1)}, \forall u \in \mathcal{N}(i)\}) \quad (2)$$

3.2. A limitation in the message-passing schema

The user formulation in Equation (2) can be expanded through Equation (1):

$$\mathbf{e}_u^{(2)} = \omega\big(\big\{\omega(\{\mathbf{e}_{u'}, \forall u' \in \mathcal{N}(i) \setminus \{u\}\}), \forall i \in \mathcal{N}(u)\big\}\big) \quad (3)$$

What emerges is that, by propagating messages over two hops, the node embedding of user $u$ is eventually refined through the contributions of other users who interacted with the same items as $u$. In other words, after two hops, each user profile is influenced by the profiles of other users who rated the same items.

Indeed, this assumption is aligned with the rationale behind collaborative filtering, i.e., similar users are likely to interact with the same items. However, not all user-item interactions (i.e., graph edges) may be equally important to the users and items involved. Thus, indiscriminately aggregating neighbor node embeddings into the ego node could, after multiple hops, harm the node updating process by bringing in all contributions from the neighborhood, even the noisy ones. We interpret this as a node representation error, propagating with the exploration hops in the graph.

For this reason, the contributions coming from each neighbor node are usually weighted before being aggregated into the ego nodes, modifying the presented formula:

$$\mathbf{e}_u^{(2)} = \omega\big(\big\{\alpha_{i \to u}^{(2)} \, \omega(\{\alpha_{u' \to i}^{(1)} \mathbf{e}_{u'}, \forall u' \in \mathcal{N}(i) \setminus \{u\}\}), \forall i \in \mathcal{N}(u)\big\}\big) \quad (4)$$

where $\alpha_{j \to k}^{(l)}$ stands for the importance that the neighbor node $j$ has on the ego node $k$ after $l$ hops. These weights are generally calculated by means of attention mechanisms and depend on the embeddings of the neighbor and ego nodes they refer to, e.g., $\alpha_{j \to k}^{(l)} = \phi(\mathbf{e}_j^{(l-1)}, \mathbf{e}_k^{(l-1)})$, where $\phi(\cdot, \cdot)$ is a neural network:

$$\mathbf{e}_u^{(2)} = \omega\Big(\Big\{\underbrace{\phi\big(\mathbf{e}_i^{(1)}, \mathbf{e}_u^{(1)}\big)}_{(\square)}\,\omega\big(\big\{\underbrace{\phi(\mathbf{e}_{u'}, \mathbf{e}_i)}_{(\triangle)}\,\mathbf{e}_{u'}, \forall u' \in \mathcal{N}(i)\setminus\{u\}\big\}\big), \forall i \in \mathcal{N}(u)\Big\}\Big) \quad (5)$$

that is, $\mathbf{e}_u^{(2)}$ depends on $(\square)$ the importance each neighbor item node $i$ has on the ego user node $u$ after one hop, and $(\triangle)$ the importance all users interacting with the same items as $u$ have on those items. Note that $(\square)$ may be further expanded:

$$\phi\big(\mathbf{e}_i^{(1)}, \mathbf{e}_u^{(1)}\big) = \phi\big(\omega(\{\alpha_{u' \to i}^{(1)} \mathbf{e}_{u'}, \forall u' \in \mathcal{N}(i) \setminus \{u\}\}), \, \omega(\{\alpha_{i' \to u}^{(1)} \mathbf{e}_{i'}, \forall i' \in \mathcal{N}(u) \setminus \{i\}\})\big)$$
$$= \phi\big(\omega(\{\phi(\mathbf{e}_{u'}, \mathbf{e}_i)\,\mathbf{e}_{u'}, \forall u' \in \mathcal{N}(i) \setminus \{u\}\}), \, \omega(\{\phi(\mathbf{e}_{i'}, \mathbf{e}_u)\,\mathbf{e}_{i'}, \forall i' \in \mathcal{N}(u) \setminus \{i\}\})\big) \quad (6)$$

When merging Equation (5) and Equation (6):

$$\mathbf{e}_u^{(2)} = \omega\Big(\Big\{\phi\Big(\omega\big(\big\{\underbrace{\phi(\mathbf{e}_{u'}, \mathbf{e}_i)}_{(\square)}\,\mathbf{e}_{u'}, \forall u' \in \mathcal{N}(i)\setminus\{u\}\big\}\big), \, \omega\big(\big\{\underbrace{\phi(\mathbf{e}_{i'}, \mathbf{e}_u)}_{(\triangle)}\,\mathbf{e}_{i'}, \forall i' \in \mathcal{N}(u)\setminus\{i\}\big\}\big)\Big)\,\omega\big(\big\{\phi(\mathbf{e}_{u'}, \mathbf{e}_i)\,\mathbf{e}_{u'}, \forall u' \in \mathcal{N}(i)\setminus\{u\}\big\}\big), \forall i \in \mathcal{N}(u)\Big\}\Big) \quad (7)$$

The node embedding for user $u$ after two hops depends on $(\square)$ the importance of all users interacting with the same items as $u$ on those items, and $(\triangle)$ the importance of all items interacted with by $u$ on user $u$. In other words, weighting the importance of each neighbor node on the ego node before the aggregation makes it possible, after two propagation hops, to calculate to what extent each user profile is influenced by the profiles of the other users who rated the same items. Without loss of generality, a similar consideration can be made for any number of hops greater than two.
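As a toy illustration of Equations (1)-(3), the following sketch (a hypothetical three-user, two-item graph, with the aggregation function $\omega$ taken as a plain mean) shows how, after two hops, a user's embedding absorbs the raw embeddings of every co-rating user:

```python
import numpy as np

# Toy instance of Equations (1)-(3): three users and two items
# (hypothetical data), with omega implemented as a mean aggregator.
rng = np.random.default_rng(0)
d = 4                                                        # latent dimension
e_user = {u: rng.normal(size=d) for u in ["u1", "u2", "u3"]}
e_item = {i: rng.normal(size=d) for i in ["i1", "i2"]}
N_user = {"u1": ["i1", "i2"], "u2": ["i1"], "u3": ["i2"]}    # N(u)
N_item = {"i1": ["u1", "u2"], "i2": ["u1", "u3"]}            # N(i)

def omega(messages):
    """Aggregation function omega(.): here, a mean over the messages."""
    return np.mean(messages, axis=0)

# One hop, Equation (1): each ego node aggregates its neighbors.
e_user_1 = {u: omega([e_item[i] for i in N_user[u]]) for u in N_user}
e_item_1 = {i: omega([e_user[u] for u in N_item[i]]) for i in N_item}

# Two hops, Equation (2): aggregate the refined one-hop embeddings.
e_user_2 = {u: omega([e_item_1[i] for i in N_user[u]]) for u in N_user}

# Expanding as in Equation (3): e_user_2["u1"] now mixes the raw
# embeddings of u2 and u3 (who co-rated i1 and i2 with u1), whether or
# not their opinions agree -- the source of the representation error.
```

Under mean aggregation, each one-hop item embedding already contains the co-rating users' raw embeddings, so the two-hop user embedding inherits them indiscriminately, as the derivation above formalizes.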
ego node before the aggregation allows, after two propa- Let r𝑢𝑖 ∈ R𝑓 be the textual embedding extracted from gation hops, to calculate to what extent each user pro- the review of user 𝑢 about item 𝑖 through the pretrained file is influenced by the profiles of the other users opinion-based model. First, we project r𝑢𝑖 ∈ R𝑓 to the who rated the same items. Without loss of generality, same latent space as e𝑢 ∈ R𝑑 and e𝑖 ∈ R𝑑 with a one- a similar consideration could be made after a number of layer neural network: hops greater than two. p𝑢𝑖 = LeakyReLU (Wr𝑢𝑖 + b) (8) 3.3. Enhancing neighborhood weighting where p𝑢𝑖 ∈ R is the projected review embedding, 𝑑 while W ∈ R𝑓 ×𝑑 and b ∈ R𝑑 are the projection matrix through reviews and the bias, respectively. We seek to retain only those As known, graph-based models in machine learning are textual features of review r𝑢𝑖 which can be significant affected by oversmoothing [16, 17]. This phenomenon to later calculate the interdependence between this leads node embeddings, after multiple propagation hops, embedding and user/item ones. to become closer and closer in their representation in Then, we propose to enhance the neighborhood the latent space, eventually flattening their existing dif- weighting procedure at hop 𝑙 by conditioning the im- ferences. As this behavior would profoundly weaken portance weights also on the projected embedding of the models’ performance, exploration of the neighborhood review connecting user 𝑢 and item 𝑖. For instance, the generally tends to be constrained to very few hops (e.g., importance of the neighbor item node 𝑖 on the ego user a maximum of two hops in attention-based weighting). 
node 𝑢 after 𝑙 hops is calculated as: However, in recommendation scenarios, limiting (𝑙) (︁ (𝑙−1) (𝑙−1) )︁ the exploration of the user-item bipartite graph 𝛼𝑖→𝑢 = 𝜙 e𝑖 , e𝑢 , p𝑢𝑖 (9) may represent an inconsistency to the idea of col- Note that, since p𝑢𝑖 cannot increase the impact of the laborative filtering, where users are connected to share oversmoothing effect (because it is not dependent preferences and tastes for similar items. on the hop 𝑙), its usage in the importance weight Under this assumption, we believe the neighborhood formula becomes even more beneficial. Let us focus weighting process could be further enhanced by exploit- on the weighting function 𝜙(·, ·, ·). Many approaches ing other sources of information that are not usually from the literature propose to leverage attention mecha- taken into account. In the majority of popular online nisms, usually implemented as a neural network trained platforms for e-commerce (e.g., Amazon), reviews are in the downstream task to predict the importance of the fundamental tools to share opinions and comments neighbor node on the ego node. In our solution, we opt about interacted items, as they convey the multi-faceted for a simplified and lightweight formulation that seeks aspects that drove a user to interact with an item. Lever- to calculate the similarity between the neighbor and the aging such side information on the connections exist- ego nodes, conditioned on the opinion embedding ing among users and items in the bipartite graph (i.e., of the review connecting them. Specifically: graph edges) can improve the learning of the importance weights by reducing the oversmoothing effect because (︁ )︁ (𝑙) (𝑙−1) (𝑙−1) 𝛼𝑖→𝑢 = cos e𝑖 ⊙ p𝑢𝑖 , e𝑢 ⊙ p𝑢𝑖 (10) each user/item node embedding is conditioned on the opinion conveyed by the review. where ⊙ is the element-wise multiplication, and cos(·, ·) Let 𝒲𝑢𝑖 = {𝑤1 , 𝑤2 , . . . , 𝑤𝑅 } be the set of 𝑅 words is the cosine similarity. 
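The projection and weighting of Equations (8)-(10) can be sketched as follows; the projection parameters W and b and all embeddings below are random placeholders rather than learned values, and LeakyReLU is written by hand:

```python
import numpy as np

# Sketch of the review-aware weighting in Equations (8)-(10); all
# parameters are random placeholders, and d, f are arbitrary sizes.
rng = np.random.default_rng(0)
d, f = 8, 16                                   # node / review embedding sizes
W, b = rng.normal(size=(d, f)), rng.normal(size=d)

def leaky_relu(x, slope=0.01):
    return np.where(x > 0, x, slope * x)

def project_review(r_ui):
    """Equation (8): project the review embedding into the node space."""
    return leaky_relu(W @ r_ui + b)

def alpha(e_i, e_u, p_ui):
    """Equation (10): cosine similarity between the opinion-modulated
    node embeddings, with negative similarities suppressed to zero."""
    a, c = e_i * p_ui, e_u * p_ui              # element-wise modulation
    sim = (a @ c) / (np.linalg.norm(a) * np.linalg.norm(c))
    return max(sim, 0.0)

r_ui = rng.normal(size=f)                      # pretrained opinion embedding
p_ui = project_review(r_ui)
weight = alpha(rng.normal(size=d), rng.normal(size=d), p_ui)
```

Because the cosine is bounded by 1 and negative values are clipped, the resulting weight always lies in [0, 1].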
Multiplying both node embeddings by the review opinion embedding provides the interplay between each node feature and the opinion features, thus producing a modified version of the node representation that conveys a richer source of information. No trainable projection weight is learned in the presented formulation, since the contribution of the review embedding is meaningful enough.

3.4. A double message-passing schema

The proposed neighborhood weighting procedure can help correct the representation error generated in the traditional message-passing schema. However, the idea is not to completely replace it, as several recent works from the literature have demonstrated its efficacy, especially in producing accurate recommendations [8]. The proposed approach involves a double message-passing schema, where two graph models are trained to refine their own user/item node representations. While the first one aggregates the contributions coming from the neighbor nodes into the ego nodes by weighting the neighborhood importance on the ego node statically, the second one aggregates the neighborhood's messages weighted also through the opinion embeddings from reviews.

[Figure 2: Overview of the node refining algorithm proposed for EGCF. A statically-weighted GCN network affected by node representation error (a) is corrected through another GCN network (b), where an opinion-based embedding is extracted from each review as edge side information to weight the importance of the neighbor nodes on their ego nodes.]

[Table 1: Statistics of the tested datasets.]
Datasets       #Users   #Items   #Interactions   Density   Avg. interactions per user
Baby            4,669    5,435          29,214   0.00115    6.3
Boys & Girls    8,806    4,165          57,928   0.00158    6.6
Men             3,218    7,605          60,299   0.00246   18.7

We define the two graph convolutional networks as GCN$_e$ (error-affected) and GCN$_c$ (correction), and assign the node embeddings $\mathbf{e}_*$ to GCN$_e$ and the node embeddings $\mathbf{c}_*$ to GCN$_c$. As for the aggregation function, in both cases we sum the weighted messages coming from the neighbor nodes. As such, the update of the user node embedding $u$ after $l$ hops is calculated as:

$$\mathbf{e}_u^{(l)} = \sum_{i \in \mathcal{N}(u)} \alpha_{i \to u}\, \mathbf{e}_i^{(l-1)} = \sum_{i \in \mathcal{N}(u)} \frac{\mathbf{e}_i^{(l-1)}}{\sqrt{|\mathcal{N}(u)|}\sqrt{|\mathcal{N}(i)|}}$$
$$\mathbf{c}_u^{(l)} = \sum_{i \in \mathcal{N}(u)} \alpha_{i \to u}^{(l)}\, \mathbf{c}_i^{(l-1)} = \sum_{i \in \mathcal{N}(u)} \frac{\cos\big(\mathbf{e}_i^{(l-1)} \odot \mathbf{p}_{ui}, \, \mathbf{e}_u^{(l-1)} \odot \mathbf{p}_{ui}\big)}{\sqrt{|\mathcal{N}(u)|}\sqrt{|\mathcal{N}(i)|}}\, \mathbf{c}_i^{(l-1)} \quad (11)$$

Note that $\alpha_{i \to u}$ is static and only depends on the topology of the bipartite graph, while $\alpha_{i \to u}^{(l)}$ varies along with the exploration hop and depends on the embeddings of the ego/neighbor nodes and the opinion review embedding. After $L$ propagation hops, the final embedding representations are obtained as:

$$\mathbf{e}_u = \sum_{l=0}^{L} \frac{1}{1+l}\mathbf{e}_u^{(l)}, \quad \mathbf{e}_i = \sum_{l=0}^{L} \frac{1}{1+l}\mathbf{e}_i^{(l)}, \quad \mathbf{c}_u = \sum_{l=0}^{L} \frac{1}{1+l}\mathbf{c}_u^{(l)}, \quad \mathbf{c}_i = \sum_{l=0}^{L} \frac{1}{1+l}\mathbf{c}_i^{(l)} \quad (12)$$

where we apply the scaling factor $1/(1+l)$ to further alleviate the oversmoothing problem. A schematic overview of the node refining algorithm proposed for EGCF is displayed in Figure 2.

Given the learned error-affected and correction embeddings from above, EGCF predicts whether a user $u$ may interact with item $i$ through the following formulation:

$$\hat{y}_{ui} = \underbrace{\mathbf{e}_u^\top \mathbf{e}_i}_{\text{error-affected}} + \underbrace{\mathbf{c}_u^\top \mathbf{c}_i}_{\text{correction}} \quad (13)$$

Thus, we apply the error correction to the user/item embedding representations only when predicting the user/item interaction. We optimize EGCF with the state-of-the-art Bayesian Personalized Ranking (BPR) [42] loss.
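Under the same assumptions as the earlier sketches (a toy graph with synthetic embeddings and random review projections), the double propagation of Equation (11), the layer combination of Equation (12), and the prediction of Equation (13) can be sketched for the user side as:

```python
import numpy as np

# Sketch of EGCF's double propagation (Eq. 11), layer combination (Eq. 12),
# and prediction (Eq. 13); all embeddings and p_ui are synthetic data.
rng = np.random.default_rng(0)
d, L = 8, 2
users, items = ["u1", "u2"], ["i1", "i2"]
N_u = {"u1": ["i1", "i2"], "u2": ["i1"]}             # user neighborhoods
N_i = {"i1": ["u1", "u2"], "i2": ["u1"]}             # item neighborhoods
e = {n: rng.normal(size=d) for n in users + items}   # GCN_e embeddings
c = {n: rng.normal(size=d) for n in users + items}   # GCN_c embeddings
p = {(u, i): rng.normal(size=d) for u in users for i in N_u[u]}  # p_ui

def cos_pos(a, b):
    """Cosine similarity with negatives suppressed (Equation (10))."""
    return max((a @ b) / (np.linalg.norm(a) * np.linalg.norm(b)), 0.0)

def propagate(e_prev, c_prev):
    """One hop of Equation (11), user side only (the item side is
    symmetric and kept fixed here for brevity)."""
    e_next, c_next = dict(e_prev), dict(c_prev)
    for u in users:
        norm = {i: np.sqrt(len(N_u[u]) * len(N_i[i])) for i in N_u[u]}
        e_next[u] = sum(e_prev[i] / norm[i] for i in N_u[u])
        c_next[u] = sum(
            cos_pos(e_prev[i] * p[u, i], e_prev[u] * p[u, i]) / norm[i]
            * c_prev[i] for i in N_u[u])
    return e_next, c_next

e_layers, c_layers = [e], [c]
for _ in range(L):                    # keep every hop for Equation (12)
    e_nxt, c_nxt = propagate(e_layers[-1], c_layers[-1])
    e_layers.append(e_nxt)
    c_layers.append(c_nxt)

def final(layers, n):
    """Equation (12): per-hop embeddings scaled by 1/(1+l), then summed."""
    return sum(layer[n] / (1 + l) for l, layer in enumerate(layers))

# Equation (13): error-affected score plus correction score.
y_u1_i1 = (final(e_layers, "u1") @ final(e_layers, "i1")
           + final(c_layers, "u1") @ final(c_layers, "i1"))
```

Note that, as in Equation (11), the correction weights are computed from the GCN$_e$ embeddings and the review projection, while the aggregated messages are the GCN$_c$ embeddings.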
4. Experiments and Discussion

4.1. Experimental Setup

Datasets. We use three popular [43, 44] datasets from Amazon's Baby, Boys & Girls, and Men categories [18], which contain historical user-item interactions and reviews. We retain only the interactions with non-empty reviews, then keep the 20k and 10k most popular items for Baby and Boys & Girls/Men, respectively. Finally, we apply the 5- and 15-core filtering on items and users for Baby/Boys & Girls and Men, respectively. Statistics are in Table 1.

Baselines. We compare our approach with eight state-of-the-art models spanning several families: (i) traditional CF (BPRMF [42] and MultiVAE [4]); (ii) review-based CF (ConvMF [31] and RMG [37]); (iii) graph-based CF (NGCF [7] and LightGCN [8]); (iv) graph-based CF with attention (GAT [11] and DGCF [15]).

Reproducibility. We adopt the temporal leave-one-out strategy to split the datasets, where the last two recorded interactions are included in the validation and test sets. We tune hyper-parameters with [45] following the baselines' papers, and fix the batch size to 256 and the number of epochs to 400. As for EGCF, we extract review embeddings through a popular pre-trained model¹. Datasets and code are publicly available². All models are implemented in Elliot [46].

¹ Please refer to our GitHub repository.
² https://github.com/sisinflab/Edge-Graph-Collaborative-Filtering.

Evaluation protocol. We measure model accuracy by adopting the recall (Recall@k), the normalized discounted cumulative gain (nDCG@k), and the mean average recall (MAR@k) [8, 15]. Additionally, considering the influence of novel and diverse recommendation lists [47, 48] on both users' and businesses' interests, we also assess beyond-accuracy metrics such as the expected popularity complement (EPC@k) and the expected free discovery (EFD@k), along with indices measuring concentration and coverage, i.e., the 1's complement of the Gini index (Gini@k), the Shannon entropy (SE@k), and the item coverage (iCov@k). Specifically, the EPC@k and the EFD@k refer to long-tail items and stand for the expected number of recommended unknown items which are also relevant, and the expected number of recommended known items which are also relevant, respectively. Furthermore, the Gini@k and the SE@k are used to assess items' distributional inequality, i.e., how unequally a recommender system shows different items to users, while the iCov@k quantifies the number of items that the model recommends. For all metrics, higher values mean better performance. We leave the assessment of complexity measures for the proposed model to future extensions of this work.

[Table 2: Accuracy metrics, i.e., Recall, nDCG, and MAR, for top-10 lists. Best value is in bold, while second-to-best is underlined.]
                    Baby                       Boys & Girls               Men
Models      Recall   nDCG     MAR      Recall   nDCG     MAR      Recall   nDCG     MAR
MostPop     0.0940   0.0520   0.0627   0.1195   0.0647   0.0776   0.0702   0.0590   0.0672
BPRMF       0.1377   0.0785   0.0980   0.1821   0.1446   0.1666   0.1662   0.1314   0.1527
MultiVAE    0.1768   0.1262   0.1455   0.2224   0.1695   0.1990   0.2091   0.1656   0.1898
ConvMF      0.1230   0.0647   0.0800   0.1146   0.0831   0.0972   0.0838   0.0524   0.0584
RMG         0.1272   0.0911   0.1059   0.1512   0.1065   0.1325   0.1067   0.0727   0.0867
NGCF        0.1411   0.0916   0.1092   0.2006   0.1523   0.1783   0.1969   0.1461   0.1722
LightGCN    0.1892   0.1362   0.1590   0.2305   0.1743   0.2054   0.2124   0.1605   0.1882
GAT         0.1595   0.1051   0.1233   0.2069   0.1573   0.1846   0.1695   0.1254   0.1476
DGCF        0.1874   0.1352   0.1558   0.2249   0.1716   0.2023   0.2070   0.1554   0.1823
EGCF        0.1944*  0.1402*  0.1623*  0.2325   0.1792*  0.2089*  0.2195*  0.1703*  0.1988*
*statistically significant differences (p-value ≤ 0.05).

4.2. Results and Discussion

Recommendation accuracy (RQ1). Table 2 reports the results for accuracy measures on the top-10 recommendation lists. Surprisingly, the sole introduction of reviews does not seem to produce a consistent accuracy boost. For instance, the strongest review-based method (i.e., RMG) surpasses BPRMF only for the nDCG and the MAR on Baby (i.e., 0.0911 vs. 0.0785 and 0.1059 vs. 0.0980, respectively). Conversely, adopting a graph model can increase the accuracy over traditional CF. When comparing LightGCN with MultiVAE, which obtain the best performance in their respective recommendation families, we observe that the former improves, on Baby, the Recall by 7% and the MAR by 9%. However, the observed difference even reverts on Men for the nDCG and the MAR. The application of attention mechanisms to weight the importance of neighbor nodes is rewarded on Baby and Boys & Girls, where GAT always outperforms NGCF, reaching remarkable results such as the Recall on Baby (i.e., 0.1595 vs. 0.1411) and the MAR on Boys & Girls (i.e., 0.1846 vs. 0.1783). Disentangling users' intents on interacted items (i.e., DGCF) produces even more accurate recommendations than NGCF on all datasets. Nevertheless, LightGCN always performs better than DGCF apart from very few cases (i.e., nDCG and MAR on Men), even though DGCF's accuracy values do not substantially differ from LightGCN's (e.g., see the MAR on Baby). Noticeably, the proposed model (i.e., EGCF) outperforms the other baselines under all settings and datasets, with nearly 100% of the statistical hypothesis tests (i.e., paired t-test) showing that the results significantly differ. This finding further confirms the soundness of the solution. While we observe a substantial accuracy improvement over traditional and review-based approaches (e.g., +12% over MultiVAE for the MAR on Boys & Girls and +53% over RMG for the Recall on Baby), introducing an additional GCN-like network guided by users' reviews is even more beneficial to correct the representation error observable in unweighted graph approaches. In particular, the results show that such a correction may lead to small accuracy improvements in some cases (e.g., see the Recall on Boys & Girls when correcting LightGCN) but also larger ones in other cases (e.g., see the nDCG on Men when correcting LightGCN).
Such outcomes suggest that while reviews does not seem to produce a consistent accuracy keeping the error-affected contribution in the final predic- boost. For instance, the strongest review-based method tion formula is useful to preserve the superior performance (i.e., RMG) surpasses BPRMF only for the 𝑛𝐷𝐶𝐺 and of graph-based models to traditional and review-based ap- the 𝑀 𝐴𝑅 on Baby (i.e., 0.0911 vs. 0.0785 and 0.1059 proaches, the introduced correction term is useful to gain vs. 0.0980, respectively). Contrarily, adopting a graph even more accurate preference predictions than unweighted model can increase the accuracy to traditional CF. When graph architectures. comparing LightGCN with MultiVAE, which obtain the Recommendation novelty and diversity (RQ2). We best performance in their respective recommendation also assess how novel and diverse recommendation lists families, we observe that the former improves, on Baby, are. The two novelty metrics in Table 3 (i.e., the 𝐸𝑃 𝐶@𝑘 the 𝑅𝑒𝑐𝑎𝑙𝑙 of 7% and the 𝑀 𝐴𝑅 of 9%. However, the and the 𝐸𝐹 𝐷@𝑘, left side) are discussed with concentra- observed difference even reverts on Men for the 𝑛𝐷𝐶𝐺 tion and coverage indices (i.e., the 𝐺𝑖𝑛𝑖@𝑘, the 𝑆𝐸@𝑘, and the 𝑀 𝐴𝑅. The application of attention mechanisms and the 𝑖𝐶𝑜𝑣@𝑘, right side) as in an ideal recommender 1 Please refer to our GitHub repository. system, a loosely concentrated and large set of recom- 2 https://github.com/sisinflab/Edge-Graph-Collaborative-Filtering. Table 3 Calculated novelty metrics, i.e., 𝐸𝑃 𝐶 and 𝐸𝐹 𝐷, on the left side, and diversity indices, i.e., 𝐺𝑖𝑛𝑖, 𝑆𝐸 , and 𝑖𝐶𝑜𝑣 , on the right side, for top-10 lists. Best value is in bold, while second-to-best is underlined. 
Models Baby Boys & Girls Men Models Baby Boys & Girls Men 𝐸𝑃 𝐶 𝐸𝐹 𝐷 𝐸𝑃 𝐶 𝐸𝐹 𝐷 𝐸𝑃 𝐶 𝐸𝐹 𝐷 𝐺𝑖𝑛𝑖 𝑆𝐸 𝑖𝐶𝑜𝑣 𝐺𝑖𝑛𝑖 𝑆𝐸 𝑖𝐶𝑜𝑣 𝐺𝑖𝑛𝑖 𝑆𝐸 𝑖𝐶𝑜𝑣 MostPop 0.0108 0.0728 0.0135 0.0913 0.0112 0.0904 MostPop 0.0018 3.5313 18 0.0023 3.5724 18 0.0015 3.9332 32 BPRMF 0.0164 0.1153 0.0306 0.2282 0.0259 0.2167 BPRMF 0.0019 3.7819 40 0.0031 4.0921 203 0.0037 5.2991 192 MultiVAE 0.0268 0.2088 0.0360 0.2874 0.0333 0.2912 MultiVAE 0.2139 9.9160 4,143 0.2671 10.2463 3,824 0.1085 9.8988 3,014 ConvMF 0.0135 0.0930 0.0174 0.1219 0.0102 0.0857 ConvMF 0.0018 3.5933 18 0.0030 3.9745 220 0.0029 4.6783 265 RMG 0.0193 0.1488 0.0226 0.1787 0.0144 0.1226 RMG 0.1059 9.4892 2,130 0.1567 9.7193 2,538 0.1146 10.0344 2,549 NGCF 0.0194 0.1463 0.0323 0.2510 0.0292 0.2531 NGCF 0.0948 8.8700 2,641 0.3031 10.5595 3,668 0.1749 10.7116 3,651 LightGCN 0.0289 0.2271 0.0371 0.3012 0.0323 0.2856 LightGCN 0.1405 9.3105 3,417 0.2398 10.1586 3,647 0.2051 10.8815 4,384 GAT 0.0223 0.1708 0.0334 0.2616 0.0248 0.2106 GAT 0.1370 9.2024 3,102 0.2496 10.2821 3,449 0.1235 9.7802 3,530 DGCF 0.0287 0.2228 0.0365 0.2945 0.0311 0.2734 DGCF 0.0673 8.3193 2,325 0.1800 9.7617 3,208 0.1304 10.2011 3,378 EGCF 0.0298* 0.2359* 0.0382* 0.3120* 0.0343* 0.3066* EGCF 0.2294 9.8535 4,490 0.3037 10.4545 4,030 0.2208 10.8876 4,920 *statistically significant differences (p-value ≤ 0.05) Statistical significance is not reported since it is calculated only on user level. 0.312 0.194 0.350 0.232 0.306 0.219 0.310 Recall Recall Recall 0.193 0.230 EFD EFD EFD 0.300 0.308 0.218 0.304 0.192 0.228 0.306 0.250 0.217 0.191 0.304 0.302 0.226 1 2 3 4 1 2 3 4 1 2 3 4 (a) Baby (b) Boys & Girls (c) Men Figure 3: Recommendation performance of EGCF, i.e., 𝑅𝑒𝑐𝑎𝑙𝑙@𝑘 (histogram bars in teal blue) and 𝐸𝐹 𝐷@𝑘 (histogram bars in lime green), on top-10 recommendation lists, when varying the number of explored hops from 1 to 4. mended items should equally span different ranges of without retaining less popular items from the long-tail popularity. 
As previously observed, EGCF is again the (observing the same models, +3% for the 𝐸𝑃 𝐶 on Baby best or second-to-best technique. While NGCF is not and +6% for the 𝐸𝐹 𝐷 on Boys & Girls). Such outcomes as capable as LightGCN of proposing long-tail items on demonstrate that the content enrichment brought by the Boys & Girls (e.g., 0.2510 vs. 0.3012 for the 𝐸𝐹 𝐷), the extracted review features (injected into the representation former surpasses the latter for the concentration indices error correction) allows to explore user-item interactions at on the same dataset (e.g., 10.5595 vs. 10.1586 for the 𝑆𝐸). multiple hops, leading to more heterogeneous recommen- Since NGCF adopts an ego-neighbor interaction compo- dation lists which also include items from the long-tail. nent, the concentration of explored and recommended Effect of hop exploration number (RQ3). Figure 3 near items gets loose. Moreover, neighborhood weight- displays, for EGCF, the 𝑅𝑒𝑐𝑎𝑙𝑙@𝑘 and 𝐸𝐹 𝐷@𝑘 perfor- ing leads to recommend items from the long tail (e.g., mance variation on top-10 recommendation lists when comparing GAT with NGCF, we observe a +17% for the exploring a number of hops in the range 1-4, where even 𝐸𝐹 𝐷 on Baby). However, such a finding is not consis- numbers stand for same node type connections (e.g., user- tent with the trend recognized for the concentration and user), while odd numbers refer to opposite node type con- coverage indices (e.g., when comparing LightGCN with nections (i.e., user-item). As evident from the histograms DGCF, we notice 0.1304 vs. 0.2051 for the 𝐺𝑖𝑛𝑖 on Men), of Baby and Boys & Girls, the 𝑅𝑒𝑐𝑎𝑙𝑙@𝑘 consistently as the neighborhood weighting procedure comes at the increases from 1 to 4 hops (this is why we adopt four expense of a limited hop exploration, not allowing such hop explorations for EGCF on those datasets). The same models to explore wider catalog portions. 
Conversely, trend is not observable for Men, where two explored hops injecting user-generated reviews brings new informative seem to provide the highest accuracy boost, motivating content (e.g., RMG recommends a broader and less con- the adoption of 2 hop explorations for EGCF on the same centrated range of items from the catalog than DGCF on dataset. Such behavior could be due to the average num- the Baby dataset). Finally, weighting the neighborhood ber of users’ interacted items in Men (approximately 19, importance and exploring long-distant user-item inter- see Table 1). The node refining probably does not re- actions through reviews-enriched content (i.e., EGCF) quire a broad exploration of its neighborhood. As for the allows to retrieve larger portions of heterogeneous items 𝐸𝐹 𝐷@𝑘, the Baby and the Men datasets seem to agree (e.g., EGCF outperforms LightGCN for the 𝐺𝑖𝑛𝑖 by +63% on two exploration hops to produce the most diverse on Baby and DGCF for the 𝑆𝐸 by +7% on Boys & Girls), item lists of recommendations because they leverage (as previously recalled) user-user and item-item intercon- [6] R. Ying, R. He, K. Chen, P. Eksombatchai, W. L. nections (and similarities). The trend is also aligned with Hamilton, J. Leskovec, Graph convolutional neural the Boys & Girls dataset, where user-user and item-item networks for web-scale recommender systems, in: links are exploited even at a higher depth (i.e., four ex- KDD, ACM, 2018, pp. 974–983. ploration hops). The emerged insights shed light on two [7] X. Wang, X. He, M. Wang, F. Feng, T. Chua, Neural main contributions: (i) with the modified neighborhood graph collaborative filtering, in: SIGIR, ACM, 2019, weighting process, which makes use of reviews to enhance pp. 165–174. the informative content carried by user-item interactions, [8] X. He, K. Deng, X. Wang, Y. Li, Y. Zhang, M. 
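To make the hop-wise aggregation discussed above concrete, the following minimal NumPy sketch reproduces the 1/(1+l)-scaled layer combination of Eq. (12) for a single embedding table. Function and variable names (`aggregate_embeddings`, `a_hat`, `e0`) are ours for illustration only; the official implementation is in the linked GitHub repository.

```python
import numpy as np

def aggregate_embeddings(a_hat, e0, num_hops):
    """LightGCN-style propagation with the 1/(1+l) scaling of Eq. (12).

    a_hat:    (n, n) normalized adjacency of the user-item graph.
    e0:       (n, d) initial node embeddings (users and items stacked).
    num_hops: number of propagation hops L.
    Returns sum_{l=0..L} 1/(1+l) * e^{(l)}, where e^{(l)} = a_hat @ e^{(l-1)}.
    """
    e_l = e0
    e_final = e0  # l = 0 term, scaled by 1/(1+0) = 1
    for l in range(1, num_hops + 1):
        e_l = a_hat @ e_l                  # one message-passing hop
        e_final = e_final + e_l / (1 + l)  # down-weight deeper hops
    return e_final
```

Varying `num_hops` mirrors the 1-4 hop study of Figure 3: odd hops reach opposite-type neighbors (user-item), while even hops reach same-type neighbors (user-user, item-item).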
5. Conclusion and Future Work

This work proposes Edge Graph Collaborative Filtering (EGCF), which incorporates users' opinions extracted from reviews into the edges of a GCN to weight the neighborhood importance on the ego node. Extensive experimental evaluation shows that EGCF outperforms traditional, review-, and graph-based models. The work is complemented by an analysis of beyond-accuracy performance and an extensive study on the number of layers. Leveraging the importance of graph edges through node-node side information (e.g., users' reviews) opens up future directions, namely: (i) studying the impact of this re-weighting by making it a hyper-parameter, and (ii) analyzing the possible application of the proposed technique to tasks other than recommendation.

Acknowledgments

The authors acknowledge partial support from the projects PASSEPARTOUT, ServiziLocali2.0, Smart Rights Management Platform, BIO-D, and ERP4.0.

References

[1] M. D. Ekstrand, J. Riedl, J. A. Konstan, Collaborative filtering recommender systems, Found. Trends Hum. Comput. Interact. 4 (2011) 175–243.
[2] Y. Koren, R. M. Bell, C. Volinsky, Matrix factorization techniques for recommender systems, Computer 42 (2009) 30–37.
[3] X. He, L. Liao, H. Zhang, L. Nie, X. Hu, T. Chua, Neural collaborative filtering, in: WWW, ACM, 2017, pp. 173–182.
[4] D. Liang, R. G. Krishnan, M. D. Hoffman, T. Jebara, Variational autoencoders for collaborative filtering, in: WWW, ACM, 2018, pp. 689–698.
[5] T. N. Kipf, M. Welling, Semi-supervised classification with graph convolutional networks, in: ICLR (Poster), OpenReview.net, 2017.
[6] R. Ying, R. He, K. Chen, P. Eksombatchai, W. L. Hamilton, J. Leskovec, Graph convolutional neural networks for web-scale recommender systems, in: KDD, ACM, 2018, pp. 974–983.
[7] X. Wang, X. He, M. Wang, F. Feng, T. Chua, Neural graph collaborative filtering, in: SIGIR, ACM, 2019, pp. 165–174.
[8] X. He, K. Deng, X. Wang, Y. Li, Y. Zhang, M. Wang, LightGCN: Simplifying and powering graph convolution network for recommendation, in: SIGIR, ACM, 2020, pp. 639–648.
[9] K. Mao, J. Zhu, X. Xiao, B. Lu, Z. Wang, X. He, UltraGCN: Ultra simplification of graph convolutional networks for recommendation, in: CIKM, ACM, 2021, pp. 1253–1262.
[10] Y. Shen, Y. Wu, Y. Zhang, C. Shan, J. Zhang, K. B. Letaief, D. Li, How powerful is graph convolution for recommendation?, in: CIKM, ACM, 2021, pp. 1619–1629.
[11] P. Velickovic, G. Cucurull, A. Casanova, A. Romero, P. Liò, Y. Bengio, Graph attention networks, in: ICLR (Poster), OpenReview.net, 2018.
[12] Y. Xie, Z. Li, T. Qin, F. Tseng, J. Kristinsson, S. Qiu, Y. L. Murphey, Personalized session-based recommendation using graph attention networks, in: IJCNN, IEEE, 2021, pp. 1–8.
[13] M. Zhang, C. Guo, J. Jin, M. Pan, J. Fang, Sequential recommendation with context-aware collaborative graph attention networks, in: IJCNN, IEEE, 2021, pp. 1–8.
[14] Y. Wang, S. Tang, Y. Lei, W. Song, S. Wang, M. Zhang, DisenHAN: Disentangled heterogeneous graph attention network for recommendation, in: CIKM, ACM, 2020, pp. 1605–1614.
[15] X. Wang, H. Jin, A. Zhang, X. He, T. Xu, T. Chua, Disentangled graph collaborative filtering, in: SIGIR, ACM, 2020, pp. 1001–1010.
[16] D. Chen, Y. Lin, W. Li, P. Li, J. Zhou, X. Sun, Measuring and relieving the over-smoothing problem for graph neural networks from the topological view, in: AAAI, AAAI Press, 2020, pp. 3438–3445.
[17] K. Zhou, X. Huang, Y. Li, D. Zha, R. Chen, X. Hu, Towards deeper graph neural networks with differentiable group normalization, in: NeurIPS, 2020.
[18] J. Ni, J. Li, J. J. McAuley, Justifying recommendations using distantly-labeled reviews and fine-grained aspects, in: EMNLP/IJCNLP (1), Association for Computational Linguistics, 2019, pp. 188–197.
[19] L. Chen, G. Chen, F. Wang, Recommender systems based on user reviews: the state of the art, User Model. User Adapt. Interact. 25 (2015) 99–154.
[20] M. Srifi, A. Oussous, A. A. Lahcen, S. Mouline, Recommender systems based on collaborative filtering using review texts - A survey, Inf. 11 (2020) 317.
[21] R. van den Berg, T. N. Kipf, M. Welling, Graph convolutional matrix completion, CoRR abs/1706.02263 (2017).
[22] W. Song, Z. Xiao, Y. Wang, L. Charlin, M. Zhang, J. Tang, Session-based social recommendation via dynamic graph attention networks, in: WSDM, ACM, 2019, pp. 555–563.
[23] C. Xu, P. Zhao, Y. Liu, V. S. Sheng, J. Xu, F. Zhuang, J. Fang, X. Zhou, Graph contextualized self-attention network for session-based recommendation, in: IJCAI, ijcai.org, 2019, pp. 3940–3946.
[24] Y. Wu, J. Yang, Dual sequential recommendation integrating high-order collaborative relations via graph attention networks, in: IJCNN, IEEE, 2021, pp. 1–8.
[25] X. Wang, X. He, Y. Cao, M. Liu, T. Chua, KGAT: Knowledge graph attention network for recommendation, in: KDD, ACM, 2019, pp. 950–958.
[26] Z. Tao, Y. Wei, X. Wang, X. He, X. Huang, T. Chua, MGAT: Multimodal graph attention network for recommendation, Inf. Process. Manag. 57 (2020) 102277.
[27] J. Ma, P. Cui, K. Kuang, X. Wang, W. Zhu, Disentangled graph convolutional networks, in: ICML, volume 97 of Proceedings of Machine Learning Research, PMLR, 2019, pp. 4212–4221.
[28] J. Wu, W. Shi, X. Cao, J. Chen, W. Lei, F. Zhang, W. Wu, X. He, DisenKGAT: Knowledge graph embedding with disentangled graph attention network, in: CIKM, ACM, 2021, pp. 2140–2149.
[29] H. Wang, N. Wang, D. Yeung, Collaborative deep learning for recommender systems, in: KDD, ACM, 2015, pp. 1235–1244.
[30] A. Almahairi, K. Kastner, K. Cho, A. C. Courville, Learning distributed representations from reviews for collaborative filtering, in: RecSys, ACM, 2015, pp. 147–154.
[31] D. H. Kim, C. Park, J. Oh, S. Lee, H. Yu, Convolutional matrix factorization for document context-aware recommendation, in: RecSys, ACM, 2016, pp. 233–240.
[32] L. Zheng, V. Noroozi, P. S. Yu, Joint deep modeling of users and items using reviews for recommendation, in: WSDM, ACM, 2017, pp. 425–434.
[33] H. Liu, Y. Wang, Q. Peng, F. Wu, L. Gan, L. Pan, P. Jiao, Hybrid neural recommendation with joint deep representation learning of ratings and reviews, Neurocomputing 374 (2020) 77–85.
[34] Y. Lu, R. Dong, B. Smyth, Coevolutionary recommendation model: Mutual learning between ratings and reviews, in: WWW, ACM, 2018, pp. 773–782.
[35] H. Liu, F. Wu, W. Wang, X. Wang, P. Jiao, C. Wu, X. Xie, NRPA: Neural recommendation with personalized attention, in: SIGIR, ACM, 2019, pp. 1233–1236.
[36] X. Wang, I. Ounis, C. Macdonald, Leveraging review properties for effective recommendation, in: WWW, ACM / IW3C2, 2021, pp. 2209–2219.
[37] C. Wu, F. Wu, T. Qi, S. Ge, Y. Huang, X. Xie, Reviews meet graphs: Enhancing user and item representations for recommendation with hierarchical attentive graph neural network, in: EMNLP/IJCNLP (1), Association for Computational Linguistics, 2019, pp. 4883–4892.
[38] J. Gao, Y. Lin, Y. Wang, X. Wang, Z. Yang, Y. He, X. Chu, Set-sequence-graph: A multi-view approach towards exploiting reviews for recommendation, in: CIKM, ACM, 2020, pp. 395–404.
[39] L. Shi, W. Wu, W. Hu, J. Zhou, J. Chen, W. Zheng, L. He, DualGCN: An aspect-aware dual graph convolutional network for review-based recommender, Knowl. Based Syst. 242 (2022) 108359.
[40] R. He, J. J. McAuley, VBPR: Visual Bayesian personalized ranking from implicit feedback, in: AAAI, AAAI Press, 2016, pp. 144–150.
[41] Y. Deldjoo, T. D. Noia, D. Malitesta, F. A. Merra, Leveraging content-style item representation for visual recommendation, in: ECIR (2), volume 13186 of Lecture Notes in Computer Science, Springer, 2022, pp. 84–92.
[42] S. Rendle, C. Freudenthaler, Z. Gantner, L. Schmidt-Thieme, BPR: Bayesian personalized ranking from implicit feedback, in: UAI, AUAI Press, 2009, pp. 452–461.
[43] X. Chen, H. Chen, H. Xu, Y. Zhang, Y. Cao, Z. Qin, H. Zha, Personalized fashion recommendation with visual explanations based on multimodal attention network: Towards visually explainable recommendation, in: SIGIR, ACM, 2019, pp. 765–774.
[44] Z. Wang, W. Ye, X. Chen, W. Zhang, Z. Wang, L. Zou, W. Liu, Generative session-based recommendation, in: WWW, ACM, 2022, pp. 2227–2235.
[45] J. Bergstra, R. Bardenet, Y. Bengio, B. Kégl, Algorithms for hyper-parameter optimization, in: NIPS, 2011, pp. 2546–2554.
[46] V. W. Anelli, A. Bellogín, A. Ferrara, D. Malitesta, F. A. Merra, C. Pomo, F. M. Donini, T. D. Noia, Elliot: A comprehensive and rigorous framework for reproducible recommender systems evaluation, in: SIGIR, ACM, 2021, pp. 2405–2414.
[47] S. Vargas, Novelty and diversity enhancement and evaluation in recommender systems and information retrieval, in: SIGIR, ACM, 2014, p. 1281.
[48] S. Vargas, P. Castells, Rank and relevance in novelty and diversity metrics for recommender systems, in: RecSys, ACM, 2011, pp. 109–116.
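As a companion to the evaluation protocol described above, the two headline accuracy metrics admit compact per-user definitions. The sketch below is illustrative only (binary relevance; it is not the Elliot implementation used in the experiments, and the function names are ours):

```python
import math

def recall_at_k(ranked_items, relevant_items, k):
    """Fraction of the user's held-out relevant items retrieved in the top-k list."""
    if not relevant_items:
        return 0.0
    hits = len(set(ranked_items[:k]) & set(relevant_items))
    return hits / len(relevant_items)

def ndcg_at_k(ranked_items, relevant_items, k):
    """Binary-relevance nDCG: discounted gain of hits, normalized by the ideal ranking."""
    relevant = set(relevant_items)
    dcg = sum(1.0 / math.log2(pos + 2)
              for pos, item in enumerate(ranked_items[:k]) if item in relevant)
    idcg = sum(1.0 / math.log2(pos + 2)
               for pos in range(min(k, len(relevant))))
    return dcg / idcg if idcg > 0 else 0.0
```

Per-user scores are then averaged over all test users; computing significance at the user level, as reported for Tables 2 and 3, amounts to a paired test between these per-user score vectors for EGCF and each baseline.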