=Paper=
{{Paper
|id=Vol-1905/recsys2017_poster4
|storemode=property
|title=Explanation Chains: Recommendations by Explanation
|pdfUrl=https://ceur-ws.org/Vol-1905/recsys2017_poster4.pdf
|volume=Vol-1905
|authors=Arpit Rana,Derek Bridge
|dblpUrl=https://dblp.org/rec/conf/recsys/RanaB17
}}
==Explanation Chains: Recommendations by Explanation==
Explanation Chains: Recommendation by Explanation

Arpit Rana and Derek Bridge
Insight Centre for Data Analytics, University College Cork, Ireland
arpit.rana@insight-centre.org, derek.bridge@insight-centre.org

ABSTRACT

Given a set of candidate items, Recommendation by Explanation constructs a justification for recommending each item, in the form of what we call an Explanation Chain, and then recommends those candidates that have the best explanations. By unifying recommendation and explanation, this approach enables us to find relevant recommendations with explanations that have a high degree of both fidelity and interpretability. Experimental results on a movie recommendation dataset show that our approach also provides sets of recommendations that have a high degree of serendipity, low popularity-bias and high diversity.

KEYWORDS

Explanation, Fidelity, Interpretability

ACM Reference format:
Arpit Rana and Derek Bridge. 2017. Explanation Chains: Recommendation by Explanation. RecSys '17 Poster Proceedings, Como, Italy, August 27-31, 2017, 2 pages.

[Figure 1: An Explanation Chain. The diagram shows a chain of positively-rated movies from the user's past preferences, each represented by keywords such as star-crossed-lovers, romantic-rivalry, fiance-fiancee-relationship, secret-mission, shooting, parachute, u.s.-army, secret-love, broken-engagement and carnival, linked by shared keywords to the candidate item.]

1 INTRODUCTION
Recommender systems provide explanations to help the end-user understand the rationale for a recommendation and to help them make a decision [2]. Conventionally, computing recommendations and generating corresponding explanations are considered as two separate, sequential processes. This affords the recommender the freedom to include in the explanation information different from that which it used to compute the recommendation [1]. For example, in [4], a recommendation generated by matrix factorization is explained using topic models mined from textual data associated with items. Such differences are one cause of low fidelity between the recommender and its explanations.

In this paper, we seek to achieve a higher degree of fidelity between the explanations and the operation of the recommender system, without compromising the interpretability of the explanations or the quality of the recommendations. For this, we use what we call explanation chains. Figure 1 shows an example of an explanation chain in the movie domain. The last item in the diagram (in this case, The Notebook), which we do not regard as part of the chain, is the candidate for recommendation to the user, and will typically not already be in the user's profile. The other items in the diagram (Big Fish, Pearl Harbour and The Illusionist) form the chain. They are drawn from positively-rated items in the user's profile and are intended to support recommendation of the candidate item. Each movie is represented as a set of keywords. Pairs of successive items in a chain satisfy a local constraint in the form of a similarity threshold; additionally, each item in the chain satisfies a global constraint in the form of a threshold on the level of coverage it contributes towards features of the candidate item. For example, Big Fish has the keywords secret-mission and parachute in common with Pearl Harbour, as well as the keyword romantic-rivalry in common with The Notebook.

There is previous work in which there is a more intimate connection between recommendation and explanation, e.g. [3, 5]. In [3], for example, recommendations are re-ranked by the strength of their explanations, so that items with more compelling explanations are recommended first. However, these approaches still compute recommendations and explanations separately, which is what makes Recommendation by Explanation a unique development.

2 APPROACH

Recommendation by Explanation is a novel approach that unifies recommendation and explanation: it computes recommendations by generating and ranking corresponding personalized explanations in the form of explanation chains. Recommendation is modelled as a path-finding problem in the item-item similarity graph. Once a chain has been constructed for each candidate item, the top-n chains are selected iteratively based on their total coverage of the candidate item's features and their dissimilarity to other chains in the top-n. We describe our approach in more detail in the following subsections.

2.1 Generating Explanation Chains

Given a candidate item, Recommendation by Explanation works backwards to construct a chain: starting with the candidate item, it finds predecessors, greedily selects one, finds its predecessors, selects one; and so on. The predecessors of an item are all its neighbours in the item-item similarity graph that satisfy four conditions: (a) they are positively-rated members of the user's profile; (b) they are not already in this chain; (c) their similarity to the item exceeds a similarity threshold; and (d) their reward (see below) exceeds a marginal gain threshold. When there are no predecessors, the chain is complete.
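The backward, greedy construction described above can be sketched as follows. This is an illustrative reading of the paper's procedure, not the authors' code: the function names, the use of Jaccard similarity over keyword sets (as in the paper's experiments), and the pluggable `reward` callable are assumptions; the default thresholds are the values the paper reports from grid-search.

```python
from typing import Callable, Dict, List, Set

def jaccard(a: Set[str], b: Set[str]) -> float:
    # Jaccard similarity between two keyword sets.
    return len(a & b) / len(a | b) if a | b else 0.0

def build_chain(candidate: str,
                features: Dict[str, Set[str]],
                liked: Set[str],  # positively-rated items in the user's profile
                reward: Callable[[str, str, List[str]], float],
                sim_threshold: float = 0.05,
                gain_threshold: float = 0.17) -> List[str]:
    """Work backwards from the candidate, greedily extending the chain."""
    chain: List[str] = []
    current = candidate
    while True:
        # Predecessors: liked items, not already in the chain, similar
        # enough to the current item, with enough marginal reward.
        preds = [j for j in liked
                 if j not in chain
                 and jaccard(features[j], features[current]) > sim_threshold
                 and reward(j, candidate, chain) > gain_threshold]
        if not preds:
            return chain  # no predecessors: the chain is complete
        best = max(preds, key=lambda j: reward(j, candidate, chain))
        chain.append(best)  # greedily select the highest-reward predecessor
        current = best
```

Any reward function with the signature `reward(j, candidate, chain)` can be plugged in; the paper's own definition is given by Equation (1) below.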
At each step in this process, the predecessor that gets selected is the one with the highest reward. The reward rwd(j, i, C) of adding predecessor j to partial chain C that explains candidate item i is given by:

    rwd(j, i, C) = |(f_j \ covered(i, C)) ∩ f_i| / |f_i| + |(f_j \ covered(i, C)) ∩ f_i| / |f_j|    (1)

Here f_i denotes the features of item i, and covered(i, C) is the set of features of candidate i that are already covered by members of the chain C, i.e. covered(i, C) = ∪_{j' ∈ C} (f_{j'} ∩ f_i). The first term in the definition of rwd(j, i, C) measures j's coverage (with respect to the size of f_i) of features of i that are not yet covered by the chain. The second term in the definition measures the same but with respect to the size of f_j, and therefore assures j's fitness to explain the candidate by penalizing items that have high coverage simply by virtue of having more features.
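Equation (1) translates directly into set operations over keyword sets. A minimal sketch, assuming items are represented as a dict of keyword sets (the function and variable names are illustrative):

```python
from typing import Dict, List, Set

def covered(i: str, chain: List[str], features: Dict[str, Set[str]]) -> Set[str]:
    # Features of candidate i already covered by members of the chain.
    out: Set[str] = set()
    for j in chain:
        out |= features[j] & features[i]
    return out

def rwd(j: str, i: str, chain: List[str], features: Dict[str, Set[str]]) -> float:
    # Features of i that predecessor j newly covers: (f_j \ covered) ∩ f_i.
    new = (features[j] - covered(i, chain, features)) & features[i]
    # First term: coverage w.r.t. |f_i|. Second term: the same w.r.t. |f_j|,
    # penalizing items that cover a lot only by having many features.
    return len(new) / len(features[i]) + len(new) / len(features[j])
```

For example, an item sharing two of a candidate's four keywords, with no overlap yet covered by the chain and four keywords of its own, scores 2/4 + 2/4 = 1.0.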
2.2 Evaluating Explanation Chains

After constructing a chain C for each candidate item i, we must select the top-n chains so that we can recommend n items to the user, along with their explanations. This is done iteratively, based on a chain's total coverage of the candidate item's features and its dissimilarity to other chains already included in the top-n. Specifically, we score ⟨C, i⟩ relative to C*, the set of all the items that appear in already-selected chains, using the following:

    score(⟨C, i⟩, C*) = (Σ_{j ∈ C} rwd(j, i, C)) / (|C| + 1) + |C \ C*| / (|C| + 1)    (2)

Here, the first term is the sum of the rewards of the items in the chain, normalized by its length including the candidate item i. The second term penalizes a chain if its members are also members of already-selected chains, and hence encourages the final recommendation list to cover as many positively-rated items in the user's profile as possible. In effect, the latter reduces popularity-bias in the chains and diversifies the recommendation list. (Note that the second term is about coverage of items that appear in already-selected chains, not their features.)
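The iterative selection of Equation (2) can be sketched as below. This is an illustrative reading, not the authors' implementation: `chains` maps each candidate item to its chain, and `chain_reward` is assumed to hold the precomputed sum Σ_j rwd(j, i, C) for each candidate; both names are hypothetical.

```python
from typing import Dict, List, Set

def select_top_n(chains: Dict[str, List[str]],
                 chain_reward: Dict[str, float],
                 n: int) -> List[str]:
    """Iteratively pick the n candidates whose chains score highest."""
    selected: List[str] = []
    used: Set[str] = set()  # C*: items appearing in already-selected chains
    remaining = set(chains)
    while remaining and len(selected) < n:
        def score(i: str) -> float:
            c = chains[i]
            novelty = len(set(c) - used)  # |C \ C*|
            # Both terms of Equation (2) share the |C| + 1 denominator.
            return (chain_reward[i] + novelty) / (len(c) + 1)
        best = max(remaining, key=score)
        selected.append(best)
        used |= set(chains[best])
        remaining.remove(best)
    return selected
```

Because `used` grows after each pick, a chain that reuses items from already-selected chains loses its novelty term on later iterations, which is what diversifies the final list.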
3 EXPERIMENTAL RESULTS

We performed off-line experiments on the hetrec2011-movielens-2k dataset (https://grouplens.org/datasets/hetrec-2011/), augmented by movie keywords from IMDb (http://www.imdb.com). The dataset comprises 2113 users, 5410 movies and over half a million keywords. We represented each movie as a set of all of its keywords and measured the similarity between movies using Jaccard similarity. We compared our approach (r-by-e) with other recommenders that make use of the same keyword data (CB-7 and CB-|C|) and with a random recommender (RM). CB-7 is a classic content-based model with the number of neighbours set to 7, and CB-|C| is a dynamic version of the content-based system with the number of neighbours equal to the length of the corresponding explanation chain. In r-by-e, explanation chains are generated with a similarity threshold of 0.05 and a marginal gain threshold of 0.17, which were set by a grid-search.

Experimental results are presented in Figure 2. It is clear that the proposed approach outperforms the other methods for precision while still achieving high levels of diversity, surprise and novelty.

[Figure 2: Performance comparison of r-by-e, CB-7, CB-|C| and RM in terms of Precision at 10, Diversity, Surprise, Novelty and Catalogue Coverage.]

4 CONCLUSIONS

Recommendation by Explanation unifies recommendation and explanation, providing high quality recommendations with corresponding explanations that have high fidelity and interpretability. In the future, we will carry out experiments using keyword weighting and filtering, and experiments in which we lower the thresholds to see how this results in looser connections and longer chains, in the expectation of even less obvious recommendations. We have also built a web-based recommender with which to evaluate our system with real users.

ACKNOWLEDGMENTS

This publication has emanated from research supported in part by a research grant from Science Foundation Ireland (SFI) under Grant Number SFI/12/RC/2289.

REFERENCES

[1] Behnoush Abdollahi and Olfa Nasraoui. 2016. Explainable Matrix Factorization for Collaborative Filtering. In Proceedings of the 25th International Conference Companion on World Wide Web. International World Wide Web Conferences Steering Committee, 5-6.
[2] Mustafa Bilgic and Raymond J. Mooney. 2005. Explaining Recommendations: Satisfaction vs. Promotion. In Beyond Personalization Workshop, IUI, Vol. 5. 153.
[3] Khalil Muhammad, Aonghus Lawlor, Rachael Rafter, and Barry Smyth. 2015. Great Explanations: Opinionated Explanations for Recommendations. In International Conference on Case-Based Reasoning. Springer, 244-258.
[4] Marco Rossetti, Fabio Stella, and Markus Zanker. 2013. Towards Explaining Latent Factors with Topic Models in Collaborative Recommender Systems. In Database and Expert Systems Applications (DEXA), 2013 24th International Workshop on. IEEE, 162-167.
[5] Panagiotis Symeonidis, Alexandros Nanopoulos, and Yannis Manolopoulos. 2008. Providing Justifications in Recommender Systems. IEEE Transactions on Systems, Man, and Cybernetics-Part A: Systems and Humans 38, 6 (2008), 1262-1272.