Explanation Chains: Recommendation by Explanation

Arpit Rana
Insight Centre for Data Analytics
University College Cork, Ireland
arpit.rana@insight-centre.org

Derek Bridge
Insight Centre for Data Analytics
University College Cork, Ireland
derek.bridge@insight-centre.org

ABSTRACT
Given a set of candidate items, Recommendation by Explanation constructs a justification for recommending each item, in the form of what we call an Explanation Chain, and then recommends those candidates that have the best explanations. By unifying recommendation and explanation, this approach enables us to find relevant recommendations with explanations that have a high degree of both fidelity and interpretability. Experimental results on a movie recommendation dataset show that our approach also provides sets of recommendations that have a high degree of serendipity, low popularity-bias and high diversity.

KEYWORDS
Explanation, Fidelity, Interpretability

ACM Reference format:
Arpit Rana and Derek Bridge. 2017. Explanation Chains: Recommendation by Explanation. RecSys '17 Poster Proceedings, Como, Italy, August 27–31, 2017, 2 pages.

1 INTRODUCTION
Recommender systems provide explanations to help the end-user understand the rationale for a recommendation and to help them make a decision [2]. Conventionally, computing recommendations and generating corresponding explanations are considered as two separate, sequential processes. This affords the recommender the freedom to include in the explanation information different from that which it used to compute the recommendation [1]. For example, in [4], a recommendation generated by matrix factorization is explained using topic models mined from textual data associated with the items. Such differences are one cause of low fidelity between the recommender and its explanations.

In this paper, we seek to achieve a higher degree of fidelity between the explanations and the operation of the recommender system, without compromising the interpretability of the explanations or the quality of the recommendations. For this, we use what we call explanation chains. Figure 1 shows an example of an explanation chain in the movie domain. The last item in the diagram (in this case, The Notebook), which we do not regard as part of the chain, is the candidate for recommendation to the user, and will typically not already be in the user's profile. The other items in the diagram (Big Fish, Pearl Harbour and The Illusionist) form the chain. They are drawn from positively-rated items in the user's profile and are intended to support recommendation of the candidate item. Each movie is represented as a set of keywords. Pairs of successive items in a chain satisfy a local constraint in the form of a similarity threshold; additionally, each item in the chain satisfies a global constraint in the form of a threshold on the level of coverage it contributes towards features of the candidate item. For example, Big Fish has the keywords secret-mission and parachute in common with Pearl Harbour, as well as the keyword romantic-rivalry in common with The Notebook.

[Figure 1: An Explanation Chain. Three positively-rated movies from the user's past preferences (Big Fish, Pearl Harbor and The Illusionist), each represented as a set of keywords, form a chain supporting the candidate item The Notebook.]

There is previous work in which there is a more intimate connection between recommendation and explanation, e.g. [3, 5]. In [3], for example, recommendations are re-ranked by the strength of their explanations, so that items with more compelling explanations are recommended first. However, these approaches still compute recommendations and explanations separately, which is what makes Recommendation by Explanation a unique development.
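To make the chain structure in Figure 1 concrete, the sketch below (ours, not the authors' code) checks the local similarity constraint between successive chain items and prints the coverage each item contributes to the candidate. The keyword sets are abbreviated from the figure, Jaccard similarity is assumed (as used in Section 3), and the data structures are purely illustrative.

```python
# Illustrative sketch of the chain in Figure 1 (keyword sets abbreviated from the figure).
def jaccard(a, b):
    """Jaccard similarity between two keyword sets."""
    return len(a & b) / len(a | b)

movies = {
    "Big Fish":        {"romantic-rivalry", "carnival", "secret-mission", "parachute"},
    "Pearl Harbor":    {"fiance-fiancee-relationship", "shooting", "secret-mission",
                        "volunteer", "u.s.-army", "parachute"},
    "The Illusionist": {"fiance-fiancee-relationship", "shooting", "secret-love",
                        "broken-engagement", "star-crossed-lovers"},
    "The Notebook":    {"star-crossed-lovers", "secret-love", "broken-engagement",
                        "volunteer", "u.s.-army", "romantic-rivalry", "self-discovery"},
}

chain = ["Big Fish", "Pearl Harbor", "The Illusionist"]   # positively-rated items from the profile
candidate = "The Notebook"                                # the item being recommended

# Local constraint: successive items in the chain must be sufficiently similar.
for a, b in zip(chain, chain[1:]):
    print(f"sim({a}, {b}) = {jaccard(movies[a], movies[b]):.2f}")

# Global constraint: each chain item must cover enough of the candidate's keywords.
for m in chain:
    print(f"{m} covers {sorted(movies[m] & movies[candidate])} of {candidate}'s keywords")
```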


2 APPROACH
Recommendation by Explanation is a novel approach that unifies recommendation and explanation: it computes recommendations by generating and ranking corresponding personalized explanations in the form of explanation chains. Recommendation is modelled as a path-finding problem in the item-item similarity graph. Once a chain has been constructed for each candidate item, the top-n chains are selected iteratively based on their total coverage of the candidate item's features and their dissimilarity to other chains in the top-n. We describe our approach in more detail in the following subsections.

2.1 Generating Explanation Chains
Given a candidate item, Recommendation by Explanation works backwards to construct a chain: starting with the candidate item, it finds predecessors, greedily selects one, finds its predecessors, selects one; and so on. The predecessors of an item are all its neighbours in the item-item similarity graph that satisfy four conditions: (a) they are positively-rated members of the user's profile; (b) they are not already in this chain; (c) their similarity to the item exceeds a similarity threshold; and (d) their reward (see below) exceeds a marginal gain threshold. When there are no predecessors, the chain is complete.

At each step in this process, the predecessor that gets selected is the one with the highest reward. The reward rwd(j, i, C) of adding predecessor j to partial chain C that explains candidate item i is given by:

    rwd(j, i, C) = \frac{|(f_j \setminus covered(i, C)) \cap f_i|}{|f_i|} + \frac{|(f_j \setminus covered(i, C)) \cap f_i|}{|f_j|}    (1)

Here f_i denotes the features of item i and covered(i, C) is the set of features of candidate i that are already covered by members of the chain C, i.e. covered(i, C) = \bigcup_{j' \in C} f_{j'} \cap f_i. Then the first term in the definition of rwd(j, i, C) measures j's coverage (with respect to the size of f_i) of features of i that are not yet covered by the chain. The second term in the definition measures the same but with respect to the size of f_j, and therefore assures j's fitness to explain the candidate by penalizing items that have high coverage simply by virtue of having more features.
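The following is a minimal sketch of this greedy, backwards construction, under some assumptions of ours: features maps each item to its keyword set, sim is an item-item similarity function, profile is the set of the user's positively-rated items, and the helper names (covered, reward, build_chain) are our own, not the authors'. The default thresholds are the values reported in Section 3.

```python
# Sketch of greedy backwards chain construction (Section 2.1). Assumed inputs:
#   features[i] -> set of keywords of item i
#   sim(a, b)   -> item-item similarity (e.g. Jaccard over keyword sets)
#   profile     -> set of items the user has rated positively

def covered(i, chain, features):
    """Features of candidate i already covered by members of the chain."""
    return set().union(*(features[j] for j in chain)) & features[i]

def reward(j, i, chain, features):
    """Equation (1): marginal coverage of i's features contributed by j."""
    new = (features[j] - covered(i, chain, features)) & features[i]
    return len(new) / len(features[i]) + len(new) / len(features[j])

def build_chain(i, profile, features, sim, sim_thresh=0.05, gain_thresh=0.17):
    """Work backwards from candidate i, greedily adding the highest-reward predecessor."""
    chain, current = [], i
    while True:
        predecessors = [
            j for j in profile                                 # (a) positively-rated profile member
            if j not in chain                                  # (b) not already in this chain
            and sim(j, current) > sim_thresh                   # (c) similar enough to the current item
            and reward(j, i, chain, features) > gain_thresh    # (d) sufficient marginal gain
        ]
        if not predecessors:                                   # no predecessors: the chain is complete
            return chain
        current = max(predecessors, key=lambda j: reward(j, i, chain, features))
        chain.append(current)
```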
2.2 Evaluating Explanation Chains
After constructing a chain C for each candidate item i, we must select the top-n chains so that we can recommend n items to the user, along with their explanations. This is done iteratively, based on a chain's total coverage of the candidate item's features and its dissimilarity to other chains already included in the top-n. Specifically, we score ⟨C, i⟩ relative to a list of all the items that appear in already-selected chains, C*, using the following:

    score(⟨C, i⟩, C^*) = \frac{\sum_{j \in C} rwd(j, i, C)}{|C| + 1} + \frac{|C \setminus \bigcup_{j' \in C^*} j'|}{|C| + 1}    (2)

Here, the first term is the sum of the rewards of the items in the chain, normalized by its length including the candidate item i. The second term penalizes a chain if its members are also members of already-selected chains and hence encourages the final recommendation list to cover as many positively-rated items in the user's profile as possible. In effect, the latter reduces popularity-bias in the chains and diversifies the recommendation list. (Note that the second term is about coverage of items that appear in already-selected chains, not their features.)
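Below is a sketch of this selection step, reusing the hypothetical reward() and build_chain() helpers from the previous sketch. The scoring follows Equation (2), with the already-selected chains passed in as a plain list; the names are ours and the control flow is simplified.

```python
# Sketch of scoring and selecting the top-n chains (Section 2.2).
# Reuses reward() and build_chain() from the previous sketch.

def score(chain, i, selected_chains, features):
    """Equation (2): normalized total reward plus dissimilarity to selected chains."""
    already = set().union(*(set(c) for c in selected_chains))  # items used by selected chains
    total_reward = sum(reward(j, i, chain, features) for j in chain)
    unused = len(set(chain) - already)                         # chain members not yet used
    return (total_reward + unused) / (len(chain) + 1)          # both terms share the |C|+1 denominator

def recommend(candidates, profile, features, sim, n=10):
    """Build one chain per candidate, then iteratively pick the n best-scoring chains."""
    chains = {i: build_chain(i, profile, features, sim) for i in candidates}
    selected, remaining = [], set(candidates)
    while remaining and len(selected) < n:
        best = max(remaining, key=lambda i: score(chains[i], i, [c for c, _ in selected], features))
        selected.append((chains[best], best))                  # recommendation with its explanation
        remaining.remove(best)
    return selected
```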
3 EXPERIMENTAL RESULTS
We performed off-line experiments on the hetrec2011-movielens-2k dataset^1 augmented by movie keywords from IMDb^2. The dataset comprises 2113 users, 5410 movies and over half a million keywords. We represented each movie as the set of all of its keywords and measured the similarity between movies using Jaccard similarity. We compared our approach (r-by-e) with other recommenders that make use of the same keyword data (CB-7 and CB-|C|) and a random recommender (RM). CB-7 is a classic content-based model with the number of neighbours set to 7, and CB-|C| is a dynamic version of the content-based system in which the number of neighbours is the length of the corresponding explanation chain. In r-by-e, explanation chains are generated with a similarity threshold of 0.05 and a marginal gain threshold of 0.17, which were set by a grid-search.

^1 https://grouplens.org/datasets/hetrec-2011/
^2 http://www.imdb.com

Experimental results are presented in Figure 2. It is clear that the proposed approach outperforms the other methods for precision while still achieving high levels of diversity, surprise and novelty.

[Figure 2: Performance comparison in terms of Precision, Diversity, Surprise, Novelty and Catalogue Coverage. Panels report Precision at 10 together with Diversity (%), Surprise (%), Novelty (%) and Catalogue Coverage (%) for r-by-e, CB-7, CB-|C| and RM.]
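As a sketch of the setup described at the start of this section (ours, with hypothetical names): given a keywords mapping from each movie to its IMDb keyword set, the thresholded item-item similarity graph used for chain construction could be built as follows.

```python
from itertools import combinations

# Sketch of the experimental setup: movies as keyword sets, Jaccard similarity,
# and an item-item graph keeping only edges above the similarity threshold.
# `keywords` (movie -> set of IMDb keywords) is assumed to be loaded already.

def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def similarity_graph(keywords, threshold=0.05):
    """Item-item similarity graph as {movie: {neighbour: similarity}}."""
    graph = {m: {} for m in keywords}
    for a, b in combinations(keywords, 2):
        s = jaccard(keywords[a], keywords[b])
        if s > threshold:
            graph[a][b] = graph[b][a] = s
    return graph
```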
4 CONCLUSIONS
Recommendation by Explanation unifies recommendation and explanation, providing high-quality recommendations with corresponding explanations that have high fidelity and interpretability. In the future, we will carry out experiments using keyword weighting and filtering, and experiments in which we lower the thresholds to see how this results in looser connections and longer chains, in the expectation of even less obvious recommendations. We have also built a web-based recommender with which to evaluate our system with real users.

ACKNOWLEDGMENTS
This publication has emanated from research supported in part by a research grant from Science Foundation Ireland (SFI) under Grant Number SFI/12/RC/2289.

REFERENCES
[1] Behnoush Abdollahi and Olfa Nasraoui. 2016. Explainable Matrix Factorization for Collaborative Filtering. In Proceedings of the 25th International Conference Companion on World Wide Web. International World Wide Web Conferences Steering Committee, 5–6.
[2] Mustafa Bilgic and Raymond J. Mooney. 2005. Explaining recommendations: Satisfaction vs. promotion. In Beyond Personalization Workshop, IUI, Vol. 5. 153.
[3] Khalil Muhammad, Aonghus Lawlor, Rachael Rafter, and Barry Smyth. 2015. Great explanations: Opinionated explanations for recommendations. In International Conference on Case-Based Reasoning. Springer, 244–258.
[4] Marco Rossetti, Fabio Stella, and Markus Zanker. 2013. Towards explaining latent factors with topic models in collaborative recommender systems. In Database and Expert Systems Applications (DEXA), 2013 24th International Workshop on. IEEE, 162–167.
[5] Panagiotis Symeonidis, Alexandros Nanopoulos, and Yannis Manolopoulos. 2008. Providing justifications in recommender systems. IEEE Transactions on Systems, Man, and Cybernetics-Part A: Systems and Humans 38, 6 (2008), 1262–1272.