=Paper=
{{Paper
|id=Vol-1905/recsys2017_poster1
|storemode=property
|title=Intent-Aware Diversification using Item-Based SubProfiles
|pdfUrl=https://ceur-ws.org/Vol-1905/recsys2017_poster1.pdf
|volume=Vol-1905
|authors=Mesut Kaya,Derek Bridge
|dblpUrl=https://dblp.org/rec/conf/recsys/KayaB17
}}
==Intent-Aware Diversification using Item-Based SubProfiles==
<pdf width="1500px">https://ceur-ws.org/Vol-1905/recsys2017_poster1.pdf</pdf>
<pre>
        Intent-Aware Diversification using Item-Based SubProfiles
                                    Mesut Kaya                                                     Derek Bridge
                      Insight Centre for Data Analytics                                  Insight Centre for Data Analytics
                       University College Cork, Ireland                                   University College Cork, Ireland
                       mesut.kaya@insight-centre.org                                     derek.bridge@insight-centre.org

ABSTRACT                                                                       Item aspects, such as genres, do not necessarily fully represent a
In many approaches to recommendation diversification, a recom-              user’s tastes or interests and are not available in every recommen-
mender scores items for relevance and then re-ranks them to bal-            dation domain. Hence, in this work, we propose a new intent-aware
ance relevance with diversity. In intent-aware diversification, di-         diversification framework based on user subprofiles, rather than
versity is formulated in terms of coverage of aspects, where aspects        item features. A subprofile is simply a subset of the items in a user’s
are either explicit such as movie genres or implicit such as the la-        profile, each such subprofile representing one of the user’s distinct
tent factors found during matrix factorization. Typically, the same         tastes. We detect a user’s subprofile by adapting DAMIB-COVER, a
set of aspects is used across all users. In this paper, we propose a        method designed for top-n recommendation to shared accounts [5].
form of personalized intent-aware diversification, which we call            Unlike the aspects used in earlier work, which are global across
SPAD (SubProfile-Aware Diversification). The aspects we use in              the set of users, subprofiles differ from user to user, making for
SPAD are subprofiles of the user’s profile. They are not defined in         a more personalized form of diversification. We refer to our new
terms of explicit or implicit features. We compare SPAD to other            framework as SubProfile-Aware Diversification (SPAD).
forms of intent-aware diversification. We present empirical results
in support of SPAD.                                                         2    RECOMMENDATION DIVERSITY
                                                                            The dominant approach to diversification is greedy re-ranking,
KEYWORDS                                                                    in which sets of recommendations RS for a user u are re-ranked
Diversity; intent-aware; subprofiles.                                       by considering the marginal contribution that would be made by
                                                                            adding an item i to the result set RL. The marginal contribution
ACM Reference format:                                                       is measured by an objective function f_{obj}(i, RL) which is typically
Mesut Kaya and Derek Bridge. 2017. Intent-Aware Diversification using       a linear combination of the item’s relevance score s (u, i) and the
Item-Based SubProfiles. In Proceedings of RecSys 2017 Poster Proceedings,   marginal contribution item i makes to the diversity of RL, div(i, RL),
Como, Italy, August 27–31, 2017, 2 pages.                                   the trade-off between the two being controlled by a parameter λ
                                                                            (0 ≤ λ ≤ 1):

1    INTRODUCTION                                                                         f_{obj}(i, RL) = (1 - \lambda)\, s(u, i) + \lambda\, div(i, RL)          (1)
It has long been recognized that it is not enough for recommen-             In early work, the diversity div(i, RL) is computed as the average
dations to be accurate or relevant. In many domains, recommen-              (or sum) of the all-pairs intra-list distances (ILD) of the items in RL.
dations must be novel to the user or serendipitous, and a set of            The assumption in this early work is that a set of items that are
recommendations must be diverse. Diversity is one response to un-           dissimilar to each other is more likely to contain one or more items
certainty. A recommender cannot be certain of a user’s short-term           that satisfy the user’s current needs or interests, but there is nothing
or longer-term interests, both because some user profiles are small         in the operation of the system to explicitly ensure this. More recent
and some, while they may not be so small, will contain preferences          approaches, going under the name intent-aware diversification, seek
over different kinds of items. In the face of uncertainty, a diverse        to select items that explicitly address different user interests.
set of recommendations is more likely to contain one or more items             Intent-aware diversification methods assume a set of aspects A
that will satisfy the user.                                                 which describe the items and for which user interests can be
    In many approaches to recommendation diversification, a                 estimated. The aspects might be explicit, such as genres (e.g. comedy)
recommender scores items for relevance and then re-ranks them to            in a movie recommender. Alternatively, aspects might be implicit, e.g.
balance relevance with diversity. In intent-aware diversification           corresponding to the latent factors found by a matrix factorization
[3], the idea is that the re-ranked recommendations should cover            recommender system.
the different tastes or interests revealed by the user’s profile. The          User u’s interests can be formulated as a probability distribution
most common way to characterize a user’s tastes is as a probability         p(a|u) for aspects a ∈ A. The probability of choosing an item i
distribution over so-called aspects of the items in the user’s profile.     from the set of recommendations RS given an aspect a of user u is
Aspects are usually either explicit features such as movie genres           denoted by p(i |u, a). In the Query Aspect Diversification framework
or implicit features such as the latent factors found during matrix         (xQuAD) [2, 4], diversification can be achieved by re-ranking a
factorization. Hence, typically, the same set of aspects is used across     conventional recommender’s recommendation set as in Equation (1)
all users; only the probabilities vary across users.                        but with div(i, RL) = nov_{xQuAD}(i, RL) defined as:
RecSys 2017 Poster Proceedings, Como, Italy                                     nov_{xQuAD}(i, RL) = \sum_{a \in A} p(a|u)\, p(i|u, a) \prod_{j \in RL} (1 - p(j|u, a))   (2)
2017.
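
Equations (1) and (2) can be read together as one greedy loop. The following is a minimal Python sketch, not the authors' implementation. The function names and the dictionary layout (`p_aspect` for p(a|u), `p_item[a]` for p(i|u, a)) are assumptions made for illustration, as are the toy items and probabilities.

```python
from math import prod


def nov_xquad(i, selected, p_aspect, p_item):
    """Equation (2): sum over aspects a of
    p(a|u) * p(i|u,a) * product over j in RL of (1 - p(j|u,a))."""
    return sum(
        p_a
        * p_item[a].get(i, 0.0)
        * prod(1.0 - p_item[a].get(j, 0.0) for j in selected)
        for a, p_a in p_aspect.items()
    )


def greedy_rerank(candidates, relevance, p_aspect, p_item, lam=0.5, n=10):
    """Greedily build RL from RS, picking at each step the item that
    maximizes f_obj of Equation (1)."""
    remaining = list(candidates)
    selected = []
    while remaining and len(selected) < n:
        best = max(
            remaining,
            key=lambda i: (1.0 - lam) * relevance[i]
            + lam * nov_xquad(i, selected, p_aspect, p_item),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (hypothetical): two aspects, three candidate items.
relevance = {"a": 0.9, "b": 0.8, "c": 0.5}
p_aspect = {"comedy": 0.5, "drama": 0.5}
p_item = {"comedy": {"a": 0.9, "b": 0.8}, "drama": {"c": 0.8}}

print(greedy_rerank(["a", "b", "c"], relevance, p_aspect, p_item, lam=0.5, n=3))
# ['a', 'c', 'b']
```

With λ = 0 the loop reduces to sorting by relevance. With λ = 0.5, item c overtakes b in this toy example because a has already covered the comedy aspect.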

   What characterizes the work on intent-aware diversification in           [Figure 1 plot: panels MF (left) and pLSA (right); top row shows prec and
recommender systems that we have described so far is the use of             bottom row shows αnDCG, each against λ (ticks at 0.25, 0.50, 0.75, 1.00);
a global set of aspects. In our work, we infer the aspects from the         series: cplsa, MMR, RxQuAD, SPAD, xQuAD.]
user’s profile, making them personalized: the aspects for one user
need not be the same for another.                                                              Figure 1: Results for MovieLens dataset.

3    SUBPROFILE AWARE DIVERSITY
In this section, we explain our new approach to diversification in
recommender systems, which we call SubProfile Aware Diversification
(SPAD). It is a greedy re-ranking approach; it is intent-aware;
but it is also personalized, based on identifiable subprofiles within
the user’s profile.
    Let I be the set of all items. Subprofile detection works on
positively-rated items in the user’s profile. In the case of positive-
only feedback, user u’s profile, Iu ⊆ I, is the set of items she has
interacted with (liked, clicked on, purchased, etc.). In the case of
explicit ratings rui (e.g. 1-5 stars), Iu must be defined in terms
of items the user liked, which will usually involve thresholding the           In Figure 1, we plot precision (an accuracy metric) and α-nDCG
ratings; in our experiments, we use Iu = {i | rui ≥ 4}. A user’s            (a diversity metric) for different values of λ, which controls the
subprofiles are subsets of Iu .                                             amount of diversification. Notice that α-nDCG measures diversity
    Our approach to detecting user subprofiles is based on a method         with respect to the explicit features F (the meta-data). It therefore
for recommending to shared accounts, called DAMIB-COVER [5].                may favour recommenders that re-rank using those features. Our
DAMIB-COVER identifies different tastes within the profile of a             new method, SPAD, makes no use of the features and so it is at a
shared account (which it assumes come from the different users              disadvantage in these experiments.
who share that account) and recommends items to satisfy each taste.            For both baseline algorithms (MF and pLSA), SPAD has the high-
We adapt DAMIB-COVER to take in the profile for a single-user               est precision. For the pLSA baseline, SPAD also has the highest
account u and to extract the different subprofiles Su that correspond       diversity; for the MF baseline, SPAD’s diversity is competitive with
to the different tastes of that user.                                       the other re-ranking algorithms despite being at a disadvantage as
    In the work on intent-aware diversification that we described           mentioned earlier.
earlier, the same set of aspects A was used for all users. In SPAD,            We plan to further explore the effectiveness of SPAD on other
aspects are user-specific: user u has a set of aspects Au. And, in          datasets, and with more baseline algorithms. We also plan to
the earlier work, aspects were often based on explicit features F,          develop other subprofile detection methods instead of using DAMIB-
i.e. A = F . In SPAD, aspects are user subprofiles, i.e. Au = Su .          COVER. We will also explore the interpretability of SPAD’s recom-
Each subprofile S ∈ Su contains a set of items from Iu . Different          mendations in terms of subprofiles.
subprofiles can be of different lengths; the number of subprofiles
can differ across users.                                                    ACKNOWLEDGMENTS
    In SPAD, the set RS is greedily re-ranked using the objective           This publication has emanated from research supported in part by
function given as Equation (1) with div(i, RL) = nov_{xQuAD}(i, RL)         a research grant from Science Foundation Ireland (SFI) under Grant
(Equation (2)). What differs is the computation of the probabilities        Number SFI/12/RC/2289.
used in Equation (2). Given that aspects are now subprofiles, we
use p(S |u) and p(i |u, S ) instead of p(a|u) and p(i |u, a) for S ∈ Su .   REFERENCES
                                                                             [1] Jaime Carbonell and Jade Goldstein. 1998. The use of MMR, diversity-based
4    EXPERIMENTS                                                                 reranking for reordering documents and producing summaries. In Procs. of the
                                                                                 21st Annual International ACM SIGIR Conference on Research and Development in
We compare SPAD to other re-ranking approaches on the Movie-                     Information Retrieval. ACM, 335–336.
                                                                             [2] Rodrygo LT Santos, Craig Macdonald, and Iadh Ounis. 2010. Exploiting query
Lens1M dataset with 5-fold cross validation. We show the results                 reformulations for web search result diversification. In Procs. of the 19th Interna-
of taking recommendations made by matrix factorization (MF) and                  tional Conference on World Wide Web. 881–890.
probabilistic latent semantic analysis (pLSA) algorithms and then            [3] Saul Vargas, Pablo Castells, and David Vallet. 2011. Intent-oriented diversity in
                                                                                 recommender systems. In Procs. of the 34th International ACM SIGIR Conference
re-ranking them using SPAD and other re-ranking approaches:                      on Research and Development in Information Retrieval. ACM, 1211–1212.
      • MMR: Uses ILD with distance defined as the complement                [4] Saúl Vargas Sandoval. 2015. Novelty and Diversity Evaluation and Enhancement
                                                                                 in Recommender Systems. Ph.D. Dissertation. Universidad Autónoma de Madrid,
        of Jaccard similarity on the item features [1].                          Spain.
      • xQuAD: See Equation 2.                                               [5] Koen Verstrepen and Bart Goethals. 2015. Top-N Recommendation for Shared
                                                                                 Accounts. In Procs. of the 9th ACM Conference on Recommender Systems. ACM,
      • RxQuAD: Relevance-based xQuAD that is based on                           59–66.
        maximizing relevance, rather than the probability of                 [6] Jacek Wasilewski and Neil Hurley. 2016. Intent-Aware Diversification Using a
        choosing a single item [4].                                              Constrained PLSA. In Procs. of the 10th ACM Conference on Recommender Systems.
                                                                                 ACM, 39–42.
      • cplsa: Based on explicit aspects but the probabilities are
        learned by a constrained pLSA model [6].
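
The distance used by the MMR baseline above (the complement of Jaccard similarity on item features) and the average all-pairs intra-list distance (ILD) from Section 2 can be sketched as follows. This is an illustrative Python sketch only: the function names and the toy feature sets are assumptions, and the paper does not specify how empty feature sets are handled (treated here as distance 0).

```python
def jaccard_distance(features_a, features_b):
    """Complement of Jaccard similarity between two feature sets."""
    a, b = set(features_a), set(features_b)
    union = a | b
    if not union:
        # Assumption: two items with no features are treated as identical.
        return 0.0
    return 1.0 - len(a & b) / len(union)


def ild(items, features):
    """Average of the all-pairs intra-list distances of the items."""
    pairs = [(x, y) for k, x in enumerate(items) for y in items[k + 1:]]
    if not pairs:
        return 0.0
    total = sum(jaccard_distance(features[x], features[y]) for x, y in pairs)
    return total / len(pairs)


# Toy genre sets (hypothetical).
features = {
    "a": {"comedy", "romance"},
    "b": {"comedy", "drama"},
    "c": {"horror"},
}
print(ild(["a", "b", "c"], features))
```

Items a and b share one of three genres, so their distance is 2/3; both are at distance 1 from c, giving an ILD of 8/9 for the three-item list.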

</pre>
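
Section 3 replaces the global aspects of Equation (2) with the user's own subprofiles. The Python sketch below illustrates that substitution under stated assumptions. The profile threshold (rui ≥ 4) is from the text. Everything else is hypothetical: the subprofiles are supplied by hand rather than detected with DAMIB-COVER, p(S|u) is assumed to be proportional to subprofile size, and the p(i|u, S) values are made-up inputs, since the paper does not say how these probabilities are estimated.

```python
from math import prod


def liked_items(ratings, threshold=4):
    """I_u = {i | r_ui >= threshold}, as used in the experiments."""
    return {i for i, r in ratings.items() if r >= threshold}


def subprofile_prior(subprofiles):
    """ASSUMPTION (not from the paper): p(S|u) proportional to |S|."""
    total = sum(len(s) for s in subprofiles)
    if total == 0:
        raise ValueError("at least one non-empty subprofile is required")
    return [len(s) / total for s in subprofiles]


def nov_spad(i, selected, p_sub, p_item_given_sub):
    """Equation (2) with subprofiles S in S_u in place of aspects a in A."""
    return sum(
        p_s * p_i.get(i, 0.0) * prod(1.0 - p_i.get(j, 0.0) for j in selected)
        for p_s, p_i in zip(p_sub, p_item_given_sub)
    )


ratings = {"m1": 5, "m2": 4, "m3": 2, "m4": 5}
profile = liked_items(ratings)  # {"m1", "m2", "m4"}

# Hypothetical detector output: two subprofiles, each a subset of I_u.
subprofiles = [{"m1", "m2"}, {"m4"}]
p_sub = subprofile_prior(subprofiles)

# Hypothetical p(i|u, S) for candidate items x, y, z.
p_item_given_sub = [{"x": 0.8, "y": 0.6}, {"z": 0.9}]

print(nov_spad("y", ["x"], p_sub, p_item_given_sub))
print(nov_spad("z", ["x"], p_sub, p_item_given_sub))
```

In this toy case, once x is in RL the first subprofile is largely covered, so z (which serves the second subprofile) scores higher than y even though the second subprofile is smaller.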