=Paper= {{Paper |id=Vol-1905/recsys2017_poster11 |storemode=property |title=How Diverse Is Your Audience? Exploring Consumer Diversity in Recommender Systems |pdfUrl=https://ceur-ws.org/Vol-1905/recsys2017_poster11.pdf |volume=Vol-1905 |authors=Jacek Wasilewski,Neil Hurley |dblpUrl=https://dblp.org/rec/conf/recsys/WasilewskiH17 }} ==How Diverse Is Your Audience? Exploring Consumer Diversity in Recommender Systems== https://ceur-ws.org/Vol-1905/recsys2017_poster11.pdf
    How Diverse Is Your Audience? Exploring Consumer Diversity
                     in Recommender Systems
                             Jacek Wasilewski                                                          Neil Hurley
                    Insight Centre for Data Analytics                                      Insight Centre for Data Analytics
                        University College Dublin                                              University College Dublin
                             Dublin, Ireland                                                        Dublin, Ireland
                  jacek.wasilewski@insight-centre.org                                       neil.hurley@insight-centre.org

Project funded by Science Foundation Ireland under Grant No. SFI/12/RC/2289.
RecSys 2017 Poster Proceedings, August 27-31, Como, Italy.

ABSTRACT
Online recommender systems must overcome a number of challenges to
provide content to users. One of these is the risk of isolating users
from a diverse set of items by recommending very narrow content. In
this paper we propose an item-centric view of a recommender system,
looking at the exposure of items to groups of consumers, and at how
diverse those groups are, to identify whether items are recommended to
narrower groups of consumers. This contrasts with current practice, in
which the diversity of content is typically analysed. Preliminary
results on the MovieLens 20M dataset show that recommender systems
expose items to narrower groups of consumers, and that these groups are
less diverse.

KEYWORDS
recommender systems; diversity; consumer diversity; item-centric
evaluation

1   INTRODUCTION
Recommender systems have become ubiquitous in the interfaces to
product catalogues provided by online retailers. From the user’s
perspective, recommender algorithms are used to filter a large set of
possible selections into a much smaller set of items that the user is
likely to be interested in. From the business point of view, on the
other hand, the utilisation of products in the catalogue is as
important as users getting engaging recommendations.
    Increasing sales, or redistributing them across the whole catalogue
of items, might not be the only business goal to be addressed by a
recommender system. In some sense, recommender systems are marketing
tools that identify customers and target these customers with
personalised items. Questions arise: are we exposing items to users
that showed an interest before? Are we promoting items to reach new
groups of customers? How diverse are these groups? From a market
development perspective, a recommender system should help us achieve
all of these business goals. To measure and control for this, we need a
picture of how items are exposed to different groups of people, and of
whether the exposure is diverse.
    In this paper we tackle the problem of item exposure to understand
who consumes items and whether potential consumers are reached by
recommendations. We measure the diversity of the people getting
recommendations for an item, using approaches coming from ecology, such
as the species diversity of a habitat. This is different from the
content diversity of recommendations that has typically been considered
in the context of diversity in recommender systems. The main goal of
this paper is to answer the following question: do recommender systems
expose items to the same, wider or narrower groups of consumers, and
how diverse are these groups?

2   CONSUMER DIVERSITY
Recommender systems have to deal with the long tail of items that are
rarely recommended. This includes niche items that are rarely liked,
but also items that have not penetrated the market. To identify and
promote these items, we argue it is not enough to ask how many users
have rated each item in the past; we must also ask which users have
rated it, as these users define the item’s exposure.
    An item’s user profile, U_i, contains the set of users who rated
the item in the past. A diversity measure over these users gives
insight into the extent to which the item has been exposed to a wide
range of different user types. Similarly, the set of users to whom the
item is recommended, R_i, can be analysed to reveal the extent to which
recommendations extend the exposure of an item. If an item is
recommended to diverse consumers, it is possible that the item can
reach a wider potential market.
    As it is commonplace for marketeers to model their customer base
through customer segmentation, we find it useful to measure the
diversity in terms of the spread across different consumer segments.
Given a partition P_c of the user set U into k consumer segments,
U = C_1 ∪ C_2 ∪ ... ∪ C_k, where C_j is the j-th consumer segment, we
define the consumer diversity of a set of consumers S as a function of
(p_1, ..., p_k), where p_j = |S ∩ C_j| / |S| is the proportion of the
set S that belongs to consumer segment C_j.
    A similar problem is considered in ecology, where a habitat can be
quantified in terms of species diversity [5, 6], which measures
diversity in terms of the proportional abundance of each species in a
sample. It assigns a high diversity value when the sample is evenly
spread across the different species. In biodiversity, different
measures such as species richness, Shannon entropy and Simpson
concentration can be generalised through the Hill number [5], or
diversity of order q, defined as:

        {}^{q}D \triangleq \left( \sum_{j=1}^{k} p_j^{q} \right)^{1/(1-q)}

and {}^{1}D = \lim_{q \to 1} {}^{q}D = \exp(H(p)), where H(p) is the
Shannon entropy of (p_1, ..., p_k). In biodiversity these are called
true diversities or effective numbers of species [6]. With q = 0 we
obtain richness, with q = 1 the true diversity associated with Shannon
entropy, and with q = 2 the inverse Simpson index. Entropy increases as
both richness and evenness increase, whereas the Simpson index measures
dominance and is less sensitive to richness. In our context, each
consumer segment corresponds to a “species”. Then, with the help of the
true diversity, we can evaluate the diversity of a habitat, which in
our case is an item. We can use true diversity to compare the exposure
of different items to the consumer segments or to compare the
exposure of a single item under different conditions.

3   ANALYSIS OF CONSUMER DIVERSITY
We investigate consumer diversity on the MovieLens 20M dataset [4]. For
that, a partition into consumer segments is required. We create
behavioural segments based on past interactions. The X-means [7]
clustering algorithm is used to define the segments; k = 15 clusters
were created based on interactions. The results of such clustering
depend on the initialisation parameters, which is a limitation, but it
still enables comparison of diversity. We analyse recommendations (of
N = 20 items) generated by collaborative filtering algorithms available
in the RankSys framework (http://ranksys.org): user-based (UB) and
item-based (IB) kNN, and matrix factorisation (MF).
    We ask whether recommender systems might suffer not only from
narrowing the content served to users, but also from items being
exposed to narrow audiences. To illustrate this, we take a movie (The
Matrix) and show the distribution of its consumers over segments in
Figure 1. It can be seen that one segment (no. 4) is over-represented
by a factor of almost 4 in recommendations. We measured its true
diversities: richness, Shannon and Simpson indices. Richness decreased
from 15 to 13, which means 2 segments are not reached. The Shannon and
Simpson indices also dropped, from 9.80 to 6.14 and from 8.85 to 4.43
respectively, which means that recommendations are generally less
diverse in terms of the consumers this movie has reached. True
diversities are also easy to interpret: they give the effective number
of species, that is, the number of equally abundant species that would
produce the same diversity. In our case, recommendations are 1.5 times
less diverse on the Shannon index, and 2 times on the Simpson index.

[Figure 1 plot omitted; x-axis: consumer segments (0-14), y-axis:
proportion, series: Dataset, UB]
Figure 1: Distribution over segments of The Matrix movie (pop: 51,334),
in the dataset and recommendations.

    Table 1 contains the values of the considered true diversity
indices, averaged over all items. Richness shows that on average items
are consumed by users of 13 out of 15 segments, but only recommended to
3-6 segments. If concentration is taken into account, the Shannon and
Simpson indices drop, indicating that items are 2-3 times less diverse.
Paired t-tests show that the differences are significant (p < 0.001).

                      Dataset      UB      IB      MF
    Richness (q = 0)    12.99    5.11    3.73    6.43
    Shannon (q = 1)      8.41    2.84    2.99    3.53
    Simpson (q = 2)      6.90    2.34    1.95    2.84
Table 1: Average values of true diversity indices for all items in the
dataset and recommendations (UB, IB, MF).

    As an item’s popularity can affect collaborative filtering methods,
we ask whether lower diversity is due to low item popularity. To
examine this, we split the items into the head of most popular items
(80% of interactions) and the rest in the tail; histograms of Shannon
diversity are shown in Figure 2. In the dataset, head items have higher
diversity, while tail items tend to obtain lower values.
Recommendations do not follow this pattern: both groups of items have
diversity distributions skewed towards 0. This suggests that even
popular items, which receive more interactions, are isolated from wide
and diverse groups of consumers.

[Figure 2 plots omitted; x-axis: Shannon entropy, y-axis: proportion,
panel and legend labels: Dataset, Head, Tail, UB]
Figure 2: Histograms of Shannon entropy for dataset and
recommendations. Items are distinguished by their popularity: head and
tail items.

4   RELATED WORK
Diversity is commonly studied in the context of items that are
recommended to users, which might help mitigate the problem of users
being exposed to a narrower spectrum of item types. A number of
frameworks have been proposed to measure and increase diversity, such
as Intra-List Diversity [10].
    Sales diversity [1, 3] is a notion of diversity which attempts to
capture how items perform, e.g. how evenly they are consumed. It
tackles the long tail problem, where the most popular items drive the
recommendations. Aggregate Diversity [1], the Gini index [3] and
Shannon entropy [9] are some of the measures of sales performance over
items. In [2] an item-centric evaluation is conducted to detect
pathologies hindering novel recommendations. These methods, however,
analyse impacts on items globally, not individually, and do not
consider different groups of consumers.
    In information retrieval, the concept of profile diversity [8] has
been proposed, where a profile contains information about the user’s
community. Queries should then retrieve documents that different
communities find useful. However, the framework does not analyse the
consumers reached by these documents.

5   CONCLUSIONS
In this paper we identified and explored the problem of consumer
diversity, which measures how diverse each item is in terms of consumer
segments. Our analysis shows that popular recommendation techniques
expose items to much narrower and less diverse groups of consumers.
Although the overall quality of recommendations might be good, items
are hidden from certain groups of people who expressed an interest in
them in the past.

REFERENCES
 [1] G. Adomavicius and Y. Kwon. 2012. Improving Aggregate Recommendation
     Diversity Using Ranking-Based Techniques. IEEE TKDE 24, 5 (2012).
 [2] Ò. Celma and P. Herrera. 2008. A New Approach to Evaluating Novel
     Recommendations (RecSys ’08).
 [3] D. Fleder and K. Hosanagar. 2009. Blockbuster Culture’s Next Rise or
     Fall: The Impact of Recommender Systems on Sales Diversity. Manage.
     Sci. 55, 5 (2009).
 [4] F. M. Harper and J. A. Konstan. 2015. The MovieLens Datasets: History
     and Context. ACM Trans. Interact. Intell. Syst. 5, 4 (2015).
 [5] M. O. Hill. 1973. Diversity and evenness: a unifying notation and its
     consequences. Ecology 54, 2 (1973).
 [6] L. Jost. 2006. Entropy and diversity. Oikos 113, 2 (2006).
 [7] D. Pelleg and A. W. Moore. 2000. X-means: Extending K-means with
     Efficient Estimation of the Number of Clusters (ICML ’00).
 [8] M. Servajean, E. Pacitti, S. Amer-Yahia, and P. Neveu. 2013. Profile
     Diversity in Search and Recommendation (WWW ’13 Companion).
 [9] Z. Szlavik, W.J. Kowalczyk, and M.C. Schut. 2011. Diversity
     measurement of recommender systems under different user choice
     models (ICWSM ’11).
[10] C. Ziegler, S. M. McNee, J. A. Konstan, and G. Lausen. 2005. Improving
     Recommendation Lists Through Topic Diversification (WWW ’05).
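
As an illustration of the Section 2 definitions, the segment proportions and the diversity of order q can be sketched in Python as follows. This is a minimal sketch rather than the code used for the experiments: the function names `segment_proportions` and `hill_number`, the `segment_of` mapping and the toy user ids are all assumptions introduced here for illustration.

```python
import math
from collections import Counter


def segment_proportions(users, segment_of):
    """Proportions p_j = |S ∩ C_j| / |S| for the segments present in `users`.

    `users` plays the role of the set S (e.g. U_i or R_i); `segment_of`
    is a hypothetical mapping from user id to consumer segment label.
    """
    counts = Counter(segment_of[u] for u in users)
    total = sum(counts.values())
    return [c / total for c in counts.values()]


def hill_number(proportions, q):
    """True diversity of order q (Hill number) of a proportion vector."""
    p = [x for x in proportions if x > 0]  # empty segments contribute nothing
    if not p:
        return 0.0
    if math.isclose(q, 1.0):
        # Limit q -> 1: exponential of the Shannon entropy H(p).
        return math.exp(-sum(x * math.log(x) for x in p))
    return sum(x ** q for x in p) ** (1.0 / (1.0 - q))


# Toy example (made-up data): four raters spread over three segments.
segment_of = {"u1": 0, "u2": 0, "u3": 1, "u4": 2}
p = segment_proportions(["u1", "u2", "u3", "u4"], segment_of)
print(round(hill_number(p, 0), 3))  # richness: 3.0
print(round(hill_number(p, 1), 3))  # Shannon true diversity: 2.828
print(round(hill_number(p, 2), 3))  # inverse Simpson index: 2.667
```

When the users are spread evenly over all k segments, every order q gives a diversity of k, which matches the "effective number of species" reading in Section 3. Comparing `hill_number` on an item's raters and on its recommendation recipients gives the kind of per-item comparison reported for The Matrix.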