User Segmentation for Controlling Recommendation Diversity

                  Farzad Eskandanian                                     Bamshad Mobasher                                Robin Burke
                Center for Web Intelligence,                          Center for Web Intelligence,              Center for Web Intelligence,
                    DePaul University                                     DePaul University                         DePaul University
                    Chicago, IL 60604                                     Chicago, IL 60604                         Chicago, IL 60604
                feskanda@depaul.edu                                 mobasher@cs.depaul.edu                       rburke@cs.depaul.edu

ABSTRACT
The quality of recommendations is known to be affected by
diversity and novelty in addition to accuracy. Recent work
has focused on methods that increase diversity of recommen-
dation lists. However, these methods assume the user pref-
erence for diversity is constant across all users. In this pa-
per, we show that users’ propensity towards diversity varies
greatly and argue that the diversity of recommendation lists
should be consistent with the level of user interest in di-
verse recommendations. We introduce a user segmentation
approach in order to personalize recommendation according
to user preference for diversity. We show that recommen-
dations generated using these segments match the diversity
preferences of users in each segment. We also discuss the
impact of this segmentation on the novelty of recommendations.

Figure 1: ILD Distribution of User Profiles.
Keywords
Recommendation diversity, Performance evaluation metrics,
Novelty, Collaborative Filtering

1.     INTRODUCTION
Although there are many methods in the literature that can be used to increase
diversity in recommendations [1], only a few have mentioned the varying degrees
of interest users have for diverse recommendation results [2]. One can imagine
two extreme cases of this interest: one user likes to receive as recommendations
only science fiction movies made within the last 10 years; another user likes a
more diverse set of movies from many genres in her recommendation list.
Obviously, any attempt to increase the diversity of the recommendation list is
likely to generate poor results for the first user with limited interests.
   We measure a user’s preference for diversity as a function of the diversity
of items that the user has rated, and segment the users into groups based on
their scores. Recommendations for each group can be generated independently
using any one of a variety of standard recommendation techniques. We show that
such recommendations have a level of diversity that matches the interest of the
segment’s users.

Permission to make digital or hard copies of part or all of this work for
personal or classroom use is granted without fee provided that copies are not
made or distributed for profit or commercial advantage and that copies bear this
notice and the full citation on the first page. Copyrights for third-party
components of this work must be honored. For all other uses, contact the
owner/author(s).
© 2016 Copyright held by the owner/author(s). RecSys 2016 Poster Proceedings,
September 15–19, 2016, Boston, MA, USA 978-1-4503-2138-9.

2.   DEFINITIONS
Let U and I be the sets of users and items, respectively. The list of
recommendations is denoted by R. $R_u$ is the list of items recommended to user
u ∈ U, and the user profile $I_u$ is the list of items that u has rated.
Diversity is the measure of dissimilarity between items in a set. For this
purpose, we use the average pairwise distance of items in a set, known as
Intra-List Distance (ILD) [4].

\[
\mathrm{ILD}(L) = \frac{1}{|L|(|L| - 1)} \sum_{i \in L} \sum_{j \in L} d(i, j) \tag{1}
\]

   In addition to diversity, we can measure the impact of user segmentation on
the novelty or catalog coverage of recommendation lists. We define novelty as
the average distance from the items in the user profile to the items in the
recommendations.

\[
\mathrm{Nov}(I_u, R_u) = \frac{1}{|R_u||I_u| - \min(|R_u|, |I_u|)} \sum_{i \in R_u} \sum_{j \in I_u} d(i, j) \tag{2}
\]

   We also consider the popularity of items in the recommendation lists. The
popularity of an item i is defined by

\[
\mathrm{Pop}(i) = \frac{|U_i|}{\max_{j \in I}(|U_j|)}
\]

where $U_i$ is the set of users who have rated item i.
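The three definitions above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names, the `raters` mapping (item to the set of users who rated it), and the handling of degenerate cases (lists with fewer than two items, a non-positive denominator) are assumptions introduced here, and the distance function `d` is passed in by the caller.

```python
from itertools import product
from typing import Callable, Dict, Hashable, Sequence, Set

Distance = Callable[[Hashable, Hashable], float]


def ild(items: Sequence[Hashable], d: Distance) -> float:
    """Intra-List Distance, Eq. (1): sum of d(i, j) over all pairs in the
    list, divided by |L|(|L| - 1). Pairs with i == j add 0 if d(i, i) = 0."""
    n = len(items)
    if n < 2:
        return 0.0  # assumption: fewer than two items means no diversity
    total = sum(d(i, j) for i, j in product(items, repeat=2))
    return total / (n * (n - 1))


def novelty(profile: Sequence[Hashable], recs: Sequence[Hashable], d: Distance) -> float:
    """Novelty, Eq. (2): distances from recommended items to profile items,
    normalized by |R_u||I_u| - min(|R_u|, |I_u|)."""
    denom = len(recs) * len(profile) - min(len(recs), len(profile))
    if denom <= 0:
        return 0.0  # assumption: undefined cases are reported as zero novelty
    total = sum(d(i, j) for i in recs for j in profile)
    return total / denom


def popularity(item: Hashable, raters: Dict[Hashable, Set[Hashable]]) -> float:
    """Pop(i) = |U_i| / max_j |U_j|, where raters[i] plays the role of U_i."""
    most_rated = max(len(users) for users in raters.values())
    return len(raters.get(item, ())) / most_rated
```

Passing `d` as a parameter keeps the metrics independent of how item distance is defined, so the same functions can be applied to user profiles and to recommendation lists.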
3.     EXPERIMENTS AND DISCUSSIONS
We used the MovieLens 1M data set (footnote 1) for analysis and evaluation of
the proposed method. We create a term vector for each movie using the genre
information in the dataset and measure the distance between movies, d(i, j), as
the cosine distance between their genre vectors. After the ILD value for each
user has been computed, the next step is to define intervals for segmenting the
user profiles. Figure 1 shows the distribution of ILD values across the
MovieLens user profiles. The figure shows that there are a relatively small
number of users with low ILD values, rising to a peak around 0.74 and falling
off rapidly thereafter.

Table 1: User Segments
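As a rough illustration of the genre-based distance, the sketch below builds binary genre vectors and compares them. It assumes that the distance is one minus the cosine similarity of the two vectors; the `GENRES` vocabulary is a shortened, hypothetical list rather than the dataset's full genre set, and treating a movie with no genres as maximally distant is also an assumption.

```python
import math
from typing import Iterable, List, Sequence

# Hypothetical, shortened genre vocabulary used only for this example.
GENRES = ["Action", "Comedy", "Drama", "Sci-Fi"]


def genre_vector(genres: Iterable[str], vocabulary: Sequence[str] = GENRES) -> List[int]:
    """Binary term vector: 1 where the movie carries the genre, else 0."""
    present = set(genres)
    return [1 if g in present else 0 for g in vocabulary]


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # assumption: a movie without genres is maximally distant
    return 1.0 - dot / (norm_a * norm_b)
```

Under this reading, two movies with disjoint genres are at distance 1, and two movies with identical genre sets are at distance 0 (up to floating-point rounding).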
   We divided the range of ILD values into four segments,
shown graphically in Figure 1 and also in Table 1. The figure
shows the boundaries of each segment and the mean ILD,
$\mu_{s_k}$, for k = 1, 2, 3, 4. Note that segment 3 is larger than the
others, which reflects the large number of users with this
range of diversity in their profiles.
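The segmentation step can be sketched as a threshold lookup on each user's profile ILD. The function names are introduced here for illustration, and the boundary values are supplied by the caller; any concrete boundaries used with this sketch are placeholders, not the values reported in Table 1.

```python
from bisect import bisect_right
from typing import Dict, Hashable, Sequence


def assign_segments(user_ild: Dict[Hashable, float],
                    boundaries: Sequence[float]) -> Dict[Hashable, int]:
    """Map each user to a 1-based segment index.

    `boundaries` holds the interior cut points; k cut points yield k + 1
    segments. A value equal to a cut point falls into the upper segment.
    """
    cuts = sorted(boundaries)
    return {user: bisect_right(cuts, value) + 1 for user, value in user_ild.items()}


def segment_means(user_ild: Dict[Hashable, float],
                  segments: Dict[Hashable, int]) -> Dict[int, float]:
    """Mean profile ILD per segment (the per-segment mean ILD in the text)."""
    sums: Dict[int, float] = {}
    counts: Dict[int, int] = {}
    for user, seg in segments.items():
        sums[seg] = sums.get(seg, 0.0) + user_ild[user]
        counts[seg] = counts.get(seg, 0) + 1
    return {seg: sums[seg] / counts[seg] for seg in sums}
```

With three cut points this produces four segments, matching the number of segments used in the paper.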
   We generated our recommendations using three recommendation models (two
neighborhood-based models and one matrix factorization model, BPRMF (Bayesian
Personalized Ranking with Matrix Factorization) [3]), trained on the whole
dataset as well as on each segment separately.
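Training on each segment separately amounts to partitioning the rating data by the segment of the rating user. The sketch below shows one way to do this, assuming ratings are `(user, item, rating)` tuples and that `segments` maps users to segment indices. Both the data layout and the function name are assumptions made here, and the recommender itself is left out.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Tuple

Rating = Tuple[Hashable, Hashable, float]  # (user, item, rating)


def split_by_segment(ratings: Iterable[Rating],
                     segments: Dict[Hashable, int]) -> Dict[int, List[Rating]]:
    """Group ratings by the segment of the user who gave them.

    Ratings from users without a segment assignment are skipped.
    """
    parts: Dict[int, List[Rating]] = defaultdict(list)
    for user, item, value in ratings:
        if user in segments:
            parts[segments[user]].append((user, item, value))
    return dict(parts)
```

Each resulting subset could then be passed to the same training routine that is used for the whole dataset, so that the per-segment and global models differ only in their input data.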
   Table 2 shows the results of these experiments in terms of precision and
recall, diversity, novelty and popularity. We expected diversity to increase for
segments with a higher preference for diversity, and that effect is clearly
present in the ILD values for all the recommendation algorithms. As we move from
segments with low diversity to those with higher diversity, the ILD values of
the resulting recommendations increase monotonically.

Table 2: Recommendation Results

   We expected to find that popularity is monotonically decreasing. That is, the
segments containing users with diverse profiles would produce recommendations
outside of the “short head” of highly popular items and the more diverse the
users, the more obscure the recommendations. This effect is not seen. Instead,
popularity increases between segments 1 and 2 and decreases afterwards. One
explanation for this phenomenon is that segment 1 users are actually niche users
with a strong interest in a single movie genre and, as a result, their profiles
do not contain many of the typical “blockbuster” films. Outside of segment 1,
the expected effect is seen across all remaining segments. We will explore this
phenomenon further in future work.
   A trade-off between precision and recall is observed in Table 2. As
$\mathrm{ILD}_{S_i}$ increases, $\mathrm{Precision}_{S_i}$ increases and
$\mathrm{Recall}_{S_i}$ decreases. An increase in $\mathrm{ILD}_u$ suggests that
a user u is interested in movies from a variety of genres. The number of hits
(relevant items in the recommendation list) for this user also increases because
more movies are considered relevant recommendations. However, there are more
movies in the catalog that can match the user’s interests, so achieving good
recall of just those items that the user rated is more difficult.
   Table 2 also shows that the novelty of recommended items increases along with
the increase in diversity. So, segmentation based on diversity not only
preserves the user’s propensity towards diverse recommendations, but also
results in a corresponding change in the level of recommendation novelty.

1 http://grouplens.org/datasets/movielens/

4.     CONCLUSIONS
This work examines the consequences of segmenting user populations by diversity,
as a means of personalizing user interest in and tolerance for diversity. We
show that interest in diversity varies widely across users, with a distinct peak
and users with preferences both low and high.
   Our division of the user population into four segments is a simple but
effective method for increasing diversity for those segments of the population
interested in such diversity and decreasing it for those with less interest. The
expected effects on diversity and novelty are seen across three different
recommendation algorithms.
   We plan to explore these effects in future work on additional datasets and
algorithms, as well as alternate methods for personalizing diversity.

5.   REFERENCES
[1] G. Adomavicius and Y. Kwon. Improving aggregate recommendation diversity
    using ranking-based techniques. IEEE Transactions on Knowledge and Data
    Engineering, 24(5):896–911, 2012.
[2] K. Kapoor, V. Kumar, L. Terveen, J. A. Konstan, and P. Schrater. I like to
    explore sometimes: Adapting to dynamic user novelty preferences. In
    Proceedings of the 9th ACM Conference on Recommender Systems, pages 19–26.
    ACM, 2015.
[3] S. Rendle, C. Freudenthaler, Z. Gantner, and L. Schmidt-Thieme. BPR:
    Bayesian personalized ranking from implicit feedback. In Proceedings of the
    Twenty-Fifth Conference on Uncertainty in Artificial Intelligence, pages
    452–461. AUAI Press, 2009.
[4] C.-N. Ziegler, S. M. McNee, J. A. Konstan, and G. Lausen. Improving
    recommendation lists through topic diversification. In Proceedings of the
    14th International Conference on World Wide Web, pages 22–32. ACM, 2005.