
Adaptive Diversity in Recommender Systems*

Tommaso Di Noia

Vito Claudio Ostuni

Jessica Rosati

0 1 2

Paolo Tomeo

Eugenio Di Sciascio

0 Polytechnic University of Bari, Via Orabona, 4, 70125 Bari, Italy
1 University of Camerino, Piazza Cavour 19/f, 62032 Camerino (MC), Italy
2 University of Milano-Bicocca, Piazza dell'Ateneo Nuovo, 1, 20126 Milano

The evaluation of a recommendation engine cannot rely only on the accuracy of the provided recommendations. One should consider additional dimensions, such as the diversity of the suggestions, in order to guarantee heterogeneity in the recommendation list. In this paper we analyse users' propensity to select diverse items, taking into account content-based item attributes. The individual propensity to diversification is used to re-rank the list of Top-N items predicted by a recommendation algorithm, with the aim of fostering diversity in the final ranking. We show experimental results that confirm the validity of our modelling approach.

Introduction

In the recommender systems field, most approaches have been devoted to maximizing recommendation accuracy. However, it has been recognized that improving predictive accuracy alone is not enough to judge the effectiveness of a recommender system [ 3 ], since the most accurate recommendations for a user are often too similar to each other. Attention must therefore also be paid to improving individual diversity, that is, the degree of diversification in the recommendations provided to an individual user. A number of works propose strategies to enhance the trade-off between accuracy and diversity [ 9, 8, 10 ].

The main intuition behind our work is that some users may prefer diversification in suggestions while others may not, and that a user may be inclined to diversify with respect to only some item attributes. We propose an adaptive attribute-based diversification approach able to customize the degree of individual diversity of the Top-N recommendation list, using the entropy measure to represent the user's inclination to diversity over different content-based item dimensions. We apply our approach to the movie domain, considering the factors that reasonably lead a user to choose a movie from a huge collection of items, namely genre, actor, director and year of release. However, not all these factors have the same influence on different users: for example, a user may stick to a particular director yet be willing to watch several genres.

The main contributions of this paper are:

- a representation of the user's propensity to diversify her choices;
- an adaptive attribute-based re-ranking approach based on the aforementioned representation.

* An extended version of this paper has been published in [ 4 ].

Adaptive diversification

In the recommendation process, after ratings have been predicted for unrated items, the maximization of the user's utility and the improvement of individual diversity in the item list can be pursued through a re-ranking phase [ 1 ]. Several heuristics allow items to be re-ranked efficiently, such as the MMR greedy strategy [ 7 ]. MMR iteratively selects the item that maximizes an objective function $f_{obj}$, which governs the trade-off between accuracy and diversity and is defined as

$$f_{obj}(i, S) = \lambda \, r(u, i) - (1 - \lambda) \max_{j \in S} sim(i, j) \tag{1}$$

where $S$ is the previously re-ranked list, $r$ is a function estimating the rating of user $u$ for item $i$, $sim$ is a similarity measure on item pairs, and the parameter $\lambda$ manages the accuracy-diversity balance.
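The greedy selection driven by Equation (1) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `mmr_rerank`, the dictionary of predicted ratings, and the choice of a zero diversity penalty when $S$ is still empty are all assumptions made for illustration, and ratings are assumed to be on a scale comparable to the similarity values.

```python
from typing import Callable, Dict, Hashable, List

Item = Hashable


def mmr_rerank(
    candidates: List[Item],
    rating: Dict[Item, float],
    sim: Callable[[Item, Item], float],
    lam: float = 0.5,
    n: int = 10,
) -> List[Item]:
    """Greedy MMR re-ranking: at each step pick the candidate maximising
    lam * r(u, i) - (1 - lam) * max_{j in S} sim(i, j)."""
    remaining = list(candidates)
    selected: List[Item] = []  # the list S of already re-ranked items

    while remaining and len(selected) < n:

        def f_obj(i: Item) -> float:
            # Assumption: with an empty S the diversity penalty is zero.
            penalty = max((sim(i, j) for j in selected), default=0.0)
            return lam * rating[i] - (1.0 - lam) * penalty

        best = max(remaining, key=f_obj)
        selected.append(best)
        remaining.remove(best)

    return selected
```

With `lam = 1.0` the sketch reduces to sorting by predicted rating, while lower values penalise candidates that are similar to items already selected.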

The diversification attitude of each user for each item attribute $a \in A$ is measured through Shannon's entropy. For each attribute, users are clustered into four groups, referred to as quadrants, defined by the medians of the entropy and user profile length distributions across all users. For example, a user $u$ is in the first quadrant for the genre attribute if her entropy $H_{genre}(u)$ is less than the median of the entropy computed across all users and she has a short user profile (her number of ratings is less than the median number of ratings per user). The same user may belong to different quadrants for different attributes. Table 1 provides a representation of the quadrants. The main modelling hypothesis behind this classification is that users who have explored items with different characteristics in the past are willing to accept diverse recommendations. Given an attribute $a$, we interpret a high entropy value as the user's tendency to choose items with different values for $a$. Conversely, a low entropy value is read as her willingness to consider items that are similar with respect to that attribute.
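As a rough illustration of the quadrant assignment, the sketch below computes Shannon's entropy over the attribute values found in a user profile and splits users at the two medians. The function names are hypothetical, and two details are assumptions not stated in the text: the entropy is computed over the flattened list of attribute values of the rated items, and values equal to the median are treated as "high" or "large".

```python
import math
from collections import Counter
from statistics import median
from typing import Dict, Iterable


def shannon_entropy(values: Iterable[str]) -> float:
    """Shannon entropy (in bits) of the empirical distribution of values."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def assign_quadrants(
    entropy: Dict[str, float], profile_len: Dict[str, int]
) -> Dict[str, int]:
    """Map each user to a quadrant (1-4) for one attribute.

    1: low entropy, small profile    2: high entropy, small profile
    3: low entropy, large profile    4: high entropy, large profile
    """
    h_med = median(entropy.values())
    l_med = median(profile_len.values())
    quadrants = {}
    for u in entropy:
        # "Low"/"small" means strictly below the median, as in the text;
        # sending ties to "high"/"large" is an assumption.
        high_entropy = entropy[u] >= h_med
        large_profile = profile_len[u] >= l_med
        quadrants[u] = 1 + int(high_entropy) + 2 * int(large_profile)
    return quadrants
```

The same routine would be run once per attribute, so that a user can fall into different quadrants for genre, year, actors and directors.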

Profile Length    Low Entropy    High Entropy
Small Profile     Quadrant 1     Quadrant 2
Large Profile     Quadrant 3     Quadrant 4

Table 1. Quadrants

Quadrants are used to define the similarity measure in Equation (1). Let us consider a user $u$ and indicate with $A$ the set of item attributes (for example, in the movie domain $A = \{year, genre, direction, starring\}$). We consider a function $q_u : A \to \{1, 2, 3, 4\}$, which assigns to each attribute the quadrant to which user $u$ belongs, and then we define a quadrant weight $\omega_i \in [0, 1]$, with $i \in \{1, 2, 3, 4\}$. The overall similarity between items $i$ and $j$ in Equation (1), for user $u$, is tailored to the quadrants she belongs to and is defined as

$$sim(i, j) = \frac{\sum_{a \in A} \omega_{q_u(a)} \, sim_a(i, j)}{m \cdot |A|} \tag{2}$$

with $m = \max\{\omega_i \mid i = 1, 2, 3, 4\}$ and $sim_a(i, j)$ a similarity measure between $i$ and $j$ with respect to attribute $a$. The weights associated with the quadrants a user belongs to influence the similarity score and hence the resulting MMR objective function, thereby varying the diversity.
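A minimal sketch of the quadrant-weighted similarity of Equation (2) follows. The per-attribute similarity is taken to be the Jaccard index, which is the measure reported in the experiments; the data layout (items as dictionaries mapping an attribute name to a set of values), the function names, and the convention of returning zero when all weights are zero are assumptions made only for this example.

```python
from typing import Dict, Set


def jaccard(x: Set[str], y: Set[str]) -> float:
    """Jaccard index of two value sets (0.0 when both are empty)."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def adaptive_sim(
    item_i: Dict[str, Set[str]],
    item_j: Dict[str, Set[str]],
    user_quadrant: Dict[str, int],  # q_u: attribute -> quadrant in {1, 2, 3, 4}
    weights: Dict[int, float],      # omega: quadrant -> weight in [0, 1]
) -> float:
    """Equation (2): sum_a omega[q_u(a)] * sim_a(i, j) / (m * |A|)."""
    m = max(weights.values())
    if m == 0:
        # Assumption: with all weights at zero no attribute contributes.
        return 0.0
    total = sum(
        weights[user_quadrant[a]] * jaccard(item_i[a], item_j[a])
        for a in user_quadrant
    )
    return total / (m * len(user_quadrant))
```

An attribute whose quadrant has a larger weight contributes more to the similarity, so the MMR penalty pushes harder toward diversity on that attribute for that user.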

Experiments and Results

We carried out experiments on the MovieLens 1M dataset (footnote 4), enriched with further attribute information (actors and directors) extracted from DBpedia (footnote 5), as in [ 5 ]. We concentrated on users who gave at least fifty ratings. The final dataset contains 4297 users, 3689 items and 942590 ratings. Training and test sets were built with a temporal 60-40% split. We compared our approach with two baselines: no-MMR, a user-based kNN collaborative filtering algorithm with Pearson correlation; and MMR, a re-ranking with Equation (1) of the top 200 recommendations generated by no-MMR for each user. Our adaptive approach is denoted as adaptiveMMR. The parameter $\lambda$ in Equation (1) was set to 0.5. As the similarity measure for attribute $a$ in Equation (2), we used the Jaccard index. To reduce the number of distinct attribute values, we grouped movies by decade and performed a K-means clustering of actors and directors on the basis of their DBpedia categories, obtaining 20 clusters. The number of values is 19 and 8 for genre and year, respectively.

We used the TestItems evaluation methodology presented in [ 2 ], with Precision (P@k) and nDCG@k for accuracy, ILD@k for diversity and avg(P,ILD) for the balance between accuracy and diversity, as in [ 6 ]. P@k is chosen instead of nDCG@k for the combined measure, since the two metrics show a similar trend.
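For readers unfamiliar with these measures, the sketch below shows one common way to compute P@k and ILD@k on a single ranked list. The definitions used here are assumptions rather than the paper's exact formulation: ILD@k is taken as the mean pairwise distance among the top-k items, with the distance supplied by the caller (for instance one minus a similarity), and avg(P,ILD) is read as the arithmetic mean of the two values.

```python
from itertools import combinations
from typing import Callable, Hashable, List, Set

Item = Hashable


def precision_at_k(ranked: List[Item], relevant: Set[Item], k: int) -> float:
    """Fraction of the top-k recommended items that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(1 for i in ranked[:k] if i in relevant) / k


def ild_at_k(
    ranked: List[Item], dist: Callable[[Item, Item], float], k: int
) -> float:
    """Intra-list diversity: mean pairwise distance among the top-k items."""
    pairs = list(combinations(ranked[:k], 2))
    if not pairs:
        return 0.0
    return sum(dist(i, j) for i, j in pairs) / len(pairs)


def avg_p_ild(p: float, ild: float) -> float:
    """Assumed reading of avg(P, ILD): the arithmetic mean of the two."""
    return (p + ild) / 2.0
```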

Firstly, we tested the validity of the hypothesis that users who have explored different items in the past are inclined to diversity. As shown in Table 2, MMR dominates no-MMR in quadrants 2 and 4 for both precision and ILD, indicating that users with high entropy benefit from diversification. In the other quadrants (1 and 3) there is the expected decrease in accuracy. Hence users with low entropy in their profiles are not inclined to uncontrolled diversification.

Next, to test the effectiveness of adaptiveMMR, we conducted a grid search on $\omega$, finding, as a first result, that our intuition of choosing small values for $\omega_1$ and $\omega_3$ and larger ones for $\omega_2$ and $\omega_4$ is validated by the accuracy and ILD results. Without such constraints, in fact, the accuracy of adaptiveMMR degrades markedly. For lack of space we discuss here only three weight configurations: $A = \langle 0, 0, 0, 1 \rangle$, $B = \langle 0, 1, 0, 1 \rangle$, $C = \langle 0.1, 1, 0.1, 0.75 \rangle$. The values of configuration C were computed via grid search, fixing $\omega_1$ and $\omega_3$ and varying $\omega_2$ and $\omega_4$ with a step of 0.05. These configurations let us deal with emblematic situations: configuration A acts on users who are in quadrant 4 for some attributes, and configuration B on users belonging to quadrant 2 or 4. Table 3 shows the results with $k = 10$. AdaptiveMMR achieves the best balance between accuracy and diversity, represented by avg(P,ILD). In terms of accuracy, adaptiveMMR outperforms no-MMR and MMR, especially adaptiveMMR-C. Remarkably, configuration C has an ILD value close to that of MMR but significantly better accuracy.

Footnote 4: Available at http://grouplens.org/datasets/movielens
Footnote 5: http://dbpedia.org
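The grid search over the quadrant weights described in this section can be sketched as follows. Only the step of 0.05 and the idea of fixing $\omega_1$ and $\omega_3$ while varying $\omega_2$ and $\omega_4$ come from the text; the search range $[0, 1]$ is inferred from the definition of the weights, the default fixed values of 0.1 are those of configuration C, and the `evaluate` callback stands in for whatever selection criterion was actually used, which the text does not specify.

```python
from typing import Callable, Iterator, Tuple

Weights = Tuple[float, float, float, float]  # (omega_1, omega_2, omega_3, omega_4)


def weight_grid(w1: float = 0.1, w3: float = 0.1, step: float = 0.05) -> Iterator[Weights]:
    """Enumerate configurations with omega_1 and omega_3 fixed and
    omega_2, omega_4 varying over [0, 1] with the given step."""
    n = round(1.0 / step)  # integer number of steps avoids float drift
    for a in range(n + 1):
        for b in range(n + 1):
            yield (w1, a / n, w3, b / n)


def grid_search(
    evaluate: Callable[[Weights], float],  # hypothetical scoring function
    w1: float = 0.1,
    w3: float = 0.1,
    step: float = 0.05,
) -> Weights:
    """Return the configuration with the highest score under evaluate."""
    return max(weight_grid(w1, w3, step), key=evaluate)
```

In practice `evaluate` would run adaptiveMMR with the given weights and return a validation score, for example a combination of accuracy and ILD.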

Conclusions

The results shown in this paper suggest that the individual tendency toward diversity, represented by entropy, is a factor to take into account in the diversification process and should be considered even for users with a small profile length.

Acknowledgements

The authors acknowledge partial support of VINCENTE (PON02 00563 3470993) and RES NOVAE (PON04a2 E).

References

1. Adomavicius and Kwon. Improving aggregate recommendation diversity using ranking-based techniques. IEEE TKDE, 24(5):896-911, 2012.
2. Bellogin, Castells, and I. Cantador. Precision-oriented evaluation of recommender systems: An algorithmic comparison. In ACM RecSys '11, pages 333-336, 2011.
3. Bradley and Smyth. Improving Recommendation Diversity. In Irish Conference in Artificial Intelligence and Cognitive Science, pages 75-84, 2001.
4. Tommaso Di Noia, Vito Claudio Ostuni, Jessica Rosati, Paolo Tomeo, and Eugenio Di Sciascio. An analysis of users' propensity toward diversity in recommendations. In Proceedings of the 8th ACM Conference on Recommender Systems, RecSys '14, pages 285-288. ACM, 2014.
5. V. C. Ostuni, T. Di Noia, E. Di Sciascio, and Mirizzi. Top-n recommendations from implicit feedback leveraging linked open data. In ACM RecSys '13, pages 85-92, 2013.
6. Panniello, Tuzhilin, and Gorgoglione. Comparing context-aware recommender systems in terms of accuracy and diversity. User Modeling and User-Adapted Interaction, 24(1-2):35-65, 2014.
7. Vargas and Castells. Exploiting the diversity of user preferences for recommendation. In OAIR '13, pages 129-136, 2013.
8. Zhang. Enhancing diversity in top-n recommendation. In ACM RecSys '09, pages 397-400, 2009.
9. Zhang and Hurley. Avoiding monotony: Improving the diversity of recommendation lists. In ACM RecSys '08, pages 123-130, 2008.
10. C. Ziegler, S. M. McNee, J. A. Konstan, and G. Lausen. Improving recommendation lists through topic diversification. In WWW '05, pages 22-32, 2005.