Top-N recommendations on Unpopular Items with Contextual Knowledge Paolo Cremonesi Paolo Garza Politecnico di Milano, Italy Politecnico di Milano, Italy paolo.cremonesi@polimi.it garza@elet.polimi.it Elisa Quintarelli Roberto Turrin Politecnico di Milano, Italy Moviri srl, Italy quintare@elet.polimi.it roberto.turrin@moviri.com ABSTRACT used to improve the accuracy of recommendations. For ex- Traditional recommender systems provide recommendations ample, the restaurants and cuisine types chosen during the of items to users; recently, some of them also consider the winter can be different from those chosen in the summer; context related to predictions. In this paper we propose moreover, the age and location of a user can have a great a technique that relies on classical recommendation algo- impact in the choices related to the food domain. The use of rithms and post-filters recommendations on the basis of con- contextual information in the area of recommendation sys- textual information available for them. Association rules tems has been considered in the recent literature [1, 7, 8, are exploited to identify the most significant correlations 13, 14]. In [13] the notion of context is integrated in the among context and item characteristics. The mined rules customer’s behavior model for improving the prediction of are used to filter the predictions performed by traditional their behavior; however, in [8] the authors show that col- recommender systems to provide contextualized recommen- lecting contextual information relevant for recommendation dations. Our experimental results show that the proposed purposes is a hard task and they develop a special-purpose approach allows improving the output of classical algorithms browser to gather such information. The proposals related proposed in the literature, especially in the case of unpopu- to contextual recommendation systems can be classified into lar items. three main groups: pre-filtering, post-filtering, and contex- tual modeling [2]. In [1] the context is used for rating es- timation in multidimensional recommender systems, where Categories and Subject Descriptors contextual information are added to the standard dimen- H.4 [Information Systems Applications]: Miscellaneous sions related to users and items. In [7] a proposal which uses contextual values as virtual items used together to standard ones is presented; the advantage is that classical recommen- General Terms dation algorithms can be used without any modification. In Algorithms, Experimentation [4] an approach to pre-filter the item ratings on the basis of the possible values of a unique contextual perspective, that Keywords is the season when the rating was expressed, is described. [14] contains a comparison between the pre-filtering and the Recommender system, Association rules, Contextual data post-filtering approaches and identifies which method dom- inates the other and under which circumstances, since, in 1. INTRODUCTION AND RELATED WORK general, there is not an approach better than the others. Recommender systems help people in retrieving poten- In this paper we introduce a generic contextual post-filte- tially useful information or products in a huge set of choices ring technique, not specifically adapted to a target scenario; by using the knowledge of the individual’s past ratings, but however, to clarify the exposition, in the examples we will also the ratings of all the system users. use the movie domain. The proposed approach exploits as- However, other aspects, such as the situation, location, sociation rules to identify the most significant correlations and time can influence the user ratings and thus, can be among context and item characteristics and uses them to fil- ter the predictions performed by a traditional recommender system in order to provide contextualized recommendations. The performed experiments show the effectiveness of the proposed approach in the context of top-N recommenda- tions, in particularly when unpopular items are considered. The paper is organized as follows. In Section 2 the pro- posed approach is described, while in Section 3 the obtained experimental results are discussed. Finally, Section 4 draws CARS-2011, October 23, 2011, Chicago, Illinois, USA. conclusions and discusses future work. Copyright is held by the author/owner(s). are related to a context c, and hence which items must be considered, we propose to exploit association rules, and in particular the correlations between context and item charac- teristics. Association rules [3] are rules of the form X ⇒ Y , where both X and Y are sets of objects. In our approach, X is a set of predicates representing a context and the conse- quent is a characteristic of the available items. For instance, (gender = M ) ∧ (age = [20 − 25]) ⇒ (genre = horror) is an association rule representing a correlation between the con- text (gender = M ) ∧ (age = [20 − 25]) and the horror genre mined from the available data (users, items, contexts, and ratings). Association rules have been successfully applied to market basket analysis [3] and automatic data classifica- Figure 1: The Context-RS recommender system tion [11]. In this work we show that they can be profitably exploited also in the context of recommender systems. 2. PROPOSED ALGORITHM Two measures are usually used to assess the quality of as- sociation rules: support (sup) and confidence (conf ). The The notion of context has emerged with different inter- support represents the frequency of the rule in the analyzed pretations in various fields of research like psychology, phi- data, while the confidence estimates the conditional proba- losophy, or computer science. A widely accepted definition bility of the consequent given the antecedent. is the one proposed in [6], where the context is considered The proposed approach infers which kinds of items are as any information useful to characterize the situation of an more of interest for each context c by extracting association entity, in our work an access to an item in the movie do- rules from the available training ratings, by taking into con- main. In general, contextual information can be explicitly sideration also the context in which the ratings have been declared by the user, e.g., the user is interested in dramatic given and the characteristics of the rated items. In particu- movies when he/she is alone, and in horror movies when lar, it extracts all the association rules of the form context ⇒ he/she is with friends, or implicitly inferred by the system, item charactetistic with support and confidence higher than as in the case of temporal information, or by sensors, e.g., the minimum thresholds minsup and minconf , respectively. the location. The contextual information available for our We remark that the characteristics of the considered users scenario includes static demographic information, i.e., Age, are also part of the contextual knowledge. Hence, in the Gender, Occupation, and ZIP code (of the residence address) antecedent of the extracted rule predicates on the charac- of the users. However, the generality of our proposal allows teristics of the users can be present (e.g., a predicate such us to extend our experiments to domains where a wider set as gender = M ). of contextual and dynamic values is traced. For example, After the set of contextual rules has been mined, for each we could consider the user situation (he/she is alone or with context c the consequents of the association rules having c friends) and his/her location (e.g., at home, at office). as antecedent are used to form a set Sc composed of the item 2.1 The Context-RS recommender system characteristics which are more frequently related to c. Only those items which are characterized by a characteristic that We propose a new recommender system, called Context- belongs to Sc can be recommended when the user current RS, that combines contextual knowledge, traditional recom- context is c. For example, suppose the following two rules mender systems, and association rules to improve the quality are extracted for the context (gender = M ): of top-N recommendations. Context-RS is a post-filtering (gender = M ) ⇒ (genre = horror), sup = 5%, conf = 50% approach (see Figure 1) based on the available knowledge (gender = M ) ⇒ (genre = action), sup = 4%, conf = 40% about users (U ), items (I), contexts (C), and ratings (R). It It follows that Sc is equal to {(genre = horror), (genre = exploits association rules to find useful correlations between action)} for the context (gender = M ). Hence, only horror contextual knowledge and item characteristics and uses the and action movies can be recommended to male users when extracted rules to filter uninteresting items from the items our approach is adopted. recommended by a traditional recommender system. Given an arbitrary user u in the context c, the proposed approach works as follows to select the top-N items to rec- 2.1.1 Contextual rule mining ommend to u. In order to be able to apply traditional rule mining algo- rithms to mine contextual rules, the available data (ratings 1. for each unrated item i a traditional (collaborative) and the related knowledge) must be represented in the trans- recommender system is used to predict the appropriate actional data format [3]. In the context of association rule a rating rui of user u transaction is defined as a set of items, and a transactional 2. the subset of items having a set of characteristics re- dataset as a set of transactions [3]. For the extraction of the lated to the current context c, according to the ex- contextual rules exploited by our approach, a transactional tracted contextual association rules, is selected dataset D is generated by applying a data transformation on the rating data taking into consideration also contextual 3. the top-N items of the subset of items selected during knowledge and item characteristics. For each rating four the previous step are recommended to user u pieces of information are known: the rated item (i), the The main difference between Context-RS and the previ- user who rated the item (u), the context in which the rating ous post-filtering approaches is given by the adopted selec- has been given (c), and the rating value (rui ). Each rating tion technique (step 2). To decide which types of items given by user u to item i in the context c is transformed in a transaction (i.e., set of pairs) composed of Algorithm All items Long tail items the following pairs: standard 28.90 14.14 PureSVD with context 29.32 16.17 • one pair (i.characteristic=value) for each characteristic standard 13.47 4.40 AsySVD with context 14.26 6.96 of the considered item i NNCosNgbr withstandard 25.75 7.75 • one pair (u.characteristic=value) for each characteris- context 26.33 9.60 tic of the considered user u CorNgbr standard 7.59 2.91 with context 8.63 3.97 • one pair (c.characteristic=value) for each characteris- TopPop standard 14.35 0.03 with context 16.07 0.06 tic of the considered context c MovieAvg standard 0.50 0.27 For instance, consider a rating given by a male in the with context 1.77 0.42 situation “with friends” and in the location “at home” to a fantasy movie produced in year 2001. The set {(gender = Table 1: Recall% at N =3 M ), (situation = with f riends), (position = at home), (genre = f antasy), (year = 2001)} is included in D. In order to extract only rules related to positive user ex- have randomly selected 1,000 additional items unrated by periences we have considered only the positive user ratings user u, (ii) we have predicted the ratings for the test item i (i.e., 4 and 5 out of 5). and the additional 1,000 items, and (iii) we have formed a From the generated dataset D, contextual association rules top-N recommendation list composed by the N items with are efficiently mined by means of an implementation of the the highest predicted ratings. Finally, the recommendation FP-growth algorithm downloaded from the FIMI website [9]. quality has been measured in terms of recall(N ), defined as the percentage of tested items that appear in the top-N list 3. EXPERIMENTAL RESULTS [5]. Since large datasets with contextual information are not Similarly to [5], an advanced analysis used to compute the available, in our initial evaluation of the proposed approach quality on long-tail items has been performed by selecting we used the Movielens dataset and we considered as context from T the subset of ratings related to unpopular items. In (“static” context) the available demographic data. our tests, we have considered only items with less than 990 ratings (the 95% of items) which refer to the 67% of ratings. 3.1 Testing methodology, datasets, and algo- rithms 3.2 Recall analysis Our study considered two neighborhood (CorNgbr and In this section, we analyze the recall achieved by contex- NNCosNgbr) and two latent-factor (AsySVD and PureSVD) tual and not contextual algorithms on the Movielens dataset. collaborative algorithms, along with two non-personalized Initially, we report a high-level analysis related to the re- algorithms used as baseline (TopPop and MovieAvg). call achieved by setting the number of recommended items The two non-personalized algorithms recommend static N to 3 (real systems such as IMDB and Amazon usually lists of items regardless the collected user ratings: Top- show from 2 to 5 recommended items per page depending Pop (Top Popular) suggests the most rated items (i.e., the on their layout). Then a deeper discussion on the recall most popular), while MovieAvg (Movie Average) suggests achieved by varying N on the long tail items is reported. the highest rated items (i.e., the most liked). We recall that, in the performed experiments, the context is Neighborhood collaborative algorithms are based on the given by age and gender of the users, while the considered similarity relationships among either users or items, in terms item characteristic is the movie genre. For the contextual of collected ratings. CorNgbr (Correlation Neighborhood) rule mining step, we set the minimum support threshold to computes item-item similarity by means of the Pearson lin- 0.01%, while the minimum confidence was initially set to 0% ear correlation coefficient [10]. Similarly, NNCosNgbr (Non- (the effects of the enforced minimum confidence threshold is normalized Cosine Neighborhood) computes item-item sim- analyzed in the following). Table 1 summarizes the recall ilarity by means of the cosine coefficient [5]. at N =3 for the considered algorithms. For each algorithm Latent factor collaborative models represent users and we report the recall obtained by the traditional version of items in a common low-dimensional ‘latent factor’ space. the algorithm (standard version) and that achieved by the AsySVD (Asymmetric SVD) is a matrix factorization model contextualized version (with context). We report the recall that reported an RMSE of 0.9000 in the Netflix context [10]. obtained on all items and that achieved on the long tail PureSVD is a latent factor algorithm recently proposed [5], items. The exploitation of contextual rules allows improv- whose rating estimation rule is based on the conventional ing the performance of all the considered algorithms in both SVD, where unknown ratings have been treated as zeros. cases (all items and long tail items). The recall improvement In this work we have considered the 1-million-rating Movie- is on the average higher when non-personalized algorithms lens dataset [12], which publishes user ratings along with are used (e.g., +1.72% for TopPop on all items) and less movie genres (e.g., comedy, horror, . . . ) and demographics relevant when state of the art latent factor based recom- (e.g., user gender, age, . . . ). According to the methodology mender systems are considered (e.g., +0.42% for PureSVD adopted in [5], known ratings are split into a training set M on all items). Even if contextual knowledge allows improv- and a test set T . The test set T contains only 5-star ratings, ing recall also when all items are considered, its positive i.e., the items relevant to the respective users. Therefore, effect is more evident when long tail items are considered. we have firstly trained the algorithm over the ratings in M. In this more difficult situation the information provided by Then, for each rating in T given by user u to item i, (i) we the automatically extracted contextual rules allows improv- ing more significantly also the recall of the better performing 50 algorithms (+2.03% for PureSVD and +1.85% for NNCos- PureSVD (with context - conf=15%) PureSVD (with context - conf=0%) Ngbr). The post-filtering operation, performed by means PureSVD (std) AsySVD (with context - conf=15%) of contextual rules, allows focusing on the most interesting 40 AsySVD (with context - conf=0%) AsySVD (std) movie genres for each context discarding irrelevant items from the top-N prediction list. The improvement of recall 30 recall(N)% on the long tail items is an interesting property. In fact, a good recall on the long tail items means that the proposed approach is able to properly recommend also not best seller 20 items, providing more novelty. 10 3.2.1 Prediction of long tail items Figures 2(a)-2(c) report the recall-at-N when varying the 0 value of N . For the contextual algorithms we report the re- 0 2 4 6 8 10 N sults obtained by using two different settings during the con- textual rule mining step: minconf =0% and minconf =15%. The second configuration (minconf =15%) is more selective. (a) Latent factor algorithms Hence, for each context a fewer genres are selected. How- ever, as discussed in the following, the second configuration allows achieving higher recall values. 50 NNCosNgbr (with context - conf=15%) NNCosNgbr (with context - conf=0%) The contextual versions of the considered algorithms are NNCosNgbr (std) CorNgbr (with context - conf=15%) significantly better than the not contextual ones, indepen- 40 CorNgbr (with context - conf=0%) CorNgbr (std) dently of the considered algorithm. The more selective con- figuration, i.e., the one with minconf =15%, shows to be the 30 most effective (+7.6% at N =5 with respect to standard ver- recall(N)% sion for PureSVD). We performed also a set of experiments by setting minconf equal to 20%. However, in that config- 20 uration the filter becomes too tight and recall decreases. The use of contextual rules allows improving significantly 10 the recall of the non-personalized algorithm TopPop (+18.6% at N =10 with respect to standard TopPop). The recall 0 of the contextual version of TopPop is comparable to that 0 2 4 6 8 10 achieved by the standard versions of many other more com- N plex algorithms. In particular, the recall achieved by the contextualized version of TopPop is comparable to that of (b) Neighborhood algorithms standard NNCosNgbr and even better than standard AsySVD. 50 TopPop (with context - conf=15%) 3.2.2 Recommendations for new users TopPop (with context - conf=0%) TopPop (std) The good performance achieved by the contextualized ver- 40 MovieAvg (with context - conf=15%) MovieAvg (with context - conf=0%) sion of TopPop is particularly of interest in presence of new MovieAvg (std) users. When recommender systems have to deal with new 30 users, the state of the art collaborative approaches cannot be recall(N)% used because the history of new users is empty. In this situ- ation only non-personalized algorithms, such as TopPop and 20 MovieAvg, can be used. Since the contextualized version of TopPop needs only to know the current context and the pro- 10 file of the new user (e.g., Age and Gender), contextualized TopPop can be used to recommend items to new users. The recall at N =10 of the contextualized TopPop is 18.9% while 0 0 2 4 6 8 10 that of the standard (not contextualized) TopPop is 0.25%. N Hence, our approach is a very interesting solution when new users have to be managed. (c) Non-personalized algorithms 3.3 Explanation provided by contextual rules The main goal of Context-RS is the recall improvement. Figure 2: Movielens: recall-at-N on long tail (95% However, the proposed approach adds also expressiveness of items) to traditional collaborative recommender systems. In fact, the extracted contextual rules can be used to explain to end users the reason behind the performed recommendations. A the used standard collaborative filter and (ii) their charac- merely list of items is usually less appealing than a set of teristics are considered interesting for the current context, items with an explanation. All the items recommended by according to the extracted contextualized rules. By simply Context-RS are selected because (i) they are ranked high by showing together to each recommended item also the contex- Context ⇒ Item characteristic Support% (Absolute Support) Confidence (gender = F ) ∧ (age = [35 − 44]) ⇒ (genre = Drama) 1.72% (17204) 34.69% (gender = F ) ∧ (age = [35 − 44]) ⇒ (genre = Comedy) 1.59% (15903) 32.16% (gender = F ) ∧ (age = [35 − 44]) ⇒ (genre = Romance) 0.87% (8702) 17.62% (gender = F ) ∧ (age = [35 − 44]) ⇒ (genre = Children0 s) 0.38% (3801) 7.62% (gender = F ) ∧ (age = [35 − 44]) ⇒ (genre = M usical) 0.24% (2401) 4.89% (gender = F ) ∧ (age = [35 − 44]) ⇒ (genre = Animation) 0.22% (2200) 4.36% (gender = F ) ∧ (age = [35 − 44]) ⇒ (genre = M ystery) 0.19% (1900) 3.85% (gender = F ) ∧ (age = [35 − 44]) ⇒ (genre = F antasy) 0.15% (1500) 2.98% (gender = M ) ∧ (age = [35 − 44]) ⇒ (genre = Action) 3.40% (34007) 22.76% (gender = M ) ∧ (age = [35 − 44]) ⇒ (genre = T hriller) 2.42% (24205) 16.18% (gender = M ) ∧ (age = [35 − 44]) ⇒ (genre = Sci − F i) 2.15% (21504) 14.41% (gender = M ) ∧ (age = [35 − 44]) ⇒ (genre = Adventure) 1.72% (17204) 11.51% (gender = M ) ∧ (age = [35 − 44]) ⇒ (genre = W ar) 1.03% (10302) 6.91% (gender = M ) ∧ (age = [35 − 44]) ⇒ (genre = Horror) 0.90% (9002) 6.05% (gender = M ) ∧ (age = [35 − 44]) ⇒ (genre = M ystery) 0.52% (5201) 3.48% (gender = M ) ∧ (age = [35 − 44]) ⇒ (genre = W estern) 0.33% (3301) 2.19% (gender = M ) ∧ (age = [35 − 44]) ⇒ (genre = F ilm − noir) 0.30% (3001) 2.01% (gender = M ) ∧ (age = [35 − 44]) ⇒ (genre = Documentary) 0.12% (1200) 0.78% Table 2: Contextual rules for two representative contexts tual rule that has been used to select it we can improve the [4] L. Baltrunas and F. Ricci. Context-based splitting of confidence of the user on the provided recommendations. As item ratings in collaborative filtering. In ACM an example of the knowledge provided by contextual rules, RecSys’09, pages 245–248, 2009. the sets of rules for two representative contexts are reported [5] P. Cremonesi, Y. Koren, and R. Turrin. Performance in Table 2. The first context is “female in the range 35-44 of recommender algorithms on top-n recommendation years old”, while the second one is “male in the range 35-44 tasks. In ACM Recsys’10, pages 39–46, 2010. years old”. In both contexts, only a subset of the 18 available [6] A. Dey. Understanding and using context. Personal genres is automatically selected by means of the extracted Ubiquitous Computing, 5(1):4–7, 2001. contextual rules. [7] M. A. Domingues, A. M. Jorge, and C. Soares. Using contextual information as virtual items on top-n 4. CONCLUSIONS AND FUTURE WORK recommender systems. In Workshop on Context-Aware In this paper, we showed how contextual rules, represent- Recommender Systems (CARS-2009), pages 1–5, 2009. ing frequent relationships between context information and [8] M. Gorgoglione, C. Palmisano, and A. Tuzhilin. the characteristics of the rated items, allow increasing the Personalization in context: Does context matter when recall of state-of-the-art collaborative recommender systems. building personalized customer models? In ICDM’06, Our approach can be profitably applied to both personalized pages 222–231, 2006. and non-personalized recommender systems. [9] R. J. B. Jr., B. Goethals, and M. J. Zaki. Frequent As future work, we are focusing our attention on the cross- itemset mining implementations repository, 2004. domain problem. In particular, we are investigating the http://www.cs.rpi.edu/~zaki/Workshops/FIMI/. possible application of contextual rules, extracted from a [10] Y. Koren. Factorization meets the neighborhood: a dataset, to another dataset. For example, the set of rules multifaceted collaborative filtering model. In ACM mined from a movie dataset could be applied to a book SIGKDD’08, pages 426–434, 2008. dataset. Obviously a mapping function is needed to map [11] B. Liu, W. Hsu, and Y. Ma. Integrating classification the rules mined from one domain to the domain of the other and association rule mining. In ACM SIGKDD’98, dataset. This approach could be useful (i) when past con- pages 80–86, 1998. textual history is available only for one dataset or (ii) in [12] B. Miller, I. Albert, S. Lam, J. Konstan, and J. Riedl. presence of noisy datasets. Contextual rules are mined from MovieLens unplugged: experiences with an the first, non-noisy, dataset and are used to perform contex- occasionally connected recommender system. In tualized recommendations for the second dataset. International conference on Intelligent user interfaces, pages 263–266, 2003. 5. REFERENCES [13] C. Palmisano, A. Tuzhilin, and M. Gorgoglione. Using [1] G. Adomavicius, R. Sankaranarayanan, S. Sen, and context to improve predictive modeling of customers A. Tuzhilin. Incorporating contextual information in in personalization applications. IEEE Trans. Knowl. recommender systems using a multidimensional Data Eng., 20(11):1535 –1549, 2008. approach. ACM Trans. Inf. Syst., 23(1):103–145, 2005. [14] U. Panniello, A. Tuzhilin, M. Gorgoglione, [2] G. Adomavicius and A. Tuzhilin. Context-aware C. Palmisano, and A. Pedone. Experimental recommender systems. In ACM RecSys’08, pages comparison of pre- vs. post-filtering approaches in 335–336, 2008. context-aware recommender systems. In ACM [3] R. Agrawal, T. Imieliński, and A. Swami. Mining RecSys’09, pages 265–268, 2009. association rules between sets of items in large databases. In ACM SIGMOD’93, pages 207–216, 1993.