Inferring Contextual User Profiles - Improving Recommender Performance

Alan Said, TU Berlin, DAI Lab, alan.said@dai-lab.de
Ernesto W. De Luca, TU Berlin, DAI Lab, ernesto.deluca@dai-lab.de
Sahin Albayrak, TU Berlin, DAI Lab, sahin.albayrak@dai-lab.de

ABSTRACT
In this paper we present the concept of inferred contextual user profiles (CUPs), which extends the traditional user profile definition by describing the user in a given situation, or context. The approach is evaluated in the scope of movie recommendation. In our evaluation, we infer two CUPs for each user, and use only one of the profiles, instead of the full user profile, for recommending movies. We evaluate the model on a data snapshot from the Moviepilot movie recommendation website, with results showing a substantial improvement in terms of precision, recall and mean average precision.

Categories and Subject Descriptors
H.3.3 [Information Search and Retrieval]: Retrieval models; H.3.5 [Online Information Services]: Web-based services

General Terms
Algorithms, Design, Experimentation, Human Factors

Keywords
recommender systems, collaborative filtering, experimentation, context-awareness, user modeling, information retrieval, human factors, movie recommendation

1. INTRODUCTION
Recommender systems have become a popular component in online services to help and guide users in information retrieval oriented tasks [16]. Frequently, recommender systems infer the preferences of users based on a priori data, i.e. the already consumed data. Collaborative Filtering (CF) models are the de facto standard when it comes to recommendation of frequently consumed items, e.g. movies, books, etc. [14, 16]. CF calculates the relevance of an item for a user based on other users’ rating information on items co-rated by the user and his or her peers. CF approaches are commonly categorized as either model-based or memory-based [8]. In this work we focus on the latter, which creates item predictions for a user by finding users similar to that user (in terms of co-rated items), a so-called neighborhood. The information from the neighborhood is then used to predict items not rated by the user which should be of interest. Memory-based, or neighborhood-based, approaches commonly use measures such as the Pearson correlation coefficient or cosine similarity to create the neighborhoods [14].

However, in some situations, approaches using only the historical usage information of users are not capable of identifying relevant items [2], or approaches utilizing other information can provide better recommendations. Instead, by first identifying the situation, the context, a system can provide tailored recommendations for the specific context, provided information about it is available.

In order to create a context-aware recommendation model, one needs to define the concept of context. In this work we use Dey’s widely-accepted definition: “Context is any information that can be used to characterize the situation of an entity” [11]. Here, the entity is understood as an item which can be influenced by contextual parameters that describe the state of the user and item during consumption.

Context-aware systems commonly use a predefined static set of contexts in order to generate recommendations for the specific situation, e.g. weekday, season, time of day [4, 13].

We propose an approach for automatic context-inference in the scope of movie recommendation, based on the time of a rating event and the information on whether or not the rated movie is still shown in the cinema.

Our approach to context-inference for recommendation is evaluated using a dataset from the Moviepilot¹ movie recommendation website. We present an inferred Contextual User Profile (CUP), a user profile similar to the “micro-profile” concept by Baltrunas and Amatriain [4].
Our model infers the context of where a movie was seen (at the cinema, or at home) through a combination of movie metadata, the dates of when a movie was shown in the cinema, and the creation time of the rating, i.e. the time when the movie was rated by a user. The model creates two “virtual” (context) profiles for each user (two CUPs), the cinema CUP and the home CUP.

The biggest difference between our work and the related work described in Section 2 is that we infer Contextual User Profiles automatically (i.e. split users into context-aware sub-profiles, as shown in Figure 1), and show that even this simple model of context-inference adds to the quality of a recommender. The process is presented in detail in Section 3.

Our experiments show that when using our context model, we can improve recommendation results significantly compared to the uncontextualized preferences of users. The full details of our evaluation and results are presented in Section 4. The paper is concluded by a summary of the contributions and a discussion about future work in Section 5.

Our main contribution is showing that a relatively simple inference model based on surrounding information can be used to boost recommendation results considerably.

¹ http://www.moviepilot.de

CARS-2011, October 23, 2011, Chicago, Illinois, USA. Copyright is held by the author/owner(s).

2. RELATED WORK
Contextual user modeling, and context-awareness in general, have been hot topics during recent years with numerous papers [4, 13, 17], workshops [3, 10], etc. covering the field. However, the topic is not new, and has been touched upon for the better part of the last 20 years. One of the earliest systems using the concept of location-based context, the Active Badge Location System by Want et al. [18], introduced this type of context-awareness as a means of providing services to people in an office environment. Similar systems have subsequently been put to use both in research and in industry. Bokun and Zielinski [7], for instance, created the Next Generation Active Badge System, which broadcast the location of the badge wearers. Abowd et al. [1] wrote about context for mobile environments, in the form of location for automated tour guides, as early as 1997.

Adomavicius and Tuzhilin [2] divide context-aware recommender systems (CARS) into three types:

1. Contextual Pre-Filtering, where context directs data selection.

2. Contextual Post-Filtering, where context is used for filtering recommendations computed by traditional approaches.

3. Contextual Modeling, where context is directly integrated into the model.

Contextual pre-filtering can be achieved by using “micro-profiles”, where a single user profile is split into several, possibly overlapping, contextual sub-profiles, each representing the user in one or several particular contexts [4]. Here, the recommendation process uses these micro-profiles, not only a single user model. The performance is shown to be better than that of traditional Collaborative Filtering methods. Contextual post-filtering is applied within traditional approaches, while contextual modeling directly involves the model, e.g. adapting a generic tensor factorization approach. An example of this is the tensor factorization-based Collaborative Filtering method by Karatzoglou et al. [13], which allows a flexible and generic integration of contextual information using a User-Item-Context N-dimensional tensor for modeling data, instead of the traditional User-Item matrix. In their “Multiverse Recommendation” model, every different type of context is considered as an additional dimension in the data representation, extending the user-item matrix to a tensor. The factorization of this tensor leads to a compact data model that can be used to provide context-aware recommendations.

Bogers [6] presents a movie recommendation algorithm, ContextWalk, based on taking random walks on the contextual graph. In addition to the common CF user-item relations, this algorithm allows the inclusion of different types of contextual features, such as actors, genres, directors, etc. It supports other recommendation tasks with the same random walk model without the need for alteration or retraining, e.g. recommending interesting movies or actors for a specific group of users.

At the moment, recommender systems tend to use very simplistic user models, adding new user preferences to the existing profiles as the users interact with more items (e.g. rate new movies, buy new books, etc.). But these approaches often ignore the “situated action” of the user. Situated action states that users who interact with a system in a particular context, and find items relevant within that context, may find the same items irrelevant in a different context [15].

As stated by Mobasher [15], context plays an important role in psychology for human memory as well as in linguistics for disambiguation purposes. Research in intelligent information systems has also shown that incorporating context, or situational awareness, in the recommendation process increases the performance and perceived usefulness of recommender systems [4].

3. CONTEXTUAL USER MODELING
Given an analysis of user modeling in the scope of recommender systems, in this paper we choose to extend the term to contextual user modeling, as our focus is on defining context-aware user profiles (CUPs). Each CUP is specific for the situations a user encounters.

The context profile model we describe is based on the location and time, the context (or “situated action” [15]), in which a user watches a movie. Given a set of users, movies and ratings with timestamps of when the rating event occurred, we infer the context of the rating event. We define two CUPs, home and cinema, and assign each user’s movie ratings to one of these, as shown in Figure 1. Assignment of ratings is based on the assumption that movies rated within two months of their cinema premiere date have been seen in the cinema²; consequently, movies rated at a later point in time are assumed to have been seen at home.

Having created two CUPs for every user, we can now use a collaborative filtering approach to recommend movies for each of the CUPs based on the ratings in each specific context.

² The specific time a movie is shown in the cinema usually varies depending on the number of visitors; however, the time between the cinema and home release of a movie usually varies between 4 weeks and 4 months [9], 2 months being typical for German cinema.

[Figure 1, panel (a): Uncontextualized rating matrix. Panel (b): The same rating matrix, where users from (a) have been divided into CUPs (home and cinema columns per user).]

Figure 1: Shown is an example of a user-movie matrix (a) and a user-movie-context (b) matrix. Columns with identifiers u_i...u_l refer to users and rows with identifiers m_a...m_e to movies. The elements of the matrix are the ratings given by users to movies. Some users might have only one CUP, as is the case with u_k.

This type of modeling is in agreement with the pre-filtered context-awareness concept discussed in Section 2. It is also related to the time-based “micro-profiles” approach presented by Baltrunas and Amatriain [4], where users are also divided into sub-profiles; however, these sub-profiles are based on the time of the event only, without taking its location and item-specific metadata into consideration.

The rationale for this division is the assumption that people have different rating profiles, or different tastes, based on where and when they see a movie; consequently, the movies which should be recommended to users should be different depending on how the movie will be consumed.

Our model is built upon the assumption that users rate movies they have seen within a short amount of time from the time of viewing, i.e. generally not saving up ratings, but rather rating movies continuously. This is supported by the general rating trend shown in Figure 2. The graph shows the average number of ratings per user from the initial month of registration for both the subset used in our experiments (introduced in Section 4.1) and the full dataset. As some users stop using the service, the number decreases over time. The high amount of ratings in the beginning indicates that users rate a “larger than normal” amount of movies just after registration, in order to create their profiles, but after one or two initial rating sessions, the average number of ratings per user per month stabilizes at between 10 and 12. There are no extreme anomalies (peaks) in the curve; were there any, these would indicate accumulated rating sessions.

4. EXPERIMENTS AND RESULTS
We evaluated our contextual user profile model on a dataset from the German movie recommendation community Moviepilot. It should be noted that the algorithm itself is not the focus of our evaluation, rather the concept of inferred contextual user profiles.

4.1 Dataset
The Moviepilot website contains information and news about movies, actors, directors, etc., as well as the ratings of movies seen by its users. One of the services offered by Moviepilot is movie recommendations. Each user is presented with a set of movies which should be of interest. These recommendations are based on the users’, and their peers’, previously rated movies.

The dataset is a subset of the full, unfiltered data that forms the basis for the Moviepilot website. The dataset was obtained directly from Moviepilot, thus eliminating any inconsistencies which might be the result of crawling a website like this. The dataset contains ratings by 10,000 randomly selected users who have rated at least one movie. In addition to the ratings, the dataset also contains information on when movies had their cinema premieres. The total number of ratings in our subset is 1,539,393, spread over four years. The total number of ratings in Moviepilot over the same amount of time is more than 7 million. Figure 2 shows the number of ratings per month in both datasets. The ratings are stored on a 0 to 100 scale with 0 being the lowest and 100 being the highest. The scale the users are presented with is 0.0 to 10.0.

[Figure 2: plot of # of ratings per user (y-axis, 0 to 140) against # of months after first rating (x-axis, 0 to 36); series: Subset used, Full dataset.]

Figure 2: The sum of the total number of ratings per month per user since their first rating. The number of ratings, in both the full dataset as well as in our subset, stabilizes at around 10 ratings per month per user. The high number for the first month in our dataset is explained by the users in our dataset being active users, i.e. users who create a profile for the purpose of returning. The significantly lower value in the full dataset is due to users who create a profile, rate very few items and never return.

4.2 Experimental Setup
The algorithm used to produce the recommendations is based on collaborative filtering [16]. We evaluate our results on a subset of 10,000 randomly selected users due to the long running times of the experiments when the full dataset was used. Even for this subset, each experiment took circa 3 hours to complete on a 2.4GHz dual core PC.

For the experiments, 50 training and evaluation sets each for the original and for the contextual user profiles were created. The evaluation sets consisted of circa 5000 ratings for 500 randomly selected CUPs for the contextualized evaluation. Analysing the 10,000 users in our dataset, we were able to identify 7,487 cinema CUPs and 4,670 home CUPs, meaning that not all users seem to rate movies in both contexts. For the uncontextualized case, the CUPs were merged into the original user, meaning a smaller number of columns in the input matrix (see Figure 1(a)). The merged columns have roughly twice as many ratings each, though³.

In order to avoid problems related to cold start, for both users and items, we decided that users in the evaluation sets had to have rated at least 30 movies. For each of these users, 10 movies having been rated with a value above the user’s average rating were extracted into the evaluation set (i.e. the set of True Positive recommendations). The rest of the ratings were used for training. The recommendation algorithm was run one time each for the 50 pairs of original and CUP datasets. The results presented in this paper are averaged over all 50 runs.

The recommendation algorithm used in our experiments was K-Nearest Neighbor using the Pearson Correlation Coefficient as the neighbor similarity measure. Experiments were performed for K = 150. We evaluate our recommendations with the Mean Average Precision (MAP), Precision at N, and Recall at N measures. These measures were chosen since they are well-known and widely used in the field of Recommender Systems and Information Retrieval, providing a statistically sound estimate of the recommendation quality [12].

³ This should bias the results positively for the original setup, as the number of true positives becomes twice as high (at most) for the merged users compared to the CUPs.

4.3 Results
Figure 3 shows the precision levels obtained in our experiments.
The recommendations using the contextualized user profiles outperform the original dataset by 200% when recommending one item only in terms of average precision. The approach consistently outperforms the baseline until the recommended set reaches circa 50 items. In terms of recall, shown in Figure 4, the CUP approach consistently outperforms the baseline. When looking at each CUP separately, we see that the home CUP outperforms all other approaches (contextual and not contextual) by an even larger margin. The performance in terms of recall is similar; however, the original user profiles never seem to be able to outperform the CUPs. When looking at MAP, shown in Table 1, the improvement is somewhat smaller, which is expected given the fact that precision is higher for the original user profiles at high values of N.

[Figure 3: plot of Precision (y-axis) against N (x-axis: 1, 5, 10, 50, 100); series: Original Profiles, CUPs, CUPs Home, CUPs Cinema.]

Figure 3: Precision@N with N={1, 5, 10, 50, 100} for the original user profiles, the average value for both home and cinema CUPs and for each of the two inferred CUPs.

[Figure 4: plot of Recall (y-axis) against N (x-axis: 1, 5, 10, 50, 100); series: Original Profiles, CUPs, CUPs Home, CUPs Cinema.]

Figure 4: Recall@N with N={1, 5, 10, 50, 100} for the original user profiles, the average value for both home and cinema CUPs and for each of the two inferred CUPs.

Recommender                 MAP        % improvement
Original users              5.26E-3    0%
Contextual user profiles    6.05E-3    15%
Home Context                7.97E-3    51%
Cinema Context              6.00E-3    14%

Table 1: The Mean Average Precision values and the relative improvements for our CUPs model and the original user profiles.

The observed results confirm the assumption that the location and situation (“situated action” [15]) influence the consumer in such a way that the taste (i.e. rating value) differs from situation to situation. This confirms the notion of users having separate rating profiles depending on the combination of where, how and when the movie is seen. More importantly, the performance of a recommender system can be improved considerably if this information is used.

5. CONCLUSION
In this paper we presented a method for automatic contextualization of rating events in a movie recommendation scenario, in order to create contextual user profiles, CUPs. By using the date of the rating, and the information on how new a movie was at the time of rating, we were able to infer the venue (at home, or at the cinema) in which a movie was seen.

We evaluated the inferred contextual user profiles and were able to considerably improve recommendation results in terms of precision, recall and mean average precision. Results indicate that automatic contextualization of user profiles into CUPs affects the quality of recommendations positively. We showed that, in a movie recommendation scenario, the venue and time of a consumption as well as the “freshness” of the item are reflected in the rating behavior of users and that this information can be used for recommendation purposes.

The situation in which users consume a particular product has an effect on their taste or rating behavior. However, the context covered in this work needs to be extended and further researched to gain more insight into the way contextualized user profiles should be inferred, managed and used. For instance, the profiles explored in this work are mutually exclusive, which, in the presented recommendation scenario, seems plausible, as the location of an event can only be singular. If the context profile were extended to include factors such as company, mood or ambiance of the venue, the assumption of mutual exclusiveness of the contexts may need to be relaxed.

Our current work includes the in-depth analysis of data in order to be able to accurately identify other contexts, infer them from implicit relations and subsequently use them for recommendation purposes.

In conclusion, it appears that even trivial context inference models can be used to considerably improve recommender system quality, without adding much complexity to the recommendation algorithms themselves.

In this paper we have covered the topic of inferred Contextual User Profiles (CUPs), and showed that, even with rather simple inference models, there is much to gain in terms of recommendation quality. The contexts covered in this work have been one related to watching movies in the comfort of one’s home, and one where the watching takes place at a cinema. Both contexts improve recommendation quality considerably.

6. ACKNOWLEDGMENTS
The authors would like to express their gratitude to the Moviepilot team who contributed to this work with dataset, relevant insights and support. The work in this paper was conducted in the scope of the KMulE project which was sponsored by the German Federal Ministry of Economics and Technology (BMWi).

7. REFERENCES
[1] G. D. Abowd, C. G. Atkeson, J. Hong, S. Long, R. Kooper, and M. Pinkerton, ‘Cyberguide: a mobile context-aware tour guide’, Wirel. Netw., 3, (10/1997).
[2] G. Adomavicius and A. Tuzhilin, Context-Aware Recommender Systems, 217–257, Springer, 2011.
[3] G. Adomavicius, A. Tuzhilin, S. Berkovsky, E. W. De Luca, and A. Said, ‘Context-awareness in recommender systems: research workshop and movie recommendation challenge’, in RecSys 2010. ACM.
[4] L. Baltrunas and X. Amatriain, ‘Towards Time-Dependant recommendation based on implicit feedback’, in CARS 2009, (2009).
[5] S. Berkovsky, A. Said, and E. W. De Luca, eds. CAMRa ’10. ACM, 2010.
[6] T. Bogers, ‘Movie recommendation using random walks over the contextual graph’, in CARS 2010.
[7] I. Bokun and K. Zielinski, ‘Active badges–the next generation’, Linux J., 10/1998, (1998).
[8] J. S. Breese, D. Heckerman, and C. Kadie, Empirical analysis of predictive algorithms for collaborative filtering, volume 461, 43–52, San Francisco, CA, 1998.
[9] Ben Child. Closing the window on the multiplex | ben child | guardian.co.uk. http://www.guardian.co.uk/film/filmblog/2010/may/28/cinema-window-dvd-release-multiplexes (retrieved 07/2011), May 2010.
[10] E. W. De Luca, A. Said, M. Böhmer, and F. Michahelles, ‘Workshop on context-awareness in retrieval and recommendation’, in IUI. ACM, (2011).
[11] A. K. Dey, ‘Understanding and using context’, Personal Ubiquitous Comput., 5, (01/2001).
[12] J. L. Herlocker, J. A. Konstan, L. G. Terveen, and J. T. Riedl, ‘Evaluating collaborative filtering recommender systems’, ACM Trans. Inf. Syst., 22, (01/2004).
[13] A. Karatzoglou, X. Amatriain, L. Baltrunas, and N. Oliver, ‘Multiverse recommendation: n-dimensional tensor factorization for context-aware collaborative filtering’, in RecSys 2010. ACM, (2010).
[14] G. Linden, B. Smith, and J. York, ‘Amazon.com recommendations: item-to-item collaborative filtering’, Internet Computing, IEEE, 7(1), (jan/feb 2003).
[15] B. Mobasher, ‘Contextual user modeling for recommendation’, in Keynote at the 2nd Workshop on Context-Aware Recommender Systems, (2010).
[16] Moviepilot. Wie funktioniert moviepilot? http://www.moviepilot.de/pages/faq#wie_funktioniert_moviepilot (retrieved 03/2011).
[17] A. Said, ‘Identifying and utilizing contextual data in hybrid recommender systems’, in RecSys. ACM, (2010).
[18] R. Want, A. Hopper, V. Falcão, and J. Gibbons, ‘The active badge location system’, ACM Trans. Inf. Syst., 10, (01/1992).
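As an illustration of the context-inference rule from Section 3 (ratings made within two months of the cinema premiere go to the cinema CUP, later ratings to the home CUP), the following is a minimal sketch, not the implementation used in the experiments. The function names, the tuple layout of a rating, the 61-day approximation of "two months", and the handling of ratings dated before the premiere are all assumptions made here for illustration.

```python
from collections import defaultdict
from datetime import date, timedelta

# Assumption: "two months" is approximated as 61 days; the paper gives no exact day count.
CINEMA_WINDOW = timedelta(days=61)


def infer_context(rated_on, premiere):
    """Return "cinema" if the rating falls inside the window after the premiere, else "home"."""
    # Assumption: a rating dated before the premiere is treated as "home".
    if premiere <= rated_on <= premiere + CINEMA_WINDOW:
        return "cinema"
    return "home"


def split_into_cups(ratings, premieres):
    """Split ratings into contextual user profiles.

    ratings:   iterable of (user, movie, value, rated_on) tuples (hypothetical layout)
    premieres: dict mapping movie -> cinema premiere date (assumed known for every movie)
    returns:   dict mapping (user, context) -> {movie: value}
    """
    cups = defaultdict(dict)
    for user, movie, value, rated_on in ratings:
        context = infer_context(rated_on, premieres[movie])
        cups[(user, context)][movie] = value
    return dict(cups)
```

A user who only rates movies long after their premiere ends up with a single home CUP, which matches the observation in Section 4.2 that not every user has both profiles.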
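The construction of the evaluation sets described in Section 4.2 (profiles with at least 30 ratings, 10 above-average movies held out as true positives, the rest used for training) could look roughly like the sketch below. The name split_profile, the random choice among above-average movies, and the decision to skip profiles with fewer than 10 above-average ratings are assumptions; the paper does not specify these details.

```python
import random


def split_profile(profile, min_ratings=30, n_eval=10, rng=None):
    """Split one profile ({movie: rating}) into training ratings and held-out true positives.

    Returns (train, held_out), or None if the profile does not qualify.
    """
    rng = rng or random.Random()
    if len(profile) < min_ratings:
        return None  # cold-start guard: too few ratings
    mean = sum(profile.values()) / len(profile)
    # Candidate true positives: movies rated above the profile's own average.
    liked = [movie for movie, value in profile.items() if value > mean]
    if len(liked) < n_eval:
        return None  # assumption: skip profiles without enough above-average ratings
    held_out = set(rng.sample(liked, n_eval))
    train = {movie: value for movie, value in profile.items() if movie not in held_out}
    return train, held_out
```

Passing a seeded random.Random instance as rng makes a split reproducible, which is convenient when generating several training and evaluation pairs.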
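Section 4.2 names K-Nearest Neighbor with the Pearson Correlation Coefficient as the similarity measure and K = 150. The sketch below shows one common way to compute such a neighborhood over profiles stored as {movie: rating} dictionaries. Computing the means over co-rated movies only, returning 0.0 when fewer than two movies are co-rated, and dropping non-positive similarities are choices made here, not details confirmed by the paper.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two profiles, computed over their co-rated movies."""
    common = a.keys() & b.keys()
    if len(common) < 2:
        return 0.0  # assumption: too little overlap counts as no similarity
    mean_a = sum(a[m] for m in common) / len(common)
    mean_b = sum(b[m] for m in common) / len(common)
    num = sum((a[m] - mean_a) * (b[m] - mean_b) for m in common)
    den = sqrt(sum((a[m] - mean_a) ** 2 for m in common)) * sqrt(
        sum((b[m] - mean_b) ** 2 for m in common)
    )
    return num / den if den else 0.0


def neighborhood(target, profiles, k=150):
    """Return up to k (profile_key, similarity) pairs most similar to the target profile."""
    sims = [
        (key, pearson(profiles[target], other))
        for key, other in profiles.items()
        if key != target
    ]
    sims.sort(key=lambda pair: pair[1], reverse=True)
    # Assumption: neighbors with non-positive correlation are discarded.
    return [(key, sim) for key, sim in sims[:k] if sim > 0]
```

The same functions apply to both setups: in the contextualized case the keys of profiles would be CUPs such as (user, context) pairs, in the uncontextualized case plain users.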
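The measures used in Sections 4.2 and 4.3 (Precision at N, Recall at N and Mean Average Precision) can be written down compactly. This sketch uses the standard textbook definitions; the paper does not state which normalization it applies for average precision, so dividing by the number of relevant items is an assumption, as are the function names.

```python
def precision_at_n(recommended, relevant, n):
    """Fraction of the top-n recommended items that are relevant."""
    hits = sum(1 for item in recommended[:n] if item in relevant)
    return hits / n


def recall_at_n(recommended, relevant, n):
    """Fraction of the relevant items that appear in the top-n recommendations."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in recommended[:n] if item in relevant)
    return hits / len(relevant)


def average_precision(recommended, relevant):
    """Mean of the precision values at each rank where a relevant item occurs."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, item in enumerate(recommended, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    # Assumption: normalize by the number of relevant items.
    return total / len(relevant)


def mean_average_precision(results):
    """results: list of (recommended, relevant) pairs, one per evaluated profile."""
    if not results:
        return 0.0
    return sum(average_precision(rec, rel) for rec, rel in results) / len(results)
```

With the evaluation sets of Section 4.2, relevant would be the 10 held-out movies of a profile and recommended the ranked list produced for that profile.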