Rushed or Relaxed? – How the Situation on the Road Influences the Driver’s Preferences for Music Tracks

Linas Baltrunas, Telefonica Research, Plaza de E. Lluchi Martin 5, Barcelona, Spain, Linas@tid.es
Bernd Ludwig, University of Regensburg, Universitätsstraße 31, Regensburg, Germany, bernd.ludwig@ur.de
Francesco Ricci, Free University of Bolzano, Piazza Domenicani 3, Bolzano, Italy, fricci@unibz.it

ABSTRACT

In context-aware recommender systems, the dependency of the user’s ratings on factors that describe important aspects of the recommendation context is used to provide more relevant recommendations. Individual users may be influenced differently by the same set of contextual factors. By understanding this kind of dependency between the user’s ratings (evaluations) and context, it is possible to identify user profiles and use them to predict precisely the user ratings for items to be recommended. In this paper, we present our methodology to identify user profiles in a corpus of ratings for music tracks. These ratings were collected in a user study, which simulated typical situations that occur while driving a car. We present the findings derived from the data, and argue that it is feasible to distinguish different typologies of users from the ratings they give to music tracks in specific contexts.

Categories and Subject Descriptors

H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval (Information Filtering)

Keywords

Recommender Systems, Context-based Reasoning, Collaborative Filtering

Presented at Searching4Fun workshop at ECIR2012. Copyright © 2012 for the individual papers by the papers’ authors. Copying permitted only for private and academic purposes. This volume is published and copyrighted by its editors.

1. INTRODUCTION

Recommender systems predict user ratings for items on the basis of previous ratings for similar items or similar users [5]. As users may rate the same item differently depending on the situation in which they will experience or use the item, context-aware recommender systems [4, 6, 3, 1] have become a popular research focus. The main idea is to model context as a set of variables (contextual factors), each of which can take one of a finite set of discrete values (contextual values). The user ratings are stochastically dependent on the contextual values.

For a recommender system, there is a major implication from this observation. If we can assess such an influence for individual users, we are able to better personalize recommendations. Beyond this, it may even be possible to group users influenced in a similar way by certain contextual conditions. This knowledge could lead to an improved prediction of ratings for items not previously rated by the user.

With this in mind, it seems worth understanding the influence of context on user ratings. In previous work [2], we reported on a collection of ratings data for music tracks, gathered while users experienced different stereotypical situations that occur while driving a car. In this report, we focus on the analysis of this data with respect to the aims discussed above. Whether or not a particular aspect of context is important for predicting user ratings depends on the user to whom the recommendations are targeted. Our data suggest that different users have different perceptions of their surroundings and that these perceptions may influence musical preferences. Our data reveal that people assign different ratings to the same music track in different contexts, and in many cases these differences are statistically significant.

Our paper is structured as follows: In the next section we briefly present our data. Next, we introduce the mathematical tools we use to analyze the influence of context on user ratings. In the sections that follow, we present evidence that context can provoke a change in the music genre preferences of the user. In the final section, we discuss whether or not the influence of the context on ratings can even be observed for individual users, and conclude the paper with a discussion of the results and an outline of our plans for future work.

2. DATA CORPUS AND CONTEXT MODEL

As described in [2], we collected two independent data samples. In these experiments, driving situations were simulated with descriptions on a website. In the first experiment, we intended to capture the influence of context on the active and conscious decision of a user to listen to tracks of a certain genre if at the same time he was exposed to a certain contextual factor. For this purpose, users were asked to focus on one context factor at a time and rate the influence of this context factor on their decision to listen to a track of a randomly proposed genre on a three-level scale (POSITIVE, NEGATIVE, or NONE). In this way, the decision making process in this experiment was modeled as an active modification of the user’s attitude towards a genre. Over a period of three weeks, we acquired 2436 ratings from 59 users (users were recruited via email lists and social networks). This study was considered a pilot, and in order to avoid the sparse data problem a small number of tracks for each genre were proposed. 95 ratings were collected per contextual factor.

For our model of context, we relied on cognitive task analyses of car driving and considered three different kinds of a driver’s perceptions and actions as potentially relevant:

Context Factor       Possible Values
driving style        relaxed driving, sport driving
road type            city, highway, serpentine
landscape            coast line, country side, mountains/hills, urban
sleepiness           awake, sleepy
traffic conditions   free road, many cars, traffic jam
mood                 active, happy, lazy, sad
weather              cloudy, snowing, sunny, rainy
natural phenomena    day time, morning, night, afternoon

Situations where more than one passenger was present were beyond the scope of our research.

For the second sample, we collected tracks with ratings on a five star scale. The sample consists of 955 ratings ignoring any context factor and 2865 ratings taking one contextual condition into account. The ratings were given by 66 different users (including many who had participated in the first study). 69 to 167 ratings were collected per contextual factor depending on the assumed relevance for the experiment (see Figure 1 and the discussion in Sect. 3).

3. RELEVANCE OF CONTEXT FACTORS

When analyzing the dependency between contextual factors and ratings we could not make any modeling assumptions regarding the nature of the dependency. The same holds for inter-factor dependencies. Therefore, parametric models for the dependency such as linear regression are not appropriate. Instead, we had to find a non-parametric model. In information theory, the concept of mutual information of two random variables serves exactly this purpose: it provides a means to quantify the mutual dependence of two random variables.

In our case, we can apply mutual information to quantitatively assess the difference in the average ratings for music ignoring any influence of context compared to the average rating taking single contextual factors into account. More formally, we define a random variable X for the event that users assign one of the ratings 1, 2, 3, 4, or 5 to a genre (in the first sample) or to a track (in the second sample).

Secondly, we define another random variable Y for the event that one of the context factors holds in the current situation. Mutual information (MI) between X and Y is then defined as:

\[ MI(X, Y) = \sum_{y \in Y} \sum_{x \in X} P(x, y) \cdot \log \frac{P(x, y)}{P(x) \cdot P(y)} \]

MI can be normalized to the interval [0; 1] by computing its value relative to the entropy of Y:

\[ MI_Y(X, Y) = \frac{MI(X, Y)}{-\sum_{y \in Y} P(y) \cdot \log P(y)} \]

For X we have 2436 ratings (see Section 2 above). For each of the context factors, we collected 95 ratings. Figure 1 gives a numeric overview of the average ratings in the second data set and the impact of the single context factors on the average rating.

Context Factor       MI_Y(X, Y)
sleepiness           0.169766732
traffic conditions   0.034971332
weather              0.027759496
driving style        0.025347564
road type            0.022788139
natural phenomena    0.015574021
mood                 0.013993043
landscape            0.010431354

Figure 1: Mutual Information between Influence of Context on Ratings and Context Factors

The results indicate that users are influenced heavily by variable driving conditions such as their own physical condition (sleepiness) and external factors such as traffic and weather. Personal factors, such as their mood, and factors not directly related to the car driving task, such as the landscape in which users are traveling, are of minor impact.

In the next step of our analysis, we wanted to understand whether the influence of context depends on the user preference for a music track. We hypothesized that if the user more strongly likes or dislikes a track then his rating can be significantly influenced by contextual factors. In order to analyze this hypothesis we grouped the data into 5 partitions, one for each of the 5 possible ratings a user could assign to a track. That is, partition 1 (“the tracks disliked without considering context”) contains all tracks rated with 1 (while different context factors were activated), and partition 5 (“the highly preferred tracks”) contains the tracks rated with 5 in any context. Again, the influence of the context factors can be computed by measuring the mutual information, and therefore the dependence, between the random variable “a track is rated r without considering context” (r ∈ {1, 2, 3, 4, 5}) and the random variable “context factor c is active while a track is rated r”. Figure 2 shows the results of this experiment.

                     Partition
Context Factor       1            2            3            4            5
driving style        0.145373959  0.048822968  0.18469473   0.035874718  0.028085475
landscape            0.039462852  0.025682432  0.05470132   0.042950347  0.038938108
mood                 0.017266963  0.029724906  0.052830753  0.046422692  0.093026607
natural phenomena    0.022655695  0.053228548  0.084777547  0.024086852  0.082907254
road type            0.062203817  0.027293531  0.040344565  0.073388508  0.143056622
sleepiness           0.136737517  0.17566705   0.053153867  0.396715694  0.31060986
traffic conditions   0.036059416  0.121036344  0.124320839  0.032237073  0.139863842
weather              0.089973183  0.064745768  0.03265592   0.019943082  0.053972648
Level of Significance  .          **                        .            **

Figure 2: Mutual Information between Influence of Context on Ratings (POSITIVE, NEGATIVE, or NONE) and Context Factors Given a Certain Rating (key: ’.’: α = 0.5, ’**’: α = 0.01)

A first look at the numbers gives the impression that the mutual information is generally higher than in the experiment documented in Figure 1. To test this in a statistically sound way, we compared the mutual information values for each partition to those shown in Figure 1 using a t-test. The results are given in the last row of Figure 2. With the exception of partition 3, which groups the tracks that users rated neutrally, for each partition the difference is statistically significant (the dot stands for α = 0.5, ∗∗ for α = 0.01, ∗∗∗ for α = 0.001). These findings suggest that when users have strong positive or negative opinions about certain tracks, the conditions they experience while driving a car can influence their ratings for these tracks more strongly.

We also analyzed the influence of context on the preferences for certain music genres. For this purpose, we analyzed the data coming from the first study (see above). We formalized the user responses (POSITIVE, NEGATIVE, or NONE) as a random variable I. Given this variable, the genre G, and the activated context factor C, we can estimate the probability distribution P(I|G, C) from the first data set and compare it to the distribution P(I|G), which does not take any context into account. For our purposes, it is again interesting to compute the mutual information for the above random variables (C|G) and (I|G). The following table presents the top-3 results for all combinations of genres and context factors:

Genre      Context Factor       Mutual Information
Blues      driving style        0.324193188
           road type            0.216609802
           sleepiness           0.144555483
Classics   driving style        0.77439747
           sleepiness           0.209061123
           weather              0.090901095
Country    sleepiness           0.469360938
           driving style        0.363527911
           weather              0.185619311
Disco      mood                 0.177643232
           weather              0.17086365
           sleepiness           0.147782999
Hip Hop    traffic conditions   0.192705142
           mood                 0.151120854
           sleepiness           0.105843345
Jazz       sleepiness           0.168519565
           road type            0.127974728
           weather              0.106333439
Metal      driving style        0.462220717
           weather              0.264904662
           sleepiness           0.196577939
Pop        sleepiness           0.418648658
           driving style        0.344360938
           road type            0.268688459
Reggae     sleepiness           0.549730059
           driving style        0.382254696
           traffic conditions   0.321430505
Rock       traffic conditions   0.238140493
           sleepiness           0.224814184
           driving style        0.132856064

From these results, we can learn two lessons. First, within a given genre, the mutual information is very high only for some factors. Evidently, these have a strong influence on the user ratings. This outcome was not obvious before the experiment as the user preferences could have been stronger than the influence of the driving situation. However, some of these factors influence the ratings for (almost) all genres. We may conclude that they are strongly related to the cognitive and emotional state of a driver and therefore constitute important features for recommending music in a car.

Second, as the influence of context is evident, we may conclude that even users with strong preferences for certain tracks may change their opinion if they experience their driving situation intensively enough.

4. INDIVIDUAL USER TYPES

We now investigate the influence of context on individual users. We analyze the user ratings of the four users who gave most of the ratings in our second data collection phase (see above). We show that different contextual factors can influence different users in different ways. In the following tables, Mean with context (MCY) is the average rating of a user for all items rated under the assumption that the given contextual factor holds. Mean without context (MCN) is the average rating (of all users) for the same items without considering context. Differences in these averages are compared using a t-test in order to assess whether a contextual factor actually influences the user’s ratings in a significant way. We indicate the statistical significance of the difference between MCY and MCN with the p-value of the t-test.

We note that a recommender system can exploit the results of our data analysis when building a prediction model that integrates the average rating of many users for an item, a personalized component for a particular user, and a component for the context (see [2] for details).

User 1: Preferences above Average.

As can be seen in column MCN in Figure 3b, this user, on average, rated the tracks in the data base higher than the others. The comparison with MCN of all users (see Figure 3a) suggests that for this user many of the tracks were perceived very positively in driving situations demanding the driver’s attention. In fact, driving on a highway, on a serpentine or mountain road leads to an increase of the average rating (compared to MCN for all users). On the other hand, situations that can be perceived as negative (e.g. traffic jam) provoke a decrease of the user ratings. This observation similarly holds for some other factors: lots of cars, a situation quite similar to traffic jam, or driving in morning time. Interestingly, sport driving (which stands for a consciously sportive style of driving) has a negative influence on the average ratings of this user. Hence we hypothesize that the user is affected negatively by the tracks (mainly pop music) in situations that are likely to produce stress.

(a) MCN of all Users versus MCY for User 1

Factor            MCN       MCY       Tendency  α
highway           2.498429  3.521739  ↑         ***
traffic jam       2.498429  1.647059  ↓         **
city              2.498429  3.800000  ↑         **
serpentine        2.498429  3.529412  ↑         **
sport driving     2.498429  1.705882  ↓         **
lots of cars      2.498429  1.894737  ↓         **
coast line        2.498429  3.500000  ↑         *
mountains/hills   2.498429  3.307692  ↑         .
active            2.498429  1.866667  ↓         .
country side      2.498429  3.272727  ↑         .

(b) MCN versus MCY of User 1

Factor            MCN       MCY       Tendency  α
traffic jam       3.077586  1.647059  ↓         ***
lots of cars      3.077586  1.894737  ↓         ***
sport driving     3.077586  1.705882  ↓         ***
active            3.077586  1.866667  ↓         **
morning           3.077586  2.000000  ↓         **
city              3.077586  3.800000  ↑         *

Figure 3: Profile of User 1. Only those factors with statistical significance are shown.

User 2: Preferences around Average with Positive Tendency towards Tracks.

In this example the user has a personal average rating similar to the other users. This phenomenon is not an effect of any context. The signs of the significant differences between MCN and MCY in Figure 4a indicate that this user likes the tracks in the corpus when he feels awake. Being sad, he would never like to listen to the tracks. In general, for this user the traffic situation (differently from user 1) seems to play a minor role. Many significant differences in his ratings can be found comparing his MCY with his non-contextualized ratings (own MCN) as well as with the rating of all the users (MCN), for personal factors such as the mood and the perception of the surrounding landscape.

(a) MCN of all Users versus MCY of User 2

Factor            MCN       MCY       Tendency  α
happy             2.498429  1.444444  ↓         **
serpentine        2.498429  1.709677  ↓         **
urban             2.498429  1.760000  ↓         *
awake             2.498429  3.642857  ↑         *
country side      2.498429  1.807692  ↓         *
sad               2.498429  1.846154  ↓         *
afternoon         2.498429  2.000000  ↓         .
relaxed driving   2.498429  2.025641  ↓         .

(b) MCN versus MCY of User 2

Factor            MCN       MCY       Tendency  α
happy             2.432692  1.444444  ↓         **
serpentine        2.432692  1.709677  ↓         *
awake             2.432692  3.642857  ↑         *
urban             2.432692  1.760000  ↓         *
country side      2.432692  1.807692  ↓         .
sad               2.432692  1.846154  ↓         .

Figure 4: Profile of User 2. Only those factors with statistical significance are shown.

User 3: Preferences slightly below or on Average with Negative Tendency towards the Tracks.

In this user profile, the factors provoking significant differences between MCN and MCY (see Figure 5a) are mostly personal ones or factors that indirectly influence personal attitudes or the cognitive load of the driver (e.g. road type). As many of the tracks used for our data collection were pop songs, and on average the user assigns low ratings, we can conclude that he has a strong dislike for this kind of music. This impression is strengthened by the observation that negative emotions (such as sad) lead to even worse ratings for tracks than on average for this user.

(a) MCN of all Users versus MCY of User 3

Factor            MCN       MCY       Tendency  α
sad               2.498429  1.333333  ↓         **
day time          2.498429  1.666667  ↓         **
active            2.498429  1.769231  ↓         *
serpentine        2.498429  1.714286  ↓         *
coast line        2.498429  2.000000  ↓         .

(b) MCN versus MCY of User 3

Factor            MCN       MCY       Tendency  α
sad               2.329787  1.333333  ↓         **
day time          2.329787  1.666667  ↓         *
active            2.329787  1.769231  ↓         .

Figure 5: Profile of User 3. Only those factors with statistical significance are shown.

User 4: Preferences below Average.

In this user profile, there are several highly significant differences between the MCN of all users and MCY (see Figure 6a). In every case, the tendency is negative, indicating that there are almost no situations in which tracks from the data set should be recommended to such a user. Probably this user does not like the tracks in the corpus, or he even does not like to listen to music at all while driving. The significance level of the difference between the personal MCN and MCY (see Figure 6b) is slightly smaller here than in the previous comparison. Moreover, there is one personal factor (awake) under which the user rated significantly higher. But, as there are many factors with almost identical ratings to the already low non-contextualized ratings, in most situations the items should not be recommended to this user. From this observation, we can assume that as this user dislikes tracks very strongly, it is hard to find context factors that may change his attitude.

(a) MCN of all Users versus MCY of User 4

Factor            MCN       MCY       Tendency  α
day time          2.498429  1.166667  ↓         ***
afternoon         2.498429  1.666667  ↓         **
highway           2.498429  1.700000  ↓         *
urban             2.498429  1.769231  ↓         *
morning           2.498429  1.714286  ↓         .
mountains/hills   2.498429  1.714286  ↓         .
country side      2.498429  1.700000  ↓         .

(b) MCN versus MCY of User 4

Factor            MCN       MCY       Tendency  α
day time          2.175676  1.166667  ↓         ***
awake             2.175676  3.222222  ↑         .
afternoon         2.175676  1.666667  ↓         .

Figure 6: Profile of User 4. Only those factors with statistical significance are shown.

5. CONCLUSIONS AND FUTURE WORK

We have presented a non-parametric approach to assess the impact of a set of contextual factors on the user ratings. Our findings from the analysis of two data collections suggest that the perceptions and experiences during the execution of a task influence user preferences even for non-crucial items such as music tracks to be played in a car.

5.1 Influence of Context

We found empirical evidence that the driving situation indeed influences the driver’s preferences for music. The influence of context may even be strong enough to modify the preference of a user for his favorite tracks.

The findings also suggest that the cognitive load of the driver, his emotional, mental, and physical state, and current traffic conditions influence his preferences.

These findings are surely affected by the set of tracks used in the study. We used this set as the reported experiments were developed within an industrial project, and the tracks were provided by the media platform of the industrial partner. It is an interesting task to collect data for other sets of tracks (a wider set of types of tracks, or one with a different specialization) and repeat the analysis.

5.2 Critical Discussion of the Study Design

It is important to note the constraints and conditions of our study design. First of all, in the web survey, we created fictive situations that the subject should imagine. Hence, the test persons may have overestimated the relevance of the contextual factors on their music preferences. Therefore, a different study where users are actually facing certain contextual conditions is in order. But before performing that evaluation, our study clearly indicates that users perceive context as important and influential, and different users, with different music preferences, have completely different perceptions. To assess this result quantitatively, the web survey and the described methods represent a simple way to collect and analyze data. In fact, we exploited our results in the implementation of a real music recommender system and player [2]. Besides, it is also important to note that during our study users rated the music tracks just after listening to them. This is not always the case in many recommender systems (e.g. MovieLens or Netflix), where often the ratings are provided long after the user experienced the items.

5.3 Consequences for Future Work

Currently, we are preparing a new study with an improved experimental setup: we are merging our prototype with another application that allows logging onboard data in a car. We will equip cars of test persons with this tool and collect data in real driving situations. The logged data will allow us to detect the values of certain contextual factors from onboard information about the car and its navigation system. Furthermore, we will be able to combine this data with feedback from the users (e.g., which of the recommended tracks are played or skipped). From such a new collection of data, gained in a naturalistic setting, we will validate the findings of our simulation study.

6. REFERENCES

[1] G. Adomavicius and A. Tuzhilin. Context-aware recommender systems. In F. Ricci, L. Rokach, B. Shapira, and P. B. Kantor, editors, Recommender Systems Handbook, pages 217–250. Springer, 2011.
[2] L. Baltrunas, M. Kaminskas, B. Ludwig, O. Moling, F. Ricci, A. Aydin, K.-H. Luke, and R. Schwaiger. Incarmusic: Context-aware music recommendations in a car. In (to appear) Proceedings of the 12th International Conference on Electronic Commerce and Web Technologies, 2011.
[3] L. Baltrunas, M. Kaminskas, F. Ricci, L. Rokach, B. Shapira, and K.-H. Luke. Best usage context prediction for music tracks. In 2nd Workshop on Context-Aware Recommender Systems, 2010.
[4] A. Chen. Context-aware collaborative filtering system: Predicting the user’s preference in the ubiquitous computing environment. In T. Strang and C. Linnhoff-Popien, editors, Location- and Context-Awareness, volume 3479 of Lecture Notes in Computer Science, pages 244–253. Springer Berlin / Heidelberg, 2005.
[5] Y. Koren and R. Bell. Advances in collaborative filtering. In F. Ricci, L. Rokach, B. Shapira, and P. B. Kantor, editors, Recommender Systems Handbook. Springer, 2011.
[6] G.-E. Yap, A.-H. Tan, and H.-H. Pang. Discovering causal dependencies in mobile context-aware recommenders. In MDM 06: Proceedings of the 7th International Conference on Mobile Data Management, page 4, Washington, DC, USA, 2006. IEEE Computer Society.
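
As an illustration of the measure used in Section 3, the following minimal sketch estimates the mutual information between ratings and contextual values from a list of (rating, contextual value) observations, and normalizes it by the entropy of the context variable. It is not the implementation used in the study: the function names, the use of relative frequencies as probability estimates, the natural logarithm, and the toy observations are all assumptions made for this example.

```python
from collections import Counter
from math import log


def mutual_information(pairs):
    """Estimate MI(X, Y) from (x, y) observations.

    Probabilities are estimated by relative frequencies (an assumption;
    the paper does not state which estimator was used).
    """
    n = len(pairs)
    joint = Counter(pairs)                 # counts of (x, y)
    count_x = Counter(x for x, _ in pairs)  # counts of x
    count_y = Counter(y for _, y in pairs)  # counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # P(x, y) / (P(x) * P(y)) simplifies to c * n / (count_x * count_y)
        mi += (c / n) * log(c * n / (count_x[x] * count_y[y]))
    return mi


def normalized_mutual_information(pairs):
    """MI(X, Y) divided by the entropy of Y; returns 0.0 if Y is constant."""
    n = len(pairs)
    count_y = Counter(y for _, y in pairs)
    entropy_y = -sum((c / n) * log(c / n) for c in count_y.values())
    if entropy_y == 0.0:
        return 0.0
    return mutual_information(pairs) / entropy_y


# Hypothetical toy data: (rating, value of the factor "sleepiness").
independent = [(r, s) for r in (1, 5) for s in ("awake", "sleepy")]
dependent = [(1, "sleepy"), (5, "awake")] * 10

print(normalized_mutual_information(independent))
print(normalized_mutual_information(dependent))
```

In the first toy data set the rating tells nothing about the contextual value, so the normalized value is zero; in the second the rating determines the contextual value completely, so the normalized value is (up to rounding) one. Real rating data would fall between these extremes, as in Figures 1 and 2.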