Rushed or Relaxed? – How the Situation on the Road
       Influences the Driver’s Preferences for Music Tracks

                   Linas Baltrunas                              Bernd Ludwig                       Francesco Ricci
                Telefonica Research,                      University of Regensburg,           Free University of Bolzano,
             Plaza de E. Lluchi Martin 5,                  Universitätsstraße 31,               Piazza Domenicani 3,
                  Barcelona, Spain                         Regensburg, Germany                      Bolzano, Italy
                     Linas@tid.es                          bernd.ludwig@ur.de                       fricci@unibz.it

Presented at Searching4Fun workshop at ECIR2012. Copyright © 2012 for the individual papers by the papers' authors. Copying permitted only for private and academic purposes. This volume is published and copyrighted by its editors.

ABSTRACT
In context-aware recommender systems, the dependency of the user's ratings on factors that describe important aspects of the recommendation context is used to provide more relevant recommendations.
   Individual users may be influenced differently by the same set of contextual factors. By understanding this kind of dependency between the user's ratings (evaluations) and context, it is possible to identify user profiles and use them to predict precisely the user ratings for items to be recommended. In this paper, we present our methodology to identify user profiles in a corpus of ratings for music tracks. These ratings were collected in a user study, which simulated typical situations that occur while driving a car. We present the findings derived from the data, and argue that it is feasible to distinguish different typologies of users from the ratings they give to music tracks in specific contexts.

Categories and Subject Descriptors
H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval—Information Filtering

Keywords
Recommender Systems, Context-based Reasoning, Collaborative Filtering

1.    INTRODUCTION
   Recommender systems predict user ratings for items on the basis of previous ratings for similar items or similar users [5]. As users may rate the same item differently depending on the situation in which they will experience or use the item, context-aware recommender systems [4, 6, 3, 1] have become a popular research focus. The main idea is to model context as a set of variables (contextual factors), each of which can take one of a finite set of discrete values (contextual values). The user ratings are stochastically dependent on the contextual values.
   For a recommender system, there is a major implication from this observation. If we can assess such an influence for individual users, we are able to better personalize recommendations. Beyond this, it may even be possible to group users influenced in a similar way by certain contextual conditions. This knowledge could lead to an improved prediction of ratings for items not previously rated by the user.
   With this in mind, it seems worth understanding the influence of context on user ratings. In previous work [2], we reported on a collection of ratings data for music tracks, gathered while users experienced different stereotypical situations of driving a car. In this report, we focus on the analysis of this data with respect to the aims discussed above. Whether or not a particular aspect of context is important for predicting user ratings depends on the user to whom the recommendations are targeted. Our data suggest that different users have different perceptions of their surroundings and that these perceptions may influence musical preferences. Our data reveal that people assign different ratings to the same music track in different contexts and in many cases these differences are statistically significant.
   Our paper is structured as follows: In the next section we briefly present our data. Next, we introduce the mathematical tools we use to analyze the influence of context on user ratings. In the sections to follow, we present evidence that context can provoke a change in the user's music genre preferences. In the final sections, we discuss whether or not the influence of the context on ratings can even be observed for individual users, conclude the paper with a discussion of the results, and outline our plans for future work.

2.    DATA CORPUS AND CONTEXT MODEL
   As described in [2], we collected two independent data samples. In these experiments, driving situations were simulated with descriptions on a website. In the first experiment, we intended to capture the influence of context on the active and conscious decision of a user to listen to tracks of a certain genre while being exposed to a certain contextual factor. For this purpose, users were asked to focus on one context factor at a time and rate the influence of this context factor on their decision to listen to a track of a randomly proposed genre on a three-level scale (POSITIVE, NEGATIVE, or NONE). In this way, the decision making process in this experiment was modeled as an active modification of the user's attitude towards a genre. Over a period of three weeks, we acquired 2436 ratings from 59 users (users were recruited via e-mail lists and social networks). This study was considered a pilot, and in order to avoid the sparse data
problem a small number of tracks for each genre were proposed. 95 ratings were collected per contextual factor.
   For our model of context, we relied on cognitive task analyses of car driving and considered three different kinds of a driver's perceptions and actions as potentially relevant:

            Context Factor        Possible Values
            driving style         relaxed driving, sport driving
            road type             city, highway, serpentine
            landscape             coast line, country side, mountains/hills, urban
            sleepiness            awake, sleepy
            traffic conditions    free road, many cars, traffic jam
            mood                  active, happy, lazy, sad
            weather               cloudy, snowing, sunny, rainy
            natural phenomena     day time, morning, night, afternoon

   Situations where more than one passenger was present were beyond the scope of our research.
   For the second sample, we collected tracks with ratings on a five star scale. The sample consists of 955 ratings ignoring any context factor and 2865 ratings taking one contextual condition into account. The ratings were given by 66 different users (including many who had participated in the first study). 69 to 167 ratings were collected per contextual factor depending on the assumed relevance for the experiment (see Figure 1 and the discussion in Sect. 3).

3.    RELEVANCE OF CONTEXT FACTORS
   When analyzing the dependency between contextual factors and ratings we could not make any modeling assumptions regarding the nature of the dependency. The same holds for inter-factor dependencies. Therefore, parametric models for the dependency such as linear regression are not appropriate. Instead, we had to find a non-parametric model. In information theory, the concept of mutual information of two random variables serves exactly this purpose: it provides means to quantify the mutual dependence of two random variables.
   In our case, we can apply mutual information to quantitatively assess the difference in the average ratings for music ignoring any influence of context compared to the average rating taking single contextual factors into account. More formally, we define a random variable X for the event that users assign one of the ratings 1, 2, 3, 4, or 5 to a genre (in the first sample) or to a track (in the second sample).
   Secondly, we define another random variable Y for the event that one of the context factors holds in the current situation. Mutual information (MI) between X and Y is then defined as:

\[ MI(X, Y) = \sum_{y \in Y} \sum_{x \in X} P(x, y) \cdot \log \frac{P(x, y)}{P(x) \cdot P(y)} \]

MI can be normalized to the interval [0, 1] by computing its value relative to the entropy of Y (MI is non-negative and never exceeds the entropy of Y):

\[ MI_Y(X, Y) = \frac{MI(X, Y)}{-\sum_{y \in Y} P(y) \cdot \log P(y)} \]

For X we have 2436 ratings (see Section 2 above). For each of the context factors, we collected 95 ratings. Figure 1 gives a numeric overview of the average ratings in the second data set and the impact of the single context factors on the average rating.

            Context Factor           MI_Y(X, Y)
            sleepiness               0.169766732
            traffic conditions       0.034971332
            weather                  0.027759496
            driving style            0.025347564
            road type                0.022788139
            natural phenomena        0.015574021
            mood                     0.013993043
            landscape                0.010431354

Figure 1: Mutual Information between Influence of Context on Ratings and Context Factors

   The results indicate that users are influenced heavily by variable driving conditions such as their own physical condition (sleepiness) and external factors such as traffic and weather. Personal factors, such as their mood, and factors not directly related to the car driving task, such as the landscape in which users are traveling, are of minor impact.
   In the next step of our analysis, we wanted to understand whether the influence of context depends on the user preference for a music track. We hypothesized that if the user likes or dislikes a track more strongly, then his rating can be significantly influenced by contextual factors. In order to analyze this hypothesis we grouped the data into 5 partitions, one for each of the 5 possible ratings a user could assign to a track. I.e. the partition 1 (“the tracks disliked without considering context”) contains all tracks rated with 1 (while different context factors were activated), and partition 5 (“the highly preferred tracks”) contains the tracks rated with 5 in any context. Again, the influence of the context factors can be computed by measuring the mutual information and therefore the dependence between the random variable “a track is rated r without considering context” (r ∈ {1, 2, 3, 4, 5}) and the random variable “context factor c is active while a track is rated r”. Figure 2 shows the results of this experiment. A first look at the numbers gives the impression that the mutual information is generally higher than in the experiment documented in Figure 1. To test this in a statistically sound way, we compared the mutual information values for each partition to those shown in Figure 1 using a t-test. The results are given in the last row. With the exception of partition 3, which groups the tracks that users rated neutrally, the difference is statistically significant for each partition (the dot stands for α = 0.5, ∗∗ for α = 0.01, ∗∗∗ for α = 0.001). These findings suggest that when users have strong positive or negative opinions for certain tracks, the conditions they experience while driving a car can influence their ratings for these tracks more strongly.
   We also analyzed the influence of context on the preferences for certain music genres. For this purpose, we analyzed the data coming from the first study (see above). We formalized the user responses (POSITIVE, NEGATIVE, or NONE) as a random variable I. Given this variable, the genre G, and the activated context factor C, we can estimate the probability distribution P(I|G, C) from the first data set and compare it to the distribution P(I|G) which does not take any context into account. For our purposes, it is again interesting to compute the mutual information for the above random variables (C|G) and (I|G). The following table presents the top-3 results for all combinations of genres and context factors:
                                                                        Partition
                Context Factor           1              2              3             4              5
                driving style            0.145373959    0.048822968    0.18469473    0.035874718    0.028085475
                landscape                0.039462852    0.025682432    0.05470132    0.042950347    0.038938108
                mood                     0.017266963    0.029724906    0.052830753   0.046422692    0.093026607
                natural phenomena        0.022655695    0.053228548    0.084777547   0.024086852    0.082907254
                road type                0.062203817    0.027293531    0.040344565   0.073388508    0.143056622
                sleepiness               0.136737517    0.17566705     0.053153867   0.396715694    0.31060986
                traffic conditions       0.036059416    0.121036344    0.124320839   0.032237073    0.139863842
                weather                  0.089973183    0.064745768    0.03265592    0.019943082    0.053972648
                Level of Significance    .              ∗∗                           .              ∗∗


Figure 2: Mutual Information between Influence of Context on Ratings (POSITIVE, NEGATIVE, or NONE) and Context Factors Given a Certain Rating (key: ’.’: α = 0.5; ∗∗: α = 0.01)
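
The normalized mutual information used for Figures 1 and 2 can be estimated from observed data along the following lines. This is a minimal sketch rather than the authors' implementation: the function names, the representation of observations as (rating, contextual value) pairs, the plug-in (relative frequency) probability estimates, the natural logarithm, and the toy data are all assumptions made for illustration.

```python
from collections import Counter
from math import log


def mutual_information(pairs):
    """Plug-in estimate of MI(X, Y) from a list of (x, y) observations."""
    n = len(pairs)
    joint = Counter(pairs)
    p_x = Counter(x for x, _ in pairs)
    p_y = Counter(y for _, y in pairs)
    return sum(
        (c / n) * log((c / n) / ((p_x[x] / n) * (p_y[y] / n)))
        for (x, y), c in joint.items()
    )


def entropy(values):
    """Shannon entropy of the empirical distribution of `values`."""
    n = len(values)
    return -sum((c / n) * log(c / n) for c in Counter(values).values())


def normalized_mi(pairs):
    """MI(X, Y) divided by the entropy of Y; 0.0 if Y never varies."""
    h_y = entropy([y for _, y in pairs])
    return mutual_information(pairs) / h_y if h_y > 0 else 0.0


# Hypothetical toy data: (rating, contextual value) pairs.
dependent = [(1, "sleepy"), (5, "awake")] * 10
independent = [(1, "sleepy"), (1, "awake"), (5, "sleepy"), (5, "awake")] * 5

print(normalized_mi(dependent))    # rating fully determined by the context
print(normalized_mi(independent))  # rating unrelated to the context
```

With relative frequencies as probability estimates, small samples tend to overstate the dependence, so values computed from a few dozen ratings per factor should be read with some caution.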


        Genre       Context Factor       Mutual Information
        Blues       driving style        0.324193188
                    road type            0.216609802
                    sleepiness           0.144555483
        Classics    driving style        0.77439747
                    sleepiness           0.209061123
                    weather              0.090901095
        Country     sleepiness           0.469360938
                    driving style        0.363527911
                    weather              0.185619311
        Disco       mood                 0.177643232
                    weather              0.17086365
                    sleepiness           0.147782999
        Hip Hop     traffic conditions   0.192705142
                    mood                 0.151120854
                    sleepiness           0.105843345
        Jazz        sleepiness           0.168519565
                    road type            0.127974728
                    weather              0.106333439
        Metal       driving style        0.462220717
                    weather              0.264904662
                    sleepiness           0.196577939
        Pop         sleepiness           0.418648658
                    driving style        0.344360938
                    road type            0.268688459
        Reggae      sleepiness           0.549730059
                    driving style        0.382254696
                    traffic conditions   0.321430505
        Rock        traffic conditions   0.238140493
                    sleepiness           0.224814184
                    driving style        0.132856064

   From these results, we can learn two lessons. First, within a given genre, the mutual information is very high only for some factors. Evidently, these have a strong influence on the user ratings. This outcome was not obvious before the experiment as the user preferences could have been stronger than the influence of the driving situation. However, some of these factors influence the ratings for (almost) all genres. We may conclude that they are strongly related to the cognitive and emotional state of a driver and therefore constitute important features for recommending music in a car.
   Second, as the influence of context is evident, we may conclude that even users with strong preferences for certain tracks may change their opinion if they experience their driving situation intensively enough.

4.    INDIVIDUAL USER TYPES
   We now investigate the influence of context on individual users. We analyze the ratings of the four users who gave the most ratings in our second data collection phase (see above). We show that different contextual factors can influence different users in different ways. In the following tables, Mean with context (MCY) is the average rating of a user for all items rated under the assumption that the given contextual factor holds. Mean without context (MCN) is the average rating (over all users) for the same items without considering context. Differences in these averages are compared using a t-test in order to assess whether a contextual factor actually influences the user's ratings in a significant way. We indicate the statistical significance of the difference between MCY and MCN with the p-value of the t-test.
   We note that a recommender system can exploit the results of our data analysis when building a prediction model that integrates the average rating of many users for an item, a personalized component for a particular user, and a component for the context (see [2] for details).

User 1: Preferences above Average.
   As can be seen in column MCN in Figure 3b, this user, on average, rated the tracks in the database higher than the others. The comparison with MCN of all users (see Figure 3a) suggests that for this user many of the tracks were perceived very positively in driving situations demanding the driver's attention. In fact, driving on a highway, on a serpentine or mountain road leads to an increase of the average rating (compared to MCN for all users). On the other hand, situations that can be perceived as negative (e.g. traffic jam) provoke a decrease of the user ratings. This observation similarly holds for some other factors: lots of cars, a situation quite similar to traffic jam, or driving in the morning. Interestingly, sport driving (which stands for a consciously sportive style of driving) has a negative influence on the average ratings of this user. Hence we hypothesize that the user is affected negatively by the tracks (mainly pop music) in situations that are likely to produce stress.

    (a) MCN of all Users versus MCY for User 1
    Factor             MCN         MCY         Tendency    α
    highway            2.498429    3.521739    ↑           ∗∗∗
    traffic jam        2.498429    1.647059    ↓           ∗∗
    city               2.498429    3.800000    ↑           ∗∗
    serpentine         2.498429    3.529412    ↑           ∗∗
    sport driving      2.498429    1.705882    ↓           ∗∗
    lots of cars       2.498429    1.894737    ↓           ∗∗
    coast line         2.498429    3.500000    ↑           ∗
    mountains/hills    2.498429    3.307692    ↑           .
    active             2.498429    1.866667    ↓           .
    country side       2.498429    3.272727    ↑           .

    (b) MCN versus MCY of User 1
    Factor             MCN         MCY         Tendency    α
    traffic jam        3.077586    1.647059    ↓           ∗∗∗
    lots of cars       3.077586    1.894737    ↓           ∗∗∗
    sport driving      3.077586    1.705882    ↓           ∗∗∗
    active             3.077586    1.866667    ↓           ∗∗
    morning            3.077586    2.000000    ↓           ∗∗
    city               3.077586    3.800000    ↑           ∗

Figure 3: Profile of User 1. Only those factors with statistical significance are shown.

User 2: Preferences around Average with Positive Tendency towards Tracks.
   In this example the user has a personal average rating similar to the other users. This phenomenon is not an effect of any context. The signs of the significant differences between MCN and MCY in Figure 4a indicate that this user likes the tracks in the corpus when he feels awake. Being sad, he would never like to listen to the tracks. In general, for this user the traffic situation (unlike for User 1) seems to play a minor role. Many significant differences in his ratings can be found comparing his MCY with his non-contextualized ratings (own MCN) as well as with the rating of all the users (MCN), for personal factors such as the mood and the perception of the surrounding landscape.

    (a) MCN of all Users versus MCY of User 2
    Factor             MCN         MCY         Tendency    α
    happy              2.498429    1.444444    ↓           ∗∗
    serpentine         2.498429    1.709677    ↓           ∗∗
    urban              2.498429    1.760000    ↓           ∗
    awake              2.498429    3.642857    ↑           ∗
    country side       2.498429    1.807692    ↓           ∗
    sad                2.498429    1.846154    ↓           ∗
    afternoon          2.498429    2.000000    ↓           .
    relaxed driving    2.498429    2.025641    ↓           .

    (b) MCN versus MCY of User 2
    Factor             MCN         MCY         Tendency    α
    happy              2.432692    1.444444    ↓           ∗∗
    serpentine         2.432692    1.709677    ↓           ∗
    awake              2.432692    3.642857    ↑           ∗
    urban              2.432692    1.760000    ↓           ∗
    country side       2.432692    1.807692    ↓           .
    sad                2.432692    1.846154    ↓           .

Figure 4: Profile of User 2. Only those factors with statistical significance are shown.

User 3: Preferences slightly below or on Average with Negative Tendency towards the Tracks.
   In this user profile, the factors provoking significant differences between MCN and MCY (see Figure 5a) are mostly personal ones or factors that indirectly influence personal attitudes or the cognitive load of the driver (i.e. road type).
   As many of the tracks used for our data collection were pop songs, and on average the user assigns low ratings, we can conclude that he has a strong dislike for this kind of music. This impression is strengthened by the observation that negative emotions (such as sad) lead to even worse ratings for tracks than on average for this user.

User 4: Preferences below Average.
   In this user profile, there are several highly significant differences between the MCN of all users and MCY (see Figure 6a). In every case, the tendency is negative, indicating that there are almost no situations in which tracks from the data set should be recommended to such a user. Probably this user does not like the tracks in the corpus, or he even does not like to listen to music at all while driving. The significance level of the difference between the personal MCN and MCY (see Figure 6b) is slightly smaller here than in the previous comparison. Moreover, there is one personal factor (awake) under which the user rated significantly higher. But, as there are many factors with almost identical ratings to the already low non-contextualized ratings, in most situations the items should not be recommended to this user. From this observation, we can assume that as this user dislikes tracks very strongly, it is hard to find context factors that may change his attitude.

5.    CONCLUSIONS AND FUTURE WORK
   We have presented a non-parametric approach to assess the impact of a set of contextual factors on the user ratings. Our findings from the analysis of two data collections suggest that the perceptions and experiences during the execution of a task influence user preferences even for non-crucial items such as music tracks to be played in a car.

5.1    Influence of Context
   We found empirical evidence that the driving situation indeed influences the driver's preferences for music. The influence of context may even be strong enough to modify the preference of a user for his favorite tracks.
   The findings also suggest that the cognitive load of the driver, his emotional, mental, and physical state, and current traffic conditions influence his preferences.
   These findings are surely affected by the set of tracks used in the study. We used this set as the reported experiments were developed within an industrial project, and the tracks were provided by the media platform of the industrial partner. It is an interesting task to collect data for other sets of tracks (a wider range of track types or a different specialization) and repeat the analysis.
    (a) MCN of all Users versus MCY of User 3
    Factor             MCN         MCY         Tendency    α
    sad                2.498429    1.333333    ↓           ∗∗
    day time           2.498429    1.666667    ↓           ∗∗
    active             2.498429    1.769231    ↓           ∗
    serpentine         2.498429    1.714286    ↓           ∗
    coast line         2.498429    2.000000    ↓           .

    (b) MCN versus MCY of User 3
    Factor             MCN         MCY         Tendency    α
    sad                2.329787    1.333333    ↓           ∗∗
    day time           2.329787    1.666667    ↓           ∗
    active             2.329787    1.769231    ↓           .

Figure 5: Profile of User 3. Only those factors with statistical significance are shown.

    (a) MCN of all Users versus MCY of User 4
    Factor             MCN         MCY         Tendency    α
    day time           2.498429    1.166667    ↓           ∗∗∗
    afternoon          2.498429    1.666667    ↓           ∗∗
    highway            2.498429    1.700000    ↓           ∗
    urban              2.498429    1.769231    ↓           ∗
    morning            2.498429    1.714286    ↓           .
    mountains/hills    2.498429    1.714286    ↓           .
    country side       2.498429    1.700000    ↓           .

    (b) MCN versus MCY of User 4
    Factor             MCN         MCY         Tendency    α
    day time           2.175676    1.166667    ↓           ∗∗∗
    awake              2.175676    3.222222    ↑           .
    afternoon          2.175676    1.666667    ↓           .

Figure 6: Profile of User 4. Only those factors with statistical significance are shown.
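
The comparison of MCY and MCN behind the user profile figures can be sketched as follows. The paper does not state which variant of the t-test was used, so Welch's unequal-variance test is an assumption here, as are the function name and the toy ratings. The sketch returns only the test statistic and the degrees of freedom; turning these into a p-value requires the CDF of the t distribution, which the Python standard library does not provide.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and approximate degrees of freedom
    for two independent samples (assumed test variant)."""
    n_a, n_b = len(sample_a), len(sample_b)
    v_a = variance(sample_a) / n_a  # squared standard error of mean A
    v_b = variance(sample_b) / n_b  # squared standard error of mean B
    t = (mean(sample_a) - mean(sample_b)) / sqrt(v_a + v_b)
    df = (v_a + v_b) ** 2 / (v_a ** 2 / (n_a - 1) + v_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical toy data: one user's ratings under a contextual value
# (basis of MCY) and ratings of the same items without context (MCN).
ratings_with_context = [1, 2, 1, 2, 2, 1]
ratings_without_context = [3, 2, 4, 3, 3, 2, 4, 3]

mcy = mean(ratings_with_context)
mcn = mean(ratings_without_context)
tendency = "↑" if mcy > mcn else "↓"
t, df = welch_t(ratings_with_context, ratings_without_context)
print(mcy, mcn, tendency, round(t, 2), round(df, 1))
```

If SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same test and also reports the p-value.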


5.2    Critical Discussion of the Study Design
   It is important to note the constraints and conditions of our study design. First of all, in the web survey, we created fictitious situations that the subjects had to imagine. Hence, the test persons may have overestimated the relevance of the contextual factors on their music preferences. Therefore, a different study where users are actually facing certain contextual conditions is in order. But even before performing that evaluation, our study clearly indicates that users perceive context as important and influential, and different users, with different music preferences, have completely different perceptions. To assess this result quantitatively, the web survey and the described methods represent a simple way to collect and analyze data. In fact, we exploited our results in the implementation of a real music recommender system and player [2]. Besides, it is also important to note that during our study users rated the music tracks just after listening to them. This is not the case in many recommender systems (e.g. MovieLens or Netflix), where the ratings are often provided long after the user experienced the items.

5.3    Consequences for Future Work
   Currently, we are preparing a new study with an improved experimental setup: we are merging our prototype with another application that allows logging of on-board data in a car. We will equip cars of test persons with this tool and collect data in real driving situations. The logged data will allow us to detect the values of certain contextual factors from on-board information about the car and its navigation system. Furthermore, we will be able to combine this data with feedback from the users (e.g., which of the recommended tracks are played or skipped). From such a new collection of data, gained in a naturalistic setting, we will validate the findings of our simulation study.

6.    REFERENCES
[1] G. Adomavicius and A. Tuzhilin. Context-aware recommender systems. In F. Ricci, L. Rokach, B. Shapira, and P. B. Kantor, editors, Recommender Systems Handbook, pages 217–250. Springer, 2011.
[2] L. Baltrunas, M. Kaminskas, B. Ludwig, O. Moling, F. Ricci, A. Aydin, K.-H. Luke, and R. Schwaiger. InCarMusic: Context-aware music recommendations in a car. In Proceedings of the 12th International Conference on Electronic Commerce and Web Technologies, 2011 (to appear).
[3] L. Baltrunas, M. Kaminskas, F. Ricci, L. Rokach, B. Shapira, and K.-H. Luke. Best usage context prediction for music tracks. In 2nd Workshop on Context-Aware Recommender Systems, 2010.
[4] A. Chen. Context-aware collaborative filtering system: Predicting the user's preference in the ubiquitous computing environment. In T. Strang and C. Linnhoff-Popien, editors, Location- and Context-Awareness, volume 3479 of Lecture Notes in Computer Science, pages 244–253. Springer Berlin / Heidelberg, 2005.
[5] Y. Koren and R. Bell. Advances in collaborative filtering. In F. Ricci, L. Rokach, B. Shapira, and P. B. Kantor, editors, Recommender Systems Handbook. Springer, 2011.
[6] G.-E. Yap, A.-H. Tan, and H.-H. Pang. Discovering causal dependencies in mobile context-aware recommenders. In MDM 06: Proceedings of the 7th International Conference on Mobile Data Management, page 4, Washington, DC, USA, 2006. IEEE Computer Society.