A Survey of Music Recommendation Aids

Pirkka Åman and Lassi A. Liikkanen
Helsinki Institute for Information Technology HIIT
Aalto University and University of Helsinki
Tel. +358 50 384 1514
firstname.lastname@hiit.fi

ABSTRACT
This paper provides a review of explanations, visualizations and interactive elements of user interfaces (UI) in music recommendation systems. We call these UI features "recommendation aids". Explanations are elements of the interface that inform the user why a certain recommendation was made. We highlight six possible goals for explanations, which together result in overall satisfaction towards the system. We found that most of the popular existing music recommenders provide no explanations, or only very limited ones. Since explanations are not independent of other UI elements in the recommendation process, we consider how those other elements can be used to achieve the same goals. To this end, we evaluated several existing music recommenders. We wanted to discover which of the six goals (transparency, scrutability, effectiveness, persuasiveness, efficiency and trust) the different UI elements promote in the existing music recommenders, and how they could be measured in order to create a simple framework for evaluating recommender UIs. By using this framework, designers of recommendation systems could promote users' trust and overall satisfaction towards a recommender system, thereby improving the user experience with the system.

Categories and Subject Descriptors
H5.m. Information interfaces and presentation: Miscellaneous. H.5.5 Sound and Music Computing.

Author Keywords
Recommendation systems, music recommendation, explanations, user experience, UI design.

WOMRAD 2010 Workshop on Music Recommendation and Discovery, colocated with ACM RecSys 2010 (Barcelona, SPAIN). Copyright (c). This is an open-access article distributed under the terms of the Creative Commons Attribution License 3.0 Unported, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
1. INTRODUCTION
Recommender systems are a specific type of information filtering technique that aims at presenting items (music, news, other users, etc.) that the user might be interested in. To do this, information about the user is compared to reference characteristics, e.g. information on the other users of the system (collaborative filtering) or content features, such as genre in the case of books or music (content-based filtering). In its most common formulation, the recommendation task is reduced to the problem of estimating the relevance of the items that a user has not encountered yet, and then presenting the items that have the highest estimated ratings [6]. The importance of recommender systems lies in their potential to help users identify items of interest more effectively from a potentially overwhelming set of choices [7]. The importance of these mechanisms has become evident as commercial services over the Internet have extended their catalogues to dimensions unexplorable by a single user. However, the overwhelming amount of content creates constant competition and can reduce the usefulness of recommendations unless they can persuade the user to try the suggested content. Explanations and other recommendation aiding UI features are examined in this paper as a way to increase users' satisfaction with recommenders.

The first interactive systems to have explanations were expert systems, including legal and medical databases [4]. Their present successors are commercial recommendation systems, commonly found embedded in various entertainment systems such as iTunes [9] or Last.fm [12]. Explanations can be described as textual information telling e.g. why and how a recommendation was produced for the user. Earlier research shows that even rudimentary explanations build more trust towards the systems than so-called "black box" recommenders [13]. Explanations also provide system developers a graceful way of handling the errors that recommender algorithms sometimes produce [6].

The majority of previous recommendation system research has focused on the statistical accuracy of the algorithms driving the systems, with little emphasis on interface issues and user experience [13]. However, it has been noted lately that when new algorithms are compared to older ones, both tuned to the optimum, they all produce nearly similar results. Researchers have speculated that we may have reached a level where human variability prevents the systems from getting much more accurate [7]. This mirrors the human factor: it has been shown that users provide inconsistent ratings when asked to rate the same item several times [14]. Thereby an algorithm cannot be more accurate than the variance in the user's ratings for the same item.

An important aspect of the assessment of recommendation systems is therefore to evaluate them subjectively, e.g. how well they can communicate their reasoning to users. That is why user interface elements such as explanations, interactive elements and visualizations are increasingly important in improving user experience. In recent years, subjectively perceived aspects of recommendation systems have accordingly gained ground in their evaluation.

In this paper we want to illustrate the possibilities of user evaluation of recommendation supporting features in recommendation systems. We do this by performing a review of several publicly available music recommenders. Music is today one of the most ubiquitous commodities and the availability of digital music is constantly growing. Massive online music libraries with millions of tracks are easily available on the Internet. However, finding new and relevant music from those vast collections becomes correspondingly difficult. One approach to tackling the problem of finding new, relevant music is developing better (reliable and trustworthy) recommendation systems. Music recommenders are also easy to access, and with music the process of determining the quality of a recommendation is reasonably short.
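To make the recommendation task described above concrete, the following sketch illustrates the two filtering strategies mentioned. It is only an illustration under our own assumptions; the rating data, feature sets and function names are hypothetical and are not drawn from any of the systems reviewed below.

    import math

    # Hypothetical data: user ratings (user -> {item: rating}) and item genre tags.
    ratings = {
        "alice": {"song_a": 5, "song_b": 3},
        "bob":   {"song_a": 4, "song_b": 2, "song_c": 5},
    }
    features = {"song_a": {"jazz"}, "song_b": {"rock"}, "song_c": {"jazz", "vocal"}}

    def similarity(u, v):
        # Cosine similarity between two users over the items they have both rated.
        common = set(ratings[u]) & set(ratings[v])
        if not common:
            return 0.0
        dot = sum(ratings[u][i] * ratings[v][i] for i in common)
        norm_u = math.sqrt(sum(ratings[u][i] ** 2 for i in common))
        norm_v = math.sqrt(sum(ratings[v][i] ** 2 for i in common))
        return dot / (norm_u * norm_v)

    def collaborative_score(user, item):
        # Estimate the relevance of an unseen item from similar users' ratings.
        peers = [v for v in ratings if v != user and item in ratings[v]]
        weights = [similarity(user, v) for v in peers]
        if not peers or sum(weights) == 0:
            return 0.0
        return sum(w * ratings[v][item] for w, v in zip(weights, peers)) / sum(weights)

    def content_score(user, item):
        # Estimate relevance from the overlap between item features and liked items.
        liked = {i for i, r in ratings[user].items() if r >= 4}
        liked_features = set().union(*(features[i] for i in liked)) if liked else set()
        return len(features[item] & liked_features) / max(len(features[item]), 1)

    # Rank the items the user has not encountered yet and present the best ones.
    unseen = [i for i in features if i not in ratings["alice"]]
    print(sorted(unseen, key=lambda i: collaborative_score("alice", i), reverse=True))
    print({i: content_score("alice", i) for i in unseen})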
2. GOALS FOR RECOMMENDATION AIDS
Tintarev and Masthoff [16] present a taxonomy of goals for explanations. These are shown, slightly modified, in Table 1 below. We argue that satisfaction towards a recommendation system is an aggregate of the six other dimensions, more a goal in itself than the other dimensions are. In addition, we noticed that the dimensions are not as straightforward as Tintarev and Masthoff present them. Some of them cannot be evaluated using objective measures, and therefore a framework for evaluating recommendation aids must be drawn from user research. In the following we describe each dimension and give examples of how they could be evaluated and measured.

Table 1. Dimensions for recommendation explanations.

    Goal            Definition
    Transparency    Explain how the system works
    Scrutability    Allow users to tell the system it is wrong
    Effectiveness   Help users make good decisions
    Persuasiveness  Convince users to try or buy
    Efficiency      Help users make decisions faster
    Trust           Increase users' confidence in the system
    Resulting in:
    Satisfaction    Increasing the ease of use or enjoyment towards the system

1. An explanation may tell users how or why a recommendation was made, allowing them to see behind the UI and thus making the recommendation transparent. Transparency is also a standard usability principle, formulated as the heuristic of 'Visibility of System Status' [13]. Transparency can be measured objectively on a binary scale (yes/no): if a UI provides some kind of explanation of how a recommendation was made, transparency gets a vote. Evaluating transparency subjectively may instead involve asking users whether they understand how the recommendation was made, using e.g. a Likert scale.

2. Scrutability means that users are allowed to provide feedback to the system about the recommendations. Scrutability is related to the established usability principle of 'User Control' [13]. Scrutability can be measured objectively by finding out whether there is a way to tell the system it is wrong. To evaluate scrutability subjectively, users may be given a task to find a way to stop receiving e.g. recommendations of Elvis songs. If users feel they can control the recommendations by changing their profile, the UI offers the possibility to scrutinize.

3. Effectiveness of an explanation helps users make better decisions. Effectiveness is highly dependent on the accuracy of the recommendation algorithm. An effective explanation would help the user evaluate the quality of suggested items according to their own preferences [16]. This would increase the likelihood that the user discards irrelevant options while helping them to recognize useful ones. Unlike with travel or film recommenders, in the case of music recommenders the goodness of a recommendation is decided quite quickly.
4. Persuasiveness. Explanations may convince users to try or buy recommended items. However, persuasion may result in an adverse reaction towards the system if users repeatedly end up choosing bad recommendations. Persuasion could be measured by how much the user actually tries or buys items compared to the same user in a system without an explanation facility [16], and by what kind of persuasion techniques are utilized. Persuasion could also be measured by applying the click-through rates used in measuring online ads.

5. Efficient explanations help users decide faster which recommended items are best for their current situation. Efficiency can be improved by allowing the user to understand the relation between recommended options [12]. A simple way to evaluate efficiency is to give users tasks and measure how long it takes to find e.g. an artist that is novel and pleasing to the user.

6. Increasing users' confidence in the system results in trust towards a recommender. Trust is at the core of any kind of recommendation process, and it is perhaps the most important single factor leading to better user satisfaction and user experience with an interactive system. A study of users' trust (defined as perceived confidence in a recommender system's competence) suggests that users intend to return to recommender systems which they find trustworthy [2]. The interface design of a recommender affects its credibility, and earlier research has shown that in user evaluations of web page credibility the largest proportion of users' comments referred to UI design issues [5]. Trust needs to be measured using subjective scales over multiple tasks or questions about the recommendation aiding features of a recommender UI.

Ease of use or enjoyment finally results in more satisfaction towards a system. Descriptions of recommended items have been found to be positively correlated with both the perceived usefulness and the ease of use of the recommender system [6], enhancing users' overall satisfaction. Even though we see satisfaction as an aggregate of the dimensions presented above, satisfaction with the process could be measured e.g. by conducting a user walk-through for a task such as finding a satisfactory item.
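The measurements suggested above can be collected and summarized in a very simple way. The sketch below shows one possible bookkeeping scheme for a single recommender; the field names, scales and example values are our own assumptions and are not part of Tintarev and Masthoff's taxonomy or of any system evaluated later in this paper.

    from statistics import mean

    # Hypothetical raw observations gathered for one recommender during a study.
    observations = {
        "has_explanation": True,             # transparency, objective yes/no
        "can_correct_system": True,          # scrutability, objective yes/no
        "transparency_likert": [4, 5, 3],    # "I understand why this was recommended" (1-5)
        "clicks": 12, "impressions": 80,     # persuasiveness as a click-through rate
        "task_seconds": [41.0, 55.5, 38.2],  # efficiency: time to find a novel, pleasing artist
        "trust_likert": [4, 4, 5],           # trust, subjective scale over several questions
    }

    def summarize(obs):
        # Collapse the raw observations into one value per goal.
        return {
            "transparency_objective": int(obs["has_explanation"]),
            "transparency_subjective": mean(obs["transparency_likert"]),
            "scrutability": int(obs["can_correct_system"]),
            "persuasiveness_ctr": obs["clicks"] / obs["impressions"],
            "efficiency_mean_seconds": mean(obs["task_seconds"]),
            "trust": mean(obs["trust_likert"]),
        }

    print(summarize(observations))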
3. RELATED EMPIRICAL RESEARCH
It is widely agreed that expert systems that act as decision-support systems need to provide explanations and justifications for their advice [13]. However, there is no clear consensus on how explanations should be designed in conjunction with other UI elements, or how they should be evaluated by users. Studies with search engines show the importance of explanations. Koenemann & Belkin [11] found that greater interactivity for feedback on recommendations helped search performance and satisfaction with the system. Johnson & Johnson [10] note that explanations play a crucial role in the interaction between users and interactive systems. According to their research, one purpose of explanations is to illustrate the relationship between cause and effect. In the context of recommender systems, understanding the relationship between the input to the system (ratings and choices made by the user) and its output (recommendations) allows the user to interact efficiently with the system. Sinha and Swearingen [15] studied the role of transparency in recommender systems. Their results show that users like, and feel more confident about, recommendations that they perceive as transparent. Explanations allow users to meaningfully revise their input in order to improve recommendations, rather than making "shots in the dark."

Herlocker and Konstan [6] suggest that recommender systems have not been used in high-risk decision-making because of a lack of transparency. While users might take a chance on an opaque movie recommendation, they might be unwilling e.g. to commit to a vacation spot without understanding the reasoning behind such a recommendation. Building an explanation facility into a recommender system can benefit the user in various ways. It removes the "black box" around the recommender system, providing transparency. Other benefits include justification: if users understand the reasoning behind a recommendation, they may decide how much confidence to place in the suggestion. That results in greater acceptance of, and satisfaction with, the recommender system as a decision aid, since its limits and strengths are more visible and its suggestions are justified.
4. RECOMMENDATION AIDS IN EXISTING MUSIC RECOMMENDERS
We conducted an expert walkthrough of six publicly available music systems with recommendation functionalities in order to find out which of the six goals explanations, visualizations and interactive UI elements promote in existing music recommenders, and how they can be measured in order to create a simple framework for evaluating recommenders. The walkthrough was conducted by the authors listing the UI features capable of promoting the goals mentioned above. The reviewed systems are Pandora, Amazon, Last.fm, Audiobaba, Musicovery and Spotify. We wanted to include the most popular online music services and, on the other hand, a variety of different UIs. Each of the evaluated systems provides recommendations, but not necessarily explanations. Systems without textual explanations were also included in order to find out what kind of goals or functions similar to verbal explanations other recommendation aids provide.

An obvious example of an explanation providing transparency is Amazon's "Customers with Similar Searches Purchased…" list of up to ten albums. Pandora tells a user: "This song was recommended to you because it has jazzy vocals, light rhythm and a horn section." Transparency is very hard to achieve without textual, explicit explanations. Of the reviewed systems, only Musicovery's UI, with its several interactive elements and its graphical visualization of the recommendations and the relations between them, gives users clear clues of why certain pieces of music were recommended without providing explanations.

Last.fm offers users scrutability in many ways, e.g. with its music player (Figure 1). One of the system's more sophisticated scrutinizing tactics is a social one. Last.fm allows users to turn off the registering (called scrobbling) of the music they listen to. The system's users can perform identity work by turning scrobbling off if they do not want to communicate to other users what they have listened to. Amazon provides a "Fix this recommendation" option for telling the system to remove a recommended item from the user's browsing history.

Figure 1: Example of scrutable interactivity: Last.fm player's love, ban, stop and skip buttons give users a tool to control their profiles and thereby affect recommendations.

Users can also be helped towards efficiency and effectiveness, i.e. making better and faster decisions, by offering appropriate controls with interactive elements. For instance, Musicovery's timeline slider is presented in Figure 2. It works in real time with the system's graphical presentation of recommended items.

Figure 2: Musicovery's timeline slider: interactivity promoting efficiency, scrutability and effectiveness, resulting in more trust and satisfaction towards the system.

Table 2. The occurrences of recommendation aids in a selection of music recommenders.

                Trans.  Scrt.  Effect.  Pers.  Effic.  Trust  Total
    Amazon      1       2      2        3      1       3      12
    Last.fm     -       2      2        1      2       2      9
    Audiobaba   1       1      2        1      1       2      8
    Musicovery  2       2      2        2      2       1      11
    Spotify     -       -      1        1      1       1      4
    Pandora     2       2      3        3      2       3      15
    Total       6       9      12       11     9       12

If a recommender has the possibility to promote a goal with explanations, visualizations or interactive elements, it gets a vote in Table 2. For example, persuasiveness promoted through visualizations is potentially possible in all of the interfaces that have visualizations, even rudimentary ones such as an album cover: a single user might be persuaded to try or buy by a subjectively compelling album cover. From Table 2 we can see that Pandora, Amazon and Musicovery have the greatest number of UI elements able to support users in making sense of recommendations. Effectiveness, persuasiveness and trust are the most commonly promoted goals. In each recommender, every UI element has the potential to increase trust towards the system, but for more accurate measurement it remains to be evaluated by empirical user research to what extent each element in a certain recommender interface really promotes trust. This applies to most of the six goals: without empirical data it is almost impossible to decide whether the potential for promoting effectiveness, persuasiveness and efficiency is actually realized. Only transparency and scrutability can be measured using an objective binary yes/no scale, though they too can be evaluated using subjective (Likert style) scales. We argue that by measuring these goals for UI elements, together with a set of usability guidelines, it is possible to evaluate and design better user experiences for recommendation systems.
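Table 2 can be read as a simple tally: each aid type (explanation, visualization, interactive element) that can promote a goal contributes one vote to the corresponding cell. The sketch below reproduces that kind of tally from per-element walkthrough notes; the notes shown here are invented for illustration, so the resulting numbers are not those of Table 2.

    from collections import defaultdict

    GOALS = ["transparency", "scrutability", "effectiveness",
             "persuasiveness", "efficiency", "trust"]

    # Hypothetical walkthrough notes: for each system, the goals that each type
    # of recommendation aid was judged capable of promoting.
    walkthrough = {
        "Pandora": {
            "explanation": {"transparency", "effectiveness", "trust"},
            "interactive": {"scrutability", "efficiency", "persuasiveness"},
        },
        "Musicovery": {
            "visualization": {"transparency", "effectiveness", "persuasiveness"},
            "interactive": {"scrutability", "efficiency"},
        },
    }

    def tally(notes):
        # One vote per aid type that can promote a goal, plus a row total per system.
        table = {}
        for system, aids in notes.items():
            row = defaultdict(int)
            for promoted_goals in aids.values():
                for goal in promoted_goals:
                    row[goal] += 1
            table[system] = {goal: row[goal] for goal in GOALS}
            table[system]["total"] = sum(row.values())
        return table

    for system, row in tally(walkthrough).items():
        print(system, row)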
Some of the dimensions are easy to connect to certain UI elements. For instance, scrutability is usually designed as a combination of explanation and interactivity, whereas other, more general-level dimensions depend strongly on subjective experience and are hard to connect with specific UI elements. For example, satisfaction or trust towards a system is usually a combination of different experienced UI dimensions. Therefore the most common dimensions promoted in the evaluated systems were trust and satisfaction. Those, together with persuasiveness, are experienced very subjectively, which means that empirical user evaluation is needed for more reliable and comparable evaluations of those dimensions.

5. DISCUSSION AND CONCLUSIONS
We reviewed the dimensions of explanations in six music recommendation systems and found that most of the reviewed commercial music recommendation systems are "black boxes", producing recommendations with no, or very limited, explanations. Most of the dimensions are poorly promoted by textual explanations, but can be promoted by other means, namely by visualizations and interactive elements, and further, by user-generated content and social facilities. From the expert walkthrough of the selected music recommendation systems we can draw the tentative conclusion that if UI elements can fulfill functions similar to explanations, there is not necessarily any need for textual descriptions. By using non-verbal recommendation aids as "implicit" explanations in recommendation system design, we can promote better user experience. This is the case especially when the user has enough cultural capital, and therefore the competence, for "joining the dots" between recommended items without explicit explanations. On the other hand, if the recommender is used e.g. for learning about a musical genre, textual explanations may be indispensable.

As an example of the dimensions that UI elements other than verbal explanations can promote, overall satisfaction or trust towards a system can be achieved by conversational interaction, such as in the UI example presented in Figure 3, where users are given the chance to request optional recommendations based on their situational desires and needs.

Figure 3: A recommendation aid with optional inputs.

Last.fm is an example of a recommendation system with no explanations. However, it has an abundance of other elements, such as user-created biographies, genre tags and pictures of artists, not to mention advanced social media features, that together effectively work towards the same goals as the dimensions of explanations. Furthermore, Spotify, a popular European music service with a very simple recommendation facility, does not provide any explanations whatsoever. Its popularity relies on providing users a minimalistic UI with an effective search facility and functional, high-quality audio streaming. Spotify's usability and functionality work effectively towards overall satisfaction with the system, making explanations, visualizations or advanced interactivity redundant. Obviously, Spotify's abilities for helping to find new music are limited because of its very simple recommendation facility, but it can be used as an example of the argument that user trust and satisfaction can be promoted by diverse means, depending on different users' various needs and desires.

The next step of our research is to conduct an empirical user evaluation of the importance and functions of different UI elements in music recommenders. We are looking for feasible scales of measurement, drawn from user evaluation of the goals for UI elements in recommenders. User evaluation could be done with modified music recommender UIs, giving users tasks and comparing e.g. how much taking away a UI feature such as an explanation affects the time in which the task is completed. It would also be interesting to explore how different goals can be promoted by combining various UI elements, and by assigning unconventional roles to UI elements, e.g. creating visualizations that would reveal the logic behind a recommendation and at the same time give the user a tool to scrutinize it.

REFERENCES
[1] Adomavicius, G., Tuzhilin, A. 2005. Towards the Next Generation of Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions. IEEE Transactions on Knowledge and Data Engineering, 17(6), 734-749.
[2] Buchanan, B., Shortliffe, E. 1984. Rule-Based Expert Systems: The Mycin Experiments of the Stanford Heuristic Programming Project. Reading, MA: Addison Wesley Publishing Company.
[3] Chen, L., Pu, P. 2002. Trust building in recommender agents. In Proc. of International Workshop on Web Personalisation, Recommender Systems and Intelligent User Interfaces '02.
[4] Doyle, D., Tsymbal, A., Cunningham, P. 2003. A review of explanation and explanation in case-based reasoning. Technical report, Dept. of Computer Science, Trinity College, Dublin.
[5] Fogg, B. J., Soohoo, C., Danielson, D. R., Marable, L., Stanford, J., Tauber, E. R. 2003. How do users evaluate the credibility of web sites? In Proc. of Designing for User Experiences '03, pages 1-15.
[6] Herlocker, J. L., Konstan, J. A. 2000. Explaining collaborative filtering recommendations. In Proc. of Computer Supported Cooperative Work '00, pages 241-250.
[7] Herlocker, J. L., Konstan, J. A., Terveen, L., Riedl, J. T. 2004. Evaluating collaborative filtering recommender systems. ACM Transactions on Information Systems, 22(1), 5-53.
[8] Hill, W., Stead, L., Rosenstein, M., Furnas, G. 1995. Recommending and Evaluating Choices in a Virtual Community of Use. In Proc. of Conference on Human Factors in Computing Systems '95.
[9] iTunes, http://www.apple.com/itunes.
[10] Johnson, J., Johnson, P. 1993. Explanation facilities and interactive systems. In Proc. of Intelligent User Interfaces '93, pages 159-166.
[11] Koenemann, J., Belkin, N. 1996. A case for interaction: A study of interactive information retrieval behavior and effectiveness. In Proc. of Conference on Human Factors in Computing Systems '96, ACM Press, NY.
[12] Last.fm, http://www.last.fm.
[13] Nielsen, J., Molich, R. 1990. Heuristic evaluation of user interfaces. In Proc. of Conference on Human Factors in Computing Systems '90.
[14] Pu, P., Chen, L. 2006. Trust building with explanation interfaces. In Proc. of Intelligent User Interfaces '06, pages 93-100.
[15] Sinha, R., Swearingen, K. 2002. The role of transparency in recommender systems. In Proc. of Conference on Human Factors in Computing Systems '02.
[16] Tintarev, N., Masthoff, J. 2007. Survey of explanations in recommender systems. In Proc. of International Workshop on Web Personalisation, Recommender Systems and Intelligent User Interfaces '07.