<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Music Recommendation and the Long Tail</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Mark Levy</string-name>
          <email>mark@last.fm</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Klaas Bosteels</string-name>
          <email>klaas@last.fm</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Last.fm</institution>
          ,
          <addr-line>Karen House, 1-11 Baches Street, London N1 6DL</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Using a dataset of 7 billion recent submissions to the Last.fm Scrobble API, we investigate possible popularity bias in Last.fm's recommendations and streaming radio services. In particular, we compare the recent listening of users who listen regularly to Last.fm streaming services with that of users who listen less often or never. Finally, we describe a new service explicitly designed to make recommendations from the long tail, and analyse popularity effects across the recommendations which it suggests.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>
        Music lovers today have access to a previously undreamed
of quantity and variety of recordings. Music is available
through an increasing number of digital channels, including
free online streaming services, "all you can eat"
subscription services, and paid downloads, not to mention via
illegal downloading and more traditional physical media. In
one well publicised view [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], this proliferation in availability
should lead to a reduction in the dominance of hits in our
musical culture. With the development of advanced tools for
search and recommendation, we should expect to see
listeners discovering and enjoying a huge range of music that may
be less popular overall, sitting somewhere in the so-called
Long Tail of sales ranks, but which offers a good match for
their own personal tastes.
      </p>
      <p>
        The original long tail speculation was that it would
become increasingly profitable to "sell less of more" by
making large numbers of niche items easily available. Empirical
studies of consumer behaviour suggest that this is indeed
true, provided that enough choice is available, and that
effective search and recommendation systems are provided to
help users find their way around large inventories [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ]. A
large recent study of consumer preference data, including
user ratings for movies and music, shows that while not all
users consume items in the long tail, "the vast majority of
users are a little bit eccentric, consuming niche products at
least some of the time", in particular reporting high average
ratings for niche music [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
      <p>
        Two lines of research suggest, however, that the utopian
vision in which niche movies and music increasingly usurp
the dominance of hits may not be borne out in practice. First, a
recent study of the Netflix catalogue of movies shows that,
on the contrary, demand for hits appears to rise, while that
for niche products falls, as the number of available titles
increases [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. Meanwhile hit products continue to
dominate the consumption of movies and music even for users
who regularly explore the long tail [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Secondly, a
number of studies of the very recommender systems which are
supposed to support discovery in the long tail suggest that
such systems are frequently prone to popularity bias,
recommending globally popular items ahead of niche products [
        <xref ref-type="bibr" rid="ref1 ref5 ref7">1, 5, 7</xref>
        ].
      </p>
      <p>In this paper we present an empirical study of the
recommendations actually made by the widely-used Last.fm
music recommender system, in particular via its streaming
radio service, and set them in the context of wider music
listening. As well as assessing the degree of popularity bias
in these recommendations, we also compare the listening
habits of a large group of music lovers regularly exposed to
Last.fm's streaming radio with those of a second group who
have no exposure to it. Finally we outline the design of a
recommender system expressly designed to make
recommendations from the long tail, and assess the popularity bias of
a sample of the recommendations it produces.</p>
      <p>The remainder of this paper is organised as follows:
Section 2 briefly reviews previous work on popularity bias in
recommender systems; Section 3 describes the data used as
the basis for this study; Section 4 investigates the presence
of popularity bias in Last.fm's radio streams, and Section 5
attempts to uncover any corresponding influence on users'
wider listening habits; Section 6 outlines a music
recommender explicitly designed to make recommendations in the
long tail, and Section 7 draws conclusions.
</p>
    </sec>
    <sec id="sec-2">
      <title>2. PREVIOUS WORK</title>
      <p>
        Three recent studies identify potential bias in recommender
systems, particularly those based on collaborative filtering
(CF). In [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] recommendations are generated using various
well-established CF algorithms based on movie ratings from
the MovieLens and Netflix datasets (http://www.grouplens.org,
http://www.netflixprize.com). Over 84% of the
MovieLens recommendations were for movies in the top 20% by
number of ratings. No comparable figure is given for the
Netflix recommendations, but the authors suggest that in
both cases large gains in the diversity of recommendations
can be achieved, with little cost to relevance, by suitable
reranking techniques applied to the CF output.
      </p>
      <p>
        The effect of recommendations on user behaviour is
studied in a completely simulated setting in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. In the simulation
users receive and consume recommendations from a CF
recommender over a series of timesteps, so that over time the
recommendations they receive are influenced by the
recommended items which were added to their profile
in previous rounds. The simulation was run repeatedly, with
differing outcomes, but led in the great majority of runs to
a decrease in the overall diversity of consumption.
      </p>
      <p>
        CF recommendations are studied indirectly in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], which
considers the network defined by Last.fm's similar artist
relationships. These relationships provide one of the sources
of data used in Last.fm's recommender system, and can also
be directly navigated as links on the Last.fm website,
providing an active form of music discovery. Besides observing
that Last.fm's similar artist lists are dominated by other
artists with a similar level of popularity, [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] computes
various network metrics to support the assertion that "CF tends
to reinforce popular artists, at the expense of discarding
less-known music", essentially by showing that navigation from
popular to long tail artists often involves traversing a large
number of artist links.
      </p>
      <p>While all three of these papers discuss the effects of CF
recommender systems, none of them considers a dataset
of real recommendations made by a deployed system. In
this paper we use Last.fm submissions data, defined fully in
Section 3, to study the effect of a large-scale
recommender system in practice.</p>
    </sec>
    <sec id="sec-3">
      <title>3. DATA</title>
      <p>Last.fm allows music lovers to scrobble details of their
music listening. Scrobbling is available from media players and
streaming services either through native support or via a
suitable plugin, and is built in to some hardware devices. The
Scrobble API (http://www.last.fm/api/submissions) supports the
submission of various events: in
this paper we distinguish between radio listens, which record
the act of playing a track via one of Last.fm's own streaming
radio stations, and scrobbles, where the track played comes
from any source other than Last.fm. In both cases the
submitted metadata includes an artist name: for a scrobble
this is typically drawn from the ID3 tags of the track being
played.</p>
      <p>Last.fm provides various types of streams, including
Similar Artists and Tag radio, launched by supplying a seed
artist or tag respectively, available to anyone, and
Recommendation and Library radio, available to any user
registered for scrobbling. Recommendation radio plays tracks by
artists selected for the user by Last.fm's recommender, while
Library radio plays tracks by artists previously scrobbled by
the user. Users typically listen to Last.fm's radio stations
through the Flash player on the Last.fm website, or via a
client program on their computer, phone or games console.
In each case, information is displayed about the artist of
the current track, including links to the artist's page on the
Last.fm website, lists of similar artists, etc. While
Recommendation radio is clearly an explicit recommendation
service, all the stations can be considered as offering implicit
recommendations, with Similar Artists radio in particular
relying on underlying similarity data which also forms part
of the input to Last.fm's recommender system. Even Library
radio can be regarded as providing a form of non-novel
recommendation, as it may remind the user of artists whom
they like but have not listened to for some time.</p>
      <p>In the following analysis we therefore pay special attention
to Recommendation radio, but also consider the influence of
Last.fm streaming radio as a whole. For the time being we
neglect the influence of the recommendations displayed on
users' Last.fm home pages and dedicated recommendation
pages. The dataset used consists of over 7 billion
submissions to the Scrobble API received between January and
May 2010.</p>
    </sec>
    <sec id="sec-4">
      <title>4. POPULARITY BIAS</title>
      <p>
        The most widely-used measure of the diversity or,
conversely, concentration of a set of products consumed by a
group of users is the Gini coefficient [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], and this has also
been applied to measure popularity bias within
recommendations [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. The Gini coefficient is computed from the area
bounded by the Lorenz curve, which, in the case of artist
recommendations, plots the proportion of the total number
of recommendations made cumulatively for the bottom x%
of artists recommended. The Gini coefficient is not ideal for
our purposes here, as it depends on artist ranks within the
set of recommendations being evaluated, i.e. it would show
high concentration for a recommender that overwhelmingly
recommended a small number of artists, even if all the artists
it recommended belonged to the long tail. In the Sections
that follow we therefore show plots similar to Lorenz curves,
but showing the cumulative proportion of recommendations
made in relation to artist ranks according to their global
popularity, based simply on the overall total number of scrobbles
received at the time of writing, shown in Fig. 1. We can also
use this data to define what we mean by a "long tail" artist.
Fitting Kilkki's informal model [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] suggests that this is any
artist below rank 20,000; Fig. 1 shows, however, that in
reality popularity flatlines slightly further down the tail, and
that a reasonable definition of a long tail artist is one at
rank 50,000 or below.
      </p>
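      <p>The two measures discussed above can be contrasted in a minimal Python sketch. It computes a Gini coefficient from per-artist play counts, and the cumulative proportion of plays going to artists at or better than a given global popularity rank. The function names, the toy counts and the rank cutoffs are assumptions made purely for illustration; they are not taken from Last.fm's data or code.</p>
      <preformat><![CDATA[
```python
from typing import Dict, List, Sequence


def gini(counts: Sequence[int]) -> float:
    """Gini coefficient of per-artist counts (0 = even spread, near 1 = concentrated)."""
    xs = sorted(counts)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted form: G = sum_i (2i - n - 1) * x_i / (n * total), for i = 1..n
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * total)


def cumulative_share_by_global_rank(
    plays: Dict[str, int], global_rank: Dict[str, int], cutoffs: List[int]
) -> List[float]:
    """Proportion of plays for artists whose global rank is within each cutoff."""
    total = sum(plays.values())
    shares = []
    for cutoff in cutoffs:
        # Rank 1 is the most popular artist, so "within" means rank <= cutoff.
        covered = sum(n for artist, n in plays.items() if global_rank[artist] <= cutoff)
        shares.append(covered / total)
    return shares


# Toy example (assumed values): two artists, both deep in the long tail.
plays = {"a": 90, "b": 10}
ranks = {"a": 60000, "b": 70000}
print(gini(list(plays.values())))
print(cumulative_share_by_global_rank(plays, ranks, [1000, 50000, 100000]))
```
]]></preformat>
      <p>In this toy case the Gini coefficient reports concentration within the recommended set, while the rank-based curve shows that no plays at all go to artists in the global top 50,000, which is the distinction the plots in the following Sections rely on.</p>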
      <p>Fig. 2 shows the distribution of ranks for artists played
on Last.fm Recommendation radio, and on all Last.fm
radio stations, compared with the distribution for all tracks
scrobbled in the same period. We observe that Last.fm radio is
somewhat biased away from hit artists in comparison to the
listening of Last.fm users as a whole, while Last.fm's
recommendations are even more strongly biased towards lower
ranking artists. In particular we see that artists in the top
1000 of overall listening make up 40% of scrobbles but only
20% of plays on Recommendation radio. Recommendation
radio plays the same proportion of long tail artists as are
listened to overall, but includes fewer plays of the lowest
ranked artists: it is reasonable to assume, however, that
artists scrobbled at those ranks include many whose tracks
are not readily available for streaming, as well as spurious
artists based on submissions with incorrect metadata that
is not repaired by Last.fm's automatic correction system.</p>
    </sec>
    <sec id="sec-5">
      <title>5. INFLUENCE</title>
      <p>To expose the possible influence of Last.fm
recommendations on users, we first create a set of active listeners by
taking all users who registered during the five months under
consideration and then scrobbled at least 500 but no more
than 20,000 tracks during that time. The upper limit
removes spammers and other technically-minded enthusiasts
whose scrobbles represent a superhuman quantity of
listening within that period, while the lower limit ensures that
we have a reasonable amount of listening data for all of the
users under consideration. We then draw two samples from
this set of listeners. The first contains all users who had
no exposure to any Last.fm radio station within the period
(or indeed at any stage, as we include only newly-registered
users). The second group contains all users for whom radio
listens made up 25-75% of their submissions, i.e. these
listeners are highly exposed to Last.fm radio, but also make a
significant number of scrobbles for listening outside Last.fm.</p>
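      <p>The sampling rules above can be sketched in Python as follows. This is an illustrative reading rather than the code used for the study: it assumes that a user's submissions are the sum of their scrobbles and radio listens, treats the stated bounds as inclusive, and leaves out the registration-date filter. The names UserActivity and cohort are hypothetical.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UserActivity:
    scrobbles: int      # tracks played from sources other than Last.fm
    radio_listens: int  # tracks played on Last.fm radio stations


def cohort(user: UserActivity) -> Optional[str]:
    """Return 'no_radio', 'radio', or None if the user falls in neither sample."""
    # Active listeners: at least 500 but no more than 20,000 scrobbles.
    if not (500 <= user.scrobbles <= 20_000):
        return None
    # First sample: no exposure to any Last.fm radio station.
    if user.radio_listens == 0:
        return "no_radio"
    # Second sample: radio listens make up 25-75% of submissions
    # (assumption: submissions = scrobbles + radio listens).
    radio_share = user.radio_listens / (user.scrobbles + user.radio_listens)
    if 0.25 <= radio_share <= 0.75:
        return "radio"
    return None


print(cohort(UserActivity(scrobbles=1000, radio_listens=0)))
print(cohort(UserActivity(scrobbles=1000, radio_listens=1000)))
print(cohort(UserActivity(scrobbles=1000, radio_listens=50)))
```
]]></preformat>
      <p>Users with a small but non-zero share of radio listens belong to neither sample under this reading, which keeps the two groups clearly separated in terms of exposure.</p>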
      <p>Fig. 3 shows the distribution of artists scrobbled by each
of these groups in the first five months of 2010, again
compared with that for all tracks scrobbled during the same
period. We observe a bias towards more popular artists in
the mid region for the group of radio listeners, but it is
small compared with the biases in artist popularity for
radio plays shown in Fig. 2, and, more importantly, clearly not
correlated with them. To control for demographic or other
systematic differences between users who listen frequently to
radio and those who never do so, in Fig. 4 we compare
scrobbles for users with low exposure to radio, making up 10-50%
of their scrobbles, to those with radio making up 50-90% of
their scrobbles. In contrast to Fig. 3, this shows a slight
bias towards the long tail in users with higher exposure to
radio. We can conclude that there is no evidence that radio
and recommendations cause a systematic bias towards more
popular artists.
</p>
    </sec>
    <sec id="sec-6">
      <title>6. LONG TAIL RECOMMENDATIONS</title>
      <p>
        We build a prototype recommender for long tail artists
using conventional item-based CF. We first identify a suitable
candidate pool of long tail artists from which to draw our
recommendations. For each artist in our overall catalogue
we then find the k most similar artists within the pool, based
on scores computed by comparing both scrobbles and tags
applied to each artist. When a user u requests
recommendations, we create a profile of artist weights W<sub>u</sub> based on
their scrobbles, and build up a candidate set containing the
top-k similar artists in the pool for each artist in W<sub>u</sub>. We
then score each candidate artist a based on its similarity
to artists in the user's profile, computing a score P<sub>u,a</sub> using
the well-known weighted sum method [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], finally returning
the top-N highest scoring artists:
      </p>
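      <p>
        <disp-formula id="eq1"><label>(1)</label><tex-math>P_{u,a} = \sum_{a' \in W_u} \mathrm{sim}(a, a')\, w_u(a')</tex-math></disp-formula>
        where sim(a, a') is the similarity between a and a', and
w<sub>u</sub>(a') is the weight assigned to a' in the user profile.</p>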
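      <p>The weighted sum scoring described above might be sketched in Python as follows. This is a minimal illustration, not the deployed system: the function name recommend, the dictionary layout of the precomputed neighbour lists and the toy weights are all assumptions, and the similarity between a candidate and a profile artist is treated as zero whenever the candidate is not among that artist's top-k neighbours in the pool.</p>
      <preformat><![CDATA[
```python
from collections import defaultdict
from typing import Dict, List, Tuple


def recommend(
    profile: Dict[str, float],
    neighbours: Dict[str, List[Tuple[str, float]]],
    top_n: int,
) -> List[Tuple[str, float]]:
    """Score pool artists by a weighted sum over the user profile; return the top N."""
    scores: Dict[str, float] = defaultdict(float)
    for profile_artist, weight in profile.items():
        # neighbours[profile_artist] holds (pool artist, similarity) pairs,
        # i.e. the top-k similar pool artists precomputed for this artist.
        for candidate, sim in neighbours.get(profile_artist, []):
            scores[candidate] += sim * weight
    # Highest score first; ties broken by name so the output is deterministic.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Toy data (assumed): two profile artists and their pool neighbours.
profile = {"x": 2.0, "y": 1.0}
neighbours = {
    "x": [("a", 0.5), ("b", 0.1)],
    "y": [("a", 0.2), ("c", 0.9)],
}
for artist, score in recommend(profile, neighbours, top_n=2):
    print(artist, round(score, 3))
```
]]></preformat>
      <p>Because candidates are only ever drawn from the precomputed neighbour lists, the cost of a request grows with the size of the user's profile times k rather than with the size of the whole pool.</p>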
      <p>To obtain a suitable pool of long tail artists, we start with
all artists with tracks currently available in the Last.fm "Play
direct from artist" scheme, under which unsigned artists or
labels holding suitable rights can make tracks available for
free streaming from their Last.fm pages. This scheme is
aimed at niche and new artists, but to be sure that artists
in the pool are indeed from the long tail, we also apply a
hard cutoff on current overall reach, removing any artists
with more than 10,000 listeners. Finally, to mitigate
problems with artist disambiguation in the long tail, where new
or niche artists have the same names as more popular artists,
we mine Last.fm wiki entries for key phrases indicating
multiple artists with the same name, removing any affected
artists from the pool. The resulting set of long tail artists is
updated daily, but at the time of writing contains 118,000
artists.</p>
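      <p>The three filtering steps above could be expressed roughly as in the following Python sketch. The Artist record, the long_tail_pool function and in particular the key phrases are hypothetical: the text does not list the phrases actually mined from the wiki entries, so the two shown here are placeholders only.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass
from typing import Iterable, List

MAX_LISTENERS = 10_000  # hard cutoff on overall reach, as stated in the text

# Placeholder key phrases (assumption): the real phrases are not given.
AMBIGUITY_PHRASES = ("more than one artist", "multiple artists")


@dataclass
class Artist:
    name: str
    listeners: int
    streamable: bool  # has tracks available for free streaming in the scheme
    wiki_text: str


def long_tail_pool(artists: Iterable[Artist]) -> List[str]:
    """Return the names of artists that pass all three filters."""
    pool = []
    for artist in artists:
        if not artist.streamable:
            continue  # not part of the free streaming scheme
        if artist.listeners > MAX_LISTENERS:
            continue  # too popular to count as long tail
        text = artist.wiki_text.lower()
        if any(phrase in text for phrase in AMBIGUITY_PHRASES):
            continue  # name probably shared with another artist
        pool.append(artist.name)
    return pool


artists = [
    Artist("niche", 500, True, "An unsigned band."),
    Artist("popular", 50_000, True, ""),
    Artist("unavailable", 100, False, ""),
    Artist("clash", 200, True, "There are multiple artists with this name."),
]
print(long_tail_pool(artists))
```
]]></preformat>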
      <p>To study the popularity distribution amongst artists
suggested by this new system, we generate 50 recommendations
for each of a sample of 100,000 active Last.fm users, defined
as users who have visited the Last.fm website within the last
week. Fig. 5 shows the resulting distribution, compared to
that for plays on the main Recommendation radio station
during the first five months of 2010. Approximately 90% of
the sampled recommendations are for artists in the mid to
long tail, with ranks 25,000 to 100,000, with the remaining
10% being for the lowest ranking artists. While
Section 5 suggests that the influence of recommendations
may be limited, we can reasonably hope that the prototype
recommender will gradually stimulate increased interest in
the long tail.</p>
    </sec>
    <sec id="sec-7">
      <title>7. CONCLUSIONS</title>
      <p>A comparative analysis of artists chosen by Last.fm's
recommender system and a large body of listening data suggests
that, contrary to claims in the literature based on laboratory
experiments, real world music recommenders do not
necessarily exhibit strong popularity bias. Our results suggest
that, in any event, the influence of such a recommender on
users' general listening may be limited. Finally we sketch
the design of a prototype recommender designed explicitly
to suggest artists from the long tail. Future work includes a
user evaluation of the prototype system, which is now
publicly available.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>G.</given-names>
            <surname>Adomavicius</surname>
          </string-name>
          and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Kwon</surname>
          </string-name>
          .
          <article-title>Toward more diverse recommendations: Item re-ranking methods for recommender systems</article-title>
          .
          <source>In Proc. 19th Workshop on Information Technologies and Systems</source>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>C.</given-names>
            <surname>Anderson</surname>
          </string-name>
          .
          <article-title>The Long Tail. Why the future of business is selling less of more</article-title>
          .
          <source>Hyperion</source>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>E.</given-names>
            <surname>Brynjolfsson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Hu</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Simester</surname>
          </string-name>
          .
          <article-title>Goodbye Pareto principle, hello long tail: The effect of search costs on the concentration of product sales</article-title>
          .
          <source>Technical report</source>
          , MIT Center for Digital Business,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>E.</given-names>
            <surname>Brynjolfsson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Hu</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Smith</surname>
          </string-name>
          .
          <article-title>From niches to riches: anatomy of the long tail</article-title>
          .
          <source>Sloan Management Review</source>
          ,
          <volume>47</volume>
          (
          <issue>4</issue>
          ):
          <fpage>67</fpage>
          –
          <lpage>71</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>O.</given-names>
            <surname>Celma</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Cano</surname>
          </string-name>
          .
          <article-title>From hits to niches? or how popular artists can bias music recommendation and discovery</article-title>
          .
          <source>In Proc. 2nd Netflix-KDD Workshop</source>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A.</given-names>
            <surname>Elberse</surname>
          </string-name>
          .
          <article-title>A taste for obscurity? An individual-level examination of "Long Tail" consumption</article-title>
          .
          <source>Technical report, Harvard Business School</source>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>D.</given-names>
            <surname>Fleder</surname>
          </string-name>
          and
          <string-name>
            <given-names>K.</given-names>
            <surname>Hosanagar</surname>
          </string-name>
          .
          <article-title>Recommender systems and their impact on sales diversity</article-title>
          .
          <source>In EC '07: Proceedings of the 8th ACM conference on Electronic commerce</source>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>C.</given-names>
            <surname>Gini</surname>
          </string-name>
          .
          <article-title>Measurement of inequality of incomes</article-title>
          .
          <source>The Economic Journal</source>
          ,
          <volume>31</volume>
          (
          <issue>121</issue>
          ):
          <fpage>124</fpage>
          –
          <lpage>126</lpage>
          ,
          <year>1921</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>S.</given-names>
            <surname>Goel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Broder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Gabrilovich</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>Pang</surname>
          </string-name>
          .
          <article-title>Anatomy of the long tail: ordinary people with extraordinary tastes</article-title>
          .
          <source>In WSDM</source>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>K.</given-names>
            <surname>Kilkki</surname>
          </string-name>
          .
          <article-title>A practical model for analyzing long tails</article-title>
          .
          <source>First Monday</source>
          ,
          <volume>12</volume>
          (
          <issue>5</issue>
          ),
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>B.</given-names>
            <surname>Sarwar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Karypis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Konstan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Riedl</surname>
          </string-name>
          .
          <article-title>Item-based collaborative filtering recommendation algorithms</article-title>
          .
          <source>In WWW '01: Proceedings of the 10th international conference on World Wide Web</source>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>T.</given-names>
            <surname>Tan</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Netessine</surname>
          </string-name>
          .
          <article-title>Is Tom Cruise threatened? Using Netflix Prize data to examine the Long Tail of electronic commerce</article-title>
          .
          <source>Technical report, Wharton Business School</source>
          , University of Pennsylvania,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>