=Paper=
{{Paper
|id=None
|storemode=property
|title=Exploiting Twitter's Collective Knowledge for Music Recommendations
|pdfUrl=https://ceur-ws.org/Vol-838/paper_09.pdf
|volume=Vol-838
|dblpUrl=https://dblp.org/rec/conf/msm/ZangerleGS12
}}
==Exploiting Twitter's Collective Knowledge for Music Recommendations==
Exploiting Twitter’s Collective Knowledge for Music
Recommendations∗
Eva Zangerle, Wolfgang Gassler, Günther Specht
Databases and Information Systems, Institute of Computer Science
University of Innsbruck, Austria
{firstname.lastname}@uibk.ac.at
∗ This research was partially funded by the University of Innsbruck (Nachwuchsförderung 2011).

Copyright © 2012 held by author(s)/owner(s). Published as part of the #MSM2012 Workshop proceedings, available online as CEUR Vol-838, at: http://ceur-ws.org/Vol-838. #MSM2012, April 16, 2012, Lyon, France.

· #MSM2012 · 2nd Workshop on Making Sense of Microposts ·

ABSTRACT
Twitter is the largest source of public opinion and also contains a vast amount of information about its users' music preferences and listening behaviour. However, this source has not yet been exploited for the recommendation of music. In this paper, we present how Twitter can be used for the creation of a data set upon which music recommendations can be computed. The data set is based on microposts which were automatically generated by music player software or posted by users and which may also contain further information about audio tracks.

Categories and Subject Descriptors
H.2.8 [Database Management]: Database Applications - Data Mining

General Terms
Algorithms, Performance, Human Factors, Experimentation

Keywords
Recommender Systems, Music Recommendation, Twitter

1. INTRODUCTION
In recent years, music recommendation services have become very popular in both academia and industry. The goal of such services is the recommendation of suitable music for a certain user. This is traditionally accomplished by either (i) taking into account the user profile, which consists of the tracks the user listened to in the past and (if available) the user's ratings for songs, or (ii) analysing the song itself and using the extracted features in order to find similar songs. For the recommendation of music, huge corpora and user profiles are required as there are millions of different audio tracks. There are some large services, such as last.fm (http://www.last.fm), which own such big corpora. However, most of them are not publicly available. Especially for academic purposes, only a few (mostly small) data sets are available for the evaluation of proposed approaches, e.g. the Million Song Dataset [4].

Twitter is a publicly available service which holds huge amounts of data and is still growing tremendously. Twitter stated that there are about 140 million new messages a day. Such messages can also be exploited in the context of music recommendations. Many audio players offer the functionality of automatically posting a tweet containing the title and artist of the track the user is currently listening to. These tweets traditionally contain keywords like nowplaying or listeningto, e.g. in the tweet “#nowplaying Tom Waits-Temptation”. For users who frequently make use of such a service, the set of these tweets can be seen as a user profile in terms of her musical preferences and provides well-suited data for, e.g., a music recommendation corpus.

In this paper we present an approach for gathering such data and refining it such that the tweeted artists and tracks can directly be related to the free music databases FreeDB and MusicBrainz. As a use case scenario, we present the recommendation of music based on the data set.

This paper is structured as follows. Section 2 describes the processes underlying the creation of the proposed data set. Section 3 features the approach for the recommendation of suitable music tracks as a use case for the gathered data. Section 4 contains related work and Section 5 concludes the paper and discusses future work.

2. DATA SET CREATION
The goal of this approach is the creation of a corpus of music tracks gathered from tweets of users. These tweets contain tracks the user previously listened to and tweeted about (the so-called user stream). In particular, we propose to make use of tweets which have been posted by users or audio players and contain the title and artist of the music track currently played, e.g. “#NowPlaying Best Thing I Never Had by Beyonce”. The following sections describe the steps taken for the creation of the data set.

2.1 Crawling of Twitter Data Set and Analysis
The data set was crawled via the Twitter Streaming API between July 2011 and February 2012. The only publicly available access method is the Spritzer access, which only provides real-time access to about 1% of all posted Twitter
messages. Due to these restrictions, we crawled 4,734,014 tweets containing one of the keywords nowplaying, listento or listeningto posted by 864,736 different users. This implies an average of 5.5 tweets per user. Within our data set, the distribution of tweets per user resembles a long-tail distribution, as can be seen in Table 1. Since recommendations can only be made if a user has posted about two or more tracks, a total of 457,675 users and their tweets cannot be used for our approach, as only one tweet of each of these users is featured within the data set.

{| class="wikitable"
|+ Table 1: Population of User Streams
! Tweets in stream !! Users
|-
| 1 || 457,675
|-
| > 3 || 196,422
|-
| > 5 || 126,783
|-
| > 10 || 63,017
|-
| > 100 || 3,190
|-
| > 1,000 || 253
|-
| > 10,000 || 5
|}

In total, 5,916,294 hashtags were used within the data set. Owing to our search keywords, the hashtags #nowplaying and #listeningto were the most prominent hashtags within the crawled data set. General hashtags such as #music, #radio or #video were also used frequently. Music streaming services or online radios also make use of hashtags when tweeting about the currently playing track (e.g. #cityfm or #fizy).

A total of 1,413,983 tweets (29.8% of the whole corpus) featured hyperlinks. An analysis of these URLs revealed that they are mostly used to point to music services such as YouTube or Spotify, an online music streaming service. A large part of the hyperlinks lead to the website of the service which was used to post the track information on Twitter, e.g. tweetmylast.fm or tinysong.com.

2.2 Resolution of Twittered Tracks
This task aims at parsing the gathered tweets and recognizing the artist name and track title mentioned in the tweet. Consider e.g. the tweet “#NowPlaying Best Thing I Never Had by Beyonce”. For this tweet, we have to extract “Beyonce” as the artist and “Best Thing I Never Had” as the title of the audio track and match it with a reference music database. Most of the crawled messages are very noisy and consist of many terms which are not concerned with the music track itself. Consider e.g. the tweet “listening to Hey Hey My My (Out Of The Blue) by Neil Young on @Grooveshark: #nowplaying #musicmonday http://t.co/7os3eeA”, which contains further information about the online radio service, a URL and other information which is not related to the music track. Especially when dealing with such noisy tweets, the matching is a crucial task as the quality of the data resulting from this step significantly influences the quality of the resulting recommendations.

2.2.1 Resolution Approach
As reference databases for artists and their tracks, we made use of the publicly available databases FreeDB (http://www.FreeDB.org) and MusicBrainz (http://www.MusicBrainz.org). FreeDB contains information about more than 37 million audio tracks, roughly 3,000,000 discs and 766,909 different artists. MusicBrainz was also considered as a reference database as we expected it to be of higher quality than FreeDB. MusicBrainz contains about 8 million tracks of about 650,000 different artists.

The goal of this task is to assign each tweet a FreeDB and a MusicBrainz entry which represents the title and the corresponding artist extracted from the tweet. We tackle this resolution task by making use of a Lucene fulltext index as it allows a simple matching of strings, namely the tweet and a certain FreeDB or MusicBrainz entry. The fulltext index is filled with a combined string containing both the artist and the title of all tracks within the reference databases. In a next step, we query this fulltext index for each of the tweets within the data set in order to obtain the most suitable FreeDB/MusicBrainz candidates for the title and artist of the track. We then use the top-20 search results of Lucene as candidates for the assignment of tracks to the information mentioned in the corresponding tweet. Lucene's ranking function is based on the term frequency/inverse document frequency measure (tf/idf). This measure is dependent on the length of the query, which is not favourable in our approach as tweets contain a high degree of noise (e.g. URLs, feelings, smilies, etc.) which is not part of a track title but is part of the query (the tweet). Therefore, we implemented a bag-of-words similarity measure between the query and the documents contained within the Lucene index, similar to the Jaccard similarity measure. Our proposed similarity measure is defined by the ratio between the size of the (term) intersection of the query and the track and the number of terms contained in the track, as can be seen in Equation 1.

<math>\mathrm{sim}_{music}(tweet, track) = \frac{|tweet \cap track|}{|track|} \qquad (1)</math>

The advantage of such a measure is its independence of the length of the query and the reduced influence of the noise in tweets. Furthermore, as our goal is to find the best matching audio track for all given tweets, it is crucial that most terms within the track are matched. However, in the case of multiple search results having obtained an equal score, we still rely on the tf/idf values computed by Lucene. Our proposed score is used for a ranking of the Lucene search results. For each of the tweets, the track which obtained the highest score is assigned to the tweet. In order to be able to set a certain threshold for the scores of the matching entries later, we also store the computed sim_music score.

2.2.2 Evaluation of Resolution
For the evaluation of the resolution and the comparison of FreeDB and MusicBrainz, we created a ground truth data set which consists of 100 tweets randomly chosen from the data set. Subsequently, we tried to assign matching tracks in the FreeDB and MusicBrainz databases manually. This task was done by the same person for both reference databases and also included the resolution of abbreviations or mentions which link to the artist's Twitter account. For example, the tweet “#nowplaying @Lloyd_YG ft. @LilTunechi - You” can be resolved to the two Twitter accounts “Lloyd-Young Goldie” and “Lil Wayne WEEZY F” and therefore to the MusicBrainz entry “Lil Wayne feat. Lloyd - You”. Having gathered all possible information from the tweet, the assigning person searched for matching tracks in the database.
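
The resolution step of Section 2.2.1 can be illustrated with a minimal Python sketch. It is not the implementation used for the experiments: the regular-expression tokenisation, the treatment of terms as sets, the function names and the shape of the candidate list (pairs of track string and tf/idf score, standing in for the top-20 Lucene results) are all assumptions made for illustration. Only the ratio itself follows Equation 1, and the threshold of 0.8 is the value used in the evaluation.

```python
import re


def terms(text):
    """Lower-cased set of alphanumeric terms (tokenisation is an assumption)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def sim_music(tweet, track):
    """Equation 1: |tweet ∩ track| / |track|, computed over term sets."""
    track_terms = terms(track)
    if not track_terms:
        return 0.0
    return len(terms(tweet) & track_terms) / len(track_terms)


def best_match(tweet, candidates, threshold=0.8):
    """Pick the best candidate for a tweet.

    candidates: list of (track_string, tfidf_score) pairs, a hypothetical
    stand-in for the search results of a fulltext index.
    Candidates are ranked by sim_music; equal scores are resolved by the
    tf/idf score. Returns (track_string, sim_music score), or None if the
    best score is below the threshold or there are no candidates.
    """
    if not candidates:
        return None
    scored = [(sim_music(tweet, track), tfidf, track) for track, tfidf in candidates]
    score, _, track = max(scored)
    return (track, score) if score >= threshold else None
```

For the noisy Grooveshark tweet quoted in Section 2.2, all eight terms of the combined string “Neil Young Hey Hey My My (Out Of The Blue)” occur in the tweet, so the score is 1.0 regardless of the additional hashtags and the URL, which is the length independence the measure is meant to provide.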
If the artist or the title of the track was not directly recognizable in the tweet, single words were used to search the database and find matching artists or titles. We only considered tweets which were resolved to both the corresponding track and artist. Tweets such as “Chris Duarte, famous blues musician - free videos here: http://t.co/UZMXaGQ #blues #guitar #music #roots #free #nowplaying #musicmonday”, which only contain information about the artist, were not counted as a match. However, such information is also very valuable as it describes the musical taste of a user.

For our ground truth data set, we were able to manually assign 57 tracks of FreeDB and 59 tracks of MusicBrainz. This shows that both databases cover the tweeted tracks to a similar extent; however, the FreeDB data set is very noisy (typos, spelling errors and variations).

Subsequently, we ran our automated Lucene-based resolution process on the ground truth data set using both reference databases (see details in Table 2). Considering a sim_music score threshold of 0.8, we were able to resolve 73% of the ground truth correctly with MusicBrainz and had an error rate (false positives) of about 10% of all matched tracks. The high number of false positives using the FreeDB data set can be traced back to the noisy entries in FreeDB.

{| class="wikitable"
|+ Table 2: Resolution Ground Truth (100 tweets)
! Reference DB !! Manually assigned !! Automatically resolved !! False positives
|-
| MusicBrainz || 59 || 43 (73%) || 5 (10%)
|-
| FreeDB || 57 || 31 (54%) || 18 (36%)
|}

Due to these results we used MusicBrainz for all further computations (e.g. music recommendations).

3. MUSIC RECOMMENDATIONS
As a use case, we implemented a music recommendation service on top of the data set. The necessary steps for a recommendation of music are described in the following.

The proposed approach for the recommendation of music titles relies on the co-occurrence of titles within a user stream. Based on the obtained tweets and the assigned tracks, we propose to use association rules [2] in order to be able to model the co-occurrence of items efficiently. In the case of the co-occurrence of tweeted music titles, an association rule t1 → t2 describes that a particular user who tweeted about song t1 also tweeted about song t2. These rules are the basis for the further recommendation process and are stored as triples r = (t1, t2, c), where t1 and t2 are tracks which have been tweeted by the same user and c is a variable holding the popularity of the rule. Hence, such a rule denotes that track t1 and track t2 have both been listened to by c users.

3.1 Ranking of Recommendation Candidates
In this step, the computed association rules are analysed and so-called recommendation candidates are extracted. Based on the rules, the recommended tracks for a certain user are computed by selecting a subset C ⊆ T of track recommendation candidates by determining all rules which feature tracks occurring in the user stream. The final step for the recommendation of tracks is the ranking of the recommendation candidates within the set C. For this purpose, we make use of the count value c describing the popularity of a certain track within all association rules matching the tracks of the input user stream. Hence, all recommendation candidates are ranked by the respective count values, where a higher count value results in a higher rank for the candidate.

3.2 Offline Evaluation
As a first evaluation we performed an offline evaluation and compared the computed track recommendations with recommendations provided by the last.fm API (http://www.last.fm/api), which lists tracks similar to a given track including a score stating the relevance of the song (matching score).

We made use of the MusicBrainz data set as it contains cleaner data than FreeDB. Firstly, we removed all tweets of users who contributed only one tweet as well as all tweets which were matched with a MusicBrainz track with sim_music < 0.8, in order to dismiss uncertain mappings. Hence our final data set consisted of 2.5 million tweets of 525,751 users. Based on this data set we computed the corresponding association rules and obtained 500 million distinct rules. For reasons of computational feasibility and due to API limitations, we chose a subset consisting of the most popular tracks and the corresponding rules which are present more than 10 times (c > 10). The final data set consisted of 15,000 unique tracks and 90 million distinct rules.

We called the last.fm API for all tracks and the API was able to recognize 13,138 out of 15,000 songs. The API returned 3.2 million similar tracks which we matched with our internal MusicBrainz database. In total, 83% of all tracks with a score > 0.8 were matched. We transformed the gathered last.fm data to association rules and computed the overlap of rules with our rule set. 19% of the last.fm rules are covered by the Twitter-based rules. If we consider only similar tracks of last.fm with a matching score (gathered via the last.fm API) higher than 0.6, the Twitter-based rules cover 79% of all rules in the set. When comparing the top-10 recommendations on both sides, the coverage is only about 1% of all rules. These low numbers can be traced back to the restrictions of the Twitter API and the resulting sparse data set. In particular, the incomplete user profiles decrease the coverage. For example, within the “taste” subset of the Million Song Dataset roughly 70% of the tracks were played more than 10 times. In contrast, in our data set only 5% of the tweeted tracks were contained more than 10 times. This strengthens the evidence that the crawled data set is not representative enough, which can be traced back to the API limitation and uncertainties in the matching process. Furthermore, due to the diversity of music tracks, such an offline evaluation may not reveal the full potential of the approach. Online evaluations may achieve better results for our proposed approach and are subject to future work.

4. RELATED WORK
Research related to the presented approach can be categorized into (i) approaches dealing with recommendations either for Twitter or based on tweets and (ii) approaches mainly dealing with the recommendation of music.

The utilization of a corpus of tweets for the recommendation of resources has been a popular research topic. For example, the recommendation of suitable hashtags is discussed in [14]. Many approaches aim at the recommendation of users who might be interesting to follow, e.g. [7]. Such approaches are typically based on the social ties of a user (his followees and followers). There are also many approaches which exploit these ties to recommend resources, such as websites [6] or news [12].

As for the second category of related work, the recommendation of music, many different approaches have been presented. Celma [5] provides an overview of this topic. Within recommender systems, in principle two major approaches are distinguished [1]: content-based recommendations and collaborative filtering (CF) approaches. Content-based recommendation systems aim at recommending resources which are similar to the resources the user already consumed or showed interest in. Collaborative filtering approaches aim at finding users with a profile similar to the current user in order to recommend items which these similar users were also in favor of. This categorization also holds within music recommendations. Content-based methods for music titles typically rely on the extraction and analysis of audio features. The presented approach relies on the second type as the computation of association rules based on user profiles can be assigned to the class of CF approaches.

However, for music recommendations a third important aspect is also exploited for the computation of recommendations: context. The notion of context has e.g. been defined by Schmidt et al. as being threefold: physical environment, human factors and time [13]. These three factors have all been addressed by music recommendation research. As for the physical environment of a user, Kaminskas and Ricci presented a location-aware approach for music recommendations [8]. The mood of users has been incorporated for the computation of recommendations in [9], and Baltrunas et al. [3] considered temporal factors when recommending music.

Many approaches exploited user profiles in social networks to recommend resources. Mesnage et al. [10] showed that people prefer the music that their friends in the social network prefer. The Serendip.me project (http://serendip.me) provides its users with music which is selected solely based on the Twitter ties (the followees) of the user. The dbrec project [11] is concerned with recommending music based on the DBpedia data set. In particular, the authors developed a distance metric for resources within DBpedia which enables them to recommend similar artists.

However, to the best of our knowledge there are no approaches concerned with the recommendation of music based on an analysis of “nowplaying” user streams on Twitter.

5. CONCLUSION AND FUTURE WORK
In this paper we showed that tweets can be exploited to build a corpus for music recommendations. The comparison with the recommendation service of last.fm showed that despite the sparse corpus due to Twitter's API limitations, the coverage of last.fm's recommendations is up to 79%. The results are very promising, although the approach has to be enhanced to be usable in real-world recommendation environments. A major improvement would be the expansion of the data set as currently the corpus is very sparse and the user profiles are incomplete. Also, the matching task of noisy tweets deteriorates the quality of recommendations. This is due to the fact that many uncertain matching results have to be dismissed and hence the size of the usable data corpus decreases. Future work also comprises the enhancement of the matching process by using metadata such as location, URLs or further sentiment analysis. Additionally, applying CF techniques for the exploitation of the social ties of the user is subject to future work. In order to evaluate the approach from a user's point of view, online user tests are also part of the future work.

6. REFERENCES
[1] G. Adomavicius and A. Tuzhilin. Toward the Next Generation of Recommender Systems. IEEE Transactions on Knowledge and Data Engineering, 17(6):734–749, 2005.

[2] R. Agrawal and R. Srikant. Fast Algorithms for Mining Association Rules. In Proc. of the 20th Intl. Conf. on Very Large Data Bases, pages 487–499, 1994.

[3] L. Baltrunas and X. Amatriain. Towards Time-Dependant Recommendation based on Implicit Feedback. Workshop on Context-Aware Recommender Systems CARS 2009 in ACM RecSys, 2009:1–5.

[4] T. Bertin-Mahieux, D. P. Ellis, B. Whitman, and P. Lamere. The Million Song Dataset. In Proc. of the 12th Intl. Conf. on Music Information Retrieval, 2011.

[5] Ò. Celma. Music Recommendation and Discovery - The Long Tail, Long Fail, and Long Play in the Digital Music Space. Springer, 2010.

[6] J. Chen, R. Nairn, L. Nelson, M. Bernstein, and E. Chi. Short and Tweet: Experiments on Recommending Content from Information Streams. In Proc. of the 28th Intl. Conf. on Human Factors in Computing Systems, pages 1185–1194. ACM, 2010.

[7] J. Hannon, M. Bennett, and B. Smyth. Recommending Twitter Users to Follow using Content and Collaborative Filtering Approaches. In Proc. of the 4th ACM Conf. on Recommender Systems, pages 199–206. ACM, 2010.

[8] M. Kaminskas and F. Ricci. Location-Adapted Music Recommendation Using Tags. In User Modeling, Adaption and Personalization 2011, Girona, Spain, July 11-15, 2011, volume 6787 of LNCS, pages 183–194. Springer, 2011.

[9] J. Lee and J. Lee. Context Awareness by Case-based Reasoning in a Music Recommendation System. In Proc. of the 4th Intl. Conf. on Ubiquitous Computing Systems, pages 45–58. Springer, 2007.

[10] C. Mesnage, A. Rafiq, S. Dixon, and R. Brixtel. Music Discovery with Social Networks. In Proc. of the Workshop on Music Recommendation and Discovery 2011 in conjunction with ACM RecSys, volume 793, pages 1–6. CEUR-WS, 2011.

[11] A. Passant. dbrec - Music Recommendations Using DBpedia. The Semantic Web–ISWC 2010, pages 209–224, 2010.

[12] O. Phelan, K. McCarthy, and B. Smyth. Using Twitter to Recommend Real-Time Topical News. In Proc. of the third ACM Conf. on Recommender Systems, RecSys ’09, pages 385–388, New York, NY, USA, 2009. ACM.

[13] A. Schmidt, M. Beigl, and H. Gellersen. There is more to Context than Location. Computers & Graphics, 23(6):893–901, 1999.

[14] E. Zangerle, W. Gassler, and G. Specht. Using Tag Recommendations to Homogenize Folksonomies in Microblogging Environments. In Social Informatics, volume 6984 of LNCS, pages 113–126. Springer, 2011.
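
As an illustration of Section 3, the construction of the rule triples r = (t1, t2, c) and the ranking of recommendation candidates (Section 3.1) might be sketched in Python as follows. This is a sketch under stated assumptions rather than the implementation behind the reported numbers: the function names and the dictionary-based input are hypothetical, the counts of all rules pointing to the same candidate are summed (the paper only states that candidates are ranked by their count values), and tracks already present in the user stream are excluded from the result.

```python
from collections import Counter
from itertools import permutations


def mine_rules(user_streams):
    """Build co-occurrence rules from user streams.

    user_streams: dict mapping a user id to an iterable of track ids.
    Returns a Counter {(t1, t2): c}, where c is the number of users
    who tweeted about both t1 and t2.
    """
    rules = Counter()
    for tracks in user_streams.values():
        # Each user contributes at most once to every ordered pair.
        for t1, t2 in permutations(set(tracks), 2):
            rules[(t1, t2)] += 1
    return rules


def recommend(user_tracks, rules, top_n=10):
    """Rank recommendation candidates for one user stream.

    A candidate is the right-hand side of any rule whose left-hand side
    occurs in the user stream. Summing the counts per candidate and
    skipping already tweeted tracks are assumptions of this sketch.
    """
    seen = set(user_tracks)
    scores = Counter()
    for (t1, t2), c in rules.items():
        if t1 in seen and t2 not in seen:
            scores[t2] += c
    # Higher count first; ties are ordered by track id for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [track for track, _ in ranked[:top_n]]
```

Storing both directions of each pair keeps the lookup for a given user stream simple; an implementation working on hundreds of millions of rules would more likely store each unordered pair once.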
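
The rule overlap reported in Section 3.2 (the share of last.fm-derived rules that also occur in the Twitter-based rule set, optionally restricted to rules above a matching score) can be expressed compactly. The following Python sketch only illustrates that computation; the function name, the input shapes and the example values in the usage note are hypothetical and do not reproduce the evaluation data.

```python
def rule_coverage(reference_rules, own_rules, min_score=0.0):
    """Fraction of reference rules that also occur in our own rule set.

    reference_rules: dict {(t1, t2): matching_score}, e.g. derived from
    a similar-tracks service (hypothetical input shape).
    own_rules: iterable of (t1, t2) pairs.
    Only reference rules with a matching score above min_score are
    considered; returns 0.0 if none remain.
    """
    considered = [pair for pair, score in reference_rules.items() if score > min_score]
    if not considered:
        return 0.0
    own = set(own_rules)
    covered = sum(1 for pair in considered if pair in own)
    return covered / len(considered)
```

With four made-up reference rules scored 0.9, 0.7, 0.5 and 0.3, of which only the first two are contained in the own rule set, the coverage is 0.5 over all rules and 1.0 when only rules with a score above 0.6 are considered. This mirrors the effect described in the evaluation, where restricting the comparison to higher matching scores raises the coverage.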