<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Using Semantic Relations in Context-based Music Recommendations</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>İpek Tatlı</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ayşenur Birtürk</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Engineering, METU Inonu Bulvari</institution>
          ,
          <addr-line>06531 Ankara</addr-line>
          ,
          <country country="TR">Turkey</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this paper, we describe an approach for creating music recommendations based on user-supplied tags that are augmented with a hierarchical structure extracted for top-level genres from Dbpedia. In this structure, each genre is represented by its stylistic origins, typical instruments, derivative forms, subgenres and fusion genres. We use this well-organized structure for dimensionality reduction in user and item profiling. We compare two recommenders: one using our method and the other using Latent Semantic Analysis (LSA) for dimensionality reduction. The recommender using our approach outperforms the other. In addition to different dimensionality reduction methods, we evaluate the recommenders with different user profiling methods.</p>
      </abstract>
      <kwd-group>
        <kwd>Recommendation systems</kwd>
        <kwd>user profiling</kwd>
        <kwd>social tagging</kwd>
        <kwd>semantic relations</kwd>
        <kwd>dimensionality reduction</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>These days, most social-networking sites let their
members participate in content generation. For example, users
can label artists, albums and tracks with tags in Last.fm.
A tag can be anything, but it is essentially a short description
of the item. Because tags represent why a
listener likes an item, but not how much he or she likes it, they
are better identifiers of user profiles than ratings, which are
usually numerical values assigned to items by users. Thus,
we concentrate on the tag-based contextual representations
of music tracks.</p>
      <p>WOMRAD 2011 2nd Workshop on Music Recommendation and Discovery,
colocated with ACM RecSys 2011 (Chicago, US).
Copyright ©. This is an open-access article distributed under the terms
of the Creative Commons Attribution License 3.0 Unported, which permits
unrestricted use, distribution, and reproduction in any medium, provided
the original author and source are credited.</p>
      <p>Items are mostly represented in vector spaces in
recommendation systems. In tag-based recommendation
systems, users and items are defined in terms of weighted
vectors of social tags. When there is a large number of tags,
calculating the items to be recommended becomes hard,
because working with huge, sparse matrices is expensive; a
common remedy is to represent individual tracks
(songs) in lower-dimensional spaces. In order to reduce the
dimensionality, we focus on the genre information of the
tags. Each genre has a relationship with some
instrumentation, with some subgenre information and with style
information, each of which may be entered as tags in the music
domain. In our work, for each genre, Dbpedia (http://dbpedia.org), a
structured form of Wikipedia (http://en.wikipedia.org), is crawled
to set the relationships between the genre and its stylistic origins,
typical instruments, derivative forms, subgenres and fusion genres. The
contributions of our approach are that: (1) we provide a
"semantic relations" method for dimensionality reduction in very
large vector spaces and (2) we compare
our method against the classical Singular Value
Decomposition (SVD) method, which is the basis of Latent Semantic
Analysis (LSA). Our method outperforms the traditional
one.</p>
    </sec>
    <sec id="sec-2">
      <title>2. RELATED WORK</title>
      <p>
        In music recommendation systems, tracks can be profiled
in terms of their audio content (rhythm, timbre,
tonality, instrumentation). In addition to audio descriptions,
tracks can be profiled in terms of text descriptions such as
metadata, lyrics, tags and reviews mined from various
blogs [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Metadata information is mostly supplied by
experts. The artist's name, the album's name, genre, duration
and year are some attributes in the metadata. Attributes
are global descriptions of items and do not change
according to users whereas tags are local descriptors and might
change from user to user [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. In our study, we focus on text
descriptions, namely tags, in track profiling.
      </p>
      <p>
        Recommender systems either predict ratings for unseen
items or predict items that a user may like. Most social
websites like Last.fm do not have a rating mechanism.
Instead of explicit ratings, today's recommender systems use
implicit ratings (users' listening habits, purchase
histories, etc.). Thus the rating scale in implicit rating
mechanisms is 0-1. Tags can be used in rating-based collaborative
filtering systems with the help of an implicit rating
mechanism [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. If the tag is used by the user, its rating is 1;
otherwise its rating is 0. In most previous studies, 2-dimensional
projections of the music space are taken into consideration (item-user,
user-tag or item-tag). User-tag and item-tag relations
can be used to extend the rating data [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. A new approach
which uses all dimensionalities of the music space is
proposed in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Each 'user-item-tag' triple is a tensor entry in this
study, and the researchers propose a Higher Order Singular
Value Decomposition (HOSVD) technique for 3-dimensional
social tagging data. The HOSVD method outperforms the
classical methods. In contrast, the three 2-dimensional
relations among users, tags and items have been used in a
new similarity measure generation which outperforms the
traditional item-based and user-based collaborative filtering
methods [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. In this approach, neighborhood generation is
effected through the similarity of users' tags, the similarity of
users' items and the similarity of users' tag-item
relationships. In addition to user similarities, item similarities have
been calculated with common tags, common users and
common tag-item relationships. Moreover, tags can be clustered
and these clusters improve the personalized
recommendations [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
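      <p>The implicit 0-1 rating scheme described above can be sketched as follows (a minimal illustration; the function and variable names are ours, not from the cited work):</p>

```python
def implicit_tag_ratings(user_tags, vocabulary):
    """Binary implicit ratings: 1 if the user applied the tag, else 0."""
    return [1 if tag in user_tags else 0 for tag in vocabulary]

# A user who applied "rock" and "energetic" out of a four-tag vocabulary.
vocab = ["rock", "jazz", "seen live", "energetic"]
alice = {"rock", "energetic"}
print(implicit_tag_ratings(alice, vocab))  # [1, 0, 0, 1]
```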
      <p>
        To the best of our knowledge, [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] uses a similar approach to ours,
extracting the top 50 music facets from Wikipedia, but the
main objective of [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] is to provide an automatic method for
uncovering the music facets and to classify tags according
to these facets. On the other hand, we create a hierarchical
genre structure and evaluate the usefulness of our approach
in music recommendation.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. OUR APPROACH</title>
      <p>Our system performs 6 main tasks, shown in Figure 1: web
crawling, creating an ontology of musical genres, classifying
tags according to the ontology, track profiling, user profiling
and enacting the recommendation process. The circles
denote the phases of the system.</p>
      <p>Users listen to music and enter tags for tracks in their
Last.fm profiles. In the web crawling phase of the system,
a data set is generated. Details of the data set are given in
Section 4.</p>
      <p>
        Tags may be about genre, instrumentation, location, moods,
style, personal opinions and/or artists' names [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. For
example, two users of Last.fm tagged some songs as follows:
the first one loved listening to "The Wall" by "Pink Floyd"
and tagged the track with the words "energetic" and "seen
live". The second one loved "Only Girl" by "Rihanna" and
tagged "Only Girl" also with the words "energetic" and "seen
live". Thus both "Only Girl" and "The Wall" have the same
tags. According to the recommender's similarity
function, they appear as very similar tracks, although in most
other ways (genre, instrumentation for instance) they are
not. For such reasons, subjective tags like personal
opinions and moods are ignored in the track and user
representation in our system. Instrumentation, subgenre, fusion
genre, derivative forms and stylistic information are used
in our track and user representations. Firstly, we decided on
the main genres in the music domain. In [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], 14
mainstream genres (country, folk, jazz, blues, r&amp;b, heavy metal,
alternative/indie, punk, rap, electro, reggae, classical, rock
and pop) are used. We enriched these genres with Last.fm's
mainstream genres, which can be found in the left frame
of the page http://www.last.fm/music. The main genres
used in our system are as follows: acoustic, ambient, blues,
classical, country, electronic, emo, folk, hardcore, hip hop,
indie, jazz, Latin, metal, pop, pop punk, punk, reggae, r&amp;b,
rock, soul, world. Having identified our genres, we decided
to crawl the Wikipedia page for each main genre, but then
we switched to Dbpedia since it is more structured for
web crawling. We crawled the Dbpedia page for each main genre.
The obtained information is illustrated in Table 1 for "rock
music". In the ontology creation phase, we created a small
ontology, a hierarchical structure, with the help of the data
crawled from Dbpedia. The relations in our ontology can
be seen in Table 2. In this structure, instrumentation,
stylistic origins, derivative forms, subgenres and fusion genres are
the classes; the crawled data are the instances. For example,
"New Age Music" and "Synthpop" are instances of the class
"Derivative forms", as can be seen in Table 1.
      </p>
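      <p>The class-instance structure described above can be sketched as a simple mapping (a minimal illustration; only the two "rock" instances quoted in the text are filled in, and the remaining lists would be populated from the Dbpedia crawl):</p>

```python
# Per-genre relation structure mirroring Table 2: each relation (class)
# maps to a list of crawled instances. Empty lists are placeholders for
# data that would come from the Dbpedia crawl.
rock = {
    "hasStylisticOrigins": [],
    "hasInstrumentation": [],
    "hasDerivativeForms": ["New Age Music", "Synthpop"],
    "hasSubGenres": [],
    "hasFusionGenres": [],
}

# The full ontology keys each of the 22 main genres to such a structure.
ontology = {"rock": rock}
```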
      <p>LSA does not use word order and morphology. In order
not to differentiate "electronic" from "electronica", we
applied a stemming algorithm. Stemming is a technique
that converts similar words to a common base form. This base
form does not have to have a meaning from a linguistic point
of view (such as reducing synonyms to a single word, or
finding the root of the word). Various stemming algorithms
exist for the English language. We used the Porter stemmer
(http://tartarus.org/~martin/PorterStemmer), a classical
stemming algorithm. By using a stemming algorithm,
morphology is taken into consideration in our approach.
In the tag classification phase of our system, we parsed
the instances existing in the ontology into single
words. We applied the stemming algorithm to each single
word. Then we concatenated the stemmed roots with "%" in
order to preserve the "word order" that LSA does not use. Some
examples of the stemming results can be seen in Table 3.</p>
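      <p>The stem-and-concatenate step can be sketched as follows (a minimal illustration; the toy suffix stripper stands in for the actual Porter stemmer, and the function names are ours):</p>

```python
def toy_stem(word):
    """Very rough suffix stripping; a stand-in for the Porter stemmer."""
    for suffix in ("ica", "ic", "al", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def instance_to_pattern(instance):
    """Stem each word and join the stems with '%' so a SQL LIKE pattern
    still respects word order, e.g. 'Aboriginal rock' -> '%aborigin%rock%'."""
    stems = [toy_stem(w.lower()) for w in instance.split()]
    return "%" + "%".join(stems) + "%"

pattern = instance_to_pattern("Aboriginal rock")
sql = f"select * from tags where tag_name like '{pattern}'"
```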
      <p>All the tags in our dataset are saved in the "tags" table.
The reason why we use "%" in the new versions of the instances is
that we use these versions in our SQL statements, such as
"select * from tags where tag name like '%Aborigin%rock%'".
With this method, about 100000 tags out of
160000 tags are used in the track representation. In the track
profiling phase, the size of a track vector is the number of
mainstream genres (22 in our case). Last.fm provides integer
percentages (between 0 and 100) relative to the most used
tags per track. We updated these percentages by adding 1
to each percentage value in order not to discard any tag
having a 0 percentage. Each entry in the vector is calculated as
follows:</p>
      <p>g(i, j) = Σ hasInstrumentation(i, j) + Σ hasStylisticOrigins(i, j) + Σ hasDerivativeForms(i, j) + Σ hasSubGenres(i, j) + Σ hasFusionGenres(i, j)</p>
      <p>Here, g(i, j) is the ith term (genre) in the jth track, and
hasInstrumentation(i, j) is the total percentage (between
1 and 101) of the tags of the jth track which are found to
be similar to the new instance versions of the
instrumentation class of the ith genre (with the help of the aforementioned
SQL statements). The term count is usually normalized to
prevent a bias towards longer documents (which may have
a higher term count regardless of the actual importance of
that term in the document). The term frequency (TF) value
gives local information about a tag. An inverse document
frequency (IDF) value is calculated for each different tag in
the training set. This is calculated by dividing the total
number of tracks by the number of tracks that refer to that
feature. The IDF value gives global information about a tag.
Thus, tracks in our dataset are represented as a weighted
list of genres, and the weights of the genres are calculated
with TF*IDF.</p>
      <p>w_i = (n_i,t / n_t) * log(|T| / |T_i|)</p>
      <p>In the formula above, w_i is the weight of the ith genre; n_i,t
equals the number of times the ith genre appears in the tth track;
n_t is the total number of genres in the tth track; |T| is the total
number of tracks, and |T_i| equals the number of tracks in
which the ith genre appears. In the user profiling phase, 3
different methods are used:</p>
      <p>1- using the users' own tags (personal tags) that they
entered</p>
      <p>2- using the users' friends' tags (friends' tags) that their
friends entered</p>
      <p>3- using all the tags of the tracks (social tags) that they
listened to</p>
      <p>In the first method, users are profiled with their own tags;
in the second, with their friends' tags. Semantic relations
are used in user profiling methods 1 and 2, just
as in track profiling. In method 3, a user profile is
the sum of the vectors of the tracks that the user has
listened to. The weights of
the genres in user vectors are also calculated with the TF*IDF
method. The main goal after creating a user profile from
the training set is to recommend the items in the test set.</p>
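      <p>The TF*IDF weighting above can be sketched as follows (a minimal illustration; the function name is ours):</p>

```python
import math

def tfidf_weight(n_it, n_t, total_tracks, tracks_with_genre):
    """w_i = (n_i,t / n_t) * log(|T| / |T_i|), as in the formula above."""
    tf = n_it / n_t                              # local weight of the genre
    idf = math.log(total_tracks / tracks_with_genre)  # global rarity
    return tf * idf

# A genre appearing 3 times among 10 genre terms of a track, in a
# collection of 13312 tracks of which 1000 contain that genre.
weight = tfidf_weight(3, 10, 13312, 1000)
```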
      <p>In the recommendation phase, we use the common
cosine similarity method. The cosine similarity formula is
given below:</p>
      <p>CosSim(track, user) = (track_vector · user_vector) / (|track_vector| |user_vector|)</p>
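      <p>The cosine similarity computation can be sketched as follows (a minimal illustration; the function name is ours):</p>

```python
import math

def cos_sim(track_vec, user_vec):
    """Cosine of the angle between a track vector and a user vector."""
    dot = sum(t * u for t, u in zip(track_vec, user_vec))
    norm = math.sqrt(sum(t * t for t in track_vec)) * \
           math.sqrt(sum(u * u for u in user_vec))
    return dot / norm if norm else 0.0

# Tracks are ranked for a user by decreasing cosine similarity.
similarity = cos_sim([3.0, 0.0, 1.0], [1.0, 2.0, 0.5])
```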
    </sec>
    <sec id="sec-4">
      <title>4. DATA SET</title>
      <p>We use real Last.fm data in this study. In order not to
use similar users from our own friend lists, and in order to
achieve diversity, we selected 69 users from a group named
"join Last.fm" (http://www.facebook.com/group.php?gid=2246697136&amp;v=wall),
whose members share their Last.fm nicknames. We crawled their Last.fm
profiles with the help of the Last.fm API (http://last.fm/api).
Since our approach is
not collaborative but content-based, this number of users is
reasonable. Firstly, we gathered their top 300 tracks. Then
we extracted their "loved" tracks. For each track, we
extracted the singer names and tags. We also gathered the
tag counts per track. Finally, for each user we extracted
their tags and their friends' tags. Details of our data set can
be seen in Table 4.</p>
    </sec>
    <sec id="sec-5">
      <title>5. EXPERIMENTS</title>
    </sec>
    <sec id="sec-6">
      <title>5.1 Methodology</title>
      <p>We performed a 4-fold cross-validation in which the
training data size was 75% and the test data 25%. User
profiles were created using the training set, and the task of
our recommender system was to predict the correct items in
the test set.</p>
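      <p>The 4-fold split can be sketched as follows (a minimal illustration; the function name is ours):</p>

```python
def four_fold_splits(items):
    """Yield 4 (train, test) splits; each fold holds out 25% for testing."""
    n = len(items)
    fold = n // 4
    for k in range(4):
        test = items[k * fold:(k + 1) * fold]
        train = items[:k * fold] + items[(k + 1) * fold:]
        yield train, test
```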
    </sec>
    <sec id="sec-7">
      <title>5.2 Metrics</title>
      <p>In this study we used the most common evaluation metric:
Precision at the top N ranked results (P@N). Precision is the
ratio of relevant tracks selected correctly to the total number of
tracks selected.</p>
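      <p>P@N can be sketched as follows (a minimal illustration; the function name is ours):</p>

```python
def precision_at_n(ranked, relevant, n):
    """P@N: fraction of the top-N ranked tracks that are relevant."""
    top = ranked[:n]
    return sum(1 for track in top if track in relevant) / len(top)
```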
    </sec>
    <sec id="sec-8">
      <title>5.3 Results</title>
      <p>
        In Last.fm, although users listen to music, they rarely
enter tags for the tracks that they like. Thus, user profiles in
Last.fm are smaller than in other social tagging sites, so
the performance of pure content-based recommendation
is not satisfying [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. In Table 5, two recommenders are compared in terms of
recommending the corresponding tracks in the test set: one
using LSA with an optimized parameter k and one using our
method for dimensionality reduction. LSA is
applied to the track-tag matrix whose size is 13312*169174
(13312 tracks, 169174 tags). On the other hand, the
recommender using the semantic relations method decreases the
matrix size to 13312*22 (13312 tracks, 22 genres). In this
recommender, each genre is semantically related to
instruments, stylistic origins, subgenres, fusion genres and
derivative forms. Thus, semantically related tags are counted as
the same genre in this representation. As seen in Table 5,
the recommender using semantic relations
outperforms the recommender using LSA for
dimensionality reduction, because it handles the semantic gap problem
in social tagging. Moreover, the recommender using users'
own tags in user profiling performs better than the
recommender using friends' tags. However, the recommender
using all social tags in user profiling provides
the best results because of the increased number of tags in
user vectors.
      </p>
    </sec>
    <sec id="sec-9">
      <title>6. CONCLUSION</title>
      <p>User-annotated texts, tags in our case, are huge in size, but
the representation matrix is very sparse. Using such giant
matrices in calculations is a time- and resource-consuming
job. For document categorization and text
summarization, LSA has been used for years because it is easy to use
and reliable. As an alternative, with the help of Dbpedia,
we created an ontology-like semantic relations structure for
the music domain. In this paper, we evaluated two methods
which can be used in dimensionality reduction. In the
evaluation, a Last.fm dataset was used and the recommenders were
evaluated with different user profiling methods. Our method
has the advantage of using "word order" and "morphology",
unlike LSA. We plan to extend our work by assigning
different weights to different relations. For instance,
hasInstrumentation and hasSubGenres may have different weights
in the track profiling.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>O.</given-names>
            <surname>Celma</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Lamere</surname>
          </string-name>
          .
          <article-title>Music Recommendation Tutorial</article-title>
          . ISMIR. Vienna, Austria,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>K.H.L.</given-names>
            <surname>Tso-Sutter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.B.</given-names>
            <surname>Marinho</surname>
          </string-name>
          and
          <string-name>
            <given-names>L.</given-names>
            <surname>Schmidt-Thieme</surname>
          </string-name>
          .
          <article-title>Tag-aware recommender systems by fusion of collaborative filtering algorithms</article-title>
          .
          <source>ACM symposium on Applied Computing. Brazil</source>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>P.</given-names>
            <surname>Symeonidis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ruxanda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nanopoulos</surname>
          </string-name>
          and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Manolopoulos</surname>
          </string-name>
          .
          <article-title>Ternary semantic analysis of social tags for personalized music recommendation</article-title>
          .
          <source>ISMIR</source>
          . Philadelphia, USA,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>H.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Li</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Nayak</surname>
          </string-name>
          .
          <source>Collaborative Filtering Recommender Systems Using Tag Information. Web Intelligence and Intelligent Agent Technology. Sydney, Australia</source>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Shepitsen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gemmell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Mobasher</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Burke</surname>
          </string-name>
          .
          <article-title>Personalized recommendation in social tagging systems using hierarchical clustering</article-title>
          .
          <source>ACM conference on Recommender systems. Lausanne, Switzerland</source>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Sordo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Gouyon</surname>
          </string-name>
          and
          <string-name>
            <given-names>L.</given-names>
            <surname>Sarmento</surname>
          </string-name>
          .
          <article-title>A Method for Obtaining Semantic Facets of Music Tags</article-title>
          . WOMRAD. Barcelona, Spain,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>I.</given-names>
            <surname>Cantador</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bellogin</surname>
          </string-name>
          and
          <string-name>
            <given-names>D.</given-names>
            <surname>Vallet</surname>
          </string-name>
          .
          <source>Content-based Recommendation Systems in Social Tagging Systems. ACM conference on Recommender systems. Barcelona, Spain</source>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M.</given-names>
            <surname>Levy</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Sandler</surname>
          </string-name>
          .
          <article-title>Learning Latent Semantic Models For Music From Social Tags</article-title>
          .
          <source>Journal of New Music Research</source>
          . pages
          <fpage>137</fpage>
          -
          <lpage>150</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>