<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Method for Obtaining Semantic Facets of Music Tags</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Mohamed Sordo</string-name>
          <email>mohamed.sordo@upf.edu</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fabien Gouyon</string-name>
          <email>fgouyon@inescporto.pt</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Luís Sarmento</string-name>
          <email>las@fe.up.pt</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>INESC Porto</institution>
          ,
          <addr-line>Porto</addr-line>
          ,
          <country country="PT">Portugal</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>LIACC/FEUP</institution>
          ,
          <addr-line>Univ. do Porto, Porto</addr-line>
          ,
          <country country="PT">Portugal</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Universitat Pompeu Fabra</institution>
          ,
          <addr-line>Barcelona</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <abstract>
<p>Music folksonomies have inherently loose and open semantics, which hampers their use in structured browsing and recommendation. In this paper, we present a method for automatically obtaining a set of semantic facets underlying a folksonomy of music tags. The semantic facets are anchored upon the structure of Wikipedia, a dynamic repository of universal knowledge. We illustrate the relevance of the obtained facets for the description of tags.</p>
      </abstract>
      <kwd-group>
        <kwd>Music tagging</kwd>
<kwd>Last.fm</kwd>
        <kwd>Wikipedia</kwd>
        <kwd>Social music</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
<p>Music is a complex phenomenon that can be described
according to multiple facets. Descriptive facets of music are
commonly defined by experts (e.g. stakeholders in the music
industry) in professional taxonomies. Multifaceted
descriptions are especially useful for music browsing and
recommendation. For instance, recommendations of the Pandora
Internet radio use around 400 music attributes grouped in
20 facets,1 as for instance Roots (e.g. "Afro-Latin Roots"),
Instrumentation (e.g. "Mixed Acoustic and Electric
Instrumentation"), Recording techniques (e.g. "Vinyl Ambience"),
or Influences (e.g. "Brazilian Influences").
1http://en.wikipedia.org/wiki/List_of_Music_Genome_Project_attributes
WOMRAD 2010 Workshop on Music Recommendation and Discovery,
colocated with ACM RecSys 2010 (Barcelona, SPAIN)
Copyright ©. This is an open-access article distributed under the terms
of the Creative Commons Attribution License 3.0 Unported, which permits
unrestricted use, distribution, and reproduction in any medium, provided
the original author and source are credited.</p>
      <p>
        However, there exists no consensual taxonomy for music.
Previous research showed that the music industry uses
inconsistent taxonomies [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], even when restricting attention to a single and
widespread facet such as music genre. Moreover,
expert-defined taxonomies (music-related or not) have two
fundamental problems. First, they are very likely to be
incomplete, since it is impossible for a small group of experts to
incorporate in a single structure all the knowledge that is
relevant to a specific domain. Second, since domains are
constantly evolving, taxonomies tend to become quickly
outdated: in music, new genres and techniques are constantly
emerging.
      </p>
      <p>An alternative strategy for describing music consists in
relying on the broadness of the web and making use of
the "wisdom of the crowds". Many music websites allow
users themselves to assign their own descriptive tags to
music items (artists, albums, songs, playlists, etc.). For
instance, users of the website Last.fm tagged the band
Radiohead as "90s", "00s", "alternative", "post-punk", "britpop",
and "best band ever", among other things. The combination of
annotations provided by thousands of music users leads to
the emergence of a large body of domain-specific knowledge,
usually called a folksonomy. Due to its informal syntax (i.e.
direct assignment of tags), the tagging process allows the
collective creation of very rich tag descriptions of individual
music items.</p>
      <p>When compared to taxonomies defined by experts, music
folksonomies have several advantages. First, completeness:
they ideally encompass all possible "ways to talk about
music", including both lay and expert points of view. Second,
due to the continuous nature of the tagging process,
folksonomies tend to remain up to date. Third, they usually
incorporate both commonly accepted, generic concepts and
very specific, local ones.</p>
      <p>It seems reasonable to assume that folksonomies tend to
encompass various groups of tags that reflect the
underlying semantic facets of the domain, including not only
traditional dimensions (e.g. instrumentation), but also more
subjective ones (e.g. mood). However, the simplicity and
user-friendliness of community-based tagging imposes a toll:
there is usually no way to explicitly relate tags to the
corresponding music facets. For instance, a user may assign a
number of tags related to music genre without ever
explicitly specifying that they are about "music genre".
For providing a flexible browsing experience, this is a
significant disadvantage of folksonomy-based classification
relative to classification based on taxonomies, where the
information about which facets are being browsed can be made
explicitly available to the user.</p>
      <p>In this paper, we approach an essential research question
that is relevant to bridging this gap: is it possible to
automatically infer the semantic facets inherent to a given music
folksonomy? A related research question is whether it is
then possible to classify elements of that music folksonomy
with respect to the inferred semantic facets.</p>
      <p>We propose an automatic method for (1) uncovering the
set of semantic facets implicit in the tags of a given
music folksonomy, and (2) classifying tags with respect to these
facets. We anchor the semantic facets on metadata from Wikipedia,
a semi-structured repository of general knowledge. Our
rationale is that, since it is dynamically maintained by a large
community, Wikipedia should contain well-grounded and up-to-date
information about the facets of music that are relevant in practice.</p>
    </sec>
    <sec id="sec-2">
      <title>2. RELATED WORK</title>
      <p>
        Music tags have recently been the object of increasing
attention from the research community [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ]. A number of
approaches have been proposed to associate tags with music
items (e.g. a particular artist, or a music piece) based on an
analysis of audio data [
        <xref ref-type="bibr" rid="ref1 ref9">1, 9</xref>
        ], on knowledge about tag
co-occurrence [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], or on the extraction of tag information from
community-edited resources [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. However, in most cases,
such approaches consider tags independently, i.e. not as
elements in structured hierarchies of different music facets.
When hierarchies of facets are considered, they are
usually defined a priori, and vary greatly according to authors.
For example, [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] groups tags into the following facets: genre,
locale, mood, opinion, instrumentation, style, time period,
recording label, organizational, and social signaling.
      </p>
      <p>
        To our knowledge, however, few efforts have been
dedicated to the particular task of automatically identifying the
relevant facets of music tags. In their work on inferring
models for genre and artist classification, Levy et al. apply
dimensionality reduction techniques to a data set of tagged
music tracks in order to obtain their corresponding
compact representations in a low-dimensional space [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. They
base their approach on tag co-occurrence information. Some
emerging dimensions can be associated with facets such as Era
(e.g. the dimension [90s]). However, most of the
dimensions thus inferred are, in fact, combinations of diverse
music facets, such as for example the dimension [guitar; rock],
which includes concepts of both instrumentation and genre.
      </p>
      <p>
        Cano et al. use the WordNet ontology to automatically
describe sound effects [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Despite the very large number of
concepts in WordNet, they report that it accounts for
relatively few concepts related to sound and music, and propose
an extension specific to the domain of sound effects. On
the one hand, they illustrate that browsing can indeed be
greatly enhanced by providing multifaceted descriptions of
items. On the other hand, however, it is our belief that,
because of their necessary stability, existing ontologies are
not the best-suited tool for describing domains of knowledge
with inherently open and dynamic semantics, such as music.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. METHOD</title>
      <p>Our method consists in using metadata from Wikipedia to
infer the semantic facets of a given music folksonomy. This
is performed in two steps. In the first step, we specialize the
very large network of interlinked Wikipedia pages to the
specific domain of the music folksonomy at hand. This is done
by maximizing the overlap between Wikipedia pages and a
list of frequent tags from the folksonomy. As the resulting
network still contains a very large number of nodes, in a
second step, we focus on the most relevant ones (node
relevance being defined as an intrinsic property of the network).
This step also includes additional refinements.</p>
    </sec>
    <sec id="sec-4">
      <title>3.1 Obtaining a Music-Related Network</title>
      <p>Wikipedia pages are usually interlinked, and we use the
links between two particular types of pages (i.e. articles
and categories) to construct a music-related network.
Concretely, we use the DBpedia knowledge base (http://dbpedia.org/),
which provides structured, machine-readable
descriptions of the links between Wikipedia pages (DBpedia uses
the SKOS vocabulary, in its 2005 version:
http://www.w3.org/TR/2005/WD-swbp-skos-core-spec-20051102/). In particular,
we make use of two properties that connect pages in
DBpedia: (1) the property subjectOf, which connects articles to
categories (e.g. the article "Samba" is a subjectOf of the category
"Dance music"), and (2) the property broaderOf, which
connects categories in a hierarchical manner (e.g. the category
"Dance" is a broaderOf of the category "Dance music", which
is a broaderOf of the category "Ballroom dance music").</p>
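      <p>As an illustration, the two DBpedia properties can be modeled as adjacency sets. The page names below are the paper's own examples, while the dictionaries and the successors helper are our assumptions for illustration, not actual DBpedia data:

```python
# subject_of[article] = categories the article is a "subjectOf"
# (illustrative data, not pulled from DBpedia)
subject_of = {"Samba": {"Dance music"}}

# broader_of[category] = narrower categories it is a "broaderOf"
broader_of = {
    "Dance": {"Dance music"},
    "Dance music": {"Ballroom dance music"},
}

def successors(category):
    """Direct descendants of a category: the articles that are a
    subjectOf of it, plus the categories it is a broaderOf of."""
    articles = {a for a, cats in subject_of.items() if category in cats}
    return articles | broader_of.get(category, set())

# successors("Dance music") yields {"Samba", "Ballroom dance music"}
```
</p>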
      <p>We start from the seed category "Music" and explore its
neighbourhood from the top down, checking whether
connected categories can be considered relevant to the music
domain. A category is considered relevant if it satisfies either
of the two following conditions:</p>
      <p>(1) It is a tag from the folksonomy, such as for example
"Rock and Roll" (this condition will be referred to as
isMusical);
(2) At least one of its "descendants" is a tag from the
folksonomy, and the substring "music" is included in the
title or the abstract of the corresponding Wikipedia
article (this condition is further referred to as
isTextMusical).</p>
      <p>The "descendants" of a category are fetched from
DBpedia using the two connecting properties described previously.
These descendants can be either "successors" (i.e. all direct
subjectOf and broaderOf of this category), or successors of
successors, and so on. This iterative search is limited by a
maximum depth, empirically fixed to a value of 4. Indeed,
experiments with smaller values yielded a significant
reduction of the tag coverage, while experiments with greater
values did not increase the coverage significantly.</p>
      <p>If either of the previous conditions is satisfied, the
category, its successors and their edges are added to the
network. Otherwise, the category and all its incident edges are
removed. The algorithm proceeds iteratively (following a
Breadth-First search approach) until no more categories can
be visited. A summarized version of the method for
obtaining a music-related network is given in Algorithm 1.</p>
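      <p>As a rough sketch (our reconstruction, not the authors' code), the pruning loop can be written as follows; the predicate and helper names follow the paper's terminology:

```python
from collections import deque

def has_tag_descendant(c, successors, is_tag, max_depth=4):
    """Depth-limited search for a folksonomy tag among the
    descendants of c (the paper empirically fixes the depth to 4)."""
    frontier, seen = {c}, {c}
    for _ in range(max_depth):
        frontier = {s for n in frontier for s in successors(n)} - seen
        if any(is_tag(s) for s in frontier):
            return True
        seen |= frontier
    return False

def build_music_network(successors, is_tag, is_text_musical, seed="Music"):
    """Breadth-first construction: keep a category if it is itself a
    tag (isMusical) or has a tag descendant and music-related text
    (isTextMusical); otherwise drop it and its incident edges."""
    nodes, edges = set(), set()
    queue, enqueued = deque([seed]), {seed}
    while queue:
        c = queue.popleft()
        if is_tag(c) or (has_tag_descendant(c, successors, is_tag)
                         and is_text_musical(c)):
            succ = successors(c)
            nodes |= {c} | succ
            edges |= {(c, s) for s in succ}
            for s in succ - enqueued:
                enqueued.add(s)
                queue.append(s)
        else:
            # category pruned together with all its incident edges
            nodes.discard(c)
            edges = {(a, b) for (a, b) in edges if c not in (a, b)}
    return nodes, edges
```
</p>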
    </sec>
    <sec id="sec-5">
      <title>3.2 Finding Relevant Facets</title>
      <p>Once the network of music-related categories is built, the
next step is to find the nodes that are potentially more
relevant to the network than others.</p>
      <p>We invert the direction of the edges of the network in
order to point back in the direction of the most generic
category, i.e. "Music", and we compute the PageRank of the
resulting network. PageRank [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] is a link analysis algorithm
that measures the relative relevance of all nodes in a
network. In PageRank, each node issues a relevance
vote on all nodes to which it points (hence the need for
reorienting the edges). The weight of the vote depends on the
relevance of the voting node (i.e. relevant nodes issue more
authoritative votes). The process runs iteratively, and
(under certain conditions) converges to a stable relative ranking,
where nodes on which more edges from other relevant nodes
converge (directly or indirectly) are considered more
relevant. For initializing the PageRank algorithm, we set the
initial weight of each node to 0.</p>
      <p>Data: C = ∅, a list of categories (a queue, initially empty);
N = (V, E), a directed network with a set of nodes V and a set of edges E (initially empty);
Result: N, network with music nodes;
C ← C ∪ {"Music"};
while C ≠ ∅ do
  c ← first element of C;
  C ← C ∖ {c};
  if (c isMusical) ∨ ((at least one descendant of c isMusical) ∧ (c isTextMusical)) then
    V ← V ∪ {c} ∪ successors(c);
    E ← E ∪ {edges between c and successors(c)};
    C ← C ∪ successors(c);
  else
    V ← V ∖ {c};
    E ← E ∖ {all edges incident in c};
  end
end
Algorithm 1: Pseudo-code for the creation of a network
of music-related categories from Wikipedia.</p>
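      <p>PageRank itself can be sketched with plain power iteration. The paper initializes node weights to 0 and does not report a damping factor, so the uniform initialization and the 0.85 damping below are standard defaults, not the authors' settings:

```python
def pagerank(edges, nodes, damping=0.85, iters=50):
    """Power-iteration PageRank over a directed graph whose edges
    already point toward the generic "Music" root."""
    out = {v: [] for v in nodes}
    for src, dst in edges:
        out[src].append(dst)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        new = {v: (1 - damping) / n for v in nodes}
        for src, targets in out.items():
            if targets:
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    new[dst] += share
            else:  # dangling node: spread its rank uniformly
                for v in nodes:
                    new[v] += damping * rank[src] / n
        rank = new
    return rank
```
</p>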
      <p>In order to capture general yet complementary facets of
music, we aim at reducing semantic overlap as much as
possible by applying the following filters:
Stub Filter: We remove all categories containing the substrings " by "
or " from ". We noticed that many categories in
Wikipedia are actually combinations of two more
general categories, as for instance "Musicians by genres",
which is halfway between "Musicians" and "Music genres"
(see also figure 1). Further, we also remove categories
that include " music(al) groups" (e.g. "Musical
groups from California", which has hundreds of connected
categories, hence a high PageRank). Most of these
categories are used as stubs, sometimes even explicitly, so
we also excluded categories containing the word "stub".
Over-Specialization Filter: We exclude all categories that
lexically include a more relevant category. Many
relevant categories are specializations of other, more
relevant ones; this occurs mostly with concepts related
to anglophone music, which are described in great
detail in Wikipedia (e.g. "American Musicians" includes
"Musicians", which has a higher PageRank).</p>
      <p>Tag Filter: We remove all categories that are tags. Our
objective is to uncover music facets that are implicit
to the tags that make up a folksonomy. In general, tags
are elements of such facets, not the facets themselves.</p>
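      <p>A rough sketch of the three filters, with substring checks approximating our reading of the published rules (the category names and scores used below are illustrative):

```python
def passes_filters(category, pagerank, tags):
    """Return True if a candidate facet survives the Stub,
    Over-Specialization and Tag filters."""
    name = category.lower()
    # Stub filter: combination categories ("X by Y", "X from Y"),
    # "music(al) groups" categories, and explicit stubs.
    if (" by " in name or " from " in name or "stub" in name
            or "musical groups" in name or "music groups" in name):
        return False
    # Over-specialization filter: drop categories that lexically
    # include a more relevant (higher-PageRank) category.
    for other, score in pagerank.items():
        if (other != category and other.lower() in name
                and score > pagerank.get(category, 0.0)):
            return False
    # Tag filter: facets should not themselves be folksonomy tags.
    return category not in tags
```
</p>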
    </sec>
    <sec id="sec-6">
      <title>4. RESULTS</title>
      <p>We tested our method on a large dataset of artist
tags, gathered from Last.fm during April 2010. The dataset
consists of around 600,000 artists and 416,159 distinct tags.
This dataset was cleaned in order to remove noisy/irrelevant
data: (1) tags were edited in order to remove special
characters such as spaces, etc.; (2) tags were filtered by weight
(i.e. the Last.fm "relevance weight", which goes from 0 to 100),
keeping only tags with a weight ≥ 1; and (3) tags were
filtered by popularity, keeping only tags with popularity ≥ 10,
i.e. tags that were assigned to at least 10
artists. As a result, the final dataset consists of 582,502
artists and 39,953 distinct tags, with an average of 9.03 tags per artist.</p>
      <p>After running both stages of our method, we obtained a
list of 333 candidate facets. Table 1 contains the top-50
facets, ordered by PageRank (top to bottom, left to right).</p>
    </sec>
    <sec id="sec-7">
      <title>4.1 Assigning facets to tags</title>
      <p>In order to assign a set of facets to a given Last.fm tag, we
process the subnetwork of Wikipedia pages specialized to the
Last.fm folksonomy (obtained in section 3.1), as described
in Algorithm 2. (Note that this process is restricted to tags
that can be matched to one of the nodes in the network.)</p>
      <p>Data: C = ∅, a list of categories (initially empty);
F, a list of top-N music facets; t, a Last.fm tag;
Result: TF, list of facets assigned to tag t;
iter ← 1;
TF ← ∅;
while (F ≠ ∅) ∧ (iter ≤ maxIter) do
  C ← C ∪ predecessors(t);
  if ∃f ∈ (F ∩ C) then
    TF ← TF ∪ {f};
    F ← F ∖ {f};
  end
  iter ← iter + 1;
end
Algorithm 2: Pseudo-code for assigning Wikipedia facets
to Last.fm tags.</p>
      <p>Given a Last.fm tag t, we look at its "predecessor"
categories c, or more formally:
predecessors(t) = {c | (t ∈ broaderOf(c)) ∨ (t ∈ subjectOf(c))}.
If any of these predecessors is a top-N facet, it is then
assigned to t. The process continues iteratively until no more
facets can be assigned to the tag, or a maximum number
of iterations (maxIter) is exceeded. We empirically set this
value to 8. This condition can be interpreted as the
maximum distance in the network between a tag and a facet.</p>
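      <p>The assignment loop can be sketched as follows. This sketch expands predecessors transitively, matching the interpretation of maxIter as a maximum tag-to-facet distance; the link data in the usage below is hypothetical, not taken from the paper's figure:

```python
def assign_facets(tag, predecessors, facets, max_iter=8):
    """Collect every top-N facet reachable from `tag` through
    predecessor categories within max_iter steps."""
    assigned, remaining = [], set(facets)
    frontier, seen = {tag}, {tag}
    for _ in range(max_iter):
        if not remaining or not frontier:
            break  # nothing left to assign, or network exhausted
        frontier = {p for n in frontier for p in predecessors(n)} - seen
        seen |= frontier
        for f in frontier & remaining:
            assigned.append(f)
            remaining.discard(f)
    return assigned
```
</p>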
      <p>Table 2 presents a small subset of the obtained facets,
followed by a subset of their corresponding lists of top tags.
Top tags are chosen based on their distance (in number of
successive edges in the music network) to the given facet.</p>
      <p>The relevance R_tf of a music facet f to a tag t is
computed as the normalized inverse distance d_tf (in number of
successive edges) between t and f:
R_tf = (1/d_tf) / Σ_i (1/d_ti)</p>
      <p>For example, in figure 1, given the tag bulgarian hip-hop,
our method starts navigating through the predecessors of
this tag until finally reaching two music facets, Music genres
and Music geography:
bulgarian hip-hop: {(Music_genres, 0.4), (Music_geography, 0.6)}</p>
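      <p>The normalization can be checked numerically. The distances below (3 edges to Music genres, 2 to Music geography) are assumed values chosen to reproduce the example above, not figures reported in the paper:

```python
def facet_relevance(distances):
    """R_tf = (1/d_tf) / sum_i (1/d_ti): relevance of each facet f
    to a tag t, from the tag-to-facet edge distances d_tf."""
    total = sum(1.0 / d for d in distances.values())
    return {f: (1.0 / d) / total for f, d in distances.items()}

# assumed distances, chosen so the result matches the paper's example
rel = facet_relevance({"Music_genres": 3, "Music_geography": 2})
# rel["Music_genres"] ≈ 0.4, rel["Music_geography"] ≈ 0.6
```
</p>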
    </sec>
    <sec id="sec-8">
      <title>5. SUMMARY AND FUTURE WORK</title>
      <p>Although potentially more complete and up-to-date than
taxonomies, music folksonomies lack structured categories,
an aspect particularly relevant to browsing and
recommendation. In this paper, we addressed the problem of uncovering
the underlying semantic facets of the Last.fm folksonomy,
using Wikipedia as a backbone of semi-structured semantic
categories.</p>
      <p>There are many avenues for future work. First and
foremost, we intend to evaluate the relevance of the obtained
facets via systematic evaluations of tag classification. We
will also study the distributions of music facets with respect
to artist popularity. Further work should also address
evaluating the usefulness of the obtained facets in a number of
tasks, such as music recommendation or tag expansion. We
also intend to release the data (and the code used to obtain it)
in order to stimulate its use by fellow researchers.</p>
    </sec>
    <sec id="sec-9">
      <title>6. ACKNOWLEDGMENTS</title>
      <p>Thanks to Oscar Celma (BMAT), Eduarda Mendes
Rodrigues (FEUP) and the anonymous reviewers for useful
comments. This work was partly supported by the Ministerio
de Educación in Spain, and by the Fundação para a Ciência e
a Tecnologia (FCT) and a QREN-AdI grant for the project
Palco3.0/3121 in Portugal.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>T.</given-names>
            <surname>Bertin-Mahieux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Eck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Maillet</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Lamere</surname>
          </string-name>
          .
          <article-title>Autotagger: A model for predicting social tags from acoustic features on large music databases</article-title>
          .
          <source>JNMR</source>
          ,
          <volume>37</volume>
          (
          <issue>2</issue>
          ):
          <fpage>115</fpage>
          -
          <lpage>135</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>P.</given-names>
            <surname>Cano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Koppenberger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Herrera</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Celma</surname>
          </string-name>
          , and
          <string-name>
            <given-names>V.</given-names>
            <surname>Tarasov</surname>
          </string-name>
          .
          <article-title>Sound effect taxonomy management in production environments</article-title>
          .
          <source>In AES</source>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>O.</given-names>
            <surname>Celma</surname>
          </string-name>
          .
          <article-title>Music Recommendation and Discovery - The Long Tail, Long Fail, and Long Play in the Digital Music Space</article-title>
          . Springer,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>P.</given-names>
            <surname>Lamere</surname>
          </string-name>
          .
          <article-title>Social tagging and Music Information Retrieval</article-title>
          . JNMR,
          <volume>37</volume>
          (
          <issue>2</issue>
          ):
          <fpage>101</fpage>
          -
          <lpage>114</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M.</given-names>
            <surname>Levy</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Sandler</surname>
          </string-name>
          .
          <article-title>Learning latent semantic models for music from social tags</article-title>
          .
          <source>JNMR</source>
          ,
          <volume>37</volume>
          (
          <issue>2</issue>
          ):
          <fpage>137</fpage>
          -
          <lpage>150</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>F.</given-names>
            <surname>Pachet</surname>
          </string-name>
          and
          <string-name>
            <given-names>D.</given-names>
            <surname>Cazaly</surname>
          </string-name>
          .
          <article-title>A taxonomy of musical genres</article-title>
          .
          <source>In RIAO</source>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>L.</given-names>
            <surname>Page</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Brin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Motwani</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Winograd</surname>
          </string-name>
          .
          <article-title>The PageRank citation ranking: Bringing order to the web</article-title>
          .
          <source>Technical report</source>
          , Stanford InfoLab,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>L.</given-names>
            <surname>Sarmento</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Gouyon</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E.</given-names>
            <surname>Oliveira</surname>
          </string-name>
          .
          <article-title>Music artist tag propagation with wikipedia abstracts</article-title>
          .
          <source>In ECIR-WIRSN</source>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>D.</given-names>
            <surname>Turnbull</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Barrington</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Torres</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>Lanckriet</surname>
          </string-name>
          .
          <article-title>Semantic annotation and retrieval of music and sound effects</article-title>
          .
          <source>IEEE TASLP</source>
          ,
          <volume>16</volume>
          (
          <issue>2</issue>
          ):
          <fpage>467</fpage>
          -
          <lpage>476</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>