<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>The Dynamics and Semantics of Collaborative Tagging</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Harry Halpin</string-name>
          <email>H.Halpin@ed.ac.uk</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Valentin Robu</string-name>
          <email>robu@cwi.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hana Shepherd</string-name>
          <email>hshepher@princeton.edu</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dutch Center for Mathematics</institution>
          ,
          <addr-line>and Computer Science, Kruislaan 413, Amsterdam</addr-line>
          ,
          <country country="NL">Netherlands</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Princeton University</institution>
          ,
          <addr-line>Wallace Hall, Princeton, NJ</addr-line>
          <country country="US">USA</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Edinburgh</institution>
          ,
          <addr-line>2 Buccleuch Place, Edinburgh, Scotland</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>The debate within the Web community over the optimal means by which to organize information often pits formalized classifications against distributed collaborative tagging systems. A number of questions remain unanswered, however, regarding the nature of collaborative tagging systems including the dynamics of such systems and whether coherent classification schemes can emerge from undirected tagging by users. Currently millions of users are using collaborative tagging without centrally organizing principles, and many suspect this exhibits features considered to be indicative of a complex system. If this is the case, it remains to be seem whether collaborative tagging by users over time leads to emergent classification schemes that could be formalized into an ontology usable by the Semantic Web. This paper uses data from “popular” tagged sites on the social bookmarking site del.icio.us to examine the dynamics of such collaborative tagging systems. In particular, we are trying to determine whether the distribution of tag frequencies stabilizes, which indicates a degree of cohesion or consensus among users about the optimal tags to describe particular sites. We use tag co-occurrence networks for a sample domain of tags to analyze the meaning of particular tags given their relationship to other tags and automatically create an ontology. We also produce a generative model of collaborative tagging in order to model and understand some of the basic dynamics behind the process.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION 1. 1.1</title>
    </sec>
    <sec id="sec-2">
      <title>Folksonomies and Ontologies</title>
      <p>
        The issue of how metadata on web resources should be generated
for the greatest efficiency and efficacy continues to be a central
debate. A small but increasingly influential set of websites, including
the social bookmarking site del.ici.ous, Flickr, Furl, Rojo,
Connotea, Technorati, and Amazon allow users to “tag” objects with
keywords to facilitate retrieval both for the user and for other users.
Their categories are based on the set of tags that are used to
characterize some resource, and these categories are commonly referred
to as “folksonomies.” This approach to organizing online
information is usually contrasted with formal ontologies that are imposed
by experts, not by users [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
      </p>
      <p>
        There are both benefits and drawbacks to the tagging approach.
Tagging is considered a categorization process in contrast to a
preoptimized classification process, as exemplified by expert-created
Semantic Web ontologies. Jacob defines the distinction between
these two processes in the following way: “Categorization divides
the world of experience into groups or categories whose members
share some perceptible similarity within a given context. That this
context may vary and with it the composition of the category is the
very basis for both the flexibility and the power of cognitive
categorization” while “classification as process involves the orderly
and systematic assignment of each entity to one and only one class
within a system of mutually exclusive and non-overlapping classes;
it mandates consistent application of these principles within the
framework of a prescribed ordering of reality”[
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Tagging systems
allow much greater malleability and adaptability in organizing
information than do formal classification systems. Shirky explains
the pitfalls of imposed classification: “If you’ve got a large,
illdefined corpus, if you’ve got naive users, if your cataloguers aren’t
expert, if there’s no one to say authoritatively what’s going on, then
ontology is going to be a bad strategy”[
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. Proponents of tagging
systems argue that “Groups of users do not have to agree on a
hierarchy of tags or detailed taxonomy, they only need to agree, in a
general sense, on the ‘meaning’ of a tag enough to label similar
material with terms for there to be cooperation and shared value.”[
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
Tagging is able retrieve the data and share data more efficiently
than classifying: “Free typing loose associations is just a lot easier
than making a decision about the degree of match to a pre-defined
category (especially hierarchical ones). It’s like 90% of the value
of a proper taxonomy but 10 times simpler.” [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. However, a
number of problems stem from organizing information through tagging
systems including ambiguity in the meaning of tags, the use of
synonyms which creates informational redundancy, and the possibility
of idiosyncratic naming conventions where individuals string
together many words or label items according to their personal utility,
such as tagging a bookmarked site with “toread.” These drawbacks
are serious in that they have the ability to jeopardize the coherence
of the informational content of the tagging system and render
tagging systems less useful for groups of users.
      </p>
      <p>Given the debate over the utility of collaborative tagging
systems compared to other methods of organizing information, it is
increasingly important to understand whether a coherent and socially
navigable way of organizing metadata can emerge from distributive
tagging systems and if so, how this might occur and whether
particular features of sites using tagging facilitate or inhibit the
emergence of coherence. This paper will empirically examine elements
of two subsidiary issues of this larger project. In Section 4 we
examine the dynamics of tag frequency in “popular” del.icio.us tags
in order to detect whether the tag frequencies converge to a
stable distribution and thus a categorization scheme. There is hope
among the proponents of collaborative tagging systems that a
stable sort of distribution might arise from these systems. Note that by
“stable” we do not mean that users stop tagging the resource, but
that the tagging eventually settles to a group of tags that describe
the resource well and new users mostly reinforce already present
tags in the same frequency as they are given in the stable
distribution. Online tagging systems have a variety of features that are
often associated with complex systems such as a large number of
users, a lack of central coordination, and non-linear dynamics and
these sort of systems are known to produce a type of distribution
known as a “power-law.” In Section 5 we examine how the
information content of particular tags in relation to one another might
be used to extract a classification scheme (ontology) from a
categorization scheme (folksonomy). We present in detail some empirical
work on the the first topic, and then more hypothetical work on the
second.
1.2</p>
    </sec>
    <sec id="sec-3">
      <title>Dynamics of Tagging</title>
      <p>
        What are the underlying dynamics that could cause tagging to
reach some point of stability such that the distribution of tags
converge? Researchers have observed, some casually, some more
rigorously, that the distribution of tags applied to particular URLs in
tagging systems follows a power law distribution where there are a
relatively small number of tags that are used with great frequency
and a great number of tags that are used infrequently [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. Work by
Golder and Huberman using del.ici.ous data has noted a number of
patterns in tagging dynamics. The majority of sites reach their peak
popularity, the highest frequency of tagging in a given time period,
within 10 days of being saved on del.icio.us (67% in the data set
of Golder and Huberman) though some sites are “rediscovered” by
users (about 17% in their data set), suggesting stability in most sites
but some degree of “burstiness” in the dynamics [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Most
importantly for this paper, Golder and Huberman find that the proportion
of frequencies of tags within a given site stabilize over time; they
find it occurs usually after around being bookmarked 100 times [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>
        To make inferences about the existence of some sort of structure
in the distribution of tag frequencies, we need to understand the
information inherent in the tags based on calculating the frequencies
with which particular tags co-occur with other tags. Again, a
number of critical questions remain regarding the informational value
of tags used. By “informational value” we mean whatever
information is conveyed by the natural language term used in the tag and
how this makes the tag useful or not. Since the “meaning” of tags
is elusive, one way to model their informational value is to look
at their co-occurrence with other tags, and to try to answer
questions about how these co-occurrence models reflect the
informational value of particular tags: Does the structure of tag networks
based on co-occurrence make intuitive sense, doing justice to the
common-sense ideas we have about the relationships between the
concepts under scrutiny? Can tagging provide users with any new
insight into the meaning of resources just by analyzing the
structure of networks based on co-occurrence? Shen and Wu analyze the
structure of a tagging network for del.icio.us data as we do in
Section 5, although unlike in our examples their graph is unweighted
[
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. They examine the degree distribution (the distribution of the
number of other nodes each node is connected to) and the
clustering coefficient (based on a ratio of the total number of edges in a
subgraph to the number of all possible edges) of this network and
find that the network is scale free and has the features Watts and
Strogatz found to characterize small world networks: small
average path length and relatively high clustering coefficient [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. A
large amount of work exploring the structural properties of nature
language networks finds similar results [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>
        The dynamics of tagging systems are closely coupled to the
informational value of tags. Golder and Huberman cite two important
features of such collaborative tagging systems that might give rise
to this type of stability: imitation of others and shared knowledge
[
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. One of the specific features of del.icio.us is the inclusion of
“most common tags” for a given site when a user saves that site,
facilitating the use of the tags others have used with the greatest
frequency. They explain the stability of the less common tags, which
are not displayed for users when they save a site, based on a shared
background and set of assumptions among users. Given that the
stability of tag frequencies presumably relies on both the
interaction between users (imitation) and the shared cultural knowledge
of users, the stability and patterns of tag frequencies might lend
insight into the degree to which there is consensus within a
community about how to characterize some site or into whether there
are different groups of users with different sets of assumptions and
who are tagging the same site. Or, as Golder and Huberman
suggest, changes in the stability of such patterns might suggest that
groups of users are migrating away from a particular consensus on
how to characterize a site and its content or negotiating the
changing meaning of that site. To the extent this consensus is stable, it is
ripe for development into a classification system and formalization
into ontology.
1.3
      </p>
    </sec>
    <sec id="sec-4">
      <title>Ontologies and Observed Patterns</title>
      <p>
        Merholz uses the metaphor of “desire lines” for tagging systems;
these “are the foot-worn paths that sometimes appear in a landscape
over time” such that “a smart landscape designer will let
wanderers create paths through use, and then pave the emerging
walkways, ensuring optimal utility” [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. This metaphor points towards
a way of developing ontologies for the Semantic Web that
maintains the advantages of both taxonomic classification and
collaborative tagging, bridging the two sides of the debate about
organizing metadata. After users have explored the space of possibilities
and discovered some optimum categorization, an ontology could be
formalized for classification purposes. Avoiding pre-optimization,
a user-optimized ontology would take advantage of the often
unexpected ways users categorize data, yet provide the amount of
classificatory power provided by a smaller set of terms that can
then be mapped to a Semantic Web ontology capable of expressing
structured data facets, complex relationships, and scaling across the
Web, which current collaborative tagging systems are incapable of
doing. It is possible that in order to share data effectively users as
a group, naturally and without external influence restricting their
vocabulary, converge to tagging each URI with a fairly small set of
semantically distinct tags.
      </p>
      <p>
        Is it possible that such a classification structure can be detected?
While some have claimed that it is not possible since the
“responsiveness and flexibility” of user categorization “effectively prohibit
the establishment of meaningful relationships” because they are
“fleeting and ephemeral,” there are a number of other cases where
complex structure emerges from simple behavior [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. What are the
types of local rules users might be employing which generate these
observed aggregate patterns and can they be described
mathematically? The paradigmatic case of local rules generating structure is
natural language itself, where “one of the key questions to
understand [is] how a communication system can arise...how distributed
agents without a central authority and without prior specification
can nevertheless arrive at a sufficiently shared language
conventions to make communication possible” [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ].
2.
      </p>
    </sec>
    <sec id="sec-5">
      <title>THE TRIPARTITE STRUCTURE OF TAG</title>
    </sec>
    <sec id="sec-6">
      <title>GING</title>
      <p>
        To begin, we need a conceptual model to describe generic
collaborative tagging systems which is capable of being formalized
so that we can both make predictions about collaborative tagging
systems based on empirical data and based on generative features
of the model. A well-accepted tripartite model has already been
theorized [
        <xref ref-type="bibr" rid="ref10 ref13">10, 13</xref>
        ], although we hope to clarify it below:
There are three main entities that compose any tagging system:
The users of the system (people who actually do the tagging)
      </p>
      <sec id="sec-6-1">
        <title>The tags themselves</title>
      </sec>
      <sec id="sec-6-2">
        <title>The resources being tagged (in this case, the websites)</title>
        <p>Each of these can be seen as forming separate spaces consisting
of sets of vertices, which are linked together by edges (see Fig. 1).
The first space, the user space, consists of the set of all users of the
tagging system, where each vertex is a user. The second space is
the tag space, the set of all tags, where a tag corresponds to a term
(“music”) or neologism (“toread”) in natural language. The third
space is the resource space, the set of all resources, where each
resource is normally denoted by a unique URI.1 A tagging instance
can be seen as the two edges that links together a user to a tag and
then that tag to a given website or resource. Note that a tagging
instance can associate a date with its tuple of a user, a tag(s), and a
resource.</p>
        <p>From the above model and Fig.1, we observe that tags provide
the link between the users of the system and the resources or
concepts they search for.</p>
        <p>In particular, this analysis reveals a number of dimensions of
tagging that are often under-emphasized. In particular, tagging is a
methodology for information retrieval, much like traditional search
engines, but with a number of key differences. To simplify
drastically, with a traditional search engine a user enters a number of tags
and then an automatic algorithm labels the resources with some
measure of relevancy to the tags pre-discovery, displaying relevant
resources to the user. In contrast, with collaborative tagging a user
finds a resource, then adds one or more tags to the resource
manually, with a system storing the resource and the tags post-discovery.
When faced with a case of retrieval, an automatic algorithm does
not have to assign tags to the resource automatically, but can
follow the tags used by the user. The difference between this and
traditional searching algorithms is two-fold: collaborative tagging
relies on human knowledge, as opposed to an algorithm, to directly
1A “Universal Resource Identifier” such as
http://www.example.com that can return a web-page when
accessed. Notice that some tagging based systems such as Spurl
(http://www.spurl.net) store the entire document, not the URI, but
most systems such as del.icio.us store only the URI. Regardless,
our resource space is whatever is being tagged.
connect terms to documents before a search begins, and so relies on
the collective intelligence of its human users to pre-filter the search
results for relevancy. When a search is complete and a resource
of interest is found, collaborative tagging often requires the user to
in turn “tag” the resource in order to store the result in his or her
personal collection. This causes a feedback cycle. These
characteristics motivate many systems like del.icio.us and it is well-known
that feedback cycles are one ingredient of complex systems, giving
further indication that a power-law in the tagging distribution might
emerge. However, before going further we need to formalize these
qualitative observations about collaborative tagging.
3.</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>A GENERATIVE MODEL</title>
      <p>
        Our model needs to combine the three-level model of tagging
presented above with the manner in which feedback cycles and
informational value give rise to a stable distribution of tags over time.
The notion of a feedback cycle is encapsulated in the simple idea
that a tag that has already been used is likely to repeated. This
behavior is a clear example of preferential attachment, known
popularly as a “rich get richer” model. To model this phenomena, we
need to have a baseline probability P (a), or the probability of a
user committing a “tagging action.” This is the probability that for
every time step t, a “tag” is added to a resource. There are very few
empirical studies that estimate this parameter currently.
Additionally, since users often tag more than once, there is P (n) that
determines the number (n) of tags a user is likely to add at once based
on the distribution of the number of tags a given user employs in
a single tagging action. As reported by other studies, this number
varies between two and ten [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], although we will hold n = 1 in
order to simplify our exposition. Once a tagging action (P (a)) has
been done, a preferential attachment model can be formalized by
use of a simple “shuffling theory” model [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. This model holds that
an “old tag” is reinforced with constant probability P (o), so a “new
tag” is added with probability 1 P (o). If the old tag is added, it
is added with a probability PRR(x()i) , where R(x) is the number of
times that particular previous tag x has been chosen in the past and
P R(i) is the total sum of all previous tags. This leads to tags that
have been heavily reinforced in the past being further reinforced in
the future.
      </p>
      <p>We illustrate this with a simple example, as given by Figure 2,
where P (tag) is P (o) and assuming for simplification P (a) = 1.
Also, we will have a user only add one new tag per time step. At
time step 1 in our example, the user has no choice but to add a
new tag, “piano” to the page. At the next stage, the user does
not reinforce a new tag but chooses a new tag, “music”, and so
P (piano) = 12 and P (music) = 21 . At t = 3, the user
reinforces a previous “piano” tag and so P (piano) increases to 23 ,
while P (music) decreases to 31 . At t = 4, a new tag is chosen
(“digital”), and so P (piano) goes up while P (music) decreases
to 41 and P (digital) is 41 . Taken to its conclusion, this process
produces a “power-law” distribution.</p>
      <p>Preferential attachment models do not explain why a particular
new tag is added to a resource; in practice, tags are not added at
random because their informational value is taken into account. For
example, the oldest tags for a resource are not always the most
popular tags. A new tag may be added that uncovers an informational
dimension not captured by older tags, and if this new dimension
proves both relevant and useful then other users will reinforce the
tag that represents the dimension, perhaps at the expense of older
tags with less relevant informational dimensions. In this case, the
new relevant tag would experience a burst of reinforcement,
perhaps surmounting the frequency with which older tags were used</p>
      <sec id="sec-7-1">
        <title>P(piano) = 1/2</title>
      </sec>
      <sec id="sec-7-2">
        <title>P(music) = 1/4</title>
      </sec>
      <sec id="sec-7-3">
        <title>P(digital) = 1/4</title>
        <p>t=4
P(piano)
=1.0</p>
        <p>P(tag)
1-P(tag)
tag=music
tag=piano</p>
        <p>tag=digital
P(piano)
= 1/2
P(music)
= 1/2</p>
        <p>P(piano)
= 2/3
P(music)
= 1/3
t=1
t=2
t=3
and eventually stabilizing towards the top of the tag distribution for
a resource. The entire tagging process might be considered an
“exploration” versus “exploitation” process where the exploration of
possibly relevant dimensions of a resource is balanced with the
exploitation of previously tagged dimensions of a resource. A
stabilized distribution theoretically represents a state where the optimal
number of dimensions have been tagged.</p>
        <p>While it is impossible for a generic model to assign a priori the
exact informational value of a resource, it is possible to at least
partially model the informational value of a specific tag. A
hypothetical tag applied to every relevant resource would, if used in a search
by a user to discover resources, retrieve every document (imagine
a tag such as “website,” but used once by at least one user on
every resource). This type of tag has an informational value (I) of 0,
and we assume that the informational value of a tag that retrieves
no resources is also 0. Another tag that hypothetically selects only
the resource needed, would have have an informational value (I)
of 1. This does not occur so precisely in practice, as users
presumably want the optimal tag to return some cognitively appropriate
(k) number of resources, such as the number of resources that fit
on the screen or that allow users to effectively browse an area, and
this may vary per user. However, for the purposes of our model we
will assume that k = 1 when quantifying informational value to
simplify our exposition. Notice also that a user may use multiple
tags and these tag combinations may have different informational
values that are not additive. In our work with del.icio.us, we can
empirically estimate the informational value of a tag by retrieving
the number of web-pages a del.icio.us search with a tag (or
combination of tags) returns and converting it into a probability, as done
in Section 5.</p>
        <p>
          In order to explain tight binding between information retrieval
and value, we show an abstract example in Figure 3. In this
example the act of “tagging” by a user (ux) can be considered the
assignment of a tag (ty) to a given resource (rz). Thus, a given
search can be considered a transversal from ux via a number of
tags to a number of resources. The user wishes to minimize the
number of tags needed to retrieve the relevant resources, which is
unknown to both the system and the user. Following Zipf’s famous
“Principle of Least Effort,” users presumably minimize the number
of tags used. [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ]. In our example the user u2 wishes to use a group
of tags to discover a relevant resource, which an oracle would tell
us is r2. While tag t1 and t5 retrieve exactly one resource I(t1)
and I(t5) = 1, these tags do not identify r2. I(t3) = 0, since it
retrieves all resources in the data-set. While I(t2) and I(t4) &gt; I(t3),
the combination of both tags retrieve exactly the resource r2 in our
example so I(t3; t2) = 1 &gt; I(t2) and I(t3). Notice that
informational value is not additive, since I(t1; t5) = 0 while both I(t1)
and I(t5) = 1.
        </p>
        <p>u1
u2
u3
u4
u5
u6
t1
t2
t3
t4
t5
r1
r2
r3
r4
USERS</p>
        <p>TAGS</p>
        <p>RESOURCES</p>
        <p>If the user is satisfied with the search results and wishes to add
a retrieved resource to their personal collection, they will reinforce
one of the existing tags of the resource by repeating one of the
preexisting tags, and they might also add a new tag. If the user is not
satisfied with the search results, they will likely add a new tag to
a retrieved resource. This tag may allow them to use fewer tags in
future searches to retrieve the same resource. Thus, if we linearly
combine our two models of informational value and preferential
attachment, we can generate the probability of a tag x being
reinforced or added as a linear interpolation of preferential attachment
and information value, with being used to weigh the factors:
P (x) = P (I(x)) + (1 ) P (a) P (o) P ( PRR(x()i) )
This formalizes a process that would give rise to a power-law
via preferential attachment, but one where the informational value
of a tag additionally figures into the dynamics of the tagging
distribution. This model as it stands is heavily parameterized, where
the values of the parameters no doubt vary from one tagging
system to another. We are in process of collecting enough empirical
data from del.icio.us to provide estimations of the model
parameters such that we could compare model-generated results to
empirical distributions. However, first we need to determine whether a
power law actually arises from empirical data.
4.</p>
      </sec>
    </sec>
    <sec id="sec-8">
      <title>THE EMERGENCE OF POWER LAWS</title>
    </sec>
    <sec id="sec-9">
      <title>FROM TAG DISTRIBUTIONS</title>
      <p>According to our model, there should be a connection between
the relative rank of the tag for a given resource (defined by the
ordering of the number of users who used that tag to mark the
website), and the frequency of use of the tag for that particular resource.
If our qualitative intuition about tagging systems as complex
systems is correct, we hypothesize that this distribution should follow
power laws. We consider the distribution of data from a subset
of 100 heavily tagged sites (defined as those that were tagged over
1000 times) and we present the results of power law interpolation
on this data. Finally, we discuss some of the reasons the observed
distributions could emerge, based on the tagging behavior of
individual users.
4.1</p>
    </sec>
    <sec id="sec-10">
      <title>Power Law Distributions: Definition</title>
      <p>A power law is a relationship between two scalar quantities x
and y of the form:</p>
      <p>y = cx
Where and c are constants characterizing the given power law.
Without loss of generality, Eq. 1 can also be written as:
log y =</p>
      <p>log x + log c</p>
      <p>When written in this form, a fundamental property of power laws
becomes apparent– when plotted in log-log space, power laws
represent straight lines. Therefore, the easiest way to check whether
a distribution follows a power law is to apply a logarithmic
transformation, and then use linear interpolation of the data points to
determine the parameters and c.</p>
      <p>In our tagging domain, the intuitive explanation of the above
parameters is that c represents the number of times the most chosen
tag for that website is used, while gives the power law decay
parameter for the frequency of tags at subsequent positions. Thus, the
number of times the tag in position p is used (for p=1 to 25) should
be approximated by a function of the form (where &gt; 0):</p>
      <p>F requency(p = 1)
F requency(p) =
p
(3)
4.2</p>
    </sec>
    <sec id="sec-11">
      <title>Results from Considered Data Set</title>
      <p>As discussed above, in our experiments we considered a test set
of 100 “popular” sites, defined as those that were tagged at least
1,000 times (the most heavily tagged sites in the considered data set
were tagged over 30,000 times). For each website, we considered
in our analysis the most heavily used 25 tags.</p>
      <p>For this distribution, we first applied a transformation to a
loglog scale and then we linearly interpolated the resulting data points
(this was done individually for each site, though in Fig. 4 we show
only the actual data, not the interpolated functions in order to
preserve the clarity of the image). We computed the aggregate
(cumulative) distribution for all sites (by summing up the frequency
of tags that appear in each position) and interpolated the resulting
points. The results are presented in Fig. 4 and Fig. 5, respectively.</p>
      <p>In all of the cases, logarithm of base 2 was used in changing to
a log-log representation. Note that the base of the logarithm does
not actually appear in the power law equation (c.f. Eq. 1), but
because we are interpolating empirical and thus possibly noisy data,
this choice can influence errors recorded in the interpolation phase.
However, we did not find significant differences from changing the
base of the logarithm to e or 10.</p>
      <p>To summarize our results, we found that the data points can (with
some error) be linearly interpolated. The constants of the power law
we found (see Equations 1 and 2), for the cumulative case, had the
values: = 1:28 and c = 18:3. The results from linear
interpolations for each of the individual sites (not shown graphically, due
to lack of space) all had slopes in the same range, i.e. 2 [1; 1:5],
with the average close to 1.28. Thus is can be said that the power
law decay (i.e. slope) measured for the cumulative case is
relatively consistent across individual sites (though the constant factor
c varies, of course). Intuitively, this means there is a fundamental
effect of the way tags are distributed in individual websites that is
independent of the context and content of the specific website.</p>
      <p>There is an important caveat though. We observed that
somewhere between tags in positions 7 and 10 there is a considerably
sharper drop in frequency than the general trendline. This means
that for example, if we do a piece-wise interpolation for the tags in
the first 7 positions and the last 15 we would get, in both cases,
linear functions, but with slightly different slopes (while for the first
(1)
(2)
14
positions is closer to -1, it decreases slightly to -1.28 afterwards).
Furthermore, as Fig. 4 shows, this effect largely holds for almost
all sites in the data set considered, so it is not just attributable to
noise, but a consistent effect of the way tagging is performed. We
do not have yet a satisfactory explanation for this effect, but it might
have a cognitive explanation, based on the number of tags the
average user employs per website. However, this observation does not
affect our basic result that tag distributions follow power laws.</p>
      <p>We note, however that the above analysis refers to heavily tagged
sites (tagged more than 1000 times), and considers the most used
25 tags for each site. We have also looked at a set of less
popular sites, for which the power law interpolation produces somewhat
less clear results - although some of these can be expected to
become more heavily tagged and eventually evolve clear power law
distributions. Furthermore, for each of the sites, below the first 25
highest-ranking tags there are a lot of unique tags that are used more
scarcely (some only by a few people). This forms the “long tail” of
the distribution, which does not usually follow the same power law
decay pattern as the head.</p>
    </sec>
    <sec id="sec-12">
      <title>CONSTRUCTING INTER-TAG CORRE</title>
    </sec>
    <sec id="sec-13">
      <title>LATION GRAPHS</title>
      <p>So while we have shown that power laws evolve on popular sites,
is there any way to model the informational value that partially
drives the process? We look at one of the simplest information
structures that can be derived through collaborative tagging:
intertag correlation graphs. First, we discuss the methodology used for
getting such graphs. Next we illustrate our approach through an
example, with tags from a limited domain. Finally, we discuss the
importance of tag-tag graphs and how they could be used to shed
light on the underlying dynamics of the tagging process.
5.1</p>
    </sec>
    <sec id="sec-14">
      <title>Methodology</title>
      <p>The act of tagging resources by different users induces, at the tag
level, a simple distance measure between any pair of tags. In our
case, define the distance between two tags Ti; Tj through a cosine
distance measure:</p>
      <p>Dist(Ti; Tj ) =</p>
      <p>N (Ti; Tj )
pN (Ti)</p>
      <p>N (Tj )
(4)</p>
      <p>
        Where we denote by N (Ti), respectively N (Tj ), the number of
times each of the tags was used individually to tag all pages, and by
N (Ti; Tj ) the number of times two tags are used to tag the same
page (summed up over all pages). The distance measure captures a
degree of co-occurrence (which we interpret as a similarity metric)
between the concepts represented by the two tags. The distance
measure can play a big role in actual structure retrieved and we
note that there are more sophisticated distance measures proposed
both in item-item collaborative filtering (see [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]), and from text
mining literature. For this paper, cosine distance seemed to work
well enough.
      </p>
      <p>
        Next, from these similarities we can construct a tag-tag
correlation graph or network, where the nodes represent the tags
themselves (weighed by their absolute frequencies), while the edges are
weighed with the cosine distance measure. We build a
visualization of this this weighed tag-tag correlation, by using a
“springembedder” type of algorithm - in our case we preferred the
wellknown Kawada-Kawai algorithm [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. An analysis of the structural
properties of such tag graphs may provide important insights into
how people tag and how semantic structure emerges in distributed
folksonomies (we return to this issue in Section 5.3, where we
discuss the relation between this approach and the structures derived
in the literature on language evolution).
      </p>
      <p>While it would be difficult if not impossible for independent
researchers to collect enough data to construct and analyze the entire
space of tags used in del.icio.us, we did collect enough data to
provide an illustration of the approach for a restricted sub-domain.
5.2</p>
    </sec>
    <sec id="sec-15">
      <title>Constructing tag-tag correlation networks</title>
      <p>
        In order to exemplify our approach, we collected the data and
constructed visualizations for a restricted class of 15 tags, all
related to the tag “complexity.” Our goal, in this example, was to
examine which sciences does the user community of del.icio.us
see as most related to “complexity” science (a problem which has
traditionally elicited some discussion).2 The visualizations were
made on Pajek [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The purpose of the visualization was to study
whether the proposed method retrieves connection between a
central tag “complexity” and related disciplines. We considered two
cases:
      </p>
      <p>Only the dependencies between the tag “complexity” and all
other tags in the subset are taken into account when building
the graph (Fig. 6).
30 other edges (i.e. 45 edges in total for 15 tags) are
considered (Fig. 7). These taken as the ones with the
highest expected correlations, though in future work we’ll
consider more sophisticated methods for determining the cut-off,
based on examining the deviation from the mean.</p>
      <p>
        In both figures, the size of the nodes is proportional to the
absolute frequencies of each tag, while the distances are, roughly
speaking, inversely related to the distance measure (as returned by
the “spring-embedder” algorithm).3 We tested two energy
measures for the “springs” attached to the edges in the visualization:
Kamada-Kawai and Fruchterman-Reingold [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. For lack of space,
only the visualization returned by Kamada-Kawai is presented here,
since we feel it is more faithful to the proportions present in the
data.
      </p>
      <p>The results from the visualization algorithm do match well what
one would intuitively expect to see in this domain. Some nodes are
much larger than others, which, again shows the taggers prefer to
use to general, heavily used tags (e.g. the tag “art” was used 25
times more than “chaos”). Tags such as “chaos”, “alife”,
“evolution” or “networks” which correspond to topics generally seen as
close to complexity science (some of them were actually developed
in the context of complex systems), come close to it. At the other
end, the tag art is a large, distant node from complexity. This is
not so much due to the absence of sites discussing the
mathematics/complexity aspects in art. In fact, there are quite a few of such
sites - but they represent only a small proportion of the total sites
tagged with “art”, leading to a large distance measure. There are,
however, some problems in the structure retrieved: the tag
“ecology” would be expected to appear much closer to “complexity,”
since much research on complexity in biological systems has
focused on applications in ecology.</p>
      <p>
        We should mention that in similar work, Mika [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] concluded
(for another domain than the one in this paper), that filtering based
on users produces more useful results than filtering based on items.
      </p>
      <p>Although we cannot preciscely assess whether the same measures
were used for building the graph, for this particular domain, our
approach produced reasonably useful results. Before reaching any
definite conclusions, however, further work examining the results
with different similarity criteria, with different methods of applying
2The choice of terms considered in the subset is loosely based on
the topics covered at the 2006 summer course on complexity
offered by the Santa Fe Institute.
3For two of the tags, namely “algorithms” and “networks,” both
absolute frequencies and co-dependencies were summed over the
singular form tag, i.e. “network” and the plural “networks,” since
both forms occur with relatively high frequency.</p>
      <p>CCoogmnpituiotinng AI BioClohEgavyooslutioCnomAlpgloeNrxieitthtywmAosrlikfeS MathePmhyastiiccss EconomicsEcoPloajegky
Visualization of a tag correlation network, considering only the correlations corresponding to one central node “complexity”
Art
Economics</p>
      <p>PhysicAslifeChaos CEBomviooplolulegEtxiyMocitnoayltAohlgegymoNraieAtthitIcwmsosrks CCoogmniptiuotning
Figure 7: Visualization of a tag correlation network, considering all relevant correlations</p>
      <sec id="sec-15-1">
        <title>Pajek</title>
        <p>spring-embedder algorithms and based on larger and more precise
data sets is needed.</p>
        <p>Overall, we expect that when applying fully automated retrieval
methods on larger data sets (for example, not choosing the subset
of tags considered “by hand”), the same problems we identified for
the tag distributions on individual sites would appear. This means,
some very common, general-purpose tags could have a very large
weight and centrality, although for a particular domain or
application they do not carry too much information.
5.3</p>
      </sec>
    </sec>
    <sec id="sec-16">
      <title>Tag Graphs and Human Language Networks</title>
      <p>In the previous section, we have shown that tag networks can be
easily constructed and visualized and that they could prove useful
in simple information retrieval. However, exploring the properties
of these tag graphs (e.g. node centrality, degree distribution, etc.)
and their evolution - can provide us with much deeper insights into
how folksonomies develop from the aggregate behavior of
individual users. They could additionally provide insight into how more
complex semantic structures evolve.</p>
      <p>
        A starting point in our further modelling is the work that seeks
to explain the emergence of structure and syntax in human
language. In recent high-profile work, Ferrer i Cancho and Sole [
        <xref ref-type="bibr" rid="ref17 ref4">17,
4</xref>
        ] study the evolution of several human languages, by constructing
their graphical protostructure. They do this by taking large
corpuses of (natural language) texts and constructing inter-correlation
graphs between all pairs of words in the language, based on the
distance they appear from each other in these texts.
      </p>
      <p>
        Next, they analyze the resulting graphical structure for each of
the considered languages. Following the seminal work of Zipf, they
show that the retrieved networks, far from having the structure
predicted by random graph theory for such large networks [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], have,
in fact a “small world” structure.4 Furthermore, this protostructure
is remarkably similar across different languages.
      </p>
      <p>
        Graphs which exhibit a small world network effect have the
distribution of the mean degree of the edges follow Zipf’s law.5 Sole et
al.[
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] argue that, far from being a mere coincidence, this is an
essential underlying property of human languages, and furthermore,
syntax and structure in human languages emerges “for free” from
these simpler structures. In [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], they simulate a version of Zipf’s
classic generative model of human language: speakers prefer to
use ambiguous, general words which have minimum entropy (and
minimize their effort for choosing the word), while hearers prefer
words with high entropy, and thus high information content.
      </p>
      <p>
        Comparing this setting with the considered tripartite model of
tagging systems (presented above and in Fig. 1), we observe some
important similarities to models of language evolution. The
resources (websites) could correspond to the objects in the real world
- that need to be described by the language, the users to the
speakers of the language, and the tags to the tokens of the language (i.e.
the words). Tags also likely have a Zipf law distribution of node
degrees, and while the massive data harvesting needed to show this
is difficult, even our provisional results do point in this direction.
In such a case, generative models proposed by Sole et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] may
be useful to explain the online behavior of taggers as regards
informational value. Thus, folksonomy structure could also be seen as
emerging at the intersection between the efforts of taggers, who try
to minimize their effort, and thus prefer to choose more common
4A small-world graph is a graph in which any two nodes are
connected by a path of small maximum length - usually 2-4.
5The degree of a vertex is the number of edges connected to that
vertex. The distribution of the degrees across all vertexes is an
important property of a graph
tags with less information value and retrievers (i.e. “hearers”) who
need to use this tags to find as precise as possible information
resources and so use tags with the highest informational value. In our
generative model shown in Section 3, the results of this “least
effort principle” would be the parameter . However, the next
question we have to encounter, one that is particularly relevant to the
Semantic Web community, is given that a stable power law
distribution emerges from highly tagged sites, is there any methodology
to formalize that distribution into an ontology?
6.
      </p>
    </sec>
    <sec id="sec-17">
      <title>ONTOLOGY EXTRACTION FROM TAG</title>
    </sec>
    <sec id="sec-18">
      <title>GING</title>
      <p>Ontology extraction from tagging is a difficult research question
given the ambiguity in formally quantifying what constitutes an
ontology. In this section we demonstrate the first brush strokes of a
speculative ontology extraction methodology using a tagging
example. One should only apply any sort of ontology extraction from
tagging after the tag distribution of the site has reached a stable
power law. In order to avoid the higher variance of the tail of the
power law distribution, we look only at what del.icio.us measures to
be the most “Common Tags,” which is the upper end of the
powerlaw distribution. Since RDF (Resource Description Framework) is
the most basic component of the Semantic Web, we will use it as
our extraction format instead of a more complex OWL (Web
Ontology Language).</p>
      <p>
        Since we can expect to extract only the most basic structure from
tag space, the very simplicity of the RDF “triple” structure is
actually a boon. RDF differs from traditional knowledge representation
systems in two major ways. First, every statement is composed
on three atoms arranged into a “triple” in the following manner:
Subject, Predicate, Object. Second, each atomic subject, predicate,
or object is denoted by a unique URI. There are a number of
special properties given to us by RDF, although we will focus only
on sub-classing. The predicate rdfs:subClassOf is defined by RDF
Schema to “to state that all the instances of one class are instances
of another,” while rdf:type states that “a resource is an instance of a
class” [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Notice that rdfs:subClassOf is how RDF deals with the
idea of hierarchies, something that folksonomies are supposedly
incapable of providing.
      </p>
      <p>One finds in many folksonomies that a hierarchy is not
eliminated; instead the user in some instances “types the hierarchy out”
or “mixes a single name among multiple tags,” such as repeating
the two tags “digital piano.” A comprehensive ontology-based
system would already formalize “digital piano” as a single class that
is the sub-class of “piano.” There are also facets, or structured
relationships that are not strictly hierarchical relationships.
Traditional facets are given by relationships like that each “Person” has
a “name” or that photos are taken on a “date.” These types of facets
clearly fits within the “triple” structure of RDF, such as foaf:Person
foaf:name xsd:string ‘Harry Halpin” or flickr:ryansking/160217283/
dc:date xsd:date ‘2006-05-25”. However, the real problem with
any automatic mining of facets is that the property is almost always
implicit. For example, many people just add the date to Flickr
photos without adding a “date” tag, much less a “name” tag for a person
when mentioning their name. Worse, the object is usually variable,
as the dates and names of people change. However, while facets
may be beyond our grasp, we can attempt to capture the sorts of
structure exemplified by our first example: When two tags are used
to describe a single class or instance, and when one class is the
subclass of another class. In our example, we inspect an article on the
Web that reviews “digital pianos.” In this context, the term“digital”
is closely related to “piano” since the term “digital” is never
mentioned without “piano,” so we can safely say that the two terms
“digital” and “piano” can be considered in some contexts just
“digital piano.” However, “piano” is a more abstract class than “digital
piano” since “piano” often appears in the context of other tagging
instances without the tag “digital.”</p>
      <p>Since we do not have access to the entire del.icio.us tagging
database, we have to focus on creating ontologies from a given
resource, looking at what knowledge can be extracted from a
webpage about a digital piano rather than the set of all pages about
digital pianos tagged by users. The network is a directed and weighted
graph whose weights are given by how many times two given tags
(txkty) co-occur in a tagging instance (i.e. when a user tags the
site) and dividing this by their total number of occurrences of the
tags, so C(y; x) = txkty . Because a given tag may co-occur with
ty
multiple terms in a single tag and also a tag may co-occur with
non-common tags P (cxkcy) 6= cx. In our example, the most
common tags are “piano” (8), “music”(4), “digital”(2), and
“review”(2).” The single tagging instance “piano music digital review
select” has “select” eliminated since it is not a common tag, and the
tag “piano,” while it appears 8 times in total, appears in this
single instance with three other tags. An example graph of what this
“common tag” graph looks like is shown here in Figure 8. From
this graph, one can tell that the word “piano” co-occurs most
frequently, even with a number of words that are uncommon. Lastly,
one can then tell that for common tags, the tag “digital” always
occurs with “piano” while the word “music” usually occurs with
“piano.” The tag correlation graph is shown in Figure 9, plotted
using the same methodology as in Section 5. This visually estimates
the information values.</p>
      <p>The two heuristics we use to extract ontologies are, given two
tags in the common-tag graph x and y whose “common-graph”
values are given as C(y; x) for the directed edge between y and
x:</p>
      <p>If the I(y) &lt; I(x) and if 1 &gt; C(y; x) &gt; then y rdfs:subClassOf
x.</p>
      <p>If I(y; x) I(y) &lt; I(x), and if C(y; x) = 1 then “y x”
rdfs:subClassOf x.</p>
      <p>The first rule merely states that for a given tag that may belong to
a more abstract class than another tag, the more abstract tag should
have a lower information value (i.e. retrieve more pages) than the
less abstract tag. Also, the rule states that the more abstract tag
should also be used in combination with other tags besides the less
abstract tag, but used with the less abstract class often (as
quantified by ). In this manner, we can guess that the more abstract
class is not equivalent to the less abstract class. The second rule is
similar, that if two tags are to be considered a single “compound
tag resource,” then those two tags should appear together all the
time in tagging instances for a given resource. Furthermore, if the
informational value of one of the tags by itself is less than the rest
of the tags of the newly created “compound tag resource” then this
“compound tag resource” is a subclass of the tag with the lower
informational value.</p>
      <p>These two rules are illustrated by their application to Figure 8.
First, the only tags that are always applied together are “digital”
and “piano,” and since “digital” has a higher informational value
than “piano,” and both have a higher informational value than
“digital piano” (as should be trivially expected), then “digital piano” is
a subclass of “piano” (ex:digital-piano rdfs:subClassOf ex:piano).
When the first rule is applied with an = 0:5, only the connection
between “music” and “piano” is strong enough to (0:6) to qualify
without being absolutely a perfect correlation (1.0), and since
“music” has a higher information value than “piano,” therefore “piano”
can be considered a subclass of “music” (ex:piano rdfs:subClassOf
ex:music). In this particular example, the tag “music” is being used
as shorthand for “musical instruments.” While this technique is
in need of refinement and widespread testing, we do think simple
rules like this might be useful within the context of a stable tagging
distribution of a single resource.
7.</p>
    </sec>
    <sec id="sec-19">
      <title>CONCLUSION AND FUTURE WORK</title>
      <p>This work has explored a number of issues highly relevant to
the question of whether a coherent way of organizing metadata can
emerge from distributive tagging systems. We began with outlining
a principled generative model of tagging. Unlike other proposed
models, our model is based on Mika’s formalization of tagging and
incorporates the informational value of tags which we believe
allows for a more complete account of tagging. We also have shown
that our model formalizes many of the common-sense observations
made by people who are informally studying folksonomies.
Because we currently lack the empirical data (since data is not made
easily accessible) to currently estimate the model parameters
using a training set in order to compare the results of the generative
model to an empirical test set, this will be our next goal.</p>
      <p>Using empirical data, we have shown that tagging distributions
tend to stabilize into power law distributions. This suggests that
consensus around the categorization of information driven by
tagging behaviors occurs. Using example domains, we have explored
the most empirically challenging aspects of the generative model:
the informational value of a tag as a function of how many pages
a given tag can retrieve using a search. We examined how this
information can be used with multiple tags to visualize correlation
graphs that lend insight into the categorization process and into
existing intuitions about how concepts are related. This provides
preliminary evidence that some type of latent classification scheme
and taxonomic structure may lie behind tagging.</p>
      <p>Finally, we have shown a simple methodology for extracting
Semantic Web ontologies (in particular RDF and RDF Schema) that
can be used on tagged resources whose tagging distribution has
stabilized into a power law. Again, we need more empirical data
to validate these ontologies and produce them en masse, and
currently we are gathering this data and as such will likely refine our
heuristics in time. From these results, it seems quite plausible that
folksonomies and ontologies, which are merely new incarnations
of categorization and classification respectively, are not mortal
enemies, but fundamentally compatible, as tagging-based
categorization can evolve into stable classification schemes that can be
formalized as ontologies. Further work will contribute more rigorous
analyses to these observations.</p>
    </sec>
    <sec id="sec-20">
      <title>ACKNOWLEDGEMENTS</title>
      <p>This work was performed during the authors’ visit at the Santa
Fe Institute, Santa Fe, NM, USA. The authors wish to thanks the
SFI for its support in the initial stages of this research.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>V.</given-names>
            <surname>Batagelj</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Mrvar</surname>
          </string-name>
          .
          <article-title>Pajek - A program for large network analysis</article-title>
          .
          <source>Connections</source>
          ,
          <volume>21</volume>
          :
          <fpage>47</fpage>
          -
          <lpage>57</lpage>
          ,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Bela</given-names>
            <surname>Bollobas</surname>
          </string-name>
          . Random Graphs. Academic Press, London, England,
          <year>1985</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Stuart</given-names>
            <surname>Butterfield. Folksonomy</surname>
          </string-name>
          ,
          <year>2004</year>
          . http://www.sylloge.com/personal/2004/08/folksonomysocial-classification-great.html.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>R. Ferrer</given-names>
            <surname>Cancho</surname>
          </string-name>
          and
          <string-name>
            <given-names>R. V.</given-names>
            <surname>Sole</surname>
          </string-name>
          .
          <article-title>The small world of human language</article-title>
          .
          <source>Proc. Roy. Soc. London, B</source>
          <volume>268</volume>
          :
          <fpage>2261</fpage>
          -
          <lpage>2266</lpage>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>R. Ferrer</given-names>
            <surname>Cancho</surname>
          </string-name>
          and
          <string-name>
            <given-names>R. V.</given-names>
            <surname>Sole</surname>
          </string-name>
          .
          <article-title>Least effort and the origins of scaling in human language</article-title>
          .
          <source>Procs. Natl. Acad. Sci. USA</source>
          ,
          <volume>100</volume>
          :
          <fpage>788</fpage>
          -
          <lpage>791</lpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>P.</given-names>
            <surname>Diaconis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>McGrath</surname>
          </string-name>
          ,
          <string-name>
            <given-names>and J.</given-names>
            <surname>Pitman</surname>
          </string-name>
          .
          <article-title>Riffle shuffles, cycles and descents</article-title>
          .
          <source>Combinatorica</source>
          ,
          <volume>15</volume>
          :
          <fpage>11</fpage>
          -
          <lpage>29</lpage>
          ,
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Scott</given-names>
            <surname>Golder</surname>
          </string-name>
          and
          <string-name>
            <given-names>Bernardo</given-names>
            <surname>Huberman</surname>
          </string-name>
          .
          <article-title>The structure of collaborative tagging systems</article-title>
          ,
          <year>2006</year>
          . HP Labs Technical Report http://www.hpl.hp.com/research/idl/papers/tags/.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Pat</given-names>
            <surname>Hayes</surname>
          </string-name>
          .
          <source>RDF Semantics, W3C Recomendation</source>
          ,
          <year>2004</year>
          . http://www.w3.org/TR/2004/REC-rdf-mt-
          <volume>20040210</volume>
          /.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>E.</given-names>
            <surname>Jacob</surname>
          </string-name>
          .
          <article-title>Classification and categorization: A difference that makes a difference</article-title>
          .
          <source>Library Trends</source>
          ,
          <volume>52</volume>
          (
          <issue>3</issue>
          ):
          <fpage>515</fpage>
          -
          <lpage>540</lpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Cameron</surname>
            <given-names>Marlow</given-names>
          </string-name>
          , Mor Naaman, Danah Boyd, and
          <string-name>
            <given-names>Marc</given-names>
            <surname>Davis</surname>
          </string-name>
          .
          <article-title>Position paper, tagging, taxonomy, flickr, article, toread</article-title>
          . In Collaborative Web Tagging Workshop at WWW'
          <volume>06</volume>
          ,
          <string-name>
            <surname>Edinburgh</surname>
          </string-name>
          , UK,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Adam</given-names>
            <surname>Mathes</surname>
          </string-name>
          . Folksonomies:
          <article-title>Cooperative classification and communication through shared metadata</article-title>
          ,
          <year>2004</year>
          . http://www.adammathes.com/academic/computer-mediatedcommunication/folksonomies.html.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Peter</given-names>
            <surname>Merholz</surname>
          </string-name>
          .
          <article-title>Metadata for the masses</article-title>
          ,
          <year>2004</year>
          . http://adaptivepath.com/publications/essays/archives/000361.php.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Peter</given-names>
            <surname>Mika</surname>
          </string-name>
          .
          <article-title>Ontologies are us: A unified model of social networks and semantics</article-title>
          .
          <source>In Proc. of the 4th Int. Semantic Web Conference (ISWC'05)</source>
          . Springer LNCS vol.
          <volume>3729</volume>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>V.</given-names>
            <surname>Robu</surname>
          </string-name>
          and
          <string-name>
            <given-names>J. A.</given-names>
            <surname>La</surname>
          </string-name>
          <article-title>Poutre´. Retrieving utility graphs used in multi-item negotiation through collaborative filtering</article-title>
          .
          <source>In Proc. of RRS'06</source>
          ,
          <string-name>
            <surname>Hakodate</surname>
          </string-name>
          , Japan,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Kaikai</given-names>
            <surname>Shen</surname>
          </string-name>
          and
          <string-name>
            <given-names>Lide</given-names>
            <surname>Wu</surname>
          </string-name>
          .
          <article-title>Folksonomy as a complex network</article-title>
          ,
          <year>2005</year>
          . http://arxiv.org/abs/cs.IR/0509072.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Clay</given-names>
            <surname>Shirky</surname>
          </string-name>
          . Ontology is over-rated,
          <year>2005</year>
          . http://www.shirky.com/writings/ontology-overrated.html.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>R. V.</given-names>
            <surname>Sole</surname>
          </string-name>
          . Syntax for free?
          <source>Nature</source>
          ,
          <volume>434</volume>
          :
          <fpage>289</fpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Luc</given-names>
            <surname>Steels</surname>
          </string-name>
          .
          <article-title>The evolution of communication systems by adaptive agents. In Adaptive agents and multi-agent systems</article-title>
          , pages
          <fpage>125</fpage>
          -
          <lpage>140</lpage>
          . Springer LNAI,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Duncan</given-names>
            <surname>Watts</surname>
          </string-name>
          and
          <string-name>
            <given-names>Steve</given-names>
            <surname>Strogatz</surname>
          </string-name>
          .
          <article-title>Collective dynamics of 'small-world' networks</article-title>
          .
          <source>Nature</source>
          ,
          <volume>393</volume>
          (
          <issue>6684</issue>
          ):
          <fpage>440</fpage>
          -
          <lpage>442</lpage>
          ,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>G.K.</given-names>
            <surname>Zipf</surname>
          </string-name>
          .
          <article-title>Human Behaviour and the Principle of Least Effort</article-title>
          . Addison-Wesley, Cambridge, Massachusets,
          <year>1949</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>