Adapting K-Nearest Neighbor for Tag Recommendation in Folksonomies

Jonathan Gemmell, Thomas Schimoler, Maryam Ramezani, Bamshad Mobasher
Center for Web Intelligence, School of Computing, DePaul University, Chicago, Illinois, USA
{jgemmell, tschimoler, mramezani, mobasher}@cdm.depaul.edu

Copyright © 2009, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Abstract

Folksonomies, otherwise known as Collaborative Tagging Systems, enable Internet users to share, annotate and search for online resources with user-selected labels called tags. Tag recommendation, the suggestion of an ordered set of tags during the annotation process, reduces the user effort from a keyboard entry to a mouse click. By simplifying the annotation process, tagging is promoted, noise in the data is reduced through the elimination of discrepancies that result in redundant tags, and ambiguous tags may be avoided. Tag recommenders can suggest tags that maximize utility, offer tags the user may not have previously considered, or steer users toward adopting a core vocabulary. In sum, tag recommendation promotes a denser dataset that is useful in its own right or can be exploited by a myriad of data mining techniques for additional functionality.

While there exists a long history of recommendation algorithms, the data structure of a Folksonomy is distinct from those found in traditional recommendation problems. We first explore two data reduction techniques, p-core processing and Hebbian deflation, then demonstrate how to adapt K-Nearest Neighbor for use with Folksonomies by incorporating user, resource and tag information into the algorithm. We further investigate multiple techniques for the user modeling required to compute the similarity among users. Additionally, we demonstrate that tag boosting, the promotion of tags previously applied by a user to a resource, improves the coverage and accuracy of K-Nearest Neighbor. These techniques are evaluated through extensive experimentation using data collected from two real Collaborative Tagging Web sites. Finally, the modified K-Nearest Neighbor algorithm is compared with alternative techniques based on popularity and link analysis. We find that K-Nearest Neighbor modified for use with Folksonomies generates excellent recommendations, scales well with large datasets, and is applicable to both narrowly and broadly focused Folksonomies.

Introduction

Folksonomies, also known as Collaborative Tagging Systems, have emerged as a powerful trend allowing Internet users to share, annotate and explore online resources through personalized labels. Several Collaborative Tagging Systems have recently gained popularity, attracting millions of users. Delicious (delicious.com) supports users as they bookmark URLs. Flickr (www.flickr.com) allows users to upload, share and manage online photographs. Citeulike (www.citeulike.org) enables researchers to manage and discover scholarly references. Still other Collaborative Tagging Systems specialize in music, blogs and business documents.

At the core of Collaborative Tagging Systems is the post: a user describes a resource with a set of tags. These tags may be descriptive ("folksonomies"), subjective ("awesome"), organizational ("toread") or completely idiosyncratic ("jfgwh"). Taken in isolation, an individual annotation allows a user to organize Web resources for later use: resources can be easily sorted, aggregated and retrieved. Resources may be annotated with multiple tags, allowing a resource to be identified with several topic areas rather than pigeonholed in a single directory. Moreover, users need not conform to a predefined vocabulary or rigid hierarchy but may annotate a resource with any tag they wish, thereby reducing user effort and limiting the entry cost.

However, the utility of tags extends beyond their immediate use. Taken as a whole, the aggregation of many annotations results in a complex network of interrelated users, resources and tags often referred to as a Folksonomy (Mathes 2004). The opportunity to explore the Folksonomy unburdened by a preconceived navigational or conceptual hierarchy is crucial to the utility and popularity of Folksonomies.
Users are able to share or discover resources through the collaborative network and connect to other users with related interests. Collaborative Tagging Systems can identify groups of like-minded users, catering not only to mainstream but also to non-conventional users who are often under-served by traditional Web tools. Furthermore, users may enjoy the social aspects of collaborative tagging, attracted by a sense of community not offered by either ontologies or search engines.

A distinct advantage of Folksonomies is the richness of the user profiles. If a user is interested enough in a resource to annotate it, the tag describes the user as much as it describes the resource. As users annotate resources, the system is able to track their interests. These profiles are a powerful tool for data mining algorithms.

Even though tags offer many benefits in both the short and long term, they also present unique challenges for recommendation algorithms. Most Collaborative Tagging Applications permit unsupervised tagging; users are free to use any tag they wish to describe a resource. This is often done to reduce the entry cost of using the application and make collaborative tagging more user-friendly. Unsupervised tagging can result in tag redundancy, in which several tags have the same meaning, or tag ambiguity, in which a single tag has many different meanings. Such inconsistencies can confound users as they attempt to utilize the Folksonomy.

Tag recommendation provides a means to overcome these problems. It reduces the user effort to a mouse click rather than a keyboard entry. By reducing the effort, users are encouraged to tag more frequently, apply more tags to an individual resource, reuse common tags, and perhaps use tags they had not previously considered. Moreover, user error is reduced by eliminating redundant tags caused by capitalization inconsistencies, punctuation errors, misspellings and other discrepancies. The tag recommender can further promote a core tag vocabulary, steering the user toward adopting certain tags while not imposing any strict rules. The tag recommender may even avoid ambiguous tags in favor of tags that offer greater information value. The final result is a cleaner, denser dataset that is useful in its own right or for further data mining techniques.
However, the data generated through Collaborative Tagging differs from that common in recommendation algorithms. The introduction of tags manifests a third dimension which must be integrated into recommenders that traditionally incorporate only two dimensions. In this paper we demonstrate how K-Nearest Neighbor may be adapted to recommend tags in Folksonomies. We describe how both user and resource information can be directly applied in the algorithm, improving both coverage and accuracy while reducing computational costs. In addition, several user models are explored, including vectors over the set of tags, vectors over the set of resources, combinations of these two, and features derived through Hebbian deflation.

The rest of this paper is organized as follows. We begin by presenting related work involving recommendation in Folksonomies. We explore the data structure of Folksonomies and discuss two feature reduction techniques: p-core processing and Hebbian deflation. We then outline the basic approach used for recommending tags in Folksonomies and propose modifications to K-Nearest Neighbor. After a discussion of the datasets and a description of the experimental methodology, we evaluate the proposed modifications using data collected from two real world Folksonomies. Finally, we compare the modified algorithm with alternative strategies based on popularity and link analysis.

Related Work

As Collaborative Tagging Applications have gained in popularity, researchers have started to explore and characterize the tagging phenomenon. In (Macgregor and McCulloch 2006) and (Golder and Huberman 2006) the authors studied the information dynamics of Delicious, one of the most popular Folksonomies. The authors discussed how tags have been used by individual users over time and how the tags for an individual resource stabilize over time. They also discussed two semantic difficulties: tag redundancy, when multiple tags have the same meaning, and tag ambiguity, when a single tag has multiple meanings. (Macgregor and McCulloch 2006) provide an overview of the phenomenon and explore reasons why both Folksonomies and Ontologies will have a place in the future of information access.

There have been several recent research investigations into recommendation within Folksonomies. Unlike traditional recommender systems, which have a two-dimensional relation between users and items, tagging systems have a three-dimensional relation between users, tags and resources. Recommender systems can be used to recommend items in each of the dimensions based on one or two of the other dimensions. (Tso-Sutter, Marinho, and Schmidt-Thieme 2008) applies user-based and item-based collaborative filtering to recommend resources in a tagging system and uses tags as an extension of the user-item matrices. (Nakamoto et al. 2008a) and (Nakamoto et al. 2008b) use tags as context information to recommend resources.

Other researchers have studied tag recommendation in Folksonomies. (Jaschke et al. 2007) compares user-based collaborative filtering and a graph-based recommender built on the PageRank algorithm to recommend personalized tags. (Heymann, Ramage, and Garcia-Molina 2008) use association rules to recommend tags and introduce an entropy-based metric to measure how predictable a tag is. (Lipczak 2008) uses the title of a resource, the posts of a resource and the user's vocabulary to recommend tags. The results show that tags retrieved from the user's vocabulary outperform recommendations driven by resource information. However, the experiment was performed on data from Bibsonomy, a Folksonomy focused on scientific publications, and thus might not be applicable to data that cover multiple domains.

(Xu et al. 2006) presents general criteria for a good tagging system, including high coverage of multiple facets, high popularity and least-effort. They categorize tags into content-based tags, context-based tags, attribute tags, subjective tags, and organizational tags, and use a probabilistic method to recommend tags. (Basile et al. 2007) proposes a classification algorithm for tag recommendation. (Sigurbjörnsson and van Zwol 2008) uses a co-occurrence-based technique to recommend tags for photos in Flickr. The assumption is that the user has already assigned a set of tags to a photo and the recommender uses those tags to recommend more tags. (Adrian, Sauermann, and Roth-Berghofer 2007) suggests a semantic tag recommendation system in the context of a semantic desktop. (Song et al. 2008) uses clustering to make real-time tag recommendations.
Data Structures of Folksonomies

In this section we define the three-dimensional data structure of a Folksonomy and describe how to construct two-dimensional projections. We further explore two data reduction techniques for use with Folksonomies. The first is a data selection technique, P-core processing, that reduces the size of the data by removing users, resources and tags that occur less than a predefined number of times. The second is a data extraction technique, Hebbian deflation, which produces a new set of features from a two-dimensional projection of the Folksonomy.

Data Models

The data structure of a Folksonomy differs from that common to most traditional recommendation algorithms. A Folksonomy can be described as a four-tuple D:

    D = ⟨U, R, T, A⟩,                                            (1)

where U is a set of users; R is a set of resources; T is a set of tags; and A is a set of annotations, represented as user-tag-resource triples:

    A ⊆ {⟨u, r, t⟩ : u ∈ U, r ∈ R, t ∈ T}                        (2)

A Folksonomy can, therefore, be viewed as a tripartite hyper-graph (Mika 2007) with users, tags, and resources represented as nodes and the annotations represented as hyper-edges connecting a user, a tag and a resource.

The tripartite nature of Folksonomies makes them ill suited for many traditional data mining techniques. User-based K-Nearest Neighbor, for example, requires a means to measure the similarity between users; the introduction of a third dimension confounds this task. Aggregate projections of the data can be constructed, reducing the dimensionality by sacrificing information (Schmitz et al. 2006). The relation between resources and tags can be formulated as a two-dimensional projection, RT, such that each entry, RT(r, t), is the weight associated with the resource, r, and the tag, t. This weight may be binary, merely showing that one or more users have applied that tag to the resource, or it may be finer grained, using the number of users that have applied that tag to the resource:

    RT_tf(r, t) = |{a = ⟨u, r, t⟩ ∈ A : u ∈ U}|                  (3)

Such a measure is equivalent to term frequency, or tf, common in Information Retrieval. Similarly, term frequency * inverse document frequency, or tf*idf (Salton and Buckley 1988), can be adapted for use with two-dimensional projections:

    RT_tf*idf(r, t) = RT_tf(r, t) * log(|R| / n_r)               (4)

The tf*idf weight multiplies the tag frequency by the relative distinctiveness of the tag. The distinctiveness is measured by the log of the total number of resources, |R|, divided by the number of resources to which that tag was applied, n_r. Similar two-dimensional projections can be constructed for UT, in which the weights correspond to users and tags, and UR, in which the weights correspond to users and resources. While a portion of the information is lost through this process, the result is a two-dimensional matrix which can be readily applied to existing data mining algorithms such as K-Nearest Neighbor.
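To make the projections concrete, the following is a minimal sketch of building RT_tf and RT_tf*idf in Python. It assumes the annotation set A is available as an iterable of unique (user, resource, tag) triples; the function and variable names are illustrative rather than taken from the paper.

    from collections import defaultdict
    from math import log

    def build_rt_projections(annotations, num_resources):
        """Build the RT projection from unique (user, resource, tag) triples.

        rt_tf[(r, t)] counts the users who applied tag t to resource r
        (Equation 3); rt_tfidf scales each count by the distinctiveness
        of the tag (Equation 4).
        """
        rt_tf = defaultdict(int)
        tag_resources = defaultdict(set)  # n_r: resources each tag was applied to
        for user, resource, tag in annotations:
            rt_tf[(resource, tag)] += 1
            tag_resources[tag].add(resource)
        rt_tfidf = {(r, t): count * log(num_resources / len(tag_resources[t]))
                    for (r, t), count in rt_tf.items()}
        return rt_tf, rt_tfidf

The UT and UR projections follow the same pattern with the roles of the dimensions swapped.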
P-Core Processing

Through P-core processing, users, resources and tags are removed from the dataset in order to produce a residual dataset that guarantees each user, resource and tag occurs in at least p posts (Batagelj and Zaveršnik 2002; Jaschke et al. 2007). Here we define a post to include a user, a resource, and every tag the user has applied to that resource.

The algorithm iterates through the posts, counting the occurrences of users, resources and tags. If the occurrence of one of these items does not meet the requisite value, p, all occurrences of the item and the posts in which it appears are removed from the dataset. Since removing a post based on the occurrence of one item reduces the count for the other items in the post, several passes through the dataset may be required. The result is a smaller, denser dataset.

Several reasons exist to use P-core processing. By removing infrequent users, resources and tags, noise in the data is reduced; uncommon items, whether they be tags used by only a few users, unpopular resources, or unproductive users, are eliminated from consideration. Because of their scarcity, these are the very items likely to confound recommenders. Moreover, by eliminating infrequent items, the size of the data may be dramatically reduced, allowing the application of data mining techniques that would otherwise be computationally impractical. In short, P-core processing offers a means to reduce noise and focus on the dense regions of the data.
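A minimal sketch of this pruning loop in Python, under the reading that an item falling below the threshold removes every post containing it, as described above; the representation of a post as a (user, resource, tags) tuple is an illustrative assumption.

    from collections import Counter

    def p_core(posts, p):
        """Iteratively prune posts until every user, resource and tag in
        the remaining data occurs in at least p posts.

        posts: list of (user, resource, tags) tuples, where tags is the
        set of tags the user applied to the resource.
        """
        while True:
            user_count = Counter(u for u, _, _ in posts)
            res_count = Counter(r for _, r, _ in posts)
            tag_count = Counter(t for _, _, ts in posts for t in ts)
            # Drop any post containing an item that falls below the threshold.
            kept = [(u, r, ts) for (u, r, ts) in posts
                    if user_count[u] >= p and res_count[r] >= p
                    and all(tag_count[t] >= p for t in ts)]
            if len(kept) == len(posts):  # fixed point: nothing left to remove
                return posts
            posts = kept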
Hebbian Deflation

Whereas P-core processing offers a means to reduce the size of a dataset through feature selection, feature extraction techniques generate entirely new features. Many feature extraction techniques exist, such as Principal Component Analysis and Singular Value Decomposition. However, despite the utility these techniques provide, their computational cost and memory requirements make them impractical for use on the extremely large datasets common in Folksonomies. Hebbian deflation (Oja and Karhunen 1985) offers a means to approximate these feature extraction techniques with reduced computational and memory needs.

Given a two-dimensional projection of the Folksonomy and a preselected number of features, F, Hebbian deflation produces two smaller matrices that approximate the original projection. Consider for example the two-dimensional projection RT; Hebbian deflation will produce two new matrices, RX and TX, such that:

    RT(r, t) ≈ Σ_{i=1..F} RX(r, i) * TX(t, i)                    (5)

Since F is often far smaller than either the number of resources or the number of tags, the resultant matrices require many orders of magnitude less space than the original projection. Analogous features can be extracted from UR and UT.

The algorithm requires three inputs: F, the number of features to be extracted; epochs, the number of epochs used to train each feature; and lr, the learning rate at which the features are adjusted. Features are extracted one at a time. To begin, the features are set to random weights. Then, for each resource-tag pair in RT, the actual weight is compared with the current approximation. The features are adjusted based upon the degree of the error, the learning rate and the corresponding feature in the opposing matrix. These adjustments are repeated for the predetermined number of epochs before the feature is finally adopted and the next feature is trained.

    Input: RT, a projection of a Folksonomy; F, the number of features to
           derive; epochs, the number of epochs to train each feature;
           lr, the learning rate
    Output: RX and TX, the Hebbian features

    for f = 1 to F do
        for epoch = 1 to epochs do
            foreach (r, t) ∈ RT do
                approx = Σ_{i=1..f} RX(r, i) * TX(t, i)
                error = RT(r, t) - approx
                RX(r, f) += lr * error * TX(t, f)
                TX(t, f) += lr * error * RX(r, f)
            end
        end
    end
    return RX and TX

    Algorithm 1: Hebbian Deflation

Like many learning algorithms, Hebbian deflation may result in overfitting. Overfitting occurs when the features over-specialize in the training data at the expense of generalization. In order to combat this, a percentage of the training data may be held out. At each epoch, the Root Mean Square Error (RMSE) is calculated both for the ability of the features to approximate the training data and for their ability to approximate the holdout set. If the RMSE is observed to be rising for the holdout data even as it is dropping for the training data, overfitting can be assumed. The training of the current feature may then be halted and the training of the next feature can begin.

Feature extraction through Hebbian deflation offers many benefits. Not only does it offer a means to represent the data in a smaller space, but the features themselves may offer insights into the data. Domain experts can analyze the features for their characteristics. Users may navigate over the reduced feature space rather than the larger Folksonomy. Users, resources and tags may be modeled as vectors over the set of features, allowing reduced computation. Finally, Hebbian deflation may reveal similarities among items in a Folksonomy that remain hidden when the data is expressed as a projection using tf or tf*idf.
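The following is a direct Python transcription of Algorithm 1, kept deliberately simple: it recomputes the running approximation on every visit and omits the RMSE-based early stopping described above. The dense-list representation of the factors and the initialization range are assumptions for illustration.

    import random

    def hebbian_deflation(rt, num_resources, num_tags, F=50, epochs=2000, lr=0.001):
        """Approximate the projection RT with low-rank factors RX and TX,
        training one feature at a time as in Algorithm 1.

        rt: dict mapping (resource_index, tag_index) -> weight; only the
        nonzero entries of the projection are visited.
        """
        rx = [[random.uniform(-0.01, 0.01) for _ in range(F)]
              for _ in range(num_resources)]
        tx = [[random.uniform(-0.01, 0.01) for _ in range(F)]
              for _ in range(num_tags)]
        for f in range(F):
            for _ in range(epochs):
                for (r, t), actual in rt.items():
                    # The approximation uses the features trained so far (0..f).
                    approx = sum(rx[r][i] * tx[t][i] for i in range(f + 1))
                    err = actual - approx
                    rx[r][f] += lr * err * tx[t][f]
                    tx[t][f] += lr * err * rx[r][f]
        return rx, tx

A production version would cache the contribution of the already-trained features so each update costs O(1) rather than O(f), and would hold out a slice of RT to detect overfitting as described above.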
Tag Recommendation in Folksonomies

Recommendation algorithms serve a vital role in Web applications, allowing users to focus on a few relevant items rather than being overwhelmed by a large unordered set of mostly inappropriate options. Tag recommendation in Folksonomies reduces the user effort to a mouse click rather than a keyboard entry. This reduction in effort encourages tagging more resources, promotes the application of multiple tags to a resource, and may present the user with useful tags the user had not considered. Moreover, by eliminating the keyboard entry, tag recommendation reduces capitalization inconsistencies, punctuation errors, misspellings, and other discrepancies that add noise to the data.

The recommendation of high quality tags benefits the user beyond the annotation process itself. Resources are often tagged for future reference; by providing the most relevant tags, retrieval of the resources is made easier. Moreover, when resources are tagged in order to characterize content, the recommendation of clear descriptive tags can improve the online experience for other users navigating the site. This is particularly relevant for resources that are not easily evaluated by computers, such as photos, videos, and music.

Tag recommendation may also be used to assert control over tag usage without encumbering the user with a strict vocabulary. Ambiguous tags can be avoided in preference of less ambiguous tags, and the recommender can offer tags with a higher level of detail. The overuse of redundant tags can also be thwarted by consistently recommending highly used tags while eschewing their less used counterparts. The result is a cleaner, denser dataset that is useful in its own right for navigation, or for further data mining techniques.

Basic Recommendation

In traditional recommendation algorithms the input is often a user, u, and the output is a set of items, I. The user experience is improved if this set of items is relevant to the user's needs. Tag recommendation in Folksonomies, however, differs in that the input is both a user, u, and a resource, r. The output remains a set of items, in this case a recommended set of tags, Tr. One of the difficulties presented by tag recommendation is finding a means to incorporate both user and resource information into the recommendation algorithm.

Perhaps the simplest recommendation strategy is merely to recommend the most commonly used tags in the Folksonomy. However, such a strategy ignores both user and resource information. Alternatively, given a user-resource pair, a recommender may ignore the user and recommend the most popular tags for that particular resource. This strategy is strictly resource dependent and ignores the tagging habits of the user. In a similar fashion, a recommender may ignore the resource and recommend the most popular tags for that particular user. While such an algorithm would include tags frequently applied by the user, it ignores the resource information and may recommend tags irrelevant to the current resource.

An algorithm for tag recommendation in Folksonomies therefore requires a means to include both user and resource information in the process, so that the recommendation set includes tags that are relevant to the resource and also represent the user's tagging practice.
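For reference, the three popularity baselines are only a few lines each. A minimal sketch in Python, assuming the same iterable of (user, resource, tag) triples as before:

    from collections import Counter

    def most_popular(annotations, n):
        """Recommend the n most frequently used tags in the Folksonomy."""
        counts = Counter(t for _, _, t in annotations)
        return [t for t, _ in counts.most_common(n)]

    def most_popular_by_resource(annotations, resource, n):
        """Recommend the n tags most often applied to the query resource."""
        counts = Counter(t for _, r, t in annotations if r == resource)
        return [t for t, _ in counts.most_common(n)]

    def most_popular_by_user(annotations, user, n):
        """Recommend the n tags the query user applies most often."""
        counts = Counter(t for u, _, t in annotations if u == user)
        return [t for t, _ in counts.most_common(n)]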
K-Nearest Neighbor

User-based K-Nearest Neighbor is a commonly used recommendation algorithm in Information Retrieval that can be modified to include both user and resource information. Traditionally, it finds a set of users similar to a query user; from these neighbors a set of recommended items is constructed. We can modify this approach by ignoring users that have not tagged the query resource. Once a neighborhood of similar users has been discovered, the algorithm considers only those tags that have been applied to the query resource and calculates a weight for each tag, wt: the average similarity of the neighbors that have applied the tag to the query resource. Thus the algorithm is resource driven through both the selection of neighbors and the selection of tags. Still, it remains user driven in that neighbors are determined through a user model.

    Input: uq, a query user; rq, a query resource; k, the number of
           neighbors to consider; n, the number of tags to recommend
    Output: Tr, a set of recommended tags

    foreach u ∈ U that has annotated rq do
        su = similarity(u, uq)
    end
    Let N be the k nearest neighbors to uq;
    foreach u ∈ N do
        foreach t that u applied to rq do
            wt += su / k
        end
    end
    Sort tags by wt;
    Let Tr be the top n tags;
    return Tr

    Algorithm 2: K-Nearest Neighbor Modified for Folksonomies

K-Nearest Neighbor is considered a lazy algorithm; the bulk of its computation takes place after the query. Traditional approaches would require a comparison between the query user and every other user. However, since the adapted K-Nearest Neighbor considers only those users that have annotated the query resource, the number of similarities to calculate is drastically reduced. The popularity of resources in Folksonomies follows a power law, so the great majority of resources will benefit from this reduction in computation, while a few will require additional computational effort. As a result, the adapted K-Nearest Neighbor scales well with large datasets, a trait not shared by many other recommendation algorithms.

User Models

Applications vary in the way they model users. Possible methods include recency, authority, linkage or vector space models. In this work we focus on the vector space model (Salton, Wong, and Yang 1975), adapted from the Information Retrieval discipline to work with Folksonomies. Each user, u, can be modeled as a vector over the set of tags, where each weight, w(ti), corresponds to the importance of a particular tag, ti:

    u_t = ⟨w(t1), w(t2) ... w(t|T|)⟩                             (6)

In calculating the vector weights a variety of measures can be used: binary, term frequency, or term frequency*inverse document frequency. In this work we focus on term frequency. Similarly, a user can be modeled as a vector over the set of resources, where each weight, w(ri), corresponds to the importance of a particular resource, ri:

    u_r = ⟨w(r1), w(r2) ... w(r|R|)⟩                             (7)

Both of these models, however, ignore a portion of the user profile. A user model consisting merely of tags does not consider to which resources those tags have been applied, and a user model consisting only of resources does not include the tags applied to them. The user model may be extended to include both tags and resources; a new vector can be obtained by concatenating the two previously mentioned vectors:

    u_t+r = ⟨w(t1) ... w(t|T|), w(r1) ... w(r|R|)⟩               (8)

While this model does include both tags and resources, it does not specify which tags were applied to which resources. However, the tags and resources may be tightly coupled in a vector over all tag-resource pairs, where each weight, w(t r), is one if the user has applied the tag, t, to the resource, r, and zero otherwise:

    u_tr = ⟨w(t1 r1), w(t1 r2) ... w(t|T| r|R|)⟩                 (9)

These user models, however, risk becoming exceedingly large and extremely sparse. Hebbian features may be used to combat this sparsity. For example, features extracted from either UR or UT may be used to construct the user model:

    u_Hebbian = ⟨f1, f2 ... f|F|⟩                                (10)

These extracted features can greatly reduce the computational cost of calculating similarities, since the number of features is far smaller than the size of the original matrix. Moreover, feature extraction may discover hidden relationships and identify similarities among users that the previously described models would not capture.

Several techniques exist to calculate the similarity between vectors, such as Jaccard similarity or cosine similarity (Van Rijsbergen 1979). In this work we focus on cosine similarity.
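A minimal Python sketch of Algorithm 2 over sparse tag-frequency user models, with the optional boosting step of the next subsection folded in as a final adjustment. The index structures (resource_taggers, user_resource_tags) and the exclusion of the query user from the neighborhood are illustrative assumptions, not details specified by the paper.

    import math
    from collections import defaultdict

    def cosine(v1, v2):
        """Cosine similarity between two sparse vectors stored as dicts."""
        dot = sum(w * v2.get(key, 0.0) for key, w in v1.items())
        n1 = math.sqrt(sum(w * w for w in v1.values()))
        n2 = math.sqrt(sum(w * w for w in v2.values()))
        return dot / (n1 * n2) if n1 and n2 else 0.0

    def knn_recommend(uq, rq, user_models, resource_taggers,
                      user_resource_tags, k=50, n=5, b=0.05):
        """K-Nearest Neighbor modified for Folksonomies (Algorithm 2),
        with the optional boost of Algorithm 3.

        user_models: user -> sparse tag vector (e.g. tag frequencies)
        resource_taggers: resource -> set of users who annotated it
        user_resource_tags: (user, resource) -> set of tags applied
        """
        # Only users who have annotated the query resource are candidates.
        candidates = resource_taggers.get(rq, set()) - {uq}
        sims = {u: cosine(user_models[u], user_models[uq]) for u in candidates}
        neighbors = sorted(sims, key=sims.get, reverse=True)[:k]

        weights = defaultdict(float)
        for u in neighbors:
            for t in user_resource_tags.get((u, rq), set()):
                weights[t] += sims[u] / k  # average neighbor similarity
        # Optional boost for tags the query user has applied before.
        for t in weights:
            if t in user_models[uq]:
                weights[t] += b
        return sorted(weights, key=weights.get, reverse=True)[:n]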
As an additional step to the u~t = hw(t1 ), w(t2 )...w(t|T |)i (6) modified K-Nearest Neighbor recommender, b is added to In calculating the vector weights a variety of measures can the weight of the tag if the user has previously applied that be used: binary, term frequency or term frequency*inverse tag to another resource. foreach t ∈ T that any u ∈ N has applied to rq do the annotations were collected. This resulted in 99,864 dis- if (uq has applied t) then tinct usernames. wt = wt + b For each user, the “Network” and “Fans” were explored end recursively collecting additional usernames. A user’s Net- end work consists of the other users that the user has explicitly Algorithm 3: Optional Step Including Boost K-Nearest chosen to watch. Conversely a Fan is another user that has Neighbor explicitly chosen to watch the user. This resulted in a total of 524,790 usernames. From 10/20/2008 to 12/15/2008 the complete profiles of FolkRank all 524,790 users were collected. Each user profile con- In (Hotho et al. 2006) the authors proposed an adaptation of sisted of a collection of annotations including the resource, link analysis to the Folksonomy data structure. They have tags and date of the original bookmark. The top 100 most called this technique Folkrank since it computes a Pagerank prolific users were visually inspected; twelve were removed vector from the tripartite graph induced by the Folksonomy. from the data because their annotation count was many or- This graph is generated by regarding U ∪ R ∪ T as the set of ders of magnitude larger than other users and were therefore vertices. Edges are defined by the two-dimensional projec- suspected to be Web-bots. tions, UT, UR and RT. Due to memory and time constraints, 10% of the user pro- If we regard the adjacency matrix of this graph, W , (nor- files was randomly selected. A P -core of 20 was derived malized to be column-stochastic), a damping factor, d, and a such that each user, resource and tag appear in at least 20 preference vector, p, then we compute the Pagerank vector, posts where a post is defined as a user, resource and all tags w, in the usual manner: that user applied to the resource. The result was a dataset with 18,105 users, 42,646 re- w = dAw + (1 − d)p (11) sources and 13,053 tags. There are 2,309,426 annotations and 8,815,545 triples. The average number of tags in a post However due to the symmetry inherent in the tripartite is 3.82. graph, this basic Pagerank can too easily focus on the most Citeulike is a popular online tool used by researchers to popular elements in the Folksonomy. The Folkrank vector manage and discover scholarly references. They make their is taken as a difference between two computations of Pager- dataset freely available to download4. On 2/17/2009 the ank: one with a preference vector and one without the pref- most recent snapshot was downloaded with data extending erence vector. back to 5/30/2007. The data contains anonymous user ids In order to generate tag recommendations Folkrank uti- and posts for each user including resources, the date and lizes the preference vector to bias the algorithm towards the time of the posting and the tags applied to the resource. The query user and resource(Jaschke et al. 2007). These ele- original dataset contains 41,689 users, 1,370,729 resource ments are given a substantial weight in the preference vector and 284,389 tags. where all other elements have uniformly small weights. 
Experimental Evaluation

Here we describe the methods used to gather data for the experiments and provide details of our datasets. We then discuss modifications of N-fold cross-validation for Folksonomies and describe our experimental methodology. We briefly discuss the common metrics recall and precision and then detail the results of our experiments.

Data Sets

We validate our approach through extensive evaluation of the proposed modifications using data from two real Collaborative Tagging Applications: Delicious and Citeulike.

Delicious is a popular Website in which users annotate URLs. On 10/19/2008, 198 of the most popular tags were taken from the user interface. For each of these tags, the 2,000 most recent annotations, including the contributors of the annotations, were collected. This resulted in 99,864 distinct usernames. For each user, the "Network" and "Fans" were explored, recursively collecting additional usernames. A user's Network consists of the other users that the user has explicitly chosen to watch. Conversely, a Fan is another user that has explicitly chosen to watch the user. This resulted in a total of 524,790 usernames.

From 10/20/2008 to 12/15/2008 the complete profiles of all 524,790 users were collected. Each user profile consisted of a collection of annotations including the resource, the tags and the date of the original bookmark. The top 100 most prolific users were visually inspected; twelve were removed from the data because their annotation counts were many orders of magnitude larger than those of other users, and they were therefore suspected to be Web-bots.

Due to memory and time constraints, 10% of the user profiles were randomly selected. A P-core of 20 was derived such that each user, resource and tag appears in at least 20 posts, where a post is defined as a user, a resource and all tags that user applied to the resource. The result was a dataset with 18,105 users, 42,646 resources and 13,053 tags, containing 2,309,427 posts and 8,815,545 annotations. The average number of tags in a post is 3.82.

Citeulike is a popular online tool used by researchers to manage and discover scholarly references. They make their dataset freely available to download (http://www.citeulike.org/faq/data.adp). On 2/17/2009 the most recent snapshot was downloaded, with data extending back to 5/30/2007. The data contains anonymous user ids and posts for each user, including the resources, the date and time of the posting and the tags applied to the resource. The original dataset contains 41,689 users, 1,370,729 resources and 284,389 tags.

Because of its relatively small size and sparse data, the Citeulike data cannot support a P-core of 20. Instead a P-core of 5 was derived. This reduced the size of the data to 2,051 users, 5,376 resources and 3,343 tags, containing 42,277 posts and 105,873 annotations. The average number of tags in a post is 2.50.

    Folksonomy     Delicious    Citeulike
    Users          18,105       2,051
    Resources      42,646       5,376
    Tags           13,053       3,343
    Posts          2,309,427    42,277
    Annotations    8,815,545    105,873

    Table 1: Datasets

An important distinction between the two datasets is their focus. Users in Delicious are able to tag any URL available on the Web; as such, an individual's interests are often varied, encompassing many topics. In Citeulike, however, researchers tag scholarly publications, and their tagging is often focused in their area of expertise.

The data available for Delicious is also far more abundant. Using only a fraction of the data scraped from the Website, the Delicious dataset still has more than fifty times the annotations of the Citeulike dataset. Moreover, the Delicious data is far denser, supporting a P-core of 20 rather than a P-core of 5.
However other tech- Each test case consists of a user, u, a resource, r, and all niques such as stemming and thesaurus tables could provide the tags the user has applied to that resource. These tags, Th , the same utility. are analogous to the holdout set commonly used in Informa- The example “toread” demonstrates that this method tion Retrieval. The tag recommendation algorithms accept goes beyond what other techniques might provide. Where the user-resource pair and return an ordered set of recom- “photo” had been a descriptive tag, “toread” is an organiza- mended tags, Tr . From the holdout set and recommendation tional tag. The ability of the Hebbian features to match it set utility metrics were calculated. with things that will in fact be read and other organization For each evaluation metric the average value was calcu- tags illustrated the effectiveness of Hebbian Deflation. lated across all test cases of an individual fold. The average Moreover in the example of “folksonomies” show how was then calculated across all folds. Experiments completed highly related words can be discovered through the Heb- on Delicious consisted of 10 folds, while experiments on Ci- bian features. “Tags” are of course a crucial part of “folk- teulike had 5 folds. sonomies” that provide “classification” by providing “meta- The exception to this methodology are the experiments data.” completed for Folkrank. Due to the steep computational re- Domain specific words like “mac” or “osx” are difficult quired for this approach only one post from each user was to relate, but again Hebbian features are able to highlight placed in the testing set. Experiments were then run on this their similarity. Perhaps the most intriguing example is the single testing set as described in (Hotho et al. 2006). subjective tag “cool.” Hebbian features are able to find two other subjective tags with high similarity, “interesting” and Experimental Metrics “fun”. Recall is a common metric for evaluating the utility of rec- Tags can also be modeled from features extracted from ommendation algorithms. It measures the percentage of UT; Similarly users and resources may be modeled from items in the holdout set that appear in the recommendation features extracted from the projections. In all cases similar set. Recall is a measure of completeness and is defined as: trends are observed. Table 3 shows the same experiment completed with data recall = |Th ∩ Tr |/|Th| (12) from Citeulike. Again RT matrix was built from count data. The learning rate is set to .001. As before, 2000 epochs were Precision is another common metric for measuring the selected, but RMSE testing on a holdout set halted the train- usefulness of recommendation algorithms. It measures the ing of features when overfitting was detected. 100 features percentage of items in the recommendation set that appear were extracted, but only the first 20 showed progress in re- in the holdout set. Precision measures the exactness of the ducing RMSE. recommendation algorithm and is defined as: The related tags in Citeulike show similar trends to those found in Delicious. However because of its sparsity and rel- precision = |Th ∩ Tr |/|Tr | (13) atively small size, it appears to be more difficult to extract features. Nevertheless visual inspection of the tags reaffirms Hebbian Features the assumption that Hebbian Deflation and cosine similarity A fundamental assumption of user models based upon Heb- offer a method to discover meaningful features. 
Hebbian Features

A fundamental assumption of user models based upon Hebbian deflation is that meaningful features can be extracted from the two-dimensional projections. In order to affirm this assumption we have taken the two-dimensional projection of the Delicious dataset, RT, using the tag counts as the weights, and performed Hebbian deflation to generate RX and TX. A learning rate of 0.001 was chosen, as this value allows the features to converge quickly and smoothly. Initially 100 features were built, but examination of the RMSE on a 2% overtraining-holdout set showed little improvement beyond 50 features. Additionally, 2000 epochs were selected to train each feature, but in all cases the algorithm halted the training when it detected overfitting and proceeded to the next feature.

Table 2 shows selected tags along with their six most similar neighbors. Similar results have been demonstrated using co-occurrence, FolkRank and context metrics (Salton, Wong, and Yang 1975). Similarities are calculated by representing each tag as a vector of weights over the Hebbian features and calculating the cosine similarity between two tags.

    photo                     shopping
    0.789  photos             0.803  Shopping
    0.737  photography        0.780  shop
    0.652  pictures           0.560  buy
    0.640  foto               0.530  google
    0.637  Photo              0.519  store
    0.627  fotos              0.469  handmade

    toread                    folksonomy
    0.886  article            0.807  tagging
    0.816  articles           0.786  tags
    0.811  advice             0.691  tag
    0.810  Bookmarks          0.635  classification
    0.808  to read            0.574  metadata
    0.805  interesting        0.523  folksonomies

    mac                       cool
    0.849  osx                0.711  interesting
    0.840  Mac                0.673  fun
    0.778  apple              0.621  imported
    0.768  OSX                0.608  how-to
    0.653  macosx             0.595  article
    0.650  Apple              0.571  useful

    Table 2: Selected Delicious tags and their nearest neighbors using
    cosine similarity and 50 Hebbian features extracted from the RT_tf
    (count) matrix

Initial examination of the nearest tags lends credence to the assumption that Hebbian deflation can be used to discover meaningful features. For example, the nearest tags to "photo" are clearly redundant tags. However, other techniques such as stemming and thesaurus tables could provide the same utility. The example "toread" demonstrates that this method goes beyond what other techniques might provide. Where "photo" had been a descriptive tag, "toread" is an organizational tag. The ability of the Hebbian features to match it with things that will in fact be read, and with other organizational tags, illustrates the effectiveness of Hebbian deflation.

Moreover, the example of "folksonomy" shows how highly related words can be discovered through the Hebbian features: "tags" are of course a crucial part of "folksonomies", which provide "classification" by providing "metadata". Domain specific words like "mac" or "osx" are difficult to relate, but again the Hebbian features are able to highlight their similarity. Perhaps the most intriguing example is the subjective tag "cool", for which the Hebbian features find two other subjective tags with high similarity, "interesting" and "fun".

Tags can also be modeled from features extracted from UT; similarly, users and resources may be modeled from features extracted from the projections. In all cases similar trends are observed.

Table 3 shows the same experiment completed with data from Citeulike. Again the RT matrix was built from count data, and the learning rate was set to 0.001. As before, 2000 epochs were selected, but RMSE testing on a holdout set halted the training of features when overfitting was detected. 100 features were extracted, but only the first 20 showed progress in reducing the RMSE.

    datamining                networks
    0.981  computational      0.840  social networks
    0.935  conceptual         0.792  classification
    0.933  data-mining        0.751  community
    0.930  webservices        0.740  socialnetworks
    0.925  tools              0.738  graph
    0.915  data mining        0.737  functional

    Table 3: Selected Citeulike tags and their nearest neighbors using
    cosine similarity and 20 Hebbian features extracted from the RT_tf
    (count) matrix

The related tags in Citeulike show trends similar to those found in Delicious. However, because of its sparsity and relatively small size, it appears to be more difficult to extract features from this dataset. Nevertheless, visual inspection of the tags reaffirms the assumption that Hebbian deflation and cosine similarity offer a method to discover meaningful features.

While this paper proposes using Hebbian features to model users, these features offer many other uses for Folksonomies. Hebbian features might provide a means for users to navigate the Folksonomy. After selecting a user, resource or tag, the system could present additional items based upon the Hebbian features. The user may then select an item from that list and explore other users, resources or tags that are similar. By repeating this process the user could traverse the Folksonomy or focus in on a particular domain. Hebbian features might be particularly useful in this task since they are able to uncover relationships in the Folksonomy that other methods, such as co-occurrence, are unable to discover.

The individual features themselves might also be interesting. A domain expert could analyze the features in an effort to understand how the Folksonomy is growing, what aspects are dominating, and which users are having the most impact. Moreover, this reduced yet rich feature space can be utilized in a variety of data mining tasks: recommendation, personalization, search, navigation, etc.
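The nearest-neighbor lists in Tables 2 and 3 can be reproduced from the TX factor alone. A sketch, assuming tx is a list of feature vectors indexed by tag and tag_index maps tag strings to row indices; both names are illustrative:

    import math

    def nearest_tags(tx, tag_index, query_tag, top=6):
        """Rank tags by cosine similarity of their Hebbian feature vectors."""
        def cos(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(y * y for y in b))
            return dot / (na * nb) if na and nb else 0.0

        q = tx[tag_index[query_tag]]
        scores = {tag: cos(tx[i], q)
                  for tag, i in tag_index.items() if tag != query_tag}
        return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top]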
Experimental Results

Here we present our experimental results, beginning with the tuning of variables. We discuss the impact of the boost variable on the quality of the K-Nearest Neighbor algorithm, then provide an in-depth comparison of the recommendation techniques.

The experiments with K-Nearest Neighbor require the tuning of two key variables: k, the number of neighbors, and b, the boosting factor.

Figure 1 shows the relation between k and the evaluation metrics recall and precision for a recommendation set of size 5. The Delicious dataset was used for this experiment. Weights in the user vector are calculated as the frequency of tags, or tf. This experiment does not include any boosting factor for previously used tags. As k increases, so do recall and precision. However, this improvement suffers from diminishing returns, until a k of 100 offers little more benefit than a k of 50. This trend was observed for all K-Nearest Neighbor experiments. As such, all K-Nearest Neighbor experiments were completed using a k of 50. Similar results are observed for the Citeulike dataset.

    Figure 1: The effect of k in KNN on recall and precision for a
    recommendation set of 5 tags. Users are modeled as a vector over the
    tag space.

Figure 2 demonstrates the effectiveness of the boosting modification for K-Nearest Neighbor. The modification gives extra weight to those tags the user has previously applied to a resource. This experiment is completed with a k of 50; b is adjusted in the range of 0 through 0.20 at 0.025 increments. Both recall and precision for a recommendation set of 5 tags are sharply improved when b is increased to 0.05. Afterward, the effect of the boosting parameter slowly diminishes.

    Figure 2: The effect of boosting previously used tags in KNN on
    recall and precision for a recommendation set of 5 tags. Users are
    modeled as a vector over the tag space.

Table 4 provides a more detailed view of the effect boosting can have. For example, without any boosting factor the precision for a recommendation set of size 3 is 43.5%. With the boosting factor, precision is increased to 45.9%, an immediate 2.4% gain. For recommendation sets of every size, precision is increased; the boosting factor enables K-Nearest Neighbor to become more precise. Likewise, recall increases across the board. For example, recall given a recommendation set of size 10 jumps from 66.9% to 69.5%, a 2.6% increase. Boosting therefore appears to increase the completeness of the recommendation set.

This behavior is consistent with all K-Nearest Neighbor experiments conducted on the Delicious dataset, across all user models and all values of k: boosting results in an approximate 2.0% to 4.0% increase in recall and precision. The optimum value for b is between 0.05 and 0.075. For consistency, all other K-Nearest Neighbor experiments are run using a boosting factor of 0.05.

Similar results were discovered using the Citeulike data. Precision for a recommendation set of size 1 jumps from 40.4% to 50.9%, a dramatic increase of 10.5%. However, this improvement diminishes as the size of the recommendation set is increased. Precision for a recommendation set of 5 climbs from 18.4% to 20.8%, a 2.4% improvement, and for a recommendation set of size 10 the improvement shrinks to 0.3%. An examination of recall shows similar signs. For a recommendation set of size 1, the improvement is 5.4%. The improvement drops to 4.7% for a recommendation set of size 5 and drops further to 1.4% when N is increased to 10.

    Delicious
    KNN-UT, b = 0.00            KNN-UT, b = 0.05
    N    Recall   Precision     N    Recall   Precision
    1    0.211    0.567         1    0.232    0.606
    2    0.332    0.489         2    0.361    0.519
    3    0.418    0.435         3    0.450    0.459
    4    0.485    0.394         4    0.516    0.413
    5    0.538    0.361         5    0.568    0.376
    6    0.580    0.333         6    0.609    0.345
    7    0.613    0.307         7    0.641    0.317
    8    0.636    0.282         8    0.664    0.291
    9    0.653    0.259         9    0.682    0.268
    10   0.669    0.239         10   0.695    0.248

    Citeulike
    KNN-UT, b = 0.00            KNN-UT, b = 0.05
    N    Recall   Precision     N    Recall   Precision
    1    0.201    0.404         1    0.255    0.509
    2    0.292    0.309         2    0.355    0.377
    3    0.346    0.252         3    0.407    0.299
    4    0.383    0.213         4    0.439    0.247
    5    0.407    0.184         5    0.456    0.208
    6    0.427    0.162         6    0.465    0.178
    7    0.444    0.145         7    0.474    0.157
    8    0.457    0.131         8    0.479    0.139
    9    0.467    0.120         9    0.484    0.125
    10   0.474    0.110         10   0.488    0.113

    Table 4: The recall and precision for the top 10 recommended tags of
    K-Nearest Neighbor applied to the Delicious and Citeulike datasets.
    Users are modeled as vectors over the tag space; vector weights are
    computed as the tag frequency; k was set to 50. N is the number of
    tags recommended. Detailed results for two different boost values,
    0.00 and 0.05, are presented.
In general, boosting tags based upon previous usage is demonstrated to add additional utility to the K-Nearest Neighbor algorithm. Yet the differences in the improvements between Delicious and Citeulike offer additional insights. First, the average number of tags per post in Delicious is 3.82. Citeulike has fewer, with 2.50 tags per post, only 65% of the number found in Delicious. As a result, Citeulike presents a smaller target for tag recommendation. Moreover, since the holdout set is smaller for Citeulike, boosting will have a greater impact on smaller recommendation sets than on larger sets when contrasted with Delicious. This is observed as the improvement to precision garnered from boosting drops from 10.5% when the recommendation set contains 1 tag to only 0.3% when the recommendation set consists of 10 tags. Improvements in Delicious, on the other hand, are more stable, dropping from 3.9% to 1.8%. An examination of recall shows a parallel trend; in both cases recall improves as the size of the recommendation set increases, subject to diminishing returns. However, the effect of diminishing returns hampers Citeulike recommendations earlier and more strongly than it affects Delicious. These observations suggest that care must be taken when comparing recommendation algorithms across multiple Folksonomies.

Furthermore, the two datasets have a markedly different focus. Delicious users are able to annotate any URL on the Web, often tagging resources across many different topics. Citeulike users, on the other hand, are focused on scholarly publications and often concentrate primarily on their area of expertise. This observation suggests that recommendation algorithms giving added weight to resources may be more appropriate for Delicious, since user information may befuddle the recommendation algorithm by incorporating unrelated tags. Conversely, if a user in Citeulike has annotated items related to his research and does not stray from this topic, a user profile based on tags could offer exceptional utility.

Evidence for this analysis is provided by the difference in precision and recall between the two datasets. For a recommendation set of size 1, boosting a user's previously assigned tags offers a 3.9% gain in Delicious and a 10.5% gain in Citeulike. This trend continues as N increases, until the improvements gradually diminish. This stark contrast suggests that recommendation algorithms augmented by boosting tags offer more gain for focused Folksonomies such as Citeulike than for broader Folksonomies such as Delicious.

Having ascertained the optimal values for k and the boosting factor, we turn our attention to the user models for K-Nearest Neighbor. Experimental results are shown in Figure 3, detailing the recall and precision for recommendation sets of size 1 through 10.
For comparison purposes, recommendation algorithms based on popularity are included. "Most Popular" recommends the most popular tags in the dataset. "Most Popular by User" recommends the most popular tags for a particular user. "Most Popular by Resource" recommends the most popular tags for a given resource. K-Nearest Neighbor models users as vectors over the tag space (KNN-UT), vectors over the resource space (KNN-UR), a concatenation of the two (KNN-UR+T), a strict combination of every resource-tag pair (KNN-U(RT)), and as features derived through Hebbian deflation on either the UT or UR matrix (KNN-UT-Hebbian or KNN-UR-Hebbian). For all K-Nearest Neighbor models, k is set to 50 and the boost factor is set to 0.05. FolkRank is provided for further comparison; it adapts link analysis to the Folksonomy structure, recommending tags through manipulation of the preference vector.

    Figure 3: The comparison of tag recommender strategies for Delicious.

The approach that merely recommends tags that are popular throughout the Folksonomy achieves poor results on the Delicious dataset. Recommending popular tags for a specific user fares better, but is clearly outdone by recommending popular tags for a specific resource. Nearly all user models for K-Nearest Neighbor surpass these techniques based upon popularity. In particular, models that treat each user as a vector over the set of tags appear to perform best for the Delicious dataset. FolkRank offers additional completeness, as seen by its superior recall, but offers less specificity, as measured by its precision.

    Figure 4: The comparison of tag recommender strategies for Citeulike.

Similar trends are observed for Citeulike, with a few notable exceptions. First, recommendations that rely on the popularity of a tag given a user outperform recommendations based on the popularity of a resource. This reaffirms our notion that the focused nature of Citeulike is better suited to algorithms that rely on user-tag information, whereas resource-tag information is critical in broader Folksonomies where a user's annotations cover multiple topic areas. FolkRank outperforms the other methods as a measure of recall to a larger extent than in the Delicious dataset, but trails further as a measure of precision.

As in Delicious, K-Nearest Neighbor which treats users as vectors over the set of tags performs strongly in Citeulike, whereas the effectiveness of the other approaches varies. The ability of this model to outperform other methods in both datasets should be noted; this is due to the inherent comprehensiveness of the KNN-UT algorithm. Only those users that have tagged the query resource are considered for the neighborhood, resulting in an algorithm that focuses on user-resource information. Then, only those tags that have been applied to the query resource are considered for the recommendation set, focusing on the resource-tag connections. Finally, by treating user models as vectors over the tag space, the recommender incorporates user-tag relationships. Consequently, the KNN-UT algorithm accounts for all three aspects of the Folksonomy and is adaptable to many Folksonomies that may require an emphasis on specific relationships. Not surprisingly, the user models that perform nearly as well as KNN-UT are KNN-UR+T and KNN-UT-Hebbian, which share the same characteristic, though KNN-UT-Hebbian performs relatively poorly in Citeulike, likely because this dataset is sparse and the Hebbian features are more difficult to extract.

Beyond recall and precision, time constraints should be considered when evaluating tag recommendation algorithms. Recommenders based on popularity can perform much of the computation offline, thereby streamlining the recommendation process. K-Nearest Neighbor, on the other hand, is often referred to as a lazy algorithm since it performs the bulk of its computation during the query process. However, since the proposed modifications limit the number of similarities that must be calculated to only those users that have tagged the query resource, the algorithm scales very well with large datasets. The computational cost may be reduced further if Hebbian features are extracted from the Folksonomy, thereby reducing the length of the vectors used for calculating cosine similarity. Hebbian deflation, however, is appropriate only when the data is dense enough that meaningful features can be extracted.

FolkRank, while it performs well in tag recommendation, is hampered by computational costs, requiring a complete calculation of the PageRank vector for each query. For example, computing 18,105 recommendations required 80 hours of computation on a modern dual-core desktop. In contrast, 2,309,427 recommendations were completed in less than 1 hour using K-Nearest Neighbor.
Conclusions and Future Work

In this work we have proposed using K-Nearest Neighbor for tag recommendation in Folksonomies. Due to the unique data structure of Folksonomies, modifications are required to adapt the algorithm. Neighbors are selected only if they have tagged the query resource, and tags are selected for the recommendation set only if they have been applied by a neighbor to the query resource. These modifications tie user-resource and resource-tag information into the algorithm while dramatically reducing the computational costs. There exists a myriad of ways in which to calculate user similarity; we have found that cosine similarity between users modeled as vectors over the tag space performs well. This model incorporates user-tag information into the algorithm. By including all three relationships inherent in Folksonomies, the algorithm is robust for both broad and narrow Folksonomies. In addition, K-Nearest Neighbor can be improved by boosting tags the user has previously used. The performance of K-Nearest Neighbor exceeds that of recommendation algorithms based on popularity, while its running time makes it computationally viable for large real world Folksonomies.

In the future we plan to investigate alternative tag recommendation strategies and to study resource and user recommendation algorithms. Other approaches, such as association rule mining and neural networks, are worth considering for recommendation in Folksonomies. Probabilistic Latent Semantic Analysis offers an alternative means to derive features from the Folksonomy. Feature extraction of any sort presents intriguing opportunities in search, navigation, personalization and recommendation.

Acknowledgments

This work was supported in part by the National Science Foundation Cyber Trust program under Grant IIS-0430303 and a grant from the Department of Education, Graduate Assistance in the Area of National Need, P200A070536.
References

Adrian, B.; Sauermann, L.; and Roth-Berghofer, T. 2007. ConTag: A semantic tag recommendation system. In Pellegrini, T., and Schaffert, S., eds., Proceedings of I-Semantics '07, 297–304. JUCS.

Basile, P.; Gendarmi, D.; Lanubile, F.; and Semeraro, G. 2007. Recommending smart tags in a social bookmarking system. In Bridging the Gap between Semantic Web and Web 2.0 (SemNet 2007), 22–29.

Batagelj, V., and Zaveršnik, M. 2002. Generalized cores. arXiv preprint cs/0202039.

Golder, S. A., and Huberman, B. A. 2006. Usage patterns of collaborative tagging systems. Journal of Information Science 32(2):198.

Heymann, P.; Ramage, D.; and Garcia-Molina, H. 2008. Social tag prediction. In SIGIR '08: Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 531–538. New York, NY, USA: ACM.

Hotho, A.; Jaschke, R.; Schmitz, C.; and Stumme, G. 2006. Information retrieval in folksonomies: Search and ranking. The Semantic Web: Research and Applications 4011:411–426.

Jaschke, R.; Marinho, L.; Hotho, A.; Schmidt-Thieme, L.; and Stumme, G. 2007. Tag recommendations in folksonomies. Lecture Notes in Computer Science 4702:506.

Lipczak, M. 2008. Tag recommendation for folksonomies oriented towards individual users. In Proceedings of the ECML/PKDD 2008 Discovery Challenge Workshop, part of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases.

Macgregor, G., and McCulloch, E. 2006. Collaborative tagging as a knowledge organisation and resource discovery tool. Library Review 55(5):291–300.

Mathes, A. 2004. Folksonomies: Cooperative classification and communication through shared metadata. Computer Mediated Communication (Doctoral Seminar), Graduate School of Library and Information Science, University of Illinois Urbana-Champaign, December.

Mika, P. 2007. Ontologies are us: A unified model of social networks and semantics. Web Semantics: Science, Services and Agents on the World Wide Web 5(1):5–15.

Nakamoto, R. Y.; Nakajima, S.; Miyazaki, J.; Uemura, S.; Kato, H.; and Inagaki, Y. 2008a. Reasonable tag-based collaborative filtering for social tagging systems. In WICOW '08: Proceedings of the 2nd ACM Workshop on Information Credibility on the Web, 11–18. New York, NY, USA: ACM.

Nakamoto, R. Y.; Nakajima, S.; Miyazaki, J.; Uemura, S.; and Kato, H. 2008b. Investigation of the effectiveness of tag-based contextual collaborative filtering in website recommendation. In Advances in Communication Systems and Electrical Engineering, 309–318. Springer.

Oja, E., and Karhunen, J. 1985. On stochastic approximation of the eigenvectors and eigenvalues of the expectation of a random matrix. Journal of Mathematical Analysis and Applications 106(1):69–84.

Salton, G., and Buckley, C. 1988. Term-weighting approaches in automatic text retrieval. Information Processing and Management 24(5):513–523.

Salton, G.; Wong, A.; and Yang, C. 1975. A vector space model for automatic indexing. Communications of the ACM 18(11):613–620.

Schmitz, C.; Hotho, A.; Jaschke, R.; and Stumme, G. 2006. Mining association rules in folksonomies. In Proceedings of the IFCS 2006 Conference, 261–270. Springer.

Sigurbjörnsson, B., and van Zwol, R. 2008. Flickr tag recommendation based on collective knowledge. In WWW '08: Proceedings of the 17th International Conference on World Wide Web, 327–336. New York, NY, USA: ACM.

Song, Y.; Zhuang, Z.; Li, H.; Zhao, Q.; Li, J.; Lee, W.-C.; and Giles, C. L. 2008. Real-time automatic tag recommendation. In SIGIR '08: Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 515–522. New York, NY, USA: ACM.

Tso-Sutter, K. H. L.; Marinho, L. B.; and Schmidt-Thieme, L. 2008. Tag-aware recommender systems by fusion of collaborative filtering algorithms. In SAC '08: Proceedings of the 2008 ACM Symposium on Applied Computing, 1995–1999. New York, NY, USA: ACM.

Van Rijsbergen, C. 1979. Information Retrieval. Newton, MA, USA: Butterworth-Heinemann.

Xu, Z.; Fu, Y.; Mao, J.; and Su, D. 2006. Towards the semantic web: Collaborative tag suggestions. In Collaborative Web Tagging Workshop at WWW2006, Edinburgh, Scotland, May.