<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>RSDC'09: Tag Recommendation Using Keywords and Association Rules</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jian Wang</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Liangjie Hong</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Brian D. Davison</string-name>
          <email>davison@cse.lehigh.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science and Engineering, Lehigh University</institution>
          ,
          <addr-line>Bethlehem, PA 18015</addr-line>
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>While a webpage usually contains hundreds of words, only two to three tags are typically assigned to it. Most tags can be found in aspects related to the page, such as the page's own content, the anchor texts around the page, and the user's own opinion about the page. Thus it is not easy to extract the two to three most appropriate tags to recommend to a target user. In addition, the recommendations should be unique for every user, since everyone's perspective on the page is different. In this paper, we treat the task of recommending tags as finding the most likely tags that would be chosen by the user. We first apply the TF-IDF algorithm to the limited description of the page content in order to extract keywords for the page. Based on these top keywords, association rules from history records are used to find the most probable tags to recommend. In addition, if the page has been tagged before by other users, or the user has tagged other resources before, that history information is also exploited to find the most appropriate recommendations.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Social bookmarking services allow users to store and share references to various
types of World Wide Web (WWW) resources. Users can assign tags to these
resources: a few words that best describe the resource content and the user's
opinion of it. To assist the tagging process, some services provide
recommendations to users. Tatu et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] reported
that the average number of tags in the RSDC'08 bookmarking data is two to three.
Thus, it is not an easy task to provide reasonable tag recommendations for a
resource that has only two to three related tags on average. Tag recommendation is
a challenge task in ECML PKDD 2009, where participants should provide either
content-based or graph-based methods to help users assign tags. This work
presents results that address this challenge.
      </p>
      <p>The challenge provides descriptions of the resources and the posts that tag them.
A description contains some basic information about a resource, and a post is a
tuple of user, tag and resource. There are two types of resources in the challenge:
normal web pages, named bookmark, and research publications, named
bibtex, with different description schemas. A post records the resource and the
tags assigned to it by a particular user. The task is to provide new tags for new
resources with high F-Measure performance on the top five recommendations.
The difficulties of this challenge are:</p>
      <list list-type="bullet">
        <list-item>
          <p>How to take advantage of the record content itself when the description is
very limited? For example, a bookmark is described only by the title of the
web page and a short summary, while a bibtex is usually described by the title,
publication name, and authors of the paper.</p>
        </list-item>
        <list-item>
          <p>How to use history information to recommend tags that do not appear
in the page content? Though we can use keywords to help find possible tags,
tags are not just keywords. A tag could be the user's opinion about the page, the
category of the page, and so on. This kind of tag might be found
by using history information.</p>
        </list-item>
        <list-item>
          <p>How to choose the most appropriate two to three tags from the pool of
candidates? By analyzing the page content and history information, we might obtain
a pool that contains the reasonable tag recommendations. Yet we cannot
recommend all of them to the user. Instead, only two to three tags need
to be extracted from that pool.</p>
        </list-item>
      </list>
      <p>In order to solve the above problems, we propose tag recommendation using
both keywords in the content and association rules from history records. Once
we have a pool of potentially appropriate tags, we apply a
method, named common and combine, to extract the most probable ones to
recommend. Our evaluation showed that integrating association rules gives
better F-Measure performance than simply using keywords.</p>
      <p>Besides using association rules, some history information is used more
directly when the resource has been tagged before or the target user has tagged other
documents before. These history records greatly improve recommendation
performance.</p>
      <p>In this paper, we tuned some parameters in our recommendation system to
generate the best F-Measure performance while recommending at most five tags.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>
        Lipczak [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] proposed a recommendation system mainly based on individual posts
and the title of the resource. The key conclusion of those experiments is that
one should not rely only on previously attached tags when making
recommendations: sparsity of data and individuality of users greatly reduce the usefulness
of previous tuple data. When looking for potential tags, one should focus on the
direct surroundings of the post, which suggests a graph-based method. Tatu et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]
proposed a recommendation system that takes advantage of textual content and
semantic features to generate tag recommendations. Their system outperformed
other systems in last year's challenge. Katakis et al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] proposed a multilabel
text classification recommendation system that used titles, abstracts and
existing users to train a tag classifier.
      </p>
      <p>
        In addition, Heymann et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] demonstrated that "Page text was strictly
more informative than anchor text which was strictly more informative than
surrounding hosts", which suggests that we do not have to crawl other information
besides page content. They also showed that the use of association rules can help
to find recommendations with high precision.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Dataset Analysis and Processing</title>
      <sec id="sec-3-1">
        <title>Dataset from the Contest</title>
        <p>Three table files were provided by the contest: bookmark, bibtex and tas.
The bookmark file contains information for bookmark data such as contentID,
url, url-hash, description and creation date. The bibtex file contains information
for bibtex data such as contentID and all other related publication information.
The tas file contains the (user, tag, resource) tuples, as well as their
creation dates. The detailed characteristics of these files can be found in Table
1. In this work, all content was transformed to lower case, since the evaluation
process of this contest ignores case. We also filtered out the LaTeX formatting
when we exported bibtex data from the database.</p>
        <p>We considered and tried merging duplicate records during the training process,
yet found it did not help much. Thus we kept the duplicate records when building
our experiment collections. Since our proposed tag recommendation approach
does not involve a training process, we did not initially separate the dataset into training
and testing sets. We evaluated our recommendation system on all
documents in the given dataset. Based on the type of documents, there are three
different collections in our dataset:</p>
        <p>bookmark collection from dataset provided. We created a collection,
bookmark_more, to contain all bookmark information provided by the
contest training dataset. Every document in the collection corresponds to a unique
contentID in the bookmark file. It contains all information for that record,
including description and extended description. There are 263,004 documents in this
collection.</p>
        <p>During the experiment, we crawled the external webpage for every contentID.
Yet the performance showed that the external webpages are not as useful as
the simple description provided by the contest. Regardless of performance, crawling
also takes too much time, which is not realistic for online tag recommendation. In
addition, an external webpage usually contains too many terms, which makes it
even harder to extract two to three appropriate terms to recommend as tags.</p>
        <p>bibtex collection from dataset provided. We created a collection,
bibtex_original, to contain all bibtex information provided by the original
dataset. Every document in the collection corresponds to a unique contentID in the
bibtex file. It contains all information for that record, including all attributes in
Table 1 except simhash0, simhash1 and simhash2. There are 158,924 documents
in this collection.</p>
        <p>bibtex collection from external resources. If the url of a bibtex record
points to an external website such as portal.acm.org or citeseer, we crawled
that webpage and extracted useful information for this record. All these
documents are stored in another collection, bibtex_parsed. Similarly, every document in the collection
corresponds to a unique contentID in the bibtex file. There are 3,011 documents in
this collection.</p>
        <sec>
          <title>Keyword-AssocRule Recommendation</title>
          <p>We consider the tag recommendation problem as finding the most probable terms
that would be chosen by users. In this paper, P(X) indicates the probability that
term X is assigned to the document as a tag. For every document, terms
with high P(X) have priority to be recommended.</p>
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>Keyword Extraction</title>
        <p>In this step, our assumption is that the more important a term is in the
document, the more probable it is that the term will be chosen as a tag.</p>
        <p>
          We used two term weighting functions, TF-IDF and Okapi BM25 [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], to
extract "keywords" from resources. Within a single collection, we calculated the TF-IDF
and BM25 values for every term in every document.
        </p>
        <p>For TF-IDF, the weighting function is defined as follows:
          <disp-formula>
            <label>(1)</label>
            <tex-math>\text{TF-IDF} = TF_{t,d} \times IDF_t</tex-math>
          </disp-formula>
where TF<sub>t,d</sub> is the term frequency, equal to the number of occurrences of
term t in document d. IDF<sub>t</sub> is the inverse document frequency, defined as:
          <disp-formula>
            <label>(2)</label>
            <tex-math>IDF_t = \log \frac{N}{df_t}</tex-math>
          </disp-formula>
where df<sub>t</sub> is the number of documents in the collection that contain the term t and
N is the total number of documents in the corpus.</p>
        <p>For Okapi BM25, the weighting function is defined as follows:
          <disp-formula>
            <label>(3)</label>
            <tex-math>\mathrm{BM25} = \sum_{i=1}^{n} IDF(q_i) \cdot \frac{TF_{t,d}\,(1 + k_1)}{TF_{t,d} + k_1 \left(1 - b + b \cdot \frac{L_d}{L_{ave}}\right)}</tex-math>
          </disp-formula>
where TF<sub>t,d</sub> is the frequency of term t in document d, and L<sub>d</sub> and L<sub>ave</sub> are the
length of document d and the average document length for the whole collection.
IDF(q<sub>i</sub>) here is defined as
          <disp-formula>
            <label>(4)</label>
            <tex-math>IDF(q_i) = \log \frac{N - n(q_i) + 0.5}{n(q_i) + 0.5}</tex-math>
          </disp-formula>
        </p>
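        <p>As an illustration, the per-term BM25 weight can be sketched in Python as below. This is a minimal sketch, not the implementation used in the experiments: it computes one summand of the BM25 formula for each term of each document, the function name bm25_weights is hypothetical, and the defaults k1 = 1.2 and b = 0.75 are assumed common values, since the parameter settings are not reported in this paper.</p>
        <preformat>
```python
import math
from collections import Counter


def bm25_weights(docs, k1=1.2, b=0.75):
    """Per-term BM25 weight for every document.

    docs: list of token lists. k1 and b are assumed defaults;
    the paper does not report the values it used.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # n(q): number of documents that contain each term
    doc_freq = Counter(term for d in docs for term in set(d))
    weights = []
    for d in docs:
        tf = Counter(d)
        scores = {}
        for term, f in tf.items():
            n_q = doc_freq[term]
            idf = math.log((n_docs - n_q + 0.5) / (n_q + 0.5))
            norm = f + k1 * (1 - b + b * len(d) / avg_len)
            scores[term] = idf * f * (1 + k1) / norm
        weights.append(scores)
    return weights
```
</preformat>
        <p>Note that with this IDF definition a term that occurs in more than half of the documents receives a negative weight, so such terms fall to the bottom of the ranking.</p>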
        <p>The terms in a single document are ranked by their TF-IDF or
BM25 value in decreasing order. A term with a high value, i.e., a high rank, is
considered to be more important in the document. Thus P<sub>k</sub>(X) can be calculated by
Algorithm 1.</p>
        <p>Algorithm 1. To calculate P<sub>k</sub>(X), using results from the keyword extraction
method</p>
        <preformat>for all documents in the collection do
  rank all terms according to TF-IDF or BM25 value in decreasing order
  for all terms X in the document do
    P<sub>k</sub>(X) = 100 - rank(X);
    {// rank(X) = 1 indicates the top position, 2 indicates the second position, etc.}
  end for
end for</preformat>
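        <p>The TF-IDF weighting and the rank-based score of Algorithm 1 can be sketched together as follows. This is an illustrative sketch under stated assumptions: documents are already tokenized into lists of terms, the function name keyword_scores is hypothetical, and ties in TF-IDF are broken alphabetically, a choice the paper does not specify.</p>
        <preformat>
```python
import math
from collections import Counter


def keyword_scores(docs):
    """Return, per document, P_k(X) = 100 - rank(X) under TF-IDF ranking."""
    n_docs = len(docs)
    # df_t: number of documents that contain each term
    doc_freq = Counter(term for d in docs for term in set(d))
    result = []
    for d in docs:
        tf = Counter(d)
        tfidf = {t: f * math.log(n_docs / doc_freq[t]) for t, f in tf.items()}
        # Decreasing TF-IDF; alphabetical tie-break is an assumption.
        ranked = sorted(tfidf, key=lambda t: (-tfidf[t], t))
        result.append({t: 100 - rank for rank, t in enumerate(ranked, start=1)})
    return result
```
</preformat>
        <p>For a document whose top-ranked term is "python", keyword_scores assigns that term the score 99, the second-ranked term 98, and so on.</p>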
        <p>As shown in Table 2, TF-IDF performed better than BM25 in the tag
recommendation process. All subsequent processing in this work is therefore based
on the results of the TF-IDF method.</p>
      </sec>
      <sec id="sec-3-3">
        <title>Using Association Rules</title>
        <p>
          Recent work by Heymann et al. [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] showed that using association rules can
help to find tag recommendations with high precision. They expanded their
recommendation pool in decreasing order of confidence. In this paper, we used
alternative approaches to analyze in more depth the association rules found in
history information. This helps to extract tags that are more likely to be used
by users.
        </p>
      </sec>
      <sec id="sec-3-4">
        <title>Finding association rules in history records</title>
        <p>We used three key factors in
association rules: support, confidence and interest. Every unique record
is treated as a basket, and the tags (X, Y, etc.) associated with the record are
treated as the items in the basket. For every rule X → Y, support is the number
of records that contain both X and Y. Confidence indicates the probability that
Y is associated with a record given that X is already associated with it, i.e., P(Y | X). Interest
is P(Y | X) - P(Y), showing how much more likely Y is to be associated
with a record when X is associated with it.</p>
        <p>The rules X → Y we constructed all have support &gt; 10, thus at least 10
resources in our training dataset contain both X and Y as tags. As mentioned
before, we did not separate the dataset into training and testing sets. During
evaluation, some records might therefore benefit from a rule to which they contributed, yet
at least 9 more resources also contributed to that rule. The support limit here
was chosen arbitrarily. Two sets of rules were constructed independently, one for
the bookmark dataset and another for the bibtex dataset. Some sample rules are
shown in Table 3.</p>
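        <p>The three factors can be computed from the posts in a few lines. The sketch below is illustrative only: the function name mine_rules and the input layout (one collection of tags per record) are assumptions, and it counts ordered tag pairs exhaustively, which may not be how the rules were mined in the experiments.</p>
        <preformat>
```python
from collections import Counter
from itertools import permutations


def mine_rules(posts, min_support=10):
    """posts: iterable of tag collections, one per record (the basket).

    Returns {(x, y): (support, confidence, interest)} for rules x -> y
    whose support exceeds min_support.
    """
    baskets = [set(tags) for tags in posts]
    n = len(baskets)
    tag_count = Counter(t for b in baskets for t in b)
    # Every ordered pair (x, y) of distinct tags in a basket supports x -> y.
    pair_count = Counter(p for b in baskets for p in permutations(sorted(b), 2))
    rules = {}
    for (x, y), support in pair_count.items():
        if support > min_support:
            confidence = support / tag_count[x]       # P(Y | X)
            interest = confidence - tag_count[y] / n  # P(Y | X) - P(Y)
            rules[(x, y)] = (support, confidence, interest)
    return rules
```
</preformat>
        <p>The default min_support = 10 mirrors the support &gt; 10 threshold stated above; a positive interest value means that Y co-occurs with X more often than its overall frequency would suggest.</p>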
      </sec>
      <sec id="sec-3-5">
        <title>Choosing appropriate recommendations by using association rules</title>
        <p>The problem here becomes:</p>
        <p>If X → Y exists in the association rules, how likely is it that term Y should
be recommended when X is likely to be recommended?</p>
        <p>Given P(X) and the confidence value P(Y | X), P(Y) can be calculated
according to the law of total probability, which is sometimes called the law of
alternatives:
          <disp-formula>
            <label>(5)</label>
            <tex-math>P(Y) = \sum_{n} P(Y \cap X_n)</tex-math>
          </disp-formula>
or
          <disp-formula>
            <label>(6)</label>
            <tex-math>P(Y) = \sum_{n} P(Y \mid X_n)\, P(X_n)</tex-math>
          </disp-formula>
since P(Y ∩ X<sub>n</sub>) = P(Y | X<sub>n</sub>) P(X<sub>n</sub>). According to the above equations, the
algorithm to calculate P<sub>a</sub>(Y), which is called Assoc(Y) in this paper, is shown
in Algorithm 2.</p>
        <p>Algorithm 2. To calculate P<sub>a</sub>(Y), using association rules</p>
        <preformat>for all documents in the collection do
  for all terms X in the document do
    for all association rules X → Y do
      P<sub>a</sub>(Y) += (confidence of X → Y) × P<sub>k</sub>(X);
      {// P<sub>k</sub>(X) is calculated by Algorithm 1}
    end for
  end for
end for</preformat>
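        <p>The accumulation performed by Algorithm 2 for a single document can be sketched as follows. The function name assoc_scores and the dictionary layouts are assumptions made for illustration; the rule table is taken to map an ordered tag pair to the confidence of that rule.</p>
        <preformat>
```python
from collections import defaultdict


def assoc_scores(p_k, rules):
    """p_k: {term: P_k(term)} for one document (output of Algorithm 1).
    rules: {(x, y): confidence of x -> y}.
    Returns {y: P_a(y)}, the confidence-weighted sum over all x in the document.
    """
    p_a = defaultdict(float)
    for (x, y), confidence in rules.items():
        # Only rules whose left-hand side occurs in the document contribute.
        if x in p_k:
            p_a[y] += confidence * p_k[x]
    return dict(p_a)
```
</preformat>
        <p>A term Y can receive a score even if it never occurs in the document, which is how tags outside the page content enter the candidate pool.</p>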
      </sec>
      <sec id="sec-3-6">
        <title>Combining Keyword Extraction with Association Rules Results</title>
        <p>After P<sub>k</sub>(X) and P<sub>a</sub>(X) are calculated for every term in the document, one
method, given in Algorithm 3, is to linearly combine the two values to calculate the
final probability P<sub>c</sub>(X) for recommending a term X.</p>
        <p>Similarly, a term with higher P<sub>c</sub>(X), i.e., a higher rank in the combined results, has
priority to be recommended.</p>
        <p>The experiments showed that the weight affects the F-Measure performance
and that the optimal weight is different for every collection. Figure 1
shows the effect of the weight in the bibtex_parsed collection, where F-Measure reaches
a peak as the weight increases from 0.1 to 0.9. The trend is similar in the other
two collections. Our experiments indicated that the optimal weights to achieve the
best F-Measure for bibtex_parsed, bibtex_original and bookmark_more are 0.7, 0.5 and
0.5, respectively. The evaluation results of this step with the optimal weight for every collection
are shown in the second column of Table 4. Compared to the TF-IDF
results in the first column, it is clear that the association rules greatly
improve the F-Measure performance.</p>
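        <p>The linear combination of Algorithm 3 can be sketched for one document as below. This is a sketch under assumptions rather than the exact procedure: the function name combine_scores is hypothetical, terms that appear in only one of the two inputs are scored 0 in the other, and ties are broken alphabetically.</p>
        <preformat>
```python
def combine_scores(tfidf, assoc, weight=0.5):
    """Linearly combine max-normalised TF-IDF and Assoc scores for one document.

    Returns {term: P_c(term)} with P_c = 100 - rank of the combined score.
    Terms missing from either input score 0 there (an assumption).
    """
    terms = set(tfidf) | set(assoc)
    tfidf_max = max(tfidf.values(), default=0) or 1.0  # avoid division by zero
    assoc_max = max(assoc.values(), default=0) or 1.0
    combined = {
        t: weight * tfidf.get(t, 0) / tfidf_max
           + (1 - weight) * assoc.get(t, 0) / assoc_max
        for t in terms
    }
    # Decreasing combined score; alphabetical tie-break is an assumption.
    ranked = sorted(combined, key=lambda t: (-combined[t], t))
    return {t: 100 - rank for rank, t in enumerate(ranked, start=1)}
```
</preformat>
        <p>A weight close to 1 trusts the keywords more, while a weight close to 0 trusts the association rules more, which is the trade-off explored in Figure 1.</p>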
        <p>Algorithm 3. To calculate P<sub>c</sub>(X), by linearly combining results from TF-IDF
&amp; association rules</p>
        <preformat>for all documents in the collection do
  TFIDFmax = maximum TFIDF(X) for all terms in this document
  Assocmax = maximum Assoc(X) for all terms in this document
  for all terms X in the document do
    TFIDF(X) = TFIDF(X) / TFIDFmax;
    {// normalize TFIDF(X) value}
    Assoc(X) = Assoc(X) / Assocmax;
    {// normalize Assoc(X) value, Assoc(X) = P<sub>a</sub>(X) in Algorithm 2.}
    Combined(X) = TFIDF(X) × weight + Assoc(X) × (1 - weight);
    {// linearly combine the two values}
  end for
  rank terms according to decreasing order of Combined(X);
  for all terms X in the document do
    P<sub>c</sub>(X) = 100 - rank of Combined(X);
    {// rank = 1 indicates the top position, 2 indicates the second position, etc.}
  end for
end for</preformat>
        <p>[Figure 1 (plot not recovered): x-axis weight; series precision and fmeasure.]</p>
        <p>Another method that we found to work well is common and combine. In the
common step, if a top-ranked term in the keyword extraction results has
Assoc(X) &gt; 0, that term is recommended. In the combine step, terms with
high P<sub>c</sub>(X) are extracted for recommendation. The total number of tags to recommend is
controlled by k, the number of tags to check in the common step is common-no,
and the number of tags to extract in the combine step is combine-no. Detailed
steps are shown in Algorithm 4.</p>
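        <p>The two steps described above can be sketched as follows. This is an illustrative reading of the procedure, not the code used in the experiments: the function name common_and_combine and the list/dictionary inputs are assumptions, and truncating the final list to k tags is our assumption for the case where the common step alone yields more than k tags.</p>
        <preformat>
```python
def common_and_combine(tfidf_ranked, assoc, pc_ranked, k=5, common_no=10, combine_no=5):
    """tfidf_ranked / pc_ranked: terms sorted by decreasing TF-IDF / P_c.
    assoc: {term: Assoc(term)}. Returns at most k recommended tags.
    """
    recommended = []
    # Common step: keep top TF-IDF terms that the association rules also support.
    for term in tfidf_ranked[:common_no]:
        if assoc.get(term, 0) > 0:
            recommended.append(term)
    # Combine step: inspect the top (k - count) combined-rank terms,
    # never looking past combine_no of them.
    budget = min(max(k - len(recommended), 0), combine_no)
    for term in pc_ranked[:budget]:
        if term not in recommended:
            recommended.append(term)
    return recommended[:k]  # truncation to k is an assumption
```
</preformat>
        <p>For example, with k = 3 and common_no = 2, a document whose two top TF-IDF terms include one term with a positive Assoc value contributes that term in the common step, and the combine step then inspects the two best terms by P<sub>c</sub>.</p>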
        <p>Since the evaluation of this contest only considers the first 5 recommended
tags, we set k = 5. With common-no = 10 and combine-no = 5, the results for all
three collections are shown in the third column of Table 4.</p>
        <p>Generally speaking, F-Measure increases with common-no and
reaches its peak near common-no = 20. Likewise, it rises to its highest
point as combine-no increases, and then remains at the same level as combine-no
increases further. Since the total number of tags to recommend is fixed at
5, the combine step stops before it reaches the limit on how many tags to
check, i.e., combine-no. Thus once combine-no exceeds a certain number, it
no longer affects the F-Measure performance.</p>
        <p>Since the recommendations are further modified by the history results, we
set k = 80, common-no = 10, and combine-no = 80 here.</p>
        <p>When recommending at most 5 tags, the F-Measure performance of all the
above methods, namely using TF-IDF only, linearly combining the results of
TF-IDF &amp; association rules, and common &amp; combine of the two, is shown in Figure
2. It is clear that using association rules greatly enhances the TF-IDF
performance, either by linear combination or by common &amp; combine. The common &amp;
combine method is slightly better than linearly combining the two.</p>
      </sec>
      <sec id="sec-3-9">
        <title>Checking Resource or User Match with History Records</title>
        <p>In this section, historical information is used more directly. We performed 10-fold
cross validation to report the performance in this section.</p>
        <p>Algorithm 4. Common and Combine, to integrate results from TF-IDF &amp;
association rules</p>
        <preformat>for all documents in the collection do
  count = 0;
  {// common step}
  for i = 1 to common-no do
    {// common-no is the parameter to tune. It controls how many terms to check in the TF-IDF results.}
    extract term X with TF-IDF rank = i;
    if Assoc(X) &gt; 0 then
      recommend this term X;
      count++;
      {// count is the number of tags that have been recommended in the common step.}
    end if
  end for
  {// combine step}
  rank all terms by P<sub>c</sub>(X) in decreasing order.
  {// P<sub>c</sub>(X) is the combination value of keyword extraction and association rules results, calculated by Algorithm 3.}
  for j = 1 to (k - count) do
    {// k is the total number of tags to recommend; k - count is the number of tags to recommend in the combine step}
    if j &gt; combine-no then
      {// combine-no is the parameter to tune. It controls how many terms to check in the combined results of TF-IDF &amp; association rules.}
      exit the combine step;
    end if
    extract the term Y with (rank in P<sub>c</sub>(X) results) = j;
    if Y is not in the recommendation list then
      recommend this term Y;
    end if
  end for
end for</preformat>
        <p>[Figure 2 (plot not recovered): bar groups for bibtex_original, bibtex_parsed and bookmark_more; series TFIDF, linearly combine of TFIDF &amp; association rules, and common &amp; combine of TFIDF and association rules.]</p>
        <p>Resource match. If the bookmark or bibtex in the testing dataset already
appeared in the training dataset, regardless of which user assigned the tags, the
tags that were assigned before are directly inserted into our
recommendation list for this document. These tags from historical information have higher
priority than the tags recommended in previous steps.</p>
        <p>User match. The tags previously assigned by a user in the
training dataset, regardless of the documents they were assigned to, make up that user's tagging
vocabulary. Our assumption here is that every user prefers to use tags from his or her
own tagging vocabulary, as long as the tags are relevant to the document. Thus
the tags in the user's tagging vocabulary are given higher priority. The
common and combine algorithm is applied again here. In the common step, if a
term ranked highly in previous steps appears in the user's tagging vocabulary,
that term is recommended. In the combine step, terms with high ranks in
previous steps are extracted for recommendation. The number of tags to check in the common step is
common-no, and the number of tags to extract in the combine step is combine-no.</p>
        <p>The two parameters, common-no and combine-no, are tuned to achieve the
best F-Measure performance when recommending at most 5 tags. In Figure 3, common-no is
fixed at 53 while combine-no increases from 1 to 5. That figure
shows that F-Measure reaches its peak at combine-no =
1. In Figure 4, combine-no is fixed at 1 and common-no increases from 1 to
80. F-Measure increases with the initial increase of common-no and reaches its
peak in the middle of the range. In this work, we set common-no = 53 and combine-no
= 1.</p>
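        <p>The user match step can be sketched in the same style. The sketch below is one possible reading of the description above, not the implementation used in the experiments: the function name user_match is hypothetical, ranked_terms stands for the ordered output of the previous steps, and the final truncation to k tags is an assumption. The defaults follow the parameter values chosen in this work.</p>
        <preformat>
```python
def user_match(ranked_terms, user_vocab, k=5, common_no=53, combine_no=1):
    """ranked_terms: candidate tags from previous steps, best first.
    user_vocab: set of tags the user assigned in the training data.
    Returns at most k tags, favouring the user's own vocabulary.
    """
    recommended = []
    # Common step: prefer highly ranked terms the user has used before.
    for term in ranked_terms[:common_no]:
        if term in user_vocab:
            recommended.append(term)
    # Combine step: add the best-ranked terms regardless of vocabulary.
    for term in ranked_terms[:combine_no]:
        if term not in recommended:
            recommended.append(term)
    return recommended[:k]  # truncation to k is an assumption
```
</preformat>
        <p>With combine-no = 1, only the single best-ranked term from the previous steps is guaranteed a place alongside the vocabulary matches.</p>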
        <p>[Figure 3 (plot not recovered): x-axis combine-no, 1 to 5; series recall, precision and fmeasure.]</p>
        <p>Fig. 4. In the common and combine method for checking user match, performance for
different common-no and fixed combine-no = 1. At most 5 tags are recommended.</p>
      </sec>
      <sec id="sec-3-10">
        <title>Exact match with same user and same resource</title>
        <p>In this step, if the user has
tagged the same document in the training dataset, then the tags he or she used before
for this document are directly recommended again.</p>
        <p>For example, if a record exists in both bibtex_parsed and bibtex_original, the
results for this record are taken from bibtex_parsed instead of bibtex_original,
since the former has higher priority.</p>
        <p>If we only combine the common &amp; combine results of all three
collections, the best performance is shown in the column "without checking the history
records" of Table 6.</p>
        <p>If tags from records that match the same user have lower priority than
tags from records that match the same resource, the best result is shown in the
column "resource match higher" of Table 6. Otherwise, the best result is shown in the
column "user match higher" of Table 6. The results indicate that even for
bookmarks that were tagged by other users before, it is still beneficial to consider
the target user's own tagging vocabulary.</p>
        <p>To sum up, the best performance on the training dataset is shown in Table 7,
including separate detailed results for bookmark and bibtex.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Conclusions and Future Work</title>
      <p>In this paper, we proposed a tag recommendation system using keywords in the
page content and association rules from history records. If the resource
or the target user has appeared before, the history tags are used more directly as
candidates to recommend. Our experiments showed that association rules
greatly improve performance over keyword extraction alone,
while history information further enhances the F-Measure performance of
our recommendation system.</p>
      <p>In the future, other keyword extraction methods can be implemented and
compared with TF-IDF. In addition, graph-based methods could be
combined with our recommendation approach to generate more appropriate tag
recommendations.</p>
      <sec id="sec-4-1">
        <title>Acknowledgments</title>
        <p>This work was supported in part by a grant from the National Science
Foundation under award IIS-0545875.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>P.</given-names>
            <surname>Heymann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ramage</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H.</given-names>
            <surname>Garcia-Molina</surname>
          </string-name>
          .
          <article-title>Social tag prediction</article-title>
          .
          <source>In SIGIR '08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval</source>
          , pages
          <fpage>531</fpage>
          -
          <lpage>538</lpage>
          , New York, NY, USA,
          <year>2008</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>I.</given-names>
            <surname>Katakis</surname>
          </string-name>
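          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Tsoumakas</surname>
          </string-name>
          , and
          <string-name>
            <given-names>I.</given-names>
            <surname>Vlahavas</surname>
          </string-name>
          .
          <article-title>Multilabel text classification for automated tag suggestion</article-title>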
          .
          <source>In Proceedings of the ECML/PKDD 2008 Discovery Challenge Workshop, part of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases</source>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>M.</given-names>
            <surname>Lipczak</surname>
          </string-name>
          .
          <article-title>Tag recommendation for folksonomies oriented towards individual users</article-title>
          .
          <source>In Proceedings of the ECML/PKDD 2008 Discovery Challenge Workshop, part of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases</source>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>S. E.</given-names>
            <surname>Robertson</surname>
          </string-name>
          .
          <article-title>Overview of the OKAPI projects</article-title>
          .
          <source>Journal of Documentation</source>
          ,
          <volume>53</volume>
          (
          <issue>1</issue>
          ):
          <fpage>3</fpage>
          -
          <lpage>7</lpage>
          ,
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>M.</given-names>
            <surname>Tatu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Srikanth</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>D'Silva</surname>
          </string-name>
          .
          <article-title>RSDC'08: Tag recommendations using bookmark content</article-title>
          .
          <source>In Proceedings of the ECML/PKDD 2008 Discovery Challenge Workshop, part of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases</source>
          , pages
          <fpage>96</fpage>
          -
          <lpage>107</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>