<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Probabilistic Ranking Approach for Tag Recommendation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Zhen Liao</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maoqiang Xie</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hao Cao</string-name>
          <email>caohaog@mail.nankai.edu.cn</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yalou Huang</string-name>
          <email>huangylg@nankai.edu.cn</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>College of Information Technology Science, Nankai University</institution>
          ,
          <addr-line>Tianjin</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>College of Software, Nankai University</institution>
          ,
          <addr-line>Tianjin</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Social tagging is a typical Web 2.0 application that lets users share knowledge and organize massive web resources. Choosing appropriate words as tags can be time consuming for users, so a tag recommendation system is needed to accelerate this procedure. In this paper we formulate tag recommendation as a probabilistic ranking process; in particular, we propose a hybrid probabilistic approach that combines a language model and a statistical machine translation model. Experimental results validate the effectiveness of our method.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Folksonomy is a way to categorize Web resources by harnessing the "wisdom" of
web users; nowadays it exists in many web applications such as Delicious3,
Flickr4, and Bibsonomy5. A user can create and share her knowledge while
tagging resources that interest her. Web resources come in many forms: a
resource could be a Web page, a published paper, or a book. Tagging a resource
with appropriate words is not easy and can cost a lot of time, so a tag
recommendation system is needed to ease this time-consuming step. Typically a
recommendation system suggests 5 or 10 tags to the user for a given resource.
The suggested tags help the user think of eligible words and realize which
aspects of the resource interest others. To address these problems, ECML PKDD
held the second tag recommendation discovery challenge6. This paper presents a
probabilistic ranking approach submitted to the challenge.</p>
      <p>Given a resource, users choose tags according to different aspects of the
resource and their specific interests. Picking a tag from the entire tag set and
assigning it to the resource can be formulated as the following process: given a
resource and a user, rank the tags by their relevance to the resource and user.
Here relevance denotes how likely the user would be to label the resource with
this tag.</p>
      <sec id="sec-1-1">
        <title>3 http://del.icio.us</title>
      </sec>
      <sec id="sec-1-2">
        <title>4 http://www.flickr.com/</title>
      </sec>
      <sec id="sec-1-3">
        <title>5 http://www.bibsonomy.org/</title>
      </sec>
      <sec id="sec-1-4">
        <title>6 http://www.kde.cs.uni-kassel.de/ws/dc09</title>
        <p>We suppose a tag recommendation system works best when the recommended
tags are sorted by relevance before being suggested to the user.</p>
        <p>In this paper, the dataset provided by Bibsonomy is a set of posts. Each post
denotes a triple {user, resource, set of tags}. A resource type can be bookmark
or bibtex, where a bookmark is a Web page and a bibtex is a publication. Both
bookmark and bibtex resources contain many fields: URL, description, etc. The
textual information in the fields can be merged into a pseudo document.</p>
        <p>A natural way of choosing tags is to select words from the pseudo document
of the given resource; a TF-like maximum likelihood method can reach this goal.
The important problem is that the maximum likelihood model cannot generate
tags that are meaningful but do not appear in the document. To incorporate
previously popular tags and tags preferred by a user, a tag recommendation model
can be formulated as a language model smoothed via the Jelinek-Mercer method, as
described in Section 3.2. However, the language modeling approach cannot
learn the word-tag relatedness that reflects how other users choose tags for the
words in the document. Since the textual information in a post can
be considered a parallel corpus - {words in document, tags} - we propose to use
the statistical machine translation approach to learn the translation probability
from words to tags.</p>
        <p>Finally, we propose a candidate set based tag recommendation algorithm
that generates candidate tags from the textual fields of a resource using the
maximum likelihood and statistical machine translation models. The effectiveness
of our approach is validated on the bookmark and bibtex tagging test datasets
provided by Bibsonomy. When the textual content of a bookmark resource is
inadequate, we utilize the tags used within the same domain to extend the
candidate set. We also found that simple co-occurrence based translation
probability estimation performs as well as IBM Model 1 [6], which uses the EM
algorithm to learn the translation probability. An advantage of the co-occurrence
based approach is its convenience in handling new training data, since training
the model amounts to counting the co-occurrences of words and tags. The EM-based
approach, in contrast, must re-train the translation model through iterations,
which can be time consuming for a large-scale dataset.</p>
        <p>The rest of this paper is organized as follows. In Section 2 related work
is surveyed. In Section 3 our content based tag recommendation models are
presented, and the recommendation algorithm is described in Section 4. In Section
5 we describe the data format and preprocessing steps, and experimental results
are reported in Section 6. Finally, in Section 7 we conclude this paper and point
out possible future research issues.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>Most existing tag recommendation approaches are based on the textual
information of the resource and the previous interests of users. So far,
information retrieval, data mining, and natural language processing techniques
have been used to solve the tag recommendation problem.</p>
      <p>Heymann et al. [1] use one of the largest crawls of the social bookmarking
system Delicious and present studies of the factors that can impact the
performance of tag prediction. The predictability of tags is measured by
methods such as entropy based metrics, and tag-based association rules are
proposed to assist tag prediction. Learning the word-tag relatedness via
association rules requires tuning the confidence and support to find meaningful
rules, whereas we transfer it into a translation probability that reaches a
converged solution without tuning.</p>
      <p>Tatu et al. [2] use document and user models derived from the textual content
associated with URLs and publications by social bookmarking tool users.
Natural language processing techniques are used to extract concepts (part of
speech, etc.) from the textual information, and WordNet7 is used to stem the
concepts and link synonyms. The difference between our work and theirs is that
they expand concepts via WordNet, but do not have word-to-tag translation
probabilities such as from `eclipse' to `java'.</p>
      <p>Lipczak [3] focuses on folksonomies oriented towards individual users, and
proposes a three-step tag recommendation system that conducts Personomy based
filtering using the user's previously used tags after the extraction and retrieval
of tags. The recommendation approach in [3] is similar to ours, but the
scores of candidate tags are computed differently. They multiply the
different factors, whereas we use a weighted sum in which the weights can
be set to prefer different components. Besides, we use the statistical machine
translation approach to learn the word-tag relatedness, which differs from the
model proposed in [3].</p>
      <p>The language modeling approach [4] has been applied in information retrieval
with many smoothing strategies [5]. Statistical machine translation
approaches [6] have shown their theoretical soundness and effectiveness in
translation, and Berger et al. [7] and Xue et al. [8] incorporate statistical
translation approaches into the information retrieval and automatic question
answering fields. This theoretical soundness and effectiveness make it sensible
to adopt language modeling and statistical machine translation for tag
recommendation. The statistical machine translation approach also naturally
solves the problem of learning the word-tag relatedness, sharing common tagging
knowledge among users.</p>
    </sec>
    <sec id="sec-3">
      <title>Content Based Tag Recommendation Models</title>
      <sec id="sec-3-1">
        <title>Problem Definition</title>
        <p>In this paper, a tag set is denoted as t = {t1, ..., tQ}, where ti is a single
word or term and Q is the number of tags in t.</p>
        <p>The tag recommendation task is to suggest a tag set t for a user Uk,
given a bookmark/publication resource Rj, which might be a web page, a book, a
paper, etc. The resource Rj contains several fields such as URL, title, and
description, and we denote the resource content as a pseudo document Dj.</p>
        <sec id="sec-3-1-1">
          <title>7 http://wordnet.princeton.edu</title>
          <p>Suppose the recommendation system is required to suggest N tags; the task
is to find the N tags {t1, ..., tN} from the entire tag set with the largest
probability p(ti|Uk, Dj).</p>
          <p>To solve the task, a training set S = {S1, ..., SK} is given, where each Si
is a triple {ti, Ui, Di}. Here ti is a tag set, Ui ∈ U = {U1, ..., UM} is a user,
and Di ∈ D = {D1, ..., DN} is a resource. We then learn a tag recommendation
model M from S.</p>
          <p>At the testing stage, a testing set T = {T1, ..., TP} with Tj = {Uj, Dj} is
given. The model M is asked to suggest a tag set tj for each Tj. A ground-truth
tag set G = {g1, ..., gP} is then used to judge the recommendations
{t1, ..., tP}, and the performance is obtained via evaluation measures such as
Precision, Recall, and F-measure.</p>
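          <p>The per-post judgment just described can be sketched in a few lines; this is an illustrative implementation, not the official challenge scorer:</p>

```python
def evaluate(recommended, ground_truth):
    """Precision, recall and F1 for one post's recommended tag list."""
    rec, truth = set(recommended), set(ground_truth)
    hits = len(rec & truth)
    precision = hits / len(rec) if rec else 0.0
    recall = hits / len(truth) if truth else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall > 0 else 0.0)
    return precision, recall, f1
```

          <p>For example, recommending {java, web} against the ground truth {java, programming} yields precision 0.5, recall 0.5, and F1 0.5.</p>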
          <p>A specific user Uk has her own preference in choosing a word ti as a tag,
and if we have this user's information in the training set S, we can formulate
this preference as P(ti|Uk) = c(ti, Uk) / |Uk|, where c(ti, Uk) is the frequency
with which ti is used by user Uk, and |Uk| is the total frequency of all tags
used by Uk.</p>
          <p>We define the tag generating probability of a tag ti for a given user and
document tuple {Uk, Dj} as:</p>
          <p>P(ti|Dj, Uk) = (1 - λ) P(ti|Dj) + λ P(ti|Uk)    (1)</p>
          <p>where λ is a trade-off parameter between the resource content and the
user.</p>
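          <p>A minimal sketch of this mixture, assuming the user model P(ti|Uk) is estimated by relative frequency as above; the function names and the trade-off parameter name `lam` are ours, for illustration only:</p>

```python
from collections import Counter

def user_model(tag_history):
    """P(t|Uk) = c(t, Uk) / |Uk|: relative frequency of each tag the user used."""
    counts = Counter(tag_history)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}

def tag_probability(tag, p_doc, p_user, lam=0.5):
    """Eq. (1): mix the document model and the user model with trade-off lam."""
    return (1 - lam) * p_doc.get(tag, 0.0) + lam * p_user.get(tag, 0.0)
```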
          <p>In the following we introduce the language model and statistical machine
translation approaches for estimating P(ti|Dj), and then combine them into
our final model.</p>
          <p>A natural and simple way to estimate P(ti|Dj) is the maximum likelihood
approach:</p>
          <p>Pml(ti|Dj) = c(ti, Dj) / |Dj|    (2)</p>
          <p>where c(ti, Dj) is the number of occurrences of ti in Dj, and |Dj| is the
document length of Dj. The shortcoming of the maximum likelihood estimate is
that it cannot generate a tag that does not appear in Dj, so we introduce a
language model smoothed via the Jelinek-Mercer method [5]:</p>
          <p>Plm(ti|Dj) = (1 - β) Pml(ti|Dj) + β Pml(ti|C)    (3)</p>
          <p>where β is the smoothing parameter and C is the entire corpus. The
smoothing term P(ti|C) can be interpreted as the probability that the word ti
is used as a tag; we define P(ti|C) as c(ti)/#tags, where #tags is the total
number of tags in the training set S. The language modeling approach (3) can
thus be seen as incorporating both the words in the document and the previously
popular tags of all users.</p>
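          <p>The smoothed language model of Equation (3) can be sketched as follows; the smoothing-weight name `beta` and the function names are illustrative choices of ours:</p>

```python
from collections import Counter

def smoothed_lm(document, corpus_tag_counts, beta=0.5):
    """Eq. (3): Jelinek-Mercer smoothing of the document maximum likelihood
    model with the corpus-wide tag probability P(t|C)."""
    doc_counts = Counter(document)
    doc_len = len(document)
    n_tags = sum(corpus_tag_counts.values())

    def p(tag):
        p_ml = doc_counts[tag] / doc_len if doc_len else 0.0
        p_c = corpus_tag_counts.get(tag, 0) / n_tags
        return (1 - beta) * p_ml + beta * p_c

    return p
```

          <p>Note that a tag absent from the document still gets a nonzero score from the corpus term, which is exactly what the smoothing is for.</p>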
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>Statistical Machine Translation Approach</title>
        <p>The language modeling approach, however, does not consider the word-tag
relatedness, which is important for tag recommendation. To address this, we
further introduce the Statistical Machine Translation (SMT) approach [6] [7] [8]
for estimating the probability P(ti|Dj):</p>
        <p>Psmt(ti|Dj) = (|Dj| / (|Dj| + 1)) Ptr(ti|Dj) + (1 / (|Dj| + 1)) P(ti|null)    (4)</p>
        <p>where P(ti|null) can be regarded as the background smoothing model
P(ti|C); a more detailed comparison of the two can be found in [8]. Ptr(ti|Dj)
is the translation probability from Dj to ti:</p>
        <p>Ptr(ti|Dj) = Σ_{w ∈ Dj} Ptr(ti|w) Pml(w|Dj)    (5)</p>
        <p>To learn the word-to-tag translation probability Ptr(ti|w), the EM
algorithm can be used; the details of the EM algorithm for learning the word-tag
relatedness P(ti|w) in the Statistical Machine Translation (SMT) model are
described in [6]. In the training set S = {S1, ..., SK}, the parallel corpus of
tags and documents Sj = {tj, Dj} is utilized, and the EM steps for learning
P(ti|w) can be formulated as:</p>
        <p>E-Step:</p>
        <p>P1tr(ti|w) = (1/w1) Σ_{j=1..K} c(ti, w; tj, Dj)    (6)</p>
        <p>M-Step:</p>
        <p>c(ti, w; tj, Dj) = [P(ti|w) / (P(ti|w1) + ... + P(ti|wo))] · #(ti, tj) · #(w, Dj)    (7)</p>
        <p>In Equation (6), w1 = Σ_ti Σ_{j=1..K} c(ti, w; tj, Dj) is the
normalization factor. In Equation (7), {w1, ..., wo} are the words contained in
Dj, and #(ti, tj) and #(w, Dj) are the number of occurrences of ti in tj and of
w in Dj, respectively. The convergence of this EM algorithm is proved in [6].</p>
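        <p>A compact sketch of this EM procedure, in the spirit of IBM Model 1 [6]; this is a simplified illustration with names of our choosing, not the code used in the experiments:</p>

```python
from collections import defaultdict

def train_ibm1(parallel_corpus, iterations=10):
    """EM training of word-to-tag translation probabilities P(t|w) on a
    parallel corpus of (tags, document words) pairs."""
    # uniform initialisation over all tags seen in the corpus
    tags = {t for ts, _ in parallel_corpus for t in ts}
    p = defaultdict(lambda: 1.0 / len(tags))
    for _ in range(iterations):
        count = defaultdict(float)   # expected counts c(t, w)
        total = defaultdict(float)   # per-word normaliser
        for ts, words in parallel_corpus:
            for t in ts:
                z = sum(p[(t, w)] for w in words)  # P(t|w1) + ... + P(t|wo)
                for w in words:
                    c = p[(t, w)] / z
                    count[(t, w)] += c
                    total[w] += c
        # normalise expected counts into new probabilities
        p = defaultdict(float, {(t, w): count[(t, w)] / total[w]
                                for (t, w) in count})
    return p
```

        <p>On a toy corpus where `java' always co-occurs with `eclipse' but `code' also appears with other tags, the iterations concentrate P(java|eclipse) toward 1, mirroring the convergence behaviour noted above.</p>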
        <p>In this paper, we also find that a co-occurrence based translation
probability is helpful for tag recommendation; we denote it as:</p>
        <p>P2tr(ti|w) = [Σ_{j=1..K} #(ti, tj) · #(w, Dj)] / [Σ_{j=1..K} #(w, tj, Dj)]    (8)</p>
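        <p>The co-occurrence estimate can be computed by simple counting. The sketch below normalises per word so that the probabilities over tags sum to one, which is one plausible reading of the normaliser in Equation (8); names are illustrative:</p>

```python
from collections import defaultdict

def cooccurrence_translation(parallel_corpus):
    """Co-occurrence estimate of P(t|w): count how often tag t and word w
    appear together in the same post, then normalise per word w."""
    num = defaultdict(float)
    den = defaultdict(float)
    for tags, words in parallel_corpus:
        for w in set(words):
            n_w = words.count(w)          # #(w, Dj)
            for t in set(tags):
                num[(t, w)] += tags.count(t) * n_w   # #(t, tj) * #(w, Dj)
                den[w] += tags.count(t) * n_w
    return {tw: c / den[tw[1]] for tw, c in num.items()}
```

        <p>Training is a single counting pass, which is why this model handles new training data so conveniently compared with the iterative EM training.</p>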
        <p>where #(ti, tj) denotes the number of times tag ti appears in tj, and
similarly for #(w, Dj). This model can be regarded as a simple approximation of
the EM based translation model, and it is also effective. Note that the EM based
translation probability is denoted P1tr(ti|w) and the co-occurrence based
translation probability P2tr(ti|w) hereafter.</p>
        <p>Now we combine the above methods to obtain our final model:</p>
        <p>Pfinal(ti|Dj, Uk) = α P(ti|C) + β P(ti|Uk) + γ Pml(ti|Dj) + δ Σ_w Ptr(ti|w) Pml(w|Dj)    (9)</p>
        <p>where α + β + γ + δ = 1 and Ptr can be P1tr or P2tr. Tuning these four
parameters is not easy, so we split both the Cleaned Dump and the Post Core
datasets into a training set and a validation set, train the model on the
training set, and set the parameters empirically several times, choosing the
setting with better performance on the validation set. We omit the details due
to space restrictions; in the experiments we found the performance is relatively
good with α = 0.15, β = 0.1, γ = 0.05, δ = 0.7. We use these parameters with the
Cleaned Dump dataset as our final training set for the challenge.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Candidate Set based Tag Recommendation Algorithm</title>
      <p>Since the task of tag recommendation is to suggest tags for a given
document and user, it differs from Information Retrieval [7] or Question
Answering [8], where a query/question is given and the relevant
documents/answers must be found.</p>
      <p>Given a document Dj and user Uk, we first build a recommendation tag
candidate set CS from the words in Dj, and we also add the top L related words
by Ptr(t|w) for every word w in Dj. Then we compute P(ti|Dj, Uk) for each
tag ti ∈ CS. Finally we sort the tags in descending order of P(ti|Dj, Uk) and
return the top N tags, as required by the application system. L is set to
20 and N to 5 in the experiments. The algorithm is summarized in Table 1.</p>
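      <p>The candidate-set algorithm of Table 1 can be sketched as follows; here `score` stands for any estimate of P(ti|Dj, Uk), `translations` maps each word to its tag translation probabilities, and all names are illustrative:</p>

```python
def recommend(doc_words, score, translations, L=20, N=5):
    """Candidate-set recommendation sketch: gather candidate tags from the
    document words plus the top-L translations of each word, score every
    candidate, and return the N best."""
    candidates = set(doc_words)
    for w in doc_words:
        # top-L tags t ranked by translation probability Ptr(t|w)
        top = sorted(translations.get(w, {}).items(),
                     key=lambda kv: kv[1], reverse=True)[:L]
        candidates.update(t for t, _ in top)
    # rank candidates by the combined score P(t|Dj, Uk), descending
    return sorted(candidates, key=score, reverse=True)[:N]
```

      <p>The translation step is what lets the system suggest a tag like `java' for a document that only contains the word `eclipse'.</p>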
    </sec>
    <sec id="sec-5">
      <title>Data Preparing and Preprocessing</title>
      <p>The dataset we used was downloaded from the ECML PKDD Discovery Challenge
20098 site and is provided by BibSonomy9. There are two datasets: Cleaned Dump
and Post Core. The Cleaned Dump contains all public bookmark and publication
posts of BibSonomy until (but not including) 2009-01-01. The Post Core is a
subset of the Cleaned Dump that removes all users, tags, and resources
appearing in only one post. Brief statistics of Cleaned Dump and Post Core can
be found in Table 2. One tag assignment means one user chose one tag for one
resource, so one post can have several tag assignments. The numbers of posts are
shown for bookmark, bibtex, and the entire set: the bookmark and bibtex counts
are separated by `/', and the entire set is given after `:'.</p>
      <sec id="sec-5-1">
        <title>8 http://www.kde.cs.uni-kassel.de/ws/dc09</title>
      </sec>
      <sec id="sec-5-2">
        <title>9 http://www.bibsonomy.org/</title>
        <p>There are three tables in the dataset: tas, bookmark, and bibtex. The
fields of these tables are listed in Table 3. For a bookmark resource the field
`content type' is 1, and for a bibtex resource it is 2. The fields in bold are
used to generate the pseudo document Dj and the tags tj in the training
process.</p>
        <p>We first remove the stop words in the bookmark and bibtex tables, since
they are seldom used as tags and are usually meaningless. The stop word list was
downloaded from Lextek10 (http://www.lextek.com/manuals/onix/stopwords1.html).
Note that we do not remove stop words in the tas file. The top 5 stop words
occurring in Post Core and their frequencies can be found in Table 4. There are
in total 19,647 and 2,513 stop word tag assignments in Cleaned Dump and Post
Core, corresponding to 1.39% and 0.99% respectively. In contrast, the total
frequencies of stop words in the pseudo documents of Cleaned Dump and Post Core
are over 588,907 and 61,113, which suggests that stop words should not be
considered as tags in most cases.</p>
        <p>Table 4. Top stop words and their frequencies in tags.
Cleaned Dump: all:3105 of:1414 and:1227 best:1124 three:1081 c:806
Post Core: all:655 open:211 c:165 best:152 work:77</p>
        <p>In Table 5 we list the top 10 tags in Cleaned Dump and Post Core. We will
see later that the co-occurrence based translation model is likely to generate
words that appear more often.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Experimental Result</title>
      <sec id="sec-6-1">
        <title>Tagging Performance</title>
        <p>The evaluation measures in the following experiments are the widely used
Precision, Recall, and F1-measure. The testing datasets were released by the
ECML PKDD challenge in two tasks: task 1 is content based tag recommendation,
and task 2 is graph based tag recommendation11. In task 1 the user and resource
of a post might not have been seen before, so the content information of the
resource is critical for tag recommendation. In task 2 the user, resource, and
tags of each post in the test data are all contained in the Post Core dataset,
so it is intended for methods relying only on the graph structure of the
training data.</p>
        <p>We use the whole Cleaned Dump dataset as the training set to train the
model and test the performance of our model on both tasks. For the parameters,
we set α = 0.15, γ = 0.05, β = 0.1, δ = 0.7 as mentioned in Section 3.4. The
results are shown in Figure 1. Here final_em denotes the final model with P1tr
(EM based), and final_co denotes the final model with P2tr (co-occurrence
based). The x-axis is the top position and the y-axis is the F-measure.
11 http://www.kde.cs.uni-kassel.de/ws/dc09</p>
        <p>The results indicate that although P2tr (co-occurrence) is simpler, it
is comparable to P1tr. In our previous experiments we also found that the
textual information from a bookmark resource is sometimes not adequate to
generate some tags in the post and needs to be expanded. Instead of using an
extrinsic resource such as WordNet, we aggregate the tags in the same web site
domain for a bookmark resource and use them to expand the recommendations.
The reason we do not expand the terms of a bibtex resource is that bibtex
resources are publications, and their web sites provide less information about
tags. Trying other tag expansion methods is left as future work. We formulate
this expansion as P(ti|Site), and the recommendation model for bookmarks
becomes:</p>
        <p>Pfinal_ex(ti|Dj, Uk) = α P(ti|C) + β P(ti|Uk) + γ Pml(ti|Dj) + δ Σ_w Ptr(ti|w) Pml(w|Dj) + η P(ti|Site)    (10)</p>
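        <p>The site model P(ti|Site) can be estimated by counting tags per URL domain. A sketch, under the assumption that each bookmark post carries its URL (names are ours):</p>

```python
from collections import Counter
from urllib.parse import urlparse

def site_tag_model(posts):
    """P(t|Site): relative frequency of tags previously used on bookmarks
    from the same web-site domain. posts is a list of (url, tags) pairs."""
    by_domain = {}
    for url, tags in posts:
        domain = urlparse(url).netloc
        by_domain.setdefault(domain, Counter()).update(tags)
    # normalise each domain's counts into a probability distribution
    return {d: {t: c / sum(cnt.values()) for t, c in cnt.items()}
            for d, cnt in by_domain.items()}
```

        <p>For instance, two posts under www.apple.com tagged {apple, mac} and {apple} give P(apple|Site) = 2/3, in the spirit of the domain samples in Table 6.</p>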
        <p>To illustrate the expansions for different domains, we sample some domains
and their top used tags with probabilities in Table 6.
Domain: tags and their previously used probability
www.apple.com: apple:0.17 mac:0.13 software:0.09 osx:0.07 bookmarks:0.07
answers.yahoo.com: knowledge:0.14 yahoo:0.14 web20:0.07 all:0.07 answer:0.07
ant.apache.org: java:0.19 ant:0.17 programming:0.07 apache:0.07 tool:0.07
picasa.google.com: google:0.21 image:0.14 download:0.14 linux:0.14 picasa:0.14
research.microsoft.com: microsoft:0.10 research:0.09 people:0.04 social:0.04 award:0.03
www.research.ibm.com: ibm:0.11 datamining:0.07 software:0.04 machinelearning:0.04 journal:0.04</p>
        <p>After the tag expansion via the URL domain, the candidate set CS for
the recommendation contains the top used tags in the same domain as Dj. The
performance of (10) with the expansions on the testing set is shown in Tables
7 and 8. The performance is shown for bookmark only, bibtex only, and the
entire set: the bookmark and bibtex results are separated by `/', and the entire
set is given after `:'. We chose the co-occurrence based model P2tr for the
competition, though the performance in terms of F-measure at 5 is also good with
the EM based model P1tr. The F-measures of the EM based model with the same
parameters as in Table 7 for task 1 and task 2 are shown in Table 9. We find
that P2tr and P1tr are comparable once again: on F-measure at 1 the
co-occurrence based model is better, but on F-measure at 5 the EM based model
is better.</p>
        <p>Next we conduct experiments on each component of our final model (9):
the document maximum likelihood method, the language model (`LM + User
Model'), the EM based translation model P1tr(ti|w), and the co-occurrence based
translation model P2tr(ti|w) are chosen. In the `LM + User Model' we set the
parameters α = 0.5, γ = 0.3, β = 0.2, δ = 0. This can be considered a language
model that incorporates the maximum likelihood, the previous tag probability in
the whole corpus, and the user's preference model. The performance on both
testing datasets of task 1 and task 2 is illustrated in Figure 2. The x-axis is
the top position from top 1 to top 5 and the y-axis is the F-measure value. We
list only the F1 measure because it reflects both precision and recall.</p>
        <p>From the experimental results we can see that the translation based models
are better than the maximum likelihood method and `LM + User Model' in task 2.
The co-occurrence based model is worst in task 1, and the EM based model is
better than the co-occurrence based model on both tasks. We analyzed the results
of the co-occurrence based model on task 1 and found many recommendations are
commonly used tags, because the co-occurrence based model prefers to generate
tags that occurred more often before. This suggests that if the resource/user
has been seen before, the co-occurrence based model performs well; if not, it is
better to choose the EM based model. The `LM + User Model' performs best on
task 1, but its performance is still lower than that in Table 7, and it performs
worse than the translation models on task 2.</p>
        <p>To compare the EM based and co-occurrence based models, we pick out
several words w with their top translated words ti in both P1tr(ti|w) (EM based)
and P2tr(ti|w) (co-occurrence based). The sampled words can be found in
Table 10. We find that in the EM based translation model, a word is most likely
to translate into itself. This indicates that the EM based translation model can
be considered a combination of the maximum likelihood model, which only
generates the word itself, and the co-occurrence based translation model, which
has a higher probability of generating other words as tags. The co-occurrence
model is likely to generate popular tags in the corpus, such as `tools',
`software', and `social'.</p>
        <p>In this paper we proposed a probabilistic ranking approach for tag
recommendation. The textual information from the resources and the parallel
textual corpus from previous posts are used to learn the language and
statistical translation models. Our hybrid probabilistic approach incorporates
both the content based textual model and the graph structure existing in posts
for sharing common tagging knowledge among users.</p>
        <p>As future work, we intend to study how to choose the parameters via
machine learning approaches so as to avoid heuristic settings. Furthermore,
enriching the extra information of the resources, for example using the
citations (references) of a publication to augment the information of a
publication resource, using other tag expansion techniques, and conducting
natural language understanding of tag concepts, as well as studying evaluation
measures for tag recommendation, are all possible future research
directions.</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgement</title>
      <p>This work is supported by the National Natural Science Foundation of China
under grant 60673009 and China National Hanban under grant 2007-433. The
authors thank Chin-Yew Lin of Microsoft Research Asia for his valuable comments
on this paper. Thanks also to Jie Liu, Yang Wang, and Min Lu for their helpful
discussions and suggestions.</p>
      <p>3. Lipczak, M. Tag Recommendation for Folksonomies Oriented towards
Individual Users. In Proceedings of the ECML PKDD Discovery Challenge
(RSDC 2008), pages 84-95.</p>
      <p>4. Ponte, J. M. and Croft, W. B. A Language Modeling Approach to
Information Retrieval. In Proceedings of the 21st Annual International ACM
SIGIR Conference on Research and Development in Information Retrieval
(SIGIR 1998), pages 275-281.</p>
      <p>5. Zhai, C. and Lafferty, J. A Study of Smoothing Methods for Language
Models Applied to Information Retrieval. ACM Transactions on Information
Systems, 2004, pages 179-214.</p>
      <p>6. Brown, P. F., Della Pietra, V. J., Della Pietra, S. A. and Mercer, R. L.
The Mathematics of Statistical Machine Translation: Parameter Estimation.
Computational Linguistics, 1993, pages 263-311.</p>
      <p>7. Berger, A. and Lafferty, J. Information Retrieval as Statistical
Translation. In Proceedings of the 22nd Annual International ACM SIGIR
Conference on Research and Development in Information Retrieval (SIGIR 1999),
pages 222-229.</p>
      <p>8. Xue, X., Jeon, J. and Croft, W. B. Retrieval Models for Question and
Answer Archives. In Proceedings of the 31st Annual International ACM SIGIR
Conference on Research and Development in Information Retrieval (SIGIR 2008),
pages 475-482.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Heymann</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Ramage</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Garcia-Molina</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          <article-title>Social Tag Prediction</article-title>
          .
          <source>In Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval (SIGIR</source>
          <year>2008</year>
          ), pages
          <fpage>531</fpage>
          -
          <lpage>538</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Tatu</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Srikanth</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          and
          <string-name>
            <surname>D'Silva</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          <article-title>RSDC'08: Tag Recommendations using Bookmark Content</article-title>
          .
          <source>In Proceedings of ECML PKDD Discovery Challenge 2008 (RSDC</source>
          <year>2008</year>
          ), pages
          <fpage>96</fpage>
          -
          <lpage>107</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>