<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Finding similar research papers using language models</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
<string-name>Germán Hurtado Martín</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Steven Schockaert</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Chris Cornelis</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Helga Naessens</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dept. of Applied Math. and Comp. Science, Ghent University</institution>
          ,
          <addr-line>Ghent</addr-line>
          ,
          <country country="BE">Belgium</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Dept. of Industrial Engineering, University College Ghent</institution>
          ,
          <addr-line>Ghent</addr-line>
          ,
          <country country="BE">Belgium</country>
        </aff>
      </contrib-group>
      <fpage>106</fpage>
      <lpage>113</lpage>
      <abstract>
<p>We explore how similar research papers can be found by comparing their abstracts using language models, interpolating the corresponding language model with language models for the authors, keywords and journal of the paper. This strategy is then extended by finding topics and additionally interpolating with the resulting topic models. These topics are found using an adaptation of Latent Dirichlet Allocation (LDA), in which the keywords that were provided by the authors are used to guide the process.</p>
      </abstract>
      <kwd-group>
        <kwd>Similarity</kwd>
        <kwd>Language modeling</kwd>
        <kwd>Latent Dirichlet allocation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
<title>Introduction</title>
<p>Due to the rapidly growing number of published research results, searching for
relevant papers can become a tedious task for researchers. In order to mitigate
this problem, several solutions have been proposed, such as scientific article
recommender systems [2, 8] or dedicated search engines such as Google Scholar.
At the core of such systems lies the ability to measure to what extent two papers
are similar, e.g. to find out whether a paper is similar to papers that are known
to be of interest to the user, to explicitly allow users to find "Related articles" (as
in Google Scholar), or to ensure that the list of search results that is presented
to the user is sufficiently novel and diverse [3]. To find out whether two articles
are similar, content-based approaches can be complemented with collaborative
filtering techniques (e.g. based on CiteULike.org or Bibsonomy.org) or citation
analysis (e.g. PageRank, HITS, etc.). While the latter are well-studied,
content-based approaches are usually limited to baseline techniques such as using the
cosine similarity between vector representations of the abstracts.</p>
<p>Comparing research papers is complicated by the fact that their full text is
often not publicly available, and only the abstract along with some document
features such as keywords, authors, or journal can be accessed. The challenge
thus becomes to make optimal use of this limited amount of information. Hurtado
et al. [7] investigated the impact of individual document features within the
vector space model. Their main conclusion was that baseline methods using only
the abstract could not be improved significantly by enriching them with other
features. Language modeling techniques, however, have been shown to perform
well for comparing short text snippets [6, 10].</p>
<p>Our goal in this paper is therefore to explore how language models can be
used to compare research paper abstracts, how they can best make use of the
other document features, and whether they are a more reasonable choice than
a vector space model based approach for this task. In particular, we combine
two ideas to address these questions. On the one hand, we consider the idea of
estimating language models for document features such as keywords, authors,
and journal, and derive a language model for the article by interpolating them
(an idea which has already proven useful for expert finding [11]). On the other
hand, we apply LDA to discover latent topics in the documents, and explore how
the keywords can help to improve the performance of standard LDA.</p>
    </sec>
    <sec id="sec-2">
      <title>Research paper similarity</title>
<p>In this section we review and introduce several methods to measure article
similarity, based on the information commonly available for a research paper:
abstract, keywords, authors, and journal.</p>
      <sec id="sec-2-1">
        <title>Vector space model</title>
<p>The similarity of two papers can easily be measured by comparing their abstracts
in the vector space model (method abstract in the result tables): each paper is
represented as a vector, in which each component corresponds to a term
occurring in the collection. To calculate the weight for that term the standard tf-idf
approach is used, after removing stopwords. The vectors d1 and d2
corresponding to different papers can then be compared using standard similarity measures
such as the cosine (cos), generalized Jaccard (g.jacc), extended Jaccard (e.jacc),
and Dice (dice) similarity, using them as in [7]. Alternatively, we also consider a
vector representation where the abstract is completely ignored, and where there
is one component for each keyword, with the weights calculated analogously as
in the tf-idf model (method keywords).</p>
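The tf-idf baseline described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the whitespace tokenization and tiny stopword list are simplified placeholders.

```python
import math
from collections import Counter

STOPWORDS = frozenset({"the", "of", "a", "in", "is", "and"})  # placeholder list

def tfidf_vectors(abstracts):
    """Build sparse tf-idf vectors (word -> weight dicts) for a collection."""
    tokenized = [[w for w in text.lower().split() if w not in STOPWORDS]
                 for text in abstracts]
    n = len(tokenized)
    df = Counter(w for doc in tokenized for w in set(doc))  # document frequency
    vectors = []
    for doc in tokenized:
        tf = Counter(doc)
        vectors.append({w: tf[w] * math.log(n / df[w]) for w in tf})
    return vectors

def cosine(d1, d2):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * d2.get(w, 0.0) for w, v in d1.items())
    n1 = math.sqrt(sum(v * v for v in d1.values()))
    n2 = math.sqrt(sum(v * v for v in d2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0
```

The other similarity measures (generalized Jaccard, extended Jaccard, Dice) would replace only the `cosine` function; the vector construction stays the same.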
<p>In [7] an alternative scheme for using the keywords has been proposed, which
does not ignore the information from the abstract. This scheme was referred to
as explicit semantic analysis (ESA) since it is analogous to the approach from
[4]. The idea is, departing from a vector d obtained by method abstract, to
define a new vector representation dE of this paper, with one component for
every keyword k appearing in the collection. The weights of dE's components are
defined by wk = d · qk, where the vector qk is formed by the tf-idf weights
corresponding to the concatenation of the abstracts of all papers to which keyword k
was assigned. This method is called ESA-kws in our experiments below. Similar
methods are considered in which vector components refer to authors (ESA-aut)
or to journals (ESA-jou). For efficiency and robustness, only authors are
considered that appear in at least 4 papers in the ESA-aut method, and only keywords
that appear in at least 6 papers in the ESA-kws method. Higher thresholds would
exclude too many keywords and authors, while lower thresholds would result in
a high computational cost due to the large number of values in each vector.</p>
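The ESA-style re-representation can be sketched as below, assuming tf-idf dict vectors as in a standard implementation. The helper names (`esa_vectors`, `min_papers`) are our own; the frequency threshold mirrors the cut-offs mentioned above.

```python
import math
from collections import Counter

def tfidf_matrix(texts):
    """tf-idf dict vectors for a list of texts (whitespace tokenization)."""
    docs = [t.lower().split() for t in texts]
    n = len(docs)
    df = Counter(w for d in docs for w in set(d))
    return [{w: c * math.log(n / df[w]) for w, c in Counter(d).items()}
            for d in docs]

def esa_vectors(abstracts, keywords_per_paper, min_papers=1):
    """One ESA component per keyword k: w_k = d . q_k, where q_k is the
    tf-idf vector of the concatenated abstracts of all papers tagged with k."""
    # Build the concatenated pseudo-document for every keyword.
    kw_texts = {}
    for abstract, kws in zip(abstracts, keywords_per_paper):
        for k in kws:
            kw_texts.setdefault(k, []).append(abstract)
    kw_texts = {k: " ".join(v) for k, v in kw_texts.items()
                if len(v) >= min_papers}  # frequency threshold
    keys = sorted(kw_texts)
    # Compute tf-idf in one shared space: abstracts first, keyword docs after.
    vecs = tfidf_matrix(abstracts + [kw_texts[k] for k in keys])
    d_vecs, q_vecs = vecs[:len(abstracts)], vecs[len(abstracts):]
    def dot(a, b):
        return sum(v * b.get(w, 0.0) for w, v in a.items())
    return [[dot(d, q) for q in q_vecs] for d in d_vecs], keys
```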
      </sec>
      <sec id="sec-2-2">
        <title>Language modeling</title>
<p>A different approach is to estimate unigram language models [9] for each
document, and calculate their divergence. A document d is then assumed to be
generated by a given model D. This model is estimated from the terms that
occur in the abstract of d (and the rest of the abstracts in the collection). Using
Jelinek-Mercer smoothing, the probability that model D generates term w is
given by:</p>
        <p>P(w|D) = λ P(w|d) + (1 − λ) P(w|C)  (1)</p>
        <p>where C is the whole collection of abstracts. The probabilities P(w|d) and P(w|C)
are estimated using maximum likelihood, e.g. P(w|d) is the fraction of
occurrences of term w in the abstract of document d. Once the models D1 and D2
corresponding to two documents d1 and d2 are estimated, we measure their
difference using the well-known Kullback-Leibler divergence, defined by</p>
        <p>KLD(D1 || D2) = Σw D1(w) log (D1(w) / D2(w))  (2)</p>
        <p>If a symmetric measure is desired, the Jensen-Shannon divergence could
alternatively be used.</p>
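The Jelinek-Mercer smoothed model and the KL divergence above can be sketched as follows; a minimal illustration with an assumed smoothing weight `lam`, not the authors' code.

```python
import math
from collections import Counter

def jm_model(doc_tokens, collection_counts, collection_len, lam=0.8):
    """Jelinek-Mercer smoothed unigram model:
    P(w|D) = lam * P(w|d) + (1 - lam) * P(w|C), over the collection vocabulary."""
    tf = Counter(doc_tokens)
    n = len(doc_tokens)
    return {w: lam * tf[w] / n + (1 - lam) * collection_counts[w] / collection_len
            for w in collection_counts}

def kld(p, q):
    """Kullback-Leibler divergence KLD(P||Q) over a shared vocabulary.
    Smoothing guarantees q[w] > 0 for every collection word."""
    return sum(pw * math.log(pw / q[w]) for w, pw in p.items() if pw > 0)
```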
        <p>Language model interpolation. The probabilities in the model of a
document are thus calculated using the abstracts in the collection. However, given
the short length of the abstracts, we should make maximal use of all the
available information, i.e. also consider the keywords k, authors a, and journal j. In
particular, the idea of interpolating language models (also used for example in
[11]), which underlies Jelinek-Mercer smoothing, can be generalized:</p>
        <p>P(w|D) = λ1 P(w|d) + λ2 P(w|k) + λ3 P(w|a) + λ4 P(w|j) + λ5 P(w|C)  (3)</p>
        <p>with Σi λi = 1. In order to estimate P(w|k), P(w|a), and P(w|j), we consider an
artificial document for each keyword k, author a and journal j, corresponding to
the concatenation of the abstracts where k, a and j occur, respectively. Then, the
probabilities are estimated using maximum likelihood, analogously to P(w|d).
Since a document may contain more than one author and one keyword, we define
P(w|k) and P(w|a) as:</p>
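The generalized interpolation of Eq. (3) is a simple weighted mixture; a sketch, where each component model is a word-to-probability dict (the function name and calling convention are our own):

```python
def interpolated_model(vocab, models, weights):
    """P(w|D) = sum_i lambda_i * P_i(w), with the weights summing to 1.
    models: the component distributions P(w|d), P(w|k), P(w|a), P(w|j), P(w|C),
    each a dict; words missing from a component contribute probability 0."""
    assert abs(sum(weights) - 1.0) < 1e-9, "interpolation weights must sum to 1"
    return {w: sum(lam * m.get(w, 0.0) for lam, m in zip(weights, models))
            for w in vocab}
```

If every component is a proper distribution over `vocab`, the mixture is again a proper distribution, which is what the divergence measures require.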
        <p>P(w|k) = (1/n) Σi=1..n P(w|ki)  (4)</p>
        <p>P(w|a) = (1/m) Σj=1..m P(w|aj)  (5)</p>
        <p>where n and m are the number of keywords and authors in the document.</p>
        <p>Latent Dirichlet Allocation. Two conceptually related abstracts may contain
different terms (e.g. synonyms, misspellings, related terms), and may therefore
not be recognized as similar. While this is a typical problem in information
retrieval, it is aggravated here due to the short length of abstracts. To cope
with this, methods can be used that recognize which topics are covered by an
abstract. The idea is that topics are broader than keywords, but still sufficiently
discriminative to yield a meaningful description of the content of an abstract.
This topical information is not directly available; however, it can be estimated
by using Latent Dirichlet Allocation (LDA) [1].</p>
        <p>The idea behind LDA is that documents are generated by a (latent) set of
topics, which are modeled as probability distributions over terms. To generate
a document, a distribution over those topics is set, and then, to generate each
word w in the document, a topic z is sampled from the topic distribution, and
w is sampled from the word distribution of the selected topic. In other words,
the set of distributions over the words in the collection and the set of
distributions over all the topics must be estimated. To do so, we use LDA with Gibbs
sampling [5]. We can then estimate these probabilities as:</p>
        <p>P(w|z, t) = (nz(w) + β) / (nz(·) + W β)  (6)</p>
        <p>P(z|d, t) = (nz(d) + α) / (n·(d) + T α)  (7)</p>
        <p>where t is the LDA model obtained with Gibbs sampling, W is the number
of words in the collection, and T is the number of topics. Parameters β and α
intuitively specify how close (6) and (7) are to a maximum likelihood estimation.
The count nz(w) is the number of times word w has been assigned to topic z,
while nz(d) is the number of times a word of document d has been assigned to
topic z. Finally, nz(·) is the total number of words assigned to topic z, and n·(d) is
the total number of words of document d assigned to any topic. All these values
are unknown a priori; however, by using Gibbs sampling they can be estimated.</p>
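The point estimates used after Gibbs sampling can be sketched directly from the counts defined above. This follows the standard Gibbs-LDA estimators [5]; the function names and list-of-lists count layout are our own choices.

```python
def phi(n_zw, beta):
    """P(w|z) from Gibbs counts: (n_z(w) + beta) / (n_z(.) + W*beta).
    n_zw[z][w] = number of times word w was assigned to topic z."""
    W = len(n_zw[0])
    return [[(row[w] + beta) / (sum(row) + W * beta) for w in range(W)]
            for row in n_zw]

def theta(n_dz, alpha):
    """P(z|d) from Gibbs counts: (n_z(d) + alpha) / (n_.(d) + T*alpha).
    n_dz[d][z] = number of times a word of document d was assigned to topic z."""
    T = len(n_dz[0])
    return [[(row[z] + alpha) / (sum(row) + T * alpha) for z in range(T)]
            for row in n_dz]
```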
        <p>To find the underlying topics, the LDA algorithm needs some input, namely
the number T of topics to be found. Based on preliminary results, we set T =
K/10, where K is the number of keywords that are considered. The topics that
are obtained from LDA can be used to improve the language model of a given
document d. In particular, we propose to add P(w|t) to the right-hand side of
(3), with an appropriate weight. P(w|t) reflects the probability that term
w is generated by the topics underlying document d. It can be estimated by
considering that:</p>
        <p>P(w|t) = Σi=1..T P(w|zi) P(zi|t)  (8)</p>
        <p>This method is referred to as LM0 in the result tables.</p>
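The mixture in Eq. (8) marginalizes the topic out; a self-contained sketch (our own function name, with the topic-word and document-topic distributions passed in as plain lists):

```python
def topic_term_prob(w, topic_word_dists, doc_topic_dist):
    """Eq. (8)-style mixture: P(w|t) = sum_i P(w|z_i) * P(z_i|t).
    topic_word_dists[i][w] = P(w|z_i); doc_topic_dist[i] = P(z_i|t)."""
    return sum(pwz[w] * pz for pwz, pz in zip(topic_word_dists, doc_topic_dist))
```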
        <p>The method LM0 can be improved by taking advantage of the keywords
that have been assigned to each paper. In particular, we propose to initialize the
topics by determining T clusters of keywords using k-means. Then, a document
c is created for every cluster. This artificial document is the concatenation of
the abstracts of all papers to which some keyword from the cluster was assigned.
Once these documents c are made, initial values for the parameters nz(w), nz(d),
and nz(·) in (6) and (7) can be retrieved from them: nz(w) is initialized with the
number of occurrences of w in artificial document cz, nz(d) with the number of
words of document d occurring in cz, and nz(·) with the total number of words in
cz. Parameter n·(d) is independent from the clustering results as it takes the value
of the total number of words in document d. We furthermore take α = 50/T and
β = 0.1. Subsequently, we can either work directly with these initial values (i.e.,
use them in (6) and (7): method LM1 in the result tables), or we can apply
Gibbs sampling (method LM2). In this last case, the values resulting from the
clustering are only used to initialize the sampler, as an alternative to the random
values used normally.</p>
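The count initialization from keyword clusters can be sketched as below; the k-means clustering step itself is assumed to have already produced the cluster documents, and the function name is our own.

```python
from collections import Counter

def init_counts_from_clusters(cluster_docs, documents):
    """Initialize Gibbs-sampling counts from cluster documents, as in LM1/LM2:
    n_z(w) = occurrences of w in cluster document c_z;
    n_z(.) = total number of words in c_z;
    n_z(d) = number of words of document d occurring in c_z.
    cluster_docs and documents are lists of token lists."""
    n_zw = [Counter(c) for c in cluster_docs]        # n_z(w)
    n_z = [len(c) for c in cluster_docs]             # n_z(.)
    cluster_sets = [set(c) for c in cluster_docs]
    n_dz = [[sum(1 for w in d if w in s) for s in cluster_sets]
            for d in documents]                      # n_z(d)
    return n_zw, n_z, n_dz
```

With LM1 these counts are plugged into the estimators directly; with LM2 they only seed the Gibbs sampler in place of random assignments.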
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Experimental evaluation</title>
      <sec id="sec-3-1">
        <title>Experimental Set-Up</title>
<p>To build a test collection and evaluate the proposed methods, we downloaded
a portion of the ISI Web of Science3, consisting of files with information about
articles from 19 journals in the Artificial Intelligence domain. These files contain,
among other data, the abstract, authors, journal, and keywords freely chosen by
the authors. A total of 25964 paper descriptions were retrieved, although our
experiments are restricted to the 16597 papers for which none of the considered
fields is empty.</p>
<p>The ground truth for our experiments is based on annotations made by 3
experts. First, 100 articles with which at least one of the experts was sufficiently
familiar were selected. Then, using tf-idf with cosine similarity, the 30 most
similar articles in the test collection were found for each of the 100 articles. Each of
those 30 articles was manually tagged by the expert as similar or dissimilar. To
evaluate the performance of the methods, each paper p is thus compared against
30 others4, some of which are tagged as similar. Similarity measures can then
be used to rank the 30 papers, such that ideally the papers similar to p appear
at the top of the ranking. In principle, we thus obtain 100 rankings. However,
due to the fact that some of the lists contained only dissimilar articles, and that
sometimes the experts were not certain about the similarity of some items, the
initial 100-article set was reduced to 89 rankings. To evaluate these rankings, we
use mean average precision (MAP) and mean reciprocal rank (MRR).</p>
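The two evaluation measures can be sketched as follows; a minimal illustration where each ranking is a list of booleans marking the papers the expert tagged as similar.

```python
def average_precision(ranked_relevance):
    """AP for one ranking: the mean of precision@i over the relevant positions.
    MAP is the mean of this value over all rankings."""
    hits, total = 0, 0.0
    for i, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            total += hits / i  # precision at this cut-off
    return total / hits if hits else 0.0

def mean_reciprocal_rank(rankings):
    """MRR: average of 1/rank of the first similar paper in each ranking."""
    rr = [next((1 / i for i, rel in enumerate(r, start=1) if rel), 0.0)
          for r in rankings]
    return sum(rr) / len(rr)
```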
      </sec>
      <sec id="sec-3-2">
        <title>Results</title>
        <p>3 http://apps.isiknowledge.com</p>
        <p>4 During the annotation process it was also possible to tag some items as
"Don't know" for those cases where the expert had no certainty about the similarity.
These items are ignored and therefore some papers are compared to fewer than 30 others.</p>
        <p>We fixed the sum of the weights assigned to the abstract, keywords, authors,
journal, and topics to 0.9, and set the general smoothing factor (λ5 in (3)) to 0.1.</p>
<p>The main conclusion that we can draw from these results is that language
models are indeed capable of yielding a substantial improvement over all of the
vector space approaches. The first block of Table 2 summarizes the results
obtained with language models that only use one of the features. We find that
language models which only use the abstract significantly5 improve the
performance of the most traditional vector space methods (abstract). Models uniquely
based on other features can perform slightly better than abstract, but these
improvements were not found to be significant. However, these results are still
useful as an indication of the amount of information contained in each of the
features: language models based exclusively on keywords or on authors perform
comparably to the method abstract. Using topics only yields such results when
LM2 is used, while the information contained in the journal is clearly poorer.</p>
        <p>5 In this work we consider an improvement to be significant when p &lt; 0.05 for the
paired t-test.</p>
<p>In the second block of Table 2 we examine different combinations of two
features: abstract with topics on the first three lines, and abstract with keywords
on the last three. These results confirm that the abstract contains the most
information, and should be assigned a high weight. On the other hand, we can
observe how the topics, when combined with the abstract, yield a better MAP
score. In particular, the MAP scores for the LM2 configuration on the second
(resp. third) line of the second block are significantly better than the LM2 score
on the fifth (resp. sixth) line.</p>
<p>The third block of Table 2 shows the results of combining abstract and topics
with keywords, authors, and journal. It is clear that giving a small weight to
keywords is beneficial, as it leads to the highest scores, which are significantly
better than all configurations of the second block. For authors and journal,
however, we do not find a substantial improvement. In Fig. 1 we further explore
the importance of the abstract and the topics. We set the weight of the keywords
to a fixed value of 0.1, and the remaining weight of 0.8 is divided between
abstract and topics. What is particularly noticeable is that ignoring the abstract
is penalized more strongly than ignoring the topics, but the optimal performance is
obtained when both features are given approximately the same weight.</p>
      <p>Finally, we can note from Table 2 that LM1 cannot improve on LM0, but clear
differences in MAP scores can be observed between LM0 and LM2. These latter
differences are significant for all configurations in the third block, and the first
three configurations in the second block.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Conclusion</title>
<p>We have shown how language models can be used to compare research paper
abstracts and how their performance for this task can be improved by using
other available document features such as keywords, authors, and journal. In
particular, language models have proven more suitable in this context than any
of the vector space methods we considered. We have also explored how LDA can
be used in this setting to discover latent topics, and a method has been proposed
to effectively exploit the keywords to significantly improve the performance of
the standard LDA algorithm.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>1. D. M. Blei, A. Y. Ng, and M. I. Jordan. Latent Dirichlet allocation. <source>Journal of Machine Learning Research</source>, 3:993–1022, <year>2003</year>.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>2. T. Bogers and A. van den Bosch. Recommending scientific articles using CiteULike. <source>In Proc. of the 2008 ACM Conf. on Recommender Systems</source>, pages 287–290, <year>2008</year>.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>3. C. L. Clarke, M. Kolla, G. V. Cormack, O. Vechtomova, A. Ashkan, S. Büttcher, and I. MacKinnon. Novelty and diversity in information retrieval evaluation. <source>In Proc. of the 31st Annual International ACM SIGIR Conf. on Research and Development in Information Retrieval</source>, pages 659–666, <year>2008</year>.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>4. E. Gabrilovich and S. Markovitch. Computing semantic relatedness using Wikipedia-based explicit semantic analysis. <source>In Proc. of the 20th International Joint Conf. on Artificial Intelligence</source>, pages 1606–1611, <year>2007</year>.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>5. T. L. Griffiths and M. Steyvers. Finding scientific topics. <source>Proc. of the National Academy of Sciences</source>, 101(Suppl. 1):5228–5235, <year>2004</year>.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>6. L. Hong and B. D. Davison. Empirical study of topic modeling in Twitter. <source>In Proc. of the First Workshop on Social Media Analytics</source>, pages 80–88, <year>2010</year>.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>7. G. Hurtado Martín, S. Schockaert, C. Cornelis, and H. Naessens. Metadata impact on research paper similarity. <source>In Proc. of the 14th European Conf. on Research and Advanced Technology for Digital Libraries</source>, pages 457–460, <year>2010</year>.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>8. S. M. McNee, I. Albert, D. Cosley, P. Gopalkrishnan, S. K. Lam, A. M. Rashid, J. A. Konstan, and J. Riedl. On the recommending of citations for research papers. <source>In Proc. of the 2002 ACM Conf. on Computer Supported Cooperative Work</source>, pages 116–125, <year>2002</year>.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>9. J. M. Ponte and W. B. Croft. A language modeling approach to information retrieval. <source>In Proc. of the 21st Annual International ACM SIGIR Conf. on Research and Development in Information Retrieval</source>, pages 275–281, <year>1998</year>.</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>10. X. Quan, G. Liu, Z. Lu, X. Ni, and L. Wenyin. Short text similarity based on probabilistic topics. <source>Knowledge and Information Systems</source>, 25:473–491, <year>2010</year>.</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>11. J. Zhu, X. Huang, D. Song, and S. Rüger. Integrating multiple document features in language models for expert finding. <source>Knowledge and Information Systems</source>, 23:29–54, <year>2010</year>.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>