<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Multiple-stage Approach to Re-ranking Clinical Documents</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Heung-Seon Oh</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yuchul Jung</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Information Service Center, Korea Institute of Science and Technology Information</institution>
        </aff>
      </contrib-group>
      <fpage>210</fpage>
      <lpage>219</lpage>
      <abstract>
        <p>This paper presents our approach to medical information retrieval and the experimental results of our participation in eHealth Task 3-A of CLEF 2014. The task is to retrieve relevant documents from a medical collection given a query generated from a discharge summary. The key idea of our method is to compute accurate similarity scores via multiple stages of re-ranking applied to the initial documents retrieved by a search engine.</p>
      </abstract>
      <kwd-group>
        <kwd>medical information retrieval</kwd>
        <kwd>language models</kwd>
        <kwd>abbreviations</kwd>
        <kwd>query expansion</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Health-related content is one of the most searched-for topics on the internet, and it
has become an important domain for research in information retrieval (IR). Recently,
medical IR has been actively researched to tackle diverse medical information sources,
including the general web, journal articles, social media, and hospital records. However,
medical IR is still challenging because it must consider the various information needs
of a wide range of users, including patients and their caregivers, researchers,
clinicians, and practitioners. Moreover, it is highly correlated with those users’
background medical knowledge and language skills.</p>
      <p>
        eHealth Task 3-A of the Conference and Labs of the Evaluation Forum (CLEF) 2014
[
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ] aims at improving the effectiveness of medical IR systems to support laypeople
(e.g., patients and their relatives) who have diverse information needs. Most
previous research focuses on utilizing external medical resources such as MetaMap [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ],
NegEx [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], the International Classification of Diseases (ICD)-9, and natural language
processing (NLP) [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] to understand the meanings of medical words at the semantic level.
      </p>
      <p>This paper presents a multiple-stage re-ranking method that focuses on utilizing
various retrieval techniques rather than exploiting external resources and
NLP techniques.</p>
      <p>In particular, our proposed method passes through multiple re-ranking stages to
elevate the ranked positions of the most relevant documents. We first perform
query expansion with abbreviations and apply a pseudo-relevance model at the end. In
between, query expansion with the discharge summary, clustering-based
document scoring, and centrality-based document scoring can be combined
selectively or sequentially.</p>
      <p>The rest of this paper is organized as follows. Section 2 reviews related research
on medical information retrieval. Section 3 presents our re-ranking method in detail.
The experimental results are described in Section 4. Section 5 concludes with a short
summary.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>
        Recently, many IR studies have been performed with different types of medical
collections. TREC held a medical track in 2011 and 2012. One study [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] presents a
two-stage method. It extracts useful attributes such as age and gender from a collection
using NLP techniques and hand-crafted regular expressions. At search time, a query is
expanded using the Unified Medical Language System (UMLS). In [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], several ranking
functions are proposed to combine evidence from different levels, including
various external medical resources; the results show that the proposed methods achieved
the best performance. Another study [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] presents a negation detection method using
syntactic information and shows an effective way of handling negations.
      </p>
      <p>
        CLEF held eHealth Lab in 2013. A research [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] presents a two-step ranking system
utilizing three different external resources: external medical collections, a medical
concept mapper, and discharge summaries. It first retrieves documents in text space and then
re-ranks them in concept space.
      </p>
      <p>
        MedSearch system [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] addresses three distinctions compared to traditional systems.
First, it provides query reformulation, which turns a long descriptive query into a
moderate-length query. Second, it supports the diversification of web search results.
Third, it provides medical phrases semantically related to a query from the Medical
Subject Headings (MeSH) ontology.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Methods</title>
      <p>The key idea of our method is to re-rank the top-k documents via multiple stages in
order to compute more accurate similarity scores with respect to a query.</p>
      <p>
        Figure 1 shows an overview of our multiple-stage re-ranking method. For a given
query Q, a set of documents, D_1, D_2, …, D_k, is retrieved from a collection C
using a search engine. In our implementation, the initial documents are retrieved by
Lucene (http://lucene.apache.org/) using a query-likelihood method with Dirichlet smoothing [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Based on the
initial documents, re-ranking is performed via multiple stages. The rest of this section
explains the details of the re-ranking method.
      </p>
      <sec id="sec-3-1">
        <title>Re-ranking Stages</title>
        <p>Figure 1. Overview of the proposed method: (a) searching initial documents;
(b) re-ranking the initial documents.</p>
        <p>
          Throughout re-ranking, the KL-divergence method is utilized to compute a similarity
score between a query and a document [
          <xref ref-type="bibr" rid="ref10 ref11">10, 11</xref>
          ]:
Score(Q, D) = exp(−KL(θ_Q || θ_D)) ∝ exp(Σ_w p(w | θ_Q) · log p(w | θ_D)),
where θ_Q and θ_D are the query and document language models, respectively.
In general, a query model is estimated by maximum likelihood estimation (MLE) as
below:
p(w | θ_Q) = c(w, Q) / |Q|,
where c(w, Q) is the count of a word w in a query Q and |Q| is the number of words in
Q.
        </p>
        <p>
          To avoid zero probabilities and improve retrieval performance, a document model is
estimated using Dirichlet smoothing [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]:
p(w | θ_D) = (c(w, D) + μ · p(w | C)) / (|D| + μ),
where c(w, D) is the count of a word w in a document D, p(w | C) is the probability of
a word w in a collection C, and μ is the Dirichlet prior parameter.
        </p>
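        <p>The two estimates above combine into a runnable scoring function. The following Python sketch is a minimal illustration; the toy collection statistics and the value of μ are assumptions for demonstration, not values from the paper:</p>

```python
import math
from collections import Counter

def query_model(query_tokens):
    """MLE query model: p(w | theta_Q) = c(w, Q) / |Q|."""
    counts = Counter(query_tokens)
    total = len(query_tokens)
    return {w: c / total for w, c in counts.items()}

def doc_prob(w, doc_counts, doc_len, coll_prob, mu=2000.0):
    """Dirichlet-smoothed p(w | theta_D) = (c(w,D) + mu * p(w|C)) / (|D| + mu)."""
    return (doc_counts.get(w, 0) + mu * coll_prob.get(w, 1e-6)) / (doc_len + mu)

def kl_score(query_tokens, doc_tokens, coll_prob, mu=2000.0):
    """Rank-equivalent form of exp(-KL(theta_Q || theta_D)): the query-only
    entropy term is a constant per query, so exp(sum_w p(w|Q) log p(w|D))
    preserves the document ranking."""
    q_model = query_model(query_tokens)
    d_counts = Counter(doc_tokens)
    d_len = len(doc_tokens)
    return math.exp(sum(p * math.log(doc_prob(w, d_counts, d_len, coll_prob, mu))
                        for w, p in q_model.items()))

# Toy data: doc_a mentions both query terms, doc_b mentions neither.
coll_prob = {"heart": 0.01, "attack": 0.005, "pain": 0.02}
doc_a = ["heart", "attack", "pain", "heart"]
doc_b = ["pain", "pain", "pain"]
query = ["heart", "attack"]
assert kl_score(query, doc_a, coll_prob) > kl_score(query, doc_b, coll_prob)
```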
        <p>
          The first stage aims at expanding a query with abbreviations. In numerous
medical documents, abbreviations are widely used to represent important
meanings. Unfortunately, interpreting abbreviations clearly is quite difficult because
the same abbreviated expression can carry several different meanings.
Similarly, medical queries generated by users may also contain abbreviations. If we
submit a query including abbreviations, it may not match relevant documents due to the term
mismatch problem, or it may match documents in which the abbreviations carry different
meanings. To deal with this problem, we apply query expansion that takes abbreviations into
account. To do so, we extract pairs of abbreviations and their corresponding full
representations, with occurrence counts, from the entire collection using a simple rule-based extraction method [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ].
Then, a query model is estimated by incorporating words
from the full representations of the associated abbreviations:
p(w | θ'_Q) = (1 − λ) · p_ML(w | Q) + λ · Σ_{a ∈ Q} p(w | FR(a)) · p_ML(a | Q),
where p_ML is the MLE query model, λ is a control parameter, FR(a) is the set of words
constituting the full representation of an abbreviation a, and p(w | FR(a)) is estimated
from the occurrence counts of the extracted pairs.
        </p>
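        <p>Stage 1 can be sketched as follows; the abbreviation table and the interpolation weight below are toy assumptions standing in for the (abbreviation, full representation) pairs mined from the collection:</p>

```python
from collections import Counter

# Toy stand-in for the mined table: full-form word counts per abbreviation.
abbrev_table = {
    "mi": Counter({"myocardial": 9, "infarction": 9, "mental": 1, "illness": 1}),
}

def expand_query_model(query_tokens, table, lam=0.3):
    """Interpolate the MLE query model with full-form words of any
    abbreviation occurring in the query, then renormalize."""
    q_counts = Counter(query_tokens)
    q_len = len(query_tokens)
    model = {w: (1 - lam) * c / q_len for w, c in q_counts.items()}
    for a, c in q_counts.items():
        full = table.get(a)
        if full is None:
            continue
        p_a = c / q_len                       # p_ML(a | Q)
        total = sum(full.values())
        for w, n in full.items():             # p(w | FR(a)) from counts
            model[w] = model.get(w, 0.0) + lam * p_a * n / total
    z = sum(model.values())                   # renormalize to a distribution
    return {w: p / z for w, p in model.items()}

m = expand_query_model(["mi", "treatment"], abbrev_table)
assert abs(sum(m.values()) - 1.0) < 1e-9
assert m["myocardial"] > m["mental"]          # dominant sense gets more mass
```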
        <p>
          The second stage reflects information from a discharge summary. A query used
in CLEF eHealth Task 3 is generated by a human expert after reading the discharge
summary corresponding to the query. The summary therefore contains hidden but useful
information not captured by the query, and exploiting it can improve retrieval
performance. To do so, the query model is expanded by combining it with a
random-walk-based discharge summary model. First, we compute a word-to-word
transition matrix to measure the associations among words in a
discharge summary. A simple solution is to use the co-occurrence count of two
words over all sentences [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]. However, words are more strongly associated when they
appear close together in a sentence, and associations between topical words are more
important than those between common words. To reflect this, we utilize the
hyperspace analogue to language (HAL) [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] function combined with inverse document frequency
(IDF):
A(w, u) = hal(w, u) · idf(w) · idf(u), with hal(w, u) = Σ_{k=1..N} (N − k + 1) · c_k(w, u),
where c_k(w, u) is the co-occurrence count of w and u within distance k, N is a window
size, and idf(w) is the inverse document frequency of w.
Then, a transition probability is computed by normalization:
p(u | w) = A(w, u) / Σ_v A(w, v).
        </p>
        <p>Based on this transition matrix, word centralities are computed using a
random walk:
c(w) = (1 − λ) · p(w | DS) + λ · Σ_u p(w | u) · c(u),
where λ is a control (damping) parameter, p(w | DS) is the maximum-likelihood
discharge summary model, and |V| denotes the number of unique words in the
discharge summary DS.</p>
        <p>We approximate the resulting centralities c(w) as a discharge summary model θ_DS
and update the query model with it:
p(w | θ'_Q) = (1 − β) · p(w | θ_Q) + β · p(w | θ_DS),
where β is a control parameter.</p>
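        <p>Stage 2's association matrix and random walk can be sketched as below. This is a hedged illustration: it uses a uniform teleport term in place of the discharge-summary model, and the window size, damping factor, and toy document frequencies are assumptions for demonstration:</p>

```python
import math
from collections import defaultdict

def hal_idf(sentences, df, n_docs, window=3):
    """A(w, u) = (window - dist + 1) * idf(w) * idf(u), summed over
    co-occurrences within the window; the association is kept symmetric."""
    idf = {w: math.log(n_docs / (1 + d)) for w, d in df.items()}
    assoc = defaultdict(float)
    for sent in sentences:
        for i, w in enumerate(sent):
            for j in range(i + 1, min(i + window + 1, len(sent))):
                u = sent[j]
                s = (window - (j - i) + 1) * idf.get(w, 0.0) * idf.get(u, 0.0)
                assoc[(w, u)] += s
                assoc[(u, w)] += s
    return assoc

def centralities(assoc, lam=0.85, iters=50):
    """Random walk over row-normalized associations; (1 - lam) mass
    teleports uniformly (a simplification of the paper's teleport term)."""
    words = sorted({w for w, _ in assoc})
    out = {w: sum(v for (a, _), v in assoc.items() if a == w) for w in words}
    p = {w: 1.0 / len(words) for w in words}
    for _ in range(iters):
        p = {w: (1 - lam) / len(words)
                + lam * sum(p[u] * assoc[(u, w)] / out[u]
                            for u in words if (u, w) in assoc)
             for w in words}
    return p

sents = [["chest", "pain", "radiating"], ["chest", "pain", "severe"]]
df = {"chest": 10, "pain": 20, "radiating": 2, "severe": 3}
c = centralities(hal_idf(sents, df, n_docs=100))
assert abs(sum(c.values()) - 1.0) < 1e-6   # stays a distribution
assert c["pain"] > c["radiating"]          # hub words are more central
```

The resulting centralities would then be interpolated into the query model with the control parameter β.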
        <p>
          The third stage incorporates cluster information of documents: a score
for a document is computed by incorporating the document's membership in a
cluster. Bottom-up hierarchical agglomerative clustering [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ] is
applied to partition the top-k documents into a set of disjoint clusters. At first,
k clusters, one per document, are constructed. Then, the two clusters with
the highest similarity are selected and merged into a single cluster if the similarity is
above a threshold. This procedure stops when no pair of clusters exceeds the threshold.
Similarity scores are computed using the KL-divergence method between a query
model and a Dirichlet-smoothed cluster model.
        </p>
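        <p>The threshold-based merging procedure can be sketched as follows; cosine similarity over term counts stands in for the paper's KL-based cluster similarity, and the threshold is an illustrative value:</p>

```python
import math
from collections import Counter

def cosine(a, b):
    num = sum(a[w] * b[w] for w in a if w in b)
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def agglomerative(docs, threshold=0.5):
    """Bottom-up HAC: start with singleton clusters, repeatedly merge the
    most similar pair while the best similarity stays above the threshold."""
    clusters = [Counter(d) for d in docs]
    members = [[i] for i in range(len(docs))]
    while len(clusters) > 1:
        best, pair = -1.0, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                s = cosine(clusters[i], clusters[j])
                if s > best:
                    best, pair = s, (i, j)
        if best < threshold:          # nothing similar enough: stop merging
            break
        i, j = pair
        clusters[i] += clusters.pop(j)  # merge term counts
        members[i] += members.pop(j)
    return members

docs = [["flu", "fever"], ["fever", "flu", "cough"], ["fracture", "bone"]]
groups = agglomerative(docs, threshold=0.5)
# The two flu/fever documents merge; the fracture document stays alone.
```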
        <p>A new score is computed by combining the initial search score and the cluster
score:
Score'(Q, D) = Score(Q, D) · exp(−KL(θ_Q || θ_Cluster(D))),
where Cluster(D) is the cluster containing D.</p>
        <p>The fourth stage is centrality-based document scoring. Based on those results, a
similarity matrix between the initial documents and
their corresponding α documents is constructed. Then, a random walk is executed on this matrix
to produce centrality scores for the initial documents, and each score is multiplied with
the previous score:
Score''(Q, D) = Score'(Q, D) · Centrality(D).</p>
        <p>The fifth stage is pseudo-relevance feedback. A popular way of query expansion is to
update a query based on pseudo-relevance feedback (PRF). Updating a query with
PRF assumes that the top-ranked documents F = {D_1, D_2, …, D_|F|} in an initial search
result are relevant to a given query and that terms in F are useful for modifying the query
into a better representation. The relevance model (RM) estimates a multinomial distribution
p(w | R), the likelihood of a term w given a query Q. The first version of the
relevance model (RM1) is defined as follows:
p(w | R) ∝ Σ_{D ∈ F} p(D) · p(Q | D) · p(w | D).</p>
        <p>RM1 is composed of three components: the document prior p(D), the document
weight p(Q | D), and the term weight in a document p(w | D). In general, p(D) is
assumed to be a uniform distribution in the absence of knowledge about a document D.</p>
        <p>p(Q | D) = Π_{q ∈ Q} p(q | D) indicates the query-likelihood score. p(w | D) can
be estimated using various smoothing methods such as Dirichlet smoothing. Various
strategies are applicable to estimate these components.</p>
        <p>To improve retrieval performance, a new query model can be estimated by
combining a relevance model with the original query model. RM3 [16] is a variant of the
relevance model that estimates a new query model from RM1:
p(w | θ'_Q) = (1 − λ) · p(w | θ_Q) + λ · p(w | R),
where λ is a control parameter between the original query model and the feedback
model.</p>
        <p>Based on this query model, final scores for documents are computed.</p>
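        <p>RM1 and RM3 can be sketched as below, assuming a uniform document prior and MLE term weights (the paper allows other smoothing choices for these components):</p>

```python
import math
from collections import Counter

def rm1(query, feedback_docs):
    """p(w|R) ~ sum_D p(D) * p(Q|D) * p(w|D), with p(D) uniform (folded
    into the normalizer) and MLE estimates for the other two terms."""
    rel = Counter()
    for doc in feedback_docs:
        counts, n = Counter(doc), len(doc)
        p_q_d = math.prod(counts[q] / n for q in query)  # query likelihood
        for w, c in counts.items():
            rel[w] += p_q_d * (c / n)
    z = sum(rel.values())  # assumes at least one doc matches all query terms
    return {w: v / z for w, v in rel.items()}

def rm3(query, feedback_docs, lam=0.5):
    """Interpolate the original query model with the RM1 feedback model."""
    q_model = {w: c / len(query) for w, c in Counter(query).items()}
    fb = rm1(query, feedback_docs)
    words = set(q_model) | set(fb)
    return {w: (1 - lam) * q_model.get(w, 0.0) + lam * fb.get(w, 0.0)
            for w in words}

fb_docs = [["diabetes", "insulin", "glucose"], ["diabetes", "diet"]]
new_q = rm3(["diabetes"], fb_docs)
assert abs(sum(new_q.values()) - 1.0) < 1e-9
assert new_q["insulin"] > 0.0   # feedback terms enter the expanded query
```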
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Experiments</title>
      <p>As mentioned, initial documents are retrieved by Lucene using the query-likelihood
method with Dirichlet smoothing. We limited the number of initial documents to 100.
Based on these initial documents, we submitted 7 runs by varying the
components of our re-ranking method.</p>
      <p>Table 1 shows the parameters and corresponding values for each component in the
experiments. Table 2 describes the components involved in each run and the evaluation
results of the corresponding runs. Component 5, which indicates the use of PRF,
is applied to all runs and is thus regarded as the baseline of our experiments. Except for Run01, all
runs utilize component 1. The distinction between Run02-04 and Run05-07 is that the
former use the discharge summary while the latter do not. Precision and normalized
discounted cumulative gain (NDCG) are used to measure the performance of the top-10
ranked documents from the 100 initial documents. They are denoted as P@10 and
NDCG@10, respectively.</p>
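      <p>The two measures can be sketched as follows; the DCG formulation with linear gains is one common variant, and the graded relevance judgments here are illustrative:</p>

```python
import math

def precision_at_k(gains, k=10):
    """Fraction of the top-k results with any positive relevance."""
    return sum(1 for g in gains[:k] if g > 0) / k

def ndcg_at_k(gains, k=10):
    """NDCG@k with linear gains and log2(rank + 1) discounts (one common
    DCG variant; exponential gains 2^rel - 1 are also widely used)."""
    def dcg(gs):
        return sum(g / math.log2(i + 2) for i, g in enumerate(gs[:k]))
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal else 0.0

ranked = [2, 1, 0, 1, 0, 0, 0, 0, 0, 0]   # graded judgments by rank
assert precision_at_k(ranked) == 0.3
assert ndcg_at_k([2, 1, 1, 0, 0, 0, 0, 0, 0, 0]) == 1.0  # ideal ordering
```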
      <p>Our baseline achieved 0.7300 and 0.7235 in P@10 and NDCG@10, respectively,
showing that PRF is an effective solution for finding correct medical documents. For
precision, the best performance, 0.7400, is obtained by Run02, which utilizes
abbreviations and the discharge summary. For NDCG, the best performance, 0.7333, is obtained
by Run04, which uses all components of the re-ranking method. This shows that
sequentially combining the proposed components contributes to achieving the best
performance in the NDCG measure. However, clustering and centrality-based document
scoring were not effective in enhancing the precision measure.</p>
      <sec id="sec-4-1">
        <title>Discussion</title>
        <p>[Table 2: components (marked O) used by runs KISTI_EN_RUN03 through
KISTI_EN_RUN07; the table layout was lost in extraction.]</p>
        <p>Due to the quite high baseline (i.e., Run01) obtained by PRF with the relevance model and
the lack of an in-depth study of the provided healthcare dataset, our experiments failed to
show drastic improvements in the evaluation measures. Meanwhile, the moderate
performance observed in our multi-stage approach to re-ranking documents (i.e.,
Run04) may arise from synergistic effects between the involved components. A
detailed analysis of the involved components in terms of causal and sequential effects
remains as future work.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>This paper presents a multiple-stage approach to re-ranking medical documents. Our
method focuses on utilizing various retrieval techniques rather than medical
domain-dependent external resources and natural language processing to understand medical
meanings. We found that using abbreviations and the discharge summary plays an
important role in finding correct medical documents. Our future work includes further
development of the two components and an in-depth error analysis based on a standard
assessment dataset.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Kelly</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goeuriot</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Suominen</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schrek</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Leroy</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mowery</surname>
            ,
            <given-names>D.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Velupillai</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chapman</surname>
            ,
            <given-names>W.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Martinez</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zuccon</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Palotti</surname>
          </string-name>
          , J.:
          <article-title>Overview of the ShARe/CLEF eHealth Evaluation Lab 2014</article-title>
          .
          <source>Proceedings of CLEF 2014</source>
          . Springer (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Goeuriot</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kelly</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Palotti</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pecina</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zuccon</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hanbury</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jones</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mueller</surname>
          </string-name>
          , H.:
          <source>ShARe/CLEF eHealth Evaluation Lab</source>
          <year>2014</year>
          ,
          <article-title>Task 3: User-centred health information retrieval</article-title>
          .
          <source>Proceedings of CLEF</source>
          <year>2014</year>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Aronson</surname>
            ,
            <given-names>A.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lang</surname>
            ,
            <given-names>F.-M.:</given-names>
          </string-name>
          <article-title>An overview of MetaMap: historical perspective and recent advances</article-title>
          .
          <source>Journal of the American Medical Informatics Association : JAMIA</source>
          .
          <volume>17</volume>
          ,
          <fpage>229</fpage>
          -
          <lpage>36</lpage>
          (
          <year>2010</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Zhu</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Carterette</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Exploring evidence aggregation methods and external expansion sources for medical record search</article-title>
          .
          <source>Proceedings of Text REtrieval Conference (TREC)</source>
          .
          <volume>1</volume>
          -
          <fpage>9</fpage>
          (
          <year>2012</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Díaz</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ballesteros</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Carrillo-de-Albornoz</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Plaza</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          : UCM at TREC-2012:
          <article-title>Does negation influence the retrieval of medical reports?</article-title>
          <source>Proceedings of Text REtrieval Conference (TREC)</source>
          (
          <year>2012</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>King</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Provalov</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhou</surname>
          </string-name>
          , J.:
          <article-title>Cengage Learning at TREC 2011 Medical Track</article-title>
          .
          <source>Proceedings of Text REtrieval Conference (TREC)</source>
          (
          <year>2011</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Zhu</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stephen</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>James</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Carterette</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
          </string-name>
          , H.:
          <article-title>Using Discharge Summaries to Improve Information Retrieval in Clinical Domain</article-title>
          .
          <source>ShARe/CLEF eHealth Evaluation</source>
          (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Luo</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tang</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wei</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          : MedSearch.
          <source>Proceeding of the 17th ACM conference on Information and knowledge mining - CIKM '08</source>
          . p.
          <fpage>143</fpage>
          . ACM Press, New York, New York, USA (
          <year>2008</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Zhai</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lafferty</surname>
            ,
            <given-names>J.:</given-names>
          </string-name>
          <article-title>A study of smoothing methods for language models applied to Ad Hoc information retrieval</article-title>
          .
          <source>Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '01</source>
          . pp.
          <fpage>334</fpage>
          -
          <lpage>342</lpage>
          . ACM Press, New York, New York, USA (
          <year>2001</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Kurland</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>PageRank without hyperlinks</article-title>
          .
          <source>Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '05</source>
          . p.
          <fpage>306</fpage>
          . ACM Press, New York, New York, USA (
          <year>2005</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Oh</surname>
            ,
            <given-names>H.-S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Myaeng</surname>
          </string-name>
          , S.-H.:
          <article-title>Utilizing global and path information with language modelling for hierarchical text classification</article-title>
          .
          <source>Journal of Information Science</source>
          .
          <volume>40</volume>
          ,
          <fpage>127</fpage>
          -
          <lpage>145</lpage>
          (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Schwartz</surname>
            ,
            <given-names>A.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hearst</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          :
          <article-title>A simple algorithm for identifying abbreviation definitions in biomedical text</article-title>
          .
          <source>Proceedings of Pacific Symposium on Biocomputing</source>
          . pp.
          <fpage>451</fpage>
          -
          <lpage>462</lpage>
          (
          <year>2003</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Weeds</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weir</surname>
            ,
            <given-names>D.:</given-names>
          </string-name>
          <article-title>Co-occurrence Retrieval: A Flexible Framework for Lexical Distributional Similarity</article-title>
          .
          <source>Computational Linguistics</source>
          .
          <volume>31</volume>
          ,
          <fpage>439</fpage>
          -
          <lpage>475</lpage>
          (
          <year>2005</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Song</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bruza</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Discovering information flow using high dimensional conceptual space</article-title>
          .
          <source>Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '01</source>
          . pp.
          <fpage>327</fpage>
          -
          <lpage>333</lpage>
          . ACM Press, New York, New York, USA (
          <year>2001</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Manning</surname>
            ,
            <given-names>C.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Raghavan</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schütze</surname>
          </string-name>
          , H.: Introduction to Information Retrieval. Cambridge University Press (2008).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Abdul-Jaleel</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Allan</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Croft</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Diaz</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Larkey</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>UMass at TREC 2004: Novelty and HARD</article-title>
          (
          <year>2004</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>