<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>An Investigation of Effectiveness of Concept-based Approach in Medical Information Retrieval: GRIUM @ CLEF2014 eHealth Task 3</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Wei Shen</string-name>
          <email>shenwei@iro.umontreal.ca</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jian-Yun Nie</string-name>
          <email>nie@iro.umontreal.ca</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Xiaohua Liu</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Xiaojie Liu</string-name>
          <email>xiaojie@iro.umontreal.ca</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>C.P. 6128, succursale Centre-ville Montreal</institution>
          ,
          <addr-line>Quebec CANADA H3C 3J7</addr-line>
        </aff>
      </contrib-group>
      <fpage>236</fpage>
      <lpage>247</lpage>
      <abstract>
        <p>In our participation in the CLEF 2014 eHealth Task 3a, we investigate the effectiveness of concept-based retrieval techniques for medical IR. Concepts are determined using existing resources and tools: the UMLS Metathesaurus and MetaMap. We tested several concept-based methods. Although some of these methods lead to slight improvements in retrieval effectiveness over a traditional bag-of-words method, the impact of the rich domain resource is lower than we expected. The question of whether and how such a resource can help improve medical IR effectiveness therefore remains open. In this report, we describe the methods tested as well as their results.</p>
      </abstract>
      <kwd-group>
        <kwd>concept-based retrieval</kwd>
        <kwd>query expansion</kwd>
        <kwd>language model</kwd>
        <kwd>UMLS</kwd>
        <kwd>MetaMap</kwd>
        <kwd>Indri</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>Our experiments on CLEF 2014 eHealth Task 3 [1, 2] aim to investigate the
effectiveness of concept-based approaches in medical IR. Medicine is possibly
the area with the best manually constructed resources for identifying
concepts. The Metathesaurus [24] is a large medical thesaurus that gathers
resources such as MeSH [25], Snomed [26], etc. Tools for identifying and
disambiguating concepts in texts, such as MetaMap [27], have also been developed.
In the Metathesaurus, a term is linked to a large number of other terms that denote
its synonyms, lexical variants, abbreviations, hypernyms, hyponyms, etc.
Intuitively, the availability of these resources and tools should result in better
IR effectiveness than the traditional bag-of-words approaches. However,
previous experimental results have been disappointing. For example, [3] did not
observe any improvement using concepts recognized from texts. [4] exploited
a statistical thesaurus and obtained a 2.2% improvement. [5] used MetaMap to
recognize concepts from texts and used the concepts in query expansion, which
led to an improvement of 4.4% over the bag-of-words approach. A number of
other studies [6-21] have also used different resources and tools. However, the
overall conclusions are similar: in some cases slight improvements are obtained;
in other cases no improvement or even a degradation is observed. Overall, the
gains obtained from medical resources and tools in IR experiments have been lower
than one would expect. The question remains: can we really benefit from the
rich resources and tools of the medical area to improve IR effectiveness? Are the
disappointing results related to the way the resources and tools are used?</p>
      <p>In our experiments in CLEF 2014, we examine a few more
possible approaches to take advantage of medical concepts. We use MetaMap to
recognize medical concepts in documents and queries.
MetaMap identifies concepts in a text (document or query). From the concept
IDs identified (CUI, Concept Unique Identifier), we can further obtain the
corresponding concept word sequences (SUI, String Unique Identifier). Our experiments
test several ways to exploit either CUIs or SUIs. In particular, we focus on query
expansion using concepts, as query expansion has been shown to be relatively
effective in previous experiments on medical IR.</p>
    </sec>
    <sec id="sec-2">
      <title>METHODS</title>
      <p>Let us first describe the bag-of-words baseline method to which our methods
will be compared. Then we will describe how concepts are determined and used
in our approaches.</p>
      <p><bold>Baseline.</bold>
As the baseline, we use a traditional approach based on language modeling with
Dirichlet smoothing [23]. We use Indri as the experimental platform for all
the methods. For the baseline method, the score of a document D for a query Q
is determined as follows:
        <disp-formula id="eq-1"><label>(1)</label><tex-math>S(Q, D) = \frac{1}{n} \sum_{i=1}^{n} \log P(q_i \mid D)</tex-math></disp-formula>
where n is the length of the query and P(q_i|D) is adjusted by Dirichlet smoothing:
        <disp-formula id="eq-2"><label>(2)</label><tex-math>P(q_i \mid D) = \frac{tf_{q_i,D} + \mu \, \frac{tf_{q_i,C}}{|C|}}{|D| + \mu}</tex-math></disp-formula>
Here tf_{q_i,D} and tf_{q_i,C} are the frequencies of q_i in the document and in the
collection, |D| is the document length, μ is the Dirichlet smoothing parameter,
C represents the whole collection and |C| is its size. All the terms are
stemmed using the Porter stemmer, and stop words from PubMed are removed.
      </p>
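      <p>As an illustration, the Dirichlet-smoothed query likelihood of Eqs. (1) and (2) can be sketched in a few lines of Python. This is only a minimal sketch and not the Indri implementation used in our runs: the function name, the toy collection and the value of mu are assumptions made for the example, and the floor applied to terms unseen in the whole collection is a convenience of the sketch.</p>
      <preformat>
```python
import math
from collections import Counter


def dirichlet_score(query, doc, collection, mu=2500.0):
    """Average log-likelihood of the query terms under a
    Dirichlet-smoothed document language model (Eqs. 1 and 2).

    query, doc and collection are lists of (already stemmed) tokens;
    mu is the smoothing parameter (the default here is an assumed value).
    """
    doc_tf = Counter(doc)
    coll_tf = Counter(collection)
    doc_len = len(doc)
    coll_len = len(collection)
    total = 0.0
    for term in query:
        p_coll = coll_tf[term] / coll_len               # tf_{q,C} / |C|
        p = (doc_tf[term] + mu * p_coll) / (doc_len + mu)
        if p == 0.0:
            # Term absent from the whole collection: use a tiny floor
            # so that the logarithm stays defined (sketch-only choice).
            p = 1e-12
        total += math.log(p)
    return total / len(query)


# Toy data, purely illustrative.
collection = ("heart attack treatment heart failure drug "
              "atrial fibrillation treatment").split()
doc_a = "atrial fibrillation treatment".split()
doc_b = "heart failure drug".split()
query = ["atrial", "fibrillation"]

score_a = dirichlet_score(query, doc_a, collection, mu=10.0)
score_b = dirichlet_score(query, doc_b, collection, mu=10.0)
print(score_a > score_b)
```
      </preformat>
      <p>In this toy example the document that contains the query terms receives the higher score, while the other document still receives a non-zero probability through the collection model.</p>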
      <p><bold>Concept-based IR.</bold> <italic>Concept identification.</italic> We use the UMLS Metathesaurus release 2012AB as our
resource. A concept is defined as a "meaning". Each meaning is given a CUI
(Concept Unique Identifier). The different synonyms and abbreviations of a
concept are called Terms, each identified by a LUI (Lexical Unique Identifier).
The lexical variants of a term are further distinguished as different Strings,
each identified by a SUI (String Unique Identifier). For example, concept C0004238
corresponds to the meaning atrial fibrillation. Atrial fibrillation
and auricular fibrillation are two synonyms, identified by two
different LUIs, L0004238 and L0004237. Each of these two terms has a singular
and a plural form (without and with the final s). So in UMLS, concept C0004238
corresponds to 4 different SUIs representing its 4 different expression strings, called
SUInames in the Metathesaurus.</p>
      <p>MetaMap is a tool that identifies concepts in a text. Among other
functionalities, MetaMap can identify the CUI corresponding to a concept string.
It can also find all the different string expressions (i.e. SUI names) of this concept.
CUIs and SUI names are the two concept expressions that we used in
our experiments. An example is shown in the figure below.</p>
      <p><italic>Retrieval on concept ID space.</italic> We can view the whole set of concept IDs
as defining a concept space. Both documents and queries can then be represented
as sets of the CUIs that MetaMap has recognized. The ranking score of a document
can be determined by matching the concept IDs using the
language model.</p>
      <p><disp-formula id="eq-3"><label>(3)</label><tex-math>S(Q, D) = S(Q_{CUI}, D_{CUI}) = \frac{1}{n} \sum_{i=1}^{n} \log P(q_{CUI_i} \mid D_{CUI})</tex-math></disp-formula>
It is possible that some of the concepts in documents and queries cannot be
correctly identified by MetaMap. In this case, a more reasonable approach is to
combine the concept-based retrieval with the traditional word-based retrieval.
We implement it as follows:</p>
      <p>
        <disp-formula id="eq-4"><label>(4)</label><tex-math>S(Q \mid D) = \lambda \, S(Q_{orig}, D_{orig}) + (1 - \lambda) \, S(Q_{CUI}, D_{CUI})</tex-math></disp-formula>
<italic>Reformulation with concept SUI names.</italic> A CUI is a very strict expression
of a concept. An alternative expression of a concept is to enumerate all its
SUInames in the Metathesaurus. These SUInames are put into the #syn() operator
of Indri [29], which treats all of the listed expressions as synonyms. We further
test the operators #1(), #uwN(), #uwN+1() and #combine(), which offer different degrees of
flexibility for matching each concept name: #1() matches the terms in parentheses
as an exact phrase; #uwN() and #uwN+1() allow the terms to appear in an unordered
window of size N and N + 1, respectively; #combine() simply drops all dependencies and
treats the terms as a bag of words. This method is denoted by:
      </p>
      <p><disp-formula id="eq-5"><label>(5)</label><tex-math>S(Q \mid D) = S(Q_{suiname}, D_{orig})</tex-math></disp-formula>
Again, the above method can be combined with the word-based approach as
follows:</p>
      <p>
        <disp-formula id="eq-6"><label>(6)</label><tex-math>S(Q \mid D) = \lambda \, S(Q_{orig}, D_{orig}) + (1 - \lambda) \, S(Q_{suiname}, D_{orig})</tex-math></disp-formula>
      </p>
      <p><italic>Query expansion with mutual information.</italic> Term co-occurrence analysis
has been quite successful in traditional IR for determining related terms. Here, we
try to determine related concepts using concept co-occurrences. Two concepts
are considered to be related if they co-occur frequently. The relatedness between
two concepts x and y is measured by Pointwise Mutual Information (PMI):
        <disp-formula id="eq-7"><label>(7)</label><tex-math>pmi(x, y) = \log \frac{p(x, y)}{p(x) \, p(y)}</tex-math></disp-formula>
We found that many of the concepts determined in this way are indeed strongly related. For
example, the concepts related to Sepsis are listed in Figure 3. We can see that
they usually correspond to related drugs, diseases and treatments.</p>
      <fig id="fig-3">
        <label>Fig. 3</label>
        <caption><p>Concepts related to Sepsis.</p></caption>
        <preformat>blood poison
injectable product solesta
brem hemoglobin mer sur
cilastatin dose mass
glomerulosclerosis intercapillary
concord enterica entericon salmonella ser subsp
adrenergic nerve
aeromonadaceae family organism
injection mitomycin
murexide
blood entity fluidity
gene kdm4b
cystic disease medullary uremic
entire pelvis renal
crotalarias
bougardirey hemoglobin mali substance
abrasive point
factor gamma interferon necrosis tumor
hazebrouck hemoglobin
blanche grange hemoglobin
immunosuppressant macrolide
hemoglobin henri mondor substance
dibromopropamidine product
hemoglobin maputo substance
abnormal blood nd urea
hemoglobin ibadan k
hemoglobin vaasa
gard hemoglobin ty
phosphomannan</preformat>
      </fig>
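      <p>The PMI-based selection of related concepts described above can be sketched as follows. This is a minimal illustration under stated assumptions: probabilities are estimated from document-level co-occurrence counts (the estimation unit is an assumption of the sketch), each document is given as the list of concepts recognized in it, and the function names and the toy concept labels are hypothetical.</p>
      <preformat>
```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(docs):
    """PMI of every concept pair that co-occurs in at least one document.

    Probabilities are document-level relative frequencies:
    p(x) = df(x) / N and p(x, y) = df(x, y) / N.
    """
    n_docs = len(docs)
    df = Counter()
    pair_df = Counter()
    for concepts in docs:
        unique = sorted(set(concepts))
        df.update(unique)
        pair_df.update(combinations(unique, 2))
    table = {}
    for (x, y), n_xy in pair_df.items():
        # log( (n_xy / N) / ((df_x / N) * (df_y / N)) )
        table[(x, y)] = math.log(n_xy * n_docs / (df[x] * df[y]))
    return table


def top_related(table, concept, k=3):
    """Return the k concepts with the highest PMI with `concept`."""
    scored = []
    for (x, y), value in table.items():
        if x == concept:
            scored.append((value, y))
        elif y == concept:
            scored.append((value, x))
    # Highest PMI first; ties are broken alphabetically.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored[:k]]


# Toy concept annotations, purely illustrative.
docs = [
    ["sepsis", "blood poison", "antibiotic"],
    ["sepsis", "blood poison"],
    ["fracture", "pelvis"],
    ["sepsis", "antibiotic", "fracture"],
]
table = pmi_table(docs)
print(top_related(table, "sepsis", k=2))
```
      </preformat>
      <p>The concepts returned by such a procedure are the candidates for expanding the query.</p>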
      <p>In our experiment, the original query is expanded with the top concepts ranked by mutual
information. In addition, the query is further expanded with the SUInames of
the concepts:
        <disp-formula id="eq-8"><label>(8)</label><tex-math>S(Q, D) = \lambda_1 S(Q_{orig}, D_{orig}) + \lambda_2 S(Q_{suiname}, D_{orig}) + \lambda_3 S(Q_{mi}, D_{orig})</tex-math></disp-formula>
with
        <disp-formula id="eq-9"><label>(9)</label><tex-math>\lambda_1 + \lambda_2 + \lambda_3 = 1</tex-math></disp-formula>
      </p>
      <p><italic>Markov Random Field model.</italic> In addition to taking synonyms into account,
we also consider dependencies between the words within a concept. The Markov Random
Field (MRF) model [22] can be used to account for dependencies between words.
By default, one can assume that there is a dependency between two adjacent
query words. Many experimental results have shown that this model works better
than the traditional bag-of-words method. When concepts are identified, it is
possible to assume dependencies only within a concept, and we believe that
this could be a better approach than the default model. The MRF model contains
three components. The first component is the traditional unigram language
model. The second component is an ordered model, in which the words of a concept are required
to appear together and in order. This can be implemented in Indri as follows:
        <disp-formula id="eq-10"><label>(10)</label><tex-math>P(q_{orderedConcept} \mid D) = \frac{tf_{\#1(q_1, q_2, \ldots, q_k), D} + \mu \, \frac{tf_{\#1(q_1, q_2, \ldots, q_k), C}}{|C|}}{|D| + \mu}</tex-math></disp-formula>
where tf_{#1(q_1, q_2, ..., q_k),D} is the frequency of the ordered concept in the document, and
k is the length of the concept.
      </p>
      <p>The third component is an unordered model, in which the words within a
concept can appear in any order within a text window.</p>
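      <p>
        <disp-formula id="eq-11"><label>(11)</label><tex-math>P(q_{unorderedConcept} \mid D) = \frac{tf_{\#uw_{k+1}(q_1, q_2, \ldots, q_k), D} + \mu \, \frac{tf_{\#uw_{k+1}(q_1, q_2, \ldots, q_k), C}}{|C|}}{|D| + \mu}</tex-math></disp-formula>
where tf_{#uwk+1(q_1, q_2, ..., q_k),D} is the frequency of the words in a window of size
k + 1.<sup>2</sup>
      </p>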
      <p>
        Based on the above probabilities, we can define S(q_orderedConcept, D) and
S(q_unorderedConcept, D). The final score is a combination of these three models:
        <disp-formula id="eq-12"><label>(12)</label><tex-math>S(Q, D) = \lambda_1 S(Q_{word}, D) + \lambda_2 S(q_{orderedConcept}, D) + \lambda_3 S(q_{unorderedConcept}, D)</tex-math></disp-formula>
where
        <disp-formula id="eq-13"><label>(13)</label><tex-math>\lambda_1 + \lambda_2 + \lambda_3 = 1</tex-math></disp-formula>
The model defined above is compared to the default MRF model, in which any
two adjacent query words are assumed to be dependent (sequential dependence
model).
      </p>
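      <p>To make the difference between the two MRF variants concrete, the sketch below builds the corresponding Indri query strings from the #weight(), #combine(), #1() and #uwN() operators. It is an illustration only and not necessarily the exact queries of our runs: the helper names and the weights (0.8, 0.1, 0.1) are assumptions, and using a window of k+1 for adjacent word pairs in the sequential dependence variant is also an assumption of the sketch.</p>
      <preformat>
```python
def sdm_query(words, weights=(0.8, 0.1, 0.1)):
    """Sequential dependence: every pair of adjacent query words is dependent."""
    pairs = [words[i:i + 2] for i in range(len(words) - 1)]
    return _mrf_query(words, pairs, weights)


def concept_mrf_query(words, concepts, weights=(0.8, 0.1, 0.1)):
    """Concept dependence: only the words inside one concept are dependent."""
    groups = [c for c in concepts if len(c) > 1]
    return _mrf_query(words, groups, weights)


def _mrf_query(words, groups, weights):
    """Combine the unigram, ordered and unordered components."""
    w_uni, w_ord, w_unord = weights
    parts = ["%s #combine(%s)" % (w_uni, " ".join(words))]
    if groups:
        # Ordered component: each group must appear as an exact phrase.
        ordered = " ".join("#1(%s)" % " ".join(g) for g in groups)
        # Unordered component: window of size k + 1 for a group of k words.
        unordered = " ".join(
            "#uw%d(%s)" % (len(g) + 1, " ".join(g)) for g in groups)
        parts.append("%s #combine(%s)" % (w_ord, ordered))
        parts.append("%s #combine(%s)" % (w_unord, unordered))
    return "#weight(%s)" % " ".join(parts)


words = ["open", "pelvic", "fracture", "convalescence"]
concepts = [["open", "pelvic", "fracture"], ["convalescence"]]
print(concept_mrf_query(words, concepts))
print(sdm_query(words))
```
      </preformat>
      <p>With concept dependence, only the three-word concept contributes phrase and window features, whereas the sequential dependence variant also links "fracture" and "convalescence", which do not belong to the same concept.</p>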
    </sec>
    <sec id="sec-3">
      <title>EXPERIMENT</title>
      <p>The data set for Task 3 consists of a set of documents in the medical domain,
provided by the Khresmoi project. Each document contains #Uid, #date, #url
and #content fields. We convert the collection into TREC style. In the content
part, we remove all comments, CSS and JavaScript parts, and all HTML tags.
Only the remaining textual content is indexed. Each query contains &lt;title&gt;,
&lt;desc&gt; and &lt;discharge_summary&gt; fields. We use the short title queries.</p>
      <p>The following 12 methods (runs) are tested:</p>
      <list list-type="order">
        <list-item><p>Baseline (submitted as GRIUM_EN_Run1).</p></list-item>
        <list-item><p>SUIname query, grouped by the #1() operator.</p></list-item>
        <list-item><p>SUIname query expansion, grouped by the #1() operator.</p></list-item>
        <list-item><p>SUIname query expansion, grouped by the #uwN() operator.</p></list-item>
        <list-item><p>SUIname query expansion, grouped by the #uwN+1() operator (submitted as GRIUM_EN_Run5).</p></list-item>
        <list-item><p>SUIname query expansion, grouped by the #combine() operator.</p></list-item>
        <list-item><p>Manual SUIname query expansion, grouped by the #combine() operator. Concepts are identified manually.</p></list-item>
        <list-item><p>Pure CUI query retrieved in CUI documents.</p></list-item>
        <list-item><p>CUI query expansion; each document also contains two fields, &lt;original&gt; and &lt;cui&gt; (submitted as GRIUM_EN_Run7).</p></list-item>
        <list-item><p>Top mutual information and SUIname query expansion (submitted as GRIUM_EN_Run6).</p></list-item>
        <list-item><p>Markov Random Field baseline with bigram and biterm.</p></list-item>
        <list-item><p>Markov Random Field with concept dependence.</p></list-item>
      </list>
      <p>Only 4 of them (those with the run IDs) have been submitted.</p>
      <p><sup>2</sup> We only use k+1 as the window size in our experiments, although other sizes could also be used.</p>
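      <p>As an illustration of the SUIname expansion runs listed above, the following sketch assembles an Indri query in which the SUInames of each concept are wrapped in a proximity operator, grouped with #syn(), and interpolated with the original query as in Eq. (6). The helper names, the value of lam and the reading of N as the number of words in a SUIname are assumptions of this sketch; the #combine() variant is left out because its nesting is not specified here.</p>
      <preformat>
```python
def concept_syn(names, op="#1"):
    """Group all SUInames of one concept as Indri synonyms.

    op is "#1" (exact phrase), "#uwN" or "#uwN+1" (unordered window whose
    size is the number of words in the name, or that number plus one).
    """
    variants = []
    for name in names:
        n_words = len(name.split())
        if op == "#1":
            variants.append("#1(%s)" % name)
        elif op == "#uwN":
            variants.append("#uw%d(%s)" % (n_words, name))
        elif op == "#uwN+1":
            variants.append("#uw%d(%s)" % (n_words + 1, name))
        else:
            raise ValueError("unsupported operator: %s" % op)
    return "#syn(%s)" % " ".join(variants)


def expanded_query(original_words, concepts, op="#1", lam=0.5):
    """Interpolate the original query (weight lam) with the SUIname part."""
    sui_part = " ".join(concept_syn(names, op) for names in concepts)
    return "#weight(%s #combine(%s) %s #combine(%s))" % (
        lam, " ".join(original_words), round(1.0 - lam, 6), sui_part)


query = expanded_query(
    ["atrial", "fibrillation"],
    [["atrial fibrillation", "auricular fibrillation"]],
    op="#uwN+1", lam=0.5)
print(query)
```
      </preformat>
      <p>Switching op between "#1", "#uwN" and "#uwN+1" changes only how strictly each SUIname has to match, which is the difference between the corresponding runs.</p>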
    </sec>
    <sec id="sec-4">
      <title>RESULT</title>
      <p>The experimental results are summarized in Fig. 4.<sup>3</sup></p>
      <table-wrap id="fig-4">
        <label>Fig. 4</label>
        <caption><p>Result of 12 runs evaluated by clef2014t3.qrels.test.binary.</p></caption>
        <table>
          <thead>
            <tr><th>Submit Run ID</th><th>Run</th><th>Method</th><th>MAP</th><th>P@10</th><th>R-prec</th></tr>
          </thead>
          <tbody>
            <tr><td>Run1</td><td>Run 1</td><td>Baseline</td><td>0.3945</td><td>0.7180</td><td>0.4201</td></tr>
            <tr><td></td><td>Run a</td><td>#1(SUIname) query</td><td>0.2717</td><td>0.5680</td><td>0.3042</td></tr>
            <tr><td></td><td>Run b</td><td>#1(SUIname) query expansion</td><td>0.3916</td><td>0.6900</td><td>0.4217</td></tr>
            <tr><td></td><td>Run c</td><td>#uwN(SUIname) query expansion</td><td>0.4055</td><td>0.7500</td><td>0.4279</td></tr>
            <tr><td>Run5</td><td>Run 5</td><td>#uwN+1(SUIname) query expansion</td><td>0.4069</td><td>0.7420</td><td>0.4283</td></tr>
            <tr><td></td><td>Run e</td><td>#combine(SUIname) query expansion</td><td>0.4112</td><td>0.7140</td><td>0.4286</td></tr>
            <tr><td></td><td>Run f</td><td>#combine(manual SUIname) query expansion</td><td>0.4185</td><td>0.7540</td><td>0.4306</td></tr>
            <tr><td></td><td>Run g</td><td>CUI query</td><td>0.2276</td><td>0.4920</td><td>0.2692</td></tr>
            <tr><td>Run7</td><td>Run 7</td><td>CUI expansion</td><td>0.3495</td><td>0.6540</td><td>0.3862</td></tr>
            <tr><td>Run6</td><td>Run 6</td><td>#uwN+1(SUIname) expansion + Mutual-Info expansion</td><td>0.4007</td><td>0.7120</td><td>0.4156</td></tr>
            <tr><td></td><td>Run h</td><td>Markov random field baseline</td><td>0.3999</td><td>0.7320</td><td>0.4175</td></tr>
            <tr><td></td><td>Run i</td><td>Markov random field with concept dependence</td><td>0.3965</td><td>0.7260</td><td>0.4195</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p><sup>3</sup> In order to keep the result comparable with the other runs, we changed the lambda of
GRIUM_EN_Run5 from 5/6 to 1/10. The submitted result was 0.4016 for MAP and 0.7540
for P@10.</p>
      <p>First of all, we observe that the method using only the strict concept space is
less effective than the traditional word-based method. Run g, which uses a CUI
query, leads to a degradation of 42.3% in MAP compared to the baseline. If we simply
compare the "bag-of-words" and "bag-of-concepts" methods, the bag-of-words
approach is certainly more flexible as a retrieval framework.</p>
      <p>The result is far from what was expected. This means that the concept mapping
procedure is still the bottleneck of the concept-based approach. Unfortunately, the
mapping process is much more complicated than it seems. The definition of a
concept itself is not clear. An important hypothesis behind the notion of "concept" is that "a meaning"
should correspond to only one concept. But in fact, in UMLS a meaning can
be represented by a single accurate concept or be broken down into smaller
concepts. For example, in query 36, for the meaning open pelvic fracture, we have
4 choices:</p>
      <list list-type="order">
        <list-item><p>{Open fracture of pelvis}</p></list-item>
        <list-item><p>{Fractures, Open} and {Pelvis}</p></list-item>
        <list-item><p>{Open} and {Fracture of pelvis}</p></list-item>
        <list-item><p>{Open} and {Fracture} and {Pelvis}</p></list-item>
      </list>
      <p>This is not simply an ambiguity problem, but also a granularity problem. None of
these choices should be judged as definitely wrong, but their retrieval performance is
different. In Fig. 5, we show the concepts identified using different strategies:</p>
      <table-wrap id="fig-5">
        <label>Fig. 5</label>
        <table>
          <thead>
            <tr><th>Mapping strategy</th><th>Mapped concept expression</th><th>MAP (in Run e)</th></tr>
          </thead>
          <tbody>
            <tr><td>Original query</td><td>Convalescence after an open pelvic fracture and a right superior rami fracture</td><td></td></tr>
            <tr><td>MetaMap</td><td>[Convalescence] [Fractures, Open] [Pelvis] [Open] [Fracture of pelvis] [Right superior] [Branch of plant] [Fracture]</td><td>0.4958</td></tr>
            <tr><td>Broad manual</td><td>[Convalescence] [Fractures, Open] [Pelvis] [Right superior] [Fracture of pubic rami]</td><td>0.3820</td></tr>
            <tr><td>Middle manual</td><td>[Convalescence] [Open fracture of pelvis] [Right superior] [Fracture of pubic rami]</td><td>0.3445</td></tr>
            <tr><td>Narrow manual</td><td>[Convalescence] [Open fracture of pelvis, multiple pubic rami - unstable]</td><td>0.3078</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>the concepts identified by MetaMap, the broad concepts, the narrow concepts and
those at the middle level identified manually from the Metathesaurus, as well as the
corresponding MAP scores. As we can see, the strategy that groups many words
into a very specific concept (Narrow manual) does not produce the best result.
On the contrary, the other strategies that break long concepts into parts work
significantly better. Still, the concepts that we recognize from a text have a
large impact on the final retrieval result. This brings some new challenges for
the mapping task. [28] reported that MetaMap reached 84% in precision and 70%
in recall. However, that evaluation was not done for the purpose of IR. For the 50
test queries, MetaMap identified 88 concepts. A rough evaluation indicates that
only 66% of them, i.e. 58 concepts, seem reasonable for IR. We believe that even
these concepts may not form the best way to do retrieval.</p>
      <p>Knowing that mapping is not always accurate, some compromise solutions
have to be used. Our tests show that at least two such strategies can help to
reduce the impact of wrong mapping.</p>
      <table-wrap>
        <table>
          <thead>
            <tr><th>Run name</th><th>Method</th><th>MAP</th><th>Change vs. Run 1</th><th>Change vs. Run g</th></tr>
          </thead>
          <tbody>
            <tr><td>Run 1</td><td>Baseline</td><td>0.3945</td><td></td><td></td></tr>
            <tr><td>Run g</td><td>CUI query</td><td>0.2276</td><td>-42.3%</td><td></td></tr>
            <tr><td>Run a</td><td>#1(SUIname) query</td><td>0.2717</td><td>-31.1%</td><td>+19.4%</td></tr>
            <tr><td>Run b</td><td>#1(SUIname) expansion</td><td>0.3916</td><td>-0.7%</td><td>+72.1%</td></tr>
            <tr><td>Run e</td><td>#combine(SUIname) expansion</td><td>0.4112</td><td>+4.2%</td><td>+80.7%</td></tr>
            <tr><td>Run f</td><td>#combine(manual SUIname) expansion</td><td>0.4185</td><td>+6.1%</td><td>+83.9%</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>First, the simplest way is to also consider the original query. The concept-based
synonyms are then only treated as a complement to the original query. In our
test, Run b, #1(SUIname) expansion, brought an improvement of 57.2% over a
pure #1(SUIname) query. In Runs c, 5, e and f, the combined query brought an
improvement over the baseline.</p>
      <p>Second, instead of the strict CUI ID, we use SUInames as the expression of a concept. As
we can see in the results, Run a obtained a MAP 19.4% higher than Run g. In
addition, since the names of different concepts can share many words, using
SUInames can further help us retrieve documents on related concepts. That is
why, with the #combine() operator, Run e achieved the best performance among all
11 automatic runs. Our two MRF runs (Run h and Run i) showed in another
way that naive concept-based dependence does not bring any improvement.</p>
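      <p>The relative changes in MAP quoted above can be recomputed directly from the MAP values of the runs. The short sketch below does so with respect to the baseline (Run 1) and to the pure CUI query (Run g); the helper name is ours and introduced only for illustration.</p>
      <preformat>
```python
def relative_change(value, reference):
    """Relative change of `value` with respect to `reference`, in percent."""
    return 100.0 * (value - reference) / reference


# MAP values of the runs discussed in this section.
map_scores = {
    "Run 1": 0.3945,  # baseline
    "Run g": 0.2276,  # CUI query
    "Run a": 0.2717,  # #1(SUIname) query
    "Run b": 0.3916,  # #1(SUIname) expansion
    "Run e": 0.4112,  # #combine(SUIname) expansion
    "Run f": 0.4185,  # #combine(manual SUIname) expansion
}

for run, score in map_scores.items():
    vs_baseline = relative_change(score, map_scores["Run 1"])
    vs_cui = relative_change(score, map_scores["Run g"])
    print("%s  %+.1f%%  %+.1f%%" % (run, vs_baseline, vs_cui))
```
      </preformat>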
    </sec>
    <sec id="sec-5">
      <title>CONCLUSION</title>
      <p>This year in Task 3, we tested several different ways of integrating concept
knowledge. Our results showed that the "bag-of-concepts" approach is less effective than
the "bag-of-words" approach. We further discussed two effective ways of reducing
the impact of incorrect concept mapping: the original query is indispensable, and
SUInames are a more flexible way of using a concept. The mapping performance
is still the bottleneck of the concept-based approach. This is a question that we
will examine in our future research.</p>
      <p>28. PRATT, Wanda and YETISGEN-YILDIZ, Meliha. A study of biomedical concept
identification: MetaMap vs. people. In: AMIA Annual Symposium Proceedings.
American Medical Informatics Association. p. 529. (2003)</p>
      <p>29. STROHMAN, Trevor, METZLER, Donald, TURTLE, Howard, et al. Indri: A
language model-based search engine for complex queries. In: Proceedings of the
International Conference on Intelligent Analysis. p. 2.6. (2005)</p>
      <p>30. MILLER, George A. WordNet: a lexical database for English. Communications of
the ACM, vol. 38, no. 11, p. 39-41. (1995)</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>Liadh</given-names>
            <surname>Kelly</surname>
          </string-name>
          , Lorraine Goeuriot, Hanna Suominen, Tobias Schreck, Gondy Leroy, Danielle L. Mowery, Sumithra Velupillai, Wendy W. Chapman, David Martinez,
          <string-name>
            <given-names>Guido</given-names>
            <surname>Zuccon</surname>
          </string-name>
          and
          <string-name>
            <given-names>Joao</given-names>
            <surname>Palotti</surname>
          </string-name>
          .
          <article-title>Overview of the ShARe/CLEF eHealth Evaluation Lab 2014</article-title>
          .
          <source>Proceedings of CLEF 2014. Lecture Notes in Computer Science (LNCS)</source>
          . Springer. (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>Lorraine</given-names>
            <surname>Goeuriot</surname>
          </string-name>
          , Liadh Kelly,
          <string-name>
            <given-names>Wei</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Joao</given-names>
            <surname>Palotti</surname>
          </string-name>
          , Pavel Pecina, Guido Zuccon, Allan Hanbury, Gareth Jones and
          <string-name>
            <given-names>Henning</given-names>
            <surname>Mueller</surname>
          </string-name>
          .
          <source>ShARe/CLEF eHealth Evaluation Lab</source>
          <year>2014</year>
          ,
          <article-title>Task 3: User-centred health information retrieval</article-title>
          .
          <source>Proceedings of CLEF</source>
          <year>2014</year>
          .
          <article-title>(</article-title>
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Hersh</surname>
            , William R.;
            <given-names>David D.</given-names>
          </string-name>
          <string-name>
            <surname>Hickam</surname>
            ; and
            <given-names>T. J.</given-names>
          </string-name>
          <string-name>
            <surname>Leone</surname>
          </string-name>
          .
          <article-title>Words, concepts, or both: Optimal indexing units for automated information retrieval</article-title>
          . Mark E. Frisse (ed.)
          <source>Proceedings of the 16th Annual Symposium on Computer Applications in Medical Care</source>
          ,
          <fpage>644</fpage>
          -
          <lpage>648</lpage>
          (
          <year>1992</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Srinivasan</surname>
            <given-names>P</given-names>
          </string-name>
          .
          <article-title>Query expansion and MEDLINE</article-title>
          .
          <source>Information Processing and Management</source>
          ,
          <volume>32</volume>
          (
          <issue>4</issue>
          ):
          <fpage>431</fpage>
          -
          <lpage>443</lpage>
          (
          <year>1996</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Aronson</surname>
            ,
            <given-names>A. R.</given-names>
          </string-name>
          , &amp; Rindflesch, T. C.
          <article-title>Query expansion using the UMLS Metathesaurus</article-title>
          .
          <source>In Proceedings of the AMIA Annual Fall Symposium</source>
          . American Medical Informatics Association, p.
          <volume>485</volume>
          (
          <year>1997</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6. BOUDIN, Florian,
          <string-name>
            <given-names>NIE</given-names>
            ,
            <surname>Jian-Yun</surname>
          </string-name>
          , and DAWES,
          <article-title>Martin. Clinical information retrieval using document and PICO structure</article-title>
          . In : Human Language Technologies:
          <article-title>The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics</article-title>
          .
          <source>Association for Computational Linguistics</source>
          , p.
          <fpage>822</fpage>
          -
          <lpage>830</lpage>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>ZHOU</surname>
            , Wei,
            <given-names>YU</given-names>
          </string-name>
          , Clement,
          <string-name>
            <given-names>SMALHEISER</given-names>
            ,
            <surname>Neil</surname>
          </string-name>
          , et al.
          <article-title>Knowledge-intensive conceptual retrieval and passage extraction of biomedical literature</article-title>
          .
          <source>In : Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval. ACM</source>
          . p.
          <fpage>655</fpage>
          -
          <lpage>662</lpage>
          . (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <surname>Dongqing</surname>
          </string-name>
          , et al.
          <article-title>Using discharge summaries to improve information retrieval in clinical domain</article-title>
          .
          <source>Proceedings of the ShARe/CLEF eHealth Evaluation Lab</source>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name><surname>ZUCCON</surname>, <given-names>G.</given-names></string-name>,
          <string-name><surname>KOOPMAN</surname>, <given-names>B.</given-names></string-name>
          and
          <string-name><surname>NGUYEN</surname>, <given-names>A.</given-names></string-name>.
          <article-title>Retrieval of health advice on the web: AEHRC at ShARe/CLEF eHealth evaluation lab task 3</article-title>
          . In:
          <source>Proceedings of CLEF Workshop on Cross-Language Evaluation of Methods, Applications, and Resources for eHealth Document Analysis</source>
          . (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Choi</surname>
            , Sungbin, and
            <given-names>Jinwook</given-names>
          </string-name>
          <string-name>
            <surname>Choi</surname>
          </string-name>
          .
          <article-title>"SNUMedinfo at CLEFeHealth2013 task 3." Proceedings of the ShARe/CLEF eHealth Evaluation Lab (</article-title>
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Bedrick</surname>
            , Steven, and
            <given-names>G.</given-names>
          </string-name>
          <string-name>
            <surname>Sheikhshabbafghi</surname>
          </string-name>
          .
          <article-title>"Lucene, metamap, and language modeling: OHSU at</article-title>
          CLEF eHealth
          <year>2013</year>
          .
          <article-title>" Proceedings of the ShARe/CLEF eHealth Evaluation Lab (</article-title>
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>CALLEJAS</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>MIGUEL</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>WANG</given-names>
            ,
            <surname>Yue</surname>
          </string-name>
          , et al.
          <article-title>Exploiting Domain Thesaurus for Medical Record Retrieval</article-title>
          .
          <source>DELAWARE UNIV NEWARK</source>
          , (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name><surname>OZTURKMENOGLU</surname>, <given-names>Okan</given-names></string-name>
          and
          <string-name><surname>ALPKOCAK</surname>, <given-names>Adil</given-names></string-name>.
          <article-title>DEMIR at TREC Medical: Power of Term Phrases in Medical Text Retrieval</article-title>
          . In: TREC. (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>QI</surname>
            ,
            <given-names>Yanjun</given-names>
          </string-name>
          and
          <string-name>
            <surname>LAQUERRE</surname>
            ,
            <given-names>Pierre-François</given-names>
          </string-name>
          .
          <article-title>Retrieving Medical Records with sennamed: NEC Labs America at TREC 2012 Medical Records Track</article-title>
          . (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
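          15.
          <string-name>
            <surname>KOOPMAN</surname>
            ,
            <given-names>Bevan</given-names>
          </string-name>
          ,
          <string-name>
            <surname>BRUZA</surname>
            ,
            <given-names>Peter</given-names>
          </string-name>
          ,
          <string-name>
            <surname>SITBON</surname>
            ,
            <given-names>Laurianne</given-names>
          </string-name>
          , et al.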
          <article-title>AEHRC &amp; QUT at TREC 2011 Medical Track: a concept-based information retrieval approach</article-title>
          .
          <source>In : Proceedings of the 20th Text REtrieval Conference (TREC 2011)</source>
          .
          <source>National Institute of Standards and Technology (NIST)</source>
          , p.
          <fpage>1</fpage>
          -
          <lpage>7</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
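          16.
          <string-name>
            <surname>KOOPMAN</surname>
            ,
            <given-names>Bevan</given-names>
          </string-name>
          ,
          <string-name>
            <surname>ZUCCON</surname>
            ,
            <given-names>Guido</given-names>
          </string-name>
          ,
          <string-name>
            <surname>NGUYEN</surname>
            ,
            <given-names>Anthony</given-names>
          </string-name>
          , et al.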
          <article-title>Exploiting SNOMED CT concepts and relationships for clinical information retrieval: Australian e-Health Research Centre and Queensland University of Technology at the TREC 2012 Medical Track</article-title>
          . (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>KING</surname>
            ,
            <given-names>Benjamin</given-names>
          </string-name>
          ,
          <string-name>
            <surname>WANG</surname>
            ,
            <given-names>Lijun</given-names>
          </string-name>
          ,
          <string-name>
            <surname>PROVALOV</surname>
            ,
            <given-names>Ivan</given-names>
          </string-name>
          , et al.
          <article-title>Cengage Learning at TREC 2011 Medical Track</article-title>
          . In : TREC. (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>FUJITA</surname>
            ,
            <given-names>Sumio</given-names>
          </string-name>
          .
          <article-title>Revisiting Again Document Length Hypotheses: TREC 2004 Genomics Track Experiments at Patolis</article-title>
          . In : TREC. (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>DARWISH</surname>
            ,
            <given-names>Kareem</given-names>
          </string-name>
          and
          <string-name>
            <surname>MADKOUR</surname>
            ,
            <given-names>Amgad</given-names>
          </string-name>
          .
          <article-title>The GUC Goes to TREC 2004: Using Whole or Partial Documents for Retrieval and Classification in the Genomics Track</article-title>
          . In : TREC. (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
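          20.
          <string-name>
            <surname>KRAAIJ</surname>
            ,
            <given-names>Wessel</given-names>
          </string-name>
          ,
          <string-name>
            <surname>RAAIJMAKERS</surname>
            ,
            <given-names>Stephan</given-names>
          </string-name>
          ,
          <string-name>
            <surname>WEEBER</surname>
            ,
            <given-names>Marc</given-names>
          </string-name>
          , et al.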
          <article-title>MeSH Based Feedback, Concept Recognition and Stacked Classification for Curation Tasks</article-title>
          . In : TREC. (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>LI</surname>
            ,
            <given-names>Jiao</given-names>
          </string-name>
          ,
          <string-name>
            <surname>ZHANG</surname>
            ,
            <given-names>Xian</given-names>
          </string-name>
          ,
          <string-name>
            <surname>ZHANG</surname>
            ,
            <given-names>Min</given-names>
          </string-name>
          , et al.
          <article-title>THUIR at TREC 2004: Genomics Track</article-title>
          . In : TREC. (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>ARONSON</surname>
            ,
            <given-names>Alan R.</given-names>
          </string-name>
          and
          <string-name>
            <surname>RINDFLESCH</surname>
            ,
            <given-names>Thomas C.</given-names>
          </string-name>
          .
          <article-title>Query expansion using the UMLS Metathesaurus</article-title>
          .
          <source>In : Proceedings of the AMIA Annual Fall Symposium</source>
          . American Medical Informatics Association, p.
          <fpage>485</fpage>
          . (
          <year>1997</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>ZHAI</surname>
            ,
            <given-names>ChengXiang</given-names>
          </string-name>
          .
          <article-title>Statistical Language Models for Information Retrieval</article-title>
          .
          <source>Synthesis Lectures on Human Language Technologies</source>
          , Morgan &amp; Claypool Publishers (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>BODENREIDER</surname>
            ,
            <given-names>Olivier</given-names>
          </string-name>
          .
          <article-title>The Unified Medical Language System (UMLS): integrating biomedical terminology</article-title>
          .
          <source>Nucleic acids research</source>
          , vol.
          <volume>32</volume>
          , no suppl
          <issue>1</issue>
          , p.
          <fpage>D267</fpage>
          -
          <lpage>D270</lpage>
          . (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>LIPSCOMB</surname>
            ,
            <given-names>Carolyn E.</given-names>
          </string-name>
          .
          <article-title>Medical subject headings (MeSH)</article-title>
          .
          <source>Bulletin of the Medical Library Association</source>
          , vol.
          <volume>88</volume>
          , no.
          <issue>3</issue>
          , p.
          <fpage>265</fpage>
          (
          <year>2000</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <surname>SPACKMAN</surname>
            ,
            <given-names>Kent A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>CAMPBELL</surname>
            ,
            <given-names>Keith E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>CÔTÉ</surname>
            ,
            <given-names>R. A.</given-names>
          </string-name>
          , et al.
          <article-title>SNOMED RT: a reference terminology for health care</article-title>
          .
          <source>In : Proceedings of the AMIA annual fall symposium. American Medical Informatics Association</source>
          . p.
          <fpage>640</fpage>
          . (
          <year>1997</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27.
          <string-name>
            <surname>ARONSON</surname>
            ,
            <given-names>Alan R.</given-names>
          </string-name>
          .
          <article-title>Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap program</article-title>
          .
          <source>In : Proceedings of the AMIA Symposium</source>
          . American Medical Informatics Association, p.
          <fpage>17</fpage>
          . (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>