<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Post-Analysis of Query Reformulation Methods for Clinical Trials Retrieval</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>DISCUSSION PAPER</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maristella Agosti</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giorgio Maria Di Nunzio</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Stefano Marchesin</string-name>
          <email>stefano.marchesing@unipd.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Information Engineering</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Mathematics, University of Padua</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The Precision Medicine (PM) track of the Text REtrieval Conference (TREC) focuses on providing useful precision medicine information to clinicians treating cancer patients. The PM track gives the unique opportunity to evaluate medical IR systems on two different collections: scientific literature and clinical trials. In this paper, we evaluate several state-of-the-art query expansion and reduction methods to see whether a particular approach can be helpful in clinical trials retrieval. We present those approaches that are consistently effective in all three TREC PM editions and we compare them to the results obtained by the research groups who participated in all three editions.</p>
      </abstract>
      <kwd-group>
        <kwd>Query reformulation</kwd>
        <kwd>knowledge base</kwd>
        <kwd>precision medicine</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Medical Information Retrieval (IR) helps a wide variety of users to access and
search medical information archives and data [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. In [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], a classification of textual
medical information is proposed: 1) Patient-specific information, which applies
to individual patients. This type of information can be structured, as in the
case of an Electronic Health Record (EHR), or can be free narrative text. 2)
Knowledge-based information, which has been derived and organized from
observational or experimental research. In the case of clinical research, the information
is most commonly provided by books and journals, but can take a wide
variety of other forms. Therefore, the design of effective tools to access and search
textual medical information requires, among other things, enhancing the query
through expansion and/or rewriting methods that leverage the information
contained within knowledge resources. Sondhi et al. [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] identified some challenges arising from
the differences between general and medical case-based retrieval. In particular,
state-of-the-art retrieval methods, combined with selective query term weighting
based on medical thesauri and physician feedback, improve performance
significantly [
        <xref ref-type="bibr" rid="ref16 ref5">16, 5</xref>
        ]. In 2017, 2018, and 2019, the PM [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] track<sup>1</sup> at TREC<sup>2</sup> focused
on an important use case in clinical decision support: providing useful
precision medicine information to clinicians treating cancer patients. This track gives
a unique opportunity to evaluate medical IR systems, since the experimental
collection is composed of a set of topics (synthetic cases created by precision
oncologists) for two different collections that target two different tasks: 1)
retrieving biomedical articles addressing relevant treatments for a given patient,
and 2) retrieving clinical trials for which a patient (described in the information
need) is eligible.
      </p>
      <p>Copyright © 2020 for this paper by its authors. Use permitted under Creative
Commons License Attribution 4.0 International (CC BY 4.0). This volume is published
and copyrighted by its editors. SEBD 2020, June 21-24, 2020, Villasimius, Italy.</p>
      <p>
        This work combines and discusses the methodology and the results originally
presented at SIGIR 2019 [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and TREC 2019 [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. The objective is to evaluate
several state-of-the-art query expansion and reduction methods to examine whether
a particular approach can be helpful in clinical trials retrieval. Precisely, we
compare the results obtained with our approach to the best experiments submitted
to the 2017 and 2018 PM tracks [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Then, we select the top three query
reformulations found in the 2017 and 2018 PM tracks and evaluate whether their
effectiveness also holds in the 2019 PM track [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. We conduct a systematic
comparison between our approach and those proposed by the research groups that
participated in all three years of TREC PM. The analysis shows the effectiveness
of the proposed query reformulations in the 2017 and 2018 PM tracks and confirms
the trend in the 2019 PM track. The obtained runs achieve top-performing
results in all PM tracks [<xref ref-type="bibr" rid="ref11 ref12 ref13">11–13</xref>]. In particular, a specific query reformulation allows
the retrieval system to achieve top results in all three PM tracks.
      </p>
      <p>The rest of the paper is organized as follows: Section 2 describes the approach
used to evaluate different query reformulation methods. Section 3 presents the
experimental setup, and Section 4 compares the results obtained using our
approach with those obtained by the other research groups who participated in
TREC PM 2017, 2018 and 2019. Finally, Section 5 reports some final remarks.</p>
    </sec>
    <sec id="sec-2">
      <title>Approach</title>
      <p>The approach we propose consists of four steps: (i) indexing, (ii) query
reformulation, (iii) retrieval and (iv) filtering.</p>
      <p>Indexing. We create the following fields to index clinical trials collections:
&lt;docid&gt;, &lt;text&gt;, &lt;max_age&gt;, &lt;min_age&gt; and &lt;gender&gt;. Fields &lt;max_age&gt;,
&lt;min_age&gt; and &lt;gender&gt; contain information extracted from the eligibility
section of clinical trials and are used in the filtering step. The &lt;text&gt; field
contains the entire content of each clinical trial.</p>
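      <p>As a rough illustration of the indexing step, the sketch below builds one record with the five fields from a trial description. It is not the authors' Whoosh code: the XML element names (id_info/nct_id, eligibility, minimum_age, maximum_age, gender), the TrialRecord class, and the unit conversion table are assumptions made for this example only.</p>
      <preformat><![CDATA[
```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional

# Assumed conversion factors from an age unit to years (illustrative only).
_UNIT_TO_YEARS = {"year": 1.0, "month": 1.0 / 12, "week": 1.0 / 52, "day": 1.0 / 365}


@dataclass
class TrialRecord:
    docid: str
    text: str
    min_age: Optional[float]  # None when the trial states no lower bound
    max_age: Optional[float]  # None when the trial states no upper bound
    gender: Optional[str]     # "male", "female", or None when open to all


def parse_age(raw: Optional[str]) -> Optional[float]:
    """Convert strings such as '18 Years' to years; 'N/A' or missing gives None."""
    parts = (raw or "").strip().lower().split()
    if len(parts) != 2 or not parts[0].isdigit():
        return None
    factor = _UNIT_TO_YEARS.get(parts[1].rstrip("s"))
    return int(parts[0]) * factor if factor is not None else None


def parse_gender(raw: Optional[str]) -> Optional[str]:
    """Keep only an explicit restriction; anything else means no restriction."""
    value = (raw or "").strip().lower()
    return value if value in ("male", "female") else None


def build_record(xml_source: str) -> TrialRecord:
    """Build the five indexed fields from one trial description."""
    root = ET.fromstring(xml_source)
    elig = root.find("eligibility")

    def get(tag: str) -> Optional[str]:
        return elig.findtext(tag) if elig is not None else None

    return TrialRecord(
        docid=root.findtext("id_info/nct_id", default=""),
        # The text field holds the entire textual content of the trial.
        text=" ".join(t.strip() for t in root.itertext() if t.strip()),
        min_age=parse_age(get("minimum_age")),
        max_age=parse_age(get("maximum_age")),
        gender=parse_gender(get("gender")),
    )
```
]]></preformat>
      <p>Storing missing bounds as None, rather than as sentinel numbers, keeps the later filtering step explicit about which criteria a trial actually specifies.</p>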
      <sec id="sec-2-1">
        <title>1 http://www.trec-cds.org/</title>
      </sec>
      <sec id="sec-2-2">
        <title>2 https://trec.nist.gov/</title>
        <p>Query Reformulation. The approach relies on two types of query
reformulation methods: query expansion and query reduction.</p>
        <p>
          Query expansion: We perform a knowledge-based a priori query expansion.
First, we rely on MetaMap [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ], a state-of-the-art medical concept extractor, to
extract from each query field all the Unified Medical Language System (UMLS)<sup>3</sup>
concepts belonging to the following semantic types:<sup>4</sup> Neoplastic Process (neop),
Gene or Genome (gngm) and Cell or Molecular Dysfunction (comd). The
gngm and comd semantic types are related to the query &lt;gene&gt; field, while
neop is related to the &lt;disease&gt; field. For those collections where an
additional &lt;other&gt; field is included (which considers other potential factors that
may be relevant), MetaMap is used on &lt;other&gt; with no restriction on the
semantic types, as its content does not refer to any particular semantic type.
Second, for each extracted concept, we consider all its name variants contained
in the following knowledge sources: National Cancer Institute (NCI), Medical
Subject Headings (MeSH), SNOMED CT (SNOMEDCT) and UMLS
Metathesaurus (MTH). All knowledge sources are manually curated and up-to-date. The
expanded queries consist of the union of the original terms with the set of name
variants. For example, consider a query containing only the word "melanoma",
which is mapped to the UMLS concept C0025202. The set of name variants for
the concept "melanoma" contains, among many others: cutaneous melanoma,
malignant melanoma, malignant melanoma (disorder). Therefore, the final
expanded query is the union of the original term "melanoma" with all its name
variants. Additionally, we expand queries that do not mention any kind of blood
cancer (e.g. "lymphoma" or "leukemia") with the term solid. This expansion
proved to be effective in [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ], where the authors found that a large part of relevant
clinical trials do not mention the exact disease. A more general term like solid
tumor is preferable and more effective.
        </p>
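        <p>The expansion described above can be sketched as a union of the original terms with their name variants, followed by the optional "solid" term. The NAME_VARIANTS table below is a toy stand-in for what MetaMap and the UMLS knowledge sources would return; it, the function name, and the substring test for blood cancers are assumptions of this example.</p>
        <preformat><![CDATA[
```python
# Toy stand-in for the name variants returned by MetaMap/UMLS (hypothetical data
# limited to the melanoma example given in the text).
NAME_VARIANTS = {
    "melanoma": [
        "cutaneous melanoma",
        "malignant melanoma",
        "malignant melanoma (disorder)",
    ],
}

# Terms whose presence suppresses the "solid" expansion.
BLOOD_CANCER_TERMS = ("lymphoma", "leukemia")


def expand_query(terms, variants=NAME_VARIANTS, add_solid=True):
    """Return the original terms followed by their name variants (no duplicates)."""
    expanded = list(terms)
    seen = {t.lower() for t in terms}
    for term in terms:
        for variant in variants.get(term.lower(), []):
            if variant.lower() not in seen:
                seen.add(variant.lower())
                expanded.append(variant)
    # Add "solid" only when no blood cancer is mentioned in the original terms.
    mentions_blood_cancer = any(
        blood in term.lower() for term in terms for blood in BLOOD_CANCER_TERMS
    )
    if add_solid and not mentions_blood_cancer and "solid" not in seen:
        expanded.append("solid")
    return expanded
```
]]></preformat>
        <p>Keeping the original terms first makes it easy to assign them full weight at retrieval time and a lower weight to everything appended afterwards.</p>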
        <p>
          Query reduction: We reduce original queries by removing, whenever present,
gene mutations from the &lt;gene&gt; field. To clarify: consider a topic where the
&lt;gene&gt; field mentions "BRAF (V600E)". After reduction, the &lt;gene&gt; field
becomes "BRAF". The reduction aims at mitigating the over-specificity of topics,
since the information contained in a topic is too specific compared to that
contained in the target documents [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. Additionally, we remove the &lt;other&gt;
field from those collections that include it, since it contains additional factors
that are not necessarily relevant, thus representing a potential source of noise in
retrieving precise information for patients.
        </p>
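        <p>A minimal sketch of the reduction follows, assuming that mutations appear in parentheses after the gene name, as in the "BRAF (V600E)" example; topics that encode variants differently would need additional rules. The dictionary-based topic representation and the function names are hypothetical.</p>
        <preformat><![CDATA[
```python
import re

# Matches a parenthesised mutation such as " (V600E)", including leading spaces.
_MUTATION = re.compile(r"\s*\([^)]*\)")


def reduce_gene_field(gene: str) -> str:
    """Drop parenthesised mutations, keeping only the gene names."""
    return _MUTATION.sub("", gene).strip()


def reduce_topic(topic: dict) -> dict:
    """Return a copy of the topic without the other field and without mutations."""
    reduced = {key: value for key, value in topic.items() if key != "other"}
    if "gene" in reduced:
        reduced["gene"] = reduce_gene_field(reduced["gene"])
    return reduced
```
]]></preformat>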
        <p>
          Retrieval. We use BM25 [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] as retrieval model. Query terms obtained through
query expansion are weighted lower than 1.0 to avoid introducing too much noise
in the retrieval process [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ].
        </p>
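        <p>To make the effect of down-weighting concrete, the sketch below scores tokenized documents with a textbook BM25 formula in which each query term carries a multiplicative weight. It is a generic variant written for illustration, not the scoring code of the search library used in the experiments; the IDF form, the single-token query terms, and the function name are assumptions.</p>
        <preformat><![CDATA[
```python
import math
from collections import Counter


def bm25_scores(docs, weighted_query, k1=1.2, b=0.75):
    """Score each document (a non-empty list of token lists) for a weighted query.

    weighted_query maps a term to its weight: 1.0 for original terms,
    a smaller value for terms added by query expansion.
    """
    n_docs = len(docs)
    avgdl = sum(len(doc) for doc in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for term, weight in weighted_query.items():
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avgdl)
            score += weight * idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```
]]></preformat>
        <p>With a query such as {"melanoma": 1.0, "solid": 0.1}, a document matching only the expansion term still receives a positive score, but it ranks below documents matching the original term.</p>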
      </sec>
      <sec id="sec-2-3">
        <title>3 https://www.nlm.nih.gov/research/umls/</title>
      </sec>
      <sec id="sec-2-4">
        <title>4 https://metamap.nlm.nih.gov/SemanticTypesAndGroups.shtml</title>
        <p>Filtering. The eligibility section in clinical trials comprises three important
demographic aspects that a patient needs to satisfy to be considered eligible for
the trial, namely: minimum age, maximum age and gender; where minimum age
and maximum age are the minimum and the maximum age, respectively, required
for a patient to be considered eligible for the trial, while gender is the required
gender. After the retrieval step, we filter out from the list of candidate trials
those for which a patient is not eligible, i.e. his/her demographic data (age and
gender) does not satisfy the three aforementioned eligibility criteria. In those
cases where part of the demographic data is not specified, a clinical trial is
kept or discarded on the basis of the remaining demographic information. For
instance, if the clinical trial does not specify a required minimum age, then it is
kept or discarded based on its maximum age and gender required values.</p>
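        <p>The filtering rule can be sketched as follows, with unspecified criteria represented as None so that they never exclude a patient. The dictionary keys and function names are illustrative assumptions, not taken from the original implementation.</p>
        <preformat><![CDATA[
```python
def is_eligible(patient_age, patient_gender, min_age=None, max_age=None, gender=None):
    """Check the three demographic criteria; a criterion set to None is skipped."""
    if min_age is not None and patient_age < min_age:
        return False
    if max_age is not None and patient_age > max_age:
        return False
    if gender is not None and gender != patient_gender:
        return False
    return True


def filter_trials(ranked_trials, patient_age, patient_gender):
    """Keep, in ranked order, only the trials for which the patient is eligible."""
    return [
        trial
        for trial in ranked_trials
        if is_eligible(
            patient_age,
            patient_gender,
            trial.get("min_age"),
            trial.get("max_age"),
            trial.get("gender"),
        )
    ]
```
]]></preformat>
        <p>Because the filter only removes entries, the relative order produced by the retrieval step is preserved.</p>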
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Experimental Setup</title>
      <p>This section describes the experimental collections and the setup used to apply
and evaluate our approach.</p>
      <p>Experimental Collections. We report the main information related to topics
and document collections below.</p>
      <p>Topics consist of 30, 50, and 40 synthetic cases created by precision
oncologists in 2017, 2018, and 2019, respectively. In 2017, each topic contained four
key elements in a semi-structured format: (1) disease (e.g. a type of cancer),
(2) genetic variants (primarily present in tumors), (3) demographic information
(e.g. age, gender), and (4) other factors (which could impact certain treatment
options). In 2018 and 2019, each topic had three of the four key elements used
in 2017: (1) disease, (2) genetic variants, and (3) demographic information.
Furthermore, the 2019 topics contain ten non-cancer-related topics.</p>
      <p>Clinical Trials consist of a set of 241,006 clinical trial descriptions, for both
2017 and 2018, and of an updated version of 306,238 descriptions for 2019. The
collections are derived from ClinicalTrials.gov,<sup>5</sup> a database of privately and
publicly funded clinical studies conducted around the world. When none of the
available treatments are effective on oncology patients, the common recourse is
to determine if any potential treatments are undergoing evaluation in a clinical
trial. Therefore, it would be helpful to automatically identify the most relevant
clinical trials for an individual patient. Precision oncology trials typically use
a certain treatment for a certain disease with a specific genetic variant. Such
trials can have complex inclusion and/or exclusion criteria that are challenging
to match with automated systems.</p>
      <p>
        Experimental Procedure. We use Whoosh,<sup>6</sup> a Python search engine library,
for indexing, retrieval, and filtering. For BM25, we keep the default values k1 =
1.2 and b = 0.75, as we found them to be a good combination [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
For query expansion, we rely on MetaMap to extract and disambiguate concepts
from UMLS. Below we report the procedure used for each experiment.
      </p>
      <sec id="sec-3-1">
        <title>5 https://clinicaltrials.gov/</title>
      </sec>
      <sec id="sec-3-2">
        <title>6 https://whoosh.readthedocs.io/en/latest/intro.html</title>
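        <list list-type="bullet">
          <list-item>
            <p>Indexing</p>
            <list list-type="bullet">
              <list-item><p>Index clinical trials using the following created fields: &lt;docid&gt;, &lt;text&gt;, &lt;max_age&gt;, &lt;min_age&gt; and &lt;gender&gt;.</p></list-item>
            </list>
          </list-item>
          <list-item>
            <p>Query reformulation</p>
            <list list-type="bullet">
              <list-item><p>Use MetaMap to extract from each query field the UMLS concepts restricted to the following semantic types: neop for &lt;disease&gt;, gngm/comd for &lt;gene&gt;, and all for &lt;other&gt;;</p></list-item>
              <list-item><p>Extract from concepts all name variants belonging to NCI, MeSH, SNOMED CT and MTH knowledge sources;</p></list-item>
              <list-item><p>Expand (or not) topics that do not mention "lymphoma" or "leukemia" with the term solid;</p></list-item>
              <list-item><p>Reduce (or not) queries by removing, whenever present, gene mutations from the &lt;gene&gt; field;</p></list-item>
              <list-item><p>Remove (or not) the &lt;other&gt; field.</p></list-item>
            </list>
          </list-item>
          <list-item>
            <p>Retrieval</p>
            <list list-type="bullet">
              <list-item><p>2017/2018 PM track: Adopt any combination of the reformulation strategies;</p></list-item>
              <list-item><p>2019 PM track: Adopt the three best reformulation strategies from 2017/2018 PM tracks;</p></list-item>
              <list-item><p>Weight expanded terms with a value k = 0.1;</p></list-item>
              <list-item><p>Perform a search using expanded queries with BM25.</p></list-item>
            </list>
          </list-item>
          <list-item>
            <p>Filtering</p>
            <list list-type="bullet">
              <list-item><p>Filter out clinical trials for which the patient is not eligible.</p></list-item>
            </list>
          </list-item>
        </list>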
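        <p>One way to picture "any combination of the reformulation strategies" is as a grid of yes/no switches. The sketch below enumerates such a grid; the flag names follow the feature labels used in the results discussion, while the constraint that the other field can only be expanded when it is used, and the grid itself, are assumptions of this example rather than a description of the exact runs performed.</p>
        <preformat><![CDATA[
```python
from itertools import product

# Yes/no switches that apply to every collection.
COMMON_FLAGS = ["neop", "comd", "gngm", "orig", "solid"]


def enumerate_configs(has_other: bool):
    """List candidate reformulation configurations as dictionaries of switches."""
    if has_other:
        # (oth, oth_exp): expansion of the other field requires using the field.
        other_options = [(False, False), (True, False), (True, True)]
    else:
        # Collections without the other field: the two switches are not applicable.
        other_options = [(None, None)]
    configs = []
    for values in product([False, True], repeat=len(COMMON_FLAGS)):
        for oth, oth_exp in other_options:
            config = dict(zip(COMMON_FLAGS, values))
            config["oth"] = oth
            config["oth_exp"] = oth_exp
            configs.append(config)
    return configs
```
]]></preformat>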
        <p>Evaluation Measures. We use the measures adopted in the TREC PM tracks,
namely inferred nDCG (infNDCG), R-precision (Rprec) and P@10.</p>
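        <p>For reference, two of these measures can be computed from a ranked list and a set of relevant documents as sketched below. The function names are illustrative; infNDCG is omitted because it depends on the sampled relevance judgments distributed with the track.</p>
        <preformat><![CDATA[
```python
def precision_at_k(ranking, relevant, k=10):
    """Fraction of the top-k ranked documents that are relevant (P@k)."""
    return sum(1 for doc in ranking[:k] if doc in relevant) / k


def r_precision(ranking, relevant):
    """Precision at rank R, where R is the number of relevant documents."""
    r = len(relevant)
    if r == 0:
        return 0.0
    return sum(1 for doc in ranking[:r] if doc in relevant) / r
```
]]></preformat>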
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Experimental Results and Discussion</title>
      <p>In Table 1, we report the results of our experiments on query reformulation
(Part A) and compare them with the results obtained by the research groups
that participated in TREC PM 2017, 2018 and 2019 (Part B). Given the large
number of experiments we performed, we present only the 5 runs
with the highest P@10 for each year. Each line shows a particular combination
(yes or no values) of semantic types (neop, comd, gngm), usage and expansion
of the &lt;other&gt; field (oth, oth exp), query reduction (orig), and expansion using
the weighted solid (tumor) keyword. We use the symbol '–' to indicate that the
features oth and oth exp are not applicable for the years 2018 and 2019 due to the
absence of the &lt;other&gt; field in 2018 and 2019 topics. We highlight in bold the top
3 scores for each measure, and we use the symbol ‡ to indicate the combination
that performs well in all three years. For the TREC PM participants, we select
those participants who submitted runs in all three years and reached the top
10 performing runs in at least one edition for each measure [<xref ref-type="bibr" rid="ref11 ref12 ref13">11–13</xref>]. The results
reported in Part B of Table 1 indicate the best score obtained by a particular
run for a specific measure; the best results of a participant are often related to
different runs. The symbol '–' means that the measure is not available, while '&lt;'
indicates that none of the runs submitted by the participant achieved the top 10
performing runs. For the sake of comparison, we add for each measure the lowest
score required to enter the top 10 TREC results list, and the score obtained by
the best combination of our approach (indicated by the line number), as if we
were participants of these tracks.</p>
      <p>Table 1. Part A reports the results achieved using the five most effective query reformulations for each year. (‡)
indicates a particular query reformulation effective in all three years. Part B
(bottom) reports the results obtained by participant runs, along with the lowest
score required to enter the top 10 TREC results list and the score obtained by
the best combination of our approach. Further details are reported in Section 4.</p>
      <p>
Analysis of Query Reformulations. The results from Table 1 (Part A)
highlight that the use of solid expansions with weight 0.1 as well as query gene
reductions (orig = n) seems to improve performance consistently in 2017: two of
the three best runs in terms of P@10 (lines 1 and 2) apply both techniques.
Regarding knowledge-based expansions, the semantic type gngm (lines 1 and
5) seems more effective than neop (line 3), while comd does not seem to have
any positive effect at all. None of the five runs considers the other field (oth =
n) or its expansion (oth exp = n), confirming our intuition that it might
represent a potential source of noise in retrieving precise information for patients.
Similarly to 2017, two of the best three runs of 2018 use no knowledge-based
expansions and rely on the solid (tumor) expansion with weight 0.1 (lines 7 and
9). In particular, the runs combining query gene reductions and solid expansions
(marked as ‡) provide the best performances for all the measures considered, both
in 2017 and 2018. This suggests that removing highly specialized information (i.e.
the gene mutation) or adding general terms (e.g. solid) benefits the retrieval.
A possible reason is related to the nature of the document collections, since
clinical trials often contain general requirements to allow patients to enroll. The
results obtained in 2019 with the top three query reformulations from 2017 and
2018 confirm this trend. The run combining query gene reductions and solid
expansions (line 13‡) is one of the top three runs of 2019; however, two query
reformulations from 2017 (line 14) and 2018 (line 11) provide better performance.
This result shows how difficult the task is. In fact, even though we found a
particular query reformulation approach (marked as ‡) to be highly effective in
all three years, especially in 2017 and 2018, it was not the best approach for
2019. Therefore, this analysis helps researchers to select an effective subset of
query reformulations to build strong baselines for clinical trials retrieval.</p>
      <p>
Comparison with TREC PM Participants. The results from Table 1 (Part
B) mark a clear division between the 2017 and 2018 tracks and the 2019 track. In
2017 and 2018, most of the participant runs did not reach the top 10 threshold in
any of the considered measures { the only exception is the research group from
Poznan University of Technology, whose best runs always belong to the top 10
performing runs for the track. Conversely, in 2019 all the participant groups
submitted runs that achieved results higher than the top 10 threshold.</p>
      <p>Comparing these with the results obtained using the query reformulations from
Table 1 (Part A), we can see that all runs employing the best query reformulations
obtain results higher than the top 10 threshold for all the considered measures
in all three years. Furthermore, the runs using the top five query reformulations
achieved consistently better results than participant runs for each measure in all
three years. This is an indication of the robustness of our approach across the
different collections and also of the effectiveness of the proposed query
reformulations for clinical trials retrieval. In particular, it is worth mentioning that
the runs using the (‡) query reformulation achieve performances that belong to
the top three best performing systems of each year's PM track [<xref ref-type="bibr" rid="ref11 ref12 ref13">11–13</xref>]. Therefore,
the trend observed in the analysis of query reformulations on the 2017 and 2018 PM tracks
was confirmed in the 2019 PM track and allowed us to identify a specific set of
query reformulations beneficial for the retrieval of clinical trials.</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusions and Final Remarks</title>
      <p>
        In this paper, we further elaborate the results originally presented at SIGIR
2019 [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and TREC 2019 [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] to evaluate several query expansion and reduction
methods and to see whether a particular approach can be helpful in clinical trials
retrieval. The experimental analysis showed the effectiveness of the proposed
query reformulations in the 2017 and 2018 PM tracks, and we confirmed this positive
trend in the 2019 PM track. The obtained runs achieve top-performing results
in all PM tracks [<xref ref-type="bibr" rid="ref11 ref12 ref13">11–13</xref>]. In particular, the run marked as ‡ in Table 1 can be
considered a valid baseline to build stronger multi-stage systems in the future.
      </p>
      <p>Acknowledgements. This work is partially supported by the ExaMode project,
as part of the European Union H2020 research and innovation program under
grant agreement no. 825292.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Agosti</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Di</given-names>
            <surname>Nunzio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.M.</given-names>
            ,
            <surname>Marchesin</surname>
          </string-name>
          ,
          <string-name>
            <surname>S.</surname>
          </string-name>
          : The University of Padua IMS Research Group at
          <article-title>TREC 2018 PM Track</article-title>
          .
          <source>In: Proc. TREC</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Agosti</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Di</given-names>
            <surname>Nunzio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.M.</given-names>
            ,
            <surname>Marchesin</surname>
          </string-name>
          ,
          <string-name>
            <surname>S.:</surname>
          </string-name>
          <article-title>An Analysis of Query Reformulation Techniques for Precision Medicine</article-title>
          .
          <source>In: Proc. ACM SIGIR Conf</source>
          . pp.
          <volume>973</volume>
          –
          <issue>976</issue>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Aronson</surname>
            ,
            <given-names>A.R.:</given-names>
          </string-name>
          <article-title>Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap program</article-title>
          .
          <source>In: Proc. AMIA Symposium</source>
          . pp.
          <volume>17</volume>
          –
          <issue>21</issue>
          (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>Di</given-names>
            <surname>Nunzio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.M.</given-names>
            ,
            <surname>Marchesin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            ,
            <surname>Agosti</surname>
          </string-name>
          ,
          <string-name>
            <surname>M.</surname>
          </string-name>
          :
          <article-title>Exploring how to combine query reformulations for precision medicine</article-title>
          .
          <source>In: Proc. TREC</source>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Diao</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          , et alii:
          <source>The Research of Query Expansion Based on Medical Terms Reweighting in Medical IR. EURASIP J. Wireless Comm.&amp;Networ. (1)</source>
          ,
          <volume>105</volume>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Goeuriot</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jones</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kelly</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          , Muller, H.,
          <string-name>
            <surname>Zobel</surname>
          </string-name>
          , J.: Medical Information Retrieval:
          <article-title>Introduction to the Special Issue</article-title>
          . Inform Retrieval J.
          <volume>19</volume>
          (
          <issue>1</issue>
          ), 1–
          <issue>5</issue>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Goodwin</surname>
            ,
            <given-names>T.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Skinner</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Harabagiu</surname>
            ,
            <given-names>S.M.</given-names>
          </string-name>
          : UTD HLTRI at TREC 2017:
          <article-title>Precision medicine track</article-title>
          .
          <source>In: Proc. TREC</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Gurulingappa</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Toldo</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schepers</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bauer</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Megaro</surname>
          </string-name>
          , G.:
          <article-title>Semi-supervised information retrieval system for clinical decision support</article-title>
          .
          <source>In: Proc. TREC</source>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Hersh</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          :
          <article-title>Information Retrieval: A Health and Biomedical Perspective</article-title>
          .
          <source>Health and Informatics Series</source>
          , Springer-Verlag, New York, NY, USA (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Oleynik</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , et alii:
          <article-title>HPI-DHC at TREC 2018: PM Track</article-title>
          .
          <source>In: Proc. TREC</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Roberts</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          , et alii:
          <article-title>Overview of PM Track</article-title>
          .
          <source>In: Proc. TREC</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Roberts</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          , et alii:
          <article-title>Overview of PM Track</article-title>
          .
          <source>In: Proc. TREC</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Roberts</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          , et alii:
          <article-title>Overview of PM Track</article-title>
          .
          <source>In: Proc. TREC</source>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Robertson</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zaragoza</surname>
          </string-name>
          , H.:
          <article-title>The probabilistic relevance framework: BM25 and beyond</article-title>
          .
          <source>Foundations and Trends® in Information Retrieval</source>
          <volume>3</volume>
          (
          <issue>4</issue>
          ),
          <volume>333</volume>
          –
          <fpage>389</fpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Sondhi</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , et alii:
          <article-title>Leveraging Medical Thesauri and Physician FB for Improving Medical Literature Retrieval for Case Queries</article-title>
          .
          <source>JAMIA</source>
          <volume>19</volume>
          (
          <issue>5</issue>
          ),
          <volume>851</volume>
          –
          <fpage>858</fpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Zhu</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Carterette</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
          </string-name>
          , H.:
          <article-title>Using Large Clinical Corpora for QE in Text-Based Cohort Identification</article-title>
          .
          <source>J. of Biomedical Informatics</source>
          <volume>49</volume>
          ,
          <issue>275</issue>
          –
          <fpage>281</fpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>