<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Merging Search Results Generated by Multiple Query Variants Using Data Fusion</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Nkwebi Motlogelwa</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tebo Leburu-Dingalo</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Edwin Thuma</string-name>
          <email>thumaeg@mopipi.ub.bw</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, University of Botswana</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this paper, we describe the methods deployed in the different runs submitted for our participation in the CLEF eHealth 2018 Task 3: Consumer Health Search Task, IRTask 3: Query Variations. In particular, we deploy data fusion techniques to merge search results generated by multiple query variants. As an improvement, we attempt to alleviate the term mismatch between the queries and the relevant documents by deploying query expansion before merging the results. For our baseline system, we concatenate the multiple query variants for retrieval and then deploy query expansion.</p>
      </abstract>
      <kwd-group>
        <kwd>Query variation</kwd>
        <kwd>Data Fusion</kwd>
        <kwd>Query expansion</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>The high prevalence of the internet has led to an increase in patients and health care providers seeking health-related information online. The sources consulted include social media platforms and web pages owned and operated by diverse entities. Health information seekers can be classified into experts and non-experts (laymen). The key distinction is that experts have rich domain knowledge whereas non-experts have limited or no domain knowledge. Both groups express their information needs by way of queries to search engines. The queries submitted often vary in content due to the diverse backgrounds of the information seekers. The challenge for search engines is thus to return relevant information regardless of the type of query submitted. Cognizant of this, many evaluation campaigns have been launched to enable researchers to share knowledge and, through experiments, develop effective information retrieval systems that cater for this need.</p>
      <p>
        We thus seek to contribute to this effort by participating in one of these campaigns, the CLEF eHealth 2018 Task 3: Consumer Health Search, IRTask 3: Query Variations [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. The campaign is aimed at building search systems that are robust to query variations. This task is a continuation of the previous CLEF eHealth Information Retrieval (IR) tasks that ran in 2013 [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], 2014 [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], 2015 [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], 2016 [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and 2017 [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. In this work, we attempt to attain an effective ranking by merging search results generated by multiple query variants of the same information need using data fusion techniques. In particular, we follow earlier work by Thuma et al. [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], who deployed data fusion techniques to merge search results generated by query variants formulated through the collection enrichment approach using different external resources. Furthermore, we attempt to improve the retrieval effectiveness by alleviating the term mismatch between the queries and the relevant documents through query expansion.
      </p>
      <p>The paper is structured as follows. Section 2 provides background on the algorithms used. Section 3 describes the test collection. Section 4 describes the experimental environment. In Section 5 we provide a description of the different runs submitted. Section 6 presents results and discussion.</p>
    </sec>
    <sec id="sec-2">
      <title>Background</title>
      <p>In this section, we present essential background on the different algorithms used in our experiments. We start by describing the PL2 term weighting model in Section 2.1. We then describe the data fusion techniques used in this study in Section 2.2. We conclude the background by describing the Kullback-Leibler Divergence for Query Expansion in Section 2.3.</p>
      <p>
        2.1 PL2 Divergence From Randomness (DFR) Term Weighting Model
In our experiments, we deploy the PL2 Divergence From Randomness (DFR) term weighting model, which applies term frequency normalisation of a term in a document [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. The relevance score of a document d for a given query Q based on the PL2 DFR term weighting model is expressed as follows:
score_PL2(d, Q) = Σ_{t ∈ Q} qtw · ( tfn · log2(tfn / λ) + (λ − tfn) · log2 e + 0.5 · log2(2π · tfn) )    (1)
where score_PL2(d, Q) is the relevance score of a document d for a given query Q.
      </p>
      <p>λ = tfc / N is the mean and variance of the Poisson distribution, where tfc is the frequency of the term t in the collection C and N is the number of documents in the collection. The normalised query term frequency is given by qtfn = qtf / qtfmax, where qtfmax is the maximum query term frequency among the query terms and qtf is the query term frequency. qtw is the query term weight, given by qtw = qtfn / (tfn + 1), where tfn is the Normalisation 2 of the frequency tf of the term t in a document d, expressed as:</p>
      <p>tfn = tf · log2(1 + b · avg_l / l), (b &gt; 0)    (2)
In the above expression, l is the length of the document d, avg_l is the average document length in the collection and b is a hyper-parameter.</p>
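      <p>To make Equations (1) and (2) concrete, the following minimal sketch scores one query term for one document under PL2. It is an illustrative re-implementation, not Terrier's code; the function and argument names are our own.</p>

```python
import math

def pl2_term_score(tf, doc_len, avg_doc_len, tfc, n_docs, qtf, qtf_max, b=1.0):
    """Illustrative PL2 contribution of one query term t to score_PL2(d, Q)."""
    # Normalisation 2 (Eq. 2): normalise tf by document length.
    tfn = tf * math.log2(1.0 + b * avg_doc_len / doc_len)
    lam = tfc / n_docs                    # mean/variance of the Poisson distribution
    qtw = (qtf / qtf_max) / (tfn + 1.0)   # query term weight
    # Eq. 1: information content of observing tfn occurrences under the Poisson model.
    return qtw * (tfn * math.log2(tfn / lam)
                  + (lam - tfn) * math.log2(math.e)
                  + 0.5 * math.log2(2.0 * math.pi * tfn))
```

      <p>Summing pl2_term_score over all terms of Q yields score_PL2(d, Q); note that rarer terms (smaller tfc, hence smaller λ) contribute more to the score.</p>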
      <p>
        2.2 Data Fusion Techniques
In this work, we postulate that an effective ranking can be attained by merging search results generated by multiple query variants of the same information need. In order to validate this hypothesis, we deploy two different data fusion techniques: CombSUM and Reciprocal Rank (RR). CombSUM is a score aggregation technique, where the score of a document is computed as the sum of the normalised scores received by the document in each individual ranking [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. In this work, we adapted CombSUM to merge search results generated by multiple query variants of the same information need, and we define the score of the final ranking as:
score_CombSUM(d) = Σ_{r(Qi) ∈ R} score_{r(Qi)}(d)    (3)
where score_{r(Qi)}(d) is the score of the document d in the document ranking r(Qi), and R is the set of all the rankings generated by the query variants Qi, i = 1, ..., 7. In the Reciprocal Rank (RR) data fusion technique, the rank of a document in the combined ranking is determined by the sum of the reciprocal ranks received by the document in each of the individual rankings [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. In this work, we define the score of the final ranking after merging search results generated by multiple query variants using RR as:
score_RR(d) = Σ_{r(Qi) ∈ R} 1 / rank_d    (4)
where rank_d is the rank of the document d in the document ranking r(Qi).
      </p>
      <p>
        2.3 Kullback-Leibler (KL) Divergence for Query Expansion
In this study, we deployed the Terrier-4.2 Kullback-Leibler divergence for query expansion to attempt to alleviate the term mismatch between the queries and the relevant documents in the collection being searched. In our deployment, we used the default Terrier settings, where we select the 10 most informative terms from the top 3 documents after a first pass document ranking. The KL divergence for query expansion calculates the information content of a term t in the top-ranked documents as follows [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]:
w(t) = Px(t) · log2( Px(t) / Pn(t) )    (5)
Px(t) = tfx / x    (6)
Pn(t) = tfc / N    (7)
where Px(t) is the probability of t estimated from the top x ranked documents, tfx is the frequency of the query term in the top x ranked documents, tfc is the frequency of the term t in the collection, and N is the number of documents in the collection. The top 10 terms with the highest information content computed by w(t) are then selected and used for query expansion.
      </p>
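      <p>As an illustration of the two fusion techniques above, the sketch below merges per-variant result lists with CombSUM (Equation 3) and RR (Equation 4). The min-max score normalisation is an assumption on our part (the normalisation scheme is not fixed above), and the function names are our own.</p>

```python
from collections import defaultdict

def combsum(rankings):
    """CombSUM (Eq. 3): sum each document's normalised scores over all rankings.

    rankings: one dict per query variant, mapping doc_id to retrieval score.
    Returns (doc_id, fused_score) pairs, best first.
    """
    fused = defaultdict(float)
    for ranking in rankings:
        max_s, min_s = max(ranking.values()), min(ranking.values())
        span = (max_s - min_s) or 1.0
        for doc, score in ranking.items():
            # Min-max normalisation (assumed) before summing.
            fused[doc] += (score - min_s) / span
    return sorted(fused.items(), key=lambda x: x[1], reverse=True)

def reciprocal_rank(rankings):
    """Reciprocal Rank fusion (Eq. 4): sum 1/rank_d over all rankings."""
    fused = defaultdict(float)
    for ranking in rankings:
        ordered = sorted(ranking, key=ranking.get, reverse=True)
        for rank, doc in enumerate(ordered, start=1):
            fused[doc] += 1.0 / rank
    return sorted(fused.items(), key=lambda x: x[1], reverse=True)
```

      <p>With either technique, a document retrieved near the top by many of the 7 query variants accumulates evidence from each ranking and rises in the merged list.</p>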
    </sec>
    <sec id="sec-3">
      <title>Test Collection</title>
      <p>In this section, we describe the test collection used in this study. First, we describe the document collection (corpus) used for indexing and retrieval in Section 3.1. In Section 3.2 we describe the queries used for retrieval.</p>
      <p>3.1 Document Collection
"The document collection used in CLEF 2018 consists of web pages acquired from the CommonCrawl. An initial list of websites was identified for acquisition. The list was built by submitting the CLEF 2018 queries to the Microsoft Bing APIs (through the Azure Cognitive Services) repeatedly over a period of a few weeks, and acquiring the URLs of the retrieved results. The domains of the URLs were then included in the list, except some domains that were excluded for decency reasons (e.g. pornhub.com). The list was further augmented by including a number of known reliable health websites and other known unreliable health websites, from lists previously compiled by health institutions and agencies."1</p>
      <p>
        3.2 Queries
In this study we used queries created from 50 topics, which were identified from queries issued by the general public to the Health On the Net (HON)2 and TRIP3 search services. From each topic, 7 different query variations were created. The first 4 query variations were created by people with no medical knowledge, while the remaining 3 were created by medical experts. Details on how the queries were created can be found in Jimmy et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
    </sec>
    <sec id="sec-4">
      <title>Experimental Setting</title>
      <p>FAQ Retrieval Platform: For all our experiments, we used Terrier-4.2 4, an
open source Information Retrieval (IR) platform. All the documents used in
this study were rst pre-processed before indexing and this involved tokenising
the text and stemming each token using the full Porter stemming algorithm.
Stopword removal was enabled and we used Terrier stopword list. The index
was created using blocks to save positional information with each term. For
query expansion, we used the Terrier-4.2 Kullback-Leibler (KL) Divergence for
query expansion to select the 10 most informative terms from the top 3 ranked
documents.
1 https://sites.google.com/view/clef-ehealth-2018/task-3-consumer-health-search
2 https://hon.ch/en/
3 https://www.tripdatabase.com/
4 www.terrier.org</p>
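      <p>The KL expansion-term selection configured above (10 terms from the top 3 ranked documents) can be sketched as follows, applying Equations (5)-(7) literally as stated in Section 2.3. This is an illustrative re-implementation with hypothetical names; Terrier's internal estimator may differ.</p>

```python
import math

def kl_expansion_terms(top_docs, collection_tf, n_docs, n_terms=10):
    """Select expansion terms by KL information content w(t) (Eq. 5-7).

    top_docs: list of token lists, one per top-ranked document (top 3 here).
    collection_tf: dict mapping term to its collection frequency tfc.
    n_docs: number of documents N in the collection.
    """
    x = len(top_docs)
    # tfx: frequency of each term in the top x ranked documents.
    tfx = {}
    for doc in top_docs:
        for t in doc:
            tfx[t] = tfx.get(t, 0) + 1
    scores = {}
    for t, f in tfx.items():
        px = f / x                           # Eq. 6
        pn = collection_tf[t] / n_docs       # Eq. 7
        scores[t] = px * math.log2(px / pn)  # Eq. 5
    # Keep the n_terms terms with the highest information content.
    return sorted(scores, key=scores.get, reverse=True)[:n_terms]
```

      <p>Terms frequent in the top-ranked documents but rare in the collection as a whole receive the highest w(t) and are appended to the query.</p>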
    </sec>
    <sec id="sec-5">
      <title>Description of the Different Runs</title>
      <p>Term Weighting Model: For all our runs, we used the Terrier-4.2 PL2 Divergence from Randomness (DFR) term weighting model to score and rank the documents in the document collection.
ub-botswana IRTask3 run1: This is our baseline run. We concatenated all the 7 query variants for each information need. Duplicates were not removed, to ensure that a query term appearing in multiple query variants has a higher query term weight (qtw). We then performed retrieval on the document collection using the concatenated queries, ranking the documents with the PL2 term weighting model.
ub-botswana IRTask3 run2: In this run, our aim was to validate the hypothesis that an effective ranking can be attained by merging search results generated by multiple query variants of the same information need. In order to achieve this, we retrieved and ranked the documents in the collection using the 7 query variants for each information need. For each information need, we merged the search results using CombSUM, which we described in Section 2.2.
ub-botswana IRTask3 run3: This is an improvement on our second run, ub-botswana IRTask3 run2. In particular, our aim was to improve the retrieval effectiveness by alleviating the term mismatch between the queries and the relevant documents in the document collection. We deployed query expansion using the KL divergence model before merging the results using CombSUM in an attempt to alleviate the term mismatch.
ub-botswana IRTask3 run4: In this run, we tested the generality of our approach, validating whether an effective ranking can be attained by merging search results generated by multiple query variants of the same information need using a second data fusion technique. In particular, we deployed the Reciprocal Rank (RR) data fusion technique. In the same vein as our third run, ub-botswana IRTask3 run3, we deployed query expansion using the KL divergence model before merging the results using Reciprocal Rank (RR).</p>
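      <p>To illustrate why duplicates are kept in run 1, the sketch below concatenates query variants so that terms shared across variants receive a higher query term frequency qtf, and hence a higher qtw under Equation (1). The helper name and the naive whitespace tokenisation are our own simplifications.</p>

```python
def concatenate_variants(variants):
    """Run 1 baseline: concatenate all query variants, keeping duplicates so
    that a term appearing in several variants gets a higher query term frequency."""
    terms = []
    for v in variants:
        terms.extend(v.lower().split())  # naive whitespace tokenisation
    return terms

# Two hypothetical variants of the same information need: the shared terms
# "side" and "effects" occur twice in the concatenated query, doubling their qtf.
query = concatenate_variants(["flu shot side effects",
                              "influenza vaccine side effects"])
```
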
    </sec>
    <sec id="sec-6">
      <title>Results and Discussion</title>
      <p>These working notes were compiled and submitted before the relevance
judgments were released. Therefore, we were unable to report on our results and
evaluation.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>G.</given-names>
            <surname>Amati</surname>
          </string-name>
          .
          <article-title>Probabilistic Models for Information Retrieval based on Divergence from Randomness</article-title>
          . University of Glasgow, UK,
          <source>PhD Thesis</source>
          , pages
          <volume>1</volume>
          -
          <fpage>198</fpage>
          ,
          <year>June 2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>L.</given-names>
            <surname>Goeuriot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.J.F</given-names>
            <surname>Jones</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Kelly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Leveling</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hanbury</surname>
          </string-name>
          , H. Muller, S. Salantera,
          <string-name>
            <given-names>H.</given-names>
            <surname>Suominen</surname>
          </string-name>
          , and G. Zuccon.
          <source>ShARe/CLEF eHealth Evaluation Lab</source>
          <year>2013</year>
          ,
          <article-title>Task 3: Information Retrieval to Address Patients' Questions when Reading Clinical Reports</article-title>
          .
          <source>In CLEF 2013 Online Working Notes</source>
          , volume
          <volume>8138</volume>
          .
          <string-name>
            <surname>CEUR-WS</surname>
          </string-name>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>L.</given-names>
            <surname>Goeuriot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Kelly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Palotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Pecina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Zuccon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hanbury</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.J.F</given-names>
            <surname>Jones</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>H.</given-names>
            <surname>Mueller</surname>
          </string-name>
          . ShARe/CLEF eHealth Evaluation
          <source>Lab</source>
          <year>2014</year>
          ,
          <article-title>Task 3: UserCentred Health Information Retrieval</article-title>
          .
          <source>In CLEF 2014 Online Working Notes. CEUR-WS</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>L.</given-names>
            <surname>Goeuriot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Kelly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Suominen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Hanlen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Neveol</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Grouin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Palotti</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>Zuccon</surname>
          </string-name>
          .
          <article-title>Overview of the CLEF eHealth Evaluation Lab 2015</article-title>
          .
          <source>In CLEF 2015 - 6th Conference and Labs of the Evaluation Forum. Lecture Notes in Computer Science (LNCS)</source>
          , Springer,
          <year>September 2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Jimmy</surname>
            , G. Zuccon,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Palotti</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <string-name>
            <surname>Goeuriot</surname>
            , and
            <given-names>L.</given-names>
          </string-name>
          <string-name>
            <surname>Kelly</surname>
          </string-name>
          .
          <article-title>Overview of the CLEF 2018 Consumer Health Search Task</article-title>
          .
          <source>In CLEF 2018 Evaluation Labs and Workshop: Online Working Notes. CEUR-WS</source>
          ,
          <year>September 2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>L.</given-names>
            <surname>Kelly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Goeuriot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Suominen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Neveol</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Palotti</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>Zuccon</surname>
          </string-name>
          .
          <source>Overview of the CLEF eHealth Evaluation Lab</source>
          <year>2016</year>
          , pages
          <fpage>255</fpage>
          -
          <fpage>266</fpage>
          . Springer International Publishing, Cham,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>C.</given-names>
            <surname>Macdonald</surname>
          </string-name>
          and
          <string-name>
            <given-names>I.</given-names>
            <surname>Ounis</surname>
          </string-name>
          .
          <article-title>Voting for candidates: Adapting data fusion techniques for an expert search task</article-title>
          .
          <source>In Proceedings of the 15th ACM International Conference on Information and Knowledge Management</source>
          ,
          <source>CIKM '06</source>
          , pages
          <fpage>387</fpage>
          -
          <fpage>396</fpage>
          , New York, NY, USA,
          <year>2006</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>J.</given-names>
            <surname>Palotti</surname>
          </string-name>
          , G. Zuccon, Jimmy,
          <string-name>
            <given-names>P.</given-names>
            <surname>Pecina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lupu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Goeuriot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Kelly</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Hanbury</surname>
          </string-name>
          .
          <article-title>CLEF 2017 Task Overview: The IR Task at the eHealth Evaluation Lab</article-title>
          . In Working Notes of Conference and
          <article-title>Labs of the Evaluation (CLEF) Forum</article-title>
          .
          <source>CEUR Workshop Proceedings</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>V.</given-names>
            <surname>Plachouras</surname>
          </string-name>
          and
          <string-name>
            <surname>I. Ounis.</surname>
          </string-name>
          <article-title>Multinomial randomness models for retrieval with document fields</article-title>
          .
          <source>In Proceedings of the 29th European Conference on IR Research</source>
          , ECIR'
          <volume>07</volume>
          , pages
          <fpage>28</fpage>
          -
          <fpage>39</fpage>
          , Berlin, Heidelberg,
          <year>2007</year>
          . Springer-Verlag.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10. H.
          <string-name>
            <surname>Suominen</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <string-name>
            <surname>Kelly</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <string-name>
            <surname>Goeuriot</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          <string-name>
            <surname>Kanoulas</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <string-name>
            <surname>Azzopardi</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <string-name>
            <surname>Spijker</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Neveol</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <string-name>
            <surname>Ramadier</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Robert</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Palotti</surname>
            , Jimmy, and
            <given-names>G.</given-names>
          </string-name>
          <string-name>
            <surname>Zuccon</surname>
          </string-name>
          .
          <article-title>Overview of the CLEF eHealth Evaluation Lab 2018</article-title>
          .
          <source>In CLEF 2018 - 8th Conference and Labs of the Evaluation Forum. Lecture Notes in Computer Science (LNCS)</source>
          , Springer,
          <year>September 2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11. E.
          <string-name>
            <surname>Thuma</surname>
            ,
            <given-names>O.G.</given-names>
          </string-name>
          <string-name>
            <surname>Tibi</surname>
            , and
            <given-names>G.</given-names>
          </string-name>
          <string-name>
            <surname>Mosweunyane</surname>
          </string-name>
          .
          <article-title>A comparison between selective collection enrichment and results merging in patient centered health information retrieval</article-title>
          .
          <source>International Journal of Computer Applications</source>
          ,
          <volume>180</volume>
          (
          <issue>29</issue>
          ):
          <volume>1</volume>
          -
          <fpage>8</fpage>
          ,
          <string-name>
            <surname>Mar</surname>
          </string-name>
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>