<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>CENTRE@CLEF2018: Overview of the Replicability Task</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Nicola Ferro</string-name>
          <email>ferro@dei.unipd.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maria Maistro</string-name>
          <email>maistro@dei.unipd.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tetsuya Sakai</string-name>
          <email>tetsuyasakai@acm.org</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ian Soboroff</string-name>
          <email>ian.soboroff@nist.gov</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Information Engineering, University of Padua</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>National Institute of Standards and Technology (NIST)</institution>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Waseda University</institution>
          ,
          <country country="JP">Japan</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Reproducibility has become increasingly important in many research areas. Information Retrieval (IR) is no exception and has started to be concerned with reproducibility and its impact on research results. This paper describes our first attempt to propose a lab on reproducibility, named CENTRE and held during CLEF 2018. The aim of CENTRE is to run a reproducibility challenge across all the major IR evaluation campaigns and to provide the IR community with a venue where previous research results can be explored and discussed. This paper reports the participant results and preliminary considerations on the first edition of CENTRE@CLEF 2018, as well as some suggestions for future editions.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Reproducibility is becoming a primary concern in many areas of science [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]
as well as in computer science, as also witnessed by the recent ACM policy on
result and artefact review and badging<sup>4</sup>. Information Retrieval (IR) is especially
interested in reproducibility [
        <xref ref-type="bibr" rid="ref12 ref13 ref34">12, 13, 34</xref>
        ] since it is a discipline strongly rooted
in experimentation where experimental evaluation represents a main driver of
advancement and innovation.
      </p>
      <p>Even if reproducibility has become part of the review forms at major
conferences like SIGIR, this is more a qualitative assessment performed by a reviewer
on the basis of what can be understood from a paper rather than an actual
"proof" of the reproducibility of the experiments reported in the paper. Since
2015, the ECIR conference has run a track focused on the reproducibility of
previously published results. This conference track led to a stable enough flow
of 3-4 reproducibility papers accepted each year but, unfortunately, this
valuable effort did not produce a systematic approach to reproducibility: submitting
authors adopted different notions of reproducibility, they adopted very diverse
experimental protocols, and they investigated the most disparate topics, resulting
in a very fragmented picture of what was reproduced and what was not. Moreover, the
outcomes of these reproducibility papers are spread over a series of potentially
disappearing repositories and Web sites.</p>
      <p><sup>4</sup> https://www.acm.org/publications/policies/artifact-review-badging</p>
      <p>Moreover, if we consider open source IR systems, they are typically used as:
- a starting point by newcomers to the field, who take them almost
off-the-shelf, using the default configuration, to begin gaining experience with IR and/or specific
search tasks;
- a base system on top of which to add a new component/technique one is
interested in developing, keeping all the rest in the default configuration;
- a baseline for comparison, again using the default configuration.</p>
      <p>
        Nevertheless, it has been repeatedly shown that the best TREC systems still
outperform off-the-shelf open source systems [2-4, 23, 24]. This is due to many
different factors, among which is the lack of tuning on a specific collection when using
the default configuration, but it is also caused by the lack of the specific and advanced
components and resources adopted by the best systems. It has also been shown
that additivity is an issue, since adding a component on top of a weak or strong
base does not produce the same level of gain [
        <xref ref-type="bibr" rid="ref23 ref4">4, 23</xref>
        ]. This poses a serious challenge
when off-the-shelf open source systems are used as a stepping stone to test a new
component on top of them, because the gain might appear bigger when starting from
a weak baseline. Overall, the above considerations stress the need and urgency
for a systematic approach to reproducibility in IR.
      </p>
      <p>Therefore, the goal of CENTRE@CLEF 2018<sup>5</sup> is to run a joint task across
CLEF/NTCIR/TREC, challenging participants:
- to reproduce the best results of the best/most interesting systems in previous
editions of CLEF/NTCIR/TREC by using standard open source IR systems;
- to contribute back to the community the additional components and
resources developed to reproduce the results, in order to improve existing open
source systems.</p>
      <p>The paper is organized as follows: Section 2 introduces the setup of the
lab; Section 3 discusses the participation and the experimental outcomes; and
Section 4 draws some conclusions and outlines possible future work.</p>
      <sec id="sec-1-1">
        <title>Tasks</title>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Evaluation Lab Setup</title>
      <p>The CENTRE@CLEF 2018 lab offered two pilot tasks:
- Task 1 - Replicability: the task focused on the replicability of selected
methods on the same experimental collections;
- Task 2 - Reproducibility: the task focused on the reproducibility of selected
methods on different experimental collections.</p>
      <p>We adopted the ACM Artifact Review and Badging definitions of
replicability and reproducibility:
- Replicability (different team, same experimental setup): the measurement can
be obtained with stated precision by a different team using the same
measurement procedure, the same measuring system, under the same operating
conditions, in the same or a different location on multiple trials. For
computational experiments, this means that an independent group can obtain the
same result using the author's own artifacts.
In CENTRE@CLEF 2018 this meant using the same collections, topics and
ground-truth on which the methods and solutions have been developed and
evaluated.
- Reproducibility (different team, different experimental setup): the
measurement can be obtained with stated precision by a different team, a different
measuring system, in a different location on multiple trials. For
computational experiments, this means that an independent group can obtain the
same result using artifacts which they develop completely independently.
In CENTRE@CLEF 2018 this meant using a different experimental
collection, but in the same domain, from those used to originally develop and
evaluate a solution.</p>
      <p><sup>5</sup> http://www.centre-eval.org/clef2018/</p>
      <sec id="sec-2-1">
        <title>Replicability and Reproducibility Targets</title>
        <p>For this first edition of CENTRE we preferred to select the target runs among the
Ad Hoc tasks in previous editions of CLEF, TREC, and NTCIR. We decided to
focus on Ad Hoc retrieval since it is a general and well-known task. Moreover,
the algorithms and the approaches used for Ad Hoc retrieval are widely used as
a basis for all the other types of tasks.</p>
        <p>For each evaluation campaign, CLEF, TREC, and NTCIR, we considered
all the Ad Hoc tracks and we examined the submitted papers and the proposed
approaches. CLEF Ad Hoc tracks were proposed from 2000 to 2008, for TREC
campaign we focused on the Web Track, from 2009 to 20011, while for NTCIR
we checked the WWW track of 2017. To select the target papers among all the
submitted ones we considered the following criteria:
- the popularity of the track, by accounting for the number of participating
groups and the number of submitted runs;
- the impact of the proposed approach, measured by the number of citations
received by the paper;
- the year of publication, since we preferred more recent papers;
- the tools used by the authors: we discarded all those papers that were not
using publicly available retrieval systems such as Lucene, Terrier, Solr, and Indri.</p>
        <p>Below we list the runs selected as targets of replicability and
reproducibility, among which the participants could choose. For each run, we specify the
collection for replicability and the collections for reproducibility; for more
information, the list also provides references to the papers describing those runs as
well as the overviews describing the overall task and collections.</p>
        <p>Since these runs were not originally intended to be used as targets of a
replicability/reproducibility exercise, we contacted the authors of the papers to
inform them and ask for their consent to use the runs.</p>
        <p>- Run: AUTOEN [<xref ref-type="bibr" rid="ref18">18</xref>]
Task type: CLEF Ad Hoc Multilingual Task
Replicability: Multi-8 Two Years On with topics of CLEF 2005 [<xref ref-type="bibr" rid="ref11">11</xref>]
Reproducibility: Multi-8 with topics of CLEF 2003 [<xref ref-type="bibr" rid="ref29 ref6">29, 6</xref>]</p>
        <p>- Run: AH-TEL-BILI-X2EN-CLEF2008.TWENTE.FCW [<xref ref-type="bibr" rid="ref27">27</xref>]
Task type: CLEF Ad Hoc, Bilingual Task
Replicability: TEL English (BL) with topics of CLEF 2008 [<xref ref-type="bibr" rid="ref1">1</xref>]
Reproducibility: TEL French (BNF) and TEL German (ONB) with topics of CLEF 2008 [<xref ref-type="bibr" rid="ref1">1</xref>];
TEL English (BL), TEL French (BNF) and TEL German (ONB) with topics of CLEF 2009 [<xref ref-type="bibr" rid="ref15">15</xref>]</p>
        <p>- Run: AH-TEL-BILI-X2DE-CLEF2008.KARLSRUHE.AIFB_ONB_EN [<xref ref-type="bibr" rid="ref30">30</xref>]
Task type: CLEF Ad Hoc, Bilingual Task
Replicability: TEL German (ONB) with topics of CLEF 2008 [<xref ref-type="bibr" rid="ref1">1</xref>]
Reproducibility: TEL English (BL) and TEL French (BNF) with topics of CLEF 2008 [<xref ref-type="bibr" rid="ref1">1</xref>];
TEL English (BL), TEL French (BNF) and TEL German (ONB) with topics of CLEF 2009 [<xref ref-type="bibr" rid="ref15">15</xref>]</p>
        <p>- Run: UDInfolabWEB2 [<xref ref-type="bibr" rid="ref33">33</xref>]
Task type: TREC Ad Hoc Web Task
Replicability: ClueWeb12 Category A with topics of TREC 2013 [<xref ref-type="bibr" rid="ref9">9</xref>]
Reproducibility: ClueWeb09 Category A and B with topics of TREC 2012 [<xref ref-type="bibr" rid="ref8">8</xref>];
ClueWeb12 Category B with topics of TREC 2013 [<xref ref-type="bibr" rid="ref9">9</xref>];
ClueWeb12 Category A and B with topics of TREC 2014 [<xref ref-type="bibr" rid="ref10">10</xref>]</p>
        <p>- Run: uogTrDwl [<xref ref-type="bibr" rid="ref26">26</xref>]
Task type: TREC Ad Hoc Web Task
Replicability: ClueWeb12 Category A with topics of TREC 2014 [<xref ref-type="bibr" rid="ref10">10</xref>]
Reproducibility: ClueWeb09 Category A and B with topics of TREC 2012 [<xref ref-type="bibr" rid="ref8">8</xref>];
ClueWeb12 Category A and B with topics of TREC 2013 [<xref ref-type="bibr" rid="ref9">9</xref>];
ClueWeb12 Category B with topics of TREC 2014 [<xref ref-type="bibr" rid="ref10">10</xref>]</p>
        <p>- Run: RMIT-E-NU-Own-1 and RMIT-E-NU-Own-3 [<xref ref-type="bibr" rid="ref17">17</xref>]
Task type: NTCIR Ad Hoc Web Task
Replicability: ClueWeb12 Category B with topics of NTCIR-13 [<xref ref-type="bibr" rid="ref25">25</xref>]
Reproducibility: ClueWeb12 Category A with topics of NTCIR-13 [<xref ref-type="bibr" rid="ref25">25</xref>]</p>
        <p>The participants in CENTRE@CLEF 2018 were provided with the corpora
necessary to perform the tasks. In detail, we made the following collections
available:
- Multi-8 Two Years On is a document collection containing documents
written in eight languages. A grand total of nearly 1.4 million documents in
Dutch, English, Finnish, French, German, Italian, Spanish and
Swedish made up the multilingual collection;
- TEL data was provided by The European Library. The collection is divided
into different subsets, each one corresponding to a different language: English,
French and German, each of them containing about one million documents;
- ClueWeb09 consists of about 1 billion web pages in ten languages that were
collected in January and February 2009. ClueWeb09 Category A represents
the whole dataset, while ClueWeb09 Category B consists of the first English
segment of Category A, which is roughly the first 50 million pages of the
entire dataset;
- ClueWeb12 is the successor of ClueWeb09 and consists of roughly 700
million English web pages, collected between February 10, 2012 and May 10,
2012. ClueWeb12 Category A represents the whole dataset, while ClueWeb12
Category B is a uniform 7% sample of Category A.
        </p>
        <p>Table 1 reports the number of documents and the languages of the documents
contained in the provided corpora.</p>
        <p>Finally, Table 2 reports the topics used for the replicability and
reproducibility tasks, with the corresponding number of documents, documents' languages
and pool sizes. An example topic for each evaluation campaign is reported in
Figure 1 for CLEF, Figure 2 for TREC, and Figure 3 for NTCIR.</p>
      </sec>
      <sec id="sec-2-2">
        <title>Evaluation Measures</title>
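        <p>Fig. 3. Example of topic for NTCIR-13, We Want Web Track.</p>
        <p>The quality of the replicability runs has been evaluated from two points of view:</p>
        <p>- Effectiveness: how close the performance scores of the replicated systems are
to those of the original ones. This is measured using the Root Mean Square
Error (RMSE) [<xref ref-type="bibr" rid="ref22">22</xref>] between the new and original Average Precision (AP)
scores:
<disp-formula><tex-math>\mathrm{RMSE} = \sqrt{\frac{1}{m} \sum_{i=1}^{m} \left( AP_{orig,i} - AP_{replica,i} \right)^{2}} \qquad (1)</tex-math></disp-formula>
where m is the total number of topics, AP<sub>orig,i</sub> is the AP score of the original run on the i-th topic,
and AP<sub>replica,i</sub> is the AP score of the replicated run on the same topic.</p>
        <p>- Ranked result lists: how close the ranked result lists of the replicated systems are
to those of the original ones. This is measured using Kendall's τ correlation between the
lists of retrieved documents for each topic, averaged across all the topics:
<disp-formula><tex-math>\tau_{i} = \frac{P - Q}{\sqrt{(P + Q + T)(P + Q + U)}}, \qquad \bar{\tau} = \frac{1}{m} \sum_{i=1}^{m} \tau_{i} \qquad (2)</tex-math></disp-formula>
where m is the total number of topics, P is the total number of concordant
pairs (document pairs that are ranked in the same order in both vectors),
Q is the total number of discordant pairs (document pairs that are ranked
in opposite order in the two vectors), and T and U are the number of ties
in the first and in the second ranking, respectively.
        </p>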
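        <p>As an illustration only, the two measures can be sketched in Python as follows. This is a minimal sketch, not the evaluation code used by the lab: the function names are hypothetical, per-topic AP scores are assumed to be given as two equally long lists, and each pair of rankings is assumed to be expressed as two score (or rank) vectors over the same set of documents.</p>
        <preformat>
```python
import math
from itertools import combinations


def rmse(ap_orig, ap_replica):
    """RMSE between two equally long lists of per-topic AP scores."""
    m = len(ap_orig)
    squared = [(o - r) ** 2 for o, r in zip(ap_orig, ap_replica)]
    return math.sqrt(sum(squared) / m)


def sign(x):
    """Sign of x as -1, 0 or +1."""
    return (x > 0) - (0 > x)


def kendall_tau(x, y):
    """Kendall's tau with tie correction for two equally long vectors.

    P and Q count concordant and discordant pairs; T and U count the
    pairs tied only in the first or only in the second vector.
    """
    p = q = t = u = 0
    for i, j in combinations(range(len(x)), 2):
        sx = sign(x[i] - x[j])
        sy = sign(y[i] - y[j])
        if sx == 0 and sy == 0:
            continue  # tied in both vectors: not counted at all
        elif sx == 0:
            t += 1
        elif sy == 0:
            u += 1
        elif sx == sy:
            p += 1
        else:
            q += 1
    denom = math.sqrt((p + q + t) * (p + q + u))
    return (p - q) / denom if denom else 0.0


def mean_tau(pairs):
    """Average Kendall's tau over a list of (x, y) vector pairs, one per topic."""
    return sum(kendall_tau(x, y) for x, y in pairs) / len(pairs)
```
        </preformat>
        <p>For instance, kendall_tau([1, 2, 3], [3, 2, 1]) returns -1.0, since every pair of documents is ranked in opposite order in the two vectors.</p>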
        <p>Since for the reproducibility runs we do not have an already existing run to
compare against, we planned to compare the reproduced run score with respect
to a baseline run to see whether the improvement over the baseline is comparable
between the original and the new dataset. However, we did not receive any
reproducibility runs, so we could not put this part of the evaluation
into practice.</p>
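        <p>The planned comparison for reproducibility runs could be sketched as below. This is only an assumption about one possible formulation: the text does not state whether absolute or relative improvements were intended, so the sketch uses relative improvement, and both function names are hypothetical.</p>
        <preformat>
```python
def relative_improvement(run_score, baseline_score):
    """Relative gain of a run over its baseline (0.20 means a 20% gain)."""
    return (run_score - baseline_score) / baseline_score


def improvement_gap(orig_run, orig_base, new_run, new_base):
    """Gain on the new collection minus gain on the original collection.

    A value close to zero suggests that the improvement over the
    baseline carries over to the new collection.
    """
    return (relative_improvement(new_run, new_base)
            - relative_improvement(orig_run, orig_base))
```
        </preformat>
        <p>The inputs would be effectiveness scores, for example MAP, of the run and of the baseline on each collection.</p>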
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Participation and Outcomes</title>
      <p>Seventeen groups registered to participate in CENTRE@CLEF 2018 but,
unfortunately, only one group succeeded in submitting a replicability run.</p>
      <p>
        Technical University of Wien (TUW) [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] replicated the run by Cimiano and
Sorg, i.e. AH-TEL-BILI-X2DE-CLEF2008.KARLSRUHE.AIFB_ONB_EN. They
submitted the four runs described in Table 3; all the code they used to replicate the
run is available online at
https://bitbucket.org/centre_eval/c2018_dataintelligence/src/master/
      </p>
      <p>The run AH-TEL-BILI-X2DE-CLEF2008.KARLSRUHE.AIFB_ONB_EN by
Cimiano and Sorg uses a cutoff of 1,000 documents, so it has to be compared
to esalength1000_top1000, which adopts the same cutoff. However, since
TUW also submitted runs for cutoffs of 10 and 100 documents, we compare them
against versions of the run
AH-TEL-BILI-X2DE-CLEF2008.KARLSRUHE.AIFB_ONB_EN capped at 10 and
100 documents per topic.</p>
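      <p>Capping a run at a given number of documents per topic can be sketched as follows. This is a minimal illustration under stated assumptions: the run is assumed to be held in memory as (topic_id, doc_id, score) tuples, higher scores are assumed to rank first, and the function name is hypothetical.</p>
      <preformat>
```python
from collections import defaultdict


def cap_run(run, k):
    """Keep the k highest-scored documents of every topic.

    `run` is a list of (topic_id, doc_id, score) tuples; the result is
    ordered by topic and, within a topic, by decreasing score.
    """
    by_topic = defaultdict(list)
    for topic_id, doc_id, score in run:
        by_topic[topic_id].append((doc_id, score))
    capped = []
    for topic_id in sorted(by_topic):
        ranked = sorted(by_topic[topic_id], key=lambda pair: pair[1], reverse=True)
        capped.extend((topic_id, doc_id, score) for doc_id, score in ranked[:k])
    return capped
```
      </preformat>
      <p>Calling cap_run(run, 10) and cap_run(run, 100) on the original run would give the two reduced versions against which the runs with smaller cutoffs are compared.</p>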
      <p>
        The paper by Cimiano and Sorg [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ] uses Cross-Lingual Explicit Semantic
Analysis (CL-ESA) to leverage Wikipedia articles to deal with multiple
languages in a uniform way.
      </p>
      <p>
        TUW encountered the following issues in replicating the original run:
- the Wikipedia underlying database dump of 2008 was no longer available
and they had to resort to the static HTML dump of Wikipedia from the same
period;
- the above issue caused the processing of Wikipedia articles to be noticeably different
from the original one in [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ], and they had to rely on several heuristics to cope
with HTML;
- they fixed an issue in the Inverse Document Frequency (IDF) computation,
which might result in negative values according to the equation provided
by [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ];
- they had to deal with redirect pages in the static HTML dump of Wikipedia
in order to find links across wiki pages in multiple languages;
- they had to find an alternative interpretation of the language identification
heuristics.
      </p>
      <p>All these issues prevented TUW from successfully replicating the original run.
Indeed, the Mean Average Precision (MAP) of the run by Cimiano and Sorg was
0.0667 while the MAP of the run esalength1000_top1000 by TUW is 0.0030.</p>
      <p>The detailed results at different cutoffs for all the submitted runs are reported
in Table 4. It clearly emerges that all the above-mentioned issues caused the TUW
runs to severely underperform with respect to the original run, and it is hard to
say, in general, to what extent it is possible to replicate it, due to the changes
in the language resources available.</p>
      <p>The difficulties encountered in replicating the run are further stressed by the
RMSE between AH-TEL-BILI-X2DE-CLEF2008.KARLSRUHE.AIFB_ONB_EN
and esalength1000_top1000, computed according to eq. (1), which is 0.1132,
and by the average Kendall's τ correlation among the ranked lists of retrieved
documents, computed according to eq. (2), which is 5.69 × 10<sup>-4</sup>.</p>
      <p>Table 5 reports the results of the comparison with RMSE and Kendall's τ
between the original and all the replicated runs. The RMSE
deteriorates as the cutoff size increases, which is intuitive since it should be easier
to stay closer to the original run when dealing with only the top few documents.</p>
      <p>Finally, Table 6 shows the Kendall's τ correlation between the submitted and
the original runs computed for each single topic. We can observe that the general
trend is to have very low correlations, i.e. very different document rankings,
across the topics, but there are a few exceptions. For example, topic 467-AH
has a correlation of 0.6644 with Explicit Semantic Analysis (ESA) length 1,000
and a cutoff of 10 documents, which suddenly drops to 0.0253 and 0.0038 for
cutoffs of 100 and 1,000, respectively, further stressing the fact that it should be
easier to replicate the very top-k documents. Another interesting example is
topic 490-AH, for which the correlation at ESA length 1,000 and a cutoff of 10 documents
is -0.6170, indicating that the right documents have been retrieved but
they have been ranked in almost reversed order.</p>
    </sec>
    <sec id="sec-4">
      <title>Conclusions and Future Work</title>
      <p>This paper reports the results of the first edition of CENTRE@CLEF 2018. A
total of 17 participants enrolled in the lab; however, just one group managed to
submit a run. As reported in the results section, the group encountered many
substantial issues which prevented them from actually replicating the targeted run,
as described in more detail in their paper [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ].
      </p>
      <p>
        These results support anecdotal evidence in the field about how difficult
it is to actually replicate (and even more reproduce) research results, even in
a field with such a long experimental tradition as IR. However, the lack of
participation is a signal that the community is somehow overlooking this important
issue. As also emerged from a recent survey within the SIGIR community [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ],
while there is a very positive attitude towards reproducibility and it is
considered very important from a scientific point of view, there are many obstacles
to it, such as the effort required to put it into practice, the lack of rewards for
achieving it, the possible barriers for new and inexperienced groups, and, last
but not least, the (somewhat optimistic) researchers' perception that their own
research is already reproducible.
      </p>
      <p>For the next edition of the lab we are planning to propose some changes in
the lab organization to increase the interest and participation of the research
community. First, we will target newer and more popular systems to be
reproduced; moreover, we will consider tasks other than Ad Hoc, for example
in the medical or other popular domains.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Agirre</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Di Nunzio</surname>
            ,
            <given-names>G.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferro</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mandl</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Peters</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          : CLEF 2008:
          <article-title>Ad Hoc Track Overview</article-title>
          . In: Peters,
          <string-name>
            <given-names>C.</given-names>
            ,
            <surname>Deselaers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            ,
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            ,
            <surname>Gonzalo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            ,
            <surname>Jones</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.J.F.</given-names>
            ,
            <surname>Kurimo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <surname>T.</surname>
          </string-name>
          , Peñas, A. (eds.)
          <source>Evaluating Systems for Multilingual and Multimodal Information Access: Ninth Workshop of the Cross-Language Evaluation Forum (CLEF 2008). Revised Selected Papers</source>
          . pp.
          <fpage>15</fpage>
          -
          <lpage>37</lpage>
          . Lecture Notes in Computer Science (LNCS) 5706, Springer, Heidelberg, Germany (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Arguello</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Crane</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Diaz</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Trotman</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Report on the SIGIR 2015 Workshop on Reproducibility, Inexplicability, and Generalizability of Results (RIGOR)</article-title>
          .
          <source>SIGIR Forum</source>
          <volume>49</volume>
          (
          <issue>2</issue>
          ),
          <fpage>107</fpage>
          -
          <lpage>116</lpage>
          (
          <year>December 2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Armstrong</surname>
            ,
            <given-names>T.G.</given-names>
          </string-name>
          , Moffat, A.,
          <string-name>
            <surname>Webber</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zobel</surname>
          </string-name>
          , J.:
          <article-title>Has Adhoc Retrieval Improved Since 1994?</article-title>
          In: Allan,
          <string-name>
            <given-names>J.</given-names>
            ,
            <surname>Aslam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.A.</given-names>
            ,
            <surname>Sanderson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Zhai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            ,
            <surname>Zobel</surname>
          </string-name>
          ,
          <string-name>
            <surname>J</surname>
          </string-name>
          . (eds.)
          <source>Proc. 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2009)</source>
          . pp.
          <fpage>692</fpage>
          -
          <lpage>693</lpage>
          . ACM Press, New York, USA (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Armstrong</surname>
            ,
            <given-names>T.G.</given-names>
          </string-name>
          , Moffat, A.,
          <string-name>
            <surname>Webber</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zobel</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Improvements That Don't Add Up: Ad-Hoc Retrieval Results Since 1998</article-title>
          . In: Cheung,
          <string-name>
            <given-names>D.W.L.</given-names>
            ,
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.Y.</given-names>
            ,
            <surname>Chu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.W.</given-names>
            ,
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            ,
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <surname>J.J</surname>
          </string-name>
          . (eds.)
          <source>Proc. 18th International Conference on Information and Knowledge Management (CIKM 2009)</source>
          . pp.
          <fpage>601</fpage>
          -
          <lpage>610</lpage>
          . ACM Press, New York, USA (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Borri</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nardi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Peters</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferro</surname>
          </string-name>
          , N. (eds.):
          <source>CLEF 2008 Working Notes. CEUR Workshop Proceedings (CEUR-WS.org)</source>
          ,
          <source>ISSN 1613-0073</source>
          , http://ceur-ws.org/Vol-1174/ (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Braschler</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>CLEF 2003 - Overview of Results</article-title>
          . In: Peters et al. [
          <volume>28</volume>
          ], pp.
          <fpage>44</fpage>
          -
          <lpage>63</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Braschler</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Di</given-names>
            <surname>Nunzio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.M.</given-names>
            ,
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            ,
            <surname>Peters</surname>
          </string-name>
          ,
          <string-name>
            <surname>C.</surname>
          </string-name>
          :
          <article-title>CLEF 2004: Ad Hoc Track Overview and Results Analysis</article-title>
          . In: Peters,
          <string-name>
            <given-names>C.</given-names>
            ,
            <surname>Clough</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            ,
            <surname>Gonzalo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            ,
            <surname>Jones</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.J.F.</given-names>
            ,
            <surname>Kluck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Magnini</surname>
          </string-name>
          ,
          <string-name>
            <surname>B</surname>
          </string-name>
          . (eds.)
          <source>Multilingual Information Access for Text, Speech and Images: Fifth Workshop of the Cross–Language Evaluation Forum (CLEF 2004) Revised Selected Papers</source>
          . pp.
          <volume>10</volume>
          –
          <fpage>26</fpage>
          . Lecture Notes in Computer Science (LNCS) 3491, Springer, Heidelberg, Germany (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Clarke</surname>
            ,
            <given-names>C.L.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Craswell</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Voorhees</surname>
            ,
            <given-names>E.M.:</given-names>
          </string-name>
          <article-title>Overview of the TREC 2012 Web Track</article-title>
          . In:
          <string-name>
            <surname>Voorhees</surname>
            ,
            <given-names>E.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Buckland</surname>
            ,
            <given-names>L.P.</given-names>
          </string-name>
          (eds.)
          <source>The Twenty-First Text REtrieval Conference Proceedings (TREC 2012)</source>
          . pp.
          <volume>1</volume>
          –
          <issue>8</issue>
          . National Institute of Standards and Technology (NIST),
          <source>Special Publication 500-298</source>
          , Washington, USA (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Collins-Thompson</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Diaz</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Clarke</surname>
            ,
            <given-names>C.L.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Voorhees</surname>
            ,
            <given-names>E.M.:</given-names>
          </string-name>
          <article-title>TREC 2013 Web Track Overview</article-title>
          . In: Voorhees [31]
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Collins-Thompson</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Macdonald</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bennett</surname>
            ,
            <given-names>P.N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Voorhees</surname>
            ,
            <given-names>E.M.:</given-names>
          </string-name>
          <article-title>TREC 2014 Web Track Overview</article-title>
          . In: Voorhees and Ellis [
          <volume>32</volume>
          ]
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>Di</given-names>
            <surname>Nunzio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.M.</given-names>
            ,
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            ,
            <surname>Jones</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.J.F.</given-names>
            ,
            <surname>Peters</surname>
          </string-name>
          ,
          <string-name>
            <surname>C.</surname>
          </string-name>
          :
          <article-title>CLEF 2005: Ad Hoc Track Overview</article-title>
          . In: Peters,
          <string-name>
            <given-names>C.</given-names>
            ,
            <surname>Gey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.C.</given-names>
            ,
            <surname>Gonzalo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            ,
            <surname>Jones</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.J.F.</given-names>
            ,
            <surname>Kluck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Magnini</surname>
          </string-name>
          ,
          <string-name>
            <surname>B.</surname>
          </string-name>
          , Muller, H., de Rijke, M. (eds.) Accessing Multilingual Information Repositories: Sixth Workshop of the Cross{
          <article-title>Language Evaluation Forum (CLEF</article-title>
          <year>2005</year>
          ).
          <source>Revised Selected Papers</source>
          . pp.
          <volume>11</volume>
          –
          <fpage>36</fpage>
          . Lecture Notes in Computer Science (LNCS) 4022, Springer, Heidelberg, Germany (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Ferro</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          :
          <article-title>Reproducibility Challenges in Information Retrieval Evaluation</article-title>
          .
          <source>ACM Journal of Data and Information Quality (JDIQ)</source>
          <volume>8</volume>
          (
          <issue>2</issue>
          ), 8:1–8:4
          (
          <year>February 2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Ferro</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fuhr</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          , Jarvelin,
          <string-name>
            <given-names>K.</given-names>
            ,
            <surname>Kando</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            ,
            <surname>Lippold</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Zobel</surname>
          </string-name>
          , J.:
          <article-title>Increasing Reproducibility in IR: Findings from the Dagstuhl Seminar on “Reproducibility of Data-Oriented Experiments in e-Science”</article-title>
          .
          <source>SIGIR Forum</source>
          <volume>50</volume>
          (
          <issue>1</issue>
          ),
          <volume>68</volume>
          –82 (
          <year>June 2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Ferro</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kelly</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>SIGIR Initiative to Implement ACM Artifact Review and Badging</article-title>
          .
          <source>SIGIR Forum</source>
          <volume>52</volume>
          (
          <issue>1</issue>
          ) (
          <year>June 2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Ferro</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Peters</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>CLEF 2009 Ad Hoc Track Overview: TEL &amp; Persian Tasks</article-title>
          . In: Peters,
          <string-name>
            <given-names>C.</given-names>
            ,
            <surname>Di Nunzio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.M.</given-names>
            ,
            <surname>Kurimo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            ,
            <surname>Mostefa</surname>
          </string-name>
          ,
          <string-name>
            <surname>D.</surname>
          </string-name>
          , Peñas,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Roda</surname>
          </string-name>
          ,
          <string-name>
            <surname>G</surname>
          </string-name>
          . (eds.)
          <source>Multilingual Information Access Evaluation Vol. I Text Retrieval Experiments – Tenth Workshop of the Cross–Language Evaluation Forum (CLEF 2009). Revised Selected Papers</source>
          . pp.
          <volume>13</volume>
          –
          <fpage>35</fpage>
          . Lecture Notes in Computer Science (LNCS) 6241, Springer, Heidelberg, Germany (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Freire</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fuhr</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rauber</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (eds.):
          <article-title>Report from Dagstuhl Seminar 16041: Reproducibility of Data-Oriented Experiments in e-Science</article-title>
          .
          <source>Dagstuhl Reports</source>
          , Volume
          <volume>6</volume>
          , Number
          <issue>1</issue>
          , Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Germany
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Gallagher</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mackenzie</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Benham</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>R.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Scholer</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Culpepper</surname>
            ,
            <given-names>J.S.:</given-names>
          </string-name>
          <article-title>RMIT at the NTCIR-13 We Want Web Task</article-title>
          . In: Kando et al. [
          <volume>20</volume>
          ], pp.
          <volume>402</volume>
          –
          <fpage>406</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Guyot</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Radhouani</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Falquet</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Ontology-Based Multilingual Information Retrieval</article-title>
          . In: Peters,
          <string-name>
            <given-names>C.</given-names>
            ,
            <surname>Quochi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            ,
            <surname>Ferro</surname>
          </string-name>
          , N. (eds.)
          <source>CLEF 2005 Working Notes. CEUR Workshop Proceedings (CEUR-WS.org)</source>
          ,
          <source>ISSN 1613-0073</source>
          , http://ceur-ws.org/Vol-1171/ (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Jungwirth</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hanbury</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Replicating an Experiment in Cross-lingual Information Retrieval with Explicit Semantic Analysis</article-title>
          . In: Cappellato,
          <string-name>
            <given-names>L.</given-names>
            ,
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            ,
            <surname>Nie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.Y.</given-names>
            ,
            <surname>Soulier</surname>
          </string-name>
          ,
          <string-name>
            <surname>L</surname>
          </string-name>
          . (eds.)
          <source>CLEF 2018 Working Notes. CEUR Workshop Proceedings (CEUR-WS.org)</source>
          ,
          <source>ISSN 1613-0073</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Kando</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fujita</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kato</surname>
            ,
            <given-names>M.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manabe</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          (eds.):
          <source>Proc. 13th NTCIR Conference on Evaluation of Information Access Technologies. National Institute of Informatics</source>
          , Tokyo, Japan (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Kendall</surname>
            ,
            <given-names>M.G.</given-names>
          </string-name>
          :
          <article-title>Rank correlation methods</article-title>
          .
          <source>Griffin</source>
          , Oxford, England (
          <year>1948</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Kenney</surname>
            ,
            <given-names>J.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Keeping</surname>
            ,
            <given-names>E.S.</given-names>
          </string-name>
          : Mathematics of Statistics – Part One. D. Van Nostrand Company, Princeton, USA, 3rd edn. (
          <year>1954</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Kharazmi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Scholer</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vallet</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sanderson</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Examining Additivity and Weak Baselines</article-title>
          .
          <source>ACM Transactions on Information Systems (TOIS)</source>
          <volume>34</volume>
          (
          <issue>4</issue>
          ), 23:1–23:18
          (
          <year>June 2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Crane</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Trotman</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Callan</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chattopadhyaya</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Foley</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ingersoll</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Macdonald</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vigna</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Toward Reproducible Baselines: The Open-Source IR Reproducibility Challenge</article-title>
          . In: Ferro,
          <string-name>
            <given-names>N.</given-names>
            ,
            <surname>Crestani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            ,
            <surname>Moens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.F.</given-names>
            ,
            <surname>Mothe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            ,
            <surname>Silvestri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            ,
            <surname>Di Nunzio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.M.</given-names>
            ,
            <surname>Hauff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            ,
            <surname>Silvello</surname>
          </string-name>
          ,
          <string-name>
            <surname>G</surname>
          </string-name>
          . (eds.)
          <source>Advances in Information Retrieval. Proc. 38th European Conference on IR Research (ECIR 2016)</source>
          . pp.
          <volume>357</volume>
          –
          <fpage>368</fpage>
          . Lecture Notes in Computer Science (LNCS) 9626, Springer, Heidelberg, Germany (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>Luo</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sakai</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dou</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xiong</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Overview of the NTCIR-13 We Want Web Task</article-title>
          . In: Kando et al. [
          <volume>20</volume>
          ], pp.
          <volume>394</volume>
          –
          <fpage>401</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <surname>McCreadie</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Deveaud</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Albakour</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mackie</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Limsopatham</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Macdonald</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ounis</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thonet</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>University of Glasgow at TREC 2014: Experiments with Terrier in Contextual Suggestion, Temporal Summarisation and Web Tracks</article-title>
          . In: Voorhees and Ellis [
          <volume>32</volume>
          ]
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27.
          <string-name>
            <surname>Nguyen</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Overwijk</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hauff</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Trieschnigg</surname>
            ,
            <given-names>D.R.</given-names>
          </string-name>
          ,
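          <string-name>
            <surname>Hiemstra</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>de Jong</surname>
            ,
            <given-names>F.M.G.</given-names>
          </string-name>
          :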
          <article-title>WikiTranslate: Query Translation for Cross-lingual Information Retrieval Using only Wikipedia</article-title>
          . In: Borri et al. [
          <volume>5</volume>
          ]
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          28.
          <string-name>
            <surname>Peters</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Braschler</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gonzalo</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kluck</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (eds.):
          <source>Comparative Evaluation of Multilingual Information Access Systems: Fourth Workshop of the Cross–Language Evaluation Forum (CLEF 2003) Revised Selected Papers. Lecture Notes in Computer Science (LNCS) 3237</source>
          , Springer, Heidelberg, Germany (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          29.
          <string-name>
            <surname>Savoy</surname>
            ,
            <given-names>J.:</given-names>
          </string-name>
          <article-title>Report on CLEF-2003 Multilingual Tracks</article-title>
          . In: Peters et al. [
          <volume>28</volume>
          ], pp.
          <volume>64</volume>
          –
          <fpage>73</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          30.
          <string-name>
            <surname>Sorg</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cimiano</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Cross-lingual Information Retrieval with Explicit Semantic Analysis</article-title>
          .
          In: Borri et al. [5]
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          31.
          <string-name>
            <surname>Voorhees</surname>
            ,
            <given-names>E.M.</given-names>
          </string-name>
          (ed.):
          <source>The Twenty-Second Text REtrieval Conference Proceedings (TREC 2013)</source>
          . National Institute of Standards and Technology (NIST),
          <source>Special Publication 500-302</source>
          , Washington, USA (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          32.
          <string-name>
            <surname>Voorhees</surname>
            ,
            <given-names>E.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ellis</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (eds.):
          <source>The Twenty-Third Text REtrieval Conference Proceedings (TREC 2014)</source>
          . National Institute of Standards and Technology (NIST),
          <source>Special Publication 500-308</source>
          , Washington, USA (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          33.
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fang</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Evaluating the Effectiveness of Axiomatic Approaches in Web Track</article-title>
          . In: Voorhees [31]
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          34.
          <string-name>
            <surname>Zobel</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Webber</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sanderson</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moffat</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Principles for Robust Evaluation Infrastructure</article-title>
          . In: Agosti,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            ,
            <surname>Thanos</surname>
          </string-name>
          , C. (eds.)
          <source>Proc. Workshop on Data infrastructurEs for Supporting Information Retrieval Evaluation (DESIRE 2011)</source>
          . pp.
          <volume>3</volume>
          –
          <issue>6</issue>
          . ACM Press, New York, USA (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>