<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Living Ranking: from online to real-time information retrieval evaluation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Lamjed Ben Jabeur</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Laure Soulier</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Paul Mousset</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lynda Tamine</string-name>
        </contrib>
      </contrib-group>
      <abstract>
        <p>The Living Labs for Information Retrieval Evaluation (LL4IR) initiative has provided a novel framework for evaluating retrieval models with real users. In this position paper, we propose an extension to the LL4IR framework that enables the evaluation of real-time IR.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
    </sec>
    <sec id="sec-2">
      <title>Extending LL4IR with Living Ranking</title>
      <p>The "Living Ranking" is an extension component that can be plugged into the LL4IR framework, as illustrated in Figure 1. In contrast to offline rankings, which must be provided in advance through the LL4IR participant API, the new component acts as an additional source that supplies the API with rankings generated on-the-fly for each online submitted query, while preserving the initial framework workflow.</p>
      <p>[Figure 1: The LL4IR framework extended with the Living Ranking component, connecting the participant, the API, and the experimental context.]</p>
      <p>To do so, participants must provide a ranking algorithm that can be executed online via the Living Ranking component. Ranking algorithms provided by participants must conform to a standard interface with well-defined input and output formats. For instance, the rankings issued by the Living Ranking component could be structured in the format currently required from participants. We note that this architecture restricts the visibility of real-time submitted queries, and possibly of documents, which avoids bias in algorithm design and gives more credibility to the evaluation results. However, this component might be resource-consuming. One solution could be to execute ranking algorithms on demand, for instance when changes occur in the result set. Such an on-demand strategy may balance efficiency and effectiveness.</p>
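A possible shape for such a standard interface is sketched below. This is purely illustrative: the class and field names (`Query`, `Ranking`, `RankingAlgorithm`) are hypothetical and not part of the LL4IR API; the toy baseline only shows how a participant algorithm could plug into a well-defined input/output contract.

```python
# Illustrative sketch (not the LL4IR API): a standard interface that a
# participant's ranking algorithm could implement so the Living Ranking
# component can execute it online. All names are hypothetical.
from dataclasses import dataclass, field


@dataclass
class Query:
    qid: str    # query identifier assigned by the central API
    text: str   # raw query string submitted online by the real user


@dataclass
class Ranking:
    qid: str
    docids: list = field(default_factory=list)  # document ids, best first


class RankingAlgorithm:
    """Interface every participant algorithm would implement."""

    def rank(self, query: Query, candidates: list) -> Ranking:
        raise NotImplementedError


class FrequencyBaseline(RankingAlgorithm):
    """Toy example: order candidate documents by query-term overlap."""

    def __init__(self, doc_texts):
        self.doc_texts = doc_texts  # docid -> document text

    def rank(self, query, candidates):
        terms = set(query.text.lower().split())

        def score(docid):
            words = self.doc_texts.get(docid, "").lower().split()
            return sum(1 for w in words if w in terms)

        ordered = sorted(candidates, key=score, reverse=True)
        return Ranking(qid=query.qid, docids=ordered)
```

The key point is that the component only needs to call `rank` on each online query; the output format can mirror the ranking format already required from participants.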
      <p>The integration of the Living Ranking component within the LL4IR framework suggests some changes and brings further enhancements, detailed below:
- Framework architecture: Living Ranking should offer a flexible interface so that participants can easily implement their algorithms without requiring a complex infrastructure for all tiers. We suggest implementing algorithms as sandbox-based scripts (e.g., in JavaScript) that support online execution under strict constraints.</p>
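One of the "strict constraints" mentioned above is a hard time budget per query. The following minimal sketch (Unix-only; the function name is hypothetical) shows the idea using a wall-clock alarm; a production sandbox would additionally restrict memory, filesystem, and network access.

```python
# Minimal sketch of executing a participant's ranking function under a strict
# wall-clock budget, enforced with SIGALRM (Unix-only). Name is illustrative,
# not part of the LL4IR framework.
import signal


def run_with_time_budget(func, args=(), budget_s=1):
    """Run func(*args), raising TimeoutError once the budget (in seconds) expires."""
    def _on_alarm(signum, frame):
        raise TimeoutError("ranking algorithm exceeded its time budget")

    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(budget_s)      # arm the wall-clock limit
    try:
        return func(*args)
    finally:
        signal.alarm(0)         # disarm the limit
        signal.signal(signal.SIGALRM, previous)
```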
      <p>- Challenge organization: Since test queries and the produced rankings may not be visible, we suggest introducing a debugging phase with simulated queries prior to uploading ranking algorithms. This would help participants validate the effectiveness and efficiency of their algorithms.</p>
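Such a debugging phase could be sketched as a small validation harness: replay simulated queries against the participant's ranking function and check that each produced ranking is well-formed and returned within the time budget. The function and report format below are assumptions for illustration only.

```python
# Hypothetical sketch of the suggested debugging phase: replay simulated
# queries and check each produced ranking before the algorithm is uploaded.
import time


def validate(rank_fn, simulated_queries, candidates, budget_s=1.0):
    """Return a per-query report of (qid, ok, elapsed_seconds)."""
    report = []
    for qid, text in simulated_queries:
        start = time.perf_counter()
        ranking = rank_fn(qid, text, candidates)
        elapsed = time.perf_counter() - start
        ok = (
            isinstance(ranking, list)
            and all(d in candidates for d in ranking)  # only known documents
            and len(ranking) == len(set(ranking))      # no duplicates
            and elapsed <= budget_s                    # within time budget
        )
        report.append((qid, ok, elapsed))
    return report
```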
      <p>- Evaluation metrics: The Living Ranking component makes it possible to produce additional evaluation metrics concerning the computational resources consumed by each algorithm (e.g., execution time and memory usage). Although this type of metric is not commonly used in IR, we believe such metrics are relevant for evaluating real-time IR models.</p>
    </sec>
    <sec id="sec-3">
      <title>Conclusion</title>
      <p>In this paper, we propose to extend the LL4IR framework with a Living Ranking component, with the aim of providing an evaluation framework for real-time ranking. This approach may add some technical complexity, and we are aware of the additional effort required for benchmark organization. Nevertheless, we believe the proposed extension would open LL4IR to other retrieval tasks that attract considerable interest in the IR community, namely real-time search tasks.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>B.</given-names>
            <surname>Kille</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lommatzsch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Turrin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Serény</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Larson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Brodt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Seiler</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Hopfgartner</surname>
          </string-name>
          .
          <article-title>Stream-based recommendations: Online and offline evaluation as a service</article-title>
          .
          <source>In CLEF</source>
          <year>2015</year>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Schuth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Balog</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Kelly</surname>
          </string-name>
          .
          <article-title>Overview of the Living Labs for Information Retrieval Evaluation (LL4IR) CLEF Lab 2015</article-title>
          .
          <source>In CLEF</source>
          <year>2015</year>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>