<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Ideas for a Standard LL4IR Extension</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Philipp Schaer</string-name>
          <email>philipp.schaer@th-koeln.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Narges Tavakolpoursaleh</string-name>
          <email>narges.tavakolpoursaleh@gesis.org</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>A LL4IR Component for Popular Search Environments</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Cologne University of Applied Sciences</institution>
          ,
          <addr-line>Cologne</addr-line>
          ,
          <country>Germany</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>GESIS - Leibniz Institute for the Social Sciences</institution>
          ,
          <addr-line>Cologne</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>We introduce the idea of developing a standard extension for common search engines and repository systems. This would not only increase the number of possible living labs participants on the site level but would also bring other benefits, such as common standards and practices. We have already developed such an extension for the repository system DSpace that might serve as a basis for future implementations.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>
        The idea of Living Labs for Information Retrieval (LL4IR) has been successfully
implemented in international IR evaluation campaigns like CLEF or TREC. To
establish a robust and stable evaluation environment, the LL4IR API is
publicly available and professionally hosted thanks to funding by ESF ELIAS
and Microsoft Azure. However, to date only five platforms have implemented the
API within their systems: REGIO JATEK and Seznam [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], and the three
academic sites CiteSeerX, Microsoft Academic Search, and the Social Science Open
Access Repository SSOAR (see <ext-link ext-link-type="uri" xlink:href="http://trec-open-search.org/sites/">http://trec-open-search.org/sites/</ext-link>).
The latter is developed by the authors. While
implementing the LL4IR component in SSOAR we learned two things: (1) The process of
extracting head queries, compiling the JSON markup, establishing a workflow
for uploading the feedback, implementing the interleaving, and many other tasks
add up and make it quite an effort to be part of LL4IR. (2) The query
distribution of our system is highly skewed, and few typical head queries that
are issued several hundred times a day are present. This might be due to the
quite specific range of topics represented within the repository and the nature of
academic search itself. Another reason might be that many users follow the
direct links provided by other search engines like Microsoft Academic Search.
      </p>
      <p>
        SSOAR is a DSpace-based repository system that uses a Solr search engine. While
implementing the LL4IR component we took care to make it as minimally
invasive and encapsulated as possible. This led to a fairly reusable piece of
software that might be used as an official extension for DSpace. We tested the
extension with the stable branches 3 and 5, both within an out-of-the-box vanilla
installation and the specific implementation of SSOAR. We believe this to be a
benefit for the whole repository community, as it allows other repository
operators to easily become part of the LL4IR community. More than 1,350
systems listed in OpenDOAR, a registry for Open Access repositories, are
based on DSpace: a huge field of candidates for next year's CLEF or TREC
LL4IR campaigns. These installations share a common system setup
but feature different content (e.g. repositories from the social sciences, arts, or
the sciences). This introduces the possibility to test ranking mechanisms in very
different domains.
      </p>
      <p>With many comparable systems within the same campaign it might be
possible to compensate for the missing head queries: each system could share only its
top n queries, and these could be summed up with the head queries of other DSpace installations.
This would lead not to 100 head queries but perhaps to thousands. Why not use that
many different queries and then later decide which of them are statistically stable
enough to be included in the final evaluation round?</p>
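      <p>The pooling idea above can be sketched as follows. This is only a minimal illustration: the site names, the query counts, and the simple frequency threshold standing in for a real statistical stability test are all assumptions and not part of the LL4IR API.</p>
      <preformat><![CDATA[
```python
from collections import Counter


def pool_top_queries(site_logs, n):
    """Merge the top-n queries of every site into one pooled frequency table."""
    pooled = Counter()
    for counts in site_logs.values():
        # Each site shares only its n most frequent queries.
        for query, freq in Counter(counts).most_common(n):
            pooled[query] += freq
    return pooled


def stable_queries(pooled, min_freq):
    """Keep queries whose pooled frequency reaches a threshold.

    The threshold is a hypothetical stand-in for a proper stability test.
    """
    return sorted(q for q, f in pooled.items() if f >= min_freq)


# Hypothetical daily query counts of two DSpace installations.
site_logs = {
    "repo_a": {"migration": 40, "survey methods": 25, "youth": 5},
    "repo_b": {"migration": 30, "climate policy": 20, "youth": 4},
}

pooled = pool_top_queries(site_logs, n=2)
candidates = stable_queries(pooled, min_freq=50)
```
]]></preformat>
      <p>A query that is too rare on any single site can still reach the threshold once the counts of several installations are added together.</p>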
      <p>A standard LL4IR implementation makes it possible to establish
common standards and practices, such as identical timeout configurations and the
guarantee that the interleaving algorithm is the same in every system.</p>
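      <p>As an illustration of what a shared component could fix for every site, the following sketch combines a common configuration with one interleaving routine. Team-draft interleaving is chosen here only as a well-known example; the text above does not prescribe a particular algorithm, and the configuration keys and values are hypothetical.</p>
      <preformat><![CDATA[
```python
import random

# Hypothetical settings that every installation of the extension would share.
SHARED_CONFIG = {"timeout_seconds": 1.0, "result_length": 4}


def team_draft_interleave(ranking_a, ranking_b, length, rng):
    """Interleave two rankings; return the merged list and the team of each slot."""
    interleaved, teams = [], []
    count_a = count_b = 0
    while len(interleaved) < length:
        # Documents of each ranking that have not been placed yet.
        rest_a = [d for d in ranking_a if d not in interleaved]
        rest_b = [d for d in ranking_b if d not in interleaved]
        if not rest_a and not rest_b:
            break
        # The team with fewer picks goes next; ties are broken by a coin flip.
        a_turn = count_a < count_b or (count_a == count_b and rng.random() < 0.5)
        if (a_turn and rest_a) or not rest_b:
            interleaved.append(rest_a[0])
            teams.append("A")
            count_a += 1
        else:
            interleaved.append(rest_b[0])
            teams.append("B")
            count_b += 1
    return interleaved, teams


rng = random.Random(42)  # seeded so that the example is reproducible
ranking, teams = team_draft_interleave(
    ["d1", "d2", "d3"], ["d2", "d4", "d1"], SHARED_CONFIG["result_length"], rng
)
```
]]></preformat>
      <p>The list of teams records which ranker contributed each slot, so clicks can later be credited to the production or the experimental ranking.</p>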
      <p>With both Microsoft Academic Search as a search engine and DSpace
systems as the content-bearing repositories, it might be possible to interlink
search sessions. A user searching for scientific documents within
Academic Search is later transferred to the repository, where they can find the full
text or additional document information. When both systems are part of the
Living Labs campaign they might include a common URL parameter to indicate
that this specific request is to be taken into account for the LL4IR campaign.
This way we can find out whether users are interacting with the documents, e.g. by
bookmarking them, recommending them, tweeting about them, or downloading
the PDF file.</p>
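      <p>A common URL parameter of this kind could be handled as in the following sketch. The parameter name, the session identifier, and the example URL are purely hypothetical; neither Microsoft Academic Search nor DSpace is known to define such a parameter.</p>
      <preformat><![CDATA[
```python
from urllib.parse import parse_qs, urlencode, urlparse, urlunparse

CAMPAIGN_PARAM = "ll4ir_session"  # hypothetical parameter name


def tag_outgoing_link(url, session_id):
    """Search-engine side: add the campaign parameter to a repository link."""
    parts = urlparse(url)
    query = parse_qs(parts.query)
    query[CAMPAIGN_PARAM] = [session_id]
    return urlunparse(parts._replace(query=urlencode(query, doseq=True)))


def campaign_session(url):
    """Repository side: return the session id if the request is part of the campaign."""
    values = parse_qs(urlparse(url).query).get(CAMPAIGN_PARAM)
    return values[0] if values else None


link = tag_outgoing_link(
    "https://repository.example.org/handle/123?show=full", "s-42"
)
session = campaign_session(link)
```
]]></preformat>
      <p>The repository can then log interactions such as downloads under the returned session identifier and ignore requests for which no identifier is present.</p>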
    </sec>
    <sec id="sec-3">
      <title>Conclusion and Outlook</title>
      <p>All the things we listed above also hold for other popular search environments
like Solr-based systems or content management systems like TYPO3 or WordPress.
We would therefore like to discuss whether the idea of using a standard extension
for popular search environments is worth a try and what other positive or
negative outcomes might be possible.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Schuth</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Balog</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kelly</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Overview of the Living Labs for Information Retrieval Evaluation (LL4IR) CLEF Lab 2015</article-title>
          . In:
          <source>CLEF 2015 - 6th Conference and Labs of the Evaluation Forum. Lecture Notes in Computer Science (LNCS)</source>
          , Springer (September
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>