<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Chemnitz at CLEF 2009 Ad-Hoc TEL Task: Combining Different Retrieval Models and Addressing the Multilinguality</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jens Kursten</string-name>
          <email>jens.kuersten@cs.tu-chemnitz.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>09107 Chemnitz</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Chemnitz University of Technology</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Faculty of Computer Science, Dept. Computer Science and Media</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this paper we report on our participation in the CLEF 2009 Ad-Hoc TEL task. In our second participation we were able to test and evaluate a new feature of the Xtrieval framework, namely the accessibility of the three core retrieval engines Lucene, Lemur and Terrier. This year we submitted 24 experiments in total, 12 each for the monolingual and bilingual subtasks. We compared our baseline experiments to combined runs, where we used two different retrieval models, namely the vector space model (VSM) used in Lucene and the Bose-Einstein model for randomness (BB2) available in the Terrier framework. We found that an almost constant improvement in terms of mean average precision over all provided collections is achievable. Furthermore, we tried to benefit from the multilingual contents of the collections by running combined multilingual experiments for both subtasks. The evaluation showed that this approach achieves small improvements in the monolingual setting of the task. Unfortunately, we were not able to confirm this finding in the bilingual setting, where the multilingual experiments were outperformed by the standard bilingual runs, especially on the English target collection.</p>
      </abstract>
      <kwd-group>
        <kwd>Evaluation</kwd>
        <kwd>Experimentation</kwd>
        <kwd>Cross-Language Information Retrieval</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction and outline</title>
      <p>
        The Xtrieval framework [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ],[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] was used to prepare and run this year's retrieval experiments in
the Ad-Hoc track TEL setting. The core retrieval functionality was provided by Lucene1 and
the Terrier framework [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. For the TEL task three different multilingual corpora with content
mainly in German, English and French were provided by The European Library. Each collection
consists of approximately one million library records. These library records only contain sparse
information and have descriptions in multiple languages.
We conducted monolingual experiments on each of the collections and also submitted experiments
for the bilingual task. For the translation of the topics the Google AJAX language API2 was
accessed through a JSON3 programming interface.
      </p>
      <p>The remainder of the paper is organized as follows. Section 2 describes the general setup of our
system. The individual configurations and the results of our submitted experiments are presented
in section 3. In sections 4 and 5 we summarize the results and conclude our observations.</p>
    </sec>
    <sec id="sec-2">
      <title>Experimental setup</title>
      <p>
        This year we were able to choose from various retrieval models and combine the results in the
retrieval stage by applying our implementation of the Z-Score operator [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. We also used a standard
top-k pseudo-relevance feedback algorithm in the retrieval stage, where the number of most
frequent terms obtained from the top-ranked documents differed according to the language and
retrieval model used. We used the vector space model (VSM) shipped with Lucene and the
Bose-Einstein model for randomness (BB2) available in the Terrier framework. We submitted two
monolingual baseline runs for all provided collections. Additionally we submitted one monolingual
merged experiment and another one in which we tried to benefit from the multilingual character
of the collections. The merged monolingual experiments for each collection formed the baseline for
two bilingual experiments, where the topics were translated from two different source languages
to the corresponding target collection. For two additional bilingual experiments on each target
collection we also tried to access the multilingual content of the collections.
      </p>
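The Z-Score combination of result lists from different retrieval models can be sketched as follows. This is a minimal illustration of the technique described by Savoy [5], not the Xtrieval implementation; the run names, document IDs and scores are invented. Each run's raw scores are z-score normalised (so VSM and BB2 scores become comparable) and then summed per document:

```python
from statistics import mean, stdev

def z_score_fusion(runs, weights=None):
    # runs: {run_name: {doc_id: raw_score}} -- one score dict per retrieval model
    weights = weights or {name: 1.0 for name in runs}
    combined = {}
    for name, scores in runs.items():
        values = list(scores.values())
        mu = mean(values)
        sigma = stdev(values) if len(values) > 1 else 1.0
        sigma = sigma or 1.0  # guard against all-equal scores
        for doc, s in scores.items():
            # normalise each run's score before the weighted sum
            combined[doc] = combined.get(doc, 0.0) + weights[name] * (s - mu) / sigma
    # final ranking: highest combined z-score first
    return sorted(combined, key=combined.get, reverse=True)

# invented example scores from two hypothetical runs
lucene = {"d1": 7.2, "d2": 5.1, "d3": 2.4}
terrier = {"d2": 13.0, "d3": 9.5, "d4": 1.2}
ranking = z_score_fusion({"vsm": lucene, "bb2": terrier})
```

Because the normalisation removes each run's scale, a document that ranks well in both runs (d2 here) can overtake documents that score highly in only one.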
      <p>We submitted 9 experiments in which we tried to benefit from the multilingual character of
the collections. Therefore we created multiple indexes for each target collection using appropriate
stemming and stopword removal for the four most frequent languages. During the retrieval we
queried these four indexes and combined the results into one final result list. We needed to translate
the topics for all of those experiments into the language of the corresponding index, which gives
these experiments a multilingual character. In table 1 we denote the experiments that had multilingual
character and present the boost values for the combination in the multilingual result set for each of
the experiments in column 'IDs'. These values were chosen according to the occurrence frequency
of the language in the corresponding target collection. All runs in the column 'IDs' correspond to
an experiment in column 'refer ID' and are directly comparable to this experiment, because we
used identical system configurations except for the translation component and the multilingual
indexes.
2http://code.google.com/apis/ajaxlanguage/documentation
3http://json.org</p>
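The boosted merge over the four language-specific indexes can be sketched as follows. The boost values and scores here are invented for illustration (the actual boosts appear in table 1); the idea is that each language's result list contributes in proportion to how frequent that language is in the target collection:

```python
def multilingual_merge(results_per_language, boosts):
    # results_per_language: {lang: {doc_id: score}} -- one result list per index
    # boosts: per-language weights reflecting the language's share of the collection
    merged = {}
    for lang, scores in results_per_language.items():
        for doc, s in scores.items():
            merged[doc] = merged.get(doc, 0.0) + boosts[lang] * s
    return sorted(merged, key=merged.get, reverse=True)

boosts = {"en": 1.0, "de": 0.4, "fr": 0.3, "pl": 0.1}  # invented example weights
results = {
    "en": {"d1": 0.9, "d2": 0.1},
    "de": {"d1": 0.2, "d2": 0.8},
    "fr": {"d3": 0.7},
    "pl": {},
}
ranking = multilingual_merge(results, boosts)
```

A record described in several languages (d1 here) accumulates evidence from several indexes, while a match only in a rare language is damped by its small boost.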
    </sec>
    <sec id="sec-3">
      <title>Configurations and Results</title>
      <p>The detailed setup of our experiments and their evaluation results are presented in the following
subsections.</p>
      <sec id="sec-3-1">
        <title>Monolingual Experiments</title>
        <p>We submitted 12 monolingual experiments in total, whereof 4 were submitted for each target
collection in German, English and French. For all experiments a language-specific stopword list
was applied4. We used the stemmers from Snowball5 for English and French and applied a special
n-gram stemmer6 for German.</p>
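The two preprocessing steps can be sketched as a small pipeline. This is a toy illustration only: the actual runs used the Savoy stopword lists and Snowball respectively n-gram stemmers, whereas the stopword set and the crude suffix stripper below are invented stand-ins that merely show where the two steps sit:

```python
STOPWORDS = {"the", "a", "of", "and", "in"}  # tiny invented list, not the Savoy lists

def toy_stem(token):
    # crude suffix stripper standing in for a real Snowball stemmer
    for suffix in ("ing", "ies", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    # lowercase, drop stopwords, then stem -- applied identically at indexing
    # and query time so that index terms and query terms match
    tokens = [t for t in text.lower().split() if t not in STOPWORDS]
    return [toy_stem(t) for t in tokens]

terms = preprocess("Indexing the libraries of Europe")
```

Applying the same pipeline on both sides is what makes, e.g., a query containing "libraries" match a record containing "library".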
        <p>In table 2 the retrieval performance of our experiments is reported in terms of mean average
precision (MAP) and the absolute rank of the experiment in the evaluation. We compare the
two baseline runs to one combined experiment per target collection. Furthermore we compare
the performance of the first baseline run per collection (cut1, cut9, cut17) to the corresponding
multilingual experiment (cut4++, cut12++, cut20++).
The evaluation of our experiments allows us to draw some interesting conclusions. First, the overall
performance in terms of MAP on the German and French collections was quite similar, while
the experiments on the English collection achieved much better results. Interestingly, this did
not seem to be a flaw in our configuration, since we achieved identical positions in the ranking over
all submitted experiments. Another important observation was that our combined experiments
(where different retrieval models were used) always performed better than the baseline run on
each of the target collections. However, the overall gain was not very large. Furthermore, one can
conclude that our multilingual approach also worked consistently well by slightly improving MAP
(compare cut1 to cut4++, cut9 to cut12++ and cut17 to cut20++).</p>
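Mean average precision, the measure reported throughout, can be computed as follows; the ranked lists and relevance judgments below are invented for illustration:

```python
def average_precision(ranked, relevant):
    # mean of the precision values at each rank where a relevant doc appears
    hits, precision_sum = 0, 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    # runs: list of (ranked_list, relevant_set) pairs, one per topic
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

# relevant docs d1 and d3 found at ranks 1 and 3: AP = (1/1 + 2/3) / 2
ap = average_precision(["d1", "d2", "d3", "d4"], {"d1", "d3"})
```

Because precision is sampled only at the ranks of relevant documents, MAP rewards runs that place relevant records early, which is why the small but consistent gains of the combined runs show up directly in this measure.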
      </sec>
      <sec id="sec-3-2">
        <title>Cross-lingual Experiments</title>
        <p>We submitted 12 experiments for the bilingual subtask, whereof 4 were submitted for each target
collection. Two experiments per target collection correspond to the combined monolingual run on
that collection. However, two different source topic languages were translated in those experiments.
The remaining two runs per target collection had again multilingual character. We translated
the topics from the source language to the four most common languages in the target collections,
queried the four indexes and combined the results in a multilingual result set. Again the general
4http://members.unine.ch/jacques.savoy/clef/index.html
5http://snowball.tartarus.org
6http://www-user.tu-chemnitz.de/w~ags/cv/clr.pdf
configuration was equal to the corresponding monolingual reference run for comparability. In table
3 we report the evaluation results for each of the bilingual experiments in terms of MAP and the
rank over all submitted experiments. Additionally we report our best monolingual experiment for
each target collection as baseline for comparison.
The evaluation results of our bilingual experiments were very strong. The retrieval performance of
our best bilingual runs compared to our best monolingual runs decreased only by about 0.6% on the
English collection, about 1% on the French collection and about 7.5% on the German collection.
We attribute those results to the quality of the Google translation service. Another finding
was that the experiments in which we tried to benefit from the multilinguality of the collections
also performed quite well in the bilingual setting. In fact, one of those experiments performed
best on the French collection, and on the German collection it performed almost as well as the
best experiment. Only on the English collection we could not benefit from the multilinguality;
there those two experiments were clearly outperformed by the standard bilingual runs.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Result Analysis - Summary</title>
      <p>The following list provides a summary of the analysis of our retrieval experiments for the Ad-Hoc
TEL task at CLEF 2009:</p>
      <p>Combining retrieval models: Our experiments showed that combining different retrieval
models results in a small but consistent gain in terms of MAP over all target collections.</p>
      <p>Monolingual task: The submitted monolingual experiments achieved strong performance on
all target collections. Interestingly, the MAP on the French and German collections is almost
the same, while the performance is much better on the English collection.</p>
      <p>Bilingual task: Probably due to the translation service used, our bilingual experiments
performed very well and achieved top results on each target collection. The gap to our best
corresponding monolingual runs ranged between 0.6% and 7.5%.</p>
      <p>
        Addressing the multilinguality of the collections: We experimented with multilingual
configurations and compared them to a baseline experiment. We found that our approach of
combining multiple indexed collections works quite well, except for the bilingual configurations
on the English target collection.
      </p>
      <p>
        In our second participation in the CLEF Ad-Hoc TEL task we were able to choose from a wide
selection of retrieval models. The Xtrieval framework now supports three different retrieval cores,
namely Lucene, Lemur and Terrier. By combining results from Lucene and Terrier we
achieved consistent gains in terms of mean average precision on all collections over our baseline
runs. Again we found that the translation service provided by Google seems to be clearly
superior to any other approach or system. We used this service for translating our bilingual and
multilingual experiments and obtained very strong retrieval performance for all of those runs. In the
future we will further investigate the numerous retrieval models and try to help develop an
open-source retrieval framework for information retrieval evaluation, as it was proposed by Ferro
and Harman [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>We would like to thank Jacques Savoy and his co-workers for providing numerous resources for
language processing. Also, we would like to thank Giorgio M. di Nunzio and Nicola Ferro for
developing and operating the DIRECT system7.</p>
      <p>This work was partially accomplished in conjunction with the project sachsMedia, which is
funded by the Entrepreneurial Regions 8 program of the German Federal Ministry of Education
and Research.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Nicola</given-names>
            <surname>Ferro</surname>
          </string-name>
          and
          <string-name>
            <given-names>Donna</given-names>
            <surname>Harman</surname>
          </string-name>
          .
          <article-title>Dealing with multilingual information access: Grid experiments at trebleclef</article-title>
          .
          <source>Post-proceedings of the Fourth Italian Research Conference on Digital Library Systems (IRCDL 2008)</source>
          , pages
          <fpage>29</fpage>
          -
          <lpage>32</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Jens</given-names>
            <surname>Kürsten</surname>
          </string-name>
          , Thomas Wilhelm, and
          <string-name>
            <given-names>Maximilian</given-names>
            <surname>Eibl</surname>
          </string-name>
          .
          <article-title>Extensible retrieval and evaluation framework: Xtrieval</article-title>
          . LWA 2008: Lernen - Wissen - Adaption, Würzburg, Workshop Proceedings,
          <year>October 2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Jens</given-names>
            <surname>Kürsten</surname>
          </string-name>
          , Thomas Wilhelm, and
          <string-name>
            <given-names>Maximilian</given-names>
            <surname>Eibl</surname>
          </string-name>
          .
          <article-title>The Xtrieval framework at CLEF 2007: Domain-specific track</article-title>
          . In C. Peters,
          <string-name>
            <given-names>V.</given-names>
            <surname>Jijkoun</surname>
          </string-name>
          , Th. Mandl, H. Muller,
          <string-name>
            <given-names>D.W.</given-names>
            <surname>Oard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Peñas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Petras</surname>
          </string-name>
          , and D. Santos, editors,
          <source>LNCS - Advances in Multilingual and Multimodal Information Retrieval</source>
          , volume
          <volume>5152</volume>
          , pages
          <fpage>174</fpage>
          -
          <lpage>181</lpage>
          , Berlin,
          <year>2008</year>
          . Springer Verlag.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Iadh</given-names>
            <surname>Ounis</surname>
          </string-name>
          , Christina Lioma, Craig Macdonald, and Vassilis Plachouras.
          <article-title>Research directions in Terrier: a search engine for advanced retrieval on the Web</article-title>
          .
          <source>Novatica/UPGRADE Special Issue on Next Generation Web Search</source>
          , pages
          <fpage>49</fpage>
          -
          <lpage>56</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Jacques</given-names>
            <surname>Savoy</surname>
          </string-name>
          .
          <article-title>Data fusion for effective European monolingual information retrieval</article-title>
          .
          <source>Working Notes for the CLEF 2004 Workshop</source>
          , Bath, UK,
          <year>September 2004</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>