<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>UniNE at CLEF 2012</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Mitra Akasereh</string-name>
          <email>mitra.akasereh@unine.ch</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nada Naji</string-name>
          <email>nada.naji@unine.ch</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jacques Savoy</string-name>
          <email>jacques.savoy@unine.ch</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Computer Science Dept., University of Neuchatel</institution>
          ,
          <addr-line>Rue Emile Argand 11, 2000 Neuchatel</addr-line>
          ,
          <country country="CH">Switzerland</country>
        </aff>
      </contrib-group>
      <abstract>
<p>As participants in this CLEF evaluation campaign, our first objective is to propose and evaluate various indexing and search strategies for the CHiC corpus, in order to compare retrieval effectiveness across different IR models. Our second objective is to measure the relative merit of various stemming strategies when applied to the French and English monolingual tasks in the CH context. Our third objective is to assess the effectiveness of query translation methods in bilingual retrieval. To do so, we evaluated the CHiC test-collections using the Okapi model, various IR models derived from the Divergence from Randomness (DFR) paradigm, and the dtu-dtn vector-space model. We also evaluated different pseudo-relevance feedback approaches. In the bilingual task, we searched the English corpus using the French and German topics, with two different translations for each of them. For both the English and French languages, we find that word-based indexing with our light stemming procedure results in better retrieval effectiveness than the other strategies tested. When stemming was skipped, the performance variations were relatively small; for the French corpus, skipping stemming even yielded better performance than applying the light stemmer. At the bilingual level, the results show that using a combination of translation resources gives better results than a single source.</p>
      </abstract>
      <kwd-group>
        <kwd>Probabilistic IR Models</kwd>
        <kwd>Stemmer</kwd>
        <kwd>Data Fusion</kwd>
        <kwd>Cultural Heritage</kwd>
<kwd>Bilingual IR</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
<title>Introduction</title>
<p>Cultural heritage can be defined as any man-made object or intangible feature remaining from previous societies. It can refer to artefacts, built or natural environments, traditions, languages, etc. The growing use of digital information challenges cultural heritage organizations to provide their collections in electronic format. The data may come from different sources (libraries, archives, museums, audiovisual archives, books, journals, etc.), in various languages and formats. These digital libraries should not only be created but also properly managed and assessed in order to bring the maximum utility to their users. As yet, no proper evaluation approaches are available and there is work to be done in this area. The goal of the Cultural Heritage in CLEF (CHiC) evaluation lab is thus to provide a systematic and large-scale evaluation of cultural heritage digital libraries.</p>
<p>The IR group of the University of Neuchâtel focuses, as one of its main tasks, on the design, implementation and evaluation of various indexing and search strategies for a set of different natural languages. Up to this point, we have provided a groundwork for the evaluation and comparison of different tools for monolingual IR, in different languages, using generic test-collections (e.g., newspaper articles). Our second goal is to evaluate different tools within a specific field of knowledge in order to integrate domain-specific search into our system. The aim here is to be able to evaluate the impact of document structure and query formulation on retrieval effectiveness, in order to study possible ways to improve search quality in a domain-specific search. As a third objective, we also want to integrate translation into the search process and adapt our system for bilingual and multilingual IR. Reaching these objectives has been our main motive to participate in the CHiC evaluation lab at CLEF 2012.</p>
<p>The rest of this report is organized as follows: Section 2 presents our experiment setup. Section 3 describes the results obtained during the experiment and the related analysis. Section 4 presents our official runs and finally Section 5 concludes the experiment.</p>
    </sec>
    <sec id="sec-2">
      <title>Experiment Setup</title>
      <sec id="sec-2-1">
        <title>Overview of the Task</title>
<p>In our participation in CHiC we worked on the ad-hoc retrieval task. This is a standard retrieval task in which retrieval effectiveness for individual queries is assessed. At this level the only authorized user/system interaction is blind-query expansion techniques. The expected output is a ranked list of retrieved documents for each query. The task covers monolingual, bilingual and multilingual subtasks in English, French and German. In our experiment we worked on monolingual English and French retrieval, as well as bilingual retrieval in which the French and German topics are used to search the English corpus.</p>
      </sec>
      <sec id="sec-2-2">
        <title>Overview of the Test-collection</title>
<p>The corpus used in the CHiC test-collection is extracted from Europeana (www.europeana.eu) and is offered in three major European languages, namely English (EN), French (FR) and German (DE). Europeana is a digitized collection of Europe's cultural and scientific heritage. It provides access to over 23 million objects such as books, paintings, films, museum objects, etc., collected from more than 2,200 institutions in 33 countries. The Europeana collection is cross-domain and in multiple languages. The document metadata are mapped to a single data model. Each document consists of elements providing brief descriptions of the objects (title, keywords, description, date, provider, etc.). It is worth mentioning that some documents contain fewer of these tags than others, which sometimes leaves them with very poor content. As far as our experiment is concerned, only human-readable informative texts are of use.</p>
<p>The English corpus consists of 1,106,426 documents while the French one contains 3,635,388. A sample of the French and English documents is shown in Figures 1 and 2.
&lt;ims:metadata ims:identifier="http://www.europeana.eu/resolve/record/10105/662DC5085397837C8C8891836EA6431C4A477CB2" ims:namespace="http://www.europeana.eu/" ims:language="eng"&gt;
&lt;ims:fields&gt;
&lt;dc:identifier&gt;Orn.0446&lt;/dc:identifier&gt;
&lt;dc:subject&gt;Australian Pelican&lt;/dc:subject&gt;
&lt;dc:title&gt;Australian Pelican (Orn.0446)&lt;/dc:title&gt;
&lt;dc:type&gt;mounted specimen&lt;/dc:type&gt;
&lt;europeana:country&gt;malta&lt;/europeana:country&gt;
&lt;europeana:dataProvider&gt;Heritage Malta&lt;/europeana:dataProvider&gt;
&lt;europeana:isShownAt&gt;http://www.heritagemalta.org/sterna/orn.php?id=0446&lt;/europeana:isShownAt&gt;
&lt;europeana:language&gt;en&lt;/europeana:language&gt;
&lt;europeana:provider&gt;STERNA&lt;/europeana:provider&gt;
&lt;europeana:type&gt;IMAGE&lt;/europeana:type&gt;
&lt;europeana:uri&gt;http://www.europeana.eu/resolve/record/10105/662DC5085397837C8C8891836EA6431C4A477CB2&lt;/europeana:uri&gt;
&lt;/ims:fields&gt;
&lt;/ims:metadata&gt;</p>
        <p>
For the ad-hoc task there are 50 very short topics. These topics are mostly named entities (people, places and works) and are mainly extracted from the Europeana query logs. Thus they convey real users' information needs in a cultural heritage search context. Among the 50 French topics, 11 have no relevant documents in the collection. This number grows to 14 for the English topics. One topic from each language is shown in Figure 3. As shown in the sample below, each topic consists of a title and, sometimes, a description of the content. However, the only field that should be used for retrieval is the title.
&lt;topic lang="en"&gt;
&lt;identifier&gt;CHIC-006&lt;/identifier&gt;
&lt;title&gt;esperanto&lt;/title&gt;
&lt;description&gt;Constructed international auxiliary language&lt;/description&gt;
&lt;/topic&gt;
&lt;topic lang="fr"&gt;
&lt;identifier&gt;CHIC-004&lt;/identifier&gt;
&lt;title&gt;film muet&lt;/title&gt;
&lt;description /&gt;
&lt;/topic&gt;
&lt;topic lang="de"&gt;
&lt;identifier&gt;CHIC-025&lt;/identifier&gt;
&lt;title&gt;amerikanische sklaverei&lt;/title&gt;
&lt;description /&gt;
&lt;/topic&gt;</p>
        <p>In our experiment we applied stopword removal along with a light stemmer for both
English and French corpora. Our stopword list for English contains 571 terms while the French one has 464. These tools are freely available at members.unine.ch/jacques.savoy/clef/. The lists are composed of high-frequency terms such as determiners, prepositions, conjunctions, pronouns, and some verbal forms which convey no important meaning. The light stemmer that we used for English removes only the plural '-s' and is called the S-stemmer [
          <xref ref-type="bibr" rid="ref1">1</xref>
]. The stemmer for French removes the inflectional suffixes from the plural and feminine forms of words
[
          <xref ref-type="bibr" rid="ref2">2</xref>
          ].
        </p>
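        <p>To make the indexing pipeline concrete, the following minimal sketch (in Python) illustrates the two steps just described: stopword filtering and the S-stemmer, implemented here with Harman's three plural rules [1]. The stopword set is only a tiny excerpt of the real 571-term list, and the function names are ours.</p>
        <p>
# Minimal sketch of the English indexing pipeline: stopword removal
# followed by the S-stemmer (remove only the plural '-s').
STOPWORDS = {"the", "of", "and", "a", "in", "to", "is"}

def s_stem(word):
    # S-stemmer: Harman's three rules for the plural '-s'.
    if word.endswith("ies") and not word.endswith(("eies", "aies")):
        return word[:-3] + "y"      # "skies" -> "sky"
    if word.endswith("es") and not word.endswith(("aes", "ees", "oes")):
        return word[:-1]            # "houses" -> "house"
    if word.endswith("s") and not word.endswith(("us", "ss")):
        return word[:-1]            # "stamps" -> "stamp"
    return word

def index_terms(text):
    tokens = text.lower().split()
    return [s_stem(t) for t in tokens if t not in STOPWORDS]

# index_terms("the australian pelicans") -> ['australian', 'pelican']
</p>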
        <p>
          Our choice of these light stemmers is based on previous experiments which show
that light stemmers tend to be as effective as stemmers based on morphological
analysis [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ], [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ], [
          <xref ref-type="bibr" rid="ref4">4</xref>
]. Moreover, applying stemming is not necessarily a good way to achieve high precision, which is the aim of this experiment [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ].
        </p>
      </sec>
      <sec id="sec-2-3">
        <title>IR Models</title>
        <p>
In our experiments we tried different weighting schemes in order to compare them and identify the most effective ones in terms of achieving high precision. First we picked the dtu-dtn model [
          <xref ref-type="bibr" rid="ref6">6</xref>
] as an effective vector-space model. Second, as a probabilistic model, we used Okapi (BM25) [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. Then we tried three other probabilistic
models extracted from the Divergence from Randomness (DFR) family [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], namely
DFR-PL2, DFR-I(ne)C2, and DFR-I(ne)B2. The indexing weight (weight of term tj in
document di) in these models is computed as shown in Table 1, where $l_i$ is the length of document $d_i$ and $avdl$ is the average document length. For all DFR models the indexing weight is defined as
$w_{ij} = \mathrm{Inf}^{1}_{ij} \cdot \mathrm{Inf}^{2}_{ij} = -\log_{2}\left(\mathrm{Prob}^{1}_{ij}(tf_{ij})\right) \cdot \left(1 - \mathrm{Prob}^{2}_{ij}(tf_{ij})\right)$.
For the DFR-I(ne)B2 and DFR-I(ne)C2 models the first component is
$\mathrm{Inf}^{1}_{ij} = tfn_{ij} \cdot \log_{2}\left(\frac{n+1}{n_{e}+0.5}\right)$,
while for DFR-PL2 it is derived from the Poisson probability
$\mathrm{Prob}^{1}_{ij} = \frac{e^{-\lambda_{j}} \cdot \lambda_{j}^{tfn_{ij}}}{tfn_{ij}!}$,
where the normalized term frequency $tfn_{ij}$ depends on the constant $c$ and on $mean\_dl$ (the average document length).</p>
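        <p>To make these formulas concrete, the following minimal sketch computes the Inf1 components in Python, using the standard DFR term-frequency normalization from [8]; the function names are ours, and the default value of c is only a placeholder, not our tuned setting.</p>
        <p>
import math

def tfn(tf, dl, mean_dl, c=1.5):
    # DFR normalized term frequency: tf * log2(1 + c * mean_dl / dl).
    # c is a tunable constant (placeholder default).
    return tf * math.log2(1.0 + c * mean_dl / dl)

def inf1_ine(tfn_ij, n, tc):
    # Inf1 for I(ne)B2/I(ne)C2: tfn * log2((n+1)/(ne+0.5)),
    # with ne = n * (1 - ((n-1)/n)**tc); n = #documents, tc = collection frequency.
    ne = n * (1.0 - ((n - 1.0) / n) ** tc)
    return tfn_ij * math.log2((n + 1.0) / (ne + 0.5))

def inf1_pl2(tfn_ij, n, tc):
    # Inf1 for PL2: -log2(Prob1) with Prob1 = exp(-lam) * lam**tfn / tfn!,
    # lam = tc / n; lgamma handles the factorial of a real-valued tfn.
    lam = tc / n
    log_prob = -lam + tfn_ij * math.log(lam) - math.lgamma(tfn_ij + 1.0)
    return -log_prob / math.log(2.0)
</p>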
        <p>For evaluating the retrieval performance we chose the MAP (mean average precision) measure. It is computed with the TREC_EVAL program, based on a maximum of 1,000 retrieved items per query. It is important to mention that when computing the MAP, the topics with no relevant items are not taken into account (14 topics among the English ones and 11 among the French ones).</p>
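        <p>For clarity, here is a minimal sketch of the average precision computation underlying this measure; it reproduces, in simplified form, the TREC_EVAL behaviour of cutting the list at 1,000 items and skipping topics without relevant documents.</p>
        <p>
def average_precision(ranked_ids, relevant_ids, cutoff=1000):
    # Average precision of one ranked list, over the first `cutoff` items.
    hits, precisions = 0, []
    for rank, doc_id in enumerate(ranked_ids[:cutoff], start=1):
        if doc_id in relevant_ids:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(relevant_ids)

def mean_average_precision(runs, qrels):
    # MAP over all topics; topics with no relevant items are skipped.
    aps = [average_precision(runs[t], qrels[t])
           for t in runs if qrels.get(t)]
    return sum(aps) / len(aps)
</p>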
        <p>In order to enhance the retrieval effectiveness we also applied a blind-query
expansion to our test. Our previous experiments on other corpora show that pseudo-relevance feedback (PRF, or blind-query expansion) tends to improve the retrieval effectiveness [
          <xref ref-type="bibr" rid="ref9">9</xref>
]. As a first approach we tried Rocchio's approach [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] with α =
0.75, β = 0.75. In this method the system expands the query by adding m terms
selected from the k best ranked documents retrieved for the original query. As a second
approach we tried an idf-based query expansion model [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. The reason for trying
both approaches is that in some cases adding frequently occurring terms produces
noise and consequently Rocchio's approach does not give good results [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ].
        </p>
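        <p>A minimal sketch of this Rocchio expansion step, assuming documents and queries are represented as simple {term: weight} dictionaries; the parameter names mirror the α and β values given above, and the helper name is ours.</p>
        <p>
from collections import defaultdict

def rocchio_expand(query, top_docs, m=10, alpha=0.75, beta=0.75):
    # Blind query expansion: q' = alpha*q + beta*centroid(top-k docs),
    # keeping the m best new terms from the k best-ranked documents [10].
    centroid = defaultdict(float)
    for doc in top_docs:
        for term, w in doc.items():
            centroid[term] += w / len(top_docs)
    expanded = {t: alpha * w for t, w in query.items()}
    added = 0
    for term, w in sorted(centroid.items(), key=lambda x: x[1], reverse=True):
        if added == m:
            break
        expanded[term] = expanded.get(term, 0.0) + beta * w
        if term not in query:
            added += 1
    return expanded
</p>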
      </sec>
      <sec id="sec-2-4">
        <title>Data Fusion</title>
        <p>
In our experiment we tried to see whether or not combining different indexing schemes and IR models improves the retrieval effectiveness, as it is supposed to [
          <xref ref-type="bibr" rid="ref13">13</xref>
]. It is probable that different strategies retrieve the same relevant items in their top ranks, but different non-relevant ones. Therefore we expect that by combining ranked lists resulting from different IR models, we will obtain a list with the relevant documents at higher ranks and the non-relevant items at lower ones [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]. In
order to produce this combination of ranked lists, different fusion operators can be
used. In our study we chose the Z-score scheme which tends to perform the best [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ],
[
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]. More details about the Z-score strategy can be found in [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ].
        </p>
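        <p>A minimal sketch of this Z-score combination, assuming one {doc_id: score} dictionary per IR model; each run's scores are standardized before summing, which is one common variant of the scheme described in [14], [16].</p>
        <p>
import statistics
from collections import defaultdict

def zscore_fusion(runs):
    # Standardize each run's scores (z = (s - mean) / stdev)
    # and sum them per document; return documents by fused score.
    fused = defaultdict(float)
    for run in runs:
        scores = list(run.values())
        mean = statistics.mean(scores)
        sd = statistics.stdev(scores) if len(scores) > 1 else 1.0
        for doc_id, s in run.items():
            fused[doc_id] += (s - mean) / sd
    return sorted(fused.items(), key=lambda x: x[1], reverse=True)
</p>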
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Results &amp; Analysis</title>
      <sec id="sec-3-1">
        <title>Monolingual Retrieval</title>
        <p>For the monolingual ad-hoc task, we tested our system on the English and French corpora. Tables 2 and 3 show the Mean Average Precision (MAP) for, respectively, the English and French corpora. For both languages, we tried different IR models while applying a light stemmer (Section 2.3) and compared these results with the ones obtained when stemming is ignored. When using the Okapi model, the avdl (average document length) is set to 181 for the English corpus and 169 for the French one, the constant k1 to 1.2 for both languages, and we tried three different values for the constant b: 0.5, 0.7 and 0.9.</p>
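        <p>For reference, a minimal sketch of the Okapi BM25 weight with the constants given above (k1 = 1.2, avdl = 181 for English); the idf component shown is one common formulation [7], not necessarily the exact variant in our system.</p>
        <p>
import math

def bm25_term_weight(tf, df, N, dl, avdl=181.0, k1=1.2, b=0.5):
    # Okapi BM25 contribution of a single query term:
    # tf: term frequency in the document, df: document frequency,
    # N: number of documents, dl: document length.
    K = k1 * ((1.0 - b) + b * dl / avdl)
    idf = math.log((N - df + 0.5) / (df + 0.5))
    return idf * ((k1 + 1.0) * tf) / (K + tf)
</p>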
        <p>As the results show, for the English corpus the DFR-I(ne)B2 model achieves the highest MAP, while the best performing model for French is the Okapi model (with b=0.5). The results show that applying the light stemmer improves search effectiveness for the English language, which is not the case for the French collection. As can be seen in Table 3, we achieved a higher MAP when ignoring the stemming phase for the French language. A query-by-query analysis of the results reveals some examples where stemming misleads the retrieval. In Topic #21 the title “chardonne” (Jacques Chardonne, a French writer, or a place in Switzerland) is indexed as “chardon” (after applying the light stemmer), which leads the system to retrieve non-relevant documents in its top ranks (in which “chardon” refers to a flower), such as:
─ Etude de feuilles de echirops, de sphoerophalus, chardon cultivé, de chardon
sauvage de la mer, de fleur lilas, de chardon sauvage
─ Sujet ou décor : représentation végétale (fleur, chardon) ; chardon bleu ; Etude
de chardon fleuri
─ Chardons sur la côte rocheuse
As another example we can mention Topic #9, for which the title “îles malouines” changes to “malouin” after stemming, resulting in the retrieval of non-relevant documents (where “Malouin” is a proper name) in the top ranks, such as:
─ L'Avare, comédie de Molière en 5 actes, mise en vers, par A. Malouin
─ villas de la Malouine</p>
        <p>Table 4 contains the MAP obtained when applying pseudo-relevance feedback. These results reveal that in this experiment the PRF technique did not enhance the retrieval performance. The reason is probably that we are dealing with relatively short documents (the average number of distinct indexing terms per document is about 54 for English and 56 for French).</p>
        <p>In Table 5 we can see the results of our data fusion approach for the English corpus. Combining different result lists slightly enhances the performance in some cases. However, the difference between the MAP obtained for each model separately and for the combined one is rather small.</p>
        <p>[Tables 4 and 5: the pseudo-relevance feedback settings tested (5, 10 or 20 documents; 10 or 30 terms) and the data fusion combinations of the DFR-I(ne)B2, DFR-PL2, DFR-I(ne)C2, dtu-dtn and Okapi (b=0.9) models.]</p>
      </sec>
      <sec id="sec-3-2">
        <title>Bilingual Retrieval</title>
        <p>
In our bilingual retrieval we used the German and French topics to search the English corpus. Our approach was based on query translation (QT): we produced English translations of the German and French topics and then launched the search on the English corpus. To translate the queries we used two different strategies. First we used Google Translate, which seems to give reasonable results when dealing with very short query formulations [
          <xref ref-type="bibr" rid="ref17">17</xref>
]. As a second approach we used a combination of Wikipedia and Google, considering that a combination of translation strategies slightly improves the retrieval performance [
          <xref ref-type="bibr" rid="ref16">16</xref>
]. The results for the bilingual retrieval are shown in Tables 6 and 7. We can see that using the combination of Google and Wikipedia results in better performance, even though the difference is not remarkable.
        </p>
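        <p>A minimal sketch of how two translation sources can be merged into a single query; here we simply take the union of the terms produced by each resource, keeping their first-seen order. This is an illustrative simplification, not the exact combination scheme of [16].</p>
        <p>
def combine_translations(*translations):
    # Merge the terms of several candidate translations into one query,
    # dropping duplicates while preserving order.
    seen, merged = set(), []
    for t in translations:
        for term in t.lower().split():
            if term not in seen:
                seen.add(term)
                merged.append(term)
    return " ".join(merged)

# Hypothetical example for Topic #5:
# combine_translations("stamp", "postage stamp") -> "stamp postage"
</p>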
        <p>The topics used in this collection are mostly named entities, and only the title is used for the search, which makes the translation easier and less critical. As a result there are not many differences between the translations produced by the two strategies. However, by inspecting the results in detail we can find some cases in which a better translation led to better retrieval. When translating Topic #5 (“briefmarke”) from German to English, Google gives the word “stamp”, versus “postage stamp” resulting from the Google and Wikipedia combination. As a result, the system returns 9 relevant documents among its first 10 ranks when searching “postage stamp”, while with “stamp” the first relevant document only appears at rank 82. Using the French version of the same topic (“timbre poste”), Google gives “stamp post”, versus “postage stamp” from the combination method. Here again the system retrieves 9 relevant documents among its first 10 ranks using “postage stamp”, while searching “stamp post” retrieves 5 relevant documents among the first 10, with the first relevant one at rank 5.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Official Runs</title>
      <p>Table 8 summarizes our twelve official runs. We submitted four runs for the
English monolingual ad-hoc task and four French monolingual ad-hoc runs. For the bilingual ad-hoc task we submitted two runs using French topics to retrieve English documents and two runs using German topics, again on the English corpus. In each run we used our different selected models, while applying our light stemmers or alternatively skipping the stemming phase. In some cases we applied a pseudo-relevance feedback strategy [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] to evaluate its impact on the system's performance. We also tried to
merge different models into a single ranked list using the Z-score scheme [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] in
order to improve the retrieval effectiveness.</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>The results obtained in the CLEF 2012 CHiC lab show that the models derived from the
Divergence from Randomness (DFR) family yield the best retrieval effectiveness regardless of the underlying language and test-collection. Applying DFR-I(ne)B2 and DFR-PL2 to both the French and English corpora produced a high MAP compared to the other tested models. Our results reveal that the Okapi model (with b=0.5) also tends to be an effective model. The remaining question is how to define the best values for the underlying constants.
        </p>
      <p>Our experiment shows that applying a light stemmer (removing only the plural '-s') for English helps to achieve better results than when the stemming phase is skipped. On the contrary, using our light stemmer for French (removing plural and feminine suffixes) does not seem to enhance the retrieval performance. A simpler stemmer for the French language may produce better effectiveness than the applied light stemmer.</p>
      <p>Considering the results, we can also conclude that when dealing with relatively short documents, blind-query expansion is not a useful method for improving the retrieval effectiveness. In such cases, it seems difficult to select the most appropriate terms to be included in the expanded query.</p>
      <p>Finally, our results from the bilingual search confirm the effectiveness of the DFR-I(ne)B2 model and the S-stemmer (used for English). Furthermore, they show that a combined translation strategy leads to better results than a single one, even though in our experiment, with very short topics (mostly named entities), the difference between the various translation methods is not remarkable.</p>
      <p>Acknowledgements. This work was supported in part by the Swiss National Science Foundation under Grant #200020-129535/1.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Harman</surname>
            ,
            <given-names>D.K.</given-names>
          </string-name>
          :
<article-title>How effective is suffixing?</article-title>
          .
          <source>JASIS</source>
          .
          <volume>42</volume>
          (
          <issue>1</issue>
          ),
          <fpage>7</fpage>
          -
          <lpage>15</lpage>
          (
          <year>1991</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Savoy</surname>
            ,
            <given-names>J.:</given-names>
          </string-name>
          <article-title>A stemming procedure and stopword list for general French corpora</article-title>
          .
          <source>JASIS</source>
          .
          <volume>50</volume>
          ,
          <fpage>944</fpage>
          -
          <lpage>952</lpage>
          (
          <year>1999</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Savoy</surname>
          </string-name>
, J.:
          <article-title>Light Stemming Approaches for the French, Portuguese, German and Hungarian Languages</article-title>
          .
          <source>Proceedings ACM-SAC</source>
          ,
          <fpage>1031</fpage>
          -
          <lpage>1035</lpage>
          . The ACM Press, (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Fautsch</surname>
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Savoy</surname>
            <given-names>J</given-names>
          </string-name>
          .:
<article-title>Algorithmic Stemmers or Morphological Analysis: An Evaluation</article-title>
          .
          <source>JASIST</source>
          .
          <volume>60</volume>
          ,
          <fpage>1616</fpage>
          -
          <lpage>1624</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Savoy</surname>
            <given-names>J.</given-names>
          </string-name>
,
          <string-name>
            <surname>Rasolofo</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Report on the TREC 11 Experiment: Arabic, Named Page and Topic Distillation Searches</article-title>
          .
          <source>In: Proceedings of the Eleventh Text Retrieval Conference TREC-2002</source>
          , pp.
          <fpage>765</fpage>
          -
          <lpage>774</lpage>
          . NIST Special Publication (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Singhal</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
:
          <article-title>AT&amp;T at TREC-6</article-title>
          .
          <source>ACM Conference on Research and Development in Information Retrieval</source>
          , pp.
          <fpage>35</fpage>
          -
          <lpage>41</lpage>
          . ACM/SIGIR (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Robertson</surname>
            ,
            <given-names>S.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Walker</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          &amp;
          <string-name>
            <surname>Beaulieu</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Experimentation as a way of life: Okapi at TREC</article-title>
          .
          <source>Information Processing &amp; Management</source>
          .
          <volume>36</volume>
          (
          <issue>1</issue>
          ),
          <fpage>95</fpage>
          -
          <lpage>108</lpage>
          (
          <year>2000</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Amati</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          , &amp; van
          <string-name>
            <surname>Rijsbergen</surname>
            ,
            <given-names>C.J.</given-names>
          </string-name>
          :
          <article-title>Probabilistic models of information retrieval based on measuring the divergence from randomness</article-title>
          .
          <source>ACM Transactions on Information Systems</source>
          .
          <volume>20</volume>
          (
          <issue>4</issue>
          ),
          <fpage>357</fpage>
          -
          <lpage>389</lpage>
          (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Akasereh</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Savoy</surname>
          </string-name>
          , J.:
          <article-title>Ad Hoc Retrieval with Marathi Language</article-title>
          . Working notes,
          <source>Forum for Information Retrieval Evaluation</source>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Buckley</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Singhal</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mitra</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Salton</surname>
          </string-name>
          , G.:
          <article-title>New Retrieval Approaches Using SMART</article-title>
          .
          <source>Proceedings TREC-4</source>
          ,
          <fpage>25</fpage>
          -
          <lpage>48</lpage>
          . NIST Publication #
          <fpage>500</fpage>
          -
          <lpage>236</lpage>
          , Gaithersburg, (
          <year>1996</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Abdou</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
<string-name>
            <surname>Savoy</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Searching in Medline: Stemming, Query Expansion, and Manual Indexing Evaluation</article-title>
          .
          <source>Information Processing &amp; Management</source>
          .
          <volume>44</volume>
          ,
          <fpage>781</fpage>
          -
          <lpage>789</lpage>
          , (
          <year>2008</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Peat</surname>
            ,
            <given-names>H.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Willett</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>The Limitations of Term Co-Occurrence Data for Query Expansion in Document Retrieval Systems</article-title>
.
          <source>JASIS</source>
          .
          <volume>42</volume>
          ,
          <fpage>378</fpage>
          -
          <lpage>383</lpage>
          , (
          <year>1991</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Vogt</surname>
            ,
            <given-names>C.C.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Cottrell</surname>
            ,
            <given-names>G.W.</given-names>
          </string-name>
          :
          <article-title>Fusion via a linear combination of scores</article-title>
          .
          <source>IR Journal</source>
          .
          <volume>1</volume>
          (
          <issue>3</issue>
          ),
          <fpage>151</fpage>
          -
          <lpage>173</lpage>
          (
          <year>1999</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Savoy</surname>
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Data Fusion for Effective European Monolingual Information Retrieval</article-title>
          .
          <source>CLEF 2004. LNCS</source>
          , vol.
          <volume>3491</volume>
          , pp.
          <fpage>233</fpage>
          -
          <lpage>244</lpage>
          . Springer, Heidelberg (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Dolamic</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fautsch</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Savoy</surname>
          </string-name>
, J.:
          <article-title>UniNE at CLEF 2008: TEL, and Persian IR</article-title>
          .
          <source>CLEF 2008. LNCS</source>
          , vol.
          <volume>5706</volume>
          , pp.
          <fpage>178</fpage>
          -
          <lpage>185</lpage>
          . Springer, Heidelberg (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Savoy</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Berger</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
<article-title>Selection and Merging Strategies for Multilingual Information Retrieval</article-title>
          .
          <source>CLEF 2004. LNCS</source>
          , vol.
          <volume>3491</volume>
          , pp.
          <fpage>27</fpage>
          -
          <lpage>37</lpage>
          . Springer, Heidelberg (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Dolamic</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Savoy</surname>
            <given-names>J</given-names>
          </string-name>
          .:
<article-title>How effective is Google's translation service in search?</article-title>
          .
          <source>Communications of the ACM</source>
          .
          <volume>52</volume>
          ,
          <fpage>139</fpage>
          -
          <lpage>143</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>