=Paper=
{{Paper
|id=Vol-1168/CLEF2002wn-adhoc-ReidsmaEt2002b
|storemode=property
|title=MediaLab @ CLEF-2002: Comparing Search Strategies
|pdfUrl=https://ceur-ws.org/Vol-1168/CLEF2002wn-adhoc-ReidsmaEt2002b.pdf
|volume=Vol-1168
|dblpUrl=https://dblp.org/rec/conf/clef/ReidsmaW02
}}
==MediaLab @ CLEF-2002: Comparing Search Strategies==
Dennis Reidsma, Peter van der Weerd
{d.reidsma, pweerd}@medialab.nl
MediaLab BV, Schellinkhout, The Netherlands
http://www.medialab.nl

===Abstract===
This report describes the participation of MediaLab BV in the CLEF-2002 evaluations. This year we participated in the monolingual Dutch task for the second time. Our main objective last year was to gain some experience with participating in these experiments and to get a first impression of how our search engine performed compared to other search engines. This year we wanted to experiment with different search strategies and parameterisations of our core search engine.

===1 Approach===
All experiments were done using the core search engine developed by MediaLab. We had two aims in mind: the first was to experiment with some completely different search strategies, the second was to run those strategies with variations of several parameters in order to investigate their effect on the quality of the results. The next section describes those strategies and parameters in detail. To compare the strategies we used the queries and assessments of last year, evaluating the results with the trec_eval program developed by NIST. Section 3 discusses those results, also comparing them to our performance last year. Section 4 is about our results with the topics for this year, and section 5 presents our conclusions and some discussion of what we intend to do next year at CLEF-2003.

===2 Strategies===
In the working notes of the CLEF-2001 workshop we stated that we would extend our approach with Natural Language Processing and query word weighting for our participation in 2002 [2]. However, when we started preparing our submission for this year we decided that we should first settle on a good baseline algorithm that uses only the core functionality of our search engine. For this we investigated two basic strategies, which are discussed in the rest of this section. The first strategy only searches for query words in a collection of simple indexes over the document fields; the second strategy uses co-occurrence indexing.

====Common parts====
The following parts were common to both strategies.
• A stopword list was obtained by extending the CLEF stopword list for Dutch with our own default stopword collection.
• Stemming was done using a default algorithm based on the Porter stemmer.
• Compound terms were generated from the words in the freetext index: if, for instance, “klap”, “roos” and “klaproos” are all found in the index, “klaproos” is considered a compound term consisting of “klap” and “roos”. Searching for “roos” will then also retrieve hits on “klaproos”.

====Strategy 1====
The first strategy was very straightforward, since our aim was not to find the best algorithm but to define a baseline algorithm that performs well and is easy to understand and analyse. We defined a collection of indexes over the document fields ‘TI’, ‘LE’ and ‘TE’, as well as combined indexes over every combination of those fields. The next step was to define runs that differed on the following points:
• Which indexes were searched (at most 3 indexes per run).
• Which parts of the query were used in searching (title, description and/or narrative). The different indexes in one run could be searched using different combinations of query fields.
• How strong the influence of keyword stemming should be for the separate indexes in a run.

Since we wanted to keep the number of runs small, we used only the settings no stemming, default stemming (stemmed matches receive a relative weight of 0.6) and full stemming (stemmed matches receive the same weight as exact matches). Using these parameters, we defined about 500 runs on the topics of last year.
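As an illustration of this weighting scheme, the following is a minimal sketch and not a description of MediaLab's actual engine: the toy stemmer, the index layout and all names such as `score_documents` and `STEM_WEIGHT` are assumptions; only the relative weight of 0.6 for stemmed matches is taken from the text above.

```python
# Hypothetical sketch of the Strategy 1 weighting described above: an exact
# keyword match contributes 1.0 to a document's score, a stemmed match
# contributes STEM_WEIGHT, so with the "default stemming" setting two stemmed
# matches (2 * 0.6 = 1.2) slightly outweigh a single exact match.
from collections import defaultdict

STEM_WEIGHT = 0.6   # 0.0 would correspond to "no stemming", 1.0 to "full stemming"


def toy_stem(word: str) -> str:
    """Very rough stand-in for the Porter-based Dutch stemmer used by the engine."""
    for suffix in ("en", "e", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def score_documents(query_terms, field_index, stem_weight=STEM_WEIGHT):
    """Score documents in one field index (e.g. an index over 'TI', 'LE' or 'TE').

    field_index maps an indexed term to the set of document ids containing it.
    """
    scores = defaultdict(float)
    for term in query_terms:
        # Exact matches count fully.
        for doc_id in field_index.get(term, ()):
            scores[doc_id] += 1.0
        # Stemmed matches: indexed terms sharing the query term's stem count partially.
        stem = toy_stem(term)
        for indexed_term, doc_ids in field_index.items():
            if indexed_term != term and toy_stem(indexed_term) == stem:
                for doc_id in doc_ids:
                    scores[doc_id] += stem_weight
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    ti_index = {"boek": {1}, "boeken": {2, 3}}      # toy 'TI' field index
    print(score_documents(["boek"], ti_index))      # the exact match ranks first
```

In an actual run, scores from the different field indexes and query fields would still have to be combined; that combination is exactly the parameter space the roughly 500 runs explored.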
====Strategy 2====
With the second strategy we tried to improve the results of a run by using co-occurrence information. For this we first defined a co-occurrence network on the document collection. This network was then used to modify the query results in the following way: for every word in the query its co-occurrences were determined. Combining those co-occurrences for all query words with an AND or an OR operation resulted in so-called strong and weak query expansions. Those query expansions could then be used to find additional query results or simply to rerank the existing results. The assumption was that this procedure might improve recall in the runs.
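To make the expansion step concrete, here is a small hypothetical sketch of how such a co-occurrence network and the strong (AND) and weak (OR) expansions could be built. The document-level co-occurrence definition, the `min_count` threshold and the function names are assumptions, not details taken from the paper.

```python
# Hypothetical sketch of Strategy 2: build a co-occurrence network from the
# document collection, then expand a query with terms that co-occur with ALL
# query words (strong expansion) or with ANY query word (weak expansion).
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_network(documents, min_count=2):
    """Map each term to the set of terms it co-occurs with at document level."""
    pair_counts = defaultdict(int)
    for doc in documents:
        for a, b in combinations(sorted(set(doc)), 2):
            pair_counts[(a, b)] += 1
    network = defaultdict(set)
    for (a, b), count in pair_counts.items():
        if count >= min_count:          # assumed threshold to limit noise
            network[a].add(b)
            network[b].add(a)
    return network


def expand_query(query_terms, network):
    """Return (strong, weak) expansions: AND vs OR over the co-occurrence sets."""
    cooc_sets = [network.get(term, set()) for term in query_terms]
    if not cooc_sets:
        return set(), set()
    strong = set.intersection(*cooc_sets) - set(query_terms)
    weak = set.union(*cooc_sets) - set(query_terms)
    return strong, weak


if __name__ == "__main__":
    docs = [["fiets", "wiel", "band"], ["fiets", "wiel", "zadel"],
            ["auto", "wiel", "band"], ["fiets", "band", "zadel"]]
    network = build_cooccurrence_network(docs)
    print(expand_query(["fiets", "band"], network))
```

Whether the expansion terms are used to retrieve additional documents or only to rerank the existing result list is a separate choice, as noted above.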
===3 Results on last year’s collection===
All variations of both strategies were tested on the topics of last year. The result files that were produced were analysed using the trec_eval program. This section describes the outcome of that analysis.

====Strategy 1====
Table 1 shows a few of the best results we achieved using the first strategy. Given that the average precision scores of the top 5 participants on the 2001 collection were 0.3917, 0.3844, 0.3775, 0.3497 and 0.2833 [1], it is clear that the new strategy improved our results drastically. We did not have time to do a proper statistical analysis of the results to determine which values of the different parameters gave the best performance. Still, the results seem to indicate a trend: the best results are achieved by using the title and description fields of the query, leaving the narrative out, and giving stemmed variants of the query words a relative weight of 0.6 (so that two stemmed keyword matches are slightly more important than one exact match).

{|
|+ Table 1: Retrieval results on the 2001 collection
! Run !! Relevant !! Rel_ret !! Average prec. (non-interp.) !! R-precision
|-
| 279 || 1224 || 1006 || 0.3177 || 0.3287
|-
| 87 || 1224 || 1007 || 0.3176 || 0.3351
|-
| 71 || 1224 || 1017 || 0.3153 || 0.3225
|-
| Official 2001 submission || 1224 || 879 || 0.1640 || 0.1803
|}

====Strategy 2====
The results produced with the second search strategy showed no such improvement. On the contrary, every variation of this strategy performed worse with the co-occurrence enrichment than without it. With hindsight it is not hard to think of an explanation for why this was bound to happen, although we would have to devise further experiments to verify that the explanation is the right one. Currently we are inclined to think that, since co-occurrence information describes which words frequently occur together in a document, the co-occurrences of the query words are already bound to appear in the top part of the results. That would mean that using this information to expand the results does not improve recall much, whereas the noise introduced by the expansion leads to the retrieval of more irrelevant results.

===4 Our results this year===
To produce our results for this year we used several variations of the first strategy that performed reasonably well on last year’s topics. We decided not to use the second strategy because it produced such poor results. Although our best results on the topics of last year were quite good, the comparisons to the median suggest that we did not perform as well on this year’s topics. The difference is large enough to be more than somewhat surprising, but unfortunately we did not have time to find out what caused this behaviour.

===5 Conclusions===
Our success in improving the results for the topics of last year with such a simple strategy gives us confidence that we are on the right track. For next year we might experiment with several extensions to this baseline algorithm, for example:
• Blind (negative) relevance feedback on the lowest-ranked retrieved results, as used last year by, among others, McNamee and Mayfield [3].
• Weighting the different query words based on word category, corpus frequency and language frequency.
• More detailed experiments with the optimal settings for stemming and compound word searching.

===6 References===
[1] Carol Peters (ed.). Working Notes for the CLEF 2001 Workshop. Darmstadt, Germany, September 2001.
[2] Peter van der Weerd and Wilfred Blom. First Experiments with CLEF. MediaLab BV, Schellinkhout. In [1].
[3] Paul McNamee and James Mayfield. APL Experiments at CLEF: Translation Resources and Score Normalization. Johns Hopkins University, USA. In [1].