=Paper=
{{Paper
|id=Vol-1172/CLEF2006wn-adhoc-ZazoEt2006
|storemode=property
|title=REINA at CLEF 2006 Robust Task: Local Query Expansion Using Term Windows for Robust Retrieval
|pdfUrl=https://ceur-ws.org/Vol-1172/CLEF2006wn-adhoc-ZazoEt2006.pdf
|volume=Vol-1172
|dblpUrl=https://dblp.org/rec/conf/clef/RodriguezFB06
}}
==REINA at CLEF 2006 Robust Task: Local Query Expansion Using Term Windows for Robust Retrieval==
REINA at CLEF 2006 Robust Task: Local Query Expansion Using Term Windows for Robust Retrieval

Angel Zazo, Carlos G. Figuerola, and José Luis A. Berrocal
REINA Research Group - Universidad de Salamanca
C/ Francisco de Vitoria 6-16, 37008 Salamanca, SPAIN
http://reina.usal.es

Abstract

This paper describes our work at the CLEF 2006 Robust task. This task is an ad-hoc task that explores methods for stable retrieval by focusing on poorly performing topics. We carried out experiments for all subtasks: monolingual (EN, ES, FR and IT), bilingual (IT→ES) and multilingual (ES→[EN ES FR IT]) retrieval. For monolingual retrieval we focused our work on local query expansion, i.e. using only the information from the retrieved documents. External corpora, such as the Web, were not used. Our document retrieval system is simple; it is based on the vector space model. Several local expansion techniques were applied to the training topics. The best improvement was achieved using association thesauri, which were constructed from co-occurrence relations within term windows rather than within the complete document. This technique is effective and can be implemented easily, with little parameter tuning. Our mandatory runs (title+description topic fields) obtained good positions in all the monolingual subtasks in which we participated. For bilingual retrieval two machine translation programs were used to translate the topics from Italian into Spanish, and both translations were joined before searching. The same expansion technique was also applied. Our mandatory run obtained the top rank in the bilingual subtask. For multilingual retrieval we used the same procedure to obtain the retrieval list for each target language, and we combined the lists with the MAX-MIN data fusion method. In this subtask our mandatory run was in the lower part of the ranking of runs.

Categories and Subject Descriptors

H.3.1 [Content Analysis and Indexing]: Indexing methods, Thesauruses; H.3.3 [Information Search and Retrieval]: Query formulation, Relevance feedback; H.3.4 [Systems and Software]: Performance evaluation; I.2.7 [Natural Language Processing]: Machine Translation

General Terms

Measurement, Performance, Experimentation

Keywords

Robust Retrieval, Query Expansion, Term Windows, Association Thesauri, CLIR, Machine Translation

1 Introduction

Robust retrieval tries to obtain stable performance over all topics by focusing on poorly performing topics. Robust tracks were carried out at TREC 2003, 2004 and 2005 (Voorhees, 2003, 2004, 2005) for monolingual retrieval, but not for cross-language information retrieval. The users of an information retrieval system do not know concepts such as average precision or recall. They simply use the system, and they usually remember failures better than successes; failures decide whether a system will be used again. Robustness ensures that all topics reach a minimum level of effectiveness. In information retrieval the mean of the average precision (MAP) has traditionally been used to measure system performance, but poorly performing topics have little influence on MAP. At TREC, the geometric average (rather than MAP) turned out to be the most stable evaluation method for robustness (Voorhees, 2004). The geometric average (GMAP) has the desired effect of emphasizing scores close to 0.0 (the poor performers) while minimizing differences between larger scores. In the CLEF 2006 Ad-hoc track a new robust task was introduced.
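To make the difference concrete, the following small Python snippet (our own illustration, not part of the evaluation software) compares MAP and GMAP on a set of hypothetical per-topic average precision values; the small epsilon added before taking logarithms is one common convention for handling topics with an AP of zero.

```python
import math

def mean_average_precision(ap_scores):
    """Arithmetic mean of the per-topic average precision values."""
    return sum(ap_scores) / len(ap_scores)

def geometric_map(ap_scores, eps=1e-5):
    """Geometric mean of the per-topic average precision values.

    A small epsilon is added to each score so that a topic with
    AP = 0 does not collapse the whole product to zero (one common
    convention in robust-track evaluation).
    """
    log_sum = sum(math.log(ap + eps) for ap in ap_scores)
    return math.exp(log_sum / len(ap_scores))

# Hypothetical per-topic AP values, with one very poor topic (0.01)
scores = [0.45, 0.38, 0.52, 0.01]
print(f"MAP  = {mean_average_precision(scores):.4f}")  # ~0.34
print(f"GMAP = {geometric_map(scores):.4f}")            # ~0.17, pulled down by the poor topic
```

The poor topic barely moves the arithmetic mean, but it dominates the geometric mean, which is exactly the behaviour that makes GMAP suitable for measuring robustness.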
Three subtasks were designed for the robust task:

• Monolingual: for all six document languages: Dutch (NL), English (EN), German (DE), French (FR), Italian (IT) and Spanish (ES).
• Three bilingual: Italian→Spanish, French→Dutch and English→German.
• Multilingual: all six languages are allowed as the topic language.

Our research group participated in all subtasks. We carried out monolingual (EN, ES, FR, IT), bilingual (IT→ES) and multilingual (ES→[EN ES FR IT]) experiments. For each subtask two runs were submitted, one with the title and description topic fields (mandatory) and one with only the title field. All experiments were run with the same setup (except for language-specific resources).

2 Experiments

We focused our work on local query expansion, i.e. using only the information from the retrieved documents. In CLEF 2002 we used association and similarity thesauri to expand short queries: all documents of the collection (i.e. global query expansion) were used to construct the thesauri (Zazo et al., 2003). In later work (Zazo et al., 2002, 2005; Zazo, 2003) we studied several query expansion techniques in depth: local vs. global analysis, term reweighting, coefficients for expansion, etc. Some of the conclusions we drew are:

• Query expansion depends on the technique used to obtain relations between terms.
• Performance improves if the terms added to the original query have a high relation value with all terms of the original query, not with only one of them separately.
• Expansion depends on the importance (weight) of the terms added to the original query.
• Performance gains are higher for short queries than for long queries. Long queries usually define the user's information need well, and additional terms are frequently not necessary to improve performance.
• Most expansion techniques are based on local analysis, using the retrieved documents to obtain relations between terms. The performance of the first retrieval is fundamental to obtaining a high improvement with expansion: a good retrieval system (term weighting) is better than a good expansion technique.

Considering these points, a large number of experiments were carried out, using only the training topics (mandatory). Note that the topic collection of the robust task came from CLEF 2001 through CLEF 2003, but the document collections came from CLEF 2003 and were different from the CLEF 2001 and 2002 collections. It is known that retrieval performance depends not only on term weighting but also on the topic and document collections; for the same document collection and weighting scheme, two different topic collections yield different performance. So we took a daring decision: for our experiments we used only the training topics of the CLEF 2003 topic collection.

Our primary effort was monolingual retrieval. The steps of the monolingual subtask are explained below. For the bilingual and multilingual experiments we used machine translation (MT) programs to translate the topics into the document language, and then performed monolingual retrieval. The MAX-MIN data fusion method was used to join the lists in multilingual retrieval.

2.1 Monolingual Experiments

Our document retrieval system is simple. It is based on the vector space model. No additional plug-ins for word sense disambiguation or other linguistic techniques were used. We focused our work on local query expansion, i.e. using only the information from the retrieved documents; neither the complete document collection nor external corpora, such as the Web, were used.
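As an illustration of the general idea behind this kind of local expansion (a minimal sketch under our own assumptions, not the code used for the official runs), the following Python fragment counts term co-occurrences inside a sliding window over the top retrieved documents and ranks candidate expansion terms by requiring a positive association with every term of the original query. The window size, the number of feedback documents and the normalized co-occurrence measure are all assumptions made for the example.

```python
from collections import defaultdict

def build_cooccurrence(docs, window=10):
    """Count co-occurrences of term pairs inside a sliding window
    (rather than over whole documents). `docs` holds the already
    stemmed/stopped term lists of the top retrieved documents."""
    cooc = defaultdict(int)
    freq = defaultdict(int)
    for terms in docs:
        for t in terms:
            freq[t] += 1
        for i in range(len(terms)):
            for j in range(i + 1, min(i + window, len(terms))):
                if terms[i] != terms[j]:
                    cooc[tuple(sorted((terms[i], terms[j])))] += 1
    return cooc, freq

def association(a, b, cooc, freq):
    """Tanimoto-like normalized co-occurrence between two terms."""
    c = cooc.get(tuple(sorted((a, b))), 0)
    return c / (freq.get(a, 0) + freq.get(b, 0) - c) if c else 0.0

def expansion_terms(query_terms, docs, window=10, n_terms=5):
    """Rank candidate terms by their association with *all* query terms."""
    cooc, freq = build_cooccurrence(docs, window)
    scores = {}
    for cand in freq:
        if cand in query_terms:
            continue
        vals = [association(cand, q, cooc, freq) for q in query_terms]
        if all(v > 0 for v in vals):  # must relate to every query term
            scores[cand] = sum(vals)
    return sorted(scores, key=scores.get, reverse=True)[:n_terms]

# Hypothetical usage: three feedback documents and a two-term query
docs = [["robust", "retrieval", "topic", "expansion", "thesaurus"],
        ["query", "expansion", "robust", "retrieval", "term", "window"],
        ["retrieval", "robust", "window", "term", "expansion"]]
print(expansion_terms({"robust", "retrieval"}, docs, window=5))
```

The key point mirrored here is that relations are collected within term windows instead of whole documents, and that a candidate is kept only if it is related to every original query term, not to a single one.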
First, it is necessary to have a good term weighting scheme to take as the baseline, and to check whether stop word removal and stemming improve robustness. Second, we applied several local query expansion techniques to see which gave the best improvement on the least effective topics. For each test we carried out, each topic was classified into one of three categories: “OK” if its average precision was greater than the MAP of the run, “bad” if it was only greater than MAP/2, and “hard” if it was below MAP/2.
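For reference, this classification rule can be written as a small Python helper; the topic numbers and AP values below are invented for illustration, and the “hard” threshold is taken to be an average precision of MAP/2 or less.

```python
def classify_topics(ap_by_topic):
    """Label each topic relative to the run's MAP:
    'OK'   if AP > MAP,
    'bad'  if MAP/2 < AP <= MAP,
    'hard' if AP <= MAP/2."""
    map_score = sum(ap_by_topic.values()) / len(ap_by_topic)
    labels = {}
    for topic, ap in ap_by_topic.items():
        if ap > map_score:
            labels[topic] = "OK"
        elif ap > map_score / 2:
            labels[topic] = "bad"
        else:
            labels[topic] = "hard"
    return labels

# Invented per-topic AP values for a training run (MAP = 0.30)
print(classify_topics({"141": 0.62, "142": 0.20, "143": 0.05, "144": 0.33}))
# {'141': 'OK', '142': 'bad', '143': 'hard', '144': 'OK'}
```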