Introduction

Interactive Cross-Language Searching: phrases are better than terms for query formulation and refinement

Fernando Lopez-Ostenero

Julio Gonzalo

Anselmo Peñas

Felisa Verdejo

This paper summarizes the participation of the UNED group in the CLEF 2002 Interactive Track. We focused on interactive query formulation and refinement, comparing two approaches: a) a reference system that helps the user provide adequate translations for the terms in the query; and b) a proposed system that helps the user formulate the query as a set of relevant phrases and select promising phrases in the documents to enhance the query. All collected evidence indicates that the phrase-based approach is preferable: the official F measure (α = 0.8) is 65% better for the proposed system, and all users in our experiment preferred the phrase-based system as a simpler and faster way of searching.

2 Experiment Design

Our experiment consists of:

Eight native Spanish speakers with no or very low English skills.

The Spanish version of the four official iCLEF topics.

The English CLEF document collection (LA Times 1994).

A reference interactive cross-language search system based on assisted term translation (System WORDS).

A proposed system based on noun-phrase selections (System PHRASES).

The official iCLEF Latin square to combine topics, searchers and systems into 32 different searching sessions.

The official iCLEF search procedure.

In this section we describe the most relevant aspects of the above items.

2.1 Reference system

The reference system (WORDS) uses assisted query term translation and refinement throughout the search process:

Initial query formulation. The system translates all content words in the iCLEF topic using a bilingual dictionary, and displays possible English translations to the user. When the user points to an English term, the system displays inverse translations into Spanish. This information can be used by the searcher to decide which translations to keep and which translations to discard before performing the first search. Figure 1 illustrates this initial step.

Figure 2. A) Colour codes in the ranked list indicate already judged documents. B) Clicking on a Spanish term in the document takes the user to the source English keyword matched.

Cross-Language search. The system performs a monolingual search of the LA Times collection with the English terms selected by the user.

Ranked document list. The ranked list of documents displays the (translated) title of the document and a colour code to indicate whether each document has already been marked as relevant, not relevant or unsure. Figure 2 A shows a retrieved ranked list.

Document selection. Instead of using Machine Translation to display the contents of a document, the system displays a cross-language summary consisting of the translation of all noun phrases in the body of the document, plus an MT (Systran Professional 3.0) translation of the title. The user can select the document as relevant, mark the document as non-relevant or unsure, or leave it unmarked.

Query refinement by selection. When a Spanish term in a document translation corresponds to an original English term already in the query, the user can point to the Spanish term (highlighted); the system then points to the English query term, allowing for deselection or selection of the English term (or some of its companion translations) or of the original Spanish term (in which case all its translations are disabled). Figure 2 B illustrates this process.

Additional query refinement. The user can also enter a single term at any time during the search. Again, the system displays its possible translations into the target language, along with their inverse translations, and permits individual selection and de-selection of translations.

2.2 Phrase-based searching

Our proposed system uses noun-phrase information throughout the cross-language assisted search process:

Initial query formulation. The system extracts noun phrases from the full iCLEF topic, filters phrases with optimal translations, and displays the resulting set of phrases for user selection.

Cross-Language search. The system automatically translates the phrases selected by the user, and performs a monolingual search in the document collection.

Ranked document list. The ranked list is identical for both systems (see reference system above).

Document selection. Again, document selection is identical for both systems (see WORDS system above).

Query refinement by term suggestion. Optimally translated noun phrases in the documents can be selected to enrich the original query. When a user clicks on a noun phrase in a document, the system automatically translates the noun phrase and performs a new monolingual search with the enlarged query, updating the list of ranked documents. This process is illustrated in Figure 3.

Additional query refinement. Identical in both systems (see system WORDS above).
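The PHRASES search loop described above (select native-language phrases, translate them automatically, search, then enlarge the query by clicking on a phrase in a document) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the aligned-phrase table, its frequencies, the toy documents and all function names are invented for illustration, and a naive substring scorer stands in for the INQUERY engine and its phrase operator. The translation step applies the selection rule stated under Query translation later in this section (the most frequent aligned phrase, plus the second one when its frequency reaches 80% of the first).

```python
from dataclasses import dataclass, field

# Hypothetical pseudo bilingual dictionary of phrases: each Spanish phrase maps
# to aligned English phrases with frequencies, sorted by decreasing frequency.
# Every entry and count here is invented for illustration.
ALIGNED = {
    "huelga de hambre": [("hunger strikes", 120), ("hunger strike", 100),
                         ("hunger striker", 30)],
    "derechos humanos": [("human rights", 900), ("human right", 40)],
}

# Toy English collection standing in for the LA Times documents.
DOCS = {
    "d1": "Prisoners began hunger strikes in Guatemala to demand human rights.",
    "d2": "The transit strike ended on Monday.",
    "d3": "A report on human rights in Europe.",
}


def translate_phrase(spanish, aligned=ALIGNED, threshold=0.8):
    """Return the most frequent aligned phrase, plus the second most frequent
    one if its frequency reaches `threshold` times that of the first."""
    candidates = aligned.get(spanish, [])
    if not candidates:
        return []
    best, best_freq = candidates[0]
    chosen = [best]
    if len(candidates) > 1 and candidates[1][1] >= threshold * best_freq:
        chosen.append(candidates[1][0])
    return chosen


@dataclass
class PhraseQuery:
    """A query kept as a list of Spanish phrases; the user never sees English."""
    spanish_phrases: list = field(default_factory=list)

    def add(self, phrase):
        # Refinement step: a phrase clicked in a document enlarges the query.
        if phrase not in self.spanish_phrases:
            self.spanish_phrases.append(phrase)

    def english_phrases(self):
        translated = []
        for phrase in self.spanish_phrases:
            translated.extend(translate_phrase(phrase))
        return translated


def search(english_phrases, documents=DOCS):
    """Toy ranking (an assumption, not INQUERY): score each document by the
    number of query phrases that occur in its text."""
    scored = []
    for doc_id, text in documents.items():
        score = sum(1 for p in english_phrases if p in text.lower())
        if score > 0:
            scored.append((score, doc_id))
    return [doc_id for score, doc_id in sorted(scored, key=lambda s: (-s[0], s[1]))]


if __name__ == "__main__":
    query = PhraseQuery()
    query.add("huelga de hambre")            # initial phrase selection
    print(search(query.english_phrases()))
    query.add("derechos humanos")            # phrase clicked in a document
    print(search(query.english_phrases()))   # list is re-ranked after the click
```

Keeping the query as a list of Spanish phrases mirrors the monolingual perspective of the system: translation happens only inside `english_phrases`, so every refinement is just another native-language phrase added to the list.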

In order to achieve such functionalities, there is a pre-processing phase using shallow Natural Language Processing techniques, which has been described in detail in [3]. The essential steps are:

Phrase indexing. Shallow parsing of two comparable collections (the CLEF Spanish and English collections in this case) to obtain an index of all noun phrases in both languages and their statistics.

Phrase alignment. Spanish and English noun phrases (up to three lemmas) are aligned for translation equivalents using only a bilingual dictionary and statistical information about phrases (see [3] for details). As a result of this step, aligned phrases receive a list of candidate phrase translations in decreasing order of frequency. The result is a pseudo bilingual dictionary of phrases that is used in all other translation steps. The statistics for the CLEF English-Spanish collection can be seen in Table 1.

Figure 3. A) Clicking on best-aligned phrases incorporates them into the query. B) Results of clicking the phrase "huelga de hambre en Guatemala": the phrase is added to the query and a new ranked list is displayed.

Document translation. All noun phrases are extracted and translated. Translation is performed in two steps: first, maximal aligned subphrases are translated according to the alignment information; then, the rest of the terms are translated using an estimation that selects target terms which overlap maximally with the set of related subphrases.

Only one additional step is required at searching time:

Table 1. Phrase sets: Spanish, 2 lemmas; Spanish, 3 lemmas; English, 2 lemmas; English, 3 lemmas.

Query translation. All Spanish phrases selected by the user are replaced by: 1) the most frequent aligned English phrase and 2) the second most frequent aligned phrase, if its frequency reaches a threshold of 80% of the most frequent one. The INQUERY phrase operator is used to formulate the final monolingual query with all English phrases. The search is then performed using the INQUERY search engine.

Every searcher performed 4 searches, one per iCLEF topic, alternating systems and topics according to the iCLEF Latin square design. The time for each search was 20 minutes, and the overall time per searcher was around three hours, including training, questionnaires and searches (see [2] for details).

For every user/topic/system combination, the following data were collected:

The set of documents retrieved by the user, and the time at which every selection was made.

The ranked lists produced by the system in each query refinement.

The questionnaires filled in by the user.

An observational study of the search sessions.

3 Results

3.1 Official F (α = 0.8) scores

The official iCLEF score for both systems is F with α = 0.8, which combines precision and recall over the set of manually retrieved documents, favoring precision. The results of our experiment can be seen in Table 2. Our proposed system (PHRASES) improves on the reference system (WORDS) by 65%. A more detailed per-topic analysis shows that topic 3 was too difficult and did not contribute to the results (no searcher found relevant documents with either system). All the other topics receive a better F measure with the PHRASES system than with the WORDS system. The difference is not very high for topics 1 and 2, but it is very pronounced for topic 4, which seemed easy for system PHRASES and very difficult for system WORDS.
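To make the precision-favoring behaviour of the score concrete, the following Python sketch computes an unbalanced F measure. It assumes the usual van Rijsbergen form F = 1 / (α/P + (1 - α)/R) with α = 0.8, and the convention that F is 0 when precision or recall is 0 (as when no relevant document is found); both the formula as written here and the function names are our assumptions, not taken from the text above.

```python
def precision_recall(selected, relevant):
    """Precision and recall of the documents a searcher selected,
    measured against the set of relevant documents."""
    selected, relevant = set(selected), set(relevant)
    hits = len(selected & relevant)
    precision = hits / len(selected) if selected else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def f_alpha(precision, recall, alpha=0.8):
    """Unbalanced F measure, assumed form: 1 / (alpha/P + (1 - alpha)/R).
    With alpha = 0.8, precision weighs more than recall.
    Returns 0.0 when either value is 0 (assumed convention)."""
    if precision <= 0.0 or recall <= 0.0:
        return 0.0
    return 1.0 / (alpha / precision + (1.0 - alpha) / recall)


if __name__ == "__main__":
    # Hypothetical session: 2 documents selected, 1 of them among 4 relevant ones.
    p, r = precision_recall(["d1", "d2"], ["d1", "d3", "d4", "d5"])
    print(p, r, f_alpha(p, r))
    # Swapping precision and recall lowers the score, since precision is favored.
    print(f_alpha(r, p))
```

Under this form, a session with precision 0.5 and recall 0.25 scores higher than one with precision 0.25 and recall 0.5, which is the asymmetry the measure is meant to capture.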

The most important expression in Topic 4 is "hunger strikes" (the description is "documents will report any information relating to a hunger strike attempted in order to attract attention to a cause"). Searchers using the PHRASES system easily select "huelga de hambre" (the Spanish equivalent) from the displayed options, and the aligned translation, which is in turn "hunger strikes", will retrieve useful documents. Searchers using the WORDS system, however, find that "huelga" (strike) and "hambre" (hunger) may receive many possible translations into English. Looking at the average F, it is clear that they do not manage to find the appropriate translations for both terms, and so fail to match relevant documents.

3.2 Additional data

Besides the official F result, there are many other sources of evidence to compare both systems: additional quantitative data (time logs, ranked results for every query refinement), questionnaires filled in by participants, and the observational study of their searching sessions. We discuss that additional evidence here.

Table 2. Official F scores for System WORDS and System PHRASES.

3.2.1 Searching behavior across time

The plot of document selections against time in Figure 4 provides interesting evidence about searching behavior:

Searchers begin selecting documents much faster with the PHRASES system (8 selections made in minute one) than with the WORDS system (the first selection is made in minute 3). The obvious explanation is that initial query formulation is very simple in the PHRASES system (select a few phrases in the native language), and time consuming in the WORDS system (examining many foreign-language candidate translations per term and selecting them using inverse dictionary evidence).

The initial precision (i.e. the precision after initial query formulation) is not higher for system WORDS, in spite of the substantially longer time spent by searchers on the first query formulation. This confirms that a good initial selection of native-language phrases can provide good initial translations of the topic terms.

Searchers perform many more query refinements with the PHRASES system, confirming that it is easier to enhance the query using phrases selected from documents.

Searchers obtain occasional precision figures of 1, .95, .90, etc. using the PHRASES system, while the highest precision obtained with WORDS is .75 for topic 1, searcher 1.

Overall, the additional quantitative data also support our initial hypothesis.

3.2.3 Analysis of questionnaires

The answers supplied by the eight searchers strongly support our hypothesis. All of them stated that the PHRASES system was easier to learn, easier to use and better overall. They appreciated the ability to select phrases rather than individual terms, and most of them added that it was much better not to see English terms at any moment. A general complaint was that the dictionary had too many senses for each term.

3.2.4 Observational study

The careful observation of searchers' behavior is in agreement with the above results. Some points are worth commenting on:

Users get discouraged by terms that have many alternative translations in the WORDS system. Even if such a term is important for the topic, they try to avoid it.

Selecting foreign-language terms is perceived as a hard task; when no relevant documents are found after a few iterations, users get discouraged with the WORDS system.

The refinement loop works well for the PHRASES system once relevant documents begin to appear. However, if relevant documents do not appear soon, the initial query refinements are not obvious and both systems are equally hard.

The automatic translation of phrases may be harmful when the aligned equivalent is incorrect. This is the case of "búsqueda de tesoros", which does not receive a correct translation ("treasure hunting") even though it is the most important concept for Topic 2. The problem is that users do not detect that the translation is incorrect; they simply think that there is no match in the collection for such a concept.

The difficulty of topic 3 (campaigns against racism in Europe) comes from the fact that the LA Times collection does not refer to any of such campaigns as generically "European", and the overwhelming majority of documents about racism are US-centered.

4 Conclusions

We have obtained several kinds of evidence (quantitative data, user opinions and an observational study) that a phrase-based approach to cross-language query formulation and refinement, without user-assisted translation, can be easier to use and more effective than assisted term-by-term translation. Of course, this is not an absolute conclusion, if only because our reference system offered only crude help for term-by-term translation (inverse translations using a bilingual dictionary). More sophisticated translation assistance would probably narrow the differences between the approaches. But we believe that a valid conclusion, in any case, is that language barriers are perceived as a strong impediment by users, and that it is worth studying strategies for Cross-Language Search Assistance that keep a monolingual perspective for the user.

Figure. Iterative rankings for Topic 1 and Topic 2, by searcher and number of query refinements, for System PHRASES and System WORDS.

[1] Erbach, Gunter Neumann, and Hans Uszkoreit. Mulinex: Multilingual indexing, navigation and editing extensions for the world-wide web. In AAAI Symposium on Cross-Language Text and Speech Retrieval, 1997.

[2] Gonzalo and Oard. The CLEF 2002 interactive track. In Proceedings CLEF 2002, 2002.

[3] Lopez-Ostenero, Gonzalo, A. Peñas, and Verdejo. Noun-phrase translations for cross-language document selection. In Proceedings of CLEF 2001, 2001.

[4] Ogden, Cowie, Davis, Ludovic, Nirenburg, Molina-Salgado, and Sharples. Keizai: An interactive cross-language text retrieval system. In Proceedings of the MT SUMMIT VII Workshop on Machine Translation for Cross Language Information Retrieval, 1999.