
be associated with the document they are found in. A response can be either an [answer-string, docid] pair or the string "NIL" when the system does not find a correct answer in the document collection. The "NIL" string is considered correct if no answer is known to exist in the document collection; otherwise it is judged as incorrect. Two different kinds of answers are accepted: the exact answer or a 50-byte string that should contain the exact answer.

Our participation was restricted to the Spanish monolingual task in the category of exact answers. Although we have experience in past TREC competitions [4-6], we decided to build a new system, mainly because of the big differences between the English and Spanish languages. Moreover, we designed a very simple approach (1 person-month) that will facilitate later error analysis and will allow detecting those basic language-dependent characteristics that make Spanish QA different from English QA.

This paper is organised as follows: Section 2 describes the structure and operation of our Spanish QA system. Afterwards, we present and analyse the results obtained at the CLEF QA Spanish monolingual task. Finally, we draw initial conclusions and discuss directions for future work.

Our QA system is structured into the three main modules of a general QA system architecture: question analysis, passage retrieval and answer selection. Question analysis is the first stage in the QA process. This module processes the questions formulated to the system in order to detect and extract the useful information they contain. This information is represented in a form that can be easily processed by the remaining modules. The passage retrieval module accomplishes a first selection of relevant passages. This process is carried out in parallel, retrieving relevant passages from the Spanish EFE document collection and from the Spanish pages on the World Wide Web. Finally, the answer selection module processes the relevant passages in order to locate and extract the final answer. Figure 1 shows the system architecture.

The question analysis module carries out two main processes: answer type classification and keyword selection. The former detects the type of information that the question expects as answer (a date, a quantity, etc.) and the latter selects those question terms (keywords) that will allow locating the documents that are likely to contain the answer. These processes are performed using a simple, manually developed set of lexical patterns. Each pattern is associated with its corresponding expected answer type.

The passage retrieval stage is accomplished in parallel using two different search engines: IR-n [3] and Google. IR-n is a passage retrieval system that uses groups of contiguous sentences as the unit of information. From a QA perspective, this passage extraction model allows us to benefit from the advantages of discourse-based passage retrieval models, since self-contained information units of text, such as sentences, are used for building the passages. First, the IR-n system performs passage retrieval over the entire Spanish EFE document collection. In this case, the keywords detected at the question analysis stage are processed using the MACO Spanish lemmatiser [1], and their corresponding lemmas are used for retrieving the 50 most relevant passages.
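The two question analysis processes can be illustrated with a minimal sketch. The pattern set, the stop-word list and the function name `analyse_question` below are hypothetical and are not the patterns actually used by the system; only the answer types DATE, QUANTITY and OTHER are taken from the text.

```python
import re

# Hypothetical lexical patterns: each one is paired with an expected answer type.
# The real system uses its own manually developed pattern set.
PATTERNS = [
    (re.compile(r"\bcuánt[oa]s?\b", re.IGNORECASE), "QUANTITY"),
    (re.compile(r"\b(cuándo|qué año|qué fecha)\b", re.IGNORECASE), "DATE"),
]

# Small illustrative stop-word list (assumption, not the system's list).
STOP_WORDS = {
    "de", "la", "el", "los", "las", "en", "a", "es", "son", "que", "qué",
    "quién", "cuándo", "cuánto", "cuánta", "cuántos", "cuántas",
}


def analyse_question(question):
    """Return (expected answer type, keywords) for a Spanish question."""
    # Answer type classification: the first matching pattern decides the type.
    answer_type = "OTHER"
    for pattern, candidate_type in PATTERNS:
        if pattern.search(question):
            answer_type = candidate_type
            break
    # Keyword selection: keep the question terms that are not stop words.
    tokens = re.findall(r"\w+", question)
    keywords = [t for t in tokens if t.lower() not in STOP_WORDS]
    return answer_type, keywords


if __name__ == "__main__":
    print(analyse_question("¿De cuántas muertes son responsables los Jemeres Rojos?"))
```

In this sketch a question that matches no pattern falls back to the OTHER type, which mirrors the way OTHER is treated as the default category in the answer extraction stage.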

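Fig. 1. System architecture: the question goes through Question Analysis; IR-n Passage Retrieval and Google Passage Retrieval then run in parallel, each returning relevant passages; Answer Extraction produces the answers.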
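The idea of using groups of contiguous sentences as retrieval units, as IR-n does, can be sketched as follows. This is only an illustration: the window size, the overlap-count scoring and all function names are assumptions, IR-n's actual similarity measure is not described here, and lemmatisation with MACO is not reproduced (the query terms are simply assumed to be lemmas already).

```python
import re


def split_sentences(text):
    """Naive sentence splitter (assumption: sentences end in '.', '!' or '?')."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def build_passages(sentences, size=3):
    """Build overlapping passages made of `size` contiguous sentences."""
    if not sentences:
        return []
    if len(sentences) <= size:
        return [" ".join(sentences)]
    return [" ".join(sentences[i:i + size])
            for i in range(len(sentences) - size + 1)]


def rank_passages(passages, lemmas, top_k=50):
    """Rank passages by how many query lemmas they contain (illustrative score)."""
    scored = []
    for passage in passages:
        words = set(re.findall(r"\w+", passage.lower()))
        score = sum(1 for lemma in lemmas if lemma in words)
        if score > 0:
            scored.append((score, passage))
    # Stable sort: passages with equal scores keep their document order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [passage for _, passage in scored[:top_k]]


if __name__ == "__main__":
    text = "Uno dos. Tres cuatro. Cinco seis. Siete ocho."
    passages = build_passages(split_sentences(text), size=2)
    print(rank_passages(passages, ["cinco"]))
```

The default `top_k=50` matches the 50 passages retrieved from the EFE collection; every other parameter is a placeholder.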

Fig. 2. Question analysis example

2.3 Answer extraction

   (b) Quantities, dates and proper noun sequences are detected and merged into unique expressions.
   (c) Every term or merged expression in the relevant sentences is considered a candidate answer.
   (d) Candidate answers are filtered. This process gets rid of those candidates that start or finish with a stop word or contain a question keyword.
   (e) From the remaining candidate set, only those whose semantic type matches the expected answer type are selected. When the expected answer type is OTHER, only proper noun phrases are selected as final candidate answers. Figure 3 shows (in boldface) the selected answer candidates for question 103.

3. Candidate answer combination. Each answer candidate is assigned a score that measures its probability of being the correct answer (answer frequency). As the same candidate answer can probably be found in different relevant sentences, the candidate answer set may contain repeated elements. Our system exploits this fact by relating candidate redundancy with answer correctness as follows:
   (a) Repeated candidate answers are merged into a unique expression that is scored according to the number of times this candidate appears in the candidate answer set.
   (b) Shorter expressions are preferred as answers to longer ones. This way, terms in long candidates that appear themselves as answer candidates boost shorter candidate answer scores by adding the long candidate scores to the frequency value obtained by the shorter ones.

4. Web evidence addition. All previous processes may optionally be performed in parallel for retrieving answers from web documents. Therefore, at this moment the system has two lists of candidate answers: one obtained from the EFE document set and another from the available Spanish web documents. If web retrieval has been activated, the candidate answer lists are merged. This process consists of increasing the answer frequency of the EFE list candidates by adding their corresponding frequency values obtained on the web list. This way, candidates appearing only in the web list are discarded.

5. Final answer selection. Answer candidates from the previous steps are given a final score (answer score) that measures two circumstances: (1) their redundancy through the answer extraction process (answer frequency) and (2) the context they have been found in (sentence score). As the same candidate answer may be found in different contexts, an answer keeps the maximum score over all the contexts it appears in. The final answer score is computed as follows:

   answer score = sentence score × answer frequency    (1)

   Answers are then ranked according to their answer score and the first three answers are selected for presentation. Among the candidate answers for question 103 (example in Figure 3), the system selects "un millón" (one million) as the final answer.

Question 103: ¿De cuántas muertes son responsables los Jemeres Rojos? (How many deaths are the Khmer Rouge responsible for?)

First retrieved passage from EFE Collection:
<DOCNO> EFE19940913-06889
... explotan los Jemeres Rojos, quienes no les preocupa que sus ideas no sean respetadas por la comunidad internacional, que los acusa de ser los responsables de la muerte de más de un millón de camboyanos durante el genocidio de 1975-1978.

First retrieved passage from the World Wide Web:
<DOCNO> 1 Google
Los Jemeres Rojos fueron responsables de más de un millón de muertes, mataron al menos a 20.000 presos políticos y torturaron a cientos de miles de personas.

Fig. 3. First retrieved passages for question 103

3 Results

We submitted two runs for the exact answer category. The first run (alicex031ms) was obtained by applying the whole system described above, while the second run (alicex032ms) performed the QA process without activating Web retrieval. Table 1 shows the results obtained for each run.

Result analysis may not be as conclusive as we would desire, mainly due to the simplicity of our approach. Besides, the lack of the correct answers for the test questions at this moment does not allow us to perform a proper error analysis. Anyway, the results obtained show that using the World Wide Web as an external resource increases the percentage of correct answers retrieved by five points. This fact confirms that the performance of QA systems for languages other than English can also benefit from this resource.
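The candidate filtering, combination, web evidence addition and final selection steps of Section 2.3 can be sketched as follows. This is an illustrative reading of those steps, not the system's code: all function names are hypothetical, the "shorter inside longer" test is assumed to mean a contiguous run of terms, and the final score is taken to be the product of the maximum sentence score and the answer frequency. The sentence scores themselves are treated as given inputs, and the numbers in the usage example are invented.

```python
from collections import Counter


def filter_candidates(candidates, stop_words, keywords):
    """Drop candidates that start or end with a stop word or contain a keyword."""
    kept = []
    for cand in candidates:
        terms = cand.lower().split()
        if not terms:
            continue
        if terms[0] in stop_words or terms[-1] in stop_words:
            continue
        if any(term in keywords for term in terms):
            continue
        kept.append(cand)
    return kept


def contains(long_terms, short_terms):
    """True if short_terms occurs as a contiguous run inside long_terms."""
    n = len(short_terms)
    return any(long_terms[i:i + n] == short_terms
               for i in range(len(long_terms) - n + 1))


def combine_candidates(candidates):
    """Merge repeated candidates, then boost shorter ones found in longer ones."""
    freq = Counter(candidates)          # frequency of each distinct candidate
    boosted = dict(freq)
    for short in freq:
        short_terms = short.split()
        for long_cand, long_freq in freq.items():
            long_terms = long_cand.split()
            if len(long_terms) > len(short_terms) and contains(long_terms, short_terms):
                boosted[short] += long_freq   # add the long candidate's score
    return boosted


def add_web_evidence(efe_freq, web_freq):
    """EFE candidates gain their web frequency; web-only candidates are dropped."""
    return {cand: f + web_freq.get(cand, 0) for cand, f in efe_freq.items()}


def select_answers(freq, sentence_scores, top_n=3):
    """Score = max sentence score * answer frequency; return the top_n answers."""
    scored = {cand: max(sentence_scores[cand]) * f for cand, f in freq.items()}
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


if __name__ == "__main__":
    # Invented data, for illustration only.
    efe = combine_candidates(["un millón", "un millón", "más de un millón"])
    merged = add_web_evidence(efe, {"un millón": 2, "20.000": 1})
    contexts = {"un millón": [0.5, 0.8], "más de un millón": [0.4]}
    print(select_answers(merged, contexts))
```

Keeping the web frequencies as a pure addition to the EFE list means that web retrieval can reorder answers but never introduce an answer without a supporting EFE document, which is consistent with the requirement that every answer be paired with a docid.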

Table 1. Spanish monolingual task results

               Strict                Lenient
Run            MRR      % Correct    MRR      % Correct
alicex032ms    0.2966   35.0         0.3175   38.5
alicex031ms    0.3075   40.0         0.3208   43.5

- Question analysis. Since the same question can be formulated in very diverse forms (interrogative, affirmative, using different words and structures, ...), we need to study aspects such as recognizing equivalent questions regardless of the speech act or of the words, syntactic and semantic inter-relations or idiomatic forms employed.
- Answer taxonomy. An important part of the process of question interpretation resides in the system's ability to relate questions to the characteristics of their respective answers. Consequently, we need to develop a broad answer taxonomy that enables multilingual answer type classification, probably using the EuroWordNet (4) semantic net structure.
- Passage Retrieval. An enhanced question analysis will improve passage retrieval performance by including question expansion techniques that enable retrieving passages including relevant information expressed with terms that are different (but equivalent) to those used for question formulation.
- Answer Extraction. Integrating named-entity taggers. Using a broad answer taxonomy involves using tools capable of identifying the entity that a question expects as answer. Therefore we need to integrate named-entity tagging capabilities that allow narrowing down the number of candidates to be considered for answering a question.

(4) http://www.dcs.shef.ac.uk/nlp/funded/eurowordnet.html

References

1. Jordi Atserias, Josep Carmona, Irene Castellón, Sergi Cervell, Montse Civit, Lluís Màrquez, M.A. Martí, Lluís Padró, Roser Placer, Horacio Rodríguez, Mariona Taulé, and Jordi Turmo. Morphosyntactic Analysis and Parsing of Unrestricted Spanish Text. In Proceedings of First International Conference on Language Resources and Evaluation. LREC'98, pages 1267-1272, Granada, Spain, 1998.
2. John Burger, Claire Cardie, Vinay Chaudhri, Robert Gaizauskas, Sanda Harabagiu, David Israel, Christian Jacquemin, Chin-Yew Lin, Steve Maiorano, George Miller, Dan Moldovan, Bill Ogden, John Prager, Ellen Riloff, Amit Singhal, Rohini Shrihari, Tomek Strzalkowski, Ellen Voorhees, and Ralph Weishedel. Issues, Tasks and Program Structures to Roadmap Research in Question & Answering (Q&A). http://www-nlpir.nist.gov/projects/duc/papers/qa.Roadmap-paper_v2.doc, 2000.