<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Priberam's question answering system in a cross-language environment</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Adán Cassan</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Helena Figueira</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>André Martins</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Afonso Mendes</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pedro Mendes</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Cláudia Pinto</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Daniel Vidal</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Priberam Informática</institution>
          ,
          <addr-line>Alameda D. Afonso Henriques, 41 - 2</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>Following last year's participation in the monolingual question answering (QA) track of CLEF, where Priberam's QA system achieved state-of-the-art results, this year we decided to take part in both the Portuguese and Spanish monolingual tasks, as well as in two bilingual (Portuguese-Spanish and Spanish-Portuguese) tasks of QA@CLEF. The architecture of our QA system relies on previous work done for a multilingual semantic search engine developed in the framework of TRUST, where Priberam was responsible for the Portuguese module. Unlike TRUST, however, which used a third-party indexing engine, our QA system is based on the indexing technology of LegiX, Priberam's legal information tool, whose indexing engine was adapted to index semantic information, ontology domains, question categories and other specificities for QA. Given the multilingual platform on which the system has been developed and tested, as well as the results obtained so far, it seemed natural to assess its language independence. To that end, we have extended it to another language, Spanish, thus demonstrating the applicability of the system architecture. This paper describes the improvements and changes implemented in Priberam's QA system since its last CLEF participation, detailing the work involved in its cross-lingual extension and discussing the results of the runs submitted for evaluation.</p>
      </abstract>
      <kwd-group>
        <kwd>Question answering</kwd>
        <kwd>Questions beyond factoids</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Priberam took part in the 2005 CLEF campaign, introducing its QA system for Portuguese in
the monolingual task. This year, encouraged by last year’s results [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], we decided to participate
in the same track, but we extended the participation to the Spanish monolingual and bilingual
(Portuguese-Spanish and Spanish-Portuguese) tasks.
      </p>
      <p>
        The architecture of our QA system remains roughly the same, as detailed in [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ]: after
the question is submitted, it is categorised according to a question typology. An internal query
retrieves a set of potentially relevant documents, containing a list of sentences related to the
question. Sentences are weighted according to their semantic relevance and similarity to the
question. Next, through specific answer patterns, these sentences are examined once again and the
parts containing possible answers are extracted and weighted. Finally, a single answer is chosen
among all candidates.
      </p>
      <p>This year, we focused on the handling of temporally restricted questions, the addition of another
language, and the adaptation of the system to a cross-language environment. Priberam relied on
the scalability of the system’s architecture to cope with the inclusion of another language module,
Spanish. Currently, two M-CAST<sup>3</sup> partners, the University of Economics, Prague (UEP) and TiP,
are also using this framework to develop Czech and Polish language modules. The purpose of this
year’s participation in QA@CLEF was to evaluate the language independence of the system, as
well as its performance in a bilingual context.</p>
      <p>The next section gives an overview of the work done for the inclusion of another language,
describing the development of the Spanish module. Section 3 addresses the various improvements
in Priberam’s QA system architecture and depicts the cross-language scenario. Section 4 discusses
both the monolingual and bilingual results of the system at this year’s QA@CLEF campaign, and
section 5 concludes with future guidelines.</p>
    </sec>
    <sec id="sec-2">
      <title>Addition of Spanish</title>
      <p>
        Taking advantage of the company’s natural language processing (NLP) technology and workbench
[
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ], it was somewhat straightforward to build a new language module for Spanish, similar to
the Portuguese one. The QA system architecture was designed to be language independent, and
by using the same software tools, like Priberam’s SintaGest, new language modules can easily
be implemented and tested. This means that only the language resources (lexicon, thesaurus,
ontology, QA patterns) have to be adapted or imported to be in conformity with the existing
NLP tools. With a team of four people, the work on the lexicon took us about three months to
complete, because manual work was involved, and the adaptation and development of the QA
rules took about two months, as it had to be tested while it was being developed.
      </p>
      <p>
        The Spanish ontology was added to the common multilingual ontology, in joint work by
Priberam and Synapse Développement. As detailed in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], this multilingual taxonomy groups over
160 000 words and expressions through their conceptual domains organised in a four level tree with
3 387 terminal nodes.
      </p>
      <p>Priberam started by acquiring a Spanish lexicon and converting it to the same format and
specifications as the Portuguese one. After loading the existing lexical information into a database,
we had to establish equivalences between POS categories, standardise them and classify all the lexical
entries, so that the Spanish lexicon could be used with the tools that were used to build the Portuguese
module. New entries were also inserted in the lexicon, mainly proper nouns, such as toponyms
and anthroponyms. This was particularly important for the recognition of named entities (NEs).</p>
      <p>Unlike the Portuguese lexicon, the Spanish one does not contain any semantic features, nor
sense definitions connected to the ontology levels. The semantic classification has already been
started but we still have to define senses for each entry in the lexicon and link each one of them
to the levels of the ontology. This future work on the Spanish lexicon will hopefully lead to similar
results for both language modules. There is also still no thesaurus for Spanish, which may influence
the retrieval of documents and sentences that contain synonyms of the question’s keywords.</p>
      <p>3 M-CAST (Multilingual Content Aggregation System based on TRUST Search Engine) is a European
Commission co-financed project (EDC 22249 M-CAST), whose aim is the development of a multilingual infrastructure
enabling content producers to access, search and integrate the assets of large multilingual text (and
multimedia) collections, such as internet libraries, resources of publishing houses, press agencies and scientific databases
(http://www.m-cast.infovide.pl).</p>
      <p>Some relations between words were semi-automatically established, such as derivation
relations (e.g. caracterizar/caracterización), names of toponyms and their gentilics (e.g.
Australia/australiano), currencies (e.g. Grecia/euro) and cities (e.g. España/Madrid). Other
relations, such as hyperonymy, hyponymy, antonymy and synonymy, still need to be implemented in
the Spanish lexicon and improved in the Portuguese one.</p>
      <p>
        The system’s overall design remained unchanged for Spanish; the question patterns, answer
patterns and question answering patterns [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] were adapted to fit the new language module. The
typology of the 86 question categories was not altered, since it is language independent. Due to
the syntactical similarities between the two Romance languages, many of the Portuguese patterns
remained applicable. Groups of semantically related words and question identifiers had to be
translated and revised, which was very helpful to improve and fix some errors of the Portuguese
language module.
      </p>
      <p>
        For the Spanish module, some of the Portuguese contextual rules, such as the ones for
morphological disambiguation and for NE recognition, were rewritten and adapted as well. Again, the
similarity between Portuguese and Spanish allowed us to adapt much of the work done previously
for Portuguese without having to start everything from scratch. Here the work was mostly done
with constants and entity identifiers [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], whereas the detection rules were just adapted in some
cases.
      </p>
      <p>
        Like in Portuguese, Spanish morphological disambiguation is done in two stages: first, the
contextual rules defined in SintaGest are applied; then, remaining ambiguities are suppressed
with a statistical POS tagger based on a second-order hidden Markov model [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ]. For training,
we used a corpus previously disambiguated with SVMTool<sup>4</sup> [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p><bold>Cross-language and improvements of the system architecture</bold></p>
      <p>The five main steps of Priberam’s QA system are:</p>
      <list list-type="bullet">
        <list-item><p>The indexing process, in which a large set of files in text format is analysed and index keys
for morphologically disambiguated lemmas, question categories and ontology domains are
created;</p></list-item>
        <list-item><p>The question analysis, in which a question is parsed, categories for the question are
determined and pivots and other search keys are extracted;</p></list-item>
        <list-item><p>The document retrieval, in which a query is made to the index database and a set of document
sentences is retrieved;</p></list-item>
        <list-item><p>The sentence retrieval, in which each sentence previously retrieved is parsed and given a
score to express its likelihood of containing an answer;</p></list-item>
        <list-item><p>The answer extraction, in which a unique answer is selected from the best-scored sentences,
by means of extraction patterns.</p></list-item>
      </list>
      <p>This year some minor changes were introduced regarding: (i) the handling of temporally
restricted questions, (ii) the final validation of the extracted answers, and (iii) the adaptation of
the system to a cross-language environment. The next subsections describe these three topics in
more detail.</p>
      <p>4 SVMTool is developed by the TALP Research Center NLP group of the Universitat Politècnica de Catalunya
(http://www.lsi.upc.es/~nlp/SVMTool/).</p>
      <p>Additionally, other small improvements were made, such as the implementation of a
priority-based scheme to make question categorisation more assertive, and the use of inexact matching
techniques (based on the Levenshtein distance) for partial matching of proper nouns in the
sentence retrieval module. The latter was done by adapting existing technology from the Portuguese spell
checker<sup>5</sup>. Future work will address extending the use of inexact matching techniques to the
document retrieval module, which will prevent the exclusion of documents where a proper noun is
misspelled or spelled differently from the question, one of the causes of failure in the
cross-language task (cf. section 4).</p>
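      <p>As an illustration of inexact matching based on the Levenshtein distance, the following minimal sketch compares two proper nouns. It is not Priberam's implementation: the function names and the acceptance threshold max_ratio are assumptions made only for this example.</p>
      <preformat><![CDATA[
```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def inexact_match(question_noun: str, sentence_noun: str,
                  max_ratio: float = 0.25) -> bool:
    """Hypothetical acceptance rule: edit distance relative to the longer string."""
    a, b = question_noun.casefold(), sentence_noun.casefold()
    longest = max(len(a), len(b))
    if longest == 0:
        return True
    return levenshtein(a, b) / longest <= max_ratio
```
]]></preformat>
      <p>Under these assumptions, a pair that differs only by one diacritic, such as Cleópatra/Cleopatra, is accepted, whereas unrelated forms such as Volta/Tour are rejected.</p>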
      <sec id="sec-2-1">
        <title>Handling of temporally restricted questions</title>
        <p>Improving the system’s ability to handle temporally restricted questions was one of the goals for
this year’s QA@CLEF. The organisation distinguishes three types of temporal restrictions:</p>
        <list list-type="bullet">
          <list-item><p>Restriction by date: e.g., “Who was the US president in 1962?”;</p></list-item>
          <list-item><p>Restriction by period: e.g., “How many cars were sold in Spain between 1980 and 1995?”;</p></list-item>
          <list-item><p>Restriction by event: e.g., “Where did Michael Milken study before enrolling in the
University of Pennsylvania?”.</p></list-item>
        </list>
        <p>Our approach focuses on the first two types of restrictions. We take into account two sources of
information: (i) the dates of documents, and (ii) temporal expressions in the documents’ text.</p>
        <p>The documents’ dates in the collections EFE (for Spanish), Público (for European Portuguese)
and Folha de São Paulo (for Brazilian Portuguese) are an instance of metadata. To exploit this
source, we provided our system with the ability to deal with metadata information. This is a
common requirement of real-world systems in most domains (the Web, digital libraries and local
archives). Our procedure to deal with dates is extensible to other kinds of metadata, such as
the document’s title, subject, author information, etc. This is currently being done for the M-CAST
project, applied to digital libraries. In many cases, questions with restrictions require a hybrid
search procedure, using both metadata Boolean-like queries and NLP techniques. The same hybrid
behaviour is required to handle temporally restricted questions. During indexation, our system
finds the piece of metadata containing the date of each document, which is then indexed.
This makes the system able to handle Boolean-like date queries (such as “select all documents
dated above 1994-10-01 and below 1994-10-15”).</p>
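        <p>A Boolean-like date query of this kind can be sketched as a range lookup over indexed document dates. The class DateIndex, its method names and the document identifiers are hypothetical, and treating both bounds as inclusive is an assumption of this example.</p>
        <preformat><![CDATA[
```python
import bisect
from datetime import date


class DateIndex:
    """Toy index from document dates to document identifiers."""

    def __init__(self, doc_dates):
        # doc_dates maps a document id to the date found in its metadata.
        self._entries = sorted((d, doc_id) for doc_id, d in doc_dates.items())
        self._dates = [d for d, _ in self._entries]

    def between(self, start, end):
        """Return the ids of all documents dated within [start, end]."""
        lo = bisect.bisect_left(self._dates, start)
        hi = bisect.bisect_right(self._dates, end)
        return {doc_id for _, doc_id in self._entries[lo:hi]}
```
]]></preformat>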
        <p>A more difficult issue is the recognition of temporal expressions in natural language text. While
absolute dates like “25th April, 1974” are easy to recognise and convert to a numeric format, the
same does not happen when the dates are incomplete (“25th April”, “April 1974”, “1974”), when
the temporal expressions are less conventional (“spring 1974”, “2nd quarter of 1974”), or when
we consider temporal deictics (like “yesterday”, “last Monday”, “20th of the current month”) that
require knowledge about the discourse date.</p>
        <p>As a first approach, we were only concerned with temporal expressions that refer to possibly
incomplete absolute dates or periods. To answer temporally restricted questions, we need to
perform operations over numeric representations of dates and periods, like date comparison (is
1974-04-25 greater, equal or lower than 1974-05-01?), translation of time units (add 8 days to
1974-04-25), or checking if a date lies in a period (is 1974-04-28 in the interval [1974-04-25,
1974-05-01]?). Although with a full numeric representation of dates these operations are exact and well
defined, the need to deal with incomplete dates requires some fuzziness.</p>
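        <p>One possible way to obtain this fuzziness, shown here only as a sketch, is to expand an incomplete date into the interval of calendar dates it is compatible with, and to answer period membership with three values instead of two. The names PartialDate and in_period and the yes/maybe/no outcome are assumptions of this example, and dates lacking a year (such as “25th April”) are not covered.</p>
        <preformat><![CDATA[
```python
import calendar
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass(frozen=True)
class PartialDate:
    """A date whose month and/or day may be unknown (the year is required)."""
    year: int
    month: Optional[int] = None
    day: Optional[int] = None

    def __post_init__(self):
        if self.day is not None and self.month is None:
            raise ValueError("a day requires a month")

    def bounds(self) -> Tuple[date, date]:
        """Earliest and latest calendar dates compatible with the known fields."""
        first_month = self.month or 1
        last_month = self.month or 12
        first_day = self.day or 1
        last_day = self.day or calendar.monthrange(self.year, last_month)[1]
        return (date(self.year, first_month, first_day),
                date(self.year, last_month, last_day))


def in_period(d: PartialDate, start: date, end: date) -> str:
    """Return 'yes', 'no' or 'maybe' for membership in [start, end]."""
    lo, hi = d.bounds()
    if start <= lo and hi <= end:
        return "yes"      # every compatible date lies in the period
    if hi < start or lo > end:
        return "no"       # no compatible date lies in the period
    return "maybe"        # only some compatible dates lie in the period
```
]]></preformat>
        <p>With this representation, 1974-04-28 is certainly inside [1974-04-25, 1974-05-01], while the incomplete date “April 1974” only possibly is, which is the kind of match that can be accepted with a lower score.</p>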
        <p>The procedure for answering a temporally restricted question is as follows: during question
analysis, the temporal expression corresponding to the restriction is recognised and converted to
the above numeric format. The document retriever module relaxes the restriction into a larger
period (starting a few days before, and ending a few days after), and applies a metadata query to
retrieve a set T1 of those documents whose dates are relevant. This “relaxation” approach seems
reasonably appropriate for newspaper corpora, since it works both for news about events that
have already occurred and for announcements of upcoming events. As an example, consider the question
“Contra que clube jogou o Marítimo a 2 de Abril de 1995?” [Against what team did Marítimo
play on the 2nd April 1995?]. Here T1 will contain all documents from the 29th March until the
5th April 1995. After this, another query is applied to retrieve a set T2 of documents containing
temporal expressions that match the original (non-relaxed) temporal restriction of the question.
Here, fuzzy matches are accepted (although with a lower score) to deal with incomplete dates.
An OR-operation of these two queries retrieves a set T = T1 ∪ T2 of documents that possibly
satisfy the restriction either because of their date or because of the temporal expressions they
contain. Finally, the usual document retrieval procedure continues as in the unrestricted case,
with the difference that the top-30 retrieved documents are constrained to belong to the set T.
These documents are then analysed at sentence level by the sentence retrieval module and only
those sentences that either have some temporal expression satisfying the restriction, or belong to
a properly dated document, are kept for answer extraction.</p>
        <p>5 The Portuguese spell checker is included in FLiP, together with a grammar checker, a thesaurus, a hyphenator,
a verb conjugator and a translation assistant that enable different proofing levels (word, sentence, paragraph and
text) of European and Brazilian Portuguese. An online version is available at http://www.flip.pt.</p>
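        <p>The retrieval procedure for temporally restricted questions can be sketched as follows. This is an illustration under stated assumptions rather than the system's code: the function names are hypothetical, and the window sizes (4 days before, 3 days after) are chosen only because they reproduce the 29th March to 5th April 1995 example; the text itself only says “a few days”.</p>
        <preformat><![CDATA[
```python
from datetime import date, timedelta


def relaxed_window(day, days_before=4, days_after=3):
    """Widen a date restriction into a period around it (assumed sizes)."""
    return day - timedelta(days=days_before), day + timedelta(days=days_after)


def restricted_candidates(restriction, doc_dates, docs_with_matching_expression,
                          ranked_doc_ids, top_n=30):
    """Keep the top-ranked documents that belong to T = T1 | T2."""
    start, end = relaxed_window(restriction)
    # T1: documents whose metadata date falls in the relaxed period.
    t1 = {doc_id for doc_id, d in doc_dates.items() if start <= d <= end}
    # T2: documents containing a temporal expression matching the restriction.
    t2 = set(docs_with_matching_expression)
    allowed = t1 | t2
    # The usual ranking is preserved; only membership in T is enforced.
    return [doc_id for doc_id in ranked_doc_ids if doc_id in allowed][:top_n]
```
]]></preformat>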
      </sec>
      <sec id="sec-2-2">
        <title>Answer validation</title>
        <p>Last year the absence of a strategy for final answer validation was considered one of the major
causes of failure of our system. This year we have taken a naïve approach to this problem. In
order to strengthen the match between the question and the sentence containing the answer, we
demand the following: (i) the answering sentence must match (at least partially) all the proper
nouns and NEs in the question, and (ii) it must match a given number of nominal and verbal
pivots in the question. Here, a match is any correspondence of lemmas, heads of derivation, or
synonyms. If a candidate answer is extracted from a sentence that fails one of these two criteria,
it is discarded, although it can still be used for coherence analysis of other answers.</p>
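        <p>The two criteria can be sketched as below. Each pivot is modelled as the set of forms that count as a match (lemma, head of derivation, synonyms); the function name, this representation and the threshold min_pivot_ratio are assumptions of the example, since the required number of matching pivots is not specified here.</p>
        <preformat><![CDATA[
```python
from typing import Iterable, Sequence, Set


def is_supported(proper_nouns: Sequence[Set[str]],
                 pivots: Sequence[Set[str]],
                 sentence_forms: Iterable[str],
                 min_pivot_ratio: float = 0.5) -> bool:
    """Check whether a sentence supports an answer extracted from it."""
    sentence = set(sentence_forms)
    # Criterion (i): every proper noun or NE of the question must be matched.
    if any(forms.isdisjoint(sentence) for forms in proper_nouns):
        return False
    if not pivots:
        return True
    # Criterion (ii): enough nominal and verbal pivots must be matched.
    matched = sum(1 for forms in pivots if not forms.isdisjoint(sentence))
    return matched / len(pivots) >= min_pivot_ratio
```
]]></preformat>
        <p>As in the approach described above, each pivot is checked individually; no relation between pivots and the extracted answer is verified.</p>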
        <p>Notice that this approach ignores any relation among the question pivots and the extracted
answer; it merely checks for matches of each pivot individually. Current work is addressing a more
sophisticated strategy for answer validation through syntactic parsing. The idea is to capture the
argument structure of the question, by assigning a syntactic role to each pivot or group of pivots
in the sentence, requiring that the answer to be extracted is contained in a specific phrase, and
checking if the answering sentence actually offers enough support.</p>
        <p><bold>Adaptation of the system to a cross-language environment</bold></p>
        <p>Multilingual question answering was introduced in Priberam’s participation this year, namely
in the Portuguese-Spanish and Spanish-Portuguese tasks. The adaptation of the system to a
cross-language environment required few modifications. Moreover, unlike most approaches, it is
self-contained, in the sense that it does not require using any external software or Web searches
to perform translations. Instead, we use an ontology-based direct translation, refined by means of
a statistical corpus-based approach.</p>
        <p>
          The central piece of our system for the cross-language tasks is the multilingual ontology, also
used in last year’s CLEF for monolingual [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] and bilingual [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] purposes. The combination of the
ontology information of all TRUST languages provides a bidirectional word/expression translation
mechanism. Some language pairs are directly connected, as is the case of Portuguese and Spanish;
others are connectable using the English language as an intermediate. This allows operating in
a cross-language scenario for any pair of languages in the ontology (among Portuguese, Spanish,
French, English, Polish and Czech<sup>6</sup>). For each language pair that is directly connected, translation
scores are used to reflect the likelihood of each translation. By using this method, the system is
capable of selecting the preferential equivalent among the available translations. For instance,
in the case of the Spanish word hijo in the ontological domain [family/lineage], which has the
Portuguese translations filho (son) and criança (child), the selected translation is filho, since it has
a higher score. These scores are computed in a background task using the Europarl parallel corpus
[
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], a paragraph-aligned corpus in the official languages of the European Union, containing the
proceedings of the European Parliament. After a preliminary sentence alignment step, each aligned
sentence is processed simultaneously for the two languages under consideration. Once processing
is finished, words from one language are associated with their translations in the corresponding
sentence of the second language. The number of times one word is associated with another is recorded and
then used to calculate the translation score. This parallel corpus was also used to improve the
ontology translation database. Translations were extracted based on the co-occurrence of words
in aligned sentences using likelihood and Chi-squared criteria and then added to the database.
Presently, we are also exploiting Europarl for other purposes, like word sense disambiguation.
        </p>
        <p>6 The French, Polish and Czech parts of the ontology are the property of Synapse Développement, TiP
and the University of Economics, Prague (UEP), respectively.</p>
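        <p>The exact formula used to turn association counts into translation scores is not given here. The sketch below assumes the simplest option, a relative frequency over the recorded word associations; the function name and the toy counts in the usage note are invented for illustration.</p>
        <preformat><![CDATA[
```python
from collections import Counter, defaultdict


def translation_scores(aligned_pairs):
    """Estimate a score for each translation from word-association counts.

    aligned_pairs is an iterable of (source_word, target_word) associations
    collected from aligned sentences. The score is the relative frequency of
    the target among all associations of the source (an assumed formula).
    """
    counts = defaultdict(Counter)
    for source, target in aligned_pairs:
        counts[source][target] += 1
    scores = {}
    for source, targets in counts.items():
        total = sum(targets.values())
        scores[source] = {t: c / total for t, c in targets.items()}
    return scores
```
]]></preformat>
        <p>For example, if hijo were associated three times with filho and once with criança, filho would receive the higher score and be selected as the preferential equivalent.</p>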
        <p>
          In a cross-language environment, our QA system starts by instantiating independent language
modules for each language (in this case, Portuguese and Spanish). Suppose that a question is
asked in language X, while language Y is chosen as target. First, the question analyser performs
pivot extraction and question categorisation, using the language module for X. Then, if the
ontology has a direct connection from X to Y, each pivot is translated to language Y. Otherwise,
each pivot is translated to English, and then the language module for Y is used to translate from
English to Y. Whenever multiple translations are available, the one with the highest translation
score is chosen as default, while the others are kept as synonyms (Table 1 illustrates this process).
This is done for pivots’ lemmas, heads of derivation and synonyms; in the end, equivalent lemmas,
heads of derivation and synonyms are selected for language Y. Notice that this strategy does not
translate the whole question from X to Y, which would discard alternative translations of some
words. At this point, the language module for Y has pivots in its own language, and question
categories, which are language independent. It only needs to activate the question answering
patterns (QAPs) to be later used by the answer extraction module [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. Remember that in a
monolingual environment, the QAP activation is done via question patterns (QPs), which makes it
possible to take advantage of finer relations between the question and the answer structures that go beyond
question categories. However, this is not possible in a cross-language environment, since QPs
are connected to the language module for X and QAPs to the one for Y. For this reason, our
current approach does cross-language QAP activation via question categories: we look in language
Y for all the QAPs associated with those categories. Finally, the system proceeds as in the Y
monolingual environment, with the document retrieval module, the sentence extractor and the
answer extractor.
        </p>
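        <p>The pivot translation step can be sketched as follows. The table layout, the function name and the toy scores in the usage note are hypothetical; in particular, multiplying the two scores when English is used as an intermediate is an assumption of this example, since only directly connected pairs are said to carry translation scores.</p>
        <preformat><![CDATA[
```python
def translate_pivot(pivot, source, target, tables):
    """Translate one pivot, returning (default_translation, synonyms).

    tables maps a (source_lang, target_lang) pair to a dictionary
    {word: {translation: score}} for directly connected language pairs.
    """
    direct = tables.get((source, target))
    if direct is not None:
        candidates = dict(direct.get(pivot, {}))
    else:
        # No direct connection: go through English as an intermediate.
        candidates = {}
        to_english = tables.get((source, "en"), {}).get(pivot, {})
        from_english = tables.get(("en", target), {})
        for english, score_1 in to_english.items():
            for word, score_2 in from_english.get(english, {}).items():
                combined = score_1 * score_2  # assumed way of chaining scores
                candidates[word] = max(candidates.get(word, 0.0), combined)
    if not candidates:
        return None, []
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(candidates, key=lambda w: (-candidates[w], w))
    # The best translation is the default; the others are kept as synonyms.
    return ranked[0], ranked[1:]
```
]]></preformat>
        <p>With a direct Spanish-Portuguese table in which hijo maps to filho (0.8) and criança (0.2), the call returns filho as the default pivot and keeps criança as a synonym, so that alternative translations are not discarded.</p>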
        <table-wrap>
          <label>Table 1</label>
          <table>
            <thead>
              <tr><th>Original pivots</th><th>Translated pivots</th><th>Other translations kept as synonyms</th></tr>
            </thead>
            <tbody>
              <tr><td>recorde</td><td>récord</td><td></td></tr>
              <tr><td>mundial</td><td>mundial</td><td>universal</td></tr>
              <tr><td>salto em altura</td><td>salto de altura</td><td></td></tr>
              <tr><td>salto</td><td>salto</td><td>brinco, salto mortal, sobresalto, carrerilla, bronco, voltereta, rebote, tacón, talón, cambio brusco</td></tr>
              <tr><td>altura</td><td>momento</td><td>porte, tamaño, talla, estatura, nivel, niveles, altitud, cumbre, cima, tope, pináculo, grandeza, altura, instante, segundo</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <p>This year the CLEF organization did not provide the sets of questions classified according to the
CLEF question typology (factoid, definition, temporally restricted factoid and list). Therefore,
we divided the sets of 200 questions into five major categories regarding the type of the expected
answer: denomination/designation (DEN), definition (DEF), location/space (LOC), quantification
(QUANT) and duration/time (TEMP).</p>
        <p>Table 2. Column header: Ans. →; row header: Quest. ↓; rows: DEN, DEF, LOC, QUANT, TEMP, Total.</p>
        <p>
          The general results of Table 2 show a slight increase over last year’s results in the
Portuguese monolingual task. As for Priberam’s first participation in the Spanish monolingual
task, we achieved better results than last year’s best performing system [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. As for the
cross-language environment, the accuracy is significantly lower, and relatively similar for the
Portuguese-Spanish and Spanish-Portuguese tasks.
        </p>
        <p>Despite the fairly satisfactory results in the Spanish monolingual task, there is still a considerable
gap between the performance of the Spanish and the Portuguese modules. A few reasons
contributed to this. The semantic classification, the addition of proper nouns and the NE detection
rules for Spanish were still at an early stage, which led to a higher percentage of errors related
to the extraction of candidate answers. The absence of a thesaurus also played a part in the
document and sentence retrieval stages. Finally, one has to bear in mind that, since the addition
of the Spanish language module is recent, it has not been as extensively tested and fine-tuned.</p>
        <p>Table 3 displays the distribution of errors along the main stages of our QA system, both in
the monolingual and bilingual runs.</p>
        <p>Table 3. Column header: Question → (PT); row header: Stage ↓; rows: Document retrieval,
Extraction of candidate answers, Choice of the final answer, NIL validation, Translation, Other, Total.</p>
        <sec id="sec-2-2-2">
          <p>In the monolingual runs, most errors occur during the extraction of candidate answers. This
is related to several issues: QAPs that are badly tuned or erroneously applied due to errors in
question categorisation, difficulties in writing QAPs to extract the right answer in long sentences,
and a few errors due to the handling of morphological and lexical relations (for instance, no
connection was found between festival de cinema of PT question 82 “Que festival de cinema
atribui o ‘Urso de Ouro’?” [What film festival awards the ‘Golden Bear’?] and the NE Festival
de Cinema de Berlim of the retrieved sentence), as well as anaphoric relations (e.g. the ES
question 80 “¿En qué año murió Bernard Montgomery?” [In what year did Bernard Montgomery
die?] did not retrieve the answer “(...) su muerte, ocurrida en 1976, motivó un auténtico duelo
nacional”). The inclusion of other kinds of metadata besides the date may improve the answer
extraction performance, as suggested by PT question 133 “Onde ganharam os Abba o Festival
da Eurovisão?” [Where did Abba win the Eurovision Song Contest?], whose answer, Brighton,
should be inferred from the title of the news.</p>
          <p>As for the document retrieval stage, many errors are connected with long questions with many
pivots, which may have the effect of filtering out relevant documents while keeping others that
contain more but less important pivots. There were also issues of matching proper nouns and
NEs, which are usually the core pivots (e.g. in PT question 53 “Quem venceu a Volta a França em
1988?” [Who won the Tour de France in 1988?] the right document was not retrieved, because the
sentence of the answer contained Tour instead of Volta a França); tokenization issues (e.g. in PT
question 166 “Qual foi o resultado do Itália-Nigéria no Campeonato do Mundo de 1994?” [What
was the score of Italy-Nigeria in the World Cup of 1994?] Itália-Nigéria was considered a single
token, hence not matching Itália or Nigéria); difficulties with ambiguous NEs (e.g. in PT question
189 “Quem escreveu ‘A Capital’?” [Who wrote ‘A Capital’?], ‘A Capital’ occurs frequently in
the corpus as the name of a newspaper and less often as the title of a book). Furthermore, some
questions could only be answered with a more sophisticated indexation strategy (e.g.
the answer to ES question 124 “¿Qué zar ruso murió en 1584?” lies in a document with a list of
events: “EFEMERIDES DEL 17 DE MARZO (...) Defunciones (...) 1584.- Ivan ‘El Terrible’, zar
ruso.”, and could only be retrieved with a different indexation strategy to deal with items in a
list). Table 3 also shows that the Spanish document retrieval stage outperformed the Portuguese
one, although we were expecting the opposite behaviour since the system did not use a Spanish
thesaurus. The only explanation we could find is that the Portuguese questions were, on average,
harder to retrieve than the Spanish ones, more of them requiring anaphora resolution and difficult
matches between NEs (e.g., Isabel II/Elizabeth 2a).</p>
          <p>Our simple strategy for answer validation led to fair results. Although this year’s CLEF
organization did not provide us with the information of which questions were considered NIL,
we estimated our system’s NIL recall/precision as respectively 65%/43% (PT), 60%/34% (ES),
30%/29% (PT-ES) and 30%/18% (ES-PT).</p>
          <p>The work done for temporally restricted questions, described in subsection 3.1, allowed the
system to retrieve the right answer in 40.7% of the questions in the Portuguese run, and 32.1% in
the Spanish one. The explanation for these results is related to several different issues, mostly
in the document retrieval and answer selection stages, and not necessarily to our procedure
to handle this type of restriction. For instance, in PT question 155 “Contra que clube jogou o
Marítimo a 2 de Abril de 1995?” [Against what team did Marítimo play on the 2nd April 1995?]
the answer was not extracted because, due to tokenization, we could not retrieve the document
containing the answer, Marítimo-Tirsense, although the system correctly matched the date of the
document within the relaxed interval set by the temporal restriction.</p>
          <p>In the bilingual runs, one of the major causes of failure is translation errors,
mostly involving proper nouns and NE equivalences. For instance, in the questions “Quantos Óscares
ganhou ‘A Guerra das Estrelas’?” [How many Oscars did ‘Star Wars’ win?], “O que é a Eurovisão?” [What is Eurovision?] and “Cuándo se suicidó
Cleopatra?” [When did Cleopatra commit suicide?], the system missed the translations A Guerra das Estrelas/La guerra de las Galaxias,
Eurovisão/Eurovisión and Cleópatra/Cleopatra. Since NEs and proper nouns are generally the most
important elements for the correct extraction of answers, a failure to translate them correctly
is bound to have a negative impact on the results. Note that the ontology was not designed to
contain proper nouns and NEs, except those belonging to closed domains, such as names of countries
and other geographic entities. Frequently (as in the last two examples above) there are only
slight spelling differences between a word and its translation. This suggests dealing with the issue
by extending the use of inexact matching techniques to the document retrieval module, thus
preventing documents from being excluded because of misspelled proper nouns or NEs. This technique may
also improve monolingual usage, since in “real-world” systems proper nouns are often spelled
differently in the question and in the target documents.</p>
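          <p>The kind of inexact matching suggested above can be illustrated with a minimal sketch. It is not the technique used in the system's sentence retrieval stage, which is not detailed here: accent folding combined with the similarity ratio of Python's standard difflib module is only a stand-in, and the function names (fold, similarity, inexact_match) and the threshold of 0.8 are assumptions made for illustration.</p>
          <preformat>
```python
import unicodedata
from difflib import SequenceMatcher


def fold(text):
    """Lowercase and strip diacritics, e.g. 'Cleópatra' becomes 'cleopatra'."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def similarity(a, b):
    """Similarity ratio in [0, 1] between the accent-folded forms of a and b."""
    return SequenceMatcher(None, fold(a), fold(b)).ratio()


def inexact_match(query_term, index_terms, threshold=0.8):
    """Return the index terms similar enough to query_term.

    The threshold is an illustrative assumption, not a tuned value.
    """
    return [t for t in index_terms if similarity(query_term, t) >= threshold]
```
          </preformat>
          <p>With these assumptions, inexact_match("Eurovisão", ["Eurovisión", "Europa", "Cleopatra"]) keeps only Eurovisión, while Cleópatra and Cleopatra become identical after folding. Pairs such as Isabel II/Elizabeth 2a differ too much to be recovered this way and would still require a proper translation resource.</p>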
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Conclusions and Future Work</title>
      <p>Despite the positive results presented in the previous section, there are still a few issues that need
to be solved, in order to improve the general performance of the system. With that purpose in
mind, we are currently enriching the Spanish lexicon with the inclusion of proper nouns, semantic
features, synonyms and lexical relations.</p>
      <p>In addition, we are adding syntactic processing to question categorisation in order to capture
the argument structure of the question. This will allow more finely tuned answer extraction, by
matching the specific roles of the question arguments in the answer. This strategy should also
improve cross-language performance, taking advantage of the language independence
of that argument structure.</p>
      <p>
        Work is under way in the M-CAST project to handle other metadata
besides dates, such as the document title, its author, and its subject. A task to be addressed in
the future is the application of inexact matching techniques, currently used only in the sentence
retrieval stage, in the document retrieval module as well. This would, of course, require some changes
to the indexation scheme. As said above, we expect this to improve not only the treatment
of proper nouns and NEs in cross-language environments, but also monolingual robustness.
Finally, research is being done on word sense disambiguation. The Europarl corpus [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] is
being used to build a Portuguese sense-disambiguated corpus, taking advantage of the
multilingual ontology, which enables domain-specific translation and hence the discrimination of
word senses.
      </p>
    </sec>
    <sec id="sec-4">
      <title>Acknowledgements</title>
      <p>Priberam Informática would like to thank the partners of the NLUC consortium<sup>7</sup>, especially
Synapse Développement, the CLEF organization and Linguateca. We would also like to
acknowledge the support of the European Commission in the TRUST (IST-1999-56416) and M-CAST (EDC
22249 M-CAST) projects.</p>
      <p><sup>7</sup> See http://www.nluc.com.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Vallin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Magnini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Giampiccolo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Aunimo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Ayache</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Osenova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Penas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>de Rijke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Sacaleanu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Santos</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Sutcliffe</surname>
          </string-name>
          .
          <article-title>Overview of the CLEF 2005 multilingual question answering track</article-title>
          .
          <source>In Cross Language Evaluation Forum: Working Notes for the CLEF 2005 Workshop</source>
          (Vienna, Austria, 21-23 September),
          <year>2005</year>
          . Available at http://clef.isti.cnr.it/2005/working_notes/WorkingNotes2005/vallin05.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>C.</given-names>
            <surname>Amaral</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Laurent</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Martins</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mendes</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Pinto</surname>
          </string-name>
          .
          <article-title>Design and Implementation of a Semantic Search Engine for Portuguese</article-title>
          .
          <source>In Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC 2004)</source>
          , Lisbon, Portugal, 26-28 May, volume
          <volume>1</volume>
          , pages
          <fpage>247</fpage>
          -
          <lpage>250</lpage>
          ,
          <year>2004</year>
          . Also available at http://www.priberam.pt/docs/LREC2004.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>C.</given-names>
            <surname>Amaral</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Figueira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Martins</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mendes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mendes</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Pinto</surname>
          </string-name>
          .
          <article-title>Priberam's question answering system for Portuguese</article-title>
          .
          <source>In Cross Language Evaluation Forum: Working Notes for the CLEF 2005 Workshop</source>
          (Vienna, Austria, 21-23 September),
          <year>2005</year>
          . Available at http://clef.isti.cnr.it/2005/working_notes/WorkingNotes2005/amaral05.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>C.</given-names>
            <surname>Amaral</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Figueira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mendes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mendes</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Pinto</surname>
          </string-name>
          .
          <article-title>A Workbench for Developing Natural Language Processing Tools</article-title>
          .
          <source>In Pre-proceedings of the 1st Workshop on International Proofing Tools and Language Technologies (Patras, Greece, 1-2 July)</source>
          ,
          <year>2004</year>
          . Also available at http://www.priberam.pt/docs/WorkbenchNLP.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S.M.</given-names>
            <surname>Thede</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.P.</given-names>
            <surname>Harper</surname>
          </string-name>
          .
          <article-title>A second-order hidden Markov model for part-of-speech tagging</article-title>
          .
          <source>In Proceedings of the 37th Annual Meeting of the ACL</source>
          , Maryland: College Park, pages
          <fpage>175</fpage>
          -
          <lpage>182</lpage>
          ,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Christopher D.</given-names>
            <surname>Manning</surname>
          </string-name>
          and
          <string-name>
            <given-names>Hinrich</given-names>
            <surname>Schütze</surname>
          </string-name>
          .
          <source>Foundations of Statistical Natural Language Processing (2nd printing)</source>
          . The MIT Press, Cambridge, Massachusetts,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
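          [7]
          <string-name>
            <given-names>Jesús</given-names>
            <surname>Giménez</surname>
          </string-name>
          and
          <string-name>
            <given-names>Lluís</given-names>
            <surname>Màrquez</surname>
          </string-name>
          .
          <article-title>SVMTool: A general POS tagger generator based on Support Vector Machines</article-title>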
          .
          <source>In Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC'04)</source>
          (Lisbon, Portugal),
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>D.</given-names>
            <surname>Laurent</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Séguéla</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Nègre</surname>
          </string-name>
          .
          <article-title>Cross Lingual Question Answering using QRISTAL for CLEF 2005</article-title>
          .
          <source>In Cross Language Evaluation Forum: Working Notes for the CLEF 2005 Workshop</source>
          (Vienna, Austria, 21-23 September),
          <year>2005</year>
          . Available at http://clef.isti.cnr.it/2005/working_notes/WorkingNotes2005/laurent05.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>P.</given-names>
            <surname>Koehn</surname>
          </string-name>
          .
          <article-title>Europarl: A Parallel Corpus for Statistical Machine Translation</article-title>
          .
          <source>In Proceedings of MT Summit X (Phuket, Thailand)</source>
          , pages
          <fpage>79</fpage>
          -
          <lpage>86</lpage>
          ,
          <year>2005</year>
          . Also available at: http://www.iccs.inf.ed.ac.uk/~pkoehn/publications/europarl-mtsummit05.pdf.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>