=Paper=
{{Paper
|id=Vol-1176/CLEF2010wn-MLQA10-TannierEt2010
|storemode=property
|title=FIDJI @ ResPubliQA 2010
|pdfUrl=https://ceur-ws.org/Vol-1176/CLEF2010wn-MLQA10-TannierEt2010.pdf
|volume=Vol-1176
}}
==FIDJI @ ResPubliQA 2010==
FIDJI @ ResPubliQA'10

Xavier Tannier, Véronique Moriceau
LIMSI-CNRS, Univ. Paris-Sud, Orsay, France
xtannier, moriceau@limsi.fr

Abstract. In this paper, we present the results obtained by the FIDJI system for both the French and English monolingual evaluations at the ResPubliQA 2010 campaign. In this campaign, we focused on carrying on our evaluations concerning the contribution of our syntactic modules with this specific collection.

1 Introduction

FIDJI (Finding In Documents Justifications and Inferences) is an open-domain question-answering (QA) system for French [1] and, more recently, English. It combines syntactic information with traditional QA techniques such as named entity recognition and term weighting in order to validate answers through different documents. This paper focuses on the results obtained by FIDJI at the ResPubliQA 2010 evaluation. It first presents a brief overview of the system and of its adaptation to English. Then, the specific choices made for the campaign are detailed, and some results are finally given.

2 FIDJI

Figure 1 presents the architecture of FIDJI. The system relies on a syntactic analysis and named entity tagging of the question and of a limited number of documents for each question. This analysis is performed by the parser XIP [2], enriched with some additional specific rules. The document collection is indexed by the search engine Lucene (http://lucene.apache.org/). The index contains raw text only.

First, the system analyses the question and submits its keywords to Lucene (module A); the first 15 documents are then processed (module B). We decided to reduce the number of documents because they are rather long and their parsing would take too much time. The reason we perform this analysis online is that we aim at avoiding as much preprocessing as possible (the system is designed to explore Web collections [1]). Among these documents, FIDJI looks for sentences containing the highest number of syntactic relations of the question (module C1). Finally, answers are extracted from these sentences (module D1) and the answer type, when specified in the question, is validated (module E).

Fig. 1. Architecture of FIDJI.

The main objective of FIDJI is to produce answers which are fully validated by a supporting text (or passage) with respect to a given question. The difficulty is that an answer (or some pieces of information composing an answer) may be validated by several documents. Our approach consists in checking whether all the characteristics of a question (namely the dependency relations and the answer type) can be retrieved in one or several documents. In this context, FIDJI has to detect syntactic implications between questions and passages containing the answers, and to validate the type of the potential answer in this passage or in another document.

Since the last evaluation campaign in 2009, FIDJI has been adapted to English. Specific rules have been developed for question analysis (module A) and document processing (module B). The other modules are common to both English and French. The following examples illustrate how FIDJI extracts answers; more details concerning the system can be found in [1].
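The overall document flow can be pictured as a small retrieval-and-validation pipeline. The sketch below is only an illustration of the modules described above (the letters follow Figure 1); the function and method names, such as analyze_question, instantiate_answer_slot or type_is_attested, are our own assumptions and not FIDJI's actual API, with XIP and Lucene abstracted behind the parser and search_engine objects.

```python
# Minimal sketch of the pipeline described above (modules A to E).
# All names here are illustrative, not FIDJI's actual API: parsing (XIP) and
# retrieval (Lucene) are hidden behind the `parser` and `search_engine`
# objects supplied by the caller.

def answer_question(question, parser, search_engine, n_docs=15):
    # Module A: analyse the question, then query the index with its keywords.
    qa = parser.analyze_question(question)   # keywords, dependencies, expected type
    documents = search_engine.search(qa.keywords, limit=n_docs)

    # Module B: parse the retrieved documents online (no heavy preprocessing).
    sentences = [s for doc in documents for s in parser.parse(doc)]

    # Module C1: prefer sentences sharing the most dependency relations
    # with the question.
    sentences.sort(key=lambda s: len(qa.dependencies & s.dependencies),
                   reverse=True)

    # Module D1: instantiate the ANSWER slot in the best sentences to obtain
    # candidate answers (see the worked example in Section 2.1).
    candidates = [c for s in sentences for c in s.instantiate_answer_slot(qa)]

    # Module E: validate the expected answer type, in this passage or in
    # another document of the collection.
    return [c for c in candidates
            if qa.expected_type is None
            or search_engine.type_is_attested(c, qa.expected_type)]
```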
2.1 Example 1

Question analysis provides lemmatisation, POS tagging and dependency relations, as well as the question type and the expected answer type. For example:

Question: Quel premier ministre s'est suicidé en 1993 ? (Which Prime Minister committed suicide in 1993?)
Dependencies:
  DATE(1993)
  PERSON(ANSWER)
  SUBJ(se suicider, ANSWER)
  attribut(ANSWER, ministre)
  attribut(ministre, premier)
Question type: factoid
Expected answer type: person (specific answer type: prime minister)

The question is turned into a declarative sentence where the answer is represented by the 'ANSWER' lemma. The following sentence is selected because it contains the highest number of dependency relations:

Pierre Bérégovoy s'est suicidé en 1993. (Pierre Bérégovoy committed suicide in 1993.)
Dependencies:
  DATE(1993)
  PERSON(Pierre Bérégovoy)
  SUBJ(se suicider, Pierre Bérégovoy)

Pierre Bérégovoy instantiates the ANSWER slot of the question dependencies and becomes a candidate answer. The named entity type (person) and the first three dependencies of the question are validated in this sentence. In order to fully validate the candidate answer, the system searches for the missing dependencies (attribut(Pierre Bérégovoy, ministre) and attribut(ministre, premier)) in a single sentence of the whole document collection. These dependencies will be found in any sentence mentioning le premier ministre Pierre Bérégovoy (Prime Minister Pierre Bérégovoy) and the answer will be validated.

2.2 Example 2

For complex questions, it is obvious that answers are not always short phrases. For this reason, FIDJI provides a full passage as an answer. On these kinds of questions, the system behaves as a classical passage retrieval system, except that candidate passages are retrieved through syntactic relations and relevant discourse markers (about 100 nouns, verbs, prepositions and adjectives, manually compiled) instead of keywords only. Here is an example of a complex question:

Question: Why is the sky blue?
Dependencies: attribut(sky, blue)
Question type: complex (why)
Expected answer type: reason (reason is not a named entity, as person is in the first example, but this answer type indicates that a text explicitly explaining a reason should be preferred; in our case, by using discourse markers)

The following passage is selected because it contains all the dependency relations of the question and a causal marker:

And if the sky is blue, it is because of Rayleigh scattering...
  attribut(sky, blue)
  VMOD(be, scattering)
  PREPOBJ(scattering, because of)
  ...
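The slot instantiation and cross-document validation illustrated in Section 2.1 amount to matching sets of dependency tuples. The following sketch follows that example; the plain-tuple representation and helper names are ours, chosen only for illustration, and do not reflect FIDJI's internal data structures.

```python
# Sketch of the ANSWER-slot instantiation of Section 2.1, using plain tuples
# for dependencies. The representation and helper names are assumptions made
# for illustration, not FIDJI's internal data structures.

QUESTION_DEPS = {
    ("DATE", "1993"),
    ("PERSON", "ANSWER"),
    ("SUBJ", "se suicider", "ANSWER"),
    ("attribut", "ANSWER", "ministre"),
    ("attribut", "ministre", "premier"),
}

SENTENCE_DEPS = {   # "Pierre Bérégovoy s'est suicidé en 1993."
    ("DATE", "1993"),
    ("PERSON", "Pierre Bérégovoy"),
    ("SUBJ", "se suicider", "Pierre Bérégovoy"),
}

def instantiate(deps, candidate):
    """Replace the ANSWER slot by a concrete candidate answer."""
    return {tuple(candidate if arg == "ANSWER" else arg for arg in d)
            for d in deps}

def missing_dependencies(question_deps, sentence_deps, candidate):
    """Dependencies still to be validated elsewhere in the collection."""
    return instantiate(question_deps, candidate) - sentence_deps

# With "Pierre Bérégovoy" filling the slot, three dependencies match directly;
# the two remaining ones must be found together in one sentence of the
# collection, e.g. "le premier ministre Pierre Bérégovoy ...".
print(missing_dependencies(QUESTION_DEPS, SENTENCE_DEPS, "Pierre Bérégovoy"))
# -> {('attribut', 'Pierre Bérégovoy', 'ministre'), ('attribut', 'ministre', 'premier')}
```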
3 ResPubliQA'10 experiments

In 2009, the ResPubliQA results taught us a lot about the behaviour of our system. Other evaluations (former CLEF and Quaero campaigns) had shown that using the syntactic analysis modules for retrieving documents and extracting the answers significantly improved the results [1]. However, with the ResPubliQA evaluation set, passage extraction turned out to be much better when syntax was replaced by traditional bag-of-words techniques [3]. This is done by turning off modules C1 and D1 in Figure 1. Passage extraction is then performed by a classical selection of sentences containing a maximum of significant question keywords (module C2), and answer extraction is achieved without slot instantiation within dependencies (module D2).

The new guidelines in ResPubliQA 2010 offered us the possibility to carry on our experiments in this direction. Indeed, two different tasks were proposed this year:

- Paragraph selection (PS), similar to the 2009 task, where only the full paragraph containing the exact answer was to be returned. Passages are not arbitrary portions of text of limited length, but predefined paragraphs identified in the corpus by XML tags.
- Answer selection (AS), closer to traditional QA tasks, where systems were also required to demarcate the exact answer, supported by a full paragraph. In this latter task, judged answers can be INEXACT (good support but bad boundaries for the short answer), MISSED (good support but wrong short answer), RIGHT (good support and good answer) or WRONG.

Two runs per language were allowed. In order to continue testing our plug/unplug strategies, and to experiment with them for the first time in English, we chose the following procedure for our two runs:

1. PS task, syntactic modules turned off, leading to an approach closer to passage retrieval, which had given the system its best results last year.
2. AS task, syntactic modules turned on, in order to test whether answer extraction was effective or not on this collection.

Moreover, by adding answers with INEXACT, MISSED and RIGHT status from our AS run, we can obtain a PS run with modules turned on, which allows us to evaluate the modules on the same task.

4 Results

We present the results of 5 experiments for both French and English. The first three come from official ResPubliQA runs:

➀: AS task with syntactic modules turned on (exact answers judged as RIGHT),
➁: PS task with syntactic modules turned on (exact answers of ➀ judged as RIGHT, INEXACT or MISSED),
➂: PS task with syntactic modules turned off.

To complete the evaluation, we also ran unofficial configurations and carried out the assessment ourselves:

➃: AS task with syntactic passage retrieval turned off but answer extraction turned on (modules C2 and D1, with exact answers judged as RIGHT),
➄: PS task with passage retrieval C1 turned off but answer extraction D1 turned on (exact answers of ➃ judged as RIGHT, INEXACT or MISSED).
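The five configurations differ only in which modules of Figure 1 are active and on which task they are scored. The compact summary below is our own illustration of that experimental design; it is not an actual FIDJI configuration format.

```python
# The five experiments expressed as module toggles (module names as in
# Figure 1). This layout is ours, purely for illustration; it is not an
# actual FIDJI configuration format.

RUNS = {
    # run: (task, passage selection module, answer extraction module)
    "➀ AS, syntax on":  ("AS", "C1", "D1"),
    "➁ PS, syntax on":  ("PS", "C1", "D1"),  # derived from ➀ (RIGHT + INEXACT + MISSED)
    "➂ PS, syntax off": ("PS", "C2", "D2"),
    "➃ AS, mixed":      ("AS", "C2", "D1"),  # unofficial: bag-of-words passages, syntactic extraction
    "➄ PS, mixed":      ("PS", "C2", "D1"),  # unofficial, derived from ➃
}

for name, (task, passages, answers) in RUNS.items():
    print(f"{name}: task={task}, passage selection={passages}, answer extraction={answers}")
```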
In order to evaluate the performance of the question analysis module, we manually identified the types of the questions. As FIDJI cannot process opinion questions, we decided to consider them as factoid. Although questions in French and English are translations of each other and their respective answers should be extracted from the same paragraph, we noticed that, for a given question, its type is not always the same in English as in French. For example, in English, the type of question 169 is reason/purpose while in French, it is factoid:

(EN) Why is the trade in ammonium nitrate fertilizers hampered within the European Economic Community?
(FR) Qu'est-ce qui a entravé le commerce d'engrais à base de nitrate d'ammonium dans la Communauté Économique Européenne ? (What has hampered the trade in ammonium nitrate fertilizers...?)

This is not only an issue of syntactic differences due to translation paraphrasing; the target of the question is different. Strictly speaking, the French question might accept a noun phrase like les réglementations régissant la commercialisation des engrais à base de nitrate d'ammonium (the different regulations controlling the marketing of ammonium nitrate based fertilizers), while such an answer would be odd with the English question. We identified 7 questions raising this issue (questions 3, 11, 134, 169, 175, 197 and 199).

Tables 1 and 2 present FIDJI's results for runs ➀, ➁ and ➂, as well as experiments ➃ and ➄, by type of question (manually identified). In French, 86% of the question types were correctly identified by FIDJI (we found 9 questions that were ill-formed or contained misspellings and which FIDJI could not correctly analyse), whereas in English, only 69.5% were correctly identified.

Table 1. Results by question type (English).

Type of question      Factoid      Definition   Reason/Purpose   Procedure    TOTAL
Number of questions   110          29           29               32           200
➀ Correct answers     10 (9.1%)    3 (10.3%)    1 (3.5%)         3 (9.4%)     17 (8.5%)
➁ Correct passages    33 (30%)     10 (34.5%)   10 (34.5%)       14 (43.8%)   67 (33.5%)
➂ Correct passages    51 (46.3%)   3 (10.3%)    18 (62%)         17 (53.1%)   89 (44.5%)
Unofficial runs
➃ Correct answers     13 (11.8%)   3 (10.3%)    2 (6.9%)         4 (12.5%)    22 (11%)
➄ Correct passages    47 (42.7%)   9 (31.0%)    19 (65.5%)       18 (56.3%)   93 (46.5%)

Table 2. Results by question type (French).

Type of question      Factoid      Definition   Reason/Purpose   Procedure    TOTAL
Number of questions   117          29           26               28           200
➀ Correct answers     11 (9.4%)    2 (6.9%)     0 (0%)           1 (3.6%)     14 (7%)
➁ Correct passages    35 (29.9%)   6 (20.7%)    8 (30.8%)        8 (28.6%)    57 (28.5%)
➂ Correct passages    30 (25.6%)   6 (20.7%)    13 (50%)         13 (46.4%)   62 (31%)
Unofficial runs
➃ Correct answers     12 (10.3%)   3 (10.3%)    0 (0%)           2 (6.3%)     17 (8.5%)
➄ Correct passages    31 (28.2%)   7 (24.1%)    14 (53.8%)       15 (50.0%)   67 (33.5%)

Concerning our official runs, as we can see in Tables 1 and 2, answer extraction performance (➀) is very low (0.25 for both English and French). Results are better for passage selection (➁ and ➂) for every type of question, and even better when the syntactic modules are switched off (➂).

Results are globally better for English than for French, so the performance of the question analysis module cannot explain these results.

In both languages, correct answers to definition questions dramatically decrease with D1 turned off. This is because we do not have any non-syntactic way to extract the answer for many of these questions (definitions not expecting a named entity, such as What is maladministration?, can only be answered by definition patterns in FIDJI). Turning off the syntactic modules necessarily leads to a NOA answer in these cases.

We can notice that for both English and French the results follow the same trend, and that results for passage selection are better for complex questions (reason/purpose and procedure), probably because FIDJI selects passages containing discourse markers for this type of question. Also, for these questions, we always returned the full paragraph as the exact short answer, considering that trying to focus even more inside the paragraph was not useful for such questions. As the assessors considered that shorter answers can be better, the system often gets an INEXACT status for these answers.

Finally, our additional runs ➃ and ➄ show a small improvement, indicating that the best results are obtained when turning off syntactic passage retrieval but turning on syntactic answer extraction (i.e. using modules C2 and D1). This is clear at least for non-factoid questions. This finding is important and will help us in the future to choose our search strategies according to different corpora and question types.

Last year, the pure information retrieval baseline [4], which consisted in querying the indexed collection with the exact text of the question and returning the paragraph retrieved in the first position, had the best results for French and ranked 5 out of 14 in English [5]. Even if a subset of the Europarl corpus has been added to the document collection in 2010, we can see that our c@1 measures (see Table 3) are still lower than the 2009 baseline (0.53 for English and 0.45 for French).
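As a reminder, the c@1 values reported in Table 3 follow the official ResPubliQA measure [5], which credits unanswered (NOA) questions in proportion to the accuracy obtained on the answered ones. A minimal computation, assuming the usual definition c@1 = (nR + nU * nR / n) / n with nR correct answers, nU unanswered questions and n questions in total, and using made-up figures:

```python
# c@1 as used in ResPubliQA (see [5]): unanswered questions earn a fraction of
# credit equal to the accuracy reached on the answered ones. The figures in
# the example below are made up for illustration only.

def c_at_1(n_right: int, n_unanswered: int, n_total: int) -> float:
    """c@1 = (n_R + n_U * n_R / n) / n"""
    return (n_right + n_unanswered * n_right / n_total) / n_total

print(c_at_1(50, 20, 100))   # 50 correct, 20 NOA out of 100 questions -> 0.6
```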
In 2009, we noted that our results were due to ACQUIS corpus specificities: a different register of language, a more constrained vocabulary, texts having a particular structure, with an introduction followed by long sentences extending over several paragraphs, etc. Table 4 shows that FIDJI found correct answers/passages mainly in the ACQUIS collection.

As FIDJI has difficulty selecting passages in the ACQUIS collection, its low results could be explained if a majority of the correct answers are located in the ACQUIS collection. The main difference between the FIDJI architecture used for ResPubliQA and the one used for other evaluation campaigns (CLEF, Quaero) is the number of documents returned by Lucene: 15 documents for ResPubliQA and 100 for the other campaigns. We still have to evaluate whether selecting more documents would improve the results.

Table 3. c@1 measure for French and English.

Campaign    FIDJI 2010            FIDJI 2009
Language    English    French     English    French
➀           0.09       0.08       -          -
➁           0.35       0.30       -          0.30
➂           0.48       0.36       -          0.42
➃           0.11       0.08       -          -
➄           0.47       0.34       -          -

Table 4. Number of correct answers/passages per corpus.

Language    English                French
Corpus      Europarl   Acquis      Europarl   Acquis
➀           3          14          6          8
➁           24         43          22         36
➂           33         56          21         41

5 Conclusion

We presented in this paper our participation in the ResPubliQA 2010 campaign in French and English. We evaluated two strategies: plugging or unplugging the syntactic modules for document selection and answer extraction. As in 2009, the system obtained low results, and they were even lower than last year when the syntactic modules were turned off. Different experiments on the collection confirmed that the use of syntactic analysis decreased the results, whereas it proved to help when used in other campaigns. We still have to evaluate whether a higher number of documents selected by the search engine can improve the results.

6 Acknowledgements

This work has been partially financed by OSEO under the Quaero program.

References

1. Moriceau, V., Tannier, X.: FIDJI: Using Syntax for Validating Answers in Multiple Documents. Information Retrieval, Special Issue on Focused Information Retrieval (2010)
2. Aït-Mokhtar, S., Chanod, J.P., Roux, C.: Robustness beyond shallowness: Incremental deep parsing. Natural Language Engineering 8 (2002) 121-144
3. Tannier, X., Moriceau, V.: Studying Syntactic Analysis in a QA System: FIDJI @ ResPubliQA'09. In: Proceedings of CLEF 2010. Number 6241 in Lecture Notes in Computer Science, Springer-Verlag, New York City, NY, USA (2010)
4. Pérez, J., Garrido, G., Rodrigo, Á., Araujo, L., Peñas, A.: Information Retrieval Baselines for the ResPubliQA Task. In: Working Notes for the CLEF 2009 Workshop, Corfu, Greece (2009)
5. Peñas, A., Forner, P., Sutcliffe, R., Rodrigo, Á., Forăscu, C., Alegria, I., Giampiccolo, D., Moreau, N., Osenova, P.: Overview of ResPubliQA 2009: Question Answering Evaluation over European Legislation. In: Working Notes for the CLEF 2009 Workshop, Corfu, Greece (2009)