<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <pub-date>
        <year>2009</year>
      </pub-date>
      <issue>974</issue>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Overview of FIDJI</title>
      <p>FIDJI has to detect syntactic implications between questions and passages containing the answers. Our system relies on syntactic analysis provided by XIP, which is used to parse both the questions and the documents from which answers are extracted.</p>
      <p>[...] analysis and named entity tagging). Among these documents, FIDJI looks for sentences containing the most syntactic relations of the question. Finally, answers are extracted from these sentences and the answer type, when specified in the question, is validated. Figure 1 presents the architecture of FIDJI and more details can be found in [4, 3]. The next sections summarize the way FIDJI extracts answers and focus on ResPubliQA specificities.</p>
    </sec>
    <sec id="sec-2">
      <title>Syntactic analysis, named entities and question analysis</title>
      <p>XIP [1] is a robust parser for French and English which provides dependency relations and named entity recognition. The dependency relations provided by XIP which are used by FIDJI are mainly: SUBJ (subject), OBJ (object), PREPOBJ (prepositional group), NMOD (noun modifier), VMOD (verb modifier), COORDITEMS (coordinated elements), CONNECT (connector introducing clause).</p>
      <p>The named entities (NE) are tagged using a set of 8 types: person, organization, location, date (defined by XIP), as well as nationality, number, duration, age (that we added). XIP’s lieu (location) can be made more specific (country, region, continent...). We also added features to allow for more precise types. For example, for number, we added the following features: length, speed, weight, money, physics, so that "0.55 euro" in "A French stamp costs 0.55 euro" can be tagged as a NE and extracted as an answer to "What is the price of a French stamp?". Other elements are also tagged, such as names introducing persons: functions (leader...), professions (minister...), family indications (father...).</p>
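      <p>The typed and featured named entities described above can be illustrated with a minimal sketch. It is not XIP's or FIDJI's actual data model: the class NamedEntity, the function matches_expected and the way features are stored are assumptions made for illustration. Only the eight type names and the "money" feature on number come from the text.</p>
      <preformat>
```python
from dataclasses import dataclass, field

# The eight NE types named in the text.
NE_TYPES = {"person", "organization", "location", "date",
            "nationality", "number", "duration", "age"}


@dataclass(frozen=True)
class NamedEntity:
    """Hypothetical container for a tagged entity (not XIP's own format)."""
    text: str
    ne_type: str
    features: frozenset = field(default_factory=frozenset)

    def __post_init__(self):
        if self.ne_type not in NE_TYPES:
            raise ValueError(f"unknown NE type: {self.ne_type}")


def matches_expected(entity, expected_type, expected_feature=None):
    """True when the entity has the expected type and, if one is given,
    the expected feature."""
    if entity.ne_type != expected_type:
        return False
    return expected_feature is None or expected_feature in entity.features


# "0.55 euro" from the stamp example: a number carrying the money feature.
price = NamedEntity("0.55 euro", "number", frozenset({"money"}))
print(matches_expected(price, "number", "money"))
```
      </preformat>
      <p>Under these assumptions, a question asking for a price would expect type number with feature money, so the entity "0.55 euro" is accepted while a number carrying only a weight feature would be rejected.</p>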
      <p>Question analysis consists in identifying:</p>
      <p>• The syntactic dependencies given by XIP;</p>
      <p>• Expected type: location (state)</p>
      <p>ResPubliQA answer format is different from traditional QA campaigns. First, answers are not focused, short parts of texts, but full paragraphs that must contain the answer. Second, passages are not indefinite parts of texts of limited length; they must be predefined paragraphs identified in the collection by XML tags &lt;p&gt;.</p>
      <p>FIDJI usually works at sentence level. For the aim of ResPubliQA specific rules, we chose to work at paragraph level. This consisted in specifying that sentence separators were &lt;p&gt; XML tags in the collection, rather than usual end-of-sentence markers.</p>
      <p>Although answers to submit to the campaign are full paragraphs, our system is designed to hunt down short answers. For most questions, typically factoid questions, it is still relevant to find short answers, and then to return a paragraph containing the best answer. This is not the case of ’how’ or ’why’ questions, where no short answer may be retrieved.</p>
      <p>Once candidate documents are selected by the search engine and analyzed by the parser, the system compares the document paragraphs with question analysis, in order to:</p>
    </sec>
    <sec id="sec-3">
      <title>2.2.2 Complex questions</title>
      <p>Complex questions (’how’, ’why’, etc.) do not expect any short answer. On these kinds of questions, the system behaves more as a passage retrieval system. The paragraphs containing the most syntactic dependencies in common with the question are selected. Among them, the best-ranked is the one that is returned first by Lucene. For example:</p>
      <p>0155 - Pourquoi convient-il de revoir l’architecture du réseau Animo ? (Why should the structure of an ANIMO network be revised?)</p>
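      <p>The selection rule for complex questions can be sketched as follows. This is an illustration under stated assumptions, not FIDJI's implementation: the Paragraph type, the function names and the three candidate paragraphs are hypothetical, and dependencies are modelled as plain (relation, head, dependent) triples. The question dependencies are those of question 0155.</p>
      <preformat>
```python
from typing import NamedTuple


class Paragraph(NamedTuple):
    """Hypothetical candidate paragraph."""
    lucene_rank: int         # 1 = first paragraph returned by the search engine
    dependencies: frozenset  # (relation, head, dependent) triples


def shared_count(question_deps, paragraph):
    """Number of question dependencies also found in the paragraph."""
    return len(question_deps.intersection(paragraph.dependencies))


def select_paragraph(question_deps, paragraphs):
    """Keep the paragraphs sharing the most dependencies with the question,
    then return the one the search engine ranked first."""
    best = max(shared_count(question_deps, p) for p in paragraphs)
    candidates = [p for p in paragraphs
                  if shared_count(question_deps, p) == best]
    return min(candidates, key=lambda p: p.lucene_rank)


# Dependencies of question 0155, as given in the text.
question_deps = frozenset({
    ("VMOD", "convenir", "revoir"),
    ("DEEPOBJ", "revoir", "architecture"),
    ("ATTRIBUT_DE", "architecture", "réseau"),
    ("NMOD", "réseau", "animo"),
})

# Invented candidates, for illustration only.
paragraphs = [
    Paragraph(1, frozenset({("NMOD", "réseau", "animo")})),
    Paragraph(2, frozenset({("DEEPOBJ", "revoir", "architecture"),
                            ("NMOD", "réseau", "animo")})),
    Paragraph(3, frozenset({("VMOD", "convenir", "revoir"),
                            ("NMOD", "réseau", "animo")})),
]

selected = select_paragraph(question_deps, paragraphs)
print(selected.lucene_rank)
```
      </preformat>
      <p>In this invented example the second and third candidates both share two dependencies with the question, so the tie is resolved in favour of the one with the better search-engine rank.</p>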
    </sec>
    <sec id="sec-4">
      <title>Example analysis, scoring and discussion</title>
      <p>• Syntactic dependencies and NE tagging: VMOD(convenir, revoir), DEEPOBJ(revoir, architecture), ATTRIBUT_DE(architecture, réseau), NMOD(réseau, animo)</p>
      <p>• Question type: complex (why)</p>
      <p>2.3 Scoring</p>
      <p>FIDJI’s scores are not composed of a single value, but of a list of different values and flags. The criteria are listed below, and are presented in decreasing order of importance:</p>
      <p>• As we said, a paragraph containing an extracted short answer will be preferred if it exists.</p>
      <p>Results are lower than former campaigns’ scores, especially concerning factoid and definition questions. Looking carefully at the results shows that, in these particular documents, using syntactic dependencies as the main clue to choose paragraph candidates is not always a good way to find a relevant passage. This is especially true for complex questions, but not only. Indeed, the selection of the paragraph containing the most question dependencies often leads to the introduction of the document or to a very general paragraph containing poor information. For example:</p>
      <p>&lt;p&gt;COUNCIL DIRECTIVE of 14 June 1966 on the marketing of fodder plant seed (66/401/EEC)&lt;/p&gt;</p>
      <p>Also, the JRC-Acquis corpus uses a different register of language than usual corpora such as Web or newspapers. Question as well as document analyses suffered from the specific expressions and structures used by French texts, and especially for definitions. Definitions, quite easy to detect in newspaper corpora, have been poorly recognized for this evaluation.</p>
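      <p>The list-valued scores of Section 2.3 lend themselves to lexicographic comparison, which can be sketched as below. Only the first criterion (presence of an extracted short answer) is stated in the text. The other two fields, the Score type and the candidate values are assumptions introduced for illustration, not FIDJI's actual criteria.</p>
      <preformat>
```python
from typing import NamedTuple


class Score(NamedTuple):
    """Hypothetical score: fields in decreasing order of importance."""
    has_short_answer: bool    # criterion stated in the text
    shared_dependencies: int  # assumed second criterion
    negated_engine_rank: int  # assumed last criterion (higher is better)


def make_score(has_short_answer, shared_dependencies, engine_rank):
    # The rank is negated so that a better (smaller) rank compares higher.
    return Score(has_short_answer, shared_dependencies, -engine_rank)


# Invented candidates, for illustration only.
candidates = {
    "p1": make_score(False, 4, 1),
    "p2": make_score(True, 2, 5),
    "p3": make_score(True, 2, 3),
}

# Tuples compare field by field, so earlier criteria dominate later ones.
best = max(candidates, key=candidates.get)
print(best)
```
      </preformat>
      <p>With this layout a paragraph holding a short answer outranks one that does not, whatever the later values are, and later fields only break ties among candidates that agree on the earlier ones.</p>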
    </sec>
    <sec id="sec-5">
      <title>Dependency relations and paragraph selection</title>
      <p>Dependency relations are still useful to find the right document, but often fail to point to the correct paragraph.</p>
    </sec>
    <sec id="sec-6">
      <title>3 Results</title>
      <p>We present the results in Table 1 by types of questions. Only one answer per question was allowed, so the values simply correspond to the rate of correct answers for each question type.</p>
      <p>Table 1: FIDJI results by question types (values listed without row alignment). Question type: Factoid, Definition, List, "How", "Why", TOTAL. Number of questions: 170, 76, 37, 101, 116; 500 in total. Correct answer: 40 %, 15.8 %, 36.2 %, 30.4 %, 22.4 %, 16.2 %.</p>
      <p>0006 - What is the scope of the council directive on the trading of fodder seeds?</p>
      <p>is answered by a paragraph containing many dependencies but answering nothing, while a good result was later in the same document, but with an anaphora:</p>
      <p>&lt;p&gt;This Directive shall apply to fodder plant seed marketed within the Community, irrespective of the use for which the seed as grown is intended.&lt;/p&gt;</p>
    </sec>
    <sec id="sec-7">
      <title>4 Conclusion</title>
      <p>We presented in this article our participation in the ResPubliQA 2009 campaign in French. We adapted our syntactic-based QA system FIDJI in order to produce a single long answer in the form of JRC-Acquis tagged paragraphs. Results showed that syntactic analysis should be used in different manners according to the type of tasks and questions. A careful look at our system’s errors should enable improvement of robustness of the search by applying contextual strategies.</p>
    </sec>
    <sec id="sec-8">
      <title>References</title>
      <p>[1] Salah Aït-Mokhtar and Jean-Pierre Chanod. Incremental finite-state parsing. In Proceedings of the fifth conference on Applied natural language processing, pages 72–79, Washington, DC, USA, 1997. Morgan Kaufmann Publishers Inc., San Francisco, California, USA.</p>
      <p>[3] Véronique Moriceau and Xavier Tannier. Étude de l’apport de la syntaxe dans un système de question-réponse. In Actes de la Conférence Traitement Automatique des Langues Naturelles (TALN 2009, poster), Senlis, France, jun 2009.</p>
      <p>[4] Véronique Moriceau, Xavier Tannier, and Brigitte Grau. Utilisation de la syntaxe pour valider les réponses à des questions par plusieurs documents. In Proceedings of workshop on COnférence en Recherche d’Information et Applications, CORIA, Presqu’île de Giens, France, 2009.</p>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>