<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>UNED at iCLEF 2005: Automatic highlighting of potential answers</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Víctor Peinado</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fernando López-Ostenero</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Julio Gonzalo</string-name>
          <email>julio@lsi.uned.es</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Felisa Verdejo</string-name>
          <email>felisa@lsi.uned.es</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>NLP Group, ETSI Informática, UNED</institution>
          ,
          <addr-line>c/ Juan del Rosal, 16, E-28040 Madrid</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this paper, we describe UNED's participation in the iCLEF 2005 track. We compared two strategies for finding an answer with an interactive question answering system: i) a search system over full documents and ii) a search system over passages (paragraphs of documents). We added a feature to both systems to facilitate reading: the possibility of enabling and disabling the highlighting of named entities, such as proper names, temporal references and numbers, that are likely to contain the right answer. Our Documents Searcher obtained better overall accuracy (.53 vs. .45), but our subjects found browsing passages simpler and faster. However, most of them showed similar search behavior (regarding time consumption, confidence in their answers and query refinements) with both systems. All our users considered the highlighting of named entities helpful, and they all made extensive use of it as a quick way of discriminating between relevant and non-relevant documents and of finding a valid answer.</p>
      </abstract>
      <kwd-group>
        <kwd>H.3 [Information Storage and Retrieval]: H.3.1 Content Analysis and Indexing; H.3.3 Information Search and Retrieval</kwd>
        <kwd>H.4 [Information Systems Applications]: H.4.m Miscellaneous</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Keywords: interactive question answering, named entity recognition, cross-language information retrieval,
search behavior</p>
      <p>The main goal of the Interactive Cross-Language Question Answering task (iCLEF) is to
find an answer to each of 16 general questions (e.g. Who is the president of Burundi?) and to select
a document that supports the answer before the time limit of five minutes expires.</p>
      <p>
        Our participation in iCLEF 2004 [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] focused on comparing two strategies for finding an answer
using an interactive question answering (QA) system: i) a document retrieval search engine and
ii) a passage retrieval search engine. We wanted to study which approach was more helpful:
browsing documents or browsing passages.
      </p>
      <p>Our subjects preferred the passages system because browsing paragraphs was simpler and
faster, but they also missed the possibility of accessing the full document, since
a paragraph was sometimes difficult to understand out of context. Despite these preferences,
average strict accuracy turned out to be slightly higher in the documents system (69%).</p>
      <p>This year we intended to study the impact of automatic highlighting of named entities in both
systems. First, in the Passages system, we allowed our subjects to view the full contents
of the documents. Then, we used our simple recognizer, which is able to locate proper
nouns, temporal references and numbers, and we added the possibility of enabling and disabling the
emphasis of these named entities. Is it helpful to highlight named entities so that
subjects can find a possible answer? How much does the highlighting help the user while browsing
documents and while browsing passages?</p>
      <p>The remainder of this paper is organized as follows. In Section 2, we describe the
design of the experiments, our testbed and how search sessions were organized. In Section 3, we
present our two cross-language search systems. In Section 4, we discuss the official results,
analyzing the causes of failure (4.2), the user and topic effects (4.3 and 4.4) and the cases in
which subjects found the answer in the Passages system thanks to the possibility of accessing the full
document (4.5). Lastly, in Section 5, we present some conclusions.</p>
    </sec>
    <sec id="sec-2">
      <title>Experiment design</title>
      <sec id="sec-2-1">
        <title>Testbed</title>
        <p>
          Following the iCLEF 2005 guidelines,<sup>1</sup> we compared two different
cross-language search systems. Eight subjects searched for the answers to 16 fixed questions in
Spanish over a collection of documents originally written in English. Each subject searched for
eight questions with each system, following the Latin-square design proposed by the organizers
of the task [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ].
        </p>
        <p>The collection of documents consisted of news from 1994 and 1995 taken from the Los Angeles
Times and the Glasgow Herald, respectively. In our experiments, we did not use the
original documents but a Spanish version translated with Systran Professional 3.0.</p>
        <p>
          From this translated version of the collection, we used the Inquery API [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] to build two different indexes, one for each search system:
1. An index in which each document corresponds to a complete news article.
2. An index in which each document corresponds to a single passage (a paragraph of a
news article).
        </p>
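        <p>The difference between the two indexes can be illustrated with a minimal sketch. It does not use the Inquery API; it only shows how one article could be turned into paragraph-level records that keep a pointer to their parent article. The record layout, the document identifier and the blank-line paragraph separator are assumptions made for illustration, since the paper does not describe how paragraph boundaries were detected.</p>
        <preformat>
```python
from dataclasses import dataclass


@dataclass
class Passage:
    """One indexable unit for the passage index (hypothetical layout)."""
    doc_id: str    # identifier of the parent news article
    position: int  # 0-based paragraph position inside the article
    text: str


def split_into_passages(doc_id, article_text):
    """Split a translated article into paragraph-level passages.

    Assumption: paragraphs are separated by blank lines.
    """
    passages = []
    for block in article_text.split("\n\n"):
        text = " ".join(block.split())  # collapse internal whitespace
        if text:
            passages.append(Passage(doc_id, len(passages), text))
    return passages


# Hypothetical identifier and text, for illustration only.
article = "Primer párrafo de la noticia.\n\nSegundo párrafo,\ncon dos líneas.\n\n"
for passage in split_into_passages("doc-0001", article):
    print(passage.doc_id, passage.position, passage.text)
```
        </preformat>
        <p>Keeping the parent identifier in each record is what would allow a passage-based interface to open the complete article later on.</p>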
        <p>We recruited eight users who were between 19 and 30 years old and had different levels of
education, from high school to master's degrees. Their mother tongue was Spanish and they all
claimed to have between low and medium-high skills in written English comprehension. They were
very familiar with graphical interfaces and web-based search engines, and reported
having used WWW search engines for between 2 and 7 years (avg=4.6). In contrast, none of
them had any experience with Machine Translation (MT) systems.</p>
      </sec>
      <sec id="sec-2-2">
        <title>Search sessions</title>
        <p>We asked the subjects to find a valid answer and select a document supporting it before the time
limit. The maximum search time per question was set to five minutes. Once the time expired, the
system stopped the search and allowed the subject to view the set of stored documents, giving
her/him a last chance to write an answer.</p>
        <p><sup>1</sup> For further details on the iCLEF 2005 guidelines, please see http://nlp.uned.es/iCLEF.</p>
        <p>The subjects also had to fill in a pre-search questionnaire about their previous experience with search
engines, two post-system questionnaires analyzing their performance and the specific features of
each approach, and a final post-search questionnaire about their overall experience.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Description of the reference and contrastive systems</title>
      <sec id="sec-3-1">
        <title>Reference system</title>
        <p>
          Our reference system, henceforth the Documents Searcher, is a simple traditional search engine
in which each retrieved document corresponds to a complete news article. It differs only
slightly from the reference system used last year [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ].
        </p>
        <p>The normal sequence of a subject's actions can be outlined as follows:
1. The subject types the query terms in Spanish and launches the query.
2. The system uses the Inquery API to retrieve a ranked list of relevant documents.
3. The main interface displays only the title and date of each document (see Figure 1).</p>
        <p>This interface has additional buttons to discard non-relevant documents, to store a
document considered interesting, to list the documents already stored, and to conclude the search
by selecting a document once an answer has been found.</p>
        <p>4. We have added a feature that did not exist in last year's systems in order to make
reading easier: occurrences of the query terms appear in boldface within the text. In addition,
checkboxes make it possible to enable and disable the highlighting of named
entities, such as proper nouns, temporal references, dates and numbers. See Figure 2 for a
detailed screenshot showing the highlighting.</p>
        <p>5. Lastly, the subject must type the answer and assign it a confidence value: high or
low.</p>
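        <p>The toggled highlighting described in step 4 can be sketched as follows. This is not the recognizer used in the experiments, whose internals are not described here; the regular expressions are deliberately naive stand-ins for the three entity types, and the bracket markers stand in for whatever visual emphasis the interface applies. All names in the sketch are hypothetical.</p>
        <preformat>
```python
import re

# Entity kinds in priority order; a four-digit year is treated as a
# temporal reference rather than as a plain number.
KINDS = ("temporal", "number", "proper_noun")

# Naive, hypothetical patterns (one capturing group per kind).
COMBINED = re.compile(
    r"(\b(?:1[0-9]{3}|20[0-9]{2})\b)"
    r"|(\b[0-9]+(?:[.,][0-9]+)*\b)"
    r"|(\b[A-ZÁÉÍÓÚÑ][a-záéíóúñ]+(?:\s[A-ZÁÉÍÓÚÑ][a-záéíóúñ]+)*)"
)


def highlight(text, enabled):
    """Wrap entities of the enabled kinds in [kind:...] markers."""
    def mark(match):
        kind = KINDS[match.lastindex - 1]
        if kind in enabled:
            return "[" + kind + ":" + match.group(0) + "]"
        return match.group(0)  # kind is switched off: leave text as is
    return COMBINED.sub(mark, text)


sentence = "Eduardo VIII abdicó en 1936."
print(highlight(sentence, {"temporal"}))
# Eduardo VIII abdicó en [temporal:1936].
print(highlight(sentence, {"temporal", "proper_noun"}))
# [proper_noun:Eduardo] VIII abdicó en [temporal:1936].
```
        </preformat>
        <p>The set passed as the second argument plays the role of the checkboxes: removing a kind from the set switches its emphasis off without touching the text. A capitalization-based pattern like this one would also mark sentence-initial words, which a real recognizer would need to handle.</p>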
      </sec>
      <sec id="sec-3-2">
        <title>Contrastive system</title>
        <p>As the contrastive system we propose a Passages Searcher, which runs the queries over a
collection of news paragraphs.</p>
        <p>In this case, the sequence of actions is the following:
1. First of all, the subject is asked to choose the type of answer she/he is searching for: a
proper noun, a date or a number (see Figure 3).</p>
        <p>Notice that i) this distinction matches the three different types of named entities
identifiable by our recognizer<sup>2</sup> and ii) this initial choice determines which pieces of information
will be automatically highlighted.</p>
        <p>The underlying idea is that, in order to facilitate reading and locating a possible answer, the
system highlights named entities of the same type as the one chosen before submitting
the query. For instance, if a subject is looking for a date, it can be useful to automatically
emphasize all kinds of temporal references.
2. The subject types the query terms in Spanish and launches the query.</p>
        <p>
          <sup>2</sup> We used a straightforward recognizer that is able to identify proper nouns, temporal references and
numbers. See also [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ].
        </p>
        <p>
3. The system retrieves and shows a ranked list of relevant passages. Passages
containing the selected type of answer are promoted by the search engine, and the system
automatically highlights query terms and named entities according to the subject's initial
choice.
4. The main interface, shown in Figure 4, also provides the title and date of each news article
and has the same buttons as the Documents Searcher to discard and store documents.
Unlike in last year's experiments, it is now possible to access the complete document that the passage
belongs to. In that case, the whole document is displayed with the passage clearly
delimited by two dashed lines.
        </p>
        <p>
          In our participation in iCLEF 2004 [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], we intentionally excluded the possibility of
examining the context of a given passage by providing the complete document. All our subjects
complained that this limitation prevented them from understanding the general
sense of some short paragraphs. In addition, other works had already analyzed the benefits
of allowing the subjects to access the full contents of the documents [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], so we decided to add
this feature.
5. As in the Documents Searcher, when viewing the full document, it is possible to
enable and disable the highlighting of query terms, proper nouns, temporal references and numbers
(Figure 2).
6. Lastly, the subject must type the answer and assign it a confidence value: high or
low.
        </p>
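        <p>The promotion of passages in step 3 can be pictured with a small re-ranking sketch. The text above does not say how the search engine implements the promotion, so the approach below (a stable sort that moves passages containing an entity of the chosen type ahead of the rest) is only one plausible reading, and the detectors are hypothetical simplifications.</p>
        <preformat>
```python
import re

# Hypothetical detectors for the three answer types offered to the subject.
DETECTORS = {
    "date": re.compile(r"\b(?:1[0-9]{3}|20[0-9]{2})\b"),
    "number": re.compile(r"\b[0-9]+\b"),
    "proper_noun": re.compile(r"\b[A-Z][a-z]+\b"),
}


def promote(ranked_passages, answer_type):
    """Move passages containing the requested entity type to the front.

    sorted() is stable, so the original retrieval order is preserved
    inside each group (with and without a matching entity).
    """
    detector = DETECTORS[answer_type]
    return sorted(ranked_passages,
                  key=lambda text: detector.search(text) is None)


ranked = [
    "el rey habló ante el parlamento",
    "abdicó en 1936 tras un año de reinado",
    "la familia real no hizo comentarios",
]
for text in promote(ranked, "date"):
    print(text)
```
        </preformat>
        <p>With this input, the second passage moves to the top because it is the only one containing a year, while the other two keep their relative order.</p>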
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Results and discussions</title>
      <sec id="sec-4-1">
        <title>Comparison between systems</title>
        <p>From the general results shown in Table 1, we can observe the following:
1. The Documents Searcher again obtained better accuracy than the Passages Searcher: .53
and .45, respectively.
2. For both systems, strict and lenient accuracy were identical: none of our subjects'
answers was judged inexact by the assessors.
3. Regarding average time consumption, confidence values and average number of
refinements, our subjects behaved quite similarly with both systems.</p>
        <p>The 2004 and 2005 results are not directly comparable because the topics, the systems' features,
the participating subjects and the conditions of the experiments were obviously not the same.
Nevertheless, the difference between the two strategies has increased: this year the Passages Searcher
was 15% worse, in relative terms, than the Documents Searcher.</p>
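        <p>As a quick arithmetic check, the reported figures are consistent with one another. The sketch below assumes that each of the eight subjects answered eight questions per system (64 searches per system, as described in Section 2.1) and uses the 29 right answers reported for the Passages Searcher in Section 4.5; the variable names are ours.</p>
        <preformat>
```python
SUBJECTS = 8
QUESTIONS_PER_SYSTEM = 8
searches = SUBJECTS * QUESTIONS_PER_SYSTEM  # 64 searches with each system

right_passages = 29  # right answers with the Passages Searcher (Section 4.5)
acc_passages = right_passages / searches

acc_documents = 0.53  # reported (rounded) Documents Searcher accuracy
relative_drop = (acc_documents - round(acc_passages, 2)) / acc_documents

print(f"passages accuracy: {acc_passages:.2f}")  # passages accuracy: 0.45
print(f"relative drop: {relative_drop:.0%}")     # relative drop: 15%
```
        </preformat>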
      </sec>
      <sec id="sec-4-2">
        <title>Failure analysis</title>
        <p>Most failures were related to mistranslations. As we discuss in Section 4.4,
on some occasions the MT system did not translate correctly, for instance translating some
terms when it should not have and vice versa.</p>
        <p>There were also notable human errors. Specifically, some users got confused by those topics
for which different potential answers (some of them seemingly contradictory) appeared in the collection
(e.g. topics asking for the number of casualties in an incident).</p>
        <p>Regarding responsiveness criteria, the results were strongly language-biased because the
same answer was judged differently by the English and French assessors (see Section 4.4).</p>
      </sec>
      <sec id="sec-4-3">
        <title>User effects</title>
        <p>The data on accuracy, confidence, number of refinements and time consumption per user are
shown in Table 2. Seven out of the eight subjects stated in the questionnaires that they preferred
the Passages Searcher. However, six out of eight found more right answers with the Documents
Searcher. Some users had difficulties with one of the systems. User 7, in particular,
obtained poor results with the Passages Searcher, even though he spent, on average,
245.38 seconds on each topic. In contrast, users 2 and 6 performed much worse with the
Documents Searcher.</p>
        <p>Notice that confidence values are generally consistent with accuracy. Except for users 3 and
6, there are no big differences between the number of answers given with high confidence and the
accuracy. For instance, user 6 assigned high confidence to five of the topics searched with the
Documents Searcher but obtained an accuracy of .25, which represents only two answers assessed as
right.</p>
        <p>There also seems to be a certain correlation between the number of query refinements and
experience with our systems: the three subjects who had already collaborated in 2004
(3, 5 and 6) made, on average, fewer refinements than the others.</p>
      </sec>
      <sec id="sec-4-4">
        <title>Topic effects</title>
        <p>• 12: When do we estimate that the Big Bang happened? In the astronomy domain, the
English term “Big Bang” is used as is in Spanish, but in our collection it had been translated
as “Gran Estallido”. This misled most of our subjects and only one of them was able to find
a valid answer.
• 15: How many states are members of the Council of Europe? Most of our subjects
confused the Council of Europe with the European Union.</p>
        <p>Topic 9 (What disease name does the acronym BSE stand for?) was expected to be an easy topic,
and its low accuracy deserves a more detailed explanation. While the English assessors sensibly
considered answers other than “Bovine Spongiform Encephalopathy” to be wrong, the
French assessors judged variations of “mad cow disease” as perfectly right, and this caused a
significant language bias. In our case, five of our subjects thought that “mad cow disease” was a
valid answer. If this answer had been accepted as right, topic 9 would have obtained a global
accuracy of 100%.</p>
        <p>On the other hand, topics 8, 10, 11 and 16 turned out to be quite easy. Notice that they obtained
an accuracy of 100% in at least one of the proposed systems and took our subjects less
time than other topics.</p>
      </sec>
      <sec id="sec-4-5">
        <title>From passages to documents</title>
        <p>We also wanted to analyze the impact of allowing our subjects to access the full documents when
browsing passages. A total of 29 answers given with the Passages Searcher were judged as right. In 19
of these cases, the subject found the answer directly in the passage retrieved by the system, that
is, the user would not have needed to view the full context. For example, in topic 16 (When
did Edward VIII abdicate?) the first passage in the ranking contained the answer. Even so,
most of the subjects accessed the whole document in order to validate the answer and make
sure of it.</p>
        <p>In contrast, when searching topic 8 (Which airline did the plane hijacked by the GIA
belong to?), the system retrieved passages about the GIA's hijackings, but it was necessary to check
the full context of the paragraph to find the right answer.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusions</title>
      <p>In this paper, we have described our participation in the iCLEF 2005 track. We have compared
two strategies for finding an answer using an interactive question answering system: i) a search
system over full documents and ii) a search system over passages (paragraphs of documents). We
have added a feature to both systems to facilitate reading: the possibility of
enabling and disabling the highlighting of named entities, such as proper names, temporal references and
numbers, that are likely to contain the right answer.</p>
      <p>The Documents Searcher obtained better overall accuracy (.53 vs. .45), but our subjects found
browsing passages simpler and faster. However, most of them showed similar search
behavior (regarding time consumption, confidence in their answers and query refinements) with both
systems. We have also discussed these data, focusing on the causes of failure.</p>
      <p>All our users considered the highlighting of named entities helpful. They all made extensive use of
the possibility of emphasizing proper names, dates and numbers, especially during the first reading of
a long document. They also appreciated the way the Passages Searcher automatically highlighted
named entities according to their initial choices. This feature helped them to discriminate quickly
between relevant and non-relevant passages.</p>
      <p>As shown in other CLEF works, it is necessary to rely on a good translation of the documents,
using MT systems able to distinguish what should and should not be translated. Therefore, we
intend to obtain a more reliable translation of the collections in the future, which we expect
to improve the overall results of any cross-language information retrieval experiment.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>The authors would like to thank the participating subjects. We are also grateful to Valentín Sama
and Javier Artiles for their collaboration in recruiting volunteers and running
the experiments, and to Anselmo Peñas for his valuable remarks about this paper.</p>
      <p>This work has been partially supported by the Spanish Government under project
R2D2-Syembra (TIC2003-07158-C04-02). Víctor Peinado holds a PhD grant from UNED (Universidad
Nacional de Educación a Distancia).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Callan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. B.</given-names>
            <surname>Croft</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Harding</surname>
          </string-name>
          .
          <article-title>The Inquery Retrieval System</article-title>
          .
          <source>In Proceedings of the Third International Conference on Database and Expert Systems Applications</source>
          , pages
          <fpage>78</fpage>
          -
          <lpage>83</lpage>
          . Springer-Verlag,
          <year>1992</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>C. G.</given-names>
            <surname>Figuerola</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. F.</given-names>
            <surname>Zazo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. L.</given-names>
            <surname>Alonso Berrocal</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E.</given-names>
            <surname>Rodríguez Vázquez de Aldana</surname>
          </string-name>
          .
          <source>Results of the CLEF 2004 Evaluation Campaign</source>
          , volume
          <volume>3491</volume>
          of Lecture Notes in Computer Science, chapter
          <article-title>REINA at the iCLEF 2004</article-title>
          . Springer Verlag,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Gonzalo</surname>
          </string-name>
          and
          <string-name>
            <given-names>D. W.</given-names>
            <surname>Oard</surname>
          </string-name>
          .
          <source>Results of the CLEF 2004 Evaluation Campaign</source>
          , volume
          <volume>3491</volume>
          of Lecture Notes in Computer Science, chapter
          <article-title>iCLEF 2004 Track Overview: Interactive Cross-Language Question Answering</article-title>
          . Springer Verlag,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>F.</given-names>
            <surname>López-Ostenero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gonzalo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Peinado</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Verdejo</surname>
          </string-name>
          .
          <source>Results of the CLEF 2004 Evaluation Campaign</source>
          , volume
          <volume>3491</volume>
          of Lecture Notes in Computer Science, chapter
          <article-title>Interactive Cross-Language Question Answering: Searching Passages versus Searching Documents</article-title>
          , pages
          <fpage>323</fpage>
          -
          <lpage>333</lpage>
          . Springer Verlag,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>V.</given-names>
            <surname>Peinado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>López-Ostenero</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Gonzalo</surname>
          </string-name>
          .
          <article-title>UNED at ImageCLEF 2005: Automatically Structured Queries with Named Entities over Metadata</article-title>
          .
          <source>In Cross Language Evaluation Forum, Working Notes for the CLEF 2005 Workshop</source>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>