<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>University of Alicante at WiQA 2006</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Antonio Toral Ruiz</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Georgiana Puşcaşu</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lorenza Moreno Monteagudo</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Natural Language Processing and Information Systems Group, Department of Software and Computing Systems, University of Alicante</institution>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper presents the participation of the University of Alicante in the WiQA pilot task organized as part of the CLEF 2006 campaign. For a given set of topics, this task requires the discovery of important novel information distributed across different Wikipedia entries. The approach we adopted for solving this task uses Information Retrieval, query expansion by feedback, relevance and novelty re-ranking, as well as temporal ordering. Our system participated in both the Spanish and English monolingual tasks. In both cases the results are promising because, while employing a language-independent approach, we obtain scores above the average. Moreover, in the case of Spanish, our result is very close to the best achieved score. Apart from introducing our system, the present paper also provides an in-depth result analysis and proposes future lines of research, as well as follow-up experiments.</p>
      </abstract>
      <kwd-group>
        <kwd>H.3 [Information Storage and Retrieval]</kwd>
        <kwd>H.3.1 Content Analysis and Indexing</kwd>
        <kwd>H.3.3 Information Search and Retrieval</kwd>
        <kwd>H.3.4 Systems and Software</kwd>
      </kwd-group>
      <kwd-group kwd-group-type="general-terms">
        <title>General Terms</title>
        <kwd>Measurement</kwd>
        <kwd>Performance</kwd>
        <kwd>Experimentation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Wikipedia is a multilingual, web-based, free-content encyclopedia, continuously updated in a
collaborative way. It may be seen as a paradigmatic example of a huge and fast-growing source
of written natural language.</p>
      <p>
        Several inherent characteristics of this resource, such as its continuous growth, its
general domain coverage and its multilinguality, make Wikipedia a valuable resource for
the Natural Language Processing (NLP) research field. The NLP community has only recently
become aware of this fact and started investing research effort in possible ways of exploiting
Wikipedia within strategic areas such as Question Answering [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] or Knowledge Acquisition [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>
        WiQA<sup>3</sup> is a pilot task at CLEF 2006<sup>4</sup> that exploits the fact that, in Wikipedia, the distinction
between author and reader has become blurred. The aim of the task is to discover how Information
Retrieval and NLP techniques can be used effectively to help readers and authors of articles
access information spread throughout Wikipedia rather than stored locally on a single
page [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. In a nutshell, WiQA is about collecting information on a given topic that is not yet
present on its page, thus avoiding data sparseness and unifying related content distributed among
different entries. The motivation for launching this task lies in the challenges, pointed out above, that
Wikipedia poses to the NLP community.
      </p>
      <p>This paper is organized as follows. The following section describes our approach
and the developed system. Section 3 describes the runs submitted to the WiQA task.
Section 4 presents and discusses the results obtained. Finally, Section 5
draws conclusions and points out future lines of research.</p>
    </sec>
    <sec id="sec-2">
      <title>System description</title>
      <p>Inspired by the Novelty and the QA tasks at TREC, the WiQA pilot task aims at recovering
information not explicitly mentioned on a page, but distributed across the entire encyclopedia.
The envisaged participating systems should help provide access to, author and edit Wikipedia's
content. They should, mainly, be able to locate relevant and new sentences within the Wikipedia
document collection, in response to a topic.</p>
      <p>
        The WiQA task could be tackled from two perspectives, by employing either Information
Retrieval methods, or Question Answering capabilities. Our approach embraces the IR strategy.
Information Retrieval [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] is a Natural Language Processing application that, given a query and a
document collection, returns a ranked list of relevant documents in response to the input query.
IR usually comprises two stages. The first is a preprocessing phase, carried out offline, which
consists of indexing the document collection. Its aim is to represent the documents in a way that
makes it easier and more efficient to store and query the collection. The second stage, carried
out online, consists of the actual retrieval of relevant documents in answer to an input query.
      </p>
      <p>The architecture of our system is depicted in Figure 1. IR forms the core of the system
and is used to retrieve the documents relevant to the most meaningful terms in the topic document.
For example, considering the topic Alice Cooper, we first extract the most meaningful terms in
the supporting topic document. Then, in order to retrieve documents that not only mention Alice
Cooper but also belong to the domain defined by the extracted terms, we search the collection
using these relevant terms.</p>
      <p>
        For the above presented purposes, we have employed a probabilistic open source IR library
called Xapian [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Besides the probabilistic search capability, in which the most relevant words
are given increased weight, it allows boolean searches with operators that affect the query words,
thus placing user-defined constraints on the search. These operators allow the user to specify, for
example, that the desired terms occur in close proximity to each other. Another useful feature of
Xapian is relevance feedback, through which it can extract terms relevant to the query
and use them to expand it. The system first performs a basic retrieval
and then gives the user the opportunity to select a set of documents considered relevant. In
the next step, Xapian extracts from the selected documents relevant terms for query expansion.
Finally, a second retrieval is performed with these terms added to the query.
      </p>
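      <p>The feedback loop described above can be illustrated with a small, library-independent sketch. It does not call Xapian and does not reproduce Xapian's term-weighting formula; the function expansion_terms, its scoring rule (frequency among the selected documents, scaled by how concentrated the term is in them) and the toy documents are assumptions chosen only to make the two-pass idea concrete.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter


def expansion_terms(query_terms, relevant_docs, all_docs, k=3):
    """Pick k terms that are frequent in relevant_docs but rare overall.

    Documents are lists of tokens; relevant_docs must be a subset of all_docs.
    The scoring rule is an illustrative stand-in, not Xapian's formula.
    """
    query = set(query_terms)
    # Document frequencies inside the feedback set and in the whole collection.
    rel_df = Counter(t for doc in relevant_docs for t in set(doc))
    all_df = Counter(t for doc in all_docs for t in set(doc))
    scores = {
        term: rel_df[term] * (rel_df[term] / all_df[term])
        for term in rel_df
        if term not in query
    }
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:k]


# Toy collection (invented for the example).
docs = [
    ["alice", "cooper", "rock", "singer"],
    ["alice", "cooper", "rock", "album"],
    ["cooper", "union", "college"],
    ["rock", "climbing", "guide"],
]
query = ["alice", "cooper"]
relevant = docs[:2]  # documents judged relevant after the first retrieval
expanded_query = query + expansion_terms(query, relevant, docs, k=2)
print(expanded_query)
```
]]></preformat>
      <p>In this toy setting the second retrieval would be issued with expanded_query instead of the original query.</p>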
      <p>Due to the nature of the WiQA pilot task, indexing has been performed at sentence level; that is,
each sentence is treated as a complete document, so each indexed document is
made up of only one sentence. This makes it straightforward to retrieve sentences directly, in
compliance with the desired output of the system. Consequently, our system includes
a preprocessing phase consisting of sentence splitting and SGML tag removal.
Finally, all the sentences contained in the document collection are indexed.</p>
      <p><sup>3</sup> <ext-link ext-link-type="uri" xlink:href="http://ilps.science.uva.nl/WiQA/">http://ilps.science.uva.nl/WiQA/</ext-link></p>
      <p><sup>4</sup> <ext-link ext-link-type="uri" xlink:href="http://www.clef-campaign.org/">http://www.clef-campaign.org/</ext-link></p>
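      <p>As a rough illustration of this preprocessing, the sketch below strips SGML-style tags from a document and splits it into sentences, each of which becomes an indexable unit that remembers its source document. The regular expressions, the function name sentences_as_documents and the identifier scheme are assumptions made for the example; the splitter actually used by the system is not specified here.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import re

# Naive patterns, assumed for illustration only.
TAG = re.compile(r"<[^>]+>")                           # SGML/HTML-like tag
BOUNDARY = re.compile(r"(?<=[.!?])\s+(?=[A-Z0-9])")    # sentence boundary


def sentences_as_documents(doc_id, raw_text):
    """Return (unit_id, doc_id, sentence) triples, one per sentence."""
    text = TAG.sub(" ", raw_text)              # drop markup
    text = re.sub(r"\s+", " ", text).strip()   # normalise whitespace
    if not text:
        return []
    units = []
    for n, sentence in enumerate(BOUNDARY.split(text)):
        # Each sentence is treated as a one-sentence "document".
        units.append((f"{doc_id}#{n}", doc_id, sentence))
    return units


units = sentences_as_documents(
    "Alice_Cooper",
    "<p>Alice Cooper is a singer. <b>He</b> was born in 1948.</p>",
)
for unit in units:
    print(unit)
```
]]></preformat>
      <p>Keeping the source document identifier with every sentence is what later allows sentences to be filtered by the document they come from.</p>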
      <p>At the retrieval stage, Xapian is configured using the options and parameters presented
in detail in Section 3. Once the relevant sentences have been identified, they are
passed through a post-processing stage consisting of the following actions:</p>
      <list list-type="bullet">
        <list-item>
          <p>Eliminate those sentences which belong to the topic supporting document (the document
identifier corresponding to the sentence matches that of the topic).</p>
        </list-item>
        <list-item>
          <p>Eliminate those sentences that belong to documents linked from the query document (there
is a link in the topic supporting document that points to the document of the sentence).</p>
        </list-item>
      </list>
      <p>At this stage, a core set of sentences possibly relevant and important for the topic in question
has been delimited. In the case of the Spanish monolingual task, this set of sentences
forms the system output, while, for the English task, the sentences pass through subsequent processing
stages, described below.</p>
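      <p>The two elimination rules above amount to a filter on the source document of each sentence. A minimal sketch follows, assuming that every candidate carries the identifier of its document and that the links of the topic document are available as a list; the function name filter_candidates and the sample data are invented for the example.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def filter_candidates(candidates, topic_doc_id, linked_doc_ids):
    """Drop sentences from the topic document or from documents it links to.

    candidates: list of (doc_id, sentence) pairs in retrieval order.
    """
    excluded = {topic_doc_id} | set(linked_doc_ids)
    return [(doc_id, s) for doc_id, s in candidates if doc_id not in excluded]


# Toy data (hypothetical identifiers and sentences).
candidates = [
    ("Alice_Cooper", "Alice Cooper is an American singer."),
    ("Detroit", "Detroit is a city in Michigan."),
    ("Wayne's_World", "Alice Cooper appears in the film."),
]
kept = filter_candidates(candidates, "Alice_Cooper", ["Detroit"])
print(kept)
```
]]></preformat>
      <p>The retrieval order of the surviving sentences is preserved, so the filter can be applied before any later re-ranking.</p>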
      <p>
        Our English system proceeds by parsing the set of possibly relevant sentences, as well as the
topic document sentences, with Conexor's FDG parser [
        <xref ref-type="bibr" rid="ref5 ref7">7, 5</xref>
        ]. For the two sets of sentences
and the supporting document titles, the time expressions are also identified and resolved using the
temporal expression recogniser and normaliser previously developed by one of the authors [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
The relevance and novelty of each retrieved sentence with respect to the sentences included in
the Wikipedia topic document are then measured, in order to preserve the sentences most relevant
for updating the content of the topic page. Consequently, sentences manifesting a high degree of
similarity with the content of the topic document receive very low scores.
      </p>
      <p>The degree of relevance and novelty of a retrieved sentence with respect to a sentence from
the topic's Wikipedia page is a weighted measure combining the percentages of novel
named nouns (all uppercase nouns situated in the middle of the sentence), non-matching temporal
expressions, and novel nouns and verbs that occur in the former sentence but not in the latter.</p>
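      <p>One possible reading of this weighted measure is sketched below. The weights, the feature names and the normalisation (share of the candidate's items that are absent from the reference sentence) are assumptions, since the exact formula is not given in the text; the feature sets are supposed to have been extracted beforehand by the parser and the temporal expression normaliser.</p>
      <preformat preformat-type="code"><![CDATA[
```python
# Assumed weights; the values actually used by the system are not stated.
WEIGHTS = {"named": 0.4, "times": 0.2, "nouns": 0.2, "verbs": 0.2}


def novelty(candidate, reference, weights=WEIGHTS):
    """Weighted share of candidate items that are absent from reference.

    Both arguments map a feature name to a set of strings.
    Features that are empty in the candidate contribute zero.
    """
    score = 0.0
    for feature, weight in weights.items():
        items = candidate.get(feature, set())
        if items:
            novel = items - reference.get(feature, set())
            score += weight * len(novel) / len(items)
    return score


# Toy feature sets (invented for the example).
candidate = {
    "named": {"Cooper", "Detroit"},
    "times": {"1948"},
    "nouns": {"singer", "city"},
    "verbs": {"bear"},
}
reference = {
    "named": {"Cooper"},
    "times": {"1948"},
    "nouns": {"singer"},
    "verbs": {"sing"},
}
print(novelty(candidate, reference))
```
]]></preformat>
      <p>Under this reading a sentence that repeats the reference sentence scores zero, which agrees with the behaviour described for highly similar sentences.</p>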
      <p>The retrieved sentences are then ranked by their relevance/novelty score and
passed on to a temporal ordering module. This module labels each retrieved
sentence with the first temporal expression (TE) occurring in the sentence or, if the sentence contains no TE, with
the first TE of the supporting document title; if no TE is found there either, the sentence receives no label. Afterwards, the labelled sentences
are interchanged so that their new order reflects their temporal order. The unlabelled sentences
thus preserve their rank, which reflects their relevance.</p>
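      <p>The interchange step can be sketched as follows, under the assumption that each label is a normalised date string that sorts chronologically (for instance "1975" or "1975-03") and that labelled sentences are permuted only among the rank positions they already occupy. The function name temporal_reorder and the sample ranking are illustrative.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def temporal_reorder(ranked):
    """Reorder labelled sentences chronologically, keeping unlabelled ones fixed.

    ranked: list of (sentence, label) pairs in relevance order; label is a
    sortable date string, or None when no temporal expression was found.
    """
    # Rank positions currently held by labelled sentences.
    slots = [i for i, (_, label) in enumerate(ranked) if label is not None]
    # sorted() is stable, so equal dates keep their relevance order.
    dated = sorted((ranked[i] for i in slots), key=lambda item: item[1])
    result = list(ranked)
    for slot, item in zip(slots, dated):
        result[slot] = item
    return result


ranked = [("A", "1990"), ("B", None), ("C", "1975"), ("D", "1982")]
print(temporal_reorder(ranked))
```
]]></preformat>
      <p>Here the unlabelled sentence B stays at rank 2, while A, C and D are rearranged into chronological order around it.</p>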
    </sec>
    <sec id="sec-3">
      <title>Description of submitted runs</title>
      <p>Our WiQA submission includes one run for the Spanish monolingual task and two runs for the
English task. The Spanish run and the first English run were obtained with the same
methodology, applied to the language-specific text collections. The second English run was
obtained by employing more sophisticated NLP techniques to rank the set of retrieved sentences
according to novelty/relevance, and to order them chronologically.</p>
      <p>The first English run and the Spanish run employ only the Information Retrieval
capabilities of our system. Xapian first performs a search for the topic title constrained by the
NEAR operator, which locates the topic title words occurring in any order
within a short distance of each other. The feedback feature of Xapian is then employed to
extract the important terms from the set of documents defined as relevant (we classify as relevant the
first 50 retrieved documents). The query is then expanded with the identified relevant terms and
a second retrieval is performed, this time constrained by the PHRASE operator, which
matches only the sentences where the group of words defining the topic occur together
and in the same order as in the query. After post-processing the retrieved sentences by filtering
out those that occur either in the topic document or in any document linked from the topic
document, we keep only the first 10 resulting sentences and return them as the result of the first
English run and of the Spanish run, respectively.</p>
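      <p>The difference between the two constraints can be made concrete with a simplified, library-independent sketch over tokenised sentences. It follows the descriptions given above (any order within a short distance versus adjacent and in query order) and is not Xapian's implementation; the function names and the window size are assumptions.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from itertools import product


def positions(tokens, word):
    """Indices at which word occurs in tokens."""
    return [i for i, tok in enumerate(tokens) if tok == word]


def matches_near(tokens, words, window=10):
    """True if every word occurs and some choice of occurrences spans
    fewer than `window` positions, in any order."""
    occurrences = [positions(tokens, w) for w in words]
    if not all(occurrences):
        return False
    return any(max(c) - min(c) < window for c in product(*occurrences))


def matches_phrase(tokens, words):
    """True if the words occur adjacently and in the given order."""
    n = len(words)
    return any(tokens[i:i + n] == list(words) for i in range(len(tokens) - n + 1))


title = ["alice", "cooper"]
reversed_order = "cooper alice released an album".split()
query_order = "alice cooper released an album".split()
print(matches_near(reversed_order, title), matches_phrase(reversed_order, title))
print(matches_near(query_order, title), matches_phrase(query_order, title))
```
]]></preformat>
      <p>The looser constraint is thus suited to the first, recall-oriented pass that feeds the feedback step, and the stricter one to the final selection of sentences.</p>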
      <p>Our second English run performs, apart from Information Retrieval, a relevance/novelty-based
ranking, followed by a temporal ordering stage. Information Retrieval is employed in the same
manner and with the same specifications as in the first run. The retrieved sentences,
together with the topic sentences, are processed with the morpho-syntactic parser and with the
temporal expression identifier/normaliser described above. A measure of relevance and novelty is
then computed for each retrieved sentence with respect to each sentence of the topic document
in turn, and the minimum score obtained represents its degree of relevance/novelty
with respect to the entire topic document. A relevance/novelty ranking of the retrieved sentences
is then produced and passed on to the temporal ordering module, which
produces a new order of the sentences that reflects their succession on the temporal axis.</p>
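      <p>The minimum-based aggregation can be sketched as below. The pairwise scorer token_novelty is a deliberately crude stand-in (share of candidate tokens not found in the reference sentence) and is not the weighted measure used by the system; rank_by_novelty, the sample sentences and the assumption of a non-empty topic document are likewise illustrative.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def rank_by_novelty(retrieved, topic_sentences, pair_score):
    """Score each retrieved sentence by its minimum pairwise score against
    the topic sentences, then sort from most to least novel.

    topic_sentences is assumed to be non-empty.
    """
    scored = [
        (min(pair_score(r, t) for t in topic_sentences), r) for r in retrieved
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


def token_novelty(candidate, reference):
    """Stand-in pairwise score: share of candidate tokens not in reference."""
    cand = set(candidate.lower().split())
    ref = set(reference.lower().split())
    return len(cand - ref) / len(cand) if cand else 0.0


topic = ["alice cooper is a singer", "he was born in detroit"]
retrieved = ["alice cooper is a rock singer", "the band toured europe in 1975"]
ranked = rank_by_novelty(retrieved, topic, token_novelty)
for score, sentence in ranked:
    print(round(score, 3), sentence)
```
]]></preformat>
      <p>Taking the minimum means that a retrieved sentence is penalised as soon as it closely resembles any single sentence of the topic document.</p>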
    </sec>
    <sec id="sec-4">
      <title>Results and discussion</title>
      <p>In this section we present and comment on the obtained results. As already stated, we have
submitted three runs for WiQA 2006. Two runs represent solutions for the English monolingual
task (one using IR only and the other one employing extra re-ranking and temporal ordering
capabilities). The third run employs only IR and corresponds to the Spanish monolingual task.</p>
      <p>The following two tables summarize the results for the monolingual English (Table 1) and for
the monolingual Spanish (Table 2) tasks.</p>
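      <sec id="sec-4-1">
        <table-wrap id="tab1">
          <label>Table 1</label>
          <caption>
            <p>Results for the English monolingual task</p>
          </caption>
          <table>
            <thead>
              <tr><th>Run ID</th><th>Average Yield</th><th>MRR</th><th>Precision</th></tr>
            </thead>
            <tbody>
              <tr><td>1</td><td>2.98</td><td>0.53</td><td>0.33</td></tr>
              <tr><td>2</td><td>2.63</td><td>0.52</td><td>0.32</td></tr>
              <tr><td>Min</td><td>1.52</td><td>0.30</td><td>0.20</td></tr>
              <tr><td>Median</td><td>2.46</td><td>0.52</td><td>0.32</td></tr>
              <tr><td>Max</td><td>3.38</td><td>0.59</td><td>0.37</td></tr>
            </tbody>
          </table>
        </table-wrap>
      </sec>
      <sec id="sec-4-2">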
        <p>For each run, three different measures are provided (average yield, MRR and precision), all
measured for the top 10 snippets returned. The average yield is the average number of
supported, novel, non-repetitive and important snippets retrieved per topic. The MRR (Mean Reciprocal
Rank) score is based on the rank of the first supported, novel, non-repetitive and important snippet returned.
The precision is calculated as the percentage of supported, novel, non-repetitive and important
snippets among the submitted snippets. Apart from the results achieved by our runs,
the two tables also include the minimum, median and maximum scores obtained for the tasks at
hand. We are therefore able to evaluate and compare the performance of our system with respect
to that of the other participants.</p>
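        <p>To make the three measures concrete, the sketch below computes them from per-topic lists of assessor judgements. The data layout, the function name evaluate and the choice of computing precision over all submitted snippets pooled together (rather than averaging per-topic ratios) are assumptions; the official evaluation script may differ in such details.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def evaluate(judgements, cutoff=10):
    """Compute average yield, MRR and precision at the given cutoff.

    judgements: one list of booleans per topic, in rank order; True marks a
    supported, novel, non-repetitive and important snippet.
    At least one topic is assumed.
    """
    tops = [topic[:cutoff] for topic in judgements]
    good = sum(sum(topic) for topic in tops)
    submitted = sum(len(topic) for topic in tops)
    reciprocal_ranks = []
    for topic in tops:
        # Rank (1-based) of the first good snippet, if any.
        rank = next((i for i, ok in enumerate(topic, start=1) if ok), None)
        reciprocal_ranks.append(1.0 / rank if rank else 0.0)
    return {
        "average_yield": good / len(tops),
        "mrr": sum(reciprocal_ranks) / len(tops),
        "precision": good / submitted if submitted else 0.0,
    }


# Toy judgements for three topics (invented for the example).
judgements = [[True, False, True, False], [False, True], [False, False, False]]
scores = evaluate(judgements)
print(scores)
```
]]></preformat>
        <p>A topic with no good snippet contributes zero to both the yield and the reciprocal rank, so all three measures reward systems that return at least one useful snippet near the top.</p>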
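      </sec>
      <sec id="sec-4-3">
        <table-wrap id="tab2">
          <label>Table 2</label>
          <caption>
            <p>Results for the Spanish monolingual task</p>
          </caption>
          <table>
            <thead>
              <tr><th>Run ID</th><th>Average Yield</th><th>MRR</th><th>Precision</th></tr>
            </thead>
            <tbody>
              <tr><td>1</td><td>1.76</td><td>0.36</td><td>0.22</td></tr>
              <tr><td>Min</td><td>1.02</td><td/><td/></tr>
              <tr><td>Median</td><td>1.06</td><td/><td/></tr>
              <tr><td>Max</td><td>1.82</td><td/><td/></tr>
            </tbody>
          </table>
        </table-wrap>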
        <p>The results show that our feedback-driven IR approach obtains results above the
median value, both for English (Table 1) and for Spanish (Table 2). The fact that we score
considerably better for English than for Spanish, though using the same approach (average yield
2.98 EN vs. 1.76 ES, MRR 0.53 EN vs. 0.36 ES and precision 0.33 EN vs. 0.22 ES), might be due
to the different sizes of the Spanish and English Wikipedias. Since the
English version is notably larger than the Spanish one, there might be many more topic-relevant
text snippets spread across its entries.</p>
        <p>The run which uses, apart from IR, relevance and novelty re-ranking plus temporal
ordering was submitted only for English, as it makes use of language-dependent tools. The
results obtained are slightly worse than those of the first run. We expected
that these post-processing stages would bring at least a slight improvement over the results given
by the IR engine alone. However, performance decreased. An in-depth analysis is needed
to find out the causes of this unexpected behavior. In our opinion, further investigation
is required to improve the formula employed for measuring
the degree of relevance and novelty of a retrieved snippet, or to discover a more appropriate one. Besides, temporal processing should
probably be employed at a point when WiQA systems are more mature and the input snippets
are more reliable.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusions</title>
      <p>This paper presents our approach and participation in the WiQA 2006 competition. We propose
feedback-driven IR with query expansion in order to retrieve, in response to given Wikipedia
entries, relevant information scattered throughout the entire encyclopedia. Moreover, we have
introduced a post-processing stage consisting of novelty/relevance ranking and temporal ordering.
Our system participated both in the Spanish and English monolingual tasks.</p>
      <p>When compared to the other systems presented in this competition, we have obtained good
results that place us above the median score and, in the case of Spanish, quite close to the best result.
We therefore conclude that the proposed approach is appropriate for the WiQA task,
and we plan to find ways of improving the system's performance.</p>
      <p>Several directions for future work emerge naturally from a first look at, and shallow analysis of, the
results. Firstly, we would like to carry out an in-depth study of the effects of applying
novelty/relevance ranking and temporal ordering, as the results obtained were not those
expected. Secondly, we aim to investigate this topic further, taking our
feedback-driven Information Retrieval approach as the starting point.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgements</title>
      <p>This research has been partially funded by the Spanish Government under project CICyT number
TIC2003-07158-C04-01.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <source>Xapian: an Open Source Probabilistic IR library</source>
          . Online:
          <uri>www.xapian.org</uri>
          . Visited
          <date-in-citation content-type="access-date">2006-06-01</date-in-citation>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>D.</given-names>
            <surname>Ahn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Jijkoun</surname>
          </string-name>
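          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Mishne</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Müller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>de Rijke</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Schlobach</surname>
          </string-name>
          .
          <article-title>Using Wikipedia at the TREC QA track</article-title>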
          . In The University of Amsterdam at QA@CLEF
          <year>2004</year>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>R.</given-names>
            <surname>Baeza-Yates</surname>
          </string-name>
          and
          <string-name>
            <given-names>B.</given-names>
            <surname>Ribeiro-Neto</surname>
          </string-name>
          .
          <source>Modern Information Retrieval</source>
          .
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>V.</given-names>
            <surname>Jijkoun</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>de Rijke</surname>
          </string-name>
          .
          <article-title>A Pilot for Evaluating Exploratory Question Answering</article-title>
          . In
          <source>Proceedings of the SIGIR 2006 Workshop on Evaluating Exploratory Search Systems (EESS)</source>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>L.</given-names>
            <surname>Moreno-Monteagudo</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Suárez</surname>
          </string-name>
          .
          <article-title>Una Propuesta de Infraestructura para el Procesamiento del Lenguaje Natural</article-title>
          . In
          <source>Proceedings of SEPLN 2005</source>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>G.</given-names>
            <surname>Puscasu</surname>
          </string-name>
          .
          <article-title>A Framework for Temporal Resolution</article-title>
          .
          In
          <source>Proceedings of the 4th Conference on Language Resources and Evaluation (LREC 2004)</source>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>P.</given-names>
            <surname>Tapanainen</surname>
          </string-name>
          and
          <string-name>
            <given-names>T.</given-names>
            <surname>Järvinen</surname>
          </string-name>
          .
          <article-title>A Non-Projective Dependency Parser</article-title>
          . In
          <source>Proceedings of the 5th Conference on Applied Natural Language Processing</source>
          , ACL,
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Antonio</given-names>
            <surname>Toral</surname>
          </string-name>
          and
          <string-name>
            <given-names>Rafael</given-names>
            <surname>Muñoz</surname>
          </string-name>
          .
          <article-title>A proposal to automatically build and maintain gazetteers for named entity recognition using Wikipedia</article-title>
          . In
          <source>Workshop on New Text, 11th Conference of the European Chapter of the Association for Computational Linguistics</source>
          , Trento, Italy,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>