<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Exploiting Semantic Features for Image Retrieval at CLEF 2005</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Martínez-Fernández</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Villena</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>García-Serrano</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>González-Tortosa</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Carbone</string-name>
          <email>fcarbone@isys.dia.fi.upm.es</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Castagnone</string-name>
          <email>mcastagnone@isys.dia.fi.upm.es</email>
        </contrib>
        <aff>Universidad Politécnica de Madrid</aff>
        <aff>Universidad Carlos III de Madrid</aff>
        <aff>DAEDALUS - Data, Decisions and Language</aff>
      </contrib-group>
      <pub-date>
        <year>2004</year>
      </pub-date>
      <volume>3491</volume>
      <fpage>210</fpage>
      <lpage>219</lpage>
      <abstract>
        <p>This paper presents the MIRACLE team's approach to text-based image retrieval in the ImageCLEF 2005 ad hoc task. The experiments defined this year try to use semantic information sources, such as semantic dictionaries or text structure. For this purpose EuroWordnet has been considered, and a new algorithm to extract synonyms from the semantic database has been developed. The algorithm is based on the proximity of words in the EuroWordnet tree and was previously studied in [11]. In addition, semantic information is implicitly included in the fields into which image descriptions are structured.</p>
      </abstract>
      <kwd-group>
        <kwd>Linguistic Engineering</kwd>
        <kwd>Information Retrieval</kwd>
        <kwd>text-based image retrieval</kwd>
        <kwd>semantic data</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>ImageCLEF is the cross-language image retrieval track established in 2003 as part of the Cross Language Evaluation Forum (CLEF), a benchmarking event for multilingual information retrieval held annually since 2000. Images are language independent by nature, but they are often accompanied by texts semantically related to the image (e.g. textual captions or metadata). Images can then be retrieved using primitive features based on their contents (e.g. a visual exemplar), abstract features expressed through text, or a combination of both. Originally, ImageCLEF focused specifically on evaluating the retrieval of images described by text captions using queries written in a different language, and therefore dealt with monolingual and bilingual image retrieval (multilingual retrieval was not possible because the document collection is in only one language). Later, the scope of ImageCLEF widened and its goals evolved: to investigate the effectiveness of combining text and image for retrieval (text and content-based), to collect and provide resources for benchmarking image retrieval systems, and to promote the exchange of ideas that will lead to improvements in the performance of retrieval systems in general.</p>
      <p>The MIRACLE team is made up of three university research groups located in Madrid (UPM, UC3M and UAM) along with DAEDALUS, a company founded in 1998 as a spin-off of two of these groups. DAEDALUS is a leading company in linguistic technologies in Spain and is the coordinator of the MIRACLE team. This is the team's third participation in CLEF, after 2003 and 2004 [5],[8],[12],[15],[18]. In addition to the bilingual, monolingual and cross-lingual tasks, the team has participated in the ImageCLEF, Q&amp;A, WebCLEF and GeoCLEF tracks.</p>
      <p>This year a semantically driven approach to image retrieval has been tried. The semantic resources used are EuroWordnet [3] and the structure of the textual image descriptions. A new implementation of semantic query expansion has been developed, centered on computing the closeness among the nodes of the EuroWordnet tree, where each node corresponds to a word appearing in the query. An expansion method based on the same idea was previously described in [11]. Image captions, in turn, have a predefined structure in which each line of the text corresponds to a field. This information is exploited to build different indexes according to the type of field considered.</p>
    </sec>
    <sec id="sec-2">
      <title>Semantic Expansion using EuroWordnet</title>
      <p>EuroWordnet is a lexical database with semantic information in several languages. At the semantic level, for a
given language, different relations are defined among dictionary entries. These relations include
hypernymy, which links an entry to more general concepts; hyponymy, which links it to more specific
terms; and synonymy, which groups entries with the same meaning into sets called synsets.
All possible meanings of a given concept are part of the EuroWordnet data structure. A tree-like graph
can therefore be built from these semantic relations, and the distance between concepts in this tree can be
used as a disambiguation method when expanding query expressions.</p>
      <p>For example, the entry bank is defined in EuroWordnet as "a financial institution that accepts deposits and
channels the money into lending activities" and also as "sloping land (especially the slope beside a body of
water)", along with eight other senses. The question that arises is: how can the word bank be disambiguated
when it is used as part of a query? The answer considered in this work is: by means of the other words appearing
with bank in the query. That is, some of the synonyms of the words appearing with bank will overlap
with the synonyms of bank. If no overlap is found, the hyponyms and hypernyms of the given words are considered,
until some relation among the initial words is found. The senses that are not linked with the senses of the other
words appearing in the query expression can be discarded. In short, the main goal is to find a single path, in
the EuroWordnet tree, joining all the words that are present in the query.</p>
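      <p>The disambiguation idea described above can be illustrated with a minimal Python sketch. The sense inventory below is invented for illustration and is not EuroWordnet data, and the function names are hypothetical. A sense of a query word is kept only if its related words overlap with the words related to some other query word.</p>
      <preformat preformat-type="code"><![CDATA[
```python
# Toy sense inventory: word -> {sense label: set of related words}.
# These entries are illustrative assumptions, not real EuroWordnet content.
SENSES = {
    "bank": {
        "bank/finance": {"bank", "depository", "money", "loan"},
        "bank/river": {"bank", "slope", "shore", "river"},
    },
    "river": {
        "river/stream": {"river", "stream", "water", "shore"},
    },
}


def related_words(word):
    """Union of the related words over all senses of `word`."""
    related = set()
    for members in SENSES.get(word, {}).values():
        related |= members
    return related


def disambiguate(word, query_words):
    """Return the senses of `word` that are linked to another query word.

    If no sense overlaps with the context, every sense is returned,
    i.e. the word stays ambiguous.
    """
    context = set()
    for other in query_words:
        if other != word:
            context |= related_words(other)
    senses = SENSES.get(word, {})
    kept = [
        label
        for label, members in senses.items()
        if (members - {word}).intersection(context)
    ]
    return kept or list(senses)
```
]]></preformat>
      <p>With this toy data, disambiguating bank in the query "bank river" keeps only the river-side sense, because it shares the words shore and river with the context. The sketch does not show the fallback to hyponyms and hypernyms.</p>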
      <sec id="sec-2-1">
        <title>The described algorithm has been implemented in Ciao Prolog, and an adaptation of Dijkstra's algorithm has been developed to compute the shortest path between two nodes. To keep the expansion method efficient, not all possible paths between nodes are computed: a maximum of three jumps is allowed, which limits execution times to an affordable value.</title>
      </sec>
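      <p>The jump limit can be sketched as follows. This is a Python illustration under stated assumptions, not the Ciao Prolog implementation: the graph is hypothetical, all edges are given unit weight (in which case Dijkstra's algorithm reduces to a breadth-first search), and the function name is invented.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import deque

MAX_JUMPS = 3  # limit stated in the text

# Hypothetical undirected concept graph; edges stand for synonym,
# hypernym or hyponym links. Not real EuroWordnet data.
GRAPH = {
    "bank": ["slope", "institution"],
    "slope": ["bank", "shore"],
    "shore": ["slope", "river"],
    "river": ["shore", "stream"],
    "stream": ["river"],
    "institution": ["bank"],
}


def shortest_path(graph, source, target, max_jumps=MAX_JUMPS):
    """Return the shortest path from source to target as a list of nodes,
    or None if every path needs more than `max_jumps` edges."""
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        if len(path) - 1 == max_jumps:
            continue  # do not expand beyond the jump limit
        for neighbour in graph.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None
```
]]></preformat>
      <p>In this toy graph, river is reachable from bank in exactly three jumps, whereas stream would need four and is therefore reported as unreachable.</p>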
    </sec>
    <sec id="sec-3">
      <title>Morpho-Syntactic Processing</title>
      <p>
        A more refined linguistic processing has been applied to the supplied captions. The availability of a tool to make
morpho-syntactic analysis of English texts, based on the TreeBank tag set [
        <xref ref-type="bibr" rid="ref9">14</xref>
        ], allows for a deeper linguistic
processing. This module assigns a POS tag to each word and identifies the phrases that appear in a
sentence. Negative particles present in sentences can then be identified, so that terms which are explicitly
unwanted can be excluded from the search. To identify these negative particles, different sets of topics
were analysed, yielding a set of patterns to be matched against the input text. Documents containing the
terms obtained in this way were excluded from the results by applying the corresponding
operator provided by Xapian. This kind of processing could only be applied to English texts. Several
runs including this functionality were submitted, always in combination with the semantic
expansion based on EuroWordnet.
      </p>
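      <p>A minimal sketch of this kind of pattern matching is given below. The pattern is hypothetical, since the actual set derived from the topics is not listed in the text, and the output uses a generic boolean query syntax rather than a specific Xapian call.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import re

# Hypothetical patterns; the set actually derived from the topics is not
# given in the text.
NEGATIVE_PATTERNS = [
    re.compile(
        r"\b(?:not|without|excluding)\s+(?:(?:a|an|the|any)\s+)?(\w+)",
        re.IGNORECASE,
    ),
]


def split_terms(topic):
    """Split a topic into (wanted, excluded) lists of lowercase words."""
    excluded = []
    for pattern in NEGATIVE_PATTERNS:
        excluded += [m.group(1).lower() for m in pattern.finditer(topic)]
        topic = pattern.sub(" ", topic)  # drop the negated fragment
    wanted = re.findall(r"\w+", topic.lower())
    return wanted, excluded


def to_boolean_query(wanted, excluded):
    """Render a boolean query string of the form (a OR b) NOT (c OR d)."""
    query = "(" + " OR ".join(wanted) + ")"
    if excluded:
        query += " NOT (" + " OR ".join(excluded) + ")"
    return query
```
]]></preformat>
      <p>For the topic "Boats without people" the sketch yields the wanted term boats and the excluded term people.</p>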
    </sec>
    <sec id="sec-4">
      <title>Exploiting image caption structure</title>
      <p>
        The captions supplied for the St. Andrews image collection are divided into fields, each of them containing
specific information such as the short title, the location, etc. Image textual descriptions are as shown in Figure 2. A total
of 9 fields are defined for each caption, and only some of them are considered of interest for the defined retrieval
tasks. Taking this structure into account, several indexes have been defined: one containing only image
descriptions, another with the short title, one with the photographer, another with the places shown in
the images, one with the dates when the pictures were taken and a last one with the proper nouns that have
been identified in the image caption. In this way it is possible to isolate pieces of information, allowing the
retrieval of images based on specific data and, if necessary, the mixing of results from different retrieval processes over distinct
indexes. This year, the Xapian search engine [
        <xref ref-type="bibr" rid="ref14">19</xref>
        ] has again been used to index the text representations
of the image captions, and the ability of this search engine to perform searches combining independent
indexes has been exploited.
      </p>
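      <p>The per-field indexing can be sketched in Python as follows. The caption layout (one "Field: value" line per field), the field-to-index mapping and the function names are assumptions made for illustration; the actual indexes were built with Xapian.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import defaultdict

# Hypothetical mapping from caption field names to index names.
FIELD_TO_INDEX = {
    "Short title": "title",
    "Description": "description",
    "Photographer": "author",
    "Location": "place",
    "Date": "date",
}


def parse_caption(text):
    """Parse 'Field: value' lines into a dict (assumed caption layout)."""
    fields = {}
    for line in text.strip().splitlines():
        name, _, value = line.partition(":")
        fields[name.strip()] = value.strip()
    return fields


def build_indexes(captions):
    """Build one inverted index (term -> set of record ids) per field."""
    indexes = defaultdict(lambda: defaultdict(set))
    for record_id, text in captions.items():
        for field, value in parse_caption(text).items():
            index_name = FIELD_TO_INDEX.get(field)
            if index_name is None:
                continue  # field not of interest for retrieval
            for term in value.lower().split():
                indexes[index_name][term].add(record_id)
    return indexes
```
]]></preformat>
      <p>Keeping one index per field makes it possible to send a photographer name only to the author index while free-text terms go to the description or title index.</p>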
      <p>[Figure 2: example of an image caption. Field labels recoverable from the figure: Record ID, Short Title, Long Title, Description.]</p>
      <p>This distribution of information makes it possible to assign a semantic interpretation to each field and, with
minimal processing of the query, to search for a specific entity in the right index. For example,
several queries ask for images taken by a given photographer; simple processing of the query allows
structures like "... taken by ..." to be identified, so that the name to be searched can be extracted and looked up
in the picture author index. This strategy allows for a fine-grained search process that is expected to provide
better precision figures.</p>
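      <p>The "... taken by ..." processing mentioned above could look like the following Python sketch. The regular expression, the index names and the example photographer name are assumptions for illustration only.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import re

# Assumed pattern: "taken by" followed by one or more capitalised tokens.
TAKEN_BY = re.compile(r"\btaken by ((?:[A-Z][\w.'-]*\s*)+)")


def route_query(topic_title):
    """Return (index name, search text) for a topic title.

    A photographer name found after "taken by" is routed to the author
    index; anything else is searched in the general caption index.
    """
    match = TAKEN_BY.search(topic_title)
    if match:
        return "author", match.group(1).strip()
    return "caption", topic_title
```
]]></preformat>
      <p>For a hypothetical topic such as "Photos taken by John Smith in 1900", the sketch routes the text "John Smith" to the author index.</p>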
    </sec>
    <sec id="sec-5">
      <title>Experiments Description</title>
      <p>This year, monolingual and bilingual experiments have been performed. In the monolingual experiments, a total of 17
runs were submitted, while for the bilingual experiments 89 runs were submitted for 23 different
source languages. As is well known, an Information Retrieval process is divided into two main subtasks,
indexing and searching. To obtain the best retrieval performance both subtasks can be parameterized, always taking
into account that index terms and search terms must be represented using a common model (i.e., in some
situations, the same alterations introduced for index terms must be applied to search terms). Regarding the
query processing subtask, the following features can be included:</p>
      <list list-type="bullet">
        <list-item>
          <p>Source Field: The topics used in the ImageCLEF track are divided into two main fields, a title (a short description
of the aim of the topic) and a narrative (a more detailed description of the topic purpose). Depending on
which of these fields are used, several experiments can be defined. In our case, experiments where only the title
field of the query has been used are marked with 't0'; if only the narrative field has been considered, the name
of the run is marked with 'd0'; and when both fields are used together, 'td' is included in the name of the run.</p>
        </list-item>
        <list-item>
          <p>Baseline: This is the basic and simplest approach, whose results constitute the record to beat. The
transformations performed on the input text are the following: a basic parser divides the text into
words, then all words are normalized (by lowercasing every letter and removing special characters),
stopwords are removed and, finally, the stem of each word in the query is obtained. The resulting
words are then used to search the indexes.</p>
        </list-item>
        <list-item>
          <p>Query Expansion: As explained in section 2, a semantic expansion algorithm has been implemented this year to include in the query concepts semantically related to the provided words. This method is applied to the output of the baseline process, but with the stemming step moved to the end, i.e., to the output of the query expansion module.</p>
        </list-item>
        <list-item>
          <p>Linguistic Processing: This component (described in section 3) produces the linguistic analysis of the text contained in the query. As already mentioned, this analysis selects only the nouns appearing in the topic and identifies which words should be excluded from the query. This module is always used in combination with the Query Expansion component.</p>
        </list-item>
        <list-item>
          <p>Combination Operator: There are two possible ways of joining the words of the query with the expanded terms. The simplest one uses the OR operator to combine every pair of words. A more complex one uses the OR operator to join the synonyms of a word and the AND operator to join the sets of synonyms of different words.</p>
        </list-item>
        <list-item>
          <p>Proper noun module: A simple proper noun detection module, based on a finite state automaton, has also been applied to image captions and topics. Some attempts were made to work only with the proper nouns identified in the query and in the image descriptions. Because of the poor recall and precision figures obtained with former ImageCLEF data sets, none of these runs was submitted.</p>
        </list-item>
      </list>
      <p>Up to this point, the different transformations applied in the query processing task have been described. The
processes followed in the indexing subtask are explained next. Several indexes have been built from the image
caption collection, each with a different data arrangement. When more than one of these indexes was
targeted as part of the same search process, the ability of the Xapian search engine to perform queries over
several databases at the same time was exploited (the term 'database' is used here in the same sense as 'index'). A total of seven indexes were built:</p>
      <list list-type="bullet">
        <list-item>
          <p>Caption index: All information contained in the image caption was used to build a single index.</p>
        </list-item>
        <list-item>
          <p>Title index: The titles included in image textual descriptions were indexed in the same database.</p>
        </list-item>
        <list-item>
          <p>Description index: Only the description field of the image caption was used to build an index.</p>
        </list-item>
        <list-item>
          <p>Author index: All contents of the Photographer fields present in image captions were indexed as part of the same database.</p>
        </list-item>
        <list-item>
          <p>Place index: Words appearing in the Location field of image captions were used to build an index.</p>
        </list-item>
        <list-item>
          <p>Date index: Words and dates present in the Date field of the captions were indexed as part of the same database.</p>
        </list-item>
        <list-item>
          <p>Proper Nouns index: All proper nouns detected in any of the fields defined for the image captions are included in this index. The results presented in section 6 do not include experiments involving this index, because it was rejected due to the low precision and recall values obtained.</p>
        </list-item>
      </list>
      <p>When fields with plain text content are processed (such as the 'Description' or 'Title' attributes), the same parsing, stopword removal, accented character substitution and stemming processes as the ones applied to the queries are used.</p>
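      <p>The baseline transformations and the two combination operators described above can be sketched in Python as follows. The stopword list and the suffix-stripping stemmer are crude stand-ins chosen for illustration, and the function names are hypothetical; the text does not specify the stopword list or the stemmer at this level of detail.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import re

# Tiny illustrative stopword list (an assumption, not the list used).
STOPWORDS = {"a", "an", "and", "in", "of", "on", "the", "with"}


def naive_stem(word):
    """Crude suffix stripper standing in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def baseline_terms(text):
    """Parse, normalise, remove stopwords and stem, in that order."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [naive_stem(w) for w in words if w not in STOPWORDS]


def combine(synonym_sets, mode):
    """Join expanded terms into a boolean query string.

    mode 'o': OR over every term.
    mode 'a': OR within each word's synonyms, AND between the groups.
    """
    if mode == "o":
        return " OR ".join(t for group in synonym_sets for t in group)
    if mode == "a":
        return " AND ".join("(" + " OR ".join(g) + ")" for g in synonym_sets)
    raise ValueError("mode must be 'o' or 'a'")
```
]]></preformat>
      <p>For two query words with the synonym groups (boat, ship) and (lake, loch), mode 'o' gives "boat OR ship OR lake OR loch", while mode 'a' gives "(boat OR ship) AND (lake OR loch)". The second form is more restrictive, since every query word must be matched by at least one of its synonyms.</p>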
      <p>Taking these descriptions into account, the nomenclature followed for the submitted experiments is depicted in Figure 3. The possible values for each field are:</p>
      <list list-type="bullet">
        <list-item>
          <p>The 'Query field used' part can take the values 't0', when only the query title is used; 'd0', when only the
narrative field is used; and 'dt', when both title and narrative are used to build the search expression for
the search engine.</p>
        </list-item>
        <list-item>
          <p>The 'Linguistic processing applied to the query' part can take the values 'base', when the baseline processes
are applied (i.e., parsing, stopword filtering, special character substitution, lowercasing
and stemming); 's', when the module that obtains the morphosyntactic analysis of the query is used; 'e',
when the semantic expansion based on EuroWordnet is applied; 'o', when the operator used to combine the
expanded words is OR; 'a', when the operator used to join expanded query words is a combination of OR
and AND operators; and 'pn', when proper nouns are identified in the text.</p>
        </list-item>
        <list-item>
          <p>The 'Index used' part identifies which index (or indexes) is used to retrieve images. The possible
values are 't0', if only the titles of the captions are indexed; 'd0', when only the descriptions of the
captions are searched; 'dt', when titles and descriptions together constitute a single index; 'attr', if separate indexes
for the different caption fields are used (the identified fields are text, author, date and place); and finally
'allf', when a single index with the content of all fields is used.</p>
        </list-item>
        <list-item>
          <p>The 'Source Language' part identifies the language in which the query is supplied. In monolingual
experiments it is English, but in bilingual experiments it can identify one of 22 different
languages (Bulgarian, Croatian, Czech, Dutch, English, Finnish, Filipino, French, German, Greek,
Hungarian, Italian, Japanese, Norwegian, Polish, Portuguese, Romanian, Russian, Spanish - Latinamerica,
Spanish - Spain, Swedish, Turkish and Simplified Chinese).</p>
        </list-item>
        <list-item>
          <p>The last part, denoted 'Target Language', identifies the language in which the image caption collection
is written. So far, the target language has always been English.</p>
        </list-item>
      </list>
      <p>
        For bilingual experiments, where the source language is other than English, different translation tools have been
used to translate the original query texts into English. Among these translation tools it is worth mentioning Systran
5.0 [
        <xref ref-type="bibr" rid="ref10">15</xref>
        ], FreeTranslation [
        <xref ref-type="bibr" rid="ref3">4</xref>
        ] and TranExp InterTran [
        <xref ref-type="bibr" rid="ref11">16</xref>
        ]. Once the queries have been translated, the subsequent
processes are the same as the ones used in the monolingual experiments.
      </p>
      <p>[Figure 3: structure of a run name, using 'imirt0basedtenen' as the example. Labelled parts: research group identifier, query field used, linguistic processing applied to the query, index used, source language, target language.]</p>
      <p>The chart 'MAP for MIRACLE Monolingual Experiments' shows the Mean Average Precision (MAP) for the
monolingual experiments presented this year by the MIRACLE group. The best monolingual result is obtained for
experiment 'imirt0attren', where the title of the topic is processed with the baseline procedure (parsing,
word normalization, stopword removal and stemming) and the resulting query is run against the combination
of attribute indexes (text, place, author, date). The MAP for this experiment is 37%, not far from the next one,
'imirt0allfen'. It is worth mentioning that these figures are not conclusive: a programming error in the
combination of the different indexes introduced duplicate entries in the final result list. These duplicates
were simply deleted from the final result list, which lowered the precision and recall rates. At the time of writing, it
has not been possible to repeat the experiments to produce new runs without duplicates.</p>
      <p>[Figure: MAP for MIRACLE Monolingual Experiments. Runs shown on the axis: imird0baset0enen, imirtdbaset0enen, imird0basedtenen, imird0based0enen, imirtdbasedtenen, imirtdbased0enen, imirtdseot0enen, imirt0eot0enen, imirt0baset0enen, imirtdseotdenen, imirtdseod0enen, imirt0eotdenen, imirt0eod0enen, imirt0basedtenen, imirt0based0enen, imirt0allfen.]</p>
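      <p>The run nomenclature described above lends itself to a small parser. The regular expression below is an assumption reconstructed from the listed field values and from full names such as 'imirt0basedtenen'; shortened names such as 'imirt0attren' do not follow the full pattern and are deliberately not handled.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import re

RUN_NAME = re.compile(
    r"imir"                                  # research group identifier
    r"(?P<query_field>t0|d0|td|dt)"          # topic field(s) used
    r"(?P<processing>base|(?:pn|[seoa])+)"   # baseline or module flags
    r"(?P<index>t0|d0|dt|td|attr|allf)"      # index (or indexes) searched
    r"(?P<source>[a-z]{2})"                  # source language code
    r"(?P<target>[a-z]{2})"                  # target language code
)


def parse_run_name(name):
    """Split a full run name into its fields, or return None if the
    name does not follow the full pattern."""
    match = RUN_NAME.fullmatch(name)
    return match.groupdict() if match else None
```
]]></preformat>
      <p>Both orders 'td' and 'dt' are accepted because both appear in the run names and in the description of the nomenclature.</p>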
      <p>Results for the bilingual experiments are also very interesting. The chart 'MAP for MIRACLE Bilingual Experiments'
shows the differences among the experiments for each language, comparing the MAP
values of the best result for each language. The best bilingual MAP result is 31%, and it
is reached for Portuguese. Compared with the best monolingual result, a difference of around 7%
in MAP can be seen.</p>
      <p>As already observed in previous campaigns, the translation process between languages introduces a lot of noise,
decreasing the precision of the retrieval process. The process followed in the 'imirt0attrpt' experiment is
equivalent to the one applied in the best monolingual run, but with a prior translation step using the
previously mentioned translators. That is, the topic title is translated from Portuguese into English and then
parsed and normalized, stopwords are removed and the remaining words are stemmed. The words forming the query are
ORed and searched against the combination of attribute indexes (text, place, author, date). Of course, the previously explained problem with duplicate results in the final list also applies to the bilingual runs submitted.</p>
      <p>[Figure: MAP for MIRACLE Bilingual Experiments. Runs shown on the axis: imirt0allfhu, imirt0attrfi, imirt0attrcr, imirt0allfcz, imirt0allfbu, imirt0attrro, imirt0allffl, imirt0attrpo, imirt0attrno, imirt0allfzh, imirt0attrsw, imirt0allftk, imirt0attrsp, imirt0allfgr, imirt0attrit, imirt0allfru, imirt0attrge, imirt0attrja, imirt0attrfr, imirt0allfsl, imirt0attrdu, imirt0attrpt.]</p>
      <p>It can also be observed that the MIRACLE team has been the only participant for some source languages, such as
Bulgarian, Croatian, Czech, Filipino, Finnish, Hungarian, Norwegian, Polish, Romanian and Turkish.</p>
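      <p>The duplicate entries mentioned in the results arise when ranked lists coming from several indexes are combined. A minimal Python sketch of a merge that avoids them is shown below. Keeping the highest score per document and truncating the list to 1000 entries are assumptions made here; the text does not state how scores from different indexes were combined.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def merge_results(result_lists, limit=1000):
    """Merge ranked (doc_id, score) lists from several indexes.

    Each document is kept once, with its highest score, so the final
    list contains no duplicate entries. Ties are broken by doc_id to
    keep the ordering deterministic.
    """
    best = {}
    for results in result_lists:
        for doc_id, score in results:
            if doc_id not in best or score > best[doc_id]:
                best[doc_id] = score
    ranked = sorted(best.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]
```
]]></preformat>
      <p>Deduplicating before truncation matters: if duplicates are removed only after the list has been cut, the submitted run ends up shorter than allowed, which is consistent with the loss of precision and recall reported above.</p>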
    </sec>
    <sec id="sec-6">
      <title>Conclusions and Future Work</title>
      <p>The results shown in the previous section lead to some preliminary conclusions. The best results are
obtained with the approach in which the information in the caption fields is isolated. The query expansion based on
EuroWordnet does not lead to better results, although more accurate synonyms are selected by applying the new
expansion algorithm, which tries to disambiguate the different senses of the words appearing in the topic text. The
reason may be related to the number of words constituting the query once the expansion is performed: too
much information to perform a quality search. This idea is reinforced by the MAP produced by
the experiment 'imirtdbasedtenen', where the same process is followed as in 'imirt0basedtenen' but the
narrative field of the topic is also used. In this case a MAP of 10% is obtained, 20% worse than in the experiment where
only the title of the topic is used.</p>
      <p>On the other hand, no definitive conclusions can be drawn until the experiments are repeated without producing duplicate results in the final list.</p>
      <p>Future work in text-based image retrieval could be devoted to the use of the trie-based indexing tool available among the utilities developed by the MIRACLE team and to the improvement of the semantic expansion algorithm which, although it has not proved useful for this task, seems to be accurate when the obtained expansions are inspected manually.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgements</title>
      <p>This work has been partially supported by the Spanish R+D National Plan, through the project RIMMEL (Multilingual and Multimedia Information Retrieval, and its Evaluation), TIN2004-07588-C03-01, and also by the European Union, through the funding of the NEDINE project in the e-Content programme.</p>
      <p>Special thanks go to our colleagues of the MIRACLE team (in alphabetical order): José Carlos
González-Cristóbal, Ana González Ledesma, José Miguel Goñi-Menoyo, José Mª Guirao, Sara Lana-Serrano,
Paloma Martínez-Fernández, Ángel Martínez-González, Antonio Moreno Sandoval and César de Pablo Sánchez.</p>
      <p>[1] University of Neuchatel. Page of resources for CLEF (stopwords, transliteration, stemmers, …). On line
http://www.unine.ch/info/clef/. [Visited 13/07/2005]</p>
      <p>Martínez, J.L.; Villena-Román, J.; Fombella, J.; García-Serrano, A.; Ruiz, A.; Martínez, P.; Goñi, J.M.;
and González, J.C. Evaluation of MIRACLE approach results for CLEF 2003.
Working Notes for the CLEF 2003 Workshop (Carol Peters, Ed.), 21-22 August, Trondheim, Norway.</p>
      <p>Montoyo, A. "Método basado en marcas de especificidad para WSD". In Proceedings of SEPLN, nº 24,
September 2000.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [2]
          Aoe, Jun-Ichi; Morimoto, Katsushi; Sato, Takashi.
          <article-title>An Efficient Implementation of Trie Structures</article-title>
          .
          <source>Software Practice and Experience</source>
          <volume>22</volume>
          (
          <issue>9</issue>
          ):
          <fpage>695</fpage>
          -
          <lpage>721</lpage>
          ,
          <year>1992</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [3] “Eurowordnet:
          <article-title>Building a Multilingual Database with Wordnets for several European Languages</article-title>
          .” http://www.let.uva.nl/ewn/, March (
          <year>1996</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [4] FreeTranslation.
          <article-title>Free text translator</article-title>
          . On line http://www.freetranslation.com [Visited 20/07/2005].
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [5] Goñi-Menoyo, José M.; González, José C.; Martínez-Fernández, José L.; and Villena, J.
          <article-title>MIRACLE's Hybrid Approach to Bilingual and Monolingual Information Retrieval</article-title>
          . CLEF 2004 proceedings (Peters, C. et al., Eds.).
          <source>Lecture Notes in Computer Science</source>
          , vol.
          <volume>3491</volume>
          , pp.
          <fpage>188</fpage>
          -
          <lpage>199</lpage>
          . Springer,
          <year>2005</year>
          (to appear).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [6] Goñi-Menoyo, José M.; González, José C.; Martínez-Fernández, José L.; Villena-Román, Julio; García-Serrano, Ana; Martínez-Fernández, Paloma; de Pablo-Sánchez, César; and Alonso-Sánchez, Javier.
          <article-title>MIRACLE's hybrid approach to bilingual and monolingual Information Retrieval</article-title>
          .
          <source>Working Notes for the CLEF 2004 Workshop (Carol Peters and Francesca Borri, Eds.)</source>
          , pp.
          <fpage>141</fpage>
          -
          <lpage>150</lpage>
          . Bath, United Kingdom,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [7] Goñi-Menoyo, José Miguel; González-Cristóbal, José Carlos; and Fombella-Mourelle, Jorge.
          <article-title>An optimised trie index for natural language processing lexicons</article-title>
          .
          <source>MIRACLE Technical Report</source>
          . Universidad Politécnica de Madrid,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [12]
          <string-name>
            <surname>de Pablo</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Martínez-Fernández</surname>
            ,
            <given-names>J. L.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Martínez</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Villena</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <article-title>miraQA: Initial experiments in Question Answering</article-title>
          .
          <source>CLEF 2004 proceedings (Peters, C. et al., Eds.)</source>
          .
          <series>Lecture Notes in Computer Science</series>
          , vol.
          <volume>3491</volume>
          . Springer,
          <year>2005</year>
          (to appear).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Porter</surname>
            ,
            <given-names>Martin.</given-names>
          </string-name>
          <article-title>Snowball stemmers and resources page</article-title>
          . On line http://www.snowball.tartarus.org. [Visited 13/07/2005]
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>B.</given-names>
            <surname>Santorini</surname>
          </string-name>
          ,
          <article-title>Part-of-speech tagging guidelines for the Penn Treebank Project (3rd revision)</article-title>
          . Department of Computer and Information Science, University of Pennsylvania, Philadelphia,
          <source>Tech. Rep. MS-CIS-90-47</source>
          , LINC Lab 178,
          <year>1990</year>
          , ftp://ftp.cis.upenn.edu/pub/treebank/doc/manual/root.ps.gz.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [15]
          SYSTRAN Software Inc., USA.
          <article-title>SYSTRAN 5.0 translation resources</article-title>
          . On line http://www.systransoft.com [Visited 13/07/2005].
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [16]
          Translation Experts Ltd.
          <article-title>InterTrans translation resources</article-title>
          . On line http://www.tranexp.com [Visited 28/07/2005].
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Villena</surname>
          </string-name>
          , Julio; Martínez, José L.;
          <string-name>
            <surname>Fombella</surname>
          </string-name>
          , Jorge; G. Serrano, Ana; Ruiz, Alberto; Martínez, Paloma; Goñi, José M.; and González, José C.
          <article-title>Image Retrieval: The MIRACLE Approach</article-title>
          .
          <source>Comparative Evaluation of Multilingual Information Access Systems (Peters, C.; Gonzalo, J.; Braschler, M.; and Kluck, M., Eds.)</source>
          .
          <series>Lecture Notes in Computer Science</series>
          , vol.
          <volume>3237</volume>
          , pp.
          <fpage>621</fpage>
          -
          <lpage>630</lpage>
          . Springer,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [18]
          <string-name>
            <surname>Villena-Román</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Martínez</surname>
            ,
            <given-names>J.L.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Fombella</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>García-Serrano</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Ruiz</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Martínez</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Goñi</surname>
            ,
            <given-names>J.M.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>González</surname>
            ,
            <given-names>J.C.</given-names>
          </string-name>
          <article-title>MIRACLE results for ImageCLEF 2003</article-title>
          .
          <source>Working Notes for the CLEF 2003 Workshop (Carol Peters, Ed.)</source>
          , 21-22 August, Trondheim, Norway.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [19]
          <article-title>Xapian: an Open Source Probabilistic Information Retrieval library</article-title>
          . On line http://www.xapian.org. [Visited 13/07/2005]
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>