<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>SINAI at ImageCLEF 2006</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>M.C. Díaz-Galiano</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>M.A. García-Cumbreras</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>M.T. Martín-Valdivia</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>A. Montejo-Raez</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>L.A. Ureña-López</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Jaén</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2006</year>
      </pub-date>
      <abstract>
        <p />
      </abstract>
      <kwd-group>
        <kwd>H.3 [Information Storage and Retrieval]</kwd>
        <kwd>H.3.1 Content Analysis and Indexing</kwd>
        <kwd>H.3.3 Information Search and Retrieval</kwd>
        <kwd>H.3.4 Systems and Software</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Abstract</title>
      <p>This paper describes the participation of the SINAI team in the ImageCLEF campaign. The
SINAI research group participated in both the ad hoc task and the medical task. The
experiments carried out in the two tasks follow very different approaches.</p>
      <p>For the ad hoc task, the main IR system is the same as the one used in the 2005
ImageCLEF ad hoc task. The improvement is a new Machine Translation system that
works with several translators and implements several heuristics.
We participated in the English monolingual task and in six bilingual tasks for
the following languages: Dutch, French, German, Italian, Portuguese and Spanish. The
English monolingual results are good (0.2234 is our best result), whereas the bilingual
runs show a loss of precision; some languages, such as German and Spanish, work better
than others because of the translations.</p>
      <p>For the medical task, this year's experiments are new and very different from the
ImageCLEFmed2005 ones. First, we processed the set of collections using
Information Gain (IG) to determine the best tags to consider
in the indexing process. These tags are expected to provide the most relevant
and non-redundant information, and they were selected automatically according to
our information-based strategy, using the data and relevance assessments from
last year.</p>
      <p>This year, our goal was to analyze how tag selection contributes to the quality
of the final results. To select a reduced set of tags we computed IG. Eleven different
collections were generated according to the percentage of tags with the highest IG value.
Finally, only the results for the selections of 20%, 30% and 40%
of the available tags were submitted, since they gave the best performance on the 2005 data.</p>
      <p>We submitted experiments using only the textual query and experiments mixing the textual
and visual queries. For the visual query we used the GIFT lists provided by the
organizers. Surprisingly, the system performs better with text retrieval alone than
with mixed textual and visual retrieval.</p>
      <p>We also tried to show that information filtering through tag selection
using information gain improves retrieval results without the need for manual selection,
but the results obtained are not conclusive and are not
as successful as desired. Due to a processing mistake, all our mixed runs
obtained the same result as the visual GIFT baseline (0.0467). At the time of
writing, we are modifying our system in order to solve this problem.</p>
    </sec>
    <sec id="sec-2">
      <title>Introduction</title>
      <p>This is the second participation of the SINAI research group at the ImageCLEF campaign. We
have participated in the ad hoc task and the medical task.</p>
      <p>As a cross-language retrieval task, multilingual image retrieval based on query translation can
achieve high performance, even higher than monolingual retrieval. The ad hoc task involves retrieving
relevant images using the text associated with each image query.</p>
      <p>
        The goal of the medical task is to retrieve relevant images based on an image query [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. For
this, the organizers supply a multilingual and visual collection and a set of queries (images with an
associated short text in English, French and German). We first preprocess the collection using
Information Gain (IG). This year, our main goal is to compare the effect of selecting different tags
from the collection using this measure. We attempted to choose the tags that provide the
best information in order to improve the results obtained. We generated several collections
with different numbers of tags depending on their IG. Finally, we only submitted runs on 3
different collections (at 20%, 30% and 40%) because they gave the best results on the
ImageCLEFmed2005 data. For each collection, we first compare the results obtained using only the textual
query against the results obtained combining textual and visual information. Finally, we used
different methods to merge the visual and textual results.
      </p>
      <p>The next section describes the ad hoc experiments. Section 3 explains the experiments for
the medical task. Finally, conclusions and future work are presented in Section 4.</p>
    </sec>
    <sec id="sec-3">
      <title>The Ad Hoc Task</title>
      <p>The goal of the ad hoc task is, given a multilingual query, to find as many relevant images as
possible in an image collection.</p>
      <p>
        The aim of the ad hoc task is to compare results with and without pseudo-relevance
feedback, with and without query expansion, using different methods of query translation or using
different retrieval models and weighting functions [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <sec id="sec-3-1">
        <title>Experiments Description</title>
        <p>In our experiments we have used seven languages: Dutch, English, French, German, Italian,
Portuguese and Spanish.</p>
        <p>Because in 2005 the results were quite good, this year we have used the same IR system and the
same strategies, but introducing a new translation module. This module combines some Machine
Translators and implements some heuristics.</p>
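        <p>The Machine Translators used were the following (in brackets, the languages for which each
one is the default translator):</p>
        <list list-type="bullet">
          <list-item><p>Epals (German and Portuguese)</p></list-item>
          <list-item><p>Prompt (Spanish)</p></list-item>
          <list-item><p>Reverso (French)</p></list-item>
          <list-item><p>Systran (Dutch and Italian)</p></list-item>
        </list>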
        <p>Examples of these heuristics are using the translation produced by the default translator,
combining the translations of every translator, or combining the words with the
highest scores (a word scores two points if it appears in the default translation and one point if it appears
in all of the other translations).</p>
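        <p>The scoring heuristic described above can be sketched as follows. This is a minimal illustration, not the module actually used: the function names, the whitespace tokenisation and the min_score threshold are assumptions made for the example.</p>
        <preformat>
```python
from collections import Counter


def score_words(default_translation, other_translations):
    """Score candidate words: +2 if a word occurs in the default
    translation, +1 if it occurs in every one of the other translations."""
    default_words = set(default_translation.lower().split())
    other_sets = [set(t.lower().split()) for t in other_translations]
    candidates = default_words.union(*other_sets)
    scores = Counter()
    for word in candidates:
        if word in default_words:
            scores[word] += 2
        if other_sets and all(word in s for s in other_sets):
            scores[word] += 1
    return scores


def combine_translations(default_translation, other_translations, min_score=2):
    """Keep the words whose score reaches min_score (hypothetical threshold)."""
    scores = score_words(default_translation, other_translations)
    return sorted(w for w, s in scores.items() if s >= min_score)
```
        </preformat>
        <p>With min_score set to 3, only the words on which the default translator and all other translators agree are kept; with the assumed default of 2, every word of the default translation is kept.</p>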
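        <sec id="sec-3-1-1">
          <table-wrap>
            <table>
              <thead>
                <tr><th>Weight</th><th>MAP</th><th>Rank</th></tr>
              </thead>
              <tbody>
                <tr><td>Okapi</td><td>0.1602</td><td>4/8</td></tr>
                <tr><td>Okapi</td><td>0.1359</td><td>7/8</td></tr>
                <tr><td>Tfidf</td><td>0.1489</td><td>5/8</td></tr>
                <tr><td>Tfidf</td><td>0.1369</td><td>6/8</td></tr>
              </tbody>
            </table>
          </table-wrap>
        </sec>
        <sec id="sec-3-1-2">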
          <p>The dataset is a new collection: IAPR. The IAPR TC-12 image collection consists of 20,000
images taken from locations around the world and comprising a varying cross-section of still natural
images. It includes pictures of a range of sports, actions, photographs of people, animals, cities,
landscapes and many other aspects of contemporary life.</p>
          <p>The collection was preprocessed by removing stopwords and applying Porter's stemmer.</p>
          <p>The collection was indexed using the LEMUR IR system, a toolkit that supports
indexing of large-scale text databases, the construction of simple language models for documents,
queries, or subcollections, and the implementation of retrieval systems based on language models
as well as a variety of other retrieval models. The toolkit is being developed as part of the
Lemur Project, a collaboration between the Computer Science Department at the University of
Massachusetts and the School of Computer Science at Carnegie Mellon University.</p>
          <p>One parameter of each experiment is the weighting function, such as Okapi or TFIDF. Another
is whether PRF (pseudo-relevance feedback) is used.</p>
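          <p>To make the difference between the two weighting functions concrete, the sketch below gives the textbook forms of a TF-IDF weight and an Okapi BM25 weight for one term in one document. It is not LEMUR's implementation, whose exact variants may differ; the function names and the defaults k1 = 1.2 and b = 0.75 are assumptions (commonly used values, not taken from the experiments).</p>
          <preformat>
```python
import math


def tfidf_weight(tf, df, n_docs):
    """Textbook TF-IDF: term frequency times log inverse document frequency."""
    return tf * math.log(n_docs / df)


def okapi_weight(tf, df, n_docs, doc_len, avg_doc_len, k1=1.2, b=0.75):
    """Textbook Okapi BM25 weight of one term in one document.

    k1 and b are commonly used defaults, not values from the paper.
    """
    idf = math.log((n_docs - df + 0.5) / (df + 0.5))
    length_norm = 1 - b + b * doc_len / avg_doc_len
    return idf * tf * (k1 + 1) / (tf + k1 * length_norm)
```
          </preformat>
          <p>The TF-IDF weight grows linearly with the term frequency, whereas the Okapi weight saturates: however often the term occurs, it never exceeds idf times (k1 + 1).</p>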
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>Results and Discussion</title>
        <p>All the results were obtained using the title and narrative text as the query, when possible. In the
English monolingual task and in the German-English bilingual task we combined the use or
not of pseudo-relevance feedback with the weighting function (Okapi or Tfidf).</p>
        <p>Table 1 shows the English monolingual results. The results show that
pseudo-relevance feedback is very important when Okapi is used as the weighting function. The results
with Tfidf and with Okapi without PRF are very poor.</p>
        <p>Table 2 shows a summary of the experiments submitted and the results obtained for the German-English
bilingual runs. In this case we combined the same parameters as in the monolingual task.</p>
        <p>The results show a loss of MAP of around 28% between the best monolingual experiment
and the best bilingual one. Even so, the other results in the English monolingual task
are considerably worse than the German bilingual ones.</p>
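        <p>As a rough check of this figure, taking 0.2234 as the best English monolingual MAP and assuming that 0.1602, the highest MAP listed with the weighting functions, corresponds to the best German-English run, the relative loss is:</p>
        <disp-formula>
          <tex-math><![CDATA[\frac{0.2234 - 0.1602}{0.2234} = \frac{0.0632}{0.2234} \approx 0.283]]></tex-math>
        </disp-formula>
        <p>that is, about 28%.</p>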
        <p>Finally, Table 3 shows a summary of the experiments submitted and the results obtained for the other
five bilingual runs.</p>
        <p>The results show that, in general, there is a loss of precision compared to the English
monolingual results. The Spanish result is around 17% worse. The other languages show
larger losses.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>The Medical Task</title>
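      <p>The main goal of the medical ImageCLEF task is to improve the retrieval of medical images from
heterogeneous and multilingual document collections containing images as well as text. Queries
are formulated with sample images and a short textual description explaining the research goal.
For the medical task, we used the lists of images retrieved by GIFT, which were supplied by
the organizers of this track.</p>
      <sec id="sec-4-1">
        <table-wrap>
          <label>Table 3</label>
          <caption>
            <p>Experiments submitted for the other five bilingual runs</p>
          </caption>
          <table>
            <thead>
              <tr><th>Language</th><th>Experiment</th><th>Initial Query</th><th>Expansion</th></tr>
            </thead>
            <tbody>
              <tr><td>Dutch</td><td>sinaiNlEnFbOkapiExp1</td><td>title + narr</td><td>with</td></tr>
              <tr><td>French</td><td>sinaiFrEnFbOkapiExp1</td><td>title + narr</td><td>with</td></tr>
              <tr><td>Italian</td><td>sinaiItEnFbOkapiExp1</td><td>title + narr</td><td>with</td></tr>
              <tr><td>Portuguese</td><td>sinaiPtEnFbOkapiExp1</td><td>title + narr</td><td>with</td></tr>
              <tr><td>Spanish</td><td>sinaiEsEnFbOkapiExp1</td><td>title + narr</td><td>with</td></tr>
            </tbody>
          </table>
        </table-wrap>
      </sec>
      <sec id="sec-4-2">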
        <p>
          Last year, our efforts concentrated on manipulating the text descriptions associated with these
images and mixing the partial result lists with the GIFT lists [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. This year, however, our
experiments focus on preprocessing the collection using Information Gain (IG) in order to improve the
quality of the results and to automate the tag selection process.
        </p>
        <sec id="sec-4-2-1">
          <title>Preprocessing the Collection</title>
          <p>In order to generate the textual collection we used the ImageCLEFmed.xml file, which links
collections with their images and annotations. It has external links to the images and the associated
annotations in XML files. It contains relative paths, from the root directory, to all the related
files.</p>
          <p>The entire collection consists of 4 datasets (CASImage, Pathopic, Peir and MIR) containing
about 50,000 images. Each subcollection is organized into cases that represent a group of related
images and annotations. Every case consists of a group of images and an optional annotation.
Each image is part of a case and has optional associated annotations, which enclose metadata
and/or a textual annotation. All of the images and annotations are stored in separate files.
ImageCLEFmed.xml only contains the connections between collections, cases, images, and annotations.</p>
          <p>The collection annotations are in XML format. The majority of the annotations are in English,
but a significant number are in French (in the CASImage collection) or German (in the
Pathopic collection), and a few cases do not contain any annotation at all. The quality of the texts
varies across collections and even within the same collection.</p>
          <p>For the MIR subset, specifically designed regular expressions were applied in order to extract
the different segments of information, due to the lack of predefined XML tags. In this way, information
such as the identifier string, authors, date and so on was extracted from the corpus.</p>
          <p>We generate one textual document per image, where the document identifier is the
name of the image and the document text is the XML annotation associated with that image. If
a case contained several images, the text was copied once for each of them.</p>
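          <p>The document generation step can be sketched as follows. This is only an illustration under assumed data structures: the function name and the representation of a case as a pair of annotation text and image names are hypothetical, not the format of ImageCLEFmed.xml.</p>
          <preformat>
```python
def build_documents(cases):
    """cases: iterable of (annotation_text, image_names) pairs.

    Returns a dict mapping each image name (used as the document id) to
    the annotation text of its case, so the text is repeated per image.
    """
    documents = {}
    for annotation_text, image_names in cases:
        for image_name in image_names:
            documents[image_name] = annotation_text
    return documents
```
          </preformat>
          <p>A case with two images therefore yields two documents with identical text, which is the duplication described above.</p>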
          <p>We used English for the document collection as well as for the queries. Thus, the
French annotations in the CASImage collection were translated into English and then
incorporated into the collection. The Pathopic collection has annotations in both English and German.
We used only the English annotations to generate the Pathopic documents, discarding the German
annotations.</p>
        </sec>
        <sec id="sec-4-2-2">
          <title>Information Gain and Tag Selection</title>
          <p>Last year, almost all tags were used to generate the final corpus. Only those tags that seemed
not to provide any information, such as the LANGUAGE tag, were removed. This year the tags
were selected according to the amount of information they theoretically supply. For this, we
used the information gain measure as a method to select the best tags in the collection.</p>
          <p>The main goal was to determine whether a corpus in which the tags with low IG have been
discarded yields higher performance. The aim is
to eliminate those tags that do not provide further information or that introduce noise and therefore
degrade the results.</p>
          <p>First, experiments with only 10%, 20%, 30%, ..., 100% of the tags with the highest
associated IG were performed, using the 2005 data for evaluation. Once the results were analyzed, the
best results were obtained with 20%, 30% and 40% of the available tags, so these
are the collections used in the experiments submitted to the 2006 campaign.</p>
          <p>The method consists of computing the information gain of every tag in every
subcollection. Since each subcollection (CASImage, Pathopic, Peir and MIR) has a different set of
tags, the information gain was calculated using each subcollection as scope, in isolation from
the others. Let C be the set of cases and E the set of values of the tag E; the formula applied is
as follows:</p>
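          <p>
            <disp-formula>
              <label>(1)</label>
              <tex-math><![CDATA[IG(C \mid E) = H(C) - H(C \mid E)]]></tex-math>
            </disp-formula>
          </p>
          <p>where IG(C|E) is the information gain for the E tag, H(C) is the entropy and H(C|E) is the
conditional entropy. In order to calculate this value, we compute the entropy of the set of cases C as:</p>
          <p>
            <disp-formula>
              <label>(2)</label>
              <tex-math><![CDATA[H(C) = -\sum_{i=1}^{|C|} p(c_i) \log_2 p(c_i) = -\sum_{i=1}^{|C|} \frac{1}{|C|} \log_2 \frac{1}{|C|} = -\log_2 \frac{1}{|C|}]]></tex-math>
            </disp-formula>
          </p>
          <p>And the entropy of the set of cases C conditioned by the tag E would be:</p>
          <p>
            <disp-formula>
              <label>(3)</label>
              <tex-math><![CDATA[H(C \mid E) = \sum_{j=1}^{|E|} \frac{|C_{e_j}|}{|C|} \left( -\sum_{i=1}^{|C_{e_j}|} \frac{1}{|C_{e_j}|} \log_2 \frac{1}{|C_{e_j}|} \right) = -\sum_{j=1}^{|E|} \frac{|C_{e_j}|}{|C|} \log_2 \frac{1}{|C_{e_j}|}]]></tex-math>
            </disp-formula>
          </p>
          <p>where <inline-formula><tex-math>C_{e_j}</tex-math></inline-formula> is the subset of cases in C having the tag E set to the value
<inline-formula><tex-math>e_j</tex-math></inline-formula> (this value is a combination of words where order does not matter).</p>
          <p>Therefore, we can conclude the final equation for the computation of the information gain
supplied by a given tag E over the set of cases C as follows:</p>
          <p>
            <disp-formula>
              <label>(4)</label>
              <tex-math><![CDATA[IG(C \mid E) = -\log_2 \frac{1}{|C|} + \sum_{j=1}^{|E|} \frac{|C_{e_j}|}{|C|} \log_2 \frac{1}{|C_{e_j}|}]]></tex-math>
            </disp-formula>
          </p>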
          <p>For every tag in every collection, its information gain is computed. The tags selected to
compose the final collection are those with the highest IG values. Once the document collection was
generated, experiments were conducted with the LEMUR information retrieval system, applying
the KL-divergence weighting scheme.</p>
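          <p>The computation and selection steps can be sketched as below. This is a minimal sketch under stated assumptions, not the code used in the experiments: it assumes one tag value per case in the subcollection, treats a value as a sorted bag of lower-cased words, and rounds the number of tags kept to the nearest integer. The function names and the tag names in the usage note, apart from LANGUAGE, are hypothetical.</p>
          <preformat>
```python
import math
from collections import Counter


def normalise(value):
    """Treat a tag value as a combination of words, ignoring word order."""
    return tuple(sorted(value.lower().split()))


def information_gain(values):
    """values: one tag value per case of the subcollection (|C| entries).

    Computes IG(C|E) = H(C) - H(C|E), with every case equally likely.
    """
    n_cases = len(values)
    sizes = Counter(normalise(v) for v in values).values()
    h_c = -math.log2(1 / n_cases)
    h_c_given_e = -sum((s / n_cases) * math.log2(1 / s) for s in sizes)
    return h_c - h_c_given_e


def select_tags(ig_by_tag, fraction):
    """Keep the given fraction of tags with the highest information gain."""
    keep = max(1, round(fraction * len(ig_by_tag)))
    ranked = sorted(ig_by_tag, key=ig_by_tag.get, reverse=True)
    return ranked[:keep]
```
          </preformat>
          <p>Under these assumptions, a tag whose value is the same in every case has an information gain of 0, while a tag whose value is distinct in every case reaches the maximum, log2 of the number of cases. For instance, select_tags({"TITLE": 2.0, "LANGUAGE": 0.0, "DIAGNOSIS": 1.5, "DATE": 0.3}, 0.5) keeps TITLE and DIAGNOSIS.</p>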
        </sec>
        <sec id="sec-4-2-3">
          <title>Experiment Description</title>
          <p>Our main goal is to investigate the effectiveness of filtering tags using IG in the text collection.
To this end, we carried out several experiments on the ImageCLEFmed2005 data in order to
determine the best tag percentage.</p>
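          <table-wrap>
            <label>Table 4</label>
            <caption>
              <p>Text-only retrieval runs</p>
            </caption>
            <table>
              <thead>
                <tr><th>Experiment</th></tr>
              </thead>
              <tbody>
                <tr><td>IPAL Textual CDW (best result)</td></tr>
                <tr><td>SinaiOnlytL30</td></tr>
                <tr><td>SinaiOnlytL40</td></tr>
                <tr><td>SinaiOnlytL20</td></tr>
              </tbody>
            </table>
          </table-wrap>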
          <p>First, we carried out experiments with 10%, 20%, ..., 100% of the tags and evaluated
the results with the relevance assessments of the 2005 collection. Based on the results obtained,
we only submitted runs with 20%, 30% and 40% of the tags for the 2006 collection because these
corpora gave the best results. Thus, for each experiment, we submitted 3 runs (one per
corpus generated at 20%, 30% and 40% of all available tags).</p>
          <p>We also wanted to compare the results obtained when we use only the text associated with
the query topic against the results obtained when we merge visual and textual information. For this, a first
experiment was performed as the baseline case. This experiment simply consists of taking the
text associated with each query as a new textual query. Each textual query is then submitted to
the LEMUR system. The resulting list is directly the baseline run.</p>
          <p>The remaining experiments start from the ranked lists provided by the GIFT tool. The
organizers provide a list of relevant images generated by GIFT for each query. For each list/query
we applied automatic textual query expansion using the text associated with the top-ranked
images of the GIFT list. Thus, we added the text associated with the first four images of
the GIFT list to the original textual query in order to generate a new textual query. The
new textual query is then submitted to the LEMUR system and we obtain a new ranked list. Thus,
for each original query we have 2 partial lists: one (expanded) text list and one GIFT list. The
last step consists of merging these partial lists using some strategy in order to obtain
one final list (FL) with the relevant images ranked by relevance. The merging process was done by giving
different weights to the visual list (VL) and the textual list (TL):</p>
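          <p>
            <disp-formula>
              <label>(5)</label>
              <tex-math><![CDATA[FL = VL \cdot \alpha + TL \cdot \beta, \quad \text{with } \alpha + \beta = 1]]></tex-math>
            </disp-formula>
          </p>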
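          <p>The expansion and merging steps can be sketched as follows. This is an illustrative sketch, not the system used for the runs: it assumes that both lists are dictionaries of scores that are already on a comparable scale, that an image missing from one list scores 0 in it, and that ties are broken by image identifier. The function and parameter names are hypothetical.</p>
          <preformat>
```python
def expand_query(query_text, gift_ranking, annotations, top_k=4):
    """Append the text of the top_k GIFT images to the original query."""
    extra = [annotations.get(image, "") for image in gift_ranking[:top_k]]
    return " ".join([query_text] + [text for text in extra if text])


def merge_lists(visual, textual, beta):
    """Fuse two score dictionaries as FL = VL * alpha + TL * beta."""
    alpha = 1.0 - beta
    images = set(visual) | set(textual)
    fused = {
        image: alpha * visual.get(image, 0.0) + beta * textual.get(image, 0.0)
        for image in images
    }
    # Highest fused score first; ties broken by image id for determinism.
    return sorted(fused, key=lambda image: (-fused[image], image))
```
          </preformat>
          <p>With beta set to 0 the final list reproduces the visual ranking, and with beta set to 1 it reproduces the textual one; intermediate values trade one off against the other.</p>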
          <p>
            In order to set these parameters we again ran experiments with the 2005
collection, varying α and β in the range [0, 1] with step 0.1 (i.e., 0, 0.1, 0.2, ..., 0.9 and 1). After
analyzing the results, we submitted runs with β set to 0.5, 0.6 and 0.7 for the 2006 collection.
          </p>
          <p>These 3 experiments and the baseline experiment (which only uses the textual information of the
query) were run over the 3 different corpora generated with 20%, 30% and 40%
of the tags. All textual experiments were carried out with LEMUR using pseudo-relevance
feedback and the KL-divergence weighting scheme, as pointed out previously. In summary, we
submitted 12 runs.</p>
        </sec>
        <sec id="sec-4-2-4">
          <title>Results</title>
          <p>In total, 31 text-only runs and 37 mixed retrieval runs were submitted to
ImageCLEFmed2006.</p>
          <p>Table 4 shows the results for text-only retrieval with the SINAI system. Unfortunately, due to
a processing mistake, all our mixed runs obtained the same result as the visual GIFT
baseline (0.0467). At the time of writing, we are modifying our system in order
to solve this problem.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusions and Further Work</title>
      <p>In this paper, we have presented the experiments carried out in our participation in the ImageCLEF
campaign.</p>
      <p>For the ad hoc task, we tried a new Machine Translation module. The application of some
heuristics improves the bilingual results, but the queries with the poorest
results need to be studied in order to improve them. Our future work will focus on improving the results of the IR
phase by applying new techniques for query expansion (using thesauri or web information) and on
investigating other heuristics for the Machine Translation module.</p>
      <p>For the medical task, we tried to apply Information Gain in order to improve the results.
Unfortunately, the performance obtained was very poor. In addition, our system had a
processing mistake in the mixed runs, so the results obtained are not conclusive. However, we consider
Information Gain a good idea and a widely used method for filtering information without the
need for manual tag selection. Thus, our next step will focus on improving the visual lists and
the merging process.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgements</title>
      <p>This work has been partially supported by a grant from the Spanish Government, project R2D2
(TIC2003-07158-C04-04).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1] Paul Clough, Michael Grubinger, Thomas Deselaers, Allan Hanbury, Henning Müller:
          <article-title>Overview of the ImageCLEF 2006 photo retrieval and object annotation tasks</article-title>
          . In
          <source>Proceedings of the Cross Language Evaluation Forum (CLEF 2006)</source>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2] Henning Müller, Thomas Deselaers, Thomas Lehmann, Paul Clough, William Hersh:
          <article-title>Overview of the ImageCLEFmed 2006 medical retrieval and annotation tasks</article-title>
          . In
          <source>Proceedings of the Cross Language Evaluation Forum (CLEF 2006)</source>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3] Martín-Valdivia, M.T., García-Cumbreras, M.A., Díaz-Galiano, M.C., Ureña-López, L.A., Montejo-Raez, A.:
          <article-title>SINAI at ImageCLEF 2005</article-title>
          . In
          <source>Proceedings of the Cross Language Evaluation Forum (CLEF 2005)</source>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>