<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Overview of the Wikipedia Image Retrieval Task at ImageCLEF 2011</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Theodora Tsikrika</string-name>
          <email>theodora.tsikrika@acm.org</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Adrian Popescu</string-name>
          <email>adrian.popescu@cea.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jana Kludas</string-name>
          <email>jana.kludas@unige.ch</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>CEA, LIST, Vision &amp; Content Engineering Laboratory</institution>
          ,
          <addr-line>92263 Fontenay aux Roses</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>CUI, University of Geneva</institution>
          ,
          <country country="CH">Switzerland</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Applied Sciences Western Switzerland (HES-SO)</institution>
          ,
          <addr-line>Sierre</addr-line>
          ,
          <country country="CH">Switzerland</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>ImageCLEF's Wikipedia Image Retrieval task provides a testbed for the system-oriented evaluation of multimedia and multilingual information retrieval from a collection of Wikipedia images. The aim is to investigate retrieval approaches in the context of a large and heterogeneous collection of images (similar to those encountered on the Web) that are searched for by users with diverse information needs. This paper presents an overview of the resources, topics, and assessments of the Wikipedia Image Retrieval task at ImageCLEF 2011, summarizes the retrieval approaches employed by the participating groups, and provides an analysis of the main evaluation results.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>The Wikipedia Image Retrieval task is an ad-hoc image retrieval task. The
evaluation scenario is similar to the classic TREC ad-hoc retrieval task: it
simulates the situation in which a system knows the set of documents to be
searched, but cannot anticipate the particular topic that will be investigated
(i.e., topics are not known to the system in advance). Given a multimedia query
that consists of a title in three different languages and a few example images
describing a user’s information need, the aim is to find as many relevant images
as possible from a collection of Wikipedia images. Similarly to past years,
participants are encouraged to develop approaches that combine the relevance of
different media types and of multilingual textual resources into a single ranked
list of results. A number of resources that support participants in this
research direction were provided this year.</p>
      <p>The paper is organized as follows. First, we introduce the task’s resources:
the Wikipedia image collection and additional resources, the topics, and the
assessments (Sections 2–4). Section 5 presents the approaches employed by the
participating groups and Section 6 summarizes their main results. Section 7
concludes the paper.</p>
    </sec>
    <sec id="sec-2">
      <title>2 Task resources</title>
      <p>
        The ImageCLEF 2010 Wikipedia collection was used for the second time in
2011. It consists of 237,434 Wikipedia images, their user-provided annotations,
the Wikipedia articles that contain these images, and low-level visual features
of these images. The collection was built to cover similar topics in English,
German, and French and it is based on the September 2009 Wikipedia dumps.
Images are annotated in one or several languages, or not at all, and, wherever possible,
the annotation language is given in the metadata file. The articles in which these
images appear were extracted from the Wikipedia dumps and are provided as
such. The collection is described in more detail in [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] and an example image
with its associated metadata is given in Figure 1. A first set of image features
was extracted using MM, CEA LIST’s image indexing tool [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], and includes both
local features (bags of visual words) and global ones (texture, color, and edges). An
alternative set of global features, extracted with the MMRetrieval tool [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], was
kindly provided by the Information Retrieval group at the Democritus
University of Thrace, Greece (DUTH group).
      </p>
      <p>The topics are descriptions of multimedia information needs that contain
textual and visual hints.</p>
      <sec id="sec-2-1">
        <title>3.1 Topic Format</title>
        <p>These multimedia queries consist of a multilingual textual part, the query title,
and a visual part made of several example images. The narrative of the query
is only used during the assessment phase.</p>
        <p>&lt;title&gt; query by keywords, one per language: English, French, German
&lt;image&gt; query by image content (four or five example images)
&lt;narrative&gt; description of the query in which an unambiguous definition of
relevance and irrelevance is given</p>
        <p>&lt;title&gt; The topic &lt;title xml:lang=“en”&gt; has a language attribute that marks
the English (en), French (fr) and German (de) topic title. It simulates a user who
does not have (or does not want to use) example images or other visual constraints. The
query expressed in the topic &lt;title&gt; is therefore a text-only query. This profile
is likely to fit most users searching digital libraries or the Web.</p>
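          <p>As an illustration, the topic structure described above can be sketched
programmatically. The element and attribute names follow the description in this
section; the example topic and its values are hypothetical, and this sketch is not
the official topic file format.</p>

```python
# Build an illustrative topic element (hypothetical values).
import xml.etree.ElementTree as ET

topic = ET.Element("topic")
for lang, text in [("en", "brown bear"), ("fr", "ours brun"), ("de", "Braunbär")]:
    title = ET.SubElement(topic, "title")
    title.set("xml:lang", lang)   # language attribute, as described above
    title.text = text
for name in ["bear1.jpg", "bear2.jpg", "bear3.jpg", "bear4.jpg"]:
    ET.SubElement(topic, "image").text = name   # four or five example images
ET.SubElement(topic, "narrative").text = (
    "Relevant images show a brown bear; other bear species are not relevant.")

# A text-only system would use just the title in its query language:
en_title = [t.text for t in topic.findall("title") if t.get("xml:lang") == "en"]
print(en_title[0])  # brown bear
```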
        <p>Upon discovering that a text-only query does not produce many relevant hits,
a user might decide to add visual hints and formulate a multimedia query.
&lt;image&gt; The visual hints are example images, which express the narrative of
the topic.
&lt;narrative&gt; A clear and precise description of the information need is
required in order to unambiguously determine whether or not a given document
fulfils the given information need. In a test collection this description is known
as the narrative. It is the only true and accurate interpretation of a user’s needs.
Precise recording of the narrative is important for scientific repeatability: there
must exist, somewhere, a definitive description of what is and is not relevant to
the user.</p>
        <p>Textual terms and visual examples can be used in any combination in order to
produce results. It is up to the systems how to use, combine or ignore this
information; the relevance of a result does not directly depend on these constraints,
but is decided by manual assessment based on the &lt;narrative&gt;.</p>
      </sec>
      <sec id="sec-2-2">
        <title>3.2 Topic Development</title>
        <p>The 50 topics in the ImageCLEF 2011 Wikipedia Image Retrieval task (see
Table 1), created by the organizers of the task, aim to cover diverse information
needs and to have a variable degree of difficulty. They were chosen after a
statistical analysis of a large-scale image query log, kindly provided by Exalead, so
as to cover a wide variety of topics commonly searched on the Web. Candidate
topics were run through the Cross Modal Search Engine (CMSE, developed by
the University of Geneva; http://dolphin.unige.ch/cmse/) in order to get an
indication of the number of relevant images in the top results for visual,
textual, and multimodal candidate queries.</p>
        <p>
          The topics range from simple, and thus relatively easy (e.g., “brown bear”),
to semantic, and hence highly difficult (e.g., “model train scenery”), with a
balanced distribution of the two types of topics. One difference with 2010 [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] is the
higher number of topics with named entities (in particular, well-known person
names and products) proposed this year. This change is motivated by the
results of the log analysis, which confirmed that named entities appear frequently
in Web queries. Semantic topics typically have a complex set of constraints, require
world knowledge, and/or contain ambiguous terms, so they are expected to
be challenging for current state-of-the-art retrieval algorithms. We encouraged
the participants to use multimodal and multilingual approaches since these are
more appropriate for dealing with semantic information needs.
        </p>
        <p>
          Image examples were selected from Flickr, after ensuring that they had been
uploaded under Creative Commons licenses. Each topic has four or five image
examples, chosen so as to illustrate, to the extent possible, the visual diversity
of the topic. Compared to 2010 [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ], a larger number of images was provided
per topic in order to improve the visual characterization of the topics and
thus to encourage multimodal approaches. Query image examples and their
low-level features were also provided with the collection in order to ensure
repeatability of the experiments. On average, the 50 topics contain 4.84 images
and 3.1 words in their English formulation.
        </p>
        <sec id="sec-2-2-1">
          <title>4 Assessments</title>
          <p>The Wikipedia Image Retrieval task is an image retrieval task, where an image
is either relevant or not (binary relevance). We adopted TREC-style pooling of
the retrieved images with a pool depth of 100, resulting in pool sizes of between
764 and 2327 images, with a mean of 1467 and a median of 1440.</p>
          <p>Fig. 2: Instructions to workers for performing the relevance assessments.</p>
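          <p>A minimal sketch of the TREC-style pooling described above (the run data
below are hypothetical):</p>

```python
# Union of the top-`depth` images from every submitted run (TREC-style pool).
def build_pool(runs, depth=100):
    pool = set()
    for ranked_images in runs:
        pool.update(ranked_images[:depth])
    return pool

# Two hypothetical runs with overlapping rankings.
run_a = ["img_%d" % i for i in range(1, 151)]     # img_1 .. img_150
run_b = ["img_%d" % i for i in range(50, 201)]    # img_50 .. img_200
pool = build_pool([run_a, run_b], depth=100)
print(len(pool))  # 149: img_1..img_100 from run A plus img_101..img_149 from run B
```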
          <p>The relevance assessments were performed with a crowdsourcing approach
using CrowdFlower (http://crowdflower.com/), a general-purpose
platform for managing crowdsourcing tasks and ensuring high-quality responses.
CrowdFlower enables the processing of large amounts of data in a short period
of time by breaking a repetitive “job” into many “assignments”, each consisting
of a small number of “units”, and distributing them to many “workers”
simultaneously. In our case, a job corresponded to performing the relevance
assessments of the pooled images for a single topic, each unit was the image to be
assessed, whereas each assignment consisted of assessing the relevance for a set
of five units (i.e., images) for a single topic. The assessments were carried out
by Amazon Mechanical Turk (http://www.mturk.com) workers based in the
UK and the USA, and each assignment was rewarded with $0.04.</p>
          <p>For each assignment, each worker was provided with instructions in
English for the English version of the topic, as shown in Figure 2 for topic 76,
followed by five units to be assessed for that topic, each similar to the one
shown in Figure 3. To prevent spammers and thus obtain accurate results, each
assignment contained one “gold standard” image among the five images, i.e.,
an image already correctly labelled by the organizers. These gold standard data
were used for estimating the workers’ accuracy: if a worker’s accuracy dropped
below a threshold (70%), their assessments were excluded.</p>
          <p>Fig. 3: An image to be assessed.</p>
          <p>For each topic, the gold standard data were created as follows. First, the
images in the pool of depth 5 for that topic that could be unambiguously
marked as relevant or non-relevant were assessed. This subset was selected
so as to ensure that at least some relevant images were included in the gold
standard set. If at the end of this round of assessment, the gold standard set
contained less than 6% of the total images to be assessed for that topic, then
further images from the original pool of depth 100 were randomly selected and
assessed until the 6% limit was reached.</p>
          <p>Each image was assessed by three workers, with the final assessment
obtained through a majority vote.</p>
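          <p>The quality-control scheme described above (gold-standard accuracy filtering
followed by a majority vote over three judgments) can be sketched as follows. This
is an illustrative reconstruction with invented worker data, not the actual
CrowdFlower pipeline.</p>

```python
from collections import Counter

def filter_workers(gold, judgments, threshold=0.7):
    """Keep only workers whose accuracy on gold-standard images is high enough."""
    kept = {}
    for worker, labels in judgments.items():
        scored = [(img, lab) for img, lab in labels.items() if img in gold]
        correct = sum(1 for img, lab in scored if gold[img] == lab)
        if not scored or correct / len(scored) >= threshold:
            kept[worker] = labels
    return kept

def majority_vote(judgments, image):
    votes = Counter(labels[image] for labels in judgments.values()
                    if image in labels)
    return votes.most_common(1)[0][0]

gold = {"g1": "relevant"}  # image pre-labelled by the organizers
judgments = {
    "w1": {"g1": "relevant", "img7": "relevant"},
    "w2": {"g1": "relevant", "img7": "relevant"},
    "w3": {"g1": "relevant", "img7": "non-relevant"},
    "w4": {"g1": "non-relevant", "img7": "non-relevant"},  # fails the gold check
}
kept = filter_workers(gold, judgments)
print(sorted(kept))                  # ['w1', 'w2', 'w3']
print(majority_vote(kept, "img7"))   # relevant (2 of 3 votes)
```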
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>5 Participants</title>
      <p>A total of 11 groups submitted 110 runs. Participation was slightly lower
than last year, both in terms of the number of participants (11 vs. 13) and of
submitted runs (110 vs. 127). Nine of the 11 participating groups are located in
Europe; one comes from Turkey and one from Tunisia.</p>
      <p>Table 2 gives an overview of the types of the submitted runs. Similarly to last
year, more multimodal (text/visual) than text-only runs were submitted.
Table 3 presents the combinations of annotation and topic languages used by
participants in their textual and multimodal runs. The majority of submitted runs
are multilingual in at least one of the two aspects. Most teams used both
multilingual queries and multilingual annotations in order to maximize retrieval
performance and the best results presented in the next section (see Tables 4 and
5) validate this approach. Although runs involving English-only queries
are far more frequent than runs involving only German or French, some
participants also submitted the latter type of runs. A short description of the
participants’ approaches follows.</p>
      <p>
        CEA LIST (9 runs - 5 single + 4 CEA-XRCE) [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] Their approach is mainly based
on query expansion with Wikipedia. Given a topic, related concepts are
retrieved from Wikipedia and used to expand the initial query. Then results
are re-ranked using query models extracted from Flickr. They also used
visual concepts (face/no face; indoor/outdoor) to characterize topics in terms
of presence of these concepts in the image examples and to re-rank the
results accordingly. Some of the runs submitted by CEA LIST (noted
CEAXRCE) were created using a late fusion of results with visual results
produced by XRCE.
      </p>
      <p>
        DBISForMaT (12 runs) [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] They introduced a retrieval model based on the
polyrepresentation of documents which assumes that different modalities
of a document can be combined in a structured manner to reflect a user’s
information need. Global image features were extracted using LIRE, a CBIR
engine built on top of LUCENE. As underlined in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], although
promising, their results are hampered by the use of a naive textual representation
of the documents.
      </p>
      <p>
        DEMIR (6 runs) [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] They used the Terrier IR platform to test a large number of
classical weighting schemes (BM25, TF-IDF, PL2 etc.) over a bag-of-words
representation of the collection for text retrieval. They also performed a
comparison of the visual descriptors provided by DUTH and report that
the best purely visual results are obtained using the CEDD descriptor. Their
multimodal runs are based on a late fusion approach and results show that
merging modalities achieves small improvements compared to the textual
results.
      </p>
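      <p>As an illustration of the classical weighting schemes mentioned above, a
minimal BM25 sketch follows; k1 and b use common defaults, and the example
counts are assumptions (only the collection size of 237,434 images comes from
this paper).</p>

```python
import math

def bm25(tf, df, doc_len, avg_len, n_docs, k1=1.2, b=0.75):
    """BM25 weight of one term in one document (common parameter defaults)."""
    idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
    length_norm = k1 * (1 - b + b * doc_len / avg_len)
    return idf * tf * (k1 + 1) / (tf + length_norm)

# A term occurring twice in an average-length annotation, present in 100
# of the 237,434 documents of the collection.
print(round(bm25(tf=2, df=100, doc_len=300, avg_len=300, n_docs=237434), 2))
```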
      <p>
        DUTH (19 runs) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] The group has further developed its MMRetrieval engine
that they introduced in 2010. It includes a flexible indexing of text and
visual modalities as well as different fusion strategies (score combination and
score normalization). This year, they introduced an estimation of query
difficulty whose combination with score combination gave the best results.
The group also kindly provided a set of low-level features which were used
by a large number of participants.
      </p>
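      <p>A generic late-fusion sketch in the spirit of the score normalization and
combination strategies mentioned above; the min-max normalization, the weights,
and the scores are illustrative assumptions, not the group's exact method.</p>

```python
def minmax(scores):
    """Min-max normalize a dict of document scores into the unit interval."""
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0
    return {doc: (s - lo) / span for doc, s in scores.items()}

def fuse(text_scores, visual_scores, w_text=0.7):
    """Linear late fusion: normalize each modality, then combine the scores."""
    t, v = minmax(text_scores), minmax(visual_scores)
    docs = set(t) | set(v)
    combined = {d: w_text * t.get(d, 0.0) + (1 - w_text) * v.get(d, 0.0)
                for d in docs}
    return sorted(docs, key=combined.get, reverse=True)

text = {"img1": 12.0, "img2": 7.5, "img3": 3.0}     # e.g., text-retrieval scores
visual = {"img2": 0.9, "img3": 0.8, "img4": 0.2}    # e.g., CBIR similarities
print(fuse(text, visual))  # ['img1', 'img2', 'img3', 'img4']
```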
      <p>
        ReDCAD (4 runs) [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] They focused on text retrieval and tested the use of the
metadata related to the images as well as of the larger textual context of the
images. LUCENE was used for both indexing and retrieving documents.
Taking into account the textual context of the image is more effective than
the use of the metadata only and a combination of the two provides a small
additional improvement of results.
      </p>
      <p>
        SINAI (6 runs) [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] The group submitted only textual runs and focused on an
automatic translation of image descriptions from French and German to
English. All their runs work with English queries only. Different linear
combinations of image captions and descriptions were tested and they also
combined results from Lemur and LUCENE retrieval engines. The combination
of the two achieved the best results.
      </p>
      <p>
        SZTAKI (10 runs) [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] The team used a retrieval system based on Okapi BM25
and also added synonyms from WordNet to expand the initial queries.
Light Fisher vectors were used to represent low-level image features and
then used to re-rank the top results obtained with purely textual retrieval.
This late fusion procedure resulted in a slight degradation of performance
compared to the textual run.
      </p>
      <p>
        UAIC (6 runs) [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] For textual retrieval, they used the standard LUCENE search
engine library and expanded some of the queries using WordNet synonyms.
The visual search was performed using the Color and Edge Directionality
Descriptor (CEDD) provided by the DUTH team. A linear combination of
text and image results was performed which gave the best result.
      </p>
      <p>
        UNED (20 runs) [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] They performed textual retrieval with a combination of
IDRA, their in-house retrieval tool, and LUCENE and experimented with
different settings (such as named entity recognition or use of Wikipedia
articles). For multilingual textual runs, UNED tested early and late fusion
strategies and the results show that the latter approach gives better results.
Content based retrieval based on the CEDD features provided by DUTH
was applied to the textual results. UNED tested both early and late fusion
approaches to obtain merged runs. Their fusion approaches were effective
and the best results were obtained with a logistic regression feedback
algorithm.
      </p>
      <p>
        UNTESU (7 runs) [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] They applied Salient Semantic Analysis in order to
expand queries with semantically similar terms from Wikipedia. A
picturability measure was defined in order to boost the weight of terms that
are associated with the initial topic in Flickr annotations. French and German
annotations in the collection were translated to English and only English
topics were used for the experiments. The best results were obtained with a
combination of terms from the initial query and of expanded terms found
using Lavrenko’s relevance model.
      </p>
      <p>
        XRCE (11 runs - 4 single + 7 XRCE-CEA) [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] For text retrieval, they implemented
an information-based model and a lexical entailment IR model. Image
content was described using spatial pyramids of Fisher Vectors and local RGB
statistics. Late Semantic Combination (LSC) was exploited to combine
results from text and image modalities. They showed that, although text
retrieval largely outperforms pure visual retrieval, an appropriate
combination of the two modalities results in a significant improvement over each
modality considered independently. Some of the runs submitted by XRCE
(noted XRCE-CEA) were created using an LSC approach that combined
their text and visual runs as well as textual runs produced by CEA LIST.
The complete list of results can be found on the ImageCLEF website (http://www.imageclef.org/2011/wikimm-results).
      </p>
      <sec id="sec-3-1">
        <title>6 Results</title>
        <sec id="sec-3-1-1">
          <title>6.1 Performance per modality for all topics</title>
          <p>Here, we analyze the evaluation results using only the top 90% of the runs to
exclude noisy and buggy results. Table 6 shows the average performance and
standard deviation with respect to each modality. On average, the multimodal
runs have better performance than textual ones with respect to all examined
evaluation metrics (MAP, Precision at 20, and precision after R (= number of
relevant) documents retrieved). This is in contrast with the results reported in
previous years, when textual runs performed better on average. This shift
can be explained by changes in both the resources and the approaches this
year, i.e., the increased number of visual examples in the queries, improved visual
features, and more appropriate fusion techniques used by the participants.</p>
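          <p>For reference, the three metrics can be sketched over a toy ranking as
follows; this is a minimal illustration, not the official evaluation scripts.</p>

```python
def average_precision(ranking, relevant):
    """AP of one ranked list; MAP is the mean of AP over all topics."""
    hits, total = 0, 0.0
    for rank, img in enumerate(ranking, start=1):
        if img in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def precision_at(ranking, relevant, k):
    return sum(1 for img in ranking[:k] if img in relevant) / k

def r_precision(ranking, relevant):
    # precision after R (= number of relevant) documents retrieved
    return precision_at(ranking, relevant, len(relevant))

ranking = ["a", "b", "c", "d"]
relevant = {"a", "c"}
print(round(average_precision(ranking, relevant), 3))  # 0.833 = (1/1 + 2/3) / 2
print(precision_at(ranking, relevant, 4))              # 0.5
print(r_precision(ranking, relevant))                  # 0.5
```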
        </sec>
        <sec id="sec-3-1-2">
          <title>6.2 Performance per topic and per modality</title>
          <p>To analyze the average difficulty of the topics, we classify the topics based on
the AP values per topic averaged over all runs as follows:
easy: MAP &gt; 0.3
medium: 0.2 &lt; MAP &lt;= 0.3
hard: 0.1 &lt; MAP &lt;= 0.2
very hard: MAP &lt; 0.1.</p>
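          <p>A minimal sketch of this classification rule (the MAP values below are
illustrative):</p>

```python
def difficulty(mean_ap):
    """Map a topic's AP averaged over all runs to a difficulty class."""
    if mean_ap > 0.3:
        return "easy"
    if mean_ap > 0.2:
        return "medium"
    if mean_ap > 0.1:
        return "hard"
    return "very hard"

print(difficulty(0.45))  # easy
print(difficulty(0.25))  # medium
print(difficulty(0.15))  # hard
print(difficulty(0.05))  # very hard
```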
          <p>Table 7 presents the top 10 topics per class (i.e., easy, medium, hard, and
very hard), together with the total number of topics per class. Out of 50
topics, 23 fall in the hard or very hard classes. This was actually intended
during the topic development process, because we opted for highly semantic
topics that are challenging for current retrieval approaches. Seven topics were very
hard to solve (MAP &lt; 0.10). The topic set includes only 17 easy topics (such
as “illustrations of Alice’s adventures in Wonderland”, “Sagrada Familia in
Barcelona”, “colored Volkswagen beetles”, “KISS live”). Similarly to last year, a
large number of the topics in the easy and medium classes include a reference
to a named entity and, consequently, are easily retrieved with simple textual
approaches. As for very hard topics, they often contain general terms (“cat”,
“house”, “train” or “bird”), which have a difficult semantic interpretation or
high concept variation and are, hence, very hard to solve.
(Fragment of Table 7: easy topics include “98 illustrations of Alice’s adventures
in Wonderland”, “84 Sagrada Familia in Barcelona”, “88 portrait of S. Royal”, and
“89 Elvis Presley” (17 easy topics in total); medium topics include “108 carnival
in Rio”, “92 air race”, and “120 bar codes” (12 medium topics in total); rows from
the harder classes list “102 black cat”, “110 male color portrait”, and “116 houses
in mountains”.)</p>
          <p>Fig. 4: Average topic performance over all, text-only, and mixed runs.</p>
          <p>We also analyzed the performance of runs that use only text (TXT) versus runs
that use both text and visual resources (MIX). Figure 4 shows the average
performance on each topic for all, text-only and text-visual runs. The multimodal
runs outperform the textual ones in 42 out of the 50 topics and the textual runs
outperform mixed runs in 8 cases. This indicates that most of the topics benefit
from a multimodal approach.</p>
          <p>The “visuality” of topics can be deduced from the performance of the text-only
and text-visual approaches presented in the previous section. If, for a topic, the
text-visual approaches significantly improve the MAP over all runs (i.e., by
diff(MAP) &gt;= 0.01), we consider it to be a visual topic. In the same way, we
define a topic as textual if the text-only approaches significantly improve the
MAP over all runs for that topic. Based on
this analysis, 38 of the topics can be characterized as visual and 7 as textual. The
remaining 5 topics, where no clear improvements are observed, are considered
to be neutral. Compared to 2010, when there were more textual than visual
topics, the distribution of topics in visual vs. textual changed significantly. As
with the aggregate run performances, this change is most probably a result of
the increased number of query images, the improved low-level image indexing
as well as the better fusion techniques proposed this year.</p>
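          <p>The labelling rule above can be sketched as follows, assuming per-topic MAP
values computed over all runs, over text-only runs, and over text-visual runs (the
numbers are illustrative):</p>

```python
def label_topic(map_all, map_text, map_mixed, delta=0.01):
    """Label a topic visual/textual/neutral by which modality improves MAP."""
    if map_mixed - map_all >= delta:
        return "visual"
    if map_text - map_all >= delta:
        return "textual"
    return "neutral"

print(label_topic(0.20, 0.19, 0.24))    # visual
print(label_topic(0.20, 0.23, 0.19))    # textual
print(label_topic(0.20, 0.205, 0.205))  # neutral
```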
          <p>Table 8 presents the topics in each group, as well as some statistics on the
topics, their relevant documents, and their distribution over the classes that
indicate their difficulty. Given that there are only a few textual and neutral topics,
it is difficult to provide a robust analysis of the characteristics of the topics of
each type.</p>
          <p>The number of words per topic is larger for neutral queries than for textual
and visual ones. The average number of relevant documents is significantly
smaller for textual topics compared to the other two classes, whereas the
average MAP is higher for neutral topics.</p>
          <p>The distribution of the textual, visual and neutral topics over the classes
expressing their difficulty shows that the visual and textual topics are more
likely to fall into the hard/very hard class than the neutral ones.</p>
        </sec>
        <sec id="sec-3-1-3">
          <title>6.3 Effect of Query Expansion and Relevance Feedback</title>
          <p>Finally, we analyze the effect of applying query expansion (QE) and
relevance feedback (FB) techniques, as well as their combination (FBQE).
Similarly to the analysis in the previous section, we consider a technique to be
useful for a topic if it significantly improved the MAP over all runs. Table 9
presents the best performing topics for these techniques and some statistics.
Query expansion is useful only for 3 topics and relevance feedback for 10.
Interestingly, a combination of query expansion and of relevance feedback is
effective for a much larger number of topics (33 out of 50). Expansion and feedback
tend to be more useful for topics that are either hard or very hard compared to
easy or medium topics.</p>
          <p>For the second time this year, a multimodal and multilingual approach
performed best in the Wikipedia Image Retrieval task. The majority of runs
focused either on a combination of topic languages or on English-only queries;
only a few runs were submitted for German-only or French-only queries.
Multilingual runs perform clearly better than monolingual ones due to the
distribution of the information over the different languages.</p>
          <p>It is encouraging to see that more than half of the submitted runs were
multimodal and that the best submitted runs were multimodal for eight out of
nine participating groups that submitted such runs. Many of the participants
in the Wikipedia Image Retrieval Task have participated in the past and thus
have been able to improve their multimodal retrieval approaches continuously.
For the first time this year, two of the participating groups cooperated in
testing late fusion of their results, which is an interesting development.</p>
          <p>A further analysis of the results showed that most topics (42 out of 50) were
significantly better solved with multimodal approaches. This is not only due to
the improvement of the fusion approaches mentioned above, but also due to
an increased number of query images compared to previous years and improved
visual features. Finally, we found that expansion and feedback techniques tend
to be more useful for topics that are either hard or very hard compared to easy
or medium topics.
</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Acknowledgements</title>
      <p>Theodora Tsikrika was supported by the EU in the context of Promise (contract
no. 258191) and Chorus+ (contract no. 249008) FP7 projects. Adrian Popescu
was supported by the French ANR (Agence Nationale de la Recherche) via the
Georama project (ANR-08-CORD-009). Jana Kludas was funded by the Swiss
National Fund (SNF).</p>
      <p>The authors would also like to thank the Information Retrieval group at
the Democritus University of Thrace, Greece (DUTH group) for sharing visual
features with all other task participants.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>1. Avi Arampatzis, Konstantinos Zagoris, and Savvas A. Chatzichristofis. DUTH at ImageCLEF 2011 Wikipedia Retrieval. In Petras et al. [10].</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>2. Hatem Awadi, Mouna Torjmen Khemakhem, and Maher Ben Jemaa. Evaluating some contextual factors for image retrieval: ReDCAD participation at ImageCLEFWikipedia 2011. In Petras et al. [10].</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>3. Tolga Berber, Ali Hosseinzadeh Vahid, Okan Ozturkmenoglu, Roghaiyeh Gachpaz Hamed, and Adil Alpkocak. DEMIR at ImageCLEFwiki 2011: Evaluating Different Weighting Schemes in Information Retrieval. In Petras et al. [10].</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>4. Emanuela Boros, Alexandru-Lucian Ginsca, and Adrian Iftene. UAIC's participation at Wikipedia Retrieval @ ImageCLEF 2011. In Petras et al. [10].</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>5. Gabriela Csurka, Stéphane Clinchant, and Adrian Popescu. XRCE and CEA LIST's Participation at Wikipedia Retrieval of ImageCLEF 2011. In Petras et al. [10].</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>6. Bálint Daróczy, Róbert Pethes, and András A. Benczúr. SZTAKI @ ImageCLEF 2011. In Petras et al. [10].</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>Bertrand</given-names>
            <surname>Delezoide</surname>
          </string-name>
          , Hervé Le Borgne, Romaric Besançon, Gaël De Chalendar, Olivier Ferret, Faïza Gara, Patrick Hède, Meriama Laib, Olivier Mesnard, Pierre-Alain Moellic, and
          <string-name>
            <given-names>Nasredine</given-names>
            <surname>Semmar</surname>
          </string-name>
          .
          <article-title>MM: modular architecture for multimedia information retrieval</article-title>
          .
          <source>In Proceedings of the 8th International Workshop on Content-Based Multimedia Indexing (CBMI 2010)</source>
          , pages
          <fpage>136</fpage>
          -
          <lpage>141</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>Miguel Ángel</given-names>
            <surname>García-Cumbreras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Manuel Carlos</given-names>
            <surname>Díaz-Galiano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. Alfonso</given-names>
            <surname>Ureña-López</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Javier</given-names>
            <surname>Arias-Buendía</surname>
          </string-name>
          .
          <article-title>SINAI at ImageCLEF Wikipedia Retrieval task 2011: testing combined systems</article-title>
          . In Petras et al. [
          <volume>10</volume>
          ].
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>Ruben</given-names>
            <surname>Granados</surname>
          </string-name>
          , Joan Benavent, Xaro Benavent, Esther de Ves, and Ana García-Serrano.
          <article-title>Multimodal information approaches for the Wikipedia collection at ImageCLEF 2011</article-title>
          . In Petras et al. [
          <volume>10</volume>
          ].
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>Vivien</given-names>
            <surname>Petras</surname>
          </string-name>
          , Pamela Forner, and Paul Clough, editors.
          <source>CLEF 2011 working notes</source>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>Adrian</given-names>
            <surname>Popescu</surname>
          </string-name>
          , Theodora Tsikrika, and
          <string-name>
            <given-names>Jana</given-names>
            <surname>Kludas</surname>
          </string-name>
          .
          <article-title>Overview of the Wikipedia Retrieval task at ImageCLEF 2010</article-title>
          .
          <source>In Working notes of ImageCLEF 2010</source>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>Miguel E.</given-names>
            <surname>Ruiz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Chee Wee</given-names>
            <surname>Leong</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Samer</given-names>
            <surname>Hassan</surname>
          </string-name>
          .
          <article-title>UNT at ImageCLEF 2011: Relevance Models and Salient Semantic Analysis for Image Retrieval</article-title>
          . In Petras et al. [
          <volume>10</volume>
          ].
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <given-names>Konstantinos</given-names>
            <surname>Zagoris</surname>
          </string-name>
          , Avi Arampatzis, and
          <string-name>
            <given-names>Savvas A.</given-names>
            <surname>Chatzichristofis</surname>
          </string-name>
          .
          <article-title>www.mmretrieval.net: a multimodal search engine</article-title>
          .
          <source>In Proceedings of the Third International Conference on SImilarity Search and APplications</source>
          , SISAP '10
          , pages
          <fpage>117</fpage>
          -
          <lpage>118</lpage>
          , New York, NY, USA,
          <year>2010</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <given-names>David</given-names>
            <surname>Zellhöfer</surname>
          </string-name>
          and
          <string-name>
            <given-names>Thomas</given-names>
            <surname>Böttcher</surname>
          </string-name>
          .
          <article-title>BTU DBIS' Multimodal Wikipedia Retrieval Runs at ImageCLEF 2011</article-title>
          . In Petras et al. [
          <volume>10</volume>
          ].
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>