<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Overview of the ImageCLEF 2018 Caption Prediction Tasks</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Alba G. Seco de Herrera</string-name>
          <email>alba.garcia@essex.ac.uk</email>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Carsten Eickhoff</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vincent Andrearczyk</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Henning Muller</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Brown University</institution>
          ,
          <addr-line>Providence RI</addr-line>
          ,
          <country country="US">United States</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Applied Sciences Western Switzerland (HES-SO)</institution>
          ,
          <addr-line>Sierre</addr-line>
          ,
          <country country="CH">Switzerland</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>University of Essex</institution>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>University of Geneva</institution>
          ,
          <country country="CH">Switzerland</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The caption prediction task ran in 2018 in its second edition, after the task was first run in the same format in 2017. For 2018, the database focused more strongly on clinical images to limit diversity. As automatic methods with limited manual control were used to select the images, considerable diversity remains in the image data set. Participation was relatively stable compared to 2017. The use of external data was restricted in 2018 to limit critical remarks about the use of external resources by some groups in 2017. Results show that this is a difficult task, but that large amounts of training data can make it possible to detect the general topics of an image from the biomedical literature. For an even better comparison, it seems important to filter the concepts for the images that are made available. Very general concepts (such as "medical image") need to be removed, as they are not specific to the images shown, and extremely rare concepts with only one or two examples cannot really be learned. Providing more coherent training data or larger quantities can also help to learn such complex models.</p>
      </abstract>
      <kwd-group>
        <kwd>Caption prediction</kwd>
        <kwd>Image understanding</kwd>
        <kwd>radiology</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The caption task described in this paper is part of the ImageCLEF<sup>5</sup>
benchmarking campaign [1–4], a framework in which researchers can share their
expertise and compare their methods on exactly the same data and evaluation
methodology in an annual cycle. ImageCLEF is part of CLEF<sup>6</sup> (Cross
Language Evaluation Forum). The 2018 campaign in general is described
in Ionescu et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] and the related medical tasks in [
        <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
        ]. In general,
ImageCLEF aims at building tasks that are related to clear information needs in
medical or non-medical environments [
        <xref ref-type="bibr" rid="ref8 ref9">8, 9</xref>
        ]. Relationships also exist with the
LifeCLEF and CLEFeHealth labs [
        <xref ref-type="bibr" rid="ref10 ref11">10, 11</xref>
        ].
<sup>5</sup> http://www.imageclef.org/
<sup>6</sup> http://www.clef-campaign.org/
      </p>
      <p>
        The caption task started in 2016 as a pilot task. In 2016, the task was part of
the medical image classification task [
        <xref ref-type="bibr" rid="ref12 ref13">12, 13</xref>
        ], although it unfortunately did not
have any participants, partly because the questions of the task were not strongly
developed at the time. Since 2017, the caption task has been running in the
current format. The motivation of this task is the strong increase in available
images from the biomedical literature, which is growing at an exponential rate and
is made available via the PubMed Central® (PMC)<sup>7</sup> repository. As the data set
is dominated by compound figures and many general graphs, ImageCLEF has
addressed the analysis of compound figures in the past [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. To extract the image
types, a hierarchy was created [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], and as training data for these image types are
available, the global data set of over 5 million images can be filtered to a more
homogeneous set containing mainly radiology images, as described in the data
preparation section (Section 3). The ImageCLEF caption task aims at better
understanding the images in the biomedical literature and at extracting concepts and
captions based only on the visual information of the images (see Figure 1). A
further description of the task can be found in Section 2.
      </p>
      <p>This paper presents an overview of the ImageCLEF caption task 2018,
including the task and participation in Section 2, the dataset in Section 3 and
an explanation of the evaluation framework in Section 4. The participant
approaches are described in Section 5, followed by a discussion and the conclusions
in Section 6.</p>
    </sec>
    <sec id="sec-2">
      <title>Tasks and Participation</title>
      <p>Following the 2017 format, the 2018 caption task contains two subtasks: a concept
detection subtask that aims at extracting UMLS (Unified Medical Language
System®) Concept Unique Identifiers (CUIs) from the images automatically
based on the training data made available, and a caption prediction subtask that
requires predicting a precise text caption for the images in the test data set.
Table 1 shows the 8 participants of the task, who submitted 44 runs: 28 to the
concept detection subtask and 16 to the caption prediction subtask. Three of the
groups already participated in 2017, showing that the majority of the participants
were new to the task.</p>
      <p>It is interesting that, although the output of the concept detection
task can be used for the caption prediction task, none of the participants used
such an approach and only two groups participated in both tasks.</p>
      <p><bold>Concept Detection.</bold> As a first step towards automatic image captioning and scene
understanding, this subtask aims at automatically extracting high-level
biomedical concepts (CUIs) from medical images using only the visual content. This
approach provides the participating systems with a solid initial building block
for image understanding by detecting relevant individual components from which
full captions can be composed. The detected concepts are evaluated with a
metric based on precision and recall using the concepts extracted from the ground
truth captions (see Section 4).</p>
      <p><sup>7</sup> https://www.ncbi.nlm.nih.gov/pmc/</p>
      <p>Concept detection:
- C0589121: Follow-up visit
- C0018946: Hematoma, Subdural
- C1514893: physiologic resolution
- C0546674: Sorbitol dehydrogenase measurement
- C4038402: Low resolution
- C0374531: Postoperative follow-up visit, normally included in the surgical package, to indicate that an evaluation and management service was performed during a postoperative period for a reason(s) related to the original procedure
- C0202691: CAT scan of head
- C1320307: Urgent follow-up
- C3694716: follow-up visit date</p>
      <p>Caption prediction: CT head at follow-up visit demonstrates resolution of SDH.</p>
      <p><bold>Caption Prediction.</bold> In this subtask, the participants need to predict a coherent
caption for the entire medical image. The prediction can be based on the concepts
detected in the first subtask as well as the visual analysis of their interaction
in the image. Rather than the mere detection of visual concepts, this subtask
requires analyzing the interplay of visible elements.</p>
      <p>The evaluation of this second subtask is based on metrics such as BLEU
scores, independent of the first subtask and designed to be robust to variability
in style and wording (see Section 4).</p>
      <table-wrap>
        <label>Table 1</label>
        <table>
          <thead>
            <tr><th>Team</th><th>Institution</th></tr>
          </thead>
          <tbody>
            <tr><td>UA.PT Bioinformatics [<xref ref-type="bibr" rid="ref15">15</xref>] *</td><td>DETI - Institute of Electronics and Informatics Engineering, University of Aveiro, Portugal</td></tr>
            <tr><td>ImageSem [<xref ref-type="bibr" rid="ref16">16</xref>]</td><td>Institute of Medical Information, Chinese Academy of Medical Sciences/Peking Union Medical College, Beijing, China</td></tr>
            <tr><td>IPL [<xref ref-type="bibr" rid="ref17">17</xref>] *</td><td>Information Processing Laboratory, Athens University of Economics and Business, Athens, Greece</td></tr>
            <tr><td>CS MS [<xref ref-type="bibr" rid="ref18">18</xref>] *</td><td>Computer Science Department, Morgan State University, Baltimore, MD, USA</td></tr>
            <tr><td>AILAB</td><td>University of the Aegean, Greece</td></tr>
            <tr><td>UMass [<xref ref-type="bibr" rid="ref19">19</xref>]</td><td>UMass Medical School, Worcester, MA, USA</td></tr>
            <tr><td>KU Leuven [<xref ref-type="bibr" rid="ref20">20</xref>]</td><td>Department of Computer Science, KU Leuven, Leuven, Belgium</td></tr>
            <tr><td>WHU</td><td>Wuhan University, Wuhan, Hubei, China</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>
        As in previous years, the experimental corpus is derived from scholarly
biomedical articles on PMC, from which we extract figures and their
corresponding captions. As PMC contains many compound and non-clinical figures, we
extract a subset of mainly clinical figures to remove noise from the data and
focus the challenge on useful radiology/clinical images. The subset was created
using a fully automated method based on deep multimodal fusion of
Convolutional Neural Networks (CNNs) to classify all 5.8 million images of PMC from
2017 into image types, as described in [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]. This led to a more homogeneous set
of figures than in the 2017 ImageCLEF caption task, but diversity still remained
high. Besides the removal of many general graphs, the number of compound
figures (i.e. images containing more than one subfigure) was also much lower than
in 2017. Figure 2 shows some examples of the images contained in the collection,
including some of the noise that still remained in the data.
      </p>
      <p>
        In total, the collection comprises 232,305 image-caption pairs<sup>8</sup>. This
overall set is further split into disjoint training (222,305 pairs) and test (10,000
pairs) sets. For the concept detection subtask, the QuickUMLS library [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]
was used to identify UMLS concepts mentioned in the caption text. As a result,
111,155 unique UMLS concepts were extracted from the training set.
      </p>
      <p>Table 2 shows examples of the concepts. The average number of concepts
per image in the training set is 30, varying between 1 and 1,276. In the training
set, 2,577 images are labeled with only 1 concept and 3,162 with only 2. Although
the collection was carefully created, it still contains non-clinical images (see
Figure 2), as all processing was automatic with only limited human checking. Some
of the extracted concepts are also not relevant for the task, again because
the data analysis was fully automatic with limited manual quality control.
Concepts such as "and", "medical image" or "image" are not relevant for
the task and not useful to predict from the visual information. Some of the
concepts are redundant, such as "Marrow - Specimen Source Codes" and "Marrow".
Regardless of the limitations of the annotation, the majority of concepts were of
good quality and help to understand the content of the images, for example
the concepts "Marrow" or "X-ray".</p>
      <p><sup>8</sup> Nine pairs were removed after the challenge started due to incorrect duplicates in
the PMC figures.</p>
      <p>Figure 2 panels: (a) Relevant images. (b) Irrelevant images.</p>
    </sec>
    <sec id="sec-3">
      <title>Evaluation Methodology</title>
      <p>
        The performance evaluation follows the approach in the previous edition [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ]
in evaluating both subtasks separately. For the concept detection subtask, the
balanced precision and recall trade-off was measured in terms of F1 scores,
computed with Python's scikit-learn (v0.17.1-2) library. Micro F1 is calculated per
image and then the average across all test images is taken as the final measure.
      </p>
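      <p>The per-image scoring described above can be illustrated with a minimal sketch. It is not the organizers' evaluation script: it uses plain Python sets instead of scikit-learn, the helper names image_f1 and mean_f1 and the image identifiers are hypothetical, and scoring an image without any overlapping concept as 0 is an assumption.</p>
      <preformat>
```python
def image_f1(predicted, reference):
    """F1 between two collections of concept identifiers (CUIs) for one image."""
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted.intersection(reference))
    if true_positives == 0:
        # Assumption: no overlap (including empty sets) scores 0.
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


def mean_f1(predictions, references):
    """Average the per-image F1 over all images listed in the references."""
    scores = [image_f1(predictions.get(image_id, ()), cuis)
              for image_id, cuis in references.items()]
    return sum(scores) / len(scores)


# Hypothetical toy data: the image identifiers are for illustration only.
references = {
    "img1": {"C0202691", "C0018946"},
    "img2": {"C0589121"},
}
predictions = {
    "img1": {"C0202691", "C4038402"},
    "img2": {"C0589121"},
}
score = mean_f1(predictions, references)
print(round(score, 4))
```
</preformat>
      <p>In this toy example the first image scores 0.5 (one of two predicted concepts is correct and one of two reference concepts is found) and the second scores 1.0, so the averaged score is 0.75.</p>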
      <p>
        Caption prediction performance is assessed on the basis of BLEU scores [24]
using the Python NLTK (v3.2.2) default implementation. Candidate captions
are lowercased and stripped of all punctuation and English stop words. Finally, to
increase coverage, Snowball stemming is applied. BLEU scores are computed
per reference image, treating each entire caption as a sentence, even though it
may contain multiple natural sentences. We report average BLEU scores across
all test images.
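      </p>
      <p>The caption normalization steps described above can be sketched as follows. This is an illustration under stated assumptions, not the official evaluation code: the stop-word list is a tiny hypothetical stand-in for NLTK's English list, the stemmer is a placeholder where a Snowball stemmer would be plugged in, and deleting punctuation outright (so that "follow-up" becomes "followup") is an assumption about how punctuation was stripped. The function name normalize_caption is hypothetical.</p>
      <preformat>
```python
import string

# Assumption: a tiny illustrative stop-word list, not NLTK's English list.
STOP_WORDS = frozenset({"a", "an", "and", "at", "in", "of", "the", "with"})


def identity_stem(token):
    """Placeholder stemmer; a Snowball stemmer would be plugged in here."""
    return token


def normalize_caption(caption, stem=identity_stem, stop_words=STOP_WORDS):
    """Lowercase, strip punctuation, drop stop words, then stem each token."""
    lowered = caption.lower()
    no_punct = lowered.translate(str.maketrans("", "", string.punctuation))
    tokens = [tok for tok in no_punct.split() if tok not in stop_words]
    return [stem(tok) for tok in tokens]


caption = "CT head at follow-up visit demonstrates resolution of SDH."
tokens = normalize_caption(caption)
print(tokens)
```
</preformat>
      <p>The resulting token lists for a candidate caption and its reference caption would then be passed to a BLEU implementation, treating each caption as a single sentence.</p>
      <p>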
This section shows the results achieved by the participants in both subtasks.
Table 3 contains the results of the concept detection subtask and Table 4 contains
the results of the caption prediction subtask. None of the participants used
external data this year. Despite less noise in the 2018 data, the results
did not improve over 2017, possibly because no
external data were used.
      </p>
      <p>
In total, 28 runs were submitted by 5 groups (see Section 2) to the concept detection
subtask. Table 3 shows the details of the results. Several approaches were used
for the concept detection task, ranging from retrieval systems [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] to deep neural
networks. Most research groups implemented at least one approach based on
deep learning [
        <xref ref-type="bibr" rid="ref15 ref16 ref18">15, 16, 18</xref>
        ], including recurrent networks, various deep CNNs and
generative adversarial networks.
      </p>
      <p>
        Best results were achieved by UA.PT Bioinformatics [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] by applying an
adversarial auto-encoder for unsupervised feature learning. They also
experimented with a traditional bag-of-words algorithm, using Oriented FAST and
Rotated BRIEF (ORB) key point descriptors. UA.PT employed two classification
algorithms for concept detection over the learned feature spaces, namely
logistic regression and a variant of k-nearest neighbor (k-NN). Test results
showed a best mean F1 score of 0.1102 for linear classifiers, using the features
of the adversarial auto-encoder.
      </p>
      <p>
        ImageSem [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] followed UA.PT Bioinformatics in the ranking.
ImageSem was the only group using a retrieval approach, which was more
popular in 2017. This approach is based on the open-source Lucene Image Retrieval
(LIRE) system, used in combination with Latent Dirichlet Allocation (LDA) for
clustering the concepts of similar images. ImageSem also experimented with a
pre-trained CNN fine-tuned to predict a selected subset of concepts.
      </p>
      <p>
        IPL [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] proposed a k-NN classifier using two image representation models.
One of the methods used is a bag of visual words with dense Scale Invariant
Feature Transform (SIFT) descriptors using 4,096 clusters. A second method
uses a generalized bag of colors, dividing the image into a codebook of regions
of homogeneous colors with 100 or 200 clusters.
      </p>
      <p>
        The CS MS group [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] used an encoder-decoder model based on
multimodal Recurrent Neural Networks (RNNs). The encoded captions were the input
to the RNN via word embedding, while deep image features were encoded via
a pre-trained CNN. The combination of the two encoded inputs was used to
generate the concepts.
      </p>
      <p>AILAB used a multimodal deep learning approach. Instead of
using the 220K images, AILAB only used a subset of 4,000 images with feature
generation. The visual features are extracted by a pre-trained CNN, while the
text features are obtained by word embedding, followed by a Long Short-Term
Memory (LSTM) network. The two modalities are then merged and processed
by a dense layer to make a final concept prediction.</p>
      <p><bold>5.2 Results for the Caption Prediction Task</bold></p>
      <p>16 runs were submitted by 5 groups (see Section 2) to the caption prediction
subtask. Table 4 shows the details of the results.</p>
      <p>
        ImageSem [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] achieved the best results (0.2501 mean BLEU score) using the
image retrieval method described in the previous section to combine captions of
similar images. Preferred concepts, detected in the concept subtask by high CNN
or LDA scores, were also used in other runs to improve the caption generation.
      </p>
      <p>
        UMass [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] explored and implemented an encoder-decoder framework to
generate captions. For the encoder, deep CNN features are used, while an LSTM
network is used for the decoder. An attention mechanism was also tested
on a smaller sample to evaluate its impact on model fitting and prediction
performance.
      </p>
      <p>
        As mentioned in the previous section for concept detection, CS MS [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] also
used a similar multimodal deep learning method for caption prediction. A CNN
feature extraction of the images was combined with an LSTM on top of word
embeddings of the captions. A decoder made of two fully-connected layers produces
the captions.
      </p>
      <p>
        Instead of generating textual sequences directly from images, KU Leuven [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]
first learn a continuous representation space for the captions. The
representation space is learned by an adversarially regularized autoencoder (ARAE) [25],
combining a GAN and an auto-encoder. Subsequently, the task is reduced to
learning the mapping from the images to the continuous representation, which is
performed by a CNN. The decoder learned in the first step decodes the mapping
to a caption for each image.
      </p>
      <p>WHU used a simple LSTM network that produces a caption by
generating one word at every time step, conditioned on a context vector, the previous
hidden state and the previously generated words.</p>
    </sec>
    <sec id="sec-4">
      <title>Discussion and Conclusions</title>
      <p>The 2018 caption prediction task of ImageCLEF attracted a similar number of
participants as in 2017. No external resources were used, which made the task
hard. This is also shown in the results, which were overall lower than in 2017
even though the training and test data were more homogeneous, which should have made
the task slightly easier.</p>
      <p>Most of the participants used deep learning approaches, but the
networks and architectures used varied strongly. This shows that much
research is still required and that the potential to improve results is high. More
conventional feature extraction and approaches based on retrieval also delivered
good results, showing that there are many different ways of creating good
models.</p>
      <p>The limited participation was partly linked to the large amount of data
made available, which caused problems for some research groups. The data set
also remains noisy. Only more manual control is likely to help create cleaner
data and thus perhaps make the results of automatic approaches more coherent. Even
larger data sets could also help in this direction and allow models to be created
for at least the more frequently extracted concepts.</p>
      <p>24. Papineni, K., Roukos, S., Ward, T., Zhu, W.J.: Bleu: a method for automatic
evaluation of machine translation. In: Proceedings of the 40th annual meeting on
association for computational linguistics, Association for Computational
Linguistics (2002) 311–318</p>
      <p>25. Zhao, J.J., Kim, Y., Zhang, K., Rush, A.M., LeCun, Y.: Adversarially regularized
autoencoders for generating discrete structures. CoRR, abs/1706.04223 (2017)</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1. Muller, H.,
          <string-name>
            <surname>Clough</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Deselaers</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Caputo</surname>
          </string-name>
          , B., eds.: ImageCLEF –
          <article-title>Experimental Evaluation in Visual Information Retrieval</article-title>
          . Volume
          <volume>32</volume>
          of The Springer International Series On Information Retrieval. Springer, Berlin Heidelberg (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Kalpathy-Cramer</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          García Seco de Herrera,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Demner-Fushman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            ,
            <surname>Antani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            ,
            <surname>Bedrick</surname>
          </string-name>
          ,
          <string-name>
            <surname>S.</surname>
          </string-name>
          , Muller, H.:
          <article-title>Evaluating performance of biomedical image retrieval systems: Overview of the medical image retrieval task at ImageCLEF 2004{2014</article-title>
          .
          <source>Computerized Medical Imaging and Graphics</source>
          <volume>39</volume>
          (
          <issue>0</issue>
          ) (
          <year>2015</year>
          )
          <fpage>55</fpage>–<lpage>61</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Clough</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , Muller, H.,
          <string-name>
            <surname>Sanderson</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>The CLEF 2004 cross-language image retrieval track</article-title>
          . In Peters,
          <string-name>
            <given-names>C.</given-names>
            ,
            <surname>Clough</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            ,
            <surname>Gonzalo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            ,
            <surname>Jones</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.J.F.</given-names>
            ,
            <surname>Kluck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Magnini</surname>
          </string-name>
          , B., eds.:
          <article-title>Multilingual Information Access for Text, Speech and Images: Result of the fifth CLEF evaluation campaign</article-title>
          . Volume
          <volume>3491</volume>
          of Lecture Notes in Computer Science (LNCS), Bath, UK, Springer (
          <year>2005</year>
          )
          <fpage>597</fpage>–<lpage>613</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Caputo</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Müller</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thomee</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Villegas</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paredes</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zellhofer</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goeau</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Joly</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bonnet</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gomez</surname>
            ,
            <given-names>J.M.</given-names>
          </string-name>
          , et al.:
          <article-title>ImageCLEF 2013: the vision, the data and the open challenges</article-title>
          .
          <source>In: International Conference of the Cross-Language Evaluation Forum for European Languages</source>
          , Springer (
          <year>2013</year>
          )
          <fpage>250</fpage>–<lpage>268</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Ionescu</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          , Muller, H.,
          <string-name>
            <surname>Villegas</surname>
          </string-name>
          , M.,
          <string-name>
            <surname>de Herrera</surname>
            ,
            <given-names>A.G.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Eickhoff</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Andrearczyk</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cid</surname>
            ,
            <given-names>Y.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liauchuk</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kovalev</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hasan</surname>
            ,
            <given-names>S.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ling</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Farri</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lungren</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dang-Nguyen</surname>
            ,
            <given-names>D.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Piras</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Riegler</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhou</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lux</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gurrin</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          : Overview of ImageCLEF 2018:
          <article-title>Challenges, datasets and evaluation. In: Experimental IR Meets Multilinguality, Multimodality, and Interaction</article-title>
          .
          <source>Proceedings of the Ninth International Conference of the CLEF Association (CLEF</source>
          <year>2018</year>
          ), Avignon, France,
          <source>LNCS Lecture Notes in Computer Science</source>
          , Springer (September 10-14
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>Dicente</given-names>
            <surname>Cid</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            ,
            <surname>Liauchuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            ,
            <surname>Kovalev</surname>
          </string-name>
          ,
          <string-name>
            <surname>V.</surname>
          </string-name>
          , , Muller, H.:
          <article-title>Overview of ImageCLEFtuberculosis 2018 - detecting multi-drug resistance, classifying tuberculosis type, and assessing severity score</article-title>
          .
          In:
          <source>CLEF2018 Working Notes. CEUR Workshop Proceedings</source>
          , Avignon, France, CEUR-WS.org &lt;http://ceur-ws.org&gt; (September 10-14,
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Hasan</surname>
            ,
            <given-names>S.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ling</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Farri</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lungren</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , Muller, H.:
          <article-title>Overview of the ImageCLEF 2018 medical domain visual question answering task</article-title>
          .
          <source>In: CLEF2018 Working Notes. CEUR Workshop Proceedings</source>
          , Avignon, France, CEUR-WS.org &lt;http://ceur-ws.
          <source>org&gt; (September 10-14</source>
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Markonis</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Holzer</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dungs</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vargas</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Langs</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kriewel</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , Muller, H.:
          <article-title>A survey on visual information search behavior and requirements of radiologists</article-title>
          .
          <source>Methods of Information in Medicine</source>
          <volume>51</volume>
          (
          <issue>6</issue>
          ) (
          <year>2012</year>
          )
          <fpage>539</fpage>
          –
          <lpage>548</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9. Muller, H.,
          <string-name>
            <surname>Despont-Gros</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hersh</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jensen</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lovis</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Geissbuhler</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Health care professionals' image use and search behaviour</article-title>
          .
          In:
          <source>Proceedings of the Medical Informatics Europe Conference (MIE 2006)</source>
          . IOS Press, Studies in Health Technology and Informatics, Maastricht, The Netherlands (August
          <year>2006</year>
          )
          <fpage>24</fpage>
          –
          <lpage>32</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Joly</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , Goeau, H.,
          <string-name>
            <surname>Glotin</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Spampinato</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bonnet</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vellinga</surname>
            ,
            <given-names>W.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lombardo</surname>
            ,
            <given-names>J.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Planque</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Palazzo</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , Muller, H.:
          <article-title>Lifeclef 2017 lab overview: multimedia species identi cation challenges</article-title>
          .
          <source>In: Proceedings of CLEF</source>
          <year>2017</year>
          .
          <article-title>(</article-title>
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Goeuriot</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kelly</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Palotti</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pecina</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zuccon</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hanbury</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jones</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          , Muller, H.:
          <article-title>Share/clef ehealth evaluation lab 2014, task 3: User-centred health information retrieval clef ehealth overview</article-title>
          .
          <source>In: CLEF Proceedings</source>
          , Springer LNCS (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Villegas</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , Muller, H.,
          <string-name>
            <surname>de Herrera</surname>
            ,
            <given-names>A.G.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schaer</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bromuri</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gilbert</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Piras</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yan</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ramisa</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , et al.:
          <article-title>General overview of ImageCLEF at the CLEF 2016 labs</article-title>
          . In:
          <source>International Conference of the Cross-Language Evaluation Forum for European Languages</source>
          , Springer (
          <year>2016</year>
          )
          <fpage>267</fpage>
          –
          <lpage>285</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>García Seco de Herrera</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schaer</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bromuri</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , Muller, H.:
          <article-title>Overview of the ImageCLEF 2016 medical task</article-title>
          .
          <source>In: Working Notes of CLEF</source>
          <year>2016</year>
          (
          <article-title>Cross Language Evaluation Forum)</article-title>
          .
          <source>(September</source>
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14. Muller, H.,
          <string-name>
            <surname>Kalpathy-Cramer</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Demner-Fushman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Antani</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Creating a classification of image types in the medical literature for visual categorization</article-title>
          . In:
          <source>SPIE Medical Imaging</source>
          . (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Pinho</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Costa</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Feature learning with adversarial networks for concept detection in medical images: UA.PT Bioinformatics at ImageCLEF 2018</article-title>
          . In:
          <source>CLEF2018 Working Notes. CEUR Workshop Proceedings</source>
          , Avignon, France, CEUR-WS.org &lt;http://ceur-ws.org&gt; (September 10-14,
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guo</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>ImageSem at ImageCLEF 2018 caption task: Image retrieval and transfer learning</article-title>
          . In:
          <source>CLEF2018 Working Notes. CEUR Workshop Proceedings</source>
          , Avignon, France, CEUR-WS.org &lt;http://ceur-ws.org&gt; (September 10-14,
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Valavanis</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kalamboukis</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>IPL at ImageCLEF 2018: A kNN-based concept detection approach</article-title>
          . In:
          <source>CLEF2018 Working Notes. CEUR Workshop Proceedings</source>
          , Avignon, France, CEUR-WS.org &lt;http://ceur-ws.org&gt; (September 10-14,
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Rahman</surname>
            ,
            <given-names>M.M.</given-names>
          </string-name>
          :
          <article-title>A cross modal deep learning based approach for caption prediction and concept detection by CS Morgan State</article-title>
          . In:
          <source>CLEF2018 Working Notes. CEUR Workshop Proceedings</source>
          , Avignon, France, CEUR-WS.org &lt;http://ceur-ws.org&gt; (September 10-14,
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Su</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>UMass at ImageCLEF caption prediction 2018 task</article-title>
          .
          In:
          <source>CLEF2018 Working Notes. CEUR Workshop Proceedings</source>
          , Avignon, France, CEUR-WS.org &lt;http://ceur-ws.org&gt; (September 10-14,
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Spinks</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moens</surname>
            ,
            <given-names>M.F.</given-names>
          </string-name>
          :
          <article-title>Generating text from images in a smooth representation space</article-title>
          .
          In:
          <source>CLEF2018 Working Notes. CEUR Workshop Proceedings</source>
          , Avignon, France, CEUR-WS.org &lt;http://ceur-ws.org&gt; (September 10-14,
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Andrearczyk</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Müller</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Deep multimodal classification of image types in biomedical journal figures</article-title>
          . In:
          <source>International Conference of the Cross-Language Evaluation Forum (CLEF)</source>
          . (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Soldaini</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goharian</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          :
          <article-title>QuickUMLS: a fast, unsupervised approach for medical concept extraction</article-title>
          . In:
          <source>MedIR workshop, SIGIR</source>
          . (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Eickhoff</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schwall</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>García Seco de Herrera</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Müller</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Overview of ImageCLEFcaption 2017 - image caption prediction and concept detection for biomedical images</article-title>
          . In:
          <source>CLEF working notes, CEUR</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>