<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>NLM at ImageCLEF 2017 Caption Task</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Asma Ben Abacha</string-name>
          <email>asma.benabacha@nih.gov</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alba G. Seco de Herrera</string-name>
          <email>albagarcia@nih.gov</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Soumya Gayen</string-name>
          <email>soumya.gayen@nih.gov</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dina Demner-Fushman</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sameer Antani</string-name>
          <email>santani@mail.nih.gov</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Lister Hill National Center for Biomedical Communications, National Library of Medicine</institution>
          ,
          <addr-line>Bethesda</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper describes the participation of the U.S. National Library of Medicine (NLM) in the ImageCLEF 2017 caption task. We proposed different machine learning methods using training subsets that we selected from the provided data, as well as retrieval methods using external data. For the concept detection subtask, we used Convolutional Neural Networks (CNNs) and Binary Relevance using decision trees for multi-label classification. We also proposed a retrieval-based approach using the Open-i image search engine and MetaMapLite to recognize relevant terms and associated Concept Unique Identifiers (CUIs). For the caption prediction subtask, we used the recognized CUIs and the UMLS to generate the captions. We also applied Open-i to retrieve similar images and their captions. We submitted ten runs for the concept detection subtask and six runs for the caption prediction subtask. CNNs provided good results considering the size of the selected subsets and the limited number of CUIs used for training. Using the CUIs recognized by the CNNs, our UMLS-based method for caption prediction obtained good results with a 0.2247 mean BLEU score. In both subtasks, the best results were achieved using retrieval-based approaches, which outperformed the runs submitted by all the participants, with a 0.1718 mean F1 score in the concept detection subtask and a 0.5634 mean BLEU score in the caption prediction subtask.</p>
      </abstract>
      <kwd-group>
        <kwd>Concept Detection</kwd>
        <kwd>Caption Prediction</kwd>
        <kwd>Convolutional Neural Networks</kwd>
        <kwd>Multi-label Classification</kwd>
        <kwd>Open-i</kwd>
        <kwd>MetaMapLite</kwd>
        <kwd>UMLS</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        This paper describes the participation of the U.S. National Library of Medicine (NLM, http://www.nlm.nih.gov) in the ImageCLEF 2017 caption task [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. ImageCLEF [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] is an evaluation campaign organized as part of the CLEF initiative labs (http://clef2017.clef-initiative.eu). In 2017, the caption task consisted of two subtasks: concept detection and caption prediction. A detailed description of the data and the task is presented in Eickhoff et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
    </sec>
    <sec id="sec-2">
      <title>1 http://www.nlm.nih.gov</title>
    </sec>
    <sec id="sec-3">
      <p>
        The concept detection subtask consists of identifying the UMLS® (Unified Medical Language System, https://www.nlm.nih.gov/research/umls) Concept Unique Identifiers (CUIs). To address this first challenge of detecting CUIs in a given image from the biomedical literature, we propose several approaches based on multi-label classification and information retrieval. For the multi-label classification, Convolutional Neural Networks (CNNs) and Binary Relevance using Decision Trees (BR-DT) are applied. The information retrieval approach is based on the Open-i Biomedical Image Search Engine (http://Open-i.nlm.nih.gov) [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>The caption prediction subtask aims to recreate the original image caption. To predict the captions of the images, we proposed a retrieval-based approach using Open-i and a second approach based on the retrieved CUIs and the UMLS® to find the associated terms and semantic groups.</p>
      <p>The rest of the paper is organized as follows. Section 2 describes the data provided for the two subtasks and our method for selecting training subsets. We then present the proposed approaches for concept detection in Section 3 and caption prediction in Section 4. Section 5 provides a description of the submitted runs. Finally, Section 6 presents and discusses our results.</p>
      <sec id="sec-3-1">
        <title>Data Analysis and Selection</title>
        <p>Training, validation and test datasets were provided, containing 164,614, 10,000 and 10,000 biomedical images, respectively. The images were extracted from scholarly articles in PubMed Central (PMC, http://www.ncbi.nlm.nih.gov/pmc).</p>
        <p>For the concept detection subtask, a set of CUIs was provided for each image. For the caption prediction subtask, captions were provided. Figure 1 shows an example from the provided data.</p>
        <sec id="sec-3-1-1">
          <title>Analysis of Concept Detection Data</title>
          <p>We analyzed the task data in order to study the types of methods that could be applied to the concept detection subtask and whether it was necessary to select training data and remove the less frequent CUIs. We also studied whether it is relevant to build rule-based methods and construct patterns for the caption prediction subtask based on the recognized CUIs.</p>
          <p>For the concept detection subtask:
- Training data includes 164,614 images associated with 20,463 CUIs; 19,145 CUIs have fewer than 100 images, including 6,251 CUIs with only one image.
- Validation data includes 10,000 images associated with 7,070 CUIs; 6,981 CUIs have fewer than 100 images, including 3,247 CUIs with only one image.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>3 https://www.nlm.nih.gov/research/umls</title>
    </sec>
    <sec id="sec-5">
      <title>4 http://Open-i.nlm.nih.gov</title>
    </sec>
    <sec id="sec-6">
      <p>The concepts and caption provided for the example image (Figure 1) are:
Concepts:
- C0016911: Gadolinium
- C0021485: Injection of therapeutic agent
- C0024485: Magnetic Resonance Imaging
- C0577559: Mass of body structure
- C1533685: Injection procedure
Caption: Magnetic resonance imaging. After intravenous injection of gadolinium, the mass showed a progressive, heterogeneous, and delayed enhancement.</p>
      <p>The heterogeneous distribution of CUIs in the training data is not well suited to multi-label classification; we therefore studied data selection methods.</p>
      <p>
        Cho et al. [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] applied deep learning to medical image classification and focused on determining the ideal training data size to achieve high classification accuracy. They trained a CNN using different sizes of training data and tested the models on 6,000 computed tomography (CT) images. Using 200 training samples, the classification accuracy was already near or at 100%. Based on these experiments, we fixed a threshold of 200 training images for each CUI.
      </p>
      <p>In addition to the number of examples per CUI, some CUIs are much more frequent than others in the datasets (the number of training images per CUI varies from 1 to 17,998). We therefore built two different training subsets targeting the most frequent CUIs:
- Subset 1 [92 CUIs with frequency &gt;= 1,500]: We selected the CUIs having at least 1,500 training examples. This subset corresponded to 92 distinct CUIs. For each CUI, we randomly selected 200 training examples from the provided training images.
- Subset 2 [239 CUIs with frequency &gt;= 400]: We selected the CUIs having at least 400 training examples. This subset corresponded to 239 CUIs. For each CUI, we randomly selected 200 training examples.</p>
      <p>We used these two subsets to train our machine learning (ML) methods.</p>
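      <p>As an illustration, the following Python sketch builds such a frequency-based training subset from a file listing image identifiers and their CUIs; the file name, separator and column layout are assumptions for illustration only, not the official distribution format.</p>
      <preformat>import csv
import random
from collections import defaultdict

MIN_FREQ = 400   # 1,500 for subset 1, 400 for subset 2
PER_CUI = 200    # training examples sampled per retained CUI

# Assumed format: one line per image, "image_id&lt;TAB&gt;CUI1;CUI2;..."
images_per_cui = defaultdict(list)
with open("ConceptDetectionTraining2017.csv") as f:
    for image_id, cui_field in csv.reader(f, delimiter="\t"):
        for cui in cui_field.split(";"):
            images_per_cui[cui].append(image_id)

# Keep only the CUIs with at least MIN_FREQ training images
frequent_cuis = [c for c, imgs in images_per_cui.items() if len(imgs) &gt;= MIN_FREQ]

# Randomly sample PER_CUI training examples for each retained CUI
subset = {cui: random.sample(images_per_cui[cui], PER_CUI) for cui in frequent_cuis}

print(len(frequent_cuis), "CUIs retained")</preformat>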
      <sec id="sec-6-1">
        <title>Concept Detection Methods</title>
        <p>For the concept detection subtask, each image can be associated with one or multiple CUIs. We approached the problem in two ways: (1) applying multi-label classification methods and (2) using a retrieval-based approach.</p>
        <p>In the multi-label classification approach, we consider the CUIs in the training set as the labels to be assigned. Thus each image will be assigned one or multiple labels from the predefined label set. Two methods for multi-label classification were applied: Convolutional Neural Networks (CNNs) and Binary Relevance using Decision Trees (BR-DT). To train our ML models, we utilized the high-performance computational capabilities of the Biowulf Linux cluster at the U.S. National Institutes of Health (http://biowulf.nih.gov).</p>
        <p>In the information retrieval approach, we used Open-i to retrieve the most similar images and their associated labels and CUIs.</p>
        <sec id="sec-6-1-1">
          <title>Multi-label Classification with Convolutional Neural Networks (CNNs)</title>
          <p>
            Deep learning methods have been widely applied to image analysis. In particular, CNNs have achieved excellent results for image classification [
            <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
            ].
          </p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>7 http://biowulf.nih.gov</title>
      <p>We applied CNNs for multi-label classification and tested different neural networks, such as the GoogLeNet network [7]. GoogLeNet won the classification and object recognition challenges in the 2014 ImageNet LSVRC competition (ILSVRC 2014, http://image-net.org/challenges/LSVRC/2014/eccv2014). In our experiments on the training sets, the GoogLeNet network provided better results than AlexNet [8] and LeNet [9].</p>
      <p>We ran the CNNs using the NVIDIA Deep Learning GPU Training System (DIGITS, http://github.com/NVIDIA/DIGITS). DIGITS is a Deep Learning (DL) training system with a web interface that allows designing custom network architectures and evaluating their effectiveness. It also allows the design of new models by providing the details of the optimization and network architecture. DIGITS can be used for image classification, segmentation and object detection tasks.</p>
      <p>In our final runs, we used the GoogLeNet network. We applied stochastic gradient descent (SGD) and performed 100 training epochs. We used the two training subsets associated with 92 and 239 CUIs, respectively, to train the network (see Section 2.2).</p>
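      <p>As a rough illustration of this setup, the sketch below trains a GoogLeNet-style network for multi-label CUI prediction with SGD over 100 epochs. It uses PyTorch and torchvision rather than the Caffe/DIGITS stack used for the submitted runs; the placeholder tensors stand in for the actual training images and multi-hot CUI labels, and the optimizer hyperparameters shown are illustrative.</p>
      <preformat>import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from torchvision import models

NUM_CUIS = 92  # subset 1; use 239 for subset 2

# GoogLeNet-style network with one output logit per CUI (multi-label setting)
model = models.googlenet(num_classes=NUM_CUIS, aux_logits=False, init_weights=True)

# Placeholder data: replace with a Dataset yielding (image, multi-hot CUI vector)
images = torch.randn(16, 3, 224, 224)
labels = torch.randint(0, 2, (16, NUM_CUIS)).float()
loader = DataLoader(TensorDataset(images, labels), batch_size=8, shuffle=True)

criterion = nn.BCEWithLogitsLoss()  # one independent binary decision per CUI
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

model.train()
for epoch in range(100):  # 100 training epochs, as in our runs
    for x, y in loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()

# At test time, CUIs whose sigmoid score exceeds a threshold (e.g. 0.5) are returned.</preformat>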
      <sec id="sec-7-1">
        <title>Multi-label Classi cation with Binary Relevance using Decision</title>
      </sec>
      <sec id="sec-7-2">
        <title>Trees (BR-DT)</title>
        <p>The Meka project [10] (http://meka.sourceforge.net) is based on the Weka machine learning library [11] and provides an open-source implementation of methods for multi-label classification. It contains several algorithms, such as Binary Relevance (BR) and Label Powerset.</p>
        <p>Similar to [12], we used BR-DT as implemented in Meka (J48). BR methods create an individual model for each label, so each model addresses a simple binary problem. We used Decision Trees (DT) as the base classifier because DT are able to capture relations between labels. For the experiments, we extracted from the images a visual descriptor commonly used for image classification, the Colour and Edge Directivity Descriptor (CEDD) [13]. The descriptor was provided as input to Meka.</p>
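        <p>For reference, the sketch below applies binary relevance with decision trees over precomputed CEDD vectors, using scikit-learn and scikit-multilearn instead of Meka's J48-based implementation; the random feature and label matrices stand in for the real CEDD descriptors and CUI labels.</p>
        <preformat>import numpy as np
from sklearn.tree import DecisionTreeClassifier
from skmultilearn.problem_transform import BinaryRelevance

# X: one 144-dimensional CEDD vector per training image (assumed precomputed)
# Y: multi-hot CUI matrix, one column per CUI in the training subset
X = np.random.rand(1000, 144)
Y = np.random.randint(0, 2, size=(1000, 92))

# Binary relevance: one independent decision tree is trained per CUI
br_dt = BinaryRelevance(classifier=DecisionTreeClassifier(), require_dense=[True, True])
br_dt.fit(X, Y)

# Predict CUIs for new CEDD vectors; the result is a sparse indicator matrix
X_test = np.random.rand(10, 144)
predictions = br_dt.predict(X_test)</preformat>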
        <p>Before submitting the runs, we also carried out experiments on the training data using the Fuzzy Colour and Texture Histogram (FCTH) [14] as a visual descriptor; however, CEDD provided better results.</p>
      </sec>
      <sec id="sec-7-3">
        <title>Retrieval and Annotation Approach with Open-i and</title>
      </sec>
      <sec id="sec-7-4">
        <title>MetaMapLite</title>
        <p>The Open-i service of the NLM enables search and retrieval of abstracts and images (including charts, graphs and clinical images) from the open-source literature and from biomedical image collections. Open-i provides access to over 3.7 million images from about 1.2 million PubMed Central articles; 7,470 chest x-rays with 3,955 radiology reports; 67,517 images from the NLM History of Medicine collection; 2,064 orthopedic illustrations; and 8,084 medical case images from MedPix (figures as of September 2016).</p>
      </sec>
    </sec>
    <sec id="sec-8">
      <title>8 http://image-net.org/challenges/LSVRC/2014/eccv2014</title>
    </sec>
    <sec id="sec-9">
      <p>Open-i combines text processing, image analysis and machine learning techniques to retrieve relevant images for an input image query.</p>
      <p>We submitted each query image to the Open-i search API and selected 10 result images with captions. For each retrieved image, we annotated its caption with MetaMapLite (version 3.1-SNAPSHOT, https://metamap.nlm.nih.gov/MetaMapLite.shtml) to recognize CUIs. MetaMapLite recognizes named entities using the longest match, as well as the associated CUIs. It also allows restricting the CUIs to selected UMLS Semantic Types; we did not use any restriction, as the CUIs in the provided data have heterogeneous semantic types.</p>
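      <p>The overall pipeline can be sketched as follows; openi_similar_captions and metamaplite_cuis are hypothetical placeholders for the Open-i image-similarity query and the MetaMapLite invocation, and only the overall flow (retrieve captions, annotate them, cap the output at 50 CUIs) follows the description above.</p>
      <preformat>def openi_similar_captions(image_path, top_k=10):
    """Placeholder: query Open-i with an image and return the captions
    of the top_k most similar retrieved images (most similar first)."""
    raise NotImplementedError

def metamaplite_cuis(text):
    """Placeholder: run MetaMapLite on a caption and return the CUIs of
    the UMLS concepts recognized with longest-match entity detection."""
    raise NotImplementedError

def detect_concepts(image_path, captions_to_use=1, max_cuis=50):
    # Use the captions of the most similar image(s) returned by Open-i
    captions = openi_similar_captions(image_path)[:captions_to_use]
    cuis = []
    for caption in captions:
        for cui in metamaplite_cuis(caption):
            if cui not in cuis:      # keep retrieval order, drop duplicates
                cuis.append(cui)
    return cuis[:max_cuis]           # task limit: at most 50 CUIs per image</preformat>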
      <sec id="sec-9-1">
        <title>Caption Prediction Methods</title>
        <p>To predict image captions, we used two different methods based on the UMLS® and Open-i.</p>
        <sec id="sec-9-1-1">
          <title>UMLS-based Method</title>
          <p>We used the CUIs recognized in the first (concept detection) subtask to generate the associated UMLS terms and semantic types. We then grouped the recognized UMLS terms using the UMLS groups of their semantic types. The UMLS Semantic Network includes 15 groups: Activities &amp; Behaviors, Anatomy, Chemicals &amp; Drugs, Concepts &amp; Ideas, Devices, Disorders, Genes &amp; Molecular Sequences, Geographic Areas, Living Beings, Objects, Occupations, Organizations, Phenomena, Physiology and Procedures.</p>
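          <p>A simplified sketch of this caption-generation step is given below; the cui_to_term and cui_to_group dictionaries stand in for lookups against the UMLS Metathesaurus and Semantic Network, which are queried separately.</p>
          <preformat>def generate_caption(cuis, cui_to_term, cui_to_group):
    """Group the recognized UMLS terms by semantic group and emit
    'Group: term1, term2.' segments, as in our UMLS-based runs."""
    grouped = {}
    for cui in cuis:
        term = cui_to_term.get(cui)
        group = cui_to_group.get(cui)
        if term is None or group is None:
            continue                 # skip CUIs without a UMLS entry
        grouped.setdefault(group, []).append(term)
    return " ".join(f"{g}: {', '.join(t)}." for g, t in grouped.items())

# Toy lookup tables; the real values come from the UMLS
terms = {"C0024485": "magnetic resonance imaging", "C0016911": "gadolinium"}
groups = {"C0024485": "Procedures", "C0016911": "Chemicals &amp; Drugs"}
print(generate_caption(["C0024485", "C0016911"], terms, groups))
# Procedures: magnetic resonance imaging. Chemicals &amp; Drugs: gadolinium.</preformat>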
          <p>The following are examples of four captions and their corresponding image IDs, generated using the UMLS-based method:
1. 1471-2342-10-23-4: Procedures: diagnostic computed tomography, imaging pet. Anatomy: armpit. Disorders: metastasis. Physiology: uptake.
2. iej-04-20-g007: Procedures: h&amp;e stain. Chemicals &amp; Drugs: haematoxylin, 11445 red, eosin. Disorders: proliferation.
3. 13014 2015 335 Fig1 HTML: Procedures: brain mri, diffusion weighted imaging, bodies weight. Concepts &amp; Ideas: rows. Chemicals &amp; Drugs: gadolinium.
4. fonc-04-00350-g002: Procedures: antineoplastic chemotherapy regimen. Disorders: abnormally opaque structure, condition response. Anatomy: left lung, anterior thoracic region.</p>
        </sec>
        <sec id="sec-9-1-2">
          <title>Open-i-based Method</title>
          <p>For each input image, the Open-i biomedical image search engine returns a list of similar images. In our experiments, we performed several tests with the caption, mention, Medical Subject Headings (MeSH®) terms, three outcomes and medical problems from the retrieved images. In our final runs, we used only the captions of the first and second retrieved images.</p>
          <p>The following are two examples of results provided by Open-i:
1. 1MS-10-20646-g003: Open-i provides the following relevant results:
- Caption: Laryngostenosis in patient with laryngeal tuberculosis. Tracheostomy.
- Problem(s): tuberculoses.
- Concept(s): laryngostenoses; laryngeal tuberculoses.
- Outcomes: (i) Within the group of patients with lymph node tuberculosis in 15 cases there were infected lymph nodes of the 2(nd) and 3(rd) cervical region and in 11 infected lymph nodes of the 1(st) cervical region. (ii) In 5 cases of laryngeal tuberculosis there was detected coexistence of cancer. (iii) Chest X-ray was performed in all cases and pulmonary tuberculosis was identified in 26 (35.6%) cases.
- Mention: Moreover, histopathological examination revealed in 5 cases coexistence of planoepithelial carcinoma with tuberculosis. In all 5 cases total laryngectomy was performed. Chest X-ray was performed in all patients and the evidence of lung tuberculosis was confirmed in 14 (70%) cases. Tuberculin skin test was positive in 10 (66.6%) out of 15 tests performed. Contact history with active tuberculosis was detected in 3 (15%) cases (Figures 2 and 3).
2. 110.1177 2324709614529417- g1: Open-i provides the following results:
- Caption: Magnetic resonance imaging after the onset of isolated adrenocorticotropic hormone deficiency. Magnetic resonance imaging showed no space-occupying lesions in the pituitary gland or hypothalamus.
- Problem(s): isolated adrenocorticotropic hormone deficiency.
- Concept(s): isolated adrenocorticotropic hormone deficiency.
- Outcomes: (i) Although the neutropenia and fever immediately improved, he became unable to take any oral medications and was bedridden 1 week after admission. (ii) His serum sodium level abruptly decreased to 122 mEq/L on the fifth day of hospitalization. (iii) Hydrocortisone replacement therapy was begun at 20 mg/day, resulting in a marked improvement in his anorexia and general fatigue within a few days.
- Mention: CT and magnetic resonance imaging showed no space-occupying lesion or atrophic change in his pituitary gland or hypothalamus (Figure 1).</p>
        </sec>
      </sec>
      <sec id="sec-9-2">
        <title>Runs</title>
        <p>This section provides a detailed description of the runs submitted to the ImageCLEF 2017 caption task. The methods used to implement these runs are described in Sections 3 and 4.</p>
        <sec id="sec-9-2-1">
          <title>Concept Detection</title>
          <p>As specified by the task guidelines, a maximum of 50 UMLS concepts per figure is accepted. Therefore, if the limit of 50 CUIs per image was reached, we kept only the first 50 CUIs for each image. We submitted the following runs to the Concept Detection subtask:
DET 1. DET run 1 Open-i MetaMapLite 1: We used Open-i to find similar images and then extracted CUIs from their captions using MetaMapLite. In this first run, we used the caption of the most similar image according to Open-i. The returned CUIs are all the CUIs recognized by MetaMapLite.</p>
          <p>DET 2. DET run 1 baseline: The same as DET 1, but excluding test images if they are retrieved by Open-i.</p>
          <p>DET 3. DET run 2 Open-i MetaMapLite 2: The same as DET 1, except that we took only the first CUI recognized by MetaMapLite for each term.</p>
          <p>DET 4. DET run 3 Open-i MetaMapLite 3: Similar to DET 1, except that we used the captions from the first and second best images retrieved by Open-i.</p>
          <p>DET 5. DET run 5 Meka CEDD: Multi-label classification method using the Meka software to apply the binary relevance method. CEDD is used as a visual descriptor for the images. Subset 1 of 92 CUIs is used for training.</p>
          <p>DET 6. DET run 6 CNN GoogLeNet 92Cuis: Multi-label classification with a convolutional neural network. We trained the GoogLeNet network using subset 1 of 92 CUIs.</p>
          <p>DET 7. DET run 7 CNN GoogLeNet 239Cuis: We trained the GoogLeNet
network using subset 2 of 239 CUIs.</p>
          <p>DET 8. DET run 8 comb1 CNN2: Fusion of the runs DET 6 and DET 7.</p>
          <p>DET 9. DET run 9 comb2 CNN2Meka: Fusion of the runs DET 5, DET 6 and DET 7.</p>
          <p>DET 10. DET run 10 comb3 CNN2MekaOpen-i: Fusion of the runs DET 1, DET 5, DET 6 and DET 7.</p>
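          <p>The fusion used in runs DET 8 to DET 10 can be sketched as below, assuming a simple union of the per-image CUI lists produced by the individual runs (as noted in the conclusions, the intersection yielded very few CUIs), truncated to the 50-CUI limit.</p>
          <preformat>def fuse_runs(runs, max_cuis=50):
    """Union-style fusion of several concept detection runs.
    Each run maps an image id to an ordered list of predicted CUIs."""
    fused = {}
    image_ids = set().union(*(run.keys() for run in runs))
    for image_id in image_ids:
        merged = []
        for run in runs:
            for cui in run.get(image_id, []):
                if cui not in merged:   # keep first-seen order, no duplicates
                    merged.append(cui)
        fused[image_id] = merged[:max_cuis]
    return fused

# Example: DET 8 fuses the two CNN runs (DET 6 and DET 7)
det6 = {"img1": ["C0024485", "C0016911"]}
det7 = {"img1": ["C0024485", "C0577559"]}
print(fuse_runs([det6, det7]))
# {'img1': ['C0024485', 'C0016911', 'C0577559']}</preformat>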
        </sec>
        <sec id="sec-9-2-5">
          <title>Caption Prediction</title>
          <p>We submitted the following runs to the Caption Prediction subtask:
PRED 1. PRED run 1 Open-iMethod: We used the Open-i Biomedical Image Search Engine to find similar images. In this run, we used the caption of the first retrieved image.</p>
          <p>PRED 2. PRED run 1 baseline: Same as PRED 1, except we excluded the
test images if they are retrieved by Open-i.</p>
          <p>PRED 3. PRED run 2 CNN 92: We used the CUIs recognized by the CNN (run DET 6) and the UMLS semantic groups to generate the captions.</p>
          <p>PRED 4. PRED run 3 CNN 239: We used the CUIs recognized by the CNN (run DET 7) and the UMLS semantic groups to generate the captions.</p>
          <p>PRED 5. PRED run 4 CNN comb: We used the CUIs recognized by the CNN (run DET 8) and the UMLS semantic groups to generate the captions.</p>
          <p>PRED 6. PRED run 5 comb all: We used the CUIs recognized by the hybrid method (run DET 10) and the UMLS® to generate the captions.</p>
        </sec>
      </sec>
      <sec id="sec-9-3">
        <title>Official Results</title>
        <p>In this section we describe and discuss the results obtained by the submitted runs.</p>
        <sec id="sec-9-3-1">
          <title>Concept Detection Results</title>
          <p>The best overall results were obtained by run DET 1, followed by run DET 3; both approaches are based on the Open-i retrieval system. To better understand the results, Table 3 shows the efficiency of the Open-i system on the test set by presenting how many times the query image itself was retrieved and ranked in the first 10 positions when searching the full Open-i collection (3.7 million images). We analyze only the first 10 positions because this is the maximum number of retrieved images that we used in our experiments.</p>
          <p>Open-i was able to find the image among the top 10 results in 61% of the cases, and to extract the relevant information from the image itself.</p>
          <p>For comparison, we performed a second run, called DET 2, which is equivalent to run DET 1 but excludes test images if they are retrieved by Open-i. For run DET 2 the mean F1 score decreased to 0.0162, which we consider a baseline result. The best results using the Open-i based approaches were obtained when using all the CUIs associated with the first retrieved image.</p>
          <p>Without using external resources, the results were poorer. One of the reasons could be that not all the CUIs in the test set were contained in the training and validation sets. Also, we only considered the most frequent CUIs in the training set. With CNNs, a mean F1 score of up to 0.0880 was achieved, and only 0.0012 when applying BR-DT (BR-DT detected at least one CUI for only 2,046 images).</p>
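          <p>For reference, the sketch below computes a mean F1 score over images from predicted and gold CUI sets, assuming the official measure averages the per-image F1 and that images with no correctly predicted CUI contribute zero.</p>
          <preformat>def mean_f1(predictions, gold):
    """predictions, gold: dicts mapping an image id to a set of CUIs."""
    scores = []
    for image_id, gold_cuis in gold.items():
        pred_cuis = predictions.get(image_id, set())
        tp = len(pred_cuis &amp; gold_cuis)   # correctly predicted CUIs
        if tp == 0:
            scores.append(0.0)
            continue
        precision = tp / len(pred_cuis)
        recall = tp / len(gold_cuis)
        scores.append(2 * precision * recall / (precision + recall))
    return sum(scores) / len(gold)

gold = {"img1": {"C0024485", "C0016911"}}
pred = {"img1": {"C0024485", "C0577559"}}
print(round(mean_f1(pred, gold), 4))  # 0.5</preformat>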
          <p>Table 2 also shows the performance of three hybrid methods: run DET 8, run DET 9 and run DET 10.</p>
        </sec>
        <sec id="sec-9-3-2">
          <title>Caption Prediction Results</title>
          <p>The best results were achieved by run PRED 1 using Open-i, with a 0.5634 mean BLEU score, and it was ranked first. As a baseline, we proposed run PRED 2, similar to run PRED 1 but without including test images if they are retrieved by Open-i. Run PRED 2 obtained a 0.2646 mean BLEU score and was the 4th best of the 34 runs submitted by the participating teams.</p>
          <p>CNN approaches achieved good results, with a 0.2247 mean BLEU score, despite the limited number of CUIs used for training and the simple UMLS-based patterns built for caption generation. Two hybrid methods were also presented: run PRED 5 and run PRED 6. In this subtask, run PRED 6 was ranked second.</p>
        </sec>
      </sec>
      <sec id="sec-9-4">
        <title>Conclusions</title>
        <p>This paper describes our participation in the ImageCLEF 2017 caption task. We proposed and compared different approaches for concept detection and caption prediction. Our retrieval methods using Open-i obtained the best results, with a 0.1718 mean F1 score in the concept detection subtask and a 0.5634 mean BLEU score in the caption prediction subtask. We also proposed baseline results by excluding test images if they are found by Open-i; the Open-i baseline was ranked 4th, with a 0.2646 mean BLEU score, in the caption prediction subtask.</p>
        <p>We also performed multi-label classification of CUIs with CNNs and BR-DT. Both methods used selected subsets of the training data. CNNs provided acceptable results considering the limited number of CUIs used for training; the CNN-based method achieved a 0.2247 mean BLEU score in the caption prediction subtask.</p>
        <p>Future improvements could address the Open-i based method, as it does not support images with panels; one option would be to perform panel segmentation before the search. Open-i also limits image size to 2 MB; a better approach would be to resize the image, if needed, before submitting it to the Open-i API. In addition, MetaMapLite provided CUIs that differ from the gold standard even when the labels retrieved by Open-i are correct. Moreover, we only used fusion to combine the results of our different methods for concept detection (the intersection gave very few CUIs). More sophisticated combination methods could be used to improve the results of the hybrid methods.</p>
      </sec>
      <sec id="sec-9-5">
        <title>Acknowledgments</title>
        <p>This research was supported by the Intramural Research Program of the National Institutes of Health (NIH), National Library of Medicine (NLM), and Lister Hill National Center for Biomedical Communications (LHNCBC).</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Eickhoff</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schwall</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          , García Seco de Herrera,
          <string-name>
            <surname>A.</surname>
          </string-name>
          , Muller, H.:
          <article-title>Overview of ImageCLEFcaption 2017 - the image caption prediction and concept extraction tasks to understand biomedical images</article-title>
          .
          <source>CLEF working notes</source>
          ,
          <source>CEUR</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Ionescu</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          , Muller, H.,
          <string-name>
            <surname>Villegas</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Arenas</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Boato</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dang-Nguyen</surname>
            ,
            <given-names>D.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dicente Cid</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Eickhoff</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>García Seco de Herrera</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gurrin</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Islam</surname>
          </string-name>
          , Bayzidul and,
          <string-name>
            <surname>K</surname>
          </string-name>
          .V.,
          <string-name>
            <surname>Liauchuk</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mothe</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Piras</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Riegler</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schwall</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Overview of ImageCLEF 2017: Information extraction from images</article-title>
          .
          <source>In: CLEF 2017 Proceedings. Lecture Notes in Computer Science</source>
          , Dublin, Ireland, Springer (September 11-14,
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Demner-Fushman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Antani</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Simpson</surname>
            ,
            <given-names>M.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thoma</surname>
            ,
            <given-names>G.R.</given-names>
          </string-name>
          :
          <article-title>Design and development of a multimodal biomedical information retrieval system</article-title>
          .
          <source>Journal of Computing Science and Engineering</source>
          <volume>6</volume>
          (
          <issue>2</issue>
          ) (
          <year>2012</year>
          )
          <fpage>168</fpage>
          -
          <lpage>177</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Cho</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shin</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Choy</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Do</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Medical image deep learning with hospital PACS dataset</article-title>
          .
          <source>CoRR abs/1511.06348</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Roth</surname>
            ,
            <given-names>H.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>C.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shin</surname>
            ,
            <given-names>H.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Se</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yao</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lu</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Summers</surname>
            ,
            <given-names>R.M.</given-names>
          </string-name>
          :
          <article-title>Anatomy-specific classification of medical images using deep convolutional nets</article-title>
          . In: ISBI, IEEE (
          <year>2015</year>
          )
          <fpage>101</fpage>
          -
          <lpage>104</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Havaei</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Davy</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Warde-Farley</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Biard</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Courville</surname>
            ,
            <given-names>A.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bengio</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pal</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jodoin</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Larochelle</surname>
          </string-name>
          , H.:
          <article-title>Brain tumor segmentation with deep neural networks</article-title>
          .
          <source>CoRR abs/1505.03540</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>7. Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S.E., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A.: Going deeper with convolutions. In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015, Boston, MA, USA, June 7-12, 2015. (2015) 1-9</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>8. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In Pereira, F., Burges, C.J.C., Bottou, L., Weinberger, K.Q., eds.: Advances in Neural Information Processing Systems 25. Curran Associates, Inc. (2012) 1097-1105</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>9. LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11) (November 1998) 2278-2324</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>10. Read, J., Reutemann, P., Pfahringer, B., Holmes, G.: MEKA: A multi-label/multi-target extension to Weka. Journal of Machine Learning Research 17(21) (2016) 1-5</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>11. Witten, I.H., Frank, E., Hall, M.A., Pal, C.J.: Data Mining: Practical machine learning tools and techniques. Morgan Kaufmann (2016)</mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>12. Tanaka, E.A., Nozawa, S.R., Macedo, A.A., Baranauskas, J.A.: A multi-label approach using binary relevance and decision trees applied to functional genomics. Journal of Biomedical Informatics 54 (2015) 85-95</mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>13. Chatzichristofis, S.A., Boutalis, Y.S.: CEDD: Color and edge directivity descriptor: A compact descriptor for image indexing and retrieval. In: Lecture Notes in Computer Science. Volume 5008. (2008) 312-322</mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>14. Chatzichristofis, S.A., Boutalis, Y.S.: FCTH: Fuzzy color and texture histogram: A low level feature for accurate image retrieval. In: Proceedings of the 9th International Workshop on Image Analysis for Multimedia Interactive Services. (2008) 191-196</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>