<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>The Medico-Task 2018: Disease Detection in the Gastrointestinal Tract using Global Features and Deep Learning</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Vajira Thambawita</string-name>
          <email>vajira@simula.no</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Debesh Jha</string-name>
          <email>debesh@simula.no</email>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michael Riegler</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pål Halvorsen</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hugo Lewi Hammer</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Håvard D. Johansen</string-name>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dag Johansen</string-name>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Oslo Metropolitan University</institution>
          ,
          <country country="NO">Norway</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Simula Metropolitan</institution>
          ,
          <country country="NO">Norway</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Simula Research Laboratory</institution>
          ,
          <country country="NO">Norway</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>University of Oslo</institution>
          ,
          <country country="NO">Norway</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>University of Tromsø</institution>
          ,
          <country country="NO">Norway</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2018</year>
      </pub-date>
      <fpage>29</fpage>
      <lpage>31</lpage>
      <abstract>
        <p>In this paper, we present our approach for the 2018 Medico Task of classifying diseases in the gastrointestinal tract. We propose a system based on global features and deep neural networks. The best approach combines two neural networks, and the reproducible experimental results demonstrate the efficiency of the proposed model with an accuracy of 95.80%, a precision of 95.87%, and an F1-score of 95.80%.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>
        Our main goal for the Medico Task [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] is to classify findings in
images from the Gastrointestinal (GI) tract. The task provides two
types of input data: Global Features (GFs) and original images.
The 2017 Medico Task consisted of a balanced dataset with only
8 classes [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], whereas the current task consists of a highly
imbalanced dataset with 16 classes [
        <xref ref-type="bibr" rid="ref11 ref12">11, 12</xref>
        ], which makes this year's task
more complicated. Different approaches were used in last
year's Medico Task [
        <xref ref-type="bibr" rid="ref10 ref14 ref17 ref5 ref7 ref9">5, 7, 9, 10, 14, 17</xref>
        ], based on GF extraction and
Convolutional Neural Network (CNN) methods. We extend
these solutions and present approaches based on both GFs and
transfer learning with CNNs. We achieve our best results by
combining two CNNs and using an extra multilayer perceptron to
merge the outputs of the two networks.
      </p>
    </sec>
    <sec id="sec-2">
      <title>APPROACHES</title>
      <p>We approach the problem of GI tract disease detection with small
training datasets using five different methods: two based on GF
extraction and three based on CNNs with transfer learning, as described
below.</p>
    </sec>
    <sec id="sec-3">
      <title>Global-feature-based approaches</title>
      <p>
        Method 1 and Method 2 use the concept of GFs. For the extraction
of GFs, we use Lucene Image Retrieval (LIRE) [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. GFs are easy and
fast to calculate, and can also be used for image comparison, image
collection search and distance computing [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. Based on [
        <xref ref-type="bibr" rid="ref13 ref16">13, 16</xref>
        ],
we use Joint Composite Descriptor (JCD), Tamura, Color Layout, Edge
Histogram, Auto Color Correlogram and Pyramid Histogram of
Oriented Gradients (PHOG). These features represent the overall
properties of the images. Adding more GFs is possible, but it may
add redundant information, which can reduce the overall
classification performance.
      </p>
      <p>The extracted features are sent to different machine learning
classifiers for multi-class classification. Method 1 sends the
extracted GFs to the SimpleLogistic (SL) classifier. Method 2
inputs the same selected set of features to the logistic model tree
(LMT) classifier.</p>
    </sec>
    <sec id="sec-4">
      <title>Transfer learning based approaches</title>
      <p>
        Our CNN approaches use transfer learning with
models pre-trained on the ImageNet dataset [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. Resnet-152 [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and
Densenet-161 [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] were selected based
on the top-1 and top-5 error rates of the pre-trained networks in the
PyTorch [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] deep learning framework.
      </p>
      <p>
        One of the main problems of the given dataset is the "out-of-patient"
category, which has only four images, while the other classes
have considerably more. The colour distribution of this class
lies in a completely different colour domain compared to the other
categories. We identified this difference via manual investigation
of the dataset and moved all four images of this category into the
corresponding validation set folder. The training set folder
was then filled with random Google images that are not related to the
GI tract. To avoid training getting stuck in a local
minimum, we use the stochastic gradient descent [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] method with
dynamic learning rate scheduling. The losses (loss 1 and loss 2
in Figure 1) of the CNN methods were calculated for each network
separately. Additionally, horizontal flips, vertical flips, rotations
and resizing were applied as data augmentations to reduce
over-fitting.
      </p>
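      <p>The flip and rotation augmentations mentioned above can be illustrated with a minimal sketch in plain Python, where an image is a nested list of pixel values. The function names, the 0.5 application probability and the restriction to 90-degree rotations are assumptions made for illustration only; they are not taken from our implementation, and resizing is omitted.</p>
      <preformat><![CDATA[
```python
import random


def horizontal_flip(image):
    """Mirror every row left-to-right."""
    return [row[::-1] for row in image]


def vertical_flip(image):
    """Mirror the image top-to-bottom by reversing the row order."""
    return [row[:] for row in image[::-1]]


def rotate_90(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(column) for column in zip(*image[::-1])]


def augment(image, rng):
    """Apply each transform with probability 0.5 (an assumed value)."""
    for transform in (horizontal_flip, vertical_flip, rotate_90):
        if rng.random() < 0.5:
            image = transform(image)
    return image


# A toy 2x2 "image" standing in for a GI tract frame.
image = [[1, 2],
         [3, 4]]
augmented = augment(image, random.Random(0))
```
]]></preformat>
      <p>Each call to augment returns a randomly transformed copy, so the network sees a different variant of the same training image in every epoch.</p>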
      <p>Method 3 uses transfer learning with Resnet-152, which has the
best top-1 and top-5 error rates. The last fully connected layer of
Resnet-152, originally designed to classify the 1000 classes of
the ImageNet dataset, was replaced to classify the 16 classes of
the Medico Task. Transfer learning usually freezes the pre-trained
layers to avoid back-propagating the large errors caused
by newly added layers with random weights. However, we did not
freeze the pre-trained layers, because modifying only the last layer
cannot propagate huge errors backwards. The
network was trained until it reached its maximum
accuracy on the validation dataset.</p>
      <p>[Figure 1: network architecture diagram; recoverable labels: Resnet-152, loss 1, loss 2, Base Network.]</p>
      <p>Method 4 extends Method 3 by using two parallel pre-trained
models, Resnet-152 and Densenet-161, to reach a cumulative decision
at the end, as depicted in Figure 1. The classification is based on the
average of the two output probability vectors. Calculating and
propagating a single loss value for the weight updates, however,
prevents the weights of Resnet-152 and Densenet-161 from being
updated separately, as they require. Therefore, we
calculated two different loss values (loss 1 and loss 2 in Figure 1),
one from each network, to update their weights separately. Both
networks were trained simultaneously until they reached the best
validation accuracy, with hyper-parameters tuned manually.</p>
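      <p>The decision rule of Method 4 can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not our PyTorch code: the three-class logits are invented toy values standing in for the 16-class outputs of the two networks, and the helper names softmax, cross_entropy and average_ensemble are hypothetical.</p>
      <preformat><![CDATA[
```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target):
    """Negative log-likelihood of the true class index."""
    return -math.log(probs[target])


def average_ensemble(probs_a, probs_b):
    """Element-wise mean of two probability vectors."""
    return [(a + b) / 2 for a, b in zip(probs_a, probs_b)]


# Toy logits (assumed values) for one image whose true class is 0.
logits_resnet = [2.0, 0.5, 0.1]
logits_densenet = [0.9, 1.2, 0.2]
target = 0

probs_resnet = softmax(logits_resnet)
probs_densenet = softmax(logits_densenet)

# One loss per network, so each network is updated from its own error.
loss_1 = cross_entropy(probs_resnet, target)
loss_2 = cross_entropy(probs_densenet, target)

# The final class is taken from the averaged probability vector.
combined = average_ensemble(probs_resnet, probs_densenet)
prediction = max(range(len(combined)), key=combined.__getitem__)
```
]]></preformat>
      <p>In this toy case the second network alone would choose class 1, while the averaged vector selects the correct class 0, which is the kind of cumulative decision the method relies on.</p>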
      <p>Method 5 was constructed to overcome the limitation of
averaging the probabilistic outputs of the two networks
used in Method 4. Instead of calculating the average with a
simple mathematical formula, a multilayer perceptron (MLP)
is attached to the above network to learn a more complex
function for the cumulative decision, as illustrated in
Figure 1. We passed the probability outputs of the two
networks (16 probabilities from each network) to a new MLP with 32
inputs, 16 outputs (via a sigmoid layer) and one hidden layer with
32 units. Here, we used the Resnet-152 and Densenet-161 networks
already trained on the dataset and froze them before training the MLP. Then,
we trained only the MLP to identify the best function
for the cumulative decision.</p>
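      <p>The shape of the fusion network in Method 5 can be sketched as a forward pass in plain Python. Only the layer sizes (32 inputs, 32 hidden units, 16 sigmoid outputs) come from the description above; the ReLU hidden activation, the random untrained weights and all function names are assumptions for illustration.</p>
      <preformat><![CDATA[
```python
import math
import random

NUM_CLASSES = 16


def make_layer(n_in, n_out, rng):
    """Return (weights, biases) with small random initial weights."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def linear(layer, inputs):
    """Affine transform: one weighted sum plus bias per output unit."""
    weights, biases = layer
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def sigmoid(values):
    return [1.0 / (1.0 + math.exp(-v)) for v in values]


def fusion_mlp(probs_a, probs_b, hidden, output):
    """Concatenate both 16-way outputs and map 32 -> 32 -> 16."""
    features = list(probs_a) + list(probs_b)
    return sigmoid(linear(output, relu(linear(hidden, features))))


rng = random.Random(0)
hidden_layer = make_layer(2 * NUM_CLASSES, 32, rng)
output_layer = make_layer(32, NUM_CLASSES, rng)

# Uniform probability vectors stand in for the two frozen networks' outputs.
uniform = [1.0 / NUM_CLASSES] * NUM_CLASSES
scores = fusion_mlp(uniform, uniform, hidden_layer, output_layer)
```
]]></preformat>
      <p>Because the two CNNs are frozen, only the weights held in hidden_layer and output_layer would be updated during training in such a setup.</p>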
    </sec>
    <sec id="sec-5">
      <title>3 RESULTS AND ANALYSIS</title>
      <p>
        We divided the development dataset into a training set (70%)
and a validation set (30%). For the GF-based approaches, ensembles of
the six extracted GFs were fed to all the available machine learning
classifiers (with different parameters) using the WEKA [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] library. The
SL and LMT classifiers outperformed all other available classifiers on
the dataset. Other promising classifiers were Sequential Minimal
Optimization (RBF kernel) and a combination of PCA with the LibSVM
(RBF) classifier.
      </p>
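      <p>A 70%/30% split of the kind described above could be produced as in the following sketch. It assumes a plain random split with a fixed seed; whether the split should be done per class is not specified here, and the function name and file names are hypothetical.</p>
      <preformat><![CDATA[
```python
import random


def split_dataset(items, train_fraction=0.7, seed=0):
    """Shuffle a copy of items and split it into training and validation parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical file names standing in for the development images.
image_names = ["img_%03d.jpg" % i for i in range(10)]
train_set, val_set = split_dataset(image_names)
```
]]></preformat>
      <p>Fixing the seed keeps the split reproducible, so that all five methods are compared on the same validation images.</p>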
      <p>On the validation set, all the CNN methods (3-5) show accuracies of
around 95% and specificities of around 99%. These are consistently better
than the GF-based methods (1, 2), which have accuracies
of around 82% and specificities of around 98%. According to the
task organizers’ evaluation on the test dataset, Methods 3
to 5 again show accuracies and specificities of around 99%, which
demonstrates that our CNN methods are not overfitted to the validation
dataset.</p>
      <p>Methods 4 and 5, which use both Resnet-152 and Densenet-161, perform
better than Method 3, which uses only Resnet-152, because
they base the final answer on the two answers
generated by two deep learning networks. However, a
cumulative decision based on a simple averaging function (Method
4) performs worse than the decision taken by an MLP
(Method 5): Method 5 increases the accuracy
from 0.955 to 0.958 compared to Method 4. Therefore, Method
5 was selected as our best method, and its confusion matrix
is shown in Table 1. An overview of the individual
results obtained from the five different experiments, along with their
performance metrics, is presented in Table 2. The results obtained from
the organizers for the test dataset are presented in Table 3.</p>
      <p>Class labels used in Table 1: A: blurry-nothing, B: colon-clear, C: dyed-lifted-polyps, D: dyed-resection-margins,
E: esophagitis, F: instruments, G: normal-cecum, H: normal-pylorus, I: normal-z-line,
J: out-of-patient, K: polyps, L: retroflex-rectum, M: retroflex-stomach, N: stool-inclusions,
O: stool-plenty, P: ulcerative-colitis.</p>
      <p>The most notable point in the confusion matrix in Table 1
is the misclassification between categories E: esophagitis and I:
normal-z-line. A large number of misclassifications (about 30 images from
the validation set) occurred, and we investigated the images manually
to identify the reason. We noticed that the images of these two
categories are very similar to each other because of their close
location in the GI tract, and distinguishing them is also a challenge for
physicians.</p>
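      <p>The per-class metrics and the most confused pair of classes can be derived from a confusion matrix as in the sketch below. The 3x3 matrix is invented toy data, not the values of Table 1, and the convention that rows are true labels and columns are predictions is an assumption; the function names are hypothetical.</p>
      <preformat><![CDATA[
```python
def per_class_metrics(matrix, k):
    """One-vs-rest metrics for class k (rows: true labels, columns: predictions)."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp
    fp = sum(matrix[r][k] for r in range(n)) - tp
    tn = total - tp - fn - fp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall,
            "specificity": specificity, "f1": f1}


def most_confused_pair(matrix):
    """Return (count, i, j) for the class pair with the most mutual errors."""
    n = len(matrix)
    return max((matrix[i][j] + matrix[j][i], i, j)
               for i in range(n) for j in range(i + 1, n))


# Toy confusion matrix (assumed values): classes 0 and 1 are often confused.
toy = [
    [8, 2, 0],
    [3, 7, 0],
    [0, 0, 10],
]
metrics_class_0 = per_class_metrics(toy, 0)
worst = most_confused_pair(toy)
```
]]></preformat>
      <p>Applied to a 16x16 matrix, most_confused_pair would point to the pair with the largest number of mutual errors, which in our validation results is E and I.</p>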
    </sec>
    <sec id="sec-6">
      <title>4 CONCLUSION</title>
      <p>In this paper, we presented five different methods for the multi-class
classification of GI tract diseases. The proposed approaches are based
on GFs and on pre-trained CNNs with transfer
learning. The combination of Resnet-152 and Densenet-161 with an
additional MLP achieved the highest performance on both the
validation dataset and the test dataset provided by the task
organizers. We show that a combination of deep neural models
pre-trained on ImageNet classifies images into the
correct classes more reliably because of its cumulative decision-making. For
future work, we will combine deeper CNNs in parallel to add more
cumulative decision-making capabilities for classifying multi-class
objects. In addition, Generative Adversarial Network (GAN)
methods can be used to handle imbalanced datasets by generating
more data to train deep neural networks.</p>
      <p>Medico: The 2018 Multimedia for Medicine Task</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Ian</given-names>
            <surname>Goodfellow</surname>
          </string-name>
          , Yoshua Bengio, and
          <string-name>
            <given-names>Aaron</given-names>
            <surname>Courville</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Deep learning</article-title>
          .
          <source>Vol. 1</source>
          . MIT Press, Cambridge.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Mark</given-names>
            <surname>Hall</surname>
          </string-name>
          , Eibe Frank, Geoffrey Holmes, Bernhard Pfahringer,
          <string-name>
            <given-names>Peter</given-names>
            <surname>Reutemann</surname>
          </string-name>
          , and
          <string-name>
            <surname>Ian H Witten</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>The WEKA data mining software: an update</article-title>
          .
          <source>ACM SIGKDD explorations newsletter (SIGKDD Explor. Newsl.) 11</source>
          ,
          <issue>1</issue>
          (
          <year>2009</year>
          ),
          <fpage>10</fpage>
          -
          <lpage>18</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Kaiming</given-names>
            <surname>He</surname>
          </string-name>
          , Xiangyu Zhang, Shaoqing Ren, and
          <string-name>
            <given-names>Jian</given-names>
            <surname>Sun</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Deep residual learning for image recognition</article-title>
          .
          <source>In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR)</source>
          .
          <fpage>770</fpage>
          -
          <lpage>778</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Gao</given-names>
            <surname>Huang</surname>
          </string-name>
          , Zhuang Liu,
          <string-name>
            <surname>Laurens Van Der Maaten</surname>
          </string-name>
          , and
          <string-name>
            <surname>Kilian Q Weinberger</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Densely Connected Convolutional Networks</article-title>
          .
          <source>In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</source>
          .
          <fpage>2261</fpage>
          -
          <lpage>2269</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Yang</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Zhonglei</given-names>
            <surname>Gu</surname>
          </string-name>
          , and William K Cheung.
          <year>2017</year>
          .
          <article-title>HKBU at MediaEval 2017 Medico: Medical multimedia task</article-title>
          .
          <source>In Working Notes Proceedings of the MediaEval 2017 Workshop (MediaEval</source>
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Mathias</given-names>
            <surname>Lux</surname>
          </string-name>
          , Michael Riegler, Pål Halvorsen, Konstantin Pogorelov, and
          <string-name>
            <given-names>Nektarios</given-names>
            <surname>Anagnostopoulos</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>LIRE: open source visual information retrieval</article-title>
          .
          <source>In Proceedings of the 7th International Conference on Multimedia Systems (MMSys)</source>
          .
          <source>ACM</source>
          ,
          <volume>30</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Syed</given-names>
            <surname>Sadiq Ali Naqvi</surname>
          </string-name>
          , Shees Nadeem, Muhammad Zaid, and Muhammad Atif Tahir.
          <year>2017</year>
          .
          <article-title>Ensemble of Texture Features for Finding Abnormalities in the Gastro-Intestinal Tract</article-title>
          .
          <source>In Working Notes Proceedings of the MediaEval 2017 Workshop (MediaEval</source>
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Adam</given-names>
            <surname>Paszke</surname>
          </string-name>
          , Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang,
          <string-name>
            <given-names>Zachary</given-names>
            <surname>DeVito</surname>
          </string-name>
          , Zeming Lin, Alban Desmaison, Luca Antiga, and
          <string-name>
            <given-names>Adam</given-names>
            <surname>Lerer</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Automatic differentiation in PyTorch</article-title>
          .
          <source>In Proceedings of 31st Conference on Neural Information Processing Systems (NIPS).</source>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Stefan</given-names>
            <surname>Petscharnig</surname>
          </string-name>
          and
          <string-name>
            <given-names>Klaus</given-names>
            <surname>Schöffmann</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Learning laparoscopic video shot classification for gynecological surgery</article-title>
          .
          <source>Multimedia Tools and Applications</source>
          <volume>77</volume>
          ,
          <issue>7</issue>
          (
          <year>2018</year>
          ),
          <fpage>8061</fpage>
          -
          <lpage>8079</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Stefan</given-names>
            <surname>Petscharnig</surname>
          </string-name>
          , Klaus Schöffmann, and
          <string-name>
            <given-names>Mathias</given-names>
            <surname>Lux</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>An Inception-like CNN Architecture for GI Disease and Anatomical Landmark Classification</article-title>
          .
          <source>In Working Notes Proceedings of the MediaEval 2017 Workshop (MediaEval</source>
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Konstantin</given-names>
            <surname>Pogorelov</surname>
          </string-name>
          , Kristin Ranheim Randel, Thomas de Lange, Sigrun Losada Eskeland, Carsten Griwodz, Dag Johansen, Concetto Spampinato, Mario Taschwer, Mathias Lux, Peter Thelin Schmidt,
          <string-name>
            <given-names>Michael</given-names>
            <surname>Riegler</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Pål</given-names>
            <surname>Halvorsen</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Nerthus: A Bowel Preparation Quality Video Dataset</article-title>
          .
          <source>In Proceedings of the 8th ACM on Multimedia Systems Conference (MMSYS)</source>
          .
          <source>ACM</source>
          ,
          <fpage>170</fpage>
          -
          <lpage>174</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Konstantin</given-names>
            <surname>Pogorelov</surname>
          </string-name>
          , Kristin Ranheim Randel, Carsten Griwodz, Sigrun Losada Eskeland, Thomas de Lange, Dag Johansen, Concetto Spampinato,
          <string-name>
            <given-names>Duc-Tien</given-names>
            <surname>Dang-Nguyen</surname>
          </string-name>
          , Mathias Lux, Peter Thelin Schmidt, and others.
          <year>2017</year>
          .
          <article-title>Kvasir: A multi-class image dataset for computer aided gastrointestinal disease detection</article-title>
          .
          <source>In Proceedings of the 8th ACM on Multimedia Systems Conference (MMSYS)</source>
          .
          <source>ACM</source>
          ,
          <fpage>164</fpage>
          -
          <lpage>169</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Konstantin</given-names>
            <surname>Pogorelov</surname>
          </string-name>
          , Michael Riegler, Sigrun Losada Eskeland, Thomas de Lange, Dag Johansen, Carsten Griwodz, Peter Thelin Schmidt, and
          <string-name>
            <given-names>Pål</given-names>
            <surname>Halvorsen</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Efficient disease detection in gastrointestinal videos-global features versus neural networks</article-title>
          .
          <source>Multimedia Tools and Applications</source>
          <volume>76</volume>
          ,
          <issue>21</issue>
          (
          <year>2017</year>
          ),
          <fpage>22493</fpage>
          -
          <lpage>22525</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Konstantin</given-names>
            <surname>Pogorelov</surname>
          </string-name>
          , Michael Riegler, Pål Halvorsen, Carsten Griwodz, Thomas de Lange, Kristin Ranheim Randel, Sigrun Eskeland, Duc-Tien Dang-Nguyen, Olga Ostroukhova, and others.
          <year>2017</year>
          .
          <article-title>A comparison of deep learning with global features for gastrointestinal disease detection</article-title>
          .
          <source>In Working Notes Proceedings of the MediaEval 2017 Workshop (MediaEval</source>
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Konstantin</given-names>
            <surname>Pogorelov</surname>
          </string-name>
          , Michael Riegler, Pål Halvorsen, Thomas De Lange, Kristin Ranheim Randel, Duc-Tien Dang-Nguyen, Mathias Lux, and
          <string-name>
            <given-names>Olga</given-names>
            <surname>Ostroukhova</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Medico Multimedia Task at MediaEval 2018</article-title>
          .
          <source>In Working Notes Proceedings of the MediaEval 2018 Workshop.</source>
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Michael</given-names>
            <surname>Riegler</surname>
          </string-name>
          , Konstantin Pogorelov, Sigrun Losada Eskeland, Peter Thelin Schmidt, Zeno Albisser, Dag Johansen, Carsten Griwodz, Pål Halvorsen, and Thomas De Lange.
          <year>2017</year>
          .
          <article-title>From annotation to computer-aided diagnosis: Detailed evaluation of a medical multimedia system</article-title>
          .
          <source>ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM)</source>
          <volume>13</volume>
          ,
          <issue>3</issue>
          (
          <year>2017</year>
          ),
          <fpage>26</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>Michael</given-names>
            <surname>Riegler</surname>
          </string-name>
          , Konstantin Pogorelov, Pål Halvorsen, Carsten Griwodz, Thomas de Lange, Kristin Ranheim Randel, Sigrun Eskeland, Duc-Tien Dang-Nguyen, Mathias Lux, and others.
          <year>2017</year>
          .
          <article-title>Multimedia for medicine: the Medico Task at MediaEval 2017</article-title>
          .
          <source>In Working Notes Proceedings of the MediaEval 2017 Workshop (MediaEval</source>
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Olga</given-names>
            <surname>Russakovsky</surname>
          </string-name>
          , Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla,
          <string-name>
            <given-names>Michael</given-names>
            <surname>Bernstein</surname>
          </string-name>
          , and others.
          <year>2015</year>
          .
          <article-title>ImageNet Large Scale Visual Recognition Challenge</article-title>
          .
          <source>International Journal of Computer Vision</source>
          (IJCV) (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>