<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Comparison of Deep Learning with Global Features for Gastrointestinal Disease Detection</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Konstantin Pogorelov</string-name>
          <email>konstantin@simula.no</email>
          <xref ref-type="aff" rid="aff3">3</xref>
          <xref ref-type="aff" rid="aff6">6</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michael Riegler</string-name>
          <email>michael@simula.no</email>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pål Halvorsen</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
          <xref ref-type="aff" rid="aff6">6</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Carsten Griwodz</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
          <xref ref-type="aff" rid="aff6">6</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Thomas de Lange</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kristin Ranheim Randel</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff6">6</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sigrun Losada Eskeland</string-name>
          <xref ref-type="aff" rid="aff7">7</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Duc-Tien Dang-Nguyen</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Olga Ostroukhova</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mathias Lux</string-name>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Concetto Spampinato</string-name>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Cancer Registry of Norway</institution>
          ,
          <country country="NO">Norway</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Dublin City University</institution>
          ,
          <country country="IE">Ireland</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Research Institute of Multiprocessor Computation Systems n.a. A.V. Kalyaev</institution>
          ,
          <country country="RU">Russia</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Simula Research Laboratory</institution>
          ,
          <country country="NO">Norway</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>University of Catania</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff5">
          <label>5</label>
          <institution>University of Klagenfurt</institution>
          ,
          <country country="AT">Austria</country>
        </aff>
        <aff id="aff6">
          <label>6</label>
          <institution>University of Oslo</institution>
          ,
          <country country="NO">Norway</country>
        </aff>
        <aff id="aff7">
          <label>7</label>
          <institution>Vestre Viken Hospital Trust</institution>
          ,
          <country country="NO">Norway</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2017</year>
      </pub-date>
      <fpage>13</fpage>
      <lpage>15</lpage>
      <abstract>
        <p>This paper presents our approach for the 2017 Multimedia for Medicine Medico Task of the MediaEval 2017 Benchmark. We propose a system based on global features and deep neural networks, and preliminary results comparing the approaches are presented.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>
        Following the initiative to investigate how multimedia can improve
medical systems [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], the 2017 Multimedia for Medicine Medico
Task [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] addresses the challenge of detecting diseases based on
multimedia data collected in hospitals [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], i.e., the task focuses
on detecting abnormalities, diseases and anatomical landmarks
in images in the gastrointestinal (GI) tract. Some proposals
already exist in this area, using various approaches [
        <xref ref-type="bibr" rid="ref20 ref21">20, 21</xref>
        ], and in this
paper, we describe our solutions, based on both our
global-features-based and neural-network-based EIR prototypes [
        <xref ref-type="bibr" rid="ref12 ref14 ref16 ref17">12, 14, 16, 17</xref>
        ].
      </p>
    </sec>
    <sec id="sec-2">
      <title>CLASSIFICATION APPROACHES</title>
      <p>
        The proposed approaches are based on the hypothesis that GI tract
diseases and findings can be recognized and classified based on
color, shape and texture properties. In this challenge, no
detailed ground-truth ROIs are provided for the training dataset; thus,
existing and well-performing approaches to object
recognition are not suitable for this particular task. Moreover, a relatively
small amount of training data is provided, making it difficult to use
modern convolutional neural network (CNN) image segmentation
and region-based classification approaches. Furthermore, some
objects like polyps and resection margins have a compact body and
can be easily differentiated from the surrounding tissue, but other
findings like ulcerative colitis only show tissue with slightly
different color properties. To address these different detection challenges,
we present 17 different approaches that implement our idea of using
visual properties of images for performing multi-class classification
with the limited training set size. For the final classification step,
we use the WEKA machine learning support library [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] which is an
open source collection of algorithms for machine learning and data
mining. For all the approaches based on global features (GFs), we
use Lucene Image Retrieval (LIRE) [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], an open source
implementation of global and local feature extraction and comparison. For
all the deep-learning-based approaches, we use Keras [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], an open
source high-level neural networks API with Google Tensorflow [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]
as a computational back-end.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Global-features-based</title>
      <p>
For the GF-based approaches, we use features that represent the
overall visual properties of an image. They are easy and fast to calculate,
and they can be used for image comparison, distance computation
and image collection search. Here, we use indexes of visual
features extracted from the training image set. A classifier is used to
search the index for the image that is most similar to a given
input image. The GFs we use are JCD, Tamura, Color Layout, Edge
Histogram, Auto Color Correlogram and Pyramid Histogram of
Oriented Gradients [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. We chose this combination based
on our previous findings and experiments in [
        <xref ref-type="bibr" rid="ref14 ref16">14, 16</xref>
        ]. Multi-class
classification is implemented as an additional classification step
to determine the final image class based on the ranked lists
of a search-based classifier for each class of findings. We use the
random tree (RT), random forest (RF) and logistic model tree (LMT)
classifiers [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] from WEKA.
      </p>
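      <p>The search-based classification step can be sketched as follows. This is a minimal sketch with synthetic data: the random vectors stand in for the LIRE-extracted global features, and a simple majority vote stands in for the WEKA classification step; all names and sizes are illustrative.</p>

```python
import numpy as np

# Synthetic stand-ins for global-feature vectors (e.g., JCD, Tamura)
# extracted from the indexed training images; in the real system these
# come from LIRE, here they are random for illustration.
rng = np.random.default_rng(0)
index_features = rng.normal(size=(100, 32))   # 100 indexed images
index_labels = rng.integers(0, 8, size=100)   # 8 finding classes

def search_based_classify(query, features, labels, k=5):
    """Rank indexed images by distance to the query and decide the
    class from the top of the ranked list (majority vote here; the
    actual system feeds such ranked lists into RT/RF/LMT classifiers)."""
    distances = np.linalg.norm(features - query, axis=1)
    ranked = np.argsort(distances)        # most similar images first
    top_labels = labels[ranked[:k]]
    return np.bincount(top_labels).argmax()

# A query identical to an indexed image retrieves itself first:
predicted = search_based_classify(index_features[0], index_features,
                                  index_labels, k=1)
```

      <p>In the full pipeline, one such ranked list is produced per class of findings, and the final decision is left to the WEKA classifiers rather than the majority vote shown here.</p>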
    </sec>
    <sec id="sec-4">
      <title>Deep-features-based</title>
      <p>
For the deep-features-based approaches, we use a combined method:
deep neural networks for image recognition serve as feature
extractors, and a machine-learning classifier uses the extracted
deep features as input for multi-class classification. We use the Inception v3 [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]
and ResNet50 [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] models pre-trained on a set of general images.
The models were modified to produce numerical
probability output for all recognized object classes. Then, we use the class
(concept) probabilities (1,000 values for both networks) directly in
the Concepts runs. For the Features runs, we used the same
pre-trained models without the fully-connected layer
at the top of the network, which gives us an output of high-level
feature values (16,384 values for Inception v3 and 2,048 for
ResNet50). Finally, we combine the values by simple early
fusion into one big vector of floating point numbers and use it as
input for the same classifiers we used in the GF-based approaches.
      </p>
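      <p>The early-fusion step amounts to a simple concatenation. In the sketch below, the random vectors are stand-ins for the per-image outputs of the two networks (1,000 concept probabilities each in the Concepts runs); the names are illustrative.</p>

```python
import numpy as np

# Random stand-ins for the per-image outputs of the two modified
# networks (1,000 concept probabilities each in the Concepts runs).
rng = np.random.default_rng(0)
inception_out = rng.random(1000)
resnet_out = rng.random(1000)

def early_fuse(*vectors):
    """Simple early fusion: concatenate all per-image vectors into one
    big floating-point feature vector, which is then fed to the same
    classifiers as in the GF-based approaches."""
    return np.concatenate(vectors)

fused = early_fuse(inception_out, resnet_out)  # one vector of 2,000 values
```
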
    </sec>
    <sec id="sec-5">
      <title>CNN-based</title>
      <p>
For the CNN-based approach, we created and trained a custom
CNN from scratch. Our CNN consists of six convolution layers. As
the activation function, we used the rectified linear unit (ReLU) [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]
and max pooling for pooling. In all the layers, we also included a
0.5 dropout, and the final classification step was performed using
two dense layers, with ReLU and then sigmoid as activation
functions. The network was trained for 200 epochs using the
Adam optimizer [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
    </sec>
    <sec id="sec-6">
      <title>Transfer-learning-based</title>
      <p>
For the transfer-learning-based (TFL) approach, we use the
pre-trained Inception v3 [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] model and transfer learning technique [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]
to train the network on our specific training set. We re-trained
the base model and fine-tuned the last layers on the training set
following the DeCAF approach [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. We did not perform complex
data augmentation and only relied on transfer learning. We froze
all the basic convolutional layers of the network and only retrained
the two top dense layers. The dense layers were retrained using the
RMSprop [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] optimizer that allows an adaptive learning rate during
the training process. After 1,000 epochs, we stopped the retraining
of the dense layers and started fine-tuning the convolutional layers.
For that step, we analyzed the Inception v3 model's
layer structure and decided to apply the fine-tuning to the top two
convolutional layers. For this training step, we used a stochastic
gradient descent method with a low learning rate to achieve the
best effect in terms of speed and accuracy [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>
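      <p>The two-phase schedule (train the top with the base frozen, then unfreeze and fine-tune) can be illustrated on a toy network. In the sketch below, the matrix W1 plays the role of the frozen convolutional base and W2 the retrained dense top; the data, sizes and learning rates are purely illustrative, not the actual Inception v3 setup.</p>

```python
import numpy as np

# Toy stand-in for the two-phase transfer-learning schedule.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 8))
y = (X @ rng.normal(size=8) > 0).astype(float)

W1 = rng.normal(size=(8, 4)) * 0.5   # "pre-trained base" layer (frozen first)
W2 = rng.normal(size=(4, 1)) * 0.5   # "top" layer (retrained)

def forward(X):
    h = np.maximum(X @ W1, 0)            # ReLU hidden activations
    p = 1 / (1 + np.exp(-(h @ W2)))      # sigmoid output
    return h, p

def train(epochs, lr, update_base):
    global W1, W2
    for _ in range(epochs):
        h, p = forward(X)
        err = p - y[:, None]             # gradient of the loss w.r.t. logits
        W2 -= lr * (h.T @ err) / len(X)
        if update_base:                  # only once the base is unfrozen
            gh = (err @ W2.T) * (h > 0)
            W1 -= lr * (X.T @ gh) / len(X)

W1_frozen = W1.copy()
train(epochs=100, lr=0.5, update_base=False)   # phase 1: top layer only
assert np.allclose(W1, W1_frozen)              # base weights untouched
train(epochs=100, lr=0.1, update_base=True)    # phase 2: fine-tune the base
```

      <p>The assertion after the first phase mirrors the frozen-base requirement; in a Keras implementation this is expressed with the layers' trainable flags rather than by skipping manual updates.</p>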
    </sec>
    <sec id="sec-7">
      <title>EXPERIMENTAL RESULTS</title>
      <p>
First, we performed an initial evaluation of the approaches
using only the development dataset, randomly splitting it into new
training and test sets with an equal number of 2,000 images in each.
We assessed 17 different methods executed in 17 internal runs using
the newly generated sets. An overview of the conducted internal runs
can be found in Table 1, where we provide the measured performance
metrics [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. We can see that not all our approaches perform
efficiently on the given dataset. In general, we can conclude that for
all the machine-learning-based classification approaches, the LMT
classifier performs the best, the RF classifier is slightly worse,
and the RT classifier performs the worst. The 6 Layers CNN and
Inception v3 TFL approaches perform with comparable
precision, but Inception v3 TFL has slightly better results. The Inception
v3 Concepts and ResNet50 Concepts approaches also perform with
comparable precision, but the ResNet50 Concepts approaches
perform slightly better. The Inception v3 Features approaches
perform the worst compared to all other features-based approaches,
even for the efficient LMT classifier, which can be caused by the
huge feature vector generated by the Inception v3 network.
Finally, the best performing approach is the ResNet50 Features
approach with the LMT classifier, showing a performance of 0.828
for RK and 0.856 for F1 score.
      </p>
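      <p>For reference, the two reported measures can be computed from a confusion matrix as follows. The sketch assumes the common definitions of macro-averaged F1 and the multi-class Matthews correlation (Gorodkin's Rk), which we take to be the RK measure used by the task; the example matrix is synthetic.</p>

```python
import numpy as np

def macro_f1(cm):
    """Macro-averaged F1 from a confusion matrix (rows: true class,
    columns: predicted class)."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    precision = tp / np.maximum(cm.sum(axis=0), 1e-12)
    recall = tp / np.maximum(cm.sum(axis=1), 1e-12)
    f1 = 2 * precision * recall / np.maximum(precision + recall, 1e-12)
    return f1.mean()

def rk(cm):
    """Multi-class Matthews correlation (Gorodkin's Rk) from a
    confusion matrix."""
    cm = np.asarray(cm, dtype=float)
    s, c = cm.sum(), np.trace(cm)
    t = cm.sum(axis=1)   # per-class totals in the ground truth
    p = cm.sum(axis=0)   # per-class totals in the predictions
    num = c * s - t @ p
    den = np.sqrt(s**2 - p @ p) * np.sqrt(s**2 - t @ t)
    return num / den if den else 0.0

# Synthetic example: 8 classes, every image classified correctly.
perfect = np.eye(8) * 250
```
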
      <p>Based on the initial evaluation, we have selected the five diferent
approaches for the oficial competition submission. The approaches
selected (see table 2) are the best performing in the internal runs
while keeping as much diversity of the methods as possible. The
oficial evaluation results provided by the organizers is presented in
table 2. The best performing approach is again the ResNet50 Features
approach with the LMT classifier (run #4) with the RK value of
0.802 and F1 score of 0.826. The confusion matrix of this run is
presented in table 3. The often miss-classified classes are Esophagitis
and Z-line that is caused by the nature of the used visual features.
Both of these classes consist of pictures of Z-Line, but Esophagitis
In this paper, we presented 17 diferent combined approaches
designed for multi-class classification of medical imaging data with
a limited training dataset. We presented a novel comparison of
the performance of various visual-features-based methods with
a traditional custom CNN and Inception v3 with
transfer-learning-based approaches. We used modified Inception v3 and ResNet50
networks and the LIRE library for feature extraction, with
machine-learning classification algorithms from WEKA. Despite
the limited training dataset and the presence of visually similar image
classes, we achieved a good multi-class classification performance
with an RK value of 0.802 and a classification speed of 46 frames
per second. In future research, we will investigate a
combined approach with the fusion of multiple deep-network-based
feature extractors for initial coarse image classification, together
with fine-tuned local-feature-based sub-classification for
efficient cross-class detection between visually similar images.</p>
    </sec>
    <sec id="sec-8">
      <title>ACKNOWLEDGMENTS</title>
      <p>This work is funded by the FRINATEK project "EONS" #231687.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Martín</given-names>
            <surname>Abadi</surname>
          </string-name>
          , Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S Corrado, Andy Davis,
          <string-name>
            <given-names>Jeffrey</given-names>
            <surname>Dean</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Matthieu</given-names>
            <surname>Devin</surname>
          </string-name>
          , and others.
          <source>2016</source>
          .
          <article-title>Tensorflow: Large-scale machine learning on heterogeneous distributed systems</article-title>
          .
          <source>arXiv preprint arXiv:1603.04467</source>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Souad</given-names>
            <surname>Chaabouni</surname>
          </string-name>
          , Jenny Benois-Pineau, and Chokri Ben Amar.
          <year>2016</year>
          .
          <article-title>Transfer learning with deep networks for saliency prediction in natural video</article-title>
          .
          <source>In Proceedings of the 2016 IEEE International Conference on Image Processing (ICIP)</source>
          .
          <fpage>1604</fpage>
          -
          <lpage>1608</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>François</given-names>
            <surname>Chollet</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Keras: Deep learning library for theano and tensorflow</article-title>
          . (
          <year>2015</year>
          ). https://keras.io/ Accessed: 2017-09-01.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>YN</given-names>
            <surname>Dauphin</surname>
          </string-name>
          , H De Vries
          ,
          <string-name>
            <given-names>J</given-names>
            <surname>Chung</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y</given-names>
            <surname>Bengio</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>RMSProp and equilibrated adaptive learning rates for non-convex optimization</article-title>
          .
          <source>arXiv preprint arXiv:1502.04390</source>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Jeff</given-names>
            <surname>Donahue</surname>
          </string-name>
          , Yangqing Jia, Oriol Vinyals, Judy Hoffman, Ning Zhang, Eric Tzeng, and
          <string-name>
            <given-names>Trevor</given-names>
            <surname>Darrell</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition</article-title>
          .
          <source>In Proceedings of the 31st International Conference on Machine Learning (ICML)</source>
          , Vol.
          <volume>32</volume>
          .
          <fpage>647</fpage>
          -
          <lpage>655</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Richard HR</given-names>
            <surname>Hahnloser</surname>
          </string-name>
          , Rahul Sarpeshkar, Misha A Mahowald, Rodney J Douglas, and
          <string-name>
            <given-names>H Sebastian</given-names>
            <surname>Seung</surname>
          </string-name>
          .
          <year>2000</year>
          .
          <article-title>Digital selection and analogue amplification coexist in a cortex-inspired silicon circuit</article-title>
          .
          <source>Nature</source>
          <volume>405</volume>
          ,
          <issue>6789</issue>
          (
          <year>2000</year>
          ),
          <fpage>947</fpage>
          -
          <lpage>951</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Mark</given-names>
            <surname>Hall</surname>
          </string-name>
          , Eibe Frank, Geoffrey Holmes, Bernhard Pfahringer,
          <string-name>
            <given-names>Peter</given-names>
            <surname>Reutemann</surname>
          </string-name>
          , and
          <string-name>
            <surname>Ian H Witten</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>The WEKA data mining software: an update</article-title>
          .
          <source>ACM SIGKDD explorations newsletter 11</source>
          ,
          <issue>1</issue>
          (
          <year>2009</year>
          ),
          <fpage>10</fpage>
          -
          <lpage>18</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Kaiming</given-names>
            <surname>He</surname>
          </string-name>
          , Xiangyu Zhang, Shaoqing Ren, and
          <string-name>
            <given-names>Jian</given-names>
            <surname>Sun</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Deep residual learning for image recognition</article-title>
          .
          <source>In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</source>
          .
          <fpage>770</fpage>
          -
          <lpage>778</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Diederik</given-names>
            <surname>Kingma</surname>
          </string-name>
          and
          <string-name>
            <given-names>Jimmy</given-names>
            <surname>Ba</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Adam: A method for stochastic optimization</article-title>
          .
          <source>arXiv preprint arXiv:1412.6980</source>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Mathias</given-names>
            <surname>Lux</surname>
          </string-name>
          , Michael Riegler, Pål Halvorsen, Konstantin Pogorelov, and
          <string-name>
            <given-names>Nektarios</given-names>
            <surname>Anagnostopoulos</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>LIRE: open source visual information retrieval</article-title>
          .
          <source>In Proceedings of the 2016 ACM Conference on Multimedia Systems (MMSys)</source>
          .
          <source>Article no. 30.</source>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Jiquan</given-names>
            <surname>Ngiam</surname>
          </string-name>
          , Adam Coates, Ahbik Lahiri, Bobby Prochnow, Quoc V Le, and Andrew Y Ng.
          <year>2011</year>
          .
          <article-title>On optimization methods for deep learning</article-title>
          .
          <source>In Proceedings of the 28th International Conference on Machine Learning (ICML)</source>
          .
          <fpage>265</fpage>
          -
          <lpage>272</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Konstantin</given-names>
            <surname>Pogorelov</surname>
          </string-name>
          , Sigrun Losada Eskeland, Thomas de Lange, Carsten Griwodz, Kristin Ranheim Randel, Håkon Kvale Stensland,
          Duc-Tien Dang-Nguyen, Concetto Spampinato, Dag Johansen,
          <string-name>
            <given-names>Michael</given-names>
            <surname>Riegler</surname>
          </string-name>
          , and others.
          <year>2017</year>
          .
          <article-title>A holistic multimedia system for gastrointestinal tract disease detection</article-title>
          .
          <source>In Proceedings of the 8th ACM Conference on Multimedia Systems (MMSys)</source>
          .
          <fpage>112</fpage>
          -
          <lpage>123</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Konstantin</given-names>
            <surname>Pogorelov</surname>
          </string-name>
          , Kristin Ranheim Randel, Carsten Griwodz, Sigrun Losada Eskeland, Thomas de Lange, Dag Johansen, Concetto Spampinato,
          Duc-Tien Dang-Nguyen, Mathias Lux, Peter Thelin Schmidt,
          <string-name>
            <given-names>Michael</given-names>
            <surname>Riegler</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Pål</given-names>
            <surname>Halvorsen</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Kvasir: A MultiClass Image Dataset for Computer Aided Gastrointestinal Disease Detection</article-title>
          .
          <source>In Proceedings of the 8th ACM on Multimedia Systems Conference (MMSys)</source>
          .
          <fpage>164</fpage>
          -
          <lpage>169</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Konstantin</given-names>
            <surname>Pogorelov</surname>
          </string-name>
          , Michael Riegler, Sigrun Losada Eskeland, Thomas de Lange, Dag Johansen, Carsten Griwodz, Peter Thelin Schmidt, and
          <string-name>
            <given-names>Pål</given-names>
            <surname>Halvorsen</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Efficient disease detection in gastrointestinal videos - global features versus neural networks</article-title>
          .
          <source>Multimedia Tools and Applications</source>
          (
          <year>2017</year>
          ),
          <fpage>1</fpage>
          -
          <lpage>33</lpage>
          . https://doi.org/10.1007/ s11042-017-4989-y
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Michael</given-names>
            <surname>Riegler</surname>
          </string-name>
          , Mathias Lux, Carsten Griwodz, Concetto Spampinato, Thomas de Lange, Sigrun L Eskeland, Konstantin Pogorelov, Wallapak Tavanapong,
          Peter T Schmidt, Cathal Gurrin, Dag Johansen, Håvard Johansen, and
          <string-name>
            <given-names>Pål</given-names>
            <surname>Halvorsen</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Multimedia and Medicine: Teammates for better disease detection and survival</article-title>
          .
          <source>In Proceedings of the 2016 ACM Multimedia Conference (ACM MM)</source>
          .
          <fpage>968</fpage>
          -
          <lpage>977</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Michael</given-names>
            <surname>Riegler</surname>
          </string-name>
          , Konstantin Pogorelov, Sigrun Losada Eskeland, Peter Thelin Schmidt, Zeno Albisser, Dag Johansen, Carsten Griwodz, Pål Halvorsen, and Thomas de Lange.
          <year>2017</year>
          .
          <article-title>From Annotation to Computer Aided Diagnosis: Detailed Evaluation of a Medical Multimedia System</article-title>
          .
          <source>Transactions on Multimedia Computing, Communications and Applications 9</source>
          ,
          <issue>4</issue>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>Michael</given-names>
            <surname>Riegler</surname>
          </string-name>
          , Konstantin Pogorelov, Pål Halvorsen, Thomas de Lange, Carsten Griwodz, Peter Thelin Schmidt, Sigrun Losada Eskeland, and
          <string-name>
            <given-names>Dag</given-names>
            <surname>Johansen</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>EIR - Efficient Computer Aided Diagnosis Framework for Gastrointestinal endoscopies</article-title>
          .
          <source>In Proceedings of the 14th International Workshop on Content-based Multimedia Indexing (CBMI)</source>
          .
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Michael</given-names>
            <surname>Riegler</surname>
          </string-name>
          , Konstantin Pogorelov, Pål Halvorsen, Kristin Ranheim Randel, Sigrun Losada Eskeland,
          <string-name>
            <surname>Duc-Tien</surname>
          </string-name>
          Dang-Nguyen, Mathias Lux, Carsten Griwodz, Concetto Spampinato, and Thomas de Lange.
          <year>2017</year>
          .
          <article-title>Multimedia for Medicine: The Medico Task at MediaEval 2017</article-title>
          .
          <source>In Proceedings of the 2017 MediaEval Benchmarking Initiative for Multimedia Evaluation</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Christian</given-names>
            <surname>Szegedy</surname>
          </string-name>
          , Vincent Vanhoucke, Sergey Ioffe, Jonathon Shlens, and
          <string-name>
            <given-names>Zbigniew</given-names>
            <surname>Wojna</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Rethinking the inception architecture for computer vision</article-title>
          .
          <source>arXiv preprint arXiv:1512.00567</source>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Yi</given-names>
            <surname>Wang</surname>
          </string-name>
          , Wallapak Tavanapong, Johnny Wong, JungHwan Oh, and
          <string-name>
            <surname>Piet C De Groen</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Computer-aided detection of retroflexion in colonoscopy</article-title>
          .
          <source>In Proceeding of the 24th International Symposium on Computer-Based Medical Systems (CBMS)</source>
          .
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>Yi</given-names>
            <surname>Wang</surname>
          </string-name>
          , Wallapak Tavanapong, Johnny Wong, Jung Hwan Oh, and
          <string-name>
            <surname>Piet C De Groen</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Polyp-alert: Near real-time feedback during colonoscopy</article-title>
          .
          <source>Computer methods and programs in biomedicine 120</source>
          ,
          <issue>3</issue>
          (
          <year>2015</year>
          ),
          <fpage>164</fpage>
          -
          <lpage>179</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>