<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Deep Learning Based Disease Detection Using Domain Specific Transfer Learning</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Steven A. Hicks</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pia H. Smedsrud</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pål Halvorsen</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michael Riegler</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Simula Metropolitan Center for Digital Engineering</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Simula Research Laboratory</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Oslo</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2018</year>
      </pub-date>
      <fpage>29</fpage>
      <lpage>31</lpage>
      <abstract>
        <p>In this paper, we present our approach for the Medico Multimedia Task as part of the MediaEval 2018 Benchmark [13]. Our method is based on convolutional neural networks (CNNs), and we compare how fine-tuning, in the context of transfer learning, from different source domains (general versus medical) affects classification performance. The preliminary results show that fine-tuning models trained on large and diverse datasets is favorable, even when the model's source domain has little to no resemblance to the new target.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>2 APPROACH</title>
      <p>
        As the current state-of-the-art method for solving most computer
vision tasks involves various implementations of deep neural
networks, we decided to base our approach on this class of algorithms,
specifically CNNs. However, due to the limited size of the
development dataset [
        <xref ref-type="bibr" rid="ref11 ref12">11, 12</xref>
        ], training a CNN from scratch would most
likely yield subpar results. To address this, we
fine-tune the weights of networks previously trained on larger datasets,
using the limited data available to fit our specific domain
(classification of images taken from the gastrointestinal (GI) tract). This technique is
commonly referred to as transfer learning (TL) and has been shown to
work well across different domains [
        <xref ref-type="bibr" rid="ref16 ref5 ref6">5, 6, 16</xref>
        ].
      </p>
      <p>
        For this challenge, we hypothesized that adapting the weights of
a model trained on data similar to our own (medical images) would
yield better results than adapting models trained on data with little
resemblance, both in terms of time to convergence and
classification performance. To test this hypothesis, we compared models
trained to achieve high scores on the ImageNet
challenge [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] to models trained for medical image classification.
      </p>
      <p>For the classification task, all models were evaluated according to the
requirements given, namely the Matthews correlation coefficient (MCC)
and the number of samples used for training. Runs submitted to
the efficiency task were evaluated based on their classification
throughput, i.e., how many images the model can classify per second.</p>
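      <p>As a minimal sketch of how MCC can be computed for a multi-class problem, the function below applies the standard confusion-matrix generalization of the coefficient. The function name and the example matrices are illustrative assumptions; the exact evaluation code used by the task organizers is not described here.</p>
      <preformat><![CDATA[
```python
import math


def multiclass_mcc(confusion):
    """Matthews correlation coefficient from a square confusion matrix.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)                 # all samples
    correct = sum(confusion[k][k] for k in range(n_classes))   # diagonal
    true_counts = [sum(row) for row in confusion]              # per true class
    pred_counts = [sum(confusion[i][k] for i in range(n_classes))
                   for k in range(n_classes)]                  # per predicted class

    numerator = correct * total - sum(
        t * p for t, p in zip(true_counts, pred_counts))
    denominator = math.sqrt(
        (total ** 2 - sum(p * p for p in pred_counts))
        * (total ** 2 - sum(t * t for t in true_counts)))
    # Convention: report 0.0 when the coefficient is undefined.
    return numerator / denominator if denominator else 0.0


if __name__ == "__main__":
    # Illustrative (made-up) two-class confusion matrix.
    print(round(multiclass_mcc([[4, 1], [2, 3]]), 3))
```
]]></preformat>
      <p>A perfect classifier scores 1, while a classifier whose predictions carry no information about the true class scores 0, which makes the metric less sensitive to class imbalance than plain accuracy.</p>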
    </sec>
    <sec id="sec-2">
      <title>2.4 Techniques for Efficient Classification</title>
      <p>
        For the fast and efficient classification task, we simply
reused the models trained for the classification task to see which
models were most efficient. We quickly observed that models
implemented in PyTorch [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] achieved a much higher throughput, measured in frames per second (FPS),
than their counterparts implemented in TensorFlow [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], which we attribute largely to the
difference in how tensors are laid out in the two frameworks.
This led to some models being re-implemented in PyTorch and
re-evaluated.
      </p>
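      <p>The FPS comparison described above can be sketched as a simple timing loop. This is a framework-agnostic illustration under stated assumptions: <italic>measure_fps</italic> and <italic>dummy_classifier</italic> are hypothetical names, and the dummy function merely stands in for a trained network's forward pass. When timing a model that runs on a GPU, the device would additionally need to be synchronized before reading the clock.</p>
      <preformat><![CDATA[
```python
import time


def measure_fps(classify, images, warmup=2):
    """Return the number of images classified per second."""
    for image in images[:warmup]:
        classify(image)              # warm-up calls are excluded from timing
    start = time.perf_counter()
    for image in images:
        classify(image)
    elapsed = time.perf_counter() - start
    return len(images) / elapsed if elapsed > 0 else float("inf")


def dummy_classifier(image):
    # Hypothetical stand-in for a CNN: returns the index of the largest value.
    return max(range(len(image)), key=image.__getitem__)


if __name__ == "__main__":
    frames = [[0.1, 0.7, 0.2]] * 1000
    print(f"{measure_fps(dummy_classifier, frames):.0f} FPS")
```
]]></preformat>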
    </sec>
    <sec id="sec-3">
      <title>3 RESULTS AND ANALYSIS</title>
      <p>The initial evaluation of our internal experiments was done using
3-fold cross-validation, where each run was scored by averaging
the macro-averaged classification scores across the three splits. A
complete overview of the internal runs for both tasks is shown in
Table 1. Based on these initial findings, we selected four runs for
the classification of diseases and findings task (Table 2) and three
runs for the fast and efficient classification task (Table 3) as official
runs to be submitted to the event organizers.</p>
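      <p>The scoring scheme described above can be sketched as follows. The splitting strategy (every k-th sample) and the helper names are assumptions made for illustration, and the scores in the usage example are made up rather than taken from our experiments.</p>
      <preformat><![CDATA[
```python
def k_fold_splits(n_samples, k=3):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    for fold in range(k):
        validation = indices[fold::k]      # every k-th sample, offset by fold
        held_out = set(validation)
        train = [i for i in indices if i not in held_out]
        yield train, validation


def macro_average(per_class_scores):
    """Unweighted mean over classes, so every class counts equally."""
    return sum(per_class_scores) / len(per_class_scores)


def run_score(per_split_class_scores):
    """Average the macro-averaged score of each cross-validation split."""
    split_scores = [macro_average(scores) for scores in per_split_class_scores]
    return sum(split_scores) / len(split_scores)


if __name__ == "__main__":
    # Illustrative per-class scores for three splits of a two-class problem.
    print(run_score([[1.0, 0.5], [0.5, 0.0], [1.0, 1.0]]))
```
]]></preformat>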
      <p>Runs were prioritized for submission by looking at which
experiments achieved the highest metric relative to the task at hand
(MCC or FPS). Additionally, we wanted to submit a variety of
different models. For example, even though our fine-tuned medical-based models
did not perform as well as their ImageNet-based counterparts, we
still wanted to submit one for official evaluation. For the same
reason, we also submitted a model trained on a
significantly reduced development dataset of only
657 samples.</p>
    </sec>
    <sec id="sec-4">
      <title>3.1 Classification Subtask Results</title>
      <p>Looking at the results for the classification task (Table 2), we see that
the best performing run is the 3-Averaged DenseNet169. This was
expected, as it constitutes the averaged output of the best performing
model from our internal experiments. Furthermore, as shown in
our internal runs, the ImageNet-based model beats the medical-based
one by approximately 10% when comparing MCC scores. We
believe these results may be due to the difference in variety and
size between the two datasets used to train the base models. Due
to limited time and resources, we were only able to train a small
variety of networks on the medical dataset, and we believe there is
more work to be done in this respect.</p>
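      <p>One plausible reading of output averaging is sketched below, assuming that the class-probability vectors of several trained copies of a model are averaged element-wise before the most likely class is selected. The function names and the probability values are illustrative assumptions and do not reproduce the submitted run.</p>
      <preformat><![CDATA[
```python
def average_predictions(model_outputs):
    """Element-wise mean of per-model class-probability vectors."""
    n_models = len(model_outputs)
    return [sum(class_probs) / n_models for class_probs in zip(*model_outputs)]


def predict_class(model_outputs):
    """Index of the class with the highest averaged probability."""
    averaged = average_predictions(model_outputs)
    return max(range(len(averaged)), key=averaged.__getitem__)


# Illustrative (made-up) outputs of three models for one image, three classes.
outputs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.6, 0.3],
]

if __name__ == "__main__":
    print(predict_class(outputs))
```
]]></preformat>
      <p>In this example the first model alone would choose class 0, but the averaged output favors class 1, which illustrates how averaging can smooth out the errors of individual models.</p>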
      <p>Somewhat surprisingly, the submitted model trained
on a severely limited training set of 657 samples, (Tiny) DenseNet201,
still retained a relatively high MCC score. We believe this
is due to the similarity between images within the same class and
the fact that each class is visually quite distinct (with a few exceptions).
This is supported by the confusion matrix shown in Table 4, where
we see that the model fails on just a few categories.</p>
    </sec>
    <sec id="sec-5">
      <title>3.2 Efficiency Subtask Results</title>
      <p>Table 3 shows the official results for the efficiency
subtask. Note that all models submitted to this task were implemented
in PyTorch. Of the three models, AlexNet was the fastest
by a large margin. We believe this is due to the network's
limited depth and complexity, i.e., the number of layers and parameters.
Additionally, the model’s MCC score is relatively high, considering
that AlexNet is rather simple compared to the models we used for the
classification task.</p>
    </sec>
    <sec id="sec-6">
      <title>4 CONCLUSION</title>
      <p>In this paper, we presented the work done as part of the Medico
Multimedia Task, where we participated in two of the four available
subtasks. Our main hypothesis for this challenge was that
fine-tuned models with a medical source domain would perform better
than fine-tuned ImageNet models when used for medical disease
detection. Furthermore, with the goal of submitting to the efficiency
task, we measured the FPS of the models. Based on our internal
experiments and the official evaluation metrics received from the
event organizers, we conclude that a large and varied dataset takes
precedence over how similar the source domain is to the target.
Additionally, we found that networks of lesser depth and complexity
were generally more efficient. We acknowledge that these results may
be anecdotal, and we believe more research is required to fully
explore the potential of our approach.</p>
      <sec id="sec-6-1">
        <title>Internal Classification Evaluation Results</title>
        <p>Methods: InceptionResnetV2, ResNet50, ResNet18, AlexNet, DenseNet169, VGG11, (Tiny) DenseNet201, DenseNet169, InceptionResnetV2.</p>
      </sec>
      <sec id="sec-6-2">
        <title>Official Classification Evaluation Results</title>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Martín</given-names>
            <surname>Abadi</surname>
          </string-name>
          , Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis,
          <string-name>
            <given-names>Jeffrey</given-names>
            <surname>Dean</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Matthieu</given-names>
            <surname>Devin</surname>
          </string-name>
          , Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Yangqing Jia, Rafal Jozefowicz, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dandelion Mané, Rajat Monga, Sherry Moore, Derek Murray, Chris Olah, Mike Schuster, Jonathon Shlens, Benoit Steiner, Ilya Sutskever, Kunal Talwar, Paul Tucker, Vincent Vanhoucke, Vijay Vasudevan, Fernanda Viégas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke,
          <string-name>
            <given-names>Yuan</given-names>
            <surname>Yu</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Xiaoqiang</given-names>
            <surname>Zheng</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <source>TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems</source>
          . (
          <year>2015</year>
          ). https://www.tensorflow.org/ Software available from tensorflow.org.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Mateusz</given-names>
            <surname>Buda</surname>
          </string-name>
          , Atsuto Maki, and
          <string-name>
            <given-names>Maciej A.</given-names>
            <surname>Mazurowski</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>A systematic study of the class imbalance problem in convolutional neural networks</article-title>
          .
          <source>Computing Research Repository</source>
          abs/1710.05381 (
          <year>2017</year>
          ). arXiv:1710.05381 http://arxiv.org/abs/1710.05381
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>François</given-names>
            <surname>Chollet</surname>
          </string-name>
          and others.
          <year>2015</year>
          .
          <source>Keras</source>
          . https://keras.io. (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J.</given-names>
            <surname>Deng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Dong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Socher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.-J.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Li</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Fei-Fei</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>ImageNet: A Large-Scale Hierarchical Image Database</article-title>
          .
          <source>In Proceedings of the 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>H. G.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Choi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y. M.</given-names>
            <surname>Ro</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Modality-bridge Transfer Learning for Medical Image Classification</article-title>
          .
          <source>ArXiv e-prints</source>
          (Aug.
          <year>2017</year>
          ). arXiv:cs.CV/1708.03111
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Simon</given-names>
            <surname>Kornblith</surname>
          </string-name>
          , Jonathon Shlens, and
          <string-name>
            <given-names>Quoc V.</given-names>
            <surname>Le</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Do Better ImageNet Models Transfer Better?</article-title>
          <source>Computing Research Repository</source>
          abs/1805.08974 (
          <year>2018</year>
          ). arXiv:1805.08974 http://arxiv.org/abs/1805.08974
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Anders</given-names>
            <surname>Krogh</surname>
          </string-name>
          and
          <string-name>
            <given-names>John A.</given-names>
            <surname>Hertz</surname>
          </string-name>
          .
          <year>1992</year>
          .
          <article-title>A Simple Weight Decay Can Improve Generalization</article-title>
          .
          <source>In Advances in Neural Information Processing Systems</source>
          <volume>4</volume>
          , J. E. Moody, S. J. Hanson, and R. P. Lippmann (Eds.). Morgan-Kaufmann,
          <fpage>950</fpage>
          -
          <lpage>957</lpage>
          . http://papers.nips.cc/paper/563-a-simple-weight-decay-can-improve-generalization.pdf
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Andreas</given-names>
            <surname>Leibetseder</surname>
          </string-name>
          , Stefan Petscharnig, Manfred Jürgen Primus, Sabrina Kletz, Bernd Münzer, Klaus Schoeffmann, and
          <string-name>
            <given-names>Jörg</given-names>
            <surname>Keckstein</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Lapgyn4: A Dataset for 4 Automatic Content Analysis Problems in the Domain of Laparoscopic Gynecology</article-title>
          .
          <source>In Proceedings of the 9th ACM Multimedia Systems Conference (MMSys '18)</source>
          . ACM, New York, NY, USA,
          <fpage>357</fpage>
          -
          <lpage>362</lpage>
          . https://doi.org/10.1145/3204949.3208127
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Pushparaja</given-names>
            <surname>Murugan</surname>
          </string-name>
          and
          <string-name>
            <given-names>Shanmugasundaram</given-names>
            <surname>Durairaj</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Regularization and Optimization strategies in Deep Convolutional Neural Network</article-title>
          .
          <source>Computing Research Repository</source>
          abs/1712.04711 (
          <year>2017</year>
          ). arXiv:1712.04711 http://arxiv.org/abs/1712.04711
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Adam</given-names>
            <surname>Paszke</surname>
          </string-name>
          , Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang,
          <string-name>
            <given-names>Zachary</given-names>
            <surname>DeVito</surname>
          </string-name>
          , Zeming Lin, Alban Desmaison, Luca Antiga, and
          <string-name>
            <given-names>Adam</given-names>
            <surname>Lerer</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Automatic differentiation in PyTorch</article-title>
          . (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Konstantin</given-names>
            <surname>Pogorelov</surname>
          </string-name>
          , Kristin Ranheim Randel, Thomas de Lange, Sigrun Losada Eskeland, Carsten Griwodz, Dag Johansen, Concetto Spampinato, Mario Taschwer, Mathias Lux, Peter Thelin Schmidt,
          <string-name>
            <given-names>Michael</given-names>
            <surname>Riegler</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Pål</given-names>
            <surname>Halvorsen</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Nerthus: A Bowel Preparation Quality Video Dataset</article-title>
          .
          <source>In Proceedings of the 8th ACM on Multimedia Systems Conference</source>
          . ACM,
          <fpage>170</fpage>
          -
          <lpage>174</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Konstantin</given-names>
            <surname>Pogorelov</surname>
          </string-name>
          , Kristin Ranheim Randel, Carsten Griwodz, Sigrun Losada Eskeland, Thomas de Lange, Dag Johansen, Concetto Spampinato, Duc-Tien Dang-Nguyen, Mathias Lux, Peter Thelin Schmidt,
          <string-name>
            <given-names>Michael</given-names>
            <surname>Riegler</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Pål</given-names>
            <surname>Halvorsen</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Kvasir: A Multi-Class Image Dataset for Computer Aided Gastrointestinal Disease Detection</article-title>
          .
          <source>In Proceedings of the 8th ACM on Multimedia Systems Conference</source>
          . ACM,
          <fpage>164</fpage>
          -
          <lpage>169</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Konstantin</given-names>
            <surname>Pogorelov</surname>
          </string-name>
          , Michael Riegler, Pål Halvorsen, Thomas de Lange, Kristin Ranheim Randel, Duc-Tien Dang-Nguyen,
          <string-name>
            <given-names>Mathias</given-names>
            <surname>Lux</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Olga</given-names>
            <surname>Ostroukhova</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Medico Multimedia Task at MediaEval 2018</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Konstantin</given-names>
            <surname>Pogorelov</surname>
          </string-name>
          , Michael Riegler, Pål Halvorsen, Carsten Griwodz, Thomas de Lange, Kristin Ranheim Randel, Sigrun Losada Eskeland, Duc-Tien Dang-Nguyen, Olga Ostroukhova, Mathias Lux, and
          <string-name>
            <given-names>Concetto</given-names>
            <surname>Spampinato</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>A Comparison of Deep Learning with Global Features for Gastrointestinal Disease Detection</article-title>
          . In MediaEval.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Klaus</given-names>
            <surname>Schoeffmann</surname>
          </string-name>
          , Mario Taschwer, Stephanie Sarny, Bernd Münzer, Manfred Jürgen Primus, and
          <string-name>
            <given-names>Doris</given-names>
            <surname>Putzgruber</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Cataract-101: Video Dataset of 101 Cataract Surgeries</article-title>
          .
          <source>In Proceedings of the 9th ACM Multimedia Systems Conference (MMSys '18)</source>
          . ACM, New York, NY, USA,
          <fpage>421</fpage>
          -
          <lpage>425</lpage>
          . https://doi.org/10.1145/3204949.3208137
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Chuen-Kai</given-names>
            <surname>Shie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Chung-Hisang</given-names>
            <surname>Chuang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Chun-Nan</given-names>
            <surname>Chou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Meng-Hsi</given-names>
            <surname>Wu</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Edward Y.</given-names>
            <surname>Chang</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Transfer representation learning for medical image analysis</article-title>
          .
          <source>2015 (08</source>
          <year>2015</year>
          ),
          <fpage>711</fpage>
          -
          <lpage>714</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>Luke</given-names>
            <surname>Taylor</surname>
          </string-name>
          and Geoff Nitschke.
          <year>2017</year>
          .
          <article-title>Improving Deep Learning using Generic Data Augmentation</article-title>
          .
          <source>Computing Research Repository</source>
          abs/1708.06020 (
          <year>2017</year>
          ). arXiv:1708.06020 http://arxiv.org/abs/1708.06020
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>