<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>ImageCLEF 2018 Tuberculosis Task: Ensemble of 3D CNNs with Multiple Inputs for Tuberculosis Type Classification</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Adam Ishay</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Oge Marques</string-name>
          <email>omarquesg@fau.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University</institution>
          ,
          <addr-line>33431 Boca Raton FL</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Convolutional neural networks have achieved state-of-the-art results in general image classification tasks and have shown success in several applications within the medical imaging domain. In this paper, we apply a 3D convolutional neural network (CNN) to a dataset of tuberculosis-positive computed tomography (CT) scans to solve the task of automatically categorizing each tuberculosis (TB) case into one of five possible TB types in the context of the ImageCLEFtuberculosis 2018 challenge. The size of the volumetric scans poses unique constraints on the network and the training process. The CT volumes are segmented using the provided masks and further pre-processed prior to training our model. Our best run ranked 2nd with an unweighted Cohen's Kappa of 0.1736 and an accuracy of 35.33%.</p>
      </abstract>
      <kwd-group>
        <kwd>3D-CNN</kwd>
        <kwd>Medical Imaging</kwd>
        <kwd>Deep Learning</kwd>
        <kwd>Tuberculosis</kwd>
        <kwd>Image Classification</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        For the second year, ImageCLEF [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] has proposed the ImageCLEFtuberculosis
2018 task [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], in efforts to reduce the time and cost of medical image
analysis. This year there are three subtasks: multi-drug resistance (MDR)
detection, tuberculosis type classification, and severity scoring. The goal of the
MDR task is to predict the probability of a patient having a drug-resistant form of
tuberculosis. The third task, severity scoring, aims at predicting a severity score
from 1 (very bad) to 5 (very good). Finally, the task that this paper addresses
is tuberculosis type classification: we are tasked with classifying the type of
tuberculosis, given a positive image. These types are: (1) Infiltrative, (2) Focal,
(3) Tuberculoma, (4) Miliary, and (5) Fibro-cavernous.
      </p>
      <p>
        Deep learning approaches have been shown to be successful on a large variety
of computer vision and image analysis tasks [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Deep learning and CNNs in
particular have now broadly been applied to medical imaging [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. We apply a
deep 3D CNN to the medical image dataset for classification.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Data Pre-processing</title>
      <p>The training set provided by the ImageCLEF organizers consisted of patient
chest CT scans of five different types of TB along with their labels. Patients often
had multiple scans, and all scans of the same patient were of the same
type. There were 228, 210, 100, 79, and 60 patients belonging to the Infiltrative,
Focal, Tuberculoma, Miliary, and Fibro-cavernous types, respectively. The dataset
totaled 677 patients with 1008 scans (Table 1). Each scan consists of
approximately 100 slices of 512×512 pixels. The depth of each scan varies and was changed to a
constant number of slices.</p>
      <p>
        The pre-processing stage consisted of 7 steps (Figure 1). The supplied masks [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]
were applied to the original scans to segment the lungs. The distance between
slices, along with the resolution of each slice, varied among scans. For easier
training, the images were resampled to an isotropic resolution of 1×1×1 mm. After
this step, the scans were roughly 300×300×300 voxels in size. All scans were then cropped to
remove the excess zeros in the background left by the mask. The voxel values were
clipped between -1000 and 400 and normalized between 0 and 1; values
outside of this range are not useful. Then, the largest lungs were used to find the
new width and height to which all images would be padded, using the common
background voxel value in the scans. The scans were also padded in the depth
dimension to the depth of the largest scan. Next, the mean voxel value was calculated
and subtracted from all scans to zero-center the data for better training. Finally,
the resulting scans were resized to reduce the data to a more reasonable size for
the network. Using this process, two datasets of different-sized images were
created (see Figure 2). The purpose of this was to combine two different networks
to predict the label. The batch sizes used are a function of the size of the input
and the architecture of the network. In the networks used, most of the memory
consumption was due to the first few layers of the network, since in these layers
the images were still large.
      </p>
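      <p>The clipping, normalization, and zero-centering steps above can be sketched in NumPy. This is a minimal illustration, not the authors' code; the Hounsfield-unit window comes from the text, while the function names and the dataset-wide mean computation are assumptions.</p>

```python
import numpy as np

HU_MIN, HU_MAX = -1000.0, 400.0  # clipping window stated in the paper

def clip_and_normalize(volume: np.ndarray) -> np.ndarray:
    """Clip voxel intensities to [-1000, 400] and rescale to [0, 1]."""
    clipped = np.clip(volume, HU_MIN, HU_MAX)
    return (clipped - HU_MIN) / (HU_MAX - HU_MIN)

def zero_center(volumes):
    """Subtract the dataset-wide mean voxel value to zero-center the data."""
    mean_voxel = np.mean([v.mean() for v in volumes])
    return [v - mean_voxel for v in volumes]

# Toy 2x2 "scan": one value below the window, one above, two inside.
scan = np.array([[-2000.0, -1000.0], [0.0, 1000.0]])
norm = clip_and_normalize(scan)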
      <p>
        The two datasets each had a train/validation split of 80/20.
Initially, this split was done by scan, and validation accuracy was relatively high.
However, when submitting results on the test set, the accuracy was much lower
and close to random. This was thought to be at least partially due to the
splitting method: because some patients had multiple scans, the training and
validation sets contained scans from the same patient. Upon visual inspection, scans
from the same patient were indeed similar (see Figure 3).
      </p>
      <p>
        The trained models were 3D convolutional neural networks built using the software
library Keras [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] with TensorFlow [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] backend. We opted for 3D convolutions
because they naturally capture the 3D nature of the scans. We trained two
networks, one for each dataset created in the pre-processing stage. The combination
of the two networks achieved better results than either of them alone. To
alleviate the class imbalance problem shown in Table 1, oversampling was used during
the training phase. Classes three, four, and five were oversampled to
approximately match the test distribution. This meant that a full epoch of training was
reached when roughly 900 patients (677 + 200) were processed by the network.
      </p>
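      <p>The patient-level splitting and oversampling described above can be sketched as follows. This is an illustrative reconstruction, assuming scans are identified by (scan ID, patient ID) pairs and that per-class oversampling targets are given explicitly; neither detail is specified in the paper.</p>

```python
import random
from collections import defaultdict

def split_by_patient(scans, val_frac=0.2, seed=0):
    """Split by patient, so no patient appears in both train and validation."""
    by_patient = defaultdict(list)
    for scan_id, patient_id in scans:
        by_patient[patient_id].append(scan_id)
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)
    n_val = int(len(patients) * val_frac)
    val_patients = set(patients[:n_val])
    train = [s for p in patients[n_val:] for s in by_patient[p]]
    val = [s for p in val_patients for s in by_patient[p]]
    return train, val

def oversample(items_by_class, targets):
    """Repeat minority-class items until each class reaches its target count."""
    out = []
    for cls, items in items_by_class.items():
        target = targets.get(cls, len(items))
        out.extend(items[i % len(items)] for i in range(target))
    return out
```

Splitting by patient rather than by scan removes the leakage that made the initial scan-level validation accuracy optimistic.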
      <p>
        Each network (Figure 4) had five convolution layers with rectified linear unit
(ReLU) activations, each followed by batch normalization and max pooling
with dropout. These led to two fully connected layers, each with batch
normalization and dropout. Finally, these activations went through a softmax
layer, which output a tensor of size five, one entry per category. Categorical
cross-entropy was used as the loss function, and Adam [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] was used for optimization.
      </p>
      <p>One of the most restrictive aspects of training this model was the batch size: the
memory footprint of the volumes forced very small batches, which makes convergence harder.</p>
      <p>Our most successful model was the combination of the two best models,
which had inputs of different-sized image volumes. The output of this ensemble
was five probabilities, one for each class. The probabilities were summed across
the two models, and this vector was then iteratively scaled by a weight vector
calculated from the class distribution. This resulted in output labels
that more closely matched the data distribution. This combination of networks,
shown in Figure 4, was used for predicting the test labels.</p>
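      <p>A minimal sketch of the ensemble step, assuming the weight vector is applied as a single element-wise reweighting of the summed probabilities; the paper's exact iterative scaling rule is not specified, so this is a simplified stand-in.</p>

```python
import numpy as np

def ensemble_predict(probs_a: np.ndarray, probs_b: np.ndarray,
                     class_weights: np.ndarray) -> np.ndarray:
    """Sum the two models' class probabilities, reweight by a vector derived
    from the class distribution, and return the predicted label per sample."""
    combined = (probs_a + probs_b) * class_weights  # element-wise reweighting
    return combined.argmax(axis=-1)
```

Reweighting nudges the predicted-label histogram toward the expected class distribution without retraining either network.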
    </sec>
    <sec id="sec-3">
      <title>Results</title>
      <p>Runs were only submitted for subtask 2, tuberculosis type classification. Our initial
submissions' accuracies were barely better than random chance (~28%). After
combining models and weighting probabilities, the accuracy and kappa score
improved. Our best run (indicated in bold in Table 2) ranked second in unweighted
kappa coefficient, but tenth in accuracy.</p>
    </sec>
    <sec id="sec-4">
      <title>Conclusion</title>
      <p>This paper applies a 3D CNN to pre-processed CT scans of the lungs. The
question of whether a CNN can extract the information necessary for labeling
types of TB remains open. Making predictions on image data alone has proved
a challenging problem. The large size of the images, together with the small size and class
imbalance of the datasets, is characteristic of medical imaging tasks. In this
analysis, batch sizes were restricted to five and fourteen samples for the
two networks used. A feasible way of effectively training with a much larger batch
size is to accumulate the gradients of each batch and only update the weights of
the network after storing a sufficient number of batches' gradients. The average
of the gradients over these batches can then be used to update the weights. This allows
for effectively training on larger batch sizes, circumventing memory problems.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Abadi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Agarwal</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barham</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brevdo</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Citro</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Corrado</surname>
            ,
            <given-names>G.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Davis</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dean</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Devin</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ghemawat</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goodfellow</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Harp</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Irving</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Isard</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jia</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jozefowicz</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kaiser</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kudlur</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Levenberg</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mane</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Monga</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moore</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Murray</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Olah</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schuster</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shlens</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Steiner</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sutskever</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Talwar</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tucker</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vanhoucke</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vasudevan</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Viegas</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vinyals</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Warden</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wattenberg</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wicke</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yu</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zheng</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          :
          <article-title>TensorFlow: Large-scale machine learning on heterogeneous systems (</article-title>
          <year>2015</year>
          ), https://www.tensorflow.org/, software available from tensorflow.org
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Chollet</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          , et al.: Keras. https://keras.io (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>Dicente</given-names>
            <surname>Cid</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            ,
            <surname>Liauchuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            ,
            <surname>Kovalev</surname>
          </string-name>
          ,
          <string-name>
            <surname>V.</surname>
          </string-name>
          , , Muller, H.:
          <article-title>Overview of ImageCLEFtuberculosis 2018 - detecting multi-drug resistance, classifying tuberculosis type, and assessing severity score</article-title>
          .
          <source>In: CLEF2018 Working Notes. CEUR Workshop Proceedings</source>
          , CEUR-WS.org &lt;http://ceur-ws.org&gt;
          , Avignon,
          <source>France (September 10-14</source>
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>Dicente</given-names>
            <surname>Cid</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            ,
            <surname>Jimenez del Toro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.A.</given-names>
            ,
            <surname>Depeursinge</surname>
          </string-name>
          ,
          <string-name>
            <surname>A.</surname>
          </string-name>
          , Muller, H.:
          <article-title>Efficient and fully automatic segmentation of the lungs in CT volumes</article-title>
          . In: Goksel,
          <string-name>
            <given-names>O.</given-names>
            ,
            <surname>Jimenez del Toro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.A.</given-names>
            ,
            <surname>Foncubierta-Rodríguez</surname>
          </string-name>
          , A., Muller, H. (eds.)
          <article-title>Proceedings of the VISCERAL Anatomy Grand Challenge at the 2015 IEEE ISBI</article-title>
          . pp.
          <volume>31</volume>
          –
          <fpage>35</fpage>
          . CEUR Workshop Proceedings, CEUR-WS (May
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Ionescu</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          , Muller, H.,
          <string-name>
            <surname>Villegas</surname>
          </string-name>
          , M.,
          <string-name>
            <surname>de Herrera</surname>
            ,
            <given-names>A.G.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Eickhoff</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Andrearczyk</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cid</surname>
            ,
            <given-names>Y.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liauchuk</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kovalev</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hasan</surname>
            ,
            <given-names>S.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ling</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Farri</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lungren</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dang-Nguyen</surname>
            ,
            <given-names>D.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Piras</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Riegler</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhou</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lux</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gurrin</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          : Overview of ImageCLEF 2018:
          <article-title>Challenges, datasets and evaluation. In: Experimental IR Meets Multilinguality, Multimodality, and Interaction</article-title>
          .
          <source>Proceedings of the Ninth International Conference of the CLEF Association (CLEF</source>
          <year>2018</year>
          ),
          <source>LNCS Lecture Notes in Computer Science</source>
          , Springer, Avignon,
          <source>France (September 10-14</source>
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Kingma</surname>
            ,
            <given-names>D.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ba</surname>
          </string-name>
          , J.:
          <article-title>Adam: A method for stochastic optimization</article-title>
          .
          <source>arXiv preprint arXiv:1412.6980</source>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>LeCun</surname>
          </string-name>
          , Y.,
          <string-name>
            <surname>Bengio</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hinton</surname>
          </string-name>
          , G.:
          <article-title>Deep learning</article-title>
          .
          <source>Nature</source>
          <volume>521</volume>
          (
          <issue>7553</issue>
          ),
          <volume>436</volume>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Litjens</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kooi</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bejnordi</surname>
            ,
            <given-names>B.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Setio</surname>
            ,
            <given-names>A.A.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ciompi</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ghafoorian</surname>
          </string-name>
          , M.,
          <string-name>
            <surname>van der Laak</surname>
          </string-name>
          , J.A.,
          <string-name>
            <surname>van Ginneken</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sanchez</surname>
            ,
            <given-names>C.I.:</given-names>
          </string-name>
          <article-title>A survey on deep learning in medical image analysis</article-title>
          .
          <source>Medical image analysis 42</source>
          ,
          <volume>60</volume>
          –
          <fpage>88</fpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>