<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>ImageCLEF2018: Transfer Learning for Deep Learning with CNN for Tuberculosis Classification</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>A. Gentili</string-name>
          <email>agentili@ucsd.edu</email>
        </contrib>
        <aff id="aff1">
          <label>1</label>
          <institution>San Diego VA Health Care System</institution>
          ,
          <addr-line>San Diego, CA</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff0">
          <label>0</label>
          <institution>University of California</institution>
          ,
          <addr-line>San Diego, CA</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The diagnosis of multidrug-resistant (MDR) tuberculosis is challenging. We present our method for classifying whether a patient has MDR or drug-sensitive (DS) tuberculosis based on a CT scan of the patient's chest; it achieved the best accuracy and the second-best AUC in the ImageCLEF 2018 Tuberculosis - MDR detection task. Our approach consists of reformatting the images in the coronal plane, converting them to PNG format, and using transfer learning to train a ResNeXt-50 convolutional neural network to classify images as MDR or DS tuberculosis.</p>
      </abstract>
      <kwd-group>
        <kwd>Deep Learning</kwd>
        <kwd>Convolutional Neural Network</kwd>
        <kwd>Tuberculosis</kwd>
        <kwd>Multidrug-resistant Tuberculosis</kwd>
        <kwd>CT Scans</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Tuberculosis remains a common disease, and the diagnosis of multidrug-resistant
(MDR) tuberculosis is challenging. It is difficult for radiologists to distinguish between
MDR and drug-sensitive (DS) tuberculosis, and the literature is inconsistent
about which radiographic features are useful. For instance, the presence of lymph node
calcifications is associated with MDR in some papers and with DS in others [
        <xref ref-type="bibr" rid="ref1 ref2 ref3 ref4 ref5">1-5</xref>
        ]. The main
objective of the ImageCLEF tuberculosis task is to provide tuberculosis severity scores
based on automatic analysis of lung CT images of patients. Being able to extract this
information from image data alone can allow for more limited lung washing and
laboratory analyses to determine tuberculosis type and drug resistances. This can lead to
quicker decisions on the best treatment strategies, reduced use of antibiotics, and lower
impact on patients [6].
      </p>
      <p>The data set provided for the ImageCLEF 2018 Tuberculosis - MDR detection task
included 259 patients in the training set and 236 patients in the test set [7]. See Table 1.</p>
      <p>
        As reported in the literature [
        <xref ref-type="bibr" rid="ref5">5, 8</xref>
        ], patients with MDR tuberculosis were younger: mean
age 43.6 ± 17.17 (SD) vs. 50.7 ± 18 years. By the two-sample Student's t-test, this
difference was significant, with p&lt;0.002. See Figure 1.
      </p>
      <p>The images for the ImageCLEF tuberculosis task were provided as NIfTI 3D datasets.
We used med2image, a Python 3 utility that converts medical image files to
more visually friendly formats such as PNG and JPG, to convert the images. After
reconstructing them in all three planes, we chose the coronal plane so that a larger
proportion of images would contain areas of abnormal lung. Although we did not visually
verify this on the images of this data set, tuberculosis usually involves the upper lobes with
relative sparing of the lung bases. As a result, axial images through the lung bases could
be normal even in a patient with severe disease in the upper lobes. Because
med2image did not take slice thickness into account, the reconstructed coronal
images were distorted and of varying height. To correct this problem, all images were
resized to a 512 x 512 matrix. Image masks for the lungs were available but were not
used. To exclude the chest wall while still including a significant portion of the lungs, only
images 150 to 350 of the 512 coronal images obtained for each patient were used for
training; image 150 was the most posterior and image 350 the most anterior image used.
All image equalization and data augmentation was done at training time using
the fastai library [9].</p>
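      <p>The slice selection described above can be sketched as follows. This is a minimal illustration rather than the code used for the submission: the function name select_coronal_slices is hypothetical, and the half-open range (image 150 included, image 350 excluded) is an assumption, chosen because it yields the 200 results per patient mentioned later in the text.</p>
      <preformat>
```python
from typing import List


def select_coronal_slices(n_slices: int = 512, first: int = 150, last: int = 350) -> List[int]:
    """Return the indices of the coronal slices kept for training.

    Assumption: the range is half-open (first included, last excluded),
    which gives 200 slices for the default arguments.
    """
    if first > last or 0 > first or last > n_slices:
        raise ValueError("slice range must lie inside the reconstructed volume")
    return list(range(first, last))


kept = select_coronal_slices()
print(len(kept), kept[0], kept[-1])  # 200 150 349
```
      </preformat>
      <p>Keeping the selection in one small function makes it easy to replace the fixed range later, for example with a range derived from the available lung masks.</p>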
      <p>Neural Network Training</p>
      <p>To train the CNN, we rented from Paperspace a cloud virtual machine with 8
CPUs, a Quadro P5000 GPU, 30 GB of RAM, and a 500 GB solid-state drive, created from
the fast.ai public template. We used the fastai library to perform transfer
learning with a ResNeXt-50 [10] convolutional neural network.</p>
      <p>The CNN was trained with an image size of 64 x 64. The learning rate was
determined by running the learning rate finder function and plotting the learning rate
against the loss. See Figure 2.</p>
      <p>Fig. 2. Learning rate vs. loss.</p>
      <p>After reviewing this curve, we selected a learning rate of 0.002 for the last layers. The
last layers were trained for 2 epochs without data augmentation and then for
2 additional epochs with data augmentation. For data augmentation, we used random
rotations of up to 10 degrees in each direction, random intensity changes of up to
5%, and random horizontal flipping (but no vertical flipping), on the assumption
that the right and left lungs are similar but the upper and lower lobes are different. Subsequently,
all layers were unfrozen and trained for an additional 3 epochs with different learning
rates for different layers: the learning rate of the final layers was kept at 0.002, that of
the middle layers was one third of this value, and that of the initial layers
was one ninth. The same augmentation used at training time was also
used at test time, and the predictions for 4 augmented versions of each test image were averaged.</p>
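      <p>The three training phases and the per-layer-group learning rates described above can be written down as plain data. The sketch below is an illustration under stated assumptions and does not call the fastai library: the names Phase, layer_group_lrs and training_schedule are hypothetical, a learning rate of 0.0 is used here to denote a frozen layer group, and the use of augmentation in the third phase is assumed, since the text does not state it explicitly.</p>
      <preformat>
```python
from typing import List, NamedTuple


class Phase(NamedTuple):
    name: str
    epochs: int
    augment: bool
    # Learning rates for the (initial, middle, last) layer groups.
    # 0.0 marks a frozen group (a convention of this sketch only).
    lrs: List[float]


def layer_group_lrs(last_lr: float = 0.002, factor: float = 3.0) -> List[float]:
    """Return learning rates of one ninth, one third and one times last_lr."""
    return [last_lr / factor ** 2, last_lr / factor, last_lr]


def training_schedule(last_lr: float = 0.002) -> List[Phase]:
    frozen = [0.0, 0.0, last_lr]
    return [
        Phase("last layers, no augmentation", 2, False, frozen),
        Phase("last layers, with augmentation", 2, True, frozen),
        # Assumption: augmentation stays enabled after unfreezing.
        Phase("all layers unfrozen", 3, True, layer_group_lrs(last_lr)),
    ]


for phase in training_schedule():
    print(phase.name, phase.epochs, phase.augment, phase.lrs)
```
      </preformat>
      <p>With the default arguments the schedule covers 7 epochs in total, and the earliest layers, which hold the most generic pretrained features, receive the smallest updates.</p>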
      <p>Because each image was analyzed separately, we had 200 different results for each
patient, which we averaged. As expected, averaging
decreased the probability of MDR tuberculosis, because some of the images
included only normal or less abnormal lung. As the number of patients with MDR
was known, the probabilities were manually rescaled in Microsoft Excel before
submission to yield the correct number of positive and negative MDR cases and to use the
entire probability range from 0 to 1.</p>
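      <p>A possible way to reproduce the per-patient averaging and rescaling in code is sketched below. The rescaling was done manually in Microsoft Excel and its exact form is not given, so the piecewise-linear mapping shown here is only one assumed scheme that satisfies the two stated goals (a fixed number of positive cases and use of the full range from 0 to 1). The function names, patient identifiers and probabilities are made up for illustration.</p>
      <preformat>
```python
from statistics import mean
from typing import Dict, List


def patient_scores(slice_probs: Dict[str, List[float]]) -> Dict[str, float]:
    """Average the per-slice MDR probabilities of each patient."""
    return {pid: mean(probs) for pid, probs in slice_probs.items()}


def rescale(scores: Dict[str, float], n_positive: int) -> Dict[str, float]:
    """Piecewise-linear rescaling to the range 0..1 (assumed scheme).

    The cut-off is placed midway between the n_positive-th highest score and
    the next one and is mapped to 0.5, so that exactly n_positive patients end
    up above 0.5. Requires n_positive between 1 and len(scores) - 1 and no
    tie at the cut-off.
    """
    ranked = sorted(scores.values(), reverse=True)
    lowest, highest = ranked[-1], ranked[0]
    cut = (ranked[n_positive - 1] + ranked[n_positive]) / 2
    rescaled = {}
    for pid, score in scores.items():
        if score > cut:
            rescaled[pid] = 0.5 + 0.5 * (score - cut) / (highest - cut)
        else:
            rescaled[pid] = 0.5 * (score - lowest) / (cut - lowest)
    return rescaled


# Made-up slice probabilities, three slices per patient instead of 200.
slices = {
    "patient_a": [0.30, 0.50, 0.40],
    "patient_b": [0.20, 0.20, 0.20],
    "patient_c": [0.35, 0.25, 0.30],
    "patient_d": [0.10, 0.10, 0.10],
}
for pid, value in rescale(patient_scores(slices), n_positive=1).items():
    print(pid, round(value, 3))
```
      </preformat>
      <p>Because the mapping is monotonic, it changes the accuracy at the 0.5 threshold but leaves the ranking of patients, and therefore the AUC, unchanged.</p>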
    </sec>
    <sec id="sec-2">
      <title>Results</title>
      <p>When each image is scored individually, patients with MDR tuberculosis have a
significant number of images scored as not MDR tuberculosis. This can be explained by
the fact that significant pathology necessary to make the diagnosis of MDR tuberculosis
may not be present in all images.</p>
      <p>In the final table of results, the submitted run for the MDR detection task was ranked first
for accuracy among the 39 submitted runs, with a prediction accuracy of 0.6144, and
second for area under the ROC curve (AUC), with an AUC of 0.6114 on the test image dataset [7].
The best AUC, 0.6178, was achieved by the VISTA@UEvora team.</p>
      <p>[Table: runs MDR-Run-04-Mix-Vote-L-RT-RF.txt and testflowI.csv]</p>
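      <p>For readers unfamiliar with the two metrics, the sketch below shows how accuracy and AUC can be computed from per-patient labels and probabilities. It is not the organizers' evaluation code: the function names, the 0.5 threshold and the toy data are assumptions made for illustration. The last line only checks that an accuracy of 0.6144 is consistent with 145 correct predictions out of the 236 test patients.</p>
      <preformat>
```python
from typing import Sequence


def accuracy(labels: Sequence[int], probs: Sequence[float], threshold: float = 0.5) -> float:
    """Fraction of cases whose thresholded probability matches the label."""
    hits = sum(1 for y, p in zip(labels, probs) if (p > threshold) == (y == 1))
    return hits / len(labels)


def auc(labels: Sequence[int], probs: Sequence[float]) -> float:
    """Probability that a random positive case scores above a random negative
    case, with ties counted as one half (equivalent to the area under the ROC curve)."""
    positives = [p for y, p in zip(labels, probs) if y == 1]
    negatives = [p for y, p in zip(labels, probs) if y == 0]
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data, for illustration only.
labels = [1, 1, 0, 0, 1, 0]
probs = [0.9, 0.4, 0.6, 0.2, 0.7, 0.1]
print(round(accuracy(labels, probs), 4))  # 0.6667
print(round(auc(labels, probs), 4))       # 0.8889
print(round(145 / 236, 4))                # 0.6144
```
      </preformat>
      <p>Accuracy depends on the chosen threshold, whereas AUC depends only on how the cases are ranked, which is why a run can lead on one metric and not on the other.</p>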
    </sec>
    <sec id="sec-3">
      <title>Analysis of the Results</title>
      <p>Although we achieved the best accuracy and second-best AUC, automatic detection of MDR
needs to improve further to be clinically useful. Accuracy and AUC in the 0.61
range cannot be relied upon by the treating physician.</p>
    </sec>
    <sec id="sec-4">
      <title>Perspectives for Future work</title>
      <p>Due to the competition’s time constraints, several shortcuts were taken:
arbitrary selection of coronal images 150 to 350, conversion of images to PNG format, and
averaging the results of the single slices of each patient. A better selection of images containing
the lungs or, even better, the abnormal portion of the lungs or mediastinum may improve
results. Using Hounsfield units from the original images instead of the values in the PNG
files may also be more accurate. Instead of averaging the results of single images and
rescaling them, a more robust approach to combining results from
multiple images of the same patient may also help; possibilities to consider include
an SVM [11] or an RNN [12].</p>
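      <p>To illustrate why values in an 8-bit PNG file can be less informative than Hounsfield units, the sketch below maps Hounsfield values to gray levels with a linear window. How med2image actually maps intensities is not described in this paper, so the windowing approach, the function name hu_to_uint8 and the window settings (a lung-type window with level -600 and width 1500) are all assumptions made for illustration.</p>
      <preformat>
```python
def hu_to_uint8(hu: float, level: float = -600.0, width: float = 1500.0) -> int:
    """Map a Hounsfield value to a gray level in 0..255 using a linear window.

    Values below the window are clipped to 0 and values above it to 255.
    The default level and width are assumed, not taken from the paper.
    """
    low = level - width / 2
    scaled = (hu - low) / width
    clipped = min(max(scaled, 0.0), 1.0)
    return round(clipped * 255)


# With the default window, everything at or above 150 HU saturates at 255,
# so tissue at 150 HU and denser material at 400 HU become identical.
print(hu_to_uint8(150.0), hu_to_uint8(400.0))  # 255 255
print(hu_to_uint8(-1350.0))                    # 0
```
      </preformat>
      <p>With this assumed window, 1500 Hounsfield units are spread over 256 gray levels, so each level covers about 6 units and all densities outside the window are lost. Working directly with Hounsfield units would avoid both effects.</p>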
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>In this paper, we presented the use of transfer learning to quickly train a CNN that
achieved the best accuracy and second-best AUC in the ImageCLEF 2018 Tuberculosis
- MDR detection task [7]. It also achieved better results than all submissions to the
ImageCLEF 2017 Tuberculosis - MDR detection task.</p>
      <ref-list>
        <ref id="ref6">
          <mixed-citation>6. Ionescu, B., et al., Overview of ImageCLEF 2018: Challenges, Datasets and Evaluation. Proceedings of the Ninth International Conference of the CLEF Association (CLEF 2018), 2018.</mixed-citation>
        </ref>
        <ref id="ref7">
          <mixed-citation>7. Dicente Cid, Y., Liauchuk, V., Kovalev, V., Müller, H., Overview of ImageCLEFtuberculosis 2018 - Detecting multi-drug resistance, classifying tuberculosis type, and assessing severity score. CLEF2018 Working Notes, 2018.</mixed-citation>
        </ref>
        <ref id="ref8">
          <mixed-citation>8. Chung, M.J., et al., Drug-sensitive tuberculosis, multidrug-resistant tuberculosis, and nontuberculous mycobacterial pulmonary disease in non-AIDS adults: comparisons of thin-section CT findings. Eur Radiol, 2006. 16(9): p. 1934-41.</mixed-citation>
        </ref>
        <ref id="ref9">
          <mixed-citation>9. Howard, J., and others, fastai. GitHub, 2018.</mixed-citation>
        </ref>
        <ref id="ref10">
          <mixed-citation>10. Xie, S., Girshick, R.B., Dollár, P., Tu, Z., He, K., Aggregated Residual Transformations for Deep Neural Networks. CoRR, 2016. abs/1611.05431.</mixed-citation>
        </ref>
        <ref id="ref11">
          <mixed-citation>11. Gao, X.W. and Y. Qian, Prediction of Multidrug-Resistant TB from CT Pulmonary Images Based on Deep Learning Techniques. Mol Pharm, 2018.</mixed-citation>
        </ref>
        <ref id="ref12">
          <mixed-citation>12. Sun, J., Chong, P., Tan, Y.X.M., Binder, A., ImageCLEF 2017: ImageCLEF tuberculosis task - the SGEast submission. CLEF2017 Working Notes. CEUR Workshop Proceedings, 2017.</mixed-citation>
        </ref>
      </ref-list>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , et al.,
          <article-title>Primary multidrug-resistant tuberculosis versus drug-sensitive tuberculosis in non-HIV-infected patients: Comparisons of CT findings</article-title>
          .
          <source>PLoS One</source>
          ,
          <year>2017</year>
          .
          <volume>12</volume>
          (
          <issue>6</issue>
          ): p.
          <fpage>e0176354</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Kahkouee</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , et al.,
          <article-title>Multidrug resistant tuberculosis versus non-tuberculous mycobacterial infections: a CT-scan challenge</article-title>
          .
          <source>Braz J Infect Dis</source>
          ,
          <year>2013</year>
          .
          <volume>17</volume>
          (
          <issue>2</issue>
          ): p.
          <fpage>137</fpage>
          -
          <lpage>42</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>E.S.</given-names>
          </string-name>
          , et al.,
          <article-title>Computed tomography features of extensively drug-resistant pulmonary tuberculosis in non-HIV-infected patients</article-title>
          .
          <source>J Comput Assist Tomogr</source>
          ,
          <year>2010</year>
          .
          <volume>34</volume>
          (
          <issue>4</issue>
          ): p.
          <fpage>559</fpage>
          -
          <lpage>63</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Yeom</surname>
            ,
            <given-names>J.A.</given-names>
          </string-name>
          , et al.,
          <article-title>Imaging findings of primary multidrug-resistant tuberculosis: a comparison with findings of drug-sensitive tuberculosis</article-title>
          .
          <source>J Comput Assist Tomogr</source>
          ,
          <year>2009</year>
          .
          <volume>33</volume>
          (
          <issue>6</issue>
          ): p.
          <fpage>956</fpage>
          -
          <lpage>60</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Cha</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , et al.,
          <article-title>Radiological findings of extensively drug-resistant pulmonary tuberculosis in non-AIDS adults: comparisons with findings of multidrug-resistant and drug-sensitive tuberculosis</article-title>
          .
          <source>Korean J Radiol</source>
          ,
          <year>2009</year>
          .
          <volume>10</volume>
          (
          <issue>3</issue>
          ): p.
          <fpage>207</fpage>
          -
          <lpage>16</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>