<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Multi-Classification Study of the Tuberculosis with 3D CBAM-ResNet and EfficientNet</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Xing Lu</string-name>
          <email>lvxingvir@gmail.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Eric Y Chang</string-name>
          <email>e8chang@health.ucsd.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Chun-Nan Hsu</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jiang Du</string-name>
          <email>jiangdu@health.ucsd.edu</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Amilcare Gentili</string-name>
          <email>agentili@ucsd.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>San Diego VA Health Care System</institution>
          ,
          <addr-line>San Diego, CA</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of California</institution>
          ,
          <addr-line>San Diego, CA</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The detection and characterization of tuberculosis, including the evaluation of lesion characteristics, are challenging. To provide a solution for this multi-classification task, we performed a deep learning study that relied on the use of 3D ResNet and EfficientNet. With proper application of the provided masks, lung images were cropped, masked, and rearranged with different windowings. Stratified sampling for the train/validation split and a balanced sampler within each batch during training were used to address the data imbalance problem. A convolutional block attention module (CBAM) was used to add an attention mechanism to each block of the ResNet to further improve the performance of the convolutional neural network (CNN).</p>
      </abstract>
      <kwd-group>
        <kwd>Tuberculosis</kwd>
        <kwd>Computed Tomography</kwd>
        <kwd>Image Classification</kwd>
        <kwd>Tuberculosis Type</kwd>
        <kwd>CBAM</kwd>
        <kwd>EfficientNet</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
    </sec>
    <sec id="sec-2">
      <title>2. Methods</title>
      <p>2.1. Data
The training set provided for the tuberculosis task contained a total of 917 patients,
with labels provided for five categories. To avoid bias in the training and validation cohorts, a
balanced train/validation strategy was employed to split each class according to an 8:2 ratio, as
shown in Figure 1. Figure 1(a) shows the results of a random train/validation split, whereas
Figure 1(b) shows the balanced split. As can be seen in Figure 1(a), the fibro-cavernous class
had only a few examples in the validation dataset, whereas the balanced split shown in
Figure 1(b) gave each class a similar distribution across the training and validation datasets.</p>
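      <p>The stratified split described above can be sketched in plain Python (the function name, seed, and default ratio are illustrative, not taken from the original code):</p>
      <preformat>
```python
import random
from collections import defaultdict

def stratified_split(labels, val_frac=0.2, seed=0):
    """Split sample indices so each class keeps roughly an 8:2 train/val ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train, val = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        n_val = max(1, round(val_frac * len(idxs)))
        val.extend(idxs[:n_val])
        train.extend(idxs[n_val:])
    return sorted(train), sorted(val)
```
      </preformat>
      <p>Because every class contributes proportionally to the validation set, a rare class such as fibro-cavernous is guaranteed at least one validation example.</p>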
      <sec id="sec-2-1">
        <title>2.2. Preprocessing</title>
        <p>
          The preprocessing of the images for the deep learning model is shown in Figure 2. The images for
the ImageCLEF TB task were provided as NIFTI 3D datasets. Two versions of lung segmentation
masks were also provided. The first version of segmentation (denoted as Mask-1) provided
more accurate masks, containing individual masks for left and right laterality (values equal 1
for left and 2 for right), but in the most severe TB cases, there was a tendency to miss large
abnormal regions in the lungs [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. On the other hand, the second segmentation (denoted as
Mask-2) provided less precise boundaries, given that it contained the entire lung area (i.e., both
left and right sides of the lung), but was more stable in terms of including lesion areas [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. As
there was no need to locate lesions in terms of lung side, only Mask-2 was used in this study.
        </p>
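        <p>As a minimal sketch of how Mask-2 can be applied, the volume can be multiplied by the mask and cropped to the mask's bounding box (the function name and array layout are assumptions, not the authors' code):</p>
        <preformat>
```python
import numpy as np

def crop_to_mask(volume, mask):
    """Zero out voxels outside the lung mask and crop to its bounding box."""
    coords = np.argwhere(mask)       # voxel coordinates inside the mask
    lo = coords.min(axis=0)
    hi = coords.max(axis=0) + 1
    region = tuple(slice(a, b) for a, b in zip(lo, hi))
    return volume[region] * mask[region]
```
        </preformat>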
        <p>
          As shown in Fig. 2, the original NIFTI-formatted dataset was transformed into image data
by first applying the NiBabel package. Next, the reformatted images were adjusted to three
different window levels, namely baseline, lung, and soft tissue, and then normalized. For the baseline
window level, the foreground was obtained via the Otsu thresholding algorithm provided in the
OpenCV package; for the lung and soft-tissue windows, the image levels were set as [-600, 1500] and [50, 350],
respectively. Then, images were normalized to [0, 1] with their mean and standard deviation values. Finally, all
three windows and levels of data were saved, and annotation files were rearranged for use in
further training.
        </p>
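        <p>The windowing and normalization steps can be sketched as follows (helper names are assumed, and the bracketed pairs are treated as [lower, upper] intensity bounds as given in the text):</p>
        <preformat>
```python
import numpy as np

def apply_window(volume_hu, lower, upper):
    """Clip a CT volume to the given intensity window and scale it to [0, 1]."""
    vol = np.clip(volume_hu.astype(np.float64), lower, upper)
    return (vol - lower) / (upper - lower)

def standardize(volume):
    """Normalize a volume with its own mean and standard deviation."""
    return (volume - volume.mean()) / (volume.std() + 1e-8)

# lung and soft-tissue windows from the text
lung_window = lambda v: apply_window(v, -600, 1500)
soft_tissue_window = lambda v: apply_window(v, 50, 350)
```
        </preformat>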
      </sec>
      <sec id="sec-2-2">
        <title>2.3. Network and Training</title>
        <p>
          In this study, a 3D convolutional block attention module (CBAM)-ResNet and a 3D EfficientNet
were employed to train the model for 5-class classification based on the PyTorch framework.
Similar to our last year’s work [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], a standard 3D ResNet-34 [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] was used as the convolutional
neural network (CNN) backbone, with three fully connected (fc) layers as the classifier. CBAM [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] was used to
implement channel and spatial attention mechanisms in each block of the ResNet, and sigmoid
was used as the activation function for the attention maps. Given our computing
resources, EfficientNet-B5 was the optimal 3D EfficientNet for training [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ].
        </p>
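        <p>For reference, CBAM's channel-attention branch applies a shared MLP to average- and max-pooled descriptors and gates the feature map with a sigmoid. A NumPy sketch for a single 3D feature map follows (weight shapes and names are illustrative, not the trained model's):</p>
        <preformat>
```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w_down, w_up):
    """CBAM-style channel gate for one feature map of shape (C, D, H, W)."""
    c = feat.shape[0]
    flat = feat.reshape(c, -1)
    avg_desc = flat.mean(axis=1)                        # global average pooling
    max_desc = flat.max(axis=1)                         # global max pooling
    mlp = lambda d: w_up @ np.maximum(w_down @ d, 0.0)  # shared two-layer MLP
    gate = sigmoid(mlp(avg_desc) + mlp(max_desc))       # (C,) attention weights
    return feat * gate.reshape(c, 1, 1, 1)
```
        </preformat>
        <p>The spatial-attention branch is analogous but pools over the channel axis and convolves the pooled maps before the sigmoid gate.</p>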
        <p>To train the neural networks, we used a workstation with 4 Nvidia GTX 1080 Ti video cards,
128 GB of RAM, and a 1 TB solid-state drive. During the training process, to avoid overfitting,
image augmentation and a balanced sampler were implemented in each batch. For the image
augmentation, traditional data augmentation methods, including brightness adjustment, shear, scale, and
flip, were applied. The balanced sampler strategy, which equalized the data sampled from all
five classes in each batch, was adopted during the training process.</p>
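        <p>The per-batch balanced sampler can be sketched as below (class quota and names are illustrative); each batch draws the same number of samples from every class, oversampling rare classes with replacement:</p>
        <preformat>
```python
import random
from collections import defaultdict

class BalancedBatchSampler:
    """Yield batches containing an equal number of samples from each class."""

    def __init__(self, labels, per_class=2, seed=0):
        self.rng = random.Random(seed)
        self.per_class = per_class
        self.by_class = defaultdict(list)
        for idx, label in enumerate(labels):
            self.by_class[label].append(idx)

    def next_batch(self):
        batch = []
        for idxs in self.by_class.values():
            # sample with replacement so small classes can fill their quota
            batch.extend(self.rng.choices(idxs, k=self.per_class))
        self.rng.shuffle(batch)
        return batch
```
        </preformat>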
      </sec>
      <sec id="sec-2-3">
        <title>2.4. Experiments and Model Selection</title>
        <p>Three experiments were conducted during model training. For the 3D ResNet, the datasets
fed into the network were dynamically generated from saved metadata, with the different window
levels each serving as a single channel, and were interpolated to two input sizes: 3 × 64 × 256 × 256
and 3 × 16 × 384 × 384. For the 3D EfficientNet, only 3 × 64 × 256 × 256 was used. For each experiment,
the model was trained for 60 epochs with a cosine annealing warm-up learning rate.
To find the best model for each experiment during training, the epochs with either the minimum loss
or the highest accuracy were selected and saved for further submission.</p>
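        <p>A common form of the cosine annealing schedule with warm-up is sketched below (the base learning rate and warm-up length are illustrative assumptions; the exact hyperparameters are not specified in the text):</p>
        <preformat>
```python
import math

def cosine_warmup_lr(epoch, total_epochs=60, warmup_epochs=5, base_lr=1e-3):
    """Linear warm-up followed by cosine annealing toward zero."""
    if epoch >= warmup_epochs:
        progress = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs)
        return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))
    return base_lr * (epoch + 1) / warmup_epochs
```
        </preformat>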
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Results and Submissions</title>
      <p>The provided TST dataset included 421 image files for testing. With our preprocessing pipeline,
the TST data were cropped according to Mask-2 to generate calibrated image files. After
evaluation with the trained models, the results were rearranged according to the submission requirements and
saved as a .txt file to be submitted. As mentioned in the Methods, we saved six models with
different metrics for evaluating the TST datasets; their performances are displayed in Table 1.</p>
      <p>Per the submitted results, 3D ResNet-34 achieved both better accuracy and a better Kappa than
EfficientNet-B5. For 3D ResNet-34, the model with tensor size 3 × 64 × 256 × 256 selected by minimum
loss achieved the best Kappa of 0.190 and an accuracy of 0.371, under the submission name 154940_loss.
We also tried to ensemble the results from different models into a single submission, 137652, but the
result was not significantly improved.</p>
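      <p>As a hypothetical sketch of the ensembling step (the actual combination rule used for submission 137652 is not described in the text), a simple majority vote over per-model predictions looks like this:</p>
      <preformat>
```python
from collections import Counter

def ensemble_vote(model_preds):
    """Majority vote across models; ties go to the earliest-listed model."""
    n_cases = len(model_preds[0])
    fused = []
    for i in range(n_cases):
        votes = [preds[i] for preds in model_preds]
        # Counter preserves first-seen order, so sorting by count is tie-stable
        fused.append(Counter(votes).most_common(1)[0][0])
    return fused
```
      </preformat>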
    </sec>
    <sec id="sec-4">
      <title>4. Discussion and Conclusion</title>
      <p>To provide a deep learning solution for the multi-classification task of tuberculosis, we performed
experiments using 3D CBAM-ResNet and 3D EfficientNet as CNN backbones. There were several
challenges for this task, such as the severe class imbalance and 3D dimensionality of the CT
images, so we tried several techniques to improve the models’ performance. First, we properly
applied stratified sampling of each class for the train/validation split to mitigate bias in the
training and validation cohorts. Furthermore, a balanced sampler in each batch sampler was
used to address the data imbalance problem. Second, CBAM was used to add an attention
mechanism to each block of the Resnet to further improve the performance of the CNN. Third,
different windowings of the CT images were concatenated to further focus the CNN on features
of the disease, as suggested by a radiologist. Using all the aforementioned techniques, we achieved a
kappa of 0.190 in the evaluation of the test dataset and placed third in this competition.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Acknowledgments</title>
      <p>This work was supported in part by the Office of the Assistant Secretary of Defense for Health
Affairs through the Accelerating Innovation in Military Medicine Program under Award No.
W81XWH-20-1-0693.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>B.</given-names>
            <surname>Ionescu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Müller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Peteri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. Ben</given-names>
            <surname>Abacha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sarrouti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Demner-Fushman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Hasan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kozlovski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Liauchuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Dicente</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Kovalev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Pelka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. G. S.</given-names>
            <surname>de Herrera</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Jacutprakart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. M.</given-names>
            <surname>Friedrich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Berari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Tauteanu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Fichou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Brie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dogariu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. D.</given-names>
            <surname>Ştefan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. G.</given-names>
            <surname>Constantin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chamberlain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Campello</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Clark</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. A.</given-names>
            <surname>Oliver</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Moustahfid</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Popescu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Deshayes-Chossart</surname>
          </string-name>
          ,
          <article-title>Overview of the ImageCLEF 2021: Multimedia retrieval in medical, nature, internet and social media applications</article-title>
          , in: Experimental IR Meets Multilinguality, Multimodality, and Interaction,
          <source>Proceedings of the 12th International Conference of the CLEF Association (CLEF</source>
          <year>2021</year>
          ),
          <source>LNCS Lecture Notes in Computer Science</source>
          , Springer, Bucharest, Romania,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S.</given-names>
            <surname>Kozlovski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Liauchuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Dicente Cid</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Kovalev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Müller</surname>
          </string-name>
          , Overview of ImageCLEFtuberculosis 2021 -
          <article-title>CT-based tuberculosis type classification</article-title>
          ,
          <source>in: CLEF2021 Working Notes, CEUR Workshop Proceedings</source>
          , CEUR-WS.org &lt;http://ceur-ws.org&gt;, Bucharest, Romania,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Dicente Cid</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O. A.</given-names>
            <surname>Jiménez del Toro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Depeursinge</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Müller</surname>
          </string-name>
          ,
          <article-title>Efficient and fully automatic segmentation of the lungs in CT volumes</article-title>
          , in:
          <string-name>
            <given-names>O.</given-names>
            <surname>Goksel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O. A.</given-names>
            <surname>Jiménez del Toro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Foncubierta-Rodríguez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Müller</surname>
          </string-name>
          (Eds.),
          <source>Proceedings of the VISCERAL Anatomy Grand Challenge at the 2015 IEEE ISBI, CEUR Workshop Proceedings</source>
          , CEUR-WS.org &lt;http://ceur-ws.org&gt;,
          <year>2015</year>
          , pp.
          <fpage>31</fpage>
          -
          <lpage>35</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>V.</given-names>
            <surname>Liauchuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Kovalev</surname>
          </string-name>
          ,
          ImageCLEF
          <year>2017</year>
          :
          <article-title>Supervoxels and co-occurrence for tuberculosis CT image classification</article-title>
          ,
          <source>in: CLEF2017 Working Notes, CEUR Workshop Proceedings</source>
          , CEUR-WS.org &lt;http://ceur-ws.org&gt;, Dublin, Ireland,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>X.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. Y.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Hsu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Du</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gentili</surname>
          </string-name>
          ,
          <article-title>Imageclef2020: Laterality-reduction three-dimensional cbam-resnet with balanced sampler for multi-binary classification of tuberculosis and CT auto reports</article-title>
          , in:
          <string-name>
            <given-names>L.</given-names>
            <surname>Cappellato</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Eickhoff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Névéol</surname>
          </string-name>
          (Eds.), Working Notes of CLEF 2020 -
          <article-title>Conference and Labs of the Evaluation Forum</article-title>
          , Thessaloniki, Greece,
          <source>September 22-25</source>
          ,
          <year>2020</year>
          , volume
          <volume>2696</volume>
          <source>of CEUR Workshop Proceedings, CEUR-WS.org</source>
          ,
          <year>2020</year>
          . URL: http://ceur-ws.org/Vol-2696/paper_70.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>K.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <article-title>Deep residual learning for image recognition</article-title>
          ,
          <year>2015</year>
          . arXiv:1512.03385.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Woo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Park</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-Y.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. S.</given-names>
            <surname>Kweon</surname>
          </string-name>
          ,
          <article-title>CBAM: Convolutional block attention module</article-title>
          ,
          <year>2018</year>
          . arXiv:1807.06521.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M.</given-names>
            <surname>Tan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q. V.</given-names>
            <surname>Le</surname>
          </string-name>
          ,
          <article-title>EfficientNet: Rethinking model scaling for convolutional neural networks</article-title>
          ,
          <year>2020</year>
          . arXiv:1905.11946.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>