<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
<article-title>Categorizing Tuberculosis Cases Using Normalization and Pseudo-color CT Images</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Tetsuya Asakawa</string-name>
          <email>asakawa@kde.cs.tut.ac.jp</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Riku Tsuneda</string-name>
          <email>tsuneda@kde.cs.tut.ac.jp</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kazuki Shimizu</string-name>
          <email>shimizu@heart-center.or.jp</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Takuyuki Komoda</string-name>
          <email>komoda@heart-center.or.jp</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Masaki Aono</string-name>
          <email>aono@tut.jp</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Toyohashi Heart center</institution>
          ,
          <addr-line>21-1 Gobutori Tenpaku, Oyama, Toyohashi, Aichi</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Toyohashi University of Technology</institution>
          ,
          <addr-line>Toyohashi, Aichi</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The ImageCLEF 2021 Tuberculosis task is an example of a challenging research problem in the field of computed tomography (CT) image analysis. The purpose of this study is to make accurate estimates for five labels (infiltrative, focal, tuberculoma, miliary, and fibrocavernous) based on lung images. We describe the tuberculosis task and approach for chest CT image analysis and then perform a single-label CT image analysis using the task dataset. We propose an image processing and fine-tuning deep neural network model that uses inputs from convolutional neural network features. This paper presents several approaches for applying normalization and pseudo-color to the extracted 2D images, for applying mask data to the extracted 2D image data, and for extracting a set of 2D projection images based on the 3D chest CT data. Our submissions for the task test dataset achieved an unweighted Cohen's kappa of 0.117 and an accuracy of 0.382.</p>
      </abstract>
      <kwd-group>
        <kwd>Tuberculosis</kwd>
        <kwd>Deep Learning</kwd>
        <kwd>Normalization</kwd>
        <kwd>Pseudo-color</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>With the spread of various diseases (e.g., tuberculosis (TB), COVID-19, and influenza), medical
research has been performed to develop and implement the necessary treatments for these diseases.
However, no method is currently available to identify such diseases early. An early diagnosis method is
needed to provide timely treatment, develop specific medicines, and prevent the deaths of patients.</p>
      <p>Accordingly, a significant amount of effort has been invested in medical image analysis research in
recent years. In fact, a task dedicated to TB has been part of the ImageCLEF evaluation campaign for the
last five years [1][2][3][4][5]. In ImageCLEF 2021, the main task [6], “ImageCLEFmed Tuberculosis,”
is treated as a computed tomography (CT) report generation problem. The goal of this subtask is to
automatically categorize each TB case into one of the following five types: infiltrative, focal,
tuberculoma, miliary, or fibrocavernous. Accordingly, the goal of this study is to automatically
categorize the TB type from 3D CT images of TB patients.</p>
      <p>In this paper, we employ a new fine-tuning neural network model that uses features extracted by
pretrained convolutional neural network (CNN) models as input. The existing CNN models had weak
classification performance; therefore, we add two new fully connected layers. The new contributions of
this paper are the proposal of novel feature-building techniques, the incorporation of features from the
proposed CNN model, and the use of several forms of pre-processing to predict TB from the images.
In Section 2, we describe the conducted task and the ImageCLEF 2021 dataset. In Section 3, we
introduce the image pre-processing, experimental settings, and features used in this study. In Section 4,
we describe the experiments we performed. In Section 5, we provide our conclusions.</p>
      <p>Copyright 2021 for this paper by its authors.</p>
    </sec>
    <sec id="sec-2">
      <title>2. ImageCLEF 2021 Dataset</title>
      <p>The TB task of the ImageCLEF 2021 Challenge included partial 3D patient chest CT images [7].
The dataset contained chest CT scan imaging data, including 917 images for the training
(development) dataset and 421 images for the test dataset. Some of the scans include additional
metainformation, which may vary depending on data availability for different cases. Each CT image
corresponds to only one TB type. In this edition, each CT scan corresponds to one patient. Using the
CT image data, our goal is to automatically extract and categorize each TB case into one of the following
five types: (1) infiltrative, (2) focal, (3) tuberculoma, (4) miliary, and (5) fibrocavernous. Table 1 lists
the labels for the chest CT scans in the training dataset.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Proposed Method</title>
      <p>We propose a single-label analysis system to predict the TB type from CT scan images. The first
step is input data pre-processing. After introducing our pre-processing of the input data, we describe
our deep neural network model, which enables single-label outputs given the CT scan images. In
addition, optionally in the first step, we can use a CT scan movie instead of CT scan images. We detail
our proposed system in the following subsections.</p>
    </sec>
    <sec id="sec-4">
      <title>3.1. Input data pre-processing</title>
      <p>The 3D CT scans in the training and test datasets are provided in compressed Nifti format. We
decompressed the files and extracted the slices along the z-axis of the 3D volume, as shown in Fig. 1.
For each Nifti image, we obtained between 110 and 250 slices, depending on the z-dimension of the
volume. After extracting the slices along the z-axis, we filtered the slices of each patient using the
mask1 and mask2 data [8][9]. The mask1 data provide more accurate masks but tend to miss large
abnormal regions of the lungs in the most severe TB cases. The mask2 data provide rougher bounds but
behave more stably in terms of including lesion areas. We then extracted the filtered CT scan images.
We noticed that not all slice content is relevant: the slices include bone, air space, fat, and skin in
addition to the lungs that could help classify the samples. This is why we added a filtering step and
selected a number of slices per patient. We call these data the applying-mask CT data.</p>
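      <p>The masking and slice-selection steps above can be sketched as follows. This is a toy illustration with hypothetical helper names and a small nested-list stand-in for the 3D volume; in practice the compressed Nifti files would be read with a library such as nibabel:</p>

```python
# Illustrative sketch of per-slice masking (hypothetical helper names).
# A nested list stands in for the 3D CT volume loaded from a Nifti file.

def apply_mask(slice_2d, mask_2d):
    """Keep only voxels where the lung mask is 1; zero out everything else."""
    return [
        [v if m == 1 else 0 for v, m in zip(row, mrow)]
        for row, mrow in zip(slice_2d, mask_2d)
    ]

def extract_masked_slices(volume, mask, min_lung_pixels=1):
    """Slice the volume along the z-axis and keep slices with enough lung area."""
    kept = []
    for slice_2d, mask_2d in zip(volume, mask):
        lung_pixels = sum(sum(row) for row in mask_2d)
        if lung_pixels >= min_lung_pixels:
            kept.append(apply_mask(slice_2d, mask_2d))
    return kept

# Toy volume: two 2x2 axial slices; the second slice has no lung voxels
# and is therefore dropped by the selection step.
volume = [[[10, 20], [30, 40]], [[50, 60], [70, 80]]]
mask   = [[[1, 0], [0, 1]],     [[0, 0], [0, 0]]]
slices = extract_masked_slices(volume, mask)
```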
      <p>In addition, as shown in Fig. 2, we applied normalization to the applying-mask CT data. We call
these data the normalization mask CT data.</p>
      <p>Furthermore, as shown in Fig. 3, we applied pseudo-color to the normalization mask CT data. We
call these data the pseudo-color CT data.</p>
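      <p>A minimal sketch of these two transforms, assuming a fixed Hounsfield-unit window and a simple toy colormap (both the window bounds and the colormap are our assumptions, not the exact values used in the paper):</p>

```python
# Normalization maps raw CT intensities in a fixed window to 8-bit gray
# levels; pseudo-color then maps each gray level to an RGB triple using
# a simple blue-to-red ramp (illustrative colormap only).

LO, HI = -1000.0, 400.0   # assumed Hounsfield-unit window for lung tissue

def normalize(value, lo=LO, hi=HI):
    """Clip to [lo, hi] and rescale to the range 0..255."""
    clipped = min(max(value, lo), hi)
    return int(round(255.0 * (clipped - lo) / (hi - lo)))

def pseudo_color(gray):
    """Map an 8-bit gray level to an (R, G, B) triple."""
    r = gray
    g = min(2 * gray, 2 * (255 - gray), 255)
    b = 255 - gray
    return (r, g, b)
```

      <p>Applying these two functions pixel-wise to each masked slice yields the normalization mask CT data and the pseudo-color CT data, respectively.</p>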
    </sec>
    <sec id="sec-5">
      <title>3.2. Proposed deep neural network model</title>
      <p>To solve this single-label problem, we propose fine-tuning neural network models whose inputs
come from end-to-end CNN features.</p>
    </sec>
    <sec id="sec-6">
      <title>3.2.1. Training and Validation sets</title>
      <p>The training dataset consists of 107,955 and 105,494 images extracted along the z-axis from the
applying-mask1 and applying-mask2 CT datasets, respectively.</p>
      <p>We divided the training dataset at random into training and validation datasets with a ratio of 8:2.
The CNN features were extracted using pre-trained CNN-based neural networks, including EfficientNet
B05. To deal with the above features, we propose a deep neural network architecture.</p>
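      <p>The random 8:2 split can be sketched as follows (illustrative only; the helper name and seed are assumptions):</p>

```python
# Minimal sketch of the random 8:2 train/validation split described above.
import random

def split_dataset(items, train_ratio=0.8, seed=0):
    """Shuffle a list of samples and split it into train and validation parts."""
    items = list(items)
    random.Random(seed).shuffle(items)
    cut = int(len(items) * train_ratio)
    return items[:cut], items[cut:]

# Ten placeholder sample IDs split 8:2.
train, val = split_dataset(range(10))
```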
      <p>Our system incorporates CNN features, which can be extracted using deep CNNs pre-trained on
ImageNet [10], such as EfficientNet B05 [11]. Because of the limited amount of training data, we
adopted transfer learning for feature extraction to prevent overfitting. We reduced the dimensions of
the fully connected layers used in the CNN models and extracted a 2048-dimensional feature vector.</p>
    </sec>
    <sec id="sec-7">
      <title>3.2.2. Training and Validation sets and Test data</title>
      <p>We employed the unweighted Cohen’s kappa and accuracy to evaluate the fine-tuning of the above CNN model.</p>
      <p>As illustrated in Fig. 4, the CNN features are combined into an integrated feature as a linearly
weighted average, where w3 denotes the weight for the CNN features. The CNN features are passed
through “Fusion” processing to generate the integrated feature, followed by a “softmax” activation
function.</p>
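      <p>A small sketch of our reading of this fusion step, with illustrative weight values (the exact weighting scheme is not fully specified in the text):</p>

```python
# Weighted-average fusion of CNN feature vectors followed by a softmax.
import math

def fuse(feature_vectors, weights):
    """Linearly weighted average of same-length feature vectors."""
    total = sum(weights)
    dim = len(feature_vectors[0])
    return [
        sum(w * f[i] for w, f in zip(weights, feature_vectors)) / total
        for i in range(dim)
    ]

def softmax(vec):
    """Numerically stable softmax over a list of scores."""
    m = max(vec)
    exps = [math.exp(v - m) for v in vec]
    s = sum(exps)
    return [e / s for e in exps]

# Two toy 2-dimensional feature vectors fused with equal weights.
fused = fuse([[1.0, 2.0], [3.0, 4.0]], weights=[0.5, 0.5])
probs = softmax(fused)
```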
    </sec>
    <sec id="sec-8">
      <title>3.3. Single-label probability</title>
      <p>We propose the method illustrated in Algorithm 1. The input is a collection of features extracted
from each image with K types of diseases, while the output is a K-dimensional one-hot vector.</p>
      <p>In Algorithm 1, we assume that the extracted CNN features are represented by their probabilities.
For each TB case, we sum the features and then take the median of the result, which is denoted as Tik in
Algorithm 1. In short, the vector Si represents the output one-hot vector. We repeat this computation
until all the test (unknown) images are processed.</p>
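      <p>One plausible reading of Algorithm 1 can be sketched as follows; the exact median-thresholding rule is our assumption:</p>

```python
# Per-slice class probabilities are summed per class (T_ik), classes whose
# sum falls below the median are suppressed, and the argmax of the remainder
# yields a K-dimensional one-hot vector (S_i) for the case.
from statistics import median

def predict_one_hot(slice_probs):
    """slice_probs: list of per-slice probability vectors over K classes."""
    k = len(slice_probs[0])
    sums = [sum(p[i] for p in slice_probs) for i in range(k)]   # T_ik
    thresh = median(sums)
    gated = [s if s >= thresh else 0.0 for s in sums]
    best = max(range(k), key=lambda i: gated[i])
    return [1 if i == best else 0 for i in range(k)]            # S_i

# Two slices, three classes: class 1 dominates, so S_i = [0, 1, 0].
probs = [[0.1, 0.7, 0.2], [0.2, 0.5, 0.3]]
one_hot = predict_one_hot(probs)
```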
    </sec>
    <sec id="sec-9">
      <title>4. Experiments</title>
      <p>4.1. Unweighted Cohen’s kappa and accuracy of the training and validation sets</p>
      <p>The training dataset consists of the applying-mask1 and mask2 CT data and the normalization
mask1 and mask2 CT data. It contains 107,955 and 105,494 images extracted from the mask1 and
mask2 CT data, respectively.</p>
      <p>Here, we divided the filtered data into training and validation datasets with a ratio of 8:2. We set
the following hyper-parameters: the batch size is 256; the optimizer is stochastic gradient descent with
a learning rate of 0.001 and a momentum of 0.9; and the number of epochs is 200. For the
implementation, we employed TensorFlow [12] as our deep learning framework.</p>
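      <p>The stated optimizer settings correspond to the classic SGD-with-momentum update rule, sketched here on a single parameter:</p>

```python
# One SGD-with-momentum step with the stated hyper-parameters:
# v = momentum * v - lr * grad;  w = w + v.

LR, MOMENTUM = 0.001, 0.9

def sgd_momentum_step(w, grad, velocity, lr=LR, momentum=MOMENTUM):
    """Update one parameter and its velocity; returns (new_w, new_velocity)."""
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity

# A single step from w = 1.0 with gradient 0.5 and zero initial velocity.
w, v = sgd_momentum_step(w=1.0, grad=0.5, velocity=0.0)
```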
      <p>For the evaluation of the single-label classification, we employed the unweighted Cohen’s kappa
and the accuracy. Table 2 shows the results. Finally, we employed EfficientNet B05 for the training and
validation datasets and the test data. The results are given in Section 4.2.</p>
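      <p>For reference, the unweighted Cohen’s kappa used throughout can be computed as follows: it is the observed agreement corrected for the agreement expected by chance.</p>

```python
# Unweighted Cohen's kappa between two label sequences:
# kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
# and p_e is the chance agreement from the marginal label frequencies.

def cohens_kappa(y_true, y_pred):
    """Unweighted Cohen's kappa between two equal-length label lists."""
    n = len(y_true)
    labels = sorted(set(y_true) | set(y_pred))
    p_o = sum(1 for t, p in zip(y_true, y_pred) if t == p) / n
    p_e = sum(
        (y_true.count(c) / n) * (y_pred.count(c) / n) for c in labels
    )
    return (p_o - p_e) / (1.0 - p_e)

# Small worked example: 3/4 observed agreement, 0.5 chance agreement.
kappa = cohens_kappa([0, 0, 1, 1], [0, 0, 1, 0])
```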
    </sec>
    <sec id="sec-10">
      <title>4.2. Results for the training and validation datasets and the test data using our proposed model</title>
      <p>The test dataset consisted of 59,835 and 60,758 images extracted from the applying-mask1 and
mask2 CT data, respectively, as shown in Table 3.</p>
      <p>It is likely that our proposed models would give better results with more advanced data pre-processing,
including the use of several types of CT images and data augmentation. As described above, we
employed fine-tuned CNN models based on EfficientNet B05 with several pre-processing methods.</p>
      <p>Table 4 shows the results. Here, we compare the results in terms of the unweighted Cohen’s kappa
and the accuracy. For mask1 with normalization on fine-tuned EfficientNet B05, our proposed CNN
model achieves good values of the unweighted Cohen’s kappa and accuracy.</p>
      <p>In addition, the results of the other participants’ submissions, with their unweighted Cohen’s kappa
and accuracy, are shown in Table 5.</p>
      <p>For our team, KDE-lab, our proposed CNN model has the best unweighted Cohen’s kappa and
accuracy.</p>
      <p>The results achieved by our submissions rank well compared to those at the top of the list given in
Table 5. Note that several runs in the table belong to the same teams and likely do not differ
significantly. Our model ranks 8th in terms of the unweighted Cohen’s kappa and 7th in terms of the
accuracy.</p>
    </sec>
    <sec id="sec-11">
      <title>5. Conclusions</title>
      <p>In this study, we proposed image pre-processing and a CNN model for predicting five labels
(infiltrative, focal, tuberculoma, miliary, and fibrocavernous) from chest CT images. We performed a
lung CT image analysis in which we proposed a deep neural network model that enabled the inputs to
be derived from the CNN features. To predict the five labels, we introduced a threshold-based
single-label prediction algorithm.</p>
      <p>Specifically, after training our deep neural network using the pre-processed images, we were able to
predict the categories of the five types of TB cases from unknown CT scan images. The experimental
results demonstrate that our proposed models outperform some models in terms of the unweighted
Cohen’s kappa and the accuracy. For the unweighted Cohen’s kappa, our model achieved a good value.
As a consequence, we believe that using normalization to pre-process an image is effective.</p>
      <p>In the future, we would like to handle arbitrary X-ray, CT, echo, or magnetic resonance imaging
images by learning the optimal weights for the neural networks. Moreover, we hope our proposed model
will encourage further research into the early detection of diseases (such as TB, COVID-19, and
influenza) or unknown diseases.</p>
    </sec>
    <sec id="sec-12">
      <title>6. Acknowledgment</title>
      <p>A part of this research was carried out with the support of the Grant for Toyohashi Heart Center
Smart Hospital Joint Research Course and the Grant for Education and Research in Toyohashi
University of Technology.</p>
    </sec>
    <sec id="sec-13">
      <title>7. References</title>
      <p>[1] Yashin Dicente Cid, Alexander Kalinovsky, Vitali Liauchuk, Vassili Kovalev, and Henning Müller.
Overview of ImageCLEFtuberculosis 2017 - predicting tuberculosis type and drug resistances. In
CLEF2017 Working Notes, CEUR Workshop Proceedings, Dublin, Ireland, September 11-14 2017.
CEUR-WS.org &lt;http://ceur-ws.org&gt;.</p>
      <p>[2] Bogdan Ionescu, Henning Müller, Mauricio Villegas, Alba García Seco de Herrera, Carsten Eickhoff,
Vincent Andrearczyk, Yashin Dicente Cid, Vitali Liauchuk, Vassili Kovalev, Sadid A. Hasan, Yuan Ling,
Oladimeji Farri, Joey Liu, Matthew Lungren, Duc-Tien Dang-Nguyen, Luca Piras, Michael Riegler,
Liting Zhou, Mathias Lux, and Cathal Gurrin. Overview of ImageCLEF 2018: Challenges, datasets and
evaluation. In Experimental IR Meets Multilinguality, Multimodality, and Interaction, Proceedings of
the Ninth International Conference of the CLEF Association (CLEF 2018), Avignon, France, September
10-14 2018. LNCS Lecture Notes in Computer Science, Springer.</p>
      <p>[3] Bogdan Ionescu, Henning Müller, Renaud Péteri, Yashin Dicente Cid, Vitali Liauchuk, Vassili
Kovalev, Dzmitri Klimuk, Aleh Tarasau, Asma Ben Abacha, Sadid A. Hasan, Vivek Datla, Joey Liu,
Dina Demner-Fushman, Duc-Tien Dang-Nguyen, Luca Piras, Michael Riegler, Minh-Triet Tran,
Mathias Lux, Cathal Gurrin, Obioma Pelka, Christoph M. Friedrich, Alba García Seco de Herrera,
Narciso Garcia, Ergina Kavallieratou, Carlos Roberto del Blanco, Carlos Cuevas Rodríguez, Nikos
Vasillopoulos, Konstantinos Karampidis, Jon Chamberlain, Adrian Clark, and Antonio Campello.
ImageCLEF 2019: Multimedia Retrieval in Medicine, Lifelogging, Security and Nature. In
Experimental IR Meets Multilinguality, Multimodality, and Interaction, volume 2380 of Proceedings of
the 10th International Conference of the CLEF Association (CLEF 2019), Lugano, Switzerland,
September 9-12 2019. LNCS Lecture Notes in Computer Science, Springer.</p>
      <p>[4] Obioma Pelka, Christoph M. Friedrich, Alba García Seco de Herrera, and Henning Müller. Medical
image understanding: Overview of the ImageCLEFmed 2020 concept prediction task. In CLEF2020
Working Notes, CEUR Workshop Proceedings, Thessaloniki, Greece, September 22-25 2020.
CEUR-WS.org.</p>
      <p>[5] Serge Kozlovski, Vitali Liauchuk, Yashin Dicente Cid, Aleh Tarasau, Vassili Kovalev, and Henning
Müller. Overview of ImageCLEFtuberculosis 2021 - CT-based tuberculosis type classification. In
CLEF2021 Working Notes, CEUR Workshop Proceedings, Bucharest, Romania, September 21-24 2021.
CEUR-WS.org &lt;http://ceur-ws.org&gt;.</p>
      <p>[6] Bogdan Ionescu, Henning Müller, Renaud Péteri, Asma Ben Abacha, Vivek Datla, Sadid A. Hasan,
Dina Demner-Fushman, Serge Kozlovski, Vitali Liauchuk, Yashin Dicente Cid, Vassili Kovalev,
Obioma Pelka, Christoph M. Friedrich, Alba García Seco de Herrera, Van-Tu Ninh, Tu-Khiem Le,
Liting Zhou, Luca Piras, Michael Riegler, Pål Halvorsen, Minh-Triet Tran, Mathias Lux, Cathal Gurrin,
Duc-Tien Dang-Nguyen, Jon Chamberlain, Adrian Clark, Antonio Campello, Dimitri Fichou, Raul
Berari, Paul Brie, Mihai Dogariu, Liviu Daniel Ștefan, and Mihai Gabriel Constantin. Overview of the
ImageCLEF 2020: Multimedia Retrieval in Medical, Lifelogging, Nature, and Internet Applications. In
Experimental IR Meets Multilinguality, Multimodality, and Interaction, volume 12260 of Proceedings
of the 11th International Conference of the CLEF Association (CLEF 2020), Thessaloniki, Greece,
September 22-25 2020. LNCS Lecture Notes in Computer Science, Springer.</p>
      <p>[7] Serge Kozlovski, Vitali Liauchuk, Yashin Dicente Cid, Aleh Tarasau, Vassili Kovalev, and Henning
Müller. Overview of ImageCLEFtuberculosis 2020 - automatic CT-based report generation. In
CLEF2020 Working Notes, CEUR Workshop Proceedings, Thessaloniki, Greece, September 22-25
2020. CEUR-WS.org &lt;http://ceur-ws.org&gt;.</p>
      <p>[8] Yashin Dicente Cid, Oscar Alfonso Jiménez del Toro, Adrien Depeursinge, and Henning Müller.
Efficient and fully automatic segmentation of the lungs in CT volumes. In Orcun Goksel, Oscar Alfonso
Jiménez del Toro, Antonio Foncubierta-Rodríguez, and Henning Müller, editors, Proceedings of the
VISCERAL Anatomy Grand Challenge at the 2015 IEEE ISBI, CEUR Workshop Proceedings, pages
31-35. CEUR-WS, May 2015.</p>
      <p>[9] Vitali Liauchuk and Vassili Kovalev. ImageCLEF 2017: Supervoxels and co-occurrence for
tuberculosis CT image classification. In Linda Cappellato, Nicola Ferro, Lorraine Goeuriot, and Thomas
Mandl, editors, Working Notes of CLEF 2017 - Conference and Labs of the Evaluation Forum, Dublin,
Ireland, September 11-14, 2017, volume 1866 of CEUR Workshop Proceedings. CEUR-WS.org, 2017.</p>
      <p>[10] Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng
Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg, and Li Fei-Fei.
ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision (IJCV),
115(3):211-252, 2015.</p>
      <p>[11] Mingxing Tan and Quoc V. Le. EfficientNet: Rethinking model scaling for convolutional neural
networks. ICML 2019, 05 2019.</p>
      <p>[12] Martín Abadi et al. TensorFlow: Large-scale machine learning on heterogeneous systems, 2015.
Software available from tensorflow.org.</p>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>