<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Feature and Deep Learning Based Approaches for Automatic Report Generation and Severity Scoring of Lung Tuberculosis from CT Images</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Kirill Bogomasov</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Daniel Braun</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andreas Burbach</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ludmila Himmelspach</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Stefan Conrad</string-name>
          <email>stefan.conradg@hhu.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Heinrich-Heine-Universität Düsseldorf, Institut für Informatik, Universitätsstraße 1</institution>
          ,
          <addr-line>40225 Düsseldorf</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The paper presents two approaches for automatic Computed Tomography (CT) report generation and tuberculosis (TB) severity scoring, which were two subtasks of the ImageCLEFtuberculosis 2019 challenge. While our first approach uses image processing techniques for feature extraction from CT scans, our second approach uses artificial neural networks (ANNs) to predict probabilities for different lung irregularities associated with pulmonary tuberculosis and to assess tuberculosis severity. The results show that our feature-based approach is still a competitive method: it achieved rank 3 of 54 in the severity scoring subtask and rank 7 of 35 in the CT report subtask.</p>
      </abstract>
      <kwd-group>
        <kwd>automatic CT report</kwd>
        <kwd>tuberculosis severity scoring</kwd>
        <kwd>medical image classification</kwd>
        <kwd>feature extraction</kwd>
        <kwd>deep learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The tuberculosis task [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] of the ImageCLEF 2019 [10] challenge consisted of
two subtasks dealing with the analysis of Computed Tomography (CT) images of
patients suffering from pulmonary tuberculosis. The aim of subtask #1 was
tuberculosis severity assessment based on CT scans. Subtask #2 was
dedicated to the automatic generation of a CT report including information
about left and right lung affection, presence of calcifications, presence of
caverns, pleurisy, and lung capacity decrease. Both subtasks shared the same
data set consisting of CT images and additional patient meta data including
information about education, imprisonment, disability, comorbidity, and others.
      </p>
      <p>
        Last year our team participated in the severity scoring subtask of the
ImageCLEFtuberculosis 2018 challenge [6]. Our feature-based approach achieved rank
10 of 36 regarding the RMSE measure [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. This result showed that our methods
could compete with more complicated and computationally intensive methods
from the field of deep learning. Since our feature-based approach provides a
descriptive image classification framework, we decided to improve it and adapt it
to the requirements of both subtasks of the ImageCLEF 2019 challenge [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. On
the other hand, taking recent research trends into account, we also developed
a new deep learning-based approach.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Feature Based Approach for Automatic CT Report Generation and Tuberculosis Severity Scoring</title>
      <p>In this section we describe our feature-based approach for automatic CT report
generation and severity score prediction from CT scans. The main motive for developing
a feature-based approach was the ability not only to predict the probabilities
for different lung irregularities but also to mark them in CT scans.
This could also be helpful for physicians during manual assessment of CT scans.
Furthermore, our approach provides information about the influence of different
lung damages and additional patient data on the tuberculosis severity score.</p>
      <sec id="sec-3-1">
        <title>Preprocessing</title>
        <p>
          Some features that we used for the automatic CT report were extracted from
the original CT scans, while other features were easier to extract from binary
images. Therefore, we binarized all CT scans using the IsoData method [13]. We
used lung masks for the extraction of all features for the CT report task. Some of
the lung masks provided by the organizers of the task [7] still did
not cover large lesions. For this reason we decided to use our own lung masks
extracted by the segmentation algorithm described in [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. This algorithm examines
the silhouettes of extracted masks for irregularities and reconstructs the masks.
Although the reconstructed lung masks did not perfectly cover the entire lung,
they still contained more lung pixels than the masks provided by the organizers
of the task.
        </p>
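        <p>As an illustration of the binarization step, the iterative IsoData (Ridler-Calvard) threshold selection [13] can be re-implemented in a few lines of NumPy. This is a sketch of the published algorithm, not the authors' code; the function names are ours.

```python
import numpy as np

def isodata_threshold(image, eps=0.5):
    """Iterative (Ridler-Calvard / IsoData) threshold selection.

    Start from the global mean and repeatedly set the threshold to the
    midpoint of the means of the two classes it induces, until convergence.
    """
    values = np.asarray(image, dtype=np.float64).ravel()
    t = values.mean()
    while True:
        lower = values[values <= t]
        upper = values[values > t]
        if lower.size == 0 or upper.size == 0:
            return t
        t_new = (lower.mean() + upper.mean()) / 2.0
        if abs(t_new - t) < eps:
            return t_new
        t = t_new

def binarize(ct_slice, eps=0.5):
    """Binarize a CT slice: True for pixels above the IsoData threshold."""
    return np.asarray(ct_slice) > isodata_threshold(ct_slice, eps)
```

On a bimodal intensity distribution the threshold settles at the midpoint of the two class means, which is exactly the behaviour the binarization relies on.</p>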
      </sec>
      <sec id="sec-3-2">
        <title>Automatic CT Report Generation</title>
        <p>
          Presence of Calcification Pulmonary calcification in CT scans was
determined for the left and right lung separately, depending on the number of pixels
identified as part of a calcification. Since different Hounsfield Unit (HU)
ranges for pulmonary calcification in CT scans have been proposed in the literature
[
          <xref ref-type="bibr" rid="ref3">3, 8, 12</xref>
          ] and the Hounsfield Units were not standardized in the CT scans in the
data set, we decided on a relatively large range between 300 HU and 3000 HU.
In this way, we were able to identify calcifications of different density. On the
other hand, our calcification range contains the HU range for bones, which
were often erroneously covered by the lung masks. To reduce the presence of
bones in the examined lung area, we adjusted the lung masks in a preprocessing
step by removing pixels at their boundaries along the z-axis using a
morphological erosion function [11] with a disk of radius four pixels. Since many CT scans
contained noise patches that could be erroneously classified as calcified nodules,
we removed all objects smaller than 10 pixels that were identified as
calcifications. Finally, we added up the pixels of found calcifications over all CT scan
slices along the z-axis in the file. If the left or right lung, or both, contained
more than 400 calcification pixels, we stated the probability of presence of lung
calcifications as 1, otherwise as 0. This threshold value was determined based on
the cross-validation Area Under the ROC Curve (AUC) value for presence of
calcification on the training set.
        </p>
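        <p>The counting and decision step can be sketched as follows. This is a simplified NumPy illustration under the HU window and pixel threshold stated above; the mask erosion and the removal of objects smaller than 10 pixels are deliberately omitted, and the function name is ours.

```python
import numpy as np

HU_RANGE = (300, 3000)     # calcification HU window used in the paper
MIN_CALC_PIXELS = 400      # per-lung decision threshold tuned via AUC

def calcification_probability(ct_volume, lung_mask):
    """Count in-range pixels inside the lung mask over all slices and
    return the binary presence label (1 or 0).

    The boundary erosion and small-object removal described in the text
    are omitted in this sketch.
    """
    in_range = (ct_volume >= HU_RANGE[0]) & (ct_volume <= HU_RANGE[1])
    n_calc = int(np.count_nonzero(in_range & lung_mask))
    return 1 if n_calc > MIN_CALC_PIXELS else 0
```

In practice the same function would be called once per lung half with the corresponding mask.</p>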
        <p>Since the Hounsfield Unit range for plastic and metal overlaps our range for
calcification, our method for detecting the presence of calcification tended to produce false
positives for patients who had medical appliances in the lung. To prevent
misclassifications in such cases, the shape of the found calcifications could additionally be
examined.</p>
        <p>
          Presence of Caverns At ImageCLEFtuberculosis 2018 [6], we used a simple
approach for the detection of pulmonary caverns. The principal idea of the method
was to detect caverns as dark spots surrounded by light tissue in binarized CT
image slices along the z-axis [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. The main weak point of our approach was
that the trachea and bronchi were incorrectly recognized as caverns. Therefore, we
cut out the middle part of the lung to avoid false positives. Unfortunately, that
workaround led to many false negatives because our method did not detect
caverns that were either completely or partly located in the cut-out part of
the lung. For this reason we improved last year's approach for the detection of
pulmonary caverns by examining the entire lung.
        </p>
        <p>The Fleischner glossary defines pulmonary cavities as thick-walled gas-filled
spaces [9]. The main difference to the trachea and bronchi is that cavities are
completely covered by cavity walls. Therefore, we validated a cavern in a binarized
CT scan slice along the z-axis as such only if its pixels were also detected as pixels of
a cavern in the CT scan slices along the x- and y-axes. We estimated the volumes
of pulmonary caverns and their walls for the right and left lung separately by adding
up the pixels of validated cavities and cavity walls over all CT image slices along
the z-axis. We used these four features for training a linear regression model for
predicting the presence of caverns.</p>
        <p>Our improved method reliably detected caverns in CT scans in the training
set as long as the distances between the slices in the scans were not too large so
that all cavity walls were depicted in the CT images. Unfortunately, our approach
still produced false positives due to artifacts on the images mainly caused by the
heartbeat of patients. Therefore, an additional preprocessing step is needed for
elimination of artifacts in CT scans.</p>
        <p>
          Presence of Pleurisy Pleurisy is inflammation of the pleura, a thin
membrane that covers the lungs [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. Since inflammation often leads to thickening of
the tissue, and pleural thickening increases the distance between the lung and the
bones, our approach to pleurisy detection compared the average distance
between the boundaries of the lung masks and the bones in images along the z-axis
in patients with and without pleurisy. For that purpose we overlaid the lung
masks and the bone masks, which represent pixels of the original CT scan with
Hounsfield Units between 300 and 3000. In the resulting image, we calculated
the average distance between pixels of the lung mask boundaries and the nearest
bone pixels. Then we averaged the distances between lung and bones over all
CT scan slices along the z-axis for the right and left lung separately and used them
for training a linear regression model for pleurisy prediction.
        </p>
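        <p>The per-slice distance computation can be sketched with plain NumPy. The brute-force nearest-neighbour search below is only illustrative (a distance transform would be used on full-size slices); the function names are ours.

```python
import numpy as np

def mask_boundary(mask):
    """Pixels of a binary mask that touch a non-mask 4-neighbour."""
    padded = np.pad(mask, 1)
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1]
                & padded[1:-1, :-2] & padded[1:-1, 2:])
    return mask & ~interior

def mean_lung_bone_distance(lung_mask, bone_mask):
    """Average Euclidean distance from lung-boundary pixels to the nearest
    bone pixel in one slice (brute force, fine for a sketch)."""
    by, bx = np.nonzero(mask_boundary(lung_mask))
    oy, ox = np.nonzero(bone_mask)
    if by.size == 0 or oy.size == 0:
        return float("nan")
    d2 = (by[:, None] - oy[None, :]) ** 2 + (bx[:, None] - ox[None, :]) ** 2
    return float(np.sqrt(d2.min(axis=1)).mean())
```

Averaging this value over all z-slices, per lung half, yields the two features fed into the regression model.</p>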
        <p>
          Lung Capacity Decrease The lung capacity is the maximum amount of air
that the lung can hold. Some kinds of lung tissue damage caused by
Mycobacterium tuberculosis (MTB) bacteria may decrease the capacity of the lung. Since
automatic detection and classification of different types of lung lesions from
CT scans is a challenging problem, we predicted the probability of lung
capacity decrease based on the estimated ratio of lung tissue to the entire
lung volume. Assuming that the lung tissue ratio relative to the lung volume
is larger in patients with decreased lung capacity than in patients with normal
lung capacity, our approach did not differentiate between healthy and
damaged lung tissue. Similar to last year's approach [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], we calculated the
lung tissue ratio as the relation of white pixels in the binarized CT
image to the number of pixels in the lung mask, averaged over all slices along the
z-axis. Finally, we trained a linear regression model for lung capacity decrease
prediction using the lung tissue ratios for the left and right lungs as features.
        </p>
        <p>
          Right and Left Lungs Affected Mycobacterium tuberculosis (MTB) bacteria
cause more kinds of lung damage than calcifications, caverns, pleurisy, and
lung capacity decrease. Therefore, an estimation model for the probability of lung
affection based on the probabilities for the lung damage described before did not
achieve satisfactory results on the training set. On the other hand, the raw feature
values that we extracted for predicting the probability of the aforementioned lung
damage provided more information about further lesions in the lung. For this
reason, we used the number of calcification pixels in the lung, the average distance
between the lung and the bones, and the ratio of lung tissue to lung volume for
the left and right lung, separately, as features for training random forest models for
predicting the probabilities of lung affection.
        </p>
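        <p>The lung tissue ratio described above reduces to a per-slice pixel count; a minimal NumPy sketch (function name ours):

```python
import numpy as np

def lung_tissue_ratio(binary_ct, lung_mask):
    """Ratio of tissue (white) pixels to lung-mask pixels, computed per
    z-slice and averaged over all slices that contain lung."""
    ratios = []
    for bin_slice, mask_slice in zip(binary_ct, lung_mask):
        n_mask = np.count_nonzero(mask_slice)
        if n_mask:
            n_tissue = np.count_nonzero(bin_slice & mask_slice)
            ratios.append(n_tissue / n_mask)
    return float(np.mean(ratios)) if ratios else 0.0
```

Called once per lung half, this yields the two features used for the lung capacity decrease model.</p>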
      </sec>
      <sec id="sec-3-3">
        <title>Tuberculosis Severity Scoring</title>
        <p>
          At ImageCLEFtuberculosis 2018 [6], our system achieved its best results for
tuberculosis severity score prediction using three features: the cavern volume, the
volume of the cavern walls, and the infection ratio [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. This year we used data from
the CT report task combined with the provided patient meta data. Using linear
regression as classifier, we obtained a 5-fold cross-validation AUC of
approximately 0.8 for the severity score on the training set. The most important features
for severity score prediction were the probabilities of left and right lung affection,
information about imprisonment, the probability of pleurisy, and
information about education. Although some features seemed to play an insignificant
role, their elimination diminished the AUC value for the severity score. Since some
features from the meta data were very important for severity score prediction, we
also tested our linear regression model on the training set using only patient
meta data. We obtained an AUC value of approximately 0.75. On the other hand, the
linear regression model trained only on data from the CT report achieved the same
AUC value. Although we were aware that the feature values predicted for the CT
report task were inaccurate to some degree, we used them combined with the
provided patient meta data for training a linear regression model for TB severity
score prediction.
        </p>
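        <p>A minimal NumPy stand-in for the linear model on such a feature matrix (ordinary least squares via `lstsq`; the paper does not specify the implementation, so this is only an assumed-equivalent sketch with our own function names):

```python
import numpy as np

def fit_linear_model(X, y):
    """Ordinary least squares with a bias term, a stand-in for the
    linear regression used for severity prediction."""
    A = np.hstack([X, np.ones((len(X), 1))])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def predict_scores(X, w):
    """Continuous severity scores; threshold at 0.5 for the binary label."""
    A = np.hstack([X, np.ones((len(X), 1))])
    return A @ w
```

Here `X` would hold the per-patient CT-report probabilities and meta-data features, and `y` the binary severity labels of the training set.</p>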
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Deep Learning Based Approach</title>
      <p>Deep learning has been applied to medically relevant research questions.
Among other things, it is used for the classification of brain and lung tumors. For example,
Liu and Kang [17] achieve an AUC value of 0.981 with their ANN
on the LIDC-IDRI data set [18] for the binary classification of lung cancer.</p>
      <p>In addition to the classification of the CT scans into the predefined disease
stages, the task can be subdivided into a further subtask, namely
segmentation. We suspect that the occurrence of disease-typical symptoms, such as
calcification, caverns, and pleurisy, may help in the subsequent classification.
The localization and classification of objects is the subject of many
scientific publications.</p>
      <p>Some of the most promising approaches are based on the U-Net architecture
[15]. This is shown, for example, by the fact that the winner of the 2018 BraTS
challenge used a U-Net variant [16]. The BraTS data set consists of scans
of brain tumor patients and is therefore similar to the given tuberculosis data. On
the one hand, an advantage of the U-Net architecture is that the network
considers the semantic context of the entire image during segmentation; on the other
hand, the U-Net architecture needs only a small amount of training examples
to produce good results. Regarding the low amount of training data in the two
tuberculosis tasks, this is a sufficiently important feature. We use one
architecture for both tasks, severity scoring and CT report, with the only difference
being the number of final classifications to represent the different numbers of
possible labels. Isensee et al. showed that the U-Net architecture is already
so high-performing that meaningful pre- and post-processing offers greater
potential for improvement than changing the architecture [14]. Therefore,
we start our processing pipeline with preprocessing and extend the architecture
of the original U-Net [15] by an additional classification CNN. Afterwards we
finish our approach with postprocessing. The exact explanation follows in the
next subsections.</p>
      <p>The data set contains several anomalies which make preprocessing necessary. The
CT scans in the given data set have three different values, {-3024, -2048, -1024}, for the
"outside of body" mark, probably because the images were taken by different
scanners and are not standardized. For this reason, some serious jumps can be
found in the value ranges of the Hounsfield Units. Besides that, there are
even higher values for some noisy pixels. Similar to [19], we used a four-stage
preprocessing to standardize the CT scans.</p>
      <p>
        - Step 1: Remove empty gaps. "NULL"-representing pixel values outside of the
body are often much lower than the values inside. To ensure that no area of
the examination remains empty, each "NULL"-representing pixel is replaced
with the next higher value.
- Step 2: Remove noise by range limits. The new value range is limited to
[-1000, +2000]. Pixel values outside this range are set to the limit value.
- Step 3: Min-max normalization to [0, 1].
- Optional Step 4: The lung area is segmented with the binary
masks from the original data set. Finally, we reduce the image size by
removing "0"-values in the border area.
      </p>
      <p>
        As mentioned previously, our chosen architecture is based on the original U-Net
approach, but we changed the original 2D convolutional layers to 3D. Additionally,
we added a final classification CNN, based on the well-known VGG19
architecture [20], for a binary output, since we have a two-class problem. Figure 2
shows a draft of the resulting network architecture.
      </p>
      <p>During training and in the later classification we limit the input to
16-slice sliding windows, which contain coherent slices along the z-axis, for two reasons.
First, this reduces the requirements on GPU memory. Second, we then have a
fixed input depth without the need to scale it, which has also proven to be a useful value for
enhancing the precision. Complementarily, a more accurate prediction is produced
because of the several classification results per image. In the next step, we
halve the image, separating it into left and right lungs. This distinction is not
taken into account during training. Finally, we scale the input data to 192 × 256
with bilinear interpolation. This results in an input tensor of 192 × 256 × 16.</p>
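      <p>The windowing and lung-halving step can be sketched as below. The window stride is not stated in the paper, so non-overlapping windows are an assumption here, and the bilinear rescaling to 192 × 256 is omitted.

```python
import numpy as np

def make_inputs(volume, depth=16, step=16):
    """Cut a (z, y, x) volume into 16-slice windows along z and split each
    window into left and right halves along x."""
    z = volume.shape[0]
    half = volume.shape[2] // 2
    windows = []
    for start in range(0, z - depth + 1, step):
        w = volume[start:start + depth]
        windows.append(w[:, :, :half])   # left lung half
        windows.append(w[:, :, half:])   # right lung half
    return windows
```

Each returned window would then be rescaled to 192 × 256 before being fed to the 3D U-Net.</p>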
      <p>For the U-Net as segmentation network, we chose a depth of four with
eight filters for the first convolutional layers. We use max pooling for
the downscaling path and a transposed convolution for the upscaling path.
Furthermore, batch normalization is applied after each convolution, and a dropout
rate of 30% for the last convolutional layer in the downscaling path. As
activation function we use the rectified linear unit. To get our segmentation mask
we use a convolutional layer with filter size one in each direction. For task one
this results in one segmentation mask, due to the fact that we have a binary
classification. In contrast, we use five segmentation masks for task two.
Even though there are six labels in the task, we only need one probability to
distinguish the affection of the left or right lung, due to the splitting of the lung
during preprocessing. An example of different segmentation masks for task two can
be seen in Figure 3.</p>
      <p>Algorithm 1 Definition of the Max-Rule
Require: τ ∈ ℕ
if |Dleft − Dright| &gt; τ then
    P ← max(Spos ∪ Sneg)
else
    if |Spos| = |Sneg| then
        if 1 − max(Spos) &lt; min(Sneg) then
            P ← max(Spos)
        else
            P ← min(Sneg)
        end if
    else
        if |Spos| &gt; |Sneg| then
            P ← max(Spos)
        else
            P ← min(Sneg)
        end if
    end if
end if
return P</p>
      <sec id="sec-4-2">
        <title>Classification Network</title>
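        <p>A Python transcription of the Max-Rule, as we read Algorithm 1, may make the branching easier to follow (the function name and argument order are ours):

```python
def max_rule(preds, d_left, d_right, tau):
    """Merge the six partial window predictions into one probability.

    preds: the six partial predictions in [0, 1];
    d_left / d_right: number of lung slices of each half; tau: threshold.
    """
    s_pos = [p for p in preds if p >= 0.5]
    s_neg = [p for p in preds if p < 0.5]
    if abs(d_left - d_right) > tau:
        return max(preds)        # missing lung depth: assume serious illness
    if len(s_pos) == len(s_neg):
        # tie: pick the prediction closest to its target value (1 or 0)
        if 1 - max(s_pos) < min(s_neg):
            return max(s_pos)
        return min(s_neg)
    # otherwise the larger set decides
    return max(s_pos) if len(s_pos) > len(s_neg) else min(s_neg)
```

With six predictions a tie can only occur at three positives versus three negatives, so both sets are non-empty in the tie branch.</p>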
        <p>The segmentation mask is used as input to our final classification CNN. For
the CNN we also use a depth of four with eight filters for the first
convolutional layers. As for the segmentation network, batch normalization
for all and a 30% dropout for the last convolutional layer are applied. A leaky
rectified linear unit is used as activation. The final layer is a dense layer with
one neuron to represent the probability of the label. In task one we have one
classification network, but for task two we use five independent classification
networks, one for each label.</p>
        <sec id="sec-4-2-1">
          <title>Postprocessing</title>
          <p>The network predicts the class of 16-slice windows of the CT scan. To get an
overall prediction P for a whole CT scan, an aggregation of a set of predictions
has to be made. Therefore we divide each CT scan into three sections of the same
size. For each of these sections a prediction pi with pi ∈ ℝ, 0 ≤ pi ≤ 1, and i ∈
{1, …, 6} is calculated. Taking into account the left and right half, we get a
total of six results.</p>
          <p>We now propose four methods to merge these six partial results pi into one
final result P.
1. Average: The result is defined as the average prediction value over all
partial predictions, P = (1/6) Σ pi.
2. Max-Rule: For this rule we define Dleft and Dright as the number of lung
slices in the z-direction of the left and right lung, respectively. Also let Spos be the
set of positive predictions, for which pi ≥ 0.5 with i ∈ {1, …, 6} holds.
Similarly, Sneg is the set of negative predictions, defined as Sneg = {pi |
pi &lt; 0.5, i ∈ {1, …, 6}}. As Algorithm 1 shows, we first check whether part of the
lungs is missing. This can occur because the sizes of the left and
the right lung can diverge during preprocessing while reducing the zero
values at the image borders. Consequently, we make the assumption that
this difference is a sign of serious illness. Therefore, if the difference between
Dleft and Dright exceeds a threshold τ ∈ ℕ, the maximal partial prediction
value pi is chosen as the probability. If the depths of the lungs do not differ
too much, we let the majority decide and therefore choose the maximum
or minimum value from the set, Spos or Sneg, that has more
elements. If the two sets have an equal number of elements, the value with
the smallest distance to the respective target value 0 or 1 is chosen.
3. Average-Rule: Similar to the Max-Rule; the only difference is that the
calculation of the resulting prediction value P does not select the maximum or
minimum but the average over all values of the corresponding result set Spos
or Sneg, respectively.
4. Confidence correction: For each window of a CT scan from the validation
data set, consisting of 16 slices, the coefficient that is necessary to change
the prediction of the respective window is calculated so that the classification
result is the correct class.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Evaluation and Results</title>
      <p>This section shows the final performance results of the submitted runs in the severity
scoring (subtask #1) and CT report (subtask #2) challenges. The final ranking
in the severity task was based on the Area Under the ROC Curve (AUC)
value, while the final ranking in the CT report task was based on the average
AUC value. Table 1 summarizes the results for the Top-10 submitted runs with the
highest AUC value and the best run for our deep learning-based approach in the
severity scoring task. Table 2 lists the results for the Top-10 submitted runs with the
highest mean AUC value and the best run for our deep learning-based approach
in the CT report task. In the following subsections we describe the results of our
approaches in detail.</p>
      <sec id="sec-5-1">
        <title>Evaluation Results for the Feature Based Approach</title>
        <p>Since we used results from the CT report task for TB severity score prediction, it
is more sensible to start by describing the results for the CT report task. As highlighted
in Table 2, our best run for the feature-based approach was ranked in
seventh place. In this run we predicted the probabilities for lung irregularities as
described in Section 2.2. In our second best run, we predicted the probability of the
presence of caverns based only on the number of cavern pixels in the left and right
lungs, separately, omitting the pixels of the cavern walls. This run was ranked in
eighth place, which is a worse result. Unfortunately, we did not receive the
detailed evaluation results, so we cannot comment on the performance of our
approach regarding the prediction of other lung irregularities.</p>
        <p>In the severity scoring task, the best run for our feature-based approach was
ranked in third place among 54 submitted runs. In this run we predicted
the severity score using patient meta data and the results from our best run
in the CT report task. The prediction of the severity score in our second best run was
based on patient meta data and the results from our second best run in the CT
report task. Although we did not submit a run for the TB severity score predicted
only on the basis of the provided patient meta data, the results for these two runs
showed a positive impact of the results from the CT report task on tuberculosis
severity score prediction.</p>
        <p>Table 3. Results of our deep learning-based runs for the severity scoring task.
Run name   AUC     Accuracy  Preprocessing      Postprocessing    Data
run 06     0.6393  0.5812    -                  method 1          validation split
run 08 (1) 0.6258  0.6068    mixed              method 1          validation split
run 04     0.6070  0.5641    complete           method 1          validation split
run 07     0.6050  0.5556    complete           method 3, τ = 5   all data
run 03     0.5692  0.5385    complete           method 3, τ = 10  validation split
run 05     0.5419  0.5470    segmentation only  method 1          all data
baseline   0.5103  0.4872    complete           method 2, τ = 5   validation split
run 02     0.4452  0.4530    complete           method 4          validation split
(1) conglomerate of run 5, run 6 and run 7</p>
      </sec>
      <sec id="sec-5-2">
        <title>Evaluation Results for the Deep Learning Based Approach</title>
        <p>For our evaluation we used different input data. We differentiated between a
train/validation split and the complete data set as training basis. The validation set
consists of 10 images.</p>
        <p>For the severity scoring task we set up the preprocessing as shown in Table 3. For
our runs, we used either full preprocessing, just segmentation, or no preprocessing
at all. Run 08 is an exception; for it we took the average of run 5, run 6 and
run 7. Table 3 also lists the postprocessing configuration of each run.</p>
        <p>The highest AUC score is achieved by run 06. In this case the network got
the raw input data. We presume that the good AUC score is due to the fact
that the network finds relevant points outside our region of interest, which are
removed by preprocessing. This is supported by the fact that
segmentation alone generates the worst results. However, the accuracy of run 06
is lower than that of run 08. It is interesting that none of the three networks
that we averaged over can achieve such a high accuracy by
itself. It seems that the networks found different features and learned differently,
so in combination they complemented each other and the accuracy increased.
Surprisingly, with an accuracy of 0.453, run 02 performed significantly
worse than the other constellations. Presumably, this is because our validation
set of only 10 images is potentially too small, and thus the calculated
coefficients cannot be generalized.</p>
        <p>Since we had only a limited number of runs for CT reporting, we decided to
use only those constellations that were trained on the whole data set, because it
seemed more reasonable to train on more data. Table 4 shows the results.
CTR run 1 and CTR run 2 share the greatest Mean AUC value of 0.6315 and the
same Min AUC. Compared to the third run, this shows that for this task the
preprocessing may be more valuable than for task 1. CTR run 3 shows rather
moderate results of 0.561 Mean AUC, which is still better than random but
leaves room for improvement.</p>
        <p>Table 4. Results of our deep learning-based runs for the CT report task.
Run name       Mean AUC  Min AUC  Preprocessing      Postprocessing  Data
CTR run 1.csv  0.6315    0.5161   complete           method 1        all data
CTR run 2.csv  0.6315    0.5161   complete           method 1        all data
CTR run 3.csv  0.5610    0.4477   segmentation only  method 1        all data</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Conclusion</title>
      <p>In this paper we have shown that our feature-based approach is still competitive
with our deep learning-based method and with the methods of other participants in the
tuberculosis task. Our best run achieved third place regarding the AUC
value in the severity assessment subtask and seventh place regarding the
mean AUC value in the CT report subtask. Although the results obtained by our
approach are promising, we still see potential for improvement
to achieve even better results in both subtasks.</p>
      <p>Regarding that our neural network was not as deep as other networks in the
literature, our results are promising. Especially the U-Net architecture seems to
be bene cial and can be a good starting point for more research. Our
preprocessing was only bene cial for subtask #2, which is surprising and therefore it
would be interesting to investigate which parts of the lung had an e ect on the
resulting predictions. Data augmentation unexpectedly led to bad results in our
rst tests and we therefore refrained from using it. But we like to further
investigate the usefulness of data augmentation for this task in combination with our
network. Furthermore, we will test the network on other data sets, especially
with segmentation data to train the U-Net separately. We hope that by this the
segmentation layers will nd meaningful areas, that can show us symptoms of
such diseases. And regarding the results for subtask #2, more training epochs
would be surely bene cial too and therefore the training will continue.
6. Dicente Cid, Y., Liauchuk, V., Kovalev, V., Müller, H.: Overview of
ImageCLEFtuberculosis 2018 - detecting multi-drug resistance, classifying tuberculosis type,
and assessing severity score. In: CLEF2018 Working Notes. CEUR Workshop
Proceedings, CEUR-WS.org &lt;http://ceur-ws.org&gt;, Avignon, France (September 10-14,
2018)
7. Dicente Cid, Y., Jimenez del Toro, O.A., Depeursinge, A., Müller, H.: Efficient and
fully automatic segmentation of the lungs in CT volumes. In: Proceedings of the
VISCERAL Anatomy Grand Challenge at the 2015 IEEE International Symposium
on Biomedical Imaging (ISBI). pp. 31-35. CEUR-WS (2015)
8. Grewal, R.G., Austin, J.H.M.: CT Demonstration of Calcification in Carcinoma of
the Lung. Journal of Computer Assisted Tomography 18(6), 867-871 (1994)
9. Hansell, D.M., Bankier, A.A., MacMahon, H., McLoud, T.C., Müller, N.L., Remy,
J.: Fleischner Society: Glossary of Terms for Thoracic Imaging. Radiology 246(3),
697-722 (2008)
10. Ionescu, B., Müller, H., Peteri, R., Cid, Y.D., Liauchuk, V., Kovalev, V., Klimuk,
D., Tarasau, A., Abacha, A.B., Hasan, S.A., Datla, V., Liu, J., Demner-Fushman, D.,
Dang-Nguyen, D.T., Piras, L., Riegler, M., Tran, M.T., Lux, M., Gurrin, C., Pelka,
O., Friedrich, C.M., de Herrera, A.G.S., Garcia, N., Kavallieratou, E., del Blanco,
C.R., Rodríguez, C.C., Vasillopoulos, N., Karampidis, K., Chamberlain, J., Clark,
A., Campello, A.: ImageCLEF 2019: Multimedia retrieval in medicine, lifelogging,
security and nature. In: Experimental IR Meets Multilinguality, Multimodality, and
Interaction. Proceedings of the 10th International Conference of the CLEF
Association (CLEF 2019). LNCS Lecture Notes in Computer Science, Springer, Lugano,
Switzerland (September 9-12, 2019)
11. Jankowski, M.: Erosion, dilation and related operators. In: Proceedings of the 8th
International Mathematica Symposium (2006)
12. Khan, A.N., Al-Jahdali, H.H., Allen, C.M., Irion, K.L., Al Ghanem, S., Koteyar,
S.S.: The calcified lung nodule: What does it mean? Annals of Thoracic Medicine
5(2), 67-79 (2010)
13. Ridler, T., Calvard, S.: Picture Thresholding Using an Iterative Selection Method.
IEEE Transactions on Systems, Man and Cybernetics 8(8), 630-632 (1978)
14. Isensee, F., et al.: No New-Net. In: Crimi, A., Bakas, S. (eds.) Brainlesion: Glioma,
Multiple Sclerosis, Stroke and Traumatic Brain Injuries. pp. 234-244. Springer
International Publishing (2019)
15. Ronneberger, O., Fischer, P., Brox, T.: U-Net: Convolutional Networks for
Biomedical Image Segmentation. In: International Conference on Medical Image Computing
and Computer-Assisted Intervention. pp. 234-241. Springer (2015)
16. Isensee, F., et al.: Brain tumor segmentation and radiomics survival prediction:
contribution to the BRATS 2017 challenge. In: International MICCAI Brainlesion
Workshop. pp. 287-297. Springer (2017)
17. Liu, K., Kang, G.: Multiview convolutional neural networks for lung nodule
classification. Int. J. Imaging Syst. Technol. 27, 12-22. Wiley (2017).
https://doi.org/10.1002/ima.22206
18. Armato III, et al.: Data From LIDC-IDRI. The Cancer Imaging Archive (2015)
19. Braun, D., Singhof, M., Tatusch, M., Conrad, S.: Convolutional Neural Networks
for Multidrug-resistant and Drug-sensitive Tuberculosis Distinction. In: CLEF2017
Working Notes. CEUR Workshop Proceedings, Dublin, Ireland. CEUR-WS (2017)
20. Simonyan, K., Zisserman, A.: Very Deep Convolutional Networks for Large-Scale
Image Recognition. arXiv preprint arXiv:1409.1556 (2014)</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Berger</surname>
            ,
            <given-names>H.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mejia</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Tuberculous Pleurisy</article-title>
          .
          <source>Chest</source>
          <volume>63</volume>
          (
          <issue>1</issue>
          ),
          <fpage>88</fpage>
          -
          <lpage>92</lpage>
          (
          <year>1973</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Bogomasov</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Himmelspach</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Klassen</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tatusch</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Conrad</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Feature-Based Approach for Severity Scoring of Lung Tuberculosis from CT Images</article-title>
          . In:
          <source>Working Notes of CLEF 2018 - Conference and Labs of the Evaluation Forum</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Brooks</surname>
            ,
            <given-names>R.A.</given-names>
          </string-name>
          :
          <article-title>A Quantitative Theory of the Hounsfield Unit and Its Application to Dual Energy Scanning</article-title>
          .
          <source>Journal of Computer Assisted Tomography</source>
          <volume>1</volume>
          (
          <issue>4</issue>
          ),
          <fpage>487</fpage>
          -
          <lpage>493</lpage>
          (
          <year>1977</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Burbach</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Automatic Lung Extraction from CT Scans</article-title>
          .
          <source>Bachelor's Thesis</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Dicente Cid</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liauchuk</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Klimuk</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tarasau</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kovalev</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Müller</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Overview of ImageCLEFtuberculosis 2019 - Automatic CT-based Report Generation and Tuberculosis Severity Assessment</article-title>
          . In:
          <source>CLEF2019 Working Notes. CEUR Workshop Proceedings</source>
          , CEUR-WS.org, ISSN 1613-0073, &lt;http://ceur-ws.org/Vol2380/&gt;, Lugano, Switzerland (September 9-12
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>