<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Impact of augmentation techniques on the classification of medical images</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Antoni Jaszcz</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Faculty of Applied Mathematics, Silesian University of Technology</institution>
          ,
          <addr-line>Kaszubska 23, 44100 Gliwice</addr-line>
          ,
          <country country="PL">Poland</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The analysis of medical data is an important task, as it can help in the quick diagnosis of the patient. This work focuses on the analysis of X-ray images that show the condition of patients who are healthy or suspected of having pneumonia. To enable the automatic analysis of such images, I suggest using a convolutional neural network combined with various augmentation methods. The introduction of augmentation made it possible to enlarge the training set for the neural network, which requires a large amount of data in order to best adapt the model to the problem. The network has been described, implemented and tested to validate its operation. The research focused on various augmentation techniques, including random rotation, random contrast, and a combination of both methods. Based on the obtained results, contrast augmentation achieves better results compared with not using it at all. For the other two augmentation variants, the results were lowered due to the modification of the basic orientation of the X-rays.</p>
      </abstract>
      <kwd-group>
        <kwd>Data classification</kwd>
        <kwd>convolutional neural networks</kwd>
        <kwd>medical images</kwd>
        <kwd>augmentation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Artificial intelligence methods allow for quick segmentation or classification of various data. However, these methods require an enormous amount of data to train such models. This is especially visible in the case of artificial neural networks, where deep architectures can classify data much better, although they need a lot of training data. Quite often, such data may not be enough to obtain a solution that can be implemented in practice. For this purpose, augmentation is used. It is the process of artificially creating new samples within a single class in order to increase the amount of data in the training set [<xref ref-type="bibr" rid="ref1">1</xref>].</p>
      <p>In the case of image processing, augmentation is based on rotating or zooming some areas. This can provide a new sample with similar features but in different orientations or configurations. Apart from the classic methods of sample analysis, new ones are proposed. An example of this is augmentation based on combining two samples by interpolation of mathematical functions [<xref ref-type="bibr" rid="ref2">2</xref>]. The idea is to create points from two images and interpolate them to superimpose the two images with a certain transparency. Similar tools (like interpolation techniques) can be used in different approaches. It was shown in [<xref ref-type="bibr" rid="ref3">3</xref>], where the authors use it to generate synthetic data instances. Again in [<xref ref-type="bibr" rid="ref4">4</xref>], the idea of random cropping as an augmentation method was shown not to be the best approach. According to the presented results, this method can produce noise in the gradient during the training process.</p>
      <p>The augmentation process is very important in tasks where the data are gathered over a long time, like medicine. Automatic analysis of test results in the form of expert systems is very much needed to reduce the waiting time for a diagnosis. For this purpose, expert systems quite often use solutions based on convolutional neural networks (CNNs). It is visible in the images of moles on the skin, which are one of the basic and first examinations for the detection of potential skin cancer like melanoma. CNNs can be used for image processing, feature extraction and even classification or segmentation, which was shown in [5, 6, 7]. This type of machine learning technique is also used in the detection of Parkinson's disease [8]. Medical analysis by the use of machine learning is badly needed for faster disease detection and choice of treatment. Biomedical informatics also uses augmented reality for increasing the quality of data processing and learning [9, 10, 11].</p>
      <p>Decision support systems quite often rely not only on algorithms, but also on frameworks and alternative solutions. An example of a framework for the analysis of medical images, especially those obtained during tomography, is presented in [12]. In addition, new neural network architectures are also modeled to diagnose e.g. covid-19 [13, 14]. Moreover, medical systems rely on deep neural networks that require training. The classical approach is based on teaching one model, but federated learning is also developed. It is based on training in parallel on many clients who aggregate a common model [15, 16].</p>
      <p>Based on this observation, in this paper, a deep learning method is used for the fast analysis of X-ray images in order to detect possible pneumonia. The contributions of this research are:
• analysis of selected augmentation methods and their impact on convolutional neural networks,
• the use of augmentation methods to expand the training set of medical images.</p>
      <p>IVUS 2022: 27th International Conference on Information Technology. Contact: aj303181@student.polsl.pl (A. Jaszcz). © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org).</p>
    </sec>
    <sec id="sec-2">
      <title>2. Methodology</title>
      <p>In this section, all mathematical aspects of CNNs, training
algorithms and augmentation methods are described.</p>
      <sec id="sec-2-1">
        <title>2.1. Convolutional neural network</title>
        <p>A CNN is created based on three types of layers: convolutional, pooling and fully connected (dense). The first type is the convolutional one, whose purpose is to transform the image in order to extract its features. It is done by applying the convolution operator (*) at each position (x, y) of the image I with an m × m filter matrix K according to:</p>
        <p>(I * K)_{x,y} = \sum_{i=1}^{m} \sum_{j=1}^{m} K_{i,j} \cdot I_{x+i-1,\, y+j-1} + b,</p>
        <p>where b is a bias.</p>
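The convolution above can be sketched in NumPy; this is a minimal sketch assuming a "valid" convolution with stride 1 on a single-channel image (the paper does not state the padding mode or stride):

```python
import numpy as np

def conv2d(image, kernel, bias=0.0):
    """Valid 2D convolution of a single-channel image with an m x m filter."""
    m, n = kernel.shape
    H, W = image.shape
    out = np.zeros((H - m + 1, W - n + 1))
    for x in range(out.shape[0]):
        for y in range(out.shape[1]):
            # sum over the filter window plus the bias, as in the formula above
            out[x, y] = np.sum(kernel * image[x:x + m, y:y + n]) + bias
    return out
```

For a 4×4 image and an all-ones 3×3 filter, the output is a 2×2 map whose entries are the sums of the overlapping 3×3 blocks.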
        <p>The second layer type is called pooling. Its main task is to resize the image. It is performed by selecting one pixel from a given grid using a mathematical function like minimum or maximum. After finding a pixel in the first grid (placed over the pixel at position (0,0) of the image), the grid is moved to the next position. This is repeated until the grid covers the last pixel in the image. As a result of the layer's operation, an image is created from the selected pixels.</p>
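The pooling operation described above can be sketched as follows, assuming non-overlapping windows (the stride is not stated explicitly in the text; window-sized steps are the common default) and the maximum function:

```python
import numpy as np

def max_pool(image, grid=2):
    """Non-overlapping max pooling with a grid x grid window."""
    H, W = image.shape
    out = np.zeros((H // grid, W // grid))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # select one pixel (the maximum) from each grid cell
            out[i, j] = image[i * grid:(i + 1) * grid,
                              j * grid:(j + 1) * grid].max()
    return out
```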
        <p>The last layer is fully connected, and it is a classic column of neurons that combines numerical values and weights:</p>
        <p>y = f\left(\sum_{i=0}^{n-1} x_i \cdot w_i\right),</p>
        <p>where n is the number of neurons in the previous column, x_i is the result from the i-th neuron in the previous layer and w_i is the weight on the i-th connection.</p>
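The fully connected neuron described above can be sketched as a weighted sum passed through an activation function (ReLU is assumed here, matching the activations used later in the experiments):

```python
import numpy as np

def dense_neuron(x, w, f=lambda z: max(z, 0.0)):
    """One fully connected neuron: activation of the dot product of inputs and weights."""
    return f(float(np.dot(x, w)))
```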
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Training algorithm</title>
        <p>The training process of a CNN consists in modifying the weight values, which can be done with the ADAM algorithm [17]. This algorithm assumes that the weights are changed according to statistical values, namely the moment estimates m_t and v_t in the t-th iteration, defined as:</p>
        <p>m_t = \beta_1 m_{t-1} + (1 - \beta_1) g_t,</p>
        <p>v_t = \beta_2 v_{t-1} + (1 - \beta_2) g_t^2,</p>
        <p>where g_t is the gradient and the coefficients \beta_1 and \beta_2 control the exponential decay of the moment estimates. Having these two parameters, their bias-corrected versions are calculated:</p>
        <p>\hat{m}_t = \frac{m_t}{1 - \beta_1^t}, \quad \hat{v}_t = \frac{v_t}{1 - \beta_2^t}.</p>
        <p>Finally, the weight in the next iteration (t + 1) is defined by the following formula:</p>
        <p>w_{t+1} = w_t - \eta \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon},</p>
        <p>where \epsilon \approx 0 prevents division by zero and \eta is known as the learning coefficient.</p>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Augmentation methods</title>
        <sec id="sec-2-3-1">
          <title>2.3.1. Random rotation</title>
          <p>This model adds an augmentation layer that slightly and randomly rotates the input image, right before the input layer of the base model. The mathematical formulation of this method can be shown as a transformation matrix:</p>
          <p>\begin{bmatrix} \alpha &amp; \beta &amp; (1 - \alpha) \cdot c_x - \beta \cdot c_y \\ -\beta &amp; \alpha &amp; \beta \cdot c_x + (1 - \alpha) \cdot c_y \end{bmatrix},</p>
          <p>where \alpha = s \cdot \cos\theta, \beta = s \cdot \sin\theta, \theta is the rotation angle chosen in a random way, s is a scale parameter and (c_x, c_y) is the center of the rotation. An example of such augmentation is shown in Fig. 1.</p>
        </sec>
        <sec id="sec-2-3-2">
          <title>2.3.2. Random contrast</title>
          <p>This model adds an augmentation layer that slightly and randomly changes the contrast of the input image, right before the input layer of the base model. An example of such augmentation is shown in Fig. 2. The new color c' in the RGB color model is obtained from the value c of the selected color channel as:</p>
          <p>c' = F \cdot (c - 128) + 128,</p>
          <p>where F is the correction coefficient defined as follows:</p>
          <p>F = \frac{259 (C + 255)}{255 (259 - C)},</p>
          <p>where C is the contrast level. In the case of augmentation, this coefficient is random.</p>
        </sec>
        <sec id="sec-2-2-1">
          <title>2.3.3. Random rotation and contrast</title>
          <p>This model joins the two previously described augmentation methods and applies them to the input image, right before the input layer of the base model. The combination of both presented augmentation methods is shown in Fig. 3.</p>
        </sec>
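As an illustration of the two augmentations combined here, the rotation matrix and the contrast coefficient can be sketched in NumPy. The helper names and the pixel mapping c' = F·(c − 128) + 128 are assumptions of this sketch (the paper gives only the transformation matrix and F):

```python
import numpy as np

def rotation_matrix(theta, s=1.0, center=(0.0, 0.0)):
    """2x3 affine rotation matrix with alpha = s*cos(theta), beta = s*sin(theta)."""
    a, b = s * np.cos(theta), s * np.sin(theta)
    cx, cy = center
    return np.array([[a,  b, (1 - a) * cx - b * cy],
                     [-b, a, b * cx + (1 - a) * cy]])

def contrast_factor(C):
    """Contrast correction coefficient F for a contrast level C."""
    return 259.0 * (C + 255.0) / (255.0 * (259.0 - C))

def augment_pixel(c, C):
    # assumed standard contrast mapping, clipped to the 8-bit range
    return float(np.clip(contrast_factor(C) * (c - 128.0) + 128.0, 0.0, 255.0))
```

With θ = 0 the matrix is the identity transform, and with C = 0 the factor F equals 1, so both operations leave the image unchanged, as expected.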
      </sec>
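The ADAM weight update from Section 2.2 can be sketched as a single step; the default coefficient values here are the ones recommended in [17], not values stated in this paper:

```python
import numpy as np

def adam_step(w, g, m, v, t, eta=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One ADAM update for weights w given gradient g at iteration t (t >= 1)."""
    m = beta1 * m + (1 - beta1) * g            # first moment estimate
    v = beta2 * v + (1 - beta2) * g ** 2       # second moment estimate
    m_hat = m / (1 - beta1 ** t)               # bias-corrected estimates
    v_hat = v / (1 - beta2 ** t)
    w = w - eta * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v
```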
    </sec>
    <sec id="sec-3">
      <title>3. Experiments</title>
      <p>In this section, the experimental settings, obtained results
and discussion are presented.</p>
      <sec id="sec-3-1">
        <title>3.1. Testing environment</title>
        <p>All experiments were conducted on a computer with the following specifications:
Processor: AMD Ryzen 5 5600X 6-Core Processor, 4.20 GHz;
Installed RAM: 32.0 GB;
System type: 64-bit Windows 10, x64-based processor.
All computing was done solely on the CPU.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Database</title>
        <p>The data used in our experiments consist of 5216 X-ray images (of different sizes) of patients with suspected pneumonia, 3875 of which were confirmed cases (both viral and bacterial infections), while the other 1341 were healthy. The data are accessible at Kaggle, at this link. Kaggle is a public dataset platform for data scientists and machine learning enthusiasts, controlled by Google LLC.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Data preparation</title>
        <p>The images were first resized to 256x128 (pixels) and then
divided randomly into two groups:
• train group (75% of the database)
• validation group (25% of the database)</p>
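The random split described above can be sketched as follows; the file names and the fixed seed are hypothetical, since the paper does not provide its preprocessing code:

```python
import random

def split_dataset(paths, train_frac=0.75, seed=42):
    """Randomly split a list of image paths into train (75%) and validation (25%) groups."""
    rng = random.Random(seed)
    shuffled = paths[:]            # copy so the input list is untouched
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]
```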
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Assessment</title>
        <p>The goal of this paper is to show what impact different types of data augmentation have on an already well-performing neural network model. The structure of the base CNN model is as follows:
1. Input layer - a convolutional layer with 128 neurons and 3x3-sized filters, input shape 128,256,1 (shape of a 2D image) and the ReLU (Rectified Linear Unit) activation function. The output of this layer is then passed onto the pooling layers described below.
2. Hidden layers - in our model, we used two more convolutional blocks, each subsequent one having half the number of neurons of the previous one. All pooling layers used in the model have 2x2-sized filters. All convolutional layers used in the model have 3x3-sized filters. The output is then passed onto the dense layer, with 64 neurons and ReLU activation. Next, the dropout layer (with the rate set to 0.5) and the following flatten layer prepare the final output of the hidden segment. The order of these layers and the number of neurons within them is displayed below:
• pooling layer (2x2), ReLU
• convolutional layer (3x3), 64 neurons, ReLU
• pooling layer (2x2), ReLU
• convolutional layer (3x3), 32 neurons, ReLU
• pooling layer (2x2), ReLU
• dense layer, 64 neurons, ReLU
• dropout layer (rate: 0.5)
• flatten layer
3. Output layer - a dense layer with 2 neurons, related to the healthiness of a patient. While assessing, the larger value of the two neurons is chosen and thus the patient is determined as healthy or ill with pneumonia.</p>
        <p>The results were obtained by assessing the aforementioned validation group, in which 977 (roughly 75% of the group) cases were pneumonic, while the remaining 327 (25%) were healthy.</p>
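The paper does not state the padding mode of the convolutional layers. Under a "valid"-padding assumption, the feature-map sizes flowing through the three conv/pool blocks of the base model can be traced with this sketch (not the authors' code):

```python
def trace_shapes(h=128, w=256, blocks=3, conv=3, pool=2):
    """Trace feature-map sizes through conv(3x3, 'valid') + pool(2x2) blocks."""
    shapes = [(h, w)]
    for _ in range(blocks):
        h, w = h - (conv - 1), w - (conv - 1)   # 3x3 'valid' convolution shrinks by 2
        h, w = h // pool, w // pool             # 2x2 pooling halves each dimension
        shapes.append((h, w))
    return shapes
```

Starting from the 128×256 input, this gives 63×127, 30×62 and finally 14×30 before the dense segment.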
      </sec>
      <sec id="sec-3-5">
        <title>3.5. Results</title>
        <p>In this subsection, the results of the experiments are shown. In Tab. 1, the calculated metrics accuracy, precision, recall and F1-score are displayed for each model. The values were calculated with the following formulas (for binary classification):</p>
        <p>• Accuracy: \mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN},</p>
        <p>• Precision: \mathrm{Precision} = \frac{TP}{TP + FP},</p>
        <p>• Recall: \mathrm{Recall} = \frac{TP}{TP + FN},</p>
        <p>• F1-Score: \frac{1}{F_1} = 0.5 \cdot \left(\frac{1}{\mathrm{Precision}} + \frac{1}{\mathrm{Recall}}\right),</p>
        <p>where
TP - true sample predicted as true,
TN - false sample predicted as false,
FP - false sample predicted as true,
FN - true sample predicted as false.</p>
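The four metrics can be computed directly from the confusion counts; this sketch uses the equivalent closed form F1 = 2PR/(P + R) of the harmonic mean above:

```python
def metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F1 from the confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of P and R
    return accuracy, precision, recall, f1
```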
        <p>In the case of the metrics other than accuracy itself, two
cases are considered. First, where pneumonia is
considered as the truth and healthy as falsity. Second, where
it’s the other way round.</p>
        <p>The relationship between predicted and real outcome is
also displayed in confusion matrices (Fig. 4), for each
tested model.</p>
        <p>In Tab. 1, the results show that the base model reaches a decent accuracy of 96%, but its recall could be improved when it comes to detecting healthy cases (41 out of the 327 healthy patients were diagnosed with pneumonia by the model while in fact being healthy). Furthermore, rotation augmentation worsened the overall performance of the model, while the contrast augmentation proved to be somewhat beneficial to the results (see Fig. 4). Not only did accuracy slightly improve, but the recall metric in detecting healthy cases grew considerably. Although the ability to detect all the pneumonic cases dropped, contrast augmentation brings more balance to the model's assessment and thus improves its overall performance.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusions</title>
      <p>The analysis of medical images is important in order to quickly detect disease or help a doctor make a diagnosis decision. For this purpose, the use of a convolutional neural network for the analysis of X-ray images was presented. As part of the research, the possibilities of using augmentation were considered (techniques such as random rotation, contrast change and a combination of both). The obtained results indicate that augmentation can quickly and easily extend the training set. Random contrast change as the main augmentation technique performed better in terms of model accuracy compared to the original database. In addition, it was found that the use of rotation on medical images deteriorated the performance of the trained model. The reason for this is the rearrangement of the chest area on the X-rays. As a result, the database is enlarged with data that drastically differ from the rest and, consequently, the effectiveness of the neural network is reduced. The results of the last model analyzed in this paper, that is, the one with both augmentations applied, are the worst of them all. Not only is its accuracy lower, but the ability to detect pneumonic cases, which is crucial in medical illness detection, plummeted. A positive impact of classic data augmentation techniques on CNN-model performance was similarly shown in liver illness recognition [18]. It was also suggested that classic augmentation methods combined with cutting-edge augmentation methods, such as generative adversarial networks (GANs), yield the best results of all model configurations tested. In [19], possible negative effects of joined classic augmentation methods in medical image classification were discussed, as well as their lone impact on the learning process.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgements</title>
      <p>This work is supported by the Silesian University of Technology mentoring project.</p>
    </sec>
    <sec id="sec-6">
      <title>References</title>
      <p>[5] D. Połap, Analysis of skin marks through the use of intelligent things, IEEE Access 7 (2019) 149355-149363.</p>
      <p>[6] R. Wang, G. Zheng, CyCMIS: Cycle-consistent cross-domain medical image segmentation via diverse image augmentation, Medical Image Analysis 76 (2022) 102328.</p>
      <p>[7] D. Połap, Fuzzy consensus with federated learning method in medical systems, IEEE Access 9 (2021) 150383-150392.</p>
      <p>[8] O. O. Abayomi-Alli, R. Damaševičius, R. Maskeliūnas, A. Abayomi-Alli, BiLSTM with data augmentation using interpolation methods to improve early detection of Parkinson disease, in: 2020 15th Conference on Computer Science and Information Systems (FedCSIS), IEEE, 2020, pp. 371-380.</p>
      <p>[9] Y. Djenouri, A. Belhadi, G. Srivastava, J. C.-W. Lin, Secure collaborative augmented reality framework for biomedical informatics, IEEE Journal of Biomedical and Health Informatics (2021).</p>
      <p>[10] C. Moro, J. Birt, Z. Stromberga, C. Phelps, J. Clark, P. Glasziou, A. M. Scott, Virtual and augmented reality enhancements to medical and science student physiology and anatomy test performance: A systematic review and meta-analysis, Anatomical Sciences Education 14 (2021) 368-376.</p>
      <p>[11] Y. Zhuang, J. Sun, J. Liu, Diagnosis of chronic kidney disease by three-dimensional contrast-enhanced ultrasound combined with augmented reality medical technology, Journal of Healthcare Engineering 2021 (2021).</p>
      <p>[12] T. Akram, M. Attique, S. Gul, A. Shahzad, M. Altaf, S. Naqvi, R. Damaševičius, R. Maskeliūnas, A novel framework for rapid diagnosis of covid-19 on computed tomography scans, Pattern Analysis and Applications 24 (2021) 951-964.</p>
      <p>[13] J. Rasheed, A. A. Hameed, C. Djeddi, A. Jamil, F. Al-Turjman, A machine learning-based framework for diagnosis of covid-19 from chest x-ray images, Interdisciplinary Sciences: Computational Life Sciences 13 (2021) 103-117.</p>
      <p>[14] P. Afshar, S. Heidarian, F. Naderkhani, A. Oikonomou, K. N. Plataniotis, A. Mohammadi, Covid-caps: A capsule network-based framework for identification of covid-19 cases from x-ray images, Pattern Recognition Letters 138 (2020) 638-643.</p>
      <p>[15] W. Zhang, T. Zhou, Q. Lu, X. Wang, C. Zhu, H. Sun, Z. Wang, S. K. Lo, F.-Y. Wang, Dynamic-fusion-based federated learning for covid-19 detection, IEEE Internet of Things Journal 8 (2021) 15884-15891.</p>
      <p>[16] B. Pfitzner, N. Steckhan, B. Arnrich, Federated learning in a medical context: A systematic literature review, ACM Transactions on Internet Technology (TOIT) 21 (2021) 1-31.</p>
      <p>[17] D. P. Kingma, J. Ba, Adam: A method for stochastic optimization, arXiv preprint arXiv:1412.6980 (2014).</p>
      <p>[18] M. Frid-Adar, I. Diamant, E. Klang, M. Amitai, J. Goldberger, H. Greenspan, GAN-based synthetic medical image augmentation for increased CNN performance in liver lesion classification, Neurocomputing 321 (2018) 321-331.</p>
      <p>[19] Z. Hussain, F. Gimenez, D. Yi, D. Rubin, Differential data augmentation techniques for medical imaging classification tasks, AMIA Annual Symposium Proceedings 2017 (2018) 979-984.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Elgendi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. U.</given-names>
            <surname>Nasir</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Smith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-P.</given-names>
            <surname>Grenier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Batte</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Spieler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. D.</given-names>
            <surname>Leslie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Menon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. R.</given-names>
            <surname>Fletcher</surname>
          </string-name>
          , et al.,
          <article-title>The effectiveness of image augmentation in deep learning networks for detecting covid-19: A geometric transformation perspective</article-title>
          ,
          <source>Frontiers in Medicine</source>
          <volume>8</volume>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>D.</given-names>
            <surname>Połap</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Włodarczyk-Sielicka</surname>
          </string-name>
          ,
          <article-title>Interpolation merge as augmentation technique in the problem of ship classification</article-title>
          ,
          <source>in: 2020 15th Conference on Computer Science and Information Systems (FedCSIS)</source>
          , IEEE,
          <year>2020</year>
          , pp.
          <fpage>443</fpage>
          -
          <lpage>446</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>O. O.</given-names>
            <surname>Abayomi-Alli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Damasevicius</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Misra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Maskeliunas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Abayomi-Alli</surname>
          </string-name>
          ,
          <article-title>Malignant skin melanoma detection using image augmentation by oversampling in nonlinear lower-dimensional embedding manifold</article-title>
          ,
          <source>Turkish Journal of Electrical Engineering &amp; Computer Sciences</source>
          <volume>29</volume>
          (
          <year>2021</year>
          )
          <fpage>2600</fpage>
          -
          <lpage>2614</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>R.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Deng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Jia</surname>
          </string-name>
          , H. Zhang,
          <article-title>Rethinking the random cropping data augmentation method used in the training of CNN-based SAR image ship detector</article-title>
          ,
          <source>Remote Sensing</source>
          <volume>13</volume>
          (
          <year>2021</year>
          ) 34.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>