Impact of augmentation techniques on the classification of medical images

Antoni Jaszcz
Faculty of Applied Mathematics, Silesian University of Technology, Kaszubska 23, 44100 Gliwice, Poland


Abstract
The analysis of medical data is an important task, as it can help in the quick diagnosis of the patient. This work focuses on the analysis of X-ray images. The images show the condition of patients who are healthy or suspected of having pneumonia. To enable the automatic analysis of such images, I suggest using a convolutional neural network combined with various augmentation methods. The introduction of augmentation made it possible to enlarge the training set for the neural network, which requires a large amount of data in order to best adapt the model to the problem. The network has been described, implemented and tested to validate its operation. The research focused on various augmentation techniques, including random rotation, random contrast, and a combination of both methods. Based on the obtained results, contrast augmentation achieves better results than training without augmentation. For the other two augmentation variants, the results were lower due to the modification of the basic orientation of the X-rays.

Keywords
Data classification, convolutional neural networks, medical images, augmentation



1. Introduction

Artificial intelligence methods allow for quick segmentation or classification of various data. However, these methods require an enormous amount of data to train such models. This is especially visible in the case of artificial neural networks, where deep architectures can classify data much better, although they need a lot of training data. Quite often, such data may not be enough to obtain a solution that can be implemented in practice. For this purpose, augmentation is used. It is the process of artificially creating new samples within a single class in order to increase the amount of data in the training set [1].
   In the case of image processing, augmentation is based on rotating or zooming some areas. This can provide a new sample with similar features but in a different orientation or configuration. Apart from the classic methods of sample analysis, new ones are proposed. An example of this is augmentation based on combining two samples through the interpolation of mathematical functions [2]. The idea is to create points from two images and interpolate them so as to superimpose the two images with a certain transparency. Similar tools (like interpolation techniques) can be used in different approaches. It was shown in [3], where the authors use them to generate synthetic data instances. In turn, in [4], the idea of random cropping as an augmentation method was shown not to be the best approach. According to the presented results, this method can produce noise in the gradient during the training process.
   The augmentation process is very important in tasks where the data are gathered over a long time, like medicine. Automatic analysis of test results in the form of expert systems is very much needed to reduce the waiting time for a diagnosis. For this purpose, expert systems quite often use solutions based on convolutional neural networks (CNNs). It is visible in the images of moles on the skin, which are one of the basic and first examinations for the detection of potential skin cancer such as melanoma. CNNs can be used for image processing, feature extraction and even classification or segmentation, as was shown in [5, 6, 7]. This type of machine learning technique is also used in the detection of Parkinson's disease [8]. Medical analysis by the use of machine learning is badly needed for faster disease detection and choice of treatment. Biomedical informatics also uses augmented reality to increase the quality of data processing and learning [9, 10, 11].
   Decision support systems quite often rely not only on algorithms, but also on frameworks and alternative solutions. An example of a framework for the analysis of medical images, especially those obtained during tomography, is presented in [12]. In addition, new neural network architectures are also modeled to diagnose e.g. covid-19 [13, 14]. Moreover, medical systems rely on deep neural networks that require training. The classical approach is based on training one model, but federated learning is also being developed. It is based on training in parallel on many clients that aggregate a common model [15, 16].
   Based on these observations, in this paper a deep learning method was used for fast analysis of X-ray images to detect possible pneumonia.
The contributions of this research are:

    • analysis of selected augmentation methods and their impact on convolutional neural networks,
    • the use of augmentation methods to expand the training set of medical images.


2. Methodology

In this section, all mathematical aspects of CNNs, the training algorithm and the augmentation methods are described.

2.1. Convolutional neural network

A CNN is built from three types of layers: convolutional, pooling and fully connected (dense). The first type is the convolutional one, whose purpose is to transform the image in order to extract features from it. This is done by applying the convolution operator (*) to the image I at each position (x, y) with a filter matrix k of size p × p according to:

    k * I_{x,y} = \sum_{i=1}^{p} \sum_{j=1}^{p} k_{i,j} \cdot I_{x+i-1, y+j-1} + b,                (1)

where b is a bias.
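To make Eq. (1) concrete, a minimal sketch in Python is given below; it applies a p × p filter to a single-channel image with an explicit double loop (the function name and the zero-bias default are illustrative assumptions, not part of the paper).

```python
import numpy as np

def convolve2d(image, kernel, bias=0.0):
    """Direct implementation of Eq. (1) for a single-channel image (no padding, stride 1)."""
    p = kernel.shape[0]
    h, w = image.shape
    out = np.zeros((h - p + 1, w - p + 1))
    for x in range(out.shape[0]):
        for y in range(out.shape[1]):
            # sum of k_{i,j} * I_{x+i-1, y+j-1} over the p x p window, plus the bias b
            out[x, y] = np.sum(kernel * image[x:x + p, y:y + p]) + bias
    return out
```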
   The second layer is called pooling. Its main task is to resize the image. It is performed by selecting one pixel from a given grid with a mathematical function such as minimum or maximum. After a pixel has been selected in the first grid (placed over the pixel at position (0,0) of the image), the grid is moved to the next position. This is repeated until the grid reaches the last pixel of the image. As a result of the layer's operation, a new, smaller image is created from the selected pixels.
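A minimal sketch of such a pooling operation is shown below, assuming a 2x2 grid selected with the maximum function and moved by the size of the grid (the usual choice; the paper does not state the step explicitly).

```python
import numpy as np

def max_pool2d(image, size=2):
    """Keep one pixel (the maximum) per size x size grid, producing a smaller image."""
    h, w = image.shape
    out = np.zeros((h // size, w // size))
    for x in range(out.shape[0]):
        for y in range(out.shape[1]):
            out[x, y] = image[x * size:(x + 1) * size, y * size:(y + 1) * size].max()
    return out
```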
   The last layer is the fully connected one, a classic column of neurons that combines numerical inputs and weights:

    f\left( \sum_{i=0}^{n-1} w_{i,j} x_i \right),                (2)

where n is the number of neurons in the previous column, x_i is the output of the i-th neuron in the previous layer and w_{i,j} is the weight of the connection.

2.2. Training algorithm

The training process of a CNN consists in modifying the weight values, which can be done with the ADAM algorithm [17]. This algorithm assumes that the weights are changed according to statistical values, the moment estimates m and v in the t-th iteration, calculated from the gradient g_t as:

    m_t = \beta_1 m_{t-1} + (1 - \beta_1) g_t,                (3)

    v_t = \beta_2 v_{t-1} + (1 - \beta_2) g_t^2.                (4)

In the above formulas, the coefficients β1 and β2 are exponential decay rates. Having these two estimates, their bias-corrected versions are calculated:

    \hat{m}_t = \frac{m_t}{1 - \beta_1^t},                (5)

    \hat{v}_t = \frac{v_t}{1 - \beta_2^t}.                (6)

Finally, the weight in the next iteration (t + 1) is defined by the following formula:

    w_{t+1} = w_t - \frac{\eta}{\sqrt{\hat{v}_t} + \epsilon} \hat{m}_t,                (7)

where ε ≈ 0 and η is known as the learning coefficient.
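A minimal sketch of a single ADAM update implementing Eqs. (3)-(7) is given below; the default values of η, β1, β2 and ε follow [17].

```python
import numpy as np

def adam_step(w, g, m, v, t, eta=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One ADAM weight update for gradient g at iteration t (t starts from 1)."""
    m = beta1 * m + (1.0 - beta1) * g              # Eq. (3)
    v = beta2 * v + (1.0 - beta2) * g ** 2         # Eq. (4)
    m_hat = m / (1.0 - beta1 ** t)                 # Eq. (5)
    v_hat = v / (1.0 - beta2 ** t)                 # Eq. (6)
    w = w - eta * m_hat / (np.sqrt(v_hat) + eps)   # Eq. (7)
    return w, m, v
```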
2.3. Augmentation methods

2.3.1. Random rotation

This model adds an augmentation layer that slightly and randomly rotates the input image, right before the input layer of the base model. The mathematical formulation of this method can be shown as a transformation matrix:

    \begin{bmatrix} \alpha & \beta & (1 - \alpha) \cdot x - \beta \cdot y \\ -\beta & \alpha & \beta \cdot x + (1 - \alpha) \cdot y \end{bmatrix},                (8)

where α = a · cos θ, β = a · sin θ, θ is the rotation angle chosen in a random way and a is a scale parameter. An example of such augmentation is shown in Fig. 1.

Figure 1: An example of how images are randomly rotated
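A minimal sketch of such a layer, assuming the Keras preprocessing layer tf.keras.layers.RandomRotation; the rotation range is an assumption, since the paper only states that the angle is chosen at random.

```python
import tensorflow as tf

# Rotation by a random angle of roughly +/-18 degrees (factor is a fraction of a full turn).
random_rotation = tf.keras.layers.RandomRotation(factor=0.05)

images = tf.random.uniform((4, 128, 256, 1))        # stand-in batch of grayscale X-rays
rotated = random_rotation(images, training=True)    # training=True forces the augmentation
```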
2.3.2. Random contrast

This model adds an augmentation layer that slightly and randomly changes the contrast of the input image, right before the input layer of the base model. An example of such a contrast change is presented in Fig. 2.
This is made by changing the value of colors as:

    R' = F (R - 128) + 128,                (9)

where R' is the new color value in the RGB color model, R is the value of the selected color being changed, and F is the correction coefficient defined as follows:

    F = \frac{259 (C + 255)}{255 (259 - C)},                (10)

where C is the contrast level. In the case of augmentation, this coefficient is random.
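The following minimal sketch applies Eqs. (9) and (10) directly to an 8-bit image; the range from which the contrast level C is drawn is an assumption.

```python
import numpy as np

def random_contrast(image, rng=None):
    """Apply Eqs. (9)-(10) with a randomly drawn contrast level C."""
    rng = rng or np.random.default_rng()
    c = rng.uniform(-80.0, 80.0)                             # assumed range of the contrast level C
    f = 259.0 * (c + 255.0) / (255.0 * (259.0 - c))          # Eq. (10)
    out = f * (image.astype(np.float64) - 128.0) + 128.0     # Eq. (9), applied to every color value
    return np.clip(out, 0, 255).astype(np.uint8)
```

In a Keras pipeline, the analogous preprocessing layer would be tf.keras.layers.RandomContrast, which scales pixel values around the per-channel mean of the image rather than around the fixed middle value 128.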
Figure 2: An example of how random contrast is applied to an image

2.3.3. Random rotation and contrast

This model joins the two previously described augmentation methods and applies them to the input image, right before the input layer of the base model. The combination of both presented augmentation methods is shown in Fig. 3.

Figure 3: An example of how random contrast and rotation is applied to an image
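A minimal sketch of the combined variant, assuming the two Keras preprocessing layers are simply chained in front of the network; both ranges are assumptions.

```python
import tensorflow as tf

combined_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomRotation(factor=0.05),   # random rotation, Section 2.3.1
    tf.keras.layers.RandomContrast(factor=0.4),    # random contrast, Section 2.3.2
])
```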
3. Experiments

In this section, the experimental settings, the obtained results and a discussion are presented.

3.1. Testing environment

All experiments were conducted on a computer with the following specifications:

    Processor: AMD Ryzen 5 5600X 6-Core Processor, 4.20 GHz
    Installed RAM: 32.0 GB
    System type: 64-bit Windows 10; x64-based processor

All computing was done solely on the CPU.
3.2. Database

The data used in our experiments consists of 5216 X-ray images (of different sizes) of patients with suspected pneumonia, 3875 of which were confirmed cases (both viral and bacterial infections), while the other 1341 were healthy. The data is publicly accessible on Kaggle, a dataset platform for data scientists and machine learning enthusiasts owned by Google LLC.

3.3. Data preparation

The images were first resized to 256x128 pixels and then divided randomly into two groups (a loading sketch is given after the list):

    • train group (75% of the database)
    • validation group (25% of the database)
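A minimal loading sketch, under the assumption that the Kaggle images are stored in one folder per class; the directory name, seed and batch size are illustrative, and only the 128x256 target size and the 75/25 random split come from the text above.

```python
import tensorflow as tf

def load_datasets(data_dir="chest_xray/train"):
    """Resize to 128x256 grayscale and split randomly into 75% train / 25% validation."""
    common = dict(
        directory=data_dir,
        labels="inferred",
        color_mode="grayscale",
        image_size=(128, 256),     # (height, width)
        batch_size=32,
        validation_split=0.25,
        seed=1234,                 # same seed so both subsets use the same random split
    )
    train_ds = tf.keras.utils.image_dataset_from_directory(subset="training", **common)
    val_ds = tf.keras.utils.image_dataset_from_directory(subset="validation", **common)
    return train_ds, val_ds
```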
3.4. Assessment

The goal of this paper is to show what impact different types of data augmentation have on an already well-performing neural network model. The structure of the base CNN model is as follows (a code sketch of this architecture is given after the list):

    1. Input layer - a convolutional layer with 128 neurons and 3x3-sized filters, with input shape 128,256,1 (the shape of a 2D grayscale image) and the ReLU (Rectified Linear Unit) activation function. The output of this layer is then passed onto pooling layers, described below.
    2. Hidden layer - in our model, we used two more convolutional blocks, each subsequent one having half as many neurons as the previous one. All pooling layers used in the model have 2x2-sized filters. All convolutional layers used in the model have 3x3-sized filters.
       The output is then passed onto the dense layer, with 64 neurons and ReLU activation. Next, the dropout layer (with the rate set to 0.5) and the following flatten layer prepare the final output of the hidden segment. The order of these layers and the number of neurons within them is displayed below:

            • pooling layer (2x2), ReLU
            • convolutional layer (3x3), 64 neurons, ReLU
            • pooling layer (2x2), ReLU
            • convolutional layer (3x3), 32 neurons, ReLU
            • pooling layer (2x2), ReLU
            • dense layer, 64 neurons, ReLU
            • dropout layer (rate: 0.5)
            • flatten layer

    3. Output layer - a dense layer with 2 neurons, related to the healthiness of the patient. During assessment, the larger of the two neuron values is chosen and thus the patient is determined as healthy or ill with pneumonia.
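A possible Keras sketch of this architecture, following the layer order listed above (including the dense and dropout layers placed before the flatten layer); the output activation and the loss are not specified in the paper, so softmax and cross-entropy are assumptions. The optional augmentation argument is where the layers from Section 2.3 would be inserted, right before the rest of the network.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_model(augmentation=None):
    model = tf.keras.Sequential()
    model.add(layers.Input(shape=(128, 256, 1)))               # 2D grayscale X-ray
    if augmentation is not None:                               # e.g. RandomRotation / RandomContrast
        model.add(augmentation)
    model.add(layers.Conv2D(128, (3, 3), activation="relu"))   # input convolutional layer
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Conv2D(64, (3, 3), activation="relu"))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Conv2D(32, (3, 3), activation="relu"))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Dense(64, activation="relu"))             # dense block before flattening, as listed
    model.add(layers.Dropout(0.5))
    model.add(layers.Flatten())
    model.add(layers.Dense(2, activation="softmax"))           # assumption: softmax over {healthy, pneumonia}
    model.compile(optimizer="adam",                            # ADAM, Section 2.2
                  loss="sparse_categorical_crossentropy",      # assumption: integer class labels
                  metrics=["accuracy"])
    return model
```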

3.5. Results

In this subsection, the results of the experiments are shown. In Tab. 1, the calculated metrics accuracy, precision, recall and f1-score are displayed for each model. The values were calculated with the following formulas (for binary classification):

    • Accuracy:

        \alpha = \frac{TP + TN}{TP + FP + TN + FN},                (11)

    • Precision:

        \psi = \frac{TP}{TP + FP},                (12)

    • Recall:

        \rho = \frac{TP}{TP + FN},                (13)

    • F1-score:

        \frac{1}{f_1} = 0.5 \cdot \left( \frac{1}{\psi} + \frac{1}{\rho} \right),                (14)

where
    TP - a true sample predicted as true,
    TN - a false sample predicted as false,
    FP - a false sample predicted as true,
    FN - a true sample predicted as false.

In the case of the metrics other than accuracy itself, two cases are considered: first, where pneumonia is treated as the true class and healthy as the false one; second, where it is the other way round.
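A minimal sketch of Eqs. (11)-(14), computed from the four counts for whichever class is currently treated as the true one:

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and f1-score as in Eqs. (11)-(14)."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)            # Eq. (11)
    precision = tp / (tp + fp)                            # Eq. (12)
    recall = tp / (tp + fn)                               # Eq. (13)
    f1 = 1.0 / (0.5 * (1.0 / precision + 1.0 / recall))   # Eq. (14)
    return accuracy, precision, recall, f1
```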
The relationship between the predicted and real outcomes is also displayed in confusion matrices (Fig. 4) for each tested model.
   The results were obtained by assessing the aforementioned validation group, in which 977 cases (roughly 75% of the group) were pneumonic, while the remaining 327 (25%) were healthy.

Table 1
Calculated metrics

    model                    class    precision    recall    f1-score
    base CNN model             0       0.9630      0.8746     0.9167
                               1       0.9593      0.9887     0.9738
                                     accuracy = 0.9601
    rotation                   0       0.9273      0.8196     0.8701
                               1       0.9419      0.9785     0.9598
                                     accuracy = 0.9387
    contrast                   0       0.9199      0.9480     0.9337
                               1       0.9824      0.9724     0.9774
                                     accuracy = 0.9663
    rotation and contrast      0       0.8043      0.9052     0.8518
                               1       0.9669      0.9263     0.9462
                                     accuracy = 0.9210

   In Tab. 1, the results show that the base model reaches a decent accuracy of 96%, but its recall could be improved when it comes to detecting healthy cases (41 out of 327 healthy patients were diagnosed with pneumonia by the model, while in fact being healthy). Furthermore, rotation augmentation worsened the overall performance of the model, while the contrast one proved to be somewhat beneficial to the results (see Fig. 4). Not only did the accuracy slightly improve, but the recall metric for detecting healthy cases grew considerably. Although the ability to detect all the pneumonic cases dropped, contrast augmentation brings more balance to the model's assessment and thus improves its overall performance.


4. Conclusions

The analysis of medical images is important in order to quickly detect diseases or help a doctor make a diagnosis decision. For this purpose, the use of a convolutional neural network for the analysis of X-ray images was presented. As part of the research, the possibilities of using augmentation were considered (techniques such as random rotation, contrast change and a combination of both). The obtained results indicate that augmentation can quickly and easily extend the training set. Random contrast change as the main augmentation technique performed better in terms of model accuracy compared to the original database. In addition, it was found that the use of rotation on medical images deteriorated the performance of the trained model. The reason for this is the rearrangement of the chest area on the X-rays. As a result, the database is enlarged with data that differ drastically from the rest, which consequently reduces the effectiveness of the neural network.
Figure 4: Confusion matrices for described models



The results of the last model analyzed in this paper, that is, the one with both augmentations applied, are the worst of them all. Not only is its accuracy lower, but the ability to detect pneumonic cases, which is crucial in medical illness detection, also plummeted. A positive impact of classic data augmentation techniques on CNN model performance was similarly shown in liver illness recognition [18]. It was also suggested that classic augmentation methods combined with cutting-edge augmentation methods, such as generative adversarial networks (GANs), yield the best results of all tested model configurations. In [19], possible negative effects of joint classic augmentation methods in medical image classification were discussed, as well as their individual impact on the learning process.


Acknowledgements

This work is supported by the Silesian University of Technology under the mentoring project.


References

 [1] M. Elgendi, M. U. Nasir, Q. Tang, D. Smith, J.-P. Grenier, C. Batte, B. Spieler, W. D. Leslie, C. Menon, R. R. Fletcher, et al., The effectiveness of image augmentation in deep learning networks for detecting covid-19: A geometric transformation perspective, Frontiers in Medicine 8 (2021).
 [2] D. Połap, M. Włodarczyk-Sielicka, Interpolation merge as augmentation technique in the problem of ship classification, in: 2020 15th Conference on Computer Science and Information Systems (FedCSIS), IEEE, 2020, pp. 443–446.
 [3] O. O. Abayomi-Alli, R. Damaševičius, S. Misra, R. Maskeliūnas, A. Abayomi-Alli, Malignant skin melanoma detection using image augmentation by oversampling in nonlinear lower-dimensional embedding manifold, Turkish Journal of Electrical Engineering & Computer Sciences 29 (2021) 2600–2614.
 [4] R. Yang, R. Wang, Y. Deng, X. Jia, H. Zhang, Rethinking the random cropping data augmentation method used in the training of CNN-based SAR image ship detector, Remote Sensing 13 (2021) 34.
 [5] D. Połap, Analysis of skin marks through the use of intelligent things, IEEE Access 7 (2019) 149355–149363.
 [6] R. Wang, G. Zheng, CyCMIS: Cycle-consistent cross-domain medical image segmentation via diverse image augmentation, Medical Image Analysis 76 (2022) 102328.
 [7] D. Połap, Fuzzy consensus with federated learning method in medical systems, IEEE Access 9 (2021) 150383–150392.
 [8] O. O. Abayomi-Alli, R. Damaševičius, R. Maskeliūnas, A. Abayomi-Alli, BiLSTM with data augmentation using interpolation methods to improve early detection of parkinson disease, in: 2020 15th Conference on Computer Science and Information Systems (FedCSIS), IEEE, 2020, pp. 371–380.
 [9] Y. Djenouri, A. Belhadi, G. Srivastava, J. C.-W. Lin, Secure collaborative augmented reality framework for biomedical informatics, IEEE Journal of Biomedical and Health Informatics (2021).
[10] C. Moro, J. Birt, Z. Stromberga, C. Phelps, J. Clark, P. Glasziou, A. M. Scott, Virtual and augmented reality enhancements to medical and science student physiology and anatomy test performance: A systematic review and meta-analysis, Anatomical Sciences Education 14 (2021) 368–376.
[11] Y. Zhuang, J. Sun, J. Liu, Diagnosis of chronic kidney disease by three-dimensional contrast-enhanced ultrasound combined with augmented reality medical technology, Journal of Healthcare Engineering 2021 (2021).
[12] T. Akram, M. Attique, S. Gul, A. Shahzad, M. Altaf, S. Naqvi, R. Damaševičius, R. Maskeliūnas, A novel framework for rapid diagnosis of covid-19 on computed tomography scans, Pattern Analysis and Applications 24 (2021) 951–964.
[13] J. Rasheed, A. A. Hameed, C. Djeddi, A. Jamil, F. Al-Turjman, A machine learning-based framework for diagnosis of covid-19 from chest x-ray images, Interdisciplinary Sciences: Computational Life Sciences 13 (2021) 103–117.
[14] P. Afshar, S. Heidarian, F. Naderkhani, A. Oikonomou, K. N. Plataniotis, A. Mohammadi, Covid-caps: A capsule network-based framework for identification of covid-19 cases from x-ray images, Pattern Recognition Letters 138 (2020) 638–643.
[15] W. Zhang, T. Zhou, Q. Lu, X. Wang, C. Zhu, H. Sun, Z. Wang, S. K. Lo, F.-Y. Wang, Dynamic-fusion-based federated learning for covid-19 detection, IEEE Internet of Things Journal 8 (2021) 15884–15891.
[16] B. Pfitzner, N. Steckhan, B. Arnrich, Federated learning in a medical context: A systematic literature review, ACM Transactions on Internet Technology (TOIT) 21 (2021) 1–31.
[17] D. P. Kingma, J. Ba, Adam: A method for stochastic optimization, arXiv preprint arXiv:1412.6980 (2014).
[18] M. Frid-Adar, I. Diamant, E. Klang, M. Amitai, J. Goldberger, H. Greenspan, GAN-based synthetic medical image augmentation for increased CNN performance in liver lesion classification, Neurocomputing 321 (2018) 321–331.
[19] Z. Hussain, F. Gimenez, D. Yi, D. Rubin, Differential data augmentation techniques for medical imaging classification tasks, AMIA Annual Symposium Proceedings 2017 (2018) 979–984.