=Paper=
{{Paper
|id=Vol-3611/paper1
|storemode=property
|title=Impact of augmentation techniques on the classification of medical images
|pdfUrl=https://ceur-ws.org/Vol-3611/paper1.pdf
|volume=Vol-3611
|authors=Antoni Jaszcz
|dblpUrl=https://dblp.org/rec/conf/ivus/Jaszcz22
}}
== Impact of augmentation techniques on the classification of medical images ==
Antoni Jaszcz (aj303181@student.polsl.pl)

Faculty of Applied Mathematics, Silesian University of Technology, Kaszubska 23, 44100 Gliwice, Poland

=== Abstract ===
The analysis of medical data is an important task, as it can help in the quick diagnosis of the patient. This work focuses on the analysis of X-ray images. The images show the condition of a patient who is either healthy or suspected of having pneumonia. To enable the automatic analysis of such images, I suggest using a convolutional neural network combined with various augmentation methods. The introduction of augmentation made it possible to enlarge the training set for the neural network, which requires a large amount of data in order to adapt the model to the problem as well as possible. The network has been described, implemented and tested to validate its operation. The research focused on various augmentation techniques, including random rotation, random contrast, and a combination of both methods. Based on the obtained results, contrast augmentation achieves better results than not using augmentation at all. For the other two augmentation configurations, the results were lowered due to the modification of the basic orientation of the X-rays.

=== Keywords ===
Data classification, convolutional neural networks, medical images, augmentation

=== 1. Introduction ===
Artificial intelligence methods allow for quick segmentation or classification of various data. However, these methods require an enormous amount of data to train such models. This is especially visible in the case of artificial neural networks, where deep architectures can classify data much better, although they need a lot of training data. Quite often, such data may not be enough to obtain a solution that can be implemented in practice. For this purpose, augmentation is used. It is the process of artificially creating new samples within a single class in order to increase the amount of data in the training set [1].

In the case of image processing, augmentation is based on rotating or zooming some areas. This can provide a new sample with similar features but in a different orientation or configuration. Apart from the classic methods of sample analysis, new ones are proposed. An example of this is augmentation based on combining two samples through interpolation of mathematical functions [2]. The idea is to create points from two images and interpolate them to superimpose the two images with a certain transparency. Similar tools (like interpolation techniques) can be used in different approaches. It was shown in [3], where the authors use them to generate synthetic data instances. Again, in [4] the idea of random cropping as an augmentation method was shown not to be the best approach. According to the presented results, this method can produce noise in the gradient during the training process.

The augmentation process is very important in tasks where the data are gathered over a long time, as in medicine. Automatic analysis of test results in the form of expert systems is very much needed to reduce the waiting time for a diagnosis. For this purpose, expert systems quite often use solutions based on convolutional neural networks (CNNs). It is visible in the images of moles on the skin, which are one of the basic and first examinations for the detection of potential skin cancer such as melanoma. CNNs can be used for image processing, feature extraction and even classification or segmentation, as was shown in [5, 6, 7]. This type of machine learning technique is also used in the detection of Parkinson's disease [8]. Medical analysis by the use of machine learning is badly needed for faster disease detection and choice of treatment. Biomedical informatics also uses augmented reality to increase the quality of data processing and learning [9, 10, 11].

Decision support systems quite often rely not only on algorithms, but also on frameworks and alternative solutions. An example of a framework for the analysis of medical images, especially those obtained during tomography, is presented in [12]. In addition, new neural network architectures are also modeled to diagnose e.g. covid-19 [13, 14]. Moreover, medical systems rely on deep neural networks that require training. The classical approach is based on teaching one model, but federated learning is also being developed; it is based on training in parallel on many clients who aggregate a common model [15, 16].

Based on this observation, in this paper a deep learning method was used for fast analysis of X-ray images to detect possible pneumonia. The contributions of this research are:
* analysis of selected augmentation methods and their impact on convolutional neural networks,
* the use of augmentation methods to expand the training set of medical images.

=== 2. Methodology ===
In this section, all mathematical aspects of CNNs, training algorithms and augmentation methods are described.

==== 2.1. Convolutional neural network ====
A CNN is created from three types of layers: convolutional, pooling and fully connected (dense). The first type is the convolutional one, whose purpose is to transform the image in order to extract features from the analyzed image. It is done by applying the convolution operator (*) to each pixel at a given position in the image <math>I_{x,y}</math> with a filter matrix <math>f_{k \times k}</math> according to:

<math>f * I_{x,y} = \sum_{i=1}^{k} \sum_{j=1}^{k} f_{i,j} \cdot I_{x+i-1,\, y+j-1} + b,</math> (1)

where <math>b</math> is a bias.

The second layer type is called pooling. Its main task is to resize the image. It is performed by selecting one pixel from a given grid by the use of a mathematical function such as minimum or maximum. After finding a pixel in the first grid (which is placed above the pixel at position (0,0) in the image), the grid is moved to the next position. This is repeated until the grid covers the last pixel in the image. As a result of the layer's operation, an image is created from the selected pixels.

The last layer type is fully connected, and it is a classic column of neurons that combines numerical values and weights:

<math>\sigma\left(\sum_{i=0}^{n-1} w_{i,j} x_i\right),</math> (2)

where <math>\sigma</math> is the activation function, <math>n</math> is the number of neurons in the previous column, and <math>x_i</math> is the result from a neuron in the previous layer on connection <math>w_{i,j}</math>.
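As an illustration of Eq. (1), the following minimal NumPy sketch slides a single k x k filter over an image at every valid position. The function name <code>convolve2d_valid</code> and the use of a "valid" (no padding) window are assumptions made for this example, not details taken from the paper.

<syntaxhighlight lang="python">
import numpy as np

def convolve2d_valid(image: np.ndarray, filt: np.ndarray, bias: float = 0.0) -> np.ndarray:
    """Apply the convolution operator of Eq. (1): a k x k filter slid over the image.

    Note: like most CNN frameworks, this is technically cross-correlation
    (the filter is not flipped), which matches the formula in the paper.
    """
    k = filt.shape[0]
    h, w = image.shape
    out = np.zeros((h - k + 1, w - k + 1))
    for x in range(out.shape[0]):
        for y in range(out.shape[1]):
            # sum_{i,j} f[i,j] * I[x+i, y+j] + b  (indices shifted to 0-based)
            out[x, y] = np.sum(filt * image[x:x + k, y:y + k]) + bias
    return out

# Tiny usage example with a random "image" and a 3x3 filter.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.random((8, 8))
    f = rng.random((3, 3))
    print(convolve2d_valid(img, f).shape)  # (6, 6)
</syntaxhighlight>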
==== 2.2. Training algorithm ====
The training process of a CNN consists in modifying the weight values, which can be done with the ADAM algorithm [17]. This algorithm assumes that the weights will be changed according to statistical values, namely the first and second moment estimates <math>m</math> and <math>v</math> in the <math>t</math>-th iteration. They can be defined as:

<math>m_t = \beta_1 m_{t-1} + (1 - \beta_1) g_t,</math> (3)

<math>v_t = \beta_2 v_{t-1} + (1 - \beta_2) g_t^2,</math> (4)

where <math>g_t</math> is the gradient of the loss function with respect to the weight in iteration <math>t</math>. In the above formulas, the coefficients <math>\beta_1</math> and <math>\beta_2</math> are exponential decay rates. Having these two estimates, their bias-corrected versions are calculated:

<math>\hat{m}_t = \frac{m_t}{1 - \beta_1^t},</math> (5)

<math>\hat{v}_t = \frac{v_t}{1 - \beta_2^t}.</math> (6)

Finally, the weight in the next iteration <math>(t+1)</math> is defined by the following formula:

<math>w_{t+1} = w_t - \frac{\eta}{\sqrt{\hat{v}_t} + \epsilon}\, \hat{m}_t,</math> (7)

where <math>\epsilon \approx 0</math> and <math>\eta</math> is known as the learning coefficient.
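A minimal sketch of the ADAM update of Eqs. (3)-(7) is given below, assuming a single weight tensor and a gradient supplied by the caller. The default hyper-parameters (beta1=0.9, beta2=0.999, eps=1e-8) follow the values commonly used for ADAM and are not taken from the paper.

<syntaxhighlight lang="python">
import numpy as np

def adam_step(w, grad, m, v, t, eta=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One ADAM update for weights `w` given gradient `grad` (Eqs. (3)-(7))."""
    m = beta1 * m + (1 - beta1) * grad            # Eq. (3): first moment
    v = beta2 * v + (1 - beta2) * grad ** 2       # Eq. (4): second moment
    m_hat = m / (1 - beta1 ** t)                  # Eq. (5): bias correction
    v_hat = v / (1 - beta2 ** t)                  # Eq. (6): bias correction
    w = w - eta * m_hat / (np.sqrt(v_hat) + eps)  # Eq. (7): weight update
    return w, m, v

# Usage example: minimise f(w) = w^2 starting from w = 3.
w, m, v = 3.0, 0.0, 0.0
for t in range(1, 501):
    grad = 2 * w                                  # gradient of w^2
    w, m, v = adam_step(w, grad, m, v, t, eta=0.1)
print(w)  # w has moved close to the minimum at 0
</syntaxhighlight>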
==== 2.3. Augmentation methods ====

===== 2.3.1. Random rotation =====
This model adds an augmentation layer that slightly and randomly rotates the input image, right before the input layer of the base model. The mathematical formulation of this method can be shown as a transformation matrix:

<math>\begin{bmatrix} \alpha & \beta & (1-\alpha)\cdot x - \beta \cdot y \\ -\beta & \alpha & \beta \cdot x + (1-\alpha)\cdot y \end{bmatrix},</math> (8)

where <math>\alpha = s \cdot \cos\theta</math>, <math>\beta = s \cdot \sin\theta</math>, <math>\theta</math> is the rotation angle chosen in a random way, <math>s</math> is a scale parameter, and <math>(x, y)</math> is the center of rotation. An example of such augmentation is shown in Fig. 1.

''Figure 1: An example of how images are randomly rotated.''

===== 2.3.2. Random contrast =====
This model adds an augmentation layer that slightly and randomly changes the contrast of the input image, right before the input layer of the base model. An example of such contrast changing is presented in Fig. 2. It is made by changing the value of the colors as:

<math>c' = F(c - 128) + 128,</math> (9)

where <math>c'</math> is the new color in the RGB color model, <math>c</math> is the value of the selected color being changed, and <math>F</math> is the correction coefficient defined as follows:

<math>F = \frac{259(C + 255)}{255(259 - C)},</math> (10)

where <math>C</math> is the contrast level. In the case of augmentation, this coefficient is random.

''Figure 2: An example of how random contrast is applied to an image.''

===== 2.3.3. Random rotation and contrast =====
This model joins the two previously described augmentation methods and applies them to the input image, right before the input layer of the base model. The combination of both presented augmentation methods is shown in Fig. 3.

''Figure 3: An example of how random contrast and rotation is applied to an image.''
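The paper does not show its implementation, but augmentation layers of this kind are available off the shelf. The sketch below illustrates how the three variants studied (rotation, contrast, and both) could be placed in front of a base model using TensorFlow (>= 2.6) Keras preprocessing layers; the rotation and contrast factors are illustrative assumptions, since the paper only states that the parameters are chosen randomly.

<syntaxhighlight lang="python">
import tensorflow as tf
from tensorflow.keras import layers

# Three augmentation front-ends corresponding to Sections 2.3.1-2.3.3.
# Factors are illustrative; RandomRotation's factor is a fraction of 2*pi.
random_rotation = tf.keras.Sequential([layers.RandomRotation(factor=0.05)])   # about +/- 18 degrees
random_contrast = tf.keras.Sequential([layers.RandomContrast(factor=0.3)])
rotation_and_contrast = tf.keras.Sequential([
    layers.RandomRotation(factor=0.05),
    layers.RandomContrast(factor=0.3),
])

def with_augmentation(augmentation: tf.keras.Model, base_model: tf.keras.Model,
                      input_shape=(128, 256, 1)) -> tf.keras.Model:
    """Prepend the chosen augmentation to the base model.

    The preprocessing layers are only active during training and act as the
    identity at inference time.
    """
    inputs = tf.keras.Input(shape=input_shape)
    x = augmentation(inputs)
    outputs = base_model(x)
    return tf.keras.Model(inputs, outputs)
</syntaxhighlight>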
=== 3. Experiments ===
In this section, the experimental settings, obtained results and discussion are presented.

==== 3.1. Testing environment ====
All experiments were conducted on a computer with the following specifications:
* Processor: AMD Ryzen 5 5600X 6-Core Processor, 4.20 GHz
* Installed RAM: 32.0 GB
* System type: 64-bit Windows 10; x64-based processor
All computing was done solely with the CPU.

==== 3.2. Database ====
The data used in the experiments consists of 5216 X-ray images (of different sizes) of patients with suspected pneumonia, 3875 of which were confirmed cases (both viral and bacterial infections), while the other 1341 were healthy. The data is accessible at Kaggle, a public dataset platform for data scientists and machine learning enthusiasts operated by Google LLC.

==== 3.3. Data preparation ====
The images were first resized to 256x128 pixels and then divided randomly into two groups:
* train group (75% of the database),
* validation group (25% of the database).

==== 3.4. Assessment ====
The goal of this paper is to show what impact different types of data augmentation have on an already well-performing neural network model. The structure of the base CNN model is as follows (a code sketch of this architecture is given below):
# Input layer - a convolutional layer having 128 neurons with 3x3-sized filters, with input shape 128,256,1 (the shape of a 2D grayscale image) and the ReLU (Rectified Linear Unit) activation function. The output of this layer is then passed onto the pooling layers described below.
# Hidden layers - the model uses two more convolutional blocks, each subsequent one having half the number of neurons of the previous one. All pooling layers used in the model have 2x2-sized filters, and all convolutional layers have 3x3-sized filters. The output is then passed onto a dense layer with 64 neurons and ReLU activation. Next, a dropout layer (with the rate set to 0.5) and the following flatten layer prepare the final output of the hidden segment. The order of these layers and the number of neurons within them is:
#* pooling layer (2x2), ReLU
#* convolutional layer (3x3), 64 neurons, ReLU
#* pooling layer (2x2), ReLU
#* convolutional layer (3x3), 32 neurons, ReLU
#* pooling layer (2x2), ReLU
#* dense layer, 64 neurons, ReLU
#* dropout layer (rate: 0.5)
#* flatten layer
# Output layer - a dense layer with 2 neurons, related to the health status of a patient. During assessment, the larger of the two neuron values is chosen, and thus the patient is classified as healthy or as ill with pneumonia.

The results were obtained by assessing the aforementioned validation group, in which 977 cases (roughly 75% of the group) were pneumonic, while the remaining 327 (25%) were healthy.
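The following Keras sketch reproduces the layer ordering listed above under stated assumptions: max pooling is used for the pooling layers, the flatten layer is moved before the dense block (the paper lists it after the dropout layer) to follow the conventional Keras pattern, and softmax is used on the 2-neuron output; none of these details are stated explicitly in the paper.

<syntaxhighlight lang="python">
import tensorflow as tf
from tensorflow.keras import layers, models

def build_base_model(input_shape=(128, 256, 1)) -> tf.keras.Model:
    """Sketch of the base CNN described in Section 3.4 (assumptions noted above)."""
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(128, (3, 3), activation="relu"),   # input convolutional layer
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(32, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(2, activation="softmax"),           # healthy vs. pneumonia
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
</syntaxhighlight>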
==== 3.5. Results ====
In this subsection, the results of the experiments are shown. In Tab. 1, the calculated metrics accuracy, precision, recall and F1-score are displayed for each model. The values were calculated with the following formulas (for binary classification):
* Accuracy:
<math>A = \frac{TP + TN}{TP + FP + TN + FN},</math> (11)
* Precision:
<math>P = \frac{TP}{TP + FP},</math> (12)
* Recall:
<math>R = \frac{TP}{TP + FN},</math> (13)
* F1-score:
<math>\frac{1}{F_1} = 0.5 \cdot \left(\frac{1}{P} + \frac{1}{R}\right),</math> (14)
where
* TP - true sample predicted as true,
* TN - false sample predicted as false,
* FP - false sample predicted as true,
* FN - true sample predicted as false.

In the case of the metrics other than accuracy itself, two cases are considered: first, where pneumonia is treated as the positive class and healthy as the negative one; second, where it is the other way round.

The relationship between predicted and real outcomes is also displayed in confusion matrices (Fig. 4) for each tested model.

''Table 1: Calculated metrics''
{| class="wikitable"
! model !! accuracy !! class !! precision !! recall !! f1-score
|-
| base CNN model || 0.9601 || 0 || 0.9630 || 0.8746 || 0.9167
|-
| base CNN model || 0.9601 || 1 || 0.9593 || 0.9887 || 0.9738
|-
| rotation || 0.9387 || 0 || 0.9273 || 0.8196 || 0.8701
|-
| rotation || 0.9387 || 1 || 0.9419 || 0.9785 || 0.9598
|-
| contrast || 0.9663 || 0 || 0.9199 || 0.9480 || 0.9337
|-
| contrast || 0.9663 || 1 || 0.9824 || 0.9724 || 0.9774
|-
| rotation and contrast || 0.9210 || 0 || 0.8043 || 0.9052 || 0.8518
|-
| rotation and contrast || 0.9210 || 1 || 0.9669 || 0.9263 || 0.9462
|}

In Tab. 1, the results show that the base model reaches a decent accuracy of 96%, but its recall could be improved when it comes to detecting healthy cases (41 out of 327 healthy patients were diagnosed with pneumonia by the model while in fact being healthy). Furthermore, rotation augmentation worsened the overall performance of the model, while the contrast one proved to be somewhat beneficial to the results (see Fig. 4). Not only did the accuracy slightly improve, but the recall metric for detecting healthy cases grew considerably. Although the ability to detect all the pneumonic cases dropped, contrast augmentation brings more balance to the model's assessment and thus improves its overall performance.

''Figure 4: Confusion matrices for the described models.''
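As a concrete illustration of Eqs. (11)-(14), the sketch below recomputes the metrics from confusion-matrix counts. The counts used in the example are hypothetical and only show the mechanics; they are not the paper's confusion matrices.

<syntaxhighlight lang="python">
def binary_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute the metrics of Eqs. (11)-(14) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)           # Eq. (11)
    precision = tp / (tp + fp)                           # Eq. (12)
    recall = tp / (tp + fn)                              # Eq. (13)
    f1 = 2 * precision * recall / (precision + recall)   # Eq. (14), harmonic-mean form
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical counts, with pneumonia treated as the positive class.
print(binary_metrics(tp=950, tn=300, fp=27, fn=27))
</syntaxhighlight>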
=== 4. Conclusions ===
The analysis of medical images is important in order to quickly detect disease or help a doctor make a diagnosis decision. For this purpose, the use of a convolutional neural network for the analysis of X-ray images was presented. As part of the research, the possibilities of using augmentation were considered (techniques such as random rotation, contrast change and a combination of both). The obtained results indicate that augmentation can quickly and easily extend the training set. Random contrast change as the main augmentation technique performed better in terms of model accuracy compared to the original database. In addition, it was found that the use of rotation on medical images deteriorated the performance of the trained model. The reason for this is the rearrangement of the chest area on the X-rays. As a result, the database is enlarged with data that differ drastically from the rest, which consequently reduces the effectiveness of the neural network. The results of the last model analyzed in this paper, that is the one with both augmentations applied, show the worst results of them all. Not only is its accuracy lower, but also the ability to detect pneumonic cases, which is crucial in medical illness detection, plummeted. A positive impact of classic data augmentation techniques on CNN-model performance was similarly shown in liver illness recognition [18]. It was also suggested that classic augmentation methods combined with cutting-edge augmentation methods, such as generative adversarial networks (GANs), yield the best results of all tested model configurations. In [19], possible negative effects of joined classic augmentation methods in medical image classification were discussed, as well as their individual impact on the learning process.

=== Acknowledgements ===
This work is supported by the Silesian University of Technology by the mentoring project.

=== References ===
[1] M. Elgendi, M. U. Nasir, Q. Tang, D. Smith, J.-P. Grenier, C. Batte, B. Spieler, W. D. Leslie, C. Menon, R. R. Fletcher, et al., The effectiveness of image augmentation in deep learning networks for detecting covid-19: A geometric transformation perspective, Frontiers in Medicine 8 (2021).

[2] D. Połap, M. Włodarczyk-Sielicka, Interpolation merge as augmentation technique in the problem of ship classification, in: 2020 15th Conference on Computer Science and Information Systems (FedCSIS), IEEE, 2020, pp. 443-446.

[3] O. O. Abayomi-Alli, R. Damasevicius, S. Misra, R. Maskeliunas, A. Abayomi-Alli, Malignant skin melanoma detection using image augmentation by oversampling in nonlinear lower-dimensional embedding manifold, Turkish Journal of Electrical Engineering & Computer Sciences 29 (2021) 2600-2614.

[4] R. Yang, R. Wang, Y. Deng, X. Jia, H. Zhang, Rethinking the random cropping data augmentation method used in the training of cnn-based sar image ship detector, Remote Sensing 13 (2021) 34.

[5] D. Połap, Analysis of skin marks through the use of intelligent things, IEEE Access 7 (2019) 149355-149363.

[6] R. Wang, G. Zheng, Cycmis: Cycle-consistent cross-domain medical image segmentation via diverse image augmentation, Medical Image Analysis 76 (2022) 102328.

[7] D. Połap, Fuzzy consensus with federated learning method in medical systems, IEEE Access 9 (2021) 150383-150392.

[8] O. O. Abayomi-Alli, R. Damaševičius, R. Maskeliūnas, A. Abayomi-Alli, Bilstm with data augmentation using interpolation methods to improve early detection of parkinson disease, in: 2020 15th Conference on Computer Science and Information Systems (FedCSIS), IEEE, 2020, pp. 371-380.

[9] Y. Djenouri, A. Belhadi, G. Srivastava, J. C.-W. Lin, Secure collaborative augmented reality framework for biomedical informatics, IEEE Journal of Biomedical and Health Informatics (2021).

[10] C. Moro, J. Birt, Z. Stromberga, C. Phelps, J. Clark, P. Glasziou, A. M. Scott, Virtual and augmented reality enhancements to medical and science student physiology and anatomy test performance: A systematic review and meta-analysis, Anatomical Sciences Education 14 (2021) 368-376.

[11] Y. Zhuang, J. Sun, J. Liu, Diagnosis of chronic kidney disease by three-dimensional contrast-enhanced ultrasound combined with augmented reality medical technology, Journal of Healthcare Engineering 2021 (2021).

[12] T. Akram, M. Attique, S. Gul, A. Shahzad, M. Altaf, S. Naqvi, R. Damaševičius, R. Maskeliūnas, A novel framework for rapid diagnosis of covid-19 on computed tomography scans, Pattern Analysis and Applications 24 (2021) 951-964.

[13] J. Rasheed, A. A. Hameed, C. Djeddi, A. Jamil, F. Al-Turjman, A machine learning-based framework for diagnosis of covid-19 from chest x-ray images, Interdisciplinary Sciences: Computational Life Sciences 13 (2021) 103-117.

[14] P. Afshar, S. Heidarian, F. Naderkhani, A. Oikonomou, K. N. Plataniotis, A. Mohammadi, Covid-caps: A capsule network-based framework for identification of covid-19 cases from x-ray images, Pattern Recognition Letters 138 (2020) 638-643.

[15] W. Zhang, T. Zhou, Q. Lu, X. Wang, C. Zhu, H. Sun, Z. Wang, S. K. Lo, F.-Y. Wang, Dynamic-fusion-based federated learning for covid-19 detection, IEEE Internet of Things Journal 8 (2021) 15884-15891.

[16] B. Pfitzner, N. Steckhan, B. Arnrich, Federated learning in a medical context: A systematic literature review, ACM Transactions on Internet Technology (TOIT) 21 (2021) 1-31.

[17] D. P. Kingma, J. Ba, Adam: A method for stochastic optimization, arXiv preprint arXiv:1412.6980 (2014).

[18] M. Frid-Adar, I. Diamant, E. Klang, M. Amitai, J. Goldberger, H. Greenspan, Gan-based synthetic medical image augmentation for increased cnn performance in liver lesion classification, Neurocomputing 321 (2018) 321-331.

[19] Z. Hussain, F. Gimenez, D. Yi, D. Rubin, Differential data augmentation techniques for medical imaging classification tasks, AMIA Annual Symposium Proceedings 2017 (2018) 979-984.