=Paper= {{Paper |id=Vol-3179/Paper_16.pdf |storemode=property |title=Information Technology for Chest XRay Images Analysis |pdfUrl=https://ceur-ws.org/Vol-3179/Paper_16.pdf |volume=Vol-3179 |authors=Oleksii Bychkov,Kateryna Merkulova,Yelyzaveta Zhabska,Oleh Stelmakh |dblpUrl=https://dblp.org/rec/conf/iti2/BychkovMZS21 }} ==Information Technology for Chest XRay Images Analysis== https://ceur-ws.org/Vol-3179/Paper_16.pdf
Information Technology for Chest X-Ray Images Analysis
Oleksii Bychkov, Kateryna Merkulova, Yelyzaveta Zhabska and Oleh Stelmakh
Taras Shevchenko National University of Kyiv, 60, Volodymyrska Street, Kyiv, 01601, Ukraine

                Abstract
                This paper presents the research and development of an information technology for the analysis
                and classification of chest X-ray images that automatically detects signs of disease,
                specifically pneumonia, which is especially relevant during the COVID-19 pandemic.
                The information technology is based on a mathematical model developed through complex
                training of neural networks. The dataset used for the experimental studies and for neural
                network training consisted of 35,000 images ranging in size from 200×200 px to 2500×2500 px.
                Convolutional neural networks were used to build software based on the developed
                information technology. The experiments yielded a weighted average F1 score of 97.05%,
                which is close to the recognition rate of a physician.
                During the research, decision support software based on the developed information technology
                was created to assist the physician in making a decision, to help in the analysis of
                lung X-rays for pneumonia, and to store all the necessary information about
                patients in one repository. The program was developed using Microsoft technologies, including
                the C# programming language and WPF, a framework for building user interfaces.
                The software was implemented using the MVVM architecture, with ML.NET
                as the tool for implementing the neural network. The Nvidia RTX 2070 Super graphics
                processor (GPU) and CUDA technology were used to train the neural network.
                The software created for chest X-ray image analysis makes it possible to register patients,
                classify and process images, and add physicians' confirmations. It
                can be used as an auxiliary instrument for diagnosing pneumonia, which will reduce the strain
                on the radiologist and allow a larger number of X-ray images to be processed more effectively.

                Keywords
                Disease recognition, X-ray imaging, image classification, neural networks algorithms.

1. Introduction
    According to the World Health Organization, pneumonia is an acute infectious inflammation of the
lungs and one of the most pressing issues in modern medicine. It is a curable disease that nevertheless
causes the death of millions of people, including more than 800,000 children under the age of
five each year [1], due to lack of access to prevention and treatment. The problem is made
more severe by diagnostic mistakes that lead to delayed intervention and death.
    The treatment of pneumonia involves a number of epidemiological, clinical, pharmacological and,
finally, social aspects. The paradox of pneumonia lies in the fact that, on the one hand, impressive
results have been achieved in understanding the pathogenesis of the infectious process and in increasing
the effectiveness of treatment, while, on the other hand, the number of patients with
severe disease and the death rate are increasing. The existence of this problem is generally accepted,
and research in this direction is actively conducted worldwide.
    Medical X-ray imaging is one of the first and main methods for the detection and diagnosis of
pneumonia and has become widespread in medical practice. The number of chest X-rays has increased
rapidly, but their radiological examination is still usually conducted manually, which delays and reduces
the effectiveness of disease detection. For this reason, the automated detection of signs of pneumonia in
X-rays is a priority for information technology: it will reduce the strain on radiologists,
increase the efficiency of data processing and increase the accuracy of diagnosis. A promising method
for solving this problem is the use of convolutional neural networks, which occupy a leading position
among the tools used for image classification and recognition.

Information Technology and Implementation (IT&I-2021), December 01–03, 2021, Kyiv, Ukraine
EMAIL: bos.knu@gmail.com (O. Bychkov); kate.don11@gmail.com (K. Merkulova); y.zhabska@gmail.com (Y. Zhabska);
olehstelmakh@knu.ua (O. Stelmakh)
ORCID: 0000-0002-9378-9535 (O. Bychkov); 0000-0001-6347-5191 (K. Merkulova); 0000-0002-9917-3723 (Y. Zhabska);
0000-0002-7718-6742 (O. Stelmakh)
             ©️ 2022 Copyright for this paper by its authors.
             Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
             CEUR Workshop Proceedings (CEUR-WS.org)

2. Literature Survey
    During the research, the accuracy of several existing solutions was compared.
    The first paper [2] describes the operating principle of a multilayer perceptron and a convolutional
neural network for the detection and classification of pneumonia. The Chest-X-Ray data set was used,
with a total of 5840 images in two classes: with and without pneumonia. The
classification models developed by the authors were effective and provided an average classification
accuracy of 92.16% for the multilayer perceptron and 94.40% for the convolutional neural network.
    The study [3] describes a convolutional neural network (CNN) for chest disease diagnosis.
A comparative analysis was performed against a backpropagation neural network (BPNN) and a competitive
neural network (CpNN) for disease classification. The CNN, BPNN and CpNN were trained and tested
on chest X-ray images showing various diseases. Analysis of the results indicated that the CNN
achieved better generalization, although its computation time and number of iterations were higher.
    The authors of [4] used a deep convolutional neural network, also trained on the Chest
X-Ray data set. It contained 79% of images with and 21% without signs of pneumonia.
The images were first preprocessed by normalization and magnification. It should be noted that this
solution was based on the idea of maximizing the recall metric for the class of images
containing pneumonia, in order to maximize the recognition rate of the disease specifically. As a
result, this neural network achieved a recall of 98.72%, which is a very high value.
    In the paper [5], a new deep learning framework based on the concept of transfer learning was proposed.
The main idea is to use the experience gained in solving one problem to solve another, similar problem.
To do that, the neural network is first trained on a large amount of data and then on the target set.
Under this approach, image features were extracted using various neural network models pre-trained
on ImageNet and then passed to a classifier for prediction. The authors prepared and
analyzed five different models and proposed a joint model that combines the results of the pre-trained
models, achieving an accuracy of 96.4% in the recognition of pneumonia with a recall of 99.62%.
    The study [6] describes the application of two well-known convolutional neural network models, Xception
and Vgg16, to the diagnosis of pneumonia. The test results indicated that the Vgg16 network
outperforms the Xception network, with classification accuracies of 87% and 82%, respectively.
According to the experimental results, each network has its own advantages in classification. Xception
is more successful in detecting cases of pneumonia, whereas Vgg16 more successfully classifies images
without signs of the disease. In the future, the authors plan to combine these two networks.
    Based on the analysis of existing solutions and programs presented in Table 1, it
can be concluded that neural networks can achieve good results in X-ray image recognition, but, due
to the complexity of the images, this requires powerful neural networks or a combination of neural
networks with a well-thought-out architecture and good performance.
Table 1
Comparative table of the results of existing solutions
 №   Title                                                                                                   Accuracy   Precision   Recall    F1
 1   Models of Learning to Classify X-ray Images for the Detection of Pneumonia using Neural Networks [2]    94.4%      94.5%       94.3%     94.4%
 2   Deep Convolutional Neural Networks for Chest Diseases Detection [3]                                     92.4%      -           -         -
 3   Pneumonia Detection: Pushing the Boundaries of Human Ability with Deep Learning [4]                     90.71%     87.90%      98.72%    93%
 4   A Novel Transfer Learning Based Approach for Pneumonia Detection in Chest X-ray Images [5]              96.39%     93.28%      99.62%    96.34%
 5   Diagnosis of Pneumonia from Chest X-Ray Images using Deep Learning [6]                                  87%        87.98%      85%       86.46%


3. Task Solution Methods
    To develop an information technology for X-ray image analysis, the methods must be carefully
selected and tested through a set of experiments on real datasets. Based on the analysis of the works
mentioned above, it was decided to conduct the experimental
research with the following methods: histogram equalization and heat maps as X-ray image
preprocessing methods, neural networks as the X-ray image classification method, and the metrics of
precision and recall in each of the classes to evaluate the quality of the algorithm.

3.1.    Neural Networks
    To create software based on the information technology for chest X-ray image analysis, four neural
network architectures were used in this research: InceptionV3, MobilenetV2, ResNetV2_50 and
ResNetV2_101. All of them were pre-trained on a large number of images from the ImageNet
database, meaning that transfer learning was used. The basic idea is that a pre-trained
network can already classify a large number of images into many categories, so the features it has
learned help to train it more effectively on new, previously unseen data.
    InceptionV3 [7] is a convolutional neural network that mainly aims to reduce the required computational
power by modifying previous Inception architectures. In the InceptionV3 model, several network
optimization methods are used to ease constraints and simplify model adaptation. These methods
include factorized convolutions, regularization, dimension reduction and parallel computations [8].
    ResNet (Residual Network) is a family of deep neural networks presented in [9].
To solve the problem of vanishing or exploding gradients, this architecture introduced the concept of a
residual network, which uses skip connections. A skip connection lets the input of a block
bypass several layers and be added directly to their output. The advantage of this type of
connection is that any layer that degrades the performance of the architecture can effectively be skipped
through regularization. This makes it possible to train a very deep neural network without the problems
caused by vanishing or exploding gradients. The shortcut connection is what transforms a plain
architecture into a residual network. Each ResNet block is two (ResNet 18, 34) or three (ResNet 50,
101, 152) layers deep. In this research, ResNet networks of 50 and 101 layers were used.
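The skip connection described above can be illustrated with a minimal Python sketch. It is not the ResNetV2 implementation used in this research: the function names are hypothetical, the "layers" are an arbitrary callable, and the ReLU after the addition follows the original residual formulation (other variants place activations differently).

```python
from typing import Callable, List

Vector = List[float]


def relu(values: Vector) -> Vector:
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def residual_block(x: Vector, layers: Callable[[Vector], Vector]) -> Vector:
    """Compute relu(F(x) + x), where F stands for the stacked layers.

    The input x bypasses the layers and is added to their output,
    which is the identity shortcut (skip connection).
    """
    fx = layers(x)
    if len(fx) != len(x):
        raise ValueError("identity shortcut requires matching shapes")
    return relu([a + b for a, b in zip(fx, x)])


# If the stacked layers contribute nothing (F(x) = 0), the block passes
# a non-negative input through unchanged.
print(residual_block([1.0, 2.0, 3.0], lambda v: [0.0 for _ in v]))
```

Because the shortcut carries the input forward unchanged, a block whose layers learn weights close to zero behaves like an identity mapping instead of degrading the signal.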
    MobileNet-v2 [10] is a convolutional neural network whose main idea is that standard convolutional
layers, which are necessary for computer vision tasks but quite expensive to compute, can be replaced by
so-called depthwise separable convolutions. MobileNetV2 also contains inverted residual blocks that work
in a similar way to ResNet blocks but are more efficient. The described types of neural networks form
the basis of the information technology for X-ray image analysis under development.
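The saving from depthwise separable convolutions can be made concrete with the standard cost comparison from the MobileNet literature. The symbols are assumptions introduced here for illustration and do not appear elsewhere in this paper: D_K is the kernel size, M and N are the numbers of input and output channels, and D_F is the spatial size of the feature map.

```latex
\begin{align}
\text{standard convolution:} \quad & D_K^2 \cdot M \cdot N \cdot D_F^2 \\
\text{depthwise separable:} \quad & D_K^2 \cdot M \cdot D_F^2 + M \cdot N \cdot D_F^2 \\
\text{ratio:} \quad & \frac{D_K^2 \cdot M \cdot D_F^2 + M \cdot N \cdot D_F^2}{D_K^2 \cdot M \cdot N \cdot D_F^2}
  = \frac{1}{N} + \frac{1}{D_K^2}
\end{align}
```

For 3×3 kernels and a large number of output channels, the ratio approaches 1/9, that is, roughly eight to nine times fewer multiply-add operations.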

3.2.    Histogram Equalization
    One of the problems that can arise during the training and testing of a neural network is that the
trained model produces worse results when it processes new data whose contrast or brightness differs
from that of the data used previously. To prevent this and obtain better results, it was
decided to include in the information technology for X-ray image analysis under development a
method of improving image quality [11] by enhancing individual image areas. This method is
histogram equalization [12]. Equalization is the process of adjusting the image histogram by changing
the brightness of individual pixels. Essentially, the histogram of an arbitrary image can be presented in
the form of peaks, each showing the number of pixels in the image with a certain brightness. Typically, the
image histogram is a set of peaks distributed nonuniformly over the graph. Nonuniform distribution
means that the histogram has areas with the highest or lowest concentration of peaks, i.e. in the space
of the graph there are areas where the density of values is higher or lower than the graph average. The
equalization procedure makes the distribution of values in the histogram more uniform, almost without
gaps and without areas with an excessively high number (and height) of peaks; equalization is therefore
designed to correct the image by equalizing the integral areas of regions with different brightness. The
function of this transformation is:
$$f(i) = \operatorname{floor}\left(255 \times \frac{\sum_{j=1}^{i} H[j]}{\sum_{j=1}^{255} H[j]}\right), \tag{1}$$

where i is the intensity value; floor(k) is a function that returns the integer part of a real number k; H
is the brightness histogram, an integer array of 256 elements.

   The transformation of the initial image can be described as follows:
$$I'(x, y) = f(I(x, y)), \tag{2}$$
where f is the equalization function; I(x, y) is the current intensity of the pixel with coordinates (x, y).
   This transformation is performed for all image channels, producing an image with an equalized
histogram. An example of the result of applying the algorithm is shown in Figure 1.




Figure 1: Example of equalized image and its histogram
    This method of image processing can be applied when X-rays are imported into the program. It
ensures that the neural network always receives an image of standardized brightness and contrast.
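Equations (1) and (2) can be sketched in a few lines of Python. This is a minimal illustration for a single 8-bit channel stored as nested lists, not the C# implementation used in the software. The function names are hypothetical, and the histogram bins are indexed from 0 to 255 here, which is an assumption (Eq. (1) is written with sums starting at j = 1).

```python
from typing import List


def equalization_lut(histogram: List[int]) -> List[int]:
    """Build the lookup table f(i) of Eq. (1) from a 256-bin histogram."""
    total = sum(histogram)
    if total == 0:
        raise ValueError("histogram is empty")
    lut = []
    cumulative = 0
    for count in histogram:
        cumulative += count
        # Integer floor division gives floor(255 * cumulative / total) exactly.
        lut.append(255 * cumulative // total)
    return lut


def equalize(image: List[List[int]]) -> List[List[int]]:
    """Apply Eq. (2): replace every intensity I(x, y) by f(I(x, y))."""
    histogram = [0] * 256
    for row in image:
        for value in row:
            histogram[value] += 1
    lut = equalization_lut(histogram)
    return [[lut[value] for value in row] for row in image]


# A tiny low-contrast image: the intensities are spread over the full range.
print(equalize([[50, 50], [100, 200]]))  # [[127, 127], [191, 255]]
```

For a colour image, the same function would be applied to each channel separately, as described above.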

3.3.    Heat Maps
    To obtain higher classification results from the information technology
for X-ray image analysis under development, heat maps were used. A heat map is a graphical representation
of data that presents the individual values of a matrix as colors. To provide this
functionality, the program implements an algorithm that forms three types of heat
maps, shown in Figure 2: thermal, rainbow, and a map of white, blue and red colors and their shades.
In general, this transformation can be represented by the following expression:
$$C(x, y) = heatmap[\operatorname{floor}(B(x, y) \times 255)], \tag{3}$$
where C(x, y) is the pixel color; heatmap is an array of new color values for the selected type of heat
map; floor(i) is a function that rounds a real number i down to the nearest integer; B(x, y) is the
brightness of the pixel with coordinates (x, y), where {B(x, y) ∈ R: 0 ≤ B(x, y) ≤ 1}.
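The lookup in Eq. (3) can be sketched as follows. The palette is an assumption: the actual thermal, rainbow and red-white-blue colour tables are not specified here, so a hypothetical linear blue-to-red gradient of 256 colours stands in for them.

```python
import math
from typing import List, Tuple

Color = Tuple[int, int, int]


def make_gradient_palette(start: Color, end: Color, size: int = 256) -> List[Color]:
    """Hypothetical palette: linear interpolation between two RGB colours."""
    palette = []
    for k in range(size):
        t = k / (size - 1)
        palette.append(tuple(round(s + (e - s) * t) for s, e in zip(start, end)))
    return palette


def apply_heatmap(brightness: List[List[float]], palette: List[Color]) -> List[List[Color]]:
    """Eq. (3): C(x, y) = heatmap[floor(B(x, y) * 255)] with 0 <= B(x, y) <= 1."""
    result = []
    for row in brightness:
        new_row = []
        for b in row:
            if not 0.0 <= b <= 1.0:
                raise ValueError("brightness must lie in [0, 1]")
            new_row.append(palette[math.floor(b * 255)])
        result.append(new_row)
    return result


blue_to_red = make_gradient_palette((0, 0, 255), (255, 0, 0))
print(apply_heatmap([[0.0, 1.0]], blue_to_red))  # [[(0, 0, 255), (255, 0, 0)]]
```

Swapping in a different 256-entry palette changes the heat map type without changing the lookup itself.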




Figure 2: Initial image and its heat maps

3.4.    Metrics of Classification Performance Evaluation
   When images are classified with a neural network, the possible outcomes
can be presented in the form of the confusion matrix shown in Table 2 [13]. It is an important concept
that is used in the calculation of many classifier metrics. The matrix consists of two rows indicating the
response received from the neural network and two columns indicating the correctness of the result.

Table 2
Confusion matrix
                                        Expert evaluation
 Category                               True        False
 System evaluation    Positive          TP          FP
                      Negative          FN          TN

    There are 4 groups of model predictions:
    1. TP: true-positive prediction;
    2. FN: false-negative prediction;
    3. FP: false-positive prediction;
    4. TN: true-negative prediction.
    The first metric is accuracy, which describes the overall accuracy of model prediction for all classes
and can be expressed as:
$$Accuracy = \frac{TP + TN}{TP + FP + TN + FN}, \tag{4}$$
where TP, FP, TN, FN are the values of the confusion matrix.
    Accuracy is one of the simplest metrics, but it has an obvious weakness on unbalanced data.
During the calculation, each image is assigned the same weight, which may be misleading if the
distribution of images in the training or test sample is severely biased towards one or more classes. In
this case, the classifier has more information on these classes and, accordingly, makes more valid
decisions within them. One solution to this problem is to train or test the classifier on a
specially prepared, balanced set of images. Another option is to change the approach to formal quality
evaluation: the metrics of precision and recall are used separately in each of the classes
to evaluate the quality of the algorithm. System precision within a class is the share of images that
actually belong to that class among all images that the system has assigned to that class. System
recall is the share of images of the class found by the classifier among all images of
this class in the test sample. These metrics are calculated by the following formulas:
$$Precision = \frac{TP}{TP + FP}, \tag{5}$$
$$Recall = \frac{TP}{TP + FN}, \tag{6}$$
where TP is the number of true-positive predictions from the confusion matrix; FP is the number of
false-positive predictions; FN is the number of false-negative predictions.
   The higher the precision and recall, the better, but the maximum values
of precision and recall usually cannot be reached at the same time. Therefore, it is necessary to find a
certain balance between these values. The F-measure combines information about precision and recall
and can be calculated by the following formula:
$$F_\beta = (1 + \beta^2) \times \frac{precision \times recall}{(\beta^2 \times precision) + recall}, \tag{7}$$
where β takes a value in the range 0 < β < 1 if precision has priority, and β > 1 if recall has priority.
If β = 1, the metric is balanced and can be expressed as:
$$F_1 = 2 \times \frac{precision \times recall}{precision + recall}, \tag{8}$$
   The F-score is a suitable metric for a formal evaluation of classifier quality. It will be used to
judge how efficiently the neural network model performs.
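Equations (4) to (8) translate directly into code. The sketch below is illustrative only: the function names are hypothetical, the counts in the example are invented rather than taken from the experiments, and non-zero denominators are assumed.

```python
def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Eq. (4): share of all predictions that are correct."""
    return (tp + tn) / (tp + fp + tn + fn)


def precision(tp: int, fp: int) -> float:
    """Eq. (5): share of images assigned to the class that truly belong to it."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Eq. (6): share of images of the class that the classifier found."""
    return tp / (tp + fn)


def f_beta(pre: float, rec: float, beta: float = 1.0) -> float:
    """Eq. (7); with beta = 1 it reduces to the balanced F1 of Eq. (8)."""
    beta_sq = beta * beta
    return (1 + beta_sq) * pre * rec / (beta_sq * pre + rec)


# Hypothetical confusion-matrix counts, chosen only for illustration.
tp, fp, tn, fn = 80, 20, 90, 10
p, r = precision(tp, fp), recall(tp, fn)
print(f"accuracy={accuracy(tp, fp, tn, fn):.4f} precision={p:.4f} recall={r:.4f}")
print(f"F0.5={f_beta(p, r, 0.5):.4f} F1={f_beta(p, r):.4f} F2={f_beta(p, r, 2.0):.4f}")
```

In this example recall is higher than precision, so the score grows as β increases, which matches the role of β described above.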
   To generalize the results of precision, recall and F-score, it is necessary to determine
their average value over all classes. The micro average, macro average and weighted average are the
values most often used for this purpose [14]. The micro average for binary classification is calculated
with the following formulas [15] for precision (PRE), recall (REC) and F-score (FSC), respectively:


$$PRE_{micro} = \frac{TP_1 + TP_2}{TP_1 + TP_2 + FP_1 + FP_2}, \tag{9}$$
where TP1 and TP2 are the numbers of true-positive predictions for the first and second class, respectively;
FP1 and FP2 are the numbers of false-positive predictions for the first and second class, respectively.
$$REC_{micro} = \frac{TP_1 + TP_2}{TP_1 + TP_2 + FN_1 + FN_2}, \tag{10}$$
where TP1 and TP2 are the numbers of true-positive predictions for the first and second class, respectively;
FN1 and FN2 are the numbers of false-negative predictions for the first and second class, respectively.
$$FSC_{micro} = 2 \times \frac{PRE_{micro} \times REC_{micro}}{PRE_{micro} + REC_{micro}}, \tag{11}$$
  The macro average for binary classification is calculated with simple averaging formulas:
$$PRE_{macro} = \frac{P_1 + P_2}{2}, \tag{12}$$
where P1 and P2 are the precision values for the first and second class, respectively.
$$REC_{macro} = \frac{R_1 + R_2}{2}, \tag{13}$$
where R1 and R2 are the recall values for the first and second class, respectively.
$$FSC_{macro} = 2 \times \frac{PRE_{macro} \times REC_{macro}}{PRE_{macro} + REC_{macro}}, \tag{14}$$
  The weighted average value is calculated as follows:
$$PRE_{weighted} = \frac{P_1 \times NI_1 + P_2 \times NI_2}{NI_1 + NI_2}, \tag{15}$$
$$REC_{weighted} = \frac{R_1 \times NI_1 + R_2 \times NI_2}{NI_1 + NI_2}, \tag{16}$$
where P1 and P2 are the precision values and R1 and R2 are the recall values for the first and second
class, respectively; NI1 and NI2 are the numbers of images of the first and second class, respectively.
$$FSC_{weighted} = 2 \times \frac{PRE_{weighted} \times REC_{weighted}}{PRE_{weighted} + REC_{weighted}}, \tag{17}$$
   All the described values will be used in this research to determine the efficiency of neural network
and compare the obtained results.
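The three averaging schemes of Eqs. (9) to (17) can be compared side by side in a short Python sketch. The class and function names are hypothetical, the counts are invented for illustration, and NI is taken to be the number of images that truly belong to a class (TP + FN), which is an assumption about how the class sizes are counted.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class ClassCounts:
    """Per-class counts read from a confusion matrix."""
    tp: int  # images of this class assigned to this class
    fp: int  # images of another class assigned to this class
    fn: int  # images of this class assigned to another class

    @property
    def precision(self) -> float:
        return self.tp / (self.tp + self.fp)

    @property
    def recall(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def n_images(self) -> int:
        return self.tp + self.fn


def summarize(pre: float, rec: float) -> Dict[str, float]:
    """Combine averaged precision and recall into an F-score (Eqs. 11, 14, 17)."""
    return {"precision": pre, "recall": rec, "f_score": 2 * pre * rec / (pre + rec)}


def micro_average(classes: Sequence[ClassCounts]) -> Dict[str, float]:
    """Eqs. (9)-(10): pool the counts of all classes, then compute the metrics."""
    tp = sum(c.tp for c in classes)
    fp = sum(c.fp for c in classes)
    fn = sum(c.fn for c in classes)
    return summarize(tp / (tp + fp), tp / (tp + fn))


def macro_average(classes: Sequence[ClassCounts]) -> Dict[str, float]:
    """Eqs. (12)-(13): plain mean of the per-class metrics."""
    n = len(classes)
    return summarize(sum(c.precision for c in classes) / n,
                     sum(c.recall for c in classes) / n)


def weighted_average(classes: Sequence[ClassCounts]) -> Dict[str, float]:
    """Eqs. (15)-(16): mean of the per-class metrics weighted by class size."""
    total = sum(c.n_images for c in classes)
    return summarize(sum(c.precision * c.n_images for c in classes) / total,
                     sum(c.recall * c.n_images for c in classes) / total)


# Hypothetical unbalanced binary problem: 100 images of class 1, 300 of class 2.
classes = [ClassCounts(tp=80, fp=30, fn=20), ClassCounts(tp=270, fp=20, fn=30)]
for name, average in (("micro", micro_average), ("macro", macro_average),
                      ("weighted", weighted_average)):
    print(name, {key: round(value, 4) for key, value in average(classes).items()})
```

On a balanced test sample the macro and weighted averages coincide; the unbalanced example is used here only to make the difference between them visible.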

4. Experimental Research and Analysis
4.1. Dataset
    The research was conducted on a dataset for training and testing with a total size of 35,000 images.
The dataset was created by combining five different sets from the Kaggle resource. A major part of
the dataset is “Chest X-Ray images (Pneumonia)” [16], which contains images of different sizes taken
at different angles, as well as images with specific features, such as the presence of a pacemaker.
Image sizes vary from the smallest of 200×200 px to the largest of 2500×2500 px. The histogram
equalization algorithm was then applied to each image. Based on the images with the equalized histogram,
three more sets of images with rainbow, red-white-blue and thermal heat maps were created, each with the
same number of images.
    For further training of the neural networks based on the above-mentioned architectures, supervised
learning was used, in which the data set is partitioned into a
training and a test sample [17]. In this work, however, five training datasets were prepared from the
combined dataset. The training datasets contain different numbers of images, and a single test sample
is used to test how the performance of the neural network depends on the size of the training
dataset. The test sample contains 5% of the images and the largest training sample contains 95% of the
images from the combined dataset. In all cases, a 1:1 ratio of image classes was used, because it is
impossible to predict the relative frequency of the image classes that will be encountered in the program.
The list and properties of all datasets are presented in Table 3. The sizes of the training samples were
calculated starting from the number of images in the largest training sample, successively dividing it by
two and rounding down to the nearest even number where the result was not already even.
Table 3
Properties of the used datasets
   №      Training dataset size      Testing dataset size
   1      2078                       1750
   2      4156                       1750
   3      8312                       1750
   4      16624                      1750
   5      33250                      1750
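The sizes in Table 3 follow from the halving rule described before the table. A minimal sketch, with a hypothetical function name, reproduces them:

```python
from typing import List


def training_set_sizes(largest: int, count: int) -> List[int]:
    """Halve the largest training set repeatedly, rounding down to an even number.

    An even size keeps the 1:1 class ratio exact.
    """
    sizes = [largest]
    while len(sizes) < count:
        half = sizes[-1] // 2
        if half % 2 == 1:
            half -= 1
        sizes.append(half)
    return sorted(sizes)


total_images = 35_000
test_size = total_images * 5 // 100          # 5% of the combined dataset
largest_training = total_images - test_size  # remaining 95%

print(test_size)                                 # 1750
print(training_set_sizes(largest_training, 5))   # [2078, 4156, 8312, 16624, 33250]
```

Only the first halving needs rounding (33250 / 2 = 16625, which becomes 16624); the later halves are already even.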


4.2.    Research Results
   This research uses an approach based on an ensemble of neural networks, each of which
specializes in differently processed images.
   In general, the classification process consists of the following stages: selecting and loading an image,
pre-processing it with the histogram equalization method, creating a series of heat maps, obtaining
classification results from the neural network for each image type, and showing the overall result.
   Accordingly, as a starting point, a set of unique images was created using the previously mentioned
methods. Then each neural network architecture was trained on all four sets of images.
During the training, the metrics of accuracy and cross entropy were calculated; they are shown as
graphs in Figures 3-6. The closer the accuracy value is to 1, the better the result. For cross entropy,
on the contrary, the closer the value is to 0, the better.




   Figure 3: Values of (a) accuracy and (b) cross entropy for original images with equalized histogram

   Figure 4: Values of (a) accuracy and (b) cross entropy for images with rainbow heat map




Figure 5: Values of (a) accuracy and (b) cross entropy for images with red-white-blue heat map

Figure 6: Values of (a) accuracy and (b) cross entropy for images with thermal heat map
   After the neural networks were trained and tested on the test dataset, another set of metrics was
obtained that makes it possible to determine the most suitable neural network architecture for a
particular image type. Figures 7 and 8 present the following metrics:
   1. Micro and macro accuracy: the closer the value is to 1, the better the model.
   2. Logarithmic loss is a metric that indicates how close the predicted probability is to the
corresponding actual (true) value. The more the predicted probability deviates from the actual value, the
higher the logarithmic loss. The closer the value is to 0, the better.
   3. Logarithmic loss reduction is a metric that indicates how much better the current model is than a
model that produces random predictions. A logarithmic loss reduction close to 1 indicates a better model.
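The two logarithmic-loss metrics in the list above can be illustrated with a short Python sketch. It uses the common definitions: the mean negative log-probability assigned to the true class, and the relative improvement over a baseline that always predicts the class frequencies. Whether the values plotted in Figures 7 and 8 were computed in exactly this way is an assumption, and the function names are hypothetical.

```python
import math
from collections import Counter
from typing import Sequence


def log_loss(true_labels: Sequence[int], probabilities: Sequence[Sequence[float]]) -> float:
    """Mean of -ln(p), where p is the probability predicted for the true class."""
    eps = 1e-15  # avoids log(0) for overconfident wrong predictions
    total = 0.0
    for label, probs in zip(true_labels, probabilities):
        total -= math.log(max(probs[label], eps))
    return total / len(true_labels)


def log_loss_reduction(true_labels: Sequence[int],
                       probabilities: Sequence[Sequence[float]]) -> float:
    """Relative improvement over a baseline that predicts the class frequencies.

    Assumes at least two classes occur in true_labels (non-zero baseline loss).
    """
    n = len(true_labels)
    counts = Counter(true_labels)
    num_classes = len(probabilities[0])
    prior = [counts.get(c, 0) / n for c in range(num_classes)]
    prior_loss = log_loss(true_labels, [prior] * n)
    return (prior_loss - log_loss(true_labels, probabilities)) / prior_loss


labels = [0, 1, 0, 1]                      # 0 = Normal, 1 = Pneumonia (hypothetical data)
confident = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3], [0.1, 0.9]]
print(round(log_loss(labels, confident), 4))
print(round(log_loss_reduction(labels, confident), 4))
```

A perfect classifier reaches a loss of 0 and a reduction of 1, while a classifier that only reproduces the class frequencies has a reduction of 0.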
   After all the experiments were performed, it was determined that the most effective pairs of
architecture and image type in binary classification were the following: InceptionV3 with rainbow heat
map images; MobilenetV2 with thermal heat map images; ResNetV2_50 with original images with an
equalized histogram; ResNetV2_101 with red-white-blue heat map images.
   Next, the class of an image was determined from the classification results of
all four neural networks, each applied to its corresponding image type. To obtain the final result
from the four vectors, each holding two probabilities that the image belongs to the corresponding class,
the average value for each class was calculated and the averages were compared. The best recognition
results were achieved with neural network models that were trained on a data set of 33,250 images
with a class ratio of 1:1. A confusion matrix, shown in Table 4, was used to obtain the model metrics.
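The averaging step described above can be sketched as follows. The probability vectors are invented for illustration, the function name is hypothetical, and the tie-breaking behaviour (the first class wins) is an assumption that the text does not specify.

```python
from typing import List, Sequence, Tuple

CLASSES = ("Normal", "Pneumonia")


def ensemble_predict(probability_vectors: Sequence[Sequence[float]]) -> Tuple[str, List[float]]:
    """Average the per-class probabilities of all networks and pick the larger one."""
    n = len(probability_vectors)
    averaged = [sum(vector[k] for vector in probability_vectors) / n
                for k in range(len(CLASSES))]
    best = max(range(len(CLASSES)), key=lambda k: averaged[k])
    return CLASSES[best], averaged


# One hypothetical output per network, each fed its own image type.
outputs = [
    [0.30, 0.70],  # InceptionV3 on the rainbow heat map
    [0.60, 0.40],  # MobilenetV2 on the thermal heat map
    [0.20, 0.80],  # ResNetV2_50 on the equalized image
    [0.45, 0.55],  # ResNetV2_101 on the red-white-blue heat map
]
label, averaged = ensemble_predict(outputs)
print(label, [round(p, 4) for p in averaged])
```

In this example one network votes for "Normal", but the averaged probabilities still favour "Pneumonia", which shows how the ensemble smooths out a single disagreeing model.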
Table 4
Confusion matrix for model metric values
                                        Expert evaluation
 Category                               Normal      Pneumonia
 System evaluation    Normal            862         39
                      Pneumonia         13          836

   From the obtained values, several metrics were calculated for the numerical evaluation of the
classifiers; they are shown in Table 5.


Figure 7: Metric values for (a) image with equalized histogram and (b) image with rainbow heat map




Figure 8: Metric values for image with (a) red-white-blue heat map and (b) thermal heat map


Table 5
Model metric values
       Class        Accuracy    Precision    Recall      F1         Number of correctly classified images
      Normal         98.51%      95.67%     98.51%     97.07%                       862
    Pneumonia        95.54%      98.46%     95.54%     96.98%                       836
      General        97.03%      97.07%     97.03%     97.05%                      1698
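
   As a check, the per-class values in Table 5 can be recomputed from the counts in Table 4. The sketch
below is illustrative only. The way the General row is formed (averaging precision and recall over the
two classes and taking their harmonic mean) is an assumption that happens to reproduce the reported
97.05%; the source does not state the exact formula.

```python
# Counts from Table 4; keys are (system evaluation, expert evaluation).
counts = {
    ("Normal", "Normal"): 862,
    ("Normal", "Pneumonia"): 39,
    ("Pneumonia", "Normal"): 13,
    ("Pneumonia", "Pneumonia"): 836,
}
classes = ("Normal", "Pneumonia")


def per_class_metrics(counts, cls):
    """Return (precision, recall, F1) for one class."""
    tp = counts[(cls, cls)]
    # All images the system assigned to this class.
    predicted = sum(n for (system, _), n in counts.items() if system == cls)
    # All images the expert assigned to this class.
    actual = sum(n for (_, expert), n in counts.items() if expert == cls)
    precision = tp / predicted
    recall = tp / actual
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


metrics = {c: per_class_metrics(counts, c) for c in classes}
total = sum(counts.values())
accuracy = sum(counts[(c, c)] for c in classes) / total

# Assumed aggregation for the General row: average precision and recall over
# the classes, then take the harmonic mean of the two averages.
avg_precision = sum(metrics[c][0] for c in classes) / len(classes)
avg_recall = sum(metrics[c][1] for c in classes) / len(classes)
general_f1 = 2 * avg_precision * avg_recall / (avg_precision + avg_recall)

for c in classes:
    p, r, f1 = metrics[c]
    print(f"{c}: precision={p:.2%} recall={r:.2%} F1={f1:.2%}")
print(f"General: accuracy={accuracy:.2%} F1={general_f1:.2%}")
```

   Because both classes contain 875 test images, weighting the class averages by class size gives the
same result as a plain average.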

   The best result obtained was compared with the results of the existing solutions listed in Table 2.
The proposed solution produces the highest result among them according to the calculated classifier
metrics, providing high accuracy of pneumonia detection.

5. Software Development
    Based on the information technology developed in this research, decision support software was
created using the MVVM architecture [18], a design pattern aimed at platforms that support binding
between data and user interface elements. The neural network was implemented with ML.NET, which
provides .NET developers with analytics and prediction capabilities based on machine learning models.
The neural network was trained on an Nvidia RTX 2070 GPU using CUDA technology [19]. For each set of
images, training the model that produces the highest results took 10 minutes. The software uses a
relational data model, and the developed database contains the following tables: patients, image
information, classification results, physician confirmations, efficiency test results, and
classification quality metrics.
    The developed software allows users to enter and store information about patients, process and
classify patients' X-ray images, check efficiency and recognition accuracy, and collect statistics on
the application.
    The graphical interface is intuitive and adaptive, and contains all the necessary elements in a
single consistent style. An example of the application screen is shown in Figure 9.




Figure 9: Developed software screen example: page of image processing and classification





6. Conclusion
    This work describes the research and development of information technology for the analysis and
classification of X-ray images in order to automatically detect pneumonia. It also provides an example
of decision support software based on the developed information technology, designed to assist the
physician in making decisions and to classify chest X-ray images into those that contain signs of
pneumonia and those that do not. The study consists of several experiments using datasets with
different image class ratios, four neural network architectures (InceptionV3, MobilenetV2,
ResNetV2_50, ResNetV2_101), and image heat maps. An ensemble of neural networks was used, in which
each network specializes in a specific type of image. Another significant advantage was the use of
transfer learning, which accelerated training and improved the final results. During the preparation
of the dataset, the problem of reduced recognition accuracy was solved by using a histogram
equalization method, which corrects the image by equalizing the integral areas of histogram regions of
different brightness. This method was also used in the developed program so that the neural network
receives a standardized image as input. Analysis of the experimental research has indicated that
convolutional neural networks can be successfully used for the diagnosis of pneumonia by processing
X-ray images. The weighted average value of the F1 metric was 97.05%. This result is close to the
recognition rate of a physician, and it is also the highest among the existing solutions considered.
Given the current relevance of the problem, the created software based on the developed information
technology for chest X-ray image analysis can be used to assist in pneumonia diagnosis, reducing the
strain on the radiologist and making it possible to process more X-ray images more efficiently.

7. Acknowledgements
   The work was co-funded by the European Union’s Erasmus+ Programme for Education under KA2
grant (project no. 2020-1-PL01-KA203-082197 “Innovations for Big Data in a Real World”).

8. References
[1] Pneumonia. World Health Organization. URL: https://www.who.int/
[2] A.A. Saraiva, D.B. Santos, N.J. Costa, J.V. Sousa, et al. Models of Learning to Classify X-ray Images
     for the Detection of Pneumonia using Neural Networks. In Proceedings of the 12th International Joint
     Conference on Biomedical Engineering Systems and Technologies - BIOIMAGING,
     ISBN 978-989-758-353-7; ISSN 2184-4305, pages 76-83. DOI: 10.5220/0007346600760083
[3] R.H. Abiyev and M. Ma'aitah. Deep Convolutional Neural Networks for Chest Diseases Detection.
     Journal of healthcare engineering, 2018, 4168538. DOI:10.1155/2018/4168538
[4] J. Wang. Pneumonia Detection: Pushing the Boundaries of Human Ability with Deep Learning.
     Towards Data Science. URL:
     https://towardsdatascience.com/pneumonia-detection-pushing-the-boundaries-of-human-ability-with-deep-learning-ce08dbd0dc20
[5] V. Chouhan, S.K. Singh, A. Khamparia, D. Gupta, et al. A Novel Transfer Learning Based
     Approach for Pneumonia Detection in Chest X-ray Images. Applied Sciences. 2020; 10(2):559.
     DOI: 10.3390/app10020559
[6] E. Ayan and H. M. Ünver. Diagnosis of Pneumonia from Chest X-Ray Images Using Deep
     Learning. 2019 Scientific Meeting on Electrical-Electronics & Biomedical Engineering and
     Computer Science (EBBT), 2019, pp. 1-5, doi: 10.1109/EBBT.2019.8741582
[7] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, & Z. Wojna. Rethinking the Inception Architecture
     for Computer Vision. 2016 IEEE Conference on Computer Vision and Pattern Recognition
     (CVPR), 2016. DOI:10.1109/cvpr.2016.308
[8] V. Martsenyuk. Indirect method of exponential convergence estimation for neural network with
     discrete and distributed delays. Electronic Journal of Differential Equations, 2017, art. no. 246.
[9] K. He, X. Zhang, S. Ren & J. Sun. Deep Residual Learning for Image Recognition. 2016 IEEE
     Conference on Computer Vision and Pattern Recognition (CVPR), 2016. DOI:10.1109/cvpr.2016.90
[10] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov & L.-C. Chen. MobileNetV2: Inverted Residuals
     and Linear Bottlenecks. 2018 IEEE/CVF Conference on Computer Vision and Pattern
     Recognition, 2018. DOI:10.1109/cvpr.2018.00474
[11] G. P. Dimitrov, O. Bychkov, P. Petrova, K. Merkulova, Y. Zhabska, et al. Creation of Biometric
     System of Identification by Facial Image. 2020 3rd International Colloquium on Intelligent Grid
     Metrology (SMAGRIMET), 2020, pp. 29-34. DOI: 10.23919/SMAGRIMET48809.2020.9263995
[12] O. Patel, Y. P. S. Maravi & S. Sharma. A Comparative Study of Histogram Equalization Based
     Image Enhancement Techniques for Brightness Preservation and Contrast Enhancement. Signal &
     Image Processing: An International Journal, 4(5), 11–25, 2013. DOI: 10.5121/sipij.2013.4502
[13] A. Tharwat. Classification assessment methods. Applied Computing and Informatics. August
     2018. DOI:10.1016/j.aci.2018.08.003
[14] Micro- and Macro-average of Precision, Recall and F-Score. Abracadabra. URL:
     https://tomaxent.com/2018/04/27/Micro-and-Macro-average-of-Precision-Recall-and-F-Score/
[15] I. Krak, O. Barmak, P. Radiuk. Information technology for early diagnosis of pneumonia on
     individual radiographs. CEUR Workshop Proceedings, 2753, 2020, pp. 11-21. URL:
     CEUR-WS.org/Vol-2753/paper3.pdf
[16] P. Mooney. Chest X-Ray Images (Pneumonia). Kaggle. URL:
     https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia
[17] O. Bychkov, K. Merkulova and Y. Zhabska. Information Technology of Person’s Identification by
     Photo Portrait. 2020 IEEE 15th International Conference on Advanced Trends in Radioelectronics,
     Telecommunications and Computer Engineering (TCSET), 2020, pp. 786-790.
     DOI: 10.1109/TCSET49122.2020.235542
[18] J. Kouraklis. MVVM as Design Pattern. In: MVVM in Delphi, 2016. DOI: 10.1007/978-1-4842-2214-0_1
[19] CUDA Toolkit. NVIDIA Developer. URL: https://developer.nvidia.com/cuda-toolkit

