Evaluation of the accuracy of the neural network algorithm for object recognition in security systems (short paper). CEUR Workshop Proceedings, Vol-3826. https://ceur-ws.org/Vol-3826/short2.pdf
                                Evaluation of the accuracy of the neural network
                                algorithm for object recognition in security systems ⋆
                                Andrii Sahun1,†, Vladyslav Khaidurov2,† and Valerii Lakhno1,*,†
                                1
                                 National University of Life and Environmental Sciences of Ukraine, 15 Heroyiv Oborony str., 03041 Kyiv, Ukraine
                                2
                                 National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”, 37 Beresteyskiy ave., 03056 Kyiv,
                                Ukraine



Abstract
The study presents the results of applying the main known metrics used to evaluate the performance and accuracy of algorithms and neural network models on different classes for the task of graphic content recognition in security systems. For the analysis, different classes of images processed by the neural network algorithm were compared. To evaluate the quality of the algorithm's training based on the results of graphical pattern recognition, nine different metrics were used across five computational experiments on correct classification. The CamVid benchmark video dataset used in the research for training the neural network model shows different training results for different recognition classes, with this indicator ranging from 38.15% to 97.07% when using the VGG-16 function. At the same time, the highest standard deviation of accuracy, 0.030351419, was recorded for the "Pavement" class. This indicates the imperfection of the CamVid training dataset; it should be modified to improve recognition quality by increasing the size and number of test images.

Keywords
distance metrics, neural network, classifier, algorithm's quality evaluation, image recognition



1. Introduction

Machine learning and neural networks are closely related, as neural networks are one of the primary technologies in the field of machine learning [1–3]. These algorithms are particularly widely used in security systems. In machine learning, several key metrics are used to evaluate model performance. These metrics help to understand how well the model performs the given task and to identify areas where it can be improved. There are several metrics for evaluating different neural network algorithms [4]. All of them are used to analyze the recognition of various properties and characteristics of neural network recognition algorithms [5]. These are useful for creating an optimal model of a graphic information recognition system. The most important ones are the metrics for evaluating the quality of learning [6].

Therefore, it is of particular interest to understand whether there is a correlation between the weight coefficient of the presence of a particular classification object in graphic object recognition and the accuracy of such recognition. For example, the works [7–10] consider the use of Distance metrics, while the research [2] uses the Euclidean Distance. However, the formulation of that task differs from the identification of graphical objects. At the same time, [2] emphasizes that the accuracy of identification (recognition) reached a maximum value of 96.38%. In another study related to practical tasks of recognition and identification of graphical images, the average recognition (identification) accuracy is reported at 76.78% [11].

Therefore, it is important to assess how accurately graphical patterns are recognized in a specific practical task [5]. The same systems are used in specific tasks, such as security systems. In particular, the corresponding modules are part of intelligent access control systems [12].

2. Main part

Image identification and recognition modules are now mostly part of more complex practical application systems, known as Image Identification and Recognition Systems (IIRS). IIRS are often used both for detecting defects on parts within quality control systems according to ISO 9000 standards and for detecting and recognizing the values of vehicle license plates. Based on the results of the IIRS module, the intelligent system can automatically make decisions about granting or denying access to a secured area for a specific object. Another application of such systems is machine vision. The common principle of construction for all such systems is:

    1)  The technical part of acquiring and initial processing of the image.



CPITS-II 2024: Workshop on Cybersecurity Providing in Information and Telecommunication Systems II, October 26, 2024, Kyiv, Ukraine
∗ Corresponding author.
† These authors contributed equally.
avd29@ukr.net (A. Sahun); allif0111@gmail.com (V. Khaidurov); lva964@nubip.edu.ua (V. Lakhno)
0000-0002-5151-9203 (A. Sahun); 0000-0002-4805-8880 (V. Khaidurov); 0000-0001-9695-4543 (V. Lakhno)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073
    2)  The technical or software part for analyzing and classifying image elements.
    3)  The subsystem for registration/identification and summarization of recognition data.

In all similar IIRS systems, an intelligent module with a neural network-based algorithm plays a central role. The accuracy of this module determines the overall performance of the entire system. For those practical tasks where IIRS is now mostly used, a mathematical apparatus based on neural networks with different types of training is applied [13–17]. The choice of the type of neural network training model is not the subject of this study; the aspects related to this choice are described, in particular, in [2, 11, 13–17].

The test model chosen is the neural network model described in [2]. This model has several layers of neurons (Fig. 1).
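The three-part construction principle listed above can be sketched as a minimal processing pipeline. This is an illustrative skeleton only, not the authors' implementation: the class name, stage callables, and toy classification logic are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class IIRSPipeline:
    """Skeleton of the three common IIRS stages described above."""
    acquire: Callable[[], list]          # 1) acquisition and initial processing of the image
    classify: Callable[[list], dict]     # 2) analysis and classification of image elements
    log: List[dict] = field(default_factory=list)  # 3) registration/summarization of results

    def run(self) -> dict:
        frame = self.acquire()           # stage 1
        result = self.classify(frame)    # stage 2
        self.log.append(result)          # stage 3: registration of recognition data
        return result

# Toy usage with stub stages (hypothetical logic, for illustration only):
pipeline = IIRSPipeline(
    acquire=lambda: [0.1, 0.9],                              # stand-in for a preprocessed frame
    classify=lambda f: {"class": "Car" if f[1] > 0.5 else "Road"},
)
decision = pipeline.run()
```

In a real IIRS the `classify` stage would be the neural network module; the point of the sketch is only that the module's accuracy propagates directly to the final access decision.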




Figure 1: Layered neural network architecture of the IIRS model with Haar feature

Given the practice of using neural network-based algorithms in recognition and identification systems, a deep-learning neural network model was chosen. This is due to several existing advantages of such models for graphic identification/recognition tasks [1, 3].

The main goal of the study is to evaluate the accuracy of a neural network algorithm in the task of recognizing graphic content.

The neural network diagram of the IIRS shown in Fig. 1 operates with the Haar feature. This approach is most effective when using a deep-learning neural network.

In the basic model described in [2], the input layer of neurons receives initial data, such as the intensity of each pixel and Haar features for the various graphical objects to be identified (bushes, trees, cars, roads, sky, sidewalk elements, fences, pedestrians, etc.).

3. Applying distance metrics for neural networks

In the MATLAB environment, there is a built-in function vgg16() which implements the architecture of a deep neural network. There is also a function analogous to it, vgg19(). The first function operates with 16 convolutional and fully connected layers of neurons, including 13 convolutional and 3 fully connected layers. This function is used for image classification in the process of pattern recognition. The vgg16() function in MATLAB returns a neural network object but does not contain a specific method for computing distances (metrics) between feature vectors for processed images.

The vgg19() function also implements the architecture of a deep neural network and has an input size of 224×224×3. Unlike vgg16, the neural network in vgg19 is trained and fine-tuned on a dataset of graphical data containing over 1,000,000 images and 1000 classes. This allows this neural network to have more powerful capabilities for feature extraction in images. To define metrics based on VGG19 in MATLAB, we first need to load and prepare the VGG19 model and extract image features from a specific layer of the neural network. After this, both the vgg16 and vgg19 functions must use different metrics to compare these features. That is, neither function has built-in distance metric determination.

To use distance metrics with feature vectors extracted from the VGG16 model in MATLAB, we have to follow these steps:

    1)  Loading and preparing the VGG16 model (use the pre-trained VGG16 model to extract feature vectors from images).


    2)  Extracting feature vectors (feed your images through the VGG16 model to get the feature vectors).
    3)  Computing distance metrics (use different distance metrics to compare the feature vectors).

Below are the main known metrics used to evaluate the performance of algorithms and neural network models on different classes of graphic content recognition. These metrics are used in machine learning [2].

Accuracy metric in machine learning. Accuracy shows the proportion of correctly classified objects among all objects. This metric is well suited for tasks where classes are balanced. The expression below provides an example of obtaining the accuracy metric in machine learning algorithms [18]:

    Accuracy = (TP + TN) / (TP + TN + FP + FN),    (1)

where TP (True Positive) is the number of correct positive classifications, TN (True Negative) is the number of correct negative classifications, FP (False Positive) is the number of incorrect positive classifications, and FN (False Negative) is the number of incorrect negative classifications.

Precision metric in machine learning. Precision measures the proportion of correctly classified positive objects among all objects classified as positive. This metric is important when the cost of false positive results is high. In (2) we present the expression for computing the precision metric in machine learning:

    Precision = TP / (TP + FP).    (2)

Recall metric in machine learning. Recall measures the proportion of correctly classified positive objects among all actual positive objects. This metric is important when the cost of false negative results is high. The following expression (3) is used to compute this metric:

    Recall = TP / (TP + FN).    (3)

The F1-score metric. The F1-score is the harmonic mean of precision and recall. It is useful when balancing these two metrics is necessary. It is calculated according to the expression provided below:

    F1-score = 2 · Precision · Recall / (Precision + Recall).    (4)

Intersection over Union metric. IoU is used to evaluate the quality of segmentation and object detection by measuring the ratio of the intersection area of predicted and ground truth objects to their union area. It is calculated according to the expression provided below:

    IoU = Area of Intersection / Area of Union.    (5)

Mean Average Precision metric. Average precision is calculated for each category and then averaged across all categories. This metric is often used for object detection tasks. Such a metric is particularly relevant for evaluating the training quality of this neural network-based model. The metric value can be determined using the expression provided below:

    mAP = (1/N) Σ_i AP_i,    (6)

where N is the number of categories.

Confusion matrix. This matrix shows the number of correct and incorrect classifications for each class. It includes TP, FP, TN, and FN for each category.

Area under the ROC curve. The ROC curve shows the relationship between TPR and FPR at different thresholds. The area under the curve (AUC) measures the model's ability to distinguish between classes (7).

    AUC = ∫ TPR(t) dFPR(t).    (7)

False Positive Rate (FPR). The FPR measures the proportion of false positive results among all negative examples during training.

    FPR = FP / (FP + TN).    (8)

False Negative Rate (FNR). The FNR measures the proportion of false negative results among all positive examples during training.

    FNR = FN / (FN + TP).    (9)

The above-mentioned metrics help objectively assess the quality and effectiveness of the model for identifying graphical objects in a video surveillance system based on neural networks, as well as choose the most efficient algorithm for specific conditions and tasks.

In this research, all the evaluation metrics (1)–(9) listed above were used to assess the quality of model training.

Table 1 shows the quality metric values of the algorithm training obtained in 5 computational experiments (calculated results of correct classification of objects for all classes).

Table 1
The quality metric values of the algorithm training obtained in 5 computational experiments

Class name   Exp #1   Exp #2   Exp #3   Exp #4   Exp #5
Sky          0.9266   0.9320   0.9479   0.9348   0.9818
Building     0.7987   0.8647   0.9181   0.9126   0.8786
Pole         0.8698   0.9397   0.9483   0.9455   0.9541
Road         0.9518   0.9867   0.9551   0.9749   0.9848
Pavement     0.4188   0.4468   0.6394   0.5463   0.5070
Tree         0.4342   0.4347   0.4896   0.4549   0.4465
SignSymb     0.3251   0.3264   0.4621   0.3698   0.4243
Fence        0.4921   0.5825   0.6245   0.5978   0.6582
Car          0.8988   0.9218   0.9542   0.9594   0.9732
Pedestr      0.758    0.8281   0.9104   0.8328   0.8972
Bicyclist    0.8145   0.8172   0.9492   0.8576   0.8207

Fig. 2 shows the weight coefficient indicators of object recognition for the test video data segment.
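Steps 1)–3) above come down to comparing extracted feature vectors with a distance metric. The paper works with MATLAB's vgg16()/vgg19(); as a language-neutral sketch of step 3, the two most common distance computations can be written as below. The fixed vectors are placeholders standing in for real VGG activations (in MATLAB these would come from the network's fully connected layers), and the function names are ours, not part of any toolbox.

```python
import numpy as np

def euclidean_distance(u: np.ndarray, v: np.ndarray) -> float:
    """L2 distance between two feature vectors; sensitive to magnitude."""
    return float(np.linalg.norm(u - v))

def cosine_distance(u: np.ndarray, v: np.ndarray) -> float:
    """1 minus cosine similarity; 0 means the vectors point the same way."""
    return 1.0 - float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Placeholder vectors standing in for VGG16 feature activations of two images:
f1 = np.array([0.2, 0.8, 0.1, 0.5])
f2 = np.array([0.4, 1.6, 0.2, 1.0])   # the same direction as f1, scaled by 2

d_euc = euclidean_distance(f1, f2)    # nonzero: magnitudes differ
d_cos = cosine_distance(f1, f2)       # ~0: directions coincide
```

The contrast between the two results illustrates why the choice of metric matters when comparing features of processed images: cosine distance ignores overall activation scale, while Euclidean distance does not.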
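Expressions (1)–(5), (8), and (9) can be computed directly from the confusion counts. The following sketch mirrors those expressions; the helper names are ours and the example counts are arbitrary illustrative values, not data from this study.

```python
def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Implements expressions (1)-(4), (8), (9) from the confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)          # (1)
    precision = tp / (tp + fp)                          # (2)
    recall = tp / (tp + fn)                             # (3)
    f1 = 2 * precision * recall / (precision + recall)  # (4)
    fpr = fp / (fp + tn)                                # (8)
    fnr = fn / (fn + tp)                                # (9)
    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "f1": f1, "fpr": fpr, "fnr": fnr}

def iou(pred: set, truth: set) -> float:
    """Expression (5), for sets of predicted and ground-truth pixel indices."""
    return len(pred & truth) / len(pred | truth)

# Illustrative confusion counts (not taken from the paper's experiments):
m = classification_metrics(tp=8, tn=5, fp=2, fn=5)
# accuracy = 13/20, precision = 0.8, recall = 8/13, f1 = 16/23
```

Note that F1 can equivalently be written as 2·TP / (2·TP + FP + FN), which is why the example yields exactly 16/23.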
Figure 2: Weight coefficient indicators for each of the recognition classes

The significance of the Intersection over Union (IoU) metric, calculated for each of the semantic classes, lies in its ability to measure the accuracy of the neural network's recognition performance. IoU assesses how well the predicted segmentation overlaps with the ground truth segmentation for each class. Higher IoU values indicate better performance, meaning the predicted areas closely match the actual areas. This metric is crucial for evaluating the effectiveness and reliability of the neural network in accurately recognizing and segmenting different semantic classes within the graphical content. Fig. 3 shows the values of the IoU accuracy evaluation metric.

Figure 3: Intersection over Union metric score calculated for each of the semantic classes

As can be understood from the above, the most important and resultant indicator of model training quality is the IoU (Intersection over Union) metric. The results of the correct classification of objects for each class in the 5 conducted computational experiments are presented for the different detection classes in Figs. 4–6.

Considering that the model was trained on 421 images, its training level may be considered sufficient for the graphical identification task at hand. However, the training quality, even for the same semantic classes, varies significantly across the 5 experiments. The smallest such deviation is for objects of the "Bicyclist" class, at 0.76%, and the largest is for objects of the "Fence" class, at 25.25%. Such a difference can be explained by various reasons, for example, the imperfection of the algorithm or the insufficient quality or size of the training data sample.

Figure 4: The result of the correct classification of objects for classes "Sky", "Building", "Pole", and "Road" in the 5 conducted computational experiments

Figure 5: The result of the correct classification of objects for classes "Pavement", "Tree", "SignSymbol", and "Fence" in the 5 conducted computational experiments

Figure 6: The result of the correct classification of objects for classes "Car", "Pedestrian", and "Bicyclist" in the 5 conducted computational experiments

As shown by the calculations in Table 1, the most accurate results of the neural network learning algorithm were obtained for the classes "Road" (97.06%), "Sky" (94.46%), and "Car" (94.16% accuracy of correct recognition). At the same time, the recognition quality for images of the "SignSymbol" type was 38.15%, and "Tree" had 45.19% accuracy of correct recognition. The average learning quality of this algorithm on the test fragments was 75.42%.




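The per-class spread discussed above can be reproduced directly from Table 1. The snippet below (our own illustration, using Python's statistics module rather than the authors' tooling) computes the population standard deviation of each class's five accuracy values and confirms that "Pavement" shows the largest spread across the experiments.

```python
from statistics import pstdev

# Per-class accuracies from Table 1 (five computational experiments)
table1 = {
    "Sky":       [0.9266, 0.9320, 0.9479, 0.9348, 0.9818],
    "Building":  [0.7987, 0.8647, 0.9181, 0.9126, 0.8786],
    "Pole":      [0.8698, 0.9397, 0.9483, 0.9455, 0.9541],
    "Road":      [0.9518, 0.9867, 0.9551, 0.9749, 0.9848],
    "Pavement":  [0.4188, 0.4468, 0.6394, 0.5463, 0.5070],
    "Tree":      [0.4342, 0.4347, 0.4896, 0.4549, 0.4465],
    "SignSymb":  [0.3251, 0.3264, 0.4621, 0.3698, 0.4243],
    "Fence":     [0.4921, 0.5825, 0.6245, 0.5978, 0.6582],
    "Car":       [0.8988, 0.9218, 0.9542, 0.9594, 0.9732],
    "Pedestr":   [0.758,  0.8281, 0.9104, 0.8328, 0.8972],
    "Bicyclist": [0.8145, 0.8172, 0.9492, 0.8576, 0.8207],
}

# Population standard deviation of the five runs for each class
spread = {cls: pstdev(vals) for cls, vals in table1.items()}
widest = max(spread, key=spread.get)   # class with the largest deviation
```

Running this shows "Pavement" as the class with the widest per-experiment variation, consistent with the observation above and with the conclusions that follow.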
4. Conclusions

Analyzing the data presented and visualized in Table 1 and Figs. 4–6, it can be said that the quality of the learning algorithm described in [2] significantly depends on the accuracy of the training. The accuracy of image recognition in neural network-based algorithms is highly dependent on the quality of training. The key factors in this dependence are: training data quality; training data quantity; preprocessing; algorithm complexity; and the training process. The CamVid benchmark video dataset used in the study [2] for training the neural network model shows different training results for different recognition classes: this indicator ranges from 38.15% to 97.07% when using the VGG-16 function. It can be noted that all the provided training quality metrics yield approximately the same accuracy values on the same recognition classes, while the standard deviation indicator is highest only for the "Pavement" class, amounting to 0.030351419.

The obtained average recognition accuracy of graphical objects, 75.42%, falls noticeably short of the reference recognition rate of 98.7%. This indicates insufficient training quality due to the shortcomings of the training dataset.

It can be assumed that the simplest way to improve recognition accuracy could be to use a more complex neural network algorithm; one such algorithm available in MATLAB is VGG-19 [19–21]. Also, to improve the quality of graphic content recognition, it is necessary to use another, higher-quality training dataset that contains a larger number of relevant sets of graphic data. An improved CamVid benchmark video dataset could also be created. As is known, benchmark video dataset improvement can significantly enhance the performance of a deep learning neural network algorithm [22, 23].

References

[1] F. Ahmad, T. Ahmad, Image Mining Based on Deep Belief Neural Network and Feature Matching Approach Using Manhattan Distance, Comput. Assisted Methods Eng. Sci. 28(2) (2021) 139–167. doi: 10.24423/cames.323.
[2] A. Sahun, V. Khaidurov, V. Bobkov, Model of Graphic Object Identification in a Video Surveillance System based on a Neural Network, in: Cybersecurity Providing in Information and Telecommunication Systems, vol. 3654 (2024) 361–367.
[3] J.-H. Lee, Minimum Euclidean Distance Evaluation using Deep Neural Networks, AEU – Int. J. Electron. Commun. 112 (2019) 152964. doi: 10.1016/j.aeue.2019.152964.
[4] K. Khorolska, et al., Application of a Convolutional Neural Network with a Module of Elementary Graphic Primitive Classifiers in the Problems of Recognition of Drawing Documentation and Transformation of 2D to 3D Models, J. Theor. Appl. Inf. Technol. 100(24) (2022) 7426–7437.
[5] V. Dudykevych, et al., Detecting Deepfake Modifications of Biometric Images using Neural Networks, in: Workshop on Cybersecurity Providing in Information and Telecommunication Systems, CPITS, vol. 3654 (2024) 391–397.
[6] H. Mohammad, M. N. Sulaiman, A Review on Evaluation Metrics for Data Classification Evaluations, Int. J. Data Mining Knowledge Manag. Process 5 (2015) 01–11. doi: 10.5121/ijdkp.2015.5201.
[7] O. V. Herasina, V. I. Korniienko, Global and Local Optimization Algorithms in the Problem of Identification of Complex Dynamic Systems, Inf. Process. Syst. (6) (2010) 73–77.
[8] V. Lakhno, et al., Information Security Audit Method Based on the Use of a Neuro-Fuzzy System, LNNS 232 (2021) 171–184.
[9] H. G. Schuster, Deterministic Chaos: Introduction and Recent Results, Springer, Berlin, Heidelberg (1992). doi: 10.1007/978-3-642-95650-8_2.
[10] A. Pérez-Romero, et al., Evaluation of Artificial Intelligence-based Models for Classifying Defective Photovoltaic Cells, Appl. Sci. 11 (2021) 4226. doi: 10.3390/app11094226.
[11] L. Ljung, et al., Deep Learning and System Identification, IFAC-PapersOnLine 53(2) (2020) 1175–1181. doi: 10.1016/j.ifacol.2020.12.1329.
[12] P. Anakhov, et al., Evaluation Method of the Physical Compatibility of Equipment in a Hybrid Information Transmission Network, J. Theor. Appl. Inf. Technol. 100(22) (2022) 6635–6644.
[13] V. Lakhno, et al., Development Strategy Model of the Informational Management Logistic System of a Commercial Enterprise by Neural Network Apparatus, in: Cybersecurity Providing in Information and Telecommunication Systems, vol. 2746 (2020) 87–98.
[14] I. Goodfellow, Y. Bengio, A. Courville, Deep Learning, The MIT Press (2016).
[15] D. F. Kandamali, et al., Machine Learning Methods for Identification and Classification of Events in ϕ-OTDR Systems: A Review, Applied Optics 61(11) (2022) 2975. doi: 10.1364/ao.444811.
[16] S. Bickler, Machine Learning Identification and Classification of Historic Ceramics, Archaeology in New Zealand 61 (2018) 48–58.
[17] O. Rainio, J. Teuho, R. Klén, Evaluation Metrics and Statistical Tests for Machine Learning, Sci. Rep. 14 (2024) 6086. doi: 10.1038/s41598-024-56706-x.
[18] S. Orozco-Arias, et al., Measuring Performance Metrics of Machine Learning Algorithms for Detecting and Classifying Transposable Elements, Processes 8 (2020) 1–19. doi: 10.3390/pr8060638.
[19] A. V. Ikechukwu, et al., ResNet-50 vs VGG-19 vs Training from Scratch: A Comparative Analysis of the Segmentation and Classification of Pneumonia from Chest X-ray Images, Global Transitions Proceedings 2(2) (2021) 375–381. doi: 10.1016/j.gltp.2021.08.027.
[20] A. Saleh, R. Sukaik, S. Abu-Naser, Brain Tumor Classification Using Deep Learning (2020) 131–136. doi: 10.1109/iCareTech49914.2020.00032.
[21] L. Ali, et al., Performance Evaluation of Deep CNN-Based Crack Detection and Localization Techniques for Concrete Structures, Sensors 21 (2021) 1688. doi: 10.3390/s21051688.
[22] S. Richter, et al., Playing for Data: Ground Truth from Computer Games, LNCS 9906 (2016). doi: 10.1007/978-3-319-46475-6_7.
[23] P. Ravishankar, A. Lopez, G. M. Sanchez, Unstructured Road Segmentation using Hypercolumn based Random Forests of Local Experts (2022). doi: 10.48550/arXiv.2207.11523.



