Methods of Face Recognition in Video Sequences
and Performance Studies

Mariia Nazarkevych¹, Vitaly Lutsyshyn¹, Hanna Nazarkevych², Liubomyr Parkhuts², and Maryna Kostiak²

¹ Lviv Ivan Franko National University, 1 Universytetska str., Lviv, 79000, Ukraine
² Lviv Polytechnic National University, 12 Stepan Bandera str., Lviv, 79013, Ukraine

CPITS 2023: Workshop on Cybersecurity Providing in Information and Telecommunication Systems, February 28, 2023, Kyiv, Ukraine
EMAIL: mariia.a.nazarkevych@lpnu.ua (M. Nazarkevych); vitalylutsyshyn@gmail.com (V. Lutsyshyn); hanna.ya.nazarkevych@lpnu.ua (H. Nazarkevych); liubomyr.t.parkhuts@lpnu.ua (L. Parkhuts); Kostiak.maryna@lpnu.ua (M. Kostiak)
ORCID: 0000-0002-6528-9867 (M. Nazarkevych); 0009-0008-1229-6706 (V. Lutsyshyn); 0000-0002-1413-630X (H. Nazarkevych); 0000-0003-4759-9383 (L. Parkhuts); 0000-0002-6667-7693 (M. Kostiak)
© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org)

Abstract
A method of capturing a person's face in a video stream has been developed. The developed methods of capturing the video stream are considered, along with the tracking methods used in video surveillance. Methods of video stream capture, image frame extraction, and face recognition are reviewed, including the method of flexible comparison on graphs, the principal component method, the Viola-Jones method, local binary patterns, and hidden Markov models. The DeepFace Python library was studied, and face recognition experiments were conducted. Faces photographed in the genres of selfie, portrait, and documentary photography were recognized. It was found that the best recognition results are obtained for portrait photography, the results for selfies are somewhat worse, and the worst results are for documentary photography. Recognition was based on the MediaPipe Face Detection library. The recognition time was from 10 to 22 ms.

Keywords
Face recognition, object tracking, machine learning

1. Introduction

Tracking objects in surveillance camera footage is a challenging task, and it is even more difficult to track objects across video sequences in order to improve their recognition. Many object-tracking methods exist, but all have some drawbacks. Some of the existing object-tracking models are region-based contour models [1]. Tracking follows an object through a video sequence, while detection locates an object within a frame. In tracking-by-detection, trackers first run a detector on each frame, and the tracking algorithm then associates these detections to determine the movement of individual objects and assign them unique identification numbers [2].
Tracking objects is a complex problem. Difficulties can arise from abrupt object movement, changing appearance patterns of both the object and the scene, non-rigid object structures, object-object and object-scene occlusions, and camera movement. Tracking is usually performed in the context of higher-level applications that require the location and/or shape of an object in each frame. Typically, assumptions are made to limit the tracking problem in the context of a particular application. In this review, we classify tracking methods based on the object and motion representations used. Object tracking consists of using appropriate image features, selecting motion models, and detecting objects [3]:
• Target representation.
• Object localization.
Difficulties arise when objects move fast compared to the frame rate or when the tracked object changes direction over time [4–6]. The sequential flow of object detection, object tracking, object identification, and object behavior analysis completes the tracking process [7]. Video processing consists of the following steps: video upload [8], preprocessing, a proposed
algorithm that includes video processing, and then the object capture step (Fig. 1).

[Figure 1 shows a flowchart: Video frames → Preprocessing → Proposed algorithm → Moving object detection and tracking]
Figure 1: Scheme of video processing and object outline selection
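As a minimal illustration of this pipeline, the following sketch (our illustration, not code from the paper; the file name and preprocessing choices are assumptions) reads a video with OpenCV, preprocesses each frame, and leaves a slot for the detection and tracking step:

import cv2

# Sketch of the Fig. 1 pipeline: video upload -> preprocessing -> algorithm.
cap = cv2.VideoCapture("input.mp4")                  # video upload (placeholder path)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:                                       # end of stream
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)   # preprocessing
    gray = cv2.equalizeHist(gray)                    # normalize illumination
    # ... proposed algorithm: moving object detection and tracking ...
cap.release()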

2. Object Recognition

The capture and encoding of digital images have led to the creation and rapid dissemination of a huge amount of visual information. Hence, efficient tools for searching and retrieving visual information are essential. Although there are effective search engines for text documents today, there are no satisfactory systems for retrieving visual information.
Due to the growth of visual data both online and offline [9] and the phenomenal success of web search, expectations for image and video search technologies are increasing.
However, with the evolution of video cameras that can record at high frame rates in good quality, and with advances in detection, such as new approaches based on Convolutional Neural Networks (CNNs), the basis for tracking-by-detection trackers [10] has become more robust. The requirements for a tracker in a tracking system have changed dramatically, allowing much simpler tracking algorithms to compete with more complex systems that require significant computational costs.
Let us analyze three ranking algorithms that take into account the spatial, temporal, and spatiotemporal properties of geo-referenced video clips.
Object detection requires training machine learning models, such as Recurrent Neural Networks (RNNs) and CNNs, on images where objects have been manually annotated and associated with a high-level concept.
A video stream is supplied in which a face needs to be recognized [11]. We determine the coordinates and size of the face. The face contour is aligned and the basic parameters are determined (Fig. 2). As a result, a parametric vector is built. The parameters are then compared, and recognition is performed.

[Figure 2 shows a flowchart: video stream → Facial recognition → Face contour alignment → Face after alignment → Determination of basic parameters → Comparison of parameters. Result]
Figure 2: Face recognition algorithm
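Since the paper studies the DeepFace Python library, the comparison step of Fig. 2 can be sketched as follows (a hedged illustration, not the authors' implementation; the image paths are placeholders, and DeepFace selects its default embedding model):

from deepface import DeepFace

# Build parametric vectors (embeddings) for two face images and compare them.
result = DeepFace.verify(img1_path="frame_face.jpg",   # captured frame (placeholder)
                         img2_path="reference.jpg")    # reference image (placeholder)
print(result["distance"])   # distance between the two parametric vectors
print(result["verified"])   # True if the faces are judged to be the same person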
Object detection in offline video. This approach estimates the behavior of perceived objects and works best as a complement to other offline video-based object detection systems [12]. In recent years, various other video object detection systems have emerged that try to use 3D convolutional networks to analyze many images simultaneously.
Knowledge-based methods use information about the face: its features, shape, texture, or skin color. In these methods, a certain set of rules is distinguished that a frame fragment must meet to be considered a human face. It is quite easy to define such a set of rules (Fig. 3). All rules are formalized knowledge that a person uses to determine whether a face is a face or not.
For example, the basic rules are: the areas of the eyes, nose, and mouth differ in brightness from the rest of the face, and the eyes on the face are always symmetrically positioned relative to each other. Based on these and other similar properties, algorithms are built that check whether these rules are fulfilled in the image during execution.
The same group of methods includes a more general approach, the pattern-matching method. In this method, a face standard (template) is determined by describing the properties of individual face areas and their specified relative positions, and the input image is subsequently compared with this template.

[Figure 3 shows a classification tree: Computing methodology → Artificial intelligence, Computer graphics → Computer vision, Image manipulation → Object recognition → Detection → Face detection]
Figure 3: Classification of face detection

Face detection using such methods is performed [13] by searching all rectangular fragments of the image and determining which class each fragment belongs to.
Viola-Jones object detection [14]. The method was proposed by Paul Viola and Michael Jones and became the first method to demonstrate high results in real-time image processing. It has many implementations, including as part of the OpenCV computer vision library (the cvHaarDetectObjects function). The advantages of this method are high speed (due to the use of a cascade classifier) and high accuracy in detecting faces turned at an angle of up to 30 degrees. The disadvantage is a long training time: the algorithm needs to analyze a large number of test images.
The method of comparison on graphs (elastic graph matching) [15]. This method is related to 2D modeling. Its essence lies in the comparison of graphs describing faces: a face is represented as a grid with an individual location of vertices and edges. Faces are represented as graphs with weighted vertices and edges. At the recognition stage, one of the graphs, the reference graph, remains unchanged, while the other is deformed to best match the first one. In such recognition systems, graphs can have a rectangular lattice or a structure formed by characteristic (anthropometric) points of faces.
Graph edges are weighted by the distances [16] between adjacent vertices. The difference (distance, discriminative characteristic) between two graphs is calculated using a deformation cost function that takes into account both the difference between the feature values calculated at the vertices and the degree of deformation of the graph edges.
The graph is deformed by shifting each of its vertices by a certain distance in certain directions relative to its original location and choosing the position at which the difference between the feature values at the vertex of the deformed graph and the corresponding vertex of the reference graph is minimal. This operation is performed in turn for all graph vertices until the smallest total difference between the features of the deformed and reference graphs is achieved. The value of the deformation cost function at this position of the graph is the measure of the difference between the input face image and the reference graph. This "relaxation" deformation procedure is performed for all reference faces in the system database. The recognition result is the reference with the best value of the deformation cost function.
The disadvantages of the method include the complexity of the recognition algorithm and the complicated procedure for entering new templates into the database.
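To make the "relaxation" procedure concrete, here is a hypothetical sketch (the patch features, cost function, and greedy update rule are all illustrative assumptions, not the method's reference implementation):

import numpy as np

def vertex_feature(img, p, r=4):
    """Feature at vertex p = (y, x): mean and std of the surrounding patch."""
    y, x = int(p[0]), int(p[1])
    patch = img[max(y - r, 0):y + r + 1, max(x - r, 0):x + r + 1]
    return np.array([patch.mean(), patch.std()])

def deformation_cost(img, pos, ref_feats, ref_edges, lam=0.01):
    """Feature mismatch at the vertices plus a penalty for stretched edges."""
    cost = sum(np.linalg.norm(vertex_feature(img, p) - f)
               for p, f in zip(pos, ref_feats))
    for (u, v), d_ref in ref_edges.items():
        cost += lam * (np.linalg.norm(pos[u] - pos[v]) - d_ref) ** 2
    return cost

def relax(img, pos, ref_feats, ref_edges, step=1.0, iters=5):
    """Shift each vertex in turn to its locally best position (greedy descent)."""
    shifts = [np.array(s) for s in
              [(0.0, 0.0), (step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)]]
    for _ in range(iters):
        for i in range(len(pos)):
            def cost_of(s, i=i):
                trial = [p + s if j == i else p for j, p in enumerate(pos)]
                return deformation_cost(img, trial, ref_feats, ref_edges)
            pos[i] = pos[i] + min(shifts, key=cost_of)
    return deformation_cost(img, pos, ref_feats, ref_edges)

The input face would then be assigned to the reference graph that yields the lowest relaxed cost.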
The best results in face recognition have been shown by Convolutional Neural Networks (CNNs). Their success is due to the ability to capture the two-dimensional topology of the image, unlike the multilayer perceptron.
The distinctive features of a CNN are local receptive fields (providing local two-dimensional connectivity of neurons), shared weights (providing detection of certain features anywhere in the image), and hierarchical organization with spatial subsampling. Thanks to these properties, a CNN provides partial resistance to scale changes, shifts, rotations, changes in angle, and other distortions.
A CNN was used in DeepFace, which was acquired by Facebook to recognize the faces of its social network users.
The geometric face recognition method [17] is one of the first face recognition methods. Methods of this type involve the selection of a set of key points (or areas) of the face and the subsequent formation of a set of features.
The key points can include the corners of the eyes, the lips, the tip of the nose, the centers of the eyes, etc. The advantages of this method include the use of inexpensive equipment. The disadvantages are low statistical reliability, high lighting requirements, the need for a frontal image of the face with only small deviations, and the failure to account for possible changes in facial expression.
The method of flexible comparison on graphs [18] compares graphs describing the image of a person's face. Some publications report 95–97% recognition efficiency even in the presence of different emotional expressions and changes of up to 15 degrees in the angle at which the face image is formed. However, it takes about 25 seconds to compare an input face image with 87 reference images. Another disadvantage of this approach is the laborious memorization of new references, which generally leads to a non-linear dependence of the operating time on the size of the face database. The main advantage is low sensitivity to face illumination and changes in face angle, but this approach itself has lower recognition accuracy than methods built on neural networks.
The Principal Component Method (PCM) [19] reduces the recognition or classification process to the construction of a certain number of principal components of images for an input image. However, in cases where there are significant changes in illumination or facial expression in the face image, the effectiveness of the method is significantly reduced.
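An eigenfaces-style sketch of the principal component method might look as follows (illustrative only; random arrays stand in for real flattened face images, and the component count is an assumption):

import numpy as np
from sklearn.decomposition import PCA

train = np.random.rand(50, 64 * 64)        # placeholder: 50 flattened 64x64 faces
labels = np.arange(50)                     # one identity per training face
pca = PCA(n_components=20).fit(train)      # build the principal components
train_proj = pca.transform(train)          # project the gallery into face space

probe = np.random.rand(1, 64 * 64)         # placeholder input image
probe_proj = pca.transform(probe)
dists = np.linalg.norm(train_proj - probe_proj, axis=1)
print("recognized as identity", labels[np.argmin(dists)])   # nearest neighbor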
The Viola-Jones method [14] detects objects in images in real time. The method works well when observing an object at a small angle, up to about 30°. The recognition accuracy of this method partially reaches over 90%, which is a good result. However, at a deviation angle of more than 30°, the recognition probability drops sharply, which makes it impossible to detect a face at an arbitrary angle.
The use of neural networks. One of the best results in face recognition is achieved by using CNNs, which are a logical development of architectures such as the cognitron and neocognitron. The success is due to the ability to take into account the two-dimensional topology of the image, unlike the multilayer perceptron. Thanks to these innovations, the ANN provides partial resistance to scale changes, shifts, rotations, changes in perspective, and other distortions. Testing of the ANN on the ORL database, which contains images of faces with slight changes in lighting, scale, spatial rotation, position, and emotion, showed 96% recognition accuracy. The disadvantage of methods based on neural networks is that adding a new reference face to the database requires complete retraining of the network on the entire available set, which is a rather lengthy procedure that, depending on the size of the sample, requires hours or even several days of work.
Local Binary Patterns (LBPs) [15] were first proposed in 1996 to analyze the texture of halftone images. Studies have shown that LBPs are invariant to small changes in lighting conditions and small image rotations. LBP-based methods work well on images of faces with different facial expressions, different lighting, and head turns. Among the disadvantages is the need for high-quality image preprocessing, since LBPs are highly sensitive to noise and the number of false binary codes increases in its presence.
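A minimal sketch of the basic 3×3 LBP operator described above (illustrative; the neighbor ordering is an assumption): each pixel is replaced by an 8-bit code marking which of its neighbors is at least as bright as the center.

import numpy as np

def lbp_image(gray):
    g = gray.astype(np.int32)
    c = g[1:-1, 1:-1]                                  # center pixels
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]       # 8 neighbors, clockwise
    code = np.zeros_like(c)
    for bit, (dy, dx) in enumerate(offsets):
        nb = g[1 + dy:g.shape[0] - 1 + dy, 1 + dx:g.shape[1] - 1 + dx]
        code |= (nb >= c).astype(np.int32) << bit      # set bit if neighbor >= center
    return code

# A face descriptor is then typically a concatenation of regional histograms,
# e.g. np.histogram(lbp_image(gray)[region], bins=256, range=(0, 256)).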
Hidden Markov models [16]. A hidden Markov model is a statistical model that simulates the operation of a process similar to a Markov process with unknown parameters. The task is to find the unknown parameters based on other observed parameters; the obtained parameters can then be used in further analysis for face recognition. From the point of view of recognition, an image is a two-dimensional discrete signal, and the observation vector plays an important role in building the image model. To avoid discrepancies in descriptions, a rectangular window is usually used for recognition, and to avoid losing data areas, the rectangular windows should overlap each other. The values for the overlap, as well as the recognition areas, are selected experimentally. Before use, the model must be trained on a set of pre-labeled images. Each label has its own number and defines a characteristic point that the model will have to find when adapting to a new image.
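A hedged sketch of this scheme using the hmmlearn library (an assumed substitute, not the authors' tool): overlapping horizontal windows form the observation sequence, one model is trained per person, and a new face is assigned to the model with the highest likelihood.

import numpy as np
from hmmlearn import hmm

def observations(gray, win=8, overlap=4):
    """Overlapping horizontal strips of the image as an observation sequence."""
    step = win - overlap
    rows = range(0, gray.shape[0] - win + 1, step)
    return np.array([gray[r:r + win].ravel() for r in rows], dtype=float)

def train_person(face_images):
    """Train one HMM per person on that person's pre-labeled face images."""
    X = np.vstack([observations(img) for img in face_images])
    lengths = [len(observations(img)) for img in face_images]
    model = hmm.GaussianHMM(n_components=5, covariance_type="diag", n_iter=20)
    return model.fit(X, lengths)

def recognize(gray, models):
    """Pick the person whose model gives the highest log-likelihood."""
    obs = observations(gray)
    return max(models, key=lambda name: models[name].score(obs))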
3. Face Detection

MediaPipe Face Detection is a face detection software product that provides 6 landmarks and support for multiple faces. It is based on BlazeFace [17], a lightweight and high-performance face detector specifically designed for mobile GPUs. The detector's real-time performance allows it to be applied to any
real-time video stream that requires an accurate face region as input to other task-specific models, such as 3D face keypoint estimation (e.g., MediaPipe Face Mesh), facial feature or facial expression classification, and face region segmentation. BlazeFace uses a simplified feature extraction network inspired by MobileNetV1/V2 and, distinct from it, a GPU-friendly anchor scheme modified from the Single Shot MultiBox Detector (SSD).
The output is a collection of detected faces, where each face is represented as a proto message containing a bounding box and 6 key points (right eye, left eye, nose tip, center of the mouth, right ear tragion, and left ear tragion). The bounding box consists of xmin and width (both normalized to [0.0, 1.0] by the width of the image) and ymin and height (both normalized to [0.0, 1.0] by the height of the image). Each key point consists of x and y, which are normalized to [0.0, 1.0] by the width and height of the image, respectively (Fig. 4).
                                                      height of the image, respectively (Fig. 4).




Figure 4: Face capture in video
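Reading this output with the legacy MediaPipe Python Solutions API can be sketched as follows (illustrative; "frame.jpg" is a placeholder path):

import cv2
import mediapipe as mp

mp_fd = mp.solutions.face_detection
image = cv2.imread("frame.jpg")
with mp_fd.FaceDetection(model_selection=0, min_detection_confidence=0.5) as fd:
    results = fd.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
for detection in results.detections or []:
    box = detection.location_data.relative_bounding_box
    print(box.xmin, box.ymin, box.width, box.height)   # normalized to [0.0, 1.0]
    nose = mp_fd.get_key_point(detection, mp_fd.FaceKeyPoint.NOSE_TIP)
    print(nose.x, nose.y)                              # normalized key point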

4. Face Mesh

MediaPipe Face Mesh is a solution that estimates 468 3D facial landmarks in real time, even on mobile devices [18, 19]. The program uses machine learning to determine the 3D surface of the face, requiring only a single camera input without a special depth sensor. Using a simplified modeling architecture along with GPU acceleration throughout the pipeline, the solution delivers the real-time performance that is critical.
Additionally, the solution comes with a face transformation module that bridges the gap between facial landmark estimation and useful real-time Augmented Reality (AR) applications [20]. It establishes a metric 3D space and uses the positions of facial landmarks on the screen to estimate facial transformations in that space. The face transformation data consists of conventional 3D primitives, including a face pose transformation matrix and a triangular face mesh [21]. A lightweight statistical analysis method called Procrustes analysis is used to drive robust, efficient, and portable logic. The analysis is performed on the CPU and has a minimal speed footprint.
The machine learning pipeline consists of two real-time deep neural network models that work together [22]: a detector that operates on the full image and computes the location of the face, and a 3D facial landmark model that operates on these locations and predicts an approximate 3D surface via regression. Accurate face cropping significantly reduces the need for conventional data augmentation.
The pipeline is implemented as a MediaPipe graph that uses a face landmark subgraph from the face landmark module and visualizes the result using a dedicated face renderer subgraph. The face landmark subgraph internally uses the face_detection_subgraph from the face detection module.
The face detector is the same BlazeFace model used in MediaPipe Face Detection.
For 3D facial landmarks, we applied transfer learning and trained the network with multiple objectives: the network simultaneously predicts 3D landmark coordinates on synthetic rendered data and 2D semantic contours on annotated real-world data. The resulting network provided reasonable predictions of 3D landmarks not only on synthetic but also on real-world data [23, 24].
The 3D landmark network receives a cropped video frame as input without additional depth input. The model outputs the positions of the 3D points, as well as the probability that a face is present and properly aligned in the input data [25, 26]. A common alternative approach is to predict a 2D heat map for each landmark, but this does not lend itself to depth prediction and has high computational costs for so many points. We further improve the accuracy and reliability of the model by iteratively bootstrapping and refining the predictions. In this way, the dataset can be grown to increasingly complex cases, such as grimaces, oblique views, and occlusions.
This method can be used for a variety of face masking applications (Fig. 5).

Figure 5: Creating a face mask in a video track
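Running Face Mesh on a single image can be sketched with the legacy Solutions API as follows (illustrative; "face.jpg" is a placeholder, and refine_landmarks adds extra iris landmarks):

import cv2
import mediapipe as mp

mp_face_mesh = mp.solutions.face_mesh
image = cv2.imread("face.jpg")
with mp_face_mesh.FaceMesh(static_image_mode=True,
                           max_num_faces=1,
                           refine_landmarks=True) as face_mesh:
    results = face_mesh.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
for face in results.multi_face_landmarks or []:
    print(len(face.landmark))          # 468 landmarks (more with refinement)
    lm = face.landmark[1]              # one landmark: normalized x, y and
    print(lm.x, lm.y, lm.z)            # relative depth z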


5. Model Development

There are two models in this solution: general and landscape. Both are based on MobileNetV3, with modifications to make them more efficient. The general model works with a 256×256×3 (HWC) tensor and outputs a 256×256×1 tensor representing the segmentation mask. The landscape model is similar to the general model but works on a 144×256×3 (HWC) tensor. It has fewer FLOPs than the general model and is therefore faster. MediaPipe Selfie Segmentation automatically resizes the input image to the required tensor size before feeding it to the ML model [27].
The general model is also used in ML Kit, and the landscape model variant is used in Google Meet (Fig. 6).

Figure 6: Landscape model segmentation mask
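The general (0) versus landscape (1) model choice maps to the model_selection parameter of the Solutions API, as in this hedged sketch (the image path and mask threshold are assumptions):

import cv2
import mediapipe as mp

mp_ss = mp.solutions.selfie_segmentation
image = cv2.imread("person.jpg")                      # placeholder path
with mp_ss.SelfieSegmentation(model_selection=1) as segmenter:   # 1 = landscape
    results = segmenter.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
mask = results.segmentation_mask                      # float mask, input-sized
foreground = mask > 0.5                               # threshold is an assumption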

During this experiment, the issue of recognizing objects in a video stream was considered. The main Python libraries that can be used to recognize and classify objects from video are highlighted, and the MediaPipe methods for achieving a particular recognition result are described (Fig. 7).
import cv2

# Load the Haar cascade face detector shipped with OpenCV
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

img = cv2.imread(img_path)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)        # detector expects grayscale
faces = face_cascade.detectMultiScale(gray, 1.1, 5)
print("Faces found: " + format(len(faces)))
for (x, y, w, h) in faces:                          # draw a box around each face
    cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 2)
cv2.imshow('img', img)
cv2.waitKey(0)
Figure 7: Fragment of the face detection program




6. Experimental Results

About 50 images of graduating master's students were used. The images were taken with mobile phones, and after taking the photos, videos were recorded in MPEG7 format. The experiment took into account the orientation of the face relative to the plane of the photograph. The photos were taken in the style of a selfie, a documentary photo, and a portrait. In addition, these images had different variations in quality and contained several facets of variation in color, position, scale, rotation, pose, and facial expression. We present the detection results in Table 1 for the HHI MPEG7 image set. The face was captured by tracking.
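The per-frame detection times reported below (in ms) can be measured with a sketch like the following (a hedged illustration; the video path is a placeholder):

import time
import cv2
import mediapipe as mp

mp_fd = mp.solutions.face_detection
cap = cv2.VideoCapture("students.mp4")                # placeholder path
with mp_fd.FaceDetection(min_detection_confidence=0.5) as fd:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        t0 = time.perf_counter()
        fd.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        print("detection took %.1f ms" % ((time.perf_counter() - t0) * 1e3))
cap.release()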

Table 1
Detection results (FP: False Positives, DR: Detection Rate)

                          Frontal    Close to the frontal    Semi-profile    Profile
Selfie
  Number of images        12         10                      7               15
  Image size              –          –                       –               –
  FP                      6204       5205                    3090            2580
  DR                      87%        85%                     90%             99%
  Time                    10 ms
Portrait
  FP                      5290       5005                    2590            2800
  DR                      93%        92%                     85%             95%
  Time                    18 ms
Documentary photography
  FP                      3458       –                       3458            –
  DR                      85%        –                       85%             –
  Time                    22 ms


7. Conclusions

Research on face detection in a video stream has been conducted using the MediaPipe Face Detection library. The results of face detection are shown in Fig. 4. Frames in the form of photos are recorded from the video stream. The DeepFace library is used to capture faces; if there are several faces in the frame, DeepFace captures all of them. Experiments were carried out with pictures taken in the genres of selfie, portrait photography, and documentary photography. The photos were taken from the frontal, close-to-frontal, and profile positions. The performed recognition showed a high percentage of face recognition.

8. References

[1]  A. Yilmaz, O. Javed, M. Shah, Object Tracking: A Survey, ACM Computing Surveys (CSUR) 38(4) (2006) 13-es.
[2]  T. Huang, Computer Vision: Evolution and Promise, CERN School of Computing, 1996, 21–25. doi:10.5170/CERN-1996-008.21
[3]  Z. Pang, Z. Li, N. Wang, SimpleTrack: Understanding and Rethinking 3D Multi-Object Tracking, ECCV 2022 Workshops, Tel Aviv, Israel, October 2022, 680–696. doi:10.1007/978-3-031-25056-9_43
[4]  O. Iosifova, et al., Analysis of Automatic Speech Recognition Methods, in: Workshop on Cybersecurity Providing in Information and Telecommunication Systems, vol. 2923 (2021) 252–257.
[5]  K. Khorolska, et al., Application of a Convolutional Neural Network with a Module of Elementary Graphic Primitive Classifiers in the Problems of Recognition of Drawing Documentation and Transformation of 2D to 3D Models, Journal of Theoretical and Applied Information Technology 100(24) (2022) 7426–7437.
[6]  V. Sokolov, P. Skladannyi, A. Platonenko, Video Channel Suppression Method of Unmanned Aerial Vehicles, in: IEEE 41st International Conference on Electronics and Nanotechnology (2022) 473–477. doi:10.1109/ELNANO54667.2022.9927105
[7]  I. Delibaşoğlu, Moving Object Detection Method with Motion Regions Tracking in Background Subtraction, Signal, Image and Video Processing (2023) 1–9. doi:10.1007/s11760-022-02458-y
[8]  X. Yu, Evaluation of Training Efficiency of Table Tennis Players Based on Computer Video Processing Technology, Optik 273 (2023) 170404.
[9]  L. Nixon, How Do Destinations Relate to One Another? A Study of Destination Visual Branding on Instagram, ENTER eTourism Conference, 2023, 204–216.
[10] C. Xiao, Z. Luo, Improving Multiple Pedestrian Tracking in Crowded Scenes with Hierarchical Association, Entropy 25(2) (2023) 380.
[11] S. Garcia, et al., Face-To-Face and Online Teaching Experience on Experimental Animals and Alternative Methods with Nursing Students: A Research Study, BMC Nursing 22(1) (2023) 1–10.
[12] M. Lee, Y. Chen, Artificial Intelligence Based Object Detection and Tracking for a Small Underwater Robot, Processes 11(2) (2023) 312.
[13] A. Boyd, et al., CYBORG: Blending Human Saliency Into the Loss Improves Deep Learning-Based Synthetic Face Detection, IEEE/CVF Winter Conference on Applications of Computer Vision, 2023, 6108–6117.
[14] B. Hassan, F. Dawood, Facial Image Detection Based on the Viola-Jones Algorithm for Gender Recognition, International Journal of Nonlinear Analysis and Applications 14(1) (2023) 1593–1599.
[15] E. Hartman, et al., Elastic Shape Analysis of Surfaces with Second-Order Sobolev Metrics: A Comprehensive Numerical Framework, International Journal of Computer Vision, 2023, 1–27.
[16] E. Rica, S. Álvarez, F. Serratosa, Learning Distances Between Graph Nodes and Edges, Structural, Syntactic, and Statistical Pattern Recognition: Joint IAPR International Workshops, S+SSPR 2022, Montreal, QC, Canada, August 2022, 103–112.
[17] X. Qi, et al., A Convolutional Neural Network Face Recognition Method Based on BiLSTM and Attention Mechanism, Computational Intelligence and Neuroscience (2023).
[18] Y. Yasuda, et al., Flexibility Chart 2.0: An Accessible Visual Tool to Evaluate Flexibility Resources in Power Systems, Renewable and Sustainable Energy Reviews 174 (2023) 113116.
[19] G. Ramadan, et al., Impact of PCM Type on Photocell Performance Using Heat Pipe-PCM Cooling System: A Numerical Study, Journal of Energy Systems 7(1) (2023) 67–88.
[20] S. Sut, et al., Automated Adrenal Gland Disease Classes Using Patch-Based Center Symmetric Local Binary Pattern Technique with CT Images, Journal of Digital Imaging (2023) 1–14.
[21] R. Glennie, et al., Hidden Markov Models: Pitfalls and Opportunities in Ecology, Methods in Ecology and Evolution 14(1) (2023) 43–56.
[22] N. Bansal, et al., Real-Time Advanced Computational Intelligence for Deep Fake Video Detection, Applied Sciences 13(5) (2023) 3095.
[23] B. Deori, D. Thounaojam, A Survey on Moving Object Tracking in Video, International Journal of Information Theory (IJIT) 3(3) (2014) 31–46.
[24] M. Medykovskyy, et al., Methods of Protection Document Formed from Latent Element Located by Fractals, in: X International Scientific and Technical Conference "Computer Sciences and Information Technologies," 2015, 70–72.
[25] M. Logoyda, et al., Identification of Biometric Images using Latent Elements, CEUR Workshop Proceedings, 2019.
[26] M. Nazarkevych, B. Yavourivskiy, I. Klyuynyk, Editing Raster Images and Digital Rating with Software, The Experience of Designing and Application of CAD Systems in Microelectronics, 2015, 439–441.
[27] V. Hrytsyk, A. Grondzal, A. Bilenkyj, Augmented Reality for People with Disabilities, in: X International Scientific and Technical Conference "Computer Sciences and Information Technologies" (2015) 188–191.