<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>International Workshop on Advanced Applied Information Technologies, December</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Comparison of ResNet, EfficientNet, and Xception architectures for deepfake detection</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Khrystyna Lipianina-Honcharenko</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mykola Telka</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nazar Melnyk</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>West Ukrainian National University</institution>
          ,
          <addr-line>Lvivska str., 11, Ternopil, 46000</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <volume>5</volume>
      <issue>2024</issue>
      <fpage>0000</fpage>
      <lpage>0002</lpage>
      <abstract>
        <p>This study presents a comparative analysis of three deep neural networks (ResNet, EfficientNet, and Xception) for deepfake video detection tasks. The primary goal was to identify the most effective architecture for classifying fake videos, as well as to explore additional mechanisms, such as Long Short-Term Memory (LSTM) and attention mechanisms, which could enhance the accuracy of the models. Using a dataset consisting of real and fake videos, each model was evaluated based on accuracy, precision, recall, and F1-score metrics. The results showed that the Xception model achieved the highest accuracy (87.7%), while EfficientNet also demonstrated high efficiency, particularly in resource-constrained tasks. ResNet showed stability but faced challenges in classifying underrepresented classes.</p>
      </abstract>
      <kwd-group>
        <kwd>deepfake</kwd>
        <kwd>ResNet</kwd>
        <kwd>EfficientNet</kwd>
        <kwd>Xception</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The proliferation of deepfake videos poses a significant threat to digital security and information
trust, creating challenges across various sectors, including media, politics, and legal systems.
Deepfake technologies facilitate the creation of highly realistic yet fabricated videos, making their
detection challenging for conventional methods. This contributes to the manipulation of public
opinion [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], facilitates the dissemination of false information, and enables malicious activities such
as deception and fraud. Therefore, developing effective systems for the automatic detection of
deepfake videos is critically important for ensuring information security and combating
disinformation [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>This study aims to evaluate different deep neural network architectures, such as ResNet,
EfficientNet, and Xception, for deepfake video detection. The primary focus is on identifying the
most effective models and exploring the role of additional mechanisms, including LSTM and
attention, in improving detection accuracy. The study assesses the performance of the models using
metrics such as accuracy, recall, precision, and F1-score, ultimately identifying the best approaches
for building reliable deepfake detection systems.</p>
      <p>Future research could address current limitations by implementing advanced data augmentation
techniques to balance datasets, exploring ensemble models to combine the strengths of multiple
architectures, and optimizing computational efficiency for deploying lightweight models in
real-world scenarios.</p>
      <p>The paper is organized into several sections: Section 2 reviews existing deepfake detection
methods and the neural network architectures commonly employed. Section 3 outlines the research
methodology, detailing data preparation, model selection, and training processes. Section 4 provides
a comprehensive performance analysis of the models based on accuracy, recall, precision, and
F1-score. Finally, Section 5 highlights the most effective approaches and proposes recommendations for
advancing deepfake detection systems.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related work</title>
      <p>
        Recent studies in the field of deepfake video detection focus on utilizing deep neural networks, such
as ResNet, EfficientNet, and Xception, to improve the accuracy of classifying real and fake videos.
For instance, ResNet-50 is used for deepfake video detection by combining it with LSTM to account
for both images and video frame sequences. This allows the model to consider temporal
dependencies, significantly enhancing accuracy compared to methods that use only individual
frames [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Additionally, other studies, such as those involving Inception-ResNet-V2, emphasize the
necessity of developing effective deepfake detection methods due to security and privacy threats [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p>Other approaches concentrate on developing more complex architectures. Specifically, the
Sequential-Parallel Networks (SPNet) model offers a novel method for processing deepfake videos,
providing more efficient handling of spatiotemporal dependencies with a reduced number of
parameters [5]. This architecture helps lower computational costs, which is a crucial factor when
working with large volumes of video data. Furthermore, a five-layer convolutional neural network
proposed in another study demonstrates a high accuracy of 98% compared to other models, such as
Xception and EfficientNet-B0 [6].</p>
      <p>In addition to these recent approaches, models that use attention mechanisms, such as channel
and spatial attention, show significant improvements in deepfake detection accuracy compared to
standard models. The use of attention mechanisms allows the model to focus on important features
of the input data, which is particularly beneficial for complex detection tasks, such as identifying
fake videos [7]. Other studies, including reviews of deepfake detection methods using ResNet,
EfficientNet, and Xception, also confirm the effectiveness of these architectures in deep learning
tasks [8]. Similarly, research involving the use of Xception and ResNet-50 in combination with Local
Binary Pattern (LBP) for deepfake video classification demonstrates the effectiveness of image
processing and the accuracy of these models [9].</p>
      <p>In the work on deepfake detection using ResNeXt50 and LSTM, researchers significantly improved
accuracy by integrating temporal dependency analysis. This approach enables not only the detection
of individual frames but also the analysis of their interrelationships [10]. Finally, the use of
Generative Adversarial Networks (GAN) combined with CNN has helped reduce computational costs
by selecting key video frames to enhance results [11], making this approach promising in combating
deepfake videos.</p>
      <p>
        Most comparisons show that models like Xception and EfficientNet significantly outperform
ResNet in deepfake detection tasks due to their ability to process textures and fine image details more
effectively. Xception, with its architecture of deep separable convolutions, allows for a reduction in
the number of parameters without sacrificing accuracy, making it particularly useful in
resource-constrained environments. EfficientNet, in turn, offers optimal scaling of model depth, width, and
resolution, leading to better performance compared to ResNet. However, ResNet remains an
important foundational architecture, especially when used in combination with mechanisms like
LSTM for handling temporal dependencies, making it effective in tasks that analyze both individual
frames and video sequences [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ][
        <xref ref-type="bibr" rid="ref4">4</xref>
        ][5].
      </p>
      <p>Given this context, the aim of this study is to compare the effectiveness of different deep neural
networks, such as ResNet, EfficientNet, and Xception, in deepfake video detection tasks. Special
attention is given to how the architectural features of each model impact their ability to accurately
classify fake videos and optimize their performance in resource-constrained conditions. Additionally,
the study examines the role of supplementary mechanisms, such as LSTM and attention methods,
which can enhance deepfake detection accuracy by combining the processing of both individual
frames and temporal sequences.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Research methodology</title>
      <sec id="sec-3-1">
        <title>3.1. Research architecture</title>
        <p>The research architecture (Figure 1) for evaluating model accuracy in deepfake detection tasks is
described below. The process begins with the initialization of the environment, including the import
of necessary libraries and metadata loading. Next, data preprocessing is carried out, which involves
reading metadata, randomly selecting a subset of videos, reading video files, extracting frames, and
splitting the data into training and testing sets. Following this, model preparation takes place, where
ResNet50, EfficientNet, and Xception are initialized and configured for binary classification. During
the training phase, the models are trained on the training data, and their evaluation is conducted on
the test data, with accuracy calculations. The process concludes with comparing the results of the
three models based on the obtained accuracy metrics.</p>
        <p>Figure 1: Research architecture. Init/Import: import libraries and load metadata. DataProc: read metadata, select a random subset of videos, read the video files, extract frames, and split the data into train/test sets. ModelPrep: initialize ResNet50, EfficientNet, and Xception and set up binary classification. Train: train the models on the training data. Eval: test on the test data and calculate accuracy. Compare: compare the results of ResNet50, EfficientNet, and Xception.</p>
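        <p>The frame-sampling and data-splitting steps described above can be sketched in plain Python. This is a minimal illustration; the helper names <italic>sample_frame_indices</italic> and <italic>train_test_split_items</italic> are illustrative, not the authors' actual code:</p>

```python
import random

def sample_frame_indices(total_frames, num_samples):
    """Pick evenly spaced frame indices from a video with total_frames frames."""
    if num_samples >= total_frames:
        return list(range(total_frames))
    step = total_frames / num_samples
    return [int(i * step) for i in range(num_samples)]

def train_test_split_items(items, test_ratio=0.2, seed=42):
    """Shuffle and split a list of (video_path, label) pairs into train/test sets."""
    rng = random.Random(seed)
    shuffled = items[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_ratio))
    return shuffled[:cut], shuffled[cut:]
```

        <p>For example, sampling 10 indices from a 100-frame video yields frames 0, 10, ..., 90, and a 480-item dataset splits into 384 training and 96 test items at the default ratio.</p>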
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Model descriptions</title>
        <p>ResNet50 [12], EfficientNetB0 [13], and Xception [14] are deep convolutional neural networks
designed for feature extraction from images, each with unique characteristics in their approaches to
scaling and optimization. All three models accept input tensors X ∈ ℝ^(H×W×C), where H, W, and C
denote the image's height, width, and number of channels, respectively. The output of the
convolutional blocks in each model is a feature tensor F ∈ ℝ^(h×w×c), which is then transformed into
a one-dimensional vector v = Flatten(F) using a Flatten operation for further processing in dense
layers.</p>
        <p>ResNet50 [12] utilizes a convolutional layer architecture that includes "skip connections" to
prevent gradient vanishing during the training of deep networks. Each ResNet block involves a
sequence of convolutions, followed by adding the block input to its output before activation, which
is mathematically described as</p>
        <p>y = F(x, W) + x, (1)</p>
        <p>where x is the block input, F(x, W) is the residual mapping computed by the block's convolutional
layers, and y is the block output.</p>
        <p>These skip connections help maintain information flow through the network and reduce
problems related to network depth.</p>
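        <p>Equation (1) can be illustrated with a toy residual block in plain Python; this is an elementwise sketch with a stand-in residual mapping, not the actual convolutional implementation:</p>

```python
def residual_block(x, residual_fn):
    """Compute y = F(x, W) + x for a vector input: apply the residual
    mapping, then add the block input back via the skip connection."""
    fx = residual_fn(x)
    return [f + xi for f, xi in zip(fx, x)]

# Toy residual mapping: a fixed linear scaling standing in for conv layers.
double = lambda x: [2 * xi for xi in x]
y = residual_block([1.0, -2.0, 3.0], double)  # F(x) + x = 2x + x = 3x
```

        <p>Note that even if the residual mapping outputs zeros, the input still passes through unchanged, which is what preserves gradient flow in very deep networks.</p>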
        <p>EfficientNetB0 optimizes its architecture using the composite scaling
method, which
simultaneously scales depth, width, and input size to balance accuracy and efficiency. The
architecture employs depthwise separable convolutions, which reduce the number of parameters and
computational operations by first applying depthwise convolutions independently on each channel
and then using 1 × 1 convolutions to combine the channels.</p>
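        <p>The parameter savings from depthwise separable convolutions can be checked directly: a standard k × k convolution needs k·k·C_in·C_out weights, while the depthwise-then-pointwise factorization needs only k·k·C_in + C_in·C_out (a rough count that ignores biases):</p>

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out

def separable_conv_params(k, c_in, c_out):
    """Weights in a depthwise k x k conv plus a 1 x 1 pointwise conv."""
    return k * k * c_in + c_in * c_out

# Example: a 3x3 convolution mapping 128 channels to 256 channels.
std = standard_conv_params(3, 128, 256)   # 294912 weights
sep = separable_conv_params(3, 128, 256)  # 33920 weights
```

        <p>For this example layer the separable form uses roughly 8.7 times fewer weights, which is the source of the efficiency gains discussed above.</p>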
        <p>Xception is an "extreme" version of the Inception architecture, where standard convolutions are
entirely replaced by depthwise separable convolutions for each spatial point and channel. This
approach not only reduces the number of parameters but also allows for more efficient feature
extraction by utilizing a greater number of independent operations. The model uses a sequence of
depthwise and pointwise convolutions in each layer, enabling better adaptation to diverse visual
patterns in the data.</p>
        <p>All three models use dense layers for further processing of the feature vector v and an output
layer with sigmoid activation for classification, underscoring their versatility and effectiveness in
modern computer vision tasks.</p>
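        <p>As a toy illustration of the Flatten step described above (the dimensions here are made up; real feature maps are far larger), a nested h × w × c feature tensor becomes a one-dimensional vector of length h·w·c:</p>

```python
def flatten_feature_tensor(feature):
    """Flatten a nested h x w x c list into a 1-D feature vector v = Flatten(F)."""
    return [channel
            for row in feature
            for pixel in row
            for channel in pixel]

# Toy feature tensor with h=2, w=2, c=3 (illustrative sizes only).
F = [[[1, 2, 3], [4, 5, 6]],
     [[7, 8, 9], [10, 11, 12]]]
v = flatten_feature_tensor(F)  # length 2 * 2 * 3 = 12
```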
        <p>The integration of the Swish activation function [15] and the Dropout technique [16] into the
ResNet50, EfficientNetB0, and Xception models can significantly enhance their performance and
generalization capabilities. Swish is a smoothly varying nonlinear activation function defined as</p>
        <p>Swish(x) = x ⋅ σ(x), (2)</p>
        <p>where σ(x) is the sigmoid function σ(x) = 1/(1 + e^(−x)). This function has been proposed as an
alternative to ReLU due to its ability to mitigate the issue of dead neurons, allowing smoother
propagation of negative values and improving gradient flow in deep networks.</p>
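        <p>Equation (2) translates directly into code; a minimal sketch using only the standard library:</p>

```python
import math

def sigmoid(x):
    """Logistic sigmoid: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def swish(x):
    """Swish activation: x * sigmoid(x), as in Equation (2)."""
    return x * sigmoid(x)
```

        <p>Unlike ReLU, Swish passes small negative values through attenuated rather than zeroing them, which is the property credited with mitigating dead neurons.</p>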
        <p>Dropout, on the other hand, is a regularization technique that helps prevent overfitting by
randomly dropping out neurons during training. This forces the network to learn to be less reliant
on specific features, thus enhancing its robustness and ability to generalize to new data. In the
ResNet50, EfficientNetB0, and Xception models, applying Dropout in high-level dense layers can
help manage model complexity, reducing the risk of overfitting the large number of weights these
models have.</p>
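        <p>Dropout as described above can be sketched in a few lines. This uses the common inverted-dropout variant, in which surviving activations are scaled by 1/(1 − p) so the expected activation is unchanged between training and inference:</p>

```python
import random

def dropout(activations, p, rng=None):
    """Randomly zero each activation with probability p during training;
    scale survivors by 1/(1-p) so the expected activation is unchanged."""
    if p == 0.0:
        return activations[:]
    rng = rng or random.Random()
    keep = 1.0 - p
    return [a / keep if rng.random() >= p else 0.0 for a in activations]
```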
        <p>The combination of Swish and Dropout in these models can be particularly advantageous for
tasks with large and complex datasets, where flexible activation and robust regularization are needed.
Using Swish can improve the models' learning capability in deep layers, where traditional activation
functions like ReLU may encounter limitations. Meanwhile, Dropout provides the additional benefit
of encouraging the network to distribute useful information across a greater number of neurons,
reducing the weight that any single neuron has on the model's decision.</p>
        <p>A comparative table of the ResNet, EfficientNet, and Xception models is presented below,
highlighting their key characteristics, features, and advantages in the context of video data
processing.</p>
        <table-wrap id="tbl-1">
          <caption><p>Comparison of ResNet, EfficientNet, and Xception models</p></caption>
          <table>
            <thead>
              <tr><th>Characteristic</th><th>ResNet</th><th>EfficientNet</th><th>Xception</th></tr>
            </thead>
            <tbody>
              <tr><td>Architecture</td><td>Residual network (residual blocks)</td><td>Balanced scaling of depth, width, and resolution</td><td>Depthwise separable convolutions</td></tr>
              <tr><td>Network Depth</td><td>Deep (up to 152 layers)</td><td>Efficient at different scales</td><td>Deep, with very efficient convolutions</td></tr>
              <tr><td>Scalability</td><td>Difficult to scale</td><td>Well-scaled (balance of depth and performance)</td><td>Computationally intensive</td></tr>
              <tr><td>Texture Processing</td><td>Contours, textures, objects, scenes; good at detecting textures at high levels</td><td>Fine details and object details; detects textures efficiently due to balanced scaling</td><td>Especially effective for detailed textures and anomalies</td></tr>
              <tr><td>Object and Scene Processing</td><td>Performs well with large objects and scenes</td><td>Optimized for various scenes and objects</td><td>Specializes in artifacts and anomalies</td></tr>
              <tr><td>Video Processing Capabilities</td><td>Can handle temporal sequences (with LSTM [17])</td><td>Efficiently processes video frames due to flexible scaling</td><td>Detects artifacts, particularly useful for deepfake detection</td></tr>
              <tr><td>Advantages</td><td>Learns features at various levels, from simple to complex; effective when combined with LSTM [17] for sequence analysis</td><td>High efficiency due to balanced scaling; suitable for large and complex images and videos</td><td>Focused on artifact detection; performs well in deepfake detection tasks</td></tr>
              <tr><td>Disadvantages</td><td>High computational resource requirements; training very deep models is challenging</td><td>May be less accurate without proper scaling</td><td>Computationally intensive; challenging to train due to deep convolutions</td></tr>
              <tr><td>Use in Video Tasks</td><td>Extracts multi-level features (contours, objects, scenes); well-suited for analyzing temporal changes</td><td>Effective for processing videos with large or complex scenes; suitable for tasks requiring a balance of accuracy and efficiency</td><td>Excellent for detecting anomalies and artifacts in videos, especially useful for detecting fake videos (deepfakes)</td></tr>
            </tbody>
          </table>
        </table-wrap>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Research results</title>
      <p>This study conducted a comparison of the performance of three deep learning models—ResNet50,
EfficientNetB0, and Xception—in the task of image-based data classification. Each model was trained
for ten epochs, and the results were evaluated using accuracy [18], precision, recall, and F1-score metrics,
as well as a confusion matrix for each model. Below is a detailed analysis of each model's
performance.</p>
      <p>The dataset used for this study was sourced from the Deepfake Detection Challenge on the Kaggle
platform (Kaggle, 2020) [19]. It contains videos classified into two categories: "REAL" and "FAKE."
After preprocessing, 480 samples were obtained, with 60 (12.5%) belonging to the "REAL" class and
420 (87.5%) to the "FAKE" class. The videos were standardized by frame size and used as input to
pretrained neural networks for classification. The uneven class distribution reflects a real-world
scenario, which is typical for deepfake detection tasks.</p>
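      <p>Given the 60/420 class split reported above, one standard mitigation (a sketch of a common practice, not necessarily what was applied in this study) is to weight the loss inversely to class frequency:</p>

```python
def class_weights(counts):
    """Inverse-frequency class weights: total / (num_classes * count),
    so rare classes contribute more to the loss."""
    total = sum(counts.values())
    n = len(counts)
    return {label: total / (n * c) for label, c in counts.items()}

# The dataset's distribution: 60 REAL vs. 420 FAKE samples (480 total).
weights = class_weights({"REAL": 60, "FAKE": 420})  # REAL weighted 4.0
```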
      <p>For each video, multiple frames were processed and converted into tensors for use in neural
networks. All videos were standardized by frame size, and extracted features from these frames were
fed into the pretrained models (ResNet50, EfficientNet, Xception).</p>
      <p>Regarding the training process (Figure 2), all three models showed stable improvement in metrics
on the training sets; however, significant fluctuations were observed during validation, indicating
possible overfitting or sensitivity to parameter selection and data structures. Notably, in epochs
8-10, the models experienced some degradation in validation loss (val_loss), suggesting that the models
began to overfit after a certain number of epochs.</p>
      <p>The ResNet50 model demonstrated strong stability during training, achieving accuracy above
80%, but faced challenges in classifying the "REAL" class. This emphasizes the importance of further
work on data balancing to improve the model's performance on minority classes.</p>
      <p>EfficientNetB0, thanks to its efficient architecture, showed good performance in classifying both
classes, maintaining high accuracy for the "FAKE" class while also delivering better results for the
"REAL" class compared to ResNet50. Its performance could be improved through more aggressive
regularization to avoid the overfitting observed in later training stages.</p>
      <p>Xception, with its use of depthwise separable convolutions, achieved the best results in overall
accuracy and balance between classes. This suggests that its architecture is better suited for complex
image classification tasks, with fewer parameters that reduce the risk of overfitting.</p>
      <p>Next, the results were evaluated using accuracy, precision, recall, F1-score, and confusion
matrices for each model.</p>
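      <p>The reported metrics follow from each confusion matrix in the usual way; a small sketch for the binary case (the counts here are illustrative, not the paper's actual matrices):</p>

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Illustrative counts, treating "FAKE" as the positive class.
m = binary_metrics(tp=80, fp=10, fn=5, tn=5)
```

      <p>Note that with such an imbalanced dataset, a model that predicts "FAKE" everywhere would still score 87.5% accuracy, which is why per-class precision and recall matter here.</p>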
      <p>ResNet50 (Figure 3) achieved an accuracy of 81.0% but faced difficulties in classifying the
"REAL" class, failing to correctly classify any instances of this class, as clearly shown in the confusion
matrix. This indicates an imbalance in model performance, which, despite high accuracy for the
"FAKE" class (93% recall), could not accurately classify the "REAL" class. This limitation may be
related to the network's depth and the need for additional regularization or data processing to
balance the classes.</p>
      <p>EfficientNetB0 (Figure 4) reached an accuracy of 81.5%, slightly better than ResNet50. The
model performed better in classifying the "REAL" class with a precision of 0.34 and recall of 0.50,
representing a significant improvement compared to ResNet50. For the "FAKE" class, the model
maintained high precision (0.92) and recall (0.86). This indicates that the EfficientNetB0 architecture
is better optimized for resource-constrained tasks due to its scaling mechanism.</p>
      <p>Xception (Figure 5) achieved the highest accuracy among all models—87.7%. This model
displayed balanced results for both classes, with precision and recall for the "REAL" class at 0.51 and
0.50, respectively, which is a significant improvement over the other models. For the "FAKE" class,
precision and recall values were 0.93, demonstrating the model's strong ability to extract and utilize
important features.</p>
      <p>Based on the obtained results, the idea of ensembling these three models—ResNet, EfficientNetB0,
and Xception—could be a promising direction for further research. Using a combined approach,
where the strengths of each model compensate for the weaknesses of others, could significantly
improve the system's ability to generalize and its classification accuracy across a wide range of data.
Xception, with its high accuracy and stability, could serve as the basis for accurate feature detection,
while ResNet [20] and EfficientNetB0 could add robustness and computational efficiency, especially
in resource-constrained environments. These initial observations encourage the development of a
comprehensive ensemble model, which will be thoroughly analyzed and evaluated in future research
projects aimed at optimizing detection and classification capabilities for modified images.</p>
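      <p>A simple form of the ensembling proposed above is probability averaging; a sketch assuming each model outputs a sigmoid probability of the "FAKE" class (the weights and threshold shown are illustrative choices, not tuned values):</p>

```python
def ensemble_predict(probs, weights=None, threshold=0.5):
    """Weighted average of per-model FAKE probabilities; the label is FAKE
    when the averaged probability exceeds the threshold."""
    weights = weights or [1.0] * len(probs)
    avg = sum(p * w for p, w in zip(probs, weights)) / sum(weights)
    return ("FAKE" if avg > threshold else "REAL"), avg

# Hypothetical outputs from ResNet50, EfficientNetB0, and Xception.
label, avg = ensemble_predict([0.9, 0.7, 0.8])
```

      <p>Per-model weights could, for instance, favor Xception given its stronger standalone results, though choosing such weights would itself require validation.</p>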
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>This study conducted a comparative analysis of three deep neural networks—ResNet, EfficientNet,
and Xception—for deepfake video detection. The results indicate that Xception proved to be the most
effective model for classifying fake videos, achieving an accuracy of 87.7% along with balanced
precision and recall metrics for both classes. EfficientNet also demonstrated high performance, with
an accuracy of 81.5% and superior results compared to ResNet in detecting the "REAL" class. ResNet,
despite its stability in training and an accuracy of 81.0%, faced challenges in classifying videos of the
"REAL" class, highlighting the need for further model improvements when working with imbalanced
data.</p>
      <p>The application of additional techniques, such as LSTM for handling temporal sequences, helped
to enhance the accuracy of ResNet, demonstrating the importance of considering temporal
dependencies in deepfake detection. However, models like Xception and EfficientNet, with their
advanced architectures, significantly outperformed ResNet in deepfake detection tasks, providing
more efficient feature extraction and better generalization capabilities.</p>
      <p>For further improving deepfake video detection efficiency, a promising research direction is the
use of model ensemble methods. Combining the strengths of different models, such as ResNet,
EfficientNet, and Xception, could create a more robust and accurate detection system. Model
ensembling can enhance the system’s generalization ability and improve accuracy by reducing the
risk of overfitting and compensating for the weaknesses of individual models. Future research should
focus on developing effective ensemble approaches for deepfake detection, which could significantly
improve results in real-world conditions.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This paper is supported by the EU Erasmus+ programme within the Capacity Building Project
“WORK4CE” (619034-EPP-1–2020-1-UA-EPPKA2-CFHE-JP).</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>The authors have not employed any Generative AI tools.</p>
    </sec>
    <sec id="sec-8">
      <title>References</title>
      <p>[5] R. Sun, Z. Zhao, L. Shen, Z. Zeng, Y. Li, B. Veeravalli, X. Yulei (2023). An efficient deep video model for deepfake detection. 2023 IEEE International Conference on Image Processing (ICIP). https://dx.doi.org/10.1109/ICIP49359.2023.10222682</p>
      <p>[6] J. B. Awotunde, R. Jimoh, A. Imoize, A. T. Abdulrazaq, C. T. Li, C. C. Lee (2022). An enhanced deep learning-based deepfake video detection and classification system. Electronics, 12(1). https://dx.doi.org/10.3390/electronics12010087</p>
      <p>[7] A. E. Bayar, C. Topal (2023). Deepfake detection via combining channel and spatial attention. 2023 IEEE Signal Processing Conference.</p>
      <p>[8] A. Das, K. S. Viji, L. Sebastian (2022). A survey on deepfake video detection techniques using deep learning. 2022 International Conference on Next Generation Information Systems (ICNGIS). https://dx.doi.org/10.1109/ICNGIS54955.2022.10079802</p>
      <p>[9] A. Arini, R. B. Bahaweres, J. Al Haq (2022). Quick classification of Xception and ResNet-50 models on deepfake video using Local Binary Pattern. IEEE Symposium on Artificial Intelligence and Multimedia (ISMODE).</p>
      <p>[10] S. Z. Yunes Al-Dhabi (2021). Deepfake video detection by combining convolutional neural network (CNN) and recurrent neural network (RNN). 2021 IEEE International Conference on Artificial Intelligence and Smart Energy (CSAIEE). https://dx.doi.org/10.1109/CSAIEE54046.2021.9543264</p>
      <p>[11] S. Lalitha, K. Sooda (2022). DeepFake detection through key video frame extraction using GAN. 2022 International Conference on Advanced Computing Research and Sustainability (ICACRS). https://dx.doi.org/10.1109/ICACRS55517.2022.10029095</p>
      <p>[12] S. Targ, D. Almeida, K. Lyman (2016). Resnet in Resnet: Generalizing residual architectures. arXiv preprint arXiv:1603.08029.</p>
      <p>[13] K. Kansal, T. B. Chandra, A. Singh (2024). ResNet-50 vs. EfficientNet-B0: Multi-centric classification of various lung abnormalities using deep learning. Procedia Computer Science, 235, 70-80.</p>
      <p>[14] F. Chollet (2017). Xception: Deep learning with depthwise separable convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1251-1258.</p>
      <p>[15] R. Sudharsan, E. N. Ganesh (2022). A Swish RNN based customer churn prediction for the telecom industry with a novel feature selection strategy. Connection Science, 34(1), 1855-1876.</p>
      <p>[16] S. Wager, S. Wang, P. S. Liang (2013). Dropout training as adaptive regularization. Advances in Neural Information Processing Systems, 26.</p>
      <p>[17] H. Arab, I. Ghaffari, R. M. Evina, S. O. Tatu, S. Dufour (2022). A hybrid LSTM-ResNet deep neural network for noise reduction and classification of V-band receiver signals. IEEE Access, 10, 14797-14806.</p>
      <p>[18] K. Lipianina-Honcharenko, V. Yarych, A. Ivasechko, A. Filinyuk, K. Yurkiv, T. Lebid, M. Soia (2024). Evaluating the effectiveness of Attention-Gated-CNN-BGRU models for historical manuscript recognition in Ukraine. Proceedings of the First International Workshop of Young Scientists on Artificial Intelligence for Sustainable Development, Ternopil, Ukraine, May 10-11, 2024, pp. 99-108. https://ceur-ws.org/Vol-3716/paper8.pdf</p>
      <p>[19] Kaggle (2020). Deepfake Detection Challenge. https://www.kaggle.com/competitions/deepfake-detection-challenge/data</p>
      <p>[20] S. Keerthana, N. Deepika, E. Pooja, I. Nandhini, M. Shanthalakshmi, G. R. Khanaghavalle (2024). An effective approach for detecting deepfake videos using Long Short-Term Memory and ResNet. 2024 International Conference on Communication, Computing and Internet of Things (IC3IoT), 1-5. https://doi.org/10.1109/ic3iot60841.2024.10550265</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>K.</given-names>
            <surname>Lipianina-Honcharenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Melnychuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Yurkiv</surname>
          </string-name>
          , G. Hladiy,
          <string-name>
            <given-names>M.</given-names>
            <surname>Telka</surname>
          </string-name>
          (
          <year>2024</year>
          ).
          <article-title>Integrated Approach to the International Aspects of Online Dispute Resolution Formation</article-title>
          .
          <source>Proceedings of the First International Workshop of Young Scientists on Artificial Intelligence for Sustainable Development Ternopil, Ukraine, May 10-11</source>
          ,
          <year>2024</year>
          . (pp.
          <fpage>88</fpage>
          -
          <lpage>98</lpage>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>K.</given-names>
            <surname>Lipianina-Honcharenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Soia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Yurkiv</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ivasechko</surname>
          </string-name>
          (
          <year>2024</year>
          ).
          <article-title>Evaluation of the Effectiveness of Machine Learning Methods for Detecting Disinformation in Ukrainian Text Data</article-title>
          .
          <source>Proceedings of The Seventh International Workshop on Computer Modeling and Intelligent Systems (CMIS-2024)</source>
          , Zaporizhzhia, Ukraine, May 3,
          <year>2024</year>
          . (pp.
          <fpage>97</fpage>
          -
          <lpage>109</lpage>
          ). https://ceur-ws.org/Vol-3702/paper9.pdf
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Rani</surname>
          </string-name>
          ,
          <string-name>
            <surname>B. R.</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. K.</given-names>
            <surname>Pareek</surname>
          </string-name>
          ,
          <string-name>
            <surname>B. S.</surname>
          </string-name>
          (
          <year>2023</year>
          ).
          <article-title>Deepfake video detection system using deep neural networks</article-title>
          .
          <source>2023 IEEE International Conference on Information and Communication Systems (ICICACS)</source>
          . https://dx.doi.org/10.1109/ICICACS57338.2023.10099618
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>R.</given-names>
            <surname>Rajalaxmi</surname>
          </string-name>
          ,
          <string-name>
            <surname>P. S.</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Rithani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Dhivakar</surname>
          </string-name>
          ,
          <string-name>
            <surname>G. E.</surname>
          </string-name>
          (
          <year>2023</year>
          ).
          <article-title>Deepfake detection using InceptionResNet-V2 network</article-title>
          .
          <source>2023 International Conference on Computing and Communications (ICCMC)</source>
          . https://dx.doi.org/10.1109/ICCMC56507.2023.10083584
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>