<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Identification of Modern Facial Emotion Recognition Models</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Kirill Smelyakov</string-name>
          <email>kyrylo.smelyakov@nure.ua</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Oleksandr Bohomolov</string-name>
          <email>oleksandr.bohomolov@nure.ua</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maksym Kizitskyi</string-name>
          <email>maksym.kizitskyi@nure.ua</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anastasiya Chupryna</string-name>
          <email>anastasiya.chupryna@nure.ua</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Kharkiv National University of Radio Electronics</institution>
          ,
          <addr-line>14 Nauky Ave., Kharkiv, 61166</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The paper is devoted to the problem of developing a generalized algorithm for the effective identification of computational intelligence models used to recognize emotions from a person's facial expression. To solve this problem, a suitable dataset was selected; alternative recognition models, algorithms, and machine learning technologies were identified, as well as the performance indicators and metrics used in the comparative analysis of the obtained results. A series of experiments was carried out to identify the parameters of alternative neural network models used to recognize emotions and to evaluate the effectiveness of their application. Based on a comparative analysis of the experimental results, a generalized algorithm for identifying emotions was formulated, along with recommendations for using certain neural network architectures within facial emotion recognition tasks.</p>
      </abstract>
      <kwd-group>
        <kwd>Computer vision</kwd>
        <kwd>facial emotion recognition</kwd>
        <kwd>face recognition</kwd>
        <kwd>convolutional neural network</kwd>
        <kwd>transfer learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
    </sec>
    <sec id="sec-2">
      <title>2. Related Works</title>
      <p>
        Research in recent years has focused on the facial emotion recognition (FER) task [
        <xref ref-type="bibr" rid="ref1 ref2 ref3">1-3</xref>
        ]. Such systems often supplement face recognition systems (Azure Face API, Face, FaceReader, etc.) [
        <xref ref-type="bibr" rid="ref4 ref5 ref6">4-6</xref>
        ] and can be used in many situations, from customer satisfaction analysis and service at the checkout to tracking emotions at a psychologist's appointment [
        <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
        ], and prospectively in drone vision services [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], etc.
      </p>
      <p>
        Researchers have described the most efficient approaches to the facial emotion recognition (FER) task, which use networks such as ResNet, AffectNet, MobileNet, etc. To simplify access to this information, a dedicated list has been organized [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
      <p>On the other hand, these approaches include various forms of ensembling and stacking of neural networks. This yields a gain in the quality of emotion classification, but it also has disadvantages. Firstly, the resulting model becomes quite large and heavy, and predictions take a lot of time. Because of this, applying models of this kind on mobile devices or in real-time systems is very complicated. Secondly, with several neural networks present, maintaining them within a production system becomes more complicated, and updating models while preserving the logic of the system is more difficult than with an end-to-end model. Therefore, the issue of developing a model that is perhaps not as effective, but much more compact and easier to maintain for use in face recognition systems, remains relevant and open.</p>
      <p>
        At the same time, a wide variety of machine learning models and algorithms, as well as a high degree
of uncertainty in the application conditions, often create great difficulties in choosing an appropriate
network architecture and tuning its parameters effectively [
        <xref ref-type="bibr" rid="ref11 ref12 ref13">11-13</xref>
        ].
      </p>
      <p>
        Why are neural networks and transfer learning considered for solving FER problems? In recent years, neural networks have become the standard tool in the area of computer vision [
        <xref ref-type="bibr" rid="ref14 ref15">14, 15</xref>
        ]. A large number of diverse architectural solutions (EfficientNet, ResNet, Yolov5, etc.) and machine learning methods have been proposed to solve the problems of image classification, object detection, and recognition. Their performance is affected by the quality of the images [
        <xref ref-type="bibr" rid="ref16 ref17">16, 17</xref>
        ], the result of image segmentation [
        <xref ref-type="bibr" rid="ref18 ref19">18, 19</xref>
        ], and the architecture and hyperparameter settings of the neural networks [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]. Moreover, research on the application of convolutions is being carried out to improve the effectiveness of CNNs by optimizing the convolution mask parameters, the number of layers, and a number of other parameters [
        <xref ref-type="bibr" rid="ref21 ref22">21, 22</xref>
        ].
      </p>
      <p>
        For the purpose of identifying the parameters of a neural network, a wide range of machine learning algorithms is currently used. One of the most effective is transfer learning [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ]. Transfer learning (fine-tuning a neural network whose weights were pre-trained on a huge data set, for example ImageNet, to solve a specific problem) is widely used in all areas of computer vision and increases the quality of solving different kinds of problems [
        <xref ref-type="bibr" rid="ref24 ref25">24, 25</xref>
        ].
      </p>
      <p>The main advantage of this approach is that, thanks to the pre-trained weights, the model transforms the input image into a smaller set of meaningful features. Because of this, the landscape of the loss function is smoothed and the model converges faster to its minimum. Recently, in the field of face recognition, a state-of-the-art (SOTA) technique has often been used in which the model is trained to compress the image into a feature vector by which a person's face can be identified [26]. This is very similar to what transfer learning is used for, which is why we decided to compare classical transfer learning models with face recognition models in more detail. Besides, this domain was selected because it is quite a popular area and many pre-trained models are publicly available [27].</p>
      <p>For models to benefit from pre-trained weights, the task must be related to the domain on which the
models were trained.</p>
      <p>
        The research results are important not only for FER services, but also for solving a great number of related tasks, including the development of effective integrated E-learning services and AI solutions [28], as well as ICT solutions, network solutions, and security services [
        <xref ref-type="bibr" rid="ref12">12, 29, 30</xref>
        ]. In addition, if face recognition-based models show advantages over standard approaches, then face recognition learning approaches can improve the quality of transfer learning models in other areas, increase learning speed, and allow using less data for training. This would let specialists conduct more experiments and reduce spending on cloud learning services.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Methods and Materials</title>
      <p>First of all, consider the data that will be used in further experiments, as well as some other materials and methods proposed to solve the problem under consideration.</p>
    </sec>
    <sec id="sec-4">
      <title>3.1. Dataset Description</title>
      <p>In order to test our approach, we chose the well-known FER2013 data set [31]. The 2013 Facial Expression Recognition dataset (FER2013) is a Kaggle dataset, introduced by Pierre-Luc Carrier and Aaron Courville at the International Conference on Machine Learning (ICML) in 2013.</p>
      <p>This dataset was chosen because it is publicly available. It also contains photographs of people of different ages, genders, races, and nationalities, with different backgrounds and accessories (such as glasses and masks). This allows a better evaluation of the generalization ability in emotion recognition.</p>
      <p>This dataset contains grayscale images of faces, 48x48 pixels in size. These images were created using automatic face registration, so the faces are centered and occupy nearly the same amount of space in each image. When making our comparison, we therefore assume that the images have already been preprocessed, and we will not consider this issue within the framework of our paper. Each image is labeled with one of seven emotions: Angry, Disgust, Fear, Happy, Sad, Surprise, Neutral.</p>
      <p>The Disgust expression has the minimal number of images – 547, while the other labels have nearly 5,000 samples each. More detailed information is presented in Table 1.</p>
      <p>The following performance indicators and metrics are used in the comparative analysis:
● loss on the training set;
● accuracy on the validation set;
● mean convergence rate (MCR):
MCR = (1/n) ∑_{i=1}^{n} (Metric_train_i − Metric_train_{i−1}), (1)
where n is the number of epochs and Metric_train_i is the performance metric on the training data set during the i-th epoch;
● mean overfitting rate (MOFR):
MOFR = (1/n) ∑_{i=1}^{n} ((Metric_train_i − Metric_train_{i−1}) − (Metric_val_i − Metric_val_{i−1})), (2)
where Metric_val_i is the performance metric on the validation data set during the i-th epoch;
● initial accuracy – accuracy after training for 1 epoch. We chose this metric because it shows how well the pre-trained weights of the model fit the domain;
● initial loss – loss after training for 1 epoch.</p>
      <p>In our experiments, Metric will be accuracy and loss (categorical cross entropy).</p>
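      <p>As an illustration (not part of the original experiment code), metrics (1) and (2) can be computed directly from per-epoch metric curves; the accuracy values below are hypothetical stand-ins for a framework's training log:</p>

```python
def mean_convergence_rate(train_metric):
    """Mean per-epoch change of a metric on the training set, Eq. (1).

    train_metric[i] is the metric value after epoch i (0-indexed)."""
    n = len(train_metric)
    return sum(train_metric[i] - train_metric[i - 1] for i in range(1, n)) / n

def mean_overfitting_rate(train_metric, val_metric):
    """Mean gap between train and validation per-epoch changes, Eq. (2)."""
    n = len(train_metric)
    return sum(
        (train_metric[i] - train_metric[i - 1])
        - (val_metric[i] - val_metric[i - 1])
        for i in range(1, n)
    ) / n

# Hypothetical accuracy curves over 5 epochs
train_acc = [0.40, 0.55, 0.65, 0.72, 0.78]
val_acc = [0.38, 0.50, 0.57, 0.60, 0.61]
mcr = mean_convergence_rate(train_acc)
mofr = mean_overfitting_rate(train_acc, val_acc)
```

      <p>A positive MOFR indicates that the training metric is improving faster than the validation metric, i.e. the model is drifting toward overfitting.</p>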
      <p>In general, this data set provides a wide variety of face images, which favorably affects the generalization ability of the model. However, it also has a class imbalance, which is why the accuracy of recognizing the emotion of disgust will probably be lower in comparison with the others.</p>
      <p>To split the data set, the standard train_test_split function from the sklearn package was used: a training set of 70% (25,121 images), a validation set of 10% (3,589 images), and a test set of 20% (7,177 images). The partition was stratified by the emotion in the image, with random_state = 42.</p>
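      <p>The 70/10/20 stratified partition described above can be reproduced with two calls to train_test_split; a minimal sketch, with synthetic stand-in data instead of the real FER2013 arrays:</p>

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Illustrative stand-in for the FER2013 images and labels
rng = np.random.default_rng(42)
X = rng.random((1000, 48 * 48))
y = rng.integers(0, 7, size=1000)  # seven emotion classes

# First carve off the 20% test set, stratified by emotion label
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=42
)
# Then split the remaining 80% into 70%/10% of the original data
# (0.125 of the remainder = 0.125 * 0.8 = 0.1 of the whole)
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.125, stratify=y_rest, random_state=42
)
```

      <p>Stratifying both calls keeps the per-emotion proportions (including the small Disgust class) roughly equal across the three subsets.</p>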
    </sec>
    <sec id="sec-5">
      <title>4. Experiment</title>
      <p>This section presents the plan of the experiment.</p>
      <p>In order to evaluate the effectiveness of transfer learning, we will compare several popular architectures, such as VGG-Face (Figure 2) and OpenFace (Figure 3), which are neural networks trained for face recognition. Our hypothesis is that, since the task of face recognition is in some ways similar to FER, the weights of these networks will already contain features that increase learning performance. We also chose ResNet-50 and MobileNet (Figure 4), pretrained on the ImageNet dataset, because they are the standard choice as a backbone in transfer learning. In these networks, the last layer was excluded, and all layers except the last 4 were frozen.</p>
      <p>The model structures of VGG-Face and OpenFace were loaded using the deepface library [35]; the pretrained weights are available at [36-38]. ResNet-50 and MobileNet were loaded using the Keras framework [39]. Each model will be trained with a fixed set of hyperparameters: the learning rate is 10^-4 and the number of epochs is 20. Key metrics will be measured every 5 epochs. As a loss function, we chose categorical cross entropy.</p>
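      <p>A rough sketch of this setup for the Keras-loaded backbones (our own illustrative code, not the authors' released implementation; the input size, channel replication, and single-layer classification head are our assumptions):</p>

```python
import numpy as np
from tensorflow import keras

def build_fer_model(weights="imagenet"):
    """MobileNet backbone with all but the last 4 layers frozen, as in the experiment.

    FER2013 images are 48x48 grayscale, so in practice they would be resized
    and replicated to 3 channels before being fed to an ImageNet backbone."""
    backbone = keras.applications.MobileNet(
        weights=weights, include_top=False, input_shape=(128, 128, 3), pooling="avg"
    )
    for layer in backbone.layers[:-4]:
        layer.trainable = False
    model = keras.Sequential([
        backbone,
        keras.layers.Dense(7, activation="softmax"),  # seven FER2013 emotions
    ])
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=1e-4),  # fixed LR of 10^-4
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model

# weights=None here only to avoid downloading ImageNet weights in a quick check
model = build_fer_model(weights=None)
probs = model.predict(np.zeros((2, 128, 128, 3), dtype="float32"), verbose=0)
```

      <p>Training would then proceed with model.fit for 20 epochs; the same pattern applies to ResNet-50 by swapping the backbone constructor.</p>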
      <p>To compare the efficiency of transfer learning, we will train neural networks in 2 versions: with
pretrained weights and with randomly initialized weights. This approach will allow us to determine how
and at what stages the pre-trained weights affect the efficiency of the model.</p>
      <p>After the experiment we will find out in which model the pre-trained weights give the greatest value
compared to random initialization, determine which model converges faster than others, is more
resistant to overfitting and shows the highest accuracy.</p>
      <p>Training will be carried out in the Google Colaboratory environment.</p>
    </sec>
    <sec id="sec-6">
      <title>5. Results</title>
      <p>The results of the experiments are presented in Figures 5–8 and in Tables 2–5. High resolution versions of all images are available at [40].</p>
    </sec>
    <sec id="sec-7">
      <title>5.1. ML Results</title>
      <p>Figure 5: MobileNet training process: a) accuracy change over epochs for MobileNet; b) loss change over epochs for MobileNet.</p>
      <p>Figure 15: Classification result for the emotion “neutral”: a) an image example [26]; b) predicted emotion probabilities.</p>
    </sec>
    <sec id="sec-8">
      <title>6. Discussions</title>
      <p>As a result of the experiment, it was revealed that pre-trained models performed better than randomly initialized ones on the FER task. The pre-trained models also had a higher average convergence rate during the first epochs (1-10), but then the values became the same, and in some cases, at epochs 15-20, the randomly initialized model converged faster. This is mainly because the pretrained model had by that point reached an accuracy of more than 0.8, so its quality gains slowed down. On the other hand, pre-trained models are more prone to overfitting; therefore, when using them, it is desirable to apply various regularization methods or data augmentation.</p>
      <p>The best model in terms of initial and final accuracy on the validation set is VGGFace_pretrained. Its weights are therefore initially best suited for the FER task. But in our experiment, this model had the worst performance in terms of convergence rate. For its training, other hyperparameters should therefore be used, for example, increasing the learning rate or adding more dense classification layers.</p>
      <p>The second face recognition model, OpenFace, shows a level of accuracy comparable to the standard transfer learning solution, ResNet-50, while having far fewer parameters, so it fits and predicts faster: OpenFace has 3,743,280 parameters and ResNet-50 has 23,587,712. MobileNet has the fewest parameters (3,228,864), but its performance is lower than OpenFace's. OpenFace also has the highest convergence rate and overfitting rate in comparison with the other models.</p>
      <p>Thus, the face recognition-based models proved to be at a fairly high level, in some cases even surpassing standard models like ResNet-50 and MobileNet.</p>
      <p>As can be seen from Figures 8-14, emotions such as happiness, anger, fear, and surprise are recognized best, and disgust is recognized worst of all. This is because this class is the least represented in the dataset. In addition, some pictures are rather controversially labeled (for example, pictures 12-13); on these examples, the neural networks show low confidence in the image class.</p>
      <p>Based on the results of the experiment, a final learning algorithm was developed, which we can suggest for use in FER systems:</p>
      <p>Preprocessing:
1) apply a face detection model to the image. You can use one of the pre-trained models or train your own;
2) apply various augmentations to the images. This will balance the classes (if the original dataset is unbalanced) and also increase the stability of the model on new data.</p>
      <p>Training:
1) select a backbone model. If speed is more important within the task and there is enough data for training, we recommend choosing OpenFace. If recognition quality is more important and there are not enough resources for full model training, choose VGGFace;
2) freeze all layers of the neural network and add fully connected layers on top of them;
3) select hyperparameters and start the learning process with them.</p>
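      <p>The class-balancing part of preprocessing step 2 can be as simple as oversampling minority classes with a label-preserving transform; a minimal numpy sketch, assuming horizontal flips only (the class counts below are illustrative, not FER2013's):</p>

```python
import numpy as np

def balance_with_flips(images, labels, rng=None):
    """Oversample minority classes by adding horizontally flipped copies."""
    rng = rng or np.random.default_rng(0)
    classes, counts = np.unique(labels, return_counts=True)
    target = counts.max()
    extra_imgs, extra_lbls = [], []
    for cls, count in zip(classes, counts):
        need = target - count
        if need == 0:
            continue
        idx = rng.choice(np.flatnonzero(labels == cls), size=need, replace=True)
        extra_imgs.append(images[idx, :, ::-1])  # flip along the width axis
        extra_lbls.append(np.full(need, cls))
    if extra_imgs:
        images = np.concatenate([images] + extra_imgs)
        labels = np.concatenate([labels] + extra_lbls)
    return images, labels

# Illustrative: a disgust-like minority class (label 1) with far fewer samples
imgs = np.random.rand(60, 48, 48)
lbls = np.array([0] * 50 + [1] * 10)
bal_imgs, bal_lbls = balance_with_flips(imgs, lbls)
```

      <p>In practice one would combine several transforms (small rotations, shifts, brightness changes) rather than flips alone, so that the oversampled copies are not near-duplicates.</p>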
    </sec>
    <sec id="sec-9">
      <title>7. Conclusions</title>
      <p>As a result of the research the aim and goals of the work were reached. We formulated an effective
algorithm for neural network identification and usage within the framework of the FER task; determined
which architecture of neural networks was better to use as a backbone for FER tasks in different
situations; compared the effectiveness of face recognition based backbones with standard solutions for
transfer learning.</p>
      <p>We selected one of the most popular datasets for the FER task – FER2013. While analyzing its structure, we found that it is quite unbalanced. On the one hand, this is a drawback, because models will learn to distinguish the minority class less well. On the other hand, it shows how the models will perform on real-world datasets, which are often unbalanced.</p>
      <p>We then defined key metrics for analyzing network performance during learning. The proposed metrics showed the efficiency of transfer learning for each architecture and determined which pre-trained weights are most suitable for the FER task, leading to faster convergence and less overfitting.</p>
      <p>As part of this work, we organized an experiment and conducted a comparative analysis of the
quality of the most popular neural network architectures for transfer learning (ResNet-50, MobileNet)
with networks for face recognition (OpenFace, VGG-Face) within the FER task using various metrics.
The obtained results show only general performance of the networks because they were all trained under
the same conditions, and the best set of hyperparameters was not selected.</p>
      <p>Based on the analysis of the experimental results, we recommend using the algorithm proposed in this article with a pretrained VGGFace. Under conditions of limited resources and with the use of regularization methods, we recommend OpenFace as an alternative. We also recommend tuning the classifier for each specific task separately, because this will give a gain in quality.</p>
      <p>A deeper analysis of the effectiveness of these neural networks would require a broader study, which is beyond the scope of this work: testing a larger class of architectures on a larger number of data sets and using various types of classifiers on the embeddings (including those not based on neural networks).</p>
    </sec>
    <sec id="sec-10">
      <title>8. References</title>
      <p>Approaches," in IEEE Transactions on Medical Imaging, vol. 38, no. 8, pp. 1777-1787, Aug. 2019,
doi: 10.1109/TMI.2019.2894349.
[26] Deep Face Recognition: A Survey. URL: https://arxiv.org/pdf/1804.06655.pdf?source=post_page.
[27] Deepface. URL: https://github.com/serengil/deepface.
[28] Y. Lu, Q. Mao and J. Liu, "A Deep Transfer Learning Model for Packaged Integrated Circuit
Failure Detection by Terahertz Imaging," in IEEE Access, vol. 9, pp. 138608-138617, 2021, doi:
10.1109/ACCESS.2021.3118687.
[29] O. Lemeshko, O. Yeremenko and A. M. Hailan, "Two-level method of fast ReRouting in
softwaredefined networks," 2017 4th International Scientific-Practical Conference Problems of
Infocommunications. Science and Technology (PIC S&amp;T), 2017, pp. 376-379, doi:
10.1109/INFOCOMMST.2017.8246420.
[30] Shubin, I., Kyrychenko, I., Goncharov, P., Snisar, S., "Formal representation of knowledge for
infocommunication computerized training systems," 2017 IEEE 4th International
ScientificPractical Conference Problems of Infocommunications, Science and Technology (PIC S&amp;T),
2017, pp. 287–291, doi: 10.1109/INFOCOMMST.2017.8246399.
[31] Learn facial expressions from an image. URL: https://www.kaggle.com/msambare/fer2013.
[32] VGG-Face network architecture. URL:
https://www.researchgate.net/figure/VGG-Face-networkarchitecture_fig2_319284653.
[33] OpenFace architecture. URL: https://www.cs.cmu.edu/~satya/docdir/CMU-CS-16-118.pdf.
[34] MobileNet architecture. URL: https://arxiv.org/pdf/1704.04861.pdf.
[35] OpenFace: A general-purpose face recognition library with mobile applications. URL: http://reports-archive.adm.cs.cmu.edu/anon/2016/CMU-CS-16-118.pdf.
[36] VGG-Face weights. URL: https://drive.google.com/file/d/1CPSeum3HpopfomUEK1gybeuIVoeJT_Eo/view.
[37] OpenFace weights. URL: https://drive.google.com/file/d/1LSe1YCV1x-BfNnfb7DFZTNpv_Q9jITxn/view.
[38] ResNet and ResNetv2. URL: https://keras.io/api/applications/resnet/#resnet50-function.
[39] Keras. URL: https://keras.io/api/applications/mobilenet.
[40] All images. URL: https://docs.google.com/document/d/1Z_S_FpRkv4Xf2cRAqHxo23BUv7aYqt
MZ59aJrpvYf-M/edit?usp=sharing.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J.</given-names>
            <surname>Guo</surname>
          </string-name>
          et al.,
          <article-title>"Dominant and Complementary Emotion Recognition from Still Images of Faces,"</article-title>
          <source>in IEEE Access</source>
          , vol.
          <volume>6</volume>
          , pp.
          <fpage>26391</fpage>
          -
          <lpage>26403</lpage>
          ,
          <year>2018</year>
          , doi: 10.1109/ACCESS.
          <year>2018</year>
          .
          <volume>2831927</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhang</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <article-title>"Weakly Supervised Emotion Intensity Prediction for Recognition of Emotions in Images,"</article-title>
          <source>in IEEE Transactions on Multimedia</source>
          , vol.
          <volume>23</volume>
          , pp.
          <fpage>2033</fpage>
          -
          <lpage>2044</lpage>
          ,
          <year>2021</year>
          , doi: 10.1109/TMM.
          <year>2020</year>
          .
          <volume>3007352</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Qiu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y. -Y.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <surname>C.</surname>
          </string-name>
          -L. Liu and
          <string-name>
            <given-names>H.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <article-title>"Multisource Transfer Learning for Cross-Subject EEG Emotion Recognition,"</article-title>
          <source>in IEEE Transactions on Cybernetics</source>
          , vol.
          <volume>50</volume>
          , no.
          <issue>7</issue>
          , pp.
          <fpage>3281</fpage>
          -
          <lpage>3293</lpage>
          ,
          <year>July 2020</year>
          , doi: 10.1109/TCYB.
          <year>2019</year>
          .
          <volume>2904052</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>K.</given-names>
            <surname>Smelyakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Datsenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Skrypka</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Akhundov</surname>
          </string-name>
          ,
          <article-title>"The Efficiency of Images Reduction Algorithms with Small-Sized and Linear Details,"</article-title>
          <source>2019 IEEE International Scientific-Practical Conference Problems of Infocommunications, Science and Technology (PIC S&amp;T)</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>745</fpage>
          -
          <lpage>750</lpage>
          , doi: 10.1109/PICST47496.
          <year>2019</year>
          .
          <volume>9061250</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>L.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Mu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Li</surname>
          </string-name>
          and
          <string-name>
            <given-names>H.</given-names>
            <surname>Peng</surname>
          </string-name>
          ,
          <article-title>"A Review of Face Recognition Technology,"</article-title>
          <source>in IEEE Access</source>
          , vol.
          <volume>8</volume>
          , pp.
          <fpage>139110</fpage>
          -
          <lpage>139120</lpage>
          ,
          <year>2020</year>
          , doi: 10.1109/ACCESS.
          <year>2020</year>
          .
          <volume>3011028</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Yan</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Feng</surname>
          </string-name>
          ,
          <article-title>"Towards Age-Invariant Face Recognition,"</article-title>
          <source>in IEEE Transactions on Pattern Analysis and Machine Intelligence</source>
          , vol.
          <volume>44</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>474</fpage>
          -
          <issue>487</issue>
          , 1 Jan.
          <year>2022</year>
          , doi: 10.1109/TPAMI.
          <year>2020</year>
          .
          <volume>3011426</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>N. -C.</given-names>
            <surname>Ristea</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. C.</given-names>
            <surname>Duţu</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Radoi</surname>
          </string-name>
          ,
          <article-title>"Emotion Recognition System from Speech</article-title>
          and
          <source>Visual Information based on Convolutional Neural Networks," 2019 International Conference on Speech Technology and Human-Computer Dialogue (SpeD)</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          , doi: 10.1109/SPED.
          <year>2019</year>
          .
          <volume>8906538</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>P.</given-names>
            <surname>Partila</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tovarek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Voznak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Rozhon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Sevcik</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Baran</surname>
          </string-name>
          ,
          <article-title>"Multi-Classifier Speech Emotion Recognition System," 2018 26th Telecommunications Forum (TELFOR</article-title>
          ),
          <year>2018</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>4</lpage>
          , doi: 10.1109/TELFOR.
          <year>2018</year>
          .
          <volume>8612050</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Tokariev</surname>
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tkachov</surname>
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ilina</surname>
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Partyka</surname>
            <given-names>S.</given-names>
          </string-name>
<article-title>Implementation of combined method in constructing a trajectory for structure reconfiguration of a computer system with reconstructible structure and programmable logic</article-title>
          ,
          <source>Selected Papers of the XIX International Scientific and Practical Conference "Information Technologies and Security" (ITS 2019), CEUR Workshop Proceedings</source>
          , 28 Nov.
          <year>2019</year>
          , pp.
          <fpage>71</fpage>
          -
          <lpage>81</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
[10]
          <article-title>Facial Expression Recognition</article-title>
          . URL: https://paperswithcode.com/task/facial-expression-recognition.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>K.</given-names>
            <surname>Smelyakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Shupyliuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Martovytskyi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Tovchyrechko</surname>
          </string-name>
          and
          <string-name>
            <given-names>O.</given-names>
            <surname>Ponomarenko</surname>
          </string-name>
          ,
          <article-title>"Efficiency of image convolution,"</article-title>
          <source>2019 IEEE 8th International Conference on Advanced Optoelectronics and Lasers (CAOL)</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>578</fpage>
          -
          <lpage>583</lpage>
, doi: 10.1109/CAOL46282.2019.9019450.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>D. C.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          et al.,
          <article-title>"Enabling AI in Future Wireless Networks: A Data Life Cycle Perspective,"</article-title>
          <source>in IEEE Communications Surveys &amp; Tutorials</source>
          , vol.
          <volume>23</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>553</fpage>
          -
          <lpage>595</lpage>
          ,
          <year>Firstquarter 2021</year>
, doi: 10.1109/COMST.2020.3024783.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>S.</given-names>
            <surname>Chaterji</surname>
          </string-name>
          et al.,
          <article-title>"Lattice: A Vision for Machine Learning, Data Engineering, and Policy Considerations for Digital Agriculture at Scale,"</article-title>
          <source>in IEEE Open Journal of the Computer Society</source>
          , vol.
          <volume>2</volume>
          , pp.
          <fpage>227</fpage>
          -
          <lpage>240</lpage>
          ,
          <year>2021</year>
, doi: 10.1109/OJCS.2021.3085846.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>G.</given-names>
            <surname>Cao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Meng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Gao</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Meng</surname>
          </string-name>
          ,
<article-title>"Emotion Recognition Based On CNN,"</article-title>
          <source>2019 Chinese Control Conference (CCC)</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>8627</fpage>
          -
          <lpage>8630</lpage>
          , doi: 10.23919/ChiCC.2019.8866540.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Tian</surname>
          </string-name>
          ,
          <article-title>"Artificial Intelligence Image Recognition Method Based on Convolutional Neural Network Algorithm,"</article-title>
          <source>in IEEE Access</source>
          , vol.
          <volume>8</volume>
          , pp.
          <fpage>125731</fpage>
          -
          <lpage>125744</lpage>
          ,
          <year>2020</year>
, doi: 10.1109/ACCESS.2020.3006097.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>K.</given-names>
            <surname>Smelyakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Chupryna</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hvozdiev</surname>
          </string-name>
          and
          <string-name>
            <given-names>D.</given-names>
            <surname>Sandrkin</surname>
          </string-name>
          ,
          <article-title>"Gradational Correction Models Efficiency Analysis of Low-Light Digital Image,"</article-title>
<source>2019 Open Conference of Electrical, Electronic and Information Sciences (eStream)</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
, doi: 10.1109/eStream.2019.8732174.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>A. I.</given-names>
            <surname>Wright</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. M.</given-names>
            <surname>Dunn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. G. A.</given-names>
            <surname>Hutchins</surname>
          </string-name>
          and
          <string-name>
            <given-names>D. E.</given-names>
            <surname>Treanor</surname>
          </string-name>
          ,
          <article-title>"The Effect of Quality Control on Accuracy of Digital Pathology Image Analysis,"</article-title>
          <source>in IEEE Journal of Biomedical and Health Informatics</source>
          , vol.
          <volume>25</volume>
          , no.
          <issue>2</issue>
          , pp.
          <fpage>307</fpage>
          -
          <lpage>314</lpage>
          , Feb.
          <year>2021</year>
, doi: 10.1109/JBHI.2020.3046094.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>P.</given-names>
            <surname>Yin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Yuan</surname>
          </string-name>
,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Cheng</surname>
          </string-name>
          and
          <string-name>
            <given-names>Q.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <article-title>"Deep Guidance Network for Biomedical Image Segmentation,"</article-title>
          <source>in IEEE Access</source>
          , vol.
          <volume>8</volume>
          , pp.
          <fpage>116106</fpage>
          -
          <lpage>116116</lpage>
          ,
          <year>2020</year>
, doi: 10.1109/ACCESS.2020.3002835.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>G.</given-names>
            <surname>Wang</surname>
          </string-name>
          et al.,
          <article-title>"DeepIGeoS: A Deep Interactive Geodesic Framework for Medical Image Segmentation,"</article-title>
          <source>in IEEE Transactions on Pattern Analysis and Machine Intelligence</source>
          , vol.
          <volume>41</volume>
          , no.
          <issue>7</issue>
          , pp.
<fpage>1559</fpage>
          -
          <lpage>1572</lpage>
          , 1
          <year>July 2019</year>
          , doi: 10.1109/TPAMI.2018.2840695.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>C.</given-names>
            <surname>Nunes</surname>
          </string-name>
          and
          <string-name>
            <given-names>F.</given-names>
            <surname>Pádua</surname>
          </string-name>
          ,
          <article-title>"A Convolutional Neural Network for Learning Local Feature Descriptors on Multispectral Images,"</article-title>
          <source>in IEEE Latin America Transactions</source>
          , vol.
          <volume>20</volume>
          , no.
          <issue>2</issue>
          , pp.
          <fpage>215</fpage>
          -
          <lpage>222</lpage>
          , Feb.
          <year>2022</year>
, doi: 10.1109/TLA.2022.9661460.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>N.</given-names>
            <surname>Tian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Wang</surname>
          </string-name>
          and
          <string-name>
            <given-names>D.</given-names>
            <surname>Meng</surname>
          </string-name>
          ,
<article-title>"Automatic CNN Compression Based on Hyperparameter Learning,"</article-title>
          <source>2021 International Joint Conference on Neural Networks (IJCNN)</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          , doi: 10.1109/IJCNN52387.2021.9533329.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>L.</given-names>
            <surname>Liao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Wei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wei</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>"Parameter Distribution Balanced CNNs,"</article-title>
          <source>in IEEE Transactions on Neural Networks and Learning Systems</source>
          , vol.
          <volume>31</volume>
          , no.
          <issue>11</issue>
          , pp.
          <fpage>4600</fpage>
          -
          <lpage>4609</lpage>
          , Nov.
          <year>2020</year>
, doi: 10.1109/TNNLS.2019.2956390.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>R.</given-names>
            <surname>Gonzales-Martínez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Machacuay</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rotta</surname>
          </string-name>
          and
          <string-name>
            <given-names>C.</given-names>
            <surname>Chinguel</surname>
          </string-name>
          ,
          <article-title>"Hyperparameters Tuning of Faster R-CNN Deep Learning Transfer for Persistent Object Detection in Radar Images,"</article-title>
          <source>in IEEE Latin America Transactions</source>
          , vol.
          <volume>20</volume>
          , no.
          <issue>4</issue>
          , pp.
          <fpage>677</fpage>
          -
          <lpage>685</lpage>
          ,
          <year>April 2022</year>
, doi: 10.1109/TLA.2022.9675474.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>X.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Griffith</surname>
          </string-name>
          and
          <string-name>
            <given-names>N.</given-names>
            <surname>Golmie</surname>
          </string-name>
          ,
          <article-title>"Toward Deep Transfer Learning in Industrial Internet of Things,"</article-title>
          <source>in IEEE Internet of Things Journal</source>
          , vol.
          <volume>8</volume>
          , no.
          <issue>15</issue>
          , pp.
<fpage>12163</fpage>
          -
          <lpage>12175</lpage>
          , 1 Aug.
          <year>2021</year>
          , doi: 10.1109/JIOT.2021.3062482.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>S.</given-names>
            <surname>Hussein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Kandel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. W.</given-names>
            <surname>Bolan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. B.</given-names>
            <surname>Wallace</surname>
          </string-name>
          and
          <string-name>
            <given-names>U.</given-names>
            <surname>Bagci</surname>
          </string-name>
          ,
          <article-title>"Lung and Pancreatic Tumor Characterization in the Deep Learning Era: Novel Supervised and Unsupervised Learning</article-title>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>