<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Experimental Evaluation of the Effectiveness of ANN-based Numerical Data Augmentation Methods for Diagnostics Tasks</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ivan Izonin</string-name>
          <email>ivanizonin@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Roman Tkachenko</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Roman Pidkostelnyi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Olena Pavliuk</string-name>
          <email>olena.m.pavliuk@lpnu.ua</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Viktor Khavalko</string-name>
          <email>khavalkov@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anatoliy Batyuk</string-name>
          <email>abatyuk@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Lviv Polytechnic National University</institution>
          ,
          <addr-line>S. Bandera str., 12, 79013, Lviv</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Improving the accuracy of diagnostics tasks is essential in various medical fields. When only small data are available for training, there is a high risk of overfitting or underfitting the machine learning model, which makes it impossible to apply it in practice. One way to address this problem is to use data augmentation methods. This paper focuses on neural network methods of data augmentation. The authors have investigated a variational autoencoder and a GAN-based approach for generating artificial numerical data that is then used by machine-learning-based classifiers. The proposed methods were examined on the task of diagnosing diabetes mellitus development. Experiments confirmed that the autoencoder generated a dataset similar to the initial one, with a similarity score of 0.93. The authors established a significant accuracy improvement for the Random Forest, AdaBoost, and Logistic Regression classifiers when processing the extended dataset. The new dataset obtained using the GAN did not ensure satisfactory accuracy, which may be due to an insufficient number of samples for training this class of neural networks. Further research is likely to be carried out into ensembles based on a single machine learning method that will process decorrelated samples produced by the methods investigated in this paper.</p>
      </abstract>
      <kwd-group>
        <kwd>Tabular data</kwd>
        <kwd>classification</kwd>
        <kwd>overfitting risk</kwd>
        <kwd>underfitting risk</kwd>
        <kwd>data augmentation</kwd>
        <kwd>ANN</kwd>
        <kwd>GAN</kwd>
        <kwd>autoencoder</kwd>
        <kwd>small data approach</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The development of modern medicine has been marked by digitizing a wide variety of information and
the automation of many processes [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. This makes it possible to collect large amounts of data for
analysis. It also opens up new opportunities for applying data mining techniques to intellectualize
specific diagnostic or treatment processes.
      </p>
      <p>However, scarce data may impede the implementation of machine learning. Moreover,
abnormal data may lead to decreased accuracy, which is a critical point in this area.</p>
      <p>One possible solution to this problem lies in adopting data augmentation methods. This approach
allows synthesizing enough data to train the selected artificial intelligence tool.</p>
      <p>
        Nowadays, there are quite a few simple methods for manipulating an available sample of data to
increase its size. The data are enlarged both by rows and by columns [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. However, these methods do
not always introduce helpful information into the expanded dataset and, consequently, only increase the
learning time of the selected model without affecting the accuracy of the chosen classifier or
regressor.
      </p>
      <p>
        Today, many neural network methods have been developed to enlarge a dataset. A wide variety of
artificial neural network topologies are employed here, and the augmentation is performed on various
types of information, from time series to images. The generative adversarial network [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] is among the most widely used
methods for the artificial augmentation of datasets, particularly in the field of image processing. This
type of neural network is most commonly used to synthesize new images for further use by deep learning
neural networks. Another type is the autoencoder, which is often and successfully applied in time
series analysis. However, developing and researching a methodology for the effective artificial
augmentation of numerical datasets remains an open problem. On the one hand, neural network
methods are more sophisticated and should reveal patterns in the dataset that are difficult to detect
with simple methods [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Such information can serve as a basis for the synthesis of new samples in the
dataset. On the other hand, a neural network toolkit must obtain sufficient data for training and validating
the model, and its generalization properties should be especially emphasized. Only by meeting all
these requirements will the selected tool operate adequately and synthesize the required amount of
synthetic data of the required quality. Thus, this paper aims to investigate neural network methods for
enlarging tabular datasets to improve the accuracy of classification based on them.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Materials and methods</title>
      <p>This section describes the two neural-network-based approaches for numerical data
augmentation used in this paper. The main objective is to improve classification accuracy in Clinical
Medicine based on expanded datasets.</p>
    </sec>
    <sec id="sec-3">
      <title>2.1.1. ANN-based numerical data augmentation methods</title>
      <p>
        The first approach selected is a new method for generating an artificial dataset based on a generative
adversarial network (GAN) [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. To this end, the authors of that technique modified the neural networks to
deal specifically with numerical datasets: they proposed to use a Conditional GAN as a generator of
numerical data. The choice of this approach is explained by:
- efficient performance in the case of an unbalanced dataset;
- independence from the type of variables, discrete or continuous, with the possibility of modeling
both at the same time;
- a flexible approach to modeling the probability distributions within the dataset;
- the possibility of synthesizing high-quality synthetic samples that are very similar to the
observations from the initial dataset.
      </p>
      <p>A peculiarity of this method is that the authors use a special normalization method, a set of
state-of-the-art model learning methods, and a post-annotated network. In other respects, the method works
like a conventional GAN.</p>
      <p>
        Another interesting method is data augmentation based on a variational autoencoder [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. It belongs
to the family of generative models. Learning methods of this family consist of mapping objects into a given
latent space and reproducing them back. The task of the autoencoder is to find functions that
map the latent variable area to another, understandable and simple, space; a normally
distributed space is a case in point.
      </p>
      <p>While designing methods based on a variational autoencoder, one should define the number of
neurons in the first and second latent layers and set the number of latent factors. The latent space will
contain all applicable information and serve the decoder in recovering all initial inputs. After all the
necessary settings have been made, the learning procedure can be performed.</p>
      <p>
        If the hidden dependencies between variables are linear, the variational autoencoder works as a PCA
method [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. In this case, some random noise is added to each element, according to the method of [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], to get the best autoencoder performance. As investigated by the author of the method, this
approach makes it possible to obtain an artificial set closer to the real dataset than the method
without noise. That is why this method was taken for comparison. Details of its implementation are
given in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>The synthesis of new data proceeds as follows. The variance and the mean of the latent
variables are known beforehand, as they are determined by the autoencoder. The next step is to draw a
value for each latent variable from a normal distribution with that mean and variance. These sampled
values serve as the starting points from which all attributes of the initial dataset can be
reproduced.</p>
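      <p>To make the sampling step concrete, the sketch below draws latent values from normal distributions with the learned means and variances and maps them through a decoder. All of the numbers and the linear decoder are hypothetical stand-ins for the trained network, so the output is illustrative only.</p>

```python
import numpy as np

rng = np.random.default_rng(42)

# Mean and std of each latent factor, as would be determined by the
# trained autoencoder (hypothetical values for illustration).
latent_mean = np.array([0.1, -0.3, 0.05])
latent_std = np.array([0.9, 1.1, 0.7])

# Hypothetical linear decoder mapping 3 latent factors to 8 features;
# it stands in for the trained decoder network.
decoder_weights = rng.normal(size=(3, 8))
decoder_bias = rng.normal(size=8)

def synthesize(n_samples):
    # Draw each latent variable from a normal distribution with the
    # mean and variance determined during training ...
    z = rng.normal(loc=latent_mean, scale=latent_std, size=(n_samples, 3))
    # ... and reproduce all attributes of the initial dataset from it.
    return z @ decoder_weights + decoder_bias

fake = synthesize(768)
print(fake.shape)  # (768, 8)
```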
    </sec>
    <sec id="sec-4">
      <title>2.1.2. Dataset description</title>
      <p>
        Diagnostics tasks are widespread in the medical industry. The majority of them reduce to a
classification task, to which machine learning techniques can be applied. For example, in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], a dataset is
presented, and the task of predicting the development of diabetes is formulated. The output
variable is represented as 0 or 1; thus, it is a binary classification problem. Fig. 1 shows the
distributions for several variable pairs.
      </p>
    </sec>
    <sec id="sec-5">
      <title>3. Modeling and results</title>
      <p>The classifiers under investigation were simulated on datasets extended using the neural network
methods. Experimental studies were carried out by randomly dividing the dataset into two parts at a ratio
of 80% to 20%; cross-validation (5 folds) was then applied. In this way, the reliability of the results was
ensured.</p>
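      <p>A minimal sketch of this evaluation protocol with scikit-learn, using randomly generated stand-in data; the split ratio and fold count follow the text, while the dataset and classifier below are placeholders:</p>

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split, cross_val_score

rng = np.random.default_rng(0)
# Stand-in for the diabetes dataset: 768 samples, 8 numeric features.
X = rng.normal(size=(768, 8))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=768) > 0).astype(int)

# Random 80% / 20% split, as in the experiments.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1)

# 5-fold cross-validation on the training part for reliability of results.
scores = cross_val_score(LogisticRegression(max_iter=1000), X_train, y_train, cv=5)
print(scores.mean())
```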
      <p>The paper presents two neural network approaches for the artificial expansion of short datasets. Let
us consider the outcomes and evaluations of the synthesized data for each of them.</p>
    </sec>
    <sec id="sec-6">
      <title>3.1.1. Classification using augmented data via autoencoder</title>
      <p>
        Autoencoder-based modeling was carried out in order to synthesize a new dataset whose size
matches that of the original one. A comparison of the synthesized dataset with the original one based on
several indicators from [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] revealed the following results: the mean correlation
between fake and real columns is 0.97, the MAPE estimator result is 0.84, and the similarity score is 0.93.
      </p>
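      <p>As a simplified stand-in for one of those indicators, the mean per-column Pearson correlation between the real and synthetic tables can be computed as below; the table-evaluator package cited above computes this and the other scores in more detail, and the data here are random placeholders.</p>

```python
import numpy as np

def mean_column_correlation(real, fake):
    # Mean Pearson correlation between matching real/fake columns.
    cors = [np.corrcoef(real[:, j], fake[:, j])[0, 1]
            for j in range(real.shape[1])]
    return float(np.mean(cors))

rng = np.random.default_rng(7)
real = rng.normal(size=(768, 8))
fake = real + rng.normal(scale=0.3, size=real.shape)  # similar synthetic data
print(round(mean_column_correlation(real, fake), 2))
```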
      <p>In addition, Figure 2 presents a comparison of the feature distributions for the initial and synthesized
datasets.</p>
      <p>It should be noted that the autoencoder generated the same number of instances of each class as the
number of instances in the initial dataset.</p>
      <p>The application results of different classifiers on the extended dataset are summarised in Table 1.</p>
      <p>Figure 2: Feature distributions for the initial and synthetic datasets</p>
      <p>Figure 3: Total accuracy of the classifiers (autoencoder-extended datasets)</p>
      <table-wrap id="tbl1">
        <label>Table 1</label>
        <caption>
          <p>Total accuracy of the classifiers versus the number of synthetic vectors (autoencoder) added to the initial dataset</p>
        </caption>
        <table>
          <thead>
            <tr><th>Classifier</th><th>0</th><th>75</th><th>150</th><th>225</th><th>300</th><th>500</th><th>768</th></tr>
          </thead>
          <tbody>
            <tr><td>AdaBoost</td><td>0.7631</td><td>0.7652</td><td>0.7746</td><td>0.7745</td><td>0.7904</td><td>0.8203</td><td>0.8301</td></tr>
            <tr><td>Logistic Regression</td><td>0.7696</td><td>0.7794</td><td>0.7844</td><td>0.7836</td><td>0.796</td><td>0.8029</td><td>0.8295</td></tr>
            <tr><td>Random Forest</td><td>0.7656</td><td>0.7628</td><td>0.7637</td><td>0.7775</td><td>0.797</td><td>0.8116</td><td>0.8249</td></tr>
            <tr><td>SVC</td><td>0.7591</td><td>0.7664</td><td>0.7593</td><td>0.7563</td><td>0.7782</td><td>0.7856</td><td>0.7985</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>As can be seen from Figure 3, increasing the number of artificially generated vectors added to
the initial dataset led to an increase in the accuracy of all classifiers. The highest accuracy was
obtained by adding to the initial set an artificial sample of the same dimension (768 additional
vectors).</p>
    </sec>
    <sec id="sec-7">
      <title>3.1.2. Classification using augmented data via GAN</title>
      <p>
        The authors adopted the method from [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] for numerical data augmentation. As in the previous case,
the size of the new dataset is equal to that of the initial one. A comparison between the simulated
dataset and the initial one based on several indicators from [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] has shown the following results: the mean
correlation between fake and real columns is 0.92, the MAPE estimator result is 0.67, and the similarity
score is 0.59.
      </p>
      <p>Figure 4 shows a comparison of the feature distributions for the initial and synthesized datasets.</p>
      <p>It is worth remarking that the GAN-based method attempted to balance the dataset. It generated
significantly more instances of the smaller class compared to the initial dataset.</p>
      <p>The application results of different classifiers on the extended dataset are summarised in Table 2.</p>
      <p>Figure 4: Feature distributions for the initial and synthetic datasets</p>
      <p>Figure 5: Total accuracy of the classifiers (GAN-extended datasets)</p>
      <table-wrap id="tbl2">
        <label>Table 2</label>
        <caption>
          <p>Total accuracy of the classifiers versus the number of synthetic vectors (GAN) added to the initial dataset</p>
        </caption>
        <table>
          <thead>
            <tr><th>Classifier</th><th>0</th><th>75</th><th>150</th><th>225</th><th>300</th><th>500</th><th>768</th></tr>
          </thead>
          <tbody>
            <tr><td>AdaBoost</td><td>0.7631</td><td>0.7509</td><td>0.7364</td><td>0.7543</td><td>0.7876</td><td>0.7905</td><td>0.8015</td></tr>
            <tr><td>Logistic Regression</td><td>0.7696</td><td>0.7604</td><td>0.7462</td><td>0.7321</td><td>0.7148</td><td>0.7272</td><td>0.7352</td></tr>
            <tr><td>Random Forest</td><td>0.7656</td><td>0.7426</td><td>0.732</td><td>0.7452</td><td>0.7456</td><td>0.7516</td><td>0.7946</td></tr>
            <tr><td>SVC</td><td>0.7591</td><td>0.7509</td><td>0.7255</td><td>0.7261</td><td>0.7064</td><td>0.7027</td><td>0.7326</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>As can be seen from Figure 5, increasing the number of artificially generated vectors added to
the initial dataset did not always lead to an increase in the accuracy of the classifiers. Only extending
the initial set by more than 60% with new, artificially synthesized data vectors helped to reduce the
errors of the classifiers. The highest accuracy, as in the previous case, was obtained by adding to the
initial dataset an artificial sample of the same dimension (768 additional vectors).</p>
    </sec>
    <sec id="sec-8">
      <title>3.2. Comparison and discussion</title>
      <p>This section compares the new datasets generated by the two methods under investigation and the
classification results based on their application.</p>
    </sec>
    <sec id="sec-9">
      <title>3.2.1. Numerical evaluation of the synthetic datasets</title>
      <p>
        In this paper, the results obtained by each method investigated were compared on the
basis of several indicators from [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. The results of the comparison between the real dataset and the ones
synthesized by the GAN and the autoencoder are summarized in Table 3.
      </p>
      <p>As shown in Table 3, the data augmentation method based on the autoencoder provides a dataset
significantly more similar to the original one than that obtained by the GAN. This can significantly
affect the performance of classifiers trained on these data.</p>
      <p>However, such dataset decorrelation enables the construction of ensemble models in which a single
type of classifier processes the different datasets. This approach can significantly improve the accuracy
of classification methods in medicine.</p>
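      <p>As a sketch of that idea, the same classifier can be trained on several decorrelated variants of the data and combined by majority vote. The noisy copies below are hypothetical stand-ins for the initial, autoencoder-extended, and GAN-extended datasets.</p>

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(5)
X = rng.normal(size=(768, 8))
y = (X[:, 0] - X[:, 2] > 0).astype(int)

# Decorrelated variants of the training data, standing in for the
# initial, autoencoder-extended, and GAN-extended datasets.
datasets = [(X + rng.normal(scale=s, size=X.shape), y) for s in (0.0, 0.2, 0.4)]

models = [RandomForestClassifier(random_state=i).fit(Xi, yi)
          for i, (Xi, yi) in enumerate(datasets)]

def vote(X_new):
    # Majority vote of the same classifier type trained on different datasets.
    preds = np.stack([m.predict(X_new) for m in models])
    return (preds.mean(axis=0) >= 0.5).astype(int)

print(vote(X[:5]))
```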
    </sec>
    <sec id="sec-10">
      <title>3.2.2. Comparison of the classification accuracy of the different classifiers</title>
      <p>The performance of both neural network approaches was compared by determining the accuracy of a
few well-known classifiers: Random Forest, AdaBoost, Logistic Regression, and SVM. They were employed
for classification based on the initial and new datasets. It should be noted that the dimensionality of
the new datasets was doubled, that is, the synthesized data were added to the original ones. The outcomes,
based on Total accuracy, Precision, and Recall, are shown in Fig. 6. Since the problem is not balanced,
the F-measure was not taken into account.</p>
      <p>Figure 6: Total accuracy (a) and Precision and Recall (b) of the Random forest, AdaBoost, Logit,
and SVM classifiers for the initial, autoencoder-extended, and GAN-extended datasets</p>
      <p>From the graphs in Fig. 6, it follows that the highest accuracy on all performance indicators
is achieved by using the dataset synthesized with the autoencoder. The application of the GAN for data
augmentation yields a much lower performance of the known classifiers compared to processing the
initial dataset (based on total accuracy); however, the AdaBoost and Random Forest algorithms provide
more accurate results even in this case. This can be explained by the insufficient amount of training
data for effective GAN performance, which affected the accuracy of the SVM and Logistic Regression
classifiers.</p>
    </sec>
    <sec id="sec-11">
      <title>4. Conclusion</title>
      <p>This paper deals with the numerical data augmentation task in Clinical Medicine. The authors have
experimentally evaluated the performance of modern neural network methods, autoencoders and a GAN, to
solve the problem. Such an approach helps to reduce the risks of overfitting or underfitting when using
machine learning models in the case of small data processing.</p>
      <p>The performance of these methods was modeled on a dataset for solving a classification task:
predicting the possibility of diabetes development. The dataset is not balanced. Experiments have shown
that the autoencoder generates the most similar data according to the similarity score. In addition, the
accuracy of classifiers based on these data is significantly higher: compared with the initial dataset,
the classification accuracy improved by about 10%.</p>
      <p>Given the differing similarity of the synthesized datasets to the initial one and the differing
accuracy of the classifiers based on such data, an ensemble learning approach can be used in further
research to improve the accuracy of various diagnostics tasks. In particular, the approach of
constructing a stacking ensemble of homotypic classifiers that process different, systematically studied
datasets seems promising. This approach can provide a significant increase in the accuracy of
classifiers when solving applied diagnostics tasks in various fields [10-15] when processing
medium-sized datasets.</p>
      <p>This study is funded by the National Research Foundation of Ukraine from the state budget of
Ukraine within the project “Decision support system for modeling the spread of viral infections”
(No. 2020.01/0025).</p>
    </sec>
    <sec id="sec-12">
      <title>5. References</title>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>D.</given-names>
            <surname>Snow</surname>
          </string-name>
          ,
          <article-title>DeltaPy: A Framework for Tabular Data Augmentation in Python</article-title>
          , Social Science Research Network, Rochester, NY,
          <year>2020</year>
          . https://doi.org/10.2139/ssrn.3582219.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>O.</given-names>
            <surname>Berezsky</surname>
          </string-name>
          , G. Melnyk,
          <string-name>
            <given-names>T.</given-names>
            <surname>Datsko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Verbovy</surname>
          </string-name>
          ,
          <article-title>An intelligent system for cytological and histological image analysis</article-title>
          ,
          <source>in: The Experience of Designing and Application of CAD Systems in Microelectronics</source>
          , IEEE, Lviv - Polyana, Ukraine,
          <year>2015</year>
          : pp.
          <fpage>28</fpage>
          -
          <lpage>31</lpage>
          . https://doi.org/10.1109/CADSM.2015.7230787
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>N.</given-names>
            <surname>Boyko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kuba</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Mochurad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Montenegro</surname>
          </string-name>
          ,
          <article-title>Fractal Distribution of Medical Data in Neural Network</article-title>
          , CEUR-WS.org, vol.
          <volume>2488</volume>
          (
          <year>2019</year>
          ), pp.
          <fpage>307</fpage>
          -
          <lpage>318</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>L.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Skoularidou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Cuesta-Infante</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Veeramachaneni</surname>
          </string-name>
          ,
          <article-title>Modeling Tabular data using Conditional GAN</article-title>
          , arXiv:1907.00503 [cs, stat] (2019). http://arxiv.org/abs/1907.00503 (accessed December 26, 2020).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5] Deep Learning for tabular data augmentation, Data Science Blog von Lschmiddey (2021). https://lschmiddey.github.io/fastpages_/2021/04/10/DeepLearning_TabularDataAugmentation.html (accessed May 16, 2021).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>V.</given-names>
            <surname>Kotsovsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Batyuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Yurchenko</surname>
          </string-name>
          ,
          <article-title>New Approaches in the Learning of Complex-Valued Neural Networks</article-title>
          ,
          <source>in: 2020 IEEE Third International Conference on Data Stream Mining &amp; Processing (DSMP)</source>
          ,
          <year>2020</year>
          : pp.
          <fpage>50</fpage>
          -
          <lpage>54</lpage>
          , doi: 10.1109/DSMP47368.2020.9204332
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7] lschmiddey, lschmiddey/deep_tabular_augmentation,
          <year>2021</year>
          . https://github.com/lschmiddey/deep_tabular_augmentation (accessed May 16, 2021).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8] Pima Indians Diabetes Database, (n.d.). https://kaggle.com/uciml/pima-indians-diabetes-database (accessed May 16, 2021).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9] TableEvaluator - table evaluator 15-08-2019 documentation, (n.d.). https://baukebrenninkmeijer.github.io/table-evaluator/table_evaluator.html (accessed May 16, 2021).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>D.</given-names>
            <surname>Chumachenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Sokolov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Yakovlev</surname>
          </string-name>
          ,
          <article-title>Fuzzy recurrent mappings in multiagent simulation of population dynamics systems</article-title>
          , IJC. (
          <year>2020</year>
          )
          <fpage>290</fpage>
          -
          <lpage>297</lpage>
          . https://doi.org/10.47839/ijc.19.2.1773.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>M.</given-names>
            <surname>Zharikova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Sherstjuk</surname>
          </string-name>
          ,
          <article-title>Situation diagnosis based on the spatially-distributed dynamic disaster risk assessment</article-title>
          ,
          <source>in: 2019 IEEE 14th Intern. Conf. CSIT</source>
          ,
          <year>2019</year>
          : pp.
          <fpage>205</fpage>
          -
          <lpage>209</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Sergii</given-names>
            <surname>Babichev</surname>
          </string-name>
          , Jiří Škvor, Jiří Fišer, Volodymyr Lytvynenko,
          <source>Technology of Gene Expression Profiles Filtering Based on Wavelet Analysis</source>
          ,
          <source>International Journal of Intelligent Systems and Applications(IJISA)</source>
          , vol.
          <volume>10</volume>
          , no.
          <issue>4</issue>
          ,
          <year>2018</year>
          : pp.
          <fpage>1</fpage>
          -
          <lpage>7</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Babichev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. I.</given-names>
            <surname>Lytvynenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Taif</surname>
          </string-name>
          ,
          <article-title>Estimation of the inductive model of objects clustering stability based on the k-means algorithm for different levels of data noise, Radio Electronics</article-title>
          , Computer Science, Control, no.
          <issue>4</issue>
          ,
          <year>2016</year>
          :
          <fpage>54</fpage>
          -
          <lpage>60</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>