=Paper= {{Paper |id=Vol-3302/paper21 |storemode=property |title=ECG Analysis Software Library based on NLP and ML Methods |pdfUrl=https://ceur-ws.org/Vol-3302/paper13.pdf |volume=Vol-3302 |authors=Yurii Oliinyk,Mykhailo Yazenok,Oleksandr Ocheretianyi,Igor Baklan,Kateryna Lishchuk,Elisa Beraudo |dblpUrl=https://dblp.org/rec/conf/iddm/OliinykYOBLB22 }} ==ECG Analysis Software Library based on NLP and ML Methods== https://ceur-ws.org/Vol-3302/paper13.pdf
ECG analysis software library based on NLP and ML methods
Yurii Oliinyk, Mykhailo Yazenok, Oleksandr Ocheretianyi, Igor Baklan, Kateryna Lishchuk,
Elisa Beraudo


National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute", 37, Prosp. Peremohy, Kyiv,
 03056, Ukraine

                Abstract
               This article discusses the implementation of a software library for the analysis of the
               electrocardiogram signal. A feature of this library is to improve the functionality and
               simplify the interaction with existing machine learning software and tools for loading,
               processing and storing ECG signal datasets by using the Word2Vec model. The library
               increases development speed of a new software, which involves various ECG analysis.
               Therefore, scientists could more easily implement their ideas related to NLP and ML.

                Keywords 1
                ECG signal, machine learning, WFDB, software library, NLP,Word2Vec

1. Introduction
    Today, the application of machine learning algorithms in various fields is a key element in the study
of various nature data in order to receive impulse for the scientific progress or to automate the
conclusion generating system, which normally requires a human solution. Machine learning software
libraries that exist today provide a wide range of tools for developers, making it easier to write products.
That is why the improvement of these software tools is the key to accelerating the stage of scientific
ideas implementation.
    The process of studying the electrocardiogram (ECG) signal includes the use of software tools: for
obtaining, formatting and storing datasets, analysis by machine learning methods and intermediate
stages of processing. The most popular libraries that can meet these needs of a developer are: the Scikit
Learn library [1], which contains machine learning algorithms and models for vectorized data; the
Gensim library [2], which is an NLP tool; the WFDB library [3] providing access to a variety of datasets
of electrocardiogram signals.
    The listed software tools have a number of drawbacks in the context of ECG processing, some of
them are: too low an abstraction level, the absence of an automatic caching system, the absence of
additional tools for processing ECG signals. To improve these tools, it is necessary to perform software
development to correct the listed drawbacks, reduce the amount of code to solve simple problems and
allow a developer to focus on key research.

2. Related work
   There are quite a few software libraries for implementing ECG signal processing. Scikit Learn or
sklearn is a Python-based library that provides basic mechanisms for creating models that are then used
to predict data. Also, this library provides a large number of additional tools and algorithms for pre-
1
 IDDM-2022: 5th International Conference on Informatics & Data-Driven Medicine, November 18–20, 2022, Lyon, France
EMAIL: oliyura@gmail.com (Y. Oliinyk); mihailyazenok@gmail.com (M. Yazenok); s.ocheretyany@gmail.com (O. Ocheretianyi);
iaa@ukr.net (I. Baklan); lishchuk_kpi@ukr.net (K. Lishchuk); elisa.beraudolive@gmail.com (E. Beraudo);
ORCID: 0000-0002-7408-4927 (Y. Oliinyk); 0000-0002-0929-3626 (M. Yazenok); 0000-0001-9455-4781(O. Ocheretianyi) 0000-0002-
5274-5261            (I.          Baklan);           0000-0002-9902-0065             (K.          Lishchuk);       000-0001-7550-
3620 (E. Beraudo)

                 ©️ 2022 Copyright for this paper by its authors.
                 Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
                 CEUR Workshop Proceedings (CEUR-WS.org)
processing, post-processing, storing and transforming data [1]. The main types of data processing
algorithms in this library are model dimension reduction, model selection, regression, classification,
and cluster analysis. Also, this library provides several basic data sets for testing and verifying
algorithms.
    Gensim is an open source library written in Python. This library is the core for the implementation
of algorithms for representing text documents in the form of semantic vectors that can be used in data
analysis algorithms, the input values of which can be exclusively vectorized data [2].
    The main algorithms implemented in Gensim have a common foundation of functionality, which
consists in determining the semantic structure of the provided text documents by analyzing statistical
repetitions of similar text sentence schemes. The input data of such algorithms does not have to be texts
that a person can understand, thus, all these methods can be used to find statistical patterns of any sets
of words that are a combination of absolutely different characters.
    The principles of the Gensim library emphasize that they support a more practical point of view of
using algorithms to solve real-world problems. That is, the set of algorithms is more focused on
production-ready applications than on academic ones.
    The considered libraries have a number of drawbacks when programmers use them to solve a wide
range of problems. The proposed level of abstraction and the proposed architecture in these libraries
significantly reduces the speed of writing code, requiring the programmer to write elementary parts,
which subsequently accumulate, increasing the amount of code, thereby reducing readability. The
specified list of drawbacks has a significant impact on the final assessment of the received software,
increasing the number of man-hours spent on its development.
    We can state a fact that there are a few software libraries for implementing ECG signal processing
with using NLP and ML methods. In the article [12] authors found only 31 applications during the
literature search for ECG analysis in paper between 1 April 2015 and 15 May 2020 which used ML
methods in ECG analyses. In article [4] a linguistic approach to data analysis is proposed that include
transformation ECG signals to linguistic chain.
    In the article [11] proposed automatic ECG analysis system. The main idea of algorithm is based on
using existing Support Vector Machine classifier and optimizing some of their parameters. Proposed
hybrid optimization algorithm was developed using Particle Swarm Optimization and Migration
Modified Biogeography Based Optimization algorithm.
    Development of algorithms for ECG analysis and detection of various diseases based in them is a
rather complex process. Detailed overview of models, datasets, and their accuracy in diagnosis of heart
related diseases described authors in [9]. In article [10] authors presented a fully automatic and fast
ECG arrhythmia classifier based on a simple brain-inspired machine learning approach known as Echo
State Networks.
    All the above-mentioned methods are based on the classical machine learning methods, but in the
article [13] authors proposed a new technique to analyze ECG named ECG language processing that
processes the ECG signal in a way a text document is treated in natural language processing (NLP)
framework. This approach was extended by using Word2Vec model in our study [5]. But there are still
unexplored tasks of assessing the quality of the classification methods application for ECG signal
analysis, accuracy evaluation of the clustered representation accuracy of the ECG signal, evaluating
software code reduction and accelerating development process

2.1.    Researches tasks
   Main aim: to expand the capabilities of the electrocardiogram automatic analysis by creating a
Word2Vec model based on selected waves in the ECG. The following tasks should be solved within
the framework of this research:
       • selection of a data set for processing;
       • development a software library to facilitate the development of data analysis software using
          the Word2Vec model;
       • researching clustered representation accuracy of the ECG signal;
       • researching classification algorithms efficiency;
       • evaluating software code reduction and accelerating development process.
   .

3. THE SOFTWARE LIBRARY DEVELOPMENT
   The following describes the steps for the development of software library for analyzing the ECG
signal.

3.1.    ECG signal dataset preparation module
   One of the modules of the developed library is the module for preparing electrocardiogram signal
datasets. This module provides functionality for loading the most common datasets: MIT-BIH AFIB
Dataset and PhysioNet MIT-BIH [3]. The amount of data in these datasets ensures high accuracy in
machine learning.
   This module provides a wide range of tools for processing the ECG signals themselves, in particular,
we can distinguish: convenient dataset conversion to provide a standardized view of the entire dataset
and cut off unnecessary and invalid heartbeats in the read electrocardiogram signals, thus reducing the
amount of memory when storing the date without losing key information, which is necessary when
solving problems using machine learning and data analysis algorithms. Also, the module provides
functionality for convenient storing, reloading, rebuilding dataset data.

3.2.    Caching System for Machine Learning Algorithms Data
   Data caching is provided at all levels of advanced algorithms, so the programmer can reliably and,
most importantly, quickly form a data cache after each stage of information processing at each stage.
   The caching mechanism also makes it possible to reshape the data, so if a stage was designed
incorrectly (which is often the case when developing using machine learning algorithms), then the data
cached on it can be generated again.
   Data caching also provides the programmer with a tool to reproduce the experiment. Since some
machine learning algorithms have a certain moment of randomization of the steps of the algorithm,
some of the generated data can lead to a system failure. Such situations are very difficult to reproduce,
especially if the value of the basis for generating data, has not been passed to the algorithm. Implicit
caching makes it possible to reproduce the experiment with the original data, which allows the
programmer to make adjustments to the algorithm to ensure that it works in some edge cases that may
possibly occur in the future.
    It can also be noted that each extension of the main machine learning models implements a
common interface, thus facilitating the work of the programmer. The implementation scheme
is shown in Figure 1.




   Figure 1: UML class diagram of the Cacheable interface implementation using the main models
3.3.    Program Library Extension Classes
    The extension class for generating datasets of electrocardiogram signals performs the following
functions:
       • Loading and caching of unformatted datasets.
       • Standardization of dataset data to increase accuracy of machine learning.
       • Splitting the complete electrocardiogram into individual heartbeats.
       • Reformatting data with cutting off invalid heartbeats.
       • Converting the arrhythmia class of each heartbeat to the corresponding numeric value.
       • Preparing training and validation sample according to the assessment of the degree of
           arrhythmia of each electrocardiogram.
       • Data caching before reformatting, after reformatting, and after creating training and
           validation sets.
       • Optional sample limit for test runs.
    The extension class will be implemented as a wrapper for a standard tool for obtaining datasets -
the waveform-database (WFDB) package. The class supports standard electrocardiogram datasets, for
example: MIT-BIH Arrhythmia Database and MIT-BIH Atrial Fibrillation Database [3].
    The extension class for the Random Forest and Gradient Boosting models is designed to implement
machine learning methods for ECG signal processing. The standard RandomForest and Gradient
Boosting classes provided by the sklearn library have a similar functionality, namely parameterized
machine learning based on an input sample of the x - y match type, however, these algorithms have
specific parameters that are unique to a specific implementation, so combining extensions for them is
not possible. For each of these two machine learning models, separate extensions have been created that
support caching and parameterization of algorithms through interaction with the model base.
    The Word2Vec [5] model extension class provides additional functionality that is useful both for
the electrocardiogram and for any other input data.
    In particular, the advanced caching functionality is extremely useful, which, immediately after
preparing and training the model based on the specified parameters, creates a cache file for the finished
model, which can be used explicitly and implicitly in the process of writing software for data analysis.
    Also, specifically for the Word2Vec model, a separate functionality was created to convert the full
array of words into the corresponding vector representation, omitting invalid words for which no vector
transformation was created inside the trained Word2Vec model.
    A separate electrocardiogram signal analysis module provides basic functions for obtaining a
numeric arrhythmia class value and converting from a numeric value to an arrhythmia class, which are
defined and described in the MIT database.
    Also, this module provides functionality for splitting a continuous signal into separate heartbeats
and highlighting individual components of the QRS complex [6] of each heartbeat. Isolation of the QRS
complex can be useful when analyzing the results obtained and for forming the main data block for
training models.
    Separately, inside the module, the BeatsToWordsConverter class is described for converting an
electrocardiogram signal into a linguistic form using the K-Means clustering algorithm [7]. This class
also supports a caching system and a combined model training and caching operation for easier use by
software developers.
    The extension class will be implemented as a wrapper for a standard tool for obtaining datasets - the
waveform-database (WFDB) package. The class supports standard electrocardiogram datasets, for
example: MIT-BIH Arrhythmia Database and MIT-BIH Atrial Fibrillation Database [3].
    The extension class for the Random Forest and Gradient Boosting models is designed to implement
machine learning methods for ECG signal processing. The standard Random Forest and Gradient
Boosting classes provided by the sklearn library have a similar functionality, namely parameterized
machine learning based on an input sample of the x - y match type, however, these algorithms have
specific parameters that are unique to a specific implementation, so combining extensions for them is
not possible. For each of these two machine learning models, separate extensions have been created that
support caching and parameterization of algorithms through interaction with the model base.
    The Word2Vec [5] model extension class provides additional functionality that is useful both for the
electrocardiogram and for any other input data.
    In particular, the advanced caching functionality is extremely useful, which, immediately after
preparing and training the model based on the specified parameters, creates a cache file for the finished
model, which can be used explicitly and implicitly in the process of writing software for data analysis.
Also, specifically for the Word2Vec model, a separate functionality was created to convert the full array
of words into the corresponding vector representation, omitting invalid words for which no vector
transformation was created inside the trained Word2Vec model.
    A separate electrocardiogram signal analysis module provides basic functions for obtaining a
numeric arrhythmia class value and converting from a numeric value to an arrhythmia class, which are
defined and described in the MIT database.
    Also, this module provides functionality for splitting a continuous signal into separate heartbeats
and highlighting individual components of the QRS complex [6] of each heartbeat. Isolation of the QRS
complex can be useful when analyzing the results obtained and for forming the main data block for
training models.
    Separately, inside the module, the BeatsToWordsConverter class is implemented for converting an
electrocardiogram signal into a linguistic form using the K-Means clustering algorithm [7]. This class
also supports a caching system and a combined model training and caching operation for easier use by
software developers.

3.4.    Using of Software library
   To solve the arrhythmia classification problem based on the electrocardiogram signal, the algorithm
shown in Figure 2 was developed.
Figure 2: ECG Signal Classification Algorithm


   The developed algorithm supports a simple change in the classifier model to obtain and compare
results when using various machine learning algorithms, in particular Random Forest and Gradient
Boosting.
   The task of determining the connection between serial signals using Word2Vec. Word2Vec using
the Skip Gram algorithm [8] can be used to search for context words surrounding the key. This
algorithm is applicable to the sequence of electrocardiogram signals to determine whether the
connection between successive signals is inherent in the context of the linguistic representation of the
electrocardiogram.
   The developed algorithm is shown in Figure 3.
Figure 3: Algorithm for calculating connections between consecutive ECG signals

4. THE SOFTWARE LIBRARY EFFICIENCY
   The following describes the efficiency of software library for analyzing the ECG signal. To study
the efficiency, accuracy and conciseness of the developed and computer software, it is necessary to
conduct a number of experiments:
       • comparing the compactness of the program code with and without the library when solving
           problems: preparing an electrocardiogram dataset, splitting the dataset into training and
           validation samples, caching intermediate stage data, the general task of arrhythmia
           classification based on the ECG signal;
       •  calculating the speed of the machine learning algorithm when using the linguistic
          representation of the electrocardiogram signal;
      • determination of the accuracy of the clustered representation of the electrocardiogram signal
          for different sizes of clusters;
      • application of the library for the task of determining the presence of a connection between
          successive ECG signals in a linguistic representation;
      • application of the library for the analysis of data presented by the TextRank method when
          applied to the linguistic representation of the ECG signal.
   For research, we used hardware with the following characteristics: Intel Core i5-6200U CPU 2.3-
2.4 GHz, 12 GB RAM, Samsung 870 Evo-Series 1TB SATA III, AMD Radeon R5 M330. The
experiments were carried out on the operating system Windows 10 Corporate 2016.

4.1.       Researching clustered representation accuracy of the ECG signal
    The formation of a clustered representation consists of the formation of QRS complexes for
individual heartbeats of the electrocardiogram signal, followed by clustering of the formed parts using
two K-means models.
    The first model is responsible for clustering the intervals of R-peaks of heartbeats - high points in
the ECG signal.
    The second model is responsible for clustering the PR intervals that are in the corresponding R-peak
intervals and the ST intervals that are after the corresponding R-peak intervals.
    Thus, the entire ECG signal is subject to clustering.
    After clustering, we replace the corresponding signal intervals with cluster indices. The result is a
sequence of indices, where every three indices represent one heartbeat. The first index is the PR interval,
the second is the R-peak interval, the third index is the ST interval.
    To reverse transform a clustered sequence, you must replace each index with the value of the center
of the corresponding cluster. Thus, we get a copy of the electrocardiogram signal close to the original.
The accuracy of such a transformation depends entirely on the number of clusters specified as
parameters in both K-means models.
    An example of inverse transformation is shown in Figure 4. The orange line is the original signal,
the blue line is the restored signal from the clustered representation.
Figure 4: Comparison of the original and inverse transformation signals

   The rms error was used to estimate the accuracy of the clustered transformation.
   A graph of the change in the root-mean-square error depending on the total number of clusters in
both models is shown in Figure 5. Specific values are given in Table 1.




Figure 5: RMS errors versus total number of clusters
Table 1
   RMS errors depending on the number of clusters

  The number of clusters in the       Number of clusters in the                   MSE
         first model                      second model
               6                                20                                0.173
              12                                40                                0.192
              18                                60                                0.19
              24                                80                                0.124
              30                               100                                0.143
              36                               120                                0.12
              42                               140                                0.105
              48                               160                                0.112
              54                               180                                0.111
              60                               200                                0.101

   As can be seen from the graph, the error in increasing the number of clusters decreases. So, when
using 60 clusters for the first model and 200 for the second model, the error is 10%.
   A graphical comparison of the signal with maximum accuracy is shown in Figure 6.




   Figure 6: Comparison of the original signal and the signal reproduced after clustering using models
with 60 and 200 clusters

4.2.    Researching classification algorithms efficiency
   When using a linguistic representation of an electrocardiogram followed by vectorization using
Word2Vec, the amount of input data for machine learning algorithms is reduced, thereby accelerating
the learning rate of classifier models. The developed algorithm is able to reduce the amount of input
data by several times by initial clustering and then vectorization using the Word2Vec model.
  To determine the improvement in speed, a number of experiments were carried out with the
measurement of the time spent on training classifier algorithms. The results are presented in table 2.

Table 2
   Model learning rate with and without using the Word2Vec model

                     Initial dataset
                        volume               Training duration, s                  F- measure
                       (heart bit
    Algorithm         thousands)                                         Without
                                         Without       With using                        With using
                                                                          using
                                        Word2Vec       Word2Vec                          Word2Vec
                                                                        Word2Vec
                                         model           model                             model
                                                                         model
 Random Forest                            33,2447          27,796           0.95             0.96
 Gradient                    10.5         32,1669          27,877           0.96             0.98
 Boosting
 Random Forest                            495,244          45,063           0.95                0.97
 Gradient                    13.2         479,433          40,708           0.93                0.97
 Boosting

    As can be seen from the table, the learning time really increases when using the text representation
of the electrocardiogram signal vectorized using the Word2Vec model. Especially significant changes
are noticeable when increasing the dataset volume.

4.3. Evaluating software code reduction and accelerating development
process
   To compare the compactness of the code, we will use the count of the number of words and
characters to write the same functional block using the developed library and without it.
   Table 3 compares the number of words and symbols when solving the problem of preparing an
electrocardiogram signal dataset and dividing it into a training and validation set. Table 4 compares
data caching implementations. Example of using software library code can be found in source [14].

Table 3
   Counting the number of software code words and characters for preparing a dataset

 Used decision                         Number of the words for          Number of the symbols for
                                       solving                          solving
 Using the library                     37                               545
 Without using the library             813                              5577

   An example of using the library to prepare a dataset:
ecgdataset = ecgdatasetsholder.EcgDatasetsHolder.cache_from_mit(
    sets_count_limit=5,
    database_name="mitdb",
    mit_records_path=rsc_dir + "/mit_records",
    dataframe_path=rsc_dir + "/ecgdataset",
    annotator_type="symbol",
    reload=False
)

train, test = ecgdataset.split_train_test(test_size=0.25, random_state=42)
train_ready = train.concatenate_datasets()
test_ready = test.concatenate_datasets()

Table 4
Counting the number of words and characters for solving caching problem
 The task of caching,                  Using the library              Without using the library
 or learning and caching           Words              Symbols          Words          Symbols
 Word2vec                            37                 336             169             1117
 KMeans                              20                 237             121              736
 WFDB Dataset                        19                 199             609             3932

   An example of using the library for the task of learning and caching Word2Vec:
   num_features                               =                                                        300
word2vecExtModel                                =                                                        \
    word2vecext.Word2VecExt.load_or_fit_words_and_save(train_words,

train_ready.train_start_indices["start_indices"].tolist(),
                                                       rsc_dir    +   "/word2vec",
                                                       vector_size=num_features,
                                                       reset=False)
train_data       =       word2vecExtModel.vectorize_valid_with_labels(train_words,
train_ready.dataframe["labels"].tolist())

validation_data = word2vecExtModel.vectorize_valid_with_labels(validation_words,
test_ready.dataframe["labels"].tolist())

(train_x, train_y), (validation_x, validation_y) = (train_data, validation_data)



5. Discussion and Conclusion


     Today among modern researches use of machine learning algorithm in medical data assesment
becomes inevitable. Therefore we have developed software library for analyzing the ECG signal by
using Word2Vec model , which includes ECG signal dataset preparation module, caching system for
machine learning algorithms data, program library extension classes. With the help of the developed
library, the amount of software code necessary for data preparation, caching or analysis is reduced
several times.
     Clustered representation accuracy of the ECG signal was investigated. RMS errors of restored data
significantly decreases when using 100 or more clusters and is approximately 10%. Therefore, the
developed method [5] of presenting the ECG signal by using Word2Vec model, which reduces the
original ECG signal more than 100 times, can be effectively applied to significantly reduce the stored
signal and its further analysis without loss of quality.
     Use of Word2Vec model increases F-measure of Random Forest method from 0.95 to 0.97 and
from 0.96 to 0.98 for Gradient Boosting. Learning time significantly increases when using the text
representation of the electrocardiogram signal vectorized using the Word2Vec model, especially in case
of increasing the dataset volume.
     The developed library increases the level of abstraction, which allows researchers to use it with less
programming experience in their fields. An important direction of the library's development will be its
expansion and addition of support for new algorithms, so that scientists can more efficiently solve the
tasks, without spending time on low-level adjustment of algorithms.
References
  [1] Scikit-learn, a Python module for machine learning, 2020. URL:https://scikit-learn.org.
  [2] Gensim library,2022. URL: https://radimrehurek.com/gensim.
  [3] The WFDB Software Package, 2018. URL: https://archive.physionet.org/phys-
      iotools/wfdb.shtml.
  [4] Igor Baklan, ECG Signal Processing Based on Linguistic Chain Fuzzy Sets, in: Alina Oliinyk,
      Iryna Mukha, Kateryna Lishchuk, Olena Gavrilenko, Svitlana Reutska, Anna Tsytsyliuk, Yurii
      Oliinyk, Proceedings of the 5th International Conference on Computational Linguistics and
      Intelligent Systems, COLINS’2021, volume I of Main Conference, CEUR-WS, volume 2870,
      pp. 1731-1741. URL: http://ceur-ws.org/Vol-2870/paper125.pdf
  [5] Yurii Oliinyk, Andrii Tereschenko, Igor Baklan, Elisa Beraudo, ECG Analysis based on
      Word2Vec Model in Proceedings of the 4th International Conference on Informatics & Data-
      Driven Medicine, IDDM 2021 Valencia, Spain, CEUR-WS, volume 3038, pp. 203-232. URL:
      http://ceur-ws.org/Vol-3038/short9.pdf
  [6] J. Pan and W. J. Tompkins, "A Real-Time QRS Detection Algorithm," in IEEE Transactions
      on Biomedical Engineering, vol. BME-32, no. 3, pp. 230-236, March 1985, doi:
      10.1109/TBME.1985.325532.
  [7] Tryon, Robert C. Cluster Analysis: Correlation Profile and Orthometric (factor) Analysis for
      the Isolation of Unities in Mind and Personality, Ann Arbor, Mich., Edwards Brothers,1939.
  [8] Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., Dean, J. , Distributed representations of
      words and phrases and their compositionality, in: Proceeding of the 27th Annual Conference on
      Neural Information Processing System, Advances in neural information processing systems,
      volume 26, pp.3111-1119,2013.
  [9] Mishra A, Dharahas G, Gite S, Kotecha K, Koundal D, Zaguia A, Kaur M, Lee HN. ECG Data
      Analysis with Denoising Approach and Customized CNNs. Sensors (Basel). 2022 Mar
      1;22(5):1928. doi: 10.3390/s22051928.
  [10]         Alfaras M, Soriano MC and Ortín S (2019) A Fast Machine Learning Model for ECG-
      Based Heartbeat Classification and Arrhythmia Detection. Front. Phys. 7:103. doi:
      10.3389/fphy.2019.00103
  [11]         Manikandan Kaliappan, Sumithra Manimegalai Govindan, and Mohana Sundaram
      Kuppusamy. 2022. Automatic ECG analysis system with hybrid optimization algorithm based
      feature selection and classifier. J. Intell. Fuzzy Syst. 43, 1 (2022), 627–642.
      https://doi.org/10.3233/JIFS-212373
  [12]         Sulaiman Somani, Adam J Russak, Felix Richter, Shan Zhao, Akhil Vaid, Fayzan
      Chaudhry, Jessica K De Freitas, Nidhi Naik, Riccardo Miotto, Girish N Nadkarni, Jagat Narula,
      Edgar Argulian, Benjamin S Glicksberg, Deep learning and the electrocardiogram: review of
      the current state-of-the-art, EP Europace, Volume 23, Issue 8, August 2021, Pages 1179–
      1191, https://doi.org/10.1093/europace/euaa377
  [13]         Sajad Mousavi, Fatemeh Afghah, Fatemeh Khadem, U. Rajendra Acharya, ECG
      Language processing (ELP): A new technique to analyze ECG signals, Computer Methods and
      Programs in Biomedicine, Volume 202, 2021, 105959, ISSN 0169-2607,
      https://doi.org/10.1016/j.cmpb.2021.105959.
  [14]         Example of using software library based on NLP and ML methods, 2022. URL:
      https://colab.research.google.com/drive/1L46s8gcXfOzAsp5Lenoem9NFrNfKKWOm