The ensemble of algorithms for coronary heart disease
detection based on electrocardiogram


                    V N Guryanova1

                    1 Lomonosov Moscow State University, Leninskie Gory 1, Moscow, Russia, 119991



                    Abstract. Coronary heart disease (CHD) is the leading cause of death in the world. The
                    disease can be asymptomatic for a long time, yet progress over time and result in death.
                    Today an electrocardiogram (ECG) can be recorded at home with special equipment from
                    CardioQvark. In this paper, the possibility of CHD detection based on such ECGs was
                    explored. Different approaches to the classification of such electrocardiograms were
                    surveyed, and new algorithms and modifications to existing algorithms were proposed. A
                    new method, an ensemble of different algorithms, has shown the best performance.




1. Introduction
Coronary heart disease (CHD) [1] is a group of diseases defined by an insufficient oxygen supply
to the heart muscle through the coronary arteries. According to the World Health Organization,
this disease is the leading cause of death in the world.
    At the initial stages of the disease, most people do not show any symptoms. It is very
important to identify CHD in time to slow the course of the disease and prevent the patient's
death.
    Traditionally, CHD is detected by specialists using a number of tests. These tests take a
significant amount of the patient's time and require a highly qualified specialist to conduct
them. Since there are few such specialists and the number of potential patients grows every
year, automatic CHD detection is an urgent task. A device that helps determine the disease, or
its probability, at home would allow a person to be referred to a doctor when the probability
of CHD is high.
    An electrocardiogram (ECG) is a signal that reflects the electrical activity of the heart.
The ECG is currently one of the most affordable ways of diagnosing heart disease due to its
non-invasiveness and low cost. Many studies show that the ECG can be used to detect CHD
[2], [3], [4], [5].
    CardioQvark (project site: www.cardioqvark.ru) has created a device in the form of a
smartphone case that allows ECG measurements to be taken at home. The CardioQVARK
device is a portable electrocardiograph in the form of a smartphone case (iPhone 5 / 5s / SE /
6 / 6s) that registers the bio-electrical activity of the heart from the first ECG lead and, using
the patient's cable, from the following leads: aVR, aVL, aVF, Vi (i = 1 ... 6). In this work,
the possibility of CHD detection based on such ECGs from the first lead was explored.



IV International Conference on "Information Technology and Nanotechnology" (ITNT-2018)
Data Science
V N Guryanova




   There are different approaches to the ECG classification problem, some of which were
surveyed in this work. New algorithms and modifications to existing algorithms were proposed.
In order to improve classification performance, an ensemble of 5 different methods was built.
Each of these methods is described below.

2. Data Description
All research was conducted on the basis of the following medical centers: NGHCI ”Semashko
Central Clinical Hospital 2”, Federal State Scientific Institution ”Petrovsky Russian Scientific
Center of Surgery”, Federal State-Funded Health Care Institution ”City Clinical Hospital 4
Health Care Moscow Department”, Federal State-Funded Health Care Institution ”Moscow
Clinical Scientific Center of Moscow Department Health Care”, State Autonomous Health Care
Institution of the Moscow Region ”Clinical Center for Restorative Medicine and Rehabilitation”.
    A voluntary anonymous study included patients over 18 years of age. Annotated impersonal
electrocardiograms (ECG) were recorded from the first ECG lead using a CardioQVARK cardio
monitor. The duration of each recording was 5 minutes. The measurement was taken in the
sitting position, with support for the back and hands on the knees or on the table. Data was
collected dynamically, with 3-10 observations per patient and an interval of at least 12 hours
between measurements.
    The sample used for this task consists of 1798 cardiograms: 1055 cardiograms of healthy
patients and 743 cardiograms of patients with CHD. The sampling frequency was 1000 Hz.
    The signals were preprocessed before applying machine learning algorithms. Second-order
low-pass and high-pass Butterworth filters [6] were used, with a cutoff frequency of 0.3 Hz for
the low-pass filter and 15 Hz for the high-pass filter. The signal trend was extracted using a
median filter [6] and subtracted from the filtered signal.
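This preprocessing pipeline can be sketched as follows, a minimal illustration using SciPy. Two assumptions are made here: the 0.3 Hz cutoff is assigned to the high-pass stage (baseline removal) and the 15 Hz cutoff to the low-pass stage, as is conventional for ECG, and the 201-sample median window is a hypothetical choice not stated in the paper.

```python
import numpy as np
from scipy.signal import butter, filtfilt, medfilt

def preprocess_ecg(signal, fs=1000):
    """Sketch of the paper's preprocessing: 2nd-order Butterworth filtering
    followed by median-filter detrending.

    Assumption: 0.3 Hz is treated as the high-pass cutoff (baseline wander)
    and 15 Hz as the low-pass cutoff; the median window (201) is hypothetical.
    """
    nyq = fs / 2.0
    b_hp, a_hp = butter(2, 0.3 / nyq, btype="highpass")
    b_lp, a_lp = butter(2, 15.0 / nyq, btype="lowpass")
    # Zero-phase filtering in both bands.
    filtered = filtfilt(b_lp, a_lp, filtfilt(b_hp, a_hp, np.asarray(signal, float)))
    # Extract the residual trend with a median filter and subtract it.
    trend = medfilt(filtered, kernel_size=201)
    return filtered - trend
```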

3. Algorithms Description
Below are descriptions of the algorithms that were used to build the ensemble.

3.1. The algorithm based on the HRV signal
The idea used in this algorithm was described in [2]. In the ECG signal, R-peaks can be
distinguished, which correspond to the person's pulse [6]. The ECG signal is used to create the
heart rate variability signal (HRV signal), which is calculated as follows.
  • R-peaks are computed.
  • The intervals between two successive R-peaks (RR-intervals) are measured.
  • Each value of the RR-interval is converted to 60/RR.
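The three construction steps above can be sketched as follows, assuming R-peak positions are given in samples at the stated 1000 Hz sampling rate:

```python
import numpy as np

def hrv_signal(r_peak_samples, fs=1000):
    """Build the HRV signal from R-peak positions (in samples):
    RR-intervals in seconds, then 60/RR (beats per minute)."""
    rr = np.diff(np.asarray(r_peak_samples)) / fs   # RR-intervals, seconds
    return 60.0 / rr                                # instantaneous heart rate
```

For example, peaks spaced exactly one second apart produce a constant HRV signal of 60 bpm.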
The main idea of this method is the construction of various groups of features from the HRV
signal.
    The first group of features includes various entropy features, which indicate a measure of
unpredictability in the signal. The following types of entropy are used: approximate entropy,
sample entropy, and Shannon entropy. Each type of entropy is described in detail below.
    Approximate entropy is calculated as follows. Here x = (x_0, x_1, ..., x_{N-1}) is the HRV
signal of length N.
  • The integer m and the real r are fixed.
  • A set of vectors of the form x_i^m = (x_i, x_{i+1}, ..., x_{i+m-1}), where i \in [0, N-m],
    is composed.
  • The values C_i^m(r) are calculated as follows:

        C_i^m(r) = \frac{\left|\{ x_k^m : d(x_i^m, x_k^m) \le r,\; k \in [0, N-m] \}\right|}{N - m + 1},






    where

        d(x_i^m, x_k^m) = \max_{a \in [0, m-1]} \left| x_i^m(a) - x_k^m(a) \right|,

    and x_i^m(a) is the component a of the vector x_i^m.
  • The values \Phi^m(r) are defined as:

        \Phi^m(r) = (N - m + 1)^{-1} \sum_{i=0}^{N-m} \log(C_i^m(r)).

  • Approximate entropy (ApEn) is defined as

        \mathrm{ApEn}(r) = \Phi^m(r) - \Phi^{m+1}(r).

In this paper, the approximate entropy was computed for m = 10 and r = 0.2 std(x), where
std(x) is the standard deviation of the signal x.
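A direct, unoptimized sketch of these formulas (Chebyshev distance d, counts C_i^m, and the difference of the Φ values); the paper's m = 10 is replaced here by a small default purely for illustration:

```python
import numpy as np

def approximate_entropy(x, m=2, r=None):
    """ApEn per the formulas above (the paper uses m=10, r=0.2*std(x))."""
    x = np.asarray(x, dtype=float)
    if r is None:
        r = 0.2 * np.std(x)
    N = len(x)

    def phi(m):
        # Embedding vectors x_i^m for i in [0, N-m].
        vecs = np.array([x[i:i + m] for i in range(N - m + 1)])
        # Chebyshev distance between every pair of vectors.
        d = np.max(np.abs(vecs[:, None, :] - vecs[None, :, :]), axis=2)
        C = np.mean(d <= r, axis=1)          # C_i^m(r)
        return np.mean(np.log(C))            # Phi^m(r)

    return phi(m) - phi(m + 1)
```

A perfectly regular signal gives an approximate entropy of zero.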
   The sample entropy is calculated as follows.
  • Vectors x_i^m of length m and vectors x_i^{m+1} of length m + 1 are formed similarly to
    those in the approximate entropy.
  • The values A and B are calculated as follows:

        A(r) = \left|\{ (x_k^{m+1}, x_l^{m+1}) : d(x_k^{m+1}, x_l^{m+1}) \le r,\; 0 \le k \le l \le N - m - 1 \}\right|,

        B(r) = \left|\{ (x_k^m, x_l^m) : d(x_k^m, x_l^m) \le r,\; 0 \le k \le l \le N - m \}\right|,

    where d is determined as in the approximate entropy.
  • Sample entropy (SampEn) is defined as

        \mathrm{SampEn}(r) = -\log \frac{A(r)}{B(r)}.

In this paper, the sample entropy was determined for m = 10, r = 0.2 std(x).
   The Shannon entropy is calculated as

        \mathrm{ShanEn} = -\sum_{f=1}^{k} p_f \log p_f,

where k is the number of distinct elements in the signal x and p_f is the frequency of the
element f in the signal x.
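A minimal sketch of this frequency-based entropy:

```python
import numpy as np
from collections import Counter

def shannon_entropy(x):
    """Shannon entropy over the frequencies of distinct signal elements."""
    counts = np.array(list(Counter(x).values()), dtype=float)
    p = counts / counts.sum()        # frequency p_f of each distinct element
    return -np.sum(p * np.log(p))
```

Two equally frequent values give an entropy of log 2.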
    The following group of features is based on a recurrence plot, which shows the frequency
and duration of repetitions in the signal. The element R(i, j) of this plot is defined as 1 if
||x_i - x_j|| < \varepsilon and 0 otherwise, where x is the HRV signal.
    Based on this plot, the following features are calculated. Here N is the number of elements
in the HRV signal, l_{d,min} and l_{d,max} are the minimum and maximum lengths of the
diagonal lines, and l_{v,min} and l_{v,max} are the minimum and maximum lengths of the
vertical lines:
  • Density of points (REC)

        \mathrm{REC} = \frac{1}{N^2} \sum_{i,j=1}^{N} R(i, j).







  • The percentage of points that form the diagonal lines (DET)

        \mathrm{DET} = \frac{\sum_{l=l_{d,min}}^{l_{d,max}} l P(l)}{\sum_{i,j} R(i, j)},

    where P(l) is the number of diagonal lines of length l.
  • The average length of the diagonals (L_{mean})

        L_{mean} = \frac{\sum_{l=l_{d,min}}^{l_{d,max}} l P(l)}{\sum_{l=l_{d,min}}^{l_{d,max}} P(l)}.

  • Entropy of diagonal lines (EN_d)

        EN_d = -\sum_{l=l_{d,min}}^{l_{d,max}} p_l \log p_l,

    where p_l is the frequency of diagonal lines of length l.
  • Entropy of vertical lines (EN_v)

        EN_v = -\sum_{l=l_{v,min}}^{l_{v,max}} p_l^v \log p_l^v,

    where p_l^v is the frequency of vertical lines of length l.
   For the calculation of these features, the PyRQA library [7] was used.
   Another group of features used in this approach is based on the Poincare plot. For the signal
x = (x_0, x_1, ..., x_{N-1}), the plot consists of the points (x_0, x_1), (x_1, x_2), ...,
(x_i, x_{i+1}), and so on. In this case, RR-intervals were used as the signal x. The following
features are constructed:
  • The standard deviation of distances from the points of the plot to the line y = x. This
    feature describes the local variability of RR-intervals.
  • The standard deviation of distances from the points of the plot to the line
    y = -x + 2RR_{mean}, where RR_{mean} is the average value of the RR-intervals. This
    feature describes the long-term variability of RR-intervals.
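These two deviations (commonly called SD1 and SD2) can be sketched as follows; the perpendicular distances use the fact that the distance from point (a, b) to the line y = x is |b - a|/√2, and to the perpendicular axis through the mean point it is |a + b - 2·RR_mean|/√2:

```python
import numpy as np

def poincare_sd(rr):
    """SD1/SD2-style Poincare-plot features from RR-intervals."""
    rr = np.asarray(rr, dtype=float)
    x, y = rr[:-1], rr[1:]                   # plot points (x_i, x_{i+1})
    sd1 = np.std((y - x) / np.sqrt(2))       # distance to the identity line
    sd2 = np.std((x + y - 2 * rr.mean()) / np.sqrt(2))
    return sd1, sd2
```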
   The next feature is based on detrended fluctuation analysis, a method that measures the
self-dependence of the signal. The following cumulative sum is defined:

        x_{cumsum}(t) = \sum_{i=1}^{t} (x(i) - \mu),

where x is the signal consisting of RR-intervals and \mu is its mean.
   The data is segmented with a window of size \Delta n. On each segment, a polynomial
(usually linear) is fitted that most accurately represents the data. The union of all such
polynomials forms a function x_{\Delta n}(t), which approximates the original function
x_{cumsum}(t). The following function is then computed:






        F(\Delta n) = \sqrt{\frac{1}{N} \sum_{t=1}^{N} \left[ x_{cumsum}(t) - x_{\Delta n}(t) \right]^2},

where N is the length of the signal consisting of RR-intervals.
    The feature is the slope of log F(\Delta n) against log(\Delta n). More information about
this approach can be found in [8]. In this paper, this feature was computed using the publicly
available nolds package (Nonlinear measures for dynamical systems), version 0.3.2, available
at https://pypi.python.org/pypi/nolds.
    The next feature used in this approach is the correlation dimension, a quantitative
characteristic of the signal trajectory, defined as follows.

  • Vectors x_i^m of length m, similar to those formed in the approximate entropy, are
    constructed.
  • The value g is defined as:

        g(r) = \left|\{ (x_k^m, x_l^m) : d(x_k^m, x_l^m) \le r,\; 0 \le k \le l \le N - m - 1 \}\right|,

    that is, g is the number of pairs of vectors whose distance is less than or equal to r.
  • The value C(r) is defined as

        C(r) = \frac{g(r)}{N^2},

    where N is the length of the signal x.
  • The correlation dimension (D2) is defined as

        D2 = \lim_{r \to 0} \frac{\log C(r)}{\log r}.

   This feature was also computed using the nolds package.
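A self-contained sketch of the correlation sum C(r) from the definition above; in practice the limit defining D2 is estimated as the slope of log C(r) versus log r over a range of r values, which is what the nolds package does internally:

```python
import numpy as np

def correlation_sum(x, m=2, r=0.5):
    """Correlation sum C(r) = g(r) / N^2 for embedding dimension m,
    using the Chebyshev distance d as in the approximate entropy."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    vecs = np.array([x[i:i + m] for i in range(N - m + 1)])
    d = np.max(np.abs(vecs[:, None, :] - vecs[None, :, :]), axis=2)
    # Count ordered pairs k < l within distance r of each other.
    g = np.sum(np.triu(d <= r, k=1))
    return g / N ** 2
```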
   The gradient boosting from the xgboost package [9] was used as the classifier in this approach.

3.2. The algorithm based on 3 different feature spaces
This algorithm is a mixture of 3 different feature spaces, which were previously used in the
classification of biomedical signals.
   The first group of features consists of the Hjorth parameters: activity, mobility and
complexity [10]. These parameters were originally used as features for electroencephalograms
and were later applied in many works, including the classification of ECG signals [11].
   The second group of features consists of statistical signal features: mean, standard deviation,
signal minimum, signal maximum, skewness, kurtosis, sample quantiles of orders 0.1, 0.25, 0.5,
0.75, 0.9, and the sums and sums of squares of signal values above / below those quantiles.
   The next group of features was suggested by Uspenskiy for disease detection from a patient's
ECG [12]. To calculate these features, it is necessary to compute the amplitudes of the R-peaks
(A(n)), the distances between the R-peaks (T(n)), and the arctangent of their ratio

        \alpha(n) = \arctan \frac{A(n)}{T(n)}.






                               Table 1. Signal encoding for Uspenskiy features

                                                             A     B     C     D     E   F
                                   A(n + 1) − A(n)           +     −     +     −     +   −
                                   T (n + 1) − T (n)         +     −     −     +     +   −
                                   α(n + 1) − α(n)           +     +     +     −     −   −



    It is assumed that the values of A(n) and T(n) themselves are not important, but the signs
of their increments are. The method of signal encoding based on all possible sign combinations
of these increments is presented in Table 1.
    After the code representation of the signal is obtained, three-grams are extracted. The
feature space is the number of occurrences of each possible three-gram in the code sequence
derived from the signal.
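This encoding and trigram counting can be sketched as follows. The sign-to-letter map follows Table 1; treating a zero increment as "+" is an assumption made here, since ties are not discussed in the text:

```python
import numpy as np
from collections import Counter

# Sign patterns (dA, dT, d_alpha) -> letters, per Table 1.
CODES = {(1, 1, 1): "A", (-1, -1, 1): "B", (1, -1, 1): "C",
         (-1, 1, -1): "D", (1, 1, -1): "E", (-1, -1, -1): "F"}

def uspenskiy_trigrams(A, T):
    """Encode the beat sequence by increment signs and count three-grams.
    A, T are per-beat R-amplitudes and RR-distances. Sign combinations
    absent from Table 1 are skipped; zero increments count as '+'."""
    alpha = np.arctan(np.asarray(A, float) / np.asarray(T, float))
    letters = []
    for n in range(len(A) - 1):
        key = (int(np.sign(A[n + 1] - A[n])) or 1,
               int(np.sign(T[n + 1] - T[n])) or 1,
               int(np.sign(alpha[n + 1] - alpha[n])) or 1)
        if key in CODES:
            letters.append(CODES[key])
    code = "".join(letters)
    # Feature vector: occurrence count of each observed three-gram.
    return Counter(code[i:i + 3] for i in range(len(code) - 2))
```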
    The logistic regression from the scikit-learn package [13] was used as the classifier in this
approach.

3.3. The algorithm based on R-peak’s neighborhood
The idea used in this algorithm was described in [14] for identifying the heart states in which
a patient should be referred to a cardiac service. The feature space for this approach is
constructed as follows.
  • The neighborhoods of the signal R-peaks are allocated: 200 points before R-peak and 500
    after.
  • The averaged neighborhood is used as a feature space.
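The feature construction above can be sketched as follows (R-peak positions assumed to be given in samples; peaks too close to the signal edges are skipped in this sketch):

```python
import numpy as np

def rpeak_feature(signal, r_peaks, before=200, after=500):
    """Average the (before + after)-point neighbourhoods of the R-peaks;
    the resulting 700-dimensional mean is the feature vector."""
    signal = np.asarray(signal, dtype=float)
    segs = [signal[p - before:p + after] for p in r_peaks
            if p - before >= 0 and p + after <= len(signal)]
    return np.mean(segs, axis=0)
```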
   A neural network with the architecture described in Table 2 was used as the classification
model for this algorithm.


    Table 2. Neural Network Structure for the algorithm based on R-peaks neighborhoods

                                 Input Layer         Shape = (700)
                                 Dense Layer         Units Number = 90
                                                     Activation Function = sigmoid
                                 Dense Layer         Units Number = 1
                                                     Activation Function = sigmoid



   In this work, the neural network was implemented using the libraries Theano [15] and Lasagne
[16].

3.4. The algorithm based on wavelet transformation
The idea for this algorithm was described in [17] for epilepsy detection and person
identification based on the ECG signal.
   The wavelet transformation of a signal is its convolution with functions \Psi(t), called
wavelets. Such wavelet functions must possess specific properties:







        \int_{-\infty}^{+\infty} \Psi(t)\,dt = 0, \qquad \int_{-\infty}^{+\infty} |\Psi(t)|^2\,dt < \infty.

   The wavelet transformation achieves signal compression while reproducing the original signal
with good quality [6].
   All wavelet functions used in a wavelet transformation can be obtained from a prototype
function \psi(t) by scaling and shifting. In the case of the discrete wavelet transformation,
all wavelet functions can be written as:

        \psi_{m,n}(t) = \frac{1}{\sqrt{2^m}} \psi(2^{-m} t - n).

   In the discrete wavelet transformation, the \psi_{m,n} functions can be separated into two
types, corresponding to approximation coefficients and detail coefficients. In the present work,
the approximation coefficients are used as the new representation of the signal, and Daubechies
wavelets [18] are used as the wavelet function.
   After obtaining the wavelet transformation coefficients, local segments are extracted from
the signal by moving a window of length w with step s; all elements within one window go into
a separate segment. After this procedure, every signal is represented as a set of local segments.
   All local segments in the training set are partitioned into k clusters using the k-means
algorithm. Every segment is then replaced with the number of the cluster it belongs to, so
that every signal is represented as a text of codewords, each word denoting a certain cluster.
   In the implementation of the described algorithm, the parameter values w = 100, s = 30,
k = 200 were used, together with the k-means implementation from the scikit-learn package.
   Every local segment, in both the training and test samples, is replaced with the cluster
whose centre it is closest to. That is, for every local segment s_i, the cluster c it belongs
to is determined by:

        c = \arg\min_j d(b_j, s_i), \qquad d(b_j, s_i) = \sqrt{\sum_{k=1}^{w} (s_i^k - b_j^k)^2},

where b_j is the centre of cluster j, and s_i^k (b_j^k) is the kth element of the local segment
i (cluster centre j).
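The nearest-centre assignment above can be sketched as follows (centres as produced by any k-means fit):

```python
import numpy as np

def assign_codewords(segments, centers):
    """Replace each local segment with the index of the nearest cluster
    centre under the Euclidean distance."""
    segments = np.asarray(segments, float)   # shape (n_segments, w)
    centers = np.asarray(centers, float)     # shape (k, w)
    d = np.linalg.norm(segments[:, None, :] - centers[None, :, :], axis=2)
    return d.argmin(axis=1)
```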
    By transforming the input signal into text in this way, natural language processing
algorithms can be applied. The authors who suggested this encoding used a bag of words as
the feature space: the feature description is the number of occurrences of each codeword in a
given signal.
    Features based on word2vec were also used in order to extract dependencies between local
segments of the signal. This approach, suggested by Google, enables context-aware text
processing while reducing the dimensionality of the data [19].
    The Word2Vec model was trained with an embedding vector length of 80, and the mean of
all word vectors in a signal was used as its feature vector. The model was trained using the
gensim package [20]. The logistic regression from the scikit-learn package [13] was used as the
classification algorithm.

3.5. The algorithm based on bispectrum
The bispectrum is a function of two variables f_1 and f_2 that specify the frequencies,
expressed by the following formula [21]:

        B(f_1, f_2) = X(f_1) X(f_2) X^*(f_1 + f_2),

where X(f) is the Fourier transform of the signal and X^*(f) is its complex conjugate. The
signal bispectrum is usually calculated using a fast Fourier transform. A detailed description
of the algorithm for computing the bispectrum of a signal can be found in [22].
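A single-frame sketch of this formula using the FFT; practical bispectrum estimators, as described in [22], average such products over many signal segments:

```python
import numpy as np

def bispectrum(x):
    """Direct estimate B[i, j] = X[i] * X[j] * conj(X[(i + j) mod N])
    for one frame of the signal x."""
    X = np.fft.fft(np.asarray(x, dtype=float))
    N = len(X)
    f = np.arange(N)
    return X[:, None] * X[None, :] * np.conj(X[(f[:, None] + f[None, :]) % N])
```

Note that the result is symmetric in its two frequency arguments, which follows directly from the formula.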







Calculating the bispectrum of a signal yields a two-dimensional matrix whose elements are
complex numbers.
   Based on the resulting matrix, with elements denoted a(i, j), each signal can be associated
with a certain image. The image is calculated as follows. A new matrix B = ||b_{i,j}|| is
computed, with elements

        b(i, j) = \sqrt{\mathrm{Re}^2 a(i, j) + \mathrm{Im}^2 a(i, j)},

where Re and Im denote the real and imaginary parts of a complex number. The contour plot
of the matrix B is used as the image.
   The authors of [23] have shown that coronary heart disease can be detected by analyzing
images obtained from a signal bispectrum. Their method measured the area of the region
within the level lines to conclude whether the patient had CHD. Their results suggest that
bispectrum images can be used to detect CHD.
   It was proposed to use neural networks for classification of such images. The architecture of
the neural network is described in Table 3.

             Table 3. Neural Network Structure for the algorithm based on bispectrum

                           Input Layer                 Shape = (3, 80, 80)
                           Convolution layer           Filter size = (32, 5, 5)
                                                       Offset = (2,2)
                           Dense Layer                 Units Number = 30
                                                       Activation Function = LeakyRelu
                           Dense Layer                 Units Number = 1
                                                       Activation Function = Sigmoid



   In this work, the neural network was implemented using the libraries Theano [15] and Lasagne
[16].

4. Methods of constructing ensembles of algorithms
In order to increase classification performance, ensembles of the algorithms were used. Several
ensembling methods are described in this section.

4.1. Majority voting
Given a set of algorithms A = (A_1, A_2, ..., A_n) that outputs a vector of predictions
a = (a_1, a_2, ..., a_n), the resulting answer a of the ensemble is equal to

        a = \mathrm{mode}(a_1, a_2, ..., a_n),

where mode is the element most often encountered in the predictions. If several elements are
tied, one of them is chosen at random.
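A minimal sketch of this voting rule:

```python
import numpy as np

def majority_vote(predictions):
    """Mode of the per-algorithm predictions; ties are broken at random."""
    values, counts = np.unique(predictions, return_counts=True)
    winners = values[counts == counts.max()]
    return np.random.choice(winners)
```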

4.2. EM-algorithm
The main idea of the EM-algorithm [24] is to aggregate data about the same event from
different annotators in order to obtain a correct evaluation. Since the goal of ensembling is
the aggregation of several different algorithms, the EM-algorithm is applicable to ensemble
creation.






   Algorithm description:
   N is the size of the available data; K is the number of algorithms; n_{il}^k indicates
whether the k-th algorithm gave the answer l (l \in \{1, 2\}) for the data item i (i = 1...N);
   \pi_{jl}^k is the probability that the k-th algorithm returns the answer l when the true
answer is j;
   T_{ij} = 1 if the true answer for the data item i is j, and 0 otherwise;
   p_j is the probability of class j in the sample.
  • Step 1: Initialize the matrices \pi to the ideal case and initialize T with the
    majority-voting answers.
  • Step 2: Recalculate the values of the matrices \pi and p_j:

        \pi_{jl}^k = \frac{\sum_i T_{ij} n_{il}^k}{\sum_l \sum_i T_{ij} n_{il}^k}, \qquad p_j = \frac{\sum_i T_{ij}}{N}.

  • Step 3: Recalculate T_{ij}:

        T_{ij} = \frac{p_j \prod_{k=1}^{K} \prod_{l=1}^{2} (\pi_{jl}^k)^{n_{il}^k}}{\sum_{q=1}^{2} p_q \prod_{k=1}^{K} \prod_{l=1}^{2} (\pi_{ql}^k)^{n_{il}^k}}.

Steps 2 and 3 are repeated until the matrices \pi stop changing.
   At the end of this algorithm, the matrix T contains the probability of each data item
belonging to each class. The class with the greatest probability is taken as the answer.
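A compact sketch of this scheme (in the style of Dawid and Skene [24]), with binary labels coded here as 0/1 rather than 1/2, and a fixed iteration count instead of a convergence check:

```python
import numpy as np

def em_aggregate(votes, n_iter=50):
    """EM aggregation sketch. votes is an (N, K) array of labels in {0, 1}
    from K algorithms for N items; returns the posterior matrix T where
    T[i, j] = P(true class of item i is j)."""
    N, K = votes.shape
    # Step 1: initialise T from the majority vote.
    T = np.zeros((N, 2))
    T[np.arange(N), (votes.mean(axis=1) > 0.5).astype(int)] = 1.0
    for _ in range(n_iter):
        # Step 2: confusion matrices pi[k, j, l] = P(algo k answers l | true j)
        pi = np.zeros((K, 2, 2))
        for k in range(K):
            for j in range(2):
                for l in range(2):
                    pi[k, j, l] = T[:, j] @ (votes[:, k] == l)
            pi[k] /= pi[k].sum(axis=1, keepdims=True)
        p = T.mean(axis=0)                   # class priors p_j
        # Step 3: recompute the posterior T_ij.
        for i in range(N):
            lik = p.copy()
            for k in range(K):
                lik *= pi[k, :, votes[i, k]]
            T[i] = lik / lik.sum()
    return T
```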

5. Evaluation of Algorithms
Cross-validation was used to evaluate the performance of the algorithms. To avoid overfitting,
the ECGs of one patient never fell simultaneously into the training and test sets. The following
performance criterion was introduced:

        \frac{1}{N} \sum_{i=1}^{N} \frac{\sum_{j=1}^{n_i} I_{t_{ij} = p_{ij}}}{n_i},

where t_{ij} is the true value of the target variable for the cardiogram j of the patient i,
p_{ij} is the predicted value of the target variable for the cardiogram j of the patient i,
n_i is the number of cardiograms of the patient i, N is the number of patients, and
I_{t_{ij} = p_{ij}} is the indicator function equal to 1 if t_{ij} = p_{ij} and 0 otherwise.
This criterion is called patient performance. It evaluates how well the algorithm determines a
person's disease from any of his cardiograms, and it does not depend on the number of
cardiograms per patient. The ROC-AUC score [25] and F-score [25] were also used for model
evaluation.
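The patient performance criterion can be sketched as follows (per-patient accuracies averaged over patients):

```python
import numpy as np

def patient_performance(true, pred, patient_ids):
    """Mean over patients of the per-patient fraction of correctly
    classified cardiograms, per the formula above."""
    true, pred, patient_ids = map(np.asarray, (true, pred, patient_ids))
    scores = [np.mean(true[patient_ids == p] == pred[patient_ids == p])
              for p in np.unique(patient_ids)]
    return float(np.mean(scores))
```

A patient with many cardiograms thus weighs no more than a patient with few.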

6. Results
The results of the evaluation are shown in Table 4, where the first column gives the algorithm
or ensemble type. The algorithm based on the wavelet transformation is included in two
variants, with and without word2vec.








7. Conclusion
In the course of this work, the following results were obtained. With CardioQvark equipment,
it is possible to detect CHD with a patient performance above 0.81, an F-score above 0.77 and
a ROC-AUC above 0.87. Word2vec increases the performance of the classification method based
on the wavelet transformation. The bispectrum can be used for CHD classification. The
EM-algorithm is applicable to ensembling and, in this case, shows the best classification
performance for all selected criteria.


                                          Table 4. CHD detection results

        Algorithm                            Patient Performance    ROC-AUC    F-score
        Bispectrum                           0.7207                 0.7418     0.7244
        Wavelet transformation               0.741                  0.8        0.6967
        Wavelet transformation + word2vec    0.7501                 0.8        0.6990
        R-peak's neighborhood                0.7602                 0.7988     0.703
        The HRV signal                       0.763                  0.744      0.662
        3 different feature spaces           0.7632                 0.8042     0.70256
        Majority voting                      0.806                             0.77
        EM-algorithm                         0.8108                 0.8738     0.7784

8. References
[1] Gorbachev V V 2008 Cardiac ischemia (Minsk: Vysh. shk.) p 479 (in Russian)
[2] Dua S 2012 Novel classification of coronary artery disease using heart rate variability
analysis Journal of Mechanics in Medicine and Biology 12(4) 1240017-1240019
[3] Giri D 2013 Automated diagnosis of coronary artery disease affected patients using LDA,
PCA, ICA and discrete wavelet transform Knowledge-Based Systems 37 274-282
[4] Acharya U R 2017 Application of higher-order spectra for the characterization of coronary
artery disease using electrocardiogram signals Biomedical Signal Processing and Control 31 31-43
[5] Kumar M, Pachori R B and Acharya U R 2017 Characterization of coronary artery disease using
flexible analytic wavelet transform applied on ECG signals Biomedical Signal Processing and
Control 31 301-308
[6] Rangayyan R M 2015 Biomedical Signal Analysis (John Wiley & Sons)
[7] Rawald T, Sips M and Marwan N 2017 PyRQA – Conducting recurrence quantification analysis on
very long time series efficiently Computers & Geosciences 104 101-108
[8] Kantelhardt J W 2001 Detecting long-range correlations with detrended fluctuation analysis Physica
A: Statistical Mechanics and its Applications 295(3-4) 441-454
[9] Chen T and Guestrin C 2016 XGBoost: a scalable tree boosting system Proc. of the 22nd ACM
SIGKDD International Conference on Knowledge Discovery and Data Mining (ACM)
[10] Hjorth B 1970 EEG analysis based on time domain properties Electroencephalography and Clinical
Neurophysiology 29(3) 306-310
[11] De Cooman T, Carrette E, Boon P, Meurs A and Van Huffel S 2014 September Online seizure
detection in adults with temporal lobe epilepsy using single-lead ECG Proceedings of the 22nd European
Signal Processing Conference 1532-1536
[12] Uspensky V 2008 Theory and practice of diagnosis of diseases of internal organs by the method of
information analysis of electrocardio signals (Moscow: Economics and Informatics) p 116 (in Russian)








[13] Pedregosa F 2011 Scikit-learn: machine learning in Python Journal of Machine Learning
Research 12 2825-2830
[14] Ripoll V J R 2016 ECG assessment based on neural networks with pretraining Applied
Soft Computing 49 399-406
[15] Al-Rfou R, Alain G, Almahairi A, Angermueller C, Bahdanau D and Ballas N 2016
Theano: a Python framework for fast computation of mathematical expressions Preprint
arXiv:1605.02688
[16] Dieleman S 2015 Lasagne: First release (Geneva, Switzerland: Zenodo)
[17] Wang J, Liu P, She M F, Nahavandi S and Kouzani A 2013 Bag-of-words representation for
biomedical time series classification Biomedical Signal Processing and Control 8(6) 634-644
[18] Liu C L 2010 A tutorial of the wavelet transform (Taiwan: NTUEE)
[19] Mikolov T 2013 Efficient estimation of word representations in vector space (Scottsdale,
Arizona: ICLR Workshop)
[20] Rehurek R and Sojka P 2010 Software framework for topic modelling with large corpora
Proc. of the LREC Workshop on New Challenges for NLP Frameworks
[21] Civera M, Zanotti Fragonara L and Surace C 2016 Using bispectral analysis and neural
networks to localise cracks in beam-like structures Proc. of the 8th European Workshop on
Structural Health Monitoring 1542-1551
[22] Nikias C L and Raghuveer M R 1987 Bispectrum estimation: a digital signal processing
framework Proc. of the IEEE 75(7) 869-891
[23] Al-Fahoum A, Al-Fraihat A and Al-Araida A 2014 Detection of cardiac ischaemia
using bispectral analysis approach Journal of medical engineering & technology 38(6)
311-316
[24] Dawid A P and Skene A M 1979 Maximum likelihood estimation of observer error-rates using
the EM algorithm Applied Statistics 28(1) 20-28
[25] Sokolova M, Japkowicz N and Szpakowicz S 2006 Beyond accuracy, F-score and ROC: a family
of discriminant measures for performance evaluation Australian conference on artificial intelligence
4304 1015-1021



