Clustering Time Series of Complex Dynamics by Features
Lyudmyla Kirichenko 1, Oksana Pichugina 2 and Hlib Zinchenko1

1 Kharkiv National University of Radio Electronics, 14 Nauki ave., 61166, Kharkiv, Ukraine
2 National Aerospace University "Kharkiv Aviation Institute", 17 Chkalova Street, 61070, Kharkiv, Ukraine

               Abstract
               The paper presents an analysis of feature-based clustering for time series with complex dynamics.
               The data for clustering were time-dependent chaotic realizations for different chaos regimes.
               Time-series clustering was carried out on datasets of statistical indicators of the time series,
               which characterize the dynamics of change in the series and the probability distribution of its
               elements. Clustering was performed using the k-means method with different sets of parameters.
               The results of the study showed a high accuracy of clustering, even for close chaotic regimes.

               Keywords
               Time series clustering by features, machine learning, clustering, k-means algorithm, logistic
               mapping, chaotic time series

1. Introduction

    Nowadays, huge information databases are created and constantly updated every minute. A
significant part of the data consists of sequences ordered in time, i.e., time series. The necessity to
process large amounts of data quickly and qualitatively requires developing new approaches utilizing
artificial intelligence methods, particularly machine learning. Time series problems are among the
non-trivial and complex machine learning problems. One of them is time series clustering. Clustering
of time series can be used as a preliminary stage of data processing, followed by the application of data
mining methods. Also, clustering can be a separate task for analyzing the nature of time series [1-4].
    Time-series clustering is used in various fields of research and technology for different purposes,
such as partitioning information flows, detecting anomalies, comparing subsequences, indexing, etc.
[5-9]. Currently, there are many fundamentally different approaches to clustering time series
[1, 2, 10-12]. Among them is time series clustering by features [11, 13-16]. This approach is based on
deriving features representing the nature of time series and using them in subsequent clustering of the
series. Due to its power, this approach was chosen for implementation in the presented work. The paper
pays special attention to selecting features that characterize the time-series dynamics and the
probability distribution law of the general population represented by the time series.
    For clustering, one can use the following conventional algorithms: the k-means algorithm,
hierarchical methods, density-based approaches, etc. [1, 2, 11, 17]. One of the most popular in practice
is the k-means method [17-19]. Taking into account its simplicity, clarity, and flexibility to various
modifications, the k-means method was chosen for use in the current work. An important issue when
solving machine learning problems, such as clustering, is the choice of data used to implement new
approaches and algorithms. The use of simulated model data makes it possible to generate samples of
time-dependent realizations with certain predefined properties and sizes. In the presented work, chaotic
realizations are chosen as input data, since they are often used as models of various biological and
medical signals [20]. The purpose of this research is to study the clustering of time-dependent data
realizations with complex dynamics, utilizing clustering by features that are understandable and easily
evaluable while qualitatively representing the nature of the time series.

Information Technology and Implementation (IT&I-2021), December 01-03, 2021, Kyiv, Ukraine
EMAIL: lyudmyla.kirichenko@nure.ua (L. Kirichenko); oksanapichugina1@gmail.com (O. Pichugina);        hlib.zinchenko@nure.ua
(H. Zinchenko)
ORCID: 0000-0002-2780-7993 (L. Kirichenko); 0000-0002-7099-8967 (O. Pichugina)
            ©️ 2022 Copyright for this paper by its authors.
            Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
            CEUR Workshop Proceedings (CEUR-WS.org)




2. Problem statement
   We are given a time series set involving several time series classes, where the number of classes is
known in advance. It is required to cluster the time series. To compare the similarity of time series, we
will use statistical characteristics of each time series. Thus, a set of characteristics is associated with
each series. These sets of metrics will serve as inputs for clustering.
   Problem formulation. Consider a space $X$ whose objects are time series $x(t)$, where $t \in [0, T]$ is
a time interval. Let there be some sample $X^l = \{x_i(t)\}_{i=1}^{l}$ from the time series space $X$,
which needs to be partitioned, i.e., divided into disjoint subsets (clusters) $y_i \in Y$, $i = \overline{1,l}$,
where $Y$ is a cluster space, so that each cluster consists of objects similar with respect to certain
properties, while elements of different clusters differ significantly. Let us move from the time series
space $X$ to the space $Ch^l = \{ch_i\}_{i=1}^{l}$ of the series' features, where
$ch_i = \{par_{i,m}\}_{m=1}^{n}$ and $par_{i,m}$ is an observation of the $m$-th statistical parameter
(metric) of the time series $i$ ($TS_i$). Thus, the problem of clustering time series in a set $X^l$, for
which dynamics in time is the main feature, is reduced to clustering the sets (vectors) of parameters
forming a sample $Ch^l$, for which there is no dependence on time. Now, the time series clustering
problem can be reformulated as follows: for a sample $Ch^l$ of parameter vectors and a given function
$\rho(par_i, par_j)$ of the distance between the vectors, it is necessary to find a set of clusters $Y$ and
a clustering algorithm such that each cluster contains objects close to each other in the metric $\rho$,
while elements of different clusters differ significantly.

3. Clustering: feature selection and method
    3.1.         Feature selection
    By definition, any time series is a set of time-ordered values of some variable or variables. That is
why, on the one hand, it is possible to compactly characterize the series by certain descriptive statistical
indicators. On the other hand, since observations in a time series are ordered, various specific
characteristics concerning their dynamics can be taken into consideration.
    When a time series is considered as an ordinary statistical sample, the probability distribution law
plays the role of its main distinctive characteristic. Thus, theoretically, time series included in one
cluster should have quite close probability distributions. Since probability distributions are
characterized by type and numerical parameters, in many cases it is sufficient to compare the
parameters to establish the difference between the distributions. Estimates of these parameters can be
derived from the data set, and an assumption about the probability distribution can be made based on
them. After that, establishing the proximity of probability distributions is reduced to establishing the
proximity of the corresponding vectors of parameter estimates.
    Also, the presence of an internal connection between the elements of a time series makes it possible
to obtain specific numerical characteristics of its dynamics. Based on these characteristics, it becomes
possible to make assumptions about the nature and features of the series dynamics. Therefore, based on
the ordering of observations in time series, it is possible to cluster time series based on the proximity of
the values of dynamic indicators. Naturally, a problem arises of choosing statistical and dynamic
quantitative indicators as the samples' numerical features. These should be understandable and easy to
calculate, while analysis of their similarity should allow qualitatively distinguishing clusters of time
series [21, 22]. Let us consider some characteristics of time series that can quantitatively characterize
the probability distribution.


    3.2.         A quantitative characteristic of the probability distribution
    Let us consider a numerical characteristic of a random variable that depends only on its probability
distribution and completely defines the distribution type, i.e., serves as an identifier of the probability
distribution type. Assuming that such a numerical characteristic exists, a statistical estimate of this
metric makes it possible to determine or assess the probability distribution law type from sample
values drawn from the random variable's population.
    Let a random variable $\xi$ be represented by a sample of length $n$ from the general population,
and let $R_\xi(n)$ be its range. Then, as a numerical characteristic of the probability distribution law
type, one can take the following function, called the probability distribution identifier:
$$Id_\xi(n) = \frac{\left(E[R_\xi(n)]\right)^2}{Var[\xi]}, \qquad (1)$$
where $E[R_\xi(n)]$ is the mathematical expectation of the range $R_\xi(n)$, $Var[\xi]$ is the variance
of the random variable $\xi$, and $n$ is the sample length. Expression (1) determines a functional
characteristic of a probability distribution law type.
    For the practical application of the identifier $Id_\xi(n)$ in the analysis of numerical series, its
estimate
$$\hat{Id} = \frac{(x_{\max} - x_{\min})^2}{S^2} \qquad (2)$$
is used, where $x_{\max}$ is the largest value of the time series, $x_{\min}$ is its smallest value, and
$S^2$ is the unbiased sample variance of the time series.
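    As an illustration, estimate (2) can be computed directly from a sample. The sketch below assumes
NumPy; the function name is ours:

```python
import numpy as np

def distribution_identifier(x) -> float:
    """Estimate (2): squared sample range over the unbiased sample variance."""
    x = np.asarray(x, dtype=float)
    sample_range = x.max() - x.min()           # x_max - x_min
    return sample_range ** 2 / x.var(ddof=1)   # ddof=1 gives the unbiased S^2
```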

        3.2.1. The coefficient of variation
   The coefficient of variation (relative standard deviation) is a relative measure of variation that
allows comparing the variation of random variables with different mathematical expectations. It is
calculated by the formula
$$V_\xi = \frac{\sigma[\xi]}{E[\xi]}, \qquad (3)$$
where $\sigma[\xi]$ is the standard deviation of a random variable and $E[\xi]$ is its mathematical
expectation. The estimate of the coefficient of variation of a time series is determined as follows:
$$\hat{V} = \frac{S}{\bar{x}}, \qquad (4)$$
where $S$ is the unbiased sample standard deviation evaluated on the time series and $\bar{x}$ is the
mean value of the time series.
    The coefficient of variation is one of the characteristics of the tails of the probability density
function. Therefore, it can be seen as a quantitative characteristic of the probability distribution law.
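    A corresponding one-line sketch of estimate (4), under the same assumptions (NumPy, a nonzero
sample mean):

```python
import numpy as np

def coefficient_of_variation(x) -> float:
    """Estimate (4): unbiased sample standard deviation over the sample mean."""
    x = np.asarray(x, dtype=float)
    return x.std(ddof=1) / x.mean()  # assumes the sample mean is nonzero
```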

        3.2.2. The autocorrelation coefficient
   Let us consider characteristics of time series that enable one to quantitatively characterize the
dynamics of change in a time series. For the analysis of ordered data sets, it makes sense to use the
value of the autocorrelation function, which shows the relationship between sequences of values of
the same time series taken with a shift in time. The normalized autocorrelation function has the form
$$K(\tau) = \frac{1}{K(0)} \cdot \frac{1}{T} \int_{0}^{T-\tau} (x(t) - m_x)(x(t+\tau) - m_x)\,dt, \qquad (5)$$
where $T$ is the time series length; $\tau$ is the time shift; $x(t)$ is the series value at the moment $t$;
$x(t+\tau)$ is the value of the series at the moment $t+\tau$; $m_x$ is the mathematical expectation.
   As an estimate of the normalized autocorrelation function of a numerical series from $X$ of
length $n$, the following expression can be taken:
$$r[k] = \frac{\sum_{i=k+1}^{n} (x[i] - \bar{x})(x[i-k] - \bar{x})}{\sum_{i=1}^{n} (x[i] - \bar{x})^2}, \qquad (6)$$
where $r[k]$ is the sample autocorrelation coefficient with lag $k$; $\bar{x}$ is the sample mean;
$x[i]$ is the value of the series at the $i$-th moment of time (the $i$-th observation); $n$ is the number
of observations.
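    A direct sketch of (6) for a lag $k \ge 1$ (NumPy assumed, function name ours):

```python
import numpy as np

def autocorrelation(x, k: int = 1) -> float:
    """Sample autocorrelation coefficient (6) with lag k >= 1."""
    x = np.asarray(x, dtype=float)
    xc = x - x.mean()                                      # centered series
    return float((xc[k:] * xc[:-k]).sum() / (xc ** 2).sum())
```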

        3.2.3. The number of time series inversions
    To find the number of inversions in a time series, we count how many times the inequality
$x_i > x_j$ with $i < j$ is satisfied in the sequence of its elements. Each such inequality is associated
with an inversion, and their total number is denoted by $A$.
    Suppose a time series has length $n$ and its elements are independent realizations of a random
variable. In that case, the total inversion number is a tabulated random variable with the following
mathematical expectation and variance:
$$m_A = \frac{n(n-1)}{4}, \qquad \sigma_A^2 = \frac{2n^3 + 3n^2 - 5n}{72}.$$
    The number of inversions is a feature used for detecting a monotonic trend in a time series.
Accordingly, it can be used as a characteristic of the dynamics of a time series.
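    A small sketch counting $A$ by comparing every element with all later ones; this is quadratic in
$n$, which is acceptable for series of a few hundred values (NumPy assumed):

```python
import numpy as np

def inversion_count(x) -> int:
    """Number of inversions A: pairs (i, j) with i < j and x[i] > x[j]."""
    x = np.asarray(x, dtype=float)
    greater = x[:, None] > x[None, :]        # greater[i, j] = (x[i] > x[j])
    return int(np.triu(greater, k=1).sum())  # count only pairs with i < j
```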

        3.2.4. The number of series in time series
   To calculate the number of series $S$ in a time series, all elements of the series are assigned to
one of two disjoint classes by comparison with some constant $const$ serving as a threshold value.
Namely, if $x_i > const$, the observation $x_i$ is considered as belonging to the positive class, and
when $x_i \le const$, it is assigned to the negative class. After that, we count the number of series
(i.e., groups of consecutive values of the positive or negative class) in the time series. The number of
series in a sequence makes it possible to detect whether its elements are independent observations or
whether the sequence, such as a time series, has a trend.
    Suppose the time series has length $n$ and its elements are independent realizations of a random
variable. In that case, if the threshold constant is equal to the series median, the number of series $S$
is a tabulated random variable with mathematical expectation and variance
$$m_S = \frac{n}{2} + 1, \qquad \sigma_S^2 = \frac{n(n-2)}{4(n-1)}.$$
    The number of series in a time series as a dynamic indicator is used when searching for a trend and
exploring the dynamics of a time series.
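    A sketch of this count with the median threshold used above (NumPy assumed; the `const`
argument is our own generalization):

```python
import numpy as np

def runs_count(x, const=None) -> int:
    """Number of series S: maximal groups of consecutive observations lying
    on one side of the threshold (the sample median by default)."""
    x = np.asarray(x, dtype=float)
    threshold = np.median(x) if const is None else const
    positive = x > threshold                    # class of each observation
    # A new series starts wherever the class changes between neighbors.
    return int(1 + np.count_nonzero(positive[1:] != positive[:-1]))
```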

        3.2.5. The turning points’ number in time series
    Let a sequence of $n$ observations be given. We count the numbers of peaks and troughs, i.e.,
evaluate how many times the inequalities $x_i < x_{i+1} > x_{i+2}$ (a peak) or
$x_i > x_{i+1} < x_{i+2}$ (a trough) are satisfied in the sequence. Each such true pair of inequalities
defines a turning point. Let $P$ be the total number of turning points in the time series. If a sequence
of $n$ observations consists of independent outcomes of a random variable, then the number of turning
points is a random variable with the following mathematical expectation and variance:
$$m_P = \frac{2}{3}(n-2), \qquad \sigma_P^2 = \frac{16n - 29}{90}.$$
   It is known that the number $P$ of turning points in a sequence of observations satisfying the above
conditions approaches a normal distribution as $n$ tends to infinity. The number $P$ is normally used
when searching for a trend. Evidently, it can also be used as a numerical characteristic of the dynamics
of a time series.
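    A vectorized sketch of the turning point count (NumPy assumed):

```python
import numpy as np

def turning_points(x) -> int:
    """Number of turning points P: interior local peaks and troughs."""
    x = np.asarray(x, dtype=float)
    left, mid, right = x[:-2], x[1:-1], x[2:]
    peaks = (mid > left) & (mid > right)      # x[i] < x[i+1] > x[i+2]
    troughs = (mid < left) & (mid < right)    # x[i] > x[i+1] < x[i+2]
    return int(np.count_nonzero(peaks | troughs))
```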

    3.3.          K-means method
   The k-means method is a Cluster Analysis algorithm that, for a given $k \ge 1$, aims to partition
observations represented in numerical form as vectors of $R^n$ into exactly $k$ clusters with centroids,
with each observation assigned to the cluster whose centroid is closest [18, 19].
   Consider observations $(x^{(1)}, x^{(2)}, \ldots, x^{(m)})$, $x^{(i)} \in R^n$.
   The k-means method divides the $m$ observations into $k$ groups ($k \le m$)
$C = \{C_1, C_2, \ldots, C_k\}$ by searching for the centroids minimizing the total deviation of cluster
points from the centroids of these clusters. This can be formalized as follows:
$$\min_{C} \sum_{i=1}^{k} \sum_{x^{(j)} \in C_i} \left\| x^{(j)} - \mu_i \right\|^2, \qquad (7)$$
where $\mu_i$ is the centroid of cluster $C_i$.
    From formula (7) it is seen that, as the norm $\|\cdot\|$, not only the Euclidean one can be used.
Depending on the choice of norm, different clustering results can be obtained.
    Consider an initial set of $k$ arbitrary means (centroids) $\mu_1^0, \mu_2^0, \ldots, \mu_k^0$ for
the clusters $C_1, C_2, \ldots, C_k$. At the first stage, the centroids can be selected randomly or
according to some rule; among such rules is choosing centroids that maximize the initial distances
between the clusters.
    We assign the observations to the clusters whose centroids are closest. Each observation belongs to
a single cluster, even if it could be assigned to two or more clusters.
    Then the centroid of each cluster is recalculated according to the following rule:
$$\mu_i^{1} = \frac{1}{|C_i|} \sum_{x^{(j)} \in C_i} x^{(j)}. \qquad (8)$$
   Thus, the k-means algorithm recalculates the centroids at each step for each cluster obtained at the
previous step.
   The algorithm terminates when the values $\mu_1^t, \ldots, \mu_k^t$ stop changing:
$$\mu_i^t = \mu_i^{t-1}, \quad i = \overline{1,k}, \qquad (9)$$
    where $\mu_i^t$ is the $i$-th centroid at iteration $t$.
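    A minimal NumPy sketch of this iteration, implementing steps (7)-(9) with the Euclidean norm and
random initial centroids. It is our own simplified implementation: it assumes no cluster ever becomes
empty, and in practice a library routine such as scikit-learn's KMeans would normally be preferred:

```python
import numpy as np

def k_means(points, k: int, max_iter: int = 100, seed: int = 0):
    """Minimal Lloyd-style k-means; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Initial centroids: k distinct observations chosen at random.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(max_iter):
        # Assignment step: each point goes to the nearest centroid (Euclidean norm).
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step (8): each centroid becomes the mean of its cluster
        # (the sketch assumes no cluster ever becomes empty).
        new_centroids = np.array([points[labels == i].mean(axis=0) for i in range(k)])
        # Termination (9): stop when the centroids no longer change.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids
```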


4. Input data
   In our experiment, model chaotic realizations were chosen as time series. Deterministic chaos is a
complex form of dynamics in nonlinear systems. The behavior of such a system is unpredictable despite
the absence of random action; it is determined by the sensitive dependence of the system dynamics on
tiny changes in initial conditions. Numerous studies show that many real dynamic phenomena in nature
and technology are well described by chaotic models [23-25]. In particular, many biological and
medical signal realizations, such as EEG and ECG, have chaotic properties. These peculiarities depend
on a person's physiological state, and chaotic processes make it possible to qualitatively simulate and
explore them [26-28].
   One well-known model example of a chaotic mapping is the logistic one:
$$x_{n+1} = A x_n (1 - x_n), \qquad (10)$$
where $A \in (0, 4]$ is the bifurcation parameter and $x_n \in [0, 1]$.
    Figure 1 shows the bifurcation diagram of mapping (10) for values of the parameter
$A \in [3.43, 4]$. To construct the bifurcation diagram, the values of $A$ are set sequentially along the
Ox axis with a certain small step. Then, for each parameter value, a certain number of iterations of the
mapping is performed until the steady state (attractor) is attained. After that, the values of $x$ obtained
in the iterations are depicted on the plot.




Figure 1: Bifurcation diagram for mapping (10)

     It has been observed that at values $A > 3.569$ a chaotic regime begins, interspersed with windows
of regular dynamics. For bifurcation parameter values $A > 3.9$, the chaotic regime can be considered
developed. The time-dependent realizations obtained for these values of $A$ are the most relevant
for modeling physiological dynamics. Figure 2 a)-c) shows realizations of the logistic mapping for
the parameter values $A = 3.9$, $A = 3.95$, and $A = 4$, respectively.
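    A sketch of how such realizations can be generated from (10); the initial value x0 and the number
of discarded transient iterations are our own illustrative choices:

```python
import numpy as np

def logistic_series(a: float, n: int, x0: float = 0.4, discard: int = 1000) -> np.ndarray:
    """Realization of the logistic mapping (10): x_{n+1} = A * x_n * (1 - x_n).
    The first `discard` iterations are dropped so the orbit settles on the attractor."""
    x = x0
    for _ in range(discard):          # transient iterations
        x = a * x * (1.0 - x)
    out = np.empty(n)
    for i in range(n):
        x = a * x * (1.0 - x)
        out[i] = x
    return out
```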

5. Experiment description
   In this study, a numerical experiment was conducted to explore the effectiveness of applying the
k-means method to clustering time series with chaotic dynamics. We considered computer realizations
of the logistic mapping (10) for different bifurcation parameter values $A$ in the chaotic regime. The
corresponding sets of time series with the same parameter $A$ were simulated; in the experiment, these
sets played the role of clusters. Clustering was carried out by the k-means method. The number of
clusters and the lengths of the time series were also varied in the experiment. Before the experiment,
the statistical properties of the features proposed for inclusion in the time-independent data set were
studied in order to select the most significant ones for comparing and partitioning chaotic realizations.
For better visualization and understanding, only three features were selected from the entire set listed
above. They are characterized by the largest variation of mean values and the smallest sample standard
deviations across the different chaotic regimes. In the case under consideration, these features are the
values of $P$ (the number of turning points), $R$ (the autocorrelation coefficient with lag 1), and
$\hat{Id}$ (the probability distribution type identifier) for time series of 200 values (Table 1).
   For each time series, the values $P$, $R$, $\hat{Id}$ were found, and the vectors of these values
were used as input data for clustering. The small number of features allows visualizing the clustering
process in 3-dimensional space. Since we generated the initial data for the experiment and knew the
actual parameter value $A$ for each time series, after clustering we identified two sets of clusters: the
reference one, given in advance at the generation stage, and the predicted one, formed by the clustering
algorithm. In this case, assessing the performance of the clustering approach is reduced to determining
the quality of binary classification on a set of pairs of objects (reference and obtained).
Table 1
The identifiers' sample statistics

            A        P mean    P std    Id mean    Id std    R mean    R std
            3.90     162       4.4      8.7        0.5       0.52      0.04
            3.95     155       4.4      9.4        0.5       0.33      0.05
            4.00     130       6.5      7.9        0.35      0.05      0.06
Figure 2: Time-dependent realizations of mapping (10) for the values $A \in \{3.9, 3.95, 4\}$ (panels
a)-c), respectively)
    Thus, we build and analyze the confusion matrix, which allows calculating the clustering accuracy
as the ratio of the number of objects whose predicted class coincides with the actual class to the total
number of objects. To obtain the accuracy correctly, for each clustering case associated with a chaos
mode and a given number of clusters, the number of time series in each cluster and the length of the
time series were set; then the accuracy value was evaluated for all the combinations and averaged over
multiple experiments.
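    To make the procedure concrete, the sketch below assembles the helper functions introduced earlier
(logistic_series, turning_points, autocorrelation, distribution_identifier) into a hypothetical end-to-end
run for two chaotic regimes. The use of scikit-learn's KMeans, the seeds, and the sample sizes are our
own illustrative choices, not the authors' exact setup:

```python
import numpy as np
from itertools import permutations
from sklearn.cluster import KMeans

def feature_vector(ts) -> np.ndarray:
    """Features (P, R, Id-hat) of one time series, as in Table 1."""
    return np.array([turning_points(ts),
                     autocorrelation(ts, k=1),
                     distribution_identifier(ts)])

rng = np.random.default_rng(0)
params, n_series, length = (3.9, 4.0), 20, 200

# Reference labels are known in advance, since we generate the data ourselves.
X = np.vstack([feature_vector(logistic_series(a, length, x0=rng.uniform(0.05, 0.95)))
               for a in params for _ in range(n_series)])
true = np.repeat(np.arange(len(params)), n_series)

pred = KMeans(n_clusters=len(params), n_init=10, random_state=0).fit_predict(X)

# Cluster labels are arbitrary, so accuracy is taken over the best label permutation.
accuracy = max(np.mean(np.array(p)[pred] == true)
               for p in permutations(range(len(params))))
print(f"clustering accuracy: {accuracy:.2f}")
```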

6. The experiment results and discussion
    Let us outline the main results of our experiment. Figure 3 provides a visualization of the k-means
clustering for two sets of time-dependent data realizations. The first set contained 20 realizations
obtained for the bifurcation parameter $A = 3.9$; the second also included 20 realizations, for the value
$A = 4$. Typical realizations are shown in Figures 1 and 2.
    Figure 3 a) shows the clustering of the parameter vectors obtained from time series of length 100.
In this case, the statistical estimates of the chosen numerical features have a rather large error, and, as
a result, several observations (and the corresponding time series) fall into the wrong clusters.
    Figure 3 b) depicts the clustering results for time series of length 200. In this case, the overwhelming
majority of experiments yield a 100% correct partition; only in a few experiments were 1-2 time series
misclustered. Figure 3 c) demonstrates the results for time series of 300 values. In this case, all the time
series were clustered correctly in all experiments.
    While Figure 3 illustrates data clustering with a sufficiently large variation of parameters, Figure 4
shows the clustering results for two quite close chaotic regimes, with $A = 3.9$ and $A = 3.95$,
respectively. In this case, the length of the time series was equal to 200. The experiments have shown
that even time series that are quite close in their chaos parameters are still clustered with high accuracy.
The experiments also indicate that the clustering quality does not depend much on the number of
clusters. Figure 5 shows the clustering of time series of the three chaotic regimes listed above.
    Table 2 shows the values of clustering accuracy, calculated as the ratio of the number of objects
whose predicted cluster coincides with the reference clustering to the total number of clustering objects.
Table 2
Clustering accuracy

                        Time series             Length    Accuracy
                        A1 = 3.9, A2 = 4        100       0.88
                        A1 = 3.9, A2 = 4        200       0.94
                        A1 = 3.9, A2 = 4        300       0.98
                        A1 = 3.9, A2 = 4        500       1
                        A1 = 3.9, A2 = 3.95     100       0.74
                        A1 = 3.9, A2 = 3.95     200       0.92
                        A1 = 3.9, A2 = 3.95     300       0.95
                        A1 = 3.9, A2 = 3.95     300       0.99

7. Conclusion
    The paper presents an approach to clustering time series based on singling out essential time-free
numerical characteristics of the series related to the probability distribution of the data and the features
of its dynamics. The time series are then treated as standard numerical samples to which conventional
clustering methods can be applied. The approach's effectiveness is demonstrated by the results of a
computational experiment on clustering simulated time series with chaotic dynamics. As input data,
we used time-dependent realizations of the logistic mapping for different chaotic regimes.
    Statistical indicators of time series, which characterize the probability distribution and the dynamics
of the series, are easily calculated, making it possible to work with real-world dynamic data and cluster
them in real time. The popular k-means algorithm was chosen as a conventional clustering method. The
study results demonstrate a high accuracy of clustering for different data, including data from close
chaotic regimes. It is shown that the clustering accuracy strongly depends on the length of the time
series. It is expedient to apply the proposed approach to clustering time realizations with complex
irregular dynamics, in particular, those exhibiting chaotic behavior. Such time realizations include, for
example, medical and biological signals such as EEG and ECG. Further research will be focused on
applying the proposed approach to clustering real time series.


Figure 3: Clustering results for $A = 3.9$ and $A = 4$: a) time series length is 100; b) time series length
is 200; c) time series length is 300



Figure 4: Clustering results for $A = 3.9$ and $A = 3.95$
Figure 5: Clustering for three types of chaos: $A = 4$, $A = 3.9$, and $A = 3.95$

8. References
[1] S. Aghabozorgi, A. S. Shirkhorshidi, T. J. Wah, Time-series clustering – A decade review,
    Information Systems 53 (2015) 16-38.
[2] C. Aggarwal, C. Reddy, Data Clustering: Algorithms and Applications, CRC Press, 2013.
[3] T.W. Liao, Clustering of time series data – a survey, Pattern Recognition, 38 (11) (2005), 1857-
    1874. doi: 10.1016/j.patcog.2005.01.025.
[4] S. Rani, G. Sikka, Recent Techniques of Clustering of Time Series Data: A Survey, International
    Journal of Computer Applications 52 (15) (2012) 1-9. doi: 10.5120/8282-1278
[5] L. Kirichenko, T. Radivilova, Analyzes of the distributed system load with multifractal input data
    flows, in Proceedings of the 14th International Conference The Experience of Designing and
    Application of CAD Systems in Microelectronics (CADSM), Lviv, 2017, pp. 260-264, doi:
    10.1109/CADSM.2017.7916130.
[6] P. Grabusts, A. Borisov, Clustering methodology for time series mining, Scientific Journal of Riga
    Technical University 40 (2009) 81-86. doi: 10.2478/v10143-010-0011-0.


[7] O. Pichugina, N. Muravyova, Data Batch Processing: Modelling and Applications, in: 2020 IEEE
     International Conference on Problems of Infocommunications. Science and Technology (PIC S&T),
     2020, pp. 765-770. doi: 10.1109/PICST51311.2020.9467928.
[8] B. Farzad, O. Pichugina, L. Koliechkina, Multi-Layer Community Detection, in: 2018
     International Conference on Control, Artificial Intelligence, Robotics & Optimization (ICCAIRO),
     2018, pp. 133-140. doi: 10.1109/ICCAIRO.2018.00030.
[9] V.V. Lesnykh, V.S. Petrov, T.B. Timofeeva, Classification and modeling of intersystem accidents
     in critical infrastructure systems, Advanced Mathematical Techniques in Science and Engineering
     (2018) 33–56.
[10] E. E. Özkoç, Clustering of Time-Series Data, in: Data Mining – Methods, Applications and
     Systems, IntechOpen. doi: 10.5772/intechopen.84490.
[11] A.      M.       Alonso,      Time     series    clustering,    in:     ASDM       (2019),    URL:
     http://halweb.uc3m.es/esp/Personal/personas/amalonso/esp/ASDM-C02-clustering.pdf.
[12] I. Ivanisenko, L. Kirichenko, T. Radivilova, Investigation of multifractal properties of additive
     data stream, in Proceedings of the 2016 IEEE First International Conference on Data Stream
     Mining & Processing (DSMP), Lviv, 2016, pp. 305-308, doi: 10.1109/DSMP.2016.7583564.
[13] D. Tiano , A. Bonifati, R. Ng, FeatTS: Feature-based Time Series Clustering, in Proceedings of
     the 2021 International Conference on Management of Data (SIGMOD/PODS '21), 2021,
     pp. 2784–2788, doi: 10.1145/3448016.3452757.
[14] M. Faraggi, Time series features extraction using Fourier and Wavelet transforms on ECG data.
     URL:        https://slacker.ro/2019/11/23/time-series-features-extraction-using-fourier-and-wavelet-
     transforms-on-ecg-dat.
[15] J.A. Vilar, S. Pértega, Discriminant and cluster analysis for Gaussian stationary processes: Local
     linear fitting approach, J. Nonparametr. Stat., 16 (2004) 443–462.
[16] P. D’Urso, E.A. Maharaj, A.M. Alonso, Fuzzy Clustering of Time Series using Extremes, Fuzzy
     Sets and Systems, 318, (2004) 56–79.
[17] M. G. H. Omran, A. P. Engelbrecht, A. Salman, An overview of clustering methods, Intelligent
     Data Analysis 11 (6) (2007) 583-605. doi: 10.3233/IDA-2007-11602.
[18] L. Morissette, C. Sylvain, The k-means clustering technique: General considerations and
     implementation in Mathematica (2013), doi:10.20982/tqmp.09.1.p015.
[19] C. Zhang, Z. Fang, An improved K-means clustering algorithm. Journal of Information and
     Computational Science, 10 (2013) 193-199.
[20] H. G. Schuster, Deterministic Chaos: An Introduction, VCH Publishers, New York, 1988.
[21] D. C. Montgomery, G. C. Runger, Applied Statistics and Probability for Engineers, John Wiley
     & Sons, Inc., 2003.
[22] G. Cowan, Statistical Data Analysis, 1st edition. Oxford : New York: Clarendon Press, 1998.
[23] H.-O. Peitgen, H. Jürgens, Chaos and Fractals: New Frontiers of Science, 2nd ed. Springer-Verlag
     New York, Inc., 2004
[24] L. Kirichenko, R. Tsekhmistro, O. Krug, A. Storozhenko, Comparative analysis of pseudorandom
     number generation in the up-to-date wireless data communication, Telecommunications and Radio
     Engineering 70 (4) (2011) 325-333. doi: 10.1615/TelecomRadEng.v70.i4.20.
[25] J.E. Skinner, M. Molnar, T. Vybiral, M. Mitra, Application of chaos theory to biology and
     medicine. Integr Physiol Behav Sci., 27(1) (1992) 39-53, doi: 10.1007/BF02691091.
[26] J.J. Wright, R.R. Kydd, D.T.J. Liley, EEG Models: Chaotic and Linear. Psycoloquy: 4(60) (1993)
     EEG Chaos.
[27] Ju. Ulbikas, A. Cenys, O. Sulimova, Chaos parameters for EEG analysis, Nonlinear Analysis.
     Modelling and Control. 3 (1998) 141-148. doi: 10.15388/NA.1998.3.0.15263.
[28] J. E. Jacob, A. Cherian, K. Gopakumar, T. Iype, D. G. Yohannan, K. P. Divya, Can Chaotic Analysis
     of Electroencephalogram Aid the Diagnosis of Encephalopathy? Neurology Research
     International (2018), Article ID 8192820. doi: 10.1155/2018/8192820.



