SE4TeC: A Scalable Engine For Efficient and Expressive Time Series Classification

Jingwei Zuo, Karine Zeitouni, Yehia Taher
DAVID Lab, University of Versailles, University of Paris-Saclay, Versailles, France
jingwei.zuo@uvsq.fr, karine.zeitouni@uvsq.fr, yehia.taher@uvsq.fr

Abstract—Time Series (TS) data are ubiquitous in numerous application fields, such as medicine, multimedia, and finance. In this paper, we present SE4TeC: a Scalable Engine for efficient and expressive Time Series Classification, which brings novel optimizations to the state-of-the-art shapelet-based algorithm: (i) more efficient feature extraction; (ii) better interpretability of both the feature extraction and the classification process; (iii) scalability in the context of Big Data. SE4TeC's effectiveness and efficiency are experimentally demonstrated over real-life datasets.

I. INTRODUCTION

Tony is informed of an unhealthy heart status on the basis of his electrocardiogram (ECG) during a medical diagnosis. An experienced doctor can easily correlate the abnormal ECG with the disease, then explain to Tony the symbolic abnormality in the ECG and the relevant treatment. Nowadays, Machine Learning techniques can partly replace the role of an experienced doctor and perform the diagnosis very accurately. In Tony's case, the electrical activity of the heart is collected from a physiological sensor as a Time Series (TS). From the perspective of the machine, the diagnosis can be considered as a Time Series Classification (TSC) problem.

The classical approaches [1], [2] to TSC problems are usually based on statistical features extracted from the time series, such as the mean and standard deviation of subsequences, which are assumed to represent the global characteristics of the series. Intuitively, such features capture very superficial information with low noise tolerance. To address these limitations, the Shapelet [3] has attracted great interest over the past years, owing to its highly discriminative nature and good interpretability. However, extracting shapelets from data series comes at a large computational cost: even for small-sized datasets, the algorithm can take days. This is mainly due to the repeated similarity search between a subsequence (i.e., a candidate shapelet) and the TS instances in the database. Typical speed-up techniques (i.e., indexing [4], lower bounding [5], and early abandoning [3]) always introduce extra parameters, which are difficult to set without prior knowledge. Low-dimensional representation methods have also been proposed, such as Piecewise Aggregate Approximation (PAA) [6], which computes the mean of each subsequence of a given length and transforms the raw data into coarse-grained sub-components. Symbolic Aggregate approXimation (SAX) [7] transforms subsequences of the raw time series into value-characterized symbols, which enables a hierarchical index, iSAX [8], to accelerate similarity search. Fast Shapelet [9] proves its efficiency, and even its scalability, based on SAX and random projection. However, a loss of feature information is unavoidable after the reduction of dimensionality, which requires a trade-off between efficiency and accuracy.

Our work leans towards raw time series processing, which yields higher accuracy but has a relatively high time complexity [3], [10]. Unlike hardware-based implementations, such as using GPUs to accelerate the similarity calculation [11], we focus here on the scalability of TSC based on shapelet extraction. Traditional TSC algorithms on raw TS data are not applicable in a big data context because of their low scalability. Our contributions in this paper are:
1) We propose a novel method to evaluate shapelets in batches.
2) We introduce a scalable engine to extract shapelets expressively.
3) Based on the scalable engine, we propose an acceleration strategy to extract shapelets more efficiently.

The rest of this paper is organized as follows. In Section 2, we review the background and state the research problems. We present our scalable engine for Time Series Classification in Section 3. Section 4 shows an empirical evaluation and a performance comparison with rival solutions. Finally, we give our conclusions and perspectives for future work in Section 5.

II. BACKGROUND

A. Definition and notation

We start by defining shapelets and the other notions used in the paper.

Definition 1: A Time Series T is a sequence of real-valued numbers T = (t_1, t_2, ..., t_i, ..., t_n), where n is the length of T.

Definition 2: A subsequence T_{i,m} of T is a continuous subset of values from T of length m starting at position i: T_{i,m} = (t_i, t_{i+1}, ..., t_{i+m-1}), where i ∈ [0, n − m + 1].

Definition 3: A Shapelet ŝ is a time series subsequence whose shape is particularly representative of a class. As such, it constitutes an interpretable feature which can distinguish one class from the others.

Fig. 2: Matrix Profile between a source time series T and a target time series T′, where n is the length of T. Intuitively, MP_i shares the same offset as the source T.

Definition 4: A Dataset D is a collection of time series T_i and their class labels c_i. Formally, D = {<T_1, c_1>, ..., <T_N, c_N>}, where N is the number of instances in D. C = {c_1, c_2, ..., c_|C|} is the collection of class labels, where |C| denotes the number of labels.

Definition 5: The Z-Normalized Time Series is a formal representation of a Time Series, defined as ZNormal(T) = (T − µ)/σ, where µ is the sample mean and σ the standard deviation:

  µ = (1/m) Σ_{i=1}^{m} t_i ,   σ² = (1/m) Σ_{i=1}^{m} t_i² − µ²   (1)

Z-Normalization allows us to focus on the structural features of T rather than on its amplitude, and it addresses the problem of data stability. For instance, the Euclidean Distance (ED) between two time series T_{x,m}, T_{y,m} is expressed as follows:

  ED_{x,y} = sqrt( Σ_{i=1}^{m} (t_{x,i} − t_{y,i})² )   (2)

Small changes (e.g., noise with a peak value) can cause an evident bias in this result; Z-Normalization is a way of smoothing out that bias.

Definition 6: The Normalized Euclidean Distance (N-ED) is expressed by the formula sqrt( (1/m) Σ_{i=1}^{m} (t_{x,i} − t_{y,i})² ).

Definition 7: The Distance Profile DP_i is a vector which stores the Normalized Euclidean Distance between a given subsequence/query T_{i,m} and every subsequence T′_{j,m} of a target Time Series T′. Formally, DP^m_{i,j} = dist(T_{i,m}, T′_{j,m}), ∀j ∈ [0, n′ − m + 1].

B. Evaluation of candidate Shapelets

The quality of a candidate shapelet ŝ can be assessed by its ability to separate the instances of the different classes in the dataset D. A prerequisite of the quality measure for ŝ is that a set of distances D_ŝ must be calculated, where D_ŝ = {D_{ŝ,1}, D_{ŝ,2}, ..., D_{ŝ,n}} and n is the number of time series T in the dataset D.

Information Gain, an evaluation method based on Decision Trees, is widely adopted in previous works [3], [10], [12]. An iterative test over D_ŝ is conducted to extract the split distance which brings the highest Information Gain. This distance is then kept as a property of the candidate shapelet, to check the inclusion between the candidate and a time series. Another simple approach is to use the F-Statistic [13], based on the difference of means in an Analysis of Variance (ANOVA). The main idea of this statistical method is to assess the difference in the distributions of ŝ's distances between the classes.

C. Problem Statement

The high degree of coupling inside classical TSC algorithms [10], [12] leads to the problem of not being able to parallelize them.
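To fix ideas, the distance notions above (Definitions 5-7) can be sketched in plain Python. This is an illustrative sketch of ours, not SE4TeC's actual code, and the function names are hypothetical:

```python
import math

def z_normalize(t):
    """Z-normalize a sequence (Definition 5): subtract the mean, divide by sigma."""
    m = len(t)
    mu = sum(v for v in t) / m
    var = max(sum(v * v for v in t) / m - mu * mu, 0.0)  # as in Eq. (1)
    sigma = math.sqrt(var)
    if sigma == 0:  # constant sequence: only the mean can be removed
        return [v - mu for v in t]
    return [(v - mu) / sigma for v in t]

def n_ed(x, y):
    """Normalized Euclidean Distance between equal-length sequences (Definition 6)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

def distance_profile(query, target):
    """Distance Profile (Definition 7): N-ED between the z-normalized query
    and every z-normalized subsequence of the target, computed naively."""
    m = len(query)
    q = z_normalize(query)
    return [n_ed(q, z_normalize(target[j:j + m]))
            for j in range(len(target) - m + 1)]
```

The naive loop above is the baseline that FFT-based similarity search such as MASS [14] accelerates: both produce the same vector, but MASS avoids re-scanning each subsequence.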
Definition 8: MASS, namely Mueen's ultra-fast Algorithm for Similarity Search, computes the Distance Profile based on the Fast Fourier Transform (FFT), and requires only O(n log n) time, rather than the O(nm²) time of classical N-ED similarity search.

Fig. 1: Distance Profile between a query T_{i,m} and a target time series T′, where n′ is the length of T′. Obviously, DP_{i,j} shares the same offset as the target T′.

Definition 9: The Matrix Profile MP is a vector of distances between each subsequence T_{i,m} of a source T and its nearest neighbor T′_{j,m} in a target T′. Formally, MP^m_i = min(DP^m_i), where i ∈ [0, n − m + 1].

Like the distance profile, the matrix profile can be considered as a meta time series annotating the source time series T. The profile has many interesting and exploitable properties. For example, the highest point on the profile corresponds to the time series discord, while the (tied) lowest points correspond to the positions of queries which have a similar match in the target time series.

Speed-up methods such as Early Abandoning [3] are based on the classical Euclidean Distance measure, which has a time complexity of O(N²n⁴), several orders of magnitude higher than MASS [14]: O(N²n³ log n). Another common trick played by previous work [3], [10], [12] is the following: if we know that ŝ is a low-quality candidate, then any subsequence ŝ′ similar to ŝ must also be of low quality, and therefore the costly computation of the distance set D_{ŝ′} (the evaluation of ŝ′) can be skipped. However, a candidate shapelet is evaluated by its quality ranking among all candidates of the same length. Assuming that the distributed nodes have generated a collection of candidates ŝ_l from the dataset, an aggregation operation between the nodes is required to extract the candidate with the best quality, and extra aggregations are needed at each iteration over the candidate length. Apparently, the acceleration brought by the classical pruning techniques can easily be offset by the communication cost caused by this aggregation.

III. SYSTEM OVERVIEW

Strategies should be adopted to make the system scalable. The main idea of our system is that the calculation should be shared and executed independently: the less communication between the nodes, the more powerful the algorithm.

Fig. 3: System overview (labeled TS database, shapelet extraction over the cluster nodes, early classifier, and prediction over the unlabelled TS database).

A. Main structure

Conventional Time Series classification problems are tackled with the nearest neighbor (1NN) algorithm due to its simple design. As shown in Figure 3, on the basis of 1NN, an early classifier [15] adopted in the system allows giving the prediction result as early as possible, without waiting for the entire sequence. The processing of the labelled Time Series data needs to be flexibly distributed over the nodes of the cluster. To this end, a suitable algorithm is applied which assigns computing tasks that are relatively independent of each other.

Algorithm 1: SMAP (Shapelet on MAtrix Profile)
Input: Dataset D, classSet Ĉ, k, offset
Output: Ŝ
1  minLength ← 2
2  maxLength ← getMinLen(D)
3  double[] DiscmP ← []; double[] DistThresh ← []
4  Ŝ ← ∅
5  D.cache() // cache the dataset in the cluster; each time series has a unique ID
6  MapPartition (set of <ID, T>: T_set)
7    for <ID, T> ∈ T_set do
8      for m ← minLength to maxLength do
9        DiscmP[m], DistThresh[m] ← computeDiscmP(T, D, m, offset)
10       DiscmP[m] ← DiscmP[m] · √(1/l)
11     DiscmP ← pruning(DiscmP)
12     emit(DiscmP, DistThresh)
13 MapAggregation (class, (DiscmP, DistThresh))
14   for c ∈ Ĉ do
15     Ŝ′ ← getTopk(DiscmP[c], DistThresh[c], k)
16     for ŝ ∈ Ŝ′ do
17       ŝ.matchingIndices ← getMatchingIndices(ŝ, D)
18     Ŝ ← Ŝ ∪ Ŝ′
19 return Ŝ

B. SMAP: Shapelet extraction on MAtrix Profile

The Matrix Profile provides meta-data which facilitates the representation of a complex correlation between two time series. SMAP takes the time series as the smallest processing unit between nodes, utilizes a normalized quality to extract the most important parts in each processing unit, and then merges them by an aggregation process. For this reason, the number of candidate shapelets can be greatly reduced. Moreover, a single aggregation operation is required to get the global shapelet result for the different classes. In line 5, the dataset is broadcast to the distributed nodes in order to reduce the communication cost caused by accessing the common data. Then, each cluster partition shares the computing tasks for a set of time series. The function computeDiscmP computes in batches the quality of the candidate shapelets; the process is visualized in Figure 4.

The batch quality of the instances in a time series is defined by the Discrimination Profile, which relies on the concept of Representative Profile:

  RP(T_i^C, D) = avg(MP_{T_i^C, T_j}), where T_j ∈ D^C   (3)

The Representative Profile thus targets the minimal processing unit (i.e., a time series) in SMAP, and is a vector of the Representative Power of each instance in the processing unit. Put simply, the Representative Power of a subsequence with respect to class C is its normalized distance to the global cluster of instances of class C. Intuitively, it represents the relevance between the subsequence (i.e., the candidate shapelet) and the class.

The fact that a subsequence is discriminative for its class against the others can be expressed by the difference of Representative Power between class C and the others (OVA, one-vs-all). The Discrimination Profile is then defined as follows:

  DiscmProfile(T_i^C, D) = −(RP(T_i^C, D^C) − RP(T_i^C, D^{!C}))   (4)

The quality normalization in line 10 allows assessing the Discrimination Power of shapelets of different lengths in a uniform way. Similar in purpose to Information Gain, the Discrimination Profile is a more interpretable technique for assessing candidate shapelets. Moreover, in this manner a split distance can be given directly, rather than iterating over every possible distance and selecting the one with the highest Information Gain, which takes O(N²n²) time. A strategy to check whether T contains a shapelet can then be defined as follows:

  sInT(T, ŝ^C) = { true, if dist(T, ŝ^C) ≤ RP(ŝ^C, D^C) ; false, otherwise }   (5)
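The quality measures of Equations (3)-(5) can be sketched as follows. This is a simplified, single-machine illustration of ours, not the distributed implementation: min_dist is a naive stand-in for one entry of the MASS-based Matrix Profile, and all function names are hypothetical:

```python
import math

def min_dist(query, series):
    """Nearest-neighbor z-normalized Euclidean distance from `query` to any
    subsequence of `series` (one entry of a Matrix Profile, Definition 9)."""
    m = len(query)
    def znorm(t):
        mu = sum(t) / len(t)
        sd = math.sqrt(max(sum(v * v for v in t) / len(t) - mu * mu, 0.0))
        return [(v - mu) / sd if sd > 0 else 0.0 for v in t]
    q = znorm(query)
    best = float("inf")
    for j in range(len(series) - m + 1):
        s = znorm(series[j:j + m])
        best = min(best, math.sqrt(sum((a - b) ** 2 for a, b in zip(q, s)) / m))
    return best

def representative_power(candidate, class_set):
    """Eq. (3): average nearest-neighbor distance of the candidate
    over the instances of one class."""
    return sum(min_dist(candidate, t) for t in class_set) / len(class_set)

def discrimination(candidate, same_class, other_classes):
    """Eq. (4): RP towards the other classes minus RP towards the own class;
    larger values mean a more discriminative candidate."""
    return (representative_power(candidate, other_classes)
            - representative_power(candidate, same_class))

def s_in_t(series, candidate, same_class):
    """Eq. (5): T is deemed to contain the shapelet if its distance to the
    candidate does not exceed the candidate's RP within its own class."""
    return min_dist(candidate, series) <= representative_power(candidate, same_class)
```

Note that discrimination returns RP(T_i^C, D^{!C}) − RP(T_i^C, D^C), i.e., exactly the sign convention of Equation (4): a candidate close to its own class and far from the others scores high.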
C. Acceleration strategy

The pruning function in line 11 reduces the number of candidate shapelets, and thereby the communication cost of the aggregation process. We can simply adopt a "TopK" strategy, which extracts the K biggest values of DiscmP. However, such a technique does nothing to lighten the computation inside the MapPartition process. Since each processing unit should be independent of the others, a tenable technique for updating the profile of a longer query can be adopted. The Lower Bounding distance [5] estimates the minimal possible Z-Normalized Euclidean Distance between two subsequences T_{i,l+k} and T_{j,l+k}, based on the distance already computed between T_{i,l} and T_{j,l}. Compared to the linear time complexity of computing the exact distance, the LB Distance Profile can be computed in constant time, which greatly accelerates the computation of the Matrix Profile in Figure 4. For example, when going from shapelet length l = m to m + 1, the time complexity of computing the distance d^{l+1}_{i,j} exactly is O(n · m(m−1)/2 · m), where j ∈ [0, n · m(m−1)/2] indexes the subsequences of D, n is the number of instances in D, and m is the length of the longest instance in D. The LB distance instead takes O(n · m(m−1)/2), a clear advantage when the query length is relatively long. The Lower Bounding distance [5] is defined as:

  LB(d^{l+k}_{i,j}) = √l · σ_{j,l}/σ_{j,l+k},           if q_{i,j} ≤ 0
  LB(d^{l+k}_{i,j}) = √(l(1 − q_{i,j}²)) · σ_{j,l}/σ_{j,l+k},  otherwise   (6)

where q_{i,j} = ( (1/l) Σ_{p=1}^{l} t_{i+p−1} t_{j+p−1} − µ_{i,l} µ_{j,l} ) / (σ_{i,l} σ_{j,l}).

Fig. 4: Discrimination Profile Extraction (Matrix Profiles of T_i^C against class C and class !C yield the Representative Powers, whose difference gives the Discriminative Power).

Empirically, the matching subsequence T_{j,l} which is the nearest neighbor of T_{i,l} allows deducing a longer subsequence T_{j,l+1}, which is probably the nearest neighbor of T_{i,l+1}. Assuming that the matching subsequence stays at the same position in T_target when the length of the query T_{i,l} increases, the time complexity for computing the minimal distance between T_{i,l} and T_target is O(l), instead of O(l(n − l + 1)). As mentioned in Definition 9, MP^m_i = min(DP^m_i); the main idea here is to use the LB Distance to accelerate the computation of min(DP^m_i), rather than computing the entire DP_i at a higher time complexity.

IV. EVALUATION

The baseline of the evaluation is USE [16], which uses the traditional method for shapelet extraction based on Information Gain. USE_MASS uses MASS as the similarity measure strategy, and SMAP_LB applies the acceleration technique based on the Lower Bounding Distance. A KNN (K=10) classifier is applied for all accuracy tests. All implementations are based on Python 3.6 with PySpark, the Python API of Apache Spark. The programs are tested on an AWS EMR cluster (each node has 16GB of RAM and 8 vCPUs, with a maximal parallelism of 6, as 2 vCPUs are reserved for the cluster management system). The number of nodes is adjustable.

A. ECG medical diagnosis

The dataset ECG200, from the MIT-BIH Long-Term ECG Database (ltdb), is collected by two electrodes which record the electrical activity of the heart at distinct body positions. Each heartbeat has an assigned label of normal or abnormal. All abnormal heartbeats are representative of a cardiac pathology known as supraventricular premature beat. ECG200 contains 100 labelled records with a fixed length of 96.

Fig. 5: Shapelets in the ECG raw sequence (top-5 shapelets for class 1 and class 2).

The shapelets extracted by SMAP and the raw time series are shown in Figure 5. Intuitively, the shapelets of different lengths extracted from the ECG instances cannot be found directly by the naked eye. The performance results are shown in Table I. SMAP_LB shows a clear gain in performance: it is 23.8X faster than USE, with a higher prediction accuracy (84% versus 76%). Note that the speed-up depends on the computing power of the cluster; we are more inclined to pay attention to the parallelization capability, which can be assessed by the shuffle cost between the distributed nodes. From local mode (1 node) to cluster mode (6 nodes), the shuffle cost increases by 106%, while the Distance Measure cost drops to 8.6%. Obviously, considering the gain, the shuffle cost can be ignored when we expand the cluster to a larger scale.

TABLE I: Performance comparison (L. = local, C. = cluster)

  Algorithm  | Time  | Accuracy | Shuffle (L./C.) | Dist. M. (L./C.)  | TopK (L./C.)
  USE        | 2547s | 76%      | -               | -                 | -
  USE_MASS   | 1782s | 76%      | -               | -                 | -
  SMAP       | 139s  | 84%      | 1.52s/3.08s     | 1558s/135.57s     | 1.15s/0.27s
  SMAP_LB    | 107s  | 84%      | 1.48s/3.19s     | 1198s/103.23s     | 1.21s/0.31s

B. Phalanges Outlines Recognition

To put the evaluation into a larger-scale context, we choose the time series dataset PhalangesOutlinesCorrect, which contains 1800 training records with a fixed length of 80. The task is to identify the bones of the middle finger from the outlines extracted from images; three human evaluators labelled the output of the image outlining as correct or incorrect.

USE requires more than 12960 minutes to finish the computation task. SMAP in a single-node environment is more than 7 times faster (precisely, it takes 1860 minutes for shapelet extraction). The accuracy was 86%. Figure 6 shows the relation between the time cost and the cluster's capacity, for both the Similarity Measure and the Aggregation process. The result supports the conclusion obtained in the previous experiment.

Fig. 6: Time consumed as a function of the number of distributed nodes (Similarity Measure and Aggregation/Shuffle time in the remote cluster).

V. CONCLUSION

In this paper, we propose a novel methodology, namely SMAP, for Time Series Classification. SMAP adopts the concept of Matrix Profile and extracts the shapelet features in a scalable and interpretable manner. Within SMAP, the Discrimination Profile is defined to assess in batches the quality of the candidate shapelets. On the basis of Lower Bounding, we also propose an acceleration strategy which is well suited to a distributed environment. The satisfactory results prove the efficiency of the scalable approach and testify to its competitiveness. Several optimizations are on our planning list: specifically, extending SMAP to longer time series by reducing the data dimensionality, while preserving its high accuracy and scalability.

ACKNOWLEDGEMENT

This research was supported by the DATAIA convergence institute as part of the Programme d'Investissement d'Avenir (ANR-17-CONV-0003), operated by DAVID Lab, University of Versailles Saint-Quentin, and by the MASTER project, which has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement N. 777695.

REFERENCES

[1] N. Ravi et al., "Activity recognition from accelerometer data," in Proc. IAAI'05. AAAI Press, 2005, pp. 1541-1546.
[2] L. Bao and S. S. Intille, "Activity recognition from user-annotated acceleration data." Springer, 2004, pp. 1-17.
[3] L. Ye and E. Keogh, "Time series shapelets: a new primitive for data mining," in Proc. 15th ACM SIGKDD, 2009.
[4] D.-E. Yagoubi et al., "DPiSAX: Massively distributed partitioned iSAX," in Proc. ICDM 2017, Nov. 2017, pp. 1-6.
[5] M. Linardi, Y. Zhu et al., "Matrix Profile X: VALMOD - scalable discovery of variable-length motifs in data series," 2018.
[6] E. Keogh, K. Chakrabarti, M. Pazzani, and S. Mehrotra, "Dimensionality reduction for fast similarity search in large time series databases," Knowledge and Information Systems, pp. 263-286, Aug 2001.
[7] J. Lin et al., "A symbolic representation of time series, with implications for streaming algorithms," in Proc. 8th ACM SIGMOD Workshop, ser. DMKD '03, 2003.
[8] A. Camerra et al., "iSAX 2.0: Indexing and mining one billion time series," in Proc. ICDM 2010, 2010, pp. 58-67.
[9] T. Rakthanmanon and E. Keogh, "Fast Shapelets: A scalable algorithm for discovering time series shapelets," in Proc. 2013 SIAM International Conference on Data Mining, 2013.
[10] A. Mueen et al., "Logical-shapelets: an expressive primitive for time series classification," in Proc. 17th ACM SIGKDD, 2011.
[11] K. Chang et al., "Efficient pattern-based time series classification on GPU," in Proc. ICDM 2012, 2012, pp. 131-140.
[12] L. Ye and E. Keogh, "Time series shapelets: a novel technique that allows accurate, interpretable and fast classification," DMKD, 2011.
[13] J. Lines, L. M. Davis, J. Hills, and A. Bagnall, "A shapelet transform for time series classification," Tech. Rep., 2012.
[14] C.-C. M. Yeh et al., "Matrix Profile I: All pairs similarity joins for time series: a unifying view that includes motifs, discords and shapelets," in Proc. ICDM '16, 2016, pp. 1317-1322.
[15] Z. Xing, J. Pei, and P. S. Yu, "Early prediction on time series: a nearest neighbor approach," in Proc. 21st IJCAI, 2009, pp. 1297-1302.
[16] R. Mousheimish, Y. Taher, and K. Zeitouni, "Automatic learning of predictive CEP rules: bridging the gap between data mining and complex event processing," in Proc. DEBS '17, 2017, pp. 158-169.