<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>TToommaass KKooccyyaann1,, JJaann MMaarrttiinnoovviicc1,, PPaavvllaa DDrraazzddiilloovvaa2,, aanndd KKaatteerriinnaa SSllaanniinnoovvaa22 1 1 2</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>VVSSBB -- TT</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>hhnnii</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>ll Univ</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>rsity o</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Ostrava</institution>
          ,
          <addr-line>IT4Innovations, IT4Innovations, 17. listopadu 15/2172, 708 33 Ostrava</addr-line>
          ,
          <country country="CZ">Czech Republic</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2013</year>
      </pub-date>
      <fpage>129</fpage>
      <lpage>138</lpage>
      <abstract>
        <p>Many types of data collections processed by time series analysis often contain repeating similar episodes (patterns). If these patterns are recognized, then they may be used for instance in data compression, for prediction or for indexing large collections. Extraction of these patterns from data collections with components generated in equidistant time and in nite number of levels is now a trivial task. The problem arises for data collections that are a subject to di erent types of distortions in all axes. In this type of collections, the found similar episodes do not have to be exactly the same; they can di er in time, shape or amplitude. In these cases, it is necessary to pick the suitable one from each group of similar episodes and to declare it as a representative member of the whole group. This paper discusses the possibilities of using the Dynamic Time Warping (DTW) method for deriving the representative member of a group of similar episodes that are subjects to the previously mentioned distortions. The paper is also focused on providing a suitable mechanism for more e ective searching of distorted time series.</p>
      </abstract>
      <kwd-group>
        <kwd>Dynamic Time Warping</kwd>
        <kwd>Time Series</kwd>
        <kwd>Pattern Mining</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Time series analysis covers methods for analysis of time series data with a focus
on extraction of various types of information like statistics and other
characteristics of the data. During time series processing, it is common that a time
series is divided into a large amount of smaller parts named episodes, which
are interconnected or partially overlapped [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and which are important for
further processing. For example, interconnected outputs of hydrological models,
data collections from tra c monitoring of selected stretches, or long time series
divided by segmentation algorithm like Voting experts [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] can be mentioned.
Obtained episodes may be processed by a suitable clustering algorithm and divided
into the clusters [
        <xref ref-type="bibr" rid="ref3 ref5">3, 5</xref>
        ].
Various approaches in spheres like recommended systems, decision support
systems or tasks based on Case Base Reasoning (CBR) are focused on nding
similar sequences (time series episodes) to a sequence entered on the input. In
such cases, a suitable cluster of similar sequences is found, which represents the
input sequence. Much faster searching is allowed due to nding in set of the
cluster representatives which were selected in indexing phase. Thereafter, it is
possible to search in depth in a selected cluster or a set of clusters, which are
similar to a found episode from the input.
      </p>
      <p>
        Since each obtained cluster contains a concrete amount of similar episodes,
it is suitable to select an appropriate representative, which would describe the
whole cluster. Given selected representative is named pattern. Research area
aimed to nding patterns, pattern mining, has been studied in several elds.
Pattern mining, or pattern recognition, is a scienti c discipline focused on object
classi cation into categories or classes [
        <xref ref-type="bibr" rid="ref10 ref4">10, 4</xref>
        ].
Finding the representative of a cluster is de ned as nding such set of
representative patterns P , which describes episodes E inside these clusters by the
most appropriate way. Obtained representatives may be used for the creation of
an index le, in which each representative contains a set of pointers to episodes
from the base collection (see Figure 1).
      </p>
      <p>Two basic ways for nding representatives are generally known. The rst
approach is based on selecting one episode, which is the most accurate for a given
cluster. The second approach is based on the creation of a representative using
the combination of episodes in the cluster. Euclidean distance and other common
methods for measuring the similarity between the episodes can be used only while
working with the episodes of the identical length. In cases where we have episodes
of di erent lengths, we need a speci c algorithm which respects this requirement
or an algorithm which is immune to sequence distortions. In the paper, it is
described the comparison of the both approaches, and the introduction of an
approach which combines the both ways for nding representatives using DTW
method is presented (for more details, see Section 2).</p>
      <p>The organisation of the paper is following: Dynamic time warping method
(DTW) and the utilization of DTW for nding cluster representatives is
described in Section 2 and in Section 3. Afterwards, in Section 4, a practical
demonstration of proposed approach is presented. The paper is concluded by
Section 5, in which obtained results of suggested approach are discussed and the
future work is outlined.
2</p>
    </sec>
    <sec id="sec-2">
      <title>Dynamic Time Warping</title>
      <p>Recently, nding a signal similar to a signal generated by computers, which
consists of accurate time cycles and which achieves a determined nite number of
value levels, is a trivial problem. A main attention is focused more likely on
the optimisation of searching speed. A non-trivial task occurs while comparing
or searching the signals, which are not strictly de ned and which have
various distortions in time and amplitude. As a typical example, we can mention
measurement of functionality of human body (EKG, EEG) or the elements
(precipitation, ow rates in riverbeds), in which does not exist an accurate timing for
signal generation. Therefore, comparison of such episodes is signi cantly di
cult, and almost excluded while using standard functions for similarity (distance)
computation. Examples of such signals are presented in Figure 2a.
a)</p>
      <p>b)</p>
      <p>
        A problem of standard functions for similarity (distance) computation
consists in sequential comparison of opposite elements in both episodes (comparison
of elements with the identical indexes). Dynamic time warping (DTW) is a
technique for nding the optimal matching of two warped episodes using pre-de ned
rules [
        <xref ref-type="bibr" rid="ref1 ref9">1, 9</xref>
        ]. Essentially, it is a non-linear mapping of particular elements to match
them in the most appropriate way.
      </p>
      <p>
        The output of such DTW mapping of episodes from Figure 2a can be seen
in Figure 2b. This approach was used for example for comparison of two voice
patterns during an automatic recognition of voice commands [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>
        The main goal of DTW method is a comparison of two time dependent
episodes X and Y , where X = (x1; x2; : : : ; xN ) is of length N 2 N and Y =
(y1; y2; : : : ; yM ) is of length M 2 N, and to nd an optimal mapping of their
elements. A detailed description of DTW including particular steps of the algorithm
is presented in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
Using DTW for Finding Cluster Representative
In cases, where it is necessary to gain the most suitable representative of the set
of similar episodes, we need to nd an algorithm appropriate to a given domain.
Sometimes it is possible to use simple average of episodes X and Y , which means
that for a representative R is valid, that:
      </p>
      <p>Ri = Xi +2 Yi ; 8i = 1; : : : ; P; where P = jXj = jY j: (1)
However, this approach is not su cient in cases, where we have data with
distortion. Examples of such episodes are presented in Figure 3a and 3b. If only
we used simple average presented in Equation 1, we would achieve an episode
showed in Figure 3c. As we can see, this episode absolutely is not a representative
and all the information about the episode course is loosed.</p>
      <p>As we can see from Figure 3, it is necessary to nd a more appropriate
algorithm for domains which yield to distortion. The algorithm should be immune to
such distortions. This paper is focused on using DTW for nding a representative
of set of similar, but distorted episodes.
3.1</p>
      <sec id="sec-2-1">
        <title>Finding Representative for Episode Couples</title>
        <p>The approach for nding a representative of two episodes X and Y by nding the
optimal mapping of two episodes using DTW was described in Section 2. In this
method, the most important is obtained warped path p = (p1; : : : ; pL), which
allows to nd a representative. The approach for nding such representative
is described in Algorithm 3.1. The output of presented algorithm applied on
episodes in Figure 3 is presented in Figure 3d.
Algorithm 3.1 Searching for Representative from Pair of Episodes
Input: Episodes X and Y
Output: Representative episode R</p>
        <p>Algorithm 3.1 nds a representative common for two episodes, where both
episodes have the same importance. It nds such episode, which is the most
similar to the both two episodes. If it is necessary, a one of the episodes may
be preferred by adding a weight w 2 (0; 1) and by adjusting a computation of
element r1 and rq by Equation 2:
r1 =
(x1
w) + y1 (w
w + 1
1)
and rq =
(xnl
w) + yml
w + 1
(w
1)
:
(2)</p>
        <p>The impact of adding a weight on achieved representative R for episodes X
and Y is following:
{ w = 1: episodes are equal
{ w 2 (1; 1): episode X is preferred
{ w 2 (0; 1): episode Y is preferred
3.2</p>
      </sec>
      <sec id="sec-2-2">
        <title>Finding Representative for Set of Episodes</title>
        <p>Algorithm 3.1 can be applied only on two episodes. However, this is often
insu cient in common practice; we need to nd a representative for the whole
set of episodes in most cases. Given a collection C with generally N episodes,
C = fe1; e2; : : : ; eN g. The question is, how the presented approach applies on
generally N episodes.</p>
        <p>A rst solution is based on an approach, in which is a representative found
step by step by nding particular representatives for episode couples. More
precisely, the rst step consists of nding representative R1 2 for the rst two
episodes e1 and e2. Then, representative R1 2 3 is found for a new obtained
episode R1 2 and for episode e3. Then, such approach is used for the rest of
episodes in the cluster.</p>
        <p>However, our experiments showed that this approach is not as much suitable
as it could be. It is strongly dependent on the order of particular episodes in
collection. The solution is to nd an approach that would be immune to the order
of elements in an episode. Our proposed approach which solves this problem is
presented in Algorithm 3.2.</p>
        <p>The presented approach is not restricted only to using DTW as a method
for the expression of episode similarity. Of course, DTW could be replaced by
any other indicator, for example Euclidean distance or statistical indicators for
time series (MAE, MPE, RMSE, etc.). In such cases, it is necessary to adapt
steps 2 and 4 of Algorithm 3.2, where instead of nding a representative for
the episodes couple by DTW is necessary to use (weighted) average of two
compounded episodes. Section 4 describes both two approaches with a visual
comparison of the impact to a found representative.
4</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Experiments</title>
      <p>In this section, a method for determination of similarity between two episodes is
presented. Furthermore, the proposed method is compared with other methods.
The achieved outputs are visualized with the following structure. The rst row
of the Figures 4 - 8 consists of episodes, which were used as the input to the
algorithm, the second row consists of outputs for the di erent approaches.</p>
      <p>The rst output was average of episodes, de ned in Equation 1. The second
output was from the proposed approach described in Section 3.2. Both outputs
are followed by the results using Mean Absolute Error (MAE), and nally as
reference, Euclidean distance.</p>
      <p>Meaning and usage of DTW method is closer to a human judgement and
perception of similarity than a machine de nition of physical distance. It is
impossible to use a numerical evaluation for the following outputs. The
experiments presented in this section were focused on nding such representative,
which would describe the characteristics and the important parts of particular
episodes.</p>
      <p>
        The rst input dataset was a set of similar signals (see Figure 4), which
shapes resembled ECG records (described for example in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]). The signal ended
with tiny swings. As we can see from the second row of the episodes in Figure 4,
average of values from both episodes absolutely degraded signal information; the
shift of signal peaks and drops was smoothed nearly to one level. Also usage of
MAE method and Euclidean distance did not provide su cient results, which
Algorithm 3.2 Searching for Representative from Set of Episodes
Input: Collection C of N episodes
Output: Representative episode R
1. Initialization:
{ N is count of input episodes.
{ u is level of collection; u = 1.
{ C1 is the rst level of collection; C1 = C.
      </p>
      <p>{ M is count of processed episodes in level u; M = N u + 1.
2. Create from collection Cu, which consists of episodes fe1u; e2u; : : : ; euM g, distance
matrix Du 2 R(M M), where particular matrix elements are de ned as diuj =
DT W (eiu; eju), i.e. matrix elements are created by values of reciprocal mapping of
particular episodes.
3. Calculate sum for each row riu in matrix Du and select a row with the lowest sum
u
value. Find row rmin, where</p>
      <p>M M
X dmin;j = min8i=1;:::M (X diuj )</p>
      <p>u
j=1 j=1
The found row refers to the episode, which is selected as the most similar to the
others in the current collection, and which could be declared as representative Ru
of the collection for u-th level.
4. Remove representative Ru from the current collection and create (N u) new
episodes by application of method for searching representative from couple (Ru,
eiu), described in Section 3.1. This algorithm can be modi ed by adding weight
(preference) to one of the episodes, which can prefer (or discriminate) the
importance of the representative Ru.</p>
      <p>if M &gt; 2 then
u = u + 1;
M = M 1;</p>
      <p>Repeat from Step 2 for remaining (N u) episodes;
else if M = 2 then</p>
      <p>Select a representative from the two episodes as a representative of the whole
original set of episodes C;
end if
did not di er from average outputs much. On the other way, usage of DTW
method for nding representative fully depicted a character of the signal and
brought the most accurate results.</p>
      <p>The next episode quartet contained signals with the three peaks mutually
shifted in time, while each of them had a variable duration (see Figure 5). It
was supposed that the representative would have a curve with the three evident
peaks. It is obvious from the results, that even though MAE and Euclidean
distance worked much better, the loss of information was still noticeable.</p>
      <p>The last input dataset represented the situation, in which the signal
consisted of two waves - one in a positive and one in a negative part (see Figure 6).
These waves were deformed in time, while they were spread or shrunk in X axis.
Although the other methods achieved seemingly the best results, the distortion
was evident again. The output representative did not contained as high
amplitudes as the input waves, did not have smoothed waves and did not detect the
constant segments, which were distorted.</p>
      <p>The most important advantage of the proposed solution is the fact that the
Algorithm 3.2 in combination with DTW is able to process even episodes with
di erent lengths. This is very di cult while using other methods, and in some
cases even impossible. In these cases it is necessary to shrink the episodes into
the identical length, which of course cause the loss of information. Using DTW,
we are able to process such episodes with di erent lengths without any loss of
information. In Figures 7 and 8 are presented outputs from proposed algorithm
applied on episodes with di erent lengths.</p>
    </sec>
    <sec id="sec-4">
      <title>Conclusion and Future Work</title>
      <p>The real application of proposed algorithm \Searching for Representative from
Set of Episodes" described in Section 3.2 showed that it is able to nd a
representative not only from the set of typical episodes, but also from their distorted
variants. The tested input datasets consisted of signals with changed amplitudes
and were distorted by time shifting. The proposed solution was compared with
conventional methods, in which much worse success was obvious.</p>
      <p>Further work will be concentrated on creation of index le, which structure
was de ned in Section 1, and which visual representation was presented in
Figure 1. The aim is to create a su ciently robust mechanism, which will be able
to nd all the similar episodes to the selected pattern in data collection during
the shortest time. Furthermore, these found episodes will be used for a
prediction using the Case-Based Reasoning method. This method requires a suitable
mechanism that is able to extract the most similar patterns from the input.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgement</title>
      <p>This work was supported by the European Regional Development Fund in the
IT4Innovations Centre of Excellence project (CZ.1.05/1.1.00/02.0070) and by
the SGS, VSB { Technical University of Ostrava, Czech Republic, under the
grant No. SP2013/167 Analysis of Users' Behaviour in Complex Networks.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <article-title>Dynamic time warping</article-title>
          .
          <source>In Information Retrieval for Music and Motion</source>
          , pages
          <volume>69</volume>
          {
          <fpage>84</fpage>
          . Springer Berlin Heidelberg, Jan.
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2. G. D. Cli ord,
          <string-name>
            <given-names>F.</given-names>
            <surname>Azuaje</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>McSharry</surname>
          </string-name>
          , et al.
          <article-title>Advanced methods and tools for ECG data analysis</article-title>
          .
          <source>Artech House</source>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>G.</given-names>
            <surname>Gan</surname>
          </string-name>
          , C. Ma, and
          <string-name>
            <given-names>J.</given-names>
            <surname>Wu</surname>
          </string-name>
          .
          <article-title>Data Clustering: Theory, Algorithms, and Applications</article-title>
          .
          <source>ASA-SIAM Series on Statistics and Applied Probability</source>
          . SIAM, MAY
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>D. J.</given-names>
            <surname>Hand</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Smyth</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H.</given-names>
            <surname>Mannila</surname>
          </string-name>
          .
          <article-title>Principles of Data Mining</article-title>
          . MIT Press, Cambridge, MA, USA,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>A. K. Jain</surname>
            ,
            <given-names>M. N.</given-names>
          </string-name>
          <string-name>
            <surname>Murty</surname>
            , and
            <given-names>P. J.</given-names>
          </string-name>
          <string-name>
            <surname>Flynn</surname>
          </string-name>
          .
          <article-title>Data clustering: a review</article-title>
          .
          <source>ACM Computing Surveys</source>
          ,
          <volume>31</volume>
          (
          <issue>3</issue>
          ):
          <volume>264</volume>
          {
          <fpage>323</fpage>
          ,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>E.</given-names>
            <surname>Keogh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Chu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Hart</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Pazzani</surname>
          </string-name>
          .
          <article-title>Segmenting time series: A survey and novel approach</article-title>
          . Work,
          <volume>57</volume>
          :
          <fpage>121</fpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>T.</given-names>
            <surname>Kocyan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Martinovic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Podhoranyi</surname>
          </string-name>
          ,
          <string-name>
            <surname>and I. Vondrak.</surname>
          </string-name>
          <article-title>Unsupervised algorithm for retrieving characteristic patterns from time-warped data collections</article-title>
          .
          <source>In Proceedings of the MAS</source>
          <year>2012</year>
          ,
          <source>The 11th International Conference on Modeling and Applied Simulation</source>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>L. R.</given-names>
            <surname>Rabiner</surname>
          </string-name>
          and
          <string-name>
            <given-names>B. B.-H.</given-names>
            <surname>Juang</surname>
          </string-name>
          .
          <article-title>Fundamentals of Speech Recognition</article-title>
          .
          <source>Prentice Hall</source>
          ,
          <year>1993</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>P.</given-names>
            <surname>Senin</surname>
          </string-name>
          .
          <article-title>Dynamic time warping algorithm review</article-title>
          .
          <source>Information and Computer</source>
          Science Department University of Hawaii at Manoa Honolulu, USA, pages
          <volume>1</volume>
          {
          <fpage>23</fpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>S.</given-names>
            <surname>Theodoridis</surname>
          </string-name>
          .
          <source>Pattern Recognition. Elsevier</source>
          ,
          <volume>3</volume>
          <fpage>edition</fpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>