<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Progressive and Iterative Approaches for Time Series Averaging</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Saeid Soheily-Khah</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ahlame Douzal-Chouakria</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Eric Gaussier</string-name>
          <email>Eric.Gaussierg@image.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Universite Grenoble Alpes</institution>
          ,
          <addr-line>CNRS - LIG/AMA</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2015</year>
      </pub-date>
      <abstract>
        <p>Averaging a set of time series is a major topic for many temporal data mining tasks as summarization, extracting prototype or clustering. Time series averaging should deal with the tricky multiple temporal alignment problem; a still challenging issue in various domains. This work compares the major progressive and iterative averaging time series methods under dynamic time warping (dtw).</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Time series centroid estimation is a major issue for many temporal data analysis
and mining tasks as summarization, extracting temporal prototype or clustering.
Estimating the centroid of a set of time series under time warp should deal
with the tricky multiple temporal alignment problem [1{4]. Temporal warping
alignment of time series has been an active research topic in many scienti c
disciplines. To estimate the centroid of two time series under temporal metrics,
as the dynamic time warping [5{7], one standard way is to embed the time series
into a new Euclidean space de ned by their temporal warping alignment. In this
space, the centroid can be estimated as the average of the linked elements. The
problem becomes more complex where the number of the time series is more than
two, as one needs to determine a multiple alignment that links simultaneously
all the time series on their commonly shared similar elements.</p>
      <p>A rst manner to determine a multiple alignment is to search, by dynamic
programming, the optimal path within an n-dimensional grid that crosses the n
time series. The complexity of this approach prevents its use, as it constitutes
an NP-complete problem with a complexity of O(T n) that increases
exponentially with the number of time series n and the time series length T . A second
way, that identi es progressive approaches, is based on combining progressively
pairwise time series centroids to estimate the global one. The progressive
approaches may su er of the early error propagation through the set of pairwise
centroid combinations. The third approach is iterative, it works similarly to the
progressive approach, but mainly reduces the error propagation by repeatedly
re ning the barycenter and realigning it to the initial time series. In general,
the main progressive and iterative approaches are of heuristic nature limited to
the dynamic time warping metric, that provide an estimation of the barycenter
without guarantee of an optimal solution.</p>
      <p>The main contribution of this work is to introduce some major progressive
and iterative approaches for time series centroid estimation, prior to present their
Copyright c 2015 for this paper by its authors. Copying permitted for private and academic
purposes.
characteristics, as well as an extensive comparison between the mentioned
methods throughout real and synthetic datasets, where to the best of our knowledge
this necessary study is never conducted before.</p>
      <p>The rest of this paper is organized as follows: In the next section, di erent
approaches are studied and Section 3 presents the conducted experimentation
and discuss the results obtained. Lastly, Section 4 concludes the paper.
2</p>
    </sec>
    <sec id="sec-2">
      <title>Progressive and iterative approaches</title>
      <p>The progressive and iterative methods for averaging a set of time series are
mostly derived from the multiple sequence alignment methods to address the
tricky multiple temporal alignment problem. In the following, we review the
major progressive and iterative approaches for time series averaging under time
warp.</p>
      <p>
        Gupta et al. in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] used the dtw in the sequence alignment to average a set
of time series. The method, called "NonLinear Alignment and Averaging Filters
(nlaaf)", uses a tournament scheme averaging approach that it's simplest
averaging ordering consists in pairwise averaging sequences following a tournament
scheme. That way, N=2 average sequences are created at rst step. Then those
N=2 sequences, in turn, arepairwise averaged into N=4 sequences, and so on,
until one sequence is obtained. In this approach, the averaging method between
two sequence is applied (N 1) times. nlaaf works by placing each element of
the average sequence of two time sequences, as the center of each association
created by DTW. Its main drawback lies in growth of its resulting length, because
each use of the average method can almost double the length of the average
sequence. That is why nlaaf is generally used in conjunction with a process
reducing the length of the average, leading to a loss of information and thus
to an unsatisfactory approximation. Additionally, the average strongly depends
on the order of time series sequences and so, di erent orders of sequences give
di erent average sequence.
      </p>
      <p>
        To avoid the bias induced by random selection, Niennattrakul et al. [
        <xref ref-type="bibr" rid="ref11 ref12">11, 12</xref>
        ]
proposed a framework of shape averaging called "Prioritized Shape Averaging
(psa)", which uses hierarchical clustering with a new dtw averaging function,
labeled "Scaled Dynamic Time Warping " with extra capability in stretching
some parts of warping path so that the result is more similar to a sequence time
series with more weight. Niennattrakul used hierarchical clustering as a heuristic
to order the priority. In spite of this hierarchical averaging method aims to
prevent the order dependency, the length of average sequences remains a problem.
Local averaging strategies like nlaaf or psa may let an initial approximation
error propagate throughout the averaging process. If the averaging process has
to be repeated, the e ects may dramatically alter the quality of the result. This
is why a global approach is desirable, where sequences would be averaged all
together, with no sensitivity to their order of consideration.
      </p>
      <p>
        A direct manner to estimate the centroid proposed by Abdulla et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ],
called "Cross-Words Reference Template (cwrt)", which uses medoid as the
reference time series as follows. First, the time series medoid is selected. The
whole time series are then described in the representation space de ned by the
reference medoid. In the next step, all sequences are aligned by dtw to a single
medoid and then the average is computed by averaging the time-aligned time
series across each point. Petitjean et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] proposed a global averaging method,
called "Dtw Barycenter Averaging (dba)", which consists in iteratively re ning
an initially average sequence, in order to minimize its distance to the averaged
sequence. As a summary, the dba under temporal warping is a global approach
that can average a set of sequences all together.
      </p>
      <p>All the methods de ne heuristic approaches, although with no guarantee
of optimal solutions, the provided approximations are accurate particularly for
time series that behave similarly within the set. However these approaches may
fail principally for time series with similar global behavior and local temporal
di erences, as one needs to deploy local instead of global averaging process.
3</p>
    </sec>
    <sec id="sec-3">
      <title>Experimental study</title>
      <p>
        The experiments are conducted to compare the above approaches on classes of
time series composing various datasets. The datasets can be divided into two
categories. The rst one is composed of time series that have similar global
behavior within the classes, where the time series of the second category may
have distinct global behavior, while sharing local characteristics [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. For the
comparison, the induced inertia reduction rate and the required run time are
evaluated as well as the qualitative comparison of the centroids obtained by a
visualization. In the following, we rst describe the datasets used, then specify
the validation process and discuss the obtained results.
3.1
      </p>
      <sec id="sec-3-1">
        <title>Data descrpition</title>
        <p>
          The experiments are rst carried out on four well known public datasets cbf, cc,
digits and character traj. [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. These data de ne a favorable case for the
averaging task as time series behave similarly within the classes. Then, we consider
more complex datasets: bme1, umd1, spiral [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], noised spiral1 and
consseason [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. They are composed of time series that behave di erently within the
same classes while sharing several local characteristics. Table 1 indicates for each
data set: the number of classes it includes (Nb. Class), the number of instances
(Nb. TS), the number of attributes (Nb. Att), the time series length (TS length)
and the global or local nature of similarity within the classes (Type).
3.2
        </p>
      </sec>
      <sec id="sec-3-2">
        <title>Validation process</title>
        <p>The four mentioned methods nlaaf, psa, cwrt and dba described in Section
2 is compared together. The performances of these approaches are evaluated
through the centroid estimation of each class of the above described datasets.
1 http://ama.liglab.fr/ douzal/data
Particularly, the e ciency of each approach is measured through: a) the
reduction rate of the inertia criterion; the initial inertia being evaluated around the
time series medoid that minimizes the distances to the rest of time series and
b) the space and time complexity. The results reported hereafter are averaged
through a bootstrap process, with 10 repetitions. Finally for all reported results,
the best one which is signi cantly di erent from the rest(t -test at 1% risk) is
indicated in bold.</p>
        <p>Inertia reduction rate Time series averaging approaches are used to estimate
centroid of the time series classes described above, then the inertia w.r.t. the
centroids is measured. Lower is the inertia higher representative is the extracted
PN
centroid. Table 2, gives the obtained inertia reduction rates irr=1 PiNi==11DD((xxii;;mc)) ,
averaged per dataset; x1; :::; xN being the set of time series, D the metric, c the
determined centroid and m the initial medoid. Table 2 shows that the dba
provides the highest irr for the most datasets. Some negative rates observed
indicate an inertia increase.
Time and space complexity In Table 3 the studied approaches are compared
w.r.t their space and time complexity. The results, averaged per dataset, reveal
almost dba the faster method and psa the slowest one. The cwrt approach is
not comparable to the rest of the methods as it performs directly an euclidean
distance on the time series once the initial dtw matrix evaluated. Remark that
for nlaaf and psa the centroid lengths are very large making these approaches
unusable for large time series. The centroid lengths for the remaining methods
are equal to the length of the initial medoid. The higher time consumptions
observed for nlaaf and psa are mainly explained by the progressive increase of
the centroid length during the pairwise combination process.
From Table 2, we can see that dba and psa lead to the highest inertia reduction
rates, where the best scores (indicated in bold) are reached by dba for almost all
datasets. However it is signi cantly lower for some challenging datasets. Finally,
cwrt has the lowest inertia reduction rates. The negative rates observed for
cwrt indicate an inertia increase. As expected, the dba method that iteratively
optimizes an inertia criterion, in general, reaches higher values than the
noniterative methods (nlaaf, psa and cwrt).</p>
        <p>From Table 3, the results reveal dba the fastest method and the psa the
slowest one. For nlaaf and psa the estimated centroids have a drastically large
dimension (i.e. a length around 104) making these approaches unusable for large
time series datasets. The nlaaf and psa methods are highly time-consuming,
largely because of the progressive increase of the centroid length during the
pairwise combination process. The centroid lengths for the remaining methods are
equal to the length of the initial medoid (Table 3). Finally, psa appears greatly
slower than nlaaf; this is due to the hierarchical clustering on the whole time
series. We nally visualize here some of the centroids obtained by the di erent
methods to compare their shape to the one of the time series they represent.
Figure (1) and (2) display the centroids obtained by the mentioned methods
respectively for the class "funnel " of cbf and "cyclic" of data set cc. As one
can note, for global datasets, almost all the approaches succeed in obtainging
centroids more or less similar to the initial time series. However, we observe
generally less representative centroids for nlaaf and psa.
The dtw is among the most frequently used metrics for time series in several
domains as signal processing, temporal data analysis and mining or machine
learning. However, for time series clustering, approaches are generally limited
to kmedoid to circumvent time series averaging under dtw and tricky multiple
temporal alignments problem. The present study compares the major progressive
and iterative time series averaging approaches under dynamic time warping. The
experimental validation is based on global datasets in which time series share
similar behaviors within classes, as well as on more complex datasets exhibiting
time series that share only local characteristics, that are multidimensional and
noisy. Both the quantitative evaluation, based on an inertia criterion and time
and space complexity, and the qualitative one (consisting in the visualization
of the centroids obtained by di erent methods) show the e ectiveness of dba
approach. In particular, the dba method that iteratively optimizes an inertia
criterion, not only, reaches higher values than the non-iterative methods (nlaaf,
psa and cwrt), but also provides a fast time series averaging for global and local
datasets.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Abdulla</surname>
            ,
            <given-names>W.H.</given-names>
          </string-name>
          <article-title>and</article-title>
          <string-name>
            <surname>Chow</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Sin</surname>
          </string-name>
          , G.:
          <article-title>Cross-words reference template for DTWbased speech recognition systems</article-title>
          .
          <source>Proc. TENCON, Pages</source>
          <volume>1576</volume>
          {
          <fpage>1579</fpage>
          , Vol.
          <volume>2</volume>
          (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Hautamaki</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Nykanen</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Franti</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Time-series clustering by approximate prototypes</article-title>
          .
          <source>19th International Conference on Pattern Recognition</source>
          , (
          <year>2008</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Petitjean</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Ketterlin</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          and GanCarski, P.:
          <article-title>A global averaging method for dynamic time warping, with applications to clustering</article-title>
          .
          <source>Pattern Recognition, Pages 678-693</source>
          , Vol.
          <volume>44</volume>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>F.</given-names>
            <surname>Zhou and F. De la Torre</surname>
          </string-name>
          :
          <article-title>Generalized time warping for multi-modal alignment of human motion</article-title>
          . IEEE,
          <source>Computer Vision and Pattern Recognition (CVPR)</source>
          ,
          <source>Pages</source>
          <volume>1282</volume>
          {
          <fpage>1289</fpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Kruskall</surname>
            ,
            <given-names>J.B.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Liberman</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>The symmetric time warping algorithm: From continuous to discrete</article-title>
          . Addison-Wesley,
          <source>Time Warps Journal</source>
          (
          <year>1983</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Sakoe</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          <article-title>and</article-title>
          <string-name>
            <surname>Chiba</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Dynamic programming algorithm optimization for spoken word recognition</article-title>
          .
          <source>IEEE Transactions on Acoustics, Speech, and Signal Processing, Pages</source>
          <volume>43</volume>
          {
          <fpage>49</fpage>
          , Vol.
          <volume>26</volume>
          (
          <year>1978</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Sanko</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Kruskal</surname>
            ,
            <given-names>J.B.</given-names>
          </string-name>
          :
          <article-title>Time warps, string edits, and macromolecules: the theory and practice of sequence comparison</article-title>
          . Cambridge University Press, AddisonWesley, (
          <year>1983</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>Lalit</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Molfese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Tammana</surname>
          </string-name>
          , P. Simos:
          <article-title>Nonlinear alignment and averaging for estimating the evoked potential</article-title>
          .
          <source>IEEE T. on Biomedical Engineering, No. 4, Pages</source>
          <volume>348</volume>
          {
          <fpage>356</fpage>
          , Vol.
          <volume>43</volume>
          (
          <year>1996</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>C.</given-names>
            <surname>Frambourg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Douzal-Chouakria</surname>
          </string-name>
          and
          <string-name>
            <surname>E.</surname>
          </string-name>
          <article-title>Gaussier: Learning Multiple Temporal Matching for Time Series Classi cation</article-title>
          .
          <source>In Advances in Intelligent Data Analysis XII</source>
          (pp.
          <fpage>198</fpage>
          -
          <lpage>209</lpage>
          ). Springer Berlin Heidelberg. (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>UCI</given-names>
            <surname>Machine Learning</surname>
          </string-name>
          <string-name>
            <surname>Repository</surname>
          </string-name>
          , "http://archive.ics.uci.edu/ml/ "
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>V.</given-names>
            <surname>Niennattrakul</surname>
          </string-name>
          ,
          <string-name>
            <surname>C.</surname>
          </string-name>
          <article-title>Ratanamahatana: On Clustering Multimedia Time Series Data Using K-means and Dynamic Time Warping</article-title>
          . Multimedia and Ubiquitous Engineering, MUE'
          <fpage>07</fpage>
          . International Conference on IEEE,
          <source>Pages</source>
          <volume>733</volume>
          {
          <fpage>738</fpage>
          , (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>V.</given-names>
            <surname>Niennattrakul</surname>
          </string-name>
          ,
          <string-name>
            <surname>C.</surname>
          </string-name>
          <article-title>Ratanamahatana: Shape Averaging under Time Warping</article-title>
          . Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology,
          <source>ECTI-CON 6th International Conference on IEEE</source>
          , Vol.
          <volume>2</volume>
          ,
          <source>Pages</source>
          <volume>626</volume>
          {
          <fpage>629</fpage>
          ,
          <string-name>
            <surname>May</surname>
          </string-name>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>