<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Enhancing Interpretability in Multivariate Time Series Classification through Dimension and Feature Selection</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Zed Lee</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer and Systems Sciences, Stockholm University</institution>
          ,
          <addr-line>Stockholm</addr-line>
          ,
          <country country="SE">Sweden</country>
        </aff>
      </contrib-group>
      <abstract>
<p>Interpretability in multivariate time series classification is crucial for understanding model decisions. However, the complexity of these classifiers often results in overwhelming feature spaces, hindering interpretability. To address this issue, we propose two novel methods: 1) Dimension selection based on segmentation of time series (DST) and 2) Feature selection based on discretization similarity (FDS). DST segments time series data and applies dimension selection to each segment, capturing distinct properties across different time ranges. FDS reduces feature redundancy by comparing discretization techniques and eliminating those with similar bin boundaries. Experiments on 24 UEA multivariate datasets demonstrate that our methods can significantly reduce the number of features while maintaining accuracy, offering a practical solution for enhancing interpretability in multivariate time series classification.</p>
      </abstract>
      <kwd-group>
        <kwd>Multivariate Time Series</kwd>
        <kwd>Interpretability</kwd>
        <kwd>Dimension Selection</kwd>
        <kwd>Feature Selection</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Time series datasets involve large quantities of data across multiple dimensions. The
complexity of multivariate time series classification can quickly become overwhelming due to the
interactions between different dimensions, which may negatively impact the classification
outcome. Consequently, multivariate time series classifiers have grown increasingly complex in
their model structures and feature spaces to enhance predictive performance. However, these classifiers often lack interpretability, making it both a challenge and a requirement.</p>
      <p>
        Interpretable time series classifiers, such as MR-SEQL [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and MR-PETSC [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], have been
developed using symbolic discretization. Although these symbolic features have specific
meanings and are linked to an interpretable linear classifier, several issues hinder full interpretability.
First, as ensemble-based methods, both classifiers define multiple event sequence patterns for
the same time points under various parameter settings for discretization to create bag-of-words
patterns, resulting in inconsistencies in value ranges for the same discretized patterns,
undermining interpretability. Z-Time [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] addresses this issue by eliminating the ensemble structure
and applying various discretization techniques across the time series with unique event labels,
ensuring each event label corresponds to a specific value range. However, the second problem
is the sheer number of features used by all three classifiers, making human interpretation
impractical.
      </p>
      <p>
        In this paper, we suggest that interpretability should be evaluated not only by the architecture
of models and features but also by the number of features used for classification. While
dimensionality reduction is a common approach in various machine learning tasks [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], it is not
suitable for interpretability, as it can distort the original values. Initial efforts in dimension selection for
multivariate time series often assume the selection of specific dimensions throughout the entire
time series [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], which might not be optimal. This paper proposes two techniques leveraging
previous work [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], Z-Time’s segmentation properties, and multiple discretization techniques.
First, we segment the time series and select different dimensions based on the properties within each segment. Second, we measure the similarity of different discretized bins and remove those with the highest similarity using the elbow method.
      </p>
      <p>
        Our main contributions and novelty of this paper include:
• Novelty. We introduce the use of segmentation and discretization similarity to reduce
the number of interpretable features in multivariate time series.
• Effectiveness and efficiency. Our proposed techniques can reduce the number of features by up to 86% while maintaining accuracy, with an average accuracy drop of only up to 9% on the UEA multivariate time series datasets [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>• Reproducibility. Our code is publicly available on our GitHub repository (https://github.com/zedshape/dim-reduce).</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        While many algorithms for multivariate time series classification have leveraged ensembles [
        <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
        ]
and deep learning techniques [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], recent attention has been directed towards interpretable time
series classification. Most state-of-the-art interpretable time series classifiers utilize symbolic
discretization [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] to create feature spaces, combined with linear classifiers. MR-SEQL [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]
integrates a symbolic sequential learner with two discretization techniques: symbolic aggregate
approximation (SAX) [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] and symbolic Fourier approximation (SFA) [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], to form the feature
space representation. Similarly, MR-PETSC [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] employs standard frequent pattern mining
with a relative duration constraint, instead of a sequential learner, to capture non-contiguous
patterns as well as subsequences. Although both MR-SEQL and MR-PETSC can be applied
to multivariate time series classification, their interpretability has been studied primarily for
univariate problems, without addressing relationships between variables. The most recent
work, Z-Time [
        <xref ref-type="bibr" rid="ref3">3</xref>
] offers the best efficiency (i.e., runtime) and effectiveness (i.e., accuracy) for
multivariate time series classification. Unlike MR-PETSC and MR-SEQL, Z-Time is designed to
consider the relationships between dimensions by incorporating temporal relations through
temporal abstraction. Z-Time enhances interpretability by avoiding ensemble structures with
multiple sliding windows and instead applying different discretization techniques, ensuring
each event label has a single definition and value range. For feature reduction, earlier methods
focused on dimension selection based on correlation [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] or similarity scores [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. The most
recent approach [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] selects dimensions based on the prototype distance between classes, which
has also been tested in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] for HIVE-COTE 2.0 [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], the most accurate classifier on the UCR
dataset [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Proposed Methods</title>
      <sec id="sec-3-1">
        <title>3.1. Background</title>
        <sec id="sec-3-1-1">
          <title>3.1.1. Multivariate time series</title>
          <p>Let t = {t_1, . . . , t_n} represent a time series spanning n time points. A collection of such time series forms a time series instance T = {t^1, . . . , t^d}, consisting of d variables or dimensions. If d = 1, T is univariate; if d &gt; 1, T is multivariate. Each time series instance T is assigned a class label y ∈ Y, where Y is a list of class labels corresponding to each instance. The goal of time series classification models is to predict these class labels correctly.</p>
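          <p>As a concrete illustration of this notation, a dataset maps naturally onto a three-dimensional array of shape instances × dimensions × time points. The sketch below is ours; the names and shapes are illustrative, not from the paper:</p>

```python
import numpy as np

# Illustrative shapes: 10 instances, d = 3 dimensions, n = 50 time points.
rng = np.random.default_rng(0)
n_instances, d, n = 10, 3, 50

X = rng.normal(size=(n_instances, d, n))  # a collection of instances
y = rng.integers(0, 2, size=n_instances)  # one class label per instance

T = X[0]                  # a single time series instance T
assert T.shape == (d, n)  # d > 1, so T is multivariate
```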
        </sec>
        <sec id="sec-3-1-2">
          <title>3.1.2. Dimension selection techniques</title>
          <p>
            Recent work by [
            <xref ref-type="bibr" rid="ref5">5</xref>
            ] has proposed two supervised dimension selection methods, which our
suggestions build upon:
• Elbow class sum (ECS): This method calculates the distance matrix between class centroid values and, for each dimension, sums all the pairwise distances. The elbow method is then applied to find a cut-off point.
• Elbow class pairwise (ECP): This method introduces an additional step to ECS. Instead of summation, it applies the elbow method to the pairwise distances of each class pair for every dimension and then takes the union of the eligible dimensions obtained from each pair.
          </p>
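          <p>A minimal sketch of how ECS and ECP might be realized, under our reading of [5]. The largest-gap elbow cut and the per-dimension Euclidean centroid distance are simplifying assumptions, not the reference implementation:</p>

```python
import numpy as np

def elbow_cut(scores):
    """Keep the indices whose score lies above the largest drop in the
    descending score curve -- one simple reading of the elbow heuristic."""
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores)[::-1]          # indices, highest score first
    s = scores[order]
    cut = int(np.argmax(s[:-1] - s[1:])) + 1  # cut just after the largest drop
    return sorted(order[:cut].tolist())

def ecs_select(X, y):
    """Elbow class sum (ECS), sketched: per dimension, sum the pairwise
    Euclidean distances between class centroids, then elbow-cut the sums.
    X has shape (instances, dimensions, time points)."""
    classes = np.unique(y)
    centroids = np.stack([X[y == c].mean(axis=0) for c in classes])
    sums = np.zeros(X.shape[1])
    for i in range(len(classes)):
        for j in range(i + 1, len(classes)):
            sums += np.linalg.norm(centroids[i] - centroids[j], axis=1)
    return elbow_cut(sums)

def ecp_select(X, y):
    """Elbow class pairwise (ECP), sketched: elbow-cut the distances of each
    class pair separately, then take the union of the selected dimensions."""
    classes = np.unique(y)
    centroids = np.stack([X[y == c].mean(axis=0) for c in classes])
    keep = set()
    for i in range(len(classes)):
        for j in range(i + 1, len(classes)):
            dists = np.linalg.norm(centroids[i] - centroids[j], axis=1)
            keep.update(elbow_cut(dists))
    return sorted(keep)
```

          <p>On toy data where only dimension 0 separates the two classes, both selectors return that single dimension.</p>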
          <p>These methods assume that the selected dimensions span the entire time series. While ECP is regarded as the better of the two, it sometimes fails to select a smaller subset, returning the whole set of dimensions. We address this issue by suggesting a segmentation-based application.</p>
        </sec>
        <sec id="sec-3-1-3">
          <title>3.1.3. Discretization techniques</title>
          <p>
Discretization techniques have been actively used in interpretable time series classifiers to convert time series into sets of symbols. Each time point t_i ∈ t is converted into an event e_i, creating an event sequence e. Each event takes a unique event label σ. Z-Time uses the following three techniques:
• Equal width discretization (EWD): Assuming t follows a uniform distribution, discretization boundaries are defined so that all event labels have value ranges of equal length, i.e., b_{i+1} − b_i = b_{j+1} − b_j for any two consecutive boundary pairs [
            <xref ref-type="bibr" rid="ref15">15</xref>
            ].
• Equal frequency discretization (EFD): Discretization boundaries are defined so that each event label occurs with the same frequency in e, i.e., |{e ∈ e : e = σ_i}| = |{e ∈ e : e = σ_j}| for any two event labels σ_i, σ_j [
            <xref ref-type="bibr" rid="ref15">15</xref>
            ].
• Symbolic aggregate approximation (SAX): SAX uses a window size w and an event label (alphabet) size to perform both discretization and summarization. The discretization boundaries are defined assuming t follows a normal distribution [
            <xref ref-type="bibr" rid="ref10">10</xref>
            ], using the points that produce
equi-sized areas under the normal distribution curve.
          </p>
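          <p>The three boundary definitions can be sketched as follows. This is our illustration only; Z-Time's actual parameterization, and SAX's windowed summarization (PAA) step, are omitted:</p>

```python
import numpy as np
from statistics import NormalDist

def ewd_bins(t, k):
    """EWD: k bins of equal width between min(t) and max(t)."""
    return np.linspace(t.min(), t.max(), k + 1)[1:-1]

def efd_bins(t, k):
    """EFD: boundaries at the k-quantiles of t, so every bin holds
    (roughly) the same number of points."""
    return np.quantile(t, np.arange(1, k) / k)

def sax_bins(t, k):
    """SAX breakpoints: equal-probability areas under a normal
    distribution fitted to t (classic SAX z-normalises first)."""
    nd = NormalDist(mu=float(t.mean()), sigma=float(t.std()))
    return np.array([nd.inv_cdf(i / k) for i in range(1, k)])
```

          <p>On uniformly spread data, EWD and EFD produce nearly identical boundaries, which is exactly the kind of redundancy that FDS (Section 3.3) exploits.</p>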
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Dimension selection based on segmentation of time series (DST)</title>
        <p>Our assumption is that important dimensions may vary across different time ranges, whereas current methods select dimensions by treating the time series as a whole. Selecting dimensions from segments of the time series could therefore enhance performance. This approach might not be feasible for interpretable classifiers that use sliding windows and ensemble structures, but Z-Time applies segmentation to capture the distinct properties of different time periods. This method is effective for time series with many unrelated parts or where the distribution changes over time. Unlike sliding windows, which overlap over time points and discretize values within these windows, segmentation offers a more straightforward approach to interpretability, since each time point is discretized only once.</p>
        <p>First, a time series instance T is divided into k equal-length segments {T_1, . . . , T_k}. Then, a dimension selection algorithm such as ECP or ECS is applied to each segment T_i, resulting in different dimensions being selected for each segment. When multiple time series instances are considered, the dimension selection algorithm is applied to the set of instances to ensure consistent dimension selection. Second, after segmentation, Z-Time is applied to each segment individually. This results in k different feature sets, which are concatenated to create a single feature set for input to an interpretable linear classifier.</p>
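        <p>The segmentation step above can be sketched as follows. The helper is ours; any dimension selector (e.g. ECS or ECP) can be plugged in via the select argument, and the toy selector below is purely illustrative:</p>

```python
import numpy as np

def dst_split(X, y, k, select):
    """DST, sketched: cut every instance into k equal-length segments and
    run a dimension-selection routine on each segment separately.
    X: (instances, dimensions, time points); select: (X_seg, y) -> [dims]."""
    seg_len = X.shape[2] // k
    result = []
    for s in range(k):
        seg = X[:, :, s * seg_len:(s + 1) * seg_len]
        result.append(select(seg, y))  # dimensions kept for segment s
    return result

# Toy two-class selector for illustration: keep dimensions whose
# between-class centroid gap is above the segment's average gap.
def toy_select(Xs, y):
    c0, c1 = Xs[y == 0].mean(axis=0), Xs[y == 1].mean(axis=0)
    gap = np.linalg.norm(c0 - c1, axis=1)
    return sorted(np.flatnonzero(gap > gap.mean()).tolist())
```

        <p>On a series whose discriminative dimension changes halfway through, the per-segment selections differ, which whole-series selection cannot express.</p>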
        <p>While segmentation has improved Z-Time's performance, it has the side effect of linearly increasing the number of features, as each feature created from each segment must be distinguishable. This necessitates an additional step to significantly reduce the number of features.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Feature selection based on discretization similarity (FDS)</title>
        <p>The second proposed strategy to reduce features involves finding similarities among discretization techniques. Z-Time uses three different discretizations, each with and without PAA, creating six different representations for each dimension of a time series instance. Different techniques can sometimes produce similar bin boundaries, making it redundant to retain all of them. This method compares the boundaries of the bins created by each discretization technique by calculating the differences in boundary values. After computing the sum of boundary differences, the elbow method is used to select the techniques with significant differences.</p>
        <p>Z-Time uses equal width discretization (EWD), equal frequency discretization (EFD), and SAX. Suppose the set of boundaries produced by a technique p is g_p = {g_{p,1}, g_{p,2}, . . . , g_{p,m}}. The average difference between two techniques p and q is calculated as follows:</p>
        <p>(1/m) Σ_{i=1}^{m} (g_{p,i} − g_{q,i})²</p>
        <p>The elbow method is then applied to identify the techniques with sufficiently high average differences, resulting in a set g′ where |g′| ≤ |g|. While this strategy does not reduce the number of dimensions, it significantly reduces the number of features, since in the worst case the number of features is quadratic in the number of discretization techniques. Each retained technique creates a different set of event labels, thereby enhancing the overall interpretability of the classification model.</p>
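        <p>Putting the pieces together, FDS might look like the following sketch. This is our interpretation of the description above: each technique is scored by its summed average difference to the other techniques, and a largest-gap elbow cut keeps the most distinct ones:</p>

```python
import numpy as np

def avg_sq_diff(g_p, g_q):
    """Mean squared difference between two aligned boundary sets,
    matching the average-difference formula above."""
    g_p, g_q = np.asarray(g_p, float), np.asarray(g_q, float)
    return float(np.mean((g_p - g_q) ** 2))

def fds_select(boundaries):
    """FDS, sketched: score each technique by its summed average difference
    to every other technique, then keep the techniques above the largest
    gap in the descending score curve (elbow heuristic).
    boundaries: dict mapping technique name -> list of bin boundaries."""
    names = list(boundaries)
    scores = np.array([
        sum(avg_sq_diff(boundaries[a], boundaries[b]) for b in names if b != a)
        for a in names
    ])
    order = np.argsort(scores)[::-1]          # most distinct technique first
    s = scores[order]
    cut = int(np.argmax(s[:-1] - s[1:])) + 1  # cut just after the largest drop
    return sorted(names[int(i)] for i in order[:cut])
```

        <p>When EWD and EFD boundaries nearly coincide, only the distinct technique survives the cut, removing the redundant representations.</p>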
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experiments</title>
      <p>
        Our experiment aims to evaluate the effectiveness of our proposed methods in reducing the
number of features created by Z-Time. We used 26 UEA multivariate datasets with no missing
values for our experiments [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. The properties of these datasets can be found in the original
repository. We excluded two datasets: FaceDetection, due to a memory limit of 128 GB for the
chosen parameters, and PenDigits, as it was too small to apply segmentation. We compared different combinations of the following options:
• Setting 1: Dimension selection methods (ECP, ECS)
• Setting 2: Segmentation (with/without DST)
• Setting 3: Feature reduction (with/without FDS)
• Setting 4: Number of segments (k = 2, k = 4)
      </p>
      <p>In total, there are 16 diferent combinations per dataset. While detailed results are available
in our repository, we present the average values in Table 1. Table 1 shows relative numbers for
the number of features, the number of dimensions, accuracy, and the total runtime compared to
the default setting without any feature reduction technique. The best technique is marked in
bold, while the second best is underlined.</p>
      <p>First, we observe that FDS significantly reduces the number of features, achieving an additional
average reduction of 24.9% of the original features for the same setting. Without FDS, the
minimum feature number is 26% of the original, but it can be further reduced to 12% with FDS.
Additionally, while ECP generally shows better accuracy than ECS, ECP reduces features to
26% of the original, whereas ECS can reduce them to 12% while maintaining the same accuracy
with k = 2.</p>
      <p>Since Table 1 only shows average values, it might obscure the effect of DST, as results with DST always appear inferior on average. While standard ECP and ECS maintain better accuracy on average, ECP and ECS after segmentation show better accuracy in many instances. ECP after segmentation performs better in terms of accuracy on 16 datasets, considering all different settings (FDS and the number of segments). Table 2 shows the number of datasets where better accuracy is achieved by using DST. ECS with DST shows better accuracy than ECS without DST on 16 datasets with k = 4 and on 13 datasets with k = 2, which is more than half in both cases. However, with ECP, there is no meaningful improvement from applying DST. ECS after segmentation shows a significant drop on specific datasets, affecting the overall average, mainly due to an incorrect choice of the number of segments.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>In this paper, we introduced two methods to enhance the interpretability of multivariate time
series classifiers: 1) Dimension selection based on segmentation of time series (DST) and
2) Feature selection based on discretization similarity (FDS). Our experiments on 24 UEA
multivariate datasets demonstrated that these methods could significantly reduce the number
of features, by up to 86%, while maintaining accuracy, with only an average accuracy drop of
up to 9%. These methods simplify the feature space and enhance interpretability, offering a
practical solution for multivariate time series classification without compromising predictive
performance. Future work can explore optimizing segmentation processes with dynamic lengths
and refining similarity measures in FDS to enhance the quality of features.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>T.</given-names>
            <surname>Le Nguyen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gsponer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Ilie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>O'Reilly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Ifrim</surname>
          </string-name>
          ,
          <article-title>Interpretable time series classification using linear models and multi-resolution multi-domain symbolic representations</article-title>
          ,
          <source>Data Mining and Knowledge Discovery</source>
          <volume>33</volume>
          (
          <year>2019</year>
          )
          <fpage>1183</fpage>
          -
          <lpage>1222</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>L.</given-names>
            <surname>Feremans</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Cule</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Goethals</surname>
          </string-name>
          ,
          <article-title>Petsc: pattern-based embedding for time series classification</article-title>
          ,
          <source>Data Mining and Knowledge Discovery</source>
          <volume>36</volume>
          (
          <year>2022</year>
          )
          <fpage>1015</fpage>
          -
          <lpage>1061</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Lindgren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Papapetrou</surname>
          </string-name>
          ,
          <article-title>Z-Time: efficient and effective interpretable multivariate time series classification</article-title>
          ,
          <source>Data Mining and Knowledge Discovery</source>
          <volume>38</volume>
          (
          <year>2024</year>
          )
          <fpage>206</fpage>
          -
          <lpage>236</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>C. O. S.</given-names>
            <surname>Sorzano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Vargas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. P.</given-names>
            <surname>Montano</surname>
          </string-name>
          ,
          <article-title>A survey of dimensionality reduction techniques</article-title>
          ,
          <source>arXiv preprint arXiv:1403.2877</source>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>B.</given-names>
            <surname>Dhariyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. L.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Ifrim</surname>
          </string-name>
          ,
          <article-title>Fast channel selection for scalable multivariate time series classification</article-title>
          ,
          <source>in: Advanced Analytics and Learning on Temporal Data: 6th ECML PKDD Workshop</source>
          , AALTD 2021, Bilbao, Spain,
          <year>September 13</year>
          ,
          <year>2021</year>
          ,
          <source>Revised Selected Papers 6</source>
          , Springer,
          <year>2021</year>
          , pp.
          <fpage>36</fpage>
          -
          <lpage>54</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A.</given-names>
            <surname>Bagnall</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lines</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Vickers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Keogh</surname>
          </string-name>
          ,
          <article-title>The uea &amp; ucr time series classification repository</article-title>
          ,
          <year>2018</year>
          . URL: http://www.timeseriesclassification.com.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Lines</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Taylor</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bagnall</surname>
          </string-name>
          ,
          <article-title>Hive-cote: The hierarchical vote collective of transformation-based ensembles for time series classification</article-title>
          , in: ICDM, IEEE,
          <year>2016</year>
          , pp.
          <fpage>1041</fpage>
          -
          <lpage>1046</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M.</given-names>
            <surname>Middlehurst</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Large</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Flynn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lines</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bostrom</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bagnall</surname>
          </string-name>
          ,
          <article-title>Hive-cote 2.0: a new meta ensemble for time series classification</article-title>
          ,
          <source>Machine Learning</source>
          <volume>110</volume>
          (
          <year>2021</year>
          )
          <fpage>3211</fpage>
          -
          <lpage>3243</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>H.</given-names>
            <surname>Ismail Fawaz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Lucas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Forestier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Pelletier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. F.</given-names>
            <surname>Schmidt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Weber</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. I.</given-names>
            <surname>Webb</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Idoumghar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.-A.</given-names>
            <surname>Muller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Petitjean</surname>
          </string-name>
          ,
          <article-title>Inceptiontime: Finding alexnet for time series classification</article-title>
          ,
          <source>Data Mining and Knowledge Discovery</source>
          <volume>34</volume>
          (
          <year>2020</year>
          )
          <fpage>1936</fpage>
          -
          <lpage>1962</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>J.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Keogh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Wei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lonardi</surname>
          </string-name>
          ,
          <article-title>Experiencing sax: a novel symbolic representation of time series</article-title>
          ,
          <source>Data Mining and Knowledge Discovery</source>
          <volume>15</volume>
          (
          <year>2007</year>
          )
          <fpage>107</fpage>
          -
          <lpage>144</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>P.</given-names>
            <surname>Schäfer</surname>
          </string-name>
          ,
          <article-title>The boss is concerned with time series classification in the presence of noise</article-title>
          ,
          <source>Data Mining and Knowledge Discovery</source>
          <volume>29</volume>
          (
          <year>2015</year>
          )
          <fpage>1505</fpage>
          -
          <lpage>1530</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>B.</given-names>
            <surname>Kathirgamanathan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Cunningham</surname>
          </string-name>
          ,
          <article-title>A feature selection method for multi-dimension time-series data</article-title>
          ,
          <source>in: Advanced Analytics and Learning on Temporal Data: 5th ECML PKDD Workshop</source>
          , AALTD 2020, Ghent, Belgium,
          <year>September 18</year>
          ,
          <year>2020</year>
          ,
          <source>Revised Selected Papers 6</source>
          , Springer,
          <year>2020</year>
          , pp.
          <fpage>220</fpage>
          -
          <lpage>231</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>S.</given-names>
            <surname>Han</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Niculescu-Mizil</surname>
          </string-name>
          ,
          <article-title>Supervised feature subset selection and feature ranking for multivariate time series without feature extraction</article-title>
          ,
          <source>arXiv preprint arXiv:2005.00259</source>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>B.</given-names>
            <surname>Dhariyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. Le</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Ifrim</surname>
          </string-name>
          ,
          <article-title>Scalable classifier-agnostic channel selection for multivariate time series classification</article-title>
          ,
          <source>Data Mining and Knowledge Discovery</source>
          <volume>37</volume>
          (
          <year>2023</year>
          )
          <fpage>1010</fpage>
          -
          <lpage>1054</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. I.</given-names>
            <surname>Webb</surname>
          </string-name>
          ,
          <article-title>A comparative study of discretization methods for naive-bayes classifiers</article-title>
          ,
          <source>in: PKAW</source>
          , volume
          <year>2002</year>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>