<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Online Explainable Ensemble of Tree Models Pruning for Time Series Forecasting</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Amal Saadallah</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Lamarr Institute for Machine Learning and AI</institution>
          ,
          <addr-line>Dortmund</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Tree-based models are commonly used in time series forecasting due to their inherent interpretability, which makes them preferable to more complex black-box models. However, simple tree-based models are prone to overfitting, limiting their applicability in real-world scenarios. Ensembles of tree-based models are employed to mitigate this, but ensemble pruning is challenging, especially in the presence of dynamic time series data and concept drift. In this paper, we use TreeSHAP, a tree-specific explainability tool, to perform online tree-based ensemble pruning that adapts dynamically to changes in the time series, addressing the concept drift issue. Empirical evaluations on real-world time series datasets demonstrate that our method performs on par with or better than state-of-the-art techniques. In future research, we plan to automate the determination of the optimal number of clusters for ensemble pruning by leveraging ensemble properties like diversity, accuracy, and stability. This automation aims to enhance both the flexibility and explainability of the model selection process. Given that this work is in its early stages, we seek feedback and collaboration with experts to create a robust and explainable framework for ensemble-based time series forecasting.</p>
      </abstract>
      <kwd-group>
        <kwd>Tree Models</kwd>
        <kwd>Online Ensemble Pruning</kwd>
        <kwd>TreeSHAP</kwd>
        <kwd>Time Series Forecasting</kwd>
        <kwd>Concept-drift</kwd>
        <kwd>Explainability</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Time series forecasting is crucial for real-time planning and decision-making across various
fields like traffic management, weather prediction, and financial markets. However, it is also
one of the most challenging tasks due to the complex and dynamic nature of time series data,
which often involves non-stationary variations and is susceptible to concept drift [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. This
makes accurate forecasting inherently difficult, necessitating models that can adapt to changing
data patterns [
        <xref ref-type="bibr" rid="ref2 ref3 ref4 ref5 ref6 ref7">2, 3, 4, 5, 6, 7</xref>
        ]. Given these challenges, explainability in forecasting models has
become increasingly important, especially for safety-critical applications. Tree-based models
are often favored for their intrinsic explainability, but identifying appropriate models for specific
time series requires adaptability due to time-varying characteristics. Decision Trees and their
ensembles, like Random Forests and Gradient-boosted Trees, are commonly used for time series
forecasting. However, these models can struggle with dynamic data since they typically operate
in a static manner, not inherently considering variations in the underlying time series. In
addition, combining multiple models into ensembles can improve forecasting accuracy, but
at the cost of explainability. To address these issues, we propose an online ensemble pruning
approach for time series forecasting, where the ensemble members are selected based on an
adaptive clustering procedure that uses TreeSHAP values to group models with similar modeling
paradigms. This methodology not only ensures diversity within the ensemble but also allows for
an explainable selection process by indicating which aspects of the time series data contribute
most to the predictions.
      </p>
      <p>In our future research, one key goal is to automatically determine the optimal number of
clusters, which corresponds to the ideal number of trees or ensemble members in the ensemble.
This would involve using ensemble properties such as diversity, accuracy, and stability to guide
the selection of the most suitable cluster count. By automating this process, we aim to improve
the ensemble’s flexibility and effectiveness in adapting to dynamic time series data. Moreover,
we intend to deepen the explainability aspect of our approach by explicitly demonstrating that
selecting models based on different TreeSHAP values aligns with distinct modeling paradigms
and hypotheses. This could be achieved by visualizing or analyzing how these varying TreeSHAP
values translate into different interpretations of the underlying data, providing insights into
the rationale behind model selection. Given that this is early-stage work, we plan to engage
with experts in the field to exchange ideas and gather feedback. Collaboration with specialists
will be instrumental in refining our methodology for selecting the optimal number of trees
and enhancing explainability. By incorporating diverse perspectives, we hope to develop a
robust and transparent approach that addresses the complexities of time series forecasting while
maintaining clarity in model selection and ensemble pruning. This collaborative effort will
contribute to building a reliable framework for ensemble-based forecasting, with a particular
emphasis on explainability and adaptability.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Methodology</title>
      <p>
        Our proposed method uses TreeSHAP for online ensemble pruning via model clustering. First,
we define the notation used. Second, we describe Shapley values with a focus on TreeSHAP
values [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Third, we show how we generate the candidate tree-based models. Finally, we
demonstrate how TreeSHAP values are used for model clustering to allow for efficient ensemble
pruning and how the whole process is made adaptive to the changes in the time series.
      </p>
      <sec id="sec-2-1">
        <title>2.1. Preliminaries</title>
        <p>
A time series $Y$ is a temporal sequence of values, where $y_{1:t} = \{y_1, y_2, \cdots, y_t\}$ is a sequence
of $Y$ until time $t$ and $y_t$ is the value of $Y$ at time $t$. Denote with $\mathcal{T} = \{T_1, T_2, \cdots, T_m\}$ the
pool of $m$ tree-based models trained to approximate a true unknown function $f$ that generated
$Y$. Let $\hat{y}_{t+h} = (\hat{y}^1_{t+h}, \hat{y}^2_{t+h}, \cdots, \hat{y}^m_{t+h})$ be the vector of forecast values of $Y$ at a future time
instant $t+h$, $h \geq 1$ (i.e., of $y_{t+h}$) by each of the models in $\mathcal{T}$. An ensemble model $\bar{f}_{\mathcal{T}}$ of $\mathcal{T}$ at time
instant $t+h$ can be formally expressed as a convex combination of the forecasts of the models in
$\mathcal{T}$: $\bar{f}_{\mathcal{T}}(\hat{y}_{t+h}) = \sum_{i=1}^{m} w^i_{t+h}\,\hat{y}^i_{t+h}$, where $w^i_{t+h}$, $i \in [1, m]$, are the ensemble weights. The weights
are constrained to be positive and sum to one. In addition, it can be seen from the notation that
the weights are time-dependent. This is one of the requirements in online ensemble learning,
where the weights are required to be set in a timely manner to cope with the dynamic nature
of the time series and the time-changing performance of the ensemble members [
          <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
          ]. The goal of dynamic online ensemble pruning is to identify the subset of models $\mathcal{S} \subset \mathcal{T}$ that
should compose the ensemble at each time step $t+h$ such that the expected prediction error of
the pruned ensemble is reduced compared to the full ensemble $\bar{f}_{\mathcal{T}}$ for each forecast:
        </p>
        <p>$$\mathbb{E}\big[(y_{t+h} - \bar{f}_{\mathcal{T}}(\hat{y}_{t+h}))^2 \mid y_{1:t+h-1}\big] - \mathbb{E}\big[(y_{t+h} - \bar{f}_{\mathcal{S}}(\hat{y}_{t+h}))^2 \mid y_{1:t+h-1}\big] \geq 0 \quad (1)$$</p>
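        <p>As a minimal illustration of this convex combination, the following Python sketch (illustrative names only; the paper's own experiments were implemented in R) computes an ensemble forecast from the member forecasts and their convex weights:</p>
        <preformat>
# Minimal sketch (illustrative, not the paper's implementation): an ensemble
# forecast as a convex combination of the m member forecasts.
import numpy as np

def ensemble_forecast(member_forecasts: np.ndarray, weights: np.ndarray) -> float:
    """Combine the member forecasts y_hat[i] with convex weights w[i]."""
    weights = np.asarray(weights, dtype=float)
    assert np.all(weights >= 0), "weights must be non-negative"
    assert np.isclose(weights.sum(), 1.0), "weights must sum to one"
    return float(np.dot(weights, member_forecasts))

# Example: three members with uniform weights.
y_hat = np.array([1.2, 0.9, 1.1])
print(ensemble_forecast(y_hat, np.ones(3) / 3))  # 1.0666...
</preformat>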
      </sec>
      <sec id="sec-2-2">
        <title>2.2. TreeSHAP Ensemble Learning</title>
        <sec id="sec-2-2-1">
          <title>2.2.1. Ensemble Pruning</title>
          <p>We divide the time series $y_{1:t}$ into $y_{tr} = \{y_1, y_2, \cdots, y_{t-w}\}$ and
$y_{val} = \{y_{t-w+1}, y_{t-w+2}, \cdots, y_t\}$, with $w$ a provided window size. $y_{tr}$ is used for training the
models in $\mathcal{T}$ and $y_{val}$ is used to compute the TreeSHAP values. For each tree-based model
$T_j \in \mathcal{T}$, for each observation $y_{t-w+i} \in y_{val}$ with $i \in [1, w]$, we compute a TreeSHAP value
$\phi^j_l(y_{t-w+i})$ for each lagged value, i.e., $l \in [1, L_j]$, where $L_j$ is the number of lags on which the
model $T_j$ is trained. Then, we aggregate absolute SHAP values over all the observations in
$y_{val}$ to acquire the SHAP-based lag importance $I^j_l$ for each lag $l \in [1, L_j]$ using the model $T_j$:</p>
          <p>$$I^j_l = \frac{1}{w} \sum_{i=1}^{w} \big|\phi^j_l(y_{t-w+i})\big|, \quad \forall l \in [1, L_j], \; \forall T_j \in \mathcal{T} \quad (2)$$</p>
          <p>
            Each model $T_j \in \mathcal{T}$ can then be characterized by a vector $\mathcal{I}^j = \{I^j_1, I^j_2, \cdots, I^j_{L_j}\}$. The models
can thus be clustered using their SHAP-based lag importance vectors $\mathcal{I}^j$. However, different
models in $\mathcal{T}$ might be trained using different lag values. As a result, the length of the vectors
$\mathcal{I}^j$ can vary between $L_{min}$ and $L_{max}$. Clustering distance measures exist that can handle
vectors of different lengths [
            <xref ref-type="bibr" rid="ref9">9</xref>
            ]. However, we are mainly interested in grouping models based
on the way they represent the relationship between the input lagged values and the output.
Therefore, we assume that the models that are trained using a lag value $L_j$ lower than $L_{max}$
ignore the importance and the contribution of lagged features that are greater than $L_j$. In
other words, if the model $T_j$ is trained on $L_j \leq L_{max}$, then for each $l$ such that $L_j \leq l \leq L_{max}$, the
value of its corresponding SHAP-based lag importance $I^j_l$ is set to zero. In this manner, we
bring all the vectors $\mathcal{I}^j$ for all the models $T_j \in \mathcal{T}$ to the same length $L_{max}$, and we use
K-means with Euclidean distance for model clustering. Models belonging to different clusters
are expected to have different modeling paradigms of the contributions of different lagged
values to the predictions, which contributes to boosting the ensemble diversity. We select only
cluster representatives to take part in the ensemble: we simply select the closest model to each
cluster center.
          </p>
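          <p>To make the whole pruning step concrete, here is a hedged Python sketch (the paper's experiments were run in R; this illustrative re-implementation relies on the shap and scikit-learn packages, and all function and variable names are assumptions) that computes the lag importances of Eq. (2), zero-pads them to length $L_{max}$, clusters them with K-means, and keeps the model closest to each cluster center:</p>
          <preformat>
# Sketch of the pruning step under stated assumptions: each pool member is a
# fitted tree-based regressor paired with its own validation design matrix,
# whose number of columns equals the number of lags L_j it was trained on.
import numpy as np
import shap
from sklearn.cluster import KMeans

def lag_importance(model, X_val: np.ndarray) -> np.ndarray:
    """Eq. (2): mean absolute TreeSHAP value per lagged input over y_val."""
    phi = shap.TreeExplainer(model).shap_values(X_val)  # shape (w, L_j)
    return np.abs(phi).mean(axis=0)                     # shape (L_j,)

def prune(models, X_vals, L_max, n_clusters):
    """Cluster zero-padded lag-importance vectors; keep one model per cluster."""
    I = np.zeros((len(models), L_max))
    for j, (model, X_val) in enumerate(zip(models, X_vals)):
        imp = lag_importance(model, X_val)
        I[j, :len(imp)] = imp  # importances of lags beyond L_j stay zero
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(I)
    selected = []
    for c in range(n_clusters):
        members = np.where(km.labels_ == c)[0]
        # Representative: the model whose vector is closest to the cluster center.
        dist = np.linalg.norm(I[members] - km.cluster_centers_[c], axis=1)
        selected.append(int(members[np.argmin(dist)]))
    return selected, I, km.labels_
</preformat>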
        </sec>
        <sec id="sec-2-2-2">
          <title>2.2.2. Ensemble Adaptation</title>
          <p>Streaming time series data is prone to significant changes, leading to concept drifts. To account
for these shifts, the selection of ensemble members must be updated, allowing for the inclusion of
models that can better address newly emerging patterns. Concept drift is detected by monitoring
deviations in the mean of the time series over time, using the Hoeffding Bound to evaluate whether
these deviations are significant. If a drift is detected, an alarm is triggered, the TreeSHAP-based
model clustering is updated, and the ensemble is adjusted to reflect the new patterns in the data.</p>
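          <p>The drift test can be sketched as follows (a minimal Python illustration, assuming observations scaled to [0, 1] so that the Hoeffding bound applies; the window mechanics and confidence level are illustrative choices, not the paper's exact procedure):</p>
          <preformat>
# Sketch of Hoeffding-bound drift detection on the series mean (illustrative).
import math

def hoeffding_eps(n, delta=0.05):
    """Mean deviation tolerated with confidence 1 - delta (values in [0, 1])."""
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n))

def drift_detected(reference, recent, delta=0.05):
    """Alarm when the two window means differ more than both bounds allow."""
    mu_ref = sum(reference) / len(reference)
    mu_new = sum(recent) / len(recent)
    eps = hoeffding_eps(len(reference), delta) + hoeffding_eps(len(recent), delta)
    return abs(mu_ref - mu_new) > eps

# On an alarm, recompute the TreeSHAP-based clustering on the latest window
# and re-select the cluster representatives that form the ensemble.
</preformat>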
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Experiments</title>
      <p>Our method is denoted in the following as OEP-TT: Online explainable Ensemble Pruning of
Tree models for Time series forecasting.</p>
      <sec id="sec-3-1">
        <title>3.1. Experimental Setup</title>
        <p>
          We use 100 univariate time series datasets from various application domains, including financial,
weather, and synthetic data. These datasets are provided by the Monash Forecasting Repository
[
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. We process each time series $Y$ by using the first 50% for training ($y_{tr}$), the following
25% for validation ($y_{val}$), and the remaining 25% for testing. Due to this way of splitting the
time series, we discard series that are shorter than 250 observations to allow enough training
and validation data. All experiments have been performed on consumer hardware, namely on a
2022 MacBook Pro in R.
        </p>
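        <p>The split can be sketched as follows (illustrative Python; the helper name split_series is hypothetical):</p>
        <preformat>
# Sketch of the chronological 50/25/25 split described above.
def split_series(y):
    """Return (train, validation, test) or None for series under 250 points."""
    n = len(y)
    if n >= 250:
        a, b = n // 2, n // 2 + n // 4
        return y[:a], y[a:b], y[b:]  # first 50%, next 25%, last 25%
    return None  # discard series too short for training and validation
</preformat>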
      </sec>
      <sec id="sec-3-2">
        <title>3.2. OEP-TT Setup</title>
        <p>Tree-based models set-up: We construct a pool $\mathcal{T}$ of tree-based models using different
parameter settings that are summarized in Table 1.</p>
        <table-wrap id="tbl1">
          <label>Table 1</label>
          <caption>
            <p>Tree-based model families, their parameters, and the configuration values used to generate the pool.</p>
          </caption>
          <table>
            <thead>
              <tr><th>Tree-based Model</th><th>Parameters</th><th>Configurations</th></tr>
            </thead>
            <tbody>
              <tr><td>Decision Tree (DT)</td><td>Maximum depth</td><td>{4, 8, 16}</td></tr>
              <tr><td rowspan="3">Random Forest (RF)</td><td>Number of trees</td><td>{50, 100, 150, 200}</td></tr>
              <tr><td>Num. of variables sampled at each split</td><td>{3, 5, 7}</td></tr>
              <tr><td>Minimum size of terminal nodes</td><td>{5, 10, 15}</td></tr>
              <tr><td rowspan="3">Gradient Boosted DT (GBDT)</td><td>Number of trees</td><td>{50, 100, 150, 200}</td></tr>
              <tr><td>Maximum depth of each tree</td><td>{5, 7, 15}</td></tr>
              <tr><td>Shrinkage parameter</td><td>{0.001, 0.01, 0.1}</td></tr>
              <tr><td rowspan="4">eXtreme Gradient Boosting (XGBoost)</td><td>Max number of iterations</td><td>{50, 100, 150, 200}</td></tr>
              <tr><td>Step size of each boosting step</td><td>{0.001, 0.01, 0.1}</td></tr>
              <tr><td>Maximum depth</td><td>{5, 7, 15}</td></tr>
              <tr><td>Metric</td><td>{L1, L2}-regularization</td></tr>
              <tr><td rowspan="2">Light GBM (LGBM)</td><td>Max number of iterations</td><td>{50, 100}</td></tr>
              <tr><td>Maximum depth of each tree</td><td>{5, 7, 15}</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <p>The list of parameters and their value ranges in Table 1 is not exhaustive, and further
parameters and values can be considered to generate more base learners. We also vary the lag
parameter $L$ on which the tree-based models are trained, i.e., $L \in \{3, 5, 7, 10, 15, 20\}$.
Considering different combinations of all the parameters, we train a total of 294 tree-based
models.</p>
        <p>OEP-TT set-up: OEP-TT also has a number of hyper-parameters: $m$, the size of the pool
of tree-based models $\mathcal{T}$: 294; $w$, the size of the validation time window $y_{val}$: 25% of the data
length; and $|\mathcal{S}|$, the number of final selected models: 6.</p>
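        <p>As an illustration of how such a pool can be generated (a hedged Python sketch using scikit-learn; only the Random Forest grid from Table 1 is shown, the other families follow the same pattern, and the helper name rf_pool is hypothetical):</p>
        <preformat>
# Sketch: enumerate the Random Forest slice of the pool from Table 1.
from itertools import product
from sklearn.ensemble import RandomForestRegressor

LAGS = [3, 5, 7, 10, 15, 20]  # lag parameter L varied per model

def rf_pool():
    pool = []
    for n_trees, mtry, min_node, lag in product(
            [50, 100, 150, 200], [3, 5, 7], [5, 10, 15], LAGS):
        model = RandomForestRegressor(
            n_estimators=n_trees,
            max_features=mtry,          # variables sampled at each split
            min_samples_leaf=min_node,  # minimum size of terminal nodes
        )
        pool.append((model, lag))       # pair each model with its lag count L_j
    return pool
</preformat>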
      </sec>
      <sec id="sec-3-3">
        <title>3.3. State-of-the-Art Methods Setup</title>
        <p>
          We compare OEP-TT against State-of-the-Art (SoA) methods for online ensemble pruning,
tree-based ensembles, and time series forecasting in general. These models include: Auto-Regressive
Integrated Moving Average (ARIMA) [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ], Exponential Smoothing (ETS) [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ], Long Short-Term
Memory (LSTM) [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ], Multi-Layer Perceptron (MLP) [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ], Convolutional Neural Network with
LSTM (CNN-LSTM, Bi-LSTM) [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ], Random Forest (RF) [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ], Gradient-Boosted Decision
Trees (GBDT) [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ], eXtreme Gradient Boosting (XGBoost) [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ], and Light Gradient-Boosting
Machine (LGBM) [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ].
        </p>
        <p>
          To enable a fair comparison with OEP-TT, we feed to these ensemble pruning methods
the same pool of tree-based models $\mathcal{T}$ that was used for OEP-TT: Ens: Ensemble of all the
base models in $\mathcal{T}$; OCL [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]: Online drift-aware clustering of the tree-based models in $\mathcal{T}$ using
covariance-based clustering; OTOP [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]: Online drift-aware Top best-performing tree-based
models ranking using temporal correlation analysis; DEMSC [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]: Dynamic Ensemble Members
Selection using Clustering: Online drift-aware Top best-performing models ranking using
temporal correlation analysis combined with covariance-based clustering; ADE [
          <xref ref-type="bibr" rid="ref18 ref19">18, 19</xref>
          ]: a method
recently developed for the online dynamic construction of ensembles of forecasters. It uses a
meta-learning strategy that specializes the tree-based models across the input time series, and a
sequential weighting schema automatically selects ensemble members by setting their weights
to zero.
        </p>
        <p>We also compare OEP-TT to its variants: OEP-TT-ST: Static variant of OEP-TT, where pruning
is decided at the initial forecasting instant and kept fixed throughout testing; OEP-TT-Per: Pruning
is updated periodically in a blind manner (i.e., without considering the occurrence of drift).</p>
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Results</title>
        <sec id="sec-3-4-1">
          <title>3.4.1. Predictive Performance</title>
          <p>[Results table: wins and losses of OEP-TT against each comparison method.]</p>
        </sec>
        <sec id="sec-3-4-2">
          <title>3.4.2. Explainability Aspects</title>
          <p>Figure 1 shows the clusters of TreeSHAP values for the Saugeen River Flow dataset. The dots
on each lag value stand for the TreeSHAP values taken by the models belonging to the same
cluster, while the line connects the mean values to show the TreeSHAP values for each lag
of the representative model selected from each cluster (only for visualization purposes). Note
that we show the name of the model and the value of the first hyper-parameter plus the lag
value on which it is trained to distinguish between selected models belonging to the same
family of tree-based models, e.g., RF200(Lag10) and RF50(Lag7). It can be seen that different
clusters exhibit different patterns of lagged-value contributions to the target time series
observations. This confirms that our clustering procedure promotes ensemble diversity by
enforcing the selection of models that have different modeling paradigms and distinct views
on the importance of specific lag values. For example, while models in cluster 6 favor higher
lag values and emphasize the contribution of their corresponding value to the output forecast
value, models in cluster 5 are built on the assumption of restricting the memory of the models
to lower lag values ($L = 3$). We can notice that in 3 out of 6 clusters, models rely on restricted
lagged values (clusters 2, 3, and 5). Even with this limited width of memory, i.e., historical data,
they can excel in terms of predictive performance.</p>
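          <p>Such a per-cluster view can be reproduced from the importance matrix and cluster labels computed during pruning; the following matplotlib sketch (illustrative names, assuming the matrix I and labels from the pruning sketch in Section 2.2.1) draws one panel per cluster with per-model dots and a cluster-mean line:</p>
          <preformat>
# Sketch of a Figure 1-style plot: TreeSHAP lag importances per cluster.
import matplotlib.pyplot as plt
import numpy as np

def plot_clusters(I, labels, n_clusters, L_max):
    """Dots: lag importances of each model; line: the cluster mean."""
    fig, axes = plt.subplots(2, 3, sharey=True, figsize=(12, 6))
    lags = np.arange(1, L_max + 1)
    for c, ax in enumerate(axes.ravel()[:n_clusters]):
        members = I[labels == c]
        for row in members:
            ax.scatter(lags, row, s=8, alpha=0.4)
        ax.plot(lags, members.mean(axis=0))
        ax.set_title(f"Cluster {c + 1}")
        ax.set_xlabel("Lag value")
    axes[0, 0].set_ylabel("TreeSHAP value")
    plt.tight_layout()
    plt.show()
</preformat>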
          <p>[Figure 1: Clustered TreeSHAP values for the Saugeen River Flow dataset. Six panels (clusters 1-6) plot the TreeSHAP value (0.00-1.00) against the lag value (lag15 down to lag1). Selected models: RF200(Lag10), GBM(Lag5), XGboost(Lag3), RF50(Lag7), LGBM(Lag3), RF150(Lag15).]</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Concluding Remarks and Future Work</title>
      <p>
        This paper introduces OEP-TT, a novel method for online adaptive pruning of ensembles of
tree-based models. Through the use of TreeSHAP values, we are able to gain insight into its
decision-making process, both for model selection and for the relevance of the input time
series points. We showed the advantages of OEP-TT on 100 real-world datasets, both in terms of
predictive performance as well as its explainability aspects. In future work, we plan to extend
our method to hybrid model pools by using the most efficient Shapley value estimation methods
for each model family, such as TreeSHAP for tree-based models, DeepSHAP [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] for Neural
Networks, as well as KernelSHAP [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] for the remaining models, to tune the size of the ensemble, and to
dive further into the explainability aspects. Given that this is early-stage work, we plan to
engage with experts in the field to exchange ideas and gather feedback.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J.</given-names>
            <surname>Gama</surname>
          </string-name>
          , I. Žliobaitė,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bifet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Pechenizkiy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bouchachia</surname>
          </string-name>
          ,
          <article-title>A survey on concept drift adaptation</article-title>
          ,
          <source>ACM Computing Surveys (CSUR)</source>
          <volume>46</volume>
          (
          <year>2014</year>
          )
          <fpage>1</fpage>
          -
          <lpage>37</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Saadallah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Jakobs</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Morik</surname>
          </string-name>
          ,
          <article-title>Explainable online deep neural network selection using adaptive saliency maps for time series forecasting</article-title>
          , in: N.
          <string-name>
            <surname>Oliver</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          <string-name>
            <surname>Pérez-Cruz</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Kramer</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Read</surname>
            ,
            <given-names>J. A.</given-names>
          </string-name>
          <string-name>
            <surname>Lozano</surname>
          </string-name>
          (Eds.),
          <source>Machine Learning and Knowledge Discovery in Databases. Research Track</source>
          , Springer International Publishing, Cham,
          <year>2021</year>
          , pp.
          <fpage>404</fpage>
          -
          <lpage>420</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Saadallah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Jakobs</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Morik</surname>
          </string-name>
          ,
          <article-title>Explainable online ensemble of deep neural network pruning for time series forecasting</article-title>
          ,
          <source>Machine Learning</source>
          <volume>111</volume>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Saadallah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Mykula</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Morik</surname>
          </string-name>
          ,
          <article-title>Online adaptive multivariate time series forecasting</article-title>
          ,
          <source>in: Joint European conference on machine learning and knowledge discovery in databases</source>
          , Springer,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Saadallah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Priebe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Morik</surname>
          </string-name>
          ,
          <article-title>A drift-based dynamic ensemble members selection using clustering for time series forecasting</article-title>
          ,
          <source>in: Joint European conference on machine learning and knowledge discovery in databases</source>
          , Springer,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A.</given-names>
            <surname>Saadallah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Tavakol</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Morik</surname>
          </string-name>
          ,
          <article-title>An actor-critic ensemble aggregation model for time-series forecasting</article-title>
          , in: IEEE ICDE,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>Jakobs</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Saadallah</surname>
          </string-name>
          ,
          <article-title>Explainable adaptive tree-based model selection for time series forecasting</article-title>
          ,
          <source>arXiv preprint arXiv:2401.01124</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Lundberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.-I.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <article-title>A Unified Approach to Interpreting Model Predictions</article-title>
          , in: I. Guyon,
          <string-name>
            <given-names>U. V.</given-names>
            <surname>Luxburg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bengio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wallach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Fergus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Vishwanathan</surname>
          </string-name>
          , R. Garnett (Eds.),
          <source>Advances in Neural Information Processing Systems</source>
          <volume>30</volume>
          ,
          Curran Associates, Inc.,
          <year>2017</year>
          , pp.
          <fpage>4765</fpage>
          -
          <lpage>4774</lpage>
          . URL: http://papers.nips.cc/paper/7062-a-unified-approach-to-interpreting-model-predictions.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>D. J.</given-names>
            <surname>Berndt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Clifford</surname>
          </string-name>
          ,
          <article-title>Using dynamic time warping to find patterns in time series</article-title>
          ., in: KDD workshop, volume
          <volume>10</volume>
          ,
          <year>1994</year>
          , pp.
          <fpage>359</fpage>
          -
          <lpage>370</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>R.</given-names>
            <surname>Godahewa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bergmeir</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. I.</given-names>
            <surname>Webb</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. J.</given-names>
            <surname>Hyndman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Montero-Manso</surname>
          </string-name>
          ,
          <article-title>Monash time series forecasting archive</article-title>
          ,
          <source>in: Neural Information Processing Systems Track on Datasets and Benchmarks</source>
          ,
          <year>2021</year>
          . Forthcoming.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>G. E.</given-names>
            <surname>Box</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. M.</given-names>
            <surname>Jenkins</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. C.</given-names>
            <surname>Reinsel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. M.</given-names>
            <surname>Ljung</surname>
          </string-name>
          ,
          <article-title>Time series analysis: forecasting and control</article-title>
          , John Wiley &amp; Sons,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>F. A.</given-names>
            <surname>Gers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Eck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schmidhuber</surname>
          </string-name>
          ,
          <article-title>Applying lstm to time series predictable through time-window approaches</article-title>
          ,
          <source>in: Neural Nets WIRN Vietri-01</source>
          , Springer,
          <year>2002</year>
          , pp.
          <fpage>193</fpage>
          -
          <lpage>200</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>P.</given-names>
            <surname>Romeu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Zamora-Martínez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Botella-Rocamora</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Pardo</surname>
          </string-name>
          ,
          <article-title>Time-series forecasting of indoor temperature using pre-trained deep neural networks</article-title>
          ,
          <source>in: International conference on artificial neural networks</source>
          , Springer,
          <year>2013</year>
          , pp.
          <fpage>451</fpage>
          -
          <lpage>458</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>L.</given-names>
            <surname>Breiman</surname>
          </string-name>
          , Random forests,
          <source>Machine learning 45</source>
          (
          <year>2001</year>
          )
          <fpage>5</fpage>
          -
          <lpage>32</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>S. B.</given-names>
            <surname>Taieb</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. J.</given-names>
            <surname>Hyndman</surname>
          </string-name>
          ,
          <article-title>A gradient boosting approach to the kaggle load forecasting competition</article-title>
          ,
          <source>International journal of forecasting 30</source>
          (
          <year>2014</year>
          )
          <fpage>382</fpage>
          -
          <lpage>394</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>T.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Benesty</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Khotilovich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Cho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Mitchell</surname>
          </string-name>
          , I. Cano,
          <string-name>
            <given-names>T.</given-names>
            <surname>Zhou</surname>
          </string-name>
          , et al.,
          <article-title>Xgboost: extreme gradient boosting</article-title>
          ,
          <source>R package version 0.4-2 1</source>
          (
            <year>2015</year>
          )
          <fpage>1</fpage>
          -
          <lpage>4</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>G.</given-names>
            <surname>Ke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Meng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Finley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Chen</surname>
          </string-name>
          , W. Ma,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Ye</surname>
          </string-name>
          , T.-Y. Liu,
          <article-title>Lightgbm: A highly efficient gradient boosting decision tree</article-title>
          ,
          <source>Advances in neural information processing systems</source>
          <volume>30</volume>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>V.</given-names>
            <surname>Cerqueira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Torgo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Pinto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Soares</surname>
          </string-name>
          ,
          <article-title>Arbitrated ensemble for time series forecasting</article-title>
          ,
          <source>in: Joint European conference on machine learning and knowledge discovery in databases</source>
          , Springer,
          <year>2017</year>
          , pp.
          <fpage>478</fpage>
          -
          <lpage>494</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>V.</given-names>
            <surname>Cerqueira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Torgo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Pinto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Soares</surname>
          </string-name>
          , Arbitrage of forecasting experts,
          <source>Machine Learning</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>