<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Nancy, France; mauricio.orozco@itmerida.edu.mx (M. G. Orozco-del-Castillo)</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>CBR-foX: A generic post-hoc case-based reasoning method for the explanation of time-series forecasting</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Moisés Fernando Valdez-Ávila</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gerardo Arturo Pérez-Pérez</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Humberto Sarabia-Osorio</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Carlos Bermejo-Sabbagh</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mauricio G. Orozco-del-Castillo</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Tecnológico Nacional de México/IT de Mérida, Department of Systems and Computing</institution>
          ,
          <addr-line>Merida</addr-line>
          ,
          <country country="MX">Mexico</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <volume>000</volume>
      <fpage>0</fpage>
      <lpage>0001</lpage>
      <abstract>
        <p>This paper presents CBR-foX (Case-Based Reasoning for forecasting eXplanations): a post-hoc sliding-window method that enables the explanation of forecasting models. It applies the Case-Based Reasoning paradigm to provide explanations-by-example, where time series are split into different time-window cases that serve as explanation cases for the outcome of the prediction model. It has been designed for domain-expert users without ML skills who need to understand how (future) predictions could depend on past time-series windows. The main novelty of this approach is its reusability, as CBR-foX can be applied to any black-box forecasting model based on time series. We propose a novel similarity function that accounts for both the morphological similarity and the absolute proximity between the time series, together with several reuse strategies to generate the explanation cases. We propose an automatic evaluation approach based on computing the error (MAE) between the model prediction for the query and the actual values in the solution of the explanatory case. We then apply this evaluation method to demonstrate the performance of the proposal on the given dataset. Finally, we provide a reusable implementation that can be directly applied to other time-series forecasting models and domains.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Method Description</title>
      <p>
        The goal of this development is to provide a reusable CBR method for the generation of post-hoc
explanations for a given black-box forecasting model. CBR systems are claimed to have a
“natural” transparency as they are based on the reuse of previous experiences or examples.
Therefore, we propose a particular solution for the explanation of the outcomes of the forecasting
model to the experts, where an opaque, black-box ML system is explained by a more interpretable,
white-box CBR system, following the so-called twin-systems approach [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. This approach is
illustrated in Figure 1, where the provided dataset [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] is used as the input of the forecasting
model (in this case an Artificial Neural Network, ANN) and to create the explanatory cases
of the CBR system. Explanatory cases are generated using a sliding-window method over
the whole time series: $C = \langle [t - w, t],\, t + 1 \rangle$, where $w$ is the window size and $t + 1$ is the
solution of the case, which corresponds to the output of the forecasting model for the following
time stamp. Analogously, given a query time stamp $q$, the forecasting model will predict the
time series values for that date: $\mathrm{pred}(q)$. Then, the query for our CBR system will be the time
window $Q = [q - 1 - w,\, q - 1]$. Next, the prediction given by the black-box model for $q$
is explained by means of the explanatory cases most similar to the current time window $Q$,
following the explanation-by-example paradigm illustrated in Figure 1.
      </p>
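      <p>As a rough illustration of the sliding-window step, the sketch below builds explanatory cases and a query window from a multivariate series stored as a NumPy array with one row per time stamp. The function names build_cases and build_query are hypothetical, and the sketch assumes that the case solution is taken from the observed values at the following time stamp; it is not the reference implementation.</p>
      <preformat><![CDATA[
```python
import numpy as np


def build_cases(series, w):
    """Split a (T, F) series into (window, solution) explanatory cases.

    Each window covers the time stamps t - w .. t (inclusive) and the
    solution holds the values at time stamp t + 1 (an assumption here).
    """
    series = np.asarray(series, dtype=float)
    cases = []
    for t in range(w, len(series) - 1):
        window = series[t - w:t + 1]    # rows t - w .. t, inclusive
        solution = series[t + 1]        # values at the next time stamp
        cases.append((window, solution))
    return cases


def build_query(series, q, w):
    """Return the query window covering time stamps q - 1 - w .. q - 1."""
    series = np.asarray(series, dtype=float)
    return series[q - 1 - w:q]


if __name__ == "__main__":
    toy = np.arange(20).reshape(10, 2)  # 10 time stamps, 2 features
    cases = build_cases(toy, w=3)
    print(len(cases), cases[0][0].shape, build_query(toy, q=9, w=3).shape)
```
]]></preformat>
      <p>With this convention every window contains w + 1 time stamps, so cases and queries have the same shape and can be compared directly.</p>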
      <sec id="sec-1-1">
        <title>1.1. Personas</title>
        <p>The approach is mainly addressed to domain experts without ML skills
who need to understand the outcomes of a black-box forecasting model. To this end, the CBR-foX
method provides several explanation cases that illustrate how, in similar past cases (time-series
windows), the forecasting model yields predictions similar to the current ones.</p>
      </sec>
      <sec id="sec-1-2">
        <title>1.2. Explanation Strategy</title>
        <p>Under the CBR assumption that time windows with similar values will present similar outcomes
for the following time stamps, we can reuse previous time windows as cases that explain new
ones in the future, without the use of the forecasting model. To do so, we define a novel
similarity metric for the retrieval.</p>
        <p>Let us define the Combined Correlation Index (CCI), which provides a way to measure how a
given time window case C is related to a target query window Q:</p>
        <p>$$\mathrm{CCI}(C, Q) = \frac{1}{4}\left(\rho(C, Q) - 2\,\lVert (C, Q) \rVert + 3\right), \qquad (1)$$</p>
        <p>where $\rho$ is the function that calculates the Pearson correlation coefficient, and the double
bars represent the normalized Euclidean distance between those vectors. The correlation
component deals with the morphological similarity of the time windows, while the Euclidean
distance component deals with the proximity between the time series in the given time windows.
The remaining constants are used to normalize the range of the equation: with $\rho \in [-1, 1]$ and
a normalized distance in $[0, 1]$, the CCI takes values in $[0, 1]$.</p>
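        <p>The following sketch shows one possible way to compute the CCI for a single time-series feature, reading Eq. (1) as (correlation - 2 x distance + 3) / 4. The text does not state how the Euclidean distance is normalized, so the sketch assumes a rescaling by the largest distance possible for values bounded by lo and hi; the names normalized_euclidean, cci, lo and hi are hypothetical.</p>
        <preformat><![CDATA[
```python
import numpy as np


def normalized_euclidean(c, q, lo, hi):
    """Euclidean distance rescaled to [0, 1] (assumed normalization).

    lo and hi bound the values of the feature, so the largest possible
    distance between two windows of length n is sqrt(n) * (hi - lo).
    """
    c = np.asarray(c, dtype=float)
    q = np.asarray(q, dtype=float)
    max_dist = np.sqrt(len(c)) * (hi - lo)
    return np.linalg.norm(c - q) / max_dist


def cci(c, q, lo, hi):
    """Combined Correlation Index for one feature, following Eq. (1)."""
    rho = np.corrcoef(c, q)[0, 1]            # Pearson correlation
    dist = normalized_euclidean(c, q, lo, hi)
    return (rho - 2.0 * dist + 3.0) / 4.0


if __name__ == "__main__":
    case = [21.0, 22.5, 24.0, 23.0]
    query = [20.5, 22.0, 24.5, 23.5]
    print(cci(case, query, lo=0.0, hi=50.0))
```
]]></preformat>
        <p>Note that the Pearson correlation is undefined for a constant window, so a practical implementation would need to handle that case explicitly.</p>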
        <p>Next, we define the full combined correlation index, FCCI, between two windows $C_f$ and $Q_f$
as the sum of the $\mathrm{CCI}_f$ computed for every time-series feature $f$. However, the calculated FCCI
values exhibit undesirable high-frequency fluctuations, as shown in Figure 2 (top). To solve
this problem, a low-pass filter can be applied over the FCCI time series to smooth the readings.
In our case, we performed a filtering phase consisting of the repeated application of a
simple moving average filter (MAF). After iteratively applying this filter until the resulting
signal no longer changes, we obtain a smoothed time series for the FCCI. The smoothed time series
can be seen in Figure 2.</p>
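        <p>A minimal sketch of this step is given below. It sums a per-feature similarity over the columns of two windows and then applies a moving average repeatedly until successive passes differ by less than a tolerance. The filter width, the tolerance, the iteration cap and the edge padding are assumptions made for illustration, and cci_fn stands for any per-feature similarity function; none of these names come from the original implementation.</p>
        <preformat><![CDATA[
```python
import numpy as np


def fcci(case_window, query_window, cci_fn):
    """Sum a per-feature similarity over the columns of two (w, F) windows."""
    case_window = np.asarray(case_window, dtype=float)
    query_window = np.asarray(query_window, dtype=float)
    n_features = case_window.shape[1]
    return sum(cci_fn(case_window[:, f], query_window[:, f])
               for f in range(n_features))


def moving_average(signal, width):
    """One pass of a simple moving average; output keeps the input length."""
    if width % 2 == 0:
        raise ValueError("width must be odd to preserve the signal length")
    padded = np.pad(signal, width // 2, mode="edge")  # repeat edge values
    kernel = np.ones(width) / width
    return np.convolve(padded, kernel, mode="valid")


def smooth_fcci(values, width=5, tol=1e-3, max_iter=100):
    """Apply the moving average repeatedly until the signal stabilizes."""
    current = np.asarray(values, dtype=float)
    for _ in range(max_iter):
        updated = moving_average(current, width)
        if np.max(np.abs(updated - current)) < tol:
            return updated
        current = updated
    return current


if __name__ == "__main__":
    noisy = np.array([0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0])
    print(smooth_fcci(noisy, width=3))
```
]]></preformat>
        <p>The tolerance controls how aggressively the signal is flattened: a very small value keeps filtering until most of the local structure is removed, so it should be chosen with the expected peak width in mind.</p>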
        <p>The highest FCCI values are used to retrieve the $k$ most similar explanation cases to the
time-window query $Q$. Next, we provide several reuse strategies to combine the time series of
these nearest neighbours: average, max, min, etc. Our method can then be configured to present
each original k-NN and/or the combined time series obtained during case reuse. These are
the explanation cases that are presented to the user in order to explain the prediction given by
the forecasting model. An example is presented on the left-hand side of Figure 3 using the 1-NN.
For visual evaluation purposes, we also show the least similar case on the right-hand side of the
same figure.</p>
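        <p>The retrieval and reuse steps could be sketched as follows. The helper names retrieve_top_k and reuse, and the REUSE table, are hypothetical. The sketch simply takes the k largest similarity values, so neighbouring (overlapping) windows may be selected together; a peak-based selection over the smoothed signal would avoid that.</p>
        <preformat><![CDATA[
```python
import numpy as np

# Reuse strategies named in the text, mapped to NumPy reductions.
REUSE = {"average": np.mean, "max": np.max, "min": np.min}


def retrieve_top_k(similarity, k):
    """Indices of the k highest similarity values, most similar first."""
    order = np.argsort(similarity)[::-1]
    return order[:k]


def reuse(windows, strategy="average"):
    """Combine k retrieved (w, F) windows into a single (w, F) window."""
    stacked = np.asarray(windows, dtype=float)   # shape (k, w, F)
    return REUSE[strategy](stacked, axis=0)


if __name__ == "__main__":
    similarity = [0.1, 0.9, 0.5, 0.7]
    print(retrieve_top_k(similarity, k=2))
    print(reuse([[[1.0, 2.0]], [[3.0, 6.0]]], strategy="average"))
```
]]></preformat>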
      </sec>
      <sec id="sec-1-3">
        <title>1.3. Evaluation method and performance</title>
        <p>Even though a visual inspection of the plots of the windows shows an evident difference
between the highest and the lowest FCCIs, we propose an automated quantitative
approach to measure this difference. When a window is retrieved from any of the absolute
extrema (maximum or minimum) of the FCCI signal, we calculate the mean absolute
error (MAE) between the forecasting model prediction for the query time stamp $q$ and the actual
values in the solution of the case, $R$, which contains the readings of the time stamp represented by
the explanation case:</p>
        <p>$$\mathrm{MAE} = \frac{1}{|TS|} \sum_{i \in \{TS\}} \left| \mathrm{pred}(q)[i] - R[i] \right|, \qquad (2)$$</p>
        <p>where $\mathrm{pred}(q)$ is a vector containing the outputs of the forecasting model at the query time stamp $q$ for each variable $i$
represented by the time series TS, and $R$ is another vector containing the actual values of a given
time window for each time-series feature obtained from the explanation case. We calculated
the MAEs of the top $k$ highest and lowest FCCI values, respectively, for the given dataset, which
are shown in Table 1.</p>
        <p>Figure 3 panels: most similar explanation case (left); least similar explanation case (right).</p>
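        <p>For reference, Eq. (2) amounts to averaging the absolute differences between the model output and the case solution over the time-series features. The short sketch below assumes both are given as vectors with one entry per feature; the function name explanation_mae is hypothetical.</p>
        <preformat><![CDATA[
```python
import numpy as np


def explanation_mae(pred_q, r):
    """Mean absolute error between model outputs and a case solution.

    pred_q: model prediction for the query time stamp, one value per feature.
    r: actual values stored in the solution of the explanation case.
    """
    pred_q = np.asarray(pred_q, dtype=float)
    r = np.asarray(r, dtype=float)
    return float(np.mean(np.abs(pred_q - r)))


if __name__ == "__main__":
    print(explanation_mae([1.0, 2.0, 3.0], [2.0, 2.0, 5.0]))
```
]]></preformat>
        <p>If the features have very different scales, the MAE will be dominated by the largest one, so rescaling the features beforehand may be advisable.</p>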
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2. Benefits and Impact</title>
      <p>We propose a novel post-hoc explanation system that follows an explanation-by-example
approach implemented through Case-Based Reasoning (CBR). The CBR-foX explanation method
splits time series into different time-window cases that serve as explanation cases for the
outcome of the forecasting model. It has been designed for domain-expert users without ML
skills who need to understand how (future) predictions could depend on past time-series
windows.</p>
      <p>This explanation method is completely reusable and proposes a novel similarity function
named Combined Correlation Index (CCI), which deals with both the morphological similarity
and the absolute proximity between the time series. We also provide several reuse approaches
(max, min, average, etc.) that can be configured to generate explanation cases from the retrieved
k-NNs.</p>
      <preformat># Load required external libraries
loadImports()
# Load dataset
data = loadData()
# Load (or train) forecasting model
model = loadForecastingModel()
# Configuration parameters (with default values)
config = configExplanationParameters()
# Main explanation method
explain(data, model, config)</preformat>
    </sec>
    <sec id="sec-3">
      <title>3. Reusability and source</title>
      <p>The main novelty of this approach is its reusability, as CBR-foX can be applied to any black-box
forecasting model based on time series. The source code allows all the required
parameters to be configured, such as the input variables or the time-window length. It has been designed to support
reuse and integration into explanation libraries or APIs. As illustrated in Listing 4, it
isolates the domain-dependent data and forecasting model, then provides a configuration
method with default values and, finally, encapsulates the explanation process itself in a single
executable method.</p>
      <p>GitHub (full source code):
Colab (online execution):</p>
      <p>Jupyter Notebook (pdf):</p>
      <p>Full execution video</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M. T.</given-names>
            <surname>Keane</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. M.</given-names>
            <surname>Kenny</surname>
          </string-name>
          ,
          <article-title>How case-based reasoning explains neural networks: A theoretical analysis of XAI using post-hoc explanation-by-example from a survey of ANN-CBR twin-systems</article-title>
          ,
          <source>in: Case-Based Reasoning Research and Development: 27th International Conference, ICCBR 2019</source>
          , Otzenhausen, Germany, September 8-12, 2019, Springer-Verlag,
          <year>2019</year>
          , pp.
          <fpage>155</fpage>
          -
          <lpage>171</lpage>
          . doi:10.1007/978-3-030-29249-2_11.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Orozco-del Castillo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Valdiviezo-N</surname>
          </string-name>
          , J. H.,
          <string-name>
            <given-names>S.</given-names>
            <surname>Navarro</surname>
          </string-name>
          ,
          <article-title>Urban expansion and its impact on meteorological variables of the city of Merida, Mexico</article-title>
          ,
          <year>2021</year>
          . doi:10.13140/RG.2.2.17652.48003.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>