<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Finding relevant multivariate models for multi-plant photovoltaic energy forecasting</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Youssef Hmamouche</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Piotr Przymus</string-name>
          <email>ypiotr@przymus.org</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lotfi Lakhal</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alain Casali</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>LIF - CNRS UMR 7279, Aix Marseille University</institution>
          ,
          <addr-line>Marseille</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Forecasting photovoltaic energy power is useful for optimizing and controlling the system. It aims to predict the power production based on internal and external variables. This problem is closely related to multiple time series forecasting: in the presence of multiple predictor variables, not all of them contribute equally to the prediction. The goal is, given a set of predictors, to find the subset(s) leading to the most accurate forecast. In this work, we present a feature selection and model matching framework. The idea is to find, for a given variable, the optimal combination of a forecasting model with the most relevant features. We use a variety of causality-based selection approaches and dimension reduction techniques. The experiments are conducted on real data, and the results support the usefulness of the proposed approach.</p>
      </abstract>
      <kwd-group>
        <kwd>Time Series</kwd>
        <kwd>Prediction</kwd>
        <kwd>Data Mining</kwd>
        <kwd>Ensemble Selection</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Time series forecasting is an important tool aiming to predict the evolution
of time series over time based on their existing history. It has many
applications, for example in finance, neuroscience, and industrial optimization, and the field
is considered an essential part of business intelligence systems. It delivers
crucial information that can improve decision making processes by
anticipating system behavior, e.g., energy consumption or production. Forecasting
photovoltaic (PV) energy production has gained attention with the growing
interest in using PV as a source of renewable energy. Forecasting the production
of such systems has a direct impact on trading and controlling the energy used.</p>
      <p>
        In general, the PV energy can be measured as time series variables that change
according to the system state and external conditions, such as the temperature
and the weather. The simplest approach would be to use a univariate
forecasting model for the power generation time series. Several models can be used
in this context, for example auto-regressive models such as AR or ARIMA [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
However, this option has a drawback: it does not include crucial
information provided by other variables. In this case, it is worth exploiting the extra
information from other variables using multivariate models. One approach would
be to use all available variables, but this (i) incorporates some irrelevant variables
and thus decreases the forecast accuracy [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], and (ii) uses too much memory. Such a
situation can be improved by extracting only the most relevant variables. This
raises some interesting challenges for multivariate time series forecasting. The
organization of the paper is as follows. In Section 2, we present and
discuss some works related to the addressed problem. In Section 3, we detail
the proposed method. In Section 4, we describe the forecasting process and the
methodology used to perform the experiments. In Section 5, we show and discuss
the results. In the last section, we summarize our approach.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>
        In the literature, many approaches have been proposed to handle the problem of
forecasting PV energy production. In [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], the paper deals with multi-plant PV
energy production forecasting. A comparison between artificial neural networks,
regression trees, and spatio-temporal auto-correlation based methods was
conducted. The authors show that regression trees provide better results than
artificial neural networks (ANNs). In [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], ANNs are used to forecast PV energy
production, taking advantage of their ability to learn changes. To improve
the forecasts, multiple predictor variables that may influence the energy
production were used, based on internal and external factors. The same problem
was investigated in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. A hybrid approach was used, adding basic physical
constraints of the PV plant to the input of an ANN. The results show an
improvement in prediction accuracy compared to the model without those constraints.
More works on photovoltaic power forecasting approaches can be found in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>We argue that the problem of PV energy forecasting can be modelled as
multivariate time series prediction. In the following, we reformulate this problem
and discuss the main approaches used to address it. Consider a set of predictor
time series $X = [x_1, \ldots, x_k]$ and a target variable $y$, with $n$ observations.</p>
      <p>
        There are multiple strategies to predict $y$ using $X$. One way consists in using
models that exploit the preceding values of $y$ and $X$, e.g., vector
autoregressive models [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. In this work, we focus on prediction models that predict
$y$ at time $t$ based on the values of the variables of $X$ at the same time $t$. Therefore, the
general model can be expressed as follows: $y(t) = f(X_1(t), \ldots, X_k(t)) + \epsilon(t)$.
      </p>
      <p>Linear models suppose that $y$ can be expressed as a linear combination of $X$,
i.e., $y(t) = \beta_0 + \sum_{i=1}^{k} \beta_i X_i(t) + \epsilon(t)$, where $\epsilon(t)$ is the error term and
$\beta = [\beta_0, \beta_1, \ldots, \beta_k]'$ is the parameter vector of the model. The estimation of these
parameters can be performed via different methods. The most common one is
the least squares technique, which consists in minimizing the sum of squared
errors; the resolution is performed through straightforward derivation.</p>
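      <p>As a concrete illustration, the following is a minimal sketch of such a least squares fit in Python, on synthetic data invented for the example (numpy's lstsq solves the minimization directly):</p>
      <preformat>
import numpy as np

# Synthetic data, purely illustrative: n observations of k predictors.
rng = np.random.default_rng(0)
n, k = 200, 4
X_mat = rng.normal(size=(n, k))
beta_true = np.array([1.5, -2.0, 0.0, 0.5])
y = 3.0 + X_mat @ beta_true + rng.normal(scale=0.1, size=n)

# Prepend a column of ones for the intercept beta_0, then solve
# the least squares problem min ||y - A beta||^2.
A = np.column_stack([np.ones(n), X_mat])
beta_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
print(beta_hat)  # approximately [3.0, 1.5, -2.0, 0.0, 0.5]
      </preformat>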
      <p>
        Shrinkage methods aim to minimize the impact of irrelevant variables by
setting their coefficients close to zero. These techniques are practical when the number
of predictors is large and the classical resolution is not possible due to
matrix operation constraints. For instance, the Ridge regression method proposed
in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] minimizes the term $\sum_{t=1}^{n} \big(y(t) - \beta_0 - \sum_{i=1}^{k} \beta_i X_i(t)\big)^2 + \lambda \sum_{j=1}^{k} \beta_j^2$, where
$\lambda \sum_{j=1}^{k} \beta_j^2$ is the shrinkage penalty. This mechanism results in shrinking the
estimated coefficients towards zero. The Least Absolute Shrinkage and Selection
Operator (Lasso) method is similar to Ridge regression, but it uses $\lambda \sum_{j=1}^{k} |\beta_j|$
as the shrinkage penalty term, in order to force the coefficients of unimportant
variables to be exactly zero.
      </p>
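      <p>A minimal sketch of the two penalties using scikit-learn, where the hyper-parameter alpha plays the role of the shrinkage weight (the data below are synthetic and chosen only to show that Lasso zeroes out the irrelevant coefficients):</p>
      <preformat>
import numpy as np
from sklearn.linear_model import Ridge, Lasso

# Synthetic data: only the first two of ten predictors matter.
rng = np.random.default_rng(1)
X_mat = rng.normal(size=(200, 10))
y = X_mat[:, 0] - 2 * X_mat[:, 1] + rng.normal(scale=0.1, size=200)

ridge = Ridge(alpha=1.0).fit(X_mat, y)  # L2 penalty: shrinks towards zero
lasso = Lasso(alpha=0.1).fit(X_mat, y)  # L1 penalty: exact zeros
print(ridge.coef_.round(2))  # small but non-zero irrelevant coefficients
print(lasso.coef_.round(2))  # irrelevant coefficients forced to zero
      </preformat>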
      <p>
        ANNs generally use a non-linear function (a network of nodes, where each
node passes the signal on using a weight and, eventually, an activation function). They
are characterized by the ability to model dynamic dependencies between
variables and to learn from the preceding information passed through the
network. By considering the prediction training step as a supervised problem, the
main algorithms used to calibrate the coefficients of the network are based on
back-propagation of the errors using, for instance, gradient descent or stochastic
gradient descent algorithms [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
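      <p>As an illustration, a sketch of such a network with scikit-learn's MLPRegressor, configured as in Section 4 (one hidden layer, stochastic gradient descent); the data and hyper-parameters are hypothetical:</p>
      <preformat>
import numpy as np
from sklearn.neural_network import MLPRegressor

# Synthetic non-linear target, purely illustrative.
rng = np.random.default_rng(2)
X_mat = rng.normal(size=(500, 5))
y = np.sin(X_mat[:, 0]) + 0.5 * X_mat[:, 1] ** 2 + rng.normal(scale=0.05, size=500)

mlp = MLPRegressor(hidden_layer_sizes=(20,),  # one hidden layer
                   solver="sgd",              # stochastic gradient descent
                   learning_rate_init=0.01,
                   max_iter=2000)
mlp.fit(X_mat, y)                             # back-propagation training
print(mlp.score(X_mat, y))                    # R^2 on the training data
      </preformat>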
      <p>
        To handle the problem of selecting the most important predictors in a
multivariate prediction model, different approaches based on dimension reduction
and feature selection techniques have been proposed in the literature. In [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], a
comparison of five dimensionality reduction and feature selection methods (t-test
and correlation based ranking techniques, step-wise regression,
principal component analysis, and factor analysis) is performed as a pre-processing
step to improve forecast accuracy. In [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], the authors combine multiple
dimension reduction methods based on Principal Component Analysis (PCA),
Genetic Algorithms (GA), and decision trees (CART) to improve on
multivariate prediction models built with all existing variables. In [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], a feature selection
algorithm based on causality is proposed for stock prediction modeling. To avoid
the main problem of correlation, namely that it cannot distinguish direct influences from
indirect ones, the authors select variables based on causality. This method was
compared with PCA, decision trees, and Lasso. In [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], an overview of
methods that use principal components approaches for regression is given, and a
sufficient dimension reduction method for regression with many predictors is proposed.
      </p>
    </sec>
    <sec id="sec-3">
      <title>The Proposed Feature Selection Method</title>
      <p>In this section, we expose our proposed method. Let us consider a target variable
y and a set of predictors P. The goal is to extract the relevant variables from P,
i.e., a subset of P, based on the notion of causality, which will then be used in a model
to forecast y. Our approach consists of three steps. First, we compute the graph
of causalities; then, we reduce it by eliminating dependencies using a simple
transitive reduction technique; finally, we rank the variables with regard to their
causality on the target variable.</p>
      <p>
        To compute causality, we use two measures: (i) Granger causality [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]
and (ii) Transfer entropy [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. Both are characterized by the property of
modeling non-symmetric relationships between variables. In other words, they
detect which variable has a direct impact on the other one.
      </p>
      <p>
        Let us consider two univariate time series $x_t$ and $y_t$. Granger causality
assumes that $x_t$ causes $y_t$ if it contains helpful information for predicting $y_t$. The
associated test estimates causality using the vector auto-regressive model. Two
models are computed: one using just the values of the target variable, and a
second using both the target and the predictor variables. Then, the difference
between those two models is evaluated using the F-test. On the other hand, Transfer
Entropy follows a similar idea of evaluating the behavior of the target variable
using itself and the predictor variable, but it is based on information theory. Let
us underline that Granger causality is based on a prediction model while Transfer
entropy is based on information theory. It has been shown in [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] that they are
equivalent only for variables following a normal distribution.
      </p>
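      <p>A minimal sketch of the Granger test with statsmodels, on synthetic series where $x_t$ leads $y_t$ by one step (the column order convention and the F-test extraction below follow the statsmodels API):</p>
      <preformat>
import numpy as np
from statsmodels.tsa.stattools import grangercausalitytests

# Synthetic series: y lags x by one step, so x should cause y.
rng = np.random.default_rng(3)
x = rng.normal(size=300)
y = np.roll(x, 1) + rng.normal(scale=0.1, size=300)

# Column order matters: the test checks whether the SECOND column
# Granger-causes the FIRST one.
data = np.column_stack([y, x])
results = grangercausalitytests(data, maxlag=2)
f_stat, p_value = results[1][0]["ssr_ftest"][:2]
print(f_stat, p_value)  # a small p-value indicates x Granger-causes y
      </preformat>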
      <p>The goal of the proposed method is simple: extracting variables by ranking
them according to their causality. However, selecting them directly based on such a
non-symmetric measure leads to the problem of dependencies between variables.
In other words, it is possible to select a set of variables in which each one causes the
others, or in which variables are even duplicated (they could contain the same information
used to predict the target). Hence, a diversification can improve the selection
task. In this case, applying the transitive reduction algorithm seems natural as
a pre-processing step. We summarize our method in Algorithm 1. A short version is
provided, where we suppose that the causality graph is an input of the algorithm.
The following notation is adopted: x → y expresses the fact that x causes y,
and causality(x → y) is the value of this causality.</p>
      <p>Algorithm 1: Transitive Reduction on Causality Graph (TRCG)
Input: The causality graph G, the target variable y, the reduction size k
Output: S: set of predictor variables of y.</p>
      <p>/* Eliminating dependencies with regard to the target variable */
1: for all nodes ts1 ∈ G.nodes \ {y} do
2:   for all nodes ts2 ∈ G.nodes \ {ts1, y} do
3:     if ts1 → ts2, ts2 → y and ts1 → y then
4:       Remove edge between ts1 and y</p>
      <p>/* Selecting the top k variables (nodes of G) that cause y */
5: P = {ts ∈ G.nodes, ts → y}
6: Ps = P.sort(key=lambda x: causality(x → y))
7: S = topk(Ps)
8: return S</p>
    </sec>
    <sec id="sec-4">
      <title>Methodology</title>
      <p>The data sets used in the experiments are hourly multiple time series (from hour 2 to
20 each day), representing 3 PV plants and spanning a period of 12 months (year
2012). The goal is to predict 3 months of the production variable, from January
to March 2013 (where the values of the target variables are not known), based on
internal factors (temperature and irradiance) and external factors (cloudcover,
dewpoint, humidity, pressure, temperature, windbearing, windspeed). The data
are organized in a way to predict each hour separately, i.e., for each plant, we
have 19 target variables to predict.</p>
      <p>[Figure: the forecasting pipeline. Data set → feature selection on causality
graphs / dimension reduction → regression models, shrinkage methods, regression
trees, ANNs → select the best {method, model} for all target variables → predict
all target variables and resample results.]</p>
      <p>The methodology adopted is based on model selection. First, a benchmark
experiment is performed on the training data (year 2012) using cross-validation with
8 experiments, predicting 3 months in each experiment. We execute all the
models on the subsets generated by all the methods. Then, we select for each
target variable a pair {method, model} that will be used in the testing step.</p>
      <p>
        In the reduction step, we use two existing methods, Random Walk with
Restart on Granger causality graphs (GRWR) and on Transfer entropy graphs
(TRWR) [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ], and the PCA method. Two versions of the method proposed in Algorithm 1,
TTRCG and GTRCG, use transfer entropy and Granger causality, respectively, as
the causality measure. The forecasting models used can be classified into four main
types (a minimal sketch of the {method, model} matching loop follows the list):
– Regression models: Linear Regression, RANSAC Regressor (RR), Orthogonal
Matching Pursuit (OMP), Theil-Sen Regressor (TSR), Huber Regressor
(HB).
– Regression models with shrinkage representation: Ridge, Bayesian Ridge,
SVM, Lasso.
– Decision trees: Decision Tree Regressor (DTR), Gradient Boosting Regressor
(GBR).
– ANNs: a simple multilayer perceptron neural network (MLP), using one
hidden layer and a stochastic gradient descent algorithm to update the
parameters of the network.
      </p>
    </sec>
    <sec id="sec-5">
      <title>Results and Discussions</title>
      <p>In this section, we present the obtained results and discuss them. As
described in the previous section, we used 3 heuristics (PCA, RWR, and TRCG)
in the training step. In the testing step, we also used a brute-force feature selection
approach that computes all the possible subsets, for a small number of the fastest
prediction models. This allowed us to improve a few of the models that were
previously pre-selected using the heuristic approaches. We obtained RMSE = 0.177 for
10% of the testing data and 0.253 for all testing data. In the following, we present
the results of the ensemble selection approach obtained in the training step, i.e., with
the heuristic methods. We focus on the results with heuristic methods, as they can
be applied to large-scale data sets.</p>
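      <p>A sketch of the brute-force selection mentioned above: for a small number of predictors, every non-empty subset is scored with a fast model and the best one is kept (the helper below is hypothetical, and exponential in the number of predictors):</p>
      <preformat>
from itertools import combinations

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

def brute_force_select(X, y, cv=8):
    """Exhaustively score every non-empty predictor subset by RMSE."""
    k = X.shape[1]
    best_cols, best_rmse = None, np.inf
    for r in range(1, k + 1):
        for cols in combinations(range(k), r):
            scores = cross_val_score(
                LinearRegression(), X[:, list(cols)], y, cv=cv,
                scoring="neg_root_mean_squared_error")
            rmse = -scores.mean()
            if best_rmse > rmse:
                best_cols, best_rmse = cols, rmse
    return best_cols, best_rmse
      </preformat>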
      <p>[Figure: RMSE per hour (hours 2-20) for each of the three PV plants.]</p>
    </sec>
    <sec id="sec-6">
      <title>Conclusion</title>
      <p>In this paper, we investigated the multi-plant PV energy forecasting task. We
presented a feature selection and model matching framework. The idea is that,
for a given variable, we can use heuristics to find the optimal combination of a
forecasting model with the most relevant features. Our matching approach is a
two-step process: (i) we use an algorithm that picks an optimal subset of features (or
combines the features), and (ii) we evaluate the selection on various prediction
models, such as regression, decision tree, or artificial neural network models. Finally,
we select the models that perform best. The second contribution is a new feature
selection algorithm, which applies the transitive reduction algorithm to the graph
of causalities. The results show the utility of using different feature selection
methods and prediction models. However, the forecast accuracy analysis using
relative mean squared errors shows some difficulty in giving good predictions within
a decent time, especially when the energy production is low, which decreases the
overall performance.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Box</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Box and Jenkins: Time Series Analysis, Forecasting and Control</article-title>
          .
          <source>In: A Very British Affair. Palgrave Advanced Texts in Econometrics. Palgrave Macmillan UK</source>
          (
          <year>2013</year>
          )
          <fpage>161</fpage>
          -
          <lpage>215</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Stock</surname>
            ,
            <given-names>J.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Watson</surname>
            ,
            <given-names>M.W.</given-names>
          </string-name>
          :
          <article-title>Chapter 10 Forecasting with Many Predictors</article-title>
          . In: Elliott, G., Granger, C.W.J., Timmermann, A., eds.:
          <source>Handbook of Economic Forecasting</source>
          . Volume
          <volume>1</volume>
          .
          <publisher-name>Elsevier</publisher-name>
          (
          <year>2006</year>
          )
          <fpage>515</fpage>
          -
          <lpage>554</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Ceci</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Corizzo</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fumarola</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Malerba</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rashkovska</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Predictive Modeling of PV Energy Production: How to Set Up the Learning Task for a Better Prediction?</article-title>
          <source>IEEE Transactions on Industrial Informatics</source>
          <volume>13</volume>
          (
          <issue>3</issue>
          ) (
          <year>June 2017</year>
          )
          <fpage>956</fpage>
          -
          <lpage>966</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Dumitru</surname>
            ,
            <given-names>C.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gligor</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Enachescu</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Solar Photovoltaic Energy Production Forecast Using Neural Networks</article-title>
          .
          <source>Procedia Technology</source>
          <volume>22</volume>
          (
          <year>January 2016</year>
          )
          <fpage>808</fpage>
          -
          <lpage>815</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Gandelli</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grimaccia</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Leva</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mussetta</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ogliari</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Hybrid model analysis and validation for PV energy production forecasting</article-title>
          .
          <source>In: 2014 International Joint Conference on Neural Networks (IJCNN)</source>
          .
          (
          <year>July 2014</year>
          )
          <fpage>1957</fpage>
          -
          <lpage>1962</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Antonanzas</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Osorio</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Escobar</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Urraca</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Martinez-de Pison</surname>
            ,
            <given-names>F.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Antonanzas-Torres</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Review of photovoltaic power forecasting</article-title>
          .
          <source>Solar Energy</source>
          <volume>136</volume>
          (
          <year>October 2016</year>
          )
          <fpage>78</fpage>
          -
          <lpage>111</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Johansen</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Estimation and Hypothesis Testing of Cointegration Vectors in Gaussian Vector Autoregressive Models</article-title>
          .
          <source>Econometrica</source>
          <volume>59</volume>
          (
          <issue>6</issue>
          ) (
          <year>1991</year>
          )
          <fpage>1551</fpage>
          -
          <lpage>1580</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Hoerl</surname>
            ,
            <given-names>A.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kennard</surname>
            ,
            <given-names>R.W.</given-names>
          </string-name>
          :
          <article-title>Ridge Regression: Biased Estimation for Nonorthogonal Problems</article-title>
          .
          <source>Technometrics</source>
          <volume>12</volume>
          (
          <issue>1</issue>
          ) (
          <year>1970</year>
          )
          <fpage>55</fpage>
          -
          <lpage>67</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Solving Large Scale Linear Prediction Problems Using Stochastic Gradient Descent Algorithms</article-title>
          .
          <source>In: Proceedings of the Twenty-First International Conference on Machine Learning. ICML '04</source>
          , New York, NY, USA, ACM (
          <year>2004</year>
          )
          <fpage>116</fpage>
          -
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Bottou</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Stochastic Gradient Descent Tricks</article-title>
          .
          <source>In: Neural Networks: Tricks of the Trade. Lecture Notes in Computer Science</source>
          . Springer, Berlin, Heidelberg (
          <year>2012</year>
          )
          <fpage>421</fpage>
          -
          <lpage>436</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Tsai</surname>
            ,
            <given-names>C.F.</given-names>
          </string-name>
          :
          <article-title>Feature selection in bankruptcy prediction</article-title>
          .
          <source>Knowledge-Based Systems</source>
          <volume>22</volume>
          (
          <issue>2</issue>
          ) (
          <year>March 2009</year>
          )
          <fpage>120</fpage>
          -
          <lpage>127</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Tsai</surname>
            ,
            <given-names>C.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hsiao</surname>
            ,
            <given-names>Y.C.</given-names>
          </string-name>
          :
          <article-title>Combining multiple feature selection methods for stock prediction: Union, intersection, and multi-intersection approaches</article-title>
          .
          <source>Decision Support Systems</source>
          <volume>50</volume>
          (
          <issue>1</issue>
          ) (
          <year>December 2010</year>
          )
          <fpage>258</fpage>
          -
          <lpage>269</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hu</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xie</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ngai</surname>
            ,
            <given-names>E.W.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>A causal feature selection algorithm for stock prediction modeling</article-title>
          .
          <source>Neurocomputing</source>
          <volume>142</volume>
          (
          <year>October 2014</year>
          )
          <fpage>48</fpage>
          -
          <lpage>59</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Adragni</surname>
            ,
            <given-names>K.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cook</surname>
            ,
            <given-names>R.D.</given-names>
          </string-name>
          :
          <article-title>Sufficient dimension reduction and prediction in regression</article-title>
          .
          <source>Philosophical Transactions of the Royal Society of London A: Mathematical, Physical and Engineering Sciences</source>
          <volume>367</volume>
          (
          <issue>1906</issue>
          ) (
          <year>November 2009</year>
          )
          <fpage>4385</fpage>
          -
          <lpage>4405</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Granger</surname>
            ,
            <given-names>C.W.J.</given-names>
          </string-name>
          :
          <article-title>Testing for causality</article-title>
          .
          <source>Journal of Economic Dynamics and Control</source>
          <volume>2</volume>
          (
          <year>January 1980</year>
          )
          <fpage>329</fpage>
          -
          <lpage>352</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Schreiber</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Measuring Information Transfer</article-title>
          .
          <source>Physical Review Letters</source>
          <volume>85</volume>
          (
          <issue>2</issue>
          )
          (
          <year>July 2000</year>
          )
          <fpage>461</fpage>
          -
          <lpage>464</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Barnett</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barrett</surname>
            ,
            <given-names>A.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Seth</surname>
            ,
            <given-names>A.K.</given-names>
          </string-name>
          :
          <article-title>Granger causality and transfer entropy are equivalent for Gaussian variables</article-title>
          .
          <source>Physical Review Letters</source>
          <volume>103</volume>
          (
          <issue>23</issue>
          ) (
          <year>December 2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Przymus</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hmamouche</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Casali</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lakhal</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Improving multivariate time series forecasting with random walks with restarts on causality graphs</article-title>
          .
          <source>In: ICDM Workshops</source>
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>