<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Nowcasting of the energy production of wind power plants through spatially-aware model trees (Discussion Paper)</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Annunziata D'Aversa</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gianvito Pio</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Data Science Lab, National Interuniversity Consortium for Informatics (CINI)</institution>
          ,
          <addr-line>Via Volturno, 58, 00185 Roma</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Dept. of Computer Science, University of Bari "Aldo Moro"</institution>
          ,
          <addr-line>Via E. Orabona, 4, 70125 Bari</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The accurate prediction of the energy production of renewable power plants over short-term intervals is of paramount importance in smart grids, to ensure an efficient distribution of energy within the network. Existing predictive approaches are mainly based on autoregressive models, machine learning methods and, more recently, on neural network architectures that also exploit spatio-temporal information. However, most of them are not able to capture spatial information at different degrees of locality, and tend to impose the presence of linear (or non-linear) dependencies among data. In this paper, we discuss a novel approach based on linear model trees, which can simultaneously model linear and non-linear dependencies, properly extended to capture the spatial dimension at different degrees of locality. The proposed approach works in the multi-step predictive setting, meaning that it can simultaneously provide predictions for multiple future time intervals. Our experiments on a real dataset about the energy produced by wind power plants demonstrate the effectiveness of our method, also in comparison with state-of-the-art neural network architectures.</p>
      </abstract>
      <kwd-group>
        <kwd>Time series nowcasting</kwd>
        <kwd>Spatio-temporal autocorrelation</kwd>
        <kwd>Multi-step prediction</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Smart grids are networks that distribute electricity with the support of sensors, advanced
communication technologies, and predictive components. Among the latter, models able to
forecast energy consumption and production play a fundamental role. Indeed, in long-term
scenarios, they can support planning interventions on the network, aiming not only to
decrease production costs but also to contribute to the reduction of greenhouse gas emissions.
On the other hand, in short-term scenarios, the forecasting (usually called nowcasting in the
case of very short-term timeframes) of energy production and consumption can be useful for
performing real-time load balancing actions, which may include powering on backup plants or
drawing energy from customers’ accumulators.</p>
      <p>In general, predictive models can be built by relying on machine learning methods that
exploit historical data and the spatial information of nodes. Indeed, the spatial dimension may
introduce spatial autocorrelation phenomena, i.e., dependencies that may exist among
observations at nearby geographical locations. In this context, the spatial proximity among power
plants or among customers can influence measurements due to similar climatic conditions.</p>
      <p>Another important aspect is that real-world time series coming from sensor measurements
often exhibit a combination of linear and non-linear trends. This is very common when
measurements depend on weather conditions, which may easily show non-linear phenomena, e.g.,
due to storms or other extreme events. Non-linear phenomena may also emerge in the
case of power grid failures. Therefore, capturing both linear and non-linear trends and
relationships, along with the exploitation of historical data and spatial information, could improve the
model performance and lead to more accurate predictions.</p>
      <p>
        In the literature, several nowcasting approaches have been proposed, leveraging autoregressive models [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ], machine learning models [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ] and hybrid models [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. However, only a few works in the literature also take into account the spatial dimension
[
        <xref ref-type="bibr" rid="ref6 ref7 ref8 ref9">6, 7, 8, 9</xref>
        ]. For instance, in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] the authors propose a method for 5-minute-ahead wind power
forecasting. They capture spatio-temporal dependencies using a method based on sparse
parametrization of VAR models, which selects the coefficients that link sites with a spatial
co-dependence, discarding those exhibiting weak dependencies. Another relevant example is [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ],
where the authors proposed a spatio-temporal graph convolutional neural network for the
short-term prediction of the energy produced by wind power plants. The authors consider a multi-step
setting, where 16 future values (at 15-minute intervals) are predicted simultaneously.
      </p>
      <p>
        The contribution of the temporal and spatial dimensions has also been considered in the
context of more classical forecasting scenarios, to predict the hourly energy production of
photovoltaic power plants 24 hours ahead [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], or to predict the monthly energy consumption
of customers one year ahead [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. These works also consider a multi-step setting, where the 24
hourly predictions (in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]) and the 12 monthly predictions (in [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]) are returned simultaneously
by the model, possibly exploiting dependencies among them. The spatial dimension is considered
by resorting to two well-known techniques in spatial statistics: the Local Indicator of Spatial
Association (LISA), which represents a local measure of spatial autocorrelation [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], and the
Principal Coordinates of Neighbour Matrices (PCNM), which represent the spatial structure in
the data [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Such indicators are used to augment the feature space of the training instances.
      </p>
      <p>
        Recently, several neural network architectures that consider both the temporal and the spatial
dimension have been proposed, although applied in different application domains. A relevant
example is MTGNN [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], a graph convolutional network applied to multiple domains,
including energy and traffic speed forecasting. MTGNN employs multiple temporal convolutional
networks (TCNs) with various kernel sizes, to learn temporal dependencies at different
scales, and a self-adaptive adjacency matrix to capture spatial correlations.
      </p>
      <p>
        It is noteworthy that, although some of the mentioned approaches are able to represent and
exploit the spatial information, they cannot capture spatial dependencies at different degrees
of locality. A first attempt to capture local spatial information can be found in [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], where the
authors proposed the method D2STGNN, applied to traffic speed forecasting. D2STGNN
identifies both diffusion signals, representing how traffic conditions spread through the network,
and inherent patterns, such as recurring traffic patterns or daily/seasonal variations. The model
adopts a spatio-temporal localized convolution to capture hidden diffusion time series, while a
combination of a GRU (for short-term dependencies) and a multi-head self-attention mechanism
(for long-term dependencies) is employed to model hidden inherent time series.
      </p>
      <p>In this paper, we discuss an approach to solve nowcasting tasks in the context of the prediction
of the energy produced by wind power plants, in a multi-step setting. Specifically, we aim at
learning a nowcasting model capable of predicting the energy production for 12 time-steps, at a
15-minute granularity. Methodologically, contrary to most existing approaches, we capture
both linear and non-linear phenomena through linear model trees. Moreover, we extend them
to effectively capture and model the spatial information at different levels of locality.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Spatially-aware linear model trees</title>
      <p>
        As introduced in Section 1, we aim at adopting an approach that is able to capture both linear and
non-linear dependencies. In this respect, we argue that linear model trees [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] can represent a
possible solution, since they combine the ability to model non-linear dependencies of regression
trees with that of linear models. Existing methods for the construction of model trees employ a
learning process characterized by a top-down induction procedure that recursively partitions
the training set, which is analogous to that adopted by conventional tree-based algorithms.
      </p>
      <p>In linear model trees, leaf nodes contain linear models instead of the constant
approximations of classical regression trees. More formally, given a set of independent variables X and a
dependent variable y, a standard regression tree returns, for each leaf node l, a constant value
c_l, namely, ŷ = c_l for all the instances falling in the leaf node l. Such a constant value is usually
an aggregation (mean, median, etc.) of the values of y of the training instances falling in the leaf
node. On the other hand, in model trees, each leaf node of the tree contains a linear regression
model that predicts the target variable based on the data points that reach that leaf. An example
illustrating the difference between a regression tree and a linear model tree is shown in Fig. 1.</p>
      <p>
        The quality of a split is usually measured using a criterion that quantifies how well the split
separates the data with respect to the target variable. For example, in CART [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], the quality of a
split is evaluated by the Mean Squared Error (MSE). When a node is split, the MSE is computed
for each resulting child node, and the weighted sum (according to the number of instances) of
these MSE values represents the quality of the split. The best split is defined as the one that
minimizes the MSE. In the case of linear model trees, the behavior is similar: the only difference
is that the MSE on the child nodes is computed after fitting a linear model on them.
      </p>
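      <p>To make the criterion concrete, the following is a minimal sketch (not the paper's implementation; the function names are illustrative) of how a candidate split can be scored in a linear model tree: each child gets its own OLS model, and the split is scored by the instance-weighted average of the children's MSEs.</p>

```python
import numpy as np

def leaf_mse(X, y):
    """MSE of an OLS linear model (with intercept) fitted on (X, y)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.mean((A @ coef - y) ** 2))

def split_quality(X, y, feature, threshold):
    """Weighted MSE of the two children induced by X[:, feature] <= threshold,
    where each child is evaluated after fitting its own linear model."""
    left = X[:, feature] <= threshold
    n, n_l = len(y), int(left.sum())
    if n_l == 0 or n_l == n:          # degenerate split: no partition at all
        return np.inf
    mse_l = leaf_mse(X[left], y[left])
    mse_r = leaf_mse(X[~left], y[~left])
    return (n_l * mse_l + (n - n_l) * mse_r) / n
```

On piecewise-linear data, a split placed at the breakpoint yields near-zero weighted MSE, whereas a regression tree with constant leaves would need many splits to reach comparable error.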
      <p>
        In our approach, we considered the multi-step (MS) setting proposed in [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], which consists
in predicting multiple future values of the target variable simultaneously. In particular, our
approach falls in the Multi-Input Multi-Output (MIMO) category [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ], whose goal is to learn a
global predictive model that returns the whole vector of predictions, also taking into account
the possible dependencies between future values, which in principle may be beneficial in terms of
forecasting accuracy. More formally, we consider as input features the k historical values of the
target variable y_{t-k}, y_{t-k+1}, ..., y_{t-1}, in order to predict the value of the target variable for the h
future time steps y_t, y_{t+1}, ..., y_{t+h-1}, simultaneously. Note that, in this case, the reduction of the
MSE of a split is evaluated as the average reduction of the MSE over all the future time steps.
      </p>
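      <p>The MIMO setting described above amounts to a simple windowing of the series. The helper below (a sketch with hypothetical names, not the paper's code) builds the k lagged input features and the h simultaneous targets from a univariate series.</p>

```python
import numpy as np

def make_mimo_dataset(series, k, h):
    """Build a MIMO training set from a univariate series: each row of X holds
    the k most recent values y_{t-k}..y_{t-1}; the matching row of Y holds the
    h future values y_t..y_{t+h-1}, to be predicted simultaneously."""
    series = np.asarray(series, dtype=float)
    n = len(series) - k - h + 1       # number of complete (input, target) windows
    X = np.stack([series[i:i + k] for i in range(n)])
    Y = np.stack([series[i + k:i + k + h] for i in range(n)])
    return X, Y
```

For example, with k = 3 and h = 2, the series 0..9 yields a first row X = [0, 1, 2] paired with Y = [3, 4].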
      <p>
        In the literature, we can find several implementations of linear model trees [
        <xref ref-type="bibr" rid="ref16">16, 19, 20</xref>
        ]. In
this work, we consider the simplest implementation, where internal nodes are simple tests
involving descriptive variables, while leaf nodes are linear models, as shown in the right part
of Fig. 1. This choice makes our extension towards the consideration of the spatial dimension
more straightforward. Specifically, as introduced in Section 1, we aim at extending linear model
trees to effectively capture and model the spatial dimension at different levels of locality.
      </p>
      <p>Methodologically, we introduce the consideration of the spatial dimension as a post-processing
step of the tree construction: we aim at capturing the spatial relationships within each subset
implicitly defined by a leaf node of the model tree, potentially capturing spatial relationships at
different levels of locality. Assuming we have multiple different positions (e.g., production
plants or consumers), each represented through several k-dimensional training instances, we
act as follows: for each instance i, fallen into a leaf node l and related to the time step t and to
the geographic position p, we compute a set of additional features x_{t,p,l}. These features are
computed as the weighted average of the k-dimensional historical observations at the same
time step t from the other positions in l (if a leaf node contains training instances associated with
only one position, this step is skipped), where the weights are determined by the spatial closeness
between p and the other positions (see Figure 2). More formally, x_{t,p,l} is defined as follows:</p>
      <p>x_{t,p,l} = (1 / Σ_{p' ∈ P_l, p' ≠ p} W[p, p']) · Σ_{p' ∈ P_l, p' ≠ p} W[p, p'] · h_{p',t}   (1)</p>
      <p>where P_l is the set of distinct positions of the training instances fallen into the leaf node l; h_{p',t}
is the vector of the k historical observations of the location p' at the time step t; W[p, p'] is the
spatial closeness between the positions p and p', computed as follows:</p>
      <p>W[p, p'] = 1 − D[p, p'] / max(D)   (2)</p>
      <p>where D is the distance matrix among locations, computed according to the geodesic distance.</p>
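      <p>Equations (1) and (2) can be sketched as follows, under the assumptions that geodesic distances are approximated with the haversine formula and that histories and coordinates are held in dictionaries keyed by position; the paper does not prescribe this exact implementation, and all names are illustrative.</p>

```python
import numpy as np

def haversine_km(lat1, lon1, lat2, lon2):
    """Approximate geodesic distance in km between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * np.arcsin(np.sqrt(a))

def spatial_features(hist, coords, p):
    """Eq. (1)-(2): weighted average of the k-dim histories of the other
    positions in the leaf, with closeness weights W = 1 - D / max(D).
    hist:   dict position -> k-dim history at time step t
    coords: dict position -> (lat, lon)
    p:      the position for which the features are computed"""
    pos = list(hist)
    D = np.array([[haversine_km(*coords[a], *coords[b]) for b in pos]
                  for a in pos])
    W = 1.0 - D / D.max()                 # spatial closeness, eq. (2)
    i = pos.index(p)
    others = [j for j in range(len(pos)) if j != i]
    w = W[i, others]
    H = np.stack([hist[pos[j]] for j in others])
    return (w[:, None] * H).sum(axis=0) / w.sum()   # weighted average, eq. (1)
```

Note that the position at the maximum pairwise distance receives closeness 0, so it contributes nothing to the weighted average.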
      <p>The additional features are computed and added to all the training instances falling into the leaf
node. Then, a new linear model is trained, and the contribution of the added features is assessed
using a validation set. Specifically, we compare two distinct linear models, as depicted in Figure
3. The first model is exclusively trained on the original features (during the construction of the
tree), while the second model incorporates both the original features and the additional ones
computed according to the spatial closeness. We retain the model that achieves
the lowest validation error within each leaf node. This selection process ensures that we tailor
our modeling approach to the specific peculiarities of each subset of data falling into the leaf nodes.
Consequently, within the same tree, some leaf nodes may employ models that incorporate spatial
features, while others may rely only on the original features (i.e., when the additional features
based on spatial closeness appear to provide no advantage).</p>
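      <p>The leaf-level model selection described above can be sketched as follows (illustrative names; OLS fitted via least squares): the spatially-augmented model is retained only when it achieves a lower validation error than the model on the original features.</p>

```python
import numpy as np

def fit_ols(X, y):
    """OLS coefficients (with intercept) via least squares."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def val_mse(coef, X, y):
    A = np.column_stack([np.ones(len(X)), X])
    return float(np.mean((A @ coef - y) ** 2))

def select_leaf_model(X_tr, y_tr, S_tr, X_val, y_val, S_val):
    """Compare a model on the original features X with one on [X | S],
    where S are the spatial-closeness features; keep the better on validation."""
    plain = fit_ols(X_tr, y_tr)
    spatial = fit_ols(np.hstack([X_tr, S_tr]), y_tr)
    e_plain = val_mse(plain, X_val, y_val)
    e_spatial = val_mse(spatial, np.hstack([X_val, S_val]), y_val)
    return ("spatial", spatial) if e_spatial < e_plain else ("plain", plain)
```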
      <p>After performing this process on all the leaf nodes, we apply a pruning step to prevent
overfitting and possibly capture more global (i.e., less local) spatial dependencies. In particular,
we propose an extended version of the Reduced Error Pruning (REP) algorithm [21]: starting
from the bottom of the tree and working backward, for each internal node, it compares the
error made by the unpruned tree with that obtained by simulating the pruning of the subtree
rooted at the node. The subtree is actually pruned only if the resulting tree performs no worse
than the unpruned one on the validation set. In our extended version, we also consider the
possible contribution coming from the features based on the spatial closeness. In particular, we
compare the unpruned tree both with the pruned tree and with the pruned tree that also considers
the features based on the spatial closeness. Considering the example reported in Figure 4, given
an internal node, we compare the errors made on the validation set by three models: i)
the model represented by its two child nodes (see the left part of Figure 4); ii) the
model obtained after pruning the subtree rooted at the node and learning a new linear model from the
instances falling into it (see the middle part of Figure 4); iii) the model obtained after pruning
the subtree rooted at the node and learning a new linear model from the instances falling into it,
expanded with the features based on the spatial closeness (see the right part of Figure 4).
If model ii) or model iii) leads to an improvement on the validation set, the tree is pruned
accordingly. This process continues in a bottom-up fashion until no improvement is obtained.</p>
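      <p>The choice made at each internal node by the extended REP procedure can be summarized by a small helper (a sketch of the rule described above, not the authors' code): pruning is applied only when the pruned alternative performs no worse on the validation set.</p>

```python
def rep_decision(err_subtree, err_pruned, err_pruned_spatial):
    """Extended REP choice at an internal node: keep the subtree, prune it to
    a plain linear leaf, or prune it to a leaf that also uses the spatial
    features. Pruning requires performing no worse on the validation set."""
    best = min(err_pruned, err_pruned_spatial)
    if best <= err_subtree:
        # prefer the spatial variant when it is at least as good
        return "prune_spatial" if err_pruned_spatial <= err_pruned else "prune_plain"
    return "keep_subtree"
```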
    </sec>
    <sec id="sec-3">
      <title>3. Experiments</title>
      <p>In order to assess the effectiveness of the proposed approach, we performed our experiments on a
real-world wind power plant dataset, provided by a leading company in the energy distribution field.
The dataset consists of measurements of the energy production of 60 wind plants, collected every
15 minutes over a period of 1 year. Together with the geographic position (latitude and longitude),
the plants are described by some technical characteristics, namely, avg_wind_turbine_height,
rotor_diameter, and number_of_wind_turbines.</p>
      <p>Following a cross-validation setting for time series, we consider a sliding window approach
where the training set consists of 4 months of data, the validation set corresponds to the
last month of the training set, and the test set is the subsequent month. We performed the
experiments considering a multi-step setting, where the goal is to predict the energy production
of 12 target time-steps ahead simultaneously. As historical measurements associated with each
instance, we consider the 12 previous values of energy production, i.e., k = 12. It is noteworthy
that, in real-world production scenarios, actual measurements are often made available after a
certain amount of time. Therefore, we evaluated the performance of all the models considering
different delays between the last observed measurement and the first target time-step to predict.
The considered delays are 0 hours, 2 hours, and 4 hours.</p>
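      <p>The sliding-window protocol above can be sketched as follows, indexing the months as integer blocks (an assumption made purely for illustration; the helper name is hypothetical).</p>

```python
def sliding_window_folds(n_months, train_len=4, val_len=1):
    """Sliding-window folds over monthly blocks: `train_len` months of
    training data, whose last `val_len` months also serve as validation,
    followed by one test month. Months are indexed 0..n_months-1."""
    folds = []
    for start in range(0, n_months - train_len):
        train = list(range(start, start + train_len))
        val = train[-val_len:]        # validation = tail of the training window
        test = start + train_len      # the subsequent month
        folds.append((train, val, test))
    return folds
```

Because the window slides one month at a time, a year of data yields several folds, each testing on a month never seen during training.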
      <p>
        To learn the initial model tree, we considered the implementation available in the linear-tree
Python library (https://github.com/cerlymarco/linear-tree). For all the experiments, we investigated two different configurations of its
parameters, namely: min_samples_leaf = 0.1, max_depth = 5 and min_samples_leaf = 0.05,
max_depth = 20. The original version of this system (henceforth denoted with LT), which ignores
the spatial information, has been considered as the closest competitor of our approach. As
additional competitor systems, we considered three different regressors that are able to work in
the multi-step setting, namely, Linear Regression (henceforth denoted with LR), Random Forests
(henceforth denoted with RF) and XGBoost Regressor (henceforth denoted with XGB). For all
these competitors, we also assessed the performance achieved when the spatial information
is considered by injecting PCNM variables [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. This allows us to specifically evaluate the
contribution of the novel strategy that we propose to model the spatial dimension. Finally, we
considered two state-of-the-art neural network architectures that can work in the multi-step
setting and capture spatio-temporal phenomena, i.e., MTGNN [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] and D2STGNN [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ].
      </p>
      </p>
      <p>As an evaluation measure, we collected the Relative Squared Error (RSE) for LT, and the
percentage of improvement with respect to the best configuration of such a model for the
proposed method and for all the considered competitor systems. The RSE for the t-th time-step is formally defined as
RSE_t = Σ_i (y_{i,t} − ŷ_{i,t})² / Σ_i (y_{i,t} − ȳ_t)², where y_{i,t} and ŷ_{i,t} are the true and the predicted values, respectively, of the
i-th instance at the t-th time-step, while ȳ_t is the average value of the given target time-step in the training set.</p>
      <p>The adoption of the RSE, instead of more commonly adopted measures like MAE/MSE/RMSE,
allows us to evaluate the actual usefulness of the predictive models in real scenarios, with
respect to adopting a baseline predictor that always returns the mean of the measurements: an
RSE value close to 0.0 means that the model returns perfect predictions; an RSE value close to
1.0 corresponds to a model that performs analogously to the baseline that always returns the
mean; an RSE value higher than 1.0 means that the model performs worse than such a baseline.</p>
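      <p>The RSE for a single target time-step, as defined above, can be computed with a few lines (a straightforward sketch of the formula, with illustrative names):</p>

```python
import numpy as np

def rse(y_true, y_pred, y_train_mean):
    """Relative Squared Error for one target time-step: 0 means perfect
    predictions; 1 means equivalent to always predicting the training mean."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sum((y_true - y_pred) ** 2)
                 / np.sum((y_true - y_train_mean) ** 2))
```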
      <p>In Table 1, we report the RSE results, averaged over all the target time-steps and over all
the folds of the cross-validation. As expected, all the considered methods perform worse with
higher delays. Nevertheless, all the RSE values remain under 1.0, which means that they can
still provide more useful indications than those provided by the baseline predictor based on the
average. Looking at the results obtained by our approach, it clearly provides advantages over
LT for all the values of delay and in both configurations of its parameters. On the contrary, all
the other competitors perform worse than (or equal to) LT, except for a few specific cases, where
the improvement is no more than 0.6%. These results confirm the adequacy of adopting model
trees in this application domain, due to the co-presence of linear and non-linear phenomena.</p>
      <p>Looking at the contribution provided by the PCNM variables to the competitors, we can
observe no evident differences with respect to the same methods without PCNM features, with
some peculiar cases in which the error even increases (see, for example, RF+PCNM vs RF). This
is possibly due to the fact that PCNM variables do not take historical factors into account. On
the other hand, our approach incorporates additional historical features, taking into account
the spatial closeness at different degrees of locality. This clearly performs better than
injecting static features that depend only on the positions, as done by the approaches relying on PCNM.</p>
      <p>In general, we can observe that our approach outperforms all the considered competitors,
including those based on recent neural network architectures. Surprisingly, the latter obtained the
worst results among the considered systems. This is possibly due to the complexity of their
architectures, which require a huge amount of training data (possibly much larger than that
available in this context) to properly learn an accurate model.</p>
      <p>[Table 1: average RSE obtained by LT, and percentage of improvement over the best LT
configuration achieved by LT+PCNM, RF, RF+PCNM, XGB, XGB+PCNM, LR, LR+PCNM,
MTGNN, D2STGNN, and our approach.]</p>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusion</title>
      <p>In this paper, we presented an approach for nowcasting the energy produced by wind power
plants in a multi-step predictive setting. We enabled linear model trees to capture spatial
phenomena at different degrees of locality. Specifically, we incorporate additional features that
represent the historical observations of other plants, taking into account their spatial closeness.
Moreover, we also extended the REP pruning strategy to consider the spatial dimension.</p>
      <p>Our experiments, performed on a real-world dataset, proved the effectiveness of the proposed
approach, in comparison with standard linear trees and other state-of-the-art competitors that
are also able to model the spatial dimension.</p>
      <p>For future work, we intend to evaluate the effectiveness of the proposed method in other
domains, and to perform an in-depth evaluation of the differences in terms of (theoretical and
empirical) model complexity with respect to unpruned linear trees and complex neural networks.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>This work was partially supported by the project FAIR - Future AI Research (PE00000013),
Spoke 6 - Symbiotic AI, under the NRRP MUR program funded by the NextGenerationEU. The
research of Annunziata D’Aversa is funded by a PhD fellowship within the framework of the
Italian "POR Puglia FSE 2014-2020" – Axis X - Action 10.4 "Interventions to promote research
and for university education" - PhD Project n. 1004.121 (CUP n. H99J21006620008).</p>
      <p>[18] S. Ben Taieb, G. Bontempi, A. F. Atiya, A. Sorjamaa, A review and comparison of strategies
for multi-step ahead time series forecasting based on the NN5 forecasting competition,
Expert Systems with Applications 39 (2012) 7067–7083.</p>
      <p>[19] Y. Wang, I. Witten, Induction of model trees for predicting continuous classes (1997).</p>
      <p>[20] D. Malerba, F. Esposito, M. Ceci, A. Appice, Top-down induction of model trees with
regression and splitting nodes, IEEE Transactions on Pattern Analysis and Machine
Intelligence 26 (2004) 612–625. doi:10.1109/TPAMI.2004.1273937.</p>
      <p>[21] J. Quinlan, Simplifying decision trees, International Journal of Man-Machine Studies 27
(1987) 221–234.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Aasim</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Singh</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Mohapatra</surname>
          </string-name>
          ,
          <article-title>Repeated wavelet transform based arima model for very short-term wind speed forecasting</article-title>
          ,
          <source>Renewable Energy</source>
          <volume>136</volume>
          (
          <year>2019</year>
          )
          <fpage>758</fpage>
          -
          <lpage>768</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>P.</given-names>
            <surname>Bacher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Madsen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. A.</given-names>
            <surname>Nielsen</surname>
          </string-name>
          ,
          <article-title>Online short-term solar power forecasting</article-title>
          ,
          <source>Solar Energy</source>
          <volume>83</volume>
          (
          <year>2009</year>
          )
          <fpage>1772</fpage>
          -
          <lpage>1783</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Q.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Xie</surname>
          </string-name>
          ,
          <article-title>Short-term wind speed or power forecasting with heteroscedastic support vector regression</article-title>
          ,
          <source>IEEE Transactions on Sustainable Energy</source>
          <volume>7</volume>
          (
          <year>2016</year>
          )
          <fpage>241</fpage>
          -
          <lpage>249</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>C.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Xue</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Saeed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <article-title>Short-term wind speed interval prediction based on ensemble gru model</article-title>
          ,
          <source>IEEE Transactions on Sustainable Energy</source>
          <volume>11</volume>
          (
          <year>2020</year>
          )
          <fpage>1370</fpage>
          -
          <lpage>1380</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>P.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>Short-term wind speed forecasting using a hybrid model</article-title>
          ,
          <source>Energy</source>
          <volume>119</volume>
          (
          <year>2017</year>
          )
          <fpage>561</fpage>
          -
          <lpage>577</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Dowell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Pinson</surname>
          </string-name>
          ,
          <article-title>Very-short-term probabilistic wind power forecasts by sparse vector autoregression</article-title>
          ,
          <source>IEEE Transactions on Smart Grid</source>
          <volume>7</volume>
          (
          <year>2015</year>
          )
          <fpage>763</fpage>
          -
          <lpage>770</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>X. G.</given-names>
            <surname>Agoua</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Girard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Kariniotakis</surname>
          </string-name>
          ,
          <article-title>Short-term spatio-temporal forecasting of photovoltaic power production</article-title>
          ,
          <source>IEEE Transactions on Sustainable Energy</source>
          <volume>9</volume>
          (
          <year>2018</year>
          )
          <fpage>538</fpage>
          -
          <lpage>546</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Ye</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Pei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Dai</surname>
          </string-name>
          ,
          <article-title>A spatiotemporal directed graph convolution network for ultra-short-term wind power prediction</article-title>
          ,
          <source>IEEE Transactions on Sustainable Energy</source>
          <volume>14</volume>
          (
          <year>2023</year>
          )
          <fpage>39</fpage>
          -
          <lpage>54</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Khodayar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>Spatio-temporal graph deep neural network for short-term wind speed forecasting</article-title>
          ,
          <source>IEEE Transactions on Sustainable Energy</source>
          <volume>10</volume>
          (
          <year>2019</year>
          )
          <fpage>670</fpage>
          -
          <lpage>681</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>M.</given-names>
            <surname>Ceci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Corizzo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Fumarola</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Malerba</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rashkovska</surname>
          </string-name>
          ,
          <article-title>Predictive modeling of pv energy production: How to set up the learning task for a better prediction?</article-title>
          ,
          <source>IEEE Transactions on Industrial Informatics</source>
          <volume>13</volume>
          (
          <year>2017</year>
          )
          <fpage>956</fpage>
          -
          <lpage>966</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>A.</given-names>
            <surname>D'Aversa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Polimena</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Pio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ceci</surname>
          </string-name>
          ,
          <article-title>Leveraging spatio-temporal autocorrelation to improve the forecasting of the energy consumption in smart grids</article-title>
          , in:
          <string-name>
            <given-names>P.</given-names>
            <surname>Poncelet</surname>
          </string-name>
          , D. Ienco (Eds.), Discovery Science, Springer Nature Switzerland, Cham,
          <year>2022</year>
          , pp.
          <fpage>141</fpage>
          -
          <lpage>156</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>L.</given-names>
            <surname>Anselin</surname>
          </string-name>
          ,
          <article-title>Local indicators of spatial association - LISA</article-title>
          ,
          <source>Geographical Analysis</source>
          <volume>27</volume>
          (
          <year>1995</year>
          )
          <fpage>93</fpage>
          -
          <lpage>115</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>S.</given-names>
            <surname>Dray</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Legendre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. R.</given-names>
            <surname>Peres-Neto</surname>
          </string-name>
          ,
          <article-title>Spatial modelling: a comprehensive framework for principal coordinate analysis of neighbour matrices (PCNM)</article-title>
          ,
          <source>Ecological Modelling</source>
          <volume>196</volume>
          (
          <year>2006</year>
          )
          <fpage>483</fpage>
          -
          <lpage>493</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Pan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Long</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <article-title>Connecting the dots: Multivariate time series forecasting with graph neural networks</article-title>
          ,
          <source>in: Proceedings of the 26th ACM SIGKDD international conference on knowledge discovery &amp; data mining</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>753</fpage>
          -
          <lpage>763</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Shao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Wei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Cao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Jensen</surname>
          </string-name>
          ,
          <article-title>Decoupled dynamic spatial-temporal graph neural network for traffic forecasting</article-title>
          ,
          <source>Proceedings of the VLDB Endowment</source>
          , volume
          <volume>15</volume>
          ,
          <year>2022</year>
          , pp.
          <fpage>2733</fpage>
          -
          <lpage>2746</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Quinlan</surname>
          </string-name>
          , et al.,
          <article-title>Learning with continuous classes</article-title>
          ,
          <source>in: 5th Australian joint conference on artificial intelligence</source>
          , volume
          <volume>92</volume>
          ,
          World Scientific
          ,
          <year>1992</year>
          , pp.
          <fpage>343</fpage>
          -
          <lpage>348</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>L.</given-names>
            <surname>Breiman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Friedman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Stone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Olshen</surname>
          </string-name>
          ,
          <source>Classification and Regression Trees</source>
          , Routledge,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>S. B.</given-names>
            <surname>Taieb</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Bontempi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. F.</given-names>
            <surname>Atiya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sorjamaa</surname>
          </string-name>
          ,
          <article-title>A review and comparison of strategies for multi-step ahead time series forecasting based on the NN5 forecasting competition</article-title>
          ,
          <source>Expert Systems with Applications</source>
          <volume>39</volume>
          (
          <year>2012</year>
          )
          <fpage>7067</fpage>
          -
          <lpage>7083</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>