Wind Speed Forecasting via Structured Output
                  Learning

         Annalisa Appice1,2 ( ), Antonietta Lanza1 , and Donato Malerba1,2
     1
         Dipartimento di Informatica, Università degli Studi di Bari Aldo Moro via
                             Orabona, 4 - 70126 Bari - Italy
            2
              Consorzio Interuniversitario Nazionale per l’Informatica - CINI
              annalisa.appice@uniba.it, antonietta.lanza@uniba.it,
                               donato.malerba@uniba.it


         Abstract. In the context of the wind energy management, the study of
         time series data by means of a predictability analysis can be very helpful.
         For example, accurate wind speed forecasts are necessary to schedule
         dispatchable generation and tariffs in the day-ahead electricity market.
         This paper examines the use of structured output learning, in order to
         model historical wind speed data and yield accurate forecasts of the wind
         speed on the day-ahead (24 h) horizon. The proposed method is based on
         a multi-resolution analysis of the historical data, which are represented
         at multiple scales in both space and time. Handling multi-resolution wind
         speed data allows us to leverage the knowledge hidden in both the spatial
         and temporal variability of the shared information, in order to identify
         spatio-temporal aided patterns that contribute to yield accurate wind
         speed forecasts. In an assessment, using benchmark data, we show that
         the multi-resolution structured output learning is able to determine more
         accurate forecasts than the state-of-the-art structured output models.


1     Introduction

Nowadays, power and energy systems that include wind energy as an integral
component have proved successful. The benefit of clean wind energy also brings the challenge
of forecasting wind power for optimal management of electricity grids. However,
the variable nature of wind speed [17] poses operational challenges for wind
power integration into modern power systems. As the wind variability occurs in
time, as well as in space scales, the profiles of the available power of wind sources
depend on the geographic location, the season (or time of the year), the time of
the day and other physical parameters.
    In this paper, the wind speed forecasting task is addressed by considering
wind speed data measured every 10 minutes along day-ahead time horizons.
We propose a specific time series approach that applies artificial intelligence,
in order to learn a forecasting model from the historical data only. A peculiar
contribution is the consideration of multi-resolution representations of historical
data, in order to handle the variability of the wind speed information at the space
and time scales. Our purpose is the investigation of the implications of learning
multi-resolution data on the accuracy of the forecasting operation. Specifically,
we formulate the day-ahead forecasting task as a structured output predictive
learning problem [4]. A multi-target model is considered, in order to learn a single
model that predicts multiple output variables at the same time – one variable for
each time point over the 24 h ahead horizon. The decision to learn a structured
output model is supported by various studies, which have repeatedly shown
that multi-target models are typically easier to interpret, perform better, and
overfit less than single-target models [4, 29, 7].

    SEBD 2018, June 24-27, 2018, Castellaneta Marina, Italy. Copyright held by the
    author(s).

    Neighboring and windowing mechanisms are adopted, in order to represent
the historical data at various scales in both space and time, respectively. These
mechanisms are combined with the standard deviation operator that is used
to quantify the spatial and/or temporal variability of the data. This multi-
resolution representation of the wind speed variability contributes to define new
input variables, as well as new output variables. In this way, we are able to learn a
multi-target model that accounts for the variability of measurements at different
sites and times. The viability of the proposed method is assessed in the structured
output learning by comparing the accuracy of traditional multi-target models to
the accuracy of multi-resolution multi-target models in a benchmark scenario.
The sensitivity of the accuracy of the learned forecasting model is evaluated
with respect to the size of the scale. Finally, the accuracy gained by taking into
account appropriate patterns of the data variability is explored.
    The paper is organized as follows. In the next Section, we briefly report the
state-of-the-art of the time series analysis for the problem of wind speed forecast-
ing. In Section 3, we describe the basics of this study. In Section 4, we present
the multi-resolution structured output learning phase proposed here. In Section
5 we describe the benchmark dataset considered for the empirical evaluation
and illustrate the relevant results. Finally, Section 6 draws some conclusions and
outlines some future work.


2    Related work

In the literature, different forecasting horizons have been investigated: long-term
(from one day to one week ahead), medium-term (from 6 h to one day ahead),
short-term (from 30 min to 6 h ahead) and very short-term (few seconds to 30
min ahead) [28, 11]. On the other hand, various approaches have been developed
for wind speed forecasting in renewable energy systems. In particular, three main
predictive categories are described in the literature [6, 30, 3]: the physical [1, 15],
the time series [27, 8, 20, 13, 18, 22, 21, 2, 5] and the hybrid [16, 10, 12] approaches.
The physical approach describes a physical relationship between wind speed,
atmospheric conditions, local topography and the output from the wind power
turbine. The time series approach consists of time series forecasts, which are
based on the historical data (the wind speed collected at a specific site) and
commonly neglect the meteorological data. Finally, the hybrid approach applies
a combination of physical and time series models.
    By focusing the attention on the time series approach (which is the most
popular in practice and also the subject of this paper), a wide plethora of
time-series methods employ a general class of statistical models, that is, the
Auto-Regressive Moving Average (ARMA) or Auto-Regressive Integrated Mov-
ing Average (ARIMA), in order to estimate future observations of a wind farm
through a linear combination of the past data. The recent literature [26, 13,
19] has shown that these auto-regressive models are very well suited to capture
short range correlations. Hence, they have been used extensively in a variety
of (very) short-term forecasting applications (less than 6 hours). Recent studies
have also proved that auto-regressive models can be profitably extended, in or-
der to account for spatial characteristics of time series data and gain in accuracy
[23–25]. Alternatively, the time series approach also involves the use of artificial
intelligence techniques, which are commonly well suited to produce accurate
predictions in medium-term and long-term forecasting applications. Examples of
artificial intelligence wind speed forecasting methods apply Neural Networks [9],
Support Vector Machines [31], Regression Trees [18], K-Nearest Neighbors [5]
and Cluster analysis [22, 21].
    By investing in artificial intelligence, this paper explores the use of the struc-
tured output learning in a medium-term wind speed forecasting application (24
h ahead). We note that the benefits of structured output learning in time series
forecasting have been recently assessed in [7] considering the problem of deriving
24 h ahead solar radiation forecasts. Differently from this seminal study, which
modeled the spatial “correlation” of the solar radiation in order to define new
“input” variables only, we leverage here the power of a model of both the spatial
and temporal “variability” of the wind speed, in order to define new “input”
and “output” variables, which contribute to yield accurate 24 h ahead forecasts
of the wind speed.


3     Basics
Premises Without loss of generality, the applicative scenario we consider in this
work is described by the following four premises. First, the spatial location of
a wind farm is modeled by means of 2-D point coordinates (e.g. latitude and
longitude). Second, the spatial locations of the wind farms are known, distinct
and invariant. Third, wind farms transmit measurements of the wind speed and
they are synchronized in the transmission time. Finally, transmission time points
are equally spaced in time.

Learning task Based upon these premises, the task we intend to perform is to
forecast wind speed at each farm of the grid. The forecasting model is
learned from the input historical data of the wind speed, as they are collected
from a grid of wind farms, every 10 minutes, over m+1 consecutive days. We also
consider additional input information, which models the wind speed variability
at the spatial, temporal and spatio-temporal scales. The output of the learning
phase is a structured output predictive model that allows us to yield fine-grained
forecasts for the next day (24 hours) at 10-minute intervals, based on the input
historical wind speed data as they are measured every 10 minutes over the past m
days.

    Multi-dimensional representations of geographic space can be equally dealt with.

Input and output variables Formally, let $k_i$ be the i-th farm and let $(X_i, Y_i)$ be the
geographic coordinates of $k_i$. Let us consider the historical wind speed data,
measured from $k_i$, over days $1, \ldots, m, m+1$. They are transformed into a training
example that is represented by vectors $x_i$, $x_i^S$, $x_i^T$ and $x_i^{ST}$, which play the role
of independent input variables, and vectors $y_i$, $y_i^S$, $y_i^T$ and $y_i^{ST}$, which play the
role of dependent output (or target) variables, respectively. We note that the
input variables are calculated over days $1, \ldots, m$, while the output variables are
calculated over day $m+1$. These input and output variable vectors are formally
described in the following.
    Vector $x_i$ is defined as follows:

$$x_i = (x_{i,1}, \ldots, x_{i,144}, x_{i,145}, \ldots, x_{i,288}, \ldots, x_{i,144m}), \tag{1}$$

where $x_{i,t}$ denotes the wind speed measured from $k_i$ at time $t$ with $t = 1, \ldots, 144m$
(i.e. every day is divided into 144 time points spaced ten minutes apart, so that $t$
denotes the time point that occurs every 10 minutes on days $1, \ldots, m$). Similarly,
vector $y_i$ is defined as follows:

$$y_i = (y_{i,144m+1}, \ldots, y_{i,144(m+1)}), \tag{2}$$

where $y_{i,t}$ represents the wind speed measured from $k_i$ at time $t$ with
$t = 144m+1, \ldots, 144(m+1)$ (every 10 minutes on day $m+1$).
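
The split described by Eqs. (1) and (2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `split_input_output` and the constant `POINTS_PER_DAY` are hypothetical, and the series of one farm is assumed to be a plain list holding exactly 144(m+1) readings in time order.

```python
POINTS_PER_DAY = 144  # one reading every 10 minutes


def split_input_output(series, m):
    """Split the readings of one farm into (x_i, y_i).

    x_i holds the first 144*m readings (days 1..m), as in Eq. (1);
    y_i holds the last 144 readings (day m+1), as in Eq. (2).
    """
    expected = POINTS_PER_DAY * (m + 1)
    if len(series) != expected:
        raise ValueError(f"expected {expected} readings, got {len(series)}")
    cut = POINTS_PER_DAY * m
    return list(series[:cut]), list(series[cut:])


# Toy usage: m = 2 days of history plus one target day of synthetic readings.
m = 2
series = [float(t) for t in range(1, POINTS_PER_DAY * (m + 1) + 1)]
x_i, y_i = split_input_output(series, m)
```

With m = 2 the input vector has 288 components and the output vector has 144, one per 10-minute time point of the target day.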
     By applying the standard deviation operator in combination with the neighboring
and/or windowing mechanisms, we are able to define new data vectors
that represent the variability of the multi-resolution wind speed data considered
at spatial, temporal and spatio-temporal scales. In particular, the spatial scale
is defined by the neighboring mechanism, the temporal scale is defined by the
windowing mechanism, while the spatio-temporal scale is defined by combining
the neighboring and windowing mechanisms.
     Given radius $R$, applying the neighboring mechanism to $k_i$, a circular neighborhood
of $k_i$ is constructed. This is the set of wind farms $k_j$ such that $d(k_i, k_j) \le R$,
where $d(\cdot, \cdot)$ denotes the geographic distance. Considering the spatial scale
defined by this neighboring mechanism, we define vectors $x_i^S$ and $y_i^S$, which
represent farm $k_i$ at the space scale with radius $R$ over days $1, \ldots, m$ and day $m+1$,
respectively. Procedurally,

$$x_i^S = (x^S_{i,1}, \ldots, x^S_{i,144}, x^S_{i,145}, \ldots, x^S_{i,288}, \ldots, x^S_{i,144m}),$$
$$y_i^S = (y^S_{i,144m+1}, \ldots, y^S_{i,144(m+1)}), \tag{3}$$

where:

$$x^S_{i,t} = \mathrm{stdev}(\{x_{j,t} \mid d(k_i, k_j) \le R\}) \text{ with } t = 1, \ldots, 144m,$$
$$y^S_{i,t} = \mathrm{stdev}(\{y_{j,t} \mid d(k_i, k_j) \le R\}) \text{ with } t = 144m+1, \ldots, 144(m+1). \tag{4}$$
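
A minimal Python sketch of the neighboring mechanism of Eq. (4) follows. Several choices here are assumptions, not details given in the text: the geographic distance is computed with the haversine formula on (latitude, longitude) pairs, the standard deviation is the population one (`statistics.pstdev`, which is defined even when a farm has no neighbor other than itself), and the names `haversine_km` and `spatial_variability` are hypothetical.

```python
import math
from statistics import pstdev

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, used by the haversine formula


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def spatial_variability(coords, series, radius_km):
    """For each farm i and time t, stdev of the readings at time t over all
    farms j with d(k_i, k_j) <= R (farm i itself is always included)."""
    n = len(coords)
    result = []
    for i in range(n):
        neighbours = [j for j in range(n)
                      if haversine_km(coords[i], coords[j]) <= radius_km]
        result.append([pstdev([series[j][t] for j in neighbours])
                       for t in range(len(series[i]))])
    return result


# Toy usage: farms 0 and 1 are about 4 km apart, farm 2 is over 100 km away.
coords = [(40.0, -75.0), (40.0, -75.05), (41.0, -75.0)]
series = [[4.0, 6.0], [6.0, 6.0], [10.0, 2.0]]
x_s = spatial_variability(coords, series, radius_km=10.0)
```

The same function applies unchanged to the readings of day m+1 to obtain the output vector of Eq. (3). The pairwise distance loop is quadratic in the number of farms; a spatial index would be preferable on a large grid.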
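    Given length $L$ such that $L$ is a factor of 144, the windowing mechanism transforms
the sequence of consecutive time points $t_1, \ldots, t_{144(m+1)}$ into the sequence
of $\frac{144}{L}(m+1)$ consecutive time windows so that:

$$
\begin{aligned}
\mathrm{windowing}[1 \ldots 144(m+1)] = {} & \underbrace{W_1[1 \to L],\ W_2[L+1 \to 2L],\ \ldots,\ W_{\frac{144}{L}}\left[\left(\tfrac{144}{L}-1\right)L+1 \to 144\right]}_{\text{day } 1},\ \underbrace{\ldots}_{\text{day } 2},\ \ldots,\ \underbrace{\ldots}_{\text{day } m}, \\
& \underbrace{W_{\frac{144}{L}m+1}\left[t_{\frac{144}{L}m}+1 \to t_{\frac{144}{L}m+L}\right],\ \ldots,\ W_{\frac{144}{L}(m+1)}\left[t_{\frac{144}{L}m}+\left(\tfrac{144}{L}-1\right)L+1 \to t_{\frac{144}{L}(m+1)}\right]}_{\text{day } m+1}
\end{aligned}
\tag{5}
$$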
where each window covers $L$ consecutive time points. Considering the temporal
scale defined by this windowing mechanism, we can define vectors $x_i^T$ and
$y_i^T$, which represent farm $k_i$ at the time scale with length $L$ over days $1, \ldots, m$
and day $m+1$, respectively. Procedurally,

$$x_i^T = (x^T_{i,1}, \ldots, x^T_{i,\frac{144}{L}}, x^T_{i,\frac{144}{L}+1}, \ldots, x^T_{i,\frac{144}{L}2}, \ldots, x^T_{i,\frac{144}{L}(m-1)+1}, \ldots, x^T_{i,\frac{144}{L}m}),$$
$$y_i^T = (y^T_{i,\frac{144}{L}m+1}, \ldots, y^T_{i,\frac{144}{L}(m+1)}), \tag{6}$$

where:

$$x^T_{i,t} = \mathrm{stdev}(\{x_{i,r} \mid r \in W_t\}) \text{ with } t = 1, \ldots, \tfrac{144}{L}m,$$
$$y^T_{i,t} = \mathrm{stdev}(\{y_{i,r} \mid r \in W_t\}) \text{ with } t = \tfrac{144}{L}m+1, \ldots, \tfrac{144}{L}(m+1). \tag{7}$$
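
The windowing mechanism of Eqs. (5)-(7) can be illustrated with the short Python sketch below. It is a sketch under assumptions: the function name `temporal_variability` is hypothetical, the series of one farm is a plain list whose length is a multiple of 144, and the population standard deviation (`statistics.pstdev`) stands in for the stdev operator, whose exact variant is not stated in the text.

```python
from statistics import pstdev

POINTS_PER_DAY = 144  # one reading every 10 minutes


def temporal_variability(series, window_len):
    """Cut the series into consecutive, non-overlapping windows of
    window_len time points and return the stdev of each window."""
    if POINTS_PER_DAY % window_len != 0:
        raise ValueError("window length L must be a factor of 144")
    if len(series) % POINTS_PER_DAY != 0:
        raise ValueError("series must cover a whole number of days")
    return [pstdev(series[start:start + window_len])
            for start in range(0, len(series), window_len)]


# Toy usage: one day of readings, L = 6 time points (1 hour).
day = [1.0, 1.0, 1.0, 3.0, 3.0, 3.0] + [5.0] * 138
x_t = temporal_variability(day, window_len=6)
```

With L = 6 a day yields 144/6 = 24 window values, so m days of history yield the (144/L)m components of Eq. (6).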
   Finally, given radius $R$ and length $L$, we define vectors $x_i^{ST}$ and $y_i^{ST}$, which
represent farm $k_i$ at the space scale with radius $R$ and the time scale with length
$L$ over days $1, \ldots, m$ and day $m+1$, respectively. Procedurally,

$$x_i^{ST} = (x^{ST}_{i,1}, \ldots, x^{ST}_{i,\frac{144}{L}}, x^{ST}_{i,\frac{144}{L}+1}, \ldots, x^{ST}_{i,\frac{144}{L}2}, \ldots, x^{ST}_{i,\frac{144}{L}(m-1)+1}, \ldots, x^{ST}_{i,\frac{144}{L}m}),$$
$$y_i^{ST} = (y^{ST}_{i,\frac{144}{L}m+1}, \ldots, y^{ST}_{i,\frac{144}{L}(m+1)}), \tag{8}$$

where:

$$x^{ST}_{i,t} = \mathrm{stdev}(\{x_{j,r} \mid d(k_i, k_j) \le R \text{ and } r \in W_t\}) \text{ with } t = 1, \ldots, \tfrac{144}{L}m,$$
$$y^{ST}_{i,t} = \mathrm{stdev}(\{y_{j,r} \mid d(k_i, k_j) \le R \text{ and } r \in W_t\}) \text{ with } t = \tfrac{144}{L}m+1, \ldots, \tfrac{144}{L}(m+1). \tag{9}$$

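
The spatio-temporal operator of Eq. (9) pools, for each farm and window, the readings of all neighboring farms at all time points of the window. The Python sketch below illustrates this under assumptions: the neighborhoods are passed in as precomputed index lists (the hypothetical argument `neighbours`, where `neighbours[i]` contains i itself), the function name is hypothetical, and the population standard deviation is used. The toy series is deliberately shorter than a day, so the check that L divides 144 is omitted.

```python
from statistics import pstdev


def spatio_temporal_variability(series, neighbours, window_len):
    """For each farm i and window W_t, stdev of the readings x_{j,r} with
    j in the neighbourhood of i and r in W_t."""
    result = []
    for i, nbrs in enumerate(neighbours):
        n_points = len(series[i])
        if n_points % window_len != 0:
            raise ValueError("series length must be a multiple of the window length")
        row = []
        for start in range(0, n_points, window_len):
            # Pool every reading of every neighbour inside the current window.
            pooled = [series[j][r]
                      for j in nbrs
                      for r in range(start, start + window_len)]
            row.append(pstdev(pooled))
        result.append(row)
    return result


# Toy usage: farms 0 and 1 are mutual neighbours, farm 2 is isolated.
series = [[1.0, 1.0, 3.0, 3.0], [3.0, 3.0, 1.0, 1.0], [2.0, 2.0, 2.0, 2.0]]
neighbours = [[0, 1], [0, 1], [2]]
x_st = spatio_temporal_variability(series, neighbours, window_len=2)
```

In the toy data each farm is constant within a window, so the purely temporal operator would return zero everywhere; the pooled operator still captures the disagreement between the two neighboring farms.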
4     Multi-Resolution Structured learning - MuReS
Let us consider a wind farm grid $K$, which is composed of $N$ wind farms
$k_1, k_2, \ldots, k_N$, and a historical dataset $D$, which comprises wind speed
measurements collected from $K$ over $m+1$ days. Adopting the notation introduced
in Section 3, $D$ is spanned over an independent input space $X \times X^S \times X^T \times X^{ST}$
and a dependent output space $Y \times Y^S \times Y^T \times Y^{ST}$. The structured predictive
model $f^{+ST}$ can be learned from $D$ so that:

$$f^{+ST} : X \times X^S \times X^T \times X^{ST} \to Y \times Y^S \times Y^T \times Y^{ST}. \tag{10}$$

    This predictive model is a “multi-resolution” upgrade of the traditional structured
output predictive model [4, 29]. We note that the output space of $f^{+ST}(\cdot)$
yields 24 h ahead forecasts of the fine-grained wind speed ($Y$), as well as 24 h ahead
forecasts of the wind speed variability at the space and time scales considered
($Y^S$, $Y^T$ and $Y^{ST}$). However, this study aims at yielding accurate fine-grained
forecasts of wind speed; hence the empirical study will explore the accuracy of
model $f^{+ST}(\cdot)$ along $Y$ only.
    In this study, predictive model f +ST (·) is learned as a tree, i.e. a hierarchy
of clusters (Predictive Clustering Trees (PCTs)): the top node corresponds to
one cluster containing all the data, which is recursively partitioned into smaller
clusters while moving down the tree. CLUS, including PCTs for multi-target
regression [14], is available at clus.sourceforge.net.
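
To give an intuition of how a tree can be grown for several targets at once, the Python sketch below selects a single split by maximizing the reduction of the variance summed over all target variables. This is a simplified illustration, not the CLUS implementation: the function names are hypothetical, no normalization of the targets, stopping criterion or pruning is included, and the exact heuristic used by CLUS is the one described in [14].

```python
from statistics import pvariance


def total_variance(targets):
    """Number of rows times the sum, over target columns, of the
    population variance of that column (0.0 for an empty set)."""
    if not targets:
        return 0.0
    n_targets = len(targets[0])
    return len(targets) * sum(pvariance([row[k] for row in targets])
                              for k in range(n_targets))


def best_split(inputs, targets):
    """Return (gain, feature index, threshold) of the binary split
    'feature <= threshold' with the largest variance reduction."""
    parent = total_variance(targets)
    best = None
    for f in range(len(inputs[0])):
        values = sorted({row[f] for row in inputs})
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [y for x, y in zip(inputs, targets) if x[f] <= threshold]
            right = [y for x, y in zip(inputs, targets) if x[f] > threshold]
            gain = parent - total_variance(left) - total_variance(right)
            if best is None or gain > best[0]:
                best = (gain, f, threshold)
    return best


# Toy usage: two input variables, two target variables per example.
inputs = [[1.0, 5.0], [2.0, 1.0], [8.0, 4.0], [9.0, 2.0]]
targets = [[1.0, 10.0], [1.0, 10.0], [5.0, 20.0], [5.0, 20.0]]
gain, feature, threshold = best_split(inputs, targets)
```

In the toy data the first input variable separates the examples into two groups with identical target vectors, so the selected split is on feature 0 at threshold 5.0. Applying the same selection recursively to each side would produce the hierarchy of clusters described above.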


5     Experimental study
The experiments are carried out using real world data publicly provided by
the DOE/NREL/ALLIANCE3 (http://www.nrel.gov/). The data (see Figure
1) consist of wind speed measurements from 1326 different locations at 80 m of
height in the Eastern region of the US. The data were collected at 10-minute
intervals during the year 2004. This wind farm grid was able to produce 580
GW, and each farm produces between 100 MW and 600 MW. For the evaluation
of the results, we consider the root mean squared error (RMSE), computed over
the grid at each time point, as an indicator of the predictive performance. We derive
twelve (training and testing) datasets, which are constructed as follows: for
every month, days 1-11 define a training dataset that is processed to learn the
forecasting model (with m = 10), while days 15-25 define the testing set used
to evaluate the performance of the forecasting model learned on the corresponding
training dataset. The 24 h ahead forecasting errors, averaged on the twelve
datasets, are analyzed. For this empirical evaluation, the multi-resolution information
is modeled at the spatial scale with radius R = 10 km or R = 50 km, as
well as at the temporal scale with length L = 1 hour (6 consecutive time points)
or L = 3 hours (18 consecutive time points). The performance of the standard
deviation operator is compared to that of the sum and mean operators. Finally,
the forecasting performance of the multi-resolution structured output predictive
model (f +ST (·)) is compared to the performance of the baseline structured
output predictive model (f (·)).

    The traditional structured output predictive model f : X → Y can be simply learned
    in this scenario by neglecting the information on the data variability.

    The information on the wind speed variability is included in the output learning
    setting as a constraint to improve the predictive ability of the forecasting model
    learned.

                            Fig. 1. Wind speed data
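
The error measure used in the evaluation, RMSE computed over the grid at each time point, can be sketched in Python as follows. The function name `rmse_per_time_point` and the list-of-lists layout (one row per farm, one column per time point of the forecast horizon) are assumptions made for illustration.

```python
import math


def rmse_per_time_point(actual, predicted):
    """Return one RMSE value per time point, each computed over all farms.

    actual[i][t] and predicted[i][t] are the measured and forecast wind
    speed of farm i at time point t of the forecast horizon.
    """
    n_farms = len(actual)
    horizon = len(actual[0])
    return [
        math.sqrt(sum((actual[i][t] - predicted[i][t]) ** 2
                      for i in range(n_farms)) / n_farms)
        for t in range(horizon)
    ]


# Toy usage: two farms, two time points.
actual = [[1.0, 2.0], [3.0, 4.0]]
predicted = [[1.0, 4.0], [3.0, 2.0]]
errors = rmse_per_time_point(actual, predicted)
```

For a 24 h horizon at 10-minute resolution this yields 144 values per testing dataset; averaging them time point by time point across the twelve datasets gives one error curve per model.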

Evaluating scale size We start by analyzing the performance of the multi-resolution
structured output predictive models learned by MuReS with respect to the size
of the space (R) and time (L) scales. We run MuReS with the standard deviation
operator computed over neighborhoods with radius R = 10 km or R = 50 km and
windows with length L = 1 h or L = 3 h. Average RMSE results are plotted in Figure
2. They show that the accuracy of the forecasting model is more sensitive to the
time scale than to the spatial scale. In any case, the decrease in forecasting
accuracy due to a large scale in the temporal resolution is observed
starting from the forecasts produced 12 hours ahead of the current
time point. Therefore, selecting the appropriate scale size is a crucial issue to
yield accurate long-term forecasts. Based on these preliminary results, we select
R = 10 km and L = 1 h for the remainder of this study.

Evaluating multi-resolution operator We proceed by exploring the performance
of the multi-resolution structured output predictive models learned by MuReS
with respect to the selection of the multi-resolution operator used to model the wind speed
variability. Considering R = 10 km and L = 1 h, we compare the forecasting accuracy
achieved when the data variability model is computed through the standard
deviation operator to the accuracy achieved when the variability model is computed
through the sum or mean operators. Average RMSE results are plotted in
Figure 3. Results empirically support the effectiveness of our choice of resorting
to the standard deviation as the most appropriate second order statistic to
model the wind speed variability in both space and time. It actually contributes
to gain in forecasting accuracy in this peculiar application.

Fig. 2. Multi-resolution structured output predictive models (R = 10, 50km, L =
1, 3h): testing RMSE (averaged on twelve testing datasets - axis Y) plotted with respect
to 10 minutes-spaced time points (over 24 h ahead horizon - axis X).


Fig. 3. Multi-resolution structured output predictive models (multi-resolution operator
analysis - sum, mean and standard deviation): testing RMSE (averaged on twelve
testing datasets - axis Y) plotted with respect to 10 minutes-spaced time points (over
24 h ahead horizon - axis X).


Evaluating multi-resolution learning schema We complete this study by compar-
ing the performance of the multi-resolution structured output predictive models
learned by MuReS with R = 10km, L = 1h and the standard deviation as
multi-resolution operator to the performance of the baseline structured output
predictive models learned neglecting data variability at space and time scales.
Average RMSE results are plotted in Figure 4. These results show empirically the
viability of the main idea inspiring this study: the accuracy of the structured
output predictive learning in the wind speed forecasting can be greatly improved
by augmenting both the input and output spaces of the learning problem with
multi-resolution information modeling the data variability of the wind speed
observed at both space and time scales.


Fig. 4. Multi-resolution structured output predictive models (multi-resolution analysis
- multi-resolution structured output model (MuReS) vs structured output model
(Baseline)): testing RMSE (averaged on twelve testing datasets - axis Y) plotted with
respect to 10 minutes-spaced time points (over 24 h ahead horizon - axis X).


6    Conclusion
This paper studies the problem of the medium-term (24 h ahead) wind speed
forecasting by considering different dimensions of analysis: data variability, spatial
and temporal resolution, structured output learning, aiming to investigate
the relevant implications of dealing with the multi-resolution representation
of the data variability for the problem at hand. Results on a benchmark
dataset clearly show that accounting for the data variability at space and time
scales allows us to learn output prediction models, which are much more accurate
than traditional models that neglect the multi-resolution information.
Moreover, experimental results confirm that the standard deviation is an appropriate
multi-resolution operator of the data variability to be taken into account
in this specific application. Finally, defining the size of the scale (particularly in
the time scale) can be a crucial issue to guarantee accurate long-term forecasts. As
future work, we intend to explore the implications of these models of the data
variability in statistical time series models (e.g. ARIMA or VAR). We also plan to
investigate more sophisticated incremental learning methods that are able to
update the forecasting model as new historical data are collected, in order to fit
the learned models to drifting data.


7    Acknowledgments

Authors thank Enrico Laboragine for his support in developing the algorithm
presented and running the experiments. This work was partially supported by
the EU-funded project TOREADOR (ICT-16-2015, Grant Agreement A. no.
688797), as well as by the University of Bari Aldo Moro - funded ATENEO
Project 2014 on “Mining of network data” and ATENEO Project 2015 on “Mod-
els and Methods to Mine Complex and Large Data”.


References
 1. W. Albert, S. Chi, and C. J.H. An improved grey-based approach for electricity
    demand forecasting. Electric Power Systems Research, 67:217–224, 2003.
 2. V. Almeida and J. Gama. Collaborative wind power forecast. In 3rd Interna-
    tional Conference on Adaptive and Intelligent Systems ICAIS 2014, volume 8779
    of LNCS, pages 162–171. Springer, 2014.
 3. D. Ambach and P. Vetter. Wind speed and power forecasting - a review and incor-
    porating asymmetric loss. In 2016 Second International Symposium on Stochastic
    Models in Reliability Engineering, Life Science and Operations Management (SM-
    RLO), pages 115–123, 2016.
 4. A. Appice and S. Dzeroski. Stepwise induction of multi-target model trees. In J. N.
    Kok, J. Koronacki, R. L. de Mántaras, S. Matwin, D. Mladenic, and A. Skowron,
    editors, Machine Learning: ECML 2007, 18th European Conference on Machine
    Learning, Warsaw, Poland, September 17-21, 2007, Proceedings, volume 4701 of
    Lecture Notes in Computer Science, pages 502–509. Springer, 2007.
 5. A. Appice, S. Pravilovic, A. Lanza, and D. Malerba. Very short-term wind speed
    forecasting using spatio-temporal lazy learning. In N. Japkowicz and S. Matwin,
    editors, Discovery Science - 18th International Conference, DS 2015, Banff, AB,
    Canada, October 4-6, 2015, Proceedings, volume 9356 of Lecture Notes in Computer
    Science, pages 9–16. Springer, 2015.
 6. M. Bhaskar, A. Jain, and N. V. Srinath. Wind speed forecasting: Present status.
    In 2010 International Conference on Power System Technology, pages 1–6, 2010.
 7. M. Ceci, R. Corizzo, F. Fumarola, D. Malerba, and A. Rashkovska. Predictive
    modeling of PV energy production: How to set up the learning task for a better
    prediction? IEEE Trans. Industrial Informatics, 13(3):956–966, 2017.
 8. S. Fan, J. Liao, R. Yokoyama, L. Chen, and W.-J. Lee. Forecasting the wind
    generation using a two-stage network based on meteorological information. IEEE
    Transactions on Energy Conversion, 24(2):474–482, 2009.
 9. U. B. Filik and T. Filik. Wind speed prediction using artificial neural networks
    based on multiple local measurements in eskisehir. Energy Procedia, 107:264 – 269,
    2017. 3rd International Conference on Energy and Environment Research, ICEER
    2016, 7-11 September 2016, Barcelona, Spain.
10. G. Giebel, J. Badger, I. M. Perez, P. Louka, and G. Kallos. Short-term forecasting
    using advanced physical modelling-the results of the anemos project. results from
    mesoscale, microscale and cfd modelling. In Proc. of the European Wind Energy
    Conference 2006, 2006.
11. G. Giebel, R. Brownsword, G. Kariniotakis, M. Denhard, and C. Draxl. The State-
    Of-The-Art in Short-Term Prediction of Wind Power: A Literature Overview, 2nd
    edition. ANEMOS.plus, 2011. Project funded by the European Commission under
    the 6th Framework Program, Priority 6.1: Sustainable Energy Systems.
12. Z. Hui, L. Bin, and Z. Zhuo-qun. Short-term wind speed forecasting simulation
    research based on arima-lssvm combination method. In ICMREE) 2011, volume 1,
    pages 583–586, 2011.
13. R. G. Kavasseri and K. Seetharaman. Day-ahead wind speed forecasting using
    f-arima models. Renewable Energy, 34(5):1388 – 1393, 2009.
14. D. Kocev, C. Vens, J. Struyf, and S. Dzeroski. Tree ensembles for predicting
    structured outputs. Pattern Recognition, 46(3):817–833, 2013.
15. M. Lange and U. Focken. New developments in wind energy forecasting. In Power
    and Energy Society General Meeting - Conversion and Delivery of Electrical Energy
    in the 21st Century, 2008 IEEE, pages 1–8, 2008.
16. M. Negnevitsky, P. Johnson, and S. Santoso. Short term wind power forecasting
    using hybrid intelligent systems. In Power Engineering Society General Meeting,
    2007. IEEE, pages 1–4, 2007.
17. M. Negnevitsky, P. Mandal, and A. K. Srivastava. An overview of forecasting
    problems and techniques in power systems. In 2009 IEEE Power Energy Society
    General Meeting, pages 1–4, 2009.
18. O. Ohashi and L. Torgo. Wind speed forecasting using spatio-temporal indicators.
    In ECAI 2012 - PAIS-2012) System Demonstrations Track, volume 242 of Frontiers
    in Artificial Intelligence and Applications, pages 975–980. IOS Press, 2012.
19. J. Palomares-Salas, J. De la Rosa, J. Ramiro, J. Melgar, A. Aguera, and A. Moreno.
    Arima vs. neural networks for wind speed forecasting. In IEEE International Con-
    ference on Computational Intelligence for Measurement Systems and Applications,
    CIMSA 2009, pages 129–133, 2009.
20. C. Potter and M. Negnevitsky. Very short-term wind forecasting for tasmanian
    power generation. IEEE Transactions on Power Systems, 21(2):965–972, 2006.
21. S. Pravilovic, A. Appice, A. Lanza, and D. Malerba. Mining cluster-based models
    of time series for wind power prediction. In S. Greco and A. Picariello, editors,
    22nd Italian Symposium on Advanced Database Systems, SEBD 2014, Sorrento
    Coast, Italy, June 16-18, 2014., pages 9–20, 2014.
22. S. Pravilovic, A. Appice, A. Lanza, and D. Malerba. Wind power forecasting using
    time series cluster analysis. In 17th International Conference on Discovery Science,
    DS 2014, volume 8777, pages 276–287. Springer, 2014.
23. S. Pravilovic, A. Appice, and D. Malerba. An intelligent technique for forecasting
    spatially correlated time series. In 13th International Conference of the Italian
    Association for Artificial Intelligence, AI*IA 2013, volume 8249 of LNCS, pages
    457–468. Springer, 2013.
24. S. Pravilovic, A. Appice, and D. Malerba. Integrating cluster analysis to the
    ARIMA model for forecasting geosensor data. In 21st International Symposium
    on Foundations of Intelligent Systems, ISMIS 2014, volume 8502 of LNCS, pages
    234–243. Springer, 2014.
25. S. Pravilovic, M. Bilancia, A. Appice, and D. Malerba. Using multiple time series
    analysis for geosensor data forecasting. Inf. Sci., 380:31–52, 2017.
26. J. Shi, X. Qu, and S. Zeng. Short-term wind power generation forecasting: Direct
    versus indirect arima-based approaches. International Journal of Green Energy,
    8(1):100–112, 2011.
27. G. Sideratos and N. Hatziargyriou. An advanced statistical method for wind power
    forecasting. Power Systems, IEEE Transactions on, 22(1):258–265, 2007.
28. S. S. Soman, H. Zareipour, O. Malik, and P. Mandal. A review of wind power and
    wind speed forecasting methods with different time horizons. In North American
    Power Symposium 2010, pages 1–8, 2010.
29. D. Stojanova, M. Ceci, A. Appice, D. Malerba, and S. Dzeroski. Dealing with
    spatial autocorrelation when learning predictive clustering trees. Ecological Infor-
    matics, 13:22–39, 2013.
30. X. Wang, P. Guo, and X. Huang. A review of wind power forecasting models.
    Energy Procedia, 12(0):770 – 778, 2011.
31. J. Zhou, J. Shi, and G. Li. Fine tuning support vector machines for short-term
    wind speed forecasting. Energy Conversion and Management, 52(4):1990 – 1998,
    2011.