=Paper=
{{Paper
|id=None
|storemode=property
|title=Predicting Ramp Events with a Stream-based HMM framework
|pdfUrl=https://ceur-ws.org/Vol-960/paper6.pdf
|volume=Vol-960
}}
==Predicting Ramp Events with a Stream-based HMM framework==
Carlos A. Ferreira<sup>1</sup>, João Gama<sup>2</sup>, Vítor S. Costa<sup>3</sup>, Vladimiro Miranda<sup>4</sup> and Audun Botterud<sup>5</sup>

# LIAAD-INESC TEC and ISEP - Polytechnic Institute of Porto, Portugal
# LIAAD-INESC TEC and FEP - University of Porto, Portugal
# CRACS-INESC TEC and FC - University of Porto, Portugal
# INESC TEC and FE - University of Porto, Portugal
# Argonne National Laboratory, Argonne, IL, USA

===Abstract===
The motivation for this work is the study and prediction of wind ramp events occurring in a large-scale wind farm located in the US Midwest. In this paper we introduce the SHREA framework, a stream-based model that continuously learns a discrete HMM from wind power and wind speed measurements. We use a supervised learning algorithm to learn the HMM parameters from discretized data, where ramp events are the HMM states and discretized wind speed data are the HMM observations. The discretization of the historical data is obtained by running the SAX algorithm over the first-order variations of the original signal. SHREA updates the HMM using the most recent historical data and includes a forgetting mechanism to model the natural time dependence of wind patterns. To forecast ramp events we use recent wind speed forecasts and the Viterbi algorithm, which incrementally finds the most probable ramp event to occur. We compare the SHREA framework against a Persistence baseline in predicting ramp events occurring in very short time horizons.

===1 Introduction===
Ramping is a notable characteristic of a time series, associated with a drastic change in value over a set of consecutive time steps. Two properties of a ramping event, i.e., slope and phase error, are important from the point of view of the System Operator (SO), with important implications for the decisions associated with unit commitment or generation scheduling. Unit commitment decisions must prepare the generation schedule so that it smoothly accommodates forecasted drastic changes in wind power availability [2]. In this paper we present SHREA, a novel stream-based framework that predicts ramping events in short-term wind power forecasting.

The development of the SHREA framework is the answer to three main issues in ramp event forecasting. How can we describe and get insights on the time-dependent dynamics of wind power and wind speed, and use this description to predict short-time-ahead ramp events? How can we combine real-valued historical wind power and speed measurements with Numerical Weather Predictions (NWP), especially wind speed predictions, to output reliable real-time predictions? How can we continuously adapt SHREA to accommodate different natural weather regimes while still producing reliable predictions?

To answer these questions we designed a stream-based framework that continuously learns a discrete Hidden Markov Model (HMM) and uses it to generate predictions. To learn and update the HMM, the SHREA framework uses a supervised strategy in which the HMM parameters are estimated from historical data: the state transition probabilities are estimated from wind power measurements and the emission probabilities, at each state, are estimated from wind speed observations. To estimate the state transition probabilities, we first combine a ramp filter (a derivative-like filter) and a user-defined threshold to translate the real-valued wind power time series into a labeled time series that codes three different types of ramp events: ramp-up, no-ramp and ramp-down. Then, the transitions occurring in this labeled time series are used to estimate the transitions of the Markov process hidden in the HMM, i.e., to model the transitions between the three states associated with the three types of ramp events. To learn the HMM emission probabilities, we first combine a ramp filter and the SAX algorithm [9] to translate the wind speed measurement signal into a string. Next, we use both the labeled wind power time series and the wind speed string to estimate the emission probabilities at each state. The estimate is obtained by counting the string symbols, which code wind speed variations, associated with a given state/ramp event.

When we analyze historical wind power data we observe both seasonal weather regimes and a short-time-ahead dependence on the recent past wind power/speed measurements. To accommodate these issues, SHREA includes a strategy that forgets old weather regimes and continuously updates the HMM with the most recent wind power and wind speed measurements.

To generate ramp event predictions for a short-time-ahead window we use the wind speed forecast, obtained from a major NWP provider, and the current HMM. First, we run a filter over the wind speed forecast signal to obtain a signal of wind speed variations. Next, we run the SAX algorithm to translate the resulting real-valued time series into a string. Then, we run the Viterbi algorithm [12] to obtain the most likely sequence of ramp events. We could use the Forward-Backward algorithm [12], usually used to estimate the posterior probability, but we would then be using long-time-ahead, and thus unreliable, wind speed forecasts to predict current ramp events.

It is important to observe that wind speed measurements and forecasts, mainly short-time-horizon predictions, are approximately equally distributed over time. Moreover, the wind power output of each turbine is related to the wind speed measurements.

In this work we run the SHREA framework to describe and predict very short-time-ahead ramp events occurring in a large-scale wind farm located in the US Midwest. We present a comparison against the Persistence model, which is known to be hard to beat in short-time forecasts [10].

Despite the difficulty of the ramp forecasting problem, in this work we make the following contributions:
* Develop a stream-based framework that predicts ramp events and generates both descriptive and cost-effective models;
* Introduce a forgetting mechanism so that we can learn a HMM using only the most recent weather regimes;
* Use wind speed forecasts as observations of a discrete HMM to predict short-time-ahead ramp events.

In the next section we introduce the ramp event forecast problem. In Section 3 we present a detailed description of our framework. In Section 4 we present and discuss the obtained results. Last, we present some conclusions and future research directions.

===2 Ramp Event Definition and Related Work===
''Figure 1: Illustration of ramp events, defined as a change of at least 50% in wind power in an interval of 4 hours.''

One of the main problems in ramp forecasting is how to define a ramp. In fact, there is no standard definition [7, 3, 8] and almost all the existing literature reports different definitions, depending, for instance, on the location or on the farm's size.

The authors in [5] and [11] define several relevant characteristics for ramp definition, characterization and identification: to define a ramp event, we have to determine values for its three key characteristics: direction, duration and magnitude (see Figure 1). With respect to direction there are two basic types of ramps: the upward ones (or ramp-ups) and the downward ones (or ramp-downs). The former, characterized by an increase of wind power, result from a rapid rise of wind speeds, which might (not necessarily) be due to low-pressure systems, low-level jets, thunderstorms, wind gusts, or other similar weather phenomena. Downward ramps are due to a decrease in wind power, which may occur because of a sudden depletion of the pressure gradient, or due to very high wind speeds that lead wind turbines to reach cut-out limits (typically 22-25 m/s) and shut down in order to prevent damage to the wind turbine [4]. For a ramp event to be considered, the minimum duration is assumed to be 1 hour in [11], although in [7] these events lie in intervals of 5 to 60 minutes. The magnitude of a ramp is typically represented as a percentage of the wind farm's nominal power (nameplate).

In [7] the authors studied the sensitivity of two ramp definitions to each one of the two parameters introduced above: ramp amplitude, ranging from 150 to 600 MW, and ramp duration, varying between 5 and 60 minutes. The definition that we present and use in this work is similar to the one described in [7]. It is more appropriate for real operations since it does not consider a time-ahead point to identify a ramp event.

'''Definition 1''' A ramp event is considered to occur at time point <math>t</math>, the end of an interval, if the magnitude of the increase or decrease in the power signal is greater than the threshold value <math>P_{ref}</math>:

<math>|P(t) - P(t - \Delta t)| > P_{ref}</math>

The parameter <math>\Delta t</math> is related to the ramp duration and defines the size of the time interval considered to identify a ramp. In [11] some results are presented that relate this parameter to the type and magnitude of the identified ramps. The <math>P_{ref}</math> parameter is usually defined according to the specific features of the wind farm site, typically as a percentage of the nominal wind power capacity or as a specified amount of megawatts. A comprehensive analysis of ramp modeling and prediction may be found in [2].

===3 Methodology Developed to Forecast Ramps===
In this section we present the SHREA framework, a stream-based framework that uses a supervised learning strategy to obtain a HMM. SHREA continuously learns a discrete HMM on a fixed-size non-overlapping moving window and, at each time period, uses the updated HMM to predict ramp events. We introduce a forgetting mechanism to forget old wind regimes and to accommodate global weather changes. The SHREA architecture has three main steps (see the pseudo-code in Algorithm 1): a preprocessing phase, where a ramp filter and the SAX algorithm are used to translate real-valued signals into events/strings; a learning phase, where a supervised strategy is used to learn a HMM; and a prediction phase, where the Viterbi algorithm is used to forecast ramp events. In the following we describe each one of these phases.

'''Algorithm 1: SHREA: a stream-based ramp predictor'''
<pre>
input : Three time series: P_T, wind power measurements; O_T, wind speed
        measurements; and J_T, wind speed forecasts;
        a, the forecast horizon; P_ref, threshold to identify ramp events;
        Δt, the ramp definition parameter;
        W, the PAA parameter that specifies the amount of signal aggregation;
        σ, a forgetting factor
output: A sequence of predictions Q^d_r ... Q^d_{r+a} for each period/window d = 1, ...

countTimePeriods ← 0; flag ← 0; A_count ← 0; B_count ← 0;
for each period/window d do
    countTimePeriods++
    1 Preprocessing
      P^d_s ← fitSpline(P^d), O^d_s ← fitSpline(O^d), J^d_s ← fitSpline(J^d)
      P^d_f ← rampDef(P^d_s, Δt), O^d_f ← rampDef(O^d_s, Δt), J^d_f ← rampDef(J^d_s, Δt)
      L^d ← label(P^d_f, P_ref);   // Label Data
      O^d_n ← znorm(O^d_f), J^d_n ← znorm(J^d_f)
      O^d_str ← SAX(PAA(O^d_n)), OF^d_str ← SAX(PAA(J^d_n))
    2 Learn Supervised HMM
      π ← (δ(L^d(r) = rampDown), δ(L^d(r) = noramp), δ(L^d(r) = rampUp))
      λ^d(A, B, π) ← LearnHMM(O^d_str(1, ..., r), L^d(1, ..., r), A_count, B_count)
    3 Predict Ramp Events using the learned HMM
      Q^d_r ... Q^d_{r+a} ← Viterbi(λ, OF^d_str(r + 1, ..., r + a))
      λ^d(A, B, π) ← updateHMM(O^d_str(r + 1, ...), L^d(r + 1, ...))
    4 Forgetting mechanism
      if (countTimePeriods == σ) then
          A^aux_count ← A_count; B^aux_count ← B_count; flag ← 1
      if (countTimePeriods mod σ == 0 & flag == 1) then
          A_count ← A_count − A^aux_count; B_count ← B_count − B^aux_count
</pre>

====3.1 Preprocessing====
In the preprocessing phase we translate the real-valued points occurring in a given time period <math>d</math>, i.e., occurring inside a non-overlapping fixed-size window, into a discrete time series suitable for HMM learning and prediction. First, we fit a spline to both the wind power and the wind speed measurement time series, obtaining, respectively, two new signals, <math>P^d_s</math> and <math>O^d_s</math>. We run the same procedure over the <math>J</math> time series, a wind speed forecast, and obtain <math>J^d_s</math>. We fit splines to the original data to remove high frequencies that can be considered noise. Second, we run the ramp definition presented in Section 2 (Definition 1) to filter the three smoothed signals and obtain three new signals: <math>P^d_f</math>, <math>O^d_f</math> and <math>J^d_f</math>. These signals are wind power and wind speed variations, derivative-like signals suitable for identifying ramp events. Third, we use a user-defined power variation threshold, the input parameter <math>P_{ref}</math>, to translate the wind power signal <math>P^d_f</math> into a labeled time series <math>L^d(1, \ldots, r + a)</math>, where 1 is the first point of the time window, <math>r</math> is the forecast launch time and <math>a</math> is the time horizon. We map each wind power variation into one of three labels/ramp events: ramp-up, ramp-down and no-ramp. These three labels are the three states of our HMM, and the transitions are estimated using the points of the <math>L^d</math> time series.

At this point we already have the data needed to estimate the transitions of the Markov process hidden in the HMM. Now we need to transform the wind speed data into a format suitable for estimating the emission probabilities of the discrete HMM that we are learning. We combine the Piecewise Aggregate Approximation (PAA) and SAX algorithms [9] to translate the wind speed variations into symbolic time series. More precisely, we first normalize the two wind speed signals and obtain the <math>O^d_n</math> and <math>J^d_n</math> signals. <math>O^d_n</math> is used to estimate the HMM emission probabilities and <math>J^d_n</math> provides the ahead observations used to predict ramp events. Next, we run the PAA algorithm on each one of these signals to reduce complexity and, again, obtain smoothed signals. The degree of signal compression is the PAA parameter <math>W</math>, a user-defined parameter of SHREA that is related to time point aggregation. Next, we run the SAX algorithm to map each PAA signal into string symbols. This way we obtain two discrete signals, <math>O^d_{str}</math> and <math>J^d_{str}</math>. After the preprocessing phase we have two discrete time series, <math>L^d</math> and <math>O^d_{str}</math>, that are used to learn the HMM state transition and emission probabilities, respectively.

====3.2 Learn a Discrete HMM====
Here we explain how we learn the HMM in time period <math>d</math>, and then how we update it over time.

In the HMM that we learn, compactly written <math>\lambda(A, B, \pi)</math>, the state transitions, the <math>A</math> parameter, are associated with wind power measurements and the emission probabilities, the <math>B</math> parameter, are associated with wind speed measurements. In Figure 2 we show a HMM learned by SHREA at the end of the 2010 winter. To estimate these two parameters we use the ramp labels, <math>L^d(1, \ldots, r)</math>, and the wind speed measurement signal, <math>O^d_{str}(1, \ldots, r)</math>, and run the well-known and straightforward supervised learning algorithm described in [12]. To estimate the transition probabilities between states, the 3 × 3 matrix <math>A</math>, we count the transitions between the symbols observed in <math>L^d(1, \ldots, r)</math> and compute the marginals to estimate the probabilities. To estimate the emission probabilities for each state, the matrix <math>B</math>, we count, for each state, the observed frequency of each symbol and then use the state marginals to compute the probabilities. This way, we obtain the maximum likelihood estimate of both the transition and the emission probability matrices.

We now explain how the model is updated over time. We designed our framework to improve over time as new data arrive. At each time period <math>d</math> SHREA is fed with new data and the HMM parameters are updated to include the most recent historical data. At each time period <math>d</math> we update the HMM parameters by counting the state transitions and state emissions coded in the current vectors <math>O^d_{str}(1, \ldots, r)</math> and <math>L^d(1, \ldots, r)</math>, obtaining the number of state transitions and emissions at each HMM state, <math>A_{count}</math> and <math>B_{count}</math>. Then, we compute the marginal probabilities of each matrix and obtain the updated HMM, the model <math>\lambda^d(A^d, B^d, \pi^d)</math>, which is used to predict the ramp events occurring between <math>r</math> and <math>r + a</math>. In the next time period (i.e., the next fixed-size time window) we update the <math>\lambda^d</math> HMM using this same strategy, but including also the transitions and emissions of time period <math>d</math> that were not used to estimate <math>\lambda^d</math>, i.e., we update <math>A_{count}</math> and <math>B_{count}</math> with the wind measurements of time period <math>d</math> occurring after the launch time of <math>d</math> and before the launch time of period <math>d + 1</math>, the <math>r</math> point. By using this strategy we continuously update the HMM to include both the most recent data and all the old data.

With this strategy, and with the course of time, the HMM can become less sensitive to new weather regimes. Thus we introduce a forgetting strategy that updates the HMM using only the most recent measurements and forgets the old data. This strategy relies on a threshold that specifies the number of time periods to include in the HMM estimation. This forgetting parameter, <math>\sigma</math>, is a user-defined value that can be set by experienced wind power technicians. Suppose that at time period <math>d</math> we have read <math>\sigma</math> time periods; we then back up the current counts into the temporary matrices <math>A^{aux}_{count}</math> and <math>B^{aux}_{count}</math>. After reading <math>2\sigma</math> time periods we apply the following forgetting mechanism:

<math>A^{2\sigma}_{count} \leftarrow A^{2\sigma}_{count} - A^{aux}_{count}, \qquad B^{2\sigma}_{count} \leftarrow B^{2\sigma}_{count} - B^{aux}_{count}</math>

Then, we reset <math>A^{aux}_{count}</math> and <math>B^{aux}_{count}</math> to the updated <math>A^{2\sigma}_{count}</math> and <math>B^{2\sigma}_{count}</math> matrices, respectively. Next, to predict ramp events occurring in the time periods following <math>2\sigma</math>, we update and use the HMM parameters obtained from <math>A^{2\sigma}_{count}</math> and <math>B^{2\sigma}_{count}</math>. Every time the number of time periods read equals a multiple of <math>\sigma</math>, we apply this forgetting mechanism using the updated auxiliary matrices.

====3.3 Predict Ramp Events using the learned HMM====
In this step we use the HMM learned in time period <math>d</math>, <math>\lambda^d</math>, and the string <math>J^d_{str}</math>, obtained from wind speed forecasts, to predict ramp events for the time points ranging from <math>r</math> to <math>r + a</math>. Remember that <math>r</math> is the prediction launch time and <math>a</math> is the forecast horizon.

To obtain the ramp event predictions we run the Viterbi algorithm [12]. We feed this algorithm with <math>J^d_{str}</math> and <math>\lambda^d</math> and get the state predictions (the ramp events) <math>Q^d_{r+1}, \ldots, Q^d_{r+a}</math> for the time points <math>r + 1, \ldots, r + a</math> of time period <math>d</math>. In other words, we obtain predictions for the points occurring in a non-overlapping time window starting at <math>r</math> and with length equal to <math>a</math>. We obtain the most likely sequence of states that best explains the observations, i.e., a sequence of states <math>Q^d_{r+1}, \ldots, Q^d_{r+a}</math> that maximizes the probability <math>P(Q^d_{r+1}, \ldots, Q^d_{r+a} \mid J^d_{r+1}, \ldots, J^d_{r+a}, \lambda^d)</math>.

Regarding the <math>\pi</math> parameter, we introduce a non-classical approach to estimate it. We defined this strategy after observing that it is almost impossible to beat a ramp event forecaster that predicts the ramp event occurring one step ahead to be the currently observed ramp event. Thus, we set <math>\pi</math> to be a distribution having zero probability for all events except the event observed at launch time, the <math>r</math> time point. In the pseudo-code we write <math>\pi \leftarrow (\delta(L^d(r) = rampDown), \delta(L^d(r) = noramp), \delta(L^d(r) = rampUp))</math>, where <math>\delta</math> is an indicator (delta) function defined by <math>\delta(x) = 1</math> if <math>x</math> is TRUE and <math>\delta(x) = 0</math> if <math>x</math> is FALSE.

===4 Experimental Evaluation===
In this section we describe the configurations, the metrics and the results that we obtained in our experimental evaluation.

''Figure 2: Winter HMM (three states: ramp down, no ramp and ramp up; edges are labeled with the emission probabilities of symbols a to f and with the state transition probabilities).''

{| class="wikitable"
|+ Table 1: Misclassification costs
! Predicted \ Observed !! down !! no !! up
|-
! down
| 0 || 10 || 80
|-
! no
| 20 || 0 || 10
|-
! up
| 100 || 30 || 0
|}

{| class="wikitable"
|+ Table 2: KSS, SS and Expected Cost (ECost) mean and standard deviation (in parentheses) for the last 100 days of the evaluation period
! rowspan="2" | Time ahead
! rowspan="2" | Metric
! colspan="3" | SHREA, Δt=1
! colspan="3" | SHREA, Δt=2
! colspan="3" | SHREA, Δt=3
! colspan="3" | Persistence
|-
! phE=0 !! phE=1 !! phE=2 !! phE=0 !! phE=1 !! phE=2 !! phE=0 !! phE=1 !! phE=2 !! Δt=1 !! Δt=2 !! Δt=3
|-
| 30 min || KSS || 0.144(0.002) || – || – || 0.332(0.001) || – || – || 0.446(0.002) || – || – || 0.144(0.002) || 0.332(0.001) || 0.446(0.002)
|-
| 30 min || SS || 0(0) || – || – || 0(0) || – || – || 0(0) || – || – || – || – || –
|-
| 30 min || ECost || 3.129(0.016) || – || – || 4.176(0.027) || – || – || 4.04(0.019) || – || – || 3.129(0.02) || 4.176(0.03) || 4.041(0.02)
|-
| 60 min || KSS || 0.152(0.001) || 0.202(0.002) || – || 0.278(0.001) || 0.314(0.204) || – || 0.369(0.001) || 0.417(0.001) || – || 0.127(0.009) || 0.203(0.001) || 0.343(0.001)
|-
| 60 min || SS || 0.028(0.001) || 0.085(0.002) || – || 0.094(0.00) || 0.139(0.001) || – || 0.038(0.001) || 0.113(0.001) || – || || ||
|-
| 60 min || ECost || 2.312(0.18) || 2.107(0.014) || – || 3.860(0.39) || 3.719(0.39) || – || 4.374(0.61) || 4.108(0.61) || – || 8.731(0.99) || 14.687(1.50) || 16.104(1.63)
|-
| 90 min || KSS || 0.123(0.000) || 0.185(0.001) || 0.231(0.002) || 0.193(0.001) || 0.240(0.001) || 0.296(0.002) || 0.271(0.001) || 0.316(0.001) || 0.345(0.001) || 0.101(0.001) || 0.163(0.002) || 0.258(0.002)
|-
| 90 min || SS || 0.0244(0.002) || 0.093(0.001) || 0.145(0.001) || 0.035(0.001) || 0.091(0.002) || 0.159(0.001) || 0.018(0.001) || 0.079(0.001) || 0.118(0.001) || – || – || –
|-
| 90 min || ECost || 2.089(0.013) || 1.938(0.012) || 1.807(0.010) || 4.252(0.03) || 4.028(0.025) || 3.728(0.024) || 5.165(0.025) || 4.893(0.023) || 4.677(0.025) || 3.204(0.030) || 6.112(0.042) || 6.783(0.050)
|}

====4.1 Experimental Configuration====
Our goal is to predict ramp events in a large-scale wind farm located in the US Midwest. To evaluate our system we collected historical data and, to make predictions, used wind speed predictions (NWP) for the time period ranging between the 3rd of June 2009 and the 16th of February 2010. Each turbine in the wind farm has a Supervisory Control and Data Acquisition (SCADA) system that registers several parameters, including the wind power generated by each turbine and the wind speed measured at the turbine; the latter are point measurements spaced 10 minutes apart. In this work we consider a subset of turbines and compute, for each time point, the subset mean wind power output and the subset mean wind speed, obtaining two time series of measurements. The wind speed prediction for the wind farm location was obtained from a major provider. Every day we get a wind speed forecast with launch time at 6 am and a 24-hour horizon. The predictions are point forecasts spaced 10 minutes apart.

In this work we run SHREA to forecast ramp events occurring 30, 60 and 90 minutes ahead (the <math>a</math> parameter). We start by learning a HMM using five days of data and then use the learned, and updated, HMM to generate predictions for each fixed-size non-overlapping time window. Moreover, we split the day into four periods and run SHREA to learn four independent HMMs: dawn, from midnight to six hours; morning, from six to twelve hours; afternoon, from twelve to eighteen hours; and night, from eighteen hours to midnight. These four models were only used to give some insight into the ramp dynamics and were not used to make predictions.

We define a ramp event to be a change in wind power production higher than 20% of the nominal capacity, i.e., we set the <math>P_{ref}</math> threshold equal to 20% of the nominal capacity. Moreover, we run a set of experiments setting the <math>\Delta t</math> parameter equal to 1, 2 and 3 time points, i.e., equal to 30, 60 and 90 minutes. We run SHREA using thirty-minute signal aggregation, thus each time point represents thirty minutes of data. In these experiments we also consider phase error corrections. Phase errors are errors in forecasting ramp timing [5]. We identify events that occur at a timestamp <math>t</math> and were not predicted at that time, but were instead predicted to occur one or two time periods immediately before or after <math>t</math>.

Furthermore, as SHREA is continuously updating the HMM, we set the forgetting parameter <math>\sigma = 30</math>, i.e., each time the system reads a new period of 30 days of data, it forgets 30 days of old data. The amount of forgetting used in this work results from a careful study of the wind patterns.

For this configuration we compute and present the Hanssen & Kuipers Skill Score (KSS) and the Skill Score (SS) [1, 6]. Moreover, we compute the expected misclassification costs (EC) using the formula presented in [13]. The cost matrix presented in Table 1 defines the misclassification costs. We compare SHREA against a Persistence baseline algorithm. Despite its simplicity (its predictions are the same as the last observation), this model is known to be hard to beat in short-time-ahead predictions [10].

====4.2 Results====
This work is twofold, and here we present and analyze both the descriptive and the predictive performance of the SHREA framework.

In Figure 2 we present an example of a HMM generated by SHREA in February. This model was learned when running SHREA to predict events 90 minutes ahead with <math>\Delta t = 2</math>. This HMM has three states, each associated with one ramp type, and each state emits six symbols, each representing a discrete bin of the observed wind speed. The lowest level of wind speed is associated with the character a and the highest level with the character f. The labels on the edges show the state emission and state transition probabilities.

The HMMs that we obtained in our experiments uncover interesting ramp behaviors. If we consider all the data used in these experiments, with <math>\Delta t = 1</math> we detected 7% more ramp-up events than ramp-down events. With <math>\Delta t = 3</math> we get the inverse behavior: 4% more ramp-downs than ramp-ups. This behavior is easily explained by the natural wind dynamics, which cause steeper ramp-up events and smoother ramp-down events. If we analyze the four periods of the day independently, we find a small number of ramp events, both ramp-ups and ramp-downs, in the afternoon. If we compute the mean number of ramps over all <math>\Delta t</math> values, we get approximately 30% (15%) more ramp-up (ramp-down) events at night than in the afternoon. Overall, we get the most ramp events at night and, in second place, in the dawn period. Moreover, in the summer we get, both for ramp-up and ramp-down events, wind speed distributions with higher entropy: approximately 85% of the probability is concentrated in two observed symbols. In contrast, in the winter we have less entropy in the wind speed distribution associated with both types of ramp events: approximately 91% of the probability is concentrated in one symbol. The emission probability distribution of the ramp-down state is concentrated in symbol a and that of the ramp-up state is concentrated in symbol f. These two findings are consistent with our empirical visual analysis and with other findings [4]: large wind ramps tend to occur in the winter, and usually there is a rapid wind speed increase followed by a more gradual wind speed decrease. These findings are also related to the high average temperature in the summer and to the stable temperatures registered during the afternoons. Considering the <math>\Delta t</math> parameter, the number of ramps, both ramp-ups and ramp-downs, increases with <math>\Delta t</math>. In general, we observe large ramps only when we compare time points that are 20 to 30 minutes apart.

As illustrated in Figure 2, we identified a large portion of self-loops, especially ramp-up to ramp-up transitions on winter nights. The percentage of self-loops ranges between 12%, when we run SHREA with <math>\Delta t = 1</math>, and 55%, when we set <math>\Delta t = 3</math>. This self-loop transition shows that a high percentage of ramp events have a magnitude of at least 40% of the nameplate, two times the <math>P_{ref}</math> threshold. Furthermore, in the winter we get a higher proportion of ramp-up to ramp-down and ramp-down to ramp-up transitions than in the summer. This is especially clear in the dawn and night periods. This phenomenon can be related to the difference in the average temperatures registered in these time periods.

Before presenting the forecast performance, it must be said that the quality of ramp forecasting depends a great deal on the quality of the meteorological forecasts. Moreover, as HMMs represent probability distributions, it is expected that SHREA will be biased towards predicting no-ramp events. Typically SHREA over-predicts no-ramp events but makes less severe errors. This bias is an acceptable feature, since it is better to forecast a no-ramp event when we observe a ramp-down (ramp-up) event than to predict a ramp-up (ramp-down) event. In real wind power operations (see Table 1) the cost of the latter error is several times larger than that of the former.

In Table 2 we present the mean (with the associated standard deviation in parentheses) KSS, SS and Expected Cost metrics that we obtained when running SHREA, and the reference model, to predict ramp events occurring in the last hundred days of the evaluation period.

Before presenting a detailed discussion of the obtained results, we must say that, for the same <math>\Delta t</math> parameter, in all experiments we obtained results better than or equal to those of the baseline Persistence algorithm. Moreover, when we generate predictions for the 30-minute horizon (one time point ahead, since we use 30-minute aggregation) we get the same results as the Persistence model. This phenomenon is related to the strategy that we used to define the HMM initial state distribution: remember that we set the HMM <math>\pi</math> parameter equal to the last state observed.

As expected, the KSS results worsen as the time horizon increases. It is well known that the forecast reliability/fit worsens as the distance from the forecast launch time increases. Moreover, we obtained better KSS values for the morning period than for the other three periods of the day. For lack of space we do not present a detailed description of the results that we obtained when running SHREA to predict ramp events in each one of the four periods of the day. The better morning results can be related to the launch time of the wind speed forecasts: the wind speed forecast that we use in this work is updated every day at 6 am.

The analysis of the <math>\Delta t</math> parameter shows that the mean KSS values increase with <math>\Delta t</math>. Again, this can be explained by the wind patterns: typically the wind speed increases smoothly over more than 30 minutes. In Table 2 we can clearly see that SHREA's performance improves as <math>\Delta t</math> increases. We observe the same behavior when inspecting the results that we obtained by running the Persistence algorithm. Concerning the SS, we obtain improvements over the Persistence forecast that range between 0% and 16%.

Concerning the phase error technique, we get important improvements for the two phase error parameter values considered in this study. The amount of improvement obtained by considering the phase error can be valuable in real-time operations: the technicians can prepare the wind farm to deal with a nearby ramp event. In Table 2 we present the results without the phase error technique, phE = 0, and with phase error corrections of one time point (30 minutes), phE = 1, and two time points (60 minutes), phE = 2.

We also introduce a misclassification cost analysis framework that can be used to quantify the management decisions. We define a misclassification cost scenario (see Table 2) and show that SHREA produces valuable predictions. In this real scenario, SHREA generates significantly lower operational costs and better operational performance than the baseline model.

===5 Conclusions and Future Work===
In this work we obtained some insights into the intricate mechanisms hidden in the ramp event dynamics and obtained valuable forecasts for very short time horizons. For instance, we can now say that the steepest and largest wind ramps tend to occur more often in the winter. Moreover, typically there is a rapid wind speed increase followed by a more gradual wind speed decrease. Overall, with the obtained HMMs we both obtained insights into the wind ramp dynamics and generated accurate predictions that prove to be cost beneficial when compared against a Persistence forecast method.

The performance of SHREA is heavily dependent on the quality of the wind speed forecasts. Thus, in the near future we hope to get special-purpose NWP suitable for detecting ramp events and having more frequent daily updates. Moreover, we will study multivariate HMM emissions to include other NWP parameters such as wind direction and temperature.

===Acknowledgments===
This manuscript has been created by UChicago Argonne, LLC, Operator of Argonne National Laboratory ("Argonne"). Argonne, a U.S. Dep. of Energy Office of Science laboratory, is operated under Contract No. DE-AC02-06CH11357. The authors also acknowledge EDP Renewables, North America, LLC. This work is also funded by the ERDF through the COMPETE programme and by National Funds through the FCT Project KDUS.

===References===
* [1] K.T. Bradford, R.L. Carpenter, and B. Shaw, 'Forecasting southern plains wind ramp events using the WRF model at 3-km', in AMS Student Conference, (2010).
* [2] C. Ferreira, J. Gama, V. Miranda, and A. Botterud, 'A survey on wind power ramp forecasting', in Report ANL/DIS 10-13, Argonne National Laboratory, (2010).
* [3] U. Focken and M. Lange, 'Wind power forecasting pilot project in Alberta', Oldenburg, Germany: energy & meteo systems GmbH, (2008).
* [4] J. Freedman, M. Markus, and R. Penc, 'Analysis of West Texas wind plant ramp-up and ramp-down events', in AWS Truewind, LLC, Albany, NY, (2008).
* [5] B. Greaves, J. Collins, J. Parkes, and A. Tindal, 'Temporal forecast uncertainty for ramp events', Wind Engineering, 33(11), 309–319, (2009).
* [6] A.W. Hanssen and W.J.A. Kuipers, 'On the relationship between the frequency of rain and various meteorological parameters', Mededelingen van de Verhandlungen, 81, (1965).
* [7] C. Kamath, 'Understanding wind ramp events through analysis of historical data', in IEEE PES Transmission and Distribution Conference and Expo, New Orleans, LA, United States, (2010).
* [8] A. Kusiak and H. Zheng, 'Prediction of wind farm power ramp rates: A data-mining approach', J. of Solar Energy Engineering, 131, (2009).
* [9] J. Lin, E. Keogh, S. Lonardi, and B. Chiu, 'A symbolic representation of time series, with implications for streaming algorithms', in 8th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, San Diego, CA, (2003).
* [10] C. Monteiro, R. Bessa, V. Miranda, A. Botterud, J. Wang, and G. Conzelmann, 'Wind power forecasting: State-of-the-art 2009', in Report ANL/DIS 10-1, Argonne National Laboratory, (2009).
* [11] C. W. Potter, E. Grimit, and B. Nijssen, 'Potential benefits of a dedicated probabilistic rapid ramp event forecast tool', IEEE, (2009).
* [12] L.R. Rabiner, 'A tutorial on hidden Markov models and selected applications in speech recognition', Proceedings of the IEEE, 77(2), (1989).
* [13] A. Srinivasan, 'Note on the location of optimal classifiers in n-dimensional ROC space', in Oxford University Technical Report PRG-TR-2-99, Oxford, England, (1999).
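As an illustration of Definition 1 (Section 2) and of the labeling step of Section 3.1, the following minimal Python sketch maps a wind power series to the three ramp labels. The function name `label_ramps` and the toy power values are hypothetical and not part of the paper; the sketch also skips the spline smoothing that SHREA applies beforehand.

```python
def label_ramps(power, delta_t, p_ref):
    """Label each time point t >= delta_t by comparing P(t) with P(t - delta_t).

    A change larger than p_ref is a ramp-up, a change smaller than -p_ref
    is a ramp-down, and anything else is a no-ramp (Definition 1).
    """
    labels = []
    for t in range(delta_t, len(power)):
        change = power[t] - power[t - delta_t]
        if change > p_ref:
            labels.append("ramp-up")
        elif change < -p_ref:
            labels.append("ramp-down")
        else:
            labels.append("no-ramp")
    return labels


# Toy series in percent of nominal capacity; p_ref = 20 mirrors the
# 20% threshold used in Section 4.1 (the values themselves are made up).
power = [10.0, 12.0, 40.0, 41.0, 15.0, 14.0]
print(label_ramps(power, delta_t=1, p_ref=20.0))
```

With `delta_t=1` the jump from 12 to 40 is labeled a ramp-up and the drop from 41 to 15 a ramp-down; the remaining points are no-ramp.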
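The discretization chain of Section 3.1 (z-normalization, PAA, then SAX) can be sketched as follows. This is a simplified reading, not the authors' code: the helper names are hypothetical, `w` is assumed to be the number of consecutive points averaged per PAA segment, and the SAX breakpoints are taken as equiprobable quantiles of the standard normal distribution, as in the usual SAX formulation [9].

```python
from statistics import NormalDist, mean, pstdev


def znorm(x):
    """Rescale a signal to zero mean and unit (population) standard deviation."""
    mu, sd = mean(x), pstdev(x)
    return [(v - mu) / sd for v in x] if sd > 0 else [0.0 for _ in x]


def paa(x, w):
    """Average consecutive, non-overlapping groups of w points.

    Assumption: a trailing group shorter than w is dropped.
    """
    return [mean(x[i:i + w]) for i in range(0, len(x) - w + 1, w)]


def sax(x, alphabet="abcdef"):
    """Map each value to a symbol using equiprobable N(0, 1) breakpoints."""
    k = len(alphabet)
    cuts = [NormalDist().inv_cdf(i / k) for i in range(1, k)]
    # The symbol index is the number of breakpoints strictly below the value.
    return "".join(alphabet[sum(v > c for c in cuts)] for v in x)


# Hypothetical wind speed variations (already ramp-filtered).
variations = [-3.0, -3.0, -1.0, -1.0, 0.0, 0.0, 1.0, 1.0, 3.0, 3.0]
print(sax(paa(znorm(variations), w=2)))
```

The six-letter alphabet matches the six emission symbols a to f described for Figure 2; a different alphabet size only changes the `alphabet` argument.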
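The count-based learning and the forgetting mechanism of Section 3.2 can be sketched with plain dictionaries. The names below are hypothetical, transitions across window boundaries are ignored, and rows with no counts fall back to a uniform distribution, which is an assumption of this sketch rather than something stated in the paper.

```python
import copy

STATES = ["ramp-down", "no-ramp", "ramp-up"]
SYMBOLS = "abcdef"


def zero_counts():
    """Create empty transition (A) and emission (B) count tables."""
    a_count = {s: {t: 0 for t in STATES} for s in STATES}
    b_count = {s: {o: 0 for o in SYMBOLS} for s in STATES}
    return a_count, b_count


def add_counts(a_count, b_count, labels, symbols):
    """Accumulate emissions (label, symbol) and transitions (label, next label)."""
    for state, obs in zip(labels, symbols):
        b_count[state][obs] += 1
    for prev, nxt in zip(labels, labels[1:]):
        a_count[prev][nxt] += 1


def subtract_counts(counts, old):
    """Forgetting step: remove the counts saved in the auxiliary table."""
    for s in counts:
        for k in counts[s]:
            counts[s][k] -= old[s][k]


def normalize(counts):
    """Turn each row of counts into probabilities (uniform if the row is empty)."""
    probs = {}
    for s, row in counts.items():
        total = sum(row.values())
        probs[s] = {k: (v / total if total else 1.0 / len(row)) for k, v in row.items()}
    return probs


a_count, b_count = zero_counts()
# First block of sigma periods (toy labels and SAX symbols).
add_counts(a_count, b_count, ["no-ramp", "ramp-up", "ramp-up", "no-ramp"], "cffd")
a_aux, b_aux = copy.deepcopy(a_count), copy.deepcopy(b_count)  # backup at sigma
# Second block of sigma periods, then forget the first block.
add_counts(a_count, b_count, ["no-ramp", "ramp-down", "no-ramp"], "cac")
subtract_counts(a_count, a_aux)
subtract_counts(b_count, b_aux)
A, B = normalize(a_count), normalize(b_count)
print(A["no-ramp"])
```

After the subtraction only the second block contributes to `A` and `B`, which is the intended effect of forgetting the oldest σ periods.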
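The prediction step of Section 3.3 can be illustrated with a textbook log-space Viterbi decoder in which π is the indicator of the state observed at launch time. The transition and emission values below are invented for the example, and a three-symbol alphabet is used only to keep it short; none of these numbers come from the paper.

```python
import math

STATES = ["ramp-down", "no-ramp", "ramp-up"]


def viterbi(obs, states, pi, A, B):
    """Return the most likely state sequence for the observation string obs."""
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # Initialization with the (indicator) initial distribution pi.
    score = {s: log(pi[s]) + log(B[s][obs[0]]) for s in states}
    back = []
    for o in obs[1:]:
        prev, score, ptr = score, {}, {}
        for s in states:
            best = max(states, key=lambda q: prev[q] + log(A[q][s]))
            score[s] = prev[best] + log(A[best][s]) + log(B[s][o])
            ptr[s] = best
        back.append(ptr)
    # Backtrack from the best final state.
    path = [max(states, key=lambda s: score[s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]


# Toy model (hypothetical probabilities).
A = {
    "ramp-down": {"ramp-down": 0.6, "no-ramp": 0.3, "ramp-up": 0.1},
    "no-ramp": {"ramp-down": 0.1, "no-ramp": 0.8, "ramp-up": 0.1},
    "ramp-up": {"ramp-down": 0.1, "no-ramp": 0.3, "ramp-up": 0.6},
}
B = {
    "ramp-down": {"a": 0.8, "b": 0.15, "c": 0.05},
    "no-ramp": {"a": 0.1, "b": 0.8, "c": 0.1},
    "ramp-up": {"a": 0.05, "b": 0.05, "c": 0.9},
}
last_observed = "no-ramp"  # label at launch time r
pi = {s: 1.0 if s == last_observed else 0.0 for s in STATES}

print(viterbi("bcc", STATES, pi, A, B))
```

Because π puts all its mass on the last observed state, the first decoded state always equals that state, which is consistent with the remark in Section 4.2 that the one-step-ahead results coincide with Persistence.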
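To make the role of the cost matrix in Table 1 concrete, the sketch below computes an average misclassification cost from a confusion matrix. It assumes the expected cost is the cost-weighted mean over all predictions, which may differ in detail from the formula in [13]; the confusion counts are hypothetical.

```python
# Misclassification costs from Table 1: COST[predicted][observed].
COST = {
    "down": {"down": 0, "no": 10, "up": 80},
    "no": {"down": 20, "no": 0, "up": 10},
    "up": {"down": 100, "no": 30, "up": 0},
}


def expected_cost(confusion, cost=COST):
    """Average cost per prediction, with confusion[predicted][observed] as counts."""
    total = sum(sum(row.values()) for row in confusion.values())
    weighted = sum(
        confusion[p][o] * cost[p][o] for p in confusion for o in confusion[p]
    )
    return weighted / total


# Hypothetical confusion counts for 100 predictions.
confusion = {
    "down": {"down": 3, "no": 0, "up": 0},
    "no": {"down": 5, "no": 80, "up": 5},
    "up": {"down": 0, "no": 3, "up": 4},
}
print(expected_cost(confusion))
```

In this toy case the forecaster errs mostly by predicting no-ramp, the cheapest kind of error in Table 1, which is the behavior the paper describes as acceptable.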