
Explainable Anomaly Detection in Renewable Energy Power Plants by Learning Multidimensional Normality Models

Carsten Kleiner

University of Applied Sciences & Arts Hannover, Faculty IV, Ricklinger Stadtweg 120, 30459 Hannover, Germany

Renewable energy production is one of the fastest growing markets, and further strong growth can be anticipated due to the desire for increased sustainability in many parts of the world. With the rising adoption of renewable power production, such facilities become increasingly attractive targets for cyber attacks. At the same time, the requirements on reliable production are rising. In this paper we propose a concept that improves monitoring of renewable power plants by detecting anomalous behavior. The system not only detects an anomaly, it also provides reasoning for the anomaly based on a specific mathematical model of the expected behavior, giving detailed information about the various influential factors causing the alert. The set of influential factors can be configured into the system before learning normal behaviour. The concept is based on multidimensional analysis and has been implemented and successfully evaluated on actual data from different providers of wind power plants.

Keywords: Anomaly detection, Attack detection, Resiliency, Multidimensional analysis, Wind power plant, Normality model, Explainable anomaly detection

1. Introduction and Motivation

Published in the Proceedings of the Workshops of the EDBT/ICDT 2024 Joint Conference (March 25-28, 2024), Paestum, Italy
Email: ckleiner@acm.org (C. Kleiner)
ORCID: 0000-0001-9497-0312 (C. Kleiner)

© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073, http://ceur-ws.org

[...]ing the system itself as well as its application scope will be presented in section 5.

2. Related Work

Several papers in the context of anomaly detection for renewable energy systems can be found in the literature.

In a more generalized context, [2] describes a learning approach similar to the one in this paper, even for any type of IoT system. Whereas this approach could also be applied to renewable power plants, it is not clear which part of the learning can be carried out in an automated fashion. Similarly, results do not provide explanations for anomalies. A focus on attacks, more specifically intrusion detection, is described in [3]. However, the approach is not extensible to outage detection and only provides non-explainable alert messages. More specifically for power plants, [4] uses many very general input parameters. However, this approach also does not provide explainable anomalies as results.

Other interesting wind power specific concepts include [5, 6]. However, these approaches also do not provide explainable results. The first, in addition, requires a semi-supervised learning approach which is not feasible for previously unknown attack types. Also, annotated training data is often not available. The second approach focuses on system failure detection rather than attacks.

On the other hand, [7] focuses on attacks and is specific to wind power plants. It is not extensible to other types of energy sources, and the degree of explainability of the results is not obvious. Papers [8, 9] also only focus on specific attacks for wind power plants and thus do not achieve the general detection capabilities of our concept. The latter is concerned with false data injection attacks, which are also the focus of several other publications. Moreover, [10] provides a good overview of the security challenges from attacks that have to be considered, but it does not present a comprehensive solution.

Finally, there are also papers with a concept quite similar to ours, but with different detection approaches, such as Markov chains in [11] and a more complex detection model in [12]. However, in both cases the approach is specific to wind power plants, an extensibility is not documented, and the explainability of the generated alerts is uncertain. This is also true for [13], which also uses a correlation based approach, yet it is only one-dimensional and requires many specific sensors, so that it is also tied to the domain of wind turbines only. Even more specific to wind turbine gearboxes is [14]. The authors do not limit their approach to attacks, also use a multidimensional analysis and generate at least partially explainable alerts. However, it is not obvious whether and how this can be extended beyond gearboxes.

In summary, none of the discussed references is able to provide the comprehensive features of our approach (cover attacks and outages, generate explainable alerts, capable of detecting unknown attacks and useable for different types of power generation).

3. Concept

3.1. Requirements and Context

Based on the research project SecDER, which aims to increase the resilience of renewable virtual and physical power plants, the requirements for an anomaly detection system have been identified as follows:

Reason agnostic: Both anomalies originating from known and unknown attacks as well as non-attack based anomalies shall be detected, ideally based on a single detection system.

Explainable alerts: The identified anomalies should be used to raise alerts that can be handled by human domain experts. In order to simplify and substantiate the decisions by the experts, explainable alerts should be provided, detailing the reason and context why the alert has been issued.

Adaptability: The concept shall be usable for different types of wind power plants as well as different types of renewable power plants in general. The learned normality models can be specific for each plant; however, the concept to learn the model should be generic.

General normality model: While a single set of normality models for all plants is not a goal, it is preferable if normality models can be learned for groups of similar plants. This way the model becomes more stable, and the number of extensive learning processes can be reduced.

Continuous learning and adjustment: The system should be capable of adjusting the learned system behaviour continuously, thus improving the quality of the normality models over time. It can thus also update the models in cases of concept drift over time.

The system described in the following part of the paper will satisfy all of these requirements. On the other hand, there are also limitations of the approach that have been accepted in order to keep the complexity manageable. In particular, detection is only considered up to explainable alert generation; alert handling itself is not in scope. Handling can be considered orthogonal as long as explainability of the generated alerts is secured. For alert handling, generic procedures and manual update concepts can be considered as an extension, see e. g. [15] for an approach based on rule-based anomaly detection. Similarly, we only consider anomaly-based detection concepts, since most attack patterns (and even some of the non-attack-based outage patterns) are previously unknown, so rule- or pattern-based detection will not be powerful enough to detect these. As attacks on virtual power plants are executed by designated experts, advanced attacks will be used which are unique to the specific target and thus typically not previously known.

3.2. Multidimensional Normality Models (MNM)

The basic concept for anomaly detection is learning multidimensional normality models (MNM) based on historic data of the power plant (or a set of similar power plants) and then assessing the deviation from this MNM for current readings of a logical record of the plant. The concept called cellwise estimator (CE) of the MNM has already been described in [16] in detail; thus, we will only present a high level description here. Originating from online analytical processing (OLAP) cubes, the idea is to describe normal behaviour of certain metrics (such as power production in a windmill) based on several orthogonal dimensions (such as weather conditions, plant sensor readings and others). The reason for this multidimensional treatment is that measurements of the metrics may be within a permissible range when looking at them globally, whereas they may be an anomaly when considering the specific context in more detail. The context is described by the dimensions which are used in learning the MNMs. Conversely, potentially abnormal measurements on the global level may actually be normal when looking at their specific context. Thus, it is important to be able to base a decision whether a logical record constitutes an anomaly on both global as well as contextual, i. e. dimensional, information. To account for these challenges a specific normality model is learned for each of the cube cells, i. e. every contextual situation.

Unfortunately, the higher the number of dimensions and the number of values within a dimension, the larger the number of combinations to consider becomes. Since the growth is exponential, these numbers have to be limited. In addition, the concept of iceberg cubes ([17]) known from the OLAP domain can also be used to restrict the number of cubes to consider to relevant ones.

In order to deal with continuous data streams as needed for monitoring a power plant, the cubes are computed per timeslice with a configurable timeslice length. The metric attribute whose normal behavior is to be learned is aggregated by some configurable aggregation function over all readings within a timeslice. For the domain of wind power plants for instance, the power production output of a mill is a logical choice as a metric, with multiple readings being aggregated by using the average over a timeslice. Typical dimensions for this metric can be wind speed, wind direction, rotor position and outside temperature. Since the dimensions are used to form an OLAP-like cube, all dimensions must be of discrete types.

Thus, continuous readings such as wind speed and temperature need to be assigned to a set of classes in order to be used as dimensions. As known from OLAP rollups, there is also a symbolic value of * in each dimension that aggregates all classes in that dimension and thus provides a cube cell where the class is irrelevant.

The goal of the learning process by looking at historical data is to compute a statistical description of the metric attribute for each cell of the cube. This is done by assuming a normal distribution for the metric readings in each cell and approximating that normal distribution by estimating mean and standard deviation for the metric attribute based on learning from historical data. For current readings the anomaly score is computed as the difference to the mean of each relevant cell, expressed as a number of standard deviations. The higher this factor, the more likely the current reading is an outlier. As known from statistics, a factor of 3 is a natural choice as a threshold to generate an alert. As will be seen in section 4, solely looking at this factor as an anomaly measure is not sufficient, though, to properly assess the importance of an alert.

In summary, each cell's normality model in our concept consists of an estimation of normal distributions (with mean and standard deviation each) of one or more measurements per cube cell over a timeslice. Cube cells are defined by combinations of discrete values of relevant dimensions, with wildcards allowed for cells with irrelevant values in a dimension. The anomaly score is then computed based on the number of standard deviations that any current reading of a measure deviates from the expected mean. Alerts are typically only raised for cube cells with anomaly scores higher than a threshold of 3. In addition to the anomaly score, the computed normality model as distribution estimation is also provided with the alert, along with information about the cell's dimensional values that caused the alert. This combination of information (metric measurement, anomaly score, contextual values, normality model) comprises the explanation for the human expert. Thus, an informed decision about proper reaction to the alert is facilitated.

3.3. Application of MNM to Wind Power Plants

In order to apply our concept as explained in section 3.2 to renewable energy plants in general and wind power plants in particular, we have to define the metrics with aggregation functions for which normality models shall be learned as well as the discrete dimensions that might influence the metrics and be important for assessing an alert. Candidates for choosing the metrics are any elements of a monitoring reading that can be used to describe the operational behaviour of a windmill. The assumption is that attacks or outages will lead to unexpected behavior in this metric. Primarily, this is the effective electrical power production of the mill computed as an average over a timeslice. For consistency checks the number of measurement readings per timeslice can also be used as a metric. Alternative options that have not been evaluated in the experiments described in section 4 could be the positions of the pod or the blades of the windmill or other operational features.
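The two metrics named above (average effective power and number of readings per timeslice) can be illustrated with a minimal sketch. The function name `aggregate_timeslices`, the input layout of `(timestamp, power_kw)` pairs and the alignment of timeslices to a fixed epoch are assumptions made for illustration only; they are not taken from the implementation evaluated in this paper.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean

# Hypothetical reference point used to align timeslice boundaries.
EPOCH = datetime(1970, 1, 1)


def aggregate_timeslices(readings, slice_length=timedelta(hours=4)):
    """Aggregate raw readings into per-timeslice metrics.

    readings: iterable of (timestamp, power_kw) pairs (assumed layout).
    Returns {slice_start: (average_power_kw, number_of_readings)}.
    """
    buckets = defaultdict(list)
    for timestamp, power_kw in readings:
        # Integer index of the timeslice that contains this reading.
        index = (timestamp - EPOCH) // slice_length
        buckets[EPOCH + index * slice_length].append(power_kw)
    return {
        start: (mean(values), len(values))
        for start, values in sorted(buckets.items())
    }
```

A timeslice without any readings simply does not appear in the result, so a caller that iterates over all expected timeslices can treat a missing key as a reading count of zero.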

There are many more options for choosing the dimensions than the metrics. In the evaluation in section 4 we have experimented with different choices, but there are actually many more. Obvious dimensions include wind speed, wind direction, pod position, air temperature and air pressure. Further options include power factor, pitch angles of each blade, angle between pod and wind direction and anemometer readings. The choice of discretization of each of these factors (cf. section 3.2) can be considered another hyperparameter of the application.
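As an illustration of such a discretization, the following sketch assigns a continuous reading to one of a fixed number of equally sized intervals. The function name `equal_width_class`, the zero-based class numbering and the clamping of out-of-range values to the outermost classes are assumptions for this sketch; the value ranges used in the test (for example 0 to 27 m/s for wind speed) are hypothetical as well.

```python
def equal_width_class(value, lower, upper, n_classes):
    """Map a continuous value to a zero-based class index.

    The range [lower, upper) is split into n_classes intervals of equal
    width; values outside the range are clamped to the first or last class.
    """
    if n_classes < 1 or upper <= lower:
        raise ValueError("need n_classes >= 1 and upper > lower")
    width = (upper - lower) / n_classes
    index = int((value - lower) // width)
    return min(max(index, 0), n_classes - 1)
```

Dimensions with a strongly non-linear distribution would need class boundaries chosen by hand rather than equal widths; in that case a sorted list of boundaries together with `bisect.bisect_right` from the standard library is one possible alternative.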

Specific choices for the dimensions and discretizations for the experiments will be explained in section 4, but it has to be pointed out that those are only initial selections and many more experiments will have to be carried out in the future to optimize the approach, cf. section 5.2.
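The cellwise estimation and scoring described in section 3.2 can be sketched as follows. This is a simplified illustration, not the CE implementation from [16]: the names `cells_for`, `learn_models`, `anomaly_scores` and `explain_alerts` are hypothetical, the sample standard deviation is an assumed choice of estimator, and the `min_count` filter is only a crude stand-in for the iceberg cube idea.

```python
from collections import defaultdict
from itertools import product
from statistics import mean, stdev

WILDCARD = "*"


def cells_for(dims):
    """All cube cells a record belongs to: per dimension its class or '*'."""
    return list(product(*[(d, WILDCARD) for d in dims]))


def learn_models(records, min_count=2):
    """Estimate (mean, std) of the metric for every sufficiently filled cell.

    records: iterable of (dims, metric), dims being a tuple of class labels.
    """
    samples = defaultdict(list)
    for dims, metric in records:
        for cell in cells_for(dims):
            samples[cell].append(metric)
    return {
        cell: (mean(values), stdev(values))
        for cell, values in samples.items()
        if len(values) >= min_count
    }


def anomaly_scores(models, dims, metric):
    """Distance of the metric from each cell mean in standard deviations."""
    scores = {}
    for cell in cells_for(dims):
        if cell in models:
            mu, sigma = models[cell]
            if sigma > 0:
                scores[cell] = abs(metric - mu) / sigma
    return scores


def explain_alerts(models, dims, metric, threshold=3.0):
    """Cells whose score exceeds the threshold, with their learned model."""
    return [
        {"cell": cell, "metric": metric, "score": score,
         "mean": models[cell][0], "std": models[cell][1]}
        for cell, score in anomaly_scores(models, dims, metric).items()
        if score > threshold
    ]
```

Each returned entry combines the metric measurement, the anomaly score, the contextual values of the cell and the learned model, which is the combination of information that forms the explanation for the human expert.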

4. Evaluation

In order to evaluate the capabilities of the concept in detail, we used historical data from actual wind power plants that are operated by project partners in the SecDER project. We had two different datasets, one from each operator. The data did not contain any known attacks, yet some anomalies due to maintenance or unusual weather conditions.

The first dataset consists of operational log data from a single windmill over the time range from January 2020 to August 2021 at a sampling rate of 15 minutes. Each log reading consists of 22 attributes in total, one of which is the timestamp; the others can be used as metrics or dimensions as will be explained in section 4.1.

The second dataset provides operational log data from 9 different wind parks, comprising 42 windmills in total, at a sampling rate of 5 minutes. The data provides 30 attributes per reading, and readings were available for the year 2020.

In both cases, a first part of the data has been used for training and the remainder for testing. In the sequel, results will be presented based on output from a specifically developed GUI tool. In the figures the testing period is shown on the horizontal axis to display the results for individual test instances. Each timeslice's reading can be considered a test case. The graph shows the results for a specific cell of our cube, as selected from different dimensions, values and combinations at the top. Within a figure the red curve shows the computed metric value (scale on the left) whereas the blue curve shows the anomaly score (i. e. the number of standard deviations that the value is from the mean in this particular cell; scale on the right). Typically, scores above 3 can be considered anomalous. In addition, a yellow line displays the learned mean value for the metric for this cell, and green and light blue lines show mean +/- 3 standard deviations.

4.1. Validation of Concept

As an initial validation we used the data from 2020 of the first dataset as training set and the readings from 2021 for testing. We chose the average electrical power production over timeslices of 4 hours as primary metric. We experimented with some attributes as dimensions; the results in this subsection have been achieved with wind speed, wind direction and difference between gondola angle and wind direction. The continuous values in these dimensions have been linearly assigned to 9, 12 and 5 classes, respectively. The number of classes of the first two features has been determined heuristically by assigning equally sized intervals of the total range of values to classes. For the third feature, where the original data had a strongly non-linear distribution, we decided to use fewer classes to primarily account for major and medium outliers in each of the two directions and have most data in the no difference class.

Figure 1 shows the test results for the global cell, i. e. no fixed value in any of the dimensions. As we can see, there are only few significant anomaly scores, primarily those on January 20th, March 11th and March 29th. At this general level (no fixed dimensional values), this behavior can be expected as the threshold for raising an alert is around 1900 kW, which is already pretty close to the 2400 kW nominal power of the mill. However, the first two of those scores will not be reported by an alert, as none of the subcells along the wind speed dimension has an anomalous score. This means that the power production seemed unusually high from a global point of view (which is information that could have been observed without our approach but would have raised a false positive), yet in reality it is simply explainable by the rather high wind speed on those days. For the remaining high anomaly score the dimensional analysis shows reduced anomaly scores the more detailed the cells become, yet it remains above 3, thus raising an alert. Looking at the data in detail in the evaluation, this score can be considered a false positive. The reason is that this specific context situation had not been observed in the whole training period. Such errors can be remedied by increasing the training data set.

Even more interesting is the analysis looking into some of the dimensions, as the learned normality behavior is much more specific in those cases, as seen in figure 2. In that figure we have focused the display on the wind speed class 2 (pretty low speed) and the wind direction class 2. The figure shows that the learned model with a mean around 140 kW and 80 kW standard deviation is very specific. Still, the only remaining alert with an anomaly score of 3.1 shows up on April 11th. This could be a false positive due to a too specific cell model or a true alert due to a malfunction with too high generated power. A human operator seeing the alert would be able to classify this alert based on their domain knowledge. Due to space constraints we only present these exemplary results here.

4.2. Common Model for Plant Groups

For the second validation, data from the set of wind parks has been used. Here, January to August 2020 has been used as training data and September to December 2020 for testing. Metrics and dimensions shown are identical to the ones in the previous subsection for comparability purposes. In addition, the specific windmill has also been used as another dimension in order to be able to analyze the outcome per mill and over all mills together.

Data from 17 of the mills with identical nominal power production of 2300 kW have been used.

Figure 3 again shows the overall view of the scores with no fixed dimensional values. We can see that the learned normality model is much more specific than the one in figure 1 due to the extended training set (standard deviation around 200 kW as opposed to 500 kW).

Two cases with higher anomaly scores can be identified, namely Nov 2nd and Nov 19th/20th. The first of those shows a similar behavior as already noted in the previous subsection, i. e. an anomaly score that does not show up in any of the dimensionally restricted models and thus would not be reported as an alert. The latter anomaly score would be tied to two of the four wind parks as well as a specific wind speed and wind direction, all showing anomalous scores in one alert as those are all dependent cells in the cube. This shows that the score is indeed an anomaly for these mills (cf. figure 4) and should thus be reported as an anomaly alert. This can be considered a true positive that is recognized by the system. It can be further explained to the human expert by providing the specific wind park, speed and direction that cause the alert to be raised.

Figure 4: Effective Power and Anomaly Scores (Plant group, dimensionally restricted view, speed class 6, direction class 8)

In general, the increased size of the training data leads to more precisely learned models in the cells. This potentially increases the number of false positives, since high anomaly scores are more likely with a smaller standard deviation. However, by judging an anomaly score in combination with the standard deviation of its cell, most of the false positives can be identified easily and thus do not lead to raising alerts. On the other hand, the benefit of the more precise models is that false negatives are much less likely in that case.

Also, only precise cell models facilitate discovery of anomalies in cases with unusually low power production, which is particularly relevant in case of attacks. This is due to the fact that low production is only observed as an anomaly if the learned mean - 3 standard deviations is above 0 kW.

This can only be achieved with rather precise cell models which need large training datasets.

4.3. Evaluation against Known Outages

The evaluations in the previous subsections were only able to show that anomalous behavior can be detected in principle, since the data did not contain any known attacks or outages of the power plants. In order to get a qualitative impression of how well the detected anomalies correspond with actual unusual behavior, we evaluated the concept against data from a single windmill that was available over a time frame of 2.5 years. In addition, for this plant information from the plant management system (PMS) was available that listed all known and recorded system problems during that time.

It should be noted that this evaluation is not well suited for a thorough quantitative analysis of the algorithm, since the dataset only provides information about events affecting the operation of the mill that were known to the PMS. Thus, since no attacks are known, there are no attack labels and no evaluation against attack detection is possible. Similarly, anomalous situations due to an unusual behavior of the mill unknown to the PMS are not labeled as anomalous in the ground truth. Thus we can expect some (seemingly) false positives for the anomalous situations not recorded in the PMS and thus labeled as normal. This will lead to a rather low precision when comparing our anomaly messages with the events recorded in the plant management system as ground truth.

In addition, the events in the PMS record any unusual situation in the windmill regardless of their impact on the actual power production. Since we consider output power production as our analysis target, it is obvious that we will not be able to detect events that have no or minimal influence on the power production (see footnote 2). Such situations will be recorded as seemingly false negatives in the comparison, impacting the recall negatively. However, we do not anticipate too many of such messages, so that aiming for a high recall is still a desirable target.

Footnote 2: From a practical point of view detecting such events with our algorithm is not necessary, as these have only minimal impact on the power production and are already known from the PMS and thus do not require advanced detection.

Both effects mentioned previously will also impact other measures such as accuracy (to some degree) and F1 score (to a large degree). Still, a good, albeit not perfect, accuracy score is also a valid goal to target.

4.3.1. Evaluation Setup

For this evaluation we used windmill data from 2 years as training set for our algorithm and data from the remaining 0.5 years as a test set. We used an algorithm configuration similar to the one in section 4.1. We had to clean the training data by removing the readings for times which had been recorded in the PMS as anomalous in order to only learn normal behavior of the system.

Since the events recorded in the PMS used timestamps with 5 minute difference, we first need to align the time resolution, i. e. define how many anomalous events within a 4 hour timeslice make such a timeslice anomalous in total. While it is desirable on one hand to even detect anomalies that only occur at a single instance in time, on the other hand it is questionable whether a full timeslice shall be considered anomalous just based on a single event. For the following evaluation we used thresholds of 40 and 5 minutes within a 4 hour timeslice as a condition for an anomalous timeslice. Note that an anomaly due to an outage is rarely a very short incident.

Another aspect is the management of missing readings from the windmill, which are often caused by anomalous operation. If no data readings are present for a whole timeslice, the CE algorithm will not detect an anomaly for the power production, since missing data does not get any anomaly score. However, with the second metric (number of readings per timeslice) we can easily detect timeslices where no power readings are present and thus report them as an anomaly as well. Finally, a single anomalous cube cell per timeslice will make the entire timeslice anomalous. It is one of the primary strengths of the algorithm that it also detects specific anomalies within a large set of other cells that seem non-anomalous at the same time. The explanation of the anomaly for a timeslice will contain all anomalous cube cells for that timeslice together with the additional data, so that the human expert can further examine the incident.

4.3.2. Exemplary results

With the setup as described before and a 40 minute anomaly threshold we achieved a recall of 0.72 and an accuracy of 0.89 as the primary targets of the algorithm. The precision was low at 0.27, as expected and explained above; this yields an F1 score of 0.39. The matrix in table 1 summarizes the results.

Table 1: Confusion matrix for outage anomaly detection (at least 40 minute outage per timeslice considered anomalous); CE anomaly alert (false/true) against PMS issue (false/true)

Again, the seemingly high number of false positives is due to the fact that the CE detects anomalies that are not part of the PMS failure ground truth, either because they are attacks or because they did not lead to events in the PMS. As another baseline an auto-encoder based algorithm trying to detect only outages on the same data set only achieved a 0.31 F1 score, mainly because of a higher number of false negatives.

If we reduce the threshold for how many anomalous events in the ground truth make a timeslice anomalous to a single event (i. e. 5 minutes of the 4 hour timeslice), the recall drops somewhat to 0.60, whereas accuracy and precision remain pretty much the same, such that the F1 score reduces to 0.37 (cf. table 2). This behavior is due to an increased number of false negatives, which could be expected as some minor issues in plant operation do not necessarily cause anomalous power production. The auto-encoder baseline increased its F1 score to 0.33 in this case.
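The alignment of PMS events to timeslices with the two thresholds used in this section (40 and 5 minutes) can be sketched as follows. The function name `label_timeslices` and its parameters are hypothetical, and the sketch assumes that every PMS event record stands for 5 minutes of anomalous operation, in line with the 5 minute event resolution described in section 4.3.1.

```python
from datetime import datetime, timedelta


def label_timeslices(event_times, slice_starts,
                     slice_length=timedelta(hours=4),
                     event_minutes=5, min_outage_minutes=40):
    """Derive ground-truth labels per timeslice from PMS event timestamps.

    A timeslice is labeled anomalous (True) if the events falling into it
    add up to at least min_outage_minutes, counting each event as
    event_minutes (an assumption of this sketch).
    """
    labels = {}
    for start in slice_starts:
        end = start + slice_length
        n_events = sum(1 for t in event_times if start <= t < end)
        labels[start] = n_events * event_minutes >= min_outage_minutes
    return labels
```

Calling the function with `min_outage_minutes=5` corresponds to the stricter setting in which a single recorded event already makes the whole timeslice anomalous.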

A final evaluation shows that there is still potential in the CE based algorithm by fine tuning the learned cell models. Increasing the threshold anomaly score for alerts to 4 standard deviations, we obtain the confusion matrix in table 3. This increases the accuracy to 0.94 and specifically the precision to 0.43. The recall is slightly reduced to 0.68, for an overall F1 score of 0.53. This improvement is primarily due to the reduced number of seemingly false positives in situations where no outage is recorded in the PMS. However, it remains unclear whether this is an actual improvement in practice or not. It simply leads to a reduction of detected anomaly candidates. Yet from the data provided it is unknown whether these situations actually belong to anomalous or regular behavior.
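The reported F1 scores can be reproduced from the stated precision and recall values with the standard definition of F1 as the harmonic mean of both. In the following sketch the function name `f1_score` is chosen for illustration, and the precision of 0.27 for the 5 minute setting is an assumption based on the statement that precision remained pretty much the same.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) per evaluation setting as stated in the text.
settings = {
    "40 min threshold, score > 3": (0.27, 0.72),
    "5 min threshold, score > 3": (0.27, 0.60),  # precision assumed unchanged
    "40 min threshold, score > 4": (0.43, 0.68),
}

for name, (precision, recall) in settings.items():
    print(f"{name}: F1 = {f1_score(precision, recall):.2f}")
```

The rounded results agree with the F1 scores of 0.39, 0.37 and 0.53 given above.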

In summary, the evaluation in this section has shown that the algorithm introduced in section 3 is capable of detecting unusual system behavior of a wind power plant which had also been recorded in a PMS, particularly with good accuracy and recall. Precision and thus F1 score are somewhat lower, which can be attributed to the algorithm also detecting anomalous behavior that had not been recorded in the PMS, e. g. because it was due to a specific wind condition. This is exactly the main advantage of the CE algorithm, namely also detecting anomalous behavior in specific conditions which could be caused by an attack. We have also shown that optimizing some of the hyperparameters of the approach (such as message thresholds and timeslice aggregation) might improve the detection quality further, in addition to larger training sets and more dimensions.

5. Conclusion and Future Work

5.1. Summary

In this paper we have presented a concept and implementation to detect anomalous behavior in renewable power plants. The concept is based on learning normal behavior of key performance figures such as effective power production. The normal behavior is learned for many specific situations which can be expressed as multidimensional cells in an OLAP-like data cube. On one hand, this reduces the number of false negatives by learning very specific models for the individual cells representing specific situations. On the other hand, the number of false positives can still be kept low by using larger training data sets. In addition, the specificity of the learned model can be used to put a mere anomaly score into context and thus facilitate appropriate treatment before an alert is raised; this assessment can be done by a human inspector and, to some degree, even automatically as in section 4.3. This is an important advantage of the explainability achieved by the learned behavior models for each cell. The concept has been successfully evaluated on actual data from wind power plants as shown in section 4, both in general and also on a set of known outages as one possible reason for anomalous behavior.

In summary, the concept presented in this paper offers a promising approach to detect anomalous behaviour in renewable power plants by learning specific models according to a configurable set of dimensions reflecting relevant circumstances for power production. The anomaly scores based on learned mathematical models provide traceable explanations for the detected anomalies, which may originate from attacks or regular operational issues.

5.2. Outlook

While the evaluation presented in section 4 already showed the usefulness of the concept, many more experiments are needed to reveal its full potential. Further analysis is required to identify interesting and relevant dimensions in the base data to be used for the cube. Some promising dimensions such as temperature, air pressure and power factor have not been included yet. Moreover, using larger time ranges for the training data will be one of the next steps to further verify the positive impact of more precisely learned models. This should also further reduce some issues in detecting unusually low power production, which are due to normality models with too large standard deviations that do not raise high enough anomaly scores even for zero power production in certain situations.
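The issue with low power production can be made concrete with a small sketch: zero production only exceeds the alert threshold if the learned mean minus the threshold times the standard deviation is above 0 kW. The function names are hypothetical; the first example uses the cell model from section 4.1 (mean around 140 kW, standard deviation 80 kW), while the second cell with mean 1500 kW and standard deviation 200 kW is an invented example.

```python
def lower_alert_bound(mean_kw, std_kw, threshold=3.0):
    """Metric value below which a reading scores above the threshold."""
    return mean_kw - threshold * std_kw


def zero_production_detectable(mean_kw, std_kw, threshold=3.0):
    """True if a reading of 0 kW would score above the alert threshold."""
    return lower_alert_bound(mean_kw, std_kw, threshold) > 0.0


# Cell model from section 4.1: zero production scores only 140 / 80 = 1.75.
print(zero_production_detectable(140.0, 80.0))

# Hypothetical, more precise cell: zero production scores 1500 / 200 = 7.5.
print(zero_production_detectable(1500.0, 200.0))
```

In the first cell the lower bound is negative, so even a complete outage would stay below the alert threshold; in the second cell any reading below 900 kW would raise an alert.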

Also, some experiments have shown that using a normal distribution as foundation of estimating cell models is not always appropriate. We saw several cases where most metric training data lies around a rather small value with a few high outliers. For such distributions a normal distribution is not a good estimator. Instead, alternative models should be used which will be added to our implementation soon.

Finally, we have so far evaluated the concept only on wind power production. We have similar datasets from photovoltaics, which we plan to use for a second evaluation. The metric will again be the effective power production, but an extensive evaluation will be needed to determine which dimensions are most promising.
