=Paper=
{{Paper
|id=Vol-3651/DARLI-AP_paper1
|storemode=property
|title=Explainable Anomaly Detection in Renewable Energy Power Plants by Learning
|pdfUrl=https://ceur-ws.org/Vol-3651/DARLI-AP-1.pdf
|volume=Vol-3651
|authors=Carsten Kleiner
|dblpUrl=https://dblp.org/rec/conf/edbt/Kleiner24
}}
==Explainable Anomaly Detection in Renewable Energy Power Plants by Learning==
<pdf width="1500px">https://ceur-ws.org/Vol-3651/DARLI-AP-1.pdf</pdf>
<pre>
                                Explainable Anomaly Detection in Renewable Energy Power
                                Plants by Learning Multidimensional Normality Models
                                Carsten Kleiner
                                University of Applied Sciences & Arts Hannover, Faculty IV, Ricklinger Stadtweg 120, 30459 Hannover, Germany


                                                                       Abstract
                                                                       Renewable energy production is one of the fastest-growing markets, and further strong growth can be anticipated due to
                                                                       the desire for increased sustainability in many parts of the world. With the rising adoption of renewable power production, such
                                                                       facilities are increasingly attractive targets for cyber attacks. At the same time higher requirements on a reliable production
                                                                       are raised. In this paper we propose a concept that improves monitoring of renewable power plants by detecting anomalous
                                                                       behavior. The system does not only detect an anomaly, it also provides reasoning for the anomaly based on a specific
                                                                       mathematical model of the expected behavior by giving detailed information about various influential factors causing the
                                                                       alert. The set of influential factors can be configured into the system before learning normal behaviour. The concept is based
                                                                       on multidimensional analysis and has been implemented and successfully evaluated on actual data from different providers of
                                                                       wind power plants.

                                                                       Keywords
                                                                       Anomaly detection, Attack detection, Resiliency, Multidimensional analysis, Wind power plant, Normality model, Explainable
                                                                       anomaly detection


1. Introduction and Motivation

For reasons of sustainability, the amount of regenerative power production is continuously increasing
worldwide at ever higher rates. With higher shares of the overall power production, a reliable power
supply from renewable sources becomes more and more important. On the other hand, due to their
dependence on actual weather conditions, it is as a matter of principle more difficult to achieve a
reliable supply from natural sources. Thus, even closer monitoring of the production process by the
operators is important.

Apart from operational challenges, the rising impact of renewable sources in power production also
makes them an attractive target for attackers. As already shown by the attack on Ukrainian power
plants in December 2015 by Russian hacker groups, critical infrastructure becomes an ever more
important attack target, not only in the recent war crisis in Ukraine ([1]). Thus, it is also
important to employ advanced and powerful attack detection systems for renewable power production
systems in order to protect this part of the critical infrastructure.

In this paper a novel detection system will be proposed that is capable of detecting anomalies in the
operation of renewable power plants. The system is reason-agnostic in its ability to detect anomalous
operation, be it operational or based on attacks. Since monitoring and decisions on potential actions
to be taken are ultimately performed by highly skilled humans, it is important to use their time as
economically as possible. Integrating outage and attack detection in a single system supports this
goal.

In addition, there is typically a tradeoff between false positives and false negatives to be balanced
in anomaly detection. The more alerts are generated, the smaller the number of false negatives. On the
other hand, more alerts often mean more false positives, exhausting the human resources available to
deal with the generated alerts. Thus, in order to take informed decisions and apply appropriate
measures, the human monitoring staff needs to be able to assess messages from the anomaly detection
engine. It is therefore important that reasons for alerts are provided to the humans so that false
positives can be detected as easily as possible. The proposed system will provide such reasons to the
operators by showing detailed, mathematically based explanations for generated alerts.

The remainder of this paper starts with a review of related publications in section 2, which will show
that while there are already advanced solutions to specific aspects, none of these systems provides
the same combination of features as our system. The concept of the proposed system will then be
explained in section 3 and a specific configuration for wind power plants will be presented. This is
followed by a practical evaluation of the concept on actual wind power plant data from different
German wind power plants from the years 2019 to 2021 in section 4. Finally, results will be summarized
and ideas for extending the system itself as well as its application scope will be presented in
section 5.

Published in the Proceedings of the Workshops of the EDBT/ICDT 2024 Joint Conference (March 25-28, 2024), Paestum, Italy
ckleiner@acm.org (C. Kleiner)
ORCID: 0000-0001-9497-0312 (C. Kleiner)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073


2. Related Work

Several papers in the context of anomaly detection for renewable energy systems can be found in the
literature. In a more generalized context, [2] describes a learning approach similar to the one in
this paper for any type of IoT system. Whereas this approach could also be applied to renewable power
plants, it is not clear which part of the learning can be carried out in an automated fashion.
Moreover, its results do not provide explanations for anomalies. A focus on attacks, more specifically
intrusion detection, is described in [3]. However, the approach is not extensible to outage detection
and only provides non-explainable alert messages. More specifically for power plants, [4] uses many
very general input parameters. However, this approach also does not provide explainable anomalies as
results.

Other interesting wind power specific concepts include [5, 6]. However, these approaches also do not
provide explainable results. The first, in addition, requires a semi-supervised learning approach,
which is not feasible for previously unknown attack types. Also, annotated training data is often not
available. The second approach focuses on system failure detection rather than attacks.

On the other hand, [7] focuses on attacks and is specific to wind power plants. It is not extensible
to other types of energy sources and the degree of explainability of the results is not obvious.
Papers [8, 9] also focus only on specific attacks on wind power plants and thus do not achieve the
general detection capabilities of our concept. The latter is concerned with false data injection
attacks, which are also the focus of several other publications. Moreover, [10] provides a good
overview of the security challenges from attacks that have to be considered, but it does not present
a comprehensive solution.

Finally, there are also papers with a concept quite similar to ours, but with different detection
approaches, such as Markov chains in [11] and a more complex detection model in [12]. In both cases,
however, the approach is specific to wind power plants, extensibility is not documented, and the
explainability of the generated alerts is uncertain. This is also true for [13], which also uses a
correlation based approach, yet it is only one-dimensional and requires many specific sensors, so that
it is also tied to the domain of wind turbines only. Even more specific to wind turbine gearboxes is
[14]. The authors do not limit their approach to attacks, also use a multidimensional analysis and
generate at least partially explainable alerts. However, it is not obvious whether and how this can be
extended beyond gearboxes.

In summary, none of the discussed references is able to provide the comprehensive features of our
approach (covering attacks and outages, generating explainable alerts, detecting unknown attacks and
being usable for different types of power generation).

3. Concept

3.1. Requirements and Context

Based on the research project SecDER (https://secder-project.de), which aims to increase the
resilience of renewable virtual and physical power plants, the requirements for an anomaly detection
system have been identified as follows:

   Reason agnostic: Both anomalies originating from known and unknown attacks as well as non-attack
   based anomalies shall be detected, ideally based on a single detection system.

   Explainable alerts: The identified anomalies should be used to raise alerts that can be handled by
   human domain experts. In order to simplify and substantiate the decisions by the experts,
   explainable alerts should be provided, detailing the reason and context why the alert has been
   issued.

   Adaptability: The concept shall be usable for different types of wind power plants as well as
   different types of renewable power plants in general. The learned normality models can be specific
   to each plant; however, the concept to learn the model should be generic.

   General normality model: While a single set of normality models for all plants is not a goal, it is
   preferable if normality models can be learned for groups of similar plants. This way the model
   becomes more stable, and the number of extensive learning processes can be reduced.

   Continuous learning and adjustment: The system should be capable of adjusting the learned system
   behaviour continuously, thus improving the quality of the normality models over time. This also
   allows the models to be updated in cases of concept drift.

The system described in the following part of the paper will satisfy all of these requirements. On the
other hand, there are also limitations of the approach that have been accepted in order to keep the
complexity manageable. In particular, detection is only considered up to explainable alert generation;
alert handling itself is not in scope. Handling can be considered orthogonal as long as explainability
of the generated alerts is secured. For alert handling, generic procedures and manual update concepts
can be considered as an extension, see e.g. [15] for an approach based on rule-based anomaly
detection. Similarly, we only consider anomaly-based detection concepts, since most attack patterns
(and even some of the non-attack-based outage patterns) are previously unknown, so rule- or
pattern-based detection will not be powerful enough to detect these. As attacks on virtual power
plants are executed by dedicated experts, advanced attacks will be used which are unique to the
specific target and thus typically not previously known.
3.2. Multidimensional Normality Models (MNM)

The basic concept for anomaly detection is learning multidimensional normality models (MNM) based on
historic data of the power plant (or a set of similar power plants) and then assessing the deviation
from this MNM for current readings of a logical record of the plant. The concept, called cellwise
estimator (CE) of the MNM, has already been described in detail in [16]; thus, we will only present a
high level description here. Originating from online analytical processing (OLAP) cubes, the idea is
to describe normal behaviour of certain metrics (such as power production in a windmill) based on
several orthogonal dimensions (such as weather conditions, plant sensor readings and others). The
reason for this multidimensional treatment is that measurements of the metrics may be within a
permissible range when looking at them globally, whereas they may be an anomaly when considering the
specific context in more detail. The context is described by the dimensions which are used in learning
the MNMs. Conversely, potentially abnormal measurements on the global level may actually be normal
when looking at their specific context. Thus, it is important to be able to base a decision whether a
logical record constitutes an anomaly on both global as well as contextual, i.e. dimensional,
information. To account for these challenges a specific normality model is learned for each of the
cube cells, i.e. every contextual situation.

Unfortunately, the higher the number of dimensions and the number of values within a dimension, the
larger the number of combinations to consider becomes. Since the growth is exponential, these numbers
have to be limited. In addition, the concept of iceberg cubes ([17]) known from the OLAP domain can
also be used to restrict the number of cubes to consider to relevant ones.

In order to deal with continuous data streams as needed for monitoring a power plant, the cubes are
computed per timeslice with a configurable timeslice length. The metric attribute whose normal
behavior is to be learned is aggregated by some configurable aggregation function over all readings
within a timeslice. For the domain of wind power plants, for instance, the power production output of
a mill is a logical choice as a metric, with multiple readings being aggregated by using the average
over a timeslice. Typical dimensions for this metric can be wind speed, wind direction, rotor position
and outside temperature. Since the dimensions are used to form an OLAP-like cube, all dimensions must
be of discrete types. Thus, continuous readings such as wind speed and temperature need to be assigned
to a set of classes in order to be used as dimensions. As known from OLAP rollups, there is also a
symbolic value of * in each dimension that aggregates all classes in that dimension and thus provides
a cube cell where the class is irrelevant.

The goal of the learning process on historical data is to compute a statistical description of the
metric attribute for each cell of the cube. This is done by assuming a normal distribution for the
metric readings in each cell and approximating that normal distribution by estimating mean and
standard deviation for the metric attribute from the historical data. For current readings the anomaly
score is computed as the difference to the mean of each relevant cell, expressed as a number of
standard deviations, i.e. |reading - mean| / standard deviation. The higher this factor, the more
likely the current reading is an outlier. As known from statistics, a factor of 3 is a natural choice
as a threshold to generate an alert. As will be seen in section 4, though, solely looking at this
factor as an anomaly measure is not sufficient to properly assess the importance of an alert.

In summary, each cell's normality model in our concept consists of an estimation of normal
distributions (with mean and standard deviation each) of one or more measurements per cube cell over a
timeslice. Cube cells are defined by combinations of discrete values of relevant dimensions, with
wildcards allowed for cells with irrelevant values in a dimension. The anomaly score is then computed
based on the number of standard deviations that any current reading of a measure deviates from the
expected mean. Alerts are typically only raised for cube cells with anomaly scores higher than a
threshold of 3. In addition to the anomaly score, the computed normality model (the distribution
estimate) is also provided with the alert, along with information about the cell's dimensional values
that caused the alert. This combination of information (metric measurement, anomaly score, contextual
values, normality model) comprises the explanation for the human expert. Thus, an informed decision
about the proper reaction to the alert is facilitated.

3.3. Application of MNM to Wind Power Plants

In order to apply our concept as explained in section 3.2 to renewable energy plants in general and
wind power plants in particular, we have to define the metrics with aggregation functions for which
normality models shall be learned as well as the discrete dimensions that might influence the metrics
and be important for assessing an alert. Candidates for choosing the metrics are any elements of a
monitoring reading that can be used to describe the operational behaviour of a windmill. The
assumption is that attacks or outages will lead to unexpected behavior in this metric. Primarily, this
is the effective electrical power production of the mill computed as an average over a timeslice. For
consistency checks the number of measurement readings per timeslice can also be used as a metric.
Alternative options that have not been evaluated in the experiments described in section 4 could be
the positions of the pod or the blades of the windmill or other operational features.

There are many more options for choosing the dimensions than the metrics. In the evaluation in section
4 we have experimented with different choices, but there are actually many more. Obvious dimensions
include wind speed, wind direction, pod position, air temperature and air pressure. Further possible
options include power factor, pitch angles of each blade, angle between pod and wind direction and
anemometer readings. The choice of discretization of each of these factors (cf. section 3.2) can be
considered another hyperparameter of the application. Specific choices for the dimensions and
discretizations for the experiments will be explained in section 4, but it has to be pointed out that
those are only initial selections and many more experiments will have to be carried out in the future
to optimize the approach, cf. section 5.2.

Figure 1: Effective Power and Anomaly Scores for Single Mill (Total view)
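
The cellwise estimator outlined in sections 3.2 and 3.3 can be illustrated with a minimal sketch. It
is not the implementation from [16]: the function names (discretize, cells_for, learn_models, explain),
the record layout and the example wind speed range are assumptions made for illustration only.
Timeslice aggregation and iceberg-cube pruning are omitted, and records are assumed to be already
aggregated per timeslice.

```python
import itertools
import statistics
from collections import defaultdict

WILDCARD = "*"  # symbolic value that aggregates all classes of a dimension


def discretize(value, lower, upper, n_classes):
    """Map a continuous reading to one of n_classes equally sized intervals."""
    width = (upper - lower) / n_classes
    index = int((value - lower) / width)
    return min(max(index, 0), n_classes - 1)  # clamp readings outside the range


def cells_for(dims):
    """Return every cube cell a record falls into, including wildcard rollups."""
    options = [(d, WILDCARD) for d in dims]
    return list(itertools.product(*options))


def learn_models(records):
    """Estimate (mean, sample standard deviation) of the metric per cube cell.

    records: iterable of (dims, metric) pairs, one per timeslice, where dims is
    a tuple of discrete classes and metric is the aggregated timeslice value.
    """
    values = defaultdict(list)
    for dims, metric in records:
        for cell in cells_for(dims):
            values[cell].append(metric)
    return {
        cell: (statistics.mean(v), statistics.stdev(v))
        for cell, v in values.items()
        if len(v) >= 2  # a standard deviation needs at least two readings
    }


def explain(models, dims, metric, threshold=3.0):
    """Score one timeslice reading against every learned cell it belongs to."""
    alerts = []
    for cell in cells_for(dims):
        if cell not in models:
            continue  # context was never observed during training
        mean, std = models[cell]
        if std == 0:
            continue  # score is undefined for a constant training metric
        score = abs(metric - mean) / std
        if score > threshold:
            # The alert carries value, score, context and model as explanation.
            alerts.append({"cell": cell, "value": metric, "score": score,
                           "mean": mean, "std": std})
    return alerts
```

With two dimensions, a reading in classes (2, 5) is scored against the cells (2, 5), (2, *), (*, 5)
and (*, *). A value that looks unremarkable in the global cell (*, *) can therefore still exceed the
threshold in a more specific cell, which is the situation the multidimensional treatment is meant to
capture.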
4. Evaluation

In order to evaluate the capabilities of the concept in detail, we used historical data from actual
wind power plants that are operated by project partners in the SecDER project. We had two different
datasets, one from each operator. The data did not contain any known attacks, yet it did contain some
anomalies due to maintenance or unusual weather conditions.

The first dataset consists of operational log data from a single windmill over the time range from
January 2020 to August 2021 at a sampling rate of 15 minutes. Each log reading consists of 22
attributes in total, one of which is the timestamp; the others can be used as metrics or dimensions as
will be explained in section 4.1.

The second dataset provides operational log data from 9 different wind parks, comprising 42 windmills
in total, at a sampling rate of 5 minutes. The data provides 30 attributes per reading and readings
were available for the year 2020.

In both cases, a first part of the data has been used for training and the remainder for testing. In
the sequel, results will be presented based on output from a specifically developed GUI tool. In the
figures the horizontal axis covers the testing period and displays the results for individual test
instances. Each timeslice's reading can be considered a test case. The graph shows the results for a
specific cell of our cube, as selected from different dimensions, values and combinations at the top.
Within a figure the red curve shows the computed metric value (scale on the left) whereas the blue
curve shows the anomaly score, i.e. the number of standard deviations that the value is from the mean
in this particular cell (scale on the right). Typically, scores above 3 can be considered anomalous.
In addition, a yellow line displays the learned mean value for the metric for this cell and green and
light blue lines show mean +/- 3 standard deviations.

4.1. Validation of Concept

As an initial validation we used the data from 2020 of the first dataset as training set and the
readings from 2021 for testing. We chose the average electrical power production over timeslices of 4
hours as primary metric. We experimented with several attributes as dimensions; the results in this
subsection have been achieved with wind speed, wind direction and the difference between gondola angle
and wind direction. The continuous values in these dimensions have been linearly assigned to 9, 12 and
5 classes, respectively. The number of classes of the first two features has been determined
heuristically by assigning equally sized intervals of the total range of values to classes. For the
third feature, where the original data had a strongly non-linear distribution, we decided to use fewer
classes, primarily to account for major and medium outliers in each of the two directions and to have
most data in the no-difference class.

Figure 1 shows the test results for the global cell, i.e. no fixed value in any of the dimensions. As
we can see, there are only few significant anomaly scores, primarily those on January 20th, March 11th
and March 29th. At this general level (no fixed dimensional values), this behavior can be expected as
the threshold for raising an alert is around 1900 kW, which is already pretty close to the 2400 kW
nominal power of the mill. However, the first two of those scores will not be reported by an alert, as
none of the subcells along the wind speed dimension has an anomalous score. This means that the power
production seemed unusually high from a global point of view (which is information that could have
been observed without our approach but would have raised a false positive), yet in reality it is
simply explained by the rather high wind speed on those days. For the remaining high anomaly score the
dimensional analysis shows reduced anomaly scores the more detailed the cells become, yet the score
remains above 3, thus raising an alert. Looking at the data in detail in the evaluation, this score
can be considered a false positive. The reason is that this specific context situation had not been
observed in the whole training period. Such errors can be remedied by increasing the training data
set.

Even more interesting is the analysis looking into some of the dimensions, as the learned normality
behavior is much more specific in those cases, as seen in figure 2. In that figure we have focused the
display on wind speed class 2 (pretty low speed) and wind direction class 2. The figure shows that the
learned model with a mean around 140 kW and a standard deviation of 80 kW is very specific. Still, the
only remaining alert, with an anomaly score of 3.1, shows up on April 11th. This could be a false
positive due to a too specific cell model or a true alert due to a malfunction with too high generated
power. A human operator seeing the alert would be able to classify this alert based on their domain
knowledge. Due to space constraints we only present these exemplary results here.

Figure 2: Effective Power and Anomaly Scores (Single mill, Two dimensions restricted view)

4.2. Common Model for Plant Groups

For the second validation, data from the set of wind parks has been used. Here, January to August 2020
has been used as training data and September to December 2020 for testing. Metrics and dimensions
shown are identical to the ones in the previous subsection for comparability purposes. In addition,
the specific windmill has also been used as another dimension in order to be able to analyze the
outcome per mill and over all mills together. Data from 17 of the mills with identical nominal power
production of 2300 kW have been used.

Figure 3 again shows the overall view of the scores with no fixed dimensional values. We can see that
the learned normality model is much more specific than the one in figure 1 due to the extended
training set (standard deviation around 200 kW as opposed to 500 kW).

Figure 3: Effective Power and Anomaly Score (Plant Group, total view)

Two cases with higher anomaly scores can be identified, namely Nov 2nd and Nov 19th/20th. The first of
those shows a behavior similar to the one already noted in the previous subsection, i.e. an anomaly
score that does not show up in any of the dimensionally restricted models; thus, it would not be
reported as an alert. The latter anomaly score would be tied to two of the four wind parks as well as
a specific wind speed and wind direction, all showing anomalous scores in one alert, as those are all
dependent cells in the cube. This shows that the score is indeed an anomaly for these mills (cf.
figure 4) and should thus be reported as an anomaly alert. This can be considered a true positive that
is recognized by the system. It can be further explained to the human expert by providing the specific
wind park, speed and direction that cause the alert to be raised.

Figure 4: Effective Power and Anomaly Scores (Plant group, dimensionally restricted view, speed class
6, direction class 8)

In general, the increased size of the training data leads to more precisely learned models in the
cells. This potentially increases the number of false positives, since high anomaly scores are more
likely with a smaller standard deviation. However, by judging an anomaly score in combination with the
standard deviation of its cell, most of the false positives can be identified easily and thus do not
lead to raising alerts. On the other hand, the benefit of the more precise models is that false
negatives are much less likely in that case.

Also, only precise cell models facilitate discovery of anomalies in cases with unusually low power
production, which are particularly relevant in case of attacks. This is due to the fact that low
production is only observed as an anomaly if the learned mean - 3 standard deviations is above 0 kW.
This can only be achieved with rather precise cell models, which need large training datasets.

4.3. Evaluation against Known Outages

The evaluations in the previous subsections were only able to show that anomalous behavior can be
detected in principle, since the data did not contain any known attacks or outages of the power
plants. In order to get a qualitative impression of how well the detected anomalies correspond with
actual unusual behavior, we evaluated the concept against data from a single windmill that was
available over a 2.5-year time frame. In addition, information from the plant management system (PMS)
was available for this plant that listed all known and recorded system problems during that time.

Table 1: Confusion matrix for outage anomaly detection (at least 40 minute outage per timeslice
considered anomalous)

                          CE anomaly alert
                          false     true
   PMS issue    false       948      103
                true         15       38

It should be noted that this evaluation is not well suited for a thorough quantitative analysis of the
algorithm since the dataset only provides information about events
affecting the operation of the mill that were known to                    on the other hand it is questionable whether a full times-
the PMS. Thus, since no attacks are known there are                       lice shall be considered anomalous just based on a single
no attack labels and thus no evaluation against attack                    event. For the following evaluation we used thresholds of
detection is possible. Similarly, anomalous situations                    40 and 5 minutes within a 4 hour timeslice as a condition
due to an unusual behavior of the mill unknown to the                     for an anomalous timeslice. Note that an anomaly due to
PMS are not labeled as anomalous in the ground truth.                     an outage is usually rarely a very short incident.
Thus we can expect some (seemingly) false positives for                      Another aspect is the management of missing read-
the anomalous situations not recorded in the PMS and                      ings from the windmill which is often times caused by
thus labeled as normal. This will lead to a rather low                    anomalous operation. If no data readings are present for
precision when comparing our anomaly messages with                        a whole timeslice the CE algorithm will not detect an
the events recorded in the plant management system as                     anomaly for the power production, since missing data
ground truth.                                                             does not get any anomaly score. However, with the sec-
   In addition, the events in the PMS record any unusual                  ond metric (number of readings per timeslice) we can
situation in the windmill regardless of their impact on                   easily detect timeslices where no power readings are
the actual power production. Since we consider output                     present and thus report them as an anomaly as well. Fi-
power production as our analysis target, it is obvious                    nally, a single anomalous cube cell per timeslice will
that we will not be able to detect events that have no or                 make the entire timeslice anomalous. This is one of the
minimal influence on the power production2 . Such situa-                  primary strengths of the algorithm to also detect only
tions will be recorded as seemingly false negatives in the                specific anomalies within a large set of non-anomalously
comparison, impacting the recall negatively. However,                     seeming other cells at the same time. The explaination
we do not anticipate too many of such messages so that                    of the anomaly for a timeslice will contain all anomalous
aiming for a high recall is still a desirable target.                     cube cells for that timeslice together with the additional
   Both effects mentioned previously will also impact                     data, so that the human expert can further examine the
other measures such as accuracy (to some degree) and                      incident.
F1 score (to large degree). Still a good, albeit not perfect,
accuracy score is also a valid goal to target.                            4.3.2. Exemplary results

4.3.1. Evaluation Setup                                                   With the setup as described before and 40 minute anomaly
                                                                          threshold we achieved a recall of 0.72 and an accuracy
For this evaluation we used windmill data from 2 years                    of 0.89 as the primary targets of the algorithm. The pre-
as training set for our algorithm and data from the re-                   cision was low at 0.27 as expected and explained above;
maining 0.5 years as a test set. We used an algorithm                     this makes an F1 score of 0.39. The matrix in table 1
configuration similar to the one in section 4.1. We had to                summarizes the results.
clean training data by removing the readings for times                       Again, the seemingly high number of false positives
which had been recorded in the PMS as anomalous in                        is due to the fact that the CE detects anomalies that are
order to only learn normal behavior of the system.                        not part of the PMS failure ground truth, either because
   Since the events recorded in the PMS used timestamps                   they are attacks or because they did not lead to events
with 5 minute difference, we first need to align the time                 in the PMS. As another baseline an auto-encoder based
resolution, i. e. define how many anomalous events within                 algorithm trying to detect only outages on the same data
a 4 hour timeslice make such a timeslice anomalous in                     set only achieved a 0.31 F1 score, mainly because of a
total. While it is desirable on one hand to even realize                  higher number of false negatives.
anomalies that only occur at a single instance in time,                      If we reduce the threshold how many anomalous events
                                                                          in the groud truth make a timeslice anomalous to a sin-
2
    From a practical point of view detecting such events with our algo-   gle event (i. e. 5 minutes of the 4 hour timeslice), the
    rithm is not necessary, as these have only minimal impact on the
                                                                          recall reduces somewhat to 0.60, however accuracy and
    power production and are already known from the PMS and thus
    do not require advanced detection.                                    precision remain pretty much the same such that the F1
score reduces to 0.37 (cf. table 2). This behavior is due to an increased number of false
negatives, which could be expected as some minor issues in plant operation do not necessarily
cause anomalous power production. The auto-encoder baseline increased its F1 score to 0.33 in
this case.

                           CE anomaly alert
                           false      true
      PMS issue   false      938       103
                  true        25        38

Table 2
Confusion matrix for outage anomaly detection (at least 5 minute outage per timeslice
considered anomalous)

   A final evaluation shows that there is still potential in the CE based algorithm by fine
tuning the learned cell models. Increasing the threshold anomaly score for alerts to 4 standard
deviations, we obtain the confusion matrix in table 3. This increases the accuracy to 0.94 and
specifically the precision to 0.43. The recall is slightly reduced to 0.68 for an overall F1
score of 0.53. This improvement is primarily due to the reduced number of seemingly false
positives in situations where no outage is recorded in the PMS. However, it remains unclear
whether this is an actual improvement in practice or not. It simply leads to a reduction of
detected anomaly candidates. Yet from the data provided it is unknown whether these situations
would actually belong to anomalous or regular behavior.

                           CE anomaly alert
                           false      true
      PMS issue   false     1004        47
                  true        17        36

Table 3
Confusion matrix for outage anomaly detection with higher anomaly threshold

   In summary, the evaluation in this section has shown that the algorithm introduced in
chapter 3 is capable of detecting unusual system behavior of a wind power plant which had also
been recorded in a PMS, particularly with good accuracy and recall. Precision and thus F1 score
are somewhat lower, which can be attributed to the algorithm also detecting anomalous behavior
that had not been recorded in the PMS, e. g. because it was due to a specific wind condition.
This is exactly the main advantage of the CE algorithm, namely also detecting anomalous
behavior in specific conditions which could be caused by an attack. We have also shown that
optimizing some of the hyper parameters of the approach (such as message thresholds and
timeslice aggregation) might improve the detection quality further in addition to larger
training sets and more dimensions.

5. Conclusion and Future Work

5.1. Summary

In this paper we have presented a concept and implementation to detect anomalous behavior in
renewable power plants. The concept is based on learning normal behavior of key performance
figures such as effective power production. The normal behavior is learned for many specific
situations which can be expressed as multidimensional cells in an OLAP-like data cube. On one
hand, this reduces the number of false negatives by learning very specific models for the
individual cells representing specific situations. On the other hand, the number of false
positives can still be kept low by using larger training data sets. Also, assessing the
specificity of the learned model to put a mere anomaly score into context and thus facilitate
appropriate treatment before raising alerts can be done by a human inspector and to some degree
even by automation, as in section 4.3. This is an important advantage of the explainability
achieved by the learned behavior models for each cell. The concept has been successfully
evaluated on actual data from wind power plants as shown in section 4, both in general and also
on a set of known outages as one possible reason for anomalous behavior.
   In summary, the concept presented in this paper offers a promising approach to detect
anomalous behavior in renewable power plants by learning specific models according to a
configurable set of dimensions reflecting relevant circumstances for power production. The
anomaly scores based on learned mathematical models provide traceable explanations for the
detected anomalies, which may originate from attacks or regular operational issues.

5.2. Outlook

While the evaluation presented in section 4 already showed the usefulness of the concept, many
more experiments are needed to reveal its full potential. Much more analysis with regard to
identifying interesting and relevant dimensions in the base data to be used for the cube is
required. Some promising dimensions such as temperature, air pressure and power factor have not
been included yet. Moreover, using larger time ranges for the training data will be one of the
next steps to further verify the positive impact of more precisely learned models. This should
also further reduce some issues detecting unusually low power production due to normality
models with too large standard deviations that do not raise high enough anomaly scores even for
zero power production in certain situations.
   Also, some experiments have shown that using a normal distribution as foundation of
estimating cell models is not always appropriate. We saw several cases where
most metric training data lies around a rather small value with a few high outliers. For such
distributions a normal distribution is not a good estimator. Instead, alternative models should
be used, which will be added to our implementation soon.
   Finally, we have currently only evaluated the concept on wind power production. We have
similar datasets from photovoltaics which we plan to use for a second evaluation. The metric
will similarly be the effective power production, but regarding dimensions there will have to
be an extensive evaluation of which are most promising.

References

  [1] Cybersecurity and Infrastructure Security Agency, Russian Government Cyber Activity
      Targeting Energy and Other Critical Infrastructure Sectors, 2018. URL:
      https://www.cisa.gov/uscert/ncas/alerts/TA18-074A.
  [2] S. Chakraborty, A. Onuchowska, S. Samtani, W. Jank, B. Wolfram, Machine learning for
      automated industrial IoT attack detection: An efficiency-complexity trade-off, ACM Trans.
      Manage. Inf. Syst. 12 (2021). URL: https://doi.org/10.1145/3460822.
  [3] K. N. Junejo, J. Goh, Behaviour-based attack detection and classification in cyber
      physical systems using machine learning, in: Proc. of the 2nd ACM Int. Workshop on
      Cyber-Physical System Security, CPSS ’16, ACM, New York, NY, USA, 2016, pp. 34–43. URL:
      https://doi.org/10.1145/2899015.2899016.
  [4] P. Sun, J. Li, Y. Yan, X. Lei, X. Zhang, Wind turbine anomaly detection using normal
      behavior models based on SCADA data, in: 2014 ICHVE International Conference on High
      Voltage Engineering and Application, 2014, pp. 1–4. URL:
      https://doi.org/10.1109/ICHVE.2014.7035504.
  [5] Y. Zhou, W. Hu, Y. Min, et al., A semi-supervised anomaly detection method for wind farm
      power data preprocessing, in: 2017 IEEE Power & Energy Society General Meeting, 2017,
      pp. 1–5. URL: https://doi.org/10.1109/PESGM.2017.8273883.
  [6] C. McKinnon, J. Carroll, A. McDonald, et al., Investigation of anomaly detection technique
      for wind turbine pitch systems, in: The 9th Renewable Power Generation Conference, 2021,
      pp. 277–282. URL: https://doi.org/10.1049/icp.2021.1401.
  [7] H. Badihi, S. Jadidi, Z. Yu, Y. Zhang, N. Lu, Smart cyber-attack diagnosis and mitigation
      in a wind farm network operator, IEEE Transactions on Industrial Informatics (2022) 1–10.
      URL: https://doi.org/10.1109/TII.2022.3228686.
  [8] A. Datta, M. A. Rahman, Cyber threat analysis framework for the wind energy based power
      system, in: Proc. of the 2017 Workshop on Cyber-Physical Systems Security and PrivaCy,
      CPS ’17, ACM, New York, NY, USA, 2017, pp. 81–92. URL:
      https://doi.org/10.1145/3140241.3140247.
  [9] K. Guibene, N. Messai, et al., A data mining-based intrusion detection system for cyber
      physical power systems, in: Proc. of the 18th ACM Int. Symposium on QoS and Security for
      Wireless and Mobile Networks, ACM, New York, NY, USA, 2022, pp. 55–62. URL:
      https://doi.org/10.1145/3551661.3561367.
 [10] A. Jindal, A. K. Marnerides, A. Scott, D. Hutchison, Identifying security challenges in
      renewable energy systems: A wind turbine case study, in: Proc. of the 10th ACM Int. Conf.
      on Future Energy Systems, ACM, New York, NY, USA, 2019, pp. 370–372. URL:
      https://doi.org/10.1145/3307772.3330154.
 [11] J. D. Deng, H.-S. Lee, C. McMillan, A. Rimoni, M. Zhang, Analyzing wind speed data through
      Markov chain based profiling and clustering, in: Proc. of the 2nd Workshop on Machine
      Learning for Sensory Data Analysis, MLSDA’14, ACM, New York, NY, USA, 2014, pp. 67–73.
      URL: https://doi.org/10.1145/2689746.2689756.
 [12] N. Song, X. Hu, N. Li, Anomaly detection of wind turbine generator based on temporal
      information, in: Proceedings of the 2019 7th Int. Conference on Information Technology:
      IoT and Smart City, ICIT ’19, ACM, New York, NY, USA, 2020, pp. 477–482. URL:
      https://doi.org/10.1145/3377170.3377271.
 [13] H. Lee, N.-W. Kim, J.-G. Lee, B.-T. Lee, An approach for utilizing correlation among
      sensors for unsupervised anomaly detection of wind turbine system, in: 2021 Int. Conf. on
      Information and Communication Tech. Convergence, 2021, pp. 104–109. URL:
      https://doi.org/10.1109/ICTC52510.2021.9621198.
 [14] S. Zhu, Z. Qian, B. Jing, M. Han, Z. Huang, F. Zhang, Condition monitoring of wind turbine
      gearbox using multidimensional hybrid outlier detection, in: Int. Conf. on Smart-Green
      Technology in Electrical and Inf. Systems, 2021, pp. 112–117. URL:
      https://doi.org/10.1109/ICSGTEIS53426.2021.9650387.
 [15] L. Renners, F. Heine, C. Kleiner, G. Dreo-Rodosek, Concept and practical evaluation for
      adaptive and intelligible prioritization for network security incidents, International
      Journal on Cyber Situational Awareness 4 (2019) 99–127.
 [16] F. Heine, Outlier detection in data streams using OLAP cubes, in: New Trends in Databases
      and Information Systems - ADBIS Short Papers and Workshops, Nicosia, Cyprus, volume 767 of
      Communications in Computer and Information Science, Springer, 2017, pp. 29–36. URL:
      https://doi.org/10.1007/978-3-319-67162-8_4.
 [17] J. Han, J. Pei, G. Dong, K. Wang, Efficient computation of iceberg cubes with complex
      measures, SIGMOD Rec. 30 (2001) 1–12. URL: https://doi.org/10.1145/376284.375664.
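
The metrics reported in section 4.3.2 follow directly from the confusion matrices in tables 1
to 3. The following minimal sketch recomputes them. Only the counts are taken from the tables;
the function and variable names are illustrative, and the PMS issue column is treated as ground
truth as in the evaluation.

```python
def metrics(tn: int, fp: int, fn: int, tp: int) -> dict:
    """Standard classification metrics for one confusion matrix."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tn + fp + fn + tp)
    f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}


# Counts as given in tables 1 to 3 (rows: PMS issue, columns: CE anomaly alert).
tables = {
    "table 1 (40 minute threshold)": dict(tn=948, fp=103, fn=15, tp=38),
    "table 2 (5 minute threshold)": dict(tn=938, fp=103, fn=25, tp=38),
    "table 3 (4 standard deviations)": dict(tn=1004, fp=47, fn=17, tp=36),
}

for name, counts in tables.items():
    rounded = {k: round(v, 2) for k, v in metrics(**counts).items()}
    print(name, rounded)
```

Rounded to two digits, the results match the values stated in the text, e.g. a recall of 0.72
and an F1 score of 0.39 for table 1 and a precision of 0.43 and an F1 score of 0.53 for table 3.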

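The alignment of the 5 minute PMS events with the 4 hour timeslices described in section 4.3.1
can be sketched as follows. The sketch assumes that every PMS event stands for 5 minutes of
anomalous operation and that timeslices start at 00:00, 04:00 and so on. The function names and
the example events are hypothetical and not taken from the implementation.

```python
from collections import Counter
from datetime import datetime, timedelta

SLICE_HOURS = 4      # length of a timeslice
EVENT_MINUTES = 5    # resolution of the PMS timestamps


def slice_start(ts: datetime) -> datetime:
    """Start of the 4 hour timeslice that contains ts."""
    hour = ts.hour - ts.hour % SLICE_HOURS
    return ts.replace(hour=hour, minute=0, second=0, microsecond=0)


def anomalous_slices(pms_events, threshold_minutes: int) -> set:
    """Timeslices whose accumulated PMS event time reaches the threshold."""
    minutes = Counter()
    for ts in pms_events:
        minutes[slice_start(ts)] += EVENT_MINUTES
    return {start for start, total in minutes.items()
            if total >= threshold_minutes}


# Hypothetical ground truth: eight consecutive events and one isolated event.
events = [datetime(2020, 1, 1, 0, 0) + timedelta(minutes=5 * i) for i in range(8)]
events.append(datetime(2020, 1, 1, 9, 10))

print(sorted(anomalous_slices(events, 40)))  # only the slice starting at 00:00
print(sorted(anomalous_slices(events, 5)))   # also the slice starting at 08:00
```

With the 40 minute threshold the isolated event does not make its timeslice anomalous, whereas
with the 5 minute threshold a single event is sufficient.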
</pre>