Towards Adaptive Anomaly Detection and Root
   Cause Analysis by Automated Extraction of
         Knowledge from Risk Analyses

   Bram Steenwinckel, Pieter Heyvaert, Dieter De Paepe, Olivier Janssens,
 Sander Vanden Hautte, Anastasia Dimou, Filip De Turck, Sofie Van Hoecke,
                            and Femke Ongenae

                 Ghent University - imec, IDLab, Ghent, Belgium
                         bram.steenwinckel@ugent.be


      Abstract. Sensors, inside internet-connected devices, analyse the envi-
      ronment and monitor possible unwanted behaviour or the malfunctioning
      of the system. Current risk analysis tools, such as Fault Tree Analysis
      (FTA) and Failure Mode and Effect Analysis (FMEA), provide prior in-
      formation on these faults together expert-driven insights of the system.
      Many people are involved in this risk analyses process, resulting in dis-
      ambiguations and incompleteness. Ontologies could resolve this issue by
      providing a uniform structure for the failures and their causes. However,
      domain experts are not always ontology experts, resulting in a lot of hu-
      man effort to keep the ontologies up to date. In this paper, automated
      mappings from the FMEA data to a domain-specific ontology and the
      generation of rules from a constructed FTA were researched to annotate
      and reason on sensor observations semantically and provide some first
      steps towards automated, expert-driven fault detection. The approach
      is demonstrated with a use case to investigate the possible failures and
      causes of reduced passenger comfort levels inside a train.

      Keywords: Anomaly detection · Root Cause Analysis · Risk Analysis ·
      Semantics · Ontology development · Sensor data · IoT


1   Introduction
Sensor monitoring systems are transforming the industry, with game-changing
applications in, e.g., transportation [5] and healthcare [20]. These systems can
yield valuable insights into company's physical assets and the interaction of these
assets with their environment. However, sensors have limited added value with-
out data analysis [22]. More and more, new methodologies are defined to specify
the correct functioning of the system based on these sensor observations. Com-
mon methodologies for observing unwanted system behaviour with this data are
Anomaly Detection (AD) and Root Cause Analysis (RCA). AD is the identifi-
cation process of events or observations, which do not adhere to the expected
pattern or other items inside a dataset [20]. RCA guides a problem solver to
deduce and understand the real causes of the anomalies [19]. Interest in AD &
18        B. Steenwinckel et al.


                                                                                       Alerting Issue
                                  Failure    Failure     Failure    Control
 Component        Function
                                  Mode        Effect     Cause      Method
Fire detector Detects the fire   Unwanted Falsely      Broken       None
                                 evacuation generate   sensor
                                            d alarm
Smoke         Sense the       Values too Broken        Dust         Clean
sensor        smoke level in  high       sensor        accumulation system    Broken                    Power
                                                                                           Dust
              the environment                                                 Sensor                    Failure


                                      (a)                                                  (b)


                             Fig. 1: Example of FMEA (a) and FTA (b)


RCA will continue to grow as more relevant data is generated and tools become
widely accessible that can handle data from diverse operating environments.
    However, domain-specific knowledge needs to be leveraged to clearly define
the unwanted behaviour and its causes inside these tools as sensor, or system
behaviour in general, varies wildly between application domains. This knowledge
is often provided by domain experts by using risk analysis, which define all
the possible failures and their (observable) effects on the system. Failure Mode
and Effects Analysis (FMEA) [2] and Fault Tree Analysis (FTA) [8] provide
templates to provide such analyses easily. As shown in Figure 1 (a), FMEA
captures, on multiple levels of the system, the potential failures that can occur
to the components and their underlying causes and effects. FTA analyses the
undesired states of a system using Boolean logic. This combination of low-level
events, leading to system failures, can be visualised using a tree, as exemplified
in Figure 1 (b).
    Constructing these FMEA and FTA documents is a time-consuming process
when applied thoroughly. A large number of experts are involved, who each have
expertise in other parts of the system and interpret different parts of the risk
analysis differently. Ambiguities, inconsistencies and duplicates are, therefore,
quite common. These disagreements reduce the advantages of these risk analysis
and make it difficult for non-experts to interpret and use these documents. Shar-
ing, however, a common understanding about the structure of the system and
contextual knowledge amongst the experts could help in separating the domain
knowledge from the operational knowledge about the (mal)functioning of the
system. Ontologies and accompanying inference rules have proven their worth in
providing a common knowledge representation about a domain by defining com-
mon concepts and generalizable rules [23]. However, most system experts are not
familiar with ontology design, which makes these approaches difficult to imple-
ment and maintain. Semantic Web experts are required to constantly improve
and update these ontologies and accompanied rules with new domain-specific
knowledge.
    In contrast, the effort to enable domain experts to generate ontologies and
rules based on the domain knowledge captured in the FMEA and FTA docu-
                    Automated Extraction of Knowledge from Risk Analyses          19

ments will lower the barrier to use them in existing data analysis methodologies.
The generated ontologies and rules themselves can be used to annotate and
reason upon incoming sensor observations, to eventually provide some essential
tools for preliminary semantic-based AD and RCA.
    In this paper, we propose an approach to automatically generate the required
ontologies and inference rules from the aforementioned risk analysis outcomes.
The entries from the FMEA table will be used to create a domain-specific ontol-
ogy. The fault trees will be used to derive rules to clarify if a particular sensor
observation leads to failure or not. The automation of this approach reduces the
need for the involvement of ontology and rule experts in the risk analysis process
while enabling maintainable, semantic-based AD and RCA. As such, the domain
experts can focus on their primary task, i.e., applying their domain knowledge
to accurately capture the unwanted behaviour of a system and its causes.
    The remainder of the paper is structured as follows. Section 2 situates our
approach with respect to the related work. The designed approach is discussed
in detail in Section 3, while Section 4 details the application of the approach on
a real-life use case, i.e., investigating the possible failures and causes of reduced
passenger comfort levels inside a train. Section 5 highlights the most important
accomplishments and discusses the directions for future work.


2   Related work

As previously mentioned, ontology-based risk analysis methods have been pro-
posed. Dittmann et al. [7] describes a process to capture the results of a FMEA
in an ontology, instead of in a document and highlighted the (dis)advantages.
Rehman et al. [18] and Zhou et al. [24] designed high-level ontologies to model
the main concepts of a FMEA and their relationships. The first applied it to
model the results of a FMEA in the automotive domain. The second used it
to capture the FMEA of wind turbines and developed a reasoning framework
to perform intelligent fault diagnosis using the designed ontology capturing the
domain-specific concepts. Both papers showed how an ontology could be used
to easily trace the relationships between failures and their corresponding causes,
making it easier to interpret the risk analysis. Ontologies to automatically link
the observations made within a particular system to anomalies or faults that
can occur, have also been proposed [16]. Although high-level concepts have been
defined to model irregularities and link them to system components and effects,
an ontology expert is required to model all the domain-specific anomalies that
can occur and how they link to the sensor observations. None of the proposed
ontologies are publicly available, hindering re-use. Moreover, all the approaches
propose to replace the existing methodologies with a process in which the re-
sults of a FMEA are directly captured in an ontology. This requires extensive
knowledge about ontology design from the system experts as the current FMEA
are performed using standard spreadsheet tools.
    FTA has the advantage to be a more rigorous approach due to the step-by-
step reasoning. Contrary to FMEA, FTA is a graphical method and already iden-
20      B. Steenwinckel et al.

tifies the interrelations between concepts. As a result, FTA is more interpretable
than FMEA as the latter forces the analyst to decompose the system [17]. In an
effort to automate the construction of the FTA trees, Venceslau et al. [21] de-
fined an ontology to model the system components and failures and constructed
a technique to generate the FTA tree from the constructed ontology automati-
cally. The use of the ontology solves the issue of inconsistencies and ambiguities
between FTA trees due to the lack of a common knowledge representation and
the automatic generation of the tree ensures human understanding of the result.
However, it again requires ontology design knowledge from the system experts.
     While an ontology can capture the various concepts occurring within a do-
main and their intricate relationships, additional expressivity is required to de-
rive that a fault has occurred out of the combination of various system obser-
vations. Rule languages, such as RuleML [3] and SWRL [13] can define infer-
ence rules, which are used inside a semantic reasoner to derive logical conse-
quences. Recently, techniques have been designed to extract SWRL rules from
text using NLP [11] or mine Semantic Web Association Rules from RDF data
(SWARM) [1]. However, there are, to our knowledge, currently no techniques
which allow the automated extraction of rules from risk analyses.
     It can be concluded that currently, no approaches exist that allow system
experts to use their traditional risk analysis methodologies, i.e. FMEA tables and
FTA trees, while still providing methods to automatically extract unambiguous
and consistent ontologies and rules from them in a user-friendly manner.


3    User-friendly approach to extract knowledge from risk
     analyses

Tools which detect and analyse the unwanted behaviour directly from the sen-
sor observations, such as AD and RCA, can use the expert knowledge as in-
put, preferably in a semantic format to operate automatically. Inferring the fail-
ures based on the incoming sensor observations, in combination with a domain-
specific ontology, enable the detection of irregularities and the derivation of their
causes. The generated ontologies and inference rules are the building blocks to
determine this unwanted behaviour and can be incorporated in a knowledge-
based monitoring system to identify anomalies and their causes continuously. For
example, they can be integrated into MASSIF, a data-driven platform for the
semantic annotation of and reasoning on Internet-connected data, allowing com-
plex decision-making processing [4]. When new sensor observations are generated
by the system, MASSIF semantically annotates them using the domain-specific
ontologies. MASSIF then uses a semantic reasoner to process the generated rules
and links defined in the ontologies to determine whether failures are occurring
and what their possible causes are. As such, the sensed data can be combined on
the fly with background knowledge, resulting in enhanced and adaptive context-
aware AD and RCA applications. This full process is visualised in Figure 2. The
next sections will focus mainly on the first part of this approach: the extraction
                   Automated Extraction of Knowledge from Risk Analyses         21

and combination of sensor data and expert knowledge in the form of FMEA and
FTA.


  FMEA/FTA                             Semantic                            AD
                                     Enhancement


  Sensor data                                                             RCA

Fig. 2: Overview of the approach to combine knowledge with sensor data for AD
and RCA


    To generate these rules and ontologies in a user-friendly manner, we propose
an approach to automatically extract them from FMEA and FTA documents
and trees, as visualised in Figure 3.
    The first part of this automation approach uses declarative mapping rules to
map FMEA documents on domain-specific ontologies describing the components
and their associated anomalies, causes and system effects. Second, predefined
translation scripts are used to extract the inferences rules from the FTA trees. It
is important to state that both the mapping rules and scripts are generic and can
be re-used for every new FMEA table and FTA tree. Only when the structure
or template of the documents changes, additional mappings or changes to the
scripts will have to be provided. Different methodologies to easily provide these
changes with a minimum amount of human effort or knowledge about ontologies
and inference rules are also provided. Both are part of the semantic enhancement
visualises in Figure 2.
    To ease the explanation of the different steps, a running example based on a
smart fire detector will be used in the next sections. The end goal is to seman-
tically map the observations from this fire detector to possible failures and give
further tools the possibility to derive possible causes. A part of the FMEA is
visualised in Figure 1 (a) and it describes the possible failures of the available
smoke sensor. A false alarm (failure effect) could be generated when dust accu-
mulates in the device (failure cause), as it hinders the sensor from observing the
environment correctly.


3.1   Ontology mapping approach

The data inside the FMEA tables can be used to define a domain-specific on-
tology, describing the links between several components of the system and their
22         B. Steenwinckel et al.


                                                         +
                                                                                Domain ontology
                                             RMLEditor    Folio Ontology        (based on Folio)
     FMEA tables


                                                   RML rules


                                                   RML Mapper


                                                   Python script

                                                                                  SWRL rules
        FTA tree                              Mapping generator

      Fig. 3: Overview of the approach to extract knowledge from risk analyses


possible failures and related causes. The automated process transforming these
tables to an ontology is visualised in the top part of Figure 3. The FMEA tables,
represented as CSV files, will be used for the mapper as input. The mapper it-
self will use rules to convert the data inside the table to ontology concepts with
predefined links between them. Both the description of the rules and the links
between the concepts in the FMEA tables will be explained in this section.

Folio ontology
  Before we can specify the methodology to extract knowledge from FMEA &
FTA documents for a specific domain ontology, a definition of the common con-
cepts within the risk analysis domain should be given. Therefore, we developed
an ontology, called Folio1 , which captures all application-independent concepts
that occur within FMEA, FTA and anomaly detection methods. It is based
on the aforementioned ontologies constructed by Zhou, et al. [24] and Pardo,
et al.[16]. There are several concepts inside the FMEA template similar in the
anomaly domain. The effects and causes of an anomaly can be related to the
failure causes and effects, while both have detection methods and a degree of
severity. Combining the concepts of both of them enables the derivation of the
possible anomaly causes with the available knowledge inside the FMEA work-
sheets.
    The Anomaly class defined inside Figure 4 and the directly connected classes
include all the possible anomaly information. These classes were adapted to
ensure applicability in a context of detecting anomalies for Internet-connected
devices and can determine the irregularities in streaming data. The Semantic
1
     Folio ontology: https://github.com/IBCNServices/Folio-Ontology/blob/master/Folio.owl
                                                 Automated Extraction of Knowledge from Risk Analyses                                                                                       23

                                                                                              folio:Anomaly


                                                                                                                                               hasDetection
                                                    hasCriticality


                                                                                hasCategory
                        hasDomain


                                                                                                                                                                            hasCause
                                                                                                                      hasEffect
 folio:ConsequenceDomain                        folio:Criticality            folio:Category                   folio:Effect               folio:DetectionMethod           folio:Cause

                                                                                                  hasEffect


                                                                                                                isA
     sosa:result


                                                                                                                                                                           isA
                                                                                                        folio:FailureEffect
       sosa:hasResult


                                                                                                                                   isA
                                                    isA
                                                                            hasLowerEffect                                                 hasLowerEffect
                                                folio:LocalEffect                                      folio:IntermediateEffect                               folio:FailureCause
                                    isA                                      hasNextEffect                                                 hasNextEffect
sosa:Observation
                                          isA


                                                                     isLocalEffect                        isInterEffect


                                                                                                                                                              isA
    sosa:makeObservation

 sosa:Sensor                                                                              folio:FailureMode
                                                  hasEffect                                                                              hasCause
                                                                                                                  happenedAt                                           External component
                                                                        ssn:hasSubSystem
                                                                                                                                  ssn:System                           Subclass relation
                        sosa:Observes
                                                                     sosa:ObservableProperty                                                                           Specified relation


                                                                        Fig. 4: The Folio ontology.


Sensor Network (SSN) ontology2 [9,14,10] describes sensors and their observa-
tions for a diverse range of devices and is included in this upper ontology. The
SSN architecture includes a lightweight, but self-contained core ontology called
SOSA (Sensor, Observation, Sample, and Actuator) for its elementary classes
and properties. With their different scope and different degrees of axiomatiza-
tion, SSN and SOSA can support a wide range of applications and use cases. By
using SSN & SOSA, the Folio ontology can describe the sensor's observations
that are the basis for analysing the system behaviour. Relationships were defined
in Folio to correlate the SSN classes with possible failures and effects.
    The FMEA concepts from Zhou, et al. were extended and related to the
anomaly class inside the Folio ontology. The FailureEffect and FailureCause
classes are subclasses of the anomaly Effect and Cause classes.
    Relations between causes and effects are needed to describe the correspond-
ing connections between multiple components. A FailureCause defines a con-
cept with no further hasNextEffect relations. An IntermediateEffect con-
cept will describe the influence of an intermediate component that is affected
by, but not causing, the detected problem. The whole detection flow can have
multiple Intermediate Effects. The LocalEffect refers to the first detected
effect on the system. A LocalEffect will mostly be related to a faulty sensor
observation itself, describing the current state of the device or system compo-
nent. For the fire detector example, the accumulation of the dust will be defined
as a FailureCause because it can be verified as an end effect with no further
hasNextEffect relations. The malfunctioning of the sensors are IntermediateEffects,

2
    SSN ontology: https://www.w3.org/TR/vocab-ssn/
24         B. Steenwinckel et al.

and they could even have multiple causes. A LocalEffect could be a value too
high observation, indicating something is wrong with the system.


Domain knowledge transformation
  The previously described upper ontology Folio can now be used to form a
more domain-specific ontology, using the data from the FMEA tables. As such,
anomaly knowledge can be extracted from the FMEA itself, and the causes of
these anomalies can be derived by following the created semantic links. The
mapping approach from these table entries to Folio concepts is visualised in the
top part of Figure 3 and consists of three different steps. First, the tables itself
must be transformed into a computer-readable format to process the information.
Second, rules must be generated to map the column and header information from
the FMEA table to existing concepts inside the upper ontology. At last a mapping
procedure is required to transform all the rows inside the formatted table to
domain-specific concepts, eventually creating the domain-specific ontology.
    More specific, FMEA is usually performed using a spreadsheet program,
transformable to CSV document used for further analysis. The different pos-
sible elements of each record in the FMEA are fixed and defined by the column
headers of the provided FMEA templates. More concrete, every FMEA table
defines minimal the failures, their effects and their causes of a system. To enable
the mapping of the FMEA on the Folio ontology, these column headers should
be mapped on ontological concepts. A mapping language was used to realise
this, which enables the declarative definition of how to generate RDF from ex-
isting data sources through a set of rules. Our approach uses the RML mapping
language [6].
    We defined the RML rules following the guidelines of the Folio ontology
via the RMLEditor [12], which offers a graphical user interface to aid users in
defining rules. The high levels steps we followed are as follows: (i) a sample of
an FMEA table was loaded in the RMLEditor, (ii) the rules were created by an
ontology expert, (iii) the corresponding RDF triples were generated, (iv) if the
triples are not as expected the rules are updated, and (v) the rules are exported3 .
Afterwards, the RMLMapper4 , a tool to execute RML rules, is used to generate
the ontologies for all FMEA tables. The mappings ensure that for each cell in
the FMEA table, a new class is created in the ontology, which is a subclass of
the class on which the column is mapped according to the rules. For example, if
we consider the 5th cell on the second row of Figure 1 (a), the RMLMapper will
create a new concept DustAccumulation in the ontology, which has as superclass
the FailureCause class.
    This approach is preferred because mapping languages provide a reusable
solution, while custom software and mapping scripts are limited to a specific
use case or implementation [6]. Another advantage is the adaptive character
of the mapping rules: when making changes in the representation of the data
updating the mapping rules will suffice to incorporate this extra information
3
     RML rules: https://github.com/IBCNServices/Folio-Ontology/blob/master/mapping.rml.ttl
4
     RMLMapper: https://github.com/RMLio/RML-Mapper
                    Automated Extraction of Knowledge from Risk Analyses         25

in the domain-ontology. As such, these mappings can be re-used to translate
any FMEA table that is created according to the standard FMEA structure.
Changing the information inside or adding new information regarding risks and
failures to the FMEA documents do not affect the generation of the domain
ontologies at all. Only if the template changes, e.g. If a new column is added or
for instance, the risk analysts switch to a more advanced Failure Mode, Effect
and Criticality Analysis method, new rules must be defined, which can easily be
created by mapping the columns to the Folio classes by using the RMLEditor.
Due to the frequently used FMEA templates, this will not happen often. In our
fire detector example, this means that updating the FMEA documents, by adding
an additional cause for the unwanted evacuation (for instance a broken test
button), does not affect the domain ontology generation process. The outcome
of our mapping approach is a domain ontology in OWL. In our fire detector
example, this ontology will relate the failures of the temperature and smoke
sensors to the general system effects, using the relationships of the upper Folio
ontology.


3.2   Rule generation approach

While the previous part relates the domain-specific information of failures and
causes together, faulty sensor observations must be mapped to the correct fail-
ures before a semantic-based monitoring system, such as the one describe in
Figure 2 can be operational. Rules are used for specifying the irregularities in
the data through defining patterns or operational ranges. For example, a smoke
detector can be defined as faulty, when it measures impossibly high values of dust
accumulations. This requires experts to define the normal value ranges for these
sensors adequately. The Folio ontology and the previously explained FMEA map-
ping approach already allow defining the observations and their resulting values
made by sensors. This section describes how rules can be extracted from the
FTA trees to link these observations to possible faults that occur, by using the
process visualised at the bottom of Figure 3. Similar to the FMEA mapping
approach, three different steps can be defined. First, the trees themselves must
be transformed into a computer-readable format to process the rules. Second,
tree-agnostic knowledge must be mapped to specific rule concepts to perform
the translation between the tree representations and the rule definitions. At last
a mapping procedure is required to transform all the information inside the
formatted trees to domain-specific rules.


Decision Fault Trees
 While original fault trees describe the relationship between the components
of the system, they usually do not allow to differentiate the observations from
their possible failures. In the case of the fire detector example visualised in
Figure 1 (b), the link between the sensor observations and all the possible failures
shows the interaction of the different system components but does not capture
the difference between the accumulation of dust or, for example, a broken sensor.
26         B. Steenwinckel et al.

A fault tree created from the FTA restricts the analysis to the relations between
the components inside the system solely. Therefore, a combination of a decision
tree, which is capable of modelling the decision from observations to failure with
the possible consequences, together with the general FTA tree, is used here. This
so-called decision fault tree (DFT) provides tests on the intermediate edges of the
tree, visualising the basic rules for further analysis. Figure 5 gives an example of
such a decision tree, related to our fire detector example. When a certain smoke
observation has a value greater than 50 ppm, the observation can be classified
as a ValuesTooHigh failure.


                                              >   50
                                        Value              ValuesTooHigh
                     Smoke
                   Observation
                                       Value
                                             <   =50       Normal behaviour


                         Fig. 5: example of a Decision Fault Tree.


    A user interface was designed to build such DFTs, as shown further in Fig-
ure 8. In this editor, descriptions of the observation and failure nodes can be
given. These different node concepts should align with the concepts defined in
the FMEA. Tests describing the relations between these observations and fail-
ures can be added or adapted. Several representations are possible for such DFTs.
The user-interface outputs JSON file to describe the nodes and the rule-specific
edges.


Domain-specific decisions
 The constructed DFTs can now be used as input to define the domain-specific
decisions, relating the observations to previously defined failures in the FMEA
procedure. To translate the decision inside the tree to rules, a rule generator
script was designed in Python5 to transform the JSON representation into SWRL
rules. In a first step, the decision and nodes are gathered from the DFT inside a
JSON format. Second, RDF syntax rule definitions are used as mock-ups for the
SWRL rules. These definitions specify all the basic boolean operations, as well
as the logical operators (<, ≤, ==, ≥, >). The JSON DFTs are then provided as
input to these definitions, resulting finally in specific SWRL rules. These SWRL
rules can be attached to the FMEA RDF document or can be saved separately. To
give an example, the generated SWRL rule specifying a ValuesTooHigh failure
in the fire detector example of Figure 5 looks as follows:

5
     Script: https://github.com/IBCNServices/Folio-Ontology/blob/master/swrl_builder.py
                   Automated Extraction of Knowledge from Risk Analyses      27

     SmokeObservation(?o) ^ hasResult(?o, ?result) ^
     hasValue(?result, ?value) ^ swrlb:greaterThan(?value, 50)
      -> ValuesTooHigh(?o)

This rule describes the inference of a ValuesTooHigh failure when an observation
is a SmokeObservation and the result value of this observation is greater than
50.
    Similar to the FMEA mapping procedure, the python mock-ups are defined
once and can operate on all generated DFTs. Changing the information inside or
adding new information regarding risks and failures to the DFTS do not affect
the rule generation process. Only if the DFT components changes, e.g. If new
functional operators are added, a new mock-up must be defined.


4   Use case: Measuring Train Passenger Comfort

The growing requirements for quality of service put new challenges on the oper-
ation and development of trains and railway tracks. Therefore, research on the
passenger comfort levels has reached high interest in the last decade [15]. As
shown in Figure 6, train bogies are now equipped with accelerometers and gyro-
scope sensors, able to detect the shocks and damping effect of the train on the
tracks. Multiple sensor observations of different train cars can be combined on
a server to indicate the passenger comfort inside the train. Maintenance alerts
are given to both the train or track staff to resolve the issues.


                     Fig. 6: Schematic overview of a train.


   The company installing these train monitoring units, i.e. Televic Rail, per-
formed risk analyses. The resulting FMEA table, visualised in Figure 7, shows
the possible failures of a disallowed comfort level that result in the effect of
multiple falsely generated warnings for the train driver. Two possibilities are
a broken or malfunctioning sensor. The FMEA table shows that the cause of
the latter is varying outdoor temperatures while degradation causes the broken
sensor. Replacing or recalibrating it could solve these issues.
28          B. Steenwinckel et al.
                                                              FMEA_temp
Component                Function                     Failure Mode   Failure Eﬀect           Failure Cause        Control   Containment
                                                                                                                  Method    Action

Passenger Comfort Unit Detects the level of Comfort False warning    Indicating impossible   Broken sensor        None      None
                                                                     comfort level

Accelerometer Sensor     Measures changes in          Values too high Broken sensor          Degradation of the   None      Replace Sensor
                         gravitational acceleration                                          sensor

                                                                     Malfunctioning sensor   Rapid temperature    None      Calibrate sensor
                                                                                             changes
Gyroscope Sensor         Measure the smoke level      Values too high Broken sensor          Degradation of the   None      Replace Sensor
                                                                                             sensor

                                                                     Malfunctioning sensor   Rapid temperature    None      Calibrate sensor
                                                                                             changes


                          Fig. 7: Train passenger comfort FMEA example


                                      Fig. 8: Train sensors DFT example


     A DFT was also modelled by Televic in the designed web interface, as shown
in Figure 8. This tree describes the relationship between the temperature obser-
vations of the accelerometer unit and the humidity observations of the gyroscope
sensor unit with their possible failure modes. A ValuesTooHigh failure can occur
when the temperature of the accelerometer has a value higher than 125 degrees
Celsius. The value range is this high because the accelerometer module operates
on the wheel axles and these components are influenced by much frictional heat.
A second failure can be derived when the temperature value is lower or equal
than minus 40 degrees Celsius. We will refer to to same ValuesTooHigh failure
for visualization purposes. At last, the same ValuesTooHigh failure can be used
to indicate the humidity of the gyroscope has a value higher than 85%. All other
observations are classified as normal in this simple use case.
     The corresponding JSON file of the DFT and the CSV file of the table can be
given as input to the mapping engine. The RML rules are here already predefined
(same rules as defined in the fire detector example) and map the specific input
fields to an RDF train-specific ontology. A schematic overview of the generated
ontology is given in Figure 9 and visualises the major concepts of Figure 7. The
inferred rules of the DFT, given in Figure 8, are visualised in Listing 1.1. This
listing describes three SWRL rules corresponding to the paths from the sensor
observations to the single failure mode. When an accelerometer temperature
                                                                     1
                   Automated Extraction of Knowledge from Risk Analyses      29


    Fig. 9: Ontograf visualisation of the passenger comfort FMEA ontology


observation reaches the reasoning engine, and its value is greater than to 125
degrees Celsius, the observation will be classified as a ValuesTooHigh failure
and further actions can be taken.

    HumidityObservation (? o ) ^
    hasResult (? o , ? result ) ^
    swrlb : greaterThan (? Value , 0.85) ^
    hasValue (? result , ? Value )
    -> ValuesTooHigh (? o )

    hasResult (? o , ? result ) ^
    swrlb : greaterThan (? Value , 50) ^
    hasValue (? result , ? Value ) ^
    Te mp era tur eO bse rv ati on (? o ) ^
    swrlb : greaterThan (? Value , 125)
    -> ValuesTooHigh (? o )

    swrlb : lessThanOrEqual (? Value , -40) ^
    hasResult (? o , ? result ) ^
    hasValue (? result , ? Value ) ^
    Te mp era tur eO bse rv ati on (? o ) ^
    swrlb : lessThanOrEqual (? Value , 50)
    -> ValuesTooHigh (? o )

             Listing 1.1: SWRL rules derived from the DFT in Figure 8


5    Conclusion and Future work
In this paper, research is proposed to enable the automatic knowledge extraction
out of risk analyses into domain-specific ontologies and accompanying inference
30      B. Steenwinckel et al.

rules. Mappings were provided to incorporate the knowledge inside FMEA doc-
uments into a domain-specific ontology. An upper ontology was used to relate
the main concepts, making the methods operational for several, different appli-
cations. Inference rules were extracted from DFTs, which were able to express
the link between sensor observations and defined failures. Both methods allow
system experts to use the risk analysis methodologies and tools they are used
too to build a domain-specific ontology with accompanying rules, without the
additional need for ontology experts. These ontologies and accompanying rules
ensure that a common vocabulary and consistency check is maintained and can
be used to enable on the fly detection of anomalies and their causes through
semantic reasoning. It enables the system experts to focus on the risk analysis
task, instead of on a knowledge modelling task for which they do not have the
adequate ontology design expertise. Future research can now use the designed
ontologies, together with accompanying rules to derive or reason on the possible
causes inferred from the failures. Additionally, the DFT editor itself can be ex-
tended with consistency checks to ensure improved rule generation.

Acknowledgment: This research is part of the imec ICON project Dyversify,
co-funded by imec, VLAIO, Renson Ventilation NV, Televic Rail & Cumul.io.


References

 1. Barati, M., et al.: Swarm: approach for mining association rules from semantic web
    data. In: Conference on Artificial Intelligence. pp. 30–43. Springer (2016)
 2. Ben-Daya, M.: Failure mode and effect analysis. In: Handbook of maintenance
    management and engineering, pp. 75–90. Springer (2009)
 3. Boley, H., et al.: Design rationale of ruleml: A markup language for semantic web
    rules. In: Proceedings on Semantic Web Working. pp. 381–401. CEUR-WS (2001)
 4. Bonte, P., et al.: The massif platform: a modular and semantic platform for the
    development of flexible iot services. Knowledge and Information Systems (2017)
 5. Camossi, E., et al.: Semantic-based Anomalous Pattern Discovery in Moving Ob-
    ject Trajectories. CoRR abs/1305.1 (2013)
 6. Dimou, A., et al.: Rml: A generic language for integrated rdf mappings of hetero-
    geneous data. In: LDOW (2014)
 7. Dittmann, L., et al.: Performing fmea using ontologies. In: 18th International Work-
    shop on Qualitative Reasoning. Evanston USA. pp. 209–216 (2004)
 8. Ericson, C.A.: Fault tree analysis. Hazard analysis techniques for system safety pp.
    183–221 (2005)
 9. Haller, A., Janowicz, K., Cox, S.J.D., Le Phuoc, D., Taylor, K., Lefrançois, M.:
    Semantic Sensor Network Ontology. W3C and OGC Recommendation, W3C &
    OGC (Oct 19 2017), https://www.w3.org/TR/vocab-ssn/
10. Haller, A., Janowicz, K., Cox, S.J., LefranÃğois, M., Taylor, K., Le Phuoc, D.,
    Lieberman, J., Garcia-Castro, R., Atkinson, R., Stadler, C.: The modular SSN
    ontology: A joint W3C and OGC standard specifying the semantics of sensors,
    observations, sampling, and actuation. Semantic Web Journal (2018)
11. Hassanpour, S., et al.: Framework for the automatic extraction of rules from online
    text. In: Workshop on Rules and Rule Markup Languages. Springer (2011)
                     Automated Extraction of Knowledge from Risk Analyses              31

12. Heyvaert, P., et al.: Rmleditor: a graph-based mapping editor for linked data map-
    pings. In: International Semantic Web Conference. pp. 709–723. Springer (2016)
13. Horrocks, I., et al.: Swrl: A semantic web rule language combining owl and ruleml.
    W3C Member submission 21, 79 (2004)
14. Janowicz, K., Haller, A., Cox, S.J., Le Phuoc, D., LefranÃğois,
    M.:     SOSA:       A     lightweight    ontology      for   sensors,    observations,
    samples,      and      actuators.     Journal    of      Web    Semantics      (2018).
    https://doi.org/https://doi.org/10.1016/j.websem.2018.06.003
15. Karimpanal, T.G., Gadhia, H.M., Sukumar, R., Cabibihan, J.: Sensing discomfort
    of standing passengers in public rail transportation systems using a smart phone.
    CoRR (2017)
16. Pardo, E., et al.: A framework for anomaly diagnosis in smart homes based on
    ontology. Procedia Computer Science 83 (2016)
17. Peeters, J., et al.: Improving failure analysis efficiency by combining fta and fmea
    in a recursive manner. Reliability engineering & system safety 172, 36–44 (2018)
18. Rehman, Z., Kifor, C.V.: An ontology to support semantic management of fmea
    knowledge. International Journal of Computers, Communications & Control (2016)
19. Solé, M., et al.: Survey on Models and Techniques for Root-Cause Analysis. Clinical
    Orthopaedics and Related Research (CoRR) (2017)
20. Souiden, I., et al.: A survey on outlier detection in the context of stream mining.
    In: Advances in Intelligent Systems and Computing (2017)
21. Venceslau, A., et al.: Ontology for computer-aided fault tree synthesis. In: Emerging
    Technology and Factory Automation (ETFA), 2014 IEEE. pp. 1–4. IEEE (2014)
22. YE: Big data: Changing the way businesses compete and operate (2014)
23. Ye, J., et al.: Semantic web technologies in pervasive computing. Pervasive and
    Mobile Computing pp. 1–25 (2015)
24. Zhou, A., et al.: A research on intelligent fault diagnosis of wind turbines based on
    ontology and fmeca. Advanced Engineering Informatics 29(1), 115–125 (2015)