Theoretical characterization of Signal Diagnostic
                    Processing Language

      Ognjen Savković1 , Evgeny Kharlamov2 , Guohui Xiao1 , Gulnar Mehdi3,5 ,
     Elem Güzel Kalaycı1 , Werner Nutt1 , Mikhail Roshchin3 , and Ian Horrocks2
                     1
                       Free University of Bozen-Bolzano, Bolzano, Italy
                            2
                              University of Oxford, Oxford, UK
                  3
                    Siemens AG, Corporate Technology, Munich, Germany


       Abstract. In this paper we analyze the theoretical aspects of our semantic rule
       language, SDRL. Inspired with an industrial collaboration, SDRL is designed
       to process signals from sensors installed in industrial equipment by filtering,
       aggregating, and combining sequences of time-stamped measurements recorded by
       the sensors. In our previous work, we conducted extensive experiments showing
       scalability of the language on real data. Here instead, we analyze theoretical
       properties of SDRL. In particular, we show that query answering in SDRL can be
       reduced to relational algebra with aggregations. To this end, we rely on recently
       introduced Datalognr MTL that is first order rewritable. First we introduce an
       extension of Datalognr MTL with aggregate functions. Then we show how to
       translate query answering in SDRL into the query answering in the extended
       language.


1   Introduction
Large engineering and maintenance companies like Siemens typically rely on rule-based
intelligent diagnostics: diagnostic engineers create and use complex diagnostic rule-sets
to detect abnormalities during equipment run time and sophisticated analytical models
to combine these abnormalities with models of physical aspects of equipment such as
thermodynamics and energy efficacy [1].
    An important class of rules that are commonly used in Siemens for rule-based
equipment diagnostics are signal processing rules (SPRs). SPRs allow to filter, aggregate,
combine, and compare signals, that are time stamped measurement values, coming
from sensors installed in equipment and trigger error or notification messages when a
certain criterion has been met. Thus, sensors report temperature, pressure, vibration and
other relevant parameters of equipment and SPRs process this data and alert whenever a
certain pattern is detected. Moreover, SPRs can combine data of two types: master data
that contains train engineering specifications, results of previous diagnostic tasks, and
diagnostic event data, and operational data that contains recorded signals from sensors
installed in equipment. Data processed by SPRs is then used by analytical models in
order to ensure safety, efficiency, and to lower the maintenance costs.
    SPRs that are currently offered in most of existing diagnostic systems and used in
Siemens are highly data dependent in the sense that (i) data about specific characteristics
like speed, capacity, and identifiers of individual sensors and pieces of equipment are
explicitly encoded in SPRs and (ii) schema of such data is reflected in the SPRs. As a
result, for a typical simple train diagnostic task an engineer has to write from dozens to
hundreds of SPRs with hundreds of sensor tags, component codes, sensor and threshold
values and equipment configurations and design data.

Example 1. Consider the diagnostic task: all door sensors indicate that the car doors are
OK, and the main train line pressure is trending upwards for more than 33 seconds.
    The above task can be checked by analyzing the train sensors. For instance, let us
assume that the locomotive L1 has door sensors SKNF X01, SKNF X02, SKNF X03
and SKNF X04, and let MNPRSS be the sensor that measures the main train pressure.
Further, let us assume that the task is verified as OK if the sum of the above sensors
values is equal to 16. Then, the following data dependent SPR written in a syntax similar
to the one of Siemens SPRs can be used to accomplish the task:

 $CarDoorsOK = truth (sum (SKNF X01, SKNF X02, SKNF X03, SKNF X04) (1)
               : value(=, 16) &&trend(MNPRSS, ’up’) : duration(>, 0.33s))

Here $CarDoorsOK is a Boolean variable. Many of these SPRs differ only on specific
sensor identifiers and the number of speed signals to aggregate. Adapting these SPRs to
another cars will also require to change a lot of identifiers.

     Data dependence of SPRs poses significant challenges for diagnostic engineers in
(i) authoring, and then (ii) reuse, and (iii) maintenance of SPRs. These challenges are
common for large enterprises and Siemens is not an exception.
     In our previous work [2, 3], we proposed a higher-level rule language Semantic
Diagnostic Rule Language (SDRL) that allows to write SPRs and complex diagnostic
tasks in an abstract fashion by exploiting both ontological vocabulary and queries over
ontologies to identify relevant sensors and data values. We implemented SDRL in a
prototypical system and deployed an implementation in Siemens [4]. Also, we evaluated
usability of our solution with Siemens engineers by checking how fast they are in
formulating diagnostic tasks in SDRL. We also evaluated the efficiency of our solution in
processing diagnostic tasks over train and turbine signals in a controlled environment.
Contributions. In this work, we study the formal properties of the SDRL language.
    We prove that the reasoning tasks in SDRL programs can be rewritten into query
answering using linear algebra with aggregates (that is SQL). Further, we analyzed the
upper bound of the complexity of our problem. In particular, we measure the complexity
of the problem in the size of two main components: the program and the data. We expect,
the size of the data largely dominates the size of programs, and while the data can be huge
(typically several GBs) that the size of program can be significantly big (e.g., several
thousands of rules) thus both measures are relevant. 4
    To study the two properties from above, we encode the problem of firing a message
in SDRL into the fact-entailment problem [6] over an extended version of the recently
introduced non-recursive metric Datalog, Datalognr MTL [7]. The reason for doing this
is twofold. First, Datalognr MTL (inspired by the well-studied Metric Temporal Logic
 4
     We point out that the notion of “rewriting into linear algebra with aggregates” is very similar
     to the formal property of First-order rewritability (e.g., see [5]) but since the latter does not
     consider aggregates we use the former.
(MTL) [7]) provides a natural way to model rules that reason over time intervals. Second,
Datalognr MTL is a language suitable for OBDA [5, 8], that is, it has been shown how to
rewrite queries over the rules in Datalognr MTL into standard SQL over the sources [7].
    Still, Datalognr MTL cannot be immediately related to our language since it does not
support aggregates and some other logic constructs that we need for our encoding (in
particular, functional symbols, negation and aggregates). So first, we extend DatalogMTL
with functional symbols, aggregation, etc. under reasonable restrictions, without increasing
the complexity. Then to show that our problem is rewritable into linear algebra with
aggregates we do the following encoding. Given a program Π and a message rule r we
create an extended non-recursive Datalognr MTL program ΣΠ,r and a proposition mr
only such that: Π fires r iff ΣΠ “entails” mr . A corollary of this gives us (i) that our
language is suitable for OBDA [5] specification (follows from the encoding); (ii) ways to
reformulate our programs and rules into SQL queries (extending the principles in [7]).

2     Signal Diagnostic Processing Language SDRL
In this section, we introduce formally the syntax and semantics of our signal diagnostic
processing language, SDRL. To do so, we first introduce signal processing expressions that
allow one to manipulate basic signals using mathematical functions. Then we introduce
a notion of programs that allow one to compose and combine expressions, and form
message rules. Finally, we provide semantics of our language that formally defines how
SDRL should be executed.

2.1   Signals Processing Expressions

We introduce signal expressions that filter and manipulate basic signals and create new
more complex signals. Intuitively, in our language we group signals in ontological
concepts and signal expression are defined on the level of concepts. Then, a signal
processing expression is recursively defined as follows:

          C =     Q                          | {s1 , . . . , sm }               |
                  α◦C                        | C1 : value( , α)                 |
                  agg C1                     | C1 : duration( , t)              |
                  C1 : align C2              | C1 : trend(direction).

where C is a concept, Q is a CQ with one output variable, ◦ ∈ {+, −, ×, /}, agg ∈
{min, max, avg, sum}, α ∈ R, ∈ {<, >, ≤, ≥}, align ∈ {within, after[t], before[t]},
t is a period, and direction ∈ {up, down}. In the following we will consider well formed
sets of signal expressions, that is, sets where (i) each concept is defined at most once and
(ii) where definitions of new concepts are assumed to be acyclic: if C1 is used to define
C2 (directly or indirectly) then C1 cannot be defined (directly or indirectly) using C2 .
     Expressions C = Q and C = {s1 , . . . , sm } we call basic signal expressions and
other we call complex signal expressions.
     The formal meaning of signal processing expressions is defined in Table 1.
Example 2. The data-driven rules that can be used to determine that car doors function
well, as in Equation (1), can be expressed with two concepts in SDRL as follows:

            DoorsLocked = sum MainCarDoors : value(=, LockedValue)                         (2)
             PressureUp = CabinPressure : trend(’up’) : duration(>, 33sec)                 (3)

Here, MainCarDoors is a CQ defined over a database. For brevity we do not introduce a
new concept for each expression but we just join them with symbol “ :”. The constant
LockedValue is a parameter for analyzing door of a train, and they are instantiated from
the train configuration when the expressions are evaluated.

2.2    Diagnostic Programs and Messages
We now show how to use signal expressions to compose diagnostic programs and to alert
messages.
    A diagnostic program (or simply program) Π is a tuple (S, K, M) where S is a set
of basic signals, K a KB, M a set of well formed signal processing expressions such that
each concept that is defined in M does not appear in K.
    In order to enjoy favorable semantic and computational characteristics of OBDA, we
consider well-studied ontology language OWL 2 QL that formal basis is DL-LiteR [5].
    On top of diagnostic programs Π SDRL allows to define message rules that report
the current status of a system. Formally, they are defined as Boolean combinations of
signal processing expressions:

                           D := C | not D1 | D1 and D2 .

 A message rule is a rule of the form, where D is a concept and m is a (text) message:

                                     message(m) = D.

Example 3. Using Equations (2)–(3) we define the following message:

            message(“All car doors OK”) = DoorsLocked and PressureUp.

The message intuitively indicates that the doors are functioning and locked.

      Now we are ready to define the semantics of the rules, expression and programs.

2.3    Semantics of SDRL
We now define how to determine whether a program Π fires a rule r. To this end, we
 extend first-order interpretations that are used to define semantics of OWL 2 KBs.
      Formally, our interpretation I is a pair (IFOL , IS ) where IFOL interprets objects
 and their relationships (like in OWL 2) and IS —signals. First, we define how both
 components of I interprets basic signals in S. Formally, S I = {sI1 , . . . , sIn } where IFOL
‘returns’ the signal id, i.e. sIFOL = os and IS ‘returns’ the signal itself, i.e. sIS = s.
      Now we can define how I interprets KBs. Interpretation of a KB KI extends
 the notion of first-order logics interpretation as follows: KIFOL is a first-order logics
 interpretation K and KIS is defined for objects, concepts, roles and attributes extending
 S I . That is, for each object o we define oIS as s if o is the id of s from S; otherwise
 C=                        Concept C contains
 Q                         all signal ids return by Q evaluated over the KB.
 α ◦ C1                    new signal s0 for each signal s in C1 with fs0 = α ◦ fs .
 C1 : value( , α)          new signal s0 for each signal s in C1 with fs0 (t) = α fs (t)
                           if α fs (t) at time point t; otherwise fs0 (t) = ⊥.
 C1 : duration( , t0 )     new signal s0 for each s ∈ C1 with fs0 (t) = fs (t) if exists an interval
                           I s.t.: fs is defined I, t ∈ I and size(I) t0 ; otherwise fs0 (t) = ⊥.
 {s1 , . . . , sm }        all enumerated signal {s1 , . . . , sm }.
 agg C1                    new signal s0 with fs0 (t) = aggs∈C1 fs (t), that is, s0 is obtained from
                           all signals in C1 by applying the aggregate agg at each time point t.
 C1 : align C2             new signal s1 from C1 if: exists a signal s2 from C2 that is aligned
                           with s1 , i.e., for each interval I1 where fs1 is defined there is
                           an interval I2 where fs2 is defined s.t. I1 aligns with I2 .
 C1 : trend (direction)    one signal s0 for each signal s in C1 with fs0 (t) = fs (t) if exists
                           an interval I around t s.t.: fs is defined I, and fs is an increasing
                           (decreasing) function on I for direction=up (=down resp.)

Table 1. Meaning of signal processing expressions. For the interval I, size(I) is its size. For
intervals I1 , I2 the alignment is: “I1 within I2 ” if I1 ⊆ I2 ; “I1 after[t] I2 ” if all points of
I2 are after I1 and the start of I2 is within the end of I1 plus period t; “I1 before[t] I2 ” if
“I2 start[t] I1 ”. In order to make the mathematics right, we assume that c ◦ ⊥ = ⊥ ◦ c = ⊥ and
c ⊥ = ⊥ c = false for c ∈ R, and analogously we assume for aggregate functions. If the
value of a signal function at a time point is not defined with these rules, then we define it as ⊥.

(o, f⊥ ). Then, for a concept A we define AIS = {sIS | oIs FOL ∈ AIFOL }. Similarly, we
define ·IS for roles and attributes.
     Finally, we are ready to define I for signal expressions and we do it recursively
following the definitions in Table 1.
     We now illustrate some of them. For example, if C = {s1 , . . . , sm }, then C I =
{s1 , . . . , sIm }; if C = Q then C IFOL = QIFOL where QIFOL is the evaluation of Q over
   I

IFOL and C IS = {s | oIs FOL ∈ QIFOL }, provided that IFOL is a model of K. Otherwise we
define C I = ∅. Similarly, we define interpretation of the other expressions.

Firing a Message Let Π be a program and ‘r : message(m) = C’ a message rule.
We say that Π fires message r if for each interpretation I = (IFOL , IS ) of Π it holds
C IFOL 6= ∅, that is, the concept that fires r is not empty. Our programs and rules enjoy the
canonical model property, that is, each program has a unique (Hilbert) interpretation
[9] which is minimal and can be constructed starting from basic signals and ontology
by following signal expressions. Thus, one can verify C IFOL 6= ∅ only on the canonical
model and thus one can evaluate SDRL programs and expressions in a bottom-up fashion.

3     Extended DatalogMTL
In this part we introduce our extension of DatalogMTL . An atom A in extended
DatalogMTL is either a comparison (e.g., τ ≤ τ 0 ) or defined by the following grammar:

             A ::= P (τ1 , . . . , τm ) | > | % A | %A | %A |       %A
                             0             0
                   A U% A | A S% A ¬A | τ = aggJτi | P (τ1 , . . . , τm )K

Here, P is a predicate, % is an interval in reals, τ is a term (possibly with functional
symbols), agg ∈ {min, max, avg, sum} and brackets J·K denote multiset (values can
repeat). A DatalogMTL program Σ is a finite set of rules of the following form:

                A+ ← A1 ∧ · · · ∧ Ak          or       ⊥ ← A1 ∧ · · · ∧ Ak ,

where A+ is an atom that does not contain any ‘non-deterministic’ operators % , % , U% ,
S% , negated atoms, or aggregate operators.
     For our purposes it is sufficient to consider non-recursive programs. Informally, that
are programs where dependency (direct or indirect) between predicates is acyclic. In fact,
it is not trivial to understand how one would even define the semantics of programs with
recursion in case of aggregates and negation. So, from now, we only consider extended
DatalogMTL programs that are non-recursive.
     In DatalogMTL , temporal operators are defined over intervals and they take the form
% , % and U% , which refer to the future, and % , % and S% , which refer to the past
where % is an interval. For example, % A is true at t iff an atom A is true in all points of
an interval % in the future from t, while % A is true at t iff there exists a point in the past
not longer than % from t where A is true. For the complete semantics of the temporal
operators and rules, please see [7].
     A (temporal) data instance is a finite set of facts of the form P (c)@ι, where P (c) is
a ground atom and ι an interval. The fact P (c)@ι states that P (c) holds throughout the
interval ι. Moreover, we simply write P (c)@t for P (c)@[t, t].
     Finally, every satisfiable extended Datalognr MTL program Σ with database D has
the canonical (or minimal) model of Π and D, MΠ,D . As usual, the most important
property of canonical model is that if a fact holds in canonical model then it holds in any
other model.

4     Encoding into Extended DatalogMTL
We start with an example of the encoding then we show the complete encoding.
Example 4 (Example of Encoding). The query in signal processing expression in Exam-
ples 2 and 3 can be encoded in a modular way, starting from simpler to more complex
expressions. We start with the encoding rules for Example 2. First, we show how to
capture the expression “sum MainCarDoors” in (2). For that we use the following rule:

SumMainCarDoors(car ), value(car , v1 ) ← sumJv | MainCarDoors(x), value(x, v)K = v1 .

Intuitively, the rule introduces a new constant car representing the “aggregated main car
door sensor” and assign the average value of all main car door sensors to it. Then, we
encode the second part of (2), value(=, LockedValue), using SumMainCarDoors with:

       DoorsLocked(x) ← SumMainCarDoors(x), value(x, v), v = LockedValue.

 To encode expression (3) in Example 2 we need to use temporal operators. In particu-
lar, to encode CabinPressure : trend(‘up’) we need to copy all intervals of a signal in
CabinPressure on which the signal is trending up. For that we need universal quantifica-
tion (“ ∀”). This is expressible in Datalog by two rules connected with a negation. First
we compute intervals on which a signal is not trending up with the rule:

    notTrendUpCP (x) ← CabinPressure(x), value(x, v1 ),       (0,δ] (value(x, v2 ), v1 > v2 )
Intuitively, formula (value(x, v1 ), (0,δ] (value(x, v2 ), v1 > v2 )) evaluates to true for
some value v1 at a time point t if there exists an interval of a size at most δ containing t
in which signal x has another value v2 that is smaller than v1 . Here, a parameter δ is a
“small” real number and it is typically selected based on the size of signal sampling.
    We compute the trending-up intervals by eliminating non-trending-up time points:

         CabinPressureAux(fcp (x)) ← CabinPressure(x), ¬notTrendUpCP (x)

   Here, the functional symbol fcp is used to create a new signal identifier for each x.
The values of the new signals are the same as originals and they are just copied for each
time-point that is “trending up”:

              value(fcp (x), v) ← CabinPressureAux(fcp (x)), value(x, v)

   To encode the construct duration we also need temporal operators. In particular, we
encode the construct : duration(>, 33sec) with the rule:

             PressureUp(fpu (x)) ←       [0,33s] [0,33s] CabinPressureAux(x)


Intuitively, the temporal operator [0,33s] selects “an event that lasts for the last 33s”,
and the temporal operator [0,33s] selects “an event happens within the last 33s”. The
nesting [0,33s] [0,33s] of these two operators selects the whole duration of all the events
lasting at least 33s. Similarly as above, the value is transferred with the rule:

                   value(fpu (x), v) ← PressureUp(fpu (x)), value(x, v)

     Finally, to encode message firingfrom Example 3 we introduce two propositionals pdl
and ppu for concepts DoorsLocked and PressureUp, respectively. In particular, pdl is true
if there exists a signal in DoorsLocked that has at least one value. And similarly for ppu .
This is encoded with the rules:

           pdl ←    [0,∞) DoorsLocked,             pdl ←    [0,∞) DoorsLocked,

           pup ←     [0,∞) PressureUp,               pup ←     [0,∞) PressureUp


Here, [0,∞) and [0,∞) are used to check if DoorsLocked has at least one signal with
value in the past or in the future, respectively. Then we encode with the firing a message:

                        message(“All car doors OK”) ← pdl , pup

4.1   Full Specification of Encoding into Extended Datalognr MTL
Let (S, K, M) be an SDRL program. We define a corresponding extended Datalognr MTL
program (DS,K , ΠM ) where the temporal facts DS,K encodes S and K, and the program
ΠM encodes expressions M. In particular, for each basic signal s = (os , fs ) in S:

  – if fs (t) = v we add value(s, v)@t to DS,K , and
  – if os is an answer of Q over KB K then we add Q(os )@(−∞, +∞) to DS,K .
We observe that the signals can be encoded as a finite database instance is due to the
assumption that signals are step functions.
     The program ΠM is constructed from M following the encodings in Table 2. The
encoding is obtained by using a unary predicate τC for each signal processing expression
C and a binary predicate value which we describe in the next paragraphs. It is important
to note that these predicates are interpreted not like FO-predicates but using point based
semantics (e.g., τC (o) is true or false for a constant o at a given time point t). For detailed
semantics of such rules see [7].
     More formally, for a signal s = (os , fs ), the fact τC (os ) is true at a time point t iff
(i) o ∈ C I and (ii) fs (t) is a real number. Condition (ii) simplifies the encoding since
we do not need to define when a signal does not have a real value at a point; otherwise we
have to have the rules that encode the absence of a real value. Further, we use functional
symbols, e.g., fC , to generate fresh signal identifier. E.g., for a signal s, fC (os ) represents
a new signal id obtained from s for the expression C.
     To store the value of a signal at a time point we use the predicate value. That is,
value(os , v) is true at point t iff fs (t) = v.
     The encoding rules for trend(up) and trend(down) are based on intervals. For them
we introduce a parameter δ, a “small” real number, that we use to select an interval
around a time point. In theory, such parameter should converge to 0 to indeed check
the trend of a real function (in fact, one needs the first derivative), however, in practice
we expect that one can select such δ a priori (e.g., the length of signal sampling since
signals are step functions) that is sufficiently small to check the trend of a function for a
particular time point.
     It is easy to observe that the extended DatalogMTL program ΣΠ is non-recursive.

5     Formal Properties
In this part we state the formal properties of the encoding. First we introduce two lemmas
that characterize the encodings of auxiliary predicates value, τC ’s and propositional pD ’s.
Then we use them to show the main encoding theorem.
    The following lemma establishes the correspondence between a program and auxiliary
predicates value and τC ’s.

Lemma 1. Let Π = (S, K, M) be an SDRL program and ΣΠ = (DS,K , ΠM ) be
the extended DatalogMTL program as defined above. Further, let I be the canonical
interpretation for Π and let M be the canonical interpretation for ΣΠ . Then, for a signal
processing expression C and a time point t the following are equivalent:

    – sI ∈ C I and fs (t) = v;
    – M, t |= τC (os ) and M, t |= value(os , v).

Proof. The proof (in the both directions of “iff”) is based on induction on the number of
rules that are required to generate expression C starting from basis expressions. We show
direction “⇐”. The opposite one can be shown analogously.
    Induction Base: In this case, C is defined either with C = Q or C = {s1 , . . . , sm }.
    Let us assume C = Q, and sI ∈ C I and fs (t) = v. Since s is a basic signal, DS,K
must contain the fact value(os , v)@t. Moreover K |= Q(os ) hence according to the rule
Expression C           Encoding of C

Q                      τC (x) ← Q(x), value(x, v).
{s1 , . . . , sm }     τC (si ) ← value(si , v), for each si
α ◦ C1 ,               τC (fC (x)), value(fC (x), v) ← τC1 (x), value(x, v 0 ), v = α ◦ v 0 .
C1 : value( , α)       τC (fC (x)), value(fC (x), v) ← τC1 (x), value(x, v), v α.
C1 : duration(≥, t)    τC (fC (x)) ← [0,t] [0,t] τC1 (x).
                       value(fC (x), v) ← τC (fC (x)), value(x, v).
C1 : duration(<, t)    τC (fC (x)) ← τC1 (x), ¬ ( [0,t] ([0,t] τC1 (x)).
                       value(fC (x), v) ← τC (fC (x)), value(x, v).
agg C1                 τC (c), value(c, v) ← v = aggJv1 | value(x, v1 ), τC1 (x)K,
                       where c is a fresh constant, aggJ·K is an aggregation operator over bags
                                                                                                 
C1 : after[t] C2       τC (fC (x1 )) ← τC1 (x1 ) U[0,∞) (¬τC1 (x1 ) ∧ ¬τC2 (x2 )) U[0,t] τC2 (x2 ) .
                       value(fC (x1 ), v) ← τC (fC (x1 )), value(x1 , v)
                                                                                                 
C1 : before[t] C2      τC (fC (x1 )) ← τC1 (x1 ) S[0,∞) (¬τC1 (x1 ) ∧ ¬τC2 (x2 )) S[0,t] τC2 (x2 ) .
                       value(fC (x1 ), v) ← τC (fC (x1 )), value(x1 , v)
                                                                                                  
C1 : within C2         τC (fC (x1 )) ← τC1 (x1 ) ∧ τC2 (x2 ) S[0,∞) (¬τC1 (x1 )) U[0,∞) ¬τC1 (x1 ) .
                       value(fC (x1 ), v) ← τC (fC (x1 )), value(x1 , v).
C1 : trend(up)         τC (fC (x)) ← τC1 (x), ¬ notTrendUpC1 (x)
                       notTrendUpC1 (x) ← τC1 (x), value(x, v1 ), (0,δ] (value(x, v2 ), v1 > v2 )
                                 where δ is a “small enough” positive real number
                       value(fC (x), v) ← τC (fC (x)), value(x, v).
C1 : trend(down)       τC (fC (x)) ← τC1 (x), ¬ notTrendDownC1 (x)
                       notTrendDownC1 (x) ← value(x, v1 ), (0,δ] (value(x, v2 ), v1 < v2 )
                                 where δ is a “small enough” positive real number
                       value(fC (x), v) ← τC (fC (x)), value(x, v).


Boolean D              Encoding of D

D=C                    pD ←    [0,∞) τC (x).     pD ←     [0,∞) τC (x).

D = D1 and D2          pD ← pD1 , pD2 .

D = not D1             pD ← ¬pD1 .

message(m) = D         message(m) ← pD .

     Table 2. The encoding SDRL language into extended DatalogMTL. Each signal processing
     expression in the left column, the corresponding DatalogMTL rules are provided in the right.
     τC (x) ← Q(x), value(x, v) we have that M, t |= τC (os ). Since, value(x, v)@t ∈ DS,K
     we also have that M, t |= value(os , v).
          Assume now that C = {si , . . . , sm } and s = si for some i. Then it must be sI ∈ C I .
     Next, let us assume that fs (t) = v. Since s is a basic concept, we have that value(os , v)@t
     is in DS,K , and thus M, t |= value(os , v). Further, following the encoding rule for C,
     τC (x) ← value(x, v), we have that M, t |= τC (os ).
     Induction Step: Consider now that C is an expression that is created by other
expressions in at most n + 1. For example, let us assume that C ← C1 : duration(≥, t0 ).
Induction step for the other rules can be shown analogously.
     We assume that sI ∈ C I and fs (t) = v for some t and v. Since C is created from
C1 then it must exists s1 such that os = fC (os1 ) and fs1 (t) = v for some interval I that
contains t and is longer t0 . Since, C1 is created in at most n steps by induction hypothesis
we have that M, t |= τC1 (os1 ) and M, t |= value(os1 , v). Now we analyze the encoding
rule τC (fC (x)) ← [0,t0 ] [0,t0 ] τC1 (x). Intuitively, the body of the rule evaluates to
true for some x if there exists a time point in the “past” of t (expressed with condition
                                              0
  [0,t0 ] ) contained in an interval of size t (expressed with condition [0,t0 ] ) such that on
that interval τC1 (x) is true, i.e, M, t |= τC1 (x) for all t00 ∈ I. Since I is such interval
                                          00

for which τC1 (os1 ) is true, we have that encoding rule fires and makes τC (fS (os )) true
at point t, i.e., M, t |= τC (fS (os1 )). Furthermore, from the rule value(fC (x), v) ←
τC (fC (x)), value(x, v) and the fact that M, t |= τC (fS (os1 )), value(s1 , v) it holds that
M, t |= value(fC (os1 ), v). This concludes the proof.
    The following lemma defines a correspondence between a Boolean combinations of
signal processing expressions and their encoding rules.
Lemma 2. Let Π = (S, K, M) be an SDRL program and ΣΠ = (DS,K , ΠM ) be the
extended Datalognr MTL program as defined above. Further, let I be the canonical
interpretation for Π and M be the canonical interpretation for ΣΠ . Then, for a Boolean
combination of signal processing expression D we have that the following is equivalent:

                   DI is true         iff     M, t |= pD for any time point t

Proof. The proof is based on induction on the size of D.
    Induction Base: We assume D = C for complex expression C and assume DI is
true. Then there must exists a signal s such os ∈ C I which has at least one value v at
some time point t. From Lemma 1 we have that M, t |= τC (os ). Thus, from the encoding
rule pD ← τC (x) we have that M, t |= pD .
    Induction Step: We prove induction step for the case D = D1 and D2 . Similarly, it
can be shown in case D = ¬D1 . Assume that D is true in I, then also D1 and D2 are
true. Since D1 and D2 are constructed in less steps then D by induction hypothesis we
have that M, t |= pD1 and M, t |= pD2 . Hence, M, t |= pD .
    For an extended DatalogMTL program Σ, ground atom A and a time point t we
define that Σ |= A@t for the canonical model M of Σ it holds M, t |= A Then, directly
from Lemmas 1 and 2 we have the following theorem.
Theorem 1 (Encoding Theorem). Let Π be a program and r a message rule. Let ΣΠ
the extended DatalogMTL that encodes Π as described above and let the grounded
propositional mr be the head of DatalogMTL rule encoding r. Then:

                   Π fires r    iff         ΣΠ |= mr @t, for any time point t

   It is not hard to show the ideas of Theorem 5 in [7] can be carried to the extended
Datalognr MTL programs and thus preserve computational properties required by OBDA.
Formally, that means that data complexity [5] for the fact-entailment problem is in AC0 .
    The second observation is that we can extend rewriting techniques developed for
Datalognr MTL in [7] that allow us to rewrite our rules into standard SQL. More involving
part of rewriting lies on rewriting algebra of intervals, and for more details we refer [7].
Rewriting that includes functional symbols, negation, aggregation, and built-in arithmetic
can be done straightforwardly.
    Let Σ be an extended Datalognr MTL program, D a set of facts and A an grounded
atom. As usual, Σ, D |= A@t for some time point t holds if for the canonical model M
of Σ ∪ D it holds M, t |= A. The decision problem success is the problem of checking
whether Σ, D |= A@t. We refer to program (resp. data) complexity if all parameters are
fixed except the program (resp. set of facts).
Lemma 3. Success problem for extended Datalognr MTL programs is PSPACE-complete
in combined and program complexity and in AC0 in data complexity.
Proof (Proof Idea). Hardness follows from Theorem 5 in [7]. To show membership it is
sufficient to observe that each derivation in an extended Datalognr MTL program is of
length polynomial in the size of the program. Thus it is in PSPACE.
From Lemma 3 and Theorem 1 we have the following.
Theorem 2. The problem of checking whether a message rule is fired is PSPACE-
complete (it is complete already in size of rule signal expressions), and it is in AC0 in the
size of signal data and ontological data.

6   Related Work
In the Semantic Web community it is common to use rule-based ontological languages
such as SWRL [10], OWL 2 RL [11], SPIN [12], and SHACL Rules [13–15] for analytics.
Compared with SDRL, these languages cannot be directly used in the context of signal
analysis as they do not support reasoning of values over the temporal dimension.
    Works in [16, 17] introduce analytical operations directly into ontological rules
in such a way that OBDA scenario is preserved. They define analytical functions on
concepts, e.g. avg C, in OBDA setting. However, the authors do not consider temporal
dimension of the rules.
    As discussed above, our work is strongly related to the work on well-studied Metric
Temporal Logic [18]. In particular, we use a non-trivial extension of non-recursive
Datalog language Datalognr MTL which is suitable for OBDA scenario.
    Another related direction is real-time processing of signal data streams. In this direc-
tion, most of the work done so far mainly focused on querying RDF stream data. Many
different approaches such as C-SPARQL [19], SPARQLstream [20] and CEQLS [21]
have surfaced in recent years, introducing SPARQL based query processors. Most of
them, apart from C-SPARQL, follow the Data Stream Management Systems (DSMSs)
paradigm and do not provide support for stream reasoning. EP-SPARQL [22] combines
SPARQL with complex event processing features, and includes sequencing and simul-
taneity operators. Unlike the others, LARS [23] is an Answer Set Programming based
framework, which enables reasoning by compiling a Knowledge Base together with a
SPARQL-like query into a more expressive logic program.
Acknowledgments This research is supported by the EPSRC projects MaSI3 , DBOnto,
ED3 ; and by the Free University of Bolzano projects QUEST, OBATS and ROBAST.
References
 1. G. Vachtsevanos, F. L. Lewis, M. Roemer, A. Hess, B. Wu, Intelligent Fault Diagnosis and
    Prognosis for Engineering Systems, Wiley, 2006.
 2. E. Kharlamov, O. Savkovic, G. Xiao, R. Penaloza, G. Mehdi, I. Horrocks, M. Roshchin,
    Semantic rules for machine diagnostics: Execution and management, in: CIKM, 2017.
 3. G. Mehdi, E. Kharlamov, O. Savkovic, G. Xiao, E. G. Kalaycı, S. Brandt, I. Horrocks,
    M. Roshchin, T. A. Runkler, Semantic rule-based equipment diagnostics, in: ISWC, 2017, pp.
    314–333.
 4. G. Mehdi, E. Kharlamov, O. Savkovic, G. Xiao, E. G. Kalaycı, S. Brandt, I. Horrocks,
    M. Roshchin, T. Runkler, Semantic rules for siemens turbines, in: ISWC (Posters and Demos),
    2017.
 5. D. Calvanese, G. De Giacomo, D. Lembo, M. Lenzerini, R. Rosati, Tractable reasoning and
    efficient query answering in description logics: The DL-Lite family, JAR 39 (3).
 6. E. Dantsin, T. Eiter, G. Gottlob, A. Voronkov, Complexity and expressive power of logic
    programming, ACM Comput. Surv. 33 (3).
 7. S. Brandt, E. G. Kalaycı, R. Kontchakov, V. Ryzhikov, G. Xiao, M. Zakharyaschev, Ontology-
    based data access with a horn fragment of metric temporal logic, in: AAAI, 2017.
 8. A. Poggi, D. Lembo, D. Calvanese, G. De Giacomo, M. Lenzerini, R. Rosati, Linking data to
    ontologies, J. Data Semantics 10 (2008) 133–173.
 9. F. Baader, D. Calvanese, D. L. McGuinness, D. Nardi, P. F. Patel-Schneider (Eds.), The De-
    scription Logic Handbook: Theory, Implementation, and Applications, Cambridge University
    Press, New York, NY, USA, 2003.
10. I. Horrocks, P. Patel-Schneider, H. Boley, S. Tabet, B. Grosof, M. Dean, SWRL: A semantic
    web rule language combining OWL and RuleML, W3C Member Submission, World Wide
    Web Consortium (2004).
11. B. Motik, B. C. Grau, I. Horrocks, Z. Wu, A. Fokoue, C. Lutz, Owl 2 web ontology language:
    Profiles, W3C Recommendation, World Wide Web Consortium, http://www.w3.org/TR/owl2-
    profiles/ (2012).
12. H. Knublauch, J. A. Hendler, K. Idehen, Spin - overview and motivation, W3C member
    submission (February 2011).
13. W3C, SHACL advanced features, W3C working group note, W3C (June 2017).
14. J. Corman, J. L. Reutter, O. Savkovic, Semantics and validation of recursive SHACL, in:
    ISWC, 2018.
15. J. Corman, J. L. Reutter, O. Savkovic, Tractable notion of stratification for SHACL, in: ISWC,
    2018.
16. E. Kharlamov, E. Jiménez-Ruiz, D. Zheleznyakov, D. Bilidas, M. Giese, P. Haase, I. Horrocks,
    H. Kllapi, M. Koubarakis, Ö. L. Özçep, M. Rodriguez-Muro, R. Rosati, M. Schmidt, R. Schlatte,
    A. Soylu, A. Waaler, Optique: Towards OBDA systems for industry, in: ESWC Satellite Events,
    2013.
17. E. Kharlamov, N. Solomakhina, Ö. L. Özçep, D. Zheleznyakov, T. Hubauer, S. Lamparter,
    M. Roshchin, A. Soylu, S. Watson, How semantic technologies can enhance data access at
    siemens energy, in: ISWC, 2014, pp. 601–619.
18. R. Koymans, Specifying real-time properties with metric temporal logic, Real-Time Syst.
    2 (4).
19. D. F. Barbieri, D. Braga, S. Ceri, E. D. Valle, M. Grossniklaus, C-SPARQL: a continuous
    query language for RDF data streams, Int. J. Semantic Computing 4 (1) (2010) 3–25.
20. J. Calbimonte, H. Jeung, Ó. Corcho, K. Aberer, Enabling query technologies for the semantic
    sensor web, Int. J. Semantic Web Inf. Syst. 8 (1) (2012) 43–63.
21. D. L. Phuoc, M. Dao-Tran, J. X. Parreira, M. Hauswirth, A native and adaptive approach for
    unified processing of linked streams and linked data, in: ISWC, 2011, pp. 370–388.
22. D. Anicic, P. Fodor, S. Rudolph, N. Stojanovic, EP-SPARQL: a unified language for event
    processing and stream reasoning, in: WWW, 2011, pp. 635–644.
23. H. Beck, M. Dao-Tran, T. Eiter, M. Fink, LARS: A logic-based framework for analyzing
    reasoning over streams, in: AAAI, 2015, pp. 1431–1438.