On the Data Complexity of Ontology-Mediated Queries with MTL Operators over Timed Words S. Kikot1 , V. Ryzhikov2 , P. A. Waª¦ga1,3 , and M. Zakharyaschev2 1 University of Oxford, U.K. 2 Birkbeck, University of London, U.K. 3 University of Warsaw, Poland Abstract. We report on our initial results regarding the data complex- ity of answering atomic queries mediated by ontologies given in a Horn fragment of the metric temporal logic MTL. We adopt the pointwise se- mantics for MTL over dense time. The complexity classes involved are AC 0 , NC1 , L, NL, and P. 1 Introduction Our general concern is detecting events in a complex asynchronous system based on qualitative sensor measurements. More precisely, the system in question has a number of sensors that from time to time and asynchronously send their readings to a database. We may think of such readings as pairs of the form (A, t), where A is a concept name and t a timestamp given as a non-negative nite binary fraction such as 101.011. As a formalism to dene events we use propositional Horn clauses with operators of the metric temporal logic MTL [14,2]. For example, according to Siemens, a gas turbine has a normal stop if the rotor speed coasts down from 1500 to 200, which was preceded by another coast down from 6600 to 1500 some time in the previous 9 minutes, at most 2 minutes before which the main ame was o, while the active power was o earlier within another 2 minutes. The event `normal stop' can be encoded by the following hornMTL rule, where (n,m] ϕ is assumed to hold true at a timestamp t if ϕ holds at some previous timestamp t0 with n < t − t0 ≤ m:  NormalStop ← CoastDown1500to200 ∧ (0,9m] CoastDown6600to1500 ∧  (0,2m] MainFlameOff ∧ (0,2m] ActivePowerOff (the concepts on the right-hand side can be dened by similar rules). The use of hornMTL and datalogMTL ontologies (with both diamond and box operators in rule bodies and only box operators in rule heads) for querying temporal log data was advocated in [9], which also demonstrated experimentally feasibility and eciency of ontology-based data access (OBDA) with nonrecur- sive datalogMTL queries. An extension of the OBDA platform Ontop to support such queries was suggested in [8]. For a recent survey of temporal OBDA we refer the reader to [5]; surveys of early developments in temporal deductive databases are given in [7,10]. Note also that the satisability problem for the description logic ALC extended with MTL operators over (N, ≤) was considered in [12,6]. In this paper, we start investigating the data complexity of answering onto- logy-mediated queries (OMQs) with MTL operators. As the problem turns out to be quite involved, we restrict ourselves to atomic queries and ontologies in the fragment hornMTL− of hornMTL where only past diamond operators are allowed in rule bodies and no temporal operators can be used in rule heads. We distinguish between arbitrary, linear and core (or binary) rules and consider, in particular, the following types of the range % in % : hr, ∞), h0, ri, and [r, r] where h is one of [ or (, while i is one of ] or ). The underlying timeline is (R, ≤) as we cannot assume that sensor readings come at regular time intervals. There are two standard semantics for MTL over (R, ≤): continuous and pointwise ; see, e.g., [16] and references therein. Here, we only consider the latter, under which both model checking and satisability for full MTL are decidable but not primitive recursive [15] (over (Z, ≤), these problems are ExpSpace -complete). As pointed out in [16], under the pointwise semantics, one thinks of atomic propositions in MTL as referring to events (corresponding to state changes) rather than to states themselves. The complexity results we obtain in this paper are collected in the table below, where `range-uniform' means that all the % operators before intensional concept names in a given hornMTL− program have the same range %: rules \ ranges any hr, ∞) h0, ri [r, r] Horn = P ≤ L ≤ L linear = NL in AC 0 ≥ NC 1 ≥ NC1 core ≤ NL in 0 AC (range-uniform) ≤ L 2 The Horn Fragment of MTL We denote the set of nite binary fractionsaka dyadic rational numbersby Q2 ; the set of non-negative dyadic rationals is denoted by Q≥0 2 . As well-known, Q2 is dense in R and, by Cantor's theorem, (Q2 , <) is isomorphic to (Q, <). By an interval, ι, we mean any nonempty subset of the real numbers R of the form [t1 , t2 ], [t1 , t2 ), (t1 , t2 ] or (t1 , t2 ), where t1 , t2 ∈ Q2 ∪ {−∞, ∞} and t1 ≤ t2 , excluding t1 = t2 ∈ {−∞, ∞}. We identify (t, ∞] with (t, ∞), [−∞, t] with (−∞, t], etc. A range, %, is an interval with non-negative endpoints. The metric temporal logic MTL [14,2] is a propositional modal logic with box operators indexed by ranges, say (0,60] , which is interpreted over (R, <) as `at every time instant within the previous minute', its future counterpart (0,60] , and their dual diamond operators (0,60] and (0,60] . In this paper, we only consider a fragment of MTL that is called hornMTL− . A hornMTL− program, Π , is a nite set of rules of the form A ← B1 ∧ · · · ∧ Bk , (1) where k ≥ 1 and the Bi are dened by the grammar B ::= A | > | % B, with A being a concept name or ⊥. As usual, A is called the head of the rule, and B1 ∧ . . . ∧ Bk its body. A hornMTL− program Π is linear if, in each of its rules, at most one of the concepts in the body occurs as a head in Π . A hornMTL− program is core if all of its rules are of the form A ← B or ⊥ ← B1 ∧ B2 . A hornMTL− (ontology-mediated) query takes the form (Π, A(x)). As mentioned in Section 1, we may think of a data instance as a nite set D = {(A1 , t1 ), . . . , (An , tn )}, where the Ai are concept names from some xed alphabet Λ and the ti are (not necessarily ordered) timestamps from Q≥0 2 . When proving some complexity results, we assume that D is given as the FO-structure D = (∆, <, Ω, bitin , bitfr , A1 , . . . , Ap ), (2) in which  ∆ = {0, . . . , `} ⊆ N, where ` is the maximum of the number of distinct timestamps in D and the number of bits in the longest binary fraction in D (excluding the binary point), and < is the usual order on ∆;  Ω ⊆ ∆ is a set of timestamps ; for every n ∈ Ω , we set n̄ = b` · · · b0 .c0 · · · c` such that bitin (i, n, bi ) and bitfr (i, n, ci ) hold, for i = 0, . . . , `;  thus, bitin and bitfr are ternary relations on ∆ such that, for any n ∈ Ω and i ∈ ∆, there is a unique bi ∈ {0, 1} and a unique ci ∈ {0, 1} with bitin (i, n, bi ) and bitfr (i, n, ci );  Ai ⊆ Ω , for i = 1, . . . , p; intuitively, Ai (n) holds i Ai (n̄) ∈ D. For any d ∈ Q≥0 2 , one can readily dene FO-formulas:  distd (x, y) that holds in D i x, y ∈ Ω and x̄ − ȳ > d;  their modications dist≤d (x, y) and dist≥d (x, y). It will be convenient to assume that these predicates are also given by D for the relevant d. In Theorem 5, we also make the following assumption: (ord) Ω = {0, . . . , k}, for some k ≤ `, and n < m ≤ k implies n̄ < m̄. There are two semantics for MTL known as pointwise and continuous [16]. Pointwise semantics. A pointwise interpretation is a structure of the form I = (T, AI1 , AI2 , . . . ), where T 6= ∅ is a nite subset of Q≥0 2 (timestamps) and Ai ⊆ T. We set B = A I I I if B = A, B = T if B = >, B = ∅ if B = ⊥, and I I ( % B)I = {t ∈ T | ∃t0 ∈ B I (t − t0 ∈ %)}. I is a model of a data instance D if t ∈ AI for any (A, t) ∈ D; I is a model of a hornMTL− program Π if, for any rule (1) in Π and any t ∈ T, we have t ∈ AI whenever t ∈ BiI for 1 ≤ i ≤ k . We say that D and Π are consistent if there is a model of D and Π . We also say that a timestamp t from D is a certain answer to a query (Π, A(x)) over D if t ∈ AI , for every model I of D and Π . The query answering problem for (Π, A(x)) is to decide, given a data instance D and a timestamp t in it, whether t is a certain answer to (Π, A(x)) over D. Continuous semantics. The only dierence from the pointwise case is that a continuous interpretation is dened over the reals: I = (R, AI1 , AI2 , . . . ) with AIi ⊆ R and ( % B)I = {t ∈ R | ∃t0 ∈ B I (t − t0 ∈ %)}. To illustrate, suppose that Π = {A ← (0,1] (0,1] B} and D = {(B, 0), (C, 2)}. Then (Π, A) has no answers over D under the former semantics (because 1 is not a timestamp in D), but 2 is an answer under the latter one. In this paper, we only consider the pointwise semantics and leave the contin- uous one for future work. Normal form. A program is in normal form if its rules have one of the forms: P0 ← 0 %01 P1 ∧ · · · ∧ 0 %0` Pm , (3) P0 ← %1 P1 ∧ · · · ∧ % k Pk ∧ 0 %01 P1 ∧ · · · ∧ 0 %0` Pm , (4) where the Pi0 are from the data alphabet Λ, the Pi do not belong to Λ (and so cannot occur in data instances), and 0 ∈/ %i for any i (although there may be, say %0i = [0, 0]). Every hornMTL− program can be transformed to a program in normal form with the same answers. We illustrate this claim by an example. Example 1. Let Π = {P10 ← [0,d] P0 ∧ Q0 , P0 ← (0,e) P1 ∧ [0,f ] Q1 }, where the 0 0 0 0 0 Pi are in Λ. By introducing fresh concept names P0 , P1 , we convert Π to 0 0 0 P1 ← [0,d] P0 ∧ Q0 , P0 ← (0,e) P1 ∧ [0,f ] Q1 , P0 ← P00 , P1 ← P10 . To get rid of [0, d], we further transform the program to P1 ← P0 ∧Q00 , P1 ← 0 (0,d] P0 ∧Q0 , P0 ← 0 0 0 (0,e) P1 ∧ [0,f ] Q1 , P0 ← P0 , P1 ← P1 . Now, P0 in the rst rule is not in the scope of % (Q00 can be regarded as a shorthand for [0,0] Q00 ). So we transform the rule using obvious derivations to obtain the following program in normal form: P1 ← P00 ∧ Q00 , P1 ← (0,e) P1 ∧ 0 0 [0,f ] Q1 ∧ Q0 , P1 ← 0 (0,d] P0 ∧ Q0 , 0 0 0 P0 ← (0,e) P1 ∧ [0,f ] Q1 , P0 ← P0 , P1 ← P1 . A hornMTL− query (Π, A(x)) is in normal form if Π is in normal form and A∈/ Λ. Clearly, every query can be converted to a one in normal form and having the same answers. 3 The Data Complexity of Answering hornMTL− Queries It is not hard to see that consistency checking and answering hornMTL− queries can be reduced to consistency checking and answering monadic datalog queries over D. For example, the rule A ← (r,s] B can be replaced with the monadic datalog rule A(x) ← B(y) ∧ dist>r (x, y) ∧ dist≤s (x, y), where dist>r and dist≤s are extensional predicates given by the data instance. It follows that consistency checking and answering hornMTL− queries can be done in polynomial time for data complexity; for linear hornMTL− queries, this can be done in NL. The next theorem establishes a matching lower bound, which is in sharp contrast to hornLTL queries that are in NC1 for data complexity [4]. Theorem 1. (i) Consistency checking and answering hornMTL− queries is P- complete for data complexity. (ii) Consistency checking and answering linear hornMTL− queries is NL- complete for data complexity. Proof. We only show the lower bound in (ii); the construction in (i) is similar. The proof is by reduction of the reachability problem in acyclic digraphs. Let G = (V, E) be such a digraph with a set V = {v0 , . . . , vn−1 } of vertices and a set E ⊆ V × V of edges such that whenever (vi , vj ) ∈ E then i < j (we can always represent G in this way due to its acyclicity). Suppose that the task is to check whether vt ∈ V is reachable from vs ∈ V in G. We construct a data instance DG with concepts V , Er , El , I , O, which encodes G on a linear order. Namely, DG contains the following pairs (i, j, k ∈ N):  (V, 2k + 2in ), for 0 ≤ k < n and 1 ≤ i ≤ n; 2n ), for (vi , vj ) ∈ E ;  (Er , 2i + i+1  (El , 2i + j+1 2n ), for (vi , vj ) ∈ E ;  (I, 2k + i+1 2n ), for 0 ≤ k < n and vs = vi ;  (O, 2k + i+12n ), for 0 ≤ k < n and vt = vi . An example of a graph and the corresponding data instance are shown below: • v1 G: • • vs = v0 v3 = vt • v2 by Π : R R R R R RR R RR I O I O I O I O DG : Er El Er El Er El VVVV VVVV VVVV VVVV 1 2 3 4 33 34 35 36 65 66 67 68 97 98 99100 0 16 16 16 16 1 2 16 16 16 16 3 4 16 16 16 16 5 6 16 16 16 16 7 8 V: v0v1v2v3 v0v1v2v3 v0v1v2v3 v0v1v2v3 edges from v0 edges from v1 edges from v2 edges from v3 Let Π be a linear hornMTL− program with the following rules: R ← I, R ← El ∧ [0,1] (Er ∧ R), R←V ∧ [2,2] R, P ← R ∧ O, ⊥ ← P. Note that all numbers occurring in DG and Π belong to Q2 . For the atoms im- plied by Π , see the previous picture. One can show that Π and DG are consistent i vt is not reachable from vs . 4 hornMTL− Queries with hr,∞) In this section, we show that if all temporal operators in a hornMTL− program Π are of the form hr,∞) , then answering (Π, A(x)) can be done in 0 AC ; in other words, (Π, A(x)) is FO-rewritable. To this end, we require the canonical models for hornMTL− programs, which can be dened as follows. Suppose we are given a hornMTL− program Π and a data instance D. Dene a set CΠ,D of pairs of the form (B, t) that contains all answers to queries with Π over D. We start by setting C = D and denote by cl(C) the result of applying exhaustively and non-recursively the following rules to C : (horn) if A ← B1 ∧ . . . ∧ Bk is in Π and (Bi , t) ∈ C , for i = 1, . . . , k , then we add (A, t) to C ; ( % ) if % B occurs in Π , (B, t0 ) ∈ C , and t − t0 ∈ %, for a timestamp t in D, then we add ( % B, t) to C . It should be clear that there is some N < ω (polynomially depending on Π and D) such that clN (C) = clN +1 (C). We then set CΠ,D = clN (D). Theorem 2. A timestamp t from D is a certain answer to (Π, A(x)) over D i (A, t) ∈ CΠ,D . As known from [3], if we use LTL diamonds in place of the MTL ones in hornMTL− programs, then all such queries are FO-rewritable and in AC0 for data complexity. In fact, almost the same argument shows the following: Theorem 3. Consistency checking and answering hornMTL− queries with tem- poral operators of the form hr,∞) are in AC0 for data complexity. Proof. Let (Π, A(x)) be a hornMTL− query with temporal operators of the form p hr,∞) . It is easy to see that constructing CΠ,D needs at most |Π| applications 2 of the cl operator to D (because each rule ( hr,∞) ) needs to be applied at most once). Thus, we can construct an FO-rewriting of (Π, A(x)) using iteratively the standard FO-translation of, say, [r,∞) B into ∃t0 (dist≥r (t, t0 ) ∧ B(t0 )). Example 2. An FO-rewriting for the query (Π, A(x)) with the program Π com- prising two rules A ← [2,∞) C , C ← [1,∞) A looks as follows:    ∃y A(x) ∨ C(y) ∧ dist≥2 (x, y) ∨ A(y) ∧ dist≥3 (x, y) . When the ranges % in % are dierent from hr, ∞), the technique above does not work any more and the complexity landscape changes signicantly. 5 Metric Automata for Linear hornMTL− Our technical tool for studying the data complexity of linear hornMTL− queries is automata with metric constraints that are dened for programs in normal form. These automata can be viewed as a primitive version of standard timed automata for MTL [1] as we only have one clock c, the clock reset c := 0 happens at every transition, and the clock constraints are of the simple form c ∈ %. A (nondeterministic) metric automaton is a quadruple A = (S, S0 , Σ, δ), where S 6= ∅ is a set of states, Σ a tape alphabet, δ a transition relation, and S0 is a nonempty set of pairs of the form (q, e), where e ∈ Σ , q ∈ S . The transition % relation δ is a set of instructions of the form q − →e q 0 with q, q 0 ∈ S , e ∈ Σ and a range %. The automaton A takes as input timed words σ = (e0 , t0 ), . . . , (en , tn ), where the ti are timestamps with ti−1 < ti . A run over σ is a sequence q0 , . . . , qm %i such that (q0 , e0 ) ∈ S0 , qi−1 −→ei qi is in δ and ti − ti−1 ∈ %i , for 0 < i ≤ n. Let Π be a linear hornMTL− program in normal form. We denote the con- junctions %01 P10 ∧ . . . ∧ %0` Pm0 (with data concept names Pi0 ) that occur in Π by ε, possibly with subscripts. Thus, since Π is linear, rules (4) in Π are of the form P ← ε ∧ % Q. Let EΠ = {ε1 , . . . , εq } be the set of all such ε occurring in Π . We dene a metric automaton AΠ for Π as follows. The set S of its states comprises the head concept names in Π , and Σ = 2EΠ . The transition relation % δ comprises Q − →E P such that P ← ε ∧ % Q is in Π and ε ∈ E . Finally, S0 is the set of all pairs (P, ε) such that a rule P ← ε of the form (3) is in Π . Example 3. For Π = {P0 ← 0 [0,1] P0 , P1 ← (1,2) P0 ∧ P10 , P0 ← (1,3) P1 }, the metric automaton AΠ is depicted below, where P00 , P10 ∈ Λ, E0 = {P10 }, E1 = { [0,1] P00 }, E2 = {P10 , [0,1] P00 }, and S0 = {(P0 , [0,1] P00 )}. E2 (1, 2) E0 (1, 2) P0 P1 ∅ (1, 3) E0 (1, 3) E1 (1, 3) E2 (1, 3) We represent any data instance D as a timed word σD . For ti occurring in D, let E(ti ) be the maximal set of ε from Π that hold at ti in D, and let σD = (E(t1 ), t1 ), . . . , (E(tn ), tn ) . The corresponding FO-structure from Section 2 can take the form σD = (∆, <, Ω, bitin , bitfr , E1 , . . . , Ek ), (5) where {E1 , . . . , Ek } = 2EΠ . It is not hard to see that there is an FO-translation of (2) to (5). Example 4. A data instance D and its representation as σD are shown below: 0 1 1.5 4 4.5 5 6.5 D: P00 Q0 P10 P00 P10 P10 Q0 σD : ∅ E1 E0 ∅ E2 E2 ∅ Theorem 4. For any linear hornMTL− query (Π, A(x)), a timestamp ti is a certain answer over a data instance D i there exist a subword σD0 of σD with the last timestamp ti and a run of AΠ over σD0 that ends with A. Example 5. Let (Π, P1 (x)) be the query with Π from Example 3. Then, for σD from Example 4, we have the run P0 , P1 , P0 , P1 on 3 (E1 , 1), (E0 , ), (∅, 4), (E2 , 5), 2 and so 5 is a certain answer to the query over D from Example 4. One could dene metric automata as classical timed automata; however, Theorem 4 does not use them in the standard way as it requires runs on subwords. Whether and how such runs can be captured by timed automata remains to be claried. We now use the obtained automaton characterisation of certain answers to linear queries to give better complexity bounds for two classes of linear programs with restricted temporal ranges than the NL bound of Theorem 1 (ii). 6 hornMTL− Queries with h0,ri We say that a hornMTL− program Π in normal form is range-uniform if every (intensional) concept name P ∈/ Λ occurs in Π in the scope of % , for some xed range %. By a coreMTL− program we mean a core hornMTL− program in normal form. Rules (4) in such a program take the form P ← % Q. 6.1 Range-uniform coreMTL− queries with h0,ri Let Π be a range-uniform coreMTL− program and AΠ = (S, S0 , Σ, δ) the cor- responding metric automaton. For each (q0 , e0 ) in S0 , we dene a classical nite automaton AAq0 ,e0 = (S, Σ, {q0 }, δ, {A}), where S, Σ , and δ are as in AΠ (note h0,ri that all transitions take the form q −−−→∅ q 0 ), and q0 and A are unique initial and nal states, respectively. Thus, AA q0 ,e0 is a unary (i.e., over singleton alphabet) automaton. It is known that each such automaton has an equivalent automaton in normal form [11,17], where cycles can be only disjoint. More precisely, there is a number of arithmetic progressions aS i + bi N = {ai + bi · m | m ∈ N} such that a word ∅n is accepted by AA q0 ,e0 i n ∈ i ai + bi N. This characterisation allows us to obtain the following: Theorem 5. Consistency checking and answering range-uniform coreMTL− que- ries with temporal operators of the form h0,ri are in AC0 for data complexity provided that the input data instances satisfy (ord). Example 6. To illustrate the theorem, consider the query (Π, S1 (x)) with Π = {S0 ← B, S1 ← (0,d) S0 , S2 ← (0,d) S1 , S3 ← (0,d) S2 , S1 ← (0,d) S3 }. The automaton AS0 ,B (which is in normal form) is shown in the picture below on the right. Using it, we construct the following FO-rewriting ϕ(x) of (Π, S1 (x)): ϕ(x) = ∃x0 B(x0 ) ∧ ∀y (x0 < y ≤ x) → ∃y 0 dist 1));  ϕ+k (z, x0 ) = dist