On Decidability and Tractability of Querying in Temporal EL Vı́ctor Gutiérrez-Basulto1 , Jean Christoph Jung1 , and Roman Kontchakov2 1 Department of Computer Science, Universität Bremen, Germany 2 Dept. of Computer Science and Information Systems, Birkbeck, University of London, UK Abstract. We study access to temporal data with TEL, a temporal extension of the tractable description logic EL. Our aim is to establish a clear computational complexity landscape for the atomic query answering problem, in terms of both data and combined complexity. Atomic queries in full TEL turn out to be unde- cidable even in data complexity. Motivated by the negative result, we identify well-behaved yet expressive fragments of TEL. Our main contributions are a se- mantic and sufficient syntactic conditions for decidability and three orthogonal tractable fragments, which are based on restricted use of rigid roles, temporal operators, and novel acyclicity conditions on the ontologies. 1 Introduction Due to the increasing need to account for the temporal dimension of data available on the Web [30, 16], the DL community has recently investigated extensions of the ontology- based data access (OBDA) paradigm for temporal data. The initial efforts concentrated on temporal query languages with atemporal ontologies [22, 25, 5, 9, 10], but some applications, such as managing data from sensor networks, require temporal aspects in conceptual modelling; hence, there is a need for temporal ontology languages [1]. In this line, the research has focused on temporal extensions of DL-Lite that support rewritability of temporal queries into monadic second-order logic with order or two- sorted first-order logic with < and + [4, 1]. Standard relational databases have such built-in predicates and so, in principle, can evaluate FO(<, +)-rewritings. However, no temporal extensions of other DLs have been investigated in the context of OBDA, partly due to intractability and often even undecidability of the standard reasoning tasks [2, 19, 20]. On the other hand, temporal data has also been studied in database theory [15]. In their seminal paper, Chomicki and Imielinski [13] identified DATALOG1S as a decidable extension of DATALOG with one successor function. We make the first (to the best of our knowledge) attempt to link temporal OBDA with temporal deductive databases [12, 7]. In this paper, we study TEL, a temporal extension of EL [6]. The underlying DL component, EL, underpins the OWL 2 EL profile of OWL 2 and the medical ontology S NOMED CT, which provides the vocabulary for electronic health records (EHRs). Indeed, applications managing EHRs must be able to provide information, e.g., on when and for how long some drug has been prescribed to a patient, so that drugs that interact adversely are not prescribed at the same time. Clinical trials [31, 29] also require a unified conceptual model for specifying temporal constraints of protocol entities such as ‘a viable participant should have had a vaccination with live virus 5 days ago’ or ‘blood tests of a patient should be run every 3 days’. These statements can be expressed in TEL: 5 Patient u P ∃vaccinated.LiveVirus v ViableParticipant, (1) 3 Patient u P RequiresBloodTest v RequiresBloodTest. (2) Our main objective is to establish the limits of decidability and tractability of the query answering problem over TEL ontologies, in terms of both data and combined complexity. In order to set the foundations, we focus on temporal atomic queries. On the one hand, an atomic query like ViableParticipant(x, t) together with the temporal concept inclusion (1) effectively encodes a tree-shaped temporal conjunctive query. On the other hand, using (1) to extend the vocabulary with the concept ViableParticipant is closer to the spirit of the OBDA paradigm than repeating the same conjunction in similar user queries. Moreover, the recurrent pattern RequiresBloodTest is expressible as an atomic query RequiresBloodTest(x, t) with the temporal concept inclusion (2) but not expressible as a query without temporal concept inclusions such as (2). As we shall see, even for the atomic queries rather surprising (and challenging) results are obtained. Our main contributions are complexity bounds, algorithms, and rewritability into DATALOG1S for atomic query answering in fragments of TEL. Since query answering over full TEL turns out to be undecidable even in data complexity, we investigate its frag- ments to attain decidability and tractability. First, for TEL◦ , which allows only the ‘next-’ F and ‘previous-time’ P operators, we identify ultimate periodicity as a natural seman- tic condition ensuring decidability, more precisely, PS PACE data complexity (the question of decidability of the full TEL◦ is left open for future work). Then, we identify a number of fragments with better computational properties. (a) For the fragment of TEL◦ without rigid (not changing over time) roles on the right-hand side of concept inclusions, we con- struct a polynomial rewriting into DATALOG1S , and so, establish PS PACE-completeness for data complexity. This fragment contains all EL ontologies as well as both (1) and (2). (b) Over temporally acyclic TEL◦ -ontologies (with rigid roles), query answering is PT IME-complete in both data and combined complexity. This tractable fragment con- tains (1) and fully captures all atemporal EL ontologies and may prove particularly useful in applications; it, however, does not contain (2). (c) Query answering over DL-acyclic TEL◦ -ontologies is NC1 -complete for data complexity (in principle, highly paralleliz- able). This fragment contains all acyclic EL ontologies as well as both (1) and (2) (a large part of S NOMED CT is in fact acyclic). We remark that our two novel acyclicity condi- tions (each constraining only one dimension) relax the ‘traditional’ notion of acyclicity in (temporal) DLs [23, 21]. Finally, we show that the language with only 3P and 3F (sometime in the past/future) on the left-hand side of concept inclusions enjoys PT IME query answering. All proofs can be found at http://tinyurl.com/TempEL16. 2 Preliminaries We begin by introducing TEL, a temporal extension of the classical DL EL. Let NC , NR , NI be countably infinite sets of concept, role and individual names, respectively. We assume that NR is partitioned into two infinite sets, Nrig loc R and NR , of rigid and local role names, respectively. TEL-concepts are defined by the following grammar: C, D ::= A | C u D | ∃r.C | ∗ C | 3∗ C, where A ∈ NC , r ∈ NR , and ∗ ∈ {F, P }. A TEL-TBox is a finite set of concept inclusions (CIs) C v D and concept definitions (CDs) C ≡ D, for TEL-concepts C, D. Data is given in terms of temporal ABoxes A, which are finite sets of assertions of the form A(a, n) and r(a, b, n), where A ∈ NC , r ∈ NR , a, b ∈ NI , and n ∈ Z. We denote by ind(A) the set of individual names occurring in A, and by tem(A) the set {n ∈ Z | min A ≤ n ≤ max A}, where min A and max A are, respectively, the minimal and maximal time points in A. The size, |T | and |A|, of T and A is the number of symbols required to write T and A, resp., with time points n ∈ Z encoded in unary. An interpretation I is a structure (∆I , (In )n∈Z ), where each In is a classical DL interpretation with domain ∆I : we have AIn ⊆ ∆I and rIn ⊆ ∆I ×∆I . Rigid roles r ∈ Nrig R do not change their interpretation in time: r In = rI0 for all n ∈ Z. We usually I,n I,n In In write A and r instead of A and r , respectively, and extend ·I,n as follows: (C u D)I,n = C I,n ∩ DI,n , (∃r.C)I,n = d | there is e ∈ C I,n with (d, e) ∈ rI,n ,  ( ∗ C)I,n = C I,n op∗ 1 , (3∗ C)I,n = d | d ∈ C I,n op∗ k for some k > 0 ,  where op∗ stands for + if ∗ = F and for − if ∗ = P . Although we use strict 3∗ , our results do not depend on the choice. TBoxes are interpreted globally: an interpretation I is a model of C v D, written I |= C v D, if C I,n ⊆ DI,n , for all n ∈ Z; and a model of C ≡ D if C I,n = DI,n , for all n ∈ Z. We call I a model of a TBox T , written I |= T , if I |= α for all α ∈ T . For ABoxes A we adopt the standard name assumption: aI,n = a for all a ∈ ind(A), n ∈ Z; thus, ind(A) ⊆ ∆I . The relation |= is extended to ABoxes: I |= A(a, n) iff a ∈ AI,n and I |= r(a, b, n) iff (a, b) ∈ rI,n ; then, I |= A if I |= α for all α ∈ A. An interpretation I is a model of a temporal knowledge base (KB) K = (T , A), written I |= K, if I |= T and I |= A. Finally, K |= A(a, n) if I |= A(a, n) in every I |= K. A temporal atomic query (TAQ) is of the form A(x, t), where A ∈ NC , x an indi- vidual variable and t a temporal variable. A certain answer to A(x, t) over (T , A) is a pair (a, n) ∈ ind(A) × tem(A) with (T , A) |= A(a, n). We study the problem of TAQ answering: Input: TBox T , ABox A, TAQ A(x, t) and a pair (a, n). Question: Is (a, n) a certain answer to A(x, t) over (T , A)? Our results concern both the combined and data complexity of the problem: for data complexity, the TBox is fixed. As usual, for a complexity class C and a class X of TBoxes, we say that TAQ answering over X is C-hard in data complexity if there is some T ∈ X such that answering TAQs over T is C-hard. Conversely, TAQ answering over X is in C in data complexity if answering TAQs over T is in C for all T ∈ X . As classes X , we will in particular look at full TEL and its fragments TEL3 and TEL◦ , in which, respectively, only the temporal operators 3∗ and ∗ are allowed. Note that 3∗ on the left-hand side (and 2∗ with the usual semantics on the right-hand side) of CIs can be expressed in TEL◦ , e.g., instead of 3P A v C or, equivalently, A v 2F C, take A v A0 and P A0 v A0 u C, for a fresh concept name A0 . Observe also that rigid concepts, which do not change their interpretation in time, can be expressed in these two fragments using 3P 3F C v C and F C ≡ C, respectively. 3 Query Answering in TEL: Undecidability We first pinpoint different sources of complexity for the query answering problem in TEL in order to identify computationally well-behaved fragments later. We begin by showing that TAQ answering over TEL3 is undecidable. The known undecidability of subsumption in TEL3 [2] translates only into the combined complexity of TAQ answering. We strengthen the result to obtain undecidability in data complexity by reducing the halting problem for the universal Turing machine. We exploit the crucial observation that disjunction, although not in the syntax, can be simulated with 3∗ [2]. Theorem 1. TAQ answering over TEL3 is undecidable in data complexity. The proof can also be adapted to the non-strict semantics of 3∗ using the chessboard technique [17]. Next, we show that over TEL◦ — although it is not capable of expressing disjunction — TAQ answering is hard. Theorem 2. TAQ answering over TEL◦ is non-elementary in combined complexity and PS PACE-hard in data complexity. The proof of PS PACE-hardness is close in spirit to that for DATALOG1S [13]; we only remark that the lower bound holds even for the sublanguage of TEL◦ without ∃r.C on the right-hand side of CIs. For the non-elementary lower bound, we take inspiration in the construction for the product modal logic LTL×K [17, Theorem 6.34]. Our proof requires a careful implementation of the yardstick technique [33] with only Horn formulas. Decidability of TAQ answering over full TEL◦ is left open as interesting and chal- lenging future work; more insights on the difficulty of the problem are given in Section 4. Nevertheless, we show that extending TEL◦ with certain DL constructs that are harmless for data complexity of atemporal query answering [26] immediately leads to undecid- ability. Let TELI ◦ and TELF ◦ be the extensions of TEL◦ with inverse roles r− and functionality axioms func(r), respectively.1 For both languages, we reduce the halting problem for the universal Turing machine to prove: Theorem 3. TAQ answering over TELI ◦ and TELF ◦ is undecidable in data complexity. 4 Foundations of Query Answering in TEL◦ In the rest of the paper, we study decidability and complexity of TAQ answering in various fragments of TEL◦ and TEL3 . To this end, we first lay the groundwork for the development of algorithms for query answering in those fragments by introducing canonical quasimodels, which are succinct abstract representations of the universal models of the KBs, see also [4, 1]. They can also be viewed as a generalization of the canonical structures used for query answering in atemporal EL [27]. We assume that TEL◦ -TBoxes are in normal form: they consist of CIs of the form A u A0 v B, A v ∃r.B, X v A, 0 where A, A , B ∈ NC and X is a basic concept of the form A, ∗ A, or ∃r.A, for A ∈ NC . Observe that, without loss of generality, ∗ is restricted to the left-hand side of CIs: e.g., 1 with the usual semantics: (r− )I,n = {(e, d) | (d, e) ∈ rI,n } and I |= func(r) iff e1 = e2 , for all (d, e1 ), (d, e2 ) ∈ rI,n and n ∈ Z. A v F B is equivalent to P A v B. It is routine to show that every TEL◦ -TBox can be transformed into the normal form by introducing fresh concept names; see, e.g., [6]. Fix a KB (T , A) with a TEL◦ -TBox T in normal form. Let CN be the set of concept names in (T , A). A map π : Z → 2CN is a trace for T if it satisfies the following: (t1) if A u A0 v B ∈ T and A, A0 ∈ π(n), then B ∈ π(n); (t2) if ∗ A v B ∈ T and A ∈ π(n), then B ∈ π(n op∗ 1). Traces are the building blocks of quasimodels and are used to represent the temporal evolution of individual domain elements. For example, for T = { P C v B, P B v C}, the map π such that π(i) = {B} for odd i and π(i) = {C} for even i is a trace for T . In order to describe interactions of domain elements, we require more notation. Let π be a trace for T . For a rigid role r ∈ Nrig R , the r-projection of π is a map projr (π) : Z → 2CN that sends each i ∈ Z to {A | ∃r.B v A ∈ T , B ∈ π(i)}; for a local role r ∈ Nloc R , projr (π) is defined in the same way on 0 but is ∅ for all other i ∈ Z. Given a map % : Z → 2CN and n ∈ Z, we say that π contains the n-shift of % and write % ⊆n π if %(i − n) ⊆ π(i), for all i ∈ Z. For example, let T = {∃r.B v B 0 } with rigid role r. In the picture below, trace πa contains the 1-shift of the r-projection of πB : B C B C B πB B0 B0 B0 projr (πB ) B0 B0 B0 πa -1 0 1 A 2 3 4 Note that, if r were local then πa would have to contain B 0 only at 1 (but not at 3, etc.). We are now fully equipped to define quasimodels. Henceforth, let D = ind(A) ∪ CN. A quasimodel Q for (T , A) is a set of traces πd for T (d ∈ D) such that (q1) A ∈ πa (n), for all A(a, n) ∈ A; (q2) B ∈ πB (0), for all B ∈ CN; (q3) projr (πb ) ⊆0 πa , for all r(a, b, n) ∈ A; (q4) if A ∈ πd (n) then projr (πB ) ⊆n πd , for all d ∈ D, n ∈ Z and A v ∃r.B in T . Intuitively, quasimodels represent models of (T , A): each πa stands for the ABox in- dividual a; each πB , on the other hand, represents all individuals that witness B for CIs A v ∃r.B in T . The latter is, in fact, the crucial abstraction underlying quasi- models. Note that traces πB are normalized: B occurs at time point 0, which has to be compensated by the shift operation in (q4). For example, in the picture above, if A v ∃r.B ∈ T then, in any model, a has an r-successor that belongs to B at moment 1. Such a successor can be obtained as a ‘copy’ of trace πB shifted by 1 so that its origin, 0, matches moment 1 for a. Then, by (q4), a belongs to B 0 at all odd moments. For the purposes of query answering we need to identify canonical (minimal) quasi- models. We define the canonical quasimodel as the limit of the following saturation (chase-like) procedure. Start with initially empty maps πd , for d ∈ D, and apply (t1)– (t2), (q1)–(q4) as rules: (q3), for example, says ‘if r(a, b, n) ∈ A and A ∈ projr (πb )(i), then add A to πa (i).’ Then we have the following characterization: Theorem 4. Let T be a TEL◦ -TBox and Q = {πd | d ∈ D} the canonical quasimodel of (T , A). Then, (T , A) |= A(a, i) iff A ∈ πa (i), for any A ∈ CN, a ∈ ind(A), i ∈ Z. The procedure for constructing the canonical quasimodel deals with infinite data struc- tures (traces) and is generally not terminating. So, although Theorem 4 provides a criterion for certain answers, it does not immediately yield a decision algorithm for full TEL◦ . We remark that known techniques for dealing with such infinite structures cannot be easily applied: for example, MSO (over Z), a standard tool for decidability proofs in temporal DLs [17], is not sufficient to encode the canonical quasimodel directly because (q4) requires +. In fact, the key to showing decidability for (fragments of) TEL◦ is finding a finite representation of traces. The starting point of the rest of the paper is a semantic condition on the canonical quasimodel, ultimate periodicity, which ensures decidability, at least in data complexity. Let T be a TEL◦ -TBox and Q the canonical quasimodel for (T , ∅). We say that T is ultimately periodic, if there is p ∈ N such that all πB , B ∈ CN, in Q are ultimately p-periodic, that is, for each B ∈ CN, there are positive integers mP , pP , mF , pF ≤ p satisfying the following conditions: πB (n − pP ) = πB (n), for all n ≤ −mP , πB (n + pF ) = πB (n), for all n ≥ mF . Intuitively, an ultimately p-periodic trace has repeating sections on the left and on the right: −mP −2pP −mP −pP −mP 0 mF mF +pF mF +2pF The condition of ultimate periodicity is rather natural. On the practical side, it is motivated by applications with recurrent patterns such as health care support [31], see concept inclusions (1) and (2) in Section 1. From the theoretical point of view, any satisfiable LTL formula has an ultimately periodic model [28]. Theorem 5. TAQ answering over ultimately periodic TEL◦ -TBoxes is PS PACE-complete in data complexity. PS PACE-hardness follows from (the proof of) Theorem 2. We prove the matching upper bound by rewriting an ultimately periodic TEL◦ -TBox T into DATALOG1S [13]. First, we take temporal rules reflecting rigid roles and standard EL concept inclusions: r(x, y, t ± 1) ← r(x, y, t), for r ∈ Nrig R in T , 0 B(x, t) ← A(x, t), A (x, t), for A u A0 v B in T , B(x, t) ← r(x, y, t), A(y, t), for ∃r.A v B in T . Second, we observe that, for any trace πa in the canonical quasimodel Q of any (T , A), if A ∈ πa (n) and A v ∃r.B ∈ T then, by (q4), πa contains the n-shift not only of projr (πB ) but also of πA . Since T is ultimately periodic, for each trace πB , we fix integers mP , pP , mF , pF and take the following rules with a fresh predicate FB : A(x, t + i) ← B(x, t), for 0 ≤ i < mF and A ∈ πB (i), A(x, t + i) ← FB (x, t), for 0 ≤ i < pF and A ∈ πB (mF + i), FB (x, t + mF ) ← B(x, t) and FB (x, t + pF ) ← FB (x, t), and symmetric rules with mP , pP and fresh PB . Intuitively, the rules in the first line replicate the (irregular) part of πB from 0 to mF . The last two rules add recurring markers FB at the start of each period while the rules in the second line replicate the period of πB starting from each marker FB . The required DATALOG1S -program ΠT contains all the rules above (note that CIs of the form ∗ A v B are also covered by the rules for traces πB ). Using the canonical quasimodel and Theorem 4, it is readily seen that ΠT is equivalent to T : for every temporal ABox A, the answers to ΠT over A coincide with the certain answers to (T , A). Theorem 5 follows from the PS PACE data complexity in DATALOG1S [13] and independence of ΠT from A. Observe that Theorem 5 does not imply decidability of full TEL◦ , since it is open whether every TEL◦ -TBox is ultimately periodic. We thus turn our attention to sufficient syntactic conditions for ultimate periodicity and obtain tight complexity bounds for both data and combined complexity for the resulting fragments. We consider two types of conditions: restricted use of rigid roles and acyclicity of concept inclusions. 5 Restricted Use of Rigid Roles We consider TEL◦ ◦ loc , the restriction of TEL in which only local roles are allowed. Due to the reduced interaction between temporal and DL component, we obtain data tractability. Theorem 6. TAQ answering over TEL◦ loc is PS PACE -complete in combined and PT IME - complete in data complexity. Lower bounds follow from PS PACE- and PT IME-hardness of entailment in Horn- LTL [11] and EL, respectively. For the upper bounds, let (T , A) be a KB with a TEL◦ loc -TBox and Q = {πd | d ∈ D} its canonical quasimodel. We take a propo- sition PA,d for each A ∈ CN and d ∈ D and construct a Horn-LTL formula ϕT ,A whose minimal model is isomorphic to Q: variable PA,d is true in the model at moment n just in case A ∈ πd (n). We take the conjunction of the following formulas, for d ∈ D: 2(PA,d ∧ PA0 ,d → PB,d ), for A u A0 v B in T , 2( ∗ PA,d → PB,d ), for ∗ A v B in T , n PA,a , for A(a, n) ∈ A, PB,B , for B ∈ CN, n n PB,b → PA,a , for r(a, b, n) ∈ A and ∃r.B v A in T , PB 0 ,B → 2(PA,d → PA0 ,d ), for A v ∃r.B and ∃r.B 0 v A0 in T , where n is Fn if n ≥ 0 and P−n if n < 0 and 2 is the ‘globally’ operator. It is readily verified that ϕT ,A is as required (crucially, (q4) for local roles boils down to the last formula above). Since entailment in LTL is in PS PACE [32] and ϕT ,A is polynomial in the size of (T , A), we obtain membership in PS PACE for combined complexity. For PT IME data complexity, observe that traces πB , for B ∈ CN, are ultimately 2|T | - periodic because they are traces of the canonical quasimodel for (T , ∅); so, they can be stored in constant space. Next, traces πa , a ∈ ind(A), are ultimately 2|T |+|A| -periodic, but a closer inspection reveals that the middle irregular section, mP + mF , is bounded by |A| + 2|T | , while both periods, pP and pF , by 2|T | ; see [3, Lemma 3]. So, Q requires space bounded by a polynomial in |A|. Since each rule application extends the traces, the saturation procedure constructing Q terminates in polynomial time in |A|. Since TBoxes without rigid roles at all may be too restrictive for applications, we consider TEL◦ l-rig -TBoxes: rigid roles are allowed only in CIs of the form ∃r.B v A. In the following theorem, the lower bound follows from (the proof of) Theorem 2; for the upper bounds, we construct rewritings into DATALOG1S , similarly to ΠT in Section 4. Theorem 7. TAQ answering over TEL◦ l-rig is PS PACE -complete in data complexity and in E XP T IME in combined complexity. 6 Acyclicity Conditions It is known that acyclicity conditions often lead to better complexity. For example, acyclic TBoxes are a way of obtaining CTL-based temporal extensions of EL that have rigid roles and enjoy PT IME subsumption [21]. In DATALOG1S , a restriction on recursion has also been used to attain tractability [12]. From the application point of view, large parts of S NOMED CT and GO [18] are indeed acyclic. So, we believe that the fragments we consider below are well-suited for temporal extensions of such ontologies. Acyclic TBoxes are finite sets of CDs A ≡ C, A ∈ NC , such that no two CDs have the same left-hand side, and there are no CDs A1 ≡ C1 , . . . , Ak ≡ Ck in T such that Ai+1 occurs in Ci , for all 1 ≤ i ≤ k (where Ak+1 := A1 ). We say A is defined in T if A ≡ C ∈ T and primitive otherwise. We obtain the following basic tractability result. Theorem 8. TAQ answering over acyclic TEL◦ is in L OG T IME-uniform AC0 in data complexity and in PT IME in combined complexity. The PT IME upper bound in combined complexity is subsumed by Theorem 10 below. We establish the L OG T IME-uniform AC0 upper bound by rewriting into FO(+). More precisely, for a given TAQ A(x, t) and TBox T , we construct a two-sorted first-order formula ϕT ,A (x, t) with functions ±1 on temporal terms such that (T , A) |= A(a, i) iff A (viewed as an interpretation) is a model of ϕT ,A (a, i), for all ABoxes A, a ∈ ind(A), i ∈ Z. We construct ϕT ,A (x, t) by adapting the strategy developed for atemporal EL [8]: ϕT ,A (x, t) = SA (x, t), if A is primitive, ϕT ,A (x, t) = SA (x, t) ∨ ϕT ,C (x, t), if A ≡ C ∈ T , ϕT ,B1 uB2 (x, t) = ϕT ,B1 (x, t) ∧ ϕT ,B2 (x, t),  ϕT ,∃r.B (x, t) = ∃y Rr (x, y, t) ∧ ϕT ,B (y, t) , ϕT , ∗ B (x, t) = ϕT ,B (x, t op∗ 1), where SA (x, t) is a disjunction of all B(x, t), for a concept name B, with T |= B v A, 0 0 rig and Rr (x, y, t) is r(x, y, t) for r ∈ Nloc R and ∃t r(x, y, t ) for r ∈ NR . Note that ϕT ,A is Z an FO -rewriting in the terminology of Artale et al. [4, 1] because the temporal variables range over Z. It will follow from Theorem 10, however, that the infinite interpretation of A is empty after at most |T | steps from the ABox and so, ϕT ,A can be converted into an FO-rewriting whose temporal variables range over tem(A) only; see [1]. To address the restricted expressiveness of acyclic TBoxes, we next introduce novel weaker notions of acyclicity that restrict only one dimension, either DL or temporal. DL Acyclicity First, we introduce DL-acyclic TEL◦ -TBoxes, which are well-suited as temporal ex- tensions of, say, biomedical ontologies that may require recurrent patterns but have an acyclic DL component. A TEL◦ -TBox T with concept names CN is called DL-acyclic if there is a mapping `DL : CN → N such that: (i) A v ∃r.B or ∃r.B v A ∈ T implies `DL (A) > `DL (B); (ii) ∗ A v B implies `DL (A) = `DL (B); (iii) A u A0 v B ∈ T implies `DL (A) = `DL (A0 ) = `DL (B). We say that a DL-acyclic TBox is of depth k if k is the smallest integer m such that there is such a mapping `DL satisfying `DL (B) ≤ m for all B ∈ CN. Theorem 9. TAQ answering over DL-acyclic TEL◦ -TBoxes of depth k, k ≥ 1, is k-E XP S PACE-complete in combined complexity and NC1 -complete in data complexity. A closer inspection of the non-elementary lower bound proof in Theorem 2 reveals that the TBox used is DL-acyclic and TAQ answering over TBoxes of depth k is k-E XP S PACE-hard. NC1 -hardness in data complexity follows by reduction of the word problem of NFAs to TAQ answering (even without the DL dimension); see [1]. For the matching upper bounds, fix (T , A) with T of depth k. We devise a completion procedure, which is based on special LTL-formulas and implies ultimate periodicity of all traces in the canonical quasimodel of (T , A); cf. Section 5. Given any A, let Ai consist of all A(a, i) and r(a, b, i) in A as well as all assertions r(a, b, i) such that r ∈ Nrig R and r(a, b, j) ∈ A, for some j ∈ Z. The algorithm separates consequences coming from the role structure in the ABox and local temporal consequences of T . In particular, it exhaustively adds assertions A(a, i) to A if either (T , Ai ) |= A(a, i) or B(a, i op∗ 1) ∈ A and ∗ BvA∈T. (3) It turns out that Ai in (3) can be replaced by its suitably defined quotient Bi . Intu- itively, the logic can only distinguish distinct trees of depth k, whose number depends on |T | only; so, the size of Bi is independent of |A|. By induction on depth k, we define LTL-formulas ϕa,i of k-fold-exponential size characterizing all A ∈ CN such that (T , Bi ) |= A(a, i): we start from formulas as in Theorem 6; the induction step takes account of the structure of Bi and incurs an exponential blowup. For the combined complexity upper bound, observe that each of the polynomially many ϕa,i can be analyzed in k-E XP S PACE. For the data complexity upper bound, note that checking (T , Bi ) |= A(a, i) can be done in constant time. The second option in (3), however, cannot be implemented directly as the number of steps depends on |A|. Instead, by using Büchi automata, we show that the question of whether all traces extending A have A at position i is a regular property and so, is in NC1 . Temporal Acyclicity We next look at an orthogonal restriction that admits recursion in the DL dimension; it generalizes not only acyclic TEL◦ -TBoxes but also general EL-TBoxes. A TEL◦ -TBox T with concept names CN is temporally acyclic if there is `◦ : CN → N such that (i) P A v B or F B v A ∈ T implies `◦ (B) = `◦ (A) + 1; (ii) ∃r.B v A or A v ∃r.B ∈ T implies `◦ (A) = `◦ (B); (iii) A u A0 v B ∈ T implies `◦ (A) = `◦ (A0 ) = `◦ (B). Theorem 10. TAQ answering over temporally acyclic TEL◦ is PT IME-complete in data and combined complexity. The lower bounds are inherited from EL. To prove the upper bounds, we show that KBs with a temporally acyclic TEL◦ -TBox T enjoy a small quasimodel property: for every trace πd in the canonical quasimodel Q of any (T , A), we have πd (j) = ∅, if j > max A + |T | or j < min A − |T |. Intuitively, it means that the canonical quasimodel has a very restricted temporal exten- sion that stretches at most |T | time points beyond the ABox. It follows that the procedure for constructing the quasimodel can be implemented in polynomial time: traces πd require only polynomial space, and rules (q1)–(q4) extend the traces. Inflationary TEL3 In this section, we follow an approach suggested by Artale et al. [4] (in the context of temporal DL-Lite) and restrict TEL3 by allowing 3∗ only on the left-hand side of CIs. We denote this fragment by TEL3 infl , for inflationary TEL (which is related to inflationary DATALOG1S [12]). Note that TEL3 infl extends general EL-TBoxes. Yet, the complexity remains the same: Theorem 11. TAQ answering over TEL3 infl is PT IME -complete in both data and com- bined complexity. We need to show only the upper bounds. Let T be a TEL3 infl -TBox with concept names in CN. Observe that TEL3infl can still be viewed as a fragment of TEL◦ ; see Section 2. In fact, one can show an analogue of Theorem 4 with the following replacement of (t2): (t20 ) if 3∗ A v B ∈ T and A ∈ πd (n), then B ∈ πd (n op∗ k) for all k > 0. We establish a special shape of the traces in the canonical model of any (T , A). Let % : Z → 2CN be a map and let l, u ∈ Z with l ≤ u. We say that % is an [l, u]-bow tie if – for all i > u, we have %(i) ⊆ %(i + 1), and if %(i + 1) = %(i) then all %(i0 ), for i0 ≥ i, coincide; – symmetrically, for all i < l, we have %(i) ⊆ %(i − 1), and if %(i − 1) = %(i) then all %(i0 ), for i0 ≤ i, coincide. These properties mean that % grows monotonically to the right of u and to the left of l; in other words, % has inflationary behaviour. We prove that the traces πd in the canonical quasimodel Q of (T , A), for any A, enjoy the following properties: – πa is a [min A, max A]-bow tie, for each a ∈ ind(A); – πB is a [0, 0]-bow tie, for each B ∈ CN. Thus, the traces in Q can be represented in polynomial space because only the middle section and at most |CN| steps at both ends need to be stored. Since the traces are extended with every rule application, the procedure terminates after polynomially many steps; Theorem 11 follows. 7 Discussion and Future Work We summarize the fragments of TEL, their relationships and the obtained complexity results in the following diagram: ≥ PS PACE TEL◦ undecidable ≥ non-elem TEL3 PS PACE ultim. period. TEL◦ undecidable ≥ non-elem TEL◦ DL-acyclic TEL◦ l-rig in E XP T IME TEL3 infl non-elem PT IME NC1 temp. acyclic TEL◦ TEL◦ loc acyclic TEL◦ PT IME PS PACE PT IME in PT IME acyclic EL EL AC0 in PT IME PT IME where the solid lines are inclusions between DLs, the dashed line is a reduction that preserves answers to all queries (model conservative extension). The data complexity is indicated by shading and the combined complexity is specified below the language. We briefly remark that although acyclic and temporally acyclic TEL◦ -TBoxes cannot express rigid concepts (cf. Section 2), we conjecture that our techniques can be extended to handle them without affecting the complexity results in the diagram. In contrast, DL-acyclic TBoxes can express rigid concepts and the results in the diagram above thus hold for this case. Our data-tractability results show theoretical adequacy of the identified fragments of TEL for data-intensive applications. Our two novel forms of acyclicity, DL- and temporal, are somewhat close in spirit to multi-separability [12]: however, the latter puts a weaker restriction on recursion but a stricter one on the interaction between the temporal and data component. DL-acyclic TEL◦ is the first (to the best of our knowledge) DL shown to have NC1 -complete query answering (the large gap between data and combined complexity is also remarkable). On the practical side, there is evidence that such data- tractable fragments should be sufficient for many biomedical applications. Following the principles of OBDA, our framework provides a means of defining temporal concepts in the ontology for these applications: temporal concepts capture both (restricted) tree- shaped temporal conjunctive queries (CQs) and recurring temporal patterns. As our immediate future work, we will address decidability of (full) TEL◦ and then consider CQs with the + operation on temporal terms. We expect that our positive results can be lifted to CQs using the combined approach [27], which utilizes a structure similar to our canonical quasimodel to compute CQ certain answers in atemporal EL. We will also study succinct and expressive representations of temporal data. For example, the only known algorithm for DATALOG1S with binary encoding of timestamps in the data runs in E XP T IME in the size of the data [13]. We, however, conjecture that careful materialization should be sufficient to deal with the issue. We will also consider interval encoding of temporal ABoxes, e.g., A(a, [n1 , n2 ]), and settings capturing infinite temporal periodic data as introduced in [24, 14]. References 1. Artale, A., Kontchakov, R., Kovtunova, A., Ryzhikov, V., Wolter, F., Zakharyaschev, M.: First- order rewritability of temporal ontology-mediated queries. In: Proc. IJCAI-15. pp. 2706–2712 (2015) 2. Artale, A., Kontchakov, R., Lutz, C., Wolter, F., Zakharyaschev, M.: Temporalising tractable description logics. In: Proc. TIME-07. pp. 11–22 (2007) 3. Artale, A., Kontchakov, R., Ryzhikov, V., Zakharyaschev, M.: The complexity of clausal fragments of LTL. In: Proc. LPAR-19, 2013. pp. 35–52 (2013) 4. Artale, A., Kontchakov, R., Wolter, F., Zakharyaschev, M.: Temporal description logic for ontology-based data access. In: Proc. IJCAI-13 (2013) 5. Baader, F., Borgwardt, S., Lippmann, M.: Temporal query entailment in the description logic SHQ. J. Web Sem. 33, 71–93 (2015) 6. Baader, F., Brandt, S., Lutz, C.: Pushing the EL envelope. In: Proc. IJCAI-05 (2005) 7. Baudinet, M., Chomicki, J., Wolper, P.: Temporal deductive databases. In: Temporal Databases, pp. 294–320. Benjamin-Cummings Publishing (1993) 8. Bienvenu, M., Lutz, C., Wolter, F.: Deciding FO-rewritability in EL. In: Proc. DL-12 (2012) 9. Borgwardt, S., Lippmann, M., Thost, V.: Temporalizing rewritable query languages over knowledge bases. J. Web Sem. 33, 50–70 (2015) 10. Borgwardt, S., Thost, V.: Temporal query answering in the description logic EL. In: Proc. IJCAI-15. pp. 2819–2825 (2015) 11. Chen, C.C., Lin, I.P.: The computational complexity of satisfiability of temporal Horn formulas in propositional linear-time temporal logic. Inf. Process. Lett. 45(3), 131–136 (1993) 12. Chomicki, J.: Polynomial time query processing in temporal deductive databases. In: Proc. PODS-90 (1990) 13. Chomicki, J., Imielinski, T.: Temporal deductive databases and infinite objects. In: Proc. PODS-88. pp. 61–73 (1988) 14. Chomicki, J., Imielinski, T.: Finite representation of infinite query answers. ACM Trans. Database Syst. 18(2), 181–223 (1993) 15. Chomicki, J., Toman, D.: Temporal Databases. In: Handbook of Temporal Reasoning in Artificial Intelligence. pp. 429–468. Elsevier (2005) 16. Dong, X.L., Tan, W.: A time machine for information: Looking back to look forward. PVLDB 8(12), 2044–2055 (2015) 17. Gabbay, D., Kurucz, A., Wolter, F., Zakharyaschev, M.: Many-dimensional modal logics: theory and applications. Elsevier (2003) 18. Gene Ontology Cons.: Gene ontology: Tool for the unification of biology. Nature Genetics 25, 25–29 (2000) 19. Gutiérrez-Basulto, V., Jung, J.C., Lutz, C.: Complexity of branching temporal description logics. In: Proc. ECAI-12 (2012) 20. Gutiérrez-Basulto, V., Jung, J., Schneider, T.: Lightweight description logics and branching time: a troublesome marriage. In: Proc. KR-14 (2014) 21. Gutiérrez-Basulto, V., Jung, J., Schneider, T.: Lightweight temporal description logics with rigid roles and restricted TBoxes. In: Proc. IJCAI-15. pp. 3015–3021 (2015) 22. Gutiérrez-Basulto, V., Klarman, S.: Towards a unifying approach to representing and querying temporal data in description logics. In: Proc. RR-12. pp. 90–105 (2012) 23. Haase, C., Lutz, C.: Complexity of subsumption in the EL family of description logics: Acyclic and cyclic TBoxes. In: Proc. of ECAI-08 (2008) 24. Kabanza, F., Stévenne, J., Wolper, P.: Handling infinite temporal data. In: Proc. PODS-90. pp. 392–403 (1990) 25. Klarman, S., Meyer, T.: Querying temporal databases via OWL 2 QL. In: Proc. RR-14. pp. 92–107 (2014) 26. Krisnadhi, A., Lutz, C.: Data complexity in the EL family of description logics. In: Proc. LPAR-07. pp. 333–347 (2007) 27. Lutz, C., Toman, D., Wolter, F.: Conjunctive query answering in the description logic EL using a relational database system. In: Proc. IJCAI-09 (2009) 28. Manna, Z., Wolper, P.: Synthesis of communicating processes from temporal logic specifica- tions. ACM Trans.Program.Lang.Syst. 6(1), 68–93 (1984) 29. O’Connor, M.J., Shankar, R.D., Parrish, D.B., Das, A.K.: Knowledge-data integration for temporal reasoning in a clinical trial system. Int. J. Med. Inform. 78, S77–S85 (2009) 30. Roth, M., Tan, W.: Data integration and data exchange: It’s really about time. In: Proc. CIDR-13 (2013) 31. Shankar, R.D., Martins, S.B., O’Connor, M.J., Parrish, D.B., Das, A.K.: Representing and rea- soning with temporal constraints in clinical trials using semantic technologies. In: BIOSTEC- 08, Revised Selected Papers. pp. 520–530 (2008) 32. Sistla, A.P., Clarke, E.M.: The complexity of propositional linear temporal logics. J. ACM 32(3), 733–749 (1985) 33. Stockmeyer, L.J.: The Complexity of Decision Problems in Automata Theory and Logic. Ph.D. thesis, MIT, Cambridge, Massachusetts, USA (1974)