=Paper=
{{Paper
|id=None
|storemode=property
|title=Temporal Query Answering in DL-Lite
|pdfUrl=https://ceur-ws.org/Vol-1014/paper_8.pdf
|volume=Vol-1014
|dblpUrl=https://dblp.org/rec/conf/dlog/BorgwardtLT13
}}
==Temporal Query Answering in DL-Lite==
Temporal Query Answering in DL-Lite ∗ Stefan Borgwardt, Marcel Lippmann, and Veronika Thost Theoretical Computer Science, TU Dresden, Germany {stefborg,lippmann,thost}@tcs.inf.tu-dresden.de 1 Introduction Context-aware applications try to detect specific situations within a changing environment (e.g. a computer system or air traffic observed by radar) in order to react accordingly. To gain information, the environment is observed by sensors (for a computer system, data about its resources is gathered by the operating system), and the results of sensing are stored in a database. A context-aware application then detects specific predefined situations based on this data (e.g. a high system load) and executes some action (e.g. increases the CPU frequency). In a simple setting, such an application can be realized by using standard database techniques: the sensor information is stored in a database, and the situ- ations to be recognized are specified as database queries [1]. However, we cannot assume that the sensors provide a complete description of the current state of the environment. Thus, the closed world assumption employed by database systems (i.e. facts not present in the database are assumed to be false) is not appropriate since there may be facts of which the truth is not known. For example, a sensor for certain information might not be available for a moment or not even exist. In addition, though a complete specification of the environment usually does not exist, often some knowledge about its behavior is available. This knowledge can be used to formulate constraints on the interpretation of the predicates used in the queries, to detect more complex situations. In ontology-based data access (OBDA) [8], domain knowledge is encoded in ontologies using a description logic (DL). In this paper, we consider logics of the DL-Lite family, which are light- weight DLs with a low complexity for many reasoning problems [8]. In order to recognize situations that evolve over time, we propose to add a temporal logical component to the queries. We use the operators of the temporal logic LTL, which allows to reason about a linear and discrete flow of time [17]. Usual temporal operators include next (#φ), eventually (3φ), and always (2φ), asserting that a property φ is true at the next point in time, at some point in the future, and at all time points in the future, respectively. We also use the corresponding past operators #− , 3− , and 2− . Consider, for example, a collection of servers providing several services. An important task is to migrate services between servers to balance the load. To decide when to migrate, we want to detect certain critical situations. We consider a process to be critical if it has an increasing workload, and at the same time the ∗ Partially supported by DFG SFB 912 (HAEC) and GRK 1763 (QuantLA). server that hosts it is almost overloaded. We want to detect those processes and servers that were in a critical situation at least twice within the past ten time units, expressed by the query #−10 (3(Critical(x, y) ∧ #3Critical(x, y))), where Critical(x, y) := Server(x) ∧ Process(y) ∧ executes(x, y) ∧ Running(y) ∧ IncreasingWorkload(y) ∧ AlmostOverloaded(x). In this example, it is essential that future and past operators can be nested arbitrarily. One might argue that, as we are looking at the time line from the point of view of the current time point, and nothing is known about the future, having only past operators is sufficient. We will even show that in our setting it is always possible to construct an equivalent query using only past operators. However, the resulting query is not very concise and it is not easy to see the situation that is to be recognized. This is similar to propositional LTL, where eliminating the past operators from a formula incurs a blowup that is at least exponential and no constructions smaller than triply exponential are known [16]. Temporal extensions of DL-Lite [9] have been considered in the context of conceptual modeling [2,3,4], where the focus lies on checking concept satisfiabil- ity. OBDA, the second major use case of DL-Lite, with query answering as the most important reasoning problem [8], has not yet been studied in a temporal setting. Investigations of temporal query languages based on a combination of DL queries such as conjunctive queries (CQs) and temporal logics such as LTL have started only quite recently. In [14], a framework is developed that combines the two without much interference. The algorithm for query answering in this setting is an LTL-satisfiability test using a sub-procedure to answer (atemporal) CQs. In [5], a similar query language, a combination of LTL and CQs over the DL ALC, is proposed. Its temporal component, however, is allowed to influence the DL queries via the notion of rigid names. The complexity increases depend- ing on whether only rigid concept names or also rigid role names are allowed. Additionally, the latter paper also studies the so-called data complexity, where the complexity is measured only w.r.t. the size of the sensor data. In this paper, we follow an approach suggested in [14] to combine the first- order rewriting techniques for atemporal query answering in logics of the DL-Lite family with a temporal component. The main idea is to use optimized database techniques to answer the actual queries. However, the existing techniques for answering temporal queries over temporal databases do not perfectly suit our purposes. In [12], the authors describe a temporal extension of the SQL query language that can answer temporal queries over a complete temporal database. However, in our setting the database containing all previous observations may grow huge very fast, but not all past observations are relevant for a particu- lar query. In [11], an approach is described that reduces the amount of space needed; but the query language considered there allows only for past operators. In addition to describing how these approaches can be applied to our problem, we propose a new algorithm that extends the one from [11] and can also deal with future operators. An extended version of this paper has been accepted for publication in [6]. The formal proofs of our results can be found in [7]. 2 Preliminaries We first describe the DL component, and then the temporal component of our query language. The DL-Lite family consists of various DLs that are tailored towards conceptual modeling and allow to realize query answering using classical database techniques. We only consider DL-Lite core as a prototypical example. Definition 1. Let NC , NR , and NI be non-empty, pairwise disjoint sets of con- cept, role, and individual names, respectively. A role expression is either a role name P1 ∈ NR or an inverse role P2− with P2 ∈ NR . A basic concept is of the form A or ∃R, where A ∈ NC and R is a role expression. A general concept is of the form B or ¬B, where B is a basic concept. A concept inclusion is of the form B v C, where B is a basic concept and C is a general concept. An assertion is of the form A(a) or P (a, b), where A ∈ NC , P ∈ NR , and a, b ∈ NI . A TBox (or ontology) is a finite set of concept inclusions, and an ABox is finite set of assertions. The semantics of DL-Lite core is defined through the notion of an interpretation. Definition 2. An interpretation is a pair I = (∆I , ·I ), where ∆I is a non- empty set (called domain) and ·I is a function that assigns to every A ∈ NC a set AI ⊆ ∆I , to every P ∈ NR a binary relation P I ⊆ ∆I × ∆I , and to every a ∈ NI an element aI ∈ ∆I . This function is extended to role expressions, basic concepts, and general concepts as follows: (P − )I := {(e, d) | (d, e) ∈ P I }, (∃R)I := {d | there is an e ∈ ∆I such that (d, e) ∈ RI }, and (¬C)I := ∆I \ C I . I is a model of B v C if B I ⊆ C I , of A(a) if aI ∈ AI , and of P (a, b) if (a , b ) ∈ P I . We write I |= T if I is a model of all concept inclusions in the I I TBox T , and I |= A if I is a model of all assertions in the ABox A. We assume that all interpretations I satisfy the unique name assumption (UNA), i.e. for all a, b ∈ NI with a 6= b, we have aI 6= bI . We now introduce the notion of temporal knowledge bases. Intuitively, they contain sensor data (ABoxes) for all previous time points, and a global TBox. Definition 3. A temporal knowledge base (TKB) K = h(Ai )0≤i≤n , T i consists of a finite sequence of ABoxes Ai and a TBox T , where the ABoxes Ai can only contain concept names that also occur in T . Let I = (Ii )0≤i≤n be a sequence of interpretations Ii = (∆, ·Ii ) over a fixed non-empty domain ∆. Then I is a model of K (written I |= K) if Ii |= Ai and Ii |= T for all i, 0 ≤ i ≤ n. Similar to what was done in [5,14], our temporal query language is based on conjunctive queries [1,10]. The main difference is that we do not allow negation, as also in DL-Lite arbitrary negation is not allowed. The reason is that the reductions in Section 3 do not work in the presence of negation. Definition 4. Let NV be a set of variables. A conjunctive query (CQ) is of the form φ = ∃y1 , . . . , ym .ψ, where y1 , . . . , ym ∈ NV and ψ is a (possibly empty) finite conjunction of atoms of the form A(z) for A ∈ NC and z ∈ NV ∪ NI ( concept atom); or r(z1 , z2 ) for r ∈ NR and z1 , z2 ∈ NV ∪ NI ( role atom). The empty conjunction is denoted by true. Temporal conjunctive queries (TCQs) are built from CQs as follows: each CQ is a TCQ, and if φ1 and φ2 are TCQs, then so are φ1 ∧ φ2 (conjunction), • φ1 ∨ φ2 (disjunction), #φ1 (strong next), φ1 (weak next), #− φ1 (strong previ- • ous), − φ1 (weak previous), φ1 U φ2 (until), and φ1 S φ2 (since). • The symbols #− , − , and S are called past operators, the symbols #, , and• U are future operators. All results also hold in the presence of the additional temporal operators 2 (always), 2− (always in the past), 3 (eventually), and 3− (some time in the past) [7], but we omit them here for space reasons. We denote the set of variables occurring in a TCQ φ by Var(φ), and the set of free variables in φ by FVar(φ). A TCQ φ is called Boolean if FVar(φ) = ∅. We further denote by Sub(φ) the set of all TCQs occurring as subqueries in φ (including φ itself). A union of conjunctive queries (UCQ) is a disjunction of CQs. For our purposes, it is sufficient to define the semantics for Boolean CQs and TCQs. As usual, it is given using the notion of a homomorphism [10]. Definition 5. Let I = (∆, ·I ) be an interpretation and ψ be a Boolean CQ. A mapping π : Var(ψ) ∪ NI → ∆ is a homomorphism of ψ into I if π(a) = aI for all a ∈ NI , π(z) ∈ AI for all concept atoms A(z) in ψ, and (π(z1 ), π(z2 )) ∈ rI for all role atoms r(z1 , z2 ) in ψ. We say that I is a model of ψ (written I |= ψ) if there is such a homomorphism. Let now φ be a Boolean TCQ. For a sequence of interpretations I = (Ii )0≤i≤n and i with 0 ≤ i ≤ n, we define I, i |= φ by induction on the structure of φ: I, i |= ∃y1 , . . . , ym .ψ iff Ii |= ∃y1 , . . . , ym .ψ I, i |= φ1 ∧ φ2 iff I, i |= φ1 and I, i |= φ2 I, i |= φ1 ∨ φ2 iff I, i |= φ1 or I, i |= φ2 I, i |= #φ1 iff i < n and I, i + 1 |= φ1 • I, i |= φ1 iff i < n implies I, i + 1 |= φ1 I, i |= #− φ1 iff i > 0 and I, i − 1 |= φ1 • I, i |= − φ1 iff i > 0 implies I, i − 1 |= φ1 I, i |= φ1 U φ2 iff there is some k, i ≤ k ≤ n such that I, k |= φ2 and I, j |= φ1 for all j, i ≤ j < k I, i |= φ1 S φ2 iff there is some k, 0 ≤ k ≤ i such that I, k |= φ2 and I, j |= φ1 for all j, k < j ≤ i. Here we assume that there is no time point before 0 or after n, similar to the temporal semantics used for LTL in [20] or for temporal query languages for databases [11,15,18]. As in classical LTL, one can show that φ1 S φ2 is equivalent to φ2 ∨ (φ1 ∧ #− (φ1 S φ2 )), and a similar equivalence holds for U. We are now ready to introduce the central reasoning problem of this paper, namely finding certain answers to TCQs. Definition 6. Let φ be a TCQ, I = (Ii )0≤i≤n a sequence of interpretations, and i ≥ 0. The mapping a : FVar(φ) → NI is an answer to φ w.r.t. I at time point i if I, i |= a(φ), where a(φ) denotes the Boolean TCQ that is obtained from φ by replacing the free variables according to a. Let further K = h(Ai )0≤i≤n , T i be a TKB. A mapping a : FVar(φ) → NI is a certain answer to φ w.r.t. K at time point i if for every J |= K, we have J, i |= a(φ). The set of all answers to φ w.r.t. I at time point i is denoted by Ans(φ, I, i), and the set of all certain answers to φ w.r.t. K is denoted by Cert(φ, K, i). Recall that our main interest lies in finding answers to queries at the last time point, i.e. computing the sets Ans(φ, I) := Ans(φ, I, n) or Cert(φ, K) := Cert(φ, K, n). We will sometimes use the abbreviation false := A(x) ∧ A0 (x), where A, A0 are new concept names for which we assume that the concept inclusion A v ¬A0 is contained in the global TBox T . 3 Answering Temporal Conjunctive Queries For computing the set of certain answers for a conjunctive query, the rewriting approach [8] can be employed. It compiles the information contained in the TBox into the query and evaluates the query w.r.t. the ABox (viewed as database) using classical database techniques. A similar approach is possible for TCQs. Definition 7. For an ABox A, the interpretation DB(A) := (NI , ·DB(A) ) is de- fined as follows: – aDB(A) := a for all a ∈ NI ; – ADB(A) := {a | A(a) ∈ A} for all A ∈ NC ; and – P DB(A) := {(a, b) | P (a, b) ∈ A} for all P ∈ NR . As shown in [8], this interpretation is the smallest model of A. In order to employ database techniques, we must assume NI and DB(A) to be finite. Proposition 8 ([8]). Let ψ be a CQ, A be an ABox, and T be a TBox. There is a canonical model IA,T of A and T and a UCQ ψ T such that Cert(ψ, hA, T i) = Ans(ψ, IA,T ) = Ans(ψ T , DB(A)). We now use this proposition to show a similar result for TCQs. Let φ be a TCQ and K = h(Ai )0≤i≤n , T i be a TKB. The TCQ φT is obtained by replacing each CQ ψ occurring in φ by ψ T . Note that φT is again a TCQ since ψ T is always a UCQ. Let furthermore IK := (IAi ,T )0≤i≤n and DB(K) := (DB(Ai ))0≤i≤n . The following theorem can be shown by a straightforward induction on the structure of φ. Theorem 9. For every TCQ φ, TKB K = h(Ai )0≤i≤n , T i, and i ≥ 0, we have Cert(φ, K, i) = Ans(φ, IK , i) = Ans(φT , DB(K), i). More importantly, for every TCQ φ and TKB K = h(Ai )0≤i≤n , T i, it holds that Cert(φ, K) = Ans(φT , DB(K)). It thus remains to show how to compute the set Ans(φ, I) for a TCQ φ and a sequence I = (Ii )0≤i≤n of interpretations over a finite domain. A first possibility is to view I as a temporal database and rewrite φ into an ATSQL query [12]. However, since our goal is to monitor processes that produce new data in very short time intervals, storing all the data for all previous time points is not feasible. Therefore, we describe two different approaches that reduce the amount of space necessary to compute Ans(φ, I). Since we are interested in the answers at the last time point, the idea is to keep only the past information necessary to answer the query φ. In the first approach (Section 4), we rewrite φ into a TCQ φ0 without fu- ture operators, employing a construction described in [13]. We then compute Ans(φ0 , I) using an algorithm described in [11,19] that uses a so-called bounded history encoding, which means that the space required by the algorithm is con- stant w.r.t. the number n of previous time points. Only the current state of the database and some auxiliary relations have to be stored. In Section 5, we generalize the algorithm from [11] to directly deal with the future operators. The main difference is that we do not consider negation or arbitrary first-order queries. Unfortunately, the space required by this algorithm is in general unbounded w.r.t. n and thus does not constitute a bounded history encoding in the sense of [11,19]. However, it allows us to circumvent the non- elementary blow-up of the formula resulting from the reduction in [13]. 4 Eliminating Future Operators In this section, we rewrite a TCQ φ into an equivalent TCQ φ0 that does not contain future operators, but may contain negation as in [11]. We then apply the algorithm described in [11] to iteratively compute the sets Ans(φ0 , I, i). The reduction uses the separation theorem for propositional LTL [13]. However, this theorem cannot be applied directly since our temporal semantics differs from that in [13]. The only temporal operators in [13] are strict versions of U and S. Moreover, the semantics is defined w.r.t. bounded past and unbounded future. Definition 10. Let P be a set of propositional variables. LTL-formulae are built from P using the constructors φ1 ∧ φ2 , φ1 ∨ φ2 , ¬φ1 , φ1 U< φ2 , and φ1 S< φ2 . An LTL-structure is an infinite sequence J = (wi )i≥0 of worlds wi ⊆ P , i ≥ 0, and it satisfies an LTL-formula φ at i ≥ 0 (written J, i |= φ) if J, i |= p for p ∈ P iff p ∈ wi J, i |= φ1 ∧ φ2 iff J, i |= φ1 and J, i |= φ2 J, i |= φ1 ∨ φ2 iff J, i |= φ1 or J, i |= φ2 J, i |= ¬φ1 iff not J, i |= φ1 J, i |= φ1 U< φ2 iff there is some k > i such that J, k |= φ2 and J, j |= φ1 for all j, i < j < k J, i |= φ1 S< φ2 iff there is some k, 0 ≤ k < i, such that J, k |= φ2 and J, j |= φ1 for all j, k < j < i. We define the constants true and false by p ∨ ¬p and p ∧ ¬p, respectively, for an arbitrary p ∈ P . We also define first := ¬(true S< true) with the semantics that J, i |= first iff i = 0, i.e. this formula is satisfied exactly at the first time point. Let from now on φ be an arbitrary but fixed TCQ containing only the CQs α1 , . . . , αm . Let furthermore {p1 , . . . , pm , p} be the set of propositional variables. For a finite sequence I = (Ii )0≤i≤n of interpretations, its propositional abstrac- tion is the LTL-structure b I := (wi )i≥0 , where wi := {pj | Ii |= αj } ∪ {p} if 0 ≤ i ≤ n, and wi := ∅ otherwise. We first transform φ into an LTL-formula fφ that behaves similarly to φ w.r.t. the propositional abstractions of sequences of interpretations I. It is defined inductively on the structure of φ: – fαj := pj for a CQ αj ; – fφ1 ∧φ2 := fφ1 ∧ fφ2 ; fφ1 ∨φ2 := fφ1 ∨ fφ2 ; – f#φ1 := false U< (fφ1 ∧ p); f#− φ1 := false S< fφ1 ; – f•φ1 := false U< (fφ1 ∨ ¬p); f•− φ1 := first ∨ false S< fφ1 ; – fφ1 U φ2 := fφ2 ∨ (fφ1 ∧ fφ1 U< (fφ2 ∧ p)); and – fφ1 S φ2 := fφ2 ∨ (fφ1 ∧ fφ1 S< fφ2 ). We now use the separation theorem from [13] to transform fφ into an equiva- lent LTL-formula fφ0 which is a Boolean combination of temporal subformulae that either contain only S< operators or only U< operators. In the proof of this theorem, subformulae of fφ are copied and rearranged, but no additional propositional variables are introduced. Since we are interested in evaluating φ (and thus fφ and fφ0 ) at n, we can now reduce fφ0 as follows: First, we replace all variables that are in the scope of a U< by false. The reason for this is that such variables are only evaluated at time points after n, where all variables are false in all propositional abstractions. The resulting formula is then simplified using the simple equivalences ¬true ≡ false, ¬false ≡ true, true ∧ ψ ≡ ψ, true ∨ ψ ≡ true, false ∧ ψ ≡ false, false ∨ ψ ≡ ψ, ψ U< false ≡ false, true U< true ≡ true, and false U< true ≡ true. This yields a formula fφ00 that does not contain any U< operators and is equivalent to fφ0 at time point n in every LTL-structure of the form b I. This formula is now translated back into a Boolean TCQ φfφ00 . Recall that the goal is to use the algorithm presented in [11], where negation is allowed in the query language. Furthermore, in that paper a slightly different operator S∗ is used instead of S. The semantics of ¬ and S∗ , as employed in [11], is as follows: I, i |= ¬φ1 iff not I, i |= φ1 I, i |= φ1 S∗ φ2 iff there is some k, 0 ≤ k < i such that I, k |= φ2 and I, j |= φ1 for all j, k < j ≤ i. In the following, we call any TCQ built using the operators ∧, ∨, ¬, #− , and S∗ a Past-TCQ, which is in particular a temporal query in the sense of [11]. We can now define the final translation recursively as follows: – φpj := αj for j, 1 ≤ j ≤ m; φp := true; – φf1 ∧f2 := φf1 ∧ φf2 ; φf1 ∨f2 := φf1 ∨ φf2 ; φ¬f1 := ¬φf1 ; – φf1 S< f2 := #− (φf2 ∨ φf1 S∗ φf2 ). In [7], we use the above constructions to show the following result. Theorem 11. For every Boolean TCQ φ there is a Boolean Past-TCQ ψ such that for all I = (Ii )0≤i≤n , we have I, n |= φ iff I, n |= ψ. Recall that in the above construction the Boolean CQs αj were only copied and rearranged inside the structure of φ. Thus, it is easy to see from Definition 6 that this result also holds for computing the sets of answers of non-Boolean TCQs. Corollary 12. For every TCQ φ there is a Past-TCQ ψ such that FVar(ψ) = FVar(φ) and for all I = (Ii )0≤i≤n , we have Ans(φ, I) = Ans(ψ, I). Let now I = (Ii )i≥0 be an infinite sequence of interpretations representing the observations over all time points. At each time point i ≥ 0, we only have ac- cess to the finite prefix I(i) := (Ij )0≤j≤i of I of length i + 1. Let ∆ be the shared domain of the interpretations in I. The algorithm from [11] works as follows on the TCQ ψ constructed in Corollary 12. On input I0 , it computes a first-order interpretation I00 of several auxiliary predicates. Intuitively, for each subformula ψ 0 of ψ beginning with a past operator, the algorithm stores the 0 I0 answers Ans(ψ 0 , I(0) ) ⊆ ∆FVar(ψ ) for ψ 0 in a new relation Aψ00 ⊆ ∆k of ar- ity k := |FVar(ψ 0 )|. The set Ans(ψ, I(0) ) can then easily be computed from I0 and I00 . Afterwards, the algorithm disregards I0 and keeps only the information computed in I00 . On input I1 , it then updates I00 to a new interpretation I10 , which allows it to compute Ans(ψ, I(1) ), and so on. The memory requirements of this algorithm are bounded by the maximal size of one pair (Ii , Ii0 ), which is polynomial in the size of ∆, in the number of concept and role names, and in the number of past operators occurring in ψ, and exponential in the number of free variables occurring below past operators. However, the memory requirements do not depend on the length of the sequence of interpretations I(i) seen so far. This is called a bounded history encoding in [11]. The presented construction has several drawbacks. First, the translations from φ to fφ and from fφ00 to φfφ00 = ψ may duplicate subformulae, which can cause exponential blowups in the size of φ. This could be avoided by applying a reduction similar to the one for propositional LTL in [13] directly to the TCQ φ. However, since the reduction in [13] is already non-elementary in the size of the formula (basically one exponential for each U< nested inside a S< , and vice versa), this is not much more efficient. The approach presented in this section is thus best suited for answering simple, small queries φ over large data sets I. 5 A New Algorithm In this section, we present an algorithm that computes the set Ans(φ, I) without the need to eliminate the future operators beforehand, thereby avoiding the non-elementary blowup of the construction described in the previous section. However, the memory requirements of this new algorithm are not independent of the number of previous time points. From now on, let φ be a fixed TCQ and I = (Ii )i≥0 be a fixed infinite sequence of interpretations over the same finite domain ∆. For i ≥ 0, we denote by I(i) := (Ij )0≤j≤i the finite prefix of I of length i + 1. Our algorithm iteratively computes the sets Ans(φ, I(i) ). It uses as data structure so-called answer formulae, which represent TCQs in which some parts have already been evaluated. In particular, they contain no more CQs, but rather sets of already computed answers to subqueries. Additionally, they may contain variables (different from those in NV ) that serve as place-holders for subqueries that have to be evaluated at the next time point. For ease of presentation, we assume in the following that NV is finite and that answers are of the form a : NV → ∆ instead of a : FVar(φ) → ∆. In an implementation, one would of course already restrict the intermediate compu- tations of answers for subqueries ψ ∈ Sub(φ) to FVar(ψ). But then one has to be more careful when combining answers to different subqueries. Thus, when we talk about answers, we mean mappings a : NV → ∆, and in particular Ans(. . .) refers to a set of such mappings, i.e. a subset of ∆NV . Definition 13. Let FSub(φ) denote the set of all subqueries of φ of the form • #ψ1 , ψ1 , or ψ1 U ψ2 . For j ≥ 0, we denote by Varφj the set of all variables of the form xψ i j for ψ ∈ FSub(φ). The set AFφ of all answer formulae for φ at i ≥ 0 is the smallest set satisfying the following conditions: – Every set A ⊆ ∆NV is an answer formula for φ at i. – Every xψ φ j ∈ Varj with j ≤ i is an answer formula for φ at i. – If α1 and α2 are answer formulae for φ at i, then so are α1 ∩ α2 and α1 ∪ α2 . In order to evaluate these answer formulae, we introduce the notion of correct- ness. Intuitively, an answer formula α for φ at i is correct for i if we obtain the set Ans(φ, I(i) ) by replacing the variables xψ j in α by appropriate sets of answers and evaluating ∩ and ∪ as set intersection and union, respectively. N Definition 14. We define the function evaln : AFnφ → 2∆ V , n ≥ 0, as follows: – evaln (A) := A if A ⊆ ∆NV ; • Ans(ψ1 , I(n) , j + 1) if j < n and ψ = #ψ1 or ψ = ψ1 ; Ans(ψ, I(n) , j + 1) if j < n and ψ = ψ U ψ ; 1 2 – evaln (xψ j ) := ∅ if j = n and ψ = #ψ 1 or ψ = ψ1 U ψ2 ; • NV ∆ if j = n and ψ = ψ1 ; – evaln (α1 ∩ α2 ) := evaln (α1 ) ∩ evaln (α2 ); and – evaln (α1 ∪ α2 ) := evaln (α1 ) ∪ evaln (α2 ). We say that a mapping Φ : Sub(φ) → AFiφ is correct for i ≥ 0 if for all n ≥ i and for all ψ ∈ Sub(φ), we have evaln (Φ(ψ)) = Ans(ψ, I(n) , i). In particular, if Φ : Sub(φ) → AFiφ is correct for i, then evali (Φ(φ)) = Ans(φ, I(i) ), which is the set we want to compute. The algorithm works as follows. It first computes a mapping Φ0 that is correct for 0, which is used to compute the next mapping Φ1 when the interpretation I1 becomes available. This mapping is correct for 1 and can be used to compute the next mapping Φ2 , and so on. In each step, to compute Φi+1 , we only need Φi and the interpretation Ii+1 . Definition 15. We recursively define the mapping Φ0 : Sub(φ) → AF0φ : – Φ0 (ψ1 ) := Ans(ψ1 , I(0) ) if ψ1 is a CQ; – Φ0 (ψ1 ∧ ψ2 ) := Φ0 (ψ1 ) ∩ Φ0 (ψ2 ); Φ0 (ψ1 ∨ ψ2 ) := Φ0 (ψ1 ) ∪ Φ0 (ψ2 ); – Φ0 (#ψ1 ) := x0#ψ1 ; Φ0 (#− ψ1 ) := ∅; – Φ0 ( ψ1 ) := x• • • ψ1 0 ; Φ0 ( − ψ1 ) := ∆NV ; – Φ0 (ψ1 U ψ2 ) := Φ0 (ψ2 ) ∪ Φ0 (ψ1 ) ∩ xψ 1 U ψ2 0 ; Φ0 (ψ1 S ψ2 ) := Φ0 (ψ2 ). We assume for this definition that CQs ψ1 are answered using a different mech- anism, e.g. by evaluating the first-order query ψ1 over the “database” I0 [1]. Assume now that Φi−1 : Sub(φ) → AFi−1 φ is a function containing only vari- ables with index i − 1. We proceed as follows to construct a new function that contains only variables with index i. We first define a function Φ0i : Sub(φ) → AFiφ similarly to Definition 15 that may still contain variables with index i − 1. We then iteratively replace these variables until only variables with index i are left. Definition 16. Let i > 0 and Φi−1 : Sub(φ) → AFi−1 φ . We recursively define the 0 i mapping Φi : Sub(φ) → AFφ as follows: – Φ0i (ψ1 ) := Ans(ψ1 , I(i) ) if ψ1 is a CQ; – Φ0i (ψ1 ∧ ψ2 ) := Φ0i (ψ1 ) ∩ Φ0i (ψ2 ); Φ0i (ψ1 ∨ ψ2 ) := Φ0i (ψ1 ) ∪ Φ0i (ψ2 ) – Φ0i (#ψ1 ) := xi#ψ1 ; Φ0i (#− ψ1 ) := Φi−1 (ψ1 ); – Φ0i ( ψ1 ) := x• • • ψ1 i ; Φ0i ( − ψ1 ) := Φi−1 (ψ1 ); – Φ0i (ψ1 U ψ2 ) := Φ0i (ψ2 ) ∪ Φ0i (ψ1 ) ∩ xψ 1 U ψ2 i ; and – Φ0i (ψ1 S ψ2 ) := Φ0i (ψ2 ) ∪ Φ0i (ψ1 ) ∩ Φi−1 (ψ1 S ψ2 ) . In order to remove the “old” variables with index i − 1, we now substitute them appropriately. For example, since x#ψ i−1 is a place-holder for the answers to ψ w.r.t. I(n) at i, we can now replace it by Φ0i (ψ). However, this formula may itself contain another old variable, and thus we have to be careful about the order in which we do these substitutions. Since each Φ0i (ψ) can contain only variables that refer to subqueries of ψ, by replacing the variables for “smaller” subqueries first, we ensure that all variables are eliminated. The details of this construction can be found in [7]. We obtain a mapping Φi : Sub(φ) → AFiφ that is correct for i. Lemma 17. For each i ≥ 0, the mapping Φi is correct for i. Consider now the algorithm, which, on input φ and I, computes the mappings Φi as described above, and outputs evali (Φi (φ)) for each i ≥ 0. The following is a trivial consequence of the correctness of these mappings. Theorem 18. Given a TCQ φ and an infinite sequence I = (Ii )i≥0 of inter- pretations, the algorithm outputs Ans(φ, I(i) ) for each i ≥ 0. It is now easy to compute the sets evali (Φi (φ)) = Ans(φ, I(i) ) for i ≥ 0 since each of the variables xψi in Φi (φ) simply has to be replaced by either ∅ or ∆ NV (see Definition 14). However, as mentioned earlier, the size of the formula Φi (φ) may depend exponentially on the length i of the current sequence of interpretations. For example, the answer formula Φ01 (φ) for φ = C(x) S(A(x) U B(x)) is Φ01 (ψ) ∪ (C1 ∩ (B0 ∪ (A0 ∩ xψ 0 ))), where ψ := A(x) U B(x), Φ01 (ψ) = B1 ∪ (A1 ∩ xψ 1 ), A0 := Ans(A(x), I (0) ), and ψ 0 similarly for the other CQs and time points. After replacing x0 by Φ1 (ψ), the variable xψ i 1 occurs twice in Φ1 (φ). In general, Φi (φ) will contain 2 occurrences ψ of the variable xi . However, applying equivalences like the associativity, com- mutativity, distributivity, and absorption laws for ∩ and ∪ does not affect the semantics of an answer formula (which is given by eval), and hence Φ1 (φ) ≡ Φ01 (ψ) ∪ (C1 ∩ (B0 ∪ (A0 ∩ Φ01 (ψ)))) ≡ Φ01 (ψ) ∪ (C1 ∩ B0 ) ∪ (C1 ∩ A0 ∩ Φ01 (ψ)) ≡ Φ01 (ψ) ∪ (C1 ∩ B0 ) ≡ ((C1 ∩ B0 ) ∪ B1 ) ∪ (A1 ∩ xψ 1) The resulting formula contains xψ 1 only once. In general, the formula Φi (φ) is equivalent to Di ∪ (Ai ∩ xψi ), where D0 := B0 and Di := (Ci ∩ Di−1 ) ∪ Bi for all i > 0. Thus, the algorithm only has to store the two sets Ai , Di ⊆ ∆NV at each time point, i.e. we achieve a bounded history encoding, as in [11]. If the formula φ contains no future operators, then the answer formulae contain no variables and can always be fully evaluated to a subset of ∆NV . In this case, our algorithm can be seen as a variant of the one from [11] for less expressive queries. The example demonstrates that it is important that the computed answer formulae are simplified at each step, while preserving their semantics under eval. However, this does not guarantee a bounded history encoding as in [11]. 6 Conclusions We have introduced the reasoning task of temporal OBDA over DL-Lite knowl- edge bases and shown how to reduce this task to answering queries over temporal databases, similar to what was done for the atemporal case [8]. We then pre- sented three approaches to solve the latter problem. The first involves storing the whole history of the database and re-evaluating the query at each time point using a temporal database query language like ATSQL [12]. The second approach works by eliminating the future operators and evalu- ating the resulting query using the algorithm of [11], which achieves a bounded history encoding. Although independent of the length of the history, this in- volves a non-elementary blow-up in the size of the query. Finally, we presented an algorithm that works directly with the future operators. We showed that the algorithm computes exactly the desired answers, but its space requirements are in general not independent of the length of the history. In future work, we will try to achieve a bounded history encoding for certain classes of TCQs, and compare the performance of all three approaches on temporal databases. References 1. Abiteboul, S., Hull, R., Vianu, V.: Foundations of Databases. Addison-Wesley (1995) 2. Artale, A., Kontchakov, R., Ryzhikov, V., Zakharyaschev, M.: DL-Lite with tem- poralised concepts, rigid axioms and roles. In: Ghilardi, S., Sebastiani, R. (eds.) Proc. of the 6th Int. Symp. on Frontiers of Combining Systems (FroCoS’09). Lec- ture Notes in Computer Science, vol. 5749, pp. 133–148. Springer-Verlag (2009) 3. Artale, A., Kontchakov, R., Ryzhikov, V., Zakharyaschev, M.: Temporal conceptual modelling with DL-Lite. In: Proc. of the 2010 Int. Workshop on Description Logics (DL’10). CEUR-WS, vol. 573 (2010) 4. Artale, A., Kontchakov, R., Ryzhikov, V., Zakharyaschev, M.: A cookbook for temporal conceptual data modelling with description logics. CoRR abs/1209.5571 (2012), http://arxiv.org/abs/1209.5571 5. Baader, F., Borgwardt, S., Lippmann, M.: Temporalizing ontology-based data ac- cess. In: Bonacina, M.P. (ed.) Proc. of the 24th Int. Conf. on Automated Deduc- tion (CADE’13). Lecture Notes in Artificial Intelligence, vol. 7898, pp. 330–344. Springer-Verlag (2013) 6. Borgwardt, S., Lippmann, M., Thost, V.: Temporal query answering in the descrip- tion logic DL-Lite. In: Fontaine, P., Ringeissen, C., Schmidt, R.A. (eds.) Proc. of the 9th Int. Symp. on Frontiers of Combining Systems (FroCoS’13). Lecture Notes in Computer Science, Springer-Verlag, Nancy, France (2013), to appear. 7. Borgwardt, S., Lippmann, M., Thost, V.: Temporal query answering w.r.t. DL-Lite- ontologies. LTCS-Report 13-05, Chair of Automata Theory, TU Dresden, Dresden, Germany (2013), see http://lat.inf.tu-dresden.de/research/reports.html. 8. Calvanese, D., De Giacomo, G., Lembo, D., Lenzerini, M., Poggi, A., Rodriguez- Muro, M., Rosati, R.: Ontologies and databases: The DL-Lite approach. In: Tes- saris, S., Franconi, E., Eiter, T., Gutierrez, C., Handschuh, S., Rousset, M.C., Schmidt, R.A. (eds.) Reasoning Web, 5th Int. Summer School 2009, Tutorial Lec- tures, Lecture Notes in Computer Science, vol. 5689, pp. 255–356. Springer-Verlag (2009) 9. Calvanese, D., Giacomo, G.D., Lembo, D., Lenzerini, M., Rosati, R.: DL-Lite: Tractable description logics for ontologies. In: Proc. of the 20th Nat. Conf. on Artificial Intelligence (AAAI’05). pp. 602–607. AAAI Press (2005) 10. Chandra, A.K., Merlin, P.M.: Optimal implementation of conjunctive queries in relational data bases. In: Hopcroft, J.E., Friedman, E.P., Harrison, M.A. (eds.) Proc. of the 9th Annual ACM Symp. on Theory of Computing (STOC’77). pp. 77–90. ACM Press (1977) 11. Chomicki, J.: Efficient checking of temporal integrity constraints using bounded history encoding. ACM Transactions on Database Systems 20(2), 148–186 (1995) 12. Chomicki, J., Toman, D., Böhlen, M.H.: Querying ATSQL databases with temporal logic. ACM Transactions on Database Systems 26(2), 145–178 (2001) 13. Gabbay, D.: Declarative past and imperative future. In: Banieqbal, B., Barringer, H., Pnueli, A. (eds.) Proc. of the 1987 Coll. on Temporal Logic in Specification. Lecture Notes in Computer Science, vol. 398, pp. 409–448. Springer-Verlag (1989) 14. Gutiérrez-Basulto, V., Klarman, S.: Towards a unifying approach to representing and querying temporal data in description logics. In: Krötzsch, M., Straccia, U. (eds.) Proc. of the 6th Int. Conf. on Web Reasoning and Rule Systems (RR’12). Lecture Notes in Computer Science, vol. 7497, pp. 90–105. Springer-Verlag (2012) 15. Hülsmann, K., Saake, G.: Theoretical foundations of handling large substitution sets in temporal integrity monitoring. Acta Informatica 28(4), 365–407 (1991) 16. Laroussinie, F., Markey, N., Schnoebelen, P.: Temporal logic with forgettable past. In: Proc. of the 17th Annual IEEE Symp. on Logic in Computer Science (LICS’02). pp. 383–392. IEEE Press (2002) 17. Pnueli, A.: The temporal logic of programs. In: Proc. of the 18th Annual Symp. on Foundations of Computer Science (SFCS’77). pp. 46–57 (1977) 18. Saake, G., Lipeck, U.W.: Using finite-linear temporal logic for specifying database dynamics. In: Börger, E., Büning, H.K., Richter, M.M. (eds.) Proc. of the 2nd Workshop on Computer Science Logic (CSL’88), Lecture Notes in Computer Sci- ence, vol. 385, pp. 288–300. Springer-Verlag (1989) 19. Toman, D.: Logical data expiration. In: Chomicki, J., van der Meyden, R., Saake, G. (eds.) Logics for Emerging Applications of Databases, chap. 6, pp. 203–238. Springer-Verlag (2004) 20. Wilke, T.: Classifying discrete temporal properties. In: Proc. of the 16th Annual Symp. on Theoretical Aspects of Computer Science (STACS’99). Lecture Notes in Computer Science, vol. 1563, pp. 32–46. Springer-Verlag (1999)