=Paper=
{{Paper
|id=None
|storemode=property
|title=Efficient Approximation in DL-Lite of OWL 2 Ontologies
|pdfUrl=https://ceur-ws.org/Vol-1014/paper_19.pdf
|volume=Vol-1014
|dblpUrl=https://dblp.org/rec/conf/dlog/ConsoleSS13
}}
==Efficient Approximation in DL-Lite of OWL 2 Ontologies==
Efficient approximation in DL-Lite of OWL 2 ontologies Marco Console, Valerio Santarelli, Domenico Fabio Savo Dipartimento di Ingegneria Informatica Automatica e Gestionale “Antonio Ruberti” S APIENZA Università di Roma lastname@dis.uniroma1.it Abstract. Ontologies, as a conceptualization of a domain of interest, can be used for different objectives, such as for providing a formal description of the domain of interest for documentation purposes, or for providing a mechanism for reason- ing upon the domain. For instance, they are the core element of the Ontology- Based Data Access paradigm, in which the ontology is utilized as a conceptual view, allowing user access to the underlying data sources. With the aim to use an ontology as a formal description of the domain of interest, the use of expressive languages proves to be useful. If instead the goal is to use the ontology for reason- ing tasks which require low computational complexity, the high expressivity of the language used to model the ontology may be of hindrance. In this scenario, the approximation of ontologies expressed in very expressive languages through on- tologies expressed in languages which keep the computational complexity of the reasoning tasks low is pivotal. In this work we present our notion of ontology ap- proximation and present an algorithm for computing the approximation of OWL 2 ontologies by means of DL-Lite TBoxes. Moreover, we provide optimization techniques for this computation, and discuss the results of the implementation of these techniques. 1 Introduction Ontologies, as a shared conceptualization of a domain of interest, are commonly recog- nized as powerful tools in support of the information systems of organizations [6,4,10]. On one hand, they can be used for providing a formal description of the domain of inter- est, and thus represent valuable documentation for an organization. On the other hand, they provide a mechanism for reasoning upon the domain with different objectives. For instance, they are the core element of the Ontology-Based Data Access (OBDA) [13,4] paradigm, in which the ontology is utilized as a conceptual view of the underlying data sources, granting the user the ability to retrieve information without specific knowledge on how this information is organized and where it is stored. With the aim to use an ontology as a formal description of the domain of interest, the use of expressive languages, such as the OWL 2 Web Ontology Language [9], proves to be useful. The expressivity of such languages allows the ontology designer to obtain a precise formalization of the domain. If instead the goal is to use the ontology for reason- ing tasks, the high expressivity of the language used to model the ontology may be of hindrance. In particular, when wishing to access large quantities of data through the on- tology, as in OBDA, the computational cost of very expressive languages such as OWL 2 Marco Console, Valerio Santarelli, Domenico Fabio Savo 2 is prohibitive. In these cases, it is necessary to recur to less expressive languages, thus resigning to having less complete representations of the domain of interest. One possible solution to this issue is to allow the ontology designer the use of ex- pressive languages to define ontologies that model the domain in great detail for the purpose of documentation and of other tasks that do not require strong computational effort, while adopting, for all those reasoning tasks in which particular computational properties are required, such as OBDA, descriptions of the domain of interest obtained through less expressive languages. Following this approach, the notion of approximation becomes pivotal. The goal of approximation is, given a ontology O in a language L, to compute an ontology O0 in a target language L0 , in which as much as possible of the semantics of O is preserved. In general, various approaches can be adopted towards this goal [17,14,12,18,3]. Among these, the approaches in which we are most interested are those which aim to obtain the so-called semantic approximation [14,3,12]. Here, as opposed to those approaches which aim at a syntactic approximation [17,18], the computation of the ontology which is the approximation of the original one is obtained through the semantics of this origi- nal ontology, and not only through its syntactic representation. In this paper, we focus on the semantic approximation of an ontology for OBDA ap- plications. Thus, we study approaches for approximating ontologies in very expressive languages with ontologies in languages that, characterized by low reasoning complex- ity, are suitable for query answering purposes. The most significant works in which this problem is studied are [12] and [3], in which the proposed approaches can be traced back to the work of Selman and Kautz [14]. These approaches, as is ours, are based on the notion of entailment set. Given an ontology O in a language L, the entailment set of O in a target language L0 is the set of all the assertions expressible in L0 over Σ that are entailed by O. The idea behind our approach is that an approximation in L0 of O is an ontology O0 whose entailment set minimally differs from the entailment set of O in L0 . Since OWL 2 is the W3C standard language for expressing ontologies, it is often used as the expressive language for formulating ontologies describing the domain of interest. On the other hand, in the context of OBDA, one naturally focuses on the logics of the DL-Lite family [5]. This is a family of DLs specifically designed to keep all reasoning tasks polynomially tractable in the size of the data, and is thus suitable for OBDA. For this reason, in this work we study the problem of approximating OWL 2 ontologies with ontologies in DL-Lite. To this aim we provide an algorithm for the computation of these approximations, and an optimized technique for the computation of the entailment set of an OWL 2 ontology in DL-Lite, which can be used efficiently in practice. The rest of the paper is organized as follows. In Section 2, we provide some useful notions for the paper. In Section 3 we study the problem of ontology approximation, and introduce our notion of approximation. In Section 4 we focus on the approximation in DL-Lite of OWL 2 ontologies. In Section 5 we describe our technique for optimizing the computation of the entailment set of an OWL 2 ontology in DL-Lite, and present the results of our experimentation. Finally, in Section 6 we conclude the paper. Efficient approximation in DL-Lite of OWL 2 ontologies 3 2 Preliminaries Description Logics (DLs) [2] are logics that allow one to represent the domain of in- terests in terms of concepts, denoting sets of objects, value-domains, denoting sets of values, attributes, denoting binary relations between objects and values, and roles de- noting binary relations over objects. Let Σ be a signature of symbols for individual (object and value) constants and atomic elements, i.e., concepts, value-domains, attributes, and roles. If L is a DL, then an ontology O in L (or L ontology) over Σ is the set T ∪ A, where T , the TBox, is a finite set of intensional assertions over Σ expressed in L, and A, the ABox, is a finite set of instance assertions, i.e, assertions on individuals, over Σ. Different DLs allow for different kinds of TBox and/or ABox assertions and allow for different manners in which these can be combined for obtaining TBoxes and ABoxes in the specific DL. The semantics of a DL ontology is given in terms of first-order (FOL) interpreta- tions (cf. [2]). We denote with M od(O) the set of models of O, i.e., the set of FOL interpretations that satisfy all the assertions in T and A, where the definition of satis- faction depends on the kind of expressions and assertions in the specific DL language in which O is specified. As usual, a ontology O is said to be satisfiable if it admits at least one model, and O is said to entail a First-Order Logic (FOL) sentence φ, denoted O |= φ, if φI = true for all I ∈ M od(O). Moreover, given to ontology O and O0 , we say that O and O0 are logically equivalent if M od(O) = M od(O0 ). In this work we mainly focus on two specific languages, which are OWL 2, the offi- cial ontology language of the World Wide Web Consortium (W3C) [9], and DL-LiteA , a member of the DL-Lite family [5], which is a family of tractable DLs particularly suited for dealing with ontologies with very large ABoxes, and is at the basis of OWL 2 QL, one of the profiles of OWL 2. The Web Ontology Language (version 2), or simply OWL 2, is an ontology lan- guage for the Semantic Web with formally defined meaning. OWL 2 provides for de- scribing the domain of interest in terms of concepts, roles, attributes, individuals, and data values [9]. Due to the limitation of space, here we do not provide a complete de- scription of OWL 2, but we refer the reader to the official W3C specification [11]. We now present the syntax of the DL DL-LiteA . As for the semantics, we apply the general definitions given above, and refer the reader to [13] for the precise description of the semantics of a DL-LiteA ontology. In what follows we adopt the following notation: A, P , and U are symbols in Σ denoting respectively an atomic concept, an atomic role and an atomic attribute; T1 , . . . , Tn are n pairwise disjoint unbounded value-domains; B denotes a basic con- cept; C a general concept; Q a basic role; R a general role; V a general attribute; E a basic value-domain; and F a value-domain expression. Expressions in DL-LiteA are formed according to the following syntax: B −→ A | ∃Q | δ(U ) Q −→ P | P − V −→ U | ¬U C −→ B | ¬B | ∃Q.C | δF (U ) R −→ Q | ¬Q E −→ ρ(U ) F −→ T1 | · · · | Tn where: P − denotes the inverse of P , ∃Q, or unqualified existential restriction denotes the objects related to some object by the role Q, ¬ denotes negation, δ(U ) denotes 4 Marco Console, Valerio Santarelli, Domenico Fabio Savo the domain of U , i.e., the set of objects that U relates to values, and ρ(U ) denotes the range of U , i.e., the set of values related to objects by U . The concept ∃Q.C, or qualified existential restriction, denotes the qualified domain of Q with respect to C, i.e., the set of objects that Q relates to some instance of C. Similarly, δF (U ) denotes the qualified domain of U with respect to a value-domain F , i.e., the set of objects that U relates to some value in F . We refer to the qualified existential restriction expression ∃Q1 . . . . .∃Qn .C, where C is not a qualified existential restriction, as an existential role chain of depth n. A DL-LiteA TBox assertion is an assertion of the form: BvC QvR EvF U vV (funct Q) (funct U ) From left to right, the first four assertions denote inclusions between concepts, roles, value-domains, and attributes, respectively. The last two assertions denote functionality on roles and on attributes. A DL-LiteA TBox is a finite set of assertions of the form above, where suitable limitations in the combination of such assertion are imposed. Given a TBox T and a role P (resp. an attribute U ), we say that P (resp. U ) is primitive in T if P does not appear in T positively in the right-hand side of any role (resp. attribute) inclusion assertion and in any qualified existential restriction. In a DL-LiteA TBox, a role P (resp, attribute U ) that is not primitive in T cannot appear either directly nr inversely in a functionality assertion. A DL-LiteA ABox A is a finite set of assertions of the form A(a), P (a, b), and U (a, v), where A, P , and U are as above, a and b are object constants while v is a value constant. 3 Approximation of DL ontologies In this section, we present our notion of approximation of an ontology expressed in a language L in a target language L0 , and then we provide a comparison between our notion and others proposed in literature. Ontology approximation. In what follows, O is a satisfiable L ontology, and L0 a language not necessarily different from L. First of all, we need to introduce the notion of entailment set [12] of a satisfiable ontology with respect to a language. Definition 1. Let O be a satisfiable ontology expressed in a language L over a signa- ture Σ, and let L’ be a language, not necessarily different from L. The entailment set of O with respect to L0 , denoted as ES(O, L0 ), is the set which contains all L0 axioms over Σ that are entailed by O. In other words, we say that an axiom α belongs to the entailment set of an ontology O with respect to a language L0 , if α is an L0 axiom built over the alphabet of O and for each interpretation I ∈ M od(O) we have that I |= α. Clearly, given an ontology O and a language L0 , the entailment set of O with respect to L0 is unique. With the notion of entailment set in place, we can define the notion of approximation in L0 of O0 . Efficient approximation in DL-Lite of OWL 2 ontologies 5 Definition 2. Let O be an ontology over a signature Σ. A satisfiable L0 ontology O0 over Σ is an approximation in L0 of O if both the following statements hold: (i) ES(O0 , L0 ) ⊆ ES(O, L0 ); (ii) there is no satisfiable L0 ontology O00 such that ES(O0 , L0 ) ⊂ ES(O00 , L0 ) ⊆ ES(O, L0 ). In other words, the above definition states that a satisfiable ontology O0 is an ap- proximation in L0 of O , if O0 is an ontology in L0 , and there is no a satisfiable ontology O00 in L0 whose entailment set in L0 is “nearer” to the entailment set of O in L0 than the entailment set in L0 of O, where the distance here is measured in terms of set inclusion. Given an ontology O, it may be the case that the entailment set in a language L0 of O is infinite. If so, it may happen that the approximation in L0 , in accordance with Definition 2, does not exist. For this reason, we follow the approach proposed in [3], where safeness restrictions on the target language are imposed, in order to guarantee the finiteness of the entailment set. For example, in DL-LiteA , which is the language we focus on in the following sections, the infiniteness of the entailment set arises from the possibility of inferring infinitely-long existential chains. In this case, the safeness restriction requires to allow, in the TBox, only existential chains of bounded length. Therefore, in what follows we denote versions of DL-LiteA in which such restriction is (k) enforced as DL-LiteA , where k is the bounded length of the existential chains. Another characteristic of DL-LiteA , shared with other languages such as EL++ [1], is that syntactic restrictions are imposed on the manner in which assertions can be com- bined. In this case, there may exist more than one ontology which is an approximation in L0 of O. In what follows we denote with ApxM AX (O, L0 ) the set of L0 ontologies which are approximations in L0 of O in accordance with Definition 2. Example 1. Consider the OWL 2 ontology O = {A v ∃R.B, A v ∀R.B, (funct R)}. It is easy to see that, due to the syntactic restriction in DL-LiteA , which imposes that a non primitive role cannot be functional, we have that, up to logical equivalence, the set (1) ApxM AX (O, DL-LiteA ) = {{A v ∃R.B}, {A v ∃R, (funct R)}}. t u Comparison with related work. The problem of approximating ontologies for OBDA applications has recently been faced in [12] and [3]. In [12], the authors define the approximation in L0 of a satisfiable ontology O as the set of L0 axioms coinciding with the entailment set of O with respect to L0 . Definition 3. Let O be a satisfiable ontology expressed in a language L. The approxi- mation in L0 of O is ApxES (O, L0 ) = ES(O, L0 ). Therefore, in accordance with the above definition, the approximation in L0 of the on- tology O is a set of L0 axioms, but not necessarily a valid ontology expressed in L0 . In [3], the authors provide a more sophisticated notion of approximation, in which it is required that the approximation in L0 of O be an ontology expressed in L0 . Definition 4. Let O be a satisfiable ontology expressed in a language L over a signa- ture Σ. A satisfiable L0 ontology O0 over Σ is an approximation in L0 of O if both the following statements hold: 6 Marco Console, Valerio Santarelli, Domenico Fabio Savo (i) ES(O0 , L0 ) ⊆ ES(O, L0 ); (ii) for each α ∈ ES(O, L0 ) such that O0 ∪{α} is an L0 ontology, then α ∈ ES(O0 , L0 ). In other words, an L0 ontology O0 is an approximation in L0 of O if among all the possible maximal subsets of ES(O, L0 ) which are L0 ontologies, there is one which is logically equivalent to O0 . It is not unexpected, as with our approach, that there may exist more than one ontology which is an approximation in L0 of O. We denote with ApxSC (O, L0 ) set of L0 ontologies which are all approximations in L0 of O in accordance with Definition 4. Differently from Definition 3, Definition 4 guarantees that an approximation is al- ways an ontology expressed in the target language L0 . Similarly to our notion of approximation, if the entailment set is infinite, it may hap- pen that there is no ontology expressed in the target language which is an approxima- tion. Hence, in order to guarantee the existence of an approximation, one must impose safeness restrictions on the target language in this case as well, in order to guarantee the finiteness of the entailment set. We now compare our approach to those in [3] and [12] by means of some examples. Example 2. Consider the following OWL 2 ontology: O = {∃R− v B, A v ∃R.B, (funct R)}. In accordance with Definitions 3, 4, 2, we have: (1) ApxES (O, DL-LiteA ) = {∃R− v B, A v ∃R, A v ∃R.B, A v ∃R.∃R− , ∃R v ∃R.∃R− , ∃R− v ∃R− .∃R, (funct R) } (1) 0 ApxSC (O, DL-LiteA ) = {Osc = {∃R− v B, A v ∃R, A v ∃R.B, A v ∃R.∃R− , ∃R v ∃R.∃R− , ∃R− v ∃R− .∃R }, Osc = {∃R− v B, A v ∃R, (funct R) } } 00 (1) ApxM AX (O, DL-LiteA ) = {Omax = {∃R− v B, A v ∃R, (funct R) } } t u (1) Due to the syntactic restrictions enforced in DL-LiteA , ontology O of Example 2 is (1) (1) not a DL-LiteA ontology. However, it is logically equivalent to the DL-LiteA ontology {∃R− v B, A v ∃R, (funct R)}. (1) (1) Regarding ApxES (O, DL-LiteA ) we only observe that it is not a valid DL-LiteA (1) ontology. Up to logical equivalence, we can see that the set ApxSC (O, DL-LiteA ) 00 contains, along with ontology Osc , which is logically equivalent to O, also the unex- 0 0 (1) 00 (1) pected ontology Osc , for which we have ES(Osc , DL-LiteA ) ⊂ ES(Osc , DL-LiteA ). Finally, according to Definition 2, up to logical equivalence, the only approximation is (1) a DL-LiteA ontology that is logically equivalent to O. Other significant differences between these three approaches arise when the goal is to compute the approximation of an ontology O expressed in a language L in which the UNA is not adopted into a target language L0 in which the UNA is adopted. In the example below, we highlight the different behavior of the three approaches in this circumstance. Example 3. We recall that in OWL 2 the UNA is not adopted. This means, for instance, that given the following two ontologies O = {A v {o1 }, B v {o2 }} and O0 = {A v Efficient approximation in DL-Lite of OWL 2 ontologies 7 {o1 }, B v {o2 }, o1 6= o2 }, we have that M od(O) 6= M od(O0 ). In fact, while O does not entail A v ¬B, O0 does. Differently, if we adopt the UNA in OWL 2, then we have that M od(O) = M od(O0 ), and both entail the assertion A v ¬B. From the observations above, it follows that {A v {o1 }, B v {o2 }} ⊆ ES(O, OWLU N A ), and that A v ¬B 6∈ ES(O, OWLU N A ). In what follows let us denote with OWLU N A the version of OWL 2 where the UNA is adopted. In accordance with Definitions 3, 4, and 2, we have that, up to logical equivalence, the approximations in OWLU N A of O are: – ApxES (O, OWLU N A ) = ES(O, OWLU N A ). In this case, ApxES (O, OWLU N A ) is a valid OWLU N A ontology. Let OES be such ontology. As shown above, since in OWLU N A the UNA is adopted, OES |= A v ¬B. This means that ES(OES , OWLU N A ) 6⊆ ES(O, OWLU N A ). – ApxSC (O, OWLU N A ) = ∅. Indeed, since ES(O, OWLU N A ) is a valid OWLU N A ontology, any OWLU N A ontology O1 such that ES(O1 , OWLU N A ) ⊂ ES(O, OWLU N A ), does not satisfy condition (ii) of Definition 4. Moreover, since every OWLU N A ontology O2 that satisfies condition (ii) entails both {A v {o1 }, B v {o2 }}, then it also entails A v ¬B. Hence ES(O2 , OWLU N A ) 6⊆ ES(O, OWLU N A ), and so condition (i) of Definition 4 is not satisfied. 0 00 – ApxM AX (O, OWLU N A ) = {Omax = {A v {o1 }}, Omax = {B v 0 00 {o2 }} }. Indeed, it is easy to verify that both Omax and Omax satisfy both con- ditions (i) and (ii) of Definition 2, and that there is no other ontology O000 in 0 00 ApxM AX (O, OWLU N A ) logically equivalent to neither Omax nor Omax . t u As shown in the previous example, given an L ontology O and a target language L0 , our approach, as the one in [3], but differently from the one given in [12], guarantees that an approximation in L0 of O is an L0 ontology, and that for each L0 ontology O0 that is an approximation in L0 of O, we have that ES(O0 , L0 ) ⊆ ES(O, L0 ). Finally, it can be shown that our approach always preserves the same or more inferences than those obtained by adopting the approach given in [3]. Theorem 1. For each Osc ∈ ApxSC (O, L0 ), there is an Omax ∈ ApxM AX (O, L0 ) such that ES(Osc , L0 ) ⊆ ES(Omax , L0 ). Furthermore, for each Omax ∈ ApxM AX (O, L0 ), there is no Osc ∈ ApxSC (O, L0 ) such that ES(Omax , L0 ) ⊂ ES(Osc , L0 ). 4 Approximation in DL-LiteA of OWL 2 ontologies In this section, we study the problem of computing the approximation in DL-LiteA of a satisfiable OWL 2 ontology O. More specifically, we aim to approximate O with DL-LiteA TBox assertions. Therefore, in what follows we assume that ApxM AX (O, DL-LiteA ) is a set of DL-LiteA TBoxes and that ES(O, DL-LiteA ) is a set of DL-LiteA TBox assertions. To guarantee the finiteness of the entailment set, we refer to versions of DL-LiteA allowing only for TBox assertions with existential chains of bounded length k. 8 Marco Console, Valerio Santarelli, Domenico Fabio Savo (k) Given a set of DL-LiteA assertions S, and a functionality assertion ϕ over a role R (resp. attribute U ), we denote with clashes(ϕ, S) the set of all assertions involving R (k) (resp. U ) that, together with ϕ, violate the syntactic restriction imposed on DL-LiteA TBoxes. Hence, clashes(ϕ, S) is a set of role (resp. attribute) inclusion assertions and assertions with a qualified existential role (resp. attribute) on the right hand side. Let O be an OWL 2 ontology, and let F be the set containing all the function- (k) (k) ality assertions in ES(O, DL-LiteA ) for which clashes(ϕ, ES(O, DL-LiteA )) 6= ∅. (k) (k) If F = 6 ∅, then ES(O, DL-LiteA ) is not a valid DL-LiteA TBox, and there are at (k) most 2|F | TBoxes Ti which are valid DL-LiteA TBoxes and minimally differ from (k) (k) (k) ES(O, DL-LiteA ). Let M axSubES (ES(O, DL-LiteA )) be the set of such DL-LiteA (k) TBoxes. One can compute these TBoxes in M axSubES (ES(O, DL-LiteA )) by re- (k) tracting, from ES(O, DL-LiteA ), either ϕ ∈ F or the assertions in clashes(ϕ, S), in order to resolve the violations of the syntactic restriction. (k) The lemma below guarantees that a TBox in M axSubES (ES(O, DL-LiteA )) is a (k) candidate for being one of the TBoxes in ApxM AX (O, DL-LiteA ). (k) Lemma 1. Let O be a satisfiable OWL 2 ontology and let T be a DL-LiteA TBox. (k) (k) (k) ES(T , DL-LiteA ) ⊆ ES(O, DL-LiteA ) if and only if T ⊆ ES(O, DL-LiteA ). In other words, the above lemma guarantees that the first condition in Definition 2 (k) (k) is satisfied by every DL-LiteA TBox in M axSubES (ES(O, DL-LiteA )), and that (k) in computing all the TBoxes in ApxM AX (O, DL-LiteA ), one can consider only the (k) assertions in ES(O, DL-LiteA ). Moreover, from the monotonicity of the DLs, it di- (k) rectly follows that for every TBox T in ApxM AX (O, DL-LiteA ) there is a TBox in (k) M axSubES (ES(O, DL-LiteA )) which is logically equivalent to T . (k) However, in order for a TBox Ti in M axSubES (ES(O, DL-LiteA )) to belong to (k) ApxM AX (O, DL-LiteA ), it must also satisfy the second condition of Definition 2, (k) (k) and thus that there is no other DL-LiteA TBox T 0 ⊆ ES(T , DL-LiteA ) such that (k) (k) (k) ES(Ti , DL-LiteA ) ⊂ ES(T 0 , DL-LiteA ) ⊆ ES(O, DL-LiteA ). (k) To explain why a TBox in M axSubES (ES(O, DL-LiteA )) does not necessarily also satisfy the second condition in Definition 2, we refer to the ontology O = {∃R− v (1) B, A v ∃R.B, (funct R)} of Example 2. It is easy to see that ES(O, DL-LiteA ) = {∃R v B, A v ∃R.B, (funct R), ∃R v ∃R.∃R , ∃R v ∃R.B, A v ∃R.∃R− , − − (1) ∃R− v ∃R− .∃R, A v ∃R}, and that clashes((funct R), ES(O, DL-LiteA )) = {A v ∃R.B, ∃R v ∃R.∃R− , ∃R v ∃R.B, A v ∃R.∃R− , ∃R− v ∃R− .∃R}. Hence, by following the procedure described above, we have the following two TBoxes (1) in M axSubES (ES(O, DL-LiteA )): T1 = {∃R− v B, A v ∃R.B, ∃R v ∃R.∃R− , ∃R v ∃R.B, A v ∃R.∃R , ∃R− v ∃R− .∃R, A v ∃R}, and T2 = {∃R− v B, − (1) (funct R), A v ∃R}. However, it is clear that only T2 ∈ ApxM AX (O, DL-LiteA ). In (1) (1) fact we have that ES(T1 , DL-LiteA ) ⊂ ES(T2 , DL-LiteA ). We provide the algorithm isApx which, given a TBox T and an ontology O, returns (k) true if T ∈ ApxM AX (O, DL-LiteA ), false otherwise. The algorithm proceeds as fol- Efficient approximation in DL-Lite of OWL 2 ontologies 9 Algorithm 1: isApx(T , O) (k) Input: a DL-LiteA TBox T , a satisfiable OWL 2 ontology O Output: true or false begin (k) E ← ES(T , DL-LiteA ); (k) S ← ES(O, DL-LiteA ) \ E; foreach α ∈ S (k) if T ∪ {α} is a DL-LiteA TBox then return false; foreach functionality assertion φ ∈ E E ← E \ clashes(φ, E); foreach functionality assertion ϕ ∈ S (k) (k) if ES(E \ clashes(ϕ, E), DL-LiteA ) = ES(T , DL-LiteA ) then return false; return true; end (k) lows. Given a satisfiable OWL 2 ontology O and a DL-LiteA TBox T , it first computes (k) their entailment sets in DL-LiteA and then computes the set S containing all the asser- (k) tions in the entailment set of O in DL-LiteA which are not in the entailment set of T in (k) DL-LiteA . Then, for every assertion α in S, it attempts to add α to T without violat- ing any syntactic restriction. If such assertion exists, then T is not an approximation in (k) DL-LiteA of O. If not, the algorithm continues, fixing E as the entailment set of T in (k) DL-LiteA , and resolving every violation of the syntactic restriction in E by removing clashes(φ, E) from E, for every functionality assertion φ in E. This operation guaran- (k) tees that E is a DL-LiteA TBox. Then the algorithm checks, for every functionality assertion ϕ in S, if the set obtained from E by removing clashes(ϕ, E) from E is logi- (k) cally equivalent to E. If this is the case, then it is possible to build a DL-LiteA TBox T 00 (k) by adding ϕ to the set of assertions obtained in this fashion. T 00 is a DL-LiteA TBox that proves that T does not satisfy the second condition of Definition 2. Otherwise, the algorithm terminates by returning true. The theorem below establishes termination and correctness of Algorithm 1. (k) Theorem 2. Let O be a satisfiable OWL 2 ontology, and let T be a DL-LiteA TBox. (k) isApx(T , O) terminates, and returns true if and only if T ∈ ApxM AX (O, DL-LiteA ). Given a satisfiable OWL 2 ontology O, Lemma 1 and Theorem 2 suggest Algo- (k) rithm 2 for computing, up to logical equivalence, the set ApxM AX (O, DL-LiteA ). The following theorem establishes termination and correctness of Algorithm 2. Theorem 3. Let O be a satisfiable OWL 2 ontology. Then computeApx(O) terminates (k) and computes, up to logical equivalence, ApxM AX (O, DL-LiteA ). As expected, Algorithm 2 does not return a single TBox, but instead a set of TBoxes. For application purposes, the approximation that shall be used must be chosen from this set. A pragmatic approach could be to choose one non-deterministically. Instead, one could think to leave this choice to the end user, according to the use he intends to make 10 Marco Console, Valerio Santarelli, Domenico Fabio Savo Algorithm 2: computeApx(O) Input: a satisfiable OWL 2 ontology O (k) Output: a set of DL-LiteA TBoxes begin (k) S ← M axSubES (ES(O, DL-LiteA )); foreach Ti ∈ S if isApx(Ti , O) = false then S ← S \ {Ti }; return S; end of it. A more interesting direction could be to achieve the identification of a unique TBox by applying some preference criteria to the set returned by Algorithm 2. 5 Efficient entailment set computation in DL-Lite In the previous sections, we have provided an algorithm for computing the set (k) ApxM AX (O, DL-LiteA ) of TBoxes for an OWL 2 ontology O. This computation is (k) clearly intractable. Indeed, it requires to compute the set ES(O, DL-LiteA ), and more- (k) over, due to the syntactic restriction enforced in DL-LiteA , in the worst case, the cardi- (k) nality of the set ApxM AX (O, DL-LiteA ) is exponential in the number of functionality (k) (k) assertions in ES(O, DL-LiteA ). In particular, the computation of ES(O, DL-LiteA ) is in general very costly, as highlighted also in [3] and [12], since it requires the in- vocation of reasoning services over an OWL 2 ontology O. This task is performed by invoking an OWL 2 oracle which can be implemented by an OWL 2 reasoner. (k) A naive algorithm for computing ES(O, DL-LiteA ) is the one described in [12], in (k) which firstly one computes the set Γ of DL-LiteA TBox assertions which can be built over the signature Σ, and then, for each assertion α ∈ Γ such that O |= α, one adds α (k) to ES(O, DL-LiteA ). (k) We now show how to optimize the computation of ES(O, DL-LiteA ) through a technique which drastically reduces in practice the calls to the OWL 2 oracle. (k) In the computation of ES(O, DL-LiteA ), a large portion of the invocations of the OWL 2 oracle involve assertions in which a general concept C∃R1 ...∃Rn involving a non trivial existential role chain occurs. Empirical observation has brought to light the fact that this kind of general concept very often does not subsume any concept in O. Hence, all the invocations of the OWL 2 oracle involving these childless general concepts are useless. Therefore, at the base of our strategy is the identification of all these childless general concepts C∃R1 ...∃Rn , without invoking the OWL 2 oracle. We use the function subsumed(S1 , O), where S1 is a general concept (resp. general role, general attribute) which returns the set of concepts (resp. roles, attributes) S2 such that O |= S2 v S1 . This function is efficiently performed by the most commonly-used OWL 2 reasoners, such as Pellet [15], Racer [8], FACT++ [16], and HermiT [7]. Our technique calls, as the first step, for the computation of the classification of basic concepts, roles, and attributes, which is encoded into a directed graph, in which the nodes represent the predicates of the ontology, and the edges the inclusion assertions. Efficient approximation in DL-Lite of OWL 2 ontologies 11 Ontology DL Fragment #O.A. #N.A. Total computation time (ms) k=1 k=2 k=3 k=1 k=2 k=3 k=1 k=2 k=3 Mouse ALCI 6.059 12.611 19.163 11.018 40.362 157.738 8.426 9.955 12.173 Pathway EL 10.191 11.999 11.999 14.294 52.374 204.694 11.975 16.498 17.553 Cognitive SHIF (D) 56.883 178.381 474.145 48.006 541.350 6.461.478 348.892 1.812.511 6.832.865 Mammalian AL+ 7.551 7.551 7.551 112.898 413.922 1.618.018 322.527 350.853 350.853 Spatial ALEHI 51.065 82.735 150.195 47.143 4.541.815 445.019.671 27.827 52.742 132.807 (k) Table 1: Evaluation of the optimized algorithm for computing ES(O, DL-LiteA ). After this initial step, the remaining invocations, which we work to minimize, are those needed for computing the entailed inclusion assertions involving general con- cepts C∃R1 ...∃Rn , and the entailed disjointness. Regarding the former, we exploit the graph encoding of concept, role, and attribute classification to invoke these subsump- tion checks in a manner which follows the hierarchical order of these general concepts C∃R1 ...∃Rn , in order to avoid those checks which can be skipped. Consider, for exam- ple, an ontology O that entails the inclusions A1 v A2 and P1 v P2 , where A1 and A2 are concepts and P1 and P2 are roles. Exploiting these inclusions we can deduce the hierarchical structure involving general concepts that can be built on these four predicates. For instance, we know that ∃P2 .A2 v ∃P2 , that ∃P2 .A1 v ∃P2 .A2 , that ∃P1 .A1 v ∃P2 .A1 , and so on. We begin by invoking the OWL 2 oracle, asking for the children of the general concepts which are in the highest position in the hierarchy. So, first we call subsumed(∃P2 , O). If subsumed(∃P2 , O) = ∅, we avoid invoking the oracle asking for subsumed(∃P2 .A2 , O), and so on. Regarding the latter we fol- low the same procedure, but beginning from the lowest positions in the hierarchy. The algorithm concludes by asking the OWL 2 oracle for all functionalities entailed by O. In Table 1 we present a sample of the evaluation tests for this strategy, performed through a Java-based implementation of this technique. The table reports the number of invocations to the OWL 2 oracle performed with our optimization (#O.A.), and without (k) (#N.A.), for computing the entailment set of the ontologies in DL-LiteA , with 1 ≤ k ≤ (k) 3. It also reports, for each ontology, the total computation time of ES(O, DL-LiteA ). 6 Conclusion In this paper we address the problem of ontology approximation. We illustrate our ap- proach to this problem, and provide a comparison with other approaches provided in literature. We deeply investigate the approximation of OWL 2 ontologies with DL-Lite TBoxes for OBDA purposes, and provide an algorithm for its computation. Finally, we present a technique for the optimization of the core procedure of this computation, whose success we have shown with empirical evaluation. As future work, we plan to improve the performances in computing the approximation in DL-Lite of OWL 2 on- tologies by adopting more sophisticated techniques. Moreover, we aim to study reason- able solutions for addressing the problem of multiple approximations of an ontology. In particular, for those settings in which the approximation is used in OBDA. Acknowledgments. This research has been partially supported by the EU under FP7 project Optique (grant n. FP7-318338), and by the EU under FP7-ICT project ACSI (grant no. 257593). 12 Marco Console, Valerio Santarelli, Domenico Fabio Savo References 1. Franz Baader, Sebastian Brandt, and Carsten Lutz. Pushing the EL envelope further. In Proc. of OWLED 2008 DC, 2008. 2. Franz Baader, Diego Calvanese, Deborah McGuinness, Daniele Nardi, and Peter F. Patel- Schneider, editors. The Description Logic Handbook: Theory, Implementation and Applica- tions. Cambridge University Press, 2003. 3. Elena Botoeva, Diego Calvanese, and Mariano Rodriguez-Muro. Expressive approximations in DL-Lite ontologies. Proc. of AIMSA 2010, pages 21–31, 2010. 4. Diego Calvanese, Giuseppe De Giacomo, Domenico Lembo, Maurizio Lenzerini, Antonella Poggi, Mariano Rodriguez-Muro, Riccardo Rosati, Marco Ruzzi, and Domenico Fabio Savo. The Mastro system for ontology-based data access. Semantic Web J., 2(1):43–53, 2011. 5. Diego Calvanese, Giuseppe De Giacomo, Domenico Lembo, Maurizio Lenzerini, and Ric- cardo Rosati. Tractable reasoning and efficient query answering in description logics: The DL-Lite family. J. of Automated Reasoning, 39(3):385–429, 2007. 6. Balakrishnan Chandrasekaran, John R Josephson, and V Richard Benjamins. What are on- tologies, and why do we need them? Intelligent Systems and Their Applications, IEEE, 14(1):20–26, 1999. 7. Birte Glimm, Ian Horrocks, Boris Motik, Rob Shearer, and Giorgos Stoilos. A novel ap- proach to ontology classification. J. of Web Semantics, 10(1), 2011. 8. Volker Haarslev and Ralf Möller. RACER system description. In Proc. of IJCAR 2001, volume 2083 of LNAI, pages 701–705. Springer, 2001. 9. Pascal Hitzler, Markus Krötzsch, Bijan Parsia, Peter F. Patel-Schneider, and Sebas- tian Rudolph, editors. OWL 2 Web Ontology Language: Primer. W3C Recom- mendation, 11 december 2012. Available at http://www.w3.org/TR/2012/ REC-owl2-primer-20121211/. 10. Maurizio Lenzerini. Ontoloogy-based data management. In Proc. of CIKM 2011, pages 5–6, 2011. 11. Boris Motik, Bijan Parsia, and Peter F. Patel-Schneider, editors. OWL 2 Web On- tology Language Structural Specification and Functional-Style Syntax. W3C Rec- ommendation, 11 december 2012. Available at http://www.w3.org/TR/2012/ REC-owl2-syntax-20121211/. 12. Jeff Z Pan and Edward Thomas. Approximating OWL-DL ontologies. In Proc. of AAAI 2007, page 1434, 2007. 13. Antonella Poggi, Domenico Lembo, Diego Calvanese, Giuseppe De Giacomo, Maurizio Lenzerini, and Riccardo Rosati. Linking data to ontologies. J. on Data Semantics, X:133– 173, 2008. 14. Bart Selman and Henry Kautz. Knowledge compilation and theory approximation. J. of the ACM, 43(2):193–224, 1996. 15. Evren Sirin, Bijan Parsia, Bernardo Cuenca Grau, Aditya Kalyanpur, and Yarden Katz. Pel- let: a practical OWL-DL reasoner. Web Semantics: science, services and agents on the World Wide Web, 5(2):51–53, 2007. 16. Dmitry Tsarkov and Ian Horrocks. FaCT++ description logic reasoner: System description. In Proc. of IJCAR 2006, pages 292–297, 2006. 17. Tuvshintur Tserendorj, Sebastian Rudolph, Markus Krötzsch, and Pascal Hitzler. Approxi- mate OWL-reasoning with Screech. In Proc. of RR 2008, pages 165–180. Springer, 2008. 18. Holger Wache, Perry Groot, and Heiner Stuckenschmidt. Scalable instance retrieval for the semantic web by approximation. In Proc. of WISE 2005, pages 245–254. Springer, 2005.