=Paper=
{{Paper
|id=None
|storemode=property
|title=Efficient Approximation in DL-Lite of OWL 2 Ontologies
|pdfUrl=https://ceur-ws.org/Vol-1014/paper_19.pdf
|volume=Vol-1014
|dblpUrl=https://dblp.org/rec/conf/dlog/ConsoleSS13
}}
==Efficient Approximation in DL-Lite of OWL 2 Ontologies==
<pdf width="1500px">https://ceur-ws.org/Vol-1014/paper_19.pdf</pdf>
<pre>
Efficient approximation in DL-Lite of OWL 2 ontologies

               Marco Console, Valerio Santarelli, Domenico Fabio Savo

     Dipartimento di Ingegneria Informatica Automatica e Gestionale “Antonio Ruberti”
                              S APIENZA Università di Roma
                                 lastname@dis.uniroma1.it


       Abstract. Ontologies, as a conceptualization of a domain of interest, can be used
       for different objectives, such as for providing a formal description of the domain
       of interest for documentation purposes, or for providing a mechanism for reason-
       ing upon the domain. For instance, they are the core element of the Ontology-
       Based Data Access paradigm, in which the ontology is utilized as a conceptual
       view, allowing user access to the underlying data sources. With the aim to use an
       ontology as a formal description of the domain of interest, the use of expressive
       languages proves to be useful. If instead the goal is to use the ontology for reason-
       ing tasks which require low computational complexity, the high expressivity of
       the language used to model the ontology may be of hindrance. In this scenario, the
       approximation of ontologies expressed in very expressive languages through on-
       tologies expressed in languages which keep the computational complexity of the
       reasoning tasks low is pivotal. In this work we present our notion of ontology ap-
       proximation and present an algorithm for computing the approximation of OWL
       2 ontologies by means of DL-Lite TBoxes. Moreover, we provide optimization
       techniques for this computation, and discuss the results of the implementation of
       these techniques.


1   Introduction

Ontologies, as a shared conceptualization of a domain of interest, are commonly recog-
nized as powerful tools in support of the information systems of organizations [6,4,10].
On one hand, they can be used for providing a formal description of the domain of inter-
est, and thus represent valuable documentation for an organization. On the other hand,
they provide a mechanism for reasoning upon the domain with different objectives. For
instance, they are the core element of the Ontology-Based Data Access (OBDA) [13,4]
paradigm, in which the ontology is utilized as a conceptual view of the underlying data
sources, granting the user the ability to retrieve information without specific knowledge
on how this information is organized and where it is stored.
     With the aim to use an ontology as a formal description of the domain of interest, the
use of expressive languages, such as the OWL 2 Web Ontology Language [9], proves to
be useful. The expressivity of such languages allows the ontology designer to obtain a
precise formalization of the domain. If instead the goal is to use the ontology for reason-
ing tasks, the high expressivity of the language used to model the ontology may be of
hindrance. In particular, when wishing to access large quantities of data through the on-
tology, as in OBDA, the computational cost of very expressive languages such as OWL
2       Marco Console, Valerio Santarelli, Domenico Fabio Savo

2 is prohibitive. In these cases, it is necessary to recur to less expressive languages, thus
resigning to having less complete representations of the domain of interest.
    One possible solution to this issue is to allow the ontology designer the use of ex-
pressive languages to define ontologies that model the domain in great detail for the
purpose of documentation and of other tasks that do not require strong computational
effort, while adopting, for all those reasoning tasks in which particular computational
properties are required, such as OBDA, descriptions of the domain of interest obtained
through less expressive languages.
     Following this approach, the notion of approximation becomes pivotal. The goal of
approximation is, given a ontology O in a language L, to compute an ontology O0 in a
target language L0 , in which as much as possible of the semantics of O is preserved. In
general, various approaches can be adopted towards this goal [17,14,12,18,3]. Among
these, the approaches in which we are most interested are those which aim to obtain
the so-called semantic approximation [14,3,12]. Here, as opposed to those approaches
which aim at a syntactic approximation [17,18], the computation of the ontology which
is the approximation of the original one is obtained through the semantics of this origi-
nal ontology, and not only through its syntactic representation.
     In this paper, we focus on the semantic approximation of an ontology for OBDA ap-
plications. Thus, we study approaches for approximating ontologies in very expressive
languages with ontologies in languages that, characterized by low reasoning complex-
ity, are suitable for query answering purposes. The most significant works in which this
problem is studied are [12] and [3], in which the proposed approaches can be traced
back to the work of Selman and Kautz [14].
     These approaches, as is ours, are based on the notion of entailment set. Given an
ontology O in a language L, the entailment set of O in a target language L0 is the set of
all the assertions expressible in L0 over Σ that are entailed by O. The idea behind our
approach is that an approximation in L0 of O is an ontology O0 whose entailment set
minimally differs from the entailment set of O in L0 .
    Since OWL 2 is the W3C standard language for expressing ontologies, it is often
used as the expressive language for formulating ontologies describing the domain of
interest. On the other hand, in the context of OBDA, one naturally focuses on the logics
of the DL-Lite family [5]. This is a family of DLs specifically designed to keep all
reasoning tasks polynomially tractable in the size of the data, and is thus suitable for
OBDA. For this reason, in this work we study the problem of approximating OWL
2 ontologies with ontologies in DL-Lite. To this aim we provide an algorithm for the
computation of these approximations, and an optimized technique for the computation
of the entailment set of an OWL 2 ontology in DL-Lite, which can be used efficiently
in practice.
    The rest of the paper is organized as follows. In Section 2, we provide some useful
notions for the paper. In Section 3 we study the problem of ontology approximation,
and introduce our notion of approximation. In Section 4 we focus on the approximation
in DL-Lite of OWL 2 ontologies. In Section 5 we describe our technique for optimizing
the computation of the entailment set of an OWL 2 ontology in DL-Lite, and present
the results of our experimentation. Finally, in Section 6 we conclude the paper.
                            Efficient approximation in DL-Lite of OWL 2 ontologies      3

2   Preliminaries
Description Logics (DLs) [2] are logics that allow one to represent the domain of in-
terests in terms of concepts, denoting sets of objects, value-domains, denoting sets of
values, attributes, denoting binary relations between objects and values, and roles de-
noting binary relations over objects.
     Let Σ be a signature of symbols for individual (object and value) constants and
atomic elements, i.e., concepts, value-domains, attributes, and roles. If L is a DL, then
an ontology O in L (or L ontology) over Σ is the set T ∪ A, where T , the TBox, is a
finite set of intensional assertions over Σ expressed in L, and A, the ABox, is a finite
set of instance assertions, i.e, assertions on individuals, over Σ. Different DLs allow
for different kinds of TBox and/or ABox assertions and allow for different manners in
which these can be combined for obtaining TBoxes and ABoxes in the specific DL.
     The semantics of a DL ontology is given in terms of first-order (FOL) interpreta-
tions (cf. [2]). We denote with M od(O) the set of models of O, i.e., the set of FOL
interpretations that satisfy all the assertions in T and A, where the definition of satis-
faction depends on the kind of expressions and assertions in the specific DL language
in which O is specified. As usual, a ontology O is said to be satisfiable if it admits at
least one model, and O is said to entail a First-Order Logic (FOL) sentence φ, denoted
O |= φ, if φI = true for all I ∈ M od(O). Moreover, given to ontology O and O0 , we
say that O and O0 are logically equivalent if M od(O) = M od(O0 ).
     In this work we mainly focus on two specific languages, which are OWL 2, the offi-
cial ontology language of the World Wide Web Consortium (W3C) [9], and DL-LiteA , a
member of the DL-Lite family [5], which is a family of tractable DLs particularly suited
for dealing with ontologies with very large ABoxes, and is at the basis of OWL 2 QL,
one of the profiles of OWL 2.
     The Web Ontology Language (version 2), or simply OWL 2, is an ontology lan-
guage for the Semantic Web with formally defined meaning. OWL 2 provides for de-
scribing the domain of interest in terms of concepts, roles, attributes, individuals, and
data values [9]. Due to the limitation of space, here we do not provide a complete de-
scription of OWL 2, but we refer the reader to the official W3C specification [11].
     We now present the syntax of the DL DL-LiteA . As for the semantics, we apply the
general definitions given above, and refer the reader to [13] for the precise description
of the semantics of a DL-LiteA ontology.
     In what follows we adopt the following notation: A, P , and U are symbols in
Σ denoting respectively an atomic concept, an atomic role and an atomic attribute;
T1 , . . . , Tn are n pairwise disjoint unbounded value-domains; B denotes a basic con-
cept; C a general concept; Q a basic role; R a general role; V a general attribute; E a
basic value-domain; and F a value-domain expression.
     Expressions in DL-LiteA are formed according to the following syntax:
    B −→ A | ∃Q | δ(U )                       Q −→ P | P −          V −→ U | ¬U
    C −→ B | ¬B | ∃Q.C | δF (U )              R −→ Q | ¬Q           E −→ ρ(U )
    F −→ T1 | · · · | Tn
where: P − denotes the inverse of P , ∃Q, or unqualified existential restriction denotes
the objects related to some object by the role Q, ¬ denotes negation, δ(U ) denotes
4        Marco Console, Valerio Santarelli, Domenico Fabio Savo

the domain of U , i.e., the set of objects that U relates to values, and ρ(U ) denotes
the range of U , i.e., the set of values related to objects by U . The concept ∃Q.C, or
qualified existential restriction, denotes the qualified domain of Q with respect to C,
i.e., the set of objects that Q relates to some instance of C. Similarly, δF (U ) denotes
the qualified domain of U with respect to a value-domain F , i.e., the set of objects that
U relates to some value in F . We refer to the qualified existential restriction expression
∃Q1 . . . . .∃Qn .C, where C is not a qualified existential restriction, as an existential role
chain of depth n.
     A DL-LiteA TBox assertion is an assertion of the form:
      BvC           QvR          EvF           U vV         (funct Q)        (funct U )
From left to right, the first four assertions denote inclusions between concepts, roles,
value-domains, and attributes, respectively. The last two assertions denote functionality
on roles and on attributes.
    A DL-LiteA TBox is a finite set of assertions of the form above, where suitable
limitations in the combination of such assertion are imposed. Given a TBox T and a role
P (resp. an attribute U ), we say that P (resp. U ) is primitive in T if P does not appear
in T positively in the right-hand side of any role (resp. attribute) inclusion assertion and
in any qualified existential restriction. In a DL-LiteA TBox, a role P (resp, attribute U )
that is not primitive in T cannot appear either directly nr inversely in a functionality
assertion.
    A DL-LiteA ABox A is a finite set of assertions of the form A(a), P (a, b), and
U (a, v), where A, P , and U are as above, a and b are object constants while v is a value
constant.


3   Approximation of DL ontologies
In this section, we present our notion of approximation of an ontology expressed in a
language L in a target language L0 , and then we provide a comparison between our
notion and others proposed in literature.
Ontology approximation. In what follows, O is a satisfiable L ontology, and L0 a
language not necessarily different from L.
    First of all, we need to introduce the notion of entailment set [12] of a satisfiable
ontology with respect to a language.
Definition 1. Let O be a satisfiable ontology expressed in a language L over a signa-
ture Σ, and let L’ be a language, not necessarily different from L. The entailment set
of O with respect to L0 , denoted as ES(O, L0 ), is the set which contains all L0 axioms
over Σ that are entailed by O.
    In other words, we say that an axiom α belongs to the entailment set of an ontology
O with respect to a language L0 , if α is an L0 axiom built over the alphabet of O and
for each interpretation I ∈ M od(O) we have that I |= α. Clearly, given an ontology
O and a language L0 , the entailment set of O with respect to L0 is unique.
    With the notion of entailment set in place, we can define the notion of approximation
in L0 of O0 .
                             Efficient approximation in DL-Lite of OWL 2 ontologies       5

Definition 2. Let O be an ontology over a signature Σ. A satisfiable L0 ontology O0
over Σ is an approximation in L0 of O if both the following statements hold:
 (i) ES(O0 , L0 ) ⊆ ES(O, L0 );
(ii) there is no satisfiable L0 ontology O00 such that ES(O0 , L0 ) ⊂ ES(O00 , L0 ) ⊆
     ES(O, L0 ).
     In other words, the above definition states that a satisfiable ontology O0 is an ap-
proximation in L0 of O , if O0 is an ontology in L0 , and there is no a satisfiable ontology
O00 in L0 whose entailment set in L0 is “nearer” to the entailment set of O in L0 than the
entailment set in L0 of O, where the distance here is measured in terms of set inclusion.
     Given an ontology O, it may be the case that the entailment set in a language L0
of O is infinite. If so, it may happen that the approximation in L0 , in accordance with
Definition 2, does not exist. For this reason, we follow the approach proposed in [3],
where safeness restrictions on the target language are imposed, in order to guarantee
the finiteness of the entailment set. For example, in DL-LiteA , which is the language
we focus on in the following sections, the infiniteness of the entailment set arises from
the possibility of inferring infinitely-long existential chains. In this case, the safeness
restriction requires to allow, in the TBox, only existential chains of bounded length.
Therefore, in what follows we denote versions of DL-LiteA in which such restriction is
                      (k)
enforced as DL-LiteA , where k is the bounded length of the existential chains.
     Another characteristic of DL-LiteA , shared with other languages such as EL++ [1],
is that syntactic restrictions are imposed on the manner in which assertions can be com-
bined. In this case, there may exist more than one ontology which is an approximation
in L0 of O. In what follows we denote with ApxM AX (O, L0 ) the set of L0 ontologies
which are approximations in L0 of O in accordance with Definition 2.
Example 1. Consider the OWL 2 ontology O = {A v ∃R.B, A v ∀R.B, (funct R)}.
It is easy to see that, due to the syntactic restriction in DL-LiteA , which imposes that a
non primitive role cannot be functional, we have that, up to logical equivalence, the set
                         (1)
ApxM AX (O, DL-LiteA ) = {{A v ∃R.B}, {A v ∃R, (funct R)}}.                              t
                                                                                         u

Comparison with related work. The problem of approximating ontologies for OBDA
applications has recently been faced in [12] and [3].
    In [12], the authors define the approximation in L0 of a satisfiable ontology O as the
set of L0 axioms coinciding with the entailment set of O with respect to L0 .
Definition 3. Let O be a satisfiable ontology expressed in a language L. The approxi-
mation in L0 of O is ApxES (O, L0 ) = ES(O, L0 ).
Therefore, in accordance with the above definition, the approximation in L0 of the on-
tology O is a set of L0 axioms, but not necessarily a valid ontology expressed in L0 .
     In [3], the authors provide a more sophisticated notion of approximation, in which
it is required that the approximation in L0 of O be an ontology expressed in L0 .
Definition 4. Let O be a satisfiable ontology expressed in a language L over a signa-
ture Σ. A satisfiable L0 ontology O0 over Σ is an approximation in L0 of O if both the
following statements hold:
6        Marco Console, Valerio Santarelli, Domenico Fabio Savo

 (i) ES(O0 , L0 ) ⊆ ES(O, L0 );
(ii) for each α ∈ ES(O, L0 ) such that O0 ∪{α} is an L0 ontology, then α ∈ ES(O0 , L0 ).

     In other words, an L0 ontology O0 is an approximation in L0 of O if among all the
possible maximal subsets of ES(O, L0 ) which are L0 ontologies, there is one which
is logically equivalent to O0 . It is not unexpected, as with our approach, that there
may exist more than one ontology which is an approximation in L0 of O. We denote
with ApxSC (O, L0 ) set of L0 ontologies which are all approximations in L0 of O in
accordance with Definition 4.
     Differently from Definition 3, Definition 4 guarantees that an approximation is al-
ways an ontology expressed in the target language L0 .
     Similarly to our notion of approximation, if the entailment set is infinite, it may hap-
pen that there is no ontology expressed in the target language which is an approxima-
tion. Hence, in order to guarantee the existence of an approximation, one must impose
safeness restrictions on the target language in this case as well, in order to guarantee the
finiteness of the entailment set.
     We now compare our approach to those in [3] and [12] by means of some examples.

Example 2. Consider the following OWL 2 ontology: O = {∃R− v B, A v
∃R.B, (funct R)}. In accordance with Definitions 3, 4, 2, we have:
                     (1)
ApxES (O, DL-LiteA ) = {∃R− v B, A v ∃R, A v ∃R.B, A v ∃R.∃R− ,
                        ∃R v ∃R.∃R− , ∃R− v ∃R− .∃R, (funct R) }
                     (1) 0
ApxSC (O, DL-LiteA ) = {Osc = {∃R− v B, A v ∃R, A v ∃R.B, A v ∃R.∃R− ,
                               ∃R v ∃R.∃R− , ∃R− v ∃R− .∃R },
                        Osc = {∃R− v B, A v ∃R, (funct R) } }
                         00

                       (1)
ApxM AX (O, DL-LiteA ) = {Omax = {∃R− v B, A v ∃R, (funct R) } }                           t
                                                                                           u
                                                           (1)
    Due to the syntactic restrictions enforced in DL-LiteA , ontology O of Example 2 is
              (1)                                                                (1)
not a DL-LiteA ontology. However, it is logically equivalent to the DL-LiteA ontology
{∃R− v B, A v ∃R, (funct R)}.
                                      (1)                                                 (1)
    Regarding ApxES (O, DL-LiteA ) we only observe that it is not a valid DL-LiteA
                                                                                         (1)
ontology. Up to logical equivalence, we can see that the set ApxSC (O, DL-LiteA )
                                   00
contains, along with ontology Osc      , which is logically equivalent to O, also the unex-
                   0                                0          (1)          00          (1)
pected ontology Osc , for which we have ES(Osc        , DL-LiteA ) ⊂ ES(Osc    , DL-LiteA ).
Finally, according to Definition 2, up to logical equivalence, the only approximation is
          (1)
a DL-LiteA ontology that is logically equivalent to O.
    Other significant differences between these three approaches arise when the goal is
to compute the approximation of an ontology O expressed in a language L in which
the UNA is not adopted into a target language L0 in which the UNA is adopted. In
the example below, we highlight the different behavior of the three approaches in this
circumstance.

Example 3. We recall that in OWL 2 the UNA is not adopted. This means, for instance,
that given the following two ontologies O = {A v {o1 }, B v {o2 }} and O0 = {A v
                             Efficient approximation in DL-Lite of OWL 2 ontologies   7

{o1 }, B v {o2 }, o1 6= o2 }, we have that M od(O) 6= M od(O0 ). In fact, while O does
not entail A v ¬B, O0 does. Differently, if we adopt the UNA in OWL 2, then we have
that M od(O) = M od(O0 ), and both entail the assertion A v ¬B.
    From the observations above, it follows that {A v {o1 }, B v {o2 }} ⊆
ES(O, OWLU N A ), and that A v ¬B 6∈ ES(O, OWLU N A ). In what follows let us
denote with OWLU N A the version of OWL 2 where the UNA is adopted.
    In accordance with Definitions 3, 4, and 2, we have that, up to logical equivalence,
the approximations in OWLU N A of O are:

    – ApxES (O, OWLU N A ) = ES(O, OWLU N A ). In this case, ApxES (O, OWLU N A )
      is a valid OWLU N A ontology. Let OES be such ontology. As shown above,
      since in OWLU N A the UNA is adopted, OES |= A v ¬B. This means that
      ES(OES , OWLU N A ) 6⊆ ES(O, OWLU N A ).
    – ApxSC (O, OWLU N A ) = ∅. Indeed, since ES(O, OWLU N A ) is a valid
      OWLU N A ontology, any OWLU N A ontology O1 such that ES(O1 , OWLU N A ) ⊂
      ES(O, OWLU N A ), does not satisfy condition (ii) of Definition 4. Moreover, since
      every OWLU N A ontology O2 that satisfies condition (ii) entails both {A v
      {o1 }, B v {o2 }}, then it also entails A v ¬B. Hence ES(O2 , OWLU N A ) 6⊆
      ES(O, OWLU N A ), and so condition (i) of Definition 4 is not satisfied.
                                           0                           00
    – ApxM AX (O, OWLU N A ) = {Omax             = {A v {o1 }}, Omax           = {B v
                                                        0          00
      {o2 }} }. Indeed, it is easy to verify that both Omax and Omax satisfy both con-
      ditions (i) and (ii) of Definition 2, and that there is no other ontology O000 in
                                                                0           00
      ApxM AX (O, OWLU N A ) logically equivalent to neither Omax     nor Omax  .     t
                                                                                      u

    As shown in the previous example, given an L ontology O and a target language L0 ,
our approach, as the one in [3], but differently from the one given in [12], guarantees
that an approximation in L0 of O is an L0 ontology, and that for each L0 ontology O0
that is an approximation in L0 of O, we have that ES(O0 , L0 ) ⊆ ES(O, L0 ). Finally,
it can be shown that our approach always preserves the same or more inferences than
those obtained by adopting the approach given in [3].

Theorem 1. For each Osc ∈ ApxSC (O, L0 ), there is an Omax ∈ ApxM AX (O, L0 )
such that ES(Osc , L0 ) ⊆ ES(Omax , L0 ). Furthermore, for each Omax ∈
ApxM AX (O, L0 ), there is no Osc ∈ ApxSC (O, L0 ) such that ES(Omax , L0 ) ⊂
ES(Osc , L0 ).


4     Approximation in DL-LiteA of OWL 2 ontologies

In this section, we study the problem of computing the approximation in DL-LiteA
of a satisfiable OWL 2 ontology O. More specifically, we aim to approximate
O with DL-LiteA TBox assertions. Therefore, in what follows we assume that
ApxM AX (O, DL-LiteA ) is a set of DL-LiteA TBoxes and that ES(O, DL-LiteA ) is a
set of DL-LiteA TBox assertions. To guarantee the finiteness of the entailment set, we
refer to versions of DL-LiteA allowing only for TBox assertions with existential chains
of bounded length k.
8       Marco Console, Valerio Santarelli, Domenico Fabio Savo

                           (k)
    Given a set of DL-LiteA assertions S, and a functionality assertion ϕ over a role R
(resp. attribute U ), we denote with clashes(ϕ, S) the set of all assertions involving R
                                                                                        (k)
(resp. U ) that, together with ϕ, violate the syntactic restriction imposed on DL-LiteA
TBoxes. Hence, clashes(ϕ, S) is a set of role (resp. attribute) inclusion assertions and
assertions with a qualified existential role (resp. attribute) on the right hand side.
    Let O be an OWL 2 ontology, and let F be the set containing all the function-
                                     (k)                                          (k)
ality assertions in ES(O, DL-LiteA ) for which clashes(ϕ, ES(O, DL-LiteA )) 6= ∅.
                                   (k)                          (k)
If F = 6 ∅, then ES(O, DL-LiteA ) is not a valid DL-LiteA TBox, and there are at
                                                   (k)
most 2|F | TBoxes Ti which are valid DL-LiteA TBoxes and minimally differ from
                  (k)                                     (k)                           (k)
ES(O, DL-LiteA ). Let M axSubES (ES(O, DL-LiteA )) be the set of such DL-LiteA
                                                                                (k)
TBoxes. One can compute these TBoxes in M axSubES (ES(O, DL-LiteA )) by re-
                                (k)
tracting, from ES(O, DL-LiteA ), either ϕ ∈ F or the assertions in clashes(ϕ, S), in
order to resolve the violations of the syntactic restriction.
                                                                                    (k)
    The lemma below guarantees that a TBox in M axSubES (ES(O, DL-LiteA )) is a
                                                                    (k)
candidate for being one of the TBoxes in ApxM AX (O, DL-LiteA ).
                                                                                (k)
Lemma 1. Let O be a satisfiable OWL 2 ontology and let T be a DL-LiteA TBox.
              (k)                (k)                                 (k)
ES(T , DL-LiteA ) ⊆ ES(O, DL-LiteA ) if and only if T ⊆ ES(O, DL-LiteA ).
    In other words, the above lemma guarantees that the first condition in Definition 2
                               (k)                                           (k)
is satisfied by every DL-LiteA TBox in M axSubES (ES(O, DL-LiteA )), and that
                                                          (k)
in computing all the TBoxes in ApxM AX (O, DL-LiteA ), one can consider only the
                              (k)
assertions in ES(O, DL-LiteA ). Moreover, from the monotonicity of the DLs, it di-
                                                                    (k)
rectly follows that for every TBox T in ApxM AX (O, DL-LiteA ) there is a TBox in
                              (k)
M axSubES (ES(O, DL-LiteA )) which is logically equivalent to T .
                                                                         (k)
    However, in order for a TBox Ti in M axSubES (ES(O, DL-LiteA )) to belong to
                       (k)
ApxM AX (O, DL-LiteA ), it must also satisfy the second condition of Definition 2,
                                           (k)                                (k)
and thus that there is no other DL-LiteA TBox T 0 ⊆ ES(T , DL-LiteA ) such that
                 (k)                     (k)                     (k)
ES(Ti , DL-LiteA ) ⊂ ES(T 0 , DL-LiteA ) ⊆ ES(O, DL-LiteA ).
                                                                (k)
    To explain why a TBox in M axSubES (ES(O, DL-LiteA )) does not necessarily
also satisfy the second condition in Definition 2, we refer to the ontology O = {∃R− v
                                                                                    (1)
B, A v ∃R.B, (funct R)} of Example 2. It is easy to see that ES(O, DL-LiteA ) =
{∃R v B, A v ∃R.B, (funct R), ∃R v ∃R.∃R , ∃R v ∃R.B, A v ∃R.∃R− ,
     −                                                  −
                                                                                  (1)
∃R− v ∃R− .∃R, A v ∃R}, and that clashes((funct R), ES(O, DL-LiteA )) =
{A v ∃R.B, ∃R v ∃R.∃R− , ∃R v ∃R.B, A v ∃R.∃R− , ∃R− v ∃R− .∃R}.
Hence, by following the procedure described above, we have the following two TBoxes
                                  (1)
in M axSubES (ES(O, DL-LiteA )): T1 = {∃R− v B, A v ∃R.B, ∃R v ∃R.∃R− ,
∃R v ∃R.B, A v ∃R.∃R , ∃R− v ∃R− .∃R, A v ∃R}, and T2 = {∃R− v B,
                             −
                                                                                   (1)
(funct R), A v ∃R}. However, it is clear that only T2 ∈ ApxM AX (O, DL-LiteA ). In
                                    (1)                    (1)
fact we have that ES(T1 , DL-LiteA ) ⊂ ES(T2 , DL-LiteA ).
    We provide the algorithm isApx which, given a TBox T and an ontology O, returns
                                     (k)
true if T ∈ ApxM AX (O, DL-LiteA ), false otherwise. The algorithm proceeds as fol-
                             Efficient approximation in DL-Lite of OWL 2 ontologies        9


  Algorithm 1: isApx(T , O)
                    (k)
    Input: a DL-LiteA TBox T , a satisfiable OWL 2 ontology O
    Output: true or false
    begin
                             (k)
        E ← ES(T , DL-LiteA );
                             (k)
        S ← ES(O, DL-LiteA ) \ E;
        foreach α ∈ S
                                    (k)
            if T ∪ {α} is a DL-LiteA TBox then return false;
        foreach functionality assertion φ ∈ E
            E ← E \ clashes(φ, E);
        foreach functionality assertion ϕ ∈ S
                                              (k)               (k)
            if ES(E \ clashes(ϕ, E), DL-LiteA ) = ES(T , DL-LiteA ) then return false;
    return true;
    end

                                                              (k)
lows. Given a satisfiable OWL 2 ontology O and a DL-LiteA TBox T , it first computes
                                    (k)
their entailment sets in DL-LiteA and then computes the set S containing all the asser-
                                               (k)
tions in the entailment set of O in DL-LiteA which are not in the entailment set of T in
         (k)
DL-LiteA . Then, for every assertion α in S, it attempts to add α to T without violat-
ing any syntactic restriction. If such assertion exists, then T is not an approximation in
         (k)
DL-LiteA of O. If not, the algorithm continues, fixing E as the entailment set of T in
         (k)
DL-LiteA , and resolving every violation of the syntactic restriction in E by removing
clashes(φ, E) from E, for every functionality assertion φ in E. This operation guaran-
                           (k)
tees that E is a DL-LiteA TBox. Then the algorithm checks, for every functionality
assertion ϕ in S, if the set obtained from E by removing clashes(ϕ, E) from E is logi-
                                                                                  (k)
cally equivalent to E. If this is the case, then it is possible to build a DL-LiteA TBox T 00
                                                                                      (k)
by adding ϕ to the set of assertions obtained in this fashion. T 00 is a DL-LiteA TBox
that proves that T does not satisfy the second condition of Definition 2. Otherwise, the
algorithm terminates by returning true.
    The theorem below establishes termination and correctness of Algorithm 1.
                                                                                  (k)
Theorem 2. Let O be a satisfiable OWL 2 ontology, and let T be a DL-LiteA TBox.
                                                                                (k)
isApx(T , O) terminates, and returns true if and only if T ∈ ApxM AX (O, DL-LiteA ).
    Given a satisfiable OWL 2 ontology O, Lemma 1 and Theorem 2 suggest Algo-
                                                                             (k)
rithm 2 for computing, up to logical equivalence, the set ApxM AX (O, DL-LiteA ).
    The following theorem establishes termination and correctness of Algorithm 2.

Theorem 3. Let O be a satisfiable OWL 2 ontology. Then computeApx(O) terminates
                                                            (k)
and computes, up to logical equivalence, ApxM AX (O, DL-LiteA ).

    As expected, Algorithm 2 does not return a single TBox, but instead a set of TBoxes.
For application purposes, the approximation that shall be used must be chosen from this
set. A pragmatic approach could be to choose one non-deterministically. Instead, one
could think to leave this choice to the end user, according to the use he intends to make
10       Marco Console, Valerio Santarelli, Domenico Fabio Savo


    Algorithm 2: computeApx(O)
     Input: a satisfiable OWL 2 ontology O
                              (k)
     Output: a set of DL-LiteA TBoxes
     begin
                                            (k)
         S ← M axSubES (ES(O, DL-LiteA ));
         foreach Ti ∈ S
             if isApx(Ti , O) = false then S ← S \ {Ti };
     return S;
     end

of it. A more interesting direction could be to achieve the identification of a unique
TBox by applying some preference criteria to the set returned by Algorithm 2.

5     Efficient entailment set computation in DL-Lite
In the previous sections, we have provided an algorithm for computing the set
                         (k)
ApxM AX (O, DL-LiteA ) of TBoxes for an OWL 2 ontology O. This computation is
                                                                             (k)
clearly intractable. Indeed, it requires to compute the set ES(O, DL-LiteA ), and more-
                                                           (k)
over, due to the syntactic restriction enforced in DL-LiteA , in the worst case, the cardi-
                                          (k)
nality of the set ApxM AX (O, DL-LiteA ) is exponential in the number of functionality
                               (k)                                                       (k)
assertions in ES(O, DL-LiteA ). In particular, the computation of ES(O, DL-LiteA )
is in general very costly, as highlighted also in [3] and [12], since it requires the in-
vocation of reasoning services over an OWL 2 ontology O. This task is performed by
invoking an OWL 2 oracle which can be implemented by an OWL 2 reasoner.
                                                       (k)
     A naive algorithm for computing ES(O, DL-LiteA ) is the one described in [12], in
                                                   (k)
which firstly one computes the set Γ of DL-LiteA TBox assertions which can be built
over the signature Σ, and then, for each assertion α ∈ Γ such that O |= α, one adds α
                     (k)
to ES(O, DL-LiteA ).
                                                                              (k)
     We now show how to optimize the computation of ES(O, DL-LiteA ) through a
technique which drastically reduces in practice the calls to the OWL 2 oracle.
                                              (k)
     In the computation of ES(O, DL-LiteA ), a large portion of the invocations of the
OWL 2 oracle involve assertions in which a general concept C∃R1 ...∃Rn involving a non
trivial existential role chain occurs. Empirical observation has brought to light the fact
that this kind of general concept very often does not subsume any concept in O. Hence,
all the invocations of the OWL 2 oracle involving these childless general concepts are
useless. Therefore, at the base of our strategy is the identification of all these childless
general concepts C∃R1 ...∃Rn , without invoking the OWL 2 oracle.
     We use the function subsumed(S1 , O), where S1 is a general concept (resp. general
role, general attribute) which returns the set of concepts (resp. roles, attributes) S2 such
that O |= S2 v S1 . This function is efficiently performed by the most commonly-used
OWL 2 reasoners, such as Pellet [15], Racer [8], FACT++ [16], and HermiT [7].
     Our technique calls, as the first step, for the computation of the classification of
basic concepts, roles, and attributes, which is encoded into a directed graph, in which the
nodes represent the predicates of the ontology, and the edges the inclusion assertions.
                              Efficient approximation in DL-Lite of OWL 2 ontologies           11

 Ontology   DL Fragment     #O.A.                   #N.A.            Total computation time (ms)
                     k=1     k=2     k=3    k=1     k=2        k=3    k=1       k=2        k=3
 Mouse       ALCI    6.059 12.611 19.163 11.018    40.362    157.738 8.426      9.955     12.173
 Pathway      EL    10.191 11.999 11.999 14.294    52.374    204.694 11.975    16.498     17.553
 Cognitive SHIF (D) 56.883 178.381 474.145 48.006 541.350 6.461.478 348.892 1.812.511 6.832.865
 Mammalian   AL+     7.551 7.551 7.551 112.898 413.922 1.618.018 322.527 350.853 350.853
 Spatial    ALEHI 51.065 82.735 150.195 47.143 4.541.815 445.019.671 27.827    52.742 132.807
                                                                                      (k)
      Table 1: Evaluation of the optimized algorithm for computing ES(O, DL-LiteA ).

     After this initial step, the remaining invocations, which we work to minimize, are
those needed for computing the entailed inclusion assertions involving general con-
cepts C∃R1 ...∃Rn , and the entailed disjointness. Regarding the former, we exploit the
graph encoding of concept, role, and attribute classification to invoke these subsump-
tion checks in a manner which follows the hierarchical order of these general concepts
C∃R1 ...∃Rn , in order to avoid those checks which can be skipped. Consider, for exam-
ple, an ontology O that entails the inclusions A1 v A2 and P1 v P2 , where A1 and
A2 are concepts and P1 and P2 are roles. Exploiting these inclusions we can deduce
the hierarchical structure involving general concepts that can be built on these four
predicates. For instance, we know that ∃P2 .A2 v ∃P2 , that ∃P2 .A1 v ∃P2 .A2 , that
∃P1 .A1 v ∃P2 .A1 , and so on. We begin by invoking the OWL 2 oracle, asking for
the children of the general concepts which are in the highest position in the hierarchy.
So, first we call subsumed(∃P2 , O). If subsumed(∃P2 , O) = ∅, we avoid invoking
the oracle asking for subsumed(∃P2 .A2 , O), and so on. Regarding the latter we fol-
low the same procedure, but beginning from the lowest positions in the hierarchy. The
algorithm concludes by asking the OWL 2 oracle for all functionalities entailed by O.
     In Table 1 we present a sample of the evaluation tests for this strategy, performed
through a Java-based implementation of this technique. The table reports the number of
invocations to the OWL 2 oracle performed with our optimization (#O.A.), and without
                                                                       (k)
(#N.A.), for computing the entailment set of the ontologies in DL-LiteA , with 1 ≤ k ≤
                                                                                    (k)
3. It also reports, for each ontology, the total computation time of ES(O, DL-LiteA ).

6   Conclusion
In this paper we address the problem of ontology approximation. We illustrate our ap-
proach to this problem, and provide a comparison with other approaches provided in
literature. We deeply investigate the approximation of OWL 2 ontologies with DL-Lite
TBoxes for OBDA purposes, and provide an algorithm for its computation. Finally,
we present a technique for the optimization of the core procedure of this computation,
whose success we have shown with empirical evaluation. As future work, we plan to
improve the performances in computing the approximation in DL-Lite of OWL 2 on-
tologies by adopting more sophisticated techniques. Moreover, we aim to study reason-
able solutions for addressing the problem of multiple approximations of an ontology. In
particular, for those settings in which the approximation is used in OBDA.
Acknowledgments. This research has been partially supported by the EU under FP7
project Optique (grant n. FP7-318338), and by the EU under FP7-ICT project ACSI
(grant no. 257593).
12       Marco Console, Valerio Santarelli, Domenico Fabio Savo

References
 1. Franz Baader, Sebastian Brandt, and Carsten Lutz. Pushing the EL envelope further. In Proc.
    of OWLED 2008 DC, 2008.
 2. Franz Baader, Diego Calvanese, Deborah McGuinness, Daniele Nardi, and Peter F. Patel-
    Schneider, editors. The Description Logic Handbook: Theory, Implementation and Applica-
    tions. Cambridge University Press, 2003.
 3. Elena Botoeva, Diego Calvanese, and Mariano Rodriguez-Muro. Expressive approximations
    in DL-Lite ontologies. Proc. of AIMSA 2010, pages 21–31, 2010.
 4. Diego Calvanese, Giuseppe De Giacomo, Domenico Lembo, Maurizio Lenzerini, Antonella
    Poggi, Mariano Rodriguez-Muro, Riccardo Rosati, Marco Ruzzi, and Domenico Fabio Savo.
    The Mastro system for ontology-based data access. Semantic Web J., 2(1):43–53, 2011.
 5. Diego Calvanese, Giuseppe De Giacomo, Domenico Lembo, Maurizio Lenzerini, and Ric-
    cardo Rosati. Tractable reasoning and efficient query answering in description logics: The
    DL-Lite family. J. of Automated Reasoning, 39(3):385–429, 2007.
 6. Balakrishnan Chandrasekaran, John R Josephson, and V Richard Benjamins. What are on-
    tologies, and why do we need them? Intelligent Systems and Their Applications, IEEE,
    14(1):20–26, 1999.
 7. Birte Glimm, Ian Horrocks, Boris Motik, Rob Shearer, and Giorgos Stoilos. A novel ap-
    proach to ontology classification. J. of Web Semantics, 10(1), 2011.
 8. Volker Haarslev and Ralf Möller. RACER system description. In Proc. of IJCAR 2001,
    volume 2083 of LNAI, pages 701–705. Springer, 2001.
 9. Pascal Hitzler, Markus Krötzsch, Bijan Parsia, Peter F. Patel-Schneider, and Sebas-
    tian Rudolph, editors.      OWL 2 Web Ontology Language: Primer.            W3C Recom-
    mendation, 11 december 2012.            Available at http://www.w3.org/TR/2012/
    REC-owl2-primer-20121211/.
10. Maurizio Lenzerini. Ontoloogy-based data management. In Proc. of CIKM 2011, pages 5–6,
    2011.
11. Boris Motik, Bijan Parsia, and Peter F. Patel-Schneider, editors. OWL 2 Web On-
    tology Language Structural Specification and Functional-Style Syntax.          W3C Rec-
    ommendation, 11 december 2012. Available at http://www.w3.org/TR/2012/
    REC-owl2-syntax-20121211/.
12. Jeff Z Pan and Edward Thomas. Approximating OWL-DL ontologies. In Proc. of AAAI 2007,
    page 1434, 2007.
13. Antonella Poggi, Domenico Lembo, Diego Calvanese, Giuseppe De Giacomo, Maurizio
    Lenzerini, and Riccardo Rosati. Linking data to ontologies. J. on Data Semantics, X:133–
    173, 2008.
14. Bart Selman and Henry Kautz. Knowledge compilation and theory approximation. J. of the
    ACM, 43(2):193–224, 1996.
15. Evren Sirin, Bijan Parsia, Bernardo Cuenca Grau, Aditya Kalyanpur, and Yarden Katz. Pel-
    let: a practical OWL-DL reasoner. Web Semantics: science, services and agents on the World
    Wide Web, 5(2):51–53, 2007.
16. Dmitry Tsarkov and Ian Horrocks. FaCT++ description logic reasoner: System description.
    In Proc. of IJCAR 2006, pages 292–297, 2006.
17. Tuvshintur Tserendorj, Sebastian Rudolph, Markus Krötzsch, and Pascal Hitzler. Approxi-
    mate OWL-reasoning with Screech. In Proc. of RR 2008, pages 165–180. Springer, 2008.
18. Holger Wache, Perry Groot, and Heiner Stuckenschmidt. Scalable instance retrieval for the
    semantic web by approximation. In Proc. of WISE 2005, pages 245–254. Springer, 2005.

</pre>