=Paper=
{{Paper
|id=None
|storemode=property
|title=A Graph-Based Approach for Classifying OWL 2 QL Ontologies
|pdfUrl=https://ceur-ws.org/Vol-1014/paper_21.pdf
|volume=Vol-1014
|dblpUrl=https://dblp.org/rec/conf/dlog/LemboSS13
}}
==A Graph-Based Approach for Classifying OWL 2 QL Ontologies==
<pdf width="1500px">https://ceur-ws.org/Vol-1014/paper_21.pdf</pdf>
<pre>
          A graph-based approach for classifying
                 OWL 2 QL ontologies*

          Domenico Lembo, Valerio Santarelli, and Domenico Fabio Savo

     Dipartimento di Ing. Informatica, Automatica e Gestionale “Antonio Ruberti”
                             Sapienza Università di Roma
                          Via Ariosto 25, I-00186 Roma, Italy
                     {lembo,santarelli,savo}@dis.uniroma1.it


        Abstract. Ontology classification is the reasoning service that com-
        putes all subsumption relationships inferred in an ontology between con-
        cept, role, and attribute names in the ontology signature. OWL 2 QL is a
        tractable profile of OWL 2 for which ontology classification is polynomial
        in the size of the ontology TBox. However, to date, no efficient methods
        and implementations specifically tailored to OWL 2 QL ontologies have
        been developed. In this paper, we provide a new algorithm for ontol-
        ogy classification in OWL 2 QL, which is based on the idea of encoding
        the ontology TBox into a directed graph and reducing core reasoning to
        computation of the transitive closure of the graph. We have implemented
        the algorithm in the QuOnto reasoner and extensively evaluated it over
        very large ontologies. Our experiments show that QuOnto outperforms
        various popular reasoners in classification of OWL 2 QL ontologies.


1     Introduction

Ontology classification is the problem of computing all subsumption relationships
inferred in an ontology between predicate names in the ontology signature, i.e.,
named concepts (a.k.a. classes), roles (a.k.a. object-properties), and attributes
(a.k.a. data-properties). It is considered a core service for ontology reasoning,
which can be exploited for various tasks, at both design-time and run-time,
ranging from ontology navigation and visualization to query answering.
    Devising efficient ontology classification methods and implementations is a
challenging issue, since classification is in general a costly operation. Most pop-
ular reasoners for Description Logic (DL) ontologies, i.e., OWL ontologies, such
as Pellet [23], Racer [11], FACT++ [24], and HermiT [9], offer highly optimized
classification services for expressive DLs. Various experimental studies show that
such reasoners have reached very good performances through the years. How-
ever, they are still not able to efficiently classify very large ontologies, such as
the full versions of GALEN [22] or of the FMA ontology [10].
    Whereas the above tools use algorithms based on model construction through
tableau (or hyper-tableau [9]), the CB reasoner [14] for the Horn-SHIQ DL is
a consequence-driven reasoner. The use of this technique allows CB to obtain
an impressive gain on very large ontologies, such as full GALEN. However, the
current implementation of the CB reasoner is rather specific to particular
fragments of Horn-SHIQ (and incomplete for the general case) [14]. For example,
it does not allow for classification of properties.

*   This paper is an extended abstract of [18].
    Other recently developed tools, such as Snorocket [17], ELK [15], and
JCEL [20], are specifically tailored to intensional reasoning over logics of the
EL family, and show excellent performance in classification of ontologies
specified in such languages, which are the logical underpinning of OWL 2 EL,
one of the tractable profiles of OWL 2 [21].
    Instead, to the best of our knowledge, ontology classification in the other
OWL 2 profiles has so far received little attention. In particular, classification
in OWL 2 RL has been investigated only in [16], whereas, to date, no techniques
have been developed that are specifically tailored to intensional reasoning in
OWL 2 QL, the “data oriented” profile of OWL 2, nor for any logic of the DL-Lite
family [7]¹, which constitutes the logical underpinning of OWL 2 QL. Our aim
is then to help fill this gap for OWL 2 QL, encouraged also by the
fact that this language, like all logics of the DL-Lite family, allows for tractable
intensional reasoning, and in particular for PTime ontology classification, as
immediately follows from the results in [7].
    In this paper, we thus provide a new method for ontology classification in
the OWL 2 QL profile. In our technique, we encode the ontology terminology
(TBox) into a graph, and compute the transitive closure of the graph to then
obtain the ontology classification. The analogy between simple inference rules
in DLs and graph reachability is indeed very natural: consider, for example, an
ontology containing the subsumptions A1 ⊑ A2 and A2 ⊑ A3, where A1, A2,
and A3 are class names in the ontology signature. We can then associate to this
ontology a graph having three nodes labeled with A1, A2, and A3, respectively,
an edge from A1 to A2 and an edge from A2 to A3. It is straightforward to see
that A3 is reachable from A1, and therefore an edge from A1 to A3 is contained in
the transitive closure of the graph. This corresponds to the inferred subsumption
A1 ⊑ A3. On the other hand, things soon become much more complicated when
complex (OWL) axioms come into play.
    In this respect, we will show that for an OWL 2 QL ontology it is possible
to easily construct a graph whose closure constitutes the major sub-task in
ontology classification, because it allows us to obtain all subsumptions inferred by
the “positive knowledge” specified by the TBox. We will show that the
computed classification misses only “trivial” subsumptions inferred by unsatisfiable
predicates, i.e., named classes (resp. properties) that have an empty
interpretation in every model of the ontology, and that are therefore subsumed
by every class (resp. property) in the ontology signature. We therefore provide
an algorithm that, exploiting the transitive closure of the graph, computes all
unsatisfiable predicates, thus allowing us to obtain a complete ontology
classification. We notice that the presence of unsatisfiable predicates in an ontology
is mainly due to errors in the design. However, it is not rare to find such
predicates, especially in very large ontologies or in ontologies that are still “under
construction”. In particular, we could find unsatisfiable concepts even in some
benchmark ontologies we used in our experiments (cf. Section 4). Of course,
already debugged ontologies might not present such predicates [13,12]. In this case,
one can avoid executing our algorithm for computing unsatisfiable predicates.

1
    Not to be confused with the set of DLs studied in [2], which form the DL-Lite_bool
    family.
    We have implemented our technique in a new module of QuOnto [1], the
reasoner at the base of the Mastro [6,8] system, and have carried out extensive
experimentation, focusing in particular on very large ontologies. We have
considered a number of well-known ontologies, often used as benchmarks for ontology
classification, and have suitably approximated in OWL 2 QL those that fall
outside this language.
    QuOnto showed better performance, in some cases corresponding to
enormous gains, with respect to tableau-based reasoners (in particular, Pellet,
FaCT++, and HermiT). We also obtained comparable or better results with
respect to the CB reasoner, for almost all ontologies considered, but, unlike
the CB reasoner, we were always able to compute a complete classification. We
finally compared QuOnto with ELK, one of the best-performing reasoners for
EL, for those approximated ontologies that turned out to be both in OWL 2 QL
and OWL 2 EL, obtaining similar performance in almost all cases.
    We conclude by noticing that, even though we refer here to OWL 2 QL, our
algorithms and implementations can be easily adapted to deal with all logics
of the DL-Lite family mentioned in [7], excluding those allowing for the use
of conjunction in the left-hand side of inclusion assertions or the use of n-ary
relations instead of binary roles.
    The rest of the paper is organized as follows. In Section 2, we provide some
preliminaries. In Section 3, we describe our technique for ontology classification
in OWL 2 QL. In Section 4, we describe our experimentation, and finally, in
Section 5, we conclude the paper.


2   Preliminaries

In this section, we present some basic notions on DL ontologies, the formal
underpinning of the OWL 2 language, and on OWL 2 QL. We also recall some
notions of graph theory needed later on.
Description Logic Ontologies. We consider a signature Σ, partitioned in two
disjoint signatures, namely, ΣP , containing symbols for predicates, i.e., atomic
concepts, atomic roles, atomic attributes, and value-domains, and ΣC , containing
symbols for individual (object and value) constants. Complex concept, role, and
attribute expressions are constructed starting from predicates of ΣP by applying
suitable constructs, which vary in different DL languages. Given a DL language
L, an L-TBox (or simply a TBox, when L is clear) over Σ contains universally
quantified first-order (FOL) assertions, i.e., axioms specifying general properties
of concepts, roles, and attributes. Again, different DLs allow for different axioms.
An L-ABox (or simply an ABox, when L is clear) is a set of assertions on
individual constants, which specify extensional knowledge. An L-ontology O is
constituted by both an L-TBox T and an L-ABox A, denoted as O = ⟨T, A⟩.
    The semantics of a DL ontology O is given in terms of FOL interpretations
(cf. [4]). We denote with Mod (O) the set of models of O, i.e., the set of FOL-
interpretations that satisfy all TBox axioms and ABox assertions in O, where
the definition of satisfaction depends on the DL language in which O is specified.
An ontology O is satisfiable if Mod(O) ≠ ∅. A FOL-sentence φ is entailed by an
ontology O, denoted O |= φ, if φ is satisfied by every model in Mod (O). All the
above notions naturally apply to a TBox T alone.
    Traditional intensional reasoning tasks with respect to a given TBox are
verification of subsumption and satisfiability of concepts, roles, and attributes [4].
More precisely, a concept C1 is subsumed in T by a concept C2, written
T |= C1 ⊑ C2, if, in every model I of T, the interpretation of C1, denoted C1^I, is
contained in the interpretation of C2, denoted C2^I, i.e., C1^I ⊆ C2^I for every
I ∈ Mod(T). Furthermore, a concept C in T is unsatisfiable, which we write as
T |= C ⊑ ¬C, if the interpretation of C is empty in every model of T, i.e., C^I = ∅
for every I ∈ Mod(T). Analogous definitions hold for roles and attributes.
    Strictly related to the previous reasoning tasks is the classification inference
service, which we focus on in this paper. Given a signature ΣP and a TBox
T over ΣP, such a service allows one to determine subsumption relationships in T
between concepts, roles, and attributes in ΣP. Therefore, classification allows
us to structure the terminology of T in the form of a subsumption hierarchy that
provides useful information on the connection between different terms, and can
be used to speed up other inference services. Here we define it more formally.
Definition 1. Let T be a satisfiable L-TBox over ΣP. We define the
T-classification of ΣP (or simply T-classification when ΣP is clear from the
context) as the set of inclusion assertions defined as follows:
    Let S1 and S2 be either two concepts, two roles, or two attributes in ΣP. If
    T |= S1 ⊑ S2 then S1 ⊑ S2 belongs to the T-classification of ΣP.

The OWL 2 QL Language. The OWL 2 QL language is based on DL-LiteR,
a DL of the DL-Lite family [7]. Unlike DL-LiteR, however, besides object
properties (i.e., roles), OWL 2 QL also allows for the use of data properties
(i.e., attributes), as well as some further constructs, such as (ir-)reflexivity on
properties. For the sake of presentation, we prefer not to consider attributes
or (ir-)reflexivity constraints here. This choice does not actually correspond to a
real simplification, since in the algorithms proposed in this paper we can treat
both attributes and roles essentially in the same way, and our techniques can
be applied to full OWL 2 QL ontologies with minimal adaptations. Therefore,
in the following, we provide a simplified, German style, syntax for OWL 2 QL,
which actually corresponds to that of DL-LiteR, whereas we refer the reader to [21]
for the complete, OWL functional-style syntax of this language².
2
    Notice that (a)symmetric roles allowed in OWL 2 QL, even though not explicitly
    mentioned, can be easily expressed in the syntax that we consider.
    Expressions in OWL 2 QL are formed according to the following syntax:

               B −→ A | ∃Q                      Q −→ P | P −
               C −→ B | ¬B | ∃Q.A               R −→ Q | ¬Q
where: A and P are symbols in ΣP denoting respectively an atomic concept
and an atomic role; P − denotes the inverse of P ; ∃Q, also called unqualified
existential role, denotes the set of objects related to some object by the role Q;
the concept ∃Q.A, or qualified existential role, denotes the qualified domain of
Q with respect to A, i.e., the set of objects that Q relates to some instance of
A. In the following, we call B a basic concept, and Q a basic role.
    An OWL 2 QL TBox T is a finite set of axioms of the form B ⊑ C and
Q ⊑ R, where the former denote subsumptions between concepts, and the latter
subsumptions between roles. We call positive inclusions axioms of the form
B1 ⊑ B2, B1 ⊑ ∃Q.A, and Q1 ⊑ Q2, and negative inclusions axioms of the form
B1 ⊑ ¬B2 and Q1 ⊑ ¬Q2.
    The semantics of OWL 2 QL ontologies and TBoxes is given in the standard
way [21,4].
    As for OWL 2 QL ABoxes, we do not present them here, since we concentrate
on intensional reasoning, and refer the interested reader to [21].
Graph Theory Notions. In this paper we use the term digraph to refer to a
directed graph. We assume that a digraph G is a pair (N , E), where N is a set of
elements called nodes, and E is a set of ordered pairs (s, t) of nodes in N , called
arcs, where s is denoted the source of the arc, and t the target of the arc.
    The transitive closure G ∗ = (N , E ∗ ) of a digraph G = (N , E) is a digraph
such that there is an arc in E ∗ having a node s as source and a node t as target
if and only if there is a path from s to t in G [5]. Let G = (N , E) be a digraph,
and let n be a node in N . We denote with predecessors(n, G) the set of nodes pn
in N such that there exists in E an arc (pn , n).
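
    As a minimal illustration of these two notions (a sketch, not the QuOnto
implementation), the Python fragment below computes the closure by a
breadth-first search from every node and reads predecessors off the resulting
arc set. The function names and the set-of-pairs encoding are assumptions made
for this example. The sketch also assumes that paths of length zero count, so
that every node reaches itself, in line with the remark in Section 3 that E∗
contains an arc (n, n) for each node n.

```python
from collections import deque


def transitive_closure(nodes, arcs):
    """Return all pairs (s, t) such that t is reachable from s.

    Assumption: zero-length paths count, so (n, n) is included for every n.
    """
    successors = {n: set() for n in nodes}
    for s, t in arcs:
        successors[s].add(t)
    closure = set()
    for start in nodes:
        seen = {start}
        queue = deque([start])
        while queue:  # breadth-first search from `start`
            current = queue.popleft()
            for nxt in successors[current]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        closure.update((start, t) for t in seen)
    return closure


def predecessors(n, arcs):
    """Nodes p such that the given arc set contains the arc (p, n)."""
    return {s for (s, t) in arcs if t == n}


# The three-class example from the Introduction: A1 -> A2 -> A3.
nodes = {"A1", "A2", "A3"}
arcs = {("A1", "A2"), ("A2", "A3")}
closed = transitive_closure(nodes, arcs)
```

Here `predecessors("A2", arcs)` looks only at direct arcs, whereas
`predecessors("A3", closed)` returns every node from which A3 is reachable.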


3    T -classification in OWL 2 QL

In this section we describe our approach to computing, given a signature ΣP
and an OWL 2 QL TBox T over ΣP , the T -classification of ΣP .
    In OWL 2 QL, a subsumption relation between two concepts or roles in ΣP ,
can be inferred by a TBox T if and only if (i) T contains such subsumption;
(ii) T contains a set of positive inclusion assertions that together entail the
subsumption; or (iii), trivially, the subsumed concept or role is unsatisfiable in
T . The above observation is formalized as follows.

Theorem 1. Let T be an OWL 2 QL TBox containing only positive inclusions,
and let S1 and S2 be two atomic concepts or two atomic roles. S1 ⊑ S2 is entailed
by T if and only if at least one of the following conditions holds:

1. a set P of positive inclusions exists in T, such that P |= S1 ⊑ S2;
2. T |= S1 ⊑ ¬S1.
    Given an OWL 2 QL TBox T over a signature ΣP, we use ΦT and ΩT to
denote two sets of positive inclusions of the form S1 ⊑ S2, with S1, S2 ∈ ΣP,
such that ΦT contains only positive inclusions for which statement 1 holds, and
ΩT contains only positive inclusions for which statement 2 holds. It is easy to
see that ΦT and ΩT are in general not disjoint. From Definition 1 and Theorem 1
it follows that the T-classification coincides with the union of the sets ΦT and ΩT.
    In the following, we describe our approach to the computation of the T -
classification by firstly computing the set ΦT , and then computing the set ΩT .
Computation of ΦT. Given an OWL 2 QL TBox T, in order to compute ΦT,
we encode the set of positive inclusions in T into a digraph GT and compute
the transitive closure of GT in such a way that each subsumption S1 ⊑ S2 in
ΦT corresponds to an arc (S1, S2) in such transitive closure, and vice versa. The
following constructive definition describes how to build the digraph
representation of the TBox that suits our aims.

Definition 2. Let T be an OWL 2 QL TBox over a signature ΣP . We call the
digraph representation of T the digraph GT = (N , E) built as follows:

1. for each atomic concept A in ΣP , N contains the node A;
2. for each atomic role P in ΣP , N contains the nodes P , P − , ∃P , ∃P − ;
3. for each concept inclusion B1 ⊑ B2 ∈ T, E contains the arc (B1, B2);
4. for each role inclusion Q1 ⊑ Q2 ∈ T, E contains the arcs (Q1, Q2),
   (Q1−, Q2−), (∃Q1, ∃Q2), and (∃Q1−, ∃Q2−);
5. for each concept inclusion B1 ⊑ ∃Q.A ∈ T, E contains the arc (B1, ∃Q).
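
    The five rules of Definition 2 can be sketched in Python as follows. This
is only an illustration under an encoding chosen for this example, not the
encoding used in QuOnto: atomic concepts and roles are strings, an inverse role
is written with a trailing "-", an unqualified existential is the tuple
("exists", Q), and each axiom is a tagged tuple. The names `inv`, `ex`, and
`build_digraph` are hypothetical.

```python
def inv(q):
    """Inverse of a basic role: P <-> P- (encoded with a trailing '-')."""
    return q[:-1] if q.endswith("-") else q + "-"


def ex(q):
    """Unqualified existential over the basic role q."""
    return ("exists", q)


def build_digraph(concepts, roles, axioms):
    """Build (nodes, arcs) following rules 1-5 of Definition 2.

    axioms is an iterable of tagged tuples (an assumed encoding):
      ("concept", B1, B2)       for B1 [= B2
      ("role", Q1, Q2)          for Q1 [= Q2
      ("qualified", B1, Q, A)   for B1 [= exists Q.A
    Negative inclusions play no role in this construction.
    """
    nodes = set(concepts)                                   # rule 1
    for p in roles:                                         # rule 2
        nodes |= {p, inv(p), ex(p), ex(inv(p))}
    arcs = set()
    for axiom in axioms:
        kind = axiom[0]
        if kind == "concept":                               # rule 3
            _, b1, b2 = axiom
            arcs.add((b1, b2))
        elif kind == "role":                                # rule 4
            _, q1, q2 = axiom
            arcs |= {
                (q1, q2),
                (inv(q1), inv(q2)),
                (ex(q1), ex(q2)),
                (ex(inv(q1)), ex(inv(q2))),
            }
        elif kind == "qualified":                           # rule 5
            _, b1, q, _a = axiom
            arcs.add((b1, ex(q)))
    return nodes, arcs


# Example TBox: A [= B,  P [= R-,  B [= exists P.A
nodes, arcs = build_digraph(
    concepts={"A", "B"},
    roles={"P", "R"},
    axioms=[
        ("concept", "A", "B"),
        ("role", "P", "R-"),
        ("qualified", "B", "P", "A"),
    ],
)
```

Note how the single role inclusion P ⊑ R− contributes four arcs, including
(P−, R), since the inverse of R− is R.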

    The idea is that each node in the digraph GT represents a basic concept
or a basic role, and each arc models a positive inclusion, i.e., a subsumption,
contained in T , where the source node of the arc represents the left-hand side of
the subsumption and the target node of the arc represents the right-hand side
of the subsumption. Observe that for each role inclusion assertion P1 ⊑ P2 in
the TBox T, we also represent as nodes and arcs in the digraph GT the entailed
positive inclusions P1− ⊑ P2−, ∃P1 ⊑ ∃P2, and ∃P1− ⊑ ∃P2−.
    Let T be an OWL 2 QL TBox and let GT = (N , E) be its digraph represen-
tation. We denote with GT∗ = (N , E ∗ ) the transitive closure of GT . Note that by
definition of digraph transitive closure, for each node n ∈ N there exists in E ∗
an arc (n, n). Moreover, in what follows, we denote with α(E ∗ ) the set of arcs
(S1 , S2 ) ∈ E ∗ such that both terms S1 and S2 denote in T either two atomic
concepts or two atomic roles. Then, the following property holds.

Theorem 2. Let T be an OWL 2 QL TBox and let GT = (N, E) be its digraph
representation. Let S1 and S2 be two atomic concepts or two atomic roles. An
inclusion assertion S1 ⊑ S2 belongs to ΦT if and only if there exists in α(E∗) an
arc (S1, S2).

   We can then easily construct an algorithm, called ComputeΦ, that, taking as
input an OWL 2 QL TBox T, first builds the digraph GT = (N, E) according
to Definition 2, then computes its transitive closure, and finally returns the set
ΦT, which contains an inclusion assertion S1 ⊑ S2 for each arc (S1, S2) ∈ α(E∗).

Algorithm: computeUnsat
Input: an OWL 2 QL TBox T
Output: a set of concept and role expressions
Emp ← ∅;
foreach negative inclusion S1 ⊑ ¬S2 ∈ T do                           /* step 1 */
   foreach n1 ∈ predecessors(S1, GT∗) do
      foreach n2 ∈ predecessors(S2, GT∗) do
          if n1 = n2
          then Emp ← Emp ∪ {n1};
          if (n1 = ∃Q− and n2 = A) or (n2 = ∃Q− and n1 = A)
          then Emp ← Emp ∪ {∃Q.A};
Emp′ ← ∅;
while Emp ≠ Emp′ do                                                  /* step 2 */
   Emp′ ← Emp;
   foreach S ∈ Emp′ do
      foreach n ∈ predecessors(S, GT∗) do
          Emp ← Emp ∪ {n};
          if n = P or n = P− or n = ∃P or n = ∃P−
          then Emp ← Emp ∪ {P, P−, ∃P, ∃P−};
          if there exists B ⊑ ∃Q.n ∈ T
          then Emp ← Emp ∪ {∃Q.n};
return Emp.
                      Fig. 1: The algorithm computeUnsat(T)
    According to Theorem 2, ComputeΦ is sound and complete with respect to
the problem of computing ΦT for any OWL 2 QL TBox T containing only
positive inclusions.
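
    The last two stages of ComputeΦ (closing the digraph and keeping only the
arcs in α(E∗)) can be sketched in Python as below. This is a hedged
illustration rather than the QuOnto code: the digraph is assumed to be given
already as sets of nodes and arcs, existential nodes are encoded as tuples
("exists", Q), and the names `reflexive_transitive_closure` and `compute_phi`
are hypothetical.

```python
from collections import deque


def reflexive_transitive_closure(nodes, arcs):
    """All pairs (s, t) with t reachable from s, including (n, n)."""
    successors = {n: set() for n in nodes}
    for s, t in arcs:
        successors[s].add(t)
    closure = set()
    for start in nodes:
        seen = {start}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for nxt in successors[current]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        closure.update((start, t) for t in seen)
    return closure


def compute_phi(nodes, arcs, concepts, roles):
    """Keep the closure arcs joining two atomic concepts or two atomic roles."""
    closure = reflexive_transitive_closure(nodes, arcs)
    return {
        (s, t)
        for (s, t) in closure
        if (s in concepts and t in concepts) or (s in roles and t in roles)
    }


# Digraph of the TBox  A [= exists P,  P [= R,  exists R [= B
concepts = {"A", "B"}
roles = {"P", "R"}
nodes = concepts | {
    "P", "P-", "R", "R-",
    ("exists", "P"), ("exists", "P-"), ("exists", "R"), ("exists", "R-"),
}
arcs = {
    ("A", ("exists", "P")),
    ("P", "R"),
    ("P-", "R-"),
    (("exists", "P"), ("exists", "R")),
    (("exists", "P-"), ("exists", "R-")),
    (("exists", "R"), "B"),
}
phi = compute_phi(nodes, arcs, concepts, roles)
```

In this example the subsumption A ⊑ B is not stated in the TBox; it appears in
`phi` only because B is reachable from A through the nodes ∃P and ∃R.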
Computation of ΩT. We first observe that, according to Definition 2, no node
corresponding to a qualified existential role is created in the TBox digraph
representation. This kind of node is indeed not useful for computing ΦT. By contrast,
if one aims to identify every cause of unsatisfiability, the creation of nodes
corresponding to qualified existential roles is needed. This is due to the fact that
a TBox may entail that a qualified existential role ∃P.A is unsatisfiable, even
when ∃P is satisfiable. Specifically, this may occur in two cases: (i)
if the TBox T entails the assertion ∃P− ⊑ ¬A, and (ii) if the TBox T entails
A ⊑ ¬A. Clearly, in both cases the concept ∃P.A is unsatisfiable. We therefore
modify here Definition 2 by substituting Rule 5 with the following one:

5∗. for each concept inclusion B1 ⊑ ∃Q.A ∈ T, N contains the node ∃Q.A, and
    E contains the arcs (B1, ∃Q.A) and (∃Q.A, ∃Q);

    From now on, we adopt the digraph representation built according to Defini-
tion 2, where rule 5∗ replaces rule 5. Given one such TBox T over a signature ΣP ,
the algorithm computeUnsat given in Figure 1 returns all unsatisfiable concepts
and roles in ΣP , by exploiting the transitive closure of the digraph representation
of T .
    Before describing the algorithm, we recall that, given a digraph G = (N, E)
and a node n ∈ N, the set predecessors(n, G∗) contains all those nodes n′ in N
such that G∗ contains the arc (n′, n), which means that there exists a path from n′
to n in G. Also, it can be shown that GT∗ in fact allows us to obtain all subsumptions
between satisfiable basic concepts or roles, in the sense that the TBox T infers
one such subsumption S1 ⊑ S2 if and only if there is an arc (S1, S2) in E∗. Then,
the two steps that compose the algorithm proceed as follows:
Step 1 Let S be either a concept expression or a role expression. We have
   that for each S^i ∈ predecessors(S, GT∗) the TBox T entails S^i ⊑ S.
   Hence, given a negative inclusion assertion S1 ⊑ ¬S2, for each
   S1^i ∈ predecessors(S1, GT∗) and for each S2^j ∈ predecessors(S2, GT∗),
   T |= S1^i ⊑ ¬S2^j. Therefore, for each negative inclusion S1 ⊑ ¬S2 ∈ T, the
   algorithm computes the sets predecessors(S1, GT∗) and predecessors(S2, GT∗) and
   is able to: (i) recognize as unsatisfiable all those concepts and roles
   whose corresponding nodes occur in both the sets predecessors(S1, GT∗) and
   predecessors(S2, GT∗), and (ii) identify those unsatisfiable qualified
   existential roles ∃Q.A whose inverse existential role node ∃Q− occurs in
   predecessors(S1, GT∗) (resp. predecessors(S2, GT∗)) and whose concept node A
   occurs in predecessors(S2, GT∗) (resp. predecessors(S1, GT∗)), which indeed
   implies ∃Q− ⊑ ¬A and therefore unsatisfiability of ∃Q.A.
Step 2 Further unsatisfiable concepts and roles are identified by the algorithm
   through a cycle in which: (i) if a concept or role S is in Emp, then all the ex-
   pressions corresponding to the nodes in predecessors(S, GT∗ ) are in Emp. This
   captures propagation of unsatisfiability through chains of positive inclusions;
   (ii) if at least one of the expressions P, P − , ∃P, ∃P − is in Emp, then all four
   expressions are in Emp; (iii) for each expression ∃Q.A in N , if A ∈ Emp,
   then ∃Q.A ∈ Emp. We notice that the algorithm stops cycling when no new
   expressions of the form ∃Q or ∃Q.A are added to Emp (indeed, in this case
   only a single further iteration may be needed).
   It is easy to see that, since the set N of nodes of the
digraph representation of the TBox T is finite, computeUnsat(T) terminates,
and that the number of executions of the while cycle is less than or equal to |N|.
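
    The two steps of Figure 1 can be transcribed into Python roughly as
follows. This is a sketch under assumptions, not the QuOnto implementation:
nodes use an illustrative encoding (strings for atomic concepts and roles, a
trailing "-" for inverses, ("exists", Q) and ("exists", Q, A) for unqualified
and qualified existentials), the digraph is assumed to be built with rule 5∗,
and, as a simplification of this sketch, only qualified existentials that
actually occur in T are ever added to Emp. All function and parameter names
are hypothetical.

```python
from collections import deque


def reflexive_transitive_closure(nodes, arcs):
    """All pairs (s, t) with t reachable from s, including (n, n)."""
    successors = {n: set() for n in nodes}
    for s, t in arcs:
        successors[s].add(t)
    closure = set()
    for start in nodes:
        seen, queue = {start}, deque([start])
        while queue:
            for nxt in successors[queue.popleft()]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        closure.update((start, t) for t in seen)
    return closure


def inv(q):
    return q[:-1] if q.endswith("-") else q + "-"


def atomic_role_of(n, roles):
    """Return P if n is one of P, P-, exists P, exists P-; otherwise None."""
    if isinstance(n, tuple):
        if len(n) != 2:
            return None  # qualified existential
        n = n[1]
    base = n[:-1] if n.endswith("-") else n
    return base if base in roles else None


def compute_unsat(closure, roles, negative_inclusions, qualified):
    """qualified: set of (Q, A) such that some B [= exists Q.A is in T."""
    def preds(n):
        return {s for (s, t) in closure if t == n}

    emp = set()
    for s1, s2 in negative_inclusions:                      # step 1
        p1, p2 = preds(s1), preds(s2)
        emp |= p1 & p2
        for q, a in qualified:
            e = ("exists", inv(q))
            if (e in p1 and a in p2) or (e in p2 and a in p1):
                emp.add(("exists", q, a))
    previous = None
    while previous != emp:                                  # step 2
        previous = set(emp)
        for s in previous:
            for n in preds(s):
                emp.add(n)
                p = atomic_role_of(n, roles)
                if p is not None:
                    emp |= {p, p + "-", ("exists", p), ("exists", p + "-")}
                for q, a in qualified:
                    if a == n:
                        emp.add(("exists", q, a))
    return emp


# TBox: A [= B, A [= C, B [= not C, D [= exists P.A
nodes = {"A", "B", "C", "D", "P", "P-",
         ("exists", "P"), ("exists", "P-"), ("exists", "P", "A")}
arcs = {("A", "B"), ("A", "C"),
        ("D", ("exists", "P", "A")),
        (("exists", "P", "A"), ("exists", "P"))}
closure = reflexive_transitive_closure(nodes, arcs)
unsat = compute_unsat(closure, {"P"}, [("B", "C")], {("P", "A")})
```

In the example, step 1 finds A (a common predecessor of B and C), and step 2
then adds ∃P.A because its filler A is empty, and D because it is a predecessor
of ∃P.A. The role P and the concept ∃P remain satisfiable.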
   The following theorem shows that algorithm computeUnsat can be used for
computing the set containing all the unsatisfiable concepts and roles in T .
Theorem 3. Let T be an OWL 2 QL TBox and let S be either an atomic concept
or an atomic role in ΣP. T |= S ⊑ ¬S if and only if S ∈ computeUnsat(T).
   We call ComputeΩ the algorithm that, taken T as input, returns ΩT by
making use of computeUnsat.
   The following theorem, which is a direct consequence of Theorem 2 and of
Theorem 3, states that our technique is sound and complete with respect to the
problem of classifying an OWL 2 QL TBox.
Theorem 4. Let T be an OWL 2 QL TBox and let S1 and S2 be two
atomic predicates. T |= S1 ⊑ S2 if and only if S1 ⊑ S2 ∈ ComputeΦ(T) ∪
ComputeΩ(T).
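
    To illustrate how the two sets are combined, the following Python sketch
derives ΩT from a set of unsatisfiable expressions and unites it with ΦT. It
assumes that ΦT and the output of computeUnsat are already available as Python
sets; the names `compute_omega` and `classify` and the data encoding are
hypothetical and chosen only for this example.

```python
def compute_omega(unsat, concepts, roles):
    """Trivial inclusions S1 [= S2 for every unsatisfiable atomic S1.

    An unsatisfiable atomic concept is subsumed by every atomic concept,
    and an unsatisfiable atomic role by every atomic role.
    """
    omega = set()
    for s1 in unsat:
        if s1 in concepts:
            omega |= {(s1, s2) for s2 in concepts}
        elif s1 in roles:
            omega |= {(s1, s2) for s2 in roles}
        # non-atomic expressions (e.g. existentials) are skipped
    return omega


def classify(phi, unsat, concepts, roles):
    """T-classification as the union of Phi_T and Omega_T."""
    return phi | compute_omega(unsat, concepts, roles)


concepts = {"A", "B", "C", "D"}
roles = {"P"}
# Phi_T for the TBox  A [= B, A [= C  (reflexive pairs included)
phi = {("A", "A"), ("B", "B"), ("C", "C"), ("D", "D"), ("P", "P"),
       ("A", "B"), ("A", "C")}
# Assumed output of computeUnsat for a TBox in which A and D are unsatisfiable
unsat = {"A", "D", ("exists", "P", "A")}
classification = classify(phi, unsat, concepts, roles)
```

The inclusion D ⊑ B, for instance, is contributed only by ΩT, while A ⊑ B
belongs to both sets.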
    Ontology         Concepts   Roles   Attributes   Original DL   Original   OWL 2 QL   Negative
                                                     fragment      axioms     axioms     inclusions
    Mouse                2753       1            0   ALE               3463       3463            0
    Transportation        445      89            4   ALCH(D)            931        931          317
    DOLCE                 209     313            4   SHOIN(D)          1736       1991           45
    AEO                   760      47           16   SHIN(D)           3449       3432         1957
    Gene                26225       4            0   SH               42655      42655            3
    EL-Galen            23136     950            0   ELH              46457      48026            0
    Galen               23141     950            0   ALEHIF+          47407      49926            0
    FMA 1.4              6488     165            0   ALCOIF           18612      18663            0
    FMA 2.0             41648     148           20   ALCOIF(D)       123610     118181            0
    FMA 3.2.1           84454     132           67   ALCOIF(D)        88204      84987            0
    FMA-OBO             75139       2            0   ALE             119558     119558            0
    Table 1: The Original axioms and OWL 2 QL axioms columns indicate,
    respectively, the total number of axioms in the original version of the ontology
    and in the OWL 2 QL-approximated version. The Negative inclusions column
    reports the number of negative inclusions in the OWL 2 QL-approximated version.

4    Implementation and Evaluation
By exploiting the results presented in Section 3, we have developed a Java-based
OWL 2 QL classification module for the QuOnto reasoner [1,6,8].
    This module computes the classification of an OWL 2 QL TBox T by
adopting the technique described in Section 3. In this implementation the transitive
closure of the digraph GT is computed by means of a breadth-first search through
GT. In the implementation we have considered all aspects of OWL 2 QL which
were ignored in the theoretical discussion presented in the previous sections
(see Section 2).
    We have performed comparative experiments, where QuOnto was tested
against several popular ontology reasoners. Specifically, in our tests we
compared QuOnto with the FaCT++ [24], HermiT [9], and Pellet [23] OWL reasoners,
with the CB [14] Horn-SHIQ reasoner, and with the ELK [15] reasoner for
those ontologies that are also in OWL 2 EL.
    The ontology suite used during testing includes twenty OWL ontologies,
assembled from the TONES Ontology Repository³ and from other independent
sources. The six reasoners exhibited negligible differences in performance for the
majority of the smaller tested ontologies, so we will only discuss the ontologies
which offered interesting results, meaning those on which reasoning times are
significantly different for at least a subset of the reasoners.
    These ontologies include: the Mouse ontology; the Transportation
ontology⁴; the Descriptive Ontology for Linguistic and Cognitive Engineering
(DOLCE) [19]; the Athletic Events Ontology (AEO)⁵; the Gene Ontology
(GO) [3]; two versions of the GALEN ontology [22]; and four versions of the
Foundational Model of Anatomy Ontology (FMA) [10].
    Because QuOnto is an OWL 2 QL reasoner, each benchmark ontology not
in OWL 2 QL was preprocessed prior to classification in order to fit OWL 2
QL expressivity. Therefore, every OWL expression which cannot be expressed
by OWL 2 QL axioms was approximated from the ontology specifications. This
approximation follows this procedure: each axiom in the ontology is fed to an
external reasoner, specifically HermiT, and every OWL 2 QL-compliant axiom
that is implied by that axiom, over the ontology symbols that appear in
it, is added to the OWL 2 QL-approximated ontology. For instance, the OWL
assertion EquivalentClasses(ObjectUnionOf(:Male :Female) :Person) is
approximated by the two assertions SubClassOf(:Male :Person) and
SubClassOf(:Female :Person). Note that, as is the case in this example, the
OWL 2 QL-approximated ontology may contain a greater number of axioms than the
original ontology. Table 1 shows that the Mouse, Transportation, Gene, and
FMA-OBO ontologies are in OWL 2 QL, and thus do not need approximation, while
AEO and FMA 1.4 are subject to minimal changes by the approximation procedure.

3
  http://owl.cs.manchester.ac.uk/repository/
4
  http://www.daml.org/ontologies/409
5
  http://www.boemie.org/deliverable d 3 5

Ontology         QuOnto   FaCT++          HermiT          Pellet    CB      ELK
Mouse             0.156     0.282           0.296           0.179   0.159   0.246
Transportation    0.150     0.045           0.163           0.151   0.195   0.343
DOLCE             1.327     0.245          25.619           1.696   1.358   -
AEO               0.650     0.743           0.920           0.647   0.605   -
Gene              1.255     1.400           3.810           2.803   1.918   1.419
EL-Galen          2.788   109.835           7.966          50.770   2.446   1.205
Galen             4.600   145.485          34.608         timeout   2.505   -
FMA 1.4           0.688   timeout          93.781         timeout   1.243   -
FMA 2.0           4.111   out of memory    out of memory  timeout   7.142   -
FMA 3.2.1         4.146     4.576          11.518          24.117   4.976   -
FMA-OBO           4.827   timeout          50.842          16.852   7.433   4.078
Table 2: Classification times of benchmark OWL 2 QL ontologies by QuOnto and
other tested reasoners.
     During the tests for each reasoner, classification was performed on the OWL
2 QL-compliant versions of the ontologies resulting from the above described
preprocessing. Metrics about the ontologies are reported in Table 1.
     All tests were performed on a DELL Latitude E6320 notebook with an Intel
Core i7-2640M 2.8 GHz CPU and 4 GB of RAM, running the Microsoft Windows 7
Premium operating system, and Java 1.6 with 2 GB of heap space. The classification
timeout was set to one hour, and a run was aborted if the maximum available
memory was exhausted. All figures reported in Table 2 are in seconds, and, because
classification results are subject to minor fluctuation, particularly when dealing
with large ontologies, are the average of 3 classifications of the respective
ontologies with each reasoner. The following versions of the OWL reasoners were
tested: FaCT++ v.1.5.3, HermiT v.1.3.6, Pellet v.2.3.0, CB v.12, and ELK v.0.3.2.
     In our test configuration, the classifications of the FMA 2.0 ontology by the
HermiT and FaCT++ reasoners terminate with an out-of-memory error. In [9],
classification of this ontology by the HermiT reasoner is performed successfully,
but the classification time far exceeds the one registered by QuOnto.
     The results of the experiments are summarized in Table 2. These results
confirm that the performance offered by QuOnto compares favorably to other
reasoners for almost all tested ontologies. Classification for even the largest of the
tested ontologies, i.e., the FMA-OBO and FMA 3.2.1 ontologies, is performed
in under 5 seconds, and memory space issues were never experienced during
our tests with QuOnto. For some test cases, the gap in performance between
QuOnto and other reasoners is sizeable: for instance, classification of the
Galen and FMA (1.4 and 2.0) ontologies by Pellet, and of the FMA (1.4 and OBO)
ontologies by FaCT++, exceeds the predetermined timeout of one hour.
    Detailed analysis of the results in Table 2 shows that only the CB and
ELK reasoners consistently display performance comparable to that of QuOnto,
which is the fastest for all ontologies that feature only positive inclusions,
with the exception of the EL-Galen, Galen, and FMA-OBO ontologies. The CB
reasoner, which is the best-performing reasoner for the Galen ontology, does
not, however, always perform complete classification: for instance, it does not
compute property hierarchies. The ELK reasoner, in turn, is slower than QuOnto
for three out of the five ontologies that are also in OWL 2 EL, but shows
markedly better performance for EL-Galen.
     Furthermore, if, as is usually the case, an ontology contains no
unsatisfiable predicates, the computation of such predicates through the
exploration of all negative inclusions can be avoided. This is the case for
ontologies such
as DOLCE and AEO, for which computation of the set ΦT of positive inclusion
assertions resulting from the transitive closure of GT is performed respectively in
0.347 and 0.384 seconds, fastest among tested reasoners. Instead, for ontologies
such as Pizza and Transportation, which feature respectively 2 and 62 unsatis-
fiable atomic concepts, the identification of all such predicates is unavoidable,
and the resulting set of trivial inclusion assertions must be added to ΩT .
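     The role of unsatisfiable predicates can be illustrated with a simplified
sketch. It assumes that the inferred subsumers of each atomic concept are
already available as a Python dict and that negative inclusions are given as
pairs of disjoint atomic concepts; both encodings and the function names are
hypothetical. It covers atomic concepts only and does not reproduce the
computeUnsat procedure, which also has to deal with roles and attributes.

```python
def unsatisfiable_concepts(subsumers, disjoint_pairs):
    """Return the concepts that are subsumed by two disjoint concepts.

    subsumers: dict mapping each concept to the set of concepts that subsume it
               (assumed to be transitively closed).
    disjoint_pairs: iterable of (B, C) pairs stating that B and C are disjoint.
    """
    unsat = set()
    for concept, ancestors in subsumers.items():
        reachable = ancestors | {concept}  # a concept trivially subsumes itself
        if any(b in reachable and c in reachable for b, c in disjoint_pairs):
            unsat.add(concept)
    return unsat


def trivial_inclusions(unsat, signature):
    """An unsatisfiable concept is subsumed by every other concept."""
    return {(u, s) for u in unsat for s in signature if u != s}


subsumers = {"A": {"B", "C"}, "B": set(), "C": set(), "D": {"B"}}
unsat = unsatisfiable_concepts(subsumers, [("B", "C")])
print(sorted(unsat))
print(sorted(trivial_inclusions(unsat, ["A", "B", "C", "D"])))
```

When the set of unsatisfiable concepts is empty, the second step contributes
nothing, which matches the observation that the extra work is needed only for
ontologies such as Pizza and Transportation.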


5   Conclusions


The research presented in this paper can be extended in various directions. First
of all, in the implementation of our technique we have adopted a naive algorithm
for computing the digraph transitive closure. We are currently experimenting
with more sophisticated and efficient techniques for this task. We are also working
to optimize the procedure through which we identify unsatisfiable predicates.
Finally, we are working to extend our technique to compute all inclusions that are
inferred by the TBox (which, in OWL 2 QL, are finite in number). In this respect,
we notice that through GT∗ it is already possible to obtain the classification of all
basic concepts, basic roles, and attributes, and not only that of predicates in the
signature, and that, with slight modifications of computeUnsat, we can actually
obtain the set of all negative inclusions inferred by an OWL 2 QL TBox. The
remaining challenge is to devise an efficient mechanism to obtain all inferred
positive inclusions involving qualified existential roles and attribute domains.
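     For reference, one naive way to compute the transitive closure of a
digraph is a separate traversal from every node. The sketch below is an
assumption about what such a baseline could look like, not the code used in
QuOnto; it takes the arcs as (source, target) pairs and runs in time
proportional to the number of nodes times the size of the graph.

```python
def transitive_closure(arcs):
    """Return a dict mapping each node to the set of nodes reachable from it.

    arcs: iterable of (source, target) pairs. A node is listed among its own
    reachable nodes only if it lies on a cycle.
    """
    successors = {}
    for source, target in arcs:
        successors.setdefault(source, set()).add(target)
        successors.setdefault(target, set())

    closure = {}
    for start in successors:
        reachable = set()
        stack = list(successors[start])
        while stack:  # iterative depth-first traversal from `start`
            node = stack.pop()
            if node not in reachable:
                reachable.add(node)
                stack.extend(successors[node])
        closure[start] = reachable
    return closure


closure = transitive_closure([("A", "B"), ("B", "C")])
print(sorted(closure["A"]))
```

More efficient techniques typically avoid repeating work across start nodes,
for example by first collapsing strongly connected components.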

Acknowledgments. This research has been partially supported by the EU
under FP7 project Optique (grant no. FP7-318338), and by the EU under FP7-
ICT project ACSI (grant no. 257593).

References
 1. A. Acciarri, D. Calvanese, G. D. Giacomo, D. Lembo, M. Lenzerini, M. Palmieri,
    and R. Rosati. QuOnto: Querying Ontologies. In M. Veloso and S. Kambhampati,
    editors, Proc. of AAAI 2005, pages 1670–1671. AAAI Press/The MIT Press, 2005.
 2. A. Artale, D. Calvanese, R. Kontchakov, and M. Zakharyaschev. The DL-Lite
    family and relations. J. of Artificial Intelligence Research, 36:1–69, 2009.
 3. M. Ashburner, C. Ball, J. Blake, D. Botstein, H. Butler, J. Cherry, A. Davis,
    K. Dolinski, S. Dwight, J. Eppig, et al. Gene Ontology: tool for the unification of
    biology. Nature genetics, 25(1):25, 2000.
 4. F. Baader, D. Calvanese, D. McGuinness, D. Nardi, and P. F. Patel-Schneider, ed-
    itors. The Description Logic Handbook: Theory, Implementation and Applications.
    Cambridge University Press, 2nd edition, 2007.
 5. J. Bang-Jensen and G. Z. Gutin. Digraphs: Theory, Algorithms and Applications.
    Springer, 2nd edition, 2008.
 6. D. Calvanese, G. De Giacomo, D. Lembo, M. Lenzerini, A. Poggi, M. Rodriguez-
    Muro, R. Rosati, M. Ruzzi, and D. F. Savo. The MASTRO system for ontology-
    based data access. Semantic Web J., 2(1):43–53, 2011.
 7. D. Calvanese, G. De Giacomo, D. Lembo, M. Lenzerini, and R. Rosati. Tractable
    reasoning and efficient query answering in description logics: The DL-Lite family.
    J. of Automated Reasoning, 39(3):385–429, 2007.
 8. G. De Giacomo, D. Lembo, M. Lenzerini, A. Poggi, R. Rosati, M. Ruzzi, and D. F.
    Savo. MASTRO: A reasoner for effective ontology-based data access. In Proc. of
    ORE-2012, volume 858 of CEUR, ceur-ws.org, 2012.
 9. B. Glimm, I. Horrocks, B. Motik, R. Shearer, and G. Stoilos. A novel approach to
    ontology classification. J. of Web Semantics, 14:84–101, 2012.
10. C. Golbreich, S. Zhang, and O. Bodenreider. The foundational model of anatomy
    in OWL: Experience and perspectives. J. of Web Semantics, 4(3):181–195, 2006.
11. V. Haarslev and R. Möller. RACER system description. In R. Goré, A. Leitsch, and
    T. Nipkow, editors, Proc. of IJCAR 2001, volume 2083 of LNCS, pages 701–706.
    Springer, 2001.
12. Q. Ji, P. Haase, G. Qi, P. Hitzler, and S. Stadtmüller. RaDON - repair and diagnosis
    in ontology networks. In L. Aroyo, P. Traverso, F. Ciravegna, P. Cimiano, T. Heath,
    E. Hyvönen, R. Mizoguchi, E. Oren, M. Sabou, and E. P. B. Simperl, editors, Proc.
    of ESWC 2009, volume 5554 of LNCS, pages 863–867. Springer, 2009.
13. A. Kalyanpur, B. Parsia, E. Sirin, and J. A. Hendler. Debugging unsatisfiable
    classes in OWL ontologies. J. of Web Semantics, 3(4):268–293, 2005.
14. Y. Kazakov. Consequence-driven reasoning for Horn SHIQ ontologies. In
    C. Boutilier, editor, Proc. of IJCAI 2009, pages 2040–2045. AAAI press, 2009.
15. Y. Kazakov, M. Krötzsch, and F. Simancik. Concurrent classification of EL on-
    tologies. In L. Aroyo, C. Welty, H. Alani, J. Taylor, A. Bernstein, L. Kagal, N. F.
    Noy, and E. Blomqvist, editors, Proc. of ISWC 2011, volume 7031 of LNCS, pages
    305–320. Springer, 2011.
16. M. Krötzsch. The not-so-easy task of computing class subsumptions in OWL RL.
    In P. Cudré-Mauroux, J. Heflin, E. Sirin, T. Tudorache, J. Euzenat, M. Hauswirth,
    J. X. Parreira, J. Hendler, G. Schreiber, A. Bernstein, and E. Blomqvist, editors,
    Proc. of ISWC 2012, volume 7649 of LNCS, pages 279–294. Springer, 2012.
17. M. Lawley and C. Bousquet. Fast classification in Protégé: Snorocket as an
    OWL 2 EL reasoner. In T. Meyer, M. Orgun, and K. Taylor, editors, Proc.
    of AOW 2010, volume 122 of CRPIT, pages 45–50. ACS, 2010.
18. D. Lembo, V. Santarelli, and D. F. Savo. Graph-based Ontology Classification in
    OWL 2 QL. In Proc. of ESWC 2013, 2013. (to appear).
19. C. Masolo, S. Borgo, A. Gangemi, N. Guarino, A. Oltramari, and L. Schneider. The
    wonderweb library of foundational ontologies and the DOLCE ontology. Technical
    Report D17, WonderWeb, 2002.
20. J. Mendez, A. Ecke, and A. Turhan. Implementing completion-based inferences
    for the EL-family. In Proc. of DL 2011, volume 745 of CEUR, ceur-ws.org, 2011.
21. B. Motik, B. Cuenca Grau, I. Horrocks, Z. Wu, A. Fokoue, and C. Lutz.
    OWL 2 Web Ontology Language – Profiles (2nd edition). W3C Recommenda-
    tion, World Wide Web Consortium, Dec. 2012. Available at http://www.w3.org/
    TR/owl2-profiles/.
22. J. Rogers and A. Rector. The GALEN ontology. Medical Informatics Europe (MIE
    96), pages 174–178, 1996.
23. E. Sirin, B. Parsia, B. C. Grau, A. Kalyanpur, and Y. Katz. Pellet: A practical
    OWL-DL reasoner. J. of Web Semantics, 5(2):51–53, 2007.
24. D. Tsarkov and I. Horrocks. FaCT++ description logic reasoner: System descrip-
    tion. In U. Furbach and N. Shankar, editors, Proc. of IJCAR 2006, volume 4130
    of LNCS, pages 292–297. Springer, 2006.

</pre>