=Paper= {{Paper |id=Vol-1193/paper_68 |storemode=property |title=Detecting Conjunctive Query Differences between ELHr-Terminologies using Hypergraphs |pdfUrl=https://ceur-ws.org/Vol-1193/paper_68.pdf |volume=Vol-1193 |dblpUrl=https://dblp.org/rec/conf/dlog/Ludwig014 }} ==Detecting Conjunctive Query Differences between ELHr-Terminologies using Hypergraphs== https://ceur-ws.org/Vol-1193/paper_68.pdf
        Detecting Conjunctive Query Differences
          between ELHr -Terminologies using
                    Hypergraphs?

                          Michel Ludwig and Dirk Walther

                TU Dresden, Theoretical Computer Science, Germany
                    Center for Advancing Electronics Dresden
                     {michel,dirk}@tcs.inf.tu-dresden.de



        Abstract. We present a new method for detecting logical differences be-
        tween EL-terminologies extended with role inclusions, domain and range
        restrictions of roles using a hypergraph representation of ontologies. In
        this paper we consider differences given by pairs consisting of a conjunc-
        tive query and of an ABox formulated over a vocabulary of interest. We
        define a simulation notion between such hypergraph representations and
        we show that the existence of simulations coincides with the absence
        of a logical difference. To demonstrate the practical applicability of our
        approach, we evaluate a prototype implementation on large ontologies.


1     Introduction
The aim of this paper is to propose and investigate a novel and coherent approach
to the logical difference problem as introduced in [4,6,7] using a hypergraph rep-
resentation of ontologies. The logical difference is taken to be the set of queries
relevant to an application domain that produce different answers when evalu-
ated over ontologies that are to be compared. The language and signature of
the queries can be adapted in such a way that exactly the differences of interest
become visible, which can be independent of the syntactic representation of the
ontologies. Three types of queries have been studied so far: concept subsump-
tions, instance and conjunctive queries. The logical difference problem involves
reasoning tasks such as determining the existence of a difference and of a suc-
cinct representation of the entire set of queries that lead to different answers.
Other relevant tasks include the construction of an example query that yields
different answers from ontologies given a representation of the difference, as well
as finding explanations, i.e. the axioms by which this query is entailed.
    Our approach is based on representing ontologies as hypergraphs and com-
puting simulations between them. Hypergraphs are a generalisation of graphs
with many applications in computer science and discrete mathematics. In knowl-
edge representation, hypergraphs have been used implicitly to define reachability-
based modules of ontologies [10], explicitly to define locality-based modules [9]
?
    Supported by the German Research Foundation (DFG) via the Cluster of Excellence
    ‘cfAED’.
and to perform efficient reasoning with ontologies [8]. We consider ontologies
that can be translated into directed hypergraphs by taking the concept and role
names that occur in them as nodes and treating the axioms as hyperedges. For
instance, the axiom A v ∃r.B is translated into the hyperedge ({xA }, {xr , xB }),
and the axiom A ≡ B1 u B2 into the hyperedges ({xA }, {xB1 }), ({xA }, {xB2 })
and ({xB1 , xB2 }, {xA }).
    In this paper we consider the conjunctive query difference between ontologies
formulated as terminologies in the description logic ELHr , i.e. the extension of
EL with role inclusions and domain & range restrictions [1]. Given a signature Σ
(i.e. a set of concept and role names), the Σ-query difference between TBoxes T1
and T2 is the set qDiff Σ (T1 , T2 ) of pairs of the form (A, q(a)), where A is an ABox
and q(x) a conjunctive query which both only use symbols from Σ, and a is a tu-
ple of individual names from A such that (T1 , A) |= q(a) and (T2 , A) 6|= q(a) [4].
An extension of ELHr has been introduced in [4] whose concept subsumptions
taken as queries detect the same differences between general ELHr -TBoxes as
conjunctive queries. Thus it is sufficient to consider concept subsumption queries
over the extended language only. Primitive witness theorems state that for ev-
ery concept subsumption in the difference between ELHr -terminologies, there
are also simpler subsumptions of the form A v C, (dom(r) v C, ran(r) v C),
or D v A that have an atomic concept (or a domain/range expression), called
witness, either on the left-hand or the right-hand side. Checking for the exis-
tence of a logical difference is thus equivalent to searching for so-called left- and
right-hand witnesses. In [4] distinct methods based on semantic notions are em-
ployed for each type of witness. The search for left-hand witnesses is performed
by checking for simulations between canonical models, whereas two different
approaches were suggested for right-hand witnesses: one is based on instance
checking and the second one employs dynamic programming.
    In this paper we develop an alternative approach for finding witnesses based
on checking for the existence of certain simulations between hypergraph rep-
resentations of ontologies. The detection of witnesses is performed by checking
for the existence of forward and backward simulations. The existence of such
simulations between hypergraphs characterises the fact that the corresponding
ontologies cannot be distinguished from each other with conjunctive queries, i.e.
no logical difference exists. Our approach is unifying in the sense that the exis-
tence of both types of witnesses can be characterised via graph-theoretic notions.
We focus on backward simulations only as checking for forward simulations is
similar to checking for simulations between canonical models [4].
    The paper is organised as follows. After some preliminaries, we introduce a
hypergraph representation of ELHr -terminologies and the notion of a backward
simulation in hypergraphs. We show that the existence of backward simulations
corresponds to the absence of right-hand witnesses. A prototype implementa-
tion of an algorithm that checks for the existence of both types of simulations
demonstrates that witnesses can typically be found at least as quickly as with the
previous tool CEX 2.5 [5]. Our prototype implementation, however, can handle
large cyclic terminologies, which was not possible with CEX 2.5.
   This paper uses results from [4, 6] and it extends the previously introduced
approach on the concept subsumption difference between terminologies using
hypergraphs but restricted to the logic EL [3].


2     Preliminaries
We start by briefly reviewing the description logic EL and its extensions ELran ,
ELran,u,u with range restrictions, conjunctions of roles, and the universal role, as
well as concept subsumptions based on ELran,u,u and ELran . For a more detailed
introduction to description logics, we refer to [2].
    Let NC and NR be countably infinite and disjoint sets of concept names and
role names. The sets of EL-concepts C, ELran -concepts D, ELran,u,u -concepts E
and the sets of ELran,u,u -inclusions α and ELran -inclusions β are built according
to the following grammar rules:

                   C ::= > | A | C u C | ∃r.C
                   D ::= > | A | D u D | ∃r.D | ran(r)
                   E ::= > | A | E u E | ∃r1 u . . . u rn .E | ∃u.E
                   α ::= D v E | r v s
                   β ::= D v C | r v s

where A ∈ NC , r, r1 , . . . , rn , s ∈ NR , n ≥ 1, and u ∈
                                                          / NR is the universal role.
Concept inclusions of the form ran(r) v D are also called range restrictions, and
those of the form dom(r) v D are termed domain restrictions, where dom(r)
stands for ∃r.>. A TBox is a finite set of inclusions, which are also called axioms.
    An ELHr -terminology T is a TBox in which every axiom is of the form
A v C, A ≡ C, r v s, ran(r) v C or dom(r) v C, where A is a concept name,
C an EL-concept and no concept name occurs more than once on the left-hand
side of an axiom.1
    The semantics is defined using interpretations I = (∆I , ·I ), where the do-
main ∆I is a non-empty set, and ·I is a function mapping each concept name
A to a subset AI of ∆I , every role name r to a binary relation rI over ∆I ,
and uI = ∆I × ∆I . The extension C I of a concept C is defined inductively as:
(>)I := T∆I , (C u D)I := C I ∩ DI , (∃r1 u . . . u rn .C)I := { x ∈ ∆I | ∃ y ∈ C I :
           n
(x, y) ∈ i=1 riI }, and (ran(r))I := {y ∈ ∆I | ∃x : (x, y) ∈ rI }.
    An interpretation I satisfies a concept C, an inclusion C v D or r v s if,
respectively, C I 6= ∅, C I ⊆ DI , or rI ⊆ sI . We write I |= ϕ if I satisfies the
axiom ϕ. An interpretation I satisfies a TBox T if I satisfies all axioms in T ;
in this case, we say that I is a model of T . An axiom ϕ follows from a TBox T ,
written T |= ϕ, if for all models I of T , we have that I |= ϕ.
    To simplify the presentation we assume that terminologies do not contain
axioms of the form A ≡ B or A ≡ > (after having removed multiple B or
>-conjuncts) for concept names A and B. A terminology is acyclic if it can
1
    A concept equation A ≡ C stands for the inclusions A v C and C v A.
be unfolded (i.e., the process of substituting concept names by their definitions
terminates). An ELHr -terminology T is normalised iff it only contains axioms
of the forms r v s,

 – ϕ v B1 u . . . u Bn , A v ∃r.B, A v dom(r), and
 – A ≡ B1 u . . . u Bm , A ≡ ∃r.B,

where ϕ ∈ {A, dom(s), ran(s)}, n ≥ 1, m ≥ 2, A, B, Bi are concept names, r, s
roles names, and each conjunct Bi is non-conjunctive in T , i.e. there does not
exist an axiom of the form Bi ≡ B10 u. . .uBm 0
                                                ∈ T for concept names B10 , . . . , Bm
                                                                                     0

with m ≥ 2 (otherwise Bi is said to be conjunctive in T ).
    Every ELHr -terminology T can be normalised in polynomial time such that
the resulting terminology is a conservative extension of T [4]. Note that axioms
of the form A ≡ ∃r.> are rewritten into A v dom(r) and dom(r) v A.
    A signature Σ is a finite set of symbols from NC and NR . The signature sig(ϕ)
of a syntactic object ϕ is the set of concept and role names occurring in ϕ. Note
that sig(∃u.C) = sig(C) for every concept C. The symbol Σ is used as a subscript
to a set of concepts or inclusions to denote that its elements only use symbols
from Σ, e.g., ELΣ , ELran         r
                        Σ , ELHΣ , etc.
    The logical difference between two terminologies for ELran,u,u -inclusions as
query language is defined as follows [4, 6].

Definition 1 (Concept Inclusion Difference). The ELran,u,u -concept inclu-
sion difference between ELHr -terminologies T1 and T2 w.r.t. a signature Σ is the
                               ran,u,u
set Diff Σ (T1 , T2 ) of all ELΣ       -inclusions ϕ such that T1 |= ϕ and T2 6|= ϕ.

    If the set Diff Σ (T1 , T2 ) is not empty, then it typically contains infinitely many
concept inclusions. We make use of the primitive witnesses theorems from [4],
which state that it is sufficient to consider the following inclusions to represent a
difference: (δ1 ) r v s, (δ2 ) ELran,u,u -inclusions of the forms A v E, dom(r) v E
and ran(r) v E, and (δ3 ) ELran -inclusions the form D v A.
    The set of all ELran,u,u -concept inclusion difference witnesses is defined as

     WtnΣ (T1 , T2 ) = (roleWtnΣ (T1 , T2 ), lhsWtnΣ (T1 , T2 ), rhsWtnΣ (T1 , T2 )),

where the set roleWtnΣ (T1 , T2 ) consists of all role inclusions in Diff Σ (T1 , T2 ), and
the sets lhsWtnΣ (T1 , T2 ) ⊆ (NC ∩ Σ) ∪ { dom(r) | r ∈ Σ } ∪ { ran(r) | r ∈ Σ }
and rhsWtnΣ (T1 , T2 ) ⊆ NC ∩ Σ of left-hand and right-hand concept difference
witnesses consist of the left-hand sides of the type-δ2 inclusions in Diff Σ (T1 , T2 )
and the right-hand sides of type-δ3 inclusions in Diff Σ (T1 , T2 ), respectively. Ob-
serve that the set WtnΣ (T1 , T2 ) is finite. Consequently, it can be seen as a suc-
cinct representation of the typically infinite set Diff Σ (T1 , T2 ) in the sense that:
Diff Σ (T1 , T2 ) = ∅ iff WtnΣ (T1 , T2 ) = (∅, ∅, ∅) [4]. Thus, to decide the existence
of concept inclusion differences, it is equivalent to decide non-emptiness of the
three witness sets.
    In this paper, we focus on right-hand witnesses in rhsWtnΣ (T1 , T2 ), i.e., only
the inclusions of types δ3 are relevant.
3     Logical Difference using Hypergraphs
Our approach for detecting logical differences is based on finding appropriate
simulations between the hypergraph representations of terminologies. The hy-
pergraph notion in this paper is such that the existence of certain simulations
between the ontology hypergraphs of terminologies T1 and T2 coincides with
lhsWtnΣ (T1 , T2 ) = ∅ and rhsWtnΣ (T1 , T2 ) = ∅. For every concept name A ∈ Σ
(or role name r ∈ Σ), we verify whether A (or dom(r), ran(r)) belongs to
lhsWtnΣ (T1 , T2 ), or whether A is a member of rhsWtnΣ (T1 , T2 ). For the former,
we check for the existence of a forward simulation, and for the latter, for the ex-
istence of a backward simulation between the ontology hypergraphs of T1 and T2 .
In this paper we present backward simulations only. Checking for left-hand side
witnesses and for witnesses in roleWtnΣ (T1 , T2 ) using ontology hypergraphs can
be done similarly to [4].
    We start with defining ontology hypergraphs and we subsequently introduce
the notion of a backward simulation between such hypergraphs.

3.1     Ontology Hypergraphs
We introduce a hypergraph representation of ELHr -terminologies. A directed
hypergraph is a tuple G = (V, E), where V is a non-empty set of nodes (or
vertices), and E is a set of directed hyperedges of the form e = (S, S 0 ), where
S, S 0 ⊆ V. A node v ∈ V is reachable in G from a set of nodes V 0 ⊆ V (written
V 0 ≥G v) if v ∈ V 0 , or there is a hyperedge e = (S, S 0 ) ∈ E such that v ∈ S 0 and
every node in S is reachable from V 0 .
    We now show how to represent terminologies as hypergraphs.
Definition 2 (Ontology Hypergraph). Let T be a normalised ELHr -termin-
ology and let Σ be a signature. The ontology hypergraph GTΣ of T for Σ is a
directed hypergraph GTΣ = (V, E) defined as follows:
      V ={ xA | A ∈ NC ∩ (Σ ∪ sig(T )) }
           ∪ { xr , xdom(r) , xran(r) | r ∈ NR ∩ (Σ ∪ sig(T )) }, and
      E ={({x` }, {xBi }) | ` ./ B1 u . . . u Bn ∈ T , ./ ∈ {v, ≡},
                                                    ` ∈ NC ∪ { dom(s), ran(s) | s ∈ NR }}
           ∪ {({xA }, {xdom(r) }) | A v dom(r) ∈ T or A ./ ∃r.B ∈ T , ./ ∈ {v, ≡}}
           ∪ {({xA }, {xr , xran(r)uB }), ({xran(r)uB }, {xB }),
                                ({xran(r)uB }, {xran(r) }) | A ./ ∃r.B ∈ T , ./ ∈ {v, ≡}}
           ∪ { ({xr , xB }, {xA }) | A ≡ ∃r.B ∈ T }
           ∪ { ({xB1 , . . . , xBn }, {xA }) | A ≡ B1 u . . . u Bn ∈ T }
           ∪ { ({xdom(r) }, {xr , xran(r) }) | r ∈ NR ∩ (Σ ∪ sig(T ) }
           ∪ {({xr }, {xs }), ({xdom(r) }, {xdom(s) }), ({xran(r) }, {xran(s) }) | r v s ∈ T }
   The ontology hypergraph GTΣ = (V, E) contains a node x` for every signature
symbol ` in Σ and T .2 Additionally, we represent concepts of the form dom(r)
2
    Note that, differently to [3], the graph GTΣ does not contain a node representing >.
and ran(r) as nodes in the graph. We include a hyperedge ({xdom(r) }, {xr , xran(r) })
for every role name r in Σ and T , which corresponds to the tautology dom(r) v
∃r.ran(r).3 Recall that dom(r) equals ∃r.>. The other hyperedges in GTΣ rep-
resent the axioms in T . Every hyperedge is directed and can be understood
as an implication, i.e., ({x`1 }, {x`2 }) stands for T |= `1 v `2 . The complex
hyperedges are of the form ({xA }, {xr , xB }) and ({xr , xB }, {xA }) representing
T |= A v ∃r.B and T |= ∃r.B v A, and of the form ({xB1 , ..., xBn }, {xA })
representing T |= B1 u . . . u Bn v A.

Example 1. Let T = {A ≡ ∃r.X, X ≡ Y u Z, B v Z, ran(r) v Y }, and Σ =
{A, B, r}. The ontology hypergraph GTΣ of T for Σ can be depicted as follows.
                      xran(t)        xZ
                                                 xX
                     xt         xB xY                   xA   xran(r)uX
                          xdom(t)                 xr
                                    xran(r)
                                              xdom(r)

   In the following we introduce a relation →T on nodes xl , xl0 of an ontology
hypergraph such that xl →T xl0 holds iff T |= l v l0 .

Definition 3. Let GTΣ = (V, E) be the ontology hypergraph of a normalised
ELHr -terminology T for a signature Σ. We define →T ⊆ V × V to be the
smallest reflexive and transitive relation closed under the following conditions:

 – x →T x0 if ({x}, {x0 }) ∈ E;
 – x →T x0 if ({x}, {xr , xA }) ∈ E, ({xs , xB }, {x0 }) ∈ E, xr →T xs , and
   xA →T xB ;
 – x →T x0 if ({xB1 , . . . , xBm }, {x0 }) ∈ E and x →T xBi for every 1 ≤ i ≤ m.

It can be readily seen that the relation →T can be computed in polynomial time
in the size of T . Note that →T does not coincide with the usual reachability
notion in a hypergraph.


3.2    Backward Simulation

We introduce backward simulations between ontology hypergraphs whose exis-
tence coincides with the absence of right-hand witnesses. The simulations are
defined such that a node xA in GTΣ1 is simulated by a node xA0 in GTΣ2 iff A0
is entailed in T2 by exactly the same ELranΣ -concepts that entail A in T1 . To
define backward simulations we need to take all the axioms into account that
cause Σ-concepts to entail a concept name. Axioms of the forms A ≡ ∃r.B,
A ≡ B1 u . . . u Bn , and ran(r) v A require special treatment, while it is more
straightforward to deal with the other types of axioms. For the former, con-
sider T1 = {A ≡ ∃r.X}, T2 = {A v ∃r.>}, and Σ = {A, r}. It holds that
3
    These hyperedges are relevant for the forward simulation only.
Diff Σ (T1 , T2 ) = ∅. Observe that there does not exist an ELran
                                                              Σ -concept that en-
tails A in T1 as the concept name X is not entailed by any Σ-concept in T .
Thus, the node xA in GTΣ1 should be simulated by the node xA in GTΣ2 . To handle
such cases, we want to characterise the entailment by an ELranΣ -concept in terms
of (a special) reachability notion in ontology hypergraphs.
    For a signature Σ, let Σ Ran = Σ ∪ Σ dom ∪ Σ ran , where Σ dom = { dom(t) | t ∈
NR ∩ Σ } and Σ ran = { ran(t) | t ∈ NR ∩ Σ } are the sets consisting of concepts of
the form dom(t) and ran(t) for every role name t in Σ, respectively.

Definition 4 (Σ Ran -Reachability). Let GTΣ = (V, E) be the ontology hyper-
graph of a normalised ELHr -terminology T for a signature Σ. We set VRan Σ
                                                                            =
                                      Ran                          Σ
{ x ∈ V | xσ →T x for some σ ∈ Σ } to be the set of nodes in GT that are
reachable via →T from a node labelled with elements from Σ Ran . We say that a
node v ∈ V is Σ Ran -reachable in GTΣ iff VRan
                                           Σ
                                               ≥GTΣ v.

   It can readily be seen that all Σ Ran -reachable nodes can be identified in
polynomial time w.r.t. the size of T .

Example 2. Let T = {A ≡ ∃r.X, X ≡ Y u Z, B v Z, ran(r) v Y } (cf. Ex. 1)
and let Σ = {B, r}. Then all the nodes are Σ Ran -reachable in GTΣ and we have
that T |= ∃r.B v A.

    The following example demonstrates that the relation →T is not sufficient
to characterise entailment by ELran
                                Σ -concepts in every case.

Example 3. Let T = {A ≡ ∃r.X, X ≡ ∃r.B} and let Σ = {A, B, r}. The nodes
xA and xB are Σ Ran -reachable in GTΣ , T |= ∃r.∃r.B v A, but xB 6→T xA .

    We now state the properties of Σ Ran -reachable nodes that we obtain.
Lemma 1. Let T be a normalised ELHr -terminology and let Σ be a signature.
Then the following statements hold:
 (i) xA ∈ V is Σ Ran -reachable in GTΣ iff there is an ELran Σ -concept D such that
     T |= D v A;
(ii) xs ∈ V is Σ Ran -reachable in GTΣ iff there is s0 ∈ NR ∩Σ such that T |= s0 v s.
    For axioms of the form A ≡ B1 u . . . u Bn , we introduce the following no-
tion which associates with every node xA in a hypergraph GT a set of concept
names non-conj(xA ) that are essential to entail A in T (cf. [4]).

Definition 5 (Non-Conjunctive). Let GTΣ = (V, E) be the ontology hyper-
graph of a normalised ELHr -terminology T for a signature Σ. For xA ∈ V, let
non-conj(xA ) be defined as:
 – if ({xB1 , . . . , xBm }, {xA }) ∈ E with m ≥ 2 (i.e. A ≡ B1 u . . . u Bm ∈ T ), we
   define
                              non-conjT (xA ) = {xB1 , . . . , xBm };
 – otherwise, let non-conjT (xA ) = {xA }.
    Note that for any concept name A in a normalised ELHr -terminology T
the concept names B1 , . . . , Bm with non-conjT (xA ) = {xB1 , . . . , xBm } are non-
conjunctive in T .
    We need to take special care of axioms of the form ran(r) v X as they
might cause non-obvious entailments. Let T = {X ≡ B1 u B2 , A ≡ ∃r.X} and
Σ = {A, B1 , B2 , r}. Then the Σ-concept ∃r.(B1 uB2 ) entails A in T . If we add the
axiom ran(r) v B1 to T , then already the Σ-concept ∃r.B2 (of smaller signature)
is sufficient to entail A in T . Intuitively, the conjunct B1 of X is already covered
by ran(r) in the presence of the axiom ran(r) v B1 (as T |= ran(r) u B2 v X).
To define backward simulations for axioms of the form A ≡ ∃r.X, all axioms of
the form ran(r) v Y need to be taken into account.
    We will therefore define the notion of backward simulation using an additional
parameter ζ ∈ C Σ = {} ∪ (NR ∩ Σ), called the role context. Such a parameter ζ
stands for an expression of the form ran(ζ) in which a node x ∈ V1 should be
simulated by a node x0 ∈ V2 . We treat  as a special role name and set ran() = >.
    Additionally, we say that a node y ∈ V in an ontology hypergraph GTΣ =
(V, E) is relevant for a node x in T w.r.t. a set of node labels L used in GTΣ if
y ∈ non-conjT (x) and x` 6→T y for every ` ∈ L.
    We now give the definition of backward simulations as subsets of V1 ×V2 ×C Σ .

Definition 6 (Backward Simulation). Let GTΣ1 = (V1 , E1 ), GTΣ2 = (V2 , E2 )
be the ontology hypergraphs of normalised ELHr -terminologies T1 and T2 for a
signature Σ. A relation ←- ⊆ V1 × V2 × C Σ is a backward Σ-simulation if the
following conditions are satisfied:

  (ib ) if (x, x0 , ζ) ∈ ←-, then for every σ ∈ Σ Ran : xσ →T1 x implies xσ →T2 y 0
        for every y 0 ∈ V2 relevant for x0 in T2 w.r.t. {ran(ζ)};
 (iib ) if (x, x0 , ζ) ∈ ←-, ({xr , y}, {x}) ∈ E1 , and y is Σ Ran -reachable in GTΣ1 , then
        for every s ∈ Σ such that xs →T1 xr , and for every y 0 ∈ V2 relevant
        for x0 in T2 w.r.t. {ran(ζ), dom(s)}, there exists ({xr0 , z 0 }, {y 0 }) ∈ E2 with
        xs →T2 xr0 and (y, z 0 , s) ∈ ←-;
(iiib ) if (x, x0 , ζ) ∈ ←- and ({xX1 , . . . , xXn }, {x}) ∈ E1 , then for every y 0 ∈ V2
        relevant for x0 w.r.t. {ran(ζ)} there exists y ∈ V1 relevant for x in T1 w.r.t.
        {ran(ζ)} with (y, y 0 , ) ∈ ←-.

We write GTΣ1 ←- GTΣ2 iff there exists a backward Σ-simulation ←- ⊆ V1 × V2 × C Σ
such that (xA , xA , ) ∈ ←- for every A ∈ NC ∩ Σ.

Members of a backward simulation ←- are called simulation triples.
    For a node x in GTΣ1 to be backward simulated by x0 in GTΣ2 , Condition (ib )
enforces that appropriate Σ-concept names B or concepts of the form ran(s),
dom(s) with s ∈ Σ that entail x in T1 must also entail x0 in T2 . Condition (iib )
applies to nodes xA ∈ GTΣ1 for which there exists an axiom A ≡ ∃r.X in T1 and
propagates the simulation to the successor node xX by taking into account pos-
sible entailments regarding domain or range restrictions in T2 . Condition (iiib )
handles axioms of the form A ≡ B1 u. . .uBn in T1 . We have to match every con-
junct y 0 that is relevant for x0 in T2 with some conjunct y relevant for x in T1 (pos-
sibly leaving some conjuncts y unmatched) since, intuitively speaking, some con-
juncts in the definition of A in T1 can be ignored to preserve logical entailment.
For instance, let T1 = {A ≡ B1 u B2 }, T2 = {B1 v A} and Σ = {A, B1 , B2 }.
Then rhsWtnΣ (T1 , T2 ) = ∅ and, in particular, T2 |= B1 u B2 v A holds as well.
Note that the simulation between conjuncts is propagated in the context  only
as all the conjuncts that are entailed by ran(ζ) have been filtered out already.

Example 4. Let T1 = {A ≡ ∃r.X, X ≡ Y u Z, B v Z, ran(r) v Y } (cf. Ex. 1),
T2 = {A ≡ X u Y, X ≡ ∃r.B, dom(s) v Y, r v s}, and Σ = {A, B, r}.
    It can be readily seen that the nodes xB , xY , xZ , and xX are Σ Ran -reachable
in GTΣ1 . As only xB →T1 xB , we have that the node xB in GTΣ1 can be simulated
by the node xB in GTΣ2 in the contexts  and r. Similarly, as only xB →T1 xZ ,
the node xZ in GTΣ1 can be simulated by the node xB in GTΣ2 in the contexts 
and r. Hence, as non-conjT2 (xB ) = {xB } and as xZ is relevant for xX in T1
w.r.t. {ran(r)}, we have that xX in GTΣ1 can be simulated by xB in GTΣ2 in the
context r. Finally, as non-conjT2 (xA ) = {xX , xY } and as only xX is relevant for
xA in GTΣ2 (due to xdom(r) →T2 xY ), we can conclude that the node xA in GTΣ1
can be simulated by xA in GTΣ2 in the contexts  and r. Overall,

   S = { (xA , xA , ζ) | ζ ∈ {, r} } ∪ { (xB , xB , ζ) | ζ ∈ {, r} }
                                      ∪ { (xZ , xB , ζ) | ζ ∈ {, r} } ∪ {(xX , xB , r)}

is a backward Σ-simulation between GTΣ1 and GTΣ2 such that (xA , xA , ) ∈ S.

Example 5. Let T1 , T2 , and Σ be defined as in Ex. 4. Now let T10 = T1 ∪{ran(t) v
Z} and Σ 0 = Σ ∪ {t}. We observe that xran(t) →T2 x0 does not hold for any
              0                          0                                        0
node x0 ∈ GTΣ2 , i.e., the node xZ in GTΣ0 cannot be simulated by any node in GTΣ2
                                            1
                                                                                               0
(in any context) as Condition (ib ) cannot be fulfilled. Hence, the node xX in GTΣ0
                                                                                               1
                                     Σ0
cannot be simulated by xB in G       T10   in the context r as Condition (iiib ) is vio-
                                                                                           0
lated. Thus, there cannot exist a backward Σ-simulation such that xA in GTΣ0 is
                                                                                       1
                         0
simulated by xA in GTΣ2 in the context  as Condition (iib ) cannot be fulfilled.

   We can now show that the existence of a backward simulation coincides with
the absence of right-hand witnesses and that one can check in polynomial time
whether backward simulations exist.

Theorem 1. Let T1 , T2 be normalised ELHr -terminologies, and let Σ be a sig-
nature. Then it holds that rhsWtnΣ (T1 , T2 ) = ∅ iff GTΣ1 ←- GTΣ2 .

Theorem 2. Let T1 , T2 be normalised ELHr -terminologies and let Σ be signa-
ture. Then it can be checked in polynomial time whether GTΣ1 ←- GTΣ2 holds.

   So far we focused on finding concept names A contained in rhsWtnΣ (T1 , T2 ),
which, together with the sets roleWtnΣ (T1 , T2 ) and lhsWtnΣ (T1 , T2 ), is sufficient
to decide the existence of a logical difference between T1 and T2 . However, in
practical applications users may require concrete concept inclusions D v A
         T1         T2          Time (s) - CEX 2.5        Time (s) - Prototype
                                           with ex.                   with ex.
         SM09a      SM09b        340.40      580.44        211.08        214.20
         SM09b      SM10a        495.22      639.46        295.24        302.53
         SM09b      SM09a        477.34      599.07        324.56        328.81
         SM10a      SM09b        444.71      608.44        229.15        235.61

              Table 1. Experimental Results Obtained for Snomed CT


(or A v E) in Diff Σ (T1 , T2 ) that correspond to a witness A. We note that
such example concept inclusions (and also example conjunctive queries) can be
constructed recursively from triples for which the simulation conditions failed.


4     Experiments
To investigate the practical applicability of our simulation-based approach for
detecting right-hand witnesses, we implemented a prototype tool in OCaml that
is based on the CEX 2.5 tool [5]. We then conducted a brief experimental eval-
uation involving large fragments of three versions of Snomed CT (the first and
second international release from 2009 as well as the first international release
from 2010) and 119 versions of NCI4 which appeared between October 2003 and
January 2014. The considered fragments of Snomed CT each contain about
280 000 concepts names and 62 role names. The aim of our experiments was to
compare the performance of our prototype implementation against the CEX 2.5
tool, which can detect logical differences between acyclic terminologies only. We
instructed both tools to compute the set WtnΣ (T1 , T2 ) for various versions T1
and T2 of Snomed CT and NCI. All the experiments were conducted on PCs
equipped with an Intel Core i5-2500 CPU running at 3.30GHz, and all the com-
putation times we report on are the average of three executions.
    In our experiments involving Snomed CT we used signatures composed of
the intersection of the concept names in the two versions that were compared,
together with the same 31 role names (including “RoleGroup”) that were cho-
sen at random initially (and which occur in every version). The results that we
obtained are shown in Table 1. The first two columns indicate which versions
were used as ontologies T1 and T2 . The next two columns then show the com-
putation times (CPU time) required by CEX 2.5, with column four depicting
the computation times if additionally examples illustrating the witnesses were
computed. The last two columns then indicate the computation times of our
prototype tool. The times required when additionally examples were computed
are shown in the last column. One can see that in all the cases our prototype tool
required less time to compute difference witnesses (also together with example
inclusions) than CEX 2.5.
4
    More precisely, we first extracted the ELHr -fragment of the NCI versions by remov-
    ing up to 8% of the axioms which were not expressed in this fragment.
               1000
                      Prototype
                        CEX2.5



                100
CPU Time (s)




                 10




                  1


                                  Fig. 1. Experimental Results Obtained for NCI

    For each considered version α of NCI, we computed conjunctive query wit-
nesses for T1 = NCIα and T2 = NCIα+1 on signatures Σ = sig(NCIα ) ∩
sig(NCIα+1 ), where α + 1 denotes the successor version of α, together with
corresponding examples. The results that we obtained are depicted in Fig. 1.
The computations are sorted chronologically along the x-axis according to the
publication date of version NCIα . Each pair of bars represents the computation
times required by our prototype tool and by CEX 2.5, respectively, for one com-
parison. In the cases where only one bar is shown, the ontology T1 = NCIα was
cyclic and CEX 2.5 could not be used.
    Generally speaking, both tools required longer computation times on more
recent NCI versions than on older releases, which could be explained by the fact
that the size of NCI versions increased with every new release. In the comparisons
before version 10.03h our prototype tool could typically compute the witnesses
and example inclusions faster than CEX 2.5. However, on later versions our
new tool then required slightly longer computation times. One can also see that
overall it took the longest time to compute witnesses for cyclic versions of NCI.
    Finally, we note that in our experiments all the computations required at
most 2.85 GiB of main memory.

5                 Conclusion
We presented a unifying approach to solving the logical difference problem for
possibly cyclic ELHr -terminologies. We showed that the existence of backward
simulations in hypergraph representations of terminologies corresponds to the
absence of right-hand witnesses (an analogous correspondence exists between
forward simulations and left-hand witnesses). We also demonstrated the ap-
plicability of the hypergraph approach using a prototype implementation. The
experiments showed that in most cases our prototype tool outperformed the
previous tool, CEX 2.5, for computing the logical difference. Moreover, our pro-
totype tool could be successfully applied on fairly large cyclic terminologies,
whereas previous approaches only worked for acyclic (or rather small cyclic)
terminologies.
    We plan to further improve our prototype implementation. Moreover, exten-
sions of our techniques to DL-Lite, general ELHr -TBoxes, or even Horn-SHIQ
ontologies could be investigated.
References
 1. Baader, F., Brandt, S., Lutz, C.: Pushing the EL envelope further. In: Proceedings
    of OWLED’08 (2008)
 2. Baader, F., Calvanese, D., McGuinness, D.L., Nardi, D., Patel-Schneider, P.F.
    (eds.): The description logic handbook: theory, implementation, and applications.
    Cambridge University Press (2007)
 3. Ecke, A., Ludwig, M., Walther, D.: The concept difference for EL-terminologies
    using hypergraphs. In: Proceedings of DChanges’13. CEUR-WS.org (2013)
 4. Konev, B., Ludwig, M., Walther, D., Wolter, F.: The logical difference for the
    lightweight description logic EL. JAIR 44, 633–708 (2012)
 5. Konev, B., Ludwig, M., Wolter, F.: Logical difference computation with CEX2.5.
    In: Proceedings of IJCAR’12. pp. 371–377. Springer (2012)
 6. Konev, B., Walther, D., Wolter, F.: The logical difference problem for description
    logic terminologies. In: Proceedings of IJCAR’08. pp. 259–274. Springer (2008)
 7. Kontchakov, R., Wolter, F., Zakharyaschev, M.: Logic-based ontology compari-
    son and module extraction, with an application to DL-Lite. Artificial Intelligence
    174(15), 1093–1141 (Oct 2010)
 8. Lembo, D., Santarelli, V., Savo, D.F.: Graph-based ontology classification in
    OWL 2 QL. In: Proceedings of ESWC 2013. LNCS, vol. 7882, pp. 320–334. Springer
    (2013)
 9. Nortje, R., Britz, A., Meyer, T.: Module-theoretic properties of reachability mod-
    ules for SRIQ. In: Proceedings of DL’13. pp. 868–884. CEUR-WS.org (2013)
10. Suntisrivaraporn, B.: Polytime reasoning support for design and maintenance of
    large-scale biomedical ontologies. Ph.D. thesis, TU Dresden, Germany (2009)