=Paper=
{{Paper
|id=Vol-1193/paper_68
|storemode=property
|title=Detecting Conjunctive Query Differences between ELHr-Terminologies using Hypergraphs
|pdfUrl=https://ceur-ws.org/Vol-1193/paper_68.pdf
|volume=Vol-1193
|dblpUrl=https://dblp.org/rec/conf/dlog/Ludwig014
}}
==Detecting Conjunctive Query Differences between ELHr-Terminologies using Hypergraphs==
Detecting Conjunctive Query Differences between ELHr -Terminologies using Hypergraphs? Michel Ludwig and Dirk Walther TU Dresden, Theoretical Computer Science, Germany Center for Advancing Electronics Dresden {michel,dirk}@tcs.inf.tu-dresden.de Abstract. We present a new method for detecting logical differences be- tween EL-terminologies extended with role inclusions, domain and range restrictions of roles using a hypergraph representation of ontologies. In this paper we consider differences given by pairs consisting of a conjunc- tive query and of an ABox formulated over a vocabulary of interest. We define a simulation notion between such hypergraph representations and we show that the existence of simulations coincides with the absence of a logical difference. To demonstrate the practical applicability of our approach, we evaluate a prototype implementation on large ontologies. 1 Introduction The aim of this paper is to propose and investigate a novel and coherent approach to the logical difference problem as introduced in [4,6,7] using a hypergraph rep- resentation of ontologies. The logical difference is taken to be the set of queries relevant to an application domain that produce different answers when evalu- ated over ontologies that are to be compared. The language and signature of the queries can be adapted in such a way that exactly the differences of interest become visible, which can be independent of the syntactic representation of the ontologies. Three types of queries have been studied so far: concept subsump- tions, instance and conjunctive queries. The logical difference problem involves reasoning tasks such as determining the existence of a difference and of a suc- cinct representation of the entire set of queries that lead to different answers. Other relevant tasks include the construction of an example query that yields different answers from ontologies given a representation of the difference, as well as finding explanations, i.e. the axioms by which this query is entailed. Our approach is based on representing ontologies as hypergraphs and com- puting simulations between them. Hypergraphs are a generalisation of graphs with many applications in computer science and discrete mathematics. In knowl- edge representation, hypergraphs have been used implicitly to define reachability- based modules of ontologies [10], explicitly to define locality-based modules [9] ? Supported by the German Research Foundation (DFG) via the Cluster of Excellence ‘cfAED’. and to perform efficient reasoning with ontologies [8]. We consider ontologies that can be translated into directed hypergraphs by taking the concept and role names that occur in them as nodes and treating the axioms as hyperedges. For instance, the axiom A v ∃r.B is translated into the hyperedge ({xA }, {xr , xB }), and the axiom A ≡ B1 u B2 into the hyperedges ({xA }, {xB1 }), ({xA }, {xB2 }) and ({xB1 , xB2 }, {xA }). In this paper we consider the conjunctive query difference between ontologies formulated as terminologies in the description logic ELHr , i.e. the extension of EL with role inclusions and domain & range restrictions [1]. Given a signature Σ (i.e. a set of concept and role names), the Σ-query difference between TBoxes T1 and T2 is the set qDiff Σ (T1 , T2 ) of pairs of the form (A, q(a)), where A is an ABox and q(x) a conjunctive query which both only use symbols from Σ, and a is a tu- ple of individual names from A such that (T1 , A) |= q(a) and (T2 , A) 6|= q(a) [4]. An extension of ELHr has been introduced in [4] whose concept subsumptions taken as queries detect the same differences between general ELHr -TBoxes as conjunctive queries. Thus it is sufficient to consider concept subsumption queries over the extended language only. Primitive witness theorems state that for ev- ery concept subsumption in the difference between ELHr -terminologies, there are also simpler subsumptions of the form A v C, (dom(r) v C, ran(r) v C), or D v A that have an atomic concept (or a domain/range expression), called witness, either on the left-hand or the right-hand side. Checking for the exis- tence of a logical difference is thus equivalent to searching for so-called left- and right-hand witnesses. In [4] distinct methods based on semantic notions are em- ployed for each type of witness. The search for left-hand witnesses is performed by checking for simulations between canonical models, whereas two different approaches were suggested for right-hand witnesses: one is based on instance checking and the second one employs dynamic programming. In this paper we develop an alternative approach for finding witnesses based on checking for the existence of certain simulations between hypergraph rep- resentations of ontologies. The detection of witnesses is performed by checking for the existence of forward and backward simulations. The existence of such simulations between hypergraphs characterises the fact that the corresponding ontologies cannot be distinguished from each other with conjunctive queries, i.e. no logical difference exists. Our approach is unifying in the sense that the exis- tence of both types of witnesses can be characterised via graph-theoretic notions. We focus on backward simulations only as checking for forward simulations is similar to checking for simulations between canonical models [4]. The paper is organised as follows. After some preliminaries, we introduce a hypergraph representation of ELHr -terminologies and the notion of a backward simulation in hypergraphs. We show that the existence of backward simulations corresponds to the absence of right-hand witnesses. A prototype implementa- tion of an algorithm that checks for the existence of both types of simulations demonstrates that witnesses can typically be found at least as quickly as with the previous tool CEX 2.5 [5]. Our prototype implementation, however, can handle large cyclic terminologies, which was not possible with CEX 2.5. This paper uses results from [4, 6] and it extends the previously introduced approach on the concept subsumption difference between terminologies using hypergraphs but restricted to the logic EL [3]. 2 Preliminaries We start by briefly reviewing the description logic EL and its extensions ELran , ELran,u,u with range restrictions, conjunctions of roles, and the universal role, as well as concept subsumptions based on ELran,u,u and ELran . For a more detailed introduction to description logics, we refer to [2]. Let NC and NR be countably infinite and disjoint sets of concept names and role names. The sets of EL-concepts C, ELran -concepts D, ELran,u,u -concepts E and the sets of ELran,u,u -inclusions α and ELran -inclusions β are built according to the following grammar rules: C ::= > | A | C u C | ∃r.C D ::= > | A | D u D | ∃r.D | ran(r) E ::= > | A | E u E | ∃r1 u . . . u rn .E | ∃u.E α ::= D v E | r v s β ::= D v C | r v s where A ∈ NC , r, r1 , . . . , rn , s ∈ NR , n ≥ 1, and u ∈ / NR is the universal role. Concept inclusions of the form ran(r) v D are also called range restrictions, and those of the form dom(r) v D are termed domain restrictions, where dom(r) stands for ∃r.>. A TBox is a finite set of inclusions, which are also called axioms. An ELHr -terminology T is a TBox in which every axiom is of the form A v C, A ≡ C, r v s, ran(r) v C or dom(r) v C, where A is a concept name, C an EL-concept and no concept name occurs more than once on the left-hand side of an axiom.1 The semantics is defined using interpretations I = (∆I , ·I ), where the do- main ∆I is a non-empty set, and ·I is a function mapping each concept name A to a subset AI of ∆I , every role name r to a binary relation rI over ∆I , and uI = ∆I × ∆I . The extension C I of a concept C is defined inductively as: (>)I := T∆I , (C u D)I := C I ∩ DI , (∃r1 u . . . u rn .C)I := { x ∈ ∆I | ∃ y ∈ C I : n (x, y) ∈ i=1 riI }, and (ran(r))I := {y ∈ ∆I | ∃x : (x, y) ∈ rI }. An interpretation I satisfies a concept C, an inclusion C v D or r v s if, respectively, C I 6= ∅, C I ⊆ DI , or rI ⊆ sI . We write I |= ϕ if I satisfies the axiom ϕ. An interpretation I satisfies a TBox T if I satisfies all axioms in T ; in this case, we say that I is a model of T . An axiom ϕ follows from a TBox T , written T |= ϕ, if for all models I of T , we have that I |= ϕ. To simplify the presentation we assume that terminologies do not contain axioms of the form A ≡ B or A ≡ > (after having removed multiple B or >-conjuncts) for concept names A and B. A terminology is acyclic if it can 1 A concept equation A ≡ C stands for the inclusions A v C and C v A. be unfolded (i.e., the process of substituting concept names by their definitions terminates). An ELHr -terminology T is normalised iff it only contains axioms of the forms r v s, – ϕ v B1 u . . . u Bn , A v ∃r.B, A v dom(r), and – A ≡ B1 u . . . u Bm , A ≡ ∃r.B, where ϕ ∈ {A, dom(s), ran(s)}, n ≥ 1, m ≥ 2, A, B, Bi are concept names, r, s roles names, and each conjunct Bi is non-conjunctive in T , i.e. there does not exist an axiom of the form Bi ≡ B10 u. . .uBm 0 ∈ T for concept names B10 , . . . , Bm 0 with m ≥ 2 (otherwise Bi is said to be conjunctive in T ). Every ELHr -terminology T can be normalised in polynomial time such that the resulting terminology is a conservative extension of T [4]. Note that axioms of the form A ≡ ∃r.> are rewritten into A v dom(r) and dom(r) v A. A signature Σ is a finite set of symbols from NC and NR . The signature sig(ϕ) of a syntactic object ϕ is the set of concept and role names occurring in ϕ. Note that sig(∃u.C) = sig(C) for every concept C. The symbol Σ is used as a subscript to a set of concepts or inclusions to denote that its elements only use symbols from Σ, e.g., ELΣ , ELran r Σ , ELHΣ , etc. The logical difference between two terminologies for ELran,u,u -inclusions as query language is defined as follows [4, 6]. Definition 1 (Concept Inclusion Difference). The ELran,u,u -concept inclu- sion difference between ELHr -terminologies T1 and T2 w.r.t. a signature Σ is the ran,u,u set Diff Σ (T1 , T2 ) of all ELΣ -inclusions ϕ such that T1 |= ϕ and T2 6|= ϕ. If the set Diff Σ (T1 , T2 ) is not empty, then it typically contains infinitely many concept inclusions. We make use of the primitive witnesses theorems from [4], which state that it is sufficient to consider the following inclusions to represent a difference: (δ1 ) r v s, (δ2 ) ELran,u,u -inclusions of the forms A v E, dom(r) v E and ran(r) v E, and (δ3 ) ELran -inclusions the form D v A. The set of all ELran,u,u -concept inclusion difference witnesses is defined as WtnΣ (T1 , T2 ) = (roleWtnΣ (T1 , T2 ), lhsWtnΣ (T1 , T2 ), rhsWtnΣ (T1 , T2 )), where the set roleWtnΣ (T1 , T2 ) consists of all role inclusions in Diff Σ (T1 , T2 ), and the sets lhsWtnΣ (T1 , T2 ) ⊆ (NC ∩ Σ) ∪ { dom(r) | r ∈ Σ } ∪ { ran(r) | r ∈ Σ } and rhsWtnΣ (T1 , T2 ) ⊆ NC ∩ Σ of left-hand and right-hand concept difference witnesses consist of the left-hand sides of the type-δ2 inclusions in Diff Σ (T1 , T2 ) and the right-hand sides of type-δ3 inclusions in Diff Σ (T1 , T2 ), respectively. Ob- serve that the set WtnΣ (T1 , T2 ) is finite. Consequently, it can be seen as a suc- cinct representation of the typically infinite set Diff Σ (T1 , T2 ) in the sense that: Diff Σ (T1 , T2 ) = ∅ iff WtnΣ (T1 , T2 ) = (∅, ∅, ∅) [4]. Thus, to decide the existence of concept inclusion differences, it is equivalent to decide non-emptiness of the three witness sets. In this paper, we focus on right-hand witnesses in rhsWtnΣ (T1 , T2 ), i.e., only the inclusions of types δ3 are relevant. 3 Logical Difference using Hypergraphs Our approach for detecting logical differences is based on finding appropriate simulations between the hypergraph representations of terminologies. The hy- pergraph notion in this paper is such that the existence of certain simulations between the ontology hypergraphs of terminologies T1 and T2 coincides with lhsWtnΣ (T1 , T2 ) = ∅ and rhsWtnΣ (T1 , T2 ) = ∅. For every concept name A ∈ Σ (or role name r ∈ Σ), we verify whether A (or dom(r), ran(r)) belongs to lhsWtnΣ (T1 , T2 ), or whether A is a member of rhsWtnΣ (T1 , T2 ). For the former, we check for the existence of a forward simulation, and for the latter, for the ex- istence of a backward simulation between the ontology hypergraphs of T1 and T2 . In this paper we present backward simulations only. Checking for left-hand side witnesses and for witnesses in roleWtnΣ (T1 , T2 ) using ontology hypergraphs can be done similarly to [4]. We start with defining ontology hypergraphs and we subsequently introduce the notion of a backward simulation between such hypergraphs. 3.1 Ontology Hypergraphs We introduce a hypergraph representation of ELHr -terminologies. A directed hypergraph is a tuple G = (V, E), where V is a non-empty set of nodes (or vertices), and E is a set of directed hyperedges of the form e = (S, S 0 ), where S, S 0 ⊆ V. A node v ∈ V is reachable in G from a set of nodes V 0 ⊆ V (written V 0 ≥G v) if v ∈ V 0 , or there is a hyperedge e = (S, S 0 ) ∈ E such that v ∈ S 0 and every node in S is reachable from V 0 . We now show how to represent terminologies as hypergraphs. Definition 2 (Ontology Hypergraph). Let T be a normalised ELHr -termin- ology and let Σ be a signature. The ontology hypergraph GTΣ of T for Σ is a directed hypergraph GTΣ = (V, E) defined as follows: V ={ xA | A ∈ NC ∩ (Σ ∪ sig(T )) } ∪ { xr , xdom(r) , xran(r) | r ∈ NR ∩ (Σ ∪ sig(T )) }, and E ={({x` }, {xBi }) | ` ./ B1 u . . . u Bn ∈ T , ./ ∈ {v, ≡}, ` ∈ NC ∪ { dom(s), ran(s) | s ∈ NR }} ∪ {({xA }, {xdom(r) }) | A v dom(r) ∈ T or A ./ ∃r.B ∈ T , ./ ∈ {v, ≡}} ∪ {({xA }, {xr , xran(r)uB }), ({xran(r)uB }, {xB }), ({xran(r)uB }, {xran(r) }) | A ./ ∃r.B ∈ T , ./ ∈ {v, ≡}} ∪ { ({xr , xB }, {xA }) | A ≡ ∃r.B ∈ T } ∪ { ({xB1 , . . . , xBn }, {xA }) | A ≡ B1 u . . . u Bn ∈ T } ∪ { ({xdom(r) }, {xr , xran(r) }) | r ∈ NR ∩ (Σ ∪ sig(T ) } ∪ {({xr }, {xs }), ({xdom(r) }, {xdom(s) }), ({xran(r) }, {xran(s) }) | r v s ∈ T } The ontology hypergraph GTΣ = (V, E) contains a node x` for every signature symbol ` in Σ and T .2 Additionally, we represent concepts of the form dom(r) 2 Note that, differently to [3], the graph GTΣ does not contain a node representing >. and ran(r) as nodes in the graph. We include a hyperedge ({xdom(r) }, {xr , xran(r) }) for every role name r in Σ and T , which corresponds to the tautology dom(r) v ∃r.ran(r).3 Recall that dom(r) equals ∃r.>. The other hyperedges in GTΣ rep- resent the axioms in T . Every hyperedge is directed and can be understood as an implication, i.e., ({x`1 }, {x`2 }) stands for T |= `1 v `2 . The complex hyperedges are of the form ({xA }, {xr , xB }) and ({xr , xB }, {xA }) representing T |= A v ∃r.B and T |= ∃r.B v A, and of the form ({xB1 , ..., xBn }, {xA }) representing T |= B1 u . . . u Bn v A. Example 1. Let T = {A ≡ ∃r.X, X ≡ Y u Z, B v Z, ran(r) v Y }, and Σ = {A, B, r}. The ontology hypergraph GTΣ of T for Σ can be depicted as follows. xran(t) xZ xX xt xB xY xA xran(r)uX xdom(t) xr xran(r) xdom(r) In the following we introduce a relation →T on nodes xl , xl0 of an ontology hypergraph such that xl →T xl0 holds iff T |= l v l0 . Definition 3. Let GTΣ = (V, E) be the ontology hypergraph of a normalised ELHr -terminology T for a signature Σ. We define →T ⊆ V × V to be the smallest reflexive and transitive relation closed under the following conditions: – x →T x0 if ({x}, {x0 }) ∈ E; – x →T x0 if ({x}, {xr , xA }) ∈ E, ({xs , xB }, {x0 }) ∈ E, xr →T xs , and xA →T xB ; – x →T x0 if ({xB1 , . . . , xBm }, {x0 }) ∈ E and x →T xBi for every 1 ≤ i ≤ m. It can be readily seen that the relation →T can be computed in polynomial time in the size of T . Note that →T does not coincide with the usual reachability notion in a hypergraph. 3.2 Backward Simulation We introduce backward simulations between ontology hypergraphs whose exis- tence coincides with the absence of right-hand witnesses. The simulations are defined such that a node xA in GTΣ1 is simulated by a node xA0 in GTΣ2 iff A0 is entailed in T2 by exactly the same ELranΣ -concepts that entail A in T1 . To define backward simulations we need to take all the axioms into account that cause Σ-concepts to entail a concept name. Axioms of the forms A ≡ ∃r.B, A ≡ B1 u . . . u Bn , and ran(r) v A require special treatment, while it is more straightforward to deal with the other types of axioms. For the former, con- sider T1 = {A ≡ ∃r.X}, T2 = {A v ∃r.>}, and Σ = {A, r}. It holds that 3 These hyperedges are relevant for the forward simulation only. Diff Σ (T1 , T2 ) = ∅. Observe that there does not exist an ELran Σ -concept that en- tails A in T1 as the concept name X is not entailed by any Σ-concept in T . Thus, the node xA in GTΣ1 should be simulated by the node xA in GTΣ2 . To handle such cases, we want to characterise the entailment by an ELranΣ -concept in terms of (a special) reachability notion in ontology hypergraphs. For a signature Σ, let Σ Ran = Σ ∪ Σ dom ∪ Σ ran , where Σ dom = { dom(t) | t ∈ NR ∩ Σ } and Σ ran = { ran(t) | t ∈ NR ∩ Σ } are the sets consisting of concepts of the form dom(t) and ran(t) for every role name t in Σ, respectively. Definition 4 (Σ Ran -Reachability). Let GTΣ = (V, E) be the ontology hyper- graph of a normalised ELHr -terminology T for a signature Σ. We set VRan Σ = Ran Σ { x ∈ V | xσ →T x for some σ ∈ Σ } to be the set of nodes in GT that are reachable via →T from a node labelled with elements from Σ Ran . We say that a node v ∈ V is Σ Ran -reachable in GTΣ iff VRan Σ ≥GTΣ v. It can readily be seen that all Σ Ran -reachable nodes can be identified in polynomial time w.r.t. the size of T . Example 2. Let T = {A ≡ ∃r.X, X ≡ Y u Z, B v Z, ran(r) v Y } (cf. Ex. 1) and let Σ = {B, r}. Then all the nodes are Σ Ran -reachable in GTΣ and we have that T |= ∃r.B v A. The following example demonstrates that the relation →T is not sufficient to characterise entailment by ELran Σ -concepts in every case. Example 3. Let T = {A ≡ ∃r.X, X ≡ ∃r.B} and let Σ = {A, B, r}. The nodes xA and xB are Σ Ran -reachable in GTΣ , T |= ∃r.∃r.B v A, but xB 6→T xA . We now state the properties of Σ Ran -reachable nodes that we obtain. Lemma 1. Let T be a normalised ELHr -terminology and let Σ be a signature. Then the following statements hold: (i) xA ∈ V is Σ Ran -reachable in GTΣ iff there is an ELran Σ -concept D such that T |= D v A; (ii) xs ∈ V is Σ Ran -reachable in GTΣ iff there is s0 ∈ NR ∩Σ such that T |= s0 v s. For axioms of the form A ≡ B1 u . . . u Bn , we introduce the following no- tion which associates with every node xA in a hypergraph GT a set of concept names non-conj(xA ) that are essential to entail A in T (cf. [4]). Definition 5 (Non-Conjunctive). Let GTΣ = (V, E) be the ontology hyper- graph of a normalised ELHr -terminology T for a signature Σ. For xA ∈ V, let non-conj(xA ) be defined as: – if ({xB1 , . . . , xBm }, {xA }) ∈ E with m ≥ 2 (i.e. A ≡ B1 u . . . u Bm ∈ T ), we define non-conjT (xA ) = {xB1 , . . . , xBm }; – otherwise, let non-conjT (xA ) = {xA }. Note that for any concept name A in a normalised ELHr -terminology T the concept names B1 , . . . , Bm with non-conjT (xA ) = {xB1 , . . . , xBm } are non- conjunctive in T . We need to take special care of axioms of the form ran(r) v X as they might cause non-obvious entailments. Let T = {X ≡ B1 u B2 , A ≡ ∃r.X} and Σ = {A, B1 , B2 , r}. Then the Σ-concept ∃r.(B1 uB2 ) entails A in T . If we add the axiom ran(r) v B1 to T , then already the Σ-concept ∃r.B2 (of smaller signature) is sufficient to entail A in T . Intuitively, the conjunct B1 of X is already covered by ran(r) in the presence of the axiom ran(r) v B1 (as T |= ran(r) u B2 v X). To define backward simulations for axioms of the form A ≡ ∃r.X, all axioms of the form ran(r) v Y need to be taken into account. We will therefore define the notion of backward simulation using an additional parameter ζ ∈ C Σ = {} ∪ (NR ∩ Σ), called the role context. Such a parameter ζ stands for an expression of the form ran(ζ) in which a node x ∈ V1 should be simulated by a node x0 ∈ V2 . We treat as a special role name and set ran() = >. Additionally, we say that a node y ∈ V in an ontology hypergraph GTΣ = (V, E) is relevant for a node x in T w.r.t. a set of node labels L used in GTΣ if y ∈ non-conjT (x) and x` 6→T y for every ` ∈ L. We now give the definition of backward simulations as subsets of V1 ×V2 ×C Σ . Definition 6 (Backward Simulation). Let GTΣ1 = (V1 , E1 ), GTΣ2 = (V2 , E2 ) be the ontology hypergraphs of normalised ELHr -terminologies T1 and T2 for a signature Σ. A relation ←- ⊆ V1 × V2 × C Σ is a backward Σ-simulation if the following conditions are satisfied: (ib ) if (x, x0 , ζ) ∈ ←-, then for every σ ∈ Σ Ran : xσ →T1 x implies xσ →T2 y 0 for every y 0 ∈ V2 relevant for x0 in T2 w.r.t. {ran(ζ)}; (iib ) if (x, x0 , ζ) ∈ ←-, ({xr , y}, {x}) ∈ E1 , and y is Σ Ran -reachable in GTΣ1 , then for every s ∈ Σ such that xs →T1 xr , and for every y 0 ∈ V2 relevant for x0 in T2 w.r.t. {ran(ζ), dom(s)}, there exists ({xr0 , z 0 }, {y 0 }) ∈ E2 with xs →T2 xr0 and (y, z 0 , s) ∈ ←-; (iiib ) if (x, x0 , ζ) ∈ ←- and ({xX1 , . . . , xXn }, {x}) ∈ E1 , then for every y 0 ∈ V2 relevant for x0 w.r.t. {ran(ζ)} there exists y ∈ V1 relevant for x in T1 w.r.t. {ran(ζ)} with (y, y 0 , ) ∈ ←-. We write GTΣ1 ←- GTΣ2 iff there exists a backward Σ-simulation ←- ⊆ V1 × V2 × C Σ such that (xA , xA , ) ∈ ←- for every A ∈ NC ∩ Σ. Members of a backward simulation ←- are called simulation triples. For a node x in GTΣ1 to be backward simulated by x0 in GTΣ2 , Condition (ib ) enforces that appropriate Σ-concept names B or concepts of the form ran(s), dom(s) with s ∈ Σ that entail x in T1 must also entail x0 in T2 . Condition (iib ) applies to nodes xA ∈ GTΣ1 for which there exists an axiom A ≡ ∃r.X in T1 and propagates the simulation to the successor node xX by taking into account pos- sible entailments regarding domain or range restrictions in T2 . Condition (iiib ) handles axioms of the form A ≡ B1 u. . .uBn in T1 . We have to match every con- junct y 0 that is relevant for x0 in T2 with some conjunct y relevant for x in T1 (pos- sibly leaving some conjuncts y unmatched) since, intuitively speaking, some con- juncts in the definition of A in T1 can be ignored to preserve logical entailment. For instance, let T1 = {A ≡ B1 u B2 }, T2 = {B1 v A} and Σ = {A, B1 , B2 }. Then rhsWtnΣ (T1 , T2 ) = ∅ and, in particular, T2 |= B1 u B2 v A holds as well. Note that the simulation between conjuncts is propagated in the context only as all the conjuncts that are entailed by ran(ζ) have been filtered out already. Example 4. Let T1 = {A ≡ ∃r.X, X ≡ Y u Z, B v Z, ran(r) v Y } (cf. Ex. 1), T2 = {A ≡ X u Y, X ≡ ∃r.B, dom(s) v Y, r v s}, and Σ = {A, B, r}. It can be readily seen that the nodes xB , xY , xZ , and xX are Σ Ran -reachable in GTΣ1 . As only xB →T1 xB , we have that the node xB in GTΣ1 can be simulated by the node xB in GTΣ2 in the contexts and r. Similarly, as only xB →T1 xZ , the node xZ in GTΣ1 can be simulated by the node xB in GTΣ2 in the contexts and r. Hence, as non-conjT2 (xB ) = {xB } and as xZ is relevant for xX in T1 w.r.t. {ran(r)}, we have that xX in GTΣ1 can be simulated by xB in GTΣ2 in the context r. Finally, as non-conjT2 (xA ) = {xX , xY } and as only xX is relevant for xA in GTΣ2 (due to xdom(r) →T2 xY ), we can conclude that the node xA in GTΣ1 can be simulated by xA in GTΣ2 in the contexts and r. Overall, S = { (xA , xA , ζ) | ζ ∈ {, r} } ∪ { (xB , xB , ζ) | ζ ∈ {, r} } ∪ { (xZ , xB , ζ) | ζ ∈ {, r} } ∪ {(xX , xB , r)} is a backward Σ-simulation between GTΣ1 and GTΣ2 such that (xA , xA , ) ∈ S. Example 5. Let T1 , T2 , and Σ be defined as in Ex. 4. Now let T10 = T1 ∪{ran(t) v Z} and Σ 0 = Σ ∪ {t}. We observe that xran(t) →T2 x0 does not hold for any 0 0 0 node x0 ∈ GTΣ2 , i.e., the node xZ in GTΣ0 cannot be simulated by any node in GTΣ2 1 0 (in any context) as Condition (ib ) cannot be fulfilled. Hence, the node xX in GTΣ0 1 Σ0 cannot be simulated by xB in G T10 in the context r as Condition (iiib ) is vio- 0 lated. Thus, there cannot exist a backward Σ-simulation such that xA in GTΣ0 is 1 0 simulated by xA in GTΣ2 in the context as Condition (iib ) cannot be fulfilled. We can now show that the existence of a backward simulation coincides with the absence of right-hand witnesses and that one can check in polynomial time whether backward simulations exist. Theorem 1. Let T1 , T2 be normalised ELHr -terminologies, and let Σ be a sig- nature. Then it holds that rhsWtnΣ (T1 , T2 ) = ∅ iff GTΣ1 ←- GTΣ2 . Theorem 2. Let T1 , T2 be normalised ELHr -terminologies and let Σ be signa- ture. Then it can be checked in polynomial time whether GTΣ1 ←- GTΣ2 holds. So far we focused on finding concept names A contained in rhsWtnΣ (T1 , T2 ), which, together with the sets roleWtnΣ (T1 , T2 ) and lhsWtnΣ (T1 , T2 ), is sufficient to decide the existence of a logical difference between T1 and T2 . However, in practical applications users may require concrete concept inclusions D v A T1 T2 Time (s) - CEX 2.5 Time (s) - Prototype with ex. with ex. SM09a SM09b 340.40 580.44 211.08 214.20 SM09b SM10a 495.22 639.46 295.24 302.53 SM09b SM09a 477.34 599.07 324.56 328.81 SM10a SM09b 444.71 608.44 229.15 235.61 Table 1. Experimental Results Obtained for Snomed CT (or A v E) in Diff Σ (T1 , T2 ) that correspond to a witness A. We note that such example concept inclusions (and also example conjunctive queries) can be constructed recursively from triples for which the simulation conditions failed. 4 Experiments To investigate the practical applicability of our simulation-based approach for detecting right-hand witnesses, we implemented a prototype tool in OCaml that is based on the CEX 2.5 tool [5]. We then conducted a brief experimental eval- uation involving large fragments of three versions of Snomed CT (the first and second international release from 2009 as well as the first international release from 2010) and 119 versions of NCI4 which appeared between October 2003 and January 2014. The considered fragments of Snomed CT each contain about 280 000 concepts names and 62 role names. The aim of our experiments was to compare the performance of our prototype implementation against the CEX 2.5 tool, which can detect logical differences between acyclic terminologies only. We instructed both tools to compute the set WtnΣ (T1 , T2 ) for various versions T1 and T2 of Snomed CT and NCI. All the experiments were conducted on PCs equipped with an Intel Core i5-2500 CPU running at 3.30GHz, and all the com- putation times we report on are the average of three executions. In our experiments involving Snomed CT we used signatures composed of the intersection of the concept names in the two versions that were compared, together with the same 31 role names (including “RoleGroup”) that were cho- sen at random initially (and which occur in every version). The results that we obtained are shown in Table 1. The first two columns indicate which versions were used as ontologies T1 and T2 . The next two columns then show the com- putation times (CPU time) required by CEX 2.5, with column four depicting the computation times if additionally examples illustrating the witnesses were computed. The last two columns then indicate the computation times of our prototype tool. The times required when additionally examples were computed are shown in the last column. One can see that in all the cases our prototype tool required less time to compute difference witnesses (also together with example inclusions) than CEX 2.5. 4 More precisely, we first extracted the ELHr -fragment of the NCI versions by remov- ing up to 8% of the axioms which were not expressed in this fragment. 1000 Prototype CEX2.5 100 CPU Time (s) 10 1 Fig. 1. Experimental Results Obtained for NCI For each considered version α of NCI, we computed conjunctive query wit- nesses for T1 = NCIα and T2 = NCIα+1 on signatures Σ = sig(NCIα ) ∩ sig(NCIα+1 ), where α + 1 denotes the successor version of α, together with corresponding examples. The results that we obtained are depicted in Fig. 1. The computations are sorted chronologically along the x-axis according to the publication date of version NCIα . Each pair of bars represents the computation times required by our prototype tool and by CEX 2.5, respectively, for one com- parison. In the cases where only one bar is shown, the ontology T1 = NCIα was cyclic and CEX 2.5 could not be used. Generally speaking, both tools required longer computation times on more recent NCI versions than on older releases, which could be explained by the fact that the size of NCI versions increased with every new release. In the comparisons before version 10.03h our prototype tool could typically compute the witnesses and example inclusions faster than CEX 2.5. However, on later versions our new tool then required slightly longer computation times. One can also see that overall it took the longest time to compute witnesses for cyclic versions of NCI. Finally, we note that in our experiments all the computations required at most 2.85 GiB of main memory. 5 Conclusion We presented a unifying approach to solving the logical difference problem for possibly cyclic ELHr -terminologies. We showed that the existence of backward simulations in hypergraph representations of terminologies corresponds to the absence of right-hand witnesses (an analogous correspondence exists between forward simulations and left-hand witnesses). We also demonstrated the ap- plicability of the hypergraph approach using a prototype implementation. The experiments showed that in most cases our prototype tool outperformed the previous tool, CEX 2.5, for computing the logical difference. Moreover, our pro- totype tool could be successfully applied on fairly large cyclic terminologies, whereas previous approaches only worked for acyclic (or rather small cyclic) terminologies. We plan to further improve our prototype implementation. Moreover, exten- sions of our techniques to DL-Lite, general ELHr -TBoxes, or even Horn-SHIQ ontologies could be investigated. References 1. Baader, F., Brandt, S., Lutz, C.: Pushing the EL envelope further. In: Proceedings of OWLED’08 (2008) 2. Baader, F., Calvanese, D., McGuinness, D.L., Nardi, D., Patel-Schneider, P.F. (eds.): The description logic handbook: theory, implementation, and applications. Cambridge University Press (2007) 3. Ecke, A., Ludwig, M., Walther, D.: The concept difference for EL-terminologies using hypergraphs. In: Proceedings of DChanges’13. CEUR-WS.org (2013) 4. Konev, B., Ludwig, M., Walther, D., Wolter, F.: The logical difference for the lightweight description logic EL. JAIR 44, 633–708 (2012) 5. Konev, B., Ludwig, M., Wolter, F.: Logical difference computation with CEX2.5. In: Proceedings of IJCAR’12. pp. 371–377. Springer (2012) 6. Konev, B., Walther, D., Wolter, F.: The logical difference problem for description logic terminologies. In: Proceedings of IJCAR’08. pp. 259–274. Springer (2008) 7. Kontchakov, R., Wolter, F., Zakharyaschev, M.: Logic-based ontology compari- son and module extraction, with an application to DL-Lite. Artificial Intelligence 174(15), 1093–1141 (Oct 2010) 8. Lembo, D., Santarelli, V., Savo, D.F.: Graph-based ontology classification in OWL 2 QL. In: Proceedings of ESWC 2013. LNCS, vol. 7882, pp. 320–334. Springer (2013) 9. Nortje, R., Britz, A., Meyer, T.: Module-theoretic properties of reachability mod- ules for SRIQ. In: Proceedings of DL’13. pp. 868–884. CEUR-WS.org (2013) 10. Suntisrivaraporn, B.: Polytime reasoning support for design and maintenance of large-scale biomedical ontologies. Ph.D. thesis, TU Dresden, Germany (2009)