Module-theoretic properties of reachability modules for SRIQ Riku Nortje, Katarina Britz, and Thomas Meyer Center for Artificial Intelligence Research, University of KwaZulu-Natal and CSIR Meraka Institute, South Africa Email: {rnortje;abritz;tmeyer}@csir.co.za Abstract. In this paper we investigate the module-theoretic properties of ⊥− and >-reachability modules in terms of inseparability relations for the DL SRIQ. We show that, although these modules are not deplet- ing or self-contained, they share the robustness properties of syntactic locality modules and preserve all justifications for an entailment. 1 Introduction Modularization plays an important part in the design and maintenance of large scale ontologies. Modules are loosely defined as subsets of ontologies that cover some topic of interest, where the topic of interest is defined by a set of symbols. Extracting minimal modules is computationally expensive and even undecidable for expressive DLs [3, 4]. Therefore, the use of approximation techniques and heuristics plays an important role in the efficient design of algorithms. Syntactic locality [3, 4], because of its excellent model theoretic properties, has become an ideal heuristic and is widely used in a diverse set of algorithms [14, 2, 5]. Suntisrivaraporn [14] showed that, for the DL EL+ , ⊥-locality module ex- traction is equivalent to the reachability problem in directed hypergraphs. Nortjé et al. [10, 11] extended the reachability problem to include >-locality and intro- duced bidirectional reachability modules as a subset of ⊥>∗ -locality modules. This work was further extended to the DL SROIQ Nortje et al. [12] who showed that extracting ⊥>∗ -reachability modules is equivalent to extracting frontier graphs in hypergraphs. Reachability modules are not only of importance in hypergraph-based reasoning support for TBoxes [12], but are potentially smaller than syntactic locality modules. In this paper we investigate the module-theoretic properties of reachability modules for the DL SRIQ. We show that these modules are not self-contained or depleting but they are robust under vocabulary restrictions, vocabulary ex- tensions, replacement and joins. By showing that reachability modules preserve all justifications for entailments, we show that depleting modules are sufficient for preserving all justifications but not necessary. In Section 2 we give a brief introduction to the DL SRIQ and modulariza- tion as defined by inseparability relations. Section 3 introduces a normal form for SRIQ TBoxes as well as the rules necessary to transform any such TBox to normal form. In Section 4 we introduce both ⊥- and > reachability modules and investigate all their module theoretic properties in terms of inseperability rela- tions. All proofs of the work presented appears in the accompanying appendix. Lastly in Section 5 we conclude this paper with a short summary of the results. 2 Background In this section we give a brief introduction to modularization and the DL SRIQ [7] with its syntax and semantics listed in Table 2. NC and NR denote disjoint sets of atomic concept names and role names. The set NR includes the universal role whilst NC excludes the > and ⊥ concepts. For a complete definition of SRIQ, refer to Horrocks et al. [7], and for Description Logics refer to [1]. Constructs Syntax Semantics atomic concept C C I ∈ ∆I , C ∈ NC role R RI ⊆ ∆I × ∆I , R ∈ NR inverse role R− R −I = {(y, x) | (x, y) ∈ RI }, R ∈ NR universal role U U I = ∆I × ∆I role composition R1 ◦ . . . ◦ Rn {(x, z) | (x, y1 ) ∈ R1I ∧ (y1 , y2 ) ∈ R2I ∧ . . . I ∧(yn , z) ∈ Rn , n ≥ 2, Ri ∈ NR } top > >I = ∆I bottom ⊥ ⊥I = ∅ negation ¬C (¬C)I = ∆I \ C I conjunction C1 u C2 (C1 u C2 )I = C1I ∩ C2I disjunction C1 t C2 (C1 t C2 )I = C1I ∪ C2I exist restriction ∃R.C {x | (∃y)[(x, y) ∈ RI ∧ y ∈ C I ]} value restriction ∀R.C {x | (∀y)[(x, y) ∈ RI → y ∈ C I ]} self restriction ∃R.Self {x | (x, x) ∈ RI } atmost restriction 6 nR.C {x | #{y | (x, y) ∈ RI ∧ y ∈ C I } 6 n} atleast restriction > nR.C {x | #{y | (x, y) ∈ RI ∧ y ∈ C I } > n} Axiom Syntax Semantics concept inclusion C1 v C2 C1I ⊆ C2I role inclusion R1 ◦ . . . ◦ Rn v Rn+1 (R1 ◦ . . . ◦ Rn )I ⊆ RI , n ≥ 1 reflexivity Ref (R) {(x, x) | x ∈ ∆I } ⊆ RI irreflexivity Irr(R) {(x, x) | x ∈ ∆I } ∩ RI = ∅ disjointness Dis(R, S) S I ∩ RI = ∅ Table 1. Syntax and semantics of SRIQ Module extraction is the process of extracting subsets of axioms from TBoxes that are self contained with respect to some criteria. These sets of axioms, called modules, may be used for various purposes such as reuse, optimization and error pinpointing amongst others [4, 14]. Definition 1. (Module for the arbitrary DL L) Let L be an arbitrary de- scription language, O an L ontology, and σ a statement formulated in L. Then, 2 O0 ⊆ O is a module for σ in O(a σ-module in O) whenever: O |= σ if and only if O0 |= σ. Definition 1 is sufficiently general so that any subset of an ontology preserving a statement of interest is considered a module, the entire ontology is therefore a module in itself. Different use cases usually result in different notions of what the definition and characteristics of a module should be. Modules are often defined via the notion of conservative extensions. Given some signature (a set of concept and role names) and a set of axioms, a conservative extension of this set is simply one that implies all the same consequences over the signature. More formally: Definition 2. (Conservative extension [4]) Let T and T1 be two TBoxes such that T1 ⊆ T , and let Σ be a signature. Then – T is a Σ-conservative extension of T1 if, for every α with Sig(α) ⊆ Σ, we have T |= α iff T1 |= α. – T is a conservative extension of T1 if T is a Σ-conservative extension of T1 for Σ = Sig(T1 ). Given that both sets of axioms imply the same consequences for a given signature we may then use the smaller set whenever we wish to reason over this signature. A closely related notion to conservative extensions is that of inseparability. Definition 3. [13] T1 and T2 are Σ-concept name inseparable, written T1 ≡cΣ T2 , if for all Σ- concept names C, D, it holds that T1 |= C v D if and only if T2 |= C v D. Definition 4. [13] T1 and T2 are Σ-subsumption inseparable, written T1 ≡sΣ T2 , if for all terms X, Y that are concepts or roles over Σ, it holds that T1 |= X v Y if and only if T2 |= X v Y . Definition 5. [13] Let T be a TBox, M ⊆ T , S an inseparability relation and Σ a signature. We call M – an SΣ -module of T if M ≡SΣ T . – a self-contained SΣ -module of T if M ≡SΣ∪Sig(M) T . – a depleting SΣ -module of T if ∅ ≡SΣ∪Sig(M) T \ M. Modules may therefore be characterized by some inseparability criteria. It is of interest how modules defined this way would behave under different use case scenarios. For this purpose, several properties of inseparability relations [8] have been investigated in the literature, which allows us to compare different definitions of modules. Given a TBox T and a module M ⊆ T for a signature Σ, we are interested in the following inseparability properties: 3 – Robustness under vocabulary restrictions implies that when we wish to re- strict the symbols from Σ further we do not need to import a different module and may continue to use M. – Robustness under vocabulary extension implies that should we wish to add new symbols to Σ that do not appear in T we do not need to use a different module but may use M. – Robustness under replacement ensures that the result of importing M into a TBox T1 is a module of the result of importing T into T1 . This is also called module coverage and refers to the fact that importing a module does not affect its property of being a module. – Robusness under joins implies that if T and T1 are inseparable w.r.t. Σ and all the terms they share are from Σ, then each of them are inseparable with their union w.r.t. Σ. More formally: Definition 6. [8] The inseparability relation S is called – robust under vocabulary restrictions if, for all TBoxes T1 , T2 and all signa- tures Σ, Σ 0 with Σ ⊆ Σ 0 , the following holds: if T1 ≡SΣ 0 T2 , then T1 ≡SΣ T2 . – robust under vocabulary extensions if, for all TBoxes T1 , T2 and all signatures Σ, Σ 0 with Σ 0 ∩ (Sig(T1 ) ∪ Sig(T2 )) ⊆ Σ, the following holds: if T1 ≡SΣ T2 , then T1 ≡SΣ 0 T2 . – robust under replacement if, for all TBoxes T1 , T2 and all signatures Σ and every TBox T with Sig(T ) ∩ (Sig(T1 ) ∪ Sig(T2 )) ⊆ Σ, the following holds: if T1 ≡SΣ T2 then T1 ∪ T ≡SΣ T2 ∪ T . – robust under joins if, for all TBoxes T1 , T2 and all signatures Σ with Sig(T )∩ Sig(T2 ) ⊆ Σ, if T1 ≡SΣ T2 then Ti ≡SΣ T1 ∪ T2 , for i = 1, 2. 3 Normal Form In this section we introduce a normal form for SRIQ TBoxes. We utilize nor- malization in order to simplify the definitions, to ease the understanding of the work that follows, as well as to simplify the presentation of proofs. Definition 7. Given Bi ∈ (NC ∪ {>}), Ci ∈ (NC ∪ {⊥}), D ∈ {∃R.B, ≥ nR.B, ∃R.Self }, with R, S, Ri , Si role names from NR or their inverses and n > 1, a SRIQ TBox T is in normal form if every axiom α ∈ T is in one of the following forms: α1 : B1 u . . . u Bn v C1 t . . . t Cm α2 : D v C1 t . . . t Cm α3 : B1 u . . . u Bn v D α4 : R1 ◦ . . . ◦ Rn v Rn+1 α5 : R1 v R2 α6 : D1 v D2 α7 : Dis(R1 , R2 ) 4 In order to normalize a SRIQ TBox T we repeatedly apply the normalization rules from Table 2. Each application of a rule rewrites an axiom into its equivalent normal form. It is easy to see that the application of every rule ensures that the normalized TBox is a conservative extension of the original. We note that the SRIQ axiom Ref (R) is represented by its equivalent > v ∃R.Self and Irr(R) by ∃R.Self v ⊥ [1]. Table 2. SRIQ normalization rules NR1 B̂ u ¬Ĉ2 v Ĉ1 B̂ v Ĉ1 t Ĉ2 NR2 B̂1 v Ĉ t ¬B̂2 B̂1 u B̂2 v Ĉ NR3 B̂ u D̂ v Ĉ B̂ u A v Ĉ, D̂ v A, A v D̂ NR4 B̂ v Ĉ t D̂ B̂ v Ĉ t A, D̂ v A, A v D̂ NR5 B̂ v Ĉ1 u Ĉ2 B̂ v Ĉ1 , B̂ v Ĉ2 NR6 B̂1 t B̂2 v Ĉ B̂1 v Ĉ, B̂2 v Ĉ NR7 . . . ∀R.Ĉ . . . . . . ¬∃R.A . . ., A u Ĉ v ⊥, > v A t Ĉ NR8 . . . ∃R.D̂ . . . . . . ∃R.A . . ., D̂ v A, A v D̂ NR9 . . . > nR.D̂ . . . . . . > nR.A . . ., D̂ v A, A v D̂ NR10 . . . 6 nR.Ĉ . . . . . . ¬(> (n + 1)R.Ĉ) . . . NR11 B̂ ≡ Ĉ B̂ v Ĉ,Ĉ v B̂ NR12 > 0R.B v Ĉ > v Ĉ NR13 B̂ v ∃R.⊥ B̂ v ⊥ NR14 B̂ v> nR.⊥ B̂ v ⊥ NR15 B̂ v> 0R.B NR16 > nR.⊥ v Ĉ NR17 ∃R.⊥ v Ĉ NR18 B̂ u ⊥ v Ĉ NR19 ⊥ v Ĉ NR20 B̂ v Ĉ t > NR21 B̂ v > Above A is a new concept name not in NC , B̂i and Ĉi are possibly complex concept descriptions and D̂ a complex concept description. R ∈ NR or it’s inverse, n > 0 Theorem 1. Exhaustively applying the rules from Table 2 to any SRIQ TBox T results in a SRIQ TBox T 0 in normal form. The normalization process can be completed in linear time in the number of axioms. Example 1. Let α1 = B v ¬C, and α2 = ¬A v B. Then, α1 may be normalized by application of rule NR2 to α1N = B u C v ⊥ since ¬C = ¬C t ⊥. α2 may be normalized by application of rule NR1 to α2N = > v B ∪ A since ¬A = ¬A u >. For the rest of this paper we assume that all TBoxes are in normal form. 5 4 Reachability Modules Deciding conservative extensions has been shown to be computationally expen- sive or even undecidable for relatively inexpressive DLs. Therefore, an approxi- mation of these modules, based on syntax, called syntactic locality modules [4] has been introduced. Given a normalized TBox T , the definition of syntactic locality can be simplified to the following: Definition 8. (Normalized Syntactic Locality) Let Σ be a signature and T a normalized SRIQ TBox. An axiom α is ⊥-local w.r.t. Σ (>-local w.r.t Σ) if α ∈ Ax(Σ)⊥ (α ∈ Ax(Σ)> ), as defined in the grammar: ⊥-Locality Ax(Σ)⊥ ::= C ⊥ v C | w⊥ v R | Dis(S ⊥ , S) | Dis(S, S ⊥ ) Con⊥ (Σ) ::= A⊥ | C ⊥ u C | C u C ⊥ | ∃R⊥ .C | ∃R.C ⊥ | ∃R⊥ .Self | > nR⊥ .C |> nR.C ⊥ >-Locality Ax(Σ)> ::= C v C > | w v R> Con> (Σ) ::= A> | C > t C | C t C > | ∃R> .C > |> nR> .C > | ∃R> .Self In the grammar, we have that A⊥ , A> 6∈ Σ is an atomic concept, R⊥ (resp. S ⊥ ) is either an atomic role (resp. a simple atomic role) not in Σ or the inverse of an atomic role (resp. of a simple atomic role) not in Σ, C is any concept, R is any role, S is any simple role, and C ⊥ ∈ Con⊥ (Σ), C > ∈ Con> (Σ). We also denote by w⊥ a role chain w = R1 ◦ . . . ◦ Rn such that for some i with 1 ≤ i ≤ n, we have that Ri is (possibly inverse of ) an atomic role not in Σ. A TBox T is ⊥-local (>-local) w.r.t. Σ if α is ⊥-local (>-local) w.r.t. Σ for all α ∈ T . Algorithm 1 may be used to extract both the minimal ⊥ and >-locality based modules for a signature S. A variant of ⊥-syntactic locality modules called ⊥-reachability based modules [14] is based on the reachability problem in directed hypergraphs. Hypergraphs [9, 15] are a generalization of graphs and have been studied extensively since the 1970s as a powerful tool for modelling many problems in Discrete Mathematics. We extend the work done by Nortje et al.[11] and define reachability for SRIQ TBoxes. We then continue to show that these modules share all the robustness properties of locality modules and therefore is well suited to be used in the ontology reuse scenario. Definition 9. (⊥-Reachability) Let T be a SRIQ TBox in normal form and Σ ⊆ Sig(T ) a signature. The set of ⊥-reachable names in T w.r.t. Σ, denoted by ΣT←⊥ , is defined inductively as follows: – For every x ∈ (Σ ∪ {>}) we have x ∈ ΣT←⊥ . – For every inclusion axiom (αL v αR ) ∈ T , if Sig(αL ) ⊆ ΣT←⊥ then every y ∈ Sig(αR ) is also in ΣT←> . 6 Algorithm 1 (Module Extraction Algorithm for SRIQ [2]) Procedure extract module(O, S) Input: O: ontology; S: signature; Output: O1 : a module for S in O 1 : O1 := ∅, O2 := O 2 : while not empty(O2 ) do 3: α := select axiom(O2 ) 4: if local(α, S ∪ Sig(O1 )) then 5: O2 := O2 \ {α} 6: else 7: O1 := O1 ∪ {α} 8: O2 := O \ O1 9: end if 10 : end while 11 : return O1 Every axiom α := αL v αR such that Sig(αL ) ⊆ ΣT←⊥ we call ΣT←⊥ -reachable. Axioms of the form Dis(R, S) ∈ T are ΣT←⊥ -reachable whenever {R, S} ⊆ ΣT←⊥ . The set of all ΣT←⊥ -reachable axioms is denoted by TΣ←⊥ and is called the ⊥- reachability module for T over Σ. It is self-evident from Definition 8 that an axiom is ⊥-reachable w.r.t Σ exactly when it is not ⊥-local w.r.t. Σ. Similarly we define an axiom to be >-reachable exactly when it is not >-local. Definition 10. (>-Reachability) Let T be a SRIQ TBox in normal form and Σ ⊆ Sig(T ) a signature. The set of >-reachable names in T w.r.t. Σ, denoted by ΣT←> , is defined inductively as follows: – For every x ∈ (Σ ∪ ⊥) we have that x ∈ ΣT←> . – For all inclusion axioms (αL v αR ) ∈ T , if • αR = ⊥, or • αR is of the form A1 t . . . t An and all Ai ∈ ΣT←> , or • αR has any other form and there exists some x ∈ Sig(αR ) ∩ ΣT←> then every y ∈ Sig(αL ) is also in ΣT←> . Every axiom α := αL v αR such that, αR = ⊥, or αR is of the form A1 t . . . t An and all Ai ∈ ΣT←> , or αR has any other form and there exists some x ∈ Sig(αR ) ∩ ΣT←> , we call ΣT←> -reachable. All axioms of the form Dis(R, S) ∈ T are always ΣT←> -reachable and {R, S} ⊆ ΣT←> . The set of all ΣT←> -reachable axioms is denoted by TΣ←> and is called the >-reachability module for T over Σ. It is easy to show that ⊥-reachability modules are equivalent to ⊥-locality modules. However, by the definition of >-reachability we observe that these are not equivalent to >-locality modules. 7 Example 2. Let T be a TBox such that T = {α1 , α2 , α3 , α4 }, with α1 := A v ∃r.D1 , α2 := B v≥ nr.D2 , α3 := ∃r.> v C, α4 := D1 v D2 and let Σ = {C}. Then TΣ←> = {α1 , α2 , α3 } but the >-locality module for T w.r.t. Σ is {α1 , α2 , α3 , α4 }. The difference stems from the fact that in α1 and α2 the >-reachability of r does not ensure the >-reachability of D1 and D2 respectively. This occurs because, given an axiom α = αL v αR , >-locality ensure that if α is >-local then so are all of the symbols in Sig(α), whereas >-reachability is defined such that the >-reachability of α only guarantees that all symbols of αL and only some symbols of αR will be >-reachable. Thus >-reachability based modules are at most the size of >-locality modules but in general could be substantially smaller. Similar to ⊥>∗ -locality modules we note that reachability module extraction may also be alternated until a fixpoint is reached. These modules are denoted ∗ by TΣ←⊥> . In order to investigate the module-theoretic properties of reachability mod- ules, we follow a similar approach to Sattler et al. [13] and define inseparability different from that of conservative extensions. We say that T1 and T2 are insep- arable if their modules are equivalent, that is, a module extraction algorithm returns the same output for each of them. We define the following inseparability relations for reachability modules: Definition 11. Let T1 and T2 be TBoxes and Σ a signature. Then T1 and T2 are: ←> – Σ − > reachability inseparable, denoted by T1 ≡> Σ T2 , if T1 Σ = T2 ←>Σ ; – Σ − ⊥ reachability inseparable, denoted by T1 ≡Σ T2 , if T1 Σ = T2 ←⊥ ⊥ ←⊥ Σ ; ∗ ←⊥>∗ – Σ − ⊥>∗ reachability inseparable, denoted by T1 ≡⊥> Σ T 2 , if T 1 Σ = ∗ T2 ←⊥> Σ . Firstly we show that >-reachability modules are subsumption inseparable. Concept inseparability follows as a special case of subsumption inseparability. Lemma 1. Let T be a SRIQ TBox, and Σ ⊆ Sig(T ) a signature. Then T |= C v D if and only if TΣ←> |= C v D for arbitrary SRIQ concept descriptions C and D such that Sig(C) ∪ Sig(D) ⊆ Σ. Corollary 1. Let T be a normalized SRIQ TBox, Σ ⊆ Sig(T ) a signature and S an inseparability relation from Definitions 3 and 4. Then TΣ←> ≡SΣ T . TΣ←> is therefore a SΣ -module of T . We show by way of counter example that TΣ←> is not a self-contained or depleting SΣ module of T when ΣT←> 6= Sig(TΣ←> ). Example 3. Let T be a TBox such that T = {α1 = A v ∃r.D1 , α2 = B v≥ nr.D2 , α3 = ∃r.> v C, α4 = D1 v D2 }, and let Σ = {C}. Then TΣ←> = {α1 , α2 , α3 }, δ = Σ ∪ Sig(TΣ←> ) = {A, B, C, r, D1 , D2 } = 6 ΣT←> . But T |= D1 v ←> ←> D2 and TΣ 6|= D1 v D2 . Therefore TΣ is not a self-contained cΣ -module of T . Similarly, T \ TΣ←> |= α4 6= ∅ with Σ = D1 , D2 and D1 , D2 ∈ δ. Therefore, TΣ←> is not a depleting cΣ -module of T . 8 Before investigating the robustness properties of reachability modules we introduce some lemmas to aid us in the proofs that follow. Lemma 2. Let α be an axiom, Σ and Σ 0 be signatures, x ∈ {>, ⊥} and T a SRIQ TBox. Then: ←x 1. If Σ ⊆ Σ 0 and α is not Σ 0 T reachable, then α is not ΣT←x reachable. 2. If Σ 0 ∩ Sig(α) ⊆ Σ and α is not Σ reachable then α is not Σ 0 reachable. Lemma 3. Let α be an axiom, Σ and Σ 0 be signatures, x ∈ {>, ⊥} and T , T 0 SRIQ TBoxes. Then: 1. Given T1 = TΣ←x 0 ←x 0 , if Σ ⊆ Σ then TΣ = T1 ←x ←x Σ . In particular TΣ ⊆ TΣ←x 0 . 0 ←x ←x 2. If Σ ∩ Sig(T ) ⊆ Σ, then TΣ 0 ⊆ TΣ . ←x 3. If T ⊆ T 0 , then TΣ←x ⊆ T 0 Σ . Lemma 4. Let Σ be an signature, T1 and T2 be SRIQ TBoxes with Sig(T1 ) ∩ ←x Sig(T2 ) ⊆ Σ and x ∈ {>, ⊥}. Then (T1 ∪ T2 )Σ = T1 ←x ←x Σ ∪ T2 Σ . Proposition 1. For x ∈ {>, ⊥}, x-reachability is robust under replacement. Proposition 2. For x ∈ {>, ⊥}, x-reachability is robust under vocabulary ex- tensions. Proposition 3. For x ∈ {>, ⊥}, x-reachability is robust under vocabulary re- strictions. Proposition 4. For x ∈ {>, ⊥}, x-reachability is robust under joins. Reachability modules therefore share all the robustness properties listed. However, we have seen that these modules are neither depleting nor self-contained modules. Amongst other things, the depleting and self-contained nature of mod- ules are utilised in order to find all MinAs (justifications) for an entailment [6]. Definition 12. Let T be a SRIQ Tbox and M ⊆ T . M is a MinA (justifica- tion) for T |= C v D if M |= C v D and there exists no M1 ⊂ M such that M1 |= C v D. We show that although our modules do not share these properties they do contain all MinAs for a given signature. Theorem 2. Let T be a normalized SRIQ TBox and Σ a signature such that Σ ⊆ Sig(T ). Then for arbitrary concept descriptions C, D, such that T |= C v D and Sig(C) ∪ Sig(D) ⊆ ΣT←> we have that TΣ←> contains all MinAs for T |= C v D. ∗ The proof to show that TΣ←⊥> modules share all the robustness properties of TΣ←> modules follows from the above lemmas and follows the proof for ⊥>∗ - locality modules by Sattler, et al. [13]. 9 5 Conclusion We have investigated the module-theoretic properties of reachability modules for SRIQ TBoxes. Reachability modules differ from syntactic locality mod- ules in that they are not self-contained or depleting. One application of the self-contained and depleting nature of locality modules is the finding of all justi- fications for entailments. However, in terms of finding justifications, by showing that reachability modules do preserve all justifications for entailments, we have shown that these properties are sufficient but that they are not necessary. We did preliminary investigations into the size difference between locality and reachability modules. We extracted a random sample of 1000 modules from each of the Pizza, Nci, Nap and Xylocopa v4 ontologies. Reachability modules were between 2.5% and 33% smaller than locality modules with an average of 22% reduction in size across all ontologies tested. Our focus for future research is to do an in-depth empirical evaluation on differences with respect to size and performance between extracting reachability modules for SRIQ and existing syntactic locality methods. We also plan to extend these results to SROIQ. Acknowledgments We wish to thank the reviewers for their contributions and insightful comments. This work was partially funded by Project number 247601, Net2: Network for Enabling Networked Knowledge, from the FP7-PEOPLE- 2009-IRSES call. This work is based upon research supported in part by the National Research Foundation of South Africa (UID 85482, IFR2011032700018). References 1. Baader, F., Calvanese, D., McGuinness, D.L., Nardi, D., Patel-Schneider, P.F. (eds.): The description logic handbook: theory, implementation, and applications. Cambridge University Press, New York, NY, USA (2003) 2. Cuenca Grau, B., Halaschek-Wiener, C., Kazakov, Y., Suntisrivaraporn, B.: Incre- mental Classification of Description Logic Ontologies. Tech. rep. (2012) 3. Cuenca Grau, B., Horrocks, I., Kazakov, Y., Sattler, U.: Just the right amount: extracting modules from ontologies. In: Williamson, C., Zurko, M. (eds.) Proceed- ings of the 16th International Conference on World Wide Web (WWW ’07). pp. 717–726. ACM, New York, NY, USA (2007) 4. Cuenca Grau, B., Horrocks, I., Kazakov, Y., Sattler, U.: Modular Reuse of Ontolo- gies: Theory and Practice. Journal of Artificial Intelligence Research (JAIR) 31, 273–318 (2008) 5. Del Vescovo, C., Parsia, B., Sattler, U., Schneider, T.: The modular structure of an ontology: atomic decomposition and module count. In: O. Kutz, T.S. (ed.) Proc. of WoMO-11. Frontiers in AI and Appl., vol. 230, pp. 25–39. IOS Press (2011) 6. Horridge, M., Parsia, B., Sattler, U.: Laconic and precise justifications in owl. In: In Proc. of ISWC-08, volume 5318 of LNCS. pp. 323–338 (2008) 7. Horrocks, I., Kutz, O., Sattler, U.: The irresistible SRIQ. In: In Proc. of OWL: Experiences and Directions (2005) 10 8. Konev, B., Lutz, C., Walther, D., Wolter, F.: Modular ontologies. chap. Formal Properties of Modularisation, pp. 25–66. Springer-Verlag, Berlin, Heidelberg (2009) 9. Nguyen, S., Pretolani, D., Markenzon, L.: On some path problems on oriented hypergraphs. Theoretical Informatics and Applications 32(1-2-3), 1–20 (1998) 10. Nortjé, R.: Module extraction for inexpressive description logics. Master’s thesis, University of South Africa (2011) 11. Nortjé, R., Britz, K., Meyer, T.: Bidirectional reachability-based modules. In: Pro- ceedings of the 2011 International Workshop on Description Logics (DL2011). CEUR Workshop Proceedings, CEUR-WS (2011), http://ceur-ws.org 12. Nortjé, R., Britz, K., Meyer, T.: A normal form for hypergraph-based module extraction for SROIQ. In: Gerber, A., Taylor, K., Meyer, T., Orgun, M. (eds.) Australasian Ontology Workshop 2009 (AOW 2009). Ceur-ws, vol. 969, pp. 40–51. CEUR, Melbourne, Australia (2012), http://ceur-ws.org/Vol-969/proceedings.pdf 13. Sattler, U., Schneider, T., Zakharyaschev, M.: Which kind of module should I extract? In: Grau, B.C., Horrocks, I., Motik, B., Sattler, U. (eds.) Description Logics. CEUR Workshop Proceedings, vol. 477. CEUR-WS.org (2009) 14. Suntisrivaraporn, B.: Polynomial-Time Reasoning Support for Design and Main- tenance of Large-Scale Biomedical Ontologies. Ph.D. thesis, Technical University of Dresden (2009) 15. Thakur, M., Tripathi, R.: Complexity of Linear Connectivity Problems in Directed Hypergraphs. Linear Connectivity Conference pp. 1–12 (2001) 11 A Proofs for Theorems and Lemmas Theorem 1 Exhaustively applying the rules from Table 2 to any SRIQ TBox T results in a SRIQ TBox T 0 in normal form. The normalization process can be completed in linear time in the number of axioms. Proof: We show that any SRIQ TBox can be converted to an equivalent normal form as follows: – Step 1 - ≡-elimination: Rule NR11 may be applied at most once for each axiom in the TBox. No other rule introduces new axioms that contain equiv- alences. Therefore the elimination of all equivalences from the TBox will require linear time and add at most a linear number of axioms. – Step 2 - ∀-elimination: Applying rule NR7 to every occurrence of a universal restriction in any axiom, irrespective of order, will result in the elimination of all universal restrictions within that axiom. Nested restrictions are handled recursively as they are removed and inserted into the added axioms as Ĉ. There are a constant number of universal restrictions per axiom and therefore the application of rule NR7 will run in constant time, with each application adding a constant number of new axioms. Therefore eliminating all universal restrictions across all axioms will require linear time and add at most a linear number of new axioms. – Step 3 - 6-elimination: Applying rule NR10 to every occurrence of an atmost restriction in any axiom, irrespective of order, will result in the elimination of all atmost restrictions within that axiom. Nested restrictions are handled recursively as needed. There are a constant number of atmost restrictions per axiom and therefore the application of rule NR10 will run in constant time. Therefore eliminating all atmost restrictions across all axioms will require linear time. – Step 4 - Complex role-filler elimination: At the start of this step there are no universal or at most restrictions. We eliminate all complex role fillers by applying rules NR8 and NR9 to all axioms. There are a constant number of complex role fillers per axiom, therefore the application of these rules requires constant time per axiom and will add at most a constant number of axioms. Therefore, removing all complex role fillers from an ontology by using rules NR8 and NR9 will require at most linear time and add a linear number of new axioms. – Step 5 - ¬-elimination and u, t-simplification: At this step there are no existential restriction or at-least restriction with complex role fillers. Given any axiom α = (αL v αR ), in a left to right fashion we apply the rules as follows: 1. Apply rules NR1, NR3, NR6 to αL until αL consists of an existential restriction, an at-least restriction or the conjunction of concept names. There are a constant number of complex concept description, negations, disjunctions and conjunctions in αL , each of these rules either eliminates a negation, removes a complex concept description or eliminates a dis- junction. There are a constant number of times each of these operations 12 may be applied. Rules NR3 and NR6 each add a constant number of axioms. Therefore, αL can be processed in constant time for each axiom. 2. Apply rules NR2, NR4, NR5 to αR until αR consists of an existential restriction, an at-least restriction or a disjunction of concept names. There are a constant number of times each of these operations may be applied. Rules NR4 and NR5 each add a constant number of axioms. Therefore, αL can be processed in constant time for each axiom. These rules are applied repeatedly until no further processing may be ap- plied to either αL and αR . Since each step can be completed in constant time and add at most a constant number of new axioms, normalization can be completed in linear time in the number of axioms with at most a linear increase in the number of axioms. 2 Lemma 1 Let T be a SRIQ TBox, and Σ ⊆ Sig(T ) a signature. Then T |= C v D if and only if TΣ←> |= C v D for arbitrary SRIQ concept descriptions C and D such that Sig(C) ∪ Sig(D) ⊆ Σ. Proof: We have to prove two parts. First: If TΣ←> |= C v D then T |= C v D. This follows directly from the fact that TΣ←> ⊆ T and that SRIQ is monotonic. Conversely, we show that, if T |= C v D then TΣ←> |= C v D. Assume the contrary, that is, assume T |= C v D but that TΣ←> 6|= C v D. Then there must exist an interpretation I and an individual w ∈ ∆I such that I is a model 0 of TΣ←> and w ∈ C I \ DI . Modify I to I 0 by setting xI := ∆I for all concept 0 names x ∈ Sig(T ) \ ΣT←> , and rI := ∆I × ∆I for all roles names r ∈ Sig(T )\ Σ ←> and leaving everything else unchanged. We show that I 0 is a model of TΣ←> . For all α := αL v αR , with α ∈ TΣ←> , we have that: 0 – If αR is such that Sig(αR ) ⊆ ΣT←> we have that (αR )I = (αR )I since it does not change the interpretation of any symbols. – If αR is an existential restriction of the form ∃r.A with y ∈ Sig(αR ) \ ΣT←> , 0 0 then (y)I = ∆I or (y)I = ∆I × ∆I depending on whether y is a role or 0 concept name. In both cases we have that (αR )I ⊆ (αR )I . – If αR is an at-least restriction of the form ≥ nr.A with y ∈ Sig(αR ) \ ΣT←> , 0 0 then (y)I = ∆I or (y)I = ∆I × ∆I depending on whether y is a role or 0 concept name. In both cases we have that (αR )I ⊆ (αR )I . 0 – If αR is of the form ∃R.Self with R ∈ ΣT←> we have that (αR )I = (αR )I since it does not change the interpretation of the symbol R. – If α is of the form Dis(R, S) then by definition it is always in TΣ←> , thus R, S ∈ ΣT←> . Therefore, the interpretation of alpha does not change. 0 In all cases (αL )I = (αL )I since α ∈ TΣ←> and Sig(αL ) ∈ ΣT←> and thus 0 0 (αL )I ⊆ (αR )I . Thus, I 0 is a model for TΣ←> . Now for every α = (αL v αR ) ∈ ←> T \ TΣ we have: 0 I – αR is a concept name and αR = ∆I , or 0 I – αR is a role name and αR = ∆I × ∆I , or 13 – αR is a disjunction of the form A1 t . . . t An with at least one Ai 6∈ ΣT←> , 0 I0 thus AIi = ∆I and αR = AI1 ∪ . . . ∪ ∆I ∪ . . . ∪ AIn = ∆I , or 0 0 – αR is an existential restriction ∃r.A1 , thus rI = ∆I × ∆I and AI1 = ∆I so 0 that (∃r.A1 )I = ∆I , or0 0 – αR is ∃r.Self , thus rI = ∆I × ∆I so that (∃r.Self )I = ∆I , or0 I0 – αR is an atleast restriction ≥ nr.A2 , thus r = ∆ × ∆I , AI2 = ∆I and I 0 |∆I | ≥ n so that (≥ nr.A2 )I = ∆I . This follows from the fact that for any concept description ≥ nr.A, |∆I | ≥ |(r.A)I | ≥ n for it to be satisfiable. 0 0 I I Since for all cases αL ⊆ αR , we conclude that I 0 is a model for T . But I and I 0 correspond on all symbols y ∈ (Sig(D) ∪ Sig(C)) ⊆ Σ ⊆ ΣT←> and therefore 0 0 0 DI = DI and C I = C I . Now since C I = C I and w ∈ C I we have that 0 0 w ∈ C I \ DI and hence T 6|= C v D, contradicting the assumption. 2 Lemma 2 Let α be an axiom, Σ and Σ 0 be signatures, x ∈ {>, ⊥} and T a SRIQ TBox. Then: ←x 1. If Σ ⊆ Σ 0 and α is not Σ 0 T reachable, then α is not ΣT←x reachable. 2. If Σ 0 ∩ Sig(α) ⊆ Σ and α is not Σ reachable then α is not Σ 0 reachable. Proof: ←x 1. By the inductive definition of x-reachability if Σ ⊆ Σ 0 then ΣT←x ⊆ Σ 0 T . ←x Thus if α is not Σ 0 T reachable it can also not be ΣT←x -reachable. 2. Assume that α is not Σ reachable but it is Σ 0 reachable. Then there is some symbol y ∈ Sig(α) such that y 6∈ Σ and y is required for α to be Σ reachable. α is Σ 0 reachable so it must be the case that y ∈ Σ 0 . But this contradicts our assumption that Σ 0 ∩ Sig(α) ⊆ Σ. Thus, α is not Σ 0 reachable. Lemma 3 Let α be an axiom, Σ and Σ 0 be signatures, x ∈ {>, ⊥} and T , T 0 SRIQ TBoxes. Then: 1. Given T1 = TΣ←x 0 ←x 0 , if Σ ⊆ Σ then TΣ = T1 ←x ←x Σ . In particular TΣ ⊆ TΣ←x 0 . 0 ←x ←x 2. If Σ ∩ Sig(T ) ⊆ Σ, then TΣ 0 ⊆ TΣ . ←x 3. If T ⊆ T 0 , then TΣ←x ⊆ T 0 Σ . Proof: 1. Assume that there is some axiom α ∈ TΣ←x such that α 6∈ TΣ←x 0 . Therefore, ←x we have that α is not Σ 0 T reachable but that it is ΣT←x reachable. But this is a contradiction by Lemma 2.1 since Σ ⊆ Σ 0 . Thus, TΣ←x ⊆ TΣ←x 0 . A similar argument is used to show that TΣ←x ⊆ T1 ←x Σ and T 1 ←x Σ ⊆ T ←x Σ . 2. For every α ⊆ T we have that Σ 0 ∩ Sig(α) ⊆ Σ. Therefore, from Lemma 2.2 we have that whenever α ⊆ T is not Σ reachable it is also not Σ 0 reachable and we have that TΣ←x 0 contains at most all those axioms in TΣ←x . Thus, ←x ←x TΣ 0 ⊆ TΣ . ←x 3. Let δ = ΣT←x , δ 0 = Σ 0 T1 and α ∈ (T ∩ T1 ). Assume α is δ reachable but not δ 0 reachable. Since T ⊆ T1 and Sig(T ) ⊆ Sig(T1 ) we have by the inductive definition of x reachability that δ ⊆ δ 0 . But by Lemma 2.1 we have that whenever α is not δ 0 reachable then it is also not δ reachable. Therefore, TΣ←x contains at most all those axioms in T1 ←x Σ . Thus, TΣ ←x ⊆ T1 ←x Σ . 14 Lemma 4 Let Σ be an signature, T1 and T2 be SRIQ TBoxes with Sig(T1 ) ∩ ←x Sig(T2 ) ⊆ Σ and x ∈ {>, ⊥}. Then (T1 ∪ T2 )Σ = T1 ←x ←x Σ ∪ T2 Σ . ←x Proof: Let M = (T1 ∪ T2 )Σ , M1 = T1 ←x ←x Σ , M2 = T2 Σ . Now T1 ⊆ T1 ∪ T2 thus by Lemma 3.3 we have that M1 ⊆ M. Similarly M2 ⊆ M and thus M1 ∪M2 ⊆ M ∪ M which gives us M1 ∪ M2 ⊆ M. Let Σ 0 = Σ ∪ ΣT←x 1 ∪ ΣT←x 2 . To show that M ⊆ M1 ∪ M2 we note that, when extracting these modules, the order in which axioms are extracted are irrelevant. We therefore assume that any algorithm first ←x extracts axioms in M1 ∪M2 then tests all other axioms for Σ 0 T1 ∪T2 -reachability. Consider any axiom α ∈ (T1 ∪ T2 ) \ (M1 ∪ M2 ). If α ∈ T1 then α ∈ T1 \ M1 and α is not ΣT←x 1 ∪ Σ reachable. Now precondition Sig(T2 ) ∩ Sig(T1 ) ⊆ Σ implies ΣT←x 2 ∩ Sig(α) ⊆ Σ, taken that α is not ΣT←x 1 ∪ Σ reachable we manipulate this statement to derive (Σ ∪ ΣT←x 2 ∪ Σ ←x T1 ) ∩ Sig(α) ⊆ Σ ∪ ΣT←x 1 . Thus by Lemma ←x ←x 2.2 we have that α is not Σ ∪ ΣT2 ∪ ΣT1 reachable. The case where α ∈ T2 is treated analogously. 2 Proposition 1 For x ∈ {>, ⊥}, x-reachability is robust under replacement. Proof: Let Sig(T ) ∩ (Sig(T1 ) ∪ Sig(T2 )) ⊆ Σ. This implies that Sig(T ) ∩ Sig(Ti ) ⊆ Σ, for i = 1, 2. Now we have: ←x (T ∪ T1 )Σ = TΣ←x ∪ T1 ←x Σ (Lemma 4) = T2 ←x Σ ∪ T ←x Σ (Precondition) ←x = (T2 ∪ T )Σ (Lemma 4) 2 Proposition 2 For x ∈ {>, ⊥}, x-reachability is robust under vocabulary ex- tensions. Proof: Let Σ 0 ∩ (Sig(T1 ) ∪ Sig(T2 )) ⊆ Σ and T1 ≡xΣ T2 i.e., T1 ←x ←x Σ = T2 Σ . Let 00 0 00 00 0 Σ = Σ ∩ Σ (which implies Σ ⊆ Σ and Σ ⊆ Σ ). Then we have that: T1 ←x ←x Σ0 = T1 Σ 00 (∗) = (T1 ←x ←x Σ )Σ 00 (Lemma 3.1) = (T2 ←x Σ )←x Σ 00 (Precondition) ←x = T2 Σ 00 (Lemma 3.1) = T2 ←x Σ 0 (∗∗) As for equality (∗), set inclusion is due to Σ 0 ∩ Sig(T1 ) = Σ 00 and the combi- nation of Lemma 3.2 and Lemma 3.3, and the converse is due to Σ 00 ⊆ Σ 0 and Lemma 3.1. Equality (∗∗) is justified analogously. 2 Proposition 3 For x ∈ {>, ⊥}, x-reachability is robust under vocabulary re- strictions. Proof: Follows from the fact that we have robustness under vocabulary exten- sions. 2 15 Proposition 4 For x ∈ {>, ⊥}, x-reachability is robust under joins. Proof: For i = 1, 2, let Mi = Ti ←x Σ with Sig(T1 ) ∩ Sig(T2 ) ⊆ Σ, and let M = (T1 ∪ T2 )←x Σ . The precondition says that M1 = M2 . It is clear from Lemma 3.3 that M ⊇ Mi . It suffices to show M ⊆ M1 . Take any axiom ←x α ∈ (T1 ∪ T2 ) \ M1 . It remains to show that α is not Σ ∪ ΣM 1 reachable. In case α ∈ T1 \ M1 , then α is not ΣM1 reachable since M1 = T1 ←x ←x Σ . In case α ∈ T2 \ M1 , we also have that α ∈ T2 \ M2 because M1 = M2 . This means that α is not Σ ∪ ΣM ←x 2 reachable therefore not Σ ∪ ΣM←x 1 reachable. 2 Theorem 2 Let T be a normalized SRIQ TBox and Σ a signature such that Σ ⊆ Sig(T ). For arbitrary concept descriptions C, D such that T |= C v D and Sig(C) ∪ Sig(D) ⊆ ΣT←> we have that TΣ←> contains all MinAs for T |= C v D. Proof: Assume that T |= C v D for some Sig(C) ∪ Sig(D) ⊆ ΣT←> , but there is a MinA M for T |= C v D that is not contained in TΣ←> . If C v D is a tautology then M must be empty with M ⊆ TΣ←> . Thus, we assume that C v D is not a tautology. Since M 6⊆ TΣ←> , there must be an axiom α ∈ M \ TΣ←> . Define M1 := M ∩ TΣ←> . M1 is a strict subset of M since α 6∈ M1 . There are two cases, either M1 = ∅ or it contains at least one axiom. In the case where M1 = ∅, define T1 = T \ TΣ←> with M ⊆ T1 . Now since M |= C v D we have by monotinocity that T1 |= C v D. Since T1 ⊆ T we have by Lemma 3.3 that T1 ←> Σ ⊆ TΣ←> and thus that T1 ←> Σ = ∅. But by Lemma 1 ←> we have that T1 Σ |= C v D if, and only if, T1 |= C v D. Since C v D is not a tautology we have that T1 ←> Σ 6|= C v D and thus that M 6|= C v D. In the case where M1 6= ∅ we claim that M1 |= C v D, which contradicts the fact that M is a MinA for T |= C v D. We use proof by contraposition to show this. Assume that M1 6|= C v D, i.e., there is a model I1 of M1 such that C I1 6⊆ DI1 . We modify I1 to I by setting y I := ∆I1 for all concept names y ∈ Sig(T ) \ ΣT←> , and rI := ∆I1 × ∆I1 for all roles names r ∈ Sig(T ) \ ΣT←> . We have DI = DI1 since Sig(D) ⊆ ΣT←> , and C I = C I1 since Sig(C) ⊆ ΣT←> . It follows that C I 6⊆ DI . It remains to be shown that I is indeed a model of M, and therefore satisfies all axioms β = (βL v βR ) in M, including α. If β = Dis(Rr , R2 ) then by definition Sig(β) ⊆ ΣT←> so that 1 (β)I = (β)I . Otherwise there are two possibilities: – β ∈ M1 . Since M1 ⊆ TΣ←> , all symbols in Sig(βL ) are also in ΣT←> and possibly some symbols of Sig(βR ) may not be in ΣT←> . Consequently, I1 and I coincide on the names occurring in βL and since I1 is a model of M1 , we have that (βL )I = (βL )I1 and (βR )I1 ⊆ (βR )I . Therefore (βL )I ⊆ (βR )I . – β 6∈ M1 . Since β ∈ M, we have that β 6∈ TΣ←> , and hence β is not ΣT←> - reachable. Thus, I0 • βR is a concept name and βR = ∆I , or I0 • βR is a role name and βR = ∆I × ∆I , or • βR is a disjunction of the form A1 t. . .tAn with at least one Ai 6∈ ΣT←> , 0 I0 thus AIi = ∆I and βR = AI1 ∪ . . . ∪ ∆I ∪ . . . ∪ AIn = ∆I , or 16 0 0 • βR is an existential restriction ∃r.A1 , thus rI = ∆I × ∆I and AI1 = ∆I 0 so that (∃r.A1 )I = ∆I , or 0 0 • βR is ∃r.Self , thus rI = ∆I × ∆I so that (∃r.Self )I = ∆I , or I0 0 • βR is an atleast restriction ≥ nr.A2 , thus r = ∆ × ∆I , AI2 = ∆I I I I0 I and |∆ | ≥ n so that (≥ nr.A2 ) = ∆ . This follows from the fact that for any concept description ≥ nr.A, |∆I | ≥ |(r.A)I | ≥ n for it to be satisfiable. By definition of I, (βR )I = ∆I1 . Hence (βL )I ⊆ (βR )I . Therefore I is a model for M. But since C I 6⊆ DI we have that M 6|= C v D proving the contrapositive. 2 17