A normal form for hypergraph-based module extraction for SROIQ Riku Nortje, Katarina Britz, and Thomas Meyer Center for Artificial Intelligence Research, University of KwaZulu-Natal and CSIR Meraka Institute, South Africa Email: nortjeriku@gmail.com; {arina.britz;tommie.meyer}@meraka.org.za Abstract. Modularization is an important part of the modular design and maintenance of large scale ontologies. Syntactic locality modules, with their desirable model theoretic properties, play an ever increasing role in the design of algorithms for modularization, partitioning and rea- soning tasks such as classification. It has been shown that, for the DL EL+ , the syntactic locality module extraction problem is equivalent to the reachability problem for hypergraphs. In this paper we investigate and introduce a normal form for the DL SROIQ which allows us to map any SROIQ ontology to an equivalent hypergraph. We then show that standard hyperpath search algorithms can be used to extract modules similar to syntactic locality modules for SROIQ ontologies. 1 Introduction The advent of the semantic web presupposes a significant increase in the size of ontologies, their distributive nature and the requirement for fast reasoning algorithms. Modularization techniques not only play an increasingly important role in the design and maintenance of large-scale distributed ontologies, but also in the design of algorithms that increase the efficiency of reasoning tasks such as subsumption testing and classification [11, 1]. Extracting minimal modules is computationally expensive and even undecid- able for expressive DLs [2, 3]. Therefore, the use of approximation techniques and heuristics play an important role in the effective design of algorithms. Syntactic locality [2, 3], because of its excellent model theoretic properties, has become an ideal heuristic and is widely used in a diverse set of algorithms [11, 1, 4]. Suntisrivaraporn [11] showed that, for the DL EL+ , ⊥-locality module extrac- tion is equivalent to the reachability problem in directed hypergraphs. Nortjé et al. [9, 10] extended the reachability problem to include >-locality and introduced bidirectional reachability modules as a subset of ⊥> modules. In this paper we introduce a normal form for the DL SROIQ, which allows us to map any SROIQ ontology to an equivalent syntactic locality preserving hypergraph. We show that, given this mapping, the extraction of ⊥-locality modules is equivalent to the extraction of all B-hyperpaths, >-locality is similar to extracting all F -hyperpaths and ⊥>∗ modules to that of extracting frontier graphs. These similarities demonstrate a unique relationship between reasoning tasks, based on syntactic locality, for SROIQ ontologies, and standard well studied hypergraph algorithms. 2 Preliminaries 2.1 Hypergraphs Hypergraphs are a generalization of graphs and have been extensively studied since the 1970s as a powerful tool for modelling many problems in Discrete Math- ematics. In this paper we adapt the definitions of hypergraphs and hyperpaths from [8, 12]. A (directed) hypergraph is a pair H = hV, Ei, where V is a finite set of nodes, E ⊆ 2V × 2V is the set of hyperedges such that for every e = (T (e), H(e)) ∈ E, T (e) 6= ∅, H(e) 6= ∅, and T (e) ∩ H(e) = ∅. A hypergraph H0 = hV 0 , E 0 i is a subhypergraph of H if V 0 ⊆ V and E 0 ⊆ E. A hyperedge e is a B-hyperedge if |H(e)| = 1. A B-hypergraph is a hypergraph such that each hyperedge is a B-hyperedge. A hyperedge e is an F-hyperedge if |T (e)| = 1. An F-hypergraph is a hypergraph such that each hyperedge is an F-hyperedge. A BF-hypergraph is a hypergraph for which every edge is either a B- or an F-hyperedge. Let e = (T (e), H(e)) be a hyperedge in some directed hypergraph H. Then, T (e) is known as the tail of e and H(e) is known as the head of e. Given a directed hypergraph H = (V, E), its symmetric image H is a directed hypergraph defined as: V(H) = V(H) and E(H) = {(H, T ) | (T, H) ∈ E(H)}. We denote by BS(v) = {e ∈ E | v ∈ H(e)} and F S(v) = {e ∈ E | v ∈ T (e)} respectively the backward star and forward star of a node v. Let n and m be the number of nodesP and hyperedges in a hypergraph H. We define the size of H as size(H) = |V| + e∈E (|T (e)| + |H(e)|). Q A simple path st from s ∈ V(H) to t ∈ V(H) in H is a sequence (v1 , e1 , v2 , e2 , ..., vk , ek , vk+1 ) consisting of distinct nodes and hyperedges such that s = v1 , t = vk+1 and for Q every 1 ≤ i ≤ k, vi ∈ T (ei ) and vi+1 ∈ H(ei ). If in addition t ∈ T (e1 ) then st is a simple cycle. A simple path is cycle free if it does not contain any subpath that is a simple cycle. A node s is B-connected to itself. If there is a hyperedge e such that all nodes vi ∈ T (e) are B-connected to s, then every vj ∈ H(e) is B-connected to s. A B-hyperpath from s ∈ V(H) to t ∈ V(H)Q is a minimal subhypergraph of H where t is B-connected to s. An F-hyperpath st from s ∈ V(H) to t ∈ V(H) in H is Q a subhyperpath of H such that st is a B-hyperpath from t to s in H. A BF- hyperpath from s ∈ V(H) to t ∈ V(H) in H is a minimal (in the inclusion sense) subhyperpath of H such that it is simultaneously both a B-hyperpath and an F- hyperpath from s to t in H. We note that every hypergraph H can be transformed to a BF-hypergraph H0 by replacing each hyperedge e = (T (e), H(e)) with the two hyperedges e1 = (T (e), {nv }), e2 = ({nv }, H(e)) where nv is a new node. 2 Algorithm 1 (Visiting a hypergraph [8]) Procedure Bvisit(s,H) Procedure Fvisit(t,H) 1 : for each u ∈ V do blabel(u) := f alse; for each u ∈ V do f label(u) := f alse; 2 : for each e ∈ E do T (e) := 0; for each e ∈ E do T (e) := 0; 3 : Q := {s}; blabel(s) := true; Q := {t}; f label(t) := true; 4 : while Q 6= ∅ do while Q 6= ∅ do 5: select and remove u ∈ Q; select and remove u ∈ Q; 6: for each e ∈ F S(u) do for each e ∈ BS(u) do 7: T (e) := T (e) + 1; H(e) := H(e) + 1; 8: if T (e) := |T ail(e)|then if H(e) := |Head(e)|then 9: for each v ∈ Head(e) do for each v ∈ T ail(e) do 10 : if blabel(v) = f alse then if f label(v) = f alse then 11 : blabel(v) = true f label(v) = true 12 : Q := Q ∪ {v} Q := Q ∪ {v} Given some node s, Algorithm 1 can be used to find all B-connected or F- connected nodes to s in O(size(H)) time. Here, the set of all B-hyperpaths from s and F-hyperpaths to t are respectively represented by all those nodes n such that blabel(n) = true or f label(n) = true, as well as the edges connecting those nodes. Fig. 1. Example hypergraph H1 Example 1. In Figure 1 we have H1 = (V1 , E1 ), with V1 = {v1 , ..., v9 } and E1 = {e1 , e2 , e3 , e4 , e5 , e6 } such that e1 = ({v1 }, {v2 , v3 }), e2 = ({v2 }, {v1 , v5 , v6 }), e3 = ({v3 , v4 }, {v7 }), e4 = ({v5 , v6 }, {v8 }), e5 = ({v7 }, {v6 , v8 }) and e6 = ({v6 , v8 }, {v9 }). The directed hypergraph G1 with nodes V(G1 ) = {v1 , v2 , v3 , v5 , v6 , v8 , v9 } and E(G1 ) = {e1 , e2 , e4 , e6 } is a B-hyperpath from v1 to v9 in H1 . The hypergraph G2 with V(G2 ) = {v3 , v4 , v6 , v7 , v8 , v9 } and E(G2 ) = {e3 , e5 , e6 } is an F-hyperpath from v3 to v9 in H1 . The hypergraph G3 with V(G3 ) = {v6 , . . . v9 } and E(G3 ) = {e5 , e6 } is a BF-hyperpath from v7 to v9 in H1 . Definition 1. Given a hypergraph H = (V, E), the frontier graph H0 = (V 0 , E 0 , s, t) of H, such that V 0 ⊆ V, E 0 ⊆ E, s, t ∈ V, is the maximal (in the inclusion sense) BF -graph in which (1) s and t are the origin and destination nodes, (2) if v ∈ V 0 then v is B-connected to s, and t is F-connected to v in H0 . 3 Algorithm 2 (Frontier graph Extraction Algorithm [8]) Procedure frontier(H, H0 , s, t) 1 : H0 := H; change := true 2 : while change = true do 3 : change = f alse 3 : H0 = Bvisit(s, H0 ); H0 = F visit(t, H0 ) 4 : for each v ∈ V 0 5: if blabel(v) = f alse or f label(v) = f alse then 6: change := true 7: V 0 = V 0 − {v}; E 0 = E 0 − F S(v) − BS(v) 8 : if s 6∈ V 0 or t 6∈ V 0 then 9: H0 := ∅; change := f alse; Algorithm 2 can be used to extract a frontier graph for any source and destination nodes and runs in O(n size(H)) time. 2.2 The DL SROIQ In this section we give a brief introduction to the DL SROIQ [5, 7] with its syntax and semantics listed in Table 1. NC , NR and NI denote disjoint sets of atomic concept names, atomic roles names and individual names. The set NR includes the universal role. Well-formed formulas are created by combining concepts from the table by using the connectives ¬, u, t etc. Given R1 ◦ . . . ◦ Rn v R, where n > 1 and Ri , R ∈ NR , is a role inclusion axiom (RIA). A role hierarchy is a finite set of RIAs. Here R1 ◦ . . . ◦ Rn denotes a composition of roles where R, Ri may also be an inverse role R− . A role R is simple if it: (1) does not appear on the right-hand side of a RIA; (2) is the inverse of a simple role; or (3) appears on the right-hand side of a RIA only if the left-hand side consists entirely of simple roles. Ref (R), Irr(R) and Dis(R, S), where R, S are roles other than U , are role assertions. A set of role assertions is simple w.r.t. a role-hierarchy H if each assertion Irr(R) and Dis(R, S) uses only simple roles w.r.t. H. A strict partial order ≺ on NR is a regular order if, and only if, for all roles R and S: S ≺ R iff S − ≺ R. Let ≺ be a regular order on roles. A RIA w v R is ≺-regular if, and only if, R ∈ NR and w has one of the following forms: (1) R ◦ R, (2) R− , (3) S1 ◦ . . . ◦ Sn , where each Si ≺ R, (4) R ◦ S1 ◦ . . . ◦ Sn , where each Si ≺ R and (5) S1 ◦ . . . ◦ Sn ◦ R, where each Si ≺ R. A role hierarchy H is regular if there exists a regular order ≺ such that each RIA in H is ≺-regular. An RBox is a finite, regular role hierarchy H together with a finite set of role assertions simple w.r.t. H. If a1 , . . . , an are in NI , then {a1 , . . . , an } is a nominal. No is the set of all nominals. The set of SROIQ concept descriptions is the smallest set such that: (1) ⊥,>, each C ∈ NC , and each o ∈ No is a concept description. (2) If C is a concept description, then ¬C is a concept description. (3) If C and D are concept descriptions, R is a role description, S is a simple role description, and n is a non-negative integer, then the following are all concept descriptions: (C u D), (C t D), ∃R.C, ∀R.C, 6 nS.C, > nS.C, ∃S.Self . 4 Table 1. Syntax and semantics of SROIQ Concept Syntax Semantics atomic concept C ∈ NC C I ⊆ ∆I individual A ∈ NI aI ∈ ∆I nominal {a1 , . . . , an }, ai ∈ NI {aI1 , . . . , aIn } role R ∈ NR R I ⊆ ∆I × ∆I inverse role R − , R ∈ NR R−I = {(y, x)|(x, y) ∈ RI } universal role U U I = ∆I × ∆I role composition R1 ◦ . . . ◦ Rn {(x, z)|(x, y1 ) ∈ R1I ∧ (y1 , y2 ) ∈ R2I ∧ . . . ∧ (yn , z) ∈ Rn+1 I } top > >I = ∆I bottom ⊥ ⊥I = ∅ negation ¬C (¬C) = ∆I \ C I I conjunction C1 u C2 (C1 u C2 )I = C1I ∩ C2I disjunction C1 t C2 (C1 t C2 )I = C1I ∪ C2I exist restriction ∃R.C {x|(∃y)[(x, y) ∈ RI ∧ y ∈ C I ]} value restriction ∀R.C {x|(∀y)[(x, y) ∈ RI → y ∈ C I ]} self restriction ∃R.Self {x|(x, x) ∈ RI } atmost restriction 6 nR.C {x|#{y|(x, y) ∈ RI ∧ y ∈ C I } 6 n} atleast restriction > nR.C {x|#{y|(x, y) ∈ RI ∧ y ∈ C I } > n} Axiom Syntax Semantics concept inclusion C1 v C2 C1I ⊆ C2I role inclusion R1 ◦ . . . ◦ Rn v R (R1 ◦ . . . ◦ Rn )I ⊆ RI reflexivity Ref (R) {(x, x)|x ∈ ∆I } ⊆ RI irreflexivity Irr(R) {(x, x)|x ∈ ∆I } ∩ RI = ∅ disjointness Dis(R, S) S I ∩ RI = ∅ class assertion C(a) aI ∈ C I inequality assertion a 6= b aI 6= bI role assertion R(a, b) (aI , bI ) ∈ RI negative role assertion ¬R(a, b) (aI , bI ) 6∈ RI If C and D are concept description then C v D is a general concept inclusion (GCI) axiom. A TBox is a finite set of GCIs. If C is a concept description, a, B ∈ NI , R, S ∈ NR with S a simple role description, then C(a), R(a, b), ¬S(a, b), and a 6= b, are individual assertions. An SROIQ ABox is a finite set of individual assertions. All GCIs, RIAs, role assertions, and individual assertions are referred to as axioms. A SROIQ-KB base is the union of a TBox, RBox and ABox. 2.3 Modules Definition 2. (Module for the arbitrary DL L) Let L be an arbitrary de- scription language, O an L ontology, and σ a statement formulated in L. Then, O0 ⊆ O is a module for σ in O (a σ-module in O) whenever: O |= σ if and only if O0 |= σ. We say that O0 is a module for a signature S in O (an S-module in O) if, for every L statement σ with Sig(σ) ⊆ S, O0 is a σ-module in O. 5 Definition 2 is sufficiently general so that any subset of an ontology preserving a statement of interest is considered a module, the entire ontology is therefore a module in itself. An important property of modules in terms of the modular reuse of ontologies is safety [2, 3]. Intuitively, a module conforms to a safety condition whenever an ontology T reuses concepts from an ontology T 0 in such a way so that it does not change the meaning of any of the concepts in T 0 . This may be formalized in terms of the notion of conservative extensions: Definition 3. (Conservative extension [3]) Let T and T1 be two ontologies such that T1 ⊆ T , and let S be a signature. Then (1) T is an S-conservative extension of T1 if, for every α with Sig(α) ⊆ S, we have T |= α iff T1 |= α. (2) T is a conservative extension of T1 if T is an S-conservative extension of T1 for S = Sig(T1 ). Definition 4. (Safety [3, 6]) An ontology T is safe for T 0 if T ∪ T 0 is a conservative extension of T 0 . Further let S be a signature. We say that T is safe for S if, for every ontology T 0 with Sig(T ) ∩ Sig(T 0 ) ⊆ S, we have that T ∪ T 0 is a conservative extension of T 0 . Intuitively, given a set of terms, or seed signature, S, a S-module M based on deductive-conservative extensions is a minimal subset of an ontology O such that for all axioms α with terms only from S, we have that M |= α if, and only if, O |= α, i.e., O and M have the same entailments over S. Besides safety, reuse of modules requires two additional properties namely coverage and independence. Definition 5. (Module coverage [6]) Let S be a signature and T 0 , T be ontologies with T 0 ⊆ T such that S ⊆ Sig(T 0 ). Then, T 0 guarantees coverage of S if T 0 is a module for S in T . Definition 6. (Module Independence [6]) Given an ontology T and signa- tures S1 , S2 , we say that T guarantees module independence if, for all T1 with Sig(T ) ∩ Sig(T1 ) ⊆ S1 , it holds that T ∪ T1 is safe for S2 . Unfortunately, deciding whether or not a set of axioms is a minimal module is computationally hard or even impossible for expressive DLs [2, 3]. However, if the minimality requirement is dropped, good sized approximations can be defined that are efficiently computable, as in the case of syntactic locality, which modules are extracted in polynomial time. Algorithm 3 (Extract a locality module [2]) Procedure extract-module(T , S, x) Inputs: Tbox T ; signature S; x ∈ ⊥, >; Output x-module M 1 : M := ∅; T 0 = T ; 2 : repeat 3 : change = f alse 4 : for each α ∈ T 0 5: if α not x-local w.r.t. S∪Sig(M) then 6: M = M + {α} 7: T 0 = T 0 \ {α} 8: changed = true 9 : until changed = f alse 6 Definition 7. (Syntactic locality [3]) Let S be a signature and O a SROIQ ontology. An axiom α is ⊥-local w.r.t. S (>-local w.r.t S) if α ∈ Ax(S), as defined in the grammar: ⊥-Locality Ax(S) ::= C ⊥ v C|C v C > |w⊥ v R|Dis(S ⊥ , S)|Dis(S, S ⊥ ) Con (S) ::= A⊥ |¬C > |C ⊥ u C|C u C ⊥ |C1⊥ t C2⊥ |∃R⊥ .C|∃R.C ⊥ ⊥ |∃R⊥ .Self | > nR⊥ .C| > nR.C ⊥ Con (S) ::= ¬C ⊥ |C1> u C2> |C > t C|C t C > |∀R.C > | 6 nR.C > > |∀R⊥ .C| 6 nR⊥ .C >-Locality Ax(S) ::= C ⊥ v C|C v C > |w v R> Con (S) ::= ¬C > |C ⊥ u C|C u C ⊥ |C1⊥ t C2⊥ |∃R.C ⊥ | > nR.C ⊥ ⊥ |∀R> .C ⊥ | 6 nR> .C ⊥ Con (S) ::= A> |¬C ⊥ |C1> u C2> |C > t C|C t C > |∀R.C > | > ∃R> .C > | > nR> .C > | 6 nR.C > |∀R⊥ .C| 6 nR⊥ .C In the grammar, we have that A⊥ , A> 6∈ S is an atomic concept, R⊥ , R> (resp. S ⊥ ,S > ) is either an atomic role (resp. a simple atomic role) not in S or the inverse of an atomic role (resp. of a simple atomic role) not in S, C is any con- cept, R is any role, S is any simple role, and C ⊥ ∈ Con⊥ (S), C > ∈ Con> (S). We also denote by w⊥ a role chain w = R1 ◦ . . . ◦ Rn such that for some i with 1 ≤ i ≤ n, we have that Ri is (possibly inverse of ) an atomic role not in S. An ontology O is ⊥-local (>-local) w.r.t. S if α is ⊥-local (>-local) w.r.t. S for all α ∈ O. Algorithm 3 may be used to extract either >- or ⊥-locality modules. Al- ternating the algorithm between ⊥- and >-locality module extraction until a fixed-point is reached results in ⊥>∗ modules. 3 Normal form In this section we will introduce a normal form for any SROIQ ontology. The normal form is required to facilitate the conversion process between a SROIQ ontology and a hypergraph. Definition 8. Given Bi ∈ NC \ {⊥}, Ci ∈ NC \ {>}, D ∈ {∃R.B, ≥ nR.B, ∃R.Self }, Ri , Si ∈ NR and n > 1, a SROIQ ontology O is in normal form if every axiom α ∈ O is in one of the following forms: α1 : B1 u . . . u Bn v C1 t . . . t Cm α2 : ∃R.B1 v C1 t . . . t Cm α3 : B1 u . . . u Bn v ∃R.Bn+1 α4 : B1 u . . . u Bn v ∃R.Self α5 : ∃R.Self v C1 t . . . t Cm α6 : > nR.B1 v C1 t . . . t Cm α7 : B1 u . . . u Bn v> nR.Bn+1 α8 : R1 ◦ . . . ◦ Rn v Rn+1 α9 : D1 v D2 In order to normalize a SROIQ ontology O we repeatedly apply the nor- malization rules from Table 2. Each application of a rule rewrites an axiom into an equivalent normal form. Algorithm 4 illustrates the conversion process. 7 Algorithm 4 Given any SROIQ axiom α: 1. Recursively apply rules NR7 - NR11 to eliminate all equivalences, universal restrictions, atmost restrictions and complex role fillers. 2. Given that α = (αL v αR ), recursively apply the following steps until αL contains no disjunctions and αR contains no conjunctions: (a) recursively apply rules NR1, NR3, NR6 to αL , (b) recursively apply rules NR2, NR4, NR5 to αR . 3. recursively apply any applicable rules from NR12 through NR21. Table 2. SROIQ normalization rules NR1 ¬Ĉ2 v Ĉ1 > v Ĉ1 t Ĉ2 NR2 B̂1 v ¬B̂2 B̂1 u B̂2 v ⊥ NR3 B̂ u D̂ v Ĉ B̂ u A v Ĉ, D̂ v A, A v D̂ NR4 B̂ v Ĉ t D̂ B̂ v Ĉ t A, D̂ v A, A v D̂ NR5 B̂ v Ĉ1 u Ĉ2 B̂ v Ĉ1 , B̂ v Ĉ2 NR6 B̂1 t B̂2 v Ĉ B̂1 v Ĉ, B̂2 v Ĉ NR7 . . . ∀R.Ĉ . . . . . . ¬∃R.A . . ., A u Ĉ v ⊥, > v A t Ĉ NR8 . . . ∃R.D̂ . . . . . . ∃R.A . . ., D̂ v A, A v D̂ NR9 . . . > nR.D̂ . . . . . . > nR.A . . ., D̂ v A, A v D̂ NR10 . . . 6 nR.Ĉ . . . . . . ¬(> (n + 1)R.Ĉ) . . . NR11 B̂ ≡ Ĉ B̂ v Ĉ,Ĉ v B̂ NR12 > 0R.B v Ĉ > v Ĉ NR13 B̂ v ∃R.⊥ B̂ v ⊥ NR14 B̂ v> nR.⊥ B̂ v ⊥ NR15 B̂ v> 0R.B NR16 > nR.⊥ v Ĉ NR17 ∃R.⊥ v Ĉ NR18 B̂ u ⊥ v Ĉ NR19 ⊥ v Ĉ NR20 B̂ v Ĉ t > NR21 B̂ v > Above A 6∈ NC , B̂i and Ĉi are possibly complex concept descriptions and D̂ a complex concept description. R ∈ NR , n > 0. We note that rules NR18 and NR20 makes use of the commutativity of u and t. Theorem 1. Algorithm 4 converts any SROIQ ontology O to an ontology O0 in normal form, such that O0 is a conservative extension of O. The algorithm terminates in linear time and adds at most a linear number of axioms to O. For every normalized ontology O0 the definition of syntactic locality from Definition 7 may now be simplified to that of Definition 9. This is possible since for every axiom α = (αL v αR ) ∈ O0 , ⊥-locality of α is dependent solely on αL and >-locality is dependent solely on αR . 8 Definition 9. (Normal form syntactic locality) Let S be a signature and O a normalized SROIQ ontology. Any axiom α is ⊥-local w.r.t. S (>-local w.r.t S) if α ∈ Ax(S), as defined in the grammar: ⊥-Locality Ax(S) ::= C ⊥ v C | w⊥ v R | Dis(S ⊥ , S) | Dis(S, S ⊥ ) Con (S) ::= A⊥ | C ⊥ u | C u C ⊥ | ∃R⊥ .C | ∃R.C ⊥ | ∃R⊥ .Self | ⊥ > nR⊥ .C |> nR.C ⊥ >-Locality Ax(S) ::= C v C > | w v R> Con (S) ::= A> | C > t C | C t C > |∃R> .C > |> nR> .C > | > ∃R> .Self In the grammar, we have that A⊥ , A> 6∈ S is an atomic concept, R⊥ ,R> (resp. S ⊥ ,S > ) is either an atomic role (resp. a simple atomic role) not in S or the inverse of an atomic role (resp. of a simple atomic role) not in S, C is any con- cept, R is any role, S is any simple role, and C ⊥ ∈ Con⊥ (S), C > ∈ Con> (S). We also denote by w⊥ a role chain w = R1 ◦ . . . ◦ Rn such that for some i with 1 ≤ i ≤ n, we have that Ri is (possibly inverse of ) an atomic role not in S. An ontology O is ⊥-local(>-local) w.r.t. S if α is ⊥-local(>-local) w.r.t. S for all α ∈ O. We note that we may denormalize a normalized ontology if we maintain a possibly many-to-many mapping from normalized axioms to their original source axioms. Formally, define a function denorm : Ô → 2O , with O an SROIQ ontology and Ô its normal form. S For brevity, we write denorm(Φ), with Φ a set of normalized axioms, to denote α∈Φ denorm(α). 4 SROIQ hypergraph Suntisrivaraporn [11] showed that for the DL EL+ , extracting ⊥-locality modules are equivalent to the reachability problem in directed hypergraphs. This was extended in [9, 10] to include a reachability algorithm for >-locality modules. In this section we show that a SROIQ ontology O in normal form can be mapped to a hypergraph which preserves both ⊥-locality and >-locality. Definition 10. Let α be a normalized axiom and α⊥ a minimum set of symbols from Sig(α) required to ensure that α is not ⊥-local, and let H = (V, E) be a hypergraph. We say that an edge e ∈ E preserves ⊥-locality iff α⊥ = T (e). Similarly, e ∈ E preserves >-locality whenever α> = H(e). For each normal form axiom αi in Definition 8 we show that αi may be mapped to a set of hyperedges, with nodes denoting symbols from Sig(αi ), such that both ⊥-locality and >-locality are simultaneously preserved. – Given α1 : B1 u . . . u Bn v C1 t . . . t Cm we map it to the hyperedge eα1 = ({B1 , . . . , Bn }, {C1 , . . . , Cm }). We transform the hyperedge eα1 to two new hyperedges eB F α1 = ({B1 , . . . , Bn }, {H1 }) a B-hyperedge, eα1 = 9 ({H1 }, {C1 , . . . , Cm }) an F-hyperedge and with H1 a new node. By defi- nition each Cj is B-connected to H1 if all Bi are B-connected to H1 . From Definition 9 we know that this preserves ⊥-locality for α1 since it is ⊥-local, w.r.t. a signature S, exactly when any of the conjuncts Bi 6∈ S. In other words it is non ⊥-local exactly when all Bi ∈ S. The same also holds for >- locality, since eF α1 requires every Ci ∈ α1 to be in S for H1 to be F-connected. From Definition 9 we see that, w.r.t. a signature S, eF α1 is >-local exactly when any of the disjuncts Ci 6∈ S. – Given α2 : ∃R.B1 v C1 t. . .tCm or α6 : > nR.B1 v C1 t. . .tCm we map it to the two hyperedges eB F α2/6 = ({B1 , R}, {H2 }), eα2/6 = ({H2 }, {C1 , . . . , Cm }) an F-hyperedge and with H2 a new node. This mapping preserves ⊥-locality for α2/6 since by Definition 9 it is ⊥-local, w.r.t. a signature S, exactly when either B1 or R is not in S. The argument for >-locality follows that of α1 . – Given α3 : B1 u . . . u Bn v ∃R.Bn+1 or α7 : B1 u . . . u Bn v> nR.Bn+1 we map it to the hyperedges eB F1 α3/7 = ({B1 , B2 , . . . , Bn−1 , Bn }, {H3 }), eα3/7 = ({H3 }, {Bn+1 }), eF α3/7 = ({H3 }, {R}). This mapping preserves ⊥-locality for 2 α3/7 similarly to eB α1 for α1 . From Definition 9 we know that >-locality for either of these axioms, w.r.t. a signature S, is dependent on neither R nor Bn+1 being elements of S. Therefore, they are non >-local exactly when either or both of these are in S. This is represented by the two edges eF 1 α3/7 F2 and eα3/7 for which H3 becomes F-connected exactly when either R or Bn+1 is F-connected. – Given α4 : B1 u . . . u Bn v ∃R.Self and α5 : ∃R.Self v C1 t . . . t Cm we see that ∃R.Self is both ⊥ or > local exactly when R 6∈ S. Therefore we map α4 to the hyperedge eB α4 = ({R}, {C1 , . . . , Cm }), and α5 to the hyperedge eF α5 = ({B 1 , . . . , B n }, {R}). – Given α8 : R1 ◦ . . . ◦ Rn v Rn+1 , we see that α8 is ⊥-local exactly when any Ri 6∈ S, i ≤ n and is >-local exactly when Rn+1 6∈ S. We therefore map α8 to the hyperedge eB α8 = ({R1 , . . . , Rn }, {Rn+1 }). – For α9 we have many forms, all variants of those discussed in the pre- vious mappings. Therefore α9 is mapped to any of the following: eB α9 = 1 ({R, B1 }, {H9 }}), eF α9 1 = ({H 9 }, {R}), e F2 α9 = ({H 9 }, {B}), or e 1 α9 = ({R, B 1 }, {R}), or eF α9 1 = ({R 1 }, {R 2 }), e F2 α9 = ({R 1 }, {B}), or e 1 α9 = ({R 1 }, {R 2 }). Given a SROIQ ontology O in normal form we may now map every axiom α ∈ O to its equivalent set of hyperedges. For each of these mappings there are at most three hyperedges introduced, therefore mapping the whole ontology O to an equivalent hypergraph HO will result in a hypergraph with the number of edges at most linear in the number of axioms in O. It is easy to show that the mapping process can be completed in linear time in the number of axioms in O. We note that, similar to the normalization process, we may maintain a pos- sibly many-to-many mapping from normalized axioms to their associated hy- peredges. Formally, define a function deedge : HO → 2O , with O a SROIQ ontology and HO its hypergraph. S For brevity, we write deedge(Φ), with Φ a set of hyperedges, to denote e∈Φ deedge(e). 10 5 Hypergraph module extraction In this section we show that, given a hypergraph HO for a SROIQ ontology O, we may extract a frontier graph from HO which is a subset of a ⊥>∗ module. We show that some of these modules guarantee safety, module coverage and module independence. The hypergraph algorithms presented require one start node s and a destination node t. In order to extend these algorithms to work with an arbitrary signature S, we introduce a new node s with with an edge esi = (s, si ) for each si ∈ S ∪ >, as well as a new node t with an edge eti = (si , t) for each si ∈ S ∪ ⊥. Theorem 2. Let O be a SROIQ ontology and HO its associated hypergraph B and S a signature. Algorithm 1 - Bvisit extracts a set of B-hyperpaths HO corresponding to the ⊥-locality module for S in O. Therefore, these modules also guarantees safety, module coverage and module independence. Theorem 3. Let O be a SROIQ ontology and HO its associated hypergraph F and S a signature. Algorithm 1 - F visit extracts a set of F -hyperpaths HO corresponding to a subset of the >-locality module for S in O. Theorem 4. Let O be a SROIQ ontology and HO its associated hypergraph BF and S a signature. Algorithm 2 extracts a frontier graph HO corresponding to ∗ a subset of the ⊥> -locality module for S in O. The module extracted in Theorem 3 is a subset of the >-locality module for a given seed signature. It is as yet unclear whether or not these modules provide all the model-theoretic properties associated with >-locality modules. However, from the previous work done for the DL EL+ [10], it is evident that these modules preserve all entailments for a given seed signature S. Further, they also preserve and contain all justifications for any given entailment. Similarly, the exact module theoretic properties of modules associated with frontier graphs is something we are currently looking into. 6 Conclusion We have introduced a normal form for any SROIQ ontology, as well as the necessary algorithms in order to map any SROIQ ontology to a syntactic local- ity preserving hypergraph. This mapping process can be accomplished in linear time in the number of axioms with at most a linear increase in the number of hyperedges in the hypergraph. Standard path searching algorithms for hypergraphs may now be used to find: (1) sets of B-hyperpaths — this is equivalent to finding ⊥-syntactical locality modules; (2) sets of F -hyperpaths — these are subsets of >-locality modules, and (3) frontier graphs — these are subsets of ⊥>∗ modules. Whilst the modules associated with B-hyperpaths share all the module theoretic properties of ⊥- locality modules, it is unclear at this point which module-theoretic properties modules associated with F -hyperpaths and frontier graphs possess. 11 The ability to map SROIQ ontologies to hypergraphs, such that hyperedges preserve syntactic locality conditions, allows us to investigate the relationship between DL reasoning algorithms and the vast body of standard hypergraph algorithms in greater depth. Our primary focus for future research is to investigate and define the module- theoretic properties of modules associated with F -hyperpaths and frontier graphs as well as their relative performance with respect to existing locality methods. Thereafter, we aim to expand our research and investigate other hypergraph algorithms and how they may be applied to DL reasoning problems. References 1. Cuenca Grau, B., Halaschek-Wiener, C., Kazakov, Y., Suntisrivaraporn, B.: Incre- mental classification of description logic ontologies. Tech. rep. (2012) 2. Cuenca Grau, B., Horrocks, I., Kazakov, Y., Sattler, U.: Just the right amount: extracting modules from ontologies. In: Williamson, C., Zurko, M. (eds.) Proceed- ings of the 16th International Conference on World Wide Web (WWW ’07). pp. 717–726. ACM, New York, NY, USA (2007) 3. Cuenca Grau, B., Horrocks, I., Kazakov, Y., Sattler, U.: Modular reuse of ontolo- gies: Theory and practice. Journal of Artificial Intelligence Research (JAIR) 31, 273–318 (2008) 4. Del Vescovo, C., Parsia, B., Sattler, U., Schneider, T.: The modular structure of an ontology: atomic decomposition and module count. In: O. Kutz, T.S. (ed.) Proc. of WoMO-11. Frontiers in AI and Appl., vol. 230, pp. 25–39. IOS Press (2011) 5. Horrocks, I., Kutz, O., Sattler, U.: The even more irresistable SROIQ. In: Do- herty, P., Mylopoulos, J., Welty, C. (eds.) Proceedings of the Tenth International Conference on Princleples of Knowledge Representation and Reasoning. pp. 57–67. AAAI Press (2006) 6. Jiménez-Ruiz, E., Cuenca Grau, B., Sattler, U., Schneider, T., Berlanga, R.: Safe and economic re-use of ontologies: A logic-based methodology and tool support. In: Proceedings of ESWC-08. vol. 5021 of LNCS, pp. 185–199 (2008) 7. Maier, F., Ma, Y., Hitzler, P.: Paraconsistent OWL and related logics. In: Janowicz, K. (ed.) Semantic Web 2012. pp. 1–33. IOS Press (2012) 8. Nguyen, S., Pretolani, D., Markenzon, L.: On some path problems on oriented hypergraphs. Theoretical Informatics and Applications 32(1-2-3), 1–20 (1998) 9. Nortjé, R.: Module extraction for inexpressive description logics. Master’s thesis, University of South Africa (2011) 10. Nortjé, R., Britz, K., Meyer, T.: Bidirectional reachability-based modules. In: Pro- ceedings of the 2011 International Workshop on Description Logics (DL2011). CEUR Workshop Proceedings, CEUR-WS (2011), http://ceur-ws.org/ 11. Suntisrivaraporn, B.: Polynomial-Time Reasoning Support for Design and Main- tenance of Large-Scale Biomedical Ontologies. Ph.D. thesis, Technical University of Dresden (2009) 12. Thakur, M., Tripathi, R.: Complexity of linear connectivity problems in directed hypergraphs. Linear Connectivity Conference pp. 1–12 (2001) 12