Introduction

From Datalog Reasoning to Modular Structure of an Ontology

Changlong Wang

0 1

Zhiyong Feng

Guozheng Rao

Xin Wang

Xiaowang Zhang

1 0 School of Computer Science and Engineer Technology , NWNU,Lanzhou 730070 , China 1 School of Computer Science and Technology, Tianjin University , Tianjin 300072 , China

We propose a novel approach to modularly structuring an ontology. First, a ontology TBox is transformed into a datalog program. Second, a tree, which represents forward-chaining proof and contains information about modules, is mapped to a digraph, which allows us to extract intended ontology module via breadth-first search. Finally, we develop an algorithm to compute the modular structure of an ontology and take an example to illustrate how our strategy might work in practice.

Ontology Modular structure Datalog Digraph

Introduction

Ontologies, often treated as monolithic objects, su er from an inherent lack of structure [ 1 ]. To address this problem, atomic decomposition(AD) [ 2 ] is proposed to represent the modular structure of an ontology. This approach exploits locality-based module [ 3 ] to induce a modular structure and allows users to explore the entire ontology in a sensible manner so that we can find appropriate module. Atomic decomposition, however, is intimately tied to locality-based modules and thus su ers from their limitations [ 1 ]: it is bound by axiom structure and the extracted modules are often big in size. The recently proposed module extraction technique [ 4 ], in which module extraction is reduced to reasoning problem in datalog, allows us to extract smaller modules. Furthermore, those trees used in representing forward-chaining proofs in datalog can bring some information about ontology structure. Those distinguishing features stimulate us to further observe modular structure of an ontology by applying datalog reasoning. Ontologies and Datalog Reasoning The ontology axioms are formalised as rules: function-free sentences of the form ∀x.[φ(x) → ∃y.[∨ii==n1 ψi(x, y)]], where φ,ψ are conjunctions of distinct atoms. A signature Σ is a set of predicates and S ig(F) denotes the signature of a formula F. A TBox T B is a finite set of rules. A fact γ is a function-free ground atom. An ABox AB is a finite set of facts. An ABox with predicates from the signature Σ is denoted by Σ-ABox. Given a datalog program P and ABox AB, the materialisation of P ∪ AB can be computed by forward-chaining. A proof of γ in P ∪ AB is pair ρ = (T, λ) where T is a tree, λ is a mapping from nodes in T to facts, and from edges in T to rules in P, such that for each node v the followings hold: (1) λ(v) = γ if v is the root of T ; (2) λ(v) ∈ AB if v is a leaf; and (3) if v has children w1, ..., wn then each edge from v to wi is labelled by the same rule r and λ(v) is a consequence of r and λ(w1), ..., λ(wn). In this paper, we call T a proof tree.

Module Extraction via Datalog Reasoning The module extraction approach proposed in [ 4 ] consists of three steps: (1) Pick a substitution θ mapping all existentially quantified variables in T B to constants and then transform T B into a datalog program P. (2) Pick a Σ-ABox AB0 and materialise P ∪ AB0. (3) Pick a set ABr of “relevant facts” in the materialisation and compute the supporting rules in P for each such fact. Consequently, the module M consists of all rules in T B that yield some supporting rule in P and is fully determined by the substitution θ and the ABox AB0 and ABr. Intuitively, AB0 determines the module ”topic” and ABr determines which rules should be contained in M. Formally, this approach is defined as follows:

Definition 1 [ 4 ]. A module setting for T B and Σ is a tuple χ = (θ, AB0, ABr) with θ a substitution from existentially quantified variables in T B to constants, AB0 a Σ-ABox, ABr a S ig(T B)-ABox, and such that no constant in χ occurs in T B.

The program of χ is the smallest program Pχ containing, for each r = φ(x) → ∃y.[∨ii==n1 ψi(x, y)] in T B, the rule φ → ⊥ if n = 0 and all rules φ → γθ for each 1 ≤ i ≤ n and each atom γ in ψi. The support of χ is the set of rules r ∈ Pχ that support a fact from ABr in Pχ ∪ AB0. The module Mχ of χ is the set of rules in T B that have a corresponding datalog rule in the support of χ. ♢

Definition 1 shows that there exists some relationship between rules in Pχ which make it feasible in characterizing the modular structure of an ontology by datalog reasoning:

Proposition 1. Given two datalog rules dr1 and dr2 in a proof tree T , if the the head of dr2 is the body of dr1, then, for ever module Mχ, dr2 ∈ Mχ implies dr1 ∈ Mχ 3

From Proof Tree to Modular Structure of an Ontology

For the materialisation of AB0 and P, we need to construct proof trees for each fact entailed by AB0 ∪ P. Consequently, the module Mχ for χ = (θ, AB0, ABr) has one or more corresponding proof trees.

Definition 2. Given χ = (θ, AB0, ABr), a module Mχ is called single tree module (STM) if it has one corresponding proof tree, A module is called multiple trees module (MTM) if it has more than one corresponding proof trees.

In this paper, we mainly discuss STM. We use TM to denote the proof tree T for a module M, and use MT to denote the module M generated from a proof tree T . Clearly, a module Mχ for χ = (θ, AB0, ABr) is STM, if there exists only one fact γ ∈ ABr whose support contains supporting rules of any other fact β ∈ ABr.

Definition 3 [ 2 ]. A module M is called fake if there exist two incomparable (w.r.t. set inclusion) modules M1 , M2, such that M1 ∪ M2 = M; a module is called genuine if it is not fake.

Proposition 2. A STM is a genuine module.

Definition 4. Let T be a tree, a subtree S T in T is called breadth-first search tree (BFST), if S T is generated by breadth-first search starting from the root in T .

From Datalog Reasoning to Modular Structure of an Ontology

Theorem 1. Let TMχ be a proof tree for the genuine module Mχ and S T be a BFST in TMχ , the set of rules in T B that have a corresponding datalog rule in S T is a χ−module for χ = (θ, AB′0, AB′r), where AB′0 consists of the facts that have corresponding leaves in S T and AB′r consists of the single fact that corresponds to the root of S T .

Proof. From Definition 1 and the procedure of module extraction in [ 4 ], AB′0 can be picked as the initial ABox AB0, and AB′r as the relevant fact set ABr, the only fact in ABr is the root of S T and its support contains supporting rules of other inner node(corresponding to a fact in Pχ ∪ AB′0) in S T . Hence, the set of rules in T B that have a corresponding datalog rule in S T is a χ−module.

Theorem 1 shows that a module can be extracted from a proof tree T by breadth-first search. In order to represent the modular structure of original ontology, we map each proof tree T to a digraph DG by: (1) If rule r in T B has a corresponding datalog rule dr in T , DG contains node r; (2) For two node r1 and r2 in DG and their corresponding datalog rules dr1 and dr2 in T , if the body of dr1 is the head of dr2, there exist a edge from node r1 to r2 in DG.

The mapping from datalog proof tree to digraph has no impact on the dependent relationship between rules. In other words, we can obtain the same module from datalog proof tree and its corresponding digraph. Those obtained digraphs exactly capture the modular structure of an ontology. 4

Computing the Modular Structure of an Ontology

From previous discussion, it is feasible to design an algorithm to compute the modular structure of an ontology TBox T B. In the algorithm, we need construct some proof trees, such that each rule in T B have a corresponding datalog rule in those trees. Algorithm 1 outlines our approach to computing the modular structure of an ontology. In Algorithm 1, we can pick a general substitution θ and transform T B into a datalog program P, and pick S ig(T B)-ABox as the initial ABox. From Definition 2, Definition 4, and Definition 8 in [ 4 ], the modular structure in this paper can be induced on Σ-implication inseparable, Σ-fact inseparable, and Σ-model inseparable module.

To conclude, we illustrate our approach by an ontology TBox T B containing 6 axioms: α1 : A ⊑ ∃R.B, α2 : A ⊑ ∃R.C, α3 : B ⊓ C ⊑ D, α4 : D ⊑ ∃S .E, α5 : D ⊑ ∀S .F, α6 : ∃S (E ⊓ F) ⊑ G. Those axioms are formalised as rules as follows, where the three rules r1,r2, and r4 have two corresponding datalog rules, respectively. The mapping from proof trees to modular structure is shown in Figure 1.

Algorithm 1 Computing modular structure of an ontology 1: Input: a TBox T B 2: Output: the modular structure of T B — a set of digraphs 3: map T B to a datalog program P 4: use a S ig(T B)-ABox as AB0 and build proof trees for each fact in the materialisation of

AB0 ∪ P 5: map each proof tree to a digraph DG; 6: return the set of DG r1 : A(x) → ∃y1[R(x, y1) ∧ B(y1) r2 : A(x) → ∃y2[R(x, y2) ∧ C(y2) r3 : B(x) ∧ C(x) → D(x) r4 : D(x) → ∃y3[S (x, y3) ∧ E(y3) r5 : D(x) ∧ S (x, y) → F(y) r6 : S (x, y) ∧ E(y) ∧ F(y) → G(x) ⇒ ⇒ ⇒ dr1′ : A(x) → R(x, f (x)), dr2′ : A(x) → R(x, f (x)), dr1′′ : A(x) → B(c) dr2′′ : A(x) → C(c) dr4′ : D(x) → S (x, f (x)), dr4′′ : D(x) → E(c) 5

Conclusion and Future Work

We have proposed an approach to modular structure of ontology by mapping datalog proof tree to digraph. In this modular structure, ontology module can be computed by the standard algorithm of breadth-first search. Our current work is preliminary, we will implement the proposed approach and evaluate it on real ontologies in the future. Acknowledgments. This work is supported by the NSFC (61100049, 61373165) and the 863 Program of China (2013AA013204)

Chiara

Del Vescovo.: The Modular Structure of an Ontology: Atomic Decomposition and its applications . PhD thesis , The University of Manchester, 2013 .

Chiara

Del Vescovo ,

Bijan

Parsia , Ulrike Sattler, Thomas Schneider: The modular structure of an ontology: Atomic decomposition . In IJCAI , 2232 - 2237 , 2011 .

Bernardo

Cuenca Grau , Ian Horrocks, Yevgeny Kazakov, Ulrike Sattler: Modular reuse of ontologies: Theory and practice . J. of Artificial Intelligence Research . 31 , 273 - 318 , 2008 .

Ana

Armas Romero , Mark Kaminski,Bernardo Cuenca Grau, Ian Horrocks: Ontology Module Extraction via Datalog Reasoning . In AAAI, 1410 - 1416 , 2015 .