<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>From Datalog Reasoning to Modular Structure of an Ontology</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Changlong Wang</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Zhiyong Feng</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Guozheng Rao</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Xin Wang</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Xiaowang Zhang</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>School of Computer Science and Engineer Technology</institution>
          ,
          <addr-line>NWNU,Lanzhou 730070</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>School of Computer Science and Technology, Tianjin University</institution>
          ,
          <addr-line>Tianjin 300072</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>We propose a novel approach to modularly structuring an ontology. First, a ontology TBox is transformed into a datalog program. Second, a tree, which represents forward-chaining proof and contains information about modules, is mapped to a digraph, which allows us to extract intended ontology module via breadth-first search. Finally, we develop an algorithm to compute the modular structure of an ontology and take an example to illustrate how our strategy might work in practice.</p>
      </abstract>
      <kwd-group>
        <kwd>Ontology</kwd>
        <kwd>Modular structure</kwd>
        <kwd>Datalog</kwd>
        <kwd>Digraph</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Ontologies, often treated as monolithic objects, su er from an inherent lack of structure
[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. To address this problem, atomic decomposition(AD) [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] is proposed to represent
the modular structure of an ontology. This approach exploits locality-based module [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]
to induce a modular structure and allows users to explore the entire ontology in a
sensible manner so that we can find appropriate module. Atomic decomposition, however, is
intimately tied to locality-based modules and thus su ers from their limitations [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]: it is
bound by axiom structure and the extracted modules are often big in size. The recently
proposed module extraction technique [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], in which module extraction is reduced to
reasoning problem in datalog, allows us to extract smaller modules. Furthermore, those
trees used in representing forward-chaining proofs in datalog can bring some
information about ontology structure. Those distinguishing features stimulate us to further
observe modular structure of an ontology by applying datalog reasoning.
Ontologies and Datalog Reasoning The ontology axioms are formalised as rules:
function-free sentences of the form ∀x.[φ(x) → ∃y.[∨ii==n1 ψi(x, y)]], where φ,ψ are
conjunctions of distinct atoms. A signature Σ is a set of predicates and S ig(F) denotes the
signature of a formula F. A TBox T B is a finite set of rules. A fact γ is a function-free
ground atom. An ABox AB is a finite set of facts. An ABox with predicates from the
signature Σ is denoted by Σ-ABox. Given a datalog program P and ABox AB, the
materialisation of P ∪ AB can be computed by forward-chaining. A proof of γ in P ∪ AB
is pair ρ = (T, λ) where T is a tree, λ is a mapping from nodes in T to facts, and from
edges in T to rules in P, such that for each node v the followings hold: (1) λ(v) = γ if
v is the root of T ; (2) λ(v) ∈ AB if v is a leaf; and (3) if v has children w1, ..., wn then
each edge from v to wi is labelled by the same rule r and λ(v) is a consequence of r and
λ(w1), ..., λ(wn). In this paper, we call T a proof tree.
      </p>
      <p>
        Module Extraction via Datalog Reasoning The module extraction approach
proposed in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] consists of three steps: (1) Pick a substitution θ mapping all existentially
quantified variables in T B to constants and then transform T B into a datalog program
P. (2) Pick a Σ-ABox AB0 and materialise P ∪ AB0. (3) Pick a set ABr of “relevant
facts” in the materialisation and compute the supporting rules in P for each such fact.
Consequently, the module M consists of all rules in T B that yield some supporting rule
in P and is fully determined by the substitution θ and the ABox AB0 and ABr.
Intuitively, AB0 determines the module ”topic” and ABr determines which rules should be
contained in M. Formally, this approach is defined as follows:
      </p>
      <p>
        Definition 1 [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. A module setting for T B and Σ is a tuple χ = (θ, AB0, ABr) with θ
a substitution from existentially quantified variables in T B to constants, AB0 a Σ-ABox,
ABr a S ig(T B)-ABox, and such that no constant in χ occurs in T B.
      </p>
      <p>The program of χ is the smallest program Pχ containing, for each r = φ(x) →
∃y.[∨ii==n1 ψi(x, y)] in T B, the rule φ → ⊥ if n = 0 and all rules φ → γθ for each
1 ≤ i ≤ n and each atom γ in ψi. The support of χ is the set of rules r ∈ Pχ that support
a fact from ABr in Pχ ∪ AB0. The module Mχ of χ is the set of rules in T B that have a
corresponding datalog rule in the support of χ. ♢</p>
      <p>Definition 1 shows that there exists some relationship between rules in Pχ which
make it feasible in characterizing the modular structure of an ontology by datalog
reasoning:</p>
      <p>Proposition 1. Given two datalog rules dr1 and dr2 in a proof tree T , if the the head
of dr2 is the body of dr1, then, for ever module Mχ, dr2 ∈ Mχ implies dr1 ∈ Mχ
3</p>
    </sec>
    <sec id="sec-2">
      <title>From Proof Tree to Modular Structure of an Ontology</title>
      <p>For the materialisation of AB0 and P, we need to construct proof trees for each fact
entailed by AB0 ∪ P. Consequently, the module Mχ for χ = (θ, AB0, ABr) has one or
more corresponding proof trees.</p>
      <p>Definition 2. Given χ = (θ, AB0, ABr), a module Mχ is called single tree module
(STM) if it has one corresponding proof tree, A module is called multiple trees module
(MTM) if it has more than one corresponding proof trees.</p>
      <p>In this paper, we mainly discuss STM. We use TM to denote the proof tree T for a
module M, and use MT to denote the module M generated from a proof tree T . Clearly,
a module Mχ for χ = (θ, AB0, ABr) is STM, if there exists only one fact γ ∈ ABr whose
support contains supporting rules of any other fact β ∈ ABr.</p>
      <p>
        Definition 3 [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. A module M is called fake if there exist two incomparable (w.r.t.
set inclusion) modules M1 , M2, such that M1 ∪ M2 = M; a module is called genuine
if it is not fake.
      </p>
      <p>Proposition 2. A STM is a genuine module.</p>
      <p>Definition 4. Let T be a tree, a subtree S T in T is called breadth-first search tree
(BFST), if S T is generated by breadth-first search starting from the root in T .</p>
      <p>From Datalog Reasoning to Modular Structure of an Ontology</p>
      <p>Theorem 1. Let TMχ be a proof tree for the genuine module Mχ and S T be a BFST
in TMχ , the set of rules in T B that have a corresponding datalog rule in S T is a χ−module
for χ = (θ, AB′0, AB′r), where AB′0 consists of the facts that have corresponding leaves in
S T and AB′r consists of the single fact that corresponds to the root of S T .</p>
      <p>
        Proof. From Definition 1 and the procedure of module extraction in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], AB′0 can
be picked as the initial ABox AB0, and AB′r as the relevant fact set ABr, the only fact
in ABr is the root of S T and its support contains supporting rules of other inner
node(corresponding to a fact in Pχ ∪ AB′0) in S T . Hence, the set of rules in T B that have
a corresponding datalog rule in S T is a χ−module.
      </p>
      <p>Theorem 1 shows that a module can be extracted from a proof tree T by breadth-first
search. In order to represent the modular structure of original ontology, we map each
proof tree T to a digraph DG by: (1) If rule r in T B has a corresponding datalog rule
dr in T , DG contains node r; (2) For two node r1 and r2 in DG and their corresponding
datalog rules dr1 and dr2 in T , if the body of dr1 is the head of dr2, there exist a edge
from node r1 to r2 in DG.</p>
      <p>The mapping from datalog proof tree to digraph has no impact on the dependent
relationship between rules. In other words, we can obtain the same module from datalog
proof tree and its corresponding digraph. Those obtained digraphs exactly capture the
modular structure of an ontology.
4</p>
    </sec>
    <sec id="sec-3">
      <title>Computing the Modular Structure of an Ontology</title>
      <p>
        From previous discussion, it is feasible to design an algorithm to compute the
modular structure of an ontology TBox T B. In the algorithm, we need construct some proof
trees, such that each rule in T B have a corresponding datalog rule in those trees.
Algorithm 1 outlines our approach to computing the modular structure of an ontology.
In Algorithm 1, we can pick a general substitution θ and transform T B into a datalog
program P, and pick S ig(T B)-ABox as the initial ABox. From Definition 2,
Definition 4, and Definition 8 in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], the modular structure in this paper can be induced on
Σ-implication inseparable, Σ-fact inseparable, and Σ-model inseparable module.
      </p>
      <p>To conclude, we illustrate our approach by an ontology TBox T B containing 6
axioms: α1 : A ⊑ ∃R.B, α2 : A ⊑ ∃R.C, α3 : B ⊓ C ⊑ D, α4 : D ⊑ ∃S .E, α5 : D ⊑ ∀S .F,
α6 : ∃S (E ⊓ F) ⊑ G. Those axioms are formalised as rules as follows, where the three
rules r1,r2, and r4 have two corresponding datalog rules, respectively. The mapping
from proof trees to modular structure is shown in Figure 1.</p>
      <p>Algorithm 1 Computing modular structure of an ontology
1: Input: a TBox T B
2: Output: the modular structure of T B — a set of digraphs
3: map T B to a datalog program P
4: use a S ig(T B)-ABox as AB0 and build proof trees for each fact in the materialisation of</p>
      <p>AB0 ∪ P
5: map each proof tree to a digraph DG;
6: return the set of DG
r1 : A(x) → ∃y1[R(x, y1) ∧ B(y1)
r2 : A(x) → ∃y2[R(x, y2) ∧ C(y2)
r3 : B(x) ∧ C(x) → D(x)
r4 : D(x) → ∃y3[S (x, y3) ∧ E(y3)
r5 : D(x) ∧ S (x, y) → F(y)
r6 : S (x, y) ∧ E(y) ∧ F(y) → G(x)
⇒
⇒
⇒
dr1′ : A(x) → R(x, f (x)),
dr2′ : A(x) → R(x, f (x)),
dr1′′ : A(x) → B(c)
dr2′′ : A(x) → C(c)
dr4′ : D(x) → S (x, f (x)),
dr4′′ : D(x) → E(c)
5</p>
    </sec>
    <sec id="sec-4">
      <title>Conclusion and Future Work</title>
      <p>We have proposed an approach to modular structure of ontology by mapping datalog
proof tree to digraph. In this modular structure, ontology module can be computed by
the standard algorithm of breadth-first search. Our current work is preliminary, we will
implement the proposed approach and evaluate it on real ontologies in the future.
Acknowledgments. This work is supported by the NSFC (61100049, 61373165) and
the 863 Program of China (2013AA013204)</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>Chiara</given-names>
            <surname>Del</surname>
          </string-name>
          <article-title>Vescovo.: The Modular Structure of an Ontology: Atomic Decomposition and its applications</article-title>
          .
          <source>PhD thesis</source>
          , The University of Manchester,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>Chiara</given-names>
            <surname>Del Vescovo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Bijan</given-names>
            <surname>Parsia</surname>
          </string-name>
          , Ulrike Sattler, Thomas Schneider:
          <article-title>The modular structure of an ontology: Atomic decomposition</article-title>
          .
          <source>In IJCAI</source>
          ,
          <fpage>2232</fpage>
          -
          <lpage>2237</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>Bernardo</given-names>
            <surname>Cuenca</surname>
          </string-name>
          <string-name>
            <surname>Grau</surname>
          </string-name>
          , Ian Horrocks, Yevgeny Kazakov, Ulrike Sattler:
          <article-title>Modular reuse of ontologies: Theory and practice</article-title>
          .
          <source>J. of Artificial Intelligence Research</source>
          .
          <volume>31</volume>
          ,
          <fpage>273</fpage>
          -
          <lpage>318</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>Ana</given-names>
            <surname>Armas</surname>
          </string-name>
          <string-name>
            <surname>Romero</surname>
          </string-name>
          , Mark Kaminski,Bernardo Cuenca Grau, Ian Horrocks:
          <article-title>Ontology Module Extraction via Datalog Reasoning</article-title>
          . In AAAI,
          <fpage>1410</fpage>
          -
          <lpage>1416</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>