=Paper=
{{Paper
|id=None
|storemode=property
|title=Structure Formation to Modularize Ontologies
|pdfUrl=https://ceur-ws.org/Vol-875/regular_paper_1.pdf
|volume=Vol-875
|dblpUrl=https://dblp.org/rec/conf/womo/AutexierH12
}}
==Structure Formation to Modularize Ontologies==
<pdf width="1500px">https://ceur-ws.org/Vol-875/regular_paper_1.pdf</pdf>
<pre>
    Structure Formation to Modularize Ontologies

                         Serge Autexier and Dieter Hutter ?

                   German Research Center for Artificial Intelligence
                      Bibliotheksstr. 1, 28359 Bremen, Germany
                             {autexier|hutter}@dfki.de

        Abstract. It has been well recognized that the structuring of logic-
        based databases like formal ontologies is an important prerequisite to
        allow for an efficient reasoning on such databases. Development graphs
        have been introduced as a formal basis to represent and reason on struc-
        tured specifications. In this paper we present an initial methodology and
        a formal calculus to transform unstructured specifications into structured
        ones. The calculus rules operate on development graphs allowing one to
        separate specifications coalesced in one theory into a concisely structured
        graph.


1     Introduction

It has been long recognized that the modularity of specifications is an indis-
pensable prerequisite for an efficient reasoning in complex domains. Algebraic
specification techniques provide frameworks for structuring complex specifica-
tions and the authors introduced the notion of an development graph [4, 1, 6] as
a technical means to work with and reason about such structured specifications.
Typically ontologies are large and, even if structured, bear the problem of in-
consistencies as any large set of axioms. For instance, the SUMO ontology [7]
turned out to be inconsistent [10]. Recently Kurz and Mossakowski presented
an approach [5] to prove the consistency of an ontology in a modular way using
the (structured) architectural specification in CASL and provide mechanisms to
compose the models of the individual components to a global one. Furthermore,
there has been work [9] to (re-)structure ontologies following locality criteria into
modules which cover all aspects about specific concepts. However, since ontology
languages have simple imports without renaming, duplicated sub-ontologies that
are for instance equal up to renaming remain hidden, thus making the ontolo-
gies unnecessarily large, not only from a modeling point of view, but also from
a verification point of view as the same derived properties must be derived over
an over again.
    In this paper we present further steps towards the support for structure for-
mation in large specifications that additionally allows to factorize equivalent
sub-specification, i.e. sub-ontologies. The idea is to provide a calculus and a
corresponding methodology to crystallize intrinsic structures hidden in a speci-
fication and represent them explicitly in terms of development graphs. We start
?
    This work was funded by the German Federal Ministry of Education and Research
    under grants 01 IS 11013B and 01 IW 10002 (projects SIMPLE and SHIP)
2         Serge Autexier and Dieter Hutter

with a trivial development graph consisting of a single node that contains the
entire specification. Step by step, the specification is split to different nodes in
the development graph resulting in an increasingly richer graph. On the oppo-
site, common concepts that are scattered in different specifications are identified
and unified in a common theory (i.e. node).


2     Prerequisites

We base our framework on the notions of development graphs to specify and rea-
son about structured specifications. Development graphs D are acyclic, directed
graphs hN , Li, the nodes N denote individual theories and the links L indicate
theory inclusions with respect to signature morphisms attached to the links.
     This approach is based on the notion of institutions [3] (SignI , SenI , ModI ,
|=I,Σ ), where (i) Sign is a category of signatures, (ii) Sen : Sign −→ Set
is a functor giving the set of sentences Sen(Σ) over each signature Σ, and
for each signature morphism σ : Σ −→ Σ 0 , the sentence translation function
Sen(σ) : Sen(Σ) −→ Sen(Σ 0 ), where often Sen(σ)(ϕ) is written as σ(ϕ),
(iii) Mod : Signop −→ CAT is a functor giving the category of models over a
given signature, and for each signature morphism σ : Σ −→ Σ 0 , the reduct func-
tor Mod(σ) : Mod(Σ 0 ) −→ Mod(Σ), where often Mod(σ)(M 0 ) is written as
M 0 |σ , (iv) |=Σ ⊆ |Mod(Σ)| × Sen(Σ) for each Σ ∈ |Sign| is a satisfaction re-
lation,1 such that for each σ : Σ −→ Σ 0 in Sign, M 0 |=Σ 0 σ(ϕ) ⇔ M 0 |σ |=Σ ϕ
holds for each M 0 ∈ Mod(Σ 0 ) and ϕ ∈ Sen(Σ) (satisfaction condition).
     However, the abstractness of the signatures in the definition of institution
(they are just a category) makes it difficult to cope with modularization of signa-
tures. Hence, we equip institutions with an additional structure such that signa-
tures behave more set-like. Institutions with pre-signatures [2] are defined as in-
stitutions equipped
                 S       with an embedding | | : Sign → Set, the symbol functor, and
a map sym : Σ∈|Sign| Sen(Σ) → |Set|, such that ϕ ∈ Sen(Σ) iff sym(ϕ) ⊆ |Σ|
                S
for all ϕ ∈ Σ∈|Sign| Sen(Σ). The map sym gives the set of symbols used in
a sentence, and sentences are uniform in the sense that a well-formed sentence
is well-formed over a certain signature iff its symbols belong to that signature.
Pre-signatures are sets and pre-signature morphisms σ̄ mapping pre-signatures
are related to corresponding signature morphisms σ.
     Given an I-institution with pre-signatures, each node N ∈ N of the graph
is a tuple (sigN , axN , lemN ) such that sigN is a I-pre-signature called the local
signature of N , axN a set of I-sentences called the local axioms of N , and lemN
a set of I-sentences called the local lemmas of N . L is a set of global definition
               σ̄ +3
links M              N. Such a link imports the mapped theory of M (by the pre-
signature σ̄) as part of the theory of N . Thus we obtain corresponding notions
of global (pre-)signatures, axioms and lemmata that are defined inductively as
follows:
1
    |C| is the class of objects of a category C.
                               Structure Formation to Modularize Ontologies         3

 1. SigD (N ) = sigN ∪
                       S
                                σ̄ +3
                                           σ̄(SigD (M ))
                         M            N ∈S
 2. AxD (N ) = axN ∪
                      S
                               σ̄ +3
                                           σ(AxD (M ))
                        M            N ∈D
 3. LemD (N ) = lemN ∪
                         S
                                  σ̄ +3
                                              σ(LemD (M ))
                           M            N ∈D
A node N ∈ N is globally reachable from a node M ∈ N via a pre-signature
morphism σ̄, D ` M ? _
                        σ̄ +3
                              N for short, iff 1. either M = N and σ̄ = id is the
                                                 0                             00
identity pre-signature morphism, or 2. M
                                                σ̄   +3 K ∈ L, and D ` K ? _ σ̄ +3 N ,
              00   0
with σ̄ = σ̄ ◦ σ̄ .
    The maximal nodes (root nodes) dDe of a graph D are all nodes without
outgoing links. Dom D (N ) := SigD (N ) ∪ AxD (N ) ∪ LemD (N ) is the set of all
signature symbols, axioms and lemmata visible in a node N . The local domain
of N , dom N := sigN ∪ axN ∪ lemN is the set of all local signature symbols,
axioms and lemmata of N . The imported domain Imports D (N ) of N in D is the
set of all signature
                  S symbols, axioms and lemmas imported via incoming definition
links. Dom D = N ∈N Dom D (N ) is the set of all signature symbols, axioms and
lemmata occurring
                S      in D. Analogously we define SigD , AxD , LemD , and AssD .
Dom dDe = N ∈dDe Dom D (N ) is the set of all signature symbols, axioms and
lemmata occurring in the maximal nodes of D.
    A node N has a well-formed signature iff SigD (N ) is a valid I N -signature.
A development graph has a well-formed signature iff all its nodes have well-
formed signatures. Sigloc D (M ) := hsig
                                        M
                                          ∪ sym(axM ) ∪ sym(lemM )iSigD (M ) is the
local signature of N . A node N is well-formed iff it has a well-formed signature
SigD (N ) and AxD (N ), LemD (N ) ⊆ Sen(SigD (N )). A development graph is well-
formed, if all its nodes are well-formed.
    Given a node N ∈ N with well-formed signature, its associated class
ModD (N ) of models (or N -models for short) consists of those SigD (N )-models
                                                                       σ̄ +3
n for which (i) n satisfies the local axioms axN , and (ii) for each K       N ∈ S,
n|σ is a K-model. In the following we denote the class of Σ-models that fulfill
the Σ-sentences Ψ by ModΣ (Ψ ). A well-formed development graph D := hN , Li
is valid iff for all nodes N ∈ N ModD (N ) |= lemN .
    Given a signature Σ and Ax, Lem ⊆ Sen(Σ), a support mapping Supp for
Ax and Lem assigns each lemma ϕ ∈ Lem a subset H ⊆ Ax ∪ Lem such that
(i) Modhsym(H)∪sym(ϕ)iΣ (H) |= ϕ (ii) The relation @⊆ (Ax ∪ Lem) × Lem with
Φ @ ϕ ⇔ (Φ ∈ Supp(ϕ) ∨ ∃ψ.Φ ∈ Supp(ψ) ∧ ψ @ ϕ) is a well-founded strict
partial order. If D is a development graph, then a support mapping Supp is a
support mapping for D iff for all N ∈ D Supp is a support mapping for AxD (N )
and LemD (N ).


3    Development Graphs for Structure Formation

As a first step towards structure formation we will formalize requirements on
development graphs that reflect our intuition of an appropriate structuring for
4       Serge Autexier and Dieter Hutter

(formal) ontologies in particular (and formal specifications in general) in the
following principles.
    The first principle is semantic appropriateness, saying that the structure of
the development graph should be a syntactical reflection of the relations between
the various concepts in our ontology. This means that different concepts are
located in different nodes of the graph and the links of the graph reflect the
logical relations between these concepts. The second principle is closure saying,
for instance, that deduced knowledge should be located close to the concepts
guaranteeing the proofs. Also the concept defined by the theory of an individual
node of a development graph should have a meaning of its own and provide
some source of deduced knowledge. The third principle is minimality saying
that each concept (or part of it) is only represented once in the graph. When
splitting a monolithic theory into different concepts common foundations for
various concepts should be (syntactically) shared between them by being located
at a unique node of the graph.
    In the following we translate these principles into syntactical criteria on devel-
opment graphs and also into rules to transform and refactor development graphs.
In a first step we formalize technical requirements to enforce the minimality-
principle in terms of development graphs. Technically, we demand that each
signature symbol, each axiom and each lemma has a unique location in the de-
velopment graph. When we enrich a development graph with more structure we
forbid to have multiple copies of the same definition in different nodes. We there-
fore require that we can identify for a given signature entry, axiom or lemma a
minimal theory in a development graph and that this minimal theory is unique.
We define:

Definition 1 (Providing Nodes). Let hN , Li be a development graph. An
                                                                        σ̄ +3
entity e is provided in N ∈ N iff e ∈ Dom hN ,Li (N ) and ∀ M                 N. e 6∈
Dom hN ,Li (M ). Furthermore,
 1. e is locally provided in N iff additionally e ∈ dom N holds.
                                         σ̄ +3
 2. e is provided by a link l : M              N if not locally provided in N and
    ∃e ∈ Dom hN ,Li (M ). σ(e ) = e. In this case we say that l provides e from e0 .
       0                      0

    e is exclusively provided by l iff e is not provided by any other link l0 ∈ L.
Finally, the closure-principle demands that there are no spurious nodes in the
graph that do not contribute anything new to a concept. We combine these
requirements into the notion of location mappings:

Definition 2 (Location Mappings). Let D = hN , Li be a development graph.
A mapping loc D : Dom D → N is a location mapping for D iff
 1. loc D is surjective (closure)
 2. ∀N ∈ N . ∀e ∈ dom N . locD (e) = N
 3. ∀e ∈ Dom D . loc D (e) is the only node providing e (minimality)
Furthermore, for a given loc D we define loc −1D : N → 2
                                                           Dom D
                                                                  by loc −1
                                                                         D (N ) :=
{e ∈ Dom D |loc D (e) = N }. We will write loc and loc instead of loc D and loc −1
                                                      −1
                                                                                D
if D is clear from the context.
                             Structure Formation to Modularize Ontologies      5

    Based on the notion of location mappings we formalize our intuition of a
structuring. The idea is that the notion of being a structuring constitutes the
invariant of the structure formation process and guarantees both, requirements
imposed by the minimality-principle as well as basic conditions on a development
graph to reflect a given formal specification or ontology.
Definition 3 (Structuring). Let D = hN , Li be a valid development graph,
loc : Dom D → N , Σ ∈ |Sign|, Ax, Lem ⊆ Sen(Σ) and Supp be a support
mapping for D. Then (D, loc, Supp) is a structuring of (Σ, Ax, Lem) iff
 1. loc is a location mapping for D.
 2. let Dom dDe = Σ 0 ∪ Ax0 ∪ Lem0 then Σ = Σ 0 , Ax = Ax0 and Lem ⊆ Lem0 .
 3. ∀φ ∈ LemD . ∀ψ ∈ Supp(φ). ∃σ̄. loc(ψ) ? _
                                              σ̄ +3
                                                    loc(φ) ∧ σ̄(ψ) = ψ


4   Refactoring Rules

In the following we present the transformation rules on development graphs that
transform structurings again into structurings. Using these rules we are able to
structure the initially trivial development graph consisting of exactly one node
that comprises all given concepts step by step. This initial development graph
consisting of exactly one node satisfies the condition of a structuring provided
that we have an appropriate support mapping at hand.
    We define four types of structuring-invariant transformations: (i) horizontal
splitting and merging of development graph nodes, (ii) vertical splitting and
merging of development graph nodes, (iii) factorization and multiplication of
development graph nodes, and (iv) removal and insertion of specific links. Split-
ting and merging as well as factorization and multiplication are dual operations.
For lack of space and because we are mainly interested in rules increasing the
structure of a development graph we will omit the formal specification of the
merging and multiplication rules here.
    We illustrate our rules with the help of a running example in mathematics.
We start from a flat theory specifying a Ring over two operations + and ×. A
structure (R, +, ×) is a Ring, if (R, ×) is an abelian group, (R, +) a monoid
and × distributes over +. Furthermore, an abelian group is a monoid for which
additionally commutativity holds and inverse exists. The initial development
graph consists of a single node (without any links) containing all the symbol
definitions, axioms and theorems of the example.

Vertical Split. The first refactoring rule aims at the split of specifications in
different theories. In terms of the development graph a node is replaced by two
nodes one of them importing the other; each of them contains a distinct part
from a partitioning of the specification of the original node. While all outgoing
links start at the top node, we are free to reallocate incoming links to either
node. To formalize this rule we need constraints on how to split a specification
in different chunks such that local lemmata are always located in a node which
provides also the necessary axioms and lemmata to prove it.
6         Serge Autexier and Dieter Hutter

                                  ...                                                             ...
                           ρ1            ρm                                                ρ1              ρm
                                                 Vertical Split
                                  N                                                               N2
               θ̄1
                     θ̄k        σ̄1     σ̄l                                                           id
                                                Vertical Merge
     K1   . . . Kk                                                             θ̄1
                                  M1 . . . Mp                                        θ̄k          N1

                                                                                                σ̄1   σ̄l
                                                                       . . . Kk
                                                                  K1                              M1 . . . Mp
                                        Fig. 1. Vertical Split and Merge


Definition 4. Let S = (D, loc, Supp) be a structuring of (Σ, Ax, Lem) and N ∈
ND . A partitioning P for N is a set {N1 , . . . , Nk } with k > 1 such that 1. sigN =
sigN1 ] . . . ] sigNk , axN = axN1 ] . . . ] axNk , lemN = lemN1 ] . . . ] lemNk
2. sigNi ∪axNi ∪lemNi 6= ∅ for i = 1, . . . , k. A node Ni ∈ P is lemma independent
iff Supp(ψ) ∩ (axN ∪ lemN ) ⊆ (axNi ∪ lemNi ) for all ψ ∈ lemNi .

Definition 5 (Vertical Split). Let S = (hN , Li, loc, Supp) be a structuring
of (Σ, Ax, Lem) and P = {N1 , N2 } be a partitioning for some N ∈ N such
that N1 is lemma independent. Then, the vertical split S wrt. N and P is S 0 =
(D0 , loc 0 , Supp) with D0 = hN 0 , L0 i where

    N 0 :={N1 , N2 } ] (N \ N )

    L0 :={ M
                     σ̄     +3 M 0 ∈ L|M 6= N ∧ M 0 6= N } ∪ { N1                id        +3 N2}
                     σ̄ +3         σ̄ +3                                  σ̄         +3 M | N           σ̄      +3 M ∈ L}
           ∪ {M            N1 | M        N ∈ L} ∪ { N2
           
            N2     if loc(e) = N and e ∈ Dom D0 (N2 )
loc 0 (e) = N1      if loc(e) = N and e 6∈ Dom D0 (N2 )
             loc(e) otherwise
           

such that SigD0 (Ni ), i = 1, 2, are valid signatures and axi , lemi ⊆ Sen(SigD0 (Ni )),
i = 1, 2. Conversely, S is a vertical merge of N1 and N2 in S 0 .


Horizontal Split. Similar to a vertical split we introduce a horizontal split which
divides a node into two independent nodes. In order to ensure a valid new de-
velopment graph, each of the new nodes imports the same theories as the old
node and contributes to the same theories as the old node did.

Definition 6 (Horizontal Split). Let S = (hN , Li, loc, Supp) be a structuring
of (Σ, Ax, Lem), P = {N1 , . . . , Nk } be a partitioning for some node N ∈ N such
that each Ni ∈ P is lemma independent and loc −1 (N ) = dom N . The horizontal
split of S wrt. N and P is S 0 = (D0 , loc 0 , Supp) with D0 = hN 0 , L0 i where
 1. N 0 := {N1 , . . . , Nk } ] (N \ N )
                                                                   Structure Formation to Modularize Ontologies                                                                             7


                                                                                                                         σ̄| 1
                                                                                                                            Do
                                                                                                                              m
                                                                                                                                            . . . m m N1
            ...                                                                                                                      N
                                                                                                                                      k
                                                                                                                                                 σ̄ |Do
                                                                                              1                                                                                 m
     σ̄ 1            m                              Horizontal Split                        σ̄|Dom N
                                                                                                                                                                              σ̄|Dom N
                                                                                                                  1                                                                         k
                    σ̄
            N                                                                                                      N1                       ...                         Nk
  ¯θ 1                   θ¯n
                                                                                                                 θ̄1        θ̄ 1                            θ¯n               θ̄n
                                                   Horizontal Merge
            ...                                                                                                                             ...

                                                     Fig. 2. Horizontal Split and Merge


2. L0 := { M
                                    σ̄       +3 M 0 ∈ L|M 6= N ∧ M 0 6= N }

                    ∪ {M
                                            θ̄    +3 Ni| M            θ̄       +3 N ∈ L, i ∈ {1, . . . , k}}
                                   τ̄|Dom N
               ∪ { Ni
                        i
                         +3 M| N τ̄ +3 M ∈ L, i ∈ {1, . . . , k}}
 3. loc 0 (e) := Ni if e ∈ dom Ni for some i ∈ {1, . . . , k} and loc 0 (e) := loc(e)
    otherwise.
such that SigD0 (Ni ) are valid signatures and axi , lemi ⊆ Sen(SigD0 (Ni )) for
i = 1, . . . , k.
   Using the transformation rules, the flat initial theory of our running example
can be refactored (cf. Fig. 3). We apply the vertical split rule twice to extract R
and the distributivity law, followed by the horizontal-split rule to separate the
two instances of a monoid. Finally, we apply the vertical-split rule to extract
the extra axioms from M onoid(R, ×). We are left with two copies of a monoid
which we like to generalize to a common abstract monoid. This can be achieved
by the following rule.

Factorization The factorization rule allows one to merge equivalent concepts
into a single generalized concept and then to represent the individual ones as
instantiations of the generalized concept. A precondition of this rule is that all
individual concepts inherit the same (underlying) theories.
Definition 7 (Factorization). Let S = (hN , Li, loc, Supp) be a structuring of
(Σ, Ax, Lem). Let K1 , . . . , Kn , M1 , . . . , Mp ∈ N with p > 1 such that sigMj ∪
                                                          σ̄i,j
axMj 6= ∅ and ∃σ̄i,j . Ki                                          +3 Mj ∈ L for i = 1, . . . , n, j = 1, . . . , p.


                                                                                                                                                             Distrib(*, +)

                                                   Distrib(*, +)                                 Distrib(*, +)
                                                                                                                                                                              Comm(*),
                                                                                                                                                                              Inverse(*),
  Distrib(*, +), Comm(*),
  Inverse(*), Monoid(R, +),                  Comm(*), Inverse(*),                 Monoid(R, +)            Comm(*), Inverse(*),
  Monoid(R, *), R                            Monoid(R, +), Monoid(R, *)                                   Monoid(R, *)
                                                                                                                                             Monoid(R, +)                    Monoid(R, *)
                               Vertical                                   Horizontal
                               Split (2x)                                                                                        Vertical
                                                         R                  Split                     R                          Split
                                                                                                                                                                  R


                                  Fig. 3. Applying the split rules to our ring example
8                  Serge Autexier and Dieter Hutter

     ...                                            ...                                        ...                                         ...


              M1          ...          Mp                                                            N1                ...            Np
                                                                                                                 θ¯1
                                                                                                                             θ̄ p
                                                                            Factorization


                                            σ̄n,p
           σ̄1,1

                                                                                                                       N
                                σ̄ n
                                                                                                                                σ̄n
                     ,p
                   σ̄1


                                  ,1                                                                      σ̄ 1
                                                                        Multiplication
              K1          ...          Kn                                                            K1                ...            Kn


                         Fig. 4. Factorization and Multiplication (with σ̄i,j := θ̄j ◦ σ̄i )


    Suppose there are sets sig, ax and lem with (sig ∪ ax ∪ lem) ∩ Dom D = ∅ and
pre-signature morphisms θ1 , . . . , θp and σ1 , . . . , σn such that
- ∀e ∈ Dom D (Ki ). θj (σi (e)) = σi,j (e) and σi,j (e) = e ∨ σi,j (e) 6∈ Dom D
- sigMj ⊆ θj (sig) ⊆ Dom D (Mj ), axMj ⊆ θj (ax) ⊆ Dom D (Mj )
- ∀e ∈ lem holds ∃l ∈ {1, . . . p}. θl (e) ∈ lemMl , θi (e) = θj (e) implies i = j and
  θj (e) ∈ Dom D implies loc(θj (e)) ∈ Mj               S
- there is a support mapping SuppN for ax ∪ i=1,...,n σi (Dom D (Ki )) and lem.
    Then S 0 = (hN 0 , L0 i, loc 0 , Supp0 ) is a factorization of S with respect to M1 , . . .,
Mp and SuppN iff

           N 0 :={N } ∪ {Nj |j ∈ {1, . . . p}} ∪ N \ {M1 , . . . Mp }
                          with N = hsig, ax, lemi, Nj = h∅, ∅, lemMj \ θj (lem)i

             L0 :={ K
                                       σ̄           +3 K 0 ∈ L|K, K 0 6∈ {M1 , . . . Mp }
                                                                      σ̄i,j
                          ∪ { Ki
                                               σ̄i
                                                          +3 N| Ki            +3 Mj , j ∈ {1, . . . p}, i ∈ {1, . . . n}}

                                              θj
                          ∪ {N                         +3 Nj |j ∈ {1, . . . p}}

                          ∪ {K
                                               τ̄         +3 Nj | K    τ̄     +3 Mj ∧ (∀i ∈ {1, . . . n}.K 6= Ki ∧ τ̄ 6= σ̄i,j )

                ∪ { Nj
                         τ̄  +3 K| Mj τ̄ +3 K ∈ L, j ∈ {1, . . . p}}
                                            S
                N
                        if x ∈ Dom D0 (N ) \ i=1,...,n Dom D0 (Ki )
                                                        σ̄ +3
    loc 0 (x) := Nj      if x ∈ Dom D (Nj ) and ∀ K           Nj . x 6∈ Dom D0 (K)
                
                  loc(x) otherwise.
                

     Supp0 :=Supp ∪ SuppN .

    Applying the factorization rule allows us to introduce a generic concept of
monoids which is instantiated to obtain both previous copies of a monoid.
    The factorization rule only covers a sufficient criterion demanding that each
theory imported by a definition link to one concept is also imported via definition
links by all other concepts. The more complex case in which a theory is imported
                                                       Structure Formation to Modularize Ontologies                                        9
                            Distrib(*, +)                                                 Distrib(*, +)


                                             Comm(*),                                                         Comm(*),
                                             Inverse(*),                                                      Inverse(*),


             Monoid(R, +)                   Monoid(R, *)                   Monoid(R, +)          Monoid(G, op)              Monoid(R, *)


                                                           Factorization
                                 R                                                                        R


                                  Fig. 5. Factorization of our ring example

via a path of links can be handled by allowing one to shortcut a path in a single
global link. This results in the following rule.
Definition 8 (Transitive Enrichment). Let S := (hN , Li, loc, Supp) be a
structuring of (Σ, Ax, Lem), K, N ∈ N and there is a path K ? _
                                                                σ̄ +3
                                                                      N between
both. Then, (hN , L ∪ { K
                                                     σ̄       +3 N}i, loc, Supp) is a transitive enrichment of S.

   Definition links in a development graph can be redundant, if there are al-
ternatives paths which have the same morphisms or if they are not used in any
reachable node of the target. We formalize these notions as follows:
Definition 9 (Removable Link). Let S = (D, loc, Supp) (D = hN , Li) be a
structuring of (Σ, Ax, Lem). Let l ∈ L and D0 = hN , L \ {l}i. l is removable from
S and S 0 = (D0 , loc, Supp) is a reduction of S iff
               σ̄ +3
 1. ∀l0 : M          N. if l0 provides exclusively σ(e) from some eDom D (M ) then
    e ∈ Dom D0 (N ) and l 6= l0 ;
 2. ∀e ∈ Dom D .∀M ∈ DGRootsD. if loc(e) ? _
                                                     σ̄ +3
                                                           M then there exists M 0 ∈
   dD0 e such that loc(e) ? _
                              σ̄ +3
                                    M 0;
3. ∀φ ∈ LemD . Supp(φ) ⊆ Dom D0 (N ) and ∀Sigloc
                                             D (N ) ⊆ Dom D 0 (N ).

Theorem 1 (Structuring Preservation). Let S := (D, loc, Supp) (D = hN , Li)
be a structuring of (Σ, Ax, Lem). Then
 1. every horizontal split of S wrt. some N ∈ N and partitioning P of N ,
 2. every horizontal merge of S wrt. nodes {N1 , . . . , Nk } ⊆ N ,
 3. every vertical split of S wrt. some N ∈ N and partitioning P of N ,
 4. every factorization of S wrt. nodes M1 , . . . Mp ∈ N ,
 5. every multiplication of S wrt. N ,
 6. every transitive enrichment of S, and
 7. every reduction of S
is a structuring of (Σ, Ax, Lem).


5   Refactoring Process

The refactoring rules presented above provide the necessary instruments to ex-
ternalize the structure inherent in a given flat theory. Nevertheless we need
10         Serge Autexier and Dieter Hutter

Char                                                            Nat
Sig. char, A, B, . . . , Z                                      Sig. nat, 0, succ
Ax. ∀x : char. x = A ∨ · · · ∨ Z                                Ax. ∀x : nat. 0 6= succ(x)
Lem. —                                                                ∀x : nat. x = 0 ∨ ∃y : nat. x = succ(y), . . .
String                                                          Lem. —
Sig. str, strnil, addc                                          Natlist
Ax. ∀c : char, x : str. strnil 6= addc(c, x)                    Sig. nlist, natnil, addn
      ∀x : str. x = strnil ∨ ∃c : char, y : str.                Ax. ∀n : nat, x : nlist. natnil 6= addn(n, x)
      x = addc(c, y)                                                  ∀x : nlist. x = natnil
      ∀c, c0 : char, x, x0 : str. addc(c, x) = addc(c0 , x0 )         ∨ ∃n : nat, y : nlist. x = addn(n, y)
      → c = c0 ∧ x = x0 , . . .}                                      ∀n, n0 : nat, x, x0 : nlist. addn(n, x) = addn(n0 , x0 )
Lem. —                                                                → n = n0 ∧ x = x0 , . . .
Stringops                                                       Lem. —
Sig. strapp, strlen                                             Natlistops
Ax. strlen(strnil) = 0                                          Sig. natapp, nlen
      ∀c : char, x : str. strlen(addc(c, x)) = succ(x)          Ax. nlen(natnil) = 0
      ∀x : str. strapp(strnil, x) = x, . . .                          ∀n : nat, x : nlist. nlen(addn(n, x)) = succ(x)
Lem. ∀x, y, z : str. strapp(strapp(x, y), z)                          ∀x : nlist. natapp(natnil, x) = x, . . .
      = strapp(x, strapp(y, z)), . . .                          Lem. ∀x, y, z : nlist. natapp(natapp(x, y), z)
                                                                      = natapp(x, natapp(y, z)) . . .
                      Fig. 6. Heuristic motivated partitioning of a flat theory

appropriate heuristics to determine suitable partitions for horizontal or vertical
splits, in order to group together strongly related entities and afterwards make
explicit analogous groupings of entities by factorization. One heuristic is, for
instance, to group together all axioms and lemmas exclusively devoted to a spe-
cific sort. E.g. all axioms about characters, or all axioms about natural numbers.
From there we carve out those entities about a different sort including an already
classified sorts, and so on. Examples for these are strings or natural numbers,
and in the next iteration lists of strings (cf. Fig. 6). These identified subsets
can then be further partitioned into those defining the basic datatype and the
(inductive) functions and predicates over these. This separates the definition of
lists of natural numbers from the append functions on these, for instance.
     Such heuristics guide the application of the horizontal and vertical split rules.
From there we can identify nodes with isomorphic local entries and, using the
transitivity and reduction rules together with further applications of the split
rules try to enable the application of the factorization rule.
     We illustrate the refactoring process with the help of a further example,
where we start with flattened theories, and step by step carve out the intrinsic
structure. The example from Fig. 6 is about lists of natural numbers, strings
formed over characters and length functions operating on such lists. All the
signature symbols, axioms, and lemmas occur in a single node in the initial
development graph. For sake of readability we partition the set of these entities
into pairwise disjunct subsets and name them accordingly (cf. [8])
     Fig. 7 illustrates the structure formation process for this example. In the
first step the definitions of Nat and Char are separated from the rest by applying
the vertical split rule twice. In the next steps we form the individual theories of
String and Stringops respectively. Notice that Stringops includes strlen counting
characters in a string such that Stringops has to import Nat. After applying again
a vertical split we obtain the theories of Natlist and Natlistops. Since the local
axioms of String and Natlist are renamings of each other we factorize these local
axioms to a new theory of generic lists.
                                                                Structure Formation to Modularize Ontologies                                              11
 1. Initial Graph                          2. Vertical Split (2x)             3. Vertical Split                        4. Vertical Split

    Char, String, Stringops,                String, Stringops,                  Stringops,                              Stringops             Natlist,
    Nat, Natlist, Natlistops                Natlist, Natlistops                 Natlist, Natlistops                                           Natlistops


                                             Char               Nat             String          Nat                      String


                                                                                                                          Char                   Nat
                                                                                 Char


 5. Superfluous Link                                   6. Vertical Split                              7. Factorization

                         Natlist,                                          Natlistops                      Stringops                         Natlistops
    Stringops                                            Stringops
                         Natlistops


     String                                                 String           Natlist                        String                Genlist     Natlist


     Char                  Nat                              Char                                             Char                                Nat
                                                                              Nat


 8. Transitive Link (2x)                                                                          9. Factorization

          Stringops                              Natlistops                                               Stringops        Genlistops       Natlistops


              [String]           Genlist            [Natlist]                                              String            Genlist         Natlist


              Char                                    Nat                                                  Char                                Nat


                                                                      Fig. 7. Lists


This node consists of the renamed Genlist
local axioms of String (or Natlist, Sig.  list, nil, add
                                      Ax. ∀n : elem, x : list. nil 6= add(n, x)
respectively) plus the necessary sig-     ∀x : list. x = nil
nature definitions to obtain a well-      ∨ ∃n : elem, y : list. x = add(n, y)
                                          ∀n, n0 : elem, x, x0 : list. add(n, x) = add(n0 , x0 )
formed development graph node.            → n = n0 ∧ x = x0 , . . .
This theory is imported via defini- Lem. —
tion links to String and Natlist using corresponding pre-signature morphisms
to map it to list of chars and numbers, respectively. Both, String and Natlist
have now empty local signatures and axioms. In Fig 7 we indicate those nodes
with empty local signature, empty local axioms and empty local lemmata by a
light-gray color of the node name.


6         Related Work

There is related work to (re-)structure ontologies (after flattening) following lo-
cality criteria into modules containing all knowledge about specific concepts.
However, since ontology languages have simple imports without renamings, du-
plicated sub-ontologies that are for instance equal up to renaming remain hidden.
12      Serge Autexier and Dieter Hutter

To excavate these ontologies requires other heuristics than pure logic based ones
to guide the structuring process, and therefore we deliberately did not impose
specific criteria how to apply the rules beyond the minimal conditions that the
dependency relations among concepts, relations and facts and derived facts and
relations is preserved. It is easy to see that the modularizations of ontologies not
including derived lemmas obtained by using the locality criteria from [9] can
be constructed with our rules by starting from a single flattened ontology and
singling out the modules.

7    Conclusion
Based on the definition of structurings as concise development graphs, we pre-
sented transformation rules that allow one to make explicit common structures
hidden in a flat theory in terms of development graphs. It provides a framework
to modularize flattened ontologies in a useful way as illustrated by two simple
examples. An implementation is planned to provide an interactive tool to mod-
ularize and factorize large ontologies as well as (semi-)automatic procedures.

References
 1. S. Autexier and D. Hutter. Mind the gap - maintaining formal developments in
    MAYA. In Festschrift in Honor of J.H. Siekmann. Springer, LNCS 2605, 2005.
 2. S. Autexier, D. Hutter, and T. Mossakowski. Change management for hetero-
    geneous development graphs. In S. Siegler and N. Wasser, editors, Verification,
    Induction, Termination Analysis (Festschrift for Christoph Walther), volume 6463
    of LNAI, pages 54–80. Springer, 2010.
 3. J. A. Goguen and R. M. Burstall. Institutions: Abstract model theory for speci-
    fication and programming. Journal of the Association for Computing Machinery,
    39:95–146, 1992. Predecessor in: LNCS 164, 221–256, 1984.
 4. D. Hutter. Management of change in verification systems. In Proceedings 15th
    IEEE International Conference on Automated Software Engineering, ASE-2000,
    pages 23–34. IEEE Computer Society, 2000.
 5. O. Kutz and T. Mossakowski. A modular consistency proof for DOLCE. In The
    Association for the Advancement of Artificial Intelligence (AAAI-2011), 2011.
 6. T. Mossakowski, S. Autexier, and D. Hutter. Development graphs - proof manage-
    ment for structured specifications. Journal of Logic and Algebraic Programming,
    special issue on Algebraic Specification and Development Techniques, 67(1-2):114–
    145, april 2006.
 7. I. Niles and A. Pease. Towards a standard upper ontology. In Formal Ontology in
    Information Systems (FOIS-2001). ACM Press, 2001.
 8. I. Normann and M. Kohlhase. Extended formula normalization for  -retrieval
    and sharing of mathematical knowledge. In Towards Mechanized Mathematical
    Assistants (Calculemus/MKM). Springer, LNCS 4573, 2007.
 9. C. D. Vescovo, B. Parsia, U. Sattler, and T. Schneider. The modular structure
    of an ontology: Atomic decomposition. In Proceedings of the 22nd International
    Joint Conference on Artificial Intelligence (IJCAI’10), pages 2232–2237, 2010.
10. A. Voronkov. Inconsistencies in ontologies. In JELIA 2006. Springer LNCS 4160,
    2006.

</pre>