=Paper=
{{Paper
|id=Vol-1248/paper1
|storemode=property
|title=Constructing E-SHIQ Distributed Knowledge Bases via Ontology Modularization: The mONTul method
|pdfUrl=https://ceur-ws.org/Vol-1248/WoMO14-Paper1.pdf
|volume=Vol-1248
|dblpUrl=https://dblp.org/rec/conf/fois/SantipantakisV14
}}
==Constructing E-SHIQ Distributed Knowledge Bases via Ontology Modularization: The mONTul method==
<pdf width="1500px">https://ceur-ws.org/Vol-1248/WoMO14-Paper1.pdf</pdf>
<pre>
   Constructing E − SHIQ Distributed
      Knowledge Bases via Ontology
   Modularization: The mONTul method
                Georgios SANTIPANTAKIS a George VOUROS b
                       a
                         University of the Aegean, Greece
                         b
                           University of Piraeus, Greece

          Abstract. This article presents a reconfigurable method for the modu-
          larization of SHIQ ontologies, towards the construction of distributed
          E − SHIQ knowledge bases. The aim is to compute decompositions
          for correct, complete and efficient distributed reasoning. The proposed
          method combines graph-based modularization techniques with locality-
          based rules using a generic constraint problem solving framework. The
          paper presents experimental results concerning the modularization task,
          w.r.t. specific requirements for efficient distributed reasoning.
          Keywords. ontology modularization, locality, constraint satisfaction
          problem, correct and complete reasoning


Introduction
Modularization denotes either the extraction of modules from large ontologies
(e.g. for reusability purposes) or the partitioning of ontologies into modules (e.g.
for facilitating evolution or reasoning tasks). In both cases, modules should not
represent arbitrary chunks of knowledge. Each module must represent aspects
from a specific sub-domain of the original ontology. Considering the whole decom-
position of an ontology, all and only those entailments of the original ontology
must be entailed by reasoning over the collection of modules extracted. It follows
that the modular ontology must be in a decidable fragment of a representation
language.
     The aim in this paper is to propose a method for decomposing any ontology
in the SHIQ fragment of Description Logics (DL) into an arbitrary number of
modules that form a distributed E − SHIQ knowledge base, enabling correct and
complete reasoning. Each module must preserve to a great extent the meaning
of the terms in its signature. The E − SHIQ representation framework allows
modules to be semantically associated, thus reducing replication of axioms be-
tween them. It follows that a distributed knowledge base is a network of asso-
ciated modules and the meaning of terms in the decomposition is preserved by
considering the coupling of modules. The decomposition can facilitate efficient
reasoning using the sound and complete E − SHIQ distributed reasoner.
     Targeting efficient distributed reasoning tasks, as shown by the experiments
performed using the E − SHIQ reasoner [9], additional properties that must be
preserved by the networks of associated modules include: (a) Axioms must be
distributed between modules in an as much as possible even way, (b) networks of
modules must have a small diameter, and (c) there must not be many (and large)
cycles of associated modules.
      Existing modularization methods [6] are based on logical foundations, or may
apply methods manipulating graphs that resemble the structure of the original on-
tology. While logic-based methods preserve certain logical properties (e.g. [1],[2],
[3], [4], [5]), graph-based approaches apply heuristics to partition (e.g. [7], [8]) an
ontology into modules. These methods usually utilize network metrics and empir-
ical “distance measures” to define the module boundaries. The proposed modu-
larization method combines graph-based modularization techniques with locality-
based rules using a generic constraint problem solving framework. Ontology speci-
fications are used for building a graph of dependencies among the ontology terms.
These dependencies correspond to specific constraints concerning the assignment
of terms in modules. The satisfaction of these constraints secure the construction
of a well-formed E − SHIQ distributed knowledge base and specify conditions
for locality-based modularization.
      The specific contributions in this paper are the following: (a) It proposes the
mONTul modularization method that combines rules for constructing well-formed
E − SHIQ knowledge bases with rules for locality-based modularization in a con-
straint satisfaction problem. (b) The mONTul method proposed is reconfigurable
and generic in several aspects: (i) The set of constraints may change w.r.t. the
expressiveness of the original ontology and the modularization requirements, (ii)
constraints may be considered to be soft or hard, in any arbitrary combination,
(iii) mONTul can be used for extracting modules, although the focus here is on
computing partitionings; and finally, (iv) the method itself has been developed in
a modular architecture, allowing components to be replaced by state-of-the-art
methods. (c) It revises the notion of locality by also considering possible associa-
tions between terms of different modules in a distributed E − SHIQ knowledge
base.
      In the next sections, Section 1 briefly presents the background knowledge and
Section 2 revises the notion of locality for E − SHIQ. Section 3 presents the
mONTul method, Section 4 the experimental results and Section 5 concludes the
paper.

1. Background Knowledge
This section presents background knowledge on the E − SHIQ representation
framework [9] and on locality-based modularization [1]. To provide concrete ex-
amples, we consider the ontology O with the axioms:
            Conf erence ≡ M edicalConf erence t OtherConf erence          (1)
            Conf erence v Event                                           (2)
                  Event v HumanActivity                                   (3)
  P ediatricConf erence v M edicalConf erence                             (4)
                 Article v P ublishedM aterial u ∃P resentedAt.Event      (5)
        M edicalArticle v Article u ∀P resentedAt.M edicalConf erence (6)
    The E-SHIQ representation framework [9] belongs to the family of mod-
ular representation frameworks for Description Logics. It provides constructors
for associating ontology units (modules) that are within the SHIQ fragment of
Description Logics, preserving decidability.
     Given a non-empty set of indices I and a collection of modules indexed by I,
let NCi , NRi and NOi be the sets of concept, role and individual names for unit
i ∈ I, respectively. For some R ∈ NRi , Inv(R) denotes the inverse role of R and
(NRi ∪ {Inv(R)|R ∈ NRi }) is the set of SHIQ i-roles, i.e. the roles of the i-th
ontology Mi . An i-role axiom is either a role inclusion axiom or a transitivity
axiom. Let Ri be the set of i-role axioms.
     Let Eij be the set of ij-link relations relating individuals in Mi and Mj ,
i 6= j ∈ I. The sets of link relations are not pairwise disjoint, but are disjoint
with respect to the set of concept names. An ij-relation box Rij includes a finite
set of ij − link relation inclusion axioms in case i 6= j, and transitivity axioms
of the form T rans(E, (i, j)), where E is in (Eij ∩ NRi ), i.e. it is an ij-link relation
and an i-role. In case i = j, then Rij =Ri (with an abuse of notation) includes a
finite set of i − role inclusion axioms. Subsequently we use the term property to
denote both roles and link-relations.
     Briefly, the sets of i-concepts are inductively defined by the constructors
within the SHIQ fragment of Description Logics, where i-roles can be replaced
by ij-link relations, relating instances of the i-concept to instances of a j-concept,
where i 6= j ∈ I. Let i : C and i : D possibly complex concepts and i : C v i : D
(or i : C v D) a general concept inclusion (GCI) axiom in Mi . A finite set of
GCI’s in Mi is a TBox for i and it is denoted by Ti .
     Concept correspondences may be concept onto concept, or concept into con-
cept: Let C ∈ NCi , D ∈ NCj with i 6= j ∈ I. A concept onto (into) concept
correspondence from Mi to Mj that subjectively holds for Mj , is of the form
      w                          v
i : C → j : D (corresp. i : C → j : D).
Definition (Distributed Knowledge Base). A distributed knowledge base Σ =
hT, R, Ci is composed of the distributed TBox T, the distributed RBox R, and a
tuple of sets of correspondences C = (Cij )i6=j∈I between modules. A distributed
TBox is a tuple of TBoxes T= (Ti )i∈I , where each Ti is a finite set of i-concept in-
clusion axioms. A distributed RBox is a tuple of ij-property boxes R = (Rij )i,j∈I ,
where each Rij is a finite set of property inclusion axioms and transitivity axioms.
    A distributed ABox (DAB) includes a tuple of ABox’es Ai for each ontology
                                                                                 =
unit Mi , and sets Aij , i 6= j with individual correspondences of the form j:a 7→ i:b,
and property assertions of the form (a, b) : Eij , where Eij is an ij-link relation
in Eij , i 6= j. Thus, individual correspondences are specified from the subjective
point of view of Mi and, together with assertions concerning linked individuals,
these are made locally available to Mi .
Example: Let us for instance consider two units M1 , M2 for I = {1, 2}, com-
puted from ontology O. Then, the E − SHIQ distributed knowledge base is
Σ = hT, R, Ci, and is composed by the distributed TBox T, the distributed RBox
R, and a tuple of sets of correspondences C = (Cij )i6=j∈I between ontology units.
Specifically,
    -T= (Ti )i∈I , where
T1 = {1, 2, 3, 4}, (indicating the axioms included) and
T2 = {5, 6}.
    - R = ((Ri )i∈I , (Rij )i6=j∈I ), where Ri = Rij = ∅, i, j ∈ I,
    - C= (Cij )i6=j∈I , where
                                   ≡                                    ≡
C21 = {2 : M edicalConf erence → 1 : M edicalConf erence, 2 : Event → 1 : Event}
                                   ≡                                    ≡
C12 = {1 : M edicalConf erence → 2 : M edicalConf erence, 1 : Event → 2 : Event}
    - DAB = ((Ai )i∈I , (Aij )i6=j∈I ), where Ai = ∅ and Aij = ∅, for any i, j ∈ I.
For the sake of brevity, we use equivalence subjective correspondences: These
are actually specified by means of onto and into subjective correspondences.
Please notice that both modules hold subjective knowledge on concepts corre-
spondences, and these are symmetric. Finally, it must be noticed that the prop-
erty P resentedAt is a 2-role, only: E − SHIQ does not support correspondences
between roles.
     A distributed knowledge base forms a network of associated (via correspon-
dences and link relations) modules. Subsequently we use the terms network (of
modules), decomposition and distributed knowledge base interchangeably. Asso-
ciations are directional and may form cycles in the network of modules (as also
shown in the example above).
Definition (Domain relations). Domain relations rij , i 6= j ∈ I represent equalities
between individuals, from the subjective point of view of j. A domain relation
rij , i 6= j from ∆i to ∆j is a subset of ∆i × ∆j , s.t. in case d0 ∈ rij (d1 ) and
d0 ∈ rij (d2 ), then according to the subjective view of j, d1 = d2 (denoted by
d1 =j d2 ). Also, given a subset D of ∆Ii , rij (D) denotes ∪d∈D rij (d).
    Given that domain relations represent equalities, in case d1 ∈ rij (d) and
d2 ∈ rij (d), then d1 =j d2 . Therefore, E − SHIQ domain relations are globally
one-to-one relations.
Definition (Distributed Interpretation). Given the index I and i, j ∈ I, a dis-
tributed interpretation I of a distributed knowledge base Σ is the tuple formed
by the interpretations Iij = h∆i , ∆j , ·Iij i, i, j ∈ I, and a set of domain relations
rij , in case i 6= j ∈ I. Formally, I = h(Iij )i,j∈I , (rij )i6=j∈I i.
     A local interpretation Ii satisfies an i-concept C w.r.t. a distributed knowledge
base Σ, i.e. Ii  i : C iff C Ii 6= ∅. Ii satisfies an axiom C v D between i-concepts
(i.e. Ii  i : C v D) if C Ii ⊆ DIi . Also, Iij satisfies an ij-property inclusion
axiom R v S (Iij  R v S) if RIij ⊆ S Iij . A transitivity axiom T rans(E; (i, j))
is satisfied by I iff E Ii ∪ E Iij is transitive.
Definition (Distributed entailment and satisfiability). Σ d X v Y if for every I,
I d Σ implies I d X v Y , where X and Y are either i-concepts or ij-properties,
i, j ∈ I. Σ is satisfiable if there exists a I s.t. I d Σ. A concept i:C is satisfiable
with respect to Σ if there is a I s.t. I d Σ and C Ii 6= ∅.
    The E − SHIQ distributed reasoner implements a sound and complete
tableau algorithm [9] for combining local reasoning chunks corresponding to the
individual modules in a peer-to-peer fashion, inherently supporting the propaga-
tion of subsumptions between reasoning peers. The key mechanism for distributed
reasoning is the projection of nodes from the completion graph of one peer (with
a module) to another peer (with an associated module). This allows reasoning
peers to combine their local (subjective) knowledge with the knowledge that other
peers hold, in order to jointly compute entailments. Through projections, peers
propagate subsumptions through their connections to distant peers. Each projec-
tion request from a peer to another can trigger further projections from the later,
resulting to “avalanches” of triggered projections across paths in the network of
associated modules. Due to this, as also shown in [9], the distributed reasoner
is affected by (a) the distribution of axioms in modules, (b) the diameter of the
network of modules, and (c) the number of cycles in the network of modules.
     Locality-based modularization. The locality-based modularization extracts a
module for a given signature s.t. it can compute all and only the entailment of
the original ontology involving the terms in the signature. Formally, a module M
for an ontology O in the language L, w.r.t. a signature S is an ontology M ⊆ O
s.t. M and O entail the same axioms over S in L. As shown in [5], this notion
can be formalized using the notion of model-based conservative extensions. In this
case, every model of a module M of O can be extended to a model of O without
changing the interpretation domain or the interpretation of symbols in S. Thus,
given an O and S ⊆ Sig(O), a module M ⊆ O is defined to be a module of O
for S if O is a model conservative extension of M for S.
     Since the problem of checking whether any “part” M is a module of O for
a signature S is undecidable for fairly lightweight fragments of OWL [4], [5], we
need to compute approximations. According to [1] and [2], a sufficient condition
for a conservative model is locality:
Definition (∅-locality). Let S be a signature. An interpretation is ∅-local for S if
for every class A and property R not in S, we have AI = RI = ∅. An axiom α is
∅-local for S if I  α for each I that is ∅-local for S. An ontology O is ∅-local for
S if every axiom in O is ∅-local for S.
Example. We can easily check that axioms (5) and (6) in O\M1 are ∅-local for
Sig(M1 )= {Conference, MedicalConference, OtherConference, Event,
HumanActivity, PediatricConference}: Any interpretation I that interprets all
symbols in Sig(O) \ Sig(M1 ) as the empty set, is ∅-local for S=Sig(M1 ). However,
M1 is not ∅-local for Sig(M2 ) if we consider that Sig(M2 ) includes the terms
Event and/or M edicalConf erence: Axiom (3) for instance can not be made ∅-
local for any interpretation that interprets HumanActivity as the empty set,
since Event ∈ Sig(M2 ). This is the case for the axiom (1), as well, due to the
concept M edicalConf erence.
     Since checking ∅-locality is proved costly as well, we can use the following
syntactic conditions for checking ⊥-locality for a specific signature S. ⊥-locality
implies ∅-locality [1], [2].
     Given a signature S ⊆ Sig(O) for a SHIQ ontology O, the following grammar
recursively defines the positive ⊥-concepts, denoted as CS+ , for S:
      CS+ ::= A+ | ¬C − | C u C + | ∃R+ .C | ∃R.C + |≥ nR+ .C |≥ nR.C + (7)
where A+ is a concept name and R+ a role in Sig(O)\S, C ∈ Sig(O) and
R ∈ Sig(O). It must be noticed that given that S ⊆ Sig(O), a concept or role
in Sig(O) may not be in CS+ or in CS− . The negative ⊥-concepts CS− for S are as
follows:
  CS− ::= ¬C + | C1− u C2− | C − t C | ∀R+ .C | ∀R.C − |≤ nR+ .C |≤ nR.C − (8)
The other constructs of SHIQ can be expressed using the above constructors, so
they can be used in local concepts as well.
    A role inclusion axiom R+ v R or a transitivity axiom T rans(R+ ) is local
w.r.t. S. A GCI is local w.r.t. S if it is either of the form CS+ v C or of the form
C v CS− .
    A module M of an ontology is ⊥−local for a signature S, iff all its axioms
are local for S ∪ Sig(M).
Definition (⊥-module). Let O be an ontology and let S be a signature, S ⊆
Sig(O). M ⊆ O is a ⊥-module for O for S, if O\M is ⊥-local for S ∪ Sig(M).
Example. We can easily check that M2 = O\M1 with axioms 5 and 6 is ⊥-local
for S ∪ Sig(M1 ) = {Conference, MedicalConference, OtherConference, Event,
HumanActivity, PediatricConference}.

2. E − SHIQ locality-based modules
Given a distributed E − SHIQ knowledge base indexed by I, we have to specify
the rules for deciding when an individual E − SHIQ module Mi , i ∈ I, is a
⊥−module for its signature. To do this, we need first to refine the rules for ⊥-
concepts specified in formulae (7) and (8), so as (a) to assure that i-roles are
specified for i-concepts only, i ∈ I, and (b) to incorporate the cases where i-
concepts are constructed by means of restrictions on link relations.
    Therefore, denoting a role in an E − SHIQ knowledge base with R and a link
relation with E, then for an E − SHIQ module Mi , i ∈ I, the rules for positive
and negative ⊥−concepts for S = Sig(Mi ) are:
  CS+ ::= A+ | ¬C − | C u C + | ∃R+ .C + | ∃E + .C − |≥ nR+ .C + |≥ nE + .C −           (9)
  CS− ::= ¬C + | C1− u C2− | C − t C | ∀R− .C − | ∀E + .C − |≤ nR− .C − |≤ nE + .C −   (10)
Example. Considering our example with M1 including axioms {1, 2, 3, 4} and
M2 the axioms {5, 6}, then for Sig(M1 ) = {Conference, MedicalConfer-
ence,OtherConference, Conference,Event, HumanActivity, PediatricConference},
the concept ∃P resentedAt.Event is ⊥−local for Sig(M1 ), if P resentedAt is a
21-link relation, according to the fifth rule for positive ⊥−concepts.
     Forming P resentedAt as a 21-link relation makes the M2 -concept
∀P resentedAt.M edicalConf erence a negative ⊥−concept for Sig(M1 ). Never-
theless, in this case the module M2 is ⊥−local for Sig(M1 ), given that Article
is a positive ⊥−concept for Sig(M1 ).
     The above example shows that concepts of the form ∀R.C may present diffi-
culties for maintaining the ⊥−locality of E − SHIQ, depending on the context
of their appearance.
     To completely define ⊥−locality for E − SHIQ, we must also consider the
subjective correspondences between concept names. Considering subjective onto
concept correspondences between the module Mi and any other module Mj ,
i 6= j ∈ I, in a distributed E − SHIQ knowledge base, then similarly to GCIs
for the module Mi , the following rule holds for making an onto correspondence
                                 w
⊥−local for Sig(Mj ):      j : C → i : CS+
     At this point we have to clarify that correspondences that Mi subjectively
holds do not affect the locality of Mj .
     In the general case, into correspondences of Mi , can not be ⊥−local for
                                                   v
Sig(Mj ), since they are of the form:      j : CS− → i : CS+
    Nevertheless, such a subjective correspondence can be ⊥−local for S =
Sig(Mj )\(Sig(Mj ) ∩ Sig(Mi )) when the same concept name appears in both
sides of the correspondence and this concept name belongs in Sig(Mi ). Such a
                                       v
correspondence is of the form j : CS+ → i : CS+ , where S = Sig(Mj )\Sig(Mi ).
    It must be noticed that the above hold for equivalence correspondences as
well, since these are conjunctions of onto and into subjective correspondences.
Example: Considering again our running example, let us consider that axioms
(2) and (3) in M1 are formed as correspondences. Thus, the knowledge base be-
comes as follows: M1 includes the axioms {1, 4}, M2 the axioms {5, 6}, S =
Sig(M1 ) = {Conference, MedicalConference,OtherConference,HumanActivity,
PediatricConference} and Sig(M2 ) = Sig(O\M1 ). There are also two correspon-
                                          v                                    w
dences between modules: 1:Conference−              +
                                      S → 2:EventS and 1:HumanActivityS →
                                                                            −
        +
2:EventS . Focusing on the correspondences, we can observe that the first one is
not ⊥ − local for M2 (both correspondences are from the subjective point of view
of M2 ), while the second correspondence is ⊥ − local for M2 . Therefore, M2 is
not local for Sig(M1 ).
    Given that only into correspondences can bound the size of interpretations
for concepts in module Mi , and in order to maintain the notion of locality in
these cases, we revisit the definition of ⊥-module for E − SHIQ, considering the
restricted signature of such a module, denoted by S ∗ :
Definition (⊥ − E-module). Let O be an ontology and let S be a signature,
S ⊆ Sig(O). M ⊆ O is a ⊥ − E-module for O for S, if O\M is ⊥-local for the
restricted signature S ∗ = S ∪ Sig ∗ (M), where Sig ∗ (M) = Sig(M)- {C| C is a
                                                                  v
concept name and there exists a correspondence of the form C → C from the
subjective point of view of O\M}.
    Subsequently, the module O\M is called a ⊥ − E-local module for S ∗ =
S ∪ Sig ∗ (M). In case Sig ∗ (M) = Sig(M), then O\M is ⊥−local. Thus, every
⊥−local (⊥−module) is a ⊥−E-local module (resp. ⊥−E−module), but not vice-
versa. It must be noticed that the existence of onto correspondences do not harm
the completeness and correctness of the reasoning tasks, given that reasoning
tasks combine the (subjective) knowledge of all associated modules.
    The E −SHIQ constructors offer many options for associating different mod-
ules, either via link-relations, or via concept correspondences, or via combina-
tions of these: These are also alternative options for the modularization method.
Nevertheless, not all combinations are valid for E − SHIQ. For instance, lack
of role-to-role correspondences in E − SHIQ, pose the restrictions that a role
hierarchy is set in a single module.

3. The mONTul method
The intuition behind the proposed modularization method is to keep highly-
dependent ontology terms in the same module, subject to satisfying the con-
straints for ⊥ − E−local modules w.r.t. their signature. Please recall that the
major aim is not to provide locality-preserving modularizations per se, but to
compute decompositions that are complete and correct for reasoning tasks w.r.t.
locality-preserving constraints, towards enhancing the efficiency of distributed
reasoning tasks. Thus, locality constraints drive the decomposition process, reduc-
ing the several options that the method may consider for constructing distributed
E −SHIQ knowledge bases, preserving the coherency of the subject matters that
each module includes. The major steps of the method are as follows:
    a) Construction of a dependency graph for specifying the dependencies be-
       tween concepts and roles, according to the original theory,
    b) Clustering concepts and roles into groups so as to satisfy the specified
       constraints (as much as possible) and to decide the signature of modules;
       and finally,
    c) The construction of modules and of their associations.
    Finally, the network of associated modules is further “tuned” so as to elim-
inate cycles and reduce the diameter of the network by merging small modules.
The paragraphs that follow describe each major step of mONTul.
     Dependency Graph. Given an ontology O in the SHIQ fragment of Descrip-
tion Logics, the dependency graph for that ontology is a directed graph G = hE, Vi,
where V is a set of nodes corresponding to concepts (concept names and complex
concepts) and roles, and a set of unidirectional dependency edges E connecting the
nodes. Each node for a concept C in the graph is associated with a state variable
SC . The state SR of a node corresponding to an ontology role R, comprises two
variables SRf rom and SRto , indicating the module to which R belongs and the
module linked via R, respectively. All state variables range to a finite subset of
natural numbers D and their values specify whether dependent ontology terms
must appear in the signature of the same module or not.


                     Figure 1. The dependency graph for the example ontology
    The types of edges in the dependency graph are according to the constructors
available in the SHIQ fragment of Description Logics:
[E1:]     Two nodes vC and vD representing concepts C and D can be connected:
      (a) With an edge of type subc − dep if C v D,
     (b) with an edge of type cap − dep, if C is of the form ui Di , i = 1, 2... and there is a Dk , s.t. D = Dk
          (the equality specifies equality between the terms lexicalizing the concepts),
      (c) with an edge of type cup − dep, if C is of the form ti Di , i = 1, 2... and there is a Dk , s.t. D = Dk ,
     (d) with an edge of type compl, in case C and D are concept names and C is of the form (¬D), or
[E2:]     A concept C is related to a role R via an edge of type Q−dep,
          if C is of the form QR.D, where Q ∈ {∀, ∃, ≤ n, ≥ n}.
[E3:]     A role R is related to a concept C via an edge values − dep,
          if there is a restriction of the form QR.C, where Q ∈ {∀, ∃, ≤ n, ≥ n}.
[E4:]     A node vR corresponding to a role R can be related to a node vS corresponding to a role S as follows:
      (a) with a subr − dependency in case R v S, or
      (b) with an inv − dependency in case R is the inverse of S.
      (c) Transitivity of roles do not impose further dependencies.
Any C ≡ D and disjoint(C, D) axiom can be expressed using subsumption rela-
tions. Figure 1 presents the dependency graph for the example ontology.
    Edges in the dependency graph are associated with constraints that specify
how the states of adjacent nodes can be assigned values from D. These constraints
can be distinguished into generic constraints (GC) and locality preserving con-
straints (LC). While the former suffice for the construction of a E − SHIQ dis-
tributed knowledge base, the later encode the rules in formulae (9) and are neces-
sary for computing ⊥ − E−local modules M for the signature Sig(O)\Sig(M).
Specifically, the constraints are the following:
[GC1:] If there is a Q − dep between a node vC and a node vR , then it must hold that SC = SRf rom .
[GC2:] If there is a values − dep between a node vR and a node vC , then it must hold that SC = SRto
[GC3:] If there is a subc − dep between nodes vC and vD , then it must hold that SC = SD .
[GC4:] If there is a subr − dep or inv − dep between nodes vR and vS ,
       then it must hold that SSf rom = SRf rom and SSto = SRto
[LC1:] If there is a cap − dep between nodes vC and vD , and D is a complex concept, then SC 6= SD .
       If D is a concept name, then it must hold that SC = SD .
[LC2:] If there is a cup − dep between nodes vC and vD , then it must hold that SC = SD .
[LC3:] If there is an edge Q-dep between
       C and R, where Q ∈ {∃, ≥ n} and there is no Q-dep between C’ and R where Q ∈ {∀, ≤ n},
       then it must hold that SRf rom = SRto .
     It must be noticed that GC3 can be omitted. In this case we can specify onto
correspondences between concepts in different modules resulting to modules M
that are ⊥ − E−local for Sig(O)\Sig(M). Regarding the locality preserving con-
straints, these aim to construct positive ⊥−concepts for the module M according
to formulae (9), for the signature Sig(O)\Sig(M).
     Regarding specifications of the form QR.C, where Q ∈ {∀, ≤ n}, checking the
context where they appear is necessary. Given that we primarily aim at efficient
reasoning, in this work we choose to sacrifice locality in cases where this is affected
by this type of specification for the efficiency and simplicity of the modularization
method. Therefore, to deal with specifications of the form QR.C, where Q ∈ {∀, ≤
n} the method considers the following constraint:
[LC4:]If there is an edge Q − dep between C and R, where Q ∈ {∀, ≤ n, }, then it may hold
      that SRf rom 6= SRto , even if there is a Q-dep between C’ and R, where Q ∈ {∃, ≥ n}.
     This constraint (in contrast to the others) aims to make the concept QR.C
with Q ∈ {∀, ≤ n} negative ⊥− concepts for the signature Sig(O)\Sig(M).
     Now, given all the constraints specified, we need to point out that not all
constraints can be satisfied if they occur in a specific problem instance. There-
fore we can distinguish between soft and hard constraints: Given our decompo-
sition purpose, GC constraints (except GC3) are considered hard in any case.
Configurations with GC constraints being soft are beyond the scope of this work.
     Constraint Satisfaction. During the construction of the dependency graph,
the axioms of the ontology are consulted and parsed to detect constituent concepts
and roles. For each new concept or role, the method creates a new dependency
graph node and it assigns a value to its state variable from D, so as to satisfy
the constraints associated to it. This task ends with (probably many) violated
hard constraints, which are resolved using a CSP solver, with initial values being
those already assigned to the state variables during the construction of the depen-
dency graph. Given the assignments computed by the CSP solver, a hill-climbing
algorithm minimizes the number of the remaining violated constraints (if any).
     Module Construction. Once the state values of the nodes are decided, de-
pendency graph nodes are clustered into groups. Based on that, the method can
decide on the signature of each module. Each group contains those concept names
and properties whose state variables (SC for concepts and SRf rom for properties)
are equal, and the corresponding nodes are connected via a path in the depen-
dency graph.
     At this stage, the method considers additional desiderata concerning the rea-
soner’s performance, i.e. the distribution of axioms, the diameter of the network
and the size of cycles of associated modules. For each axiom α in the ontology,
the module M that will host this axiom is the one that maximizes the function:
                 |Sig(M)|+|Sig(M)∩Sig(α)|
U (M, α) = |T (M)|+1+|Sig(α)|−|Sig(M)∩Sig(α)| , where Sig(M) is the signature of M,
T (M) is the set of axioms already in M, and Sig(α) is the set of terms in the
axiom α. The intuition is to find the module M that includes most of the terms
in Sig(α), subject to the restriction that the number of axioms and correspon-
dences in M (counted in the denominator) will be kept low. Once the module
that will host the axiom is decided, the process will construct any correspondences
and link-relations to connect terms already in other modules. This happens with-
out affecting the locality of modules, given that no locality constraint has been
violated.
     Finally, the Node Merging (NM) tuning method reduces the number of mod-
ules in the network (which in the general case also reduces the diameter of the
network), improving the even distribution of axioms in the modules. Candidate
modules to be merged are those (a) with a common neighbor module, and (b)
with terms corresponding to dependency graph nodes with the same state. Mod-
ules are merged, if the new network will have lower coefficient of variation of the
distribution of axioms in the modules. The algorithm iterates until there are no
merging actions that can be performed. Merging actions do not violate locality,
since, given two ⊥−E−local modules Mi , Mj , for Sig ∗ (O\Mi ) and Sig ∗ (O\Mj )
respectively, Mi ∪ Mj is also a ⊥ − E−local module for Sig ∗ (O\(Mi ∪ Mj )).
     It can be proved that the modularization method computes an E − SHIQ
distributed knowledge base Σ, satisfying correctness and completeness of reason-
ing: by considering each unresolved constraint, and the actions taken to fix the
conflict, by associating modules. It follows that the proposed method is a reduc-
tion of any SHIQ ontology O onto an equivalent E − SHIQ Σ, and given that
the E − SHIQ reasoner is sound and complete [9], the modularization method
satisfies correctness and completeness of reasoning.

4. Experimental Results
To experiment with the proposed modularization method, we have gathered on-
tologies of various size and expressiveness by crawling web ontology repositories1 .
From these ontologies, we present the results concerning 18 ontologies. The corpus
is extended with six versions of the Semantic Information System (SIS) registry
[11] which are of considerable size and expressiveness.
     The 24 ontologies are categorized by their size: small (between 100-499 ax-
ioms), medium (between 500-4999 axioms), large (between 5000-9999 axioms)
and SIS ontologies (10000 and more axioms). The expressiveness of the language
used in all categories varies from ALC to SHIQ. All SIS ontologies are within
the ALCHIQ fragment of Description Logics.
  1 (June 25th, 2013), http://bioportal.bioontology.org/, http://www.cs.ox.ac.uk/isg/ontologies/

and http://owl.cs.manchester.ac.uk/owlcorpus
     In a previous work [10] we have evaluated montul evaluated against the lo-
cality based module extraction algorithm reported in [3], and the results have
shown that mONTul computes smaller and less modules. In this paper, the ex-
perimental cases concern specific configurations for each ontology. Each config-
uration depends on the sets of constraints used (and their distinction as hard
or soft), the size of the state variables domain D varying in the set {2, 10, 20,
40}, and the use of the network tuning method. We distinguish between the con-
figuration of type (A), where constraints GC1,GC2,GC4,LC1,LC2,LC3 are hard
and the N M tuning method is used; of type (B) where GC1,GC2,GC4 are hard,
LC1,LC2,LC3,LC4 are soft, and the N M tuning methods is used, and finally;
type (C) which is the same as (B) but without the tuning method. Each configu-
ration is denoted by a template of the form “CaseType” “number of states”. E.g.
“A 20” denotes type A with |D|=20.
     Figures 2 and 3 show the results of different mONTul configurations w.r.t.
our desiderata for (a) small diameter of the network of modules, (b) small number
of cycles of associated modules, (c) balanced distribution of axioms in the mod-
ules. Due to space restrictions, Figure 2 presents results only for |D| = 2, while
comparative results for larger D are shown only for the A configuration in Figure
3: These are representative for the results reported from configurations B and C.


Figure 2. Comparative results for A/B/C 2 configurations and for all networks: (a) Number of
modules, (b) Network diameter, (c) Number of cycles, (d) Distribution of axioms.


     Ontologies are ordered in the x axis by the (ascending) number of modules
produced by the A 2 configuration. Succinctly, configurations A and B compute
decompositions with no significant differences in all cases, although we may no-
tice that configurations A present a larger number of cycles in some experimental
cases, independently from the size of D. Configurations of type C, compared to
A and B, compute decompositions with larger number of modules/cycles, and
uneven distribution of axioms in modules (figures indicate the coefficient of vari-
ation of the distribution). This shows the necessity for tuning the decompositions
constructed. Furthermore, as Figure 3 shows, while the size of D increases, the
results get worst w.r.t. our requirements: This is due to the fact that larger D pro-
vides more degrees of freedom for the decomposition, thus more modules, greater
diameter and more cycles in the network of associated modules.
Figure 3. Comparative results for (a) number of modules, (b) network diameter, (c) number of
cycles, (d) distribution of axioms for all networks and A |D| configurations.

5. Concluding Remarks
Given an ontology within the SHIQ fragment of Description Logics, this paper
proposes the mONTul method for partitioning this ontology into an arbitrary
number of modules that (a) form a distributed E − SHIQ knowledge base, such
that (b) it can be used for correct and complete reasoning, and (c) each module
preserves to a great extent the meaning of the terms in its signature. Future
work concerns further investigation on the use of different sets of (hard and soft)
constraints and methods for tuning the network of associated modules.

References
 [1]   Bernardo Cuenca Grau, Ian Horrocks, Yevgeny Kazakov and Ulrike Sattler: A Logical
       Framework for Modularity of Ontologies, In proc. of IJCAI 2007, 298–303.
 [2]   Bernardo Cuenca Grau, Ian Horrocks, Yevgeny Kazakov and Ulrike Sattler: Just the right
       amount: extracting modules from ontologies, Proc. of WWW ’07, (2007), 717–726.
 [3]   Bernardo Cuenca Grau, Ian Horrocks, Yevgeny Kazakov and Ulrike Sattler: Modular reuse
       of ontologies: Theory and practice. JAIR, 31:1, (2008) 273–318
 [4]   Boris Konev, Carsten Lutz, Walther Dirk and Frank Wolter: Logical Difference and Module
       Extraction with CEX and MEX, Description Logics, CEUR Workshop Proc., 353, 2008.
 [5]   Carsten Lutz, Dirk Walther and Frank Wolter: Conservative Extensions in Expressive
       Description Logics, In Proc. of IJCAI 2007, 453-458.
 [6]   Jyotishman Pathak, Thomas M. Johnson, and Christopher G. Chute: Survey of modular
       ontology techniques and their applications in the biomedical domain, Integr. Comput.-
       Aided Eng., IOS Press, (16:3), (2009), 225–242.
 [7]   Julian Seidenberg and Alan Rector: Web ontology segmentation: analysis, classification
       and use, In Proc. of the 15th Int. Conf. on World Wide Web, ACM Press (2006), 13–22.
 [8]   Heiner Stuckenschmidt and Anne Schlicht: Structure-Based Partitioning of Large Ontolo-
       gies, Modular Ontologies, LNCS, 5445, (2009), 187–210.
 [9]   George A. Vouros and Georgios M. Santipantakis: Combining Ontologies with Correspon-
       dences and Link Relations: The E-SHIQ Representation Framework, arXiv, 1310.2493,
       (2013), cs.AI/1310.2493.
[10]   Georgios M. Santipantakis, George A. Vouros: Modularizing Ontologies for the Construc-
       tion of E − SHIQ Distributed Knowledge Bases, SETN 2014, 192–206.
[11]   George A. Vouros, et al: A semantic information system for services and traded resources
       in Grid e-markets, FGCS, 26:7, (2010), 916–933.

</pre>