Modules in Transition
             Conservativity, Composition, and Colimits

                         Oliver Kutz1 and Till Mossakowski2
                  1
                      SFB/TR 8 Spatial Cognition, Bremen, Germany
                          okutz@informatik.uni-bremen.de
      2
          DFKI Lab Bremen and SFB/TR 8 Spatial Cognition, Bremen, Germany
                            till.mossakowski@dfki.de


      Abstract. Several modularity concepts for ontologies have been studied in
      the literature. Can they be brought to a common basis? We propose to use
      the language of category theory, in particular diagrams and their colimits,
      for answering this question. We outline a general approach for representing
      combinations of logical theories, or ontologies, through interfaces of various
      kinds, based on diagrams and the theory of institutions. In particular, we
      consider theory interpretations, language extensions, symbol identification,
      and conservative extensions. We study the problem of inheriting conserva-
      tivity between sub-theories in a diagram to its colimit ontology. Finally,
      we apply this to the problem of conservativity when composing DDLs or
      E-connections.


1   Introduction
In this paper, we propose to use the category theoretic notions of diagram and
colimit in order to provide a common semantic backbone for various notions of
modularity in ontologies.
    At least three commonly used notions of ‘module’ in ontologies can be dis-
tinguished, depending of the kind of relationship between the ‘module’ and its
supertheory (or superontology): (1) a module can be considered a ‘logically inde-
pendent’ part within its superontology—this leads to the definition of module as a
part of a larger ontology which is a conservative extensions of it; (2) a module can
be a part of a larger ‘integrated ontology’, where the kind of integration determines
the relation between the modules—this is the approach followed by modular ontol-
ogy languages (e.g. DDLs, E-connections etc.); (3) a ‘part’ of a larger theory can
be considered a module for reasons of elegance, re-use, tradition, etc.—in this case,
the relation between a module and its supertheory might be a language extension,
theory extension/interpretation, etc.
    The main contributions of the present paper are the following: (i) building on
the theory of institutions, diagrams, and colimits, we show how these different
notions of module can be considered simultaneously using the notion of a module
diagram; (ii) we show how conservativity properties can be traced and inherited to
the colimit of a diagram; (iii) we show how this applies to the composition problem
in modular ontology languages such as DDLs and E-connections.
    In Section 2, we will introduce institutions and give several examples. Section 3
introduces the diagrammatic view of modules, and Section 4 studies the problem of
conservativity in diagrams. Finally, in Section 5, we sketch heterogeneous diagrams,
and apply this to modular ontology languages in Section 6.
2   Institutions

The study of modularity principles can be carried out to a quite large extent inde-
pendently of the details of the underlying logical system that is used. The notion of
institutions was introduced by Goguen and Burstall in the late 1970s exactly for
this purpose (see Goguen and Burstall (1992)). They capture in a very abstract and
flexible way the notion of a logical system by leaving open the details of signatures,
models, sentences (axioms) and satisfaction (of sentences in models).
    The importance of the notion of institutions lies in the fact that a surpris-
ingly large body of logical notions and results can be developed in a way that is
completely independent of the specific nature of the underlying institution.1
    We assume some acquaintance with the basic notions of category theory, and
refer to Adámek et al. (1990) for an introduction. The reader with no background in
category theory can envisage a category as a “graph with composition of arrows”,
a functor as a “graph homomorphism”. If C is a category, Cop is the dual category,
where all arrows are reversed.

Definition 1. An institution I = (Sign, Sen, Mod, |=) consists of

 – a category Sign of signatures,
 – a functor Sen : Sign −→ Set2 giving, for each signature Σ, the set of sentences
   Sen(Σ), and for each signature morphism σ : Σ −→ Σ 0 , the sentence translation
   map Sen(σ) : Sen(Σ) −→ Sen(Σ 0 ), where often Sen(σ)(ϕ) is written as σ(ϕ),
 – a functor Mod : Signop −→ CAT 3 giving, for each signature Σ, the category
   of models Mod(Σ), and for each signature morphism σ : Σ −→ Σ 0 , the reduct
   functor Mod(σ) : Mod(Σ 0 ) −→ Mod(Σ), where often Mod(σ)(M 0 ) is written
   as M 0 |σ ,
 – a satisfaction relation |=Σ ⊆ |Mod(Σ)| × Sen(Σ) for each Σ ∈ |Sign|,
such that for each σ : Σ −→ Σ 0 in Sign the following satisfaction condition holds:

                         (?)      M 0 |=Σ 0 σ(ϕ) iff M 0 |σ |=Σ ϕ

for each M 0 ∈ |Mod(Σ 0 )| and ϕ ∈ Sen(Σ), expressing that truth is invariant under
change of notation and enlargement of context.                                   t
                                                                                 u

The only condition governing the behaviour of institutions is thus the satisfaction
condition (?).4
   A theory in an institution is a pair T = (Σ, Γ ) consisting of a signature
Sig(T ) = Σ and a set of Σ-sentences Ax(T ) = Γ , the axioms of the theory. The
models of a theory T are those Sig(T )-models that satisfy all axioms in Ax(T ).
Logical consequence is defined as usual: T |= ϕ if all T -models satisfy ϕ. Theory
morphisms, also called interpretations of theories, are signature morphisms
1
  For an extensive treatment of the model theory in this setting, see Diaconescu (2007).
2
  Set is the category having all sets as objects and functions as arrows.
3
  CAT is the category of categories and functors. Strictly speaking, CAT is not a category
  but only a so-called quasicategory, which is a category that lives in a higher set-theoretic
  universe.
4
  Note, however, that non-monotonic formalisms can only indirectly be covered this way,
  but compare, e.g., Guerra (2001).
that map axioms to logical consequences. An institution is compact if T |= ϕ
implies T 0 |= ϕ for a finite subtheory T 0 of T .
    Examples of institutions include:
                                                              =
Example 2. First-order Logic. In the institution FOLms of many-sorted first-
order logic with equality, signatures are many-sorted first-order signatures, consist-
ing of sorts and typed function and predicate symbols. Signature morphisms map
symbols such that typing is preserved. Models are many-sorted first-order struc-
tures. Sentences are first-order formulas. Sentence translation means replacement
of the translated symbols. Model reduct means reassembling the model’s compo-
nents according to the signature morphism. Satisfaction is the usual satisfaction of
a first-order sentence in a first-order structure.                                  t
                                                                                    u

Example 3. Description Logics. Signatures of the description logic ALC consist
of a set of B of atomic concepts and a set R of roles, while signature morphisms
provide respective mappings. Models are single-sorted first-order structures that
interpret concepts as unary and roles as binary predicates. Sentences are subsump-
tion relations C1 v C2 between concepts, where concepts follow the grammar

                C ::= B | > | ⊥ | C1 t C2 | C1 u C2 | ¬C | ∀R.C | ∃R.C

Sentence translation and reduct is defined similarly as in FOL= . Satisfaction is
the standard satisfaction of description logics. ALCms is the many-sorted variant
of ALC. ALCO is obtained from ALC by extending signatures with nominals. The
(sub-Boolean) description logic EL restricts ALC as follows: C ::= B | > | C1 u
C2 | ∃R.C. SHOIN extends ALC with role hierarchies, transitive and inverse roles,
(unqualified) number restrictions, and nominals, etc.                          t
                                                                               u

Example 4. (Quantified) Modal Logics. The modal logic S4u has signatures
as classical propositional logic, consisting of propositional variables. Sentences are
built as in propositional logic, but add two unary modal operators,  and . Mod-
els are standard Kripke structures but based on reflexive and transitive relations.
Satisfaction is standard modal satisfaction, where  is interpreted by the transitive
reflexive relation, and  by universal quantification over worlds.
    The standard formulation of first-order modal logic QS5 (due to Kripke) has
signatures similar to FOL= , including variables and predicate symbols. Sentences
follow the grammar for FOL= -sentences using Booleans, quantifiers, and identity,
while adding the  operator, but leaving out constants and function symbols.
Models are constant-domain first-order Kripke structures, with the usual first-order
modal satisfaction.                                                                 t
                                                                                    u


3    Modules as Diagrams

Several approaches to modularity in ontologies have been discussed in recent years,
including the introduction of various so-called ‘modular ontology languages’. The
module system of the Web Ontology Language OWL itself is as simple as inade-
quate (Cuenca-Grau et al., 2006b): it allows for importing other ontologies, includ-
ing cyclic imports. The language Casl (see Bidoit and Mosses, 2004; CoFI (The
Common Framework Initiative), 2004) has been used in Lüttich et al. (2006) for
ontologies. Beyond imports, it allows for renaming, hiding and parameterisation.
Other languages that have been introduced include DDLs (Borgida and Serafini,
2002, 2003), E-connections (Kutz et al., 2004), and P-DLs (Bao et al., 2006b,a),
where, roughly speaking, more involved integration and modularisation mechanisms
than plain imports are envisaged.
    We will use the formalism of colimits of diagrams as a common semantic back-
bone for these languages.5 The intuition behind this formalism is explained as
follows:


        “Given a species of structure, say widgets, then the result of intercon-
     necting a system of widgets to form a super-widget corresponds to taking
     the colimit of the diagram of widgets in which the morphisms show how
     they are interconnected.” Goguen (1991)

    The notion of diagram is formalised in category theory. Diagrams map an index
category (via a functor) to a given category of interest. They can be thought of as
graphs in the category. A cocone over a diagram is a kind of “tent”: it consists of
a tip, together with a morphism from each object involved in the diagram into the
tip, such that the triangles arising from the morphisms in the diagram commute.
A colimit is a universal, or minimal cocone. We refer to (Adámek et al., 1990) for
formal details.
    In the sequel, we will assume that the signature category has all finite colimits,
which is a rather mild assumption; in particular, it is true for all the examples
above. Moreover, we will rely on the fact that colimits of theories exist in this case
as well; the colimit theory is defined as the union of all component theories in the
diagram, translated along the signature morphisms of the colimiting cocone.


Definition 5. A module diagram of ontologies is a diagram of theories such that
the nodes are subdivided into ontology nodes and interface nodes.

Composition of module diagrams is simply their union.


Example 6. Consider the union of the diagrams

       T1                           T2            T2                        T3
                                 -                                        -

                     Σ1                                          Σ2

where the Σi are interfaces and the Ti are ontologies. Think of e.g. T12 as being
an ontology that imports T1 and T2 , where Σ1 contains all the symbols shared
between T1 and T2 . Then T12 (and T23 ) can be obtained as pushouts, and so can
the overall union T123 (which typically is then further extended with new concepts
etc.). A “c” means “conservative”; this will be explained in Sect. 4.

5
    However, note that hiding is not covered by this approach.
                                                 T123 
                                     c       -                c

                             T12                                     T23 
                     c   -               c                c       -           c

             T1                                 T2                                  T3
                                             -                                    -

                             Σ1                                       Σ2
   It is clear that theories with an import structure are just tree-shaped diagrams,
while both shared parts and cyclic imports lead to arbitrary graph-shaped dia-
grams. The translation of Casl (without hiding) to so-called development graphs
detailed in (CoFI (The Common Framework Initiative), 2004) naturally leads to
diagrams as well. Finally, the diagrams corresponding to modular languages like
DDLs and E-connections will be studied in Sect. 6. Thus, diagrams can be seen as a
uniform mathematical formalism where properties of all of these module concepts
can be studied. An important such property is conservativity.


4   Conservative Diagrams
Conservative diagrams are important because they imply that the combined ontol-
ogy does not add new facts to the individual ontologies. Indeed, the notion of an
ontology module of an ontology T has been defined as any “subontology T 0 such
that T is a conservative extension of T 0 ” (Ghilardi et al., 2006)—this perfectly
matches our notion of conservative diagram below.
Definition 7. A theory morphism σ : T1 −→ T2 is proof-theoretically conser-
vative, if T2 does not entail anything new w.r.t. T1 , formally, T2 |= σ(ϕ) implies
T1 |= ϕ. Moreover, σ : T1 −→ T2 is model-theoretically conservative, if any
T1 -model M1 has a σ-expansion to T2 , i.e. a T2 -model M2 with M2 |σ = M1 .
It is easy to show that conservative theory morphisms compose. Moreover,
Proposition 8. Model-theoretic implies proof-theoretic conservativity.
However, the converse is not true in general:
Example 9. (Lutz and Wolter, 2007) Consider the following two EL TBoxes:

      Γ1 = {Human v ∃eats.>, Plant v ∃grows in.Area, Vegetarian v Healthy}
      Γ2 = {Human v ∃eats.Food, Food u Plant v Vegetarian}

It is easily seen that Γ1 ∪ Γ2 is a proof-theoretic conservative extension of Γ1 w.r.t.
EL. However, (Lutz and Wolter, 2007) also show this is not the case w.r.t. ALC,
as witnessed by

                    A := Human u ∀eats.Plant v ∃eats.Vegetarian,

since Γ1 ∪ Γ2 |= A, but Γ1 6|= A. In particular, it follows that Γ1 ∪ Γ2 is not a
model-theoretic conservative extension of Γ1 .
Definition 10. A (proof-theoretic, model-theoretic) conservative module
diagram of ontologies is a diagram of theories such that the theory morphism
of any ontology node into the colimit of the diagram is (proof-theoretically resp.
model-theoretically) conservative.
     By conservativity, the definition immediately yields:
Proposition 11. The colimit ontology of a proof-theoretic (model-theoretic) mod-
ule diagram is consistent (satisfiable)6 if any of the component ontologies is.
Thus, in particular, in a conservative module diagram, an ontology node Oi can
only be consistent (satisfiable) if all other ontology nodes Oj , j 6= i, are consistent
(satisfiable) as well.
    The main question is how to ensure these conservativity properties in the united
diagram. To study this, we introduce some notions from model theory, namely vari-
ous notions of interpolation (for proof-theoretic conservativity) and amalgama-
tion (for model-theoretic conservativity).
    Craig interpolation plays a crucial role in connection with proof systems in
structured theories, see Borzyszkowski (2002). It comes in various forms.
     AIP. (Arrow Interpolation) If |= ϕ → ψ, then there exists some χ that only
     uses symbols occurring in both ϕ and ψ, with |= ϕ → χ and |= χ → ψ.
However, this relies on a connective → being present in the institution. For the
general study of module systems, the following formulation is more suitable:
     TIP. (Turnstile Interpolation) If ϕ |= ψ, then there exists some χ that only
     uses symbols occurring in both ϕ and ψ, with ϕ |= χ and χ |= ψ.
TIP follows from AIP in the presence of a deduction theorem, and in this case, a
further generalisation follows (see Areces and Marx, 1998):
     SIP. (Splitting Interpolation) If ϕ0 , ϕ1 |= ψ, then there exists some χ that
     only uses symbols occurring in both ψ and the union of the symbols of ϕ0
     and ϕ1 , with ϕ0 |= χ and ϕ1 , χ |= ψ.
    We now further generalise SIP in two ways. The first generalisation concerns
the rather implicit use of signatures in the definitions above. Making this explicit
would mean to assume that ϕ lives in a signature Σ1 , ψ lives in a signature Σ2 ,
the entailment ϕ |= ψ lives in Σ1 ∪ Σ2 , and the interpolant in Σ1 ∩ Σ2 . Since we do
not want to go into the technicalities for equipping an institution with unions and
intersections (see Diaconescu et al. (1993) for details), we replace Σ1 ∩ Σ2 with a
signature Σ, and Σ1 ∪ Σ2 with Σ 0 such that Σ 0 is obtained as a pushout from the
other signatures via suitable signature morphisms (cf. the diagram below).
    Secondly, we move from single sentences to sets of sentences. This is useful
since we want to support DLs and TBox reasoning, and DLs like EL do not allow
to rewrite ‘conjunctions of subsumptions’, i.e., we cannot collapse a TBox into a
single sentence. (In case of compact logics, the use of sets is equivalent to the use
of finite sets.)
    We arrive at the following definition (in the sequel, fix an arbitrary institution
I = (Sign, Sen, Mod, |=)):
6
    Contrary to the terminology used in DL, we distinguish here proof-theoretic (syntactic)
    consistency of a theory T (which means T 6|= ϕ for some sentence ϕ) from model-
    theoretic (semantic) satisfiability (which means M |= T for some model M ).
Definition 12. The institution I has the Craig-Robinson interpolation prop-
erty (Shoenfield (1967), Dimitrakos and Maibaum (2000)), if for any pushout

                                             Σ0 
                                   θ1    -            θ2

                           Σ1                                  Σ2
                                    σ1              σ2     -

                                             Σ
any set Γ1 of Σ1 -sentences and any sets Γ2 , ∆2 of Σ2 -sentences with

                             θ1 (Γ1 ) ∪ θ2 (∆2 ) |= θ2 (Γ2 ),

there exists a set of Σ-sentences Γ (called the interpolant) such that

                        Γ1 |= σ1 (Γ ) and ∆2 ∪ σ2 (Γ ) |= Γ2 .

   Craig-Robinson interpolation is, in general, strictly stronger than Craig inter-
polation. However, for almost all logics typically used in knowledge representation,
they are indeed equivalent. We first give a criterion that applies to institutions
generally, taken from Diaconescu (2007):
Proposition 13. A compact institution with implication has Craig-Robinson in-
terpolation iff it has Craig interpolation.
Here, an institution has implication if for any two Σ-sentences ϕ, ψ, there exists a
Σ-sentence χ such that, for any Σ-model M ,

                        M |= χ iff (M |= ϕ implies M |= ψ)

Since for modal logics, the deduction theorem (for the global consequence relation
|=) generally fails, these logics do not have implication in the above sense, and we
cannot apply Prop. 13. However, we can apply a slightly more concrete criterion
for modal logics from the literature (cf. Prop. 2.1 in Areces and Marx (1998)):
Proposition 14. Let L be a modal logic whose local consequence relation is com-
pact and such that its class of Kripke frames is closed under point-generated sub-
frames. Then Craig interpolation for L implies Craig-Robinson interpolation.

Example 15 (Interpolation). The description logic ALC can be conceived as a syn-
tactic variant of multi-modal K, for which (Gabbay, 1972; Gabbay and Maksi-
mova, 2006) show Craig interpolation. K does not have implication, but satisfies
the assumptions of Prop. 14. Hence, ALC has Craig-Robinson interpolation. The
situation for DLs with nominals is less positive, in fact, the presence of nominals
generally destroys (standardly formulated) Craig interpolation (compare the dis-
cussion in (Kutz, 2004, Chapter 3.4), and Areces and ten Cate (2006)) but can
sometimes be restored, for instance, by treating nominals as logical constants, i.e.,
by freely reusing them. Here is a counterexample formulated in ALCO. Let

                          Γ := {> v ∃S.C u ∃S.¬C} and
                          ∆ := {∀S.(D t i) v ∃S.D}
where i is a nominal. Clearly, Γ |= ∆, for in every model M |= Γ , every point
has at least two S-successors. But i can only be true in at most one of those
successors, which entails M |= ∆. Now, (using bisimulations) it can be shown that
in ALCO there is no ∆0 built from shared concept names alone (there are none)
such that Γ |= ∆0 and ∆0 |= ∆. If we allow to use non-shared concept constructors
(modalities), an interpolant could obviously be given in logics such as SHOIN by
using (unqualified) number restrictions and by setting ∆0 = {> v (>2S)}. Note,
however, that ten Cate et al. (2006) show that interpolation still fails for ALCQO
(since Beth fails), but that the Beth definability property is recovered for ALCO@,
or indeed for SHIFO@.
    Craig-Robinson for FOLms is shown in Diaconescu (2007) (when one of the
signature morphisms is injective on sorts). Craig interpolation for FOL is a classic
result of Craig (1957), and Craig-Robinson follows since FOL is compact and has
implication.
    The failure of Craig interpolation for QS5 is shown in Fine (1979).7 But it holds
for the quantified extension of K (Gabbay, 1972), and so does Craig-Robinson.
    Finally, the modal logic S4u has Craig-interpolation,8 is compact (Goranko and
Passy, 1992), and has implications (for M |= ϕ =⇒ M |= ψ, set χ = ϕ → ψ).
Thus, S4u has Craig-Robinson interpolation.

The amalgamation property (called ‘exactness’ in Diaconescu et al. (1993)) is a
major technical assumption in the study of specification semantics, see Sannella
and Tarlecki (1988).

Definition 16. An institution I is (weakly) semi-exact if, for any pushout with
notation as depicted above, given any pair (M1 , M2 ) ∈ Mod(Σ1 ) × Mod(Σ2 ) that
is compatible (in the sense that M1 and M2 reduce to the same Σ-model) can be
amalgamated to a unique (or weakly amalgamated to a not necessarily unique) Σ 0 -
model M (i.e., there exists a (unique) M ∈ Mod(Σ 0 ) that reduces to M1 and M2 ,
respectively), and similarly for model morphisms.
    I is (weakly) exact, if additionally the initial (= empty, usually) signature has
exactly (at least) one model (and one model morphism).

   Note that (weak) exactness implies (weak) semi-exactness, while both are in-
dependent of interpolation.
   Weak semi-exactness for these institutions follows with standard methods, see
Diaconescu (2007). The same holds for exactness for the many-sorted variants.
Exactness, however, obviously fails for the single-sorted logics as well as for QS5,
because in these logics, the implicit universe resp. the implicit set of worlds leads to
the phenomenon that the empty signature has many different models. The following
propositions are folklore in institutional model theory, see Diaconescu (2007).
7
  Craig interpolation for QS5 can be restored, however, by extending the language with
  propositional quantifiers (Fitting, 2002) or nominals and @-operator (Areces et al.,
  2003).
8
  S4u can be thought of as the independent fusion of the modal logics S4 and S5, which
  both have interpolation, plus the containment axiom ϕ → ϕ. The interpolation
  property transfers to the fusion by a result of Kracht and Wolter (1991). However,
  since S4u is a Sahlqvist axiomatisable logic whose frame conditions are universal Horn,
  it also follows for S4u by a result of (Marx and Venema, 1997).
                 institution weakly semi-exact exact Craig-Robinson
                 EL                  +           -          ?
                 ALCms               +           +         +
                 ALC                 +           -         +
                 ALCO                +           -          -
                 ALCQO               +           -          -
                 SHOIN               +           -          ?
                 FOLms               +           +         +
                 QS5                 +           -          -


                 Fig. 1. (Weak semi-) exactness and Craig-Robinson


Theorem 17. 1. In an institution with Craig-Robinson interpolation, proof-theoretic
   conservativity is preserved along pushouts.
2. In an institution that is weakly semi-exact, model-theoretic conservativity is
   preserved along pushouts.

     Call a diagram connected if the graph underlying its index category is con-
nected when the identity arrows are deleted. A diagram is thin, or a preorder,
if its index category is thin (i.e., there is at most one arrow between two given
objects). A preorder is finitely bounded inf-complete if any two elements with
a common lower bound have an infimum.

Theorem 18. 1. If an institution has Craig-Robinson interpolation, composition
   of module diagrams via union preserves proof-theoretic conservativity if the
   resulting diagram is a connected finitely bounded inf-complete preorder.
2. If an institution is weakly semi-exact, composition of module diagrams via union
   preserves model-theoretic conservativity, if the resulting diagram is a connected
   finitely bounded inf-complete preorder. In case of weak exactness of the insti-
   tution, the connectedness assumption can be replaced with satisfiability of the
   involved ontologies.

Proof. We can obtain colimits for arbitrary connected finitely bounded inf-complete
preorders by successively taking pushouts. In each successive step, the pushout for
two maximal nodes with a common lower bound is taken along the infimum, thereby
decreasing the set of maximal nodes by one. Connectedness ensures that in case of
more than one maximal node, at least two of them have a common lower bound.
If a diagram with one maximal (=maximum) node is reached, this node provides
the colimit.
    Since in each individual module diagram, each ontology conservatively lies in
the colimit of that module diagram, by the above argument, this carries over to
the colimit of the union of the module diagrams.
    In case of a weakly exact institution, we always can add the initial signature
and the unique signature morphisms out of it, and thus get a connected diagram.
Hence we can drop the connectedness assumption in this case. Instead, we need
conservativity of the newly added signature morphisms, which amount to satisfia-
bility of the ontologies.                                                        t
                                                                                 u

   See Example 6 for a union of conservative diagrams.
5     Heterogeneous Module Diagrams
As Schorlemmer and Kalfoglou (2007) argue convincingly, relating ontologies may
happen across different institutions, since ontologies are written in many different
formalisms, like relation schemata, description logics, first-order logic, and modal
logics.
    Heterogeneous specification is based on some graph of logics and logic transla-
tions, formalised as institutions and so-called institution comorphisms, see Goguen
and Roşu (2002). The latter are again governed by the satisfaction condition, this
time expressing that truth is invariant also under change of notation across different
logical formalisms:

                        M 0 |=JΦ(Σ) αΣ (ϕ) ⇔ βΣ (M 0 ) |=IΣ ϕ.
Here, Φ(Σ) is the translation of signature Σ from institution I to institution J,
αΣ (ϕ) is the translation of the Σ-sentence ϕ to a Φ(Σ)-sentence, and βΣ (M 0 ) is
the translation (or perhaps: reduction) of the Φ(Σ)-model M 0 to a Σ-model.
    The so-called Grothendieck institution, see Diaconescu (2002); Mossakowski
(2002), is a technical device for giving a semantics to heterogeneous theories in-
volving several institution. The Grothendieck institution is basically a flattening,
or disjoint union, of a logic graph. A signature in the Grothendieck institution con-
sists of a pair (L, Σ) where L is a logic (institution) and Σ is a signature in the
logic L. Similarly, a Grothendieck signature morphism (ρ, σ) : (L1 , Σ1 ) → (L2 , Σ2 )
consists of a logic translation ρ = (Φ, α, β) : L1 −→ L2 plus an L2 -signature mor-
phism σ : Φ(Σ1 ) −→ Σ2 . Sentences, models and satisfaction in the Grothendieck
institution are defined in a component wise manner.
    Hence, the definitions and results of the previous sections also apply to the
heterogeneous case. Special care is needed in obtaining Craig-Robinson interpo-
lation or weak semi-exactness in the Grothendieck institution. Diaconescu (2007)
and Mossakowski (2006) contain some relevant results. In our experience with the
heterogeneous tool set (see Mossakowski et al. (2007)), for the Grothendieck in-
stitution, it was much easier to obtain weak semi-exactness than Craig-Robinson
interpolation.


6     Modular Languages
In this section, we show how the integration of ontologies via various ‘modular lan-
guages’ can be re-formulated in module diagrams. We concentrate on DDLs and
E-connections, which we reformulate as many-sorted theories. Finally, we analyse
the problem of conservativity when composing DDLs or E-connections. In the fol-
lowing, we will assume basic acquaintance with the syntax and semantics of both,
DDLs and E-connections. Details have to remain sketchy for lack of space.

6.1   DDLs and E-connections as heterogeneous module diagrams
From the discussion of Section 5, it should be clear that DDLs or E-connections
can essentially be considered as many-sorted heterogeneous theories: component
ontologies can be formulated in different logics, but have to be built from many-
sorted vocabulary, and link relations are interpreted as relations connecting the
sorts of the component logics (compare Baader and Ghilardi (2007) who note that
this is an instance of a more general co-comma construction). To be more precise,
assume a DDL D = (S1 , S2 ) is given. Knowledge bases for D can contain bridge
rule of the form:
                     v                                       w
                 Ci −→ Cj         (into rule)           Ci −→ Cj          (onto rule)
where Ci and Cj are concepts from Si and Sj (i 6= j), respectively.9
    An interpretation I for a DDL knowledge base is a pair ({Ii }i≤n , R), where each
Ii is a model for the corresponding Si , and R is a function associating with every
pair (i, j), i 6= j, a binary relation rij ⊆ Wi × Wj between the domains Wi and Wj
of Ii and Ij , respectively.


                         CE (T1ms , T
                                     ms
                                     2 )                         DDL(T1ms , T2ms )
                                                                   -

                                               T1ms ] T2
                                                        ms

                                               -
                                           c                 c
                                 T1ms                                 T2ms 
                             -                                     -
                         c                                                      c
               T1                                  ∅                                    T2

                     Fig. 2. E-connections and DDLs many-sorted


    In the many-sorted re-formulation of DDLs, the relation rij is now interpreted as
a relation between the >-sort of S1 and the >-sort of S2 . Bridge rules are expressed
as existential restrictions of the form
                    (]) ∃rij .Ci v Cj                  and           ∃rij .Ci w Cj
The fact that bridge rules are atomic statements in a DDL knowledge base now
translates to a restriction on the grammar governing the usage of the link relation
rij in the multi-sorted formalism (see Borgida (2007) for a discussion of related
issues). In fact, the main difference between DDLs and various E-connections now
lies in the expressivity of this ‘link language’ L connecting the different sorts of
the ontologies. In basic DDL as defined above, the only expressions allowed are
those given in (]), so the link language of DDL is a certain, very weak sub-Boolean
fragment of many sorted ALC, namely the one given through (]). In E-connections,
expressions of the form ∃rij .Ci are again concepts of Sj , to which Booleans (or other
operators) of Sj as well as restrictions using relations rji can be applied. Thus, the
basic link language of E-connections is sorted ALCIms (relative to the now richer
languages of Si ).10
9
   We consider here only DDL in its most basic form, comprising into and onto rules, but
   no individual correspondences. DDL in this form can be seen as a sub-formalism of
   one-way E-connections, see (Cuenca-Grau and Kutz, 2007).
10
   But can be weakened to ALCms or the link language of DDLs, or strengthened to more
   expressive many-sorted DLs such as ALCQIms .
    Such many-sorted theories can easily be represented in a diagram as shown
in Figure 2. Here, we first (conservatively) obtain a disjoint union T1ms ] T2ms as a
pushout, where the component ontologies have been turned into sorted variants (us-
ing an institution comorphism from the single-sorted to the many-sorted logic), and
the empty interface guarantees that no symbols are shared at this point. An E-con-
nection KB in language CE (T1ms , T2ms ) or a DDL KB in language DDL(T1ms , T2ms ) is
then obtained as a (typically not conservative) theory extension.

6.2   The problem of iteration, or, how to keep modules intact
When connecting ontologies via bridges, or interfaces, this typically is not conser-
vative everywhere, but only for some of the involved ontologies. We give a criterion
for a single ontology to be conservative in the combination. While the theorem can
be applied to arbitrary interface nodes, when applied to E-connections or DDLs,
we assume that bridge nodes contain DDL bridge rules or E-connection assertions.
Theorem 19. Assume that we work in an institution that has Craig-Robinson in-
terpolation (is weakly semi-exact). Let ontologies T1 , . . . , Tn be connected via bridges
Bij , i < j. If Ti is proof-theoretically (model-theoretically) conservative in Bij for
i < j, then T1 is proof-theoretically (model-theoretically) conservative in the result-
ing colimit ontology.
Proof. By induction over n. The base n = 1 is clear. Suppose now that the result
holds for n, such that T1 lies conservatively in the colimit ontology T , and we add
Tn+1 with corresponding bridges B1,n+1 , . . . , Bn,n+1 .

                                                                                      T0
                                                                            c     -
                                                                                       6

                                                             c       -
                                                                            ...

                                            c     - •


                             c   - •                                ...

                      T                        - B1,n+1            ...        - Bn,n+1
             c    -              c                               c               -
      T1              ...            Tn                              Tn+1
The resulting new colimit theory T 0 is constructed by successively constructing
pushouts, whence we can use Theorem 17 to lift the conservativities of the mor-
phisms Ti → Ti,n+1 to conservativities of the arrows in the chain from T to T 0 .
Since conservative theory morphisms compose, T1 is conservative in T 0 .       t
                                                                               u
As concerns the applicability of the theorem, we have given an overview of log-
ics being weakly semi-exact or having Craig-Robinson interpolation in Fig. 4. Of
course, the conservativity assumptions have to be shown additionally. We here give
an example of the failure of the claim of the theorem in case we work in a logic
that lacks Craig-Robinson interpolation.
Example 20. Assume ontologies T1 , T2 , T3 are formulated in the DL ALCO, recall
the example of failure of interpolation for ALCO given in Example 15, and consider
the following diagram:
                                         B23
                            T2                     - T3
                                           c


                             


                                                      -
                                       c        c
                                 B12                B13

                                           T1

where Sig(T1 ) ⊆ {S, B, D, i}, Sig(T2 ) ⊆ {C1 , C2 }, and Sig(T3 ) ⊆ {B1 , B2 }. More-
over, T1 ⊇ {∃S.D}, and B12 = {> v ∃S.∃R1 .C1 , > v ∃S.∃R1 .¬C2 }, B13 = {B1 ≡
∃R3−1 .B, B2 ≡ ∃R3−1 .B}, and B23 = {C1 ≡ ∃R2 .B1 , C2 ≡ ∃R2 .B2 }.
    Here, the roles R1 , R2 , R3 can be seen as link relations, and since we apply
existential restrictions ∃S to ∃R2 .C1 etc., the example can be understood as a
composition of (binary) E-connections.
    The reader can check that Ti is conservative in Bij for j > i. However, in the
colimit (union) of this diagram, ∀S.D t i v ∃S.D follows, while this does not follow
in T1 , and thus T1 is not conservative in the colimit ontology.


7   Discussion and Outlook
Diagrams and their colimits offer the right level of abstraction to study conser-
vativity issues in different languages for modular ontologies. We have singled out
conditions that allow for lifting conservativity properties from individual diagrams
to their combinations.
    An interesting point is the question whether proof-theoretic or model-theoretic
conservativity should be used. The model-theoretic notion ensures ‘modularity’ in
more logics than the proof-theoretic one since the lifting theorem for the former
only depends on mild amalgamation properties. By contrast, for the latter one
needs Craig-Robinson interpolation which fails, e.g., for some description logics
with nominals, and also for QS5—but these logics are used in practice for ontology
design.
    Moreover, as argued in Sect. 5, when relating ontologies across different insti-
tutions, the model-theoretic notion is more feasible. Finally, it has the advantage
of being independent of the particular language, which implies avoidance of exam-
ples like the one presented in Lutz and Wolter (2007) (Example 9 above), where a
given ontology extension is proof-theoretically conservative in EL but not in ALC.
Of course, model-theoretic conservativity generally is harder to decide, but it can
be ensured by syntactic criteria, and the work related to this is promising (e.g.
Cuenca-Grau et al., 2006a, 2007).

Acknowledgements
Work on this paper has been supported by the Vigoni program of the DAAD and
by the DFG-funded collaborative research centre ‘Spatial Cognition’.
We thank John Bateman, Joana Hois, and Lutz Schröder for fruitful discussions.
                               Bibliography


Adámek, J., Herrlich, H. and Strecker, G. (1990). Abstract and Concrete Cate-
  gories. Wiley, New York. Freely available at http://www.math.uni-bremen.de/∼dmb/
  acc.pdf.
Areces, C., Blackburn, P. and Marx, M. (2003). Repairing the Interpolation Theo-
  rem in Quantified Modal Logic. Annals of Pure and Applied Logics 123, 287–299.
Areces, C. and Marx, M. (1998). Failure of interpolation in combined modal logics.
  Notre Dame Journal of Formal Logic 39, 253–273.
Areces, C. and ten Cate, B. (2006). Hybrid Logics. In J. v. Benthem, P. Blackburn
  and F. Wolter, eds., Handbook of Modal Logic. Amsterdam: Elsevier, pages 821–868.
Baader, F. and Ghilardi, S. (2007). Connecting Many-Sorted Theories. The Journal
  of Symbolic Logic 72, 535–583.
Bao, J., Caragea, D. and Honavar, V. (2006a). Modular Ontologies - A Formal
  Investigation of Semantics and Expressivity. In Proc. of ASWC . Springer.
Bao, J., Caragea, D. and Honavar, V. (2006b). On the Semantics of Linking and
  Importing in Modular Ontologies. In Proc. of ISWC . Springer.
Bidoit, M. and Mosses, P. D. (2004). Casl User Manual . LNCS Vol. 2900 (IFIP
  Series). Springer. Freely available at http://www.cofi.info.
Borgida, A. (2007). On Importing Knowledge from DL Ontologies: some Intuitions and
  Problems. In Proc. of DL.
Borgida, A. and Serafini, L. (2002). Distributed Description Logics: Directed Domain
  Correspondences in Federated Information Sources. In On The Move to Meaningful
  Internet Systems, volume 2519 of LNCS . Springer.
Borgida, A. and Serafini, L. (2003). Distributed Description Logics: Assimilating
  Information from Peer Sources. Journal of Data Semantics 1, 153–184.
Borzyszkowski, T. (2002). Logical systems for structured specifications. Theoretical
  Computer Science 286, 197–245.
CoFI (The Common Framework Initiative) (2004). Casl Reference Manual . LNCS
  Vol. 2960 (IFIP Series). Springer. Freely available at http://www.cofi.info.
Craig, W. (1957). Three uses of the Herbrand-Genzen theorem in relating model theory
  and proof theory. Journal of Symbolic Logic 22, 269–285.
Cuenca-Grau, B., Horrocks, I., Kazakov, Y. and Sattler, U. (2007). Ontology
  reuse: Better safe than sorry. In Proc. of DL-07 .
Cuenca-Grau, B., Horrocks, I., Kutz, O. and Sattler, U. (2006a). Will my On-
  tologies Fit Together? In Proc. of DL-06 .
Cuenca-Grau, B. and Kutz, O. (2007). Modular Ontology Languages Revisited. In
  Proc. of the IJCAI’07 Workshop on Semantic Web for Collaborative Knowledge Acqui-
  sition (SWeCKa), Hyderabad, India, January 2007 .
Cuenca-Grau, B., Parsia, B. and Sirin, E. (2006b). Combining OWL Ontologies
  Using E-Connections. Journal of Web Semantics 4, 40–59.
Diaconescu, R. (2002). Grothendieck institutions. Applied Categorical Structures 10,
  383–402.
Diaconescu, R. (2007). Institution-independent Model Theory. Springer Verlag. To
  appear.
Diaconescu, R., Goguen, J. and Stefaneas, P. (1993). Logical Support for Modu-
  larisation. In 2nd Workshop on Logical Environments. New York: CUP, pages 83–130.
Dimitrakos, T. and Maibaum, T. (2000). On a generalised modularisation theorem.
  Information Processing Letters 74, 65–71.
Fine, K. (1979). Failures of the Interpolation Lemma in Quantified Modal Logic. The
  Journal of Symbolic Logic 44, 201–206.
Fitting, M. (2002). Interpolation for First Order S5. The Journal of Symbolic Logic 67,
  621–634.
Gabbay, D. and Maksimova, L. (2006). Interpolation and definability: Modal and Intu-
  itionistic Logics (Oxford Logic Guides). Oxford science publications. Clarendon press,
  Oxford.
Gabbay, D. M. (1972). Craig’s interpolation theorem for modal logics. In Confer-
  ence in Mathematical Logic, London ’70 , volume 255 of Lecture Notes in Mathematics.
  Springer.
Ghilardi, S., Lutz, C. and Wolter, F. (2006). Did I Damage My Ontology? A Case
  for Conservative Extensions in Description Logics. In Proceedings of KR-06 .
Goguen, J. and Roşu, G. (2002). Institution morphisms. Formal aspects of computing
  13, 274–307.
Goguen, J. A. (1991). A categorical manifesto. Mathematical Structures in Computer
  Science 1, 49–67.
Goguen, J. A. and Burstall, R. M. (1992). Institutions: Abstract Model Theory for
  Specification and Programming. Journal of the ACM 39, 95–146.
Goranko, V. and Passy, S. (1992). Using the Universal Modality: Gains and Questions.
  J. Log. Comput. 2, 5–30.
Guerra, S. (2001). Composition of Default Specifications. J. Log. Comput. 11, 559–578.
Kracht, M. and Wolter, F. (1991). Properties of Independently Axiomatizable Bi-
  modal Logics. The Journal of Symbolic Logic 56, 1469–1485.
Kutz, O. (2004). E-Connections and Logics of Distance. Ph.D. thesis, The University of
  Liverpool.
Kutz, O., Lutz, C., Wolter, F. and Zakharyaschev, M. (2004). E-Connections of
  Abstract Description Systems. Artificial Intelligence 156, 1–73.
Lüttich, K., Masolo, C. and Borgo, S. (2006). Development of Modular Ontologies in
  CASL. In P. Haase, V. Honavar, O. Kutz, Y. Sure and A. Tamilin, eds., 1st Workshop on
  Modular Ontologies 2006 , volume 232 of CEUR Workshop Proceedings. CEUR-WS.org.
Lutz, C. and Wolter, F. (2007). Conservative Extensions in the Lightweight Descrip-
  tion Logic EL. In Proc. of CADE-07 . Springer.
Marx, M. and Venema, Y. (1997). Multi-dimensional modal logic. Dordrecht: Kluwer
  Academic Publishers.
Mossakowski, T. (2002). Comorphism-based Grothendieck logics. In K. Diks and
  W. Rytter, eds., Mathematical Foundations of Computer Science, volume 2420 of
  LNCS . Springer, pages 593–604.
Mossakowski, T. (2006). Institutional 2-cells and Grothendieck institutions. In K. Fu-
  tatsugi, J.-P. Jouannaud and J. Meseguer, eds., Algebra, Meaning and Computation.
  Essays Dedicated to Joseph A. Goguen on the Occasion of His 65th Birthday, LNCS
  4060. Springer.
Mossakowski, T., Maeder, C. and Lüttich, K. (2007). The Heterogeneous Tool Set.
  In O. Grumberg and M. Huth, eds., TACAS 2007 , volume 4424 of Lecture Notes in
  Computer Science. Springer-Verlag Heidelberg.
Sannella, D. and Tarlecki, A. (1988). Specifications in an arbitrary institution.
  Information and Computation 76, 165–210.
Schorlemmer, W. M. and Kalfoglou, Y. (2007). Institutionalising Ontology-Based
  Semantic Integration. Journal of Applied Ontology. To appear.
Shoenfield, J. (1967). Mathematical Logic. Addison-Wesley, Reading, Massachusetts.
ten Cate, B., Conradie, W., Marx, M. and Venema, Y. (2006). Definitorially Com-
  plete Description Logics. In P. Doherty, J. Mylopoulos and C. Welty, eds., Proceedings
  of KR 2006 . AAAI Press.