Modules in Transition Conservativity, Composition, and Colimits Oliver Kutz1 and Till Mossakowski2 1 SFB/TR 8 Spatial Cognition, Bremen, Germany okutz@informatik.uni-bremen.de 2 DFKI Lab Bremen and SFB/TR 8 Spatial Cognition, Bremen, Germany till.mossakowski@dfki.de Abstract. Several modularity concepts for ontologies have been studied in the literature. Can they be brought to a common basis? We propose to use the language of category theory, in particular diagrams and their colimits, for answering this question. We outline a general approach for representing combinations of logical theories, or ontologies, through interfaces of various kinds, based on diagrams and the theory of institutions. In particular, we consider theory interpretations, language extensions, symbol identification, and conservative extensions. We study the problem of inheriting conserva- tivity between sub-theories in a diagram to its colimit ontology. Finally, we apply this to the problem of conservativity when composing DDLs or E-connections. 1 Introduction In this paper, we propose to use the category theoretic notions of diagram and colimit in order to provide a common semantic backbone for various notions of modularity in ontologies. At least three commonly used notions of ‘module’ in ontologies can be dis- tinguished, depending of the kind of relationship between the ‘module’ and its supertheory (or superontology): (1) a module can be considered a ‘logically inde- pendent’ part within its superontology—this leads to the definition of module as a part of a larger ontology which is a conservative extensions of it; (2) a module can be a part of a larger ‘integrated ontology’, where the kind of integration determines the relation between the modules—this is the approach followed by modular ontol- ogy languages (e.g. DDLs, E-connections etc.); (3) a ‘part’ of a larger theory can be considered a module for reasons of elegance, re-use, tradition, etc.—in this case, the relation between a module and its supertheory might be a language extension, theory extension/interpretation, etc. The main contributions of the present paper are the following: (i) building on the theory of institutions, diagrams, and colimits, we show how these different notions of module can be considered simultaneously using the notion of a module diagram; (ii) we show how conservativity properties can be traced and inherited to the colimit of a diagram; (iii) we show how this applies to the composition problem in modular ontology languages such as DDLs and E-connections. In Section 2, we will introduce institutions and give several examples. Section 3 introduces the diagrammatic view of modules, and Section 4 studies the problem of conservativity in diagrams. Finally, in Section 5, we sketch heterogeneous diagrams, and apply this to modular ontology languages in Section 6. 2 Institutions The study of modularity principles can be carried out to a quite large extent inde- pendently of the details of the underlying logical system that is used. The notion of institutions was introduced by Goguen and Burstall in the late 1970s exactly for this purpose (see Goguen and Burstall (1992)). They capture in a very abstract and flexible way the notion of a logical system by leaving open the details of signatures, models, sentences (axioms) and satisfaction (of sentences in models). The importance of the notion of institutions lies in the fact that a surpris- ingly large body of logical notions and results can be developed in a way that is completely independent of the specific nature of the underlying institution.1 We assume some acquaintance with the basic notions of category theory, and refer to Adámek et al. (1990) for an introduction. The reader with no background in category theory can envisage a category as a “graph with composition of arrows”, a functor as a “graph homomorphism”. If C is a category, Cop is the dual category, where all arrows are reversed. Definition 1. An institution I = (Sign, Sen, Mod, |=) consists of – a category Sign of signatures, – a functor Sen : Sign −→ Set2 giving, for each signature Σ, the set of sentences Sen(Σ), and for each signature morphism σ : Σ −→ Σ 0 , the sentence translation map Sen(σ) : Sen(Σ) −→ Sen(Σ 0 ), where often Sen(σ)(ϕ) is written as σ(ϕ), – a functor Mod : Signop −→ CAT 3 giving, for each signature Σ, the category of models Mod(Σ), and for each signature morphism σ : Σ −→ Σ 0 , the reduct functor Mod(σ) : Mod(Σ 0 ) −→ Mod(Σ), where often Mod(σ)(M 0 ) is written as M 0 |σ , – a satisfaction relation |=Σ ⊆ |Mod(Σ)| × Sen(Σ) for each Σ ∈ |Sign|, such that for each σ : Σ −→ Σ 0 in Sign the following satisfaction condition holds: (?) M 0 |=Σ 0 σ(ϕ) iff M 0 |σ |=Σ ϕ for each M 0 ∈ |Mod(Σ 0 )| and ϕ ∈ Sen(Σ), expressing that truth is invariant under change of notation and enlargement of context. t u The only condition governing the behaviour of institutions is thus the satisfaction condition (?).4 A theory in an institution is a pair T = (Σ, Γ ) consisting of a signature Sig(T ) = Σ and a set of Σ-sentences Ax(T ) = Γ , the axioms of the theory. The models of a theory T are those Sig(T )-models that satisfy all axioms in Ax(T ). Logical consequence is defined as usual: T |= ϕ if all T -models satisfy ϕ. Theory morphisms, also called interpretations of theories, are signature morphisms 1 For an extensive treatment of the model theory in this setting, see Diaconescu (2007). 2 Set is the category having all sets as objects and functions as arrows. 3 CAT is the category of categories and functors. Strictly speaking, CAT is not a category but only a so-called quasicategory, which is a category that lives in a higher set-theoretic universe. 4 Note, however, that non-monotonic formalisms can only indirectly be covered this way, but compare, e.g., Guerra (2001). that map axioms to logical consequences. An institution is compact if T |= ϕ implies T 0 |= ϕ for a finite subtheory T 0 of T . Examples of institutions include: = Example 2. First-order Logic. In the institution FOLms of many-sorted first- order logic with equality, signatures are many-sorted first-order signatures, consist- ing of sorts and typed function and predicate symbols. Signature morphisms map symbols such that typing is preserved. Models are many-sorted first-order struc- tures. Sentences are first-order formulas. Sentence translation means replacement of the translated symbols. Model reduct means reassembling the model’s compo- nents according to the signature morphism. Satisfaction is the usual satisfaction of a first-order sentence in a first-order structure. t u Example 3. Description Logics. Signatures of the description logic ALC consist of a set of B of atomic concepts and a set R of roles, while signature morphisms provide respective mappings. Models are single-sorted first-order structures that interpret concepts as unary and roles as binary predicates. Sentences are subsump- tion relations C1 v C2 between concepts, where concepts follow the grammar C ::= B | > | ⊥ | C1 t C2 | C1 u C2 | ¬C | ∀R.C | ∃R.C Sentence translation and reduct is defined similarly as in FOL= . Satisfaction is the standard satisfaction of description logics. ALCms is the many-sorted variant of ALC. ALCO is obtained from ALC by extending signatures with nominals. The (sub-Boolean) description logic EL restricts ALC as follows: C ::= B | > | C1 u C2 | ∃R.C. SHOIN extends ALC with role hierarchies, transitive and inverse roles, (unqualified) number restrictions, and nominals, etc. t u Example 4. (Quantified) Modal Logics. The modal logic S4u has signatures as classical propositional logic, consisting of propositional variables. Sentences are built as in propositional logic, but add two unary modal operators,  and . Mod- els are standard Kripke structures but based on reflexive and transitive relations. Satisfaction is standard modal satisfaction, where  is interpreted by the transitive reflexive relation, and  by universal quantification over worlds. The standard formulation of first-order modal logic QS5 (due to Kripke) has signatures similar to FOL= , including variables and predicate symbols. Sentences follow the grammar for FOL= -sentences using Booleans, quantifiers, and identity, while adding the  operator, but leaving out constants and function symbols. Models are constant-domain first-order Kripke structures, with the usual first-order modal satisfaction. t u 3 Modules as Diagrams Several approaches to modularity in ontologies have been discussed in recent years, including the introduction of various so-called ‘modular ontology languages’. The module system of the Web Ontology Language OWL itself is as simple as inade- quate (Cuenca-Grau et al., 2006b): it allows for importing other ontologies, includ- ing cyclic imports. The language Casl (see Bidoit and Mosses, 2004; CoFI (The Common Framework Initiative), 2004) has been used in Lüttich et al. (2006) for ontologies. Beyond imports, it allows for renaming, hiding and parameterisation. Other languages that have been introduced include DDLs (Borgida and Serafini, 2002, 2003), E-connections (Kutz et al., 2004), and P-DLs (Bao et al., 2006b,a), where, roughly speaking, more involved integration and modularisation mechanisms than plain imports are envisaged. We will use the formalism of colimits of diagrams as a common semantic back- bone for these languages.5 The intuition behind this formalism is explained as follows: “Given a species of structure, say widgets, then the result of intercon- necting a system of widgets to form a super-widget corresponds to taking the colimit of the diagram of widgets in which the morphisms show how they are interconnected.” Goguen (1991) The notion of diagram is formalised in category theory. Diagrams map an index category (via a functor) to a given category of interest. They can be thought of as graphs in the category. A cocone over a diagram is a kind of “tent”: it consists of a tip, together with a morphism from each object involved in the diagram into the tip, such that the triangles arising from the morphisms in the diagram commute. A colimit is a universal, or minimal cocone. We refer to (Adámek et al., 1990) for formal details. In the sequel, we will assume that the signature category has all finite colimits, which is a rather mild assumption; in particular, it is true for all the examples above. Moreover, we will rely on the fact that colimits of theories exist in this case as well; the colimit theory is defined as the union of all component theories in the diagram, translated along the signature morphisms of the colimiting cocone. Definition 5. A module diagram of ontologies is a diagram of theories such that the nodes are subdivided into ontology nodes and interface nodes. Composition of module diagrams is simply their union. Example 6. Consider the union of the diagrams T1  T2 T2  T3 - - Σ1 Σ2 where the Σi are interfaces and the Ti are ontologies. Think of e.g. T12 as being an ontology that imports T1 and T2 , where Σ1 contains all the symbols shared between T1 and T2 . Then T12 (and T23 ) can be obtained as pushouts, and so can the overall union T123 (which typically is then further extended with new concepts etc.). A “c” means “conservative”; this will be explained in Sect. 4. 5 However, note that hiding is not covered by this approach. T123  c - c T12  T23  c - c c - c T1  T2  T3 - - Σ1 Σ2 It is clear that theories with an import structure are just tree-shaped diagrams, while both shared parts and cyclic imports lead to arbitrary graph-shaped dia- grams. The translation of Casl (without hiding) to so-called development graphs detailed in (CoFI (The Common Framework Initiative), 2004) naturally leads to diagrams as well. Finally, the diagrams corresponding to modular languages like DDLs and E-connections will be studied in Sect. 6. Thus, diagrams can be seen as a uniform mathematical formalism where properties of all of these module concepts can be studied. An important such property is conservativity. 4 Conservative Diagrams Conservative diagrams are important because they imply that the combined ontol- ogy does not add new facts to the individual ontologies. Indeed, the notion of an ontology module of an ontology T has been defined as any “subontology T 0 such that T is a conservative extension of T 0 ” (Ghilardi et al., 2006)—this perfectly matches our notion of conservative diagram below. Definition 7. A theory morphism σ : T1 −→ T2 is proof-theoretically conser- vative, if T2 does not entail anything new w.r.t. T1 , formally, T2 |= σ(ϕ) implies T1 |= ϕ. Moreover, σ : T1 −→ T2 is model-theoretically conservative, if any T1 -model M1 has a σ-expansion to T2 , i.e. a T2 -model M2 with M2 |σ = M1 . It is easy to show that conservative theory morphisms compose. Moreover, Proposition 8. Model-theoretic implies proof-theoretic conservativity. However, the converse is not true in general: Example 9. (Lutz and Wolter, 2007) Consider the following two EL TBoxes: Γ1 = {Human v ∃eats.>, Plant v ∃grows in.Area, Vegetarian v Healthy} Γ2 = {Human v ∃eats.Food, Food u Plant v Vegetarian} It is easily seen that Γ1 ∪ Γ2 is a proof-theoretic conservative extension of Γ1 w.r.t. EL. However, (Lutz and Wolter, 2007) also show this is not the case w.r.t. ALC, as witnessed by A := Human u ∀eats.Plant v ∃eats.Vegetarian, since Γ1 ∪ Γ2 |= A, but Γ1 6|= A. In particular, it follows that Γ1 ∪ Γ2 is not a model-theoretic conservative extension of Γ1 . Definition 10. A (proof-theoretic, model-theoretic) conservative module diagram of ontologies is a diagram of theories such that the theory morphism of any ontology node into the colimit of the diagram is (proof-theoretically resp. model-theoretically) conservative. By conservativity, the definition immediately yields: Proposition 11. The colimit ontology of a proof-theoretic (model-theoretic) mod- ule diagram is consistent (satisfiable)6 if any of the component ontologies is. Thus, in particular, in a conservative module diagram, an ontology node Oi can only be consistent (satisfiable) if all other ontology nodes Oj , j 6= i, are consistent (satisfiable) as well. The main question is how to ensure these conservativity properties in the united diagram. To study this, we introduce some notions from model theory, namely vari- ous notions of interpolation (for proof-theoretic conservativity) and amalgama- tion (for model-theoretic conservativity). Craig interpolation plays a crucial role in connection with proof systems in structured theories, see Borzyszkowski (2002). It comes in various forms. AIP. (Arrow Interpolation) If |= ϕ → ψ, then there exists some χ that only uses symbols occurring in both ϕ and ψ, with |= ϕ → χ and |= χ → ψ. However, this relies on a connective → being present in the institution. For the general study of module systems, the following formulation is more suitable: TIP. (Turnstile Interpolation) If ϕ |= ψ, then there exists some χ that only uses symbols occurring in both ϕ and ψ, with ϕ |= χ and χ |= ψ. TIP follows from AIP in the presence of a deduction theorem, and in this case, a further generalisation follows (see Areces and Marx, 1998): SIP. (Splitting Interpolation) If ϕ0 , ϕ1 |= ψ, then there exists some χ that only uses symbols occurring in both ψ and the union of the symbols of ϕ0 and ϕ1 , with ϕ0 |= χ and ϕ1 , χ |= ψ. We now further generalise SIP in two ways. The first generalisation concerns the rather implicit use of signatures in the definitions above. Making this explicit would mean to assume that ϕ lives in a signature Σ1 , ψ lives in a signature Σ2 , the entailment ϕ |= ψ lives in Σ1 ∪ Σ2 , and the interpolant in Σ1 ∩ Σ2 . Since we do not want to go into the technicalities for equipping an institution with unions and intersections (see Diaconescu et al. (1993) for details), we replace Σ1 ∩ Σ2 with a signature Σ, and Σ1 ∪ Σ2 with Σ 0 such that Σ 0 is obtained as a pushout from the other signatures via suitable signature morphisms (cf. the diagram below). Secondly, we move from single sentences to sets of sentences. This is useful since we want to support DLs and TBox reasoning, and DLs like EL do not allow to rewrite ‘conjunctions of subsumptions’, i.e., we cannot collapse a TBox into a single sentence. (In case of compact logics, the use of sets is equivalent to the use of finite sets.) We arrive at the following definition (in the sequel, fix an arbitrary institution I = (Sign, Sen, Mod, |=)): 6 Contrary to the terminology used in DL, we distinguish here proof-theoretic (syntactic) consistency of a theory T (which means T 6|= ϕ for some sentence ϕ) from model- theoretic (semantic) satisfiability (which means M |= T for some model M ). Definition 12. The institution I has the Craig-Robinson interpolation prop- erty (Shoenfield (1967), Dimitrakos and Maibaum (2000)), if for any pushout Σ0  θ1 - θ2 Σ1  Σ2 σ1 σ2 - Σ any set Γ1 of Σ1 -sentences and any sets Γ2 , ∆2 of Σ2 -sentences with θ1 (Γ1 ) ∪ θ2 (∆2 ) |= θ2 (Γ2 ), there exists a set of Σ-sentences Γ (called the interpolant) such that Γ1 |= σ1 (Γ ) and ∆2 ∪ σ2 (Γ ) |= Γ2 . Craig-Robinson interpolation is, in general, strictly stronger than Craig inter- polation. However, for almost all logics typically used in knowledge representation, they are indeed equivalent. We first give a criterion that applies to institutions generally, taken from Diaconescu (2007): Proposition 13. A compact institution with implication has Craig-Robinson in- terpolation iff it has Craig interpolation. Here, an institution has implication if for any two Σ-sentences ϕ, ψ, there exists a Σ-sentence χ such that, for any Σ-model M , M |= χ iff (M |= ϕ implies M |= ψ) Since for modal logics, the deduction theorem (for the global consequence relation |=) generally fails, these logics do not have implication in the above sense, and we cannot apply Prop. 13. However, we can apply a slightly more concrete criterion for modal logics from the literature (cf. Prop. 2.1 in Areces and Marx (1998)): Proposition 14. Let L be a modal logic whose local consequence relation is com- pact and such that its class of Kripke frames is closed under point-generated sub- frames. Then Craig interpolation for L implies Craig-Robinson interpolation. Example 15 (Interpolation). The description logic ALC can be conceived as a syn- tactic variant of multi-modal K, for which (Gabbay, 1972; Gabbay and Maksi- mova, 2006) show Craig interpolation. K does not have implication, but satisfies the assumptions of Prop. 14. Hence, ALC has Craig-Robinson interpolation. The situation for DLs with nominals is less positive, in fact, the presence of nominals generally destroys (standardly formulated) Craig interpolation (compare the dis- cussion in (Kutz, 2004, Chapter 3.4), and Areces and ten Cate (2006)) but can sometimes be restored, for instance, by treating nominals as logical constants, i.e., by freely reusing them. Here is a counterexample formulated in ALCO. Let Γ := {> v ∃S.C u ∃S.¬C} and ∆ := {∀S.(D t i) v ∃S.D} where i is a nominal. Clearly, Γ |= ∆, for in every model M |= Γ , every point has at least two S-successors. But i can only be true in at most one of those successors, which entails M |= ∆. Now, (using bisimulations) it can be shown that in ALCO there is no ∆0 built from shared concept names alone (there are none) such that Γ |= ∆0 and ∆0 |= ∆. If we allow to use non-shared concept constructors (modalities), an interpolant could obviously be given in logics such as SHOIN by using (unqualified) number restrictions and by setting ∆0 = {> v (>2S)}. Note, however, that ten Cate et al. (2006) show that interpolation still fails for ALCQO (since Beth fails), but that the Beth definability property is recovered for ALCO@, or indeed for SHIFO@. Craig-Robinson for FOLms is shown in Diaconescu (2007) (when one of the signature morphisms is injective on sorts). Craig interpolation for FOL is a classic result of Craig (1957), and Craig-Robinson follows since FOL is compact and has implication. The failure of Craig interpolation for QS5 is shown in Fine (1979).7 But it holds for the quantified extension of K (Gabbay, 1972), and so does Craig-Robinson. Finally, the modal logic S4u has Craig-interpolation,8 is compact (Goranko and Passy, 1992), and has implications (for M |= ϕ =⇒ M |= ψ, set χ = ϕ → ψ). Thus, S4u has Craig-Robinson interpolation. The amalgamation property (called ‘exactness’ in Diaconescu et al. (1993)) is a major technical assumption in the study of specification semantics, see Sannella and Tarlecki (1988). Definition 16. An institution I is (weakly) semi-exact if, for any pushout with notation as depicted above, given any pair (M1 , M2 ) ∈ Mod(Σ1 ) × Mod(Σ2 ) that is compatible (in the sense that M1 and M2 reduce to the same Σ-model) can be amalgamated to a unique (or weakly amalgamated to a not necessarily unique) Σ 0 - model M (i.e., there exists a (unique) M ∈ Mod(Σ 0 ) that reduces to M1 and M2 , respectively), and similarly for model morphisms. I is (weakly) exact, if additionally the initial (= empty, usually) signature has exactly (at least) one model (and one model morphism). Note that (weak) exactness implies (weak) semi-exactness, while both are in- dependent of interpolation. Weak semi-exactness for these institutions follows with standard methods, see Diaconescu (2007). The same holds for exactness for the many-sorted variants. Exactness, however, obviously fails for the single-sorted logics as well as for QS5, because in these logics, the implicit universe resp. the implicit set of worlds leads to the phenomenon that the empty signature has many different models. The following propositions are folklore in institutional model theory, see Diaconescu (2007). 7 Craig interpolation for QS5 can be restored, however, by extending the language with propositional quantifiers (Fitting, 2002) or nominals and @-operator (Areces et al., 2003). 8 S4u can be thought of as the independent fusion of the modal logics S4 and S5, which both have interpolation, plus the containment axiom ϕ → ϕ. The interpolation property transfers to the fusion by a result of Kracht and Wolter (1991). However, since S4u is a Sahlqvist axiomatisable logic whose frame conditions are universal Horn, it also follows for S4u by a result of (Marx and Venema, 1997). institution weakly semi-exact exact Craig-Robinson EL + - ? ALCms + + + ALC + - + ALCO + - - ALCQO + - - SHOIN + - ? FOLms + + + QS5 + - - Fig. 1. (Weak semi-) exactness and Craig-Robinson Theorem 17. 1. In an institution with Craig-Robinson interpolation, proof-theoretic conservativity is preserved along pushouts. 2. In an institution that is weakly semi-exact, model-theoretic conservativity is preserved along pushouts. Call a diagram connected if the graph underlying its index category is con- nected when the identity arrows are deleted. A diagram is thin, or a preorder, if its index category is thin (i.e., there is at most one arrow between two given objects). A preorder is finitely bounded inf-complete if any two elements with a common lower bound have an infimum. Theorem 18. 1. If an institution has Craig-Robinson interpolation, composition of module diagrams via union preserves proof-theoretic conservativity if the resulting diagram is a connected finitely bounded inf-complete preorder. 2. If an institution is weakly semi-exact, composition of module diagrams via union preserves model-theoretic conservativity, if the resulting diagram is a connected finitely bounded inf-complete preorder. In case of weak exactness of the insti- tution, the connectedness assumption can be replaced with satisfiability of the involved ontologies. Proof. We can obtain colimits for arbitrary connected finitely bounded inf-complete preorders by successively taking pushouts. In each successive step, the pushout for two maximal nodes with a common lower bound is taken along the infimum, thereby decreasing the set of maximal nodes by one. Connectedness ensures that in case of more than one maximal node, at least two of them have a common lower bound. If a diagram with one maximal (=maximum) node is reached, this node provides the colimit. Since in each individual module diagram, each ontology conservatively lies in the colimit of that module diagram, by the above argument, this carries over to the colimit of the union of the module diagrams. In case of a weakly exact institution, we always can add the initial signature and the unique signature morphisms out of it, and thus get a connected diagram. Hence we can drop the connectedness assumption in this case. Instead, we need conservativity of the newly added signature morphisms, which amount to satisfia- bility of the ontologies. t u See Example 6 for a union of conservative diagrams. 5 Heterogeneous Module Diagrams As Schorlemmer and Kalfoglou (2007) argue convincingly, relating ontologies may happen across different institutions, since ontologies are written in many different formalisms, like relation schemata, description logics, first-order logic, and modal logics. Heterogeneous specification is based on some graph of logics and logic transla- tions, formalised as institutions and so-called institution comorphisms, see Goguen and Roşu (2002). The latter are again governed by the satisfaction condition, this time expressing that truth is invariant also under change of notation across different logical formalisms: M 0 |=JΦ(Σ) αΣ (ϕ) ⇔ βΣ (M 0 ) |=IΣ ϕ. Here, Φ(Σ) is the translation of signature Σ from institution I to institution J, αΣ (ϕ) is the translation of the Σ-sentence ϕ to a Φ(Σ)-sentence, and βΣ (M 0 ) is the translation (or perhaps: reduction) of the Φ(Σ)-model M 0 to a Σ-model. The so-called Grothendieck institution, see Diaconescu (2002); Mossakowski (2002), is a technical device for giving a semantics to heterogeneous theories in- volving several institution. The Grothendieck institution is basically a flattening, or disjoint union, of a logic graph. A signature in the Grothendieck institution con- sists of a pair (L, Σ) where L is a logic (institution) and Σ is a signature in the logic L. Similarly, a Grothendieck signature morphism (ρ, σ) : (L1 , Σ1 ) → (L2 , Σ2 ) consists of a logic translation ρ = (Φ, α, β) : L1 −→ L2 plus an L2 -signature mor- phism σ : Φ(Σ1 ) −→ Σ2 . Sentences, models and satisfaction in the Grothendieck institution are defined in a component wise manner. Hence, the definitions and results of the previous sections also apply to the heterogeneous case. Special care is needed in obtaining Craig-Robinson interpo- lation or weak semi-exactness in the Grothendieck institution. Diaconescu (2007) and Mossakowski (2006) contain some relevant results. In our experience with the heterogeneous tool set (see Mossakowski et al. (2007)), for the Grothendieck in- stitution, it was much easier to obtain weak semi-exactness than Craig-Robinson interpolation. 6 Modular Languages In this section, we show how the integration of ontologies via various ‘modular lan- guages’ can be re-formulated in module diagrams. We concentrate on DDLs and E-connections, which we reformulate as many-sorted theories. Finally, we analyse the problem of conservativity when composing DDLs or E-connections. In the fol- lowing, we will assume basic acquaintance with the syntax and semantics of both, DDLs and E-connections. Details have to remain sketchy for lack of space. 6.1 DDLs and E-connections as heterogeneous module diagrams From the discussion of Section 5, it should be clear that DDLs or E-connections can essentially be considered as many-sorted heterogeneous theories: component ontologies can be formulated in different logics, but have to be built from many- sorted vocabulary, and link relations are interpreted as relations connecting the sorts of the component logics (compare Baader and Ghilardi (2007) who note that this is an instance of a more general co-comma construction). To be more precise, assume a DDL D = (S1 , S2 ) is given. Knowledge bases for D can contain bridge rule of the form: v w Ci −→ Cj (into rule) Ci −→ Cj (onto rule) where Ci and Cj are concepts from Si and Sj (i 6= j), respectively.9 An interpretation I for a DDL knowledge base is a pair ({Ii }i≤n , R), where each Ii is a model for the corresponding Si , and R is a function associating with every pair (i, j), i 6= j, a binary relation rij ⊆ Wi × Wj between the domains Wi and Wj of Ii and Ij , respectively. CE (T1ms , T ms 2 ) DDL(T1ms , T2ms ) - T1ms ] T2 ms - c c T1ms  T2ms  - - c c T1 ∅ T2 Fig. 2. E-connections and DDLs many-sorted In the many-sorted re-formulation of DDLs, the relation rij is now interpreted as a relation between the >-sort of S1 and the >-sort of S2 . Bridge rules are expressed as existential restrictions of the form (]) ∃rij .Ci v Cj and ∃rij .Ci w Cj The fact that bridge rules are atomic statements in a DDL knowledge base now translates to a restriction on the grammar governing the usage of the link relation rij in the multi-sorted formalism (see Borgida (2007) for a discussion of related issues). In fact, the main difference between DDLs and various E-connections now lies in the expressivity of this ‘link language’ L connecting the different sorts of the ontologies. In basic DDL as defined above, the only expressions allowed are those given in (]), so the link language of DDL is a certain, very weak sub-Boolean fragment of many sorted ALC, namely the one given through (]). In E-connections, expressions of the form ∃rij .Ci are again concepts of Sj , to which Booleans (or other operators) of Sj as well as restrictions using relations rji can be applied. Thus, the basic link language of E-connections is sorted ALCIms (relative to the now richer languages of Si ).10 9 We consider here only DDL in its most basic form, comprising into and onto rules, but no individual correspondences. DDL in this form can be seen as a sub-formalism of one-way E-connections, see (Cuenca-Grau and Kutz, 2007). 10 But can be weakened to ALCms or the link language of DDLs, or strengthened to more expressive many-sorted DLs such as ALCQIms . Such many-sorted theories can easily be represented in a diagram as shown in Figure 2. Here, we first (conservatively) obtain a disjoint union T1ms ] T2ms as a pushout, where the component ontologies have been turned into sorted variants (us- ing an institution comorphism from the single-sorted to the many-sorted logic), and the empty interface guarantees that no symbols are shared at this point. An E-con- nection KB in language CE (T1ms , T2ms ) or a DDL KB in language DDL(T1ms , T2ms ) is then obtained as a (typically not conservative) theory extension. 6.2 The problem of iteration, or, how to keep modules intact When connecting ontologies via bridges, or interfaces, this typically is not conser- vative everywhere, but only for some of the involved ontologies. We give a criterion for a single ontology to be conservative in the combination. While the theorem can be applied to arbitrary interface nodes, when applied to E-connections or DDLs, we assume that bridge nodes contain DDL bridge rules or E-connection assertions. Theorem 19. Assume that we work in an institution that has Craig-Robinson in- terpolation (is weakly semi-exact). Let ontologies T1 , . . . , Tn be connected via bridges Bij , i < j. If Ti is proof-theoretically (model-theoretically) conservative in Bij for i < j, then T1 is proof-theoretically (model-theoretically) conservative in the result- ing colimit ontology. Proof. By induction over n. The base n = 1 is clear. Suppose now that the result holds for n, such that T1 lies conservatively in the colimit ontology T , and we add Tn+1 with corresponding bridges B1,n+1 , . . . , Bn,n+1 . T0 c - 6 c - ... c - • c - • ... T  - B1,n+1  ... - Bn,n+1 c - c c - T1 ... Tn Tn+1 The resulting new colimit theory T 0 is constructed by successively constructing pushouts, whence we can use Theorem 17 to lift the conservativities of the mor- phisms Ti → Ti,n+1 to conservativities of the arrows in the chain from T to T 0 . Since conservative theory morphisms compose, T1 is conservative in T 0 . t u As concerns the applicability of the theorem, we have given an overview of log- ics being weakly semi-exact or having Craig-Robinson interpolation in Fig. 4. Of course, the conservativity assumptions have to be shown additionally. We here give an example of the failure of the claim of the theorem in case we work in a logic that lacks Craig-Robinson interpolation. Example 20. Assume ontologies T1 , T2 , T3 are formulated in the DL ALCO, recall the example of failure of interpolation for ALCO given in Example 15, and consider the following diagram: B23 T2 - T3 c  - c c B12 B13 T1 where Sig(T1 ) ⊆ {S, B, D, i}, Sig(T2 ) ⊆ {C1 , C2 }, and Sig(T3 ) ⊆ {B1 , B2 }. More- over, T1 ⊇ {∃S.D}, and B12 = {> v ∃S.∃R1 .C1 , > v ∃S.∃R1 .¬C2 }, B13 = {B1 ≡ ∃R3−1 .B, B2 ≡ ∃R3−1 .B}, and B23 = {C1 ≡ ∃R2 .B1 , C2 ≡ ∃R2 .B2 }. Here, the roles R1 , R2 , R3 can be seen as link relations, and since we apply existential restrictions ∃S to ∃R2 .C1 etc., the example can be understood as a composition of (binary) E-connections. The reader can check that Ti is conservative in Bij for j > i. However, in the colimit (union) of this diagram, ∀S.D t i v ∃S.D follows, while this does not follow in T1 , and thus T1 is not conservative in the colimit ontology. 7 Discussion and Outlook Diagrams and their colimits offer the right level of abstraction to study conser- vativity issues in different languages for modular ontologies. We have singled out conditions that allow for lifting conservativity properties from individual diagrams to their combinations. An interesting point is the question whether proof-theoretic or model-theoretic conservativity should be used. The model-theoretic notion ensures ‘modularity’ in more logics than the proof-theoretic one since the lifting theorem for the former only depends on mild amalgamation properties. By contrast, for the latter one needs Craig-Robinson interpolation which fails, e.g., for some description logics with nominals, and also for QS5—but these logics are used in practice for ontology design. Moreover, as argued in Sect. 5, when relating ontologies across different insti- tutions, the model-theoretic notion is more feasible. Finally, it has the advantage of being independent of the particular language, which implies avoidance of exam- ples like the one presented in Lutz and Wolter (2007) (Example 9 above), where a given ontology extension is proof-theoretically conservative in EL but not in ALC. Of course, model-theoretic conservativity generally is harder to decide, but it can be ensured by syntactic criteria, and the work related to this is promising (e.g. Cuenca-Grau et al., 2006a, 2007). Acknowledgements Work on this paper has been supported by the Vigoni program of the DAAD and by the DFG-funded collaborative research centre ‘Spatial Cognition’. We thank John Bateman, Joana Hois, and Lutz Schröder for fruitful discussions. Bibliography Adámek, J., Herrlich, H. and Strecker, G. (1990). Abstract and Concrete Cate- gories. Wiley, New York. Freely available at http://www.math.uni-bremen.de/∼dmb/ acc.pdf. Areces, C., Blackburn, P. and Marx, M. (2003). Repairing the Interpolation Theo- rem in Quantified Modal Logic. Annals of Pure and Applied Logics 123, 287–299. Areces, C. and Marx, M. (1998). Failure of interpolation in combined modal logics. Notre Dame Journal of Formal Logic 39, 253–273. Areces, C. and ten Cate, B. (2006). Hybrid Logics. In J. v. Benthem, P. Blackburn and F. Wolter, eds., Handbook of Modal Logic. Amsterdam: Elsevier, pages 821–868. Baader, F. and Ghilardi, S. (2007). Connecting Many-Sorted Theories. The Journal of Symbolic Logic 72, 535–583. Bao, J., Caragea, D. and Honavar, V. (2006a). Modular Ontologies - A Formal Investigation of Semantics and Expressivity. In Proc. of ASWC . Springer. Bao, J., Caragea, D. and Honavar, V. (2006b). On the Semantics of Linking and Importing in Modular Ontologies. In Proc. of ISWC . Springer. Bidoit, M. and Mosses, P. D. (2004). Casl User Manual . LNCS Vol. 2900 (IFIP Series). Springer. Freely available at http://www.cofi.info. Borgida, A. (2007). On Importing Knowledge from DL Ontologies: some Intuitions and Problems. In Proc. of DL. Borgida, A. and Serafini, L. (2002). Distributed Description Logics: Directed Domain Correspondences in Federated Information Sources. In On The Move to Meaningful Internet Systems, volume 2519 of LNCS . Springer. Borgida, A. and Serafini, L. (2003). Distributed Description Logics: Assimilating Information from Peer Sources. Journal of Data Semantics 1, 153–184. Borzyszkowski, T. (2002). Logical systems for structured specifications. Theoretical Computer Science 286, 197–245. CoFI (The Common Framework Initiative) (2004). Casl Reference Manual . LNCS Vol. 2960 (IFIP Series). Springer. Freely available at http://www.cofi.info. Craig, W. (1957). Three uses of the Herbrand-Genzen theorem in relating model theory and proof theory. Journal of Symbolic Logic 22, 269–285. Cuenca-Grau, B., Horrocks, I., Kazakov, Y. and Sattler, U. (2007). Ontology reuse: Better safe than sorry. In Proc. of DL-07 . Cuenca-Grau, B., Horrocks, I., Kutz, O. and Sattler, U. (2006a). Will my On- tologies Fit Together? In Proc. of DL-06 . Cuenca-Grau, B. and Kutz, O. (2007). Modular Ontology Languages Revisited. In Proc. of the IJCAI’07 Workshop on Semantic Web for Collaborative Knowledge Acqui- sition (SWeCKa), Hyderabad, India, January 2007 . Cuenca-Grau, B., Parsia, B. and Sirin, E. (2006b). Combining OWL Ontologies Using E-Connections. Journal of Web Semantics 4, 40–59. Diaconescu, R. (2002). Grothendieck institutions. Applied Categorical Structures 10, 383–402. Diaconescu, R. (2007). Institution-independent Model Theory. Springer Verlag. To appear. Diaconescu, R., Goguen, J. and Stefaneas, P. (1993). Logical Support for Modu- larisation. In 2nd Workshop on Logical Environments. New York: CUP, pages 83–130. Dimitrakos, T. and Maibaum, T. (2000). On a generalised modularisation theorem. Information Processing Letters 74, 65–71. Fine, K. (1979). Failures of the Interpolation Lemma in Quantified Modal Logic. The Journal of Symbolic Logic 44, 201–206. Fitting, M. (2002). Interpolation for First Order S5. The Journal of Symbolic Logic 67, 621–634. Gabbay, D. and Maksimova, L. (2006). Interpolation and definability: Modal and Intu- itionistic Logics (Oxford Logic Guides). Oxford science publications. Clarendon press, Oxford. Gabbay, D. M. (1972). Craig’s interpolation theorem for modal logics. In Confer- ence in Mathematical Logic, London ’70 , volume 255 of Lecture Notes in Mathematics. Springer. Ghilardi, S., Lutz, C. and Wolter, F. (2006). Did I Damage My Ontology? A Case for Conservative Extensions in Description Logics. In Proceedings of KR-06 . Goguen, J. and Roşu, G. (2002). Institution morphisms. Formal aspects of computing 13, 274–307. Goguen, J. A. (1991). A categorical manifesto. Mathematical Structures in Computer Science 1, 49–67. Goguen, J. A. and Burstall, R. M. (1992). Institutions: Abstract Model Theory for Specification and Programming. Journal of the ACM 39, 95–146. Goranko, V. and Passy, S. (1992). Using the Universal Modality: Gains and Questions. J. Log. Comput. 2, 5–30. Guerra, S. (2001). Composition of Default Specifications. J. Log. Comput. 11, 559–578. Kracht, M. and Wolter, F. (1991). Properties of Independently Axiomatizable Bi- modal Logics. The Journal of Symbolic Logic 56, 1469–1485. Kutz, O. (2004). E-Connections and Logics of Distance. Ph.D. thesis, The University of Liverpool. Kutz, O., Lutz, C., Wolter, F. and Zakharyaschev, M. (2004). E-Connections of Abstract Description Systems. Artificial Intelligence 156, 1–73. Lüttich, K., Masolo, C. and Borgo, S. (2006). Development of Modular Ontologies in CASL. In P. Haase, V. Honavar, O. Kutz, Y. Sure and A. Tamilin, eds., 1st Workshop on Modular Ontologies 2006 , volume 232 of CEUR Workshop Proceedings. CEUR-WS.org. Lutz, C. and Wolter, F. (2007). Conservative Extensions in the Lightweight Descrip- tion Logic EL. In Proc. of CADE-07 . Springer. Marx, M. and Venema, Y. (1997). Multi-dimensional modal logic. Dordrecht: Kluwer Academic Publishers. Mossakowski, T. (2002). Comorphism-based Grothendieck logics. In K. Diks and W. Rytter, eds., Mathematical Foundations of Computer Science, volume 2420 of LNCS . Springer, pages 593–604. Mossakowski, T. (2006). Institutional 2-cells and Grothendieck institutions. In K. Fu- tatsugi, J.-P. Jouannaud and J. Meseguer, eds., Algebra, Meaning and Computation. Essays Dedicated to Joseph A. Goguen on the Occasion of His 65th Birthday, LNCS 4060. Springer. Mossakowski, T., Maeder, C. and Lüttich, K. (2007). The Heterogeneous Tool Set. In O. Grumberg and M. Huth, eds., TACAS 2007 , volume 4424 of Lecture Notes in Computer Science. Springer-Verlag Heidelberg. Sannella, D. and Tarlecki, A. (1988). Specifications in an arbitrary institution. Information and Computation 76, 165–210. Schorlemmer, W. M. and Kalfoglou, Y. (2007). Institutionalising Ontology-Based Semantic Integration. Journal of Applied Ontology. To appear. Shoenfield, J. (1967). Mathematical Logic. Addison-Wesley, Reading, Massachusetts. ten Cate, B., Conradie, W., Marx, M. and Venema, Y. (2006). Definitorially Com- plete Description Logics. In P. Doherty, J. Mylopoulos and C. Welty, eds., Proceedings of KR 2006 . AAAI Press.