Reasoning about Consistency in Model Merging Mehrdad Sabetzadeh† Shiva Nejati† Marsha Chechik‡ Steve Easterbrook‡ † Simula Research Laboratory ‡ University of Toronto Oslo, Norway Toronto, Canada {mehrdad,shiva}@simula.no {chechik,sme}@cs.toronto.edu ABSTRACT 1. INTRODUCTION Models undergo a variety of transformations throughout de- In the past several years, we have been studying the prob- velopment. One of the key transformations is merge, used lem of model integration, particularly in situations where the when developers need to combine a set of models with re- models are originating from distributed sources of informa- spect to the overlaps between them. A major question about tion. Many activities in model-based development fall under model transformations in general, and merge in particular, is the umbrella of integration. These include (1) merging, used what consistency properties are preserved across the trans- to build a global view of a set of overlapping perspectives formations and what consistency properties may need to be (e.g., [23, 20, 18, 10]); (2) composition, used to assemble a re-checked (and if necessary, re-established) over the result. set of autonomous but interacting components that run se- In previous work [18], we developed a technique based on quentially or in parallel (e.g., [2, 5, 6]); and (3) weaving, used category-theoretic colimits for merging sets of inter-related in aspect-oriented development to incorporate cross-cutting models. The use of category theory leads to the preservation concerns into a base system (e.g., [22]). of the algebraic structure of the source models in the merge; Our position towards the integration problem has been however, this does not directly provide a characterization that the integration operators (e.g., merge, compose, weave) of the (in)consistency properties that carry over from the must tolerate inconsistency. That is, the operators must source models to the result, because consistency properties work for any given set of models, even when the models are are predominantly expressed as logical formulas. Hence, an inconsistent. This position is motivated by two well-known investigation of the connections between the “algebraic” and observations: First, immediate resolution of inconsistency “logical” properties of model merging became necessary. can be disruptive in projects where ambiguities and conflicts In this paper, we undertake such an investigation and use tend to occur frequently [11]. Second, maintaining consis- techniques from finite model theory [9] to show that the use tency at all times can be counter-productive because it may of colimits indeed leads to the preservation of certain logical lead to premature commitment to design decisions that have properties. Our results have implications beyond the merge not yet been sufficiently analyzed [12]. framework in [18] and are potentially useful for the broad In light of our position, it is important to understand how range of techniques in the graph transformation and alge- different consistency properties are affected by the integra- braic specification literature that use colimits as the basis tion operations. Specifically: for model manipulations. • If all the source models are consistent with respect to a given consistency property, will the integrated model Categories and Subject Descriptors be consistent with respect to that property as well? D.2.1 [Software Engineering]: Requirements/Specifications • If there is an inconsistency in the source models, will the inconsistency necessarily carry over to the inte- General Terms grated model? Design, Verification Answering the first question is interesting to enable com- positional reasoning about consistency. Answering the sec- Keywords ond question is useful for understanding the nature of an Inconsistency, Merge, Logical Preservation inconsistency. In particular, inconsistencies that are due to incomplete information in the individual source models can be automatically resolved in the integrated models when the source models are complementary and address each other’s areas of incompleteness. For example, an abstract class with Permission to make digital or hard copies of all or part of this work for no descendants in a UML class diagram might be seen as in- personal or classroom use is granted without fee provided that copies are consistent. But this class might be inherited from in other not made or distributed for profit or commercial advantage and that copies models and hence the overall view might still be consistent. bear this notice and the full citation on the first page. To copy otherwise, to In contrast, cyclic inheritance in a UML class diagram can- republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. not be resolved in the integrated model (unless the inte- Copyright 200X ACM X-XXXXX-XX-X/XX/XX ...$5.00. grated model omits information from the source models). 32 Since consistency rules are often described in logical lan- A guages (e.g., first order logic), we are interested in study- ing how different integration operators preserve the logical An equivalence class properties of models. In general, property preservation is R3 D a powerful tool for reasoning about model transformations. R4 The main question that property preservation tackles is the M4 following: If a property (formula) ϕ in some logic holds over A D B a model M , will ϕ also hold over a model M 0 derived from M via some transformation? In this paper, we discuss the logical property preservation B B C characteristics of our merge operator in [18]. The merge op- erator is based on category theory which has been widely used as a theoretical basis for characterizing model merg- E E E ing. In a categorical setting, merge is typically performed R1 M R2 M by computing a colimit – an algebraic construct for combin- M1 2 3 ing a set of objects interrelated by a set of mappings. While colimits provide an effective and mathematically precise way Figure 1: Example interconnection diagram for merge, their pure algebraic characterization is not di- rectly applicable for reasoning about the logical properties Merge of model merging. Specifically, given a property ϕ expressed in a particular logic, one cannot readily determine from the A C definition of colimit whether ϕ is preserved from the source models to the merged model. We use techniques from finite model theory [9] to show B that colimits indeed preserve a certain class of logical prop- erties. The logical language we use as the basis for our work is first order logic extended with least fixpoints. Ex- D E tension with fixpoints is important, because standard first order logic cannot express properties that require the com- putation of reachability. For example, acyclic inheritance Figure 2: Merge of the diagram in Figure 1 for UML class diagrams is not expressible in standard first order logic. Our results have implications beyond our merge are simple UML class diagrams with their overlapping parts framework in [18] and are potentially useful for the broad specified through four mappings, R1 , . . . , R4 (depicted us- range of techniques in the graph transformation [16] and al- ing directed dashed lines). A simpler example with just gebraic specification [1] communities that use colimits as the two models, M1 , M2 , and one mapping, R, is shown in Fig- basis for model manipulations. ure 3(a). A third example is given in Figure 4(a), where the The remainder of this paper is structured as follows: In shared parts of two models, M2 , M3 , are captured by a third Section 2, we provide background information on our merge model M1 and two mappings R1 , R2 . algorithm and present the logical preliminaries for our work. The input to the merge algorithm is an interconnection di- In Section 3, we describe our general logical preservation re- agram D = hM1 , . . . , Mi , R1 , . . . , Rj i. The algorithm works sults for colimits; and in Section 4, we use these general by unifying elements in M1 , . . . , Mi that fall into the same results to reason about the preservation of some logical ex- equivalence class induced by R1 , . . . , Rj . As an example, pressions that are frequently used in consistency rules. We we have delineated by thin dashed lines one of the several conclude in Section 5 with a summary and directions for equivalence classes in Figure 1. Note that each unmapped future work. element in the input models falls into a distinct equivalence class of its own. 2. BACKGROUND For convenience, in the example shown in Figure 1, we used a consistent vocabulary for naming the elements of 2.1 Structural Model Merging M1 , . . . , M4 , hence defining R1 , . . . , R4 based on name equal- We first briefly review our merge operator. For more ities. In general, models may not have a common vocabu- information, see [18]. The operator hinges on three ab- lary, and mappings are not necessarily based on vocabulary stractions: models, mappings, and interconnection diagrams. similarities (e.g., see the examples in Figures 3(a) and 4(a)). Each model is described as a graph, and each mapping – as The merged model has exactly one element corresponding a binary relation over two models equating their correspond- to each equivalence class. Since mappings denote equality ing elements. Mappings preserve type information, i.e., they of mapped element pairs and hence are symmetric, the di- do not equate elements that have different types. Further, rectionality of mappings is ignored in the computation of they preserve structure, i.e., if a mapping R maps an edge equivalence classes. e to an edge e0 , it must also map the source and target of e Figure 2 shows the resulting merge for the interconnec- to the source and target of e0 , respectively. tion diagram of Figure 1. The merge provides interesting The third abstraction, the interconnection diagram, cap- insights about how consistency properties can be broken tures a set of models and a set of known or hypothesized across merge. For example, we may have consistency rules mappings between them. An example interconnection di- that check for multiple or cyclic inheritance in UML class agram is shown in Figure 1. In this example, M1 , . . . , M4 diagrams. Obviously, these rules are satisfied over the in- 33 R, we say that an occurrence of R is negative if it is under the scope of an odd number of negations, and positive, oth- dividual source models in Figure 1, but the global view of erwise. We say that a formula is positive in R if there are no the system (i.e., the merge) is inconsistent. In particular, negative occurrences of R in it, i.e., either all occurrences of in Figure 2: B has two parents; and B, C, E form a cycle. R are positive, or there are none at all. In Section 4, we provide a systematic explanation of what properties of the source models carry over to the merge and Lemma 2.2 [3] If ϕ(R, ~ x) is positive in R, then Fϕ is mono- what properties do not. tone. 2.2 Logical Background Definition 2.3 (least fixpoint logic) [9] The least fixpoint First Order (FO) logic is one of the most commonly used logic (LFP) extends FO with the following formula building logical languages for expressing consistency rules [11] and rule: is used as the basis for our work here. FO by itself is not expressive enough to describe properties that involve reach- • if ϕ(R, ~x) is a formula positive in R, where R is k-ary, ability or cycles. To address this limitation, one can add and ~t is a tuple of terms, where |~x| = |~t| = k, then to FO a least fixpoint operator, obtaining the least fixpoint x)](~t) [lfpR,~x ϕ(R, ~ logic (LFP) [9]. Below, we first formally define the notion of relational structure and FO. We then define the concept is a formula, whose free variables are those of ~t. of least fixpoint and show how FO can be extended into LFP. Our exposition follows the standard approach in finite The semantics is defined as follows: model theory (e.g., see [9]). A |= [lfpR,~x ϕ(R, ~ x)](~a) iff ~a ∈ lfp(Fϕ ). Definition 2.1 (relational structure) A (relational) struc- Example 2.4 (reachability) Consider graphs whose edge ture is an object A = (A, R1 , . . . , Rm ), where A is a nonempty relation is E, and let set, m is a natural number, R1 , . . . , Rm are abstract relation symbols with associated arities k1 , . . . , km (nonnegative in- ϕ(R, x, y) = E(x, y) ∨ ∃z (E(x, z) ∧ R(y, z)) . tegers), and each Ri is a ki -ary relation on A. Reachability, i.e., the transitive closure of E, is characterized The set A is called the universe of A. The sequence of rela- by the formula tion symbols R1 , . . . , Rm together with corresponding arities ψ(x, y) = [lfpR,x,y ϕ(R, x, y)](x, y). k1 , . . . , km comprise the vocabulary of A. We usually denote a vocabulary by σ. Relation RiA is called the interpretation That is, ψ(a, b) holds over a graph G iff there is a path from of a relation symbol Ri in A. a to b in G. FO formulas in a vocabulary σ are built up from atomic formulas using negation, conjunction, disjunction, and exis- 2.3 Homomorphisms and Preservation of tential and universal quantification: Logical Properties ϕ ::= x = y | R(x1 , . . . , xn ) | ¬ϕ | ϕ1 ∧ ϕ2 | ϕ1 ∨ ϕ2 | Our merge framework embeds each source model into the merge through a homomorphism. The existence of these homomorphisms leads to the preservation of certain consis- ∃xϕ(x) | ∀xϕ(x) tency properties. Below, we review the theoretical results In the above, x, y and x1 . . . , xn are variables, R is an underlying our discussion of property preservation in Sec- n-ary relation symbol in σ, and ϕ1 and ϕ2 are formulas. tion 3. We begin with a definition of homomorphism: Given a set U , let P(U ) denote its powerset. A set X ⊆ U is said to be a fixpoint of a mapping F : P(U ) → P(U ) if Definition 2.5 (homomorphism) Let A = (A, R1A , . . . , Rm A ) B B F (X) = X. A set X ⊆ U is a least fixpoint of F if it is and B = (B, R1 , . . . , Rm ) be structures in the same vocabu- a fixpoint, and for every other fixpoint Y of F , we have lary. A homomorphism from A to B is a function h : A → B X ⊆ Y . The least fixpoint of F is denoted by lfp(F ). Least such that h(RiA ) ⊆ RiB , i.e., if (a1 , . . . , aki ) ∈ RiA then fixpoints are guaranteed to exist only if F is monotone. That (h(a1 ), . . . , h(aki )) ∈ RiB for every 1 ≤ i ≤ m. is, The first result about property preservation under homo- X ⊆ Y implies F (X) ⊆ F (Y ). morphisms, dating back to the 1950’s, is the Los-Tarski- Lyndon Theorem: We now add a least fixpoint operator to FO. Suppose we have a vocabulary σ, and an additional relation symbol Theorem 2.6 (homomorphism preservation theorem) R 6∈ σ of arity k. Let ϕ(R, x1 , . . . , xk ) be a formula with (e.g., see [14, 15]) A first order formula is preserved under vocabulary σ ∪ {R}. For a structure A with vocabulary σ, homomorphisms on all structures (finite and infinite) if and the formula ϕ(R, ~x) yields a mapping Fϕ : P(Ak ) → P(Ak ) only if it is equivalent to an existential positive formula, i.e., defined as follows: a formula without negation and universal quantification. Fϕ (X) = {~a | A |= ϕ(X/R, ~a)} The existential positive fragment of FO is denoted ∃FO+ . The notation ϕ(X/R, ~a) means that X is substituted for R For our purposes, we are interested in finite structures only, in ϕ. More precisely, if A0 is a (σ ∪{R})-structure expanding and like many classical mathematical logic results that fail A, in which R is interpreted as X, then A0 |= ϕ(~a). in the finite case (e.g., compactness), there is the danger To ensure that Fϕ is monotone, we impose certain restric- that the above result may fail as well when restricted to tions. Given a formula ϕ that may contain a relation symbol finite structures. Fortunately, this is not the case. 34 Theorem 2.7 (h. p. t. in the finite) [15] A first order because there is no edge from node a to node d and vice formula is preserved under homomorphisms on finite struc- versa. The general observation here is that, when there is tures if and only if it is equivalent to an ∃FO+ formula. more than one universal quantifier, universally quantified variables can be assigned values from non-shared parts of The forward direction of the if-and-only-if (i.e., sufficiency) different source models. In such a case, property satisfac- in the above result can be extended to the existential posi- tion over the individual source models may not extend to tive fragment of LFP, denoted ∃LFP+ . the merge. Lemma 2.8 Every ∃LFP+ formula is preserved under ho- Currently, we do not know whether F3 leads to further momorphisms on finite structures. property preservation results. This is an issue that we plan to investigate in future work. A proof of the above lemma is provided in the appendix. The lemma is the basis for the results we present in Sec- 4. PRESERVATION OF (IN)CONSISTENCY tion 3. In previous work [19], we identified three general types of expressions commonly used in structural consistency prop- 3. GENERAL PRESERVATION RESULTS erties. These are: Our merge framework offers three key features [18]: • Compatibility expressions, used for ensuring compati- F1 Merge yields a family of mappings, in our case graph bility of the type of an edge with the types of its end- homomorphisms, one from each source model onto the points. merged model. This feature ensures that the merge does not loose information, i.e., it represents all the • Multiplicity expressions, used for defining a minimum source models completely. and a maximum number for edges of a given type in- cident to a node. F2 The merged model does not contain any unmapped elements, i.e., every element in the merged model is • Reachability expressions, used for checking existence of the image of some element in the source models. This paths of edges of a given type between two nodes. feature ensures minimality, i.e., the merge does not introduce information that is not present in or implied Below, we use results of Section 3 to reason about the by the source models. preservation of these expressions. F3 Merge respects the mappings in the source system, i.e., Compatibility Properties. Preservation of compatibility prop- the image of each element in the merged model remains erties can be established directly through algebraic means, the same, no matter which path through the mappings but it is interesting to see if the same can be done through in the source system one follows. This feature ensures logical means. For example, in a class diagram, an edge of non-reduandancy. More precisely, if a concept appears type “implements” must relate a class to an interface; other- in more than one source model, only one copy of it wise, the diagram would not be well-formed. This property appears in the merged model. can be formalized as follows: From F1 and Lemma 2.8 (in Section 2.3), it follows that C1 = ∀e (Edge(e) ∧ Type(e, “implements”) ⇒ the result of our merge procedure preserves the existential Compatibleclass,interface (e)) positive fragment of LFP. where Edge is the set of edges of the graph representing Theorem 3.1 If an existential positive LFP formula ϕ is a class diagram, Type is a binary relation between the set satisfied by some source model M , any merge in which M of edges and different types of relations between classes, participates satisfies ϕ as well. and Compatibleclass,interface (e) is a constraint that verifies By F2 and the above theorem, we obtain the following whether the source and target nodes of edge e are of type result regarding preservation of universal properties. class and interface, respectively. The general form of this constraint can be formalized using an existential positive Theorem 3.2 Let ϕ(x) be an existential positive LFP for- formula as follows: mula with a free variable x. If the formula ψ = ∀x ϕ(x) is Compatibleα,β (e) = ∃n (∃m (Source(e, n) ∧ Target(e, m) ⇒ satisfied by all the source models, ψ is satisfied by any merge Type(n, α) ∧ Type(m, β))) of the models as well. Notice that Theorem 3.2 allows the introduction of only where Source and Target are binary relations, respectively one universal quantifier. To gain intuition on what happens giving the source and target node for a given edge. when additional universal quantifiers are introduced, con- The sub-formula Type(e, “implements”) of C1 appears in sider the system in Figure 3(a) and let the relation E(x, y) negated form, but the negation can be resolved, knowing denote the graph edge relation. Both models in Figure 3(a) that (1) the set of types is fixed and, (2) every element has are complete graphs and hence satisfy the property {t1 , . . . , tn }, the a type. More precisely, if the set of types is W ∀x ∀y Node(x) ∧ Node(y) ⇒ E(x, y) 1 . However, the prop- formula ¬Type(e, t` ) can be replaced with i6=` Type(e, ti ). erty is violated over the resulting merge shown in Figure 3(b), Hence, by Theorem 3.2, C1 is preserved. 1 The property uses implication and hence has negation. But the negation can be resolved, because every element in the Multiplicity Properties. One can show through simple counter- universe that is not a node is an edge. Therefore, the prop- examples that none of the following lift from the source erty is equivalent to ∀x ∀y Edge(x) ∨ Edge(y) ∨ E(x, y). models to the merge: 35 a b,c a b c d R d M1 M2 Merge (a) (b) Figure 3: Illustration for violation of universal properties u v w x y z M2 M3 R1 R2 a,b,u,x,y c,d,v,w,z a b c d Merge M1 (a) (b) Figure 4: Illustration for violation of multiplicity properties • There exists at least c elements satisfying ϕ. made up of edges of type “extends” exists. Using the argu- ment we gave when discussing preservation of compatibility • There exists exactly c elements satisfying ϕ. properties, we know that the negation of Type(c1, ”class”) • There exists at most c elements satisfying ϕ. can be resolved. Further, ¬Abstract(c1) can be replaced with a positive property, say, Concrete(c1). It now follows from It is easy to see why the “exactly” and “at most” cases Theorem 3.2 that C2 is preserved. do not get preserved, noting that merge normally has more Note that our results can be used for reasoning about information than any of the source models. To understand preservation of inconsistency as well. For example, consider why the “at least” case is not preserved, note that homo- the following rule: morphisms (and functions as well) are not necessarily one- to-one, and can therefore shrink the number of elements C3 = ∃c ((Type(c, “class”) ∧ Reachableextends (c, c))) satisfying a property. For example, consider the system of models in Figure 4(a) and its merge in Figure 4(b). For sim- This rule holds over a class diagram M when the inher- plicity, the models are discrete graphs, i.e., sets, and their itance hierarchy in M is cyclic, i.e., M is inconsistent. By mappings are functions. Although M1 , M2 , M3 all satisfy Theorem 3.1, we can conclude that any merged model that the property “there exists at least three (distinct) nodes”, has M as a source model satisfies C3 as well, and hence is the merge has only two nodes, hence violating the property. also inconsistent. It is important to mention that the flexibility to fuse to- gether multiple elements of the same source model is not an 5. CONCLUSION undesirable feature and is indeed valuable when one needs In this paper, we showed that the use of algebraic colimits to perform an abstraction during merge [7]. for model merging leads to the preservation of certain logical properties. We used our results to formally reason about the Reachability Properties. An interesting consequence of The- preservation of consistency properties across merge. orem 3.1 is the preservation of paths in the merge. Recall Based on our recent survey of existing model merging that we gave a formalization for the reachability property techniques [17], algebraic theories, including category the- in Example 2.4. To see how this can be used for reasoning ory, lattice theory, and formal concept analysis, are increas- about consistency, consider the following consistency rule ingly being used for characterizing model integration prob- over class diagrams: “Every abstract class must have a con- lems. What makes these theories particularly attractive is crete implementation”. This rule is formally expressed as the level of abstraction to which they lead, allowing the follows: merge process to be described in a flexible and highly generic C2 = ∀c ((Type(c, “class”) ∧ Abstract(c)) ⇒ way. At the same time, one must account for the fact that ∃c0 (Concrete(c0 ) ∧ Reachableextends (c0 , c))) merge is often an intermediate step for activities such as be- havioural synthesis [23, 21], reasoning over global behaviours where Reachableextends (x, y) holds iff a path from x to y of systems [4], and data integration and exchange [8, 13]. 36 To facilitate these activities, it is crucial to be able to rea- ICSE ’03: Proceedings of the 25 International son about the preservation of semantic properties (including Conference on Software Engineering, pages 455–464, consistency properties) of the source models in merge. Do- 2003. ing so requires establishing proper connections between the [12] B. Nuseibeh, S. Easterbrook, and A. Russo. “Making algebraic techniques used in model integration and the log- Inconsistency Respectable in Software Development”. ical techniques used in the activities named above. This is The Journal of Systems and Software, 58(2):171–180, a non-trivial task but is an essential step toward making 2001. model integration more effective in practice. [13] L. Popa, Y. Velegrakis, R. Miller, M. Hernández, and For future work, we would like to provide a full logical R. Fagin. Translating web data. In Proceedings of 28th characterization of colimits. In particular, the logical im- International Conference on Very Large Data Bases, plications of F 3, described in Section 3, is unknown to us pages 598–609, 2002. at the moment and need to be revisited in the future. Fur- [14] E. Rosen. Some aspects of model theory and finite ther, it may be possible to trade off development flexibility structures. The Bulletin of Symbolic Logic, in favour of a broader class of preserved properties, e.g., by 8(3):380–403, 2002. using more constrained mappings for relating models, or by [15] B. Rossman. Existential positive types and placing restrictions on the patterns used for interconnect- preservation under homomorphisisms. In Proceedings ing the models. We leave an elaboration of these topics to of the 20th Annual IEEE Symposium on Logic in future investigation. Lastly, we would like to explore the ap- Computer Science, pages 467–476, 2005. plication of our results for checking the consistency of model [16] G. Rozenberg, editor. Handbook of graph grammars manipulations in graph transformation and algebraic speci- and computing by graph transformation: Foundations, fication approaches that are based on colimits. volume 1. World Scientific, River Edge, NJ, USA, 1997. 6. REFERENCES [17] M. Sabetzadeh. Merging and Consistency Checking of [1] E. Astesiano, H. Kreowski, and B. Krieg-Brueckner, Distributed Models. PhD thesis, University of Toronto, editors. Algebraic Foundations of Systems 2008. Specification. Springer-Verlag, Secaucus, NJ, USA, [18] M. Sabetzadeh and S. Easterbrook. View merging in 1999. the presence of incompleteness and inconsistency. [2] E. Clarke, O. Grumberg, and D. Peled. Model Requirements Engineering Journal, 11(3):174–193, Checking. MIT Press, 1999. 2006. [3] B. Davey and H. Priestly. Introduction to Lattices and [19] M. Sabetzadeh, S. Nejati, S. Liaskos, S. Easterbrook, Order. Cambridge University Press, 1990. and M. Chechik. “Consistency Checking of [4] S. Easterbrook and M. Chechik. A framework for Conceptual Models via Model Merging”. In RE ’07: multi-valued reasoning over inconsistent viewpoints. Proceedings of 15th IEEE International Requirements In Proceedings of the 23rd International Conference on Engineering Conference, pages 221–230, 2007. Software Engineering, pages 411–420, 2001. [20] S. Uchitel and M. Chechik. “Merging Partial [5] J. Hay and J. Atlee. “Composing Features and Behavioural Models”. In SIGSOFT ’04/FSE-12: Resolving Interactions”. In SIGSOFT ’00/FSE-8: Proceedings of the 12th ACM SIGSOFT International Proceedings of the 8th ACM SIGSOFT International Symposium on Foundations of Software Engineering, Symposium on Foundations of Software Engineering, pages 43–52, 2004. pages 110–119, 2000. [21] S. Uchitel and J. Kramer. A workbench for [6] M. Jackson and P. Zave. “Distributed Feature synthesising behaviour models from scenarios. In Composition: a Virtual Architecture for Proceedings of the 23rd International Conference on Telecommunications Services”. IEEE Transactions on Software Engineering, pages 188–197, 2001. Software Engineering, 24(10):831 –847, 1998. [22] J. Whittle, A. Moreira, J. Araújo, P. Jayaraman, [7] Y. Kalfoglou and M. Schorlemmer. Ontology mapping: A. Elkhodary, and R. Rabbi. “An Expressive Aspect The state of the art. In Y. Kalfoglou, M. Schorlemmer, Composition Language for UML State Diagrams”. In A. Sheth, S. Staab, and M. Uschold, editors, Semantic MoDELS ’07: Proceedings of the 10th International Interoperability and Integration, number 04391 in Conference on Model Driven Engineering Languages Dagstuhl Seminar Proceedings. IBFI, 2005. and Systems, pages 514–528, 2007. [8] M. Lenzerini. Data integration: A theoretical [23] J. Whittle and J. Schumann. “Generating Statechart perspective. In Proceedings of the 21st Symposium on Designs from Scenarios”. In ICSE ’00: Proceedings of Principles of Database Systems, pages 233–246, 2002. 22nd International Conference on Software [9] L. Libkin. Elements Of Finite Model Theory. Texts in Engineering, pages 314–323. ACM Press, May 2000. Theoretical Computer Science. An EATCS Series. Springer, 2004. [10] S. Nejati, M. Sabetzadeh, M. Chechik, S. Easterbrook, and P. Zave. “Matching and Merging of Statecharts Specifications”. In ICSE ’07: Proceedings of the 29th International Conference on Software Engineering, pages 54–64, 2007. [11] C. Nentwich, W. Emmerich, and A. Finkelstein. “Consistency Management with Repair Actions”. In 37 APPENDIX A. PROOF FOR LEMMA 2.8 Let A = (A, R1A , . . . , Rm A ) and B = (B, R1B , . . . , Rm B ) be a pair of relational structures over vocabulary σ = (R1 , . . . , Rm ). Let h : A → B be a homomorphism. We show that for every ϕ ∈ ∃LF P + and for every ~a ∈ Ak A |= ϕ(~a) ⇒ B |= ϕ(h(~a)) where h(~a) = (h(a1 ), . . . , h(ak )). The proof for ϕ ∈ ∃F O+ ∩ ∃LF P + follows from Theo- rem 2.7. Below, we provide a proof for least fixpoint formu- las. Let ϕ(~x) = [lfpR,~y α(R, ~ y )](~ x). By the definition of lfp, for every structure A, the formula ϕ yields a mapping Fα,A : P(Ak ) → P(Ak ) defined as follows: Fα,A (X) = {~a | A |= α(X/R, ~a)} By Definition 2.3 and Knaster-Tarski’s fixpoint theorem, for every ~a ∈ Ak we have: ∞ [ i ~a ∈ Fα,A (∅) ⇔ A |= [lfpR,~y α(R, ~ y )](~a) i=0 i i We first prove by induction that h(Fα,A (∅)) ⊆ Fα,B (∅). Base case: Let ~a ∈ Fα,A (∅). Then, A |= α(∅, ~a). Since A is a substructure of B by h and since h(∅) = ∅, we have B |= α(∅, h(~a)). Thus, h(~a) ∈ Fα,B (∅). i i−1 Inductive step: Let ~a ∈ Fα,A (∅). Then, A |= α(Fα,A (∅), ~a). Since A is a substructure of B by h, we have B |= i−1 i−1 α(h(Fα,A (∅)), h(~a)). Thus, h(~a) ∈ Fα,B (h(Fα,A (∅))). By the inductive hypothesis and since Fα,B is mono- i tone, h(~a) ∈ Fα,B (∅). Thus, h( ∞ S i S∞ i i=0 Fα,A (∅)) ⊆ i=0 Fα,B (∅) (1) Therefore, A |= ϕ(~a) ⇔ (By assumption ϕ(~a) = [lfpR,~y α(R, ~ y )](~a)) A |= [lfpR,~y α(R, ~ y )](~a) ⇔ (By SDefinition 2.3 and Knaster-Tarski’s Theorem) ~a ∈ ∞ i i=0 Fα,A (∅) ⇔ (Since h isShomomorphism) h(~a) ∈ h( ∞ i i=0 Fα,A (∅)) ⇒ (By (1)) h(~a) ∈ ∞ i S i=0 Fα,B (∅) ⇔ (By Definition 2.3 and Knaster-Tarski’s Theorem) B |= [lfpR,~y α(R, ~y )](h(~a)) ⇔ (By definition of lfp) B |= ϕ(h(~a)) 38