Round-trip migration of object-oriented data model instances

Luca Beurer-Kellner (1), Jens von Pilgrim (2) and Timo Kehrer (3)

(1) ETH Zürich, Zürich, Switzerland
(2) HAW Hamburg, Hamburg, Germany
(3) Humboldt-Universität zu Berlin, Berlin, Germany

Abstract

The communication of web-based services is typically organized through public APIs which rely on a common data model shared among all system components. To accommodate new or changing requirements, a common approach is to plan data model changes in a backward compatible fashion. While this relieves developers from an instant migration of the system components including the data they are operating on, it causes serious maintenance problems and architectural erosion in the long term. We argue that an alternative solution to this problem is to use a translation layer serving as a round-trip migration service which is responsible for the forth-and-back translation of object-oriented data model instances of different versions. However, the development of such a round-trip migration service is not yet properly supported by existing technologies. In this challenge, we focus on the key task of developing the required migration functions, framing this as a model transformation problem.

Keywords

Web development, API and data model evolution, translation layer, round-trip migration, model transformation

TTC'20: Transformation Tool Contest, Part of the Software Technologies: Applications and Foundations (STAF) federated conferences, Eds. A. Boronat, A. García-Domínguez, G. Hinkel, and F. Křikava, 17 July 2020, Bergen, Norway (online).
Email: luca.beurer-kellner@inf.ethz.ch (L. Beurer-Kellner); Jens.vonPilgrim@haw-hamburg.de (J. v. Pilgrim); timo.kehrer@informatik.hu-berlin.de (T. Kehrer)
ORCID: 0000-0001-7734-3106 (L. Beurer-Kellner); 0000-0002-7025-8301 (J. v. Pilgrim); 0000-0002-2582-5557 (T. Kehrer)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

1. Introduction

Context: In web development, the communication of web-based services is typically organized through public APIs which rely on a common data model shared among all system components. Over time, the shared data model must be changed to accommodate new or changing requirements, and the system components (i.e., services) including the data they are operating on must be migrated. This API evolution problem is a well-known challenge for web APIs [1, 2, 3].

Figure 1 illustrates this problem by means of a typical example of a distributed system exposing a three-tier architecture with a client, a service and a database layer. The API and its underlying data model are evolved from version 1 (red, not striped) to version 2 (green, striped), which may lead to different architectural evolution scenarios, depending on the temporal order of updating the involved components. Ideally, all components are updated simultaneously (scenario ➊). When performed in an online fashion, we need a translation layer (TL) to migrate the existing data using tools such as Liquibase (https://www.liquibase.org/). Once the migration has been performed, components relying on version 1 of the data model are replaced by their updated successor versions.

[Figure 1: An example of a distributed system. The API and its underlying data model are evolved from version 1 (red, not striped) to version 2 (green, striped). The figure shows the evolution scenarios ➊ to ➍ with clients, APIs, services and databases in versions V.1 and V.2, and translation layers TL 1 to TL 4.]

In practice, however, not all the affected components can be migrated instantly and at the same time [4]. A common workaround is to plan data model changes in a backward compatible fashion. However, this severely hampers flexibility when evolving the data model, and essentially comes at the cost of architectural erosion, increased maintenance efforts and technical debt. A more flexible solution would be to operate components relying on different data model versions at the same time and to use a translation layer serving as a round-trip migration service which is responsible for the forth-and-back translation of object-oriented data model instances of different versions. The evolution scenarios ➋, ➌ and ➍ use such a round-trip migration service to migrate and migrate back shared data model instances on demand. Architecturally, this allows for greater flexibility than the aforementioned solutions. It leaves open a wide variety of design decisions regarding the use of different data model versions as well as the location of the translation layer (client-side, server-side, in the database system, etc.).

Research Gap: Although it seems to be an attractive solution to deal with data model evolution, the development of a round-trip migration layer which is responsible for the forth-and-back translation of object-oriented data model instances of different versions is not yet properly supported by existing technologies.

Frameworks such as Google's Protocol Buffers (https://developers.google.com/protocol-buffers), Apache Thrift (https://thrift.apache.org) or Apache Avro (https://avro.apache.org) support versioning of the whole API and provide annotations in order to change an API in a backwards compatible way. On a more fine-grained level, UpgradeJ [5] extends Java to support versioned type declarations. It allows for upgrading to new versions dynamically at run-time; however, the revised class must have at least the same fields and method signatures as the original one. Dmitriev et al. [6] discuss evolution techniques for the PJama persistence framework. Programmers can write migration functions which are embedded by means of static methods. However, there is no dedicated support for implementing round-trip migrations.

Traditional research on data model evolution and instance migration has its roots in the database systems community. Here, schema evolution generally refers to the process of facilitating the modification of a database schema without loss of existing data or compromising data integrity [7]. The main aim, however, is to merely update instance data in response to schema changes, which inherently differs from round-trip migrating instances between different versions of an API.

The same limitation applies to more recent work in model-driven engineering. Here, multiple approaches have been proposed addressing the migration of instance models in response to meta-model changes, referred to as meta-model evolution and model co-evolution [8]. Their goal, however, similar to schema evolution, is to merely update instance models in response to meta-model evolution. Nonetheless, a multitude of techniques that have been proposed in the context of model evolution and model transformation may serve as a proper basis for the specification of round-trip migrations.

Challenge in a Nutshell: In this challenge, we focus on the key task of developing migration functions which are needed by a round-trip migration service. We only consider API changes affecting the shared data model, while other aspects of API evolution such as signature changes in methods or endpoints in HTTP are out of scope. Protocol changes (e.g., change of message format, authentication, rate limit) as mentioned in Wang et al. [1] are also not considered here. Finally, we focus on a single round-trip migration at a time and do not consider concurrent operations.

We frame the development of migration functions as a transformation problem that abstracts from technological details. While the shared data model is typically defined through Web API specification languages, we choose a simpler and more explicit representation using an object-oriented modeling approach. Conceptually, we consider object-oriented data models and instances as graphs, serving as a basis for the problem definition which we present more formally in Section 2. Next, in Section 3, we give a set of selected data model evolution scenarios and the corresponding round-trip migration tasks which are to be solved within this challenge. In Section 4, we present criteria for evaluating the submitted solutions. Finally, Section 5 presents a simple reference solution, serving as a baseline for more sophisticated solutions based on model transformation concepts and technologies. An evaluation framework which may be used by solution providers and which comprises a set of experimental subjects is briefly described in Appendix A. The framework as well as a reference solution for this case may be found at https://github.com/lbeurerkellner/ttc2020.

Relation to Previous TTC Cases: At the 2017 edition of the Transformation Tool Contest, the "Families to Persons Case" [9] was presented. It models a well-known bidirectional transformation problem which is closely related to the underlying problem of our case. However, coming from a more practical setting, we want to emphasize different aspects. As will become apparent from our evolution scenarios presented in Section 3, our background is mostly motivated by the features of modern web-development languages (e.g., the use of optional fields in Section 3.3) as well as the development process of web applications in general (e.g., our evaluation criterion re-usability in Section 4.4).

2. Problem Definition

In this section, we introduce our conceptual, technology-independent notion of object-oriented data models and instances, and then present properties which we would ideally expect from round-trip migrations.

2.1. Data Models and Instances

Graphs are a natural means to conceptually define object-oriented data models and instances. For the sake of being compatible with the majority of available model transformation technologies, our notion of a graph can be transferred to model representations which are based on the Essential MOF (EMOF) standard defined by the OMG (https://www.omg.org/spec/MOF). Specifically, a graph $G = (G_N, G_E, \mathit{src}_G, \mathit{tgt}_G)$ consists of two disjoint sets $G_N$ and $G_E$ containing the nodes and the edges of the graph, respectively. Every edge represents a directed connection between two nodes, which are called the source and target nodes of the edge, formally represented by source and target functions $\mathit{src}_G, \mathit{tgt}_G : G_E \to G_N$. Given two graphs $G$ and $H$, a pair of functions $(f_N, f_E)$ with $f_N : G_N \to H_N$ and $f_E : G_E \to H_E$ forms a graph morphism $f : G \to H$ if it maps the nodes and edges of $G$ to those of $H$ in a structure-preserving way, i.e.,

$$\forall e \in G_E : f_N(\mathit{src}_G(e)) = \mathit{src}_H(f_E(e)) \;\wedge\; f_N(\mathit{tgt}_G(e)) = \mathit{tgt}_H(f_E(e)).$$

An object-oriented data model is conceptually considered as a distinguished graph referred to as type graph $T$, while an instance of this data model is formally treated as an instance graph $G$ typed over $T$. Formally, a type graph $T = (T_N, T_E, \mathit{src}_T, \mathit{tgt}_T, I, A)$ is a special graph whose nodes and edges represent types, and which comprises the definition of a node type hierarchy $I \subseteq T_N \times T_N$, which must be an acyclic relation, and a set $A \subseteq T_N$ identifying abstract node types. The typing relation between instances and data models may be formalized by a special graph morphism $\mathit{type}_G : G \to T$ relating an instance graph $G$ with its associated type graph $T$ [10]. The way we handle attributes and attribute declarations follows the definition of attributed graphs given in [11]. The main idea of formalizing node attributes in an instance graph is to consider them as edges of a special kind referring to data values. Analogously, attributes declared by node types of a type graph are represented as special edges referring to data type nodes.

In order to avoid going into any technical details of model transformation approaches yet, we take an extensional view on data models. That is, when speaking about a data model $M$, $\mathcal{M}$ refers to the (infinite) set of data model instances which are properly typed over $M$.

2.2. Round-Trip Migration Functions

We differentiate between the migration and the modification of instances. Given two data models $M_1$ and $M_2$ with $M_1 \neq M_2$, a total function $f : \mathcal{M}_1 \to \mathcal{M}_2$ is considered a migration function from $M_1$ to $M_2$. Given two instances $m_1 \in \mathcal{M}_1$ and $m_2 \in \mathcal{M}_2$, we say that $m_1$ is migrated to $m_2$ if $f(m_1) = m_2$. In contrast, given a single data model $M$, a total function $c : \mathcal{M} \to \mathcal{M}$ is considered an instance modification function. Given two instances $m$ and $m'$ typed over $M$, we say that $m$ is modified to become $m'$ if $c(m) = m'$.

To allow two components which depend on different data models to communicate with each other, a translation layer is responsible for migrating instances forth and back. Formally, a translation layer is a tuple $T = (M_1, M_2, f, g)$ where $M_1$ and $M_2$ denote the data models the layer translates from and to via migration functions $f : \mathcal{M}_1 \to \mathcal{M}_2$ and $g : \mathcal{M}_2 \to \mathcal{M}_1$, respectively. Given an instance $m_1 \in \mathcal{M}_1$, we refer to the consecutive application of $f$ and $g$ to $m_1$, i.e., $g(f(m_1))$, as the round-trip migration of $m_1$ via $M_2$. Likewise, since translation layers are supposed to work symmetrically in either direction, given an instance $m_2 \in \mathcal{M}_2$, $f(g(m_2))$ denotes the round-trip migration of $m_2$ via $M_1$. The round-trip migration of an instance $m_1$ via $M_2$ (resp. $m_2$ via $M_1$) is called successful if $g(f(m_1)) = m_1$ (resp. $f(g(m_2)) = m_2$). A translation layer $T$ is considered successfully round-trip-migrating if the following conditions hold:

$$\forall\, m_1 \in \mathcal{M}_1 : g(f(m_1)) = m_1 \tag{1}$$
$$\forall\, m_2 \in \mathcal{M}_2 : f(g(m_2)) = m_2 \tag{2}$$

In practice, round-trip migrations as introduced above will rarely happen since, more often than not, a component will not directly return an instance it just received but rather apply some modification to the instance before returning it. Given two data models $M_1$ and $M_2$, a round-trip migration with modification of an instance $m_1 \in \mathcal{M}_1$ via $M_2$ is a consecutive application of functions $(g \circ c_2 \circ f)(m_1) = g(c_2(f(m_1)))$ where, like above, $f$ and $g$ are migration functions from $M_1$ to $M_2$ and $M_2$ to $M_1$, respectively, and $c_2 : \mathcal{M}_2 \to \mathcal{M}_2$ is an instance modification function performing the modification of the migrated instance $f(m_1) \in \mathcal{M}_2$. Due to the modification of $f(m_1)$, the original definition of a successful round-trip migration is not suitable anymore. The result of migrating back the modified instance $c_2(f(m_1)) \in \mathcal{M}_2$ is not expected to be the original instance $m_1$. Intuitively, the result is rather expected to be a modification $c_1(m_1)$ of instance $m_1$ where $c_1 : \mathcal{M}_1 \to \mathcal{M}_1$ represents the corresponding co-modification of $c_2$ on data model $M_1$. A translation layer $T = (M_1, M_2, f, g)$ which handles round-trip migrations between data models $M_1$ and $M_2$ is called successfully round-trip migrating with modification if there are co-modifications $c_1 : \mathcal{M}_1 \to \mathcal{M}_1$ and $c_2 : \mathcal{M}_2 \to \mathcal{M}_2$ such that the following conditions hold:

$$\forall\, m_1 \in \mathcal{M}_1 : g(c_2(f(m_1))) = c_1(m_1) \tag{3}$$
$$\forall\, m_2 \in \mathcal{M}_2 : f(c_1(g(m_2))) = c_2(m_2) \tag{4}$$

3. Selected Evolution Scenarios

In the following Sections 3.1 through 3.4, we introduce a selection of different cases of data model evolution and the corresponding round-trip migration scenarios.
Data models and instances are represented using UML class and object diagram notations, respectively. Each scenario comprises two versions of a data model that demonstrate the application of typical edit operations on object-oriented data models in a minimal context. Each scenario can be interpreted from two perspectives, i.e., from $M_1$ to $M_2$, or vice versa. The respective edit operations which can be observed in both cases are inverse to each other. We discuss round-trip migrations in both directions, using the shorthand notations $M_1 \mapsto M_2 \mapsto M_1$ and $M_2 \mapsto M_1 \mapsto M_2$, respectively.

For each of these round-trip migration scenarios, the task is to specify the required migration functions, referred to as migrate and migrate back in the sequel. That is, each of the four data model evolution scenarios yields two tasks which we ask to be solved by solution providers, summing up to a total number of eight tasks for the entire case. Since all of these tasks are independent of each other, participants may address a subset of them.

3.1. Create/Delete Field

In this scenario, a new field is added to (removed from) a class of the data model, as illustrated in Figure 2 (left). We assume this field to be functionally independent from any other field of the same class.

As illustrated in Figure 2 (right), in an $M_1 \mapsto M_2 \mapsto M_1$ round-trip migration, the new field age should be set to some suitable default value since the original Person instance does not provide a concrete value for this field. The more complicated case, however, is the $M_2 \mapsto M_1 \mapsto M_2$ round-trip migration since it needs to access a previous revision of the migrated object during a later stage in the round-trip migration. Here, the value of field age should be recovered from the original Person instance. In the context of traditional bidirectional transformation, this can be considered as a standard scenario which we use as a warm-up task of our round-trip migration case.

[Figure 2: Illustration of the data model evolution scenario "Create/Delete Field" (left) and the corresponding round-trip migrations $M_1 \mapsto M_2 \mapsto M_1$ and $M_2 \mapsto M_1 \mapsto M_2$ (right). Requested specifications for the latter are referred to as Task_1_M1_M2_M1 and Task_1_M2_M1_M2, respectively.]

3.2. Rename Field

In this evolution scenario, the name of a field is changed. The simplest reason for this kind of change is to improve the wording in the data model to better reflect the terminology of a domain of interest. A more challenging change is to slightly update the meaning of a field, as is the case in our evolution scenario presented in Figure 3 (left). Here, the field age in $M_1$ is changed to ybirth in $M_2$, now capturing a Person's year of birth instead of its current age.

The migration functions which are to be developed for this scenario should account for this semantic change and convert between proper values of fields age and ybirth. As illustrated in Figure 3 (right), we assume the current date as a basis for the conversions in both directions. In this case, the change in the semantics of age and ybirth requires the integration of some user-defined arithmetic operation during transformation. Purely structural approaches often lack this feature, even though in our context of Web APIs this is an important requirement.

[Figure 3: Illustration of the data model evolution scenario "Rename Field" (left) and the corresponding round-trip migrations $M_1 \mapsto M_2 \mapsto M_1$ and $M_2 \mapsto M_1 \mapsto M_2$ (right). Requested specifications for the latter are referred to as Task_2_M1_M2_M1 and Task_2_M2_M1_M2, respectively. Legend: data model evolution, migrate, migrateBack.]

3.3. Declare Field Optional/Mandatory

In this scenario, the multiplicity of a field is generalized (specialized) from 1 to 0..1 (0..1 to 1). The former case means that the field is declared to be optional, as indicated by the notation [?] attached to field name in $M_2$ of the data model shown in Figure 4 (left). The latter case is represented by the default notation used for all other fields, meaning that the field is a mandatory one.

The key issue here is to deal with potential null-values in $M_2$ and their corresponding default values in $M_1$. This is rather straightforward in an $M_1 \mapsto M_2 \mapsto M_1$ round-trip migration, as illustrated in Figure 4. Here, null-values in $M_2$ may occur due to a modification of the migrated instance, and they should be translated to a default value in $M_1$. The $M_2 \mapsto M_1 \mapsto M_2$ round-trip migration is more complicated. Here, we have to check whether a default value has been synthesized during migration or through an explicit modification. In the former case, as illustrated by the upper right example shown in Figure 4, a synthesized default value is migrated back to a null-value. In the latter case, illustrated by the lower right example shown in Figure 4, the default value is the result of an explicit modification in $M_1$, which should be migrated back to a default value instead of a null-value in $M_2$. This evolution scenario is of special interest to us, since optional fields are a common pattern used in the design and evolution of Web APIs.

[Figure 4: Illustration of the data model evolution scenario "Declare Field Optional/Mandatory" (left) and the corresponding round-trip migrations $M_1 \mapsto M_2 \mapsto M_1$ and $M_2 \mapsto M_1 \mapsto M_2$ (right). Requested specifications for the latter are referred to as Task_3_M1_M2_M1 and Task_3_M2_M1_M2, respectively. The lower example round-trip migration demonstrates how to deal with instance modifications.]

3.4. Multiple Edits

In this evolution scenario, we combine two edit operations which we have already considered before. As we can see in Figure 5 (left), from an $M_1$ to $M_2$ perspective, the field age of class Dog has been deleted, which corresponds to the edit operation considered in the evolution scenario presented in Section 3.1. At the same time, the name and semantics of field age of the referenced class Person have been changed to ybirth, as in the evolution scenario presented in Section 3.2.

The corresponding $M_1 \mapsto M_2 \mapsto M_1$ and $M_2 \mapsto M_1 \mapsto M_2$ round-trip migrations are illustrated in Figure 5 (right). Their specification can be considered as a combination of the migration functions required for the evolution scenarios presented in Section 3.1 and Section 3.2. The main aim of this scenario is to call for solutions that support some form of re-usability (see Section 4.4).

[Figure 5: Illustration of the data model evolution scenario "Multiple Edits" (left) and the corresponding round-trip migrations $M_1 \mapsto M_2 \mapsto M_1$ and $M_2 \mapsto M_1 \mapsto M_2$ (right). Requested specifications for the latter are referred to as Task_4_M1_M2_M1 and Task_4_M2_M1_M2, respectively.]

4. Evaluation Criteria

To evaluate the quality of the proposed solutions, we give a set of quality characteristics which we consider to be relevant for the specification of round-trip migrations. We draw inspiration from previous work on defining quality attributes of model transformations [12, 13, 14, 15]. We refine each quality characteristic into measurable attributes for each of the tasks presented in Section 3. To obtain concrete measures for their solutions, participants are kindly invited to use the evaluation framework provided with the case resources (see Appendix A). This way, some of the measures can be obtained in a semi-automated manner.

4.1. Expressiveness

A first important and rather obvious quality characteristic is the expressiveness of the transformation language and system being used to specify and execute round-trip migrations. Intuitively, the more data model evolution and corresponding round-trip migration scenarios are supported, the more expressive the transformation approach is.

To turn this intuition into a measurable evaluation criterion, we assess the correctness of each task by providing sets of associated tests. A test case comprises pairs of instances serving as input and as expected output of a round-trip migration. For each of the tasks presented in Section 3, a first test case is derived from the example presented in that section. A second test case is added in order to prevent literal encodings of solutions (except for the tasks presented in Section 3.3, which already have two associated test cases). A task is considered to be solved correctly if it passes all tests.

All tasks are scored by means of the provided test cases. A point is given for each passing test case, and points are summed over all test cases. This means that all tasks are scored evenly between zero and two points. Zero means the task has not been tackled at all, one point indicates a partial solution, and two points mean that the task has been solved and the transformation has been implemented correctly.

4.2. Comprehensibility

Specifications of migration functions should be comprehensible in order to be maintainable and to allow for better manual validation. Our idea of evaluating solutions is to compare their comprehensibility with that of the provided reference solution (see Section 5). For each task, the comprehensibility of the reference solution is scored by one point. Better, equal and worse comprehensibility of a submitted solution are acknowledged by two, one and zero points, respectively.

We acknowledge that such a classification is highly biased by subjective preferences. Developers familiar with model transformation languages such as Henshin or ATL most likely prefer a declarative or declarative-imperative style, while mainstream web developers will most likely prefer a purely imperative style of writing migrations. More objective measures such as code metrics, as proposed by Götz et al. [16, 17] to compare size and complexity of model transformations written in Java and ATL, are hardly applicable to compare transformations which are written in languages that follow different paradigms (which is to be expected for the different solutions of this case).

To that end, we see two options for assessing the comprehensibility of solutions, both of which involve a human in the loop. In the offline variant, we will use two distinct groups of students to evaluate a solution by answering a survey, similar to [18]. One group of students will have a background in model transformation languages, while the other group is supposed to have only (basic) programming skills (in Java). The second variant is to conduct a live evaluation with the TTC participants.

4.3. Bidirectionality

Bidirectional transformations (BX) [19] appear to be an attractive solution to our problem as they support synthesizing migration functions in both directions from a single specification. Such single specifications may be symmetric as, e.g., in the case of triple graph grammars [20], or asymmetric as, e.g., in the case of putback-based bidirectional programming [21].

Within this challenge, we do not insist on any particular mechanism for specifying bidirectional transformations, and all mechanisms are ranked equally. All tasks and extension tasks are scored evenly with zero (no bidirectionality) or one point (support for bidirectionality).

4.4. Re-usability

As with any other kind of software, re-use mechanisms are an indispensable means to increase the productivity and quality of model transformations. To that end, numerous re-use mechanisms for model transformations have been proposed in the literature; a survey may be found in [22]. We evaluate re-usability by means of the "Multiple Edits" evolution scenario presented in Section 3.4 since it subsumes the scenarios presented in Sections 3.1 and 3.2.

One possible option is to achieve re-usability by means of delegation. Specifically, when developing migration functions supporting the round-trip migration of Dog instances, this could be achieved by, e.g., delegating the migration of the referenced Person instances to migration functions which have already been defined.

Another possible re-use mechanism could be to abstract from the concrete data models and to specify the required migration functions in a generic manner, focusing on the conceptual parts of the respective edit operations. The generic migration functions would then be instantiated for the concrete data model used in this scenario. This is similar to the extraction of core transformation concepts that generalize over several meta-models [23].

In the context of Web APIs, we see this as a core requirement of a feasible transformation approach. In our setting, the continuous evolution of a data model also implies the continuous development of a corresponding migration layer. From a software engineering point of view, a transformation approach should therefore provide support for re-usability. More specifically, a single change to the data model should require only one corresponding change to the migration layer, which implies that existing migration code can be re-used.

We do not insist on any particular re-use mechanism, and all re-use mechanisms are ranked equally. Support for re-usability is acknowledged by four points, while no points are given if the specification has been developed from scratch.

4.5. Performance

Finally, we evaluate the proposed solutions with regard to runtime performance. While the functional correctness of round-trip migrations is an important step towards a valid solution, the Web API context also requires efficient solutions. The implementation of a more complex translation layer would be out of the scope of this challenge. Therefore, as a limited evaluation of the runtime characteristics of the proposed solutions, we repeatedly run the round-trip migrations required to support the evolution scenarios described in Section 3 for a large number of iterations and measure their execution time. In general, however, we consider runtime performance a secondary evaluation criterion. Hence, differences among proposed solutions with regard to runtime performance shall only serve as a tie-breaker among solutions which score equally for the other four criteria.

5. Reference Solution

To provide a reference solution for this case, we implemented in Java all the migration functions which are required to support the eight round-trip migration tasks arising from our four data model evolution scenarios presented in Section 3. Its integration into the evaluation framework presented in Appendix A is illustrated in Figure 7 (bottom). Each task is realized by a concrete subclass of class AbstractTask, each of which is instantiated by the concrete task factory called JavaTaskFactory. None of the migrations is delegated to a dedicated model transformation system, but the migration functions migrate and migrateBack are directly implemented in Java.

Qualitative evaluation results: Table 1 summarizes the qualitative evaluation results for our Java-based reference solution, namely for the criteria expressiveness, comprehensibility, bidirectionality and re-usability. On the one hand, it is not surprising that a general purpose programming language like Java is expressive enough to correctly solve all the tasks provided with this case. Thus, the reference solution achieves the maximum score in this category, i.e., two points per task, summing up to 16 points in total. On the other hand, bidirectionality and re-usability are not supported at all.

Table 1: Evaluation results obtained for the reference solution. Numbers in brackets indicate the maximum score that can be achieved.

Evolution Scenario / Task            Expressiveness  Comprehensibility  Bidirectionality  Re-usability
Create/Delete Field
  Task_1_M1_M2_M1                    2 (2)           1 (2)              0 (1)             n.a.
  Task_1_M2_M1_M2                    2 (2)           1 (2)              0 (1)             n.a.
Rename Field
  Task_2_M1_M2_M1                    2 (2)           1 (2)              0 (1)             n.a.
  Task_2_M2_M1_M2                    2 (2)           1 (2)              0 (1)             n.a.
Declare Field Optional/Mandatory
  Task_3_M1_M2_M1                    2 (2)           1 (2)              0 (1)             n.a.
  Task_3_M2_M1_M2                    2 (2)           1 (2)              0 (1)             n.a.
Multiple Edits
  Task_4_M1_M2_M1                    2 (2)           1 (2)              0 (1)             0 (4)
  Task_4_M2_M1_M2                    2 (2)           1 (2)              0 (1)             0 (4)
Sum                                  16 (16)         8 (16)             0 (8)             0 (8)

Performance results: Figure 6 illustrates the runtime characteristics of our reference solution in terms of the performance test of our evaluation framework (see Appendix A). These results were obtained on a Mid-2014 MacBook Pro with an Intel Core i5 processor running at 2.6 GHz and 8 gigabytes of main memory. As expected, the time consumed to perform the round-trip migrations grows linearly with the number of iterations. It takes about 40 seconds to perform all the 2 million iterations of our performance test.

[Figure 6: Performance results of our provided reference solution. Plot "Total Transformation Runtime": runtime (ms) over the number of repetitions.]

6. Summary and Outlook

In this paper, we outlined our vision of a so-called translation layer which supports the communication of web-based services in different, incompatible versions. One of the key tasks of implementing such a translation layer is to support the round-trip migration of instances of object-oriented data models in different versions. In this challenge description, we phrased this as a model transformation problem which, in contrast to previous TTC cases on the same topic, is driven by the needs and specifics of our application context. We are convinced that modern model transformation technologies such as Henshin [24], VIATRA [25] or ATL [26] are capable of solving the challenge in an elegant way. In particular, solutions to the TTC 2017 "Families to Persons Case" [27, 28, 29, 30] may be adapted to our case with moderate effort.

One of the next steps to further extend this challenge could be to study more evolution scenarios than the four considered in this paper. Moreover, we could think of a (semi-)automated specification of the required round-trip migration functions. Again, we are convinced that technologies from the field of model-driven engineering, notably techniques for model matching [31, 32] and differencing [33], can serve as a starting point for such automation.

References

[1] S. Wang, I. Keivanloo, Y. Zou, How do developers react to RESTful API evolution?, in: Intl. Conf. on Service-Oriented Computing, 2014.
[2] E. Wittern, Web APIs - Challenges, Design Points, and Research Opportunities, in: Intl. Workshop on API Usage and Evolution, 2018.
[3] S. Sohan, C. Anslow, F. Maurer, A case study of web API evolution, in: IEEE World Congress on Services, 2015.
[4] T. Espinha, A. Zaidman, H.-G. Gross, Web API growing pains: Stories from client developers and their code, in: Intl. Conf. on Software Maintenance, Reengineering, and Reverse Engineering, 2014.
[5] G. Bierman, M. Parkinson, J. Noble, UpgradeJ: Incremental typechecking for class upgrades, in: European Conference on Object-Oriented Programming, 2008.
[6] M. Dmitriev, M. Atkinson, Evolutionary data conversion in the PJama persistent language, in: Intl. Workshop on Object Oriented Databases, 1999.
[7] E. Rahm, P. A. Bernstein, An online bibliography on schema evolution, ACM Sigmod Record 35 (2006).
[8] R. Hebig, D. E. Khelladi, R. Bendraou, Approaches to co-evolution of metamodels and models: A survey, IEEE Transactions on Software Engineering 43 (2017).
[9] A. Anjorin, T. Buchmann, B. Westfechtel, The fam-
[22] A. Kusel, J. Schönböck, M. Wimmer, G. Kappel, W. Retschitzegger, W. Schwinger, Reuse in model-to-model transformation languages: are we there yet?, Software & Systems Modeling 14 (2015) 537–572.
ilies to persons case, in: Proceedings of the 10th [23] S. Sen, N. Moha, V. Mahé, O. Barais, B. Baudry, Transformation Tool Contest at STAF 2017, 2017. J.-M. Jézéquel, Reusable model transformations, [10] E. Biermann, C. Ermel, G. Taentzer, Formal founda- Software & Systems Modeling 11 (2012) 111–125. tion of consistent EMF model transformations by [24] D. Strüber, K. Born, K. D. Gill, R. Groner, T. Kehrer, algebraic graph transformation, Software & Sys- M. Ohrndorf, M. Tichy, Henshin: A usability- tems Modeling 11 (2012). focused framework for EMF model transformation [11] R. Heckel, J. M. Küster, G. Taentzer, Confluence development, in: Intl. Conf. on Graph Transforma- of typed attributed graph transformation systems, tion, Springer, 2017, pp. 196–208. in: Intl. Conf. on Graph Transformation, Springer, [25] D. Varró, A. Balogh, The model transformation 2002, pp. 161–176. language of the VIATRA2 framework, Science of [12] E. Syriani, J. Gray, Challenges for addressing quality Computer Programming 68 (2007) 214–234. factors in model transformation, in: Intl. Conf. on [26] F. Jouault, I. Kurtev, Transforming models with Software Testing, Verification and Validation, IEEE, ATL, in: Intl. Conf. on Model Driven Engineering 2012, pp. 929–937. Languages and Systems, Springer, 2005, pp. 128– [13] C. M. Gerpheide, R. R. Schiffelers, A. Serebrenik, A 138. bottom-up quality model for QVTO, in: Int. Con- [27] G. Hinkel, An NMF solution to the families to per- ference on the Quality of Information and Commu- sons case at the TTC 2017, in: TTC@STAF, volume nications Technology, IEEE, 2014, pp. 85–94. 2026 of CEUR Workshop Proceedings, 2017, pp. 35– [14] K. Lano, K. Maroukian, S. Y. Tehrani, Case study: 39. Fixml to Java, C# and C++, in: TTC@STAF, 2014, [28] A. Zündorf, A. Weidt, The sdmlib solution to the pp. 2–6. TTC 2017 families 2 persons case, in: TTC@STAF, [15] S. Getir, D. A. Vu, F. Peverali, D. Strüber, T. 
Kehrer, volume 2026 of CEUR Workshop Proceedings, 2017, State elimination as model transformation problem, pp. 41–45. in: TTC@STAF, 2017, pp. 65–73. [29] T. Horn, Solving the TTC families to persons case [16] S. Götz, M. Tichy, T. Kehrer, Dedicated model trans- with funnyqt, in: TTC@STAF, volume 2026 of formation languages vs. general-purpose languages: CEUR Workshop Proceedings, 2017, pp. 47–51. A historical perspective on ATL vs. Java., in: MOD- [30] L. Samimi-Dehkordi, B. Zamani, S. K. Rahimi, Solv- ELSWARD, 2021, pp. 122–135. ing the families to persons case using evl+strace, [17] S. Höppner, T. Kehrer, M. Tichy, Contrasting dedi- in: TTC@STAF, volume 2026 of CEUR Workshop cated model transformation languages vs. general Proceedings, 2017, pp. 54–62. purpose languages: A historical perspective on ATL [31] D. S. Kolovos, D. Di Ruscio, A. Pierantonio, R. F. vs. Java based on complexity and size, Software and Paige, Different models for model matching: An Systems Modeling (2021). To appear. analysis of approaches to support model differenc- [18] A. Nugroho, Level of detail in UML models and ing, in: ICSE Workshop on Comparison and Ver- its impact on model comprehension: A controlled sioning of Software Models, IEEE, 2009, pp. 1–6. experiment, Information and Software Technology [32] T. Kehrer, U. Kelter, P. Pietsch, M. Schmidt, Adapt- 51 (2009) 1670 – 1685. ability of model comparison tools, in: Intl. Conf. on [19] S. Hidaka, M. Tisi, J. Cabot, Z. Hu, Feature-based Automated Software Engineering, ACM, 2012, pp. classification of bidirectional transformation ap- 306–309. proaches, Software & Systems Modeling 15 (2016) [33] T. Kehrer, U. Kelter, G. Taentzer, A rule-based ap- 907–928. proach to the semantic lifting of model differences [20] A. Schürr, Specification of graph translators with in the context of model versioning, in: Intl. Conf. triple graph grammars, in: Intl. 
Workshop on on Automated Software Engineering, IEEE, 2011, Graph-Theoretic Concepts in Computer Science, pp. 163–172. Springer, 1994, pp. 151–163. [34] C. Brun, A. Pierantonio, Model differences in the [21] H.-S. Ko, T. Zan, Z. Hu, Bigul: a formally veri- eclipse modeling framework, UPGRADE, The Eu- fied core language for putback-based bidirectional ropean Journal for the Informatics Professional 9 programming, in: SIGPLAN Workshop on Partial (2008) 29–34. Evaluation and Program Manipulation, 2016, pp. 61–72. AllFunctionalTests PerformanceTests task_1_M1_M2_M1() testPerformance() task_1_M2_M1_M2() ... task_4_M2_M1_M2() «enumeration» TaskInfo TASK_1_M1_M2_M1 «abstract» TASK_1_M2_M1_M2 AbstractBenchmarkTests ... init() TASK_4_M2_M1_M2 «use» «use» «abstract» «instantiate» «abstract» AbstractTaskFactory AbstractTask createTask(TaskInfo info, EPackage M1, EPackage M2) AbstractTask(EPackage M1, EPackage M2) migrate(EObject instance) : EObject migrateBack(EObject instance) : EObject JavaTaskFactory Task_1_M1_M2_M1 Task_1_M2_M1_M2 ... Task_4_M2_M1_M2 Task_4_M2_M1_M2 Figure 7: Evaluation framework architecture (top) and integration of the Java-based reference solution (bottom). A. Evaluation Framework the model comparison tool EMF Compare [34]. A performance test is provided by the class General architecture Tests in our evaluation frame- PerformanceTests. There is only one test method, work may be run as JUnit tests. The abstract class called testPerformance(), which proceeds as follows: AbstractBenchmarkTests serves as a base class for Similarly, to the functional test cases, the test relies on the all concrete tests (see below), doing some basic initializa- correct implementation of the AbstractTaskFactory tion. As illustrated by the architectural overview shown and AbstractTask. 
During performance testing, all in Figure 7, the class AbstractBenchmarkTests takes test cases provided for the four evaluation scenarios are the client role of an implementation of the Abstract Fac- executed repeatedly. That is, a full round-trip migration, tory design pattern, the classes AbstractTaskFactory involving calls to migrate and migrateBack is per- and AbstractTask are supposed to encapsulate con- formed. After a certain number of warm-up iterations, crete solutions. That is, for each of the eight tasks pre- this test loop is repeated for a total of 2 million repetitions. sented in Section 3, solution providers who want to use The test method measures execution with the increasing our evaluation framework are asked to provide a con- number of repetitions and stores the results into the file crete subclass of AbstractTask which is to be instanti- results.csv at the root of the solution’s bundle. See ated by a concrete subclass of AbstractTaskFactory. the provided code repository of the evaluation frame- The class AbstractTask defines the signatures of the work regarding plotting scripts for the resulting data. two central migration functions called migrate and migrateBack, respectively. The idea is that migrate and migrateBack then delegate the actual transforma- Registration of a concrete task factory In order to tion task to the model transformation system used in a register a concrete subclass of AbstractTaskFactory, concrete solution. solution providers may use the Eclipse extension point mechanism. Concrete task factories can be registered Functional tests vs. performance tests All test through a dedicated extension point6 . Please note cases for assessing the correctness of each of the that, in this case, the classes AllFunctionalTests eight tasks presented in Section 3 may be run as and PerformanceTests need to be run as JUnit Plug- JUnit tests which are collected in the Java class In Test. 
Alternatively, solution providers may sub- called AllFunctionalTests. Each test method, i.e., class AllFunctionalTests and PerformanceTests task_1_M1_M2_M1() through task_4_M2_M1_M2(), which can be then run as a normal JUnit test. In this case, executes a particular task and checks whether for a given the init method of these concrete subclasses must take input models the obtained output model looks as ex- care of instantiating the concrete task factory. Our ref- pected. Checking the equivalence of an actual and ex- pected round-trip migration result is performed using 6 de.hub.mse.ttc2020.benchmark.concretetaskfactory erence solution (see Section 5) implements both options for the sake of illustration. Test data Finally, since many model transformation tools available in the model transformation research com- munity are based on the Eclipse Modeling technology stack, we provide implementations of the data models used in the evolution scenarios presented in Section 3 in EMF Ecore. Consequently, instances serving as test data for assessing the correctness of transformation tasks are represented as EMF instances (often referred to as instance models in the EMF community).
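Illustrative sketch. The contract of AbstractTask (a migrate function and a migrateBack function which together form a round trip) and the delegation-based re-use discussed in Section 4.4 can be illustrated in plain Java. The following is a simplified sketch under stated assumptions and not the reference solution: instances are represented as java.util.Map rather than as EMF EObject instances, and the class names SketchTask, PersonRenameTask and DogTask as well as the field names name, fullName and owner are hypothetical and chosen for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the framework's AbstractTask.
// Assumption: plain maps replace EMF EObject instances.
abstract class SketchTask {
    abstract Map<String, Object> migrate(Map<String, Object> instance);

    abstract Map<String, Object> migrateBack(Map<String, Object> instance);

    // A full round trip: forth to the other version and back again.
    Map<String, Object> roundTrip(Map<String, Object> instance) {
        return migrateBack(migrate(instance));
    }

    // Copies the instance and moves the value stored under `from` to `to`.
    static Map<String, Object> rename(Map<String, Object> instance, String from, String to) {
        Map<String, Object> copy = new HashMap<>(instance);
        if (copy.containsKey(from)) {
            copy.put(to, copy.remove(from));
        }
        return copy;
    }
}

// Rename-field style migration; the field names are assumptions.
class PersonRenameTask extends SketchTask {
    @Override
    Map<String, Object> migrate(Map<String, Object> person) {
        return rename(person, "name", "fullName");
    }

    @Override
    Map<String, Object> migrateBack(Map<String, Object> person) {
        return rename(person, "fullName", "name");
    }
}

// Re-use by delegation: the referenced person is migrated by the existing task.
class DogTask extends SketchTask {
    private final SketchTask personTask = new PersonRenameTask();

    @Override
    Map<String, Object> migrate(Map<String, Object> dog) {
        return withMigratedOwner(dog, true);
    }

    @Override
    Map<String, Object> migrateBack(Map<String, Object> dog) {
        return withMigratedOwner(dog, false);
    }

    @SuppressWarnings("unchecked")
    private Map<String, Object> withMigratedOwner(Map<String, Object> dog, boolean forward) {
        Map<String, Object> copy = new HashMap<>(dog);
        Object owner = dog.get("owner");
        if (owner instanceof Map) {
            Map<String, Object> person = (Map<String, Object>) owner;
            copy.put("owner", forward ? personTask.migrate(person) : personTask.migrateBack(person));
        }
        return copy;
    }
}
```

In this sketch, a change to the person migration only needs to be made in PersonRenameTask; DogTask picks it up through delegation. A functional test in the spirit of AllFunctionalTests would compare the result of roundTrip with the expected instance, which here reduces to Map.equals instead of a comparison with EMF Compare.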