=Paper= {{Paper |id=Vol-1693/voltpaper2 |storemode=property |title=Verified bidirectional transformations by construction |pdfUrl=https://ceur-ws.org/Vol-1693/VoltPaper2.pdf |volume=Vol-1693 |authors=Kevin Lano,Sobhan Yassipour-Tehrani |dblpUrl=https://dblp.org/rec/conf/models/LanoT16 }} ==Verified bidirectional transformations by construction== https://ceur-ws.org/Vol-1693/VoltPaper2.pdf
      Verified Bidirectional Transformations by
                     Construction

                         K. Lano, S. Yassipour-Tehrani

                               Dept. of Informatics
                              King’s College London
                                   London, UK
                           Email: kevin.lano@kcl.ac.uk,
                        sobhan.yassipour tehrani@kcl.ac.uk




      Abstract. Bidirectional transformations (bx) are of increasing signifi-
      cance in model-based engineering. Currently bx are defined using a num-
      ber of specialised transformation languages. In this paper we show how
      standard UML elements such as use cases and OCL constraints can be
      used to define bx, thus taking advantage of the wide industrial and educa-
      tional adoption and support for UML. We define patterns and techniques
      for specifying bx so that they are correct by construction.



1   Introduction

Bidirectional model transformations (bx) provide a means to maintain a consis-
tency relation R between two models which may both change.
    Bx are significant for a wide range of model based engineering scenarios such
as round-trip engineering; automatic consistency maintenance of multiple inter-
related models, and application and database co-evolution [3].
    Bx are characterised by a binary relation R : SL ↔ TL between a source
language (metamodel) SL and a target language TL. R(m, n) holds for a pair of
models m of SL and n of TL when the models consist of data which corresponds
under R. It should be possible to automatically derive from the definition of R
both forward and reverse transformations R → : SL × TL → TL and R ← : SL ×
TL → SL which aim to establish/maintain R between their first (respectively
second) and their result target (respectively source) models, given both existing
source and target models.
    Bx should satisfy two key conditions [16, 12]:

1. Correctness: the forward and reverse transformations derived from a relation
   R do establish R: R(m, R → (m, n)) and R(R ← (m, n), n) for each m : SL,
   n : TL.
2. Hippocraticness: if source and target models already satisfy R then the for-
   ward and reverse transformations do not modify the models: R(m, n) ⇒
   R → (m, n) = n and R(m, n) ⇒ R ← (m, n) = m for each m : SL, n : TL.
Additionally, updates to a model by R → or R ← should be the minimal necessary
to restore R: this is termed the principle of least change [13].
    Section 2 describes current bx research, Section 3 describes some bx patterns,
and Section 4 describes our approach to bx specification in UML. Section 5
provides a proof that the bx properties for R → are satisfied by transformations
defined using our approach. Section 6 gives an evaluation of the approach.


2   Bx Specification Languages and Techniques
Bx are the subject of extensive theoretical research [16, 17, 4, 12] and an annual
conference, BX. The most well-known transformation language which supports
definition of bx is QVT-R [14]. However, QVT-R has had limited uptake because
of its complex semantics and specification style, compared to simpler unidirec-
tional MT languages such as ATL [6] and ETL [7]. Triple Graph Grammars
(TGG) are a graph transformation-based formalism which supports bx defini-
tion, although with a restricted expression language compared to OCL. Other
approaches, such as JTL [5], and extensions of TGG [1], use constraint-based
programming techniques to interpret relations P (s, t) between source and target
elements as specifications in both forward and reverse directions. Model finding
using Alloy is used to bidirectionalise ATL in [12].
    In this paper we describe an approach to bx definition using standard UML
2 elements (class diagrams, use cases, OCL) which avoids the use of specialised
languages, and therefore is preferable from the viewpoint of industrial uptake,
and for the long-term maintenance and reuse of bx. In our approach, unidirec-
tional transformations τ are specified using use cases and OCL, then a statically-
computed bx relation Postτ & Invτ for τ is derived, together with a forward
transformation τ → extending τ , and an inverse τ ← of τ → . The bx relation and
transformations are derived from τ using a higher-order transformation. This
is generally more efficient than the use of constraint programming and model
finding.


3   Patterns for Bidirectional Transformations
We have identified a number of patterns for bx, which can be used to establish
the bx correctness conditions [11]:
 – Auxiliary Correspondence Model: maintain a detailed trace between
   source model and target model elements to facilitate change-propagation in
   source to target or target to source directions.
 – Cleanup before Construct: for R → , remove superfluous target model
   elements which are not in the transformation relation R with any source
   elements, before constructing target elements related to source elements,
   and similarly for R ← .
 – Unique Instantiation: Do not recreate elements t in one model which
   already correspond to an element s in the other model, instead modify the
    data of t to enforce the transformation relation. Use key attributes to identify
    when elements should be created or updated.

All three of these patterns are used in our approach to define bx using UML. In
addition, we have used three further patterns to structure bx:

 – Factor a Text-to-text bx by a Model-to-model bx: this architectural
   pattern defines a text-to-text bx as a composition of parsing, model-to-model
   and model-to-text transformations in both directions.
 – Inverse Recursive Descent: if a forward transformation τ → is defined
   using structural recursion on the source language SL, the reverse transfor-
   mation τ ← may be derived from τ → as a dual structural recursion on TL.
 – Bx Data Correspondence: a special case of Auxiliary Correspondence
   Model which defines an auxiliary association in SL ∪ TL to express a bx
   relation concerning specific elements or basic values (eg., a correspondence
   between expression operators in one language and the other).


4   Bidirectional Transformation Specification in
    UML-RSDS

UML-RSDS is a particular version of executable UML (xUML), based on UML
2 and OCL 2.4, with a formal semantics [8] and an established toolset [9]. Model
transformations τ are specified in UML-RSDS as UML use cases, defined declar-
atively by two main predicates, expressed in a subset of OCL:

1. Postconditions Postτ which define the intended effect of the transformation
   at its termination. These are an ordered conjunction C1 & ... & Cn of OCL
   constraints (also termed rules in the following) and also serve to define a
   design (activity) and implementation of the transformation.
2. Invariants Invτ which define expected invariant properties which should hold
   during the transformation execution. These can be derived from Postτ , or
   can be specified explicitly by the developer. They are the constraints of the
   use case, considered as a UML classifier.

    A UML activity stat(Cn) is derived by the UML-RSDS tools for each post-
condition constraint Cn. This activity is constructed so that it establishes Cn:
[stat(Cn)]Cn. The activity for Postτ is formed as a composition of the stat(Cn).
In this paper, stat(Postτ ) is the sequential composition of the stat(Cn).
    Data-dependency relations between the Post constraints are important in
ensuring the correctness of both unidirectional and bidirectional UML-RSDS
transformations. Let rd (Cn) denote the read-frame of a constraint Cn, the set
of entity type names and data features that it reads, and wr (Cn) denote its
write frame. These are defined in [8].
    A dependency ordering Cn < Cm is defined between constraints by wr (Cn)∩
rd (Cm) ̸= {} “Cm depends on Cn”. A use case with postconditions C1 , . . . , Cn
should satisfy the syntactic non-interference conditions:
 1. If Ci < Cj , with i ̸= j , then i < j .
 2. If i ̸= j then wr (Ci ) ∩ wr (Cj ) = {}.
Together, these conditions ensure that the activities stat(Cj ) of subsequent con-
straints Cj cannot invalidate earlier constraints Ci , for i < j .
    A constraint is termed a type 1 constraint if its write and read frames are
disjoint. Such constraints usually have an implementation as a bounded loop, as
opposed to a fixed-point/unbounded iteration.
    An invariant Inv and explicit inverse predicate Post ∼ can be derived me-
chanically from Post [9, 11]. This derivation is automated in the UML-RSDS
tools. Tables 1 and 2 show some examples of inverses P ∼ of predicates P . The
transformation developer can also specify inverses for particular Cn by defin-
ing a suitable Cn ∼ constraint in Inv , for example, to express that a predicate
t.z = s.x + s.y should be inverted as s.x = t.z − s.y.


P(s,t)                              P ∼ (s, t)                             Condition
t.g = s.f                           s.f = t.g                              Assignable features f , g
t.g = s.f →pow (x )                 s.f = t.g→pow (1.0/x )                 x non-zero,
                                                                           independent of s.f
t.b2 = not(s.b1)                     s.b1 = not(t.b2)                      Boolean attributes b1, b2
t.g = K ∗ s.f + L                    s.f = (t.g − L)/K                     f , g numeric attributes
Numeric constants K , L
K ̸= 0
R(s, t) & Q(s, t)                    R ∼ (s, t) & Q ∼ (s, t)
s→forAll (x |                        t→forAll (y |                           E is the entity
   F →exists(y | P (x , y) & y : t))     E →exists(x | P ∼ (x , y) & x : s)) element type of s
                             Table 1. Inverse of Predicates




    In terms of the framework of [16], the source-target relation Rτ associated
with a UML-RSDS transformation τ is Postτ & Invτ . Rτ is not necessarily
bijective. The forward direction of τ is normally computed as stat(Postτ ): the
UML activity derived from Postτ when interpreted procedurally [8]. However, in
order to achieve the correctness and hippocraticness properties, Invτ must also
be considered: before stat(Postτ ) is applied to the source model m and target
model n, n must be cleared of elements which fail to satisfy Invτ . This is a case
of the Cleanup before Construct pattern.
    Given a typical postcondition Ci of the form
    Si ::
            SCondi (self ) ⇒ Tj →exists(t | TCondj (t) & Pi (self , t))
the corresponding invariant constraint is Invi :
    Tj ::
            TCondj (self ) ⇒ Si →exists(s | SCondi (s) & Pi (s, self ))
P(s,t)                      P ∼ (s, t)                               Condition
t.rr = s.r →reverse()       s.r = t.rr →reverse()                    r , rr ordered association ends
t.rr = s.r →first()         s.r →first() = t.rr                      r ordered association end
t.rr = s.r →last()          s.r →last() = t.rr                       r ordered association end
t.rr = s.r →including(s.p) s.r = t.rr →front() &                     rr , r ordered association ends
t.rr = s.r →append (s.p)    s.p = t.rr →last()                       p 1-multiplicity end
t.rr = Sequence{s.p1, s.p2} s.p1 = t.rr →at(1) &                     rr ordered association end
                            s.p2 = t.rr →at(2)                       p1, p2 1-multiplicity ends
t.rr = s.r →select(P )      s.r = (s.r − (s.r →select(P ) − t.rr ))→ r , rr
                            merge(t.rr →select(P ))                  set-valued
t.rr = s.r →sort()          s.r = t.rr →asSet()                      r set-valued, rr ordered
t.rr = s.r →asSequence()
                    Table 2. Inverse of Predicates on Associations




   The general cleanup constraint derived from Invi is:
    Tj ::
            TCondj (self ) &
            not(Si →exists(s | SCondi (s) & Pi (s, self ))) ⇒ self →isDeleted ()
However, this constraint may delete more Tj instances than is necessary: not only
instances which correspond (i) to deleted Si instances, or (ii) to Si instances that
no longer satisfy SCondi , but also (iii) to Si instances satisfying SCondi , but not
Pi (s, self ). In this last case, self should not be deleted, instead stat(Pi ) should
be performed to re-establish Pi between the modified s and self . Therefore, some
form of tracing is required, to distinguish these cases.
    The Auxiliary Correspondence Model pattern can be used to introduce trac-
ing relations between source and target entities. In general such relations can
be many-many, and are implemented by an auxiliary association Si ∗$rtrace —
∗
$trace Tj linking corresponding elements in the two languages. In the transforma-
tion specification, s.$trace can be referred to as s→equivalents(), and t.$rtrace
as t→requivalents(). The computational overhead of maintaining such relations
can be substantial, and instead we usually apply the Auxiliary Correspondence
Model pattern to introduce String-valued identity attributes (primary keys) for
source and target entities to record and identify which source and target in-
stances correspond: source instance s corresponds to target instance t when
s.sId = t.tId . This also means that Si [t.tId ] = s and Tj [s.sId ] = t. Finally, the
Unique Instantiation pattern is used to modify existing objects in one model
which correspond to a modified object in the other model, instead of creating a
new object.
    If identity attributes have been introduced using Auxiliary Correspondence
Model, the cleanup constraints can be simplified to:
    Tj ::
            TCondj (self ) &
            not(tId : Si →collect(sId )) ⇒ self →isDeleted ()
and

      Tj ::
              TCondj (self ) & tId : Si →collect(sId ) &
              not(SCondi (Si [tId ])) ⇒ self →isDeleted ()

which correspond to cases (i) and (ii) above. This version is also potentially
more efficient than the original cleanup constraint. The second case is omitted
if SCondi is the default true. The updates of case (iii) are carried out in the
Post constraints. τ → is defined to have postconditions consisting of the cleanup
constraints, followed by Post, and hence it is an extension of τ : its postconditions
imply those of τ .
    The constraints of Postτ need to be interpreted using Unique Instantiation
in τ → : the E →exists(e | P ) quantifier in rule succedents should be interpreted
as “create a new e : E and establish P for e, unless there already exists an
e : E satisfying P ”. This is also known as ‘check before enforce’ semantics. If
key attributes are defined, a constraint

      E ::
              F →exists(f | f .fId = eId & P )

means that if eId : F →collect(fId ) (a corresponding F instance F [eId ] already
exists), then it is selected and modified to satisfy P – this performs case (iii)
of the cleanup actions to re-establish Inv after a source model change. Only if
eId ̸∈ F →collect(fId ) is a new F instance created.


5     Correctness and Hippocraticness Properties for τ →
In this section we give a formal justification that bx defined using the UML
specification approach described above do satisfy the bx properties.
    We consider first the case of UML-RSDS transformations that satisfy the
Structure Preservation pattern [10], in which source language entities Si cor-
respond to target language entities Ti , and whose constraints are of type 1,
satisfying syntactic non-interference. Identity attributes are used to implement
Auxiliary Correspondence Model. This means that τ is a separate-models trans-
formation with source language S and target language T , and postcondition
Post is an ordered conjunction of constraints Ci each of the form:

      Si ::
              SCondi (self ) ⇒
                 Ti →exists(t | t.tId = sId & TCondi (t) & Pi (self , t))

and Inv is a conjunction of constraints Invi of the form

      Ti ::
              TCondi (self ) ⇒
                 Si →exists(s | tId = s.sId & SCondi (s) & Pi (s, self ))
     The simplified cleanup constraints for τ → are therefore:

      Ti ::
              TCondi (self ) & not(tId : Si →collect(sId )) ⇒ self →isDeleted ()

and

      Ti ::
              TCondi (self ) & tId : Si →collect(sId ) &
              not(SCondi (Si [tId ])) ⇒ self →isDeleted ()

The second constraint is omitted if SCondi is the default true predicate. We
denote the collection of the cleanup constraints by Cleanup.
    Bx correctness holds, since the cleanup constraints together with Post es-
tablish Inv : the Cleanup constraints remove any target model elements that
fail to correspond by identity to source model elements in the domain of τ ,
and Post modifies target model elements that do correspond to valid source el-
ements, in order to re-establish Inv . By construction, stat(Post) preserves Inv
and establishes Post. Therefore, the sequential composition of stat(Cleanup) and
stat(Post) establishes the bx relation R as Post & Inv .
    Hippocraticness holds, since stat(Cleanup) has no effect if the models al-
ready satisfy R, and neither does stat(Post), due to the Unique Instantiation
interpretation of the exists operator. The principle of least change is satisfied,
since changes to the source model either lead to deletion of exactly those target
elements (and their incident links) which cannot be modified to correspond to
source elements (cases (i) and (ii) above), or to creation/modification of tar-
get elements, using stat(Ci ). But stat(Ci ) is designed as a minimally-intrusive
update to the target model which will establish Ci . Interpreting a quantifier
Tj →exists(t | P ) using Unique Instantiation means that un-necessary creation
of target elements is avoided.
    A more complex case is Entity Merging, where two or more source entity
types Si , Sj may map to the same target entity type Ti . Provided that the
respective TCondi conditions are pairwise disjoint, the same construction and
argument for the bx correctness apply as in the Structure Preservation case.
Semantic non-interference of Post holds, because the set of Ti instances pro-
duced from Si is disjoint from the set produced from Sj , although syntactic non-
interference fails1 . Entity Splitting involves one source entity type mapping to
two or more different target entity types. There are two versions of this pattern:
(i) two or more separate postconditions on Si , with disjoint SCondi conditions,
mapping to distinct target entity types; (ii) a single constraint which maps one
Si instance to multiple linked instances of different Tj . The first case can be
treated in a similar way to the Structure Preservation case, provided that syn-
tactic or semantic non-interference of Post holds. The second case is common in
1
    A use case satisfies semantic non-interference if for i < j : Ci ⇒ [stat(Cj )]Ci , where
    [act]P is the weakest-precondition of P with respect to act. Syntactic non-
    interference implies semantic non-interference, but not conversely.
refinement transformations, and in some migrations. Constraints, eg., C1 , will
have the form

    S1 ::
            SCond1 (self ) ⇒ T1 →exists(t1 | t1.t1Id = sId &
               T2 →exists(t2 | t2.t2Id = sId & TCond1 (t1) &
                           TCond2 (t1, t2) & P1 (self , t1, t2)))

Both t1 and t2 are given id values self .sId . This should be the only constraint
which creates either T1 or T2 instances (this restriction is implied by syntactic
non-interference). Inv1 is of the form

    T1 ::
            TCond1 (self ) & t2 : T2 & t2.t2Id = t1Id & TCond2 (self , t2) ⇒
               S1 →exists(s | t1Id = s.sId & SCond1 (s) & P1 (s, self , t2))

The t2 : T2 expression acts like an additional T2 →forAll (t2 quantifier over the
remainder of the constraint.
   The cleanup constraints are:

    T1 ::
            TCond1 (self ) & t2 : T2 & t2.t2Id = t1Id & TCond2 (self , t2) &
            not(S1 →exists(s | t1Id = s.sId & SCond1 (s))) ⇒
                               self →isDeleted () & t2→isDeleted ()

The effect of such a bx is that a ‘cluster’ of target instances are related to
each source instance, with all elements of the cluster sharing the source instance
identity value.
    Again, stat(Post) establishes Post and preserves Inv , whilst stat(Cleanup)
together with stat(Post) establishes Inv . Thus correctness follows. Hippocrat-
icness holds because target model pairs (or clusters) t1, t2, ... are only deleted
if they do not correspond to a source instance. Unique Instantiation can apply
to multiple exists quantifiers in the same manner as to single quantifiers. The
principle of least change follows from the minimality of stat(Ci ) as an activity
to establish Ci .
    Finally, the Map Objects before Links pattern can be used. This separates the
construction of target instances, and the linking of these instances. The linking
constraints have the form:

    S1 ::
            Tj [idS1 ].rr = TRef [r .idSRef ]

These define target model association ends rr from source model association ends
r , looking-up target model elements Tj [idS1 ] and TRef [r .idSRef ] which have
already been created by a preceding constraint. However, the linking constraints
are not invariants of the transformation, and are omitted from Inv . The cleanup
constraints for such τ are defined as for the preceding cases, and by a similar
argument, bx correctness, hippocraticness and least change can be shown. The
re-establishment of Post and Inv for corresponding source and target elements
is now performed by possibly a number of separate Post constraints.
    A similar analysis applies to the reverse transformation τ ← . This has post-
condition Postτ∼ , the explicit form of Invτ . Postτ∼ may involve view updates [2]
of the source language SL, based on changes to TL elements. For example, pred-
icates s.f →last() = t.g or s.f →select(P 1) = t.g.
    stat(Post ∼ ) establishes Inv , whilst the cleanup constraints together with
Post ∼ establish Post. Thus τ ← satisfies correctness. Hippocraticness follows
since neither the cleanup or Post ∼ constraints modify the source model if R
already holds.


6   Evaluation

The approach described in this paper enables transformation developers to fo-
cus on writing correct postconditions Postτ for a conventional (unidirectional)
forward transformation τ . From these, the necessary invariants and cleanup con-
straints for bx execution mode can be automatically generated by a higher-order
transformation, producing a bx relation Postτ & Invτ and forward and reverse
transformations with respect to this relation. The forward transformation ex-
tends τ . This is the same approach as taken for the bidirectionalisation of ATL
in [12]: from a unidirectional forward ATL transformation τ , a bx relation T
and transformations T → and T ← are derived, where T → extends τ . In contrast
to [12] we use the minimality of the activity stat(P ) to achieve the principle
of least change, rather than reliance upon an external formalism (Alloy). Meta-
model constraints are incorporated into the bx relation by [12]. Our approach is
instead to prove separately that Postτ & Invτ ⇒ ThTL , where ThTL are the
target language constraints.
     By using standard UML notations and concepts, our approach should en-
able bx specifications to be retained and maintained independently of particular
transformation technologies. The use of OCL for transformation specification fa-
cilitates the automated derivation of transformation invariants and reverse rules.
Our approach scales up to transformations of practical size, and we have applied
it to a complex UML to C code generator with over 150 mapping rules. The ap-
proach defined here could also potentially be used with other similar specification
formalisms, such as Alf/fUML [15] and EVL (http://www.eclipse.org/epsilon/doc/evl).


References

 1. A. Anjorin, G. Varro, A. Schurr, Complex attribute manipulation in TGGs with
    constraint-based programming techniques, BX 2012, Electronic Communications of
    the EASST vol. 49, 2012.
 2. F. Bancilhon, N. Spyratos, Update semantics of relational views, ACM Transactions
    on Database Systems, vol. 6, no. 4, pp. 557–575, 1981.
 3. M. Beine, N. Hames, J. Weber, A. Cleve, Bidirectional transformations in database
    evolution: a case study ‘at scale’, EDBT/ICDT 2014, CEUR-WS.org, 2014.
 4. J. Cheney, J. McKinna, P. Stevens, J. Gibbons, Towards a repository of bx exam-
    ples, EDBT/ICDT 2014, 2014, pp. 87–91.
 5. A. Cicchetti, D. Di Ruscio, R. Eramo, A. Pierantonio, JTL: a bidirectional and
    change propagating transformation language, SLE 2010, LNCS vol. 6563, 2011, pp.
    183–202.
 6. F. Jouault, F. Allilaire, J. Bezivin, I. Kurtev, ATL: a model transformation tool,
    Science of Computer Programming, vol. 72, no. 1, pp. 31–39, 2008.
 7. D. S. Kolovos and R. F. Paige and F. Polack, The Epsilon Transformation Lan-
    guage, ICMT, 2008, pp. 46–60.
 8. K. Lano, S. Kolahdouz-Rahimi, Constraint-based specification of model transforma-
    tions, Journal of Systems and Software, vol. 88, no. 2, February 2013, pp. 412–436.
 9. K. Lano, The UML-RSDS Manual, www.dcs.kcl.ac.uk/staff/kcl/uml2web/umlrsds.pdf,
    2016.
10. K. Lano, S. Kolahdouz-Rahimi, Model-transformation Design Patterns, IEEE
    Transactions in Software Engineering, vol 40, 2014.
11. K. Lano, S. Kolahdouz-Rahimi, S. Yassipour-Tehrani, Patterns for Specifying Bidi-
    rectional Transformations in UML-RSDS, ICSEA 2015.
12. N. Macedo, A. Cunha, Least-change bidirectional model transformation with QVT-
    R and ATL, Softw. Syst. Model, 2014, doi:10.1007/s10270-014-0437-x.
13. L. Meertens, Designing constraint maintainers for user interaction, Third Work-
    shop on Programmable Structured Documents, Tokyo University, 2005.
14. OMG, MOF 2.0 Query/View/Transformation Specification v1.1, 2011.
15. OMG, Action Language for Foundational UML (ALF), v1.0.1, 2015.
16. P. Stevens, Bidirectional model transformations in QVT: semantic issues and open
    questions, SoSyM, vol. 9, no. 1, January 2010, pp. 7–20.
17. P. Stevens, A simple game-theoretic approach to checkonly QVT-Relations, Softw.
    Syst. Model vol. 12, no. 1, 2013, pp. 175–199.