<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Optimized declarative transformation First Eclipse QVTc results</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Edward D. Willink</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Willink Transformations Ltd.</institution>
          ,
          <addr-line>Reading</addr-line>
          ,
          <country country="UK">UK</country>
          ,
          <institution>ed/at/willink.me.uk</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>It is over ten years since the rst OMG QVT FAS1 was made available with the aspiration to standardize the edgling model transformation community. Since then two serious implementations of the operational QVTo language have been made available, but no implementations of the core QVTc language, and only rather preliminary implementations of the QVTr language. No signi cant optimization of these (or other transformation) languages has been performed. In this paper we present the rst results of the new Eclipse QVTc implementation demonstrating scalability and major speedups through the use of metamodel-driven scheduling and direct Java code generation.</p>
      </abstract>
      <kwd-group>
        <kwd>optimization</kwd>
        <kwd>declarative transformation</kwd>
        <kwd>scheduling</kwd>
        <kwd>code generation</kwd>
        <kwd>QVTc</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>the e ciency hazards that are addressed in Section 3. Section 4 describes the
derivation of an e cient declarative schedule and Section 5 presents preliminary
results demonstrating some scalability and code generation speed-ups. Section 6
summarizes some related work and Section 7 concludes.
2</p>
      <p>E</p>
      <p>ciency
The execution time of each algorithm in a computer program is `obviously' the
product of many factors: (W:N:R:L:C)A
{ W - the Workload - the number of data elements to be processed
{ N - the Necessary computations per data element
{ R - the memory Representation access overhead
{ L - the programming Language overhead
{ C - the Control overhead
{ A - the Algorithmic overhead</p>
      <p>We cannot optimize N or W since they represent the Necessary Work.</p>
      <p>
        We have limited choice in regard to L since use of a more e cient Language
than Java may incur tooling, portability or cultural di culties. We have similarly
limited choice in Representation since there may be few frameworks to choose
from. We should however be aware that Java may cost a factor of two or three
and that unthinking use of EMF [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] may incur a factor of ten (Figure 5).
      </p>
      <p>The Control overhead is in uenced a bit by programming style, but mostly
by the tooling approach, thus an interpreted form of execution may easily incur
a ten-fold overhead when compared to a code generated one.</p>
      <p>The above costs are all proportionate. Algorithmic cost is however
exponential, normally with a unit exponent, but a poor algorithm may involve quadratic
or worse costs. It is therefore recognized that the best way to improve
performance is to discover a better algorithm, or, more practically, to replace a stupid
algorithm that performs repeated or unnecessary work.</p>
      <p>For a model transformation using an imperative transformation language,
the programmer speci es the algorithms (mappings) and the control structures
that invoke them (mapping calls). We hope that the programmer selects good
algorithms and that the tooling for the transformation language implements
them without disproportionate overheads so that the overall execution incurs
only proportionate costs relative to the given program's Necessary Work.</p>
      <p>When we use a declarative model transformation, the control structures are
no longer de ned by the programmer. A control strategy must be discovered by
the transformation tooling. This has the potential to give improved performance
since a better control strategy may be discovered than that programmed
imperatively. On the other hand, inferior declarative transformation tooling may incur
Control overheads that result in disproportionate Algorithmic overheads.</p>
      <p>In this paper we outline approaches that avoid Algorithmic overheads and
since we use code generation to Java, we compound a perhaps ve-fold reduction
in Representation access and a perhaps ve-fold reduction in Control overhead
when compared to an interpreted execution using dynamic EMF.</p>
    </sec>
    <sec id="sec-2">
      <title>Architecture</title>
      <p>
        The Eclipse QVTd architecture for QVTr and QVTc `solves' the problem of
implementing a triple language speci cation by introducing Yet Another Three
QVT Languages[
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] : QVTu, QVTm and QVTi. Subsequent scheduler
development has introduced two more languages QVTp and QVTs for use in the
transformation chain shown in Figure 1.
      </p>
      <p>QVTr2QVTc implements the RelToCore transformation only partially de ned
by the QVT speci cation. This is still a work in progress.</p>
      <p>QVTc2QVTu exploits the user's chosen direction to form a Unidirectional
transformation without the bloat for the unwanted directions.</p>
      <p>QVTu2QVTm normalizes and attens to form a Minimal transformation free
from the complexities of mapping re nement and composition.</p>
      <p>QVTm2QVTp Partitions mappings into micro-mappings that are free from
deadlock hazards.</p>
      <p>The foregoing representations use increasingly restricted semantic subsets of
the QVTc language.</p>
      <p>QVTp2QVTs creates a graphical representation suitable for dependency analysis
enabling an e cient Schedule to be planned.</p>
      <p>QVTs2QVTi serializes the graphical representation and schedule into an
Imperative extension and variation of the QVTc language. This can be executed
directly by the QVTi interpreter that extends the OCL Virtual Machine.
Alternatively an extended form of the OCL code generator may be used to
produce Java code directly. This bypasses many of the overheads of interpreted
execution or dynamic EMF. The generated code comprises one outer class
pertransformation. Each mapping is realized as either a nested class or a function
depending on whether state is needed to handle repeated or premature execution.</p>
    </sec>
    <sec id="sec-3">
      <title>Scheduling</title>
      <sec id="sec-3-1">
        <title>Naive Schedule</title>
        <p>A declarative transformation can always execute using the naive polling schedule
{ Retry loop - loop until all work done
{ Mapping loop - loop over all possible mappings
{ Object loops - multi-dimensional loop for all object/argument pairings
{ Compatibility guard - if object/argument pairings are type compatible
{ Repetition guard - if this is not a repeated execution
{ Validity guard - if all input objects are ready
{ Execute mapping for given object/argument pairings
{ Create a memento of the successful execution</p>
        <p>This is hideously ine cient with speculative executions wrapped in at least
three loops, guards and mementos when compared to a simple linear `loop' nest.</p>
        <p>The key functionality of a declarative model transformation comprises a suite
of OCL expressions that relates output elements and their properties to input
elements and their properties using metamodels to de ne the type system of the
elements and properties. The suite of OCL expressions is structured as mappings
that provide convenient re-usable modules of control.
4.2</p>
      </sec>
      <sec id="sec-3-2">
        <title>Metamodel-driven planning, micro-mappings</title>
        <p>The metamodel type system enables producer/consumer relationships to be
established between mappings. These relationships can be used by an intelligent
`Mapping loop' to avoid many of the retries that result from careless attempted
execution of consumers before producers. Similarly, considering only type
compatible objects in the `Object loops' eliminates another major naivety.</p>
        <p>Analysis of the types alone is of limited value, since each type may have
many properties. Each property assignment must be analyzed separately to avoid
deadlock hazards. For instance, an attempt to assign both forward and backward
references of a circular linked list at once deadlocks somewhere round the loop.
Therefore a declarative mapping that appears to assign all properties at once,
must be broken up into smaller micro-mappings to avoid a deadlock hazard. In
practice fewer than 10% of mappings appear to need partitioning into
micromappings, and once a valid schedule has been established, some of these can be
merged.</p>
        <p>The `Object loops' can be very ine cient even when restricted to type
compatible objects, since the number of permutations grows exponentially with the
number of inputs. For genuinely independent inputs this cost is fundamental
and the only solution is for the programmer to provide a better algorithm that
exposes a dependence. In practice many objects have a very strong relationship
such as between a parent and a child object. From the parent there may be many
child candidates, but from the child there is only zero or one parent object. We
therefore replace the parent as an externally searched input by an internally
derived computation. This reduces the order of the input permutation. In practice
between 90% and 100% of micro-mappings can be simpli ed to only one input.</p>
        <p>A micro-mapping either executes successfully updating its outputs, or fails
completely updating nothing until a later successful re-execution. Failure occurs
whenever the computation prematurely navigates from one object to another.
We therefore want to identify as many of the objects that a micro-mapping
may access in order to provide a static schedule in which consumed objects are
produced rst.
4.3</p>
      </sec>
      <sec id="sec-3-3">
        <title>Intra-micro-mapping scheduling</title>
        <p>Of course at compile-time we have no instances, just types which only give
us limited producer-consumer precision. However the constraints in a
declarative transformation de ne patterns that enable us to relate multiple types. Our
QVTp2QVTs analysis therefore starts with a transliteration from partitioned
QVTc as shown in Figure 2 to graphical form as shown in Figure 3</p>
        <p>The example is taken from the ubiquitous UML to RDBMS example and
shows the mapping from a Class (child of a Package) to a Table (child of a
Schema) and since we are using QVTc, the intermediate trace is programmed
explicitly as a ClassToTable. The textual representation in Figure 2 shows a two
input (p, c), two output (p2s, s) interface and four creations (c2t, t, pk, pc). The
parentheses of the where clause show three pattern matching constraints. The
braces of the where clause show thirteen assignments.</p>
        <p>
          The graphical form is inspired in part by Henshin [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] but with additional
colors. Black indicates what is constant at compile-time. Blue shows what is
loaded from input models. Green indicates what is created. Additionally Cyan
identi es what is consumed after it is produced elsewhere.
        </p>
        <p>Solid edges show something-to-one unidirectional navigation paths between
rectangular class instances and their rounded-rectangular attributes. Nodes are
annotated with instance name and type, edges with property name and
multiplicity. Dashed edges and ellipses show computations rather than navigations.</p>
        <p>Bidirectional navigations are shown as pairs of unidirectional edges, but only
something-to-one navigations are shown. Consequently any navigation edge may
be traversed in its forward direction to the precisely one object that matches the
pattern element. Following all the paths in this way identi es that all navigable
objects can be reached unambiguously from the c:Class input. This is drawn
with a very thick boundary to emphasize its importance. What appeared to
be a 4-input mapping in textual form is actually a 1-input mapping, so what
appeared to challenge the scheduler with a 4-dimensional search is realizable
with a 1-dimensional loop.</p>
        <p>The graphical form therefore clearly shows, in blue, that we need to match
a c:Class in the loaded input model that has a Package at c.namespace and a
String at c.name. Additionally the cyan shows that execution is not possible until
p.middle provides p2s and p2s.schema provides s. Once these are all available, the
many objects and relationships shown in green can be produced. We therefore
have a very simple dependency rule; every cyan node or edge must be produced
by a corresponding green node or edge in another micro-mapping.
4.4</p>
      </sec>
      <sec id="sec-3-4">
        <title>Inter-micro-mapping scheduling</title>
        <p>In practice it is not quite so simple because we must account for multiple
producers and derived types. But in principle we can just join all the micro-mappings
up with dependencies to derive the schedule as shown in Figure 42.</p>
        <p>At the top, the root pseudo mapping is a producer of input model elements,
triaged as allInstances of interesting types. These are consumed by the 19
rectangles one for each merged micro-mapping of the transformation.</p>
        <p>Solid edges pass model elements to each consuming input. Simple connections
are drawn directly. Multiple producers or multiple consumers are drawn through
an intermediate ellipse which corresponds to a bu er that accumulates inputs
from many producers and then makes them available to many consumers.</p>
        <p>Further connections are shown using dashed edges and ellipses. These can
easily be recomputed by the consumer and so no connection needs to be rei ed,
however the schedule must ensure that the producer produces before the
consumer consumes. Thus for the ClassToTable example, a Class is passed via a
solid edge; the other three inputs are computed by navigation from the one
input. These computations fail until the dashed dependencies have been satis ed.
The many (75%) dashed elements shows a useful compile-time optimization.
2 One rectangle, two ellipses and associated lines have been removed so that the nodes
and edges are discernible; the lettering is not relevant.</p>
        <p>A sequential schedule is derived by allocating an integer slot index to each
micro-mapping in turn such that all its predecessors have smaller indexes. In
practice, a perfect static schedule can often be derived, but sometimes a recursion
may introduce a cyclic dependency that can only be resolved at run-time. For
this small proportion of micro-mappings additional run-time support is activated
so that a failed speculative execution blocks according to the failure and retries
only once the failure cause has been resolved. Few micro-mappings incur multiple
failures and so in practice the retry overhead is less than 50% and only where
the data relationships exceed the compile-time analysis capability.</p>
        <p>Three of the nineteen UML2RDBMS micro-mappings require two inputs.
This incurs a quadratic tooling algorithm ine ciency that shows up when the
performance is measured for input models containing associations. For just
packages, classes and properties, the execution scales linearly with Workload.
Examination of these three mappings reveals that there is a 1:N relationship between
a primary and a secondary part of the mapping. Future work includes an
optimization to treat the primary input as the one input and derive the secondary
input using a local loop. This local loop can iterate only over matches, whereas
the current external global loop iterates over many mismatches.
5</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Results</title>
      <p>As just described, the performance of the QVTc variant of UML2RDBMS scales
linearly when the input model contains just packages, classes and properties,
but, pending further work, goes quadratic once associations are added.</p>
      <p>In this section we plot the performance of the simple Familes2Persons
transformation that involves a two-way guarded decision around a copy3. The plot
demonstrates the scalability and the underlying tooling e ciency.4</p>
      <p>The top two lines of Figure 5 show the performance of the new interpreted
QVTc and contrasts it with the Eclipse QVTo. There is very little di erence,
except that QVTo has a much higher trace overhead and so fails to execute
beyond 2,500,000 elements; an unexpected advantage of the explicit QVTc trace.</p>
      <p>The two lines slightly below these show the old and new ATL VMs. Both fail
to get close to 10,000,000 elements in the 4GB 64 bit VM.</p>
      <p>The next two lines are about 30 times faster and contrast the new code
generated QVTc with the standard EcoreUtil copy functionality. QVTc is currently
about 20% slower and just fails to reach 10,000,000 elements.</p>
      <p>The nal line is a manually coded implementation of the copy that bypasses
the overheads of dynamic EMF in similar fashion to the code generated QVTc.
For large numbers of model elements, this is about twenty times faster indicating
that there is considerable scope for further improvement using the current
approach. Using a non-EMF model representation can give further improvements
and moving to C execution yet more.
3 Model overheads are reduced by Java code generated by an EMF genmodel.
4 The plots use no averaging. Single point wobbles may be due to concurrent activity.</p>
      <p>
        Multiple point wobbles may be due to fortuitous cache alignment.
The idea of generalizing the concepts of a Java Virtual Machine to a
Modeling Virtual Machine is attractive and has in uenced ATL [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. However while
its original representation had small byte-code-like concepts, they were far from
byte-sized. The newer EMFTVM is therefore able to do signi cantly better [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ],
but retains a transformation structure that supports transformation with OCL
expressions as an afterthought. EMFTVM is 80% slower than its EcoreUtil
`optimum'. In contrast the OCL VM uses the OCL Abstract Syntax as its `byte-code'
and supports extension to the QVTi AS so that all AS elements form part of a
single executable element hierarchy. The performance reported in the previous
section was 20% slower than EcoreUtil but the `optimum' target for further work
is ten times faster than EcoreUtil.
      </p>
      <p>
        Code generation has not been pursued by other transformation engines,
perhaps because OCL code generation is not easy. Super cially there are similarities
between OCL and Java and so a number of researchers have performed simple
text template mapping [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. This works sometimes, but fails to handle OCL in
general. In contrast the Eclipse OCL Java code generator converts the OCL AS
to an intermediate CG AS upon which a number of optimizations and rewrites
are performed before Java code is emitted. The code generator supports all OCL
constructs and its extensibility is demonstrated by its re-use for QVTi.
      </p>
      <p>
        The Graph Transformation community has been very active in providing a
rigorous foundation for graph mappings. Sadly the QVT speci cation ignored
this important work, preferring instead to de ne the semantics of the QVTr
transformation language using an incomplete exposition of a transformation of
QVTr written in an untested QVTr to another language (QVTc) that has at
best informal semantics. The utility and power of the graphical QVTs
representation may begin to bridge the gap between these two communities. The coloring
in QVTs is inspired by the use of colors to denote create/delete/no-change in
endogenous transformations. The rei cation of the QVTc traceability element
mirrors the evolution operators in UMLX [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] for heterogeneous transformations.
      </p>
      <p>
        Active Operations [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] also reify mappings to persist the state necessary for
incremental execution. Micro-mappings similarly support incremental execution,
but their primary rationale is to be a deadlock-free unit of computation.
7
      </p>
    </sec>
    <sec id="sec-5">
      <title>Conclusions</title>
      <p>We have introduced the rst implementation of the QVTc speci cation.</p>
      <p>We have introduced the rst direct code generator for model transformations.
We have shown that the direct code generator gives a thirty fold speed-up.</p>
      <p>We have shown how a graph presentation of metamodel and dependency
analyses tames the naive ine ciencies of a declarative schedule.</p>
      <p>We have reported rst results and mentioned some future works. Many more
optimizations to do, and of course, QVTr.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Biermann</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ermel</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schmidt</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Warning</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Visual Modeling of Controlled EMF Model Transformation using HENSHIN</article-title>
          <source>Proceedings of the Fourth International Workshop on Graph-Based Tools</source>
          , GraBaTs
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Eclipse</surname>
            <given-names>ATL</given-names>
          </string-name>
          Project. https://projects.eclipse.org/projects/modeling.mmt.atl
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Eclipse</surname>
            <given-names>EMF</given-names>
          </string-name>
          Project. https://projects.eclipse.org/projects/modeling.emf.emf
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Eclipse</surname>
            <given-names>OCL</given-names>
          </string-name>
          Project. https://projects.eclipse.org/projects/modeling.mdt.ocl
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Eclipse</surname>
            <given-names>QVT</given-names>
          </string-name>
          <article-title>Declarative Project</article-title>
          . https://projects.eclipse.org/projects/modeling.mmt.qvtd
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>6. EMFTVM performance. https://wiki.eclipse.org/ATL/EMFTVM#Performance</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Jouault</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Beaudoux</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          :
          <article-title>On the Use of Active Operations for Incremental Bidirectional Evaluation of OCL 15th</article-title>
          <source>International Workshop on OCL and Textual Modeling</source>
          , Ottawa,
          <year>2015</year>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Wilke</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Java Code Generation for Dresden OCL2 for Eclipse</article-title>
          ,
          <source>February</source>
          <volume>19</volume>
          ,
          <year>2009</year>
          . http://claaswilke.de/publications/study/beleg.pdf
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Willink</surname>
            ,
            <given-names>E:</given-names>
          </string-name>
          <article-title>UMLX : A Graphical Transformation Language for MDA Model Driven Architecture: Foundations and Applications</article-title>
          ,
          <string-name>
            <surname>MDAFA</surname>
          </string-name>
          <year>2003</year>
          , Twente,
          <year>June 2003</year>
          . http://eclipse.org/gmt/umlx/doc/MDAFA2003-4/
          <fpage>MDAFA2003</fpage>
          -4.pdf
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Willink</surname>
          </string-name>
          , E.:
          <article-title>Yet Another Three QVT Languages</article-title>
          .
          <source>ICMT</source>
          <year>2013</year>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11. OMG.
          <source>Meta Object Facility (MOF) 2</source>
          .0 Query/View/Transformation Speci cation,
          <source>Version</source>
          <volume>1</volume>
          .3 Beta. OMG Document Number: ptc/15-10-02,
          <year>March 2016</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>