<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Existing and New Ideas on Least Change Triple Graph Grammars</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Milica Stojkovic</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sven Laux</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anthony Anjorin</string-name>
          <email>aanjoring@mail.uni-paderborn.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Paderborn University</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2017</year>
      </pub-date>
      <volume>903</volume>
      <abstract>
        <p>At least two actively developed model synchronization frameworks employ a conceptually similar algorithm based on Triple Graph Grammars as an underlying formalism. Although this algorithm exhibits acceptable behavior for many use cases, there are still scenarios in which it is sub-optimal, especially regarding the "least change" criterion, i.e., the extent to which models are changed to restore consistency. In this paper, we demonstrate this with a minimal example, discuss various suggestions from literature, and propose a novel approach that can be used to address this shortcoming in the future.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>Folder
name: String
structure and package hierarchy. The same attribute condition f.name == p.name is used in R1 and R2 to
require identical names of corresponding Folders and Packages. The model triple depicted in Fig. 2 is said to
be consistent with respect to the TGG consisting of rules R1 and R2, because it can be created by applying R1
to the empty triple graph, then R2, and R2 again as required.</p>
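The consistency notion described above can be illustrated with a minimal sketch in Python. All identifiers (the `Node` class, the concrete folder/package names) are assumptions for illustration, not part of the paper's formalism; only the attribute condition f.name == p.name is taken from the rules.

```python
# Minimal sketch of the running example's model triple: folders (source),
# packages (target), and corrs linking them. The R1/R2 attribute condition
# requires that corresponding elements carry identical names.

class Node:
    def __init__(self, name):
        self.name = name

# Source model: folder hierarchy f1 -> f2 -> f3
f1, f2, f3 = Node("root"), Node("util"), Node("io")
folder_edges = [(f1, f2), (f2, f3)]

# Target model: package hierarchy p1 -> p2 -> p3
p1, p2, p3 = Node("root"), Node("util"), Node("io")
package_edges = [(p1, p2), (p2, p3)]

# Correspondence model: one corr per folder/package pair
corrs = [(f1, p1), (f2, p2), (f3, p3)]

def attribute_condition_holds(corrs):
    """Check the R1/R2 condition f.name == p.name for every corr."""
    return all(f.name == p.name for f, p in corrs)

print(attribute_condition_holds(corrs))  # True
```

The triple is consistent in the sketch's weak sense (names match along every corr); the paper's full notion additionally requires derivability via R1 and R2.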
      <p>
        Assume the folder structure is now updated by deleting f2 and reconnecting f1 directly with f3. To restore
consistency, this change must be forward propagated to the package hierarchy. According to Hermann et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ],
forward synchronization is performed by applying three auxiliary functions: Forward alignment (fAln), then
deletion (Del) and finally forward addition (fAdd) (backward synchronization is analogous).
      </p>
      <p>The task of fAln is simply to delete all "dangling" correspondence elements (referred to as "corrs"), i.e., all
corrs that were previously connected to deleted source elements. As we deleted the folder f2, the corr connecting
it to p2 will be dangling and must be deleted as well. This situation is depicted in Fig. 4, showing the model
triple after the source update but with the corr to be deleted highlighted in red (grey in a b/w printout).</p>
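The behavior of fAln can be sketched directly, assuming corrs are represented as (source, target) pairs (a representational assumption, not the paper's notation):

```python
# Sketch of forward alignment (fAln): drop every corr whose source
# element has been deleted by the update.

def f_aln(corrs, deleted_sources):
    """Remove 'dangling' corrs, i.e., corrs referring to deleted source elements."""
    return [(src, tgt) for (src, tgt) in corrs if src not in deleted_sources]

corrs = [("f1", "p1"), ("f2", "p2"), ("f3", "p3")]
aligned = f_aln(corrs, deleted_sources={"f2"})
print(aligned)  # [('f1', 'p1'), ('f3', 'p3')]
```

After fAln, only the corr for f2 is gone; the target element p2 it pointed to still exists, which is exactly the situation Fig. 4 depicts.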
      <p>The task of Del is to determine a maximal consistent sub-triple of the current state after applying fAln. To
compute this maximal consistent sub-triple, Del attempts to mark the entire triple graph by applying suitable
TGG rules. This marking process is visualized in Fig. 5 using checkboxes to represent the (un)marked state
of all nodes and the edges that connect the individual triples accordingly (edges can be uniquely inferred for
the example). The root elements f1, p1, and their connecting corr can be successfully marked with rule R1
(successfully applied rules are denoted in square brackets). The remaining elements cannot be marked with any
rule, however, and are left unmarked. When the marking process terminates, Del deletes all unmarked elements
(highlighted in Fig. 5 in red/grey) in the correspondence and target models.</p>
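Del's marking phase can be sketched as a fixpoint computation. The encoding below is a simplification under stated assumptions: triples are a folder-to-package map, R1 marks the root triple, and R2 (the extension rule) can only mark a child triple when the parent triple is marked and matching parent-to-child edges exist in both the source and the target model.

```python
# Sketch of Del's marking phase for the running example.

def del_marking(corrs, source_edges, target_edges, root):
    """Return the set of folders whose triples can be marked.
    R1 marks the root triple; R2 marks a child triple only if the parent
    triple is already marked AND matching parent->child edges exist in
    BOTH models (the extension rule requires both)."""
    marked = {root}
    changed = True
    while changed:
        changed = False
        for f_par, f_child in source_edges:
            p_par, p_child = corrs.get(f_par), corrs.get(f_child)
            if (f_par in marked and f_child in corrs and f_child not in marked
                    and p_par is not None and p_child is not None
                    and (p_par, p_child) in target_edges):
                marked.add(f_child)  # R2 applies
                changed = True
    return marked

corrs = {"f1": "p1", "f3": "p3"}              # after fAln removed (f2, p2)
source_edges = {("f1", "f3")}                  # update: f1 reconnected to f3
target_edges = {("p1", "p2"), ("p2", "p3")}    # target not yet updated
print(del_marking(corrs, source_edges, target_edges, "f1"))  # {'f1'}
```

Only the root triple is markable: the source now has the edge f1→f3, but the target still lacks p1→p3, so R2 never fires and everything below the root is deleted, matching Fig. 5.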
      <p>In a final step, the third function fAdd applies TGG rules to transform all unmarked elements in the source
model, extending the correspondence and target models with new elements in the process. The final consistent
state after applying fAdd is depicted in Fig. 6, with the newly created elements highlighted in green (grey).</p>
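The effect of fAdd can be sketched as follows; the naming scheme for freshly created packages is purely illustrative:

```python
# Sketch of forward addition (fAdd): translate every unmarked folder by
# creating a fresh package plus a corr, then mirror the source edges.

def f_add(unmarked_folders, corrs, source_edges):
    """Extend the correspondence and target models for unmarked folders."""
    for f in unmarked_folders:
        corrs[f] = f.replace("f", "p") + "'"   # fresh package (illustrative name)
    target_edges = {(corrs[a], corrs[b]) for (a, b) in source_edges}
    return corrs, target_edges

corrs = {"f1": "p1"}                           # only the root survived Del
corrs, tedges = f_add(["f3"], corrs, {("f1", "f3")})
print(corrs)   # {'f1': 'p1', 'f3': "p3'"}
print(tedges)  # {('p1', "p3'")}
```

Note that p3 is re-created from scratch as a new element (p3' here), which is precisely where attribute values and contents not derivable from the source are lost.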
      <p>
        While the algorithm is correct, complete, stable, and invertible (cf. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] for all details), our simple example
shows clearly that it makes a rather rough approximation of the elements that must be deleted (and recreated) to
ensure consistency and thus performs unnecessary changes. For a model triple with a very deep folder structure
and corresponding package hierarchy, removing the second folder as we did would result in the deletion and
identical re-creation of all sub-packages. For a realistic example, this will incur substantial information loss, as
the sub-packages will probably have attribute values and other contents such as files with manually written code,
which cannot be re-generated from the source model. Even without rigorously formalizing least change, this is
clearly sub-optimal and most probably unwanted synchronization behavior.
      </p>
      <p>
        <bold>Existing Ideas on How to Improve Least Change Behavior for TGGs</bold>
An approach to improve the TGG algorithm is suggested in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. The idea is not to delete elements right away
with Del, but rather to mark them for deletion and try to reuse them later with fAdd instead of re-creating
them. This breaks the clear separation of fAln, Del and fAdd, allowing the functions to work together during
the synchronization process. Applying this idea to our example, the improved algorithm would mark the target
element p2 and its connected corr for deletion without actually deleting any of those elements (resulting in a
"lazy" fAln+Del). During translation, before actually (re-)creating any new elements, fAdd can now first check
in the "deletion pool" if a suitable element can be recycled. In this manner, the package p3 and its corr to the
folder f3 would only be marked for deletion in Fig. 5 and then re-used (instead of being re-created as before).
The process is finalized by deleting all elements remaining in the deletion pool, in this case the package p2 and
its dangling corr. While this can potentially avoid information loss by recycling elements, choosing the "right"
elements to re-use makes the whole process highly non-deterministic and could either overwhelm end-users if this
freedom of choice is left open, or surprise them if the decision is internalized and automated.
      </p>
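The "lazy" reuse idea from [2] can be sketched as below. The pool representation and the suitability predicate are assumptions for illustration; how to choose the "right" element is exactly the open, non-deterministic part.

```python
# Sketch of element reuse via a deletion pool: before creating a fresh
# target element, fAdd first tries to recycle a pooled one.

def translate_with_reuse(folder, deletion_pool, is_suitable):
    """Return a package for `folder`, recycling from the pool when possible."""
    for pkg in deletion_pool:
        if is_suitable(folder, pkg):
            deletion_pool.remove(pkg)
            return pkg, True          # recycled: attributes/contents preserved
    return "fresh_pkg_for_" + folder, False

pool = ["p2", "p3"]
# illustrative suitability check: matching trailing index
pkg, reused = translate_with_reuse("f3", pool, lambda f, p: f[-1] == p[-1])
print(pkg, reused)   # p3 True
print(pool)          # ['p2']  -> whatever remains is deleted at the end
```

In the running example p3 is recycled with all its contents intact, while p2 stays in the pool and is finally deleted.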
      <p>
        A different approach proposed by Hildebrandt [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] is to specify or automatically derive additional "fix rules"
for the sole purpose of restoring consistency locally. This idea can be applied to the algorithm by Hermann et
al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] by replacing fAln with a function fix that attempts to restore consistency locally. Such fix strategies
can potentially enlarge the maximal consistent sub-triple determined subsequently by Del. In our example,
such a local fix would be to mirror the update in the target model by deleting p2 (with its dangling corr) and
reconnecting p1 with p3. In this case, Del would be able to mark the entire triple, avoiding unnecessary deletion
and re-creation. It is, however, unclear how to derive these fix rules automatically from arbitrarily complex
TGGs, or if they are manually specified, how to guarantee that they do not contradict the original TGG.
      </p>
      <p>
        Finally, practical guidelines for developing a TGG are presented by Anjorin et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], including a "best
practice" guideline concerning least change. Inspecting rule R2 in Fig. 3, we can see that sub-packages depend
on their parent package as context. This means that the existence of the parent package is a precondition for the
existence of the sub-package. For our example, however, this context dependency is not absolutely necessary;
Anjorin et al. explain that dependencies like this should be avoided by splitting such rules into an "island rule"
that only creates elements (a folder and corresponding package) and another "bridge rule" that adds the required
edges (connecting the folder and package to their parents). Applying this guideline to our example replaces the
"extension rule" R2 in Fig. 3 with the bridge R2' depicted in Fig. 7. According to the least change guideline [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ],
bridges are better than extensions as they reflect the actual context dependencies more accurately. With this
"improved" TGG, synchronization is performed as follows: fAln behaves the same as in Fig. 4; when computing
the maximal consistent sub-triple, however, Del can now mark the elements below the deleted folder f2 using
rule R1 (Fig. 8). This means that only the package p2 and its incident edges are deleted. After deleting these
elements, fAdd can now translate the unmarked edge connecting f1 and f3 with the bridge rule R2' (Fig. 9).
      </p>
      <p>Although this works for our example, there are two issues: Firstly, replacing extension with bridge (and island)
rules can change the language generated by the TGG, resulting in synchronizers that now accept and propagate
previously invalid updates to produce results that are inconsistent with respect to the original TGG. As an
example, note that it becomes possible to create arbitrary graphs of folders and packages using the bridge rule
R2', while the original extension rule R2 could only generate trees. In this case, suitable application conditions
for R2' could be used to solve the problem, but this requires challenging, manual effort with no guarantees that
both TGGs are equivalent. Secondly, although the refactored TGG might be more suitable for synchronization,
TGGs are used for other consistency management tasks where this change might have a negative impact, e.g.,
consistency checks or model generation. The TGG developer might also prefer a certain style of specification,
e.g., for readability reasons, and should not be forced to adhere to a certain structure for rather technical reasons.</p>
      <p><bold>Combining Optimization Techniques and TGGs to Improve Least Change</bold>
Our idea stems mainly from the observation that Del establishes a maximal consistent sub-triple, which must
be in the language generated by the TGG. For large TGG rules (e.g., extension instead of island and bridge
rules), this leads to coarse, rigid steps in the synchronization process. The algorithm can thus be improved by
automatically decomposing a given TGG into a new, more flexible TGG consisting of rules that are as small
as possible. While this enables taking smaller, more precise steps in the synchronization process, appropriate
constraints ranging over valid rule application sequences must be automatically generated to ensure that the
final result is again in the language of the original TGG. To demonstrate why such constraints are required, we
now discuss the forward propagation of an inconsistent update: instead of deleting the folder f2, let us assume
that our file system allows us to add a symbolic link between folder f1 and folder f3 as depicted in Fig. 10, but
that a corresponding modification is not allowed in the language of packages.</p>
      <p>A tool based on our original TGG would refuse to forward propagate this update, as there is no rule that can
create edges without creating the corresponding sub-folders and sub-packages. Using the decomposed bridge rule
R2', however, the invalid change would be propagated to the target model, resulting either in an inconsistent
triple or, perhaps worse, in a runtime exception as the target model cannot be manipulated in this manner.
In order to correctly detect and reject this inconsistent update, we will use the following four steps:
1. Decompose the original TGG rules to derive a new TGG and corresponding constraint derivation logic.
2. Start the synchronization process using the decomposed rules and collect all possible rule applications.
3. Generate specific constraints based on the constraint derivation logic from Step 1.
4. Solve the constraint problem to make a guaranteed consistent choice of rule applications from Step 2.</p>
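Step 1 can be sketched with a toy rule representation (the dictionary encoding and the element names beyond sf and sp are assumptions for illustration):

```python
# Illustrative sketch of decomposing an extension rule into an island rule
# (creates nodes only) and a bridge rule (creates edges only), while
# recording the pairing obligation used later as constraint derivation logic.

extension_R2 = {
    "creates_nodes": ["sf", "sp", "corr_sf_sp"],
    "creates_edges": [("f", "sf"), ("p", "sp")],
    "context": ["f", "p", "corr_f_p"],
}

def decompose(rule):
    """Split `rule` into island + bridge and record the fusion obligation."""
    island = {"creates_nodes": rule["creates_nodes"],
              "creates_edges": [], "context": []}
    bridge = {"creates_nodes": [],
              "creates_edges": rule["creates_edges"],
              # the bridge needs the island's creations as context:
              "context": rule["context"] + rule["creates_nodes"]}
    # constraint derivation logic: every bridge application must be paired
    # with exactly one island application to be fusable back into `rule`
    obligation = ("island", "bridge")
    return island, bridge, obligation

island, bridge, obligation = decompose(extension_R2)
print(island["creates_edges"], bridge["creates_nodes"])  # [] []
```

The recorded obligation is what later yields the context and exclusivity constraints of Step 3.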
      <p>The first step is to decompose complex rules into smaller ones. Extension rules can, for example, be split into
island and bridge rules, which can themselves be decomposed further if they create multiple elements. For our
example, extension rule R2 can be decomposed to produce the bridge rule R2' depicted in Fig. 7. To ensure that
the language of the TGG is not changed by this decomposition, constraints demanding that every application
of R2' be exclusively paired with a suitable application of R1 (creating sf and sp) must be derived during
the synchronization process to ensure that a decomposed sub-derivation ⇒[R1] ⇒[R2'] can be fused to an equivalent
derivation ⇒[R2] with the original TGG rule. This constraint derivation logic is produced during the decomposition
and stored for later use during synchronization.</p>
      <p>The next step concerns the synchronization process: Once the change has been applied, the decomposed
language is used to collect all existing and potential rule applications. As depicted in Fig. 10, we can annotate
each existing rule application by introducing corresponding variables a1 to a5. Furthermore, we also annotate
the potential rule application a6. Note that instead of performing any propagation directly, we only collect all
existing and potential rule applications before considering additional constraints to guide our final choice.</p>
      <p>Constraints are now generated based on the constraint derivation logic obtained from the decomposition
in Step 1. For our example, we have to make sure that every time R2' is used to mark elements, there are
complementary applications of R1 marking the required context elements. To accomplish this there are two
constraints that have to be taken into account: (1) context rule application constraints ensure that context
requirements are fulfilled (e.g., a4 depends on a1 and a2), while (2) exclusive rule application constraints ensure
that the elements are marked only once. This ensures here that only trees and not graphs can be created. In
our example, both a5 and the potential rule application a6 depend on a3, but our constraint derivation logic
demands that a3 can only be combined with one of them, hence the derived constraint a5 XOR a6.</p>
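The constraints for the example can be written down as propositional formulas over booleans a1 to a6 (variable names from Fig. 10; the exact constraint set is our reading of the example, not a general derivation):

```python
# Sketch: the example's rule-application constraints as boolean formulas,
# checked by exhaustive enumeration of all assignments.

from itertools import product

def context_ok(a):
    # (1) context constraints: an application may only be chosen if the
    # applications providing its context elements are chosen too
    return (not a["a4"] or (a["a1"] and a["a2"])) and \
           (not a["a5"] or a["a3"]) and \
           (not a["a6"] or a["a3"])

def exclusive_ok(a):
    # (2) exclusivity: a3 can pair with only one of a5/a6 -> a5 XOR a6
    return a["a5"] != a["a6"]

vars_ = ["a1", "a2", "a3", "a4", "a5", "a6"]
solutions = [dict(zip(vars_, bits))
             for bits in product([False, True], repeat=len(vars_))]
valid = [a for a in solutions if context_ok(a) and exclusive_ok(a)]

# no valid assignment chooses every rule application:
print(any(all(a.values()) for a in valid))  # False
```

As expected, no assignment satisfies all constraints with every application chosen: a5 and a6 can never both hold.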
      <p>In a final step, consistency is ensured by solving an optimization problem over the set of constraints generated
in Step 3. To influence this step, different objective functions can be specified depending on the desired outcome.
For our example it would be reasonable to enforce least change by maximizing the number of marked elements,
but in other scenarios one could prefer most recent changes, or ask the user to decide. For the situation depicted
in Fig. 10, there exists no solution for which all elements are marked. Possible optimal solutions include rejecting
the update (analogously to synchronization with the original TGG), and suggesting that a5 be revoked instead.</p>
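Step 4 with the least-change objective can be sketched by brute force over the same constraints; a real implementation would hand this to a SAT or ILP solver instead (the constraint formulas are our reading of the example, as above):

```python
# Sketch of Step 4: among all valid assignments, pick one maximizing the
# number of chosen rule applications (a least-change objective).

from itertools import product

def is_valid(a):
    ctx = (not a["a4"] or (a["a1"] and a["a2"])) and \
          (not a["a5"] or a["a3"]) and \
          (not a["a6"] or a["a3"])
    return ctx and (a["a5"] != a["a6"])   # context + a5 XOR a6

vars_ = ["a1", "a2", "a3", "a4", "a5", "a6"]
best = max((dict(zip(vars_, bits))
            for bits in product([False, True], repeat=len(vars_))
            if is_valid(dict(zip(vars_, bits)))),
           key=lambda a: sum(a.values()))
print(sum(best.values()))  # 5
```

At most five of the six applications can be chosen: either a6 is rejected (the update is refused, as with the original TGG) or a5 is revoked in its favor.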
      <p>
        The underlying idea of combining standard optimization techniques (e.g., SAT or ILP solvers) with TGGs
has already been proposed by Leblebici et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], and promising first results for the task of consistency checking
have been obtained. Our suggestion for future work is to apply this technique to the task of synchronization as
a means of improving the least change related quality of current TGG-based synchronizers.
      </p>
      <p>Our suggestion generalizes and automates the island/bridge guideline, solving the two current problems as
follows: Appropriate constraints are automatically derived to guarantee that the final result is again consistent
with respect to the original TGG, thus allowing intermediate, potentially inconsistent states and the flexibility
for least change optimization. The TGG specified by the user remains unchanged, as the optimal form for
synchronization is derived automatically as part of the compilation process.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Anthony</given-names>
            <surname>Anjorin</surname>
          </string-name>
          , Erhan Leblebici, Roland Kluge, Andy Schürr, and Perdita Stevens.
          <article-title>A Systematic Approach and Guidelines to Developing a Triple Graph Grammar</article-title>
          . In Alcino Cunha and Ekkart Kindler, editors,
          <source>BX</source>
          <year>2015</year>
          , volume
          <volume>1396</volume>
          , pages
          <fpage>81</fpage>
          –
          <lpage>95</lpage>
          . CEUR-WS.org,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Joel</given-names>
            <surname>Greenyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Sebastian</given-names>
            <surname>Pook</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Jan</given-names>
            <surname>Rieke</surname>
          </string-name>
          .
          <article-title>Preventing Information Loss in Incremental Model Synchronization by Reusing Elements</article-title>
          . In Robert B. France,
          <string-name>
            <surname>Jochen M. Kuester</surname>
          </string-name>
          , Behzad Bordbar, and Richard F. Paige, editors,
          <source>ECMFA</source>
          <year>2011</year>
          , volume
          <volume>6698</volume>
          , pages
          <fpage>144</fpage>
          –
          <lpage>159</lpage>
          . Springer,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Frank</given-names>
            <surname>Hermann</surname>
          </string-name>
          , Hartmut Ehrig, Fernando Orejas, Krzysztof Czarnecki, Zinovy Diskin, Yingfei Xiong, Susann Gottmann, and Thomas Engel.
          <article-title>Model Synchronization Based on Triple Graph Grammars: Correctness, Completeness and Invertibility</article-title>
          .
          <source>Software and Systems Modeling</source>
          ,
          <volume>14</volume>
          (
          <issue>1</issue>
          ):
          <fpage>241</fpage>
          –
          <lpage>269</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Stephan</given-names>
            <surname>Hildebrandt</surname>
          </string-name>
          .
          <article-title>On the Performance and Conformance of Triple Graph Grammar Implementations</article-title>
          .
          <source>PhD thesis</source>
          , University of Potsdam,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>James</given-names>
            <surname>Cheney</surname>
          </string-name>
          , Jeremy Gibbons,
          <string-name>
            <given-names>James</given-names>
            <surname>McKinna</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Perdita</given-names>
            <surname>Stevens</surname>
          </string-name>
          .
          <article-title>Towards a Principle of Least Surprise for Bidirectional Transformations</article-title>
          . In Alcino Cunha and Ekkart Kindler, editors,
          <source>BX</source>
          <year>2015</year>
          , volume
          <volume>1396</volume>
          , pages
          <fpage>66</fpage>
          –
          <lpage>80</lpage>
          . CEUR-WS.org,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Erhan</given-names>
            <surname>Leblebici</surname>
          </string-name>
          .
          <article-title>Towards a Graph Grammar-Based Approach to Inter-Model Consistency Checks with Traceability Support</article-title>
          . In Anthony Anjorin and Jeremy Gibbons, editors,
          <source>BX</source>
          <year>2016</year>
          , volume
          <volume>1571</volume>
          , pages
          <fpage>35</fpage>
          –
          <lpage>39</lpage>
          . CEUR-WS.org,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>