<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Pending Evolution of Grammars</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Software Analysis &amp; Transformation Team, Centrum Wiskunde &amp; Informatica</institution>
          ,
          <addr-line>Amsterdam</addr-line>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
      </contrib-group>
      <fpage>2</fpage>
      <lpage>9</lpage>
      <abstract>
        <p>The classic approach to grammar manipulation is based on instant processing of grammar edits, which limits the kinds of grammar evolution scenarios that can be expressed with it. Treating transformation preconditions as guards poses limitations on concurrent changes of the same grammar, on reuse of evolution scripts, on expressing optionally executed steps, on batch processing and optimization of them, etc. We propose an alternative paradigm of evolution, where a transformation can be scheduled for later execution based on its precondition. This kind of extreme evolution can be useful for expressing scenarios that are impossible to fully automate within the classic or the negotiated transformation paradigms.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>A colloquial expression `consider it done' means that the subject of the
conversation is either indeed already done, or will be done in the very near future |
in either case, the receiver of such a message can rest assured that the subject
will take place if it has not already, and is expected to act as if it has indeed
happened. The technique of pending evolution that we introduce in this paper,
is similar to that expression, and the bene ts of it are not unlike the subtle
di erences between considering something done and it having been done.</p>
      <p>As it turns out, the pending evolution scheme allows us to e ciently model
scenarios of grammar evolution, deployment and maintenance that are
impossible to model within the traditional grammar transformation paradigm, which is
brie y explained in x2. The method is introduced in x3. Since the most pro ts
hide deep in the details, we spend the rest of the paper (x4) by motivating the
use of pending evolution for grammars instead of classical evolution scripts, by
concrete examples. x5 concludes the paper by summarizing its contributions and
discussing related work.
GrammarLab is a codename for a grammar manipulation project that is
currently being migrated from the Software Language Processing Suite1 initiative
to its own repository2. It is centered around the concept of a grammar in a
broad sense [5], which can be extracted by abstracting away the idiosyncratic
details that we see on class diagrams, in algebraic data type de nitions,
object grammars, concrete syntax speci cations, database schemata and exposed
library interfaces | all these are `grammars in a broad sense', since they model
commitment to grammatical structure.</p>
      <p>Beside extraction, GrammarLab is good in dealing with programmable
grammar transformations | a disciplined method of grammar evolution, where every
change is expressed as a call to a transformation operator with a well-de ned
semantics; as well as grammar mutations | large scale strategies for changing
one simple thing in a priori unknown number of places. GrammarLab also
includes library for grammar analysis and metrics, but they are less relevant for
this paper.
3</p>
    </sec>
    <sec id="sec-2">
      <title>Pending evolution</title>
      <p>In GrammarLab, the evolution of a grammar is speci ed by a sequence of steps,
each referring to a transformation operator or a mutation, with proper
parameterization: e.g., rst we rename a nonterminal, then we factor its de nition
and then extract a part of it into a new nonterminal. Each of the possible
transformation operators and mutations in the library, have preconditions that
determine their applicability and postconditions that demonstrate their
successful execution. Whenever a postcondition of a step or a precondition of the next
step fails, the transformation sequence is interrupted and an error occurs instead.</p>
      <p>When a negotiated transformation paradigm [13] is explored, failure of a
preor a postcondition means the start of a negotiation: e.g., if a rename fails, the
engine can propose alternative name pairs that would enable its execution. A
clever strategy for negotiations can drastically increase applicability and reuse
of a transformation script, while still allow for full automation.</p>
      <p>
        The pending evolution paradigm that we propose here, can hold any
transformation step pending until its precondition becomes enabled. As will become
apparent from the following examples in x4, it is possible to: (
        <xref ref-type="bibr" rid="ref1">1</xref>
        ) push the pending
steps all the way to the end of the evolution sequence and then disregard them;
or (
        <xref ref-type="bibr" rid="ref2">2</xref>
        ) leave them forever pending and always ready to be applied any number
of times necessary; or (
        <xref ref-type="bibr" rid="ref3">3</xref>
        ) relax the constraints about the order of steps; or (
        <xref ref-type="bibr" rid="ref4">4</xref>
        )
collect and log the information about all the possibly non-sequential failures in a
system; or even intentionally decide that particular steps must be taken by defer
their actual execution until later. Only the simplest local cases can be expressed
in terms of negotiations. On the other hand, only simplest negotiations can be
expressed as pending changes. In short, negotiated transformations enables
exibility with the outcome of one step, while pending evolution enables exibility
with the order of multiple steps.
2 V. Zaytsev. GrammarLab, 2013. http://grammarware.github.io/lab.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Scenarios</title>
      <p>The paradigm of pending evolution probably has much wider applicability, but
here we sketch at least four user stories for it, inspired by the problems in the
grammarware technological space that can be addressed and solved there.
4.1</p>
      <sec id="sec-3-1">
        <title>Optional execution</title>
        <p>In the classic grammar transformation engine of GrammarLab, any grammar
transformation step that changes nothing in the grammar (we call them `vacuous
transformations'), is considered erroneous, since in most reasonable contexts
| correction, adaptation, evolution, etc | a change that changes nothing, is
meaningless. However, with some negotiated transformation schemes [13], one
could nd it sensible to ignore the fact that a transformation step brought no
actual changes, if considered in a broader context. In particular, consider the
following scenarios:
{ Suppose we have a repository of grammars, such as the Grammar Zoo3 [15].</p>
        <p>The repository is highly heterogeneous and contains `grammars in a broad
sense' extracted from parser speci cations, compiler sources, readable
documentation, privately created webpages, community contributed wikis,
generated and manually built artifacts. However, one of the steps known from
grammar research [8] to increase the quality of a grammar, is resolving all
`bottom' nonterminals | the ones that are used within the grammar but
never de ned (a grammar with no bottom nonterminals is called a `level 3
grammar' by Lammel and Verhoef in [8]). While some de nitions are simply
missing from the grammar due to development mistakes, quite commonly
these are lexical, or character-level, de nitions, containing the rules about
how an identi er name or a numeric literal should look in a language being
de ned. A big fraction of these, as becomes apparent after mining Grammar
Zoo, have meaningful names such as `string', `identi er', `integer', `id', etc,
and can be matched to a small library of prede ned production rules such as
`a string is a symmetrically quoted sequence of one or more characters' or `an
integer value is an optional sign followed by one or more digits, the rst of
which is not zero'. This can be automated and ran over the whole repository,
which can of such substantial size that prevents its manual veri cation4 |
however, it would be desirable for the framework to introduce the missing
de nitions only if they are truly missing, and allow individual grammars to
retain their speci c views of what a string or a boolean looks like. Hence, we
allow the introduce operator5 to be left pending, and disregard it at the
end of the transformation application.
3 Grammar Zoo, http://slps.github.io/zoo
4 Grammar Zoo contains 569 grammars at the day of paper submission.
5 Introduce and other grammar transformation operators are documented at http:
//github.com/grammarware/slps/wiki/introduce and similar URIs.
{ Consider another scenario. In grammarware technological space, there are
two most common styles of production rules, that we will traditionally call
horizontal and vertical. A horizontal de nition says that the nonterminal N is
`either X or Y or Z', while the vertical one makes three statements that `N is
X', as well as `N is Y', as well as `N is Z'. These can be formally proven to be
equivalent. There are many exceptions, but most language documents prefer
horizontal de nitions (e.g., Java Language Speci cation [3]), while language
workbenches tend toward vertical ones (e.g., The Meta-Environment [4]) or
make no distinction between them (e.g., Rascal [6]). Some transformation
operators also expect their arguments to be either horizontal or vertical,
which leads to the evolution scenarios speci ed in such a way where some
of the operator calls are preceded by the calls of horizontal or vertical
operators, while others are not. Obviously, this excessive versatility hinders
maintainability and changeability of the transformations. It would be better
to write these transformation steps as assertions. For instance, we can specify
that the de nition must by vertical before we deyaccify it, and this step
would be optional, requiring no action if the original de nition is already
vertical.
4.2</p>
      </sec>
      <sec id="sec-3-2">
        <title>Error handling</title>
        <p>In GrammarLab, transformations are stopped whenever an error occurs, and an
error message is displayed. Within the negotiated paradigm [13], it is possible to
negotiate for another outcome. One of the rather complex strategies for achieving
that, is the one that skips over the failing transformation step and proceeds with
the rest of the script, and then displays all the error messages at the end of
the computation. To demonstrate the usefulness of this approach, consider the
following detailed scenarios.</p>
        <p>{ In the context of grammar recovery, suppose that we want to extract several
grammars in bulk | they are written in the same style, in the same
metalanguage, with some a priori unknown di erences between them (perhaps they
are di erent versions or dialects of the same language). After carefully
considering one of them, a grammar engineer develops a post-extraction
transformation script that makes the grammar maximally connected, adds missing
de nitions, xes misspelt nonterminal names and corrects other problems.
Naturally, we want to reuse the same transformation script for recovering
the rest of these grammars. However, in the traditional setup, most of the
automated reuse cases will fail because some of the extracted grammars will
have some misspellings already xed, others will lack the part that concerns
the xes, etc. Advanced error handling (or ignoring) can help greatly with
scalability in this case, by skipping over inapplicable xes, applicability
classi cation, etc.
{ Imagine another scenario concerning maintenance of grammar
transformation scripts. Suppose that we have several grammars that are being converged
together in multiple steps | e.g., the case study converging six Java
grammars found in di erent editions of the Java Language Speci cation book,
consisted of 1611 transformation steps arranged in 70 di erent scripts [10].
When an error is spotted in one of the existing steps, or when another step
needs to be added in the middle of the transformation chain, or when the
order of existing steps needs to be adjusted, it becomes a very labor-intensive
task since every failure stops the transformation computation | having the
luxury of recovering after a failure noticeably increases debugging
capabilities.
4.3</p>
      </sec>
      <sec id="sec-3-3">
        <title>Pending xes</title>
        <p>Both the traditional programmed and the negotiated transformation models
delegate the decision about the transformation order to the original script: any
transformation step takes place after the one that precedes it in the speci cation
and is followed by the one after it. However, there are situations when we can
develop certain transformation scenarios and leave them pending so that they
can be executed when (if) the times comes and they become applicable. In
general this is useful in case of preserving any kind of normal form properties,
but we provide two detailed cases taken from practice:
{ Recall the di erence between horizontal and vertical de nitions that we have
explained in the previous section. Suppose that our grammar uses vertical
de nitions exclusively | this is easy to achieve by grammar transformations
or mutations, and easy to validate with a metaprogramming formula or by
micropatterns. However, if we would like to specify that the dominance of
vertical de nitions is not incidental and that we would like to preserve it, it
is not possible to express this constraint within the straightforward grammar
programming approach. With pending evolution, we could leave the
verticalizing mutation pending. Then, if someone introduces a new nonterminal to
the grammar, and that nonterminal is found to be horizontal, the mutation
becomes enabled, is executed and recharged for next use.
{ Grammar recovery is a process of extracting a grammar from an existing
software artifact that may not be of perfect quality. Automated grammar
recovery methodology [14] is based on a collection of heuristics that are
partly con gurable and partly inferred from the notation speci cation. One
of such heuristics is splitting composite terminals: for instance, if a
terminal like `);' is found, it can be broken into two consecutive terminals: `)'
and `;' | simply because the resulting atomic terminals are more helpful
for other heuristics (like matching parentheses). A grammar mutation that
breaks up composite terminals, can be programmed and left pending, such
that under any circumstances that would bring such terminals to the
grammar (such as importing another grammar, introducing a new nonterminal
de nition, folding/unfolding, projecting symbols, etc), it becomes enabled
and is immediately red to split such terminals as desired.</p>
        <p>These scenarios are su ciently di erent from the ones in the previous
section not only in motivation, but also in realization, since we speak of pending
mutations (which are large scale transformations) and recharging them after
transformation.
4.4</p>
      </sec>
      <sec id="sec-3-4">
        <title>Intentional pending</title>
        <p>Since we have discussed preserving the grammar already being in the normal
form, another scenario deserves mentioning where the grammar is normalized |
or rather, when such a normalizing mutation is left pending. Below there are two
use cases for this situation, but any normalization could possibly spawn another
one.</p>
        <p>{ Suppose that we have a grammar written in a speci c notation (usually a
dialect of EBNF). Suppose also that a notation evolves, and the grammar is
required to coevolve in order to preserve conformance to the metalanguage.
This scenario is called `metalinguistic evolution' [12] and has been studied
su ciently to be applied in an automated fashion. One of such applications
involves a grammar being exported to a particular notation, which it might
not perfectly t. For instance, the grammar may use an explicit repetition
(usually denoted with `*') or other metaconstructs which are lacking from
the notation. Another typical case is that the target metalanguage insists on
a particular naming convention for the nonterminal (e.g., all must be written
in capitals). In that case, the grammar needs to coevolve with the `change' of
notation from its original one to the one that it is being exported to. However,
this coevolution is essentially a part of the exporting process, and as such
must always take place after all the other evolution steps. Hence, it can be left
pending until the very end of the transformation script, and be executed last,
removing the use of excessive metalanguage elements, changing the naming
convention and adjusting grammar before the actual export mapping.
{ Grammars in a broad sense can be observed in very di erent environments
and extracted from artifacts hailing from di erent technological spaces: XML
schemata, Ecore models, class diagrams, parser speci cations, data types,
etc. Even when these de ne one intended language, they are di erent in
many ways. A technique called grammar convergence [9] is used to reverse
engineer the real relationships between such grammars: based on
expertwritten transformation scripts, it can show which grammars de ne the same
language, which de ne languages that are subsets or supersets of one another,
and which are incomparable. It is also possible to automate the creation
of such scripts, but the inference algorithm performs best when grammars
are in so called `abstract normal form'. Many constraints of the abstract
normal form contradict the practice of grammar engineering, so it would be
most desirable to continue working with the non-normalized grammar and
then perform the pending normalization right before the guided grammar
convergence algorithm is applied. Then, the obtained result can be traced
back to the original grammar by reversing the bidirectional transformation
chain produced by the normalizer.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Concluding remarks</title>
      <p>There are some techniques similar to pending evolution in the inconsistency
management, most notably with concurrent transformation schemes. Such
inconsistencies can be represented as separate rst-class entities [2] and
incorporated directly to the resulting model [7], which enables e cient handling of
inconsistency detection and resolutions as graph transformation rules [11] in
a much less extreme way than the one proposed in this paper. The fact that
these approaches of inconsistency modeling and resolution are not entirely
covered by negotiated grammar transformation, has inspired us to look for common
schemes of advanced change impact propagation, importing ideas from
modelware to grammarware and adapting them to the domain.</p>
      <p>To summarize, we have proposed the following use cases for the technique of
pending grammar evolution:
{ optional execution (x4.1)
optionally complementing the grammar with missing de nitions
using optional transformations as assertions
{ error handling (x4.2)
reusing transformations for bulk extraction
debugging transformations
{ pending xes (x4.3)
persistent commitment to a normal form
pending recovery heuristics
{ intentional pending (x4.4)
pre-export processing
pre-convergence normalization</p>
      <p>
        Pending evolution for grammars (either in a broad sense [5] or in the
classic sense [1]) has never been considered before. Investigating the impact and
opportunities for pending evolution schemes in other elds like program
transformation remains future work. In transaction handling domains both of great
strictness (such as database management and mainframe job processing) and
persistent inconsistency (such as managing wiki contents with a bot) one will be
able to nd techniques somewhat similar to the one we have proposed here.
5. P. Klint, R. Lammel, and C. Verhoef. Toward an Engineering Discipline for
Grammarware. ACM Transactions on Software Engineering Methodology (TOSEM),
14(
        <xref ref-type="bibr" rid="ref3">3</xref>
        ):331{380, 2005.
6. P. Klint, T. van der Storm, and J. Vinju. EASY Meta-programming with Rascal.
      </p>
      <p>In J. M. Fernandes, R. Lammel, J. Visser, and J. Saraiva, editors, Post-proceedings
of the Third International Summer School on Generative and Transformational
Techniques in Software Engineering (GTTSE 2009), volume 6491 of LNCS, pages
222{289, Berlin, Heidelberg, Jan. 2011. Springer-Verlag.
7. M. Kogel, H. Naughton, J. Helming, and M. Herrmannsdorfer. Collaborative Model
Merging. In Companion of the ACM International Conference on Object Oriented
Programming Systems Languages and Applications, SPLASH '10, pages 27{34, New
York, NY, USA, 2010. ACM.
8. R. Lammel and C. Verhoef. Semi-automatic Grammar Recovery. Software|</p>
      <p>
        Practice &amp; Experience, 31(15):1395{1438, Dec. 2001.
9. R. Lammel and V. Zaytsev. An Introduction to Grammar Convergence. In
M. Leuschel and H. Wehrheim, editors, Proceedings of the Seventh International
Conference on Integrated Formal Methods (iFM 2009), volume 5423 of LNCS, pages
246{260, Berlin, Heidelberg, Feb. 2009. Springer-Verlag.
10. R. Lammel and V. Zaytsev. Recovering Grammar Relationships for the Java
Language Speci cation. Software Quality Journal (SQJ), 19(
        <xref ref-type="bibr" rid="ref2">2</xref>
        ):333{378, Mar. 2011.
11. T. Mens, R. Van Der Straeten, and M. D'Hondt. Detecting and Resolving Model
Inconsistencies Using Transformation Dependency Analysis. In O. Nierstrasz,
J. Whittle, D. Harel, and G. Reggio, editors, Model Driven Engineering Languages
and Systems (MoDELS'06), volume 4199 of LNCS, pages 200{214. Springer, 2006.
12. V. Zaytsev. Language Evolution, Metasyntactically. Electronic Communications
of the European Association of Software Science and Technology (EC-EASST), 49,
2012.
13. V. Zaytsev. Negotiated Grammar Transformation. In J. De Lara, D. Di Ruscio, and
A. Pierantonio, editors, Post-proceedings of the Extreme Modeling Workshop (XM
2012). ACM Digital Library, Nov. 2012. In print, currently available at http://
www.di.univaq.it/diruscio/sites/XM2012/xm2012_submission_11.pdf. An
extended version is currently under major revision to the Special issue on Extreme
Modeling of The Journal of Object Technology (JOT).
14. V. Zaytsev. Notation-Parametric Grammar Recovery. In A. Sloane and S.
Andova, editors, Post-proceedings of the 12th International Workshop on Language
Descriptions, Tools, and Applications (LDTA 2012). ACM Digital Library, June
2012.
15. V. Zaytsev. Grammar Zoo: A Repository of Experimental Grammarware. Under
major revision for the Fifth Special issue on Experimental Software and Toolkits
of Science of Computer Programming (SCP EST5), 2013.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>A. V.</given-names>
            <surname>Aho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Sethi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J. D.</given-names>
            <surname>Ullman</surname>
          </string-name>
          . Compilers: Principles, Techniques and Tools. Addison-Wesley,
          <year>1985</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>A.</given-names>
            <surname>Cicchetti</surname>
          </string-name>
          ,
          <string-name>
            <surname>D.</surname>
          </string-name>
          <article-title>Di Ruscio, and</article-title>
          <string-name>
            <given-names>A.</given-names>
            <surname>Pierantonio</surname>
          </string-name>
          .
          <article-title>A Metamodel Independent Approach to Di erence Representation</article-title>
          .
          <source>Journal of Object Technology</source>
          ,
          <volume>6</volume>
          (
          <issue>9</issue>
          ):
          <volume>165</volume>
          {
          <fpage>185</fpage>
          ,
          <string-name>
            <surname>Oct</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <source>TOOLS EUROPE</source>
          <year>2007</year>
          | Objects, Models, Components, Patterns.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>J.</given-names>
            <surname>Gosling</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Joy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. L.</given-names>
            <surname>Steele</surname>
          </string-name>
          , and
          <string-name>
            <surname>G. Bracha.</surname>
          </string-name>
          <article-title>The Java Language Speci cation</article-title>
          .
          <source>Addison-Wesley, third edition</source>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>J.</given-names>
            <surname>Heering</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. R. H.</given-names>
            <surname>Hendriks</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Klint</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Rekers. The Syntax De nition Formalism</surname>
          </string-name>
          <string-name>
            <surname>SDF</surname>
          </string-name>
          |
          <article-title>Reference Manual</article-title>
          .
          <source>ACM SIGPLAN Notices</source>
          ,
          <volume>24</volume>
          (
          <issue>11</issue>
          ):
          <volume>43</volume>
          {
          <fpage>75</fpage>
          ,
          <year>1989</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>