<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Survey on Template-based Code Generation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Lechanceux Luhunu</string-name>
          <email>lechanceux.luhunu.kavuya@umontreal.ca</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Eugene Syriani</string-name>
          <email>syriani@iro.umontreal.ca</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>DIRO, University of Montreal</institution>
          ,
          <addr-line>Montreal, QC</addr-line>
          ,
          <country country="CA">Canada</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Among the various model-to-text transformation paradigms, template-based code generation (TBCG) is the most popular in MDE. Given the diversity of tools and approaches, it is necessary to classify and compare existing TBCG techniques to provide appropriate support to developers. We conduct a systematic mapping study of the literature to better understand the trends and characteristics of TBCG techniques over the past 16 years. We also evaluate the expressiveness, performance and scalability of the associated tools based on a range of models that implement critical patterns.</p>
        <p>Index Terms: model-driven engineering, code generation, systematic mapping study, performance study, expressiveness evaluation</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>I. MOTIVATION AND GOALS</title>
      <p>A critical step in model-driven engineering (MDE) is the
automatic synthesis of a textual artifact from models. This
model transformation is used to generate application
code, to serialize the model in persistent storage, and to
generate documentation or reports. Among the various
model-to-text transformation paradigms, template-based code generation
(TBCG) is the most popular in MDE. TBCG is a synthesis
technique that produces code from high-level specifications,
called templates. A template is an abstract and generalized
representation of the textual output it describes. It has a static
part, made of text fragments that appear in the output “as is”, and a
dynamic part, made of embedded splices of meta-code that encode
the generation logic. TBCG fits MDE well because
both emphasize abstraction and automation. Given the
diversity of tools and approaches, it is necessary to classify
and compare existing TBCG techniques and tools to provide
appropriate support to developers.</p>
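      <p>As an illustration of the static and dynamic parts of a template, the following minimal sketch uses the string.Template class of the Python standard library. It is not one of the tools surveyed in this paper, and the dictionary-based model as well as the names CLASS_TEMPLATE, FIELD_TEMPLATE, and generate_class are assumptions made for the example.</p>
      <preformat><![CDATA[
```python
from string import Template

# Static part: the literal text, copied to the output "as is".
# Dynamic part: the $placeholders, filled from the input model.
CLASS_TEMPLATE = Template(
    "public class $name {\n"
    "$fields"
    "}\n"
)
FIELD_TEMPLATE = Template("    private $type $name;\n")


def generate_class(model: dict) -> str:
    """Render Java-like source text from a small dictionary-based model."""
    # Generation logic: iterate over the fields of the model element.
    fields = "".join(
        FIELD_TEMPLATE.substitute(type=f["type"], name=f["name"])
        for f in model["fields"]
    )
    return CLASS_TEMPLATE.substitute(name=model["name"], fields=fields)


# Hypothetical input model, used only for this example.
invoice_model = {
    "name": "Invoice",
    "fields": [
        {"type": "Date", "name": "date"},
        {"type": "double", "name": "total"},
    ],
}

print(generate_class(invoice_model))
```
]]></preformat>
      <p>In this sketch the loop over the fields plays the role of the meta-code; dedicated TBCG tools typically embed such logic directly inside the template text.</p>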
      <p>In this work, we conduct a systematic mapping study
(SMS) of the literature in order to understand the trends,
identify the characteristics of TBCG, assess the popularity
of existing tools, and determine the influence that MDE has
had on TBCG over the past 16 years. Based on this SMS,
we compare the nine most popular TBCG tools found in
the literature. We perform a qualitative evaluation of their
expressiveness based on typical metamodel patterns that
influence the implementation of the templates. The expressiveness
of a tool is the set of language constructs that can be used
to complete a particular task natively. This is important since,
to the best of our knowledge, there are no available metrics
to assess the code generation templates. We also evaluate
the performance and scalability of these tools based on a
range of models that conform to a metamodel composed by
the combination of these patterns.</p>
      <p>Fig. 1: Evolution of papers in the corpus (vertical axis: # of papers).</p>
    </sec>
    <sec id="sec-2">
      <title>II. TRENDS OF TBCG</title>
      <p>We first present the results of the SMS we conducted on TBCG.</p>
      <sec id="sec-2-1">
        <title>A. Systematic Mapping Study</title>
        <p>
          We followed the process defined in [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] to portray the
literature on TBCG. The protocol we followed is described
in [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. The research questions guiding the SMS are: (RQ1)
What are the trends in template-based code generation? (RQ2)
What are the characteristics of TBCG approaches? (RQ3) To
what extent are TBCG tools being used? (RQ4) What is the
place of MDE in TBCG? From online databases, we collected 5,131 papers
published between 2000 and 2016 that matched our search
keywords. After screening all these papers,
we obtained a final corpus of 481 papers. We then classified
each paper according to a classification scheme (available in
[
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]) that helps answer our research questions.
        </p>
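        <p>As a quick sanity check on these figures, the short sketch below derives the share of collected papers retained after screening and the average number of corpus papers per year. It assumes that the period from 2000 to 2016 is counted inclusively (17 publication years); the variable names are chosen for illustration only.</p>
        <preformat><![CDATA[
```python
# Figures reported by the study.
collected = 5131   # papers returned by the keyword search
corpus = 481       # papers kept after screening

# Assumption: 2000 to 2016 inclusive, i.e., 17 publication years.
years = 2016 - 2000 + 1

retention_pct = 100 * corpus / collected   # share of papers kept
papers_per_year = corpus / years           # average corpus size per year

print(f"retained after screening: {retention_pct:.1f}%")
print(f"average papers per year: {papers_per_year:.1f}")
```
]]></preformat>
        <p>Under this assumption, roughly 9.4% of the collected papers are retained, and the corpus averages about 28.3 papers per year.</p>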
      </sec>
      <sec id="sec-2-2">
        <title>B. Evolution of TBCG</title>
        <p>Fig. 1 reports the number of papers per year, averaging
around 28. This large sample of papers
suggests that TBCG has received considerable attention from the
research community. The community has maintained a
production rate in line with the average of the last 11 years, in particular
with a constant share of journal articles (24%).
The only exceptions were a significant boost in 2013 and a
dip in 2015. The most popular venues are MODELS, SOSYM, and
ECMFA. However, we noticed a decrease in publications in
MDE venues, which suggests that TBCG is now applied in
development projects rather than being a critical research problem to
solve. Conference papers as well as venues outside MDE and
software engineering had a significant impact on the evolution
of TBCG. Given that TBCG seems to have reached a steady
publication rate since 2005, we can expect contributions from
the research community to continue at that rate.</p>
      </sec>
      <sec id="sec-2-3">
        <title>C. Characteristics of TBCG</title>
        <p>
          Output-based templates have been the most popular
style from the beginning (72%). In this style,
the template is syntactically based on the actual target output,
such as in [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], which uses Xpand. Nevertheless, there have been
some attempts to propose other template styles, like the
rule-based style in [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], but they did not catch on (4%). Because
it is simple to use, the predefined style is probably still
popular in practice, e.g., in CASE tools, but less so in research
papers (24%).
        </p>
        <p>TBCG has been used to synthesize a variety of application
code and documents. As expected, the study shows that
high-level language inputs (general purpose 48% or domain-specific
22% modeling languages) have prevailed over any other type
(schema 20% or programming languages 10%). Specifically
for MDE approaches to TBCG, the input to transform is
moving from general purpose to domain-specific models.</p>
        <p>The study confirms that the community uses TBCG
mainly to generate source code (81%), rather than structured data,
e.g., XML (16%), or natural language documents (3%). This
trend is set to continue since the automation of computerized
tasks is continuing to gain ground in all fields. TBCG has been
applied in many domains, software engineering (55%)
and embedded systems (13%) being the most popular, but also,
unexpectedly, in unrelated domains such as bio-medicine and
finance.</p>
        <p>The study revealed a total of 77 different tools for TBCG.
Many studies implemented code generation with a
custom-made tool that was never or seldom reused. This indicates
that the development of new tools is still very active.
Model-based tools are the most popular (49%). The research
community's preference for the output-based template style has
particularly influenced the implementation of the tools. This
template style allows for more fine-grained customization of the
synthesis logic, which seems to be what users have favored.
This aspect is also influencing the expansion of
TBCG into industry. Well-known tools like Acceleo, Xpand,
and Velocity are moving from being simple research material
to effective development resources in industry.</p>
      </sec>
      <sec id="sec-2-4">
        <title>D. Role of MDE</title>
        <p>
          The burst of papers in 2005 coincides with the transition
from the UML to the MODELS conference. MDE venues have
increased the average number of publications by a factor
of four. There are many advantages to code generation, such
as reduced development effort, domain/application concepts that are
easier to write and understand, and fewer errors [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. These
are, in fact, the pillar principles of MDE and domain-specific
modeling [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. Thus, it is not surprising to see that many code generation tools, though
not all, came out of the MDE
community. As TBCG became commonplace in general, the
research in this area is now mostly conducted by the MDE
community. Furthermore, MDE has brought very popular tools
that have met with great success, and they are also
contributing to the expansion of TBCG across industry. It is
important to mention that the MDE community publishes in
specific venues like MODELS, SOSYM, or ECMFA, unlike other
research communities where the venues are very diversified.
These three are the top-ranked venues in terms of number of
TBCG papers published. Taken together, this analysis indicates that
the advent of MDE has been driving TBCG research.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>III. TOOL EXPRESSIVENESS</title>
      <p>
        We evaluate and compare the nine most popular tools found
in the SMS with respect to metamodel patterns that drive
the implementation of the dynamic part of the template. The
complete evaluation methodology is described in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <sec id="sec-3-1">
        <title>A. Metamodel Patterns for TBCG</title>
        <p>
          To evaluate the expressiveness of TBCG tools, we identify
a minimal set of four common structures found in metamodels
that influence TBCG. This is the result of analyzing a plethora
of metamodels that were used for TBCG from repositories [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ],
[
          <xref ref-type="bibr" rid="ref10">10</xref>
          ], known metamodel patterns [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ], and industrial
experiences [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]. We evaluated the tools on the common running
example of invoice production, for which the metamodel is
depicted in Fig. 2.
        </p>
        <p>1) Navigation: this pattern applies when there is a navigable
relation between two classes. A template uses it to access the
data of a target class related to the class of the current context.</p>
        <p>2) Variable dependency: this pattern is like the navigation
pattern, but applies when the template must output a value that
depends on variables present in other classes.</p>
        <p>3) Polymorphism: this pattern takes advantage of an
inheritance relationship to reuse parts of the template. The template
implements the output for the super class and only what varies
for the subclass(es).</p>
        <p>4) Recursion: this pattern consists of a recursive
self-relation of a class. The template can be reapplied on objects
of the same type in a transparent way.</p>
        <p>Fig. 2: Metamodel of the invoice running example, with regions labeled Navigation, Variable dependency, Polymorphism, and Recursion.</p>
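        <p>The four patterns can be sketched on a small invoice model written as Python dataclasses. This is only an approximation for illustration: apart from the invoice, its date, the abstract item type, and the Category hierarchy mentioned in the text, the class and attribute names (Product, unit_price, quantity, and so on) are assumptions and are not taken from the metamodel of Fig. 2.</p>
        <preformat><![CDATA[
```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Category:
    name: str
    # 4) Recursion: a class related to itself.
    parent: Optional["Category"] = None


@dataclass
class Item:
    # Treated as abstract: only its subtypes are instantiated.
    label: str
    quantity: int
    category: Optional[Category] = None

    def amount(self) -> float:
        raise NotImplementedError("defined by the priced subtypes")


@dataclass
class Product(Item):
    # 3) Polymorphism: the subtype adds only what varies.
    unit_price: float = 0.0

    def amount(self) -> float:
        return self.quantity * self.unit_price


@dataclass
class Invoice:
    date: str
    # 1) Navigation: a navigable relation from Invoice to Item.
    items: List[Item] = field(default_factory=list)

    def total(self) -> float:
        # 2) Variable dependency: the value depends on variables
        # held by other classes (the items of the invoice).
        return sum(item.amount() for item in self.items)


office = Category("Office")
paper = Category("Paper", parent=office)
invoice = Invoice("2017-01-31", [Product("A4 ream", 2, paper, 4.5)])
print(invoice.date, invoice.total(), invoice.items[0].category.parent.name)
```
]]></preformat>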
      </sec>
      <sec id="sec-3-2">
        <title>B. Template expressiveness</title>
        <p>Table I summarizes the qualitative evaluation of the
expressiveness of each TBCG tool, showing whether it successfully
implemented each pattern or not.</p>
        <p>All tools successfully implement the trivial navigation
pattern. For example, to access the date of the
invoice object, all tools use the dot operator. In XSLT,
navigating through a composition relation is accomplished with
the xsl:value-of expression, and a different
strategy is required when the relation is an association.</p>
        <p>We implemented the variable dependency pattern to calculate
and output the total of the invoice. Acceleo and XSLT
have powerful built-in mathematical functions, especially for
collection types. EGL, JET, Velocity, T4, and Xtend2 rely on
the use of global variables and statement blocks. It was not
possible to implement this pattern “natively” with StringTemplate (ST)
and Xpand. We resorted to extending the template with
a Java program to handle the calculations.</p>
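        <p>The two native strategies mentioned above can be contrasted in a language-neutral way. The sketch below is written in Python rather than in any of the evaluated template languages, and the item records with quantity and unit_price keys are assumptions made for the example. The first function mimics a built-in aggregation over a collection; the second mimics a variable that is updated inside a statement block while the output is being produced.</p>
        <preformat><![CDATA[
```python
from typing import Dict, List

# Hypothetical item record: a quantity and a unit price.
Item = Dict[str, float]


def total_with_collection_operation(items: List[Item]) -> float:
    """Style 1: a single built-in aggregation over the collection."""
    return sum(item["quantity"] * item["unit_price"] for item in items)


def render_with_accumulator(items: List[Item]) -> str:
    """Style 2: a variable updated in a statement block while the
    item lines are being emitted."""
    total = 0.0
    lines = []
    for item in items:
        amount = item["quantity"] * item["unit_price"]
        total += amount  # accumulate across iterations
        lines.append(f"Amount: {amount:.2f}")
    lines.append(f"Total: {total:.2f}")
    return "\n".join(lines)


items = [
    {"quantity": 2, "unit_price": 10.0},
    {"quantity": 3, "unit_price": 1.5},
]
print(total_with_collection_operation(items))
print(render_with_accumulator(items))
```
]]></preformat>
        <p>A template language that offers neither collection operations nor assignable variables cannot express either style natively, which is consistent with the need to delegate the calculation to external code.</p>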
        <p>We used the polymorphism pattern to process the priced
items of an invoice, which are subtypes of the abstract item
type. In Acceleo, Xpand, and Xtend2, it is mandatory to write
a template block for the super class even though its content is
not printed in the output. In EGL, the content of the super class
template definition block is output, along with the content of
the one for the subclass. In JET, Velocity, T4, XSLT, and ST,
no template code can be defined for abstract classes. Thus,
the developer must replicate the common template code for
all possible subclasses.</p>
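        <p>The intent of the polymorphism pattern, writing the common output once and only the varying part per subtype, can be sketched with ordinary class inheritance. The Python code below is an analogy, not the syntax of any evaluated tool; the subtypes Product and Service and their attributes are hypothetical.</p>
        <preformat><![CDATA[
```python
class Item:
    """Super class: holds the output that is common to every item."""

    def __init__(self, label: str, quantity: int) -> None:
        self.label = label
        self.quantity = quantity

    def render(self) -> str:
        return f"{self.quantity} x {self.label}"


class Product(Item):
    def __init__(self, label: str, quantity: int, unit_price: float) -> None:
        super().__init__(label, quantity)
        self.unit_price = unit_price

    def render(self) -> str:
        # Reuse the super class output, then add only what varies.
        return f"{super().render()} at {self.unit_price:.2f} each"


class Service(Item):
    def __init__(self, label: str, quantity: int, hourly_rate: float) -> None:
        super().__init__(label, quantity)
        self.hourly_rate = hourly_rate

    def render(self) -> str:
        return f"{super().render()} billed at {self.hourly_rate:.2f} per hour"


def render_items(items) -> str:
    # Dynamic dispatch selects the render() of each concrete subtype.
    return "\n".join(item.render() for item in items)


print(render_items([Product("Keyboard", 2, 30.0), Service("Support", 3, 50.0)]))
```
]]></preformat>
        <p>Without such a mechanism, the body of Item.render would have to be copied into every subtype, which corresponds to the replication of template code described above.</p>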
        <p>We implemented the recursion pattern to obtain the depth
level of the invoice Category from the hierarchy of
categories present in the model. We were only able to implement it
in EGL, Acceleo, Xtend2, and T4, thanks to the use of functions
or typed definition blocks. The dedicated language of T4 allows
calling C# functions defined in the template and thus implementing
recursion. Although Xpand supports typed definition blocks,
they only take a single argument, which is a type of element
in the input metamodel. Thus, it is not possible to accumulate
a value in a variable. XSLT does not implement this pattern
either because there is no trace between the argument that is
passed to the function and the variable passed in the initial
invocation. It is not possible to implement this pattern in JET,
ST, and Velocity due to the absence of typed definition blocks
or functions.</p>
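        <p>The computation required by the recursion pattern can be sketched as a function that calls itself on the parent of a category. The Python code below only illustrates the logic; the attribute name parent and the convention that a root category has depth 0 are assumptions.</p>
        <preformat><![CDATA[
```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Category:
    name: str
    parent: Optional["Category"] = None  # recursive self-relation


def depth(category: Category) -> int:
    """Depth of a category in its hierarchy (a root has depth 0)."""
    if category.parent is None:
        return 0
    # The same definition is reapplied to an object of the same type.
    return 1 + depth(category.parent)


def render_category(category: Category) -> str:
    return f"{category.name} (depth {depth(category)})"


root = Category("Goods")
office = Category("Office", parent=root)
paper = Category("Paper", parent=office)
print(render_category(paper))
```
]]></preformat>
        <p>The function needs both a way to call itself and a way to return or accumulate a value, which is what functions or typed definition blocks provide in the tools that support this pattern.</p>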
        <p>Fig. 3: Execution time (ms) against model size for JET, Velocity, ST, Xtend2, T4, XSLT, Xpand, EGL, and Acceleo (logarithmic axes; time from 1E+0 to 1E+6 ms, model size from 1E+1 to 1E+5).</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>IV. PERFORMANCE EVALUATION</title>
      <p>
        To compare the performance of the tools, we generated 10
models conforming to Fig. 2, with a size varying from 10
to 10<sup>5</sup> classes. There are 3 instances of navigation, 7 to 10<sup>5</sup>
variable dependencies, 6 to 10<sup>5</sup> polymorphisms, and 1 to 10<sup>2</sup>
recursions. Fig. 3 shows that the execution time increases with
the size of the model for all tools. The complete evaluation
methodology is described in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>Overall, JET is the fastest tool across the whole
experiment. This is expected since JET instantly generates the
corresponding Java class from the template as the developer
is writing the template. Therefore, the execution time here
corresponds to executing the generated Java code that produces
the output. Excluding the special case of JET, Velocity and ST
are the fastest. T4 is as efficient as JET for smaller models.
However, for larger models, it becomes slower than Velocity
and ST. Xtend2 outperforms T4 for these models too, making
it the fastest model-based tool. Xpand and XSLT come next.
The slowest tools are EGL followed by Acceleo.</p>
      <p>Velocity template execution scales remarkably well, slowing down by only
a factor of 15 for models with 10<sup>5</sup> elements compared to
smaller models with 10<sup>3</sup> elements. It is followed by JET,
Xtend2, ST, and XSLT with around a factor of 25. For the
remaining tools, the size of the model has a significant effect
on their performance. T4 and Acceleo have the worst scale
factor.</p>
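      <p>One way to read these scale factors is as the ratio of the execution time on the larger model to the execution time on the smaller one; this reading is an assumption, since the text does not give a formula. The sketch below computes that ratio for two hypothetical tools. The timings are invented for illustration and are not measurements from this study.</p>
      <preformat><![CDATA[
```python
def scale_factor(timings_ms: dict, small: int = 10**3, large: int = 10**5) -> float:
    """Ratio of the execution time on the large model to that on the small one."""
    return timings_ms[large] / timings_ms[small]


# Hypothetical timings in milliseconds, NOT measurements from this study.
tool_a = {10**3: 40.0, 10**5: 600.0}
tool_b = {10**3: 40.0, 10**5: 4000.0}

# Between the two sizes, the model itself grows by a factor of 100.
size_growth = 10**5 / 10**3

for name, timings in (("tool_a", tool_a), ("tool_b", tool_b)):
    factor = scale_factor(timings)
    kind = "sub-linear" if factor < size_growth else "linear or worse"
    print(f"{name}: scale factor {factor:.0f} ({kind})")
```
]]></preformat>
      <p>Since the model grows by a factor of 100 between the two sizes, a scale factor below 100 under this reading means that the execution time grows more slowly than the model.</p>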
      <p>
        Enabling the recursion pattern gives a similar trend for the
four tools concerned. It did not significantly influence the
performance of Acceleo and T4, but Xtend2 performed 10%
slower than in Fig. 3. However, EGL performed 10% faster
because the dedicated language EOL supports caching [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
      </p>
    </sec>
    <sec id="sec-5">
      <title>V. CONCLUSION</title>
      <p>The community has been using TBCG in diverse ways over the
past 16 years, and research and development are still
very active. TBCG has been greatly influenced by MDE.
Both model-based and code-based tools are becoming effective
development resources in industry. The former are the most
capable tools since most of them successfully implemented all
the metamodel patterns. However, the latter performed much
faster. Although JET is the fastest tool, Xtend2 offers the best
compromise between expressiveness and performance.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>K.</given-names>
            <surname>Petersen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Feldt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mujtaba</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Mattsson</surname>
          </string-name>
          , “
          <article-title>Systematic Mapping Studies in Software Engineering</article-title>
          ,” in Evaluation and Assessment in Software Engineering, ser.
          <source>EASE'08</source>
          , vol.
          <volume>17</volume>
          . British Computer Society,
          <year>2008</year>
          , pp.
          <fpage>68</fpage>
          -
          <lpage>77</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>E.</given-names>
            <surname>Syriani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Luhunu</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H.</given-names>
            <surname>Sahraoui</surname>
          </string-name>
          , “
          <article-title>Systematic Mapping Study of Template-based Code Generation</article-title>
          ,” Tech. Rep.
          <source>arXiv:1703.06353</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] http://www-ens.iro.umontreal.ca/ luhunukl/survey/classification.html.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>W.</given-names>
            <surname>Dahman</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Grabowski</surname>
          </string-name>
          , “
          <article-title>UML-based specification and generation of executable web services</article-title>
          ,” in System Analysis and Modeling, ser.
          <source>LNCS</source>
          , vol.
          <volume>6598</volume>
          . Springer,
          <year>2010</year>
          , pp.
          <fpage>91</fpage>
          -
          <lpage>107</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Hemel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. C.</given-names>
            <surname>Kats</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. M.</given-names>
            <surname>Groenewegen</surname>
          </string-name>
          , and E. Visser, “
          <article-title>Code generation by model transformation: a case study in transformation modularity</article-title>
          ,”
          <source>Software &amp; Systems Modeling</source>
          , vol.
          <volume>9</volume>
          , no.
          <issue>3</issue>
          , pp.
          <fpage>375</fpage>
          -
          <lpage>402</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>R.</given-names>
            <surname>Balzer</surname>
          </string-name>
          , “
          <article-title>A 15 Year Perspective on Automatic Programming,”</article-title>
          <source>Transactions on Software Engineering</source>
          , vol.
          <volume>11</volume>
          , no.
          <issue>11</issue>
          , pp.
          <fpage>1257</fpage>
          -
          <lpage>1268</lpage>
          ,
          <year>1985</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Kelly</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.-P.</given-names>
            <surname>Tolvanen</surname>
          </string-name>
          ,
          <source>Domain-Specific Modeling: Enabling Full Code Generation</source>
          . John Wiley &amp; Sons,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>L.</given-names>
            <surname>Luhunu</surname>
          </string-name>
          and E. Syriani, “
          <article-title>Comparison of the expressiveness and performance of template-based code generation tools</article-title>
          ,” in Software Language Engineering, ser.
          <source>LNCS</source>
          . Springer,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>[9] http://web.emn.fr/x-info/atlanmod/index.php?title=Zoos.</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>[10] http://www.remodd.org.</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>H.</given-names>
            <surname>Cho</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Gray</surname>
          </string-name>
          , “Design Patterns for Metamodels,” in Domain-Specific Modeling workshop, ser.
          <source>SPLASH '11 Workshops</source>
          . ACM
          ,
          <year>2011</year>
          , pp.
          <fpage>25</fpage>
          -
          <lpage>32</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>V.</given-names>
            <surname>Sousa</surname>
          </string-name>
          , E. Syriani, and
          <string-name>
            <given-names>M.</given-names>
            <surname>Paquin</surname>
          </string-name>
          , “Feedback on How MDE Tools are Used Prior to Academic Collaboration,” in
          <source>Symposium On Applied Computing</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>D.</given-names>
            <surname>Kolovos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Rose</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. F.</given-names>
            <surname>Paige</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>García-Domínguez</surname>
          </string-name>
          ,
          <source>The Epsilon Book</source>
          . Eclipse
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>