<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Variability Management in Domain-Specific Languages</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>David Méndez-Acuña</string-name>
          <email>david.mendez-acuna@inria.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Rennes 1 and INRIA</institution>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Domain-specific languages (DSLs) have demonstrated their capability to reduce the gap between the problem domain and the technical decisions during the software development process. However, building a DSL is not an easy task because it requires specialized knowledge and skills. Moreover, the challenge becomes even more complex in the context of multi-domain companies where several domains coexist across the business units and, consequently, there is a need of dealing not only with isolated DSLs but also with families of DSLs. To deal with this complexity, the research community has been working on the definition of approaches that use the ideas of Software Product Lines Engineering (SPLE) for building and maintaining families of DSLs. In this paper, we present a PhD thesis that is aimed to contribute to this effort. In particular, we explain the challenges that need to be addressed during the process of going from a family of DSLs to a software language line. Then, we briefly discuss the state of the art, and finally we introduce a research plan.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>Domain-specific languages (DSLs) allow domain experts the expression of
solutions directly in terms of relevant domain concepts and, for example, use
generative mechanisms to transform specifications of DSLs into software artifacts (e.g.
code, configuration files or documentation). Thus, abstracting away from the
complexity of the rest of the system and the intricacies of its implementation.
As a result, the construction of DSLs is becoming a recurrent activity during
the development of software intensive systems [11]. However, the engineering of
DSLs is a complex task in the context of large companies such as Thales where
the users often have different –and sometimes conflicting– requirements on the
same DSL. For example, certain user may require a textual concrete syntax while
another user prefers a graphical concrete syntax. Worst, for two different users,
the meaning (or semantics) of a particular concept of the language may differ.
When this occurs, we have a set of different DSLs that share some commonalities
and that are distinguished each other by some particularities. In the literature
this is known under the term of families of DSLs [14].</p>
      <p>
        Figure 1 illustrates this situation by presenting a segment of the family of
DSLs for modeling finite state machines deeply studied in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and composed of:
Rhapsody [8], Classical statecharts [9], and UML state machines diagrams [7].
In this case, the commonalities among the three DSLs are the basic concepts
of states and transitions. In turn, the differences are focused on the concepts of
timed transitions and simultaneous events; while Rhapsody and UML include
the notion of timed transitions, it is not supported in classical statecharts. On
the other hand, classical statecharts support simultaneous events whereas both
Rhapsody and UML adopt to the idea of run-to-completion i.e., each event
is attended after processing the precedent one. Note that the second difference
corresponds to a conflict in the semantics of the events dispatching functionality;
all languages support that functionality but its semantics varies.
      </p>
      <p>
        Although the problem of dealing with families of DSLs is becoming more and
more recurrent, currently there is little support for their implementation.
Usually, each DSL member of the family is developed in isolation and from scratch
what is not convenient because the construction of DSLs is a challenging task;
to successfully perform such activity, an engineer must own not only quite solid
modeling skills but also the technical expertise for conducting the definition of
specific artifacts such as grammars, metamodels, compilers, and interpreters,
among others. Nevertheless, as remembered by J.M. Favre in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] software
languages are software too and, consequently, there is room for application of
software engineering techniques that facilitate their construction process (i.e.,
software languages engineering [12]). In particular, the research community has
realized that the ideas behind Software Product Lines Engineering (SPLE) can
be used for facilitating the construction of families of DSLs. This paper presents
a PhD thesis aimed to contribute in this effort. To do so, we briefly discuss the
challenges that should be addressed during the process of going from a family
of DSLs to a software languages line. Then, we discuss the state of the art, and
finally we present a research plan.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Problem Statement</title>
      <p>We have identified three challenges towards the construction software languages
lines.</p>
      <p>
        – Languages modular design. The main purpose of an approach for
wellengineering families of DSLs is to increase the reuse in the sense that those
language segments that are shared between two or more members of the
family can be easily shared without being implemented independently and
from scratch. To do so, it is necessary to separate those shared segments
and encapsulate them in such a way that their specification artifacts can
be actually used as part of the specification of the languages that require
it. This separation implies a mechanism that allows to specify a language in
different modules (a.k.a. language units) that can be later composed together
to provide a complete language specification.
– Multi-dimensional variability modeling. It is worth noting that
differences and particularities among members of a family of DSLs can be found
in one or several dimensions of the specification. Indeed, the work presented
in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] provides a classification of the possible types of variability that can be
found within families of DSLs. A brief summary of this classification is
introduced below; the challenge in this case is to be able to propose an approach
that supports these types of variability.
      </p>
      <p>
        Functional variability: One of the motivations for implementing families
of DSLs is to offer customized languages that provide only the
constructs required by certain type of users. The hypothesis is that the user
will adopt the language easier if the language only offers the constructs
he/she actually needs. If there are additional concepts the complexity
of the language (and the tools) needlessly increases and "the users are
forced to rely on abstractions that might not be naturally part of the
abstraction level at which they are working" [17]. Functional variability
refers to the capability of selecting the desired language constructs for a
particular type of user. Because of the abstract syntax of the language
is the base of the specification –i.e., there can not be definitions neither
in the concrete syntax nor the semantics for concepts that do not exist
in the abstract syntax– the functional variability relies on the activity of
selecting the subset of the abstract syntax required for a particular user.
Syntactic variability: Depending on the context and, again, on the type
of user, the use of certain types of concrete syntax may be more
appropriate than other one. Consider for example, the dichotomy between textual
or graphical notations. Empirical studies such as the presented in [15]
show that, for a specific case, graphical notations are more appropriate
than textual notations whereas other evaluation approaches argue that
textual notations have advantages in cases where models become large
[
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. More intermediate perspectives (e.g., [10]) state that graphical and
textual notations more that mutually exclusive are complementary.
Syntactical variability refers to the capability of supporting different
representations for the same language construct. In terms of the specification,
the syntactical variability can be viewed as different implementations of
the concrete syntax specification unit for a given abstract syntax
specification unit.
      </p>
      <p>
        Semantic variability: Another problem that has gained attention in the
literature of software languages engineering is the semantic variation
points existing in software languages. A semantic variation point appears
where the same construct can have several interpretations. Consider for
example the semantics differences that exist between state machines
languages explored in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]; the construct fork can be interpreted as a
concurrency point where all the output transitions are dispatched
simultaneously or simply as a bifurcation point where the output transitions
are dispatched sequentially. Semantic variability refers to capability of
supporting different interpretations to the same language construct. In
terms of the specification, semantic variability can be viewed as different
implementations of the semantic specification unit for a given abstract
specification unit.
– Language units composition. After successfully modeling the variability,
the next challenge is to compose a concrete DSL from a configuration of the
family. In other words, after selecting the required language units, they have
to interact each other for constituting a DSL that actually works. To do so, a
mechanism for language unit composition is required. The main requirement
on this composition mechanism is that allows to select from a set of language
units the ones that should be combined. To do so, an external language
that allows to express the composition is needed. In the context of software
architecture that should be an architectural description language that allows
to select a set of given components and compose them together. Note that
this requirement is strongly related with the modularization mechanism.
3
      </p>
    </sec>
    <sec id="sec-3">
      <title>Related work</title>
      <p>The idea of families of languages has already been discussed in the software
engineering literature. We can find “integral approaches" that propose an integral
solution to the problem by addressing the three challenges presented above.
Besides, there are “partial approaches" that deal only with one or two of these
challenges (e.g., approaches for components-based DSLs development deal with
languages modular design but do not consider variability management). In this
section, we briefly discuss the strengths and weaknesses of integral approaches.
Partial approaches will discussed in a publication currently in progress.</p>
      <p>
        In the work presented in [16] the modularization problem is treated by using
Neverlang [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] i.e., a tool that allows the expression of a software language in
separate language units that can be later composed for generating an interpreter. In
turn, the variability management mechanism is addressed by using the Common
Variability Language (CVL). The main advantage of that work is that there is
a clear mapping between the variability modeling approach with the
modularization tool. The specifically the authors present the notion of “slices" as the
mechanism for composing certain features of a language in Neverlang, then,
providing support for functional and semantical variability. However, in Neverlang
the abstract and the concrete syntax of language units are defined in the same
artifact (i.e., a BNF-like grammar). As a result, it is not possible to support
syntactical variability. Besides, there is not any support for static validation of
compatibility between different language units. Hence, compatibility problems
can only be detected in composition time by reading the errors produced by the
composition tool.
      </p>
      <p>The work presented in [14] introduces the concept of crosscutting
modularization, that is, the capability of decomposing a language not only into different
language units but also decomposing a language feature into several tool
features in order to support not only functional variability but also syntactical
and semantical variability. Each tool feature represents a dimension of the
language specification (e.g., syntax, semantics, constraints, documentation). The
weaknesses of this approach are fundamentally the lack of static verification of
composability and the lack of support for the definition of constraints in the
variability model. Using this approach, the language engineer may produce invalid
configurations by selecting, for example, a semantic feature that requires a
nonselected abstract syntax feature. To void this, the variability model should be
annotated with dependency constraints –what may produce as many constraints
as existing features in the model–. We claim that this process can be facilitated
by considering the definition of a staged processes where each dimension of the
variability is defined in a different variability model and they are presented in
order to the user in such a way that each step only includes selectable features.</p>
      <p>
        Finally, work presented in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] is a formalization of the foundations for
managing variability managment in software languages lines. It considers the notions
of functional, semantic, and syntactic variability. Besides, the authors present
a languages benchmark called MontiCore [13] that supports modularization of
modeling languages. Although there is not a technical impediment for
implementing a family of DSLs using this variability modeling mechanism in junction
with MontiCore, to the best of our knowledge, there is not a concrete
implementation of an approach that includes both functionalities at the same time.
4
      </p>
    </sec>
    <sec id="sec-4">
      <title>Proposed approach</title>
      <p>Our solution approach includes one solution strategy for each one of the
aforementioned challenges. These strategies are summarized below:
– Components-based DSLs’ development. We propose to address
languages modularization by means of an approach for components-based DSLs’
development where the main concept is the notion of language units that
interact each other by means of languages interfaces. The idea is that
dependencies between language units are expressed as software interfaces that
offer capabilities for compatibility checking and, in a later phase, languages
composition. Our preliminary results, suggest the need of the definition of
different types of languages interfaces that support different scenarios for
languages modularization such as languages embedding or languages
extension.
– Multi-dimensional CVL for variability modeling: To deal with
variability modeling within families of DSLs, we propose to enhance the CVL
(Common Variability Language) with capabilities for multi-dimensional
variability modeling. We choose CVL because it provides support not only for
variability models but also for implementation models which facilitate the
mapping between language features and language units. Besides, it
introduces the notion realization models constitute a mechanism for configuring
a DSL by offering multi-staged variability.
– Language units composition: In order to perform the composition of
several language units we explore two strategies: compilation based composition
and interpretation based composition. In the first case, the idea is to compose
the language units specifications and produce a complete specification that
can be later used for automatically generating language tooling such as
editors or type-checkers. In the second case, the idea is to maintain specification
separated and generate the corresponding tooling for each one. After that,
the services of each of tooling are orchestrated to offer an infrastructure for
the DSL. Each strategy presents advantages and disadvantages. For
example, whereas interpretation based composition may impact the performance
of the language, it enables variability at runtime.
5</p>
    </sec>
    <sec id="sec-5">
      <title>Evaluation &amp; validation plan</title>
      <p>
        Currently, the plan for the evaluation of the approach is based on two case
studies. The first one is a family of languages for finite state machines that
presents the syntax and semantics variation points that can be found in languages
for expressing state machines and that have been largely studied in the literature
(e.g., [
        <xref ref-type="bibr" rid="ref3 ref6">3,6</xref>
        ]). It is worth to mention that this is a real-world case study that can be
found in the context of Thales. The second case study is a family of languages for
extended feature models that shows the different interpretation that constructs
such as feature attributes have received.
      </p>
      <p>The two case studies include the three dimensions of the variability. However,
whereas the former requires the implementation of operational semantics the
second one requires the implementation of translational semantics. Like this,
we validate that our ideas are useful not only for executable DSLs but also for
generative approaches that use transformations.</p>
    </sec>
    <sec id="sec-6">
      <title>Current status and planned time-line</title>
      <p>The planned time-line in terms of the expected contributions and current status
(dash line) is shown in Figure 2.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>The research presented in this paper is supported by the European Union within
the FP7 Marie Curie Initial Training Network “RELATE" under grant agreement
number 264840 and VaryMDE, a collaboration between Thales and INRIA.
7. O. Group. Uml specification, version 2.0, 2005.
8. D. Harel and H. Kugler. The rhapsody semantics of statecharts (or, on the
executable core of the uml). In H. Ehrig, W. Damm, J. Desel, M. Grofle-Rhode,
W. Reif, E. Schnieder, and E. Westk% mper, editors, Integration of Software
Specification Techniques for Applications in Engineering, volume 3147 of Lecture Notes
in Computer Science, pages 325–354. Springer Berlin Heidelberg, 2004.
9. D. Harel and A. Naamad. The statemate semantics of statecharts. ACM Trans.</p>
      <p>Softw. Eng. Methodol., 5(4):293–333, Oct. 1996.
10. F. Heidenreich, J. Johannes, S. Karol, M. Seifert, and C. Wende. Derivation and
refinement of textual syntax for models. In R. Paige, A. Hartman, and A. Rensink,
editors, Model Driven Architecture - Foundations and Applications, volume 5562
of Lecture Notes in Computer Science, pages 114–129. Springer Berlin Heidelberg,
2009.
11. J.-M. Jézéquel, D. Méndez-Acuña, T. Degueule, B. Combemale, and O. Barais.</p>
      <p>When Systems Engineering Meets Software Language Engineering. In CSD&amp;M’14
- Complex Systems Design &amp; Management, Paris, France, Nov. 2014. Springer.
12. A. Kleppe. Software Language Engineering: Creating Domain-Specific Languages</p>
      <p>Using Metamodels. Addison-Wesley Professional, 1 edition, 2008.
13. H. Krahn, B. Rumpe, and S. V^lkel. Monticore: a framework for compositional
development of domain specific languages. International Journal on Software Tools
for Technology Transfer, 12(5):353–372, 2010.
14. J. Liebig, R. Daniel, and S. Apel. Feature-oriented language families: A case study.</p>
      <p>In Proceedings of the Seventh International Workshop on Variability Modelling of
Software-intensive Systems, VaMoS ’13, pages 11:1–11:8, New York, NY, USA,
2013. ACM.
15. B. Mora, F. GarcÌa, F. Ruiz, and M. Piattini. Graphical versus textual software
measurement modelling: an empirical study. Software Quality Journal, 19(1):201–
233, 2011.
16. E. Vacchi, W. Cazzola, S. Pillay, and B. Combemale. Variability support in
domainspecific language development. In M. Erwig, R. Paige, and E. Wyk, editors,
Software Language Engineering, volume 8225 of Lecture Notes in Computer Science,
pages 76–95. Springer International Publishing, 2013.
17. S. Zschaler, P. S nchez, J. Santos, M. AlfÈrez, A. Rashid, L. Fuentes, A. Moreira,
J. Ara˙jo, and U. Kulesza. Vml* ? a family of languages for variability
management in software product lines. In M. van den Brand, D. Ga?evi?, and J. Gray,
editors, Software Language Engineering, volume 5969 of Lecture Notes in Computer
Science, pages 82–102. Springer Berlin Heidelberg, 2010.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>W.</given-names>
            <surname>Cazzola</surname>
          </string-name>
          .
          <article-title>Domain-specific languages in few steps: The neverlang approach</article-title>
          .
          <source>In In Proc. of Intl. Conf. on Software Composition (SC)</source>
          , volume
          <volume>7306</volume>
          <source>of LNCS</source>
          , pages
          <fpage>162</fpage>
          -
          <lpage>177</lpage>
          . Springer,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>M.</given-names>
            <surname>Cengarle</surname>
          </string-name>
          , H. Gr^nniger, and
          <string-name>
            <given-names>B.</given-names>
            <surname>Rumpe</surname>
          </string-name>
          .
          <article-title>Variability within modeling language definitions</article-title>
          .
          <source>In A. Sch¸rr and B</source>
          . Selic, editors,
          <source>Model Driven Engineering Languages and Systems</source>
          , volume
          <volume>5795</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>670</fpage>
          -
          <lpage>684</lpage>
          . Springer Berlin Heidelberg,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>M.</given-names>
            <surname>Crane</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Dingel</surname>
          </string-name>
          .
          <article-title>Uml vs. classical vs. rhapsody statecharts: not all models are created equal</article-title>
          .
          <source>Software &amp; Systems Modeling</source>
          ,
          <volume>6</volume>
          (
          <issue>4</issue>
          ):
          <fpage>415</fpage>
          -
          <lpage>435</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>H.</given-names>
            <surname>Eichelberger</surname>
          </string-name>
          and
          <string-name>
            <given-names>K.</given-names>
            <surname>Schmid</surname>
          </string-name>
          .
          <article-title>A systematic analysis of textual variability modeling languages</article-title>
          .
          <source>In Proceedings of the 17th International Software Product Line Conference, SPLC '13</source>
          , pages
          <fpage>12</fpage>
          -
          <lpage>21</lpage>
          , New York, NY, USA,
          <year>2013</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>J.-M. Favre</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Gasevic</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <article-title>L% mmel, and</article-title>
          <string-name>
            <given-names>E.</given-names>
            <surname>Pek</surname>
          </string-name>
          .
          <article-title>Empirical language analysis in software linguistics</article-title>
          .
          <source>In Software Language Engineering</source>
          , volume
          <volume>6563</volume>
          <source>of LNCS</source>
          , pages
          <fpage>316</fpage>
          -
          <lpage>326</lpage>
          . Springer,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>H.</given-names>
            <surname>Fecher</surname>
          </string-name>
          , J. Sch^nborn, M. Kyas, and W.-P. de Roever.
          <article-title>29 new unclarities in the semantics of uml 2.0 state machines</article-title>
          . In K.
          <article-title>-</article-title>
          K. Lau and R. Banach, editors,
          <source>Formal Methods and Software Engineering</source>
          , volume
          <volume>3785</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>52</fpage>
          -
          <lpage>65</lpage>
          . Springer Berlin Heidelberg,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>