Programmatic Muddle Management

                 Dimitrios S. Kolovos, Nicholas Matragkas,
               Horacio Hoyos Rodríguez, and Richard F. Paige

               Department of Computer Science, University of York,
                     Deramore Lane, York, YO10 5GH, UK
                      {dimitris.kolovos, nicholas.matragkas,
                       hhr502, richard.paige}@york.ac.uk


      Abstract. In this paper we demonstrate how diagrams constructed us-
      ing general-purpose drawing tools in the context of agile language de-
      velopment processes can be annotated and consumed by model manage-
      ment programs (such as simulators, model-to-model and model-to-text
      transformations). The aim of this work is to enable engineers to engage
      in programmatic model management early in the language development
      process, so that they can explore whether or not the languages and mod-
      els constructed are fit for purpose. We demonstrate a proof-of-concept
      prototype developed atop the Epsilon platform and a flexible graph def-
      inition language (GraphML).


1   Introduction
The quality and usefulness of a Domain Specific Language (DSL) depend on
accurately identifying the domain concepts, their features and relationships. As
such, the involvement of domain experts in the language development process
is crucial. In the early stages of the language development process, domain ex-
perts often provide informal example diagrams/sketches from which engineers
can infer a first version of the metamodel of the envisioned language. To obtain
additional feedback, engineers then need to develop an initial version of a
language-specific modelling tool that enables domain experts to further experi-
ment with the language. This typically constitutes the first step of an iterative
process during which the metamodel of the language can undergo several revi-
sions. When 3-layer modelling frameworks such as MOF/EMF are used, for each
change in the metamodel, language engineers need to update and re-deploy a new
version of the modelling tool, and for non-additive changes to the metamodel
they also need to provide support for automated migration of older models.
    To achieve shorter and more efficient iteration cycles, several techniques that
challenge this top-down metamodel-centric approach have recently been proposed.
In such approaches, the early phases of the language development process
involve the construction of example diagrams using flexible drawing tools, which
can be used to (semi-)automatically infer a rigid metamodel only once sufficient
confidence in the completeness and maturity of the language has been developed.
    In this paper we argue that example diagrams constructed in the context of
this process should also be processable by model management programs (such
as simulators, model-to-model and model-to-text transformations) so that engi-
neers can develop additional and early confidence that the constructed language
is fit for purpose. The rest of the paper is organised as follows. In Section 2 we
provide an overview of related work in the field of bottom-up and agile metamod-
elling. In Section 3 we illustrate an approach for enabling engineers to engage in
programmatic model management activities early in the language development
process, and demonstrate a proof-of-concept prototype developed atop the Ep-
silon platform and a flexible graph definition language (GraphML). In Section 4
we conclude and provide directions to further work.


2   Background and Motivation

In [1], the authors propose an example-driven approach where users are able to
construct informal diagrams using the Dia drawing tool, and these diagrams are
then used to infer appropriate metamodels in an interactive manner. Similarly,
in [2] the authors introduce a systematic semi-automated approach to create
visual DSLs from a set of domain model examples provided by an end-user.
The MetAmodel Recovery System (MARS) [3] is a semi-automatic inference-
based system for recovering a metamodel from a set of instance models through
application of grammar inference algorithms. Rather than relying on example
models provided by end-users, this approach relies on models that no longer
conform to their metamodel because the metamodel has evolved. In [4], the authors present a tool
(GraCoT) that supports co-development of EMF models and metamodels, in
a loosely-coupled manner that promotes agility and simplifies the process of
co-evolution.
    To our knowledge, research in this area so far has focused solely on agile
model construction and automated metamodel inference. In our view, to further
validate the maturity and completeness of a metamodel, it is also important for
language engineers to develop some confidence that models conforming to this
metamodel can support the automated model management operations involved
in the envisioned MDE workflow (simulation, model-to-model and model-to-text
transformation, etc.).


3   Proposed Approach

In this paper we illustrate an approach for rendering diagrams constructed using
general-purpose drawing tools amenable to programmatic model management.
An overview of the proposed approach is illustrated in Figure 1. Consistent
with previously proposed bottom-up metamodelling techniques, language engineers
and domain experts start the language development process by drawing diagrams
of example models that (conceptually) conform to the envisioned language, using
a general-purpose diagram drawing tool.
    In the next stage, engineers can augment these conceptual diagrams using
a set of predefined textual annotations (discussed in Section 3.2) to specify the
types and features of diagram elements of interest in an agile manner. Anno-
tated diagrams are then automatically transformed into an intermediate repre-
sentation (muddle) that can be programmatically managed using existing model
management languages.
    In this work we use GraphML, the conceptual metamodel of which is illus-
trated in Figure 2, for diagram drawing, and languages of the Epsilon platform
[5] for automated model management, but in principle this approach should be
applicable to other diagram formats and model management languages.


                           Fig. 1. Process Overview


                         Fig. 2. GraphML Metamodel


3.1   Running Example
We illustrate the process of constructing, annotating, and programmatically
managing GraphML diagrams through a running example. In this example, our
aim is to define a flowchart language that supports timed events and delays. To
develop some confidence that the envisioned language is feature-complete, we
also need to implement a proof-of-concept program that can execute/simulate
models that conform to the language.
        We start by using the yEd1 GraphML-compliant tool to draw an example
     diagram that conceptually conforms to the envisioned flowchart language. The
     diagram, illustrated in Figure 3, consists of labeled rectangles which conceptually
     represent actions, a diamond which represents a decision, directed edges which
     represent transitions, a hexagon that represents the triggering event, a circle
     which represents a delay, and a hexagon which represents the time at which the
     attached event should fire for the first time.


                                  Fig. 3. Flowchart Diagram

         We now take a leap and in Listing 1.1 we present the implementation of
     a simple simulator for such flowcharts, expressed in the Epsilon Object Lan-
     guage [6], an imperative OCL-based model query and transformation language.
     We provide a brief overview of the behaviour and the organisation of the simu-
     lator and then demonstrate how we need to annotate the diagram of Figure 3
     so that the simulator program can use it as an input model that can drive its
     execution.

 1   var event = Event.all.selectOne(e|e.entryPoint = true);
 2   var time = event.time.hours.toMinutes();
 3   event.process();
 4
 5   operation Event process() {
 6     ("Event: " + self.name + " at " + time.toHours()).println();
 7     self.outgoing.at(0).target.process();
 8   }
 9
10   operation Action process() {
11     ("Action: " + self.name).println();
12     if (not self.outgoing.isEmpty()) {
13       self.outgoing.at(0).target.process();
14     }
15   }
16   operation Decision process() {
17     ("Decision: " + self.name).println();
18     var random = self.outgoing.random();
19     ("Chose: " + random.name).println();
20     random.target.process();
21   }
22
23   operation Delay process() {
24     time = time + self.mins;
25     ("Waited for " + self.mins + "mins").println();
26     self.outgoing.at(0).target.process();
27   }
28
29   operation String toMinutes() : Integer {
30     var parts = self.split(":");
31     return parts[0].asInteger() * 60 + parts[1].asInteger();
32   }
33
34   operation Integer toHours() : String {
35     return (self / 60).asString().pad(2, "0", false) +
36       ":" + (self - (self / 60)*60).asString().pad(2, "0", false);
37   }
                             Listing 1.1. Simple flowchart simulator

     1 http://www.yworks.com/en/products_yed_about.html

         – Assuming that a flowchart can contain many events, in line 1 we select one
           event that has its entryPoint attribute set to true;
         – In line 2, we keep a copy of the time (converted to minutes) at which this
           event is fired for the first time;
         – In line 3, we process the selected event. Calls to process() operations
           are dynamically dispatched depending on the type of their context object,
           and behave as discussed below;
         – The Event.process() operation prints a message and processes the target of
           its first outgoing transition;
         – The Action.process() operation prints a message and then, if the action has
           any outgoing transition, it processes the target of the first of them;
         – The Decision.process() operation chooses a random outgoing transition, prints
           its name and processes its target;
         – The Delay.process() operation adds the delay time to the global time, prints
           a message and then processes the target of its first outgoing transition;
         – The toMinutes() and toHours() operations can convert HH:MM-formatted
           time strings to integers (number of minutes) and vice versa.
           A sample execution trace of the simulator appears below.

     Event: Alarm clock rings at 08:00
     Action: Wake up
     Decision: Is it too early?
     Chose: yes
     Action: Hit snooze
     Waited for 10mins
     Event: Alarm clock rings at 08:10
     Action: Wake up
     Decision: Is it too early?
     Chose: no
     Action: Get up
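
The control flow of Listing 1.1 can also be followed outside Epsilon. The Python sketch below mirrors its behaviour under several assumptions that are not part of the prototype: the flowchart is a hypothetical dictionary rather than a muddle, the key of the delay node ("delay") is invented, recursion is replaced by a loop, and the random choice made by the Decision is replaced by an injectable choose function so that a run is repeatable.

```python
def to_minutes(hhmm: str) -> int:
    """Convert an HH:MM string to a number of minutes (cf. toMinutes())."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def to_hours(total: int) -> str:
    """Convert a number of minutes to a zero-padded HH:MM string (cf. toHours())."""
    return f"{total // 60:02d}:{total % 60:02d}"


# Hypothetical stand-in for the diagram of Figure 3. Each outgoing transition
# is a (label, target) pair; the "delay" key is an invented name.
FLOWCHART = {
    "Alarm clock rings": {"type": "Event", "outgoing": [("", "Wake up")]},
    "Wake up": {"type": "Action", "outgoing": [("", "Is it too early?")]},
    "Is it too early?": {"type": "Decision",
                         "outgoing": [("yes", "Hit snooze"), ("no", "Get up")]},
    "Hit snooze": {"type": "Action", "outgoing": [("", "delay")]},
    "delay": {"type": "Delay", "mins": 10,
              "outgoing": [("", "Alarm clock rings")]},
    "Get up": {"type": "Action", "outgoing": []},
}


def simulate(nodes, start, start_time, choose):
    """Walk the flowchart from `start` and return the trace as a list of lines."""
    trace = []
    time = to_minutes(start_time)
    current = start
    while current is not None:
        node = nodes[current]
        kind, outgoing = node["type"], node["outgoing"]
        if kind == "Event":
            trace.append(f"Event: {current} at {to_hours(time)}")
            current = outgoing[0][1]
        elif kind == "Action":
            trace.append(f"Action: {current}")
            current = outgoing[0][1] if outgoing else None
        elif kind == "Decision":
            trace.append(f"Decision: {current}")
            label, target = choose(outgoing)
            trace.append(f"Chose: {label}")
            current = target
        elif kind == "Delay":
            time += node["mins"]
            trace.append(f"Waited for {node['mins']}mins")
            current = outgoing[0][1]
        else:
            raise ValueError(f"unknown element type: {kind}")
    return trace


# Scripted answers replace the random choice so the run is deterministic.
answers = iter(["yes", "no"])


def scripted(outgoing):
    wanted = next(answers)
    return next(t for t in outgoing if t[0] == wanted)


trace = simulate(FLOWCHART, "Alarm clock rings", "08:00", scripted)
print("\n".join(trace))
```

Passing a function such as `lambda outgoing: random.choice(outgoing)` (after importing random) instead of `scripted` would bring the sketch closer to the random behaviour of the Decision.process() operation.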


     3.2     Annotating GraphML Diagrams
     To facilitate the execution of model management programs such as the one il-
     lustrated in Listing 1.1, we need to annotate diagram elements with additional
     information. For example, we need to declare that the type of all rectangle nodes
     in this diagram is Action, and that the type of directed edges is Transition. As
     GraphML does not provide built-in support for capturing type-related informa-
     tion for nodes and edges, we need to use GraphML’s extensibility facilities2 to
     define Type extension fields for nodes and edges.
     2 http://docs.yworks.com/yfiles/doc/developers-guide/graphml.html
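
As an illustration of how such an extension field might be read, the Python sketch below parses a small GraphML fragment with the standard library. It assumes that the Type field is stored through GraphML's standard key/data mechanism; the key ids (d0, d1), the node and edge ids, and the fragment itself are invented for this example and are not taken from the prototype.

```python
import xml.etree.ElementTree as ET

# Namespace of the GraphML format.
NS = {"g": "http://graphml.graphdrawing.org/xmlns"}

# Hypothetical fragment: ids and contents are made up for illustration.
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key id="d0" for="node" attr.name="Type" attr.type="string"/>
  <key id="d1" for="edge" attr.name="Type" attr.type="string"/>
  <graph id="G" edgedefault="directed">
    <node id="n0"><data key="d0">Action &gt; FlowchartElement</data></node>
    <node id="n1"><data key="d0">Decision</data></node>
    <edge id="e0" source="n0" target="n1"><data key="d1">Transition</data></edge>
  </graph>
</graphml>
"""


def read_types(xml_text):
    """Return a mapping from node/edge id to the raw value of its Type field."""
    root = ET.fromstring(xml_text)
    # Collect the ids of all keys that declare a field named "Type".
    type_keys = {key.get("id")
                 for key in root.findall("g:key", NS)
                 if key.get("attr.name") == "Type"}
    types = {}
    for tag in ("node", "edge"):
        for element in root.iterfind(f".//g:{tag}", NS):
            for data in element.findall("g:data", NS):
                if data.get("key") in type_keys:
                    types[element.get("id")] = (data.text or "").strip()
    return types


types_by_id = read_types(SAMPLE)
print(types_by_id)
```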

    The value of the Type extension field of a node/edge needs to adhere to the
name (> name)* pattern, where > is used to denote inheritance. For exam-
ple, by setting the Type field of the Wake up node to Action > FlowchartEle-
ment, we define that the node is an instance of the Action type, and that the
FlowchartElement type is a super-class of Action. All types are unique by name
and are created the first time they are encountered in the diagram. For example,
by subsequently setting the Type field of Hit snooze to Action, we are reusing
the Action type defined in Wake up instead of creating a new one. Beyond
type-related information, we also need to capture additional information using
the GraphML extensions summarised in Table 1.

                            Table 1: GraphML extensions

Extension       For         Description                           Pattern
Properties      Node, Edge  Descriptors and values for primitive  (int|String|boolean|real)? (\*)? name (= value)?
                            attributes of nodes/edges
Default         Node, Edge  Descriptor of the slot under which    (int|String|boolean|real)? name
                            the first label of the node/edge
                            should be made accessible
Source role     Edge        Descriptor of the role of the source  name (\*)?
                            end of the edge
Target role     Edge        Descriptor of the role of the target  name (\*)?
                            end of the edge
Role in source  Edge        Descriptor of the role of the edge    name (\*)?
                            in its source node
Role in target  Edge        Descriptor of the role of the edge    name (\*)?
                            in its target node
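
The name (> name)* pattern of the Type field, introduced before Table 1, can be sketched in Python as follows. This is an illustration only: the rule that a name is an identifier-like token, the reading of a longer chain such as A > B > C as "A extends B, B extends C", and the register_type helper are assumptions rather than details of the prototype.

```python
import re

NAME = r"[A-Za-z_]\w*"  # assumption: names are identifier-like tokens
TYPE_PATTERN = re.compile(rf"^\s*{NAME}(\s*>\s*{NAME})*\s*$")


def register_type(value, supertypes):
    """Record the types named in a Type field and return the element's own type.

    `supertypes` maps each known type name to the set of its direct
    supertypes; types are created the first time they are encountered.
    """
    if not TYPE_PATTERN.match(value):
        raise ValueError(f"invalid Type field: {value!r}")
    chain = [part.strip() for part in value.split(">")]
    for name in chain:
        supertypes.setdefault(name, set())
    # Each name in the chain is a subtype of the name that follows it.
    for sub, sup in zip(chain, chain[1:]):
        supertypes[sub].add(sup)
    return chain[0]


registry = {}
wake_up = register_type("Action > FlowchartElement", registry)
hit_snooze = register_type("Action", registry)  # reuses the existing Action type
print(wake_up, hit_snooze, registry)
```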


    The value of the Properties field of a node/edge can contain zero or more
lines of text. Each line needs to adhere to the pattern above and define the
type, multiplicity, name and value of the property. For example, by setting the
value of the Type field of the Alarm clock rings node to Event and the text of
its Properties field to boolean startEvent = true, we define that the node has a
single-valued boolean startEvent property, with a value set to true.
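
A possible reading of the Properties pattern as a regular expression is sketched below in Python. Several details are assumptions: names are taken to be identifier-like tokens, an omitted type is taken to mean String, and, since the syntax for the values of multi-valued (*) properties is not given here, such values are left unconverted. The second example line (int * counts) is invented.

```python
import re

PROPERTY = re.compile(
    r"^\s*(?:(?P<type>int|String|boolean|real)\s+)?"  # optional primitive type
    r"(?:(?P<many>\*)\s*)?"                           # optional multiplicity marker
    r"(?P<name>[A-Za-z_]\w*)"                         # property name
    r"(?:\s*=\s*(?P<value>.*?))?\s*$"                 # optional value
)

CONVERT = {
    "int": int,
    "real": float,
    "boolean": lambda text: text.strip().lower() == "true",
    "String": str,
}


def parse_property(line):
    """Parse one line of a Properties field into a small dictionary."""
    match = PROPERTY.match(line)
    if match is None:
        raise ValueError(f"not a property descriptor: {line!r}")
    ptype = match.group("type") or "String"  # assumption: String by default
    many = match.group("many") is not None
    raw = match.group("value")
    # Only single-valued properties are converted; see the assumptions above.
    value = raw if raw is None or many else CONVERT[ptype](raw)
    return {"type": ptype, "many": many, "name": match.group("name"), "value": value}


def parse_properties(text):
    """Parse every non-empty line of a Properties field."""
    return [parse_property(line) for line in text.splitlines() if line.strip()]


example = parse_properties("boolean startEvent = true\nint * counts")
print(example)
```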
    The value of the Default field should conform to the pattern above and define
the name of the default slot of the node/edge and, optionally, its primitive type
(defaults to String). For example, by setting the Default field of the Wake up
node to name, the first label of the node that does not match the property
descriptor pattern (in this case, the Wake up label), will be made accessible
through a name property of type String.
    The values of the Source role, Target role, Role in source, and Role in target
fields of an edge define the name and multiplicity of the respective roles. For
example, in the yes transition we define the following values for these properties:
Source role: source, Target role: target, Role in source: outgoing *, Role in target:
incoming *.
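
The role descriptors follow the name (\*)? pattern of Table 1. The Python sketch below parses the four values given for the yes transition; the parse_role helper and the identifier-like rule for names are assumptions made for illustration.

```python
import re

ROLE = re.compile(r"^\s*(?P<name>[A-Za-z_]\w*)\s*(?P<many>\*)?\s*$")


def parse_role(text):
    """Return (name, is_multi_valued) for a role descriptor."""
    match = ROLE.match(text)
    if match is None:
        raise ValueError(f"invalid role descriptor: {text!r}")
    return match.group("name"), match.group("many") is not None


# The values set on the "yes" transition in the running example.
yes_transition = {
    "Source role": "source",
    "Target role": "target",
    "Role in source": "outgoing *",
    "Role in target": "incoming *",
}
roles = {field: parse_role(value) for field, value in yes_transition.items()}
print(roles)
```

Under this reading, a transition exposes single-valued source and target references, while each node exposes multi-valued outgoing and incoming collections, which is what allows Listing 1.1 to navigate self.outgoing.at(0).target.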


3.3   Deriving a Muddle

The next step of the process is to parse the annotated GraphML diagram and
construct an intermediate model (muddle) that conforms to the metamodel of
Figure 4. This is achieved through a multi-pass transformation which is trans-
parent to the end-user and which comprises the following steps.


                     Fig. 4. Intermediate (Muddle) Metamodel


 1. For every typed node in the graph, it creates an empty MuddleElement in
    the intermediate model and its corresponding MuddleElementType (if the
    latter does not already exist). It also looks for nodes for which the Default
    field has a valid value. When this happens, the value of the Default field is
    used to produce a primary Feature which is added to the type of the created
    MuddleElement;
 2. Iterates through the created elements and creates/populates their slots, based
    on the descriptors provided in the Properties field of the node. Again, for
    each new property a Feature is created and added to the type of the ele-
    ment. As such, by setting the value of the Properties field of Alarm clock
    rings to boolean startEvent = true, all model elements of type Event obtain
    a single-valued startEvent boolean feature;
 3. Iterates through the labeled and untyped edges of the graph (e.g. the time
    edge in the diagram of Figure 3). For each edge, it adds an untyped Feature
    to the type of its source muddle element, a respective Slot to the source
    muddle element, and adds the target of the edge to the values of the slot;
 4. Iterates through the unlabeled and untyped edges of the graph and attempts
    to fit their targets into appropriate slots of the source muddle elements (i.e.
    slots that already contain at least one value of the same type);
     5. For every typed edge of the graph it creates an empty MuddleElement and
        its corresponding LinkElementType, similar to what was discussed for nodes
        in step 1. It also attempts to create primary, role in source, role in target,
        source and target Features for the created LinkElementTypes;
     6. Iterates through the typed edges of the graph and creates/populates their
        slots similar to what was discussed in step 2;
     7. Adjusts the multiplicities of features based on the maximum number of values
        of their slots. In this process, single-valued features whose slots contain
        more than one value become multi-valued (but not the opposite).
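
A simplified Python sketch of steps 1, 2, 3 and 7 above is given below. It is not the prototype's implementation: the dictionary-based input format, the representation of slots as plain lists, and the Time nodes and second time edge in the sample data are all assumptions introduced for illustration. Only the class names MuddleElement, MuddleElementType and Feature follow the text.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Feature:
    name: str
    type: Optional[str] = None  # None stands for an untyped feature
    many: bool = False


@dataclass
class MuddleElementType:
    name: str
    features: dict = field(default_factory=dict)  # feature name -> Feature


@dataclass
class MuddleElement:
    type: MuddleElementType
    slots: dict = field(default_factory=dict)  # feature name -> list of values


def derive_muddle(nodes, edges):
    types, elements = {}, {}
    # Step 1: one element per typed node; its type is created on first use.
    for node in nodes:
        if node.get("type"):
            etype = types.setdefault(node["type"], MuddleElementType(node["type"]))
            elements[node["id"]] = MuddleElement(etype)
    # Step 2: create features and populate slots from parsed property descriptors.
    for node in nodes:
        element = elements.get(node["id"])
        if element is None:
            continue
        for ptype, name, value in node.get("properties", []):
            element.type.features.setdefault(name, Feature(name, ptype))
            element.slots.setdefault(name, []).append(value)
    # Step 3: labeled, untyped edges become untyped features of the source type.
    for edge in edges:
        source = elements.get(edge["source"])
        target = elements.get(edge["target"])
        if edge.get("type") or not edge.get("label") or not source or not target:
            continue
        source.type.features.setdefault(edge["label"], Feature(edge["label"]))
        source.slots.setdefault(edge["label"], []).append(target)
    # Step 7: a feature becomes multi-valued once any slot holds several values.
    for element in elements.values():
        for name, values in element.slots.items():
            if len(values) > 1:
                element.type.features[name].many = True
    return types, elements


# Hypothetical input; the second "time" edge exists only to exercise step 7.
nodes = [
    {"id": "n0", "type": "Event", "properties": [("boolean", "startEvent", True)]},
    {"id": "n1", "type": "Time", "properties": [("String", "hours", "08:00")]},
    {"id": "n2", "type": "Time", "properties": [("String", "hours", "09:00")]},
]
edges = [
    {"source": "n0", "target": "n1", "label": "time"},
    {"source": "n0", "target": "n2", "label": "time"},
]
types, elements = derive_muddle(nodes, edges)
print(sorted(types), types["Event"].features["time"].many)
```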


    3.4   Consuming Muddles in Epsilon Programs

    Epsilon provides an abstraction layer (Epsilon Model Connectivity – EMC3 )
    that shields the languages of the platform from the intricacies of concrete model
    representations and enables them to access models conforming to a wide range of
    technologies. To enable Epsilon languages to access muddles, we have developed
    a new driver that implements the set of interfaces required by EMC. Due to
    space restrictions, a detailed discussion on the new driver is beyond the scope of
    this paper.
        The driver enables all languages in Epsilon to query muddles. For example,
    in addition to the simulator of Listing 1.1, Listing 1.2 demonstrates an exemplar
    constraint written in the validation language of the platform (EVL4 ), and Listing
    1.3 demonstrates an exemplar model-to-text transformation written in EGL5 .
    context Decision {
      // A decision needs at least two alternative outgoing transitions
      constraint HasMoreThanOneOutgoingTransitions {
        check: self.outgoing.size() >= 2
        message: "Decision " + self.name +
          " needs to have at least 2 outgoing transitions"
      }
    }

                   Listing 1.2. Validation constraint for flowchart models

    The flowchart has [%=Action.all.size()%] actions:
      [%for (action in Action.all) {%]
      - [%=action.name%]
      [%}%]

              Listing 1.3. Model-to-text transformation for flowchart models
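
For readers unfamiliar with EVL and EGL, the Python sketch below approximates what the two listings compute. It reads the constraint as "a Decision needs at least two outgoing transitions", as its message states. The ELEMENTS list is a hypothetical stand-in for a muddle, the "delay" target name is invented, and the generated text is not claimed to match EGL's whitespace handling.

```python
# Hypothetical stand-in for a muddle: each element has a type, a name and the
# names of the targets of its outgoing transitions.
ELEMENTS = [
    {"type": "Action", "name": "Wake up", "outgoing": ["Is it too early?"]},
    {"type": "Decision", "name": "Is it too early?",
     "outgoing": ["Hit snooze", "Get up"]},
    {"type": "Action", "name": "Hit snooze", "outgoing": ["delay"]},
    {"type": "Action", "name": "Get up", "outgoing": []},
]


def all_of(type_name):
    """Rough analogue of Type.all: every element of the given type."""
    return [e for e in ELEMENTS if e["type"] == type_name]


def check_decisions():
    """Return one message per Decision with fewer than two outgoing transitions."""
    return [
        f"Decision {d['name']} needs to have at least 2 outgoing transitions"
        for d in all_of("Decision")
        if len(d["outgoing"]) < 2
    ]


def render_actions():
    """Produce a summary similar in content to the output of Listing 1.3."""
    actions = all_of("Action")
    lines = [f"The flowchart has {len(actions)} actions:"]
    lines += [f"  - {a['name']}" for a in actions]
    return "\n".join(lines)


problems = check_decisions()
report = render_actions()
print(problems)
print(report)
```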


    4     Conclusions and Further Work

In this paper we have argued for the importance of enabling engineers to engage
in exploratory model management operations early on in the language development
process, and demonstrated an approach and a prototype that enable
engineers to annotate and programmatically manage GraphML diagrams using
languages of the Epsilon platform. In the future, we plan to investigate supporting
additional GraphML constructs such as subgraphs and hyperedges.
    In our view, while constructing diagrams using general-purpose drawing
tools can be very useful in the early phases of the language development
process, it can become cumbersome and error-prone as the example diagrams
and the DSL become larger and more mature, at which stage a transition to a
language-specific modelling tool should be considered. To reduce the overhead of
this transition, we plan to investigate inferring annotated metamodels that can
then be consumed by tools such as Eugenia6 to automatically generate
language-specific model editors.

3 http://www.eclipse.org/epsilon/doc/emc
4 http://www.eclipse.org/epsilon/doc/evl
5 http://www.eclipse.org/epsilon/doc/egl


Acknowledgements
This research was supported in part by the EPSRC, through the Large-Scale
Complex IT Systems project (EP/F001096/1), and by the EU, through the Automated
Measurement and Analysis of Open Source Software (OSSMETER) FP7
STREP project (318736).


References
1. Jesús Sánchez-Cuadrado, Juan de Lara, and Esther Guerra. Bottom-up
   meta-modelling: An interactive approach. In Robert France, Jürgen Kazmeier, Ruth
   Breu, and Colin Atkinson, editors, Model Driven Engineering Languages and
   Systems, volume 7590 of Lecture Notes in Computer Science, pages 3–19. Springer
   Berlin Heidelberg, 2012.
2. Hyun Cho, J. Gray, and E. Syriani. Creating visual domain-specific modeling lan-
   guages from end-user demonstration. In Modeling in Software Engineering (MISE),
   2012 ICSE Workshop on, pages 22–28, 2012.
3. Faizan Javed, Marjan Mernik, Jeff Gray, and Barrett R. Bryant. MARS: A metamodel
   recovery system using grammar inference. Inf. Softw. Technol., 50(9-10):948–968,
   August 2008.
4. P. Gómez, M. Sánchez, and J. Villalobos. GraCoT, a tool for co-creation of models
   and metamodels in specific domains. In Proc. Academics Tooling with Eclipse (ACME
   2013) at European Conference on Object-Oriented Programming (ECOOP2013).
   ACM, 2013.
5. Richard F. Paige, Dimitrios S. Kolovos, Louis M. Rose, Nicholas Drivalos, and
   Fiona A.C. Polack. The Design of a Conceptual Framework and Technical
   Infrastructure for Model Management Language Engineering. In Proc. 14th IEEE
   International Conference on Engineering of Complex Computer Systems, Potsdam,
   Germany, 2009.
6. Dimitrios S. Kolovos, Richard F. Paige, and Fiona A.C. Polack. The Epsilon Object
   Language. In Proc. European Conference in Model Driven Architecture (EC-MDA)
   2006, volume 4066 of LNCS, pages 128–142, Bilbao, Spain, July 2006.

6 http://www.eclipse.org/epsilon/doc/eugenia