Constructing Scalable Domain-Specific
           Graphical Modelling Languages

                               Antonio Garmendia

                     Universidad Autónoma de Madrid (Spain)
                            antonio.garmendia@uam.es


      Abstract. The adoption of Model-Driven Engineering (MDE) as a software
      development paradigm has been growing in recent years. Creating
      Domain-Specific Modelling Languages (DSMLs) is a frequent task, because
      such languages offer several benefits for describing a particular
      domain. However, the construction of graphical DSMLs is technically
      challenging, and the generated environments do not scale well to large
      models.
      To improve this situation, in this thesis we propose borrowing
      modularity concepts from programming languages. In this manner, DSMLs
      will be enriched with fragmentation strategies, which will produce
      environments able to seamlessly partition models across folders and
      files. To facilitate the definition of the graphical syntax, we propose
      a guided, heuristics-based approach that is able to profit from the
      defined fragmentation strategy.

      Keywords: Domain-Specific Modelling Languages, Graphical Modelling
      Environments, Scalable Modelling, Meta-modelling, Modularity.


1   Introduction and Problem Statement
The Model-Driven Engineering (MDE) paradigm proposes developing software by
means of models that have a higher level of abstraction than code [13]. Hence,
models are used to automate many activities, such as code generation, system
simulation or testing. An important concept within MDE is that of
Domain-Specific Modelling Languages (DSMLs), which focus on modelling using
primitives of a particular domain.
    Applications are becoming increasingly complex, and MDE aims to reduce
production costs. However, while models have a higher level of abstraction
than code, for large systems they may become large and unwieldy as well.
Several tools focus on MDE; among them, the Eclipse Modeling Framework (EMF)
[14] is a widespread technology with a set of compatible plug-ins that
facilitate the creation of DSMLs. Existing technology, including EMF, lacks
appropriate mechanisms for fragmenting models, which are generally monolithic.
This makes them heavy and difficult for tools to process and for people to
understand.
    With respect to models, there is no system that facilitates modularity;
therefore, composition, extension and reuse become difficult. Eclipse JDT, for
example, organises Java projects into packages, fragments and compilation
units. I propose to organise a model like a programming project, fragmenting
it into folders and files. Once the fragmentation is applied, there is a need
to define scoping rules for references, visibility rules for model elements
and mechanisms for the efficient checking of OCL constraints. Modularity in
models would also enable the creation of libraries of abstractions that
provide standard solutions to already solved problems.
    A further difficulty is the definition of graphical editors for DSMLs.
Graphical frameworks do not scale well and offer little support for
scalability mechanisms: they barely provide layers and hierarchical drill-down
to describe the different levels of the model. Thus, I propose a guided,
heuristics-based approach to help users define the graphical syntax and
afterwards generate these environments. Accordingly, the generated editor will
be able to profit from the defined fragmentation strategy.
    In summary, the proposal is to establish modularity mechanisms that
provide scalability for DSMLs and, in addition, to facilitate the generation
of graphical modelling environments that support this modularity.


2   Related Work

This research attempts to provide scalability through modularity techniques to
build modelling environments for DSMLs. Therefore, we focus on works dealing
with the need to process large models. In addition, we review frameworks for
the generation of graphical editors. Our approach consists in generating
structured model editors from meta-models and speeding up the generation of
graphical environments through patterns and wizards.
    Several works use different techniques to process large models and obtain
performance gains. Among them, Scheidgen et al. [12] propose a persistence
framework, based on EMF, that allows automatic and transparent fragmentation
to create, update, traverse and query models. Another example is Morsa [5],
which uses a NoSQL database to provide scalable access to large models.
Finally, the most mature technology is CDO [4], a repository of EMF models
that needs to pre-process meta-models in order to persist their instances. The
first two works compared their technology with CDO and found scalability
issues in the latter.
    On the other hand, some works decompose models into sub-models to enhance
their comprehensibility. In [15], Strüber et al. describe a tool that uses
information retrieval algorithms for model splitting and an engine to obtain
several sub-models. Another example is described by Kelsen et al. [10], who
provide a mathematical description of a linear-time algorithm for constructing
the decomposition of models, but there is no tool support.
    A collection of tools proposes MDE solutions for the development of
graphical editors compatible with EMF. Some Eclipse plug-ins for constructing
diagram editors are GMF¹, Eugenia², Graphiti³ and Sirius⁴. Graphiti provides a
Java API for coding, whereas the other frameworks are model-based. Sirius is
the graphical modelling framework that is currently becoming popular in the
MDE community. However, using these frameworks requires expert knowledge, and
they sometimes require the configuration of complex environment
specifications. Besides that, the constructed editors do not scale well,
because they are based on building monolithic models. Our approach is to
generate the scalable environment with the required settings.

Fig. 1: General scheme for the semi-automatic generation of the modelling environment.


3    Proposed solution

We are going to base our tool on DSLtao, which provides an environment to
design DSMLs through patterns. DSLtao proposes a catalogue of patterns divided
into domain, design, concrete syntax, dynamic semantics and infrastructure
patterns [11]. These patterns may include services that contribute to the
functionality of the generated environment. For this thesis, we are going to
provide a number of patterns and services to create scalable environments for
graphical DSMLs.
    We plan to create the environment for DSMLs according to the scheme in
Fig. 1. Our tool will take as input a meta-model with instances of patterns
such as Modularity, Filter, Scoping and Visibility, together with its
corresponding graphical representation. The DSML designer will interact with
the tool through wizards that facilitate the creation of diagram editors,
proposing efficient and flexible schemes that may be adapted to different
problems. After applying the patterns to the DSML meta-model, the modelling
environment will be automatically generated.

¹ https://wiki.eclipse.org/Graphical_Modeling_Framework
² http://eclipse.org/epsilon/doc/eugenia/
³ http://eclipse.org/graphiti/
⁴ https://www.eclipse.org/sirius/

    In terms of fragmentation, the modularity mechanism that we currently
support [6] fragments models along composition relationships between classes.
To make the approach more flexible, we also plan to provide fragmentation
strategies that do not depend only on composition relationships. We plan to
apply graph-theory-based clustering algorithms to models, as well as
heuristics to detect when models need to be split due to their large size
[16]. With this approach, we can provide fragmentation strategies that do not
depend on composition, and therefore a non-hierarchical fragmentation can be
carried out.
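
    As a rough illustration of non-hierarchical fragmentation, the following
Python sketch applies a size-based heuristic and then groups model elements
with connected components, which is used here only as the simplest graph-based
stand-in for the clustering algorithms surveyed in [16]. The element names, the
`max_size` threshold and both function names are hypothetical and do not come
from EMF Splitter.

```python
from collections import defaultdict


def needs_split(elements, max_size):
    """Size heuristic: a model is a candidate for splitting when it
    holds more elements than the (hypothetical) threshold."""
    return len(elements) > max_size


def cluster_by_references(elements, references):
    """Group elements into fragments using connected components over
    the reference graph (references are treated as undirected edges)."""
    neighbours = defaultdict(set)
    for source, target in references:
        neighbours[source].add(target)
        neighbours[target].add(source)

    fragments, seen = [], set()
    for start in elements:
        if start in seen:
            continue
        # Depth-first traversal collects everything reachable from start.
        component, stack = [], [start]
        seen.add(start)
        while stack:
            current = stack.pop()
            component.append(current)
            for nxt in sorted(neighbours[current]):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        fragments.append(sorted(component))
    return fragments


# Hypothetical model: two groups of elements that never reference each other.
elements = ["Engine", "Piston", "Valve", "Seat", "Belt"]
references = [("Engine", "Piston"), ("Piston", "Valve"), ("Seat", "Belt")]

if needs_split(elements, max_size=3):
    fragments = cluster_by_references(elements, references)
else:
    fragments = [sorted(elements)]
```

    Each resulting fragment could then be stored in its own file, regardless
of where its elements sit in the containment hierarchy.
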
    Using this modular organization, we plan to add a scoping mechanism that
makes objects visible or not to other model elements, depending on their
location in the file system or on whether they satisfy a certain query. We
intend to use the Epsilon Object Language (EOL)⁵ and Hawk [1] to query the
fragmented models and obtain the elements efficiently.
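
    The location-based part of such a scoping mechanism could look like the
following Python sketch. It is only an illustration under assumed rules: the
scope names (`same_package`, `same_project`), the folder layout and the
`is_visible` function are hypothetical, and the query-based scopes that would
be delegated to EOL or Hawk are not modelled here.

```python
from pathlib import PurePosixPath


def is_visible(source_file, target_file, scope):
    """Decide whether an element stored in target_file may be referenced
    from an element stored in source_file, given an assumed scope rule."""
    source = PurePosixPath(source_file)
    target = PurePosixPath(target_file)
    if scope == "same_package":
        # Both fragments must live in the same folder.
        return source.parent == target.parent
    if scope == "same_project":
        # Both fragments must share the top-level (project) folder.
        return source.parts[0] == target.parts[0]
    raise ValueError(f"unknown scope: {scope}")


# Hypothetical fragment files inside a project folder named "plant".
turbine = "plant/turbines/t1.unit"
sensor = "plant/sensors/s1.unit"
other = "office/rooms/r1.unit"

same_folder = is_visible(turbine, "plant/turbines/t2.unit", "same_package")
cross_package = is_visible(turbine, sensor, "same_package")
cross_project = is_visible(turbine, other, "same_project")
```

    An editor could evaluate a rule of this kind when it offers candidate
targets for a reference, so that only elements inside the scope are listed.
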
    For the graphical syntax, we are going to develop a dedicated wizard that
applies the graphical pattern to the elements of the DSML meta-model, thereby
defining the concrete syntax. This wizard will contain heuristics that
generate a graphical representation model, from which the environment is
synthesized using different graphical editor frameworks. Using this model, we
can improve the cognitive effectiveness of the notation by applying Moody’s
principles for the evaluation of graphical notations [9].
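
    The kind of heuristic such a wizard might contain can be sketched in
Python as follows. The rules below are assumptions chosen for illustration, not
the heuristics of the actual wizard: a class that is not contained by any other
class becomes the diagram root, a class with exactly two non-containment
references and no containments is suggested as an edge, and everything else is
suggested as a node. The meta-model encoding and all names are hypothetical.

```python
def suggest_representation(metamodel):
    """Suggest a graphical role (diagram, edge or node) for each class.

    metamodel maps a class name to a dict with two lists:
    'contains' (containment targets) and 'refers' (plain references).
    """
    contained = {t for spec in metamodel.values() for t in spec["contains"]}
    roles = {}
    for name, spec in metamodel.items():
        if name not in contained:
            roles[name] = "diagram"      # root of the containment tree
        elif len(spec["refers"]) == 2 and not spec["contains"]:
            roles[name] = "edge"         # looks like a link between two ends
        else:
            roles[name] = "node"
    return roles


# Hypothetical meta-model of a small state-machine language.
metamodel = {
    "Machine": {"contains": ["State", "Transition"], "refers": []},
    "State": {"contains": [], "refers": []},
    "Transition": {"contains": [], "refers": ["State", "State"]},
}

roles = suggest_representation(metamodel)
```

    In a wizard, suggestions like these would presumably be shown as defaults
that the DSML designer can confirm or override.
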
    Finally, we are going to do some research on Model Transformations (MTs).
Since the model is fragmented, it may be possible to put forward a formal
description of how to exploit the split model to improve the performance of
MTs, perhaps in connection with distributed models for transformation
execution [2].


4   Preliminary Work
We have developed EMF Splitter⁶, which works both as an independent Eclipse
plug-in and as a plug-in of DSLtao. EMF Splitter supports both the definition
of a modular structure for EMF models [6] and the definition of their
graphical syntax relying on Sirius [8]. We carried out some initial research
by defining modularity mechanisms for meta-models developed by us and by third
parties, and by generating graphical environments in the context of the MONDO
EU project⁷.
    First of all, we defined a modular structure based on how Eclipse JDT
organizes projects. This Modularity pattern has three main concepts, Project,
Package and Unit, from which we generate an Eclipse plug-in that enables the
editing of models according to this structure [6]. The model content can be
organized in folders and files, and the parts can be composed to build a
monolithic model. The generated plug-in also supports the decomposition of a
model according to the selected fragmentation strategy.
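
    To make the folder-and-file organisation concrete, the Python sketch below
decomposes a nested model into a mapping from file paths to unit contents and
composes it back into a monolithic model. The role annotations, the `.unit`
extension and the dictionary encoding are assumptions made for this example;
they do not reproduce the persistence format of the generated plug-in.

```python
def decompose(element, parent_path=""):
    """Flatten a containment tree into {file path: attributes}.

    Elements annotated as 'project' or 'package' become folders;
    elements annotated as 'unit' become files inside them.
    """
    path = f"{parent_path}/{element['name']}".lstrip("/")
    if element["role"] == "unit":
        return {f"{path}.unit": element["attrs"]}
    files = {}
    for child in element.get("children", []):
        files.update(decompose(child, path))
    return files


def compose(files):
    """Rebuild one monolithic nested dict from the fragment files."""
    model = {}
    for path, attrs in files.items():
        *folders, filename = path.split("/")
        node = model
        for folder in folders:
            node = node.setdefault(folder, {})
        # Drop the file extension to recover the unit name.
        node[filename.rsplit(".", 1)[0]] = attrs
    return model


# Hypothetical model: a project with one package holding two units.
model = {
    "name": "plant", "role": "project", "children": [
        {"name": "turbines", "role": "package", "children": [
            {"name": "t1", "role": "unit", "attrs": {"power": 2}},
            {"name": "t2", "role": "unit", "attrs": {"power": 3}},
        ]},
    ],
}

files = decompose(model)
monolithic = compose(files)
```

    In this simplified encoding, folders exist only through the paths of the
units they hold, so an empty package would not survive the round trip.
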
    Additionally, we explored how to combine fragmentation strategies with
visualization mechanisms. The feasibility of this combination was confirmed by
an evaluation over synthetic models and the model sets of the GraBaTs’09
contest. In [7], we applied fragmentation strategies to large models to obtain
more manageable chunks. The exploration was done by applying different
abstraction strategies to visualize big models in the form of graphs.

⁵ http://www.eclipse.org/epsilon/doc/eol/
⁶ http://antoniogarmendia.github.io/EMF-Splitter
⁷ http://www.mondo-project.org/

    The construction of modelling environments to create and edit models is
usually an ad-hoc, technically difficult process. In [8], we proposed
generating the modelling environment using a wizard that defines the concrete
syntax and automates the process. We have a platform-independent
GraphicalRepresentation meta-model, whose instances are then transformed into
a technology-specific editor. Currently we support Sirius, but in the future
we plan to support other graphical frameworks. In addition, we have decided to
support manual modification of the generated editor, trying to avoid
overwriting the manual changes.


5    Expected Contribution

We will provide developers with advanced tools that facilitate working with
large models and support distributed development by dividing models into more
manageable chunks.
    We are also implementing a pattern-based approach for the automatic
generation of graphical DSMLs. We are currently working on new features and
developing further services such as layers and abstractions. In the future, we
hope to contribute to the generation of advanced graphical and textual
environments. In addition, the approach could be used for the modernization of
existing editors. At present, building editors is very costly for developers.
For that reason, the wizard for defining the concrete syntax aims to reduce
this effort.
    In this research, we will generate the plug-ins that produce the
environment using the defined patterns. These patterns may be applied in
combination, so that the developer can reuse the implemented functionality.
Our goal is to generate scalable modelling environments.


6    Plan for Evaluation and Validation

We plan some experiments to test the performance of the generated tools and
also to validate the construction process. We have already run some tests with
the GraBaTs’09 models, in which we evaluated the performance of model
fragmentation. In the future, we plan to choose other case studies, such as
Knowledge Discovery Metamodel (KDM)⁸ models, which provide a representation of
existing software assets. Companies use this technology to understand the
functionality and the data of their legacy systems [3]. With our approach,
these models will not be monolithic. Instead, our framework will be able to
decompose them into folders and files to facilitate their editing, navigation
and comprehension.
    One of the main purposes of this approach is to make developers' tasks
easier. For that reason, it is necessary to test the tool with users who have
some experience in the creation of environments for DSMLs. We will evaluate
the process by performing a case study and, additionally, a test to assess the
quality of our tool. For instance, the wizard for defining the concrete syntax
has been partially assessed in the MONDO project.

⁸ https://www.eclipse.org/gmt/modisco/infrastructure/KDM/

Fig. 2: Gantt diagram of the research schedule.


7   Current Status

At this moment, we are developing EMF Splitter, which enables developers to
build models in a structured way. We are at an early stage of development, but
we have already carried out some experiments with large models and with how to
explore them visually. We also provide a wizard for the semi-automatic
generation of graphical editors. Currently, we only support Sirius.
    We aim to continue working on this tool to provide more advanced features
for the generation of modelling environments using patterns. The tasks are
summarized in Fig. 2, with an approximation of the estimated time. We plan to
finish the thesis by the end of 2018.


References
 1. K. Barmpis, S. Shah, and D. S. Kolovos. Towards Incremental Updates in Large-
    Scale Model Indexes, pages 137–153. Springer International Publishing, Cham,
    2015.
 2. A. Benelallam, A. Gómez, M. Tisi, and J. Cabot. Distributed model-to-model
    transformation with ATL on mapreduce. In Proceedings of the 2015 ACM SIG-
    PLAN International Conference on Software Language Engineering, SLE 2015,
    Pittsburgh, PA, USA, October 25-27, 2015, pages 37–48, 2015.
 3. H. Bruneliere, J. Cabot, F. Jouault, and F. Madiot. Modisco: A generic and
    extensible framework for model driven reverse engineering. In Proceedings of the
    IEEE/ACM International Conference on Automated Software Engineering, ASE
    ’10, pages 173–174, New York, NY, USA, 2010. ACM.
 4. Connected Data Objects (CDO). http://www.eclipse.org/cdo/.
 5. J. Espinazo-Pagán, J. S. Cuadrado, and J. G. Molina. Morsa: A scalable ap-
    proach for persisting and accessing large models. In Model Driven Engineering
    Languages and Systems, 14th International Conference, MODELS 2011, Welling-
    ton, New Zealand, October 16-21, 2011. Proceedings, pages 77–92, 2011.
 6. A. Garmendia, E. Guerra, D. S. Kolovos, and J. de Lara. EMF splitter: A struc-
    tured approach to EMF modularity. In Proceedings of the 3rd Workshop on
    Extreme Modeling co-located with ACM/IEEE 17th International Conference on
    Model Driven Engineering Languages & Systems, XM@MoDELS 2014, Valencia,
    Spain, September 29, 2014., pages 22–31, 2014.
 7. A. Garmendia, A. Jiménez-Pastor, and J. de Lara. Scalable model exploration
    through abstraction and fragmentation strategies. In Proceedings of the 3rd Work-
    shop on Scalable Model Driven Engineering part of the Software Technologies: Ap-
    plications and Foundations (STAF 2015) federation of conferences, L’Aquila, Italy,
    July 23, 2015., pages 21–31, 2015.
 8. A. Garmendia, A. Pescador, E. Guerra, and J. de Lara. Towards the generation
    of graphical modelling environments aided by patterns. In SLATE, volume 563 of
    CCIS, pages 160–168. Springer, 2015.
 9. D. Granada, J. M. Vara, V. A. Bollati, and E. Marcos. Enabling the Development of
    Cognitive Effective Visual DSLs. Springer International Publishing, Cham, 2014.
10. P. Kelsen, Q. Ma, and C. Glodt. Models within models: Taming model complexity
    using the sub-model lattice. In Fundamental Approaches to Software Engineering
    - 14th International Conference, FASE 2011, Held as Part of the Joint European
    Conferences on Theory and Practice of Software, ETAPS 2011, Saarbrücken, Ger-
    many, March 26-April 3, 2011. Proceedings, pages 171–185, 2011.
11. A. Pescador, A. Garmendia, E. Guerra, J. S. Cuadrado, and J. de Lara. Pattern-
    based development of domain-specific modelling languages. In MoDELS, pages
    166–175. IEEE, 2015.
12. M. Scheidgen, A. Zubow, J. Fischer, and T. H. Kolbe. Automated and transpar-
    ent model fragmentation for persisting large models. In Proceedings of the 15th
    International Conference on Model Driven Engineering Languages and Systems,
    MODELS’12, pages 102–118, Berlin, Heidelberg, 2012. Springer-Verlag.
13. T. Stahl, M. Völter, J. Bettin, A. Haase, and S. Helsen. Model-driven software
    development - technology, engineering, management. Pitman, 2006.
14. D. Steinberg, F. Budinsky, M. Paternostro, and E. Merks. EMF: Eclipse Modeling
    Framework, 2nd Edition. Addison-Wesley Professional, 2008. See also http://
    www.eclipse.org/modeling/emf/.
15. D. Strüber, J. Rubin, G. Taentzer, and M. Chechik. Splitting models using infor-
    mation retrieval and model crawling techniques. In Fundamental Approaches to
    Software Engineering - 17th International Conference, FASE 2014, Held as Part
    of the European Joint Conferences on Theory and Practice of Software, ETAPS
    2014, Grenoble, France, April 5-13, 2014, Proceedings, pages 47–62, 2014.
16. R. Xu and D. C. Wunsch II. Survey of clustering algorithms. IEEE Trans.
    Neural Networks, 16(3):645–678, 2005.