<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Automated Analysis, Validation and Suboptimal Code Detection in Model Management Programs</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ran Wei</string-name>
          <email>ran.wei@york.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dimitrios S. Kolovos</string-name>
          <email>dimitris.kolovos@york.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of York</institution>
          ,
          <addr-line>Deramore Lane, Heslington, York</addr-line>
          ,
          <country country="UK">United Kingdom</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>As MDE is increasingly applied to larger and more complex systems, the models that MDE platforms need to manage can grow signi cantly in size. Additionally, model management programs that interact with such models become larger and more complicated, which introduces further challenges in ensuring their correctness and maintainability. This paper presents an automated static analysis and validation framework for languages of the Epsilon platform. By performing static analysis on model management programs written in the Epsilon languages, this framework aims to improve program correctness and development e ciency in MDE development processes. In addition, by applying analysis on model management programs, sub-optimal performance patterns can be detected early in the development process and feedback can be provided to the developers to enable e cient management of large models.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>
        The Epsilon platform [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] is a platform which provides a
broad range of model management languages built on top of
an extensible common imperative model query and modi
cation language called the Epsilon Object Language (EOL).
EOL is an interpreted language that supports optional
typing of variables, operations, and operation parameters. This
provides a high degree of exibility (for example, it enables
duck typing [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]) and eliminates the need for explicit type
casts, as typing-related information is currently only
considered at runtime. On the other hand, the absence of a
Copyright c 2014 for the individual papers by the papers’ authors.
Copying permitted for private and academic purposes. This volume is published
and copyrighted by its editors.
      </p>
      <p>BigMDE ’14, July 24, 2014 York, UK
static analyser for EOL and the languages that build atop
it, implies that a wide range of genuine errors that could be
detected at design time are only detected at runtime.
Beyond detecting type-related errors, a static analyser could
also be used to support advanced analysis capabilities, such
as code detection, to improve the performance of model
querying and transformation programs, particularly in the
context of processing large models.</p>
      <p>The remainder of the paper is organised as follows. In
Section 2 we brie y motivate the need for static analysis
capabilities in model management languages, in particular the
Epsilon platform. In Section 3 we present the architecture
of a static analysis framework for the Epsilon platform and a
static analyser for the Epsilon Object Language. In Section
4 we present a sub-optimal code detection facility developed
atop the static analysis framework. In Section 5 we discuss
preliminary evaluation of our work. In Section 6 we
discuss related work and in Section 7 we conclude and provide
directions for further work.</p>
    </sec>
    <sec id="sec-2">
      <title>2. BACKGROUND</title>
      <p>
        MDE allows developers to construct models which abstract
away from technical details, using concepts closely related to
the domain of interest to reduce accidental complexity. The
constructed models are then transformed into part (or all) of
the target system under development. Whilst Model
Transformation is considered to be the heart and soul of Model
Driven Engineering [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], other model management
operations are equally important. Typically, such operations
include model validation, model merging, model comparison,
etc.
      </p>
    </sec>
    <sec id="sec-3">
      <title>2.1 Epsilon</title>
      <p>
        Epsilon is a mature and well-established family of
interoperable languages for model management. Languages in
Epsilon can be used to manage models of diverse
metamodels and technologies. The architecture of the Epsilon
platform is depicted in Figure 1. At the core of Epsilon is the
Epsilon Object Language (EOL) [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. EOL is an
imperative language which reuses a part of the Object Constraint
Language OCL) but provides additional features such as
multiple model access, statement sequencing and groupings,
uniformity of function invocation, model modi cation,
debugging and error reporting. Although EOL can be used
as a general purpose model management language, its
primary aim is to be reused in task-speci c languages. Thus,
by extending EOL, a number of task-speci c model
management languages have been implemented, including those
for model transformation (ETL), model comparison (ECL),
model merging (EML), model validation (EVL), model
refactoring (EWL) and model-to-text transformation (EGL).
Epsilon is metamodeling-technology agnostic [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], models
written in di erent modelling languages can be managed by
Epsilon model management languages via the Epsilon Model
Connectivity (EMC) layer, which o ers a uniform interface
for interacting with models of di erent modelling
technologies. Currently, EMC drivers have been implemented to
support EMF [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], MDR, Z speci cations in LaTeX using
CZT and plain XML. Epsilon is an Eclipse project and is
widely used both in the academia and the industry1.
      </p>
    </sec>
    <sec id="sec-4">
      <title>2.2 Epsilon and static analysis</title>
      <p>
        EOL supports optional typing of variables, operations and
operation parameters. The listing below demonstrates the
exibility that this delivers using as an example, a program
that operates on an Ecore model. In line 1, an operation
called getName() is de ned, its context type is Any, which
means that the operation can be invoked on objects/model
elements of any type. In line 2, the keyword self refers
to the object on which the operation is called, for
example, in the statement o.getName(), self refers to o. Thus in
line 2, the operation checks if the object is an instance of
ENamedElement, if it is, it will return self.name. A
typical static analyser would complain in line 3 that "self.name"
does not exist because the type of self is declared as Any,
not Ecore!ENameElement. However, this will never be a
problem at run-time due to the if condition in line 2.
1 operation Any getName() {
2 if(self.isTypeOf(Ecore!ENamedElement))
3 return self.name;
4 else return "Unnamed";
5 }
1http://eclipse.org/epsilon/users/
On the other hand, some genuine errors can also remain
hidden until runtime without the help of static analysis.
The listing below demonstrates an example of a genuine
error. In line 1, a variable named o is de ned to be of type
Ecore!EClass, and in line 2, o is assigned the value of a
random EClass in the Ecore model. In line 3, the program
tries to print the value property of o which does not exist
according to the Ecore metamodel. As such, an error will
be thrown at runtime.
1 var o : Ecore!EClass;
2 o = Ecore!EClass.all.first();
3 o.value.println();
As the size of such programs grow, locating genuine errors
becomes an increasingly di cult task. For example, the
EOL program that underpins the widely-used EuGENia tool
[
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] consists of 1212 lines of code, that query and modify 4
models that conform to 4 di erent metamodels concurrently.
Performing code review on this code for genuine error
detection is a time consuming process. Additionally, performing
changes is also di cult, as developers have to manually
identify and manage dependencies and relations between
building blocks (operations for example) of such programs. Such
tasks require e ort and time [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. For instance, to delete an
operation named F1 it is necessary to know if it is invoked
by any other operations or code blocks. Manually
performing this task is error-prone.
      </p>
      <p>Since model management programs interact with models,
performing static analysis on model management programs
can also help identify sub-optimal performance patterns in
the process of accessing models. Analysis can also help
comprehend the model management programs. For example,
coverage analysis can be performed to nd out which parts
of the metamodel of particular models are accessed, and
test cases can be generated based on this comprehension to
better test a model management program. With the
potential bene ts mentioned above, we propose and implement a
static analysis framework for Epsilon.</p>
    </sec>
    <sec id="sec-5">
      <title>3. TOWARDS A STATIC ANALYSIS FRAME</title>
    </sec>
    <sec id="sec-6">
      <title>WORK FOR EPSILON</title>
      <p>In this section we discuss the proposed static analysis
framework in detail. The general idea of our approach can be
illustrated by Figure 2. We rst transform the Homogeneous
Abstract Syntax Tree of an EOL program into a
Heterogeneous Abstract Syntax Tree, we then apply resolution
algorithms (including variable resolution and type resolution)
to derive a Heterogeneous Abstract Syntax Graph, with its
elements connected to each other. We then perform pattern
detection to detect sub-optimal code. The aim of the
framework is to provide a generic static analysis facility for all
languages of the Epsilon platform. Since the core language
of the Epsilon platform is EOL, we rst develop a static
analyser for EOL.</p>
    </sec>
    <sec id="sec-7">
      <title>3.1 Deriving Heterogeneous Abstract Syntax</title>
    </sec>
    <sec id="sec-8">
      <title>Trees</title>
      <p>
        Currently, Epsilon provides an ANTLR [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] based parser for
EOL. The ANTLR parser produces homogeneous Abstract
Syntax Trees (AST) with each node containing a text and
an id, which the EOL execution engine consumes. To
facilitate static analysis at a high level of abstraction, our rst
step was to de ne an EMF-based metamodel for EOL and
to transform these homogeneous ASTs to models
(heterogeneous trees) that conform to the EOL metamodel.
As EOL is a reasonably complex language, we only
introduce the basic and novel parts of the EOL metamodel and
the EOL standard library. Figure 3 lists a number of basic
building blocks that constitute an EOL program. The
fundamental element of the EOL metamodel is the EolElement
metaclass, as all other metaclasses in the EOL metamodel
directly or indirectly extend it, and contains information
related to the line and column numbers of the respective
text in an EOL program for traceability purposes. A
Program contains a number of Import (s), which are used to
import other programs. A Program also contains a number of
OperationDe nition(s) which de ne additional
operations/functions on existing types. A Program contains a Block,
which contains a number of Statement (s). Expression is
also a fundamental element which is generally contained in
Statement (s) and other EolElement (s). For each of the
Expression, there is a Type associated to it. The type of the
Expression is generally unknown at the time the source code
of a program is parsed into an EOL model, but is resolved
later in the type resolution process. In order to run an EOL
program that involves processing models, Epsilon currently
requires the user to select the required
models/metamodels via a user interface at runtime. To facilitate accessing
models at design-time for static analysis, we introduce the
ModelDeclarationStatement to declare references to models
in the source code. The syntax of a model declaration
statement is as follows.
1 model library driver EMF {
2 nsuri = "http://library/1.0"
3 };
Like Epsilon, the static analysis framework is also
technologyagnostic. As such, beyond the model's local name, a model
declaration statement de nes the type of the model in
question (in this case EMF), as well as a set of
model-typespeci c key-value parameters (in this case nsuri = http :
==library=1:0) that the framework can use to obtain the
model's metamodel. We have currently implemented
facilities to support models de ned using EMF and plain XML.
In the future we plan to extend the static analysis
framework with support for other types of models supported by
Epsilon, such as models de ned in MDR, spreadsheets, etc.
With the metamodel de ned, we developed a facility which
transforms the ANTLR based ASTs into models that
conform to the EOL metamodel. It should be noted that at the
stage of AST to EOL model transformation, declared models
are not inspected. So at this stage, for the statement
1 var book : Book
it is only known that variable book is of type
ModelElementType whose elementName is Book. Later in the type
resolution process, this information is used against declared
models so that the Book type can be resolved.
      </p>
      <p>Comparing with our current approach, an alternative
approach would have been to rede ne EOL's grammar using a
toolkit such as Xtext or EMFText which can automate the
source-code to model transformation process but we have
opted for an intermediate transformation instead in order
to reuse Epsilon's robust and proven parsers.
3.2</p>
    </sec>
    <sec id="sec-9">
      <title>Deriving Heterogeneous Abstract Syntax</title>
    </sec>
    <sec id="sec-10">
      <title>Graphs</title>
      <p>With the EOL metamodel and the AST to EOL
transformation de ned, the next step of the process involves linking
elements of the EOL model (heterogeneous tree) constructed
in the previous phase to derive an abstract syntax graph. We
have created several facilities to achieve this.</p>
      <sec id="sec-10-1">
        <title>3.2.1 EOL Visitor</title>
        <p>To facilitate the traversal of di erent elements in the EOL
model and to support extensibility, we developed a facility
which generates Java visitor classes from Ecore metamodels.
We then generated visitor classes from the EOL metamodel
which provide a general mechanism to traverse all elements
in an EOL model. The EOL Visitor serves as the
infrastructure of the static analysis framework, as all other facilities
in the static analysis framework extend the EOL visitors to
implement functionalities. The EOL visitor also provides
high extensibility as new analysis mechanisms can be
implemented simply by extending the EOL visitor.</p>
      </sec>
      <sec id="sec-10-2">
        <title>3.2.2 Variable Resolver</title>
        <p>The rst phase of the static analysis on an EOL program
involves resolving identi ers to their de nitions. Context-free
identi ers in EOL can refer to 1) declared
variables/operation parameters and 2) undeclared variables provided by
the execution engine at run-time. For declared variables,
the variable resolver establishes links between the variable
declaration and its references. For example, in line 1 of the
listing provided below, a variable named a is declared. In
line 2, a is referenced. The variable resolver will establish a
link between the reference in line 2 and the declaration in
line 1.
1 var a : Integer;
2 a = 10;
Variable resolution also applies to parameters in operation
de nitions. In the following listing, the variable resolver
will establish a link between the reference of the parameter
toP rint in line 2 and its declaration in line 1.
1 operation definedPrint(toPrint: String) : String {
2 toPrint.println();
3 }
There are also some implicit variables which are not declared
by the developer but are rather provided by the execution
engine. For example, the keyword self in an operation
definition refers to the object/model element on which the
operation is invoked. The following listing demonstrates how
self is used. The variable resolver will establish a link
between self and the object on which printSelf is invoked.
1 operation Any printSelf() {
2 self.println();
3 }
It is important to note that at the stage of variable
resolution, model element types are not resolved.</p>
      </sec>
      <sec id="sec-10-3">
        <title>3.2.3 Type Resolver</title>
        <p>In EOL there are several built-in primitive types (Boolean,
Integer, Real, String) and collection types (Set, Bag,
Sequence and OrderedSet ). There is also a built-in Map type
and the Any type. These types are all subclasses of Type
in the EOL metamodel. The resolution of the above types
is performed during the heterogeneous abstract syntax tree
derivation. There is also a subclass of T ype in the EOL
metamodel called M odelElementT ype which includes
typing information regarding models de ned using di erent
technologies. Such typing information should be determined by</p>
        <sec id="sec-10-3-1">
          <title>EOL Visitor EOL</title>
        </sec>
        <sec id="sec-10-3-2">
          <title>Metamodel</title>
        </sec>
        <sec id="sec-10-3-3">
          <title>EOL Variable Resolver AST2EOL</title>
        </sec>
        <sec id="sec-10-3-4">
          <title>EOL Abstract Syntax Tree Metamodel Connectivity</title>
          <p>accessing the corresponding models.</p>
          <p>To support accessing metamodels at design-time, we
introduced M odelDeclarationStatements, which are exible to
support models de ned in di erent modelling technologies.
A model declaration de nes the model's name, aliases, the
modelling technology (driver) that the model conforms to,
and a set of driver-speci c parameters. The listing below is
an example of M odelDeclarationStatement; it declares a
model named library with alias l and its driver to be EM F ,
it then speci es the EMF-speci c namespace URI (nsuri )
of the model so that the analyser knows how to locate and
access its metamodel.
1 model library alias l driver EMF {nsuri = "http://
library/1.0"};
To facilitate uniform analysis of the structural information of
models of diverse modelling technologies, the static analysis
framework needs to convert type-related information from
di erent modelling technologies into a common
representation. Instead of inventing a new representation, we have
decided to use EMF's Ecore. As such, the static analysis
framework provides a modular architecture where pluggable
components are responsible for transforming di erent types
of model declarations into Ecore EPackages. For di erent
modelling technologies:</p>
          <p>For EMF models, return the registered EPackage by
looking up the metamodel nsURI property of the model
declaration.</p>
          <p>For plain XML, construct an EPackage by analysing
the contents of a sample model le speci ed by
respective the model declaration parameter.</p>
          <p>We have developed drivers for EMF models and plain XML
les, and a common interface which allows new drivers for
di erent modelling technologies to be integrated with the
static analysis framework. By accessing
models/metamodels, the type resolver is able to resolve types with regards to
models/metamodels.</p>
          <p>The variable resolver and type resolver constitute the
infrastructure of the static analysis framework for the Epsilon
languages. The infrastructure is depicted in Figure 4. The
EOL Abstract Syntax Tree layer is provided by the EOL
engine, the AST2EOL layer uses the AST and the EOL
metamodel to translate the AST to an EOL model. The
EOL Variable Resolver and the EOL Type Resolver, both
make use of the EOL Visitor and the Metamodel
Connectivity layer (which is used to read metamodels) to establish
a fully type-resolved EOL model.</p>
          <p>
            The static analysis infrastructure can be easily extended. As
proof of concept, we have also implemented all of the
aforementioned facilities for the Epsilon Transformation Language.
We extended the EOL metamodel to create an ETL
metamodel, with the ETL metamodel, we created the ETL visitor
facility; we extended the AST2EOL to create a AST2ETL
facility; we extended the EOL variable resolver and type
resolver to create ETL variable and type resolvers. The EOL
and ETL static analysers can be found under the Epsilon
Labs open-source project [
            <xref ref-type="bibr" rid="ref1">1</xref>
            ].
          </p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-11">
      <title>4. SUBOPTIMAL CODE DETECTION</title>
      <p>
        Rule-based model transformation languages usually rely on
query or navigation languages for traversing the source
models to feed transformation rules with the required model
elements. In [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], the authors suggest that in complex
transformation de nitions, a signi cant part of the transformation
logic is devoted to model navigation. In the context of
largescale MDE processes, models can contain several millions of
elements. Thus, it is important to retrieve desired elements
in an e cient way. On top of the static analysis framework,
we have built a facility which is able to detect sub-optimal
performance patterns when navigating and retrieving model
elements. This facility performs pattern matching to detect
potential computationally heavy code in EOL (and
potentially all Epsilon languages). It does so by matching patterns
de ned in the Epsilon Pattern Language (EPL) [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] against
fully resolved EOL abstract syntax graphs.
      </p>
      <p>The structure of this facility is depicted in Figure 5. The
SubOptimalDetector has a EOLM odel as input to perform
the detection; it makes use of the EP LEngine of the
Epsilon platform to derive Abstract Syntax Trees, it has a set
of de ned EP LP atterns (.epl scripts) using EPL, and a
logging facility (LogBook) to keep the warning messages it
generates for pattern matches.</p>
      <p>In this section, we present the sub-optimal detection facility.
We provide several examples that illustrate potential
suboptimal performance patterns in the context of large scale
model manipulation. We then present and explain a
suboptimal performance pattern de ned in EPL. It should be
noted that this facility targets EOL programs, however, it
can be easily extended to cope with programs written in
other Epsilon languages as discussed earlier.</p>
      <p>The examples we present are all based on a simple Library
metamodel illustrated in Figure 6. The Library metamodel
contains two simple metaclasses, Author and Book. An
Author has a rst name, a surname and a number of published
Book s where a Book has a name and an Author. The
association between Author and Book is bidirectional, they are
books and authors respectively.</p>
    </sec>
    <sec id="sec-12">
      <title>4.1 Inverse navigation</title>
      <p>A frequent operation in EOL is to retrieve all model elements
of a speci c type by using the :all property call which can
be a computationally heavy operation to perform as models
grow in size. By analysing the metamodel of the model
under question, bidirectional relationships between model
elements can be used to avoid such heavy computations.
1 var a = Author.all.first;
2 var books = Book.all.select(b|b.author = a);
3 var aBook = Book.all.selectOne(b|b.author = a);
The listing above demonstrates a potential performance
bottleneck. In line 1, an Author is retrieved from the model.
In line 2, all instances of type Book are retrieved and then
a conditional select is performed to nd the books that are
written by Author 'a'. However, since the relationship
between Author and Book is bidirectional, this can be replaced
by the (more e cient) statement:
1 var books = a.books;
Thus the complexity of the operation all is reduced from n
to 1 given that n is the number of Books in the model under
question. It is also the case for the selectOne operation in
line 3, which can be rewritten as:
1 var aBook = a.books.first();</p>
    </sec>
    <sec id="sec-13">
      <title>4.2 Compound select operations</title>
      <p>Another computationally-heavy pattern is the presence of
compound select operations on the same collection.
1 var authors = Author.all.select(a|a.first_name =
2 'William').select(a|a.surname = 'Shakespeare');
Listing 1: a potential performance overhead using
compound select operations
Listing 1 demonstrates such operations. In line 1, all of the
Authors are retrieved rst, then a select operation is
performed to select all Authors whose rst names is W illiam,
then another select operation is performed to select all Authors
whose surname is Shakespeare. The complexity of this
operation is n2 given that n is the number of Authors in the
model under question. However, the condition of both the
select operations can be put together to form a single select
operation. And the statement above can be written as
1 var authors = Author.all.select(a|a.first_name =
2 'William' and a.surname = 'Shakespeare');
the complexity of this operation is therefore n as the
collection of the Authors is only traversed once.</p>
    </sec>
    <sec id="sec-14">
      <title>4.3 Select operation on unique collections</title>
      <p>Performing select operations on unique collections (sets) can
sometimes be ine cient depending on the condition of the
select operation.
1 var authorOne = Author.all.first;
2 var authorTwo = Author.all.last;
3 var bookOne = authorOne.books.first;
4 var bookSet : Set(Book);
5 bookSet.addAll(authorTwo.books);
6 bookSet.select(b|b = bookOne);</p>
      <p>Listing 2: Select operation on unique collection
Listing 2 demonstrates an ine cient and computationally
expensive select operation. In Line 1 and 2, two Authors are
retrieved from the model; in line 3, a Book is retrieved from
authorOne's publications; in line 4, a Set called bookSet is
created and in line 5, all of the Books that authorT wo
published are added to bookSet. In line 6 the select operation
iterates through all of the books in the bookSet and nd the
ones that match the bookOne. However, the bookSet is a
unique collection, which means that all of the elements in it
only appear once. Therefore, it is not necessary to perform
a select operation but rather a selectOne operation, as the
select operation would return at most one result eventually.
The complexity of the select operation is n given that n
is the number of books that authorT wo published; If the
select operation is replaced with selectOne, the complexity
of it would be 1 for the best case scenario and n for the worst
case scenario (n=2 for the average case).</p>
    </sec>
    <sec id="sec-15">
      <title>4.4 Collection element existence</title>
      <p>In some cases, checking existence of an element inside a
collection can be written in ine cient ways.
1 if(Book.all.select(b|b.name = "EpsilonBook")
2 .size() &gt; 0) {
3 "There is a book called EpsilonBook".println();
4 }</p>
      <p>Listing 3: Collection element existence
Listing 3 demonstrates such a scenario. In line 1, the
condition of the if statement retrieves all instances of Book,
then selects the ones with the name EpsilonBook, and
calculates the size of it then evaluates if the size is greater
than 0. This operation eventually checks for the existence
of a book named EpsilonBook. Thus, this operation can be
more e ciently re-written as:
1 Book.all.exists(b|b.name = "EpsilonBook")
4.5</p>
    </sec>
    <sec id="sec-16">
      <title>Select the first element in a collection</title>
      <p>Listing 4 demonstrates another example of sub-optimal EOL
code.
1 var anEpsilonBook = Book.all.select(b|b.name =
2 "EpsilonBook").first();</p>
      <p>Listing 4: Select an element in a collection
In line 1, a select operation is performed on all of the
instances of Book to lter out the books with the name
'EpsilonBook', then a f irst operation is performed to select the
rst one of the collection returned by select. This can be
more e ciently re-written as:
1 var anEpsilonBook = Book.all.selectOne(b|b.name =
2 "EpsilonBook");
to avoid traversing all of the instances of Book.</p>
    </sec>
    <sec id="sec-17">
      <title>4.6 A sub-optimal performance pattern</title>
      <p>In this section we present a sub-optimal performance pattern
which is written in the Epsilon Pattern Language (EPL). To
understand how this pattern works, it is rst important to
understand what is contained in an EOL model for a certain
EOL program.</p>
      <sec id="sec-17-1">
        <title>4.6.1 Understanding an EOL model</title>
        <p>Figure 7 illustrates a fragment of an EOL model which
represents the statement below.
1 Book.all.select(b|b.author = a);
Firstly, invocations of the select() operation in the EOL
metamodel are represented by the F OLM ethodCallExpression
(FirstOrderLogic method call) metaclass; it has a name (an
instance of N ameExpression) and an iterator (an instance
of V ariableExpression). In this case, the name is 'select'
and the iterator is 'b'.</p>
        <p>The select operation has a condition, in this case, it is an
instance of EqualsOperatorExpression. The lhs (left hand
side) of it is an instance of P ropertyCallExpression, whose
target (an instance of NameExpression) is 'b' and property
(an instance of NameExpression) is 'author'; the rhs (right
hand side) of it is 'a' (an instance of NameExpression). Both
the lhs and rhs of the EqualsOperatorExpression have
resolvedTypes, in this case, they are both Author (instances
of M odelElementT ypes).</p>
        <p>The target of the F OLM ethodCallExpression is an
instance of P ropertyCallExpression with its target as Book
(an instance of NameExpression) and its property as all (an
instance of NameExpression). The types of these
expressions, altogether with some irrelevant details are omitted
for the purpose of the discussion.</p>
      </sec>
      <sec id="sec-17-2">
        <title>4.6.2 The EPL pattern</title>
        <p>In Listing 5, we de ne an EPL pattern to match occurences
of the pattern described above. In lines 2-6, a guard is
dened to look for a F OLM ethodCallExpression the name
of which is either 'select' or 'selectOne'; the type of the
condition should be EqualsOperatorExpression; its target
should be an instance of P ropertyCallExpression; and the
property of the P ropertyCallExpression should be 'all'.
In lines 8-10, a guard is de ned to look for an instance of
EqualsOperatorExpression in the condition of the
FOLMethodCallExpression found previously, the lhs of which
should be an instance of P ropertyCallExpression.
Lines 12-14 specify that the resolvedT ype of the lhs should
be an instance of M odelElementT ype. In lines 16-18, it
speci es that the resolvedT ype of the rhs should be an
instance of M odelElementT ype. In lines 20-24, it speci es
that the type of the lhs and the rhs should be the same.
Lines 26-37 perform the match of the pattern. This part
rstly fetches the ERef erence from the lhs of the condition
(in this case, 'b.author', it is an ERef erence because as
previously discussed, all metamodels are converted to EMF
metamodels for uniformity). The EReference is then
inspected; if it is not null and it has an eOpposite reference,
the pattern continues to check if the type of the eOpposite
of the reference is the type of the rhs of the condition (in
this case, 'author').</p>
        <p>In lines 39-47, a helper method is de ned to help look for an
ERef erence given an EClass and a name; its
implementation is straightforward.
1 pattern InverseNavigation
2 folcall : FOLMethodCallExpression
3 guard: (folcall.method.name = 'select' or folcall.</p>
        <p>method.name = 'selectOne')
4 and folcall.conditions.isTypeOf(</p>
        <p>EqualsOperatorExpression)
and folcall.target.isTypeOf(PropertyCallExpression)
and folcall.target.property.name = 'all',
condition : EqualsOperatorExpression
from: folcall.condition
guard: condition.lhs.isTypeOf(</p>
        <p>PropertyCallExpression)</p>
      </sec>
      <sec id="sec-17-3">
        <title>4.6.3 The Java pattern</title>
        <p>Our original attempt to construct the sub-optimal
performance detection facility was to de ne patterns using Java,
we de ned a method in Java to achieve the same function
described above. The equivalent Java implementation is 76
lines of code with a long chain of If statements which makes
it very di cult to comprehend. With the EPL approach,
the patterns are more comprehensible. Developers can
contribute to the pattern pool by de ning their own EPL
patterns and registering them with the framework through
appropriate extension points.</p>
      </sec>
    </sec>
    <sec id="sec-18">
      <title>5. TOOL SUPPORT</title>
      <p>
        The static analyser proposed in this paper is a pragmatic
static analyser. The exibility of EOL allows its users to
write code with optional typings that always work at
runtime. Reporting errors on such cases are not desirable for
EOL, especially on legacy EOL code that is proven to work
with extensive testing. Thus the design decision was to
allow such behaviour and delegate the resolution to the EOL
execution engine. We applied the static analyser on a large
EOL program (Ecore2GMF.eol) that underpins the
Eugenia tool [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] for evaluation. We allow optimistic typing
when Any type is encountered in assignments, operation
calls and property calls, we provide a warning message to
notify the user that there might be potential type
incompatibility issues at run-time. With this con guration,
analysis on Ecore2GMF.eol which consists of 1212 lines of code
generates 126 warning messages. This result shows that the
static analyser supports plausible legacy code. At the same
time, it provides reasonable warning messages when optional
typing is used.
      </p>
      <p>After evaluating the static analyser, we evaluate the
suboptimal performance detection facility. Figure 8 provides a
screenshot of the editor we implemented by extending the
existing EOL Eclipse-based editor. The lines of code with
warnings represent matches of the patterns discussed above.
The implementation of the editor is able to extract and
display the warning messages generated by the detection
facility. The sub-optimal performance detection facility is not
only able to detect patterns that incur performance
overheads, but also provide suggestions on how to rewrite the
code. An example warning message is shown in line 8, the
warning message suggests to rewrite the operation as:
1 Author.all.select(a|a.first_name = "William" and a.</p>
      <p>surname = "Shakespeare")
The rest of the patterns function as expected.</p>
    </sec>
    <sec id="sec-19">
      <title>6. RELATED WORK</title>
      <p>
        There are several automated analysis and validation tools for
model management programs. In [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] the authors propose
a generic static analysis framework for model
transformations speci ed in VIATRA2 Textual Command Language
(VTCL [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]). The latest static analysis framework detects
common errors and type related errors regarding models.
However, the VIATRA2 framework provides limited
support for other metamodelling technologies as it uses its own
modelling language (VTML) and store the metamodels in
the model space. Additionally, VTCL is not as exible as
EOL; it does not provide optional typing mechanisms as
EOL does.
      </p>
      <p>
        Acceleo [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] provides static analysis mechanisms for syntax
highlighting, content assistant, and model related error
detection. However, to the best of our knowledge, it does not
support modelling technologies other than EMF.
Xtend [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] also provide static analysis facilities which are used
to detect syntax and built-in type-related errors, model
related type information validations are not included.
In [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], a static analysis tool is proposed to detect errors in
model transformations written in the Atlas Transformation
Language (ATL), the tool presented is used to convert an
ATL program into a model, but no validation algorithms are
implemented on this tool to our best knowledge.
The latest release of ATL IDE [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] provides a static
analysis facility, it resolves the types of variables including
builtin ATL types and types related to metamodels. The ATL
IDE also provides code-completion of operation calls and
metamodel element navigations. However, the static
analysis is not responsible to provide any errors on type
incompatibilities as it adopts an optimistic and exible approach.
The ATL platform also provides limited support for multiple
modelling technologies other than EMF.
      </p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], ways of deriving optimisation patterns from
benchmarking OCL operations for model querying and navigation
are proposed and several optimisation patterns are
identied, including short-circuit boolean expressions, opposite
relationship navigation, operations on collections, etc.
      </p>
    </sec>
    <sec id="sec-20">
      <title>7. CONCLUSIONS AND FUTURE WORK</title>
      <p>In this paper we have reported on our work on designing and
developing a static analysis framework for the core language
of Epsilon (EOL). The focus of our report is mostly on the
sub-optimal detection facility of the static analysis
framework. However, it is to be noted that the static analysis
framework is able to detect various type-related errors that
may occur using EOL. The static analysis framework follows
a pragmatic approach so as not to compromise the exibility
of the Epsilon languages. As a result, it can generate false
negatives (problems that exist but cannot be detected by
the static analyser). To minimise the number of false
negatives, a more strict coding style is encouraged - to avoid
the use of Any type as much as possible, so that the static
analyser can perform more accurate analysis. This is clearly
a trade-o to make; to obtain better error reporting,
developers need to write more boilerplate code with explicit type
casting, while to obtain better exibility, developers need
to bare with the fact that the analyser may produce false
negatives that emerge at run-time.</p>
      <p>It should be noted that the sub-optimal performance
detection facility is only one application of the static analysis
framework for Epsilon. In the future, we plan to look into
facilities such as program comprehension, metamodel
coverage analysis, impact analysis, etc. We will also look into
the possibility of pre-loading models and look for more
negrained performance patterns for EOL programs.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <article-title>[1] Epsilon labs</article-title>
          . https://epsilonlabs.googlecode.com/ svn/trunk/StaticAnalysis/.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Balogh</surname>
          </string-name>
          and
          <string-name>
            <given-names>D.</given-names>
            <surname>Varro</surname>
          </string-name>
          .
          <article-title>Advanced model transformation language constructs in the VIATRA2 framework</article-title>
          .
          <source>In Proceedings of the 2006 ACM symposium on Applied computing</source>
          , pages
          <volume>1280</volume>
          {
          <fpage>1287</fpage>
          . ACM,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>P.</given-names>
            <surname>Friese</surname>
          </string-name>
          and
          <string-name>
            <given-names>B.</given-names>
            <surname>Kolb</surname>
          </string-name>
          .
          <article-title>Validating Ecore models using oAW Work ow</article-title>
          and OCL.
          <source>Eclipse Summit Europe</source>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>J.-M. Gauthier</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          <string-name>
            <surname>Bouquet</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Hammad</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          <string-name>
            <surname>Peureux</surname>
          </string-name>
          , et al.
          <article-title>Veri cation and Validation of Meta-Model Based Transformation from SysML to VHDL-AMS</article-title>
          .
          <source>In MODELSWARD 2013, 1st Int. Conf. on Model-Driven Engineering and Software Development</source>
          , pages
          <volume>123</volume>
          {
          <fpage>128</fpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>F.</given-names>
            <surname>Jouault</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Allilaire</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bezivin</surname>
          </string-name>
          ,
          <string-name>
            <surname>and I. Kurtev.</surname>
          </string-name>
          <article-title>ATL: A model transformation tool</article-title>
          .
          <source>Science of computer programming</source>
          ,
          <volume>72</volume>
          (
          <issue>1</issue>
          ):
          <volume>31</volume>
          {
          <fpage>39</fpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A.</given-names>
            <surname>Koenig</surname>
          </string-name>
          and
          <string-name>
            <given-names>B.</given-names>
            <surname>Moo</surname>
          </string-name>
          .
          <article-title>Templates and duck typing</article-title>
          .
          <source>Dr. Dobbs</source>
          , June,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>D.</given-names>
            <surname>Kolovos</surname>
          </string-name>
          .
          <article-title>An extensible platform for speci cation of integrated languages for model management</article-title>
          .
          <source>PhD thesis</source>
          , University of York,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>D.</given-names>
            <surname>Kolovos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Rose</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Paige</surname>
          </string-name>
          ,
          <article-title>and</article-title>
          <string-name>
            <given-names>A.</given-names>
            <surname>Garc</surname>
          </string-name>
          <article-title>a-Dom nguez</article-title>
          .
          <source>The Epsilon Book. Structure</source>
          ,
          <volume>178</volume>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>D. S.</given-names>
            <surname>Kolovos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. F.</given-names>
            <surname>Paige</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F. A.</given-names>
            <surname>Polack</surname>
          </string-name>
          .
          <article-title>The Epsilon Object Language (EOL)</article-title>
          .
          <source>In Model Driven Architecture{Foundations and Applications</source>
          , pages
          <volume>128</volume>
          {
          <fpage>142</fpage>
          . Springer,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>L. M.</given-names>
            <surname>Rose</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. S.</given-names>
            <surname>Kolovos</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R. F.</given-names>
            <surname>Paige</surname>
          </string-name>
          .
          <article-title>Eugenia live: a exible graphical modelling tool</article-title>
          .
          <source>In Proceedings of the 2012 Extreme Modeling Workshop</source>
          , pages
          <volume>15</volume>
          {
          <fpage>20</fpage>
          . ACM,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>J. Sanchez Cuadrado</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          <string-name>
            <surname>Jouault</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Garc</surname>
          </string-name>
          a
          <article-title>-Molina, and</article-title>
          <string-name>
            <given-names>J.</given-names>
            <surname>Bezivin</surname>
          </string-name>
          .
          <article-title>Deriving ocl optimization patterns from benchmarks</article-title>
          .
          <source>Electronic Communications of the EASST</source>
          ,
          <volume>15</volume>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>S.</given-names>
            <surname>Sendall</surname>
          </string-name>
          and
          <string-name>
            <given-names>W.</given-names>
            <surname>Kozaczynski</surname>
          </string-name>
          .
          <article-title>Model transformation: The heart and soul of model-driven software development</article-title>
          .
          <source>Software</source>
          , IEEE,
          <volume>20</volume>
          (
          <issue>5</issue>
          ):
          <volume>42</volume>
          {
          <fpage>45</fpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>D.</given-names>
            <surname>Steinberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Budinsky</surname>
          </string-name>
          , E. Merks, and
          <string-name>
            <given-names>M.</given-names>
            <surname>Paternostro</surname>
          </string-name>
          .
          <article-title>EMF: eclipse modeling framework</article-title>
          .
          <source>Pearson Education</source>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Ujhelyi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Horvath</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Varro</surname>
          </string-name>
          .
          <article-title>A generic static analysis framework for model transformation programs</article-title>
          .
          <source>Technical report, Technical report</source>
          , Budapest University of Technology and Economics,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>A.</given-names>
            <surname>Vieira</surname>
          </string-name>
          and
          <string-name>
            <given-names>F.</given-names>
            <surname>Ramalho</surname>
          </string-name>
          .
          <article-title>A static analyzer for model transformations</article-title>
          .
          <source>In 3rd International Workshop on Model Transformation with ATL</source>
          , Zurich, Switzerland,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>