Analyzing Dynamic Models using
               a Data-flow based Approach

                  Christian Saad, Bernhard Bauer (Supervisor)
                christian.saad@informatik.uni-augsburg.de

                                University of Augsburg


       Abstract. Meta modeling as a method allows to devise languages suited
       for specific application domains, e.g. for describing the structural or be-
       havioral aspects of software systems. As meta models constitute an ab-
       stract syntax (often enriched with static semantics) there exist obvious
       similarities to the area of formal languages. The research effort described
       in this paper is intended to examine to what extent and to what benefit
       compiler construction concepts, namely the data-flow analysis method,
       can be transferred to the modeling domain in order to validate static
       semantics and perform abstract interpretations on models.


1     Motivation

In the last decade, the use of meta models has evolved from a scientific approach
to a widely used technique in areas like software development and business pro-
cess modeling (BPM). This advancement has been greatly endorsed by industry
standards like the OMG’s Meta-Object Facility (MOF) framework that forms
the basis for several prominent modeling languages like the the Unified Model-
ing Language or the Business Process Modeling Notation (BPMN) and even a
model-centric development approach called Model-driven Architecture (MDA)1 .
    Nevertheless, when compared to formal languages, an inherent flaw of the
modeling technique becomes obvious: The restrictions on a language’s structure
which are given by its abstract syntax, i.e. a context-free grammar (CFG) or
a meta model, are often not enough to ensure correct expressions - a check of
the static semantics is also in order. For this purpose, CFGs are often extended
with attributes, forming attribute grammars (AG), which enable compilers to
validate statements based on the context in which they appear (cf. [1]).
    A limited amount of static analysis can be performed using the OMG’s Ob-
ject Constraint Language (OCL, [7]) which enables to phrase constraints on the
structure of MOF and UML-based meta models. However, it has several draw-
backs: Through navigation statements, OCL constraints are tightly tied to the
structure of the meta model. This can lead to difficulties if constraints require
the consideration of a model element’s context, e.g. the position of an action
in an UML activity diagram relative to its preceeding/succeeding actions. As a
1
    http://www.omg.org/technology/documents/modeling_spec_catalog.htm
language which is mainly intended to be used to validate constraints by perform-
ing static queries on a model’s elements, OCL does not contain semantics which
would enable a fixed-point analysis since, in the case of cyclic dependencies this
would require a continuous reevaluation the DFA’s equation system. Finally,
complex navigation statements make OCL rules very prone to be invalidated if
the structure of the underlying meta model changes, often requiring extensive
adjustments.
    Since models comprise a graph structure, defining and calculating informa-
tion flow enables an advanced analysis of a model’s properties. These may either
be properties of the graph structure itself, e.g. strongly connected component
regions, or semantic attributes, e.g. the availability of a resource created at one
point in a control-flow model in some other part of the model, This research is
dedicated to adapting the data-flow analysis (DFA) approach commonly used in
compiler construction for deriving optimizations from a program’s control-flow
graph, to modeling, thus enabling a fixed-point analysis on (meta) models.


2   Related Work
Since the definition of static semantics is a vital step when developing formal
languages, many different techniques have been considered for this purpose.
    The OCL can be considered a comparatively simple language for require-
ments exceeding syntactic expressiveness and by design is very well-integrated
into the modeling domain. A critical evaluation of its expressive power can be
found in [5], with emphasis on difficulties in calculating transitive closures.
    Approaches considering the use of formal semantics include graph transfor-
mations, abstract state machines or first order logic (cf. [8] [6] [3]).
    Several authors use DFA-based methods to derive information from models.
In [9], a fixed-point calculation is used on executable models to derive def/use
relationships between actions while the authors of [2] propose the use of control-
flow information to improve software testing.
    The focus in these cases is, however, directed towards implementing a specific
case study. To the best of our knowledge, there is no other research effort with the
goal of applying the concept of a fixed-point based DFA calculation to models.


3   Approach for Model-based Data-flow Analysis
Both meta models and (context-free) grammars operate on different layers of
abstraction. For grammars, these usually consist of EBNF, CFGs and language
expressions. In the widely-used MOF one distinguishes between meta-meta (M3),
meta (M2) and model (M1) layer. Because of conceptual similarity, both tech-
niques can be aligned according to their levels of abstraction. This alignment
requires that data-flow equations are attached to language artefacts (i.e. M2
meta classes) and instantiated/executed for expressions, i.e. M1 objects.
   Since attribute grammars are a well-proven extension of this design, it was
decided that this strategy will also be employed in the definition of data-flow
equations: An attribute definition consisting of an identifier and a data-type can
be bound to several meta model classes through the use of attribute occurrences.
Semantic rules (corresponding to data-flow equations) are assigned to attribute
definitions and attribute occurrences to calculate initialization and iteration val-
ues, respectively. Data-flow between attributes occurs if a semantic rule requests
the value of another attribute as input.


         (a) Data-flow analysis meta model             (b) Calculate predecessors

      Fig. 1. Model-based DFA definition in the notion of attribute grammars


    To stay consistent with the notion of modeling and to minimize frictional
losses between different techniques, it is desirable to represent DFAs themselves
as models, i.e. to provide a meta model which in the described alignment hier-
archy acts as an extension of the M3 layer (cf. Figure 1(a)).
    To calculate a DFA for a given model, attribute occurrences assigned to meta
classes need to be instantiated for model elements derived from these classes.
While the instantiation process is straightforward, it has to be noted that this
must happen in compliance with principles like generalization, i.e. if defined for
a super class, attributes must be inherited by instances of sub classes.
    An example is shown in Figure 1(b) which calculates the transitive closure
of predecessor nodes in a simple control-flow graph. The class node has an as-
signed attribute occurrence of the type all predecessors. The associated rule,
which is repeatedly executed for all nodes, recursively creates the union of direct
predecessors and the value of all predecessors at preceding nodes.
    The fixed-point calculation significant differs from traditional DFA tech-
niques: Since semantic rules may request the value of arbitrary attributes as
input when they are executed, there exists neither an inherent flow direction nor
any prior knowledge about output relationships between attribute instances.
Since the commonly used worklist algorithm depends on this information, it is
not applicable in this context. To accomodate for this, an algorithm has been
developed that dynamically records input/output relationships by executing the
rules recursively to create a dependency graph which can then be used as a basis
to derive a nearly optimal execution order, thus minimizing the amount of rule
executions.
4     Research

The research effort described in this abstract has to cover the following issues:

Definition and Alignment To transfer the method of DFA to the modeling
   domain, similarities and differences between both application areas must be
   identified and a new definition language has to be devised and aligned with
   the modeling techniques.
DFA Algorithms Suitable algorithms for calculating DFA equation systems
   have to be devised and proven to be correct.
Complexity and Performance The practical and theoretical performance of
   different DFA solving algorithms has to be evaluated.
Tooling and Use Cases To prove the feasability of this approach, a tooling
   environment has to be provided alongside the implementation of several use
   cases which need to be evaluated against implementations based on alterna-
   tive theoretical foundations.

     While efforts for DFA definition and integration into modeling are well ad-
vanced (as described in the previous section), a formalization of these results is
still in order. The same goes for the DFA solving algorithm which has proven
to perform very well in comparison to adaptions of conventional algorithms like
the work list method.
     The Model Analysis Framework (MAF2 ) is a fully functional prototype pro-
viding tooling support for the described concepts. It is built upon Eclipse model-
ing techniques, namely the Eclipse Modeling Framework (EMF), an implemen-
tation of the MOF standard. Once all artefacts required for an analysis (i.e.
meta models, models and attributions based on the meta model shown in Fig-
ure 1(a)) are loaded into the respective repositories, the defined attributes are
instantiated for the model’s elements and handed to the selected evaluation al-
gorithm. First experiments have shown that the developed algorithm preforms
much better in calculating the result values than a traditional worklist algorithm
that has been enhanced with the ability to handle dynamically discovered output
dependencies.
     Currently implemented case studies (which are available from the code repos-
itory) include: Calculating properties of control-flow graphs like transitive prede-
cessor/successor sets and strongly connected components (SCC), definition/use
relationships in workflows and the hierarchical subdivision of control-flows into
its Single-Entry-Single-Exit (SESE) components using a token-flow algorithm
(cf. [4]) which has been reimplemented using DFA. In the future, additional ap-
plications areas are planned like clone detection, validating modeling guidelines
and generating test models for model-based testing approaches.

2
    http://code.google.com/p/model-analysis-framework/
References
1. Aho, A.V., Lam, M.S., Sethi, R., Ullman, J.D.: Compilers: Principles, Techniques,
   and Tools (2nd Edition). Addison Wesley (August 2006), http://www.amazon.ca/
   exec/obidos/redirect?tag=citeulike09-20\&amp;path=ASIN/0321486811
2. Garousi, V., Bri, L., Labiche, Y.: Control Flow Analysis of UML 2.0 Sequence
   Diagrams (2005)
3. Georg, G., Bieman, J., France, R.: Using Alloy and UML/OCL to specify run-time
   configuration management: a case study. Practical UML-Based Rigorous Develop-
   ment Methods-Countering or Integrating the eXtremists 7 (2001)
4. Götz, M., Roser, S., Lautenbacher, F., Bauer, B.: Token Analysis of Graph-Oriented
   Process Models. New Zealand Second International Workshop on Dynamic and
   Declarative Business Processes (DDBP), in conjunction with the 13th IEEE In-
   ternational EDOC Conference (EDOC 2009) (September 2009)
5. Mandel, L., Cengarle, M.: On the expressive power of the Object Con-
   straint Language OCL. Available on the World Wide Web: http://www. fast.
   de/projeckte/forsoft/ocl (1999)
6. Ober, I.: An ASM Semantics of UML Derived from the Meta-Model and Incorpo-
   rating Actions pp. 356–371 (2003)
7. Object Management Group: Object Constraint Language.
   http://www.omg.org/spec/OCL/2.0/ (Mai 2006)
8. Varró, D.: A formal semantics of UML Statecharts by model transition systems.
   Graph Transformation pp. 378–392 (2002)
9. Waheed, T., Iqbal, M., Malik, Z.: Data Flow Analysis of UML Action Semantics for
   Executable Models. Lecture Notes in Computer Science 5095, 79–93 (2008)