=Paper= {{Paper |id=Vol-1775/MODELS2016-SRC_paper_5 |storemode=property |title=Scalable Graph Query Evaluation and Benchmarking with Realistic Models |pdfUrl=https://ceur-ws.org/Vol-1775/MODELS2016-SRC_paper_5.pdf |volume=Vol-1775 |authors=Gábor Szárnyas |dblpUrl=https://dblp.org/rec/conf/models/Szarnyas16 }} ==Scalable Graph Query Evaluation and Benchmarking with Realistic Models== https://ceur-ws.org/Vol-1775/MODELS2016-SRC_paper_5.pdf
                           Scalable Graph Query Evaluation and
                            Benchmarking with Realistic Models

                                                       Gábor Szárnyas
                                    Budapest University of Technology and Economics
                                  Department of Measurement and Information Systems
                               MTA-BME Lendület Research Group on Cyber-Physical Systems
                                                 szarnyas@mit.bme.hu




Abstract                                                         ments to systems engineers. MDE tools check these patterns
Model queries are widely used in model-driven engineer-          by evaluating graph queries.1
ing toolchains: models are checked for errors with validation
queries, model simulations and transformations require com-      1.1    Scalable Graph Queries
plex pattern matching, while injective mappings for views        As models are rapidly increasing in size and complexity, ef-
are defined with model queries. Efficient and scalable eval-     ficient execution of model validation operations is challeng-
uation of complex queries on large models is a challenging       ing for the currently available toolchains, like ARTOP [2],
task. To achieve scalable graph query evaluation, I identified   Capella [38] or Papyrus [53].
key challenges such as the lack of credible benchmarks and           The last decade brought considerable improvements
difficulties of obtaining real models for performance testing.   in distributed storage and query technologies, known as
To address these challenges, my contributions target (1) dis-    NoSQL systems. These systems provide quick evaluation
tributed incremental graph queries, (2) a cross-technology       of simple retrieval operations and they are able to answer
benchmark for model validation, (3) characterization of re-      complex queries in a scalable manner, albeit not instantly.
alistic models, and (4) realistic models generation.             Providing quick response times for evaluating such queries
                                                                 over large and evolving data sets is still a challenging task.
Keywords distributed queries, model validation, model
                                                                     Graph queries capturing validation constraints are often
generation, benchmarking
                                                                 complex, including many join, antijoin and filtering opera-
                                                                 tions. However, most query technologies cannot efficiently
1.   Problem and Motivation
                                                                 evaluate such operations for models with 10 million model
Model-Driven Engineering (MDE) is a development method-          elements [48], while models of critical systems, software
ology used in many application domains such as criti-            and geospatial models are often 1–2 orders of magnitude
cal applications (automotive, avionics and railway sys-          larger [43]. A possible solution for scalable graph queries
tems [7, 31, 58]). To increase the efficiency of development,    is to use distributed query processing techniques [16, 59].
MDE facilitates the use of models in various modelling lan-      This brings us to the first research question I investigated.
guages targeting different levels of abstraction. Models can
be used not only for presenting the structure and behaviour         RQ 1. How to incrementally evaluate graph queries
of the system, but also for synthesizing various design ar-         over a distributed platform?
tifacts (such as source code, configuration files, documen-
tation). To catch design flaws early, model validation tech-     1.2    Benchmarking
niques check the well-formedness of models. Design rules
                                                                 To assess the performance of a graph query engine, a bench-
and well-formedness constraints are often captured in the
                                                                 mark framework is of high importance. According to the
form of graph patterns [9] to highlight invalid model ele-
                                                                 Benchmark Handbook [21], a useful benchmark is (1) rel-
                                                                 evant, (2) portable, (3) scalable, and (4) simple. To ensure
                                                                 relevance, the benchmark must use a representative work-
                                                                 load and models similar to realistic ones. Providing relevant
                                                                 results, while also guaranteeing the other three properties
                                                                 (portability, scalability and simplicity) is a major challenge.

                                                                 1 In this paper, I use the term graph as a synonym for instance model.
    For real-world industrial systems, both metamodels and         2.    Preliminaries
instance models are protected by intellectual property rights      This section introduces an example used throughout the pa-
(IPR). For example, AUTOSAR [7] is not an open standard,           per and presents the concept of incremental queries.
but only available to members of the consortium, therefore
it is not suitable for an open performance benchmark. Sim-         2.1   Running Example: Railway Network
ilarly, engineering models in the avionics and railway do-         As a running example, I use a small railway network, defined
mains are also not available to the public.                        on the metamodel of the Train Benchmark [48], a model val-
    These challenges confirm the need for a benchmark              idation benchmark (the benchmark and my related contribu-
framework, which provides a real-world-like workload sce-          tions are discussed in Section 4.2).2
nario and evaluates realistic queries on realistic models.             Figure 1 shows a schematic representation of the network,
Therefore, the second research question is the following.          with routes (1–3), switches and segments. As the first switch
  RQ 2. How to assess query technologies for a contin-             is set to a straight position and the second switch is set to a
  uous model validation scenario?                                  diverging position, a train passing through this track would
                                                                   follow route #3, hence that route active.
1.3     Characterization of Realistic Models                                     SwitchPosition       Route 1

                                                                                                Segment                                  Route 2
While existing generators may produce large models in in-                    Segment                            Switch
                                                                                                                              Segment

creasing sizes, these models are usually simple and syn-                               Switch      Segment                    Segment
thetic, which hinders their credibility for industrial and re-                                                                          Route 3
                                                                                                             SwitchPosition
search benchmarking purposes. Up to my best knowledge,
there are no existing techniques to characterize models used       Figure 1. Railway example model. The positions of the
in MDE practice. To develop such a technique, first I had to       switches designate route #3 as active.
address questions about model metrics, such as:

 • Which metrics can be used for characterizing models?
 • Is is possible to distinguish models of different domains,
      purely based on their metrics?

   To answer these questions, I conducted a literature review
in other disciplines, e.g. network theory and social network
analysis. The high-level goal of the research is to answer the
following question.

  RQ 3. What makes a model realistic?

1.4     Generating Realistic Models
Custom generators of graph-based models are used in MDE
for many purposes such as functional testing and perfor-
mance benchmarking of modeling environments to ensure
the correctness and scalability of tools. However, none is ca-
pable of generating realistic models scalable in size:             Figure 2. The graph of the railway example in Figure 1. The
                                                                   metamodel is shown in the top left corner of the figure. The
 • Logic-based synthesis (like Alloy [25]) generate well-          graph pattern for finding the active routes is shown in the top
      formed models but lack scalability.                          right corner.
 • Rule-based approaches [48] are capable of generating
      large models by using transformation rules or random             Modeling tools often represent their models as graphs.
      mutations to add new elements. However, they provide no      Figure 2 shows the example network as a labelled, attributed
      guarantees that the resulting model is realistic. Some ap-   graph, along with the metamodel of the graph. Routes follow
      proaches do not even guarantee well-formedness, which        a set of switch positions that contain the prescribed position
      is a prerequisite for realistic models.                      (straight or diverging) of the switch. The railway track con-
                                                                   sists of connected switches and segments.
   It is an open research question if it is possible to ensure     2 To guarantee that the example is concise and easy to understand, the
these properties.                                                  example only uses a fraction of the Train Benchmark metamodel. The
                                                                   benchmark uses models that are significantly more complex: they contain
  RQ 4. How to generate scalable and realistic models?             more metamodel elements (types) and consist of more elements (objects).
    The active route can be determined by evaluating a graph        Cross-technology benchmark for continuous validation.
query (by graph pattern matching). A route is active if all         Numerous benchmarks have been proposed to compare the
its switches are in the position prescribed by the switch           performance of query and transformation engines, but no
positions of the route. In other words, a route is active if none   openly available cross-technology benchmarks exist for con-
of its switches are set to a different position as the prescribed   tinuous model validation.
position. This results in the pattern shown in the upper right         The first transformation benchmark was proposed in [56],
corner of Figure 2. In the example, the graph query selects         which gave an overview of typical application scenarios of
route #3 as the active one, as both its switch positions (6 and     graph transformations together with their characteristic fea-
8) are satisfied by the corresponding switches (10 and 13).         tures. Many transformation challenges have been proposed
                                                                    as cases for graph and model transformation contests. How-
2.2   Incremental Query Evaluation                                  ever, only [22, 61] focus on query performance, while others
In many use cases, queries are continuously evaluated, while        measure the usability of the tools, the conciseness and read-
changes affect only a restricted part of data. The queries and      ability of the query languages and tests various advanced
transformations for simulation and well-formedness valida-          features, including reflection, traceability, etc.
tion in MDE are typical examples of such a workload. The               There are numerous benchmarks from the area of seman-
goal of incremental query evaluation is to speed up such            tic databases. SP2 Bench [44] features a synthetic DBLP-
queries, utilizing the (partial) results obtained during the pre-   like dataset, the Berlin SPARQL Benchmark (BSBM) [11]
vious executions of the query to compute the latest set of          simulates an e-commerce application, while the DBpe-
changes. For example, if the current position of the second         dia SPARQL benchmark [35] features a real data set with
switch in Figure 1 changes from diverging to straight, the          queries based on real-world user queries. The Linked Data
change only affects a small part of the graph (node 13 in Fig-      Benchmark Council (LDBC) recently developed the Social
ure 2). This allows the incremental query engine to quickly         Network Benchmark [19], a cross-technology benchmark,
reevaluate the query: in this case, the active route is changed     which provides an interactive workload and focuses on nav-
from #3 to #2.                                                      igational pattern matching (i.e. traversal operations). While
   Incremental query evaluation algorithms use additional           some of these benchmarks feature update operations and
data structures for caching interim results, hence they con-        hence measure incremental query performance, they provide
sume more memory than search-based, non-incremental al-             workloads that significantly differ from MDE use cases.
gorithms. In other words, they trade memory consumption
for execution speed. While incremental query engines pro-           Characterization of realistic models. Revealing essen-
vide quick response times for various use cases [9, 48], their      tial structural similarities and differentiations among net-
excessive memory consumption limits their scalability.              works from different fields is a fundamental objective in
                                                                    network theory with a wide range of applications. The au-
                                                                    thors of [15] list 22 areas using network theory, including so-
3.    Related Work                                                  cial network analysis, transportation, biomolecular networks
To appropriately address all the research questions in the          and chemistry. Network theory is also studied in physics,
context of MDE, a wide range of multidisciplinary topics            e.g. in the context of statistical mechanics [5]. However,
needs to be covered.                                                most of these applications use untyped (one-dimensional)
                                                                    networks. So far, existing multidimensional studies only
Distributed incremental graph queries. The Rete algo-               focused on models of a single application domain, such
rithm was originally created by Charles Forgy for rule-based        as neighbourhood and centrality analysis of a social net-
expert systems [20]. Bunke et al. [14] were the first to pro-       work [12], relevance and correlation analysis of different
pose the Rete algorithm in the context of graph transforma-         dimensions in Flickr [28], community detection in the net-
tions. Bergmann et al. adapted the algorithm for the Eclipse        work of YouTube [52].
Modeling Framework in the EMF-I NC Q UERY project [9],                  The authors of [10] use graph metrics to capture the struc-
now part of the V IATRA project [55].                               ture and evolution of software products and processes in
    Query languages and execution engines have been de-             order to detect significant structural changes, help estimate
veloped to support incremental graph queries on a single-           bug severity, prioritize debugging efforts, and predict defect-
machine environment. Drools [27] is an incremental busi-            prone releases in software engineering. Metrics are also
ness rule engine for Java-based systems. INSTANS [40] pro-          used for understanding the main characteristics of domain-
vides incremental queries over RDF [57].                            specific metamodels, to study model transformations with
    Various distributed, but non-incremental graph query sys-       respect to the corresponding metamodels, and search corre-
tems exist, including an approach based on SAP HANA [29],           lations between them via analytical measures [41].
a graph transformation tool using the Bulk Synchronous
Parallel graph processing model [30], and Trinity, an RDF-          Realistic model generation. The SP2 Bench [44] bench-
based query engine [45].                                            mark uses a generator based on the statistics of the DBLP
Figure 3. Graph of the contributions. Research questions are typeset in bold. Proposed contributions are noted with dashed
border. The arrows indicate the relationship between contributions, e.g. the results for C3 could be directly used for C1 and C4.


library. The authors of [36] use Boltzmann samplers [17] to        Hence, the query can be formalized in relational algebra as:3
ensure efficient generation of uniform models.
   OMOGEN [13] is a tool for automatically generating test               route . follows ./ σcurrentPosition6=position (
models, used for testing model transformations. The tool
                                                                                                                   
                                                                           switch ./ target ./ switchPosition) = {h3i}
takes a metamodel and a set of model fragments as its inputs
and combines the fragments using several strategies to build       where ./ denotes the natural join operator that joins its
valid instances.                                                   operands based on their common attributes, and . denotes
   gMark [8] is a domain-independent framework for syn-            the antijoin operator (also known as the anti-semijoin [46])
thesizing large graphs, allowing the user to specify parame-       that keeps the tuples from its left operand which do not have
ters – size, types, degree distributions and other constraints     a matching tuples in its right operand.
– for the graphs to be generated. gMark is also able to gen-          Figure 4 shows a distributed Rete network implementing
erate query workloads with queries of different size, shape        this relational algebra expression. The network is allocated
and selectivity.                                                   to two machines, Server 1 and Server 2. This allows the
                                                                   query engine to scale for larger graphs, for which the Rete
4.    Approach and Contributions                                   network would not fit in the memory of a single workstation.
                                                                   However, this approach still has a bottleneck limiting scal-
Figure 3 summarizes key the research questions and contri-         ability: if a Rete node cannot fit to the memory of a single
butions of my research. This section presents my approach          workstation, it will run out of memory.
along with achieved and proposed contributions.
                                                                   3 To formalize the query, the relations for the vertices and edges in Figure 2

4.1   Distributed incremental graph queries                        can be defined as follows:

To achieve scalable incremental query evaluation, I adapted         • route(route) = {h1i, h2i, h3i}
the Rete algorithm for distributed systems. I demonstrate the       • follows(route, switchPosition) =
Rete algorithm works on the ActiveRoute query (Figure 2).                 {h1, 4i, h2, 5i, h2, 7i, h3, 6i, h3, 8i}
As described in Section 2.1, the query collect Routes, where        • switch(switch, currentPosition) =
                                                                          {h4, divi, h5, stri, h6, stri, h7, stri, h8, divi}
all Switches along the route are in the position prescribed by
                                                                    • target(switchPosition, switch) =
the corresponding SwitchPosition. In other words, without                 {h4, 10i, h5, 10i, h6, 10i, h7, 13i, h8, 13i}
using the universal quantifier (∀), it searches for routes that     • switchPosition(switchPosition, position) =
do not have a SwitchPosition which prescribes a position                  {h10, stri, h13, divi}
different from the current position of its target Switch [39].
                                        Model access adapter                                              4.2   Cross-technology benchmark for continuous
                                                     Notifications                                              validation
                             Route.              Switch.          SwitchPosition.   SwitchPosition.
     Route
                             follows         currentPosition          target           position           In Section 2.1, I used a running example from the Train
      〈r〉
                             〈r, swP〉           〈sw, cP〉    a        〈swP, sw〉             〈swP, p〉   a   Benchmark framework. The Train Benchmark is an incre-
                   swP              swP                                  swP            swP
                                                                                                          mental model validation benchmark, continuously devel-
                            Join                                                 Join                     oped by the Fault-Tolerant Systems Research Group since
                         〈r, swP, sw〉                                       〈swP, sw, p〉
                                                                                                          2010. I have significantly extended the Train Benchmark,
            r                 r                    sw               sw                                    both conceptually and implementation-wise. Figure 5 shows
                 Antijoin                                  Join                                           the inputs of the benchmark process, the benchmark phases
                   〈r〉                             〈sw, cP, swP, p〉
                                                                                                          and the benchmark results.
                     Results
                                                 Selection, p ≠ cP                                           The Train Benchmark is a macro benchmark that aims
                                                  〈sw, cP, swP, p〉
            Query results                                                                                 to measure the performance of continuous model vali-
                Server 1                                             Server 2                             dation with graph-based models and constraints captured
                                                                                                          as queries. The benchmark is cross-technology, i.e. it is
   Figure 4. Rete network for the ActiveRoute pattern.                                                    implemented on a range technologies. The serialization
                                                                                                          formats include Eclipse-based model-driven engineering
                                                                                                          toolchains (EMF), graph databases [42], relational databases
   Using these techniques and algorithms, I made following                                                (SQL) and semantic technologies (RDF [57]). The query en-
contributions.                                                                                            gines include relational engines (SQLite, MySQL), graph
                                                                                                          transformation frameworks (V IATRA [55]), rule engines
Combine distributed actor model with Rete-based query                                                     (Drools [27]), graph query engines (Neo4j [37]) and SPARQL
evaluation network. I designed a distributed architecture                                                 engines (Sesame [3], Jena [1]). Also, the framework is ex-
and prototyped I NC Q UERY-D, a Rete-based query engine                                                   tensible which allows users of the benchmark to incorporate
using actors for distributed scalability. I presented a detailed                                          new technologies.
performance evaluation in the context of model incremental                                                   Earlier versions of the benchmark have been continuously
well-formedness validation. The results showed nearly in-                                                 used for performance measurements since 2012 [47, 54].
stantaneous complex query reevaluation well beyond 10M+                                                   The benchmark is also part of the benchmark suite used by
model elements, [47]. To further extend the scalability of the                                            the MONDO EU FP7 [34] project and was selected as a
system, I proposed sharding individual Rete nodes in [32].                                                case for the 2015 Transformation Tool Contest [51] as well.
                                                                                                          The benchmark framework is available as an open-source
Distributed termination protocol for asynchronous Rete.                                                   project.4
As Rete is an asynchronous algorithm, determining if the
network is in a consistent state w.r.t. the latest change set                                             Scalable technology-agnostic model generator. While the
requires a distributed termination protocol. The protocol was                                             original benchmark framework included a model generator,
also presented in [47] and [32].                                                                          its scalability was limited. I redesigned the model generator
                                                                                                          focusing on two aspects: (1) ensuring scalability for large
Experimental evaluation over distributed NoSQL databases.                                                 models, and (2) allowing the framework users to easily adapt
The proposed architecture and algorithms are representation-                                              new representations.
agnostic. They have been integrated with the Neo4j graph
database [23], the Titan distributed graph database and                                                   Propose novel query and transformation mixes for bench-
4store, a semantic database [47].                                                                         mark. The workload profile of the benchmark simulates
                                                                                                          real-world model validation scenarios of users loading, val-
Evaluation of Rete network optimization and allocation                                                    idating and transforming their models. The transformations
strategies. Allocating the Rete nodes in the cloud is a com-                                              capture user edits and quick-fix like automated refactor-
plex optimization problem, where the goal is to minimize                                                  ing operations. Some queries in the benchmark are struc-
the cost of communication between the nodes. I presented                                                  turally similar to AUTOSAR [7] validation queries (pre-
a solver-based approach for allocating Rete nodes in [33].                                                sented in [9]), while other aim to test various features of
I also proposed optimization techniques used in relational                                                graph query engines (such as efficient filtering and evalua-
query optimization for enhancing the performance of graph                                                 tion of negative conditions).
queries [50].
                                                                                                          Automated visualization and reporting. The framework
Uniqueness. Up to my best knowledge, existing technolo-                                                   features end-to-end automation [24] to (1) set up configura-
gies are either distributed [30, 45] or incremental [55], but                                             tions of benchmark runs, (2) generate large model instances
there is no system that provides scalable, distributed incre-
mental graph queries.                                                                                     4 https://github.com/FTSRG/trainbenchmark
                  Model             Run:   ×k                                                          Iteration:   ×n
                  Query                    Read               Check                Transformation               Recheck             Benchmark results
                                                                                                                                    {# of invalid elements,
              Scenario                execution time   # of invalid elements,         execution time       # of invalid elements,    execution times,
  {batch, inject, repair}                                 execution time                                      execution time         memory consumption}

                                              Figure 5. Phases of the Train Benchmark.


(3) execute benchmark measurements, (4) synthesize dia-                         I considered a metric useful if it separates models of different
grams for measurements using R scripts5 .                                       domains from each other, while provides similar values for
                                                                                models within the same domain. I also investigated whether
Cross-technology evaluation of incremental query execu-
                                                                                some of these metrics can distinguish real models from auto-
tion time and memory consumption. This cross-technology
                                                                                generated synthetic ones.
benchmark can be adapted to different model representation
                                                                                    Exploratory analysis relied on data visualization, while
formats and query technologies. This is demonstrated by 12+
                                                                                confirmatory analysis used statistical methods (such as per-
reference implementations over four different technological
                                                                                forming Kolmogorov–Smirnov tests on the derived metrics
spaces (EMF, graph databases, RDF and SQL) presented
                                                                                distributions). My initial finding is that different versions of
in [48].
                                                                                clustering coefficients (i.e. how tightly connected the model
Uniqueness. Compared to other benchmarks, the Train                             elements are) were particularly useful for such classifica-
Benchmark has the following set of distinguishing features:                     tions. But, unsurprisingly, no single metric was able to suf-
                                                                                ficiently handle all the domains. The analysis also provides
 • The workload profile follows a real-world model valida-
                                                                                some insights that needs to be considered in future model
      tion scenario by updating the model with changes derived                  generators to synthesize realistic models.
      by simulated user edits or transformations.
 • The benchmark measures the performance of both initial                       Automated classification of domain models using machine
                                                                                learning. As a future research objective, I plan to use
      validation and incremental revalidation.
                                                                                machine learning techniques for automated classification of
 • This benchmark was designed with cross-technology                            domain models.
      adaptations in mind. It can be implemented with different
      model representation formats and query technologies.                      Uniqueness. Up to my best knowledge, this is the first
                                                                                investigation for using multidimensional graph metrics for
4.3     Characterization of realistic models                                    both characterizing the realism of models and distinguishing
In [49], I presented multidisciplinary graph metrics and eval-                  different domain models from each other.
uated them on instance models from different domains. As a                      4.4     Realistic model generation
result, I proposed some metrics which turned out to be useful
for characterizing the structure of models.                                     As a proposed contribution, I plan to design and develop a
                                                                                generator that is capable of producing realistic models scal-
Adapt multidisciplinary metrics for engineering models.                         able in size. While there are solutions for generating either
I performed a literature review and identified several graph                    scalable or realistic models, there are no known approaches
metrics from other disciplines. For evaluating these metrics,                   for the combination of both, rendering this a high-risk re-
I gathered instance models from software and systems engi-                      search task. The long-term research objective of generating
neering domains:                                                                scalable and realistic models breaks down to the following
 • AutoFOCUS system models [6],                                                 steps:
 • Building Information Models (BIM) [18],                                      1. metrics guided generation of realistic models,
 • Capella system models [38],                                                  2. domain model generation by design space exploration [4],
 • JaMoPP code models [26],                                                     3. scalable rule-based generation of domain models.
 • railway models from the Train Benchmark [48],
 • Yakindu [60] statecharts.
                                                                                Acknowledgments
                                                                                I would like to thank my advisor, Dániel Varró for his guid-
Statistical characterization of different domains and mod-                      ance during my research. I would also like to express my
els. I used both exploratory and confirmatory data analysis                     gratitude to István Ráth, Gábor Bergmann and numerous
techniques in order to determine the “usefulness” of metrics.                   colleagues an co-authors for sharing their suggestions and
5 https://www.r-project.org/                                                    ideas.
References                                                                 ers, Managers, Designers, Engineers and Contractors. Wiley
 [1] Apache Jena. http://jena.apache.org/.                                 Publishing, 2008.

 [2] Artop: The AUTOSAR Tool Platform. https://www.artop.             [19] O. Erling, A. Averbuch, J. Larriba-Pey, H. Chafi, A. Gubichev,
     org/.                                                                 A. Prat-Pérez, M. Pham, and P. A. Boncz. The LDBC social
                                                                           network benchmark: Interactive workload. In Proceedings of
 [3] Sesame: RDF API and query engine.              http://www.            the 2015 ACM SIGMOD International Conference on Man-
     openrdf.org/.                                                         agement of Data, Melbourne, Victoria, Australia, May 31 -
 [4] H. Abdeen, D. Varró, H. A. Sahraoui, A. S. Nagy, C. De-              June 4, 2015, pages 619–630, 2015.
     breceni, Á. Hegedüs, and Á. Horváth. Multi-objective op-
                                                                      [20] C. Forgy. Rete: A fast algorithm for the many patterns/many
     timization in rule-based design space exploration. In ASE,
                                                                           objects match problem. Artif. Intell., 19(1):17–37, 1982.
     pages 289–300, 2014.
                                                                      [21] J. Gray, editor. The Benchmark Handbook for Database and
 [5] R. Albert and A.-L. Barabási. Statistical mechanics of com-
                                                                           Transaction Systems (2nd Edition). Morgan Kaufmann, 1993.
     plex networks. Rev. Mod. Phys., 74(1):47–97, January 2002.
                                                                      [22] T. Horn, C. Krause, and M. Tichy. The TTC 2014 movie
 [6] V. Aravantinos et al. AutoFOCUS 3: Tooling concepts for
                                                                           database case. TTC 2014, page 93, 2014.
     seamless, model-based development of embedded systems. In
     Joint Proceedings of ACES-MB & WUCOR co-located with             [23] B. Izsó, G. Szárnyas, I. Ráth, and D. Varró. IncQuery-D:
     MoDELS, pages 19–26, 2015.                                            Incremental graph search in the cloud. In BigMDE, 2013.
 [7] AUTOSAR Consortium. The AUTOSAR Standard. http:                  [24] B. Izsó, G. Szárnyas, I. Ráth, and D. Varró. MONDO-SAM:
     //www.autosar.org/.                                                   A framework to systematically assess MDE scalability. In
 [8] G. Bagan, A. Bonifati, R. Ciucanu, G. H. L. Fletcher,                 BigMDE@STAF, pages 40–43, 2014.
     A. Lemay, and N. Advokaat. Generating flexible workloads         [25] D. Jackson. Alloy: a lightweight object modelling notation.
     for graph databases. Proc. VLDB Endow., 9(13):1457–1460,              ACM Trans. Softw. Eng. Methodol., 11(2):256–290, 2002.
     Sept. 2016.                                                      [26] JaMoPP. The Java Model Parser and Printer, 2016. http:
 [9] G. Bergmann, Á. Horváth, I. Ráth, D. Varró, A. Balogh,            //www.jamopp.org/index.php/JaMoPP.
     Z. Balogh, and A. Ökrös. Incremental evaluation of model       [27] JBoss. Drools. http://www.jboss.org/drools.
     queries over EMF models. In MODELS, pages 76–90, 2010.
                                                                      [28] P. Kazienko, K. Musial, and T. Kajdanowicz. Multidimen-
[10] P. Bhattacharya, M. Iliofotou, I. Neamtiu, and M. Faloutsos.          sional social network in the social recommender system. IEEE
     Graph-based analysis and prediction for software evolution.           Trans. Systems, Man, and Cybernetics, 41(4):746–759, 2011.
     In ICSE, pages 419–429, 2012.
                                                                      [29] C. Krause, D. Johannsen, R. Deeb, K. Sattler, D. Knacker, and
[11] C. Bizer and A. Schultz. The Berlin SPARQL benchmark. In-             A. Niadzelka. An SQL-based query language and engine for
     ternational Journal on Semantic Web & Information Systems,            graph pattern matching. In ICGT, 2016.
     5(2):1–24, 2009.
                                                                      [30] C. Krause, M. Tichy, and H. Giese. Implementing graph
[12] P. Bródka, P. Kazienko, K. Musial, and K. Skibicki. Analysis
                                                                           transformations in the bulk synchronous parallel model. In
     of neighbourhoods in multi-layered dynamic social networks.
                                                                           FASE. 2014.
     Int. J. Computational Intelligence Systems, 2012.
                                                                      [31] B. Luteberget, C. Johansen, and M. Steffen. Rule-based con-
[13] E. Brottier, F. Fleurey, J. Steel, B. Baudry, and Y. L. Traon.
                                                                           sistency checking of railway infrastructure designs. In Inte-
     Metamodel-based test generation for model transformations:
                                                                           grated Formal Methods - 12th International Conference, IFM
     an algorithm and a tool. In ISSRE, pages 85–94, 2006.
                                                                           2016, Reykjavik, Iceland, June 1-5, 2016, Proceedings, pages
[14] H. Bunke, T. Glauser, and T. Tran. An efficient implemen-             491–507, 2016.
     tation of graph grammars based on the RETE matching al-
                                                                      [32] J. Maginecz and G. Szárnyas. Sharded joins for scalable
     gorithm. In Graph-Grammars and Their Application to Com-
                                                                           incremental graph queries. In 23rd PhD Mini-Symposium,
     puter Science, 4th International Workshop, Bremen, Germany,
                                                                           Budapest University of Technology and Economics, 2016.
     March 5-9, 1990, Proceedings, pages 174–189, 1990.
                                                                      [33] J. Makai, G. Szárnyas, I. Ráth, Á. Horváth, and D. Varró.
[15] L. d. F. Costa, O. N. Oliveira, G. Travieso, F. A. Rodrigues,
                                                                           Optimization of incremental queries in the cloud. In 3rd
     P. R. Villas Boas, L. Antiqueira, M. P. Viana, and L. E. Cor-
                                                                           International Workshop on Model-Driven Engineering on and
     rea Rocha. Analyzing and modeling real-world phenomena
                                                                           for the Cloud (CloudMDE) at MODELS, 2015.
     with complex networks: a survey of applications. Advances in
     Physics, 60(3):329–412, 2011.                                    [34] MONDO project. Scalable Modeling and Model Management
[16] J. Dean and S. Ghemawat. Mapreduce: Simplified data pro-              on the Cloud Project, 7th EU Framework Programme, 2016.
     cessing on large clusters. In OSDI, pages 137–150, 2004.         [35] M. Morsey, J. Lehmann, S. Auer, and A.-C. N. Ngomo. DB-
[17] P. Duchon, P. Flajolet, G. Louchard, and G. Schaeffer. Boltz-         pedia SPARQL benchmark: Performance assessment with real
     mann samplers for the random generation of combinatorial              queries on real data. In ISWC, 2011.
     structures. Combinatorics, Probability & Computing, 13(4-        [36] A. Mougenot, A. Darrasse, X. Blanc, and M. Soria. Uniform
     5):577–625, 2004.                                                     random generation of huge metamodel instances. In ECMDA-
[18] C. Eastman, P. Teicholz, R. Sacks, and K. Liston. BIM Hand-           FA, pages 130–145, 2009.
     book: A Guide to Building Information Modeling for Own-          [37] Neo Technology. Neo4j. http://neo4j.org/.
[38] PolarSys.       Capella.       https://www.polarsys.org/                  transformation platform: three generations of the VIATRA
     capella/.                                                                 framework. SOSYM, 15(3):609–629, 2016.
[39] A. Rensink. Representing first-order logic using graphs. In          [56] G. Varró, A. Schürr, and D. Varró. Benchmarking for graph
     ICGT, pages 319–335, 2004.                                                transformation. In VL/HCC. IEEE Press, 2005.
[40] M. Rinne, E. Nuutila, and S. Törmä. INSTANS: high-                 [57] W3C. Resource Description Framework (RDF). http://
     performance event processing with standard RDF and                        www.w3.org/standards/techs/rdf/.
     SPARQL. In ISWC, 2012.                                               [58] J. Whittle, J. E. Hutchinson, and M. Rouncefield. The state
[41] J. D. Rocco, D. D. Ruscio, L. Iovino, and A. Pierantonio. Min-            of practice in model-driven engineering. IEEE Software,
     ing correlations of ATL model transformation and metamodel                31(3):79–85, 2014.
     metrics. In MiSE, pages 54–59, 2015.                                 [59] R. S. Xin, J. E. Gonzalez, M. J. Franklin, and I. Stoica.
[42] M. A. Rodriguez and P. Neubauer. Constructions from Dots                  Graphx: a resilient distributed graph system on spark. In
     and Lines. Bulletin of American Society for Information                   GRADES co-loated with SIGMOD/PODS, page 2, 2013.
     Science & Technology, August/September, 2010.                        [60] Yakindu. Statechart Tools. http://statecharts.org/.
[43] M. Scheidgen, A. Zubow, J. Fischer, and T. H. Kolbe. Au-             [61] A. Zündorf. AntWorld benchmark specification, GraBaTs,
     tomated and transparent model fragmentation for persisting                2008.
     large models. In Model Driven Engineering Languages and
     Systems - 15th International Conference, MODELS 2012,
     Innsbruck, Austria, September 30-October 5, 2012. Proceed-
     ings, pages 102–118, 2012.
[44] M. Schmidt, T. Hornung, G. Lausen, and C. Pinkel.
     SP2Bench: A SPARQL performance benchmark. Shanghai,
     China, 2009. IEEE.
[45] B. Shao, H. Wang, and Y. Li. Trinity: a distributed graph
     engine on a memory cloud. In SIGMOD, 2013.
[46] A. Silberschatz, H. F. Korth, and S. Sudarshan. Database
     System Concepts, 5th Edition. McGraw-Hill Book Company,
     2005.
[47] G. Szárnyas, B. Izsó, I. Ráth, D. Harmath, G. Bergmann, and
     D. Varró. IncQuery-D: A distributed incremental model query
     framework in the cloud. In MODELS, pages 653–669, 2014.
[48] G. Szárnyas, B. Izsó, I. Ráth, and D. Varró. The Train Bench-
     mark: Cross-technology performance evaluation of continu-
     ous model validation. Software and Systems Modeling, 2017.
     Accepted.
[49] G. Szárnyas, Z. Kővári, A. Salánki, and D. Varró. Towards the
     characterization of realistic models: Evaluation of multidisci-
     plinary graph metrics. In MODELS, pages 87–94, New York,
     NY, USA, 2016. ACM.
[50] G. Szárnyas, J. Maginecz, and D. Varró. Evaluation of opti-
     mization strategies for incremental graph queries. Periodica
     Polytechnica, EECS, 2017. Accepted.
[51] G. Szárnyas, O. Semeráth, I. Ráth, and D. Varró. The TTC
     2015 Train Benchmark case for incremental model validation.
     In Proceedings of the 8th TTC, a part of STAF, 2015.
[52] L. Tang, X. Wang, and H. Liu. Community detection via het-
     erogeneous interaction analysis. Data Min. Knowl. Discov.,
     25(1):1–33, 2012.
[53] The Eclipse Foundation. Papyrus, 2015. https://eclipse.
     org/papyrus/.
[54] Z. Ujhelyi, G. Bergmann, Á. Hegedüs, Á. Horváth, B. Izsó,
     I. Ráth, Z. Szatmári, and D. Varró. EMF-IncQuery: An inte-
     grated development environment for live model queries. Sci.
     Comput. Program., 98:80–99, 2015.
[55] D. Varró, G. Bergmann, Á. Hegedüs, Á. Horváth, I. Ráth,
     and Z. Ujhelyi. Road to a reactive and incremental model