<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Optimization of Incremental Queries in the Cloud</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>József Makai</institution>
          ,
          <addr-line>Gábor Szárnyas, A</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>Database and model queries are the foundations of data-driven applications. Their performance is of primary importance in model-driven software engineering (MDE), especially with the ever-growing complexity of software modeling projects. To address the scalability issues of traditional MDE tools, distributed frameworks are being developed; however, query optimization in a distributed context brings a whole new set of challenges, including the capacity limits of individual nodes and network communication. In this paper, we aim to address the allocation optimization challenge in the context of distributed incremental query evaluation. Our methods are based on a combination of heuristics-based resource consumption estimations and constraint satisfaction programming. We evaluate the impact of the optimization techniques by conducting benchmark measurements.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>I. INTRODUCTION</title>
      <p>Model-driven software engineering (MDE) plays an
important role in the development processes of critical embedded
systems. With the dramatic increase in complexity that is
also affecting critical embedded systems in recent years,
modeling toolchains are facing scalability challenges as the
size of design models constantly increases, and automated tool
features become more sophisticated.</p>
      <p>Many scalability issues can be addressed by improving
query performance. Incremental model queries aim to reduce
query response time by limiting the impact of model
modifications on query result calculation. This technique has been
proven to improve performance dramatically in several cases
(e.g. on-the-fly well-formedness validation or model
synchronization), at the cost of increased memory consumption.</p>
      <p>INCQUERY-D [1] aims to address the memory consumption
problem by offloading memory-intensive components to a
distributed architecture. INCQUERY-D is a system based on
a distributed Rete network [2] that can scale up to handle
very large models (typically stored in distributed NoSQL or
graph databases) and complex queries efficiently.</p>
      <p>However, the introduction of a cloud or grid-like
architecture brings a whole new set of query optimization
challenges. In this paper, we focus on the (static) allocation
optimization problem, which in our context means the assignment
of memory-intensive distributed components to computation
resources according to a strategy that is optimal in terms
of overall cost, or favourable in terms of query evaluation
performance. We propose to use combinatorial optimization
methods based on constraint satisfaction programming, aided
by heuristics to estimate i) the resource usage of individual
system components and ii) the network traffic between distributed
computation nodes. Our aim is to provide a fully automated
allocation optimization mechanism for INCQUERY-D.</p>
      <p>This work was partially supported by the MONDO (EU ICT-611125)
project and the Lendület programme (MTA-BME Lendület Cyber-Physical
Systems Research Group).</p>
    </sec>
    <sec id="sec-2">
      <title>II. BACKGROUND</title>
      <sec id="sec-2-1">
        <title>A. The Allocation Problem</title>
        <p>In the context of distributed systems, allocation means the
assignment of computation resources to computation tasks.
The input of the allocation consists of the computation tasks
and the edges between them (which represent data-flow
between two tasks), the available machines with their capacities,
and possibly the quality of the network links between machines.
Computation tasks have to be assigned to processes first,
as we run processes (Java Virtual Machines specifically for
INCQUERY-D) on machines, and the resource consumption of
a process consists of the collective resource consumption of
tasks assigned to this particular process. The output of the
allocation is a possible mapping of processes to machines.</p>
        <p>1) Allocation constraints: Allocation constraints are
derived from the observation that certain subsets of computation
tasks together use so many resources that allocating them to
a set of machines without sufficient resources would cause a
bottleneck in the query evaluation, or even cause the failure of
the computation. We call an allocation valid if the constraints
are satisfied. We assume that the tasks of query evaluation
are memory-intensive computations, therefore the allocation
constraints consist of memory constraints in our case. The
goal of allocation optimization is to provide a valid allocation
optimized for certain optimization targets.</p>
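        <p>To make the validity notion concrete, the following is a minimal sketch in Python (with an assumed data layout, not INCQUERY-D code): an allocation maps each process to one machine, and it is valid exactly when no machine's memory capacity is exceeded by the processes placed on it.</p>
```python
# Hypothetical data layout: assignment[j] is the machine index chosen for
# process j, process_mem[j] its estimated memory need, and machine_cap[i]
# the memory capacity of machine i.
def is_valid_allocation(assignment, process_mem, machine_cap):
    used = [0.0] * len(machine_cap)
    for proc, machine in enumerate(assignment):
        # each process contributes its memory need to exactly one machine
        used[machine] += process_mem[proc]
    # valid if and only if every machine stays within its capacity
    return all(cap >= u for u, cap in zip(used, machine_cap))
```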
        <p>In this paper, we consider two possible optimization targets.
1) Communication Minimization: This objective is
intended to increase the performance of query evaluation by
reducing the overhead of network communication originating
from data transmission in the distributed system. This is
mainly useful when the time spent on network communication
is significant compared to the time of the local computations
performed by the tasks, which is the case for distributed query evaluation.</p>
        <p>2) Cost Minimization: Cost Minimization aims to reduce
the monetary cost of computation infrastructure the system
is operated on. This technique is useful for most distributed
systems, as large systems usually require a lot of resources and
thus, are expensive to operate, especially when the system runs
in a public cloud infrastructure.</p>
      </sec>
      <sec id="sec-2-2">
        <title>B. INCQUERY-D and the Rete Algorithm</title>
        <p>INCQUERY-D [1] is a distributed, incremental model query
engine that aims to address scalability issues of incremental
queries over large models. The high-level architecture of the
system is shown in Figure 1. INCQUERY-D stores the models
in distributed databases, as the models can be arbitrarily large.
A typical INCQUERY-D installation uses a distributed graph
database such as 4store (http://4store.org/).</p>
        <p>Fig. 1: The high-level architecture of INCQUERY-D: a database cluster of database shards (1 to n) hosted on DB servers, used to load and modify the data.</p>
        <p>1) The Rete Algorithm: The theoretical foundations of
INCQUERY-D are provided by the Rete algorithm [2]. For
each query, an asynchronous dataflow network of
communicating nodes is constructed, consisting of three main types of
Rete nodes: i) the indexer layer of INCQUERY-D consists of
input nodes that are responsible for caching model element
identifiers by type and for propagating change notifications;
ii) worker nodes implement relational data operations (such as
joins and projections) that make up a query, and they also store
the partial results to be propagated to their child nodes; iii)
production nodes are responsible for storing and maintaining
the final results of queries, as they reside at the endpoints of the
network. Furthermore, they provide an interface for accessing
query results and result deltas.</p>
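        <p>To illustrate how a worker node maintains partial results incrementally, the following is a simplified sketch of a Rete-style natural-join node (illustrative Python, not INCQUERY-D's actual API): each insertion from a parent is joined against the cached tuples of the other side, and only the resulting delta is propagated to child nodes.</p>
```python
# Illustrative sketch of an incremental natural-join worker node.
class JoinNode:
    def __init__(self):
        self.left, self.right, self.results = [], [], []

    def insert_left(self, tup):
        # tup = (key, value); join the new tuple against the cached right side
        self.left.append(tup)
        delta = [(tup[0], tup[1], v) for k, v in self.right if k == tup[0]]
        self.results.extend(delta)
        return delta  # only the delta is propagated to child nodes

    def insert_right(self, tup):
        self.right.append(tup)
        delta = [(tup[0], v, tup[1]) for k, v in self.left if k == tup[0]]
        self.results.extend(delta)
        return delta
```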
      </sec>
    </sec>
    <sec id="sec-3">
      <title>III. ALLOCATION OPTIMIZATION IN INCQUERY-D</title>
      <sec id="sec-3-1">
        <title>A. Formalization of the Allocation Problems</title>
      </sec>
      <sec id="sec-3-2">
        <title>1) Communication Minimization</title>
        <p>For this problem, we
need to represent the communication intensity between processes
and the quality of the network connections between machines in a
mathematical model to approximate data transmission time.
For communication intensity, we defined our own logical data
unit, the normalized tuple, which is based on the simplification
that all data inside the Rete network is represented by tuples
made up of identical scalar elements (strings).</p>
      </sec>
      <sec id="sec-3-3">
        <title>Definition 1 (Normalized tuple)</title>
        <p>Given a set of model elements organized in tuples (with the same arity), the normalized
tuple count is defined as the product of the tuple arity (number of
fields) and the number of tuples.</p>
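        <p>A minimal sketch of Definition 1 (illustrative Python): for a set of tuples of identical arity, the normalized-tuple measure is simply arity times tuple count.</p>
```python
def normalized_tuples(tuples):
    # normalized tuples = tuple arity (number of fields) * number of tuples
    if not tuples:
        return 0
    arity = len(tuples[0])
    return arity * len(tuples)
```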
      </sec>
      <sec id="sec-3-4">
        <title>Definition 2 (Overhead multiplier)</title>
        <p>To describe the quality of network connections, we use a simplification called the
overhead multiplier. This value represents all important parameters
of the network connection as a scalar, which allows us to compare
the transmission times associated with identical (logical) volumes
of data across various connections.</p>
      </sec>
      <sec id="sec-3-5">
        <title>Empirical Validation</title>
        <p>The use of the normalized tuple and the overhead multiplier has
been validated by empirical measurements, as shown in Figure
2. We can also observe that transmission time within the same
host is characteristically smaller than between different hosts.</p>
        <p>Fig. 2: Time to send large data.</p>
        <p>The input of the problem is the following.</p>
        <p>Memory requirements of processes: the vector S = [s_1, s_2, …, s_n] contains the estimated memory requirement of each process (1). Memory capacities of machines: the vector C = [c_1, c_2, …, c_m] contains the memory capacity of each machine (2).</p>
        <p>Communication edges between processes: the matrix E (n × n) contains the communication intensity between every two processes, measured in normalized tuples; for the i-th process, the i-th row describes these values (3).</p>
        <p>Communication overheads between machines: the matrix O (m × m) contains the overhead multipliers between machines; for the i-th machine, the i-th row describes these values (4).</p>
        <p>The matrix W contains the calculated communication weights between processes for valid allocations (5).</p>
        <p>The auxiliary matrix X describes the allocation: x_i,j ∈ {0, 1}, where x_i,j = 1 means that the j-th process is allocated to the i-th machine (6).</p>
        <p>Valid allocations have to satisfy two kinds of constraints: i) each process must be allocated to exactly one machine,</p>
        <p>∀j : Σ_{i=1..m} x_i,j = 1 (7)</p>
        <p>and ii) for each machine, the memory capacity must not be exceeded by the processes allocated to it:</p>
        <p>∀i ∈ {1, …, m} : s_1 · x_i,1 + s_2 · x_i,2 + … + s_n · x_i,n ≤ c_i (10)</p>
        <p>If two processes are placed on particular machines, their communication intensity has to be multiplied by the overhead value between those machines:</p>
        <p>∀i, j, k, l, k ≠ l : x_i,k + x_j,l ≥ 2 → w_k,l = e_k,l · o_i,j (8)</p>
        <p>As the objective, we minimize the sum of the elements of the W matrix:</p>
        <p>weight = min { Σ_{1 ≤ i, k ≤ n} w_i,k } (9)</p>
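        <p>For intuition, the exhaustive search below restates the Communication Minimization model in executable form (a Python sketch with an assumed data layout; the paper solves this NP-hard problem with CPLEX rather than brute force): it enumerates all assignments, discards those violating the memory constraints, and scores the rest by the overhead-weighted communication sum.</p>
```python
import itertools

def min_weight_allocation(mem, cap, E, O):
    # mem[k]: memory need of process k; cap[i]: capacity of machine i
    # E[k][l]: communication intensity between processes k and l
    # O[i][j]: overhead multiplier between machines i and j
    n, m = len(mem), len(cap)
    best, best_alloc = None, None
    for alloc in itertools.product(range(m), repeat=n):
        used = [0.0] * m
        for k in range(n):
            used[alloc[k]] += mem[k]
        if any(used[i] > cap[i] for i in range(m)):
            continue  # memory constraint violated, allocation invalid
        # w_{k,l} = e_{k,l} * o_{i,j} for the machines hosting k and l
        weight = sum(E[k][l] * O[alloc[k]][alloc[l]]
                     for k in range(n) for l in range(n) if k != l)
        if best is None or best > weight:
            best, best_alloc = weight, alloc
    return best, best_alloc
```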
      </sec>
      <sec id="sec-3-6">
        <title>2) Cost Minimization</title>
        <p>The input: memory requirements (1) and memory capacities (2) are the
same as for Communication Minimization.</p>
        <p>Cost of machines: the vector B = [b_1, b_2, …, b_m] contains the monetary cost of each machine (11).</p>
        <p>The vector w contains the cost of each machine in the
system: the cost of the machine if it is used, otherwise 0. A machine is used if at least one process is allocated to it:</p>
        <p>∀i ∈ {1, …, m} : Σ_{j=1..n} x_i,j ≥ 1 ↔ w_i = b_i (12)</p>
        <p>As the objective, we minimize the sum of the elements of the
w vector:</p>
        <p>cost = min { Σ_{i=1..m} w_i } (13)</p>
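        <p>The cost objective can be restated in executable form as a small sketch (illustrative Python, assumed data layout): a machine contributes its monetary cost exactly when at least one process is allocated to it.</p>
```python
def total_cost(assignment, machine_cost):
    # assignment[j]: machine index of process j; machine_cost[i]: cost b_i
    used = set(assignment)  # machines hosting at least one process
    # w_i = b_i if machine i is used, otherwise 0; the cost is their sum
    return sum(cost for i, cost in enumerate(machine_cost) if i in used)
```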
        <p>The lack of exact a-priori knowledge of optimization
parameters is a well-known challenge in query optimization,
which can be addressed by heuristics-based estimations to
yield reasonable approximations in an efficient way. In our
context, the memory usage of processes is a primary target for
such estimations, for both problems. Furthermore, the
communication intensity (i.e. network traffic) between processes
is necessary for the Communication Minimization problem.</p>
        <p>1) Importance of Precise Estimation: If we overestimate
the memory requirement of a process and reserve much more
memory than necessary, then we cannot allocate as many
processes to a machine as would otherwise be possible, and
thus we cannot reach potentially better allocations. On the
other hand, if we underestimate the memory consumption of
processes, then we risk either that the processes fail because
of an insufficient amount of allocated memory, or we risk the
violation of allocation constraints, and thus the overuse
of resources. As a consequence, accurate memory estimation
is required for good resource utilization and for the stable
operation of the system.</p>
        <p>For Communication Minimization, the precise estimation
of communication intensity is of key importance, as an imprecise
estimate could misdirect the allocation.</p>
        <p>For the Rete algorithm that operates with collections storing
static structures (tuples), the memory usage and the
communication intensity of a component can be estimated using simple
linear regression methods. The independent variable of these
linear functions is the estimated number of model elements,
measured in normalized tuples, stored by the Rete nodes.</p>
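        <p>As a sketch of such a heuristic (illustrative Python; the sample numbers in the usage are invented, not measurements from the paper), a closed-form least-squares fit yields the slope and intercept of the linear function mapping normalized-tuple counts to resource usage.</p>
```python
def fit_line(xs, ys):
    # ordinary least squares for y = slope * x + intercept
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x
```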
        <p>2) Calculation and Propagation of Estimations: The core
idea is to estimate the amount of outbound network traffic
from a Rete node by the amount of data it contains, as in
the initialization phase of the network, the entirety of this
data will be transmitted. On the other hand, data-storing Rete
nodes must store all the data that is propagated by their parent
nodes.</p>
        <p>This principle can be applied and propagated through the
complete Rete structure using breadth-first search (BFS).
Figure 3 shows an example where we marked the iteration in which the
different estimations can be calculated. A white circle
denotes a memory estimation and a black circle denotes a
communication intensity estimation. After the level of the input
nodes (for which the amount of stored, and thus propagated,
data is known exactly, as they propagate all their input
data), the estimations for their direct child nodes can be
calculated in the second iteration, and so on. It is important to
note that we cannot calculate the estimations for a node until the
estimations have been calculated for all of its parent nodes.</p>
        <p>Fig. 5: Rete nodes and processes with estimations.</p>
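        <p>The propagation scheme above can be sketched as follows (illustrative Python with a hypothetical graph encoding): input nodes start with exactly known sizes, and each remaining node is estimated as soon as all of its parents have been, in BFS order.</p>
```python
from collections import deque

def propagate(parents, input_sizes, estimate):
    # parents[v]: list of parent nodes of v (empty for input nodes)
    # input_sizes[v]: exactly known size of input node v
    # estimate(parent_sizes): estimation rule for a non-input node
    size = dict(input_sizes)
    remaining = {v: len(ps) for v, ps in parents.items() if ps}
    children = {}
    for v, ps in parents.items():
        for p in ps:
            children.setdefault(p, []).append(v)
    queue = deque(v for v, ps in parents.items() if not ps)
    while queue:
        v = queue.popleft()
        for c in children.get(v, []):
            remaining[c] -= 1
            if remaining[c] == 0:  # all parents estimated: c can be estimated
                size[c] = estimate([size[p] for p in parents[c]])
                queue.append(c)
    return size
```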
        <p>The estimations for different types of Rete nodes are
calculated following different rules that were refined in empirical
experiments carried out with typical model-driven engineering
workloads and models (sparse graphs). We present a few
examples of these rules; for more examples, see [3].</p>
        <p>When i) joining nodes of the model through edges, we
expect to have 1% of the input set size (the Cartesian product
of the two input sets) as the result. For ii) selection nodes (which
filter elements according to criteria), we expect to have 10%
of the input elements as the result, as this is a good upper-bound
estimation for usual queries.</p>
        <p>2) Memory constraints: The constraint is the following for machine 1:</p>
        <p>600 · x_1,1 + 2400 · x_1,2 + 3200 · x_1,3 + 500 · x_1,4 ≤ c_1</p>
        <p>3) 1-to-1 Allocation: Constraint (7) in OPL:</p>
        <p>forall (j in 1..n)
  sum (i in 1..m)
    x[i][j] == 1;</p>
        <p>The constraint for process 1 is: Σ_{i=1..m} x_i,1 = 1.</p>
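        <p>The two example estimation rules above can be sketched directly (illustrative Python; the 1% and 10% factors are the empirical heuristics stated in the text):</p>
```python
def estimate_join(left_size, right_size):
    # join result estimated as 1% of the Cartesian product of the inputs
    return 0.01 * left_size * right_size

def estimate_selection(input_size):
    # selection result estimated as 10% of the input (an upper bound)
    return 0.10 * input_size
```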
      </sec>
      <sec id="sec-3-7">
        <title>C. Solution by Constraint Satisfaction</title>
        <p>We proved both the Communication and Cost Minimization
problems to be NP-hard [3], therefore we propose to solve
the problems by constraint satisfaction programming (CSP) [4],
using the IBM CPLEX Optimizer (http://www-01.ibm.com/software/commerce/optimization/cplex-optimizer)
as the platform of implementation. We show how the Communication Minimization
problem can be formalized with CPLEX and the OPL
language (http://www-01.ibm.com/software/commerce/optimization/modeling). We also illustrate the constraints with an example.</p>
        <p>1) Input representation: Figure 4 shows the available
machines with their memory capacities and with the overhead
multipliers between each pair of machines.
Figure 5 shows the sample Rete network with the estimations
given and already assigned to processes.</p>
        <p>We have an auxiliary matrix X to describe the allocation
constraints (6) and (7). If x_i,j = 1, then the j-th process will
be allocated to the i-th machine, and 0 means the opposite.</p>
        <p>4) Objective Function: The OPL code for constraints (8) and (9),
where the edges matrix (n × n) contains the
communication intensity values between every two processes
and the overheads matrix (m × m) contains the overhead
multipliers between machines:</p>
        <p>forall (i,j in 1..m)
  forall (k,l in 1..n)
    (x[i][k] + x[j][l] >= 2) => w[k][l] == edges[k][l] * overheads[i][j];
minimize sum (i,j in 1..n) w[i][j];</p>
        <p>The constraint for the communication cost between processes
1 and 3, which may be placed on any pair of machines; for example,
when both are on machine 1:</p>
        <p>x_1,1 + x_1,3 ≥ 2 → w_1,3 = 200000 · 1</p>
        <p>5) Solution: Given this input, the CPLEX Optimizer
determines the overall communication intensity of the optimal
solution to be 2,200,000, as shown in Figure 6.</p>
        <p>IV. EVALUATION</p>
        <p>To justify the beneficial effects of Communication
Minimization on query performance and stability, we conducted
measurements using a model validation benchmark, the Train
Benchmark [5], a suite designed for evaluating the
performance of graph pattern matching tools.</p>
      </sec>
      <sec id="sec-3-8">
        <title>A. Experiments</title>
        <p>We designed two experiments. In the first experiment, the
impact of respecting allocation constraints is assessed by
comparing an optimized configuration to a naïve setup that a)
uses the default heap size for the Java Virtual Machines, and b)
assigns close to the maximum available RAM to the JVMs.
In the second experiment, we assess the impact of network
traffic optimization by comparing the optimized configuration
to a non-optimized setup. To emphasize the effect of network
communication on overall performance, we configured the
cloud environment to use connections with lower bandwidth (10
Mbit/s) that simulate a system under load. In both experiments,
we ran a complex query of the Train Benchmark on instance
models of increasing sizes, up to 2 million nodes and 11
million edges.</p>
        <p>1) Hardware and software setup: The benchmark software
configuration consisted of an extended Train Benchmark setup
using the 4store (version 1.1.5) database in a clustered
environment. We ran three virtual machines (VMs) on a private
cloud powered by Apache VCL; each VM was running
64-bit Ubuntu Linux 14.04 on Intel Xeon 2.5 GHz CPUs with
8 GB of RAM. We used Oracle 64-bit Java Virtual Machines
(version 1.7.0_72). We integrated our monitoring system into
the benchmark environment in order to record telemetry data at
each execution stage of the benchmark. The benchmark results
were automatically processed by the R scripts provided by the
Train Benchmark framework.</p>
      </sec>
      <sec id="sec-3-9">
        <title>B. Results and Analysis</title>
      </sec>
      <sec id="sec-3-10">
        <title>1) Experiment 1: Memory optimization</title>
        <p>As can be seen
in Figure 7, both the “default heap size” variant and the
“maximum heap size” variant failed to execute successfully
for the largest instance model sizes, both reporting out-of-heap-space
errors in the JVMs running one of the Rete nodes.</p>
        <p>As the “default heap size” variant uses 1 GB heap limits, the
JVM may get into a thrashing state under such loads and cause
a timeout during measurement which the Train Benchmark
framework registers as a failed run. Similarly, in the case of the
“maximum heap size” variant, due to the lack of a reasonable
upper limit, the JVMs may interfere with each other’s memory
allocations resulting in a runtime error.</p>
        <p>Fig. 7: Batch validation time, RouteSensor (x, y: logscale).</p>
        <p>2) Experiment 2: Network optimization: Figure 7 compares
the optimized variant to a “non-optimized” one (shown as
unoptimized). We may observe that while the overall characteristics
are similar, the optimized variant shows a constant-multiplier
advantage, running approximately 15–20% faster than the
unoptimized variant. Figure 8 shows the reevaluation time
of a query after changes were applied to the model. In
the reevaluation phase, the amount of data transmitted is
significantly less than in the first evaluation phase. As the
numbers are comparatively small (which is consistent with the
Train Benchmark specification), the advantage of the optimized
variant is within the measurement error range.</p>
        <p>These observations are explained by telemetry data recorded
by a monitoring system during the measurement. The network
traffic measurements are summarized in Figure 9, which
compares the network traffic volume recorded for the
“optimized” and “unoptimized” variants. It can be seen that while
the overall volume is practically equivalent, its distribution
between local and remote links is characteristically different
(i.e. in the “optimized” case, the overall remote volume is
approximately 15% lower than in the “unoptimized” case).</p>
        <p>Fig. 9: Network traffic statistics.</p>
      </sec>
      <sec id="sec-3-11">
        <title>C. Threats to Validity</title>
        <p>For these experiments, we considered the following internal
and external threats to validity. As transient and unknown
background load in the cloud environment can introduce noise,
as a countermeasure we performed several execution runs and considered their
minimum values for the result plots. We
strived to avoid systematic errors in the experiments (e.g.
incorrect queries or workloads) by cross-checking all results
with the specification of the Train Benchmark. The validity of
the analysis and especially the generalizability of the results
to real-world workloads has been thoroughly investigated in
several academic papers on the Train Benchmark (e.g. [1],
[6]). We believe that our measurements are faithful extensions
of the Train Benchmark and thus the results of these previous
works apply to our contributions as well.</p>
        <p>V. RELATED WORK</p>
        <p>Based on an idea originating in mathematical economics, [7]
proposes a characteristically different performance model
compared to our domain. In these cases, the cost of
communication comes from the cost of reaching data on other nodes,
which is not applicable to data-driven distributed systems
such as INCQUERY-D. Furthermore, global allocation resource
constraints are not considered. There are other models [8],
where the cost can include the execution costs of tasks on
particular processors besides the communication cost, which
is a planned future direction of our work.</p>
        <p>Apers et al. developed a scheduling-based optimization
approach [9] for distributed queries over relational databases. In
their model, relational operations are scheduled on computers.
The model uses the estimated time of different operations and
data transmission, as it is required for scheduling. However,
as relational database queries are not incremental, memory
constraints for computers are not considered.</p>
        <p>Many different mathematical models exist for the problem,
according to the varying requirements of systems, but to
the best of our knowledge, the currently proposed model of allocation
has never been formalized and investigated for distributed systems
before, especially in the context of incremental model queries.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>VI. CONCLUSION</title>
      <p>We presented an approach for allocation optimization in the
context of distributed systems, focusing on two different
optimization targets: reducing resource usage through minimizing
remote network traffic and cost reduction through minimizing
the amount of computation resources required to evaluate
an incremental query. Our approach is based on a mapping
of the allocation optimization problems to the combinatorial
optimization domain, solved with the state-of-the-art CPLEX
Optimizer. We applied and evaluated this concept to the
optimization of distributed incremental model queries in the
context of INCQUERY-D.</p>
      <p>Our primary aim for the future is a technological
adaptation of our allocation techniques to the YARN resource
management framework [10]. This is directly motivated by
the fact that INCQUERY-D itself is moving towards a
Hadoop-based implementation. In this context, experience has shown
that while YARN has several built-in schedulers for resource
allocation, those are not well-suited to long-running,
memory-intensive jobs that are inter-related by complex structural
constraints (such as the Rete network). Our long-term goal
is to generalize the CPLEX-based optimization approach to a
wider range of Hadoop/YARN applications.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1] Szárnyas, G. et al.:
          <article-title>IncQuery-D: A Distributed Incremental Model Query Framework in the Cloud</article-title>
          .
          <source>In: ACM/IEEE 17th International Conference on Model Driven Engineering Languages and Systems</source>
          , Valencia, Spain, Springer (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Forgy</surname>
            ,
            <given-names>C.L.</given-names>
          </string-name>
          :
          <article-title>Rete: A fast algorithm for the many pattern/many object pattern match problem</article-title>
          .
          <source>Artificial intelligence 19(1)</source>
          (
          <year>1982</year>
          )
          <fpage>17</fpage>
          -
          <lpage>37</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Makai</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Optimization of Incremental Queries in the Cloud</article-title>
          .
          <source>Report</source>
          . https://tdk.bme.hu/VIK/DownloadPaper/Inkrementalis-lekerdezesekoptimalizacioja-a (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4] Van Hentenryck, P., Simonis, H., Dincbas, M.:
          <article-title>Constraint satisfaction using constraint logic programming</article-title>
          .
          <source>Artificial intelligence 58(1)</source>
          (
          <year>1992</year>
          )
          <fpage>113</fpage>
          -
          <lpage>159</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5] Szárnyas, G., Semeráth, O., Ráth, I., Varró, D.
          :
          <article-title>The TTC 2015 Train Benchmark Case for Incremental Model Validation</article-title>
          . Transformation Tool Contest,
          <year>2015</year>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6] Izsó, B., Szatmári, Z., Bergmann, G., Horváth, Á., Ráth, I.:
          <article-title>Towards Precise Metrics for Predicting Graph Query Performance</article-title>
          .
          <source>In 2013 IEEE/ACM 28th International Conference on Automated Software Engineering (ASE)</source>
          , pages
          <fpage>412</fpage>
          -
          <lpage>431</lpage>
          , Silicon Valley, CA, USA,
          <year>2013</year>
          . IEEE.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Ferguson</surname>
            ,
            <given-names>D.F.</given-names>
          </string-name>
          et al.:
          <article-title>Economic Models for Allocating Resources in Computer Systems</article-title>
          . (
          <year>1996</year>
          )
          <fpage>156</fpage>
          -
          <lpage>183</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8] Fernández-Baca, D.
          :
          <article-title>Allocating modules to processors in a distributed system</article-title>
          .
          <source>Software Engineering</source>
          , IEEE Transactions on
          <volume>15</volume>
          (
          <issue>11</issue>
          ) (
          <year>1989</year>
          )
          <fpage>1427</fpage>
          -
          <lpage>1436</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Apers</surname>
            ,
            <given-names>P.M.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hevner</surname>
            ,
            <given-names>A.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yao</surname>
            ,
            <given-names>S.B.</given-names>
          </string-name>
          :
          <article-title>Optimization algorithms for distributed queries</article-title>
          .
          <source>Software Engineering, IEEE Transactions on (1)</source>
          (
          <year>1983</year>
          )
          <fpage>57</fpage>
          -
          <lpage>68</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Vavilapalli</surname>
            ,
            <given-names>Vinod</given-names>
          </string-name>
          <string-name>
            <surname>Kumar</surname>
          </string-name>
          , et al.:
          <article-title>Apache Hadoop YARN: Yet Another Resource Negotiator</article-title>
          .
          <source>Proceedings of the 4th Annual Symposium on Cloud Computing. ACM</source>
          ,
          <year>2013</year>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>