Procedural Code Representation in a Flow Graph

Michal Brabec and David Bednárek

Parallel Architectures/Algorithms/Applications Research Group
Department of Software Engineering
Faculty of Mathematics and Physics, Charles University in Prague
Malostranské nám. 25, Prague, Czech Republic
{brabec,bednarek}@ksi.mff.cuni.cz

Abstract. Modern scientific computing often combines extensive calculation with complex structure of data; however, the programming methodologies and languages of high-performance computing significantly differ from those of databases. This impedance mismatch leads many projects to the use of either primitive (like JSON) or overly general (like distributed file systems) methods of data access, ignoring the decades of development in database technology. In this paper, we investigate the possibility of representing procedural code fragments using a network of operators similar to the query plans used in relational database systems. Such a unified representation forms a necessary step towards an integrated computational-database platform. We propose a flow graph representation that allows us to analyze, transform, and optimize applications more efficiently and without additional data. Along with the graph, we designed an algorithm that transforms procedural code into the graph.

Keywords: compiler, graph, optimization, parallelism

1 Introduction

Modern data processing often combines complex data layouts with intensive calculations. Despite the ongoing effort in the area of NoSQL databases, the traditional relational paradigm, especially in its column-based version [2], still offers unmatched maturity and efficiency up to multi-terabyte ranges. However, database systems were not designed with general computing in mind.

Systems based on the MapReduce paradigm allow the programmer to integrate general procedural code with a distributed data storage more easily. Despite their popularity, MapReduce implementations may still be outperformed by parallel databases even in brute-force tasks where the sophisticated database approach has seemingly no advantage [20]. However, the same experiments also show that the performance dominance of parallel databases is limited to workloads implemented by built-in functions; as soon as user-defined functions are required, the performance falls rapidly.

This observation shows that the runtime stages of modern parallel database systems are extremely efficient even under brute-force computing load. However, this efficiency is degraded by the inability of relational optimizers to handle procedural code fragments contained in user-defined functions efficiently [15]. Nevertheless, with a different front-end, a parallel database system may become a suitable runtime for parallel computing.

Such a front-end would have to compile procedural code into the physical execution plans used in database systems.
Modern databases, as well as streaming systems, use graph-based execution plans whose nodes are not limited to relational algebra operators, as shown by the successful adaptation of many relational engines to XML or RDF [17].

In this paper, we present flow graphs – an intermediate code capable of representing procedural code with its complex control flow. Unlike typical intermediate representations used in compilers, flow graphs are designed to be similar to the pipeline-based models used in many database, streaming, and general parallel platforms.

Besides the introduction of flow graphs, we also describe the key algorithms which take part in the transformation of procedural code into flow graphs. The algorithms described here are applied after language-specific phases like parsing and semantic analysis, and they also make use of analytical algorithms which are already frequently used in compilers.

The rest of the paper is organized as follows: After reviewing the related work, flow graphs are defined in Section 3. Section 4 presents the transformation algorithms; in Section 5, we revise possible optimizations of the flow graph.

2 Related Work

The flow graph described in this paper is similar but not identical to other modeling languages, like Petri nets [21] or Kahn process networks [14]. The main difference is that the flow graph was designed for automatic generation from source code, whereas the other languages are generally used to model an application prior to implementation [16] or to verify a finished system [9].

The flow graph is similar to the graph transformation system (GTS) [8], which can be used to design and analyze applications, but is not convenient for execution. There are frameworks that generate a GTS from procedural code like Java [7], though the produced graphs are difficult to optimize. The flow graph also shares traits with frameworks that generate applications from graphs, like UML diagrams [12][4], but we concentrate on both graph extraction and execution.

The flow graph is closely related to graphs used in compilers, mainly the dependence and control flow graphs [3]; the flow graph merges the information from both. The construction of the flow graph and its subsequent optimization rely on compiler techniques, mainly points-to analysis [23], dependence testing [18], and control-flow analysis [22]. In compilers, graphs resulting from these techniques are typically used as additional annotation over intermediate code.

The flow graph is not only a compiler data representation; it is a processing model as well, similar to KPN graphs [13]. It can be used as source code for specialized processing environments, where frameworks for pipeline parallelism are the best target, since these frameworks use similar models for applications [19]. One such system is the Bobox framework [10], where the flow graph can be used to generate the execution plan similarly to the way Bobox is used to execute SPARQL queries [11].

3 Flow Graph

The flow graph was designed as a compact format for the representation of procedural code. Once constructed, it contains the code along with information about its structure, including control flow and data flow, and it can be transformed back into procedural code. In this section, we define the flow graph and explain its relation to other processing models.
A flow graph is a directed graph where nodes represent operations and edges represent data exchange among the operations. The direction of an edge indicates the direction of data flow (from source to sink operation). Figure 1 shows an unoptimized flow graph for a function without branches (see Listing 1 for the source code). The gray nodes denote dead code and they will be removed during later optimization steps. The flow graph can become complicated once control flow is introduced – see Figure 5 for an example of a more complex graph that implements a program with a loop.

    void Statements() {
        int a = 3;
        int b = 5;
        int c = 0;
        c = a + b;
        print(c);
    }

Listing 1. Simple function without branches

Fig. 1. Simple function transformed to an unoptimized flow graph (nodes: ldc 3, ldc 5, stl a, stl b, ldl a, ldl b, add, stl c, ldl c, call)

For a particular domain of application, a specific flow graph language is defined that contains a set of basic operations and a set of data types. We construct flow graphs based on such a platform-specific language. Both nodes and edges carry information about the represented operation or data type, respectively. Each edge is connected to a specific input or output of a node according to the represented operation (the data type must be compatible).

As our research is focused on C#, we use CIL [1] instructions to specify node operations. In this paper, four instruction types appear most often: ldl x (load variable or constant), stl x (write to a variable), ble (conditional branch), and add etc. (mathematical operations). We omit data types for the edges, because they are not important for the graph construction.

3.1 Execution Model

The flow graph execution model defines the way nodes process data and communicate. Nodes represent operations and edges represent unbounded FIFO queues. Operations have three states – waiting for input, processing, and inactive. Each operation starts waiting for input, fires once input is available, processes the input, and produces an output. Once the input is processed, the operation again waits for data. Nodes without input (such as those loading constants) fire at the beginning of the execution, produce data, and then become inactive. The queues transport single values of the data type assigned to the edge. Nodes must always consume their input; they cannot simply inspect the data and leave them in the queue. For example, a simple node which adds two numbers fires when there are data in both its incoming queues; it consumes both numbers and then stores the result in its outgoing queue.
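To make these firing rules concrete, the following is a minimal C# sketch of how a basic add node could behave under this model; the AddNode class and its queue-based interface are hypothetical illustrations, not part of the flow graph definition.

    using System.Collections.Concurrent;

    // Hypothetical sketch of the execution model: an "add" node fires
    // when both incoming queues hold data, consumes one value from each,
    // and emits the sum to its outgoing queue.
    class AddNode
    {
        public BlockingCollection<int> Left = new BlockingCollection<int>();
        public BlockingCollection<int> Right = new BlockingCollection<int>();
        public BlockingCollection<int> Output = new BlockingCollection<int>();

        public void Run()
        {
            // Waiting for input: the enumeration blocks until a value arrives.
            // The node always consumes its inputs; it never peeks and leaves them.
            foreach (int left in Left.GetConsumingEnumerable())
            {
                int right = Right.Take();   // complete the firing: take the second input
                Output.Add(left + right);   // processing: produce one output value
            }
            Output.CompleteAdding();        // inactive: no more data will come
        }
    }

A constant-loading node would instead write its single value to the output queue once and complete immediately, matching the inactive state described above.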
3.2 Special Nodes

Loops and branches of the source language are transformed into a graph of platform-independent special nodes which interact with the data flow carried through the basic nodes that perform the basic operations.

An extended primary node is a special version of any basic operation without parameters, like loading a constant value. The extended version has a single input and it restarts whenever the input contains data.

A broadcast node has a single input and a variable number of outputs. It represents a simple operation that creates a copy of the input for each output. This node is created whenever an operation must pass its result to multiple operations; the number of receivers defines the number of outputs.

A loop feedback node is a special type of broadcast node that distributes the positive result of a conditional branch to all extended primary nodes.

A merge node has a single output and three inputs. It represents an operation that accepts data from two sources and passes them to a single operation; it is used to merge data flow after a conditional branch. The node has an extra input used to get feedback from another node, generally a conditional branch. The node fires when all three inputs contain data.

A loop merge node is a special version of a merge node with two inputs for conditional branches; it is used to merge data in loops. The input is split into two pairs, each containing one data input and one branch input. The node fires whenever both inputs in a pair contain data. The node is either positive or negative, meaning its conditional inputs are required to be either positive or negative, for passing data into a loop or outside a loop, respectively.

Fig. 2. Node types: basic, extended primary, broadcast, merge, loop merge, feedback

4 Flow Graph Construction

In this section, we present a two-phase algorithm that transforms procedural code into a flow graph. The input to the algorithm is an intermediate code; in our case, the CIL. For simplicity, we assume that the CIL code was compiled from a C# source without unsafe code and goto. We also assume that the code is first subjected to a points-to analysis [23] which resolves potential aliasing problems.

The basic idea of the algorithm is that each CIL instruction is transformed into a node. Edges are generated according to the data used by each instruction; they correspond to the inputs and outputs of the instructions. When the basic graph is ready, we add special nodes for broadcasting and merging data according to branches and loops in the code. For simplicity, we ignore function calls in this description – see Section 4.1 for more information on function handling.

Basic Graph Construction

The first step of the algorithm is to create basic nodes according to the instructions of the source code. We use basic operations in the first part; the special operations are introduced in the second part. Algorithm 1 contains all the necessary steps and it produces the basic nodes N and edges E of the graph.

We can create a node for every CIL instruction, because we defined the operations based on the instruction set (lines 1 to 3 in Algorithm 1). Next, we analyze how instructions exchange data, either using registers or the stack, and we create edges that connect the source and sink instructions. CIL instructions communicate using a virtual stack, so we use a stack simulator to analyze the way the instructions exchange data. Then we create edges between nodes representing instructions that exchange data (lines 4 to 6 in Algorithm 1), along with the appropriate data type. The analysis is similar when the instructions communicate through registers; instead of a stack simulator, we have to connect instructions that use the same register.
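The stack simulation can be sketched as follows: pushes record which node produced a value, pops connect the consuming node to that producer. This is a minimal illustration assuming a single basic block and statically known stack effects; the Instr record is a hypothetical simplification.

    using System.Collections.Generic;

    // Hypothetical instruction record: how many stack slots it pops and pushes.
    record Instr(string Opcode, int Pops, int Pushes);

    static class StackSimulator
    {
        // Simulate the CIL evaluation stack over one basic block and emit an
        // edge (producer -> consumer) for every value popped by an instruction.
        public static List<(int Producer, int Consumer)> Edges(IReadOnlyList<Instr> block)
        {
            var edges = new List<(int Producer, int Consumer)>();
            var stack = new Stack<int>();          // indices of producing instructions

            for (int i = 0; i < block.Count; i++)
            {
                for (int p = 0; p < block[i].Pops; p++)
                    edges.Add((stack.Pop(), i));   // value consumed: connect its producer to i
                for (int p = 0; p < block[i].Pushes; p++)
                    stack.Push(i);                 // value produced: remember the producer
            }
            return edges;
        }
    }

Run on the body of Statements (Listing 2), such a pass would connect the two ldloc nodes to add, the add node to stloc.2, and so on, yielding the stack-based edges of lines 4 to 6 in Algorithm 1.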
Then, we have to take into account the data passing through variables: we have to create edges between nodes representing variable accesses, from write to read. This step is more complex, because we have to connect every variable read with the nearest write along all possible execution paths, which means analyzing the control flow.

We generate basic blocks for the input code. In every basic block, we locate all the instructions that access any variable (lines 7 to 11 in Algorithm 1). First, we create edges inside every block, where we connect a variable write to a read if both access the same variable, but only if the read is after the write and there is no other write between them (lines 12 to 16 in Algorithm 1). Results of this step are illustrated in Figure 1, which shows the intermediate flow graph based on the CIL code in Listing 2, generated from the source code in Listing 1.

    .method void Statements() cil
        ldc.i4.3
        stloc.0
        ldc.i4.5
        stloc.1
        ldloc.0
        ldloc.1
        add
        stloc.2
        ldloc.2
        call void print(int32)
        ret

Listing 2. CIL code of Statements

    void BranchElse() {
        int a = 3;
        int c = 0;
        if (a > 0)
            c = 4;
        else
            c = -4;
        print(c);
    }

Listing 3. Simple branch function

Next, we have to connect variables between basic blocks. We locate all reads before the first write in every block, for every variable; we call them accessible reads. We connect the last write in a basic block to all the accessible reads in the blocks that follow it, until one of them contains an instruction that writes to the same variable (lines 17 to 26 in Algorithm 1). Basically, we perform an exhaustive search through the block graph, where we stop at blocks that change the studied variable. This way the data is propagated through the control flow.

Algorithm 1. Basic graph construction

Require: I – set of instructions i; B – basic blocks; M – set containing all memory locations (variables); C_i – all instructions consuming the output of instruction i; R_i – variables read by instruction i; W_i – variables written by instruction i
Ensure: N – nodes of the flow graph; E – edges of the flow graph
1:  for all i ∈ I do
2:    N = N ∪ {N_i : operation(N_i) = O_i} – nodes based on instructions
3:  end for
4:  for all N_i ∈ N do
5:    E = E ∪ {E_{N_i N_j} : j ∈ C_i} – edges based on instruction communication (stack)
6:  end for
7:  for all b ∈ B do
8:    Load_{bm} = {i ∈ b : m ∈ R_i ∧ ∀w ∈ b, w < i ⇒ m ∉ W_w} – read before update
9:    Load_{bmj} = {i ∈ b : m ∈ R_i ∧ j < i ∧ ∀w ∈ (j, i) ⇒ m ∉ W_w} – read after write j
10:   Store_{bm} = {i ∈ b : m ∈ W_i ∧ ∀w ∈ b, w > i ⇒ m ∉ W_w} – last update in block
11: end for
12: for all b ∈ B do
13:   for all m ∈ M do
14:     E = E ∪ {E_{N_j N_i} : i ∈ Load_{bmj}} – edges based on variable access
15:   end for
16: end for
17: for all m ∈ M do
18:   for all b ∈ B do
19:     for all n_0 ∈ B : next(b, n_0) do
20:       E = E ∪ {E_{N_j N_i} : j ∈ Store_{bm} ∧ i ∈ Load_{n_0 m}} – edges between blocks
21:       if Store_{n_0 m} = ∅ then
22:         recurse for {n_1 ∈ B : next(n_0, n_1)}
23:       end if
24:     end for
25:   end for
26: end for

Figure 3 shows a flow graph generated from the function in Listing 3. There is one conditional jump that produced the two edges leading to the node ldl c. The branch instruction must decide which input is used (Section 4). It is important that the initialization of c to 0 is identified as unused code.

Fig. 3. Basic graph of a conditional branch
Fig. 4. Completely transformed branch

Control Flow Management

In this section, we present the algorithms necessary to make the graph produced by Algorithm 1 deterministic and compliant with the flow graph definition. We must make sure that the nodes have a proper number of inputs and outputs and that the branching conditions deterministically decide which value is used at every moment, especially in loops. Algorithm 2 contains all the transformations for handling control flow.

We start this algorithm by locating loops in the basic block graph. We locate all the blocks L of the inner-most loop and the block L_b that contains the backward conditional jump b that restarts the loop (it is sufficient to locate the backward jump, because we consider a restricted C#). We duplicate the entire block L_b as L_bi; the copy drives the first iteration, which is different since it uses data initialized before the loop started (initialization of variables). See Listing 4, where i is first compared while it still has the value of 1. We locate all nodes n_i inside the loop with more incoming edges than inputs (line 1 in Algorithm 2). We create loop merge nodes m for all n_i (line 2 in Algorithm 2) and redirect the incoming edges to the merge nodes (line 3 in Algorithm 2). We add the edge E_{m n_i}. Finally, we connect the merge nodes to the conditional branches: we pair the input coming from outside the loop with the duplicate branch b_i in L_bi and the other with the branch b in L_b (line 4 in Algorithm 2). This ensures that the first iteration takes the data from outside and all the following iterations take the internal data. See Figure 5 for a complete loop with merge nodes.

When the loops are fitted with merge nodes, we add a loop feedback node that restarts all the nodes without input. We replace all nodes without input in a loop by their extended version, see Section 3.2. We connect the feedback node to the branch in Brl and to all the extended nodes in the loop (lines 6 to 10 in Algorithm 2). Figure 5 shows a complete loop, where Brl and Brcl are outlined by a dashed rectangle and the entire loop by a solid rectangle.

    void SingleLoop() {
        for (int i = 1; i <= 5; i++) {
            print();
        }
    }

Listing 4. Simple loop function

Fig. 5. Flow graph of a loop (extended primary nodes marked with *)

When all loops are transformed, we must locate all remaining nodes n with more incoming edges than inputs (line 11 in Algorithm 2); the extra inputs are the result of conditional branches. Figure 3 shows the situation where two edges lead to a node with a single input (ldl c). We solve this situation by introducing a merge node m along with the edge E_{mn} (lines 12 to 13 in Algorithm 2). Then we redirect the two input edges to m (line 14 in Algorithm 2). Finally, we have to locate the conditional branch responsible for the merge and connect it to the merge node. We can do this by following the paths from the source nodes, where we locate the branch just before the paths join – the node blec (branch if a ≤ 0) in Figure 3. The result of this algorithm is in Figure 4, where the edge marked with a square is the positive input and the one marked with a triangle is negative.
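As a rough illustration of this rewriting step (a sketch, not the paper's implementation), the following inserts a merge node in front of a node that has more incoming edges than inputs, as in lines 11 to 15 of Algorithm 2; the Node and Edge types are hypothetical simplifications.

    using System.Collections.Generic;
    using System.Linq;

    // Hypothetical graph types: a node knows how many inputs its operation
    // takes, and edges are (source, target) pairs.
    class Node { public string Op = ""; public int Inputs; }
    record Edge(Node Source, Node Target);

    static class ControlFlowRewriter
    {
        // Algorithm 2, lines 11-15 (sketch): for a node n with too many
        // incoming edges, create a merge node m, redirect the incoming edges
        // to m, and connect both m -> n and the responsible branch b -> m.
        public static void InsertMerge(List<Edge> edges, Node n, Node branch)
        {
            var incoming = edges.Where(e => e.Target == n).ToList();
            if (incoming.Count <= n.Inputs) return;        // nothing to repair

            var m = new Node { Op = "merge", Inputs = incoming.Count + 1 };
            edges.RemoveAll(e => e.Target == n);           // detach the extra edges
            edges.AddRange(incoming.Select(e => new Edge(e.Source, m)));
            edges.Add(new Edge(m, n));                     // the edge E_mn
            edges.Add(new Edge(branch, m));                // feedback from the branch
        }
    }

Locating the branch node to pass in as an argument is the path-following step described above and is omitted here.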
The final step is to locate all nodes n with more outgoing edges than outputs. This is solved simply by using a broadcast node: we create a new broadcast node b, add an edge E_{nb}, and change the outgoing edges to start in b (lines 16 to 20 in Algorithm 2).

Algorithm 2. Control flow management

Require: B – basic blocks
Ensure: N – nodes of the flow graph; E – edges of the flow graph
1:  for all {n_i ∈ N : ∃L ⊂ B ∧ n_i ∈ L ∧ |E_{x n_i}| > inputs(n_i)} do
2:    N = N ∪ {m} – add loop merge node
3:    ∀E_{x n_i} : x ∈ N ⇒ E = (E \ {E_{x n_i}}) ∪ {E_{xm}}
4:    E = E ∪ {E_{m n_i}} ∪ {E_{bm}} ∪ {E_{b_i m}} – b and b_i are the conditional jumps of the loop
5:  end for
6:  for all {L ⊂ B : loop(L)} do
7:    N = N ∪ {f} – add loop feedback node
8:    E = E ∪ {E_{bf}} – where b is the conditional jump of the loop
9:    ∀e ∈ L : extended(e) ⇒ E = E ∪ {E_{fe}}
10: end for
11: for all {n ∈ N : |E_{xn}| > inputs(n)} do
12:   N = N ∪ {m} – add merge node
13:   E = E ∪ {E_{mn}} ∪ {E_{bm}} – where b is the conditional jump
14:   ∀E_{xn} : x ∈ N ⇒ E = (E \ {E_{xn}}) ∪ {E_{xm}}
15: end for
16: for all {n ∈ N : |E_{nx}| > outputs(n)} do
17:   N = N ∪ {b} – add broadcast node
18:   E = E ∪ {E_{nb}}
19:   ∀E_{nx} : x ∈ N ⇒ E = (E \ {E_{nx}}) ∪ {E_{bx}}
20: end for

4.1 Functions and Methods

Methods are analyzed starting with the main function, and the graph is constructed as additional methods are called. Whenever a method call is located, we create a flow graph for the called method, treating its parameters as local variables. We connect the parameters to their actual values (source variables or constants). This approach is equivalent to complete procedure integration and it is applicable only to programs whose call graph is acyclic and contains a reasonable number of paths. Using additional special nodes, any program might be transformed, however at the cost of additional operations which correspond to the call-return pairs in conventional program execution. In the supposed application domain, the general approach is probably unnecessary.

4.2 Objects and Arrays

A variable of a complex data type (object or array) can contain a number of memory locations (members or elements) that can be accessed using special instructions. We treat member data of objects as separate variables, where the same members of two independent objects are different variables. Arrays can be viewed as objects with a single member – data – where the data contains multiple independent values. Arrays are treated this way by many compiler algorithms [23].

5 Graph Optimizations

A flow graph produced by the algorithm presented in Section 4 is generally very large, since we transform every instruction into a single node. This is not very convenient for execution, but it can be efficiently analyzed and optimized. In this section, we shortly introduce optimizations that can produce more compact graphs.

Each node in a flow graph, as defined in Section 3, represents a basic or special operation. To allow optimizations of the graph, we introduce merge rules. The merge rules define the way operations are combined to produce complex operations: they define the behavior (source code) of the complex operation and its inputs and outputs along with their data types. The merge rules are added to the definition of the specific flow graph language.
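As a loose illustration of such a chaining rule (an assumed representation, not the paper's), two adjacent add nodes can be collapsed into one three-input complex operation:

    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Hypothetical node type: an operation plus the behavior executed on firing.
    class OpNode
    {
        public string Op = "";
        public List<OpNode> Inputs = new();
        public Func<int[], int> Fire = _ => 0;
    }

    static class MergeRules
    {
        // Chaining rule (sketch): if an "add" node feeds another "add" node,
        // replace the pair by one complex node summing all three inputs,
        // i.e. the operation equal to In1 + In2 + In3.
        public static OpNode TryChainAdds(OpNode consumer)
        {
            var producer = consumer.Inputs.FirstOrDefault(n => n.Op == "add");
            if (consumer.Op != "add" || producer == null) return consumer;

            return new OpNode
            {
                Op = "add3",
                Inputs = producer.Inputs
                    .Concat(consumer.Inputs.Where(n => n != producer))
                    .ToList(),
                Fire = ins => ins[0] + ins[1] + ins[2],
            };
        }
    }

Applying such rules repeatedly coarsens the graph until its nodes perform enough work to be scheduled efficiently.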
A complex operation is created by merging other operations, either basic or complex, according to the merge rules defined along with the graph. The merge rules for CIL instructions contain, for instance, chaining of the instructions. For example, when merging the addition instructions in A + B + C, we create a complex operation that is equal to the sum of its three inputs, ∑_{i=1}^{3} In_i.

Fig. 6. Merged simple loop
Fig. 7. Factorial computation graph

Another transformation is aimed at simple loops controlled by a single variable. Figure 5 shows a very simple loop that is controlled by the variable i, updated in every iteration. The loop creates many unnecessary dependences. When we locate such a simple situation, we can merge the entire loop into a single node that just generates the appropriate values, in this case {1, 2, 3, 4, 5}. We can utilize algorithms used in compilers [18] to locate the loops, because the graph contains all the information found in the original source code. Figure 6 shows how the loop from Figure 5 is optimized. This optimization is essential for improving the efficiency of the flow graph; compare the optimized graph to an unoptimized graph implementing the computation of a factorial (Figure 7).

6 Conclusions

We designed the flow graph to represent procedural code along with important information about its structure and behavior. We designed an algorithm that allows us to create a flow graph for an application implemented in a subset of C# and compiled to CIL. This transformation becomes a part of a toolchain that allows the transformation of C# programs into a stream-based parallel computing platform [5]. The algorithm can be modified for other languages, like Java bytecode [6].

The flow graph is a powerful tool for application analysis and optimization. Besides generating pipeline-based execution plans, the flow graph can be used for automatic parallelization. For such use, the original flow graph may be too fine-grained – in this case, it has to be transformed using a set of merge rules to make the final parallel application efficient.

Acknowledgements

This paper was supported by the Czech Science Foundation (GAČR), project P103-13-08195, and by the Grant Agency of Charles University (GAUK), project 122214.

References

1. TG3: Common Language Infrastructure (CLI). Standard ECMA-335 (June 2005)
2. Abadi, D., Boncz, P.A., Harizopoulos, S., Idreos, S., Madden, S.: The design and implementation of modern column-oriented database systems. Foundations and Trends in Databases 5(3), 197–280 (2013)
3. Allen, R., Kennedy, K.: Optimizing Compilers for Modern Architectures. Morgan Kaufmann, San Francisco (2002)
4. Balasubramanian, D., Narayanan, A., van Buskirk, C., Karsai, G.: The graph rewriting and transformation language: GReAT. Electronic Communications of the EASST 1 (2007)
5. Brabec, M., Bednárek, D., Malý, P.: Transformation of pipeline stage algorithms to event-driven code. In: Kurkova, V., Bajer, L., Svátek, V. (eds.) Proceedings of the 14th Conference on Information Technologies – Applications and Theory, Jasna, Slovakia, 2014. CEUR Workshop Proceedings, vol. 1214, pp. 13–20. CEUR-WS.org (2014), http://ceur-ws.org/Vol-1214
6. Chambers, C., Raniwala, A., Perry, F., Adams, S., Henry, R.R., Bradshaw, R., Weizenbaum, N.: FlumeJava: easy, efficient data-parallel pipelines. In: ACM SIGPLAN Notices, vol. 45, pp. 363–375. ACM (2010)
7. Corradini, A., Dotti, F.L., Foss, L., Ribeiro, L.: Translating Java code to graph transformation systems. In: Graph Transformations, pp. 383–398. Springer (2004)
8. Ehrig, H., Ehrig, K., Prange, U., Taentzer, G.: Graph transformation systems. In: Fundamentals of Algebraic Graph Transformation, pp. 37–71. Springer (2006)
9. Ezpeleta, J., Colom, J.M., Martinez, J.: A Petri net based deadlock prevention policy for flexible manufacturing systems. IEEE Transactions on Robotics and Automation 11(2), 173–184 (1995)
10. Falt, Z., Kruliš, M., Bednárek, D., Yaghob, J., Zavoral, F.: Locality aware task scheduling in parallel data stream processing. In: Camacho, D., Braubach, L., Venticinque, S., Badica, C. (eds.) Intelligent Distributed Computing VIII, Studies in Computational Intelligence, vol. 570, pp. 331–342. Springer International Publishing (2015)
11. Falt, Z., Čermák, M., Dokulil, J., Zavoral, F.: Parallel SPARQL query processing using Bobox. International Journal On Advances in Intelligent Systems 5(3–4), 302–314 (2012)
12. Geiger, L., Zündorf, A.: Graph based debugging with Fujaba. Electronic Notes in Theoretical Computer Science 72(2), 112 (2002)
13. Geilen, M., Basten, T.: Requirements on the execution of Kahn process networks. In: Programming Languages and Systems, pp. 319–334. Springer (2003)
14. Gilles, K.: The semantics of a simple language for parallel programming. In: Information Processing: Proceedings of the IFIP Congress, vol. 74, pp. 471–475 (1974)
15. Guravannavar, R., Sudarshan, S.: Rewriting procedures for batched bindings. Proceedings of the VLDB Endowment 1(1), 1107–1123 (2008)
16. Josephs, M.B.: Models for data-flow sequential processes. In: Communicating Sequential Processes: The First 25 Years, pp. 85–97. Springer (2005)
17. Mayer, S., Grust, T., Van Keulen, M., Teubner, J.: An injection with tree awareness: adding staircase join to PostgreSQL. In: Proceedings of the Thirtieth International Conference on Very Large Data Bases, vol. 30, pp. 1305–1308. VLDB Endowment (2004)
18. Muchnick, S.S.: Advanced Compiler Design and Implementation. Morgan Kaufmann Publishers (1997)
19. Navarro, A., Asenjo, R., Tabik, S., Cascaval, C.: Analytical modeling of pipeline parallelism. In: 18th International Conference on Parallel Architectures and Compilation Techniques (PACT 2009), pp. 281–290. IEEE (2009)
20. Pavlo, A., Paulson, E., Rasin, A., Abadi, D.J., DeWitt, D.J., Madden, S., Stonebraker, M.: A comparison of approaches to large-scale data analysis. In: Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data, pp. 165–178. ACM (2009)
21. Peterson, J.L.: Petri nets. ACM Computing Surveys 9(3), 223–252 (1977), http://doi.acm.org/10.1145/356698.356702
22. Reps, T.: Program analysis via graph reachability. Information and Software Technology 40(11), 701–726 (1998)
23. Sridharan, M., Bodík, R.: Refinement-based context-sensitive points-to analysis for Java. ACM SIGPLAN Notices 41(6), 387–400 (2006)