<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Hybrid Flow Graphs: Towards the Transformation of Sequential Code into Parallel Pipeline Networks</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Michal Brabec</string-name>
          <email>brabec@ksi.mff.cuni.cz</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>David Bednárek</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Software Engineering Faculty of Mathematics and Physics, Charles University Prague</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2015</year>
      </pub-date>
      <fpage>9</fpage>
      <lpage>16</lpage>
      <abstract>
        <p>Transforming procedural code for execution by specialized parallel platforms requires a model of computation sufficiently close to both the sequential programming languages and the target parallel environment. In this paper, we present Hybrid Flow Graphs, encompassing both control flow and data flow in a unified, pipeline based model of computation. Besides the definition of the Hybrid Flow Graph, we introduce a formal framework based on graph rewriting, used for the specification of Hybrid Flow Graph semantics as well as for the proofs of correctness of the associated code transformations. As a formalism particularly close to pipeline-based runtime environments which include many modern database engines, the Hybrid Flow Graphs may become a powerful means for the automatic parallelization of sequential code under these environments.</p>
      </abstract>
      <kwd-group>
        <kwd>compiler</kwd>
        <kwd>graph</kwd>
        <kwd>optimization</kwd>
        <kwd>parallelism</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>Parallelization of procedural code has been studied for
several decades, resulting in a number of mature
implementations, usually tightly coupled with compilers
of FORTRAN or C. In most cases, the parallelization
techniques are targeted at multi-threaded environment,
sometimes augmented with task-based parallelism. This
shared-memory approach is perfectly suited for numeric
applications; however, there are niches of computing
which may benefit from a different run-time paradigm. In
particular, database systems make use of execution plans
which explicitly denote the data flows between operators;
similar graph-based representation is often used in
streaming systems. Besides other advantages, the explicitly
denoted data flow allows easier distribution of the
computation across different nodes, compared to the complexity of
potentially random access in the shared-memory model of
computation.</p>
      <p>Nevertheless, database and streaming systems lack the
generality of procedural programming languages. To
extend the applicability of such runtime systems towards
general programming, a system capable of compiling (a
subset of) a well-known procedural language is required.
This paper demonstrates an important step towards such
a system – an intermediate representation capable of
describing procedural code using the language of data-flow
oriented graphs.</p>
      <p>Our approach is focused at target environments in which
an application is presented in two layers – the declarative
upper layer represented as a graph of operators, the lower
layer consisting of a procedural implementation of
individual operators. In such a setting, a language front-end
transforms the application source code into an
intermediate representation; then, a strategic phase decides where
to put the boundary between the declarative upper layer
and the procedural lower layer. Finally, the upper layer
is converted into the final graph while procedural code (in
source, bytecode, or binary form) is generated from the
lower layer.</p>
      <p>Realization of this idea involves many particular
problems; in this paper, we deal with the intermediate
representation which serves as the backbone connecting the
frontend, strategy, and generator phases. Since the upper layer
will finally be represented by a graph, it seems reasonable
that also the intermediate representation be graph based.
However, such a graph shall also be able to completely
represent the lower, procedural part of the application.
Although the theory of compiling uses graphs in many
applications including the representation of the code, the
control-flow and data-flow aspects of the code are usually
approached differently. In our approach, the control flow
and the data flow is represented by the same mechanism
because the strategic phase must be able to make the cuts
inside both the control flow and the data flow.</p>
      <p>In this paper, we present Hybrid Flow Graphs (HFG) as
an intermediate representation of procedural code,
capable of representing control-flow and data-flow in the same
layer. We will demonstrate two forms of the HFG, the
Sequential HFG and the General HFG. The former form
has its execution constrained to sequential, equivalent to
the execution of the procedural code which it was
generated from. The latter form exhibits independent
execution of individual operations, showing available
parallelism. Of course, making operations independent requires
careful dependency analysis – we assume that the required
points-to/alias/dependency analyses were made before the
conversion to HFG, using some of the algorithms known
from the theory of compiler construction.</p>
      <p>We will describe the operational semantics of both the
sequential and the independent forms of HFG and we will
also demonstrate the conversion between them. We will
use graph rewriting as the vehicle for the description of
both the semantics and conversions. The use of graph
rewriting systems allows for some generality of the
approach, both in the description and in the implementation.
In particular, the set of operations provided by the
intermediate code may be defined almost arbitrarily as long as
their semantics is depicted by graph rewriting rules. On
the other hand, the set of control-flow operations is fixed
since it forms the non-trivial part of the transformation
from sequential code.</p>
      <p>
        In this paper, we will use the set of types and operators
borrowed from the C# language, more exactly, from its
standard bytecode representation called CIL [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. This
includes some artifacts specific to the CIL, namely the stack
and the associated operations. Since the Sequential HFG
mimics the behavior of the CIL abstract machine, the stack
operations are preserved in the HFG. However, they must
disappear in the transformation to the General HFG;
otherwise, the presence of stack operations would prohibit
almost any parallelism. Since the removal of stack
operations is a non-trivial sub-problem, we include the
corresponding (CIL-specific) algorithm in this paper, as well as
the description of the preceding conversion of CIL into the
sequential HFG.
      </p>
      <p>The rest of the paper is organized as follows: We review
the related work in Section 2. The hybrid flow graph and
its semantics is defined in Section 3. We define the
sequential hybrid flow graph in Section 4. In Section 5, we
present an algorithm producing a sequential HFG from a
source code. Section 6 presents the algorithms that
transform a general HFG to a sequential HFG.
2</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>
        The flow graph described in this paper is similar but not
identical to other modeling languages, like Petri nets [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]
or Kahn process networks [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. The main difference is that
the flow graph was designed for automatic generation from
the source code, where the other languages are generally
used to model the application prior to implementation [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]
or to verify a finished system [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. The flow graph is
similar to the graph rewriting system [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], which can be used
to design and analyze applications, but it is not convenient
for execution. There are frameworks that generate GRS
from procedural code like Java [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], though the produced
graphs are difficult to optimize. The flow graph has
similar traits to frameworks that allow applications to be
generated from graphs, like UML diagrams [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], but we
concentrate both on graph extraction and execution.
      </p>
      <p>
        The flow graph is closely related to graphs used
in compilers, mainly the dependence and control flow
graphs [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], where the flow graph merges the
information from both. The construction of the flow graph and
its subsequent optimization relies on compiler techniques,
mainly points-to analysis [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], dependence testing [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]
and control-flow analysis [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. In compilers, graphs
resulting from these techniques are typically used as additional
annotation over intermediate code.
      </p>
      <p>
        The flow graph is not only a compiler data
representation, it is a processing model as well, similar to KPN
graphs [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. It can be used as a source code for specialized
processing environments, where frameworks for pipeline
parallelism are the best target, since these frameworks use
similar models for applications [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. One such a system is
the Bobox framework [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], where the flow graph can be
used to generate the execution plan similarly to the way
Bobox is used to execute SPARQL queries [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ].
2.1
      </p>
      <sec id="sec-2-1">
        <title>Graph Rewriting</title>
        <p>A graph rewriting system (GRS) is a set of rules R
transforming one graph to another. The rewriting systems are
very similar to grammars where each rule has a left side
that has to be matched to the graph and a right side that
replaces it.</p>
        <p>The GRS are very simple and they are best explained
by an example. Figure 2 shows the CIL based rewriting
system, where each rule emulates a single CIL instruction.</p>
        <p>We use extended rewriting rules where ∗ represents
any node (wildcard) and we add Greek letters to identify
unique wildcard nodes where necessary. Variable names
(x or y) represent only nodes with data. Expressions are
used to simplify data manipulation. This is necessary so
we do not have to create a rule for each combination of
instruction or inputs.
3</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Hybrid Flow Graph</title>
      <p>A hybrid flow graph (HFG) is an annotated directed graph,
where nodes together with their labels represent operations
and edges represent queues for transferring data. The
direction of the edge indicates the direction of data flow.
Figure 1 shows a general hybrid flow graph for a simple
function (see Listing 1 for source code).
v o i d S i m p l e L o o p ( i n t a ) {
f o r ( i n t i = a ; i &lt; 5 ; )
{
i = c a l l ( i ) ;
}
}
Listing 1 Simple function containing a single loop</p>
      <p>A hybrid flow graph is a general concept that can be
used regardless of operations O or data types T . A
platform specific flow graph model is defined by
specifying O and T based on the target platform. The
platform specific definition isHF G_MODELplat f orm = (O ∪
{special operations}, T ), where O and T are platform
specific and special operations are platform independent
(see Section 3.1). The hybrid flow graph semantics
(explained in Section 3.2) are also platform specific.</p>
      <p>
        As our research is focused on C#, we use CIL [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]
instructions to specify node operations. The most common
instruction in our examples are: ldl x (load variable or
constant), stl x (write to a variable), cle (compare less or
equal), add etc. (mathematical operations) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. We omit
data types for the edges, because they are not important
for the graph construction.
3.1
      </p>
      <sec id="sec-3-1">
        <title>Representation of Control Flow</title>
        <p>Control flow operations are transformed into platform
independent special operations which interact with the
dataflow carried through the platform specific basic
operations. Graphical notation of special operations is in
Figure 3, where a broadcast is a black square and a merge is
a white square.</p>
        <p>A broadcast node has a single input and a variable
number of outputs. It represents an operation that creates
a copy of the input for each output.</p>
        <p>A merge node has a single output and three inputs. It
represents an operation that accepts data from two sources
and passes them to a single operation based on a condition,
it is used to merge data flow after a conditional branch.</p>
        <p>A loop merge node is a special version of the merge
node with two inputs for conditional branches, it is used
to merge data in loops. The input is split into two pairs,
where each contains one data input and one condition
input. The node first reads data from the first pair and then
it loops, while the second pair has a positive value in the
condition input.
3.2</p>
      </sec>
      <sec id="sec-3-2">
        <title>Semantics of Hybrid Flow Graphs</title>
        <p>The hybrid flow graph semantics define the way nodes
process data and communicate. Basically, nodes represent
operations or data and edges represent unbounded queues
(FIFO) for data transfer.</p>
        <p>Semantics are platform specific, similar to the hybrid
flow graph model. We define the hybrid flow graph
semantics using the graph rewriting systems according to the
platform specific operations logic.</p>
        <p>We define the HFG semantics for CIL using graph
rewriting. Figure 2 shows the example of definition for
the basic instructions. Instruction load constant (ldc) has
a special behavior, it produces a single constant and then
it repeats while it has an input, we specify this behavior
by changing its label to ldc − r. The definition of special
operations is illustrated in Figure 3, the figure contains the
definition of broadcast and merge. We use special
graphical notation for special operations, since they are platform
independent (broadcast is black square, merge is a white
square).</p>
        <p>We can use the graph rewriting rules defined in Figure 2
and 3 to perform the computation of a hybrid flow graph.
The computation of a flow graph with a for-loop is shown
in Figure 4. Each part of Figure 4 contains the HFG state
after application of a single rule and the computation starts
after application the ldc5 rule.
4</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Sequential HFG</title>
      <p>In this section, we define the semantics of a sequential
hybrid flow graph using the graph transformation systems.
A sequential hybrid flow graph is a special HFG that uses
a program counter a memory nodes instead of special
operations defined in Section 3.1. It has a strict control flow
that closely follows that of the source program. It is
useless for the parallel pipelines or other applications, but it
allows us to create a general HFG from the source code.</p>
      <p>
        We use special operations for variables and the stack
(CIL virtual machine is stack based [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]), these operations
(a) application of ldc5 rule
      </p>
      <p>(b) application of broadcast
(c) application of cle rule</p>
      <p>(d) application of merge rule
are used to model the data flow. Plus we add a special
node P (program counter) to model the control flow.
4.1</p>
      <sec id="sec-4-1">
        <title>Semantics of a CIL Sequential HFG</title>
        <p>The sequential hybrid flow graph semantics is defined
using the graph rewriting, same as any other hybrid flow
graph. The transformation rules for the CIL based
sequential hybrid flow graph are defined in Figure 5, we present
only the rules necessary for our examples.</p>
        <p>
          The GRS rules defined in Figure 5 are completely based
on the CIL instruction definition [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. The control flow is
controlled by the node P, which follows the edges between
instructions. The instruction edges are constructed
according to the following rules: 1) The standard instructions
follow one another 2) The branches have two outgoing edges,
one following the jump target and the other leading to the
next instruction.
        </p>
        <p>The data management is handled by the stack and
variable operations, where each instruction modifies the
proper number of stack levels and variables, based on the
instruction definition.</p>
        <p>The actual computation is the same as in the case of the
.NET virtual machine, the control flow follows the same
path, variables and stack contain the same value at after
every evaluated instruction. Graphically, the computation
is similar to Figure 4.
5</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Transformation of CIL to Sequential HFG</title>
      <p>In this section, we explain how to create a sequential
hybrid flow graph from a CIL code. We present an algorithm
that produces a sequential HFG that follows the CIL
computation step by step, including control and data flow. It
allows us to define the advanced transformation algorithm
in Section 6.</p>
      <p>The basic transformation of a CIL code to a sequential
HFG requires two steps: 1) create nodes based on the
instructions 2) create edges based on the control and data
flow.</p>
      <p>
        The entire transformation is in Algorithm 1, function v
maps nodes to operations. The node P is basically a
program counter, it points to the actual instruction and it is
used to handle the control flow. The input sets I, M, Pi are
obtained from the source code. Sri and Swi are based
directly on CIL specification [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The transformation is best
presented on an example, Listing 1 contains CIL source
code and Figure 6 contains the resulting graph.
6
      </p>
    </sec>
    <sec id="sec-6">
      <title>Transformation of Sequential HFG to</title>
    </sec>
    <sec id="sec-7">
      <title>General HFG</title>
      <p>In this section, we present an algorithm that transforms
a procedural code to a hybrid flow graph. We create the
CIL sequential HFG (Section 5) and convert it to a general
HFG that requires neither the program counter P, nor the
Algorithm 1 Transformation of CIL to Sequential HFG
Require: I – set of instruction addresses</p>
      <p>M – set containing all variables
Ci, Pi – code and parameter of the instruction i
Sri – number of stack values read by instruction i
Swi – number of stack values written by instruction i
Ensure: V – nodes of the flow graph
v(Vi) – label of the node Vi</p>
      <p>E – edges of the flow graph
1: V := V P,V STACK
2: V := V ∪ ViINSTR : i ∈ I
3: v(ViINSTR) := hCi Pii for each i ∈ I
4: V := V ∪ VmVAR : m ∈ M
5: v(VmVAR) := hvar mi for each m ∈ M
6: E := 0/
7: for all i ∈ I do
8: if i = branch then
9: E := E ∪ n(ViINSTR,VPIiNSTR)o
10: end if
11: E := E ∪ (ViINSTR,ViI+N1STR)
12: end for
13: for all i ∈ I do
14: if i = stl then
15: E := E ∪ n(ViINSTR,VPVi AR)o
16: end if
17: if i = ldl then
18: E := E ∪ n(VPVi AR,ViINSTR)o
19: end if
20: if Swi &gt; 0 then
21: E := E ∪ (ViINSTR,V STACK)
22: end if
23: if Sri &gt; 0 then
24: E := E ∪ (V STACK,ViINSTR)
25: end if
26: end for
memory nodes. The input for the algorithm is an
intermediate code; in our case CIL.</p>
      <p>The basic idea of the algorithm is that every CIL
instruction is transformed to a node. The data and control
flow is then modeled using the sequential HFG presented
in Section 5. We emulate computation on the graph and
note what data can reach which instruction, by what path.
Finally, we create edges according to the collected
information and we introduce special nodes where necessary.
To construct a complete flow graph, we create nodes based
on instructions and use a modified sequential HFG to
create edges and special nodes. We use basic blocks to model
all valid execution paths which we use to create all the
necessary edges. Algorithm 2 describes the entire process,
the other algorithms are discussed in the following
sections. The result of Algorithm 2, applied to the code in
Listing 1, is the HFG in Figure 1 and Figure 8 shows the
graph before NOP elimination.</p>
      <p>The first step of the algorithm is to create basic nodes
according to the instructions of the source code (lines 1
to 3 in Algorithm 2).</p>
      <p>To crate the edges, we generate necessary execution
paths using the CIL path generator (Section 6.4) and
analyze all the possible inputs by the CIL reachability
algorithm (Section 6.3). Basically, we emulate computations
for every valid execution path recording the possible
inputs, and the paths leading to them, for every instruction.</p>
      <p>First, we convert the basic blocks to a regular expression
(function to_regular_expression), then we generate all the
valid paths using the CIL path generator. Next, we have
to update the GCIL so it contains the correct number of
iterations (function update) for every loop (this due to the
IDs assigned to branches as shown in Figure 9). Finally,
we use the CIL reachability to create sets S[0:N] containing
all the possible inputs for each instruction (lines 4 to 10).
S[0:N] represents sets S0 to SN that contain inputs for
instructions indexed 0 through N.</p>
      <p>We can create edges, once the input sets S[0:N] are ready.
We start by discarding duplicate sources, because each
instruction can produce only one value, which can reach the
instruction i through multiple paths (different iterations
inbetween), we can keep any instance as they are equivalent
(lines 12 to 16). Then we check the number of inputs left,
if there is the same number of sources as the instruction i
has inputs then we simply create edges (lines 17 to 21).</p>
      <p>If there is more sources than i has inputs then we create
merge nodes to select the correct input value and we
connect them to a condition that decides what input is used.
We iterate over the sources and we compare the attached
paths, the paths always start the same (i.e. the first
instruction) and they continue until there is a conditional branch
that changes the execution order for one of them. We
create a merge node, connect both sources to it and we
connect the branch condition to the condition input. This way,
we reduce the number of incoming edges, until it matches
the number of inputs.</p>
      <p>The last step of Algorithm 2 is the elimination of empty
operations that no longer have any purpose. This means
mainly loading and storing local variables (ldl and stl).
Eliminate NOP removes any nodes that do not change the
data.
6.2</p>
      <sec id="sec-7-1">
        <title>Symbolic Semantics of Sequential HFG</title>
        <p>We use modified semantics for the CIL sequential HFG
defined (Section 5). The modified rules work with
instruction identifiers instead of actual values. This way, we can
analyze which instructions consume the value produced
other instructions. The rule modification is shown in
Figure 9.</p>
        <p>We assign each instruction an identifier equal to its
index (∀i ∈ I : id(i) = i). The identifiers of branches decide
how many times the branch is true before it is f alse, we</p>
        <sec id="sec-7-1-1">
          <title>Algorithm 2 Hybrid flow graph construction</title>
          <p>Require: I – set of instructions</p>
          <p>B – graph of basic blocks
GCIL – modified CIL graph</p>
          <p>RCIL – modified rules for CIL graph
Ensure: N – nodes of the flow graph</p>
          <p>
            E – edges of the flow graph
1: N := {Ni : i ∈ I ∧ v(Ni) = Oi}
2: regex := to_regular_expression(B)
3: S[0:N] := 0/
4: for all W ∈ CIL_path_generator(regex, GCIL) do
5: for all si ∈ CIL_reachability(I, update(GCIL), RCIL)
do
6: Si := Si ∪ si
7: end for
8: end for
9: for all si ∈ S[0:N] do
10: for all p ∈ si do
11: while ∃q ∈ si : q. f irst = p. f irst ∧ q 6= p do
12: si := si \ q
13: end while
14: end for
15: if |si| = input_count(i) then
16: for all p ∈ si do
17: E := E ∪ {(p, i)}
18: end for
19: else
20: while |si| &gt; input_count(i) do
21: a := si[0]
22: b := si[
            <xref ref-type="bibr" rid="ref1">1</xref>
            ]
23: V := V ∪ {merge}
24: E := E ∪ {a. f irst, merge}
25: E := E ∪ {b. f irst, merge}
26: E := E ∪ {merge, i}
27: while a.second[pos] = b.second[pos] do
28: pos := pos + 1
29: end while
30: E := E ∪ {pos, merge}
31: end while
32: end if
33: end for
34: eiminateN OP(V, E)
use this to drive the control flow (explained in the next
section). Last modification is that the rules output identifiers
of consumed values, this is indicated by the bold dashed
boxes on the right side in Figure 9.
6.3
          </p>
        </sec>
      </sec>
      <sec id="sec-7-2">
        <title>CIL Reachability</title>
        <p>We use the modified HFG semantics to locate all possible
inputs for each instruction i, where an input is an
instructions producing a value consumed by i. We do this by
emulating CIL computation and storing the inputs for each
instruction, implemented by Algorithm 3.</p>
        <sec id="sec-7-2-1">
          <title>Algorithm 3 CIL reachability</title>
          <p>Require: I – set of instructions</p>
          <p>GCIL – modified CIL graph</p>
          <p>RCIL – modified rules for CIL graph
Ensure: S[0:N] – sets of sources for each instruction
1: path := () – empty path
2: while ∃r ∈ RCIL : applicable(GCIL, r) do
3: i := program_counter(GCIL) – current instruction
4: consumed := apply(GCIL, r) – apply rule
5: for all id ∈ consumed do
6: Si := Si ∪ {(id, path)}
7: end for
8: path := path · i – update execution path</p>
        </sec>
      </sec>
      <sec id="sec-7-3">
        <title>9: end while</title>
        <p>Algorithm 3 uses several simple functions: applicable
tells whether the rule can be matched on the graph, apply
applies the rule and returns the input IDs (dashed boxes in
Figure 9), program_counter returns the instruction
connected to P.
6.4</p>
      </sec>
      <sec id="sec-7-4">
        <title>CIL Path Generator</title>
        <p>We have to examine all valid execution paths to make sure
that the HFG is complete. We do this by creating a regular
expression representing the control flow and then
generating all valid words (paths).</p>
        <p>
          We create basic block graph based on the source code,
this is a very simple algorithm explained in [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. The
basic block graph can be interpreted as a finite state machine ,
the source code is finite, producing only a finite number
of states. We disregard the actual values produced in the
blocks that define the actual control flow. Instead, we treat
the basic block graph as a finite state machine that accepts
a set of words where each word defines a valid order of
basic blocks. Therefore the execution paths can be
represented by a regular expression [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ]. Based on the
basic blocks, we create a regular expression modeling all the
valid execution paths.
        </p>
        <sec id="sec-7-4-1">
          <title>Algorithm 4 CIL path generator</title>
          <p>Require: R – regular expression
Ensure: W – set of words
1: for all l ∈ [1 : 3 ∗ length(R)] do
2: for all w ∈ {v : v = e∗ ∧ e ∈ Σ ∧ |v| = l} do
3: if valid(w, R) then
4: W := W ∪ {w}
5: end if
6: end for
7: end for</p>
          <p>Algorithm 4 generates all valid execution paths, where
the function valid simply checks if the word is valid
according to the regular expression and Σ is the alphabet.
We limit the word length, thus limiting the loop iteration
count, because multiple loop iterations do not add new
information (see Section 6.5 for more details).
In the following sections, we discuss the reasoning behind
the hybrid flow graph construction presented in Section 6,
mainly why it is equivalent to the input source code. Here,
we present only the main ideas of a proof, because a
complete proof would require too much space.</p>
          <p>We have to focus mainly on three parts of the
transformation algorithm: 1) the CIL sequential hybrid flow graph
is equivalent to the CIL source code 2) the analyzed
execution paths contain all the possible inputs 3) the merge
nodes created according to the execution paths are placed
correctly.
We prove the equivalence of the CIL code and the CIL
hybrid graph (Section 5) by making sure that they have
the same control and data flow. We can do this, because
the CIL graph contains a node for program counter and
nodes for variables and stack.</p>
          <p>We base the proof on the fact that the HFG operations
behave the same way as the instructions. The program
counter is passed from one instruction to another in the
same way as in CIL – control flow is the same. The
instructions store data using special operations (nodes) for
stack and variables in the same way they do in memory –
data flow is equivalent. We omit the full proof, because it
would be too long and technical.
We cannot use the same approach for a general hybrid flow
graph as for CIL sequential HFG, because a general HFG
does not model the platform specific memory features and
control logic. Instead, we focus on the transformation of
a CIL sequential HFG to a general HFG. We prove their
equivalence by tracing all valid execution paths.
Lemma 1 (Necessary execution paths). Let W we a set
of words produced by the CIL path generator algorithm
(Algorithm 4) and E := {e∗ : e ∈ Σ} set of all words, then
Sw∈W reachable(w) = Se∈E reachable(e), where the
function reachable represents the CIL reachability algorithm.
Proof. The lemma says that the paths generated by
Algorithm 4 produce all possible inputs for all instructions.
This is true because there is only a limited number of paths
that can produce new inputs.</p>
          <p>
            The basic block graph can be viewed as a finite state
machine. The regular language accepted by the state machine
can be infinite, but only due to the Pumping lemma [
            <xref ref-type="bibr" rid="ref19">19</xref>
            ],
which means infinite loop iterations.
          </p>
          <p>The important effect of loops is that the instructions use
values produced outside the loop in the first iteration and
then they consume values produced in the loop. The only
possible exception is a conditional branch in the loop, but
we check the paths with both branches taken in the first
iteration.</p>
          <p>Lemma 2 (Data flow path merging) . Let pairs
(S[1:N], P[1:N]) be sources for instruction i, then
input_count(i) = N =⇒ (S[1:N], i) ∈ E(HF G) or
input_count(i) &lt; N =⇒ ∃m ∈ V (HF G) ∧ (S[1:N], m) ∈
E(HF G)
Proof. The lemma says that either all the sources were
used to create appropriate edges or that a merge nodes
were introduced if there is too many sources.</p>
          <p>The first implication is true, because Algorithm 2
creates an edge for every source (lines 17 to 20).</p>
          <p>The second implication is correct, because the
algorithm creates merge nodes when there is too many sources
(line 25). The merge is created based on the execution path
stored for each input, it is connected to the last instruction
of the prefix shared by both paths (lines 29 to 32). This
means that the merge is driven by the conditional branch
that changed the execution path and decided what source
would be used.
7</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-8">
      <title>Conclusions</title>
      <p>We designed the flow graph to represent a procedural code
along with important information about its structure and
behavior. We defined the HFG semantics using graph
rewriting, which is basically a grammar on graphs. We
defined semantics for both the sequential and general HFG
using the formalism, thus making the sequential HFG a
special version of the general HFG. We designed an
algorithm that transforms a sequential source code (C#
compiled to CIL) to a general HFG using mainly the HFG
semantics and graph algorithms. Therefore, the algorithm
can be modified for other platforms.</p>
      <p>
        We have implemented a prototype of the general HFG
that uses the defined semantics and is capable of
completely parallel computation. The prototype serves as the
proof of concept, showing that the HFG is defined
correctly. It is implemented using the Bobox framework [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
      </p>
    </sec>
    <sec id="sec-9">
      <title>Acknowledgments</title>
      <p>This paper was partially supported by the Grant Agency
of Charles University (GAUK) project 122214 and
by the Czech Science Foundation (GACR) project
P103/13/08195.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>“</given-names>
            <surname>TG3. Common Language Infrastructure (CLI). Standard</surname>
          </string-name>
          ECMA-
          <volume>335</volume>
          ,
          <year>June 2005</year>
          .”
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Peterson</surname>
            ,
            <given-names>J. L.</given-names>
          </string-name>
          :
          <article-title>Petri nets</article-title>
          .
          <source>ACM Comput. Surv.</source>
          ,
          <volume>9</volume>
          (
          <issue>3</issue>
          ) (Sep.
          <year>1977</year>
          ),
          <fpage>223</fpage>
          -
          <lpage>252</lpage>
          , [Online]. Available: http: //doi.acm.
          <source>org/10</source>
          .1145/356698.356702
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Gilles</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>The semantics of a simple language for parallel programming</article-title>
          .
          <source>In Information Processing: Proceedings of the IFIP Congress</source>
          <volume>74</volume>
          (
          <year>1974</year>
          ),
          <fpage>471</fpage>
          -
          <lpage>475</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Josephs</surname>
            ,
            <given-names>M. B.</given-names>
          </string-name>
          :
          <article-title>Models for data-flow sequential processes</article-title>
          .
          <source>In: Communicating Sequential Processes. The First 25 Years</source>
          , Springer,
          <year>2005</year>
          ,
          <fpage>85</fpage>
          -
          <lpage>97</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Ezpeleta</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Colom</surname>
            ,
            <given-names>J. M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Martinez</surname>
            ,
            <given-names>J.:</given-names>
          </string-name>
          <article-title>A Petri net based deadlock prevention policy for flexible manufacturing systems</article-title>
          .
          <source>Robotics and Automation, IEEE Transactions on 11 (2)</source>
          (
          <year>1995</year>
          ),
          <fpage>173</fpage>
          -
          <lpage>184</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Ehrig</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ehrig</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Prange</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Taentzer</surname>
          </string-name>
          , G.:
          <article-title>Graph transformation systems</article-title>
          .
          <source>Fundamentals of Algebraic Graph Transformation</source>
          (
          <year>2006</year>
          ),
          <fpage>37</fpage>
          -
          <lpage>71</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Corradini</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dotti</surname>
            ,
            <given-names>F. L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Foss</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ribeiro</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Translating java code to graph transformation systems</article-title>
          .
          <source>In: Graph Transformations</source>
          , Springer,
          <year>2004</year>
          ,
          <fpage>383</fpage>
          -
          <lpage>398</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Geiger</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zündorf</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Graph based debugging with fujaba</article-title>
          .
          <source>Electr. Notes Theor. Comput. Sci</source>
          .
          <volume>72</volume>
          (
          <issue>2</issue>
          ) (
          <year>2002</year>
          ),
          <fpage>112</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Balasubramanian</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Narayanan</surname>
            , A., van Buskirk,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Karsai</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>The graph rewriting and transformation language: Great. Electronic Communications of the EASST 1 (</article-title>
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Allen</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kennedy</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Optimizing compilers for modern architectures</article-title>
          . Morgan Kaufmann San Francisco,
          <year>2002</year>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Sridharan</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bodík</surname>
          </string-name>
          , R.:
          <article-title>Refinement-based contextsensitive points-to analysis for Java</article-title>
          .
          <source>ACM SIGPLAN Notices</source>
          <volume>41</volume>
          (
          <issue>6</issue>
          ) (
          <year>2006</year>
          ),
          <fpage>387</fpage>
          -
          <lpage>400</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Muchnick</surname>
            ,
            <given-names>S. S.:</given-names>
          </string-name>
          <article-title>Advanced compiler design implementation</article-title>
          . Morgan Kaufmann Publishers,
          <year>1997</year>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Reps</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Program analysis via graph reachability</article-title>
          .
          <source>Information and software technology 40 (11)</source>
          (
          <year>1998</year>
          ),
          <fpage>701</fpage>
          -
          <lpage>726</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Geilen</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Basten</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Requirements on the execution of Kahn process networks</article-title>
          .
          <source>In: Programming languages and systems</source>
          , Springer,
          <year>2003</year>
          ,
          <fpage>319</fpage>
          -
          <lpage>334</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <surname>Navarro</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Asenjo</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tabik</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cascaval</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Analytical modeling of pipeline parallelism</article-title>
          .
          <source>In: Parallel Architectures and Compilation Techniques</source>
          ,
          <year>2009</year>
          , PACT'
          <volume>09</volume>
          , 18th International Conference on, IEEE,
          <year>2009</year>
          ,
          <fpage>281</fpage>
          -
          <lpage>290</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <surname>Falt</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kruliš</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bednárek</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yaghob</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zavoral</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Locality aware task scheduling in parallel data stream processing</article-title>
          . In:
          <article-title>Intelligent Distributed Computing VIII, ser</article-title>
          .
          <source>Studies in Computational Intelligence</source>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Camacho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Braubach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Venticinque</surname>
          </string-name>
          , and C. Badica, Eds., Springer International Publishing,
          <year>2015</year>
          ,
          <volume>570</volume>
          ,
          <fpage>331</fpage>
          -
          <lpage>342</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Falt</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          , Cˇermák,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Dokulil</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            ,
            <surname>Zavoral</surname>
          </string-name>
          ,
          <string-name>
            <surname>F.</surname>
          </string-name>
          :
          <article-title>Parallel SPARQL query processing using Bobox</article-title>
          .
          <source>International Journal On Advances in Intelligent Systems</source>
          <volume>5</volume>
          (
          <issue>3</issue>
          ,4) (
          <year>2012</year>
          ),
          <fpage>302</fpage>
          -
          <lpage>314</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <surname>McNaughton</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yamada</surname>
          </string-name>
          , H.:
          <article-title>Regular expressions and state graphs for automata</article-title>
          ,
          <year>1960</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <surname>Jaffe</surname>
            ,
            <given-names>J.:</given-names>
          </string-name>
          <article-title>A necessary and sufficient pumping lemma for regular languages</article-title>
          .
          <source>ACM SIGACT News</source>
          <volume>10</volume>
          (
          <issue>2</issue>
          ) (
          <year>1978</year>
          ),
          <fpage>48</fpage>
          -
          <lpage>49</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>