<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Willem Visser, Klaus Havelund, Guillaume Brat,
Seungjoon Park, and Flavio Lerda. Model
Checking Programs. Automated Software Engg.</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>State Space Reduction Strategies for Model Checking Concurrent C Programs</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Amira Methni, Belgacem Ben Hedia, Matthieu Lemerre,</string-name>
          <email>matthieu.lemerreg@cea.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Serge Haddad</string-name>
          <email>haddad@lsv.ens-cachan.fr</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kamel Barkaoui</string-name>
          <email>kamel.barkaoui@cnam.fr</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>CEA, LIST, Centre de Saclay, PC172</institution>
          ,
          <addr-line>91191, Gif-sur-Yvette, FRANCE, famira.methni,belgacem.ben-hedia</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>CEDRIC Lab, CNAM</institution>
          ,
          <addr-line>Paris</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>LSV, ENS Cachan</institution>
          ,
          <addr-line>CNRS, &amp; INRIA</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>1989</year>
      </pub-date>
      <volume>10</volume>
      <issue>2</issue>
      <fpage>1</fpage>
      <lpage>22</lpage>
      <abstract>
        <p>Model checking is an effective technique for uncovering subtle errors in concurrent systems. Unfortunately, the state space explosion is the main bottleneck in model checking tools. Here we propose a state space reduction technique for model checking concurrent programs written in C. The reduction technique consists in an analysis phase, which defines an approximate agglomeration predicate. This latter states whether a statement can be agglomerated or not. We implement this predicate using a syntactic analysis, as well as a semantic analysis based on abstract interpretation. We show the usefulness of using agglomeration technique to reduce the state space, as well as to generate an abstract TLA+ specification from a C program.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>Model checking is an attractive formal verification
technique because it is automatic. It offers extensive
and thorough coverage by considering all possible
behaviors of a system, unlike traditional testing
methods. Given a set of properties expressed in
a temporal logic and a model, model-checking
automatically analyzes the state space of the
model and checks whether the model satisfies the
properties (Clarke et al. 1999). However, the main
obstacle of model checking is the state explosion
problem and concurrency is a major contributor to
this problem.</p>
      <p>
        Many solutions have already been investigated for
reducing the complexity of model checking. For
instance, by getting a simpler model from the original
one using abstraction technique (Clarke et al. 1994),
or by using on-the-fly model checking to eliminate
part of the search to the automaton representing the
(negation of the) checked property
        <xref ref-type="bibr" rid="ref3">(Fernandez et al.
1992)</xref>
        .
      </p>
    </sec>
    <sec id="sec-2">
      <title>1.1. Contribution</title>
      <p>
        In this paper, we present a state space reduction
technique for model checking concurrent programs
written in a low level language. We apply this
technique to the verification of C programs by an
explicit model checker. We use TLA+
        <xref ref-type="bibr" rid="ref10">(Lamport
1994)</xref>
        as a formal specification language for our
concurrent C programs and we base ourselves on
previous work reported in (Methni et al. 2015).
The reduction technique is based on an analysis
phase, which defines an approximate agglomeration
predicate. This latter states whether a statement
can be agglomerated or not. We implement this
predicate using a syntactic analysis, as well as a
semantic analysis based on abstract interpretation
of C code. The particularity of our method is that we
apply the reduction technique during the generation
of TLA+ code and by using the abstract interpretation
technique. We show the usefulness of using this
technique to reduce the state space during the
verification of C programs, as well as to generate an
abstract TLA+ specification from a C program.
      </p>
    </sec>
    <sec id="sec-3">
      <title>1.2. Outline</title>
      <p>The rest of the paper is organized as follows. We
give an overview of TLA+ in Section 2. Section 3
presents how we specify the semantics of C in TLA+.
Section 4 describes the reduction technique and how
we implement it on C programs. Experimental results
are presented in Section 5. We discuss related work
in Section 6. Section 7 concludes and illustrates
future research directions.</p>
    </sec>
    <sec id="sec-4">
      <title>2. OVERVIEW OF TLA</title>
      <p>
        TLA+
        <xref ref-type="bibr" rid="ref11">(Lamport 2002)</xref>
        is the specification language
of the Temporal Logic of Actions (TLA). TLA is
hf ormulai
hactioni
hpredicatei
hstate f unctioni
,
,
hpredicatei j [hactioni]hstate functioni j :hf ormulai
j hf ormulai ^ hf ormulai j hf ormulai
boolean valued expression containing constant symbols, variables,
and primed variables
hf ormulai with no primed variables j ENABLED hactioni
nonboolean expression containing constant symbols and variables
a variant of linear temporal logic introduced by
        <xref ref-type="bibr" rid="ref10">(Lamport 1994)</xref>
        for specifying and reasoning about
concurrent systems. Readers interested in a more
detailed presentation of TLA+ can refer to Lamport’s
book
        <xref ref-type="bibr" rid="ref11">(Lamport 2002)</xref>
        .
      </p>
      <p>TLA+ specifies a system by describing its possible
behaviors. A behavior is an infinite sequence
of states. A state is an assignment of values
to variables. A state function is a nonboolean
expression built from constants, variables and
constant operators and it assigns a value to each
state. For example, y + 3 is a state function that
assigns to state s the value 3 plus the value
that s assigns to the variable y. An action is a
boolean expression containing constants, variables
and primed variables (adorned with “0” operator).</p>
      <p>Unprimed variables refer to variable values in the
current state and primed variables refer to their
values in the next-state. Thus, an action represents
a relation between old states and new states. A state
predicate (or predicate for short) is an action with no
primed variables.</p>
      <p>
        The syntax of TLA is given in Figure 1 (the symbol
, means equal by definition). TLA+ formulas are
built up from actions and predicates using boolean
operators (: and ^ and others that can be derived
from these two), quantification over logical variables
(8; 9), and the unary temporal operator (always) of
the linear temporal logic
        <xref ref-type="bibr" rid="ref13 ref5">(Manna and Pnueli 1992)</xref>
        .
      </p>
      <p>The predicate “ENABLED A”, where A is an action, is
defined to be true in a state s iff there exists some
state t such that the pair of states hs; ti satisfies
A. The formula [A]vars, where A is an action and
vars the tuple of all system variables, is equal to
(A _ (vars0 = vars)) where vars0 is the expression
obtained by priming all variables in vars. It asserts
that every step (pair of successive states) is either
an A step or else leaves the values of all variables
vars unchanged. TLA+ defines the abbreviation
“UNCHANGED vars” to denote that vars0 = vars.</p>
      <p>While TLA+ permits a variety of specification styles,
the specification that we use is defined by:</p>
      <p>Spec , Init ^
[N ext]vars ^ F airness</p>
      <p>Formula Spec is true of a behavior iff Init is
true of the first state of and every step of is
either a N ext step or a “stuttering step”, in which
none of the specified variables change their values,
and F airness holds. The behaviors satisfying the
specification formula given by Equation (1) are the
ones that represent correct behaviors of the system,
where a behavior represents a conceivable history of
a universe that may contain the system.</p>
      <p>The TLA+ formula Spec ) is valid when the
model represented by Spec satisfies the property ,
or implements the model .</p>
      <p>TLA+ has an explicit model checker called TLC that
can be used to check the validity of safety and
liveness properties. TLC handles specifications that
have the standard form of the formula (1). For this
reason, we only use specification formula of the form
of Equation (1). TLC requires a configuration file
which defines the finite-state instance to analyze. It
begins by generating all states satisfying the initial
predicate Init. Then, it generates every possible
next-state t such that the pair of states hs; ti satisfies
N ext and the F airness constraints, looking for a
state where an invariant is violated. Finally, it checks
temporal properties over the state space.</p>
    </sec>
    <sec id="sec-5">
      <title>3. TRANSLATION FROM C TO TLA+</title>
      <p>Our approach to checking a concurrent C program is
to first translate it into a TLA+ specification, to which
the TLA+ tools can be applied. In what follows, we
briefly present how we specify the semantics of C in
TLA+ by describing the memory layout considered
and how we model the control flow of a C program.
The memory is separated into four areas that do not
overlap:</p>
    </sec>
    <sec id="sec-6">
      <title>3.1. Memory Layout</title>
      <p>
        C file is parsed and normalized according to CIL
(C Intermediate Language)
        <xref ref-type="bibr" rid="ref17">(Necula et al. 2002)</xref>
        which transforms complicated constructs of C into
simpler ones. This transformation makes programs
more amenable to analysis and transformation.
According to the Abstract Syntax Tree (AST) of the
C program, C2TLA+ generates automatically a TLA+
specification according to a set of translation rules
detailed in our previous work (Methni et al. 2015).
In C2TLA+, a concurrent program consists of
many interleaved sequences of operations called
processes, corresponding to threads in C. Each
process has a unique identifier id. The set of all
processes is determined by the TLA+ constant
P rocSet.
      </p>
      <p>Figure 2 presents a C program and the content of
the memory as modeled by C2TLA+. We consider
that the C code is executed by two processes. One
process executes p1() function and the other one
executes p2() function.
a shared memory called data that stores
global (and static) variables. In the example of
Figure 2a, the x variable is shared by the two
processes.
a local memory for each process, called
stack and stores local variables and function
parameters. The memory stack[id] specifies
the local memory of process id and is
composed of stack frames. Each stack frame
corresponds to a call to a function. In the
example of Figure 2a, stack[1] is composed
of two stack frames, one of p1() function and
one of inc() function. When a function call
terminates, its stack frame is removed.
a local memory for each process called
register modeled as a sequence and stores
the program counter of each process. The
head of this sequence contains the statement
being currently executed by the process id.
a local memory called ret which contains
values to be returned by processes.</p>
      <p>The memory is modeled in TLA+ by a variable called
memory. It is a record whose four fields represent
the four memory areas. The global memory data
behaves like an array of values, whereas stack
and register behave like a FIFO (First In, First
Out) queues. Access to those memory areas is
addressed using offsets. So, a memory address is
a couple [loc,of f s] of memory location loc, (data or
stack area) and an offset of f s in this location. For
instance, Addr x specified in Figure 2b defines the
memory address of x variable.</p>
      <p>The main operations that manage the memory are
load() and store():
load() is the function that given the current
state of memory mem and a memory address
addr (in the form of [loc; of f s]), returns the
value stored at the address addr in the memory
mem,
store() is the function that given the current
state of memory mem, a memory address
addr and a value val, returns the new copy of
the memory after storing the value val at the
memory address addr.</p>
    </sec>
    <sec id="sec-7">
      <title>3.2. Specifying the C control-flow</title>
      <p>Each C statement i is identified in C2TLA+ by a
label assigned by CIL and is modeled by a TLA+
function, noted stmti(), which takes as arguments
the process identifier id and the memory mem,
and returns the new content of the memory after
executing the statement.</p>
      <p>Each stmti() updates the program counter register
of the process id and may change the content of
mem, stack, and/or ret memory areas depending
on the type of the statement (assignment, jump
statements, etc.). For instance, the statement
on line 20 is translated into the TLA+ function
p2 20(id; mem) defined as follows:
p2 20(id; mem) =
LET mcopy = load(id; Addr x; [val 7! 3]) IN
[data 7! mcopy:mem; stack ! mcopy:stack;
register 7! [mem:register EXCEPT ![id] =
h[pc 7! h“p2 21”i; f p 7! Head(mem:register[id]):f p]i</p>
      <p>T ail(mem:register[id]);
ret 7! mem:ret]
The definition of p2 20() function uses the TLA+
construct LET/IN to define a temporary variable that
stores the value of the memory after affecting the
value 3 to the memory address Addr x. The symbol
defines the concatenation operator for TLA+
sequences. Head(s) is a TLA+ function that returns
the head of the sequence s and T ail(s) returns
the tail of the sequence s. Then, the register[id]
is updated by the label value of the successor
statement given by the control flow graph (CFG) of
the C program (provided by CIL).</p>
      <p>The control flow of the C program in C2TLA+ is
ensured by the dispatch() function. For the example
of Figure 2a, this function is defined as follows:
dispatch(id; mem) =</p>
      <p>CASE Head(mem:register[id]):pc = "inc 6"
! inc 6(id; mem)
Head(mem:register[id]):pc = "inc 7"
! inc 7(id; mem)
Head(mem:register[id]):pc = "p1 11"
! p1 11(id; mem)
Head(mem:register[id]):pc = "p1 12"
! p1 12(id; mem)
Head(mem:register[id]):pc = "p1 13"
! p1 13(id; mem)
Head(mem:register[id]):pc = "p1 14"
! p1 14(id; mem)
Head(mem:register[id]):pc = "p2 19"
! p2 19(id; mem)
Head(mem:register[id]):pc = "p2 20"
! p2 20(id; mem)
Head(mem:register[id]):pc = "p2 21"
! p2 21(id; mem)</p>
      <p>OTHER ! mem
The dispatch() function calls, according to the
value of the pc field contained at the top
the process register (determined by the
expression Head(mem:register[id]):pc), the corresponding
TLA+ function to execute, i. e., the C instruction to
execute.</p>
      <p>The C program is thus simulated by the Spec formula
given by equation (1). The Init predicate specifies
the initial values of the memory and the N ext action
is defined as follows:</p>
      <p>N ext =
_ 9 id 2 P rocSet :
^ memory:regsiter[id] 6= hi
^ memory0 = dispatch(id; mem)
_ 8 id 2 P rocSet :
^ memory:regsiter[id] = hi
^ UNCHANGED memory
It states that one of the processes that has not
finished execution (its register[id] is not empty) is
nondeterministically chosen to execute one action
until all processes finish execution, i. e., all registers
become empty. Executing an action consists in
calling dispatch() function. For example, when
Head(mem:register[id]):pc equals to "inc 6", calling
the function inc 6(id; mem) will update the value of
stack[id] (as tmp is stored in the local memory) as
well as the top of register[id]. As the register[id] is
still not empty, the control flow is thus passed to the
successor statement.</p>
      <p>The behavior of the C program modeled by the Spec
formula can be given in terms of a state transition
system.</p>
      <p>Definition 1. A state transition system is a 3-tuple
T = (Q; I; ) given by
a finite set of states Q,
a set I Q of initial states, specified by the
Init predicate,
a transition relation Q Q that links two
states. This latter corresponds to satisfying the
predicate N ext.</p>
      <p>The state transition system encodes the state space
of the corresponding TLA+ specification of the C
program.</p>
      <p>Figure 3 illustrates the state space of the
corresponding TLA+ specification of the C code given in
Figure 2a. It consists in all the possible interleaving
of process execution. In order to simplify, we
represent only the content of pc field contained at the
top of the register[1] and register[2] memories. Each
state of the graph matches a valuation of memory
variable, i. e., its four fields.</p>
    </sec>
    <sec id="sec-8">
      <title>3.3. Process Synchronization</title>
      <p>All processes interact with each other through the
shared memory data. Concurrent access to this
latter is ensured via synchronization mechanism. There
are many different ways to implement concurrency
synchronization in C. For instance, by using locks
and semaphores, or by providing low level
hardware instructions (e. g., test-and-set and
compareand-swap). To support synchronization mechanism,
generated TLA+ specifications by C2TLA+ can be
completed with manually written TLA+ specifications
to provide concurrency primitives and atomic
instructions. More detailed information about integrating
synchronization primitives in TLA+ specifications can
be found in our previous work (Methni et al. 2015).</p>
    </sec>
    <sec id="sec-9">
      <title>4. APPLYING REDUCTION ON C PROGRAMS</title>
      <p>The process of generating an optimized TLA+
specification is illustrated in Figure 4. To apply
reduction on C programs, it is necessary to define
the agglomeration predicate. The C program is
first analyzed. This analysis phase defines an
approximate agglomeration predicate which takes
as argument a C statement and returns true or
false depending on whether the statement can be
agglomerated or not. This predicate can be safe or
unsafe. The meaning of safe predicate depends on
how the analysis is performed.</p>
      <p>The predicate is said safe when the analysis
is a safe approximation. Its definition is as
follows:
– if the statement does not modify the
shared memory, the predicate returns
true,
– if the statement modifies the shared
memory or it is unknown, it returns false.</p>
      <p>The unknown predicate states that we have no
idea if the statement modifies the memory or
not.</p>
      <p>The predicate is called unsafe when a
statement is agglomerated and we are not sure
if it modifies the global memory or not.
In this Section, we introduce the agglomeration
technique by an example. Then, we describe the
implementation of a safe agglomeration predicate
by using a syntactic and semantic analysis on C
programs. Then, we show the interest of using
an unsafe agglomeration predicate to generate an
abstract TLA+ specification of a C program.
In what follows, we use the expression
agglomerating TLA+ actions, to designate agglomerating the
corresponding statements in the C program.</p>
    </sec>
    <sec id="sec-10">
      <title>4.1. An introducing example</title>
      <p>
        As the semantics of a TLA+ is expressed through
a state transition system, where transitions between
states are ensured by TLA+ actions, the reduction
technique consists in agglomerating consecutive
actions into one atomic action which performs the
effects of the original ones. The reduction idea based
on agglomeration has been widely used in Petri Nets
        <xref ref-type="bibr" rid="ref7">(Haddad and Pradat-Peyre 2006; Berthelot 1986)</xref>
        .
Figure 5 shows three consecutive states linked by
two actions x0 = x + 1 and x0 = x + 2. The result
of this agglomeration (represented by !), is two
states linked by one atomic action which is the result
of executing the action x0 = x + 1, then x0 = x + 2.
x = 2
x0 = x + 1
x0 = x + 2
x = 3
      </p>
      <p>!
x = 5
x = 2
x = 5</p>
      <p>x0 = (x + 1) + 2</p>
    </sec>
    <sec id="sec-11">
      <title>4.2. Using syntactic and semantic analysis</title>
    </sec>
    <sec id="sec-12">
      <title>Syntactic analysis</title>
      <p>A syntactic analysis is performed to detect
statements on which reduction can be applied. Often, C
functions make use of local variables, and when a
statement refers only to local variables, the value
for which the statement is executed by a process
cannot change the execution of other processes.
Furthermore, we assume that statements involving
local pointer variable cannot be agglomerated as
they may reference shared memory.</p>
      <p>Moreover, we consider that jump statements, namely
goto, break and continue, can be agglomerated
with its successor (designated by computing the
CFG), as they only change the local register of the
process (register[id] in TLA+ specification).</p>
    </sec>
    <sec id="sec-13">
      <title>Semantic analysis</title>
      <p>In many scenarios, a concurrent C program could
contain, in its global memory, data blocks that are
accessed only by one process at a time. In that case,
syntactic analysis is insufficient. Therefore, we use
a semantic approximation predicate. The C program
is thus analyzed and an approximation of memory
access is computed using the Mthread Frama-C
plugin (Mth). This latter provides information about
the memory zones that are accessed concurrently
by more than one process and those that are not. In
this case, the agglomeration predicate is safe as the
analysis is based on an over-approximation of the
memory.</p>
      <p>We consider the example given by Figure 2a.
To illustrate the agglomeration technique on this
example, we represent the C program by its control
flow graph, illustrated by Figure 6a, where each
state of the graph represents a C statement and
edges represent the control flow. After applying the
syntactic and semantic analysis on this example,
the control flow graph is transformed into a smaller
graph, given in Figure 6b. Each state of the
graph corresponds to one statement or a block of
statements. As the inc() function uses only local
memory, its block definition is combined into one
basic block. For p1() function, statement on line
11 is agglomerated with statement on line 12 and
statement on line 14 is agglomerated with the return
instruction.</p>
    </sec>
    <sec id="sec-14">
      <title>4.3. Generating an abstract specification</title>
      <p>
        Agglomerating statements can also be useful to
generate an abstract TLA+ specification of a C
program. The user can define which C statements
can be agglomerated using an unsafe predicate. The
resulting TLA+ specification can be viewed as an
abstract formal specification of the C program.
As TLA+ is a fragment of LTLnx (Linear Temporal
Logic without the “next operator”), it is well known
that the equivalence between checking a property
given in LTLnx on an abstract model and checking it
on the original model is ensured by the preservation
        <xref ref-type="bibr" rid="ref5">(Goltz et al. 1992)</xref>
        .
      </p>
    </sec>
    <sec id="sec-15">
      <title>Example: Agglomerating critical sections</title>
      <p>Consider the following fragment of C code (Figure 7)
implementing an example of the producer/consumer
model. The two processes share a buffer protected
by a mutex m. The synchronization between
processes is ensured by two semaphores empty and
full. Mutex and semaphores are implemented as
an integer values and are only accessible through
two atomic operations P() and V().</p>
      <p>Translating this implementation into a TLA+
specification and model checking results in verifying
all interleavings of actions between processes. We
define the agglomeration predicate that states that
statements protected by mutex (namely by P() and
V() primitives) can be agglomerated. After reduction,
the control flow graph of this example is illustrated in
Figure 8b.</p>
      <p>Therefore, the block statements from line 12 to line
14 and that from line 12 to 26 are agglomerated into
one state. The state space of the TLA+ specification
generated after agglomerations contains fewer
states than the one without agglomerations as the
reduction inside the critical section restricts the
amount of interleaving allowed between processes.
We define the mutual exclusion property in TLA+ as
follows:
mut exclusion(lbl1; lbl2) = ((8 id1; id2 2 P rocSet :
^ (id1 6= id2) ^ (Head(memory:register[id1]) 6= hi)
^(Head(memory:register[id2]) 6= hi)
^(Head(memory:regsiter[id1]):pc = lbl1))
) Head(memory:register[id2]):pc 6= lbl2)
In what follows, we show how we implement
the reduction on TLA+ specification. As described
in Section 3, each execution of N ext action
corresponds to executing an atomic C statement.
The reduction in C programs, consists in translating
a sequence of C statements into one action instead
of multiple ones. Let i be the identifier of a statement
and j be the identifier of its successor. To do that, we
generate for each statement i a new function that we
call stmt longi() defined below.</p>
      <p>stmt longi(id; mem) =</p>
      <p>stmt longj(id; dispatch(id; mem))
The definition of stmt longi(id; mem) consists in
calling the function of the successor statement j,
noted by stmt longj() and passing as argument the
memory state returned by dispatch(id; mem).
(a) Before reduction
(b) After reduction
To generate a reduced TLA+ specification, we
iterate over all statements and when the
agglomeration predicate returns true for a
statement i, its translation consists in calling the
function stmt longj (id; dispatch(id; mem)). Otherwise,
we call dispatch(id; mem) function.</p>
      <p>The C program is thus specified by the formula
Spec given by equation (1), except that the N ext
action calls a new function dispatch red(), instead of
dispatch() function. For the example of figure 2a, the
dispatch red() function is defined in Figure 9:
The dispatch red(id; mem) function calls according
to the value the program counter pc contained at the
head of mem:register[id] the corresponding TLA+
function to execute.</p>
    </sec>
    <sec id="sec-16">
      <title>5. EXPERIMENTS</title>
      <p>
        The reduction technique is totally automatic and was
integrated in C2TLA+ which is a Frama-C
        <xref ref-type="bibr" rid="ref1">(Cuoq
et al. 2012)</xref>
        plugin, implemented in OCaml. This
Section is concerned with our practical experience.
We use the Mthread plugin results and the syntactic
analysis as described in Section 4 to implement our
agglomeration technique.
      </p>
      <p>We consider one sequential C program and four
concurrent programs:
dispatch red(id; mem) =</p>
      <p>CASE Head(mem:st[id]):pc = "inc 6"</p>
      <p>! inc long 6(id; mem)
Head(mem:register[id]):pc = "inc 7"</p>
      <p>! inc long 7(id; mem)
Head(mem:register[id]):pc = "p1 11"</p>
      <p>! p1 long 11(id; mem)
Head(mem:register[id]):pc = "p1 12"</p>
      <p>! p1 long 12(id; mem)
Head(mem:register[id]):pc = "p1 13"</p>
      <p>! p1 long 13(id; mem)
Head(mem:register[id]):pc = "p1 14"</p>
      <p>! p1 long 14(id; mem)
Head(mem:register[id]):pc = "p2 19"</p>
      <p>! p2 long 19(id; mem)
Head(mem:register[id]):pc = "p2 20"</p>
      <p>! p2 long 20(id; mem)
Head(mem:register[id]):pc = "p2 21"</p>
      <p>! p2 long 21(id; mem)</p>
      <p>OTHER ! mem</p>
      <p>
        Lamport’s Bakery and Peterson algorithms
obtained from
        <xref ref-type="bibr" rid="ref18">(Raynal 2013)</xref>
        and Dekker
mutual exclusion algorithm presented in
        <xref ref-type="bibr" rid="ref2">(Dijkstra
1968)</xref>
        .
      </p>
      <p>
        Dining philosopher problem. We use the
solution that appears in Tanenbaum’s book
        <xref ref-type="bibr" rid="ref19">(Tanenbaum 2007)</xref>
        .
      </p>
      <p>These programs make typical examples for
demonstrating the strength of the state space reduction.
C2LTA+ takes as input a C program and generates
for each one the corresponding TLA+ specification.
Using the TLC model checker, we compute the total
number of generated states and we verify a set of
properties on the two specifications.</p>
      <p>Results of experiments are shown in Table 1, where
#Proc denotes the process number, #St denotes the
numbers of states and #T denotes the time for model
checking in seconds. Columns 3 to 6 give information
about the state space generated with and without
applying the reduction technique. The last column
indicates the reduction factor, the ratio between the
state space generated without reduction and the one
after applying the reduction technique.</p>
      <p>All experimental results were performed on an
Intel Core Pentium i7-2760QM machine with 8
cores (2.40GHz each), with 8Gb of RAM memory.
For zunebug, one property to verify is program
termination, which is a liveness property that we
express as follows:</p>
      <p>termination = (Head(memory:register[1]) = hi)
This property asserts that the register of the
program will eventually be empty. For the TLA+
specification without agglomeration, checking this
property causes TLC to report an error. This error
occurs when the code takes as input the last day
of a leap year, causing the code to enter into an
infinite loop. After applying the reduction technique
for the zunebug program, the state space size of its
corresponding TLA+ specification equals 2. This is
due to the fact that the program is sequential. Model
checking the TLA+ code with the last day of a leap
year causes the TLC model to report an incorrect
recursive function definition.</p>
      <p>For the concurrent programs, the mutual exclusion
property has been successfully verified on Peterson,
Bakery, Dekker and Philosophers benchmarks. As
expected, the size of the state space with
agglomerations is always smaller than the one without
agglomerations. For the philosopher example with
5 processes, the state space without agglomeration
takes more than 7 hours to be model checked.
However, using the reduction technique the
specification is verified in 6 minutes. The reduction factor
in this cas reaches 99.32. The reduction technique
obtains good results on these benchmarks due to the
elimination of some intermediate states.</p>
    </sec>
    <sec id="sec-17">
      <title>6. RELATED WORK</title>
      <p>There are a wealth of research contributions on
formal verification of software as well as techniques
for the reduction of the state space.</p>
      <p>
        Program slicing is a technique introduced by
(Weiser 1981) for simplifying sequential programs for
debugging and program understanding. It consists
in removing from the program features that are
irrelevant for the property to be verified. Recently,
slicing technique has been used to reduce the state
space of a system in model checking. It has been
applied to Promela
        <xref ref-type="bibr" rid="ref16">(Millett and Teitelbaum 2000)</xref>
        ,
the input language for the Spin model checker
        <xref ref-type="bibr" rid="ref9">(Holzmann 1997)</xref>
        . The interested reader can refer to
        <xref ref-type="bibr" rid="ref20">(Tip 1995)</xref>
        for a detailed description of the different
approaches used in the program slicing.
      </p>
      <p>
        Predicate abstraction
        <xref ref-type="bibr" rid="ref6">(Graf and Sa¨ıdi 1997)</xref>
        is a
technique in which a set of predicates over the
programs variables is used to construct an abstract
program. This technique is being used in SLAM (Ball
and Rajamani 2002), BLAST
        <xref ref-type="bibr" rid="ref8">(Henzinger et al. 2003)</xref>
        and MAGIC (Chaki et al. 2004).
      </p>
      <p>
        Other approaches perform reduction during
exploration of the state space of the program. For
example, partial order reduction (Valmari 1989) is a
technique which explores only a representative subset
of the state space of a model. The basic idea is to
exploit the commutativity caused by the
interleavings of transitions, which result in the same state.
This technique was first introduced for checking the
absence of deadlock. Subsequently, a number of
variants of this technique have been developed and
integrated in verification tools, like Spin
        <xref ref-type="bibr" rid="ref9">(Holzmann
1997)</xref>
        and Verisoft
        <xref ref-type="bibr" rid="ref4">(Godefroid 1997)</xref>
        .
      </p>
      <p>
        Although we have mentioned some projects in the
C context, there are also significant works interested
in model checking the Java language. For example,
JPF (Visser et al. 2003) uses state compression
technique to handle big states, partial order and
symmetry reduction, slicing, abstraction and runtime
analysis techniques to reduce the state space.
In this work, the state space reduction technique that
we propose is closer to that originally introduced by
(Berthelot 1986) in Petri nets formalism. Berthelot
developed a large set of reduction rules for reducing
the complexity of verification. Extended work has
been proposed by
        <xref ref-type="bibr" rid="ref7">(Haddad and Pradat-Peyre 2006)</xref>
        .
Our work differs from this latter by the fact that the
model of our TLA+ specification is a state transition
system and the agglomeration predicate depends
on the analysis of the C program. Our reduction
technique is applied during the generation of TLA+
code unlike the partial order reduction technique
which performs reduction during the construction of
the state space. Besides, we use TLA+ as formal
framework which provides an expressive power to
specify the semantics of a programming language
and can reason about concurrent systems and
can express safety and liveness properties unlike
SLAM and BLAST which have limited support for
concurrent properties as they only check safety
properties.
      </p>
    </sec>
    <sec id="sec-18">
      <title>7. CONCLUSION AND FUTURE WORK</title>
      <p>We have proposed a technique to reduce the state
space for model checking C programs. We used
C2TLA+ to translate the semantics of C to the
formal specification language TLA+. This reduction
technique is based on an analysis phase, which
defines an approximate agglomeration predicate that
states whether a statement can be agglomerated
or not. We implemented this predicate by applying
a syntactic and semantic analysis on C Programs.
We illustrated the effectiveness of applying the
agglomeration technique to reduce the state space
during the verification of C programs and also as
well as to define an abstract TLA+ specification that
model the behavior of C programs.</p>
      <p>
        We aim to integrate a mechanism for structuring
large TLA+ specifications from C programs using
a refinement process between different levels of
abstraction. Finally, we are planning to apply the
methodology on a critical part of the microkernel of
the PharOS
        <xref ref-type="bibr" rid="ref12">(Lemerre et al. 2011)</xref>
        real-time operating
system (RTOS).
Mthread plugin.
      </p>
      <p>mthread.html.</p>
      <p>URL http://frama-c.com/
Thomas Ball and Sriram K. Rajamani. The SLAM
project: Debugging System Software via Static
Analysis. SIGPLAN Not, 2002.</p>
      <p>Ge´rard Berthelot. Checking properties of nets
using transformation. In Advances in Petri Nets
1985, Covers the 6th European Workshop on
Applications and Theory in Petri Nets-selected
Papers, pages 19–40, London, UK, UK, 1986.</p>
      <p>Springer-Verlag.</p>
      <p>Sagar Chaki, Edmund M. Clarke, Alex Groce,
Somesh Jha, and Helmut Veith. Modular
verification of software components in c. IEEE
Trans. Software Eng., 30(6):388–402, 2004.
Edmund M. Clarke, Orna Grumberg, and David E.</p>
      <p>Long. Model Checking and Abstraction. ACM
Trans. Program. Lang. Syst., 16(5):1512–1542,
1994.</p>
      <p>Edmund M. Clarke, Jr., Orna Grumberg, and
Doron A. Peled. Model Checking. MIT Press,
Cambridge, MA, USA, 1999.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <given-names>Pascal</given-names>
            <surname>Cuoq</surname>
          </string-name>
          , Florent Kirchner, Nikolai Kosmatov, Virgile Prevosto, Julien Signoles, and
          <string-name>
            <given-names>Boris</given-names>
            <surname>Yakobowski</surname>
          </string-name>
          .
          <string-name>
            <surname>Frama-C: A Software Analysis</surname>
          </string-name>
          <article-title>Perspective</article-title>
          .
          <source>In Proceedings of the 10th international conference on Software Engineering and Formal Methods, SEFM'12</source>
          , pages
          <fpage>233</fpage>
          -
          <lpage>247</lpage>
          , Berlin, Heidelberg,
          <year>2012</year>
          . Springer-Verlag. URL http://frama-c.com/.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <given-names>Edsger W.</given-names>
            <surname>Dijkstra</surname>
          </string-name>
          .
          <article-title>Cooperating sequential processes</article-title>
          . In F. Genuys, editor,
          <source>Programming Languages: NATO Advanced Study Institute</source>
          , pages
          <fpage>43</fpage>
          -
          <lpage>112</lpage>
          . Academic Press,
          <year>1968</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Jean-Claude</surname>
            <given-names>Fernandez</given-names>
          </string-name>
          , Laurent Mounier, Claude Jard, and
          <string-name>
            <given-names>Thierry</given-names>
            <surname>Jron</surname>
          </string-name>
          .
          <article-title>On-the-fly verification of finite transition systems</article-title>
          .
          <source>Formal Methods in System Design</source>
          ,
          <volume>1</volume>
          (
          <issue>2</issue>
          -3):
          <fpage>251</fpage>
          -
          <lpage>273</lpage>
          ,
          <year>1992</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <given-names>Patrice</given-names>
            <surname>Godefroid</surname>
          </string-name>
          .
          <article-title>Model Checking for Programming Languages using VeriSoft</article-title>
          .
          <source>In In Proceeedings of the 24th ACM Symposium on Principles of Programming Languages</source>
          , pages
          <fpage>174</fpage>
          -
          <lpage>186</lpage>
          . ACM Press,
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>Ursula</given-names>
            <surname>Goltz</surname>
          </string-name>
          , Ruurd Kuiper, and
          <string-name>
            <given-names>Wojciech</given-names>
            <surname>Penczek</surname>
          </string-name>
          .
          <article-title>Propositional Temporal Logics and Equivalences</article-title>
          . In W.R. Cleaveland, editor,
          <source>CONCUR '92</source>
          , volume
          <volume>630</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>222</fpage>
          -
          <lpage>236</lpage>
          . Springer Berlin Heidelberg,
          <year>1992</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <given-names>Susanne</given-names>
            <surname>Graf</surname>
          </string-name>
          and
          <article-title>Hassen Sa¨ıdi. Construction of Abstract State Graphs with PVS</article-title>
          .
          <source>In Proceedings of the 9th International Conference on Computer Aided Verification, CAV '97</source>
          , pages
          <fpage>72</fpage>
          -
          <lpage>83</lpage>
          , London, UK, UK,
          <year>1997</year>
          . Springer-Verlag.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <given-names>Serge</given-names>
            <surname>Haddad</surname>
          </string-name>
          and
          <article-title>Jean-Franc¸ois Pradat-Peyre. New Efficient Petri Nets Reductions for Parallel Programs Verification</article-title>
          .
          <source>Parallel Processing Letters</source>
          ,
          <volume>16</volume>
          (
          <issue>1</issue>
          ):
          <fpage>101</fpage>
          -
          <lpage>116</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <given-names>Thomas A.</given-names>
            <surname>Henzinger</surname>
          </string-name>
          , Ranjit Jhala, Rupak Majumdar, and
          <string-name>
            <given-names>Gregoire</given-names>
            <surname>Sutre</surname>
          </string-name>
          .
          <source>Software Verification with BLAST</source>
          . pages
          <fpage>235</fpage>
          -
          <lpage>239</lpage>
          . Springer,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <given-names>Gerard J.</given-names>
            <surname>Holzmann</surname>
          </string-name>
          .
          <article-title>The Model Checker SPIN</article-title>
          .
          <source>IEEE Trans. Software Eng.</source>
          ,
          <volume>23</volume>
          (
          <issue>5</issue>
          ):
          <fpage>279</fpage>
          -
          <lpage>295</lpage>
          ,
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <given-names>Leslie</given-names>
            <surname>Lamport</surname>
          </string-name>
          .
          <article-title>The Temporal Logic of Actions</article-title>
          .
          <source>ACM Trans. Program. Lang. Syst.</source>
          ,
          <volume>16</volume>
          (
          <issue>3</issue>
          ):
          <fpage>872</fpage>
          -
          <lpage>923</lpage>
          ,
          <year>1994</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <given-names>Leslie</given-names>
            <surname>Lamport</surname>
          </string-name>
          .
          <source>Specifying Systems</source>
          ,
          <string-name>
            <surname>The</surname>
            <given-names>TLA</given-names>
          </string-name>
          +
          <article-title>Language and Tools for Hardware and Software Engineers</article-title>
          .
          <source>Addison-Wesley</source>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <given-names>M.</given-names>
            <surname>Lemerre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Ohayon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chabrol</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Jan</surname>
          </string-name>
          , and
          <string-name>
            <surname>M-B. Jacques</surname>
          </string-name>
          .
          <article-title>Method and Tools for MixedCriticality Real-Time Applications within PharOS</article-title>
          .
          <source>In Proceedings of AMICS 2011: 1st International Workshop on Architectures and Applications for Mixed-Criticality Systems</source>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <given-names>Zohar</given-names>
            <surname>Manna</surname>
          </string-name>
          and
          <string-name>
            <given-names>Amir</given-names>
            <surname>Pnueli</surname>
          </string-name>
          .
          <article-title>The Temporal Logic of Reactive and Concurrent Systems</article-title>
          . SpringerVerlag New York, Inc., New York, NY, USA,
          <year>1992</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <given-names>Amira</given-names>
            <surname>Methni</surname>
          </string-name>
          , Matthieu Lemerre, Belgacem Ben Hedia, Serge Haddad, and
          <string-name>
            <given-names>Kamel</given-names>
            <surname>Barkaoui</surname>
          </string-name>
          .
          <article-title>Specifying and Verifying Concurrent C Programs with TLA+</article-title>
          .
          <source>In Formal Techniques for Safety-Critical Systems</source>
          , volume
          <volume>476</volume>
          of Communications in Computer and Information Science, pages
          <fpage>206</fpage>
          -
          <lpage>222</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <surname>Springer</surname>
          </string-name>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <surname>Lynette</surname>
            <given-names>I. Millett</given-names>
          </string-name>
          and
          <string-name>
            <given-names>Tim</given-names>
            <surname>Teitelbaum</surname>
          </string-name>
          .
          <article-title>Issues in Slicing PROMELA and its Applications to Model Checking, Protocol Understanding, and Simulation</article-title>
          .
          <source>International Journal on Software Tools for Technology Transfer</source>
          ,
          <volume>2</volume>
          (
          <issue>4</issue>
          ):
          <fpage>343</fpage>
          -
          <lpage>349</lpage>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <given-names>George C.</given-names>
            <surname>Necula</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Scott</given-names>
            <surname>Mcpeak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Shree P.</given-names>
            <surname>Rahul</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Westley</given-names>
            <surname>Weimer</surname>
          </string-name>
          . CIL:
          <article-title>Intermediate Language and Tools for Analysis and Transformation of C Programs</article-title>
          .
          <source>In International Conference on Compiler Construction</source>
          , pages
          <fpage>213</fpage>
          -
          <lpage>228</lpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <given-names>Michel</given-names>
            <surname>Raynal. Concurrent Programming</surname>
          </string-name>
          : Algorithms, Principles, and Foundations. Springer, Heidelberg,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <string-name>
            <given-names>Andrew S.</given-names>
            <surname>Tanenbaum</surname>
          </string-name>
          .
          <article-title>Modern Operating Systems</article-title>
          . Prentice Hall Press, Upper Saddle River, NJ, USA, 3rd edition,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <string-name>
            <given-names>Frank</given-names>
            <surname>Tip</surname>
          </string-name>
          .
          <article-title>A Survey of Program Slicing Techniques</article-title>
          .
          <source>Journal of Programming Languages</source>
          ,
          <volume>3</volume>
          :
          <fpage>121</fpage>
          -
          <lpage>189</lpage>
          ,
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>