=Paper=
{{Paper
|id=Vol-1431/paper7
|storemode=property
|title=State Space Reduction Strategie for Model Checking Concurrent C Programs
|pdfUrl=https://ceur-ws.org/Vol-1431/paper7.pdf
|volume=Vol-1431
|dblpUrl=https://dblp.org/rec/conf/vecos/MethniLHHB15
}}
==State Space Reduction Strategie for Model Checking Concurrent C Programs==
State Space Reduction Strategies for Model Checking Concurrent C Programs Amira Methni, Belgacem Ben Hedia, Matthieu Lemerre, Serge Haddad Kamel Barkaoui CEA, LIST, Centre de Saclay, LSV, ENS Cachan, CNRS CEDRIC Lab, CNAM, PC172, 91191, Gif-sur-Yvette, FRANCE & INRIA, France Paris, France {amira.methni,belgacem.ben-hedia, haddad@lsv.ens-cachan.fr kamel.barkaoui@cnam.fr matthieu.lemerre}@cea.fr Model checking is an effective technique for uncovering subtle errors in concurrent systems. Unfortunately, the state space explosion is the main bottleneck in model checking tools. Here we propose a state space reduction technique for model checking concurrent programs written in C. The reduction technique consists in an analysis phase, which defines an approximate agglomeration predicate. This latter states whether a statement can be agglomerated or not. We implement this predicate using a syntactic analysis, as well as a semantic analysis based on abstract interpretation. We show the usefulness of using agglomeration technique to reduce the state space, as well as to generate an abstract TLA+ specification from a C program. Model checking, TLA, State space reduction, Agglomeration predicate. 1. INTRODUCTION concurrent C programs and we base ourselves on previous work reported in (Methni et al. 2015). Model checking is an attractive formal verification The reduction technique is based on an analysis technique because it is automatic. It offers extensive phase, which defines an approximate agglomeration and thorough coverage by considering all possible predicate. This latter states whether a statement behaviors of a system, unlike traditional testing can be agglomerated or not. We implement this methods. Given a set of properties expressed in predicate using a syntactic analysis, as well as a a temporal logic and a model, model-checking semantic analysis based on abstract interpretation automatically analyzes the state space of the of C code. The particularity of our method is that we model and checks whether the model satisfies the apply the reduction technique during the generation properties (Clarke et al. 1999). However, the main of TLA+ code and by using the abstract interpretation obstacle of model checking is the state explosion technique. We show the usefulness of using this problem and concurrency is a major contributor to technique to reduce the state space during the this problem. verification of C programs, as well as to generate an abstract TLA+ specification from a C program. Many solutions have already been investigated for reducing the complexity of model checking. For 1.2. Outline instance, by getting a simpler model from the original one using abstraction technique (Clarke et al. 1994), The rest of the paper is organized as follows. We or by using on-the-fly model checking to eliminate give an overview of TLA+ in Section 2. Section 3 part of the search to the automaton representing the presents how we specify the semantics of C in TLA+. (negation of the) checked property (Fernandez et al. Section 4 describes the reduction technique and how 1992). we implement it on C programs. Experimental results are presented in Section 5. We discuss related work 1.1. Contribution in Section 6. Section 7 concludes and illustrates In this paper, we present a state space reduction future research directions. technique for model checking concurrent programs written in a low level language. We apply this 2. OVERVIEW OF TLA technique to the verification of C programs by an explicit model checker. We use TLA+ (Lamport TLA+ (Lamport 2002) is the specification language 1994) as a formal specification language for our of the Temporal Logic of Actions (TLA). TLA is hf ormulai , hpredicatei | [hactioni]hstate f unctioni | ¬hf ormulai | hf ormulai ∧ hf ormulai | hf ormulai hactioni , boolean valued expression containing constant symbols, variables, and primed variables hpredicatei , hf ormulai with no primed variables | ENABLED hactioni hstate f unctioni , nonboolean expression containing constant symbols and variables Figure 1: TLA syntax (Lamport 2002) a variant of linear temporal logic introduced by where: (Lamport 1994) for specifying and reasoning about concurrent systems. Readers interested in a more • Init is a state predicate describing the possible detailed presentation of TLA+ can refer to Lamport’s initial states by assigning values to all system book (Lamport 2002). variables, TLA+ specifies a system by describing its possible • N ext is an action representing the program’s behaviors. A behavior is an infinite sequence next-state relation, of states. A state is an assignment of values • vars is the tuple of all variables, to variables. A state function is a nonboolean expression built from constants, variables and • F airness is an optional formula representing constant operators and it assigns a value to each weak or strong assumptions about the execu- state. For example, y + 3 is a state function that tion of actions. assigns to state s the value 3 plus the value that s assigns to the variable y. An action is a Formula Spec is true of a behavior σ iff Init is boolean expression containing constants, variables true of the first state of σ and every step of σ is and primed variables (adorned with “0 ” operator). either a N ext step or a “stuttering step”, in which Unprimed variables refer to variable values in the none of the specified variables change their values, current state and primed variables refer to their and F airness holds. The behaviors satisfying the values in the next-state. Thus, an action represents specification formula given by Equation (1) are the a relation between old states and new states. A state ones that represent correct behaviors of the system, predicate (or predicate for short) is an action with no where a behavior represents a conceivable history of primed variables. a universe that may contain the system. The syntax of TLA is given in Figure 1 (the symbol The TLA+ formula Spec ⇒ φ is valid when the , means equal by definition). TLA+ formulas are model represented by Spec satisfies the property φ, built up from actions and predicates using boolean or implements the model φ. operators (¬ and ∧ and others that can be derived from these two), quantification over logical variables TLA+ has an explicit model checker called TLC that (∀, ∃), and the unary temporal operator (always) of can be used to check the validity of safety and the linear temporal logic (Manna and Pnueli 1992). liveness properties. TLC handles specifications that have the standard form of the formula (1). For this The predicate “ENABLED A”, where A is an action, is reason, we only use specification formula of the form defined to be true in a state s iff there exists some of Equation (1). TLC requires a configuration file state t such that the pair of states hs, ti satisfies which defines the finite-state instance to analyze. It A. The formula [A]vars , where A is an action and begins by generating all states satisfying the initial vars the tuple of all system variables, is equal to predicate Init. Then, it generates every possible (A ∨ (vars0 = vars)) where vars0 is the expression next-state t such that the pair of states hs, ti satisfies obtained by priming all variables in vars. It asserts N ext and the F airness constraints, looking for a that every step (pair of successive states) is either state where an invariant is violated. Finally, it checks an A step or else leaves the values of all variables temporal properties over the state space. vars unchanged. TLA+ defines the abbreviation “UNCHANGED vars” to denote that vars0 = vars. 3. TRANSLATION FROM C TO TLA+ While TLA+ permits a variety of specification styles, Our approach to checking a concurrent C program is the specification that we use is defined by: to first translate it into a TLA+ specification, to which the TLA+ tools can be applied. In what follows, we Spec , Init ∧ [N ext]vars ∧ F airness (1) briefly present how we specify the semantics of C in 1 int x = 3; 2 int y = 0; 3 4 int inc ( int i ) 5 { 6 int tmp ; 7 tmp = i +1; 8 return tmp ; 9 } 10 11 void p1 () { 12 int a = 1; 13 x = inc ( a ) ; 14 a = x; 15 return ; 16 } 17 18 void p2 () { 19 int a ; 20 x = x - 1; 21 x = 3; 22 y = 0; 23 } (a) Code C source (b) Memory layout Figure 2: Example of a C code in which one process (with id equals to 1) executes p1() function and the second one executes p2(). The top of the stack[1] indicates that process 1 is executing the statement with label 6 of inc() function. U ndef represents an undefined value such as the value of an uninitialized variable. TLA+ by describing the memory layout considered The memory is separated into four areas that do not and how we model the control flow of a C program. overlap: 3.1. Memory Layout • a shared memory called data that stores global (and static) variables. In the example of C file is parsed and normalized according to CIL Figure 2a, the x variable is shared by the two (C Intermediate Language) (Necula et al. 2002) processes. which transforms complicated constructs of C into simpler ones. This transformation makes programs • a local memory for each process, called more amenable to analysis and transformation. stack and stores local variables and function According to the Abstract Syntax Tree (AST) of the parameters. The memory stack[id] specifies C program, C2TLA+ generates automatically a TLA+ the local memory of process id and is specification according to a set of translation rules composed of stack frames. Each stack frame detailed in our previous work (Methni et al. 2015). corresponds to a call to a function. In the example of Figure 2a, stack[1] is composed In C2TLA+, a concurrent program consists of of two stack frames, one of p1() function and many interleaved sequences of operations called one of inc() function. When a function call processes, corresponding to threads in C. Each terminates, its stack frame is removed. process has a unique identifier id. The set of all processes is determined by the TLA+ constant • a local memory for each process called P rocSet. register modeled as a sequence and stores the program counter of each process. The Figure 2 presents a C program and the content of head of this sequence contains the statement the memory as modeled by C2TLA+. We consider being currently executed by the process id. that the C code is executed by two processes. One • a local memory called ret which contains process executes p1() function and the other one values to be returned by processes. executes p2() function. The memory is modeled in TLA+ by a variable called is updated by the label value of the successor memory. It is a record whose four fields represent statement given by the control flow graph (CFG) of the four memory areas. The global memory data the C program (provided by CIL). behaves like an array of values, whereas stack and register behave like a FIFO (First In, First The control flow of the C program in C2TLA+ is Out) queues. Access to those memory areas is ensured by the dispatch() function. For the example addressed using offsets. So, a memory address is of Figure 2a, this function is defined as follows: a couple [loc,of f s] of memory location loc, (data or ∆ stack area) and an offset of f s in this location. For dispatch(id, mem) = instance, Addr x specified in Figure 2b defines the CASE Head(mem.register[id]).pc = ”inc 6” memory address of x variable. → inc 6(id, mem) Head(mem.register[id]).pc = ”inc 7” The main operations that manage the memory are → inc 7(id, mem) load() and store(): Head(mem.register[id]).pc = ”p1 11” → p1 11(id, mem) • load() is the function that given the current Head(mem.register[id]).pc = ”p1 12” state of memory mem and a memory address → p1 12(id, mem) addr (in the form of [loc, of f s]), returns the Head(mem.register[id]).pc = ”p1 13” value stored at the address addr in the memory → p1 13(id, mem) mem, Head(mem.register[id]).pc = ”p1 14” → p1 14(id, mem) • store() is the function that given the current Head(mem.register[id]).pc = ”p2 19” state of memory mem, a memory address → p2 19(id, mem) addr and a value val, returns the new copy of Head(mem.register[id]).pc = ”p2 20” the memory after storing the value val at the → p2 20(id, mem) memory address addr. Head(mem.register[id]).pc = ”p2 21” 3.2. Specifying the C control-flow → p2 21(id, mem) OTHER → mem Each C statement i is identified in C2TLA+ by a label assigned by CIL and is modeled by a TLA+ The dispatch() function calls, according to the function, noted stmti (), which takes as arguments value of the pc field contained at the top the process identifier id and the memory mem, the process register (determined by the expres- and returns the new content of the memory after sion Head(mem.register[id]).pc), the corresponding executing the statement. TLA+ function to execute, i. e., the C instruction to execute. Each stmti () updates the program counter register of the process id and may change the content of The C program is thus simulated by the Spec formula mem, stack, and/or ret memory areas depending given by equation (1). The Init predicate specifies on the type of the statement (assignment, jump the initial values of the memory and the N ext action statements, etc.). For instance, the statement is defined as follows: on line 20 is translated into the TLA+ function ∆ p2 20(id, mem) defined as follows: N ext = ∨ ∃ id ∈ P rocSet : p2 20(id, mem) = ∆ ∧ memory.regsiter[id] 6= hi ∆ LET mcopy = load(id, Addr x, [val 7→ 3]) IN ∧ memory 0 = dispatch(id, mem) [data 7→ mcopy.mem, stack → mcopy.stack, ∨ ∀ id ∈ P rocSet : register 7→ [mem.register EXCEPT ![id] = ∧ memory.regsiter[id] = hi h[pc 7→ h“p2 21”i, f p 7→ Head(mem.register[id]).f p]i ∧ UNCHANGED memory ◦ T ail(mem.register[id]), ret 7→ mem.ret] It states that one of the processes that has not finished execution (its register[id] is not empty) is The definition of p2 20() function uses the TLA+ nondeterministically chosen to execute one action construct LET/IN to define a temporary variable that until all processes finish execution, i. e., all registers stores the value of the memory after affecting the become empty. Executing an action consists in value 3 to the memory address Addr x. The symbol calling dispatch() function. For example, when ◦ defines the concatenation operator for TLA+ Head(mem.register[id]).pc equals to ”inc 6”, calling sequences. Head(s) is a TLA+ function that returns the function inc 6(id, mem) will update the value of the head of the sequence s and T ail(s) returns stack[id] (as tmp is stored in the local memory) as the tail of the sequence s. Then, the register[id] well as the top of register[id]. As the register[id] is Figure 3: Example of a state space still not empty, the control flow is thus passed to the are many different ways to implement concurrency successor statement. synchronization in C. For instance, by using locks and semaphores, or by providing low level hard- The behavior of the C program modeled by the Spec ware instructions (e. g., test-and-set and compare- formula can be given in terms of a state transition and-swap). To support synchronization mechanism, system. generated TLA+ specifications by C2TLA+ can be Definition 1. A state transition system is a 3-tuple completed with manually written TLA+ specifications T = (Q, I, δ) given by to provide concurrency primitives and atomic instruc- tions. More detailed information about integrating • a finite set of states Q, synchronization primitives in TLA+ specifications can • a set I ⊆ Q of initial states, specified by the be found in our previous work (Methni et al. 2015). Init predicate, • a transition relation δ ⊆ Q × Q that links two 4. APPLYING REDUCTION ON C PROGRAMS states. This latter corresponds to satisfying the predicate N ext. The process of generating an optimized TLA+ specification is illustrated in Figure 4. To apply The state transition system encodes the state space reduction on C programs, it is necessary to define of the corresponding TLA+ specification of the C the agglomeration predicate. The C program is program. first analyzed. This analysis phase defines an approximate agglomeration predicate which takes Figure 3 illustrates the state space of the corre- as argument a C statement and returns true or sponding TLA+ specification of the C code given in false depending on whether the statement can be Figure 2a. It consists in all the possible interleaving agglomerated or not. This predicate can be safe or of process execution. In order to simplify, we rep- unsafe. The meaning of safe predicate depends on resent only the content of pc field contained at the how the analysis is performed. top of the register[1] and register[2] memories. Each state of the graph matches a valuation of memory • The predicate is said safe when the analysis variable, i. e., its four fields. is a safe approximation. Its definition is as follows: 3.3. Process Synchronization – if the statement does not modify the All processes interact with each other through the shared memory, the predicate returns shared memory data. Concurrent access to this lat- true, ter is ensured via synchronization mechanism. There – if the statement modifies the shared memory or it is unknown, it returns false. The unknown predicate states that we have no idea if the statement modifies the memory or not. • The predicate is called unsafe when a statement is agglomerated and we are not sure if it modifies the global memory or not. (a) Before reduction (b) After reduction Figure 6: The control flow graph of a C code example 4.2. Using syntactic and semantic analysis Figure 4: Reduction process Syntactic analysis In this Section, we introduce the agglomeration A syntactic analysis is performed to detect state- technique by an example. Then, we describe the ments on which reduction can be applied. Often, C implementation of a safe agglomeration predicate functions make use of local variables, and when a by using a syntactic and semantic analysis on C statement refers only to local variables, the value programs. Then, we show the interest of using for which the statement is executed by a process an unsafe agglomeration predicate to generate an cannot change the execution of other processes. abstract TLA+ specification of a C program. Furthermore, we assume that statements involving local pointer variable cannot be agglomerated as In what follows, we use the expression agglomer- they may reference shared memory. ating TLA+ actions, to designate agglomerating the corresponding statements in the C program. Moreover, we consider that jump statements, namely goto, break and continue, can be agglomerated 4.1. An introducing example with its successor (designated by computing the CFG), as they only change the local register of the As the semantics of a TLA+ is expressed through process (register[id] in TLA+ specification). a state transition system, where transitions between states are ensured by TLA+ actions, the reduction Semantic analysis technique consists in agglomerating consecutive In many scenarios, a concurrent C program could actions into one atomic action which performs the contain, in its global memory, data blocks that are effects of the original ones. The reduction idea based accessed only by one process at a time. In that case, on agglomeration has been widely used in Petri Nets syntactic analysis is insufficient. Therefore, we use (Haddad and Pradat-Peyre 2006; Berthelot 1986). a semantic approximation predicate. The C program is thus analyzed and an approximation of memory Figure 5 shows three consecutive states linked by access is computed using the Mthread Frama-C two actions x0 = x + 1 and x0 = x + 2. The result plugin (Mth). This latter provides information about of this agglomeration (represented by −→), is two the memory zones that are accessed concurrently states linked by one atomic action which is the result by more than one process and those that are not. In of executing the action x0 = x + 1, then x0 = x + 2. this case, the agglomeration predicate is safe as the analysis is based on an over-approximation of the x=2 x=2 memory. 0 x =x+1 We consider the example given by Figure 2a. x=3 → x0 = (x + 1) + 2 To illustrate the agglomeration technique on this example, we represent the C program by its control x0 = x + 2 flow graph, illustrated by Figure 6a, where each x=5 x=5 state of the graph represents a C statement and edges represent the control flow. After applying the Figure 5: Agglomerating actions syntactic and semantic analysis on this example, the control flow graph is transformed into a smaller 1 # define BUFFER_SIZE 5 graph, given in Figure 6b. Each state of the 2 mutex m ; graph corresponds to one statement or a block of 3 sem full = 0 , empty = BUFFER_SIZE ; 4 int buffer [ BUFFER_SIZE ]; /* the buffer */ statements. As the inc() function uses only local 5 int count ; /* buffer count */ memory, its block definition is combined into one 6 basic block. For p1() function, statement on line 7 void Producer ( int item ) { 11 is agglomerated with statement on line 12 and 8 while ( TRUE ) { 9 item = rand () ; /* g e n e r a t e a random number */ statement on line 14 is agglomerated with the return 10 P (& empty ) ; /* acquire the empty lock */ instruction. 11 P (& m ) ; /* acquire the mutex lock */ 12 if ( count < BUFFER_SIZE ) { 4.3. Generating an abstract specification 13 buffer [ count ] = item ; 14 count ++; } 15 V (& m ) ; /* release the mutex lock */ Agglomerating statements can also be useful to 16 V (& full ) ; /* signal full */ generate an abstract TLA+ specification of a C 17 } program. The user can define which C statements 18 } can be agglomerated using an unsafe predicate. The 19 void Consumer ( void ) { 20 while ( TRUE ) { resulting TLA+ specification can be viewed as an 21 int item ; abstract formal specification of the C program. 22 P (& full ) ; /* acquire the full lock */ 23 P (& m ) ; /* aquire the mutex lock */ As TLA+ is a fragment of LTL\x (Linear Temporal 24 if ( count > 0) { Logic without the “next operator”), it is well known 25 item = buffer [( count -1) ]; 26 count - -; } that the equivalence between checking a property 27 V (& m ) ; /* release the mutex lock */ given in LTL\x on an abstract model and checking it 28 V (& empty ) ; /* signal empty */ on the original model is ensured by the preservation 29 } 30 } (Goltz et al. 1992). Example: Agglomerating critical sections Figure 7: Example of a producer/consumer model using Consider the following fragment of C code (Figure 7) locks implementing an example of the producer/consumer model. The two processes share a buffer protected by a mutex m. The synchronization between This property expresses that critical sections cannot processes is ensured by two semaphores empty and be executed simultaneously. This property was full. Mutex and semaphores are implemented as verified on the TLA+ specification after reduction. an integer values and are only accessible through Thus, we can deduce that the property is also two atomic operations P() and V(). verified on the specification generated without the reduction technique. Translating this implementation into a TLA+ spec- ification and model checking results in verifying 4.4. Integrating the reduction into TLA+ all interleavings of actions between processes. We specification define the agglomeration predicate that states that statements protected by mutex (namely by P() and In what follows, we show how we implement V() primitives) can be agglomerated. After reduction, the reduction on TLA+ specification. As described the control flow graph of this example is illustrated in in Section 3, each execution of N ext action Figure 8b. corresponds to executing an atomic C statement. The reduction in C programs, consists in translating Therefore, the block statements from line 12 to line a sequence of C statements into one action instead 14 and that from line 12 to 26 are agglomerated into of multiple ones. Let i be the identifier of a statement one state. The state space of the TLA+ specification and j be the identifier of its successor. To do that, we generated after agglomerations contains fewer generate for each statement i a new function that we states than the one without agglomerations as the call stmt longi () defined below. reduction inside the critical section restricts the ∆ amount of interleaving allowed between processes. stmt longi (id, mem) = We define the mutual exclusion property in TLA+ as stmt longj (id, dispatch(id, mem)) follows: The definition of stmt longi (id, mem) consists in ∆ mut exclusion(lbl1, lbl2) = ((∀ id1, id2 ∈ P rocSet : calling the function of the successor statement j, ∧ (id1 6= id2) ∧ (Head(memory.register[id1]) 6= hi) noted by stmt longj () and passing as argument the ∧(Head(memory.register[id2]) 6= hi) memory state returned by dispatch(id, mem). ∧(Head(memory.regsiter[id1]).pc = lbl1)) ⇒ Head(memory.register[id2]).pc 6= lbl2) (a) Before reduction (b) After reduction Figure 8: The control flow graph of the Producer() and Consumer() functions before and after reduction To generate a reduced TLA+ specification, we iterate over all statements and when the ag- glomeration predicate returns true for a state- ment i, its translation consists in calling the func- tion stmt longj (id, dispatch(id, mem)). Otherwise, ∆ we call dispatch(id, mem) function. dispatch red(id, mem) = CASE Head(mem.st[id]).pc = ”inc 6” The C program is thus specified by the formula → inc long 6(id, mem) Spec given by equation (1), except that the N ext Head(mem.register[id]).pc = ”inc 7” action calls a new function dispatch red(), instead of → inc long 7(id, mem) dispatch() function. For the example of figure 2a, the Head(mem.register[id]).pc = ”p1 11” dispatch red() function is defined in Figure 9: → p1 long 11(id, mem) Head(mem.register[id]).pc = ”p1 12” The dispatch red(id, mem) function calls according → p1 long 12(id, mem) to the value the program counter pc contained at the Head(mem.register[id]).pc = ”p1 13” head of mem.register[id] the corresponding TLA+ → p1 long 13(id, mem) function to execute. Head(mem.register[id]).pc = ”p1 14” → p1 long 14(id, mem) Head(mem.register[id]).pc = ”p2 19” 5. EXPERIMENTS → p2 long 19(id, mem) The reduction technique is totally automatic and was Head(mem.register[id]).pc = ”p2 20” integrated in C2TLA+ which is a Frama-C (Cuoq → p2 long 20(id, mem) et al. 2012) plugin, implemented in OCaml. This Head(mem.register[id]).pc = ”p2 21” Section is concerned with our practical experience. → p2 long 21(id, mem) We use the Mthread plugin results and the syntactic OTHER → mem analysis as described in Section 4 to implement our Figure 9: Example of the dispatch red() function definition agglomeration technique. We consider one sequential C program and four concurrent programs: Table 1: Comparing Model Checking Results with & without Reduction (time in seconds) Without reduction With reduction Prorgam #Proc Factor #St #T #St #T Zunebug 1 389 0.147 2 0.136 99.48 Dekker 2 173 0.128 70 0.109 59.53 2 107 1.37 22 0.131 79.43 Peterson 4 1.080.161 59.2 31.221 4.82 97.10 2 2.389 1.91 223 1.67 90.66 Bakery 4 50.515.927 1560 835.355 76.6 98.36 4 9.791.509 366 146.106 12 98.5 Philos 5 >619.309.984 25340 4.179.520 352 99.32 • Zunebug which is a bug in the internal clock specification without agglomeration, checking this driver of Zune 30GB music player. The source property causes TLC to report an error. This error code is taken from (Weimer et al. 2010). occurs when the code takes as input the last day of a leap year, causing the code to enter into an • Lamport’s Bakery and Peterson algorithms ob- infinite loop. After applying the reduction technique tained from (Raynal 2013) and Dekker mu- for the zunebug program, the state space size of its tual exclusion algorithm presented in (Dijkstra corresponding TLA+ specification equals 2. This is 1968). due to the fact that the program is sequential. Model • Dining philosopher problem. We use the checking the TLA+ code with the last day of a leap solution that appears in Tanenbaum’s book year causes the TLC model to report an incorrect (Tanenbaum 2007). recursive function definition. These programs make typical examples for demon- For the concurrent programs, the mutual exclusion strating the strength of the state space reduction. property has been successfully verified on Peterson, C2LTA+ takes as input a C program and generates Bakery, Dekker and Philosophers benchmarks. As for each one the corresponding TLA+ specification. expected, the size of the state space with agglom- erations is always smaller than the one without Using the TLC model checker, we compute the total agglomerations. For the philosopher example with number of generated states and we verify a set of 5 processes, the state space without agglomeration properties on the two specifications. takes more than 7 hours to be model checked. However, using the reduction technique the specifi- Results of experiments are shown in Table 1, where cation is verified in 6 minutes. The reduction factor #Proc denotes the process number, #St denotes the in this cas reaches 99.32. The reduction technique numbers of states and #T denotes the time for model obtains good results on these benchmarks due to the checking in seconds. Columns 3 to 6 give information elimination of some intermediate states. about the state space generated with and without applying the reduction technique. The last column indicates the reduction factor, the ratio between the 6. RELATED WORK state space generated without reduction and the one There are a wealth of research contributions on after applying the reduction technique. formal verification of software as well as techniques All experimental results were performed on an for the reduction of the state space. Intel Core Pentium i7-2760QM machine with 8 Program slicing is a technique introduced by cores (2.40GHz each), with 8Gb of RAM memory. (Weiser 1981) for simplifying sequential programs for For zunebug, one property to verify is program debugging and program understanding. It consists termination, which is a liveness property that we in removing from the program features that are express as follows: irrelevant for the property to be verified. Recently, ∆ termination = (Head(memory.register[1]) = hi) slicing technique has been used to reduce the state space of a system in model checking. It has been This property asserts that the register of the applied to Promela (Millett and Teitelbaum 2000), program will eventually be empty. For the TLA+ the input language for the Spin model checker (Holzmann 1997). The interested reader can refer to C2TLA+ to translate the semantics of C to the (Tip 1995) for a detailed description of the different formal specification language TLA+. This reduction approaches used in the program slicing. technique is based on an analysis phase, which defines an approximate agglomeration predicate that Predicate abstraction (Graf and Saı̈di 1997) is a states whether a statement can be agglomerated technique in which a set of predicates over the or not. We implemented this predicate by applying programs variables is used to construct an abstract a syntactic and semantic analysis on C Programs. program. This technique is being used in SLAM (Ball We illustrated the effectiveness of applying the and Rajamani 2002), BLAST (Henzinger et al. 2003) agglomeration technique to reduce the state space and MAGIC (Chaki et al. 2004). during the verification of C programs and also as well as to define an abstract TLA+ specification that Other approaches perform reduction during explo- model the behavior of C programs. ration of the state space of the program. For exam- ple, partial order reduction (Valmari 1989) is a tech- We aim to integrate a mechanism for structuring nique which explores only a representative subset large TLA+ specifications from C programs using of the state space of a model. The basic idea is to a refinement process between different levels of exploit the commutativity caused by the interleav- abstraction. Finally, we are planning to apply the ings of transitions, which result in the same state. methodology on a critical part of the microkernel of This technique was first introduced for checking the the PharOS (Lemerre et al. 2011) real-time operating absence of deadlock. Subsequently, a number of system (RTOS). variants of this technique have been developed and integrated in verification tools, like Spin (Holzmann 1997) and Verisoft (Godefroid 1997). REFERENCES Although we have mentioned some projects in the Mthread plugin. URL http://frama-c.com/ C context, there are also significant works interested mthread.html. in model checking the Java language. For example, Thomas Ball and Sriram K. Rajamani. The SLAM JPF (Visser et al. 2003) uses state compression project: Debugging System Software via Static technique to handle big states, partial order and Analysis. SIGPLAN Not, 2002. symmetry reduction, slicing, abstraction and runtime analysis techniques to reduce the state space. Gérard Berthelot. Checking properties of nets using transformation. In Advances in Petri Nets In this work, the state space reduction technique that 1985, Covers the 6th European Workshop on we propose is closer to that originally introduced by Applications and Theory in Petri Nets-selected (Berthelot 1986) in Petri nets formalism. Berthelot Papers, pages 19–40, London, UK, UK, 1986. developed a large set of reduction rules for reducing Springer-Verlag. the complexity of verification. Extended work has been proposed by (Haddad and Pradat-Peyre 2006). Sagar Chaki, Edmund M. Clarke, Alex Groce, Our work differs from this latter by the fact that the Somesh Jha, and Helmut Veith. Modular model of our TLA+ specification is a state transition verification of software components in c. IEEE system and the agglomeration predicate depends Trans. Software Eng., 30(6):388–402, 2004. on the analysis of the C program. Our reduction technique is applied during the generation of TLA+ Edmund M. Clarke, Orna Grumberg, and David E. code unlike the partial order reduction technique Long. Model Checking and Abstraction. ACM which performs reduction during the construction of Trans. Program. Lang. Syst., 16(5):1512–1542, the state space. Besides, we use TLA+ as formal 1994. framework which provides an expressive power to Edmund M. Clarke, Jr., Orna Grumberg, and specify the semantics of a programming language Doron A. Peled. Model Checking. MIT Press, and can reason about concurrent systems and Cambridge, MA, USA, 1999. can express safety and liveness properties unlike SLAM and BLAST which have limited support for Pascal Cuoq, Florent Kirchner, Nikolai Kosmatov, concurrent properties as they only check safety Virgile Prevosto, Julien Signoles, and Boris properties. Yakobowski. Frama-C: A Software Analysis Perspective. In Proceedings of the 10th international conference on Software Engineering 7. CONCLUSION AND FUTURE WORK and Formal Methods, SEFM’12, pages 233–247, Berlin, Heidelberg, 2012. Springer-Verlag. URL We have proposed a technique to reduce the state http://frama-c.com/. space for model checking C programs. We used Edsger W. Dijkstra. Cooperating sequential TLA+. In Formal Techniques for Safety-Critical processes. In F. Genuys, editor, Programming Systems, volume 476 of Communications in Com- Languages: NATO Advanced Study Institute, puter and Information Science, pages 206–222. pages 43–112. Academic Press, 1968. Springer, 2015. Jean-Claude Fernandez, Laurent Mounier, Claude Lynette I. Millett and Tim Teitelbaum. Issues Jard, and Thierry Jron. On-the-fly verification in Slicing PROMELA and its Applications to of finite transition systems. Formal Methods in Model Checking, Protocol Understanding, and System Design, 1(2-3):251–273, 1992. Simulation. International Journal on Software Tools for Technology Transfer, 2(4):343–349, Patrice Godefroid. Model Checking for Programming 2000. Languages using VeriSoft. In In Proceeedings of the 24th ACM Symposium on Principles of George C. Necula, Scott Mcpeak, Shree P. Rahul, Programming Languages, pages 174–186. ACM and Westley Weimer. CIL: Intermediate Language Press, 1997. and Tools for Analysis and Transformation of C Programs. In International Conference on Ursula Goltz, Ruurd Kuiper, and Wojciech Penczek. Compiler Construction, pages 213–228, 2002. Propositional Temporal Logics and Equivalences. In W.R. Cleaveland, editor, CONCUR ’92, volume Michel Raynal. Concurrent Programming: Algo- 630 of Lecture Notes in Computer Science, pages rithms, Principles, and Foundations. Springer, 222–236. Springer Berlin Heidelberg, 1992. Heidelberg, 2013. Susanne Graf and Hassen Saı̈di. Construction of Andrew S. Tanenbaum. Modern Operating Systems. Abstract State Graphs with PVS. In Proceedings Prentice Hall Press, Upper Saddle River, NJ, USA, of the 9th International Conference on Computer 3rd edition, 2007. Aided Verification, CAV ’97, pages 72–83, London, UK, UK, 1997. Springer-Verlag. Frank Tip. A Survey of Program Slicing Techniques. Journal of Programming Languages, 3:121–189, Serge Haddad and Jean-François Pradat-Peyre. 1995. New Efficient Petri Nets Reductions for Parallel Programs Verification. Parallel Processing Letters, Antti Valmari. Stubborn Sets for Reduced State 16(1):101–116, 2006. Space Generation. In Proceedings of the Tenth International Conference on Application and Thomas A. Henzinger, Ranjit Jhala, Rupak Majum- Theory of Petri Nets, pages 1–22, 1989. dar, and Gregoire Sutre. Software Verification with BLAST. pages 235–239. Springer, 2003. Willem Visser, Klaus Havelund, Guillaume Brat, Seungjoon Park, and Flavio Lerda. Model Gerard J. Holzmann. The Model Checker SPIN. Checking Programs. Automated Software Engg., IEEE Trans. Software Eng., 23(5):279–295, 1997. 10(2):203–232, April 2003. Leslie Lamport. The Temporal Logic of Actions. Westley Weimer, Stephanie Forrest, Claire ACM Trans. Program. Lang. Syst., 16(3):872–923, Le Goues, and ThanhVu Nguyen. Automatic 1994. Program Repair with Evolutionary Computation. Commun. ACM, 53(5):109–116, May 2010. Leslie Lamport. Specifying Systems, The TLA+ Language and Tools for Hardware and Software Mark Weiser. Program Slicing. In Proceedings of Engineers. Addison-Wesley, 2002. the 5th International Conference on Software En- gineering, ICSE ’81, pages 439–449, Piscataway, M. Lemerre, E. Ohayon, D. Chabrol, M. Jan, and NJ, USA, 1981. IEEE Press. M-B. Jacques. Method and Tools for Mixed- Criticality Real-Time Applications within PharOS. In Proceedings of AMICS 2011: 1st International Workshop on Architectures and Applications for Mixed-Criticality Systems, 2011. Zohar Manna and Amir Pnueli. The Temporal Logic of Reactive and Concurrent Systems. Springer- Verlag New York, Inc., New York, NY, USA, 1992. Amira Methni, Matthieu Lemerre, Belgacem Ben He- dia, Serge Haddad, and Kamel Barkaoui. Spec- ifying and Verifying Concurrent C Programs with