Model Stack for the Pervasive Verification of a Microkernel-based Operating System? Matthias Daum, Jan Dörrenbächer, and Sebastian Bogan Saarland University, Computer Science Dept. 66123 Saarbrücken, Germany {md11,jandb,sebastian}@wjpserver.cs.uni-sb.de Abstract. Operating-system verification gains increasing research interest. The com- plexity of such systems is, however, challenging and many endeavors are limited in some respect: Some projects focus on a particular aspect like memory safety, not pursuing functional correctness. Others restrict their verification efforts to a single layer of soft- ware, assuming correctness of those below. Only few projects aim at pervasive formal verification of a computer system over several software layers. In our paper, we present an approach to the formal specification of a microkernel-based operating system at several layers and glance on our verification experience with this model stack. From our experience, we conclude that pervasiveness entails more than just cumulative verification efforts on several layers. In fact, it is a challenging task to integrate models and proofs into a uniform, coherent theory. 1 Introduction Software-verification tools have greatly improved in recent years. As a conse- quence, the verification of low-level software gains increasing research interest as a touch-stone for the industrial applicability of software verification in gen- eral. While many aspects of operating-system verification are currently studied, they are not integrated into a uniform, coherent theory. In our paper, we concen- trate on the integration of several operating-system models in order to form a pervasive verification stack. We point out the integration problems and describe the specific design decisions that we made in order to support integration. A characteristic problem of operating-system software is that hardware com- ponents become visible and the program functionality cannot be expressed in the pure semantics of a high-level language like C. Usually, this functionality is implemented in assembly and encapsulated in a small number of C functions, or primitives. Thus, only a small part of low-level code is indeed implemented in assembly while larger code portions are written in C and just use these primi- tives. From a model-theoretic point of view, such primitives extend the original C semantics. Another important problem related to operating systems regards the atomicity of operations in a concurrent setting. The processor might be in- terrupted in its current computation after each assembly instruction but a single C statement is usually compiled into several instructions. ? Work partially funded by the German Federal Ministry of Education and Research (BMBF) in the framework of the Verisoft project under grant 01 IS C38. Model Stack for the Pervasive Verification of a Microkernel-based Operating System 57 Consequently, we need an assembly semantics, a C semantics and a sim- ulation theorem between both semantics. Leinenbach and Petrova [1] provide exactly that: a proven correct compiler that translates programs from the per- fectly type-safe C variant C0 into the assembly language of our risc processor. An important feature of C0 is the inline-assembly statement, which permits the language extensions via primitives. Using this mechanism, we have imple- mented our operating system in C0. The correct C0 compiler is an elementary prerequisite for our model stack. Outline. In Sect. 2, we explain the fundamentals of our operating system. This section comprises a sketch of the correct compiler, a survey of the features of our microkernel, and an outline of the system’s implementation design. With these prerequisites in place, we give an overview of our model stack in Sect. 3. Certainly, we cannot descend into all details of the stack in this paper.1 Hence, we confine ourselves to aspects of encapsulation and state partitioning. In Sect. 4, we describe our formal specification of the generic interface between the user processes and the kernel. Moreover, we sketch our formal model for assembly and C0 processes. This process abstraction is an elementary prerequisite for the model of communicating user processes as well as for the top-level specification of the operating system. We illustrate in Sect. 5, why we use the process abstraction already in the Vamos specification and how we have constructed the latter model in order to support the abstraction towards Coup and the top-level specification. We conclude in Sect. 6. 2 Background 2.1 A Correct Compiler for an Extensible Language In this section, we summarize work of Leinenbach and Petrova [1,2]. They provide a formally verified compiler that translates programs from the perfectly type- safe C variant C0 into the assembly language of a risc processor. Restrictions of C0 in comparison to C are that expressions must be side-effect-free, all type conversions have to be made explicitly, and there is no pointer arithmetic. In spite of that, C0 still features dynamic memory allocation and inlined assembly. The inline-assembly statement is an important feature for language exten- sions. Though the effect of this statement cannot be expressed in the formal C0 semantics, the compiler literally embeds the given assembly code into the compilation. Using this hook, we can implement C0 functions that use inline assembly in order to extend C0 by a functionality that cannot be expressed by other C0 statements. 1 The complete Isabelle/HOL theories of the models as well as detailed documentation are made available at the Verisoft repository, see 58 Matthias Daum, Jan Dörrenbächer, Sebastian Bogan For C0 and for the assembly language, two small-step transition semantics have been formalized in Isabelle/HOL. Leinenbach has formally proven compiler correctness by a stepwise simulation between both semantics. Below, we describe the semantics in detail. We often deal with structured values, which we define by enumerating the components in prose, e. g., “a value x consisting of two components this and that”. We refer to a single component with a dot, e. g., x.this refers to component this of value x. An update of this component is denoted by x [this := q]. C0 Semantics. For lack of space, we can only glance at the C0 semantics. C0 programs are statically represented by the program environment Γ , which comprises a symbol table of global variables, a type-name environment, and a function table. The dynamically changing state sC0 of a C0 program in execution is composed of – the remaining program sC0 .prog, and – the current state of the program variables sC0 .mem. In the following sections, we assume an evaluation function get-val for the look- up, and an update function set-val for the manipulation of a certain variable in the memory. We refer to the value of expression e in state sC0 by get-val (sC0 , e). If we update the left-value l in state sC0 with some expression u, we denote the resulting configuration by set-val (sC0 , l, u). seq The transition relation δC0 of this semantics is a partial function. The Target Assembly Language. The assembly semantics was developed for the risc processor Vamp [3]. The assembly semantics abstracts from the paging mechanism of the processor and employs a linear memory model. An assembly state sasm consists of the following components: – the normal and the delayed program counters, sasm .pc and sasm .dpc, respec- tively, implementing the delayed branch mechanism. – the general-purpose register file sasm .gpr ∈ {0, . . . , 31} → {0, 1}32 . – the memory size sasm .V measured in pages of 4096 bytes. It defines the set of available memory addresses: VA(sasm ) = {a | a < sasm .V · 4096} – the byte-addressable linear memory sasm .vm ∈ VA(sasm ) → {0, 1}8 We denote the state space of the assembly semantics by Sasm . Assembly seq computations are modeled by the partial function δasm ∈ Sasm * Sasm . Note that the effects of exceptions like illegal page faults cannot be fully determined seq from the assembly-machine state. In that case, δasm gets stuck. But with sufficient resources, a compiled C0 program does not generate exceptions during normal execution. Moreover, the memory size sasm .V can neither be read nor changed by the assembly machine itself but depends on the operating-system kernel. We extend the semantics accordingly in Sect. 4. Model Stack for the Pervasive Verification of a Microkernel-based Operating System 59 2.2 Fundamentals of our Microkernel In this section, we sketch the features of Vamos and explain the fundamental access-control mechanism that establishes the process roles “privileged process” and “device driver”. Our microkernel Vamos performs the following tasks: (a) enforcement of a minimal access control, (b) process management, (c) memory management, (d) priority-based round-robin scheduling, (e) support for user-mode device dri- vers, and (f) inter-process communication (IPC). Processes can control these tasks via the kernel’s application binary interface (ABI). Table 1 lists the kernel calls that constitute the ABI. Most kernel calls are reserved for so-called privileged processes. Thus, only a privileged process can bring up new processes or kill existing ones, alter the memory consumption of processes, change their scheduling parameters, or con- trol the registration of device drivers. Any process, however, might use the IPC mechanism. Thus, the privileging serves as a minimal access control. We pre- sume that the privileged processes constitute the user-mode parts of the op- erating system and implement a more sophisticated access-control mechanism. Non-privileged processes may then communicate with the privileged processes in order to request kernel services on their behalf. Table 1. Application binary interface of the Vamos kernel Kernel Call Description Access Control set privileged p add a process to the set of privileged processes Process Management process create p create a new process from a memory image process clone p copy an already existing process process kill p kill a process Memory Management memory add p increase the amount of virtual memory for a process memory free p decrease the amount of virtual memory for a process Scheduling Mechanism chg sched params p change scheduling parameters Device Driver Support change driver p (un)register a process as a driver for a set of devices enable interrupts d re-enable a set of interrupts after their successful handling dev read d / dev write d communicate with a certain device Inter-Process Communication ipc send send a message to another process ipc receive receive a message from another process ipc request send a message and immediately wait for a reply change rights manipulate IPC rights read kernel info receive information from the kernel p d call is reserved for privileged processes call is reserved for device drivers 60 Matthias Daum, Jan Dörrenbächer, Sebastian Bogan When Vamos boots, it launches one single process, the init process. This process is privileged and has to set up the required servers of the operating system, start and register the device drivers, and possibly run initial applications. A device driver is a user process, which is designated for the communication with certain devices. Only if a process is registered as a driver for a particular device, it may place read or write requests from or to that device, respectively. Moreover, the device driver is notified of interrupts from that device. 2.3 Implementation Design Fig. 1 depicts the implementation structure of Vamos. The lowest software layer is called communicating virtual machines (CVM). This layer encapsulates all hardware-specific low-level functionality, which is possibly using inline assembly. CVM has two major tasks: memory virtualization and switching between differ- ent threads of execution. Hence, CVM includes a page-fault handler with a sim- ple memory swapping facility [4]. Moreover, it exports an interface of so-called primitives for the access and manipulation of user machines to the hardware-in- dependent part. We have thereby established a solid framework for microkernel construction. Using this framework, we have implemented our microkernel Vamos in C0 without extra portions of inline assembly. On every kernel entry, CVM preserves the old context, establishes a suitable C0 environment and calls the function kdispatch of Vamos. For the manipulation of the user memory or registers, Va- mos may call the primitives of CVM. The return value of kdispatch determines, which user machine resumes when the kernel execution finishes. mode User App SOS App SOS (C0+asm)? kcalls kcalls Coup (C0+asm)? Vamos System mode kdispatch primitives Vamos asm? CVM Hardware CVM C0 asm? Fig. 1. Implementation scheme Fig. 2. Schematics of our model stack Model Stack for the Pervasive Verification of a Microkernel-based Operating System 61 While CVM and Vamos run in the privileged system mode of the processor, user machines run in the unprivileged user mode. In the figure, we labelled one user machine “SOS” for simple operating system and the others “App” as abbreviation for application. The SOS constitutes the highest layer of our operating system. It features an advanced rights management with different users, implements a sophisticated access control to kernel services like process creation and provides further services like file-system and network access. All user machines interact with the kernel via kernel calls. The special instruction trap causes an exception, which is handled in Vamos. Vamos can examine and alter the state of the user machine using CVM primitives, thus identifying the user machine’s specific request and storing the kernel’s corresponding response. 3 Model-Stack Overview In analogy to the software stack, we have developed a model stack. For each software layer, this stack contains a specification that fully describes the observ- able behavior of this layer. The specification describes the full functionality of the software layer, on the one hand, and forms the foundation for the simula- tion proofs for the more abstract layers, on the other hand. We build the model stack as a means of verifying statements about the real system on a convenient, abstract layer. Such statements include safety and liveness properties. Fig. 2 shows our model stack. The lowest model describes CVM. This model comprises a kernel machine, given in the C0 semantics, and a number of user machines with virtual memory, given in the assembly semantics. The kernel runs in system mode with disabled interrupts. The user machines, in contrast, may be interrupted after each assembly instruction, thus we model them on assembly level. We will not descend into the details of this model in the present publication because it has been described earlier [5]. We can instantiate the kernel of CVM with the Vamos implementation, i. e., the C0 program that implements (the hardware-independent part of) our microkernel is executed as CVM’s kernel machine. The Vamos model specifies the behaviour of the resulting system. In short, this model establishes assembly processes that communicate with the kernel via a well-defined interface, the kernel ABI. A code-correctness proof for the Vamos implementation establishes the simulation theorem between the instantiated CVM model and the Vamos specification. While the Vamos processes use the assembly semantics, the SOS and the applications are written in C0. The kernel ABI is encapsulated in a few library functions. Except for the library functions, the verification of these programs should certainly employ the more abstract C0 semantics. For this purpose, we designed the model communicating user processes (Coup). This model simu- lates Vamos but it abstracts from the scheduler and establishes C0 processes. 62 Matthias Daum, Jan Dörrenbächer, Sebastian Bogan We need to abstract from the particular Vamos scheduler because of the dif- ferent granularity of the program semantics: The execution of a process might be suspended and resumed on every assembly instruction but a C0 statement usually contains several assembly instructions. The simulation proof has three parts: (1) The execution traces of Coup should contain all traces of Vamos. (2) There is a confluent reordering of any execution trace such that there is no scheduling event during a single C0 statement. (3) The kernel library is correctly implemented. We can instantiate one of the C0 processes with the implementation of SOS. The SOS model specifies the behaviour of the resulting system. The simulation between both models can be shown by the functional correctness of the SOS implementation. 4 Process Abstraction In this section, we formalize the interface between the processes and the kernel. Both kernel models, the Vamos specification and the Coup model, use this process abstraction in order to access and manipulate processes. Our abstraction is based on the observation that Vamos interacts with processes only via a well-defined interface, the kernel ABI. Hence, we can encapsulate processes in a self-contained input-output automaton, thereby hiding the internal state and just exposing the generic interface. We define a process as input-output automaton described by a tuple (Sproc , Σproc , Ωproc , ωproc , vm-size proc , init proc , δproc ) with state space Sproc , input alphabet Σproc , output alphabet Ωproc , output func- tions ωproc and vm-size proc , initialization function init proc , and transition function δproc . While the state space Sproc depends on the individual process abstraction, the interface between the kernel and the processes is naturally shared by all process abstractions. This interface is entirely defined by Σproc and Ωproc . The output alphabet Ωproc enumerates all possible kernel calls. Additionally, we have to treat a few error cases. As the kernel calls are internally identified by a number, a process might specify an invalid number. This condition is represented by the special output value undefined trap. Moreover, a process might generate exceptions like an arithmetic overflow or an illegal page fault. These exceptions are collectively represented as value runtime error. Finally, the output ε denotes the intention to perform a local computation. The input alphabet Σproc reflects all kernel-initiated changes of a process. These comprise all possible responses to kernel calls, on the one hand, and the demand to change the amount of virtual memory, on the other hand. While the Model Stack for the Pervasive Verification of a Microkernel-based Operating System 63 former are the synchronous reaction to a kernel call, the latter may be issued asynchronously at any stage of a process. In order to perform a local transition, we pass the input ε to the transition function δproc . In addition to the usual functions δproc ∈ Σproc × Sproc → Sproc for transi- tions and ωproc ∈ Sproc → Ωproc for the output, we use two more functions in our process abstraction. In order to compute the overall memory consumption of the process system, Vamos needs to know the amount of virtual memory that is currently occupied by every process. The function vm-size proc ∈ Sproc → N provides the necessary information. When a new process is created, Vamos has to translate a representation of a given binary executable file into the cor- responding, initial process state. We specify this translation in the function init proc ∈ N? → Sproc . For the kernel specification, the generic abstraction of processes as self- contained input-output automata is rather a matter of taste than necessity. However, this generic process abstraction is a cornerstone of the Coup model. Moreover, it is crucial for the simulation theorem between Vamos and Coup be- cause both models mainly differ in the process abstraction. Hence, it proved to be beneficial that many definitions of the specification are parametrized over the process abstraction and can thus be reused in Coup as well. Finally, the intro- duction of this parameter on the whole Coup model is just a logical consequence because it paves the way for the integration of other language semantics. In the following, we refine the generic abstraction with specific process models for assembly and C0. Assembly Processes. We reuse the state space Sasm of the assembly seman- tics for our assembly processes. Based on this state space, we now define the output functions ωasm and vm-size asm , the transition function δasm , and the ini- tialization function init asm . The function vm-size asm simply returns the value of the component V of the current state: vm-size asm (sasm ) = sasm .V . However, the definitions for other functions are too complex to fully present them here. We constrain ourselves to an exemplary excerpt. Fig. 3 depicts the formal definition of the output function ωasm and the tran- sition function δasm for the call process clone. We assume that sasm is the state of an assembly process. The predicate trap holds iff the current instruction is a trap, and the function simm extracts the sign-extended immediate constant from the current instruction. If there is a trap with immediate constant 2, the output function will return the pair of value process clone and the content of register 11. Let us now assume that the kernel recognizes this output from the current process but the process is not privileged. In this case, the kernel signals this error condition by passing value err unprivileged via the transition function to the current process. Then, the transition function updates the register 22 with the corresponding error code and increases the program counters. 64 Matthias Daum, Jan Dörrenbächer, Sebastian Bogan trap(sasm ) ∧ simm(sasm ) = 2 =⇒ ωasm (sasm ) = (process clone, sasm .gpr (11)) 2 3 gpr (22) := −4 δasm (err unprivileged, sasm ) = sasm 4 pc := sasm .pc + 4 5 dpc := sasm .dpc + 4 Fig. 3. Formal definition of the output function and the transition function of assembly processes for the call process clone C0 Processes. In order to represent C0 processes, we define an automaton with the following signature: (SC0 , Σproc , Ωproc , ωC0 , vm-size C0 , init C0 , δC0 ) Any state sC0 ∈ SC0 comprises the static program environment as well as the dynamic program state of the C0 machine as described in Sect. 2.1. We store the program environment because processes might dynamically be created and killed, hence the program might change. The output functions ωC0 and vm-size C0 , the transition function δC0 , and the initialization function init C0 are defined in analogy to their assembly-process counterparts. For example, the formal definition of the output function and the transition function for the call process clone is given in Fig. 4. We assume that sC0 is the current state of a C0 process. The component sC0 .prog denotes the remaining program of the process. We consider the first statement of the program. If this statement is a call to the function vc process clone, we define the output of ωC0 as the pair of process clone and the value of the first argument to the function call. Let us now assume that the kernel recognizes this output from the current process, clones the given process, and computes the value hn new as new identifier for the clone. It would then pass this value via the transition function to the current process, thus signalling the success of the clone operation. In this case, the transition function updates the memory sC0 .mem at the address designated by the left-value e with value hn new and removes the function call from the remaining program. 5 Formal Kernel Models In the previous section, we described our process model. Now, we embed the processes into the two concurrent kernel models, the Vamos specification, and the Coup model. The former specifies the exact behaviour of our microkernel with a particular scheduler. This model is used for code verification. The latter abstracts the scheduler and focusses on the interaction of the processes with the Model Stack for the Pervasive Verification of a Microkernel-based Operating System 65 sC0 .prog = ”e = vc process clone(e0 ); r” =⇒ ωC0 (sC0 ) = (process clone, get-val (sC0 .mem, e0 )) sC0 .prog = ”e = vc process clone(e0 ); r” » – mem := set-val (sC0 .mem, e, hn new ) =⇒ δC0 ((succ new process, hn new ), sC0 ) = sC0 prog := r Fig. 4. Formal definition of the output function and the transition function of C0 processes for the call process clone microkernel.We need this more abstract model in order to describe the reordering of interleaved sequences and to introduce C0 processes. Both models are Moore machines. We describe the kernel specification with the tuple (Sv , s0v , Σ̂, Ω̂, ωv , δv ) and the Coup model with (Svc , s0vc , Σ̂, Ω̂, ωvc , δvc ). The state spaces Sv , Svc contain the initial states s0v and s0vc , respectively. For device communication, we use the input alphabet Σ̂ and the output alphabet Ω̂. The functions ωv and ωvc determine the output in a current state. Finally, δv and δvc describe the transitions of the models. The notable difference between these machines regards the determinism: While the Vamos specification is fully deterministic, Coup features a non-determinism in its scheduling decisions. Con- sequently, the transition relation δv is functional while δvc is not. Below, we introduce the different components of the models side by side. Device Communication. Our kernel uses memory-mapped I/O for device com- munication. Hence, the output alphabet Ω̂ comprises read and write accesses to device addresses. The input alphabet Σ̂ consists of interrupt lines and optionally incoming data. Hillebrand et al. [6] have described our device interface in detail. State Spaces. A state sv ∈ Sv comprises the following components: – The partial function sv .procs maps the process identifiers (PIDs) of the cur- rently active processes to their assembly states sasm ∈ Sasm . For inactive processes, this function is undefined. – Priorities are assigned to each PID of the active processes with the partial function sv .priodb. – All other scheduling information is kept in the component sv .schedds. – The partial function sv .rightsdb maps PIDs to a data structure for the man- agement of IPC rights and the set of privileged processes. – Finally, the component sv .devds contains data for device communication. At first sight, the separation of priorities from the remaining scheduling data might surprise, here. The reason for this unintuitive partitioning is that Coup abstracts from the particular scheduling algorithm but the priorities remain visible. We modularized the states and the corresponding transition functions in 66 Matthias Daum, Jan Dörrenbächer, Sebastian Bogan order to share the definition of similar parts between the Vamos specification, Coup, and the SOS model. A central part of the Vamos specification are the scheduling data structures. The component sv .schedds is divided into sub components. The current time time ∈ N is a counter for clock ticks. Process-specific scheduling information for active processes is collected in the partial function procdb that maps PIDs to a record of (a) the time slice tsl , (b) the amount of consumed time ctsl , and (c) the absolute timeout to. If a process is found to be computing when a timer interrupt raises, the component ctsl is increased until the process has finally run for tsl ticks. In this case, another process is scheduled. If a process calls the kernel for IPC and no partner is ready for communication, the absolute timeout to is computed from the current time and the relative timeout that has been specified with the call. Moreover, the scheduler maintains different queues for scheduling. They are represented as finite sequences in the Vamos specification. Namely, there is a ready queue ready(prio) of schedulable processes for each priority prio ∈ {0, 1, 2}. The processes that cannot currently be scheduled (because they are waiting for an IPC partner) are held in a queue named wait. In Vamos, the current process is the first process in the highest, non-empty ready queue. If all ready queues are empty, the current process is undefined. Formally, we define the function cup as: cup(sv .schedds) = p ⇐⇒ ∃i : sv .schedds.ready(i) = (p, . . . ) ∧ ∀ j > i : sv .schedds.ready(j) = () The corresponding Coup state svc inherits most components of sv . Only two components change: The process abstraction becomes a model parameter, i. e., component svc .procs of a Coup state svc is a partial function from PIDs to a generic state space Sproc for processes. Moreover, the scheduling data structures are replaced by a current-process indicator. We retain the current process in the state in order to compute the output from the current state. The output function ωvc signals the demand for device communication. In order to determine this demand, we need to employ the output function ωproc of the current process. Consequently, we fix this process beforehand instead of including transitions for all ready processes in the transition relation. Transitions. A transition δv (σ̂, sv ) of the Vamos specification under the device input σ̂ ∈ Σ̂ has up to three phases: 1. If the current process cp = cup(sv .schedds) is defined, we consult its out- put ωproc (sv .procs(cp)) and compute the response according to the current Vamos state. For instance, if a process calls process clone, we check for sufficient privileges and resources and choose the corresponding response Model Stack for the Pervasive Verification of a Microkernel-based Operating System 67 σ ∈ Σproc for success or failure. With this response, we advance the cur- rent process: sv [procs(cp) := δproc (σ, sv .procs(cp))]. 2. If the timer-interrupt line is raised, the scheduler increases the clock-tick counter sv .schedds.time and the consumed time sv .schedds.procdb(cp).ctsl of the current process. Moreover, the scheduler wakes up all processes p with elapsed timeouts. 3. Finally, Vamos delivers interrupts to waiting drivers and saves the remaining interrupts for later delivery in sv .devds. σ The Coup transitions behave very similarly. However, a transition svc ←→ 0 svc ∈ δvc , obtains cp directly from svc .cup. Moreover, the only visible effect of the second phase is the wake-up of certain waiting processes. This effect is simulated non-deterministically and independently from the timer interrupt. Reusability. As mentioned earlier, we worked hard to reuse as much definitions as possible in the different layers of abstraction. For example, in contrast to the Coup model, the kernel specification does not actually rely on the encapsulation of processes as self-contained input-output automata. Since Vamos only con- siders assembly processes, it would be easier to directly manipulate the state of these processes. However, in order to literally reuse update functions and def- initions, we invested the extra work and already introduced the input-output automata at this level of our model stack. Consider the formal definition of the function v clone updt procs in Fig. 5. This function describes the necessary updates in case of a successful call to process clone. Without going into detail, depending on the relation between the calling process (psubj ) and the process that was cloned (pobj ), this func- tion updates the state of the calling process, the state of the cloned process, and the state of the new process. The important thing about this definition is that the updates are presented in a process-abstraction independent man- ner. Rather than updating particular registers, as it would suffice for assem- bly processes, we feed the generic transition function δproc with generic input, e.g. (succ new process, hn new ). Now, this generic interface between kernel and processes allows us to use v clone updt procs in both models, i.e. Vamos and Coup. Proceeding in a similar way with all kernel calls keeps the models clear and maintainable. Furthermore, tedious equivalence proofs between Vamos and Coup are avoided. 6 Conclusion Related Work. With the CLI stack [7], a pioneering approach started out for pervasive verification of a complete system with rigorous formalizations of specifications and proofs. Most notably, the simple kernel KIT [8] was developed 68 Matthias Daum, Jan Dörrenbächer, Sebastian Bogan v clone updt procs(procs, psubj , pobj , pnew , hn new ) = λx. if x = psubj then δproc ((succ new process, hn new ), procs(psubj )) else if x = pnew ∧ psubj = pobj then δproc ((succ new process, 0), procs(psubj )) else if x = pnew then procs(pobj ) else procs(x) Fig. 5. Formal definition of the function v clone updt procs that computes the component procs after a call to process clone and its machine code implementation was proven to be correct. However, this kernel is fairly simple compared to modern microkernels. Many recent projects undertake verification efforts on modern microkernels. Among them are L4.verified [9], VFiasco [10], Eros [11], and the Flint project [12]. Though these projects may have achieved some advances, they all discard pervasiveness but focus on the verification of a single software layer. The Flint project has developed a verification framework for an assembly language and has formally proven the correctness for context-switching code. The other three projects have established semantics for C variants and have verified different properties on source code level. As far as we can see, inlined assembly portions are just postulated to be correct and solely described by their semantical effects. Moreover, these projects rely on compiler correctness. Traditionally, concurrent systems are described by the Calculus of Communi- cating Systems [13] or as Communicating Sequential Processes (CSP) [14]. Both approaches, however, argue on a very abstract level and are mainly used to spec- ify system requirements at an early stage in the system design. Basin, Olderog and Sevinç [15], for instance, describe the combination of CSP and Object-Z in order to specify and analyze security automata. Unfortunately, a link to the actual code is missing. Discussion. We presented an approach to a pervasive, formal specification of a microkernel-based operating system. The formal models of our microkernel Vamos and the SOS occupy almost 400 kB and comprise about 8,000 lines of specification. The formalization took us about 48 person months. The different specification layers resemble essentially the implementation layers. A recurring problem was the semantic gap between the high-level imple- mentation language C0 and certain hardware details that were manipulated by the considered programs. Such details concerned hidden hardware components like devices, on the one hand, and the granularity of atomic operations on the hardware level, on the other hand. Both problems arose simultaneously when we argued about C0 processes: In pure C0, we cannot express kernel calls. Hence, we extended the original C0 semantics by a library of functions that implement the Model Stack for the Pervasive Verification of a Microkernel-based Operating System 69 kernel calls using inlined assembly. Moreover, user machines may be interrupted after the execution of any assembly instruction. Consequently, we had to justify that we may confluently shift all rescheduling events to statement boundaries before we could argue about C0 processes. In non-pervasive approaches, these challenges are likely to be skipped. When we began to specify the different layers of our operating system, we started with independent models. As expected, many parts of the formal models overlapped because we specified the same operating system at several layers of abstraction. After a short period of evaluation for each abstraction layer, we coupled the models as tight as possible in order to minimize the proof efforts for the simulation theorems between adjacent models. This ambivalent approach was very successful. In the very beginning, the specifications change very often, such that maintaining dependencies between them would be very tedious and distracting from the individual problems. As soon as the models stabilized, however, there was a common ground for a tight meshing. The synergies between the layers could easily be identified and utilized. The key to shared definitions between the Vamos specification and the Coup model was twofold: On the one hand, we unitized the state space in a way that permits easy abstraction between the different abstraction layers. On the other hand, we established the process abstraction already at the Vamos specification. Both arrangements together enabled a very tight mesh, which could even be reused in the SOS specification. Of course, this tight mesh of the models has drawbacks: The specification of an abstract layer depends massively on the underlying layers, which impedes the understanding of this model. Moreover, even the lower layers became more complicated, so for instance, when the process abstraction was already intro- duced in the Vamos specification. Finally, several layers were affected whenever the abstraction at one level turned out to be wrong. From our experience, however, we conclude that the advantages outweigh the drawbacks. We saved valuable time by reusing the definitions, which we otherwise had spent in distracting equivalence proofs. For instance, we succeeded with the step-wise simulation proof between the Vamos specification and the Coup model within two months. We have proven this simulation for most kernel calls within days. Most of the time, however, we spend with the verification of IPC. This effort was caused by a considerably simpler modelling of IPC in Coup, which became possible after the scheduler was abstracted. In this particular case, the simpler modelling was desirable and hence, we invested the additional time for the proof. Independent models, however, would possibly have caused such an overhead for every kernel call. In the code-correctness proof for Vamos, there is only a minimal overhead (hours) induced by the use of the process abstraction in the specification. All in all, we invested about 18 person months to show the correctness of the IPC send 70 Matthias Daum, Jan Dörrenbächer, Sebastian Bogan call. Including all auxiliary functions, the implementation of this call accounts for nearly half the code size and comprises the most complex parts. Again 18 person month had been spent on the functional verification of the file system and the hard disk driver code in the SOS. This code covers 20% of the SOS implementation. Summary. The fundamental difference of our “pervasive” approach compared to “incremental” approaches that focus on a single layer at a time regards the design of specifications: At first, models are usually adapted to the actual verifi- cation goal. If it is the only goal, to show that the formal specification precisely describes the implementation, the specification tends to move closely towards the implementation. If a model is used as fundament of an underlying compo- nent without a proof, there is a likelihood that the model over-abstracts the actual implementation. At second, regarding one single layer at a time necessar- ily leads to self-contained, independent models. When combining those different layers later on, there is a tremendous proof effort necessary to establish simula- tion theorems between the adjacent layers. Our approach prevents this effort by the specification design. References 1. Leinenbach, D., Petrova, E.: Pervasive compiler verification: From verified programs to verified systems. In: Workshop on Systems Software Verification, to appear, Elsevier (2008) 2. Leinenbach, D., Paul, W., Petrova, E.: Towards the formal verification of a C0 compiler: Code generation and implementation correctness. In: SEFM. (September 2005) 2–11 3. Beyer, S., Jacobi, C., Kröning, D., Leinenbach, D., Paul, W.J.: Putting it all together: Formal verification of the VAMP. STTT 8(4-5) (2006) 411–430 4. Alkassar, E., Starostin, A., Schirmer, N.: Formal pervasive verification of a paging mechanism. In: TACAS. (2007) 5. Gargano, M., Hillebrand, M., Leinenbach, D., Paul, W.: On the correctness of operating system kernels. In: TPHOLs. Volume 3603 of LNCS., Springer (2005) 1–16 6. Hillebrand, M., In der Rieden, T., Paul, W.: Dealing with I/O devices in the context of pervasive system verification. In: ICCD, IEEE Computer Society (2005) 309–316 7. Bevier, W.R., Hunt, Jr., W.A., Moore, J.S., Young, W.D.: An approach to systems verification. J. Autom. Reasoning 5(4) (December 1989) 411–428 8. Bevier, W.R.: Kit and the short stack. J. Autom. Reasoning 5(4) (December 1989) 519–530 9. Heiser, G., Elphinstone, K., Kuz, I., Klein, G., Petters, S.M.: Towards trustworthy computing systems: taking microkernels to the next level. Operating Systems Review (July 2007) 41(3) 10. Hohmuth, M., Tews, H., Stephens, S.G.: Applying source-code verification to a microkernel: the vfiasco project. In: Operating Systems Review, European workshop: beyond the PC, New York, NY, USA, ACM Press (2002) 165–169 11. Shapiro, J.S., Weber, S.: Verifying the EROS confinement mechanism. In: Symposium on Security and Privacy, IEEE Computer Society (2000) 166–176 12. Ni, Z., Yu, D., Shao, Z.: Using XCAP to certify realistic systems code: Machine context manage- ment. In: TPHOLs, Lecture Notes in Computer Science (September 2007) 189–206 13. Milner, R.: A Calculus of Communicating Systems. Springer (1982) 14. Hoare, C.A.R.: Communicating Sequential Processes. Prentice-Hall (1985) 15. Basin, D.A., Olderog, E.R., Sevinç, P.E.: Specifying and analyzing security automata using CSP-OZ. In: ASIACCS, ACM (2007) 70–81