<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Model Stack for the Pervasive Verification of a Microkernel-based Operating System?</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Matthias Daum</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jan D¨orrenba¨cher</string-name>
          <email>jandb@wjpserver.cs.uni-sb.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sebastian Bogan</string-name>
          <email>sebastian@wjpserver.cs.uni-sb.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Saarland University, Computer Science Dept.</institution>
          <addr-line>66123 Saarbru ̈cken</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <fpage>56</fpage>
      <lpage>70</lpage>
      <abstract>
        <p>Operating-system verification gains increasing research interest. The complexity of such systems is, however, challenging and many endeavors are limited in some respect: Some projects focus on a particular aspect like memory safety, not pursuing functional correctness. Others restrict their verification efforts to a single layer of software, assuming correctness of those below. Only few projects aim at pervasive formal verification of a computer system over several software layers. In our paper, we present an approach to the formal specification of a microkernel-based operating system at several layers and glance on our verification experience with this model stack. From our experience, we conclude that pervasiveness entails more than just cumulative verification efforts on several layers. In fact, it is a challenging task to integrate models and proofs into a uniform, coherent theory.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Consequently, we need an assembly semantics, a C semantics and a
simulation theorem between both semantics. Leinenbach and Petrova [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] provide
exactly that: a proven correct compiler that translates programs from the
perfectly type-safe C variant C0 into the assembly language of our risc processor.
An important feature of C0 is the inline-assembly statement, which permits
the language extensions via primitives. Using this mechanism, we have
implemented our operating system in C0. The correct C0 compiler is an elementary
prerequisite for our model stack.
      </p>
      <p>Outline. In Sect. 2, we explain the fundamentals of our operating system. This
section comprises a sketch of the correct compiler, a survey of the features of
our microkernel, and an outline of the system’s implementation design. With
these prerequisites in place, we give an overview of our model stack in Sect. 3.
Certainly, we cannot descend into all details of the stack in this paper.1 Hence, we
confine ourselves to aspects of encapsulation and state partitioning. In Sect. 4,
we describe our formal specification of the generic interface between the user
processes and the kernel. Moreover, we sketch our formal model for assembly
and C0 processes. This process abstraction is an elementary prerequisite for the
model of communicating user processes as well as for the top-level specification of
the operating system. We illustrate in Sect. 5, why we use the process abstraction
already in the Vamos specification and how we have constructed the latter model
in order to support the abstraction towards Coup and the top-level specification.
We conclude in Sect. 6.
2
2.1</p>
    </sec>
    <sec id="sec-2">
      <title>Background</title>
      <sec id="sec-2-1">
        <title>A Correct Compiler for an Extensible Language</title>
        <p>
          In this section, we summarize work of Leinenbach and Petrova [
          <xref ref-type="bibr" rid="ref1 ref2">1,2</xref>
          ]. They provide
a formally verified compiler that translates programs from the perfectly
typesafe C variant C0 into the assembly language of a risc processor. Restrictions
of C0 in comparison to C are that expressions must be side-effect-free, all type
conversions have to be made explicitly, and there is no pointer arithmetic. In
spite of that, C0 still features dynamic memory allocation and inlined assembly.
        </p>
        <p>The inline-assembly statement is an important feature for language
extensions. Though the effect of this statement cannot be expressed in the formal
C0 semantics, the compiler literally embeds the given assembly code into the
compilation. Using this hook, we can implement C0 functions that use inline
assembly in order to extend C0 by a functionality that cannot be expressed by
other C0 statements.
1 The complete Isabelle/HOL theories of the models as well as detailed documentation are made
available at the Verisoft repository, see &lt;http://www.verisoft.de/VerisoftRepository.html&gt;</p>
        <p>For C0 and for the assembly language, two small-step transition semantics
have been formalized in Isabelle/HOL. Leinenbach has formally proven compiler
correctness by a stepwise simulation between both semantics. Below, we describe
the semantics in detail. We often deal with structured values, which we define
by enumerating the components in prose, e. g., “a value x consisting of two
components this and that ”. We refer to a single component with a dot, e. g.,
x.this refers to component this of value x. An update of this component is
denoted by x [this := q].</p>
        <p>C0 Semantics. For lack of space, we can only glance at the C0 semantics.
C0 programs are statically represented by the program environment Γ , which
comprises a symbol table of global variables, a type-name environment, and a
function table. The dynamically changing state sC0 of a C0 program in execution
is composed of
– the remaining program sC0.prog, and
– the current state of the program variables sC0.mem.</p>
        <p>In the following sections, we assume an evaluation function get-val for the
lookup, and an update function set-val for the manipulation of a certain variable in
the memory. We refer to the value of expression e in state sC0 by get-val (sC0, e).
If we update the left-value l in state sC0 with some expression u, we denote the
resulting configuration by set-val (sC0, l, u).</p>
        <p>
          The transition relation δCse0q of this semantics is a partial function.
The Target Assembly Language. The assembly semantics was developed for
the risc processor Vamp [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. The assembly semantics abstracts from the paging
mechanism of the processor and employs a linear memory model. An assembly
state sasm consists of the following components:
– the normal and the delayed program counters, sasm.pc and sasm.dpc,
respectively, implementing the delayed branch mechanism.
– the general-purpose register file sasm.gpr ∈ {0, . . . , 31} → {0, 1}32.
– the memory size sasm.V measured in pages of 4096 bytes. It defines the set
of available memory addresses: VA(sasm) = {a | a &lt; sasm.V · 4096}
– the byte-addressable linear memory sasm.vm ∈ VA(sasm) → {0, 1}8
        </p>
        <p>We denote the state space of the assembly semantics by Sasm. Assembly
computations are modeled by the partial function δasesqm ∈ Sasm * Sasm. Note
that the effects of exceptions like illegal page faults cannot be fully determined
from the assembly-machine state. In that case, δasesqm gets stuck. But with sufficient
resources, a compiled C0 program does not generate exceptions during normal
execution. Moreover, the memory size sasm.V can neither be read nor changed
by the assembly machine itself but depends on the operating-system kernel. We
extend the semantics accordingly in Sect. 4.
2.2</p>
      </sec>
      <sec id="sec-2-2">
        <title>Fundamentals of our Microkernel</title>
        <p>In this section, we sketch the features of Vamos and explain the fundamental
access-control mechanism that establishes the process roles “privileged process”
and “device driver”.</p>
        <p>Our microkernel Vamos performs the following tasks: (a) enforcement of
a minimal access control, (b) process management, (c) memory management,
(d) priority-based round-robin scheduling, (e) support for user-mode device
drivers, and (f) inter-process communication (IPC). Processes can control these
tasks via the kernel’s application binary interface (ABI). Table 1 lists the kernel
calls that constitute the ABI.</p>
        <p>Most kernel calls are reserved for so-called privileged processes. Thus, only
a privileged process can bring up new processes or kill existing ones, alter the
memory consumption of processes, change their scheduling parameters, or
control the registration of device drivers. Any process, however, might use the IPC
mechanism. Thus, the privileging serves as a minimal access control. We
presume that the privileged processes constitute the user-mode parts of the
operating system and implement a more sophisticated access-control mechanism.
Non-privileged processes may then communicate with the privileged processes
in order to request kernel services on their behalf.</p>
        <p>When Vamos boots, it launches one single process, the init process. This
process is privileged and has to set up the required servers of the operating
system, start and register the device drivers, and possibly run initial applications.</p>
        <p>A device driver is a user process, which is designated for the communication
with certain devices. Only if a process is registered as a driver for a particular
device, it may place read or write requests from or to that device, respectively.
Moreover, the device driver is notified of interrupts from that device.
2.3</p>
      </sec>
      <sec id="sec-2-3">
        <title>Implementation Design</title>
        <p>
          Fig. 1 depicts the implementation structure of Vamos. The lowest software layer
is called communicating virtual machines (CVM). This layer encapsulates all
hardware-specific low-level functionality, which is possibly using inline assembly.
CVM has two major tasks: memory virtualization and switching between
different threads of execution. Hence, CVM includes a page-fault handler with a
simple memory swapping facility [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. Moreover, it exports an interface of so-called
primitives for the access and manipulation of user machines to the
hardware-independent part. We have thereby established a solid framework for microkernel
construction.
        </p>
        <p>Using this framework, we have implemented our microkernel Vamos in C0
without extra portions of inline assembly. On every kernel entry, CVM preserves
the old context, establishes a suitable C0 environment and calls the function
kdispatch of Vamos. For the manipulation of the user memory or registers,
Vamos may call the primitives of CVM. The return value of kdispatch determines,
which user machine resumes when the kernel execution finishes.
rseU eodm</p>
        <p>App</p>
        <p>SOS</p>
        <p>App
kcalls kcalls</p>
        <p>Vamos</p>
        <p>primitives
e
o kdispatch
d
m
m
e
t
s
y
S</p>
        <p>CVM
Hardware</p>
        <p>SOS
Coup
(C0+asm)?
(C0+asm)?
Vamos</p>
        <p>asm?
CVM</p>
        <p>C0
asm?</p>
        <p>Fig. 2. Schematics of our model stack</p>
        <p>While CVM and Vamos run in the privileged system mode of the processor,
user machines run in the unprivileged user mode. In the figure, we labelled
one user machine “SOS” for simple operating system and the others “App”
as abbreviation for application. The SOS constitutes the highest layer of our
operating system. It features an advanced rights management with different
users, implements a sophisticated access control to kernel services like process
creation and provides further services like file-system and network access. All
user machines interact with the kernel via kernel calls. The special instruction
trap causes an exception, which is handled in Vamos. Vamos can examine and
alter the state of the user machine using CVM primitives, thus identifying the
user machine’s specific request and storing the kernel’s corresponding response.
3</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Model-Stack Overview</title>
      <p>In analogy to the software stack, we have developed a model stack. For each
software layer, this stack contains a specification that fully describes the
observable behavior of this layer. The specification describes the full functionality of
the software layer, on the one hand, and forms the foundation for the
simulation proofs for the more abstract layers, on the other hand. We build the model
stack as a means of verifying statements about the real system on a convenient,
abstract layer. Such statements include safety and liveness properties.</p>
      <p>
        Fig. 2 shows our model stack. The lowest model describes CVM. This model
comprises a kernel machine, given in the C0 semantics, and a number of user
machines with virtual memory, given in the assembly semantics. The kernel runs
in system mode with disabled interrupts. The user machines, in contrast, may
be interrupted after each assembly instruction, thus we model them on assembly
level. We will not descend into the details of this model in the present publication
because it has been described earlier [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>We can instantiate the kernel of CVM with the Vamos implementation,
i. e., the C0 program that implements (the hardware-independent part of) our
microkernel is executed as CVM’s kernel machine. The Vamos model specifies
the behaviour of the resulting system. In short, this model establishes assembly
processes that communicate with the kernel via a well-defined interface, the
kernel ABI. A code-correctness proof for the Vamos implementation establishes
the simulation theorem between the instantiated CVM model and the Vamos
specification.</p>
      <p>While the Vamos processes use the assembly semantics, the SOS and the
applications are written in C0. The kernel ABI is encapsulated in a few library
functions. Except for the library functions, the verification of these programs
should certainly employ the more abstract C0 semantics. For this purpose, we
designed the model communicating user processes (Coup). This model
simulates Vamos but it abstracts from the scheduler and establishes C0 processes.
We need to abstract from the particular Vamos scheduler because of the
different granularity of the program semantics: The execution of a process might
be suspended and resumed on every assembly instruction but a C0 statement
usually contains several assembly instructions. The simulation proof has three
parts: (1) The execution traces of Coup should contain all traces of Vamos.
(2) There is a confluent reordering of any execution trace such that there is no
scheduling event during a single C0 statement. (3) The kernel library is correctly
implemented.</p>
      <p>We can instantiate one of the C0 processes with the implementation of SOS.
The SOS model specifies the behaviour of the resulting system. The simulation
between both models can be shown by the functional correctness of the SOS
implementation.
4</p>
    </sec>
    <sec id="sec-4">
      <title>Process Abstraction</title>
      <p>In this section, we formalize the interface between the processes and the kernel.
Both kernel models, the Vamos specification and the Coup model, use this
process abstraction in order to access and manipulate processes. Our abstraction
is based on the observation that Vamos interacts with processes only via a
well-defined interface, the kernel ABI. Hence, we can encapsulate processes in
a self-contained input-output automaton, thereby hiding the internal state and
just exposing the generic interface.</p>
      <p>We define a process as input-output automaton described by a tuple
(Sproc, Σproc, Ωproc, ωproc, vm-sizeproc, init proc, δproc)
with state space Sproc, input alphabet Σproc, output alphabet Ωproc, output
functions ωproc and vm-sizeproc, initialization function init proc, and transition function
δproc.</p>
      <p>While the state space Sproc depends on the individual process abstraction,
the interface between the kernel and the processes is naturally shared by all
process abstractions. This interface is entirely defined by Σproc and Ωproc.</p>
      <p>The output alphabet Ωproc enumerates all possible kernel calls. Additionally,
we have to treat a few error cases. As the kernel calls are internally identified by a
number, a process might specify an invalid number. This condition is represented
by the special output value undefined trap. Moreover, a process might generate
exceptions like an arithmetic overflow or an illegal page fault. These exceptions
are collectively represented as value runtime error. Finally, the output ε denotes
the intention to perform a local computation.</p>
      <p>The input alphabet Σproc reflects all kernel-initiated changes of a process.
These comprise all possible responses to kernel calls, on the one hand, and the
demand to change the amount of virtual memory, on the other hand. While the
former are the synchronous reaction to a kernel call, the latter may be issued
asynchronously at any stage of a process. In order to perform a local transition,
we pass the input ε to the transition function δproc.</p>
      <p>In addition to the usual functions δproc ∈ Σproc × Sproc → Sproc for
transitions and ωproc ∈ Sproc → Ωproc for the output, we use two more functions in
our process abstraction. In order to compute the overall memory consumption
of the process system, Vamos needs to know the amount of virtual memory that
is currently occupied by every process. The function vm-sizeproc ∈ Sproc → N
provides the necessary information. When a new process is created, Vamos
has to translate a representation of a given binary executable file into the
corresponding, initial process state. We specify this translation in the function
init proc ∈ N? → Sproc.</p>
      <p>For the kernel specification, the generic abstraction of processes as
selfcontained input-output automata is rather a matter of taste than necessity.
However, this generic process abstraction is a cornerstone of the Coup model.
Moreover, it is crucial for the simulation theorem between Vamos and Coup
because both models mainly differ in the process abstraction. Hence, it proved to
be beneficial that many definitions of the specification are parametrized over the
process abstraction and can thus be reused in Coup as well. Finally, the
introduction of this parameter on the whole Coup model is just a logical consequence
because it paves the way for the integration of other language semantics.</p>
      <p>In the following, we refine the generic abstraction with specific process models
for assembly and C0.</p>
      <p>Assembly Processes. We reuse the state space Sasm of the assembly
semantics for our assembly processes. Based on this state space, we now define the
output functions ωasm and vm-sizeasm, the transition function δasm, and the
initialization function init asm. The function vm-sizeasm simply returns the value of
the component V of the current state: vm-sizeasm(sasm) = sasm.V . However, the
definitions for other functions are too complex to fully present them here. We
constrain ourselves to an exemplary excerpt.</p>
      <p>Fig. 3 depicts the formal definition of the output function ωasm and the
transition function δasm for the call process clone. We assume that sasm is the state
of an assembly process. The predicate trap holds iff the current instruction is
a trap, and the function simm extracts the sign-extended immediate constant
from the current instruction. If there is a trap with immediate constant 2, the
output function will return the pair of value process clone and the content of
register 11. Let us now assume that the kernel recognizes this output from the
current process but the process is not privileged. In this case, the kernel signals
this error condition by passing value err unprivileged via the transition function
to the current process. Then, the transition function updates the register 22 with
the corresponding error code and increases the program counters.
trap(sasm) ∧ simm(sasm) = 2 =⇒ ωasm(sasm) = (process clone, sasm.gpr(11))</p>
      <p>2 gpr(22) := −4 3
δasm(err unprivileged, sasm) = sasm4 pc := sasm.pc + 4 5</p>
      <p>dpc := sasm.dpc + 4
C0 Processes. In order to represent C0 processes, we define an automaton
with the following signature:
(SC0, Σproc, Ωproc, ωC0, vm-sizeC0, init C0, δC0)</p>
      <p>Any state sC0 ∈ SC0 comprises the static program environment as well as
the dynamic program state of the C0 machine as described in Sect. 2.1. We
store the program environment because processes might dynamically be created
and killed, hence the program might change. The output functions ωC0 and
vm-sizeC0, the transition function δC0, and the initialization function init C0 are
defined in analogy to their assembly-process counterparts.</p>
      <p>For example, the formal definition of the output function and the transition
function for the call process clone is given in Fig. 4. We assume that sC0 is the
current state of a C0 process. The component sC0.prog denotes the remaining
program of the process. We consider the first statement of the program. If this
statement is a call to the function vc process clone, we define the output of
ωC0 as the pair of process clone and the value of the first argument to the function
call. Let us now assume that the kernel recognizes this output from the current
process, clones the given process, and computes the value hnnew as new identifier
for the clone. It would then pass this value via the transition function to the
current process, thus signalling the success of the clone operation. In this case,
the transition function updates the memory sC0.mem at the address designated
by the left-value e with value hnnew and removes the function call from the
remaining program.
5</p>
    </sec>
    <sec id="sec-5">
      <title>Formal Kernel Models</title>
      <p>In the previous section, we described our process model. Now, we embed the
processes into the two concurrent kernel models, the Vamos specification, and
the Coup model. The former specifies the exact behaviour of our microkernel
with a particular scheduler. This model is used for code verification. The latter
abstracts the scheduler and focusses on the interaction of the processes with the
sC0.prog = ”e = vc process clone(e0); r”</p>
      <p>» mem := set-val(sC0.mem, e, hnnew) –
=⇒ δC0((succ new process, hnnew), sC0) = sC0 prog := r
microkernel.We need this more abstract model in order to describe the reordering
of interleaved sequences and to introduce C0 processes.</p>
      <p>Both models are Moore machines. We describe the kernel specification with
the tuple (Sv, s0v, Σˆ, Ωˆ, ωv, δv) and the Coup model with (Svc, s0vc, Σˆ, Ωˆ, ωvc, δvc).
The state spaces Sv, Svc contain the initial states s0v and s0vc, respectively. For
device communication, we use the input alphabet Σˆ and the output alphabet
Ωˆ. The functions ωv and ωvc determine the output in a current state. Finally, δv
and δvc describe the transitions of the models. The notable difference between
these machines regards the determinism: While the Vamos specification is fully
deterministic, Coup features a non-determinism in its scheduling decisions.
Consequently, the transition relation δv is functional while δvc is not.</p>
      <p>
        Below, we introduce the different components of the models side by side.
Device Communication. Our kernel uses memory-mapped I/O for device
communication. Hence, the output alphabet Ωˆ comprises read and write accesses to
device addresses. The input alphabet Σˆ consists of interrupt lines and optionally
incoming data. Hillebrand et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] have described our device interface in detail.
State Spaces. A state sv ∈ Sv comprises the following components:
– The partial function sv.procs maps the process identifiers (PIDs) of the
currently active processes to their assembly states sasm ∈ Sasm. For inactive
processes, this function is undefined.
– Priorities are assigned to each PID of the active processes with the partial
function sv.priodb.
– All other scheduling information is kept in the component sv.schedds.
– The partial function sv.rightsdb maps PIDs to a data structure for the
management of IPC rights and the set of privileged processes.
– Finally, the component sv.devds contains data for device communication.
At first sight, the separation of priorities from the remaining scheduling data
might surprise, here. The reason for this unintuitive partitioning is that Coup
abstracts from the particular scheduling algorithm but the priorities remain
visible. We modularized the states and the corresponding transition functions in
order to share the definition of similar parts between the Vamos specification,
Coup, and the SOS model.
      </p>
      <p>A central part of the Vamos specification are the scheduling data structures.
The component sv.schedds is divided into sub components. The current time
time ∈ N is a counter for clock ticks. Process-specific scheduling information
for active processes is collected in the partial function procdb that maps PIDs
to a record of (a) the time slice tsl , (b) the amount of consumed time ctsl , and
(c) the absolute timeout to. If a process is found to be computing when a timer
interrupt raises, the component ctsl is increased until the process has finally
run for tsl ticks. In this case, another process is scheduled. If a process calls the
kernel for IPC and no partner is ready for communication, the absolute timeout
to is computed from the current time and the relative timeout that has been
specified with the call.</p>
      <p>Moreover, the scheduler maintains different queues for scheduling. They are
represented as finite sequences in the Vamos specification. Namely, there is
a ready queue ready(prio) of schedulable processes for each priority prio ∈
{0, 1, 2}. The processes that cannot currently be scheduled (because they are
waiting for an IPC partner) are held in a queue named wait .</p>
      <p>In Vamos, the current process is the first process in the highest, non-empty
ready queue. If all ready queues are empty, the current process is undefined.
Formally, we define the function cup as:
cup(sv.schedds) = p ⇐⇒ ∃i : sv.schedds.ready(i) = (p, . . . ) ∧</p>
      <p>∀ j &gt; i : sv.schedds.ready(j) = ()</p>
      <p>The corresponding Coup state svc inherits most components of sv. Only two
components change: The process abstraction becomes a model parameter, i. e.,
component svc.procs of a Coup state svc is a partial function from PIDs to a
generic state space Sproc for processes. Moreover, the scheduling data structures
are replaced by a current-process indicator. We retain the current process in the
state in order to compute the output from the current state. The output function
ωvc signals the demand for device communication. In order to determine this
demand, we need to employ the output function ωproc of the current process.
Consequently, we fix this process beforehand instead of including transitions for
all ready processes in the transition relation.</p>
      <p>Transitions. A transition δv(σˆ, sv) of the Vamos specification under the device
input σˆ ∈ Σˆ has up to three phases:
1. If the current process cp = cup(sv.schedds) is defined, we consult its
output ωproc(sv.procs(cp)) and compute the response according to the current
Vamos state. For instance, if a process calls process clone, we check for
sufficient privileges and resources and choose the corresponding response
σ ∈ Σproc for success or failure. With this response, we advance the
current process: sv [procs(cp) := δproc(σ, sv.procs(cp))].
2. If the timer-interrupt line is raised, the scheduler increases the clock-tick
counter sv.schedds.time and the consumed time sv.schedds.procdb(cp).ctsl of
the current process. Moreover, the scheduler wakes up all processes p with
elapsed timeouts.
3. Finally, Vamos delivers interrupts to waiting drivers and saves the remaining
interrupts for later delivery in sv.devds.
σ</p>
      <p>The Coup transitions behave very similarly. However, a transition svc ←→
s0vc ∈ δvc, obtains cp directly from svc.cup. Moreover, the only visible effect of the
second phase is the wake-up of certain waiting processes. This effect is simulated
non-deterministically and independently from the timer interrupt.
Reusability. As mentioned earlier, we worked hard to reuse as much definitions
as possible in the different layers of abstraction. For example, in contrast to the
Coup model, the kernel specification does not actually rely on the encapsulation
of processes as self-contained input-output automata. Since Vamos only
considers assembly processes, it would be easier to directly manipulate the state of
these processes. However, in order to literally reuse update functions and
definitions, we invested the extra work and already introduced the input-output
automata at this level of our model stack.</p>
      <p>Consider the formal definition of the function v clone updt procs in Fig. 5.
This function describes the necessary updates in case of a successful call to
process clone. Without going into detail, depending on the relation between
the calling process (psubj) and the process that was cloned (pobj), this
function updates the state of the calling process, the state of the cloned process,
and the state of the new process. The important thing about this definition
is that the updates are presented in a process-abstraction independent
manner. Rather than updating particular registers, as it would suffice for
assembly processes, we feed the generic transition function δproc with generic input,
e.g. (succ new process, hnnew). Now, this generic interface between kernel and
processes allows us to use v clone updt procs in both models, i.e. Vamos and
Coup. Proceeding in a similar way with all kernel calls keeps the models clear
and maintainable. Furthermore, tedious equivalence proofs between Vamos and
Coup are avoided.
6</p>
    </sec>
    <sec id="sec-6">
      <title>Conclusion</title>
      <p>
        Related Work. With the CLI stack [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], a pioneering approach started out
for pervasive verification of a complete system with rigorous formalizations of
specifications and proofs. Most notably, the simple kernel KIT [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] was developed
v clone updt procs(procs, psubj, pobj, pnew, hnnew) =
λx.
      </p>
      <p>if x = psubj then δproc((succ new process, hnnew), procs(psubj))
else if x = pnew ∧ psubj = pobj then δproc((succ new process, 0), procs(psubj))
else if x = pnew then procs(pobj)
else procs(x)
Fig. 5. Formal definition of the function v clone updt procs that computes the
component procs after a call to process clone
and its machine code implementation was proven to be correct. However, this
kernel is fairly simple compared to modern microkernels.</p>
      <p>
        Many recent projects undertake verification efforts on modern microkernels.
Among them are L4.verified [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], VFiasco [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], Eros [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], and the Flint project
[
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. Though these projects may have achieved some advances, they all discard
pervasiveness but focus on the verification of a single software layer. The Flint
project has developed a verification framework for an assembly language and
has formally proven the correctness for context-switching code. The other three
projects have established semantics for C variants and have verified different
properties on source code level. As far as we can see, inlined assembly portions
are just postulated to be correct and solely described by their semantical effects.
Moreover, these projects rely on compiler correctness.
      </p>
      <p>
        Traditionally, concurrent systems are described by the Calculus of
Communicating Systems [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] or as Communicating Sequential Processes (CSP) [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. Both
approaches, however, argue on a very abstract level and are mainly used to
specify system requirements at an early stage in the system design. Basin, Olderog
and Sevin¸c [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], for instance, describe the combination of CSP and Object-Z
in order to specify and analyze security automata. Unfortunately, a link to the
actual code is missing.
      </p>
      <p>Discussion. We presented an approach to a pervasive, formal specification of
a microkernel-based operating system. The formal models of our microkernel
Vamos and the SOS occupy almost 400 kB and comprise about 8,000 lines of
specification. The formalization took us about 48 person months. The different
specification layers resemble essentially the implementation layers.</p>
      <p>A recurring problem was the semantic gap between the high-level
implementation language C0 and certain hardware details that were manipulated by
the considered programs. Such details concerned hidden hardware components
like devices, on the one hand, and the granularity of atomic operations on the
hardware level, on the other hand. Both problems arose simultaneously when we
argued about C0 processes: In pure C0, we cannot express kernel calls. Hence, we
extended the original C0 semantics by a library of functions that implement the
kernel calls using inlined assembly. Moreover, user machines may be interrupted
after the execution of any assembly instruction. Consequently, we had to justify
that we may confluently shift all rescheduling events to statement boundaries
before we could argue about C0 processes. In non-pervasive approaches, these
challenges are likely to be skipped.</p>
      <p>When we began to specify the different layers of our operating system, we
started with independent models. As expected, many parts of the formal models
overlapped because we specified the same operating system at several layers of
abstraction. After a short period of evaluation for each abstraction layer, we
coupled the models as tight as possible in order to minimize the proof efforts for
the simulation theorems between adjacent models.</p>
      <p>This ambivalent approach was very successful. In the very beginning, the
specifications change very often, such that maintaining dependencies between
them would be very tedious and distracting from the individual problems. As
soon as the models stabilized, however, there was a common ground for a tight
meshing. The synergies between the layers could easily be identified and utilized.</p>
      <p>The key to shared definitions between the Vamos specification and the Coup
model was twofold: On the one hand, we unitized the state space in a way that
permits easy abstraction between the different abstraction layers. On the other
hand, we established the process abstraction already at the Vamos specification.
Both arrangements together enabled a very tight mesh, which could even be
reused in the SOS specification.</p>
      <p>Of course, this tight mesh of the models has drawbacks: The specification
of an abstract layer depends massively on the underlying layers, which impedes
the understanding of this model. Moreover, even the lower layers became more
complicated, so for instance, when the process abstraction was already
introduced in the Vamos specification. Finally, several layers were affected whenever
the abstraction at one level turned out to be wrong.</p>
      <p>From our experience, however, we conclude that the advantages outweigh
the drawbacks. We saved valuable time by reusing the definitions, which we
otherwise had spent in distracting equivalence proofs. For instance, we succeeded
with the step-wise simulation proof between the Vamos specification and the
Coup model within two months. We have proven this simulation for most kernel
calls within days. Most of the time, however, we spend with the verification of
IPC. This effort was caused by a considerably simpler modelling of IPC in Coup,
which became possible after the scheduler was abstracted. In this particular case,
the simpler modelling was desirable and hence, we invested the additional time
for the proof. Independent models, however, would possibly have caused such
an overhead for every kernel call.</p>
      <p>In the code-correctness proof for Vamos, there is only a minimal overhead
(hours) induced by the use of the process abstraction in the specification. All in
all, we invested about 18 person months to show the correctness of the IPC send
call. Including all auxiliary functions, the implementation of this call accounts
for nearly half the code size and comprises the most complex parts. Again 18
person month had been spent on the functional verification of the file system
and the hard disk driver code in the SOS. This code covers 20% of the SOS
implementation.</p>
      <p>Summary. The fundamental difference of our “pervasive” approach compared
to “incremental” approaches that focus on a single layer at a time regards the
design of specifications: At first, models are usually adapted to the actual
verification goal. If it is the only goal, to show that the formal specification precisely
describes the implementation, the specification tends to move closely towards
the implementation. If a model is used as fundament of an underlying
component without a proof, there is a likelihood that the model over-abstracts the
actual implementation. At second, regarding one single layer at a time
necessarily leads to self-contained, independent models. When combining those different
layers later on, there is a tremendous proof effort necessary to establish
simulation theorems between the adjacent layers. Our approach prevents this effort by
the specification design.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Leinenbach</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Petrova</surname>
          </string-name>
          , E.:
          <article-title>Pervasive compiler verification: From verified programs to verified systems</article-title>
          .
          <source>In: Workshop on Systems Software Verification</source>
          , to appear, Elsevier (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Leinenbach</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paul</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Petrova</surname>
          </string-name>
          , E.:
          <article-title>Towards the formal verification of a C0 compiler: Code generation and implementation correctness</article-title>
          .
          <source>In: SEFM</source>
          . (
          <year>September 2005</year>
          )
          <fpage>2</fpage>
          -
          <lpage>11</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Beyer</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jacobi</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          , Kr¨oning,
          <string-name>
            <given-names>D.</given-names>
            ,
            <surname>Leinenbach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            ,
            <surname>Paul</surname>
          </string-name>
          , W.J.:
          <article-title>Putting it all together: Formal verification of the VAMP</article-title>
          .
          <source>STTT</source>
          <volume>8</volume>
          (
          <issue>4</issue>
          -5) (
          <year>2006</year>
          )
          <fpage>411</fpage>
          -
          <lpage>430</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Alkassar</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Starostin</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schirmer</surname>
          </string-name>
          , N.:
          <article-title>Formal pervasive verification of a paging mechanism</article-title>
          . In: TACAS. (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Gargano</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hillebrand</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Leinenbach</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paul</surname>
          </string-name>
          , W.:
          <article-title>On the correctness of operating system kernels</article-title>
          .
          <source>In: TPHOLs</source>
          . Volume
          <volume>3603</volume>
          of LNCS., Springer (
          <year>2005</year>
          )
          <fpage>1</fpage>
          -
          <lpage>16</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Hillebrand</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , In der Rieden, T.,
          <string-name>
            <surname>Paul</surname>
          </string-name>
          , W.:
          <article-title>Dealing with I/O devices in the context of pervasive system verification</article-title>
          .
          <source>In: ICCD, IEEE Computer Society</source>
          (
          <year>2005</year>
          )
          <fpage>309</fpage>
          -
          <lpage>316</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Bevier</surname>
            ,
            <given-names>W.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hunt</surname>
            , Jr.,
            <given-names>W.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moore</surname>
            ,
            <given-names>J.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Young</surname>
            ,
            <given-names>W.D.:</given-names>
          </string-name>
          <article-title>An approach to systems verification</article-title>
          .
          <source>J. Autom. Reasoning</source>
          <volume>5</volume>
          (
          <issue>4</issue>
          ) (
          <year>December 1989</year>
          )
          <fpage>411</fpage>
          -
          <lpage>428</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Bevier</surname>
            ,
            <given-names>W.R.</given-names>
          </string-name>
          :
          <article-title>Kit and the short stack</article-title>
          .
          <source>J. Autom. Reasoning</source>
          <volume>5</volume>
          (
          <issue>4</issue>
          ) (
          <year>December 1989</year>
          )
          <fpage>519</fpage>
          -
          <lpage>530</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Heiser</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Elphinstone</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kuz</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Klein</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Petters</surname>
            ,
            <given-names>S.M.</given-names>
          </string-name>
          :
          <article-title>Towards trustworthy computing systems: taking microkernels to the next level</article-title>
          .
          <source>Operating Systems Review (July</source>
          <year>2007</year>
          )
          <volume>41</volume>
          (
          <issue>3</issue>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Hohmuth</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tews</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stephens</surname>
            ,
            <given-names>S.G.</given-names>
          </string-name>
          :
          <article-title>Applying source-code verification to a microkernel: the vfiasco project</article-title>
          .
          <source>In: Operating Systems Review, European workshop: beyond the PC</source>
          , New York, NY, USA, ACM Press (
          <year>2002</year>
          )
          <fpage>165</fpage>
          -
          <lpage>169</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Shapiro</surname>
            ,
            <given-names>J.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weber</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Verifying the EROS confinement mechanism</article-title>
          .
          <source>In: Symposium on Security and Privacy</source>
          , IEEE Computer Society (
          <year>2000</year>
          )
          <fpage>166</fpage>
          -
          <lpage>176</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Ni</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yu</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shao</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          :
          <article-title>Using XCAP to certify realistic systems code: Machine context management</article-title>
          .
          <source>In: TPHOLs, Lecture Notes in Computer Science (September</source>
          <year>2007</year>
          )
          <fpage>189</fpage>
          -
          <lpage>206</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Milner</surname>
          </string-name>
          , R.:
          <source>A Calculus of Communicating Systems</source>
          . Springer (
          <year>1982</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Hoare</surname>
            ,
            <given-names>C.A.R.</given-names>
          </string-name>
          : Communicating Sequential Processes. Prentice-Hall (
          <year>1985</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Basin</surname>
            ,
            <given-names>D.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Olderog</surname>
            ,
            <given-names>E.R.</given-names>
          </string-name>
          , Sevin¸c, P.E.:
          <article-title>Specifying and analyzing security automata using CSP-OZ</article-title>
          . In: ASIACCS,
          <string-name>
            <surname>ACM</surname>
          </string-name>
          (
          <year>2007</year>
          )
          <fpage>70</fpage>
          -
          <lpage>81</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>