=Paper= {{Paper |id=Vol-2597/paper-08 |storemode=property |title=Control-flow Flattening Preserves the Constant-Time Policy |pdfUrl=https://ceur-ws.org/Vol-2597/paper-08.pdf |volume=Vol-2597 |authors=Matteo Busi,Pierpaolo Degano,Letterio Galletta |dblpUrl=https://dblp.org/rec/conf/itasec/BusiDG20 }} ==Control-flow Flattening Preserves the Constant-Time Policy== https://ceur-ws.org/Vol-2597/paper-08.pdf
Control-flow Flattening Preserves the Constant-Time Policy∗
               Matteo Busi1 , Pierpaolo Degano1 , and Letterio Galletta2
              1
                Università di Pisa, Pisa, Italy — {matteo.busi, degano}@di.unipi.it
       2
           IMT School for Advanced Studies, Lucca, Italy — letterio.galletta@imtlucca.it


                                                Abstract
          Obfuscating compilers protect a software by obscuring its meaning and impeding the
      reconstruction of its original source code. The typical concern when defining such compilers
      is their robustness against reverse engineering and the performance of the produced code.
      Little work has been done in studying whether the security properties of a program are
      preserved under obfuscation. In this paper we start addressing this problem: we consider
      control-flow flattening, a popular obfuscation technique used in industrial compilers, and
      a specific security policy, namely constant-time. We prove that this obfuscation preserves
      the policy, i.e., that every program satisfying the policy still does after the transformation.


1     Introduction
Secure compilation is an emerging research field that puts together techniques from security,
programming languages, formal verification, and hardware architectures to devise compilation
chains that protect various aspects of software and eliminate security vulnerabilities [15, 10].
As any other compiler, a secure one not only translates a source program, written in a high-
level language, into an efficient object code (low-level), but also provides mitigations that make
exploiting security vulnerabilities more difficult and that limit the damage of an attack. More-
over, a secure compilation chain deploys mechanisms to enforce secure interoperability between
code written in safe and unsafe languages, and makes it hard extracting confidential data from
information gained by examining program runs (e.g., soft information as specific outputs given
certain inputs, or physical one as power consumption or execution time).
    An important requirement for making a compiler secure is that it must grant that the
security properties at the source level are fully preserved into the object level or, equivalently,
that all the attacks that can be carried out at the object level can also be carried out at the
source level. In this way, it is enough showing that the program is secure at the source level,
where reasoning is far more comfortable than at low level!
    In this paper we focus on obfuscating compilers designed to protect a software by obscuring
its meaning and impeding the reconstruction of its original source code. Usually, the main
concern when defining such compilers is their robustness against reverse engineering and the
performance of the produced code. Very few papers in the literature address the problem of
proving their correctness, e.g., [8], and, to the best our knowledge, there is no paper about
the preservation of security policies. Here, we offer a first contribution in this direction: we
consider a popular program obfuscation (namely control-flow flattening [18]) and a specific
   ∗ The first two authors have been partially supported by project PRA_2018_66 DECLware: Declarative

methodologies for designing and deploying applications of the Università di Pisa; the third author by MIUR
project PRIN 2017FTXR7S IT MATTERS (Methods and Tools for Trustworthy Smart Systems).
Copyright c 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution
4.0 International (CC BY 4.0).
Control-flow Flattening Preserves the Constant-Time Policy           M. Busi, P. Degano and L. Galletta


security policy (namely constant-time), and we prove that every program satisfying the policy
still does after the transformation, i.e., the obfuscation preservers the policy.
     For the sake of presentation, our source language is rather essential, as well as our illustrative
examples. The proof that control-flow flattening is indeed secure follows the approach of [3]
(briefly presented in Section 2), and only needs paper-and-pencil on our neat, foundational
setting. Intuitively, we prove that if two executions of a program on different secret values
are indistinguishable (i.e., they take the same time), then also the executions of its obfuscated
version are indistinguishable (Section 3).
     Actually, we claim that extending our results to a richer language will only require to
handle more details with no relevant changes in the structure of the proof itself; similarly, other
security properties can be accommodated with no particular effort in this framework, besides
those already studied in [3], and also other program transformations can be proved to preserve
security in the same manner.
     Below, we present the security policy and the transformation of interest.

Constant-time policy An intruder can extract confidential data by observing the physical
behavior of a system, through the so-called side-channel attacks. The idea is that the attacker
can recover some pieces of confidential information or can get indications on which parts are
worth her cracking efforts, by measuring some physical quantity about the execution, e.g.,
power consumption and time. Many of these attacks, called timing-based attacks, exploit the
execution time of programs [17]. For example, if the program branches on a secret, the attacker
may restrict the set of values it may assume, whenever the two branches have different execution
times and the attacker can measure and compare them. A toy example follows (in a sugared
syntax), where a user digits her pin then checked against the stored one character by character:
here the policy is violated since checking a correct pin takes longer than a wrong one.
1   pin := read_secret();
2   current_char := 1;
3   while (current_char ≤ stored_pin_length and pin(current_char) = stored_pin(current_char))
4       current_char := current_char+1;
5
6   if (current_char = stored_pin_length+1) then print("OK!");
7       else print("KO!");

Many mitigations of timing-based attacks have been proposed, both hardware and software.
The program counter [19] and the constant-time [5] policies are software-based countermea-
sures, giving rise to the constant-time programming discipline. It makes programs constant-time
w.r.t. secrets, i.e., the running times of programs is independent of secrets. The requirement
to achieve is that neither the control-flow of programs nor the sequence of memory accesses
depend on secrets, e.g., the value of pin in our example. Usually, this is formalized as a form of
an information flow policy [13] w.r.t. an instrumented semantics that records information leak-
age. Intuitively, this policy requires that two executions started in equivalent states (from an
attacker’s point of view) yield equivalent leakage, making them indistinguishable to an attacker.
   The following is a constant-time version of the above program that checks if a pin is correct:
1   pin := read_secret();
2   current_char := 1;
3   pin_ok := true
4   while (current_char ≤ stored_pin_length)
5       current_char := current_char+1
6       if (pin(current_char) = stored_pin(current_char)) then pin_ok := pin_ok
7            else pin_ok := false
8


2
 Control-flow Flattening Preserves the Constant-Time Policy         M. Busi, P. Degano and L. Galletta


 9   if (pin_ok = true) then print("OK!");
10       else print("KO!");



 Control-flow flattening A different securing technique is code obfuscation, a program trans-
 formation that aims at hiding the intention and the logic of programs by obscuring (portions
 of) source or object code. It is used to protect a software making it more difficult to reverse
 engineer the (source/binary) code of the program, to which the attacker can access. In the
 literature different obfuscations have been proposed. They range from only performing simple
 syntactic transformations, e.g., renaming variables and functions, to more sophisticated ones
 that alter both the data, e.g., constant encoding and array splitting [12], and the control flow
 of the program, e.g., using opaque predicates [12] and inserting dead code.
     Control-flow flattening is an advanced obfuscation technique, implemented in state-of-the-
 art and industrial compilers, e.g., [16]. Intuitively, this transformation re-organizes the Control
 Flog Graph (CFG) of a program by taking its basic blocks and putting them as cases of a
 selective structure that dispatches to the right case. In practice, CFG flattening breaks each
 sequences of statements, nesting of loops and if-statements into single statements, and then
 hides them in the cases of a large switch statement, in turn wrapped inside a while loop. In
 this way, statements originally at different nesting level are now put next each other. Finally,
 to ensure that the control flow of the program during the execution is the same as before, a
 new variable pc is introduced that acts as a program counter, and is also used to terminate the
 while loop. The switch statement dispatches the execution to one of its cases depending on
 the value of pc. When the execution of a case of the switch statement is about to complete pc
 is updated with the value of the next statement to executed.
     The obfuscated version of our constant-time example follows.
 1   pc := 1;
 2   while(1 ≤ pc)
 3        switch(pc):
 4             case 1: pin := read_secret(); pc:= 2;
 5             case 2: current_char := 1; pc:= 3;
 6             case 3: pin_ok := true; pc:= 4;
 7             case 4: if (current_char ≤ stored_pin_length) then pc:= 5; else pc := 9;
 8             case 5: current_char := current_char+1; pc:= 6;
 9             case 6: if (pin(current_char) = stored_pin(current_char)) then pc:= 7; else pc := 8;
10             case 7: pin_ok := pin_ok; pc:= 4;
11             case 8: pin_ok := false; pc:= 4;
12             case 9: skip; pc:= 10;
13             case 10: if (pin_ok = true) then pc:= 11; else pc := 12;
14             case 11: print("OK!"); pc:= 0;
15             case 12: print("KO!"); pc:= 0;

    Now the point is whether the new obfuscated program is still constant-time, which is the
 case. In general we would like to have guarantees that the attacks prevented by the constant-
 time based countermeasure are not possible in the obfuscated versions.


 2     Background: CT-simulations
 Typically, for proving the correctness of a compiler one introduces a simulation relation between
 the computations at the source and at the target level: if such a relation exists, we have the
 guarantee that the source program and the target program have the same observable behavior,
 i.e., the same set of traces.

                                                                                                      3
Control-flow Flattening Preserves the Constant-Time Policy                    M. Busi, P. Degano and L. Galletta


    A general method for proving that constant-time is also preserved by compilation generalizes
this approach and is based on the notion of CT-simulation [3]. It considers three relations: a
simulation relation between source and target, and two equivalences, one between source and
the other between target computations. The idea is to prove that, given two computations
at source level that are equivalent, they are simulated by two equivalent computations at the
target level. Actually, CT-simulations guarantee the preservation of a particular form of non-
interference, called observational non-interference. In the rest of this section, we briefly survey
observational non-interference and how CT-simulations preserve it.
    The idea is to model the behavior of programs using a labeled transition system of the form
   t
A→− B where A and B are program configurations and t represents the leakage associated with
the execution step between A and B. The semantics is assumed deterministic. Hereafter, let
the configurations of the source programs be ranged over by A, B, . . . and those of the target
programs be ranged over by α, β, . . .. We will use the dot notation to refer to commands and
state inside configurations, e.g., A.cmd refers to the command part of the configuration A.1
    The leakage represents what the attacker learns by the program execution. Formally, the
leakage is a list of atomic leakages where not cancellable. Observational non-interference is
defined for complete executions (we denote Sf the set of final configurations) and w.r.t. an
equivalence relation φ on configurations (e.g., states are equivalent on public variables):
Definition 2.1 (Observational non-interference [3]). A program p is observationally non-
interferent w.r.t. a relation φ, written p |= ONI (φ), iff for all initial configurations A, A0 ∈ Si
and configurations B, B 0 and leakages t, t0 and n ∈ N,
                    t             t0
                 → n B ∧ A0 −
                A−          −→n B 0 ∧ φ(A, A0 ) =⇒ t = t0 ∧ (B ∈ Sf iff B 0 ∈ Sf ).

   Hereafter, we denote a compiler/transformation with J·K and with JpK the result of compiling
a program p. Intuitively, a compiler J·K preserves observational non-interference when for every
program p that enjoys the property, JpK does as well. Formally,
Definition 2.2 (Secure compiler). A transformation J·K preserves observational non-interference
iff, for all programs p
                              p |= ONI (φ) ⇒ JpK |= ONI (φ).
    To show that a compiler J·K is secure, we follow [3], and build a general CT-simulation in
two steps. First we define a simulation, called general simulation, that relates computations
between source and target languages. The idea is to consider related a source and a target
configuration whenever, after they perform a certain number of steps, they end up in two still
related configurations. Formally,
Definition 2.3 (General simulation [3]). Let num-steps(·, ·) be a function mapping source and
target configurations to N. Also, let |·| be a function from source configurations to N. The
relation ≈p is a general simulation w.r.t. num-steps(·, ·) whenever:
    1. (∀B, α. A →                     →num-steps(A,α) β =⇒ B ≈p β),
                 − B ∧ A ≈p α =⇒ (∃β. α−
                                       −
    2. (∀B, α. A →
                 − B ∧ A ≈p α ∧ num-steps(A, α) = 0 =⇒ |B| < |A| ,
    3. For any source configuration B ∈ Sf and target configuration α there exists a target
                                       →num-steps(A,α) β =⇒ A ≈p β.
       configuration β ∈ Sf such that α−
                                       −
    1 Following the convention of secure compilation, we write in a blue, sans-serif font the elements of the source

language, in a red, bold one those of the target and in black those that are in common.


4
Control-flow Flattening Preserves the Constant-Time Policy               M. Busi, P. Degano and L. Galletta


Given two configurations A and α in the simulation relation, the function num-steps(A, α) pre-
dicts how many steps α has to perform for reaching a target configuration β related with the
corresponding source configuration B. When num-steps(a, α) = 0, a possibly infinite sequence of
source steps is simulated by an empty one at the target level. To avoid these situations the mea-
sure function |·| is introduced and the condition 2 of the above definition ensures that the mea-
sure of source configuration strictly decreases whenever the corresponding target one stutters.
    The second step consists of introducing two equivalence relations between configurations:
 c                                              c
≡s relates configurations at the source and ≡t at the target. These two relations and the
simulation relation form a general CT-simulation. Formally,
                                                                 c   c
Definition 2.4 (General CT-simulation [3]). A pair (≡s , ≡t ) is a general CT-simulation w.r.t.
≈p , num-steps(·, ·) and |·| whenever:
        c     c
   1. (≡s , ≡t ) is a manysteps CT-diagram, i.e., if
                  c           c
            • A ≡s A0 and α ≡t α0 ;
                  t           t
               − B and A0 →
            • A→          − B0 ;
                  τ                        τ0                0   0
               →num-steps(A,α) β and α0 −−
            • α−
               −                         →num-steps(A ,α ) β 0 ;
            • A ≈p α, A0 ≈p α0 , B ≈p β and B0 ≈p β 0
      then
            • τ = τ 0 and num-steps(A, α) = num-steps(A0 , α0 );
                  c           c
            • B ≡s B0 and β ≡t β 0 ;
                                                                                               c
   2. if A, A0 are initial configurations, with targets α, α0 , and φ(A, A0 ), then A ≡s A0 and
         c
      α ≡t α 0 ;
              c
   3. If A ≡s A0 , then A ∈ Sf ⇐⇒ A0 ∈ Sf ;
        c     c
   4. (≡s , ≡t ) is a final CT-diagram [3], i.e., if
                  c           c
            • A ≡s A0 and α ≡t α0 ;
            • A and A0 are final;
                  τ                        τ0                0   0
               →num-steps(A,α) β and α0 −−
            • α−
               −                         →num-steps(A ,α ) β 0 ;
            • A ≈p α, A0 ≈p α0 , B ≈p β and B0 ≈p β 0
      then
            • τ = τ 0 and num-steps(A, α) = num-steps(A0 , α0 );
                  c
            • β ≡t β 0 and they are both final.
                                       c          c
   The idea is that the relations ≡s and ≡t are stable under reduction, i.e., preservation of
the observational non-interference is guaranteed. The following theorem, referred to in [3] as
Theorem 6, gives a sufficient condition to establish constant-time preservation.
Theorem 2.1 (Security). If p is constant-time w.r.t. φ and there is a general CT-simulation
w.r.t. a general simulation, then JpK is constant-time w.r.t. φ.

                                                                                                         5
Control-flow Flattening Preserves the Constant-Time Policy                M. Busi, P. Degano and L. Galletta


3     Proof of preservation
In this section, we present the proof that control-flow flattening preserves constant-time policy.
We first introduce a small imperative language, its semantics in the form of a LTS and our
leakage model. Then, we formalize our obfuscation as a function from syntax to syntax, and
finally we prove the preservation of the security policy.

3.1    The language and its (instrumented) semantics
We consider a small imperative language with arithmetic and boolean expressions. Let Var be
a set program identifiers, the syntax is

        AExpr 3 e ::= v | x | e1 op e2        v ∈ Z, op ∈ {+ , - , * , / , % },             x ∈ Var
        BExpr 3 b ::= true | false | b1 or b2 | not b | e1 ≤ e2 | e1 = e2
          Cmd 3 c ::= skip | x := e | c1 ; c2 | if b then c1 else c2 | while b do c

We assume that each command in the syntax carries a permanent color either white or not,
typically z. Also, we stipulate that each while statement and all its components get a unique
non-white color, and that there is a function color yielding the color of a statement.
    Now, we define the semantics and instantiate the framework of [3] to the non-cancelling
constant-time policy. For that, we define a leakage model to describe the information that an
attacker can observe during the execution. Recall from the previous section that the leakage is
a list of atomic leaks. We denote with · the list concatenation and with [a] a list with a single
element a. Arithmetic and boolean expressions leak the sequence of operations required to be
evaluated; we assume that there is an observable op, associated with the arithmetic operation
being executed, but not with the logical ones (slightly simplifying [3]). Also we denote with •
absence of leaking. Our leakage model is defined by the following function leak (·, ·) that given
an expression (either arithmetic or boolean) and a state returns the corresponding leakage:

                leak (v, σ) = leak (x, σ) = leak (true, σ) = leak (false, σ) = [•]
           leak (not b, σ) = leak (b, σ)
         leak (e1 op e2 , σ) = leak (e1 , σ) · leak (e2 , σ) · op
        leak (e1 ≤ e2 , σ) = leak (e1 = e2 , σ) = leak (b1 or b2 , σ) = leak (e1 , σ) · leak (e2 , σ)

Accesses to constants and identifiers leak nothing; boolean and relational expressions leak the
concatenation of the leaks of their sub-expressions; the arithmetic expressions append the ob-
servable of the applied operator to the leaks of their sub-expressions.
    We omit the semantics of arithmetic and boolean expression [·]σ because fully standard [20];
we only assume that each syntactic arithmetic operator op has a corresponding semantic
operator op.
                                                                             t
    The semantics of commands is given in term of a transition relation →   − between configu-
rations where t is the leakage of that transition step. As usual a configuration is a pair c, σ
consisting of a command and a state σ ∈ Store assigning values to program identifiers. Given
a program p the set of initial configurations is Si = {p, σ | σ ∈ Store}, and that of final
configurations is Sf = {skip, σ | σ ∈ Store}.
    Figure 1 reports the instrumented semantics of the language. Moreover, the semantics is
assumed to keep colors, in particular in the rule for an z-colored while, all the components of
the if in the target are also z-colored, avoiding color clashes (see the .pdf for colors).

6
Control-flow Flattening Preserves the Constant-Time Policy                                      M. Busi, P. Degano and L. Galletta


                                                                                  t
                                                                                  → c01 , σ 0
                                                                           c1 , σ −
               leak (e, σ) · [x]                                                  t                                              t
      x := e, σ −−−−−−−−−→ skip, σ{x 7→ [a]σ }                                    → c01 ; c2 , σ 0
                                                                      c1 ; c2 , σ −                                             → c2 , σ 0
                                                                                                                   skip; c2 , σ −

                      [b]σ = true                                                           [b]σ = false
                                   leak (b, σ) · [true]                                                 leak (b, σ) · [false]
   if b then c1 else c2 , σ −
                            −−−−−−−−−−−
                                      → c1 , σ                        if b then c1 else c2 , σ −−−−−−−−−−−−→ c2 , σ

                                     [•]
             while b do c, σ −−→ if b then (c; while b do c) else skip, σ


                     Figure 1: Instrumented operational semantics for commands.


3.2     Control-flow flattening formalization
Recall that the initial program being obfuscated is p. For the sake of presentation, we will adopt
the sugared syntax we used in Section 1 and represent a sequence of nested conditionals in the
obfuscated program as the command switch e : cs, where cs = [(v1 , c1 ); . . .; (vn : cn )], with
semantics
                                        ([e]σ , c) ∈
                                                   / cs                                ([e]σ , c) ∈ cs
                                                leak (e, σ)                                          leak (e, σ)
                       switch e : cs, σ −−−−−−→ skip, σ                     switch e : cs, σ −−−−−−→ c, σ

Now, let pc be a fresh identifier, called program counter. Then, following [8],the obfuscated
version JcK of the command c is
                                             pc := 1;
                                             while 1 ≤ pc do
                                                    switch pc : labeled (pc, c, 1 , 0 )
where
   labeled (pc, skip, n, m) = [(n, skip; pc := m)]
    labeled (pc, x := e, n, m) = [(n, x := e; pc := m)]
    labeled (pc, c1 ; c2 , n, m) = labeled (pc, c1 , n, n + size(c1 )) · labeled (pc, c2 , n + size(c1 ), m)
    labeled (pc, if b then c1 else c2 , n, m) =
               [(n, if b then pc := n + 1 else pc := n + 1 + size(c1 ))] ·
               labeled (pc, c1 , n + 1, m) · labeled (pc, c2 , n + 1 + size(c1 ), m)
    labeled (pc, while b do c, n, m) =
               [(n, if b then pc := n + 1 else pc := n + 1 + size(c))] ·
               labeled (pc, c, n + 1, n) · [(n + 1 + size(c), skip; pc := m)]

with size(·) defined as follows
                                                               size(c) = 1 if c ∈ {skip, · := ·}
                                                          size(c1 ; c2 ) = size(c1 ) + size(c2 )
                             size(if b then c1 else c2 ) = 1 + size(c1 ) + size(c2 )
                                             size(while b do c) = 2 + size(c)

The obfuscated version of a program p is a loop with condition 1 ≤ pc and with body a switch
statement. The switch condition is on the values of pc and its cases correspond to the flattened

                                                                                                                                             7
Control-flow Flattening Preserves the Constant-Time Policy              M. Busi, P. Degano and L. Galletta


statements, obtained from the function labeled (pc, c, n, m). It returns a list containing the cases
of the switch and it is inductively defined on the syntax of commands: the first parameter
pc is the identifier to use for program counter; the second is the command c to be flattened;
the parameter n represents the value of the guard of the case generated for the first statement
of c; the last parameter m represents the value to be assigned to pc by the last switch case
generated. For example, the flattening of a sequence c1 ; c2 generates the cases corresponding
to c1 and c2 , and then concatenates them. Note that the values of the program counter for the
cases of c2 start from the value assigned to pc by the last case generated for c1 , i.e., n + size(c1 ),
where the function size(·) returns the “length” of c1 . For a program p, we use 1 as initial value
of n and 0 as last value to be assigned so as to exit from the while loop.

3.3    Correctness and security
Since obfuscation does not change the language (apart from sugaring nested if commands);
the operational semantics is deterministic; and there are no unsafe programs (i.e., a program
gets stuck iff execution has completed), the correctness of obfuscation directly follows from the
existence of a general simulation between the source and the target languages [3]. For that,
inspired by [8], we define the relation ≈p between source and target configurations shown
in Figure 2. Intuitively, the relation ≈p matches source and target configurations with the
same behaviour, depending on whether they are final (third rule), their execution originated
from a loop (Rule (Colored)) or not (Rule (White)). Note that we differentiate white and
colored cases as to avoid circular reasoning in the derivations of ≈p . More specifically, our
relation matches a configuration A in the source with a corresponding α in the target. Actually,
α.cmd is the while loop of the obfuscated program (fourth premise in Rule (White) and
third in Rule (Colored)), whereas α.σ is equal to A.σ except for the value of pc. Its value is
mapped to the case of the switch corresponding to the next command in A (first premise in
Rule (White) and fifth in Rule (Colored)).
    To understand how our simulation works, recall the example from Section 1. By Rule (White)
we relate the configuration reached at line (3) at the source level with that of the obfuscated
program starting at line (2) and with a state equal to that of the source level with the additional
binding pc 7→ 3. Similarly, we relate the configuration reached at line (6) at the source level and
its obfuscated counterpart (again at line (2) at the obfuscated level), using Rule (Colored)
and noting that the source configuration derives from the execution of a loop.
    The following theorem ensures that the relation ≈p is a general simulation.
Theorem 3.1. For all programs p, the relation ≈p is a general simulation.
    The correctness of the obfuscation is now a corollary of Theorem 3.1.
Corollary 3.1 (Correctness). For all commands c and store σ

                              c, σ →∗ skip, σ 0    iff   JcK, σ →∗ skip, σ 0

   The next step is showing that the control-flow flattening obfuscation preserves the constant-
                                                c                             c c
time programming policy. For that we define ≡ below and we show that (≡, ≡) is a general
CT-simulation, as required by Theorem 2.1.
                                                                                                 c
Definition 3.1. Let A and A0 be two (source or obfuscated) configurations, then A ≡ A0 iff
A.cmd = A0 .cmd .
    We prove the following:

8
Control-flow Flattening Preserves the Constant-Time Policy                         M. Busi, P. Degano and L. Galletta



          (White)
                                       color (c) = white  σ 0 = σ ∪ {pc 7→ n}
                                         0
          ls = labeled(pc, p, 1, 0)     c = while (1 ≤ pc) do (switch pc : ls)              −, pc ` c ./ ls[n], m
                                                     c, σ ≈p c0 , σ 0

(Colored)
                                 σ 0 = σ ∪ {pc 7→ n} ls = labeled(pc, p, 1, 0)
                       c0 = while (1 ≤ pc) do (switch pc : ls)       while b do c00 ∈ p
color (while b do c00 ) = color (c) 6= white     −, pc ` while b do c00 ./ ls[n0 ], m0  n0 , pc ` c  ls[n], m
                                                  c, σ ≈p c0 , σ 0

                                      σ 0 = σ ∪ {pc 7→ n}
                                      skip, σ ≈p skip, σ 0


                                                                        ls[n] = (n, x := e; pc := m)
          n0 , pc ` skip ∼ ls[n], m                                      n0 , pc ` x := e ∼ ls[n], m

                                 n0 , pc ` c1 ∼ ls[n], m0       n0 , pc ` c2 ∼ ls[m0 ], m
                                              n0 , pc ` c1 ; c2 ∼ ls[n], m

     ls[n] = (n, if b then pc := n + 1 else pc := n + 1 + size(c1 ))
    n0 , pc ` c1 ∼ ls[n + 1], m  n0 , pc ` c2 ∼ ls[n + 1 + size(c1 )], m              ls[n] = (n, skip; pc := n0 c)
                 n0 , pc ` if b then c1 else c2 ∼ ls[n], m                           n0 , pc ` while b do c  ls[n], n0

                                                    n, pc ` if b then (c; while b do c) else skip  ls[n], m
       n0 , pc ` while b do c  ls[n0 ], n0                          −, pc ` while b do c ./ ls[n], m
where ∼ ∈ {, ./}, and the first parameter (n0 ) is immaterial in ./.

         Figure 2: Definition of ≈p relation on configurations and its auxiliary relations.

                                  c   c
Theorem 3.2. The pair (≡, ≡) is a general CT-simulation w.r.t. ≈p , num-steps(·, ·) and |·|.

  The main result of our paper directly follows from the theorem above, because the transfor-
mation in Section 3.2 satisfies Definition 2.2:

Corollary 3.2 (Constant-time preservation).
The control-flow fattening obfuscation preserves the constant-time policy.

     The proofs of the theorems above are available online [9].


4      Conclusions
In this paper we applied a methodology from the literature [3] to the advanced obfuscation
technique of control-flow flattening and proved that it preserves the constant-time policy. For
that, we have first defined what programs leak. Then, we have defined the relation ≈p
between source and target configurations – that roughly relates configurations with the same
behavior – and proved that it adheres to the definition of general simulation. Finally, we proved
                                                                           c c
that the obfuscation preserves constant time by showing that the pair (≡, ≡) is a general CT-
simulation, as required by the framework we instantiated. As a consequence, the obfuscation
based on control-flow flattening is proved to preserve the constant-time policy.

                                                                                                                          9
Control-flow Flattening Preserves the Constant-Time Policy           M. Busi, P. Degano and L. Galletta


    Future work will address proving the security of other obfuscations techniques, and consid-
ering other security properties, e.g., general safeties or hyper-safeties. Here we just considered
a passive attacker that can only observe the leakage, and an interesting problem would be to
explore if our result and the current proof technique scale to a setting with active attackers
that also interferes with the execution of programs. Indeed, recently new secure compilation
principles have been proposed to take active attackers into account [1].

Related Work Program obfuscations are widespread code transformations [18, 12, 16, 11,
4, 25] designed to protect software in settings where the adversary has physical access to the
program and can compromise it by inspection or tampering. A great deal of work has been
done on obfuscations that are resistant against reverse engineering making the life of attackers
harder. However, we do not discuss these papers because they do not consider formal properties
of the proposed transformations. We refer the interested reader to [14] for a recent survey.
    Since to the best our knowledge, ours is the first work addressing the problem of security
preservation, here we focus only on those proposals that formally studied the correctness of
obfuscations. In [23, 24] a formal framework based on abstract interpretation is proposed
to study the effectiveness of obfuscating techniques. This framework not only characterizes
when a transformation is correct but also measures its resilience, i.e., the difficulty of undoing
the obfuscation. More recently, other work went in the direction of fully verified, obfuscating
compilation chains [6, 8, 7]. Among these [8] is the most similar to ours, but it only focusses on
the correctness of the transformation, and studies it in the setting of the CompCert C compiler.
Differently, here we adopted a more foundational approach by considering a core imperative
language and proved that the considered transformation preserves security.
    As for secure compilation, we can essentially distinguish two different approaches. The first
one only considers passive attackers (as we do) that do not interact with the program but that
try to extract confidential data by observing its behaviour. Besides [3], recently there has been
an increasing interest in preserving the verification of the constant time policy, e.g., a version of
the CompCert C compiler [2] has been released that guarantees that preservation of the policy
in each compilation step. The second approach in secure compilation considers active attackers
that are modeled as contexts in which a program is plugged in. Traditionally, this approach
reduces proving the security preservation to proving that the compiler is fully-abstract [21].
However, recently new proof principles emerged, see [1, 22] for an overview.


References
 [1] Carmine Abate, Roberto Blanco, Deepak Garg, Catalin Hritcu, Marco Patrignani, and Jérémy
     Thibault. Journey beyond full abstraction: Exploring robust property preservation for secure
     compilation. In 32nd IEEE Computer Security Foundations Symposium, pages 256–271, 2019.
 [2] Gilles Barthe, Sandrine Blazy, Benjamin Grégoire, Rémi Hutin, Vincent Laporte, David Pichardie,
     and Alix Trieu. Formal verification of a constant-time preserving C compiler. PACMPL,
     4(POPL):7:1–7:30, 2020.
 [3] Gilles Barthe, Benjamin Grégoire, and Vincent Laporte. Secure compilation of side-channel coun-
     termeasures: The case of cryptographic "constant-time". In 31st IEEE Computer Security Foun-
     dations Symposium, CSF, pages 328–343, 2018.
 [4] Mihai Bazon. Uglifyjs - javascript parser, compressor, minifier written in js. http://lisperator.
     net/uglifyjs/. Online; last access Dec 2019.
 [5] Daniel J. Bernstein.      Cache-timing attacks on AES.           https://cr.yp.to/antiforgery/
     cachetiming-20050414.pdf, 2005. Online; last access Nov 2019.

10
Control-flow Flattening Preserves the Constant-Time Policy            M. Busi, P. Degano and L. Galletta


 [6] Sandrine Blazy and Roberto Giacobazzi. Towards a formally verified obfuscating compiler. In SSP
     2012 - 2nd ACM SIGPLAN Software Security and Protection Workshop, 2012.
 [7] Sandrine Blazy and Rémi Hutin. Formal verification of a program obfuscation based on mixed
     boolean-arithmetic expressions. In Proceedings of the 8th ACM SIGPLAN International Confer-
     ence on Certified Programs and Proofs, pages 196–208, 2019.
 [8] Sandrine Blazy and Alix Trieu. Formal verification of control-flow graph flattening. In Proceedings
     of the 5th ACM SIGPLAN Conference on Certified Programs and Proofs, pages 176–187, 2016.
 [9] Matteo Busi, Pierpaolo Degano, and Letterio Galletta. Control-flow flattening preserves the
     constant-time policy (extended version). https://arxiv.org/abs/2003.05836.
[10] Matteo Busi and Letterio Galletta. A brief tour of formally secure compilation. In Pierpaolo
     Degano and Roberto Zunino, editors, Proceedings of the Third Italian Conference on Cyber Secu-
     rity, ITASEC19, volume 2315 of CEUR Workshop Proceedings. CEUR-WS.org, 2019.
[11] Christian Collberg. The tigress c diversifier/obfuscator. http://tigress.cs.arizona.edu/. On-
     line; last access Dec 2019.
[12] Christian S. Collberg and Jasvir Nagra. Surreptitious Software - Obfuscation, Watermarking, and
     Tamperproofing for Software Protection. Addison-Wesley, 2010.
[13] J. A. Goguen and J. Meseguer. Security policies and security models. In IEEE Symposium on
     Security and Privacy, pages 11–20, 1982.
[14] Shohreh Hosseinzadeh, Sampsa Rauti, Samuel Laurén, Jari-Matti Mäkelä, Johannes Holvitie, Sami
     Hyrynsalmi, and Ville Leppänen. Diversification and obfuscation techniques for software security:
     A systematic literature review. Information and Software Technology, 104:72–93, 2018.
[15] Catalin Hritcu, David Chisnall, Deepak Garg, and Mathias Payer. Secure compilation. https:
     //blog.sigplan.org/2019/07/01/secure-compilation/, 2019. Online; last access Dec 2019.
[16] Pascal Junod, Julien Rinaldini, Johan Wehrli, and Julie Michielin. Obfuscator-llvm–software
     protection for the masses. In 2015 IEEE/ACM 1st International Workshop on Software Protection,
     pages 3–9. IEEE, 2015.
[17] Paul C. Kocher. Timing Attacks on Implementations of Diffie-Hellman, RSA, DSS, and Other
     Systems. In Proceedings of the 16th Annual International Cryptology Conference on Advances in
     Cryptology, volume 1109 of LNCS, pages 104–113. Springer-Verlag, 1996.
[18] Tımea László and Ákos Kiss. Obfuscating c++ programs via control flow flattening. Annales
     Universitatis Scientarum Budapestinensis de Rolando Eötvös Nominatae, Sectio Computatorica,
     30(1):3–19, 2009.
[19] David Molnar, Matt Piotrowski, David Schultz, and David A. Wagner. The program counter secu-
     rity model: Automatic detection and removal of control-flow side channel attacks. In Dongho Won
     and Seungjoo Kim, editors, Information Security and Cryptology - ICISC 2005, 8th International
     Conference, volume 3935 of LNCS, pages 156–168, 2005.
[20] Hanne Riis Nielson and Flemming Nielson. Semantics with Applications: An Appetizer. Under-
     graduate Topics in Computer Science. Springer, 2007.
[21] Marco Patrignani, Amal Ahmed, and Dave Clarke. Formal approaches to secure compilation: A
     survey of fully abstract compilation and related work. ACM Computing Surveys, 2019.
[22] Marco Patrignani and Deepak Garg. Robustly safe compilation or, efficient, provably secure
     compilation. CoRR, abs/1804.00489, 2018.
[23] Mila Dalla Preda and Roberto Giacobazzi. Control code obfuscation by abstract interpretation.
     In Third IEEE International Conference on Software Engineering and Formal Methods (SEFM
     2005), 7-9 September 2005, Koblenz, Germany, pages 301–310, 2005.
[24] Mila Dalla Preda and Roberto Giacobazzi. Semantics-based code obfuscation by abstract inter-
     pretation. Journal of Computer Security, 17(6):855–908, 2009.
[25] WebAssembly team. Binaryen - compiler infrastructure and toolchain library for webassembly.
     https://github.com/WebAssembly/binaryen. Online; last access Dec 2019.


                                                                                                     11