In Proceedings of the First Italian Conference on Cybersecurity (ITASEC17), Venice, Italy.
         Copyright c 2017 for this paper by its authors. Copying permitted for private and academic purposes.


                                                                                                          ∗
            Tracking sensitive and untrustworthy data in IoT
                                  Chiara Bodei   Letterio Galletta
                                  {chiara,galletta}@di.unipi.it
                                Dipartimento di Informatica, Università di Pisa


                                                     Abstract
              The Internet of Things (IoT) produces and processes large amounts of data. Among
          these data, some must be protected and others must be carefully handled because they
          come from untrusted sources. Taint analysis techniques can be used to for marking data
          and for monitoring their propagation at run time, so to determine how they influence the
          rest of the computation. Starting from the specification language IoT-LySa, we propose
          a Control Flow Analysis for statically predicting how tainted data spread across an IoT
          system and for checking whether those computations considered security critical are not
          affected by tainted data.


1         Introduction
We are living the Internet of Things (IoT) revolution, where the things we use every day have
computational capabilities and are always connected to the Internet. These devices are equipped
with sensors, produce huge amounts of data, store them on the cloud or use them to affect the
surrounding environment through actuators.
    The IoT paradigm introduces new pressing security challenges. On the one hand, a large
part of the collected data is sensitive and must be protected against attackers. However, these
data need to be shared and processed to be useful. For instance, a fitness app should store and
manipulate data about the heart rate or the expended calories to be helpful for users. On the
other hand, an attacker can easily intercept communications or manipulate sensors to alter data
(tampering). Thus, an IoT system must be resistant against “bad” data that may come from
malicious sources or that may be the result of tampering with “good” sensors. For instance,
deciding to stop an industrial plant in dependence of data coming from tampered sensors can
have severe consequences.
    Usually, formal methods provide designers with tools to support the development of systems
and to reason about their properties. We follow this line of research by presenting preliminary
results about using static analysis to study simple security properties of IoT systems, i.e. net-
works of nodes, where each node interacts with the environment through sensors and actuators.
    Technically, our starting point is the formal specification language IoT-LySa, a process
calculus recently proposed for IoT systems [4, 3]. IoT-LySa may help designers to adopt
a Security by Design development model. Indeed, designers can model the structure of the
system and how its components (smart objects) interact with each other through the IoT-LySa
primitives. Furthermore, they can reason about the system correctness and robustness by using
the Control Flow Analysis (CFA) of IoT-LySa. This analysis safely approximates the system
behaviour, by statically predicting how data from sensors may spread across the system and
how objects may interact with each other. Technically, it “mimics” the evolution of the system,
by using abstract values in place of concrete ones and by modelling the consequences of each

    ∗
        Partially supported by Università di Pisa PRA 2016 64 Project Through the fog.


                                                         38
possible action. Designers can detect possible security flaws through this “abstract simulation”
and intervene as early as possible during the design phase.
    Here, we propose a variant of this CFA for static taint analysis. Taint analysis predicts how
information flows from specific data sources to the computations that use these data [11, 10].
The basic idea is to classify data sources in tainted and untainted and to mark as tainted or
untainted the derived data, through suitable propagation rules.
    In order to detect possible confidentiality or privacy leaks, we mark as tainted data coming
from sensitive sensors and check whether they are protected when in transit. Otherwise, we
mark as tamperable data coming from sensors or memory locations that can be tampered and
check which computations they affect, thus addressing integrity issues. Furthermore, we require
designers (1) to identify some parts of the code as critical (critical points), e.g. a conditional
statement whose result may trigger a risky actuation; and (2) to provide a set of functions that
untaint data. In the case of sensitive data these functions remove privacy-critical information,
e.g. by blurring people faces in surveillance videos. In the case of tamperable data these
functions perform sanitisation. By exploiting the result of our analysis, we check whether (i) a
variable is untrustworthy, because affected by possibly tampered data; (ii) sensitive data may be
leaked; and (iii) computations in critical points depend on tamperable data. As a consequence,
analysis results may help designers in making educated decisions, on the exposure of their data,
as well as on their trustfulness and robustness. As far as the last point is concerned, they can
indeed statically evaluate the impact of a certain amount of tamperable data, e.g. by answering
to question like: how many sensors can be compromised without dramatically impacting on the
overall behaviour of the system? Note that tamperable data may also produce, besides security
issues, quality ones, such as unexpected behaviours.
    Since the CFA computes an over-approximation of the behaviour of a system, if the analysis
does not predict any dependence or any leak, we can be sure that at run time it will never
occur. If instead the CFA does, there is only the possibility of the violation. Nevertheless, it
can be worthwhile to track where the tainted values may come from and decide the points that
deserve to be monitored at run time in order to prevent violations.
    We assume that the processes running inside a node are protected against tampering. This
can be obtained by resorting to standard mechanisms for integrity. However, we assume that
an attacker can tamper with every sensor or memory location considered tamperable. When
we focus on confidentiality, we assume instead that communication is not protected against
eavesdropping, but that the attacker does not intervene in communications. Dealing with an
attacker able to inject forged messages can be managed by following [2]. Finally, we assume to
use perfect cryptography and to have keys, fixed once and for all, exchanged in a secure manner
at deployment time, as it is often the case, e.g. ZigBee [12].
    The paper is organised as follows. In Section 2 we introduce a motivating example, which is
also instrumental in giving an overall picture of our methodology. In Section 3 we briefly recall
the process calculus IoT-LySa as introduced in [4, 3]. Then, we define the CFA for static taint
analysis in Section 4, and we also show how to check the security properties described above.
The Appendix includes part of the technical details of our formalism.


2    A storehouse with perishable food
In this section we illustrate our methodology through a scenario similar to the one of [5].
Consider an IoT system for keeping the temperature under control inside a storehouse storing
perishable food (see Figure 1). The temperature must be regulated depending on the quantity
and the kind of the food. The system also monitors how long the food is kept inside the


                                               39
 Figure 1: The organisation of nodes in the IoT system of a storehouse with perishable food.

storehouse to avoid passing the expiry-date. In the storehouse there are four nodes: N1 , N2 ,
N3 and N4 . The specification of the node N1 in IoT-LySa is as follows:
 N1 = `1 ∶ [Σ1 ∥ Pc ∥ (S0 ∥ S1 ∥ S2 ∥ S3 )]                    where Si = µh.i ∶= v.τ.h     i ∈ [0, 3]
 Pc = µh.z0 ∶= 0.z1 ∶= 1.z2 ∶= 2.z3 ∶= 3.⟨⟨avg(z0 , z1 , z2 , z3 )⟩⟩ ▷ {`3 }.
        th1 − th3 ≤ avg(z0 , z1 , z2 , z3 ) ≤ th2 + th3 ?
                                      th1 ≤ avg(z0 , z1 , z2 , z3 ) ≤ th2 ? h
                                                                          ∶ ⟨j, start⟩. h
                                      ∶ ⟨⟨alarm⟩⟩ ▷ {`3 }.h
The node N1 , uniquely identified in the system by the label `1 , is made of Pc , a software control
process that specifies the logic of the node, and four temperature sensors Si , one for each corner
of the storehouse (a rectangular room). Each sensor Si periodically senses the environment and
sends the temperature through a wireless communication to Pc . In IoT-LySa each sensor inside
a node is identified by an index i and communicates the value v read from the environment by
storing it in the corresponding reserved location i of the shared store Σ1 . In IoT-LySa intra-
node communications are abstractly modelled through operations over the shared store. The
action τ denotes internal actions, which we are not interested in modelling, e.g. noise reduction
of sensed data; the construct µh. implements the iterative behaviour of the sensor. The control
process Pc reads temperatures from sensors Si and stores them to local variables zi through the
assignments zi ∶= i (with i identifier of the sensor Si ); then, it computes the average through
the function avg and sends it to the node N3 with label `3 through the primitive ⟨⟨ ⟩⟩ ▷ . Then,
Pc checks if this average is within the range [th1 , th2 ] (th1 and th2 are the threshold variables
and the primitive ? ∶ is a conditional). If this is not the case and the temperature is out
of range of a value lower than th3 (tolerance threshold), the actuator j is started through the


                                                    40
message ⟨j, start⟩ to accordingly turn on/off the cooling system. If the difference is too high
(greater than th3 ), the control unit sends instead an alarm to the node N3 .
   The node N2 with label `2 does the stocktaking and sends it to the node N3 . We assume
that each wood box containing food is equipped with a RFID read by the reader R0 (0 is the
index of the sensor) of N2 when the box enters the storehouse. The specification of N2 is:

         N2 = `2 ∶ [Σ2 ∥ µh.x ∶= 0.db ∶= update(db, x).τ.h ∥ µh.⟨⟨db⟩⟩ ▷ {`3 }.h ∥ R0 ]

The node is made of two processes. The first reads a value from R0 (defined as the sensors Si
above) and updates the stocktaking db that the second one periodically sends to N3 .
    The node N3 with label `3 works as the system controller and as gateway towards the Cloud
(i.e. the node N4 with label `4 ). It receives the stocktaking and the temperature, and checks
whether the temperature is acceptable for the quantity and the kind of the stored food. If this
is not the case it drives N1 to take the proper actions and raises an alarm through the Cloud.
The specification of the process Pk of N3 performing these checks is as follows:

                  / validRange(db) ? ⟨⟨alarm⟩⟩ ▷ {`4 }.⟨⟨validRange(db)⟩⟩ ▷ {`1 }.h ∶ h
     Pk = µh.temp ∈

where validRange, given the current content of the storehouse, returns a range of admissible
temperatures to be maintained inside the room.
    Since an attacker may manipulate data from sensors and also the sensors themselves, we
consider sensors as tamperable data sources, and we tag with the taint label for tamperable
data  the locations of Σ1 reserved for the sensors Si of N1 . Similarly, we tag the location of
Σ2 reserved for R0 in N2 . As discussed above, N3 controls the system based on data received
by others nodes. Thus, we mark the code checking if the temperature is right (the guard of the
conditional of Pk ) as a critical point.
    However, our CFA can detect that our system is not robust against data manipulations.
Indeed, an attacker by manipulating the sensors Si and R0 can interfere or drive the decision of
N3 and may damage the system, because the computations in the critical point directly depend
on these data. This is because the nodes N1 and N2 simply forward their data to N3 without
performing any sanitisation. In particular, the analysis propagates the tag  and detects that
local variables zi inside N1 are also tamperable, and so it is the computed average that is sent
to N3 . The same holds for the stocktaking computed by N1 .
    To solve this problem we need to modify our design to perform data sanitisation. The CFA
can help us during this process because we can try different approaches and test them through
the analysis (what if reasoning). For instance, assume that the attacker can tamper with only
one sensor, e.g. S0 . A possible approach to deal with this fact is to modify the behaviour of N1
by observing that sensors on the same side of the room should perceive the same temperature
with a difference that can be at most a given value . The same happens when we consider
the difference between two consecutive not manipulated samples: this difference is at most
a given value δ. In this way we may compare data and discover possible manipulations and
discard possibly tampered data. These checks can be implemented in N1 specification through
a function adjust that returns the temperature read from a sensor, adjusted taking into account
the previous samples and temperature of the adjacent sensor. In the specification of Pc the
assignment to the variable z0 would become z0 = adjust(0, 3, s0 ) where s0 contains the previous
samples of the sensor S0 . The same happens for the other variables. Thus, the function adjust
sanitises the input and does not propagate taint information, i.e. it never returns a value with
tag . By using the CFA we can be sure that N1 sends no tainted data to the node N3 .


                                               41
3    The calculus IoT-LySa
Here, we briefly review the process calculus IoT-LySa [4, 3]. Differently from other process
calculus approches to IoT, e.g. [8, 9], the focus of IoT-LySa is on the development of a design
framework that includes a static semantics to support verification techniques and tools for
certifying properties of IoT applications.
    Systems in IoT-LySa have a two-level structure and consist of a finite number of nodes
(things), each of which hosts a store for internal communication and a finite number of control
processes (representing the software), sensors and actuators. We assume that each sensor
(actuator) in a node with label ` is uniquely identified by an index i ∈ I` (j ∈ J` , resp). Data
                                          ′
are represented by terms. Labels a, a , ai , ..., ranged over by A, identify the occurrences of
terms. They are used in the analysis and do not affect the dynamic semantics. Formally, the
syntax of labelled expression and the one of unlabelled terms are as follows.


 E ∋E    ∶∶= labelled terms               M ∋ M, N     ∶∶= terms
                                                                         value (v ∈ V)
            a
         M                labelled term                v
                          with a ∈ A                   i                 sensor location (i ∈ I` )
                                                       x
                                                       {E1 , ⋯, Er }k0   encryption with key k0 ∈ K
                                                       f (E1 , ⋯, Er )   function on data

 We assume as given a finite set K of secret keys owned by nodes, previously exchanged in a
secure way, as it is often the case [12]. The encryption function {E1 , ⋯, Er }k0 returns the result
of encrypting values Ei for i ∈ [1, r] under the shared key k0 . The term f (E1 , ⋯, Er ) is the
application of function f to r arguments; we assume given a set of primitive functions, typically
for aggregating or comparing values. We assume the sets V, I`, J`, K be pairwise disjoint.
    The syntax of systems of nodes and of its components is as follows.


 N ∋ N ∶∶=   systems of nodes                      B ∋ B ∶∶=   node components
             0              empty system                       Σ`           node store
             ` ∶ [B]        single node (` ∈ L)                P            process
             N1 ∣ N2        par. composition                   S            sensor (label i ∈ I` )
                                                               A            actuator (label j ∈ J` )
                                                               B∥B          par. composition


 A node ` ∶ [B] is uniquely identified by a label ` ∈ L that may represent further characterising
information (e.g. node location). Sets of nodes are described through the (associative and
commutative) operator ∣ for parallel composition. The 0 denotes a system with no nodes.
Inside a node ` ∶ [B] there is a finite set of components described by the parallel operator ∥.
We impose that there is a single store Σ` ∶ X ∪ I` → V, where X , V are the sets of variables
and of values (integers, booleans, ...), resp.
    The store is essentially an array whose indexes are variables and sensors identifiers i ∈ I`
(no need of α-conversions). We assume that store accesses are atomic, e.g. through CAS
instructions [7]. The other node components are control processes P , and sensors S (less than
#(I` )), and actuators A (less than #(J` )) the actions of which are in Act. The syntax of
processes is as follows.


                                                  42
    P ∶∶=    0                                                    inactive process
             ⟨⟨E1 , ⋯, Er ⟩⟩ ▷ L. P                               asynchronous multi-output L ⊆ L
             (E1 , ⋯, Ej ; xj+1 , ⋯, xr ). P                      input (with matching)
             decrypt E as {E1 , ⋯, Ej ; xj+1 , ⋯, xr }k0 in P     decryption with key k0 (with match.)
             E?P ∶ Q                                              conditional statement
             h                                                    iteration variable
             µh. P                                                tail iteration
             x ∶= E. P                                            assignment to x ∈ X
             ⟨j, γ⟩. P                                            output of action γ to actuator j

The prefix ⟨⟨E1 , ⋯, Er ⟩⟩ ▷ L implements a simple form of multi-party communication: the
tuple obtained by evaluating E1 , . . . , Er is asynchronously sent to the nodes with labels in
L that are “compatible” (according, among other attributes, to a proximity-based notion).
The input prefix (E1 ,⋯, Ej ; xj+1 ,⋯, xr ) receives a r-tuple, provided that its first j elements
match the corresponding input ones, and then assigns the variables (after “;”) to the received
values. Otherwise, the r-tuple is not accepted. A process repeats its behaviour, when de-
fined through the tail iteration construct µh.P (h is the iteration variable). The process
decrypt E as {E1 , ⋯, Ej ; xj+1 , ⋯, xr }k0 in P tries to decrypt the result of the expression E
with the shared key k0 ∈ K. Also in this case we use the pattern matching. If the pattern
matching succeeds, the process continues as P and the variables xj+1 , . . . , xr are suitably as-
signed. Sensors and actuators have the form:
         S    ∶∶= sensors                     A       ∶∶= actuators
              0         inactive sensor               0           inactive actuator
              τ.S       internal action               τ.A         internal action
              i ∶= v. S store of v ∈ V                (∣j, Γ∣). A command for actuator j (Γ ⊆ Act)
                                 th
                        by the i sensor               γ.A         triggered action (γ ∈ Act)
              h         iteration var.                h           iteration var.
              µh.S      tail iteration                µh.A        tail iteration

A sensor can perform an internal action τ or store the value v, gathered from the environment,
into its store location i. An actuator can perform an internal action τ or execute one of its
action γ, possibly received from its controlling process. Both sensors and actuators can iterate.
For simplicity, here we neither provide an explicit operation to read data from the environment,
nor to describe the impact of actuator actions on the environment.
    The semantics is based on a standard structural congruence and a two-level reduction relation
→ defined as the least relation on nodes and its components, where we assume the standard
denotational interpretation [[E]]Σ for evaluating terms. The complete semantic is in Appendix.
The reader not interested in the technical details can safely skip it without compromising the
comprehension of the rest of the paper. As examples of semantic rules, we show the rules
(Ev-out) and (Multi-com) that drive asynchronous multi-communications.
               (Ev-out)
                                             ⋀i=1 vi = [[Ei ]]Σ
                                                  r


             Σ ∥ ⟨⟨E1 , ⋯, Er ⟩⟩ ▷ L. P ∥ B → Σ ∥ ⟨⟨v1 , ⋯, vr ⟩⟩ ▷ L.0 ∥ P ∥ B
    (Multi-com)
                         `2 ∈ L ∧ Comp(`1 , `2 ) ∧ ⋀i=1 vi = [[Ei ]]Σ2
                                                    j


      `1 ∶ [⟨⟨v1 , ⋯, vr ⟩⟩ ▷ L. 0 ∥ B1 ] ∣ `2 ∶ [Σ2 ∥ (E1 , ⋯, Ej ; xj+1 , ⋯, xr r ).Q ∥ B2 ] →
                                                                           aj+1     a

       `1 ∶ [⟨⟨v1 , ⋯, vr ⟩⟩ ▷ L \ {`2 }. 0 ∥ B1 ] ∣ `2 ∶ [Σ2 {vj+1 /xj+1 , ⋯, vr /xr } ∥ Q ∥ B2 ]


                                                         43
In the first rule, to send a message ⟨⟨v1 , ..., vr ⟩⟩ obtained by the evaluation of ⟨⟨E1 , ..., Er ⟩⟩,
a node with label ` spawns a new process, running in parallel with the continuation P ; this
new process offers the evaluated tuple to all the receivers with labels in L. In the second rule,
the message coming from `1 is received by a node labelled `2 , provided that: (i) `2 belongs
to the set L of possible receivers, (ii) the two nodes satisfy a compatibility predicate Comp
(e.g. when they are in the same transmission range), and (iii) that the first j values match with
the evaluations of the first j terms in the input. Moreover, the label `2 is removed by the set of
receivers L of the tuple. The spawned process terminates when all the receivers have received
the message (L = ∅).


4     Control flow analysis
Here we present a CFA for static taint analysis to track the propagation of sensitive data and
data coming from possibly tampered sensors or variables. This CFA follows the same schema of
the one in [4, 3] for IoT-LySa. However, here we use different abstract values and propagation
rules. Intuitively, abstract values “symbolically” represent runtime data so as to encode whether
these data are tainted or not. Furthermore, we define functions over abstract values encoding
how information about the taint is propagated to derived data. Finally, we show how to use the
CFA results to check whether data from tamperable sources affect variables and computations
considered security critical. We also check whether sensitive data are leaked.

Taint information To formally introduce our abstract values and operators through which
we can combine and manipulate them, we first define the set of taint labels B (ranged over by
b, b1 , ...) whose elements (the colours in the pdf should help the intuition) are:

         ◇ untainted           sensitive      tamperable          sensitive & tamperable

The idea is that these labels mark our abstract values with information about their sources.
Formally, abstract values are pairs in V̂ defined as follows, where b ∈ B.

             V̂ ∋ v̂ ∶∶=   abstract terms
                           (⊤, b)                  abstract value denoting cut (see below)
                           (ν, b)                  abstract value for clear data
                           ({v̂1 , ⋯, v̂n }k0 , b) abstract value for encrypted data

For simplicity, hereafter we write them as ⊤ , ν , {v̂1 , ⋯, v̂n }k0 , and indicate with ↓i the projec-
                                                b   b             b

                           th
tion function on the i component of the pair. We naturally extend the projection to sets, i.e.
V̂↓i = {v̂↓i ∣v̂ ∈ V̂ }, where V̂ ⊆ V̂. In the abstract value ν , ν abstracts the concrete value from
                                                               b

sensors or computed by a function in the concrete semantics, while the first value of the pair
{v̂1 , ⋯, v̂n }k0 abstracts encrypted data. Since the dynamic semantics may introduce encrypted
               b

                                                                                     b
terms with an arbitrarily nesting level, we have the special abstract values ⊤ that denote all
the terms with a depth greater than a given threshold d. During the analysis, to cut these
values, we will use the function ⌊−⌋d , defined as expected (see Appendix). Note that once given
the set of encryption functions occurring in a node N , the abstract values are finitely many.
     We assume that designers provide the analysis with a classification of the data sources, i.e.
the set of sensitive sensors S` , and the set of the tamperable sources (sensors and variables) T` ,
for each node with label `.


                                                    44
                  ⊗     ◇                             ⊗          ◇          L      L           L
                  ◇     ◇                             ◇          ◇L         L      L           L
                                                       L′         L′       L∩L′   L∩L′        L∩L′
                                                 L′          L′   L∩L′      L∩L′        L∩L′
                                                 L′          L′   L∩L′      L∩L′        L∩L′

          Table 1: The operator ⊗, version 1 (on the left) and version 2 (on the right)


Definition 4.1 (Data classification). Given the set of sensitive sensors S` , and the set of the
tamperable sensors and variables T` , the taint assignment function τ is defined as follows:

                           ⎧
                           ⎪
                           ⎪
                                     if y ∈ S`
                           ⎪
                           ⎪
                           ⎪
                τ (y, `) = ⎪
                           ⎨                                                              τ (v, `) = ◇
                                     if y ∈ T`
                           ⎪
                           ⎪
                                                               where y ∈ I` ∪ X
                           ⎪
                           ⎪
                                    if y ∈ S` ∩ T`
                           ⎪
                           ⎪
                           ⎩◇        o.w.

    The above function classifies only the data source; to propagate the taint information we
resort to the taint combination operator ⊗ ∶ B × B → B defined in Table 1, on the left part.
Note that ⊗ works as a join operator making our abstract values a lattice similar to the 4-point
lattice LL,LH,HL,HH, used in the information flow literature, which combines confidentiality
                                                                                ′
                                                                               b
and integrity. This operator naturally extends to abstract values: b ⊗ v̂ = v̂↓1 where v̂ ∈ V̂ and
b = b ⊗ v̂↓2 ; and to sets of abstract values: b ⊗ V̂ = {b ⊗ v̂∣v̂ ∈ V̂ ⊆ V̂}.
 ′

    Now we need to specify how taint information propagate with aggregation functions. Below
we define two simple propagation policies for function applications and encryptions. In the
case of function application, the idea is that a single tainted argument turns to tainted the
result of the application. Encryption protects its sensitive data, but preserves tamperable taint
information. This means that encryption as a structure is marked with label ◇, unless one of
its component is marked with  or ; in these cases, encryption is labelled with .

Definition 4.2 (Taint propagation policies). Given the combination operator ⊗ ∶ B × B → B,
the taint resulting by the application of
  • a function f is Fτ (f, v̂1 , . . . , v̂r ) = ⊗(v̂1↓2 , . . . , v̂r↓2 )
                                                                             if ∀i.v̂i↓2 ∈ {◇,    }
  • an encryption function is Encτ (v̂1 , . . . , v̂r ) = {
                                                                     ◇
                                                                            o.w.

    It is obviously possible to consider different policies for propagation, e.g. we could consider
a set of functions that produce sensitive data independently from the taint information of its
arguments. Moreover, we could deal with anonymisation functions that process their arguments
to remove sensitive information. Consider e.g. blurring the faces of people in surveillance videos
to protect their privacy. A policy for these functions is similar to the one for the encryption:
the resulting taint label is ◇, if no argument is also tamperable, otherwise is .
    Finally, we could extend the whole presented schema to keep also the source nodes of tainted
information. To do that, we could introduce in our abstract values also the labels of nodes from
which the taint information derives. To deal with this new information, the taint assignment
function τ returns pairs (b, L), written bL (b` when L = {`}), our combinator operator becomes
the one shown on the right part in Table 1, and the encryption operator takes ` as argument
and uses it to annotate the resulting taint label.


                                                              45
CFA validation and correctness We now have all the ingredients to define our CFA to
approximate communications and data stored and exchanged and, in particular, the taint con-
tents of messages and values. We specify our analysis in a logical form through a set of inference
rules expressing the validity of the analysis results. The analysis result is a triple (Σ,
                                                                                        ̂ κ, Θ) (a
pair (Σ, Θ) when analysing a term), called estimate for N (for E), where Σ, κ, and Θ are the
       ̂                                                                      ̂
following abstract domains:
   • the union Σ̂ = ⋃`∈L Σ̂` of the super-sets Σ̂` ∶ X ∪ I` → 2V̂ of abstract values that may
     possibly be associated to a given location in I` or a given variable in X ,
   • a super-set κ ∶ L → L × ⋃i=1 V̂ of the messages that may be received by the node `, and
                                 k   i

                                           ̂
                                           V
   • a super-set Θ ∶ L → A → 2 of the taint information of the actual values computed by
                         a
     each labelled term M in a given node `, at run time.
Once a proposed estimate is available, its correctness has to be validated. This requires that it
satisfies the judgements defined on the syntax of nodes, node components and terms. They are
defined by a set of clauses, fully presented in Appendix (see Tables 3 and 4). Here, we show
some examples. The judgements for labelled terms have the form (Σ,   ̂ Θ) ⊧` M a . For each term
  a
M occurring in the node `, the corresponding judgement requires that Θ(`)(a) includes all
                                       a                                                       a
the abstract values v̂ associated to M . Consider, e.g. the following clause for the variable x .

                                           τ (x, `) ⊗ Σ̂` (x) ⊆ Θ(`)(a)

                                                  (Σ,
                                                   ̂ Θ) ⊧` x   a


In this case, an estimate is valid if Θ(`)(a) includes the abstract values resulting from the
combination (through the operator ⊗) of the taint information associated to the variable x (via
τ (x, `)), and the abstract values bound to x in Σ̂ ` . This combination allows us to propagate
the tamperable taint if x belongs to the set of tamperable variables.
    The judgements for nodes have the form (Σ,  ̂ κ, Θ) ⊧ N . The rule for a single node ` ∶ [B]
requires that its internal components B are analysed with judgements (Σ,      ̂ κ, Θ) ⊧` B. As
examples of clauses, we consider the clauses for communication.
                                       ̂ Θ) ⊧` Mi i ∧ (Σ,
                                 ⋀i=1 (Σ,              ̂ κ, Θ) ⊧` P ∧
                                       k                a

                 ∀v̂1 , ⋯, v̂r ∶ ⋀i=1 v̂i ∈ Θ(`)(ai ) ⇒ ∀` ∈ L ∶ (`, ⟨⟨v̂1 , ⋯, v̂r ⟩⟩) ∈ κ(` )
                                   r                               ′                              ′

                                     ̂ κ, Θ) ⊧` ⟨⟨M1a1 , ⋯, Mrar ⟩⟩ ▷ L. P
                                    (Σ,
                                               ⋀i=1 (Σ,
                                                     ̂ Θ) ⊧` Miai ∧
                                                j


   ∀(` , ⟨⟨v̂1 , ⋯, v̂r ⟩⟩) ∈ κ(`) ∶ Comp(` , `) ⇒ (⋀i=j+1 v̂i ⊗ τ (xi , `) ∈ Σ̂` (xi ) ∧ (Σ,
                                                                                           ̂ κ, Θ) ⊧` P )
      ′                                    ′         r


                               (Σ,
                                ̂ κ, Θ) ⊧` (M1 1 , ⋯, Mj ; xj+1 , ⋯, xr r ). P
                                                    a        aj        aj+1   a


An estimate is valid for multi-output if it is valid for the continuation of P and the set of messages
                                                    ′
communicated by the node ` to each node ` in L, includes all the messages obtained by the
evaluation of the r-tuple ⟨⟨M1 , ⋯, Mr ⟩⟩. More precisely, the rule (i) finds the sets Θ(`)(ai ) for
                                     a1      ar

each term Mi i , and (ii) for all tuples of values (v̂1 , ⋯, v̂r ) in Θ(`)(a1 ) × ⋯ × Θ(`)(ar ) it checks
             a

if they belong to κ(` ) for each ` ∈ L. Symmetrically, the rule for input requires that the values
                         ′              ′

inside messages that can be sent to the node `, passing the pattern matching, are included in
                                                                                                           a
the estimates of the variables xj+1 , ⋯, xr . More in detail, the rule analyses each term Mi i , and
requires that for any message that it can receive, i.e. (` , ⟨⟨v̂1 , ⋯, v̂j , v̂j+1 , . . . , v̂r ⟩⟩) in κ(`) and
                                                                ′
         ′
Comp(` , `), v̂j+1 , . . . , v̂r are included and combined with the estimates of xj+1 , ⋯, xr .


                                                        46
Example 4.3. Consider again our example of the storehouse in Sect. 2. A valid estimate
(Σ,
 ̂ κ, Θ) must include the following entries, where τ (z0 , `) = , ν Fτ (avg,ν ,ν ,ν ,ν ) = ν  ,
                                                                               ◇ ◇ ◇


and Avg, and Db are the labels of the terms avg(z0 , z1 , z2 , z3 ) and db.

      Σ̂`1 (z0 ) ⊇ {ν }   Σ̂`2 (x) ⊇ {ν }   Σ̂`2 (db) ⊇ {ν }                     Σ̂`3 (temp) ⊇ {ν }
                                                                                               

      Θ(`1 )(Avg) ⊇ {ν } Θ(`2 )(Db) ⊇ {ν } κ(`3 ) ⊇ {(`1 , ν )}                  κ(`3 ) ⊇ {(`2 , ν )}
                                                                                               


Indeed, an estimate must satisfy the checks of the CFA rules. For instance, the validation of
Pc requires, in particular, that ν is in Σ̂`1 (z0 ) for the rule for variables, (`1 , ν ) ∈ κ(`3 ) for
                                                                                      

the rule for output, and ν ∈ Θ(`1 )(Avg) for the rule of functions (see Appendix).
                           


    Our analysis respects the operational semantics of IoT-LySa. As usual, we prove a subject
reduction result for our analysis. The proofs follow the usual schema and benefit from an
instrumented denotational semantics for expressions, the values of which are pairs ⟨v, v̂⟩ where
v is a concrete value and v̂ is the corresponding abstract value. The formal definition can be
                                      i
found in Appendix. The store (Σ` with an undefined ⊥ value) is accordingly extended. Of
course, we compute the second component of the pair by using the operators introduced in
Definition 4.2, while, the semantics used in Table 2 uses the projection on the first component.
    The following subject reduction theorem establishes the correctness of our CFA, by relying
on the relation ⋈ that says when the concrete and the abstract stores agree. Its definition is
immediate, since the analysis only considers the second component of the extended store, i.e.
                     i
the abstract one: Σ` ⋈ Σ ̂ ` iff w ∈ X ∪ I` such that Σi` (w) ≠ ⊥ implies (Σi` (w))↓2 ∈ Σ
                                                                                        ̂ ` (w).

Theorem 4.4 (Subject reduction). If (Σ,
                                     ̂ κ, Θ) ⊧ N and N → N and ∀Σ` in N it is Σ` ⋈ Σ
                                                                                   ̂ `,
                                                                             ′       i                i

then (Σ,
      ̂ κ, Θ) ⊧ N and ∀Σ` in N it is Σ` ⋈ Σ  ̂ `.
                 ′       i′     ′       i′


Checking taint We now show that by inspecting the results of our CFA, we detect whether a
variable receives a value coming from a tamperable source, and hence is not trustworthy. Then,
we rephrase the confidentiality property of [3] in terms of propagation of sensitive contents, in
order to prevent their leaks. Finally, we check when a computation, considered security critical
by designers, depends on a tamperable source (a generalisation of the first property).
                                                    a            a
                                                M1 1 ,...,Mr r       ′                    a
   In the following, we denote with N −−−−−−−−−→` N when all the terms Mi i are evaluated
                              ⟨⟨v1 ,...,vr ⟩⟩
inside node `, and with N −−−−−−−→`1 ,`2 N when the message ⟨⟨v1 , . . . , vr ⟩⟩ is sent from the
                                                        ′

node `1 to the node `2 .
    First we characterise when a variable x of a node ` is untrustworthy, i.e. when it stores
a value propagated from a tamperable source, and therefore with taint label in {, }. To
                                                                        ̂ and the taint labels
statically check this property we can simply inspect the abstract store Σ
of the corresponding abstract values.
Definition 4.5. Let N be a system of nodes with labels in L, and T = {T` ∣ ` ∈ L} be the set
of tamperable sources. Then, the variable x of a node ` ∈ L is untrustworthy w.r.t. T , if for all
derivatives N s.t. N → N it holds that Σ` (x)↓2 ∈ {, } where Σ` is the store of ` in N .
             ′         ∗   ′               ′i                         ′i                       ′


Theorem 4.6. Let N be a system of nodes with labels in L, and let T = {T` ∣ ` ∈ L} the set of
tamperable sources. Then a variable x of the node ` is untrustworthy w.r.t. T if (Σ,
                                                                                  ̂ κ, Θ) ⊧ N ,
    ̂ ` (x)↓2 ⊆ {, }.
and Σ
    We define the confidentiality property in terms of sensitive taint information. There are no
leaks when messages do not expose values with taint label in { , }. We statically verify it,
by inspecting the labels of the corresponding abstract values in the component κ.


                                                            47
Definition 4.7. Let N be a system of nodes with labels in L, and S = {S` ∣ ` ∈ L} the set of
                                                          ∗  ′
its sensitive sensors. Then N has no leaks w.r.t. S if N → N and, for all `1 , `2 ∈ L, there is
                 ′ ⟨⟨v1 ,...,vn ⟩⟩
no transition N −−−−−−−−→`1 ,`2 N such that vi↓2 ∈ { , } for some i.
                                           ′′


Theorem 4.8. Let N be a system of nodes with labels in L, and S = {S` ∣ ` ∈ L} the set of
its sensitive sensors. Then N has no leaks w.r.t. S if (Σ,      ̂ κ, Θ) ⊧ N , and ∀`1 , `2 ∈ L such that
(`2 , ⟨⟨v̂1 , ⋯, v̂r ⟩⟩) ∈ κ(`2 ) we have that ∀i. v̂i↓2 ∈ {◇, }.

    The last property characterises when computations considered security critical are not di-
rectly or indirectly affected by tainted data i.e. they are reached by untrustworthy data. We
assume that the designer identify a set P ⊆ A of critical points in the application, i.e. points
where possibly tainted data should flow, unless they are checked for validity. We statically
verify this property, by inspecting the taint labels of the values in Θ for each critical point.

Definition 4.9. Let N be a system of nodes with labels in L, S = {S` ∣ ` ∈ L} the set of its
sensitive sensors, T = {T` ∣ ` ∈ L} the set of its tamperable sources, and P a set of program
                                                                              ∗    ′
critical points. Then N does not use tainted values in a critical point if N → N and there
is no transition N −−−−−−−−−→` N s.t. ai ∈ P and ([[Mi i ]]Σi` )↓2 ⊆ { , , } for some
                             a       a
                      ′   M1 1 ,...,Mr r        ′′                 a

i ∈ {1, . . . , r} and ` ∈ L.

Theorem 4.10. Let N be a system of nodes with labels in L, S = {S` ∣ ` ∈ L} the set of its
sensitive sensors, T = {T` ∣ ` ∈ L} the set of its tamperable sources, and P a set of program
critical points. Then N does not use tainted values in a program critical point if (Σ,
                                                                                    ̂ κ, Θ) ⊧ N ,
and Θ(`)(a)↓2 = {◇} for all labels a ∈ P and ` ∈ L.

Example 4.11. Back to our example, consider the critical point (temp ∈       / validRange(db))
                                                                                                      a

in the process Pk . Our CFA detects that this decision is based on possibly tainted values, since
Θ(`3 )(a)↓2 = {}. More in detail, the analysis of the condition depends on the analysis of
temp and on the one of validRange(db), which in turn depends on the one of db (temp and
db are both untrustworthy, see Ex. 4.3). By using the labelled version of abstract values and
the operator ⊗, as defined on the right part in Table 1, we can determine where the possibly
tampered data originally come from, i.e. which are the “bad sinks”. In our example, we would
obtain `1 and `2 , because Σ̂`3 (temp) ⊇ {ν `1 } and Σ̂`2 (db) ⊇ {ν `2 }. Similarly, we can trace
                                                                  

back possible leakages.


5     Conclusions
Based on IoT-LySa, we developed a CFA for static taint analysis to track the propagation of
sensitive data and data coming from possibly tampered sensors or variables, as illustrated by a
motivating example that provides a simple but non-trivial application of our approach. Taint
tracking is a relevant issue in security and can be used for checking both confidentiality and
integrity of data (see [6, 11, 10] to cite only a few). The idea underlying our approach requires
that the designer classify data sources as untainted, sensitive, tamperable. Static analysis has
the advantage of giving hints on propagation of tainted data in a early phase of system design,
thus guiding designers to understand the potential vulnerabilities and to direct their efforts
towards the suitable modifications and validity checks. It is not meant to being a substitute of
dynamic checks, but only as a supporting technique.
    Here we relied on CFA approximations for checking various security properties about data
propagation. In particular, we could detect if a variable is untrustworthy, because affected


                                                     48
by possibly tampered data; if sensitive data are leaked; and if computations in critical point
depend on tamperable data. Our CFA could also provide a prescriptive usage (along the lines
of [1]), by embedding taints in data and forcing some data to be accepted only if they come
with the expected taints.
    In future we would like to extend our analysis to also deal with implicit flows of tainted
data and to understand the relationships with the properties based on implicit flow (see [10]).


References
 [1] C. Bodei, L. Brodo, P. Degano, and H. Gao. Detecting and preventing type flaws at static time.
     Journal of Computer Security, 18(2):229–264, 2010.
 [2] C. Bodei, M. Buchholtz, P. Degano, F. Nielson, and H. Riis Nielson. Static validation of security
     protocols. Journal of Computer Security, 13(3):347–390, 2005.
 [3] C. Bodei, P. Degano, G-L. Ferrari, and L. Galletta. A step towards checking security in IoT. In
     Procs. of ICE 2016, volume 223 of EPTCS, pages 128–142, 2016.
 [4] C. Bodei, P. Degano, G-L. Ferrari, and L. Galletta. Where do your IoT ingredients come from?
     In Procs. of Coordination 2016, volume 9686 of LNCS, pages 35–50. Springer, 2016.
 [5] C. Bodei and L. Galletta. The cost of securing IoT communications. In Procs.of Italian Conference
     on Theoretical Computer Science, CEUR Proceedings Vol-1720, 2016.
 [6] W. Enck, P. Gilbert, S. Han, Vasant Tendulkar, Byung-Gon Chun, L. P. Cox, J. Jung, P. McDaniel,
     and A. N. Sheth. Taintdroid: An information-flow tracking system for realtime privacy monitoring
     on smartphones. ACM Trans. Comput. Syst., 32(2):5:1–5:29, 2014.
 [7] M. Herlihy. Wait-free synchronization. ACM Trans. Program. Lang. Syst., 13(1), 1991.
 [8] I. Lanese, L. Bedogni, and M. Di Felice. Internet of things: a process calculus approach. In Procs
     of the 28th Annual ACM Symposium on Applied Computing, SAC ’13, pages 1339–1346. ACM,
     2013.
 [9] R. Lanotte and M. Merro. A semantic theory of the Internet of Things. In Procs. of Coordination
     2016, volume 9686 of LNCS, pages 157–174. Springer, 2016.
[10] D. Schoepe, M. Balliu, B. C. Pierce, and A. Sabelfeld. Explicit secrecy: A policy for taint tracking.
     In IEEE European Symposium on Security and Privacy, EuroS&P 2016, pages 15–30, 2016.
[11] E. J. Schwartz, T. Avgerinos, and D. Brumley. All you ever wanted to know about dynamic
     taint analysis and forward symbolic execution (but might have been afraid to ask). In 31st IEEE
     Symposium on Security and Privacy, S&P 2010, pages 317–331. IEEE Computer Society, 2010.
[12] T. Zillner. ZigBee Exploited, 2015.


A      Appendix
A.1     Operational Semantics of IoT-LySa
Our reduction semantics is based on the following Structural congruence ≡ on nodes and node
components. It is standard except for rule (3) that equates a multi-output with no receivers
and the inactive process, and for the fact that inactive components of a node are all coalesced.

                      (1)   (N /≡ , ∣, 0) and (B/≡ , ∥, 0) are commutative monoids
                      (2)   µ h . X ≡ X{µ h . X/h} for X ∈ {P, A, S}
                      (3)   ⟨⟨E1 , ⋯, Er ⟩⟩ ∶ ∅. 0 ≡ 0

We have a two-level reduction relation → defined as the least relation on nodes and its com-
ponents satisfying the set of inference rules in Table 2. For the sake of simplicity, we use


                                                   49
                                                                                                                (Asgm)
                                                                                                                                            [[E]]Σ = v
     (S-store)

                                                                                                                Σ ∥ x ∶= E. P ∥ B → Σ{v/x} ∥ P ∥ B
     Σ ∥ i ∶= v . Si ∥ B → Σ{v/i} ∥ Si ∥ B
                        ′
                                                                                                                        a
                a      a

           (Cond1)                                                                                      (Cond2)
                      [[E]]Σ = true                                                                                         [[E]]Σ = false
                    Σ ∥ E? P1 ∶ P2 ∥ B → Σ ∥ P1 ∥ B                                                     Σ ∥ E? P1 ∶ P2 ∥ B → Σ ∥ P2 ∥ B
   (A-com)                                                                                                      (Act)                                     (Int)
                                        γ∈Γ
    ⟨j, γ⟩. P ∥ (∣j, Γ∣). A ∥ B → P ∥ γ. A ∥ B                                                                  γ.A → A                                   τ. X → X

  (Decr)
                                                [[E]]Σ = {v1 , ⋯, vr }k0 ∧ ⋀i=1 vi = [[Ei ]]Σ
                                                                                                        j                   ′


  Σ ∥ decrypt E as {E1 , ⋯, Ej ; xj+1 , ⋯, xr r }k0 in P ∥ B → Σ{vj+1 /xj+1 , ⋯, vr /xr }∥ P ∥ B
                                        ′           ′       aj+1                  a

                (Ev-out)
                                            ⋀i=1 vi = [[Ei ]]Σ
                                               r


                            Σ ∥ ⟨⟨E1 , ⋯, Er ⟩⟩ ▷ L. P ∥ B → Σ ∥ ⟨⟨v1 , ⋯, vr ⟩⟩ ▷ L.0 ∥ P ∥ B


           (Multi-com)
                                                   `2 ∈ L ∧ Comp(`1 , `2 ) ∧ ⋀i=1 vi = [[Ei ]]Σ2
                                                                                                                    j


            `1 ∶ [⟨⟨v1 , ⋯, vr ⟩⟩ ▷ L. 0 ∥ B1 ] ∣ `2 ∶ [Σ2 ∥ (E1 , ⋯, Ej ; xj+1 , ⋯, xr r ).Q ∥ B2 ] →
                                                                                                                                    aj+1              a

             `1 ∶ [⟨⟨v1 , ⋯, vr ⟩⟩ ▷ L \ {`2 }. 0 ∥ B1 ] ∣ `2 ∶ [Σ2 {vj+1 /xj+1 , ⋯, vr /xr } ∥ Q ∥ B2 ]

   (Node)                                           (ParN)                                              (ParB)                                                (CongrY)
             ′                                               ′                                                    ′                                             ′              ′
       B → B                                           N1 → N1                                              B1 → B1                                           Y1 ≡ Y1 → Y2 ≡ Y2
    ` ∶ [B] → ` ∶ [B ]                              N1 ∣N2 → N1 ∣N2                                     B1 ∥B2 → B1 ∥B2
                                ′                                         ′                                                     ′                                         ′           ′
                                                                                                                                                                      Y1 → Y2


Table 2: Reduction semantics (the upper part on nodes, the lower one on node components),
where X ∈ {S, A} and Y ∈ {N, B}.


one relation. We assume the standard denotational interpretation [[E]]Σ for evaluating terms,
while for the proof of the subject reduction results we resort to an instrumented denotational
semantics for expressions, the values of which are pairs ⟨v, v̂⟩ where v is a concrete value and v̂
is the corresponding abstract value:

    [[v ]]Σi = (v , ν )                                               [[i ]]Σi = Σ` (i)                                                     [[x ]]Σi = Σ` (x)
       a    i              a    ◇                                             a       i             i                                         a           i           i

    [[f (E1 , ⋯, Er )]]Σi = (f ([[E1 ]]Σi ↓1 , ⋯, [[Er ]]Σi ↓1 ), Fτ (f , [[E1 ]]Σi ↓2 , ⋯, [[Er ]]Σi ↓2 ))
                `                                                                         `                                                                   `
       a                                i               a             i                                     i                       a                 i                               i

    [[{E1 , ⋯, Er }k ]]Σi = ({[[E1 ]]Σi ↓1 , ⋯, [[Er ]]Σi ↓1 }k , ⌊{[[E1 ]]Σi ↓2 , ⋯, [[Er ]]Σi ↓2 } ⌋d )
                                            `                             `                                     `                                         `                               `
                            a       i                         i                                 i                                   i                             i           b

                                                                                              where b = Encτ ([[E1 ]]Σi ↓2 , ⋯, [[Er ]]Σi ↓2 )
                                        `                         `                                 `                                   `                             `
                                                                                                                                              i                               i
                                                                                                                                                  `                               `


   The first two semantic rules implement the (atomic) asynchronous update of shared vari-
                                                                                             th
ables inside nodes, by using the standard notation Σ{−/−}. According to (S-store), the i
sensor uploads the value v, gathered from the environment, into its store location i. According


                                                                                              50
            τ (i, `) ∈ Θ(`)(a)                      τ (v, `) ∈ Θ(`)(a)                  τ (x, `) ⊗ Σ̂` (x) ⊆ Θ(`)(a)
                 (Σ,
                  ̂ Θ) ⊧` i                           (Σ,
                                                       ̂ Θ) ⊧` v                                   (Σ,
                                                                                                    ̂ Θ) ⊧` x
                               a                                   a                                           a


                                                      ⋀i=1 (Σ,
                                                            ̂ Θ) ⊧` Mi i ∧
                                                        k                    a


                 ∀ v̂1 , .., v̂r ∶ ⋀i=1 v̂i ∈ Θ(`)(ai ) ⇒ ⌊{v̂1 , .., v̂r }Enc τ (v̂1 ,..,v̂r )
                                                                                                ⌋d ∈ Θ(`)(a)
                                       r
                                                                           k0

                                               (Σ,
                                                ̂ Θ) ⊧` {M1 , .., Mr }k0
                                                           a1           ar a


                                                       ⋀i=1 (Σ,
                                                             ̂ Θ) ⊧` Mi ∧
                                                        k


                          ∀ v̂1 , .., v̂r ∶ ⋀i=1 v̂i ∈ Θ(`)(ai ) ⇒            F (f,v̂1 ,..,v̂r )
                                               r
                                                                            ν τ                    ∈ Θ(`)(a)
                                                   (Σ,
                                                    ̂ Θ) ⊧` f (M1a1 , .., Mrar )a


                              Table 3: Analysis of labelled terms (Σ,
                                                                   ̂ Θ) ⊧` M .                          a


to (Asgm), a control process updates the variable x with the value of E. The rules for condi-
tional (Cond1) and (Cond2) are as expected. In the rule (A-com) a process with prefix ⟨j, γ⟩
                  th
commands the j actuator to perform the action γ, if it is one of its actions. The rule (Act)
says that the actuator performs the action γ. Similarly, for the rules (Int) for internal actions
for representing activities we are not interested in. The rules (Ev-out) and (Multi-com) that
drive asynchronous multi-communications are discussed in Section 3. The rule (Decr) tries to
decrypt the result {v1 , ⋯, vr }k of the evaluation of E with the key k0 , and matches it against
the pattern {E1 , ⋯, Ej ; xj+1 , ⋯, xr }k0 . As for communication, when this match succeeds the
                ′      ′

variables after the semicolon “;” are assigned to values resulting from the decryption. The last
rules propagate reductions across parallel composition ((ParN) and (ParB)) and nodes (Node),
while (CongrY) is the standard reduction rule for congruence for nodes and node components.

A.2     Control Flow Analysis of IoT-LySa
Our CFA is specified in a logical form through a set of inference rules expressing the validity
of the analysis results, where the function ⌊−⌋d to cut all the terms with a depth greater than
                                                        b
a given threshold d (with the special abstract values ⊤ ) is defined as follows.

 ⌊⊤ ⌋d = ⊤         ⌊ν ⌋d = ν               ⌊{v̂1 , ⋯, v̂r }k0 ⌋0 = ⊤       ⌊{v̂1 , ⋯, v̂r }k0 ⌋d = {⌊v̂1 ⌋d−1 , ⋯, ⌊v̂r ⌋d−1 }k0
    b        b        b            b                     b             b                     b                               b


    The result or estimate of our CFA is a triple (Σ, ̂ κ, Θ) (a pair (Σ,  ̂ Θ) when analysing a term)
that satisfies the judgements defined by the axioms and rules of Tables 3 and 4.
    We do not comment the clauses discussed in Section 4. The judgement (Σ,                ̂ Θ) ⊧` M a ,
defined by the rules in Table 3, requires that Θ(`)(a) includes all the abstract values v̂ associated
to M . In the case of sensor identifiers i and values v , Θ(`)(a) must include τ (i, `) and τ (v, `),
      a                                     a             a

respectively, i.e. the corresponding taint classifications decided by the designer. The rule for
analysing compound terms requires that the components are in turn analysed. The penultimate
rule deals with the application of an r-ary encryption. To do that (i) it analyses each term
Mi i , and (ii) for each r-tuple of values (v̂1 , ⋯, v̂r ) in Θ(`)(a1 ) × ⋯ × Θ(`)(ar ), it requires
   a

that the abstract structured value {v̂1 , ⋯, v̂r }k0 , cut at depth d, belongs to Θ(`)(a), where
                                                    b

b = Encτ (v̂1 , .., v̂r ). The special abstract value ⊤ will end up in Θ(`)(a) if the depth of the
                                                        b

term exceeds d. The last rule is for the application of an r-ary function f . Also in this case, (i) it
analyses each term Mi i , and (ii) for all r-tuples of values (v̂1 , ⋯, v̂r ) in Θ(`)(a1 )×⋯×Θ(`)(ar ),
                           a

it requires that the abstract value ν belongs to Θ(`)(a), where b = Fτ (f, v̂1 , .., v̂r ).
                                         b


                                                                 51
                                               (Σ,
                                                ̂ κ, Θ) ⊧` B                   (Σ,
                                                                                ̂ κ, Θ) ⊧ N1 ∧ (Σ,
                                                                                                ̂ κ, Θ) ⊧ N2
                  (Σ,
                   ̂ κ, Θ) ⊧ 0             (Σ,
                                            ̂ κ, Θ) ⊧ ` ∶ [B]                         (Σ,
                                                                                       ̂ κ, Θ) ⊧ N1 ∣ N2

                   ∀ i ∈ I` . τ (i, `) ∈ Σ̂` (i)
                        (Σ,
                         ̂ κ, Θ) ⊧` Σ                             (Σ,
                                                                   ̂ κ, Θ) ⊧` S                 (Σ,
                                                                                                 ̂ κ, Θ) ⊧` A

                                     ⋀i=1 (Σ,
                                           ̂ Θ) ⊧` Mi i ∧ (Σ,
                                                           ̂ κ, Θ) ⊧` P ∧
                                           k                      a


                    ∀v̂1 , ⋯, v̂r ∶ ⋀i=1 v̂i ∈ Θ(`)(ai ) ⇒ ∀` ∈ L ∶ (`, ⟨⟨v̂1 , ⋯, v̂r ⟩⟩) ∈ κ(` )
                                       r                                   ′                                  ′


                                       (Σ,
                                        ̂ κ, Θ) ⊧` ⟨⟨M1a1 , ⋯, Mrar ⟩⟩ ▷ L. P

                                                   ⋀i=1 (Σ,
                                                         ̂ Θ) ⊧` Mi i ∧
                                                        j                       a


     ∀(` , ⟨⟨v̂1 , ⋯, v̂r ⟩⟩) ∈ κ(`) ∶ Comp(` , `) ⇒ (⋀i=j+1 v̂i ⊗ τ (xi , `) ∈ Σ̂` (xi ) ∧ (Σ,
                                                                                             ̂ κ, Θ) ⊧` P )
         ′                                          ′                 r


                                   (Σ,
                                    ̂ κ, Θ) ⊧` (M1a1 , ⋯, Mj j ; xj+1  , ⋯, xr r ). P
                                                           a      aj+1       a


                                      (Σ,
                                       ̂ Θ) ⊧` M                ∧ ⋀i=1 (Σ,
                                                                        ̂ Θ) ⊧` Miai ∧
                                                            a      j


               ∀{v̂1 , ⋯, v̂r }k0 ∈ Θ(`)(a) ⇒ (⋀i=j+1 v̂i ⊗ τ (xi , `) ∈ Σ̂` (xi ) ∧ (Σ,
                                                                                      ̂ κ, Θ) ⊧` P )
                              b                             r


                      (Σ,
                       ̂ κ, Θ) ⊧` decrypt M as {M1 1 , ⋯, Mj ; xj+1 , ⋯, xr r }k in P
                                                        a         a             aj     aj+1     a
                                                                                0


                                (Σ,
                                 ̂ Θ) ⊧` M ∧            a

             ∀ v̂ ∈ Θ(`)(a) ⇒ v̂ ⊗ τ (x, `) ∈ Σ̂` (x) ∧ (Σ,
                                                         ̂ κ, Θ) ⊧` P                                  (Σ,
                                                                                                        ̂ κ, Θ) ⊧` P
                           (Σ,
                            ̂ κ, Θ) ⊧` x ∶= M . P                                                   (Σ,
                                                                                                     ̂ κ, Θ) ⊧` ⟨j, γ⟩. P
                                           ax       a


                            (Σ,
                              ̂ Θ) ⊧` M a ∧
                   (Σ, κ, Θ) ⊧` P1 ∧ (Σ,
                    ̂                  ̂ κ, Θ) ⊧` P2                      (Σ,
                                                                           ̂ κ, Θ) ⊧` B1 ∧ (Σ,
                                                                                            ̂ κ, Θ) ⊧` B2
                        (Σ,
                          ̂ κ, Θ) ⊧` M ?P1 ∶ P2                                      (Σ,
                                                                                      ̂ κ, Θ) ⊧` B1 ∥ B2
                                      a


                                                            (Σ,
                                                             ̂ κ, Θ) ⊧` P
                      (Σ,
                       ̂ κ, Θ) ⊧` 0                     (Σ,
                                                         ̂ κ, Θ) ⊧` µh. P                     (Σ,
                                                                                               ̂ κ, Θ) ⊧` h


      Table 4: Analysis of nodes (Σ,
                                  ̂ κ, Θ) ⊧ N , and of node components (Σ,
                                                                        ̂ κ, Θ) ⊧` B.


    The judgements for nodes with the form (Σ,  ̂ κ, Θ) ⊧ N are defined by the rules in Table 4.
The rules for the inactive node and for parallel composition are standard. The rule for a single
node ` ∶ [B] requires that its internal components B are in turn analysed; in this case we the
use rules with judgements (Σ,  ̂ κ, Θ) ⊧` B, where ` is the label of the enclosing node. The rule
connecting actual stores Σ with abstract ones Σ ̂ requires the locations of sensors to contain the
corresponding abstract values. The rule for sensors is trivial, because we are only interested in
the users of their values. The rule for actuators is equally trivial, because we model actuators
as passive entities. The rules for processes require to analyse the immediate sub-processes.
The rule for decryption is similar to the one for communication: it also requires that the keys
                                                                                                 a
coincide. The rule for assignment requires that all the values v̂ in the estimate Θ(`)(a) for M
belong to Σ̂ ` (x), once combined with the variable taint with the operator ⊗. The rules for the
inactive process, for parallel composition, and for iteration are standard (we assume that each
iteration variable h is uniquely bound to the body P ).


                                                                   52