<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Resilience via Blackbox Self-Piloting Plants</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Michel Barbeau</string-name>
          <email>barbeau@scs.carleton.ca</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Joaquin Garcia-Alfaro</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Christian Lübben</string-name>
          <email>christian.luebben@tum.de</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marc-Oliver Pahl</string-name>
          <email>marc-oliver.pahl@imt-atlantique.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lars Wüstrich</string-name>
          <email>lars.wuestrich@tum.de</email>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>IMT Atlantique</institution>
          ,
          <addr-line>Rennes, France IRISA, UMR IRISA CNRS 6074</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>School of Computer Science, Carleton University</institution>
          ,
          <addr-line>1125 Colonel By Drive, Ottawa, Ontario</addr-line>
          ,
          <country country="CA">Canada</country>
          <addr-line>K1S 5B6</addr-line>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Télécom SudParis, SAMOVAR, Institut Polytechnique de Paris</institution>
          ,
          <addr-line>91120, Palaiseau</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <fpage>35</fpage>
      <lpage>46</lpage>
      <abstract>
        <p>Distributed control is a reality of today's industrial automation and systems. Parts of a system are on-site, and other elements are on the edge of the cloud. The overall functioning of the system relies on the reliable operation of local and remote components. However, all system parts can be attacked. Typically, local entities of a cyber-physical system, such as robot arms or conveyor belts, get affected by cyber attacks. However, attacking the control and monitoring channels between a plant and its remote controller is attractive, too. There is a diversity of attacks, such as manipulating a plant's input signals, controller logic, and output signals. To detect and mitigate the impact of such attacks and to make a plant more resilient, we introduce a self-learning controller proxy in the plant's communication channel to the controller. It acts as a local trust anchor for the commands received from a remote controller. It performs black-box self-learning of the controller algorithms and audits the controller's operations. Once an attack is detected, the plant pivots into self-piloting mode. We investigate design alternatives for the controller proxy. We evaluate how complex the control algorithms can be to enable self-piloting resilience.</p>
      </abstract>
      <kwd-group>
        <kwd>Cyber-Physical System</kwd>
        <kwd>Networked-Control Systems</kwd>
        <kwd>security</kwd>
        <kwd>incident response</kwd>
        <kwd>resilience</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>A Cyber-Physical System (CPS), such as a critical infrastructure, consists of distributed physical
elements, including local and off-site control algorithms that work together to accomplish a
task. Together, the physical elements form the plant. The algorithms make up the controller.
Controller and plant communicate through a network, constituting a Networked-Control System
(NCS). It is a flexible environment but vulnerable to attacks: an attack on one component can
affect the entire system. Hence the importance of securing an NCS.</p>
      <p>The security of an NCS encompasses several aspects, including protection against attacks,
recognition of attacks, and incident response. Protection is realized by leveraging cryptographic
techniques and security protocols. Recognition of attacks leverages tools such as intrusion
detection techniques. In this paper, we explore an idea for preparing responses to attacks. In
particular, when attacks are severe enough, it becomes safer for a plant component to
disconnect from the network and operate autonomously, at least for a short while.</p>
      <p>We propose a method to automatically configure a local model of a remote controller by
observing its actions. In case a network disconnection is required, the resulting local controller
proxy can take over and assure the operation for some time. It makes the local infrastructure
more resilient to cyber-attacks. Starting from a mathematical model, the paper develops an
approach to learning and, consequently, predicting a remote controller’s behavior. The resulting
algorithm is evaluated with an emulated industrial control process.</p>
      <p>The related work is reviewed in Section 2. Our system model is described in Section 3. In
Section 4, we discuss learning by imitating. Our approach is evaluated in Section 5. We conclude
with Section 6.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        A CPS is a combination of hardware and software resources, collectively called a plant or system.
The term NCS refers to a CPS monitored and controlled remotely through a communication
network. Autonomous transport systems and energy distribution systems are NCS examples. A
CPS is vulnerable to several availability, covert [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ] and integrity attacks [
        <xref ref-type="bibr" rid="ref3 ref4 ref5">3, 4, 5</xref>
        ]. Adversaries
perpetrate attacks by manipulating signals to actuators and from sensors. Perpetrated attacks
can lead to disastrous consequences. Hence, a CPS needs to be protected.
      </p>
      <p>
        Protection involves using cryptographic methods such as digital signatures, encryption, and
key establishment. Despite protection, impactful attacks will likely be perpetrated. Hence,
detection methods and response plans are required. Several methods have been proposed
to detect attacks on a CPS, such as challenge-response authentication [
        <xref ref-type="bibr" rid="ref6 ref7 ref8">6, 7, 8</xref>
        ] and auxiliary
states [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Adequate incident response mitigates the impacts of attacks and achieves resilience.
Resilience to attacks can be obtained by redundancy, diversity [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] and response planning,
which means triggering behavior that mitigates the impact when attacks are detected. Incident
response is the aspect that we develop further in this paper.
      </p>
      <p>
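        Our work is related to model predictive control and learning. In predictive control, at time <inline-formula><tex-math>$t$</tex-math></inline-formula>,
the controller sends an input row vector <inline-formula><tex-math>$\vec{u}$</tex-math></inline-formula> of length <inline-formula><tex-math>$k+1$</tex-math></inline-formula>, where <inline-formula><tex-math>$k$</tex-math></inline-formula> is the time horizon. The first element of
<inline-formula><tex-math>$\vec{u}$</tex-math></inline-formula> is the input to apply at time <inline-formula><tex-math>$t$</tex-math></inline-formula>. Predicting a response from the plant (using a model of it), the
vector <inline-formula><tex-math>$\vec{u}$</tex-math></inline-formula> also contains the expected inputs for times <inline-formula><tex-math>$t+1$</tex-math></inline-formula> to <inline-formula><tex-math>$t+k$</tex-math></inline-formula>. When the plant receives an
input vector <inline-formula><tex-math>$\vec{u}$</tex-math></inline-formula>, it applies the first element in normal mode. The remaining <inline-formula><tex-math>$k$</tex-math></inline-formula> elements are stored.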
When a situation occurs, the plant stops accepting new inputs. It keeps running, applying the
inputs predicted by the controller. The approach has been used by Quevedo et al. and Franzè et
al. to make a system resilient to packet loss [
        <xref ref-type="bibr" rid="ref11 ref12">11, 12</xref>
        ] or packet replay [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Leveraging machine
learning, we take this idea to another level.
      </p>
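      <p>The buffering scheme used in predictive control can be sketched as follows. This is a minimal illustration, not the implementation of the cited works: the class name <monospace>BufferedActuator</monospace> and its methods are hypothetical, and inputs are assumed to be scalar values.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import deque
from typing import Deque, Optional, Sequence


class BufferedActuator:
    """Plant-side buffer for predictive control (hypothetical helper)."""

    def __init__(self) -> None:
        self._stored: Deque[float] = deque()

    def receive(self, inputs: Sequence[float]) -> float:
        """Normal mode: apply the first element, store the remaining ones."""
        if not inputs:
            raise ValueError("input vector must not be empty")
        self._stored = deque(inputs[1:])
        return inputs[0]

    def fallback(self) -> Optional[float]:
        """Degraded mode: replay the next stored prediction, if any is left."""
        return self._stored.popleft() if self._stored else None


actuator = BufferedActuator()
applied = [actuator.receive([5.0, 4.0, 3.0])]        # two predicted inputs stored
applied += [actuator.fallback() for _ in range(3)]    # no new inputs for three steps
print(applied)
```
]]></preformat>
      <p>In this sketch, the third fallback call returns <monospace>None</monospace>: once the stored predictions are used up, the plant has no input left to apply, which is the finite-horizon limitation of the scheme.</p>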
    </sec>
    <sec id="sec-3">
      <title>3. Networked-Control System Model</title>
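      <p>A CPS is modeled by the following two discrete-time equations:
        <disp-formula><label>(1)</label><tex-math>$$x_{t+1} = f(x_t, u_t)$$</tex-math></disp-formula>
        <disp-formula><label>(2)</label><tex-math>$$y_t = g(x_t)$$</tex-math></disp-formula>
      </p>
      <p><inline-formula><tex-math>$x_t$</tex-math></inline-formula> and <inline-formula><tex-math>$u_t$</tex-math></inline-formula> are the current state and actuator inputs, at time <inline-formula><tex-math>$t$</tex-math></inline-formula>, with <inline-formula><tex-math>$x_t \in \mathbb{R}^n$</tex-math></inline-formula> and <inline-formula><tex-math>$u_t \in \mathbb{R}^m$</tex-math></inline-formula>. <inline-formula><tex-math>$x_{t+1}$</tex-math></inline-formula>
is the successor state determined by the evaluation of the state-input function <inline-formula><tex-math>$f(x_t, u_t)$</tex-math></inline-formula>, at time
<inline-formula><tex-math>$t+1$</tex-math></inline-formula>, with <inline-formula><tex-math>$x_{t+1} \in \mathbb{R}^n$</tex-math></inline-formula>. <inline-formula><tex-math>$y_t$</tex-math></inline-formula> is the sensor outputs in the current state defined by the evaluation of the
output function <inline-formula><tex-math>$g(x_t)$</tex-math></inline-formula>, with <inline-formula><tex-math>$y_t \in \mathbb{R}^p$</tex-math></inline-formula>. The variables <inline-formula><tex-math>$t$</tex-math></inline-formula>, <inline-formula><tex-math>$n$</tex-math></inline-formula>, <inline-formula><tex-math>$m$</tex-math></inline-formula> and <inline-formula><tex-math>$p$</tex-math></inline-formula> denote positive integers.</p>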
      <p>
        As an example, let us consider a system that consists of a single cylindrical tank. The tank
has one inflow and one outflow of liquid. The dynamics of the liquid level in the tank are captured
by the following differential equation [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]:
        <disp-formula><label>(3)</label><tex-math>$$\frac{dh(t)}{dt} = \frac{q(t) - a\sqrt{h(t)}}{A}$$</tex-math></disp-formula>
      </p>
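      <p>Eq. (3) models the relationship between instantaneous changes in liquid level and the
difference between the inlet flow rate and the outlet flow rate. As a function of time
<inline-formula><tex-math>$t$</tex-math></inline-formula> (seconds), the level
of the liquid in the tank is <inline-formula><tex-math>$h(t)$</tex-math></inline-formula> (cm). Variable <inline-formula><tex-math>$A$</tex-math></inline-formula> represents the cross-sectional area of the tank
(cm<sup>2</sup>). The term <inline-formula><tex-math>$q(t)$</tex-math></inline-formula> represents the inlet flow rate (cm<sup>3</sup>/second). The parameter <inline-formula><tex-math>$a$</tex-math></inline-formula> denotes the
outlet valve coefficient. The outlet flow rate (cm<sup>3</sup>/second) is proportional to the product of <inline-formula><tex-math>$a$</tex-math></inline-formula> and
the square root of the pressure represented by the term <inline-formula><tex-math>$h(t)$</tex-math></inline-formula>. Note that, because of the square-root
term, the system is nonlinear.</p>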
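      <p>Mapping Eq. (3) into the model of Eq. (1), we get:
        <disp-formula><label>(4)</label><tex-math>$$x_{t+1} = f(x_t, u_t) = x_t + \frac{u_t - a\sqrt{x_t}}{A}$$</tex-math></disp-formula>
        <disp-formula><label>(5)</label><tex-math>$$y_t = g(x_t) = x_t$$</tex-math></disp-formula>
where <inline-formula><tex-math>$u_t$</tex-math></inline-formula> represents the input flow rate at time <inline-formula><tex-math>$t$</tex-math></inline-formula>. In the sequel, these dynamics are used to build an
NCS case study.</p>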
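      <p>The discrete-time tank model can be sketched as below. The sketch assumes that one step adds (inflow minus valve coefficient times the square root of the level) divided by the tank area to the current level, with a step of one second. The parameter values and the clamping at zero are assumptions made for illustration only; they are not taken from the case study.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math


def tank_step(level: float, inflow: float, a: float, area: float) -> float:
    """One step of level + (inflow - a*sqrt(level)) / area.

    The result is clamped at zero (an added safeguard, not part of the model).
    """
    outflow = a * math.sqrt(max(level, 0.0))
    return max(level + (inflow - outflow) / area, 0.0)


def tank_output(level: float) -> float:
    """Sensor output: the level itself is what gets measured."""
    return level


# Hypothetical parameters, chosen only to make the arithmetic easy to follow.
A_TANK = 100.0   # cross-sectional area (cm^2)
A_VALVE = 10.0   # outlet valve coefficient

level = 25.0
# Inflow equal to a * sqrt(level) = 10 * 5 = 50 keeps the level unchanged.
steady = tank_step(level, inflow=50.0, a=A_VALVE, area=A_TANK)
# With the inlet closed, the level drops by a * sqrt(level) / area = 0.5 cm.
drained = tank_step(level, inflow=0.0, a=A_VALVE, area=A_TANK)
print(steady, drained)
```
]]></preformat>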
      <sec id="sec-3-1">
        <title>3.1. Architecture</title>
        <p>We assume that remote monitoring and control of the CPS is required. Hence, it is an NCS. The
architecture of an NCS is pictured in Figure 1 (a). There is a System consisting of resources that
can be controlled and monitored. There is a Controller that posts inputs to and gets outputs from
the System. The inputs are commands to actuators of the System. The outputs are readings from
sensors attached to the System. Inputs and outputs travel over a network. This architecture is
relevant to situations requiring remote control and monitoring. The attack model is pictured
in Figure 1 (b). Somewhere in the network, an adversary stands between the Controller and
System. The adversary can modify inputs and outputs. In the sequel, we assume that data
modification attacks can be detected by both the Controller and System, using, for instance, a
digital signature mechanism.</p>
        <p>[Figure 1: (a) Architecture of a Networked-Control System. (b) Attack model. (c) Resilience model. Diagram labels: Controller, Network, Adversary, Controller proxy, System, Inputs, Outputs, Inputs', Outputs'.]</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Resilience Model</title>
        <p>We define resilience as a system’s ability to maintain operation while an attack is being
perpetrated. In Figure 1 (c), we propose a system architecture with resilience by design. The remote
controller controls and monitors the System during regular operation. The System can detect
attacks from cyberspace. When the perpetration of an attack is detected, the System disconnects
itself from the network. The System embeds a controller proxy. While the System is disconnected
from the network, control of the CPS is handed over to the controller proxy. The controller proxy
may act on the physical system with its actuators and observe it with its sensors. Hence, the
controller proxy may have its own state-input function and output function.</p>
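        <p>The handover described above can be sketched as a small mode switch. This is an illustrative assumption of how the pieces could fit together: the class <monospace>ResilientPlant</monospace>, the threshold-based proxy policy, and the choice to stay disconnected after an attack are all hypothetical, since the text does not specify a reconnection rule.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from typing import Callable, Optional

Command = int    # e.g. 1 = actuator on, 0 = actuator off
State = float    # e.g. a measured level


class ResilientPlant:
    """Switches from remote control to the local controller proxy on attack."""

    def __init__(self, proxy_policy: Callable[[State], Command]) -> None:
        self.proxy_policy = proxy_policy
        self.connected = True

    def next_command(self, state: State, remote_command: Optional[Command],
                     attack_detected: bool) -> Command:
        if attack_detected:
            self.connected = False           # disconnect from the network
        if self.connected and remote_command is not None:
            return remote_command            # regular operation
        return self.proxy_policy(state)      # self-piloting mode


# Hypothetical proxy policy: switch on (1) below 150 cm, otherwise off (0).
plant = ResilientPlant(lambda level: 1 if level < 150.0 else 0)
normal = plant.next_command(200.0, remote_command=1, attack_detected=False)
attacked = plant.next_command(200.0, remote_command=1, attack_detected=True)
later = plant.next_command(100.0, remote_command=0, attack_detected=False)
print(normal, attacked, later)
```
]]></preformat>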
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Learning to Control by Imitating</title>
      <p>We develop a controller proxy learning approach for the architecture of Figure 1 (c). Let us
first consider radio-controlled model airplanes. They have a pre-programmed solution. They
memorize their launching location. When a model airplane goes out of the wireless range of
its controller, the aircraft detects the signal loss. An automatic return-to-launch-site function kicks
in. The plane returns over the takeoff point, circles around it, and lands safely. The return-to-launch-site
solution has also been adopted by modern quadcopters. This solution is interesting
but specific to one context. It can hardly be generalized to other types of situations. We aim at
general solutions.</p>
      <p>
        We follow the more generic idea of imitation learning [
        <xref ref-type="bibr" rid="ref15 ref16 ref17">15, 16, 17</xref>
        ]. The learning entity is
called an agent. In imitation learning, the agent observes a demonstration of a task and tries to
mimic the behavior. There are two different categories of imitation learning. The first one is
behavioral cloning, which uses supervised learning to infer the relationship from observations to
actions [
        <xref ref-type="bibr" rid="ref17 ref18">17, 18</xref>
        ]. In contrast, with inverse reinforcement learning, a reward function is estimated
to explain the teacher’s behavior [
        <xref ref-type="bibr" rid="ref19 ref20">19, 20</xref>
        ]. In our case, we are interested in the former, i.e.,
finding an appropriate mapping between the observations and actions. Behavioral cloning and
DAGGER are examples of supervised imitation learning [
        <xref ref-type="bibr" rid="ref18 ref19 ref21">18, 19, 21</xref>
        ]. They do not require a
specification of the reward function.
      </p>
      <p>We propose a flexible solution where a controller proxy observes, imitates, and acquires
the behavior of a remote controller, i.e., the actions it performs in each state. In this context,
imitation learning means that the controller proxy is an agent that infers the controller’s
behavior. The remote controller is the teacher. The controller proxy is the learner. The teacher does
not need to be aware of the learner. It performs its task normally. The learner is an observer.
The teacher is part of its environment. When an attack is detected, the controller proxy agent
acts on the system instead of the remote controller. For detection methods, see Section 2. In the
sequel, we focus on the training of the controller proxy agent.</p>
      <p>The agent can only observe the communications between the controller and the system.
There are two types of communication. The first type goes from the controller
to the system and contains commands to the system. The second type goes from the system
to the controller: the system executes the commands and responds with the resulting state.
As a function of the new state, the controller sends a new set of commands. To faithfully imitate
the controller, the controller proxy needs to reliably predict the new set of commands based
on the reported system state. Therefore, it must learn a policy that reflects the controller’s
decision-making.</p>
      <p>Our reinforcement learning architecture reflects this setting. There are two types of scenarios.
The first type of scenario is stateless. The controller consistently makes the same decision
given the same input. The second type of scenario is stateful. In this case, in addition to the
system’s current situation, the state also needs to capture the trends of the ongoing activity. For
example, a water tank may have a changing target level. A controller’s decisions for a water
tank collecting water differ from the commands for a tank leaking water. Because they are more
general, we focus on stateful physical systems.</p>
      <p>The input to the agent is a series of observations of the communication flow on the channel
between the controller and the system. Let a command from the controller to the system have <inline-formula><tex-math>$m$</tex-math></inline-formula>
parameters. Moreover, let a system state be a vector of <inline-formula><tex-math>$n$</tex-math></inline-formula> parameters. With <inline-formula><tex-math>$k$</tex-math></inline-formula> denoting the
number of interactions between the controller and system, i.e., one command and a corresponding
state vector each, the input to the agent has length <inline-formula><tex-math>$k \cdot (m + n)$</tex-math></inline-formula> parameters. The agent is limited to the
actions that the controller takes. Since the agent should imitate the controller, the commands it
receives have the same dimension as the commands produced by the controller. Furthermore,
the agent’s output is also a command vector of <inline-formula><tex-math>$m$</tex-math></inline-formula> parameters, i.e., its choice of actions according
to the policy acquired with reinforcement learning. The size of the reinforcement learning
action space grows exponentially with the number of commands available to the controller.
The growth depends on two factors. The first is the number of possibilities for each action.
The second factor is the number of actions included in a command vector. The number of
possible commands, i.e., actions, grows with the possible combinations that the controller can
send. When receiving an input, the agent predicts the chosen action of the controller. When it
correctly predicts the new parameters, it receives a positive reward. Otherwise, it receives a
negative reward, i.e., a penalty.</p>
      <p>[Figure 2: Two-tank setup. Diagram labels: Upper tank, Pump, Lower tank.]</p>
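      <p>The shape of the agent input, the growth of the action space, and the reward signal can be sketched as follows. The sketch assumes k interactions, each made of a command with m parameters and a state with n parameters. The function names and the reward magnitudes (+1 and -1) are assumptions for illustration; the text only states that the reward is positive or negative.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from typing import List, Sequence, Tuple

Interaction = Tuple[Sequence[float], Sequence[float]]   # (command, state)


def build_agent_input(window: Sequence[Interaction]) -> List[float]:
    """Flatten k (command, state) pairs into one vector of length k*(m+n)."""
    flat: List[float] = []
    for command, state in window:
        flat.extend(command)
        flat.extend(state)
    return flat


def action_space_size(options_per_parameter: int, m: int) -> int:
    """Number of distinct command vectors: exponential in m."""
    return options_per_parameter ** m


def reward(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """+1 when the controller's command is predicted correctly, else -1."""
    return 1.0 if list(predicted) == list(observed) else -1.0


# k = 3 interactions, m = 1 command parameter, n = 2 state parameters.
window = [([1.0], [120.0, 380.0]),
          ([1.0], [135.0, 365.0]),
          ([0.0], [150.0, 350.0])]
agent_input = build_agent_input(window)
print(len(agent_input), action_space_size(2, 3))
print(reward([0.0], [0.0]), reward([1.0], [0.0]))
```
]]></preformat>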
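      <p>We use a neural network for the core of the agent. With <inline-formula><tex-math>$k$</tex-math></inline-formula> observed interactions, <inline-formula><tex-math>$m$</tex-math></inline-formula> command parameters, and <inline-formula><tex-math>$n$</tex-math></inline-formula> state parameters, the neural
network has <inline-formula><tex-math>$k \cdot (m + n)$</tex-math></inline-formula> input neurons and <inline-formula><tex-math>$m$</tex-math></inline-formula> output neurons. In addition, the neural network
can have one or more hidden layers of varying sizes. Their number depends on the scenario.
The agent should accurately imitate the controller’s logic. This implies that the agent should
not explore new action paths to optimize its policy. The agent concentrates on exploitation.</p>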
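      <p>Concentrating on exploitation can be read as always selecting the highest-scoring action, with no random exploration. The following minimal sketch assumes that the network produces one score per candidate action; the function name <monospace>greedy_action</monospace> is hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from typing import Sequence


def greedy_action(scores: Sequence[float]) -> int:
    """Pure exploitation: return the index of the highest-scoring action."""
    if not scores:
        raise ValueError("at least one action score is required")
    return max(range(len(scores)), key=lambda i: scores[i])


# Two candidate actions: index 0 = pump off, index 1 = pump on.
chosen = greedy_action([0.2, 0.7])
print(chosen)
```
]]></preformat>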
      <p>The training of the agent results in a policy that reflects the controller’s logic. If needed,
this policy enables the agent to replace the controller in case of an attack and to manage the system
according to the observed behavior. In contrast to predictive control with a finite time horizon,
the controller proxy can operate while the system is disconnected for an arbitrarily long period.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Evaluation</title>
      <p>In this section, we demonstrate the controller proxy concept. We assess the feasibility of our
method and its performance versus the number of observations made by the controller proxy.
To do so, we choose a concise yet realistic setting: a two-tank water-level control problem. The plant
comprises two tanks, and the controller maintains target water levels. The controller is remote
from the plant, in the cloud. Collocated with the plant, we place a self-learning controller proxy.
The quality of service is evaluated under variable conditions.</p>
      <p>The process works as follows. The controller proxy learns by passively observing the
communication between entities inside the plant and one or multiple external controllers. After
observing the state of the monitored plant, the internal controller decides and compares its
decision to the command received from the external controller in the next step. When the
external controller gets disconnected, the internal controller proxy takes over and mimics its
behavior to keep the plant in an optimal state.</p>
      <p>For our evaluation, we consider the following scenario. The plant consists of two water tanks,
one being on top of the other. Two pipes connect the water tanks. One pipe has an in-line water
pump that can be turned on or off. When turned on, the pump moves water from the lower tank
to the upper one. While the pump moves the water to the upper tank, the water level in the
lower tank decreases. The water from the upper tank flows through the second pipe from the
upper tank to the lower tank. When the pump is off, the water level in the upper tank decreases
and the level in the lower tank increases. The pump turns on when the upper tank is emptied
to a certain threshold. It remains on until the upper tank reaches the maximum threshold, at which
point it is turned off. The pump is reactivated when enough water from the upper tank has flowed to the
lower tank. Figure 2 shows the setup.</p>
      <p>Figure 3 shows this deterministic cyclic behavior to be learned by observation and imitation.
The horizontal axis corresponds to the time in seconds. The vertical axis represents the water
level in the tanks in cm. The green curve denotes the water level of the lower tank. The orange
curve corresponds to the level of the upper tank. The blue line represents the state of the pump:
low means off, high means on. For this scenario, the sampling period is five seconds.</p>
      <p>
        This scenario is implemented and simulated using the Virtual State Layer (VSL)
middleware [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]. VSL enables rapid prototyping and realistic evaluation of the scenario. Building on
the VSL micro-service architecture concept, one service is created for each entity in the scenario.
The resulting system consists of six services: one service for each tank, the valve, the pump, and
the external controller. In addition, there is a plotter service recording measurements. The
pump is modeled to fill the upper tank with a fixed flow of 500 cm<sup>3</sup> per second. The outflow
of the upper tank depends on the water level and is simulated using the differential equation
introduced in Section 3. The initial water levels of both tanks are 400 cm. The upper tank has
a maximum water level of 500 cm. When reaching 90% of the maximum level, the external
controller turns off the pump. It turns the pump on again when the level of the upper tank
drops below 30% of the maximum level.
      </p>
      <p>[Figure 4: Accuracy as a function of the number of training epochs. Horizontal axis: Epoch. Vertical axis: Accuracy.]</p>
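      <p>The rule followed by the external controller can be sketched as a two-threshold (hysteresis) switch, using the 90% and 30% thresholds of the 500 cm maximum level given above. The function name and the sample levels are hypothetical. Note that the same level, 300 cm here, leads to different decisions depending on the current pump state, which is what makes the scenario stateful.</p>
      <preformat preformat-type="code"><![CDATA[
```python
OFF_THRESHOLD_CM = 450.0   # 90% of the 500 cm maximum level
ON_THRESHOLD_CM = 150.0    # 30% of the 500 cm maximum level


def external_controller(upper_level: float, pump_on: bool) -> bool:
    """Hysteresis rule: the pump state changes only at the two thresholds."""
    if upper_level >= OFF_THRESHOLD_CM:
        return False
    if upper_level < ON_THRESHOLD_CM:
        return True
    return pump_on   # between the thresholds, keep the current state


decisions = []
pump = False
for level in [400.0, 200.0, 149.0, 300.0, 449.0, 450.0, 300.0]:
    pump = external_controller(level, pump)
    decisions.append(pump)
print(decisions)
```
]]></preformat>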
      <p>We train a Deep Q-Network (DQN) agent for this task on the observing controller proxy.
The input to the DQN is the current level of each water tank and the status of the pump. The
action space consists of switching the pump on or off in the next step. The DQN outputs a prediction
for the activation of the pump in the next step. It also observes the command of the external
controller that it predicted and uses it as a label to evaluate the prediction. We implement the
DQN in a separate Python program using PyTorch. The agent is connected via interfaces to the
VSL middleware such that it can observe the activities in the simulation.</p>
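      <p>One way to turn the observed stream into labeled samples is sketched below. It pairs each observation (upper level, lower level, pump status) with the pump command seen in the next step. The sketch assumes that the pump status reported in the next observation equals the command issued by the external controller; the trace values and the function name are hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from typing import List, Tuple

Observation = Tuple[float, float, int]   # (upper level, lower level, pump status)
Sample = Tuple[Observation, int]         # (agent input, label)


def make_samples(trace: List[Observation]) -> List[Sample]:
    """Pair each observation with the pump command seen in the next step."""
    return [(trace[i], trace[i + 1][2]) for i in range(len(trace) - 1)]


# Hypothetical excerpt of an observed trace, one entry per sampling period.
trace = [(160.0, 640.0, 0), (148.0, 652.0, 1), (170.0, 630.0, 1)]
samples = make_samples(trace)
labels = [label for _, label in samples]
print(samples)
```
]]></preformat>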
      <p>Various parameters can affect the result of the training. For our evaluation,
we focus on the number of epochs for training and the number of observations at the proxy
controller. Both affect the amount of data that the proxy controller has for learning.</p>
      <p>In a first experiment, we evaluate the quality of proposed actions by the local controller proxy
depending on the number of trained epochs and observations. We assess the performance by
comparing DQN predictions against the actual commands issued by the external controller.</p>
      <p>We train multiple DQNs while varying the amount of data and the number of epochs. Each DQN has random
weights at initialization. We train each DQN for 1, 2, . . . , 14 epochs. We repeat this
experiment with different amounts of training data. We use sequential data starting from the
beginning of the simulation. Depending on the amount of provided data, a DQN is trained with
a partial observation of a cycle, a complete cycle, or multiple cycles. The smallest amount of data
used for training is the first 100 measurements. For each number of epochs, the DQNs are trained
with 100, 200, 300, . . . , 4300 observations.</p>
      <p>[Figure 5: Model accuracy as a function of the number of observations used for training, shown together with the water level. Horizontal axis: Observations used for training. Vertical axes: Accuracy of model; Water level (cm).]</p>
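      <p>The experiment grid described above can be enumerated as in the following sketch. The variable and function names are hypothetical; the ranges are the ones stated in the text (1 to 14 epochs, 100 to 4300 observations in steps of 100, sequential data from the beginning of the simulation).</p>
      <preformat preformat-type="code"><![CDATA[
```python
from typing import List, Sequence, Tuple

EPOCHS = list(range(1, 15))                  # 1, 2, ..., 14
OBSERVATIONS = list(range(100, 4301, 100))   # 100, 200, ..., 4300

# One training run per (number of epochs, number of observations) pair.
grid: List[Tuple[int, int]] = [(e, n) for e in EPOCHS for n in OBSERVATIONS]


def training_slice(data: Sequence, n_observations: int) -> Sequence:
    """Sequential data starting from the beginning of the simulation."""
    return data[:n_observations]


print(len(EPOCHS), len(OBSERVATIONS), len(grid))
```
]]></preformat>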
      <p>Figure 4 shows the accuracy of DQNs as a function of the number of trained epochs, averaged
over the different amounts of training data. The accuracy is defined as the number of correctly
predicted labels divided by the number of all samples. The accuracy rises until epoch seven. DQNs trained for eight to 10 epochs
have decreasing accuracy. DQNs trained for 12 epochs have the highest accuracy. Considering
the trade-off between training time and accuracy, we use seven epochs for further evaluation.
The accuracy gain at 12 epochs is only around 5%, while the training time rises by about 170%.</p>
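      <p>The accuracy measure, correctly predicted labels over all samples, can be written as the short function below. The function name and the example values are hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from typing import Sequence


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of predictions that match the observed controller commands."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("need two non-empty sequences of equal length")
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


# Three of the four predicted pump commands match the observed ones.
score = accuracy([1, 0, 1, 1], [1, 0, 0, 1])
print(score)
```
]]></preformat>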
      <p>Using the fixed number of seven epochs, the following experiment evaluates the impact
of the number of observations on a model’s accuracy. We therefore trained multiple DQNs
while varying the amount of sequential training data. Figure 5 shows that the performance
of trained models correlates with the quantity of training data. The dashed lines show the
observations used to train the model. Models that are only trained on observations where the
pump is turned off perform poorly. The performance increases significantly after observing one
cycle in which the pump was turned off and turned on. The maximum accuracy is achieved at
3500 observations. The results are averaged over ten iterations of the experiment.</p>
      <p>To verify the training accuracy, we implemented the internal controller proxy in VSL and
ran a learned model on live data. Based on the results of the former evaluation, the controller
proxy implements a model trained for up to seven epochs, using 3500 observations. The results
are shown in Figure 6. The red lines mark the upper and lower limits implemented in the
original controller. A comparison with Figure 3 shows that the internal controller proxy learned the
behavior of the external controller. The internal controller keeps the upper tank level within
the limits of the external controller.</p>
      <p>[Figure 6: Behavior of the plant under the internal controller proxy. Legend: Pump activation, Upper tank, Lower tank. Horizontal axis: Time (s).]</p>
      <p>The previous evaluation shows the feasibility of the approach. Our local proxy controller
became a self-learned digital twin of the remote controller. The approach works
in our limited scenario. Our evaluation setting fulfills our assumptions of a local CPS that is
controlled from the outside. Therefore, we expect the approach to also be suitable for
bigger settings. We plan to evaluate this in the future.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>This paper showed how a remote controller could be learned locally, resulting in a digital twin of
the controller. It showed how the digital twin acts as a trust anchor, enabling anomaly detection
and mitigation from attacks by taking over control in case of an attack.</p>
      <p>This work shows the feasibility of the idea by describing the approach and evaluating it in a
representative scenario. In the future, we plan to focus on more complex systems to evaluate
the performance of the approach.</p>
    </sec>
    <sec id="sec-7">
      <title>7. Acknowledgments</title>
      <p>This work was partially supported by the Natural Sciences and Engineering Research Council of
Canada (NSERC). This research took place in the context of the industrial chair Cybersecurity for
Critical Networked Infrastructures (CyberCNI.fr) with the support of the FEDER development
fund of the Brittany region. This work was also supported by the German Federal Ministry of
Education and Research under funding number 16KIS1221 (SKINET).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>R.</given-names>
            <surname>Smith</surname>
          </string-name>
          ,
          <article-title>A decoupled feedback structure for covertly appropriating networked control systems</article-title>
          ,
          <source>IFAC Proceedings Volumes</source>
          <volume>44</volume>
          (
          <year>2011</year>
          )
          <fpage>90</fpage>
          -
          <lpage>95</lpage>
          . 18th IFAC World Congress.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>R.</given-names>
            <surname>Smith</surname>
          </string-name>
          ,
          <article-title>Covert Misappropriation of Networked Control Systems: Presenting a Feedback Structure</article-title>
          ,
          <source>IEEE Control Systems</source>
          <volume>35</volume>
          (
          <year>2015</year>
          )
          <fpage>82</fpage>
          -
          <lpage>92</lpage>
          . doi:10.1109/MCS.2014.2364723.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>B.</given-names>
            <surname>Ramasubramanian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rajan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. G.</given-names>
            <surname>Chandra</surname>
          </string-name>
          ,
          <article-title>Structural resilience of cyberphysical systems under attack</article-title>
          ,
          <source>in: 2016 American Control Conference (ACC)</source>
          , IEEE,
          <year>2016</year>
          , pp.
          <fpage>283</fpage>
          -
          <lpage>289</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Chapman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mesbahi</surname>
          </string-name>
          ,
          <article-title>Security and infiltration of networks: A structural controllability and observability perspective</article-title>
          ,
          <source>in: Control of Cyber-Physical Systems</source>
          , Springer,
          <year>2013</year>
          , pp.
          <fpage>143</fpage>
          -
          <lpage>160</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>C.</given-names>
            <surname>Barreto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Cárdenas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Quijano</surname>
          </string-name>
          ,
          <article-title>Controllability of dynamical systems: Threat models and reactive security</article-title>
          ,
          <source>in: International Conference on Decision and Game Theory for Security</source>
          , Springer,
          <year>2013</year>
          , pp.
          <fpage>45</fpage>
          -
          <lpage>64</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Rubio-Hernan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>De Cicco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Garcia-Alfaro</surname>
          </string-name>
          ,
          <article-title>Revisiting a watermark-based detection scheme to handle cyber-physical attacks</article-title>
          ,
          <source>in: 2016 11th International Conference on Availability, Reliability and Security (ARES)</source>
          (Best Paper Award), IEEE,
          <year>2016</year>
          , pp.
          <fpage>21</fpage>
          -
          <lpage>28</lpage>
          . URL: http://dx.doi.org/10.1109/ARES.2016.2. doi: 10.1109/ARES.2016.2.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Rubio-Hernan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>De Cicco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Garcia-Alfaro</surname>
          </string-name>
          ,
          <article-title>Adaptive control-theoretic detection of integrity attacks against cyber-physical industrial systems</article-title>
          ,
          <source>Trans. Emerging Telecommunications Technologies</source>
          <volume>32</volume>
          (
          <issue>09</issue>
          ) (
          <year>2017</year>
          ). URL: http://dx.doi.org/10.1002/ett.3209. doi: 10.1002/ett.3209.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J.</given-names>
            <surname>Rubio-Hernan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>De Cicco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Garcia-Alfaro</surname>
          </string-name>
          ,
          <article-title>On the use of watermark-based schemes to detect cyber-physical attacks</article-title>
          ,
          <source>EURASIP Journal on Information Security</source>
          <volume>2017</volume>
          (
          <year>2017</year>
          ) 8. URL: http://dx.doi.org/10.1186/s13635-017-0060-9. doi: 10.1186/s13635-017-0060-9.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>C.</given-names>
            <surname>Schellenberger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <article-title>Detection of covert attacks on cyber-physical systems by extending the system dynamics with an auxiliary system</article-title>
          ,
          <source>in: 2017 IEEE 56th Annual Conference on Decision and Control (CDC)</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>1374</fpage>
          -
          <lpage>1379</lpage>
          . doi: 10.1109/CDC.2017.8263846.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>M.</given-names>
            <surname>Barbeau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Cuppens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Cuppens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Dagnas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Garcia-Alfaro</surname>
          </string-name>
          ,
          <article-title>Resilience estimation of cyber-physical systems via quantitative metrics</article-title>
          ,
          <source>IEEE Access</source>
          <volume>9</volume>
          (
          <year>2021</year>
          )
          <fpage>46462</fpage>
          -
          <lpage>46475</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>D. E.</given-names>
            <surname>Quevedo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. I.</given-names>
            <surname>Silva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. C.</given-names>
            <surname>Goodwin</surname>
          </string-name>
          ,
          <article-title>Packetized predictive control over erasure channels</article-title>
          ,
          <source>in: 2007 American Control Conference</source>
          ,
          <year>2007</year>
          , pp.
          <fpage>1003</fpage>
          -
          <lpage>1008</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>G.</given-names>
            <surname>Franzè</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Tedesco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Famularo</surname>
          </string-name>
          ,
          <article-title>Model predictive control for constrained networked systems subject to data losses</article-title>
          ,
          <source>Automatica</source>
          <volume>54</volume>
          (
          <year>2015</year>
          )
          <fpage>272</fpage>
          -
          <lpage>278</lpage>
          . URL: http://www.sciencedirect.com/science/article/pii/S0005109815000710. doi: 10.1016/j.automatica.2015.02.018.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>G.</given-names>
            <surname>Franzè</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Tedesco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Lucia</surname>
          </string-name>
          ,
          <article-title>Resilient control for cyber-physical systems subject to replay attacks</article-title>
          ,
          <source>IEEE Control Systems Letters</source>
          <volume>3</volume>
          (
          <year>2019</year>
          )
          <fpage>984</fpage>
          -
          <lpage>989</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>A. K.</given-names>
            <surname>Tangirala</surname>
          </string-name>
          ,
          <source>Principles of System Identification: Theory and Practice</source>
          , CRC Press,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>B. D.</given-names>
            <surname>Argall</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Chernova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Veloso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Browning</surname>
          </string-name>
          ,
          <article-title>A survey of robot learning from demonstration</article-title>
          ,
          <source>Robotics and Autonomous Systems</source>
          <volume>57</volume>
          (
          <year>2009</year>
          )
          <fpage>469</fpage>
          -
          <lpage>483</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>A.</given-names>
            <surname>Billard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Calinon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Dillmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Schaal</surname>
          </string-name>
          ,
          <article-title>Robot programming by demonstration</article-title>
          ,
          <source>in: Springer Handbook of Robotics</source>
          , Springer,
          <year>2008</year>
          , pp.
          <fpage>1371</fpage>
          -
          <lpage>1394</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>S.</given-names>
            <surname>Schaal</surname>
          </string-name>
          ,
          <article-title>Is imitation learning the route to humanoid robots?</article-title>
          ,
          <source>Trends in Cognitive Sciences</source>
          <volume>3</volume>
          (
          <year>1999</year>
          )
          <fpage>233</fpage>
          -
          <lpage>242</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>S.</given-names>
            <surname>Ross</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Gordon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Bagnell</surname>
          </string-name>
          ,
          <article-title>A reduction of imitation learning and structured prediction to no-regret online learning</article-title>
          ,
          <source>in: Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics</source>
          , JMLR Workshop and Conference Proceedings
          ,
          <year>2011</year>
          , pp.
          <fpage>627</fpage>
          -
          <lpage>635</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Duan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Andrychowicz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stadie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schneider</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Sutskever</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Abbeel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Zaremba</surname>
          </string-name>
          ,
          <article-title>One-shot imitation learning</article-title>
          ,
          <source>in: Advances in neural information processing systems</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>1087</fpage>
          -
          <lpage>1098</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>A. Y.</given-names>
            <surname>Ng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Russell</surname>
          </string-name>
          , et al.,
          <article-title>Algorithms for inverse reinforcement learning</article-title>
          ,
          <source>in: ICML</source>
          , volume
          <volume>1</volume>
          ,
          <year>2000</year>
          , p.
          <fpage>2</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>T.</given-names>
            <surname>Osa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Pajarinen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Neumann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Bagnell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Abbeel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Peters</surname>
          </string-name>
          ,
          <article-title>An algorithmic perspective on imitation learning</article-title>
          ,
          <source>Foundations and Trends® in Robotics</source>
          <volume>7</volume>
          (
          <year>2018</year>
          )
          <fpage>1</fpage>
          -
          <lpage>179</lpage>
          . URL: http://dx.doi.org/10.1561/2300000053. doi: 10.1561/2300000053.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>M.-O.</given-names>
            <surname>Pahl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Liebald</surname>
          </string-name>
          ,
          <article-title>Information-centric IoT middleware overlay: VSL</article-title>
          ,
          <source>in: 2019 International Conference on Networked Systems (NetSys) (NetSys'19)</source>
          , Garching b. München, Germany,
          <year>2019</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>