A Tool for Monitoring and Maintaining System Trustworthiness at Runtime*

Abigail Goldsteen1, Micha Moffie1, Torsten Bandyszak2, Nazila Gol Mohammadi2, Xiaoyu Chen3, Symeon Meichanetzoglou4, Sotiris Ioannidis4, Panos Chatziadam4

1 IBM Research - Haifa, Israel
{abigailt,moffie}@il.ibm.com
2 paluno - The Ruhr Institute for Software Technology, University of Duisburg-Essen, Germany
{torsten.bandyszak,nazila.golmohammadi}@paluno.uni-due.de
3 IT Innovation Centre, University of Southampton, UK
wxc@it-innovation.soton.ac.uk
4 Foundation for Research and Technology Hellas, Greece
{simosme,sotiris,panosc}@ics.forth.gr

* Copyright © 2015 by the authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors.

Abstract. Trustworthiness of software systems is a key factor in their acceptance and effectiveness. This is especially the case for cyber-physical systems, where incorrect or even sub-optimal functioning of the system may have detrimental effects. In addition to designing systems with trustworthiness in mind, monitoring and maintaining trustworthiness at runtime is critical to identify issues that could negatively affect a system's trustworthiness. In this paper, we present a fully operational tool for system trustworthiness maintenance, covering a comprehensive set of quality attributes. It automatically detects, and in some cases mitigates, trustworthiness-threatening events. The use of such a tool can enable complex software systems to support runtime adaptation and self-healing, thus reducing the overall upkeep cost and complexity.

Keywords: Trustworthiness, runtime, monitoring, mitigation, adaptation.

1 Introduction

Cyber-physical systems (CPS) are highly connected, distributed, software-intensive systems that interact with other software as well as a multitude of physical entities. Trustworthiness of CPS is a key factor in their effectiveness. We define trustworthiness as the assurance that a system will perform as expected, or meet certain requirements, as defined by trustworthiness attributes. Different trustworthiness attributes should be considered, not only those related to security or reliability. In addition, different systems may have different requirements with regard to these attributes. The target trustworthiness may be derived from system requirements or service level agreements (SLAs).

In addition to designing systems with trustworthiness in mind, monitoring and maintaining trustworthiness at runtime is critical to identify issues that could negatively affect a system's trustworthiness. These could stem from system failures, security attacks or even changes in the system's context. The relevant attributes should be measured and accounted for at all phases of the software lifecycle, and corrective actions taken if they are violated.

Existing solutions usually monitor only a subset of trustworthiness characteristics, or do not propose any mitigating actions that can be performed to alleviate a problem once it has been identified. In this paper, we build upon previous work [1] and present a fully operational tool for system trustworthiness maintenance. The tool covers a comprehensive set of trustworthiness attributes [2], based on a generic ontology suitable for different kinds of software systems. In addition, it envelops a complete maintenance cycle, starting from raw events collected from the system, through identification of possible threats to the system's trustworthiness, to automatic mitigation of these threats and verification that the issue was corrected. In some cases this whole flow, including execution of controls, can be fully automatic. The use of such a tool enables complex software systems to support runtime adaptation and self-healing with respect to trustworthiness, achieved by automatic identification and mitigation of problems in real time. This self-adaptivity to both external and internal changes in the system reduces the overall upkeep cost and complexity once the system is operational, since less manual intervention is required.

2 Tool Architecture and Implementation

The tool's architecture (see Fig. 1) is based on the concept of autonomic computing and the MAPE-K reference model [3]. It includes a Monitor component, responsible for collecting events from the system assets, storing them in a database and performing initial processing to compute trustworthiness metrics and misbehaviours; a Management component, which receives alerts from the Monitor, identifies relevant threats and controls, and selects the best controls to deploy; and a Mitigation component, which actually executes the selected controls.
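To make this division of responsibilities more concrete, the following Python sketch walks a single event through the three components. It is a minimal illustration under assumed interfaces: the class names echo the architecture, but the methods, thresholds, and mappings are invented for this example and are not the tool's actual API.

```python
# A minimal sketch of the tool's MAPE-K-style cycle. All method names,
# thresholds, and mappings are illustrative assumptions, not the real tool.

class Monitor:
    """Collects raw events and derives misbehaviours from metric thresholds."""
    def __init__(self, thresholds):
        self.thresholds = thresholds  # e.g. {"cpu_load": 0.9}

    def process(self, event):
        # Fire a misbehaviour for every metric exceeding its threshold.
        return [name for name, value in event.items()
                if value > self.thresholds.get(name, float("inf"))]

class Management:
    """Maps misbehaviours to threats and selects controls to deploy."""
    def __init__(self, threat_map, control_map):
        self.threat_map = threat_map      # misbehaviour -> threat
        self.control_map = control_map    # threat -> control

    def handle(self, misbehaviours):
        threats = {self.threat_map[m] for m in misbehaviours if m in self.threat_map}
        return [self.control_map[t] for t in threats if t in self.control_map]

class Mitigation:
    """Executes the selected controls via system-specific actions."""
    def deploy(self, controls):
        for control in controls:
            print(f"deploying control: {control}")  # placeholder for a real action

# One pass through the cycle for a single incoming event.
monitor = Monitor({"cpu_load": 0.9})
management = Management({"cpu_load": "Overload"}, {"Overload": "AddVM"})
mitigation = Mitigation()
mitigation.deploy(management.handle(monitor.process({"cpu_load": 0.97})))
```

In the actual tool the knowledge flowing between the components (metrics, threats, controls) is far richer, but the control loop follows this shape.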
Events may be sent to the Monitor in several ways. One option is to pre-configure the monitored system to send such events to the maintenance tool. This requires planning the support for runtime maintenance into the system design, so that observation and control interfaces are built into the system. More details on the required design-time system analysis, including the identification of relevant threats and controls, the identification of measurable system properties, and the design of the respective interfaces, can be found in [5]. Another option is to use specialized sensors that collect events in a specific environment (such as a mobile device), or a generic monitoring framework such as Zabbix [4]. The latter option also enables monitoring existing systems and applications.

The main processing sub-component within the Monitor is a Complex Event Processor (CEP), which receives all low-level events and fires a misbehaviour event whenever a single measurement reaches its pre-defined threshold or some more complex rule is triggered. The CEP depends on system-specific configuration that is based on observable system properties.
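The sketch below illustrates, under assumed event fields and thresholds, the two kinds of CEP rules mentioned above: a simple per-measurement threshold and a composite rule evaluated over a sliding window. The specific composite condition (sustained memory growth combined with an outgoing connection) is a hypothetical example, not a rule taken from the tool's configuration.

```python
# Illustrative CEP-style rules; event fields and thresholds are assumed.
from collections import deque

WINDOW = deque(maxlen=6)  # last six events, i.e. one minute at 10 s intervals

def simple_rule(event):
    # A single measurement reaching its pre-defined threshold.
    if event["cpu_load"] > 0.9:
        return "Overloaded"
    return None

def complex_rule(event):
    # A composite rule over a sliding window: sustained high memory use
    # combined with at least one outgoing network connection.
    WINDOW.append(event)
    sustained = (len(WINDOW) == WINDOW.maxlen
                 and all(e["mem_used"] > 0.8 for e in WINDOW))
    if sustained and event["outgoing_connections"] > 0:
        return "Promiscuous"
    return None

def process(event):
    # Returns the misbehaviour events fired for this low-level event.
    return [m for m in (simple_rule(event), complex_rule(event)) if m]

# Example: one health report from a monitored asset.
print(process({"cpu_load": 0.95, "mem_used": 0.85, "outgoing_connections": 1}))
```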
Fig. 1. Low-level Architecture of the Maintenance Tool

The main analysis in the Management is performed by a Trustworthiness Evaluator (TWE), which maintains and incrementally updates a semantic runtime model of the system. The model includes the different system assets and the connections between them, system threats and vulnerabilities, and the controls that have been deployed. This model comprises not only software assets, but also physical assets, such as hardware and the humans interacting with the system. The TWE utilizes machine reasoning based on a generic threat ontology that incorporates relevant security knowledge. This semantic model should encode as many common attack patterns as possible in order to correctly identify threats to the system.

The TWE uses a Bayesian network approach to analyze threat activity likelihood given the reported system behaviour. Bayesian networks are a powerful tool for constructing models of phenomena involving uncertainty. Bayesian models can combine expert knowledge with observational data, and can be refined over time through learning from observation. In our case, we encode the expert knowledge (e.g., prior threat likelihoods, causal probabilities, etc.) in the model, and use events observed at runtime as evidence to reason about threat activity likelihood and perform predictive analysis. Given the observed system status, the TWE performs a comprehensive threat analysis that determines the possible threats, their likelihoods, and whether any threats constitute a secondary effect of the activity of other threats. More details about the TWE reasoning engine can be found in [1].
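As a toy illustration of this style of reasoning, the sketch below applies Bayes' rule to update the likelihood that a threat is active once a misbehaviour is observed. All probabilities are invented for the example; the TWE's actual model is a full Bayesian network over the threat ontology, not a single two-node update.

```python
# Toy Bayesian update for threat activity; the probabilities are invented
# for illustration and do not come from the TWE's ontology.

# Expert knowledge encoded in the model: the prior threat likelihood and
# the probability of seeing the misbehaviour with/without the threat active.
P_THREAT = 0.01                 # prior: threat is active
P_OBS_GIVEN_THREAT = 0.95       # misbehaviour observed if threat is active
P_OBS_GIVEN_NO_THREAT = 0.05    # false-positive rate of the observation

def posterior_threat_probability(observed: bool) -> float:
    """P(threat | observation) via Bayes' rule."""
    if observed:
        p_obs = (P_OBS_GIVEN_THREAT * P_THREAT
                 + P_OBS_GIVEN_NO_THREAT * (1 - P_THREAT))
        return P_OBS_GIVEN_THREAT * P_THREAT / p_obs
    p_no_obs = ((1 - P_OBS_GIVEN_THREAT) * P_THREAT
                + (1 - P_OBS_GIVEN_NO_THREAT) * (1 - P_THREAT))
    return (1 - P_OBS_GIVEN_THREAT) * P_THREAT / p_no_obs

# With these numbers, one observed misbehaviour raises the threat likelihood
# from 1% to roughly 16%, which is then compared against an alert threshold.
print(round(posterior_threat_probability(True), 3))
```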
The trustworthiness evaluation process is triggered each time a meaningful system event is detected by the Monitor, including misbehaviours and system topology updates. Threats whose likelihood passes a pre-defined threshold are output, along with control options for mitigating each threat. The Control Identification and Selection component then selects one or more controls to deploy, taking into consideration both cost and different trustworthiness metrics. More information about the trustworthiness-cost optimization problem can be found in [7]. The Mitigation, which encompasses system-specific information on how to execute each control, actually deploys the selected controls.

Each of the above components needs to be configured to support the specific system that will be monitored: the Monitor is configured to recognize the events arriving from the system (or monitoring framework) and to derive metric values and misbehaviours; the TWE receives a model of the deployed system assets and how they are connected; and the Mitigation is configured with system-specific information on how to execute each control. The amount of monitoring information required from the system depends on the attack patterns we want to detect: the more complex the patterns, the more information is needed from the system in order to detect them.

3 Initial Evaluation

In this section we show how the trustworthiness runtime maintenance tool was used to monitor selected trustworthiness characteristics in a Distributed Attack Detection and Visualization (DADV) system, illustrated in Fig. 2. The goal of this evaluation was to validate that our tool can in fact increase a system's trustworthiness.

Fig. 2. DADV system architecture

The DADV sensors are low-interaction honeypots that run as virtual machines (VMs) within a Sensor Container System (SCS), which is essentially a hypervisor that hosts and manages the VMs. The traffic collected by the sensors is forwarded to a Centralized Analysis System (CAS) for storage and analysis. Each sensor is configured to monitor a set of unused IP addresses in an organization's corporate network. The sensors run simulated vulnerable network services (e.g., SSH, HTTP) in a controlled environment (a sandbox within the VM) in order to attract attackers and gather information about zero-day attacks. However, a skilled attacker could evade the sandbox and take control of the sensor VM. A compromised sensor is a major concern, as it poses a serious threat to the corporate network.

The runtime maintenance tool was used to monitor the health of the DADV sensors, with the goal of detecting compromised sensors. Under normal conditions, the sensor VMs' resource load (CPU, memory consumption, etc.) is relatively low. Furthermore, sensors should not open network connections to other devices in the network. The existence of an outgoing network connection is indicative of a compromised sensor, which must be blacklisted and switched off immediately. A sensor with very high resource utilization is also suspected of being compromised.

The sensors send health-statistic events to the Monitor every ten seconds. If an outgoing connection is initiated, the CEP triggers a Promiscuous misbehaviour, which is forwarded to the TWE. The TWE analysis outputs several relevant threats, of which the one with the highest likelihood is Unauthorised Communications, and proposes the Blacklisting control objective to mitigate it. Based on system-specific configuration, this is mapped to the DADV-specific control Stop Sensor, which is performed automatically by the Mitigation by calling an HTTP POST method of the SCS.
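A control execution of this kind could look like the sketch below. The endpoint URL, path, and payload are placeholders, since the actual SCS interface is part of the system-specific Mitigation configuration described above.

```python
# Hypothetical execution of the DADV-specific "Stop Sensor" control.
# The SCS address, path, and payload are assumed for illustration; the
# real interface belongs to the system-specific Mitigation configuration.
import requests

SCS_URL = "http://scs.example.org/api/sensors"  # placeholder address

def stop_sensor(sensor_id: str) -> bool:
    """Ask the Sensor Container System to blacklist and switch off a sensor."""
    response = requests.post(f"{SCS_URL}/{sensor_id}/stop",
                             json={"reason": "Unauthorised Communications"},
                             timeout=5)
    return response.status_code == 200
```

In a real deployment, a function of this kind would be invoked by the Mitigation component in place of the placeholder action shown in the earlier architecture sketch.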
When a high load is detected on a sensor, the Overloaded misbehaviour is triggered. This can lead to two possible threats: either the sensors are simply under a high load, which can be mitigated by adding an additional VM to the pool, or the sensor is under attack and should be blacklisted, as in the previous scenario. The likelihood of each threat occurring is computed by the TWE based on additional system behaviours.

We conducted a small-scale live experiment with 27 administrators. Each administrator used two alternative versions of the DADV system, one integrated with the maintenance tool and one without it. Results showed that the integrated system was better at early detection and mitigation of compromised sensors (it was able to mitigate 80% of the attacks, vs. 63% in the regular system), and was found by administrators to be more trustworthy (91% of administrators preferred the integrated version over the regular one). This demonstrates that a system's trustworthiness, as well as users' perceived trust and acceptance of a system, can be substantially increased with the use of such a tool.

4 Summary

We demonstrated a tool for runtime monitoring of software systems, and specifically cyber-physical systems, that enables automatically detecting and mitigating events that may threaten the system's trustworthiness. The tool was tested and validated on two trustworthiness-critical applications: a Fall Management system for Ambient Assisted Living and a Distributed Attack Detection and Visualization system for Cyber-Crisis Management. An initial proof of concept was also performed on real sensor data from an electric company.

The tool covers a large range of trustworthiness metrics and can be adapted to many types of systems. It supports runtime adaptation and self-healing of critical systems, thus reducing the overall upkeep costs and complexity and increasing system uptake and retention. This approach requires detailed knowledge about the system to configure the different components, sensors able to accurately observe events that may affect trustworthiness, as well as "hooks" into the system to support automatic deployment of the mitigating actions.

Acknowledgements. This work was supported by the FP7 project OPTET (grant no. 317631).

5 References

1. Gol Mohammadi, N., Bandyszak, T., Moffie, M., Chen, X., Weyer, T., Kalogiros, C., Nasser, B., Surridge, M.: Maintaining Trustworthiness of Socio-Technical Systems at Run-Time. In: 11th International Conference on Trust, Privacy & Security in Digital Business. LNCS, vol. 8647, pp. 1-12. Springer, Heidelberg (2014)
2. Gol Mohammadi, N., Paulus, S., Bishr, M., Metzger, A., Könnecke, H., Hartenstein, S., Weyer, T., Pohl, K.: Trustworthiness Attributes and Metrics for Engineering Trusted Internet-Based Software Systems. In: Helfert, M., Desprez, F., Ferguson, D., Leymann, F. (eds.) Cloud Computing and Services Science. CCIS, vol. 453, pp. 19-35. Springer, Heidelberg (2014)
3. Kephart, J.O., Chess, D.M.: The Vision of Autonomic Computing. IEEE Computer 36(1), pp. 41-50 (2003)
4. Zabbix, The Enterprise-class Monitoring Solution for Everyone, http://www.zabbix.com/
5. Bandyszak, T., Gol Mohammadi, N., Bishr, M., Goldsteen, A., Moffie, M., Nasser, B., Hartenstein, S., Meichanetzoglou, S.: Cyber-Physical Systems Design for Runtime Trustworthiness Maintenance Supported by Tools. In: 1st International Workshop on Requirements Engineering for Self-Adaptive and Cyber-Physical Systems (2015)
6. Surridge, M., Nasser, B., Chen, X., Chakravarthy, A., Melas, P.: Run-Time Risk Management in Adaptive ICT Systems. In: 8th International Conference on Availability, Reliability and Security (ARES), pp. 102-110 (2013)
7. Kalogiros, C., Kanakakis, M.: Profit-Maximizing Security Level of ICT Systems. To be published in: 3rd International Conference on Human Aspects of Information Security, Privacy and Trust (2015)