=Paper=
{{Paper
|id=Vol-2786/Paper8
|storemode=property
|title=Privacy-Preserving Data Sharing and Adaptable Service Compositions in Mission-Critical Clouds
|pdfUrl=https://ceur-ws.org/Vol-2786/Paper8.pdf
|volume=Vol-2786
|authors=Bharat Bhargava,Rohit Ranchal,Pelin Angin
|dblpUrl=https://dblp.org/rec/conf/isic2/BhargavaRA21
}}
==Privacy-Preserving Data Sharing and Adaptable Service Compositions in Mission-Critical Clouds==
<pdf width="1500px">https://ceur-ws.org/Vol-2786/Paper8.pdf</pdf>
<pre>
Privacy-Preserving Data Sharing and Adaptable Service Compositions in Mission-Critical Clouds

Bharat Bhargava (a), Pelin Angin (b) and Rohit Ranchal (c)

(a) Purdue University, West Lafayette, IN, USA
(b) Middle East Technical University, Ankara, Turkey
(c) IBM Cloud Lab, Austin, TX, USA


                                   Abstract
                                   Existing cloud systems lack robust mechanisms to monitor compliance of services with security
                                   and performance policies under changing contexts, and to ensure uninterrupted operation in
                                   case of failures. On the other hand, microservices-based cloud system architectures that have
                                   become indispensable for defense applications require systematic monitoring of service
                                   operations to satisfy their resiliency and antifragility goals. In this work we propose a unified
                                   model for enforcing security and performance requirements of mission-critical cloud systems
                                   even in the presence of anomalous behavior/attacks and failure of services. The model allows
                                   for proactive mitigation of threats and failures in cloud-based systems through active
                                   monitoring of the performance and behavior of services, promising achievement of resiliency
                                   and antifragility under various failures and attacks. It also provides secure dissemination of data
                                   between services to ensure end-to-end secure operation of critical missions.

                                   Keywords
                                   Cloud computing, privacy, service composition


International Semantic Intelligence Conference (ISIC 2021), Feb 25-27, 2021, New Delhi, India
EMAIL: bbshail@purdue.edu (A. 1); pangin@ceng.metu.edu.tr (A. 2); ranchal@us.ibm.com (A. 3)
© 2020 Copyright for this paper by its authors. Use permitted under Creative Commons License
Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

1. Introduction

    The rise of cloud computing and the Internet of Things (IoT) has created new security
challenges with a large attack surface. Microservices-based cloud system architectures for
defense applications require systematic monitoring of service operations to satisfy their
resiliency (withstand cyber-attacks, and sustain and recover critical function) and
antifragility (increase in capability, resilience, or robustness as a result of mistakes,
faults, attacks, or failures) goals.
    When clients interact with a cloud service, they expect certain levels of Quality of Service
(QoS) guarantees, expressed as service performance, security and privacy policies. Controlling
compliance with service level agreements (SLAs) requires continuous monitoring of services in an
enterprise, as sudden changes in context can cause performance to deteriorate, if not result in
the failure of a whole composition. To provide optimal performance in the enterprise cloud
architecture under varying contexts, we need context-awareness and adaptation mechanisms for
service-oriented architecture (SOA) and cloud service domains. Cloud platforms are vulnerable to
increasingly complex attacks that could violate the privacy of data stored on them or shared
with web services, which is especially detrimental in case of mission-critical operations. In
order to mitigate these problems, cloud systems need to integrate proactive defense mechanisms,
which provide increased resiliency by treating potentially malicious service interactions and
data sharing before they take place.
    These requirements call for the development of unified models for performance and security
monitoring of operations that provide valuable input for achieving situation-awareness, dynamic
adaptability and restoration of services in the face of changes in context, and effective
mechanisms for detection and sharing of threat data, as well as enforcing cross-domain security
and QoS constraints. Controlled privacy and integrity-preserving data dissemination and
filtering models are needed to ensure protection of the privacy of sensitive data in trusted and
untrusted clouds.
    In this paper, we describe the design of a unified monitoring and response model for
privacy-preserving data dissemination and adaptable service compositions in mission-critical
cloud systems. Through unsupervised learning-based detection of anomalies in cloud services and
adaptable real-time service composition, the proposed model aims to achieve a highly resilient
cloud architecture for mission-critical systems.

2. Related work

    Current industry-standard cloud systems such as Amazon EC2 provide coarse-grained monitoring
capabilities (e.g. CloudWatch) for various performance parameters for services deployed in the
cloud. Although such monitors are useful for handling issues such as load distribution and
elasticity, they do not provide information regarding potentially malicious activity in the
domain. Log management and analysis tools such as Splunk [8], Graylog [9] and Kibana [10]
provide capabilities to store, search and analyze big data gathered from various types of logs
on enterprise systems, enabling organizations to detect security threats through examination by
system administrators. Such tools mostly require human intelligence for detection of threats and
need to be complemented with automated analysis and accurate threat detection capability to
quickly respond to possibly malicious activity in the enterprise and provide increased
resiliency by automating response actions.
    Development of runtime-auditing systems for mobile and web-based services has been the focus
of many research efforts. Li et al. [3] describe a system for auditing runtime interaction
behavior of web services. They use finite state automata to validate predefined interaction
constraints, where message interception is bound to the particular server used for deploying
Web services. Simmonds et al. [1] present a more comprehensive auditing solution for checking
behavioral correctness of web service conversations. Their proposal is for a specific
application server, since they utilize an event mechanism provided by that server.
    To support flexible auditing of the behavior pattern for composite services, Wu et al. [2]
demonstrate an “aspect extension” to WS-BPEL, in which history-based pointcuts specify the
pattern of interest within a range, and advices describe the associated action to manage the
process if the specified pattern occurs. Their solution targets specific orchestration engines
and is therefore not a generic solution for modern cloud-based services. In [3] and [4] the
identification of trusted services and dynamic trust assessment in SOA are studied. Malik et al.
[4] introduce a framework called RATEWeb for trust-based service selection and composition based
on peer feedback. It is based on decentralized techniques for evaluating reputation-based trust
with ratings from peers. Spanoudakis et al. [5] present an approach to keep track of trusted
services to address the compliance of promises expressed within their service level agreements
(SLAs). The trust assessment is based on information collected by monitoring services in
different operational contexts and subjective assessments of trust provided by different
clients. Approaches like [3] and [5] are not suitable for compositions with many services, as
the monitoring system would need to collect intensive information from many clients. Gamble et
al. [6] present a tiered approach to auditing information in the cloud. Filtering and reasoning
over the audit trails can manifest potential security vulnerabilities and performance attributes
as desired by stakeholders. Baird and Gamble [7] introduce a system to model the essential
security elements and define the proper message structure and content that each service in the
composition must have, based on a security meta-language (SML). Both approaches focus on how
services can comply with established standards, but their implementation requires extensive
changes in the current infrastructures. Our previous work [17] proposed service interceptors to
enforce policies on interactions between different cloud services in a composition. In this
work, we take a monitoring approach for service health and anomalies for more informed real-time
decisions and build on [16] to dynamically update service compositions with low overhead.
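As a minimal sketch of the automaton-based auditing idea attributed to Li et al. [3] earlier in
this section, the snippet below replays intercepted message types through a small finite state
automaton and reports the first violation of an interaction constraint. The protocol (login,
query, logout), the state names and the audit function are hypothetical and chosen only for
illustration; they are not taken from [3] or from the system proposed in this paper.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical interaction constraint: a client must log in before querying,
# may query any number of times, and must log out to end the conversation.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("start", "login"): "authenticated",
    ("authenticated", "query"): "authenticated",
    ("authenticated", "logout"): "done",
}
ACCEPTING: Set[str] = {"done"}


def audit(messages: Iterable[str]) -> Tuple[bool, str]:
    """Replay intercepted message types through the automaton.

    Returns (True, final_state) if the conversation satisfies the constraint,
    otherwise (False, reason) describing the first violation found.
    """
    state = "start"
    for index, message in enumerate(messages):
        next_state = TRANSITIONS.get((state, message))
        if next_state is None:
            # No transition is defined for this message in the current state.
            return False, f"message {index} ({message!r}) not allowed in state {state!r}"
        state = next_state
    if state not in ACCEPTING:
        # The conversation stopped before reaching an accepting state.
        return False, f"conversation ended in non-accepting state {state!r}"
    return True, state
```

Under these assumptions, a component that intercepts requests could call audit once per
conversation, for example audit(["login", "query", "logout"]), and treat a False result as a
constraint violation to be logged or reported.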
Figure 1: Solution architecture

3. Proposed Solution

    In this paper, we describe an approach that uses a distributed network of service activity
monitors to audit and detect service behavior and performance changes, adaptively update service
compositions and securely share data in a mission-critical cloud system. By integrating
components for service performance monitoring, dynamic service reconfiguration and adaptable
data dissemination, the proposed model aims to provide a unified architecture for agile and
resilient computing in trusted and untrusted clouds. The overall architecture of the proposed
model is shown in Figure 1. General characteristics of the solution are as follows:
    •    Each service domain, such as a cluster of machine instances in the cloud or a set of
    mobile services in close proximity to each other, has a service monitor that tracks
    interactions among the services in the domain as well as interactions outside the domain.
    •    The local service monitors (Monitor A, Monitor B etc.) gather performance and security
    data for each service, including response time, response status and authentication failures
    among other parameters, by intercepting service requests and utilizing available performance
    monitoring software. The data collected are logged in the database of each local monitor and
    mined using unsupervised machine learning models to detect deviations from normal behavior.
    The analysis results are reported to a central monitor in the form of summary statistics for
    the services.
    •    The central monitor utilizes information submitted by local monitors to update trust
    values of services and reconfigure services/service compositions to provide resiliency
    against attacks and failures. The central monitor utilizes the gathered information to form
    cyber threat intelligence feeds about the services in a standard format.
    •    Detection of service failures and/or suboptimal service performance triggers
    restoration of optimal behavior through dynamic reconfiguration of service compositions.
    •    Privacy-preserving dissemination of data between services is achieved using active
    bundles. Likewise, data services in the cloud utilize active bundles for protected data
    storage that enforces fine-grained security policies associated with the usage of the data
    items when authorizing access.

3.1. Cloud Service Anomaly Detection

    In this section we present our system architecture for the monitoring of cloud services and
detection of anomalies in order to
provide adaptable and resilient service operation in a mission-critical cloud system. Figure 2
shows a high-level overview of service monitoring and anomaly detection in the proposed
architecture.
    Monitoring in the system architecture is distributed in the sense that each service domain,
such as a cluster of machine instances in the cloud, has a service monitor that tracks
interactions among the services in the domain as well as interactions with services or users
outside the domain. When a service is deployed, it is registered with the local monitor of its
domain in order to be discoverable by other services or users. The local monitors have access to
all interactions with the services registered in their domain, and they gather
interaction/performance data streams containing items such as response time, response status and
authentication failures for each service, using interceptors that are transparent to each
service implementation. Services in each domain are also tracked using aspect-oriented
programming (AOP)-based software monitors for parameters requiring finer-grained control. The
data collected are mined by the anomaly detection module of the domain and reported to the
central monitor in the form of summary health statistics and trust values for the services.
These statistics are utilized by the dynamic service composition module when making decisions
about which services to include in an orchestration.

3.1.1. Unsupervised learning for service anomaly detection

    Research in machine learning has resulted in various models for detection of outliers in
different types of data. While supervised and unsupervised classification models have been
applied with success to a variety of domains [19], robust real-time models for detecting
anomalies and failures in service operation are still under development. The main shortcoming of
supervised anomaly detection models, including deep learning models, is that they require a
large amount of training data and can only provide accurate results on anomalies that were
previously observed in the system. This makes such models unable to capture completely new
threats/anomalies, a capability that is essential in an environment of ever-growing security
vulnerabilities and attacks.

Figure 2: Cloud service anomaly detection

    In this paper we focus on unsupervised models for outlier/anomaly detection in service
behavior. A significant advantage of unsupervised models is that the training data required is
gathered from the behavior of services operating under normal conditions (possibly in an
isolated environment); i.e. no attack data is required to train these models. Specifically, we
focus on two unsupervised learning models, k-means clustering and one-class support vector
machines (SVM), due to their simplicity and success in anomaly detection tasks. Training of the
models is performed with data gathered under normal system operation (i.e., isolated execution
under a controlled runtime environment).
    Service performance and security parameters that are used in the learning process for
general cloud-based services and data services include: number of requests/sec, total error
rate, CPU utilization, memory utilization, number of authentication failures, number of
connection failures, network latency, service response time, disk space usage and number of
database connections. Note that this is not an exhaustive list, and other relevant parameters
that can be obtained during service runtime through monitoring can easily be integrated into the
learning algorithms.
    K-means Clustering: K-means clustering partitions n observations into k clusters in which
each observation belongs to the cluster with the nearest mean [11]. When applied to the service
anomaly detection problem, k-means clustering finds clusters of parameter values of normal
service behavior during the training phase, using the data obtained with service monitoring
under normal operation. During the online anomaly detection process, data gathered by the
service monitors are utilized to measure the distance of the service
behavior (i.e., values of performance/security parameters) at each time point to all clusters
found by the algorithm. If the value does not fall in any cluster, an anomaly signal is raised.
    One-class Support Vector Machines (SVM): One-class SVM [12] is an extension of the
well-known support vector machines (SVM) classification algorithm, where training is performed
using only positive examples and test instances are classified as belonging or not belonging to
the single (positive) class. Essentially, one-class SVM learns a decision function for novelty
detection, which is what we try to achieve in service anomaly detection to mitigate attacks with
no well-known signature. SVM constructs a decision hyperplane boundary based on normal runtime
conditions of the service it is trained for. During the online anomaly detection phase,
instances lying outside the boundary for normal operation are classified as anomalous, resulting
in an anomaly signal.

3.2. Privacy-Preserving Data Dissemination between Services in Mission-Critical Clouds

    We propose a policy-based distributed data dissemination model, which provides secure data
dissemination, i.e., every service gets access only to those parts of data for which it is
authorized. The goal of the proposed solution is to selectively disclose information based on
policies, minimize unnecessary disclosure and ensure security and privacy of the information.
Our solution uses Active Bundle (AB) to achieve this [13, 14, 15]. An active bundle (AB) is a
self-protecting data mechanism that includes sensitive data, metadata (policies) and a policy
enforcement engine (Virtual Machine) for policy enforcement. Clients interact with services by
sending an AB, which contains encrypted data about their request and the policies associated
with the data. AB is a data protection mechanism, which can be used to protect data at various
stages throughout its lifecycle. AB is a robust and extensible scheme that can be used for
secure cross-domain data dissemination. AB includes the following components:
    •    Sensitive data: It is the digital content that needs to be protected from privacy
    violations, data leaks, unauthorized dissemination, etc. The digital content can include
    documents, pieces of code, images, audio, video files etc. This content can have several
    items, each with a different security/privacy level and an applicable policy to ascertain
    its distribution and usage.
    •    Metadata: It describes the active bundle and its policies. This can include information
    such as AB identifier, information about its creator and owner, creation time, lifecycle
    etc. It also includes policies that govern AB’s interaction and usage of its data, such as
    access control policies, privacy policies, dissemination policies etc.
    •    Policy Enforcement Engine (or Virtual Machine, VM): It is a specific-purpose VM used to
    operate AB, protect its content and enforce policies (for example, disclosing to a service
    only the portion of sensitive data that it requires to provide service).
Further details of the active bundle solution can be found in [13].

4. Implementation of Distributed Service Monitoring and Adaptable Composition

    In the prototype distributed service monitoring system, each local service monitor has been
implemented using Apache Axis2 valves for intercepting all service requests in the domain, and
each service domain includes a MySQL database, in which data (response time, response status,
CPU usage, memory usage) about each service gathered by the monitor is logged. Additionally,
AOP-based service interceptors were added to allow for finer-grained monitoring and policy
enforcement capability. The central monitor is implemented as a web service on Amazon EC2, which
has its own database to store health, endpoint address and category data for various services.
While each service invocation leads to an update in the local monitor’s database, summary data
for all services in a specific domain is reported to the central monitor periodically by each
local monitor. One of the benefits of cloud computing is that there can be multiple options for
services to achieve a specific task. We
                                                                                                   65

define a service category as an abstraction for a   module for scenarios with total number of
set of services that provide similar                services from 25 to 125. The results show that
functionality. A service is the actual              the execution time changes almost linearly.
implementation of the functionality for a           Even for 125 services in 5 categories (which is
specific service category. The dynamic service      unlikely to be surpassed in any practical SOA
composition module utilizes information from        scenario), the dynamic service composition
the central monitor’s database to create service    module performs very well and the average
orchestrations that comply with users’              response time is 22ms.
performance and/or security requirements on-             In the second experiment, we investigated
the-fly. The goal of dynamic service                the effect of the number of service constraints
composition is to maximize the resiliency and       on the performance of dynamic service
trustworthiness of the system based on              composition module. In this experiment, we set
selecting the best individual services, while       the number of services to 50 and the number of
meeting the constraints (security and SLA           service categories to 5. According to Figure 4,
requirements).                                      the effect of the QoS constraints on
                                                    performance is sublinear. Even after increasing
                                                    the input size by a factor of 5, the response time
                                                    only increases by 50% and remains under 20
                                                    ms.

                                                    5. Conclusion

                                                        Existing cloud enterprise systems lack
                                                    robust mechanisms to monitor compliance of
Figure 3: Effect of number of services on           services with security and performance policies
dynamic service composition time                    under changing contexts, and to ensure
                                                    uninterrupted operation in case of failures. This
                                                    work proposes a unified model for enforcing
                                                    security and performance requirements of
                                                    mission-critical cloud systems even in the
                                                    presence of anomalous behavior/attacks and
                                                    failure of services. Service monitors include
Figure 4: Effect of number of QoS constraints on dynamic service composition time

    We performed experiments to evaluate the overhead of dynamic service
composition using testbeds in the Amazon EC2 cloud. Note that the problem here
is finding an optimal service composition (i.e., selecting a service from each
service category required in the composition) subject to a set of QoS and
security constraints. In the first experiment, we investigated the effect of
the number of services to choose from for each service category, on the
performance of dynamic service composition. In this experiment, we set the
number of service categories to 5 and the number of QoS constraints to 3.
Figure 3 shows the response time of the dynamic service composition

components that enable the adaptation of the systems in response to detected
anomalies, such that the non-stop system operations continue and comply with
security requirements. The resiliency is accomplished through dynamic
reconfiguration and restoration of services. Our approach is complementary to
functionality provided by log management tools such as Splunk in that it
develops models that accurately analyze the log data gathered by such tools to
immediately detect deviations from normal behavior and quickly respond to such
anomalous behavior in order to provide increased automation of threat detection
as well as resiliency. Our approach allows for proactive mitigation of threats
and failures in cloud-based systems through active monitoring of the
performance and behavior of services, promising achievement of resiliency and
antifragility under various failures and attacks. The proposed approach offers
a unified model for agile and resilient distributed computing,

based on standardized technologies for monitoring and sharing of performance
and threat data, promising for easy adoption in industry. The proposed
performance and security policy enforcement model enables integration of
various types of policies and optimization algorithms as well as filtering
capabilities (e.g., high-quality vs. lower-quality data) for various data
types, which is needed for fine-grained control over dissemination, searches,
analytics, and operations in cross domains of privacy.
    Future work will include detailed evaluation of the overheads and accuracy
of service anomaly detection under various attacks and operational failures as
well as extension of the privacy-preserving data dissemination mechanism
between the services to a blockchain-based model, where the integrity and
validity of the data shared between mission-critical services can be ensured
with strong security guarantees.


6. References

[1] J. Simmonds, Y. Gan, M. Chechik, S. Nejati, B. O'Farrell, E. Litani,
    J. Waterhouse, Runtime monitoring of Web service conversations, IEEE
    Transactions on Services Computing 2.3 (2009): 223-244.
[2] G. Wu, J. Wei, T. Huang, Flexible pattern monitoring for WSBPEL through
    stateful aspect extension, Proceedings of the IEEE International
    Conference on Web Services (ICWS '08), 2008, pp. 577-584.
[3] Z. Li, Y. Jin, J. Han, A runtime monitoring and validation framework for
    Web service interactions, Proceedings of the Australian Software
    Engineering Conference, Sydney, Australia, 2006, pp. 70-79.
[4] Z. Malik, A. Bouguettaya, Rateweb: reputation assessment for trust
    establishment among Web services, The VLDB Journal 18.4 (2009): 885-911.
[5] G. Spanoudakis, S. LoPresti, Web service trust: towards a dynamic
    assessment framework, Proceedings of the IEEE International Conference on
    Availability, Reliability and Security (ARES’09), 2009, pp. 33-40.
[6] R. Xie, R. Gamble, A tiered strategy for auditing in the cloud,
    Proceedings of the 5th IEEE International Conference on Cloud Computing
    (CLOUD), 2012, pp. 945-946.
[7] R. Baird, R. Gamble, Developing a security meta-language framework,
    Proceedings of the 44th Hawaii International Conference on System
    Sciences, 2011, pp. 1-10.
[8] Splunk, 2020. URL: http://www.splunk.com.
[9] Graylog, 2020. URL: http://www.graylog.org.
[10] Kibana, 2020. URL: https://www.elastic.co/products/kibana.
[11] S. P. Lloyd, Least squares quantization in PCM, IEEE Transactions on
     Information Theory 28.2 (1982): 129-137.
[12] B. Scholkopf, J.C. Platt, J. Shawe-Taylor, A.J. Smola, R.C. Williamson,
     Estimating the support of a high-dimensional distribution, Technical
     report, Microsoft Research, MSR-TR-99-87, 1999.
[13] R. Ranchal, Cross-Domain Data Dissemination and Policy Enforcement,
     Ph.D. thesis, Purdue University, West Lafayette, IN, 2015.
[14] R. Ranchal, B. Bhargava, L.B. Othmane, L. Lilien, A. Kim, Protection of
     identity information in cloud computing without trusted third party,
     Proceedings of the IEEE International Symposium on Reliable Distributed
     Systems (SRDS), 2010, pp. 368-372.
[15] P. Angin, B. Bhargava, R. Ranchal, N. Singh, L. Lilien, L.B. Othmane, An
     entity-centric approach for privacy and identity management in cloud
     computing, Proceedings of the IEEE International Symposium on Reliable
     Distributed Systems (SRDS), 2010, pp. 177-183.
[16] B. Bhargava, P. Angin, R. Ranchal, S. Lingayat, A distributed monitoring
     and reconfiguration approach for adaptive network computing, Proceedings
     of the 6th International Workshop on Dependable Network Computing and
     Mobile Systems (DNCMS) in conjunction with SRDS’15, 2015, pp. 31-35.
[17] R. Fernando, R. Ranchal, B. Bhargava, P. Angin, A monitoring approach for
     policy enforcement in cloud services, Proceedings of the 10th IEEE
     International Conference on Cloud Computing (CLOUD’17), 2017,
     pp. 600-607.

</pre>