<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Model-driven Automated Deployment of Large-scale CPS Co-simulations in the Cloud</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author"><string-name><given-names>Yogesh D.</given-names> <surname>Barve</surname></string-name><xref ref-type="aff" rid="aff0"/></contrib>
        <contrib contrib-type="author"><string-name><given-names>Himanshu</given-names> <surname>Neema</surname></string-name><xref ref-type="aff" rid="aff0"/></contrib>
        <contrib contrib-type="author"><string-name><given-names>Aniruddha</given-names> <surname>Gokhale</surname></string-name><xref ref-type="aff" rid="aff0"/></contrib>
        <contrib contrib-type="author"><string-name><given-names>Janos</given-names> <surname>Sztipanovits</surname></string-name><xref ref-type="aff" rid="aff0"/></contrib>
        <aff id="aff0"><institution>Institute for Software-Integrated Systems, Dept. of EECS, Vanderbilt University</institution>, <addr-line>Nashville, TN 37212</addr-line>, <country country="US">USA</country></aff>
      </contrib-group>
      <abstract>
        <p>With increasing advances in Internet-enabled devices, large cyber-physical systems (CPS) are being realized by integrating several sub-systems. Analyzing and reasoning about the different properties of such CPS requires co-simulations that compose individual, heterogeneous simulators, each of which addresses only certain aspects of the CPS. Often these co-simulations are realized as point solutions or composed in an ad hoc manner, which makes them hard to reuse, maintain, and evolve. Although our prior work on a model-based framework called the Command and Control Wind Tunnel (C2WT) supports distributed co-simulations, many challenges remain unresolved. For instance, evaluating these complex CPS requires large amounts of computational and I/O resources, for which the cloud is an attractive option, yet there is a general lack of scientific approaches to deploying co-simulations in the cloud. In this context, the key challenges include (i) rapid provisioning and de-provisioning of experimental resources in the cloud for different co-simulation workloads, (ii) handling simulation incompatibility and resource violations, (iii) reliable execution of co-simulation experiments, and (iv) reproducible experiments. Our solution builds upon the C2WT heterogeneous simulation integration technology and leverages Docker container technology to provide a model-driven integrated tool-suite for specifying experiment and resource requirements and for deploying repeatable cloud-scale experiments. In this work, we present the core concepts and architecture of our framework, and summarize our current work in addressing these challenges.</p>
      </abstract>
      <kwd-group kwd-group-type="author">
        <kwd>co-simulations</kwd>
        <kwd>verification</kwd>
        <kwd>model driven</kwd>
        <kwd>cloud</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>I. INTRODUCTION AND PROBLEM STATEMENT</title>
      <p>Large-scale cyber-physical system (CPS) experiments are
increasingly being deployed for real-world scenarios in
domains such as building automation and control, the smart power
grid, health-care, and industrial processes. For example, power
grid CPS are composed of many multi-domain subsystems
with different assets and technologies, such as the electric grid,
sensors, networking, and physical control systems. Thus,
designing and analyzing such complex systems requires extensive
simulation and prototyping tools that span multiple domains.</p>
      <p>While recent advances in simulation tools have enabled
modeling and simulation of system characteristics, a single
simulator tool is not sufficient to model and experiment with
CPS: no single simulator can simulate all aspects of a CPS,
and moreover, CPS require heterogeneous resources and
execution environments. Thus, co-simulation environments have
emerged as an approach for modeling and simulating CPS.
Co-simulation, or coupled simulation, is a methodology that
evaluates the behavior of a system by integrating simulations of
its components. Each specialized simulation tool processes and
communicates events with the other participating simulation
engines to model a large-scale CPS. To realize such a
co-simulation platform, proper time synchronization and coordination
of message flows among the participating simulation engines
is needed.</p>
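      <p>For illustration, the conservative time-synchronization rule sketched above can be shown with a minimal stand-in; this is not the C2WT/HLA API, and all names here are hypothetical.</p>

```python
# Illustrative sketch (not the C2WT/HLA API): conservative time
# synchronization among co-simulation federates. Each federate requests a
# time advance; the coordinator grants the smallest requested time so that
# no federate runs ahead of messages it might still receive.

class Federate:
    def __init__(self, name, step):
        self.name = name
        self.step = step          # fixed logical time step of this simulator
        self.time = 0.0           # current granted logical time

    def requested_time(self):
        return self.time + self.step

class Coordinator:
    def __init__(self, federates):
        self.federates = federates

    def advance(self):
        # Grant the minimum requested time: the classic conservative rule.
        grant = min(f.requested_time() for f in self.federates)
        for f in self.federates:
            if f.requested_time() == grant:
                f.time = grant    # only federates due at 'grant' advance
        return grant

network = Federate("network-sim", step=0.5)
grid = Federate("grid-sim", step=2.0)
coord = Coordinator([network, grid])
grants = [coord.advance() for _ in range(5)]
print(grants)                     # → [0.5, 1.0, 1.5, 2.0, 2.5]
```

      <p>Note that the slower federate only advances when the global grant reaches its requested time; in an HLA-based run-time this corresponds to time-advance requests and grants exchanged with the run-time infrastructure.</p>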
      <p>
        C2WT [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] is a heterogeneous simulation integration
framework that we have previously developed at Vanderbilt
University. It enables model-based rapid synthesis of
heterogeneous and distributed CPS co-simulations. C2WT relies on
the IEEE High-Level Architecture (HLA) standard.
Domain-specific tools have been built on top of C2WT, such as
C2WT-TE [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], which targets the transactive smart grid domain, and the
SURE testbed [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] that targets security and resilience in CPS.
      </p>
      <p>Despite these advances, many challenges still remain
unresolved. For instance, large-scale simulations exhibit compute
and/or I/O intensive workloads and may need large amounts
of such resources. Cloud computing can provide elastic,
on-demand access to such a large pool of resources.
However, existing cloud platforms lack tools for effective
deployment of large-scale CPS simulations. Migrating existing
simulation tools to the cloud is also a challenging task,
which hinders the widespread adoption of cloud computing
for CPS co-simulation. This problem is further exacerbated
because the CPS domain experts conducting the simulations often
lack a proper understanding of cloud resource provisioning
and utilization, resulting in ad hoc and sub-optimal
deployment of CPS simulations in the cloud.</p>
      <p>In this research, we focus primarily on cloud-based
provisioning of large-scale CPS experiments, and outline the key
challenges associated with deploying and experimenting with
CPS co-simulations in the cloud.</p>
    </sec>
    <sec id="sec-2">
      <title>II. CHALLENGES IN REALIZING CLOUD-HOSTED CPS CO-SIMULATIONS</title>
      <p>The following challenges must be resolved to support
reusable and extensible cloud-based CPS co-simulations.</p>
      <p>1. Integrated tool to rapidly deploy experiments on cloud
resources: To run experiments in the cloud, the framework
should be able to acquire required resources, instantiate the
deployment and execution of the co-simulation, and tear down
the acquired resources when the experiment is completed.
The run-time infrastructure should require minimal startup and
shutdown time to ensure a quick experiment start and prompt
release of resources, without incurring additional resource
utilization cost. The simulations also impose different resource
requirements such as CPU cores, GPU, RAM, and disk space.
Moreover, the simulations could be CPU and/or I/O intensive.
These resource requirements must be configured in the tool,
and the cloud resources should be allocated accordingly. A
dynamic cloud resource management strategy can be highly
effective for better cloud resource utilization.</p>
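      <p>As a sketch of how such configured resource requirements might map onto container launch options, consider the following; the requirement format and the function are hypothetical, not the framework's actual interface, though the keyword arguments mirror those of the Docker SDK for Python.</p>

```python
# Hypothetical sketch: translate a simulator's declared resource requirement
# into Docker run options (docker-py style kwargs). The requirement schema
# and function name are illustrative, not the framework's actual API.

def to_container_opts(req):
    """Build container launch options from a resource requirement."""
    opts = {
        "image": req["image"],
        "detach": True,
        "nano_cpus": int(req["cpus"] * 1e9),   # docker-py takes CPUs in 1e-9 units
        "mem_limit": f'{req["ram_gb"]}g',
    }
    if req.get("gpu"):
        opts["device_requests"] = ["gpu"]      # placeholder for GPU pass-through
    return opts

gridlabd_req = {"image": "cosim/gridlabd:latest", "cpus": 2.5, "ram_gb": 4}
opts = to_container_opts(gridlabd_req)
print(opts["nano_cpus"], opts["mem_limit"])    # → 2500000000 4g
```

      <p>Launching would then amount to something like docker.from_env().containers.run(**opts), and releasing resources to removing the containers once the experiment completes.</p>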
      <p>2. Handling simulation incompatibility and resource
violations: For faithful experimental outcomes, different
simulators impose co-simulation-specific data-exchange requirements
and QoS constraints such as communication latencies,
computation execution deadlines, and hardware resource availability
(CPUs, memory, etc.). For instance, if one of the simulators in
the co-simulation requires high I/O bandwidth to stream large
videos, the receiving simulator needs to consume the streamed
data within a given time period. Thus, if these simulators are
not co-located in the cloud, a violation or incompatibility
warning should be raised so that the user can make the
necessary modifications to satisfy the QoS constraints.</p>
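      <p>A minimal sketch of such a violation check for the streaming example, assuming hypothetical simulator names, a placement map, and measured inter-host bandwidth figures, might look like:</p>

```python
# Illustrative check (assumed names and data): flag a violation when the
# available inter-host bandwidth between two coupled simulators falls below
# what their data exchange requires. Co-located pairs are assumed fine.

def check_placement(link_req_mbps, placement, host_bw_mbps):
    """link_req_mbps: {(src, dst): required Mbps}; placement: simulator -> host;
    host_bw_mbps: {(host, host): available Mbps}. Returns violation warnings."""
    warnings = []
    for (src, dst), required in link_req_mbps.items():
        if placement[src] == placement[dst]:
            continue                       # same host: loopback traffic
        available = host_bw_mbps[(placement[src], placement[dst])]
        if required > available:
            warnings.append(f"{src}->{dst}: need {required} Mbps, have {available}")
    return warnings

reqs = {("video-sim", "vision-sim"): 800}
placement = {"video-sim": "host-a", "vision-sim": "host-b"}
bw = {("host-a", "host-b"): 300}
print(check_placement(reqs, placement, bw))
```

      <p>In this instance the framework would either warn the user or prefer a co-located placement for the two simulators.</p>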
      <p>3. Proactive fault tolerance for simulation execution:
The cloud-based co-simulation framework must be resilient
to the system faults and failures that can occur within cloud
platforms. Our solution, called co-simulation checkpointing,
leverages the Linux container's save and restore functions and
enables, in the event of a failure, an effective recovery of
the system to its previously checkpointed state. Implementing
checkpointing for distributed co-simulations, which have
intertwined dependencies, is even more challenging: here,
checkpointing must also be coordinated and synchronized across
all simulators. This ensures reliable recovery and correct
execution from snapshot images during system restoration.</p>
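      <p>The coordination requirement can be illustrated with a pause-all, snapshot-all, resume-all sketch over hypothetical container handles; the real implementation would invoke the container runtime's save and restore operations instead of these stand-in methods.</p>

```python
# Sketch of coordinated checkpointing (hypothetical interfaces): every
# simulator container is paused before any snapshot is taken, so the saved
# images form one consistent global state; only then are all resumed.

class SimContainer:
    def __init__(self, name):
        self.name = name
        self.state = "running"
        self.snapshots = []

    def pause(self):
        self.state = "paused"

    def resume(self):
        self.state = "running"

    def snapshot(self, tag):
        # Snapshotting a still-running simulator could capture a state that
        # is inconsistent with its peers, so we forbid it here.
        assert self.state == "paused", "must not snapshot a running simulator"
        self.snapshots.append(tag)

def coordinated_checkpoint(containers, tag):
    for c in containers:            # phase 1: quiesce every simulator
        c.pause()
    for c in containers:            # phase 2: snapshot the frozen global state
        c.snapshot(tag)
    for c in containers:            # phase 3: resume the experiment
        c.resume()

sims = [SimContainer("grid-sim"), SimContainer("network-sim")]
coordinated_checkpoint(sims, "ckpt-001")
print([c.snapshots for c in sims], [c.state for c in sims])
```

      <p>In a Docker-based deployment, the pause and snapshot steps would map to operations such as docker pause and a CRIU-backed docker checkpoint create, applied across all hosts before any simulator resumes.</p>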
      <p>4. Reproducible Experiments: Deterministic execution
and reproducible experiments are needed for many CPS
co-simulations, so the co-simulation integration methods and
runtime execution tools must be designed for these requirements
from the start. In addition, for repeatable experiments, the
cloud experimentation platform should provide the same
runtime execution environment and configuration for the same
experiment.</p>
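      <p>One illustrative way to make runs recognizably repeatable (not necessarily the framework's actual mechanism) is to pin container images by digest and derive a deterministic experiment identifier from the full configuration:</p>

```python
# Illustrative approach (assumed, not the framework's mechanism): hash the
# pinned experiment configuration so two runs with identical images, seeds,
# and parameters map to the same experiment identifier.

import hashlib
import json

def experiment_id(config):
    canonical = json.dumps(config, sort_keys=True)   # stable serialization
    return hashlib.sha256(canonical.encode()).hexdigest()[:12]

cfg = {
    "images": {"grid-sim": "cosim/gridlabd@sha256:abc123"},  # pinned by digest
    "seed": 42,
    "step_size": 0.5,
}
print(experiment_id(cfg) == experiment_id(dict(cfg)))        # → True
```

      <p>Any change to an image digest, seed, or parameter yields a different identifier, so silently divergent reruns are detectable.</p>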
    </sec>
    <sec id="sec-3">
      <title>III. PROPOSED SOLUTION AND CURRENT STATUS</title>
      <p>We are developing a framework that enables effective
CPS co-simulation in the cloud. Figure 1 shows the functional
architecture of our framework. The framework uses Docker
containers to deploy simulations in an OpenStack cloud.
The simulations are built using the corresponding pre-packaged
simulators inside Docker containers, which provide
a repeatable runtime environment. We are also developing
a domain-specific modeling language to capture the experiment
resource requirements of individual simulators.</p>
      <fig id="fig1">
        <label>Fig. 1</label>
        <caption>
          <p>Architecture Overview of CPS Co-simulation Deployment in the Cloud</p>
        </caption>
      </fig>
      <p>In the future, we plan to integrate an SMT solver for
optimal placement of simulators in the cloud environment
while still satisfying individual resource requirements. We are
also building a cloud resource monitoring framework utilizing
collectd and other tools to enable real-time monitoring of cloud
resources, which can then be fed to the SMT solver to make
effective decisions. To enable fault-tolerant co-simulations, we
are developing a co-simulation checkpointing technique using
the save and restore functions of the CRIU library for Docker
containers. This checkpointing must also be synchronized
and coordinated, and must support distributed
simulator deployments.</p>
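      <p>The placement problem that the SMT solver would encode can be illustrated on a tiny instance with a brute-force equivalent; the simulator names and capacities below are made up, and a real deployment would add memory, I/O, and latency constraints alongside CPU.</p>

```python
# Brute-force stand-in (illustrative only) for the planned SMT-based
# placement: assign each simulator to a host so that no host's CPU capacity
# is exceeded, mirroring the constraints an SMT solver would encode.

from itertools import product

def place(sims, hosts):
    """sims: {name: CPUs needed}; hosts: {name: CPU capacity}.
    Returns one feasible assignment, or None if the instance is infeasible."""
    names = list(sims)
    for assignment in product(hosts, repeat=len(names)):
        load = {h: 0 for h in hosts}
        for sim, host in zip(names, assignment):
            load[host] += sims[sim]
        if all(hosts[h] >= load[h] for h in hosts):
            return dict(zip(names, assignment))
    return None    # infeasible: report an incompatibility to the user

sims = {"grid-sim": 4, "network-sim": 2, "traffic-sim": 3}
hosts = {"host-a": 6, "host-b": 4}
print(place(sims, hosts))
```

      <p>An SMT solver replaces this exponential enumeration with constraint solving, which is what makes the approach viable at cloud scale.</p>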
    </sec>
    <sec id="sec-4">
      <title>ACKNOWLEDGMENTS</title>
      <p>This work is supported in part by NIST contract
number 70NANB15H312, NSF CPS VO contract number
CNS1521617 and NSF US Ignite CNS 1531079. Any opinions,
findings, and conclusions or recommendations expressed in
this material are those of the author(s) and do not necessarily
reflect the views of the funding agencies.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] G. Hemingway, H. Neema, H. Nine, J. Sztipanovits, and G. Karsai, “<article-title>Rapid synthesis of high-level architecture-based heterogeneous simulation: a model-based integration approach</article-title>,” <source>Simulation</source>, vol. <volume>88</volume>, no. <issue>2</issue>, pp. <fpage>217</fpage>–<lpage>232</lpage>, <year>2012</year>.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] H. Neema, J. Sztipanovits, M. Burns, and E. Griffor, “<article-title>C2WT-TE: A model-based open platform for integrated simulations of transactive smart grids</article-title>,” in <source>Workshop on Modeling and Simulation of Cyber-Physical Energy Systems (MSCPES)</source>. IEEE, <year>2016</year>, pp. <fpage>1</fpage>–<lpage>6</lpage>.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] H. Neema, P. Volgyesi, B. Potteiger, W. Emfinger, X. Koutsoukos, G. Karsai, Y. Vorobeychik, and J. Sztipanovits, “<article-title>SURE: An Experimentation and Evaluation Testbed for CPS Security and Resilience: Demo Abstract</article-title>,” in <source>Proceedings of the 7th International Conference on Cyber-Physical Systems, ser. ICCPS '16</source>. Piscataway, NJ, USA: IEEE Press, <year>2016</year>, pp. 27:1–27:1. [Online]. Available: http://dl.acm.org/citation.cfm?id=2984464.2984491</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>