=Paper= {{Paper |id=Vol-2019/docsymp_6 |storemode=property |title=Towards Model Driven Verifiable Deployment of Distributed Simulations in Cloud |pdfUrl=https://ceur-ws.org/Vol-2019/docsymp_6.pdf |volume=Vol-2019 |authors=Yogesh D. Barve |dblpUrl=https://dblp.org/rec/conf/models/Barve17 }} ==Towards Model Driven Verifiable Deployment of Distributed Simulations in Cloud== https://ceur-ws.org/Vol-2019/docsymp_6.pdf
    Towards Model Driven Verifiable Deployment of
           Distributed Simulations in Cloud
                                                            Yogesh D. Barve
                                               Institute for Software Integrated Systems,
                                                Dept. of EECS, Vanderbilt University,
                                                       Nashville, TN 37212, USA
                                                Email:{yogesh.d.barve}@vanderbilt.edu


   Abstract—Running simulations on cloud computing platforms offers advantages to users
such as on-demand computing and scalability. Despite these benefits, limitations exist
in the verifiable and efficient deployment of these simulations in the cloud. Moreover,
distributed simulations impose the additional requirement of coordinated, time-stepped
execution to progress the simulation. As such, deadline-aware resource allocation for
the different simulation entities and dynamic execution of load balancing strategies are
required to minimize the total execution time. The cloud provider’s interest in minimal
execution cost is another requirement, which demands workload consolidation for these
distributed simulations. In this context, there is a general lack of mechanisms that
address these concerns in the cloud-hosted distributed simulation space. To address
these gaps, this research proposes Model Driven Verifiable Distributed Simulations in
Cloud (MoViDiX). It provides DSML building blocks for rapid provisioning of distributed
simulations in the cloud. It also provides a verification subsystem that identifies
simulation resource incompatibilities and finds a solution for verified deployment of
the simulation. Leveraging Models@Runtime principles, dynamic resource management for
deadline-aware execution of simulations is proposed.
   Index Terms—co-simulation, distributed simulations, verification, cloud.

                     I. INTRODUCTION

   With the new focus on technologies like the Internet of Things, fog computing, and
blockchain-enabled services, understanding the fundamental technologies of distributed
systems, like fault tolerance, publish-subscribe communication, client-server, and
peer-to-peer technologies, becomes critical. Simulations provide a rapid way to run and
test new distributed systems algorithms in a controlled environment. But to simulate
highly scalable distributed systems, such as those that manifest in scenarios like smart
cities and smart grid systems, one needs a large amount of computing resources. Running
such large-scale simulations can benefit from the scalable, on-demand, and ubiquitous
nature of cloud computing [1], [2].
   Although these benefits are very compelling for running simulations in the cloud, the
research community faces numerous challenges in moving the execution of simulations from
grid-like environments to the cloud computing model. Deployment of simulations to the
cloud requires an in-depth understanding of various cloud-specific deployment tools and
deployment engines. Lack of understanding of these tools can result in sub-optimal
deployment of these simulations. Moreover, resource requirements for the different
participating entities in a distributed simulation can vary based on the workload
characteristics of the individual simulation agent. As such, tools and strategies are
required to provide seamless execution of the simulations, improving performance and
minimizing the makespan of the simulation run. Migration of simulations from the grid to
the cloud environment is another challenge that users face. Issues such as vendor
lock-in can tie users to a given cloud provider, making it difficult for them to move to
another cloud provider in the future.

                     II. PROBLEM STATEMENT

   In this research we are exploring the use of cloud computing resources for running
distributed simulations. Simulations can either consist of a single participating entity
(monolithic) or of a group of coordinated simulation entities, as in distributed
simulations. In this doctoral research we address the challenges imposed in running
simulations in the cloud along three dimensions. These are listed below:

   • Simulation-imposed challenges
      – CH1: Distributed and Synchronized: Distributed simulations comprise a group of
        simulators which coordinate and advance time synchronously. The execution
        semantics follow time-stepped execution. The distributed nature of the execution
        demands strict communication latency deadlines among participating entities.
        Also, time-stepped execution demands state synchronization between participating
        entities before progressing into the next execution time step.
      – CH2: QoS and Resource Integration Challenges: Different participating simulators
        in a distributed simulation setting may have different levels of resource
        requirements as execution progresses. Some simulators may need access to specific
        hardware components such as GPUs, CPU cores, memory, etc. In a
        Hardware-in-the-Loop (HiL) simulation scenario, for instance, there may be a need
        to place certain simulation components, such as a trigger unit, close to the
        hardware, while the actual computation-intensive decision making can happen in
        the cloud environment. The simulation deployment and execution must capture these
        requirements from the
        user so as to make sound decisions for optimal and successful execution of the
        distributed simulation.
      – CH3: Straggler Mitigation: In time-stepped distributed simulations, all
        participating entities wait on each other before progressing to the next time
        step for execution. This is the bulk synchronous parallel (BSP) model of
        computation [3]. In the BSP computation model, if some of the participants have a
        slower execution speed than the rest, the progress of the simulation as a whole
        is dictated by the slowest participant. As such, in a distributed simulation, we
        need to capture this execution behavior either a priori or during runtime so as
        to mitigate the slowdown caused by such straggler participants.
   • Cloud-imposed challenges
      – CH4: Multi-tenant Cloud Environment: In a cloud computing environment, resources
        are shared among various applications hosted by different providers. Thus, issues
        such as application interference [4] can affect the simulation performance and
        execution progress. Host system overloading can also affect the execution
        makespan of the simulations.
      – CH5: Fault Tolerance Support: The cloud computing environment is prone to regular
        system outages due to software and hardware failures. As such, when running
        large-scale, long-running simulations, it becomes critical to have a fault
        tolerance strategy that limits the loss of computation and the time needed to
        restart execution.
   • Usability challenges
      – CH6: Ease of Use: There is a general lack of tools to support deployment of
        simulations in the cloud with ease. A user should be able to configure the
        experimental setup and provide resource specifications, execution policies, and
        fault tolerance policies using a graphical web portal. Also, to enable
        collaboration, the web application should be able to provide Git-style versioning
        of simulation models.
      – CH7: Distributed Systems Learning Toolkit: A repository of commonly used
        distributed systems algorithms is needed to quickly test and deploy simulations
        in the cloud. This makes it easier to learn the fundamentals of distributed
        systems in this era of ubiquitous computing.

                     III. RELATED WORK

   In this section we describe related work along the lines of the challenges that have
been highlighted in the previous section.
   • Simulation- and cloud-imposed challenges: DEXSim [5] presents a framework for
     replicated execution of simulations that utilizes the available hardware resources
     to speed up the execution of different scenarios in a simulation experiment.
     Although this is similar to distributed simulation, the experiments are not geared
     towards running in a multi-tenant cloud where applications are susceptible to
     interference across tenants. The INDICES [6] framework also addresses application
     interference in the cloud and finds an appropriate hosting platform that can
     guarantee the QoS latency requirements of the application. However, its application
     use cases are monolithic and do not address distributed simulation-specific
     requirements such as coordination and distributed execution.
   • Usability challenges: Previous work has shown MDE and DSMLs to be effective tools
     for providing intuitive abstractions for constructing simulation experiments [7].
     Our work builds on top of [7] and will provide a visual DSML to specify simulation
     resource requirements, as well as a verification module for checking correct
     deployment of the simulation execution.

                     IV. PROPOSED SOLUTION

   To address the challenges presented in Section II, we propose Model Driven Verifiable
Distributed Simulations in Cloud (MoViDiX). An overview of the framework is shown in
Fig. 1. In this research we propose to build a platform that enables users to better
understand distributed systems concepts and to build simulation models to test and run
in the cloud. The platform leverages domain-specific concepts from distributed systems,
such as pub/sub, client-server, peer-to-peer, etc., to construct simulations (CH 6, 7).
MDE capabilities like modeling and code generation are put into practice to rapidly take
initial ideas to a simulation of a distributed system in the cloud environment with
ease. We will also work on a DSML which will allow users to provide resource and runtime
specifications (CH 1, 2). A verification module is also being developed that can check
the simulation constraints and notify the user of any violations.

                  Fig. 1: Architecture Overview of MoViDiX

   We plan on utilizing the Z3 SMT solver to verify simulation deployment constraints
and to find solutions that satisfy them. We shall leverage the Models@Runtime approach
for dynamic resource management of individual simulation entities to provide optimal
runtime performance and meet simulation QoS
requirements (CH 3, 4). We plan on using Docker, which provides encapsulation for
simulation execution (CH 4, 6). The required files, which specify the software
dependencies, are auto-generated using MDE principles. We are also leveraging the
web-based generic modeling environment WebGME to perform our distributed simulation
modeling and deployment (CH 3, 4).

                     V. PLAN FOR EVALUATION AND VALIDATION

   To evaluate our system, we will be conducting experiments in our university’s cloud
datacenter as well as leveraging the NSF Chameleon cloud platform. Using MDE, we should
be able to select the cloud provider on which to deploy our experiments. User studies
are also being carried out to test the usability and the educational learning aspects of
the tool.

                     VI. EXPECTED CONTRIBUTIONS

   This research will make the following contributions:
   • MDE framework for design and deployment of distributed simulations in cloud.
   • Dynamic resource management strategies for efficient execution of distributed
     simulations in cloud by leveraging Models@Runtime principles.
   • DSML for experiment and resource specification for distributed simulations in cloud.
   • Verifiable deployment of distributed simulations.

                     VII. CURRENT STATUS

   Currently we have addressed the usability issue for designing simulation experiments
and running them in the cloud environment (CH 4, 6, 7) in [8], [9]. We are currently
working on designing the DSML to address CH 2. In the future we plan on addressing
resource management issues using Models@Runtime principles, providing effective resource
allocation by using the monitoring, decision, and action methodology. This research is
being conducted under the supervision of Dr. Aniruddha Gokhale, Associate Professor,
Vanderbilt University.

                     ACKNOWLEDGMENTS

   This work is supported in part by NIST contract number 70NANB15H312, NSF CPS VO
contract number CNS-1521617 and NSF US Ignite CNS 1531079. Any opinions, findings, and
conclusions or recommendations expressed in this material are those of the author(s) and
do not necessarily reflect the views of the funding agencies.

                     REFERENCES

[1] S. Shekhar, H. Abdel-Aziz, M. Walker, F. Caglar, A. Gokhale, and X. Koutsoukos, “A
    simulation as a service cloud middleware,” Annals of Telecommunications, vol. 71,
    no. 3-4, pp. 93–108, 2016.
[2] F. Caglar, S. Shekhar, A. Gokhale, S. Basu, T. Rafi, J. Kinnebrew, and G. Biswas,
    “Cloud-hosted simulation-as-a-service for high school stem education,” Simulation
    Modelling Practice and Theory, vol. 58, pp. 255–273, 2015.
[3] L. G. Valiant, “A bridging model for parallel computation,” Communications of the
    ACM, vol. 33, no. 8, pp. 103–111, 1990.
[4] F. Caglar, S. Shekhar, and A. Gokhale, “A performance interference-aware virtual
    machine placement strategy for supporting soft real-time applications in the cloud.”
[5] C. Choi, K.-M. Seo, and T. G. Kim, “Dexsim: an experimental environment for
    distributed execution of replicated simulators using a concept of single simulation
    multiple scenarios,” Simulation, vol. 90, no. 4, pp. 355–376, 2014.
[6] S. Shekhar, A. Chhokra, A. Bhattacharjee, G. Aupy, and A. Gokhale, “Indices:
    Exploiting edge resources for performance-aware cloud-hosted services.”
[7] H. Neema, J. Sztipanovits, M. Burns, and E. Griffor, “C2wt-te: A model-based open
    platform for integrated simulations of transactive smart grids,” in Modeling and
    Simulation of Cyber-Physical Energy Systems (MSCPES), 2016 Workshop on. IEEE, 2016,
    pp. 1–6.
[8] Y. D. Barve, P. Patil, and A. Gokhale, “A cloud-based immersive learning environment
    for distributed systems algorithms,” in Computer Software and Applications
    Conference (COMPSAC), 2016 IEEE 40th Annual, vol. 1. IEEE, 2016, pp. 754–763.
[9] Y. Barve, P. Patil, A. Bhattacharjee, and A. Gokhale, “Pads: Design and
    implementation of a cloud-based, immersive learning environment for distributed
    systems algorithms,” IEEE Transactions on Emerging Topics in Computing, vol. PP,
    no. 99, pp. 1–1, 2017.
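
ILLUSTRATIVE SKETCH: DEPLOYMENT VERIFICATION

   The verification subsystem described in this paper is meant to identify simulation
resource incompatibilities and find a deployment that satisfies them. The Python sketch
below is one rough way to picture that check. It is not the MoViDiX implementation: the
paper plans to use the Z3 SMT solver, whereas this sketch substitutes a plain exhaustive
search so that it runs with the standard library alone. Every name in it (Simulator,
Host, is_valid, find_placement, and the resource fields) is a hypothetical stand-in
introduced for illustration.

```python
from dataclasses import dataclass
from itertools import product
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Simulator:
    """Hypothetical resource request of one simulation entity."""
    name: str
    cpu_cores: int
    memory_gb: int
    needs_gpu: bool = False


@dataclass(frozen=True)
class Host:
    """Hypothetical capacity of one cloud host."""
    name: str
    cpu_cores: int
    memory_gb: int
    has_gpu: bool = False


def is_valid(placement: Dict[str, str],
             sims: List[Simulator],
             hosts: List[Host]) -> bool:
    """Return True if no host is over-committed and GPU requests are met."""
    by_name = {h.name: h for h in hosts}
    cpu_used = {h.name: 0 for h in hosts}
    mem_used = {h.name: 0 for h in hosts}
    for sim in sims:
        host = by_name[placement[sim.name]]
        if sim.needs_gpu and not host.has_gpu:
            return False  # hardware incompatibility
        cpu_used[host.name] += sim.cpu_cores
        mem_used[host.name] += sim.memory_gb
    return all(
        cpu_used[h.name] <= h.cpu_cores and mem_used[h.name] <= h.memory_gb
        for h in hosts
    )


def find_placement(sims: List[Simulator],
                   hosts: List[Host]) -> Optional[Dict[str, str]]:
    """Try every simulator-to-host assignment; return the first valid one.

    Returns None when no assignment satisfies the constraints. This brute-force
    search is an assumption of the sketch, standing in for an SMT solver.
    """
    host_names = [h.name for h in hosts]
    for choice in product(host_names, repeat=len(sims)):
        placement = {s.name: h for s, h in zip(sims, choice)}
        if is_valid(placement, sims, hosts):
            return placement
    return None


if __name__ == "__main__":
    sims = [Simulator("grid", 4, 8), Simulator("controller", 2, 4, needs_gpu=True)]
    hosts = [Host("h1", 4, 8), Host("h2", 4, 16, has_gpu=True)]
    print(find_placement(sims, hosts))
```

   The search examines up to len(hosts) ** len(sims) assignments, so it is only practical
for very small experiments. Encoding the same capacity and hardware constraints for a
solver is the direction the paper proposes.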