=Paper=
{{Paper
|id=Vol-2019/docsymp_6
|storemode=property
|title=Towards Model Driven Verifiable Deployment of Distributed Simulations in Cloud
|pdfUrl=https://ceur-ws.org/Vol-2019/docsymp_6.pdf
|volume=Vol-2019
|authors=Yogesh D. Barve
|dblpUrl=https://dblp.org/rec/conf/models/Barve17
}}
==Towards Model Driven Verifiable Deployment of Distributed Simulations in Cloud==
Yogesh D. Barve, Institute for Software Integrated Systems, Dept. of EECS, Vanderbilt University, Nashville, TN 37212, USA. Email: {yogesh.d.barve}@vanderbilt.edu

Abstract: Running simulations on cloud computing platforms offers users advantages such as on-demand computing and scalability. Despite these benefits, limitations exist in the verifiable and efficient deployment of these simulations in the cloud. Moreover, distributed simulations impose the additional requirement of coordinated, time-stepped execution to progress the simulation. As such, deadline-aware resource allocation for the different simulation entities and dynamic execution of load balancing strategies are required to minimize the total execution time. The cloud provider's interest in minimal execution cost is a further requirement, which demands workload consolidation for these distributed simulations. In this context, there is a general lack of mechanisms that address these concerns in the cloud-hosted distributed simulation space. To address these gaps, this research proposes Model Driven Verifiable Distributed Simulations in Cloud (MoViDiX). It provides DSML building blocks for rapid provisioning of distributed simulations in the cloud. It also provides a verification subsystem that identifies simulation resource incompatibilities and finds a solution for verified deployment of the simulation. Leveraging Models@Runtime principles, dynamic resource management for deadline-aware execution of simulations is proposed.

Index Terms: co-simulation, distributed simulations, verification, cloud.

===I. INTRODUCTION===
With the new focus on technologies like the Internet of Things, fog computing, and blockchain-enabled services, understanding the fundamental technologies of distributed systems, such as fault tolerance, publish-subscribe communication, client-server, and peer-to-peer, becomes critical. Simulations provide a rapid way to run and test new distributed systems algorithms in a controlled environment. But simulating highly scalable distributed systems, such as those that arise in smart city and smart grid scenarios, requires a large amount of computing resources. Running such large-scale simulations can benefit from the scalable, on-demand, ubiquitous nature of cloud computing [1], [2].

Although these benefits are compelling for running simulations in the cloud, the research community faces numerous challenges in moving the execution of simulations from a grid-like environment to the cloud computing model. Deploying simulations to the cloud requires an in-depth understanding of various cloud-specific deployment tools and deployment engines. Lack of understanding of these tools can result in sub-optimal deployment of the simulations. Moreover, resource requirements for the different participating entities in a distributed simulation can vary based on the workload characteristics of the individual simulation agent. As such, tools and strategies are required that provide seamless execution of the simulations, improve performance, and minimize the makespan of the simulation run. Migrating simulations from the grid to the cloud environment is another challenge users face. Issues such as vendor lock-in can tie users to a given cloud provider, making it difficult to move to another cloud provider in the future.

===II. PROBLEM STATEMENT===
In this research we are exploring the use of cloud computing resources for running distributed simulations. A simulation can consist either of a single participating entity (monolithic) or of a group of coordinated simulation entities, as in distributed simulations. In this doctoral research we address the challenges of running simulations in the cloud along three dimensions, listed below:

* Simulation-imposed challenges
** CH1: Distributed and Synchronized: Distributed simulations comprise a group of simulators which coordinate and advance time synchronously. The execution semantics follow time-stepped execution. The distributed nature of the execution demands strict communication latency deadlines among participating entities. Also, time-stepped execution demands state synchronization between participating entities before progressing to the next execution time step.
** CH2: QoS and Resource Integration Challenges: Different participating simulators in a distributed simulation setting may have different levels of resource requirements as execution progresses. Some simulators may need access to specific hardware resources such as GPUs, CPU cores, memory, etc. In a Hardware-in-the-Loop (HiL) simulation scenario, for instance, there may be a need to place certain simulation components, like a trigger unit, close to the hardware, while the actual computation-intensive decision making can happen in the cloud environment. The simulation deployment and execution must capture these requirements from the user so as to make sound decisions for optimal and successful execution of the distributed simulation.
** CH3: Straggler Mitigation: In time-stepped distributed simulations, all participating entities wait on each other before progressing to the next time step. This is the BSP model of computation [3]. In the BSP computation model, if some of the participants execute more slowly than the rest, the progress of the simulation as a whole is dictated by the slowest participant. As such, in a distributed simulation, we need to capture this execution behavior either a priori or at runtime so as to mitigate the slowdown caused by such straggler participants.
* Cloud-imposed challenges
** CH4: Multi-tenant Cloud Environment: In a cloud computing environment, resources are shared among various applications hosted by different providers. Thus, issues such as application interference [4] can affect the simulation performance and execution progress. Host system overloading can also affect the execution makespan of the simulations.
** CH5: Fault Tolerance Support: The cloud computing environment is prone to regular system outages due to software and hardware failures. As such, when running large-scale, long-running simulations, it becomes critical to have a fault tolerance strategy to mitigate loss of computation and restart execution time.
* Usability challenges
** CH6: Ease of Use: There is a general lack of tools that support deploying simulations in the cloud with ease. A user should be able to configure the experimental setup and provide resource specifications and execution and fault tolerance policies using a graphical web portal. Also, to enable collaboration, the web application should provide Git-style versioning of simulation models.
** CH7: Distributed Systems Learning Toolkit: A repository of commonly used distributed systems algorithms is needed to quickly test and deploy simulations in the cloud. This enables easier learning tools for understanding the fundamentals of distributed systems in this era of ubiquitous computing.

===III. RELATED WORK===
In this section we describe related work along the lines of the challenges highlighted in the previous section.

* Simulation and cloud-imposed challenges: DEXSim [5] presents a framework for replicated execution of simulations that utilizes the available hardware resources to speed up the execution of different scenarios in a simulation experiment. Although this is similar to distributed simulation, the experiments are not geared towards running in a multi-tenant cloud, where applications are susceptible to interference across tenants. The INDICES [6] framework also addresses application interference in the cloud and finds an appropriate hosting platform that can provide guarantees on the QoS latency requirements of the application. But its application use cases are monolithic and do not address distributed simulation-specific requirements such as coordination and distributed execution.
* Usability challenges: Previous work has shown MDE and DSMLs to be effective tools for providing intuitive abstractions for constructing simulation experiments [7]. Our work builds on top of [7] and will provide a visual DSML to specify simulation resource requirements, along with a verification module for checking correct deployment of the simulation execution.

===IV. PROPOSED SOLUTION===
To address the challenges presented in Section II, we propose Model Driven Verifiable Distributed Simulations in Cloud (MoViDiX). An overview of the framework is shown in Fig. 1. In this research we propose to build a platform that enables users to better understand distributed systems concepts and to build simulation models to test and run in the cloud. The platform leverages domain-specific concepts from distributed systems, such as pub/sub, client-server, and peer-to-peer, to construct simulations (CH 6, 7). MDE capabilities like modeling and code generation are put into practice for rapidly taking initial ideas to a simulation of a distributed system in a cloud environment with ease. We will also work on a DSML which will allow users to provide resource and runtime specifications (CH 1, 2). A verification module is also being developed that can check the simulation constraints and notify the user of any violations.

Fig. 1: Architecture Overview of MoViDiX

We plan on utilizing the Z3 SMT solver to verify simulation deployment constraints and to find a solution that satisfies them. We shall leverage the Models@Runtime approach for dynamic resource management of individual simulation entities, to provide optimal runtime performance and to meet simulation QoS requirements (CH 3, 4). We plan on using Docker technology, which provides encapsulation for simulation execution (CH 4, 6). The required files, which contain the specification of the software dependencies, are auto-generated using MDE principles. We are also leveraging the web-based generic modeling environment WebGME to perform our distributed simulation modeling and deployment (CH 3, 4).

===V. PLAN FOR EVALUATION AND VALIDATION===
To evaluate our system, we will conduct experiments in our university's cloud datacenter as well as on the NSF Chameleon cloud platform. Using MDE, we should be able to select the cloud provider on which to deploy our experiments. User studies are also being carried out to test the usability and the educational learning aspects of the tool.

===VI. EXPECTED CONTRIBUTIONS===
This research will make the following contributions:

* An MDE framework for the design and deployment of distributed simulations in the cloud.
* Dynamic resource management strategies for efficient execution of distributed simulations in the cloud by leveraging Models@Runtime principles.
* A DSML for experiment and resource specification for distributed simulations in the cloud.
* Verifiable deployment of distributed simulations.

===VII. CURRENT STATUS===
Currently we have addressed the usability issue of designing simulation experiments and running them in the cloud environment (CH 4, 6, 7) in [8], [9]. We are currently working on designing the DSML to address CH 2. In the future we plan on addressing resource management issues using Models@Runtime principles, providing effective resource allocation through the monitoring, decision, and action methodology. This research is being conducted under the supervision of Dr. Aniruddha Gokhale, Associate Professor, Vanderbilt University.

===ACKNOWLEDGMENTS===
This work is supported in part by NIST contract number 70NANB15H312, NSF CPS VO contract number CNS-1521617 and NSF US Ignite CNS 1531079. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the funding agencies.

===REFERENCES===
[1] S. Shekhar, H. Abdel-Aziz, M. Walker, F. Caglar, A. Gokhale, and X. Koutsoukos, “A simulation as a service cloud middleware,” Annals of Telecommunications, vol. 71, no. 3-4, pp. 93–108, 2016.

[2] F. Caglar, S. Shekhar, A. Gokhale, S. Basu, T. Rafi, J. Kinnebrew, and G. Biswas, “Cloud-hosted simulation-as-a-service for high school STEM education,” Simulation Modelling Practice and Theory, vol. 58, pp. 255–273, 2015.

[3] L. G. Valiant, “A bridging model for parallel computation,” Communications of the ACM, vol. 33, no. 8, pp. 103–111, 1990.

[4] F. Caglar, S. Shekhar, and A. Gokhale, “A performance interference-aware virtual machine placement strategy for supporting soft real-time applications in the cloud.”

[5] C. Choi, K.-M. Seo, and T. G. Kim, “DEXSim: an experimental environment for distributed execution of replicated simulators using a concept of single simulation multiple scenarios,” Simulation, vol. 90, no. 4, pp. 355–376, 2014.

[6] S. Shekhar, A. Chhokra, A. Bhattacharjee, G. Aupy, and A. Gokhale, “INDICES: Exploiting edge resources for performance-aware cloud-hosted services.”

[7] H. Neema, J. Sztipanovits, M. Burns, and E. Griffor, “C2WT-TE: A model-based open platform for integrated simulations of transactive smart grids,” in Modeling and Simulation of Cyber-Physical Energy Systems (MSCPES), 2016 Workshop on. IEEE, 2016, pp. 1–6.

[8] Y. D. Barve, P. Patil, and A. Gokhale, “A cloud-based immersive learning environment for distributed systems algorithms,” in Computer Software and Applications Conference (COMPSAC), 2016 IEEE 40th Annual, vol. 1. IEEE, 2016, pp. 754–763.

[9] Y. Barve, P. Patil, A. Bhattacharjee, and A. Gokhale, “PADS: Design and implementation of a cloud-based, immersive learning environment for distributed systems algorithms,” IEEE Transactions on Emerging Topics in Computing, vol. PP, no. 99, pp. 1–1, 2017.
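As an illustration of the straggler effect described under CH3, the following is a minimal Python sketch, not taken from the paper. It assumes per-step compute times are available for each participant; the function names `bsp_makespan` and `find_stragglers` and the median-based threshold are hypothetical choices made only for this example. Under the BSP model every step ends at a barrier, so each step costs as much as its slowest participant.

```python
from statistics import median


def bsp_makespan(step_times):
    """Total run time when every time step ends at a barrier.

    step_times[s][p] is the (assumed, illustrative) compute time of
    participant p in time step s. Each step lasts as long as its
    slowest participant.
    """
    return sum(max(step) for step in step_times)


def find_stragglers(step_times, factor=1.5):
    """Return {step index: [participant indices]} for slow participants.

    A participant is flagged in a step when its time exceeds `factor`
    times the median time of that step. This threshold is an assumed
    heuristic for the sketch, not the paper's method.
    """
    flagged = {}
    for s, step in enumerate(step_times):
        threshold = factor * median(step)
        slow = [p for p, t in enumerate(step) if t > threshold]
        if slow:
            flagged[s] = slow
    return flagged


if __name__ == "__main__":
    # Two time steps, three participants (made-up numbers).
    times = [[1.0, 1.0, 4.0], [2.0, 1.0, 1.0]]
    print("makespan:", bsp_makespan(times))
    print("stragglers:", find_stragglers(times))
```

A runtime monitor could feed measured step times into such a check and hand the flagged participants to a resource manager, which is the kind of monitoring and decision loop the Models@Runtime approach implies.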
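The verification idea from Section IV, checking whether simulation resource requirements can be satisfied by the available hosts, can be sketched as follows. This is a hedged illustration only: it uses exhaustive search in plain Python as a stand-in for the Z3 SMT solver the paper plans to use, and the entity names, host names, resource fields (`cpu`, `mem`, `gpu`), and the function `find_placement` are all assumptions introduced for the example.

```python
from itertools import product


def find_placement(entities, hosts):
    """Search for an entity-to-host assignment that meets all constraints.

    entities: {name: {"cpu": int, "mem": int, "gpu": bool}} demands
    hosts:    {name: {"cpu": int, "mem": int, "gpu": bool}} capacities
    Returns a dict mapping entity name to host name, or None when no
    assignment satisfies the constraints. Brute force: only suitable
    for small instances; a solver would replace this loop.
    """
    names = list(entities)
    for choice in product(hosts, repeat=len(names)):
        used = {h: {"cpu": 0, "mem": 0} for h in hosts}
        feasible = True
        for name, host in zip(names, choice):
            need = entities[name]
            # A GPU-dependent entity must land on a GPU-equipped host.
            if need.get("gpu", False) and not hosts[host].get("gpu", False):
                feasible = False
                break
            used[host]["cpu"] += need["cpu"]
            used[host]["mem"] += need["mem"]
        # Summed demand on each host must fit within its capacity.
        if feasible and all(
            used[h]["cpu"] <= hosts[h]["cpu"] and used[h]["mem"] <= hosts[h]["mem"]
            for h in hosts
        ):
            return dict(zip(names, choice))
    return None


if __name__ == "__main__":
    entities = {
        "controller": {"cpu": 2, "mem": 4, "gpu": False},
        "solver": {"cpu": 4, "mem": 8, "gpu": True},
    }
    hosts = {
        "vm-a": {"cpu": 4, "mem": 8, "gpu": False},
        "vm-b": {"cpu": 4, "mem": 8, "gpu": True},
    }
    print(find_placement(entities, hosts))
```

A `None` result corresponds to the case where the verification module would report a constraint violation instead of proceeding with deployment.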