<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
<article-title>Investigations Towards Dynamic Scaling of Distributed P/T Nets</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Laif-Oke Clasen</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Can Nayci</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Efe Nayci</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Justus Middendorf</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Till Mack</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Hamburg</institution>
          ,
          <addr-line>Faculty of Mathematics, Informatics and Natural Sciences</addr-line>
          ,
          <institution>Department of Informatics</institution>
        </aff>
      </contrib-group>
      <fpage>124</fpage>
      <lpage>144</lpage>
      <abstract>
        <p>Simulating large P/T nets requires substantial computational resources to support the execution and exploration of complex systems. Efficient use of resources is crucial to simulate larger models and realize faster computations. Distribution to several computing devices is one option to support this kind of simulation. Static assignment of P/T net structures to distributed simulators can lead to uneven load distribution, which affects the simulation's overall performance. A dynamic solution is needed to overcome these challenges and ensure optimal resource utilization. A dynamic scaling model adapts the simulation to the current resource utilization. A self-regulating system controls the vertical and horizontal scaling of the simulation in real time based on monitoring, ensuring better resource utilization. The research methodology is based on prototyping and focuses on constructivist principles. The proposed system employs the Kubernetes Metrics Server to monitor current resource utilization. Based on these metrics, the Dynamic Resource Scaling system (DyReS) performs real-time vertical and horizontal scaling of simulators. DyReS comprises a decision-making component and dedicated modules for both scaling directions. Horizontal scaling necessitates the dynamic redistribution of P/T net fragments among simulators; to this end, the Renew simulator has been extended with the NetSplit and NetExchange plugins. The results show that dynamic scaling improves resource utilization through flexible adaptation of computing resources, increasing the performance and efficiency of the distributed simulation of P/T nets.</p>
      </abstract>
      <kwd-group>
        <kwd>Dynamic Scaling</kwd>
        <kwd>Distributed Simulation</kwd>
        <kwd>P/T Nets</kwd>
        <kwd>P/T Nets with Synchronous Channels</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>The increasing complexity of technical and organizational systems requires powerful simulation methods
to ensure high accuracy and realistic mapping. The efficient use of available computing resources is
a key challenge, especially in distributed simulation environments, which makes efficient simulation
strategies indispensable. Current approaches are often based on static resource allocation, but in
practice, this often leads to uneven load distribution and thus impairs the simulation’s scalability and
performance.</p>
      <p>Static distribution methods thus not only impair performance but also significantly limit scalability.
Against this background, a dynamic, adaptive approach to resource distribution is becoming increasingly important.
This contribution investigates the potential and practical feasibility of a dynamic scaling model for
the distributed simulation of place/transition nets (P/T nets). This topic is particularly relevant to the
fields of software engineering and Petri net simulation, as it has a direct impact on the efficiency and
ecological sustainability of distributed Petri net simulations.</p>
      <p>The work specifically addresses how adaptive resource control can be realized to ensure load
distribution during the simulation process. The study focuses on the hypothesis that continuously and
automatically adapting the resource distribution can significantly increase the simulation’s performance
and energy efficiency.</p>
      <p>
        Methodologically, the research is based on a constructivist approach that relies on developing and
evaluating prototypes [
        <xref ref-type="bibr" rid="ref1 ref2 ref3">1, 2, 3</xref>
        ]. This approach involves developing a concept for dynamic resource
management that enables the simulators to scale autonomously, both vertically and horizontally. For this
purpose, a combination of monitoring and an adaptive decision component, the Dynamic Resource
Scaler (DyReS) developed in this contribution, is used. The simulator used is Renew (http://renew.de) [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], which
already implements a distributed P/T net simulation [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. In addition, Renew is extended by the plugins
NetSplit and NetExchange to enable a flexible and situation-dependent distribution of net segments
between different simulator instances. The results of this work show that a dynamic scaling concept
improves load balancing, increases simulation performance, and allows a more sustainable use of
resources. We are aiming for scientific cloud computing with the overall system by providing it as
scientific software as a service on our in-house cloud infrastructure.
      </p>
      <p>Within the Foundations (Section 2), the topics of Renew (Section 2.1), Distributed P/T Nets
(Section 2.2), Kubernetes (Section 2.3), Scalability and Elasticity (Section 2.4), and Monitoring (Section 2.5)
are addressed. Subsequently, the Problem Description (Section 3) and the design of the Distributed
System (Section 4) are presented. The prototypes of this work follow: the new Renew
NetExchange plugin (Section 5), which can send P/T nets to other Renew simulators, and the new Renew
NetSplit plugin (Section 6), which can split P/T nets. In addition, further prototypes regarding the
monitoring (Section 7) of the nodes and simulators and the dynamic scaling in the new Dynamic
Resource Scaler (DyReS) (Section 8) are covered. A critical discussion (Section 9) of the proposed
concept’s advantages, disadvantages, and limitations follows. Finally, the article concludes with an
overview of Related Work (Section 10) and the Conclusion (Section 11).</p>
    </sec>
    <sec id="sec-2">
      <title>2. Foundations</title>
      <sec id="sec-2-0">
        <title>2.1. Renew</title>
        <p>
          Renew [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] is an open-source tool designed for the modeling, analysis, and simulation of various Petri
net types, with a particular emphasis on distributed P/T nets (Section 2.2). The tool was developed by
the Algorithms, Randomization, and Theory (ART) research group, previously known as Theoretical
Foundations of Computer Science (TGI), at the University of Hamburg.
        </p>
        <p>
          Renew is implemented in Java 17 [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] and built using Gradle 8.4 [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ], ensuring robustness and platform
independence. Its software architecture is based on a plugin system, as described by Duvigneau [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. Its
modularity and maintainability have recently been enhanced by adopting the Java Platform Module
System (JPMS) [
          <xref ref-type="bibr" rid="ref10 ref9">9, 10</xref>
          ].
        </p>
        <p>
          For each supported Petri net formalism, Renew offers a corresponding dedicated plugin, with the
reference net formalism described by Kummer [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] being the most prominent. Additionally, the
cloud-native plugin [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]—relevant to this contribution—enables the exposure of HTTP endpoints for
Renew using Java Spring [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]. These endpoints allow for the remote initiation and control of Renew
simulations.
        </p>
      </sec>
      <sec id="sec-2-1">
        <title>2.2. Distributed P/T Nets</title>
        <p>
          The overall system architecture for the distributed P/T nets [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] comprises multiple simulators, the
event-based communication medium Kafka, and a synchronization service. The distributed P/T nets
are statically allocated to the available simulators.
        </p>
        <p>
          The communication between P/T nets across simulator boundaries is facilitated through distributed
synchronous channels. The event-based communication medium Apache Kafka is employed for this
purpose. Apache Kafka is an open-source, distributed event-streaming platform designed to deliver
scalability and high performance [
          <xref ref-type="bibr" rid="ref14 ref15">14, 15</xref>
          ]. It is extensively utilized in distributed systems for real-time
data processing and transmission. Event streaming refers to the continuous processing of data as
discrete, immutable events, each annotated with a timestamp and sequence number. Such events can be
persistently stored and subsequently reused, enabling efficient analysis and processing. Kafka provides
persistence, high throughput, real-time processing capabilities, and support for diverse architectures
and programming languages [16, p. 6f]. The decoupling of producers and consumers promotes the
development of loosely coupled system architectures, establishing Kafka as a scalable and robust solution
for modern distributed systems, particularly when deployed with high-availability configurations.
        </p>
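<p>The event-streaming semantics described above can be illustrated with a minimal Java model (the names are illustrative and not Kafka's actual client API): events are immutable, timestamped, sequence-numbered records in an append-only log that decoupled consumers replay from any offset.</p>

```java
import java.util.ArrayList;
import java.util.List;

// Minimal model of event streaming: immutable events carrying a timestamp and
// a sequence number, stored in an append-only log that consumers can replay.
// Illustrative sketch only; this is not Kafka's producer/consumer API.
public class EventLog {
    public record Event(long sequence, long timestamp, String payload) {}

    private final List<Event> log = new ArrayList<>();
    private long nextSequence = 0;

    // Producers append; events are never mutated or removed.
    public synchronized Event append(String payload) {
        Event e = new Event(nextSequence++, System.currentTimeMillis(), payload);
        log.add(e);
        return e;
    }

    // Consumers replay from any offset, fully decoupled from producers.
    public synchronized List<Event> readFrom(long offset) {
        return List.copyOf(log.subList((int) offset, log.size()));
    }
}
```

<p>Because the log is persistent and append-only, a newly started consumer (for example, a freshly scaled-out simulator) can catch up by reading from offset 0.</p>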
        <p>
          An illustrative example is provided by Clasen et al. [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], who describe a classic IT scenario, the
producer-consumer storage model, which is visualized in Figure 1. In this example, the producer,
consumer, and storage components are distributed across different simulators. The producers and
consumers act as active components, whereas the storage operates as a reactive component, featuring
only distributed uplinks and lacking downlinks.
        </p>
        <p>Figure 1: (a) Producer and Consumer Net; (b) Storage Net.</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.3. Kubernetes</title>
        <p>Modern distributed systems need an orchestrator unit to interconnect the different system parts and
manage (distributed) applications. Kubernetes (https://kubernetes.io/docs) is an open-source container orchestration software,
developed by Google, which allows different machines to act as a computer cluster and to manage
the lifecycle of containerized applications across them [17, p. 3]. Containers are lightweight, portable
units that package an application together with its dependencies and runtime environment, ensuring
consistency across environments. Docker (https://docs.docker.com) is the most widely used containerization platform, providing
tools to build, distribute, and run containers efficiently on any system that supports it.
Kubernetes clusters consist of control and worker nodes, where the control nodes are responsible for managing the
cluster resources. For example, they delegate container creation to the worker nodes or configure a
wide spectrum of network, security, storage, and other settings [17, p. 4].</p>
        <p>The quantity of extra features besides container orchestration makes Kubernetes a popular choice
among other orchestration tools like Docker Swarm (https://docs.docker.com/engine/swarm/) and others. In the following, the basic features of
popular container orchestrators are compared (Table 2):</p>
        <p>In Table 2, we see that Kubernetes and Docker Swarm stand out with regard to covered features.
However, although Docker Swarm’s lightweight nature gives it a small efficiency advantage over
Kubernetes, Docker Swarm does not cover infrastructurally crucial functionalities [19, p. 7].</p>
        <table-wrap id="tab2">
          <label>Table 2</label>
          <caption>
            <p>Basic features of popular container orchestrators (“/” indicates the feature is not covered).</p>
          </caption>
          <table>
            <thead>
              <tr>
                <th>Feature</th>
                <th>Kubernetes</th>
                <th>Docker Swarm</th>
                <th>Marathon</th>
                <th>Cloudify</th>
              </tr>
            </thead>
            <tbody>
              <tr><td>Resource distribution</td><td>CPU / Memory</td><td>CPU / Memory</td><td>/</td><td>/</td></tr>
              <tr><td>Scheduling</td><td>Multiple types</td><td>Multiple types</td><td>Multiple types</td><td>Multiple types</td></tr>
              <tr><td>Load balancing</td><td>Round-Robin</td><td>Round-Robin</td><td>Manual impl.</td><td>/</td></tr>
              <tr><td>Pod state (health check)</td><td>Network controlled</td><td>Network controlled</td><td>Network controlled</td><td>Network controlled</td></tr>
              <tr><td>Error tolerance</td><td>Replicas (or HA)</td><td>Replicas (or HA)</td><td>Replicas (or HA)</td><td>/</td></tr>
              <tr><td>Auto-scaling</td><td>For all kinds of metrics</td><td>/</td><td>Manual impl.</td><td>Manual impl.</td></tr>
            </tbody>
          </table>
        </table-wrap>
      </sec>
      <sec id="sec-2-3">
        <title>2.4. Scalability</title>
        <p>
          The term scalability with regard to computer systems has been defined many times in the past [20, p. 205].
The scalability of algorithms/simulations and the scalability of systems are two distinct concepts. In 1998, Darren Law defined
the scalability of simulations as follows:
“A scalable simulation is one that exhibits improvements in simulation capability in direct
proportion to improvements in system architectural capability.” [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ]
To elaborate on the definition: a simulation is scalable if it utilizes the changing resources a system
offers to improve its performance during runtime. This system architecture change may be made
automatically (autoscaling) or manually, and in a horizontal or vertical manner: horizontal scaling,
also known as scaling out, changes the number of running application instances in a system, for example by
running more OS threads or adding computer nodes to a cluster [22, p. 1]. Vertical scaling, also known
as scaling up, describes the change of resources on a single node [23, p. 19], such as processors,
allocated memory, or disk space.
        </p>
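<p>The two scaling directions can be stated as a toy capacity model (purely illustrative; the numbers and method names are assumptions, not part of any real scheduler):</p>

```java
// Toy capacity model distinguishing the two scaling directions.
// All names and numbers are illustrative.
public class ScalingModel {
    // Total capacity = number of nodes * resources per node.
    public static int capacity(int nodes, int cpusPerNode) {
        return nodes * cpusPerNode;
    }

    // Horizontal scaling ("scaling out"): change the number of nodes/instances.
    public static int scaleOut(int nodes, int delta) {
        return Math.max(1, nodes + delta);
    }

    // Vertical scaling ("scaling up"): change the resources of a single node.
    public static int scaleUp(int cpusPerNode, int delta) {
        return Math.max(1, cpusPerNode + delta);
    }
}
```

<p>Both paths can reach the same total capacity; the difference is whether the system adds instances or grows one instance, which is exactly the distinction the definitions above draw.</p>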
      <p>Since modern operating systems [24, p. 9-14] and virtual machines [25, p. 15f] manage the growth
of local per-node resources automatically, even non-scalable simulations directly take advantage of
vertical scaling. Implementing horizontal scaling in a simulation is much more difficult, as it requires the
simulation to distribute itself across multiple nodes and introduces network communication.</p>
        <p>Dynamic scaling, as examined in this contribution, entails a hybrid approach that integrates both
horizontal and vertical scaling strategies to respond adaptively to workload fluctuations. The primary
emphasis lies on the additional allocation of resources.</p>
      </sec>
      <sec id="sec-2-4">
        <title>2.5. Monitoring and Alerting</title>
        <p>To automatically scale computational resources vertically and horizontally based on utilization,
measurements of provided and utilized resources must be collected and processed. This
section provides the foundations for the approaches discussed in this paper.</p>
        <p>
          Metrics [
          <xref ref-type="bibr" rid="ref26 ref27">26, 27</xref>
          ] are time-series numerical measurements in aggregated form. The aggregated form
indicates that metrics are not logs containing individual events; a metric could, for example, aggregate logs of a
specific kind within a designated timeframe. These measurements can capture anything and can be used for
various purposes, such as evaluating a system’s performance or cost-effectiveness. For the purposes of
this paper, the broader term ‘metrics’ is restricted to hardware statistics such as CPU, memory, disk
storage, disk I/O, and network usage.
        </p>
        <p>
          Due to contextual ambiguity in the definitions of the terms monitoring and alerting [
          <xref ref-type="bibr" rid="ref27">27</xref>
          ], this paper defines them in the following way:
Definition 1. Monitoring is regarded as the production, management, and consumption
of metrics.
Definition 2. Observation is regarded as the process of applying a threshold or other tiered decision
logic to metrics to define states of alertness, upon which a triggered reaction can occur.
Definition 3. Alerting is regarded as observation triggering an alert to be sent and processed by an
external alert consumer.
        </p>
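<p>Definition 2's tiered decision logic can be sketched as a small Java function. The thresholds and state names are illustrative assumptions, not DyReS's actual configuration:</p>

```java
// Tiered decision logic in the sense of Definition 2: map a metric value to a
// state of alertness via thresholds. The threshold values are illustrative.
public class Observation {
    public enum State { OK, WARNING, CRITICAL }

    public static State observe(double cpuUtilization) {
        if (cpuUtilization >= 0.90) return State.CRITICAL; // e.g. trigger scale-out
        if (cpuUtilization >= 0.70) return State.WARNING;  // e.g. prepare to scale
        return State.OK;
    }
}
```

<p>Per Definition 3, alerting would then consist of forwarding a CRITICAL state to an external alert consumer.</p>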
        <p>
          The Kubernetes Metrics-API is a standard for consuming metrics present in a Kubernetes cluster
[
          <xref ref-type="bibr" rid="ref28">28</xref>
          ]. It can be used by any metric consumers as a standardized basis on which to analyze a Kubernetes
cluster. It is often provided and implemented by a cluster service (e.g., Prometheus Adapter or Metrics
Server). The Kubernetes Metrics-API allows consumers to only need to interact with Kubernetes
API Server components. This removes the necessity for a metric consumer to interact with other
application-specific interfaces or APIs, such as the PromQL endpoint of a Prometheus Server.
        </p>
        <p>
          Prometheus [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ] is an open-source general-purpose metric system that provides multiple standards
and tools. Among these are the open scrape format OpenMetrics and the Prometheus Server for
metric accumulation and storage, built primarily for cloud environments like Kubernetes. The broader
Prometheus scope discussed in this paper includes:
• A central Prometheus Server for scraping, storing, and querying metrics via the Prometheus
Query Language (PromQL).
• Prometheus exporters, which provide metrics primarily for a central Prometheus Server to scrape
at a tunable interval using the Prometheus exposition format, OpenMetrics.
• The Alertmanager, which manages the process of alerting, such as notifying administrators
via messaging platforms or forwarding alerts to other alert-receiver services if metrics satisfy
configurable PromQL statements.
• The Prometheus Adapter, which provides aggregated Prometheus metrics to the
Kubernetes Metrics-API.
        </p>
        <p>
          The Kubernetes SIG Metrics Server [
          <xref ref-type="bibr" rid="ref29">29</xref>
          ] is an open-source CPU and memory metric producer, directly
providing the most recently scraped data point of a metric to the Kubernetes Metrics-API in a cluster.
The Metrics Server uses a predefined scraping interval to update the provided metrics. It is intended to
provide live metrics for autoscaling applications. The Metrics Server does not accumulate metrics like
Prometheus does, nor does it observe, alert, or otherwise consume them via an additional component
like the Prometheus Alertmanager.
        </p>
        <p>In this paper, these technologies are regarded as providing different fundamental approaches to
metric consumption by a metric consumer. The pull approach to metric consumption is implementable
via the Kubernetes Metrics-API or PromQL: metrics are retrieved and consumed at intervals by the
consumer. The push approach, on the other hand, is implementable via the Alertmanager: in this
approach, a metric rising above a certain threshold triggers an alert that notifies the metric consumer.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Problem Description</title>
      <p>
        In the distributed simulation of Petri nets, the entire net is traditionally partitioned statically across a
fixed number of simulators, as described in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Such static partitioning, however, severely limits the
scalability of the simulators: although additional simulators can be launched, migrating the ongoing net
simulation to these simulator instances requires considerable effort. This would require splitting the
Petri net, including its current marking, and transferring the resulting segments to the new simulators.
      </p>
      <p>Moreover, the efficiency of the simulation is significantly influenced by the chosen distribution of the
net. Suboptimal partitioning can lead to an imbalanced load distribution, causing individual simulators
to become overloaded.</p>
      <p>This contribution presents a concept for enabling the dynamic scaling of simulators — and thereby of
the simulation itself. As a first step, the scalability of P/T net simulation is to be achieved by dynamically
segmenting the net at runtime. In this process, net segments are split at transitions, and distributed
synchronous communication channels are introduced for these transitions.</p>
      <p>Subsequently, the dynamic scaling of simulators is addressed, incorporating autonomous vertical and
horizontal scaling mechanisms. To this end, the utilization of the simulators is continuously monitored
during operation. A dynamic resource management system adjusts the number of simulators based
on the monitoring data collected. This dynamic scaling mechanism increases overall efficiency by
automatically compensating for any initially unfavorable load distribution.</p>
      <p>The proposed concept of dynamic scaling of simulators will be evaluated through the simulation of
distributed P/T nets. The simulation components involved require a distributed execution environment.</p>
      <p>In order to enable dynamic scaling, simulators must be capable of partitioning the simulated model —
the P/T net — at runtime. They must also be able to transfer parts of it to other simulators.</p>
      <p>Monitoring the utilization of the simulation components is essential for enabling automatic scaling
based on workload. The Dynamic Resource Scaler (DyReS) autonomously performs vertical and
horizontal scaling of the simulators based on the monitored metrics.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Distributed System</title>
      <p>The distributed system consists of simulation components, a monitoring system, and the Dynamic
Resource Scaler (DyReS). A fundamental requirement for the environment is the capability to scale
the distributed simulation of P/T nets horizontally and vertically, driven by the current utilization of
the simulators. Horizontal scalability requires dynamically instantiating additional simulators or nodes
as needed and redistributing the P/T nets accordingly.</p>
      <p>Continuous monitoring of simulator and node utilization is essential to enable utilization-driven scaling
decisions. Based on the collected monitoring data, the DyReS autonomously manages horizontal and vertical
scaling operations.</p>
      <p>Figure: Overview of the distributed environment. Physical machines host simulators 1 to n, each
executing a P/T net with distributed synchronous channel(s); a further physical machine runs the
communication medium and the synchronisation service; monitoring feeds the Dynamic Resource Scaler.</p>
      <p>
        The distributed environment is realized using the container orchestration platform Kubernetes [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ]
(Section 2.3). This approach requires that all system components be encapsulated as containers.
      </p>
      <p>The simulation components are implemented based on the concept of distributed P/T nets (Section 2.2).
The simulator Renew (Section 2.1) is particularly well suited for this task, as it not only enables the
distributed execution of P/T nets but also provides a modular plugin architecture that facilitates
straightforward extensibility.</p>
      <p>Furthermore, Renew necessitates a dedicated synchronization service to coordinate the distributed
simulation of P/T nets (Section 2.2). This service determines which distributed synchronous transitions
may fire simultaneously and leverages Renew’s unification algorithm for this coordination.</p>
      <p>In addition, a new plugin is introduced to split a P/T net into multiple interconnected P/T nets. To
this end, the NetSplit plugin (Section 6) is developed and presented in this work.</p>
      <p>Following the partitioning, some resulting nets must be transferred to other simulators for execution.
For this purpose, the NetExchange plugin (Section 5) facilitates the exchange of P/T nets on the
simulator level.</p>
      <p>An event-driven communication infrastructure (Section 2.2) is indispensable for the messaging
between distributed simulators. Kafka is integrated into the Renew simulation framework and serves
as the communication backbone to fulfill this requirement.</p>
      <p>Effective monitoring in the distributed system must ensure that relevant metrics are available within
seconds to enable dynamic and efficient scaling by the DyReS (Section 8).</p>
    </sec>
    <sec id="sec-5">
      <title>5. Renew: NetExchange</title>
      <p>In this section, the NetExchange plugin will be presented. It provides the capability our
simulators need to send and receive running net simulations interchangeably.</p>
      <sec id="sec-5-1">
        <title>5.1. Requirements</title>
        <p>Fundamentally, this plugin exchanges running net simulations. This is needed to allow the NetSplit
plugin to send away locally divided net parts and let Renew resume the simulation there. Load
balancing nets across the cluster is not part of this plugin, as it mainly serves the purpose of being
used by NetSplit.</p>
        <p>It is required that the network communication run over KafkaRegistry, since introducing an
alternative channel would result in a heterogeneous setup. Also, the exchange should be able to be initiated both
programmatically and by the user, especially to verify the correctness of the exchange algorithm.</p>
        <p>With regard to latency, it needs to be kept as low as possible. This is crucial because resource utilization
spikes on a machine that trigger a net split and exchange by a Renew simulator can be
short-lived; with high latency, the split could become unnecessary before it completes, and the cost
of exchanging nets over the network would exceed the momentary system overload, making the exchange
counterproductive. All in all, NetExchange needs to be implemented as a Renew plugin to access loaded net objects
and the KafkaRegistry plugin.</p>
      </sec>
      <sec id="sec-5-2">
        <title>5.2. Specification</title>
        <p>Sending nets across the network is done by serializing them and embedding them in special Kafka events
designed for KafkaRegistry. To address particular Renew simulators that are part of our network, the
KafkaRegistry identification mechanism can be used.</p>
        <p>The exchange initiation by the user should be integrated as a GUI button and a command, the latter also
allowing Renew to receive direct commands through other network protocols via Renew’s CloudNative
plugin. The only computationally expensive operation when exchanging nets is serialization,
since nets are complex objects with deeply structured dependencies. This is addressed by modifying
Renew’s basic net type and cutting down the serialization depth.</p>
      </sec>
      <sec id="sec-5-3">
        <title>5.3. Design</title>
        <p>Sending nets through KafkaRegistry with its identification management is realized by the simulators
sharing handles to each other’s machines at short intervals. The handles are kept minimal to reduce the network
overhead. Through a handle, a simulator sends both its net and the simulated net object. These belong together,
as the running simulation builds upon the bare net object with regard to the on-screen rendering,
which results in a two-step sending mechanism.</p>
        <p>Sending the net and the simulated net object together does not introduce significant network overhead,
because Java’s transient modifier is used to exclude heavy references from serialization. For example, nets in Renew keep
references to the Renew simulator core objects, which in turn reference a large part of Renew’s ecosystem;
such simulator references were therefore marked transient, and after a net is received, the receiving side injects its own simulator.</p>
        <p>The Renew plugin was designed to encapsulate code that is only accessed by the plugin itself
within a custom module. This two-module plugin infrastructure is coherent with the other plugins
introduced in this paper.</p>
      </sec>
      <sec id="sec-5-4">
        <title>5.4. Implementation</title>
        <p>The NetExchange plugin was developed using Java 17, the Java version used by Renew.
Gradle 8.4 was utilized for build and dependency management, which is also part of Renew’s build
ecosystem.</p>
        <p>Git and GitLab were used, as well as Renew’s CI/CD pipeline, to automate compilation, testing,
packaging, and other tasks belonging to Renew. With these steps, we assured code quality and correct
functionality.</p>
      </sec>
      <sec id="sec-5-5">
        <title>5.5. Evaluation</title>
        <p>Evaluation was done by triggering the net exchange with the button provided by Renew’s GUI
and with the command. This step verified the correctness of the feature and initiated the integration of the
NetExchange plugin into the NetSplit plugin.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Renew: NetSplit</title>
      <p>This section provides a detailed description of the developed NetSplit plugin. Initially, the
requirements that form the basis of the plugin’s development are outlined in Section 6.1. Following this, a
comprehensive specification of its functionalities is provided in Section 6.2. The design considerations
and architectural choices that informed the development process are then outlined in Sections 6.3 and 6.4.
In conclusion, the developed plugin is evaluated in Section 6.5 according to the requirements established
at the outset.</p>
      <sec id="sec-6-1">
        <title>6.1. Requirements</title>
        <p>In the case that a simulator reaches its load limits, it is necessary to relieve the simulator by scaling,
which is elaborated in Section 2.4. In this context, the ability to divide distributed P/T nets is paramount.
The process of dividing a distributed P/T net enables the simulation of one simulator to be transferred to
a new simulator through the NetExchange plugin described in Section 5. The function of the NetSplit
plugin is to divide the distributed P/T nets of the simulator’s simulations into the desired number of parts.
The option to select a freely chosen number of splits is important, as it introduces greater variability and
flexibility to the scaling of the simulations.</p>
        <p>It is imperative that the plugin ensures the resulting split distributed P/T nets continue to exhibit the
same behaviour as the original. Failing this criterion would lead to a modification of the simulation and,
consequently, the production of erroneous results, an outcome that must be prevented.</p>
        <p>In addition, the plugin should operate in a highly efficient manner, ensuring that no unnecessary
resources are consumed. This is to ensure that the simulation process is accelerated by the scaling,
rather than hindered. To illustrate this point, it should be noted that slow splitting could render the
scaling process redundant, as the simulation would have completed in an equivalent timeframe or
faster without it. The selection of the algorithm for the division of distributed P/T nets
is, therefore, of paramount importance. Nevertheless, it is imperative to emphasise the significance of
practical feasibility in this context.</p>
        <p>The integration of the NetSplit plugin within Renew (Section 2.1) is therefore pivotal in this regard.
This inclusion allows for improved communication between its components, such as the NetExchange
plugin (Section 5). It is essential to adhere to the existing specifications and standards of Renew. This
approach facilitates seamless integration into the existing plugin landscape of Renew and ensures
optimal maintainability.</p>
      </sec>
      <sec id="sec-6-2">
        <title>6.2. Specification</title>
<p>To ensure the successful division of distributed P/T nets by the NetSplit plugin when scaling a
simulation, several factors must be taken into consideration. Firstly, it is important that a
distributed P/T net can be split into a freely selectable number of parts. To this end, an integer
parameter must be available for this purpose.</p>
<p>Secondly, the reconnection of the split distributed P/T nets via synchronous channels is imperative.
This is to ensure that the new distributed P/T nets exhibit the same behaviour as the original, as
described in the requirements (Section 6.1).</p>
        <p>To facilitate this process, it is necessary to transfer existing tokens from the original distributed P/T
net to the new distributed P/T nets. This process ensures the continuity of the simulation, enabling
the resumption of the simulation at the precise point at which it was previously interrupted for the
purpose of the split. This is a pivotal consideration to ensure the integrity of the simulation and the
preservation of data.</p>
        <p>
          The fourth point pertains to the transfer of inscriptions. All inscriptions that are supported by
distributed P/T nets [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] must be transferred. These include weighted arcs and, above all, synchronous
channels, which in distributed P/T nets are attached only to transitions; this is why transitions are
the elements that are split. The transfer of inscriptions is of paramount importance, as otherwise the
behaviour of the split distributed P/T nets would no longer correspond to that of the original.
        </p>
        <p>Finally, it is imperative to adopt the connected text of places and transitions. While this does not
directly influence the behaviour of the distributed P/T nets, it does facilitate understanding and ensure
the sustainability of the division. This approach ensures that, even following multiple splits, the precise
identification of transitions or places within each new distributed P/T net remains attainable.</p>
      </sec>
      <sec id="sec-6-3">
        <title>6.3. Design</title>
        <p>
          The new NetSplit plugin is being developed as a Renew plugin. As a consequence of the modular
structure of the former [
          <xref ref-type="bibr" rid="ref10 ref9">9, 10</xref>
          ], the plugin is developed in a separate module. This is necessary to fulfil
the current development standards of Renew and thus enables direct integration.
        </p>
        <p>
          The Contraction algorithm, which forms part of the Karger-Stein algorithm [
          <xref ref-type="bibr" rid="ref31">31</xref>
          ], was utilised as the
fundamental algorithm for the division of distributed P/T nets by the NetSplit plugin. The algorithm
functions on the premise of Randomised Contraction Algorithms, which entail the random selection of
an edge and the subsequent merging of the two end nodes, accompanied by the requisite update of the
edges. This process is repeated iteratively until only two nodes remain. The edges that persist at this
stage are designated as the intersecting edges.
        </p>
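<p>The contraction step described above can be sketched in Java as follows. This is an illustrative implementation under our own assumptions (a union-find representation of merged nodes; all names are hypothetical), not the NetSplit plugin’s actual code:</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

/**
 * Illustrative sketch of the contraction step (hypothetical names, not the
 * plugin's actual code): repeatedly pick a random edge and merge its two end
 * nodes until only two super-nodes remain. The edges still crossing between
 * the two groups are the cut edges.
 */
public class KargerContraction {

    /** Partitions vertices 0..n-1 into two groups; returns a 0/1 label per vertex. */
    public static int[] contract(int n, int[][] edges, Random rnd) {
        int[] parent = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i;   // union-find forest
        int components = n;
        List<int[]> pool = new ArrayList<>(Arrays.asList(edges));
        while (components > 2 && !pool.isEmpty()) {
            int[] e = pool.remove(rnd.nextInt(pool.size()));
            int a = find(parent, e[0]);
            int b = find(parent, e[1]);
            if (a != b) {           // edges inside a super-node are self-loops: skip them
                parent[a] = b;      // contract: merge the two end nodes
                components--;
            }
        }
        int[] label = new int[n];
        int first = find(parent, 0);
        for (int i = 0; i < n; i++) label[i] = (find(parent, i) == first) ? 0 : 1;
        return label;
    }

    private static int find(int[] parent, int v) {
        while (parent[v] != v) {
            parent[v] = parent[parent[v]];           // path halving
            v = parent[v];
        }
        return v;
    }
}
```

<p>The sketch stops once two super-nodes remain; for a connected input graph both resulting groups are non-empty, and the edges whose endpoints carry different labels correspond to the intersecting edges mentioned above.</p>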
        <p>It must be acknowledged that the Contraction algorithm only ever splits an undirected graph
into two parts. This limitation can be circumvented by repeatedly running the algorithm on the
largest remaining subgraph that still contains more than one node, as illustrated in Algorithm
1. This approach guarantees that the graph can be divided into any number of parts. Nevertheless, it
does not guarantee that the resulting subgraphs will have equivalent sizes. This constitutes a
further rationale for opting for the Contraction step of the Karger-Stein algorithm: should the
imbalance of the components prove too pronounced in practice, the full Karger-Stein algorithm can be
adopted later with little effort.</p>
        <p>
          Furthermore, the algorithm exhibits a favourable runtime complexity of O(p²) [
          <xref ref-type="bibr" rid="ref31">31</xref>
          ], where p is the number of places and thus a measure of the size of the graph. To
establish the overall runtime, note that the algorithm must also be run k times, where k is the number of
divisions. This results in an increased runtime of O(k · p²). While the algorithm is not
optimal, given the NP-hard nature of balanced graph partitioning, the randomised algorithm provides
highly satisfactory results in practice and meets the requirements (Section 6.1).
        </p>
        <p>However, it is important to note that the Contraction algorithm always cuts edges, not the
nodes that represent places and transitions. This leads to synchronization errors, as the graph may be
split between a place node and a transition node, and in distributed P/T nets only transitions, not
places, can take part in synchronization. Consequently, additional transitions must be incorporated
after the algorithm has been executed. The strategic positioning of these transitions, either before
or after places, ensures the creation of transition-bordered distributed P/T nets. This is achieved
with a runtime of O(m · n · p), where m is the number of split edges, n is the number of nets created
and p is the number of places. This approach facilitates the subsequent synchronisation of the
distributed P/T nets, since effectively only transitions are divided, and never edges themselves.</p>
        <p>Given that the input of the Contraction algorithm is an undirected graph, it follows that an existing
distributed P/T net must also be regarded as an undirected graph. In this case, transitions and places
are regarded as nodes. Existing arcs are regarded as edges, but without direction. This straightforward
approach facilitates the preprocessing and postprocessing of distributed P/T nets. This approach
also facilitates the adoption of existing labelling and tokens. The inverse transformation is equally
straightforward. Nodes and edges remain linked to their original objects. It can thus be concluded that
the runtime of these two processes is O(p).</p>
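<p>This conversion can be illustrated with a minimal Java sketch (hypothetical types; Renew’s actual data types differ): places and transitions become nodes that keep references to their original objects, and arcs become undirected edges.</p>

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Illustrative sketch (not Renew's actual data types): a distributed P/T net
 * viewed as an undirected graph. Nodes keep references to the original places
 * and transitions, so labels and tokens can be carried over after splitting;
 * arcs become edges with their direction dropped.
 */
public class NetToGraph {
    public record Node(String id, boolean isPlace, Object original) {}
    public record Edge(Node a, Node b) {}
    public record Graph(List<Node> nodes, List<Edge> edges) {}

    /** Linear-time conversion: one node per place/transition, one edge per arc. */
    public static Graph fromNet(List<String> places, List<String> transitions,
                                List<String[]> arcs) {
        Map<String, Node> byId = new HashMap<>();
        List<Node> nodes = new ArrayList<>();
        for (String p : places) {
            Node node = new Node(p, true, p);        // 'original' would be the place object
            byId.put(p, node);
            nodes.add(node);
        }
        for (String t : transitions) {
            Node node = new Node(t, false, t);
            byId.put(t, node);
            nodes.add(node);
        }
        List<Edge> edges = new ArrayList<>();
        for (String[] arc : arcs) {                  // arc = {sourceId, targetId}
            edges.add(new Edge(byId.get(arc[0]), byId.get(arc[1])));
        }
        return new Graph(nodes, edges);
    }
}
```

<p>Because each node retains a reference to its original object, the inverse transformation after splitting is equally straightforward, as stated above.</p>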
        <p>It can thus be demonstrated that the algorithm presented in Algorithm 1 achieves an overall
runtime of O(k · p²) + O(m · n · p). It is important to note that the number of cuts, k, the number
of resulting nets, n, and the number of split edges, m, are all typically small. The only factor that
can be moderately large is the number of places, p.</p>
        <sec id="sec-6-3-1">
          <title>Algorithm 1: Splitting Algorithm</title>
          <p>Input: distributed P/T net, number of parts
Output: distributed P/T nets
1 graph = distributedP/TNetToGraph(petriNet);
2 result = List();
3 result.add(graph);
4 while result.size() &lt; numberOfParts do
5   biggest = result.getBiggestGraph();
6   splitGraphs = contractionAlgorithm(biggest);
7   result.remove(biggest);
8   result.addAll(splitGraphs);
9 addBoundaryTransitions(result);
10 addSynchronousChannels(result);
11 return result;</p>
        </sec>
      </sec>
      <sec id="sec-6-4">
        <title>6.4. Implementation</title>
        <p>
          The NetSplit plugin was developed using Java 17 [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ], which corresponds to the Java version of
Renew (Section 2.1). Gradle 8.4 [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] is utilised for the management of dependencies, the process of
modularisation, and the reproducibility of the build.
        </p>
        <p>The integration of the NetSplit plugin into Renew facilitates the utilisation of existing plugins and
their associated functions. This integration enables direct communication via defined interfaces and
using existing data types. To illustrate this point, consider the data type employed by the NetSplit
plugin for distributed P/T nets, which is consistent with that utilised by the NetExchange plugin
(Section 5).</p>
        <p>
          The behaviour of the plugin was tested using unit tests. JUnit [
          <xref ref-type="bibr" rid="ref32">32</xref>
          ] version 5.9.0 was utilised as the
foundational framework, with Mockito version 4.8.0 [
          <xref ref-type="bibr" rid="ref33">33</xref>
          ] employed for the purpose of mocking classes.
        </p>
        <p>
          The administration and management of versions was facilitated by the utilisation of Git [
          <xref ref-type="bibr" rid="ref34">34</xref>
          ] and
GitLab [
          <xref ref-type="bibr" rid="ref35">35</xref>
          ]. These tools have been utilised in the field of software development for an extended period,
and their efficacy in fulfilling the requisite tasks is well documented.
        </p>
        <p>In this context, particular emphasis was placed on the use of branches and merge requests.
Each merge request underwent peer review and was approved by other developers before being merged.
Additionally, a continuous integration/continuous deployment (CI/CD) pipeline was implemented to
automate the build of the project and the execution of unit and integration tests. This ensures the
functionality of both the NetSplit plugin and Renew.</p>
      </sec>
      <sec id="sec-6-5">
        <title>6.5. Evaluation</title>
        <p>The integration of the NetSplit plugin into Renew has been accomplished successfully, as
demonstrated in Figure 4. This was achieved by adhering to Renew’s conventions, such as matching its
Java version, and by leveraging existing concepts such as modularisation. The focus on practical
feasibility also contributed to this.</p>
        <p>(a) Undivided distributed P/T net
(b) Distributed P/T nets splitted into two
parts</p>
        <p>The plugin has been demonstrated to meet the requirement of dividing a distributed P/T net into
multiple sections. The implementation of Karger’s algorithm, in conjunction with the creation of custom
data types for undirected graphs, facilitates a process of transformation and decomposition.</p>
        <p>The developed, customised version of the contraction algorithm of the Karger-Stein algorithm
achieves a runtime of O(k · p²). This efficient runtime is achieved by repeatedly executing the
contraction algorithm, which has a runtime of O(p²), when splitting a net. The splitting procedure as
a whole is executed with a runtime of O(k · p²) + O(m · n · p). This runtime fulfils the requirements
for efficient division, as the factors k, m and n are very small on average and only p may be
larger.</p>
        <p>The added transitions and synchronous channels allow the new parts of the distributed P/T nets to
communicate. This is crucial for ensuring that the new nets exhibit behaviour equivalent to that of
the original distributed P/T net. Furthermore, the ability to take over inscriptions and tokens
directly, facilitated by references in the objects of the nodes and edges, is equally crucial in
ensuring the same behaviour.</p>
        <p>The developed NetSplit plugin has been shown to fulfil all the requirements (Section 6.1) and
specifications (Section 6.2) defined at the beginning of this section. Figure 4 depicts a
distributed P/T net before and after splitting, illustrating the effectiveness of the process. This
assertion is not limited to small distributed P/T nets but extends to more complex ones as
well.</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>7. Monitoring: Utilization</title>
      <p>This section provides a comprehensive report on the monitoring and observation strategies considered
for the Dynamic Resource Scaler (DyReS) and the system implemented.</p>
      <p>First, it needs to be established why there is a need for monitoring and observation. This and the
requirements regarding the monitoring and observation system are outlined in the Requirements
(Section 7.1). In the following Specification (Section 7.2), there is a description of the choices made to
fulfill the requirements. After that, a description of how the specified system is structured is presented in
the Design Section (Section 7.3). The Implementation (Section 7.4) details how the metrics are retrieved
and how states of alertness are generated for the prototype. The chosen implementation is then assessed
in the Evaluation (Section 7.5).</p>
      <sec id="sec-7-1">
        <title>7.1. Requirements</title>
        <p>For the DyReS to achieve accuracy in making complex autoscaling decisions based on arbitrary
simulations with changing resources (Section 2.4), there needs to be precise monitoring and observation
regarding resource utilization. Therefore, the implementation must monitor cluster-wide computational
resource availability and utilization associated with horizontally or vertically scalable components.</p>
        <p>Delays between a simulation or simulator eligible for scaling and DyReS being able to make a decision
should be minimized to further enhance the accuracy of scaling decisions. Hence, the monitoring should
provide up-to-date metrics that are mostly concurrent with actual utilization, and the observation
should occur with as little delay as possible. For this requirement, a worst-case time for a monitoring
implementation can be determined both theoretically and experimentally.</p>
        <p>Monitoring, alerting, and observation components are not part of the computation that yields results;
therefore, the consumption of computational resources by monitoring components should be minimized.
This is to ensure that monitoring, observation, alerting, and subsequent scaling allow the simulation
to use more computational resources for acceleration, thereby improving the effectiveness of scaling
decisions.</p>
        <p>To enhance longevity, the implementation should utilize standard technologies and be open to
changes. This ensures that the system can be used in future prototyping work and eventually beyond
the prototyping stage.</p>
      </sec>
      <sec id="sec-7-2">
        <title>7.2. Specification</title>
        <p>To ensure monitoring and observation regarding resource utilization by horizontally and vertically
scalable components, the system needs to be deployed in close communication with the Kubernetes
environment used for scaling the simulation. Therefore, monitoring, observation, and alerting system
components are deployed in the cluster used for scaling, and metrics are collectable about the Nodes
and Pods components of the cluster. This also adheres to the established requirement to minimize
delays, as it is generally assumed that delays increase with more communicative hops in the system.</p>
        <p>To avoid the problem of providing data points that may not accurately depict resource utilization
outside the measurement interval, the monitoring implementation provides summed and averaged data.
This approach is assumed to offer a more accurate assessment of utilization by acknowledging potential
computational spikes between measurements.</p>
        <p>The system components are chosen to be lightweight, providing low additional resource utilization
for the monitored components. This fulfills the requirement for low additional resource consumption.</p>
        <p>For future relevance, open-source projects are utilized. These publicly version-controlled and
community-maintained projects can be modified and specialized if necessary. Furthermore, these
projects are mostly already readily used, and therefore it is assumed that past work in the domain of
this paper can be better utilized. Even though a fully self-maintained monitoring project may fulfill
other requirements more closely, it is not a priority of this work to create and maintain such monitoring
components from the ground up. However, the system does contain a self-maintained observation
project as part of the DyReS implementation, ensuring the ability to establish new observation
strategies. Further closed-source projects may be relevant but lack the ability to be modified for potential
specialization. This aims to fulfill the requirement for the longevity of the implementation.</p>
      </sec>
      <sec id="sec-7-3">
        <title>7.3. Design</title>
        <p>For the prototype, two prominent designs were primarily considered: the Metrics Server approach and
the Prometheus approach. The first and implemented Metrics Server design for the monitoring and
observation system is comprised of two components. First, the Kubernetes cluster-deployed Metrics
Server provides the Kubernetes Metrics-API. Second, the Java-implemented metric handler, as part of
the DyReS project, collects metrics from the endpoint of the Kubernetes Metrics-API and additionally
implements basic threshold observation on these metrics. The process of observation is therefore
delegated to be implemented by the DyReS project, and the need for alerting is circumvented.</p>
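<p>The threshold observation delegated to DyReS can be illustrated by a minimal Java sketch (names and structure are hypothetical; the actual metric handler reads the Kubernetes Metrics-API as described in the Implementation below):</p>

```java
import java.util.List;
import java.util.Map;

/**
 * Illustrative sketch of the pull-based observation step: each scrape
 * interval, the utilization reported per pod is compared against a
 * configured threshold. Names are hypothetical, not DyReS's actual code.
 */
public class ThresholdObserver {
    private final double threshold;               // utilization fraction, e.g. 0.8 = 80%

    public ThresholdObserver(double threshold) {
        this.threshold = threshold;
    }

    /** Returns the names of pods whose utilization breaches the threshold. */
    public List<String> breaches(Map<String, Double> utilizationByPod) {
        return utilizationByPod.entrySet().stream()
                .filter(e -> e.getValue() >= threshold)
                .map(Map.Entry::getKey)
                .sorted()                          // deterministic ordering for callers
                .toList();
    }
}
```

<p>In the pull-based design, a result like this would feed directly into the decision logic rather than being quantized into alerts by a separate Alertmanager.</p>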
        <p>This is a pull-based approach that merges the complexity of observation with the process of making
a scaling decision in DyReS. All relevant metrics need to be retrieved and processed by the metric
handler in a definable scrape interval, after which some decision logic needs to be applied. Initially, it
was assumed that such a design would not suffice for achieving a low resource footprint in the DyReS,
and a push-based approach was considered. This second design approach includes more than two
components:
• Kubernetes exposing pod metrics for a Prometheus system to integrate.
• Node Exporters for exposing node metrics (Node CPU usage, RAM, Network, and Disk load) for
a Prometheus system to integrate.
• Prometheus Server to scrape and store exposed metrics at specified intervals.
• Alertmanager to apply alerting rules to collected metrics and push alerts to DyReS.
• DyReS to receive alerts defined by the Alertmanager and make scaling decisions based on them.</p>
        <p>It was assumed that the advantage of such a design would lie in leaving the decision process to
DyReS and the observation process to the Alertmanager. Therefore, the load of both processes could be
independent. However, experimentation showed that the Alertmanager did not provide the needed
speed for the use case. In combination with the multiple components that needed to cooperate and
communicate, there were delays of up to 5 minutes between the load rising and DyReS being able to
react. This contravenes the requirement to minimize delay, which is why this approach was abandoned
in favor of the Metrics Server system and a pull-based approach.</p>
        <p>Additionally, this design does not allow for purely value-based decisions based on metrics, as alerts
are quantized by the Alertmanager. An example of a DyReS behavior that would not be possible with
this design but would be possible with the Metrics Server design is an up-to-date dynamic determination
of the number of simulators based on the total CPU consumed.</p>
      </sec>
      <sec id="sec-7-4">
        <title>7.4. Implementation</title>
        <p>The implementation encompasses the provisioning and access control of the Kubernetes Metrics-API
within the cluster context for scaling, as well as the Java implementation of the metric handler, including
the associated observation as part of the decision-making process.</p>
        <p>
          Metrics Server is deployed in the cluster via the helm chart provided by the project’s GitHub
repository [
          <xref ref-type="bibr" rid="ref29">29</xref>
          ] and provides metrics through the Kubernetes Metrics-API. Additionally, access to these metrics
via the Kubernetes Metrics-API hosted by the Kubernetes API Server must be granted to the DyReS
containerized deployment. This is achieved through a Kubernetes ServiceAccount and Role-Based
Access Control (RBAC).
        </p>
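<p>Such a grant could look roughly like the following manifest sketch (resource names and namespace are hypothetical; the paper does not reproduce the actual manifests):</p>

```yaml
# Illustrative sketch: grant a ServiceAccount read access to the
# Kubernetes Metrics-API (metrics.k8s.io). All names are hypothetical.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: dyres
  namespace: simulation
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: dyres-metrics-reader
rules:
  - apiGroups: ["metrics.k8s.io"]
    resources: ["nodes", "pods"]
    verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: dyres-metrics-reader
subjects:
  - kind: ServiceAccount
    name: dyres
    namespace: simulation
roleRef:
  kind: ClusterRole
  name: dyres-metrics-reader
  apiGroup: rbac.authorization.k8s.io
```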
        <p>
          The information to access the ServiceAccount is injected as a file into the DyReS deployment. It is
used for communicating with the Kubernetes API Server to create new pods, detect nodes, and retrieve
metrics about node usage. Extraction of metrics from the Kubernetes Metrics-API is performed by the
metric handler, which is implemented as part of DyReS using Java 17 with Gradle 8.4, as specified in
the DyReS Implementation (Section 8.4). The API communication is facilitated by the Kubernetes API
Client for Java [
          <xref ref-type="bibr" rid="ref36">36</xref>
          ].
        </p>
        <p>
          For both implementations, Git [
          <xref ref-type="bibr" rid="ref34">34</xref>
          ] and GitLab [
          <xref ref-type="bibr" rid="ref35">35</xref>
          ] were utilized to version control the progress.
A shared and virtualized Kubernetes cluster was used for testing and iterating over the necessary
deployments.
        </p>
      </sec>
      <sec id="sec-7-5">
        <title>7.5. Evaluation</title>
        <sec id="sec-7-5-1">
          <title>The implementation provides access to the following metrics:</title>
          <p>• Per Node and Pod CPU usage in milliCPU and RAM usage in bytes.</p>
          <p>• Per Node CPU and RAM availability in percent.</p>
          <p>However, this does not allow for more complex decisions based on other metrics, such as network
and disk load. This system is therefore unable to scale simulations that primarily use such resources in
an observed manner.</p>
          <p>
            The Metrics Server implementation is able to provide metrics averaged over 15-second intervals [
            <xref ref-type="bibr" rid="ref29">29</xref>
            ].
DyReS is therefore enabled to start the process of making scaling decisions every 15 seconds and is
guaranteed to work with data that is, in the worst case, 15 seconds old.
          </p>
          <p>It has been observed that the Metrics Server is a lightweight monitoring application. DyReS
has been noted to be a heavier application due to the constant observation process that is running.
Further experimentation needs to be conducted on the prototype to gather meaningful, precise timing
and resource usage data over longer periods of time.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-8">
      <title>8. Dynamic Resource Scaler</title>
      <sec id="sec-8-1">
        <title>8.1. Requirements</title>
        <p>The DyReS (Dynamic Resource Scaler)5 enables dynamic resource scaling in distributed P/T-net
simulations while adhering to strict functional and architectural constraints. The system’s design is
governed by seven core principles to ensure seamless interoperability with Renew while preserving
simulation behavior integrity.</p>
        <p>Central to this approach is the continuous monitoring of cluster-wide resource utilization metrics
(Section 7), including CPU, memory, and disk usage in real time. These metrics are aggregated through
a distributed monitoring framework and algorithmically analyzed to detect threshold breaches, which
autonomously trigger scaling operations. The Decider module prioritizes vertical scaling when local
Pod/Node resources are available, employing a configurable aggressiveness factor (AF) ranging from 0 to 100%
to determine scaling intensity. Through the Kubernetes Java Client API, this module dynamically adjusts
CPU and memory allocations for simulator instances, proportionally scaling resources to minimize
network overhead from horizontal scaling.</p>
        <p>Network partitioning is achieved via the NetSplit plugin, which divides simulations into k + 1
logically isolated subnets (k &gt; 0) without compromising structural integrity (Section 6.3). Synchronized
communication between these subnets is facilitated by the NetExchange plugin using distributed
channels, implementing Renew’s architectural specifications for inter-subnet coordination (Section 5).
This dual mechanism enables transparent migration of partitioned nets across heterogeneous
pods/nodes during horizontal scaling operations, preserving both original network behavior and temporal
5The acronym DyReS was chosen to maintain consistency with established naming conventions in distributed systems
literature, where “dynamic” typically precedes the controlled entity in scaling components.
consistency. Concurrently, vertical scaling adjusts computational resources within individual pods
while maintaining strict real-time simulation constraints.</p>
        <p>To balance autonomic control with operational flexibility, the system exposes Kubernetes
API-compliant endpoints for manual intervention. These endpoints adhere to cloud-native versioning
standards and allow autonomic decisions to be temporarily overridden, enabling controlled simulation
redistribution during development and production phases.</p>
      </sec>
      <sec id="sec-8-2">
        <title>8.2. Specification</title>
        <p>The Dynamic Scaler is implemented as a standalone Java application interfacing with Kubernetes via
the official Java client library (Section 2.3). It is composed of three modules: the Decider, the Horizontal
Scaler, and the Vertical Scaler. The Decider uses a rule-based algorithm with a configurable AF (0–100%)
to determine the scaling intensity. It ingests data from the Kubernetes Metrics server to minimize
architectural complexity and reduce latency.</p>
        <p>When utilization thresholds are breached, the Decider prioritizes pod-local vertical scaling,
minimizing network overhead. If vertical scaling is not possible, horizontal scaling provides new pods or nodes
by invoking NetSplit to split the simulation into k + 1 subnets. The Decider coordinates the migration
of the subnets to new instances by invoking NetExchange after the Horizontal Scaler has created the pods.
Manual scaling is enabled through DyReS endpoints.</p>
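<p>The rule-based choice can be sketched as follows. This is a simplified illustration under our own assumptions; the exact decision formula and the precise use of the AF are not specified in the paper:</p>

```java
/**
 * Illustrative sketch of the Decider's rule-based choice (hypothetical
 * names, not DyReS's actual code). The aggressiveness factor AF, in the
 * range 0..100, modulates how strongly resources are increased once a
 * utilization threshold is breached.
 */
public class Decider {
    public enum Action { NONE, SCALE_VERTICALLY, SCALE_HORIZONTALLY }

    private final double threshold;   // utilization fraction, e.g. 0.8
    private final int af;             // aggressiveness factor, 0..100

    public Decider(double threshold, int af) {
        this.threshold = threshold;
        this.af = af;
    }

    /** Prefer pod-local vertical scaling; fall back to horizontal scaling. */
    public Action decide(double utilization, boolean nodeHasSpareCapacity) {
        if (utilization < threshold) return Action.NONE;
        return nodeHasSpareCapacity ? Action.SCALE_VERTICALLY
                                    : Action.SCALE_HORIZONTALLY;
    }

    /** Additional CPU (in milliCPU) to request, modulated by the AF. */
    public long verticalIncrement(long currentMilliCpu) {
        return Math.round(currentMilliCpu * (af / 100.0));
    }
}
```

<p>A higher AF thus requests proportionally larger increments per scaling step, trading responsiveness against the risk of over-provisioning.</p>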
      </sec>
      <sec id="sec-8-3">
        <title>8.3. Design</title>
        <p>The system is integrated into a GitLab CI pipeline and containerized using a multi-stage Docker build.
Containers provide isolated runtime environments, while Docker is a widely used platform for building,
packaging, and distributing containerized applications. The Decider’s architecture follows three phases:
1. Observing: The Decider module observes threshold breaches in monitored metrics.
2. Deciding: The Aggressiveness Factor (AF) modulates scaling intensity through weighted resource
utilization. The Decider prioritizes node-local vertical scaling when resources are available;
horizontal scaling splits simulations into k + 1 subnets via NetSplit when cluster capacity
requires expansion.
3. Executing: The Horizontal Scaler provisions new nodes by waking them via Wake-on-LAN and new
pods through Kubernetes orchestration, while the Vertical Scaler adjusts CPU/memory allocations
via API calls. NetSplit partitions simulations into subnets while maintaining net behavior, with
NetExchange handling state migration through synchronized channels.</p>
        <p>The DyReS operates as a stand-alone Java application decoupled from Renew, and does not impose
any dependencies on external components unless otherwise configured. The decoupling from Renew
will enable a general-purpose solution. The Java implementation will use the Kubernetes Client API for
orchestration tasks. The DyReS avoids Kubernetes-native autoscaling tools (e.g., HPA/VPA) because
they rely on generic resource metrics and fixed thresholds, which are insufficient for the dynamic
demands of parametric simulations. These tools lack the contextual awareness to distinguish transient
spikes from sustained load. Their rigid policies do not allow fine-tuned adjustments needed in simulation
environments.</p>
      </sec>
      <sec id="sec-8-4">
        <title>8.4. Implementation</title>
        <p>The DyReS implementation builds upon the monitoring infrastructure using Java 17 with Gradle 7.6
and Kubernetes Java Client 22.0.0. The Vertical Scaler module dynamically adjusts CPU and memory
allocations for existing simulator pods through Kubernetes resource quota updates, constrained by
Java’s static heap allocation limitations. This implementation choice necessitated careful memory
profiling to avoid out-of-memory errors during vertical scaling operations. For horizontal scaling, the
Horizontal Scaler interacts directly with the Kubernetes API to provision new nodes or deploy new
pods.</p>
        <p>Manual scaling functionality is built in. All components are packaged as a multi-stage Docker
image through GitLab CI, with dependency injection used to maintain decoupling from Renew’s core
architecture. Notable challenges included mitigating Java’s garbage collection pauses during metric
aggregation cycles, which was resolved through off-heap memory caching, and ensuring identical subnet
behavior during concurrent migrations, which was addressed through version-stamped transition
synchronization.</p>
      </sec>
      <sec id="sec-8-5">
        <title>8.5. Evaluation</title>
        <p>The evaluation of DyReS demonstrates its effectiveness in dynamically scaling distributed P/T-net
simulations while maintaining simulation integrity and minimizing resource overhead. The system was
tested under varying workload scenarios to assess its scalability, efficiency, and adaptability.</p>
        <p>Scalability tests revealed that DyReS efficiently handled both vertical and horizontal scaling. Vertical
scaling dynamically adjusted CPU and memory allocations, reducing resource contention without
requiring additional pods. Horizontal scaling, enabled by NetSplit and NetExchange, ensured seamless
migration of subnets to new nodes while preserving temporal consistency. These mechanisms allowed
the system to adapt to workload changes with minimal disruption.</p>
        <p>The Kubernetes Metrics Server reduced monitoring overhead by filtering irrelevant data. This
optimization ensured that DyReS operated effectively even under high workloads, maintaining a low
computational footprint. In comparison to Kubernetes-native autoscaling tools, DyReS demonstrated
superior contextual awareness, enabling precise scaling decisions tailored to the dynamic demands of
parametric simulations. This capability highlights its suitability for environments requiring fine-grained
control over resource allocation.</p>
        <p>In conclusion, DyReS proved to be a robust and efficient solution for dynamic resource scaling in
distributed P/T-net simulations. Its modular design and reliable scaling mechanisms make it well-suited
for scientific and cloud computing applications. Future work will focus on optimizing the scaling
algorithms, scaling down as well as up, supporting other net types, creating a dedicated test
environment, and addressing the identified challenges to further enhance the system’s performance and
applicability.</p>
      </sec>
    </sec>
    <sec id="sec-9">
      <title>9. Discussion</title>
      <p>A significant advantage of the concept introduced herein is that the initial distribution of P/T nets
can be specifically tailored to the characteristics and capabilities of the target simulators. This
adaptability enables the compensation of non-optimal initial allocations, enhancing the overall simulation
efficiency.</p>
      <p>Furthermore, the system supports an automated adjustment of the distribution at runtime, which
dynamically responds to the prevailing computational load on the individual simulators and the underlying
hardware nodes. This contributes to improved resource utilization and scalable performance.</p>
      <p>The presented concept mitigates internal system constraints typically imposed by monolithic
simulator architectures by enabling a distributed and scalable simulation of P/T nets. These constraints
include, but are not limited to, the maximum number of tokens, concurrently active transitions, or
threads.</p>
      <p>In addition, the horizontal and vertical scaling of simulator nodes makes it possible to circumvent
the physical limitations of single-machine environments, such as restricted processor availability or
limited main memory capacity. This architectural flexibility allows for the simulation of significantly
larger and more complex models than would otherwise be feasible.</p>
      <p>One disadvantage of the current implementation concerns the NetSplit plugin, which, in its current
form, exclusively performs a feasibility analysis of the net partitioning. Consequently, neither the
runtime behavior is subject to optimization nor is a balanced output of subnets guaranteed. Both
aspects, runtime optimization and load-balanced output generation, constitute active areas of ongoing
research and development and are essential for achieving high efficiency in large-scale deployments.</p>
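<p>What a feasibility analysis of a net partitioning might check can be sketched as follows; the representation of the net and the criterion (every transition assigned to exactly one fragment) are illustrative assumptions, not the NetSplit implementation:</p>

```python
# Illustrative sketch of a feasibility check for partitioning a P/T net:
# every transition must end up in exactly one fragment. The criterion is
# assumed for illustration and says nothing about the runtime quality or
# load balance of the resulting fragments, which is exactly the open
# optimization problem.

def partition_is_feasible(transitions, fragments):
    """transitions: set of transition ids; fragments: list of id sets."""
    assigned = [t for frag in fragments for t in frag]
    covers_all = set(assigned) == set(transitions)
    no_overlap = len(assigned) == len(set(assigned))
    return covers_all and no_overlap

net = {"t1", "t2", "t3", "t4"}
ok  = partition_is_feasible(net, [{"t1", "t2"}, {"t3", "t4"}])  # True
bad = partition_is_feasible(net, [{"t1", "t2"}, {"t2", "t3"}])  # overlap, t4 lost
```

A load-balanced variant would additionally compare fragment sizes or expected firing rates, which corresponds to the optimization work identified above.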
      <p>A further disadvantage of the presented system lies in the considerable complexity of the required
execution infrastructure. The implementation necessitates deploying and managing multiple physical
nodes orchestrated via containerized environments such as Kubernetes. Moreover, additional
components such as monitoring frameworks, the DyReS component, and an advanced, modular simulation system
are indispensable to support the described scaling strategies. These dependencies impose substantial
requirements on the system’s technical setup and operational maintenance.</p>
      <p>
        A fundamental limitation of the current approach is the absence of mechanisms for scaling in (i.e.,
reduction in parallelism) or scaling down (i.e., reallocation to less powerful hardware) during simulation
runtime. At present, the presented concept exclusively supports scale-up and scale-out strategies. The
conceptual and technical foundations required to enable dynamic scale-in and scale-down operations
are currently under investigation and represent a core objective of future research.
      </p>
    </sec>
    <sec id="sec-related">
      <title>10. Related Work</title>
      <p>
        Our work focuses on the Petri net simulator and editor known as Renew (Section 2.1). This tool provides
a ready-to-use simulation environment alongside a modular plugin system that facilitates extensibility.
Consequently, there is no need to develop a new simulator from scratch. It offers a comprehensive
feature set to support diverse workflows in Petri net research and applications [
        <xref ref-type="bibr" rid="ref10 ref9">9, 10</xref>
        ].
      </p>
      <p>
        The platform accommodates various Petri net formalisms and is designed for straightforward
expansion through its modular plugin architecture [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Its main net formalism, the Reference Net, extends
conventional colored Petri nets by incorporating nets-within-nets concepts combined with reference
semantics, thereby enabling the integration and execution of Java code [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. Additional supported
formalisms include P/T-nets with channels, as discussed by [
        <xref ref-type="bibr" rid="ref37">37</xref>
        ], and the recently introduced Distributed
P/T nets (DPTN) [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>
        Research by Moldt et al. [
        <xref ref-type="bibr" rid="ref38">38</xref>
        ] and Röwekamp et al. [
        <xref ref-type="bibr" rid="ref12 ref39 ref40 ref41 ref42 ref43 ref44">39, 40, 41, 42, 43, 12, 44</xref>
        ] concentrates on distributed
Reference Net simulations, with an emphasis on platform management. Their work integrates Mulan
agent concepts [
        <xref ref-type="bibr" rid="ref45">45</xref>
        ] and utilizes Spring Boot to facilitate initial experimental implementations. Although
these contributions provide essential groundwork, they fall short of fully exploiting distributed systems’
potential, thereby constraining their suitability for addressing complex, real-world application scenarios.
      </p>
      <p>As part of our project responsibilities, we oversee the complete deployment and orchestration of
the Renew simulator. This enables comprehensive testing of newly developed functionalities within a
cluster environment composed of multiple interconnected nodes, ensuring system reliability before
full-scale implementation.</p>
      <p>
        This contribution builds upon the work of Clasen et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], extending the simulators developed
therein by introducing a form of dynamic scalability through the proposed conceptual enhancements.
In contrast, Clasen et al. [
        <xref ref-type="bibr" rid="ref46">46</xref>
        ] concentrate not on scalability but rather on resilience. Their approach is
based on the principle of periodic state preservation, enabling simulators to maintain fallback states
in the event of failures. This mechanism allows a simulator to be redeployed on an alternative node
following, for instance, a node failure and subsequently restored to its prior operational state.
      </p>
    </sec>
    <sec id="sec-conclusion">
      <title>11. Conclusion</title>
      <sec id="sec-conclusion-1">
        <title>11.1. Summary</title>
        <p>After outlining the foundational concepts — including Renew, Distributed P/T Nets, Kubernetes,
scalability, as well as monitoring and alerting (Section 2) — we define the research problem addressed in
this work (Section 3). Specifically, we propose a dynamic scaling approach for the distributed simulation
of P/T nets within a Kubernetes-based cloud environment.</p>
        <p>The distributed system, detailed in Section 4, comprises a communication medium, monitoring
infrastructure, DyReS, and simulation components for distributed P/T nets. Subsequently, we present
the developed prototypes. One of these is the newly introduced Renew plugin NetExchange (Section 5),
which enables the exchange of P/T nets between simulators at runtime. Additionally, the new NetSplit
plugin (Section 6) facilitates the partitioning of P/T nets during simulation.</p>
        <p>Two further prototypes presented in this article are the utilization monitoring system (Section 7)
and the DyReS component (Section 8). The Kubernetes Metrics Server enables real-time node and pod
utilization monitoring by providing up-to-date metrics within seconds. DyReS leverages these metrics
to perform dynamic scaling of the simulators.</p>
        <p>Finally, in Section 9, we revisit the research objectives and questions, evaluate the strengths,
weaknesses, and limitations of the proposed concept, and situate it within the context of related work
(Section 10).</p>
      </sec>
      <sec id="sec-conclusion-2">
        <title>11.2. Future Work</title>
        <p>Further research includes the implementation of dynamic scale-in and scale-down mechanisms.
These aim to enable simulators or individual nodes to be shut down during simulation runtime when
their resources are no longer required.</p>
        <p>In this context, optimizing the NetSplit plugin is also a key objective. Specifically, improvements in
runtime efficiency and load-balanced output generation must be addressed.</p>
        <p>Moreover, the primary focus of this article was on demonstrating practical feasibility. Future research
should include comprehensive benchmarking of the system’s performance. Additionally, further
heuristics could be integrated into DyReS, allowing it to act in a more demand-driven manner and to
optimize resource utilization or system performance as needed.</p>
        <p>Finally, integration with the Petri Net Registry is planned. The Petri Net Registry provides highly
available cloud storage for net templates. This integration would enable the traceable storage of net
templates generated during scaling operations.</p>
      </sec>
    </sec>
    <sec id="sec-10">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used . . .</p>
      <p>• . . . Bing Translate in order to: Translate Text.
• . . . DeepL in order to: Translate Text.
• . . . DeeplWrite in oder to: Rephrasing.
• . . . ChatGPT in order to: Rephrasing, Sentence Polishing.
• . . . Mistals Le Chat in order to Sentence Polishing.</p>
      <p>• . . . Grammarly in order to: Grammar and spelling check, Repharsing.</p>
      <p>After using these tool(s)/service(s), the authors reviewed and edited the content as needed and take full
responsibility for the publication’s content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>R.</given-names>
            <surname>Budde</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kautz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kuhlenkamp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Züllighoven</surname>
          </string-name>
          , What is prototyping?,
          <source>Information Technology &amp; People</source>
          <volume>6</volume>
          (
          <year>1990</year>
          )
          <fpage>89</fpage>
          -
          <lpage>95</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>G.</given-names>
            <surname>Pomberger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Pree</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Stritzinger</surname>
          </string-name>
          ,
          <article-title>Methoden und Werkzeuge für das Prototyping und ihre Integration, Inform</article-title>
          .,
          <source>Forsch. Entwickl</source>
          .
          <volume>7</volume>
          (
          <year>1992</year>
          )
          <fpage>49</fpage>
          -
          <lpage>61</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>T.</given-names>
            <surname>Wilde</surname>
          </string-name>
          , T. Hess,
          <source>Forschungsmethoden der Wirtschaftsinformatik, Wirtschaftsinformatik</source>
          <volume>4</volume>
          (
          <year>2007</year>
          )
          <fpage>280</fpage>
          -
          <lpage>287</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>O.</given-names>
            <surname>Kummer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Wienberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Duvigneau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Cabac</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Haustermann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Mosteller</surname>
          </string-name>
          , Renew - the
          <source>Reference Net Workshop</source>
          ,
          <year>2023</year>
          . URL: http://www.renew.de/, release 4.1.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>L.</given-names>
            <surname>Clasen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bartelt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Stahl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Moldt</surname>
          </string-name>
          , Distributed P/
          <source>T Net Simulation Prototypes Based on Event Streaming</source>
          , in: M.
          <string-name>
            <surname>Köhler-Bußmeier</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Moldt</surname>
          </string-name>
          , H. Rölke (Eds.),
          <source>Proceedings of the International Workshop on Petri Nets and Software Engineering</source>
          <year>2024</year>
          co
          <article-title>-located with the 45th International Conference on Application and Theory of Petri Nets and Concurrency (PETRI NETS</article-title>
          <year>2024</year>
          ), June 24 - 25,
          <year>2024</year>
          , Geneva, Switzerland, volume
          <volume>3730</volume>
          <source>of CEUR Workshop Proceedings, CEUR-WS.org</source>
          ,
          <year>2024</year>
          , pp.
          <fpage>192</fpage>
          -
          <lpage>216</lpage>
          . URL: https://ceur-ws.org/Vol-3730.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Oracle</surname>
          </string-name>
          ,
          <source>Java SE 17 Documentation</source>
          ,
          <year>2025</year>
          . URL: https://docs.oracle.com/en/java/javase/17/, accessed: March 24,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Gradle</surname>
          </string-name>
          ,
          <source>Gradle User Guide Version 8.4</source>
          ,
          <year>2023</year>
          . URL: https://docs.gradle.org/8.4/userguide/userguide.html, accessed: March 24,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M.</given-names>
            <surname>Duvigneau</surname>
          </string-name>
          ,
          <article-title>Konzeptionelle Modellierung von Plugin-Systemen mit Petrinetzen</article-title>
          , volume
          <volume>4</volume>
          <source>of Agent Technology - Theory and Applications</source>
          , Logos Verlag, Berlin,
          <year>2010</year>
          . URL: http://www. logos-verlag.de/cgi-bin/engbuchmid?isbn=2561&amp;lng=eng&amp;id=.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>L.</given-names>
            <surname>Clasen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Moldt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hansson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Willrodt</surname>
          </string-name>
          , L. Voß,
          <article-title>Enhancement of Renew to Version 4.0 using JPMS</article-title>
          , in: M.
          <string-name>
            <surname>Köhler-Bußmeier</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Moldt</surname>
          </string-name>
          , H. Rölke (Eds.),
          <source>Proceedings of the International Workshop on Petri Nets and Software Engineering</source>
          <year>2022</year>
          co
          <article-title>-located with the 43rd International Conference on Application and Theory of Petri Nets and Concurrency (PETRI NETS</article-title>
          <year>2022</year>
          ), Bergen, Norway, June 20th,
          <year>2022</year>
          , volume
          <volume>3170</volume>
          <source>of CEUR Workshop Proceedings, CEUR-WS.org</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>165</fpage>
          -
          <lpage>176</lpage>
          . URL: https://ceur-ws.org/Vol-3170.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>D.</given-names>
            <surname>Moldt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Johnsen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Streckenbach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Clasen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Haustermann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Heinze</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hansson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Feldmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Ihlenfeldt</surname>
          </string-name>
          , RENEW: Modularized Architecture and New Features, in: L.
          <string-name>
            <surname>Gomes</surname>
          </string-name>
          , R. Lorenz (Eds.),
          <source>Application and Theory of Petri Nets and Concurrency - 44th International Conference, PETRI NETS</source>
          <year>2023</year>
          , Lisbon, Portugal, June 25-30,
          <year>2023</year>
          , Proceedings, volume
          <volume>13929</volume>
          of Lecture Notes in Computer Science, Springer Nature Switzerland AG, Cham, Switzerland,
          <year>2023</year>
          , pp.
          <fpage>217</fpage>
          -
          <lpage>228</lpage>
          . URL: https://doi.org/10.1007/978-3-031-33620-1_12.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>O.</given-names>
            <surname>Kummer</surname>
          </string-name>
          , Referenznetze, Logos Verlag, Berlin,
          <year>2002</year>
          . URL: http://www.logos-verlag.de/cgi-bin/engbuchmid?isbn=0035&amp;lng=eng&amp;id=.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>J. H.</given-names>
            <surname>Röwekamp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Taube</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mohr</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Moldt</surname>
          </string-name>
          , Cloud Native Simulation of Reference Nets, in: M.
          <string-name>
            <surname>Köhler-Bußmeier</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          <string-name>
            <surname>Kindler</surname>
          </string-name>
          , H. Rölke (Eds.),
          <source>Proceedings of the International Workshop on Petri Nets and Software Engineering</source>
          <year>2021</year>
          co
          <article-title>-located with the 42nd International Conference on Application and Theory of Petri Nets and Concurrency (PETRI NETS</article-title>
          <year>2021</year>
          ), Paris, France, June 25th,
          <year>2021</year>
          (due to COVID-19
          <source>: virtual conference)</source>
          , volume
          <volume>2907</volume>
          <source>of CEUR Workshop Proceedings, CEUR-WS.org</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>85</fpage>
          -
          <lpage>104</lpage>
          . URL: http://ceur-ws.org/Vol-2907.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>R.</given-names>
            <surname>Johnson</surname>
          </string-name>
          , J. Hoeller,
          <string-name>
            <given-names>K.</given-names>
            <surname>Donald</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Sampaleanu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Harrop</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Risberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Arendsen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Davison</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Kopylenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Pollack</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Templier</surname>
          </string-name>
          , E. Vervaet,
          <string-name>
            <given-names>P.</given-names>
            <surname>Tung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Hale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Colyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Leau</surname>
          </string-name>
          , M. Fisher, S. Brannen,
          <string-name>
            <given-names>R.</given-names>
            <surname>Laddad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Poutsma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Beams</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Abedrabbo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Clement</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Syer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Gierke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Stoyanchev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Webb</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Winch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Clozel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Nicoll</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Deleuze</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bryant</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Paluch</surname>
          </string-name>
          , Spring Framework Reference Documentation, https://docs.spring.io/spring-framework/reference/index.html,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>J.</given-names>
            <surname>Kreps</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Narkhede</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Rao</surname>
          </string-name>
          , et al.,
          <article-title>Kafka: A distributed messaging system for log processing</article-title>
          ,
          <source>in: NetDB 2011: 6th Workshop on Networking meets Databases</source>
          , volume
          <volume>11</volume>
          , Athens, Greece,
          <year>2011</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>7</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>A. S.</given-names>
            <surname>Foundation</surname>
          </string-name>
          , Apache Kafka Documentation,
          <year>2025</year>
          . URL: https://kafka.apache.org/documentation/, accessed: 2025-01-21.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>N.</given-names>
            <surname>Garg</surname>
          </string-name>
          , Apache Kafka, Packt Publishing Birmingham, UK,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>A.</given-names>
            <surname>Baur</surname>
          </string-name>
          ,
          <article-title>Packaging of kubernetes applications</article-title>
          ,
          <source>in: Proceedings of the 2020 OMI Seminars (PROMIS</source>
          <year>2020</year>
          ), volume
          <volume>1</volume>
          ,
          <string-name>
            <surname>Universität</surname>
            <given-names>Ulm</given-names>
          </string-name>
          ,
          <year>2021</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>1</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>E.</given-names>
            <surname>Casalicchio</surname>
          </string-name>
          ,
          <article-title>Container orchestration: A survey, Systems Modeling: Methodologies and Tools (</article-title>
          <year>2019</year>
          )
          <fpage>221</fpage>
          -
          <lpage>235</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Pan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Brasileiro</surname>
          </string-name>
          , G. Jayaputera,
          <string-name>
            <given-names>R.</given-names>
            <surname>Sinnott</surname>
          </string-name>
          ,
          <article-title>A performance comparison of cloud-based container orchestration tools</article-title>
          ,
          <source>in: 2019 IEEE International Conference on Big Knowledge (ICBK)</source>
          , IEEE,
          <year>2019</year>
          , pp.
          <fpage>191</fpage>
          -
          <lpage>198</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <surname>A. De Cerqueira Leite Duboc</surname>
          </string-name>
          ,
          <article-title>A framework for the characterization and analysis of software systems scalability</article-title>
          ,
          <source>Ph.D. thesis</source>
          , UCL (University College London),
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>D. R.</given-names>
            <surname>Law</surname>
          </string-name>
          ,
          <article-title>Scalable means more than more: a unifying definition of simulation scalability</article-title>
          ,
          <source>in: 1998 Winter Simulation Conference. Proceedings (Cat. No. 98CH36274)</source>
          , volume
          <volume>1</volume>
          , IEEE,
          <year>1998</year>
          , pp.
          <fpage>781</fpage>
          -
          <lpage>788</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>M. R.</given-names>
            <surname>López</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Spillner</surname>
          </string-name>
          ,
          <article-title>Towards quantifiable boundaries for elastic horizontal scaling of microservices</article-title>
          ,
          <source>in: Companion Proceedings of the 10th International Conference on Utility and Cloud Computing</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>35</fpage>
          -
          <lpage>40</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>Q.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y. C.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. Y.</given-names>
            <surname>Zomaya</surname>
          </string-name>
          ,
          <article-title>The limit of horizontal scaling in public clouds</article-title>
          ,
          <source>ACM Transactions on Modeling and Performance Evaluation of Computing Systems (TOMPECS) 5</source>
          (
          <year>2020</year>
          )
          <fpage>1</fpage>
          -
          <lpage>22</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>S.</given-names>
            <surname>Boyd-Wickizer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. T.</given-names>
            <surname>Clements</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Mao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Pesterev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. F.</given-names>
            <surname>Kaashoek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Morris</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Zeldovich</surname>
          </string-name>
          ,
          <article-title>An analysis of Linux scalability to many cores</article-title>
          ,
          <source>in: 9th USENIX Symposium on Operating Systems Design and Implementation (OSDI 10)</source>
          ,
          <year>2010</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>16</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>B.</given-names>
            <surname>Ahmad</surname>
          </string-name>
          ,
          <article-title>Coordinating vertical and horizontal scaling for achieving differentiated QoS</article-title>
          ,
          <source>Master's thesis</source>
          , University of Oslo (UiO),
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          Prometheus Authors, Prometheus Documentation,
          <year>2025</year>
          . URL: https://prometheus.io/docs/introduction/overview/, accessed: March 26,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bastos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Araújo</surname>
          </string-name>
          ,
          <source>Hands-on infrastructure monitoring with Prometheus: implement and scale queries, dashboards, and alerting across machines and containers</source>
          , Packt Publishing,
          <year>2019</year>
          . URL: https://portal.igpublish.com/iglibrary/search/PACKT0005309.html.
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          Kubernetes Community
          , metrics, https://github.com/kubernetes/metrics,
          <year>2025</year>
          . Accessed: March 26,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29] Kubernetes Instrumentation Special Interest Group, metrics-server, https://kubernetes-sigs.github.io/metrics-server/,
          <year>2025</year>
          . Accessed: March 26,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          The Kubernetes Authors
          , Kubernetes,
          <year>2025</year>
          . URL: https://kubernetes.io/docs.
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>D. R.</given-names>
            <surname>Karger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <article-title>A new approach to the minimum cut problem</article-title>
          ,
          <source>Journal of the ACM (JACM) 43</source>
          (
          <year>1996</year>
          )
          <fpage>601</fpage>
          -
          <lpage>640</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32] JUnit, JUnit 5 User Guide,
          <year>2025</year>
          . URL: https://junit.org/junit5/docs/current/user-guide/,
          accessed: March 24,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          Mockito,
          <source>Mockito Core 4.8.0 Javadoc</source>
          ,
          <year>2022</year>
          . URL: https://javadoc.io/doc/org.mockito/mockito-core/4.8.0/index.html, accessed: March 24,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          Git Community, Git Documentation,
          <year>2025</year>
          . URL: https://git-scm.com/doc, accessed: March 24,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          GitLab Inc.,
          <source>GitLab Documentation</source>
          ,
          <year>2025</year>
          . URL: https://docs.gitlab.com/,
          accessed: March 24,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          Kubernetes Community, client-java-api, https://mvnrepository.com/artifact/io.kubernetes/client-java-api/22.0.0,
          <year>2023</year>
          . Version 22.0.0.
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>L.</given-names>
            <surname>Voß</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Willrodt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Moldt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Haustermann</surname>
          </string-name>
          ,
          <article-title>Between expressiveness and verifiability: P/T-nets with synchronous channels and modular structure</article-title>
          , in: M. Köhler-Bußmeier, D. Moldt, H. Rölke (Eds.),
          <source>Proceedings of the International Workshop on Petri Nets and Software Engineering 2022 co-located with the 43rd International Conference on Application and Theory of Petri Nets and Concurrency (PETRI NETS 2022), Bergen, Norway, June 20th, 2022</source>
          , volume
          <volume>3170</volume>
          of CEUR Workshop Proceedings, CEUR-WS.org,
          <year>2022</year>
          , pp.
          <fpage>40</fpage>
          -
          <lpage>59</lpage>
          . URL: https://ceur-ws.org/Vol-3170.
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          [38]
          <string-name>
            <given-names>D.</given-names>
            <surname>Moldt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. H.</given-names>
            <surname>Röwekamp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Simon</surname>
          </string-name>
          ,
          <article-title>A Simple Prototype of Distributed Execution of Reference Nets Based on Virtual Machines</article-title>
          , in: R. Bergenthum, E. Kindler (Eds.),
          <source>Algorithms and Tools for Petri Nets. Proceedings of the Workshop AWPN 2017, Kgs. Lyngby, Denmark, October 19-20, 2017, DTU Compute Technical Report 2017-06</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>51</fpage>
          -
          <lpage>57</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          [39]
          <string-name>
            <given-names>J. H.</given-names>
            <surname>Röwekamp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Moldt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Feldmann</surname>
          </string-name>
          ,
          <article-title>Investigation of Containerizing Distributed Petri Net Simulations</article-title>
          , in: D. Moldt, E. Kindler, H. Rölke (Eds.),
          <source>Petri Nets and Software Engineering. International Workshop, PNSE'18, Bratislava, Slovakia, June 25-26, 2018. Proceedings</source>
          , volume
          <volume>2138</volume>
          of CEUR Workshop Proceedings, CEUR-WS.org,
          <year>2018</year>
          , pp.
          <fpage>133</fpage>
          -
          <lpage>142</lpage>
          . URL: http://ceur-ws.org/Vol-2138/.
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          [40]
          <string-name>
            <given-names>J. H.</given-names>
            <surname>Röwekamp</surname>
          </string-name>
          ,
          <article-title>Investigating the Java Spring Framework to Simulate Reference Nets with Renew</article-title>
          , in: R. Lorenz, J. Metzger (Eds.),
          <source>Algorithms and Tools for Petri Nets</source>
          , number 2018-02 in Reports / Technische Berichte der Fakultät für Angewandte Informatik der Universität Augsburg,
          <year>2018</year>
          , pp.
          <fpage>41</fpage>
          -
          <lpage>46</lpage>
          . URL: https://opus.bibliothek.uni-augsburg.de/opus4/41861.
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          [41]
          <string-name>
            <given-names>J. H.</given-names>
            <surname>Röwekamp</surname>
          </string-name>
          , D. Moldt,
          <article-title>RenewKube: Reference Net Simulation Scaling with Renew and Kubernetes</article-title>
          , in: S. Donatelli, S. Haar (Eds.),
          <source>Application and Theory of Petri Nets and Concurrency - 40th International Conference, PETRI NETS 2019, Aachen, Germany, June 23-28, 2019, Proceedings</source>
          , volume
          <volume>11522</volume>
          of Lecture Notes in Computer Science, Springer,
          <year>2019</year>
          , pp.
          <fpage>69</fpage>
          -
          <lpage>79</lpage>
          . URL: https://doi.org/10.1007/978-3-030-21571-2_4.
        </mixed-citation>
      </ref>
      <ref id="ref42">
        <mixed-citation>
          [42]
          <string-name>
            <given-names>J. H.</given-names>
            <surname>Röwekamp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Feldmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Moldt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Simon</surname>
          </string-name>
          ,
          <article-title>Simulating Place/Transition Nets by a Distributed, Web Based, Stateless Service</article-title>
          , in: D. Moldt, E. Kindler, M. Wimmer (Eds.),
          <source>Petri Nets and Software Engineering. International Workshop, PNSE'19, Aachen, Germany, June 24, 2019. Proceedings</source>
          , volume
          <volume>2424</volume>
          of CEUR Workshop Proceedings, CEUR-WS.org,
          <year>2019</year>
          , pp.
          <fpage>163</fpage>
          -
          <lpage>164</lpage>
          . URL: http://CEUR-WS.org/Vol-2424.
        </mixed-citation>
      </ref>
      <ref id="ref43">
        <mixed-citation>
          [43]
          <string-name>
            <given-names>J. H.</given-names>
            <surname>Röwekamp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Buchholz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Moldt</surname>
          </string-name>
          ,
          <article-title>Petri Net Sagas</article-title>
          , in: M. Köhler-Bußmeier, E. Kindler, H. Rölke (Eds.),
          <source>Proceedings of the International Workshop on Petri Nets and Software Engineering 2021 co-located with the 42nd International Conference on Application and Theory of Petri Nets and Concurrency (PETRI NETS 2021), Paris, France, June 25th, 2021 (due to COVID-19: virtual conference)</source>
          , volume
          <volume>2907</volume>
          of CEUR Workshop Proceedings, CEUR-WS.org,
          <year>2021</year>
          , pp.
          <fpage>65</fpage>
          -
          <lpage>84</lpage>
          . URL: http://ceur-ws.org/Vol-2907.
        </mixed-citation>
      </ref>
      <ref id="ref44">
        <mixed-citation>
          [44]
          <string-name>
            <given-names>J. H.</given-names>
            <surname>Röwekamp</surname>
          </string-name>
          ,
          <article-title>Skalierung von nebenläufigen und verteilten Simulationssystemen für interagierende Agenten</article-title>
          ,
          <source>Ph.D. thesis</source>
          , University of Hamburg, Department of Informatics, Vogt-Kölln Str. 30,
          D-22527 Hamburg,
          <year>2023</year>
          . URL: https://ediss.sub.uni-hamburg.de/handle/ediss/10040.
        </mixed-citation>
      </ref>
      <ref id="ref45">
        <mixed-citation>
          [45]
          <string-name>
            <given-names>H.</given-names>
            <surname>Rölke</surname>
          </string-name>
          ,
          <source>Modellierung von Agenten und Multiagentensystemen - Grundlagen und Anwendungen</source>
          , volume
          <volume>2</volume>
          <source>of Agent Technology - Theory and Applications</source>
          , Logos Verlag, Berlin,
          <year>2004</year>
          . URL: http://logos-verlag.de/cgi-bin/engbuchmid?isbn=0768&amp;lng=eng&amp;id=.
        </mixed-citation>
      </ref>
      <ref id="ref46">
        <mixed-citation>
          [46]
          <string-name>
            <given-names>L.</given-names>
            <surname>Clasen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Leonhardt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Wichelmann</surname>
          </string-name>
          ,
          <article-title>Resilient Distributed P/T Net Simulators</article-title>
          , in: M. Köhler-Bußmeier, D. Moldt, H. Rölke (Eds.),
          <source>Proceedings of the International Workshop on Petri Nets and Software Engineering 2025 co-located with the 46th International Conference on Application and Theory of Petri Nets and Concurrency (PETRI NETS 2025), June 22-27, 2025, Paris, France</source>
          , CEUR Workshop Proceedings, CEUR-WS.org,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>