<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Optimizing network slice placement using Deep Reinforcement Learning (DRL) on a real platform operated by Open Source MANO (OSM)</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Alexandre</forename><surname>Sabbadin</surname></persName>
							<email>alexandre.sabbadin@laas.fr</email>
							<affiliation key="aff0">
								<orgName type="laboratory">LAAS-CNRS</orgName>
								<orgName type="institution" key="instit1">Université de Toulouse</orgName>
								<orgName type="institution" key="instit2">CNRS</orgName>
								<orgName type="institution" key="instit3">UPS</orgName>
								<address>
									<settlement>Toulouse</settlement>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Abdel</forename><forename type="middle">Kader</forename><surname>Chabi</surname></persName>
							<email>akchabisik@laas.fr</email>
							<affiliation key="aff0">
								<orgName type="laboratory">LAAS-CNRS</orgName>
								<orgName type="institution" key="instit1">Université de Toulouse</orgName>
								<orgName type="institution" key="instit2">CNRS</orgName>
								<orgName type="institution" key="instit3">UPS</orgName>
								<address>
									<settlement>Toulouse</settlement>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Sika</forename><surname>Boni</surname></persName>
							<affiliation key="aff0">
								<orgName type="laboratory">LAAS-CNRS</orgName>
								<orgName type="institution" key="instit1">Université de Toulouse</orgName>
								<orgName type="institution" key="instit2">CNRS</orgName>
								<orgName type="institution" key="instit3">UPS</orgName>
								<address>
									<settlement>Toulouse</settlement>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Hassan</forename><surname>Hassan</surname></persName>
							<email>hassan.hassan@laas.fr</email>
							<affiliation key="aff0">
								<orgName type="laboratory">LAAS-CNRS</orgName>
								<orgName type="institution" key="instit1">Université de Toulouse</orgName>
								<orgName type="institution" key="instit2">CNRS</orgName>
								<orgName type="institution" key="instit3">UPS</orgName>
								<address>
									<settlement>Toulouse</settlement>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Khalil</forename><surname>Drira</surname></persName>
							<email>khalil@laas.fr</email>
							<affiliation key="aff0">
								<orgName type="laboratory">LAAS-CNRS</orgName>
								<orgName type="institution" key="instit1">Université de Toulouse</orgName>
								<orgName type="institution" key="instit2">CNRS</orgName>
								<orgName type="institution" key="instit3">UPS</orgName>
								<address>
									<settlement>Toulouse</settlement>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Optimizing network slice placement using Deep Reinforcement Learning (DRL) on a real platform operated by Open Source MANO (OSM)</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">BA6CCF06E12008D65DB746933F12B084</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T18:39+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Deep Reinforcement Learning</term>
					<term>Slicing</term>
					<term>IoT systems</term>
					<term>Open Source MANO</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Optimizing network slice placement in 5G networks requires efficient algorithms. Deep Reinforcement Learning (DRL) has been successfully used to solve this problem. However, few works have tackled the deployment of these algorithms in a real environment. In this paper we present a DRL-based algorithm aiming to optimally place network slices in IoT networks. We evaluate the performance of this algorithm in a real network deployed on the Grid'5000 platform and operated by an Open Source MANO (OSM) middleware. The simulation results show good convergence of the algorithm, and the deployment in the real environment gives us insights about a potential slicing architecture using OSM, the processing of a DRL agent in real conditions, and limitations due to the substantial instantiation times of Virtual Network Functions (VNFs).</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>In the evolving landscape of modern telecommunications, the concept of network slicing has emerged as a groundbreaking paradigm that promises to revolutionize how we manage and optimize network resources <ref type="bibr" target="#b0">[1]</ref>. Network slicing allows network operators to divide their physical infrastructure into virtualized, dedicated, and isolated networks. Each slice (i.e. virtualized network) is tailored to specific service requirements, such as low latency for augmented reality services, massive bandwidth for video streaming, or ultra-reliability for autonomous cars. As the deployment of network slices becomes more complex, it brings a significant challenge: the slicing optimization problem. This problem revolves around efficiently allocating network resources, such as compute and storage, across various slices while ensuring that each slice meets its specific quality-of-service (QoS) requirements. The problem becomes even more challenging in dynamic, real-time environments where network conditions fluctuate, users make requests, and resources need to be continuously adjusted. To address the slicing optimization problem, innovative approaches have been introduced using Deep Reinforcement Learning (DRL) algorithms. DRL algorithms have demonstrated remarkable learning and adaptation capabilities in complex environments, making them a promising tool for network operators who wish to optimize resource allocation in a dynamic slicing context. We decided to evaluate the performance of the DRL algorithm in both simulation and a real-world environment (using the Grid'5000 infrastructure <ref type="bibr" target="#b1">[2]</ref>). This choice is driven by the need to ensure the reliability and practical applicability of our research. 
Simulations offer a controlled setting to fine-tune algorithms, thanks to short run times, but they often simplify the complexity of real-world networks. On the other hand, deploying DRL algorithms on the Grid'5000 infrastructure allows us to confront the unpredictability, noise, and dynamic nature of actual network environments. By undertaking this comparative analysis, we seek to validate the algorithm's performance beyond theoretical expectations and lay a robust foundation for its practical deployment.</p><p>The rest of the paper is organized as follows: in Section 2, we present related work on the deployment of network slicing management systems; we then introduce the architecture used in this study in Section 3. A brief introduction to DRL algorithms is given in Section 4 and, in Section 5, we present the results of our evaluation in the simulation and real-world environments. Finally, we conclude by giving some directions for our future research.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related work</head><p>Most network slicing implementations in practical scenarios primarily target 5G networks. One such example is the 5GCity initiative, outlined in <ref type="bibr" target="#b2">[3]</ref>, which aims to provide 5G services to both citizens and businesses in a smart city environment. Another example is presented in <ref type="bibr" target="#b3">[4]</ref>, where the authors developed a 4G/5G testbed to explore network slicing capabilities. Network slicing is based on two principles, namely Network Function Virtualization (NFV) and Software-Defined Networking (SDN). These principles play a central role in enabling the dynamic and efficient deployment of network slices.</p><p>The NFV architecture of the European Telecommunications Standards Institute (ETSI) provides a standardized approach to virtualizing network functions, enabling network operators to run network functions as software on commodity hardware. This architecture consists of three layers. First, the virtualized infrastructure (NFVI): this layer provides the infrastructure that hosts virtualized network functions. It can include servers, storage devices, network devices, and other hardware and software elements necessary for network function virtualization. Second, the ETSI MANO (Management and Orchestration) standard controls the creation, deployment, and management of VNFs on the virtualized infrastructure; it encompasses resource management, service orchestration, monitoring, and notification functions. In <ref type="bibr" target="#b4">[5]</ref>, two MANO solutions are examined alongside others, namely Open Source MANO (OSM) <ref type="bibr" target="#b5">[6]</ref> and Open Network Automation Platform (ONAP). 
The ETSI NFV standard relies on Virtual Infrastructure Managers (VIMs), two of which are evaluated in <ref type="bibr" target="#b6">[7]</ref>: OpenStack <ref type="bibr" target="#b7">[8]</ref> and OpenVIM. Additionally, OSM is compatible with major cloud providers such as Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform. Multiple VIMs can be employed simultaneously, as exemplified in <ref type="bibr" target="#b8">[9]</ref>. Finally, Virtualized Network Functions (VNFs) provide the network functions that can be deployed on the virtual infrastructure. Within the context of 5G network slicing, VNFs based on OpenAirInterface (OAI), as discussed in <ref type="bibr" target="#b9">[10]</ref> and <ref type="bibr" target="#b3">[4]</ref>, serve to emulate a 5G network.</p><p>SDN complements NFV by offering centralized control and programmability over network resources, thereby allowing dynamic allocation of bandwidth, routing, and other network elements to optimize the performance of individual slices in real time. The use of an SDN controller is discussed in <ref type="bibr" target="#b8">[9]</ref>, where the authors present a comprehensive architecture and experimental validation. Collectively, NFV and SDN furnish the agility and flexibility necessary to create, manage, and adapt network slices effectively. This paper's primary focus is the implementation of a DRL algorithm for optimizing VNF placement within a real infrastructure managed by OSM. Our cloud infrastructure is based on OpenStack, specifically MicroStack <ref type="bibr" target="#b10">[11]</ref>. It is important to note that the implementation of dynamic routing using an SDN controller is a topic left for future work. Also, our work deals exclusively with resource allocation; as such, our VNFs do not perform real network function operations.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Architecture concepts</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Open Source MANO</head><p>Open Source MANO (OSM) <ref type="bibr" target="#b5">[6]</ref> is an open-source project that delivers a comprehensive network management and virtualization service platform for telecommunications networks, based on the principles of Network Function Virtualization (NFV). OSM offers a software infrastructure for creating, orchestrating, managing, and supervising virtualized network services. OSM's primary objective is to ease the adoption of the NFV architecture among telecommunications service providers by providing an open and flexible platform aligned with industry standards and compatible with various vendors' equipment. This initiative is sustained by a community of developers and contributors representing various organizations and companies, operating under ETSI. OSM has gained widespread adoption within the telecommunications industry for network function virtualization and cloud service management over the past few years.</p><p>OSM provides a comprehensive solution for implementing network slicing: NetSlices. In OSM, a slice is instantiated by defining a Network Slice Template (NST), which is divided into netslice-subnets and netslice-vlds, which can be duplicated as depicted in Figure <ref type="figure" target="#fig_0">1</ref>. The former correspond to the network services (NSs) within the slice, while the latter represent the virtual links (VLs) that interconnect them. It is worth noting that OSM employs a management network for the deployment and management of slice instances. The aforementioned NSs serve as the network services to be deployed within our infrastructure. They act as wrappers for our VNFs and establish connection points for linking them through VLs. Not all NSs have the same number of connection points: those in the "middle" of the slice have an extra connection point. 
Within an NS, there are one or more VNFs, each defining the network function used, such as a firewall or a router. Finally, at the level of each VNF, the characteristics of one or more Virtual Deployment Units (VDUs) must be specified to define the properties of the virtual machine (image, number of CPUs, RAM, disk space, etc.). All these configurations are done using YAML-format descriptors, allowing connections between the various layers mentioned earlier through an identification and referencing system. To simplify the creation of slices, we developed generic template descriptor files<ref type="foot" target="#foot_0">1</ref>, which can be customized for diverse cases. It is important to emphasize that all the descriptors that compose a slice must be determined before initiating the instantiation request to OSM. As a result, our algorithms need to consider what we refer to as "one-shot" placement, where the placement of all VNFs must be determined simultaneously. </p></div>
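The layered descriptor structure described above can be sketched in code. The field names below are a simplified, hypothetical stand-in for OSM's actual YAML schema (they are not the SOL006 field names); the sketch only illustrates how a one-shot slice template, chaining all VNFs in advance, is assembled before the instantiation request:

```python
def make_vnfd(vnf_id, vcpu, ram_gb, disk_gb):
    """Build a simplified VNF descriptor with a single VDU.

    Field names are illustrative, not the exact OSM/SOL006 schema.
    """
    return {
        "id": vnf_id,
        "vdu": [{
            "id": f"{vnf_id}-vdu",
            "vcpu-count": vcpu,    # number of virtual CPUs
            "memory-gb": ram_gb,   # RAM in GB
            "storage-gb": disk_gb, # disk space in GB
        }],
    }

def make_nst(slice_id, vnfds):
    """Wrap each VNFD in an NS and chain consecutive NSs with virtual links."""
    subnets = [{"id": f"ns-{v['id']}", "vnfd-ref": v["id"]} for v in vnfds]
    vlds = [{"id": f"vl-{i}",
             "connects": [subnets[i]["id"], subnets[i + 1]["id"]]}
            for i in range(len(subnets) - 1)]
    return {"id": slice_id, "netslice-subnet": subnets, "netslice-vld": vlds}

# "one-shot" placement: the full template exists before any instantiation call
nst = make_nst("slice-0", [make_vnfd(f"vnf-{k}", 2, 4.0, 10.0) for k in range(3)])
```

In practice, such a dictionary would be serialized to YAML and passed to OSM; here it only shows the NST / NS / VNF / VDU layering and the referencing between layers.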
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Grid'5000</head><p>Grid'5000 <ref type="bibr" target="#b1">[2]</ref> is a dedicated computer science research infrastructure designed for large-scale experimentation and validation of technologies and applications in the fields of Cloud computing, High Performance Computing (HPC), Big Data, and Artificial Intelligence (AI). It consists of a network of high-performance computing clusters distributed across nine sites in France and Luxembourg: Grenoble, Lille, Luxembourg, Lyon, Nancy, Nantes, Rennes, Sophia Antipolis, and Toulouse. Each cluster consists of interconnected compute nodes, storage resources, and high-speed networking, providing a highly configurable and isolated test environment for researchers. Grid'5000 facilitates large-scale experiments on distributed computing and storage infrastructures, allowing the assessment of the performance of new architectures, algorithms, applications, scheduling policies, and more. Users can access Grid'5000 through a command-line interface, an API, or dedicated tools. Grid'5000 is widely used within the computer science research community in France and internationally.</p><p>To access Grid'5000 resources, users need to make a reservation from a frontend server. Once access to these resources is granted, users have superuser privileges on the reserved machines. This means they can install whatever they need for their experiments, as the machines are automatically reinstalled at the end of the reservation period.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.">Proposed architecture</head><p>We propose an architecture to address resource allocation challenges in the context of NFV and cloud computing environments. In this architecture, we define key terms as follows:</p><p>• Slice: A slice represents a sequential chain of VNFs, each with specific CPU, RAM, and storage requirements. • Iteration: An iteration corresponds to the instantiation of a single slice, with VNFs initialized with randomly generated resource values. • Episode: An episode is a loop of iterations that ends when a slice cannot be instantiated.</p><p>Essentially, it answers the question: "How many slices can we create?" • Test Environment: We conduct tests on both a simulated and a real environment, using the same sequence of slices to ensure comparability across environments. Each slice maintains a consistent number of VNFs; this may change in future work. • VIM: In our architecture, each VIM corresponds to a distinct datacenter. All VIMs must be reachable from a central node which hosts OSM.</p><p>We employ MicroStack <ref type="bibr" target="#b10">[11]</ref>, which offers a single- or multi-node OpenStack <ref type="bibr" target="#b7">[8]</ref> deployment. While initially designed for developers to prototype and test, MicroStack is also suitable for edge computing, IoT applications, and appliances. This technology packages all OpenStack services and supporting libraries into a single, easily installable, upgradable, or removable package, simplifying deployment. In our specific use case, each VIM corresponds to a single machine equipped with a single-node MicroStack installation. However, MicroStack also supports multi-node clustering configurations. In this multi-node case, the VIM's resources are the sum of all nodes' resources (in CPU, RAM, and storage). 
We developed a collection of Python scripts to facilitate communication between an environment (either simulated or real) and an agent. In the real environment, we employ the osmclient library to establish communication with an OSM server instance hosted on Grid'5000.</p><p>Our reinforcement learning (RL) model operates within a dynamic environment characterized by the allocation of VIM resources to satisfy the requirements of different network slices. Slice requirements are randomly generated by the Values generator component in Figure <ref type="figure" target="#fig_2">2</ref> at each iteration. An observation is generated at the beginning of each iteration. This observation combines the available resources within the VIMs with the slice resource requirements, providing the agent with a comprehensive view of the current network. With this observation, the agent's objective is to decide where to instantiate a VNF. The agent uses its learned policy (i.e. behaviour) to make a decision. Our choice of an agent is detailed in Section 4. The environment responds to the agent's decision by applying the chosen action. In other words, it initiates the instantiation of the network slice, but only if the chosen VIM has the necessary resources available. Then, a new observation, reflecting the new state of the network, is generated alongside a reward for the agent. The reward is based on the quality of the agent's action choice, and its purpose is to improve the agent's policy.</p><p>The major difference between the simulated and real implementations lies in the environment. In our real environment, displayed in Figure <ref type="figure" target="#fig_2">2a</ref>, OSM is responsible for orchestrating the instantiation of VNFs and VLs, using the OpenStack (MicroStack) API to create Virtual Machines (VMs) and networks based on the descriptors mentioned in Section 3.1. For this purpose, descriptors are generated from the slice resource requirements and the placement decisions. 
In our simulation environment, the resources are "emulated" within the Mock Environment shown in Figure <ref type="figure" target="#fig_2">2b</ref>. A slice placement amounts to subtracting the required resources from the available resources.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Deep Reinforcement Learning</head><p>DRL establishes an interaction between an agent equipped with neural networks and an environment. The latter is characterized by states whose transitions occur through actions. In the Network Slice Placement problem, the physical infrastructure (comprising the VIMs) and the slice placement requests are present or received within the environment, and the number of possible actions is equal to the number of VIMs where VNFs can be placed. The various pieces of information exchanged between the agent and the environment are as follows:</p><p>• State: When a slice placement request is received by the environment, a real-time description (i.e., state or observation) of the physical infrastructure's VIMs and the elements of the request (VNFs) is transmitted to the agent. We denote our set of VIMs by 𝒩 and our set of VNFs by ℱ. The description of VIMs (respectively VNFs) includes the available (respectively required) CPU, RAM, and storage space 𝑐 𝑟 𝑖 , ∀𝑖 ∈ 𝒩 (respectively 𝜈 𝑟 𝑘 , ∀𝑘 ∈ ℱ).</p><p>• Action: The agent takes the observation as input and outputs a VIM identifier (action) it believes to be the most suitable for placing the VNF currently under processing. This action is sent back to the environment for execution and evaluation of its optimality. It is important to note that in DRL, only the environment executes the actions and has the ability to assess their optimality. This assessment is reflected in a value provided to the agent to reward or penalize it.</p><p>• Reward: The reward function aims to incentivize the agent to improve its future actions.</p><p>For each VNF placement, there is an associated value, i.e., a reward calculated by the environment. The higher the reward, the better the placement suggested by the agent. 
The agent's objective is precisely to maximize the cumulative sum of the rewards it receives from the placements. In this paper, we have introduced and employed a reward function for VNF 𝑘 ∈ ℱ defined by (1), where 𝜂 is a small constant used to avoid division by zero. </p><formula xml:id="formula_0">𝑥(𝑖, 𝑘) = {︃ 1 if VNF 𝑘 ∈ ℱ is placed on VIM 𝑖 0 otherwise. and 𝑦 𝑟 (𝑖, 𝑘) = {︃ 1 if 𝑐 𝑟 𝑖 &lt; 𝜈 𝑟 𝑘 0 otherwise.<label>(2)</label></formula></div>
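The state encoding and the indicator 𝑦 𝑟 (𝑖, 𝑘) of (2) can be sketched as follows; the flat concatenated layout of the observation is our assumption for illustration, not necessarily the exact encoding fed to the agent:

```python
def build_observation(vim_free, vnf_req):
    """Flatten the available VIM resources ([|N|][3]: CPU, RAM, storage)
    and append the current VNF's requirements into one observation vector."""
    return [r for vim in vim_free for r in vim] + list(vnf_req)

def overload_indicator(vim_free, vnf_req, i):
    """Indicator y_r(i, k) of (2): 1 for each resource r where VIM i
    lacks capacity for VNF k (c_i^r < nu_k^r), 0 otherwise."""
    return [1 if c < v else 0 for c, v in zip(vim_free[i], vnf_req)]

vim_free = [[8, 32.0, 64.0],   # per-VIM available CPU, RAM (GB), storage (GB)
            [2, 16.0, 32.0]]
vnf_req = [4, 8.0, 20.0]       # current VNF requirements
obs = build_observation(vim_free, vnf_req)           # length 2*3 + 3 = 9
shortage = overload_indicator(vim_free, vnf_req, 1)  # [1, 0, 0]: VIM 1 lacks CPU
```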
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">DDQN</head><p>The Double Deep Q-Network (DDQN) <ref type="bibr" target="#b11">[12]</ref> is a value-based DRL algorithm that computes Q-values representing an estimate of expected future rewards. This algorithm is primarily designed for discrete environments, i.e., those with a finite action space. Each action is associated with a Q-value that gets updated as the DDQN agent learns from its environment. The agent's objective is to maximize the cumulative rewards obtained throughout episodes. Achieving this goal is closely tied to the use of Q-values: whenever the agent selects the action with the highest Q-value, it aims to maximize an estimate of the cumulative reward. This estimation plays a critical role in the algorithm's behavior. DDQN employs two neural networks to better estimate Q-values, an evaluation network and a target network, combined according to equation (3). Action selection relies on the Q-values of the evaluation network, while the update of these Q-values is based on the Q-values of the target network, which is an older version of the evaluation network. In DRL, the targets are not known in advance and are not fixed, hence the need for the target network, which, for a certain number of iterations, keeps the Q-values used in target calculations unchanged, effectively "freezing" the targets. Periodically, the target network is updated by copying the weights of the evaluation network.</p><formula xml:id="formula_1">𝑌 𝐷𝐷𝑄𝑁 = 𝑟 𝑡 + 𝛾𝑄 𝑡𝑎𝑟𝑔𝑒𝑡 (𝑠 𝑡+1 , argmax 𝑎 𝑄 𝑒𝑣𝑎𝑙 (𝑠 𝑡+1 , 𝑎))<label>(3)</label></formula><p>The trial-and-error nature of DDQN is managed through an exploration-exploitation trade-off. To ensure the selection of actions that yield the maximum reward, various actions are tested in different states (exploration), even if this means occasionally choosing actions with lower Q-values. 
However, to truly maximize cumulative rewards and achieve the ultimate objective, the actions with the highest estimated Q-values must be selected (exploitation). To strike a balance, we have chosen a linear-decay epsilon-greedy policy, which involves choosing a random action with a probability of 𝜖 and selecting the action with the maximum Q-value with a probability of 1 − 𝜖. DDQN is an off-policy algorithm, meaning that it leverages past experiences to update its policy. To achieve this, it utilizes a replay buffer where experiences are stored; samples are periodically drawn from it to train the evaluation network, thereby improving the estimation of its Q-values. We provide the pseudo-code of DDQN in Algorithm 1.</p></div>
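The target of equation (3) and the epsilon-greedy selection can be sketched as below; this is a minimal illustration over plain Python lists, and the default discount factor 𝛾 = 0.99 is our assumption, not a value reported in the paper:

```python
import random

def ddqn_target(reward, next_q_eval, next_q_target, gamma=0.99):
    """Y_DDQN of equation (3): the evaluation network selects the action,
    the (frozen) target network provides the Q-value used in the target."""
    best_action = max(range(len(next_q_eval)), key=lambda a: next_q_eval[a])
    return reward + gamma * next_q_target[best_action]

def epsilon_greedy(q_values, epsilon):
    """Explore (random action) with probability epsilon, otherwise
    exploit the action with the highest Q-value."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

In linear decay, epsilon starts near 1 and is decreased by a fixed step each iteration, shifting the agent from exploration to exploitation over training.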
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.">Real deployment considerations</head><p>Deploying our algorithm within a real environment requires us to consider several key factors. Firstly, our observation space includes the infrastructure state and the current slice requirements (CPU, RAM, and storage). All these measurements are emulated in simulation and easily retrievable with OSM. One noteworthy advantage is that the agent receives observations in the same form regardless of the environment type. This means that we do not need to develop separate agents for each environment. Consequently, we can use the same model for the evaluations in both environments, which not only minimizes potential errors but also ensures consistent behaviour across environments. The model's training is conducted within the simulation environment, justified by the significant time needed to instantiate a slice in a real environment, which could extend training times to several months. By training in simulation, we can significantly reduce the learning time of our algorithm.</p></div>
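The environment-agnostic design can be sketched as a common interface; the class and method names here are ours (the paper does not specify them), and the mock step below only emulates the resource bookkeeping that a real OSM-backed environment would perform through osmclient:

```python
from abc import ABC, abstractmethod

class SliceEnv(ABC):
    """Common interface seen by the agent for both backends."""
    @abstractmethod
    def reset(self):
        """Return the initial observation."""
    @abstractmethod
    def step(self, action):
        """Apply a placement; return (observation, reward, done)."""

class MockEnv(SliceEnv):
    """Simulation backend: placing a VNF simply subtracts its
    requirements from the chosen VIM's available resources."""
    def __init__(self, vim_free, vnf_req):
        self.vim_free = [list(v) for v in vim_free]
        self.vnf_req = list(vnf_req)  # fixed requirements, for illustration
    def _observe(self):
        return [r for vim in self.vim_free for r in vim] + self.vnf_req
    def reset(self):
        return self._observe()
    def step(self, action):
        vim = self.vim_free[action]
        done = any(c < v for c, v in zip(vim, self.vnf_req))
        if not done:  # enough resources: commit the placement
            self.vim_free[action] = [c - v for c, v in zip(vim, self.vnf_req)]
        return self._observe(), (0.0 if done else 1.0), done
```

A real backend would subclass SliceEnv, generate descriptors, and call osmclient inside step, while the agent code stays unchanged.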
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Results</head><p>In this section, we introduce the configuration of our experimental testbed, which serves as the foundation for our research and evaluation. The testbed includes six VIMs deployed on separate Grid'5000 nodes. Each VIM is equipped with 32 virtual CPUs (vCPUs), 128 GB of RAM, and 256 GB of disk space. We use an Ubuntu 20.04 LTS distribution on each node. For our cloud infrastructure, we have deployed MicroStack instances with the OpenStack Ussuri version. To orchestrate and manage network services and resources, we have integrated OSM on a separate node with the same characteristics as the MicroStack nodes. We use OSM Release THIRTEEN for our experiments. Each generated slice contains exactly 10 VNFs whose values are randomly selected within the ranges defined in Table <ref type="table" target="#tab_1">1</ref>. In the initial phase of our experiments, we monitored the resource state of both the simulation and real environments at each iteration, with CPU in Figure <ref type="figure" target="#fig_5">3</ref>, RAM in Figure <ref type="figure" target="#fig_6">4</ref> and storage in Figure <ref type="figure" target="#fig_8">5</ref>. We successfully instantiated 12 slices, equivalent to 120 VNFs, in both the simulation and real environments. The last slice (number 12) could not be created in either environment because of a lack of resources. We observed that differences in the allocation of resources appear just before the middle of the episode. These disparities may be attributed to variations in the agent's behavior.</p><p>Therefore, to delve deeper into the agent's behavior, we examined its choice of VNF placement at each iteration. We plotted in Figure <ref type="figure" target="#fig_9">6</ref> the number of VNFs assigned to each VIM, whether or not the slice could be created. 
Initially, there were no differences between the decision graphs of the two environments in the early iterations. However, differences started to manifest around slice 5. Most of the differences observed in the previous figures were primarily caused by variations in the agent's decisions. During evaluation, exploration was deactivated, meaning that any small change in the agent's actions should be attributed to disparities between the two observations.</p><p>We still needed to assess the overall impact of these disparities, so we plotted in Figure <ref type="figure" target="#fig_10">7a</ref> the total amount of available resources for both the simulation and real environments at each iteration. The objective was to investigate whether there was a significant difference in the overall available resources of the infrastructure. Our observations revealed that the behavior of resource utilization appeared to be linear for both experiments and across the three resource types (CPU, RAM, and disk space). While there were some differences between the real and simulation curves, these disparities did not appear significant.</p><p>As a final aspect of our analysis, we examined the relative differences (𝑅𝑒𝑙𝑎𝑡𝑖𝑣𝑒_𝑑𝑖𝑓𝑓 = (𝑆𝑖𝑚 − 𝑅𝑒𝑎𝑙)/𝑅𝑒𝑎𝑙) between the total amount of resources allocated in the real and simulation environments at each iteration. Notably, we observed in Figure <ref type="figure" target="#fig_10">7b</ref> that the simulation appeared to prioritize RAM consumption over storage consumption. Interestingly, the relative difference in CPU resources remained consistently zero throughout all iterations. These results are likely due to the use of integer values for CPU and floating-point values for the others. Our investigations then pointed towards a potential issue with OSM, which appeared to round RAM and disk space values. Specifically, OSM returned RAM values with three decimal places and disk space values without any decimal precision, whereas it uses decimal precision for slice storage requirements. Our experiments revealed that there were no major differences in performance (i.e. the number of slices instantiated) between the simulation and real environments. However, the disparities identified in the OSM observation of the infrastructure were significant enough to impact the overall behavior of the agent.</p></div>
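The rounding behaviour we attribute to OSM can be reproduced with a short sketch; the exact rounding rules below (integer CPU and disk, RAM kept to three decimal places) are our interpretation of the returned values, not documented OSM behaviour:

```python
def relative_diff(sim, real):
    """Relative_diff = (Sim - Real) / Real, per resource type."""
    return [(s - r) / r for s, r in zip(sim, real)]

def osm_reported(cpu, ram_gb, disk_gb):
    """Mimic the observed OSM reporting precision (our interpretation):
    integer CPU, RAM to three decimal places, disk truncated to integers."""
    return [int(cpu), round(ram_gb, 3), float(int(disk_gb))]

sim = [10, 37.5, 120.7]            # totals tracked by the simulation
real = osm_reported(*sim)          # what OSM reports for the same state
diffs = relative_diff(sim, real)   # CPU diff is exactly zero; disk differs
```

Integer CPU counts survive the round trip unchanged, which is consistent with the consistently zero CPU difference observed in our measurements.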
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Conclusion and perspectives</head><p>In this paper we presented a DRL-based algorithm to solve the problem of optimizing network slice placement in future networks. The algorithm was trained in a simulated environment and then evaluated in a real environment deployed on the Grid'5000 platform and managed by OSM. The results show good performance of the algorithm in real conditions, with minor differences due to the observations collected in the infrastructure. However, the substantial delays required to instantiate VNFs in the infrastructure make it difficult to continue training in real conditions once the algorithm has been deployed. In future work, we would like to introduce mechanisms to accelerate the generation of experiences, either by augmenting the training experience dataset or by using analytical models.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: NetSlice template architecture in OSM</figDesc><graphic coords="4,110.13,163.81,375.03,181.35" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head></head><label></label><figDesc>(a) Grid'5000 real environment (b) Simulation environment</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 2</head><label>2</label><figDesc>Figure 2: Developed architectures</figDesc><graphic coords="6,89.29,179.54,270.85,134.80" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Algorithm 1</head><label>1</label><figDesc>Double Deep Q-Network (DDQN) pseudo-code. 1: INPUTS: 𝑁 𝑒𝑝𝑖𝑠𝑜𝑑𝑒𝑠 (number of episodes), 𝑇 (number of steps per episode), 𝐶 (target network update frequency) 2: OUTPUT: trained DDQN agent 3: 4: Initialize 𝑅 (replay buffer) 5: Initialize 𝑄 𝑒𝑣𝑎𝑙 (evaluation network) 6: Initialize 𝑄 𝑡𝑎𝑟𝑔𝑒𝑡 (target network) 7: Set 𝑄 𝑡𝑎𝑟𝑔𝑒𝑡 weights to be the same as 𝑄 𝑒𝑣𝑎𝑙 weights 8: For episode = 1 to 𝑁 𝑒𝑝𝑖𝑠𝑜𝑑𝑒𝑠 do 9: … 11: For timestep = 1 to 𝑇 do 12: Choose action 𝑎 with 𝜖-greedy policy based on 𝑄 𝑒𝑣𝑎𝑙 13: Execute action 𝑎, observe reward 𝑟 and next state 𝑠′ 14: Store (𝑠, 𝑎, 𝑟, 𝑠′) in replay buffer 𝑅 15: Sample a random minibatch from 𝑅 16: Compute target 𝑌 𝐷𝐷𝑄𝑁 using equation (3) 17: Update evaluation network using gradient descent: 𝜃 ← 𝜃 − 𝛼∇ 𝜃 [𝑄 𝑒𝑣𝑎𝑙 (𝑠, 𝑎) − 𝑦]² 18: Every 𝐶 timesteps, update target network weights: 𝑄 𝑡𝑎𝑟𝑔𝑒𝑡 ← 𝑄 𝑒𝑣𝑎𝑙 19:</figDesc></figure>
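The DDQN target of step 16 (equation (3)) selects the next action with the evaluation network and values it with the target network. A minimal sketch, assuming tabular Q-values stored as NumPy arrays (the function name and toy values are illustrative, not the paper's implementation):

```python
import numpy as np

# Sketch of the Double DQN target used in Algorithm 1: Q_eval chooses the
# next action, Q_target scores it, which decouples selection from evaluation.

def ddqn_target(r, s_next, q_eval, q_target, gamma=0.99, done=False):
    """Y_DDQN = r + gamma * Q_target(s', argmax_a Q_eval(s', a))."""
    if done:
        return r  # terminal transitions have no bootstrapped term
    a_star = int(np.argmax(q_eval[s_next]))      # action chosen by Q_eval
    return r + gamma * q_target[s_next, a_star]  # valued by Q_target

# Toy Q-tables: 2 states x 3 actions.
q_eval = np.array([[1.0, 5.0, 2.0], [0.5, 0.1, 0.9]])
q_target = np.array([[1.2, 4.0, 2.5], [0.4, 0.2, 1.1]])

# In state 0, Q_eval prefers action 1; Q_target values it at 4.0,
# so the target is 1.0 + 0.99 * 4.0.
print(ddqn_target(r=1.0, s_next=0, q_eval=q_eval, q_target=q_target))
```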
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Number of available vCPUs per VIM at each iteration</figDesc><graphic coords="10,89.29,439.87,187.51,97.23" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_6"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: Amount of available RAM (in GB) per VIM at each iteration</figDesc><graphic coords="10,318.47,439.87,187.51,97.23" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_8"><head>Figure 5 :</head><label>5</label><figDesc>Figure 5: Amount of available Disk space (in GB) per VIM at each iteration</figDesc><graphic coords="11,89.29,238.77,187.52,98.29" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_9"><head>Figure 6 :</head><label>6</label><figDesc>Figure 6: Agent's placement decisions per VIM at each iteration</figDesc><graphic coords="11,318.47,238.77,187.52,98.29" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_10"><head>Figure 7 :</head><label>7</label><figDesc>Figure 7: Computation of differences between real and simulation experiments</figDesc><graphic coords="11,318.47,394.42,187.51,138.00" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>•</head><label></label><figDesc>Next state: Following the application of the action, the VIMs' amounts of available resources in the physical infrastructure change, and a new slice placement request can now be processed. This new real-time description of the environment becomes the next state. • Experience: The combination of state 𝑠 𝑡 , action 𝑎 𝑡 , reward 𝑟 𝑡 , and next state 𝑠 𝑡+1 is referred to as an experience, denoted 𝑒 = (𝑠 𝑡 , 𝑎 𝑡 , 𝑟 𝑡 , 𝑠 𝑡+1 ). Experiences are used by the agent for self-improvement, as elaborated in the following section.</figDesc><table><row><cell>𝑅(𝑘) =</cell><cell>⎧ ⎪ ⎨ ⎪ ⎩ − −</cell><cell>∑︀ 𝑟∈ℛ |ℛ|+</cell><cell>∑︀ 𝑖∈𝒩 ∑︀ 𝑟∈ℛ</cell><cell>𝑥(𝑖, 𝑘) * ∑︀ 𝜂 𝑖∈𝒩 𝑥(𝑖,𝑘)*𝑦 𝑟 (𝑖,𝑘)*(𝑐 𝑟 1 𝑐 𝑟 𝑖 −𝜈 𝑟 𝑘 +𝜂 𝑖 −𝜈 𝑟 𝑘 )</cell><cell>for a successful VNF placement otherwise</cell><cell>(1)</cell></row><row><cell>with</cell><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 1</head><label>1</label><figDesc>Random functions and ranges used for slice requirements generation</figDesc><table><row><cell>Resource</cell><cell>Type</cell><cell>Random function</cell><cell>Range</cell><cell>Unit</cell></row><row><cell>CPU</cell><cell cols="2">Integer numpy.random.choice 2</cell><cell>{1, 2} with probabilities {0.7, 0.3}</cell><cell>-</cell></row><row><cell>RAM</cell><cell>Float</cell><cell>numpy.random.uniform 2</cell><cell>[0, 10[</cell><cell>GB</cell></row><row><cell>Disk</cell><cell>Float</cell><cell>numpy.random.uniform 2</cell><cell>[0, 20[</cell><cell>GB</cell></row></table></figure>
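The generators in Table 1 can be reproduced with the cited numpy.random legacy functions. A minimal sketch (the seed and the dictionary layout are assumptions, not from the paper):

```python
import numpy as np

# Sketch of the slice-requirement generator of Table 1, using the legacy
# numpy.random functions the paper cites. Seed fixed for reproducibility.
np.random.seed(42)

def generate_slice_request():
    """Draw one slice request: integer vCPUs, float RAM/Disk in GB."""
    return {
        "cpu": int(np.random.choice([1, 2], p=[0.7, 0.3])),  # {1, 2}, probs {0.7, 0.3}
        "ram_gb": float(np.random.uniform(0, 10)),           # float in [0, 10)
        "disk_gb": float(np.random.uniform(0, 20)),          # float in [0, 20)
    }

print(generate_slice_request())
```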
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">Based on OSM Information Model : https://osm.etsi.org/docs/user-guide/latest/11-osm-im.html</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">Based on numpy.random functions : https://numpy.org/doc/stable/reference/random/legacy.html</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Description of network slicing concept</title>
		<author>
			<persName><surname>NGMN Alliance</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">NGMN 5G P</title>
		<imprint>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="1" to="11" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Adding virtualization capabilities to the Grid&apos;5000 testbed</title>
		<author>
			<persName><forename type="first">D</forename><surname>Balouek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Carpen Amarie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Charrier</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Desprez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Jeannot</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Jeanvoine</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Lèbre</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Margery</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Niclausse</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Nussbaum</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Richard</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Pérez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Quesnel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Rohr</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Sarzyniec</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-319-04519-1_1</idno>
	</analytic>
	<monogr>
		<title level="m">Cloud Computing and Services Science</title>
		<title level="s">Communications in Computer and Information Science</title>
		<editor>
			<persName><forename type="first">I</forename><forename type="middle">I</forename><surname>Ivanov</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Van Sinderen</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">F</forename><surname>Leymann</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">T</forename><surname>Shan</surname></persName>
		</editor>
		<imprint>
			<publisher>Springer International Publishing</publisher>
			<date type="published" when="2013">2013</date>
			<biblScope unit="volume">367</biblScope>
			<biblScope unit="page" from="3" to="20" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<ptr target="https://www.5gcity.eu/" />
		<title level="m">5GCity -A distributed cloud &amp; radio platform for 5G Neutral Hosts</title>
				<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">A cloud-based SDN/NFV testbed for end-to-end network slicing in 4G/5G</title>
		<author>
			<persName><forename type="first">A</forename><surname>Esmaeily</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Kralevska</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Gligoroski</surname></persName>
		</author>
		<idno type="DOI">10.1109/NetSoft48620.2020.9165419</idno>
	</analytic>
	<monogr>
		<title level="m">2020 6th IEEE Conference on Network Softwarization (NetSoft)</title>
				<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="29" to="35" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Benchmarking open source NFV MANO systems: OSM and ONAP</title>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">M</forename><surname>Yilma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><forename type="middle">F</forename><surname>Yousaf</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Sciancalepore</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Costa-Perez</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.comcom.2020.07.013</idno>
		<ptr target="https://doi.org/10.1016/j.comcom.2020.07.013" />
	</analytic>
	<monogr>
		<title level="j">Computer Communications</title>
		<imprint>
			<biblScope unit="volume">161</biblScope>
			<biblScope unit="page" from="86" to="98" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<monogr>
		<ptr target="https://osm.etsi.org/" />
		<title level="m">Open Source MANO</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Management and orchestration for network function virtualization: An Open Source MANO approach</title>
		<author>
			<persName><forename type="first">M.-I</forename><surname>Csoma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Koné</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Botez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I.-A</forename><surname>Ivanciu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Kora</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Dobrota</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">2020 19th RoEduNet Conference: Networking in Education and Research (RoEduNet), IEEE</title>
				<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="1" to="6" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">OpenStack: toward an open-source solution for cloud computing</title>
		<author>
			<persName><forename type="first">O</forename><surname>Sefraoui</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Aissaoui</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Eleuldj</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">International Journal of Computer Applications</title>
		<imprint>
			<biblScope unit="volume">55</biblScope>
			<biblScope unit="page" from="38" to="42" />
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Enabling multi-domain orchestration using open source MANO, OpenStack and OpenDaylight</title>
		<author>
			<persName><forename type="first">P</forename><surname>Karamichailidis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Choumas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Korakis</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE International Symposium on Local and Metropolitan Area Networks (LANMAN), IEEE</title>
				<imprint>
			<date type="published" when="2019">2019. 2019</date>
			<biblScope unit="page" from="1" to="6" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">A 4G/5G packet core as VNF with Open Source MANO and OpenAirInterface</title>
		<author>
			<persName><forename type="first">T</forename><surname>Dreibholz</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">2020 International Conference on Software, Telecommunications and Computer Networks (SoftCOM), IEEE</title>
				<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="1" to="3" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<ptr target="https://microstack.run/" />
		<title level="m">OpenStack on Kubernetes | Ubuntu</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Deep reinforcement learning with double q-learning</title>
		<author>
			<persName><forename type="first">H</forename><surname>Van Hasselt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Guez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Silver</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the AAAI conference on artificial intelligence</title>
				<meeting>the AAAI conference on artificial intelligence</meeting>
		<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="volume">30</biblScope>
			<biblScope unit="page">1</biblScope>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
