<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Toward Reinforcement Learning-based Framework for Workflow Migration: Position Paper</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Nour</forename><forename type="middle">El Houda</forename><surname>Boubaker</surname></persName>
							<email>nour.boubaker@univ-constantine2.dz</email>
							<affiliation key="aff0">
								<orgName type="laboratory">LIRE Laboratory</orgName>
								<orgName type="institution">Constantine2 -Abdelhamid Mehri University</orgName>
								<address>
									<settlement>Constantine</settlement>
									<country key="DZ">Algeria</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Karim</forename><surname>Zarour</surname></persName>
							<email>zarour.karim@univ-constantine2.dz</email>
							<affiliation key="aff0">
								<orgName type="laboratory">LIRE Laboratory</orgName>
								<orgName type="institution">Constantine2 -Abdelhamid Mehri University</orgName>
								<address>
									<settlement>Constantine</settlement>
									<country key="DZ">Algeria</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Nawal</forename><surname>Guermouche</surname></persName>
							<email>nawal.guermouche@laas.fr</email>
							<affiliation key="aff1">
								<orgName type="laboratory">LAAS-CNRS</orgName>
								<orgName type="institution" key="instit1">University of Toulouse</orgName>
								<orgName type="institution" key="instit2">INSA</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Djamel</forename><surname>Benmerzoug</surname></persName>
							<email>djamel.benmerzoug@univ-constantine2.dz</email>
							<affiliation key="aff0">
								<orgName type="laboratory">LIRE Laboratory</orgName>
								<orgName type="institution">Constantine2 -Abdelhamid Mehri University</orgName>
								<address>
									<settlement>Constantine</settlement>
									<country key="DZ">Algeria</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Toward Reinforcement Learning-based Framework for Workflow Migration: Position Paper</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">2008AEAAD50284BAB994938E2A3B3C38</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T18:40+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Workflow Migration</term>
					<term>Fog</term>
					<term>Cloud</term>
					<term>Reinforcement Learning</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The concept of service migration holds significant importance within the realm of Fog-Edge computing, particularly in scenarios where mobile users are in constant motion, transitioning between various Access Points (APs). While this dynamic mobility is a fundamental characteristic of modern networking environments, it introduces the challenge of frequent service migration, which can potentially degrade the Quality of Service (QoS) experienced by users. This paper addresses this critical issue by presenting a methodical Reinforcement Learning framework for necessary workflow migration in Fog-Cloud Computing. We first examine existing solutions in the literature, and then introduce our Markov Decision Process (MDP) model for workflow migration.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>User mobility in Fog and Edge environments refers to scenarios where users are constantly moving within the network, such as in a smart city or Internet of Things (IoT) deployment. As users move, their proximity to Fog nodes may change, and it becomes necessary to migrate services to Fog nodes that are closer to the users to provide seamless connectivity and optimal performance <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b1">2]</ref>.</p><p>However, frequent migration can incur additional costs, including delays and increased energy usage. Hence, there is a need to minimize the frequency of migrations while still meeting users' Quality of Service (QoS) requirements, such as reducing the latency perceived by users <ref type="bibr" target="#b2">[3]</ref>. Moreover, regions may have varying levels of resources (computational capacity, memory, etc.) and network bandwidth <ref type="bibr" target="#b3">[4]</ref>. These differences in resource capacities and network capabilities need to be considered when deciding where to migrate services, to ensure optimal performance and resource utilization.</p><p>Recently, researchers have shown an increasing interest in leveraging artificial intelligence techniques to propose intelligent solutions for service migration problems in Fog and Edge environments <ref type="bibr" target="#b4">[5,</ref><ref type="bibr" target="#b5">6]</ref>. These solutions aim to identify an optimal migration policy based on user mobility patterns. Our approach employs a specific machine-learning paradigm called reinforcement learning. The latter is particularly well-suited for addressing the challenges posed by complex environments that require adaptability in response to contextual factors. Through this technique, we can develop a migration solution that dynamically adjusts the placement strategy by considering the varying performance of resources and bandwidth links. In contrast to existing solutions, our approach primarily emphasizes minimizing the frequency of workflow migrations in such heterogeneous environments, in order to maintain a trade-off between QoS and migration costs.</p><p>The rest of the paper is organized as follows: In Section 2, we discuss the problem statement, highlighting limitations in existing migration solutions that motivated our approach. Section 3 provides an overview of the system model. In Section 4, we explain our MDP modeling and RL framework.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Problem Statement</head><p>Several works have been proposed to solve the service migration problem in Fog-Edge Computing <ref type="bibr" target="#b6">[7]</ref>. For instance, <ref type="bibr" target="#b4">[5]</ref> proposed a deep Q-learning algorithm to solve the task migration problem without knowing users' mobility patterns. Wang et al. <ref type="bibr" target="#b5">[6]</ref> proposed a Double Deep Q-Learning (DDQN) framework for computation offloading and migration in vehicular networks, which considers time-varying channel states and stochastically arriving computation tasks. <ref type="bibr" target="#b7">[8]</ref> considered a centralized controller for service allocation and migration. <ref type="bibr" target="#b8">[9]</ref> designed a DRL approach to deploy an optimal migration policy in order to improve user QoS in Mobile Edge Computing (MEC). The approach consists of migrating data to another eNodeB (eNB) depending on the user position and the current state of the network. Djemai et al. <ref type="bibr" target="#b9">[10]</ref> presented a probabilistic mobility-based Genetic Algorithm (MGA) and a mobility greedy heuristic (MGH) for efficient service migration in the Fog environment that minimize the infrastructure energy consumption and application delay violations over time. Huang et al. <ref type="bibr" target="#b10">[11]</ref> proposed an intelligent task migration scheme in MEC using the Q-learning technique, aiming to minimize the overall service time. <ref type="bibr" target="#b11">[12]</ref> focused on the problem of service migration where users move between multiple edge nodes and proposed a service migration strategy algorithm (SMSMA) based on a multi-attribute MDP to make migration decisions.</p><p>In many existing research studies <ref type="bibr" target="#b7">[8,</ref><ref type="bibr" target="#b8">9,</ref><ref type="bibr" target="#b9">10,</ref><ref type="bibr" target="#b11">12]</ref>, the primary focus revolves around migration strategies that relocate services or workflows whenever users change locations. However, such migration strategies can introduce computational resource overhead, higher communication costs, and longer migration times. These unnecessary migrations can deplete resources and disrupt service execution, potentially leading to suboptimal performance and operational inefficiencies.</p><p>Conversely, studies aiming to reduce migration frequency <ref type="bibr" target="#b4">[5,</ref><ref type="bibr" target="#b5">6,</ref><ref type="bibr" target="#b10">11]</ref> often focus on scenarios involving a single service migration in relatively homogeneous environments. In real-world situations, regions can differ significantly in their characteristics, especially regarding resource capacities and network conditions. This heterogeneity adds complexity, requiring a more nuanced approach to service migration management.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">System Model Overview</head><p>In our study, depicted in Fig. <ref type="figure" target="#fig_0">1</ref>, we examine a typical industrial scenario where a robot traverses geographic regions covered by Fog servers. These servers connect to Access Points (APs) through wireless links. Initially, the robot assigns computation tasks to Fog resources in the first region. However, as the user moves, the system must make informed decisions about task migration. These decisions consider factors like resource performance and network conditions. To accommodate user mobility, the system operates in time slots, with the timeline represented by 𝑡 ∈ 𝑇 = {0, 1, 2, . . . , 𝑇 }. The duration of each time slot is 𝛿 (e.g., 15 minutes).</p><p>The robot's cyber workflow consists of tasks with dependencies. The robot's physical component continuously sends data gathered from various sensors, such as images, to this workflow. This data is processed and manipulated within the cyber workflow, and after processing, the results are returned to the physical component. The cyber workflow is formally represented as a directed acyclic graph (DAG), denoted as 𝐺 = (𝑇 𝑠, 𝐸), where 𝑇 𝑠 is the set of tasks and 𝐸 represents the dependencies or constraints between task pairs. Each task in this model has attributes like size and computational requirements.</p><p>Fog resources are distributed in each region, forming a network of interconnected nodes. These resources are linked together through wireless links, which can vary in characteristics from one resource to another. Furthermore, the resources are characterized by a set of attributes such as computational capacities. Our primary goal is to develop a decision-making algorithm that efficiently minimizes both the overall delay and energy consumption of processing in the system, covering offloading, execution, and migration. This delay encompasses the time from when tasks are offloaded to the resources of the initial region to the time the final task in the workflow completes its execution in the last region.</p></div>
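As a concrete illustration of this workflow model, the sketch below encodes a small DAG 𝐺 = (𝑇 𝑠, 𝐸) with per-task attributes and computes a dependency-respecting task order. All names (`Task`, `size_mb`, `cpu_cycles`) and the sample values are illustrative assumptions of ours, not definitions from the paper.

```python
# Hypothetical sketch of the cyber workflow G = (Ts, E): tasks with
# size/computation attributes, dependency edges, and a discretized timeline.
from dataclasses import dataclass

@dataclass
class Task:
    task_id: int
    size_mb: float       # data size to transfer on offloading/migration (assumed unit)
    cpu_cycles: float    # computational requirement (assumed unit)

# Ts as a dict of tasks, E as predecessor -> successor pairs (a small diamond DAG)
tasks = {i: Task(i, size_mb=10.0 * (i + 1), cpu_cycles=1e9) for i in range(4)}
edges = [(0, 1), (0, 2), (1, 3), (2, 3)]

def topological_order(task_ids, dep_edges):
    """Return the tasks in an order compatible with the dependencies in E."""
    indeg = {t: 0 for t in task_ids}
    for _, v in dep_edges:
        indeg[v] += 1
    ready = [t for t, d in indeg.items() if d == 0]
    order = []
    while ready:
        u = ready.pop()
        order.append(u)
        for a, b in dep_edges:
            if a == u:
                indeg[b] -= 1
                if indeg[b] == 0:
                    ready.append(b)
    return order

DELTA_MINUTES = 15            # assumed slot duration delta (e.g., 15 minutes)
print(topological_order(list(tasks), edges))  # → [0, 2, 1, 3]
```

In each time slot, the decision-making algorithm would walk the tasks in such an order, so that a task is only placed or migrated after all of its predecessors.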
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Workflow Migration-based RL Methodology</head><p>In contrast to other fields of Machine Learning, reinforcement learning relies on continuous interaction with the environment, where the agent learns through feedback in the form of values assessing its actions.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">MDP Model</head><p>The workflow migration problem is formalized as a Markov Decision Process (MDP), denoted as 𝑀 𝐷𝑃 = ⟨𝑆, 𝐴, 𝑅, 𝑃 ⟩ <ref type="bibr" target="#b12">[13]</ref>, where 𝑆 represents the state space, 𝐴 is the action space encompassing all possible actions at each state, 𝑅 is the reward function valuing state-action pairs, and 𝑃 determines the probability of transitioning between states when specific actions are taken. 𝑆, 𝐴, and 𝑅 are represented as follows:</p><p>1. State Space: The state space is defined by several variables that collectively represent the system state. These variables include the current time slot (𝑡 𝑡 ), the information of the current task (𝜏 𝑖 ) that needs to be allocated, the information of the resources at the current region (𝑅 𝑡 ), and the action taken for the task in the previous time slot (𝐴 𝑡−1 𝑖 ). The state can be denoted as 𝑆 𝑡 𝑖 = {𝑡 𝑡 , 𝜏 𝑖 , 𝑅 𝑡 , 𝐴 𝑡−1 𝑖 (𝑡&gt;1) }. The total number of states in one episode is equal to 𝑁 × 𝑇 , where 𝑁 represents the number of tasks and 𝑇 denotes the number of time slots.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Action Space:</head><p>In each time slot, an action 𝐴 𝑡 𝑖 must be taken for each task 𝜏 𝑖 . The action 𝐴 𝑡 𝑖 ∈ {𝑎 0 (𝑡&gt;1) , 𝑎 1 } consists of two options: 𝑎 0 denotes deciding not to migrate (available only when 𝑡 &gt; 1), while 𝑎 1 involves migrating the task to the current region by selecting a suitable resource.</p></div>
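The state and action spaces described above can be sketched as follows; the field names and the string encoding of 𝑎 0 and 𝑎 1 are hypothetical choices made for illustration.

```python
# Illustrative encoding of the MDP state S_i^t = {t_t, tau_i, R_t, A_i^{t-1}}
# and the action space {a0 (t>1), a1}; names are assumptions, not the paper's.
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class State:
    slot: int                   # current time slot t_t
    task_id: int                # task tau_i awaiting a decision
    region_resources: tuple     # resource information R_t at the current region
    prev_action: Optional[str]  # A_i^{t-1}, undefined in the first slot

def action_space(state: State):
    """a0 = keep the task in place (only valid after slot 1), a1 = migrate/allocate."""
    return ("a1",) if state.slot == 1 else ("a0", "a1")

# N tasks over T time slots gives N * T decision states per episode.
N, T = 5, 8
print(N * T)  # 40 states

s1 = State(slot=1, task_id=0, region_resources=("m1", "m2"), prev_action=None)
print(action_space(s1))  # first slot: only allocation/migration is possible
```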
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Reinforcement Learning Framework</head><p>In RL, an agent interacts with an environment by taking actions, receiving rewards or penalties based on those actions, and using this feedback to improve its decision-making process. The agent's actions are guided by a policy, which determines the mapping from system states to actions <ref type="bibr" target="#b12">[13]</ref>. The ultimate objective of the agent is to ascertain an optimal policy, denoted as 𝜋 * , which effectively maps a state 𝑠 𝑛 to a probability distribution over possible actions 𝑎 𝑛 and is represented as follows:</p><formula xml:id="formula_0">𝜋 * : 𝑆 → 𝑃 (𝐴)<label>(1)</label></formula><p>Figure <ref type="figure" target="#fig_1">2</ref> illustrates the workflow migration framework, consisting primarily of two core components: the RL agent and the migration environment. The primary aim of the agent is to maximize its cumulative reward during system operation, a goal achieved by minimizing both overall delay and energy consumption. This cumulative reward, denoted as 𝑅 𝑡𝑜𝑡𝑎𝑙 , is defined as the summation of the rewards of tasks across all time slots:</p><formula xml:id="formula_1">𝑅 𝑡𝑜𝑡𝑎𝑙 = 𝑇 ∑︁ 𝑡=1 𝑁 ∑︁ 𝑖=1 𝛾𝑅(𝑆 𝑡 𝑖 , 𝐴 𝑡 𝑖 )<label>(2)</label></formula><p>In the initial stages, the agent operates without prior knowledge about the environment and therefore initiates exploration by taking random actions that might not yield immediate high rewards but offer valuable insights for discovering more rewarding actions over time. Subsequently, the agent shifts to exploitation, where it selects actions aimed at maximizing the expected future rewards, relying on its current understanding of the environment. 
At first, the agent receives the information related to the first state, denoted 𝑠 1 1 , including the requirements of the first task (CPU, RAM, and disk), as well as information about the available resources in the first region, including cloud resources (identified by machine id). The Deep Neural Network (DNN) prediction module uses this information, together with a timestamp, to estimate the current resource utilization, specifically for CPU, RAM, and disk. This step is crucial in minimizing resource overhead, as the agent will only consider the subset of resources that can adequately meet the requirements of the task. After the resource selection, the agent receives a reward that reflects the delay and energy consumption resulting from offloading and executing the task on the selected resource. To facilitate its decision-making process, the agent relies on a value function, specifically the state value function 𝑉 (𝑠) or the state-action value function 𝑄(𝑠, 𝑎). This function estimates the anticipated future reward that the agent can achieve by taking a specific action from the current system state. By utilizing the value function, the agent can evaluate the potential advantages of different actions and prioritize the most promising ones. Subsequently, the agent updates the value and requests the next state, which represents the information regarding the second task to be allocated to the resources in the first time slot. Once the agent has completed the task offloading process in the first time slot, it requires information about the first task, denoted as 𝑠 2 1 , which includes its previous host machine. This information is necessary for the agent to make a migration decision 𝑎 2 1 in the second time slot. The host machine's identifier is used by the prediction module to estimate the current resource utilization of this resource. 
This iterative process continues as the agent makes migration decisions for each subsequent task across all regions until decisions are made for all tasks.</p><p>The RL agent learns through multiple episodes, refining its value function from rewards and new information gathered during interactions with the environment. This iterative process enhances its decision-making, leading to the acquisition of an optimal policy. This value-function update is a fundamental aspect of RL algorithms such as Q-learning and State-Action-Reward-State-Action (SARSA).</p><p>In the inference phase, the agent leverages its acquired knowledge to make decisions and take actions during interactions with its environment. This behavior reflects the knowledge accumulated during training, guiding it to navigate and interact in alignment with the optimal migration policy 𝜋 * .</p></div>
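A minimal tabular Q-learning loop of the kind mentioned above (epsilon-greedy exploration/exploitation plus the value-function update) might look as follows. The toy environment, in which migrating (𝑎 1) simply incurs a higher delay-plus-energy penalty than staying (𝑎 0), and all hyperparameters, are stand-in assumptions, not the paper's actual migration environment.

```python
# Sketch of tabular Q-learning for migrate/no-migrate decisions.
# Rewards are negative costs (delay + energy); all values are assumptions.
import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2   # assumed learning rate, discount, exploration rate
ACTIONS = ("a0", "a1")                  # a0 = no migration, a1 = migrate
Q = defaultdict(float)                  # Q(s, a), initialized to 0

def choose_action(state):
    """Epsilon-greedy: explore with probability EPSILON, else exploit Q."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def q_update(s, a, reward, s_next):
    """Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q[(s_next, a2)] for a2 in ACTIONS)
    Q[(s, a)] += ALPHA * (reward + GAMMA * best_next - Q[(s, a)])

random.seed(0)
for episode in range(200):
    for t in range(1, 5):                    # time slots within one episode
        s, s_next = ("slot", t), ("slot", t + 1)
        a = choose_action(s)
        # Toy cost model: migrating is assumed twice as costly as staying.
        reward = -1.0 if a == "a0" else -2.0
        q_update(s, a, reward, s_next)

greedy = max(ACTIONS, key=lambda a: Q[(("slot", 1), a)])
print(greedy)   # the learned greedy choice at the first slot
```

Under these assumed costs the agent learns to avoid unnecessary migrations, which mirrors the paper's goal of trading off QoS against migration frequency; a DQN variant would replace the table `Q` with a neural network.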
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Conclusion And Future Works</head><p>This position paper addresses Workflow Migration in the context of Fog-Cloud Computing, especially in scenarios with varying resource capacities and bandwidth links. We began by highlighting limitations and challenges in existing literature. We then outlined our system model. We introduced an MDP model, defining its key components. Additionally, we presented an RL framework for necessary workflow migration.</p><p>In the future, we aim to develop a Deep Learning resource prediction module using Google cluster trace and Alibaba data, explore partial offloading for energy-constrained end-users, and investigate multi-agent strategies for scalability in expanding resource scenarios.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Overview of the system model</figDesc><graphic coords="3,172.63,432.07,250.02,133.08" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>3 .</head><label>3</label><figDesc>Reward: It represents delay and energy consumption associated with the execution of action 𝐴 𝑡 𝑖 within the context of state 𝑆 𝑡 𝑖 .</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Reinforcement Learning Framework</figDesc><graphic coords="5,162.21,163.87,270.85,152.36" type="bitmap" /></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgment</head><p>This work was partially supported by the LABEX-TA project MeFoGL: "Méthodes Formelles pour le Génie Logiciel" (Formal Methods for Software Engineering).</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">A survey on mobility-induced service migration in the fog, edge, and related computing paradigms</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Rejiba</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Masip-Bruin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Marín-Tordera</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Computing Surveys (CSUR)</title>
		<imprint>
			<biblScope unit="volume">52</biblScope>
			<biblScope unit="page" from="1" to="33" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">A survey on service migration in mobile edge computing</title>
		<author>
			<persName><forename type="first">S</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Liu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Access</title>
		<imprint>
			<biblScope unit="volume">6</biblScope>
			<biblScope unit="page" from="23511" to="23528" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Mobility-aware dynamic service placement for edge computing</title>
		<author>
			<persName><forename type="first">G</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Tian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Wu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">EAI Endorsed Transactions on Internet of Things</title>
		<imprint>
			<biblScope unit="volume">5</biblScope>
			<biblScope unit="page" from="e2" to="e2" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">A survey of fog computing: concepts, applications and issues</title>
		<author>
			<persName><forename type="first">S</forename><surname>Yi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Li</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2015 workshop on mobile big data</title>
				<meeting>the 2015 workshop on mobile big data</meeting>
		<imprint>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="37" to="42" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Task migration for mobile edge computing using deep reinforcement learning</title>
		<author>
			<persName><forename type="first">C</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Zheng</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Future Generation Computer Systems</title>
		<imprint>
			<biblScope unit="volume">96</biblScope>
			<biblScope unit="page" from="111" to="118" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Computation migration and resource allocation in heterogeneous vehicular networks: a deep reinforcement learning approach</title>
		<author>
			<persName><forename type="first">H</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Ke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Sun</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Access</title>
		<imprint>
			<biblScope unit="volume">8</biblScope>
			<biblScope unit="page" from="171140" to="171153" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">E H</forename><surname>Boubaker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Zarour</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Guermouche</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Benmerzoug</surname></persName>
		</author>
		<title level="m">Fog and edge service migration approaches based on machine learning techniques: A short survey</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">A service migration method based on dynamic awareness in mobile edge computing</title>
		<author>
			<persName><forename type="first">M</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Rui</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Hui</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Qiu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">NOMS 2020-2020 IEEE/IFIP Network Operations and Management Symposium</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="1" to="7" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">A deep reinforcement learning approach for data migration in multi-access edge computing</title>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">De</forename><surname>Vita</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Bruneo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Puliafito</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Nardini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Virdis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Stea</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ITU Kaleidoscope: Machine Learning for a 5G Future (ITU K), IEEE</title>
				<imprint>
			<date type="published" when="2018">2018. 2018</date>
			<biblScope unit="page" from="1" to="8" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Mobility support for energy and qos aware iot services placement in the fog</title>
		<author>
			<persName><forename type="first">T</forename><surname>Djemai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Stolf</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Monteil</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J.-M</forename><surname>Pierson</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">2020 International Conference on Software, Telecommunications and Computer Networks (SoftCOM), IEEE</title>
				<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="1" to="7" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Intelligent task migration with deep qlearning in multiaccess edge computing</title>
		<author>
			<persName><forename type="first">S.-Z</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K.-Y</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C.-L</forename><surname>Hu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IET Communications</title>
		<imprint>
			<biblScope unit="volume">16</biblScope>
			<biblScope unit="page" from="1290" to="1302" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Service migration strategy based on multi-attribute mdp in mobile edge computing</title>
		<author>
			<persName><forename type="first">P</forename><surname>Tian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Si</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>An</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Zhou</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Electronics</title>
		<imprint>
			<biblScope unit="volume">11</biblScope>
			<biblScope unit="page">4070</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<monogr>
		<title level="m" type="main">Reinforcement learning: An introduction</title>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">S</forename><surname>Sutton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">G</forename><surname>Barto</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2018">2018</date>
			<publisher>MIT press</publisher>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
