<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Toward Reinforcement Learning-based Framework for Workflow Migration: Position Paper</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Nour El Houda Boubaker</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Karim Zarour</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nawal Guermouche</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Djamel Benmerzoug</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Constantine2 - Abdelhamid Mehri University, LIRE Laboratory</institution>
          ,
          <addr-line>Constantine</addr-line>
          ,
          <country country="DZ">Algeria</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>LAAS-CNRS, University of Toulouse</institution>
          ,
          <addr-line>INSA</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>Service migration is of significant importance in Fog-Edge computing, particularly in scenarios where mobile users are in constant motion, transitioning between various Access Points (APs). While this dynamic mobility is a fundamental characteristic of modern networking environments, it introduces the challenge of frequent service migration, which can degrade the Quality of Service (QoS) experienced by users. This paper addresses this issue by presenting a Reinforcement Learning framework that performs workflow migration in Fog-Cloud Computing only when necessary. We first review existing solutions in the literature and then introduce our Markov Decision Process (MDP) model for workflow migration.</p>
      </abstract>
      <kwd-group>
        <kwd>Workflow Migration</kwd>
        <kwd>Fog</kwd>
        <kwd>Cloud</kwd>
        <kwd>Reinforcement Learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        User mobility in Fog and Edge environments refers to scenarios where users are constantly
moving within the network, such as in a smart city or Internet of Things (IoT) deployment. As
users move, their proximity to Fog nodes may change, and it becomes necessary to migrate
services to Fog nodes that are closer to the users to provide seamless connectivity and optimal
performance [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ].
      </p>
      <p>
        However, it is important to highlight that frequent migration could result in added migration
expenses, including delays and increased energy usage. Hence, there exists a need to minimize
the frequency of migrations while still adhering to users’ Quality of Service (QoS) requirements,
such as diminishing latency as perceived by users [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Moreover, the regions may have varying
levels of resources (computational capacity, memory, etc.) and network bandwidth [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. These
differences in resource capacities and network capabilities need to be considered when
making decisions about where to migrate services to ensure optimal performance and resource
utilization.
      </p>
      <p>
        Recently, researchers have shown an increasing interest in leveraging artificial intelligence
techniques to propose intelligent solutions for service migration problems in Fog and Edge
environments [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ]. These solutions aim to identify an optimal migration policy based on
user mobility patterns. Our approach employs a specific machine-learning paradigm called
reinforcement learning. This paradigm is particularly well suited to the challenges posed
by complex environments that require adaptability in response to contextual factors. Through
this technique, we can develop a migration solution that dynamically adjusts the placement
strategy by considering the varying performance of resources and bandwidth links. In contrast
to existing solutions, our approach primarily emphasizes minimizing the frequency of
workflow migrations in such heterogeneous environments, in order to maintain a
trade-off between QoS and migration costs.
      </p>
      <p>The rest of the paper is organized as follows: In Section 2, we discuss the problem statement,
highlighting certain limitations in existing migration solutions that motivated our approach.
Section 3 provides an overview of the system modeling. In Section 4, we explain our MDP
modeling and RL framework.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Problem Statement</title>
      <p>
        Several works have been proposed to solve the service migration problem in Fog-Edge Computing
[
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. For instance, [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] proposed a deep Q-learning algorithm to solve the task migration problem
without knowing users' mobility patterns. Wang et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] proposed a Double Deep Q-Learning
(DDQN) framework for computation offloading and migration in vehicular networks, which
considers time-varying channel states and stochastically arriving computation tasks. Zhang et al. [8]
considered a centralized controller for service allocation and migration. De Vita et al. [9] designed a DRL approach to
deploy an optimal migration policy in order to improve user QoS in Mobile Edge Computing
(MEC). The approach consists of migrating data to another eNodeB (eNB) depending on the
user position and the current state of the network. Djemai et al. [10] presented a probabilistic
mobility-based Genetic Algorithm (MGA) and a mobility greedy heuristic (MGH) for efficient
service migration in the Fog environment that minimize the infrastructure energy
consumption and application delay violations over time. Huang et al. [11] proposed an intelligent
task migration scheme in MEC using the Q-learning technique. The authors aim to minimize
the overall service time. Tian et al. [12] focused on the problem of service migration where users move
between multiple edge nodes and proposed a service migration strategy algorithm (SMSMA)
based on a multi-attribute MDP to make migration decisions.
      </p>
      <p>In many existing research studies [8, 9, 10, 12], the primary focus revolves around
migration strategies that relocate services or workflows whenever users change locations.
However, it is essential to recognize that such migration strategies can introduce computational
resource overhead, higher communication costs, and longer migration times. These unnecessary
migrations can deplete resources and disrupt service execution, potentially leading to suboptimal
performance and operational inefficiencies.</p>
      <p>
        Conversely, studies aiming to reduce migration frequency [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6, 11</xref>
        ] often focus on scenarios
involving a single service migration in relatively homogeneous environments. In real-world
situations, regions can differ significantly in characteristics, especially regarding resource
capacities and network conditions. This heterogeneity adds complexity, requiring a more
nuanced approach to service migration management.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. System Model Overview</title>
      <p>In our study, depicted in Fig. 4.2, we examine a typical industrial scenario where a robot
traverses geographic regions covered by Fog servers. These servers connect to Access Points
(APs) through wireless links. Initially, the robot assigns computation tasks to Fog resources
in the first region. However, as the user moves, the system must make informed decisions
about task migration. These decisions consider factors like resource performance and network
conditions. To accommodate user mobility, the system operates in discrete time slots, with the timeline
represented by t ∈ 𝒯 = {0, 1, 2, . . . , T}. Each time slot t has a fixed duration τ (e.g., 15 minutes).</p>
      <p>The robot’s cyber workflow consists of tasks with dependencies. The robot’s physical
component continuously sends data gathered from various sensors, like images, to this workflow. This
data is processed and manipulated within the cyber workflow. After processing, the results are
returned to the physical component. The cyber workflow is formally represented as a directed
acyclic graph (DAG), denoted as G = (V, E). Here, V is the set of tasks, and E represents
the dependencies or constraints between task pairs. Each task in this model has attributes such as
size and computational requirements.</p>
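      <p>To make the DAG model concrete, the workflow above can be sketched in Python. The task names, sizes, and cycle counts below are hypothetical placeholders chosen for illustration, not values from the paper.</p>
      <preformat>
```python
# Hypothetical sketch of the cyber workflow as a DAG G = (V, E).
# Task names and attribute values are illustrative assumptions.

tasks = {
    "capture": {"size_mb": 8.0, "cycles": 2e8},  # V: tasks with attributes
    "detect":  {"size_mb": 2.5, "cycles": 9e8},
    "plan":    {"size_mb": 0.4, "cycles": 3e8},
}
edges = [("capture", "detect"), ("detect", "plan")]  # E: precedence constraints

def topological_order(tasks, edges):
    """Return the tasks in an order that respects every dependency in E."""
    indeg = {t: 0 for t in tasks}
    for _src, dst in edges:
        indeg[dst] += 1
    ready = [t for t in tasks if indeg[t] == 0]
    order = []
    while ready:
        t = ready.pop()
        order.append(t)
        for src, dst in edges:
            if src == t:
                indeg[dst] -= 1
                if indeg[dst] == 0:
                    ready.append(dst)
    return order

print(topological_order(tasks, edges))  # ['capture', 'detect', 'plan']
```
      </preformat>
      <p>A scheduler would allocate tasks to Fog resources in such a dependency-respecting order.</p>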
      <p>Fog resources are distributed in each region, forming a network of interconnected nodes.
These resources are linked together through wireless links, which can vary in characteristics
from one resource to another. Furthermore, the resources are characterized by a set of attributes
such as computational capacities. Our primary goal is to develop a decision-making algorithm
that efficiently minimizes both the overall delay and energy consumption associated with
processing in the system. This reduction covers offloading processing, execution processing,
and migration processing. It encompasses the time from when tasks are offloaded to the
resources of the initial region to the time the final task in the workflow completes its execution
in the last region.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Workflow migration-based RL Methodology</title>
      <p>In contrast to other fields of Machine Learning, reinforcement learning relies on continuous
interaction with the environment, where the agent learns through feedback in the form of
values assessing its actions.</p>
      <sec id="sec-4-1">
        <title>4.1. MDP Model</title>
        <p>The workflow migration problem is formalized as a Markov Decision Process (MDP), denoted
as ℳ = ⟨S, A, R, P⟩ [13], where S represents the state space, A is the action space
encompassing all possible actions at each state, R is the reward function valuing state-action
pairs, and P determines the probability of transitioning between states when specific actions
are taken. S, A, and R are represented as follows:
1. State Space: The state space is defined by several variables that collectively represent
the system state. These variables include the current time slot (t), the information of the
current task (w_t) that needs to be allocated, the information of resources at the current
region (r_t), and the action taken for the task in the previous time slot (a_{t-1}). The state
can be denoted as s = {t, w_t, r_t, a_{t-1}}. The total number of states in one episode is
equal to N × T, where N represents the number of tasks, and T denotes the number of
time slots.
2. Action Space: In each time slot, an action a_t must be taken for each task w_t. The action
a_t ∈ {0, 1} consists of two options: 0 denotes no migration decision, while 1
involves migrating the task to the current region by selecting a suitable resource.
3. Reward: It represents the delay and energy consumption associated with the execution of
action a_t within the context of state s_t.</p>
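        <p>The state and action definitions above can be sketched in Python; the field names, the summarised resource information, and the reward weights below are our own illustrative choices, not the paper's exact formulation.</p>
        <preformat>
```python
# Minimal sketch of the MDP components of Section 4.1.
# Field names and reward weights are illustrative assumptions.
from dataclasses import dataclass

@dataclass(frozen=True)
class State:
    t: int              # current time slot
    task_id: int        # current task w_t awaiting a decision
    region_load: float  # summarised resource information r_t of the region
    prev_action: int    # action a_{t-1} taken for the previous task

NO_MIGRATION, MIGRATE = 0, 1  # action space A = {0, 1}

def reward(delay_s, energy_j, w_delay=0.5, w_energy=0.5):
    """Negative weighted cost: lower delay and energy yield a higher reward."""
    return -(w_delay * delay_s + w_energy * energy_j)
```
        </preformat>
        <p>With N tasks and T time slots, one episode visits N × T such states, matching the state-space size given above.</p>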
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Reinforcement Learning Framework</title>
        <p>In RL, an agent interacts with an environment by taking actions, receiving rewards or penalties
based on those actions, and using this feedback to improve its decision-making process. The
agent’s actions are guided by a policy, which determines the mapping from system states to
actions [13]. The ultimate objective of the agent is to ascertain an optimal policy, denoted as π*,
which effectively maps a state s to a probability distribution over possible actions a and is
represented as follows:</p>
        <p>π* : S → P(A)   (1)
G = ∑_{t=1}^{T} ∑_{n=1}^{N} R(s, a)   (2)
where G denotes the cumulative reward collected over the N tasks and T time slots of an episode.</p>
        <p>In the initial stages, the agent operates without prior knowledge about the environment
and therefore initiates exploration by taking random actions that might not yield immediate
high rewards but offer valuable insights for discovering more rewarding actions over time.
Subsequently, the agent shifts to exploitation, where it selects actions aimed at maximizing the
expected future rewards, relying on its current understanding of the environment. The value-function
update is a fundamental aspect of RL algorithms such as Q-learning and
State-Action-Reward-State-Action (SARSA).</p>
        <p>In the inference phase, the agent leverages its acquired knowledge to make decisions and take
actions during interactions with its environment. This behavior reflects the wisdom accumulated
from its training, guiding it to navigate and interact in alignment with the optimal migration
policy  * .</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion And Future Works</title>
      <p>This position paper addresses Workflow Migration in the context of Fog-Cloud Computing,
especially in scenarios with varying resource capacities and bandwidth links. We began by
highlighting limitations and challenges in existing literature. We then outlined our system
model. We introduced an MDP model, defining its key components. Additionally, we presented
an RL framework for necessary workflow migration.</p>
      <p>In the future, we aim to develop a Deep Learning resource prediction module using Google
cluster trace and Alibaba data, explore partial offloading for energy-constrained end-users, and
investigate multi-agent strategies for scalability in expanding resource scenarios.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgment</title>
      <p>This work was partially supported by the LABEX-TA project MeFoGL: "Méthodes Formelles
pour le Génie Logiciel".</p>
    </sec>
    <sec id="sec-7">
      <title>References</title>
      <p>[8] M. Zhang, H. Huang, L. Rui, G. Hui, Y. Wang, X. Qiu, A service migration method based on
dynamic awareness in mobile edge computing, in: NOMS 2020-2020 IEEE/IFIP Network
Operations and Management Symposium, IEEE, 2020, pp. 1–7.
[9] F. De Vita, D. Bruneo, A. Puliafito, G. Nardini, A. Virdis, G. Stea, A deep reinforcement
learning approach for data migration in multi-access edge computing, in: 2018 ITU
Kaleidoscope: Machine Learning for a 5G Future (ITU K), IEEE, 2018, pp. 1–8.
[10] T. Djemai, P. Stolf, T. Monteil, J.-M. Pierson, Mobility support for energy and QoS aware
IoT services placement in the fog, in: 2020 International Conference on Software,
Telecommunications and Computer Networks (SoftCOM), IEEE, 2020, pp. 1–7.
[11] S.-Z. Huang, K.-Y. Lin, C.-L. Hu, Intelligent task migration with deep Q-learning in
multi-access edge computing, IET Communications 16 (2022) 1290–1302.
[12] P. Tian, G. Si, Z. An, J. Li, F. Zhou, Service migration strategy based on multi-attribute
MDP in mobile edge computing, Electronics 11 (2022) 4070.
[13] R. S. Sutton, A. G. Barto, Reinforcement Learning: An Introduction, MIT Press, 2018.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Rejiba</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Masip-Bruin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Marín-Tordera</surname>
          </string-name>
          ,
          <article-title>A survey on mobility-induced service migration in the fog, edge, and related computing paradigms</article-title>
          ,
          <source>ACM Computing Surveys (CSUR) 52</source>
          (
          <year>2019</year>
          )
          <fpage>1</fpage>
          -
          <lpage>33</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <article-title>A survey on service migration in mobile edge computing</article-title>
          ,
          <source>IEEE Access 6</source>
          (
          <year>2018</year>
          )
          <fpage>23511</fpage>
          -
          <lpage>23528</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>G.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Tian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <article-title>Mobility-aware dynamic service placement for edge computing</article-title>
          ,
          <source>EAI Endorsed Transactions on Internet of Things</source>
          <volume>5</volume>
          (
          <year>2019</year>
          )
          <fpage>e2</fpage>
          -
          <lpage>e2</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S.</given-names>
            <surname>Yi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <article-title>A survey of fog computing: concepts, applications and issues</article-title>
          ,
          <source>in: Proceedings of the 2015 workshop on mobile big data</source>
          ,
          <year>2015</year>
          , pp.
          <fpage>37</fpage>
          -
          <lpage>42</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>C.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <article-title>Task migration for mobile edge computing using deep reinforcement learning</article-title>
          ,
          <source>Future Generation Computer Systems</source>
          <volume>96</volume>
          (
          <year>2019</year>
          )
          <fpage>111</fpage>
          -
          <lpage>118</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>H.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Ke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <article-title>Computation migration and resource allocation in heterogeneous vehicular networks: a deep reinforcement learning approach</article-title>
          ,
          <source>IEEE Access 8</source>
          (
          <year>2020</year>
          )
          <fpage>171140</fpage>
          -
          <lpage>171153</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>N. E. H.</given-names>
            <surname>Boubaker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Zarour</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Guermouche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Benmerzoug</surname>
          </string-name>
          ,
          <article-title>Fog and edge service migration approaches based on machine learning techniques: A short survey</article-title>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>