FREEDA: Failure-Resilient, Energy-aware, and
                                Explainable Deployment of Microservice-based
                                Applications over Cloud-IoT Infrastructures
                                Monica Vitali3,† , Jacopo Soldani1 , Roberto Amadini2,5 , Antonio Brogi1 , Stefano Forti1 ,
                                Simone Gazza2 , Saverio Giallorenzo2,4 , Pierluigi Plebani3 , Francisco Ponce1 and
                                Gianluigi Zavattaro2,4
                                1
                                  Department of Computer Science, University of Pisa, Italy
                                2
                                  Department of Computer Science and Engineering, University of Bologna, Italy
                                3
                                  Department of Computer Science, Electronic, and Bio-engineering, Politecnico di Milano, Italy
                                4
                                  INRIA, France
                                5
                                  OPTIMA ARC Industrial Transformation and Training Centre, Melbourne, Australia


                                            Abstract
                                            FREEDA is an Italian research project aimed at supporting DevOps engineers in achieving failure-resilient
                                            and environmentally sustainable deployments of microservice-based applications over the Cloud-edge
                                            computing continuum. In this article, we describe motivations, objectives and first results of the FREEDA
                                            project, and discuss how FREEDA relates to the Information Systems Engineering community.

                                            Keywords
                                            Microservice-based Applications, Cloud-Edge, Application Deployment, Failure-resiliency, Energy-
                                            awareness, Sustainabile IT


                                Project overview Duration: September 2023 - September 2025. Consortium: University of
                                Pisa (IT) (Project Coordinator), University of Bologna (IT), Politecnico di Milano (IT). Funding
                                Agency: Ministry of Universities and Research (MUR), ITALY. Website: https://freeda.di.unipi.it


                                1. Project Overview
                                1.1. Context and Motivation
                                The extensive integration of smart connected devices, coupled with the increasing computa-
                                tional capacities they offer, requires a transformation of Cloud computing into ubiquitously


                                RPE@CAiSE’24: Research Projects Exhibition at the International Conference on Advanced Information Systems Engi-
                                neering, June 3–7, 2024, Limassol, Cyprus
                                †
                                  Corresponding author.
                                $ monica.vitali@polimi.it (M. Vitali); jacopo.soldani@unipi.it (J. Soldani); roberto.amadini@unibo.it (R. Amadini);
                                antonio.brogi@unipi.it (A. Brogi); stefano.forti@unipi.it (S. Forti); simone.gazza@unibo.it (S. Gazza);
                                saverio.giallorenzo2@unibo.it (S. Giallorenzo); pierluigi.plebani@polimi.it (P. Plebani); francisco.ponce@di.unipi.it
                                (F. Ponce); gianluigi.zavattaro@unibo.it (G. Zavattaro)
                                 0000-0002-5258-1893 (M. Vitali); 0000-0002-2435-3543 (J. Soldani); 0000-0003-1668-7305 (R. Amadini);
                                0000-0003-2048-2468 (A. Brogi); 0000-0002-4159-8761 (S. Forti); 0000-0002-3658-6395 (S. Giallorenzo);
                                0000-0002-6411-0511 (F. Ponce)
                                          © 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).


                                                                                                             69


CEUR
                  ceur-ws.org
Workshop      ISSN 1613-0073
Proceedings
distributed infrastructures that exploit computing capabilities at the edge of the network (Cloud-
IoT) [1]. The distributed infrastructure on which the Cloud-IoT computing continuum is based
is characterised by significant heterogeneity and variability. In such a scenario, the computing
devices composing the infrastructure range from cloud servers to smart IoT devices. These nodes
differ in terms of computational and storage capabilities, performance, cost, and ownership.
In common scenarios as data-intensive applications, another aspect to be considered is the
mutual position of the different nodes when they have to exchange information and the type
of connection (i.e., wired or wireless) affecting the communication time and cost. At the same
time, the infrastructure composition is continuously changing: nodes might join overtime or
might become unavailable at some point, as well as their performance and capabilities might be
affected by external factors (i.e., other applications concurrently running on the same hardware);
additionally, the workload of the applications hosted in such an infrastructure is characterised
by fluctuations in workload and traffic.
   In this context, the widespread adoption of Microservice-based Applications (MSAs) in
delivering enterprise solutions has increased the need for facilitating MSA deployment across
the Cloud-IoT infrastructures seamlessly. An MSA is an application designed as a set of loosely
coupled smaller components, each one with specific functionalities. Each component is also
characterised by specific functional (e.g., amount of computational and storage resources needed)
and non-functional (e.g., response time, latency) requirements that need to be considered and
satisfied when the application is deployed on the infrastructure.
   Given the complexity of both the Cloud-IoT infrastructure (i.e., number of nodes, heterogene-
ity, and variability) and the application (e.g., number of components and specific requirements),
properly mapping each component to a feasible infrastructural node is becoming a complex and
time-consuming task. This calls for approaches where the different deployment requirements of
the microservices composing the application are considered together with the capabilities and
features of the infrastructure. These approaches could influence the DevOps practice, which
must take into consideration potential service and hosting node failures, as well as cascading
failures where the malfunction of one infrastructure node or service leads to the failure of
others [2].
   The need for the deployment of resilient applications must also align with the recent European
Union’s requirements for sustainable IT [3][4], including the reduction of consumed energy.
Even though cloud infrastructures’ environmental impact has been significantly reduced in
recent years, this reduction does not apply to edge devices or smaller data centers. Additionally,
the deployed applications’ energy demand has been increasing with the widespread adoption of
the Cloud, due to the availability of large amounts of computational resources at a reduced cost.
This is expected to occur in Cloud-IoT infrastructures as well, therefore calling for supporting
energy-aware MSA deployments [5][6][7].
   The combination of all these characteristics makes it difficult to reason on the Cloud-IoT
deployment configuration of MSAs. Moreover, deployment requirements can conflict (e.g.,
resilience can be increased by deploying replicated service instances, which increases the toll
on the energy budget). This calls for novel techniques and tools that help DevOps engineers to
reason on deployment requirements.


                                               70
                                                                                     Cloud-IoT
                                                                                     continuum
                      Failure
       H                                                Reasoner
                     Analyzer
                                          A+
       I
                                           I          a+     I   R+
       A
                                          R+
                                                                             D
       R              Energy                          Multi-Criteria
                     Analyzer                            Solver
                                                                             E      DevOps
                                                                                   Engineer
                    Enrichment                          Trade-Off


Figure 1: Overview of FREEDA’s approach [8]


1.2. Proposed Approach
FREEDA aims to address the demand for DevOps support in deploying MSAs within Cloud-IoT
infrastructures, integrating considerations for failure resilience and environmental sustainability.
An overview of the proposed approach is illustrated in Fig. 1.
   The project enables the holistic deployment of an MSA over a Cloud-IoT infrastructure by
trading among its (possibly conflicting) deployment requirements. More precisely, FREEDA
will enable analyzing an MSA and a Cloud-IoT infrastructure, together with information on the
current and former MSA deployments (if any), to elicit additional requirements enabling the
enforcement of the MSA deployment’s failure resilience and energy sustainability.
   As shown in Fig. 1, the proposed solution is divided into two main phases (Enrichment and
Trade-Off ) each one supported by two main components. The main stakeholder of the proposed
approach is the DevOps Engineer willing to deploy an MSA in the infrastructure. The DevOps
Engineer must provide the following information:

    • A description of the application components (A): the set of microservices composing the
      application and their mutual relationships.
    • A set of requirements (R): functional (computational and storage resources needed)
      and non-functional (performance, security, energy) requirements associated with each
      component.
    • A description of the infrastructure (I): the set of nodes in which the application can
      be deployed and their capabilities and cost. These nodes might belong to the DevOps
      organization or an external organization (e.g., a public or private cloud provider).

   Additionally, to these inputs provided by the DevOps engineer, we expect to have also
information coming from the historical monitoring of the infrastructure and the application, if
already deployed (H).


                                                71
   Using these inputs, FREEDA will provide the DevOps Engineer with a valid deployment
able to fulfil all the expressed constraints. The core of the methodology is the Multi-Criteria
Solver component. This component formally models the deployment problem, by encoding it
in a constraint modelling language and builds a solver providing the best deployment plan (D).
   Before being able to perform this step, the Enrichment phase needs to be executed. This
phase aims at improving the application and requirements description to ensure the fulfilment
of the failure resiliency and environmental sustainability requirements. The enrichment phase
consists of two main components:
    • Failure Analyzer: the focus of this component is to enrich the application and its
      requirements description with additional features. The failure identifies possible causes of
      failure in the application execution and includes proper sidecar integration components
      (e.g., circuit breakers) in A to avoid failure propagation. It also generates novel hard/soft
      requirements in R for enforcing failure resiliency at deployment time, e.g., avoiding
      deploying services on nodes that are known to fail if subject to a given load, or whose
      failure is predicted to occur soon based on the available historical data.
    • Energy Analyzer: this component develops techniques to classify the profile of the nodes
      in a Cloud-IoT infrastructure and correspondingly generate requirements for reducing
      the energy consumed by MSAs deployed on such infrastructure. Also, the relation be-
      tween an MSA’s components, based on the connections/functional dependencies between
      microservices and information on the data they exchange are considered in the analysis
      to provide insights about microservice co-placement to the solver.
   The output of the enrichment phase is an updated and enrichment description of the applica-
tion components (A+) and requirements (R+), including failure-resiliency and energy-aware
considerations. The enrichment phase reduces the design effort required from the DevOps
Engineer by ensuring the satisfaction of these non-functional aspects in the deployment of the
application without the need for her direct intervention.
   As discussed in Sect. 1.1, FREEDA needs to operate in a dynamic environment that can affect
both infrastructure and the application composition, as well as the applications’ requirements.
Anytime a significant change is detected or requirements get violated, a new execution of
the Multi-Criteria Solver is needed to generate a new deployment plan. To avoid the inef-
ficiency that might result from a full re-deployment of an already running application, the
Reasoner component is in charge of determining a —possibly small— subset a+ of the applica-
tion components that need to be re-deployed due to violation of the requirements in R+. This
continuous reasoning approach might generate a sub-optimal deployment plan while reducing
the disruption of the re-deployment and ensuring the satisfaction of all the requirements.
   The last relevant feature distinguishing the FREEDA approach from the state of the art is
Explainability. FREEDA aims at providing DevOps engineers with proper and human-friendly
explanations (E) on why/how specific deployment choices have been taken by the solver. Such
explanations can also be exploited by the DevOps engineers to improve the MSA description
or requirements. For instance, the explanation may suggest the deployment of additional
components (e.g., circuit breakers) to reduce the risk of failure of another component, thus
increasing the cost and energy consumption of the proposed solution. Based on this, a DevOps
engineer may decide to refactor her service, natively including resiliency in the design.


                                                72
2. Project Objectives and Expected Results
The project targets four main objectives:
(𝑂1 ) Holistic MSA Deployment over the Cloud-Edge continuum: FREEDA will develop novel
      techniques to determine a suitable trade-off among the MSA’s deployment requirements,
      therein including cost, hardware, software, security, failure resilience, and sustainability
      requirements.
(𝑂2 ) Continuous Reasoning for Adaptive MSA Deployment over the Cloud-Edge continuum:
      FREEDA will develop continuous reasoning-like [9][10] to timely adapt the deployment
      of an MSA over a Cloud-Edge infrastructure when changes in the MSA, infrastructure, or
      deployment requirements occur.
(𝑂3 ) Explainable Enhancement of MSA Deployments’ Failure Resilience: FREEDA will develop
      explainable techniques to enhance the failure resilience of to-be-deployed MSAs over
      Cloud-Edge infrastructures.
(𝑂4 ) Explainable Reduction of MSA Deployments’ Environmental Impact: FREEDA will develop
      explainable techniques for reducing brown energy consumption when deploying MSAs
      over Cloud-Edge infrastructures.
   FREEDA will deliver a set of techniques to fulfil the project’s objectives, which will be proto-
typed to form a toolchain for planning MSA deployment across Cloud-IoT infrastructures. These
techniques correspond to the main components outlined in Fig. 1. Development within FREEDA
will adopt an iterative approach, beginning with basic implementations of proposed techniques
and gradually refining and extending them until project goals are achieved. Each element
of the approach will be provided as independent components, allowing DevOps Engineers to
activate and configure them according to specific needs and objectives for their applications. The
toolchain’s components will be developed with a service-oriented approach. While the specific
technologies and languages for implementation are still being examined, the aim remains to
leverage standard and open-source solutions whenever possible. As an example, the deployment
plan D and the descriptions of the application A, requirements R, and infrastructure I will be
articulated in a YAML1 file format, while the optimization problem will be formulated using
MiniZinc [11] as constraint modeling language and solved with optimization tools supporting
MiniZinc (e.g., Gurobi2 or OR-Tools3 ).
   The practical application of these techniques will be demonstrated through realistic use cases,
illustrating the (re)configuration of MSA deployment over a testbed Cloud-IoT infrastructure.
Operations and monitoring of the deployment will leverage existing tools, such as those based
on TOSCA [12] or EDMM [13]. Additionally, FREEDA aims to engage with standardization
committees (e.g., OASIS TOSCA) and open industrial initiatives (e.g., Cloud Native Computing
Foundation, Gaia-X, Next Generation Internet, Industrial Internet Consortium). This dual
approach will facilitate the exploration of emerging Cloud and Edge/Fog standards and solutions
within FREEDA, while also contributing to standards and initiatives through the solutions
developed within the project.
1
  https://yaml.org
2
  https://www.gurobi.com/
3
  https://developers.google.com/optimization


                                                73
  The prototype tools and their integration as a toolchain will be released as open-source
software in public repositories, such as GitHub.

2.1. Preliminary Results
FREEDA has started its operations in September 2023. At the current stage, the partners have
been involved in activities for the definition of:

    • A preliminary version of the mathematical formalization of the optimization problem,
      which included a detailed description of the application, the infrastructure, and the
      requirements;
    • A preliminary and simplified implementation of the solver using the model formalization
      to find a suitable deployment solution.


3. Relevance for the CAISE Community
The project’s alignment with the themes of the International Conference on Advanced Infor-
mation Systems Engineering (CAISE) underscores its relevance to the Information Systems
Engineering community. Specifically, FREEDA addresses challenges that are of particular
interest to this community, as delineated in the Call for Papers:

    • Microservices design and deployment: this topic is aligned with the main motivation of
      the project as expressed in objective O1;
    • Cloud- and edge-based IS engineering: this topic is aligned with the context in which
      FREEDA operates as declared in objectives O1 and O2;
    • Context-aware, autonomous, and adaptive IS: this topic is aligned with the goal of the
      continuous reasoner component, adapting the deployment according to the context of
      execution and requirements violations as declared in objective O2.

  Moreover, the project’s focus resonates with key themes from previous editions of the confer-
ence. For instance, the Intelligent Information Systems theme from CAISE 2021 acknowledged
the heightened level of uncertainty faced by organizations and the growing imperative for
Intelligent Information Systems that offer trusted, adaptive, agile, and autonomous solutions.
Similarly, the Resilient Information Systems theme from CAISE 2020 recognized the inherent
complexity of information systems as they evolve, leading to susceptibility to various forms of
degradation and failure. These topics are central to FREEDA’s objectives.


Acknowledgments
This work was supported by the project FREEDA (CUP: I53D23003550006), funded by the
frameworks PRIN (MUR, Italy) and Next Generation EU.


                                              74
References
 [1] A. J. Ferrer, J. M. Marquès, J. Jorba, Towards the decentralised cloud: Survey on approaches
     and challenges for mobile, ad hoc, and edge computing, ACM Computing Surveys (CSUR)
     51 (2019) 1–36.
 [2] J. Soldani, A. Brogi, Anomaly detection and failure root cause analysis in (micro) service-
     based cloud applications: A survey, ACM Computing Surveys (CSUR) 55 (2022) 1–39.
 [3] EU, A new strategic agenda 2019 – 2024, https://www.consilium.europa.eu/media/39914/
     a-new-strategic-agenda-2019-2024.pdf, 2019. Accessed: 2024-04-12.
 [4] EU, Eu. financing the climate transition, https://www.consilium.europa.eu/en/policies/
     climate-finance/, 2021. Accessed: 2024-04-12.
 [5] S. Forti, A. Brogi, Green application placement in the cloud-iot continuum, in: International
     Symposium on Practical Aspects of Declarative Languages, Springer, 2022, pp. 208–217.
 [6] M. Vitali, Towards greener applications: Enabling sustainable-aware cloud native applica-
     tions design, in: International Conference on Advanced Information Systems Engineering,
     Springer, 2022, pp. 93–108.
 [7] M. Vitali, P. Schmiedmayer, V. Bootz, Enriching cloud-native applications with sustainabil-
     ity features, in: 2023 IEEE International Conference on Cloud Engineering (IC2E), IEEE,
     2023, pp. 21–31.
 [8] J. Soldani, R. Amadini, A. Brogi, et al., Towards Sustainable Deployment of Microservices
     over the Cloud-Edge Continuum, with FREEDA, in: International Workshop on Flexible
     Resource and Application Management on the Edge (FRAME), ACM, 2024, pp. 1–4.
 [9] S. Forti, et al., Declarative continuous reasoning in the cloud-iot continuum, J. Log.
     Comput. 32 (2022) 206–232. doi:10.1093/LOGCOM/EXAB083.
[10] P. O’Hearn, Continuous reasoning: Scaling the impact of formal methods, in: LICS 2018,
     ACM, 2018, p. 13–25. doi:10.1145/3209108.3209109.
[11] N. Nethercote, P. J. Stuckey, R. Becket, S. Brand, G. J. Duck, G. Tack, MiniZinc: Towards a
     standard CP modelling language, in: International Conference on Principles and Practice
     of Constraint Programming, Springer, 2007, pp. 529–543.
[12] M. Rutkowski, C. Lauwers, C. Noshpitz, C. Curescu (eds.), TOSCA Simple Profile in
     YAML Version 1.3, https://docs.oasis-open.org/tosca/TOSCA-Simple-Profile-YAML/v1.3/
     os/TOSCA-Simple-Profile-YAML-v1.3-os.html, 2020. Accessed: 2024-04-12.
[13] M. Wurster, U. Breitenbücher, A. Brogi, F. Diez, F. Leymann, J. Soldani, K. Wild, Au-
     tomating the deployment of distributed applications by combining multiple deployment
     technologies., in: CLOSER, 2021, pp. 178–189.


                                               75