ViolationPredictor: a Solution for Predicting SLA
Violations of IoT Applications
Noureddine Staifi1 and Meriem Belguidoum1
1
    LIRE Laboratory, University of Constantine 2, Algeria


                                         Abstract
                                         The Internet of Things (IoT) paradigm has emerged strongly over the past decade and has established
                                         itself as an important player in the provision of services that are increasingly adapted to user preferences
                                         and profiles. Indeed, the management of the quality of service (QoS) is essential at the level of IoT systems,
                                         particularly critical applications, such as smart home systems (Smart Home Systems - SHS) and health
                                         monitoring systems (Health Monitoring Systems - HMS), requiring a certain level of quality specified
                                         and guaranteed by formal contracts, called Service Level Agreements (SLA). However, managing these
                                         SLAs are crucial tasks, such as SLA negotiation, monitoring, control, breach prediction, customisable
                                         management, etc. This paper presents ViolationPredictor, a Deep Learning (DL) based solution for the
                                         prediction of SLA violations. ViolationPredictor provides a way to predict future SLA violations and
                                         uses neural networks to accomplish this task. For each obligation, ViolationPredictor generates a neural
                                         network, where each system can predict possible future violations of this obligation. We used recurrent
                                         neural networks to implement ViolationPredictor because they have a memory that captures processed
                                         information and they can retain and consider past contextual information in their decisions.

                                         Keywords
                                         Internet of Things, Service Level Agreements, Quality of Service, SLA violation, Violation prediction,
                                         Deep Learning, Neural network.


1. Introduction
The Internet of Things (IoT) concept represents the new era of the Internet, allowing to inter-
connect objects to provide intelligent services. Currently, there are approximately 12.3 billion
connected objects in the world, and by 2025, connected objects will generate more than 73.1
billion Terabytes of data [1]. According to Cisco research, by 2030, 500 billion devices are
expected to be connected to the IoT [2]. However, to maximize the benefits of IoT in general,
and Smart Home Systems (SHS) in particular, several challenges need to be overcome, namely
managing massive amounts of data, privacy, security, and Quality of Service (QoS) management.
   In SHS, each application has its usage and traffic characteristics, and therefore requires a
certain level of quality specified in QoS contracts, called Service Level Agreements (SLA). An
SLA is a formal contract between service providers and consumers, it specifies the provided
services, the obligations of each party and the corresponding penalties in the event of a contract
violation. Its main objective is to clarify the needs of the customer and the provider, allowing

Tunisian Algerian Conference on Applied Computing (TACC 2022), December 13–14, 2022, Constantine, Algeria
$ noureddine.staifi@univ-constantine2.dz (N. Staifi); meriem.belguidoum@univ-constantine2.dz (M. Belguidoum)
 0000-0001-9965-9785 (N. Staifi); 0000-0002-2936-6810 (M. Belguidoum)
                                       © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
    CEUR
    Workshop
    Proceedings
                  http://ceur-ws.org
                  ISSN 1613-0073
                                       CEUR Workshop Proceedings (CEUR-WS.org)
each party to respect its commitment, and in case of conflict, it improves the understanding
aspect between these parties [3].
   SLAs play a key role in the deployment of services, where their specification and management
have become increasingly complex at the level of IoT applications. SLA specification is essential
to explicitly describe a prescribed service between a service provider and a consumer in terms
of required QoS, quantified and measurable expectations, consumer priorities, etc. [4]. Indeed,
this specification is a kind of guarantee and assurance for the consumer. On another side, SLA
management plays a vital role in establishing and maintaining a stable, reliable and measurable
business relationship between service provider and consumer, which presents several challenges
such as negotiation, monitoring, control, violation prediction, customisable management, etc.
[5].
   SLA violation is related to the assessment of the service QoS compliance with an SLA, it
concerns various relevant issues, such as reliability, availability, security and performance.
Traditional methods of managing and monitoring SLA violations work well at the level of
business services, such as cloud services. However, these methods cannot provide the desired
levels of security and reliability for critical systems, such as IoT applications. Indeed, these
applications require the consideration of some critical aspects, such as ubiquity, interconnectivity,
large-scale deployment, synchronization, massive data transfer, distribution and heterogeneity.
Moreover, the source of violation cannot be easily identified in the presence of multiple actors
such as consumers, Cloud and IoT providers, etc.
   Avoiding SLA violations requires early detection of potential risks. To reduce these situations,
service providers need tools to intuitively analyse whether their service design is causing SLA
violations, and to automatically guide them in their prevention. Several prediction strategies
have been developed, such as those that adopt Artificial Intelligence (AI) techniques, namely
Machine Learning (ML) and Deep Learning (DL), in which if the parameters approach agreed
limits, monitors should be alerted to take the necessary preventive measures.
   Promising approaches to service assurance and prediction of SLA violations are based on new
information and communication techniques, which have facilitated the task of predicting SLAs
[6]. Indeed, AI and its different techniques, such as ML and DL, appear as effective solutions to
face the challenges of violation prediction [7], where service quality and behavior are learned
from system observations, whose objective is to automate predictions in real time and in a
proactive manner. These techniques provide predictive models that exploit the data provided to
better anticipate breaches and contribute to operational efficiency.
   However, in this paper, we proposed ViolationPredictor, a Deep Learning (DL) based solution
for SLA violation prediction. ViolationPredictor generates a neural network for each obligation,
where each system is responsible to predict future violations of this obligation or SLO. We
implemented ViolationPredictor using recurrent neural networks (RNN) because they have a
memory that records processed information and can store and incorporate previous contextual
information into their choices. The dataset used by ViolationPredictor is a CSV file composed
of two columns: (1) the series of contextual data provided by the environmental sensors, and (2)
the decisions of the violations of these series.
   This paper is an extension of our previous work concerning the SLA specification and
management throughout their entire life cycle. The first phase was the proposal of ML-SLA-IoT
[8], a language for specifying multi-level SLAs for IoT applications. While the second phase
concerns the proposal of the solution SC-Generator [9], which presents a solution for monitoring
SLA obligations, that provides a way to monitor SLA terms by automatically generating Smart
Contracts from specified SLOs. These smart contracts are responsible for monitoring SLO
parameters, detecting violations, and notifying service providers. However, the present paper
enriches our work by detecting violations before their occurrence. SC-Generator plays a key
role in the ViolationPredictor solution, because it is responsible for providing the violation
decisions of the contextual datasets used in the dataset.
  The paper is organized as follows: Section 2 presents a review of related work including
their limitations. Section 3 presents our proposed approach. This approach is illustrated by an
evaluation and comparison in Section 4. Finally, in the last section, we conclude and present
some future work.


2. Related Work
This section discusses AI-based solutions for predicting SLA violations. There are several
proposals, we have limited ourselves to the most relevant researchs, such as Leitner et al. [7],
Hani et al. [10], Wong et al. [11], Hemmat et al. [12], Uriarte et al. [13], Biswas et al. [14], Tang
et al. [15] and Di et al. [16]
   Leitner et al. [7] provide a model for predicting SLA violations during runtime. In this
research, the model inputs could be the composition of the services or the quality of the services
used. A machine learning regression technique is then used to train data captured from historical
process instances.
   Hani et al. [10] propose a model that predicts SLA violations using SVM-based time series
analysis for regression. The prediction will learn from historical service level delivery data
captured by the monitoring system. This type of data forms sequential data points in spacetime,
called time series data. However, the limitation of this predictive model is its inability to scale
to inherently very large and volatile real-world data.
   Wong et al. [11] used five different machine learning algorithms such as SVM, Random
Forests, Naive Bayesian Classifier, Neural Network, and k-NN to predict SLA violations, so
corrective action can be taken. While other approaches can help a provider anticipate SLA
violations, they cannot help providers quantitatively assess QoS.
   Hemat et al. [12] conducted an experiment to overcome the challenge of predicting SLA
violations. According to these researchers, SLA violation is a rare real-world event that only
occurs 20 % of the time.
   Uriarte et al. [13] use an unsupervised formulation of the Random Forest algorithm to
calculate similarities and provide them as input to a Clustering algorithm, with the aim of
aggregating resource usage and service duration to avoid violations of the Google Cluster
Tracking data set.
   Biswas et al. [14] proposed an approach that anticipates future resource demand to meet
SLA requirements. They used enterprise-level SLAs (throughput and response time) as input
parameters for the chosen prediction approaches. ML techniques such as SVM and linear
regression were used.
               Table 1
               Comparison of AI-based solutions
          Work                 Domain                     MLA                      DLA                 DS            PM     PN    RO
     Leitner et al. [7]       services IT           linear regression               ✗              simulation         ✗      ✗     ✗
     Hani et al. [10]           Cloud                     SVM                       ✗              simulation         ✓     ✓      ✗
     Tang et al. [15]           Cloud             naive bayes classifier            ✗              simulation         ✓      ✗     ✗
      Di et al. [16]            Cloud             naive bayes classifier            ✗           Google data center    ✗      ✗     ✗
                                                                                                  Google Cloud
    Hemmat et al. [12]          Cloud                random forests                 ✗                                 ✗      ✗     ✗
                                                                                                     Cluster
    Biswas et al. [14]           IoT          SVM and linear regression             ✗              simulation         ✗      ✗     ✓
                                              SVM, random forests, naive
     Wong et al. [11]           Cloud                                         neural network       WS-DREAM           ✗      ✓     ✗
                                               bayes classifier and k-NN
                                                                                                  Google Cluster
    Uriarte et al. [13]         Cloud                random forests                 ✗                                 ✓      ✗     ✓
                                                                                                    Tracking

Note : MLA = Machine Learning Algorithms, DLA = Deep Learning Algorithms, DS = Data Source, PM = Parameter Monitoring, PN = Provider
Notification, RO = Resource Optimisation.


                  Several researchers use the Naive Bayes classifier. Tang et al. [15] provided an SLA violation
               prediction model, the training dataset is obtained from the WS-DREAM dataset, and only the
               response time is used as the value of hall. In the same context, Di et al. [16] proposed another
               Bayesian model for predicting host load using one-month tracking data collected by Google
               from thousands of machines running for up to 4 p.m. The predictive model uses CPU and
               memory as input metrics.
                  As shown in table 1, these proposals will be compared according to the following criteria:
                     • Domain: specifies the domain in which the solution is offered.
                     • Machine Learning Algorithms (MLA): This criterion indicates whether the solution
                       has adopted ML techniques.
                     • Deep Learning Algorithms (DLA): indicates whether the proposal considered DL.
                     • Data Source (DS): indicates the source from where the training and test data are collected.
                     • Parameter Monitoring (PM): shows whether the solution monitors QoS parameters.
                     • Provider Notification (PN): designates whether the proposal notifies the service provider
                       if a violation is predicted.
                     • Resource Optimisation (RO): indicates whether the solution optimises system resources
                       to avoid possible violations.


               3. ViolationPredictor: a solution for predicting SLA violations
               Different ML and DL techniques have been used to create predictive models for QoS assurance.
               Unlike previous work on predicting SLA violations, these models are trained on real dataset
               to provide effective solutions. The key idea is to use data samples to train a statistical model,
               which is then used for unseen data predictions.
Figure 1: Overview of the ViolationPredictor


   From the DL perspective, the SLA violation prediction problem is equivalent to a binary
classification problem, where there are two classes: class zero is the case of non-violated tasks
(violation = 0), while class one is the case of violated tasks (violation = 1).
   To do this, we proposed ViolationPredictor, a DL-based solution for predicting SLA violations.
It provides a means to predict future violations of SLA terms using neural networks. For each
obligation, ViolationPredictor generates a neural network, where each of these systems predicts
possible future violations of this obligation. Each generated neural network has as input a CSV
file, this file is composed of the captured data sequences and the decisions of the corresponding
violations, where these decisions are generated and provided by SC-Generator. Subsequently,
the neural network performs its prediction tasks to provide the decision on future predictions.
Figure 1 describes the ViolationPredictor overview.
   ViolationPredictor has three main phases; Firstly, the dataset retrieval step, which is applied
to retrieve the data that serve as inputs for the neural networks, this data is assembled into a
CSV file. The second phase is the learning stage, which incorporates a neural network model
to predict future violations. Finally, the result interpretation step consists of extracting and
visualizing the network outputs to prevent future SLA violations.
3.1. Data recovery
This phase is responsible for creating neural network inputs. For this, the captured data and
their violation decisions are assembled in a CSV file, where these decisions are calculated and
provided by SC-Generator. CSV files serve as inputs for neural networks. This is summarized
in the following steps:
    • Retrieve captured data: Data from sensors are retrieved and divided into a set of
      sequences.
    • Violation decision: for each data sequence, a violation decision is taken according to
      the concerned obligation parameters (the SLO threshold for example). This is achieved
      through the SC-Generator component which compares each data to the SLO parameters.
    • Creation of CSV files: through the following phases:
         – Create the CSV file by the instruction: new FileWriter ("dataset.csv");.
         – Generate the file header by the instruction: buffer.write ("Sequence, Violation");.
         – Fill the file with sequences and corresponding decisions.
Figure 2 illustrates a part of a resulting CSV file.


Figure 2: Example of a CSV file


3.2. Learning model
To implement ViolationPredictor, we chose recurrent neural networks (RNNs). The idea behind
the choice of RNNs is, on the one hand, the use and processing of sequential data, and, on the
other hand, RNNs are networks enclosing loops allowing information to persist. RNNs perform
the same task for each element of a sequence, and the output depends on previous computations,
in addition, they have a memory that captures the processed information, and they can retain
and consider old contextual information in their future decisions. In particular, we used LSTM
RNNs which overcome the difficulties encountered with standard RNNs.
   For each SLA obligation, a neural network is generated. Figure 3 illustrates the architecture
of each neural network which is composed of: an input layer, two LSTM layers, and an output
layer. Each network has as input a CSV file which is composed of captured data and violation
decisions provided by SC-Generator.
Figure 3: ViolationPredictor Neural Network Architecture


   To implement these neural networks, we used the Python language with its various libraries,
such as io (retrieving data), Numpy (manipulating matrices or multidimensional arrays), Pandas
(manipulate and analyze data), Seaborn and matplotlib (visualize data). Listing 1 presents the
code to create each neural network. We have created a neural network composed of two LSTM
layers, where each is composed of 10 neurons with a tensor of size (1,1), and an output layer,
which is a Dense layer with the activation function Sigmoid, we used this function because it
returns a value between 0 and 1, which represents the probability of violation occurrence.

1    model = S e q u e n t i a l ( )
2    model . add ( LSTM ( 1 0 , i n p u t _ s h a p e = ( 1 , 1 ) , r e t u r n _ s e q u e n c e s = True ) )
3    model . add ( LSTM ( 1 0 ) )
4    model . add ( Dense ( 1 , a c t i v a t i o n = ’ s i g m o i d ’ ) )


                                            Listing 1: Creation of neural network

3.3. Results interpretation
ViolationPredictor predicts future SLA violations. After training and running the neural network,
the results are provided. Network training is performed using the model.fit which takes as
parameters the dataset, the number of iterations, and the batch size (see Listing 2).

1    model . f i t ( X _ t r a i n , y _ t r a i n , e p o c h s = 3 5 , b a t c h _ s i z e = 1 )


                               Listing 2: Neural network training
    The precision metrics used are:
     • Loss: measures the error of the model, i.e. how correct the model is [17]. If the loss equals
       0, then the network performance is efficient. It is calculated by the following formula:
                                                 ∑︀𝑁
                                                       (𝑦 − 𝑦′)2
                                       𝐿𝑜𝑠𝑠 = 𝑖=1
                                                        𝑁
       (Where N = the set of values, y = the expected output, y′ = the produced output).
     • Accuracy: describes the performance of the model in all classes. It is useful when all
       classes are of equal importance [17]. The value of the precision must be equal to 1 to judge
       the proper functioning of the neural network. It is calculated as the ratio between the
       number of correct predictions and the total number of predictions through the following
       formula:
                                                      𝑇𝑃 + 𝑇𝑁
                                𝐴𝑐𝑐𝑢𝑟𝑎𝑐𝑦 =
                                               𝑇𝑃 + 𝑇𝑁 + 𝐹𝑃 + 𝐹𝑁
       (where TP = true positives, TN = true negatives, FP = false positives and FN = false
       negatives).
     • The Mean Squared Error - (MSE): measures the mean squared error, i.e. the mean
       squared difference between the estimated values and the value true [17]. To judge the
       proper functioning of the neural network, the value of the MSE must be equal to 0. It is
       calculated by the following formula:
                                                                        𝑁
                                                                 1 ∑︁
                                                          𝑀 𝑆𝐸 =     (𝑦𝑖 − 𝐹 (𝑥𝑖 ))2
                                                                 𝑁
                                                                        𝑖=1

        (Where N = the set of values, yi = the expected output, F(xi) = the produced output).

    To test the efficiency of the system and to have prediction results, we used the instruction
model.predict, which takes as parameters a set of test data, as presented in Listing 3.

1     p r i n t ( model . p r e d i c t ( t e s t _ d a t a s e t ) )


                                              Listing 3: Retrieving prediction results

4. Evaluation and Comparison
This phase describes the process of predicting possible violations of SLA obligations using
ViolationPredictor. As presented previously, this process involves three steps: retrieving all the
data, generating the learning model and interpreting the results. These steps are summarized
in Figure 4. The first step is to create the CSV file that includes the data sequences and their
violation decisions from SC-Generator. In the second step, the neural networks will be generated.
Finally, the prediction process begins, and the third step illustrates the result of an example
prediction.
   The precision metrics considered are loss, precision and MSE. As illustrated in figure 5, to
judge the proper functioning of the neural network, the values of the loss and the quadratic
error are equal to 0, and the value of the precision is equal to 1.
   As mentioned previously, the SLA management proposals have certain limitations, such
as (1) uncertain monitoring of the SLAs which determines whether the obligations are met,
(2) the lack of optimization of the resources used in the realization of the system, and (3) the
lack of combination of AI techniques for violation prediction. To do this, we have proposed
ViolationPredictor, it is a DL-based policy for predicting SLA violations. ViolationPredictor
Figure 4: Generation steps and prediction results


generates, for each obligation, a neural network, where each network can predict possible future
violations of this obligation.
   Figure 6 presents a comparison of our ViolationPredictor solution, with some solutions related
to our proposal. These solutions will be compared according to the criteria described in Table 2.
As shown in figure 6, no solution uses DL’s algorithms, and only our solution and the solution
of Wong et al. [11] help predict violations.


5. Conclusion
This paper aims to address the challenges of QoS management for IoT applications in general,
and SHS in particular. Thus, several objectives are defined beforehand allowing the flexible and
intelligent management of QoS in IoT applications. This paper presented ViolationPredictor,
a DL-based solution for SLA violation prediction. It provides a means of predicting future
SLA violations, based on neural networks. ViolationPredictor generates a neural network for
each obligation (SLO and Rule), where each network predicts possible future violations of this
obligation.
   In the context of our contributions, we identify several avenues that deserve to be explored
to complete and extend our work. Indeed, we can cite four main possible prospects, in the short,
          Figure 5: The considered precision metrics


          Table 2
          Description of the criteria for comparing SLA management work
                                                             Description of levels
   Criteria
                       Criterion not supported                                  Criterion supported
                                                                  Partially                                 Completely
  Parameter
                    no monitoring of QoS parameters    monitoring of some QoS parameters       monitoring of some QoS parameters
 monitoring
  Consumer         no consumer compensation for SLA
                                                                  consumer compensation in the event of SLA violations
compensation                  violations
Notification of
                        no provider notification         notification of some violations            notification of all violations
   provider
  Violation
                         no violation prediction          prediction of some violations             prediction of all violations
  prediction
Deep Learning       Deep Learning algorithms are not
                                                                      use of Deep Learning algorithms for prediction
 Algorithms                   considered


          medium and long term. In the short term, we aim to extend the ML-SLA-IoT Framework by the
          security and accessibility aspect. In addition, we plan to integrate the technique of Chatbots
          to assist and help residents in different contextual situations. In the medium term, we want
          to manage the renegotiation of the contract dynamically at the time of execution. In the long
          term, we plan to aggregate SLAs and services provided by a multitude of suppliers. Finally, it
          would be interesting to propose a recommendation system for the provision of services adapted
          and customizable to user profiles.
Figure 6: Comparison of ViolationPredictor


References
 [1] B. Jovanović, I. Vojinovic, D. J. Spajić, N. Cvetićanin, Fascinating IoT Statistics for 2021:
     The State of the Industry, Retrieved April (2021).
 [2] O. Benedito, R. Delgado-Gonzalo, V. Schiavoni, KeVlar-Tz: A Secure Cache for Arm
     TrustZone, in: IFIP International Conference on Distributed Applications and Interoperable
     Systems, Springer, 2021, pp. 109–124.
 [3] G. Gaillard, Opérer les réseaux de l’Internet des Objets à l’aide de contrats de qualité de
     service (Service Level Agreements), Ph.D. thesis, INSA Lyon, 2016.
 [4] M. Alhamad, T. Dillon, E. Chang, Conceptual SLA framework for cloud computing, in:
     2010 4th IEEE International Conference on Digital Ecosystems and Technologies, IEEE,
     2010, pp. 606–610.
 [5] A. V. Dastjerdi, R. Buyya, An autonomous time-dependent SLA negotiation strategy for
     cloud computing, The Computer Journal 58 (2015) 3202–3216.
 [6] Y. Kouki, Approche dirigée par les contrats de niveaux de service pour la gestion de
     l’élasticité du nuage, Ph.D. thesis, Nantes, Ecole des Mines, 2013.
 [7] P. Leitner, B. Wetzstein, F. Rosenberg, A. Michlmayr, S. Dustdar, F. Leymann, Runtime
     prediction of service level agreement violations for composite services, in: Service-oriented
     computing. ICSOC/ServiceWave 2009 workshops, Springer, 2009, pp. 176–186.
 [8] N. Staifi, M. Belguidoum, Multi-level sla specification language for iot applications., in:
     TACC, 2021, pp. 49–61.
 [9] S. Noureddine, B. Meriem, Ml-sla-iot: an sla specification and monitoring framework for
     iot applications, in: 2021 International Conference on Information Systems and Advanced
     Technologies (ICISAT), IEEE, 2021, pp. 1–12.
[10] A. F. M. Hani, I. V. Paputungan, M. F. Hassan, Support vector regression for service level
     agreement violation prediction, in: 2013 International Conference on Computer, Control,
     Informatics and Its Applications (IC3INA), IEEE, 2013, pp. 307–311.
[11] T.-S. Wong, G.-Y. Chan, F.-F. Chua, A machine learning model for detection and prediction
     of cloud quality of service violation, in: International Conference on Computational
     Science and Its Applications, Springer, 2018, pp. 498–513.
[12] R. A. Hemmat, A. Hafid, SLA violation prediction in cloud computing: A machine learning
     perspective, arXiv preprint arXiv:1611.10338 (2016).
[13] R. B. Uriarte, S. Tsaftaris, F. Tiezzi, Service clustering for autonomic clouds using random
     forest, in: 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid
     Computing, IEEE, 2015, pp. 515–524.
[14] N. K. Biswas, S. Banerjee, U. Biswas, U. Ghosh, An approach towards development of new
     linear regression prediction model for reduced energy consumption and SLA violation in
     the domain of green cloud computing, Sustainable Energy Technologies and Assessments
     45 (2021) 101087.
[15] B. Tang, M. Tang, Bayesian model-based prediction of service level agreement violations
     for cloud services, in: 2014 Theoretical Aspects of Software Engineering Conference, IEEE,
     2014, pp. 170–176.
[16] S. Di, D. Kondo, W. Cirne, Host load prediction in a Google compute cloud with a Bayesian
     model, in: SC’12: Proceedings of the International Conference on High Performance
     Computing, Networking, Storage and Analysis, IEEE, 2012, pp. 1–11.
[17] K. Janocha, W. M. Czarnecki, On loss functions for deep neural networks in classification,
     arXiv preprint arXiv:1702.05659 (2017).