<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Evaluating Load Adjusted Learning Strategies for Client Service Levels Prediction from Cloud-hosted Video Servers</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Obinna Izima</string-name>
          <email>Obinna.Izima@mydit.ie</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ruairi de Frein</string-name>
          <email>ruairi.defrein@dit.ie</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mark Davis</string-name>
          <email>mark.davis@dit.ie</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dublin Institute of Technology</institution>
          ,
          <country country="IE">Ireland</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Network managers that succeed in improving the accuracy of client video service-level predictions, where the video is deployed in a cloud infrastructure, will have the ability to deliver responsive, SLA-compliant service to their customers. Meeting up-time guarantees, achieving rapid first-call resolution, and minimizing time-to-recovery after video service outages will maintain customer loyalty. To date, regression-based models have been applied to generate these predictions for client machines using the kernel metrics of a server cluster. The effect of time-varying loads on cloud-hosted video servers, which arises due to dynamic user requests, has not been leveraged to improve prediction using regularized learning algorithms such as the LASSO and Elastic Net, or using Random Forest. We evaluate the performance of load-adjusted learning strategies using a number of learning algorithms and demonstrate that improved predictions are achieved irrespective of the learning approach. A secondary benefit of the load-adjusted learning approach is that it reduces the computational cost, as long as the load is not constant. Finally, we demonstrate that Random Forest significantly improves on the prediction performance of the best performing linear regression variant, the Elastic Net.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Streaming video content over wired and wireless communication networks will
be a major contributor to future internet traffic, as can be inferred from [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], which
predicts that global IP video traffic will account for about 82 percent of all
consumer traffic by 2021. To safeguard revenues, network providers must be able
to proactively monitor customer experience. The authors of [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] adopt a Machine
Learning (ML) approach in their work on video service-level prediction using
the kernel metrics of the server delivering the video, which is a first step towards meeting
this goal.
      </p>
      <p>Fig. 1 depicts the set-up of the system under study in this paper, which is
representative of real-world systems. It is composed of a cloud-based server
infrastructure servicing dynamic, time-varying client requests received over
a network. Server resources are shared between multiple clients, with the video
service (RHS) delivering video to target client machines (LHS). The number of
users accessing the server resources changes rapidly, and this poses a challenge in
predicting the target client's video quality. A video server of this form must be
able to handle time-varying loads, as users can start and stop videos at arbitrary
times simultaneously.</p>
      <p>
        We seek a model that characterizes the effect of the time-varying loads on
the client's video quality, provided we have knowledge of the kernel metrics of
the server delivering the video. The client machine and the server clocks
are synchronized to match up observations. Samples are drawn from the client
and server machines every second. A VLC media player services the
Video-on-Demand (VoD) requests in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and extracts the RTP packet rate, Video Frame
Rate (VFR) and Audio Buffer Rate (ABR), yi, at time i. RTP, VFR and ABR are
the client's service-level metrics we seek to predict. The System Activity Report
(SAR) function on the server extracts the feature set, x. Here, features are
operating-system metrics, for example, the number of active TCP connections
on the server. The authors of [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] characterize the system we seek to investigate;
their investigation examines the effect of time-varying requests on the
system resources.
      </p>
      <p>
        We contribute adaptive learning techniques which reduce the computational
complexity and provide more accurate predictions than the baseline approach in
[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
1. We demonstrate this by considering the performance of linear regression
methods and non-linear methods using the baseline approach. The linear
methods we evaluate are Linear Regression (LR) and members of the
family of shrinkage methods: Ridge Regression (RR), LASSO and Elastic
Net. We compare the performance of the best performing linear method
with a non-linear method, Random Forest;
2. We evaluate the efficacy of load-adjusted (LA) learning (i.e. using the
TCP socket count in our learning algorithms to improve performance) on
two different traces, which vary periodically and according to a flashcrowd
behaviour;
3. We determine whether the load-adjusted technique works better in linear or
non-linear learning algorithms.
      </p>
      <p>This paper is organized as follows. In Section 2, we introduce the load-adjusted
learning technique and the Machine Learning (ML) techniques. Section 3
introduces the model fitting procedures and the evaluation framework. In Section 4
we evaluate the efficacy of each of the approaches. Section 5 places our
contribution in the context of related literature, and we conclude our work in Section 6.</p>
    </sec>
    <sec id="sec-2">
      <title>Learning Strategies</title>
      <p>The server in Fig. 1 collects device statistics, x, using SAR. The number of active
clients, the load signal at time i, can be measured with the TCPSCK feature of
x. A load generator dynamically allocates client requests for video to the server
under two load patterns, a periodic-load pattern and a flashcrowd-load pattern.
In the periodic-load pattern, clients are started following a Poisson process with
an average arrival rate of 30 clients per minute. This arrival rate is modulated
by a sinusoidal function with a period of an hour and an amplitude of 20
clients. The flashcrowd-load pattern starts clients with a Poisson process with a 5
clients per minute average arrival rate and peaks at randomly created events at
a rate of 10 events per hour. During flash events, the average arrival rate jumps
to 50 clients per minute for a minute and then gradually reduces to 5 clients per
minute over the next 4 minutes.</p>
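      <p>The two load patterns can be summarized as time-varying arrival-rate functions. The following Python sketch illustrates them under stated assumptions: the testbed's actual load generator is not given in code form, and the linear decay of the flash rate back to the baseline is our assumption.</p>

```python
import math

def periodic_rate(t_sec):
    """Arrival rate (clients/minute) for the periodic-load pattern:
    a 30 clients/min average modulated by a sinusoid with a
    one-hour period and an amplitude of 20 clients."""
    return 30.0 + 20.0 * math.sin(2.0 * math.pi * t_sec / 3600.0)

def flashcrowd_rate(t_sec, flash_starts):
    """Arrival rate (clients/minute) for the flashcrowd-load pattern:
    5 clients/min baseline; during a flash event the rate jumps to
    50 clients/min for one minute, then falls back to 5 clients/min
    over the next four minutes (a linear decay is assumed here)."""
    rate = 5.0
    for s in flash_starts:
        dt = t_sec - s
        if dt >= 0.0 and 60.0 > dt:       # first minute of the flash
            rate = max(rate, 50.0)
        elif dt >= 60.0 and 300.0 > dt:   # assumed linear decay over 4 min
            rate = max(rate, 50.0 - 45.0 * (dt - 60.0) / 240.0)
    return rate
```

      <p>Feeding these rates into a Poisson arrival generator would reproduce the qualitative behaviour of the two traces.</p>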
      <p>Using the device statistics computed from SAR, service-level metrics can be
computed at the clients. In this paper, we are interested in predicting the (i)
Video Frame Rate (VFR), the number of displayed video frames at time
i; (ii) Audio Buffer Rate (ABR), the number of audio buffers played at time i; and (iii) RTP
packet count, the number of received RTP packets at time i. Fig. 2 illustrates a
plot of the RTP packet count and the TCPSCK count, the load proxy, recorded
over a period of 15000 seconds for both load patterns. As can be inferred from
Fig. 2, the TCPSCK count rises with an increase in load, and this may result in
a reduction in the RTP packets received at the client because the system has
limited resources.</p>
      <sec id="sec-2-1">
        <title>Load-adjusted Learning (LA)</title>
        <p>The effect of concurrent requests on the server kernel is examined. In Fig. 2,
we plot the RTP packet count recorded over 15000 seconds and the TCPSCK
kernel parameter over the same period of time. From visual inspection of the
plot, the RTP packet count has the same periodic pattern as the TCPSCK. It
is also evident that as the load increases, the TCPSCK count increases and may
lead to a decrease in the RTP packet count at the client because the
system does not have unlimited resources.</p>
        <p>
          Load-adjusted model: Let θ_n represent the amount of the n-th resource one video client currently
uses. The response of the server's n-th kernel feature to a single request for a video at time i is the sum of the resource one user uses
and a deviation signal specific to that feature [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]:
x_i[n] = θ_n + ε_i[n; 1],   (1)
where i ∈ ℤ and x_i[n], θ_n ∈ ℝ. An additional request for resources, by the current user or a new client, would
invoke a feature response of the form
x_i[n] = 2θ_n + ε_i[n; 1] + ε_i[n; 2],   (2)
where ε_i[n; 2] denotes the deviation from the ideal performance arising from the second user.
Assume that at any time i the number of users requesting the service
is K[i]; for example, when there are five client requests for server resources, K[i] = 5. The
response of the n-th feature to the time-varying load is then
x_i[n] = θ_n K[i] + Σ_{k=1}^{K[i]} ε_i[n; k].   (3)
The load signal θ_n K[i] is the number of active users at time i times the
resources one user uses, θ_n. The TCPSCK count serves as K[i].</p>
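        <p>The feature-response model above can be simulated directly. A minimal sketch, assuming zero-mean Gaussian deviation terms (the distribution of the ε_i[n; k] is not specified in the text):</p>

```python
import random

def feature_response(theta_n, K_i, noise_scale=0.1, rng=None):
    """Response of the n-th kernel feature to K_i concurrent requests:
    x_i[n] = theta_n * K_i + the sum of K_i per-request deviation
    terms (Gaussian deviations are an illustrative assumption)."""
    rng = rng or random.Random(0)
    deviations = sum(rng.gauss(0.0, noise_scale) for _ in range(K_i))
    return theta_n * K_i + deviations
```

      <p>With the deviation scale set to zero, the response reduces to the pure load signal θ_n K[i].</p>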
        <p>Un-adjusted Learning (UA): Previous attempts at predicting service-level
metrics from device statistics do not model the effect of the time-varying load;
they implicitly assume that K[i] is a constant, C.
Problem Statement: Our objective is to learn a model that predicts
service-level QoS metrics, y: RTP packet rates, video frame rate and audio buffer rate,
using the features x_i[n] given a time-varying load K[i]. Using both the
flashcrowd-load and periodic-load traces, we test the hypothesis that a learning algorithm
which takes the load value into consideration produces better predictions than
models obtained using algorithms which ignore the load.</p>
      </sec>
      <sec id="sec-2-2">
        <title>Machine Learning Techniques</title>
        <p>In this section, we briefly introduce the different ML techniques we adopted for
our experiments.</p>
        <p>Linear Regression: We start with Linear Regression (LR), a baseline for many
ML techniques. LR models a linear relationship between the metrics we want
to predict, y, the dependent variables, and the independent predictor variables,
x, as a linear function of the form
ŷ_i = Σ_{n=1}^{N} x_i[n] θ_n,   (4)
where x_i[1] represents the intercept and the remaining features represent the
feature space of the predictors. The variable ŷ_i represents the estimate of y_i.
The model coefficients, which we use for prediction, are θ_n, where n = 1, ..., N.
LR computes the coefficients that minimize the residual sum of squares (RSS).
Ridge Regression (RR): In a second approach, we use Ridge Regression (RR),
a variant of LR which imposes an ℓ2-norm penalty on the coefficients
and maintains a small amount of energy in each coefficient [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. RR seeks the model
coefficients that best suit the data by minimizing the RSS of the LR equation
with the addition of a regularization term
λ Σ_n θ_n², where λ ≥ 0.   (5)
Least Absolute Shrinkage and Selection Operator (LASSO): We then
apply the LASSO, another LR variant, which imposes an ℓ1-norm penalty on the regression
coefficients. Similar to RR, the LASSO solves the LR model with the addition
of a regularization term λ Σ_n |θ_n|, where λ ≥ 0. In contrast to RR, the
ℓ1-norm in the LASSO performs a form of automatic variable selection and continuous
shrinkage by forcing some model coefficients to zero, effectively turning off
some features. The feature space we examine is high dimensional, and the
LASSO is known to obtain sparse linear models in such cases [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ].
Elastic Net (EN): We apply the Elastic Net (EN) model owing to its ability
to perform shrinkage and variable selection just like the LASSO. The
LASSO is known to suffer from poor prediction accuracy when there are high correlations
between the features, especially in cases where the number of observations is
greater than the dimension of the feature space, as is the case in our data set [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. The EN combines
the LASSO and RR penalties, that is, it applies a mixture of ℓ1-norm and ℓ2-norm
penalties on the coefficients. The EN automates the choice of the regularization
parameters and produces sparse solutions just like the LASSO. However, the EN
tends to select more features than the LASSO does, as the EN overcomes the
grouping-effect limitation of the LASSO: the
LASSO tends to select only one feature from a group of features with high
pairwise correlations.
        </p>
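        <p>The ℓ2 and ℓ1 penalties above can be made concrete with a short sketch. The paper fits these models with glmnet in R; the numpy fragment below shows the closed-form ridge solution and the soft-thresholding operator that coordinate-descent LASSO/EN solvers apply per coefficient. It is an illustration, not the authors' implementation.</p>

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge solution: minimizes RSS + lam * sum(theta**2)
    by solving (X'X + lam*I) theta = X'y."""
    A = X.T @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ y)

def soft_threshold(z, lam):
    """Soft-thresholding operator used by coordinate descent for the
    l1 penalty: shrinks z toward zero and forces it exactly to zero
    when abs(z) does not exceed lam, which is how the LASSO and EN
    switch features off."""
    return np.sign(z) * np.maximum(abs(z) - lam, 0.0)
```

      <p>Note how growing λ in ridge_fit shrinks every coefficient smoothly, while soft_threshold zeroes small coefficients outright; the EN mixes the two effects.</p>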
        <p>
          Random Forest (RF): We apply a non-linear method, the Random Forest
algorithm, an ensemble method which builds multiple decision trees
and consolidates their results to obtain stable and more accurate predictions. In
simple terms, the RF estimates ŷ for the metric y using the average of the predictions
from a large number of regression trees [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ].
        </p>
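        <p>The RF prediction rule just described, averaging over trees grown on bootstrap samples, can be sketched as follows. Each 'tree' is abstracted here as a callable; this illustrates only the resampling and averaging steps, not a full tree-growing implementation.</p>

```python
import random

def bootstrap_sample(data, rng):
    """Draw a bootstrap sample (sampling with replacement, same size
    as the data), as used when growing each tree of the forest."""
    return [rng.choice(data) for _ in data]

def forest_predict(trees, x):
    """Random Forest regression estimate: the average of the
    predictions of the individual trees."""
    preds = [tree(x) for tree in trees]
    return sum(preds) / len(preds)

# Each tree would be trained on its own bootstrap sample, e.g.:
rng = random.Random(0)
sample = bootstrap_sample([1.0, 2.0, 3.0], rng)
```
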
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Experiments, Model Fitting and Evaluation Procedure</title>
      <p>
        We perform model computations using the traces made publicly available in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
All four evaluation frameworks were implemented in RStudio [version1.1.423].
The traces we utilize for our experiments are the periodic-load pattern and
ashcrowd-load pattern. The periodic-load contains 51043 observations for 297
features while the ashcrowd-load pattern has 275 features with 15150
observations. We start by pre-processing the data sets to remove all non-numeric and
constant value features. Using the ML techniques above we learn models to
predict the service-level metrics ABR, VFR and RTP using the device statistics
with the un-adjusted method. We perform two sets of experiments. One set of
experiments with the periodic-load trace and the second set of experiments with
the ashcrowd-load trace.
      </p>
      <p>
        Un-Adjusted Method (UA): For the UA learning, we adopt the technique
used in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], we generate the train and test data using any sample from the data
regardless of the load value. We also adopt the validation set approach [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] for
all model building and evaluation with 60% of the trace in the training set and
40% in the testing set. The 60/40 split was done for both traces with 60% set
aside for training the models and 40% for test data and prediction. Using the
UA approach, we learn LR, RR, LASSO and EN models for the service-level
metrics.
      </p>
      <p>The regularization techniques RR, LASSO and EN require a method for
selecting the regularization parameter, λ, for the penalty function. RR, LASSO
and EN use an ℓ2-norm, an ℓ1-norm and a combination of both norms, respectively, as a penalty term weighted
by λ. The entire regularization path for these models was
calculated using path-wise cyclical coordinate descent algorithms. Computationally
efficient and effective approaches for solving these convex optimization
problems were implemented using the glmnet package in R.</p>
      <p>To obtain the value of λ, we employed a 10-fold cross-validation (CV) approach
for both learning approaches. This value was used in subsequent learning and
prediction experiments. The 10-fold CV was implemented for training the models
and during testing using both traces. Different values of λ were determined for
the UA and LA algorithms. A sequence of λ values between 0.0001 and 1 was
selected, and cross-validation was applied to select optimal λ values for the regularized
models. EN outperformed the other three linear methods evaluated and was
adopted for our model performance comparison between the LA and UA models.
Load-Adjusted Method (LA): We obtain subsets of the entire data set for
which the load value, the TCPSCK, is fixed and has more than 500 samples.
To ensure that we have enough data to split between train and test samples,
we use the top subsets with the most samples in them for the LA learning.
Using the validation set approach, we divide the traces into two: 60% of the
trace in the training set and 40% in the test set. We apply the same
cross-validation procedures used for the UA models to find the best λ values for the
EN algorithm using the LA method. We learn EN models for each subset of the
data based on the TCPSCK value. We then apply EN to the traces to learn
UA prediction models for VFR, ABR and RTP packet rate. We refer to these
models as Un-adjusted Elastic Net (UA-EN) models. We learn Load-adjusted
Elastic Net (LA-EN) models for our service-level metrics with the same number
of samples as was used for the UA-EN models.</p>
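      <p>The load-adjusted subsetting step described above can be sketched as follows: group samples by their TCPSCK value, keep only load levels with more than 500 samples, and split each subset 60/40 for training and testing. Function names here are illustrative.</p>

```python
from collections import defaultdict

def load_adjusted_subsets(samples, loads, min_samples=500):
    """Group (feature, target) samples by their TCPSCK load value and
    keep only the load levels with more than min_samples
    observations; a separate model is then trained per subset."""
    groups = defaultdict(list)
    for sample, load in zip(samples, loads):
        groups[load].append(sample)
    return {k: v for k, v in groups.items() if len(v) > min_samples}

def split_60_40(subset):
    """Validation-set approach: first 60% for training, 40% for test."""
    cut = int(0.6 * len(subset))
    return subset[:cut], subset[cut:]
```
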
      <p>Using the same samples used for the UA-EN and LA-EN models, we then
apply Random Forest to learn Load-adjusted Random Forest (LA-RF) and
Un-adjusted Random Forest (UA-RF) models for the service-level metrics.</p>
      <p>All models are evaluated in terms of two accuracy measures. The first is the
Root Mean Squared Error (RMSE), computed as √((1/n) Σ_{i=1}^{n} (y_i − ŷ_i)²). The best
model is the model with the lowest RMSE. The second measure is the R-squared,
a statistical measure of the goodness of fit of our
regression models. R-squared achieves this by comparing our regression models
with a baseline model, one which simply predicts the average of the observed responses
of the dependent feature. The R-squared is computed as R² = 1 − SSE/SST, where
the Sum of Squared Errors (SSE) of our model is computed as Σ_{i=1}^{n} (y_i − ŷ_i)²
and the Sum of Squared Total (SST) of the baseline model is computed as
Σ_{i=1}^{n} (y_i − ȳ)². The model with the highest R-squared value is the best, and a
model with an R-squared value of 1 is a perfect model. We report only the test
RMSE and R-squared values.</p>
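      <p>The two accuracy measures above translate directly into code; a minimal transcription of the formulas:</p>

```python
import math

def rmse(y_true, y_pred):
    """Root Mean Squared Error: the square root of the mean squared
    residual between observed and predicted responses."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """R-squared: 1 - SSE/SST, comparing the model against a baseline
    that always predicts the mean of the observed responses."""
    mean = sum(y_true) / len(y_true)
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    sst = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - sse / sst
```
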
    </sec>
    <sec id="sec-4">
      <title>Results</title>
      <p>We present three results: (1) we compare the performance of the four linear
models we examined using the UA approach; (2) we compare the performance
of the best performing linear model with the non-linear model, the Random
Forest algorithm; (3) finally, we compare the performance of our LA models
with the UA approach.</p>
      <p>UA Linear Models: Table 2 lists the performance of the linear models LR, RR,
LASSO and EN on the test data for both the Periodic-load and
Flashcrowd-load traces using the UA technique.
1. The EN gives the best result of all four linear models for both traces. The
RMSE for all three predicted metrics is lowest for the EN, and the R-squared
is highest for the EN models in both traces (bold font in the table).
2. The EN performance is closely matched by the LASSO and the LR for both
traces across all three metrics. The EN offers the best prediction accuracy
due to its ability to overcome the limitations of the LASSO by automatically
tuning the loss function based on the data. The results indicate that the EN
does well in both traces but offers lower RMSE values for the VFR
and ABR using the Flashcrowd trace.
3. The RR algorithm offers the worst predictions across all three metrics for
both traces.</p>
      <p>[Table 2: test RMSE and R-squared of the un-adjusted LR, RR, LASSO, EN and Random Forest models for the Periodic-load and Flashcrowd-load traces.]</p>
      <sec id="sec-4-5">
        <title>Non-linear</title>
        <p>4. The performance of the Random Forest algorithm using the UA approach
is listed in Table 2 for both load traces. The RF algorithm offers a large
improvement in RMSE and R-squared over the EN. The RF performance
gain over the EN indicates that non-linear methods perform significantly better
than linear models on this data.</p>
        <p>Comparison of LA and UA: We have listed results for the EN and RF models
learned using the LA and UA approaches in Table 3.
1. The LA models for EN and RF outperform the UA models for both the
Periodic-load and Flashcrowd-load traces across all three service-level
metrics. The LA-EN estimates are over 5 audio buffers/second better than
the UA-EN estimates in both load traces; the LA-EN estimates for VFR and RTP
indicate similar improvements in both load traces.
2. The RF algorithm offers better RMSE and R-squared values for the LA
technique than the LA-EN; the LA-RF shows a large improvement in estimates.
We compare the average RTP prediction to illustrate what the RMSE values
imply. For instance, the LA-RF estimates for the RTP using the Flashcrowd
trace lie between 90 and 449 RTP packets/second. The
UA-RF estimates for the same trace lie between 8 and 347 RTP
packets/second. True RTP values lie between 83 and 545 RTP packets/second.
Expressed in percentages, the average improvement in prediction performance
is 50% to 60% for LA-RF learning.
3. The LA models in the Flashcrowd-load trace offer better prediction metrics
than in the Periodic-load trace. Fig. 3 illustrates the accuracy of the LA predictions
of RTP packets/second for both traces, for comparison with the UA
predictions.</p>
        <p>[Fig. 3: true versus predicted RTP packets/second over time for the LA-EN, UA-EN, LA-RF and UA-RF models on the Flashcrowd and Periodic traces.]</p>
        <p>
Yanggratoke et al. in [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] applied Machine Learning using a UA approach for
service-level prediction from cloud-hosted device statistics. We have
demonstrated that it is possible to improve the accuracy of the predictions achieved in
[
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. Similarly, the authors of [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] investigated the problem of service-level
estimation using ML for another cloud hosted service, Voldemort, a Key-Value store.
We posit that our LA approach may also work in this scenario.
        </p>
        <p>
          The authors of [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] applied a signal processing approach for prediction of
service-level metrics from cloud-based device statistics. In their work, the
authors developed an initial system load model to aid subsequent service-level
prediction. This technique is called load-adjusted learning. It provides the
foundation for the approach undertaken in this paper. The load-adjusted technique
trains prediction weights conditional on the load value. The work in [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] was
limited to regression models. We have also demonstrated that Random Forest
models give better predictions when load adjusted.
        </p>
        <p>
          Our results are of relevance to networking professionals. Our load adjusted
approach is computationally cheap than UA learning. We consider subset of the
data based on the load value which improves prediction accuracy and reduces
computation. From the perspective of the network service provider, the authors
of [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] evaluated monitoring of the quality of a compressed video transmitted in
a lossy packet network using bitstream measurements only. In their work, they
adopted the Mean Squared Error (MSE) as an estimation of the video quality.
They examined three di erent techniques for MSE estimation NoParse (NP),
QuickParse (QP) and FullParse (FP). The FP method extracts detailed
information regarding e ects of packet loss on the video; the QP method is only
concerned with extracting high-level details about the video bitstream quality
and as a result requires less computational time than the FP. The NP method
estimates the MSE based using network-level measurements only. They concluded
that the FP was the most accurate of the methods examined. In a practical
network system spanning multiple Internet Service Providers over a broad
geographical area, there may be instances when there are no available measurements
for in-network video quality estimation except for the packet loss rate and
bitrate; in such cases, the NP could be a handy tool. However, our LA approach
using readily available device statistics of the server(s) delivering the video
resources can learn the client video without any detailed knowledge of the system
or the video.
        </p>
        <p>
          There is significant momentum behind the concept of Software Defined
Networks (SDN), which will lay the foundation for our future work, particularly
how we will deploy our learning engine. The authors of [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] proposed an
approach to measure different video Quality of Experience (QoE) metrics running
on clients' devices in order to improve QoE. They also explored the possibility of
dynamic routing of requests, or designation of the best available delivery node, based
on predetermined network conditions. Using a light-weight plugin they created
for an HTML5 video player, they were able to monitor various QoE factors (e.g.
buffering state and video resolution at the target). With these they were able to
analyze user-perceived experience while the video is streaming. This setup
points us towards how we might extend our testbed.
        </p>
        <p>The LA approach achieves these performance improvements while making no additional assumptions about
the data. We will investigate how weakened forms of the independence
assumptions made by these models can improve prediction.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>We introduced a method for improving predictions of service-level metrics using
a load-adjusted learning technique. We provided evidence that the EN algorithm
provides the best prediction performance among the linear regression variants
using the baseline UA approach. We also presented evidence which shows that
LA learning improves on the UA prediction performance for all three
metrics under study. We further demonstrated that the Random Forest
predictions outperform the EN estimates using the load-adjusted approach. The LA
method offers significant improvements in prediction accuracy and reduces
the computational requirements of the system delivering the resources.
Acknowledgement. This publication has emanated from research conducted
with the financial support of Science Foundation Ireland (SFI) under Grant
Number 15/SIRG/3459.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <source>Cisco Visual Networking Index: Global Mobile Data Traffic Forecast Update, 2016-2021</source>
          . White Paper, Cisco Systems,
          <year>2017</year>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>R.</given-names>
            <surname>Yanggratoke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ahmed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ardelius</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Flinta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Johnsson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Gillblad</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Stadler</surname>
          </string-name>
          .
          <article-title>Predicting real-time service-level metrics from device statistics</article-title>
          .
          <source>In IFIP/IEEE Int. Sym. on Int. Net. Man. (IM)</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3. R. de Frein.
          <article-title>E ect of system load on video service metrics</article-title>
          .
          <source>IEEE Irish Signals &amp; Systems Conference</source>
          , pages
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4. R. de Frein.
          <article-title>Take o a load: Load-adjusted video quality prediction and measurement</article-title>
          .
          <source>In IEEE Inter. Conf. on Comp. and IT</source>
          , pages 1886-1894, Oct
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>T.</given-names>
            <surname>Hastie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Tibshirani</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Friedman</surname>
          </string-name>
          .
          <source>The Elements of Statistical Learning</source>
          . Springer New York Inc., New York, USA,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>R.</given-names>
            <surname>Tibshirani</surname>
          </string-name>
          .
          <article-title>Regression Shrinkage and Selection Via the LASSO</article-title>
          .
          <source>J. Roy. Stat. Soc. Series B (Methodological)</source>
          ,
          <volume>58</volume>
          (
          <issue>1</issue>
          ):
          <fpage>267</fpage>
          –
          <lpage>288</lpage>
          ,
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>H.</given-names>
            <surname>Zou</surname>
          </string-name>
          and
          <string-name>
            <given-names>T.</given-names>
            <surname>Hastie</surname>
          </string-name>
          .
          <article-title>Regularization and variable selection via the Elastic-Net</article-title>
          .
          <source>J. Roy. Stat. Soc.</source>
          ,
          <volume>67</volume>
          (
          <issue>2</issue>
          ):
          <fpage>301</fpage>
          –
          <lpage>320</lpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>L.</given-names>
            <surname>Breiman</surname>
          </string-name>
          .
          <article-title>Random forests</article-title>
          .
          <source>Machine Learning</source>
          ,
          <volume>45</volume>
          (
          <issue>1</issue>
          ):
          <fpage>5</fpage>
          –
          <lpage>32</lpage>
          , Oct.
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>R.</given-names>
            <surname>Stadler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Pasquini</surname>
          </string-name>
          , and
          <string-name>
            <given-names>V.</given-names>
            <surname>Fodor</surname>
          </string-name>
          .
          <article-title>Learning from network device statistics</article-title>
          .
          <source>J. Netw. Syst. Manage.</source>
          ,
          <volume>25</volume>
          (
          <issue>4</issue>
          ):
          <fpage>672</fpage>
          –
          <lpage>698</lpage>
          , Oct.
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>R.</given-names>
            <surname>de Frein</surname>
          </string-name>
          .
          <article-title>Source separation approach to video quality prediction in computer networks</article-title>
          .
          <source>IEEE Comm. Lett.</source>
          ,
          <volume>20</volume>
          (
          <issue>7</issue>
          ):
          <fpage>1333</fpage>
          –
          <lpage>1336</lpage>
          , Jul.
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>A. R.</given-names>
            <surname>Reibman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. A.</given-names>
            <surname>Vaishampayan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Sermadevi</surname>
          </string-name>
          .
          <article-title>Quality monitoring of video over a packet network</article-title>
          .
          <source>IEEE Trans. on Multim.</source>
          ,
          <volume>6</volume>
          (
          <issue>2</issue>
          ):
          <fpage>327</fpage>
          –
          <lpage>334</lpage>
          , Apr.
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>H.</given-names>
            <surname>Nam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. Y.</given-names>
            <surname>Kim</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H.</given-names>
            <surname>Schulzrinne</surname>
          </string-name>
          .
          <article-title>Towards QoE-aware video streaming using SDN</article-title>
          .
          <source>In 2014 IEEE Globecom</source>
          , pages
          <fpage>1317</fpage>
          –
          <lpage>1322</lpage>
          , Dec.
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>