<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Modelling the Multi-Layer Artificial Neural Network for Internet Traffic Forecasting: The Model Selection Design Issues</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Mba O. Odim</string-name>
          <email>odimm@run.edu.ng</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jacob A. Gbadeyan</string-name>
          <email>jagbadeyan@unilorin.edu.ng</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Joseph S. Sadiku</string-name>
          <email>jssadiku@unilorin.edu.ng</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Computer Science Department, Redeemer's University</institution>
          ,
          <addr-line>Ede</addr-line>
          ,
          <country country="NG">Nigeria</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Computer Science Department, University of Ilorin</institution>
          ,
          <addr-line>Ilorin</addr-line>
          ,
          <country country="NG">Nigeria</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Mathematics Department, University of Ilorin</institution>
          ,
          <addr-line>Ilorin</addr-line>
          ,
          <country country="NG">Nigeria</country>
        </aff>
      </contrib-group>
      <fpage>10</fpage>
      <lpage>16</lpage>
      <abstract>
        <p>Internet traffic forecasting models with learning ability, such as the artificial neural network (ANN), have been growing in popularity in recent times due to their impressive performance in modelling the high degree of variability and nonlinearity of internet traffic. This study examined the impact of some design issues on the performance of the multi-layer artificial neural network for internet traffic forecasting. The traffic forecasting was modelled as a standard time series problem and the multilayer artificial neural network was designed to perform the time series function mapping. The input lags were varied from 1 to 24. Training epoch values of 200, 500, and 1000 were used on networks with one and two hidden layers. The learning algorithm was backpropagation with a learning rate of 0.1 and a momentum of 0.9, using the logistic sigmoid activation function. The model was implemented in Visual Basic and validated with four categories of classified time series internet traffic from a branch residential network of a firm in Nigeria. Predictive performance varied across the issues considered without a consistent pattern; however, input lag 1 gave the worst performance in all cases for the HOURLY traffic, and three of the four traffic categories demonstrated the superiority of two hidden layers over one hidden layer. Although the epoch values of 200, 500 and 1000 showed no consistent performance variations, epoch value 200 outperformed the others on the model selections. The study revealed that input lags, the number of hidden layers and epoch values can affect the traffic forecasting performance of the multilayer perceptron, and that performance can be considerably improved by careful selection of those parameters through experimentation.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>CCS Concepts</title>
      <p>• Computing Methodologies → Artificial Intelligence → Machine learning → Machine learning approaches → Neural Networks</p>
    </sec>
    <sec id="sec-3">
      <sec id="sec-3-1">
        <title>1. INTRODUCTION</title>
        <p>Accurate information about offered traffic is required for efficient
resource provisioning and general capacity planning of an Internet
service. The inability of most statistical methods to model the high
variability of internet traffic accurately, and their lack of reasoning
capabilities, have triggered an increasing number of studies that employ
non-traditional statistical methods, including machine learning.
Furthermore, traditional summary statistics, particularly the sample mean
and variance, are unstable metrics when working with the high variability
of internet traffic; as such, they are not reliable statistics for
summarising traffic properties [1]. Machine learning techniques, such as
the artificial neural network (ANN), employ mechanisms that allow
computers to evolve behaviour based on knowledge gained from dynamic
observations. A machine learning technique based on nonlinear elements is
often referred to as a neural network. Neural networks are networks of
nonlinear elements interconnected through adjustable weights, and they
play a prominent role in machine learning. Artificial neural networks
emerged with the aim of imitating the information processing of the human
brain. Through learning, ANNs can determine nonlinear relationships in a
data set by associating the corresponding outputs with input patterns.
The multilayer artificial neural network, among other machine learning
models, has shown impressive results in forecasting studies [2, 3, 4, 5,
6, 7, 8, 9]. However, applying an ANN to a given forecasting endeavour is
a hard task, as basic modelling issues must be carefully considered for
enhanced precision. The issues include the network architecture, learning
parameters and data pre-processing methods [6, 8]. The inconsistencies in
performance reports on the design issues in the literature were also
noted in [8]. In [9] it was argued that the ANN technique should not be
applied arbitrarily, as has sometimes been suggested and even done in the
internet forecasting domain [10, 11].</p>
        <p>This paper examined the impact of the number of input lags, hidden
neurons and training epochs on the precision of the multilayer
artificial neural network in forecasting internet traffic.</p>
      </sec>
      <sec id="sec-3-2">
        <title>2. RELATED WORK</title>
        <p>A number of research efforts have been reported in the
literature on seeking appropriate models for forecasting Internet
traffic.</p>
      </sec>
      <sec id="sec-3-5">
        <title>2.1 Internet Traffic Forecasting: Statistical Methods</title>
        <p>In [12], a comparative study of suitable statistical methods for
network traffic estimation was conducted, covering several estimation
methods for IP network traffic. The study showed that non-linear time
series models could model and forecast better than the classical linear
time series models. Anand [13] investigated a non-linear time series
model, the Generalised Autoregressive Conditional Heteroskedasticity
(GARCH) model, in internet traffic modelling. The study showed that the
forecasting algorithm was accurate when compared with actual traffic.
Although nonlinear statistical models can capture the burstiness of a
network's traffic, the models are parametric in nature and therefore
require knowledge of the distribution of the traffic. In addition, they
are analytical and therefore require explicit programming to clearly
specify the algorithmic steps. To take advantage of machine learning
paradigms, the application of machine learning techniques to internet
traffic forecasting has been on the increase.</p>
      </sec>
      <sec id="sec-3-8">
        <title>2.2 Machine Learning and Artificial Neural Network for Internet Traffic Forecasting</title>
        <p>A large number of research efforts have explored the application of
machine learning techniques to internet traffic prediction, and their
results have demonstrated the superiority of these techniques over
statistical forecasting methods. A concurrent neuro-fuzzy model to
discover and analyse useful knowledge from available Web log data was
proposed in [14]. The study used a self-organizing map for pattern
analysis and a fuzzy inference system to capture the chaotic trend, in
order to provide short-term (hourly) and long-term (daily) web traffic
trend predictions. Empirical results demonstrated that the proposed
approach was efficient for mining and predicting web traffic. A study in
[15] presented a neural network ensemble (NNE) for the prediction of
TCP/IP traffic from a time series forecasting (TSF) point of view. The
NNE approach was compared with TSF methods (Holt-Winters and ARIMA) and
was found to compete favourably with them. In [16], the least squares
support vector machine was applied to the problem of accurately
predicting non-peak traffic; the method had good generalization ability
and guaranteed global minima. The study in [17] presented a neural
network ensemble approach and two adapted time series methods (ARIMA and
Holt-Winters) for forecasting the amount of traffic in TCP/IP based
networks. The experiments with the neural ensemble achieved the best
results for 5-minute and hourly data, while Holt-Winters was the best
option for the daily forecasts. The study in [10] investigated ensembles
of artificial neural networks in predicting long-term internet traffic.
The proposed prediction models were compared with the classic
Holt-Winters method. Prangchumpol [18] presented a descriptive approach
to predicting the incoming and outgoing data rate in a network system
using a data mining (machine learning) technique, association rule
discovery. The result of the study showed that the technique could
predict future network traffic.</p>
      </sec>
      <sec id="sec-3-10">
        <title>2.3 Design Issues with Forecasting with Artificial Neural Network</title>
        <p>A detailed state-of-the-art presentation on forecasting with
artificial neural networks was made in [8]. The study showed that,
overall, ANNs gave satisfactory performance in forecasting, but went on
to indicate the inconsistencies in performance reports on design issues
in the literature. The inconsistencies were attributed to the
trial-and-error methodology adopted in most studies. Faraway and
Chatfield [9] argued that it was unwise to apply ANN models blindly in
black-box mode, as had sometimes been suggested. Shamsuddin et al. [7]
investigated the effect of applying different numbers of input nodes,
activation functions and pre-processing techniques on the performance of
the backpropagation network in time series revenue forecasting. The
findings showed that the performance of the ANN model could be
considerably improved by careful selection of those parameters. In [19],
the performance of two learning algorithms, linear regression and neural
network standard backpropagation, was compared on the prediction of four
major stock market indexes. The comparison showed that the neural network
approach resulted in better prediction accuracy than the linear
regression model. Chabaa et al. [20] presented an ANN based on the
multi-layer perceptron for analysing time series of measured internet
traffic data over IP networks. The comparison between some training
algorithms demonstrated the efficiency and accuracy of the
Levenberg-Marquardt and the resilient backpropagation algorithms.
Chukwuchekwa [21] compared the performance of the backpropagation
gradient descent technique and the genetic algorithm on some pattern
recognition problems. The backpropagation (BP) algorithm was found to
outperform the genetic algorithm in that instance. The study suggested
that caution should be applied before using other algorithms as
substitutes for the BP algorithm, especially in classification problems.
In [2], an evaluation of several learning rules for adjusting ANN weights
was carried out on the popular airline passenger data set. The
Levenberg-Marquardt backpropagation algorithm showed the best performance
among the learning rules. Various degrees of performance were observed in
[22] on examining the impact of input lags of the multilayer perceptron
in forecasting internet traffic on a two-layered network. In [23], a
survey of research and application issues in Web usage mining based on
various mining techniques was conducted to provide some understanding in
designing algorithms suitable for mining data.</p>
        <p>This review demonstrated the impressive results of applying
machine learning techniques, such as artificial neural networks, to
forecasting Internet traffic, and raised concerns over the little or no
consideration given by researchers to the design issues. The paper
therefore presents results from a study of the impact of some
multi-layer perceptron design issues on internet traffic forecasting.</p>
      </sec>
      <sec id="sec-3-11">
        <title>3. METHODOLOGY</title>
        <p>The traffic forecasting was modelled as a standard time series
problem and the multilayer artificial neural network was designed to
perform the time series function mapping.</p>
      </sec>
      <sec id="sec-3-12">
        <title>3.1 Time series for Traffic Forecasting</title>
        <p>Traffic forecasting is a standard time series prediction task. The
goal is to approximate the function that relates the future values of
a variable to the previous observations of that variable [24]. In
some situations, such as internet traffic, data are non-stationary
and chaotic. In such situations, one general assumption is that the
historical data incorporate all the behaviour required to capture the
dependency between the future traffic and that of the past.
Therefore, the historical data are the major player in the forecasting
process. The second assumption made to model and forecast the
dynamics of the traffic is that its values are expressed by a discrete
time series [2, 3]. A discrete time series is a vector {y<sub>t</sub>} of
observations made at regular intervals, t = 1, 2, 3, …, N. For the
time series forecasting problem, the inputs are typically the past
observations of the data series and the output is the future value.
Suppose y<sub>1</sub>, y<sub>2</sub>, …, y<sub>N</sub> denote an observed time series of the
traffic loads; then the basic problem is to estimate a future traffic
value such as y<sub>N+k</sub>, where the integer k is called the lead time or
the forecasting horizon [25]. For the univariate method, forecasts
of a given traffic load are based on a model fitted only to the past
observations of the given time series, so that ŷ<sub>t</sub>(N, k) depends
only on y<sub>1</sub>, y<sub>2</sub>, …, y<sub>N-1</sub>. The estimate of y<sub>N+1</sub> is computed as a
weighted sum of the past observations:
          <disp-formula><tex-math>\hat{y}_{N+1} = w_0 y_N + w_1 y_{N-1} + w_2 y_{N-2} + \dots \qquad (1)</tex-math></disp-formula>
where the {w<sub>i</sub>} are weights.</p>
        <p>The Multi-Layer Perceptron performs the following function
mapping [3, 8]:
          <disp-formula><tex-math>\hat{y}_t = f(y_{t-1}, y_{t-2}, \dots, y_{t-n}) \qquad (2)</tex-math></disp-formula>
where ŷ<sub>t</sub> is the estimated traffic at time t, and
(y<sub>t-1</sub>, y<sub>t-2</sub>, …, y<sub>t-n</sub>) denotes the training pattern composed of a
fixed number (n) of lagged observations of the series.
The weights to be used in the ANN model are estimated from the
data by minimizing the sum of squares of the within-sample
one-step-ahead forecast errors, namely
          <disp-formula><tex-math>S = \sum_{t} (\hat{y}_t - y_t)^2 \qquad (3)</tex-math></disp-formula>
over the first part of the time series, called the training set. The
last part of the time series, called the test set, is kept in reserve so
that genuine out-of-sample (ex ante) forecasts can be made and
compared with the actual observations. Equations (1) and (2) give
a one-step-ahead forecast, as they use the actual observed values of
all lagged variables as inputs. If multistep-ahead forecasts are
required, it is possible to proceed in one of two ways. Firstly,
construct a new architecture with several outputs, giving
ŷ<sub>t</sub>, ŷ<sub>t+1</sub>, ŷ<sub>t+2</sub>, …, where each output would have separate
weights for each connection to the neurons. Secondly, “feed back”
the one-step-ahead forecast to replace the lag 1 value as one of the
input variables, so that the same architecture can be used to
construct the two-step-ahead forecast, and so on [16]. This study
adopted the latter iterative approach because of its numerical
simplicity and because it requires fewer weights to be estimated.
        </p>
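        <p>The construction of lagged training patterns in (2) and the iterative (feedback) multistep approach can be sketched as follows. This is a minimal illustration rather than the Visual Basic implementation used in the study; the function names and the stand-in model <monospace>mean_model</monospace>, which simply averages its lagged inputs in place of a trained network, are assumptions made for this example.</p>
        <preformat>
```python
from typing import Callable, List, Sequence, Tuple


def make_patterns(series: Sequence[float], n_lags: int) -> List[Tuple[List[float], float]]:
    """Pair each target y_t with its n_lags previous values, most recent first."""
    patterns = []
    for t in range(n_lags, len(series)):
        lags = [series[t - i] for i in range(1, n_lags + 1)]  # y_{t-1}, ..., y_{t-n}
        patterns.append((lags, series[t]))
    return patterns


def iterate_forecast(model: Callable[[List[float]], float],
                     history: Sequence[float], n_lags: int, steps: int) -> List[float]:
    """Multistep forecast: each one-step forecast is fed back as the lag-1 input."""
    window = [history[-i] for i in range(1, n_lags + 1)]  # most recent value first
    forecasts = []
    for _ in range(steps):
        y_hat = model(window)
        forecasts.append(y_hat)
        window = [y_hat] + window[:-1]  # forecast becomes lag 1, oldest lag drops out
    return forecasts


def mean_model(lags: List[float]) -> float:
    """Hypothetical stand-in for a trained network: the mean of the lagged inputs."""
    return sum(lags) / len(lags)


series = [1.0, 2.0, 3.0, 4.0, 5.0]
patterns = make_patterns(series, n_lags=2)
forecasts = iterate_forecast(mean_model, series, n_lags=2, steps=2)
```
        </preformat>
        <p>With two lags, the five-point toy series yields three training patterns, and the two-step forecast reuses the first forecast as an input to the second.</p>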
      </sec>
      <sec id="sec-3-13">
        <title>3.2 The Multilayer Neural Network</title>
        <p>The neural network is a powerful model for solving complex
problems because it has a natural capacity for solving nonlinear
problems and can easily achieve input-output mapping, which makes it
well suited to prediction problems [26]. The basic features of
multilayer perceptrons include:</p>
        <list list-type="roman-lower">
          <list-item><p>The model of each neuron in the network includes a
nonlinear activation function that is differentiable.</p></list-item>
          <list-item><p>The network contains one or more layers that are hidden
from both input and output nodes.</p></list-item>
          <list-item><p>The network exhibits a high degree of connectivity, the
extent of which is determined by the synaptic weights of the
network.</p></list-item>
        </list>
        <sec id="sec-3-13-1">
          <title>3.2.1 A Neural Model</title>
          <p>
            The node is the basic unit of the artificial neural network. Each
node is able to sum many inputs x<sub>1</sub>, x<sub>2</sub>, …, x<sub>m</sub> from the
environment or from other nodes, with each input modified by an
adjustable node weight (Figure 2). The sum of these weighted
inputs is added to an adjustable threshold for the node and then
passed through a modifying (activation) function that determines
the final output:
            <disp-formula><tex-math>u_k = \sum_{j=1}^{m} w_{kj} x_j \qquad (4)</tex-math></disp-formula>
            <disp-formula><tex-math>y_k = \varphi(u_k + b_k) \qquad (5)</tex-math></disp-formula>
where x<sub>1</sub>, x<sub>2</sub>, …, x<sub>m</sub> are the input signals; w<sub>k1</sub>, w<sub>k2</sub>, …, w<sub>km</sub> are the
respective synaptic weights of neuron k; u<sub>k</sub> is the linear combiner
output due to the input signals; b<sub>k</sub> is the “bias”; φ(·) is the
activation function; and y<sub>k</sub> is the output signal of the neuron. The
use of the bias b<sub>k</sub> has the effect of applying an affine transformation
to the output u<sub>k</sub> of the linear combiner in the model, as shown
by
            <disp-formula><tex-math>v_k = u_k + b_k \qquad (6)</tex-math></disp-formula>
The bias b<sub>k</sub> is an external parameter of neuron k.
          </p>
          <p>
The activation function, denoted by φ(v), defines the output of a
neuron in terms of the induced local field v. It is this function (also
called the transfer function) that determines the relationship
between inputs and outputs of a node and a network. In general,
the activation function introduces a degree of nonlinearity that is
valuable for most ANN applications. Among these functions, the
sigmoid function is very popular. It is a strictly increasing
function that exhibits a graceful balance between linear and
nonlinear behaviour. The logistic sigmoid is defined as
            <disp-formula><tex-math>\varphi(v) = \frac{1}{1 + e^{-v}} \qquad (7)</tex-math></disp-formula>
A logistic sigmoid function assumes a continuous range of values
from 0 to 1. Additional types of activation functions can be found
in [8]. Among these functions, the logistic transfer function is the
most popular choice [8].
          </p>
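          <p>The computation performed by a single neuron, that is, a weighted sum of the inputs plus a bias passed through the logistic sigmoid, can be sketched as below. The function names are illustrative and the unit-slope logistic function is assumed; the study's own implementation was in Visual Basic.</p>
          <preformat>
```python
import math
from typing import Sequence


def logistic(v: float) -> float:
    """Logistic sigmoid, written to avoid overflow for large negative v."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def neuron_output(inputs: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Return phi(u + b), where u is the weighted sum of the inputs."""
    u = sum(w * x for w, x in zip(weights, inputs))  # linear combiner output
    v = u + bias                                      # induced local field
    return logistic(v)


# The two weighted inputs cancel (0.1 - 0.1), so the local field is 0.
y = neuron_output([0.2, 0.4], [0.5, -0.25], bias=0.0)
```
          </preformat>
          <p>Because the logistic function maps a local field of 0 to 0.5 and saturates towards 0 and 1 for large negative and positive fields, the neuron output always lies within the range to which the traffic data are later normalised.</p>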
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>3.2.2 Training of artificial neural networks</title>
      <p>An ANN has to be trained before it can be put to use. The goal of
training is to find the logical relationship in the given
input/output data. There are two learning strategies: supervised and
unsupervised. This study employs the supervised learning
strategy. Supervised learning typically operates in two phases:
training and testing. The training set is used for estimating the arc
weights, while the test set is used for measuring the generalization
ability of the network. Training is used to gain generalised
knowledge about the system under consideration, and testing is
used to predict (forecast) the system behaviour using the
knowledge gained. On the other hand, unsupervised techniques,
such as reinforcement learning, are independent of training data
and operate by directly interacting with the environment.
The training algorithm employed is backpropagation. It is a
supervised training strategy and a popular method for training the
multilayer perceptron. The training proceeds in two phases [26]:</p>
      <p>In the forward phase, the synaptic weights of the
network are fixed and the input signal is propagated
through the network, layer by layer, until it reaches the
output. Thus, in this phase, changes are confined to the
activation potentials and outputs of the neurons in the
network.</p>
      <p>In the backward phase, an error signal is produced by
comparing the output of the network with the desired
response. The resulting error signal is propagated
through the network, again layer by layer, but this time
the propagation is performed in the backward direction.
In this second phase, successive adjustments are made to
the synaptic weights of the network.</p>
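      <p>The two phases can be sketched for a network with one hidden layer and a single logistic output, trained one pattern at a time with a learning rate and a momentum term. This is an illustrative sketch under stated assumptions, not the authors' Visual Basic code: the function name <monospace>train_mlp</monospace>, the weight initialisation range of -0.5 to 0.5, the fixed random seed and the toy patterns are all hypothetical choices.</p>
      <preformat>
```python
import math
import random


def logistic(v):
    return 1.0 / (1.0 + math.exp(-v))


def train_mlp(patterns, n_hidden, epochs, rate=0.1, momentum=0.9, seed=0):
    """Pattern-by-pattern backpropagation for a one-hidden-layer, one-output network."""
    rng = random.Random(seed)
    n_in = len(patterns[0][0])
    # The last weight in each row is a bias weight on a constant input of 1.
    w_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_o = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    dw_h = [[0.0] * (n_in + 1) for _ in range(n_hidden)]  # previous updates (momentum)
    dw_o = [0.0] * (n_hidden + 1)
    sse_per_epoch = []
    for _ in range(epochs):
        sse = 0.0
        for x, target in patterns:
            # Forward phase: weights are fixed, the signal moves layer by layer.
            xb = list(x) + [1.0]
            h = [logistic(sum(w * xi for w, xi in zip(row, xb))) for row in w_h]
            hb = h + [1.0]
            y = logistic(sum(w * hi for w, hi in zip(w_o, hb)))
            # Backward phase: the error signal is propagated back, weights adjusted.
            err = target - y
            sse += err * err
            delta_o = err * y * (1.0 - y)
            delta_h = [delta_o * w_o[j] * h[j] * (1.0 - h[j]) for j in range(n_hidden)]
            for j in range(n_hidden + 1):
                dw_o[j] = rate * delta_o * hb[j] + momentum * dw_o[j]
                w_o[j] += dw_o[j]
            for j in range(n_hidden):
                for i in range(n_in + 1):
                    dw_h[j][i] = rate * delta_h[j] * xb[i] + momentum * dw_h[j][i]
                    w_h[j][i] += dw_h[j][i]
        sse_per_epoch.append(sse)
    return w_h, w_o, sse_per_epoch


# Hypothetical normalised patterns: two lagged inputs and one target each.
toy_patterns = [([0.1, 0.2], 0.3), ([0.2, 0.3], 0.4), ([0.3, 0.4], 0.5)]
w_h, w_o, sse_per_epoch = train_mlp(toy_patterns, n_hidden=2, epochs=200)
```
      </preformat>
      <p>The hidden-layer deltas are computed before the output weights are changed, so both layers are adjusted using the same forward pass. The defaults of 0.1 for the learning rate and 0.9 for the momentum mirror the values reported in this study.</p>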
      <p>In [5] it is also reported that backpropagation is the most
computationally straightforward algorithm for training the
multi-layer perceptron. They summarized the steps of the algorithm
as follows:</p>
      <list list-type="order">
        <list-item><p>Obtain a set of training patterns.</p></list-item>
        <list-item><p>Set up the ANN model, which consists of a number of input
neurons, hidden neurons, and output neurons.</p></list-item>
        <list-item><p>Set the learning rate (η) and momentum rate (α).</p></list-item>
        <list-item><p>Initialize all connections (W<sub>ij</sub> and W<sub>jk</sub>) and bias
weights (θ<sub>k</sub> and θ<sub>j</sub>) to random values.</p></list-item>
        <list-item><p>Set the minimum error E<sub>min</sub> / number of epochs.</p></list-item>
        <list-item><p>Start training by applying the input patterns one at a
time and propagating them through the layers, then
calculate the total error.</p></list-item>
        <list-item><p>Back-propagate the error through the output and hidden
layers and adapt W<sub>jk</sub> and θ<sub>k</sub>.</p></list-item>
        <list-item><p>Back-propagate the error through the hidden and input
layers and adapt the weights W<sub>ij</sub> and θ<sub>j</sub>.</p></list-item>
        <list-item><p>Check if Error &lt; E<sub>min</sub> or the maximum epoch has been reached. If not,
repeat steps 6 to 9; otherwise, stop training.</p></list-item>
      </list>
      <sec id="sec-4-1">
        <title>3.3 Data collection and Description</title>
        <p>
          Internet traffic data were collected as hourly average kilobits per second (kbit/s) of
TCP/IP traffic on a company's resident network from January 1,
2010 to September 30, 2010 (making up 6552 data points each for
IN and OUT traffic data), together with daily traffic data from January 1 to
December 31, 2010 (making up 365 data points each for IN and
OUT traffic data). The data were collected using PRTG (Paessler Router Traffic Grapher),
a network monitoring and bandwidth usage tool from Paessler.
A bandwidth of 20 Mbps was statically allocated for upload
(traffic IN) and 20 Mbps for download (traffic OUT) for
the period under consideration.
        </p>
        <p><bold>3.3.1 Data Pre-processing/Normalisation</bold></p>
        <p>
Nonlinear activation functions such as the logistic function
typically restrict the possible output from a node to (0, 1) or
(−1, 1). The data are therefore normalised to avoid computational problems, to meet
algorithm requirements and to facilitate network learning. Four
methods for input normalization are summarized in [8]. This study
employs the linear transformation to [0, 1], defined as
          <disp-formula><tex-math>y_n = \frac{y_o - y_{\min}}{y_{\max} - y_{\min}} \qquad (8)</tex-math></disp-formula>
where y<sub>n</sub> and y<sub>o</sub> represent the normalized and original data, and y<sub>min</sub> and
y<sub>max</sub> are the minimum and maximum of the column or row,
respectively.
        </p>
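        <p>The linear transformation in (8) can be illustrated with a short sketch. The traffic readings are hypothetical values chosen for the example, and the inverse mapping <monospace>denormalise</monospace> is an assumption about how network outputs would be returned to traffic units; it is not described in the study.</p>
        <preformat>
```python
from typing import List, Sequence


def min_max_normalise(values: Sequence[float]) -> List[float]:
    """Linear transformation to [0, 1]: (y - y_min) / (y_max - y_min)."""
    y_min, y_max = min(values), max(values)
    span = y_max - y_min
    if span == 0:
        raise ValueError("cannot normalise a constant series")
    return [(y - y_min) / span for y in values]


def denormalise(normalised: Sequence[float], y_min: float, y_max: float) -> List[float]:
    """Map values in [0, 1] back to the original units."""
    return [y * (y_max - y_min) + y_min for y in normalised]


traffic = [120.0, 480.0, 300.0, 840.0]  # hypothetical kbit/s readings
scaled = min_max_normalise(traffic)      # [0.0, 0.5, 0.25, 1.0]
restored = denormalise(scaled, min(traffic), max(traffic))
```
        </preformat>
        <p>The minimum and maximum used for scaling have to be retained so that forecasts produced on the normalised scale can be converted back.</p>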
        <sec id="sec-4-1-1">
          <title>3.3.2 Training and Testing set</title>
          <p>Eighty percent (80%) of the data, that is, 5241.6, rounded to
5242 data points, was used for training the network, while twenty percent
(20%), that is, 1310.4, rounded to 1310 data points, was used for testing
the generalisation (predictive) capability of the network, for each of the
HOURLY_IN and HOURLY_OUT traffic flows. Likewise, a training
set of 80% and a testing set of 20% were used for each of the
DAILY traffic categories, that is, 292 data points for training and 73 for
testing.</p>
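          <p>The split sizes quoted above can be checked with a small sketch. It assumes a chronological split, with the first part of each series used for training and the last part for testing, as described in Section 3.1; the function name and placeholder series are illustrative.</p>
          <preformat>
```python
def chronological_split(series, train_fraction=0.8):
    """Split a series in time order: first part for training, last part for testing."""
    n_train = round(len(series) * train_fraction)
    return series[:n_train], series[n_train:]


hourly = list(range(6552))  # placeholder values; only the length matters here
daily = list(range(365))
hourly_train, hourly_test = chronological_split(hourly)
daily_train, daily_test = chronological_split(daily)
sizes = (len(hourly_train), len(hourly_test), len(daily_train), len(daily_test))
```
          </preformat>
          <p>For 6552 hourly points this gives 5242 training and 1310 testing points, and for 365 daily points it gives 292 and 73, matching the figures reported in this section.</p>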
        </sec>
      </sec>
      <sec id="sec-4-2">
        <title>3.4 Finding the appropriate complexity of the Network</title>
        <p>For the time series forecasting problem, a training pattern consists of
a fixed number of lagged observations of the series [7]. The inputs
(number of lag observations) were varied from 1 to 24, excluding
the bias. One and two hidden layers were considered. The number
of hidden nodes was set equal to the number of input nodes. In
several studies, networks with the number of hidden nodes
equal to the number of input nodes are reported to give better
forecasts [8]. One output node was used, giving a one-step look-ahead. The
model of our network with two hidden layers is therefore k, k, k, 1
(and k, k, 1 with one hidden layer), where k represents the number
of lag observations (input variables). The epoch values used were
200, 500, and 1000. The best model according to [18] is the one
that gives the best result in the test set. The logistic sigmoid
activation function was used [8]. The error-correction
backpropagation algorithm, with a learning rate of 0.1 and a momentum of
0.9, was used to train the network.</p>
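        <p>The design space described above can be enumerated as in the following sketch, which lists every combination of input lag, number of hidden layers and epoch value, and picks the candidate with the lowest test error. The names and the RMSE values passed to <monospace>select_best</monospace> are hypothetical and serve only to show the selection rule.</p>
        <preformat>
```python
from itertools import product

LAGS = range(1, 25)        # input lags 1 to 24
HIDDEN_LAYERS = (1, 2)
EPOCHS = (200, 500, 1000)


def layer_sizes(n_lags, n_hidden_layers):
    """Input layer, hidden layers with as many nodes as inputs, one output node."""
    return [n_lags] + [n_lags] * n_hidden_layers + [1]


candidates = [
    {"lags": k, "hidden_layers": h, "epochs": e, "layers": layer_sizes(k, h)}
    for k, h, e in product(LAGS, HIDDEN_LAYERS, EPOCHS)
]


def select_best(results):
    """Return the (candidate, test RMSE) pair with the lowest test RMSE."""
    return min(results, key=lambda pair: pair[1])


# Hypothetical test errors for three of the candidates.
example_results = [(candidates[0], 0.21), (candidates[1], 0.18), (candidates[2], 0.19)]
best_candidate, best_rmse = select_best(example_results)
```
        </preformat>
        <p>The grid contains 24 × 2 × 3 = 144 candidate models per traffic category, which is why the selection is made empirically on the test set rather than by inspection.</p>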
      </sec>
      <sec id="sec-4-3">
        <title>3.5 Stopping and Evaluation Criteria</title>
        <p>Training stopped once the specified number of epochs had been
reached. The SSE-based objective (cost) function minimised during
the training process is defined in (3). The measure of accuracy
employed is the root mean square error (RMSE), defined as
          <disp-formula><tex-math>\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{t} (\hat{y}_t - y_t)^2} \qquad (9)</tex-math></disp-formula>
where n is the total number of sample group observations, ŷ<sub>t</sub> is the
predicted (computed) value and y<sub>t</sub> is the target value at time t.
RMSE is one of the most commonly used measures of forecast
error for examining how close the forecast is to the actual value [5].
The best model is the one that gives the best result in the test set,
that is, the model that has the least RMSE in the testing set [27].</p>
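        <p>The RMSE in (9) can be computed as in the sketch below. The function name and the example values are illustrative; the predictions and targets are assumed to be on the same (normalised) scale.</p>
        <preformat>
```python
import math


def rmse(predicted, actual):
    """Root mean square error between predicted and target values."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(predicted))


# Every prediction is off by 0.5, so the RMSE is 0.5.
score = rmse([0.5, 0.5, 0.5, 0.5], [0.0, 1.0, 0.0, 1.0])
```
        </preformat>
        <p>Because the errors are squared before averaging, a few large misses raise the RMSE more than many small ones.</p>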
      </sec>
      <sec id="sec-4-4">
        <title>4. RESULTS AND DISCUSSION</title>
        <p>The system was implemented in Visual Basic. The RMSE values of
the various models were recorded and compared based on the design
issues considered. The results are presented and discussed in this
section.</p>
      </sec>
      <sec id="sec-4-5">
        <title>4.1 HOURLY_IN traffic</title>
        <p>The RMSE values of the testing (prediction) results of the various
models, based on the number of input lags and training epochs on
networks with one and two hidden layers, were compared for the
HOURLY_IN traffic. Figure 3 depicts these results.</p>
        <p>[Figure 3: RMSE against input lag for the HOURLY_IN traffic. Series: 200, 500 and 1000 training epochs, each on one (1hdn) and two (2hdn) hidden layers.]</p>
        <p>There were varying degrees of performance, with no regular
pattern among the input lags, between the one and two hidden layer
networks, or among the various epochs used. Nevertheless, the worst
performance in all cases occurred at input lag 1. The least RMSE of
this experiment, 0.0766984, occurred at input lag 24 with 200
training epochs on the two hidden layer network. Therefore, the best
model for forecasting the HOURLY_IN traffic is input lag 24 with 200
training epochs using two hidden layers.</p>
      </sec>
      <sec id="sec-4-6">
        <title>4.2 HOURLY_OUT traffic</title>
        <p>The RMSE values of the testing (prediction) results of the various
models were compared for the HOURLY_OUT traffic. The
results are shown in Figure 4.</p>
        <p>There were also various values of the performance measure,
with no particular pattern across the issues, for the HOURLY_OUT
traffic. As with the HOURLY_IN traffic, the worst performance was
recorded at input lag 1 in all cases. The least RMSE,
0.0621992, occurred at input lag 13 with 200 training epochs
on the two hidden layer network. Therefore, the best model for
forecasting the HOURLY_OUT traffic is input lag 13 with 200
training epochs using two hidden layers.</p>
      </sec>
      <sec id="sec-4-7">
        <title>4.3 DAILY_IN traffic</title>
        <p>Figure 5 presents the prediction RMSE of the various models for
the DAILY_IN traffic.</p>
        <p>Different performance values were also observed, with no
particular pattern across the various prediction models. The least
RMSE of this experiment, 0.116691, occurred at
input lag 3 with 1000 training epochs on the two hidden layer
network. Therefore, the best model for forecasting the DAILY_IN
traffic is input lag 3 with 1000 training epochs using two hidden
layers.</p>
        <p>For the DAILY_OUT traffic, no particular pattern of performance was
observed among the various models, although there were different values
of the performance measure. The least RMSE of this experiment, 0.099416,
occurred at input lag 3 with 200 training
epochs on the one hidden layer network. Therefore, the best model for
forecasting the DAILY_OUT traffic is input lag 3 with 200 training
epochs using one hidden layer.</p>
        <p>For the HOURLY_IN traffic, the computed (predicted) values of the
model with 24 input lags, two hidden layers and 200 training epochs were
used. For the HOURLY_OUT traffic, the study used the computed values on
the testing set of the model with 13 input lags, two hidden layers and
200 training epochs. The study deployed 3 input lags and
two hidden layers of 3 neurons each, using 200 training epochs, for
predicting the DAILY_IN traffic. For the DAILY_OUT traffic, 3
input lags and one hidden layer of three neurons with 200 training
epochs were selected.</p>
        <p>Figure 7 compares the prediction models selected for the
traffic categories.</p>
        <p>The HOURLY traffic categories had a better prediction
performance than their DAILY counterparts. This could be
attributed to the much larger sample size used for the
HOURLY traffic. It has been reported that ANNs for
forecasting perform better with a large sample size than with a small
one [8, 26]. In
addition, Figure 7 revealed that different forecasting models may
be appropriate for different traffic categories, even if the traffic categories
are all from the same network operator.</p>
        <p>This study has observed different forecasting models for the
various traffic categories based on the issues. The findings suggest
that careful consideration of the design issues is indispensable
for improving the predictive performance of a multi-layer
artificial neural network, rather than applying it blindly to internet
traffic forecasting. There are no generally accepted
techniques for determining the optimal design parameters; through
experimentation, however, a model with improved predictive performance is
feasible.</p>
      </sec>
      <sec id="sec-4-8">
        <title>5. CONCLUSION</title>
        <p>This study examined the impact of some important design issues
in modelling a multilayer perceptron artificial neural network for
Internet traffic forecasting. The traffic forecasting was modelled
as a standard time series problem and a multilayer artificial neural
network was designed to perform the time series function mapping.
The mechanism was implemented in a Visual Basic programming
environment and tested with real Internet traffic data through
experimentation with the various design issues considered.
Although no particular pattern of performance was observed, the
study showed that forecasting performance can be affected by
the number of input lags, hidden layers and training epochs.
Although the study did not attempt to determine
optimal values for the various factors considered, it has shown
that careful experimentation is required to choose appropriate
values for each of the design issues. Therefore, the multilayer
perceptron should not be applied blindly to Internet traffic
forecasting.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Crovella</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Krishnamurthy</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <year>2006</year>
          .
          <source>Internet Measurement</source>
          . John Wiley &amp; Sons, Ltd., England.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Benkacha</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Benhra</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          and
          <string-name>
            <surname>El Hassani</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <article-title>Seasonal Time Series Forecasting Models on Artificial Neural Network</article-title>
          .
          <source>International Journal of Computer Applications</source>
          .
          <volume>116</volume>
          ,
          <issue>20</issue>
          , (<issn>0975-8887</issn>). DOI=
          <pub-id pub-id-type="doi">10.5120/20451-2805</pub-id>
          .
          [3]
          <string-name>
            <surname>Benkacha</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Benhra</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          and
          <string-name>
            <surname>El Hassani</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <article-title>Causal Method and Time Series Forecasting Model based on Artificial Neural Network</article-title>
          .
          <source>International Journal of Computer Applications</source>
          .
          <volume>75</volume>
          ,
          <issue>7</issue>
          , (<issn>0975-8887</issn>).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Islam</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Keung</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            <given-names>K.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <year>2012</year>
          .
          <article-title>Empirical prediction models for adaptive resource provisioning in the cloud</article-title>
          .
          <source>Future Generation Computer Systems</source>
          .
          <volume>28</volume>
          ,
          <fpage>155</fpage>
          -
          <lpage>162</lpage>
          . DOI=
          <pub-id pub-id-type="doi">10.1016/j.future.2011.05.027</pub-id>
          .
          [5]
          <string-name>
            <surname>Chabaa</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Zeroual</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Antari</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <article-title>Identification and prediction of internet traffic using artificial neural networks</article-title>
          .
          <source>Journal of Intelligent Learning Systems &amp; Applications</source>
          .
          <volume>2</volume>
          ,
          <fpage>147</fpage>
          -
          <lpage>155</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          DOI=
          <pub-id pub-id-type="doi">10.4236/jilsa.2010.23018</pub-id>
          .
          [6]
          <string-name>
            <surname>Shamsuddin</surname>
            ,
            <given-names>S. M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sallehuddin</surname>
            <given-names>R.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Yusof</surname>
            ,
            <given-names>N. M.</given-names>
          </string-name>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <year>2012</year>
          .
          <article-title>Artificial neural network time series modelling for revenue forecasting</article-title>
          .
          <source>Chiang Mai J. Sci.</source>
          <volume>35</volume>
          ,
          <issue>3</issue>
          ,
          <fpage>411</fpage>
          -
          <lpage>426</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <surname>Cortez</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rio</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Sousa</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Rocha</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <article-title>Topology aware internet forecasting using neural networks</article-title>
          .
          <source>In Proceedings of the 17th International Conference on Artificial Neural Networks (Porto, Portugal), Lecture Notes in Computer Science</source>
          <volume>4669</volume>
          ,
          <fpage>445</fpage>
          -
          <lpage>452</lpage>
          , Springer.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Patuwo</surname>
            ,
            <given-names>B. E.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Hu</surname>
            ,
            <given-names>M. Y.</given-names>
          </string-name>
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <article-title>Forecasting with artificial neural networks: The state of the art</article-title>
          .
          <source>International Journal of Forecasting</source>
          .
          <volume>14</volume>
          ,
          <fpage>35</fpage>
          -
          <lpage>62</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <surname>Faraway</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Chatfield</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <year>1998</year>
          .
          <article-title>Time series forecasting with neural networks: a comparative study using the airline data</article-title>
          .
          <source>Journal of Appl. Statist</source>
          .
          <volume>47</volume>
          ,
          <fpage>231</fpage>
          -
          <lpage>250</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <surname>Miguel</surname>
            ,
            <given-names>M. L. F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Penna</surname>
            ,
            <given-names>M. C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nievola</surname>
            ,
            <given-names>J. C.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Pellenz</surname>
            ,
            <given-names>M. E.</given-names>
          </string-name>
          <year>2012</year>
          .
          <article-title>New models for long-term internet traffic forecasting using artificial neural networks and flow based information</article-title>
          .
          <source>In Proceedings of 2012 IEEE Network Operations and Management Symposium</source>
          ,
          <fpage>1082</fpage>
          -
          <lpage>1088</lpage>
          . DOI=
          <pub-id pub-id-type="doi">10.1109/NOMS.2012.6212033</pub-id>
          .
          <string-name>
            <surname>Cortez</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Rio</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Rocha</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Sousa</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <year>2012</year>
          .
          <article-title>Multiscale Internet traffic forecasting using neural networks and time series methods</article-title>
          .
          <source>Expert Systems</source>
          .
          <volume>29</volume>
          ,
          <issue>2</issue>
          ,
          <fpage>143</fpage>
          -
          <lpage>155</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <surname>Mariam</surname>
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dadarlat</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Iancu</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <year>2009</year>
          .
          <article-title>A Comparative Study of the statistical Methods suitable for Network Traffic Estimation</article-title>
          .
          <source>In Proceedings of the 13th WSEAS International Conference on Communications</source>
          ,
          <fpage>99</fpage>
          -
          <lpage>104</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <surname>Anand</surname>
            ,
            <given-names>C. N.</given-names>
          </string-name>
          <year>2009</year>
          .
          <article-title>Internet traffic modeling and forecasting using non-linear time series model Garch</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Abraham</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>K. A.</given-names>
          </string-name>
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <article-title>Intelligent web traffic mining and analysis</article-title>
          ,
          <source>Journal of Network and Computer Applications</source>
          .
          <volume>28</volume>
          ,
          <fpage>147</fpage>
          -
          <lpage>165</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <string-name>
            <surname>Cortez</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rio</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rocha</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Sousa</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <article-title>Internet Forecasting using Neural Networks</article-title>
          .
          <source>In Proceeding of the International Joint Conference on Neural Network (Vancouver)</source>
          ,
          <fpage>2635</fpage>
          -
          <lpage>2642</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          <year>2009</year>
          .
          <article-title>Comparison of parametric and nonparametric techniques for non-peak traffic forecasting</article-title>
          ,
          <source>World Academy of Science, Engineering and Technology</source>
          .
          <volume>51</volume>
          ,
          <string-name>
            <surname>Cortez</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rio</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
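          <string-name>
            <surname>Rocha</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Sousa</surname>
            ,
            <given-names>P.</given-names>
          </string-name>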
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          <article-title>Multi-scale internet traffic forecasting using neural networks and time series methods</article-title>
          .
          <source>Expert Systems</source>
          .
          <volume>29</volume>
          ,
          <issue>2</issue>
          ,
          <fpage>143</fpage>
          -
          <lpage>155</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          <string-name>
            <surname>Prangchumpol</surname>
            ,
            <given-names>D. A.</given-names>
          </string-name>
          <year>2013</year>
          .
          <article-title>Network traffic prediction algorithm based on data mining technique</article-title>
          .
          <source>World Academy of Science, Engineering and Technology</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          <string-name>
            <surname>Fok</surname>
            ,
            <given-names>W. W. T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tam</surname>
            ,
            <given-names>V. W. L.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Ng</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          <article-title>Computational neural network for global stock Indexes Prediction</article-title>
          ,
          <source>In Proceedings of World Congress on Engineering</source>
          (London, UK, July 2-4,
          <year>2008</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          <string-name>
            <surname>Chabaa</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zeroual</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Antari</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          DOI=
          <pub-id pub-id-type="doi">10.4236/jilsa.2010.23018</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          <string-name>
            <surname>Chukwuchekwa</surname>
            ,
            <given-names>U. J.</given-names>
          </string-name>
          <year>2011</year>
          .
          <article-title>Comparing the performance of backpropagation algorithm and genetic algorithms in pattern recognition problems</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          <string-name>
            <surname>Odim</surname>
            ,
            <given-names>M. O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gbadeyan</surname>
            <given-names>J. A.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Sadiku</surname>
            <given-names>J. S.</given-names>
          </string-name>
          <year>2014</year>
          .
          <article-title>A neural network model for improved internet service resource provisioning</article-title>
          .
          <source>British Journal of Mathematics &amp; Computer Science</source>
          .
          <volume>4</volume>
          ,
          <issue>17</issue>
          ,
          <fpage>2418</fpage>
          -
          <lpage>2434</lpage>
          .
          <string-name>
            <surname>Dogne</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jain</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Jain</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <year>2015</year>
          .
          <article-title>Evolving trends and its application in web usage mining: a survey.</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          <source>International Journal of Soft Computing and Engineering</source>
          .
          <volume>4</volume>
          ,
          <issue>6</issue>
          ,
          <fpage>98</fpage>
          -
          <lpage>101</lpage>
          .
          <string-name>
            <surname>Rutka</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Lauks</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          <year>2007</year>
          .
          <article-title>Study on internet traffic prediction models</article-title>
          .
          <source>Electronics and Electrical Engineering</source>
          . Kaunas: Technologija,
          <volume>6</volume>
          ,
          <issue>78</issue>
          ,
          <fpage>47</fpage>
          -
          <lpage>50</lpage>
          .
          <string-name>
            <surname>Chatfield</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <year>1992</year>
          .
          <source>The analysis of time series: An introduction</source>
          (4th ed.). Chapman &amp; Hall, London.
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          <string-name>
            <surname>Haykin</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <year>2009</year>
          .
          <source>Neural networks and learning machines</source>
          (3rd ed.). Pearson Education, Inc., New Jersey.
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>G. P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Patuwo</surname>
            ,
            <given-names>B. E.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Hu</surname>
            ,
            <given-names>M. Y.</given-names>
          </string-name>
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          <article-title>A Simulation Study of Artificial Neural Networks for Nonlinear Time-series Forecasting</article-title>
          .
          <source>Computers &amp; Operations Research</source>
          ,
          <volume>28</volume>
          ,
          <fpage>381</fpage>
          -
          <lpage>396</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>