<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Electricity Demand Prediction using Machine Learning</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Manpreet Kaur</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Shalini Panwar</string-name>
          <email>panwarshalini40@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ayush Joshi</string-name>
          <email>ayushjoshi75.aj@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kapil Gupta</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Forest Regression, Extra Trees, Support Vector Regression</institution>
          ,
          <addr-line>Decision tree. Lasso Regression</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Learning</institution>
          ,
          <addr-line>Temperature, Building</addr-line>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>National Institute of Technology</institution>
          ,
          <addr-line>Kurukshetra Haryana</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Prediction</institution>
          ,
          <addr-line>Demand, Consumption, Residential, Generation, Power, Dynamic, Regression</addr-line>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>of Linear Regression</institution>
          ,
          <addr-line>Lasso Regression, Ridge Regression, Elastic Net Regression, Random</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2021</year>
      </pub-date>
      <fpage>25</fpage>
      <lpage>27</lpage>
      <abstract>
        <p>This paper presents an analysis of the usage of electric power in the residential sector and the prediction of the next day's demand for power consumption; it aims to improve the prediction accuracy, find the best model, and reduce the overall cost of power consumption in a building. Consumption of electric power can be broadly divided into two categories, i.e. the commercial and residential sectors. The procedure consists of three steps, i.e. feature extraction, normalization, and validation. Heavy fluctuations that arise in the residential sector may cause damage to electrical appliances. Prediction is necessary to match the demand of customers with the generation of power at the generating unit. A variable power pattern may cause stress at the power grid, so electric consumption must be predicted in advance so that the load at the power grid can be balanced. To meet the demand, appliances can be swapped from peak hours to off-peak hours, which also reduces the cost on the customers' side. The performance of the different models is compared using different evaluation indices: coefficient of determination (R2), mean absolute error (MAE), and mean squared error (MSE). Lasso Regression and Support Vector Machine outperform the other models, with accuracies of 99.99% and 99.89% and mean squared errors of 0.01% and 0.11%, respectively.</p>
      </abstract>
      <kwd-group>
        <kwd>Prediction</kwd>
        <kwd>Demand</kwd>
        <kwd>Consumption</kwd>
        <kwd>Residential</kwd>
        <kwd>Generation</kwd>
        <kwd>Power</kwd>
        <kwd>Dynamic</kwd>
        <kwd>Regression</kwd>
        <kwd>Temperature</kwd>
        <kwd>Building</kwd>
        <kwd>Machine Learning</kwd>
        <kwd>Linear Regression</kwd>
        <kwd>Lasso Regression</kwd>
        <kwd>Ridge Regression</kwd>
        <kwd>Elastic Net Regression</kwd>
        <kwd>Random Forest Regression</kwd>
        <kwd>Extra Trees</kwd>
        <kwd>Support Vector Regression</kwd>
        <kwd>Decision Tree</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>As we know, electric power plays a vital role
in today's era. Generating power so that it can
fulfill the demand of customers is not an easy
task, as the natural sources used for power
generation are depleting day by day. The total
generation of power is distributed among
various sectors according to their requirement.
Sectors can be residential or commercial
(offices, factories). For now, we focus only on
the residential sector. It is not necessary that all
electrical appliances consume the same amount
of power.</p>
      <p>The power an appliance consumes also
depends on the season and the time of day [12].
In the summer season, air-conditioning
appliances consume more power; in the winter
season, heating appliances consume more
power, whereas in the rainy season, lighting
consumes more power. The total cost of power
in the residential sector also depends on the
type of the hour, i.e. peak hour, off-peak hour,
or mid-peak hour [11], in addition to the amount
of power consumed. During peak hours, the
load at the power grid and the cost of power per
unit hour are higher, whereas during off-peak
hours the load at the power grid and the cost
per unit hour are lowest. To handle this issue,
appliances can be swapped from peak hours to
off-peak hours, or in other words appliances can
be scheduled in order to manage the overall load
of the power grid and the total cost to the
customer. To schedule the appliances, the power
generating company and the customers make a
deal in order to compensate, for example by a
price-based scheme. The performance of the
models is evaluated using the coefficient of
determination (R2) [7], mean absolute error
(MAE) [7], and mean squared error (MSE).</p>
      <p>This paper is organized as follows: Section
II presents related work on machine learning, in
which the 8 regression algorithms used for
prediction are briefly explained. Section III
describes the 4 evaluation indices used to
evaluate the performance of the prediction
models. Section IV describes the system
architecture. Section V gives the experimental
settings of the models, explaining the dataset
and how the parameters of the models are
tuned. Section VI reports the experimental
observations and results, and Section VII
finally concludes the paper.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
    </sec>
    <sec id="sec-3">
      <title>2.1 Linear Regression</title>
      <p>Linear Regression is used to find the
relationship between a predictor (independent)
variable and a target (dependent) variable. If one
variable can be expressed exactly in terms of the
other, the relationship is known as deterministic.
The basic idea of the linear regression model is
to derive the best-fit line, also known as the
regression line. The sum of the distances between
the points on the graph and the regression line is
the total prediction error over all the data points.
The smaller the error, the better the result, and
vice versa [8].</p>
      <p>Linear regression equation:
y(x) = b0 + b1 * x     (1)
Here b0 is the intercept whereas b1 is the slope of
the regression line. In order to get the minimum
error, the values of b0 and b1 should be chosen
so that the error is minimized.</p>
      <p>The error between the predicted value and
the actual value can be calculated as:
E = Σi=1..n (yi − yi′)²     (2)</p>
      <sec id="sec-3-1">
        <title>Exploring the value of ‘b1’:</title>
        <p>If b1&lt;0 then it will have a negative relationship,
which means a decrease in target value with an
increase in predictor value. If b1&gt;0 then it will
have a positive relationship, which means the
value of the target will increase with the
increase in the value of predictors.</p>
        <p>Exploring the value of b0: If the predictor is 0
then the equation will be meaningless and of no
use.</p>
        <p>Figure 1 shows the graph of the predicted
value of power in a residential sector.</p>
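        <p>As a minimal illustrative sketch (not part of the original study), equation (1) can be
fitted with scikit-learn; the toy arrays below merely stand in for the residential power data.</p>
        <preformat>
# Minimal sketch: fit the regression line y(x) = b0 + b1 * x of equation (1).
# The toy arrays are placeholders for the residential power data.
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1.0], [2.0], [3.0], [4.0]])   # predictor, e.g. hour of day
y = np.array([0.8, 1.1, 1.9, 2.4])           # target, e.g. active power in kW

model = LinearRegression().fit(X, y)
b0, b1 = model.intercept_, model.coef_[0]    # intercept b0 and slope b1
print(b0, b1, model.predict([[5.0]]))        # prediction for a new point
</preformat>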
      </sec>
    </sec>
    <sec id="sec-4">
      <title>2.2 Lasso Regression</title>
      <p>It stands for Least Absolute Shrinkage and
Selection Operator. From its full form, it is clear
that it uses shrinkage and that it is a type of
Linear Regression. Here, shrinking means that
the values of the dataset will be shrunk towards
a central point, similar to the mean. The
performance of this model is good when the
dataset contains multicollinearity.</p>
      <sec id="sec-4-1">
        <title>Lasso</title>
      </sec>
      <sec id="sec-4-2">
        <title>Regression</title>
        <p>undergoes</p>
        <p>L1
Regularization means it is the summation of the
absolute
value
of the
magnitude
of the
coefficient. Here, a few of the coefficients can
be zero and that values can be eliminated from
the dataset. Larger penalties will result in the
coefficient values near zero whereas smaller
penalties will result in the coefficient values far
away from zero. The aim of this algorithm is to
minimize the error
−
∑ 
∗  )
2
+
λ
(3)

∑</p>
        <p>=1(

∑
 =1</p>
        <p>|  |</p>
        <p>If λ = 0 means there is an absence of
regularization and thus we get Ordinary Least</p>
      </sec>
      <sec id="sec-4-3">
        <title>Squares solution. When λ-&gt;</title>
        <p>INF,
then
coefficients will lead to 0 and the model left out
be a constant function To tune the parameters,
λ is the amount of shrinkage when λ = 0,
parameters will not be eliminated. When λ
increases, bias also increases whereas when λ
decreases, variance also decreases. Bias and
variance are inversely proportional to each
model.</p>
        <p>It is a challenging task to select one variable
as the predictor which particulates suite the
property of Lasso Regression. The selection can
be done haphazardly but it can result in a very
bad decision means a very time-consuming
process.</p>
      </sec>
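      <p>As a small illustrative sketch (with invented data, not the paper's dataset), the L1
shrinkage described above can be observed directly: as alpha, which plays the role of λ in
equation (3), grows, more coefficients become exactly zero.</p>
      <preformat>
# Sketch of L1 shrinkage: a larger alpha (the lambda of equation (3)) zeroes out more coefficients.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))                    # five predictors, only the first is informative
y = 2.0 * X[:, 0] + 0.1 * rng.normal(size=100)

for alpha in (0.01, 0.1, 1.0):
    lasso = Lasso(alpha=alpha).fit(X, y)
    print(alpha, np.count_nonzero(lasso.coef_))  # the coefficient vector gets sparser as alpha grows
</preformat>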
    </sec>
    <sec id="sec-5">
      <title>2.3 Ridge Regression</title>
      <p>If an overfitting or underfitting problem
arises, plain linear regression may not work
well. Ridge Regression is a method to create a
parsimonious model when the number of
predictor values is larger than the number of
observations, or when there is correlation
between predictor values, i.e. the dataset has
multicollinearity.</p>
      <p>Tikhonov's method covers a larger set of
problems than the parsimonious model, but it is
similar to ridge regression. Even if a dataset
contains statistical noise, this model can still
produce a solution.</p>
      <sec id="sec-5-1">
        <title>Ridge regression undergoes L2</title>
        <p>regularization. Also known as the L2 penalty.
coefficients of data values are shrunk by the
same factor and none of the value is eliminated.
Unlike L1 regularization, L2 will not result in a
sparse model.</p>
        <p>∑

 =1(

∑
 =0 
−   ′)2</p>
        <p>2
∗  ) + λ ∗ ∑
=

 =0  2</p>
        <p>∑
 =1(
−
(4)
To strengthen the term of penalty, we have to
tune the parameter i.e. λ When λ is 0, least
squares and ridge regression are equal. When λ
is ∞, all coefficient will be zero. The overall
penalty will range from 0 to ∞ Overall, Least
Square uses the following equation:
 ′ = ( ′ )−1 ′
(5)</p>
        <p>Here, X is a scaled and centered matrix.</p>
      </sec>
      <sec id="sec-5-2">
        <title>When columns of the X</title>
        <p>matrix have high
multicollinearity then the cross product of
(X’X)
matrix</p>
        <p>will be singular or nearly
Singular. Including ridge parameter (k) to the
above equation, then the new equation will be
 ′ = ( ′ +  )−1 X’Y
(6)</p>
      </sec>
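      <p>As an illustrative sketch on assumed synthetic data, the closed-form ridge estimator of
equation (6) can be checked against scikit-learn's Ridge, whose alpha parameter corresponds to
the ridge parameter k.</p>
      <preformat>
# Sketch: compare the closed-form ridge estimator of equation (6) with sklearn's Ridge.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))                     # scaled and centered predictors
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)

k = 0.1
beta_closed = np.linalg.inv(X.T @ X + k * np.eye(3)) @ X.T @ y   # (X'X + kI)^-1 X'Y
beta_sklearn = Ridge(alpha=k, fit_intercept=False).fit(X, y).coef_
print(beta_closed, beta_sklearn)                 # the two estimates agree
</preformat>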
    </sec>
    <sec id="sec-6">
      <title>2.4 Elastic Net Regression</title>
      <p>It is a technique that uses properties of the
L1 penalty (Lasso Regression) and the L2
penalty (Ridge Regression). To improve the
regularization, we combine both lasso and ridge
regression. It is a 2-step process: in the first
step it finds the coefficients as in ridge
regression by selecting grouped features, and in
the second step it performs a lasso-type
shrinkage of the coefficients by performing
feature selection. The objective of this model is
to minimize the following equation:</p>
      <p>Lenet(β) = (1/(2n)) * Σi=1..n (yi − xiβ)² + λ * ( ((1 − α)/2) * Σj=1..m βj² + α * Σj=1..m |βj| )     (7)</p>
      <p>Here, α is the mixing parameter i.e. α = 1
reduces the function to lasso regression
whereas α = 0 reduces the function to ridge
regression. Parameter λ is highly dependent on
the α parameter. It has better predictive
potential than lasso regression.</p>
      <p>One of the biggest disadvantages of Elastic Net
Regression is that it may or may not remove all
the irrelevant coefficients.</p>
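      <p>The mixing behaviour of equation (7) can be illustrated with a short sketch (synthetic
data, values chosen only for illustration): an l1_ratio close to 1 behaves like lasso and gives
sparse coefficients, while an l1_ratio close to 0 behaves like ridge.</p>
      <preformat>
# Sketch of the alpha / l1_ratio mixing of equation (7); data and values are illustrative only.
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 8))
y = X[:, 0] - 0.5 * X[:, 1] + 0.1 * rng.normal(size=200)

for l1_ratio in (0.1, 0.5, 0.9):                 # ridge-like, mixed, lasso-like penalties
    enet = ElasticNet(alpha=0.01, l1_ratio=l1_ratio).fit(X, y)
    print(l1_ratio, np.count_nonzero(enet.coef_))
</preformat>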
      <p>Figure 2 shows the relationship between
Lasso Regression, Ridge Regression, and
Elastic Net Regression.</p>
    </sec>
    <sec id="sec-6-1">
      <title>2.5 Decision Tree</title>
      <p>It is a supervised machine learning
algorithm. As the name suggests, it is a
decision-making tool and it uses a flowchart-like
tree structure [8]. It supports both continuous
and discrete output values. Here, an example of
continuous output is predicting the required
power of the building, where our ultimate goal
is to reduce the overall cost of the power,
whereas a discrete output value means, for
example, predicting whether or not it rains on a
particular day.</p>
      <p>Decision nodes correspond to the conditions
of a flowchart whereas terminal nodes correspond
to the results of a flowchart. The root node is
called the best predictor node, as it is the
topmost decision node. Every machine learning
model has its advantages and disadvantages, but
the advantage of the decision tree is that it is a
very good model for handling tabular data with
categorical features (with fewer than hundreds
of categories) as well as numerical data.</p>
      <p>The decision tree can capture the non-linear
interaction between the predictors and the
target value. Suppose the target variable is the
air conditioner and the predictor variables are
room occupancy (empty or not) and outdoor
air temperature (&lt;= 26 °C); see Figure 3.</p>
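      <p>The air-conditioner example of Figure 3 can be sketched as follows; the numbers are
invented for illustration, and the model predicts a continuous power value from room occupancy
and outdoor temperature.</p>
      <preformat>
# Toy sketch of the Figure 3 example: occupancy and outdoor temperature predict power demand.
from sklearn.tree import DecisionTreeRegressor

# columns: [room_occupied (0 or 1), outdoor_temperature_in_C]
X = [[1, 30], [1, 24], [0, 30], [0, 24], [1, 35], [0, 35]]
y = [2.5, 0.8, 0.3, 0.2, 3.1, 0.4]               # assumed power demand in kW

tree = DecisionTreeRegressor(random_state=42).fit(X, y)
print(tree.predict([[1, 28]]))                   # occupied room on a warm day
</preformat>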
    </sec>
    <sec id="sec-7">
      <title>2.6 Random Forest</title>
      <p>It is a supervised machine learning
algorithm. A random forest can perform both
classification and regression tasks. The random
forest contains multiple decision trees, and its
output does not depend on only one decision
tree but on every single decision tree. Every
tree is independent; no tree interacts with
another while building the model. All these
trees run in parallel but independently. Every
tree performs its own prediction, these
predictions are aggregated, and their arithmetic
mean is taken to produce a single final result.
It can be formulated as:
g(x) = (1/n) * (f1(x) + f2(x) + ... + fn(x))     (8)
Here, g(x) is the single final result whereas
fi(x) is an individual decision tree.</p>
      <p>Each decision tree is built using a random
sample drawn from the original dataset by
splitting it, which adds randomness and helps
prevent overfitting. Random forest is one of the
most accurate models and can handle thousands
of predictors without the deletion of any
variable.</p>
      <p>From Figure 4, it is clear that Random Forest is
multiple Decision Trees with multiple features.</p>
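      <p>The averaging of equation (8) can be made concrete with a short sketch on synthetic data
(not the household dataset): the forest prediction equals the arithmetic mean of the individual
tree predictions.</p>
      <preformat>
# Sketch: a random forest prediction is the mean of its individual trees, as in equation (8).
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 4))
y = X[:, 0] ** 2 + X[:, 1] + 0.1 * rng.normal(size=200)

forest = RandomForestRegressor(n_estimators=50, random_state=42).fit(X, y)
tree_preds = np.array([t.predict(X[:5]) for t in forest.estimators_])
print(forest.predict(X[:5]))                     # forest output for five samples
print(tree_preds.mean(axis=0))                   # mean over the 50 trees: the same values
</preformat>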
    </sec>
    <sec id="sec-8">
      <title>2.7 Extra Trees</title>
      <p>It is also known as Extremely Randomized
Trees. Unlike, Random Forest and Decision
Tree, Extra Trees makes the next best split from
the uniform random splits from the subsets of
features and can't be substituted with another
sample. Extra Trees creates a greater number of
unpruned Decision Trees. Unlike Random
Forest, it makes random split. In addition to the
optimization of algorithms, it also adds
randomization. This model is faster than other
models. It takes less time to compute as it
doesn't have to select the optimal split but a
random split.</p>
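      <p>A brief sketch (synthetic data, illustrative values) contrasting Extra Trees with Random
Forest: the difference lies in the random rather than optimal split selection, which usually
makes Extra Trees faster to fit.</p>
      <preformat>
# Sketch comparing Extra Trees and Random Forest on the same synthetic regression task.
from sklearn.datasets import make_regression
from sklearn.ensemble import ExtraTreesRegressor, RandomForestRegressor
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=500, n_features=8, noise=5.0, random_state=0)
extra = ExtraTreesRegressor(n_estimators=150, random_state=42)    # random splits
forest = RandomForestRegressor(n_estimators=50, random_state=42)  # optimal splits
print(cross_val_score(extra, X, y, cv=5).mean())                  # R2 under 5-fold cross-validation
print(cross_val_score(forest, X, y, cv=5).mean())
</preformat>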
    </sec>
    <sec id="sec-9">
      <title>2.8 Support Vector Regression</title>
      <p>This algorithm is one of the most popular
algorithms for regression problems. Basically
[8], it draws a boundary line or straight line so
that the n-dimensional space can be segregated
into classes. The boundary line is drawn in such
a way that it can cover the maximum number of
data points between them. This boundary line is
known as a hyperplane. There are two types of
SVR. Linear SVR: this type of data is known as
linearly separable data, and a single straight
line is drawn to differentiate two classes.</p>
      <p>Non-linear SVR: this type of data is known
as non-linearly separable data. It is not possible
to segregate the data into classes with just one
single line.</p>
      <p>Both linear and non-linear data are handled
by the SVR kernel. The kernel helps to find and
draw the hyperplane without increasing the cost
in n-dimensional space. Sometimes it is not
possible to find the hyperplane in n-dimensional
space, so we move to an (n+1)-dimensional
space. The value of the kernel can be poly, RBF,
sigmoid, or gaussian for non-linear datasets,
whereas for a linear dataset the value should be
a linear kernel only.</p>
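      <p>As a hedged sketch of an RBF-kernel SVR (synthetic data; the C and gamma values
anticipate the settings reported in Section 5): because SVR is sensitive to the feature scale,
the features are standardized first.</p>
      <preformat>
# Sketch of an RBF-kernel SVR with standardized features; data and values are illustrative.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(4)
X = rng.uniform(-3, 3, size=(300, 2))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=300)

svr = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=100, gamma=0.1)).fit(X, y)
print(svr.predict(X[:3]))                        # predictions for the first three samples
</preformat>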
      <p>Cross-validation is also one of the
techniques which can be used with Support
Vector Regression for the purpose of training
the model and then evaluating it. A model may
fail to generalize the pattern of the dataset, but
over- or under-fitting can still be detected,
whereas cross-validation is used to find the most
accurate parameter values, although it may fail
to enhance the accuracy by itself.</p>
    </sec>
    <sec id="sec-14">
      <title>3. Evaluation Indices</title>
      <p>Evaluation indices are considered in order
to evaluate the models by various authors. The
performance can be checked by finding the
accuracy and the error of the models: the lesser
the error, the higher the accuracy, so the indices
are the means to check the error and to try to
reduce it. The value of R2 varies between 0 and
1, defining the accuracy of the model.</p>
      <p>MAE is also dependent on the scale.
Basically, it takes the absolute value of the error
at all the data points, whether the error is
negative or positive, so that no error cancels out
the effect of another. Here, SSres is the sum of
the squares of the residual error, SStot is the
total sum of squares of the error, and n is the
number of data points on the graph. If R2 &gt; 0
it means the result is accurate, R2 = 0 means
the model predicts the same result as the mean,
and R2 &lt; 0 means ambiguous results.
R2 = 1 − (SSres / SStot)     (9)
SSres = Σi=1..n (yi − yi′)²     (9 i)
SStot = Σi=1..n (yi − ȳ)²     (9 ii)
RMSE = √( (1/n) * Σi=1..n (yi − yi′)² )     (10)
MSE = (1/n) * Σi=1..n (yi − yi′)²     (11)
MAE = (1/n) * Σi=1..n |yi − yi′|     (12)</p>
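      <p>As an illustrative sketch (placeholder arrays, not the study's results), the indices
(9)-(12) can be computed with scikit-learn and NumPy as follows.</p>
      <preformat>
# Sketch: computing the evaluation indices (9)-(12); y_true and y_pred are placeholders.
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = np.array([1.0, 1.5, 2.0, 2.5])
y_pred = np.array([1.1, 1.4, 2.2, 2.4])

r2 = r2_score(y_true, y_pred)                         # equation (9)
rmse = np.sqrt(mean_squared_error(y_true, y_pred))    # equation (10)
mse = mean_squared_error(y_true, y_pred)              # equation (11)
mae = mean_absolute_error(y_true, y_pred)             # equation (12)
print(r2, rmse, mse, mae)
</preformat>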
    </sec>
    <sec id="sec-15">
      <title>4. System Architecture</title>
      <p>From the system architecture figure, i.e.
Figure 6, the input parameters for the prediction
of power include the type of day, i.e. summer,
winter, or rainy. The material used for the
construction of the building is also an input;
an insulating material would be best. The
dataset can be on a daily, hourly, or yearly
basis, etc. Other details of the building can also
be included, i.e. height, width, illumination,
occupancy, etc. The dataset itself can be of 3
types, i.e. real data, simulated data, or
sensor-based data [8]. After analyzing the
dataset, it undergoes the feature extraction
phase, in which filtration of the dataset is done,
i.e. unusual data and noise are discarded and
only useful data is left behind. It then undergoes
a transformation process, i.e. the dataset is
transformed according to the requirement of the
algorithm, and after that the size of the dataset
is decreased to increase the performance; this
process is known as reduction of the dataset.</p>
      <p>After feature extraction, transformation, and
reduction, the entire dataset is divided into
training and testing datasets, and there is a
training and a testing phase of the model. In the
training of a model we first have to select an
appropriate algorithm for the prediction, and
training can be done in two ways: a first
principles approach, in which the prediction of
power is done based on the current situation of
the building rather than observing its history, or
a data-driven approach, which uses detailed
historical information about the building. The
results are then validated and the accuracy is
measured. If this ends with good accuracy, our
algorithm is ready for an unknown dataset of a
building and for predicting its demand of power.
Thus, the results are compared based on the
evaluation metrics and one model is declared as
the best model with the greatest accuracy and
minimum error.</p>
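      <p>A minimal sketch of the split-and-normalise step described above, assuming the
per-minute CSV from Section 5 with a Global_active_power output column (the file path and the
column name are assumptions); non-numeric columns such as the timestamp are dropped before
scaling.</p>
      <preformat>
# Sketch of the 90/10 split and min-max normalisation; the CSV path and the
# Global_active_power column name are assumptions, not confirmed by the paper.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

df = pd.read_csv("pwrpred.csv")
y = df["Global_active_power"]                                  # assumed output column
X = df.drop(columns=["Global_active_power"]).select_dtypes("number")

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.10, random_state=42)
scaler = MinMaxScaler().fit(X_train)                           # fit on the training split only
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)
</preformat>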
    </sec>
    <sec id="sec-10">
      <title>5. Experimental Settings</title>
      <p>Using various libraries of Python such as
pandas and SciPy, we can carry out the analysis.
For the basic implementation of data mining or
ML we are using the library "sklearn".
Scikit-learn is a Python module which
integrates a wide range of ML algorithms with
other Python libraries such as NumPy, SciPy,
and Matplotlib. We are trying to use this as our
study shows that it gives efficient and simple
solutions. Training and testing can then be done
with the different models, which are compared
with each other in order to get better outcomes.
From the review of different authors and
according to our online study, we can say that
ensemble models of machine learning give
better performance than others. For the time
being, the study of the different models has
been done and a dataset has been collected.
Gradient boosting may also give us accurate
results; it is an algorithm which trains various
models in a gradual, additive, and sequential
manner. Since this algorithm is prone to
overfitting, it relies on hyperparameter tuning.
This analysis totally depends on how much
accuracy we demand from the available dataset.
On the other hand, random forests also prove to
give efficient results. Boosted, iteratively
trained models can additionally use a process
known as "early stopping", in which training
stops once the performance on held-out test data
stops improving further. This is an optimization
technique and it also avoids overfitting.
Therefore, this also can be a model for our
project to be applied and analysed, and it is
beneficial for classification as well as
regression problems. It can also be modelled for
categorical values. After the search results, this
model can also be compared for study purposes,
gathering more data and then visualizing it for
more accuracy and better performance. The
results obtained after training can be gathered
in a document and a conclusion drawn.</p>
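      <p>The early-stopping idea mentioned above can be sketched with scikit-learn's
GradientBoostingRegressor (synthetic data; the parameter values are illustrative):
n_iter_no_change stops adding trees once the score on an internal validation split stops
improving.</p>
      <preformat>
# Sketch of gradient boosting with early stopping; dataset and values are illustrative only.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

X, y = make_regression(n_samples=1000, n_features=6, noise=10.0, random_state=0)
gbr = GradientBoostingRegressor(n_estimators=500,
                                validation_fraction=0.1,   # internal 10% validation split
                                n_iter_no_change=10,       # early-stopping patience
                                random_state=42).fit(X, y)
print(gbr.n_estimators_)                                   # trees actually fitted before stopping
</preformat>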
      <p>Experiments should be done to declare one
model as the best model, and for experiments
we need a dataset. Generally, the dataset can
vary from 2 weeks [<xref ref-type="bibr" rid="ref9">9</xref>] to 4 years [10]. So, the
dataset is taken from Kaggle and we uploaded it
on GitHub, link:
https://raw.githubusercontent.com/navkapil/googlecolab/master/pwrpred.csv</p>
      <p>The dataset consists of 1048576 rows and 9
columns. It contains per-minute data of the day
for approximately 2 years, from 16-12-2006
17:24:00 to 13-12-2008 21:38:00. The 9 columns
of the dataset are DateTime, global active
power, global reactive power, voltage, global
intensity, sub-metering 1, sub-metering 2,
sub-metering 3, and sub-metering 4. Out of all
these parameters, global active power is taken
as the output and all the others as input
parameters. This dataset is divided into training
and testing sets with percentages of 90% and
10% respectively. After this division, it
undergoes normalization so that all the
parameters lie in the same range. The models
used for the experiments are Linear Regression,
Decision Tree, Random Forest, Extra Trees,
Lasso Regression, Ridge Regression, Elastic
Net Regression, and Support Vector Regression.
The result also depends on the type of the
dataset, i.e. whether it is a linear or a non-linear
dataset. All the parameters of each model start
from their default values.</p>
      <p>The performance of SVR is affected if the
data is linear and we set the value of the kernel
as non-linear (RBF, poly, gaussian), and
vice-versa. The kernel we have taken is RBF.
We checked values of C varying from 0.01 to
100 and of gamma from 0.01 to 10 by factors of
10, and we got good results at C = 100, gamma
= 0.1, and degree 3 (the default). Linear
Regression is the baseline model for this project
because it gives very good results by taking the
default values of all the parameters.</p>
      <p>Decision Tree, Random Forest, and Extra
Trees are somewhat similar models. In the
Decision Tree the default value of random_state
is None, but we set it to 42, as this parameter
controls the randomness of the estimator. In the
Random Forest, n_estimators gives the number
of trees to be formed; we checked the
performance of the model by varying its value
from 10 to 100 and got a better result for a
value of 50, whereas the value of n_estimators
in Extra Trees is 150 (its default value is 100).
If we set the value of alpha in Lasso Regression
to 0 then it works as Linear Regression, but this
is not advised; for better performance, we set its
value to 0.01. Similarly, in Ridge Regression,
alpha = 0 reduces it to Linear Regression, so it
is better to tune it, and we use a value of 0.1.
Basically, alpha regularization improves the
conditioning of the problem and hence lowers
the variance of the estimates. For Elastic Net
Regression, we tuned two parameters, alpha and
l1_ratio. For alpha = 0 it is solved by Linear
Regression. If l1_ratio = 0 then the penalty is
l2, i.e. Ridge Regression, and if l1_ratio = 1
then the penalty is l1, i.e. Lasso Regression; 0
&lt; l1_ratio &lt; 1 is a combination of the l1
and l2 penalties. All the remaining parameters
of the models keep their default values.</p>
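      <p>The tuned settings described in this section can be collected in a short sketch (a
summary under the stated values, not the authors' exact script); all other parameters keep their
scikit-learn defaults.</p>
      <preformat>
# Sketch collecting the eight models with the tuned values reported in Section 5.
from sklearn.ensemble import ExtraTreesRegressor, RandomForestRegressor
from sklearn.linear_model import ElasticNet, Lasso, LinearRegression, Ridge
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

models = {
    "Linear Regression": LinearRegression(),                  # baseline, default parameters
    "Lasso Regression": Lasso(alpha=0.01),
    "Ridge Regression": Ridge(alpha=0.1),
    "Elastic Net": ElasticNet(),                               # alpha and l1_ratio as discussed above
    "Decision Tree": DecisionTreeRegressor(random_state=42),
    "Random Forest": RandomForestRegressor(n_estimators=50),
    "Extra Trees": ExtraTreesRegressor(n_estimators=150),
    "Support Vector Regression": SVR(kernel="rbf", C=100, gamma=0.1, degree=3),
}
</preformat>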
    </sec>
    <sec id="sec-11">
      <title>6. Experimental Results</title>
      <p>Table 1 shows the results of the different
models using the evaluation indices, i.e. RMSE,
R2_score, MSE, and MAE, for the prediction of
next-day power consumption.</p>
      <p>From Table 1, it is clear that Support
Vector Machine and Lasso Regression give
better accuracy with the minimum error, whereas
Linear Regression is the worst performer of
these 8 models; that is why Linear Regression is
taken as the baseline model and Support Vector
Machine as the benchmark system.</p>
      <p>Figures 7 and 8 show that Elastic Net,
Random Forest, Extra Trees, Support Vector
Regression, and Lasso all have nearly the same
results, but Support Vector Regression and
Lasso give the best results. (Figure 7: RMSE and
R2_score for the various regressors. Figure 8:
test error in the prediction with respect to the
various regressors.)</p>
    </sec>
    <sec id="sec-12">
      <title>7. Conclusion</title>
      <p>This paper focuses on the implementation of
various machine learning algorithms for
predicting the power demand of buildings. It is
not necessary that a model will always show a
good result for a particular dataset; sometimes
it may show uncertain results, as every model
has its pros and cons. It would also be very
difficult to predict the effect on the other
parameters of varying one or more parameters
of the building, as we cannot set all the
parameters according to our requirements.
Because of prediction, it becomes much easier
to make long-term plans. Weather data plays a
vital role in the prediction of the power of a
building. According to our analysis of the
results of the models, Support Vector Regression
and Lasso Regression are the best models. To
increase the accuracy further, we can use Long
Short-Term Memory (LSTM), as it is a very
robust deep learning algorithm for time-based
forecasting and has the potential to give
accurate prediction results, or a hybrid
approach.</p>
    </sec>
    <sec id="sec-13">
      <title>8. References</title>
      <p>
        [
        <xref ref-type="bibr" rid="ref12 ref19">1</xref>
        ] Setlhaolo, D., Xia, X., &amp; Zhang, J. (2014).
      </p>
      <p>Optimal sceduling of household appliances
for demand response. Electric Power
Systems Research, 116, 24-28.
[2] Gayatri, P., Sukumar, G. D., &amp;
Jithendranah, J. (2015, December). Effect of
load change on source parameters in power</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>system. In 2015 Conference on Power,</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Sustainable Growth (PCCCTSG</surname>
          </string-name>
          ) (pp.
          <fpage>178</fpage>
          -
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          182). IEEE. [3]
          <string-name>
            <surname>Amasyali</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>El-Gohary</surname>
            ,
            <given-names>N. M.</given-names>
          </string-name>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <source>and Sustainable Energy Reviews</source>
          ,
          <volume>81</volume>
          ,
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          11921205. [4]
          <string-name>
            <surname>Gomes</surname>
            ,
            <given-names>Á.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Antunes</surname>
            ,
            <given-names>C. H.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Oliveira</surname>
          </string-name>
          , E.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          (
          <year>2011</year>
          ).
          <article-title>Direct load control in the</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          13-
          <fpage>26</fpage>
          ). Springer, Berlin, Heidelberg. [5]
          <string-name>
            <surname>Babar</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ahamed</surname>
            ,
            <given-names>T. I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>AlAmmar</surname>
            ,
            <given-names>E. A.</given-names>
          </string-name>
          ,
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          &amp;
          <string-name>
            <surname>Shah</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2013</year>
          ).
          <article-title>A novel algorithm for</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <surname>Procedia</surname>
          </string-name>
          ,
          <volume>42</volume>
          ,
          <fpage>607613</fpage>
          . [6]
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          (
          <year>2013</year>
          , June). Prediction
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <article-title>based on support vector regression</article-title>
          .
          <source>In 2013</source>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <source>9th Asian Control Conference (ASCC)</source>
          (pp.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          1-
          <fpage>5</fpage>
          ). IEEE. [7]
          <string-name>
            <surname>Muralitharan</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sakthivel</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Shi</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          (
          <year>2016</year>
          ). Multiobjective optimization
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <surname>Neurocomputing</surname>
          </string-name>
          ,
          <volume>177</volume>
          ,
          <fpage>110</fpage>
          -
          <lpage>119</lpage>
          . [8]
          <string-name>
            <surname>Amasyali</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>El-Gohary</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <given-names>analytics. Procedia</given-names>
            <surname>Engineering</surname>
          </string-name>
          ,
          <volume>145</volume>
          ,
          <fpage>511</fpage>
          -
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          517. [9]
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          (
          <year>2013</year>
          , June). Prediction
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <article-title>based on support vector regression</article-title>
          .
          <source>In 2013</source>
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <source>9th Asian Control Conference (ASCC)</source>
          (pp.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          1-
          <fpage>5</fpage>
          ). IEEE [10]
          <string-name>
            <surname>Dagnely</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ruette</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tourwé</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <string-name>
            <surname>Tsiporkova</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Verhelst</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          (
          <year>2015</year>
          ,
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <source>for Renewable Energy Integration</source>
          (pp.
          <fpage>105</fpage>
          -
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          122). Springer, Cham. [11]
          <string-name>
            <surname>Naji</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Çelik</surname>
            ,
            <given-names>O. C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alengaram</surname>
          </string-name>
          , U. J.,
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          <string-name>
            <surname>Jumaat</surname>
            ,
            <given-names>M. Z.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Shamshirband</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          <volume>84</volume>
          ,
          <fpage>727</fpage>
          -
          <lpage>739</lpage>
          . [12]
          <string-name>
            <surname>Ali</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ahmad</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          (
          <year>2012</year>
          ,
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          <article-title>smart grid based on M2M</article-title>
          .
          <source>In</source>
          <year>2012</year>
          10th
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          <string-name>
            <given-names>Information</given-names>
            <surname>Technology</surname>
          </string-name>
          (pp.
          <fpage>231</fpage>
          -
          <lpage>236</lpage>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          IEEE. [13]
          <string-name>
            <surname>Hahn</surname>
            , H., Meyer-Nieberg,
            <given-names>S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Pickl</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          (
          <year>2009</year>
          ).
          <article-title>Electric load forecasting methods:</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          <source>journal of operational research</source>
          ,
          <volume>199</volume>
          (
          <issue>3</issue>
          ),
          <fpage>902</fpage>
          -
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          907. [14]
          <string-name>
            <surname>Gonzalez-Romera</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jaramillo-Moran</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          <string-name>
            <given-names>A.</given-names>
            , &amp;
            <surname>Carmona-Fernandez</surname>
          </string-name>
          ,
          <string-name>
            <surname>D.</surname>
          </string-name>
          (
          <year>2006</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          <source>Transactions on power systems</source>
          ,
          <volume>21</volume>
          (
          <issue>4</issue>
          ),
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          1946-
          <fpage>1953</fpage>
          . [15]
          <string-name>
            <surname>Stavrakas</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Flamos</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2020</year>
          ). A
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          <string-name>
            <surname>Management</surname>
          </string-name>
          ,
          <volume>205</volume>
          ,
          <fpage>112339</fpage>
          . [16]
          <string-name>
            <surname>Luo</surname>
            ,
            <given-names>X. J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oyedele</surname>
            ,
            <given-names>L. O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ajayi</surname>
            ,
            <given-names>A. O.</given-names>
          </string-name>
          , &amp;
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          <string-name>
            <surname>Akinade</surname>
            ,
            <given-names>O. O.</given-names>
          </string-name>
          (
          <year>2020</year>
          ). Comparative study
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          andSociety,
          <volume>61</volume>
          ,
          <fpage>102283</fpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>