Proceedings of the International Conference on Big Data, Cloud and Applications, Tetuan, Morocco, May 25-26, 2015

Electric load forecasting using a hybrid machine learning approach incorporating feature selection

Malek Sarhani (ENSIAS, Mohammed-V University, Rabat, Morocco, malek.sarhani@um5s.net.ma)
Abdellatif El Afia (ENSIAS, Mohammed-V University, Rabat, Morocco, abdellatif.elafia@gmail.com)

ABSTRACT

Forecasting future electricity demand is very important for the electric power industry. Since electric load is influenced by various factors, it has been shown in several publications that machine learning methods are useful for electric load forecasting (ELF). On the one hand, we introduce in this paper the support vector regression (SVR) approach for ELF; in particular, we use the particle swarm optimization (PSO) algorithm to optimize the SVR parameters. On the other hand, it is important to identify the irrelevant factors in a preprocessing step for ELF. Our contribution consists of investigating the importance of applying feature selection for removing the irrelevant factors of electric load. The experimental results elucidate the feasibility of applying feature selection without decreasing the performance of the SVR-PSO model for ELF.

I. INTRODUCTION

For developing countries, accurate electric load forecasting (ELF) is an important guide for effective energy policies. Furthermore, accurate electric load forecasting models are essential to the operation and planning of many companies: forecasts have an impact on energy purchasing and generation, load switching, contract evaluation, and infrastructure development. The cost of forecasting error is so high that research on techniques which could reduce it by a few percentage points would be amply justified.

Load forecasts can be divided into three categories: short-term forecasts, usually from one hour to one week; medium-term forecasts, usually from a week to a year; and long-term forecasts, longer than a year. Short-term forecasting is essential for the control and scheduling of power systems [35]. However, daily load forecasting is a hard task, because the load depends not only on the load of the previous days but also on other factors such as temperature and calendar effects [42].

Nowadays, there are different techniques for producing forecasts. On the one hand, classical statistical forecasting methods such as exponential smoothing (Winters 1960 [43]) or the ARIMA models of Box and Jenkins [4] can be used for this purpose. With these traditional methods, however, the construction of an ELF model may be difficult because of its uncertain, non-linear, dynamic and complicated characteristics: electric load data present nonlinear patterns caused by influencing factors such as climate and seasonal factors (Amjady and Keynia 2009 [1]). Thus, methods based on artificial intelligence techniques such as artificial neural networks [25], genetic algorithms (Goldberg 1989 [14]), fuzzy logic (Cox 1994 [8]) and support vector machines (SVM) (Vapnik et al. 1997 [38]) can generate better results (Ul Islam 2011 [19]).

In the past few years, various efforts to improve forecasting accuracy have been proposed, many of them based on artificial intelligence techniques. The most widely used method is the artificial neural network (ANN). However, Hu and Zhang (2008) [11] showed that ANNs have inherent drawbacks, such as convergence to local optima, lack of generalization, and uncontrolled convergence. Therefore, the support vector machine (SVM), which overcomes some of these drawbacks, was introduced to provide a model with better predictive power and thus a more accurate forecast.

Among the factors influencing ELF that are present in real datasets, some may be redundant or irrelevant. Thus, feature selection (FS) is justified as a first step for ELF. Our contribution in this paper is to investigate the importance of using FS in ELF. The rest of the paper is organized as follows. In the next section, we introduce the concepts underlying our forecasting techniques. Section 3 outlines related work. Section 4 describes the proposed algorithm and the tools used for its implementation, and presents the case studies used for evaluation. Section 5 presents the experimental setup. The results are presented in Section 6. Finally, we conclude and present perspectives for future work.
II. THE HYBRID MACHINE LEARNING TECHNIQUE

A. Support Vector Machine

The support vector machine (SVM) is a tool from the artificial intelligence field based on statistical learning theory. It has been successfully applied in many fields and has attracted increasing interest from researchers. It was first introduced by Vapnik et al. (1992) [3] and initially applied to pattern recognition (classification) problems; later research has yielded extensions to regression problems, including time series forecasting.

SVM belongs to the family of kernel methods, which represent a new generation of learning algorithms and utilize techniques from optimization, statistics, and functional analysis in pursuit of maximal generality, flexibility, and performance. SVM applies the structural risk minimization (SRM) principle to minimize an upper bound on the generalization error, and it can theoretically guarantee reaching the global optimum.

The main use of SVM is in classification. However, a version of the SVM for regression was proposed by Vapnik et al. in 1997 [38].

B. Support Vector Regression

This subsection briefly introduces the idea of SVM for the case of regression (SVR). SVR has been successfully employed to solve forecasting problems in many fields, such as financial time series forecasting [20] and software reliability forecasting [31].

The basic concept of the SVR model is to map the input data (the training set $(x_i, y_i)_{i=1}^{N}$) nonlinearly, with a function $\varphi(\cdot): \mathbb{R}^n \to \mathbb{R}^{n_h}$, into a higher-dimensional feature space (which may have infinite dimension). The SVR function is then

$$f(x) = \omega \varphi(x) + b \qquad (1)$$

where $f(x)$ denotes the forecast value. The coefficients $\omega$ and $b$ are estimated by solving the following formulation, which minimizes the regularized risk function:

$$\min_{\omega, b, \xi_i, \xi_i^*} \ \frac{1}{2}\|\omega\|^2 + C \sum_{i=1}^{N} (\xi_i + \xi_i^*) \qquad (2)$$

$$\text{s.t.} \quad y_i - (\langle \omega, \varphi(x_i) \rangle + b) \le \epsilon + \xi_i, \qquad (\langle \omega, \varphi(x_i) \rangle + b) - y_i \le \epsilon + \xi_i^*, \qquad \xi_i, \xi_i^* \ge 0 \qquad (3)$$

The constant $C$ determines the trade-off between the flatness of $f$ and the amount up to which deviations larger than $\epsilon$ are tolerated; $\xi_i$ denotes the training error above $\epsilon$, $\xi_i^*$ denotes the training error below $-\epsilon$, and $N$ is the number of samples. SVR avoids underfitting and overfitting of the training data by minimizing the regularization term $\frac{1}{2}\|\omega\|^2$ as well as the training error $C \sum_{i=1}^{N}(\xi_i + \xi_i^*)$.

This constrained optimization problem can be solved through the primal Lagrangian form and the Karush-Kuhn-Tucker conditions; the dual problem is to maximize

$$-\frac{1}{2} \sum_{i,j=1}^{N} \alpha_i \alpha_j \langle x_i, x_j \rangle - \epsilon \sum_{i=1}^{N} \alpha_i + \sum_{i=1}^{N} y_i \alpha_i \qquad (4)$$

where $\alpha_i = \beta_i^* - \beta_i$, and $\beta_i^*, \beta_i$ are the Lagrange multipliers obtained by solving the quadratic program. Once this optimization problem is solved, the parameter vector $\omega$ of Equation (2) is obtained as

$$\omega = \sum_{i=1}^{N} (\beta_i^* - \beta_i)\, \varphi(x_i) \qquad (5)$$

Finally, the SVR regression function is obtained in the dual space as

$$f(x) = \sum_{i=1}^{N} (\beta_i^* - \beta_i)\, K(x_i, x) + b \qquad (6)$$

where $K(x_i, x_j)$ is the kernel function: its value equals the inner product of the two vectors $x_i$ and $x_j$ in the feature space, i.e. $K(x_i, x_j) = \langle \varphi(x_i), \varphi(x_j) \rangle$. The most commonly used kernel is the Gaussian radial basis function (RBF) kernel, $K(x_i, x_j) = \exp(-\|x_i - x_j\|^2 / 2\sigma^2)$, which is also employed in this study.

C. Particle Swarm Optimization

The SVR parameters that should be optimized are the penalty parameter $C$ and the tube width $\epsilon$ defined in Equation (2), together with the RBF kernel parameter $\sigma$. The choice of these parameters has a heavy impact on forecasting accuracy, and the PSO algorithm is used to seek a better combination of the three.

Particle swarm optimization (PSO) was originally designed by Kennedy and Eberhart in 1995 [22]. The technique simulates social behaviour among individuals (particles) moving through a multi-dimensional search space, where each particle represents a single intersection of all search dimensions.

In PSO, each particle $i$ has two vectors, a velocity vector and a position vector, and particles are updated according to their own previous best position and the previous best position of the entire swarm. That is, particle $i$ adjusts its velocity $\nu_i$ and position $x_i$ in each generation according to

$$\nu^{n+1} = \omega \nu^n + c_1 r_1 (p^n - x^n) + c_2 r_2 (p_g^n - x^n), \qquad x^{n+1} = x^n + \beta \nu^{n+1} \qquad (7)$$

where $\nu^n$ and $x^n$ are the current velocity and position of the particle, $p^n$ is the best previous position of particle $i$, and $p_g^n$ is the best position found among all particles in the population; $\omega$ is the inertia weight, $c_1$ and $c_2$ are acceleration coefficients, and $r_1$ and $r_2$ are two independent uniformly distributed random variables.

Nowadays, PSO has gained much attention and wide application in solving continuous nonlinear optimization problems thanks to its simple concept, easy implementation and quick convergence [5].

D. Feature selection

Feature selection (FS), also known as variable selection or attribute selection, aims at identifying the most relevant input variables within a dataset. It may improve the performance of the predictors by eliminating irrelevant inputs, achieves data reduction for accelerated training, and increases computational efficiency [34]. It is usually employed to identify a subset of variables whose meaning is important.

Most feature selection algorithms perform a search through the space of feature subsets. Several characteristics affect the nature of this search; the most important are the search organization (heuristic strategies are generally more feasible and adaptable for this problem) and the evaluator (two major families of methods can be distinguished: filter and wrapper methods).

Moreover, selecting the key variables is crucial in constructing an energy load forecasting model. Furthermore, according to Lu (2014) [23], a major disadvantage of SVR is that it cannot by itself select the important variables among many predictors.
III. LITERATURE REVIEW

Related work for this research includes the use of SVM and SVR for electric load forecasting (ELF) in general. In particular, we focus on works which used the hybrid SVR-PSO model for load forecasting and on works concerned with the selection of relevant attributes.

SVM and SVR have indeed been applied to ELF. For instance, Mohandes (2002) [26] applied SVM to short-term ELF; the reported results indicate that SVM outperforms the autoregressive method. Wang et al. (2009) [41] presented an $\epsilon$-SVR model considering seasonal proportions based on development tendencies in historical data. Since electric load data are non-linear and complex, many studies hybridize SVR with other methods: Elattar et al. (2010) [12] proposed an approach to the load forecasting problem which combines SVR and locally weighted regression, employing a weighted distance algorithm based on the Mahalanobis distance to optimize the weighting function's bandwidth. In the study of Ogcu et al. (2012) [30], SVR and ANN models were employed to develop the best model for predicting electricity output. Che et al. (2012) [6] presented an adaptive fuzzy combination model based on the self-organizing map (SOM), SVR and a fuzzy inference method. Furthermore, several algorithms have been proposed to optimize the SVR parameters. Hong et al. (2011) [18] introduced the Chaotic Immune Algorithm for optimizing SVR parameters and investigated its feasibility for ELF. Zhang et al. (2012) [45] investigated the potential of a hybrid algorithm combining a chaotic genetic algorithm with simulated annealing for optimizing an SVR load forecasting model. Another hybrid forecasting model, using the differential evolution algorithm to determine the SVR parameters, was proposed by Wang et al. (2012) [40] for forecasting annual electric load. Aung et al. (2012) [2] adopted the least-squares support vector regression technique with online learning to forecast the peak load of a particular consumer entity in the smart grid.

Furthermore, the SVR-PSO model has been applied to ELF. Hong (2009) [17] elucidated the feasibility of applying a chaotic particle swarm optimization (CPSO) algorithm to choose a suitable parameter combination for an SVR electric load forecasting model. Duan et al. (2011) [10] proposed a combined method for the short-term load forecasting of electric power systems based on fuzzy c-means (FCM) clustering, PSO and SVR. SVR-PSO has also been used for forecasting in other fields: Anandhi and Chezian (2013) [37] showed that an appropriately tuned SVR-based prediction model with PSO can outperform more complex models, and an approach combining SVR and PSO was applied to traffic safety forecasting by Gang and Zhuping (2011) [13].

Concerning feature selection, Tu et al. (2007) [36] performed feature selection with PSO and used SVM to evaluate the fitness value. He et al. (2008) [16] used a genetic algorithm for feature selection to reduce the input space. Nguyen and de la Torre (2010) [28] presented a method for jointly learning the weights and the parameters of the SVM model. Crone and Kourentzes (2010) [9] proposed an iterative neural filter for feature evaluation which automatically identifies the frequency of the time series. Vieira et al. (2013) [39] proposed a binary PSO algorithm for feature selection in parallel with the optimization of the SVM parameters. Lu (2014) [23] used Multivariate Adaptive Regression Splines (MARS) for selecting input variables and then constructed a sales forecasting model with SVR. Shahrabi et al. (2013) [33] presented an approach which uses k-means clustering to reduce the dimension of the data space and then a genetic expert system to forecast tourism demand. Niu and Guo (2009) [29] proposed a method which uses simulated annealing to improve the global search capacity of PSO, for the purpose of optimizing the SVR parameters and selecting its input features, and applied it to short-term load forecasting. Karimi (2012) [21] proposed a feature selection technique composed of Modified Relief and Mutual Information and then forecast electric load. Yadav et al. (2014) [44] used the Weka software to select the most relevant input parameters for solar radiation prediction models.

IV. THE PROPOSED APPROACH FOR ELF

A. The SVR-PSO model

Resolving the SVR dual problem is often troublesome. Although an exhaustive search could be used to tune the parameters, it suffers from two main drawbacks: it is very time-consuming, and it offers no guarantee of convergence to the globally optimal solution. Compared to genetic algorithms (GA), the PSO method can efficiently find optimal or near-optimal solutions in large search spaces; moreover, Lu and Geng (2011) [24] showed that the PSO-SVR model is superior to the GA-SVR model in running efficiency and predictive accuracy. Thus, we adopt PSO for the parameter selection of SVR in order to improve the accuracy and the runtime efficiency of the SVR learning procedure. Our SVR-PSO algorithm for ELF can be defined as follows:

    Initialize Pop(α_i); initialize σ, ε, C
    while t ≤ t_max do
        for i = 0 to n do
            Compute f_{α_i} according to Eq. (4)
            Update V_{α_i} and X_{α_i} according to Eq. (7)
            if f_{α_i} ≤ fbest_{α_i} then
                fbest_{α_i} ← f_{α_i}
                Xbest_{α_i} ← X_{α_i}
            end if
        end for
    end while

B. SVR-PSO with feature selection

As shown in the previous sections, the SVR-PSO model is suitable for electric load forecasting, and we apply it here. Moreover, we use feature selection to remove irrelevant attributes. The procedure used in this paper is summarized in Figure 2. To the best of our knowledge, this hybrid SVR-PSO model combined with feature selection has not yet been applied to ELF. The idea was investigated in [32], but the results of that forecasting model were not sufficient to validate the approach.

[Fig. 2. SVR-PSO with feature selection]
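As an illustration of the proposed approach, the sketch below tunes $(C, \epsilon, \gamma)$ of an RBF-kernel SVR with a plain PSO loop. It is a minimal sketch under stated assumptions, not the implementation used in our experiments: in particular, it evaluates particles by validation MAPE rather than by the dual objective of Eq. (4), the data are synthetic, and the bounds, swarm size and PSO coefficients are arbitrary choices.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(1)

# Synthetic load series turned into a lagged regression dataset (as in Section II).
t = np.linspace(0, 4 * np.pi, 300)
load = 2.0 + np.sin(t) + 0.1 * rng.standard_normal(t.size)
lag = 7
X = np.array([load[i:i + lag] for i in range(load.size - lag)])
y = load[lag:]
X_tr, y_tr, X_val, y_val = X[:200], y[:200], X[200:], y[200:]

def fitness(p):
    """Validation MAPE of an SVR trained with parameters p = (C, epsilon, gamma)."""
    m = SVR(kernel="rbf", C=p[0], epsilon=p[1], gamma=p[2]).fit(X_tr, y_tr)
    pred = m.predict(X_val)
    return 100.0 * np.mean(np.abs((pred - y_val) / y_val))

# PSO over (C, epsilon, gamma); bounds and coefficients are illustrative.
lo, hi = np.array([0.1, 1e-3, 1e-3]), np.array([100.0, 0.5, 2.0])
n_particles, n_iter, w, c1, c2 = 10, 15, 0.7, 1.5, 1.5

pos = rng.uniform(lo, hi, size=(n_particles, 3))
vel = np.zeros_like(pos)
pbest = pos.copy()
pbest_f = np.array([fitness(p) for p in pos])
gbest = pbest[np.argmin(pbest_f)].copy()

for _ in range(n_iter):
    r1, r2 = rng.random((n_particles, 1)), rng.random((n_particles, 1))
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)  # Eq. (7)
    pos = np.clip(pos + vel, lo, hi)  # keep particles inside the search box
    f = np.array([fitness(p) for p in pos])
    improved = f < pbest_f
    pbest[improved], pbest_f[improved] = pos[improved], f[improved]
    gbest = pbest[np.argmin(pbest_f)].copy()

best_mape = pbest_f.min()  # validation MAPE of the best (C, epsilon, gamma) found
```

Validation error is a common fitness choice for this kind of wrapper-style tuning; any other model-quality criterion could be substituted in `fitness` without changing the PSO loop.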
V. EXPERIMENTS SETUP

A. Tools

To perform feature selection, the Weka software was used. Weka is a collection of machine learning algorithms for data mining tasks. As mentioned in Section II.D, feature selection has two principal characteristics; in this study, we used Correlation-based Feature Subset Selection (CFS) as the attribute evaluator (Hall 1998 [15]) and particle swarm optimization as the search method (Moraglio et al. [27]). Forecasting with SVR involves normalization of the data within the range [0, 1]. The SVR-PSO method with and without FS was executed on the same platform: an Intel Core i5 PC at 1.8 GHz with 4 GB of RAM, under the Ubuntu 14.04 operating system.

B. Comparison measurement

The experimental data are divided into two subsets: the training set and the testing set. Forecasting accuracy is measured on the testing set by two criteria, the mean absolute percentage error (MAPE) and the mean squared error (MSE), given by

$$MAPE = \frac{100}{n} \sum \left| \frac{prediction - real}{real} \right| \qquad (8)$$

$$MSE = \frac{1}{n} \sum (prediction - real)^2 \qquad (9)$$

where $n$ is the number of instances in the testing set.

VI. EXPERIMENTAL RESULTS

A. Experiment 1

First, we use the historical electricity load dataset of the EUNITE competition [7], covering January 1, 1997 to December 31, 1998. Given the load and some other information for 1997-1998, the task is to forecast the maximum daily values of the electrical load for the 31 days of January 1999. The dataset contains 16 features; as described by Chen et al. [7], they belong to three categories (calendar attributes, temperature, and past load demand). Features 1-7 correspond to the seven days of the week, feature 8 is related to temperature, and features 10-16 are the loads of the previous seven days.

[Fig. 3. Electric load history for experiment 1]

We then apply feature selection to the EUNITE competition dataset.

[Fig. 4. The selected attributes]

Figure 4 shows the feature selection result obtained with Weka: thirteen attributes are selected as important features (attributes 7, 8 and 9 are eliminated).
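Weka's CFS evaluator scores feature subsets by their correlation with the target and their mutual redundancy. As a simplified illustration of the same filter idea (not Weka's exact algorithm, and on hypothetical data rather than the EUNITE features), a univariate Pearson-correlation filter can be sketched as:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500

# Hypothetical design matrix: 3 informative features and 2 pure-noise features.
X = rng.standard_normal((n, 5))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.8 * X[:, 2] + 0.1 * rng.standard_normal(n)

# Filter-style ranking: absolute Pearson correlation of each feature with y.
corr = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])])

# Keep features whose correlation clears a threshold (an arbitrary choice here);
# the noise columns fall well below it and are eliminated.
keep = np.flatnonzero(corr > 0.15)
```

A filter of this kind only looks at each feature in isolation; CFS goes further by also penalizing subsets whose features are correlated with one another.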
These thirteen features are then used to build the SVR-PSO model for ELF.

Below, we apply the SVR-PSO model to the EUNITE case study and show the predicted values against the real values. Figure 5 presents the case of the model without feature selection, while Figure 6 shows the case with feature selection. Table 1 reports the performance measurements obtained; the first column presents the performance of SVR-PSO without feature selection, and the second column with feature selection.

[Fig. 5. SVR-PSO without feature selection]
[Fig. 6. SVR-PSO with feature selection]

            SVR-PSO without FS    SVR-PSO with FS
    MAPE    0.0613                0.0555
    MSE     3.0275 x 10^3         2.5533 x 10^3

    Table 1. Comparison of results for experiment 1.

The attributes eliminated in this experiment are 7, 8 and 9. Attribute 9 was not used in the competition data. Attribute 7 is Sunday (the seventh day of the week); its elimination can be explained by the fact that the load on that day is so low (weekend) that it can be neglected, as can be observed clearly in the dataset. Attribute 8 is related to temperature; its elimination by feature selection means that temperature does not have a notable impact on electric load for the competition data studied here.

B. Experiment 2

In this experiment, the models are trained on hourly data from the NEPOOL region (courtesy of ISO New England) from 2004 to 2007 (the data are available on the MathWorks website). The dataset contains 8734 instances and 8 features, as described in Fig. 7. We follow the same approach as in the previous experiment.

[Fig. 7. Description of experiment 2]

            SVR-PSO without FS    SVR-PSO with FS
    MAPE    0.5296                0.0290
    MSE     1.0563 x 10^8         3.4092

    Table 2. Comparison of results for experiment 2.

C. Discussion

The two tables above show that SVR-PSO with FS has smaller MSE and MAPE than SVR-PSO without FS; that is, feature selection may improve SVR-PSO performance. The good forecasting performance of SVR-PSO with FS can be explained by the fact that the eliminated attributes do not have a great impact on electric load: they can be replaced by other attributes which have more impact.
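For reference, the MAPE and MSE criteria of Eqs. (8) and (9), used in the comparison tables above, can be computed as follows (the toy vectors are illustrative only, not values from the experiments):

```python
import numpy as np

def mape(prediction, real):
    """Mean absolute percentage error, Eq. (8)."""
    prediction, real = np.asarray(prediction), np.asarray(real)
    return 100.0 * np.mean(np.abs((prediction - real) / real))

def mse(prediction, real):
    """Mean squared error, Eq. (9)."""
    prediction, real = np.asarray(prediction), np.asarray(real)
    return np.mean((prediction - real) ** 2)

# Toy example (illustrative values only).
real = [100.0, 200.0, 400.0]
pred = [110.0, 190.0, 400.0]
# mape(pred, real) -> 100 * mean(0.1, 0.05, 0.0) = 5.0
# mse(pred, real)  -> mean(100, 100, 0) = 66.67 (approximately)
```

Note that MAPE is undefined when a real value is zero; for load data, which is strictly positive, this is not a concern.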
VII. CONCLUSION

In this paper, we investigated the applicability of the hybrid machine learning technique SVR-PSO to electric load forecasting. On the one hand, the hybrid SVR-PSO method is useful for ELF. On the other hand, we can conclude that selecting the most relevant features maintains the accuracy of the SVR-PSO forecasting model. This result is useful, especially in the case of large datasets.

Future research should attempt to use more advanced methods for optimizing the SVR parameters, in order to obtain a better performance of the hybrid model, and to determine the best way of performing feature selection.

REFERENCES

[1] N. Amjady and F. Keynia, Short-term load forecasting of power systems by combination of wavelet transform and neuro-evolutionary algorithm, Energy 34 (2009), no. 1, 4901-4909.
[2] Zeyar Aung, Mohamed Toukhy, John R. Williams, Abel Sanchez, and Sergio Herrero, Towards accurate electricity load forecasting in smart grids, The Fourth International Conference on Advances in Databases, Knowledge, and Data Applications (DBKDA) (2012).
[3] B. E. Boser, I. M. Guyon, and V. N. Vapnik, A training algorithm for optimal margin classifiers, 5th Annual ACM Workshop on COLT, Pittsburgh, PA (1992), 144-152.
[4] G. E. P. Box, G. M. Jenkins, and G. C. Reinsel, Time series analysis: forecasting and control, 3rd ed., Prentice Hall, Englewood Cliffs (1994).
[5] R. C. Eberhart and Y. Shi, Particle swarm optimization: developments, applications and resources, Proceedings of the IEEE Congress on Evolutionary Computation (2001).
[6] Jinxing Che, Jianzhou Wang, and Guangfu Wang, An adaptive fuzzy combination model based on self-organizing map and support vector regression for electric load forecasting, Energy 37 (2012), 657-664.
[7] B. J. Chen, M. W. Chang, and C. J. Lin, Load forecasting using support vector machines: a study on EUNITE competition 2001, IEEE Transactions on Power Systems 19 (2004), 1821-1830.
[8] E. Cox, The fuzzy systems handbook: a practitioner's guide to building, using, and maintaining fuzzy systems, Boston (1994).
[9] Sven F. Crone and Nikolaos Kourentzes, Feature selection for time series prediction - a combined filter and wrapper approach for neural networks, Neurocomputing 73 (2010), 1923-1936.
[10] Pan Duan, Kaigui Xie, Tingting Guo, and Xiaogang Huang, Short-term load forecasting for electric power systems using the PSO-SVR and FCM clustering techniques, Energies 4 (2011), 173-184.
[11] R. C. Eberhart and Y. Shi, Particle swarm optimization: developments, applications and resources, Proceedings of the 2001 Congress on Evolutionary Computation (2001).
[12] Ehab Elattar, John Goulermas, and Q. Wu, Electric load forecasting based on locally weighted support vector regression, IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews 40 (2010), no. 4.
[13] Ren Gang and Zhou Zhuping, Traffic safety forecasting method by particle swarm optimization and support vector machine, Expert Systems with Applications 38 (2011), 10420-10424.
[14] D. E. Goldberg, Genetic algorithms in search, optimization and machine learning, Addison-Wesley, Reading (1989).
[15] M. A. Hall, Correlation-based feature subset selection for machine learning, Hamilton, New Zealand (1998).
[16] Wenwu He, Zhizhong Wang, and Hui Jiang, Model optimizing and feature selecting for support vector regression in time series forecasting, Neurocomputing 72 (2008), 600-611.
[17] Wei-Chiang Hong, Chaotic particle swarm optimization algorithm in a support vector regression electric load forecasting model, Energy Conversion and Management 50 (2009), 105-117.
[18] Wei-Chiang Hong, Yucheng Dong, Chien-Yuan Lai, Li-Yueh Chen, and Shih-Yung Wei, SVR with hybrid chaotic immune algorithm for seasonal load demand forecasting, Energies 4 (2011), 960-977.
[19] Badar Ul Islam, Comparison of conventional and modern load forecasting techniques based on artificial intelligence and expert systems, International Journal of Computer Science Issues (IJCSI) 8 (2011), no. 5.
[20] Kyoung-jae Kim, Financial time series forecasting using support vector machines, Neurocomputing 55 (2003), 307-319.
[21] Taghi Karimi, Peak load prediction with the new proposed algorithm, International Journal of Science and Advanced Technology 2 (2012), no. 3.
[22] J. Kennedy and R. C. Eberhart, Particle swarm optimization, Proceedings of the IEEE International Conference on Neural Networks (1995), 1942-1948.
[23] Chi-Jie Lu, Sales forecasting of computer products based on variable selection scheme and support vector regression, Neurocomputing 128 (2014), 491-499.
[24] Xiaoyong Lu and Xiaomeng Geng, Car sales volume prediction based on particle swarm optimization algorithm and support vector regression, International Conference on Intelligent Computation Technology and Automation, Shenzhen, Guangdong (2011), 71-74.
[25] M. Minsky and S. Papert, Perceptrons: an introduction to computational geometry, MIT Press (1969).
[26] M. Mohandes, Support vector machines for short-term electrical load forecasting, International Journal of Energy Research 26 (2002), 335-345.
[27] A. Moraglio, C. Di Chio, and R. Poli, Geometric particle swarm optimisation, Proceedings of the 10th European Conference on Genetic Programming, Berlin (2007), 125-136.
[28] Minh Hoai Nguyen and Fernando de la Torre, Optimal feature selection for support vector machines, Pattern Recognition 43 (2010), 584-591.
[29] Dong-Xiao Niu and Ying-Chun Guo, An improved PSO for parameter determination and feature selection of SVR and its application in STLF, Multi-valued Logic (2009), 1-18.
[30] Gamze Ogcu, Omer F. Demirel, and Selim Zaim, Forecasting electricity consumption with neural networks and support vector regression, Procedia - Social and Behavioral Sciences 58 (2012), 1576-1585.
[31] Ping-Feng Pai and Wei-Chiang Hong, Software reliability forecasting by support vector machines with simulated annealing algorithms, Journal of Systems and Software 79 (2006), no. 6, 747-755.
[32] Malek Sarhani and Abdellatif El Afia, Electric load forecasting using hybrid machine learning model, Proceedings of the 11th International Conference on Intelligent Systems: Theory and Applications, Rabat, Morocco (2014).
[33] Jamal Shahrabi, Esmaeil Hadavandi, and Shahrokh Asadi, Developing a hybrid intelligent model for forecasting problems: case study of tourism demand time series, Knowledge-Based Systems 43 (2013), 112-122.
[34] S. Piramuthu, Evaluating feature selection methods for learning in data mining applications, European Journal of Operational Research 156 (2004), 483-494.
[35] J. W. Taylor and P. E. McSharry, Short-term load forecasting methods: an evaluation based on European data, IEEE Transactions on Power Systems 22 (2007), 2213-2219.
[36] Chung-Jui Tu, Li-Yeh Chuang, Jun-Yang Chang, and Cheng-Hong Yang, Feature selection using PSO-SVM, International Journal of Computer Science 33 (2007), no. 1.
[37] V. Anandhi and R. Manicka Chezian, Forecasting the demand of pulpwood using ANN and SVM, International Journal of Advanced Research in Computer Science and Software Engineering 3 (2013), no. 7, 1404-1407.
[38] V. Vapnik, S. Golowich, and A. Smola, Support vector method for function approximation, regression estimation, and signal processing, Advances in Neural Information Processing Systems 9, MIT Press, Cambridge (1997).
[39] Susana M. Vieira, Luis F. Mendonca, Goncalo J. Farinha, and Joao M. C. Sousa, Modified binary PSO for feature selection using SVM applied to mortality prediction of septic patients, Applied Soft Computing 13 (2013), 3494-3504.
[40] Jianjun Wang, Li Li, Dongxiao Niu, and Zhongfu Tan, An annual load forecasting model based on support vector regression with differential evolution algorithm, Applied Energy 94 (2012), 65-70.
[41] Jianzhou Wang, Wenjin Zhu, Wenyu Zhang, and Donghuai Sun, A trend fixed on firstly and seasonal adjustment model combined with the epsilon-SVR for short-term forecasting of electricity demand, Energy Policy 37 (2009), 4901-4909.
[42] L. J. Wang and C. Liu, Short-term price forecasting based on PSO-trained BP neural network, Electric Power Science and Engineering 24 (2008), 21-25.
[43] P. R. Winters, Forecasting sales by exponentially weighted moving averages, Management Science 6 (1960), 324-342.
[44] Amit Kumar Yadav, Hasmat Malik, and S. S. Chandel, Selection of most relevant input parameters using Weka for artificial neural network based solar radiation prediction models, Renewable and Sustainable Energy Reviews 31 (2014), 509-519.
[45] Wen Yu Zhang, Wei-Chiang Hong, Yucheng Dong, Gary Tsai, Jing-Tian Sung, and Guo-feng Fan, Application of SVR with chaotic GASA algorithm in cyclic electric load forecasting, Energy 45 (2012), 850-858.