=Paper= {{Paper |id=Vol-2473/paper22 |storemode=property |title=Efficient Load Profiling and Forecasting in Large Electric Power Systems |pdfUrl=https://ceur-ws.org/Vol-2473/paper22.pdf |volume=Vol-2473 |authors=Imre Lendák,Tomáš Horváth |dblpUrl=https://dblp.org/rec/conf/itat/LendakH19 }} ==Efficient Load Profiling and Forecasting in Large Electric Power Systems== https://ceur-ws.org/Vol-2473/paper22.pdf
        Efficient load profiling and forecasting in large electric power systems

                                                      Imre Lendák1 , Tomáš Horváth1

           Data Science and Engineering Department, Faculty of Informatics, Eötvös Loránd University, Budapest, Hungary,
                                                     lendak@inf.elte.hu,
                                          WWW home page: http://t-labs.elte.hu

Abstract: The goal of this paper is to present an efficient load forecasting algorithm for large
electric power systems. It uses a combination of nearest neighbor-based load profile clustering and
rule-based load forecasting. The load data was sliced into daily load curves, which were
K-Means-clustered, thereby compressing data and simplifying the solution. K-Means was chosen in the
proof of concept phase and will be substituted with more precise solutions later. In the forecasting
phase the daily load profile is predicted based on the forecast date, day type (e.g. weekday or
weekend) and historical consumption data for similar days in the past. The solution was tested on a
large dataset consisting of one year of 5-minute measurement data in a 1900-power-line system. The
solution showed excellent performance in both the training and forecast phases. It produced
meaningful forecasts even when the input data contained significant amounts of anomalies. An
additional advantage of the presented solution is that it can be used for medium and long-term
forecasting with limited and/or missing input data.

Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License
Attribution 4.0 International (CC BY 4.0).

1 Introduction

The challenge of accurately predicting the power flows in today’s large electric power systems
receives ample attention. Numerous papers are published in specialized smart grid journals with the
promise of being able to predict the electricity consumption of single households or their groups.
Others develop solutions which predict the power flows in electric power transmission systems, which
span large geographic regions, e.g. sizable parts of continental Europe or the USA. Yet another
group of scientists works on data compression algorithms, with the intention to lower the
communication and storage costs incurred in modern smart grids.

Within this setting, we start from the idea that the flows on the power lines in electric power
transmission systems have some form of periodicity. More specifically, we theorize that the
configurations of these large systems do not change frequently, and under the same load conditions
the flows will be similar on the power lines for similar days, e.g. for Wednesdays in July the load
will most probably be very similar under the same loading conditions. Therefore we propose a 2-phase
load forecasting algorithm, consisting of a daily load profile clustering and a load forecasting
phase.

The following sections of this document contain a more detailed description of each of the above
steps.

2 State-of-the-art

The body of electric load forecasting knowledge is very large, with numerous papers published in all
major domain-specific journals and conferences. As an extensive review of all relevant solutions
would not be feasible due to the page limits, we will only refer to those research results which
specifically focus on time series clustering, smart meter big data management and the combination of
solutions from these domains used in load forecasting.

2.1 Time series analysis

Reference [19] presents two novel time series clustering methods, namely k-Shape and k-MultiShapes
(k-MS), which rely on scalable iterative refinement procedures based on shape-based distances (SBD).
The authors claim that their solutions achieve similar results to dynamic time warping, but at a
lower computational cost. k-Shape is quoted as a suitable and novel solution for creating
homogeneous and well-separated clusters of time series data. The positive characteristics of k-Shape
are domain independence, accuracy and efficiency [20].

Reference [33] describes a convolutional neural network-based time series classification solution,
in which the time series features are automatically learned instead of being handpicked. The authors
describe the process of data preparation, filtering, and the structure of the used network. The
authors of reference [7] claim that semi-supervision can boost time series clustering performance.

2.2 Data compression

Reference [29] contains an application-oriented review of smart meter data analysis solutions. Three
main application areas are identified, namely load analysis, load forecasting, and load management.
This is a rare reference which addresses the data privacy and security aspect of the analyzed
solutions as well. The most important motivations behind data compression in smart metering are
reduced congestion of communication channels used for
data transmission, lower storage overhead, as well as improved data mining efficiency. Reference
[30] presents a comprehensive study on smart meter big data compression solutions. The authors of
reference [25] present a feature-based load data compression method for smart metering
infrastructures. The solution is not lossless. The authors claim it is efficient, with little
reconstruction error. The solution was validated on the Irish Smart Metering Trial Data. The authors
of reference [22] present lossless compression algorithms for power system operational data.

The authors of reference [28] use the K-SVD sparse representation technique. In the dictionary
learning phase, they decompose load profiles into linear combinations of several partial usage
patterns (PUPs). In the sparse coding phase, a linear support vector machine (SVM) is used to
classify load profiles as residential or small and medium-sized enterprises (SMEs). The authors
claim that their solution outperforms k-means, the discrete wavelet transform (DWT), principal
component analysis (PCA), as well as piecewise aggregate approximation (PAA).

The solution presented in reference [12] utilizes deep-stacked auto-encoders in electric load data
compression and classification.

2.3 Load classification and forecasting

A more general review of smart meter data intelligence is provided in references [1] and [15]. The
authors of references [10][21] and [24] explore state-of-the-art machine learning approaches in load
forecasting. They review more than 50 research papers and group their contributions into single and
hybrid computational intelligence-based approaches. They perform a qualitative analysis based on
accuracy and prove the superiority of hybrid solutions. Various short-term load forecasting
techniques were compared as early as 1989 in reference [17]. Various machine learning-based
short-term load forecasting techniques ranging from moving averages to deep neural networks are
addressed in references [2][5][6][8][11][23][24]. Smart meter forecasting from one minute to
one-year horizons is presented in reference [16]. Electricity price and demand forecasting is
tackled by the authors in [18]. Bus load forecasting is addressed in reference [3]. Reference [27]
presents a smart meter data characterization method based on the Gaussian mixture (GM) model. The
authors claim that compared to other state-of-the-art solutions, theirs offers significantly better
fitting for meter data. Reference [26] describes a hybrid clustering and classification technique in
short-term energy consumption forecasting.

Reference [4] proposes to use clustering in bottom-up, short-term load forecasting. The authors
cluster load curves by using wavelets to measure similarity and thereby create super-consumer
profiles. The solution was implemented in R and is freely available. The authors of reference [9]
analyze four years of measurements represented as time series collected at 245 HV/MV substations.
They use the stationarity property of the estimated models to identify daily customer profiles.

The authors of reference [13] analyze annual load curves of households and create annual and weekly
load profiles. They also show how additional features of households affect annual consumption and
random variation in household energy consumption. Reference [32] presents an analysis of the daily
consumption data of 300 residential customers in China. The authors identify four types of monthly
usage patterns and 9 abnormal users with significantly different electricity use patterns. They
show that more than 80% of households have a similar monthly electricity usage pattern.

The authors of reference [31] used k-Shape for building energy usage pattern analysis and tested
their solution on real-life data measured in ten institutional buildings. Reference [14] goes even
further, by using ML techniques to guess the lifestyles of energy consumers based on their
consumption patterns.

3 Problem definition

It is necessary to develop a load forecasting solution which is capable of predicting loads in
extremely large, Europe-wide electric power transmission systems consisting of thousands of power
lines. The input data will consist of historical measured loads with a sampling rate of 5 minutes,
available for at least the last 1-year period. This data will be referred to as dynamic data, due to
its frequency of change. Due to data privacy limitations, the input data will not contain the
complete static data model of the system under consideration. This means that there will be no data
provided to build a mathematical graph consisting of the busbars (vertices) and power lines (edges)
connecting them.

It is expected that the prediction horizon will be 5 minutes ahead. The solution should be
extensible and be capable of providing acceptable mid- and long-term forecasts (1 day or 1 week
ahead) as well. Additionally, it is necessary for the solution to handle temporary unavailability of
significant amounts of measurements, when those will not be provided in a timely manner by one or
more countries and/or companies in the geographical area under consideration. Optionally, the
solution should be able to incorporate weather forecasts and other freely available third-party data
and thereby increase the accuracy of its outputs. The relevance of such data might vary, as the
extent of data anonymization required will not allow the forecasting tool access to the geographical
location of system resources (i.e. power lines).

4 Solution

We suppose that the power flows measured on the power lines in large electric power systems show
some level of regularity and can therefore be classified into load profiles. Based on this
assumption, we propose to create a
hybrid forecasting solution which consists of two phases. In the first phase the electric power flow
data is clustered into daily load profiles. In the second phase we experiment with various
forecasting algorithms to predict the daily load profile for each power line based on historical
data. This means that instead of predicting the expected values of power flows 5 minutes in the
future, we predict the load profile for an entire day in advance.

This solution addresses most of the more complex requirements listed above, namely it is expected
that it can handle missing data and create forecasts for multiple days ahead. More specifically, it
tolerates the absence of significant amounts of short-term historical data and still produces
meaningful forecasts based on medium- or long-term historical data (e.g. data older than a week or
month). Similarly, medium- or long-term (7 days or more ahead) forecasts are feasible with this type
of solution.

In the following sections we present the load profile generation and forecasting steps.

4.1 Load profile generation

The power flow data is sliced into daily (24h) load profiles consisting of measured flow values
sampled every 5 minutes. The sliced daily load profiles are normalized. The amplitude of each daily
load profile is memorized. The normalized daily flows are clustered, separately for each power line,
i.e. a set of representative load profiles is calculated for each power line. For each year, month,
day of the week and power line we memorize the load profile and amplitude.

The daily load profile clustering introduces some error, but significantly improves algorithm
performance if it is not necessary to re-calculate the centroids too often. As the main idea is that
the daily load profiles will be similar, this should not be an issue, i.e. we should re-calculate
the representative daily load profiles relatively rarely, e.g. once in 15 or 30 days.

4.2 Load forecasting

Our improved baseline load forecasting solution is rule-based. In the predict phase it looks up the
most likely historical load profile and amplitude for each of the power lines. They are used to
calculate the predicted load profile for the entire prediction day and extract as many samples as
required. This means that we will predict for a whole day, i.e. 288 future values with a 5-minute
sampling interval.

Predictions spanning two calendar days are somewhat challenging, as in their case it is necessary to
either select two load profiles and stitch them together, or to select the ‘end’ and the ‘start’ of
the same load profile. In the actual solution we perform the latter, simpler solution, i.e. use a
single daily profile to cover both calendar days before and after midnight.

Load profile prediction. The load profile selection is rule-based, and it is performed in the
following steps (listed by priority):

    1. if there are (past) values for the same (year, weekday, power line) tuple within the last two
       months, then select one;

    2. if there are (past) values for the same (year-1, month, weekday, power line) tuple, then
       select one;

    3. if there are historical values for (year-(2:N), month, weekday, power line) tuples, then
       select the most likely one (N is configurable and defaults to 15); or

    4. if none of the above is found, do a random load profile selection.

Amplitude calculation. Amplitudes are chosen as averages of historical values for similar days in
the past. This part of the solution is also rule-based, and its steps/choices are very similar to
the curve selection algorithm presented above (also listed by priority of choice):

    1. choose non-zero amplitudes for the same weekday within the last two calendar months and
       average them;

    2. average last year’s amplitudes for the same month and weekday combination;

    3. average the values in the longer-term history up to M years in the past, where M is
       configurable and defaults to 15; or

    4. choose a default amplitude, which was for simplicity set to a (configurable) scalar.

5 Experiments

The input dataset was loaded from Hierarchical Data Format (HDF), version 5. We used HDFView version
3 for data exploration. The training data consisted of historical power flows for 1935 power lines
over a July-to-July one-year period extending over two calendar years. The testing data consisted of
3-month-long power flows over a July-September period immediately after the training data. The
training data was split into 68 data slices ranging in length between one hour and a couple of days.
Both training and testing power flows were sampled every 5 minutes.

We implemented the solution in Python version 2. We used the scikit-learn library for K-Means and
other necessary ML algorithms. Data visualization during experimentation was performed with
matplotlib.
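To make the rule-based forecasting of section 4.2 more concrete, the following is a minimal sketch of the profile and amplitude look-ups, not the implementation used in the experiments. It is written in standard-library Python 3 (the paper's code was Python 2), and all names (`Day`, `select_profile`, `select_amplitude`, `predict_day`) are hypothetical. It assumes that the representative load profiles (centroids) of one power line have already been computed, that "the last two months" can be approximated by a 61-day window (ignoring the same-year condition of rule 1), that "select one" means taking the most frequent profile among the candidates, and that zero amplitudes are skipped in every rule.

```python
import random
from collections import Counter, namedtuple
from datetime import date
from statistics import mean

# One historical day of one power line (hypothetical record layout).
Day = namedtuple("Day", "line_id day profile_id amp")


def _candidates(history, line_id, target, n_years=15):
    """Yield candidate records in priority order (rules 1-3)."""
    same = [h for h in history
            if h.line_id == line_id and h.day < target
            and h.day.weekday() == target.weekday()]
    # Rule 1: same weekday within the last two months (assumed: 61 days).
    yield [h for h in same if (target - h.day).days <= 61]
    # Rule 2: same month and weekday one year earlier.
    yield [h for h in same
           if h.day.year == target.year - 1 and h.day.month == target.month]
    # Rule 3: same month and weekday, 2..N years earlier.
    yield [h for h in same
           if 2 <= target.year - h.day.year <= n_years
           and h.day.month == target.month]


def select_profile(history, line_id, target, profile_ids, rng=random):
    """Return a profile id for the target day."""
    for cands in _candidates(history, line_id, target):
        if cands:
            # Assumption: pick the most frequent profile among candidates.
            return Counter(h.profile_id for h in cands).most_common(1)[0][0]
    return rng.choice(profile_ids)  # Rule 4: random fallback.


def select_amplitude(history, line_id, target, default_amp=1.0):
    """Return the average non-zero amplitude of the best candidate set."""
    for cands in _candidates(history, line_id, target):
        amps = [h.amp for h in cands if h.amp != 0]
        if amps:
            return mean(amps)
    return default_amp  # Rule 4: configurable default scalar.


def predict_day(history, centroids, line_id, target):
    """Scale the selected normalized profile by the selected amplitude."""
    pid = select_profile(history, line_id, target, list(centroids))
    amp = select_amplitude(history, line_id, target)
    return [amp * v for v in centroids[pid]]


# Toy data: three historical Thursdays and two flat 288-sample profiles.
history = [
    Day("L1", date(2019, 5, 16), 4, 100.0),
    Day("L1", date(2019, 5, 23), 4, 120.0),
    Day("L1", date(2018, 5, 31), 0, 80.0),
]
centroids = {0: [0.1] * 288, 4: [0.5] * 288}
curve = predict_day(history, centroids, "L1", date(2019, 5, 30))
```

In this toy example the two recent Thursdays satisfy rule 1, so the forecast for May 30th, 2019 uses profile 4 scaled by the average of their amplitudes; the record from 2018 would only be used if no recent data were available.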
5.1 Load profiles

The number of representative load profiles (i.e. centroid count) was set to 10, which was shown to
be sufficient during data exploration. The clustering was performed by K-Means, which was chosen due
to its efficiency. Each load profile was a curve consisting of 288 numerical values.

We re-trained the model if a sufficiently long time period expired since the last training
performed. For simplicity, each re-training was a full run from scratch. This design decision was
acceptable as the clustering phase for the 1-year period and 1935 power lines took up to 20 minutes
on a personal computer with an Intel i7 CPU, 8 GB of RAM and SSD.

Example 1-week input load profiles are shown in Figure 1. Example load profiles detected for the
above single power line in its corresponding 1-year-long flow data with K-Means and cluster number 5
are shown in Figure 2. Note that the cluster number of five was used here only to illustrate the
clustering results in a visually pleasing diagram. Otherwise, during most experiments cluster number
10 was used.

      Figure 1: Power flows for a selected power line

      Figure 2: Load profiles for cluster number = 5

The introduction of these load profiles allowed us to significantly reduce the solution space, i.e.
to replace 365 (or more if the dataset spans a longer time period) daily curves with a small set of
load profiles. Daily loads in the historical data were essentially represented with tuples
consisting of a load profile identifier, daily amplitude multiplier (as the curves were normalized
in the (-1, 1) range), year, month and weekday. We used weekdays because, based on past experience
and the related works, we theorized that load profiles would be quite similar on a given day of the
week in each month of the calendar year, e.g. customer electricity use is usually similar on each
Saturday in July if the weather is good.

As explained earlier, we reduced the size of the historical dataset by introducing the load profiles
and thereby simplified the forecasting task. The original, vast dataset consisted of 288 daily flow
measurements with a 5-minute sampling rate, collected for 365 days and 1916 power lines (i.e.
288x365x1916 = 201,409,920 values). With the load profiles we reduced the dataset to a tuple
consisting of the power line identifier, date, load profile identifier and amplitude, i.e. the
multiplier with which the load profile is multiplied to obtain the ‘original’ load measurements.
This meant that instead of 288 floating point values for each day and power line combination, we
received a tuple consisting of the above four elements, i.e. 4x365x1916 = 2,797,360 values, which
was a data reduction by a factor of 72, i.e. almost two orders of magnitude.

We randomly selected one power line and created a plot of the load profile identifiers assigned by
K-Means over a 60-day long period, starting with a Thursday (i.e. weekday identifier 3). The
resulting diagram can be seen in Figure 3. The diagram covers 60 days in the July-August period. We
can see in the diagram that for almost all weekends (Saturdays on days 2, 9, 16 and Sundays on days
3, 10, 17, etc.) the representative load profile had ID = 0. Additionally, we can see that load
profile 4 was very frequent for weekdays.

      Figure 3: Assigned load profiles over 60 days

The power flow amplitudes for the same power line and period are shown in Figure 4. We can clearly
identify an anomalous period around day 30, similarly as in the load
profile diagram above. Such periods are usually related to periods with different weather conditions
(e.g. colder, rainy days with less use of air-conditioning) and/or configuration changes in the
power system. In this diagram we might identify dips in amplitudes during the weekends (day 0 is a
Thursday), but there is no other clear regularity identifiable via visual inspection.

      Figure 4: Calculated amplitudes over 60 days

5.2 Load forecasting

We transformed the date information into a tuple of three values, namely year, month and weekday.
With this modification, the inputs fed into the load forecasting code were tuples representing
historical daily loads in the following format:

   (line id, year, month, day of the week, profile id, amp)

We implemented the rule-based, baseline algorithm as described above. The forecasting code expected
the following inputs:

  • Forecast date, e.g. May 30th, 2019;

  • Power line identifier.

The code transformed the dates into the (year, month, day of the week) sub-tuples and subsequently
looked up the most similar historical data as explained in section 4.2. The load forecasting code
returned two values, namely the ‘expected’ load profile identifier and amplitude for the prediction
day.

Experiment I: Short-term forecasting. We compared the results of our load forecasting solution to
the persistence model, which simply predicts that the N future values will be equal to the N
historical values preceding them. The persistence model implementation used for testing was limited
to short-term forecasts for the next N values immediately following the last time period received in
the training data, i.e. it did not support time gaps between the latest training data and the
forecasting period.

We experimented with N=(12, 24, 48), i.e. 1, 2 and 4 hour short-term prediction horizons. One
execution of our prediction code took around 30 seconds, regardless of the selected interval length.
The time to execute the same prediction task on the persistence model was quite similar, which meant
that most of the time was spent on creating the resulting datasets.

The RMSE values calculated with the persistence model and our model are shown in Table 1. The error
was calculated between the predicted values and the measured values extracted from the adapt/test
data.

      Table 1: RMSE values - Short-term forecasting

      Length     Persistence model   Our model
      1 hour           47.72           55.53
      2 hours          53.44           51.56
      4 hours          61.98           64.05

Experiment II: Mid-term forecasting. As explained above, the proposed solution is capable of
producing meaningful mid- and long-term forecasts with limited historical data availability. We
tested the algorithm on the following three prediction tasks:

  • 1 day ahead, i.e. predict loads for the next day.

  • 1 week ahead, i.e. predict the power flows for the day one week in the future compared to the
    last training data item.

  • 1 month ahead.

As the persistence model used during this research did not have built-in support for these types of
prediction tasks, we measured the RMSE for the proposed model only. Table 2 contains the results of
our measurements.

      Table 2: RMSE values - Mid-term forecasting

      Length     Our model
      1 day        130.23
      1 week       148.33
      1 month      137.38

We decided to further explore the resulting predictions (i.e. load forecasts) by visually comparing
the predicted load curves to the actual measurements received as part of the test/adapt data. A
randomly selected load flow prediction and real daily load values are shown in Figure 5.

The values predicted one week in the future (in red color) are in a similar value range, i.e. they
do not have an
                                                               rule-based as opposed to creating a neural network-based
            Figure 5: Predict one week ahead
                                                               or other machine learning solution. It can be tweaked
                                                               further, and one can expect it to produce deterministic
                                                               results for the target electric power system under
                                                               consideration. It might be used as a baseline solution,
                                                               against which less deterministic, machine-learning solutions
                                                               can be compared and measured.

                                                                  The main advantages of the presented algorithm are its
                                                               training and prediction performance. It analyzes the
                                                               historical flow information and creates configurable numbers
                                                               of representative daily load profiles for each power line.
                                                               Predictions are based on high-performance look-ups: a single
                                                               load profile index is selected for the prediction day, and
                                                               a predicted (daily) flow curve is calculated for the whole
                                                               calendar day for which the prediction is initiated. The
                                                               predicted values are ’stitched’ (i.e. amplitudes are aug-
inverted sign or exceedingly different amplitudes. Not sur-    mented) to the actual input flows and the requested number
prisingly the anomalous zero value in the (real) measured      of predicted samples is returned. The solution can make
value is not predicted by the presented load forecasting al-   predictions based on (very) limited information and han-
gorithm.                                                       dle gaps in input data. It is also capable to quickly predict
   In Figure ?? we present the resulting daily load curve      the most likely and meaningful flows for extended future
for the same power line as in the previous example.            periods, i.e. instead of covering only a couple of hours
                                                               immediately after the last input (training) data received, it
                                                               is capable to predict a day, week or even longer periods
            Figure 6: Predict one month ahead                  ahead. Prediction accuracy will obviously vary as a func-
                                                               tion of the amount of historical data, i.e. if there are histor-
                                                               ical flow values measured multiple years in the past, then
                                                               we expect to obtain higher accuracy. This was not shown
                                                               in our experiments though, as the training data available
                                                               covered only one calendar year.

                                                                  The solution was tested on dataset consisting of power
                                                               flows collected over a one year, July-to-July period. The
                                                               test data covered the July to September period immedi-
                                                               ately following the training data. The system under con-
                                                               sideration consisted of 1916 power lines. The accuracy
                                                               of the proposed model was compared to the persistence
                                                               model. We showed that the RMSE was quite similar in
                                                               short-term forecasting tasks (1 to 4 hours) to the persis-
                                                               tence model. We measured the accuracy of our algorithm
   The relatively low accuracy levels can be improved by       in mid-term load forecasting scenarios, whose length was
implementing a more accurate load profile clustering tech-     set to be 1 day, 1 week, as well as 1 and 3 months in the
nique, instead of the relatively coarse K-Means, with a dif-   future.
ferent distance measure. Such changes would allow the
authors to find the most relevant daily load profiles.            The main disadvantage of the algorithm is its reliance
                                                               on historical flow information only, i.e. it does not take
                                                               auxiliary information into consideration. Additionally, the
6   Conclusion                                                 algorithm does not specifically cover national holidays,
                                                               which often fall on weekdays and result in weekend-like
This paper describes a power flow prediction algorithm,        load profiles - this missing element is relevant if the al-
which relies on analyzing historical (power) flow infor-       gorithm is used for state-level forecasting, but has lower
mation and creating a configurable number of represen-         relevance in continent-wide (e.g. Europe) load forecast-
tative daily load profiles for each power line. Predictions    ing scenarios, in which the impact of national holidays
are based on high performance look-ups – a single load         is lower. Further tuning and optimization of the load
profile index is selected for the target prediction day, for   profile classification and curve/amplitude selection algo-
which a (daily) load profile is calculated. The algorithm is   rithms might further improve performance and accuracy.
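
   The persistence baseline and the RMSE metric used in the comparison can be illustrated with a minimal sketch. It assumes equally spaced load samples held in a plain Python list; the function names persistence_forecast and rmse are illustrative and are not taken from the authors' implementation.

```python
import math
from typing import List, Sequence


def persistence_forecast(history: Sequence[float], n: int) -> List[float]:
    """Persistence model: the next n values equal the last n historical values.

    Hypothetical helper for illustration; it requires the forecast window to
    start immediately after the last historical sample (no time gap).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if len(history) < n:
        raise ValueError("need at least n historical values")
    return list(history[-n:])


def rmse(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Root mean squared error between two equally long sequences."""
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up sample values, used only to show how the two helpers fit together.
history = [10.0, 12.0, 11.0, 13.0]
actual_next = [12.0, 14.0]

baseline = persistence_forecast(history, n=2)   # copies the last two samples
baseline_error = rmse(baseline, actual_next)    # error of the baseline
```

   Any other forecaster can be scored with the same rmse helper on the same window, which gives a like-for-like comparison against the baseline.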
References                                                               3520.
                                                                     [17] Moghram, I., Rahman, S.: Analysis and evaluation of five
[1] Alahakoon, D., Yu, X.: Smart electricity meter data intelli-         short-term load forecasting techniques. IEEE Transactions
    gence for future energy systems: A survey. IEEE Transac-             on power systems. 4(4) (1989) 1484–1491.
    tions on Industrial Informatics. 12(1) (2016) 425–436.           [18] Motamedi, A., Zareipour, H., Rosehart, W. D.: Electricity
[2] Amjady, N. (2001). Short-term hourly load forecasting using          price and demand forecasting in smart grids. IEEE Transac-
    time-series modeling with peak load estimation capability.           tions on Smart Grid. 3(2) (2012) 664–674.
    IEEE Transactions on Power Systems, 16(3), 498-505.              [19] Paparrizos, J., Gravano, L.: Fast and accurate time-
[3] Amjady, N.: Short-term bus load forecasting of power sys-            series clustering. ACM Transactions on Database Systems
    tems by a new hybrid method. IEEE Transactions on Power              (TODS). 42(2) (2017) 8.
    Systems. 22(1) (2007) 333–341.                                   [20] Paparrizos, J., Gravano, L.: k-shape: Efficient and accurate
[4] Auder, B., Cugliari, J., Goude, Y., Poggi, J. M.: Scalable           clustering of time series. In Proceedings of the 2015 ACM
    clustering of individual electrical curves for profiling and         SIGMOD International Conference on Management of Data.
    bottom-up forecasting. Energies. 11(7) (2018) 1893.                  (2015, May) 1855–1870.
[5] Chen, K., Chen, K., Wang, Q., He, Z., Hu, J., He, J.: Short-     [21] Raza, M. Q., Khosravi, A.: A review on artificial intelli-
    term load forecasting with deep residual networks. IEEE              gence based load demand forecasting techniques for smart
    Transactions on Smart Grid. (2018) 1–1.                              grid and buildings. Renewable and Sustainable Energy Re-
                                                                         views. 50 (2015) 1352–1372.
[6] Chen, J. F., Wang, W. M., Huang, C. M.: Analysis
    of an adaptive time-series autoregressive moving-average         [22] Sarkar, S. J., Kundu, P. K., Sarkar, G.: Development
    (ARMA) model for short-term load forecasting. Electric               of lossless compression algorithms for power system op-
    Power Systems Research. 34(3) (1995) 187–196.                        erational data. IET Generation, Transmission Distribution.
                                                                         12(17) (2018) 4045–4052.
[7] Dau, H. A., Begum, N., Keogh, E.: Semi-supervision dra-
    matically improves time series clustering under dynamic          [23] Sun, X., Luh, P. B., Cheung, K. W., Guan, W., Michel,
    time warping. In Proceedings of the 25th ACM International           L. D., Venkata, S. S., Miller, M. T.: An efficient approach
    on Conference on Information and Knowledge Management.               to short-term load forecasting at the distribution level. IEEE
    (2016, October) 999–1008.                                            Transactions on Power Systems. 31(4) (2016) 2526–2537.
[8] Ding, N., Benoit, C., Foggia, G., Bésanger, Y., Wurtz, F.:       [24] Taylor, J. W., McSharry, P. E.: Short-term load forecast-
    Neural network-based model design for short-term load fore-          ing methods: An evaluation based on european data. IEEE
    cast in distribution systems. IEEE Transactions on Power             Transactions on Power Systems. 22(4) (2007) 2213–2219.
    Systems. 31(1) (2016) 72–81.                                     [25] Tong, X., Kang, C., Xia, Q.: Smart metering load data com-
[9] Espinoza, M., Joye, C., Belmans, R., De Moor, B.: Short-             pression based on load feature identification. IEEE Transac-
    term load forecasting, profile identification, and customer          tions on Smart Grid. 7(5) (2016) 2414–2422.
    segmentation: a methodology based on periodic time series.       [26] Torabi, M., Hashemi, S., Saybani, M. R., Shamshirband,
    IEEE Transactions on Power Systems. 20(3) (2005) 1622–               S., Mosavi, A.: A Hybrid clustering and classification tech-
    1630.                                                                nique for forecasting short-term energy consumption. Envi-
[10] Fallah, S., Deo, R., Shojafar, M., Conti, M., Shamshir-             ronmental Progress Sustainable Energy. 38(1) (2019) 66–76.
    band, S.: Computational intelligence approaches for energy       [27] Tripathi, S., De, S.: An efficient data characterization and
    load forecasting in smart energy management grids: state of          reduction scheme for smart metering infrastructure. IEEE
    the art, future challenges, and research directions. Energies.       Transactions on Industrial Informatics. 14(10) (2018) 4300–
    11(3) (2018) 596.                                                    4308.
[11] Hagan, M. T., Behr, S. M.: The time series approach to          [28] Wang, Y., Chen, Q., Kang, C., Xia, Q., Luo, M.: Sparse
    short term load forecasting. IEEE Transactions on Power              and redundant representation-based smart meter data com-
    Systems. 2(3) (1987) 785–791.                                        pression and pattern extraction. IEEE Transactions on Power
[12] Huang, X., Hu, T., Ye, C., Xu, G., Wang, X., Chen, L.:              Systems. 32(3) (2017) 2142–2151.
    Electric Load Data Compression and Classification Based on       [29] Wang, Y., Chen, Q., Hong, T., Kang, C.: Review of
    Deep Stacked Auto-Encoders. Energies. 12(4) (2019) 653.              smart meter data analytics: Applications, methodologies,
[13] Kuusela, P., Norros, I., Reittu, H., Piira, K.: Hierarchical        and challenges. IEEE Transactions on Smart Grid. 10(3)
    Multiplicative Model for Characterizing Residential Elec-            (2019) 3125–3148
    tricity Consumption. Journal of Energy Engineering. 144(3)       [30] Wen, L., Zhou, K., Yang, S., Li, L.: Compression of smart
    (2018) 04018023.                                                     meter big data: A survey. Renewable and Sustainable Energy
[14] Kwac, J., Flora, J., Rajagopal, R.: Lifestyle segmentation          Reviews. 91 (2018) 59–69.
    based on energy consumption data. IEEE Transactions on           [31] Yang, J., Ning, C., Deb, C., Zhang, F., Cheong, D., Lee, S.
    Smart Grid. 9(4) (2018) 2409–2418.                                   E., ... Tham, K. W.: k-Shape clustering algorithm for build-
[15] Liu, X., Golab, L., Golab, W., Ilyas, I. F., Jin, S.: Smart         ing energy usage patterns analysis and forecasting model ac-
    meter data analytics: Systems, algorithms, and benchmark-            curacy improvement. Energy and Buildings. 146 (2017) 27–
    ing. ACM Transactions on Database Systems (TODS). 42(1)              37.
    (2017) 2.                                                        [32] Yang, T., Ren, M., Zhou, K.: Identifying household elec-
[16] Massidda, L., Marrocu, M.: Smart meter forecasting from             tricity consumption patterns: A case study of Kunshan,
    one minute to one year horizons. Energies. 11(12) (2018)             China. Renewable and Sustainable Energy Reviews. 91
    (2018) 861–868.
[33] Zȩbik, M., Korytkowski, M., Angryk, R., Scherer, R.: Con-
    volutional Neural networks for time series classification. In
    International Conference on Artificial Intelligence and Soft
    Computing. Springer, Cham. (2017, June) 635–642.