Comparison of clustering techniques for residential load
              profiles in South Africa*

                      Wiebke Toussaint1,2,3[0000−0002−9657−9509] and
                      Deshendran Moodley1,2[0000−0002−4340−9178]
           1 University of Cape Town, Rondebosch, 7700, Cape Town, South Africa
           2 Centre for Artificial Intelligence Research, South Africa
           3 Technical University Delft, Netherlands


        Abstract. This work compares techniques for clustering metered residential
        energy consumption data to construct representative daily load profiles in South
        Africa. The input data captures a population with high variability across temporal,
        geographic, social and economic dimensions. Different algorithms, normalisation
        and pre-binning techniques are evaluated to determine their effect on producing
        a good clustering structure. A Combined Index is developed as a relative score
        to ease the comparison of experiments across different metrics. The study shows
        that normalisation, specifically unit norm and the zero-one scaler, produces the best
        clusters. Pre-binning appears to improve clustering structures as a whole, but its
        effect on individual experiments remains unclear. As in several previous studies,
        the k-means algorithm produces the best results. To our knowledge this is the first
        work that rigorously compares state-of-the-art cluster analysis techniques in the
        residential energy domain in a developing country context.

        Keywords: cluster analysis · machine learning · load profiles · household energy
        use · South Africa


1     Introduction
Long term energy planning requires insights into the energy consumption behaviour
of customers, such as residential households, to build demand forecasts. Customer
behaviour is frequently approximated with load profiles or load curves, which are time-
varying energy consumption patterns. A daily load profile captures the average load
drawn from the electrical grid over a metered interval (e.g. 5 minutes). If a daily load
profile averages consumer behaviour for a particular loading condition, such as a year,
season, month or daytype, it is called a representative daily load profile (RDLP).
    Clustering techniques are applied in the energy domain to generate RDLPs. Cluster
analysis typically yields good results for consumers in the industrial and commercial
sectors, but granular household energy consumption patterns are inherently noisy, making
it more challenging to produce meaningful clusters in the residential sector [26]. Pre-
binning, which involves applying a two-stage clustering algorithm that first clusters
load profiles by overall consumption and then by load shape, has shown promise for
*
    Supported by the Centre for Artificial Intelligence Research (CAIR) and the South African
    National Energy Development Institute (SANEDI)


Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License
Attribution 4.0 International (CC BY 4.0)
2       W. Toussaint and D. Moodley

clustering highly variable residential consumers [32], but has not been widely adopted.
While both the input data representation and algorithm parameters are known to have a
significant impact on clustering results, the effects of data input and evaluation measures
are not compared rigorously, with most studies in the domain implementing clustering
algorithms on very small datasets. Several studies have found that a single metric on its
own is insufficient to adequately represent cluster performance and suggest a combination
of measures to ensure optimal cluster selection [15][11][9].
    This paper provides a rigorous comparison of normalisation techniques, pre-binning
approaches and algorithms for clustering daily load profiles of a highly variable popu-
lation. Section 2 reviews the data representation, clustering algorithms and evaluation
approaches of previous studies that use cluster analysis for generating RDLPs in the
energy domain. Next we present the Domestic Electrical Load Study (DELS) dataset
on which this research is based in Section 3. Section 4 describes the setup of clustering
experiments and the development of the Combined Index, which is used to evaluate
experiments. Finally, the results are presented in Section 5, followed by a discussion and
conclusion in Section 6.

2     Literature Review
Cluster analysis is an unsupervised machine learning approach that is useful for finding
groups in a dataset when no labelled training observations are available [25]. In the
energy domain cluster analysis is used extensively to segment energy consumers for
targeted energy efficiency campaigns [1], pricing [6], energy forecasts [19] and small-
scale renewable generation [32]. We reviewed studies from the past two decades that
cluster load profiles of energy consumers for the purpose of generating representative
daily load profiles. We discuss and analyse the studies in relation to their input data
and data representation, the clustering algorithms and parameters, and the evaluation
methodologies, as these have a significant impact on achieving good clustering results.

2.1   Data Input and Representation

Load Profile Feature Extraction Fine-grained daily load profiles are frequently re-
duced using Piecewise Aggregate Approximation with 15, 30 or 60 minute windows to
produce input vectors of 96, 48 or 24 dimensions respectively [24][31][9]. Other data
reduction methodologies extract features such as total demand, peak demand and number
of peaks [4][6], or apply dimensionality reduction using Principal Component Analysis
[12] or Self-Organising Maps [23]. [32] represents daily load profiles as a normalised
vector that sums consumption over time, to capture load shape as well as consumption
levels. [14] investigates the impact of temporal resolution on clustering algorithms in the
residential energy domain and suggests that cluster quality is best at a resolution of 8
or 15 minutes. For the k-means algorithm, performance is robust in a band of temporal
resolutions between 4 and 60 minutes.

Load Profile Normalisation Most studies normalise input data by scaling vectors with
a min-max scaler so that patterns retain their shape but are scaled to a zero-one range
[9][24][3]. This approach is very sensitive to outliers and appears to be an unvalidated
domain preference. De-minning subtracts the daily minimum demand from each hourly
               Comparison of clustering techniques for residential load profiles in SA   3

value and then divides it by the de-minned daily total [16]. It is proposed as a more
robust form of normalisation, but the authors do not offer a quantitative comparison
against other approaches. De-minning has the drawback that it only considers profile
shape. Considering the importance of normalisation in cluster analysis, it is surprising
that some studies do not provide any details about the normalisation technique applied.
The selection of normalisation algorithms is mostly unsubstantiated. No studies with a
rigorous comparison of different normalisation approaches were found.

Clustering with Pre-binning Pre-binning, or two-stage clustering, is suggested by [7]
and implemented in [4], [32] and [30]. The results and effectiveness of pre-binning as
suggested in [30] are unclear, in part because the input data and data representation have
not been documented. [32] have found that a two-stage approach that first clusters by
overall consumption and then by load shape produces better results than clustering by
load shape only. The influence of different types of pre-binning has not been investigated.

Time Range and Spatial Cover Geographically and temporally most studies cover a
single location and a maximum time period of 18 months. Typically studies first derive
representative daily load profiles (RDLPs) for individual customers at specific loading
conditions and then cluster the RDLPs, which significantly reduces the number of input
patterns. Some studies, such as [15] and [18] cluster all daily load profiles and derive a
set of consumption patterns, described by the cluster centroids, that represents distinct
daily energy usage behaviour for different types of consumers.

2.2   Clustering Algorithms and Distance Measures
The majority of studies that evaluated different clustering techniques found that the k-
means algorithm performed the best [4][15][22][32]. Other studies showed that the SOM
[9][20], k-medoids [15][27] and modified follow-the-leader [7][8] yielded the best results.
Several variations of k-means [3][15][23] and hierarchical clustering [15][8][2] were
identified as the best or amongst the best clustering algorithms in individual studies. In
general, the studies performed no benchmarking and insufficient comparative evaluations.
Results across studies are thus contradictory, inconsistent and inconclusive. Euclidean
distance is used most frequently as distance measure and only a minority of studies
compares distance measures.

2.3   Evaluation Measures
The Davies Bouldin Index (DBI), Cluster Dispersion Index (CDI) and Mean Index
Adequacy (MIA) are used most frequently, with the Similarity Matrix Indicator (SMI)
and Silhouette Index having a couple of use cases. Evaluation of clustering results
remains a challenge [15], which some authors try to overcome by proposing metrics of
their own. However, insufficient testing and evaluation of measures such as the Energy
Variance Index presented in [5] means that their reliability is uncertain, and new metrics
are seldom adopted by other studies. MIA, which is proposed in [6], is an exception
and has been adopted by many subsequent studies. [15] finds that standard performance
metrics pose a trade-off between compactness and distinctness for cluster selection.
[9] concludes that the standard evaluation measures are unreliable due to bias towards
isolating outliers and insufficient penalisation of large, noisy clusters. Furthermore, the
study suggests that combining measures can help overcome the challenge of representing
cluster performance and selecting the optimal number of clusters with a single measure.
Cluster ranges are typically constrained to fewer than 30 clusters to ease
expert interpretation and to produce clusters that correspond with existing user groups.
Only a few studies conclusively suggest the optimal number of clusters.
2.4     Limitations of Existing Clustering Approaches
Most studies are primarily concerned with the comparison of different clustering al-
gorithms, and neglect to investigate the effects of data representation and parameter
selection. The impact of the input dataset on clustering algorithms is largely unacknowl-
edged, with one third of the reviewed studies omitting to specify the data source. Almost
half the studies do not explicitly state the number of patterns in the input dataset and
over half the studies compare clustering algorithms on very small datasets with fewer
than 500 input patterns. Very few studies explore the effect of the distance measure
on clustering results, with a third of studies omitting to specify the distance measure.
These observations are similar to those made in the review of clustering approaches of
non-residential buildings presented in [21]. This is a wider problem in the data mining
community that was reported in [17] more than a decade ago.
Considerations for Developing Countries Very few studies were conducted in devel-
oping countries. Certain assumptions around data representation and cleaning must be
reconsidered when clustering energy consumers in this context. Very low consuming
households are frequently treated as outliers and removed from the data [18][4]. While
individual household consumption of these groups is low, they represent a significant
percentage of households in the DELS datasets. Moreover, the profiles typically belong
to consumers living in rural or informal settings, and their inclusion is key if energy
access is a concern. Their low consumption base also presents an opportunity for high
growth, which has important implications for utilities.

3      The South African Domestic Electrical Load Study (DELS) Data
This section provides an overview and descriptive statistics of the South African Do-
mestic Electrical Load Study (DELS) datasets and details the input data representation.
The DELS datasets collected from 1994 to 2014 present the most comprehensive source
of observational information on residential energy consumption in South Africa. We
use the raw metering data from the Domestic Electrical Load Metering (DELM) dataset
[13], considering each recorded daily load profile as an independent input pattern and
ignoring long term trends. Households metered for several years are thus treated as
having separate identities for each year of observation. Our data input contains daily
load profiles for a total of 14 945 such household identities, which we refer to as
households from here on.
3.1     Description of Sample Population
For 58% of the metered households (8656 households) detailed socio-demographic data
was captured in an annual survey⁴. The majority of households have a low income of less
 4
     A harmonised version of the survey data used to provide descriptive statistics has been published
     as the Domestic Electrical Load Survey - Key Variables (DELSKV) dataset [29]
than R5000 (about $340) per month. A fraction of households earns up to 50 times that
amount. A similar distribution can be observed for dwelling size, with most households
occupying dwellings between 25 m² and 100 m². Less than half the surveyed households
have access to piped water in the home and less than a quarter of households live in
dwellings with brick walls. More than half the households have a corrugated iron or zinc
roof - a construction material that is particularly popular in rural and informal settlements
due to its availability and low cost. Furthermore, the dataset covers a large number of
newly electrified households. While the affluent households could be seen as outliers, it
is important to include them in the analysis as they are disproportionally large energy
consumers. Appendix A visualises the distribution of income, dwelling floor area, the
number of years electrified and the proportion of wall materials, roof materials and water
access points of survey respondents in Figures 4a, 4b, 4c and 4d.

3.2     Data Representation

The subset of the data used for this research contains metered current readings recorded
at 5 minute intervals. All observations are averaged over 60 minutes, producing
3 295 848 daily load profiles for 14 945 households⁵. Invalid and missing observations are
marked in the raw dataset and have been discarded from the analysis. Each interval t is
labeled by the start time, such that t = 0 captures interval 00:00:00 - 00:59:59.
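
The hourly averaging described above can be sketched as follows. This is only an illustration, assuming one complete day of 288 five-minute current readings held in a NumPy array; the function name `to_hourly_profile` and the synthetic readings are hypothetical and are not taken from the DELS codebase.

```python
import numpy as np

def to_hourly_profile(readings_5min):
    """Average 288 five-minute readings (amperes) into 24 hourly means.

    Element t of the result covers the interval t:00:00 to t:59:59.
    """
    readings = np.asarray(readings_5min, dtype=float)
    if readings.shape != (288,):
        raise ValueError("expected one full day of 288 five-minute readings")
    # Each row holds the 12 readings that fall within one hour.
    return readings.reshape(24, 12).mean(axis=1)

# Synthetic day: a constant 2 A, rising to 10 A between 18:00 and 18:59.
day = np.full(288, 2.0)
day[18 * 12:19 * 12] = 10.0
profile = to_hourly_profile(day)
```

The resulting `profile` is one 24-element daily load profile of the kind that makes up a row of the input array.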
    Assume that l(t) is the energy consumption (measured in Amperes) over interval t.
The daily load profile h of household j on day d is:

$$ h_d^{(j)} = l(t)_d, \quad \text{where } t = \{0, 1, \dots, 23\} \tag{1} $$

$$ H^{(j)} = \left[ h_d^{(j)} \right], \quad \text{where } d = \{1, 2, \dots, d \text{ days}\} \tag{2} $$

    $H^{(j)}$ is the array of all 24-element daily load profile vectors $h_d^{(j)}$ for household $j$. $d$
varies for each household and depends both on the duration for which the household was
observed and on the number of valid readings in that period.
    The mean observation duration d for all households is 220 days. 61% of households
were observed for more than half a year (i.e. d > 183). The maximum number of
households observed on a single day was on 23 August 1999 when the electricity
consumption of 1245 households was recorded. The median daily household count is
399. The distribution of annual mean daily demand of all households is shown in Figure
4e in Appendix A. Half the households consume on average less than 10kWh/day. X is
the input array of all daily load profiles h and has dimensions 3 295 848 × 24.

$$ X = \left[ H^{(j)} \right], \quad \text{where } j = \{1, 2, \dots, 14945\} \tag{3} $$

4      Load Profile Clustering
The design of clustering experiments is presented in this section. Figure 1 provides an
overview of the process. All valid daily load profiles are pre-processed as described
 5
     This aggregated dataset has been published as the Domestic Electrical Load Metering, Hourly
     Data (DELMH) [28]
in Section 3. Depending on the experiment, the input data is further processed by
removing zeros, applying a normalisation algorithm and one of two different pre-binning
approaches. Each algorithm is then initialised with the relevant cluster ranges. Following
this, the experiment’s results are recorded and metrics calculated. Finally the 10 best
experiments are selected.


                          Fig. 1: Load Profile Clustering Process


4.1   Normalisation and Pre-binning
Normalisation We compare four normalisation algorithms that are used in the energy
domain, i.e. unit norm, zero-one normalisation, de-minning and a normalisation tech-
nique frequently applied by experts in South Africa. Table 4 in Appendix B provides
details on the normalisation algorithms.
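
Three of the four normalisation algorithms can be sketched per profile as below. The sketch rests on assumptions: unit norm is read as scaling to unit Euclidean (L2) length, zero-one as the min-max scaler, and de-minning as described in Section 2.1. SA norm is left out because its definition appears only in Table 4. The function names and the toy profile are hypothetical.

```python
import numpy as np

def unit_norm(profile):
    """Scale a profile to unit Euclidean (L2) length (assumed reading of 'unit norm')."""
    h = np.asarray(profile, dtype=float)
    norm = np.linalg.norm(h)
    return h / norm if norm > 0 else h

def zero_one(profile):
    """Min-max scale a profile to the range [0, 1], keeping its shape."""
    h = np.asarray(profile, dtype=float)
    span = h.max() - h.min()
    return (h - h.min()) / span if span > 0 else np.zeros_like(h)

def de_min(profile):
    """Subtract the daily minimum, then divide by the de-minned daily total."""
    h = np.asarray(profile, dtype=float)
    shifted = h - h.min()
    total = shifted.sum()
    return shifted / total if total > 0 else shifted

# Shortened toy profile (4 values instead of 24), not real data.
profile = np.array([1.0, 1.0, 3.0, 5.0])
scaled = {
    "unit": unit_norm(profile),
    "zero-one": zero_one(profile),
    "de-minned": de_min(profile),
}
```

For non-negative profiles, all three functions return values between 0 and 1.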

Pre-binning by average monthly consumption (AMC) calculates the AMC for house-
hold j over a one year period as follows:
$$ AMC^{(j)} = \frac{1}{12} \sum_{month=1}^{12} \; \sum_{d=1}^{month_{end}} \; \sum_{t=0}^{23} 230 \times l(t)_d \ \text{kWh} \tag{4} $$

   All the daily load profiles $H^{(j)}$ of household $j$ are then assigned to one of 8 consumption
bins based on the household's $AMC^{(j)}$ value. The bin ranges are listed in
Table 5 in Appendix B and are based on South African electricity tariff ranges used by
experts. Individual household identifiers are removed from X after pre-binning.
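
A possible reading of AMC pre-binning is sketched below. Two assumptions are made loudly here: the product of 230 V, amperes and one hour is treated as watt-hours and divided by 1000 to obtain kWh, and the bin edges in `BIN_EDGES` are hypothetical placeholders, since the actual ranges are given only in Table 5. The function names are illustrative.

```python
import numpy as np

VOLTAGE = 230  # volts, the constant used in Equation 4

def average_monthly_consumption(hourly_current):
    """Estimate AMC in kWh from one year of hourly current readings.

    hourly_current: array of shape (days, 24), in amperes.
    Assumption: 230 V x amperes x 1 hour gives watt-hours, so the annual
    sum is divided by 1000 to express it in kWh.
    """
    amps = np.asarray(hourly_current, dtype=float)
    annual_kwh = VOLTAGE * amps.sum() / 1000.0
    return annual_kwh / 12.0

# Hypothetical bin edges in kWh per month; 7 edges define 8 bins.
BIN_EDGES = [1, 50, 150, 400, 600, 1200, 2500]

def amc_bin(amc, edges=BIN_EDGES):
    """Return the index (0 to 7) of the consumption bin that contains amc."""
    return int(np.digitize(amc, edges))

# Toy household drawing a constant 2 A for 365 days.
year = np.full((365, 24), 2.0)
amc = average_monthly_consumption(year)
bin_index = amc_bin(amc)
```

Every daily load profile of the household would then carry the same `bin_index`, after which clustering proceeds independently within each bin.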

Pre-binning by integral k-means is a data-driven approach based on the work of [32].
For the simple case where t represents hourly values, pre-binning by integral k-means
follows these steps:
 1. Create a new vector $c(t)$ from the cumulative sum of the normalised profile of $h_d^{(j)}$
 2. Append $l(t)_d^{max}$ to $c(t)$ to ensure that both peak demand and relative demand
    increase are taken into consideration
 3. Gather all features into array $X_C$ and remove individual household identifiers
 4. Use the k-means algorithm to cluster $X_C$ into k = 8 bins, corresponding to the
    number of bins created for AMC pre-binning

    Early experiments found unit norm to be a promising normalisation technique. Step
1 of the pre-binning by integral k-means thus normalised profiles with unit norm.
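
The four steps of pre-binning by integral k-means can be sketched with scikit-learn as follows. This is an illustration under stated assumptions: unit norm is taken to mean unit Euclidean length, the appended peak is taken from the unnormalised profile, and the input is synthetic. The helper `integral_features` is a hypothetical name.

```python
import numpy as np
from sklearn.cluster import KMeans

def integral_features(profiles):
    """Build pre-binning features for an array of daily load profiles.

    profiles: array of shape (n_profiles, 24), hourly current in amperes.
    Returns an array of shape (n_profiles, 25): the cumulative sum of each
    unit-normed profile followed by the peak demand of that profile.
    """
    h = np.asarray(profiles, dtype=float)
    norms = np.linalg.norm(h, axis=1, keepdims=True)
    norms[norms == 0] = 1.0                      # leave all-zero profiles unchanged
    cumulative = np.cumsum(h / norms, axis=1)    # step 1
    peak = h.max(axis=1, keepdims=True)          # step 2
    return np.hstack([cumulative, peak])         # step 3

# Synthetic stand-in for X: 200 random non-negative profiles.
rng = np.random.RandomState(0)
profiles = rng.gamma(shape=2.0, scale=3.0, size=(200, 24))

features = integral_features(profiles)
# Step 4: cluster the features into k = 8 bins.
bins = KMeans(n_clusters=8, n_init=10, random_state=0).fit_predict(features)
```

Each entry of `bins` assigns one daily load profile to a bin, so profiles of the same household may fall into different bins, unlike AMC pre-binning.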

4.2    Clustering Algorithms and Experiments

We implemented k-means, self-organising maps (SOM) and a combination of the two
algorithms to cluster X. Given the large size of the dataset, we chose Euclidean distance
as the distance measure for the k-means algorithm. Each algorithm was initialised with
different sets of parameter values, normalisation and pre-processing steps. Due to South
Africa’s geographic spread and economic inequality, significant variability in national
energy consumption patterns was anticipated. We thus allowed for a maximum of 220
clusters based on population diversity and existing expert models which account for 11
socio-demographic groups, 2 seasons, 2 daytypes and 5 climatic zones (11 × 2 × 2 × 5 = 220).
All experiments are summarised in Table 1.

                                     Table 1: Experiment details
            exp.    algorithm    parameters              normalisation          pre-bin            zeros

              1     kmeans       m{5, 8, 11, ...136}          none
              2     kmeans       m{5, 8, 11, ...136}     none, u, d, z, sa
                    SOM          s{5, 7, 9, ...29}       none, u, d, z, sa
                    SOM+kmeans   s{30, 40, ...90}, m     none, u, d, z, sa
              3     kmeans       m{5, 8, 11, ...136}     none, u, d, z, sa                         False
                    SOM          s{5, 7, 9, ...29}       none, u, d, z, sa                         False
                    SOM+kmeans   s{30, 40, ...90}, m     none, u, d, z, sa                         False
              4     kmeans       m{2, 3, ...10}          none, u, d, z, sa       AMC
                    SOM          s{2, 3, 4, 5}           none, u, d, z, sa       AMC
                    SOM+kmeans   s{4, 7, 11, ...20}, m   none, u, d, z, sa       AMC
              5     kmeans       m{2, 3, ...19}          none, u, d, z, sa       AMC
                    SOM+kmeans   s{4, 7, 11, ...20}, m   none, u, d, z, sa       AMC
              6     kmeans       m{2, 3, ...19}          none, u, d, z, sa       AMC               False
              7     kmeans       m{2, 3, ...19}          none, u, d, z, sa   integral kmeans
              8     kmeans       m{2, 3, ...19}          none, u, d, z, sa   integral kmeans       False

      The k-means algorithm was initialised with a range of $m$ clusters, producing
$k^{(i)} = \{k_1^{(i)} \dots k_{m_i}^{(i)}\}$ for $m_i$ in $m$. The SOM algorithm was initialised as a square map with
dimensions $s_i \times s_i$ for $s_i$ in range $s$, producing $k^{(i)} = \{k_1^{(i)} \dots k_{s_i \times s_i}^{(i)}\}$ for $s_i$ in $s$.
The cluster ranges produced by SOM span a greater range and increase the number of
clusters $k$ in large increments, which has the advantage of testing edge cases, but has the
drawback of making it difficult to discern the best number of clusters $k^{(i)}$. Combining
SOM and k-means first creates an $s \times s$ map, which acts as a form of dimensionality
reduction on X. For each $s$, k-means then clusters the map into $m$ clusters. The mapping
only makes sense if $s^2$ is greater than $m$. For experiments with pre-binning, clustering is
done independently within each bin, thus performing a two-stage clustering process. The
maximum acceptable number of clusters per bin is considerably smaller and the range of
$m$ was chosen accordingly. The coarse-grained clustering increments of SOM do not
make it well suited to the requirement of fewer clusters and pre-binning was only done
with k-means.
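
The coarse increments of the SOM cluster counts, and the requirement that the map has more nodes than k-means clusters, can be made concrete with a short sketch. The SOM sizes follow the range s = 5, 7, 9, ..., 29 listed in Table 1; the helper `valid_som_kmeans_pairs` and its toy inputs are hypothetical.

```python
# SOM grid sizes from Table 1 (experiments 2 and 3): s = 5, 7, 9, ..., 29.
som_sizes = list(range(5, 30, 2))
# A square s x s map yields s * s clusters.
som_clusters = [s * s for s in som_sizes]
# The gap between consecutive cluster counts grows with the map size.
som_steps = [b - a for a, b in zip(som_clusters, som_clusters[1:])]

def valid_som_kmeans_pairs(s_values, m_values):
    """Keep only (s, m) pairs where the s x s map has more nodes than m clusters."""
    return [(s, m) for s in s_values for m in m_values if s * s > m]

# Toy example: small maps combined with small k-means cluster counts.
pairs = valid_som_kmeans_pairs([2, 3], [3, 5, 10])
```

With these sizes the SOM moves from 25 to 841 clusters in steps that widen from 24 to 112, which illustrates why the best number of clusters is hard to pin down with SOM alone.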
4.3     The Combined Index Score
Cluster compactness and distinctness are two important attributes that characterise a
good clustering structure. To overcome the challenge of comparing experiments across
metrics, we conducted cluster evaluation on a relative rank basis and combined three
common metrics, the Mean Index Adequacy (MIA), the Davies-Bouldin Index (DBI) and
the Silhouette Index, into a single Combined Index (CI) to ease the evaluation process.
Details on calculating metrics are contained in Appendix B. The CI was calculated from
the product of the DBI, MIA and inverse Silhouette Index and provides an indication of
the performance of experiments across all three metrics. It is defined as follows:
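$$ CI = \log \left( \sum_{bin=1}^{bins} Ix_{bin} \times \frac{N_{bin}}{N_{total}} \right), \quad \text{where } N \text{ is the count of } h_d^{(j)} \tag{5} $$

$$ Ix = \begin{cases} \text{undefined} & \text{if } DBI,\ MIA,\ \text{Silhouette Index} \leq 0 \\ \dfrac{DBI \times MIA}{\text{Silhouette Index}} & \text{otherwise} \end{cases} \tag{6} $$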

     Ix is an interim score that computes the product of the DBI, MIA and inverse
Silhouette Index. The CI is the log of the weighted sum of Ix across all experiment bins.
A lower CI is desirable and an indication of a better clustering structure. The logarithmic
relationship between Ix and the CI means that the CI is negative when Ix is between 0
and 1, 0 when Ix = 1 and greater than 0 otherwise.
     The log function is only defined for values greater than 0. As the lower bound of the
DBI and MIA is 0 and a negative Silhouette Index is an indication of poor clustering, the
Ix score is undefined for all scores equal to or below 0, so that the input to Equation 5 is
valid. The Ix increases linearly with the DBI and MIA. When these scores are low, so
is the Ix. However, as both metrics evaluate cluster compactness, we anticipate them
to increase simultaneously. Thus, if cluster compactness deteriorates, the Ix should
increase faster than linearly, as it is the product of both. Neither DBI nor MIA has an upper bound, which is thus also
true for the Ix. The Silhouette Index on the other hand is inversely related to Ix. When
the Silhouette Index is close to 1, clusters are good and the Silhouette Index has only a
marginal influence on Ix. The closer the Silhouette Index is to 0, the greater Ix becomes.
     For experiments with pre-binning, the experiment with the lowest Ix score in each bin
was selected, as it represents the best clustering structure for that bin. For experiments
without pre-binning, bins = 1 and Nbin = Ntotal . Weighting Ix of each bin was
important to account for the size of cluster membership in that bin.
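
Equations 5 and 6 can be sketched in a few lines of Python. Two assumptions apply: the logarithm is taken to be the natural logarithm, chosen because it reproduces the CI of the top-ranked experiment in Table 2 from that row's DBI, MIA and Silhouette values, and Ix is treated as undefined when any one of the three metrics is zero or negative. The function names are hypothetical.

```python
import math

def interim_score(dbi, mia, silhouette):
    """Equation 6: Ix = DBI * MIA / Silhouette Index, or None if undefined."""
    if dbi <= 0 or mia <= 0 or silhouette <= 0:
        return None
    return dbi * mia / silhouette

def combined_index(bin_scores):
    """Equation 5: natural log of the membership-weighted sum of Ix.

    bin_scores: list of (ix, n_bin) tuples, one per bin. Returns None if
    any bin has an undefined Ix.
    """
    if any(ix is None for ix, _ in bin_scores):
        return None
    n_total = sum(n for _, n in bin_scores)
    weighted = sum(ix * n / n_total for ix, n in bin_scores)
    return math.log(weighted)

# Without pre-binning there is a single bin, so N_bin = N_total.
ix = interim_score(dbi=2.125, mia=0.438, silhouette=0.095)
ci = combined_index([(ix, 3295848)])
```

With the rounded metric values of the first row of Table 2, `ci` rounds to 2.282, which agrees with the CI reported for that experiment.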

5      Results and Analysis
We implemented our experiments in Python 3.6.5 using k-means algorithms from
scikit-learn (0.19.1) and self-organising maps from the SOMOCLU (1.7.5) libraries⁶. In total
2083 individual experiments were conducted.
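
A single k-means experiment of the kind counted above might look like the following sketch. It is not the authors' code: the input is synthetic, unit norm is assumed to mean unit Euclidean length, and `davies_bouldin_score` is only available from scikit-learn 0.20, which is later than the 0.19.1 release cited above. MIA is omitted because its formula is given in Appendix B.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import davies_bouldin_score, silhouette_score
from sklearn.preprocessing import normalize

# Synthetic stand-in for X: three groups of 24-hour profiles with
# different peak hours. Real input would be the hourly DELS profiles.
rng = np.random.RandomState(42)
hours = np.arange(24)
groups = []
for peak_hour in (7, 13, 19):
    shape = np.exp(-0.5 * ((hours - peak_hour) / 2.0) ** 2)
    noise = rng.uniform(0.0, 0.05, size=(100, 24))
    groups.append(5.0 * shape + noise)
X = np.vstack(groups)

# Unit norm: scale every profile to unit Euclidean length.
X_unit = normalize(X, norm="l2")

# Cluster with Euclidean k-means for one value of m.
model = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = model.fit_predict(X_unit)

# Two of the three metrics that feed the Combined Index.
dbi = davies_bouldin_score(X_unit, labels)
sil = silhouette_score(X_unit, labels, metric="euclidean")
```

Repeating this for every value of m, normalisation and pre-binning option in Table 1 would produce the grid of experiments that the CI is used to rank.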
    The CI scores for all experiments are plotted as a percentage distribution in Figure
2. Scores range from 2.282296 to 9.626502 and lower scores are better. The histogram
 6
     The codebase is available online at https://github.com/wiebket/del_clustering


         Fig. 2: Comparison of clustering techniques for residential load profiles in SA

shows two distinct distributions of experiments. Experiments in the first group have
a score below 4 and constitute 65.5% of experiments. These experiments have been
normalised with unit norm, de-minning or zero-one. Experiments in the second group
have high scores and have either not been normalised or been normalised with SA norm. Over
97.1% of experiments have a score below 6.5. Next we present further details on the
distributions of CI scores across normalisation, pre-binning and algorithm types.

5.1   Performance of Normalisation, Pre-binning and Algorithms

From the histograms in Figure 3 it is clear that normalisation and pre-binning improve
clustering results. It is however not immediately evident which normalisation and pre-
binning approaches are best.


  Fig. 3: Distribution of CI scores across normalisation, pre-binning and clustering algorithms.
  Panels: (a) Normalisation algorithms, (b) Pre-binning approaches, (c) Clustering algorithms

Normalisation Performance Figure 3a groups the distribution of scores for all experi-
ments across the four normalisation algorithms and experiments without normalisation.
Normalisation clearly improves the CI score. Unit norm has the highest percentage of
experiments with the best CI scores, with a few zero-one outliers also top performing.
While de-minning does not produce the lowest scores, it contains a large percentage of
experiments that have amongst the lowest scores. It is not clear whether normalisation or
some other experimental parameters are responsible for the difference in performance.
SA norm performs worst and shows very limited improvement over unnormalised exper-
iments. Most of the experiments without normalisation have scores above 5.
Pre-binning Performance Figure 3b shows the impact of pre-binning on the CI scores.
Pre-binning by AMC produces the most results with the best scores. Integral k-means
yields a higher percentage of top results, though none are best performing. It is not
possible to determine with certainty which of the pre-binning approaches is better, but it
is clear that pre-binning improves clustering scores as a whole.
Algorithm Performance While Figure 3c clearly shows that the k-means algorithm
outperforms other algorithms, analysing the results in detail revealed some nuances.
Without normalisation, SOM+k-means performs better than k-means on its own, which
could be due to the dimensionality-reducing effect of the SOM. With normalisation
k-means performs best, followed by SOM+k-means and lastly SOM. SOM frequently
had a negative Silhouette Index, which is an indication of incorrect cluster assignment
and the CI score is undefined for those experiments.
5.2   Top 10 Experiments
The ranking of the top ten experiments is shown in Table 2.

                           Table 2: Top 10 experiments ranked by CI score

              #     CI       DBI    MIA     Sil.    Exp.    Alg.    m    Norm.      Run time

              1    2.282    2.125   0.438   0.095    2     kmeans   47     unit      40.76
              2    2.289    1.616   1.220   0.262    5     kmeans   17   zero-one    15.42
              3    2.296    1.616   1.220   0.260    4     kmeans   17   zero-one    14.74
              4    2.301    2.152   0.485   0.119    6     kmeans   82     unit      27.04
              5    2.316    2.115   0.447   0.093    2     kmeans   35     unit      50.43
              6    2.320    2.199   0.486   0.121    5     kmeans   71     unit      19.62
              7    2.349    2.152   0.481   0.143    7     kmeans   49     unit      21.82
              8    2.351    2.189   0.434   0.090    2     kmeans   50     unit      43.69
              9    2.354    2.111   0.476   0.128    8     kmeans   59     unit      20.08
              10   2.355    2.173   0.453   0.093    2     kmeans   32     unit      41.14


    With the exception of two experiments, all have been normalised with unit norm.
Experiments pre-binned with AMC, integral k-means and without pre-binning are all
included in the top results. K-means is the uncontested best clustering algorithm. For both
the k-means and SOM algorithms the batch fit time increases linearly with dimensionality.
For SOM+k-means the SOM is used for dimensionality reduction and the dimensions
explored are thus considerably greater. This has a significant impact on increasing
experiment run times, as shown in Table 3.


                   Table 3: Summary of algorithm CI scores and run times

                           Algorithm     Mean CI score Mean run time (s)

                            k-means          2.59            44.79
                             SOM             4.11            39.42
                         SOM + k-means       3.17           1498.77


6   Discussion and Conclusion
This study presents a rigorous comparison of normalisation, pre-binning and cluster-
ing algorithms for a large, heterogeneous dataset of South African residential energy
consumers. A Combined Index (CI) was developed to effectively compare the results
of 2083 experiments across several metrics. The CI was used as a relative index to
avoid having to interpret individual scores. Even so, the difference between the best
and tenth best experiment is only 3.2 percentage points. The CI score alone is thus
insufficient for selecting the best clustering structure with confidence. This confirms
the conclusions drawn by previous studies, many of which rely on expert judgement to
select the best clusters. A future direction for this work will be to develop qualitative
evaluation measures that can be used together with the CI score, and to assess if this
two-stage evaluation approach yields more usable clusters.
    As expected, normalisation significantly impacts clustering results. There is a dis-
tinct difference in performance between experiments normalised with algorithms that
transform daily load profiles to values between 0 and 1 (unit norm, de-minning and
zero-one normalisation) and those that do not (SA norm and unnormalised experiments).
Unit norm was the best normalisation for most experiments. SA norm performed the
worst. This was no surprise, as the Euclidean distance measure and the error metrics are
severely impacted by the larger values that this normalisation permits. While pre-binning
appears promising, more rigorous analysis is warranted to assess its effectiveness.
    Comparing the clustering algorithms, k-means outperformed the SOM and SOM+k-
means techniques for almost all experiments. As the dataset was large and high di-
mensional, with fixed time series length and regular sampling intervals, this result
corresponds with the suggestions made in the cluster analysis literature and with the
results of previous studies. The square map initialised with the SOM may have resulted in
a clustering structure too coarse to capture the variability in the dataset. SOM+k-means
had the drawback of slow run times when the SOM dimension was high. Due to the poor
results and slow run times of SOM and SOM+k-means they were not implemented for
most of the experiments with pre-binning. The Euclidean distance measure was used
in all algorithms. While the type of dataset is well suited to clustering with k-means,
alternative partitional clustering algorithms such as k-medoids should be explored, as
well as alternative distance measures such as Dynamic Time Warping.
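
As a rough illustration of why an alternative distance measure might behave differently on load profiles, the sketch below implements the textbook dynamic programming form of Dynamic Time Warping and compares it with the Euclidean distance on two toy profiles that have the same shape but are shifted by one hour. It was not part of the experiments reported here; the function name `dtw_distance`, the absolute-difference step cost and the toy profiles are assumptions made for illustration only.

```python
import math


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) Dynamic Time Warping distance.

    Assumption: the local cost is the absolute difference between values
    and no warping window is applied.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the cheapest alignment of a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a[i-1] aligned with an already used b value
                cost[i][j - 1],      # b[j-1] aligned with an already used a value
                cost[i - 1][j - 1],  # both series advance together
            )
    return cost[n][m]


# Two hypothetical six-hour profiles with the same peak, one hour apart.
early_peak = [0.0, 1.0, 3.0, 1.0, 0.0, 0.0]
late_peak = [0.0, 0.0, 1.0, 3.0, 1.0, 0.0]

euclidean = math.dist(early_peak, late_peak)
warped = dtw_distance(early_peak, late_peak)
print(f"Euclidean: {euclidean:.3f}, DTW: {warped:.3f}")
```

In this toy case the Euclidean distance treats the two profiles as clearly different, while the warped alignment matches the shifted peak exactly. Whether that tolerance to time shifts is desirable for load profiles, where the hour of peak demand is itself informative, would need to be assessed.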
    To our knowledge this is the first work that applies state-of-the-art cluster analysis
techniques to the residential energy domain in a developing country context. While the
analysis is limited to the electricity sector, similar approaches may be promising in other
residential utility domains, such as the water sector.

References
 [1]   Adrian Albert and Mehdi Maasoumy. “Predictive Segmentation of Energy Consumers”. In: (2016).
       ISSN: 03062619. DOI: 10.1016/j.apenergy.2016.05.128.
 [2]   Florentin Batrinu et al. “Efficient iterative refinement clustering for electricity customer classifi-
       cation”. In: 2005 IEEE Russ. Power Tech, PowerTech (2005), pp. 1–7. DOI: 10.1109/PTC.2005.
       4524366.
 [3]   S. M. Bidoki et al. “Evaluating different clustering techniques for electricity customer classifi-
       cation”. In: 2010 IEEE PES Transm. Distrib. Conf. Expo. Smart Solut. a Chang. World (2010),
       pp. 1–5. DOI: 10.1109/TDC.2010.5484234.
 [4]   Hong An Cao, Christian Beckel, and Thorsten Staake. “Are domestic load profiles stable over
       time? An attempt to identify target households for demand side management campaigns”. In:
       IECON Proc. (Industrial Electron. Conf.) (2013), pp. 4733–4738. ISSN: 1553-572X.
       DOI: 10.1109/IECON.2013.6699900.
 [5]   Charalampos Chelmis. “Big Data Analytics for Demand Response: Clustering Over Space and
       Time”. In: 2015 IEEE Int. Conf. Big Data (Big Data) (2015), pp. 2223–2232.
       DOI: 10.1109/BigData.2015.7364011.
 [6]   G. Chicco et al. “Customer Characterization Options for Improving the Tariff Offer”. In: IEEE
       Power Eng. Rev. 22.11 (2002), p. 60. ISSN: 02721724. DOI: 10.1109/MPER.2002.4311841.
 [7]   Gianfranco Chicco, Roberto Napoli, and Federico Piglione. “Application of clustering algorithms
       and Self Organising Maps to classify electricity customers”. In: 2003 IEEE Bol. PowerTech - Conf.
       Proc. 1 (2003), pp. 373–379. ISSN: 00448486. DOI: 10.1109/PTC.2003.1304160.
 [8]   Gianfranco Chicco, Roberto Napoli, and Federico Piglione. “Comparison Among Clustering Tech-
       niques for Electricity Customer Classification”. In: IEEE Trans. POWER Syst. 21.2 (2006), pp. 1–
       7. DOI: 10.1109/TPWRS.2006.873122.
 [9]   The-Hien Dang-Ha, Roland Olsson, and Hao Wang. “Clustering Methods for Electricity Con-
       sumers: An Empirical Study in Hvaler-Norway”. In: NIK-2017 (2017). arXiv: 1703.02502. URL:
       http://arxiv.org/abs/1703.02502.
[10]   David L. Davies and Donald W. Bouldin. “A Cluster Separation Measure”. In: IEEE Trans. Pat-
       tern Anal. Mach. Intell. PAMI-1.2 (1979), pp. 224–227. ISSN: 01628828. DOI: 10.1109/TPAMI.
       1979.4766909.
[11]   Ian Dent et al. “Variability of behaviour in electricity load profile clustering; Who does things at
       the same time each day?” In: Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell.
       Lect. Notes Bioinformatics) 8557 LNAI (2014), pp. 70–84. ISSN: 16113349.
       DOI: 10.1007/978-3-319-08976-8_6. arXiv: 1409.1043v1.
[12]   J Du Toit et al. “Customer Segmentation Using Unsupervised Learning on Daily Energy Load
       Profiles”. In: J. Adv. Inf. Technol. 7.2 (2016), pp. 69–75. DOI: 10.12720/jait.7.2.69-75. URL:
       http://www.jait.us/uploadfile/2016/0505/20160505105403530.pdf.
[13]   University of Cape Town Eskom Stellenbosch University. Domestic Electrical Load Metering-
       Secure Data 1994-2014. version 1. 2019. DOI: 10.25828/p3k7-r965. URL: https://www.datafirst.
       uct.ac.za/dataportal/index.php/catalog/760.
[14]   Ramon Granell, Colin J Axon, and David C H Wallom. “Impacts of Raw Data Temporal Resolu-
       tion Using Selected Clustering Methods on Residential Electricity Load Profiles”. In: IEEE Trans.
       Power Syst. 30.6 (2015), pp. 3217–3224. DOI: 10.1109/TPWRS.2014.2377213.
[15]   Ling Jin et al. “Comparison of Clustering Techniques for Residential Energy Behavior Using
       Smart Meter Data”. In: AAAI Work. Artif. Intell. Smart Grids Smart Build. (2017), pp. 260–266.
[16]   Ling Jin et al. “Load Shape Clustering Using Residential Smart Meter Data: a Technical
       Memorandum”. In: September (2016), pp. 1–15.
[17]   Eamonn Keogh and Shruti Kasetty. “On the need for time series data mining benchmarks”. In:
       Proc. eighth ACM SIGKDD Int. Conf. Knowl. Discov. data Min. - KDD ’02 (2002), p. 102. ISSN:
       13845810. DOI: 10.1145/775047.775062. URL: http://portal.acm.org/citation.cfm?doid=775047.
       775062.
[18]   Jungsuk Kwac, June Flora, and Ram Rajagopal. “Household energy consumption segmentation
       using hourly data”. In: IEEE Trans. Smart Grid 5.1 (2014), pp. 420–430. ISSN: 19493053. DOI:
       10.1109/TSG.2013.2278477.

[19]   Peter Laurinec et al. “Adaptive Time Series Forecasting of Energy Consumption using Optimized
       Cluster Analysis”. In: Icdm (2016). DOI: 10.1109/ICDMW.2016.159.
[20]   Fintan McLoughlin, Aidan Duffy, and Michael Conlon. “A clustering approach to domestic
       electricity load profile characterisation using smart metering data”. In: Appl. Energy 141
       (2015), pp. 190–199. ISSN: 03062619. DOI: 10.1016/j.apenergy.2014.12.039.
       URL: http://dx.doi.org/10.1016/j.apenergy.2014.12.039.
[21]   Clayton Miller, Zoltán Nagy, and Arno Schlueter. “A review of unsupervised statistical learning
       and visual analytics techniques applied to performance analysis of non-residential buildings”. In:
       Renew. Sustain. Energy Rev. 81.December 2018 (2018), pp. 1365–1377. ISSN: 1364-0321. DOI:
       10.1016/j.rser.2017.05.124. URL: http://dx.doi.org/10.1016/j.rser.2017.05.124.
[22]   S Ramos et al. “Typical Load Profiles in the Smart Grid Context A Clustering Methods Compari-
       son”. In: 2012 IEEE Power Energy Soc. Gen. Meet. (2012), pp. 1–8. DOI: 10.1109/PESGM.2012.
       6345565.
[23]   Teemu Räsänen et al. “Data-based method for creating electricity use load profiles using large
       amount of customer-specific hourly measured electricity use data”. In: Appl. Energy 87.11 (2010),
       pp. 3538–3545. ISSN: 03062619. DOI: 10.1016/j.apenergy.2010.05.015.
[24]   Joshua D. Rhodes et al. “Clustering analysis of residential electricity demand profiles”. In: Appl.
       Energy 135 (2014), pp. 461–471. ISSN: 03062619. DOI: 10.1016/j.apenergy.2014.08.111. URL:
       http://dx.doi.org/10.1016/j.apenergy.2014.08.111.
[25]   Warren S. Sarle, Anil K. Jain, and Richard C. Dubes. Algorithms for Clustering Data. 1990. DOI:
       10.2307/1268876. URL: http://www.jstor.org/stable/1268876?origin=crossref.
[26]   Lukas G. Swan and V. Ismet Ugursal. “Modeling of end-use energy consumption in the residential
       sector: A review of modeling techniques”. In: Renew. Sustain. Energy Rev. 13.8 (2009), pp. 1819–
       1835. ISSN: 13640321. DOI: 10.1016/j.rser.2008.09.033.
[27]   Thanchanok Teeraratkul, Daniel O’Neill, and Sanjay Lall. “Shape-Based Approach to Household
       Electric Load Curve Clustering and Prediction”. In: IEEE Trans. Smart Grid 9.5 (2018). ISSN:
       19493053. DOI: 10.1109/TSG.2017.2683461. arXiv: 1702.01414.
[28]   Wiebke Toussaint. Domestic Electrical Load Metering, Hourly Data 1994-2014. version 1. 2019.
       DOI: 10.25828/56nh-fw77. URL: https://www.datafirst.uct.ac.za/dataportal/index.php/catalog/759.
[29]   Wiebke Toussaint. Domestic Electrical Load Survey - Key Variables 1994-2014. version 1. 2019.
       DOI: 10.25828/mf8s-hh79. URL: https://www.datafirst.uct.ac.za/dataportal/index.php/catalog/758.
[30]   George J. Tsekouras, Nikos D. Hatziargyriou, and Evangelos N. Dialynas. “Two-stage pattern
       recognition of load curves for classification of electricity customers”. In: IEEE Trans. Power Syst.
       22.3 (2007), pp. 1120–1128. ISSN: 08858950. DOI: 10.1109/TPWRS.2007.901287.
[31]   Joaquim L. Viegas et al. “Electricity demand profile prediction based on household characteris-
       tics”. In: Int. Conf. Eur. Energy Mark. EEM 2015-Augus (2015), pp. 0–4. ISSN: 21654093. DOI:
       10.1109/EEM.2015.7216746.
[32]   Sharon Xu, Edward Barbour, and Marta C González. “Household Segmentation by Load Shape
       and Daily Consumption”. In: Proc. of. ACM SigKDD 2017 Conf. (2017), pp. 1–9. DOI: 10.475/123.
       URL: http://humnetlab.mit.edu/wordpress/wp-content/uploads/2016/03/household-segmentation-
       load-shape-consumption.pdf.

A      Visualisations of descriptive statistics for input dataset

[Figure 4 panels: (a) Monthly income distribution; (b) Dwelling floor area distribution; (c) Years electrified distribution; (d) Proportioned survey responses for water access, wall and roof materials; (e) Histogram of mean daily household power consumption in 10 kWh bins]

                                     Fig. 4: Descriptive statistics of DEL survey respondents

B      Supplementary Tables for Clustering Experiments
B.1    Normalisation algorithms
The normalised daily load profile for household j observed on day d is denoted as $n_d^{(j)}$.


Norm              Equation                                                                      Comments

Unit norm (u)     $n_d^{(j)} = \frac{h_d^{(j)}}{|h_d^{(j)}|}$                                   Scales input vectors individually to unit norm.

De-minning (d)    $n_d^{(j)} = \frac{l(t)_d - l(t)_d^{\min}}{|l(t)_d - l(t)_d^{\min}|}$         Proposed by [15]. Subtracts daily min. demand from each hourly value, then divides by the de-minned daily total.

Zero-one (z)      $n_d^{(j)} = \frac{h_d^{(j)}}{l(t)_d^{\max}}$                                 Also known as min-max scaler. Scales all values to a range [0, 1]. Retains profile shape. Sensitive to outliers.

SA norm (sa)      $n_d^{(j)} = \frac{h_d^{(j)}}{\frac{1}{24} \times \sum_{t=0}^{23} l(t)_d}$    Frequently used by South African experts. Normalises input vectors to mean 1. Retains profile shape. Sensitive to outliers.

                     Table 4: Data normalisation algorithms and descriptions
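
The four normalisations in Table 4 can be sketched in a few lines of Python. This is a minimal illustration and not the implementation used in the study. It assumes that a daily profile is a plain list of 24 hourly values, that the norm in the unit norm equation is the Euclidean (L2) norm, and that a profile whose denominator would be zero is returned without scaling. All function names are hypothetical.

```python
import math


def unit_norm(profile):
    """Scale a daily profile to unit length (L2 norm is an assumption)."""
    length = math.sqrt(sum(v * v for v in profile))
    return [v / length for v in profile] if length else list(profile)


def de_min(profile):
    """Subtract the daily minimum, then divide by the de-minned daily total."""
    low = min(profile)
    shifted = [v - low for v in profile]
    total = sum(shifted)
    # A flat profile has a de-minned total of zero and is returned as zeros.
    return [v / total for v in shifted] if total else shifted


def zero_one(profile):
    """Divide by the daily maximum so that the peak hour maps to 1."""
    peak = max(profile)
    return [v / peak for v in profile] if peak else list(profile)


def sa_norm(profile):
    """Divide by the daily mean so that the normalised profile has mean 1."""
    mean = sum(profile) / len(profile)
    return [v / mean for v in profile] if mean else list(profile)


# Hypothetical profile: 12 low-demand hours followed by 12 high-demand hours.
profile = [1.0] * 12 + [3.0] * 12

for name, normalise in [("unit", unit_norm), ("de-min", de_min),
                        ("zero-one", zero_one), ("SA", sa_norm)]:
    values = normalise(profile)
    print(f"{name:>8}: min={min(values):.3f} max={max(values):.3f}")
```

For this toy profile the first three normalisations keep every value between 0 and 1, whereas SA norm yields values above 1, which is the property that affects the Euclidean distance and the error metrics.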


B.2    Bin ranges AMC pre-binning


                    bin               AMC
                    1           0 - 1 kWh   no consumption
                    2          2 - 50 kWh   lifeline tariff - free basic electricity
                    3        51 - 150 kWh
                    4       151 - 400 kWh
                    5       401 - 600 kWh
                    6      601 - 1200 kWh
                    7     1201 - 2500 kWh
                    8     2501 - 4000 kWh
                   Table 5: AMC bins based on South African electricity tariffs
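
Assigning a household to one of the bins in Table 5 amounts to a lookup against the upper bin edges. The sketch below is illustrative only: the constant and function names are hypothetical, and because the table lists whole-kWh ranges, the treatment of fractional values (for example 1.5 kWh, which falls between bins 1 and 2) is an assumption. Here each upper bound is treated as inclusive and anything above it moves to the next bin.

```python
from bisect import bisect_left

# Upper bounds (kWh) of bins 1 to 8, taken from Table 5.
AMC_UPPER_BOUNDS_KWH = [1, 50, 150, 400, 600, 1200, 2500, 4000]


def amc_bin(amc_kwh):
    """Return the Table 5 bin number (1 to 8) for an AMC value in kWh."""
    if amc_kwh < 0 or amc_kwh > AMC_UPPER_BOUNDS_KWH[-1]:
        raise ValueError(f"AMC of {amc_kwh} kWh is outside the Table 5 ranges")
    # bisect_left finds the first upper bound that is >= amc_kwh.
    return bisect_left(AMC_UPPER_BOUNDS_KWH, amc_kwh) + 1


for value in (0, 45, 51, 700, 4000):
    print(value, "kWh -> bin", amc_bin(value))
```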


B.3    Clustering metrics
The Silhouette Index for an individual pattern p in the dataset is:

$$\mathrm{silhouette}(p) = \frac{\mathrm{distinctness}(p) - \mathrm{compactness}(p)}{\max\{\mathrm{distinctness}(p),\ \mathrm{compactness}(p)\}} \tag{7}$$

Compactness is the average distance between p and all other patterns in the same cluster.
Distinctness is the average distance between p and all remaining patterns that are not in
the same cluster.
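
Equation 7 can be computed directly from these two definitions. The sketch below follows the wording above, so distinctness averages over all patterns outside the cluster of p. This differs from the more common formulation of the Silhouette Index, which uses only the nearest neighbouring cluster. The function name, the use of Euclidean distance via `math.dist` and the return value of 0 for a pattern with no cluster-mates are assumptions made for illustration.

```python
import math


def silhouette(index, patterns, labels):
    """Silhouette of patterns[index] as defined in Equation 7."""
    p = patterns[index]
    own = labels[index]
    same = [math.dist(p, q) for i, q in enumerate(patterns)
            if labels[i] == own and i != index]
    other = [math.dist(p, q) for i, q in enumerate(patterns)
             if labels[i] != own]
    if not same or not other:
        return 0.0  # assumption: undefined cases score 0
    compactness = sum(same) / len(same)
    distinctness = sum(other) / len(other)
    denominator = max(distinctness, compactness)
    return (distinctness - compactness) / denominator if denominator else 0.0


# Hypothetical two-dimensional patterns forming two well separated clusters.
patterns = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
labels = [0, 0, 1, 1]

score = silhouette(0, patterns, labels)
print(f"silhouette of first pattern: {score:.3f}")
```

Values close to 1 indicate that a pattern is much closer to its own cluster than to the rest of the dataset, while negative values indicate the opposite.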

    The Davies Bouldin Index (DBI) for two clusters is calculated as the ratio of the sum
of the two cluster dispersions to the distance between the two cluster centroids:

$$\mathrm{DBI}(i, j) = \frac{\mathrm{dispersion}(i) + \mathrm{dispersion}(j)}{\mathrm{distance}(i, j)} \tag{8}$$

Cluster dispersion can be calculated using different measures. A simple method for
computing it is as the average distance between the centroid of a cluster and each pattern
in the cluster. The DBI for the dataset is obtained by averaging, over all clusters, the
similarity measure of each cluster and its most similar cluster, $\mathrm{DBI}(i, j)_{\max}$. A small DBI
value indicates that cluster dispersions are small and distances between clusters are large,
which is desirable. When plotting the DBI against the number of clusters, the optimal
number of clusters can be visually identified. It is possible for the DBI to have several
local minima [10].
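
The calculation described above can be sketched as follows. The code uses the simple dispersion measure mentioned in the text (average distance between the centroid and each pattern) and Euclidean distance. The function names are hypothetical, and the sketch assumes at least two clusters whose centroids do not coincide.

```python
import math


def centroid(cluster):
    """Component-wise mean of the patterns in a cluster."""
    return tuple(sum(column) / len(cluster) for column in zip(*cluster))


def dispersion(cluster):
    """Average distance between the cluster centroid and each pattern."""
    centre = centroid(cluster)
    return sum(math.dist(centre, p) for p in cluster) / len(cluster)


def davies_bouldin(clusters):
    """Average, over all clusters, of the largest pairwise DBI(i, j)."""
    centres = [centroid(c) for c in clusters]
    spreads = [dispersion(c) for c in clusters]
    worst_case = []
    for i in range(len(clusters)):
        # Equation 8 evaluated against every other cluster; keep the maximum.
        worst_case.append(max(
            (spreads[i] + spreads[j]) / math.dist(centres[i], centres[j])
            for j in range(len(clusters)) if j != i
        ))
    return sum(worst_case) / len(worst_case)


# Hypothetical clusters: each has dispersion 1 and the centroids are 4 apart.
clusters = [[(0.0, 0.0), (0.0, 2.0)], [(4.0, 0.0), (4.0, 2.0)]]
print(davies_bouldin(clusters))
```

Moving the two toy clusters further apart, or tightening them, lowers the value, in line with the interpretation that a small DBI is desirable.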