A study of SOM clustering software implementations
                                                        A. B. Adeyemo
                                                 Computer Science Department
                                                     University of Ibadan
                                                            Nigeria
                                                      +2348052107367
                                                  sesanadeyemo@gmail.com


CoRI’16, Sept 7–9, 2016, Ibadan, Nigeria.

ABSTRACT
Clustering algorithms generally suffer from some well-known problems that Self-Organizing Map (SOM) algorithms are adept at handling. While there are many variants of the SOM algorithm, software programs that implement it have tended to give varying results even when tested on the same data sets. This can have serious implications when the goal of the clustering is novelty detection. In this study, the performance of some SOM clustering software was compared and the results are presented.

CCS Concepts
• General and reference ➝ Cross-computing tools and techniques ➝ Empirical studies

Keywords
Comparative Analysis; Clustering; Self Organizing Maps.

1. INTRODUCTION
In the clustering process, data is grouped in such a way that the intra-cluster similarity is maximized while the inter-cluster similarity is minimized. Data can be described by either categorical or numeric features. Due to the differences in the characteristics of these two kinds of data, attempts to develop criteria functions for mixed data have not been very successful [15]. There are two widely used clustering methods: the hierarchical and the nonhierarchical (partitional) methods. The hierarchical clustering process can be categorized as divisive, when a large data set is divided into several small groups, and agglomerative, when small groups of data are merged to create a larger cluster. Self-Organizing Maps (SOM) are competitive networks that provide a "topological" mapping from the input space to the clusters [4]. The SOM was inspired by the way in which various human sensory impressions are neurologically mapped into the brain such that spatial or other relations among stimuli correspond to spatial relations among the neurons.

In a SOM, the neurons (clusters) are organized into a grid which is usually two-dimensional, but sometimes one-dimensional or (rarely) of three or more dimensions. The reason for using one- and two-dimensional grids is that structures of higher dimensionality are difficult to display on a monitor. The SOM working algorithm is a variant of multidimensional vector clustering, of which the K-means clustering algorithm is another example [9].

The SOM neural network uses a competitive learning algorithm and is a method for unsupervised learning, based on a grid of artificial neurons whose weights are adapted to match input vectors in a training set. The SOM algorithm is fed with feature vectors, which can be of any dimension. The algorithm for the training of the SOM [4] is easily explained in terms of a set of artificial neurons, each having its own physical location on the output map, which take part in a winner-take-all process: the node whose weight vector is closest to the input vector is declared the winner, and its weights are adjusted to bring them closer to the input vector.

Figure 1: Illustration of the updating of the Best Matching Unit (BMU) of a SOM grid and its neighbors

In each training step, one sample vector 'x' from the input data set is chosen randomly and a similarity measure is calculated between it and all the weight vectors of the map. The Best-Matching Unit (BMU), denoted as 'c', is the unit whose weight vector has the greatest similarity with the input sample 'x' (figure 1). The similarity is usually defined by means of a distance measure, usually the Euclidean distance. The BMU is


defined mathematically as the processing element for which the expression:

\[ d(x, m_c) = \min_i d(x, m_i) \tag{1} \]

holds, where d is the distance measure and m_i denotes the weight vector of unit i.

Each node has a set of neighbors. When a node wins a competition, its neighbors' weights are also changed, but not as much as those of the winning node. The further a neighbor is from the winner, the smaller its weight change. The SOM update rule for the weight vector of unit i is given mathematically as:

\[ m_i(t+1) = m_i(t) + h_{c(x),i}(t)\,[x(t) - m_i(t)] \tag{2} \]

where
t represents the sample index for each presentation of a sample 'x'
h_{c(x),i} represents the neighborhood function around the winner unit 'c', with neighborhood radius r(t).

The neighborhood function is like a smoothing kernel that is time-variable. It is a decreasing function of the distance between the ith and cth reference vectors on the map grid. The neighborhood function is usually expressed as the Gaussian function, which can be expressed mathematically as:

\[ h_{c(x),i}(t) = \alpha(t)\,\exp\!\left(-\frac{\lVert r_i - r_c \rVert^2}{2\sigma^2(t)}\right) \tag{3} \]

where
r_i and r_c represent the locations of units i and c on the map grid
α(t) represents the learning rate factor and takes values 0 < α(t) < 1
σ(t) represents the width of the neighborhood function, which decreases monotonically with the regression steps.

A simpler definition of the neighbourhood function given by Kohonen [4] is:

\[ h_{c(x),i}(t) = \alpha(t) \tag{4} \]

if ‖r_i − r_c‖ is smaller than a given radius around node 'c', and h_{c(x),i}(t) = 0 otherwise; the radius is also a monotonically decreasing function of the regression steps. The neighbourhood width σ(t) is a diminishing function of time. At the beginning of the learning procedure it is fairly large, but it is made to gradually shrink during learning. Towards the end of learning a single winning processing element is trained. A linear diminishing function of time is usually used. The learning process consists of winner selection by Equation (1) and adaptation of the synaptic weights by Equation (2). This process is repeated for each input vector, usually for a large number of cycles, with different inputs producing different winners. The network therefore associates output nodes with groups or patterns in the input data set. The SOM algorithm is very simple and allows for many subtle adaptations.

There are some visual displays that are used to "determine" where the natural cluster boundaries are in the SOM. Some of the visual tools that can be used are Histograms [6], Component Plane displays [3], and U-matrix, P-matrix and U*-matrix displays [10], [11], [12], [13]. An important concept in interpreting these displays is the interaction of two properties of the SOM: the neighborhood relationship and the density mapping. Neighboring neurons in the SOM cannot be too far away from each other (in order to maintain their similarity), but the SOM also tends to place more neurons in areas of high input density (for example, logical clusters). Because of this, some neurons will be placed in areas between natural clusters, which are typically low input density areas (so that the map can "stretch" between clusters).

The standard SOM algorithm uses numeric variables and the Euclidean distance function. The arithmetic operations used during the learning phase to update the feature vectors cannot be applied to categorical values, so the SOM was not directly designed to work with categorical variables. The method usually adopted is to translate categories to numbers during data pre-processing and then train on the transformed data using the standard SOM algorithm [2]. The Kohonen SOM clustering algorithm has also been used for classification purposes with remarkable results. There is a fundamental difference between the clustering process and the classification process: clustering is an unsupervised process while classification is supervised. Usually data clustering is used as a pre-processor for classification purposes [8].

A rich variety of versions of the basic SOM algorithm have been proposed. Some of the variants aim at improving the preservation of topology by using more flexible map structures instead of the fixed grid. Some of these methods, however, cannot be used for visualization as easily as the regular grid. Some variants aim at reducing the computational complexity of the SOM [3]. Experiments can be carried out using different distance measures, map topologies, and training parameters such as the learning rate and neighbourhood function.

Even with identical settings, repeated training of a SOM can lead to different mappings because of the random initialisation. Yet it has been shown that the conclusions drawn from the map remain remarkably consistent, which makes it a very useful tool in many different circumstances [14]. Some of the desirable features that good SOM clustering software should have include:
1.   Being able to set the neighborhood kernel function and the start value for the neighborhood function (learning radius): The neighborhood function determines how strongly the processing elements are connected to each other. Neighborhoods of different sizes in different neuron configurations (e.g. rectangular and hexagonal lattices) can be used. The simplest neighborhood function is the bubble (winner-takes-all): it is constant (or 1) over the whole neighborhood of the winner unit and zero elsewhere. Usually the neighbourhood function is expressed as a Gaussian function and, as expected, using the winner-takes-all function retrieves fewer clusters than the Gaussian function.
2.   Being able to set the activation function and weight initialization methods: Before the training, initial values are given to the prototype vectors of the SOM. The SOM is very robust with respect to the initialization process; however, when properly accomplished it allows the algorithm to converge faster to a good
     solution. Initialization procedures that have been used are: Random initialization, where the weight vectors are initialized with small random values; Sample initialization, where the weight vectors are initialized with random samples drawn from the input data set; Linear initialization, where the weight vectors are initialized in an orderly fashion along the linear subspace spanned by the two principal eigenvectors of the input data set.
3.   Being able to set the choice of cooling strategy during training: for example linear or exponential.
4.   Being able to set the distance measure to be used, for example Euclidean, Manhattan and Maximum value: The distance measure between data points is an important component of a clustering algorithm. If the components of the data instance vectors are all in the same physical units, then it is possible to use the simple Euclidean distance metric to successfully group similar data elements. The Euclidean distance in a two- or three-dimensional space is the actual geometric distance between objects in the space. However, it has been observed that even the Euclidean distance can sometimes be misleading, because of the way its formula combines the distances between the single components of the data feature vectors into one distance measure for clustering. Different formulas lead to different clusterings. Therefore, domain knowledge must be used to guide the formulation of a suitable distance measure for each particular application.
5.   Being able to set the scaling technique to be used: for example z-transform, (0,1) transform, (-1,1) transform or none, depending on the clustering goal and data set.
6.   Being able to set the starting and stopping learning rate: The learning rate is a decreasing function of time taking values in [0,1]. The learning rate can be expressed as a linear function or as a function inversely proportional to time. Using the inverse function ensures that all input samples have approximately equal influence on the training result. Some learning rate functions that have been implemented are the linear, inverse-of-time, and power series functions.
7.   Being able to set the training algorithm to be used: for example batch, on-line, hybrid etc. The batch algorithm has been shown to be faster [4] than the normal sequential algorithm (and the results are just as good or even better).
8.   Good data visualization options: for example histograms, Hinton charts, weight charts (maps), U-Matrix, P-Matrix etc. Good result analysis and presentation functions: computation of vital statistics for evaluating the quality of the clustering, for example mean, standard deviation (or variance), correlation coefficient, t-test etc.

This work presents a comparative study of the performance of some SOM clustering software when tested on the same data set. Results are presented and reasons for the observed variations are given. The study also presents the desirable features that standard SOM software should have.

2. MATERIALS AND METHODS
Agrometeorological data for FRIN headquarters, Ibadan, Nigeria was used. The data set had 254 records and the attributes in the data set were: Year (numeric), Month (text), Total Rainfall in millimeters (numeric), Minimum Temperature in Celsius (numeric), Maximum Temperature in Celsius (numeric), Relative Humidity and Fire Danger Index (numeric). The SOM software used were: NNClust, Pittnet Neural Network Educational Software and RapidMiner Studio.

The NNClust software was programmed to use only the Gaussian neighbourhood function and the Euclidean distance measure. The user can input the learning rate and starting neighbourhood size. The software automatically normalizes the input data between -1 and 1 and has features for generating data/result statistics and data visualization such as weight maps and radar charts. The Pittnet software also uses the Gaussian neighbourhood function and the Euclidean distance metric. The user defines the starting learning rate, and the software automatically normalizes the data between 0 and 1. It is a DOS-based program that saves its result in a text file and has no data analysis or data visualization ability. RapidMiner Studio (Community Edition) has facilities for setting the learning rate and neighbourhood radius, and the user can choose whether or not to normalize the data. It also has an array of tools for statistical data analysis and data visualization.

Clusters were generated using the three software packages. The arithmetic mean of each cluster group was also computed. The arithmetic mean is a measure of central tendency which describes the central location of data. It is usually used with other statistical measures such as the standard deviation because it can be affected by extreme values in the data set and therefore be biased. The standard deviation describes the spread of the data and is a popular measure of dispersion. It measures, roughly, the average distance between a single observation and the mean.

3. RESULTS AND DISCUSSION
The meteorological data was clustered using the NNClust SOM clustering software with a starting learning rate of 0.9 and was trained over 100 epochs. The software accepts only numeric values. Non-numeric values are treated as missing values, which are replaced by the column mean. The software was set to identify a maximum of ten clusters; however, only eight clusters were generated. The software uses the number of clusters specified to create the SOM grid. The mean and standard deviation of the eight clusters were computed. Increasing the number of training cycles did not improve the results. Table 1 presents the summary of the eight clusters, while figure 2 presents the chart of the cluster means.
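The training procedure that such software carries out (winner selection followed by a neighbourhood-weighted weight update, Equations (1) to (3)) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the code of NNClust, Pittnet or RapidMiner: the function names, the linear decay of both the learning rate and the neighbourhood width, the sample initialization, the 1 x 2 grid and the toy data are all hypothetical choices. The starting learning rate of 0.9 and the 100 epochs mirror the settings reported above.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def best_matching_unit(weights, x):
    """Equation (1): grid position of the unit whose weight vector is closest to x."""
    return min(weights, key=lambda pos: euclidean(weights[pos], x))


def train_som(data, rows, cols, epochs=100, alpha0=0.9, sigma0=None, seed=0):
    """Sequential SOM training with a Gaussian neighbourhood (assumed schedules)."""
    rng = random.Random(seed)
    dim = len(data[0])
    if sigma0 is None:
        sigma0 = max(rows, cols) / 2.0  # assumed starting neighbourhood width
    # Sample initialization: each unit starts at a randomly drawn input record.
    weights = {(r, c): list(rng.choice(data))
               for r in range(rows) for c in range(cols)}
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # random presentation order
            frac = step / total_steps
            alpha = alpha0 * (1.0 - frac)             # linear learning-rate decay
            sigma = max(sigma0 * (1.0 - frac), 0.01)  # linear width decay, floored
            c = best_matching_unit(weights, x)
            for pos, m in weights.items():
                grid_d2 = (pos[0] - c[0]) ** 2 + (pos[1] - c[1]) ** 2
                # Equation (3): Gaussian neighbourhood scaled by the learning rate.
                h = alpha * math.exp(-grid_d2 / (2.0 * sigma ** 2))
                # Equation (2): move the weight vector towards the sample.
                for k in range(dim):
                    m[k] += h * (x[k] - m[k])
            step += 1
    return weights


def assign_clusters(weights, data):
    """Label each record with the grid position of its best-matching unit."""
    return [best_matching_unit(weights, x) for x in data]


# Hypothetical toy data: two well-separated groups, already scaled to [0, 1].
data = [[0.0, 0.1], [0.1, 0.0], [0.05, 0.05],
        [0.9, 1.0], [1.0, 0.9], [0.95, 0.95]]
som = train_som(data, rows=1, cols=2, epochs=100, alpha0=0.9)
labels = assign_clusters(som, data)
```

Because the initialization and presentation order are random, a different `seed` may swap which unit represents which group, which is one source of the run-to-run variation noted in the introduction.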

Figure 2: Chart of NNClust cluster means (series: TotalRainfall, MaxTemp, MinTemp, RH, FireDangerIndex; horizontal axis: Cluster 1 to Cluster 8)

The meteorological data was clustered using the Pittnet software with a starting learning rate of 0.9, and training was set to run for 100 epochs, although the software stops training as soon as the maximum number of clusters has been generated. The software requires the user to specify the expected number of clusters a priori. This number is used in conjunction with the number of input signals (attributes) to determine the SOM grid size. The expected number of clusters was set to ten. The software identified only four clusters. The mean and standard deviation of the clusters were computed. Table 2 presents the summary of the clusters, while figure 3 presents the chart of the cluster means.

Figure 3: Chart of Pittnet software cluster means (legend: Cluster 1, Cluster 2, Cluster 5, Cluster 6)

The RapidMiner Studio software was used to cluster the meteorological data set using a starting learning rate of 0.9 and was trained over 100 epochs. The expected number of clusters was set at ten and the software generated ten clusters. Table 3 presents the summary of the cluster means with their standard deviations, while figure 4 presents a chart of the cluster means.

Figure 4: Chart of RapidMiner Studio cluster means (series: TotalRainfall, MaxTemp, MinTemp, RH, FireDangerIndex)
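The cluster summaries referred to above rest on a per-cluster mean and standard deviation for each attribute. A minimal sketch of that computation is given below. The records, labels and the helper name `summarize_clusters` are hypothetical and are not taken from the FRIN data set, and the sample standard deviation (n - 1 denominator) is an assumption, since the study does not state which form was used.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarize_clusters(records, labels, fields):
    """Return {cluster: {field: (mean, standard deviation)}}."""
    groups = defaultdict(list)
    for record, label in zip(records, labels):
        groups[label].append(record)
    summary = {}
    for label, members in groups.items():
        summary[label] = {}
        for field in fields:
            values = [m[field] for m in members]
            # stdev needs at least two values; a single record has no spread.
            spread = stdev(values) if len(values) > 1 else 0.0
            summary[label][field] = (mean(values), spread)
    return summary


# Hypothetical toy records and cluster labels, for illustration only.
records = [
    {"TotalRainfall": 10.0, "MaxTemp": 34.0},
    {"TotalRainfall": 20.0, "MaxTemp": 36.0},
    {"TotalRainfall": 200.0, "MaxTemp": 29.0},
    {"TotalRainfall": 220.0, "MaxTemp": 31.0},
]
labels = [1, 1, 2, 2]
summary = summarize_clusters(records, labels, ["TotalRainfall", "MaxTemp"])
```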
3.1      Discussion of Results
The quality of the clusters identified in the data by the three software packages can be inferred from a comparison of the mean and standard deviation of the clusters. If the value of the standard deviation is low, then the clustered records are within the same range. However, if the value is high, this suggests the presence of outliers in the clustered data records. For example, table 4 presents the clustered records for cluster 2 (table 1) for the NNClust software, which is representative of the trend observed in the clusters identified by the software. The cluster is difficult to interpret when the values in the Total Rainfall field are considered: the field has a mean of 142.05 and a standard deviation of 136.011711.

Similarly, the same trend is observed in the clusters identified by the Pittnet software in table 2. Table 5 presents the records for cluster 4 (table 2) for the Pittnet software cluster results. It can be observed that the cluster consists of data records which have the same value for the FireDangerIndex attribute. However, the Total Rainfall field has a mean value of 39.74444 and a standard deviation of 43.34732. The high standard deviation value implies that there are outlier data values in the clustered records.

The clusters identified by the RapidMiner software presented in table 3 were easier to interpret. They followed the expected rainfall pattern which is known for the region where the data was collected [5]. Cluster 2 (table 3) contained only records with a high FireDangerIndex of 4, as presented in table 6, while cluster 5 (table 3) contained records with the highest recorded Rainfall level in the data set. The other clusters also contained data records which can be categorized by the Rainfall level pattern of the region.
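One way to make the notion of a "high" standard deviation concrete is to compare it with the mean through their ratio (the coefficient of variation). The sketch below applies this to the two Total Rainfall summaries quoted above. The criterion itself and the threshold of 0.5 are illustrative assumptions and were not part of the study.

```python
def coefficient_of_variation(mean_value, std_value):
    """Ratio of the standard deviation to the absolute mean."""
    if mean_value == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return std_value / abs(mean_value)


# (mean, standard deviation) of Total Rainfall as reported in the discussion.
reported = {
    "NNClust cluster 2": (142.05, 136.011711),
    "Pittnet cluster 4": (39.74444, 43.34732),
}

THRESHOLD = 0.5  # assumed cut-off, chosen only for illustration

ratios = {name: coefficient_of_variation(m, s) for name, (m, s) in reported.items()}
flags = {name: ratio > THRESHOLD for name, ratio in ratios.items()}
```

For both clusters the standard deviation is about as large as the mean (ratios of roughly 0.96 and 1.09), which agrees with the reading that the Total Rainfall values in these clusters are widely spread.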

4. CONCLUSION
Some of the problems found in the literature about clustering algorithms are as follows. Most clustering techniques are based on distance calculations, which are very sensitive to the ranges of variables, so the values have to be normalized. Normalization, however, is a subjective function, and these transformations cannot be carried out without creating biases. The presence of outliers in data sets creates problems in data clustering based on distance calculations when they have not been identified and removed from the data set. Handling categorical variables (also called non-numeric data, categorical data, nominal data, or nominal variables) is a problem for most clustering algorithms, and even when data encoding methods are used they can introduce extra biases due to the number of values which the encoding introduces in the categorical variables. The selection of variables also has a large influence on clustering results; while different weights can be assigned to variables and categorical values, this can affect the clustering quality when many variables and categorical values are involved. Capturing patterns (or behaviors) hidden inside time-varying variables and modeling them is another problem, and most clustering techniques do not possess this predictive modeling capability. Most clustering techniques were developed for simple laboratory-generated data sets consisting of a few to several numerical variables; hence they cannot be used for large data analyses involving many complex categorical variables.

Most common implementations of data clustering algorithms suffer from these problems. SOMs, however, are very robust and are adept at handling them, although this also depends on the goal of the algorithm's implementation (programming). Applications programmed for demonstration purposes cannot be used for large-scale projects, and some implementations are not flexible and do not give users many options. However, if an implementation of the conventional SOM algorithm (implementations are usually focused on the goals of the programmer) provides enough options to the user, the SOM is still a very robust algorithm that can be used for numerical, categorical and mixed data sets. Further work in this study is focused on the development of an open, flexible SOM clustering tool with adequate features that can be used for research purposes.

5. REFERENCES
[1]. Chang C., Ding Z., (2004), "Categorical data visualization and clustering using subjective factors", Data & Knowledge Engineering, Published by Elsevier B.V.
[2]. Chen N. and Marques N. C., (2005), "An Extension of Self-Organizing Maps to Categorical Data", Proceedings of the 12th Portuguese Conference on Progress in Artificial Intelligence, pp 304 - 313, (Springer-Verlag Berlin, Heidelberg ©2005)
[3]. Kaski S., (1997), "Data exploration using self-organizing maps", Acta Polytechnica Scandinavica, Mathematics, Computing and Management in Engineering Series No. 82, Espoo 1997.
[4]. Kohonen T, (1999), "The Self-Organizing Map (SOM)", Helsinki University of Technology, Laboratory of Computer and Information Science, Neural Networks Research Centre, Quinquennial Report (1994-1998), (Downloaded from http://www.cis.hut.fi/research/reports/quinquennial/ January 2006).
[5]. Nigeria Climate Review, 2010, Nigerian Meteorological Agency, www.nimetng.org
[6]. Pampalk E, Rauber A, Merkl D, (2002), "Using Smoothed Data Histograms for Cluster Visualization in Self Organizing Maps", Technical Report OeFAI-TR-2002-29, extended version published in Proceedings of the International Conference on Artificial Neural Networks, Springer Lecture Notes in Computer Science, Madrid, Spain, 2002.
[7]. Pelczer I. J. and Cisneros H. L., (2008), "Identification of rainfall patterns over the Valley of Mexico", 11th International Conference on Urban Drainage, Edinburgh, Scotland, UK, 2008
[8]. Principe J. C., Euliano N. R., Lefebvre W. C, (2000), Neural and Adaptive Systems: Fundamentals Through Simulations, John Wiley and Sons Inc, ISBN 0-471-35167-9, pp 656.
[9]. Statsoft Electronic Statistics Textbook, (2002), Copyright, 1984-2003, (http://www.statsoftinc.com/txtbook/glosd.html#Data Mining), Downloaded June 2002.
[10]. Ultsch, A., (1999), Data Mining and Knowledge Discovery with Emergent Self-Organizing Feature Maps for Multivariate Time Series, In Kohonen Maps, (1999), pp. 33-46.
[11]. Ultsch A, (2003a), Maps for the Visualization of high-dimensional Data Spaces, Proc. Workshop on Self Organizing Maps, pp 225 - 230, Kyushu, Japan, 2003.
[12]. Ultsch A, (2003b), U*-Matrix: a Tool to visualize Clusters in high dimensional Data, Technical Report No. 36, Computer Science Department, University of Marburg, Germany, 2003.
[13]. Ultsch A., Moerchen F, (2005), ESOM-Maps: tools for clustering, visualization, and classification with Emergent SOM, Technical Report No. 46, Dept. of Mathematics and Computer Science, University of Marburg, Germany, 2005.
[14]. Wehrens R., Buydens L. M. C., 2007, Self- and Super-organizing Maps in R: The kohonen Package, Journal of Statistical Software, published by the American Statistical Association, Vol. 21, Issue 5
[15]. Zengyou He, Xiaofei Xu, Shengchun Deng, (2003), "Clustering Mixed Categorical and Numeric Data", Department of Computer Science and Engineering, Harbin Institute of Technology, Harbin 150001, P. R. China


                                                 Table 1: Summary of NNClust clusters


                                 TotalRainfall        MaxTemp            MinTemp        RH            FireDangerIndex
      Cluster 1        Mean      3.7                  32                 24             83            2
                       SD        0                    0                  0              0             0
      Cluster 2        Mean      142.05               33.5               24.5           79.33333      2.666666667
                       SD        2.61629509           22.627417          16.9706        4.501851      0.516397779
      Cluster 3        Mean      113.313158           31.1236842         31.0605        70.54737      2.5
                       SD        69.9895185           15.4557389         11.4404        45.62364      1.246560403
      Cluster 4        Mean      149.99               30.8333333         30.2967        73.75333      2.333333333
                       SD        98.1425436           3.53058883         20.0499        25.41582      0.546672274
      Cluster 5        Mean      109.891667           30.6333333         36.1667        64.64444      2.638888889
                       SD        92.1210985           4.02073199         24.3938        34.37646      0.723198364
      Cluster 6        Mean      141.621277           31.7574468         27.0617        73.1617       2.617021277
                       SD        97.0359995           2.63056819         13.7078        20.8623       0.644481304
      Cluster 7        Mean      123.545794           31.4411215         29.4963        74.41028      2.411214953
                       SD        81.8137003           2.96536463         18.4077        24.4239       0.531165877
      Cluster 8        Mean      175.268966           29.3793103         23.069         86.89655      2.068965517
                       SD        85.4901878           1.49794605         1.06674        4.312315      0.257880715


                                          Table 2: Summary of the Pittnet software clusters


                               TotalRainfall         MaxTemp         MinTemp            RH            FireDangerIndex
     Cluster 1      Mean       50.850001             24.75           63.5               3.9           4
                    SD         31.32483              0.070709        12.0208153         0.141421356   0
     Cluster 2      Mean       134.3332              31.7082         23.5984375         82.4218728    2.3828125
                    SD         91.137324             2.254123        1.06439596         6.908488013   0.487025284
     Cluster 3      Mean       138.05185             24.64815        84.4074074         2.196296296   2.407407407
                    SD         45.668999             15.90804        27.2370968         39.48311832   1.836329785
     Cluster 4      Mean       39.744444             35.55556        23.5555556         59.22222133   4
                    SD         43.347321             1.333333        1.74005108         7.120003363   0


                                          Table 3: Summary of RapidMiner Studio clusters

                              TotalRainfall         MaxTemp          MinTemp         RH               FireDangerIndex
Cluster 0         Mean        42.35385              33.41154         23.99615        78.46153846      2.730769231
                  SD          8.192056              2.308823         0.911913        7.798619207      0.603833905
Cluster 1         Mean        13.50513              33.47179         23.80769        77.43589744      2.820512821
                  SD          9.379343              2.342845         1.280909        6.302860135      0.451418517
Cluster 2         Mean        7.64                  35.36            23.42           55.2             3.8
                  SD          16.15873              17.96476         13.16786        40.93966268      1.299899072
Cluster 3         Mean        57.94667              25.35333         78.13333        2.726666667      2.933333333
                  SD          13.23034              15.63488         11.11308        32.15964741      1.361648053
Cluster 4         Mean        211.4214              23.90714         88.14286        1.871428571      2.071428571
                  SD          46.93198              1.320527         4.24005         0.299816794      0.267261242
Cluster 5         Mean        270.4346              30.36154         23.21923        85.19230769      2.115384615
                  SD          42.68863              1.395814         0.859101        5.129837598      0.322602539
Cluster 6         Mean        188.0463              30.77805         23.31463        84.90243902      2.146341463
                  SD          15.90989              1.518801         0.887288        5.180757078      0.357839043
Cluster 7         Mean        144.6971              31.42            23.47429        82.85714286      2.342857143
                  SD          10.84353              1.991127         0.995089        7.6855206        0.481593992
Cluster 8         Mean        110.85                31.84474         23.72105        82.31578947      2.473684211
                  SD          9.73158               2.332462         1.076822        6.794692934      0.603451429
Cluster 9         Mean        70.05862              32.27241         24.04828        81.31034483      2.482758621
                  SD          8.635041              2.37936          1.180684        9.043953972      0.508547628


                               Table 4: Sample NNClust software cluster result
            Year    Months     TotalRainfall   MaxTemp        MinTemp     RH           FireDangerIndex
            1980    Feb.       60              35             27          75           3
            1987    Aug.       357.1           30             23          86           2
            1987    Nov.       10              35             24          80           3
            1989    Mar.       57              35             25          77           3
            1991    Apr.       108.9           32             24          83           2
            1998    Sept.      259.3           34             24          75           3
            Mean               142.05          33.5           24.5        79.33333     2.666667
            SD                 136.0117        2.073644       1.378405    4.501851     0.516398
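
The Mean and SD rows of Table 4 can be reproduced from the six member records. The following is a minimal sketch, assuming the SD row is the sample standard deviation (n - 1 denominator), which is the form that matches the tabulated values. The names `rows`, `columns` and `summarize` are illustrative and are not taken from any of the software studied.

```python
from statistics import mean, stdev

# Member records of the sample NNClust cluster (Table 4):
# (TotalRainfall, MaxTemp, MinTemp, RH, FireDangerIndex)
rows = [
    (60.0, 35, 27, 75, 3),
    (357.1, 30, 23, 86, 2),
    (10.0, 35, 24, 80, 3),
    (57.0, 35, 25, 77, 3),
    (108.9, 32, 24, 83, 2),
    (259.3, 34, 24, 75, 3),
]
columns = ["TotalRainfall", "MaxTemp", "MinTemp", "RH", "FireDangerIndex"]


def summarize(records):
    """Return {column: (mean, sample SD)} for a list of records."""
    by_column = zip(*records)  # transpose rows into columns
    return {
        name: (mean(values), stdev(values))
        for name, values in zip(columns, by_column)
    }


summary = summarize(rows)
for name, (m, sd) in summary.items():
    print(f"{name:16s} mean={m:.5f} sd={sd:.6f}")
```

Running the same function over every cluster produced by a tool would give a summary in the layout of Tables 1 to 3.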


                               Table 5: Sample Pittnet software cluster result

            Year    Months     TotalRainfall   MaxTemp       MinTemp     RH            FireDangerIndex
            1989    Feb.       18.4            35            22          51            4
            1990    Feb.       40.3            35            23          64            4
            1990    Mar.       11.7            37            25          69            4
            1994    Jan.       1.3             33            20          45            4
            1997    Mar.       122.2           35            23          62            4
            1998    Feb.       2               36            25          60            4
            2000    Mar.       48.8            37            25          62            4
            2001    Mar.       15              37            25          60            4
            2001    Apr.       98              35            24          60            4
            Mean               39.74444        35.55556      23.55556    59.22222      4
            SD                 43.34732        1.333333      1.740051    7.120003      0


                 Table 6: Sample RapidMiner software cluster result

 Year   Months      TotalRainfall    MaxTemp           MinTemp     RH             FireDangerIndex
1989    Feb.        18.4             35                22          51             4
1994    Jan.        1.3              33                20          45             4
1998    Feb.        2                36                25          60             4
2001    Mar.        15               37                25          60             4
2004    Mar.        1.5              35.8              25.1        60             3
Mean                7.64             35.36             23.42       55.2           3.8
SD                  8.361399         1.499333          2.319914    6.906519       0.447214
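
One way to quantify how far two tools agree is to compare the membership of their clusters directly. The sketch below is not part of the original study: it computes the Jaccard index between the Pittnet cluster of Table 5 and the RapidMiner cluster of Table 6, identifying each record by its (year, month) pair. The choice of the Jaccard index and the name `jaccard` are assumptions made for illustration.

```python
# Records are identified by (year, month), as listed in Tables 5 and 6.
pittnet_cluster = {
    (1989, "Feb"), (1990, "Feb"), (1990, "Mar"), (1994, "Jan"), (1997, "Mar"),
    (1998, "Feb"), (2000, "Mar"), (2001, "Mar"), (2001, "Apr"),
}
rapidminer_cluster = {
    (1989, "Feb"), (1994, "Jan"), (1998, "Feb"), (2001, "Mar"), (2004, "Mar"),
}


def jaccard(a, b):
    """Jaccard index |a & b| / |a | b| of two sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


shared = pittnet_cluster & rapidminer_cluster
print(sorted(shared))
print(jaccard(pittnet_cluster, rapidminer_cluster))
```

Four of the five RapidMiner records also appear in the Pittnet cluster, so the two sets share 4 of 10 distinct records: both tools isolate a similar group of high fire-danger records but draw the cluster boundary differently.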

                   Table 7: Sample RapidMiner software cluster result

 Year   Months     TotalRainfall    MaxTemp       MinTemp         RH          FireDangerIndex
 1979   Jul.       291.2            29            23              85          2
 1979   Sept.      269              29            23              86          2
 1979   Oct.       223.6            31            24              86          2
 1979   Nov.       261.4            32            24              83          2
 1980   Jun        306              31            23              82          2
 1980   Aug.       427.4            28            23              88          2
 1980   Sept.      333.5            29            23              90          2
 1981   Sept.      233.9            30            23              86          2
 1981   Oct.       225.1            31            24              83          2
 1983   May        250.7            31            24              85          2
 1984   May        223              32            23              86          2
 1984   Jun        233.6            30            22              82          2
 1985   Jul.       307.2            30            23              86          2
 1985   Aug.       232.2            30            23              89          2
 1986   Jun        312.9            31            23              83          2
 1986   Sept.      374.1            29            22              84          2
 1987   Jul.       246.8            30            23              85          2
 1987   Aug.       357.1            30            23              86          2
 1987   Sept.      252.5            31            23              87          2
 1988   Jun        242.9            30            22              82          2
 1988   Jul.       240.9            29            23              84          2
 1988   Sept.      225.1            30            23              87          2
 1989   May        259.2            32            23              83          2
 1989   Jun        338.7            31            23              86          2
 1989   Aug.       275              29            22              88          2
 1990   Apr.       233.8            33            24              82          3
 1990   Jul.       293.6            29            23              90          2
 1990   Oct.       255.4            31            23              85          2
 1991   May        258.2            32            24              84          2
 1991   Jul.       306.6            29            23              90          2
 1992   Sept.      275.4            29            23              90          2
 1992   Oct.       276.3            31            23              88          2
 1993   Jul.       261              29            27              87          2
 1993   Aug.       237.7            29            23              90          2
 1993   Sept.      255.5            30            23              86          2
 1994   Sept.      236              30            23              89          2
 1995   May        334.3            31            24              81          2
 1995   Aug.       304.2            29            23              91          2
 1996   Aug.       224.7            30            23              89          2
 1996   Sept.      304.1            29            22              90          2
 1997   Apr.       261.7            32            24              70          3
 1998   May        245.4            34            25              70          3
 1998   Sept.      259.3            34            24              75          3
 2000   Jul.       220.4            29            23              73          3
 2000   Aug.       263.8            29            23              85          2
 2001   May        265              33            24              74          3
 2001   Sept.      275.2            29            22              90          2
 2002   Oct.       265              29            24              87          2
 2003   Jun        275.3            30.6          24.5            92          2
 2003   Sept.      226              30.8          22.4            92          2
 2003   Oct.       254.9            32            23.2            92          2
 2006   Sept.      250.8            30.4          22.3            86          2
 Mean              270.4346         30.36154      23.21923        85.19231    2.115385
 SD                42.68863         1.395814      0.859101        5.129838    0.322603

