<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Computer-aided Diagnosis via Hierarchical Density Based Clustering</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Tom Obry</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Louise Travé-Massuyès</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Audine Subias</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>ACTIA</institution>
          ,
          <addr-line>5 Rue Jorge Semprun, 31432 Toulouse</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>LAAS-CNRS, Université de Toulouse</institution>
          ,
          <addr-line>CNRS, INSA, Toulouse</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2016</year>
      </pub-date>
      <abstract>
        <p>When applying non-supervised clustering, the concepts discovered by the clustering algorithm hardly match business concepts. Hierarchical clustering then proves to be a useful tool to exhibit sets of clusters according to a hierarchy. Data can be analyzed in layers and the user has a full spectrum of clusterings to which he can give meaning. This paper presents a new hierarchical densitybased algorithm that advantageously works from compacted data. The algorithm is applied to the monitoring of a process benchmark, illustrating its value in identifying different types of situations, from normal to highly critical.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>In data-based diagnosis applications, it is often the case that
huge amounts of data are available but the data is not
labelled with the corresponding operating mode, normal or
faulty. Clustering algorithms, known as non-supervised
classification methods, can then be used to form clusters that
supposedly gather data corresponding to the same operating
mode.</p>
      <p>
        Clustering is a Machine Learning technique used to group
data points according to some similarity criterion. Given
a set of data points, a clustering algorithm is used to
classify each data point into a specific group. Data points that
are in the same group have similar features, while data
points in different groups have highly dissimilar features.
Among well-known clustering algorithms, we can mention
K-Means [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], PAM [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], K-Modes [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], DBSCAN [4].
      </p>
      <p>Numerous validity indexes have been proposed to
evaluate clusterings [5]. These are generally based on two
fundamental concepts :
compactness, the members of each cluster should be as
close to each other as possible. A common measure
of compactness is the variance, which should be
minimized.
separation, the clusters themselves should be widely
spaced.</p>
      <p>Nevertheless, one must admit that the concepts
discovered by even the most scored clusterings hardly match
business concepts [6] [7]. One of the reasons is that data bases
are often incomplete in the sense that they do not include the
data about all the influencial attributes. In particular,
business concepts are highly sensitive to environmental
parameters that fall outside the scope of the considered business
domain and that are not recorded, for instance stock exchange.
In addition, the clusters corresponding to business concepts
may be quite "close" in the data space and the only way to
capture them would be to guess the right number of clusters
to initialize correctly the clustering algorithm. This is
obviously quite hard. Hierarchical clustering then proves to be
a useful tool because it exhibits sets of clusters according to
a hierarchy and it modulates the number of clusters. Data
can then be analyzed in layers, with a different number of
clusters at each level, and the user has a full spectrum of
clusterings to which he can give meaning.</p>
      <p>
        Hierarchical clustering identifies the clusters present in a
dataset according to a hierarchy [
        <xref ref-type="bibr" rid="ref17">8</xref>
        ][9][
        <xref ref-type="bibr" rid="ref19">10</xref>
        ]. There are two
strategies to form clusters, the agglomerative ("bottom up")
strategy where each observation starts in its own cluster and
pairs of clusters are merged as one moves up the
hierarchy. The divise method ("top down") where all observations
start in one cluster and splits are performed recursively as
one moves down the hierarchy. The results of hierarchical
clustering are usually presented in a dendrogram. A
dendrogram is a tree diagram frequently used to illustrate the
arrangement of the clusters. In order to decide which
clusters should be combined or where a cluster should be split,
a measure of dissimilarity between sets of observations is
required. In most methods of hierarchical clustering, splits
or merges of clusters are achieved by use of an appropriate
metric like euclidean, manhattan or maximum distance.
      </p>
      <p>
        Few algorithms propose a density-based hierarchical
clustering approach like -unchaining single linkage [
        <xref ref-type="bibr" rid="ref20">11</xref>
        ]
or HDBSCAN [
        <xref ref-type="bibr" rid="ref21">12</xref>
        ]. In this paper, we present a new
hierarchical clustering algorithm, named HDyclee, based on
density that advantageously works from compacted data in
the form hypercubes. This contribution is an extension of
the clustering algorithm DyClee [
        <xref ref-type="bibr" rid="ref22">13</xref>
        ], [
        <xref ref-type="bibr" rid="ref23">14</xref>
        ], [
        <xref ref-type="bibr" rid="ref8">15</xref>
        ]. The
purpose of this work is to generate a flat partition of clusters
with a hypercube’s density level higher or equal to a
threshold and to be able to visualize all existant clusters in the
dataset with a dendogram by varying the density of the
hypercubes present in a group. The value of the algorithm in
a diagnosis context is illustrated with the monitoring of a
Continuous Stirred Tank Heater benchmark, for which it
allows the user to identify different types of situations, from
normal to highly critical.
      </p>
      <p>This paper is organized as follows. In section 2 the
DyClee algorithm is presented. In section 3 the concepts and
principles underlying Dyclee, like the definition of micro
clusters C, dynamic clusters and the KD-Tree structure,
are explained. In the section 4, the hierarchical clustering
based-density algorithm is presented. Tests and results are
detailed in section 5. The conclusion and perspective for
future work end this paper in section 6.1
2</p>
    </sec>
    <sec id="sec-2">
      <title>Dyclee: a dynamic clustering algorithm</title>
      <p>
        DyClee is a dynamic clustering algorithm which is able to
deal with large amounts of data arriving at fast rates by
adopting a two stages strategy similar to [
        <xref ref-type="bibr" rid="ref9">16</xref>
        ], [
        <xref ref-type="bibr" rid="ref10">17</xref>
        ], [
        <xref ref-type="bibr" rid="ref11">18</xref>
        ].
The first stage is a fast scale distance-based algorithm that
collects, pre-processes and compresses data samples to form
so-called micro-clusters ( -clusters). It operates at the rate
of the data stream and creates -clusters putting together
data samples that are close, in the sense of a given distance,
to each other. -clusters are stored in the form of
summarized representations including statistical and temporal
information.
      </p>
      <p>
        The second stage is a slower scale density-based algorithm
that groups the -clusters into actual clusters that can be
interpreted semantically as classes. It takes place once each
tslow seconds and analyses the distribution of -clusters.
The density of a -cluster is considered as low, medium
or high and is used to create the final clusters by a density
based approach, i.e. dense -clusters that are close enough
(connected) are said to belong to the same cluster.
Similarly to [
        <xref ref-type="bibr" rid="ref12">19</xref>
        ], a cluster is defined as the group of connected
-clusters where every inside -cluster presents high
density and every outside -cluster exhibits either medium or
low density. The above dense -cluster structure allows the
algorithm to create clusters of non convex shapes even in
high dimensional spaces and it has proved outliers rejection
capabilities in evolving environments [
        <xref ref-type="bibr" rid="ref11">18</xref>
        ]. In addition,
clusters of similar densities can form clusters of any shape
and any size.
      </p>
      <p>In DyClee, both stages work on-line, but operate at different
time scales. This multi-density feature allows the detection
of novelty behavior in its early stages when only a few
objects giving evidence of this evolution are present. Figure 1
gives the global description of DyClee.
3</p>
    </sec>
    <sec id="sec-3">
      <title>Main principles of DyClee</title>
      <p>All the principles explained in this section are from the core
algorithm.
3.1</p>
      <sec id="sec-3-1">
        <title>Notion of micro-clusters C</title>
        <p>Considering a d-dimensional object X = [x1, ..., xd] marked
with a timestamp tX and qualified by d features, a -cluster
1This work is performed in the framework of a CIFRE Project
supported by ACTIA.
gathers a group of data samples close in all dimensions and
whose information is summarized in a characteristic
feature vector (CF). For a -cluster Ck, CF has the following
form:</p>
        <p>CFk = (nk; LSk; SSk; tlk; tsk; Dk; Classk) :
(1)
where nk 2 &lt; is the number of objects in the -cluster k,
LSk 2 &lt;d is the vector containing the linear sum of each
feature over the nk objects, SSk 2 &lt;d is the square sum of
features over the nk objects, tlk 2 &lt; is the time when the
last object was assigned to that -cluster, tsk 2 &lt; is the time
when the -cluster was created, Dk is the -cluster density
and Classk is the -cluster label if known. Using LSk, SSk
and nk the variance of the group of objects assigned to the
-cluster k can be calculated.</p>
        <p>The -cluster is shaped as a d-dimensional box since the
L1-norm is used as distance measure. The distance between
an object X = [x1, ..., xd]T and a -cluster Ck, named
as dis(X; Ck), is calculated as the sum of the distances
between the Ck vector center ck = [c1k; : : : ; cdk]T and the
object value for each feature as shown in equation (2):
d
dis(X; Ck) = L1(X; ck) = X xi
cik :</p>
        <p>(2)
i=1</p>
        <p>The data is normalized according to the data context,
i.e. the feature range [mini; maxi] of each feature i, i =
1; :::; d. If no context is available in advance, it may be
established online. The size of the hyperboxes Si along each
dimension i is set as a fraction of the corresponding feature
range. The hyperbox size per feature is hence found
according to (3), where i is a user constant parameter in the
interval (0; 1), establishing the fraction:</p>
        <p>Si =
ijmaxi
minij;
8i = 1; : : : ; d:
(3)
Whenever an object X arrives, the algorithm searches for
the closest -cluster. Once found, a maximal distance
criterion is evaluated to decide whether or not the object fits
inside the -cluster hyper-box. If the fitting is sufficient the
-cluster feature vector is updated using the object
information; if not, a new -cluster is created with the object
information using its time-stamp as cluster time of creation.</p>
        <p>The density of a -cluster Ck is calculated using the
current number of objects nk and the current hyper-volume
of the bounding box Vk = Qid=1 Si, as shown in (4):
Dk =
nk :
Vk
(4)
Let Ck and Ck be two -clusters, then Ck and Ck
are said to be directly connected if their hyper-boxes overlap
in all but ' dimensions, where ' is an integer. The
parameter ', fixed by the user, establishes the feature selectivity.</p>
        <p>
          A -cluster Ck1 is said to be connected to Ckn if
there exists a chain of -clusters f Ck1 ; Ck2 ; ; Ckn g
such that Cki is directly connected to Cki+1 for i =
1; 2; ; n 1. A set of connected -clusters is said to be
a group.
Dyclee is a dynamic clustering algorithm, which means that
not only the parameters but the classifier structure changes
according to input data in an automatic way. It achieves
several cluster operations like creation, elimination, drift,
merge, and split. For instance, a cluster is splitted into two
or more clusters if, with the arrival of new data, high density
regions can be distinguished inside the cluster. In that
scenario, dense regions are separated by low density regions,
making the cluster no longer homogeneous. Even more, the
cluster center could be situated in a low density region,
loosing its interpretability as prototype of the elements in the
cluster. Splitting the cluster creates smaller homogeneous
clusters, completely representative of the belonging
samples. An illustrative example of this phenomenon is shown
in Figure 2.
To find groups of connected -clusters, a KD-Tree [
          <xref ref-type="bibr" rid="ref13">20</xref>
          ] is
used. A KD-Tree is a binary tree, each of whose nodes
represents an axis-aligned hyperrectangle. Each node specifies
an axis and splits the set of points based on whether their
coordinate along that axis is greater than or less than a
particular value. The tree is queried to return only neighbors
who are at a maximum distance from a point. A Cj is
the neighbor of the Ck if the condition in equation (5) is
respected :
        </p>
        <p>L1 = mi=da1x jxik
i
cj j
r:
^
(5)
where d is the number of dimension, cji the center of the
Cj at the dimension i and r the maximal distance from the
Ck. In this context, r is set to i.
4</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>A new hierarchical clustering density-based algorithm</title>
      <p>In this section, a new hierarchical clustering density-based
algorithm is presented. Inputs are the connections between
-clusters from the KD-Tree and the output is a flat partition
of clusters where all -clusters that are in clusters have a
minimum density level guaranteed.
4.1</p>
      <sec id="sec-4-1">
        <title>Representation of the -clusters connections</title>
        <p>The representation of the connections between all -clusters
can be visualized by a weighted Graph. A weighted Graph
G = (N ; E ; W) is a triplet where N is a set of nodes. A
node ni corresponds to the -cluster Ci. E is the set of
edges with eij the edge between the node ni and nj , which
are unordered pairs of elements of G. Finally, W is a set of
weights on E with wij defined in the equation 6.
wi;j = min(Di; Dj ):
(6)
where the density of a -cluster Di is defined in the
equation 4.</p>
        <p>An edge eij between -clusters Ci and Cj means those
are directly connected in the sense defined in the section
3.1. If two -clusters are not directly connected, there is no
edge between them which leads to a Graph that is not full.
The Graph of -cluster’s connections is built according to
the algorithm 1. Neighbors are searched for each -cluster
with respect to the equation 5 (lines 3 to 6). The function
Search_neighbors() is detailed in the algorithm 2. For each</p>
        <p>Ck, the distance L1 defined in equation 5 is applied where
r = i, xik the value of Ck at the ith dimension and cij the
center of the Cj at the dimension i. The variable
Neighbors_of_k contains all the neighbors of Ck. An edge ekj
is added to the Graph G for each neighbor Cj of the
cluster studied Ck and the weight wkj is calculated with
the equation defined above (lines 7 to 11).</p>
        <p>Algorithm 1 Build the Graph of -cluster’s connections
Require: KD-Tree
1: G = Graph()
2: Connection = ()
3: for k = 1 to Number of -clusters do
4: N = []
5: N = Search_neighbors(k)
6: Connection[k] = N
7: for j = 1 to Nbre of neighbors of k do
8: Weight = min(Dk, Dj )
9: G.add_edges(k, j, Weight)
10: end for
11: end for
Algorithm 2 Research of a -cluster’s neighbors
Require: KD-Tree, -cluster Ck
1: for j = 1 to Number of -clusters do
2: Neighbors_of_k = []
3: if mi=da1x jxik cij j r^then</p>
        <sec id="sec-4-1-1">
          <title>4: Neighbors_of_k.add(j)</title>
          <p>5: end if
6: end for
7: return Neighbors_of_k
Let us consider five -clusters C1, C2, C3, C4 and
C5 with their densities D1 = 14, D2 = 12, D3 = 3, D4 = 9,
D5 = 13. Figure 3 shows those five -clusters.</p>
          <p>The search of neighbors begins with the densest -cluster.
In this example, the research starts with the C1 as shown
in Figure 4. All -clusters that satisfy the equation 5 are
neighbors of C1.</p>
          <p>Once neighbors of C1 are found, the neighbors of the
other -clusters are searched. The result is shown in table
1:
-clusters</p>
          <p>Neighbors
C1
C2
C3
C4
C5</p>
          <p>C2
C1, C3
C2, C4
C3, C5</p>
          <p>C4</p>
          <p>
            The weights are calculated for every edge following the
equation 6 :
8w12 = min(D1; D2) = min(14; 12) = 12;
&gt;
&gt;&lt;w23 = min(D2; D3) = min(12; 3) = 3;
&gt;w34 = min(D3; D4) = min(3; 9) = 3
&gt;
:w45 = min(D4; D5) = min(9; 13) = 9:
(7)
The Figure 5 shows the weighted Graph which represents
the connections between all -clusters in the dataset.
The objective of the algorithm proposed in this paper is to
represent and visualize all the possible clusters at different
levels of density in a dataset. In contrast to most known
hierarchical clustering algorithm, links that relies level
ln and ln + 1 in our new algorithm’s dendogram are not
based on the distance between objects but on their densities.
Furthermore, our proposal is not based on the objects
but on -clusters that contains the objects to decrease the
complexity of calculation. At the root of the tree, there is
one cluster composed by all -clusters. Each cut in the tree
corresponds to a density threshold, i.e each cluster formed
below this cut level is composed by -clusters that have a
density higher to the density threshold. At the bottom of the
tree, there is one cluster for each -cluster. So root’s density
level is 0 and the last cut on the tree is max(w 2 W).
Like [
            <xref ref-type="bibr" rid="ref21">12</xref>
            ], a variable " is user parameter evolving in the
interval " 2 [0; max(w 2 W)]. For each iteration of ", all
the weights w 2 W are checked. if wij "^, the edge eij
is removed. The vertical axis of the dendogram is 1=D to
keep an ascending direction.
          </p>
          <p>Figure 6 represents the case when " = 3. We can
observe two clusters composed by -clusters that have their
densities strictly higher to three and one -cluster alone.
This method allows to detect a split of clusters case and to
isolate the least denses -clusters as early as possible.
Once all values in the interval of " studied and all edges
removed, the dendogram can be generated (see Figure 7).</p>
          <p>At the density threshold D = 3, two clusters composed
by C1, C2 and C4, C5 are found and the -cluster</p>
          <p>C3 is alone. One cluster composed by C1 and C2 is
found and C4, C5. Then for D=9, one cluster composed
by C3. Finally, above density D = 12, all -clusters are
alone.
The dendogram generated, a partition of clusters
corresponding to a specific density level can be extracted from the
Graph of -cluster’s connections with a density threshold
according to the algorithm 2. To find clusters, the edges that
have a weight strictly lower than the density threshold are
removed (lines 1 to 7). The remaining edges in the Graph
have their weights higher or equal to the density threshold
hence -clusters forming the clusters are guaranteed to have
their densities higher or equal to the threshold. Remaining
groups are searched in the Graph. If the size of a group
is equal to 1, i.e the group have not connection with other
-clusters, it is considered as noise. Else, the group is
recognized as a cluster (lines 10 to 16).</p>
          <p>Once the density threshold known, all the edges that have
their weights strictly lower are removed to left only groups
of -cluster to form final clusters. Figure 11 shows the final
clusters. The first includes C1 and C2, the second
includes C4, C5 and the last contains C6, C7. C2 and</p>
          <p>C8 are considered as noise because they do not have any
connections with the other -clusters.</p>
          <p>An other example is shown in Figure 8 to illustrate this
part. Let us consider -clusters Ci, i = 1; :::; 8 with their
respective densities D1 = 12, D2 = 3, D3 = 10, D4 = 11, D5
= 13, D6 = 9, D7 = 10 and D8 = 4.</p>
          <p>In this example, the density threshold is fixed to " = 8.
Every group of -clusters that is below the density
threshold is considered as a cluster. If a -cluster does not have
connection to the other -clusters, then it is considered as
noise. The Figure 10 illustrates how to visualize the clusters
that have a density strictly above the threshold.</p>
          <p>Process inputs are set-points for the cold water, hot
water and steam valves. Process outputs are hot and cold
water flow, tank level and temperature. Process inputs and
outputs represent electronic signals in the range 4-20 mA.
The test is done using three output variables: cold water
flow CWflow, tank level T anklevel, and temperature of the
water in the tank T anktemperature, in the operation mode
OP 1. In this mode, these variables are regulated at the
values provided in table 2. The process undergoes several
faults and several repairs that are reported in table 3.</p>
        </sec>
        <sec id="sec-4-1-2">
          <title>Variable</title>
          <p>CWflow(mA)</p>
          <p>T anklevel(mA)
T anktemperature(mA)</p>
          <p>The measurements of CWflow, T anklevel, and
T anktemperature are shown in Figure 13 and their
recorded values across time constitute the data set for our
hierarchical clustering experiment. Sudden changes in the
value of regulated variables are indicative of the occurence
of some fault or of some fault being fixed. The dataset was
generated by simulation.</p>
          <p>For this experiment, the radius of research for a
cluster’s neighbors r is set to 0.06. The parameter ' that
defines the number of dimensions that must overlap so that
two -clusters are considered directly connected is set to
0. That means two -clusters Ck and Ck are directly
connected if their hyperboxes overlap in all dimensions.
The parameter " is set to 0 for results shown in Figure 16 that
correspond to the root of the dendogram. For the following
graphs, the x-axis is the flow CWflow normalized and the
y-axis is the tank temperature T anktemperature normalized.
The tank level T anklevel is not plotted. Figure 14 shows the
graph of connections between -clusters when " = 0. Each
red square represents a -cluster. Micro-clusters that are
not connected to the others are considered as noise. The
cluster C293 is connected to C222 and C232, meaning
there are directly connected. C222 is connected to C34
and C232 is connected with C41 and so on. This chain
of -clusters forms a cluster. Figure 15 shows a generalized
dendogram representing the hierarchy of behaviors found in
the dataset.</p>
          <p>HDyClee detects the 6 main behaviors as shown the
Figure 16. The biggest cluster (green cluster) represents the
nominal behavior, the blue cluster represents fault s2, the
pink cluster (bottom of the Figure) represents the fault s1,
the grey cluster models the fault l3, the brown cluster shows
the event l1, and the purple cluster represents l1 and l2
present simultaneously. Black points represent -clusters
that have no connection with other -clusters. All the
objects inside these -clusters are considered as noise.</p>
          <p>To visualize only the nominal behavior, the dendogram
must be cut at the density threshold " = 7 because above
this value, the other clusters have no -clusters that are
connected to each other. This is shown on Figure 19. Figure 18
illustrates the graph of -cluster connections after deletion
of the edges with a weight wij 7^. Some -clusters which
were part of the biggest cluster are now considered as noise.
Indeed, edges that connected them to other -clusters were
less than the density threshold.</p>
          <p>It is possible to visualize the most frequent behaviors of
the system, in our case the normal behavior and fault l3.
This is reproted in Figure 17. For this purpose, the density
threshold is set to " = 6 by using the dendogram. At this
density, the clusters corresponding to other behaviors are
considered as noise because their maximal densities are less
than 6.
The work presented in this paper proposes a new
hierarchical density-based algorithm named HDyclee. The purpose
of this algorithm is to extract a hierarchy of clusters that are
guaranteed to have a level of density at each layer. Branches
in the dendogram do not represent distance between objects
but minimum density difference. This approach allows one
to identify clusters with the poorest densities and then walk
up the hierarchy for higher densities. The algorithm is
detailed and tested on a well known monitoring benchmark.
HDyClee is able to detect all the behaviors of the process
and the user can explore more or less frequent behaviors by
cutting the dendogram at different densities.</p>
          <p>
            Next step is to develop experimentations in order to
compare this new algorithm with other density-based
algorithms. Then the comparative study will include
hierarchical and distance-based clustering methods [
            <xref ref-type="bibr" rid="ref15">22</xref>
            ], [
            <xref ref-type="bibr" rid="ref16">23</xref>
            ].
Several perspectives have been identified for HDyClee, which
follow from DyClee properties. In particular, DyClee has
a forgetting function that allows to forget -clusters which
do not receive any data or are not significant (not denses
clusters). This function will be included in HDyClee, which
will allow us to produce a dynamic dendogram and then to
visualize the evolution of the different behaviors of a
system.
          </p>
          <p>A density-sensitive
hierarXiv preprint arXiv:</p>
        </sec>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>James</surname>
            <given-names>MacQueen</given-names>
          </string-name>
          et al.
          <article-title>Some methods for classification and analysis of multivariate observations</article-title>
          .
          <source>In Proceedings of 5th Berkeley Symposium on Mathematical Statistics and Probability</source>
          , volume
          <volume>1</volume>
          , pages
          <fpage>281</fpage>
          -
          <lpage>297</lpage>
          . Oakland, CA, USA,
          <year>1967</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Leonard</given-names>
            <surname>Kaufman</surname>
          </string-name>
          and
          <string-name>
            <given-names>Peter</given-names>
            <surname>Rousseeuw</surname>
          </string-name>
          .
          <article-title>Clustering by means of medoids</article-title>
          . North-Holland,
          <year>1987</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Zhexue</given-names>
            <surname>Huang</surname>
          </string-name>
          .
          <article-title>Extensions to the k-means algorithm for clustering large data sets with categorical values</article-title>
          .
          <source>Data mining and knowledge discovery</source>
          ,
          <volume>2</volume>
          (
          <issue>3</issue>
          ):
          <fpage>298</fpage>
          -
          <lpage>304</lpage>
          ,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <given-names>Martin</given-names>
            <surname>Ester</surname>
          </string-name>
          ,
          <string-name>
            <surname>Hans-Peter Kriegel</surname>
            , Jörg Sander,
            <given-names>Xiaowei</given-names>
          </string-name>
          <string-name>
            <surname>Xu</surname>
          </string-name>
          , et al.
          <article-title>A density-based algorithm for discovering clusters in large spatial databases with noise</article-title>
          .
          <source>In Knowledge Discovery Data Mining</source>
          , volume
          <volume>96</volume>
          , pages
          <fpage>226</fpage>
          -
          <lpage>231</lpage>
          ,
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>Maria</given-names>
            <surname>Halkidi</surname>
          </string-name>
          , Yannis Batistakis, and
          <string-name>
            <given-names>Michalis</given-names>
            <surname>Vazirgiannis</surname>
          </string-name>
          .
          <article-title>On clustering validation techniques</article-title>
          .
          <source>Journal of intelligent information systems</source>
          ,
          <volume>17</volume>
          (
          <issue>2-3</issue>
          ):
          <fpage>107</fpage>
          -
          <lpage>145</lpage>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <given-names>Olatz</given-names>
            <surname>Arbelaitz</surname>
          </string-name>
          , Ibai Gurrutxaga, Javier Muguerza,
          <string-name>
            <surname>Jesùs M Pérez</surname>
            ,
            <given-names>and Iñigo</given-names>
          </string-name>
          <string-name>
            <surname>Perona</surname>
          </string-name>
          .
          <article-title>An extensive comparative study of cluster validity indices</article-title>
          .
          <source>Pattern Recognition</source>
          ,
          <volume>46</volume>
          (
          <issue>1</issue>
          ):
          <fpage>243</fpage>
          -
          <lpage>256</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <given-names>Yanchi</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Zhongmou</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Hui</given-names>
            <surname>Xiong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Xuedong</given-names>
            <surname>Gao</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Junjie</given-names>
            <surname>Wu</surname>
          </string-name>
          .
          <article-title>Understanding of internal clustering validation measures</article-title>
          .
          <source>In Data Mining (ICDM)</source>
          ,
          <year>2010</year>
          IEEE 10th International Conference on, pages
          <fpage>911</fpage>
          -
          <lpage>916</lpage>
          . IEEE,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [15]
          <string-name>
            <surname>Nathalie</surname>
            <given-names>Barbosa</given-names>
          </string-name>
          ,
          <article-title>Louise Travé-Massuyès, and Victor Hugo Grisales. A data-based dynamic classification technique: A two-stage density approach</article-title>
          .
          <source>In SAFEPROCESS 2015, Proceedings of the 9th IFAC Symposium on Fault Detection, Supervision and Safety for Technical Processes</source>
          , pages
          <fpage>1224</fpage>
          -
          <lpage>1231</lpage>
          . IFAC,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [16]
          <string-name>
            <surname>Charu</surname>
            <given-names>C Aggarwal</given-names>
          </string-name>
          , Jiawei Han,
          <string-name>
            <given-names>Jianyong</given-names>
            <surname>Wang</surname>
          </string-name>
          , and
          <string-name>
            <surname>Philip S Yu</surname>
          </string-name>
          .
          <article-title>A framework for clustering evolving data streams</article-title>
          .
          <source>In Proceedings of the 29th international conference on Very large data bases-</source>
          Volume
          <volume>29</volume>
          , pages
          <fpage>81</fpage>
          -
          <lpage>92</lpage>
          . VLDB Endowment,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Philipp</surname>
            <given-names>Kranen</given-names>
          </string-name>
          , Ira Assent, Corinna Baldauf, and
          <string-name>
            <given-names>Thomas</given-names>
            <surname>Seidl</surname>
          </string-name>
          .
          <article-title>The ClusTree: indexing micro-clusters for any stream mining</article-title>
          .
          <source>Knowledge and information systems</source>
          ,
          <volume>29</volume>
          (
          <issue>2</issue>
          ):
          <fpage>249</fpage>
          -
          <lpage>272</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [18]
          <string-name>
            <surname>Feng</surname>
            <given-names>Cao</given-names>
          </string-name>
          , Martin Ester, Weining Qian, and
          <string-name>
            <given-names>Aoying</given-names>
            <surname>Zhou</surname>
          </string-name>
          .
          <article-title>Density-based clustering over an evolving data stream with noise</article-title>
          .
          <source>In Proceedings of the 2006 SIAM international conference on data mining</source>
          , pages
          <fpage>328</fpage>
          -
          <lpage>339</lpage>
          . SIAM,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Yixin</given-names>
            <surname>Chen</surname>
          </string-name>
          and
          <string-name>
            <given-names>Li</given-names>
            <surname>Tu</surname>
          </string-name>
          .
          <article-title>Density-based clustering for real-time stream data</article-title>
          .
          <source>In Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data mining</source>
          , pages
          <fpage>133</fpage>
          -
          <lpage>142</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Songrit</given-names>
            <surname>Maneewongvatana and David M Mount.</surname>
          </string-name>
          <article-title>It's okay to be skinny, if your friends are fat</article-title>
          .
          <source>In Center for Geometric Computing 4th Annual Workshop on Computational Geometry</source>
          , volume
          <volume>2</volume>
          , pages
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          ,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [21]
          <string-name>
            <surname>Nina</surname>
            <given-names>F Thornhill</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sachin C Patwardhan</surname>
          </string-name>
          , and
          <string-name>
            <surname>Sirish L Shah</surname>
          </string-name>
          .
          <article-title>A continuous stirred tank heater simulation model with applications</article-title>
          .
          <source>Journal of Process Control</source>
          ,
          <volume>18</volume>
          (
          <issue>3-4</issue>
          ):
          <fpage>347</fpage>
          -
          <lpage>360</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>Adrián</given-names>
            <surname>Rodríguez</surname>
          </string-name>
          <string-name>
            <given-names>Ramos</given-names>
            , José Manuel Bernal de Lázaro,
            <surname>Antônio J da Silva</surname>
          </string-name>
          <string-name>
            <surname>Neto</surname>
          </string-name>
          , Carlos Cruz Corona, José Luís Verdegay, and
          <article-title>Orestes LlanesSantiago. An approach to fault diagnosis using fuzzy clustering techniques</article-title>
          .
          <source>In Advances in Fuzzy Logic and Technology</source>
          <year>2017</year>
          , pages
          <fpage>232</fpage>
          -
          <lpage>243</lpage>
          . Springer,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [23]
          <string-name>
            <surname>Yaguo</surname>
            <given-names>Lei</given-names>
          </string-name>
          , Zhengjia He,
          <string-name>
            <surname>Yanyang Zi</surname>
            , and
            <given-names>Xuefeng</given-names>
          </string-name>
          <string-name>
            <surname>Chen</surname>
          </string-name>
          .
          <article-title>New clustering algorithm-based fault diagnosis using compensation distance evaluation technique</article-title>
          .
          <source>Mechanical Systems and Signal Processing</source>
          ,
          <volume>22</volume>
          (
          <issue>2</issue>
          ):
          <fpage>419</fpage>
          -
          <lpage>435</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Stephen</surname>
            <given-names>C</given-names>
          </string-name>
          <string-name>
            <surname>Johnson</surname>
          </string-name>
          .
          <article-title>Hierarchical clustering schemes</article-title>
          .
          <source>Psychometrika</source>
          ,
          <volume>32</volume>
          (
          <issue>3</issue>
          ):
          <fpage>241</fpage>
          -
          <lpage>254</lpage>
          ,
          <year>1967</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <given-names>Robin</given-names>
            <surname>Sibson</surname>
          </string-name>
          .
          <article-title>Slink: an optimally efficient algorithm for the single-link cluster method</article-title>
          .
          <source>The computer journal</source>
          ,
          <volume>16</volume>
          (
          <issue>1</issue>
          ):
          <fpage>30</fpage>
          -
          <lpage>34</lpage>
          ,
          <year>1973</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Daniel</given-names>
            <surname>Defays</surname>
          </string-name>
          .
          <article-title>An efficient algorithm for a complete link method</article-title>
          .
          <source>The computer journal</source>
          ,
          <volume>20</volume>
          (
          <issue>4</issue>
          ):
          <fpage>364</fpage>
          -
          <lpage>366</lpage>
          ,
          <year>1977</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Álvaro</surname>
          </string-name>
          Martínez-Pérez.
          <source>archical clustering method. 1210.6292</source>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Ricardo</surname>
            <given-names>JGB Campello</given-names>
          </string-name>
          , Davoud Moulavi, and
          <string-name>
            <given-names>Jörg</given-names>
            <surname>Sander</surname>
          </string-name>
          .
          <article-title>Density-based clustering based on hierarchical density estimates</article-title>
          .
          <source>In Pacific-Asia conference on Knowledge Discovery and Data mining</source>
          , pages
          <fpage>160</fpage>
          -
          <lpage>172</lpage>
          . Springer,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Nathalie</surname>
            <given-names>Barbosa</given-names>
          </string-name>
          ,
          <article-title>Louise Travé-Massuyès, and Victor Hugo Grisales. A novel algorithm for dynamic clustering: Properties and performance</article-title>
          .
          <source>In Machine Learning and Applications (ICMLA)</source>
          ,
          <year>2016</year>
          15th IEEE International Conference on, pages
          <fpage>565</fpage>
          -
          <lpage>570</lpage>
          . IEEE,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Nathalie</given-names>
            <surname>Barbosa</surname>
          </string-name>
          .
          <article-title>A data-based approach for dynamic classification of functional scenarios oriented</article-title>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>