<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Online Hybrid Probabilistic-Fuzzy Clustering in Medical Data Mining Tasks</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Yevgeniy Bodyanskiy</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Control systems research laboratory, Kharkiv National University of Radio Electronics</institution>
          ,
          <addr-line>Kharkiv</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <fpage>0000</fpage>
      <lpage>0001</lpage>
      <abstract>
        <p>In this paper, an online recurrent fuzzy clustering procedure is introduced that allows forming hyperellipsoidal clusters with an arbitrary orientation of the axes. The proposed clustering system is a generalization of a number of known algorithms; it is intended to solve tasks within the general problems of Medical Data Mining, when information is fed for processing sequentially in online mode.</p>
      </abstract>
      <kwd-group>
        <kwd>Medical Data Mining</kwd>
        <kwd>Big Data</kwd>
        <kwd>Computational Intelligence</kwd>
        <kwd>Fuzzy Clustering</kwd>
        <kwd>EM-Algorithm</kwd>
        <kwd>Kohonen's Self-Learning</kwd>
        <kwd>Soft Clustering</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>The clustering task has a special place in the general problem of Data Mining [1,2], since its solution is implemented in self-learning mode (unsupervised learning), when the researcher does not have a marked-up training dataset in advance. It is clear that here the level of a priori uncertainty is much higher compared to other problems of data analysis, which has led to the emergence of a variety of approaches, methods, and algorithms for solving this problem [3-6], differing both in initial premises and in mathematical procedures, and therefore often leading to different results in the end. This distinguishes the clustering task from other traditional Data Mining tasks such as classification, forecasting, identification of hidden dependencies contained in data, etc.</p>
      <p>The clustering task is complicated if the data for processing are received sequentially in online mode, forming a data stream [7], such that the frequency of data arrival makes it impossible to process the accumulated information in the time between two neighboring observations. The situation is even more complex if the amount of data is so large that processing it as a single array is impossible (the Big Data concept [8]). Hence, the idea of online clustering of a data flow seems very attractive, especially in Medical Data Mining tasks connected with the mass examination of patients.</p>
      <p>Artificial neural networks, fuzzy reasoning systems, and hybrid neuro-fuzzy systems [3, 9, 10] can be successfully used to solve the problems of processing data in online mode, where performance issues, especially of the learning processes, come to the fore. Obviously, multi-epoch learning in this situation is ineffective. Incremental learning is more promising, when the parameters of the clustering system are refined sequentially, synchronously with data arrival. Here, first of all, it is necessary to note the clustering Kohonen neural networks [9] – self-organizing maps (SOM) – which have shown their effectiveness in solving many real-world problems. It should be remembered, however, that SOMs solve clustering problems under the assumption of convex (linearly separable) non-overlapping classes.</p>
      <p>In real-world problems, data usually form overlapping classes, wherein each observation simultaneously belongs to two or more classes. Clearly, in this case, in the process of clustering it is necessary to estimate both the possible classes and the probability-membership levels of each vector-pattern to each of the possible classes. Obviously, in this situation the traditional Kohonen neural network is ineffective, and the so-called soft computing methods come to the fore, among which the most popular is the Expectation-Maximization approach (EM-algorithm) [9-15], based on probabilistic assumptions. The powerful Fuzzy C-Means (FCM) method [10-12] proposed by J. C. Bezdek should also be noted. Both of these methods were combined in the hybrid approach proposed in [15].</p>
      <p>However, it must be noted that EM and FCM are algorithms that process information in batch form in multi-epoch mode, which makes it impossible to use them in Data Stream Mining tasks. In this regard, recurrent adaptive FCM versions operating in online sequential mode were introduced in [16, 17]. The disadvantages of this approach include the use of the conventional Euclidean metric, which allows the formation of clusters of only spherical shape. Obviously, when the classes have a complex nonconvex form, there may be too many such spheroid clusters, which reduces the speed of the processing.</p>
      <p>In this regard, it is advisable to develop recurrent online algorithms of sequential fuzzy clustering that allow forming data clusters of a more complex form than hyperspheres and, in particular, hyperellipsoidal ones arbitrarily oriented in the feature space of overlapping classes.</p>
      <p>2 Batch Clustering Using Probabilistic and Fuzzy Approaches</p>
      <p>Let the initial data array be given in the form of a dataset of $n$-dimensional observation vectors $x(k) = (x_1(k), \dots, x_i(k), \dots, x_n(k))^T \in \mathbb{R}^n$, where $k = 1, 2, \dots, N$ is the number of the observation in this dataset. As a result of clustering, this dataset should be divided into $m$ ($1 &lt; m &lt; N$) overlapping ellipsoidal classes.</p>
      <p>In the framework of the statistical approach (EM-algorithm), it is assumed that the data are of a random nature and have a Gaussian density distribution:</p>
      <p>$$p_j(x) = \left( (2\pi)^{\frac{n}{2}} \sqrt{\det \Sigma_j} \right)^{-1} \exp\left( -\frac{1}{2}(x - w_j)^T \Sigma_j^{-1} (x - w_j) \right) \qquad (1)$$</p>
      <p>where $w_j$ is the $n$-dimensional centroid vector of the $j$-th cluster, and $\Sigma_j$ is the correlation matrix of the $j$-th cluster of dimension $(n \times n)$:</p>
      <p>$$\Sigma_j = \frac{1}{N} \sum_{k=1}^{N} (x(k) - w_j)(x(k) - w_j)^T. \qquad (2)$$</p>
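As an illustration, the Gaussian cluster density above can be sketched in NumPy (a minimal sketch; the function name is illustrative, not from the paper):

```python
import numpy as np

def gaussian_cluster_density(x, w_j, sigma_j):
    """Gaussian density of the j-th cluster with centroid w_j and
    correlation matrix sigma_j, as in the batch EM formulation."""
    n = x.shape[0]
    diff = x - w_j
    # squared Mahalanobis distance (x - w_j)^T Sigma_j^{-1} (x - w_j)
    d2 = diff @ np.linalg.solve(sigma_j, diff)
    norm = (2.0 * np.pi) ** (n / 2.0) * np.sqrt(np.linalg.det(sigma_j))
    return np.exp(-0.5 * d2) / norm
```

At the centroid itself the density reduces to the normalizing constant, which gives a quick sanity check of the implementation.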
      <p>Obviously, using the Mahalanobis metric instead of the traditional Euclidean one in the FCM algorithm allows restoring classes of hyperellipsoidal form with an arbitrary orientation of the axes in the initial feature space.</p>
      <p>Basing on (1), it is easy to write down the joint distribution density of the observations of the initial dataset in the form</p>
      <p>$$p(x) = \sum_{j=1}^{m} p_j p_j(x) = \sum_{j=1}^{m} p_j \left( (2\pi)^{\frac{n}{2}} \sqrt{\det \Sigma_j} \right)^{-1} \exp\left( -\frac{1}{2} d_M^2(x, w_j) \right) \qquad (3)$$</p>
      <p>where $p_j$ are a priori probabilities that satisfy the standard conditions</p>
      <p>$$\sum_{j=1}^{m} p_j = 1. \qquad (4)$$</p>
      <p>It is easy to see that equation (4) coincides with the constraint on the sum of the memberships in the Fuzzy C-Means algorithm</p>
      <p>$$\sum_{j=1}^{m} \mu_j(k) = 1 \qquad (5)$$</p>
      <p>where $0 &lt; \mu_j(k) &lt; 1$ is the membership level of the $k$-th observation in the $j$-th cluster. Due to the constraints (5), procedures of the FCM type are called fuzzy probabilistic algorithms [10].</p>
      <p>Let us introduce the Mahalanobis distance between the centroids $w_j$ and the vector-patterns $x(k)$ in the form</p>
      <p>$$d_M^2(x(k), w_j) = (x(k) - w_j)^T \Sigma_j^{-1} (x(k) - w_j). \qquad (6)$$</p>
      <p>In the process of clustering using the EM-approach, the maximization of the log-likelihood function</p>
      <p>$$E(x(k), w_j, \Sigma_j, p_j) = \sum_{k=1}^{N} \log\left( \sum_{j=1}^{m} p_j p_j(x(k)) \right) \qquad (7)$$</p>
      <p>is implemented, wherein the final result can be written in the form [13]</p>
      <p>$$p_j(x(k)) = \frac{\exp\left( -\frac{1}{2} d_M^2(x(k), w_j) \right)}{\sum_{l=1}^{m} \exp\left( -\frac{1}{2} d_M^2(x(k), w_l) \right)}, \qquad w_j = \frac{\sum_{k=1}^{N} p_j(x(k)) x(k)}{\sum_{k=1}^{N} p_j(x(k))}. \qquad (8)$$</p>
      <p>It should be noted that when $p_j = m^{-1}$ and the matrices $\Sigma_j^{-1}$ are unity matrices, the EM algorithm coincides with the standard K-means clustering procedure.</p>
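The EM-style responsibilities and centroid re-estimation described above can be sketched as follows (a minimal NumPy illustration; function names are not from the paper):

```python
import numpy as np

def em_responsibilities(X, W, Sigmas):
    """E-step sketch: p_j(x(k)) is proportional to exp(-0.5 * d_M^2(x(k), w_j)),
    normalized over the m clusters."""
    N, m = X.shape[0], W.shape[0]
    D2 = np.empty((N, m))
    for j in range(m):
        diff = X - W[j]                                  # (N, n)
        D2[:, j] = np.sum(diff @ np.linalg.inv(Sigmas[j]) * diff, axis=1)
    P = np.exp(-0.5 * D2)
    P /= P.sum(axis=1, keepdims=True)                    # rows sum to one
    return P

def em_centroids(X, P):
    """M-step sketch: w_j = sum_k p_j(x(k)) x(k) / sum_k p_j(x(k))."""
    return (P.T @ X) / P.sum(axis=0)[:, None]
```

Iterating these two steps over a batch dataset is the multi-epoch processing mode that the online procedures in the next section are designed to avoid.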
      <p>As is known [4,5], the K-means clustering procedure is related to the minimization of the goal function</p>
      <p>$$E(x(k), w_j) = \sum_{k=1}^{N} \sum_{j=1}^{m} \mu_j(k) \left\| x(k) - w_j \right\|^2 = \sum_{k=1}^{N} \sum_{j=1}^{m} \mu_j(k) \, d_E^2(x(k), w_j) \qquad (9)$$</p>
      <p>where</p>
      <p>$$\mu_j(k) = \begin{cases} 1, &amp; \text{if } x(k) \text{ belongs to the } j\text{-th cluster}, \\ 0, &amp; \text{otherwise}. \end{cases} \qquad (10)$$</p>
      <p>In doing so, the use of the Euclidean metric leads to the fact that the emerging clusters have a spherical shape. A modification of the standard K-means is the Mahalanobis K-means procedure [6], associated with the goal function</p>
      <p>$$E(x(k), w_j) = \sum_{k=1}^{N} \sum_{j=1}^{m} \mu_j(k)(x(k) - w_j)^T \Sigma_j^{-1}(x(k) - w_j) = \sum_{k=1}^{N} \sum_{j=1}^{m} \mu_j(k) \, d_M^2(x(k), w_j), \qquad (11)$$</p>
      <p>the minimization of which leads to an estimate of the centroids' position of the form</p>
      <p>$$w_j = \sum_{k=1}^{N} \mu_j(k) x(k) \Big/ \sum_{k=1}^{N} \mu_j(k) = \frac{1}{N_j} \sum_{x(k) \in Cl_j} x(k) \qquad (12)$$</p>
      <p>where $N_j$ is the number of observations of the initial dataset associated with the $j$-th cluster.</p>
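A single batch iteration of the Mahalanobis K-means described above can be sketched as follows (a minimal illustration under the assumption that the per-cluster matrices are given; the function name is not from the paper):

```python
import numpy as np

def mahalanobis_kmeans_step(X, W, Sigmas):
    """One batch iteration: assign each x(k) to the cluster with the smallest
    squared Mahalanobis distance, then recompute each centroid as the mean
    of its assigned observations."""
    N, m = X.shape[0], W.shape[0]
    D2 = np.empty((N, m))
    for j in range(m):
        diff = X - W[j]
        D2[:, j] = np.sum(diff @ np.linalg.inv(Sigmas[j]) * diff, axis=1)
    labels = D2.argmin(axis=1)
    W_new = W.copy()
    for j in range(m):
        members = X[labels == j]
        if len(members):               # keep old centroid for empty clusters
            W_new[j] = members.mean(axis=0)
    return labels, W_new
```

With identity matrices the step reduces to the ordinary (Euclidean) K-means iteration.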
      <p>Crisp goal functions (9), (11) are a partial case of the fuzzy clustering criterion [18]</p>
      <p>$$E(x(k), w_j, \mu_j) = \sum_{k=1}^{N} \sum_{j=1}^{m} \mu_j^{\beta}(k) \, d^2(x(k), w_j) \qquad (13)$$</p>
      <p>where the positive fuzzifier $\beta &gt; 0$ describes the blurring of the clusters; most often (FCM) the value of this parameter is set equal to two, and the Euclidean metric is taken as the distance $d^2(x(k), w_j)$.</p>
      <p>The solution of the problem of minimization of the goal function (13) under the constraints (5) leads to the fuzzy probabilistic clustering algorithm [3]</p>
      <p>$$\begin{cases} \mu_j(k) = d^{\frac{2}{1-\beta}}(x(k), w_j) \Big/ \sum_{l=1}^{m} d^{\frac{2}{1-\beta}}(x(k), w_l), \\ w_j = \sum_{k=1}^{N} \mu_j^{\beta}(k) x(k) \Big/ \sum_{k=1}^{N} \mu_j^{\beta}(k), \end{cases} \qquad (14)$$</p>
      <p>which when $\beta = 2$ turns into the standard FCM procedure:</p>
      <p>$$\begin{cases} \mu_j(k) = d_E^{-2}(x(k), w_j) \Big/ \sum_{l=1}^{m} d_E^{-2}(x(k), w_l), \\ w_j = \sum_{k=1}^{N} \mu_j^{2}(k) x(k) \Big/ \sum_{k=1}^{N} \mu_j^{2}(k). \end{cases} \qquad (15)$$</p>
      <p>The first relation of (15) can be rewritten in the form</p>
      <p>$$\mu_j(k) = \left\| x(k) - w_j \right\|^{-2} \Big/ \sum_{l=1}^{m} \left\| x(k) - w_l \right\|^{-2}. \qquad (16)$$</p>
      <p>The first relation of (14) corresponds to the description of a generalized Gaussian [19], which when $\beta = 2$ turns into the Cauchy probability density function, leading to the expression</p>
      <p>$$\mu_j(k) = \frac{1}{1 + d^2(x(k), w_j)/\gamma_j}, \qquad \gamma_j = \left( \sum_{l=1,\, l \neq j}^{m} d^{-2}(x(k), w_l) \right)^{-1}. \qquad (17)$$</p>
      <p>It is interesting to note here that while the EM algorithm is based on the Gaussian distribution, the fuzzy procedures are connected with the Cauchy distribution.</p>
      <p>It is also interesting to note that the popular Gath-Geva clustering algorithm [20], which minimizes the goal function (13), occupies an intermediate position between the EM and FCM approaches, since the estimate uses as the distance</p>
      <p>$$d_{GG}^2(x(k), w_j) = q_j (\det \Sigma_j)^{-1} \exp\left( -\frac{1}{2}(x - w_j)^T \Sigma_j^{-1}(x - w_j) \right) = q_j (\det \Sigma_j)^{-1} \exp\left( -\frac{1}{2} d_M^2(x, w_j) \right) \qquad (18)$$</p>
      <p>where</p>
      <p>$$q_j = \sum_{k=1}^{N} \mu_j^{\beta}(k) \Big/ \sum_{k=1}^{N} \sum_{l=1}^{m} \mu_l^{\beta}(k). \qquad (19)$$</p>
      <p>As a result of the minimization of (18) under the constraints (5), (19), we come to the algorithm</p>
      <p>$$\begin{cases} \mu_j(k) = d_{GG}^{\frac{2}{1-\beta}}(x(k), w_j) \Big/ \sum_{l=1}^{m} d_{GG}^{\frac{2}{1-\beta}}(x(k), w_l), \\ w_j = \sum_{k=1}^{N} \mu_j^{\beta}(k) x(k) \Big/ \sum_{k=1}^{N} \mu_j^{\beta}(k), \\ \Sigma_j = \sum_{k=1}^{N} \mu_j^{\beta}(k)(x(k) - w_j)(x(k) - w_j)^T \Big/ \sum_{k=1}^{N} \mu_j^{\beta}(k), \end{cases} \qquad (20)$$</p>
      <p>which when $\beta = 2$ takes the form</p>
      <p>$$\begin{cases} \mu_j(k) = d_{GG}^{-2}(x(k), w_j) \Big/ \sum_{l=1}^{m} d_{GG}^{-2}(x(k), w_l), \\ w_j = \sum_{k=1}^{N} \mu_j^{2}(k) x(k) \Big/ \sum_{k=1}^{N} \mu_j^{2}(k), \\ \Sigma_j = \sum_{k=1}^{N} \mu_j^{2}(k)(x(k) - w_j)(x(k) - w_j)^T \Big/ \sum_{k=1}^{N} \mu_j^{2}(k), \end{cases} \qquad (21)$$</p>
      <p>which is the FCM modification for the case of hyperellipsoidal clusters.</p>
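The per-cluster weights used in the Gath-Geva distance above have a simple form, sketched here for clarity (the function name is illustrative):

```python
import numpy as np

def cluster_weights(U, beta=2.0):
    """q_j = sum_k mu_j^beta(k) / sum_k sum_l mu_l^beta(k): the share of the
    total powered membership mass captured by each cluster."""
    Ub = U ** beta
    return Ub.sum(axis=0) / Ub.sum()
```

Since every observation's memberships sum to one, the weights themselves form a discrete distribution over the m clusters.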
      <p>3 Online Fuzzy Probabilistic Clustering in the Case of Hyperellipsoidal Classes</p>
      <p>The procedures discussed above assume that the initial dataset is specified in the form of a batch of data, which is processed several times in multi-epoch learning mode. It is clear that if the information is fed for processing in the form of a data stream $x(1), x(2), \dots, x(k), x(k+1), \dots$ (here $k$ is the index of the current discrete time), the clustering methods discussed above are ineffective.</p>
      <p>As is known, T. Kohonen's self-organizing map solves the clustering problem in sequential mode by minimizing the goal function (9), i.e., in fact, it implements the K-means method for a data flow. The popular WTA self-learning rule, which looks like [21]</p>
      <p>$$w_j(k+1) = \begin{cases} w_j(k) + \eta(k+1)(x(k+1) - w_j(k)) &amp; \text{if } w_j(k) \text{ is the winner}, \\ w_j(k) &amp; \text{otherwise} \end{cases} \qquad (22)$$</p>
      <p>(here $0 &lt; \eta(k+1) &lt; 1$ is the learning rate parameter, chosen according to the stochastic approximation conditions), actually computes the usual arithmetic mean in recurrent form.</p>
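The WTA self-learning step can be sketched in a few lines (a minimal illustration; the function name is not from the paper):

```python
import numpy as np

def wta_update(W, x, eta):
    """Kohonen WTA self-learning step: only the winner (the centroid closest
    to x in the Euclidean sense) is moved toward x by the learning rate eta."""
    j = np.argmin(((W - x) ** 2).sum(axis=1))   # competition (E-step analogue)
    W = W.copy()
    W[j] += eta * (x - W[j])                    # adaptation (M-step analogue)
    return W, j
```

If `eta` is chosen as the reciprocal of the winner's number of "wins", the rule reproduces the recurrent arithmetic mean of the inputs assigned to that neuron.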
      <p>The self-learning rule (22) is closely related to the EM algorithm, since the E-step of expectation implements Kohonen's competition process, and the M-step of maximization implements the synaptic adaptation process. With the distance</p>
      <p>$$d_E^2(x(k+1), w_j(k)) = \left\| x(k+1) - w_j(k) \right\|^2 \qquad (23)$$</p>
      <p>its minimization by the gradient procedure (24) in fact coincides with (22):</p>
      <p>$$w_j(k+1) = \begin{cases} w_j(k) - \eta(k+1) \nabla_{w_j} d_E^2(x(k+1), w_j(k)) &amp; \text{if } w_j(k) \text{ is the winner}, \\ w_j(k) &amp; \text{otherwise}. \end{cases} \qquad (24)$$</p>
      <p>Similarly to (24), the Mahalanobis metric (6) can be minimized using the recurrent procedure [22]:</p>
      <p>$$w_j(k+1) = \begin{cases} w_j(k) - \eta(k+1) \nabla_{w_j} d_M^2(x(k+1), w_j(k)) &amp; \text{if } w_j(k) \text{ is the winner}, \\ w_j(k) &amp; \text{otherwise}, \end{cases} \qquad (25)$$</p>
      <p>or, which is the same, with the learning rate chosen as $\eta(k+1) = k_j^{-1}$, where $k_j$ is the number of "wins" of the $j$-th neuron of Kohonen's map.</p>
      <p>For fuzzy situations, where the clusters being formed mutually overlap, in addition to the procedure (26) the membership level can be estimated similarly to (15):</p>
      <p>$$\mu_j(k) = d_M^{-2}(x(k), w_j(k)) \Big/ \sum_{l=1}^{m} d_M^{-2}(x(k), w_l(k)) = \left( (x(k) - w_j(k))^T \Sigma_j^{-1}(k)(x(k) - w_j(k)) \right)^{-1} \Big/ \sum_{l=1}^{m} \left( (x(k) - w_l(k))^T \Sigma_l^{-1}(k)(x(k) - w_l(k)) \right)^{-1}. \qquad (27)$$</p>
      <p>Next, solving the nonlinear programming problem (the goal function (13) with the constraints (5)) using the Arrow-Hurwicz-Uzawa algorithm, it is easy to write down the relations</p>
      <p>$$\begin{cases} w_j(k+1) = w_j(k) + \eta(k+1)\mu_j^{\beta}(k+1)(x(k+1) - w_j(k)), \\ \mu_j(k+1) = d_E^{\frac{2}{1-\beta}}(x(k+1), w_j(k)) \Big/ \sum_{l=1}^{m} d_E^{\frac{2}{1-\beta}}(x(k+1), w_l(k)), \end{cases} \qquad (28)$$</p>
      <p>which when $\beta = 2$ take the simple form [16]:</p>
      <p>$$\begin{cases} w_j(k+1) = w_j(k) + \eta(k+1)\mu_j^{2}(k+1)(x(k+1) - w_j(k)), \\ \mu_j(k+1) = \left\| x(k+1) - w_j(k) \right\|^{-2} \Big/ \sum_{l=1}^{m} \left\| x(k+1) - w_l(k) \right\|^{-2}. \end{cases} \qquad (29)$$</p>
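The online fuzzy update with fuzzifier two can be sketched as follows (a minimal illustration; the small `eps` guard is an implementation detail, not part of the paper):

```python
import numpy as np

def online_fcm_update(W, x, eta, eps=1e-12):
    """Online fuzzy clustering step for beta = 2:
    mu_j = d^-2(x, w_j) / sum_l d^-2(x, w_l),
    w_j(k+1) = w_j(k) + eta * mu_j^2 * (x - w_j(k)),
    i.e. every centroid is adapted, weighted by its squared membership."""
    d2 = ((W - x) ** 2).sum(axis=1) + eps
    inv = 1.0 / d2
    mu = inv / inv.sum()
    W_new = W + eta * (mu ** 2)[:, None] * (x - W)
    return W_new, mu
```

Unlike the crisp WTA rule, here the nearest centroid receives the largest correction but the remaining centroids are also pulled slightly toward the observation.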
      <p>Here, the multipliers $\mu_j^{\beta}(k+1)$, $\mu_j^{2}(k+1)$ are in fact neighbourhood functions in Kohonen's "Winner Takes More" self-learning rule, wherein generalized Gaussian and Cauchy functions are used instead of the usual Gaussian. It is interesting to note that the parameters of the receptive fields of these functions are evaluated here automatically.</p>
      <p>It should be noted that a recurrent modification of the Gath-Geva algorithm [20] was introduced in [23], but this modification is not related to optimization procedures. It can be written in the form of the recurrence relations</p>
      <p>$$\begin{cases} w_j(k+1) = w_j(k) + \eta(k+1)(x(k+1) - w_j(k)), &amp; \text{if } w_j(k) \text{ is the winner}, \\ \Sigma_j(k+1) = (1 - \eta(k+1))\Sigma_j(k) + \eta(k+1)(x(k+1) - w_j(k))(x(k+1) - w_j(k))^T. \end{cases} \qquad (30)$$</p>
      <p>It can be noticed that relations (30) are T. Kohonen's WTA self-learning rule together with a procedure for correcting the correlation matrix. It is interesting to note that this matrix does not influence the process of centroid tuning.</p>
      <p>A more effective algorithm was also proposed in [23], where a correction is made to the membership levels of the form</p>
      <p>$$U_j(k+1) = \sum_{\tau=1}^{k+1} \mu_j^{\beta}(\tau) = U_j(k) + \mu_j^{\beta}(k+1). \qquad (31)$$</p>
      <p>In this case, the algorithm has the form</p>
      <p>$$\begin{cases} w_j(k+1) = w_j(k) + \dfrac{\mu_j^{\beta}(k+1)}{U_j(k+1)}(x(k+1) - w_j(k)), \\ \Sigma_j(k+1) = \dfrac{U_j(k)}{U_j(k+1)} \left( \Sigma_j(k) + \dfrac{\mu_j^{\beta}(k+1)}{U_j(k+1)}(x(k+1) - w_j(k))(x(k+1) - w_j(k))^T \right), \\ \mu_j(k+1) = d_{GG}^{\frac{2}{1-\beta}}(x(k+1), w_j(k)) \Big/ \sum_{l=1}^{m} d_{GG}^{\frac{2}{1-\beta}}(x(k+1), w_l(k)). \end{cases} \qquad (32)$$</p>
      <p>Algorithm (32) is close to procedure (28) and coincides with it when $\eta(k+1) = U_j^{-1}(k+1)$; however, the metric $d_{GG}^2(x(k+1), w_j(k))$ used here differs significantly from the previously used $d_E^2(x(k+1), w_j(k))$. Once again, in this algorithm the correlation matrix $\Sigma_j(k)$ does not affect the process of centroid tuning.</p>
      <p>Combining procedure (25), which uses the gradient of the Mahalanobis metric, with the standard Gath-Geva algorithm, we obtain the relations describing the fuzzy clustering method with hyperellipsoidal classes:</p>
      <p>$$\begin{cases} U_j(k+1) = U_j(k) + \mu_j^{\beta}(k+1), \\ w_j(k+1) = w_j(k) + \dfrac{\mu_j^{\beta}(k+1)}{U_j(k+1)} \Sigma_j^{-1}(x(k+1) - w_j(k)), \\ \Sigma_j(k+1) = \dfrac{U_j(k)}{U_j(k+1)} \left( \Sigma_j(k) + \dfrac{\mu_j^{\beta}(k+1)}{U_j(k+1)}(x(k+1) - w_j(k))(x(k+1) - w_j(k))^T \right), \\ \mu_j(k+1) = d_{GG}^{\frac{2}{1-\beta}}(x(k+1), w_j(k)) \Big/ \sum_{l=1}^{m} d_{GG}^{\frac{2}{1-\beta}}(x(k+1), w_l(k)). \end{cases} \qquad (33)$$</p>
      <p>Using the fuzzifier $\beta = 2$, we get the adaptive recurrent procedure [24]:</p>
      <p>$$\begin{cases} U_j(k+1) = U_j(k) + \mu_j^{2}(k+1), \\ w_j(k+1) = w_j(k) + \dfrac{\mu_j^{2}(k+1)}{U_j(k+1)} \Sigma_j^{-1}(x(k+1) - w_j(k)), \\ \Sigma_j(k+1) = \dfrac{U_j(k)}{U_j(k+1)} \left( \Sigma_j(k) + \dfrac{\mu_j^{2}(k+1)}{U_j(k+1)}(x(k+1) - w_j(k))(x(k+1) - w_j(k))^T \right), \\ \mu_j(k+1) = d_{GG}^{-2}(x(k+1), w_j(k)) \Big/ \sum_{l=1}^{m} d_{GG}^{-2}(x(k+1), w_l(k)). \end{cases} \qquad (34)$$</p>
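The recurrent structure of the adaptive procedure above can be sketched as follows. This is a minimal illustration, not the paper's exact method: for simplicity the membership is computed from inverse squared Mahalanobis distances (as in the earlier online membership estimate) rather than from the Gath-Geva metric, and the function and variable names are illustrative:

```python
import numpy as np

def online_hyperellipsoidal_update(x, W, Sigmas, U_acc, eps=1e-12):
    """One step of online fuzzy clustering with hyperellipsoidal clusters:
    mu_j from inverse squared Mahalanobis distances (simplification),
    then the recurrent updates
      U_j(k+1)     = U_j(k) + mu_j^2,
      w_j(k+1)     = w_j(k) + (mu_j^2 / U_j(k+1)) * Sigma_j^{-1} (x - w_j(k)),
      Sigma_j(k+1) = (U_j(k)/U_j(k+1)) * (Sigma_j(k)
                     + (mu_j^2 / U_j(k+1)) * (x - w_j)(x - w_j)^T).
    W, Sigmas and U_acc are modified in place; mu is returned."""
    m = W.shape[0]
    d2 = np.empty(m)
    for j in range(m):
        diff = x - W[j]
        d2[j] = diff @ np.linalg.solve(Sigmas[j], diff) + eps
    inv = 1.0 / d2
    mu = inv / inv.sum()
    for j in range(m):
        diff = x - W[j]
        U_prev = U_acc[j]
        U_acc[j] = U_prev + mu[j] ** 2
        gain = mu[j] ** 2 / U_acc[j]
        W[j] = W[j] + gain * np.linalg.solve(Sigmas[j], diff)
        Sigmas[j] = (U_prev / U_acc[j]) * (Sigmas[j] + gain * np.outer(diff, diff))
    return mu
```

Each incoming observation is processed once, so the procedure suits data-stream settings where re-scanning the dataset between observations is impossible.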
      <p>4 Computer Experiments with Medical Data</p>
      <p>As a demonstration of the developed Online Fuzzy Probabilistic Clustering method, two medical samples from the UCI repository were used. The first series of experiments was performed on the Breast Cancer Wisconsin dataset, which is related to an applied problem in the medical field. The data describe two types of tumors, malignant and benign. Clustering results are represented as a 3D model.</p>
      <p>The structure of the Breast Cancer Wisconsin dataset is complex and non-linear. The maximum classification accuracy reached 97.5 percent. Figs. 1, 2, and 3 show the visualization of clustering results in three different projections. For visualization, the dataset was previously compressed into three principal components with the Principal Component Analysis (PCA) method.</p>
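The PCA compression used for the 3D visualization can be sketched via the singular value decomposition (a minimal illustration; the function name is not from the paper):

```python
import numpy as np

def pca_compress(X, n_components=3):
    """Project the dataset onto its first principal components via SVD,
    as used here only to visualize the clustering results in 3D."""
    Xc = X - X.mean(axis=0)            # center the features
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T    # scores in the principal subspace
```

The returned scores have non-increasing variances along the component axes, so the first three components retain as much variance as any three-dimensional linear projection can.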
      <p>The received results were compared with the standard EM-algorithm and J. Bezdek's Fuzzy C-means method. The results of the experiment can be seen in Table 1. The first row of the table presents the outputs of the EM procedure, the second row the FCM one, and the third the proposed approach. It can be seen that the proposed procedure surpasses both the EM and FCM algorithms in the quality of results.</p>
      <p>For each cell nucleus, ten real-valued features are calculated that form the columns of the sample: radius (average distance from the center to the points along the perimeter); texture (standard deviation of gray values); perimeter; area; smoothness; compactness; concavity (the severity of the concave parts of the contour); the number of concave points (the number of concave parts on the contour of the tumor); symmetry; fractal dimension.</p>
      <p>The obtained accuracy ranged from 89% to 97.5%, which means that from 490 to 547 of the 569 examples were classified correctly.</p>
      <p>The second series of experiments was carried out on the Dermatology sample, which contains data obtained from a study of the histopathological features of patients with dermatological diseases. The goal is to determine the type of dermatological disease.</p>
      <p>The dataset has 33 attributes, one of which is the target attribute showing the presence of a specific dermatological disease; there are 6 classes: 1st class (psoriasis) – 112 examples, 2nd class (seborrheic dermatitis) – 61 examples, 3rd class (lichen) – 72 examples, 4th class (pink lichen) – 49 examples, 5th class (chronic dermatitis) – 52 examples, 6th class (lichen planus) – 20 examples.</p>
      <p>This sample is quite popular for testing various algorithms for medical data analytics, since it has several linearly inseparable clusters that are difficult to process with standard clustering algorithms.</p>
      <p>The final results of the clustering by the developed method are presented in Figs. 4 and 5. As can be seen there, some examples of one cluster lie very close to another cluster, which suggests that some clusters are not linearly separable.</p>
      <p>The series of experiments indicates that the proposed Online Fuzzy Probabilistic Clustering method works quite quickly and shows high quality of clustering of data arrays under conditions of non-linear data. The experiments also show that the proposed method can solve practical problems in the field of medical data processing.</p>
      <p>The problem of online fuzzy clustering of data that are fed for processing sequentially in the form of a data stream was considered. A feature of the approach under consideration is that the formed classes have a hyperellipsoidal shape with an arbitrary orientation of the axes in feature space. The proposed procedure uses the Mahalanobis metric, is a generalization of a number of well-known clustering algorithms, and has high speed and simple computational implementation. The obtained results allow solving a number of problems arising in Data Mining, and especially in Medical Data Mining, connected with the mass examination of patients.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1. Aggarwal, C.C.
          <article-title>: Data Mining</article-title>
          . Cham, Switzerland: Springer Int. Publ.
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Bramer</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <source>Principles of Data Mining</source>
          . Springer-Verlag London (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Höppner</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Klawonn</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kruse</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Runkler</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Fuzzy Cluster Analysis: Methods for Classification, Data Analysis and Image Recognition</article-title>
          . John Wiley &amp; Sons. Chichester (
          <year>1999</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Gan</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          , Ma, Ch.,
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Data Clustering: Theory, Algorithms</article-title>
          and Applications, Philadelphia: SIAM (
          <year>2007</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wunsch</surname>
            ,
            <given-names>D. C.</given-names>
          </string-name>
          :
          <source>Clustering</source>
          . IEEE Press Series on Computational Intelligence. Hoboken, NJ: John Wiley &amp; Sons, Inc. (
          <year>2009</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Aggarwal</surname>
            ,
            <given-names>C. C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reddy</surname>
            ,
            <given-names>C. K.</given-names>
          </string-name>
          :
          <article-title>Data Clustering. Algorithms and Application</article-title>
          . Boca Raton: CRC Press (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Bifet</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Adaptive Stream Mining: Pattern Learning and Mining from Evolving Data Streams</article-title>
          , IOS Press (
          <year>2010</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Kacprzyk</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pedrycz</surname>
          </string-name>
          , W.: Springer Handbook of Computational Intelligence, Berlin Heidelberg: Springer, Verlag (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Du</surname>
          </string-name>
          , K.-L.,
          <string-name>
            <surname>Swamy</surname>
            ,
            <given-names>M. N. S.</given-names>
          </string-name>
          :
          <source>Neural Networks and Statistical Learning. London: SpringerVerlag</source>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Bezdek</surname>
          </string-name>
          , J.-C.:
          <article-title>Pattern Recognition with Fuzzy Objective Function Algorithms</article-title>
          , N.Y.: Plenum Press (
          <year>1981</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Bodyanskiy</surname>
          </string-name>
          ,
          <string-name>
            <surname>Ye</surname>
          </string-name>
          . V.,
          <string-name>
            <surname>Deineko</surname>
            ,
            <given-names>A. O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kutsenko</surname>
            ,
            <given-names>Y. V.</given-names>
          </string-name>
          :
          <article-title>On-line kernel clustering based on the general regression neural network and T. Kohonen's self-organizing map</article-title>
          ,
          <source>Automatic Control and Computer Sciences</source>
          <volume>51</volume>
          (
          <issue>1</issue>
          ),
          <fpage>55</fpage>
          -
          <lpage>62</lpage>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Bezdek</surname>
            ,
            <given-names>J. C.</given-names>
          </string-name>
          , Keller, J.,
          <string-name>
            <surname>Krishnapuram</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pal</surname>
          </string-name>
          , N.:
          <article-title>Fuzzy Models and Algorithms for Pattern Recognition</article-title>
          and Image Processing.
          <source>The Handbook of Fuzzy Sets</source>
          . Kluwer, Dordrecht, Netherlands: Springer, vol.
          <volume>4</volume>
          (
          <year>1999</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Dempster</surname>
            ,
            <given-names>A. P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Laird</surname>
            ,
            <given-names>N. M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rubin</surname>
            ,
            <given-names>D. B.</given-names>
          </string-name>
          :
          <article-title>Maximum likelihood from incomplete data via the EM algorithm</article-title>
          ,
          <source>J. of the Royal Statistical Society, Ser.B</source>
          <volume>39</volume>
          (
          <issue>1</issue>
), pp.
          <fpage>1</fpage>
          -
          <lpage>38</lpage>
          (
          <year>1977</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
<string-name>
            <surname>Hathaway</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Another interpretation of the EM algorithm for mixture distributions</article-title>
          ,
          <source>J. of Statistics &amp; Probability Letters</source>
          , vol.
          <volume>4</volume>
          , pp.
          <fpage>53</fpage>
          -
          <lpage>56</lpage>
          (
          <year>1986</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Meng</surname>
            ,
            <given-names>X. L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rubin</surname>
            ,
            <given-names>D. B.</given-names>
          </string-name>
          :
          <article-title>Maximum likelihood estimation via the ECM algorithm:a general framework</article-title>
          ,
<source>Biometrika</source>
          , vol.
          <volume>80</volume>
, pp.
          <fpage>267</fpage>
          -
          <lpage>278</lpage>
          (
          <year>1993</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
<string-name>
            <surname>Bodyanskiy</surname>
            ,
            <given-names>Ye.</given-names>
          </string-name>
          :
          <article-title>Computational intelligence techniques for data analysis</article-title>
          ,
          <source>Lecture Notes in Informatics, Bonn: GI</source>
          , pp.
          <fpage>15</fpage>
          -
          <lpage>36</lpage>
          (
          <year>2005</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
<string-name>
            <surname>Gorshkov</surname>
            ,
            <given-names>Ye.</given-names>
          </string-name>
          ,
          <string-name>
<surname>Kolodyazhniy</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
<string-name>
            <surname>Bodyanskiy</surname>
            ,
            <given-names>Ye.</given-names>
          </string-name>
          :
          <article-title>New recursive learning algorithms for fuzzy Kohonen clustering network</article-title>
          ,
          <source>Proc. 17th Int. Workshop on Nonlinear Dynamics of Electronic Systems</source>
          , Rapperswil, Switzerland, pp.
          <fpage>58</fpage>
          -
          <lpage>61</lpage>
          (
          <year>2009</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Mumford</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jain</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
:
          <source>Computational Intelligence: Collaboration, Fusion and Emergence</source>
          . Berlin: Springer-Verlag (
          <year>2009</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Osowski</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Sieci neuronowe do przetwarzania informacji</article-title>
, Warszawa: Oficyna Wydawnicza Politechniki Warszawskiej (
          <year>2006</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Gath</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Geva</surname>
            ,
            <given-names>A. B.</given-names>
          </string-name>
          :
          <article-title>Unsupervised optimal fuzzy clustering</article-title>
          ,
<source>IEEE Trans. on Pattern Analysis and Machine Intelligence</source>
<volume>11</volume>
          (
          <issue>7</issue>
          ), pp.
          <fpage>773</fpage>
          -
          <lpage>787</lpage>
          (
          <year>1989</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Kohonen</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
:
          <source>Self-Organizing Maps</source>
          . Berlin: Springer-Verlag (
          <year>1995</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
<string-name>
            <surname>Bodyanskiy</surname>
            ,
            <given-names>Ye.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Deineko</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kutsenko</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zayika</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          :
<article-title>Data streams fast EM-fuzzy clustering based on Kohonen's self-learning</article-title>
          ,
          <source>The 1st IEEE International Conference on Data Stream Mining &amp; Processing (DSMP 2016): Proc. of Int. Conf., Lviv</source>
          , pp.
          <fpage>309</fpage>
          -
          <lpage>313</lpage>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Geva</surname>
            ,
            <given-names>A.B.</given-names>
          </string-name>
          :
          <article-title>Clustering as a basis for evolving neuro-fuzzy modeling</article-title>
          ,
          <source>Evolving Systems</source>
          , pp.
          <fpage>59</fpage>
          -
          <lpage>71</lpage>
          (
          <year>2010</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Deineko</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhernova</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gordon</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zayika</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pliss</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pabyrivska</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          :
          <article-title>Data stream online clustering based on fuzzy expectation-maximization approach</article-title>
          .
          <source>The 2nd IEEE International Conference on Data Stream Mining and Processing (DSMP 2018): Proc. of Int. Conf., Lviv</source>
, pp.
          <fpage>171</fpage>
          -
          <lpage>176</lpage>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>