<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Electroencephalogram Signals Classification by Ordered Fuzzy Decision Tree</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jan Rabcan</string-name>
          <email>jan.rabcan@fri.uniza.sk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Miroslav Kvassay</string-name>
          <email>miroslav.kvassay@fri.uniza.sk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="editor">
          <string-name>Key Terms. Model, Approach, Methodology, Scientific Field</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Zilina, Department of Informatics</institution>
          ,
          <addr-line>Univerzitna 8215/1, 010 26, Zilina, Slovakia</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>A new algorithm for the classification of electroencephalogram (EEG) signals is proposed in this paper. The classification is used for automatic detection of patients with epilepsy in a medical decision support system. The algorithm is based on an Ordered Fuzzy Decision Tree (OFDT). Applying an OFDT requires a special transformation of the EEG signal, called preliminary data transformation, which extracts fundamental features of the EEG signal from every sample and reduces the dimension of the samples. The accuracy of the proposed algorithm was evaluated and compared with other known algorithms for EEG signal classification. The comparison showed that the proposed algorithm is comparable with existing ones and can produce better results than some of them.</p>
      </abstract>
      <kwd-group>
        <kwd>Electroencephalogram</kwd>
        <kwd>Classification</kwd>
        <kwd>Ordered Fuzzy Decision Tree</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Epilepsy is characterized by seizures that result from a transient and
unexpected electrical disturbance of the brain. These electrical discharges of neurons are
evident in the electroencephalogram (EEG) signal, i.e. a signal that represents the
electrical activity of the brain [1]. EEG has been the most commonly used signal for brain
state monitoring. It is used in medicine to reveal changes in the electrical activity of
the brain and to detect disorders that generally accompany seizure diseases, loss of
consciousness after a stroke, inflammations, trauma, or concussion [2]. Measuring
brain electrical activity by EEG is considered one of the most important tools in
neurological diagnostics [3, 4]. The specifics of this signal are presented in detail
in [3]. Communication between brain cells takes place through electrical impulses.
EEG allows these impulses to be measured by placing electrodes on the scalp
[1, 2]. The captured signals are amplified and then converted to a graphic
representation, a curve [2]. The shape and character of the curves depend on the current
activity of the brain. By extracting useful information from the captured signal, we are
able to predict or classify the brain state of the investigated patient. However, visual
inspection of the EEG signal does not provide much information and, therefore,
automatic analysis and classification of the EEG signal is a topical problem. A solution to this
problem would permit the development of a decision support system for epilepsy diagnosis [1]–[3].</p>
      <p>
        The specifics of the EEG signal described in [1] require a special transformation of this
signal before a classification procedure is applied. This typical step of algorithms
for EEG classification is known as preliminary data transformation. Feature
extraction from the initial signal and reduction of the feature dimension are fundamental
procedures that are necessary before the classification. Different algorithms for feature
extraction can be used for EEG signals, e.g. Fourier transform [4], logistic regression
[3], wavelet transform [2], and Welch's method [4]. In this paper, we use Welch's
method, whose result is a matrix of features that captures the specifics of the investigated
signals. However, this matrix usually has too large a dimension for classification and,
therefore, a special procedure has to be used to reduce it. Principal
component analysis (PCA) is typically used to achieve this goal in EEG signal
analysis. PCA is a statistical procedure [
        <xref ref-type="bibr" rid="ref19">5, 19</xref>
        ], which converts a set of observations
(described by variables that can be correlated) into a smaller number of linearly
uncorrelated variables called "principal components".
      </p>
      <p>
        After the preliminary data transformation, the EEG signal can be classified. The
classification can be implemented by methods such as neural networks [4],
evaluation methods [5], clustering analysis [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], and the K-nearest neighbor classifier [6]. In [7],
a decision tree was used for EEG signal classification. The decision tree in [7] was
induced for numerical data. This required a transformation of the original EEG signal
into reduced data that can have some ambiguity [5], [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. This ambiguity can be
included and considered in the analysis of data for the classification by their
transformation into fuzzy data [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Fuzzy sets, which define the domains of fuzzy data, can be
useful to describe real-world problems with higher accuracy [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. This idea is
adopted in this paper. We add one more step to the preliminary data
transformation. In this step, the crisp data obtained after application of PCA are transformed
into fuzzy data in a process known as fuzzification.
      </p>
      <p>
        In this paper, a new algorithm for classification of EEG signal that is transformed
into fuzzy reduced features is developed. The classification itself is implemented by
an Ordered Fuzzy Decision Tree (OFDT). This type of decision trees has been
introduced in [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], and its advantage is a regular structure that contains exactly one
attribute at each level [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. This implies that analysis based on this type of tree can be done
in parallel. In this paper, the OFDT for EEG signal classification is induced
from the data published in [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. The accuracy of the classification was evaluated
and compared with other algorithms used for EEG signal classification.
      </p>
      <p>
        This paper consists of four sections. The first section describes the background of
the proposed algorithm for EEG signals classification. This section describes data
used in the algorithm and explains the principal steps of the algorithm. The second
section deals with the preliminary data transformation by Welch’s method, principal
component analysis, and fuzzification. The process of OFDT induction is explained in
the third section. This process is illustrated using the data obtained after the
preliminary data transformation of data from [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. The fourth section provides estimation and
analysis of the implemented algorithm.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Background of New Algorithm of EEG Signal Classification</title>
      <sec id="sec-2-1">
        <title>The Dataset Description</title>
        <p>
          In the case of epileptic activity, the EEG signal has some special features that have to be
extracted automatically to allow classification. The development of a new classification
algorithm requires data from real observations because they allow us to
evaluate the accuracy of the algorithm. The dataset of EEG signals used in this paper
was collected and published by R. G. Andrzejak in [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]. This dataset consists of five
subsets A, B, C, D and E, each containing 100 samples. Every sample is
an EEG segment with a duration of 23.6 s. In subsets A and B, all samples were
taken from the surface of the head of five healthy persons. The difference between these
subsets is that the persons in subset A had their eyes open while the persons in subset B
had their eyes closed during EEG recording. According to [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ], open or closed eyes of
patients influence epileptic activity. Samples in subset D were recorded
from within the epileptogenic zone identified in the hippocampal formation, and those
in subset C were obtained from the hippocampal formation of the opposite
hemisphere of the brain. While subsets C and D contain only activity measured during
seizure-free intervals, subset E contains only seizure activity. All EEG signals were
recorded with the same 128-channel amplifier system. After 12-bit analog-to-digital
conversion, the data were written continuously onto the disk of a data acquisition
computer system at a sampling rate of 173.61 Hz. The band-pass filter setting was 0.53-40 Hz.
Examples of EEG signals from every subset are shown in Fig. 1.
        </p>
        <p>Visual inspection of the EEG signals depicted in Fig. 1 does not provide much
information. Also, classification of these signals by a decision tree, OFDT in
particular, is not possible directly because the raw signals cannot be interpreted in terms of OFDT
attributes. This implies that numerical features have to be extracted from these
signals. Furthermore, if many features are extracted, their number should be
reduced before the classification. This transformation of the EEG signal is referred to as
preliminary data transformation.</p>
      </sec>
      <sec id="sec-2-2">
        <title>Principal Steps of New Algorithm for EEG Classification by Ordered Fuzzy Decision Tree</title>
        <p>
          The new algorithm for EEG signal classification has two principal steps (Fig. 2):
preliminary data transformation and classification based on OFDT. The first step
consists of three procedures. The first of them is feature extraction. There are different
procedures for extracting features of an EEG signal. For example, the Fourier transform
is used in [4], and the wavelet transform in [
          <xref ref-type="bibr" rid="ref12">2, 12</xref>
          ]. In this paper, features have been
extracted by Welch’s method, which represents one of the commonly used power
spectral density estimators.
        </p>
        <p>
          The second procedure of the preliminary data transformation is dimension
reduction. This procedure can be implemented using PCA, whose background
and principles are discussed in [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ].
        </p>
        <p>
          Finally, the third procedure is fuzzification of the obtained features of the EEG signal.
This procedure is specific to our algorithm (it is not used in the existing ones), and it
is required because the classification algorithm is based on OFDT. Several
algorithms can be used for this task. In this paper, we use the fuzzification
algorithm described in [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ].
        </p>
        <p>
          The second step of the developed algorithm is classification. A new classification
based on OFDT is elaborated in this paper. The OFDT for this classification is
induced based on the estimation of cumulative information introduced in [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ].
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Preliminary Data Transformation</title>
      <sec id="sec-3-1">
        <title>Welch’s Method</title>
        <p>The preliminary data transformation includes three procedures. The first of them is
Welch’s method, which is used to extract features from raw EEG segments.
Welch’s method converts the analyzed signal from the time domain
into the frequency domain. It is a non-parametric power spectral density estimator [4].
The method divides the time series of a signal into several overlapping
segments of the same length, computes the periodogram of each segment, and averages
the periodograms. The result of Welch’s method is a matrix of features in which every row
corresponds to the averaged periodogram of one signal. Every column is considered an input
attribute. The obtained matrix usually has many columns, i.e. many input
attributes for the classification. To reduce the number of input attributes, PCA can be
applied.</p>
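        <p>The procedure described above can be sketched in plain Python. This is only an illustration under stated assumptions: the segment length, the overlap, the Hann window, and the 64 Hz test signal are hypothetical choices for the example and are not the settings used in this paper.</p>
        <preformat><![CDATA[
```python
import cmath
import math


def hann_window(n):
    """Symmetric Hann window of length n (an assumed window choice)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def periodogram(segment, window, fs):
    """One-sided windowed periodogram of a single segment."""
    n = len(segment)
    mean = sum(segment) / n
    # Remove the segment mean and apply the window.
    tapered = [(s - mean) * w for s, w in zip(segment, window)]
    scale = fs * sum(w * w for w in window)
    psd = []
    for k in range(n // 2 + 1):
        # Naive DFT coefficient for frequency bin k.
        coeff = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                    for t, v in enumerate(tapered))
        power = abs(coeff) ** 2 / scale
        if 0 < k < n / 2:
            power *= 2  # fold negative frequencies into the one-sided estimate
        psd.append(power)
    return psd


def welch_psd(signal, fs, seg_len=32, overlap=16):
    """Average the periodograms of overlapping segments of equal length."""
    step = seg_len - overlap
    window = hann_window(seg_len)
    starts = range(0, len(signal) - seg_len + 1, step)
    periodograms = [periodogram(signal[s:s + seg_len], window, fs) for s in starts]
    return [sum(column) / len(periodograms) for column in zip(*periodograms)]


# Hypothetical test signal: an 8 Hz sine sampled at 64 Hz.
fs = 64.0
signal = [math.sin(2 * math.pi * 8.0 * t / fs) for t in range(128)]
features = welch_psd(signal, fs)          # one row of the feature matrix
peak_hz = features.index(max(features)) * fs / 32
print(len(features), peak_hz)
```
]]></preformat>
        <p>Each call to the hypothetical function welch_psd yields one row of the feature matrix; stacking the rows of all EEG segments gives the matrix that is passed to PCA.</p>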
      </sec>
      <sec id="sec-3-2">
        <title>Principal Component Analysis</title>
        <p>
          PCA is a statistical procedure that converts a set of observations of possibly
correlated variables into linearly uncorrelated variables called "principal components". The
number of principal components is less than or equal to the number of original
variables. Every resulting principal component can be considered an input attribute for
an arbitrary classification model. The goal of PCA is to maximize the variance of
individual principal components, provided that their covariance is equal to zero. The
resulting principal components are linear combinations of the original features. In the
feature space, the components are orthogonal to each other. The first component has
the largest variance. Every subsequent component has the largest possible variance under
the constraint that it is uncorrelated with and orthogonal to the preceding components;
consequently, its variance is not smaller than that of the components that follow it. The number of
components selected for further analysis was estimated by Kaiser’s criterion [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ].
According to this criterion, a principal component is significant if its variance is
greater than the average variance of all the principal components. In some literature,
the result of PCA is described as a set of uncorrelated vectors called an orthogonal basis set [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ].
        </p>
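        <p>Kaiser’s criterion as stated above can be illustrated with a short sketch. The variances below are hypothetical values, not results of this paper, and the function name kaiser_select is introduced only for this example.</p>
        <preformat><![CDATA[
```python
def kaiser_select(variances):
    """Return indices of components whose variance exceeds the average variance."""
    mean_variance = sum(variances) / len(variances)
    return [i for i, v in enumerate(variances) if v > mean_variance]


# Hypothetical variances of five principal components, sorted in descending order.
variances = [4.0, 2.5, 1.0, 0.3, 0.2]
kept = kaiser_select(variances)
# Share of the total variance carried by the kept components.
explained = sum(variances[i] for i in kept) / sum(variances)
print(kept, explained)
```
]]></preformat>
        <p>In this hypothetical case the average variance is 1.6, so only the first two components are kept as input attributes.</p>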
      </sec>
      <sec id="sec-3-3">
        <title>Fuzzy Sets and Fuzzification</title>
        <p>
          The input data for OFDT induction are fuzzy. This implies that the numerical data obtained
from PCA have to be fuzzified. Fuzzification is the transformation of continuous
numerical data into a set of fuzzy data. One of the possible fuzzification algorithms
is the arbitrary clustering algorithm described in [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]. If the data are fuzzified correctly, the
ambiguity of the original data can be reduced. Also, the classification algorithm should be
less sensitive to errors and variations in the measurements of EEG signals and, therefore,
fuzzification can lead to better classification results.
        </p>
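        <p>The classification algorithm considered in this paper works with fuzzy attributes.
Let us assume that we have <inline-formula><tex-math>n</tex-math></inline-formula> fuzzy attributes denoted as <inline-formula><tex-math>A_i</tex-math></inline-formula>, for <inline-formula><tex-math>i = 1, 2, \ldots, n</tex-math></inline-formula>. Then
fuzzy attribute <inline-formula><tex-math>A_i</tex-math></inline-formula> is a linguistic attribute, which means that <inline-formula><tex-math>A_i</tex-math></inline-formula> can take fuzzy
values <inline-formula><tex-math>A_{i,j}</tex-math></inline-formula>, <inline-formula><tex-math>j = 1, 2, \ldots, m_i</tex-math></inline-formula>. Every value <inline-formula><tex-math>A_{i,j}</tex-math></inline-formula> of fuzzy attribute <inline-formula><tex-math>A_i</tex-math></inline-formula> can be considered a
fuzzy set. Fuzzy set <inline-formula><tex-math>A_{i,j}</tex-math></inline-formula> is defined as an ordered set of pairs <inline-formula><tex-math>\{(u, \mu_{A_{i,j}}(u)),\ u \in U\}</tex-math></inline-formula>,
where <inline-formula><tex-math>U</tex-math></inline-formula> denotes the set containing all the entities belonging to the discourse (this set
is known as the universe of discourse or the domain of discourse), and <inline-formula><tex-math>\mu_{A_{i,j}}(u): U \to [0,1]</tex-math></inline-formula>
is a membership function that defines for every entity <inline-formula><tex-math>u</tex-math></inline-formula> from universe <inline-formula><tex-math>U</tex-math></inline-formula> its
degree of membership in set <inline-formula><tex-math>A_{i,j}</tex-math></inline-formula> (a numeric value between 0 and 1). The membership
of entity <inline-formula><tex-math>u</tex-math></inline-formula> from universe <inline-formula><tex-math>U</tex-math></inline-formula> in fuzzy set <inline-formula><tex-math>A_{i,j}</tex-math></inline-formula> is defined by function <inline-formula><tex-math>\mu_{A_{i,j}}(u)</tex-math></inline-formula> in the
following manner:
          <list list-type="order">
            <list-item><p><inline-formula><tex-math>\mu_{A_{i,j}}(u) = 0</tex-math></inline-formula> if and only if <inline-formula><tex-math>u</tex-math></inline-formula> is not a member of set <inline-formula><tex-math>A_{i,j}</tex-math></inline-formula>,</p></list-item>
            <list-item><p><inline-formula><tex-math>0 &lt; \mu_{A_{i,j}}(u) &lt; 1</tex-math></inline-formula> if and only if <inline-formula><tex-math>u</tex-math></inline-formula> is a partial (not a full) member of set <inline-formula><tex-math>A_{i,j}</tex-math></inline-formula>,</p></list-item>
            <list-item><p><inline-formula><tex-math>\mu_{A_{i,j}}(u) = 1</tex-math></inline-formula> if and only if <inline-formula><tex-math>u</tex-math></inline-formula> is a full member of set <inline-formula><tex-math>A_{i,j}</tex-math></inline-formula>.</p></list-item>
          </list>
        </p>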
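        <p>In what follows, we will use some special terms from fuzzy set theory. The first of
them is the cardinality <inline-formula><tex-math>M(A_{i,j})</tex-math></inline-formula> of fuzzy set <inline-formula><tex-math>A_{i,j}</tex-math></inline-formula>, which is defined as follows:
          <disp-formula><tex-math>M(A_{i,j}) = \sum_{u \in U} \mu_{A_{i,j}}(u),</tex-math></disp-formula>
and another one is the product of fuzzy sets <inline-formula><tex-math>A_{i,\tilde{j}} = A_{i,1} \times A_{i,2} \times \ldots \times A_{i,j}</tex-math></inline-formula>. This product
results in a new fuzzy set <inline-formula><tex-math>A_{i,\tilde{j}}</tex-math></inline-formula>, which is defined as follows:
          <disp-formula><tex-math>A_{i,\tilde{j}} = \{(u,\ \mu_{A_{i,1}}(u) \cdot \mu_{A_{i,2}}(u) \cdot \ldots \cdot \mu_{A_{i,j}}(u)),\ u \in U\}.</tex-math></disp-formula></p>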
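        <p>The two notions can be illustrated on a small universe. In the following sketch a fuzzy set is represented simply as a list of membership degrees, one per entity of the universe; this representation and the numeric values are assumptions made for the example.</p>
        <preformat><![CDATA[
```python
import math


def cardinality(memberships):
    """Cardinality of a fuzzy set: the sum of its membership degrees."""
    return sum(memberships)


def product(*fuzzy_sets):
    """Product of fuzzy sets: element-wise product of membership degrees."""
    return [math.prod(degrees) for degrees in zip(*fuzzy_sets)]


# Hypothetical membership degrees of three entities in two fuzzy sets.
set_a = [1.0, 0.5, 0.0]
set_b = [0.5, 0.5, 1.0]

joint = product(set_a, set_b)
print(joint, cardinality(joint))
```
]]></preformat>
        <p>An entity contributes to the cardinality of the product only to the degree to which it belongs to all of the multiplied sets at once.</p>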
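        <p>The fuzzy attributes necessary for OFDT induction in the case of EEG signal
classification have to be obtained by fuzzification of the principal components obtained after
the PCA transformation of the data produced by Welch’s method from the raw signals.
For this purpose, we use the arbitrary clustering algorithm. This algorithm divides
numerical values <inline-formula><tex-math>x</tex-math></inline-formula> of the numerical attribute <inline-formula><tex-math>A_i</tex-math></inline-formula> into <inline-formula><tex-math>m_i</tex-math></inline-formula> clusters to obtain fuzzy
attribute <inline-formula><tex-math>A_i</tex-math></inline-formula>. The <inline-formula><tex-math>j</tex-math></inline-formula>-th cluster, for <inline-formula><tex-math>j = 1, 2, \ldots, m_i</tex-math></inline-formula>, is described by center <inline-formula><tex-math>c_j</tex-math></inline-formula>. The
membership functions are created from these centers. The usage of membership functions
<inline-formula><tex-math>\mu_{A_{i,j}}(x)</tex-math></inline-formula> is explained in Table 1. The columns in Table 1 are separated into two
sub-columns. The first sub-column contains the membership functions, and the second
contains the conditions that determine which function is used during the
transformation.</p>
        <p>Table 1 (excerpt). Membership function of the first fuzzy value:
          <disp-formula><tex-math><![CDATA[\mu_{A_{i,1}}(x) = \begin{cases} 1, & \text{if } x \le c_1 \\ \dfrac{c_2 - x}{c_2 - c_1}, & \text{if } c_1 < x < c_2 \\ 0, & \text{if } x \ge c_2 \end{cases}]]></tex-math></disp-formula></p>
        <p>
          The data for OFDT induction are stored in a repository described by output attribute <inline-formula><tex-math>B</tex-math></inline-formula> and input
attributes <inline-formula><tex-math>A_i</tex-math></inline-formula>, for <inline-formula><tex-math>i = 1, 2, \ldots, n</tex-math></inline-formula>. The values of the output attribute represent class labels. The
repository table consists of <inline-formula><tex-math>n + 1</tex-math></inline-formula> columns that correspond to the <inline-formula><tex-math>n</tex-math></inline-formula> input attributes and 1
output attribute <inline-formula><tex-math>B</tex-math></inline-formula>. Attribute <inline-formula><tex-math>A_i</tex-math></inline-formula> is divided into <inline-formula><tex-math>m_i</tex-math></inline-formula> sub-columns that correspond to the
values of the input attribute. An example of a repository is shown in Table 2. The
cells of this table contain the membership function values for every value of the
individual attributes. A row in the table represents one sample used for OFDT induction.
The preliminary data transformation described above was applied to the data
collected in [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]. First, we used Welch’s method. This resulted in a matrix of
128 features (numeric attributes). After PCA, the number of features was
reduced to 10, denoted as <inline-formula><tex-math>A_1, A_2, \ldots, A_{10}</tex-math></inline-formula>. Before using them in classification, we had to
fuzzify them. Their characteristics are presented in Table 3, whose second column
contains the percentage of the total variance contained in the
individual attributes (before fuzzification), and whose third column presents the number of
fuzzy values of the individual attributes obtained after fuzzification.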
        </p>
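        <p>Fuzzification of one numerical value from cluster centers can be sketched as follows. The function for the first fuzzy value follows Table 1; the piecewise-linear functions for the remaining values are an assumption of this sketch (neighboring functions cross linearly between adjacent centers), and the centers themselves are hypothetical. The resulting list corresponds to the sub-columns of one attribute in one repository row.</p>
        <preformat><![CDATA[
```python
def fuzzify(x, centers):
    """Membership degrees of value x in the fuzzy values defined by sorted centers.

    First value: 1 below c_1, falling linearly to 0 at c_2 (as in Table 1).
    Other values: assumed linear interpolation between neighboring centers.
    """
    m = len(centers)
    if x <= centers[0]:
        return [1.0] + [0.0] * (m - 1)
    if x >= centers[-1]:
        return [0.0] * (m - 1) + [1.0]
    degrees = [0.0] * m
    for j in range(m - 1):
        low, high = centers[j], centers[j + 1]
        if low <= x < high:
            degrees[j] = (high - x) / (high - low)
            degrees[j + 1] = (x - low) / (high - low)
            break
    return degrees


# Hypothetical cluster centers of one attribute with three fuzzy values.
centers = [0.0, 1.0, 3.0]
row = [fuzzify(x, centers) for x in (-1.0, 0.25, 2.0, 5.0)]
print(row)
```
]]></preformat>
        <p>With this construction the membership degrees of every value sum to 1, so each sample is distributed among at most two neighboring fuzzy values of an attribute.</p>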
      </sec>
    </sec>
    <sec id="sec-4">
      <title>The Classification of EEG Signal by OFDT</title>
      <sec id="sec-4-1">
        <title>OFDT Induction for EEG Signal Classification</title>
        <p>
          According to [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ], a decision tree is a formalism that allows recognition
(classification) of a new case based on known cases. Induction of a decision tree is a process of
moving from specific examples to general models and the goal of the induction is to
learn how to classify objects by analyzing a set of instances (already solved cases),
whose classes are known. Instances are typically represented as attribute-value
vectors. A decision tree consists of test nodes (internal nodes associated with input
attributes) linked to two or more sub-trees, and leaves or decision nodes labeled with a class
defining the decision. A test node is used to compute an outcome based on values of
the attributes of the instance, where each possible outcome is associated with one of
the sub-trees. Classification of the instance starts in the root node of the tree. If this
node is a test, the outcome for the instance is determined and the process continues
using the appropriate sub-tree. When a leaf is eventually encountered, its label gives
the predicted class of the instance. The OFDT is one of the possible types of decision
trees. It permits operating on fuzzy data (attributes), and the term "ordered" means that all
nodes at the same level of the decision tree are associated with the same attribute [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ].
The level of a node is defined by the number of nodes occurring on the path from the
root to the node.
        </p>
        <p>
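          Let us consider the application of OFDT to the classification of an EEG signal that is
represented by reduced features with fuzzy values. The splitting criterion for OFDT
induction is the Cumulative Mutual Information (CMI) <inline-formula><tex-math>I(B; A_{i_1}, A_{i_2}, \ldots, A_{i_{z-1}}, A_{i_z})</tex-math></inline-formula>
[
          <xref ref-type="bibr" rid="ref17">17</xref>
          ], [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ], where <inline-formula><tex-math>A_{i_1}, A_{i_2}, \ldots, A_{i_{z-1}}</tex-math></inline-formula> is the sequence of attributes associated with the nodes from the root to the
investigated node and <inline-formula><tex-math>z</tex-math></inline-formula> is the level of the investigated node. The attribute with the greatest
value of CMI is chosen to be associated with all nodes at level <inline-formula><tex-math>z</tex-math></inline-formula> of the tree. The criterion
for choosing the attribute that will be associated with level <inline-formula><tex-math>z</tex-math></inline-formula> of the tree can also take into
account the cost of obtaining the value of attribute <inline-formula><tex-math>A_{i_z}</tex-math></inline-formula>. This characteristic of a
specific attribute is denoted as <inline-formula><tex-math>Cost(A_{i_z})</tex-math></inline-formula>. The splitting criterion also tries to reduce
the number of branches in multi-valued attributes. This can be achieved using the
entropy <inline-formula><tex-math>H(A_{i_z})</tex-math></inline-formula> of the investigated attribute. Therefore, the criterion for choosing the
attribute that will be used for splitting has the following form:
          <disp-formula><label>(1)</label><tex-math>q = \arg\max \left( \frac{I(B; A_{i_1}, \ldots, A_{i_{z-1}}, A_{i_z})}{H(A_{i_z}) \cdot Cost(A_{i_z})} \right),</tex-math></disp-formula>
where the entropy of attribute <inline-formula><tex-math>A_{i_z}</tex-math></inline-formula> is computed as
          <disp-formula><tex-math>H(A_{i_z}) = \sum_{j=1}^{m_{i_z}} M(A_{i_z,j}) \cdot (\log_2 k - \log_2 M(A_{i_z,j})),</tex-math></disp-formula>
where <inline-formula><tex-math>k</tex-math></inline-formula> is the count of data samples and <inline-formula><tex-math>z</tex-math></inline-formula> is the investigated level of the OFDT.</p>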
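        <p>The entropy term and the selection rule (1) can be sketched as follows. The computation of the CMI itself is not shown; its values, the costs, and all names in the sketch are hypothetical inputs chosen for the example.</p>
        <preformat><![CDATA[
```python
import math


def fuzzy_entropy(value_columns, k):
    """Entropy of a fuzzy attribute: sum over its values of M * (log2 k - log2 M).

    value_columns holds, for every fuzzy value of the attribute, the list of
    membership degrees of the k samples; M is the cardinality of that value.
    """
    total = 0.0
    for column in value_columns:
        m = sum(column)
        if m > 0:  # a value with zero cardinality contributes nothing
            total += m * (math.log2(k) - math.log2(m))
    return total


def choose_attribute(cmi, entropy, cost):
    """Pick the attribute maximizing CMI / (entropy * cost), as in criterion (1)."""
    return max(cmi, key=lambda a: cmi[a] / (entropy[a] * cost[a]))


# Hypothetical attribute with two fuzzy values over k = 4 samples.
h = fuzzy_entropy([[1, 1, 0, 0], [0, 0, 1, 1]], k=4)

# Hypothetical CMI, entropy, and cost values of two candidate attributes.
best = choose_attribute(cmi={"A1": 3.0, "A2": 2.0},
                        entropy={"A1": 4.0, "A2": 2.0},
                        cost={"A1": 1.0, "A2": 1.0})
print(h, best)
```
]]></preformat>
        <p>In this hypothetical case attribute A2 is preferred although its CMI is smaller, because dividing by the entropy penalizes the attribute that spreads the samples over its values more widely.</p>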
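        <p>The OFDT algorithm uses a pruning technique for establishing leaf nodes. The goal
of pruning is to remove the parts of an OFDT that contribute little to the
classification of new instances. The pruning method used here establishes leaf nodes by
stopping the tree expansion during the induction phase. The pruning procedure uses
threshold values <inline-formula><tex-math>\alpha</tex-math></inline-formula> and <inline-formula><tex-math>\beta</tex-math></inline-formula>. Threshold <inline-formula><tex-math>\alpha</tex-math></inline-formula> reflects the minimal frequency of occurrences
in a given branch. The frequency of a branch reflects the percentage of instances
belonging to the given branch. (Note that one instance can belong to more than one
branch because of the use of fuzzy logic.) Threshold <inline-formula><tex-math>\beta</tex-math></inline-formula> represents the maximal
confidence level computed in a given node. The confidence level expresses the likelihood of
the decision taken. Every internal node of the tree is declared a leaf if at least one
of the following conditions is satisfied:
          <disp-formula><label>(2)</label><tex-math>\alpha \ge \frac{M(A_{i_1,j_1} \times \ldots \times A_{i_z,j_z})}{k}</tex-math></disp-formula>
          <disp-formula><label>(3)</label><tex-math>\beta \le 2^{-I(B_j | A_{i_1,j_1}, \ldots, A_{i_z,j_z})}</tex-math></disp-formula></p>
        <p>
          where <inline-formula><tex-math>A_{i_1,j_1}, \ldots, A_{i_z,j_z}</tex-math></inline-formula> is the sequence of specific values of attributes <inline-formula><tex-math>A_{i_1}, \ldots, A_{i_z}</tex-math></inline-formula> (this
sequence corresponds to a path from the root <inline-formula><tex-math>A_{i_1}</tex-math></inline-formula> to node <inline-formula><tex-math>A_{i_z}</tex-math></inline-formula>), <inline-formula><tex-math>B_j</tex-math></inline-formula> represents the <inline-formula><tex-math>j</tex-math></inline-formula>-th value
of output attribute <inline-formula><tex-math>B</tex-math></inline-formula>, where <inline-formula><tex-math>j = 1, 2, \ldots, m_b</tex-math></inline-formula>, and <inline-formula><tex-math>I(B_j | A_{i_1,j_1}, \ldots, A_{i_z,j_z})</tex-math></inline-formula> is computed as
<inline-formula><tex-math>\log_2 M(A_{i_1,j_1} \times \ldots \times A_{i_z,j_z}) - \log_2 M(A_{i_1,j_1} \times \ldots \times A_{i_z,j_z} \times B_j)</tex-math></inline-formula>. These threshold values have
a strong influence on the number of tree levels and the depth of branches (paths from the root to a specific
node). The branch depth is defined as the number of nodes of the given branch.
Increasing the value of <inline-formula><tex-math>\beta</tex-math></inline-formula> increases the depth of tree branches. The parameter <inline-formula><tex-math>\alpha</tex-math></inline-formula> also
affects the depth of the tree. In this case, a bigger <inline-formula><tex-math>\alpha</tex-math></inline-formula> causes a smaller depth of
branches. The threshold values should be chosen carefully to obtain an accurate
classification. If <inline-formula><tex-math>\alpha = 0</tex-math></inline-formula> and <inline-formula><tex-math>\beta = 1</tex-math></inline-formula>, the classification is very accurate, but only for training
instances. Branches with small frequencies lead to misclassification of new
instances [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ]. We use a simple method to determine the thresholds. The values
of <inline-formula><tex-math>\alpha</tex-math></inline-formula> and <inline-formula><tex-math>\beta</tex-math></inline-formula> are adjusted by repeated induction of the OFDT with different
combinations of the thresholds. After the heuristic finishes, the best combination is chosen.
The process is shown in Fig. 3.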
        </p>
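        <p>The two stopping conditions can be sketched as follows, assuming that the frequency of a branch is the cardinality of the product of its fuzzy values divided by the number of samples, and that the confidence of class j is the cardinality of the branch product multiplied by the j-th output value, divided by the cardinality of the branch. Function names and the small data set are hypothetical.</p>
        <preformat><![CDATA[
```python
def branch_statistics(branch_degrees, class_columns):
    """Frequency of a branch and its confidence level for every output class.

    branch_degrees: membership of every sample in the product of the fuzzy
    values on the path; class_columns: membership of every sample in each class.
    """
    k = len(branch_degrees)
    m_branch = sum(branch_degrees)
    frequency = m_branch / k
    if m_branch == 0:
        return frequency, [0.0 for _ in class_columns]
    confidences = [
        sum(b * c for b, c in zip(branch_degrees, column)) / m_branch
        for column in class_columns
    ]
    return frequency, confidences


def is_leaf(frequency, confidences, alpha, beta):
    """A node becomes a leaf if the branch is too rare or confident enough."""
    too_rare = alpha >= frequency
    confident = any(beta <= c for c in confidences)
    return too_rare or confident


# Hypothetical branch memberships of four samples and two crisp classes.
freq, conf = branch_statistics([1.0, 0.5, 0.5, 0.0],
                               [[1, 1, 0, 0], [0, 0, 1, 1]])
print(freq, conf, is_leaf(freq, conf, alpha=0.1, beta=0.65))
```
]]></preformat>
        <p>With thresholds 0.1 and 0.65, a node with frequency 0.543 and confidence levels 0.547 and 0.453 satisfies neither condition and is therefore expanded further.</p>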
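        <p>Modifying the threshold values <inline-formula><tex-math>\alpha</tex-math></inline-formula> and <inline-formula><tex-math>\beta</tex-math></inline-formula> allows induction of an OFDT with accuracy
that meets the requirements of the problem. OFDTs for different values of thresholds <inline-formula><tex-math>\alpha</tex-math></inline-formula>
and <inline-formula><tex-math>\beta</tex-math></inline-formula> can have different structure and accuracy.</p>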
      </sec>
      <sec id="sec-4-2">
        <title>Example of OFDT Induction</title>
        <p>In the next part of this section, the induction of the OFDT is illustrated. The
illustration is done for branch <inline-formula><tex-math>A_{1,1}, A_{2,3}</tex-math></inline-formula>. At the beginning of the OFDT induction, we
have set F of unused attributes. From set F we have to choose the attribute with the
maximal value of splitting criterion (1). We computed that the attribute with the maximal
value of the splitting criterion is <inline-formula><tex-math>A_1</tex-math></inline-formula>. Therefore, this attribute is established as the root
of the tree. The frequency in the root calculated by (2) is always equal to 1. Attribute
<inline-formula><tex-math>A_1</tex-math></inline-formula> has 2 fuzzy values, <inline-formula><tex-math>A_{1,1}</tex-math></inline-formula> and <inline-formula><tex-math>A_{1,2}</tex-math></inline-formula>. These values are associated with the outcomes
of the root. Attribute <inline-formula><tex-math>A_1</tex-math></inline-formula> is removed from set F of unused attributes. The first level
of the decision tree is displayed in Fig. 4.</p>
        <p>Fig. 4. The first level of OFDT</p>
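        <p>At the second level, the attribute with the maximal value of (1) is again selected from set F.
In this case, attribute <inline-formula><tex-math>A_2</tex-math></inline-formula> has the maximal value of (1). This attribute is
associated with the nodes at the second level of the tree. Next, it is necessary to check whether
some node at the second level can be a leaf. A node is established as a leaf if at least
one of conditions (2) or (3) is satisfied. In the explained branch <inline-formula><tex-math>A_{1,1}, A_{2,3}</tex-math></inline-formula>, the node reached by <inline-formula><tex-math>A_{1,1}, A_2</tex-math></inline-formula>
has a frequency of 0.543, which is greater than the minimal frequency α, so condition (2) is not satisfied. Its
confidence levels are 0.547 and 0.453, both below the maximal confidence <inline-formula><tex-math>\beta</tex-math></inline-formula>. This implies that condition (3) is also not satisfied
and, therefore, node <inline-formula><tex-math>A_2</tex-math></inline-formula> of branch <inline-formula><tex-math>A_{1,1}, A_2</tex-math></inline-formula> cannot be a leaf.</p>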
        <p>At the third level, unused attribute <inline-formula><tex-math>A_{10}</tex-math></inline-formula> is chosen. This attribute is removed from set
F of unused attributes. At the third level, two nodes become leaves. A leaf is established
in branch <inline-formula><tex-math>A_{1,1}, A_{2,3}</tex-math></inline-formula> because condition (2) for minimal frequency is satisfied.</p>
        <p>In a similar way, the full OFDT displayed in Fig. 7 can be induced. Every level of
this OFDT has exactly one attribute: the first level has one node with attribute A1, the
second level includes 2 nodes with attribute A2, and the third level has nodes associated
with attribute A10. At the third level, the nodes that are leaves are labeled LEAF.
Every node shows its frequency in the second row and its confidence levels in
the last row. The frequencies and confidence levels
are calculated by formulas (2) and (3), respectively.</p>
        <p>It may be unclear why attribute A10 was chosen at the third level of
the resulting OFDT since, after the PCA transformation, most of the
information should be in the first attributes while the last attributes should carry little
information. The reason is that the splitting criterion also takes the
output attribute into account.</p>
        <p>Please note that we did not use the best-quality fuzzification in the illustration of
OFDT induction because we needed a tree small enough to fit on the page (this
also affected the sequence of chosen attributes). For illustration purposes, we
also set threshold <inline-formula><tex-math>\alpha</tex-math></inline-formula> for minimal frequency and threshold <inline-formula><tex-math>\beta</tex-math></inline-formula> for maximal confidence
to 0.1 and 0.65, respectively. Therefore, the OFDT depicted in Fig. 7 does not correspond
to the one with the best achieved accuracy.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Evaluation</title>
      <sec id="sec-5-1">
        <title>Evaluation Procedure</title>
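        <p>An important characteristic of a classification procedure is its accuracy. The
accuracy is estimated as the ratio between the count of correctly classified instances and
the number of classified instances, which represents the percentage of properly
classified instances:
          <disp-formula><tex-math>accuracy = \frac{1}{N} \sum_{i=1}^{N} \delta(d_i),</tex-math></disp-formula>
where <inline-formula><tex-math>N</tex-math></inline-formula> is the number of classified instances, <inline-formula><tex-math>d_i</tex-math></inline-formula> is the classified instance from dataset
<inline-formula><tex-math>D</tex-math></inline-formula>, <inline-formula><tex-math>i = 1, 2, \ldots, N</tex-math></inline-formula>, and <inline-formula><tex-math>\delta(d_i)</tex-math></inline-formula> is given by:
          <disp-formula><tex-math><![CDATA[\delta(d_i) = \begin{cases} 1, & \text{if } classify(d_i) = \text{class of } d_i \\ 0, & \text{otherwise} \end{cases},]]></tex-math></disp-formula>
where function classify returns the resulting class of OFDT classification.</p>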
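        <p>The accuracy estimate can be written in a few lines. The class labels below are hypothetical and serve only to illustrate the ratio.</p>
        <preformat><![CDATA[
```python
def accuracy(predicted, actual):
    """Ratio of correctly classified instances to all classified instances."""
    hits = sum(1 for p, a in zip(predicted, actual) if p == a)
    return hits / len(actual)


# Hypothetical outcome of classifying five instances into classes AB and CDE.
actual = ["AB", "AB", "CDE", "CDE", "CDE"]
predicted = ["AB", "CDE", "CDE", "CDE", "CDE"]
score = accuracy(predicted, actual)
print(score)
```
]]></preformat>
        <p>Multiplying the returned ratio by 100 gives the accuracy in percent.</p>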
        <p>The accuracy of the classification can be estimated by two methods. In
the first, the training and testing samples are the same. Each sample <inline-formula><tex-math>d_i</tex-math></inline-formula> from dataset <inline-formula><tex-math>D</tex-math></inline-formula> is
used to induce the OFDT. Then each sample <inline-formula><tex-math>d_i \in D</tex-math></inline-formula> is classified, and the accuracy is
evaluated. The second estimation uses a divided set of instances <inline-formula><tex-math>D</tex-math></inline-formula>.
First, the set of instances <inline-formula><tex-math>D</tex-math></inline-formula> is split into two sets. The first set contains training samples
and the second one testing samples. About 20% of the samples of dataset <inline-formula><tex-math>D</tex-math></inline-formula> are used for
classification (testing samples) and about 80% for OFDT induction (training
samples). The described process is shown in Fig. 8.</p>
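        <p>The second method can be sketched as a random split of the dataset. The shuffling, the fixed seed, and the function name are assumptions of this sketch; how the samples were actually assigned to the two sets is not specified here.</p>
        <preformat><![CDATA[
```python
import random


def split_dataset(samples, test_fraction=0.2, seed=0):
    """Shuffle the samples and split them into training and testing sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical identifiers of the 500 EEG segments (5 subsets of 100 samples).
samples = list(range(500))
training, testing = split_dataset(samples)
print(len(training), len(testing))
```
]]></preformat>
        <p>The training part is used for OFDT induction and the testing part only for estimating the accuracy, so no instance takes part in both.</p>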
      </sec>
      <sec id="sec-5-2">
        <title>Evaluation of Accuracy of Algorithm for EEG Signal Classification by OFDT</title>
        <p>
          The accuracy evaluation was implemented by several experiments. According to [
          <xref ref-type="bibr" rid="ref20 ref21 ref22 ref23">7,
20, 21, 22, 23</xref>
          ], the evaluation of algorithms for EEG classification should
assess the accuracy of dividing the samples into two groups that correspond to the EEG of
healthy persons (subsets A and B) and sick persons (subsets C, D, E). Based on the
method presented in this paper, we implemented several classifications by OFDT
(Table 4). Every classification corresponded to one experiment. The main difference
between the experiments was the target of classification.
        </p>
        <p>The goal of experiment 1 was to detect epileptic segments. Therefore, only two
output classes were needed: (AB) and (CDE). The first class represents healthy
persons and the second epileptics. The classification in experiment 2 estimated the accuracy
of the division into 5 separate subsets (A, B, C, D, E) according to the dataset description
in Section 2.1. Experiment 3 focused on seizure detection. This experiment
is also a binary classification: one class represents segments with seizure activity
(E), while the second represents segments without it (ABCD). Experiment 4 aimed at identifying
the subset of each classified segment from the dataset. It was similar to experiment
2, but the subset with seizure activity (E) was not included. Experiment 5 was also
a binary classification. The subset with seizure activity (E) was removed, and the target
of the classification was to distinguish healthy (AB) and epileptic (CD) segments.</p>
        <p>
          The result of each experiment was evaluated by accuracy (Table 4). The experiments
were performed in two versions, named "No split" and "Split"
in Table 4. The "No split" version corresponds to the first method for estimating the
accuracy of the classification in Section 5.1, i.e. the whole dataset was used for OFDT
induction. In the "Split" version, the accuracy of the classification was computed using
two sets: training (80% of instances in the original dataset) and testing (the
remaining 20% of instances in the original dataset). According to the data in Table 4, the
accuracy of EEG signal classification is better for the "No split" version than for "Split",
which is expected because the tree is tested on the instances it was induced from. Note
that the evaluations in [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ], which produce better results than our method, were
performed for the "No split" version of training data.
        </p>
        <p>
          A new algorithm for EEG signal classification by OFDT was developed and
evaluated in this paper. Similarly to other algorithms for EEG classification, this algorithm
includes two steps: preliminary data transformation and classification (Fig. 2).
However, in the new algorithm, an additional procedure is included in the preliminary data
transformation. This procedure performs fuzzification of the reduced EEG signal features,
which permits taking into account the ambiguity of the data caused by
the initial EEG signal transformation (feature extraction and dimension reduction).
Due to fuzzification, special methods for fuzzy data analysis have to be used in the
classification of the EEG signal. In this paper, we used OFDT. The accuracy of the
classification by OFDT was evaluated and compared with other algorithms for EEG
classification. Various combinations of the output labels (classes) were analyzed. The
proposed classification models achieved satisfactory results in comparison with other studies.
        </p>
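        <p>The difference between the "No split" and "Split" evaluations described in this section can be sketched as follows. This is a minimal illustration under stated assumptions: OFDT itself is not reproduced here, so a simple nearest-centroid classifier (the hypothetical class NearestCentroid) stands in for it, and the helper names, the shuffling and the fixed seed are assumptions rather than details taken from the paper.</p>
        <preformat>
```python
import random


def accuracy(predicted, actual):
    """Fraction of instances whose predicted label equals the true label."""
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


def split_indices(n, train_fraction=0.8, seed=0):
    """Shuffle 0..n-1 and cut it into training and testing index lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    cut = int(round(train_fraction * n))
    return indices[:cut], indices[cut:]


class NearestCentroid:
    """Stand-in classifier (NOT OFDT): predicts the class with the closest mean."""

    def fit(self, X, y):
        sums, counts = {}, {}
        for row, label in zip(X, y):
            total = sums.setdefault(label, [0.0] * len(row))
            for j, value in enumerate(row):
                total[j] += value
            counts[label] = counts.get(label, 0) + 1
        self.centroids = {
            label: [s / counts[label] for s in total]
            for label, total in sums.items()
        }
        return self

    def predict(self, X):
        def dist(a, b):
            return sum((u - v) ** 2 for u, v in zip(a, b))

        return [
            min(self.centroids, key=lambda c: dist(row, self.centroids[c]))
            for row in X
        ]


def evaluate(X, y, make_model, split):
    """Accuracy in the "Split" (80/20) or "No split" (train == test) version."""
    if split:
        train, test = split_indices(len(X))
    else:
        train = test = list(range(len(X)))
    model = make_model().fit([X[i] for i in train], [y[i] for i in train])
    predicted = model.predict([X[i] for i in test])
    return accuracy(predicted, [y[i] for i in test])
```
        </preformat>
        <p>Calling evaluate(X, y, NearestCentroid, split=False) scores the model on the instances it was fitted to, whereas split=True scores it on held-out instances only, so the two numbers are not directly comparable.</p>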
        <p>In future work, other methods for fuzzy data analysis and classification
should be examined. The preliminary data transformation can also be realized in many ways.
For example, a feature extractor that takes the output attribute into account can be
used instead of PCA. The Welch’s method transforms the signal from the time domain to the
frequency domain, but methods that analyze signals in both domains exist; one
example is the wavelet transform. The main goal of this line of investigation
is to develop an EEG signal classification with maximal accuracy that can be used in
medical decision support systems.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgment</title>
      <p>This work is partly supported by grants VEGA 1/0038/16 and VEGA 1/0354/17.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <given-names>L. D.</given-names>
            <surname>Iasemidis</surname>
          </string-name>
          , “
          <article-title>Epileptic seizure prediction and control</article-title>
          ,”
          <source>Biomed. Eng. IEEE Trans.</source>
          , vol.
          <volume>50</volume>
          , no.
          <issue>5</issue>
          , pp.
          <fpage>549</fpage>
          -
          <lpage>558</lpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <given-names>M. H.</given-names>
            <surname>Libenson</surname>
          </string-name>
          ,
          <source>Practical approach to electroencephalography</source>
          , vol.
          <volume>48</volume>
          , no.
          <issue>11</issue>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <given-names>A.</given-names>
            <surname>Subasi</surname>
          </string-name>
          and E. Erc, “
          <article-title>Classification of EEG signals using neural network and logistic regression</article-title>
          ,”
          <source>Comput. Methods Programs Biomed.</source>
          , no.
          <issue>78</issue>
          , pp.
          <fpage>87</fpage>
          -
          <lpage>99</lpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <given-names>H. R.</given-names>
            <surname>Gupta</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Mehra</surname>
          </string-name>
          , “
          <article-title>Power Spectrum Estimation using Welch Method for various Window Techniques</article-title>
          ,”
          <source>Int. J. Sci. Res. Eng. Technol.</source>
          , vol.
          <volume>2</volume>
          , no.
          <issue>6</issue>
          , pp.
          <fpage>389</fpage>
          -
          <lpage>392</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>A.T.</given-names>
            <surname>Tzallas</surname>
          </string-name>
          and
          <string-name>
            <given-names>I.</given-names>
            <surname>Tsoulos</surname>
          </string-name>
          , “
          <article-title>Classification of EEG signals using feature creation produced by grammatical evolution</article-title>
          ,”
          <source>Proc. of the 24th Telecommunications Forum (TELFOR)</source>
          , pp.
          <fpage>411</fpage>
          -
          <lpage>414</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <given-names>L.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Rivero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Dorado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. R.</given-names>
            <surname>Munteanu</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Pazos</surname>
          </string-name>
          , “
          <article-title>Automatic feature extraction using genetic programming: An application to epileptic EEG classification</article-title>
          ,”
          <source>Expert Syst. Appl.</source>
          , vol.
          <volume>38</volume>
          , no.
          <issue>8</issue>
          , pp.
          <fpage>10425</fpage>
          -
          <lpage>10436</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <given-names>K.</given-names>
            <surname>Polat</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Güneş</surname>
          </string-name>
          , “
          <article-title>A novel data reduction method: Distance based data reduction and its application to classification of epileptiform EEG signals</article-title>
          ,”
          <source>Appl. Math. Comput.</source>
          , vol.
          <volume>200</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>10</fpage>
          -
          <lpage>27</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>D.</given-names>
            <surname>Ley</surname>
          </string-name>
          , “
          <article-title>Approximating process knowledge and process thinking: Acquiring workflow data by domain experts</article-title>
          ,”
          <source>Conf. Proc. - IEEE Int. Conf. Syst. Man Cybern.</source>
          , pp.
          <fpage>3274</fpage>
          -
          <lpage>3279</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>N.</given-names>
            <surname>Gueorguieva</surname>
          </string-name>
          and G. Georgiev, “
          <article-title>Fuzzyfication of Principle Component Analysis for Data Dimensionalty Reduction</article-title>
          ,”
          <source>2016 IEEE Int. Conf. Fuzzy Syst.</source>
          , pp.
          <fpage>1818</fpage>
          -
          <lpage>1825</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>J.</given-names>
            <surname>Rabcan</surname>
          </string-name>
          , “
          <article-title>Ordered Fuzzy Decision Trees Induction based on Cumulative Information Estimates and Its Application</article-title>
          ,”
          <source>ICETA</source>
          , p.
          <fpage>6</fpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>J.</given-names>
            <surname>Rabcan</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Zhartybayeva</surname>
          </string-name>
          , “
          <article-title>Classification by ordered fuzzy decision tree</article-title>
          ,”
          <source>Cent. Eur. Res. J.</source>
          , vol.
          <volume>2</volume>
          , no.
          <issue>2</issue>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>V.</given-names>
            <surname>Levashenko</surname>
          </string-name>
          and E. Zaitseva, “
          <article-title>Fuzzy Decision Trees in Medical Decision Making Support System</article-title>
          ,” in
          <source>2012 Federated Conference on Computer Science and Information Systems</source>
          , pp.
          <fpage>213</fpage>
          -
          <lpage>219</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13. R. G. Andrzejak,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lehnertz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Mormann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Rieke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>David</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C. E.</given-names>
            <surname>Elger</surname>
          </string-name>
          , “
          <article-title>Indications of nonlinear deterministic and finite-dimensional structures in time series of brain electrical activity: dependence on recording region and brain state</article-title>
          ,”
          <source>Phys. Rev. E. Stat. Nonlin. Soft Matter Phys.</source>
          , vol.
          <volume>64</volume>
          , no.
          <issue>6 Pt 1</issue>
          , p.
          <fpage>61907</fpage>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yuan</surname>
          </string-name>
          and
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Shaw</surname>
          </string-name>
          , “
          <article-title>Induction of fuzzy decision trees</article-title>
          ,”
          <source>Fuzzy Sets Syst.</source>
          , vol.
          <volume>69</volume>
          , no.
          <issue>2</issue>
          , pp.
          <fpage>125</fpage>
          -
          <lpage>139</lpage>
          ,
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15. I. T. Jolliffe,
          <source>Principal component analysis</source>
          , 2nd ed. NY: Springer,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <given-names>J. I.</given-names>
            <surname>Maletic</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Marcus</surname>
          </string-name>
          ,
          <source>Data Mining and Knowledge Discovery Handbook</source>
          .
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <given-names>V. G.</given-names>
            <surname>Levashenko</surname>
          </string-name>
          and
          <string-name>
            <given-names>E. N.</given-names>
            <surname>Zaitseva</surname>
          </string-name>
          , “
          <article-title>Usage of New Information Estimations for Induction of Fuzzy Decision Trees</article-title>
          ,”
          <source>Lect. Notes Comput. Sci.</source>
          , vol.
          <volume>2412</volume>
          , pp.
          <fpage>493</fpage>
          -
          <lpage>499</lpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <given-names>V.</given-names>
            <surname>Levashenko</surname>
          </string-name>
          , E. Zaitseva,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kvassay</surname>
          </string-name>
          , and Deserno, “
          <article-title>Reliability Estimation of Healthcare Systems using Fuzzy Decision Trees</article-title>
          ,”
          <source>Ann. Comput. Sci. Inf. Syst.</source>
          , vol.
          <volume>8</volume>
          , pp.
          <fpage>331</fpage>
          -
          <lpage>340</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Quinlan</surname>
          </string-name>
          , “
          <article-title>Induction of Decision Trees</article-title>
          ,”
          <source>Mach. Learn.</source>
          , vol.
          <volume>1</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>81</fpage>
          -
          <lpage>106</lpage>
          ,
          <year>1986</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20. E. D. Übeyli, “
          <article-title>Wavelet/mixture of experts network structure for EEG signals classification</article-title>
          ,”
          <source>Expert Syst. Appl.</source>
          , vol.
          <volume>34</volume>
          , no.
          <issue>3</issue>
          , pp.
          <fpage>1954</fpage>
          -
          <lpage>1962</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21. U. Orhan,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hekim</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Ozer</surname>
          </string-name>
          , “
          <article-title>EEG signals classification using the K-means clustering and a multilayer perceptron neural network model</article-title>
          ,”
          <source>Expert Syst. Appl.</source>
          , vol.
          <volume>38</volume>
          , no.
          <issue>10</issue>
          , pp.
          <fpage>13475</fpage>
          -
          <lpage>13481</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <given-names>K.</given-names>
            <surname>Polat</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Güneş</surname>
          </string-name>
          , “
          <article-title>Artificial immune recognition system with fuzzy resource allocation mechanism classifier, principal component analysis and FFT method based new hybrid automated identification system for classification of EEG signals</article-title>
          ,”
          <source>Expert Syst. Appl.</source>
          , vol.
          <volume>34</volume>
          , no.
          <issue>3</issue>
          , pp.
          <fpage>2039</fpage>
          -
          <lpage>2048</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Naderi</surname>
          </string-name>
          , “
          <article-title>Analysis and classification of EEG signals using spectral analysis and recurrent neural networks</article-title>
          ,”
          <source>Biomed. Eng. (NY)</source>
          , no.
          <issue>11</issue>
          , pp.
          <fpage>3</fpage>
          -
          <lpage>4</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <given-names>L.</given-names>
            <surname>Rokach</surname>
          </string-name>
          and
          <string-name>
            <given-names>O.</given-names>
            <surname>Maimon</surname>
          </string-name>
          ,
          <source>Data Mining with Decision Trees: Theory and Applications</source>
          , Qual. Assur.,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>