=Paper=
{{Paper
|id=Vol-2255/paper4
|storemode=property
|title=Electroencephalogram Signals Analysis by Fuzzy Classifiers based on Cumulative Mutual Information
|pdfUrl=https://ceur-ws.org/Vol-2255/paper4.pdf
|volume=Vol-2255
|authors=Jan Rabcan,Elena Zaitseva,Vitaly Levashenko
|dblpUrl=https://dblp.org/rec/conf/iddm/RabcanZL18
}}
==Electroencephalogram Signals Analysis by Fuzzy Classifiers based on Cumulative Mutual Information==
Jan Rabcan [0000-0003-2835-9114], Elena Zaitseva [0000-0002-9087-0311], Vitaly Levashenko [0000-0003-1932-3603]

University of Zilina, Department of Informatics, Univerzitna 8215/1, 010 26, Zilina, Slovakia

(jan.rabcan, elena.zaitseva, vitaly.levashenko)@fri.uniza.sk

Abstract. A new algorithm for EEG signal classification is proposed in this paper. A typical algorithm for signal classification includes two steps: preliminary transformation and classification. The preliminary transformation converts the investigated signals, through feature extraction and dimension reduction, into a data set suitable for classification. This transformation causes the loss of some information that could improve the classification accuracy. In the proposed algorithm, a new fuzzification procedure is added to the preliminary transformation. This procedure allows a fuzzy classifier to be used in the second step. The Fuzzy Random Forest is used for the classification of EEG signals in the proposed algorithm. The new procedure (fuzzification) in the preliminary transformation and the new fuzzy classifier (Fuzzy Random Forest) make it possible to increase the classification accuracy of EEG signals. The efficiency of the new algorithm is evaluated in two investigations. The first focuses on epilepsy diagnostics and the second on epileptic seizure detection. The comparison with other studies shows that the proposed algorithm increases the classification accuracy of EEG signals.

Keywords: Electroencephalogram (EEG), Fuzzification, Fuzzy Decision Tree, Fuzzy Random Forest.

===1 Introduction===
Epilepsy is considered one of the most common chronic neurological disorders. Statistics of the World Health Organization show that approximately 50 million people worldwide suffer from epilepsy [1].
The most notable symptoms of epilepsy are recurrent unprovoked seizures, which usually occur without warning and can affect any part of the body [2]. Seizures are caused by the electrical activity of the brain, specifically by unexpected electrical disturbances and excessive neuronal discharge. This brain activity can be monitored by an electroencephalogram (EEG) [3].

The ability to measure the electrical activity of the brain makes EEG one of the most important tools in neurological diagnostics [5, 6]. EEG measures electrical brain activity through electrodes placed on the scalp [1, 2]. The captured measurements are signals, which are converted into graphic form as curves [4]. The shape and character of a captured curve are strongly affected by the current brain activity. Curves obtained from EEG can be evaluated visually by specialized doctors, who use them in the diagnosis of neural disorders. However, such evaluation might not provide enough information for a reliable neurological diagnosis. Diagnosis based on EEG can be significantly improved by developing algorithms for the automatic analysis of EEG signals. Such algorithms must involve automatic extraction and analysis of information from the captured signal. Based on the extracted information, the brain state can be predicted or classified. In this paper, an algorithm for EEG signal analysis is proposed. The algorithm is demonstrated on the detection of epileptic seizures and the diagnosis of epilepsy.

Reliable and credible algorithms for automatic EEG signal analysis should be developed on real data. There is a well-known dataset of EEG signals collected and published by R. G. Andrzejak [6]. This dataset consists of 500 samples of EEG signals evenly divided into five disjoint subsets A, B, C, D, and E. Each subset therefore contains 100 samples of EEG signals.
Each of these samples represents an EEG signal captured during 23.6 seconds of EEG recording. Subsets A and B contain measurements recorded from persons who do not suffer from epilepsy. In subset A, the subjects had their eyes open during recording, while in subset B they had their eyes closed. According to [6], whether the eyes are open or closed affects the electrical activity of the brain. The samples in subsets C, D, and E were recorded from persons suffering from epilepsy. While the patients in subsets C and D were recorded during seizure-free intervals, subset E contains samples measured during seizure activity only. Subset C consists of vectors obtained from the hippocampal formation of the opposite hemisphere of the brain. The data vectors in subset D were recorded from within the epileptogenic zone. Examples of samples from these subsets are shown in Fig. 1.

Fig. 1. Examples of raw signals, randomly chosen from the dataset [6]

Fig. 1 shows that the signals in A and E differ from the signals in B, C, and D. Nevertheless, visual inspection alone can be insufficient to distinguish healthy (A, B) from epileptic (C, D, E) EEG segments. The similarity between these groups is most obvious for samples from B, C, and D. These samples are very similar to each other, and relying on visual inspection alone can lead to a diagnostic failure (a patient suffering from epilepsy diagnosed as healthy). This example shows that algorithms for automatic EEG signal analysis can be instrumental in diagnostics. They allow deciding whether a person has epilepsy without knowing whether the person has had an epileptic seizure. Another task is the detection of seizure activity. The development and analysis of these algorithms can also support the growth of decision support systems for the early diagnosis of epilepsy [3–5].

Algorithms for automated EEG signal analysis have been considered in several works.
Different classification methods are used in these investigations, for example, the K-nearest neighbor classifier [7], neural networks [5, 8, 9], support vector classifiers [10, 11], and decision trees [12]. The classification accuracy of EEG signals can be increased by applying evolutionary methods [7] and clustering analysis [13]. Analysis of these investigations shows that one of the effective methods for classifying EEG signals to identify persons with epilepsy is based on decision trees [12] and Fuzzy Decision Trees (FDT) [14, 15]. This paper develops the approach of EEG signal classification by FDT and uses the ordered FDT (oFDT) introduced in [16]. This tree has a regular structure that contains exactly one attribute at each level of the tree [17]. This property allows the classification to be parallelized. The evaluation in [17] showed that EEG signal classification by this tree is possible and quite accurate. However, the classification accuracy may be increased further, which is considered in this paper. In particular, the application of the Fuzzy Random Forest (FRF) is considered.

In addition to the classification procedure, typical algorithms for EEG signal classification include a preliminary transformation of the signal into numerical attributes [18, 19]. The analyzed signal must be preprocessed to extract the information needed to create classification models. Many preprocessing schemes can be found in the literature. Most of them consist of two steps: feature extraction and dimensional reduction. Feature extraction usually includes a wavelet or spectral transformation. In [12], the discrete wavelet transformation and the Discrete Fourier Transformation (DFT) were compared. The results of that paper show better performance of DFT in combination with the C4.5 method for decision tree induction. In [20] and [9], Welch's method was used.
These studies indicate that power spectral density estimation by Welch's method provides strong attributes that represent EEG signals well. The second typical step of preprocessing is dimensional reduction. This step is required when feature extraction produces a data matrix of large dimension. Dimensional reduction usually transforms the data matrix into a matrix of significantly smaller dimension. In [10], three techniques were compared: Principal Component Analysis (PCA), linear discriminant analysis, and independent component analysis. PCA, which is used in this paper, is recommended especially when the number of training samples per class is small [21].

In comparison with the preprocessing of EEG signals mentioned above, the algorithm we presented in [14] and [15] adds fuzzy logic to the preliminary data transformation as an additional step. This step is fuzzification, which can significantly reduce possible uncertainty in the data. This paper presents a detailed investigation of this step in combination with different fuzzy classifiers: FDTs and FRF. This analysis allows an accurate method for EEG signal analysis to be developed.

===2 Structure of new algorithm for EEG signal classification===
The proposed algorithm for epileptic seizure detection from EEG signals includes two steps (Fig. 2): the preliminary transformation of the EEG signal and the classification. The differences between traditional algorithms of signal classification and the algorithm proposed in this paper are (a) the additional fuzzification procedure in the preliminary transformation step and (b) the fuzzy classifier in the classification step. Adding fuzzification makes it possible to use fuzzy classification and to take into account the ambiguity of the transformed and reduced initial EEG signal.
In papers [22–24], it has been shown that fuzzy data represent ambiguous and imprecise data with higher accuracy, and that using fuzzy data instead of numerical data allows the classification accuracy to be increased. The principal steps of the proposed algorithm for EEG signal classification are shown in the figure below.

Fig. 2. The main steps of the proposed algorithm (diagram: Dataset of Signals; Preliminary Data Transformation consisting of Feature Extraction (Welch), Dimensional Reduction (PCA), and Fuzzification; Fuzzy Classifiers (oFDT, uFDT, FRF); Result Analysis)

Signal preprocessing and the preliminary transformation of EEG signals cause the loss of some information from the initial signal. As a consequence, ambiguous data have to be classified. Fuzzy classifiers can be used for effective classification in this case [25]. These classifiers can be applied only if the input data for the classification are fuzzy. However, in standard algorithms of EEG signal classification, the data after the preliminary transformation are not fuzzy, even if the procedures of attribute extraction and dimension reduction include a fuzzy transformation. For example, paper [26] proposes an algorithm for EEG signal classification that applies an adaptive neuro-fuzzy inference system. Other fuzzy techniques are described in [27]. At the same time, the classification in these algorithms is not based on a fuzzy classifier. Classification can be improved by applying a fuzzy classifier, which requires the data after the preliminary transformation to be represented as fuzzy data.

Moreover, some information is lost during EEG signal preprocessing in the preliminary data transformation, so ambiguity and uncertainty of the initial data can occur. Fuzzification can reduce this possible ambiguity. Fuzzy logic is also tolerant of imprecise data and deviations during data measurement.
Fuzzy logic can also model nonlinear functions of arbitrary complexity [28]. These facts allow the ambiguity and uncertainty of the transformed data to be reduced. The data obtained after the preliminary data transformation were used to create three fuzzy classifiers. In the next sections, the steps of the preliminary data transformation are described in detail.

===3 Preliminary Data Transformation===
The preliminary data transformation developed in [14] and [15] was evaluated by classifying EEG signals from the dataset gathered in [6] (Fig. 1). According to the description in [6], the same 128-channel amplifier system was used to record each EEG signal. The recorded data were converted to digital form by 12-bit analog-to-digital conversion and then written continuously onto the disk of a data acquisition computer system at a sampling rate of 173.61 Hz. The band-pass filter was set to 0.53–40 Hz. As mentioned in the previous section, the signal obtained by this recording is a function of time. Therefore, the preliminary data transformation is necessary to extract useful information that can be used to build classifiers and classify new EEG records.

====3.1 Feature Extraction====
The signal preprocessing performed by the preliminary data transformation consists of three steps (Fig. 2). First, feature extraction by Welch's method is applied [29]. The method is based on periodogram spectrum estimates, which result from converting a signal from the time domain to the frequency domain. A periodogram is commonly used to examine the amplitude versus frequency characteristics of the analyzed signal. The time series of a signal is split into overlapping segments (Fig. 3) of a predefined length, which is the same for every periodogram. Then the periodogram of each segment is computed and used to find important periodic components in the time series. In the last step, the periodograms are averaged.
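The three stages of Welch's method described above (overlapping segmentation, one periodogram per segment, averaging) can be sketched in a few lines of plain Python. This is only an illustrative sketch under stated assumptions: the Hann window, the segment length of 256 samples, the 50 % overlap, and the synthetic 10 Hz sine standing in for an EEG record are all choices made here for the example and are not taken from the paper. Only the sampling rate of 173.61 Hz comes from the dataset description.

```python
import cmath
import math


def periodogram(segment, fs):
    """One-sided periodogram of a real segment (naive DFT, Hann window assumed)."""
    n = len(segment)
    window = [0.5 - 0.5 * math.cos(2 * math.pi * k / n) for k in range(n)]
    scale = fs * sum(w * w for w in window)
    tapered = [s * w for s, w in zip(segment, window)]
    power = []
    for f in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * f * k / n)
                    for k, x in enumerate(tapered))
        p = abs(coeff) ** 2 / scale
        if 0 < f < n / 2:
            p *= 2  # fold the negative frequencies into the one-sided estimate
        power.append(p)
    return power


def welch_psd(signal, fs, seg_len, overlap):
    """Average the periodograms of overlapping segments of equal length."""
    step = seg_len - overlap
    starts = range(0, len(signal) - seg_len + 1, step)
    spectra = [periodogram(signal[s:s + seg_len], fs) for s in starts]
    psd = [sum(column) / len(spectra) for column in zip(*spectra)]
    freqs = [f * fs / seg_len for f in range(seg_len // 2 + 1)]
    return freqs, psd


fs = 173.61  # sampling rate reported for the dataset
# Synthetic 10 Hz tone used as a stand-in for a real EEG record (assumption).
signal = [math.sin(2 * math.pi * 10.0 * t / fs) for t in range(1024)]
freqs, psd = welch_psd(signal, fs, seg_len=256, overlap=128)
peak_freq = freqs[psd.index(max(psd))]
```

For the synthetic tone, the largest value of `psd` falls in the frequency bin closest to 10 Hz. In practice a library routine would normally replace the naive DFT used here.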
The average of the periodograms is known as the power spectral density estimate of the analyzed signal [29].

Fig. 3. Illustration of segmentation by Welch's method. A periodogram is estimated for each segment, and the obtained periodograms are averaged.

After applying Welch's method to the EEG signals, we obtain a feature matrix of 128 features. Each row of this matrix represents one EEG signal in the frequency domain. This number of features is too large to build a reliable classifier (there are too many features relative to the number of data instances). To decrease the number of features, we applied a dimensional reduction technique. We use PCA, which is commonly used in EEG signal analysis [20].

====3.2 Dimensionality Reduction====
In this paper, we use PCA to reduce the dimension of the feature matrix obtained by the spectral transformation. PCA transforms a feature matrix described by $n$ features $Y = (Y_1, \dots, Y_n)$ into linearly uncorrelated latent variables called "principal components" [30]. The transformation is carried out so that the first component has the largest possible variance. Each succeeding component also has the largest possible variance, under the constraint that it is orthogonal to the preceding components. A larger variance of a component indicates larger variability in the data, so the variance of a component characterizes its importance. Hence, the first component is considered the most significant. PCA performs this transformation using the eigenvectors $V = (v_1\ v_2\ \cdots\ v_n)$ of the covariance matrix

$$\boldsymbol{M} = \frac{Y Y^{\mathrm{T}}}{n-1}.$$

An eigenvector of matrix $\boldsymbol{M}$ is a vector $v_i$ such that $\boldsymbol{M} v_i = \lambda_i v_i$, where $\lambda_i$ is the eigenvalue corresponding to $v_i$, given by $|\boldsymbol{M} - \lambda I| = 0$, where $I$ is the unit matrix. The principal components are computed as $X = V^{\mathrm{T}} \times Y^{\mathrm{T}}$, $Y = (Y_1, \dots, Y_n)$. This transformation forms orthogonal principal components.

After the transformation, only a few important principal components need to be chosen.
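As a small worked illustration of the eigenvalue equations above, consider a hypothetical $2 \times 2$ covariance matrix. The numbers are chosen here only to keep the arithmetic simple; they are not taken from the EEG data.

```latex
\boldsymbol{M} = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix},
\qquad
|\boldsymbol{M} - \lambda I| = (2 - \lambda)^2 - 1 = 0
\;\Rightarrow\;
\lambda_1 = 3,\quad \lambda_2 = 1,

v_1 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix},
\qquad
v_2 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix},
\qquad
v_1^{\mathrm{T}} v_2 = 0,

\frac{\lambda_1}{\lambda_1 + \lambda_2} = \frac{3}{4}.
```

In this toy case the two eigenvectors are orthogonal, and the first principal component carries three quarters of the total variance. This ordering by variance is the information used when deciding how many components to keep.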
Many criteria exist for choosing the number of principal components. One of the most commonly used is the eigenvalue-one criterion, also known as the Kaiser criterion [31]. This criterion selects a principal component as significant if its variance is higher than 1.00. As a result of PCA, we obtained 8 principal components, which describe the reduced feature matrix of the original EEG signals. Each signal is therefore described by 8 principal components. In the text below, these principal components are denoted as numerical input attributes $X_i$ $(i = 1, \dots, 8)$ of the EEG signals.

====3.3 Data Fuzzification====
We propose to use fuzzification as part of the preliminary data transformation. This process transforms each numeric attribute $X_i$ obtained after dimensionality reduction into a fuzzy attribute $A_i$ $(i = 1, \dots, n)$. After this transformation, each fuzzy attribute $A_i$ is composed of $m_i$ $(m_i \ge 2)$ linguistic terms. The $j$-th linguistic term of attribute $A_i$ is defined by a fuzzy set $A_{i,j}$ $(j = 1, \dots, m_i)$. A fuzzy set $A_{i,j}$ with respect to a universe $X_i$ is defined by a membership function $\mu_{A_{i,j}}(x): X_i \to \langle 0, 1 \rangle$. This function defines a membership degree $\mu_{A_{i,j}}(x)$ for each element $x$ $(x \in X_i)$, which expresses how strongly element $x$ belongs to fuzzy set $A_{i,j}$. Formally, fuzzy set $A_{i,j}$ is defined as a set of ordered pairs $A_{i,j} = \{(x, \mu_{A_{i,j}}(x)),\ x \in X_i\}$, where: (a) $\mu_{A_{i,j}}(x) = 0$ if and only if $x$ is not a member of set $A_{i,j}$; (b) $0 < \mu_{A_{i,j}}(x) < 1$ if and only if $x$ is a partial (not full) member of set $A_{i,j}$; and (c) $\mu_{A_{i,j}}(x) = 1$ if and only if $x$ is a full member of set $A_{i,j}$.

To perform fuzzification of the analyzed data, we used the algorithm proposed in [32]. This algorithm is based on computing the fuzzy entropy of fuzzy sets. The algorithm divides the values $x \in X_i$ into $m_i$ intervals. The intervals are defined by points $C_1, \dots, C_{m_i}$. We find these borders by the K-means algorithm as in [32]. The number $m_i$ of intervals is determined automatically.
This number equals the number of linguistic terms $m_i$ of fuzzy attribute $A_i$. The algorithm starts with two linguistic terms and then adds one linguistic term to the attribute at a time. Adding linguistic terms is repeated until the fuzzy entropy of the attribute stops rising. The fuzzy entropy used by this algorithm is defined as follows:

$$FE(A_{i,j}) = -\sum_{b=1}^{m_b} D^{b}_{A_{i,j}} \times \log_2 D^{b}_{A_{i,j}},$$

where $m_b$ is the number of classes defined by output attribute $B$. The notation $x \in b$ means that $x$ belongs to class $b$. Then $D^{b}_{A_{i,j}}$ is defined as follows:

$$D^{b}_{A_{i,j}} = \frac{\sum_{x \in b} \mu_{A_{i,j}}(x)}{\sum_{x} \mu_{A_{i,j}}(x)}.$$

For each attribute $A_i$, the constraint $\sum_{j=1}^{m_i} \mu_{A_{i,j}}(x) = 1$ must hold for every $x$. The fuzzy entropy of attribute $A_i$ is then defined as:

$$FE(A_i) = \sum_{j=1}^{m_i} FE(A_{i,j}).$$

The fuzzification process then assigns a membership degree $\mu_{A_{i,j}}(x)$ to each value $x$ of numeric attribute $X_i$. This membership degree is obtained by a triangular membership function. The membership function of the first linguistic term $A_{i,1}$ of attribute $A_i$ is defined as follows:

$$\mu_{A_{i,1}}(x) = \begin{cases} 1, & x \le C_1 \\ \dfrac{C_2 - x}{C_2 - C_1}, & C_1 < x < C_2 \\ 0, & x \ge C_2 \end{cases}$$

Each linguistic term of $A_i$ that is neither the first nor the last has a membership function $\mu_{A_{i,q}}$ $(q = 2, 3, \dots, m_i - 1)$ defined as follows:

$$\mu_{A_{i,q}}(x) = \begin{cases} 0, & x \le C_{q-1} \\ \dfrac{x - C_{q-1}}{C_q - C_{q-1}}, & C_{q-1} < x \le C_q \\ \dfrac{C_{q+1} - x}{C_{q+1} - C_q}, & C_q < x < C_{q+1} \\ 0, & x \ge C_{q+1} \end{cases}$$

Finally, the last term $A_{i,m_i}$ of $A_i$ has the membership function:

$$\mu_{A_{i,m_i}}(x) = \begin{cases} 0, & x \le C_{m_i - 1} \\ \dfrac{x - C_{m_i - 1}}{C_{m_i} - C_{m_i - 1}}, & C_{m_i - 1} < x < C_{m_i} \\ 1, & x \ge C_{m_i} \end{cases}$$

Fuzzification is the last step of the preliminary data transformation. After fuzzification of the reduced data matrix, we obtained data represented by 8 fuzzy attributes. These data were used to induce the fuzzy classifiers and to evaluate the performance of the proposed algorithm for EEG signal analysis.

===4 Classification of Fuzzy Data===
After fuzzification, we obtained data described by 8 fuzzy input attributes $A_i$ $(i = 1, \dots, 8)$ and one output attribute $B$.
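To make the fuzzified representation concrete, the triangular membership functions and the fuzzy entropy of a single linguistic term from Section 3.3 can be sketched as follows. This is a minimal sketch, not the authors' implementation: the interval points in `centers` are hypothetical values chosen for the example (in the paper they are found by K-means as in [32]), and the function names are introduced here for illustration only.

```python
import math


def triangular_memberships(x, centers):
    """Membership degrees of x in the terms defined by sorted points C_1..C_m."""
    m = len(centers)
    degrees = [0.0] * m
    if x <= centers[0]:
        degrees[0] = 1.0          # first term saturates below C_1
    elif x >= centers[-1]:
        degrees[-1] = 1.0         # last term saturates above C_m
    else:
        for q in range(m - 1):
            left, right = centers[q], centers[q + 1]
            if left <= x <= right:
                # falling edge of term q, rising edge of term q + 1
                degrees[q] = (right - x) / (right - left)
                degrees[q + 1] = (x - left) / (right - left)
                break
    return degrees


def fuzzy_entropy(memberships, labels):
    """FE of one term: -sum_b D_b * log2(D_b), D_b = share of membership in class b."""
    total = sum(memberships)
    if total == 0:
        return 0.0
    entropy = 0.0
    for b in set(labels):
        d = sum(mu for mu, y in zip(memberships, labels) if y == b) / total
        if d > 0:
            entropy -= d * math.log2(d)
    return entropy


centers = [-1.0, 0.0, 2.0]                     # hypothetical C_1, C_2, C_3
degrees = triangular_memberships(0.5, centers)
term_entropy = fuzzy_entropy([1.0, 1.0], ["healthy", "epileptic"])
```

Here `degrees` is `[0.0, 0.75, 0.25]`: the value 0.5 lies between the second and third points, and its membership degrees sum to one, as the constraint on each attribute requires. A term whose membership mass is split evenly between two classes has a fuzzy entropy of 1.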
Based on these data, we induce three fuzzy classifiers to evaluate the new algorithm for EEG signal analysis. We use oFDTs, which are combined into the FRF.

====4.1 Fuzzy Decision Trees====
Nodes and leaves are the main parts of decision trees. Each node is associated with one input attribute (the splitting attribute). The domain of this input attribute, that is, the set of all its possible values, determines the outgoing edges of the node. Classification begins at the root of the tree. If a node is associated with an input attribute, the outgoing edge for the classified instance is determined by the value of the splitting attribute, and the classification continues in the corresponding sub-tree. When the classified instance reaches a leaf, the leaf gives the predicted class as the classification result. In fuzzy decision trees, a classified instance usually traverses multiple branches during classification, so the decision is based on several leaves.

In this paper, we use two types of fuzzy decision trees based on cumulative mutual information (CMI). The first is the oFDT. In this type of tree, all nodes on the same level are associated with one input attribute, which allows classification to be performed as a parallel process. The second type is the uFDT, which does not require the same splitting attribute on each level of the tree. The splitting criteria for these FDTs are shown in Table 1.

Table 1. Splitting criteria based on CMI.
{| class="wikitable"
! uFDT !! oFDT
|-
| $i_q = \operatorname{argmax}\left(\dfrac{\mathbf{I}(B; U_{q-1}, A_{i_q})}{\mathbf{H}(A_{i_q} \mid U_{q-1})}\right)$ || $i_q = \operatorname{argmax}\left(\dfrac{\mathbf{I}(B; \boldsymbol{U}_{q-1}, A_{i_q})}{\mathbf{H}(A_{i_q} \mid \boldsymbol{U}_{q-1})}\right)$
|}

In Table 1, argmax returns the attribute index $i_q$ with the maximal value of CMI, $U_{q-1} = \{A_{i_1,j_1} \times \dots \times A_{i_{q-1},j_{q-1}}\}$ is the fuzzy set defined by the sequence of fuzzy terms $A_{i_1,j_1}, \dots, A_{i_{q-1},j_{q-1}}$ of the selected attributes $A_{i_1}, \dots, A_{i_{q-1}}$ from the root to the $q$-th node, and $\boldsymbol{U}_{q-1}$ is the sequence of selected attributes $\{A_{i_1}, \dots, A_{i_{q-1}}\}$.
The CMI in output attribute $B$ about attribute $A_{i_q}$ and the sequence of values $U_{q-1}$ was introduced in [33] and is calculated as:

$$\mathbf{I}(B; U_{q-1}, A_{i_q}) = \sum_{j_q=1}^{m_{i_q}} \sum_{j=1}^{m_b} M(B_j \times U_{q-1} \times A_{i_q,j_q}) \times \big(\log_2 M(B_j \times U_{q-1} \times A_{i_q,j_q}) + \log_2 M(U_{q-1}) - \log_2 M(B_j \times U_{q-1}) - \log_2 M(U_{q-1} \times A_{i_q,j_q})\big),$$

where $M(B_j \times U_{q-1} \times A_{i_q,j_q})$ is the cardinality measure of fuzzy set $B_j \times U_{q-1} \times A_{i_q,j_q}$. The conditional cumulative entropy between fuzzy attribute $A_{i_q}$ and the sequence of selected attribute terms $U_{q-1}$ is defined as:

$$\mathbf{H}(A_{i_q} \mid U_{q-1}) = \sum_{j=1}^{m_{i_q}} M(A_{i_q,j} \times U_{q-1}) \times \big(\log_2 M(U_{q-1}) - \log_2 M(A_{i_q,j} \times U_{q-1})\big).$$

The implemented FDTs address the overfitting problem by a pruning technique. Overfitting occurs when leaves cover an insufficient number of instances relative to the input data. To avoid overfitting, we use threshold parameters alpha and beta to determine leaf nodes during FDT induction. These parameters are described in detail in [15].

====4.2 Fuzzy Random Forest====
A random forest is a classifier that combines a set of individual decision trees into one classifier system. The advantage of random forests over individual decision trees is stability. Decision trees are sensitive to variations in data. Generally, random forests are not sensitive to overfitting, while decision trees tend to overfit the input data [34]. Overfitting occurs when leaves contain a small number of instances relative to the number of input data. Nevertheless, pruning techniques can solve this problem. On the other hand, decision trees are well interpretable classifiers, whereas in a random forest it is hard to figure out what hundreds of trees tell. Another advantage of decision trees is classification speed: classification by many trees is significantly slower than classification by one individual decision tree. Nevertheless, many papers show higher classification accuracy of random forests compared with individual decision trees, e.g. [35]. Random forests are a type of ensemble method.
Ensemble methods combine multiple classifiers to obtain better classification accuracy than any of the constituent classifiers could achieve alone [36]. Many ensemble learning methods based on different base classifiers have been developed. One of the first is bagging [37]. Bagging creates a new dataset from the original dataset for each classifier by sampling with replacement. This means that one instance from the original dataset can occur in the sampled dataset multiple times. This approach helps to avoid overfitting. Bagging is often combined with decision trees. In [38], bagging is extended by random attribute selection. The splitting attribute for each non-leaf node is chosen, according to the splitting criterion used, from a randomly selected subset of the attributes not yet used in the analyzed branch.

In paper [39], we proposed using a splitting criterion based on CMI [17, 33] for the induction of FRF trees. An FRF consists of a defined number of individual FDTs. Each of these FDTs provides a decision (classification result). To obtain the decision of the forest, the results of the individual trees must be combined (Fig. 3). The results are combined by summing the membership degrees of belonging to each class and dividing by the number of trees. The classification strategy we use is shown in the figure below.

Fig. 2. Strategy of classification by FRF

In the implemented random forest, the number of decision trees must be specified as an input parameter. It is important to find a suitable value of this parameter because a small number of decision trees can lead to low classification power, while with a large number of decision trees the algorithm can suffer from slow computation caused by thousands of trees. Therefore, we implemented a simple iterative procedure to find the smallest value of this parameter that gives the highest classification accuracy. This procedure builds an initial FRF consisting of 3 FDTs.
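The combination rule stated in this section (summing the class membership degrees returned by the trees and dividing by the number of trees) can be sketched as below. The per-tree outputs are made-up numbers, and taking the class with the highest averaged degree as the forest decision is an assumption of this sketch; the text only specifies the averaging step.

```python
def forest_decision(tree_outputs):
    """Average per-class membership degrees over all trees, then pick the top class."""
    n_trees = len(tree_outputs)
    classes = tree_outputs[0].keys()
    averaged = {c: sum(out[c] for out in tree_outputs) / n_trees for c in classes}
    # Assumption: the forest predicts the class with the largest averaged degree.
    predicted = max(averaged, key=averaged.get)
    return averaged, predicted


# Hypothetical class membership degrees returned by three fuzzy decision trees.
tree_outputs = [
    {"seizure": 0.9, "seizure-free": 0.1},
    {"seizure": 0.6, "seizure-free": 0.4},
    {"seizure": 0.3, "seizure-free": 0.7},
]
averaged, predicted = forest_decision(tree_outputs)
```

With these three hypothetical trees the averaged degree of the "seizure" class is about 0.6 against about 0.4 for "seizure-free", so the sketch predicts "seizure" even though one tree disagrees.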
At the start of each iteration, one decision tree is added to the FRF and the accuracy is evaluated. The procedure finishes when the classification accuracy stops growing.

===5 Evaluation of classifiers===
After the preliminary data transformation, we obtained data described by 8 fuzzy input attributes and one output attribute. Based on these data, we create fuzzy classifiers. This section describes how the classification performance of the classifiers was evaluated.

To evaluate the classification performance of the implemented classifiers, we used three metrics: accuracy, sensitivity, and specificity. Sensitivity and specificity are important when the data are not evenly distributed among the classes. The definitions of the metrics are shown in the table below.

Table 2. Metrics for evaluation of classification performance
{| class="wikitable"
! Accuracy !! Sensitivity !! Specificity
|-
| $\dfrac{TP + TN}{TN + TP + FN + FP}$ || $\dfrac{TP}{TP + FN}$ || $\dfrac{TN}{TN + FP}$
|}

where TP stands for true positive, TN for true negative, FP for false positive, and FN for false negative. True positive and true negative results are obtained after correct classification, while false negative and false positive results stand for misclassification.

To evaluate the classifiers, we divided the data into training and testing sets in the ratio 70:30. The training set was used to create the classifier and the testing set to evaluate it. We repeated this 10 000 times for each training set.

The FRF needs an exact number of trees to be defined. Therefore, we analyzed how the number of trees affects the classification accuracy of the FRF. In this analysis, a forest of 3 trees was created and its accuracy was evaluated. Then the number of trees was increased by one, and the performance was re-evaluated. Trees were added as long as the accuracy of the forest kept rising.

====5.1 Evaluation of Algorithms for Classification of EEG Signals using Fuzzy Classifiers====
The classifiers were evaluated in two experiments.
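The metrics of Table 2 and the tree-adding procedure of Section 5 can be sketched as follows. The confusion counts, the accuracy curve, and the `max_trees` safety bound are hypothetical values introduced for this example; `accuracy_of` stands for whatever routine trains and tests a forest of a given size.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, and specificity from the four confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return accuracy, sensitivity, specificity


def grow_forest(accuracy_of, start=3, max_trees=100):
    """Add one tree at a time while accuracy keeps rising; return the best size."""
    best_n, best_acc = start, accuracy_of(start)
    for n in range(start + 1, max_trees + 1):
        acc = accuracy_of(n)
        if acc <= best_acc:
            break  # accuracy stopped growing
        best_n, best_acc = n, acc
    return best_n, best_acc


# Hypothetical confusion counts for one test split.
accuracy, sensitivity, specificity = classification_metrics(tp=45, tn=40, fp=10, fn=5)

# Hypothetical accuracy curve standing in for repeated FRF training and testing.
toy_curve = {3: 0.90, 4: 0.92, 5: 0.93, 6: 0.93}
best_n, best_acc = grow_forest(lambda n: toy_curve.get(n, 0.0))
```

For these counts the sketch gives an accuracy of 0.85, a sensitivity of 0.9, and a specificity of 0.8, and the toy accuracy curve stops the growth at 5 trees.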
In the first experiment, the classification aimed to recognize patients who suffer from epilepsy. The goal of the second experiment was to identify patients during an epileptic seizure. To solve these tasks, the database was divided into two classes. In the first experiment, the classes were healthy persons and persons suffering from epilepsy (AB, CDE). In the second experiment, we aimed to separate persons during seizure activity from persons in a seizure-free interval (ABCD, E). For each experiment, the number of trees necessary for accurate prediction by the FRF classifier was analyzed. The results obtained by the proposed algorithm are shown in Table 3.

Table 3. Obtained results
{| class="wikitable"
! Classifier !! Predicted !! Accuracy !! Sensitivity !! Specificity
|-
| uFDT || Seizure || 0.991 || 0.996 || 0.986
|-
| oFDT || Seizure || 0.980 || 0.992 || 0.966
|-
| FRF || Seizure || 0.995 || 0.999 || 0.992
|-
| uFDT || Epilepsy || 0.906 || 0.944 || 0.846
|-
| oFDT || Epilepsy || 0.897 || 0.942 || 0.821
|-
| FRF || Epilepsy || 0.894 || 0.946 || 0.818
|}

The implemented FRF requires the number of trees to be specified as an input parameter. This parameter affects the computational complexity of the FRF. With a large number of trees, the total time needed for FRF induction can be long. Classification by thousands of trees can also be costly. Therefore, this parameter must be set to a proper value. To find a proper value, an FRF of 3 trees was induced and its accuracy was estimated. Then the number of trees was increased by one, and the performance was evaluated again. Trees were added as long as the accuracy of the forest kept rising. The results of this analysis are shown in the plots below. There are four plots. They show the classification accuracy of the FRF depending on the number of induced FDTs. The x axes show the number of trees and the y axes the classification accuracy. The plots on the left show the accuracy for FRFs consisting of 3 to 100 trees.
The plots on the right are created from the same data but show the accuracy of FRFs consisting of 82 to 100 trees.

Fig. 3. Accuracy (y axis) of the FRF depending on the number of induced FDTs (x axis) for epileptic seizure prediction. Left: 3 to 100 trees; right: 82 to 100 trees.

Fig. 4. Accuracy (y axis) of the FRF depending on the number of induced FDTs (x axis) for epilepsy detection. Left: 3 to 100 trees; right: 82 to 100 trees.

To compare the results of our investigation with similar studies, we created the comparison shown in Table 4. The table includes only studies that used the same data as we did. The classification in the selected studies targeted the prediction of occurrences of epileptic seizures. The selected studies differ in their preliminary data transformation and classification methods, e.g. attribute extraction, dimensional reduction, or the classification algorithm. In Table 4, the first column contains the first authors of the studies and the reference to the paper, the second describes the methods used in the preliminary data transformation, the third tells which classifier was used, and the last shows the achieved classification accuracy.

Table 4. Results from other studies for epileptic seizure detection
{| class="wikitable"
! Study !! Preprocessing methods !! Classifier !! Accuracy
|-
| K. Polat and S. Güneş [12] || DWT || C4.5 || 98.76%
|-
| L. Guo, D. Rivero [40] || DWT || MLPNN || 97.77%
|-
| M. A. Naderi, H. Mahdavi-Nasab [9] || FFT, PCA || MLPNN || 100%
|-
| U. Orhan, M. Hekim [7] || DWT || K-means and MLPNN || 96.67%
|-
| N. Nicolaou, J. Georgiou [11] || Permutation Entropy || SVM || 94.38%
|-
| K. Polat and S. Güneş [20] || FFT, PCA || AIRS classifier || 99.81%
|-
| J. B. Jian, B. Goparaju [41] || CEEMD domain || RF || 98.00%
|-
| This study || Welch, PCA, Fuzzification || FRF || 99.48%
|}

===Conclusion===
In this paper, an algorithm for EEG signal classification based on fuzzy logic was proposed. Previously, we addressed the task of EEG signal classification by FDTs in [14] and [15], where different fuzzification techniques were analyzed in detail. In this paper, the algorithm based on fuzzy entropy according to [32] was used. Involving fuzzy logic makes it possible to take into account the ambiguity of data that can arise in the preliminary transformation step (feature extraction and dimension reduction). The modification consists of adding fuzzification as a new step in the preliminary data transformation. Using the data obtained after the preliminary data transformation, we compared three types of fuzzy classifiers: oFDT, uFDT, and FRF. These classifiers were induced based on estimation of the cumulative mutual information [33]. For the FRF, the impact of the number of trees on classification accuracy was analyzed. The results show that the proposed preliminary transformation can improve the classification accuracy of EEG signals. Therefore, involving fuzzy logic can be considered a recommended step for future development in EEG classification.

In future development of the proposed classification algorithm, the impact of other classifiers can be evaluated. The preliminary data transformation can also be modified: there are numerous ways to perform spectral transformation, dimensional reduction, or fuzzification. The crucial goal of this development is to increase the classification accuracy of EEG signals.

===Acknowledgment===
This paper has been supported by grant VEGA 1/0354/17.

===References===
1. WHO, http://www.who.int/mediacentre/factsheets/fs999/en/ last accessed 11/30/2018
2. Engel J. J., Starkman S.: Emergency Medicine Clinics of North America. Emerg. Med. Clin. North Am. 12, 895–923 (1994).
3. Iasemidis, L.D.: Epileptic seizure prediction and control.
Biomed. Eng. IEEE Trans. 50, 549–558 (2003).
4. Libenson, M.H.: Practical approach to electroencephalography. (2010).
5. Subasi, A., Erc, E.: Classification of EEG signals using neural network and logistic regression. Comput. Methods Programs Biomed. 87–99 (2005).
6. Andrzejak, R.G., Lehnertz, K., Mormann, F., Rieke, C., David, P., Elger, C.E.: Indications of nonlinear deterministic and finite-dimensional structures in time series of brain electrical activity: dependence on recording region and brain state. Phys. Rev. E. Stat. Nonlin. Soft Matter Phys. 64, 061907 (2001).
7. Orhan, U., Hekim, M., Ozer, M.: EEG signals classification using the K-means clustering and a multilayer perceptron neural network model. Expert Syst. Appl. 38, 13475–13481 (2011).
8. Lin, C.-J., Hsieh, M.-H.: Classification of mental task from EEG data using neural networks based on particle swarm optimization. Neurocomputing. 72, 1121–1130 (2009).
9. Naderi, M.A.: Analysis and classification of EEG signals using spectral analysis and recurrent neural networks. Biomed. Eng. (NY). 3–4 (2010).
10. Subasi, A., Gursoy, M.I.: EEG signal classification using PCA, ICA, LDA and support vector machines. Expert Syst. Appl. 37, 8659–8666 (2010).
11. Nicolaou, N., Georgiou, J.: Detection of epileptic electroencephalogram based on Permutation Entropy and Support Vector Machines. Expert Syst. Appl. 39, 202–209 (2012).
12. Polat, K., Güneş, S.: A novel data reduction method: Distance based data reduction and its application to classification of epileptiform EEG signals. Appl. Math. Comput. 200, 10–27 (2008).
13. Guo, L., Rivero, D., Dorado, J., Munteanu, C.R., Pazos, A.: Automatic feature extraction using genetic programming: An application to epileptic EEG classification. Expert Syst. Appl. 38, 10425–10436 (2011).
14. Rabcan, J., Kvassay, M.: Electroencephalogram Signals Classification by Ordered Fuzzy Decision Tree. In: CEUR Workshop Proceedings. pp. 72–87.
ICT in Education, Research, and Industrial Applications: Integration, Harmonization, and Knowledge Transfer, Kyiv, Ukraine (2017).
15. Rabcan, J., Kvassay, M.: Identification of Persons with Epilepsy from Electroencephalogram Signals using Fuzzy Decision Tree. In: Communications in Computer and Information Science. Springer (2018).
16. Androulidakis, I., Levashenko, V., Zaitseva, E.: An empirical study on green practices of mobile phone users. Wirel. Networks. 22, 2203–2220 (2016).
17. Zaitseva, E., Levashenko, V., Kvassay, M., Rabcan, J.: Application of ordered fuzzy decision trees in construction of structure function of multi-state system. (2017).
18. Jantan, H., Hamdan, A.R., Othman, Z.A.: Data Mining Classification Techniques for Human Talent Forecasting.
19. Maletic, J., Marcus, A.: Data Mining & Knowledge Discovery (2005).
20. Polat, K., Güneş, S.: Artificial immune recognition system with fuzzy resource allocation mechanism classifier, principal component analysis and FFT method based new hybrid automated identification system for classification of EEG signals. Expert Syst. Appl. 34, 2039–2048 (2008).
21. Martínez, A.M., Kak, A.C.: PCA versus LDA. IEEE Trans. Pattern Anal. Mach. Intell. 23, 228–233 (2001).
22. Ley, D.: Approximating process knowledge and process thinking: Acquiring workflow data by domain experts. Conf. Proc. - IEEE Int. Conf. Syst. Man Cybern. 3274–3279 (2011).
23. Gueorguieva, N., Georgiev, G.: Fuzzyfication of Principle Component Analysis for Data Dimensionalty Reduction. 2016 IEEE Int. Conf. Fuzzy Syst. 1818–1825 (2016).
24. Tsipouras, M.G., Exarchos, T.P., Fotiadis, D.I.: A methodology for automated fuzzy model generation. Fuzzy Sets Syst. 159, 3201–3220 (2008).
25. Zaitseva, E., Levashenko, V.: Construction of a reliability structure function based on uncertain data. IEEE Trans. Reliab. 65, 1710–1723 (2016).
26.
Vatankhah, M., Yaghubi, M.: Adaptive Neuro-fuzzy Inference System for Classification of EEG Signals Using Fractal Dimension. In: 2009 Third UKSim European Symposium on Computer Modeling and Simulation. pp. 214–218. IEEE (2009).
27. Sudirman, R., Koh, C., Safri, N.M., Daud, W.B., Mahmood, N.H.: EEG different frequency sound response identification using neural network and fuzzy techniques. 2010 6th Int. Colloq. Signal Process. its Appl. 1–6 (2010).
28. Hinojosa, J., Domenech-Asensi, G.: Multiple adaptive neuro-fuzzy inference systems for accurate microwave CAD applications. 2007 18th Eur. Conf. Circuit Theory Des. 767–770 (2007).
29. Gupta, H.R., Mehra, R.: Power Spectrum Estimation using Welch Method for various Window Techniques. Int. J. Sci. Res. Eng. Technol. 2, 389–392 (2013).
30. Smith, L.I.: A tutorial on Principal Components Analysis Introduction. Statistics (Ber). 51, 52 (2002).
31. Jackson, D.A.: Stopping rules in principal components analysis - a comparison of heuristical and statistical approaches. Ecology. 74, 2204–2214 (1993).
32. Lee, H.-M., Chen, C.-M., Chen, J.-M., Jou, Y.-L.: An efficient fuzzy classifier with feature selection based on fuzzy entropy. IEEE Trans. Syst. Man, Cybern. Part B Cybern. 31, 426–432 (2001).
33. Levashenko, V., Zaitseva, E.: Usage of New Information Estimations for Induction of Fuzzy Decision Trees. Lect. Notes Comput. Sci. 2412, 493–499 (2002).
34. Podgorelec, V., Kokol, P., Stiglic, B., Rozman, I.: Decision Trees: An Overview and Their Use in Medicine. J. Med. Syst. 26, 445–463 (2002).
35. De Matteis, A.D., Marcelloni, F., Segatori, A.: A new approach to fuzzy random forest generation. In: 2015 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE). pp. 1–8. IEEE (2015).
36. Rokach, L.: Ensemble-based classifiers. Artif. Intell. Rev. 33, 1–39 (2010).
37. Breiman, L.: Bagging Predictors. Mach. Learn. 24, 123–140 (1996).
38. Breiman, L.: Random Forests. Mach. Learn. 45, 5–32 (2001).
39.
Rabcan, J., Levashenko, V., Zaitseva, E., Chovancova, O.: Generation of Structure Function Based on Ambiguous and Incompletely Specified Data Using Fuzzy Random Forest. In: 2018 IEEE 9th Int. Conf. on Dependable Systems, Services and Technologies (DESSERT). pp. 418–423. IEEE (2018).
40. Guo, L., Rivero, D., Dorado, J., Rab, J.R., Pazos, A.: Automatic epileptic seizure detection in EEGs based on line length feature and artificial neural networks. J. Neurosci. Methods. 191, 101–109 (2010).
41. Jia, J., Goparaju, B., Song, J., Zhang, R., Westover, M.B.: Automated identification of epileptic seizures in EEG signals based on phase space representation and statistical features in the CEEMD domain. Biomed. Signal Process. Control. 38, 148–157 (2017).