=Paper=
{{Paper
|id=Vol-3628/short3
|storemode=property
|title=Advanced Signal Processing and Classification of EEG Patterns in Neurointerface Systems
|pdfUrl=https://ceur-ws.org/Vol-3628/short3.pdf
|volume=Vol-3628
|authors=Serhii Lupenko,Roman Butsiy,Oleksandr Volyanyk,Nataliia Stadnyk
|dblpUrl=https://dblp.org/rec/conf/ittap/LupenkoBVS23
}}
==Advanced Signal Processing and Classification of EEG Patterns in Neurointerface Systems==
<pdf width="1500px">https://ceur-ws.org/Vol-3628/short3.pdf</pdf>
<pre>
                         Advanced Signal Processing and Classification of EEG Patterns in
                         Neurointerface Systems
                         Serhii Lupenko 1,2, Roman Butsiy 2, Oleksandr Volyanyk 2 and Nataliia Stadnyk 3
                         1
                           Faculty of Electrical Engineering, Automatic Control and Informatics, Opole University of Technology,
                         45-758 Opole, Poland
                         2
                           Institute of Telecommunications and Global Information Space, National Academy of Sciences of Ukraine,
                         02000 Kyiv, Ukraine
                         3
                           Ternopil Ivan Puluj National Technical University, 46001 Ternopil, Ukraine


                                         Abstract
                                         This study provides a comprehensive assessment of an extended set of modern classifiers
                                         designed for electroencephalography signal analysis in brain-computer interface systems.
                                         Using the modern model of the vector of cyclic rhythmically connected random processes for
                                         estimating signal characteristics, the classifiers compared encompass k-NN, Linear SVM,
                                         Decision Tree, Random Forest, Multilayer Perceptron, AdaBoost, and Naive Bayes. By
                                         decomposing signals into Fourier series, the optimal number of coefficients is investigated to
                                         both reduce computational complexity and increase accuracy. To facilitate a transparent and
                                         decisive comparison among the classifiers, the Confusion Matrix methodology is used.
                                         Results suggest that among the diverse range evaluated, Linear SVM, Naive Bayes, and
                                         Multilayer Perceptron classifiers showcased superior accuracy.

                                         Keywords 1
                                         BCI-systems; EEG signals; brain-computer interface; vector of cyclic rhythmically connected
                                         random processes; classifiers; Confusion Matrix

                         1. Introduction
                             Brain-computer interface (BCI) systems [1, 2, 3] based on the processing and interpretation of
                         electroencephalography (EEG) signals play an important role in neuroscience and technology. EEG,
                         which captures the electrical activity of the brain, is a crucial component of the effective functioning
                         of BCI systems. Being a non-invasive and relatively cost-effective [4, 5] method, EEG provides real-
                         time information on brain activity, making it invaluable for BCI. The ability to accurately interpret
                         these EEG signals is vital, especially for people with movement disorders [6], as it enables direct
                         communication between the brain and external devices and helps them regain independence. In
                         addition to therapeutic applications, advances in EEG processing are improving human-machine
                         interaction in sectors such as gaming [7], virtual reality [8, 9], and robotics control [10]. Moreover,
                         ontological frameworks such as the one discussed in [11] can enhance BCI systems by integrating
                         diverse data sources, which is essential for extending the applications of BCI into fields such as
                         Chinese Image Medicine, offering new modalities for diagnosis and treatment
                         planning.
                             In a previous study [12], several classifiers were investigated and compared for classifying
                         EEG-recorded signals. However, this work did not delve deeply into each stage of signal processing. This
                         study provides a more detailed description of each processing step. Additional filtering methods will
                         be applied, and a new model, called a vector of cyclic rhythmically connected random processes, will
                         be applied during the evaluation of the EEG signal characteristics. A Confusion Matrix will be used to
                         test the classifiers, which will increase the reliability of an experimentally based choice of the most
                         effective decision-making technologies in BCI systems.

                         1
                          Proceedings ITTAP’2023: 3rd International Workshop on Information Technologies: Theoretical and Applied Problems, November 22–24,
                         2023, Ternopil, Ukraine, Opole, Poland
                         EMAIL: s.lupenko@po.edu.pl (A. 1); romanbutsiy@gmail.com (A. 2); wonderage2018@gmail.com (A. 3);
                         natalya.stadnik15@gmail.com (A. 4)
                         ORCID: 0000-0002-6559-0721 (A. 1); 0000-0002-8415-8635 (A. 2); 0000-0001-9137-7580 (A. 3); 0000-0002-7781-7663 (A. 4)
                                         ©️ 2020 Copyright for this paper by its authors.
                                         Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
                                         CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073


2. Methodology

2.1.    Signal registration
   The processing of EEG signals in the BCI-system is commonly recognized to involve five [13, 14]
primary stages (Fig. 1), namely, signal registration, pre-processing of signals, estimation of signal
characteristics, classification (recognition) of signals and computer interaction.


Figure 1: Main stages of signal processing by the BCI-system

    The accuracy and cost of signal collection (registration), the first stage of signal processing by the
neurointerface system, are largely determined by the hardware and software of the BCI system.
Although the market for affordable and accessible neurointerfaces is relatively limited, there are some
cost-effective solutions [16] available that can facilitate experiments with satisfactory levels of
accuracy and reliability. The choice of the Open Source Brain-Computer Interfaces (OpenBCI) [17]
platform was justified and experimentally tested in the works [12, 13, 14]. This platform is an
affordable, straightforward, and open-source solution that can be easily assembled at home.


Figure 2: The Ultracortex Mark IV headset is produced using a 3D printer

   A non-invasive method of electroencephalography was chosen to register the electrical activity of
the scalp surface. The 8-channel version of the OpenBCI platform was used to record the signals. It
should be noted that the 8-channel version can easily be expanded to 16 channels by using the
additional Daisy module. However, in this research, the 8-channel version was sufficient, since only one
channel was needed for the experiment. The C3 electrode and the A2 reference electrode were used to
record the signals. The data sampling frequency for each channel was set to 250 Hz. The Ultracortex
"Mark IV" headset, made at home using a 3D printer (see Fig. 2), was employed to securely attach the
electrodes. The OpenBCI GUI utility was used to manage the recording of EEG signals. The
measurement results were saved on a microSD memory card directly integrated into the board. For
processing the acquired results, custom scripts were developed in Python, leveraging various auxiliary
libraries including scikit-learn, NumPy, SciPy, Matplotlib, and others.


2.2.    Pre-processing of signals
    The next step is the pre-processing of the analyzed EEG signals using Butterworth filters. In the
first stage, a 3rd order rejection (band-stop) filter is employed to eliminate power-grid noise at a
frequency of 50 Hz (or 60 Hz, depending on the mains frequency). The signals before and after this
first filtering stage are depicted in Figure 3 ((a) and (b), respectively).
    In the second stage, a 5th order bandpass filter is employed. For this experiment, the filter's
passband is set to 1-17 Hz, effectively removing low-frequency and high-frequency noise. The
filtered signals, which are prepared for subsequent processing stages, are illustrated in Figure 3 (c).


Figure 3: EEG signals filtering. Recorded signal (a), after 3rd order rejection filter (b), after 5th order
bandpass filter (c) [13]
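
    The two filtering stages described above can be sketched with SciPy as follows. This is a minimal
illustration and not the authors' script: the width of the stop band around 50 Hz (here 49-51 Hz), the
use of zero-phase filtering (sosfiltfilt), and the function name preprocess are assumptions that the
text does not specify. Only the filter orders, the 1-17 Hz passband, and the 250 Hz sampling frequency
are taken from the paper.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 250.0  # sampling frequency in Hz, as stated in Section 2.1


def preprocess(eeg, fs=FS, mains_hz=50.0, stop_halfwidth_hz=1.0, band_hz=(1.0, 17.0)):
    """Apply a band-stop filter around the mains frequency, then a 1-17 Hz band-pass."""
    # Stage 1: 3rd order Butterworth rejection filter.
    # The +/- 1 Hz stop band is an assumption; the paper gives only the centre frequency.
    stop = butter(3, [mains_hz - stop_halfwidth_hz, mains_hz + stop_halfwidth_hz],
                  btype="bandstop", fs=fs, output="sos")
    # Stage 2: 5th order Butterworth band-pass filter with the passband from the text.
    band = butter(5, band_hz, btype="bandpass", fs=fs, output="sos")
    # Zero-phase (forward-backward) filtering is an assumption, chosen here so that
    # the filtered signal is not shifted in time relative to the recording.
    return sosfiltfilt(band, sosfiltfilt(stop, eeg))


# Synthetic test signal: a 10 Hz component plus 50 Hz mains interference.
t = np.arange(2500) / FS  # 10 seconds
noisy = np.sin(2 * np.pi * 10 * t) + 0.5 * np.sin(2 * np.pi * 50 * t)
filtered = preprocess(noisy)
```

    With a 60 Hz power grid, mains_hz=60.0 would be passed instead. Note that forward-backward
filtering applies each filter twice, so the effective attenuation is stronger than that of a single pass.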


2.3.    Estimation of signal characteristics
   In study [14], a new mathematical model of vector EEG was proposed and substantiated in the
form of a vector of cyclic rhythmically connected random processes. By considering the stochasticity,
cyclicality, variability of the rhythm of multidimensional distribution functions, initial, central, and
mixed moment functions of the signals under investigation, it provides efficient statistical tools for
studying a wide range of characteristics of vector EEG.
   The vector EEG mathematical model introduced in [14] aligns with the methodology used
in [15], where a similar vector approach is applied to the statistical processing and modeling of
synchronously registered cardio signals of various physical natures, demonstrating the versatility and
applicability of such models across different domains of biomedical signal analysis.
   Using the estimated rhythm function [14], the pre-processed vector of signals will be segmented
into cycles. These cycles can further be divided (see Fig. 4) into active zones (when the operator
performs a mental controlling action) and passive zones (when the operator's mental controlling
action is absent).
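
   The segmentation into cycles and zones can be illustrated with the following simplified sketch. In
the paper, segmentation relies on the rhythm function estimated as in [14], which allows the cycle
length to vary. That estimation is not reproduced here. Instead, the sketch assumes a constant rhythm
with a one-second active zone followed by a one-second passive zone, matching the nominal protocol
of Section 2.5. The function name split_into_zones and its parameters are hypothetical.

```python
import numpy as np

FS = 250  # sampling frequency in Hz, as stated in Section 2.1


def split_into_zones(signal, fs=FS, active_s=1.0, passive_s=1.0):
    """Cut a 1-D signal into active and passive zones, one row per cycle.

    Assumes a constant rhythm (fixed cycle length), which is a simplification
    of the rhythm-function-based segmentation used in the paper.
    """
    n_active = int(round(active_s * fs))
    n_passive = int(round(passive_s * fs))
    n_cycle = n_active + n_passive
    n_cycles = len(signal) // n_cycle
    # Drop any incomplete cycle at the end, then put one cycle in each row.
    cycles = np.asarray(signal[: n_cycles * n_cycle]).reshape(n_cycles, n_cycle)
    return cycles[:, :n_active], cycles[:, n_active:]


# 100 cycles of 2 s each at 250 Hz; sample indices are used as placeholder data.
recording = np.arange(100 * 2 * FS)
active, passive = split_into_zones(recording)
print(active.shape, passive.shape)
```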
   The obtained vectors of activity and passivity zones will be decomposed into Fourier series. This
decomposition allows us to form a vector of informative features for the classifiers using the Fourier
coefficients (see Fig. 4 (b) and (c)). In addition, these coefficients, when considered alongside Bessel's
inequality, provide a comprehensive framework for feature extraction. The resulting coefficients will
be used for training the classifiers to achieve optimal performance.
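
   As an illustration of how such a feature vector could be formed, the sketch below computes the
first cosine and sine coefficients of the discrete Fourier series of one zone with NumPy. The function
name fourier_features, the ordering of the features (all cosine coefficients followed by all sine
coefficients), and the omission of the constant term are assumptions made for this example. Whether
the authors decompose each cycle separately or the averaged statistical estimates is not specified
here, so the sketch simply processes a single segment.

```python
import numpy as np


def fourier_features(segment, n_coeffs=30):
    """Return [a_1..a_K, b_1..b_K] for x[n] ~ sum_k a_k cos(2*pi*k*n/N) + b_k sin(2*pi*k*n/N)."""
    x = np.asarray(segment, dtype=float)
    n = x.size
    if n_coeffs >= n // 2:
        raise ValueError("n_coeffs must be smaller than half the segment length")
    spectrum = np.fft.rfft(x)          # X_k = sum_n x[n] * exp(-2j*pi*k*n/N)
    k = np.arange(1, n_coeffs + 1)     # skip k = 0 (the mean of the segment)
    a = 2.0 / n * spectrum[k].real     # cosine coefficients
    b = -2.0 / n * spectrum[k].imag    # sine coefficients
    return np.concatenate([a, b])


# Check on a signal with known coefficients: a_3 = 2 and b_5 = 0.5.
N = 250  # one second at 250 Hz
idx = np.arange(N)
x = 2.0 * np.cos(2 * np.pi * 3 * idx / N) + 0.5 * np.sin(2 * np.pi * 5 * idx / N)
feats = fourier_features(x, n_coeffs=30)
print(round(feats[2], 6), round(feats[30 + 4], 6))
```

   The default of 30 coefficients is only a placeholder; the number of coefficients is the quantity
that the paper investigates experimentally.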


Figure 4: Visualization of EEG signal characteristics across two zones. On the left, three plots
represent the activity zone, and on the right, three plots represent the passivity zone.
Horizontally, plots (a) display mathematical expectations, (b) display cosine coefficients from the
Fourier series, and (c) display sine coefficients from the Fourier series

   Figure 4 shows the characteristics of the EEG signal in the zones of activity and passivity
using mathematical expectations and Fourier series coefficients. It can be seen from the graphs that
the primary information is concentrated in the first 30 coefficients, while the subsequent coefficients
demonstrate noise-like properties.


2.4.    Classification
    The selection of a classifier plays a vital role in the development of neurointerface systems. This
study aims to assess the accuracy of well-known classifiers including k-Nearest Neighbors (k-NN),
Linear Support Vector Machine (Linear SVM), Decision Tree, Random Forest, Multilayer Perceptron
(MLP), Adaptive Boosting (AdaBoost), and Naive Bayes. These classifiers are common [18] in
machine learning due to their versatility, efficacy, and scalability. Their popularity stems from several
strengths: k-NN's simplicity and adaptability [18], SVM's robustness in high-dimensional spaces [19],
decision trees' interpretability [20], Random Forest's ensemble-based accuracy [21], MLP's capability
to capture non-linearities [22], AdaBoost's iterative refinement [23], and Naive Bayes' speedy
predictions and efficiency with large datasets [24]. Their combined theoretical soundness and proven
real-world applicability make them first-choice tools for many practitioners.
    The choice of these classifiers is justified by the following considerations.
    The k-Nearest Neighbors algorithm (k-NN) is one of the simplest machine learning methods and
falls under the category of supervised learning [18]. Its fundamental principle is that an object is
classified based on the "votes" of its nearest neighbors in the feature space. The value of "k" denotes
the number of neighbors participating in the "voting". The k-NN algorithm does not build an explicit
predictive model, but instead uses all available training data at classification time.
    The Support Vector Machines (SVM) method is a powerful and flexible machine learning
technique used for classification and regression tasks. A Linear SVM is a specific instance of SVM,
where the decision boundary or hyperplane is linear [19].
    The main idea behind Linear SVM [22] is to find a hyperplane that best separates the data into two
classes by maximizing the margin (distance) between the closest data points of the two
classes. These points, which lie closest to the hyperplane and determine its position, are called support
vectors. Thanks to this strategy, Linear SVM exhibits good resistance to overfitting.
    The Linear SVM algorithm implements a linear decision boundary; to obtain nonlinear
boundaries, a kernel SVM can be used, applying different kernel functions. In this case, the input data
is implicitly mapped into a higher-dimensional space where it can be linearly separated. Linear
SVMs remain a good choice when the data can be linearly separated, or when the number of features
far exceeds the number of training examples, because they offer high computational speed and results
that are simple to interpret.
    A Decision Tree is a common machine learning algorithm that is employed for both classification
and regression tasks. The principle of a Decision Tree involves dividing the input feature space into
segments, with each corresponding to a specific class or a predicted value [20]. Essentially, a
Decision Tree is a tree (typically binary) in which each internal node signifies a test on one of the
features, while each leaf represents the predicted class or value.
    Random Forest is a machine learning algorithm that constructs multiple decision trees and
combines their predictions [21]. By aggregating the decisions of the individual trees, it effectively
mitigates the overfitting often seen in a single decision tree, providing more generalized
predictions. It works well with both classification and regression tasks, can handle large datasets with
high dimensionality, and provides measures of feature importance, making it a versatile and widely
used algorithm in machine learning.
    The Multilayer Perceptron (MLP) is a type of artificial neural network widely used for classification
and regression tasks. It is trained with a supervised learning technique called backpropagation. It
should be noted that the inclusion of one or more non-linear hidden layers allows an MLP to solve
problems that are not linearly separable, adding to its versatility as a machine learning classifier.
    AdaBoost is a powerful machine learning algorithm that works by combining several weak
learners, typically decision trees, to create a robust classifier that improves prediction accuracy. The
AdaBoost algorithm iteratively adjusts the weights of training instances by increasing the weights of
incorrectly classified instances and decreasing the weights of correctly classified instances. Thus, it
"adapts" by focusing more on difficult cases in subsequent iterations. The final prediction is made by
weighted voting, taking into account the accuracy of each weak learner, making AdaBoost effective
for both binary and multi-class classification problems [23].
    The Naive Bayes classifier is a probabilistic machine learning algorithm based on applying Bayes'
theorem with strong independence assumptions between the features [24]. Despite its oversimplified
assumptions, Naive Bayes classifiers often perform remarkably well in many complex real-time
situations. The model is also favored for its efficiency and scalability, handling large datasets with
high dimensionality effectively.
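
    The seven classifiers described above are all available in scikit-learn, and a comparison of this
kind can be sketched as follows. The hyperparameters shown (number of neighbors, hidden layer size,
number of trees, and so on) are placeholder assumptions, since the paper does not list the settings
that were used, and the data are synthetic stand-ins for the Fourier-coefficient feature vectors. The
helper name make_classifiers is hypothetical.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def make_classifiers(seed=0):
    """Return the seven classifiers compared in the paper (assumed settings)."""
    return {
        "k-NN": KNeighborsClassifier(n_neighbors=5),
        "Linear SVM": SVC(kernel="linear"),
        "Decision Tree": DecisionTreeClassifier(random_state=seed),
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        "MLP": MLPClassifier(hidden_layer_sizes=(50,), max_iter=2000, random_state=seed),
        "AdaBoost": AdaBoostClassifier(random_state=seed),
        "Naive Bayes": GaussianNB(),
    }


# Synthetic two-class data: 60 features, of which only the first 10 differ between classes.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 100)
X = rng.normal(size=(200, 60))
X[y == 1, :10] += 1.0

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

scores = {}
for name, clf in make_classifiers().items():
    clf.fit(X_train, y_train)
    scores[name] = clf.score(X_test, y_test)  # fraction of correct predictions
    print(f"{name:14s} accuracy = {scores[name]:.2f}")
```

    The Gaussian variant of Naive Bayes is chosen here because the features are continuous; the paper
does not state which variant was used.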

2.5.    Experiment Procedure
    EEG signal capture poses multiple technical challenges, largely because of the low amplitude of
the signal. After travelling through the brain's protective layers, cerebrospinal fluid, and the skull to
reach the scalp, the signal has an amplitude of only 1-100 microvolts, with frequencies spanning
0.1-100 Hz. The choice of electrode material and the tightness of
contacts also impact the quality of the recording.
   To obtain an artifact-free EEG recording, it's crucial that the research participant remains relaxed
during the experiment, seated in a specialized comfortable chair. External light and sound stimuli
should be minimized. Proper electrode placement is vital, with the electrode-skin resistance
maintained below 5 kOhms.
   In this experiment, the participant performed a mental action of either extending or flexing the arm
for approximately one second, followed by a state of relaxation for the next second. This "mental
action-relaxation" cycle was repeated 100 times consecutively.

3. Results
   The results of the study of the effectiveness of various classifiers for the analysis of EEG signals
are presented below. The Confusion Matrix was used to quantify the classification results. The main
calculations and analyses were carried out for two operators, which helps to take into account possible
individual characteristics in the results. Figures 5-8 show how several accuracy characteristics
derived from the Confusion Matrix depend on the number of Fourier coefficients, which enables
the selection of both a classifier and a vector of informative features in BCI systems.
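
   For reference, the four characteristics plotted in Figures 5-8 can be computed from the entries of
a binary Confusion Matrix as in the sketch below, which uses the standard definitions of these
measures. Treating the activity zone as the positive class (label 1) is an assumption of this example,
and the labels are invented purely to show the arithmetic. The sketch also assumes that both classes
occur and that at least one positive prediction is made, so that no denominator is zero.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def confusion_metrics(tp, fp, tn, fn):
    """Accuracy, F1 score, Fowlkes-Mallows index and balanced accuracy."""
    tpr = tp / (tp + fn)  # sensitivity (recall)
    tnr = tn / (tn + fp)  # specificity
    ppv = tp / (tp + fp)  # precision
    return {
        "ACC": (tp + tn) / (tp + fp + tn + fn),
        "F1": 2 * tp / (2 * tp + fp + fn),
        "FM": math.sqrt(ppv * tpr),
        "BA": (tpr + tnr) / 2,
    }


# Invented labels: 4 active (1) and 6 passive (0) zones.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
counts = confusion_counts(y_true, y_pred)
result = confusion_metrics(*counts)
print(counts, result)
```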


Figure 5. Accuracy (ACC)


Figure 6. The harmonic mean of precision and sensitivity (F1 score)
Figure 7. Fowlkes–Mallows Index (FM)


Figure 8. Balanced Accuracy (BA)

    The classifiers were trained on the coefficients from the Fourier series and, as Figures 5-8 show,
most of them behaved similarly. Optimal performance was observed at 20-40 coefficients; as the
number of coefficients increases further, there is a clear tendency towards
overfitting, especially noticeable with the k-NN classifier. This can be explained by the noisy nature
of the subsequent coefficients, which have no informative value.
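
    An experiment of this kind, in which accuracy is tracked as the number of coefficients grows, can
be sketched as follows. The data are synthetic and only imitate the situation described above: the
first 30 feature columns differ between the classes and the remaining columns are pure noise. The
use of 5-fold cross-validation, the function name accuracy_vs_n_features, and the assumption that
the columns are ordered by coefficient index are choices made for this illustration and are not taken
from the paper.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier


def accuracy_vs_n_features(X, y, counts, make_clf, cv=5):
    """Mean cross-validated accuracy when only the first n columns are used."""
    scores = {}
    for n in counts:
        # Keep only the first n columns (assumed to be ordered by coefficient index).
        fold_scores = cross_val_score(make_clf(), X[:, :n], y, cv=cv)
        scores[n] = float(fold_scores.mean())
    return scores


# Synthetic data: 30 informative columns followed by 70 noise columns.
rng = np.random.default_rng(0)
n_samples, n_features, n_informative = 200, 100, 30
y = np.repeat([0, 1], n_samples // 2)
X = rng.normal(size=(n_samples, n_features))
X[y == 1, :n_informative] += 0.5

counts = [10, 20, 30, 60, 100]
curve = accuracy_vs_n_features(X, y, counts, lambda: KNeighborsClassifier(n_neighbors=5))
for n, acc in curve.items():
    print(f"{n:3d} features: accuracy {acc:.2f}")
```

    Any of the other classifiers can be passed through make_clf in the same way; the exact numbers
depend on the random data and should not be read as a reproduction of Figures 5-8.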

4. Conclusion
   This study presents an in-depth evaluation of several modern classifiers for EEG signal analysis
in the realm of brain-computer interface systems. Using an innovative model, the vector of cyclic
rhythmically connected random processes, it was possible to provide a reliable estimation of EEG
signal characteristics. The use of the Confusion Matrix made the comparison of classifiers
in BCI systems clearer. Among the evaluated classifiers, which included k-NN, Linear SVM,
Decision Tree, Random Forest, Multilayer Perceptron, AdaBoost, and Naive Bayes, the highest
accuracy was observed for Linear SVM, Naive Bayes, and Multilayer Perceptron. Based on the
analysis of the dependence of the main accuracy characteristics of the Confusion Matrix on the
number of spectral components, the approach to the optimal selection of the vector of informative
features in BCI systems is substantiated.


5. References
[1] Shih, J.J.; Krusienski, D.J.; Wolpaw, J.R. Brain-computer interfaces in medicine. Mayo Clin.
    Proc. 2012, 87, 268–279. doi:https://doi.org/10.1016/j.mayocp.2011.12.008
[2] Murphy, D.P.; Bai, O.; Gorgey, A.S.; Fox, J.; Lovegreen, W.T.; Burkhardt, B.W.; Atri, R.;
     Marquez, J.S.; Li, Q.; Fei, D.-Y. Electroencephalogram-Based brain–computer interface and
     Lower-Limb Prosthesis Control: A Case Study. Front. Neurol. 2017, 8, 00696.
     doi:https://doi.org/10.3389/fneur.2017.00696
[3] Graimann, B.; Allison, B.; Pfurtscheller, G. Brain–Computer Interfaces: A Gentle Introduction.
     In Brain-Computer Interfaces; The Frontiers Collection; Springer: Berlin, Germany, 2010.
     doi:https://doi.org/10.1007/978-3-642-02091-9_1
[4] Hag, A.; Handayani, D.; Altalhi, M.; Pillai, T.; Mantoro, T.; Kit, M.H.; Al-Shargie, F. Enhancing
     EEG-Based Mental Stress State Recognition Using an Improved Hybrid Feature Selection
     Algorithm. Sensors 2021, 21, 8370. doi:https://doi.org/10.3390/s21248370
[5] Rasheed, S. A Review of the Role of Machine Learning Techniques towards Brain–Computer
     Interface Applications. Mach. Learn. Knowl. Extr. 2021, 3, 835-862.
     doi:https://doi.org/10.3390/make3040042
[6] Shahini, N.; Bahrami, Z.; Sheykhivand, S.; Marandi, S.; Danishvar, M.; Danishvar, S.; Roosta, Y.
     Automatically Identified EEG Signals of Movement Intention Based on CNN Network (End-To-
     End). Electronics 2022, 11, 3297. doi:https://doi.org/10.3390/electronics11203297
[7] Paszkiel, S.; Rojek, R.; Lei, N.; Castro, M.A. A Pilot Study of Game Design in the Unity
     Environment as an Example of the Use of Neurogaming on the Basis of brain–computer interface
     Technology to Improve Concentration. NeuroSci 2021, 2, 109–119.
     doi:https://doi.org/10.3390/neurosci2020007
[8] Yang, L.; Van Hulle, M.M. Real-Time Navigation in Google Street View® Using a Motor
     Imagery-Based BCI. Sensors 2023, 23, 1704. doi:https://doi.org/10.3390/s23031704
[9] Lee, P.-L.; Chen, S.-H.; Chang, T.-C.; Lee, W.-K.; Hsu, H.-T.; Chang, H.-H. Continual Learning
     of a Transformer-Based Deep Learning Classifier Using an Initial Model from Action
     Observation EEG Data to Online Motor Imagery Classification. Bioengineering 2023, 10, 186.
     doi:https://doi.org/10.3390/bioengineering10020186
[10] Said, R.R.; Heyat, M.B.B.; Song, K.; Tian, C.; Wu, Z. A Systematic Review of Virtual Reality
     and Robot Therapy as Recent Rehabilitation Technologies Using EEG-Brain–Computer Interface
     Based on Movement-Related Cortical Potentials. Biosensors 2022, 12, 1134.
     doi:https://doi.org/10.3390/bios12121134
[11] Lupenko, S., Orobchuk, O., Xu, M. The Ontology as the Core of Integrated Information
     Environment of Chinese Image Medicine. In: Hu, Z., Petoukhov, S., Dychka, I., He, M. (eds)
     Advances in Computer Science for Engineering and Education II. ICCSEEA 2019. Advances in
     Intelligent Systems and Computing; Springer, Cham. 2020, 938, 471-481.
     doi:https://doi.org/10.1007/978-3-030-16621-2_44
[12] Butsiy, R.; Lupenko, S. Comparison of Modern Methods of Classification of EEG Patterns for
     Neurointerface Systems. In: Yang, XS., Sherratt, S., Dey, N., Joshi, A. (eds) Proceedings of
     Seventh International Congress on Information and Communication Technology. Lecture Notes
     in Networks and Systems; Springer, Singapore, 2022, 465, 345-354.
     doi:https://doi.org/10.1007/978-981-19-2397-5_32
[13] Butsiy, R.; Lupenko, S.; Zozulya, A. Comprehensive justification for the choice of software
     development tools and hardware components of a multi-channel neurointerface system. 2021
     IEEE 16th International Conference on Computer Sciences and Information Technologies
     (CSIT), 2021, 1, 309-312. doi:https://doi.org/10.1109/CSIT52700.2021.9648788
[14] Lupenko, S.; Butsiy, R.; Shakhovska, N. Advanced Modeling and Signal Processing Methods in
     Brain–Computer Interfaces Based on a Vector of Cyclic Rhythmically Connected Random
     Processes. Sensors 2023, 23, 760. doi:https://doi.org/10.3390/s23020760
[15] Lupenko S.; Lytvynenko I.; Sverstiuk A.; Horkunenko A.; Shelestovskyi B. Software for
     statistical processing and modeling of a set of synchronously registered cardio signals of different
     physical nature. CEUR Workshop Proceedings, 2021, 2864, 194–205.
     doi:https://doi.org/10.32782/cmis%2F2864-17
[16] Cruz, A.; Pires, G.; Lopes, A.; Carona, C.; Nunes, U. A Self-Paced BCI With a Collaborative
     Controller for Highly Reliable Wheelchair Driving: Experimental Tests With Physically Disabled
     Individuals. IEEE Trans. Hum.-Mach. Syst. 2021, 51, 109–119.
     doi:https://doi.org/10.1109/THMS.2020.3047597
[17] Padfield, N.; Camilleri, K.; Camilleri, T.; Fabri, S.; Bugeja, M. A Comprehensive Review of
     Endogenous EEG-Based BCIs for Dynamic Device Control. Sensors 2022, 22, 5802.
     doi:https://doi.org/10.3390/s22155802
[18] de Zarzà, I.; de Curtò, J.; Calafate, C.T. Optimizing Neural Networks for Imbalanced Data.
     Electronics 2023, 12, 2674. doi:https://doi.org/10.3390/electronics12122674
[19] Hamiane, M.; Saeed, F. SVM Classification of MRI Brain Images for Computer-Assisted
     Diagnosis. Int. J. Electr. Comput. Eng. 2017, 7, 2555.
     doi:https://doi.org/10.11591/ijece.v7i5.pp2555-2564
[20] Erdal, H.I. Two-level and hybrid ensembles of decision trees for high performance concrete
     compressive strength prediction. Eng. Appl. Artif. Intell. 2013, 26, 1689–1697.
     doi:https://doi.org/10.1016/j.engappai.2013.03.014
[21] Jaiswal, J.K.; Samikannu, R. Application of Random Forest Algorithm on Feature Subset
     Selection and Classification and Regression. In Proceedings of the 2017 World Congress on
     Computing and Communication Technologies (WCCCT), Tiruchirappalli, India, 2–4 February
     2017, 65–68. doi:https://doi.org/10.1109/WCCCT.2016.25
[22] Collobert, R.; Bengio, S. Links between Perceptrons, MLPs and SVMs. In Proceedings of the
     Twenty First International Conference on Machine Learning (ICML’04), Banff, AB, Canada, 4–8
     July 2004; ACM Press: Banff, AB, Canada, 2004, 23.
     doi:https://doi.org/10.1145/1015330.1015415
[23] Chen, S.; Shen, B.; Wang, X.; Yoo, S.J. A Strong Machine Learning Classifier and decision
     stumps based hybrid AdaBoost classification algorithm for cognitive radios. Sensors 2019, 19,
     5077. doi:https://doi.org/10.3390/s19235077
[24] Abdulrahman, S.A.; Khalifa, W.; Roushdy, M.; Salem, A-B.M. Comparative study for 8
     computational intelligence algorithms for human identification. Comput. Sci. Rev. 2020, 36, 100237.
     doi:https://doi.org/10.1016/j.cosrev.2020.100237

</pre>