Analysis and Interpretation of Empirical Data Obtained by BCI Epoc 14+

Georgi P. Dimitrov (a), Galina Panayotova (a), Vasyl Martsenyuk (b), Inna Dimitrova (a), Eugenia Kovatcheva (a), Boyan Jekov (a) and Iva Kostadinova (a)

(a) University of Library Studies and Information Technologies, 119 Tsarigradsko Shose, Sofia, Bulgaria
(b) University of Bielsko-Biala, 2 Willowa, Bielsko-Biala, 43-309, Poland

Abstract
Brain-signal analysis based on affective computing is a new research direction aimed at finding correlations between human emotions and recorded EEG signals. A brain-computer interface (BCI) allows users to control and manage external devices through emitted brain signals. These signals can be received and recorded by a number of dedicated devices such as the Emotiv Epoc 14+, Neuroscan, EasyCap, etc., but the reliable translation of the obtained information into computer commands is still a great challenge. It requires tight integration between the information emitted by the user's brain, the BCI system that converts this information into digital signals, and the algorithm that translates the brain signals into commands. The analysis of incoming brain signals and the techniques for processing and classifying this information are being actively explored in order to improve the adaptability of BCI systems to the end user. In the present study, we propose an approach to feature selection based on descriptive statistics. Data streams were studied so as to take the time dimension into account and to derive dependencies in time data characterized by a relatively long experiment duration and short series of significant, useful data. This approach represents a good trade-off between prediction accuracy and numerical complexity.

Keywords
Mathematical models of objects and processes, Computer Science, Artificial Intelligence, Brain Wave, Machine Learning, Deep Learning, Robotics

1. Introduction

The use of data obtained from a BCI is a complex process that requires multidisciplinary skills and knowledge in computer science, signal processing, neurology, robotics, artificial intelligence and other fields [14]. The study is based on a fixed sequence that usually consists of six steps, shown in Fig. 1 [6], [10]: measuring brain activity, preprocessing the data, extracting features, classification, command translation and feedback (a minimal code sketch of this pipeline follows the list):

• Receive data: At this stage, different types of sensors are used to obtain signals that reflect the brain activity of the user [2]. In this study, we focus on BCI as the technology for obtaining data.
• Preprocessing: This step involves cleaning the input data and removing noise in order to improve the quality of the received signals [1], [3].
• Extraction of features: This step aims to describe the signals by a few representative values, called "features" [4], [7].
• Classification: The classification stage determines the class on the basis of the extracted characteristics of the signal [1]. The class corresponds to the type of pre-identified signal. This stage can also be referred to as "characteristic translation" [11], [12]. Classification algorithms are known as "classifiers".
• Command/application translation: Once the received command is identified, it is submitted for execution by the respective device [10].
• Feedback: Finally, this step provides the user with feedback on the identified command. This helps control the quality of the processing of the received signals [8], [9].
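The six steps above form a simple processing chain. The following Python skeleton is only an illustration of that data flow, with hypothetical function names and placeholder logic; it is not the pipeline used in this study.

```python
# Illustrative skeleton of the six-step BCI pipeline (hypothetical functions,
# placeholder logic); a real system replaces each step with proper algorithms.
import numpy as np

def receive_data():                 # 1. acquire raw EEG: 14 channels, 3 s at an assumed 128 Hz
    return np.random.randn(14, 3 * 128)

def preprocess(raw):                # 2. remove noise (placeholder: subtract the per-channel mean)
    return raw - raw.mean(axis=1, keepdims=True)

def extract_features(clean):        # 3. describe each channel by a few values
    return np.column_stack([clean.min(axis=1), clean.max(axis=1)])

def classify(features):             # 4. map the feature vector to a command class (toy rule)
    return "LEFT" if features[:, 1].mean() > abs(features[:, 0].mean()) else "RIGHT"

def translate(command):             # 5. hand the command to the controlled device
    print(f"executing: {command}")

def feedback(command):              # 6. tell the user which command was recognized
    print(f"recognized: {command}")

cmd = classify(extract_features(preprocess(receive_data())))
translate(cmd)
feedback(cmd)
```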
The electroencephalogram (EEG) is an excellent source of data related to human brain activity [13]. A typical EEG experiment produces data that can be described by a two-dimensional matrix of brain activity sampled every millisecond and projected onto the surface of the head at a spatial resolution of a few centimetres [15]. The placement of the electrodes follows one of several layouts, the most commonly used of which is the standard 10-20 EEG system [15]. As in other modern empirical sciences, EEG tools provide, on the one hand, an abundant flow of data and, on the other, a corresponding need for new methods of data analysis. An important stage of data preprocessing is the selection and handling of the obtained data.

Figure 1: Steps for data analytics

2. Description of the research

2.1. Basic description

Our hypothesis is that data normalization considerably simplifies the classification of brain signals and leads to a significant simplification of the computational procedures. This study aims to simplify the preprocessing of incoming EEG signals by normalizing the obtained data, extracting certain characteristic values and subsequently classifying the signals. The level of the signals related to specific events is registered by the 14 channels of the Emotiv Epoc 14+ EEG device, while the subjects respond by giving mental commands to control the display of the corresponding command on the screen. Twelve time characteristics (amplitudes and latencies) are calculated and used as descriptors of positive and negative emotional states in multiple subjects.

2.2. Collected data

The research includes an analysis of raw data obtained from 21 physically and mentally healthy participants with no pre-existing neurological disorders and no previous experience with brain-computer interface (BCI) devices [9]. The participants belong to a single age group, between 20 and 23 years. An Emotiv Epoc+ 14-channel device was used for the purposes of the study; the device and the electrode locations are shown in Fig. 2. The experiment was based on the display of static images (left and right arrows and a Neutral state), during which the participants had to mentally issue the appropriate command for moving a computer simulator, a motor boat. It is important to note that these are purely mental commands, with no movement of the arms or legs, which significantly complicates classification, since it is not tied to limb activity. Additionally, the Neutral command is a collection of everything else, such as synchronization, relaxation, etc.

Figure 2: Emotiv Epoc 14+ and electrode position schema

Each participant performed the experiment 3 times. Each experiment lasted 600 s, or about 10 min, with 30-minute breaks between experiments so that the participant could relax. During an experiment, the images with the written commands "left" and "right" were shown 20 times each. Each series consisted of a 3-second display of the respective image (epoch) plus additional visual and audio cues. At the beginning of a series, a 1-second beep alerted the participant. Each test series lasted 15 seconds: 3 seconds to display the appropriate command and 12 seconds for synchronization actions, relaxation, etc. Because the experiment involved motor imagery, the analysis focused mainly on beta waves (12-30 Hz). In each experiment there are therefore 20 repetitions of Left and of Right, 3 seconds each (a total of 60 seconds per command). Signals received during the remaining time, about 8 min (480 s), are labelled as the Neutral command. Altogether, the duration of a single process (signal duration, or epoch) is 3 s for the Left and Right commands and 12 s for Neutral.
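As an illustration of how such a protocol might be turned into analysable data, the sketch below cuts a continuous multichannel recording into fixed-length epochs and restricts it to the beta band. The 128 Hz sampling rate, the function names and the synthetic onset list are assumptions made for the example only; the paper does not describe the actual acquisition code.

```python
# Sketch only: epoch segmentation and beta-band filtering under assumed settings.
import numpy as np
from scipy.signal import butter, filtfilt

FS = 128                                             # assumed sampling rate, Hz
EPOCH_SEC = {"LEFT": 3, "RIGHT": 3, "NEUTRAL": 12}   # epoch lengths from the text

def beta_band(x: np.ndarray) -> np.ndarray:
    """Band-pass the signal to the beta band (12-30 Hz)."""
    b, a = butter(4, [12.0, 30.0], btype="bandpass", fs=FS)
    return filtfilt(b, a, x, axis=-1)

def cut_epochs(recording: np.ndarray, onsets: list[int], label: str) -> np.ndarray:
    """Slice a (channels x samples) recording into fixed-length epochs."""
    n = EPOCH_SEC[label] * FS
    return np.stack([recording[:, s:s + n]
                     for s in onsets if s + n <= recording.shape[1]])

# Example with synthetic data: 14 channels, 600 s recording, 20 "LEFT" cues
# placed every 15 s (one cue per test series).
rec = np.random.randn(14, 600 * FS)
onsets = list(range(0, 20 * 15 * FS, 15 * FS))
left_epochs = cut_epochs(beta_band(rec), onsets, label="LEFT")
print(left_epochs.shape)   # (20, 14, 384)
```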
The average value for each condition is calculated and filtered. The maximum and minimum values of the ensemble of averaged signals are detected. The location of the first minimum in the signals is determined, and the characteristics are given by the amplitudes of successive minima (Amin1, ...) and successive maxima (Amax1, ...) together with the associated latencies (Lmin1, ..., Lmax1, ...). Three circuits are implemented by selecting three different filters and detecting N maxima and N minima at the filter output. When this model is not implemented, the feature vector is filled with zeros.

2.3. Processing and Norming Data

As a result, the initial data set is a matrix X with 168 columns (14 channels x 12 characteristics) and 52 rows (averaged positive and negative test classes of 26 subjects).

X_norm = (X - mean(X)) / std(X)    (1)

The vector space X is then normalized by subtracting the average value of each dimension and dividing by the standard deviation of each column, see formula (1).

3. Results of the Experiment

3.1. Classification models

The first task is to determine the set of characteristics by which the sample data will be evaluated. The set of features is derived from the data stream registered for each EEG channel. The characteristics are determined on the basis of the first six local extremes: 3 minima and 3 maxima (Figure 3). The amplitudes of these initial extremes and the times of their occurrence (latencies) are taken as the characteristics of the current data flow. Thus, each EEG channel is represented by 12 characteristics: the amplitudes and latencies of the six extremes.

Figure 3: Amplitude Max and Min

By applying a fourth-order Butterworth filter with a bandwidth of [0.5, 15] Hz, the number of preserved characteristics is 12, corresponding to the latency (time of occurrence) and amplitude at N = 3 maxima and minima (see Figure 3); the characteristics correspond to the time and amplitude values of the first three minima that occur after T = 0 s and the corresponding maxima between them. When grouped by channels (inter-subject), each object is represented by these 12 characteristics:

[Amin1, Amax1, Amin2, Amax2, Amin3, Amax3, Lmin1, Lmax1, Lmin2, Lmax2, Lmin3, Lmax3]
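A minimal sketch of this feature-extraction step is shown below: a fourth-order Butterworth band-pass of 0.5-15 Hz followed by the amplitudes and latencies of the first three minima and maxima, zero-filled when fewer extrema are found. The 128 Hz sampling rate and the use of SciPy are assumptions; the paper does not specify its implementation.

```python
import numpy as np
from scipy.signal import butter, filtfilt, argrelextrema

FS = 128  # assumed sampling rate, Hz (not specified in the text)

def channel_features(signal: np.ndarray, n: int = 3) -> np.ndarray:
    """Return [Amin1, Amax1, ..., Amin3, Amax3, Lmin1, Lmax1, ..., Lmin3, Lmax3]."""
    # 4th-order Butterworth band-pass, 0.5-15 Hz, applied forward and backward
    b, a = butter(4, [0.5, 15.0], btype="bandpass", fs=FS)
    x = filtfilt(b, a, signal)
    t = np.arange(len(x)) / FS

    minima = argrelextrema(x, np.less)[0][:n]     # indices of the first n local minima
    maxima = argrelextrema(x, np.greater)[0][:n]  # indices of the first n local maxima

    feats = np.zeros(4 * n)                       # zero-filled when extrema are missing
    for i in range(n):
        if i < len(minima):
            feats[2 * i] = x[minima[i]]              # Amin(i+1)
            feats[2 * n + 2 * i] = t[minima[i]]      # Lmin(i+1)
        if i < len(maxima):
            feats[2 * i + 1] = x[maxima[i]]          # Amax(i+1)
            feats[2 * n + 2 * i + 1] = t[maxima[i]]  # Lmax(i+1)
    return feats

# Example: one 3-second epoch of a single channel gives 12 features
epoch = np.random.randn(3 * FS)
print(channel_features(epoch))
```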
3.2. Data analysis by characteristics defined and extracted with descriptive statistics

We distinguish the incoming commands based on the brain activity observed in the electroencephalogram (EEG). The choice of features is important for signal classification. In the present study, we propose a selection technique based on descriptive statistics (mean and standard deviation) [22]. This approach represents a good compromise between prediction accuracy and numerical complexity. We propose to reduce the volume of the obtained data by focusing on the central tendency (arithmetic mean) and the spread (standard deviation) of the individual time characteristics and their distribution.

4. Real data application

4.1. Formation of databases by channels (inter-subject)

For the purposes of this research, three main commands were chosen, using antonymous words: LEFT, RIGHT and NEUTRAL. Each word is defined by a 14-dimensional vector of channels (x1, x2, ..., x14), where xj denotes the j-th channel, of which we have made p observations. Thus, a matrix X of size p x 14 is formed, whose rows contain the observations of the study (Table 1).

Table 1
Raw data
AF3     F7      F3      F5      T7      P7      O1      O2      P8      T8      FC6     F4      F8      AF4
118.90  118.06  118.20  118.12  118.99  118.48  117.92  118.54  118.66  118.27  117.79  120.47  118.12  119.59
...

This database makes it possible to reveal the dependencies of the individual brain channels and to conclude which of them are involved when a visual task of the described type is present.

4.2. Similarity measurement

Most statistical methods use correlation analysis to determine the similarity between different brain signals. The results are given in the form of correlation matrices. Table 2, Table 3 and Table 4 display the correlations between the individual channels for the selected words and the results of their calculation [5].

Table 2
Correlation matrix of NEUTRAL (confidence level 95%, n = 8734)
       AF3   T7       O1       T8       ...   AF4
AF3    1     0.1523   0.4581   0.5133         0.6074
T7           1        0.4723   0.1661         0.0908
O1                    1        0.3011         0.3014
T8                             1              0.4674
...
AF4                                           1

Table 3
Correlation matrix of LEFT (confidence level 95%, n = 12480)
       AF3   T7       O1       T8       ...   AF4
AF3    1     0.1861   0.3060   0.6202         0.6401
T7           1        0.2774   0.2355         0.1884
O1                    1        0.2159         0.2622
T8                             1              0.4902
...
AF4                                           1

Table 4
Correlation matrix of RIGHT (confidence level 95%, n = 12870)
       AF3   T7       O1       T8       ...   AF4
AF3    1     0.1861   0.3060   0.6202         0.6401
T7           1        0.2774   0.2355         0.1884
O1                    1        0.2159         0.2622
T8                             1              0.4902
...
AF4                                           1

In this article we use only channels with a correlation > 0.5; channels with a correlation < 0.5 are ignored. For all three commands this leaves the channels AF3, T8 and AF4.

4.3. Data analysis by characteristics defined and derived from descriptive statistics

We calculate the mean and standard deviation of the channels selected according to Section 4.2. The results obtained for the three types of commands are given in Tables 5, 6 and 7.

Table 5
Statistical characteristics of NEUTRAL
channel   mean          st. dev.      max           min
AF3       8.61476139    593.0781423   884.5149578   -1881.297142
T8        8.561556105   588.5540203   879.4852859   -1870.022561
...
AF4       8.844488861   596.8591494   890.2455      -1893.2

Table 6
Statistical characteristics of LEFT
channel   mean          st. dev.      max           min
AF3       -0.6208525    10.55638      45.30853      -92.1
T8        -0.63486515   11.15294      47.54477      -89.764
...
AF4       -0.6400505    13.02439      47.54477      -92.3567

Table 7
Statistical characteristics of RIGHT
channel   mean          st. dev.      max           min
AF3       0.006265795   4.059517      18.61453      -11.5468
T8        0.033032954   7.721639      26.23524      -23.1488
...
AF4       0.014057237   9.524141      39.64777      -30.5498

4.4. Data normalization

The data are normalized (Table 8) in order to simplify the calculation algorithms as much as possible. The processing assumes that the result depends not on the amplitudes but on the structure of the input values, which requires normalization.

Table 8
Normalized values
channel            AF3        T8         AF4
value              118.903    118.272    119.5855
normalized value   0.192577   10.66148   29.45653

The most commonly used normalization is statistical normalization, given by formula (1). Statistical normalization lets the computation work not with the extreme values but with the statistically significant (typical) values.
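The channel-selection and normalization workflow of Sections 4.2-4.4 can be sketched as follows, here on synthetic data with a shared component injected into AF3, T8 and AF4 so that the example selects something. The threshold of 0.5 and the per-column z-scoring follow the text; the selection rule "keep a channel if it correlates above 0.5 with at least one other channel" is one possible reading of Section 4.2, and all data and variable names are illustrative.

```python
import numpy as np

channels = ["AF3", "F7", "F3", "F5", "T7", "P7", "O1",
            "O2", "P8", "T8", "FC6", "F4", "F8", "AF4"]

rng = np.random.default_rng(0)
shared = rng.standard_normal((10000, 1))           # common component for the demo
X = rng.standard_normal((10000, len(channels)))    # p observations x 14 channels
X[:, [0, 9, 13]] += 2 * shared                     # indices of AF3, T8, AF4

# Correlation matrix between channels (cf. Tables 2-4)
R = np.corrcoef(X, rowvar=False)

# One reading of the selection rule: keep a channel if it correlates > 0.5
# with at least one other channel.
mask = (np.abs(R) - np.eye(len(channels))) > 0.5
selected = [ch for ch, row in zip(channels, mask) if row.any()]
print("selected channels:", selected)              # AF3, T8 and AF4 in the study

# Descriptive statistics of the selected channels (cf. Tables 5-7)
idx = [channels.index(ch) for ch in selected]
sel = X[:, idx]
print("mean:", sel.mean(axis=0), "st.dev:", sel.std(axis=0),
      "max:", sel.max(axis=0), "min:", sel.min(axis=0))

# Statistical (z-score) normalization per column, formula (1) and Table 8
X_norm = (sel - sel.mean(axis=0)) / sel.std(axis=0)
```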
5. Conclusion

The main contribution of this study is a method for identifying the most important characteristics that maximize the distinction between the individual commands issued after the corresponding brain stimulation. The proposed method is fast, simple and intuitive. It exploits the individual distribution of features across multiple objects and offers an interpretation based on basic statistical information (mean and standard deviation). The method can easily be applied to other classification tasks, especially in the presence of high data variability, which usually occurs in studies that involve individual subjects. The obtained results point to suitable algorithms for the classification of EEG signals and will help young researchers achieve interesting results in this area faster.

6. Acknowledgements

This work is supported by the research program PPNIP-2021-09/12.14.2021 "Analysis and optimization of algorithms for classification of signals coming from Smart IoT devices" and the National Science Program "Information and communication technologies for unified digital market in science".

7. References

[1] Akinyode, Babatunde & Khan, Tareef. (2018). Step by step approach for qualitative data analysis. International Journal of Built Environment and Sustainability, 5. doi:10.11113/ijbes.v5.n3.267.
[2] B. Colombet, M. Woodman, C. G. Bénar, J. M. Badier, "AnyWave: A cross-platform and modular software for visualizing and processing electrophysiological signals", HAL Id: hal-01323171, https://hal.archives-ouvertes.fr/hal-01323171, submitted on 30 May 2016.
[3] Z. Tan, W. H. Blanton, Q. Zhang, "Real-time EEG signal processing based on TI's TMS320C6713 DSK", 120th ASEE Annual Conference & Exposition, 23-26 June 2013.
[4] G. Schalk, P. Brunner, L. A. Gerhardt, H. Bischof, J. R. Wolpaw, "Brain-computer interfaces (BCIs): Detection instead of classification", Journal of Neuroscience Methods 167 (2008) 51-62.
[5] G. Panayotova, D. A. Dimitrov, "Modeling from time series of complex brain signals", International Journal of Signal Processing Systems, Vol. 9, No. 1, March 2021, pp. 1-6.
[6] Georgi P. Dimitrov, Ilian Iliev, "Front-end optimization methods and their effect", MIPRO 2014 - 37th International Convention, 26-30 June 2014.
[7] Georgi P. Dimitrov, Galina Panayotova, Stefkka Petrowa, "Analysis of the Probabilities for Processing Incoming Requests in Public Libraries", The 2nd Global Virtual Conference 2014 (GV-CONF 2014), Goce Delchev University Macedonia & THOMSON Ltd. Slovakia, April 7-11, 2014, ISSN: 1339-2778.
[8] Kryvonos, I. G., Krak, I. V., Barmak, O. V., Kulias, A. I.: Methods to create systems for the analysis and synthesis of communicative information. Cybern. Syst. Anal. 53(6), 847-856 (2017). https://doi.org/10.1007/s10559-017-9986-7
[9] I. Krak, O. Barmak, E. Manziuk, "Using visual analytics to develop human and machine-centric models: A review of approaches and proposed information technology", Computational Intelligence (2020) 1-26. https://doi.org/10.1111/coin.12289
[10] Metcalf, Leigh & Casey, William. (2016). Introduction to data analysis. doi:10.1016/B978-0-12-804452-0.00004-X.
[11] O'Connor, H. & Gibson, Nancy. (2003). A Step-By-Step Guide To Qualitative Data Analysis. Pimatisiwin: A Journal of Aboriginal and Indigenous Community Health, 1, 63-90.
[12] Plesinger, F., Jurco, J., Halamek, J., Jurak, P., "SignalPlant: an open signal processing software platform", Physiol Meas. 2016 Jul;37(7):N38-48. doi:10.1088/0967-3334/37/7/N38. Epub 2016 May 31.
[13] Rieger, Josef & Kosar, Karel & Lhotska, Lenka & Krajca, Vladimir. (2004). EEG Data and Data Analysis Visualization. 3337, 39-48. doi:10.1007/978-3-540-30547-7_5.
[14] Silverman, B. W., Density Estimation for Statistics and Data Analysis, Chapman and Hall, 1986.
[15] Shu, Hong. (2016). Big data analytics: six techniques. Geo-spatial Information Science, 19, 1-10. doi:10.1080/10095020.2016.1182307.