Novel EEG-based BCIs for Elderly Rehabilitation Enhancement

Aurora Saibene1,2, Francesca Gasparini1,2 and Jordi Solé-Casals3

1 University of Milano-Bicocca, Viale Sarca 336, 20126, Milano, Italy
2 NeuroMI, Milan Center for Neuroscience, University of Milano-Bicocca, Piazza dell’Ateneo Nuovo 1, 20126, Milano, Italy
3 University of Vic-Central University of Catalonia, C de la Laura 13, 08500, Vic, Barcelona, Spain

Abstract

The ageing process may lead to cognitive and physical impairments, which may affect the everyday life of elderly people. In recent years, Brain Computer Interfaces (BCIs) based on Electroencephalography (EEG) have proved particularly effective in promoting and enhancing rehabilitation procedures, especially by exploiting motor imagery experimental paradigms. Moreover, BCIs seem to increase patients’ engagement and have proved to be reliable tools for improving the overall wellness of elderly people. However, EEG signals usually present a low signal-to-noise ratio and can be recorded for a limited time. Thus, irrelevant information and faulty samples could affect the BCI performance.

Introducing a methodology that extracts informative components from the EEG signal while maintaining its intrinsic characteristics may provide a solution to both issues: noisy data may be avoided by retaining only relevant components, and combining relevant components may represent a good strategy to substitute the data without requiring long or repeated EEG recordings. Moreover, substituting faulty trials may significantly improve the classification performance of a BCI when translating imagined movement to rehabilitation systems.

To this end, in this work the EEG signal decomposition by means of multivariate empirical mode decomposition is proposed to obtain its oscillatory modes, called Intrinsic Mode Functions (IMFs).
Subsequently, a novel criterion for relevant IMF selection, based on the IMF time-frequency representation and entropy, is provided. After having verified the reliability of the EEG signal reconstruction with the relevant IMFs only, the relevant IMFs are combined to produce new artificial data and provide new samples to use for BCI training.

Keywords
BCI, EEG, entropy, MEMD

Italian Workshop on Artificial Intelligence for an Ageing Society (AIxAS 2021), November 29th, 2021
Email: a.saibene2@campus.unimib.it (A. Saibene); francesca.gasparini@unimib.it (F. Gasparini); jordi.sole@uvic.cat (J. Solé-Casals)
ORCID: 0000-0002-4405-8234 (A. Saibene); 0000-0002-6279-6660 (F. Gasparini); 0000-0002-6534-1979 (J. Solé-Casals)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

1. Introduction

In recent years, the global growth of the elderly population [1, 2] and the increased life expectancy [3] have raised awareness of the impact that ageing has on the everyday life of elderly people [1]. In fact, the ageing process may affect the cognitive abilities of elderly people to a subject-dependent degree and introduce motor control impairments [4], which could require the intervention of caretakers and rehabilitation procedures, limiting an elderly person's autonomy. Assistive technologies have become particularly attractive to enhance the overall wellbeing of elderly people, to allow them a certain independence and to maintain their social connections [1, 2, 3]. Among the various technological innovations, Brain Computer Interfaces (BCIs) have proved to be particularly suited to these tasks and have found application in cognitive and motor rehabilitation systems [4, 5, 6, 7, 8].
In fact, BCIs allow the decoding of brain dynamics in an online configuration [9], can be exploited to control heterogeneous systems (e.g., wheelchairs [10]), and provide instantaneous feedback to their users [8].

The most popular method for recording the BCI input brain signals is electroencephalography (EEG), which provides multivariate time series collected by placing electrodes on the scalp of a subject. EEG signals have proved to be particularly efficient in accessing brain activities and functions, since they bring time, space (electrodes) and frequency information of the neuronal signals [11] in a non-invasive way.

Regarding the frequency information, the EEG signal is characterized by different frequency bands (or rhythms) that are representative of specific brain dynamics [12, 13]. Table 1 provides a brief overview of the EEG rhythms. Notice that the 𝛼 and 𝛽 frequency bands can be specifically associated with different cognitive and motor functions, and thus their dynamic changes can be widely exploited in rehabilitation systems based on motor imagery (MI) tasks [6, 14].

An MI task consists of the imagination of a real movement, like imagining the opening and closing of a hand, and it has been shown that MI practice improves real movements during a rehabilitation process [5, 8]. However, controlling a BCI system with MI is difficult and the ability to perform this task varies from person to person. Good control of a BCI is usually achieved when a user reaches 70% accuracy in the MI paradigm [15]. Reaching this level of accuracy may require a long time. However, MI tasks usually enhance brain plasticity, and providing feedback to users could further improve the cognitive, motor and intellectual functions of elderly users, who are particularly affected by brain dynamic changes due to the ageing process [6].
Even though MI-BCI systems based on EEG signals seem to be promising tools for rehabilitation purposes, there are many challenges that should be considered and that can affect the overall BCI performance. In fact, the EEG signals, which are at the core of these systems, are extremely heterogeneous and easily affected by noise [16]. Moreover, collecting an adequate quantity of reliable data to perform the classification tasks involved in the BCI control system is difficult, due to the lack of sufficient recording time, the possibly small number of subjects and the difficulty of the experimental tasks [17]. These issues may lead to a general classification deterioration and thus affect machine learning model performance [18].

In the EEG domain, attempts to solve the problem of generating new artificial signals have been provided by the data augmentation literature [11, 17, 18]. However, the coherence between these artificial data and the brain dynamics recorded by the EEG should be considered and verified.

In [19], an interesting data substitution approach is proposed.¹ The authors exploit the recombination of the oscillatory modes, called Intrinsic Mode Functions (IMFs), obtained through Empirical Mode Decomposition (EMD) [20] of the EEG signals to maintain the EEG time and frequency information even in the artificially produced portions of signals.

¹ The original code is available at https://github.com/ffbear1993/DR-EMD.

Table 1
Overview of the rhythms characterizing the EEG signal.

Rhythm   Frequency range (Hz)   Brain dynamic
𝛿        ≤ 4                    sleep
𝜃        4 − 7                  drowsiness, sleep, emotional stress
𝛼        8 − 13                 relaxed while awake
𝛽        13 − 30                alertness, thinking, attention
𝛾        ≥ 31                   intensive brain activity

Motivated by this work and wanting to provide consistent artificial data to be employed in BCI systems without requiring elderly people to sit through long experimental sessions, we propose a novel processing of EEG data for BCI rehabilitation systems.
Therefore, the present paper extends the work provided by Dinarès-Ferran et al. [19] and focuses on finding significant IMFs for portions of MI signals presenting tasks of interest, from now on called trials. The IMFs are computed with a multivariate extension of the EMD algorithm [21], in order to maintain the cross-channel interdependence typical of EEG data [22]. The IMFs are then selected and recombined in order to obtain an unbiased data substitution.

Therefore, our main contributions are (i) the time-frequency representation of the IMFs, on which (ii) the entropy is computed to define the most informative (relevant) IMFs and (iii) the reconstruction of artificial trials by significant IMF combination. These steps should provide artificial EEG signals presenting coherent brain dynamics and thus exploitable in a BCI paradigm.

The work is organized as follows. Section 2 provides a brief literature review on EMD and its multivariate extension, when exploited for IMF selection or data substitution. Section 3 presents the datasets and literature methods employed. Section 4 describes our proposed approach for relevant IMF selection and artificial data generation. Section 5 discusses the obtained results and Section 6 concludes the work.

2. Related works

As introduced in Section 1 and starting from the EMD based artificial trial generation proposed by Dinarès-Ferran et al., this work focuses on the use of the Multivariate Empirical Mode Decomposition (MEMD) [21] to select relevant IMFs and recombine them to produce new trials. Therefore, this section provides a brief overview of the works presenting solutions for relevant, redundant, or noisy IMF selection.

In the EMD literature, some authors [23, 24] have focused on finding redundant IMFs by exploiting the Minkowski distance and the Jensen-Rényi divergence [25] to track the differences between the original signals and the IMFs. The use of a thresholding criterion is presented by Bueno et al.
[26]. The authors compute the entropy of each IMF and select them if their entropy is greater than a specifically defined threshold (max 𝑒 − min 𝑒)/2 + min 𝑒. Notice that 𝑒 is the vector containing the entropy of all the IMFs. Considering instead the MEMD literature, the IMF selection has been performed (i) empirically by retaining the IMFs whose combination provides the best classification performance [27], (ii) by comparing through the Wasserstein distance [28] the IMFs obtained by decomposing the EEG data against the IMFs coming from reference electrodes [29], or (iii) by using a measure of similarity between IMFs and reference noise [30, 31]. In the following, we will highlight the fact that our IMF selection is also based on choosing the frequency ranges characterizing EEG rhythms of interest. In the literature, Piper et al. and Gaur et al. [32, 33] have exploited the frequency information: Piper et al. wanted to obtain consistent IMFs between subjects and selected the first 2 IMFs retaining 𝛿 rhythm information, while Gaur et al. found the 𝛼 and 𝛽 rhythm contributions by computing the median frequency values of each IMF. Besides IMF selection or noise removal, EMD and some of its variations have been exploited for artificial data generation. Considering the reference work, Dinarès-Ferran et al. [19] employed EMD and obtained IMFs for each trial. Afterwards, they generated artificial data by combining 15 IMFs of different trials and used them to substitute certain percentages of the data, finding that the efficacy of these substitutions varies from subject to subject. A similar strategy has been presented by Lee et al. [34], who decomposed the signals with the ensemble variation of the EMD [35], instead of using the original EMD. 
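The recombination idea shared by these works, where each artificial trial takes its i-th IMF from a different original trial of the same task and the IMFs are then summed, can be illustrated with a minimal sketch. It is not the code of [19] or [34]: the function name `recombine_imfs`, the array layout (trials, IMFs, channels, samples) and the choice of distinct donor trials within one artificial trial are assumptions made here for illustration.

```python
import numpy as np

def recombine_imfs(imfs, n_artificial, rng=None):
    """Build artificial trials by mixing IMFs across trials of one task.

    imfs: array of shape (n_trials, n_imfs, n_channels, n_samples); this
    layout is an assumption of the sketch. The i-th IMF of each artificial
    trial is copied from a randomly drawn original trial, and the donor
    trials are distinct within one artificial trial.
    """
    rng = np.random.default_rng(rng)
    n_trials, n_imfs = imfs.shape[:2]
    if n_trials < n_imfs:
        raise ValueError("need at least as many trials as IMFs")
    out = np.empty((n_artificial,) + imfs.shape[2:], dtype=float)
    for a in range(n_artificial):
        # One donor trial per IMF index, drawn without replacement.
        donors = rng.choice(n_trials, size=n_imfs, replace=False)
        # Pick IMF i from trial donors[i]; result is (n_imfs, channels, samples).
        picked = imfs[donors, np.arange(n_imfs)]
        # Summing the IMFs reconstructs one multichannel artificial trial.
        out[a] = picked.sum(axis=0)
    return out
```

Because every artificial trial is a sum of modes that were each extracted from recorded data, the sketch keeps the per-mode time and frequency content of the donors; how well the sum resembles a plausible EEG trial still has to be verified on real data.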
As a final remark, notice that the majority of the described methodologies set the maximum number of IMFs to combine to a specific number that remains unchanged during computation, and when producing artificial data, all the IMFs are considered.

3. Methods

In this section, the datasets employed to test our proposal and the Multivariate Empirical Mode Decomposition (MEMD) [21] used for the IMF decomposition of the EEG signals are described.

3.1. Datasets

The following procedures have been tested on 2 datasets: (i) the EEG Simulated dataset [36, 37], and (ii) the EEG Motor Imagery BCI dataset [19].

The choice of the EEG Simulated dataset was driven by the necessity of assessing the efficacy of the proposed relevant IMF selection on controlled data. In fact, the EEG Simulated dataset presents clean and raw simulated EEG signals (the raw ones being affected by noise, with signal-to-noise ratios from -20 dB to 20 dB, and by added ocular artifacts) on 19 electrodes, i.e., C{3, 4, 𝑧}, F{3, 4, 7, 8, 𝑧}, Fp{1, 2}, O{1, 2}, P{3, 4, 𝑧}, T{3, 4, 5, 6}, and considers the 𝛼, 𝛽, and 𝛾 rhythms. Ten trials of 10 s each have been generated. For further details, please refer to [36, 37].

Instead, the EEG Motor Imagery BCI dataset was chosen because it has been employed by the reference work on artificial trial substitution for a BCI paradigm [19]. Seven healthy males were asked to perform an MI task consisting of left/right wrist dorsiflexion imagined movements. Each subject participated in 2 experimental runs, during each of which 40 tasks of left and 40 tasks of right wrist MI were randomly performed. A trial consisted of 2 s of resting time, an acoustic cue for task preparation and 5 s of motor imagination. The recorded electrodes were C{1, 2, 3, 4, 5, 6, 𝑧}, Cp{1, 2, 5, 6}, Fc{1, 2, 5, 6, 𝑧}. Notice that the dataset has been pre-processed by the authors, who used a bandpass (0.5 − 30 Hz) and a notch (50 Hz) filter. For more details, please refer to [19].

3.2.
Multivariate empirical mode decomposition

In 1998, Huang et al. proposed the Empirical Mode Decomposition (EMD) [20], a signal processing technique that allows the decomposition of a time series into oscillatory modes (the IMFs), with a completely data-driven approach and avoiding the loss or distortion of the data [38]. Considering a signal $x(t)$, EMD (i) finds the local maxima and minima of the signal, (ii) defines the upper and lower envelopes of $x(t)$, (iii) computes their mean envelope and (iv) subtracts it from $x(t)$, obtaining the detail $d(t)$. The computation is repeated assigning $d(t)$ to $x(t)$, until $d(t)$ satisfies the IMF conditions, i.e., its numbers of zero-crossings and extrema are equal or differ by one and its mean envelope is zero. Thus, $d(t)$ is considered an IMF and $x(t)$ can be reconstructed by summing the obtained $I$ IMFs and a residuum $\epsilon_I(t)$:

$$x(t) = \sum_{i=1}^{I} IMF_i(t) + \epsilon_I(t).$$

Therefore, EMD seems to be suitable to deal with non-stationary and non-linear signals [36, 39], like the EEG ones. However, it works on multivariate signals with a channel-by-channel approach, thus ignoring the cross-channel interdependence which characterizes the EEG signals [22].

To address this issue, Rehman and Mandic proposed an EMD variation, i.e., the Multivariate Empirical Mode Decomposition (MEMD) [21]. MEMD provides IMFs with the same number of oscillations for each channel by exploiting different projections of the $n$-channel signal into an $n$-dimensional space. Thus, given a multivariate signal $\{x(t)\}_{t=1}^{T} = \{x_1(t), x_2(t), ..., x_n(t)\}$, MEMD:

1. Chooses a suitable direction vector $v^{\theta_k} = \{v_1^k, v_2^k, ..., v_n^k\}$, where $k = 1, 2, ..., K$, $K$ is the total number of direction vectors and $\theta^k = \{\theta_1^k, \theta_2^k, ..., \theta_l^k\}$ are the angles on an $(n-1)$ sphere along which the direction vectors are defined.
2. Computes the $k$-th projection of $\{x(t)\}_{t=1}^{T}$ along $v^{\theta_k}$, obtaining $\{p^{\theta_k}(t)\}_{k=1}^{K}$ for all $k$.
3. Finds the time instants $\{t_i^{\theta_k}\}_{k=1}^{K}$, which correspond to the maxima of $\{p^{\theta_k}(t)\}_{k=1}^{K}$.
4. Interpolates $[t_i^{\theta_k}, x(t_i^{\theta_k})]$ to obtain the envelope curves $\{e^{\theta_k}(t)\}_{k=1}^{K}$ for all $k$.
5. Estimates the mean envelope for a set of $K$ direction vectors: $m(t) = \frac{1}{K} \sum_{k=1}^{K} e^{\theta_k}(t)$.
6. Extracts the detail $d_j(t) = x(t) - m(t)$, where $j = 1, 2, ..., J$ and $J$ is the maximum number of decomposition scales. If $d_j(t)$ meets the IMF conditions, the procedure above is applied to $x(t) - d_j(t)$, otherwise to $d_j(t)$.

In this work, MEMD is used to decompose the data of each subject and trial. Fig. 1 shows an example of the IMFs obtained by applying MEMD on the electrode C1 of a specific subject and trial. The x axis corresponds to the time (s) and the y axis to the signal amplitude ($\mu V$).

Figure 1: Example of IMFs obtained by applying MEMD to electrode C1. The last oscillatory mode corresponds to the residuum.

4. Our proposal

After having obtained the same number of IMFs for each electrode of a specific subject and trial, the IMFs are used as inputs to the proposed procedure, which consists of 4 main steps: (i) generation of the time-frequency images for each IMF, (ii) entropy computation on the time-frequency images, (iii) significant IMF selection criterion, and (iv) data substitution.

Firstly, the time-frequency image generation requires the choice of a frequency range of interest and its time-frequency resolution. Having MI tasks as targets, the chosen frequency range spans from 8 to 30 Hz. In fact, this range includes the 𝛼 (8-13 Hz) and 𝛽 (13-30 Hz) frequency bands, which are involved in the motor tasks [14] (Section 1). Also, a trade-off between the time and frequency resolution [40] is preferred for image reconstruction.

Then, the procedure loops on the frequency range and starts with the generation of a complex Morlet wavelet [41] exploiting the described parameters. Subsequently, it proceeds with the application of the fast Fourier transform on both the wavelet and the original signal. Afterwards, the inverse fast Fourier transform is applied to the product of the two spectra, which corresponds to the convolution of the wavelet with the signal.
Finally, the power of the convolution result is computed and thus the time-frequency image is obtained.

Fig. 2 presents a colored example of the time-frequency images obtained by computing this procedure on each subject, trial, electrode and relative IMFs (presented in Fig. 1). The x axis corresponds to the time (s) and the y axis to the frequency range (Hz).

From the provided example, some clear differences in the IMFs can be observed and we can hypothesize that the IMFs containing more information are the ones which present a more complex texture. Entropy [42] can be used to characterize the texture of an image and thus is employed to find the IMFs having more variability. Here the entropy is computed on the time-frequency gray scale image and is equal to $-\sum p \log_2(p)$, where $p$ represents the normalized histogram counts of the time-frequency image.

Considering the entropy obtained on all the IMFs of all the electrodes of a specific trial, the IMFs having entropy greater than the mean entropy are selected as the most significant ones. This more conservative approach is preferred to the elimination of a greater number of IMFs in order to avoid the exclusion of effective neurophysiological signals. The same number of IMFs is selected for all the electrodes of the same trial.

Figure 2: Example of the time-frequency images obtained by applying Morlet wavelet convolution on each of the IMFs presented in Fig. 1.

Finally, the last step of our proposal is the artificial data generation, which is performed on each subject by randomly combining the entropy-selected IMFs of different trials of the same task, using a scheme similar to the one described in [19, 43]. For each subject, a specific number of artificial trials balanced between the tasks of interest is produced by:

1. Finding the maximum number of entropy-selected IMFs, 𝑚𝑎𝑥𝐼𝑀𝐹, to have the same number of IMFs for each trial.
If a trial presents fewer entropy-selected IMFs and was decomposed into at least 𝑚𝑎𝑥𝐼𝑀𝐹 IMFs, the oscillatory modes discarded by the entropy selection are reintegrated until the trial reaches 𝑚𝑎𝑥𝐼𝑀𝐹 IMFs; otherwise, IMFs equal to a null vector are added until the trial reaches 𝑚𝑎𝑥𝐼𝑀𝐹 IMFs.
2. Randomly selecting 𝑚𝑎𝑥𝐼𝑀𝐹 IMFs from the IMFs of 𝑚𝑎𝑥𝐼𝑀𝐹 different original trials for each artificial trial. E.g., a new artificial trial may be composed of the first IMF of the 17th original trial (of the same task), the second IMF of the 5th original trial, the third IMF of the 1st original trial and so on until 𝑚𝑎𝑥𝐼𝑀𝐹 IMFs are reached.
3. Reconstructing the artificial trials according to the IMFs obtained at point 2.

Notice that the original trials were reconstructed by considering both their 𝑚𝑎𝑥𝐼𝑀𝐹 IMFs, to provide coherent signals to compare with the artificial trials, and their entropy-selected IMFs. Also, analyses on the trial similarity are performed to check that the artificial trials are unbiased and can thus be efficiently used in BCI rehabilitation systems.

5. Results and Discussion

To better understand the reliability of the proposed relevant IMF selection, a preliminary experiment has been conducted on the EEG Simulated dataset.

MEMD has been applied to each trial of the clean and raw data, considering all the signal-to-noise ratio realizations (from -20 dB to 20 dB). Subsequently, the time-frequency images of the obtained IMFs have been generated considering the 8 − 30 Hz frequency range. Finally, the entropy selection criterion has been applied and the entropy-selected IMFs have been summed to obtain the reconstructed EEG signal.

The reliability of the proposed method has been assessed by computing the similarity between the clean data and the raw or entropy-reconstructed data by means of the Pearson correlation coefficient. The similarity check has been applied to each signal-to-noise ratio realization, trial, and electrode.
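The preliminary check just described (time-frequency image per IMF, image entropy, mean-entropy selection, reconstruction, Pearson correlation) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the IMFs of one trial are already available as an array of shape (IMFs, channels, samples), and the function names, the 6-cycle wavelet width, the 2 s wavelet support, the 256-bin histogram on a min-max scaled image, and the pooling of electrodes per IMF index (used here so that the same IMFs are kept on every electrode) are all assumptions of this sketch.

```python
import numpy as np

MI_FREQS = np.arange(8, 31)  # 8-30 Hz range of interest, 1 Hz steps (assumed)

def tf_power(signal, fs, freqs=MI_FREQS, n_cycles=6.0):
    """Time-frequency power via FFT-based complex Morlet wavelet convolution.

    fs must be an integer sampling rate; n_cycles is an assumed value that
    sets the time/frequency resolution trade-off.
    """
    n, half = signal.size, int(fs)
    t = np.arange(-half, half + 1) / fs            # 2 s wavelet support
    n_conv = n + t.size - 1
    sig_fft = np.fft.fft(signal, n_conv)
    power = np.empty((len(freqs), n))
    for k, f in enumerate(freqs):
        sigma = n_cycles / (2.0 * np.pi * f)       # Gaussian width in seconds
        wavelet = np.exp(2j * np.pi * f * t - t**2 / (2.0 * sigma**2))
        w_fft = np.fft.fft(wavelet, n_conv)
        w_fft /= np.abs(w_fft).max()               # unit peak gain
        # Product of spectra = convolution; trim back to the signal length.
        conv = np.fft.ifft(sig_fft * w_fft)[half:half + n]
        power[k] = np.abs(conv) ** 2
    return power

def image_entropy(img, bins=256):
    """Entropy -sum(p * log2(p)) of the normalized gray-level histogram."""
    lo, hi = img.min(), img.max()
    gray = (img - lo) / (hi - lo) if hi > lo else np.zeros_like(img)
    counts, _ = np.histogram(gray, bins=bins, range=(0.0, 1.0))
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())

def select_relevant_imfs(imfs, fs, freqs=MI_FREQS):
    """Mask of IMFs whose entropy exceeds the mean entropy of the trial.

    imfs: (n_imfs, n_channels, n_samples). Electrodes are pooled per IMF
    index (an assumption) so every electrode keeps the same IMFs.
    """
    ent = np.array([[image_entropy(tf_power(ch, fs, freqs)) for ch in imf]
                    for imf in imfs])
    return ent.mean(axis=1) > ent.mean(), ent

def pearson_per_channel(a, b):
    """Pearson correlation between matching channels of two trials."""
    return np.array([np.corrcoef(x, y)[0, 1] for x, y in zip(a, b)])
```

With a clean trial `clean` and the IMFs of its noisy counterpart, a call such as `mask, _ = select_relevant_imfs(imfs, fs)` followed by `pearson_per_channel(clean, imfs[mask].sum(axis=0))` would give one similarity value per electrode, which is the quantity compared across signal-to-noise ratios in this section.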
We find that for low signal-to-noise ratios (about -20 dB to -17 dB), the Pearson correlation coefficient obtained by comparing the clean data with the entropy-reconstructed ones does not deviate from the one obtained by comparing the clean data with the raw ones. However, a significant increase of the Pearson correlation coefficient is present for the clean versus entropy-reconstructed signal case with signal-to-noise ratio greater than or equal to -12 dB.

Even though the similarity between the clean and entropy-reconstructed signals increases with higher values of signal-to-noise ratio, it seems that for the electrodes that are usually affected by ocular artifacts the Pearson correlation coefficient remains generally low.

For all the clean versus raw data cases, the mean similarity always remains lower than 0.9 and for the majority of the electrodes remains under 0.6. Therefore, the proposed strategy is considered sufficiently reliable for relevant IMF selection and signal reconstruction, since its results either do not deviate from those of the raw data or are sufficiently similar to the clean ones.

Having obtained a reliable method for relevant IMF selection, it is hypothesized that the entropy-selected IMFs could be efficiently used to reconstruct simulated trials while sufficiently maintaining the intrinsic EEG brain dynamics.

Since the proposed strategy is modeled on the one described by Dinarès-Ferran et al. [19], which considers a BCI experiment, the EEG Motor Imagery BCI dataset has been used for testing. The procedure has been computed on each subject and again the MEMD has been applied to each trial separately. Notice that the final goal is to discriminate the left (LW) from the right (RW) wrist MI, which could be applied to control a rehabilitation system.

Firstly, random trials were substituted by artificial ones.
These trials were obtained by unique combinations of 𝑚𝑎𝑥𝐼𝑀𝐹 relevant IMFs selected through the entropy criterion and belonging to 𝑚𝑎𝑥𝐼𝑀𝐹 different original trials. Recall that 𝑚𝑎𝑥𝐼𝑀𝐹 corresponds to the overall maximum number of IMFs selected by the entropy criterion. The artificial trials were then reconstructed by summing the selected IMFs. To reproduce the testing of Dinarès-Ferran et al. [19], 2.50, 5.00, 7.50, 10.00, 12.50, 25.00, 37.50, and 50.00% of trial substitutions were performed.

The power spectral density was extracted for each electrode through Morlet wavelet convolution [44]. This feature extraction follows the time-frequency image representation computation, but as a final step, the power data is integrated in the frequency range of interest. Two rhythms were considered separately, i.e., the 𝛼 and 𝛽 frequency bands, obtaining a total of 38 features. These rhythms have been chosen due to the presence of MI tasks (Section 1). The feature extraction has been restricted to the signal portion during which the MI task is performed, to mimic the experimental setting of Dinarès-Ferran et al. [19].

Finally, a linear discriminant analysis classifier has been applied. For each subject the first run has been used as the training set and the second run as the test set.

As a first analysis, Table 2 reports the median error rates for both the RW and LW conditions after having applied linear discriminant analysis 100 times on the original and reconstructed data. Trying to mimic the analysis given by [19], the error rate is here intended as the percentage of predicted values that have been wrongly classified for each class. The results obtained by Dinarès-Ferran et al. [19] (row DF of Table 2) have been reported for completeness; however, notice that their error rate evaluation is based on all the signal samples and that their features are extracted through common spatial pattern application.
The remaining table rows present the results obtained by considering the runs in their original form (row OO), and the entropy-reconstructed data of run 1 as the training set and the original data of run 2 as the test set (row RO). This last test has been conducted trying to mimic a real-time scenario, during which previously analyzed data (e.g., BCI training phase) may be exploited to predict new unseen data (e.g., BCI translation phase).

Firstly, notice that the results obtained by the proposed strategy seem to be more balanced compared to the ones reported by Dinarès-Ferran et al. [19].

Secondly, since the RO values are comparable to the ones obtained for the OO test, the RO results have been used for comparison with the artificial trial substitution results, which are reported in Table 3. Notice that the field AT (%) refers to the percentage of trial substitutions balanced between the RW and LW conditions. Therefore, row 1 (0.00%) corresponds to the RO results presented in the last row of Table 2.

Moreover, it can be observed that the results vary from subject to subject. In fact, subject S01 may be considered a good MI task performer, since in the RO case the error rates are 0.00 and 2.50 for the RW and LW task, respectively, thus complying with what has been stated in Section 1: a person is good at performing MI tasks when he/she can accurately imagine the movement for at least 70% of the task repetitions. Considering a less stringent constraint, S07 may also be considered good in the MI experiment. Instead, the remaining subjects seem to have some difficulties performing the MI tasks.

Analyzing S01 and S07, apart from the 37.50% and 50.00% substitutions performed on S01’s trials, it can be noticed that the error rates remain stable or improve. Concerning the remaining subjects, the error rates are generally not improved.
However, these error rates become more balanced between the conditions and for S03 and S04 there seems to be a decrease in the error rate values for some substitution percentages.

Table 2
Median error rates (%) obtained by applying 100 times the linear discriminant analysis classifier on the right (RW) and left wrist (LW) motor imagery.

    S01           S02           S03           S04           S05           S06           S07
    RW     LW     RW     LW     RW     LW     RW     LW     RW     LW     RW     LW     RW     LW
DF  5.50   6.68   11.20  66.67  29.83  20.39  42.67  32.96  36.24  35.79  27.27  39.60  58.34  22.74
OO  0.00   0.00   35.00  60.00  57.50  55.00  45.00  42.50  45.00  35.00  45.00  45.00  35.00  35.00
RO  0.00   2.50   35.00  57.50  52.50  57.50  32.50  40.00  45.00  37.50  45.00  45.00  35.00  37.50

Table 3
Median error rates (%) obtained by applying 100 times the artificial trial generation and the linear discriminant analysis classifier on the right (RW) and left wrist (LW) dorsiflexion for the trial substitution experiment.

        S01           S02           S03           S04           S05           S06           S07
AT (%)  RW     LW     RW     LW     RW     LW     RW     LW     RW     LW     RW     LW     RW     LW
0.00    0.00   2.50   35.00  57.50  52.50  57.50  32.50  40.00  45.00  37.50  45.00  45.00  35.00  37.50
2.50    0.00   2.50   37.50  57.50  52.50  52.50  32.50  40.00  45.00  45.00  45.00  45.00  27.50  30.00
5.00    0.00   2.50   37.50  55.00  47.50  47.50  32.50  37.50  45.00  47.50  42.50  47.50  27.50  27.50
7.50    0.00   2.50   40.00  52.50  45.00  47.50  32.50  37.50  45.00  50.00  42.50  50.00  27.50  30.00
10.00   0.00   2.50   42.50  52.50  45.00  50.00  32.50  37.50  45.00  50.00  42.50  50.00  30.00  30.00
12.50   0.00   2.50   42.50  52.50  45.00  47.50  32.50  37.50  45.00  50.00  40.00  50.00  30.00  30.00
25.00   2.50   5.00   47.50  50.00  42.50  45.00  32.50  37.50  45.00  50.00  40.00  50.00  32.50  32.50
37.50   5.00   7.50   47.50  47.50  42.50  45.00  37.50  40.00  47.50  47.50  42.50  50.00  32.50  35.00
50.00   7.50   10.00  47.50  45.00  45.00  45.00  40.00  42.50  50.00  47.50  45.00  52.50  37.50  37.50

Therefore, to ensure that the results obtained with the trial substitutions are coherent with the original results, a double Median Absolute Deviation (MAD) [45] has been applied to detect if the median results obtained on the original trials could be considered outliers with respect to all the 100 results obtained for each trial substitution. Notice that if more than 50% of the classification results are equal, the MAD is 0.

Table 4 reports the outliers detected by double MAD application, with the following interpretation: (i) if a cell contains 0 MAD, it means that the MAD is 0, (ii) if a cell contains original, it means that the result obtained on the original trial is considered an outlier with respect to the results obtained on the artificial trial substitution, and (iii) if a cell contains a non-zero number, it means that the double MAD detected that specific number of outliers.

Analyzing Table 4, the original trial result seems to appear as an outlier in a sufficiently limited number of cases. The proposed strategy proves unreliable for S02’s RW condition, whereas it proves efficient especially when making few substitutions (2.50 to 5.00%) or a greater number of substitutions (25.00 to 50.00%). The overall number of outliers also seems to be fairly low.
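The double MAD check described above can be illustrated with a short sketch. It is not the exact procedure of [45]: the cutoff of 3 deviations, the absence of a consistency constant, the return of `None` for the zero-MAD case and the function names are assumptions made here for illustration.

```python
from statistics import median

def double_mad(values):
    """Median plus separate MADs for the lower and upper half of the data."""
    med = median(values)
    left = median(abs(v - med) for v in values if v <= med)
    right = median(abs(v - med) for v in values if v >= med)
    return med, left, right

def double_mad_outliers(values, cutoff=3.0):
    """Per-value outlier flags, or None when a MAD is 0 (the '0 MAD' case)."""
    med, left, right = double_mad(values)
    if left == 0 or right == 0:
        return None
    flags = []
    for v in values:
        # Scale each deviation by the MAD of the side the value falls on.
        scale = left if v <= med else right
        flags.append(v != med and abs(v - med) / scale > cutoff)
    return flags

def value_is_outlier(sample, value, cutoff=3.0):
    """Check one reference value (e.g. the original-trial result) against a
    sample of results; returns None when the relevant MAD is 0."""
    med, left, right = double_mad(sample)
    scale = left if value <= med else right
    if scale == 0:
        return None
    return bool(abs(value - med) / scale > cutoff)
```

For one subject, condition and substitution percentage, `sum(double_mad_outliers(error_rates))` would count the outliers among the 100 classification results and `value_is_outlier(error_rates, original_error)` would flag the original result, where `error_rates` and `original_error` are hypothetical variable names. Since error rates here take few distinct values, more than half of the results are often identical, which is why the zero-MAD case needs explicit handling.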
Table 4
Outlier detection performed through double MAD computation on the original and trial substitution results for the trial substitution experiment.

        S01                              S02                      S03     S04                      S05               S06           S07
AT (%)  RW             LW                RW                LW     RW  LW  RW     LW                RW  LW            RW     LW     RW                LW
2.50    0 MAD          0 MAD             0 MAD             0 MAD  28  9   15     0 MAD             6   17            0 MAD  0 MAD  9                 0 MAD
5.00    0 MAD          0 MAD             0 MAD - original  11     17  18  0 MAD  1                 6   9             9      9      9                 9
7.50    0 MAD          0 MAD             20                4      1   7   9      0 MAD - original  10  3 - original  28     0 MAD  10 - original     10
10.00   0 MAD          0 MAD             17 - original     8      8   4   16     2                 17  4 - original  28     13     0 MAD - original  9
12.50   0 MAD          0 MAD             14 - original     20     7   4   11     0 MAD - original  13  7 - original  0 MAD  0 MAD  0 MAD - original  3
25.00   0 MAD          0 MAD             17 - original     1      8   5   3      3                 23  1             21     19     4                 19
37.50   22             0 MAD - original  15 - original     8      32  5   4      12                9   1             11     9      12                1
50.00   22 - original  8                 13 - original     1      4   3   20     10                9   2             2      6      14                11

Making a final evaluation of the results achieved by the trial substitution through the proposed artificial trial generation, the observation given by Dinarès-Ferran et al. [19] is confirmed: for subjects that naturally perform the MI tasks better, the strategy is generally more effective. In fact, the reported error rates generally remain stable or slightly increase for the other subjects. This effect could be due not only to the difficulties a user may face in performing the MI task, which could lead to unreliable trials for a control system, but also to the presence of noisy data.

Therefore, a further development of the trial simulation could be represented by the identification of faulty trials with respect to the experiment of interest. In fact, these trials could deteriorate the overall classification performance, and removing them before the trial recombination could benefit the data generation procedure. Moreover, a BCI system could benefit from faulty trial detection intended not only as the detection of noisy trials but also as the detection of unreliable trials.
In fact, understanding immediately if an elderly person has some difficulties in performing the MI task could provide a better BCI training phase by giving more precise feedback to the subject him/herself, and thus increase the task success and decrease the subject's possible frustration.

As a final remark, the proposed methodology does not seem to degrade severely when dealing with possibly unskilled subjects. Thus, the performed data substitution may represent a good solution to the EEG data dimensionality problem, providing results that do not deviate excessively from the original ones to train a BCI system.

6. Conclusions

This work has provided a brief overview of BCI systems based on motor imagery and electroencephalography to enhance rehabilitation procedures for elderly people. A novel strategy to process the EEG signals before inputting them to the BCI system has been proposed to provide more reliable data without requiring long experimental sessions. In fact, having the possibility of producing artificial trials that are coherent with the natural brain dynamics can benefit the training phase of a BCI, which usually requires a long time to obtain a precise subject profile and thus guarantee a correct control of the system. By decreasing the training time, the BCI tasks should also become less demanding both physically and mentally for an elderly patient, who could feel more engaged and less stressed by the training procedure.

Therefore, the EEG signal decomposition by means of MEMD and the relevant IMF selection through a newly defined entropy criterion have been applied to 2 datasets. Testing the proposed approach revealed that the signal reconstruction by using only the relevant IMFs is reliable and that the recombination of the relevant IMFs can be efficiently used for artificial trial generation.
However, we have noticed that the proposed strategy is particularly suitable for trial substitution of signals recorded from good MI performers, in line with the observations of Dinarès-Ferran et al. [19]. Therefore, we hypothesize that detecting faulty trials, in terms of both noise and unsuccessful MI performance, may benefit the artificial trial generation as well as the BCI training phase, during which an elderly user may effectively improve his/her brain plasticity.

Future work will focus on these directions and on exploiting the trial generation strategy for data augmentation. In fact, many data augmentation approaches have been proposed in the literature [11]: additive noise, generative adversarial networks, sliding or overlapping windows, different sampling methods, EEG segment recombination and so on. However, more attention should be paid to maintaining the brain dynamics naturally recorded in the EEG, which we attempted to preserve in the present work.

References

[1] M. Vancea, J. Solé-Casals, Population aging in the European Information Societies: towards a comprehensive research Agenda in eHealth innovations for elderly, Aging and disease 7 (2016) 526.
[2] A. Saibene, M. Assale, M. Giltri, Addressing Digital Divide and Elderly Acceptance of Medical Expert Systems for Healthy Ageing, in: AIxAS@AI*IA, 2020, pp. 14–24.
[3] A. Saibene, F. Gasparini, Cognitive and physiological response for health monitoring in an ageing population: A multi-modal system, in: International Conference on Internet Science, Springer, 2019, pp. 341–347.
[4] A. N. Belkacem, N. Jamil, J. A. Palmer, S. Ouhbi, C. Chen, Brain computer interfaces for improving the quality of life of older adults and elderly patients, Frontiers in Neuroscience 14 (2020) 692.
[5] Y. Liu, M. Li, H. Zhang, H. Wang, J. Li, J. Jia, Y. Wu, L. Zhang, A tensor-based scheme for stroke patients’ motor imagery EEG analysis in BCI-FES rehabilitation training, Journal of neuroscience methods 222 (2014) 238–249.
[6] J. Gomez-Pilar, R. Corralejo, L. F. Nicolas-Alonso, D. Álvarez, R. Hornero, Neurofeedback training with a motor imagery-based BCI: neurocognitive improvements and EEG changes in the elderly, Medical & biological engineering & computing 54 (2016) 1655–1666.
[7] L. Carelli, F. Solca, A. Faini, P. Meriggi, D. Sangalli, P. Cipresso, G. Riva, N. Ticozzi, A. Ciammola, V. Silani, et al., Brain-computer interface for clinical purposes: cognitive assessment and rehabilitation, BioMed research international 2017 (2017).
[8] R. Mane, T. Chouhan, C. Guan, BCI for stroke rehabilitation: motor and beyond, Journal of Neural Engineering 17 (2020) 041001.
[9] J. R. Wolpaw, N. Birbaumer, D. J. McFarland, G. Pfurtscheller, T. M. Vaughan, Brain–computer interfaces for communication and control, Clinical neurophysiology 113 (2002) 767–791.
[10] A. Herweg, J. Gutzeit, S. Kleih, A. Kübler, Wheelchair control by elderly participants in a virtual environment with a brain-computer interface (BCI) and tactile stimulation, Biological psychology 121 (2016) 117–124.
[11] E. Lashgari, D. Liang, U. Maoz, Data augmentation for deep-learning-based electroencephalography, Journal of Neuroscience Methods (2020) 108885.
[12] S. Vaid, P. Singh, C. Kaur, EEG signal analysis for BCI interface: A review, in: 2015 fifth international conference on advanced computing & communication technologies, IEEE, 2015, pp. 143–147.
[13] X. Wan, K. Zhang, S. Ramkumar, J. Deny, G. Emayavaramban, M. S. Ramkumar, A. F. Hussein, A review on electroencephalogram based brain computer interface for elderly disabled, IEEE Access 7 (2019) 36380–36387.
[14] P. Szczuko, M. Lech, A. Czyżewski, Comparison of classification methods for EEG signals of real and imaginary motion, in: Advances in feature selection for data and pattern recognition, Springer, 2018, pp. 227–239.
[15] V. Kaiser, G. Bauernfeind, A. Kreilinger, T. Kaufmann, A. Kübler, C. Neuper, G. R. Müller-Putz, Cortical effects of user training in a motor imagery based brain–computer interface measured by fNIRS and EEG, Neuroimage 85 (2014) 432–444.
[16] Y. Roy, H. Banville, I. Albuquerque, A. Gramfort, T. H. Falk, J. Faubert, Deep learning-based electroencephalography analysis: a systematic review, Journal of neural engineering 16 (2019) 051001.
[17] K. Zhang, G. Xu, Z. Han, K. Ma, X. Zheng, L. Chen, N. Duan, S. Zhang, Data augmentation for motor imagery signal classification based on a hybrid neural network, Sensors 20 (2020) 4485.
[18] Y. Luo, B.-L. Lu, EEG data augmentation for emotion recognition using a conditional Wasserstein GAN, in: 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), IEEE, 2018, pp. 2535–2538.
[19] J. Dinarès-Ferran, R. Ortner, C. Guger, J. Solé-Casals, A new method to generate artificial frames using the empirical mode decomposition for an EEG-based motor imagery BCI, Frontiers in neuroscience 12 (2018) 308.
[20] N. E. Huang, Z. Shen, S. R. Long, M. C. Wu, H. H. Shih, Q. Zheng, N.-C. Yen, C. C. Tung, H. H. Liu, The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis, Proceedings of the Royal Society of London. Series A: mathematical, physical and engineering sciences 454 (1998) 903–995.
[21] N. Rehman, D. P. Mandic, Multivariate empirical mode decomposition, Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences 466 (2010) 1291–1302.
[22] X. Chen, X. Xu, A. Liu, M. J. McKeown, Z. J. Wang, The use of multivariate EMD and CCA for denoising muscle artifacts from few-channel EEG recordings, IEEE transactions on instrumentation and measurement 67 (2017) 359–370.
[23] D. Boutana, M. Benidir, B. Barkat, On the selection of intrinsic mode function in EMD method: application on heart sound signal, in: 2010 3rd International Symposium on Applied Sciences in Biomedical and Communication Technologies (ISABEL 2010), IEEE, 2010, pp. 1–5.
[24] L. A. Moctezuma, M. Molinas, EEG-based subjects identification based on biometrics of imagined speech using EMD, in: International Conference on Brain Informatics, Springer, 2018, pp. 458–467.
[25] R. Rato, M. D. Ortigueira, A. Batista, On the HHT, its problems, and some solutions, Mechanical systems and signal processing 22 (2008) 1374–1394.
[26] M. Bueno-López, P. A. Muñoz-Gutiérrez, E. Giraldo, M. Molinas, Analysis of epileptic activity based on brain mapping of EEG adaptive time-frequency decomposition, in: International Conference on Brain Informatics, Springer, 2018, pp. 319–328.
[27] C. Park, D. Looney, N. ur Rehman, A. Ahrabian, D. P. Mandic, Classification of motor imagery BCI using multivariate empirical mode decomposition, IEEE Transactions on neural systems and rehabilitation engineering 21 (2012) 10–22.
[28] L. Rüschendorf, The Wasserstein distance and approximation theorems, Probability Theory and Related Fields 70 (1985) 117–129.
[29] M. Hu, H. Liang, Search for information-bearing components in neural data, PLoS One 9 (2014) e99793.
[30] A. Komaty, A. Boudraa, D. Dare, EMD-based filtering using the Hausdorff distance, in: 2012 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT), IEEE, 2012, pp. 000292–000297.
[31] H. Hao, H. Wang, N. Rehman, A joint framework for multivariate signal denoising using multivariate empirical mode decomposition, Signal Processing 135 (2017) 263–273.
[32] D. Piper, K. Schiecke, B. Pester, F. Benninger, M. Feucht, H. Witte, Time-variant coherence between heart rate variability and EEG activity in epileptic patients: an advanced coupling analysis between physiological networks, New Journal of Physics 16 (2014) 115012.
[33] P. Gaur, R. B. Pachori, H. Wang, G. Prasad, An automatic subject specific intrinsic mode function selection for enhancing two-class EEG-based motor imagery-brain computer interface, IEEE Sensors Journal 19 (2019) 6938–6947.
[34] H. K. Lee, J.-H. Lee, J.-O. Park, Y.-S. Choi, Data-driven data augmentation for motor imagery brain-computer interface, in: 2021 International Conference on Information Networking (ICOIN), IEEE, 2021, pp. 683–686.
[35] Z. Wu, N. E. Huang, Ensemble empirical mode decomposition: a noise-assisted data analysis method, Advances in adaptive data analysis 1 (2009) 1–41.
[36] E. Gallego-Jutglà, J. Solé-Casals, T. M. Rutkowski, A. Cichocki, Application of Multivariate Empirical Mode Decomposition for Cleaning Eye Blinks Artifacts from EEG Signals, in: IJCCI (NCTA), 2011, pp. 455–460.
[37] E. Gallego Jutglà, et al., New signal processing and machine learning methods for EEG data analysis of patients with Alzheimer’s disease, Ph.D. thesis, Universitat de Vic-Universitat Central de Catalunya, 2015.
[38] A. Zeiler, R. Faltermeier, I. R. Keck, A. M. Tomé, C. G. Puntonet, E. W. Lang, Empirical mode decomposition-an introduction, in: The 2010 International Joint Conference on Neural Networks (IJCNN), IEEE, 2010, pp. 1–8.
[39] J.-p. Zhao, D.-J. Huang, Mirror extending and circular spline function for empirical mode decomposition method, Journal of Zhejiang University (Science) 2 (2001) 247–252.
[40] A. Saibene, F. Gasparini, Human-Machine Interaction: EEG Electrode and Feature Selection Exploiting Evolutionary Algorithms in Motor Imagery Tasks, in: CENTRIC 2020 : The Thirteenth International Conference on Advances in Human-oriented and Personalized Mechanisms, Technologies, and Services, IARIA, ThinkMind, 2020, pp. 8–14.
[41] M. X. Cohen, A better way to define and describe Morlet wavelets for time-frequency analysis, NeuroImage 199 (2019) 81–86.
[42] R. Gonzalez, Digital Image Processing Using Matlab-Gonzalez Woods & Eddins. pdf. Education (2004).
[43] Z. Zhang, F. Duan, J. Sole-Casals, J. Dinares-Ferran, A. Cichocki, Z. Yang, Z. Sun, A novel deep learning approach with data augmentation to classify motor imagery signals, IEEE Access 7 (2019) 15945–15954.
[44] A. Saibene, F. Gasparini, GA for feature selection of EEG heterogeneous data, arXiv preprint arXiv:2103.07117 (2021).
[45] P. Rosenmai, Using the median absolute deviation to find outliers, Eureka Statistics (2013).