=Paper=
{{Paper
|id=Vol-2564/shortarticle_6-CRoNe2019
|storemode=property
|title=Convolutional neural network for cognitive task prediction from EEG’s auditory steady state responses
|pdfUrl=https://ceur-ws.org/Vol-2564/shortarticle_6-CRoNe2019.pdf
|volume=Vol-2564
|authors=Daniela Montilla-Trochez,Rodrigo Salas,Alejandro Bertin,Inga Griskova-Bulanova,Paulo Lisboa,Carolina Saavedra
|dblpUrl=https://dblp.org/rec/conf/crone/Montilla-Trochez19
}}
==Convolutional neural network for cognitive task prediction from EEG’s auditory steady state responses==
<pdf width="1500px">https://ceur-ws.org/Vol-2564/shortarticle_6-CRoNe2019.pdf</pdf>
<pre>
                                         Proceedings of the 4th Congress on Robotics and Neuroscience


                                       Convolutional neural network for
                                       cognitive task prediction from EEG’s
                                       auditory steady state responses
                                       Daniela Montilla-Trochez1 , Rodrigo Salas1,4 , Alejandro Bertin1 , Inga
                                       Griskova-Bulanova2* , Paulo Lisboa3 , Carolina Saavedra1*

                                       1 Universidad de Valparaíso; 2 Vilnius University; 3 Liverpool John Moores University;
                                       4 Centro de Investigación y Desarrollo en Ingeniería en Salud

                                       *For correspondence: daniela.montilla@postgrado.uv.cl (DM); carolina.saavedra@uv.cl (CS)

                                       Present address: † Escuela de Ingeniería C. Biomédica, Universidad de Valparaíso, Chile;
                                       ‡ Department of Neurobiology and Biophysics, Vilnius University, Lithuania; § School of Applied
                                       Mathematics, Liverpool John Moores University, United Kingdom; ¶ Centro de Investigación
                                       y Desarrollo en Ingeniería en Salud, CINGS-UV, Universidad de Valparaíso, Chile.

                                       Abstract The prediction of cognitive tasks from electroencephalography (EEG) signals has
                                       made it possible to discriminate the cognitive states of subjects and to carry out robust
                                       monitoring of cognition, which is associated with the attention and performance of an
                                       individual’s behavior and allows greater control in experiments. The objective of this work is to
                                       predict tasks as a function of the auditory steady-state response (ASSR).
                                       Twenty-two subjects underwent three types of tasks: counting, reading and rest, accompanied by a
                                       constant stimulus. Images were obtained from the Inter Trial Phase Coherence (ITPC) to train
                                       classification algorithms based on convolutional neural networks (CNN) in order to separate the
                                       tasks performed by the subjects. Performance evaluation of the classification algorithm shows very
                                       good separation between count, read and rest, with an AUROC of 0.95. This is considerably better
                                       than a feedforward neural network and a pre-trained convolutional deep neural network.


                                       Introduction
                                       Task detection from electroencephalography (EEG) signals allows us to discriminate between specific
                                       cognitive states and so monitor cognition. This is associated with the attention and performance of
                                       an individual’s behavior (Papakostas et al., 2017). Parameters such as functional connectivity can be
                                       used to distinguish between cognitive states based on the individual’s brain (Gaut et al., 2018), also
                                       considering that the late components of the evoked potentials are related to discrimination tasks
                                       that reveal complex cognitive processes (Saavedra and Bougrain, 2012; Saavedra et al., 2019).
                                           It is important to highlight that the discrimination of mental tasks aims to monitor the
                                       behavior of an individual, rather than to establish a correlation between EEG measurements and
                                       the final result of the tasks. In particular, there are patterns that might make it possible to detect
                                       cognitive states across different users (Palaniappan and Raveendran, 2001).
                                           Part of this behavior is associated with the individual’s hearing, which can be monitored
                                       through evoked potentials that estimate hearing sensitivity. Specifically, Auditory Steady
                                       State Responses (ASSRs) are used to measure the ability of local cortical networks to generate
                                       activity, and thus to differentiate individuals with normal hearing sensitivity from those with
                                       varying degrees of sensorineural hearing loss (Korczak et al., 2012).
                                           ASSRs are obtained when a periodically presented auditory stimulus produces an
                                       electroencephalographic response. Although some studies address tasks and use artificial
                                       neural networks to classify Event-Related Potential (ERP) waveforms from the EEG
                                       (Gupta et al., 1995), few involve a constant auditory stimulus.

                                          Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

    Deep neural networks have been applied successfully in different fields, with performance
that surpasses conventional machine learning techniques. In particular,
convolutional neural networks are a special type of deep network that has proved very effective
in classifying images because they extract relevant features, which they
feed to a non-linear classifier (Goodfellow et al., 2016). In addition, a wide variety of
deep learning models have been proposed to classify images in complex contexts (see, for
example, Mellado et al. (2019)).
    The objective of the present work is to predict the mental task performed by a
subject from the EEG signal recorded while the subject is under a steady-state response to an auditory stimulus.


Methods and Material
Acquisition of EEG Signals
We used a dataset from Voicikas et al. (2016) consisting of 28 healthy young male subjects. The
auditory stimulus used was the Click trial, which consisted of 20 identical bursts of white noise
with a duration of 1.5 ms. The subjects underwent three tasks. In the first, the subject counted
the number of presentations of the stimulus and, to guarantee attention, was asked at the end to
report the number of stimuli counted. In the second, the subject had to ignore the
stimulus presented and try to keep his/her mind blank. Finally, in the third, he/she was
asked to silently read an easily readable text presented on a computer screen; at
the end of the experiment, the subjects were briefly questioned about the content of the material
to verify attention.
    The channels selected to extract the response to the stimulus were
F3, F1, Fz, F2, F4, FC3, FC1, FCz, FC2, FC4, C3, C1, Cz, C2 and C4. The sampling frequency was set at
1024 Hz, and the number of trials per subject varied between 100 and 120.

Coherent Averaging
The coherent averaging of 𝑁 trials of EEG signals consists in averaging, at each time instant,
signal segments of equal length that start at stimulus onset. The
objective of this technique is to reduce noise and random activations while
highlighting the potentials evoked in response to the stimulus. In this work,
coherent averages were computed over every 15 trials of EEG signals of the same task.
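
The averaging step can be illustrated with a minimal Python sketch. It assumes that the 15-trial groups are consecutive and non-overlapping and that leftover trials are discarded; the paper does not state either detail. The function name coherent_average and the toy data are hypothetical.

```python
from statistics import fmean

def coherent_average(trials, group_size=15):
    """Average consecutive groups of `group_size` stimulus-locked trials.

    `trials` is a list of equal-length lists of samples, each assumed to
    start at stimulus onset. Trials that do not fill a complete group are
    dropped (an assumption, not stated in the paper).
    """
    if not trials:
        return []
    n_samples = len(trials[0])
    if any(len(t) != n_samples for t in trials):
        raise ValueError("all trials must have the same length")
    averages = []
    for start in range(0, len(trials) - group_size + 1, group_size):
        group = trials[start:start + group_size]
        # Average sample by sample across the trials of the group.
        averages.append([fmean(samples) for samples in zip(*group)])
    return averages

# Toy data: 30 trials of 4 samples; trial k is a constant signal of value k.
toy_trials = [[float(k)] * 4 for k in range(30)]
averaged = coherent_average(toy_trials, group_size=15)
```

With the toy data, the 30 trials collapse into two averaged segments, one per group of 15.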

Inter Trial Phase Coherence
The Inter Trial Phase Coherence (ITPC) is a computational technique that averages the complex
representation of a unit vector, obtained from the phase angle of a trial at a given time, represented
using Euler’s formula. The method was introduced by Tallon-Baudry et al. (1996). The ITPC is
mathematically defined by the following equation:

    \[ \mathrm{ITPC}_{tf} = \left| n^{-1} \sum_{r=1}^{n} e^{i k_{tfr}} \right| \tag{1} \]

where $n$ represents the number of trials and $e^{i k_{tfr}}$ is the complex polar representation of the phase
angle $k$ on trial $r$, at time-frequency point $tf$.
   The result of the ITPC is an image whose pixel values lie in the range 0 to 1,
where 0 indicates evenly distributed phase angles and 1 indicates completely identical phase angles,
corresponding to complete coherence (Delorme and Makeig, 2004).
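
As a minimal sketch of Equation (1), the following Python function computes the ITPC at a single time-frequency point from the phase angles of the trials. It assumes the phase angles (in radians) have already been extracted by a time-frequency decomposition, which is not shown here; the function name itpc and the example values are hypothetical.

```python
import cmath
import math

def itpc(phases):
    """ITPC at one time-frequency point from the phase angles (radians) of n trials."""
    n = len(phases)
    if n == 0:
        raise ValueError("at least one trial is required")
    # Map each phase angle k to the unit vector e^{ik} and average over trials.
    mean_vector = sum(cmath.exp(1j * k) for k in phases) / n
    # The length of the mean vector is the ITPC value, between 0 and 1.
    return abs(mean_vector)

# Identical phases on all 15 trials: complete coherence.
identical = itpc([0.7] * 15)
# Phases spread evenly around the circle: no coherence.
spread = itpc([2 * math.pi * r / 8 for r in range(8)])
```

Evaluating this function at every time-frequency point yields the image described above, with values near 1 where the phase is locked to the stimulus and near 0 where it is not.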
Figure 1. Images obtained with the ITPC method for each of the mental tasks: (a) Count, (b) Read, (c) Rest.
Each image comes from a different subject.


   In this work, the ITPC images (see Figure 1) are re-scaled before being used in the classifiers.

Classifiers
In this work, three types of classifiers are evaluated, as explained below:

  1. Feedforward or Fully Connected Neural Network: This is a type of artificial neural network
     consisting of 1 layer of input neurons, 1 or more layers of hidden neurons and 1 output layer.
     All neurons in one layer are connected to all neurons in the next layer. This type of network
     has no recurrence, no lateral connections, and no connections to layers beyond the next
     one. The learning algorithm used is backpropagation. These networks have the property of
     being universal approximators (for more details, see Allende et al. (2001)).
     In this work, the network architecture consists of 2 hidden layers, and the weights are
     optimized with RMSprop.
  2. Convolutional Neural Network (CNN): This is a type of artificial neural network that
     has been successfully applied in computer vision. Its name comes from the mathematical
     operation that is carried out in at least one of its layers, the convolution. A CNN is composed
     of at least 3 types of layers:

         • Convolution layer: The convolution operation receives the image as input and
           applies a filter or kernel to it. This layer returns a feature map of the original
           image, whose dimensions decrease according to the kernel size.
         • Pooling layer: The purpose of this layer is to reduce the spatial dimensions of the input
           volume for the next convolutional layer without affecting depth. The reduction in size
           and the accompanying loss of information are beneficial because the smaller network
           requires less computation in the following layers and can also be less prone to
           overfitting.
         • Fully Connected Layer: This is used as the last layer in the CNN. The outputs of the filters
           are flattened and the information passes through non-linear activation functions. This layer
           is responsible for classifying the images. The number of output neurons is equal to
           the number of classes.

      It should be noted that a CNN can consist of several convolution layers and several pooling
      layers. In this work, the CNN architecture is composed of two stages. In the first
      stage there are 4 convolution layers with (3x3) kernels, followed by average
      pooling and max pooling layers and 4 batch normalization layers. The convolution layers are
      responsible for feature extraction and the pooling layers for dimensionality reduction. In the
      second stage there are 2 fully-connected layers, responsible for classification. The learning
      algorithm used is RMSprop.
      In this work, an ad-hoc CNN model was developed for the available data, and the resulting
      architecture aims to remain parsimonious. Because the CNN model is combined with ITPC
      images, the model is called Hybrid-CNN.
  3. VGG16: The VGG16 model is a convolutional neural network with a specific architecture that
     has been applied in different contexts (see Simonyan and Zisserman (2014)). The architecture
     of the model is composed of 16 layers, of which 13 are convolutional and 3 are fully-connected.
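
To make the effect of the convolution and pooling layers on image size concrete, here is a minimal pure-Python sketch. It assumes no padding, a stride of 1 for the convolution and non-overlapping 2x2 max pooling; these settings, the mean filter and the toy image are illustrative assumptions and are not taken from the architectures evaluated in this work.

```python
def conv2d_valid(image, kernel):
    """'Valid' 2-D cross-correlation (the operation CNN libraries call convolution)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; incomplete border windows are dropped."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[i * size + a][j * size + b]
                for a in range(size) for b in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

# Toy 6 x 8 image and a 3 x 3 mean filter (both hypothetical).
image = [[float(r * 8 + c) for c in range(8)] for r in range(6)]
kernel = [[1.0 / 9.0] * 3 for _ in range(3)]
features = conv2d_valid(image, kernel)  # 4 rows x 6 columns
pooled = max_pool(features, size=2)     # 2 rows x 3 columns
```

Under the same assumptions, a 3x3 kernel would reduce the 21 × 717 ITPC images used in this work to 19 × 715 feature maps before any pooling.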


Framework of the proposed model
In this paper, a framework for the classification of tasks from ASSR signals is proposed. The scheme
used is shown in Figure 2 and consists of the following stages:

  1. Data Acquisition: The EEG signals were recorded using an ANT device of 50 mV/V and 64
     WaveGuard EEG channels.
  2. Channel selection and coherent trial averaging: From the EEG signals, the 15 channels closest
     to Cz, where the greatest response to the stimulus is observed, were selected (see section
     Acquisition of EEG Signals). A coherent average of 15 trials is computed for each channel.
  3. ITPC: The ITPC method explained in the Inter Trial Phase Coherence section is applied to
     generate the spectral images. The resulting images have dimensions 21 × 717. An image bank
     with 7155 samples was obtained.
  4. Feature Extraction: This stage consists of a series of filters that are convolved with the
     input. Afterwards, max-pooling operations are applied, which downsample the feature maps
     and, at the same time, act as detectors of relevant features.
  5. Classification: At this stage, fully connected layers of neurons with non-linear activations
     are used. From the activated filters obtained in the previous stage, a feature vector is generated
     and processed by the neural classifier. The result is the final classification into one of the
     following labels: Read, Count and Rest.


Figure 2. Flowchart of the proposed method


Results
This section presents the results of the different types of classifiers that were implemented to
predict the tasks from the ASSR data. The best results for each model are reported. Figure 3 shows
the confusion matrix of each of the classifiers.


Figure 3. Confusion matrices on the test data for the three types of models: (a) Feedforward Artificial
Neural Network, (b) VGG16, (c) Hybrid-CNN
    Table 1 shows the performance metrics of the models evaluated on the test dataset. For all
the tasks, the best results were obtained by the Hybrid-CNN model, followed by the VGG16, with the
Feedforward Artificial Neural Network (FANN) obtaining the worst performance. Across the tasks, the
recall and precision values for the Hybrid-CNN classifier range between 0.841 and 0.867, while for
the VGG16 they range between 0.652 and 0.711. This indicates that the Hybrid-CNN model
outperforms the other models in recognizing tasks correctly.


                                       Count            Read             Rest
                 FANN
                 F1_Score              0.125 ± 0.262    0.253 ± 0.283    0.311 ± 0.267
                 Recall_Score          0.163 ± 0.343    0.193 ± 0.229    0.246 ± 0.244
                 Precision_Score       0.163 ± 0.343    0.423 ± 0.480    0.539 ± 0.481
                 VGG16
                 F1_Score              0.689 ± 0.042    0.679 ± 0.019    0.675 ± 0.041
                 Recall_Score          0.676 ± 0.090    0.672 ± 0.063    0.702 ± 0.052
                 Precision_Score       0.711 ± 0.039    0.694 ± 0.056    0.652 ± 0.044
                 Hybrid-CNN
                 F1_Score              0.866 ± 0.020    0.846 ± 0.019    0.848 ± 0.017
                 Recall_Score          0.867 ± 0.032    0.852 ± 0.033    0.841 ± 0.034
                 Precision_Score       0.866 ± 0.032    0.841 ± 0.028    0.856 ± 0.025

Table 1. Classifier performance evaluation metrics
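
For reference, the per-class metrics of Table 1 follow the standard definitions of precision, recall and F1 score. The sketch below computes them from a confusion matrix whose rows are true labels and whose columns are predicted labels. The counts in toy_confusion are hypothetical and are not the values behind Figure 3 or Table 1; the function name per_class_metrics is also an assumption.

```python
def per_class_metrics(confusion, labels):
    """Precision, recall and F1 per class; rows are true labels, columns predictions."""
    metrics = {}
    for idx, label in enumerate(labels):
        tp = confusion[idx][idx]
        predicted = sum(row[idx] for row in confusion)  # column total
        actual = sum(confusion[idx])                    # row total
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics

labels = ["Count", "Read", "Rest"]
# Hypothetical counts, for illustration only.
toy_confusion = [
    [80, 10, 10],
    [5, 85, 10],
    [15, 5, 80],
]
scores = per_class_metrics(toy_confusion, labels)
```

Note that Table 1 reports each metric with a ± spread, so the F1 values listed there need not equal the F1 score computed from the listed precision and recall.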


                                      Loss             Accuracy         MSE
                   FANN               6.286 ± 2.345    0.583 ± 0.076    0.408 ± 0.100
                   VGG16              0.781 ± 0.056    0.682 ± 0.024    0.147 ± 0.010
                   Hybrid-CNN         0.274 ± 0.030    0.903 ± 0.010    0.074 ± 0.007

Table 2. Loss, accuracy and mean squared error (MSE) of each classifier


   As can be seen in the ROC curves (Figure 4), the VGG16 and Hybrid-CNN models lie above the
non-discrimination line, whereas the FANN model lies very close to it. Of
the three models, the VGG16 performs well in classifying the
tasks but is not optimal; in contrast, the Hybrid-CNN shows the best
performance in all the tasks, obtaining the largest area under the curve.
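
The area under the ROC curve can be computed directly from classifier scores as the probability that a randomly chosen positive sample is scored higher than a randomly chosen negative one. The sketch below shows this for a single binary (one-vs-rest) problem. The paper does not state how the per-task curves were derived, so the one-vs-rest setting, the function name auroc and the toy scores are all assumptions.

```python
def auroc(scores, labels):
    """Area under the ROC curve for binary labels (1 = positive, 0 = negative)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0   # positive ranked above negative
            elif p == q:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))

# Hypothetical scores for one task versus the other two.
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.4, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]
auc = auroc(toy_scores, toy_labels)
```

A value of 0.5 corresponds to the non-discrimination line, and a value of 1 to perfect separation.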
Figure 4. ROC curve for each of the classifiers: (a) Feedforward, (b) VGG16, and (c) Hybrid-CNN. The Hybrid-CNN
shows a better AUC than the other two methods for the cognitive classification task.


Conclusion
We have proposed the application of a convolutional neural network to analyze and classify EEG
signals of steady-state auditory responses. To improve the performance of the convolutional neural
network, coherent averaging of 15 trials was performed and images were then obtained by applying
the ITPC method. The results show that, with the proposed pipeline, cognitive tasks
can be predicted with an AUROC of 0.95, corresponding to a sensitivity (recall)
score of 0.85.
    Future work is required to increase the number of subjects in the study, especially
to include people with some alteration or who present cognitive difficulties.


Acknowledgments
The authors acknowledge the support of the grant REDI170367 from CONICYT.


References
Allende H, Moraga C, Salas R. Artificial Neural Networks in Time Series Forecasting: A Comparative Analysis.
   Kybernetika. 2001; 38(6):685–707.

Delorme A, Makeig S. EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including
  independent component analysis. Journal of neuroscience methods. 2004; 134(1):9–21.
Gaut G, Li X, Turner B, Cunningham WA, Lu ZL, Steyvers M. Predicting Task and Subject Differences with
  Functional Connectivity and BOLD Variability. arXiv:1807.04745 [q-bio]. 2018 Jul;
  http://arxiv.org/abs/1807.04745.

Goodfellow I, Bengio Y, Courville A. Deep Learning. MIT Press; 2016. http://www.deeplearningbook.org.

Gupta L, Molfese DL, Tammana R. An artificial neural-network approach to ERP classification. Brain and
  cognition. 1995; 27(3):311–330.

Korczak P, Smart J, Delgado R, M Strobel T, Bradford C. Auditory Steady-State Responses; 2012.
  https://www.ingentaconnect.com/content/aaa/jaaa/2012/023/003/art03, doi: 10.3766/jaaa.23.3.3.

Mellado D, Saavedra C, Chabert S, Torres R, Salas R. Self-Improving Generative Artificial Neural Network
  for Pseudo-Rehearsal Incremental Class Learning. Preprints. 2019; 2019070121:1–17.
  doi: 10.20944/preprints201907.0121.v1.

Palaniappan R, Raveendran P. Cognitive task prediction using parametric spectral analysis of EEG signals.
  Malaysian Journal of Computer Science. 2001; 14(1):58–67.

Papakostas M, Tsiakas K, Giannakopoulos T, Makedon F. Towards predicting task performance from EEG
  signals. In: 2017 IEEE International Conference on Big Data (Big Data); 2017. p. 4423–4425.
  doi: 10.1109/BigData.2017.8258478.

Saavedra C, Salas R, Bougrain L. Wavelet-based semblance methods to enhance single-trial ERP detection. To
  be published in Computational Intelligence and Neuroscience. 2019.

Saavedra C, Bougrain L. Processing stages of visual stimuli and event-related potentials. In: The
  NeuroComp/KEOpS’12 workshop; 2012.

Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. arXiv preprint
  arXiv:1409.1556. 2014.

Tallon-Baudry C, Bertrand O, Delpuech C, Pernier J. Stimulus specificity of phase-locked and non-phase-locked
  40 Hz visual responses in human. Journal of Neuroscience. 1996; 16(13):4240–4249.

Voicikas A, Niciute I, Ruksenas O, Griskova-Bulanova I. Effect of attention on 40Hz auditory steady-state
  response depends on the stimulation type: Flutter amplitude modulated tones versus clicks. Neuroscience
  Letters. 2016; 629:215–220. doi: 10.1016/j.neulet.2016.07.019.

</pre>