<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Classification through EEG Spectrogram Images</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Lorenzo Battisti</string-name>
          <email>lor.battisti5@stud.uniroma3.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alessio Ferrato</string-name>
          <email>ale.ferrato@stud.uniroma3.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Carla Limongelli</string-name>
          <email>limongel@dia.uniroma3.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mauro Mezzini</string-name>
          <email>mauro.mezzini@uniroma3.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="editor">
          <string-name>Sydney, Australia</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Education, Roma Tre University</institution>
          ,
          <addr-line>Viale del Castro Pretorio 20, 00185 Rome</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Engineering, Roma Tre University</institution>
          ,
          <addr-line>Via della Vasca Navale 79, 00146 Rome</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2016</year>
      </pub-date>
      <volume>618</volume>
      <fpage>406</fpage>
      <lpage>409</lpage>
      <abstract>
        <p>Emotion modeling for social robotics has great potential to improve the quality of life of the elderly and of individuals with disabilities by making communication, care, and interactions more effective. It can help individuals with communication difficulties express their emotions. It can also be used to monitor the emotional well-being of elderly persons living alone and alert caregivers or family members if there are signs of distress. More broadly, emotion modeling is necessary to design robots ever closer to human beings, capable of naturally interacting with them by understanding their behavior and reactions. Here, we propose a deep learning technique for emotion classification using electroencephalogram (EEG) signals. We aim to recognize valence, arousal, dominance, and likability. Our technique uses the spectrogram from each of the 32 electrodes placed on the scalp. Then, we employ a ResNet101 convolutional neural network to learn a model capable of predicting several emotions. We built and tested our model on the DEAP dataset.</p>
      </abstract>
      <kwd-group>
        <kwd>Emotion classification</kwd>
        <kwd>Electroencephalogram</kwd>
        <kwd>Deep Learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction and Background</title>
      <p>
        Automatic emotion recognition is a vast and complex
area of research. It has attracted the attention of
scientists in many fields, including psychology, artificial
intelligence, neuroscience, and robotics. The main goal
of this research is to create systems capable of
automatically recognizing and interpreting human emotions.
Emotions permeate every aspect of human experience,
from the pleasant joy of spending time with a loved one
to the pain of facing a difficult time in life. Several
models of emotion have been proposed in the literature,
which can be divided into two large groups: categorical
models that represent the space of all emotions as a
finite set, and dimensional models that represent emotions
as points in a continuous multidimensional space. Concerning
dimensional models, three main components are
frequently used to define emotions and affective states:
arousal, valence, and dominance [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Arousal refers to an
individual’s level of enthusiasm or activity. High arousal
levels are related to emotions of excitement, whereas
lower ones are associated with relaxation. The positivity
or negativity of an emotional experience is referred to as
valence, while dominance refers to the degree of control
an individual feels over an emotional state. In 2014, Wang
et al. [<xref ref-type="bibr" rid="ref6">6</xref>] collected EEG data from subjects watching movie
clips to assess the association between EEG data and
emotional states. Using a Support Vector Machine
classifier, the authors showed that representing the state-space
model in the form of linear dynamical systems removes
the noise not correlated with emotions. This makes the
classification of emotions more accurate. In 2018, Dabas
et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] proposed a 3D emotional model for classifying
the emotions of users watching music videos based on
the DEAP dataset [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. In 2019, Donmez and Ozkurt [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]
proposed to classify EEG signals by using a convolutional
neural network. They classified three emotions by using
brain signals and spectrogram images.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Methodology</title>
      <p>
        In this paper, we propose a machine learning [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]
technique, more precisely a deep learning [11] technique, for
the realization of a predictive model of emotions using
the EEG signal. We built and tested this model on
one of the best-known publicly available datasets, the DEAP dataset.
This dataset contains the EEG signals of 32 individuals
that were collected while the subjects watched and
listened to music videos taken from YouTube (https://www.youtube.com/). Each subject
was invited to view 40 one-minute videos and then asked
to express her emotions on the dimensional model shown
in Figure 1. Additionally, a parameter called likability
was used to quantify how much the participant liked the
stimulus. For each dimension, the participant was asked
to rate its intensity on a continuous scale between 1 and
9, where 1 stands for minimum intensity, and 9 for
maximum intensity. The EEG signal consists of 32 channels,
each corresponding to an electrode that measures the
difference in electric potential in the scalp area where
it is positioned. The proposed methodology is based
on the spectral analysis of the signal. This is achieved
by applying the Discrete Fourier Transform to the
signal, thus obtaining the power of the individual sinusoids
that make up the signal. The spectrogram obtained for
each EEG channel is a two-dimensional matrix where
each cell (f, t) represents the intensity of the sinusoid at
the frequency f in the time segment t (for more details,
please refer to [12]). Figure 2 shows an example of the
spectral data of the first two EEG channels of one
participant while watching one video. The continuous scale
of each emotion e was transformed into a binary value
b(e) ∈ {0, 1}, so that b(e) = 0 if e &lt; 5 and b(e) = 1 if e ≥ 5.
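As an illustration, the following is a minimal Python sketch of these two preprocessing steps. It assumes the 128 Hz sampling rate of the preprocessed DEAP recordings, and the STFT window parameters are hypothetical, since the paper does not report the exact settings (see [12] for the method actually used).
<preformat>
import numpy as np
from scipy import signal

FS = 128          # sampling rate of the preprocessed DEAP EEG (Hz)
N_CHANNELS = 32   # one spectrogram per electrode

def eeg_to_spectrograms(eeg, nperseg=128, noverlap=64):
    """eeg: array of shape (32, n_samples) for one experiment.
    Returns a (32, n_freqs, n_segments) tensor of log-power spectrograms."""
    specs = []
    for ch in range(N_CHANNELS):
        f, t, Sxx = signal.spectrogram(eeg[ch], fs=FS,
                                       nperseg=nperseg, noverlap=noverlap)
        specs.append(np.log1p(Sxx))  # log scale compresses the dynamic range
    return np.stack(specs)

def binarize(rating):
    """Map a self-assessment rating e in [1, 9] to b(e) in {0, 1}."""
    return 1 if rating >= 5 else 0
</preformat>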
In the following, we denote the viewing of a single video
by a subject as an experiment. We divided the total number
of experiments (i.e., 40 × 32 = 1280) by reserving 32
experiments for the validation set, 32 for the test set, and
the remaining ones for the training set. The experiments
belonging to the validation and test sets were arbitrarily
chosen, one for each participant and each relating to a
different video, so that the experiments with positive
emotions (in which b(e) = 1) and those with negative ones
(in which b(e) = 0) approximately balance each other.
We used the ResNet101 [13] convolutional neural network,
suitably adapted to take as input
a tensor with an arbitrary number of input channels.
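One plausible PyTorch realization of this adaptation is the following; the paper gives no implementation details, so the layer shapes here are illustrative (the stem is replaced to accept 32-channel spectrogram tensors, and the head to emit the two classes b(e) = 0 and b(e) = 1).
<preformat>
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet101(weights=None)
# Stem: accept 32 spectrogram channels instead of 3 RGB channels.
model.conv1 = nn.Conv2d(32, 64, kernel_size=7, stride=2,
                        padding=3, bias=False)
# Head: two output classes for the binarized emotion b(e).
model.fc = nn.Linear(model.fc.in_features, 2)

x = torch.randn(8, 32, 65, 115)  # (batch, channel, freq bins, time bins)
logits = model(x)                # shape: (8, 2)
</preformat>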
We empirically tested different hyperparameter
configurations through a grid search, obtaining the following
optimal values, wired together in the sketch after this list:
        <list list-type="bullet">
          <list-item><p>Loss function: Cross-Entropy</p></list-item>
          <list-item><p>Optimizer: Stochastic Gradient Descent (SGD)</p></list-item>
          <list-item><p>Momentum: 0.9</p></list-item>
          <list-item><p>Weight decay: 0.0005</p></list-item>
          <list-item><p>Learning rate: 3.0e-3</p></list-item>
        </list>
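A minimal PyTorch sketch of this configuration (model is the adapted network from the previous snippet; train_loader is an assumed DataLoader yielding batches of spectrogram tensors and binary labels):
<preformat>
import torch.nn as nn
import torch.optim as optim

# Optimal configuration found by the grid search.
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=3.0e-3,
                      momentum=0.9, weight_decay=0.0005)

for epoch in range(300):
    for specs, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(specs), labels)
        loss.backward()
        optimizer.step()
</preformat>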
The training set size was limited, so the network tended to
overfit after about 200-300 epochs, reaching 100% accuracy
on the training set. Therefore, we introduced a data
augmentation step based on a horizontal shift of the
spectrogram images.
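A minimal sketch of such an augmentation, assuming a random shift along the time axis implemented as a circular roll (the paper does not specify the shift bound or the boundary handling):
<preformat>
import numpy as np

def horizontal_shift(spec, max_shift=16, rng=np.random):
    """Randomly shift a (channels, freq, time) spectrogram along its
    time axis; max_shift is an assumed bound."""
    k = rng.randint(-max_shift, max_shift + 1)
    return np.roll(spec, k, axis=-1)
</preformat>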
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Conclusions and Future Works</title>
      <p>
In the research literature, it has been largely shown that
the knowledge of the user’s emotions can make a
significant contribution to the creation of increasingly effective
human-machine interaction systems. Several aspects can
be analyzed to recognize emotions and, more generally,
the user’s affective state. In this article, we have
presented a deep learning approach to EEG signal analysis.
Specifically, a ResNet101 convolutional neural network
takes the EEG spectrogram as input and returns the
values of arousal, valence, dominance, and likability.
      </p>
      <p>Our idea is still evolving, so the possible future
developments are manifold. These developments can be
methodological or applicative. As regards the former, clearly the
data at our disposal are too limited to fully exploit the
potential of deep neural networks. We, therefore, need new
data, so we are planning to collect it ourselves with the
appropriate instrumentation. Another aspect concerns
the deep neural network chosen. The ResNet101 is one of
the many possibilities that deep learning research makes
available today. A further development of our work
concerns the data augmentation process, which has been
shown to be able to improve the model accuracy. In the
system described, the data augmentation concerned only
the horizontal shift. Hence, we want to apply new
geometric transformations and image processing techniques
and verify whether they can further improve the
accuracy of the results. As far as application developments are
concerned, our idea is to combine physiological data with
those related to facial expressions and eye tracking. Our
ultimate goal is to improve human-machine interaction,
both when the user is dealing with social robots and with
recommender systems [14, 15, 16] or multimedia
applications [17, 18, 19]. For instance, the information related to
the emotions that the user feels when faced with a certain</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A. F.</given-names>
            <surname>Bulagang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. G.</given-names>
            <surname>Weng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mountstephens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Teo</surname>
          </string-name>
          ,
          <article-title>A review of recent approaches for emotion classification using electrocardiography and electrodermography signals</article-title>
          ,
          <source>Informatics in Medicine Unlocked</source>
          <volume>20</volume>
          (
          <year>2020</year>
          )
          <fpage>100363</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
            <surname>Russell</surname>
          </string-name>
          ,
          <article-title>A circumplex model of affect</article-title>
          ,
          <source>Journal of Personality and Social Psychology</source>
          <volume>39</volume>
          (
          <year>1980</year>
          )
          <fpage>1161</fpage>
          -
          <lpage>1178</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>F.</given-names>
            <surname>Cavallo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Semeraro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Fiorini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Magyar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Sinčák</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Dario</surname>
          </string-name>
          ,
          <article-title>Emotion modelling for social robotics applications: a review</article-title>
          ,
          <source>Journal of Bionic Engineering</source>
          <volume>15</volume>
          (
          <year>2018</year>
          )
          <fpage>185</fpage>
          -
          <lpage>203</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J.</given-names>
            <surname>Deng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Ren</surname>
          </string-name>
          ,
          <article-title>A survey of textual emotion recognition and its challenges</article-title>
          ,
          <source>IEEE Transactions on Affective Computing</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>N.</given-names>
            <surname>Webb</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ruiz-Garcia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Elshaw</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Palade</surname>
          </string-name>
          ,
          <article-title>Emotion recognition from face images in an unconstrained environment for usage on social robots</article-title>
          , in: 2020
          <source>International Joint Conference on Neural Networks (IJCNN)</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>X.-W.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Nie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.-L.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <article-title>Emotional state classification from EEG data using machine learning approach</article-title>
          ,
          <source>Neurocomputing</source>
          <volume>129</volume>
          (
          <year>2014</year>
          )
          <fpage>94</fpage>
          -
          <lpage>106</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>H.</given-names>
            <surname>Dabas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Sethi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Dua</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dalawat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Sethia</surname>
          </string-name>
          ,
          <article-title>Emotion classification using EEG signals</article-title>
          ,
          in:
          <source>Proceedings of the 2018 2nd International Conference on Computer Science and Artificial Intelligence</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>380</fpage>
          -
          <lpage>384</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S.</given-names>
            <surname>Koelstra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Muhl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Soleymani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-S.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Yazdani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Ebrahimi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Pun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nijholt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Patras</surname>
          </string-name>
          ,
          <article-title>DEAP: A database for emotion analysis using physiological signals</article-title>
          ,
          <source>IEEE Transactions on Affective Computing</source>
          <volume>3</volume>
          (
          <year>2012</year>
          )
          <fpage>18</fpage>
          -
          <lpage>31</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>H.</given-names>
            <surname>Donmez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ozkurt</surname>
          </string-name>
          ,
          <article-title>Emotion classification from EEG signals in convolutional neural networks</article-title>
          ,
          in:
          <source>2019 Innovations in Intelligent Systems and Applications Conference (ASYU)</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>L.</given-names>
            <surname>Vaccaro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Sansonetti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Micarelli</surname>
          </string-name>
          ,
          <article-title>An empirical review of automated machine learning</article-title>
          ,
          <source>Computers</source>
          <volume>10</volume>
          (
          <year>2021</year>
          )
          <fpage>11</fpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>