<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Data Collection And Processing Problems in Automatic EEG Emotion Recognition</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Alexander Sergeev</string-name>
          <email>sergeevalxndr@yandex.ru</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andrey Bilyi</string-name>
          <email>bilyi andrei@mail.ru</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>ITMO University</institution>
          ,
          <addr-line>49 Kronverksky Pr., St. Petersburg, 197101</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Research on automatic emotion recognition is increasingly important and has applications in fields ranging from military affairs (polygraphs) to medicine (psycho-correction) and entertainment. Advances in computer technology have made it possible to build systems that detect emotions automatically in real time. However, many modern emotion detection techniques rely heavily on large amounts of valid and suitable data. Methods for obtaining data of sufficient quality therefore need to be studied. This article analyzes the methods and problems of determining a person's emotional state using electroencephalography. It considers the problems of eliciting and evaluating emotions, as well as data processing when developing a software module for determining emotions. As a result of the analysis, recommendations for data collection were made, key features for machine learning algorithms were determined, and a program of experiments for data collection was developed.</p>
      </abstract>
      <kwd-group>
        <kwd>Emotional state</kwd>
        <kwd>Emotion recognition</kwd>
        <kwd>Electroencephalography</kwd>
        <kwd>Psycho-correction</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Emotions play a major role in everyday life. Their influence extends
to many aspects of human life, from communication and interaction with
people to decision making. Information about experienced emotions is widely
used in various fields. For example, determining the response to a given
stimulus can help correct a psycho-physiological state, allow an assessment of a
person's attitude to the environment (workplace, equipment used) so that
appropriate conclusions can be drawn, or help improve a content recommendation system
through more detailed feedback from users.</p>
      <p>Copyright © 2019 for this paper by its authors. Use permitted under Creative
Commons License Attribution 4.0 International (CC BY 4.0).</p>
      <p>Automatic and accurate determination of an emotional state is therefore
important. With the development of information technology and the current
availability of medical and computer equipment, research in this area is also becoming
more accessible and attractive.</p>
    </sec>
    <sec id="sec-2">
      <title>Subject area</title>
      <p>Neurophysiological studies of human brain activity lag significantly behind
comparable animal studies in the depth and completeness of their data. At the
same time, the need to apply these methods in practice
(medicine, optimization of operator activity, the creation of neurocomputer
interfaces) calls for continued fundamental research on the human brain.
Methods and technologies for determining emotions can be applied in various
areas of society, from law enforcement agencies (polygraphs) to the
entertainment field (adjusting the game environment to the current emotional
state) and medical institutions (psychocorrection). However, the most relevant
area of application for this study is research on the functional state of a
person.</p>
      <p>Emotions can be recognized with audiovisual
methods, such as the recognition and analysis of facial expressions, speech, body language
and others, but these methods do not always give a reliable
assessment of a person's current emotional state. Facial expressions, speech, body
movements, and other external physical characteristics can easily be altered at
will, and can also be misinterpreted with respect to the emotion actually
experienced. The mental state is more difficult for a person to control. Therefore,
determining the emotional state by measuring the bioelectric activity of the
brain is of great scientific and practical value.</p>
      <p>
        The information theory of emotions by P.V. Simonov was chosen as the main
theoretical basis of this study. According to this theory, emotion is a reflection
by the human or animal brain of an actual need (its quality and magnitude)
and of the likelihood (possibility) of its satisfaction, which the brain evaluates on
the basis of genetic and previously acquired individual experience [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. In general,
the rule for the emergence of emotions can be represented by the formula:
        <disp-formula id="eq1">
          <label>(1)</label>
          <tex-math>E = f\left[N,\ (I_r - I_h),\ \ldots\right]</tex-math>
        </disp-formula>
where E is the emotion; N is the strength and quality of the actual need; (I<sub>r</sub> - I<sub>h</sub>) is the
assessment of the ability to meet the need based on innate and ontogenetic experience; I<sub>r</sub>
is the information about the means required to satisfy the need; and I<sub>h</sub> is the information about
the means to achieve the goal that the subject has at the current moment.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Problems of determining emotions using EEG</title>
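      <p>The main problems in the field of determining emotions using EEG are:</p>
      <list list-type="bullet">
        <list-item><p>The complexity of eliciting a specific emotion</p></list-item>
        <list-item><p>The complexity of an objective and unambiguous assessment</p></list-item>
        <list-item><p>Data processing</p></list-item>
      </list>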
      <p>
        EEG-based emotion recognition imposes certain restrictions on the available
methods. Working with electrical signals requires eliciting a specific emotion. In
the case of facial recognition, an emotion can be reproduced using Paul Ekman's
Facial Action Coding System (FACS) [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]; however, such methods are not applicable here. In this case,
to extract a pure emotion, it is necessary to immerse a person in a situation where
they can experience it, to make them feel a need for something that can cause the
emotion. For example, when shown a video depicting an injustice,
a person will empathize with the events in the video and is likely to
experience positive emotions if the injustice is eliminated, or negative emotions if the
situation is not resolved well.
      </p>
      <p>Before conducting experiments, it is recommended to carry out an initial
assessment of the test subjects: to determine their interests, personal preferences and
other psychological characteristics. Based on this information, it
will be possible to select the stimuli that are best suited for a particular subject.
At this stage, the temperament of the subject can also be determined with a
test, which can later help with training the machine learning model. Some
other parameters, such as gender, age and dominant hand, can also complete the
feature set. The last parameter can greatly affect the outcome, since
the cerebral hemispheres work differently in right- and left-handed
people.</p>
      <p>To elicit a specific emotion, it is necessary to identify suitable stimuli. They can be
presented as various forms of media content (music, films, pictures, computer
games), as well as through actions involving the test subject. For example, a
compliment can cause embarrassment or joy, as can offering the test subject a cup
of tea. It is worth noting that different stimuli have different levels of immersion
and different effects on the emotional state. For example, a funny or sad
music video can cause negative emotions instead of the intended joy or sadness
because the subject does not like the musical genre. Computer games provide the
highest level of immersion, but may require the active participation of the test subject,
and as a result, interference and artifacts may occur in the EEG signal.
Video materials do not require active participation of the test subject and
involve only observation, and are therefore the most appropriate stimulus. Another
important point is the entry barrier of the stimulus: funny or disgusting
pictures are more effective and productive for eliciting emotion than a ten-minute
video with a story.</p>
      <p>A major problem is the correct assessment of the experienced emotions. To train
the machine learning model, the data are assumed to be checked and labeled manually according to
the "presentation of the stimulus - assessment" model. This control can
be carried out by various methods:</p>
      <list list-type="bullet">
        <list-item><p>Survey of the test subject (Which of the following emotional states have you
experienced? Rate the strength and sign of the experienced emotion on a
scale of 1-10)</p></list-item>
        <list-item><p>Data on heart rate, HRV, ECG, etc. (if the subject claims to have experienced
a strong emotion while the physical indicators say otherwise, the correctness
of the answer should be questioned)</p></list-item>
        <list-item><p>The subject's facial expressions (a subject with an emotionless face claims to have
experienced a strong emotion, or states with obvious facial expressions that
the stimulus did not cause any emotions)</p></list-item>
      </list>
      <p>At this stage, it is necessary to reject possibly incorrect reactions, and in some
cases to make a decision on the basis of the subject's physical condition rather than on
the basis of the answers. An important factor in determining the reaction is
that the subject knows that an experiment is being conducted and
can change answers out of a wish to help the researcher or, vice versa, hide
emotions for personal reasons.</p>
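      <p>The cross-check between a self-report and a physiological indicator can be sketched as follows. This is only an illustration: the <monospace>Trial</monospace> layout, the function name and both thresholds are assumptions introduced here, not values from the study.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass


@dataclass
class Trial:
    """One stimulus presentation with its control measurements (hypothetical layout)."""
    reported_strength: int      # self-reported strength of the emotion, scale 1-10
    baseline_heart_rate: float  # beats per minute before the stimulus
    stimulus_heart_rate: float  # beats per minute during the stimulus


def is_consistent(trial, strong_report=7, min_hr_change=0.05):
    """Return False when a strong self-report is not backed by a heart-rate change.

    Both thresholds are illustrative assumptions, not values from the study.
    """
    change = abs(trial.stimulus_heart_rate - trial.baseline_heart_rate)
    relative_change = change / trial.baseline_heart_rate
    if trial.reported_strength >= strong_report and relative_change < min_hr_change:
        return False  # strong emotion claimed, but almost no bodily reaction
    return True


trials = [
    Trial(reported_strength=9, baseline_heart_rate=70.0, stimulus_heart_rate=71.0),
    Trial(reported_strength=8, baseline_heart_rate=70.0, stimulus_heart_rate=84.0),
    Trial(reported_strength=2, baseline_heart_rate=70.0, stimulus_heart_rate=70.5),
]
flags = [is_consistent(t) for t in trials]
```
]]></preformat>
      <p>Trials flagged as inconsistent would not have to be discarded automatically; they could simply be passed to the researcher for a manual decision.</p>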
      <p>During the tests, it is possible to oversaturate the subject emotionally,
weaken the emotional reaction and, as a result, receive incorrect data. This can
happen when several stimuli of the same type are presented continuously in a
row or when the subject has insufficient rest time. Thus, it is necessary to alternate
stimuli of different types in random order, as well as to give the subject time to
"reboot" and relax. The recommended relaxation time is one minute; however,
the time may vary depending on the state of the subject. Relaxation should
take place with closed eyes and a clear mind, although this process depends entirely
on the test subject.</p>
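      <p>A presentation order that follows these recommendations can be sketched as below. The stimulus names are hypothetical, and reshuffling until no type repeats is an assumed, deliberately simple strategy; only the one-minute rest comes from the text.</p>
      <preformat><![CDATA[
```python
import random

REST_SECONDS = 60  # one-minute relaxation recommended in the text


def has_adjacent_repeat(order):
    """True if two neighbouring stimuli share the same type."""
    return any(a[0] == b[0] for a, b in zip(order, order[1:]))


def build_schedule(stimuli, seed=None, max_attempts=10_000):
    """Shuffle (type, name) stimuli until no type appears twice in a row,
    then follow every stimulus with a rest period."""
    rng = random.Random(seed)
    order = list(stimuli)
    for _ in range(max_attempts):
        rng.shuffle(order)
        if not has_adjacent_repeat(order):
            break
    else:
        raise ValueError("no valid order found; add more stimulus types")
    schedule = []
    for stimulus_type, name in order:
        schedule.append(("stimulus", stimulus_type, name))
        schedule.append(("rest", REST_SECONDS))
    return schedule


# Hypothetical stimulus set: two items for each of three emotion types.
stimuli = [
    ("joy", "funny_picture_1"), ("joy", "funny_picture_2"),
    ("disgust", "picture_3"), ("disgust", "picture_4"),
    ("sadness", "video_5"), ("sadness", "video_6"),
]
schedule = build_schedule(stimuli, seed=1)
```
]]></preformat>
      <p>A fixed rest length is used here for brevity; in practice the researcher may extend it depending on the state of the subject.</p>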
    </sec>
    <sec id="sec-4">
      <title>Data collection and processing</title>
      <p>
        When developing a software module for emotion recognition, the methods of
data processing are important. The most popular processing method is building
and training a machine learning model. Some researchers use k-nearest
neighbours [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and support vector machine methods [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. The tools for data gathering are
also important. For example, different EEG devices have different
numbers of electrodes placed at different locations, which makes it difficult to develop
a universal tool that can work with different types of equipment.
      </p>
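      <p>To make the k-nearest neighbours idea concrete, a minimal dependency-free sketch is given below. The two-dimensional (valence, arousal) feature vectors and their labels are made-up toy data, and the function name is hypothetical; a real module would more likely rely on an established machine learning library.</p>
      <preformat><![CDATA[
```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points
    (Euclidean distance)."""
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy (valence, arousal) feature vectors; the numbers are made up for illustration.
train_features = [
    (0.8, 0.9), (0.7, 0.8), (0.9, 0.7),
    (-0.8, 0.2), (-0.7, 0.1), (-0.9, 0.3),
]
train_labels = ["joy", "joy", "joy", "sadness", "sadness", "sadness"]

prediction = knn_predict(train_features, train_labels, query=(0.6, 0.75))
```
]]></preformat>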
      <p>Feature selection is very important. First of all, it is necessary to choose
those electrodes that are closest to the centers of emotion processing, namely
electrodes near the thalamus, hypothalamus, and frontal and middle lobes of the
brain. Using all available electrodes may be redundant and lead to
overfitting of the ML model, so the least important electrodes can be excluded.
Additional features can include parameters such as a person's
temperament. For this, the subjects will need to undergo appropriate testing in advance.
Readings from additional ECG electrodes, heart rate or HRV sensors can also be used
as features. However, this also imposes a limitation on the equipment used.</p>
      <p>There are several models for classifying emotions that can be used in this
study:</p>
      <list list-type="bullet">
        <list-item><p>The circumplex model relates a qualitative assessment of an
emotion to its arousal and valence.</p></list-item>
        <list-item><p>The vector model consists of vectors that point in two directions,
forming a "boomerang" shape. The model assumes that there is always
an underlying arousal dimension, and that valence determines the direction
in which a particular emotion lies.</p></list-item>
        <list-item><p>The positive activation - negative activation (PANA) model suggests that
positive affect and negative affect are two separate systems. Similar to the vector
model, states of higher arousal tend to be defined by their valence, and states
of lower arousal tend to be more neutral in terms of valence. In the PANA
model, the vertical axis represents low to high positive affect and the
horizontal axis represents low to high negative affect.</p></list-item>
      </list>
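      <p>As an illustration of how the circumplex model turns two numbers into a qualitative label, the sketch below maps centred valence and arousal scores to one of four quadrants. The quadrant names and the choice of zero as the neutral point are illustrative assumptions, not part of the study.</p>
      <preformat><![CDATA[
```python
def circumplex_quadrant(valence, arousal):
    """Map centred valence and arousal scores (zero = neutral) to a quadrant label.

    The four labels are illustrative examples of emotions in each quadrant.
    """
    if arousal >= 0:
        return "excited/happy" if valence >= 0 else "angry/afraid"
    return "calm/relaxed" if valence >= 0 else "sad/bored"


points = [(0.7, 0.6), (-0.7, 0.6), (-0.5, -0.4), (0.5, -0.4)]
labels = [circumplex_quadrant(v, a) for v, a in points]
```
]]></preformat>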
      <p>
        Take the circumplex model as an example (shown in Fig. 1). Arousal and valence
can be used as features for the ML model. Since beta rhythms are associated with
an active state and alpha rhythms with a relaxed state, the arousal of an emotion
can be characterized by high beta activity and low
alpha activity [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Thus, the beta/alpha ratio can be an objective
indicator of arousal. The calculation of the arousal parameter is presented in
formula 2.
      </p>
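      <p>The beta/alpha ratio described above can be sketched as follows, assuming that the band powers have already been computed for the selected electrodes. Summing the powers over electrodes before dividing is an assumption of this sketch, and the function and argument names are hypothetical.</p>
      <preformat><![CDATA[
```python
def arousal(beta_powers, alpha_powers):
    """Ratio of summed beta power to summed alpha power over the chosen electrodes."""
    total_alpha = sum(alpha_powers)
    if total_alpha <= 0:
        raise ValueError("alpha power must be positive")
    return sum(beta_powers) / total_alpha


# Made-up band powers for two electrodes.
calm = arousal(beta_powers=[2.0, 2.0], alpha_powers=[4.0, 4.0])
excited = arousal(beta_powers=[6.0, 6.0], alpha_powers=[3.0, 3.0])
```
]]></preformat>
      <p>With these made-up numbers the relaxed recording yields a ratio below one and the activated recording a ratio above one, which matches the intended reading of the indicator.</p>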
      <p>
        Studies by psychophysiologists also show that activity in the right and left
hemispheres is associated with sensory and logical thinking; these types of
thinking lead to corresponding behavior, which in turn is associated with the experience
of positive and negative emotions, respectively [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. By determining the
beta/alpha ratio, one can calculate the inactivity of the right hemisphere
and subtract from it the inactivity of the left hemisphere, obtaining the value
of valence as a result. The calculation of the valence parameter is presented in
formula 3.
      </p>
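      <p>The subtraction described above can be sketched as follows. The sketch assumes that the "inactivity" of a hemisphere is the inverse of its beta/alpha ratio, that is, alpha power divided by beta power; this reading, like the function names, is an assumption made for illustration.</p>
      <preformat><![CDATA[
```python
def inactivity(alpha_power, beta_power):
    """Alpha/beta ratio: higher values mean a less active hemisphere (assumed definition)."""
    if beta_power <= 0:
        raise ValueError("beta power must be positive")
    return alpha_power / beta_power


def valence(alpha_right, beta_right, alpha_left, beta_left):
    """Right-hemisphere inactivity minus left-hemisphere inactivity."""
    return inactivity(alpha_right, beta_right) - inactivity(alpha_left, beta_left)


# Made-up band powers: the left hemisphere is the more active one here.
value = valence(alpha_right=6.0, beta_right=3.0, alpha_left=2.0, beta_left=4.0)
```
]]></preformat>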
      <p>
        Many factors influence the EEG data. For example, physical activity is
reflected more strongly in the EEG than mental and emotional activity, so movements
of a person (blinking, movement of the body and limbs, swallowing, etc.)
during experiments can lead to artifacts in a certain time period. Random
unwanted movements on the part of the subject should therefore be kept to a minimum.
Internal defects in the EEG equipment and the electrical background from nearby
third-party equipment (mobile phones, laptops) can also distort the signal, so the
readings must additionally go through several stages of preliminary processing.
For instance, the bioelectrical activity of the brain lies in the range from 1
to 40 hertz, across the delta, theta, alpha and beta rhythms. Therefore, all signals
that are not in this range can be treated as noise and artifacts and excluded
from processing. This can be achieved by using low-pass and high-pass filters. Among
the signal processing methods, some researchers use the Fast Fourier Transform
and the Discrete Wavelet Transform [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] to divide the signal into the specific bands.
Additionally, these bands can be divided into several sub-bands, and their
total power can be calculated and used as a feature.
      </p>
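      <p>A dependency-free sketch of band power extraction is shown below. It uses a direct discrete Fourier transform instead of an FFT library, which is slower but gives the same spectrum for short windows. The exact band limits are conventional values assumed here; the text only fixes the overall 1-40 Hz range. The test signal is synthetic.</p>
      <preformat><![CDATA[
```python
import cmath
import math

# Band limits in Hz; assumed conventional values within the 1-40 Hz range.
BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13), "beta": (13, 40)}


def band_powers(samples, sampling_rate):
    """Sum the DFT power spectrum inside each band.

    Components outside 1-40 Hz fall into no band and are therefore ignored.
    """
    n = len(samples)
    powers = {name: 0.0 for name in BANDS}
    for k in range(1, n // 2):
        frequency = k * sampling_rate / n
        coefficient = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        for name, (low, high) in BANDS.items():
            if low <= frequency < high:
                powers[name] += abs(coefficient) ** 2 / n
    return powers


# Synthetic one-second signal: 10 Hz (alpha), a weaker 20 Hz (beta), and 50 Hz mains hum.
rate = 128
signal = [
    math.sin(2 * math.pi * 10 * t / rate)
    + 0.5 * math.sin(2 * math.pi * 20 * t / rate)
    + 0.8 * math.sin(2 * math.pi * 50 * t / rate)
    for t in range(rate)
]
powers = band_powers(signal, rate)
```
]]></preformat>
      <p>In this example the 50 Hz component contributes to none of the bands, which mirrors the idea of discarding everything outside the 1-40 Hz range.</p>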
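      <p>Thus, to obtain a correct data set, it is necessary to take into account:</p>
      <list list-type="bullet">
        <list-item><p>Experiment program;</p></list-item>
        <list-item><p>Selection of subjects;</p></list-item>
        <list-item><p>Set of stimuli;</p></list-item>
        <list-item><p>Preliminary manual data evaluation;</p></list-item>
        <list-item><p>Feature selection;</p></list-item>
        <list-item><p>Processing interference and artifacts.</p></list-item>
      </list>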
    </sec>
    <sec id="sec-5">
      <title>Experiment program</title>
      <p>During the analysis, key points for data collection and analysis were
identified, and an experiment program and requirements for its implementation
were created. The experiment for a single subject is the following sequence of
actions:</p>
      <list list-type="order">
        <list-item><p>Connecting the test subject to the EEG device, attaching electrodes, checking
equipment.</p></list-item>
        <list-item><p>Recording background activity of the brain. The subject remains motionless
with eyes closed for several minutes.</p></list-item>
        <list-item><p>Analysis of the condition and behavior of the subject. This is needed to determine
which emotional reactions should be elicited at the moment.</p></list-item>
        <list-item><p>Presentation of stimuli. It is carried out using the display of a computer or
laptop. The subject concentrates on the screen and prepares to perceive the
information.</p></list-item>
        <list-item><p>Assessment of the induced reaction.</p></list-item>
        <list-item><p>Subject recovery, similar to step 2.</p></list-item>
        <list-item><p>Repeat steps 3-6.</p></list-item>
      </list>
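      <p>The sequence above can be written down as a simple plan generator, for example to drive a recording script. The function name and step descriptions are hypothetical and only mirror the listed steps.</p>
      <preformat><![CDATA[
```python
def session_plan(stimulus_count):
    """List the protocol steps for one subject; steps 3-6 repeat for every stimulus."""
    plan = ["connect subject and check equipment", "record background activity"]
    for index in range(1, stimulus_count + 1):
        plan += [
            f"analyse subject state before stimulus {index}",
            f"present stimulus {index}",
            f"assess reaction to stimulus {index}",
            "recovery",
        ]
    return plan


plan = session_plan(stimulus_count=2)
```
]]></preformat>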
      <p>Thus, taking all of the above into account, correct data can be collected for
training the machine learning model. After this, it is necessary to conduct an
initial analysis of the data and to classify the results according to the responses of
the researcher and the subjects.</p>
      <p>The article identified and analyzed the main problems in determining emotion
using EEG signals, and an experiment program was created. Based on the
information obtained during this study, it is planned to conduct experiments to
collect data and develop a software module for determining emotions using EEG.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Simonov</surname>
            <given-names>P.V.</given-names>
          </string-name>
          :
          <article-title>The emotional brain</article-title>
          .
          <source>Physiology. Neuroanatomy Psychology of emotions, Science, St. Petersburg</source>
          ,
          <year>1981</year>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Simonov</surname>
            <given-names>P.V.</given-names>
          </string-name>
          :
          <article-title>The highest nervous activity of man. Motivational and emotional aspects</article-title>
          .
          <source>Science</source>
          ,
          <year>1975</year>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <source>Facial Action Coding System</source>
          - Paul Ekman Group,
          <uri>https://www.paulekman.com/facial-action-coding-system</uri>
          . Last accessed 01 Nov 2019
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Ekman</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          : Psychology of emotions. St. Petersburg, Peter,
          <year>2011</year>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Matlovic</surname>
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gaspar</surname>
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Simko</surname>
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bielikova</surname>
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moro</surname>
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Emotions Detection Using Facial Expressions Recognition and EEG</article-title>
          ,
          <year>2016</year>
          ,
          <uri>http://www2. it.stuba.sk/ bielik/publ/abstracts/2016/smap2016-matlovic-etal.pdf</uri>
          . Last accessed 02 Nov 2019
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Bombatkar</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bhoyar</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Morjani</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gautam</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gupta</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Emotion recognition using Speech Processing Using k-nearest neighbor algorithm</article-title>
          ,
          <source>IJERA</source>
          ,
          <year>2014</year>
          ,
          <uri>https://pdfs.semanticscholar.org/a1f5/c39aece58f6504e0334d96eaede32c7329cf.pdf</uri>
          . Last accessed 02 Nov 2019
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>