EMOLEARN-"Las aventuras de Marco"- A Serious
Game to train emotions to children with ASD
Antonio Barba1,*, Verónica Rufo1, Giuseppe Iandolo1, Esteban García-Cuesta2
    Universidad Europea de Madrid
    Departamento de Inteligencia Artificial, Universidad Politécnica de Madrid, España

This paper proposes the use of serious games for the development of the emotional psychology of
children with autism spectrum disorders (ASD). We exploit machine learning and Speech Emotion
Recognition techniques to help children with ASD understand how to express their emotions in the
different everyday situations they may be involved in. Marco’s video game aims to stimulate
metacognition and model adaptive behaviors, assertive communication, and self-regulation in daily
social life. It is intended as a tool that a therapist can use to improve the learning process.

Serious Games, Children with ASD, Speech Emotion Recognition, Machine Learning

1. Introduction
Autism Spectrum Disorder (ASD) is a neurodevelopmental condition characterized by
communication and social interaction impairments and by restrictive and repetitive patterns of
behavior, interests, or activities, with different levels of severity [13]. Children and adolescents
with ASD show information processing, behavioral integration, and executive dysfunctions [14], with
adaptive behavioral impairments in social contexts [15]. Environments overloaded with stimuli
can lead to abnormal behavioral responses, such as isolation, avoidance, or disruptive behaviors,
in children and adolescents with ASD and sensory modulation disorder. Modeling sequences
can offer an opportunity to understand the discomfort and to develop social strategies that reduce
it by asking those around them for help in a socially appropriate way. Interactive educational
games are designed to improve the social communication capabilities of children with ASD,
in particular their ability to detect and express emotions. EMOLEARN-"Las aventuras de Marco" aims to stimulate
metacognition and model adaptive behaviors, assertive communication, and self-regulation in
daily social life. In the first scenario, Marco learns to ask to play in the schoolyard without
being intrusive, managing his anger at a possible refusal and asking the teacher for help. In the
second, Marco learns to apologize to the people he may be bothering on a city bus. In the third
scenario, Marco learns to communicate with the teacher when the light bothers him or there is
too much noise in the class. In the fourth, he learns to ask for a task in a class workgroup during his
cast. Finally, in the fifth scenario, Marco learns to ask his classmates to talk to him individually
and without overwhelming him. The first, second, and fourth scenarios aim to expose the player
to a situational model of complex social interaction. The third and fifth scenarios aim to provide
situational models of the need to ask for help in case of overload due to a sensory modulation
disorder characterized by hypersensitivity to environmental stimuli.

Figure 1: Screenshots of the game representing different situations that the student has to learn from.
(a) Schoolyard scenario with pictograms that explain the current social context. (b) Marco explains to
his classmates, without anger, that they should talk to him one by one.

2. Related Works
In recent years, much progress has been made in improving the communication skills of children
with ASD thanks to machine learning emotion recognition techniques. This emotion recognition
is based on both facial gestures and voice. Thus, projects like Anwar and Milanova [1],
LIFEisGAME [2] and FaceSay [3] present systems to identify facial expressions, faces and
emotions both in expression and in detection for autistic children, while the Emotify project [4]
recognizes the emotion through the voice in a talking educational game. In addition, there are
projects that also take the social aspect into account. The ASC-Inclusion project [5] is a game
experience that helps children with ASD to improve their socio-emotional communication skills,
combining voice and facial and body gestures. Also, the Simoes et al. project [6] developed a
serious game where the children became familiar with the routine of taking a bus. In our work
we focus on the emotion of anger, since anger control as a manifestation of anxiety is a goal
of any intervention in social skills in ASD and it is also one of the first emotions that children
recognize. Moreover, by restricting the study to a simpler speech emotion recognition problem,
the accuracy of the implemented model increases and, therefore, the chances of success in real

3. EMOLEARN: Las aventuras de Marco
EMOLEARN-"Las aventuras de Marco" is a visual novel game whose main character is Marco, a 6-10
year old boy with ASD. The game consists of 5 everyday collaborative contexts
associated with the 5 types of learning that are sought, and cinematics are used to develop
them. At the end of each cinematic, a decision process is proposed by means of pictograms
with text (see Figure 1). The child has to make a choice and, if the option is correct, an
oral exposition step opens in which the child explains what happened. The video game
evaluates the associated emotion and determines whether the child has spoken with or without
anger. If the child speaks with anger, the process is repeated until the child says it
without getting angry. The Speech Emotion Recognition (SER) system includes spectral and
prosodic features. The signal is framed into 20 ms windows so that its frequency content can be
analyzed over short segments of the longer signal (this is a commonly chosen window size, but other
sizes may also be valid). Following the recommendations of the authors of [10], the spectral
features that we use are: i) the first 13 Mel Frequency Cepstral Coefficients (MFCCs) and their
mean, standard deviation, kurtosis and skewness, together with the first and second derivatives of
the MFCCs; ii) spectral centroid; iii) spectral flatness; iv) spectral contrast; and v) Linear
Predictive Coding (LPC). The prosodic features represent the supra-segmental elements of oral
expression, that is, elements that affect more than one phoneme and cannot be segmented into smaller
units, such as accent, tone, rhythm and intonation. The prosodic features that we use are the
fundamental frequency (F0), intensity, and tempo. As a result, a total of 140 features are extracted
using Praat [11] and librosa [12]. Different experiments were conducted using Support Vector
Machines (SVMs), Feed-Forward Deep Neural Networks (FFNN), and eXtreme Gradient Boosting (XGBoost),
with a Cross-Validation (CV) methodology, to obtain the most accurate model for the anger vs.
no-anger emotion classification task. The best model was obtained with the SVM technique, with an
accuracy of 80%. Due to the difficulties autistic children have in managing attention, the game has
been adapted to their needs according to Frith’s Central Coherence Theory [7]. This
theory points out the difficulty that people with ASD have in analysing a set of stimuli as a
whole, focusing their interest on details and increasing their capacity for fragmentation [9].
In order to improve children’s attention, the following design decisions have been made for
the visual paratranslation [8]: a simple and contrasting colour palette has been selected for the
main character, the backgrounds have little information in order to focus the detail on the main
action, a visual repetition of facial expressions and actions is sought to facilitate the relationship
of situations with respect to the emotional attitude that occurs in the scenes, and the interface
is direct without visual details that could distract from the main objective.
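
The interaction flow described in this section (pictogram choice, then a spoken explanation that is repeated until no anger is detected) can be sketched as follows. This is only an illustrative sketch, not the game's actual code: the names `run_scenario` and `is_angry` are hypothetical, `is_angry` stands in for the SER classifier, and letting the child choose again after an incorrect option is an assumption, since the text only describes what happens after a correct choice.

```python
from typing import Callable, Iterable


def run_scenario(
    choices: Iterable[str],
    correct_choice: str,
    recordings: Iterable[str],
    is_angry: Callable[[str], bool],
) -> int:
    """Return how many spoken attempts were needed to finish the scenario.

    `choices` are the pictogram options picked by the child, in order.
    `recordings` are the successive spoken explanations (hypothetical
    identifiers here; real audio buffers in practice).
    `is_angry` is a stand-in for the anger vs. no-anger classifier.
    """
    # Step 1: pictogram decision. Assumption: the child may pick again
    # until the correct option is selected.
    for choice in choices:
        if choice == correct_choice:
            break
    else:
        raise ValueError("the correct option was never selected")

    # Step 2: oral explanation, repeated until it is said without anger.
    for attempt, audio in enumerate(recordings, start=1):
        if not is_angry(audio):
            return attempt
    raise ValueError("no explanation without anger was recorded")


# Usage with made-up inputs: two angry attempts, then a calm one.
attempts = run_scenario(
    choices=["push_in", "ask_to_play"],
    correct_choice="ask_to_play",
    recordings=["angry_1", "angry_2", "calm_1"],
    is_angry=lambda audio: audio.startswith("angry"),
)
print(attempts)
```

Keeping the classifier behind a plain callable means the same loop could be exercised with a stub, as above, or with a trained model.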
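
The 20 ms framing and the per-feature summary statistics (mean, standard deviation, skewness, kurtosis) mentioned in this section can be illustrated with a dependency-free sketch. Several things here are assumptions rather than details from the paper: the windows are non-overlapping, the sample rate is 16 kHz, and the per-frame RMS energy is used only as a stand-in for a single coefficient track (the actual MFCCs and prosodic features are extracted with librosa and Praat and are not reproduced here). The function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def frame_signal(samples: Sequence[float], sample_rate: int,
                 window_ms: float = 20.0) -> List[List[float]]:
    """Split a signal into consecutive, non-overlapping windows.

    Non-overlapping windows are an assumption; a trailing partial
    window, if any, is dropped.
    """
    frame_len = int(round(sample_rate * window_ms / 1000.0))
    if frame_len <= 0:
        raise ValueError("window is shorter than one sample")
    n_frames = len(samples) // frame_len
    return [list(samples[i * frame_len:(i + 1) * frame_len])
            for i in range(n_frames)]


def summary_stats(values: Sequence[float]) -> Dict[str, float]:
    """Mean, population standard deviation, skewness and kurtosis."""
    n = len(values)
    if n == 0:
        raise ValueError("need at least one value")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    std = math.sqrt(m2)
    if m2 == 0.0:
        skew, kurt = 0.0, 0.0  # constant track: higher moments undefined
    else:
        skew = m3 / std ** 3
        kurt = m4 / m2 ** 2  # non-excess (Pearson) kurtosis
    return {"mean": mean, "std": std, "skewness": skew, "kurtosis": kurt}


def frame_rms(frame: Sequence[float]) -> float:
    """Root-mean-square energy of one frame (stand-in per-frame feature)."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


# Usage: a synthetic one-second 440 Hz tone at 16 kHz (both hypothetical).
sample_rate = 16000
signal = [math.sin(2 * math.pi * 440 * t / sample_rate)
          for t in range(sample_rate)]
frames = frame_signal(signal, sample_rate)
energy_track = [frame_rms(f) for f in frames]
features = summary_stats(energy_track)
```

Applying `summary_stats` to each per-frame feature track yields four values per track, which is how a variable-length utterance can be reduced to a fixed-length feature vector for a classifier.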

4. Conclusions and Future Works
This paper presents a fully functional serious game for the development of the emotional
psychology of children with ASD. The game is intended to be used by therapists to improve
emotional learning in social environments overloaded with stimuli. We expect it to be
a valuable tool in this context, and it will be validated through an A/B test experiment in a
classroom of 20-30 children with ASD.

This work has been partially supported by European Commission Erasmus+ Strategic Partner-
ships for School Education (REF.KA201-063086).
