<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Person-Independent Multimodal Emotion Detection for Children with High-Functioning Autism</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Annanda Sousa</string-name>
          <email>a.defreitassousa1@nuigalway.ie</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mathieu d'Aquin</string-name>
          <email>mathieu.daquin@nuigalway.ie</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Manel Zarrouk</string-name>
          <email>zarrouk@lipn.univ-paris13.fr</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jennifer Holloway</string-name>
          <email>jennifer.holloway@nuigalway.ie</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Data Science Institute - National University of Ireland - Galway</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Institut Galilée - Université Paris 13</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>School of Psychology - National University of Ireland - Galway</institution>
        </aff>
      </contrib-group>
      <fpage>14</fpage>
      <lpage>20</lpage>
      <abstract>
        <p>The use of affect-sensitive interfaces carries the promise of enhancing human-computer interaction by delivering a system capable of identifying a user's emotions and adapting its content accordingly. Today's technology shows great potential to support children with autism, for example by using computer systems to improve their social skills. Generally, however, this technology does not exploit the potential of affect-sensitive interfaces. This is mainly because Emotion Detection (ED) models built for the general population usually do not perform well when applied to children with autism, who express emotions differently. The aim of this project is therefore to build a person-independent Multimodal Emotion Detection system tailored for children with high-functioning autism, with the ultimate goal of applying it to the design of affect-sensitive interfaces dedicated to children with autism. This is a work in progress, and the project expects to build upon the current body of knowledge on methods to apply ED systems to this specific subset of the general population. We expect to apply the overall theoretical and practical design perspectives that arise from this research investigation (e.g. analysis of modalities and feature extraction, behavioural-cue-based features, fusion layers and classifier techniques) to propose a guiding framework for future studies.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Automatic Emotion Detection (ED) aims to automatically identify people's cognitive states or emotions, e.g. happiness, anger or fear, using different types of media input such as text, video, audio and sensor signals. Systems that combine more than one type of data are called Multimodal Emotion Detection systems and usually outperform unimodal systems.</p>
      <p>Automatic ED is advancing to become an important component of Human-Computer Interaction (HCI) through affect-sensitive systems. An affect-sensitive system detects the user's emotions and automatically adapts its interaction with the human based on them. This kind of feature has the potential to enhance HCI, creating an individualised experience for the user in a more human-to-human-like interaction.</p>
      <p>Despite the advances of ED for users with typical neurological development, usually referred to as neurotypical, these systems do not perform well when applied to children with autism, mainly because of this particular population's distinct way of expressing emotions [Liu et al., 2008], motivating the need to develop ED systems specifically tailored for children with autism. Autism Spectrum Disorder (ASD) is a developmental disorder with a spectrum manifestation of traits, characterised by impairments in social interaction and communication and by repetitive patterns of behaviour and interests. High-Functioning Autism (HFASD) is defined as ASD without significant cognitive and language impairments [Gaus, 2011].</p>
      <p>A recent meta-analysis [Trevisan et al., 2018] that compared facial expression production between a typical development (TD) population and people with ASD provides evidence that people with ASD display facial expressions less often than people with TD, and that their expressions are lower in quality and less accurate. In the work of [Grossard et al., 2020], the results show that a Random Forest model needs more facial landmarks to classify facial expressions from children with ASD than from children with typical development, providing further evidence that ED systems developed for children with typical development do not perform well when applied to children with ASD.</p>
      <p>Nowadays, the development of computer-based intervention tools for the treatment of children with autism has increased, turning technology into an important ally when it comes to teaching those children abilities they lack in social and emotional areas [Frauenberger et al., 2012]. There are several examples of computer systems [Hopkins et al., 2011], virtual reality (VR) environments [Boyd et al., 2018], tablet and mobile applications [Hourcade et al., 2012], and even robotic agents that interact with children with ASD as intervention tools [Rudovic et al., 2018; Marinoiu et al., 2018]. Studies have shown evidence demonstrating the effectiveness of such tools in supporting children with ASD [Ma et al., 2019]. Additionally, new methods are emerging on the use of technology to support people on the autism spectrum beyond assistive and intervention tools, shifting the focus from just “fixing the problem” to a more holistic approach [Frauenberger et al., 2016]. This includes, for instance, investigating ways to design technologies to support children with autism considering their special interests and strengths.</p>
      <p>Being able to automatically identify emotions from children with autism can play an important role in enhancing and individualising HCI between children with ASD and computer interfaces specially designed to support their needs and particularities [Sharmin et al., 2018]. Nevertheless, most technological tools developed to support children on the autism spectrum do not use automatic ED, which could be of great relevance in turning them into significant supplementary support to classic interventions, which are usually expensive and highly dependent on human presence. Another important point is that creating ED systems tailored for children with autism is a further step towards inclusion: tools based on ED are currently developed with a focus on neurotypical people and will not be usable by children with ASD if not adapted to their ways of expressing emotions. Some examples of ED application areas are Gaming, Health and Mental Health, which currently do not include the population on the autism spectrum.</p>
      <p>In the field of Emotion Detection, creating a person-independent model is one of many well-known challenges [Cambria et al., 2017]. This challenge refers to building a model that performs well at identifying emotions from people whose data were not present in the model's training dataset. At a high level, it is related to the fact that people express emotions in an individualised manner. General patterns of emotional expression apply to most people (e.g. smiling usually means happiness); however, considering only general patterns is not enough to build an Emotion Detection (ED) system that takes into account individual and specific cues for expressing emotions.</p>
      <p>This also holds for people on the autism spectrum. On the one hand, people with ASD do not express emotions in a similar way to people with typical development. On the other hand, as for the general population, there is no uniform way in which people with autism express their emotions. As a consequence, creating a person-independent ED system that models and reflects how this specific population expresses emotions is needed.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>Previous studies have developed ED systems tailored to children with autism. These studies created their ED models envisioning different applications: to allow the creation of affect-sensitive computer-based intervention tools [Liu et al., 2008] and affect-sensitive e-learning platforms for children with ASD [Dawood et al., 2018; Chu et al., 2018], to generate knowledge to support the assessment of autism [Samad et al., 2018], to support the treatment of anxiety, a common co-occurring condition in people with ASD [Kushki et al., 2015], and also to allow the creation of a VR-based platform as an intervention tool [Bekele et al., 2016; Saadatzi et al., 2013].</p>
      <p>They all have in common that they did not focus on
identifying the 7 basic emotions (i.e. fear, happiness,
sadness, anger, disgust, surprise and contempt) because they
argued that, although the ED field focuses on those basic
emotions, they are not the most relevant in the context of
autism. Therefore, they chose to target different emotion
states more suitable to the autism context, e.g. liking,
anxiety, engagement [Liu et al., 2008] and calmness [Chu et al.,
2018]. Another common characteristic is the fact that they
only used one modality of data input for emotion
identification: physiological signals (e.g. heart rate, skin
conductivity) [Liu et al., 2008; Bekele et al., 2016; Kushki et al., 2015;
Sarabadani et al., 2018] and video media input (e.g. facial
expressions, eye gaze, head movement) [Dawood et al., 2018;
Chu et al., 2018; Ahmed and Goodwin, 2017]. Also, they
all used machine learning techniques to create the classifier
model, which is the state-of-the-art of general Emotion
Detection models (i.e. models for the neurotypical population).
One more common point is that all of them needed to develop
and conduct an experiment to elicit emotions from children
with autism in order to create an annotated dataset. However, none of these datasets was made available to the research community, mostly due to privacy issues.</p>
      <p>Together, these studies provide important evidence that it is viable to model and automatically identify emotions of children with ASD. However, such studies remain limited in two respects: input multimodality and generalisability of the model. To the best of our knowledge, none of their models used multimodal input data for emotion identification, and most of the works created models that are individual-specific.</p>
      <p>Multimodal inputs have been used and explored in the Emotion Detection field, where studies have shown that multimodal Emotion Detection models usually outperform unimodal ones. Regarding the individual-specific approach, it means that the ED model is trained separately for each individual child, becoming very good at identifying emotions from that specific child but not performing well when applied to other children. This creates a major impediment to using person-specific Emotion Detection systems under realistic conditions, because every time a new child used the system, the model would have to be trained on their annotated data.</p>
    </sec>
    <sec id="sec-3">
      <title>Research Objectives</title>
      <p>This research seeks to advance ED systems tailored to
children with autism by exploring ways to design and develop
a person-independent Multimodal Emotion Detection system
to be used by children with autism. The ultimate goal of this
research is to enable the benefits of ED on HCI for children
with ASD. Hence, during this project, we aim to answer the
following Research Question:</p>
      <p>RQ1: How to create a multimodal Emotion Detection
system which:
i) Is tailored to how children with high-functioning autism
express emotions;
ii) Is person-independent, i.e. reaches an equivalent or
higher accuracy than state-of-the-art person-independent ED
systems for the neurotypical population, when applied to
children with ASD not involved in training the model.</p>
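      <p>As a minimal, illustrative sketch (not the project's finished pipeline), the person-independence criterion in RQ1 can be checked with leave-one-subject-out cross-validation: train on all children except one, test on the held-out child, and repeat so that every child is held out once. The feature matrix, labels and child identifiers below are random placeholders standing in for the dataset still to be collected.</p>
      <preformat>
# Sketch: person-independent evaluation via leave-one-subject-out cross-validation.
# X, y and groups are placeholders for the (not yet collected) dataset.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 32))          # one feature vector per annotated time window
y = rng.integers(0, 4, size=600)        # emotion zone label: 0=green, 1=yellow, 2=red, 3=blue
groups = rng.integers(0, 12, size=600)  # which of the 12 children each window comes from

scores = []
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups):
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    clf.fit(X[train_idx], y[train_idx])
    scores.append(accuracy_score(y[test_idx], clf.predict(X[test_idx])))

print(f"mean accuracy over held-out children: {np.mean(scores):.3f}")
      </preformat>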
      <p>To be able to answer RQ1, we further need to explore
answers to the following research questions:</p>
      <p>RQ2: How to build a ground truth dataset annotated with
the emotion states we aim to identify?</p>
      <p>RQ3: Which modality input(s) and features are most relevant for cue extraction in the context of multimodal Emotion Detection for children with autism?</p>
      <p>RQ4: Which data fusion methods work best in the context of multimodal Emotion Detection for children with autism?</p>
    </sec>
    <sec id="sec-4">
      <title>The proposed system</title>
      <p>Considering the objectives stated above, the proposed multimodal ED system tailored for children with high-functioning autism will involve four input modalities: video, audio, text and physiological signals (i.e. a heart rate measure). These four modalities were selected because data acquisition is feasible for the user's family and because they are widely used in the field of ED. Based on these inputs, our model will use features extracted from facial expressions, body movements, the word content of the speech, the tone of the voice and heart rate values, all of which are broadly used in the ED field.</p>
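      <p>As an illustration of the intended feature-level representation, the sketch below assembles one vector per time window by concatenating features from each modality. The extractor functions are hypothetical placeholders, not the project's actual extractors; in practice they would wrap, for example, a facial action-unit detector, body-pose statistics, acoustic descriptors, simple lexical scores and heart rate statistics.</p>
      <preformat>
# Sketch of per-window multimodal feature assembly (placeholder extractors).
import numpy as np

def facial_features(frames):           # e.g. action-unit intensities, head pose
    return np.zeros(17)

def body_features(frames):             # e.g. landmark/pose movement statistics
    return np.zeros(8)

def voice_features(audio):             # e.g. pitch, energy, MFCC statistics
    return np.zeros(13)

def text_features(transcript):         # e.g. word-content / sentiment scores
    return np.zeros(4)

def heart_rate_features(hr_samples):   # e.g. mean and variability in the window
    hr = np.asarray(hr_samples, dtype=float)
    return np.array([hr.mean(), hr.std()])

def window_vector(frames, audio, transcript, hr_samples):
    """One feature vector for a single annotated time window."""
    return np.concatenate([
        facial_features(frames),
        body_features(frames),
        voice_features(audio),
        text_features(transcript),
        heart_rate_features(hr_samples),
    ])

vec = window_vector(frames=None, audio=None, transcript="", hr_samples=[72, 75, 74])
print(vec.shape)  # (44,)
      </preformat>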
      <p>Following the previous related works, we will not focus on identifying the 7 basic emotions, i.e. surprise, happiness, anger, disgust, contempt, sadness and fear. Instead, we will use a framework of emotion zones for regulation [Kuypers, 2013] that is extensively used in psychology to help children with ASD learn emotion regulation. It is common for children with ASD to present impairments in emotion regulation, manifested in finding it hard both to understand their emotions and to calm down after they leave a calm state [Scarpa and Reyes, 2011]. A child with ASD needs to be in a calm emotional state to be able to listen, interact and learn.</p>
      <p>The zones of regulation framework has four different zones, represented by colours (see Figure 1). One of the emotion zones is the calming zone, represented by the colour green. This is the ideal state, in which the child is calm, relaxed and ready to work, listen and interact. Another emotion zone is the warning zone (yellow). In this state, the child shows signs of agitation or excitement. This state can originate from both positive and negative emotions: it can start from intense happiness or excitement, but also from frustration. The next zone is the high-agitation zone (red). Here the child is very upset or angry, with serious difficulty keeping control of their emotions. The last zone is the slowing zone (blue), in which the child has low energy and shows emotional signs of being sad, tired, sick or bored. The child might move more slowly than usual, stop speaking or show delays in responding to interaction.</p>
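      <p>For reference, the four zones map directly onto the classifier's label set; the integer encoding below is an arbitrary illustrative choice.</p>
      <preformat>
# The four emotion zones of the regulation framework as classifier labels
# (integer encoding chosen arbitrarily for illustration).
ZONES = {
    0: ("green",  "calm, relaxed, ready to work, listen and interact"),
    1: ("yellow", "signs of agitation or excitement, from positive or negative emotions"),
    2: ("red",    "very upset or angry, serious difficulty keeping control of emotions"),
    3: ("blue",   "low energy: sad, tired, sick or bored"),
}
      </preformat>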
      <p>This project is developing a classifier able to identify which of the four emotion zones a child with HFASD is in, using multimodal data inputs. Choosing this emotion zones framework brings several benefits. Firstly, the framework additionally includes guidelines on activities to lead children back to the calming zone, making it easy to incorporate such activities within an affect-sensitive interface. Secondly, parents of children with ASD are more likely to be familiar with this framework because it is commonly used in the context of autism, hence making the tagging task more comfortable for the parents. Thirdly, considering the children's well-being, it is less harmful for the emotional comfort of children with HFASD, during the emotion elicitation experiment, to elicit the four emotion zones than other strong negative emotions, e.g. fear or anxiety (more about data collection in Section 6).</p>
    </sec>
    <sec id="sec-5">
      <title>Methodology</title>
      <p>The methodology's pipeline to address our research goal is depicted in Figure 2 and encompasses five different stages.</p>
      <p>In order to answer our Research Questions, we will follow this methodology:
1. conduct a study with human participants to elicit, capture and tag expressions of the different emotion zones, in order to create the dataset (RQ2);
2. use the standard ED methodology to build a multimodal ED classifier by iterating over the following steps:
(a) feature extraction (RQ3);
(b) design of the information fusion layer (RQ4; a sketch of candidate fusion strategies follows this list);
(c) training and testing of machine learning models using the annotated dataset (RQ1);
(d) evaluation, by designing experiments to analyse the relation between the type of data input, features and data fusion techniques and the accuracy of the model, and by comparing with previous works.</p>
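      <p>To make the fusion design step concrete, the sketch below contrasts two commonly used strategies on placeholder data: feature-level (early) fusion, which concatenates per-modality features into a single vector for one classifier, and decision-level (late) fusion, which trains one classifier per modality and averages their predicted class probabilities. These are candidate baselines only, not the method the project has committed to.</p>
      <preformat>
# Sketch of two candidate fusion strategies for RQ4 (placeholder data).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_video = rng.normal(size=(600, 25))
X_audio = rng.normal(size=(600, 13))
X_hr = rng.normal(size=(600, 2))
y = rng.integers(0, 4, size=600)  # emotion zone labels

# Early (feature-level) fusion: one classifier on the concatenated vector.
early = LogisticRegression(max_iter=1000).fit(np.hstack([X_video, X_audio, X_hr]), y)

# Late (decision-level) fusion: one classifier per modality, averaged probabilities.
modalities = (X_video, X_audio, X_hr)
per_modality = [LogisticRegression(max_iter=1000).fit(X, y) for X in modalities]
late_proba = np.mean([clf.predict_proba(X) for clf, X in zip(per_modality, modalities)], axis=0)
late_pred = late_proba.argmax(axis=1)
      </preformat>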
      <p>Therefore, our first challenge is to obtain the ground truth dataset (phase 1). To achieve this, we have finished the design and planning of the experiment for data collection (see Section 6). For the subsequent phases of this research, we plan to follow the general approach of investigating the state-of-the-art methods applied to the population without ASD, evaluating their performance on our dataset, and proposing how those methods can be extended to the population of children with HFASD.</p>
    </sec>
    <sec id="sec-6">
      <title>Data collection</title>
      <p>As a required component for meeting the aim of this research, we have to create an annotated dataset featuring children with ASD expressing emotions, because previous related works did not make any such dataset available. To do so, we need to conduct a behavioural experiment with human participants to elicit, capture and tag emotions.</p>
      <p>There is no way to directly observe an emotion, because it is an internal experience of an individual; what we can do is define and capture behavioural indices of the presence of a given emotion. Also, emotions do not just appear out of nowhere: they are usually an individual's response to a physical or mental event, i.e. an event in the real world or a thought, and thus we need to evoke them.</p>
      <p>During the experiment, we intend to collect the behavioural
indices that children with ASD engage with when they are
in different emotion zones, together with the measurement
of their heart rate. Examples of behavioural indices are
facial expressions and body movements such as smiling, hand flapping or head movements. We will ask the participants to
perform tasks expected to evoke the emotion zones while we
capture the participant’s behaviour using different data inputs,
i.e., video, audio and heart rate. We will extract features from
these data to train a multimodal emotion detection system to
identify the four emotion zones from a child with ASD.</p>
      <p>For the study, we will recruit 12 children aged 8 to 12 years old and their parents/guardians as participants (see http://emotion-asd.datascienceinstitute.ie/). The targeted number of participants corresponds to the average number of subjects in the related studies (see Section 2). These works reported that recruiting participants was challenging, and they had to operate with a small number of subjects for their models. To be included in this study, the child must 1) have a previous diagnosis of ASD, 2) have no history of language or intellectual disability, and 3) have their parents' or guardian's consent to participate in the study. Participation in the study involves performing emotion-eliciting tasks during three different sessions. Each session is expected to last around 30 minutes. We will use a computer-based task environment, i.e. the child will interact with a computer to execute the tasks.</p>
      <p>We developed web-based software to serve as the task environment. During the experiment, the child will interact with a computer using the task environment interface. This software presents a sequence of tasks expected to elicit each of the emotion zones. Between each zone elicitation, we add calming content to help the child calm down between emotion zone tasks, both to minimise any stress and to set an emotional baseline between the elicitation parts; the session also ends with calming content. The eliciting tasks are as follows: video content for the green zone, the blue zone and the calming activity, a game for the yellow zone, and a set of maths questions for the red zone. We selected the eliciting tasks with input from psychologists with extensive experience of working with children with ASD.</p>
      <p>We decided the elicitation order of the emotion zones by considering the participant's well-being first. The green zone starts the session, so that we do not cause any negative emotion at the beginning that could scare the child. We then create a crescendo of emotion zones by eliciting the yellow zone followed by the red zone: when asked to solve a demanding worksheet (red zone elicitation task), the child will already be over-excited from having played the game beforehand (yellow zone elicitation task). The blue zone was selected to be last because, by the end of the session, the participant is expected to already show signs of tiredness, making the blue zone easier to elicit. To be most effective in eliciting the four emotion zones, before the session we will ask the parents to answer a questionnaire outlining examples of content that usually moves their child to a certain emotion zone. Based on this questionnaire, we will adapt the task environment's content to be individualised for each child.</p>
      <p>We will annotate the collected data into four different categories, each representing one of the emotion zones. The annotation will also include behavioural markers: we will require the annotator to select which observed behaviour supported their selection of a given emotion zone. None of the previous works used the emotion zones framework as the target emotions to identify, hence comparing results with their works will not be straightforward. To minimise this gap and have some measure of comparison, we will in parallel annotate the dataset with the happiness/unhappiness/neutral emotion. Happiness/unhappiness is a measure of Quality of Life (QoL) [Ramey et al., 2019] and has been used as a dependent variable to analyse the effectiveness of interventions in psychology. We decided not to target only the happiness/unhappiness emotion for this project because this emotion alone cannot represent whether a child with ASD is in an optimal state for learning; for example, a child with ASD can become over-excited and agitated because of happiness and be unable to stay still for learning until they calm down.</p>
      <p>The children's parents or guardians will perform the annotation task after the eliciting sessions. We will also recruit psychology students to act as blind annotators. It is part of our future work to define which agreement measure we will use for annotating the dataset. In this case, the parents are the specialists at identifying their child's emotions because they know them, but parents can also have biases that an annotator who does not know the child would not present. Thus, it is important to define which annotation carries more weight in case of disagreement.</p>
      <p>We have developed web-based software to support the annotation task. The annotator will watch the video recording of the study session and will have four clickable buttons on the screen, one for each emotion zone. We will instruct the annotator to click the corresponding button as soon as they identify the emotion zone in question. When they select a zone, the system asks the annotator to indicate which behavioural indices were present that guided their decision. Examples of behavioural indices they can point to are a body movement, a facial expression, hand movements, a word said, etc.</p>
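      <p>As an illustration of what the annotation tool could store for each click, the record below combines the three pieces of information described above: where in the session the zone was identified, which zone, and which behavioural indices supported the decision. The field names are illustrative assumptions, not the tool's actual schema.</p>
      <preformat>
# Illustrative annotation record (field names are assumptions, not the tool's schema).
annotation = {
    "session_id": "session-01",
    "timestamp_s": 312.4,             # position in the session video
    "zone": "yellow",                 # green / yellow / red / blue
    "behavioural_indices": [          # cues the annotator observed
        "hand flapping",
        "raised voice",
    ],
    "annotator_role": "parent",       # parent/guardian or blind annotator
}
      </preformat>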
      <p>To create the multimodal annotated dataset, we will follow the methodology used by the authors of the RECOLA dataset [Ringeval et al., 2013]. RECOLA is a multimodal annotated dataset that has the same modalities we intend to include in this study, i.e. video, audio and physiological signals, and it has been used as a benchmark dataset in several multimodal emotion detection challenges. Its authors divided the session recordings into 5-minute videos and annotated fixed time windows of 400 milliseconds. They also balanced the training, validation and test sets according to the annotation distribution.</p>
      <p>Before running the experiments, we are going to conduct pilot sessions with children with typical development in the same 8-12 age range. By running pilot sessions, we intend to test the experiment protocol, data collection, data synchronisation and data analysis steps. We expect to verify whether the format of the collected data can be used for the data transformation, analysis and creation of a multimodal emotion detection system. We will also test the task environment software and the annotation software. With the information collected during the pilot sessions, we will iterate over the experiment protocol to add any improvements identified during the pilots. The collected pilot data will not be published and will be handled with the same planned measures for data protection and privacy as the data from the subsequent study. The results from the pilot will not be included in the project's results.</p>
      <p>This research was reviewed by the Institution’s Research
Ethics Committee and the Data Protection Office at NUI
Galway and obtained full approval.
</p>
    </sec>
    <sec id="sec-7">
      <title>Conclusion</title>
      <p>We presented a work in progress on an emotion detection system tailored for children with high-functioning autism. The model's novelty mainly involves two points: the inclusion of several data input modalities, and the fact that it is a person-independent model. The input modalities involved in the proposed model are video, audio and heart rate. The main foreseen contribution of this research work is the creation of a person-independent Multimodal Emotion Detection model to be integrated into affect-sensitive systems that support children with autism. Thanks to this research, such affect-sensitive systems will be able to identify the child's emotion zone and suggest or present activities to bring the child back to a calm emotional state, based on which emotional state the child is in at the moment.</p>
      <p>It is also part of this project's scope to make the multimodal dataset available to the research community. In order to protect the data subjects' privacy rights, the dataset will consist of the features extracted from the original raw audio/video files together with the heart rate measures. Therefore, it will only contain non-identifiable data.</p>
      <p>Finally, this project expects to build upon the current body
of knowledge on methods to apply Emotion Detection
systems to this specific subset of the general population. We
expect to apply the overall theoretical and practical design
perspectives that arise from this research investigation (e.g.
analysis of modalities and feature extraction, behavioural-cue-based features, fusion layers and classifier techniques)
to propose a guiding framework for future studies.</p>
      <p>We have currently had to temporarily pause the experiments for data collection. Meanwhile, we are working on the next phases of the methodology pipeline, investigating state-of-the-art person-independent multimodal emotion detection systems for the general population, to later propose how to adapt them to the population with ASD.</p>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgements</title>
      <p>This publication has emanated from research conducted with
the financial support of Science Foundation Ireland (SFI)
under Grant Number SFI/12/RC/2289 P2, co-funded by the
European Regional Development Fund.</p>
      <p>We are grateful to Aindrias Cullen for providing us with
comprehensive advice on data protection legislation, so we
could design a project that is compliant with GDPR. We thank
Dr Ciara Gunning for providing us with specialised advice on
how to work with children with ASD and how to design the
data collection experiment, as well as her support for
recruiting participants for this study.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <source>[Ahmed and Goodwin</source>
          , 2017]
          <article-title>Alex A Ahmed and Matthew S Goodwin. Automated detection of facial expressions during computer-assisted instruction in individuals on the autism spectrum</article-title>
          .
          <source>In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems</source>
          , pages
          <fpage>6050</fpage>
          -
          <lpage>6055</lpage>
          . ACM,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [Bekele et al.,
          <year>2016</year>
          ]
          <string-name>
            <given-names>Esubalew</given-names>
            <surname>Bekele</surname>
          </string-name>
          , Joshua Wade, Dayi Bian, Jing Fan, Amy Swanson, Zachary Warren, and
          <string-name>
            <given-names>Nilanjan</given-names>
            <surname>Sarkar</surname>
          </string-name>
          .
          <article-title>Multimodal adaptive social interaction in virtual environment (MASI-VR) for children with Autism spectrum disorders (ASD)</article-title>
          .
          <source>Proceedings - IEEE Virtual Reality</source>
          ,
          <fpage>2016</fpage>
          -July:
          <fpage>121</fpage>
          -
          <lpage>130</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [Boyd et al.,
          <year>2018</year>
          ] LouAnne
          <string-name>
            <given-names>E</given-names>
            <surname>Boyd</surname>
          </string-name>
          , Saumya Gupta, Sagar B Vikmani,
          <string-name>
            <surname>Carlos M Gutierrez</surname>
            ,
            <given-names>Junxiang</given-names>
          </string-name>
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>Erik</given-names>
          </string-name>
          <string-name>
            <surname>Linstead</surname>
          </string-name>
          , and Gillian R Hayes.
          <article-title>vrsocial: Toward immersive therapeutic vr systems for children with autism</article-title>
          .
          <source>In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, page 204. ACM</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [Cambria et al.,
          <year>2017</year>
          ]
          <string-name>
            <given-names>Erik</given-names>
            <surname>Cambria</surname>
          </string-name>
          , Devamanyu Hazarika, Soujanya Poria, Amir Hussain, and
          <string-name>
            <given-names>RBV</given-names>
            <surname>Subramanyam</surname>
          </string-name>
          .
          <article-title>Benchmarking multimodal sentiment analysis</article-title>
          .
          <source>In International Conference on Computational Linguistics and Intelligent Text Processing</source>
          , pages
          <fpage>166</fpage>
          -
          <lpage>179</lpage>
          . Springer,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [Chu et al.,
          <year>2018</year>
          ]
          <string-name>
            <given-names>Hui</given-names>
            <surname>Chuan Chu</surname>
          </string-name>
          , William Wei Jen Tsai, Min Ju Liao, and Yuh Min Chen.
          <article-title>Facial emotion recognition with transition detection for students with high-functioning autism in adaptive e-learning</article-title>
          .
          <source>Soft Computing</source>
          ,
          <volume>22</volume>
          (
          <issue>9</issue>
          ):
          <fpage>2973</fpage>
          -
          <lpage>2999</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [Dawood et al.,
          <year>2018</year>
          ]
          <string-name>
            <given-names>Amina</given-names>
            <surname>Dawood</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Scott</given-names>
            <surname>Turner</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Prithvi</given-names>
            <surname>Perepa</surname>
          </string-name>
          .
          <article-title>Affective Computational Model to Extract Natural Affective States of Students with Asperger Syndrome (AS) in Computer-based Learning Environment</article-title>
          .
          <source>IEEE Access</source>
          ,
          <volume>6</volume>
          :
          <fpage>67026</fpage>
          -
          <lpage>67034</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [Frauenberger et al.,
          <year>2012</year>
          ]
          <string-name>
            <given-names>Christopher</given-names>
            <surname>Frauenberger</surname>
          </string-name>
          , Judith Good, Alyssa Alcorn, and
          <string-name>
            <given-names>Helen</given-names>
            <surname>Pain</surname>
          </string-name>
          .
          <article-title>Supporting the design contributions of children with autism spectrum conditions</article-title>
          .
          <source>In Proceedings of the 11th International Conference on Interaction Design and Children</source>
          , pages
          <fpage>134</fpage>
          -
          <lpage>143</lpage>
          . ACM,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [Frauenberger et al.,
          <year>2016</year>
          ]
          <string-name>
            <given-names>Christopher</given-names>
            <surname>Frauenberger</surname>
          </string-name>
          , Judith Good, and
          <string-name>
            <given-names>Narcis</given-names>
            <surname>Pares</surname>
          </string-name>
          .
          <article-title>Autism and technology: Beyond assistance &amp; intervention</article-title>
          .
          <source>In Proceedings of the 2016 CHI Conference Extended Abstracts on Human Factors in Computing Systems</source>
          , pages
          <fpage>3373</fpage>
          -
          <lpage>3378</lpage>
          . ACM,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <source>[Gaus</source>
          , 2011] Valerie L Gaus.
          <article-title>Cognitive behavioural therapy for adults with autism spectrum disorder</article-title>
          .
          <source>Advances in Mental Health and Intellectual Disabilities</source>
          ,
          <volume>5</volume>
          (
          <issue>5</issue>
          ):
          <fpage>15</fpage>
          -
          <lpage>25</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [Grossard et al.,
          <year>2020</year>
          ]
          <string-name>
            <given-names>Charline</given-names>
            <surname>Grossard</surname>
          </string-name>
          , Arnaud Dapogny, David Cohen,
          <string-name>
            <given-names>Sacha</given-names>
            <surname>Bernheim</surname>
          </string-name>
          , Estelle Juillet, Fanny Hamel, Stéphanie Hun, Jérémy Bourgeois, Hugues Pellerin, Sylvie Serret, Kevin Bailly, and
          <string-name>
            <given-names>Laurence</given-names>
            <surname>Chaby</surname>
          </string-name>
          .
          <article-title>Children with autism spectrum disorder produce more ambiguous and less socially meaningful facial expressions: An experimental study using random forest classifiers</article-title>
          .
          <source>Molecular Autism</source>
          ,
          <volume>11</volume>
          (
          <issue>1</issue>
          ):
          <fpage>1</fpage>
          -
          <lpage>14</lpage>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [Hopkins et al.,
          <year>2011</year>
          ]
          <string-name>
            <given-names>Ingrid</given-names>
            <surname>Maria</surname>
          </string-name>
          <string-name>
            <surname>Hopkins</surname>
          </string-name>
          , Michael W Gower,
          <article-title>Trista A Perez, Dana S Smith</article-title>
          ,
          <string-name>
            <surname>Franklin R Amthor</surname>
            ,
            <given-names>F Casey</given-names>
          </string-name>
          <string-name>
            <surname>Wimsatt</surname>
          </string-name>
          , and
          <string-name>
            <surname>Fred</surname>
          </string-name>
          J Biasini.
          <article-title>Avatar assistant: improving social skills in students with an asd through a computer-based intervention</article-title>
          .
          <source>Journal of autism and developmental disorders</source>
          ,
          <volume>41</volume>
          (
          <issue>11</issue>
          ):
          <fpage>1543</fpage>
          -
          <lpage>1555</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [Hourcade et al.,
          <year>2012</year>
          ]
          <string-name>
            <given-names>Juan</given-names>
            <surname>Pablo</surname>
          </string-name>
          <string-name>
            <surname>Hourcade</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Natasha E</given-names>
            <surname>Bullock-Rest</surname>
          </string-name>
          , and Thomas E Hansen.
          <article-title>Multitouch tablet applications and activities to enhance the social skills of children with autism spectrum disorders</article-title>
          .
          <source>Personal and ubiquitous computing</source>
          ,
          <volume>16</volume>
          (
          <issue>2</issue>
          ):
          <fpage>157</fpage>
          -
          <lpage>168</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [Kushki et al.,
          <year>2015</year>
          ]
          <string-name>
            <given-names>Azadeh</given-names>
            <surname>Kushki</surname>
          </string-name>
          , Ajmal Khan, Jessica Brian, and
          <string-name>
            <given-names>Evdokia</given-names>
            <surname>Anagnostou</surname>
          </string-name>
          .
          <article-title>A Kalman filtering framework for physiological detection of anxiety-related arousal in children with autism spectrum disorder</article-title>
          .
          <source>IEEE Transactions on Biomedical Engineering</source>
          ,
          <volume>62</volume>
          (
          <issue>3</issue>
          ):
          <fpage>990</fpage>
          -
          <lpage>1000</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <source>[Kuypers</source>
          , 2013]
          <string-name>
            <given-names>Leah</given-names>
            <surname>Kuypers</surname>
          </string-name>
          .
          <article-title>The zones of regulation: A framework to foster self-regulation</article-title>
          .
          <source>Sensory Integration Special Interest Section Quarterly</source>
          ,
          <volume>36</volume>
          (
          <issue>4</issue>
          ):
          <fpage>1</fpage>
          -
          <lpage>4</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [Liu et al.,
          <year>2008</year>
          ] Changchun Liu, Karla Conn, Nilanjan Sarkar, and
          <string-name>
            <given-names>Wendy</given-names>
            <surname>Stone</surname>
          </string-name>
          .
          <article-title>Physiology-based affect recognition for computer-assisted intervention of children with Autism Spectrum Disorder</article-title>
          .
          <source>International Journal of Human Computer Studies</source>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [Ma et al.,
          <year>2019</year>
          ] Tengteng Ma, Hasti Sharifi, and
          <string-name>
            <given-names>Debaleena</given-names>
            <surname>Chattopadhyay</surname>
          </string-name>
          .
          <article-title>Virtual humans in health-related interventions: A meta-analysis</article-title>
          .
          <source>In Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems, page LBW1717. ACM</source>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [Marinoiu et al.,
          <year>2018</year>
          ]
          <string-name>
            <given-names>Elisabeta</given-names>
            <surname>Marinoiu</surname>
          </string-name>
          , Mihai Zanfir, Vlad Olaru, and
          <string-name>
            <given-names>Cristian</given-names>
            <surname>Sminchisescu</surname>
          </string-name>
          .
          <article-title>3d human sensing, action and emotion recognition in robot assisted therapy of children with autism</article-title>
          .
          <source>In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</source>
          , pages
          <fpage>2158</fpage>
          -
          <lpage>2167</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [Ramey et al.,
          <year>2019</year>
          ]
          <string-name>
            <given-names>Devon</given-names>
            <surname>Ramey</surname>
          </string-name>
          , Olive Healy, Russell Lang, Laura Gormley, and
          <string-name>
            <given-names>Nathan</given-names>
            <surname>Pullen</surname>
          </string-name>
          .
          <article-title>Mood as a dependent variable in behavioral interventions for individuals with asd: a systematic review</article-title>
          .
          <source>Review Journal of Autism and Developmental Disorders</source>
          , pages
          <fpage>1</fpage>
          -
          <lpage>19</lpage>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [Ringeval et al.,
          <year>2013</year>
          ]
          <string-name>
            <given-names>Fabien</given-names>
            <surname>Ringeval</surname>
          </string-name>
          , Andreas Sonderegger, Juergen Sauer, and
          <string-name>
            <given-names>Denis</given-names>
            <surname>Lalanne</surname>
          </string-name>
          .
          <article-title>Introducing the RECOLA multimodal corpus of remote collaborative and affective interactions</article-title>
          .
          <source>2013 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition</source>
          ,
          <string-name>
            <surname>FG</surname>
          </string-name>
          <year>2013</year>
          ,
          <article-title>(i</article-title>
          ),
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [Rudovic et al.,
          <year>2018</year>
          ]
          <string-name>
            <given-names>Ognjen</given-names>
            <surname>Rudovic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Jaeryoung</given-names>
            <surname>Lee</surname>
          </string-name>
          , Miles Dai, Björn Schuller, and
          <string-name>
            <surname>Rosalind</surname>
            <given-names>W Picard.</given-names>
          </string-name>
          <article-title>Personalized machine learning for robot perception of affect and engagement in autism therapy</article-title>
          .
          <source>Science Robotics</source>
          ,
          <volume>3</volume>
          (
          <issue>19</issue>
          ),
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [Saadatzi et al.,
          <year>2013</year>
          ]
          <string-name>
            <given-names>Mohammad</given-names>
            <surname>Nasser</surname>
          </string-name>
          <string-name>
            <surname>Saadatzi</surname>
          </string-name>
          , Karla Conn Welch, Robert Pennington, and
          <string-name>
            <given-names>James</given-names>
            <surname>Graham</surname>
          </string-name>
          .
          <article-title>Towards an affective computing feedback system to benefit underserved individuals: an example teaching social media skills</article-title>
          .
          <source>In International Conference on Universal Access in Human-Computer Interaction</source>
          , pages
          <fpage>504</fpage>
          -
          <lpage>513</lpage>
          . Springer,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [Samad et al.,
          <year>2018</year>
          ]
          <string-name>
            <given-names>Manar D.</given-names>
            <surname>Samad</surname>
          </string-name>
          ,
          <string-name>
            <surname>Norou</surname>
            <given-names>DIawara</given-names>
          </string-name>
          , Jonna L.
          <string-name>
            <surname>Bobzien</surname>
          </string-name>
          , John W. Harrington, Megan A.
          <string-name>
            <surname>Witherow</surname>
          </string-name>
          , and
          <string-name>
            <surname>Khan</surname>
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Iftekharuddin</surname>
          </string-name>
          .
          <article-title>A Feasibility Study of Autism Behavioral Markers in Spontaneous Facial, Visual, and Hand Movement Response Data</article-title>
          .
          <source>IEEE Transactions on Neural Systems and Rehabilitation Engineering</source>
          ,
          <volume>26</volume>
          (
          <issue>2</issue>
          ):
          <fpage>353</fpage>
          -
          <lpage>361</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [Sarabadani et al.,
          <year>2018</year>
          ]
          <string-name>
            <given-names>Sarah</given-names>
            <surname>Sarabadani</surname>
          </string-name>
          , Larissa Christina Schudlo,
          <string-name>
            <surname>Ali-Akbar Samadani</surname>
            , and
            <given-names>Azadeh</given-names>
          </string-name>
          <string-name>
            <surname>Kushki</surname>
          </string-name>
          .
          <article-title>Physiological detection of affective states in children with autism spectrum disorder</article-title>
          .
          <source>IEEE Transactions on Affective Computing</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          <source>[Scarpa and Reyes</source>
          , 2011]
          <string-name>
            <given-names>Angela</given-names>
            <surname>Scarpa and Nuri M Reyes.</surname>
          </string-name>
          <article-title>Improving emotion regulation with cbt in young children with high functioning autism spectrum disorders: A pilot study</article-title>
          .
          <source>Behavioural and cognitive psychotherapy</source>
          ,
          <volume>39</volume>
          (
          <issue>4</issue>
          ):
          <fpage>495</fpage>
          -
          <lpage>500</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [Sharmin et al.,
          <year>2018</year>
          ]
          <string-name>
            <given-names>Moushumi</given-names>
            <surname>Sharmin</surname>
          </string-name>
          , Md Monsur Hossain, Abir Saha,
          <string-name>
            <surname>Maitraye Das</surname>
          </string-name>
          ,
          <string-name>
            <surname>Margot Maxwell</surname>
            , and
            <given-names>Shameem</given-names>
          </string-name>
          <string-name>
            <surname>Ahmed</surname>
          </string-name>
          .
          <article-title>From research to practice: Informing the design of autism support smart technology</article-title>
          .
          <source>In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, page 102. ACM</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [Trevisan et al.,
          <year>2018</year>
          ]
          <article-title>Dominic A Trevisan, Maureen Hoskyn, and Elina Birmingham</article-title>
          .
          <article-title>Facial expression production in autism: A meta-analysis</article-title>
          .
          <source>Autism Research</source>
          ,
          <volume>11</volume>
          (
          <issue>12</issue>
          ):
          <fpage>1586</fpage>
          -
          <lpage>1601</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>