Emotion Elicitation in Socially Intelligent Services: the Intelligent Typing Tutor Study Case

Andrej Košir, University of Ljubljana, Faculty of Electrical Engineering, Tržaška cesta 25, Ljubljana, Slovenia (andrej.kosir@fe.uni-lj.si)
Marko Meža, University of Ljubljana, Faculty of Electrical Engineering, Tržaška cesta 25, Ljubljana, Slovenia (marko.meza@fe.uni-lj.si)
Janja Košir, University of Ljubljana, Faculty of Education, Kardeljeva ploščad, Ljubljana, Slovenia (janja.kosir@pef.uni-lj.si)
Matija Svetina, University of Ljubljana, Faculty of Arts, Tržaška cesta 2, Ljubljana, Slovenia (matija.svetina@ff.uni-lj.si)
Gregor Strle, Scientific Research Centre SAZU, Novi trg 2, Ljubljana, Slovenia (gregor.strle@zrc-sazu.si)

ABSTRACT

The paper discusses the challenges of user emotion elicitation in socially intelligent services, based on the experimental design and results of the intelligent typing tutor. Human-machine communication (HMC) in the typing tutor is supported by continuous real-time elicitation of the user's expressed emotions and by the emotional feedback of the service, conveyed through graphically rendered emoticons. It is argued that emotion elicitation is an important part of successful HMC, as it improves the communication loop and increases user engagement. Experimental results show that the user's valence and arousal are elicited during the typing practice, with the elicitation model explaining on average 18% to 25% of the variance in valence and 20% to 31% of the variance in arousal. However, the efficiency of emotion elicitation varies greatly throughout the use of the service, and also moderately among users. Overall, the results show that emotion elicitation, even via simple graphical emoticons, has significant potential in socially intelligent services.

Keywords

affective computing, emotion elicitation, social intelligence, human-machine communication, intelligent tutoring systems

EMPIRE 2016, September 16, 2016, Boston, MA, USA. Copyright held by the author(s).

1. INTRODUCTION

Bridging the gap between modern digital services and the increasing demands and (often insufficient) capabilities of a wide range of users is a challenging task. In recent years, much focus has been given to user adaptation procedures in socially intelligent services, including user modeling, recommender systems, and human-machine communication (HMC), among many others [10], [1], [11]. While there have been substantial advances in many of these areas, state-of-the-art technology still lacks satisfactory means to efficiently meet various user needs and/or tailor to users' capabilities. As the pool of potential new users of technology-supported services grows (e.g., groups of elderly users), so does the digital divide [42]. This gap may manifest itself in many forms. It may deprive a particular user group of efficient use of a service (e.g., due to a lack of technological proficiency), it may be limited in scope and only partially attend to users' needs (e.g., the use of multiple services for a series of common, integrated tasks), or it may offer some user groups no access to a service altogether (e.g., e-banking for elderly users). In general, it results in frustration and increased cognitive load, requiring significant effort to use a service (e.g., interaction, navigation, finding information), instead of the service adapting to user needs and capabilities.

One way to address these issues is to establish and sustain an efficient (close-to-human) level of communication between a user and a service, with HMC at the core of contextualization and adaptation procedures. Whereas natural (human-to-human) communication is innate and in general requires minimal effort from the actors involved to sustain it, HMC is void of both innateness and context, as well as of non-verbal (auditory, visual, olfactory) cues. Thus, for a modern digital service to be successful, it should be capable of expressing at least minimal social intelligence [45]. Another important and inherent property of natural communication is its continuity in real time. HMC should be able to exhibit some level of social intelligence by generating and processing social signals in near-real-time (the maximal tolerated delay is about 0.5 seconds). To sustain the feedback loop, the user should be at least minimally engaged, with non-verbal (social) signals (such as emotions) elicited at a continuous (minimal-delay) rate. Ideally, effective HMC should minimize the user-service adaptation procedures and maximize the engagement and the intended use of a service.
In other words, a service is socially intelligent when it is capable of reading (measuring and estimating) the user's social signals (verbal and/or non-verbal communication signals), producing machine-generated feedback on these signals, and sustaining and adapting the HMC accordingly.

In general, we believe it is possible to alleviate some of the main obstacles towards more effective user-service adaptation procedures by addressing the following:

• Non-intrusive user data acquisition. Some types of user data (e.g., the user's emotion state) should be tracked in near-real-time. The problem is that users do not like obtrusive data gathering methods (e.g., repeatedly filling in questionnaires or using wearable sensors in everyday situations). The state-of-the-art techniques for non-intrusive user data acquisition are limited and cannot provide sufficiently high-quality user data for efficient user-service adaptation procedures;

• Contextualization. Contextualization refers to the definition of circumstances relevant for specific user-service adaptation. Effective user adaptation is highly context-sensitive, as user involvement, attention and motivation, as well as preferences, are to a large extent context-dependent. The emergent technologies of the Internet of Things (IoT), wearable computing, ubiquitous computing, and others offer various building blocks to model specific contextualization tasks; however, user interaction data is typically not taken into account;

• Service functionality and content adaptation for the user. Ideally, a user adaptation procedure is successful when the service is able to adapt to (and improve upon) the user's needs and preferences in near real-time. As a result, the adaptation mechanisms of the service need to go beyond generally applicable adaptation procedures to address the specific task-dependent and user-interaction scenarios.

The aim of the paper is to analyze the efficiency of emotion elicitation in a socially intelligent service. The underlying assumption is that emotion elicitation should be an integral part of HMC, as it can greatly improve the user-service adaptation procedure. For this purpose, an experiment was conducted using the socially intelligent typing tutor. The tutor is a web-based learning service designed to elicit emotions and thus improve the learner's attention and overall engagement in the touch-typing training. Emotion elicitation is utilized together with the notion of positive reinforcement, where the learner is rewarded for her efforts through the emotional feedback of the service. Moreover, the tutor is able to model and analyze the learner's expressed emotions and measure the efficiency of emotion elicitation in the tutoring process.

The paper is structured as follows. Section 2 presents related work, while Section 3 discusses general aspects of emotion elicitation in socially intelligent services and then presents the socially intelligent typing tutor. Section 4 presents the experimental results on emotion elicitation in the intelligent typing tutor. The paper ends with a general conclusion and future work.

2. RELATED WORK

The research and development of a fully functioning socially intelligent service is still at a very early stage. However, various components that will ultimately enable such services have been under intensive development for several decades. We briefly present them, grouped into the following subsections.

2.1 Social intelligence, social signals and non-verbal communication cues

There are many definitions of social intelligence applicable in this context [23]. The wider definition used here is by Vernon [44], who defines social intelligence as the person's "ability to get along with people in general, social technique or ease in society, knowledge of social matters, susceptibility to stimuli from other members of a group, as well as insight into the temporary moods or underlying personality traits of strangers". Furthermore, social intelligence is demonstrated as the ability to express and recognize social cues and behaviors [2], [6], including various non-verbal cues (such as gestures, postures and facial expressions) exchanged during social interaction [47].

Social signals are being extensively analyzed in the field of human-computer interaction [47], [46], often under different terminology. For example, [33] use the term 'social signals' for the continuously available information required to estimate emotions, mood, personality, and other traits used in human communication. Others [31] call such information 'honest signals', as these cues allow one to make accurate predictions about the sender; on the other hand, one is not able to control non-verbal cues to the extent one can control the verbal form. Here, we will use the term social signal.

2.2 Socially intelligent learning services

Several services exist that support some level of social intelligence, ranging from emotion-aware to meta-cognitive. One of the more relevant examples is the intelligent tutoring system AutoTutor/Affective AutoTutor [15], which employs both affective and cognitive modelling to support learning and engagement, tailored to the individual user [15]. Some other examples include: Cognitive Tutor [7], an instruction-based system for mathematics and computer science; Help Tutor [3], a meta-cognitive variation of AutoTutor that aims to develop better general help-seeking strategies in students; MetaTutor [9], which aims to model the complex nature of self-regulated learning; and various constraint-based intelligent tutoring systems that model instructional domains at an abstract level [28], among many others. Studies on affective learning indicate the superiority of emotion-aware over non-emotion-aware services, with the former offering a significant performance increase in learning [37], [22], [43].

2.3 Computational models of emotion

One of the core requirements for a socially intelligent service is the ability to detect and recognize emotions, and to exhibit the capacity for expressing and eliciting basic affective (emotional) states. Most of the literature in this area is dedicated to affective computing and computational models of emotion [26], [25], [34], which are mainly based on the appraisal theory of emotions [48]. Several challenges remain, most notably the design, training and evaluation of computational models of emotion [20], their critical analysis and comparison, and their relevance for other research fields (e.g., cognitive science, human emotion psychology), as most computational models of emotion are overly simplistic [12].
2.4 Physiological sensors

The development of wearable sensors has enabled the acquisition of user data in near-real-time, as well as the research on the estimation of users' internal states (such as emotion and stress levels), which started more than a decade ago [5], [4]. Notable advances can also be found in the fields of physiological computing and HCI, with the development of several novel measurement-related procedures and techniques. For example, psychophysiological measurements are being employed to extend the communication bandwidth and develop smart technologies [18], along with design guidelines for conversational intelligence based on environmental sensors [14]. Several studies deal with human stress estimation [36], workload estimation [30], and cognitive load estimation [27], [8], among others, as well as with specific learning tasks related to physiological measurements [49], [21].

2.5 Human emotion elicitation

The field of affective computing has developed several approaches to the modeling, analysis and interpretation of human emotions [19]. The best known and most widely used emotion annotation and representation model is the Valence-Arousal-Dominance (VAD) emotion space, an extension of Russell's valence-arousal model of affect [35]. The VAD space is used in many human-machine interaction settings [50], [40], [32], and was also adopted in the socially intelligent typing tutor (see section 3.3.2). There are other attempts to define models of human emotions, such as specific emotion spaces for human-computer interaction [16], or more recently, models for the automatic and continuous analysis of human emotional behaviour [19]. Recent research on emotion perception argues that traditional emotion models might be overly simplistic, pointing out that the notion of emotion is multi-componential and includes "appraisals, psychophysiological activation, action tendencies, and motor expressions" [38]. Consequently, and relevant to the interpretations of valence in the existing models, some researchers argue there is a need for a "multifaceted conceptualization of valence" that can be linked to "qualitatively different types of evaluations" used in the appraisal theories [39].

Research on emotion elicitation via graphical user interfaces is far less common. Whereas several studies on emotion elicitation use different stimuli (e.g., pictures, movies, music) [41] and behavior cues [13], none to our knowledge tackle the challenges of graphical user interface design for the purpose of emotion elicitation.

In the intelligent typing tutor, user emotions are elicited by graphical emoticons (smileys) via the dynamic graphical user interface of the service. Emoticons were chosen for their semantic simplicity, unobtrusiveness, and suitability for continuous measurement; using pictures as stimuli would add cognitive load and would likely evoke multiple emotions. This approach also builds upon the results of previous research, which showed that human face-like graphics increase user engagement, that the recognition of emotions represented by emoticons is intuitive for humans, and that emotion elicitation based on emoticons is strong enough to be applicable [17]. The latter assumption is verified in this paper.

3. EMOTION ELICITATION IN SOCIALLY INTELLIGENT SERVICES: THE TYPING TUTOR STUDY CASE

The following sections discuss the role of emotion elicitation in socially intelligent services and its importance for efficient HMC. General requirements and the role of emotion elicitation are discussed in the context of our study case, the intelligent typing tutor. Later sections present the design of the intelligent typing tutor and its emotion elicitation model.

3.1 General requirements for a socially intelligent service

A given service is socially intelligent if it is capable of performing the following elements of social intelligence (a minimal control-loop sketch is given after the list):

1. Read relevant user behavior cues: human emotions are conveyed via behaviour and non-verbal communication cues such as facial expressions, gestures, body posture, tone of voice, etc.

2. Analyze, estimate and model user emotions and non-verbal (social) communication cues via a computational model: behavior cues are used to estimate the user's temporary emotion state. Selected physiological measurements (pupil size, acceleration of the wrist, etc.) are believed to be correlated with the user's emotion state and other non-verbal communication cues. These are used as inputs to the computational model of user emotions and other non-verbal communication cues.

3. Integrate and model machine-generated emotion expressions and other non-verbal communication cues: for example, the notion of positive reinforcement can be integrated into a service to improve user engagement, taking into account the user's temporary emotion state and other non-verbal communication cues.

4. Generate emotion elicitation to improve user engagement: a continuous feedback loop between the user's emotion state and machine-generated emotion expressions for the purpose of emotion elicitation.

5. Context- and task-dependent adaptation: adapt the service according to the design goals. For example, in the intelligent typing tutor case study, the intended goal is to improve the learner's engagement and progress. The touch-typing lessons are carefully designed and adapt in terms of typing speed and difficulty to meet the individual's capabilities, temporary emotion state and other non-verbal communication cues.

Such a service is capable of sustaining efficient, continuous and engaging HMC. It also minimizes user-service adaptation procedures. An early-stage example of such a service, the intelligent typing tutor, is presented in section 3.2.
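To make the five elements concrete, the following minimal sketch outlines one possible realization of the resulting HMC loop. It is illustrative only: all names (sensors.read, model.estimate, gui.render_emoticon, etc.) are hypothetical placeholders rather than the typing tutor's actual implementation; only the 0.5-second delay bound is taken from the introduction.

    # Illustrative sketch of a socially intelligent service loop (elements 1-5).
    # All objects and method names are hypothetical placeholders, not the
    # tutor's actual API; only the delay bound comes from the paper.
    import time

    MAX_DELAY = 0.5  # maximal tolerated feedback delay in seconds

    def service_loop(session_active, sensors, model, gui, task):
        while session_active():
            cues = sensors.read()                # 1. read behavior cues (face, gaze, wrist, keyboard)
            emotion = model.estimate(cues)       # 2. estimate user's temporary emotion state
            response = model.respond(emotion)    # 3. machine-generated emotion expression
            gui.render_emoticon(response)        # 4. elicit user emotion via graphical feedback
            task.adapt(emotion)                  # 5. context/task-dependent adaptation (speed, difficulty)
            time.sleep(MAX_DELAY)                # pace the loop at the near-real-time bound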
3.2 Typing tutor as a socially intelligent service

The overall goal of the socially intelligent typing tutor is to improve the process of learning touch-typing. For this purpose, emotion elicitation is integrated into HMC together with the notion of positive reinforcement, to amplify the attention, motivation, and engagement of the individual learner. In its current form, the rudimentary model of emotion elicitation utilizes emoticon-like graphics presented to the learner in real-time via the graphical user interface of the service (see section 3.3). The tutor uses state-of-the-art technology (section 3.2.1) and is able to model, measure and analyze emotion elicitation throughout the tutoring process.

3.2.1 Architecture and design

The typing tutor's main building blocks are:

1. Web GUI: supports typing lessons and machine-generated emotion expressions via emoticons (see Fig. 1);

2. Sensors: conduct physiological measurements and monitor user status (wrist accelerometer, camera, emotion-recognition software to estimate user emotions, eye gaze, pupil size, etc.);

3. Computational model: measures user emotions and attention in the tutoring process;

4. Recommender system: models machine-generated emotion expressions;

5. Typing content generator: follows typing lectures designed by an expert.

Real-time sensors are integrated into the service to gather physiological data about the learner. The recorded data is later used to establish the weak ground truth of the learner's attention and of the efficiency of emotion elicitation. Both are further estimated through a human annotation procedure, based on a carefully designed operational definition and verified using psychometric characteristics. The list of sensors integrated into the tutor includes:

• Keyboard: to monitor cognitive and locomotor errors that occur while typing;

• Video recorder: to extract the learner's facial emotion expressions in real-time;

• Wrist accelerometer and gyroscope: to trace the hand movement;

• Eye tracking: to measure pupil size and estimate the learner's attention and possible correlates to typing performance.

The intelligent typing tutor is publicly available as a client-server service running in a web browser (http://nacomnet.lucami.org/test/desetprstno\ tipkanje). Data is stored on the server for later analyses and human annotation procedures. Such an architecture allows for crowd-sourced testing and efficient remote maintenance.

3.3 Emotion elicitation in the intelligent typing tutor

The role of emotion elicitation in the intelligent typing tutor is that of efficient HMC and a reward system. The positive reinforcement assumption [29] is used in the design of the emotion elicitation model. Positive reinforcement argues that learning is best motivated by positive emotional responses from the service when the learner's ratio of attention over fatigue goes up, and vice versa. Here, machine-generated positive emotion expressions act as rewards, with the aim of improving the learner's attention, motivation and engagement during the touch-typing practice. The learner is rewarded with a positive emotional response from the service when she invests more effort into practice (the service does not support negative reinforcement). According to the positive reinforcement assumption, the rewarded behaviors will appear more frequently in the future. Negative reinforcement is not used for two reasons: there is no clear indication of how negative reinforcement would contribute to the learning experience, and it would require the introduction of an additional dimension, making the research topic of the experiment even more complex.

3.3.1 Machine emotion model

The intelligent typing tutor uses emotion elicitation to reward any behavior leading to the improvement of the learner's engagement with the service. The rewards come as positive emotional responses conveyed by the emoticon via the graphical user interface. The machine-generated emotion responses range from neutral to positive (smiley) and act as stimuli for user (learner) emotion elicitation. For this purpose, a subset of emoticons from the Official Unicode Consortium code chart (see http://www.unicode.org/) was selected, and emoticon-like graphical elements were integrated into the newly designed user interface of the service shown in Fig. 1.

Figure 1: Socially intelligent typing tutor integrates touch-typing tutoring and machine generated emoticons (for emotion elicitation) via its graphical user interface.

Emotional responses are computed according to the learning goals of the tutor. To improve the learner's attention and overall engagement in the touch-typing practice, the emotional feedback of the service needs to function in real-time. As mentioned above, the positive reinforcement assumption acts as the core underlying mechanism for modelling machine-generated emotions. At the same time, such a mechanism is suitable for dynamic personalization, similar to conversational RecSys [24]. In order to implement it successfully, the designer needs to decide 1. which behaviors need to be reinforced so that they appear more frequently, and 2. which rewards are relevant for the learner (a hypothetical sketch of this mapping is given below).
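As an illustration of this design decision, the sketch below maps the learner's state to the one-dimensional emoticon parametrization Φ_m in [0, 1] (introduced formally in section 3.3.2) and quantizes it onto a small emoticon set. The scoring heuristic, weights and names are our assumptions, not the tutor's published reward model; it only respects the constraint that feedback ranges from neutral to positive.

    # Hypothetical sketch of positive reinforcement via emoticon selection.
    # phi_m parametrizes machine emotion from 0 (neutral) to 1 (maximal
    # positive). Only positive reinforcement is modeled: the output is never
    # negative. Weights and inputs are assumptions, not the tutor's design.

    EMOTICONS = ["neutral", "slight_smile", "smile", "grin", "beaming"]  # placeholder names

    def phi_m(attention, fatigue, progress, w_att=0.6, w_prog=0.4):
        """Map learner state to machine emotion intensity in [0, 1].
        The attention/fatigue ratio and typing progress are assumed inputs."""
        ratio = attention / max(fatigue, 1e-6)
        score = w_att * min(ratio / 2.0, 1.0) + w_prog * progress
        return max(0.0, min(score, 1.0))  # clip to [0, 1]

    def select_emoticon(phi):
        """Quantize phi_m onto the available emoticon set."""
        idx = min(int(phi * len(EMOTICONS)), len(EMOTICONS) - 1)
        return EMOTICONS[idx]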
3.3.2 User emotion model

User (learner) emotions are elicited via the tutor's graphical user interface, based on the machine-generated emotion expressions from section 3.3.1. The VAD emotion model is used for the representation and measurement of the learner's elicited emotions, similar to [16]. The VAD dimensions are then measured in real-time by emotion recognition software (see section 4.1); here, we only discuss valence Φ_uV and arousal Φ_uA, the two primary dimensions for measuring emotion elicitation.

Two independent linear regression models are used to model user emotion elicitation as a response to the machine-generated emoticons. The models are fitted as follows: the measured values of user emotion elicitation for valence and arousal are fitted as dependent variables, whereas the machine-generated emotion expression is fitted as the independent variable. The aim is to obtain the models' quality of fit and the proportion of the explained variance in emotion elicitation:

    Φ_uV = β_1V Φ_m + β_0V + ε_V,    Φ_uA = β_1A Φ_m + β_0A + ε_A,    (1)

where Φ_m stands for the one-dimensional parametrization of the machine emoticon graphics, ranging from 0 (neutral emoticon) to 1 (maximal positive emotion expression). β_1V and β_1A are the linear model coefficients of user emotion elicitation, β_0V and β_0A are the averaged effects of other influences on user emotion elicitation, and ε_V and ε_A are independent white-noise variables.

The linear regression model was selected due to the good statistical power of its goodness-of-fit estimation R². There is no indication that emotion elicitation is linear, but we nevertheless believe the choice of the linear model is justified. The linear model is able to capture the emotion elicitation process, detect emotion elicitation, and provide valid results (see section 4.2). Residual plots (not reported here) show that the linear regression assumptions (homoscedasticity, normality of residuals) are not violated.

To further support our argument for emotion elicitation in the intelligent typing tutor, we statistically tested the hypothesis that a significant part of the learner's emotions is indeed elicited by the machine-generated emoticons. We did this with the null hypothesis testing H0 = [R² = 0] (see section 4.2), which demonstrated good power compared to the statistical tests of some known non-linear models.

4. USER EXPERIMENT: THE ESTIMATION OF USER EMOTION ELICITATION

The following sections give an overview of the user experiment and the results on emotion elicitation in the intelligent typing tutor.

4.1 User experiment

In the experiment, 32 subjects were invited to practice touch-typing with the intelligent typing tutor (see section 3.2); the average duration of a typing session was approx. 17 minutes (1020 seconds). The same set of carefully designed touch-typing lessons was given to all test subjects. User data was acquired in real-time using sensors (as described in section 3.2), used as input to the computational model of machine-generated emotion expressions, and recorded for later analysis. For the preliminary analysis presented here, five randomly selected subjects were analysed on a segment of the overall duration of the experiment, to simplify the presentation of the results (similar results were found for the remaining subjects). The test segment spans from 6 to 11.5 minutes of the experiment (330 seconds).

The test segment used for the analysis is composed of the following steps (due to limited space, the two disruption parts of the experiment, Steps 3 and 4, are not further discussed):

1. Instructions are given to the test users: users are personally informed about the goal and the procedure of the experiment (by the experiment personnel);

2. Setting up the sensory equipment, start of the experiment: a wrist accelerometer is put on, the video camera is set up, and the experimental session time recording is started (at 0 seconds);

3. At 60 seconds: machine-generated sound disruption of the primary task: "Name the first and the last letter of the word: mouse, letter, backpack, clock";

4. At 240 seconds: machine-generated sound disruption of the primary task: "Name the color of the smallest circle" in the figure (Fig. 2). This cognitive task is expected to significantly disrupt the learner's attention away from the typing exercise;

5. The test segment ends at 330 seconds.

Figure 2: Graphics shown during the second disruption (Step 4) at 240 seconds of the test segment.

During the experiment, users' emotion expressions are analyzed using the Noldus Observer video analysis software (http://www.noldus.com). The recordings are in sync with the machine-generated emoticons and readily available for analysis (see section 4.2).

4.2 Experimental results

The analysis of the experimental data was conducted to measure the effectiveness of emotion elicitation. The x-axis times for all graphs presented below are relative, in seconds [s], for the whole duration of the test segment (330 seconds). The estimation is based on fitting the emotion elicitation model (1). To detect the times when emotion elicitation is present, we conducted the null hypothesis testing H0 = [R² = 0] at risk level α = 0.05. Emotion elicitation is determined to be present where the null hypothesis is rejected, and not present otherwise.

An example of valence and arousal ratings for a randomly selected subject is shown in Fig. 3.
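A minimal sketch of a single fitting step is given below: it estimates model (1) on one window of time-aligned samples and tests H0 = [R² = 0]. For simple linear regression, the p-value of the slope t-test returned by scipy.stats.linregress coincides with the p-value of the F-test of H0. The synthetic data and variable names are ours; the 40-sample window length is explained in the text that follows.

    # Sketch: fit Eq. (1) on one window of time-aligned samples and test
    # H0: R^2 = 0. phi_m is the machine emoticon parametrization in [0, 1];
    # phi_u is measured user valence (or arousal). For simple linear
    # regression, the slope t-test p-value equals the F-test p-value of H0.
    import numpy as np
    from scipy.stats import linregress

    def fit_elicitation(phi_m, phi_u):
        """Return (R^2, p-value) of the linear elicitation model (1)."""
        fit = linregress(phi_m, phi_u)
        return fit.rvalue ** 2, fit.pvalue

    # Example on one synthetic 40-sample window (window length: section 4.2):
    rng = np.random.default_rng(0)
    phi_m = rng.uniform(0.0, 1.0, 40)
    phi_v = 0.4 * phi_m + 0.1 * rng.standard_normal(40)  # synthetic valence
    r2_v, p_v = fit_elicitation(phi_m, phi_v)
    elicited = p_v < 0.05  # significant elicitation at risk level alpha = 0.05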
The model (1) is fitted using linear regression on the measured data for the duration of the test segment. The data is sampled in a non-uniform manner due to the technical properties of the sensors (the internal clocks of the sensors are not sufficiently accurate, etc.). The data is approximated by continuous smooth B-splines of order 3, according to the upper frequency limit of the measured phenomena, and uniformly sampled to time-align the data (we skip the re-sampling details here).

Figure 3: Valence (black line) and arousal (magenta, light line) ratings of the learner's emotional state throughout the test segment.

To fit the regression models, the 40 samples preceding the current (evaluation) time, representing 4 seconds of real-time, were used. These two values were selected as an optimum between competing requirements: more statistical power (which requires more samples) and the ability to detect time-dynamic changes in the effectiveness of emotion elicitation (which requires a shorter time interval and therefore fewer samples). Note that changing this interval from 3 to 5 seconds did not significantly affect the fitting results. Results are given in terms of R²_V and R²_A, representing the part of the explained variance of valence and arousal when the elicitation is known, and in terms of the p-values p_V and p_A testing the null hypotheses of the regression models, H0V = [R²_V = 0] and H0A = [R²_A = 0], respectively. The time dynamics of emotion elicitation is represented by the p-values p_V and p_A in Fig. 4.

Figure 4: P-values for the null hypothesis testing H0 = [R² = 0] of emotion elicitation for a randomly selected subject, separately for valence (top) and arousal (bottom). The horizontal red line marks the risk level α = 0.05, with p-values below the line indicating a significant emotion elicitation effect.

In order to estimate the effect of emotion elicitation, percentages were computed from the number of times the elicitation was significant. The analyzed time intervals were uniformly sampled every 2 seconds. The results are shown in Table 1. It turned out that the test interval sampling had no significant impact on the results.

Table 1: Proportion q of the time when the measured emotion elicitation is significant. The notation red. q stands for the reduced efficiency, which is 5% lower than the measured one. Measured for the five selected test subjects.

    User Id   Valence q %   Valence red. q %   Arousal q %   Arousal red. q %
       1         47.7            45.3             43.2            41.1
       2         68.3            65.0             72.2            68.6
       3         60.0            57.0             61.3            58.2
       4         51.6            49.1             60.6            57.6
       5         62.3            59.4             61.9            58.8

We also analyzed the reduced percentages. These are 5% lower than the measured ones, since the significance testing was performed at risk level α = 0.05 and approximately 5% of the detections are therefore false (type I errors). Note that the Bonferroni correction does not apply here. We nevertheless computed the above percentages using the Bonferroni correction, and it turned out that the percentages drop to approximately one half of the reported values.

The strength of emotion elicitation is shown by the linear regression model R² as a function of time (Fig. 5).

Figure 5: Linear regression model R² of emotion elicitation for a randomly selected test subject, separately for valence (top) and arousal (bottom).

The strength of the emotion elicitation effect is significant, but also varies highly over time (Fig. 5). Similar results were detected among all test subjects. However, it is too early to draw any meaningful conclusions on the reasons for the high variability at this stage, as many of the potential factors influencing emotion elicitation need further analysis.
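For clarity, the sketch below reassembles the detection pipeline described above under its stated parameters: order-3 B-spline approximation of the non-uniformly sampled signals, 40-sample (4-second) regression windows evaluated every 2 seconds, per-window testing of H0 at α = 0.05, and the proportion q together with its reduced variant (5% lower). It is our reconstruction for illustration, not the authors' original analysis code.

    # Sketch of the detection pipeline: resample, slide 4 s windows, test H0,
    # and compute the proportion q of time with significant elicitation.
    # Parameters follow the text; the code itself is a reconstruction.
    import numpy as np
    from scipy.interpolate import make_interp_spline
    from scipy.stats import linregress

    ALPHA = 0.05   # risk level for H0 = [R^2 = 0]
    DT = 0.1       # uniform sampling step: 40 samples span 4 seconds
    WIN = 40       # samples per regression window
    STEP = 2.0     # windows are evaluated every 2 seconds

    def resample(t, y, t_uniform):
        """Order-3 B-spline approximation of a non-uniformly sampled signal.
        Assumes strictly increasing timestamps t."""
        return make_interp_spline(t, y, k=3)(t_uniform)

    def window_pvalue(x, y):
        # A constant emoticon signal makes the fit undefined; treat the
        # window as non-significant in that case.
        if np.ptp(x) == 0.0:
            return 1.0
        return linregress(x, y).pvalue

    def elicitation_q(t_m, phi_m, t_u, phi_u, t_end=330.0):
        t_uniform = np.arange(0.0, t_end, DT)
        x = resample(t_m, phi_m, t_uniform)   # machine emoticon signal
        y = resample(t_u, phi_u, t_uniform)   # measured valence or arousal
        starts = np.arange(0, len(t_uniform) - WIN, int(round(STEP / DT)))
        pvals = np.array([window_pvalue(x[s:s + WIN], y[s:s + WIN])
                          for s in starts])
        q = np.mean(pvals < ALPHA)            # proportion of significant windows
        return q, 0.95 * q                    # measured q and reduced q (Table 1)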
To estimate the average strength of emotion elicitation, the average values of R² were computed for the five selected subjects (as in Table 1); these values are the part of the explained variance of learner emotions when the machine-generated emotion is known. The average value of R² varies across test subjects from 18.3% to 24.5% for valence and from 19.7% to 31.4% for arousal over all time intervals (whether significant or non-significant elicitation is present). If we average only over the time intervals where the elicitation is significant, the average value of R² varies across test subjects from 32.5% to 39.3% for valence and from 36.3% to 44.9% for arousal (see Table 2).

Table 2: Average values of the explained variance for valence and arousal in %: for all time intervals and for the time intervals when emotion elicitation is significant. Measured for the five selected test subjects.

    User Id   Valence all int.   Valence signif. int.   Arousal all int.   Arousal signif. int.
       1           18.3                 32.5                  19.7                 36.3
       2           19.4                 33.8                  27.4                 39.2
       3           24.5                 39.3                  31.4                 44.9
       4           19.8                 33.3                  23.9                 39.9
       5           21.7                 35.4                  26.8                 40.2

Observe that there is considerably less variability among the subjects in terms of elicitation strength (average R²) than in terms of the proportions of time the elicitation is significant (see Table 1).

5. CONCLUSION AND FUTURE WORK

The paper discussed the efficiency of emotion elicitation in socially intelligent services. An experiment was conducted using the socially intelligent typing tutor. The overall aim of the intelligent typing tutor is to elicit emotions and thus improve learning and engagement in the touch-typing training. Emotion elicitation is utilized together with the notion of positive reinforcement. The tutor is able to model and analyze the learner's expressed emotions and measure the efficiency of emotion elicitation in the process. Experimental results show that the efficiency of emotion elicitation is significant, but at times varies highly for the individual learner and moderately among learners.

Future work will focus on the reasons for the variations in emotion elicitation by analyzing potential factors, such as the effects of machine-generated emotion expressions on emotion elicitation, the learner's emotional state, cognitive load, attention, and engagement, among others.

6. REFERENCES

[1] E. B. Ahmed, A. Nabli, and F. Gargouri. A survey of user-centric data warehouses: From personalization to recommendation. International Journal of Database Management Systems, 3(2):59–71, 2011.
[2] K. Albrecht. Social Intelligence: The New Science of Success. Pfeiffer, 1st edition, February 2009.
[3] V. Aleven, B. McLaren, I. Roll, and K. Koedinger. Toward meta-cognitive tutoring: A model of help seeking with a cognitive tutor. International Journal of Artificial Intelligence in Education, 16(2):101–128, 2006.
[4] J. Allanson and S. H. Fairclough. A research agenda for physiological computing. Interacting with Computers, 16(5):857–878, 2004.
[5] J. Allanson and G. Wilson. Physiological computing. In CHI '02 Extended Abstracts on Human Factors in Computing Systems, pages 21–42, 2002.
[6] N. Ambady and R. Rosenthal. Thin slices of expressive behavior as predictors of interpersonal consequences: A meta-analysis. Psychological Bulletin, 111:256–274, 1992.
[7] J. R. Anderson, A. T. Corbett, K. R. Koedinger, and R. Pelletier. Cognitive tutors: Lessons learned. Journal of the Learning Sciences, 4(2):167–207, 1995.
[8] Y. Ayzenberg, J. Hernandez, and R. Picard. FEEL: frequent EDA and event logging – a mobile social interaction stress monitoring system. In Proceedings of the 2012 ACM Annual Conference on Human Factors in Computing Systems, Extended Abstracts (CHI EA '12), page 2357, 2012.
[9] R. Azevedo, A. Witherspoon, A. Chauncey, C. Burkett, and A. Fike. MetaTutor: A metacognitive tool for enhancing self-regulated learning. In Annual Meeting of the American Association for Artificial Intelligence Symposium on Metacognitive and Cognitive Educational Systems, pages 14–19, 2009.
[10] P. Biswas and P. Robinson. A brief survey on user modelling in HCI. In Intelligent Techniques for Speech, Image and Language Processing. Springer-Verlag, 2010.
[11] J. Bobadilla, F. Ortega, A. Hernando, and A. Gutiérrez. Recommender systems survey. Knowledge-Based Systems, 46:109–132, 2013.
[12] J. Broekens, T. Bosse, and S. C. Marsella. Challenges in computational modeling of affective processes. IEEE Transactions on Affective Computing, 4(3):242–245, 2013.
[13] J. A. Coan and J. J. Allen, editors. Handbook of Emotion Elicitation and Assessment (Series in Affective Science). Oxford University Press, 1st edition, April 2007.
[14] D. C. Derrick, J. L. Jenkins, and J. F. Nunamaker, Jr. Design principles for special purpose, embodied, conversational intelligence with environmental sensors (SPECIES) agents. AIS Transactions on Human-Computer Interaction, 3(2):62–81, 2011.
[15] S. D'Mello and A. Graesser. AutoTutor and Affective AutoTutor: Learning by talking with cognitively and emotionally intelligent computers that talk back. ACM Transactions on Interactive Intelligent Systems, 2(4):23:1–23:39, January 2013.
[16] D. C. Dryer. Dominance and valence: a two-factor model for emotion in HCI. In Proceedings of the AAAI Fall Symposium Series on Affect and Cognition, pages 111–117. AAAI Press, 1998.
[17] J. Dunlap, D. Bose, P. R. Lowenthal, C. S. York, M. Atkinson, and J. Murtagh. What sunshine is to flowers: A literature review on the use of emoticons to support online learning. In Emotions, Design, Learning and Technology, pages 1–17, 2015.
[18] S. H. Fairclough. Fundamentals of physiological computing. Interacting with Computers, 21(1-2):133–145, 2009.
[19] H. Gunes, B. Schuller, M. Pantic, and R. Cowie. Emotion representation, analysis and synthesis in continuous space: A survey. In 2011 IEEE International Conference on Automatic Face and Gesture Recognition and Workshops (FG 2011), pages 827–834, 2011.
[20] E. Hudlicka. Guidelines for designing computational models of emotions. International Journal of Synthetic Emotions, 2(1):26–79, 2011.
[21] X. Jiang, B. Zheng, R. Bednarik, and M. S. Atkins. Pupil responses to continuous aiming movements. International Journal of Human-Computer Studies, 83:1–11, 2015.
[22] K. R. Koedinger and A. Corbett. Cognitive tutors: Technology bringing learning sciences to the classroom. In The Cambridge Handbook of the Learning Sciences, pages 61–78. Cambridge University Press, New York, 2006.
[23] J. F. Kihlstrom and N. Cantor. Social intelligence. In R. Sternberg, editor, Handbook of Intelligence, pages 359–379. Cambridge University Press, Cambridge, U.K., 2nd edition, 2000.
[24] T. Mahmood, G. Mujtaba, and A. Venturini. Dynamic personalization in conversational recommender systems. Information Systems and e-Business Management, pages 1–26, 2013.
[25] S. Marsella and J. Gratch. Computationally modeling human emotion. Communications of the ACM, 57(12):56–67, 2014.
[26] S. Marsella, J. Gratch, and P. Petta. Computational models of emotion. In A Blueprint for Affective Computing (Series in Affective Science). Oxford University Press, 2010.
[27] D. McDuff, S. Gontarek, and R. Picard. Remote measurement of cognitive stress via heart rate variability. In Proceedings of the 36th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pages 2957–2960, 2014.
[28] A. Mitrovic, B. Martin, and P. Suraweera. Intelligent tutors for all: The constraint-based approach. IEEE Intelligent Systems, 22(4):38–45, 2007.
[29] A. Neuringer. Operant variability: Evidence, functions, and theory. Psychonomic Bulletin & Review, 9(4):672–705, 2002.
[30] D. Novak, B. Beyeler, X. Omlin, and R. Riener. Workload estimation in physical human–robot interaction using physiological measurements. Interacting with Computers, 2014.
[31] A. Pentland. Honest Signals: How They Shape Our World. The MIT Press, 2008.
[32] J. L. Plass, S. Heidig, E. O. Hayward, B. D. Homer, and E. Um. Emotional design in multimedia learning: Effects of shape and color on affect and learning. Learning and Instruction, 29:128–140, 2014.
[33] V. P. Richmond, J. C. McCroskey, and M. L. Hickson III. Nonverbal Behavior in Interpersonal Relations. Pearson, 7th edition, April 2011.
[34] L.-F. Rodriguez and F. Ramos. Development of computational models of emotions for autonomous agents: A review. Cognitive Computation, pages 351–375, 2014.
[35] J. A. Russell. A circumplex model of affect. Journal of Personality and Social Psychology, 39(6):1161–1178, 1980.
[36] V. Sandulescu, S. Andrews, E. David, N. Bellotto, and O. M. Mozos. Stress detection using wearable physiological sensors. In Artificial Computation in Biology and Medicine, Lecture Notes in Computer Science, pages 526–532, 2015.
[37] L. Shen, M. Wang, and R. Shen. Affective e-learning: Using "emotional" data to improve learning in pervasive learning environment. Educational Technology & Society, 12:176–189, 2009.
[38] V. Shuman, E. Clark-Polner, B. Meuleman, D. Sander, and K. R. Scherer. Emotion perception from a componential perspective. Cognition and Emotion, pages 1–10, 2015.
[39] V. Shuman, D. Sander, and K. R. Scherer. Levels of valence. Frontiers in Psychology, 4:1–17, 2013.
[40] T. Tijs, D. Brokken, and W. Ijsselsteijn. Creating an emotionally adaptive game. Lecture Notes in Computer Science, 5309 LNCS:122–133, 2008.
[41] M. K. Uhrig, N. Trautmann, U. Baumgärtner, R.-D. Treede, F. Henrich, W. Hiller, and S. Marschall. Emotion elicitation: A comparison of pictures and films. Frontiers in Psychology, 7:180, 2016.
[42] J. van Dijk. Digital divide research, achievements and shortcomings. Poetics, 34(4-5):221–235, 2006.
[43] K. VanLehn, A. C. Graesser, G. T. Jackson, P. Jordan, A. Olney, and C. P. Rosé. When are tutorial dialogues more effective than reading? Cognitive Science, 30:1–60, 2006.
[44] P. E. Vernon. Some characteristics of the good judge of personality. The Journal of Social Psychology, 4(1):42–57, 1933.
[45] A. Vinciarelli, M. Pantic, H. Bourlard, and A. Pentland. Social signals, their function, and automatic analysis: a survey. In Proceedings of the 10th International Conference on Multimodal Interfaces, pages 61–68, 2008.
[46] A. Vinciarelli, M. Pantic, D. Heylen, C. Pelachaud, I. Poggi, F. D'Errico, and M. Schroeder. Bridging the gap between social animal and unsocial machine: A survey of social signal processing. IEEE Transactions on Affective Computing, 3(1):69–87, 2012.
[47] A. Vinciarelli and F. Valente. Social signal processing: Understanding nonverbal communication in social interactions. In Proceedings of Measuring Behavior, 2010.
[48] T. Wehrle and K. R. Scherer. Towards computational modeling of appraisal theories. In Appraisal Processes in Emotion, pages 350–365. Oxford University Press, New York, 2001.
[49] V. Xia, N. Jaques, S. Taylor, S. Fedor, and R. Picard. Active learning for electrodermal activity classification. In 2015 IEEE Signal Processing in Medicine and Biology Symposium (SPMB), pages 1–6. IEEE, 2015.
[50] Y.-C. Yeh, S. C. Lai, and C.-W. Lin. The dynamic influence of emotions on game-based creativity: An integrated analysis of emotional valence, activation strength, and regulation focus. Computers in Human Behavior, 55:817–825, 2016.