<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Sensor-based Data Fusion for Multimodal Affect Detection in Game-based Learning Environments</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Nathan L. Henderson</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jonathan P. Rowe</string-name>
          <email>jprowe@ncsu.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bradford W. Mott</string-name>
          <email>bwmott@ncsu.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>James C. Lester</string-name>
          <email>lester@ncsu.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>North Carolina State University Raleigh</institution>
          ,
          <addr-line>North Carolina, 27695</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Affect detection is central to educational data mining because of its potential contribution to predicting learning processes and outcomes. Using multiple modalities has been shown to increase the performance of affect detection. With the rise of sensor-based modalities due to their relatively low cost and high level of flexibility, there has been a marked increase in research efforts pertaining to sensor-based, multimodal systems for affective computing problems. In this paper, we demonstrate the impact that multimodal systems can have when using Microsoft Kinect-based posture data and electrodermal activity data for the analysis of affective states displayed by students engaged with a game-based learning environment. We compare the effectiveness of both support vector machines and deep neural networks as affect classifiers. Additionally, we evaluate different types of data fusion to determine which method for combining the separate modalities yields the highest classification rate. Results indicate that multimodal approaches outperform unimodal baseline classifiers, and feature-level concatenation offers the highest performance among the data fusion techniques.</p>
      </abstract>
      <kwd-group>
        <kwd>Affect detection</kwd>
        <kwd>data fusion</kwd>
        <kwd>deep learning</kwd>
        <kwd>posture</kwd>
        <kwd>electrodermal activity</kwd>
        <kwd>sensor-based learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>Affect detection plays a role of growing importance in educational
data mining. Accurately detecting affect is vital to understanding
learning. While states such as confusion or engagement have been
previously correlated with positive learning outcomes [20], other
emotions such as boredom have been associated with negative
learning outcomes [5]. Moreover, it has been found that affect
detection can potentially be used to help avoid negative learning
outcomes [10].</p>
      <p>To more closely model the human cognitive perception and
recognition of certain states, affective modeling techniques have
expanded to include multiple parallel data streams that are
processed simultaneously to form a single affect prediction or
approximation; such systems are referred to as “multimodal” [2].
Each data stream, or “modality,” can be provided by a wide array
of sources ranging from user interaction logs to eye gaze tracking.
The processing of multiple independent modalities has been shown
to boost affect classifier performance [6] and provide additional
insight into the various aspects of a student’s interaction with an
intelligent tutoring system [11]. Multimodal computing can be
highly beneficial to affective computing and educational data
mining tasks by providing multiple complementary perspectives on
a single subject or event [3].</p>
      <p>
        A common implementation of multimodal affect detection systems
utilizes sensors as perceptors to capture physical data and activity.
This enables the system to process different types of physiological
and positional information that signify different affective states of
students. Sensors are commonly deployed within multimodal
systems due to their relatively low expense, flexibility with regards
to hardware and software requirements, and generalization across a
variety of domains. Consequently, sensor-based multimodal
systems have been the focus of several research efforts in recent
years. Examples of sensor-based modalities include facial
expression [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], posture [9], electroencephalogram (EEG) data [24],
and electrodermal activity (EDA) [15].
      </p>
      <p>Sensor-based systems are not without inherent challenges [7]. Such
systems can be plagued by calibration problems, mistracking, noise,
irregular behavior, inconsistent data transfer, and synchronization
issues. The cultural and social behaviors of participants engaged with
a sensor-based system can also impact performance. In certain
instances, a sensor may malfunction for an extended period of time,
resulting in large intervals of missing or invalid data for one or
more modalities.</p>
      <p>In this paper, we investigate sensor-based multimodal models for
affect detection using data from students engaged with a game-based
learning environment for emergency medicine. We utilize student
posture information captured by a Microsoft Kinect, as well as EDA
data captured by an Affectiva Q-Sensor. We compare the performance of
support vector machine (SVM) and deep feedforward neural network
models as affect classifiers using unimodal data, as well as
multimodal data combining the posture and EDA data channels. Finally,
we evaluate three different variations of data fusion for the
multimodal affect classifiers. Results suggest improved performance of
multimodal classifiers as compared to unimodal classifiers trained on
separate Kinect and Q-Sensor modalities, and they reveal the impact
that different data fusion techniques have on a classifier’s accuracy
with multimodal datasets.</p>
    </sec>
    <sec id="sec-2">
      <title>2. RELATED WORK</title>
      <p>Because of their domain independence, sensors have been
integrated into a wide selection of multimodal affect detection
systems. Pei et al. [23] utilize long short-term memory (LSTM)
recurrent neural networks for a binary affect classification task with
audio and visual recordings. Nazari et al. [18] implement a
multimodal system to detect instances of narcissism in individuals
using modalities such as facial expressions, dialogue, vocal
acoustics, and behavioral cues. Facial tracking is paired with
self-assessment post-tests to detect student engagement with
MetaTutor, an adaptive learning system with a curricular focus on
the human circulatory system [12]. Additionally, Muller et al. [17]
implement a multimodal affect detection system based on human
pose, motion tracking, and speech to classify instances of four
affective states (anger, happiness, sadness, and surprise) as well as
to estimate continuous-valued levels of valence and arousal. Other
sensor-based systems use modalities such as eye gaze to predict
learning outcomes using gradient tree boosting algorithms [25].</p>
      <p>The use of posture data within affect detection systems has
experienced a significant increase in recent years. Low-cost sensors
such as the Microsoft Kinect have allowed this modality to be easily
integrated into multimodal systems. As shown in [22], Kinect-based
posture data can be used by supervised and rule-based algorithms to
detect various affective states. Likewise, Grafsgaard et al. [9] use
Kinect data to estimate student engagement in computer-based tutoring
systems used to teach introductory programming concepts. Shifts in
posture have been linked to affective states such as frustration, and
thus have been associated with negative learning outcomes [9]. When
used in conjunction with other modalities such as facial expression
and gesture tracking, posture can also be indicative of engagement,
learning, and self-efficacy, as [10] demonstrates through the use of
stepwise linear regression techniques. Finally, Kinect data has also
been utilized for tasks involving anger detection [21] and biometric
identification [24].</p>
      <p>In addition to posture and pose-related data, advances in
multimodal systems have also extended to biosignal modalities.
Examples of such work include [24], where Kinect-based posture
data is combined with EEG data through sensor fusion to construct
a reliable biometric identification model. Additional low-cost
sensors have been used to capture EEG, EDA, and electromyography
(EMG) data, with results indicating that a multimodal approach
outperformed unimodal detectors for arousal and valence levels [8].
Support vector machines have been used with EEG and eye gaze data
to predict emotional responses to videos [27]. The
combination of EDA and EEG data has likewise been applied to the
problem of stress detection [15] and frustration detection [7]. EDA
has been paired with Kinect-based posture data and webcam-based
facial expression data to predict students’ instances of frustration
and engagement in response to tutor questions in an educational
environment [29].</p>
    </sec>
    <sec id="sec-3">
      <title>3. DATASET</title>
      <p>We investigate different multimodal affect classifiers within the
context of a game-based environment for emergency medical
training, the Tactical Combat Casualty Care Simulation (TC3Sim).
Developed by Engineering and Computer Simulations (ECS),
TC3Sim is widely used by the U.S. Army to provide realistic
combat medic simulations for soldiers. Students assume the
first-person perspective of a combat medic involved in different
scenarios alongside a variety of non-player characters (NPCs).
During a training scenario, participants are faced with different
tasks in real time such as securing the area, applying appropriate
medical care to combat victims, and preparing for evacuation. The
Kinect-based posture data and the EDA data collected by the Q-Sensor
were captured during four different training scenarios: a leg injury
scenario, an introductory training scenario, a story-driven narrative
scenario, and a patient expiration scenario that portrays a combat
victim expiring regardless of the actions of the player. A screenshot
of a player’s first-person perspective when engaged with TC3Sim
is shown in Figure 1.</p>
      <p>The dataset used in this work was collected from a study with 119
cadets from the United States Military Academy (83% male, 17%
female) who participated in different training sessions with
TC3Sim. All participants completed the same training materials,
which were administered through the Generalized Intelligent
Framework for Tutoring (GIFT), a service-oriented software framework
designed to aid in the development and deployment of computer-based
adaptive training systems [28]. Each participant worked individually
at a single workstation, and each session lasted approximately one
hour. The posture activity for each participant was captured using a
Microsoft Kinect for Windows 1.0 sensor. The head and torso positions
and movements were captured using skeleton-tracking features contained
in the GIFT framework. The data from the Kinect was sampled at a rate
of 10-12 Hz. This modality consisted of timestamped feature vectors
containing the coordinates of 91 vertices. For this effort, three
vertices were selected in accordance with prior research regarding
affect detection with Kinect data [9]: top_skull, center_shoulder, and
head. An additional 73 features were engineered from this modality
during the post-processing stage. These features were summary
statistics, such as the mean, variance, and standard deviation of the
different vertices, computed over time windows of 5, 10, and 20
seconds prior to each observation.</p>
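      <p>To make the windowed feature engineering concrete, the following Python sketch computes summary statistics over the 5-, 10-, and 20-second windows preceding an observation. This is a minimal illustration rather than the study’s actual pipeline; the DataFrame layout, the column name (top_skull_z), and the synthetic data are assumptions.</p>
      <preformat>
import numpy as np
import pandas as pd

def windowed_stats(stream, obs_time, windows=(5, 10, 20)):
    """Summary statistics over the windows preceding one observation."""
    feats = {}
    for w in windows:
        # All sensor readings in the w seconds before the observation.
        segment = stream.loc[obs_time - pd.Timedelta(seconds=w):obs_time]
        for col in stream.columns:
            feats[f"{col}_mean_{w}s"] = segment[col].mean()
            feats[f"{col}_var_{w}s"] = segment[col].var()
            feats[f"{col}_std_{w}s"] = segment[col].std()
    return feats

# Synthetic example: one vertex coordinate sampled at ~10 Hz.
idx = pd.date_range("2019-01-01", periods=600, freq="100ms")
stream = pd.DataFrame({"top_skull_z": np.random.randn(600)}, index=idx)
print(windowed_stats(stream, idx[-1]))
      </preformat>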
      <p>In addition to the postural modality, electrodermal activity was
captured from each user using an Affectiva Q-Sensor bracelet worn
by each participant. The Q-Sensor captured each user’s skin
temperature, electrodermal activity, and the sensor’s acceleration
vectors as determined by an onboard accelerometer. However, in
this study, only the EDA readings were used for affect detectors. In
a similar fashion to the posture modality, summary statistics were
calculated for the EDA modality such as the min, max, and variance
of the EDA values for each session, as well as the summary
statistics across time windows of the prior 5, 10, and 20 seconds.
The net changes in the EDA levels across the previous 3 and 20
seconds were also calculated. However, the Q-Sensors experienced
highly inconsistent behavior with regard to the data capture, which
affected approximately half of the collected data. Additionally, the
interaction trace log data from each session was captured by the
GIFT framework, but because this work focuses exclusively on
sensor-based modalities, this data was not utilized.</p>
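      <p>The EDA features can be sketched along the same lines. The “net change” over a window is interpreted here as the last reading minus the first; this interpretation, like the variable names, is an illustrative assumption.</p>
      <preformat>
import pandas as pd

def eda_features(eda, obs_time):
    """Session-level summary statistics plus short-horizon net changes."""
    feats = {"eda_min": eda.min(), "eda_max": eda.max(), "eda_var": eda.var()}
    for w in (3, 20):
        segment = eda.loc[obs_time - pd.Timedelta(seconds=w):obs_time]
        # Net change over the window: last reading minus first reading.
        feats[f"eda_delta_{w}s"] = segment.iloc[-1] - segment.iloc[0]
    return feats

# Synthetic example: a ramping EDA signal sampled at 4 Hz.
idx = pd.date_range("2019-01-01", periods=300, freq="250ms")
eda = pd.Series(range(300), index=idx, dtype=float)
print(eda_features(eda, idx[-1]))
      </preformat>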
      <p>To obtain ground truth labels of each student’s affective states, two
trained observers marked instances of different displays of affect in
accordance with the BROMP protocol [19]. BROMP is a quantitative
observation protocol for run-time coding of student affect and
behavior during classroom-based interactions [19]. During this
process, the two observers walked around the perimeter of the
classroom and discreetly marked instances of affect in 20-second
intervals using a handheld device. Affective states recorded include
bored, confused, engaged, frustrated, and surprised.</p>
      <p>A total of 3,066 separate BROMP observations were collected.
Observations where the two observers disagreed were discarded, and
agreeing observations were treated as a single label. Only
observations recorded during students’ actual engagement with the
TC3Sim exercise were preserved, excluding instances during the pre-
and post-test surveys, as well as instances occurring during the
instructional PowerPoint presentation. Additional factors contributing
to the significant reduction in BROMP observations were the subtlety
of instances of affect in the cadets compared to classroom
participants, as well as cases of multiple different affective states
being observed within the same 20-second window. The resulting dataset
contained 755 distinct BROMP observations; the distribution of affect
instances is shown in Figure 2. Instances of engagement were by far
the most common occurrence, while instances of frustration and
surprise were sparse. As stated previously, the Q-Sensor experienced
frequent stops in data logging. This issue resulted in 333 BROMP
observations containing missing EDA information, while a subset of 422
data samples contained both the posture and the EDA modalities. The
posture-based modality did not appear to suffer any data loss from the
Kinect sensor.</p>
      <p>[Figure 2. Number of instances of each BROMP-labeled affective state: Bored 73, Confused 174, Engaged 435, Frustrated 32, Surprised 29.]</p>
    </sec>
    <sec id="sec-4">
      <title>4. METHODOLOGY</title>
      <p>The primary goal of this paper is to demonstrate the effectiveness
of a multimodal classification system for affect detection using two
modalities: Kinect-based posture data and electrodermal activity
data. To ensure that both modalities are present in each data sample,
any BROMP observation with missing or invalid EDA data was
removed from the dataset. Therefore, our classifiers were trained
on a dataset of 422 BROMP observations containing paired
posture and EDA data.</p>
    </sec>
    <sec id="sec-5">
      <title>4.1 Data Preprocessing</title>
      <p>After the aforementioned BROMP observations were removed
from the dataset, five separate datasets were created through
oversampling for each affective state. The oversampling was
accomplished using a minority-class cloning technique.
Additionally, the feature data was scaled using z-score
standardization. This method ensures that each attribute of the
feature vectors has the same mean and standard deviation (zero and
one, respectively) while still allowing for different ranges.</p>
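      <p>A minimal sketch of these two preprocessing steps is shown below, assuming binary labels per affective state; the paper does not specify how the cloning was randomized, so duplicating randomly drawn minority rows with replacement is assumed.</p>
      <preformat>
import numpy as np

def zscore(X):
    """Standardize each feature column to zero mean and unit std dev."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def clone_minority(X, y, seed=0):
    """Oversample the minority class by cloning randomly drawn rows
    until the two classes are balanced."""
    rng = np.random.default_rng(seed)
    pos, neg = np.flatnonzero(y == 1), np.flatnonzero(y == 0)
    minority, majority = sorted((pos, neg), key=len)
    extra = rng.choice(minority, size=len(majority) - len(minority),
                       replace=True)
    keep = np.concatenate([np.arange(len(y)), extra])
    return X[keep], y[keep]
      </preformat>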
    </sec>
    <sec id="sec-6">
      <title>4.2 Feature Selection</title>
      <p>Prior to training the classifiers, each dataset underwent forward
selection for the purpose of feature selection. This reduces the
number of attributes in each dataset through a greedy algorithm that
repeatedly trains a model and selects between 0 and k features based
on each model’s Cohen’s Kappa [4]. For our work, a k value of 10 was
chosen. The model used in feature selection was the sequential
minimal optimization (SMO) support vector machine [7]. This
polynomial-kernel model was selected due to its linear memory
requirements and scalability, as a high number of models were
trained to obtain the best features. An attribute was not considered
unless it showed a positive improvement over the currently selected
feature set, and the attribute showing the highest improvement was
kept as a selected feature. The feature selection was implemented
using RapidMiner 9.0 [16]. This platform was selected due to its
convenience as a toolkit for implementing the data processing
pipeline, as well as its use in prior work on affect detection [7].</p>
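      <p>The following sketch illustrates the greedy forward-selection loop under the stopping criteria described above. It uses scikit-learn’s polynomial-kernel SVC and 5-fold cross-validated Kappa as stand-ins for the RapidMiner SMO implementation; these substitutions are assumptions for illustration.</p>
      <preformat>
from sklearn.metrics import cohen_kappa_score
from sklearn.model_selection import cross_val_predict
from sklearn.svm import SVC

def forward_select(X, y, k=10):
    """Greedy forward selection: add the feature that most improves
    cross-validated Cohen's Kappa, keeping at most k features and
    stopping when no candidate yields a positive improvement."""
    selected, best_kappa = [], 0.0
    for _ in range(k):
        best_feat = None
        for j in range(X.shape[1]):
            if j in selected:
                continue
            preds = cross_val_predict(SVC(kernel="poly"),
                                      X[:, selected + [j]], y, cv=5)
            kappa = cohen_kappa_score(y, preds)
            if kappa > best_kappa:
                best_kappa, best_feat = kappa, j
        if best_feat is None:  # no positive improvement: stop early
            break
        selected.append(best_feat)
    return selected
      </preformat>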
    </sec>
    <sec id="sec-7">
      <title>4.3 Classifiers</title>
      <p>Prior work has demonstrated the effectiveness of deep neural
networks in affect classification tasks [14]. We utilize the same
neural network approach and compare it with SVM models. The
SVMs use a radial basis function kernel with a convergence epsilon
of 0.001 and a maximum of 100,000 iterations. The artificial neural
network (ANN) architecture contained feed-forward layers of 800,
800, 500, 100, and 50 nodes, respectively, in addition to a binary
classification layer. Each layer’s activation function was a
rectified linear unit (ReLU). Each network was trained for 10
epochs with the ADADELTA adaptive learning rate method [30]. A
separate classifier was trained for each affective state, using the
selected features of the oversampled data as described in Section
4.1.</p>
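      <p>The paper does not name the neural network framework used; the following Keras sketch reproduces the reported architecture, with a sigmoid output unit assumed for the binary classification layer.</p>
      <preformat>
from tensorflow import keras

def build_affect_ann(n_features):
    """Feed-forward network with the layer sizes reported above."""
    model = keras.Sequential([
        keras.Input(shape=(n_features,)),
        keras.layers.Dense(800, activation="relu"),
        keras.layers.Dense(800, activation="relu"),
        keras.layers.Dense(500, activation="relu"),
        keras.layers.Dense(100, activation="relu"),
        keras.layers.Dense(50, activation="relu"),
        keras.layers.Dense(1, activation="sigmoid"),  # binary affect label
    ])
    model.compile(optimizer=keras.optimizers.Adadelta(),
                  loss="binary_crossentropy", metrics=["accuracy"])
    return model

# Usage (one detector per affective state):
# model = build_affect_ann(n_features=20)
# model.fit(X_train, y_train, epochs=10)
      </preformat>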
    </sec>
    <sec id="sec-8">
      <title>4.4 Data Fusion</title>
      <p>To evaluate different methods of integrating the two modalities for
affect classification, we implement several variations of data fusion
techniques. We test two types of data fusion: feature-level fusion
(“Early Fusion”) and decision-level fusion (“Late Fusion”). Early
Fusion involves the concatenation of features from the posture and
EDA modalities prior to training the affect classifier. Late Fusion
calls for the training of separate classifiers for each modality; the
predicted confidence levels of each binary class (positive or
negative label of the affective state) are then processed by a voting
scheme to produce a single prediction of the affective state.
The voting scheme can be implemented in different ways, such
as majority voting, averaging, or weighting [2]. For this paper, we
take the highest confidence value across the two classifiers and use
the associated class as our final prediction. Two different
variations of Early Fusion are also evaluated. The first variation,
referred to in this paper as “Early Fusion 1”, concatenates the
features prior to the feature selection process. The other variation,
referred to as “Early Fusion 2”, performs feature selection
separately on each modality, and only the selected features are
concatenated prior to training the classifiers. A visual
representation of the various data fusion pipelines is shown in
Figure 3.</p>
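      <p>A minimal sketch of the two fusion strategies follows, assuming scikit-learn-style classifiers that expose predict_proba; the max-confidence vote mirrors the scheme described above.</p>
      <preformat>
import numpy as np

def early_fusion(posture_X, eda_X):
    """Early Fusion: concatenate the per-observation feature vectors."""
    return np.hstack([posture_X, eda_X])

def late_fusion_predict(posture_clf, eda_clf, posture_X, eda_X):
    """Late Fusion with a max-confidence vote: the class backed by the
    more confident of the two unimodal classifiers wins."""
    p1 = posture_clf.predict_proba(posture_X)  # columns: P(neg), P(pos)
    p2 = eda_clf.predict_proba(eda_X)
    stacked = np.stack([p1, p2])                 # shape (2, n_samples, 2)
    winner = stacked.max(axis=2).argmax(axis=0)  # more confident modality
    return np.where(winner == 0, p1.argmax(axis=1), p2.argmax(axis=1))
      </preformat>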
    </sec>
    <sec id="sec-9">
      <title>5. RESULTS AND DISCUSSION</title>
      <p>The classifiers were evaluated using 10-fold cross validation, with
the data split on a per-session basis to ensure that all data from
individual training sessions were kept in the same fold. The same
batches of data were maintained across all modeling approaches to
ensure fair comparisons across classifiers. The unimodal baseline
classifiers and Early Fusion pipelines were implemented using
RapidMiner 9.0. RapidMiner does not support decision-level
fusion, so the Late Fusion pipeline was implemented using Python
3.6, while the classifiers were still implemented in RapidMiner.
Unimodal classifiers were trained on the posture and EDA
modalities independently to provide a baseline for the multimodal
classifiers’ performance. The results for the posture- and
EDA-based unimodal classifiers for each affective state are shown in
Tables 1 and 2, respectively. Evaluation metrics include Cohen’s
Kappa, raw accuracy, and F1 Score. Particular focus is given to
Cohen’s Kappa due to its ability to account for the possibility of
correct classification due to random chance.</p>
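      <p>The session-grouped cross-validation can be sketched as follows, using scikit-learn’s GroupKFold as an approximation of the per-session fold assignment; the RBF SVM configuration mirrors Section 4.3 but is a stand-in for the RapidMiner implementation.</p>
      <preformat>
from sklearn.metrics import cohen_kappa_score
from sklearn.model_selection import GroupKFold
from sklearn.svm import SVC

def session_grouped_cv(X, y, session_ids, n_splits=10):
    """10-fold CV keeping all observations from a training session in
    the same fold, scored with Cohen's Kappa."""
    kappas = []
    for train, test in GroupKFold(n_splits).split(X, y, groups=session_ids):
        clf = SVC(kernel="rbf", tol=1e-3).fit(X[train], y[train])
        kappas.append(cohen_kappa_score(y[test], clf.predict(X[test])))
    return kappas
      </preformat>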
      <p>The posture-based SVM returned the highest Kappa for four of the
five affective states, and the EDA-based SVM outperformed the
ANN for three of the five affective states. The ANN model
performed poorly on a majority of the evaluations, returning a
negative Kappa on two of the posture-based states and four of the
five EDA-based states, indicating that the ANN is no better than a
random classifier for a majority of states.</p>
      <p>The posture classifiers performed relatively poorly on bored,
confused, and surprised. It is worth noting that surprised contains
the lowest number of positive instances within the dataset, which
may contribute to the poor performance. Additionally, it is possible
that postural behavior may not distinguishably change between
positive instances of bored and confused, leading to common
misclassifications across the two states. The EDA classifiers also
performed poorly on the affective states of bored, engaged, and
frustrated. However, the EDA modality contains significantly
fewer features than the posture modality, and this may have caused
additional misclassifications. It is also possible that the EDA
modality may not contain enough variance for the classifiers to
distinguish between positive and negative instances of affective
states. Additionally, the EDA classifiers face the task of
distinguishing between different changes in the EDA
measurements and determining whether such changes can be
attributed to a particular affective state or another cause. This
proves more difficult than for the posture modality due to the
single dimension of the EDA channel. To further illustrate
this issue, a graphical representation of the change in EDA
throughout a session is shown in Figure 4.</p>
      <p>[Figure 4. EDA (microsiemens, 0-8) plotted against time (seconds, 0-2000) over the course of a single session.]</p>
      <p>The SVM was selected as the classifier used to implement and
evaluate the data fusion methods discussed in Section 4.4. The
same feature selection algorithm and classifier configuration were
used as in the unimodal approach, and the same session-level
groupings were also maintained. The three different data fusion
approaches were evaluated for each affective state, and the results
for each state are shown in Table 3.</p>
      <p>Early Fusion 2 returned the highest Kappa for bored, engaged, and
frustrated. Early Fusion 1 returned the highest value for confused,
while the Q-Sensor baseline was the highest value for surprised.
One possible reason that Early Fusion 2 is the highest-performing
data fusion method is that feature selection is performed
separately on each modality prior to training each classifier. This
means that if each feature selection pass selects up to the k best
features, then the combined feature vector can contain up to 2k
features, twice as many as allowed by Early Fusion 1. This increase
in features may boost the performance of the classifier. Late Fusion
can also work with 2k features, but the features are split between
the two unimodal classifiers before decision-level fusion. Early
Fusion 2 also explores the correlations between various inter-modal
attributes more deeply compared to Early Fusion 1. The complex
relationships between various intra-modal features are explicitly
modeled in the feature selection performed on each independent
modality, while the correlations between the selected inter-modal
features are explored when training the primary classifier following
feature selection. However, these two stages are performed
simultaneously in Early Fusion 1 and certain complex relationships
may not be detected as a result.</p>
      <p>Late Fusion provides the ability to “correct” a possibly incorrect
prediction across the two modalities. For example, if the postural
classifier produces an incorrect prediction of TRUE with a
confidence level of 0.6, but the EDA classifier produces an accurate
prediction of FALSE with a confidence level of 0.8, then the EDA
modality overrides the incorrect prediction under our selected
voting scheme. However, Late Fusion was not the optimal fusion
method for any of the affective states, though its effectiveness as a
multimodal fusion technique has been demonstrated in other
affective computing tasks [14].</p>
      <p>Of note is the performance of the multimodal classifier on the
frustration dataset compared to the other affective states, as the
classifier achieved substantially higher Kappa scores. One possible
explanation for this behavior is that negative, high-arousal
emotions such as frustration or anger have been shown to occur
relatively infrequently in students engaged with computer-based
learning environments [13]. This suggests that the recorded
instances of frustration may contain more distinguishable features
compared to other common, low-arousal affective states such as
boredom and engagement, encouraging higher performance from the
frustration-based classifier. Additionally, frustration has been
demonstrated to elicit higher EDA levels [26], indicating that
the inclusion of the EDA modality with the posture modality
provides additional informative features to the feature vectors,
contributing to the relatively high performance of the classifier.</p>
      <p>Although the multimodal classifiers generally outperformed
unimodal classifiers, the highest-performing model returned a
relatively low Kappa compared to the performance of a human
BROMP labeler (~0.6). However, this threshold can vary
depending on the affective state and intervention associated with
each state. For example, identifying instances of engagement can
be viewed as a lower priority than identifying instances of
frustration or boredom, as the latter affective states often
necessitate a dynamic intervention to improve learning outcomes.
However, the Kappas for most of the classifiers fall below 0.05,
indicating significant difficulty for several classifiers in
achieving consistent performance across multiple affective states.</p>
      <p>Previous research efforts have demonstrated that the EDA modality
does not have a tightly coupled relationship with different affective
states when compared to other, higher-dimensionality modalities
such as facial expression and gesture [13]. The results of our work
likewise show that, for each of the five affective states, at least
one EDA-based classifier returned a negative Kappa. Possible
explanations for this behavior include an inadequate amount of
training data, a lack of variance or distinguishable trends across
the observed time windows, or a lack of useful features (17 EDA
features vs. 75 posture features). However, our results indicate
that the EDA modality does generally improve classifier performance
when used in conjunction with the posture modality.</p>
    </sec>
    <sec id="sec-10">
      <title>6. CONCLUSION</title>
      <p>In this paper, we demonstrate the effectiveness of a multimodal
affect detection system based on sensor data capturing a user’s
posture and EDA data while engaged with a game-based learning
environment. We show the improvement that multimodal
classifiers achieve compared with unimodal classifiers for both
modalities. We also demonstrate that SVMs outperform ANNs as
a unimodal classifier in this particular domain. Finally, we
demonstrate that data fusion is an effective way to combine
multiple modalities, either prior to or following unimodal
classification.</p>
      <p>Results suggest several promising directions for future work. To
improve model performance on smaller datasets or data containing
instances of missing modalities, more sophisticated feature
engineering approaches can be evaluated. The evaluation of our
data fusion techniques with additional modalities can further
indicate the effectiveness of this approach in a variety of
multimodal systems. Additional exploration of generalizable
multimodal systems should be undertaken to further utilize the
flexibility of sensor-based systems. Further evaluation of
classification algorithms can be investigated as well, in particular,
algorithms designed for the processing of temporal data such as
recurrent neural networks. Examining additional biosignal
modalities such as EEG or EMG data would provide a more
in-depth perspective on the effect such modalities have on multimodal
affect detection systems. Finally, the integration of multimodal
affect detection into a run-time learning environment would enable
adaptive pedagogical functionalities that address potentially
negative learning outcomes through the use of dynamic
interventions and user-tailored feedback based on learners’
affective states.</p>
    </sec>
    <sec id="sec-11">
      <title>7. ACKNOWLEDGMENTS</title>
      <p>We wish to thank Dr. Jeanine DeFalco, Dr. Benjamin Goldberg,
and Dr. Keith Brawner of the U.S. Army Combat Capabilities
Development Command, Dr. Mike Matthews and COL James Ness
of the U.S. Military Academy, Dr. Robert Sottilare of SoarTech,
and Dr. Ryan Baker of the University of Pennsylvania for their
assistance in facilitating this research. The research was supported
by the U.S. Army Research Laboratory under cooperative
agreement #W911NF-13-2-0008. Any opinions, findings, and
conclusions expressed in this paper are those of the authors and do
not necessarily reflect the views of the U.S. Army.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Arroyo</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cooper</surname>
            ,
            <given-names>D.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Burleson</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Woolf</surname>
            ,
            <given-names>B.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Muldner</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Christopherson</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <year>2009</year>
          .
          <article-title>Emotion sensors go to school</article-title>
          .
          <source>In Proceedings of the 14th International Conference on Artificial Intelligence In Education</source>
          (
          <year>2009</year>
          ),
          <fpage>17</fpage>
          -
          <lpage>24</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Baltrušaitis</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ahuja</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Morency</surname>
          </string-name>
          , L.-P.
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <source>IEEE Transactions on Pattern Analysis and Machine Intelligence</source>
          .
          <volume>41</volume>
          ,
          <issue>2</issue>
          (
          <year>2018</year>
          ),
          <fpage>423</fpage>
          -
          <lpage>443</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          DOI:https://doi.org/10.1109/TPAMI.
          <year>2018</year>
          .
          <volume>2798607</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          2017.
          <article-title>A bootstrapped multi-view weighted kernel fusion framework for cross-corpus integration of multimodal emotion recognition</article-title>
          .
          <source>In 2017 Seventh International Conference on Affective Computing and Intelligent Interaction (ACII)</source>
          (
          <year>2017</year>
          ),
          <fpage>377</fpage>
          -
          <lpage>382</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>Cohen</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <year>1960</year>
          .
          <article-title>A coefficient of agreement for nominal scales</article-title>
          .
          <source>Educational and psychological measurement. 20</source>
          ,
          <issue>1</issue>
          (
          <year>1960</year>
          ),
          <fpage>37</fpage>
          -
          <lpage>46</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <surname>Craig</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Graesser</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sullins</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Gholson</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <article-title>Affect and learning: An exploratory look into the role of affect in learning with AutoTutor</article-title>
          .
          <source>Journal of Educational Media</source>
          .
          <volume>29</volume>
          ,
          <issue>3</issue>
          (
          <year>2005</year>
          ),
          <fpage>241</fpage>
          -
          <lpage>250</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>DOI:https://doi.org/10.1080/1358165042000283101.</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <surname>D'Mello</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Kory</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <year>2012</year>
          .
          <article-title>Consistent but modest: A meta-analysis on unimodal and multimodal affect detection accuracies from 30 studies</article-title>
          .
          <source>Proceedings of the 14th ACM international conference on Multimodal interaction - ICMI '12</source>
          . (
          <year>2012</year>
          ),
          <fpage>31</fpage>
          -
          <lpage>38</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>DOI:https://doi.org/10.1145/2388676.2388686.</mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <surname>DeFalco</surname>
            ,
            <given-names>J.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rowe</surname>
            ,
            <given-names>J.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paquette</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Georgoulas-Sherry</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brawner</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mott</surname>
            ,
            <given-names>B.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baker</surname>
            ,
            <given-names>R.S.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Lester</surname>
            ,
            <given-names>J.C.</given-names>
          </string-name>
          <year>2018</year>
          .
          <article-title>Detecting and addressing frustration in a serious game for military training</article-title>
          .
          <source>International Journal of Artificial Intelligence in Education</source>
          .
          <volume>28</volume>
          ,
          <issue>2</issue>
          (
          <year>2018</year>
          ),
          <fpage>152</fpage>
          -
          <lpage>193</lpage>
          . DOI:https://doi.org/10.1007/s40593-017-0152-1.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <surname>Girardi</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lanubile</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Novielli</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          <year>2017</year>
          .
          <article-title>Emotion detection using noninvasive low cost sensors</article-title>
          .
          <source>In 2017 Seventh International Conference on Affective Computing and Intelligent Interaction</source>
          (
          <year>2017</year>
          ),
          <fpage>125</fpage>
          -
          <lpage>130</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <surname>Grafsgaard</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Boyer</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wiebe</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Lester</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <article-title>Analyzing posture and affect in task-oriented tutoring</article-title>
          .
          <source>In International Conference of the Florida Artificial Intelligence Research Society</source>
          (
          <year>2012</year>
          ),
          <fpage>438</fpage>
          -
          <lpage>443</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <surname>and Lester</surname>
            ,
            <given-names>J.C.</given-names>
          </string-name>
          <year>2014</year>
          .
          <article-title>Predicting learning and affect from multimodal data streams in task-oriented tutorial dialogue</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <source>In Proceedings of the Seventh International Conference on Educational Data Mining (London, UK</source>
          ,
          <year>2014</year>
          ),
          <fpage>122</fpage>
          -
          <lpage>129</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <surname>Grafsgaard</surname>
            ,
            <given-names>J.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wiggins</surname>
            ,
            <given-names>J.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vail</surname>
            ,
            <given-names>A.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Boyer</surname>
            ,
            <given-names>K.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wiebe</surname>
            ,
            <given-names>E.N.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Lester</surname>
            ,
            <given-names>J.C.</given-names>
          </string-name>
          <year>2014</year>
          .
          <article-title>The additive value of multimodal features for predicting engagement, frustration, and learning during tutoring</article-title>
          .
          <source>In Proceedings of the Sixteenth ACM International Conference on Multimodal Interaction</source>
          (
          <year>2014</year>
          ),
          <fpage>42</fpage>
          -
          <lpage>49</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <string-name>
            <surname>Harley</surname>
            ,
            <given-names>J.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bouchet</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Azevedo</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <year>2013</year>
          .
          <article-title>Aligning and comparing data on emotions experienced during learning with MetaTutor</article-title>
          .
          <source>In International Conference on Artificial Intelligence in Education</source>
          (
          <year>2013</year>
          ),
          <fpage>61</fpage>
          -
          <lpage>70</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <string-name>
            <surname>Harley</surname>
            ,
            <given-names>J.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bouchet</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hussain</surname>
            ,
            <given-names>M.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Azevedo</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Calvo</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <year>2015</year>
          .
          <article-title>A multi-componential analysis of emotions during complex learning with an intelligent multi-agent system</article-title>
          .
          <source>Computers in Human Behavior</source>
          .
          <volume>48</volume>
          ,
          <string-name>
            <surname>May</surname>
          </string-name>
          (
          <year>2015</year>
          ),
          <fpage>615</fpage>
          -
          <lpage>625</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          DOI:https://doi.org/10.1016/j.chb.
          <year>2015</year>
          .
          <volume>02</volume>
          .013.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          <string-name>
            <surname>Henderson</surname>
            ,
            <given-names>N.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rowe</surname>
            ,
            <given-names>J.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mott</surname>
            ,
            <given-names>B.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brawner</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baker</surname>
            ,
            <given-names>R.S.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Lester</surname>
            ,
            <given-names>J.C.</given-names>
          </string-name>
          <year>2019</year>
          .
          <article-title>4D Affect Detection: Improving Frustration Detection in Game-Based Learning with Posture-Based Temporal Data Fusion</article-title>
          .
          <source>In Proceedings of The 20th International Conference on Artificial Intelligence in Education (in press)</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          <string-name>
            <surname>Kalimeri</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Saitis</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <year>2016</year>
          .
          <article-title>Exploring multimodal biosignal features for stress detection during indoor mobility</article-title>
          .
          <source>In Proceedings of the 18th ACM International Conference on Multimodal Interaction</source>
          (
          <year>2016</year>
          ),
          <fpage>53</fpage>
          -
          <lpage>60</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          2006.
          <article-title>Yale: Rapid prototyping for complex data mining tasks</article-title>
          .
          <source>In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining</source>
          (
          <year>2006</year>
          ),
          <fpage>935</fpage>
          -
          <lpage>940</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          <string-name>
            <surname>Muller</surname>
            ,
            <given-names>P.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Amin</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Verma</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Andriluka</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Bulling</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <year>2015</year>
          .
          <article-title>Emotion recognition from embedded bodily expressions and speech during dyadic interactions</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          <source>2015 International Conference on Affective Computing and Intelligent Interaction</source>
          ,
          <string-name>
            <surname>ACII</surname>
          </string-name>
          <year>2015</year>
          .
          <article-title>(</article-title>
          <year>2015</year>
          ),
          <fpage>663</fpage>
          -
          <lpage>669</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          DOI:https://doi.org/10.1109/ACII.
          <year>2015</year>
          .
          <volume>7344640</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          <string-name>
            <surname>Nazari</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lucas</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Gratch</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <year>2015</year>
          .
          <article-title>Multimodal approach for automatic recognition of machiavellianism.</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          <source>2015 International Conference on Affective Computing and Intelligent Interaction</source>
          ,
          <string-name>
            <surname>ACII</surname>
          </string-name>
          <year>2015</year>
          .
          <article-title>(</article-title>
          <year>2015</year>
          ),
          <fpage>215</fpage>
          -
          <lpage>221</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          DOI:https://doi.org/10.1109/ACII.
          <year>2015</year>
          .
          <volume>7344574</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          <string-name>
            <surname>Ocumpaugh</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baker</surname>
            ,
            <given-names>R.S.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Rodrigo</surname>
            ,
            <given-names>M.T.</given-names>
          </string-name>
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          <string-name>
            <given-names>Baker</given-names>
            <surname>Rodrigo Ocumpaugh Monitoring</surname>
          </string-name>
          <article-title>Protocol (BROMP) 2.0 Technical and</article-title>
          Training Manual.
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          <string-name>
            <surname>Pardos</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baker</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pedro</surname>
            ,
            <given-names>M.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gowda</surname>
            ,
            <given-names>S.M.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Gowda</surname>
            ,
            <given-names>S.M.</given-names>
          </string-name>
          <year>2014</year>
          .
          <article-title>Affective states and state tests: investigating how affect and engagement during the school year predict end-of-year learning outcomes.</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          <source>Journal of Learning Analytics. 1</source>
          ,
          <issue>1</issue>
          (
          <year>2014</year>
          ),
          <fpage>107</fpage>
          -
          <lpage>128</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>DOI:https://doi.org/10.1145/2460296.2460320.</mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          <string-name>
            <surname>Patwardhan</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Knapp</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          <year>2017</year>
          .
          <article-title>Aggressive actions and anger detection from multiple modalities using Kinect</article-title>
          .
          <source>CoRR</source>
          . (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          <string-name>
            <surname>Patwardhan</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Knapp</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          <year>2016</year>
          .
          <article-title>Multimodal affect recognition using Kinect</article-title>
          .
          <source>arXiv preprint arXiv:1607.02652</source>
          . (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          <string-name>
            <surname>Pei</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jiang</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Sahli</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          <year>2015</year>
          .
          <article-title>Multimodal dimensional affect recognition using deep bidirectional long short-term memory recurrent neural networks</article-title>
          .
          <source>In Proceedings of the International Conference on Affective Computing and Intelligent Interaction (ACII)</source>
          (
          <year>2015</year>
          ),
          <fpage>208</fpage>
          -
          <lpage>214</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          <string-name>
            <surname>Rahman</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Gavrilova</surname>
            ,
            <given-names>M.L.</given-names>
          </string-name>
          <year>2017</year>
          .
          <article-title>Emerging EEG and Kinect face fusion for biometric identification</article-title>
          .
          <source>In Proceedings of the IEEE Symposium Series on Computational Intelligence (SSCI)</source>
          (
          <year>2017</year>
          ),
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          <string-name>
            <surname>Rajendran</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Carter</surname>
            ,
            <given-names>K.E.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Levin</surname>
            ,
            <given-names>D.T.</given-names>
          </string-name>
          <year>2018</year>
          .
          <article-title>Predicting Learning by Analyzing Eye-Gaze Data of Reading Behavior</article-title>
          .
          <source>International Educational Data Mining Society</source>
          . (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref42">
        <mixed-citation>
          <string-name>
            <surname>Ramachandran</surname>
            ,
            <given-names>B.R.N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pinto</surname>
            ,
            <given-names>S.A.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Born</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Winkler</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Ratnam</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <year>2017</year>
          .
          <article-title>Measuring neural, physiological and behavioral effects of frustration</article-title>
          .
          <source>In Proceedings of the 16th International Conference on Biomedical Engineering</source>
          (
          <year>2017</year>
          ),
          <fpage>43</fpage>
          -
          <lpage>46</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref43">
        <mixed-citation>
          <string-name>
            <surname>Soleymani</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pantic</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Pun</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          <year>2012</year>
          .
          <article-title>Multimodal emotion recognition in response to videos</article-title>
          .
          <source>IEEE Transactions on Affective Computing</source>
          .
          <volume>3</volume>
          ,
          <issue>2</issue>
          (
          <year>2012</year>
          ),
          <fpage>211</fpage>
          -
          <lpage>223</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref44">
        <mixed-citation>
          <string-name>
            <surname>Sottilare</surname>
            ,
            <given-names>R.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baker</surname>
            ,
            <given-names>R.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Graesser</surname>
            ,
            <given-names>A.C.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Lester</surname>
            ,
            <given-names>J.C.</given-names>
          </string-name>
          <year>2018</year>
          .
          <article-title>Special Issue on the Generalized Intelligent Framework for Tutoring (GIFT): Creating a stable and flexible platform for innovations in AIED Research</article-title>
          .
          <source>International Journal of Artificial Intelligence in Education</source>
          .
          <volume>28</volume>
          ,
          <issue>2</issue>
          (
          <year>2018</year>
          ),
          <fpage>139</fpage>
          -
          <lpage>151</lpage>
          .
          DOI:https://doi.org/10.1007/s40593-017-0149-9.
        </mixed-citation>
      </ref>
      <ref id="ref47">
        <mixed-citation>
          <string-name>
            <surname>Vail</surname>
            ,
            <given-names>A.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wiggins</surname>
            ,
            <given-names>J.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grafsgaard</surname>
            ,
            <given-names>J.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Boyer</surname>
            ,
            <given-names>K.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wiebe</surname>
            ,
            <given-names>E.N.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Lester</surname>
            ,
            <given-names>J.C.</given-names>
          </string-name>
          <year>2016</year>
          .
          <article-title>The Affective Impact of Tutor Questions: Predicting Frustration and Engagement</article-title>
          .
          <source>International Educational Data Mining Society</source>
          . (
          <year>2016</year>
          ),
          <fpage>247</fpage>
          -
          <lpage>254</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref48">
        <mixed-citation>DOI:https://doi.org/10.1145/1235.</mixed-citation>
      </ref>
      <ref id="ref49">
        <mixed-citation>
          <string-name>
            <surname>Zeiler</surname>
            ,
            <given-names>M.D.</given-names>
          </string-name>
          <year>2012</year>
          .
          <article-title>ADADELTA: An adaptive learning rate method</article-title>
          .
          <source>arXiv preprint arXiv:1212.5701</source>
          . (
          <year>2012</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref50">
        <mixed-citation>DOI:https://doi.org/10.1145/1830483.1830503.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>