<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Yuxin Zhi</string-name>
          <email>yuxin.zhi@mail.utoronto.ca</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bilal Taha</string-name>
          <email>bilal.taha@mail.utoronto.ca</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dimitrios Hatzinakos</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Toronto</institution>
          ,
          <addr-line>ON</addr-line>
          ,
          <country country="CA">Canada</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This study explores pupillometry as a modality for affect state recognition. It examines the propensity for bias in both feature-based and learning-based machine learning models that interpret affect through pupil responses. Our research lies at the intersection of affective computing and mental health, recognizing the paramount importance of accurately identifying affect states for effective mental health interventions. We rigorously evaluate the performance of these pupillometry-based models across diverse demographic groups, including variables such as ethnicity, gender, age, vision problems, and iris color. Our findings reveal notable disparities, particularly in gender and ethnicity. Bias levels are pronounced in both feature-based and learning-based models, with F1 score differentials reaching up to 36.28%. Additionally, our analysis uncovers a slight bias related to iris color, further impacting the efficacy of affect state recognition models that rely on pupil responses. This underscores the critical need for fairness and accuracy in developing machine learning models within affective computing. By highlighting these areas of potential bias, our study contributes to the broader discourse on creating equitable AI systems and advancing mental health care, education, and social robotics. It emphasizes the ethical imperative of developing unbiased, inclusive technologies in healthcare systems.</p>
      </abstract>
      <kwd-group>
        <kwd>Pupillometry</kwd>
        <kwd>Affect state recognition</kwd>
        <kwd>Mental health interventions</kwd>
        <kwd>Bias</kwd>
        <kwd>Fairness</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>CEUR</p>
      <p>ceur-ws.org
Biometrics⋆</p>
    </sec>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <sec id="sec-2-1">
        <title>In the emerging field of afective computing and men</title>
        <p>
          tal health, the intricate relationship between afect state
recognition and cognitive and mental health outcomes
Afect state recognition, central to understanding and
managing various mental health disorders, encompasses
the complex process of identifying and interpreting
emotional states [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. This process is crucial in disorders
such as depression and anxiety, where impairments in
emotional awareness and regulation are prevalent [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ].
nition software and mood-predicting algorithms, has
opened new avenues in the monitoring and treatment of
mental health conditions [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]. However, this brings forth
the challenge of bias in machine learning models [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ].
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>The accuracy and reliability of these models in afect misinterpretations, potentially worsening mental health conditions or leading to inappropriate treatment methodologies.</title>
      </sec>
      <sec id="sec-2-3">
        <title>Pupil response has been employed in diverse stud</title>
        <p>
          ies within psychiatry and psychology, particularly in
assessing cognitive load for memory-based tasks [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ].
pies, including cognitive-behavioral therapy (CBT) and
The advancement of cognitive and mental health thera- It has also been utilized in analyzing the emotional
impact of stimuli on individuals [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]. One investigation
mindfulness-based strategies, hinges on the nuanced un- focused on the confounding efects of eye blinking in
derstanding and regulation of afect states [
          <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
          ]. These
emotional states profoundly influence core cognitive
pupillometry and proposed remedies [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]. Additionally,
the utility of pupillometry in psychiatry was reviewed,
making [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. This underscores the importance of afect
processes, including attention, memory, and decision- highlighting its role in understanding patients’
information processing styles, predicting treatment outcomes,
state recognition in therapeutic interventions. Further- and examining cognitive functions [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]. A separate study
        </p>
        <p>
          The advent of technological solutions, such as recog- lometry in diagnosing nonconvulsive status epilepticus
Machine Learning for Cognitive and Mental Health Workshop ical responses such as pupillometry are generally
conmore, the predictive nature of afect state recognition in
mental health conditions paves the way for early and
more efective intervention strategies [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ].
(ML4CMH), AAAI 2024, Vancouver, BC, Canada
∗Corresponding author.
nEvelop-O
        </p>
        <p>
          employed pupillometry to assess atypical pupillary light
reflexes and the LC-NE system in Autism Spectrum
Disorder (ASD)[
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]. The potential clinical use of
pupil(NCSE) has also been explored[
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]. Although
physiologsidered less biased than other modalities, hidden biases
can emerge from factors like stimuli selection and
demographic influences [
          <xref ref-type="bibr" rid="ref17 ref18">17, 18</xref>
          ]. For instance, responses
to visual stimuli may vary significantly across diferent
cultural backgrounds, orientations, and age groups.
        </p>
        <p>This work aims to investigate the bias that exists in Several features can be extracted from the pupil
reafect state recognition models based on physiological sponse, including mean and variance of the pupil
resignals, specifically pupillometry, which plays a signifi- sponse, maximum dilation, minimum contraction,
dilacant role in understanding cognitive and mental health tion speed, dilation duration, contraction duration, and
applications. The main goal of this study is to shed light the diference between dilation and contraction. In total,
on the potential bias that may exist in common learn- 30 features were manually extracted and used to train
ing methods. The structure of the paper is as follows: a kernel SVM classifier. Diferent kernels were tested,
ifrst, the methodology, which includes preprocessing and and the Gaussian Kernel showed the best performance
learning models, is explained. Then, the experiments in general.
and results are presented and validated using a dataset
collected for this work. Finally, we discuss the findings 2.3. Learned-Based Model
and conclude at the end.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>2. Methodology</title>
      <p>The framework focuses on the use of pupillary responses
and approaches the task of affect state recognition as
a binary classification problem based on the targeted
group. The first step is preprocessing the pupillometry
data to mitigate the effect of noisy samples. Then, the
data is used to develop the classification model either
from handcrafted features specific to the pupillometry
data or using a learned model. Finally, model training
and testing are described at the end to investigate the
different cases.</p>
      <sec id="sec-3-1">
        <title>2.1. Preprocessing</title>
        <p>
          The initial processing of the pupillometry data is
paramount to remove any irrelevant and noisy samples
that may impact pupil size analysis. The raw data can
be contaminated with various outliers like system errors,
blinks, eye-tracker glitches, and eyelid occlusion, which
can be identified and eliminated during this stage.
Previous studies [
          <xref ref-type="bibr" rid="ref19 ref20">19, 20</xref>
          ] have proposed a robust method
for detecting such invalid samples, which we have used
in our study. The method uses dilation speed as a
metric to determine whether a data point is an outlier. If
a data sample exhibits a dilation speed greater than a
pre-defined threshold, it is removed as an anomaly. After
that, to ensure the continuity of the data, the filtered data
is modeled using a Gaussian process.
        </p>
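        <p>A minimal sketch of this filtering step, assuming a median-absolute-deviation (MAD) threshold on dilation speed in the spirit of the cited preprocessing guidelines; the function name and the threshold constant are illustrative rather than the exact values used in this work:</p>
        <preformat>
# Assumed implementation sketch of dilation-speed outlier removal.
import numpy as np

def remove_invalid_samples(t, pupil, n_mad=3.0):
    """Drop samples whose dilation speed exceeds a robust threshold."""
    t = np.asarray(t, dtype=float)
    p = np.asarray(pupil, dtype=float)
    step_speed = np.abs(np.diff(p) / np.diff(t))     # speed between neighbouring samples
    speed = np.zeros_like(p)
    speed[1:] = step_speed                           # speed w.r.t. previous sample
    speed[:-1] = np.maximum(speed[:-1], step_speed)  # speed w.r.t. next sample
    mad = np.median(np.abs(speed - np.median(speed)))
    threshold = np.median(speed) + n_mad * mad       # the pre-defined threshold
    keep = np.less_equal(speed, threshold)
    # The retained samples are then smoothed/interpolated (e.g., with a
    # Gaussian process) to restore a continuous signal.
    return t[keep], p[keep]
        </preformat>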
      </sec>
      <sec id="sec-3-2">
        <title>2.2. Feature-Based Models</title>
        <p>The feature-based method is a common approach in
machine learning where specific features are extracted from
the data and used to train the algorithm. In this study,
the pupil responses for each participant were divided into
150 sets of sequences, with each sequence corresponding
to the pupil response for each image. Each sequence has
a length of 300 samples, which were used to extract the
features.</p>
        <p>Several features can be extracted from the pupil
response, including mean and variance of the pupil
response, maximum dilation, minimum contraction,
dilation speed, dilation duration, contraction duration, and
the difference between dilation and contraction. In total,
30 features were manually extracted and used to train
a kernel SVM classifier. Different kernels were tested,
and the Gaussian kernel showed the best performance
in general.</p>
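        <p>The following is an illustrative sketch of this pipeline, computing a small subset of the handcrafted features listed above (the full study uses 30) and feeding them to an SVM with a Gaussian (RBF) kernel; the specific feature expressions and names are assumptions for demonstration:</p>
        <preformat>
# Illustrative feature extraction and kernel-SVM training sketch.
import numpy as np
from sklearn.svm import SVC

def extract_features(seq):
    """seq: 1-D array of 300 preprocessed pupil-size samples for one image."""
    diff = np.diff(seq)
    return np.array([
        seq.mean(),                    # mean pupil size
        seq.var(),                     # variance of pupil size
        seq.max(),                     # maximum dilation
        seq.min(),                     # minimum contraction
        np.abs(diff).max(),            # peak dilation speed
        np.sum(np.greater(diff, 0)),   # dilation duration (samples)
        np.sum(np.less(diff, 0)),      # contraction duration (samples)
        seq.max() - seq.min(),         # dilation-contraction difference
    ])

# X_seqs: (n_sequences, 300) pupil responses; y: binary affect labels
# X = np.stack([extract_features(s) for s in X_seqs])
# clf = SVC(kernel="rbf").fit(X, y)   # the Gaussian kernel performed best
        </preformat>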
      </sec>
      <sec id="sec-3-2b">
        <title>2.3. Learned-Based Model</title>
        <p>
          The long short-term memory (LSTM) [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ] model is
commonly used in machine learning for modeling sequential
data. In this approach, the LSTM model has been
implemented as seen in Figure 1 with 128 LSTM units, a
dropout rate of 0.5, a 128-unit dense layer, and a rectified
linear unit (ReLU) activation function. Finally, a dense
layer at the end is added with a SoftMax function to
produce the classification output. The cross-entropy loss
function and RMSprop optimizer are used for training
the model.
        </p>
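        <p>A minimal Keras sketch consistent with the architecture described above (128 LSTM units, a dropout rate of 0.5, a 128-unit ReLU dense layer, a softmax output, cross-entropy loss, and the RMSprop optimizer); any hyperparameter not stated in the text is an assumption:</p>
        <preformat>
# Assumed Keras sketch of the LSTM classifier described in the text.
import tensorflow as tf

def build_lstm_classifier(seq_len=300, n_classes=2):
    model = tf.keras.Sequential([
        tf.keras.layers.LSTM(128, input_shape=(seq_len, 1)),  # one pupil-size value per step
        tf.keras.layers.Dropout(0.5),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer="rmsprop",
                  loss="sparse_categorical_crossentropy",  # cross-entropy on integer labels
                  metrics=["accuracy"])
    return model
        </preformat>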
        <p>The use of deep learning methods such as LSTM for
feature learning and affect state recognition is effective
in various machine learning tasks. This approach can
improve the performance of the model, as it can capture
temporal dependencies and relationships in the data that
might be missed by manual feature extraction.</p>
      </sec>
      <sec id="sec-3-2c">
        <title>2.4. Model Training and Testing</title>
        <p>In both approaches, feature-based and learned-based, we
divided the data into training and testing datasets,
allocating 80% to training and 20% to testing, respectively. While
constructing the model, we utilized data from all
demographic groups with the intention of creating a model that
captures feature representations from all these groups.
To assess the model’s fairness and prevent bias towards
any particular group, we further divided the testing data
into subgroups during the evaluation phase and assessed
the model’s performance for each subgroup.</p>
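        <p>As a sketch of this evaluation protocol, assuming per-sample subgroup labels are available for the test split; names are illustrative:</p>
        <preformat>
# Sketch: a single 80/20 split, then metrics reported per demographic subgroup.
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

def evaluate_per_group(y_true, y_pred, group_labels):
    """group_labels: subgroup label (e.g., gender) for each test sample."""
    results = {}
    for g in np.unique(group_labels):
        mask = group_labels == g
        results[g] = {
            "accuracy": accuracy_score(y_true[mask], y_pred[mask]),
            "f1": f1_score(y_true[mask], y_pred[mask]),
        }
    return results
        </preformat>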
        <p>Due to the limited number of samples, we introduced
augmentation to enhance the training data. This
augmentation was applied later in the evaluation, allowing us to
assess its impact on the results. The pupil data sequences
were augmented using noise injection and time-shifting
methods [<xref ref-type="bibr" rid="ref22">22</xref>]. Specifically, we added white noise to the
original pupil data and performed 50-sample shifts.
Importantly, the augmentation was applied to samples from
the non-dominant group to ensure that our findings were
not influenced by this imbalance.</p>
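        <p>A short sketch of the two augmentation operations, assuming additive white Gaussian noise and a circular 50-sample shift; the noise level is an assumed value, while the shift size follows the text:</p>
        <preformat>
# Sketch of noise-injection and time-shifting augmentation for pupil sequences.
import numpy as np

def augment_sequence(seq, shift=50, noise_std=0.01, rng=None):
    rng = rng if rng is not None else np.random.default_rng(0)
    noisy = seq + rng.normal(0.0, noise_std, size=seq.shape)  # white-noise injection
    shifted = np.roll(seq, shift)                             # 50-sample time shift
    return noisy, shifted
        </preformat>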
      </sec>
      <sec id="sec-3-3">
        <title>3.2. Experimental Protocol</title>
        <p>The proposed system was evaluated using a dataset
collected at the University of Toronto. In the experiment,
participants viewed a series of visual stimuli intended to
3. Experiments and Results elicit emotions spanning diferent valence and arousal
values. The visual stimuli were selected from the
InternaBias can be seen as the disparity in performance
mettional Afective Picture System (IAPS) dataset [ 23]. The
rics across diferent groups for a given task. Assuming
IAPS database provides normative ratings of emotional
we have  = { 1,  2, … ,   } be the set of groups for bias valence and arousal for a large set of images. The rating
investigation. For each group   , we compute the per- scales are based on the Self-Assessment Manikin (SAM),
formance metrics of a recognition model  (  ). Then, a 9-point rating scale where a score of 9 represents a
tahbesobliuatse diifesriednecnetiifiendtfhoerirampeatirriocsf:groups (  ,   ) as the high rating (i.e., high pleasure, high arousal), a score of 5
indicates a neutral rating, and a rating of 1 represents a
(  ,   ) = | (  ) −  (  )| low rating (i.e., low pleasure, low arousal).</p>
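      <p>A small sketch of this pairwise bias measure, computed from the per-group metric values produced during evaluation; function and variable names are illustrative:</p>
      <preformat>
# Sketch of B(g_i, g_j) = |M(g_i) - M(g_j)| over all pairs of groups.
from itertools import combinations

def pairwise_bias(metric_per_group):
    """metric_per_group: dict mapping group name to a metric value (e.g., F1)."""
    return {
        (g_i, g_j): abs(m_i - m_j)
        for (g_i, m_i), (g_j, m_j) in combinations(metric_per_group.items(), 2)
    }

# Example: pairwise_bias({"male": 0.81, "female": 0.63})
# yields about 0.18 for the pair ("male", "female").
      </preformat>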
      <sec id="sec-3b-1">
        <title>3.1. Data</title>
        <p>To conduct a thorough assessment of bias in pupillometry
affect state recognition, we collected a dataset that
encompasses pupillometry data in response to visual stimuli,
taking into account a diverse range of demographics.
The study involved 35 university students aged between
18 and 40 years, with a mean age of 24.6 and a standard
deviation of 5.17. Participants were required to have
no history of vision disorders, and they were also asked
about any medications they might be taking that could
affect their responses, such as depression medication. The
data collected from the participants is categorized into
different cases based on various demographic factors:
• Gender: This case examines the algorithm’s ability
to fairly recognize emotional states in females
versus males.
• Ethnic Group: This case assesses the model’s
ability to impartially detect emotional states
based on participants’ ethnic groups, including
Asian (Chinese), White (North American or
European), Black (African American or Caribbean),
and South Asian (Pakistani or Indian) [<xref ref-type="bibr" rid="ref6">6</xref>].
• Age: This case explores the impact of age on the
model’s ability to detect emotional states,
considering age groups 17-24 versus 25-55.
• Iris Color: The eye color case is a unique factor
relevant to models using pupillometry data for
their applications. Eye color affects the precision
of detecting pupils and measuring their dilation
and contraction. Thus, we categorize the data
into light (light brown, green, blue, hazel) versus
dark (black, brown, dark brown) iris colors.
• Vision: This case evaluates the model’s
effectiveness in capturing emotional states in data from
individuals wearing glasses versus those not
wearing glasses.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.2. Experimental Protocol</title>
        <p>The proposed system was evaluated using a dataset
collected at the University of Toronto. In the experiment,
participants viewed a series of visual stimuli intended to
elicit emotions spanning different valence and arousal
values. The visual stimuli were selected from the
International Affective Picture System (IAPS) dataset [<xref ref-type="bibr" rid="ref23">23</xref>]. The
IAPS database provides normative ratings of emotional
valence and arousal for a large set of images. The rating
scales are based on the Self-Assessment Manikin (SAM),
a 9-point rating scale where a score of 9 represents a
high rating (i.e., high pleasure, high arousal), a score of 5
indicates a neutral rating, and a rating of 1 represents a
low rating (i.e., low pleasure, low arousal).</p>
        <p>The selected visual stimuli elicit the emotions of
interest, which include the two quadrants of the VA
dimensional model (i.e., HA, LA, or HV, LV). Each of the
aforementioned emotional states is achieved by
displaying 30 images of the same emotional target for 5 seconds
each. The images were selected to statistically produce
the same response for different groups of people. All
images were presented on a screen with a resolution of
1920 by 1080 pixels. Following the recommendations of
the device manufacturers, the Gazepoint eye-tracking
system was placed approximately 45 cm in front of the
participant at an angle of around 30 degrees. The total
number of participants was 35.</p>
        <p>The data collection process was approved by the
research ethics committee at the University of Toronto. All
participants signed a consent form that clearly explained
the data collection procedure and the privacy of their data.
Furthermore, all participants received compensation in
the form of a gift card.</p>
      </sec>
      <sec id="sec-3-3b">
        <title>3.3. Metrics</title>
        <p>In the evaluation process, two common metrics were
employed: accuracy and F1 score. Accuracy gauges the
proportion of correct predictions made by the algorithm.
The F1 score, on the other hand, assesses the balance
between precision and recall. It offers a more nuanced
evaluation of the algorithm’s performance, especially
when dealing with imbalanced datasets.</p>
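        <p>For reference, the standard definitions of the two metrics in terms of true/false positives and negatives (TP, TN, FP, FN); these are the usual textbook formulas rather than anything specific to this study:</p>
        <preformat>
Accuracy  = (TP + TN) / (TP + TN + FP + FN)
Precision = TP / (TP + FP)
Recall    = TP / (TP + FN)
F1        = 2 * Precision * Recall / (Precision + Recall)
        </preformat>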
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Results from the Feature-Based</title>
      </sec>
      <sec id="sec-3-5">
        <title>Model</title>
        <p>We employed a feature-based algorithm for emotion
recognition and assessed the presence of bias among
different demographic groups, focusing on valence-based
and arousal-based classifications. Our evaluation yielded
results presented in Tables 1 and 2, along with Figures 2
and 3.</p>
        <p>Notably, our findings reveal significant performance
differences between males and females in both arousal
and valence. Specifically, our analysis indicated that
males scored 20.28% higher in arousal and 17.46% higher
in valence compared to females. The F1 score exhibited
a similar gender-based pattern of differences.</p>
        <p>Further examination of the model based on ethnicity
factors showed significant variations in accuracy and
F1 scores across different groups. Notably, the Asian
group, despite having the highest number of samples,
displayed the lowest accuracy and F1 scores in terms of
arousal classification. In contrast, the South Asian group,
with the second-lowest number of samples, demonstrated
the highest performance. The percentage difference
between the highest-performing group (South Asian) and
the lowest-performing group (Asian) was 28.93% in
accuracy and 21% in F1 score for arousal classification. These
findings suggest that obtaining accurate feature
representations for the Asian group in terms of arousal
classification may be more challenging based on the provided
stimuli.</p>
        <p>Regarding valence classification, our analysis revealed
similar performance among the Asian, White, and South
Asian groups, while the Black group exhibited
significantly lower accuracy and F1 scores. Specifically, the
percentage difference between the Black group and the
group with the highest performance was 26.99% in
accuracy and 46.71% in F1 score.</p>
      </sec>
      <sec id="sec-3-6">
        <title>3.5. Results from the LSTM Model</title>
        <p>We employed an LSTM-based approach to investigate
bias across different demographic groups. The results
revealed a gender bias in arousal accuracy and a 10.15%
bias toward males in valence accuracy. Concerning ethnic
groups, accuracy exhibited substantial variations across
different ethnicities, as depicted in Tables 3 and 4. In
terms of arousal, the Asian group had the lowest
performance, while the White group achieved the highest
accuracy, resulting in a significant 20.90% advantage
favoring the White group. The other ethnic groups showed
similar performance. In terms of valence, the Black group
displayed the lowest performance, while the South Asian
group achieved the highest, with a difference of 36.28%.
In the remaining cases, there were no significant differences
between individual groups, suggesting that these factors
share common representations that can be captured by the
algorithms.</p>
      </sec>
      <sec id="sec-3-7">
        <title>3.6. Bias and Fairness</title>
        <p>Based on the results presented above, it is evident that
both models exhibit significant differences in accuracy
and F1 scores concerning ethnic groups and gender. This
indicates that these two factors play a pivotal role in
the development of affect recognition from pupillometry
data, as the models struggled to find effective
representations for them. In contrast, the other four cases displayed
minor differences in terms of accuracy and F1 scores,
suggesting that these factors share common representations
across all groups and do not adversely affect the data’s
quality. For example, iris color had some impact on
recognition performance, though not as pronounced as
that of gender and ethnic group.</p>
        <p>Despite the dataset including diverse groups during
model training, the quality of the representations failed
to adequately capture the diverse group responses within
the studied population. We acknowledge that the
unbalanced number of samples in each group might contribute
to the bias observed in the results. To address this
potential issue, we implemented data augmentation techniques
(see 2.4) for the non-dominant groups (groups with fewer
samples) to increase their sample size. Subsequently,
we followed the same procedure as in the original case.
However, our results demonstrated that even with the
implementation of data augmentation, the performance
did not change significantly. The bias in performance
persisted in both the ethnic groups and gender-based cases,
while the remaining cases exhibited similar performance.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusion</title>
      <p>In this study, we investigated the performance of
feature-based and learned-based affect recognition models across
various group factors, including ethnicity, gender, vision,
iris color, and age, focusing on pupillometry as the
modality. Our research, involving a dataset from 35
diverse participants, revealed significant gender and ethnic
biases in standard affect recognition algorithms,
impacting both arousal and valence-based classifications. We
also identified minor biases related to other factors, such
as iris color. These findings emphasize the potential bias
in affect recognition systems, highlighting the need for
more inclusive and representative training data, rigorous
fairness evaluation, and enhanced transparency in model
development. Our study not only sheds light on the
inherent biases in affective computing but also underscores
the importance of considering demographic factors in
the development of more equitable and effective affect
recognition technologies, particularly given their direct
relation to cognitive and mental health.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>R.</given-names>
            <surname>Assabumrungrat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sangnark</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Charoenpattarawut</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Polpakdee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Sudhawiyangkul</surname>
          </string-name>
          , E. Boonchieng, T. Wilaiprasitporn,
          <article-title>Ubiquitous affective computing: A review</article-title>
          ,
          <source>IEEE Sensors Journal</source>
          <volume>22</volume>
          (
          <year>2021</year>
          )
          <fpage>1867</fpage>
          -
          <lpage>1881</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S.</given-names>
            <surname>Greene</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Thapliyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Caban-Holt</surname>
          </string-name>
          ,
          <article-title>A survey of affective computing for stress detection: Evaluating technologies in stress detection for better health</article-title>
          ,
          <source>IEEE Consumer Electronics Magazine</source>
          <volume>5</volume>
          (
          <year>2016</year>
          )
          <fpage>44</fpage>
          -
          <lpage>56</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>R. A.</given-names>
            <surname>Calvo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Dinakar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Picard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Maes</surname>
          </string-name>
          , Computing in mental health,
          <source>in: Proceedings of the 2016 CHI Conference Extended Abstracts on Human Factors in Computing Systems</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>3438</fpage>
          -
          <lpage>3445</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>T.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Phung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Dao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Venkatesh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Berk</surname>
          </string-name>
          ,
          <article-title>Affective and content analysis of online depression communities</article-title>
          ,
          <source>IEEE Transactions on Affective Computing 5</source>
          (
          <year>2014</year>
          )
          <fpage>217</fpage>
          -
          <lpage>226</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>C.</given-names>
            <surname>Zucco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Calabrese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Cannataro</surname>
          </string-name>
          ,
          <article-title>Sentiment analysis and affective computing for depression monitoring</article-title>
          ,
          <source>in: 2017 IEEE international conference on bioinformatics and biomedicine (BIBM)</source>
          , IEEE,
          <year>2017</year>
          , pp.
          <fpage>1988</fpage>
          -
          <lpage>1995</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Kirk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Taha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Dang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>McCague</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Hatzinakos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Katz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Ritvo</surname>
          </string-name>
          ,
          <article-title>A web-based cognitive behavioral therapy, mindfulness meditation, and yoga intervention for posttraumatic stress disorder: Single-arm experimental clinical trial</article-title>
          ,
          <source>JMIR Mental Health</source>
          <volume>9</volume>
          (
          <year>2022</year>
          )
          <article-title>e26479</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J. S.</given-names>
            <surname>Lerner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Keltner</surname>
          </string-name>
          , Beyond valence:
          <article-title>Toward a model of emotion-specific influences on judgement and choice</article-title>
          ,
          <source>Cognition &amp; emotion 14</source>
          (
          <year>2000</year>
          )
          <fpage>473</fpage>
          -
          <lpage>493</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>R. e.</given-names>
            <surname>Kaliouby</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Picard</surname>
          </string-name>
          , S. Baron-Cohen,
          <article-title>Affective computing and autism</article-title>
          ,
          <source>Annals of the New York Academy of Sciences</source>
          <volume>1093</volume>
          (
          <year>2006</year>
          )
          <fpage>228</fpage>
          -
          <lpage>248</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Nouman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. Y.</given-names>
            <surname>Khoo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. P.</given-names>
            <surname>Mahmud</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. Z.</given-names>
            <surname>Kouzani</surname>
          </string-name>
          ,
          <article-title>Recent advances in contactless sensing technologies for mental health monitoring</article-title>
          ,
          <source>IEEE Internet of Things Journal</source>
          <volume>9</volume>
          (
          <year>2021</year>
          )
          <fpage>274</fpage>
          -
          <lpage>297</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>N.</given-names>
            <surname>Mehrabi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Morstatter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Saxena</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lerman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Galstyan</surname>
          </string-name>
          ,
          <article-title>A survey on bias and fairness in machine learning</article-title>
          ,
          <source>ACM Computing Surveys (CSUR) 54</source>
          (
          <year>2021</year>
          )
          <fpage>1</fpage>
          -
          <lpage>35</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>E.</given-names>
            <surname>Granholm</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. F.</given-names>
            <surname>Asarnow</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. J.</given-names>
            <surname>Sarkin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. L.</given-names>
            <surname>Dykes</surname>
          </string-name>
          ,
          <article-title>Pupillary responses index cognitive resource limitations</article-title>
          ,
          <source>Psychophysiology</source>
          <volume>33</volume>
          (
          <year>1996</year>
          )
          <fpage>457</fpage>
          -
          <lpage>461</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>M. M. Bradley</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <string-name>
            <surname>Miccoli</surname>
            ,
            <given-names>M. A.</given-names>
          </string-name>
          <string-name>
            <surname>Escrig</surname>
            ,
            <given-names>P. J.</given-names>
          </string-name>
          <string-name>
            <surname>Lang</surname>
          </string-name>
          ,
          <article-title>The pupil as a measure of emotional arousal and autonomic activation</article-title>
          ,
          <source>Psychophysiology</source>
          <volume>45</volume>
          (
          <year>2008</year>
          )
          <fpage>602</fpage>
          -
          <lpage>607</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>K.</given-names>
            <surname>Yoo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ahn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.-H.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <article-title>The confounding effects of eye blinking on pupillometry, and their remedy</article-title>
          ,
          <source>Plos one 16</source>
          (
          <year>2021</year>
          )
          <article-title>e0261463</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>S.</given-names>
            <surname>Graur</surname>
          </string-name>
          , G. Siegle,
          <article-title>Pupillary motility: bringing neuroscience to the psychiatry clinic of the future</article-title>
          ,
          <source>Current neurology and neuroscience reports 13</source>
          (
          <year>2013</year>
          )
          <fpage>1</fpage>
          -
          <lpage>9</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>G.</given-names>
            <surname>Lynch</surname>
          </string-name>
          ,
          <article-title>Using pupillometry to assess the atypical pupillary light reflex and lc-ne system in asd</article-title>
          ,
          <source>Behavioral Sciences 8</source>
          (
          <year>2018</year>
          )
          <fpage>108</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>S.</given-names>
            <surname>Hocker</surname>
          </string-name>
          ,
          <article-title>Pupillometry for diagnosing nonconvulsive status epilepticus and assessing treatment response?</article-title>
          ,
          <source>Neurocritical Care</source>
          <volume>35</volume>
          (
          <year>2021</year>
          )
          <fpage>304</fpage>
          -
          <lpage>305</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>K.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Gu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Sarsenbayeva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Tag</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Dingler</surname>
          </string-name>
          , G. Wadley,
          <string-name>
            <given-names>J.</given-names>
            <surname>Goncalves</surname>
          </string-name>
          ,
          <article-title>Behavioral and physiological signals-based deep multimodal approach for mobile emotion recognition</article-title>
          ,
          <source>IEEE Transactions on Affective Computing</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <surname>H.-C. Yang</surname>
            ,
            <given-names>C.-C.</given-names>
          </string-name>
          <string-name>
            <surname>Lee</surname>
          </string-name>
          ,
          <article-title>Annotation matters: A comprehensive study on recognizing intended, selfreported, and observed emotion labels using physiology</article-title>
          ,
          <source>in: 2019 8th International Conference on Affective Computing and Intelligent Interaction (ACII)</source>
          , IEEE,
          <year>2019</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>7</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>M. E.</given-names>
            <surname>Kret</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. E.</given-names>
            <surname>Sjak-Shie</surname>
          </string-name>
          ,
          <article-title>Preprocessing pupil size data: Guidelines and code</article-title>
          ,
          <source>Behavior research methods 51</source>
          (
          <year>2019</year>
          )
          <fpage>1336</fpage>
          -
          <lpage>1342</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>B.</given-names>
            <surname>Taha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kirk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Ritvo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Hatzinakos</surname>
          </string-name>
          ,
          <article-title>Detection of post-traumatic stress disorder using learned time-frequency representations from pupillometry</article-title>
          ,
          <source>in: ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)</source>
          , IEEE,
          <year>2021</year>
          , pp.
          <fpage>3950</fpage>
          -
          <lpage>3954</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>S.</given-names>
            <surname>Hochreiter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schmidhuber</surname>
          </string-name>
          ,
          <article-title>Long short-term memory</article-title>
          ,
          <source>Neural computation 9</source>
          (
          <year>1997</year>
          )
          <fpage>1735</fpage>
          -
          <lpage>1780</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>T.</given-names>
            <surname>Ko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Peddinti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Povey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Khudanpur</surname>
          </string-name>
          ,
          <article-title>Audio augmentation for speech recognition</article-title>
          ,
          <source>in: Sixteenth Annual Conference of the International Speech Communication Association</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>P. J.</given-names>
            <surname>Lang</surname>
          </string-name>
          ,
          <article-title>International Affective Picture System (IAPS): Affective ratings of pictures and instruction manual</article-title>
          ,
          <source>Technical report</source>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>