=Paper= {{Paper |id=Vol-3297/paper6 |storemode=property |title=How Anxiety State and Acceptance of an Embodied Agent Affect User Gaze Patterns |pdfUrl=https://ceur-ws.org/Vol-3297/paper6.pdf |volume=Vol-3297 |authors=Nermin Shaltout,Diego Vilela Monteiro,Monica Perusquia-Hernandez,Kiyoshi Kiyokawa,Jason Orlosky |dblpUrl=https://dblp.org/rec/conf/apmar/ShaltoutMPKO22 }} ==How Anxiety State and Acceptance of an Embodied Agent Affect User Gaze Patterns== https://ceur-ws.org/Vol-3297/paper6.pdf
How Anxiety State and Acceptance of an Embodied Agent
Affect User Gaze Patterns⋆
Nermin Shaltout1,2,*,† , Diego Vilela Monteiro3 , Monica Perusquía-Hernández2 , Jason Orlosky1,4
and Kiyoshi Kiyokawa2
1 Osaka University, 1-32 Machikaneyama, Toyonaka, Osaka 560-0043, Japan
2 Nara Institute of Science and Technology, 8916-5, Takayama, Ikoma, Nara 630-0192, Japan
3 École Supérieure d’Informatique Electronique Automatique, 38 Rue des Docteurs Calmette et Guérin, 53000 Laval, France
4 Augusta University, School of Computer and Cyber Sciences, 100 Grace Hopper Ln, Augusta, GA 30901, USA


Abstract
In virtual reality (VR), how users interact with embodied agents when they are anxious, or when they do not accept an agent, is not yet completely understood. Gaze can be indicative of a user's anxiety and of their acceptance of an embodied agent, and an agent's expressions or actions can, in turn, be used to accommodate the user's anxiety. Previous work on social anxiety disorder (SAD) found evidence of avoidant or hyper-vigilant gaze patterns in relation to the agents or people participants were gazing at. We therefore investigated whether individuals without SAD who are anxious in the moment show specific gaze patterns when gazing at an embodied agent, focusing mostly on avoidant gaze patterns. Based on evidence of gaze patterns in SAD and autism, we designed an experiment in which normative individuals interact with an agent showing neutral, happy, and angry expressions. We aim to examine whether anxious normative participants have gaze or avoidance patterns similar to those of people with SAD. We also investigated whether the user's acceptance of, or preference for, the virtual agent's display of emotions had an effect on avoidance as expressed through eye gaze. In particular, we examined the user's gaze patterns in relation to the agent's eyes, face, and body to see whether there were similarities to people with SAD. Using correlation analysis, we found a significant positive correlation between the participant's acceptance of the virtual agent's expression and their fixations on the agent's eyes, as well as a significant correlation between fixations on the agent's body and how anxious the participant was at the experiment's start. These results can later be used to find a link between acceptability, anxiety, and SAD.

Keywords
Virtual Reality, Embodied Agent, Eye Gaze, Anxiety



APMAR’22: Asia-Pacific Workshop on Mixed and Augmented Reality, Dec. 02-03, 2022, Yokohama, Japan
* Corresponding author.
Email: nermeena@gmail.com (N. Shaltout); diego.vilelamonteiro@esie.fr (D. V. Monteiro); m.perusquia@is.naist.jp (M. Perusquía-Hernández); jorlosky@augusta.edu (J. Orlosky); kiyo@is.naist.jp (K. Kiyokawa)
ORCID: 0000-0002-1570-3652 (D. V. Monteiro); 0000-0002-0486-1743 (M. Perusquía-Hernández); 0000-0002-0538-6630 (J. Orlosky); 0000-0003-2260-1707 (K. Kiyokawa)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org, ISSN 1613-0073).

1. Introduction

In the field of virtual reality (VR), embodied agents are commonplace as non-player characters (NPCs) or as other users' avatars in-game. Thus, determining how individuals react to virtual agents is an important topic in the field [1]. The adaptation of embodied agent or avatar facial expressions can influence user behavior [2], particularly for those who might use the agents for learning [3], social support, or feedback [4]. Previous studies assessed the gaze patterns of individuals in social situations to understand psychological and emotional patterns. Such studies are used to better understand and train people with disorders such as high social anxiety, and were usually conducted using still photographs [5]. Individuals with Social Anxiety Disorder (SAD) react differently to facial displays of emotion; this also happens in VR, independently of avatar fidelity [6]. Little is known, however, about how the anxiety of normative individuals (i.e., those without SAD) affects gaze behaviour with respect to facial displays of emotion. Exposure to virtual situations has risen with new platforms such as VR, and increased further with the advent of COVID-19. Studying the effects of anxiety on interactions with virtual embodied agents is thus important both for SAD and for anxious normative individuals.

VR offers the possibility of presenting dynamic facial stimuli with a wealth of parameters, leading to detailed descriptions of the facial movements required to convey a socio-affective message accurately [7]. Furthermore, the advent of biosensors allows real-time reproduction of facial expressions from other users [8]. Moreover, given the general public's increased interest in the metaverse since the 2019 pandemic, we believe it is timely to study the effects of VR agents' facial expressions on user gaze.

Thus, this study aims to explore different gaze parameters and their effectiveness in determining how comfortable the user is with the agent as it presents different facial
expressions, which might be an alternate method for measuring reactions towards VR agents. The design of the study was inspired by previous work on individuals with SAD. Individuals with SAD usually do not deal well with emotions presented on the face: they tend to avert their gaze when faced with emotional people or representations of them, the avoidance may increase with negative emotions, and they especially avoid looking at the eyes of individuals displaying emotions [9]. Thus, we analyzed the effect of the user's anxiety and acceptance, when confronted with an embodied agent showing different emotions, on the user's eye gaze patterns.

We hypothesize that gaze location on the agent can be used to measure the degree of comfort towards an agent's facial expression, or the degree of user anxiety. To this aim, eye gaze was measured using the VIVE Pro Eye tracker while participants looked at a VR agent with varying expressions. The main contributions of this paper are:

     • Analyzing the correlation between the general anxiety of a normative user and their gaze patterns on the embodied agent.
     • Analyzing the correlation between the user's acceptance of the embodied agent's emotional display and their gaze patterns on the embodied agent.
     • Comparing the findings to those reported for SAD in similar studies.

2. Prior Work and Hypotheses

2.1. Gaze Analysis Studies Related to Social Anxiety

Eye metrics are promising tools to assess attitudes towards virtual agents. The main inspiration for this study stems from gaze analysis of individuals with high social anxiety (HSA) towards the facial expressions of other individuals in social situations. Previous gaze studies showed that individuals with HSA averted their gaze when shown photos of individuals expressing positive or negative emotions [5, 10]. In such studies, static photos of people presenting happy, sad, and neutral facial expressions were commonly shown while gaze directions and fixations were measured. Thus, we adapted our study to find a relation between gaze direction on an embodied agent displaying emotions and the user's acceptance of the emotional display.

Wieckowski et al. explored variability in bias toward social stimuli, in the form of vigilant and avoidant attention, using eye-gaze techniques instead of the traditional probe-task technique often used to study attention bias in anxious youth with clinical SAD [11]. The visual dot-probe task involves allowing the users to select between two agent pairs (e.g., angry/neutral, happy/neutral) using their eye gaze. Participants show both avoidance and hyper-vigilance depending on the age group, the agent display, and the passage of time during the trial. The bias is measured with the duration of fixations towards angry faces and towards more pleasant faces such as neutral or happy faces; the fixation duration on neutral faces is subtracted from that on angry faces to yield a negative or positive bias. Based on the above studies, we hypothesize that two factors affect the gaze patterns of individuals with SAD: the level of anxiety of the person when they are looking at the face, and whether or not they accept the facial expression of the embodied agent. We would also like to observe whether this applies to normative individuals and whether it mimics those with SAD.

2.2. Hypotheses

Our hypotheses are as follows.

H1 The agent's facial expressions have an effect on the participant's self-reported arousal and valence.

H2 The user's acceptance or preference of the agent's emotional display can be observed in the eye fixation patterns on the agent.

H3 The overall anxiety state of the participants could alter the eye fixation patterns on the agent.

For H1, the Self-Assessment Manikin (SAM) was used to assess whether the facial expressions had an effect on the participant's affect. Regarding H2, when participants answered the questionnaires described in Sec. 3.6 about the agent's different emotional displays, not all of them accepted the agent's emotional display in the same manner; for example, while some people highly disliked the happy face, others were comfortable with it. We assessed whether there is a pattern between the acceptability of the emotional display and the number of fixations on the agent. For H3, we assessed whether there is a relation between the anxiety of the participant and their fixation behaviors on the agent's different body parts. We expected the more anxious participants to avoid the agent's face and eyes. The anxiety state in this case is the user's default state before and during the experiment.

3. Experiment

3.1. Participants

A total of 21 student volunteers in their early twenties participated in the study: 10 Japanese, 1 Kenyan, 1 German, 2 Nepali, 1 Colombian, 4 Chinese, 1 Thai, and 1 Malaysian. No participant had tried our system before. The
participants were asked to wear their glasses if their vision was poor. Sources of error were accounted for by removing three participants with missing data (e.g., the VIVE Pro Eye tracking was accidentally disabled for one of the faces). Three participants who were extremely fatigued were excluded using a fatigue score in the pre-questionnaire. After the exclusions, the number of participants was 15. The experiment was approved by the ethics committee of our institution.

Figure 1: A diagram showing the resulting avatar (right) created when using an average face (left) on the Ready Player Me avatar creator (https://bit.ly/34C2G).

3.2. Experiment Design

We tested the participants' eye gaze patterns when presented with different facial expressions from a humanoid agent in VR. There were three conditions, corresponding to three facial displays expressed by the virtual agent: a happy, a neutral, and an angry facial expression. The conditions were presented in a within-subjects design, i.e., each participant saw all three faces. We chose to give the agent only facial expressions, to avoid confounding factors caused by other agent behaviors.

3.3. Procedure

The participants were exposed to each of the agent's facial expressions for one minute at a time. There were three runs in total, one for each facial expression. The participants were seated in front of the agent so as to be at the same height as the agent, and faced the agent head-on without an angle. Before every run, the agent was adjusted to the same height as the participant. The participants remained seated throughout the experiment and were encouraged to use only gaze and head movements. A full-body agent was used so that the participants could freely choose whether to gaze at the agent's face, its body, or outside of the agent completely. The participants answered questionnaires pre- and post-experiment and after each avatar was displayed; details are given in the measurements section.

3.4. Stimuli

The agent was designed using Ready Player Me [12], a tool that converts a photograph of a person into an avatar with similar facial features. It is used by players to make agents of themselves in-game and is currently most popular on platforms such as VR chatting programs. Ready Player Me is also equipped with the ability to map the user's emotions onto the agent via eye and facial tracking. The Ready Player Me avatar is based on FACS and is usually embodied by people in VR, but we controlled it using the animation module of Unity to gain more control over the experiment.

The possibility of an uncanny valley in the virtual agent's appearance is higher with agents that are hyper-realistic [13]. Thus we used a semi-realistic avatar.

We used an average Asian face to accommodate the majority-Asian demographic involved in the experiment. Figure 1 shows the resulting avatar when inputting an average Asian face to the Ready Player Me interface. Ready Player Me was used because its low-poly characteristics make it more likely to be used by people in current virtual chats and metaverse settings.

Though Ready Player Me characters are usually used as avatars in VRChat, we used one in this case to animate the agent, as if it were an example user in VR. To animate the happy, angry, and neutral emotions, the Facial Action Coding System (FACS) was used [14]. The FACS provides action units (AUs) for coding facial movements without making inferences about the underlying emotions. It is a popular tool in emotion studies, used either to create faces with a certain expression or to interpret a facial expression. The FACS is now incorporated into most VR chat avatars to enable them to express emotions by encoding certain AU movements, and Ready Player Me avatars come equipped with most of the values available in the FACS. We focused on prototypical AUs according to the Basic Emotion Theory [15] to animate the avatar, together with the guidelines described in Farnsworth's visual FACS guide [16]. For instance, to animate a happy face, we used AU 6 (cheek raiser, Fig. 2) and AU 12 (lip corner puller) with values of 1.0. The agent's default face, with some minor adjustments, was used to represent the neutral face: because Ready Player Me avatars are designed to look slightly happy by default, we adjusted the brow lowerer (AU 4) and the lip corner depressor (AU 15) to bring the avatar back to a neutral state. The facial expressions were animated using blend shapes. Three separate faces were shown to the same participant; we refer to them as three different trials, with questionnaires in between. All facial expressions start from the neutral expression, and it took one second for the facial animation to reach its maximum intensity. The avatar's expression then remained constant for the duration of the trial, i.e., one minute per trial, within which we measured the participant's fixations. We do not consider the initial animation to have a detrimental effect on the gaze patterns or number of fixations; thus we did not account for this baseline when observing gaze patterns.
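The expression timing above (a one-second onset ramp to full AU intensity, followed by a constant hold for the one-minute trial) can be sketched as a blend-shape weight schedule. This is a minimal illustration in Python rather than the actual Unity animation; the AU key names, the data layout, and the linear ramp shape are our own assumptions, with the happy-face AU choice (AU 6 + AU 12 at 1.0) taken from the text.

```python
# Blend-shape weight schedule for one trial: the expression ramps
# linearly from neutral (0) to full intensity (1.0) over the first
# second, then holds at full intensity for the rest of the 60 s trial.

RAMP_S = 1.0    # onset duration (s): time to reach maximum intensity
TRIAL_S = 60.0  # trial length (s): expression held until the trial ends

# Target AU weights per expression. "happy" follows the text
# (AU 6 cheek raiser + AU 12 lip corner puller, both at 1.0);
# "neutral" is the (adjusted) default face, so no extra AUs.
EXPRESSIONS = {
    "happy": {"AU6_cheek_raiser": 1.0, "AU12_lip_corner_puller": 1.0},
    "neutral": {},
}

def blend_weights(expression, t):
    """Return the blend-shape weights at time t (seconds) into a trial."""
    targets = EXPRESSIONS[expression]
    if t <= 0.0:
        scale = 0.0            # trial not started: neutral face
    elif t < RAMP_S:
        scale = t / RAMP_S     # linear onset over the first second
    elif t <= TRIAL_S:
        scale = 1.0            # hold at maximum intensity
    else:
        scale = 0.0            # trial over: back to neutral
    return {au: w * scale for au, w in targets.items()}
```

At t = 0.5 s the happy face is at half intensity; from t = 1 s to the end of the minute it stays at the full AU values.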
Figure 2: A diagram showing action unit 6, cheek raiser (left two images; permission to use the image was obtained from iMotions). The rightmost two images show AU 6 applied to the Ready Player Me avatar used in the experiment as an agent (https://bit.ly/3bKN9a).

Figure 3: (a) Cube used for calibrating the eye gaze pre-experiment; (b) height-adjusting the participant while obstructing the avatar's face; (c), (d) examples of the neutral and angry faces used for the experiment. The gaze ray is shown for illustrative purposes only and was omitted in the actual run of the experiment. The action units (AUs) for both the happy and angry faces were set to 1 to obtain the maximum effect.

To reduce the chances of a participant experiencing the uncanny valley, the avatar's blinking was animated, with the interval between blinks randomized between 0.5 and 4 seconds. The agent's gaze was fixed during the run of this experiment. The agent was also given a breathing animation using Mixamo to give it a more realistic feel [17]. No interactions other than the varying facial expressions were added to the agent: in this experiment we focused on the relationship between the user's anxiety and the user's gaze patterns rather than on designing a complex interaction system, so a simple design was used. Future work will feature a more interactive avatar.

3.5. Calibration

To ensure that eye and facial feature tracking worked correctly, we used a cube display to calibrate and confirm the participant's gaze prior to the experiment's start. The agent's height was adjusted to match the participant's height in every trial. The agent's face was only revealed once the VIVE Pro Eye was calibrated, as shown in Fig. 3. We calibrated each participant's eye gaze using the VIVE's internal calibration software before running the experiment. The participant was then positioned to see the front view of the agent, but was free to move their head and gaze and to look either at the avatar's face or body for the duration of the one-minute trial. No background objects were visible, so that the participant would focus on the agent. Participants were seated during the experiment, received no instructions as to where to look when facing the agent, and were left to interact naturally with the agent using eye and head movements once the experiment started. The participant's location did not change.

3.6. Measurements

Eye Metrics We implemented a gaze ray to determine the intersection of the participant's gaze with the avatar. To determine the number of collisions between the gaze ray and parts of the avatar, we added colliders to the face, eyes, and body (Fig. 5). When the ray did not collide with the avatar, the sample was recorded as ‘other.’ We defined a constant gaze on the same body part for 150 ms as a “fixation.” The number of fixations on each body part was counted within the one minute that the participant looked at the avatar. Fixations were collected for each facial expression (neutral, angry, and happy), and the process was repeated per participant. By analyzing the number of fixations on each body part, we expected to find patterns related to the participant's current state, the participant's preference for the face, and the facial expression that the avatar displayed to the participant.

Figure 4: The face mask (highlighted on the left) used to detect collisions/fixations.

Additionally, we added a face mask, not visible to the user, to roughly measure the number of fixations on the face. We counted the number of fixations surpassing 150 ms on the colliders added to this face mask, and divided the fixations into an upper-face region and a lower-face region: any collider above the lower eye was considered upper face, while colliders below were considered lower face. If the upper-face collisions are higher in the results, the user accepts the avatar; if the lower-face collisions are higher, the user rejects the avatar, according to the source.

Questionnaires The participants answered questionnaires at different points during the experiment. Some questionnaires, such as the SAM and the questionnaires about the acceptability of the avatar, were repeated more than once. To identify at which point of the experiment the participant answered each questionnaire, we assigned codes as follows:

B : Before seeing any avatar.

AH : After seeing the happy facial expression agent.
AA : After seeing the angry facial expression agent.

AN : After seeing the neutral facial expression agent.

All three avatars were shown to each participant in a counterbalanced order. The following permutations were used: (AA, AH, AN); (AA, AN, AH); (AH, AA, AN); (AH, AN, AA); (AN, AA, AH); (AN, AH, AA).

Figure 5: How the number of fixations was measured using colliders placed on the avatar: (a) headf, the number of times the participant fixated on the avatar's head; (b) bodyf, the number of times the participant fixated on the avatar's body; (c) eyef, the number of times the participant fixated on the avatar's eyes; and (d) other, the number of fixations outside of the avatar. Collisions on the avatar were detected using convex colliders.

Pre-questionnaire Before the experiment, participants reported their demographics and their current fatigue and anxiety. They also reported their general anxiety level on a 9-point Likert scale.

Self-Assessment Manikin (SAM) A SAM questionnaire with a 9-point Likert scale was used to measure valence, arousal, and dominance; Fig. 6 shows a sample of the valence and arousal questionnaires used. The SAM was presented before the experiment and after every face, to measure how the avatar affected the participant.

Acceptability Questionnaire This was a more detailed questionnaire about how people felt about the avatar, coded AH, AA, or AN after the corresponding facial expression. It consisted of several questions answered on a 9-point Likert scale, scored from 1 to 9 to match the format of the SAM. From this questionnaire, we only used two questions for further analyses:

Q1 : I felt comforted by the avatar.

Q2 : I felt disturbed by the avatar.

Figure 6: The [a] Valence and [b] Arousal questionnaires from the SAM.

The acceptability questionnaire was used to measure the participant's acceptance (contextual comfort) of the virtual agent's expression after each trial; it was answered three times per participant, once after each face. From its answers, we created a variable called acceptability, computed as follows:

Acceptability = Score(Q1) − Score(Q2)

A positive or zero value indicated that the agent was accepted, while a negative value indicated that the agent was rejected by the user. We calculated the acceptability per face per participant. This score was then used to test for correlations between the acceptability of each facial expression and the participant's fixations on different locations of the embodied agent (head, body, eyes, etc.).

The acceptability questionnaire was conducted after the participant viewed each expression of the agent for one minute. For instance, after the participant views the happy expression for one minute, the experiment stops and the participant answers the SAM and acceptability questionnaires; this procedure is then repeated for the other two expressions. The questionnaires are not validated, which might be a limitation of this study.

4. Analysis and Results

Despite being given no instructions on where to direct their gaze, the majority of participants gazed at the agent's face or body. However, the gaze patterns changed according to the facial expression shown, the participant's acceptability score, and the anxiety score, as detailed below. Deviations away from the face changed according to the acceptability and anxiety scores.

SPSS was used to analyze the data. To test H1, we measured whether the agent's facial expression had an effect on the
participant's affect, as follows. The SAM questionnaire was taken four times, labelled with the experiment codes described in the Questionnaires portion of Sec. 3.6: before the experiment (B), after the neutral agent (AN), after the happy agent (AH), and after the angry agent (AA), for each participant.

A Shapiro-Wilk test showed a significant departure from normality, with W(98) = 0.94, p < 0.01; W(98) = 0.94, p < 0.01; and W(98) = 0.95, p < 0.01 for valence, arousal, and dominance, respectively. We thus ran the Kruskal-Wallis test as a non-parametric test of whether there were significant differences between conditions B, AH, AA, and AN. Kruskal-Wallis rank-sum tests carried out on valence, arousal, and dominance showed statistically significant differences between the different embodied agent expressions and the baseline in valence (χ²(3) = 21.48, p < 0.001) and arousal (χ²(3) = 8.867, p < 0.05). The mean rank valence scores were 41.07, 45.16, 28.67, and 62.5 for B, AN, AH, and AA, respectively. For arousal, the mean rank scores were 39.98, 41.61, 57.6, and 37.84 for B, AN, AH, and AA, respectively. The results for dominance showed no statistical significance.

(Fig. 6 panel notes. [a] Valence: a higher value means that a person is content, while a lower value means that a person is upset. [b] Arousal: measures the state of alertness; a higher value indicates higher alertness.)

Pairwise Wilcoxon rank-sum tests with Benjamini-Hochberg (BH) p-value adjustment were carried out for the SAM valence and arousal scores, since these showed significance. The experiment codes (B, AN, AH, AA) represent the experiment stages as detailed in Sec. 3.6. The score pairs (B, AN), (B, AH), and (B, AA) were compared to each other, representing the comparison between the baseline state and the participant's state after viewing agents with different emotions, to test whether the different emotional agents had a significant effect on the participant's affect. The differences between the scores of the face pairs (AN, AH), (AN, AA), and (AH, AA) were also compared.

Figure 7: Boxplot showing [a] Valence and [b] Arousal values entered by participants after viewing each facial expression (AA, AH, AN) compared to the baseline, B. The starred brackets show statistically significant pairings from the Wilcoxon rank-sum tests.

The significant results for valence and arousal are summarized using starred brackets in Fig. 7 and were as follows. Valence was higher after introducing the happy agent (AH) compared to the baseline (B) (p < 0.05, Cohen's d = 0.25), and conversely lower than the baseline after introducing the angry agent (AA) (p < 0.01, Cohen's d = 0.25). There was no significant difference in valence between the baseline condition and the neutral agent (AN). The valence score was also lower after viewing the angry face than after the neutral face (p < 0.05, Cohen's d = 0.25) and, conversely, higher after the happy face than after the neutral face (p < 0.05, Cohen's d = 0.25). The differences in valence values were also significant for (AA, AH) with (p < 0.01, Cohen's d = 0.25) and (AN, AH) with (p < 0.01, Cohen's d = 0.25); N = 15 for all tests concerning valence and arousal.

We tested H2 and H3 to see whether anxiety and acceptability affect where the participant looks most on the avatar, and whether participants avoid the agent's face or eyes as a result. We compared these results to previous findings for socially anxious individuals. Kendall's τb correlation was used for the correlation tests, since the sample size is small and the normality tests showed that the sample did not follow a normal distribution, as before.

To test H2, when running the tests on all participants, we found no correlation between the acceptability score and the number of fixations on any portion of the agent. We ran another Kendall's τb correlation using only anxious participants.

Table 1
Correlation values between the acceptability score and fixations on different portions of the agent for N = 13 (anxious participants) (*: p < 0.05, **: p < 0.01).
pair was significant (𝑝 < 0.01, Cohen’s 𝑑 = 0.25).                                 headf    bodyf     eyesf    other
    The results for arousal were as follows: the difference     bodyf                 .04
between arousal values for the following score pairs with       eyesf                .51*    −.22
the following experiment codes showing statistical sig-         other               .59**      .21       .16
nificance: (B, AH) with (𝑝 < 0.05, Cohen’s 𝑑 = 0.25);           Acceptability         .31      .03     .51*      .22
ious participants. The results are summarized in Table 1.
A positive correlation between the participant’s accept-
ability score and the number of fixations on the eye were
found which was statistically significant (𝜏 𝑏 = 0.510,
𝑝 < 0.05). The acceptability score was standardized
between −1 and 1.
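The testing pipeline above (a Shapiro-Wilk normality check, a Kruskal-Wallis omnibus test, then pairwise Wilcoxon rank-sum tests with Benjamini-Hochberg adjustment) can be sketched with SciPy as follows. The ratings are hypothetical placeholders and the `benjamini_hochberg` helper is written here for illustration; this is not the study's analysis code.

```python
# Sketch of the non-parametric pipeline described above, using SciPy.
# The ratings here are illustrative stand-ins, NOT the study's data.
from itertools import combinations
from scipy import stats

# Hypothetical SAM valence ratings (1-9) per condition, N = 15 each.
ratings = {
    "B":  [5, 4, 6, 5, 5, 4, 6, 5, 4, 5, 6, 5, 4, 5, 5],
    "AN": [5, 5, 4, 6, 5, 5, 4, 5, 6, 5, 5, 4, 5, 5, 6],
    "AH": [7, 6, 8, 7, 6, 7, 8, 6, 7, 7, 6, 8, 7, 7, 6],
    "AA": [3, 2, 4, 3, 3, 2, 3, 4, 3, 2, 3, 3, 4, 2, 3],
}

# 1) Shapiro-Wilk: test the pooled scores for departure from normality.
pooled = [x for xs in ratings.values() for x in xs]
w, p_norm = stats.shapiro(pooled)
print(f"Shapiro-Wilk: W = {w:.3f}, p = {p_norm:.4f}")

# 2) Kruskal-Wallis omnibus test across the four conditions.
h, p_kw = stats.kruskal(*ratings.values())
print(f"Kruskal-Wallis: H = {h:.3f}, p = {p_kw:.4f}")

# 3) Pairwise Wilcoxon rank-sum tests, then Benjamini-Hochberg adjustment.
pairs = list(combinations(ratings, 2))
raw_p = [stats.ranksums(ratings[a], ratings[b]).pvalue for a, b in pairs]

def benjamini_hochberg(pvals):
    """Return BH-adjusted p-values (step-up procedure)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    prev = 1.0
    for rank, i in enumerate(reversed(order)):
        k = m - rank                      # 1-based rank of this p-value
        prev = min(prev, pvals[i] * m / k)
        adjusted[i] = prev
    return adjusted

for (a, b), p_adj in zip(pairs, benjamini_hochberg(raw_p)):
    print(f"({a}, {b}): BH-adjusted p = {p_adj:.4f}")
```

Rank-based tests are used throughout because the Shapiro-Wilk step rejects normality; the BH step controls the false discovery rate across the six pairwise comparisons.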
   To investigate H3, a Kendall's τb correlation was run to determine the relationship between the participant's anxiety score and the number of fixations on a certain portion of the avatar, as shown in Table 2, with N = 45, regardless of the face used. There was a strong, positive correlation between the participant's anxiety score and the number of fixations on the agent's body per minute, which was statistically significant (τb = 0.30, p < 0.001). Additionally, there was a strong negative correlation between the participant's anxiety score and the number of fixations on the agent's eyes (τb = −0.22, p < 0.01). The results are summarized in Table 2.

Table 2
Correlation values between anxiety score and fixations on different portions of the agent for N = 45 (*: p < 0.05, **: p < 0.01)

                   headf    bodyf     eyef    other
   Anxiety Score    −.12    .30**    −.22*      .20
   headf                   −.32**    .56**     −.03
   bodyf                            −.50**     .24*
   eyef                                         .27*

   A Kendall's τb correlation was also run to determine the relationship between the anxiety score of the participant and the number of fixations on a certain section of the agent, among 15 participants, per facial expression, to observe whether there were any patterns for specific facial expressions. We found no significant correlations between the anxiety score and the body fixations when presenting the participants with the happy face. However, we found a strong correlation between the number of body fixations (bodyf) and the anxiety score when participants were presented with the neutral face, which was statistically significant (τb = 0.411, p < 0.05). There was also a strong, statistically significant correlation between the number of fixations detected outside the agent and the anxiety score when participants viewed the angry face (τb = 0.420, p < 0.05). The values were Benjamini-Hochberg corrected. The results are summarized in the scatter plots in Fig. 8.

Figure 8: Scatter plots illustrating the correlation between the participant's anxiety score and (a) the number of fixations detected outside of the agent's face when observing the avatar with the angry expression, and (b) the number of fixations on the agent's body when observing the agent with the neutral expression, over one minute.

   We ran two additional Kendall's τb correlation tests to see if there was a correlation between the anxiety score and whether the participant gazed at the upper or lower part of the avatar's face, first regardless of the face with N = 45, then per facial expression (angry, neutral, happy) with N = 15. Regardless of the face presented, there was a strong negative correlation between the anxiety score and the upper face, with statistical significance (τb = −0.263, p < 0.05). For the results per facial expression, only the happy expression showed a strong negative correlation between the anxiety score and the upper face, with statistical significance (τb = −0.408, p < 0.05).


5. Discussion

In H1, we suggested that the agent affects the user's valence and arousal. The results in Fig. 7 show that the agent has a significant effect on the valence and arousal of the user according to the emotion displayed. This shows that the model of the avatar actually works to affect the participant. The differences in the participant's valence and arousal when viewing the neutral agent compared to the baseline were not statistically significant. This is expected, because no emotion was conveyed with the neutral facial expression. The angry and happy facial expressions significantly affected the valence and arousal of the participants compared to the baseline. This showed that the facial expressions of the agents affected the participants' valence and arousal. The angry facial expression also induced a lower valence score, while the happy facial expression induced a higher valence. This confirmed that the facial expressions of the avatar were perceived correctly, and supports H1.
   In H2, we predicted that the user's acceptability of the agent's emotions affects gaze patterns on the agent. SAD individuals do not want to confront the emotions of others, thus exhibiting behaviors such as avoiding the face or being hyper-vigilant to cope with emotional displays.
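A Kendall's τb correlation analysis against fixation counts, like those reported in Tables 1 and 2, can be reproduced in outline with SciPy, whose `kendalltau` computes the tie-corrected τb variant by default. The anxiety scores and fixation counts below are hypothetical placeholders, not the study's measurements.

```python
# Sketch of a Kendall's tau-b correlation between anxiety scores and
# fixation counts per agent region, as in Tables 1 and 2. SciPy's
# kendalltau computes the tie-corrected tau-b variant by default.
# All data here are hypothetical placeholders, not the study's data.
from scipy import stats

# Hypothetical per-participant anxiety scores and fixations per minute
# on each region of the agent (headf, bodyf, eyesf, other).
anxiety = [12, 30, 25, 8, 40, 22, 35, 15, 28, 18, 33, 10, 26, 20, 38]
fixations = {
    "headf": [14, 9, 10, 15, 6, 11, 8, 13, 9, 12, 7, 16, 10, 11, 6],
    "bodyf": [3, 10, 8, 2, 14, 7, 12, 4, 9, 5, 11, 2, 8, 6, 13],
    "eyesf": [9, 4, 5, 10, 2, 6, 3, 8, 5, 7, 3, 11, 5, 6, 2],
    "other": [1, 5, 4, 1, 7, 3, 6, 2, 4, 2, 5, 1, 4, 3, 6],
}

for region, counts in fixations.items():
    tau, p = stats.kendalltau(anxiety, counts)   # tau-b by default
    flag = "*" if p < 0.05 else ""
    print(f"{region}: tau_b = {tau:+.3f}, p = {p:.4f}{flag}")
```

A rank correlation such as τb is the appropriate choice here, rather than Pearson's r, because the samples are small and the normality tests failed; the resulting p-values could then be Benjamini-Hochberg adjusted as in the valence and arousal analysis.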
   In addition to the anxiety score, we measured whether the participant's acceptance or rejection of the agent's emotions played a factor in avoidant gaze patterns. We created the acceptability score as a measure of the individual participant's preference as the agent's facial expressions were varied. Individuals with SAD are anxious at the same time as they do not accept facial expressions. Thus, we measured whether there is a correlation between the acceptability score for anxious participants and the number of fixations on specific parts of the agent. A positive correlation was found between fixations on the agent's eyes and the acceptability score in anxious participants. This shows that even anxious participants are more likely to gaze at the agent's eyes if they accept the agent's facial expression. Conversely, the opposite is true: the person avoids the gaze of the agent if they find the agent's emotions unacceptable. This matches the literature on SAD individuals avoiding eye gaze for displays of affect, and supports H2.
   The acceptability score can be used to change how agents' faces are studied. It can be used in the future to study the root cause of SAD, or even cultural differences in social norms, when both accepting and reacting to varying facial expressions. The acceptability measure can also be used as an extra factor in future SAD studies.
   For H3, we hypothesized that the user's overall anxiety affects gaze patterns on the agent. In this study, we took the number of fixations on each part of the agent as an indication of acceptance or avoidance of the facial expressions of the avatar. On one hand, an increased number of fixations on the eyes or head suggests that the participant accepts the avatar. This is consistent with previous studies [5, 10]. On the other hand, an increased number of fixations on the body or outside of the avatar indicates that the user is avoiding the agent.
   We also explored the relationship between agent avoidance and the participants' anxiety level. The initial anxiety score represents an approximation of the participants' overall state before facing the agent. We found significant correlations between the user's initial anxiety and body fixations, invariant of facial expression, as shown in Table 2. When analyzing the correlations on a per-facial-expression basis, there was a strong correlation between the anxiety score and the number of fixations on the body of the agent or outside of the body when presented with the neutral and angry expressions, respectively. When viewing the angry expression, the fixations were completely outside the body, showing that the participant is more likely to avoid the agent completely the more negative the expression is.
   We also found a strong negative correlation between the participants' anxiety score and the number of fixations on the agent's eyes, invariant of facial expression. This indicates that the participant avoids the agent's eyes and is more likely to look at the agent's body or outside of the agent when anxious, matching the pattern of those with SAD as per [5, 10], and supports H3.
   When analyzing the correlation between the anxiety score and the user's fixations on the upper and lower parts of the face for all the faces, we found a strong negative correlation between the upper face and the anxiety score. This indicated that the anxious users avoided the agent's eyes, as cited in [5, 10]. When analyzing the correlations per facial expression, we only found a strong negative correlation between user fixations on the upper face and the user's anxiety score with the happy agent. This is probably because the participants were otherwise avoiding the face for the neutral and angry agents. This indicates that anxious users are more likely to look at the agent's face if it has positive affect, despite avoiding the eyes.


6. Conclusions

The study observes the correlation between the anxiety of normative users, their acceptability of the agent's emotions, and the respective gaze patterns on an agent with varying emotions. Our results suggest that individuals with anxiety in the moment have gaze patterns similar to those with SAD. The similarities between normative individuals facing anxiety in the moment, their gaze patterns, and their acceptability of the agent can unlock a better understanding of the way SAD individuals operate. The techniques used for SAD individuals can also be used to accommodate normative anxious individuals facing social situations in VR.
   In this model, the participant is allowed to look away from the face to other sections of the embodied agent, including the body, which emulates an actual social situation in VR. Whether the users gazed at the upper or lower parts of the agent's face was also analyzed. The more negative the expression, the further the anxious participant strayed from the agent's eyes, then the face, and then the entire body, respectively. These findings are an important indication for designing future systems. For example, nonverbal agents with positive affect might be a better choice for an anxious normative individual, as participants were still more likely to gaze at the face regardless of their anxiety. The studies also show that users are more likely to gaze at the agent's eyes if they accept the agent's emotional display.
   A Metaverse VR avatar and a VIVE tracker were used in the experiment. The technique can easily be applied in a more ecologically valid setting to find avoidance patterns of anxious users in real time. This is useful for adjusting a non-verbal agent's expression to accommodate the user's anxiety and facial preference. Other sensors, e.g., a heart rate sensor or a facial tracker, can also be added to find stronger patterns. The studies suffer from some limitations due to the limited participant count and non-validated questionnaires.
References

 [1] D. Monteiro, H.-N. Liang, J. Wang, L. Wang, X. Wang, Y. Yue, Evaluating the effects of a cartoon-like character with emotions on users' behaviour within virtual reality environments, in: 2018 IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR), IEEE Computer Society, 2018, pp. 229–236.
 [2] E. Hasegawa, N. Isoyama, D. V. Monteiro, N. Sakata, K. Kiyokawa, The effects of speed-modulated visual stimuli seen through smart glasses on work efficiency after viewing, Sensors 22 (2022) 2272.
 [3] H. Lee, H. Kim, D. V. Monteiro, Y. Goh, D. Han, H.-N. Liang, H. S. Yang, J. Jung, Annotation vs. virtual tutor: Comparative analysis on the effectiveness of visual instructions in immersive virtual reality, in: 2019 IEEE International Symposium on Mixed and Augmented Reality (ISMAR), IEEE, 2019, pp. 318–327.
 [4] D. Monteiro, H.-N. Liang, H. Li, Y. Fu, X. Wang, Evaluating the need and effect of an audience in a virtual reality presentation training tool, in: International Conference on Computer Animation and Social Agents, Springer, 2020, pp. 62–70.
 [5] K. Roelofs, P. Putman, S. Schouten, W.-G. Lange, I. Volman, M. Rinck, Gaze direction differentially affects avoidance tendencies to happy and angry faces in socially anxious individuals, Behaviour Research and Therapy 48 (2010) 290–294.
 [6] J. H. Kwon, C. Alan, S. Czanner, G. Czanner, J. Powell, A study of visual perception: Social anxiety and virtual realism, in: Proceedings of the 25th Spring Conference on Computer Graphics, 2009, pp. 167–172.
 [7] R. E. Jack, P. G. Schyns, The human face as a dynamic tool for social communication, Current Biology 25 (2015) R621–R634. URL: https://www.sciencedirect.com/science/article/pii/S0960982215006557. doi:10.1016/j.cub.2015.05.052.
 [8] H.-S. Cha, S.-J. Choi, C.-H. Im, Real-time recognition of facial expressions using facial electromyograms recorded around the eyes for social virtual reality applications, IEEE Access PP (2020) 1–1. doi:10.1109/ACCESS.2020.2983608.
 [9] L. A. Rutter, D. J. Norton, T. A. Brown, Visual attention toward emotional stimuli: Anxiety symptoms correspond to distinct gaze patterns, PLOS ONE 16 (2021) e0250176.
[10] M. Garner, K. Mogg, B. P. Bradley, Orienting and maintenance of gaze to facial expressions in social anxiety, Journal of Abnormal Psychology 115 (2006) 760–770. doi:10.1037/0021-843x.115.4.760.
[11] A. T. Wieckowski, N. N. Capriola-Hall, R. Elias, T. H. Ollendick, S. W. White, Variability of attention bias in socially anxious adolescents: Differences in fixation duration toward adult and adolescent face stimuli, Cognition and Emotion 33 (2019) 825–831.
[12] Wolf3D, Cross-game avatar platform for the metaverse, https://readyplayer.me/, accessed January 14th, 2022.
[13] M. Shin, S. J. Kim, F. Biocca, The uncanny valley: No need for any further judgments when an avatar looks eerie, Computers in Human Behavior 94 (2019) 100–109.
[14] P. Ekman, W. P. Friesen, Measuring facial movement with the Facial Action Coding System, in: P. Ekman (Ed.), Emotion in the Human Face, second ed., Cambridge University Press, 1982, pp. 178–211.
[15] P. Ekman, E. Rosenberg, What the Face Reveals: Basic and Applied Studies of Spontaneous Expression Using the Facial Action Coding System (FACS), second ed., Oxford University Press, 2005.
[16] B. Farnsworth, Facial Action Coding System (FACS) - a visual guidebook, 2021. URL: https://imotions.com/blog/facial-action-coding-system/.
[17] Mixamo, 2022. URL: https://www.mixamo.com/#/.