=Paper= {{Paper |id=Vol-3297/paper6 |storemode=property |title=How Anxiety State and Acceptance of an Embodied Agent Affect User Gaze Patterns |pdfUrl=https://ceur-ws.org/Vol-3297/paper6.pdf |volume=Vol-3297 |authors=Nermin Shaltout,Diego Vilela Monteiro,Monica Perusquia-Hernandez,Kiyoshi Kiyokawa,Jason Orlosky |dblpUrl=https://dblp.org/rec/conf/apmar/ShaltoutMPKO22 }} ==How Anxiety State and Acceptance of an Embodied Agent Affect User Gaze Patterns== https://ceur-ws.org/Vol-3297/paper6.pdf
How Anxiety State and Acceptance of an Embodied Agent
Affect User Gaze Patterns⋆
Nermin Shaltout1,2,*,† , Diego Vilela Monteiro3 , Monica Perusquía-Hernández2 , Jason Orlosky1,4
and Kiyoshi Kiyokawa2
1 Osaka University, 1-32 Machikaneyama, Toyonaka, Osaka 560-0043, Japan
2 Nara Institute of Science and Technology, 8916-5, Takayama, Ikoma, Nara 630-0192, Japan
3 École Supérieure d’Informatique Electronique Automatique, 38 Rue des Docteurs Calmette et Guérin, 53000 Laval, France
4 Augusta University, School of Computer and Cyber Sciences, 100 Grace Hopper Ln, Augusta, GA 30901, USA


Abstract
In virtual reality (VR), how users interact with embodied agents when they are anxious, or when they do not accept an agent, is not yet completely understood. Gaze can be indicative of a user's anxiety and of their acceptance of an embodied agent, and an agent's expressions or actions can, in turn, be used to accommodate the user's anxiety. Previous work on social anxiety disorder (SAD) found evidence of avoidant or hyper-vigilant gaze patterns in relation to the agents or people participants were gazing at. We therefore investigated whether individuals without SAD who are anxious in the moment show specific gaze patterns when gazing at an embodied agent, focusing mostly on avoidant gaze patterns. Based on evidence of gaze patterns in SAD and autism, we designed an experiment in which normative individuals interact with an agent showing neutral, happy, and angry expressions. We aim to examine whether anxious normative participants have gaze or avoidance patterns similar to those of people with SAD. We also investigated whether the user's acceptance of, or preference for, the virtual agent's display of emotions had an effect on avoidance as expressed through eye gaze. In particular, we examined the user's gaze patterns in relation to the agent's eyes, face, and body to see whether there were similarities to people with SAD. Using correlation analysis, we found a significant positive correlation between the participant's acceptance of the virtual agent's expression and their fixations on the agent's eyes, as well as a significant correlation between fixations on the agent's body and how anxious the participant was at the experiment's start. These results can later be used to find a link between acceptability, anxiety, and SAD.

Keywords
Virtual Reality, Embodied Agent, Eye Gaze, Anxiety



APMAR’22: Asia-Pacific Workshop on Mixed and Augmented Reality, Dec. 02-03, 2022, Yokohama, Japan
* Corresponding author.
Email: nermeena@gmail.com (N. Shaltout); diego.vilelamonteiro@esie.fr (D. V. Monteiro); m.perusquia@is.naist.jp (M. Perusquía-Hernández); jorlosky@augusta.edu (J. Orlosky); kiyo@is.naist.jp (K. Kiyokawa)
ORCID: 0000-0002-1570-3652 (D. V. Monteiro); 0000-0002-0486-1743 (M. Perusquía-Hernández); 0000-0002-0538-6630 (J. Orlosky); 0000-0003-2260-1707 (K. Kiyokawa)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org, ISSN 1613-0073).

1. Introduction

In the field of virtual reality (VR), embodied agents are commonplace as non-player characters (NPCs) or as other users' avatars in-game. Thus, determining how individuals react to virtual agents is an important topic in the field [1]. The adaptation of embodied agent or avatar facial expressions can influence user behavior [2], particularly for those who might use the agents for learning [3], social support, or feedback [4]. Previous studies assessed the gaze patterns of individuals in social situations to understand psychological and emotional patterns. Such studies are used to better understand and train people with disorders such as high social anxiety, and were usually conducted using still photographs [5]. Individuals with Social Anxiety Disorder (SAD) react differently to facial displays of emotion; this also happens in VR, independently of avatar fidelity [6]. Little is known, however, about how the anxiety of normative individuals (i.e., those without SAD) affects gaze behaviour with respect to facial displays of emotion. Exposure to virtual situations has risen with new platforms such as VR, and increased further with the advent of COVID-19. Studying the effects of anxiety on interactions with virtual embodied agents is thus important both for SAD and for anxious normative individuals.

VR offers the possibility of presenting dynamic facial stimuli with a wealth of parameters, leading to detailed descriptions of the facial movements required to convey a socio-affective message accurately [7]. Furthermore, the advent of biosensors allows real-time reproduction of facial expressions from other users [8]. Moreover, given the general public's increased interest in the metaverse since the 2019 pandemic, we believe it is timely to study the effects of VR agents' facial expressions on user gaze.

Thus, this study aims to explore different gaze parameters and their effectiveness in determining how comfortable the user is with the agent as it presents different facial
expressions, which might be an alternate method for measuring reactions towards VR agents. The design of the study was inspired by previous work on individuals with SAD. Individuals with SAD usually do not deal well with emotions presented on the face: they tend to avert their gaze when faced with emotional people or representations of them, the avoidance may increase with negative emotions, and they especially avoid looking at the eyes of individuals displaying emotions [9]. Thus, we analyzed the effect of the user's anxiety and acceptance, when confronted with an embodied agent showing different emotions, on the user's eye gaze patterns.

We hypothesize that gaze location on the agent can be used to measure the degree of comfort towards an agent's facial expression, or the degree of user anxiety. To this aim, eye gaze was measured using the VIVE Pro Eye tracker while participants looked at a VR agent with varying expressions. The main contributions of this paper are:

     • Analyzing the correlation between the general anxiety of a normative user and their gaze patterns on the embodied agent.
     • Analyzing the correlation between the user's acceptance of the embodied agent's emotional display and their gaze patterns on the embodied agent.
     • Comparing the findings to those reported for SAD in similar studies.

2. Prior Work and Hypotheses

2.1. Gaze Analysis Studies Related to Social Anxiety

Eye metrics are promising tools to assess attitudes towards virtual agents. The main inspiration for this study stems from gaze analysis of individuals with high social anxiety (HSA) towards the facial expressions of other individuals in social situations. Previous gaze studies showed that individuals with HSA averted their gaze when shown photos of individuals expressing positive or negative emotions [5, 10]. In such studies, static photos of people presenting happy, sad, and neutral facial expressions were commonly shown while gaze directions and fixations were measured. Thus, we adapted our study to find a relation between gaze direction on an embodied agent displaying emotions and the user's acceptance of the emotional display.

Wieckowski et al. explored variability in bias toward social stimuli, in the form of vigilant and avoidant attention, using eye-gaze techniques instead of the traditional probe-task technique often used to study attention bias in anxious youth with clinical SAD [11]. The visual dot-probe task involves allowing the users to select between two agent pairs (e.g., angry/neutral, happy/neutral) using their eye gaze. Participants show both avoidance and hyper-vigilance depending on the age group, the agent display, and the passage of time during the trial. The bias is measured with the duration of fixations towards angry faces and towards more pleasant faces such as neutral or happy faces; the fixation duration on neutral faces is subtracted from that on angry faces to yield a negative or positive bias. Based on the above studies, we hypothesize that two factors affect the gaze patterns of individuals with SAD: the level of anxiety of the person when they are looking at the face, and whether or not they accept the facial expression of the embodied agent. We would also like to observe whether this applies to normative individuals and whether it mimics those with SAD.

2.2. Hypotheses

Our hypotheses are as follows.

H1 The agent's facial expressions have an effect on the participant's self-reported arousal and valence.

H2 The user's acceptance or preference of the agent's emotional display can be observed in the eye fixation patterns on the agent.

H3 The overall anxiety state of the participants could alter the eye fixation patterns on the agent.

For H1, the Self-Assessment Manikin (SAM) was used to assess whether the facial expressions had an effect on the participant's affect. Regarding H2, when participants answered the questionnaires described in Sec. 3.6 about the agent's different emotional displays, not all of them accepted the agent's emotional display in the same manner; for example, while some people highly disliked the happy face, others were comfortable with it. We assessed whether there is a pattern between the acceptability of the emotional display and the number of fixations on the agent. For H3, we assessed whether there is a relation between the anxiety of the participant and their fixation behaviors on the agent's different body parts. We expected the more anxious participants to avoid the agent's face and eyes. The anxiety state in this case is the user's default state before and during the experiment.

3. Experiment

3.1. Participants

A total of 21 student volunteers in their early twenties participated in the study: 10 Japanese, 1 Kenyan, 1 German, 2 Nepali, 1 Colombian, 4 Chinese, 1 Thai, and 1 Malaysian. No participant had tried our system before. The
participants were asked to wear their glasses if their vision was poor. Sources of error were accounted for by removing three participants with missing data (e.g., the VIVE Pro Eye tracking was accidentally disabled for one of the faces). Three participants who were extremely fatigued were excluded using a fatigue score in the pre-questionnaire. After the exclusions, the number of participants was 15. The experiment was approved by the ethics committee of our institution.

Figure 1: A diagram showing the resulting avatar (right) created when using an average face (left) on the Ready Player Me avatar creator (https://bit.ly/34C2G).

3.2. Experiment Design

We tested the participants' eye gaze patterns when presented with different facial expressions from a humanoid agent in VR. There were three conditions, corresponding to three facial displays expressed by the virtual agent: a happy, a neutral, and an angry facial expression. The conditions were presented in a within-subjects design, i.e., each participant saw all three faces. We chose to give the agent only facial expressions, to avoid confounding factors caused by other agent behaviors.

3.3. Procedure

The participants were exposed to each of the agent's facial expressions for one minute at a time. There were three runs in total, one for each facial expression. The participants were seated in front of the agent so as to be at the same height as the agent, and faced the agent head-on without an angle. Before every run, the agent was adjusted to the same height as the participant. The participants remained seated throughout the experiment and were encouraged to use only gaze and head movements. A full-body agent was used so that the participants could freely choose whether to gaze at the agent's face, its body, or outside of the agent completely. The participants answered questionnaires pre- and post-experiment and after each avatar was displayed; details are given in the measurements section.

3.4. Stimuli

The agent was designed using Ready Player Me [12], a tool that converts a photograph of a person into an avatar with similar facial features. It is used by players to make agents of themselves in-game and is currently most popular on platforms such as VR chatting programs. Ready Player Me is also equipped with the ability to map the user's emotions onto the agent via eye and facial tracking. The Ready Player Me avatar is based on FACS and is usually embodied by people in VR, but we controlled it using the animation module of Unity to gain more control over the experiment.

The possibility of an uncanny valley in the virtual agent's appearance is higher with agents that are hyper-realistic [13]. Thus we used a semi-realistic avatar.

We used an average Asian face to accommodate the majority-Asian demographic involved in the experiment. Figure 1 shows the resulting avatar when inputting an average Asian face to the Ready Player Me interface. Ready Player Me was used because its low-poly characteristics make it more likely to be used by people in current virtual chats and metaverse settings.

Though Ready Player Me characters are usually used as avatars in VRChat, we used one in this case to animate the agent, as if it were an example user in VR. To animate the happy, angry, and neutral emotions, the Facial Action Coding System (FACS) was used [14]. The FACS provides action units (AUs) for coding facial movements without making inferences about the underlying emotions. It is a popular tool in emotion studies, used either to create faces with a certain expression or to interpret a facial expression. The FACS is now incorporated into most VR chat avatars to enable them to express emotions by encoding certain AU movements, and Ready Player Me avatars come equipped with most of the values available in the FACS. We focused on prototypical AUs according to the Basic Emotion Theory [15] to animate the avatar, together with the guidelines described in Farnsworth's visual FACS guide [16]. For instance, to animate a happy face, we used AU 6 (cheek raiser, Fig. 2) and AU 12 (lip corner puller) with values of 1.0. The agent's default face, with some minor adjustments, was used to represent the neutral face: because Ready Player Me avatars are designed to look slightly happy by default, we adjusted the brow lowerer (AU 4) and the lip corner depressor (AU 15) to bring the avatar back to a neutral state. The facial expressions were animated using blend shapes. Three separate faces were shown to the same participant; we refer to them as three different trials, with questionnaires in between. All facial expressions start from the neutral expression, and it took one second for the facial animation to reach its maximum intensity. The avatar's expression then remained constant for the duration of the trial, i.e., one minute per trial, within which we measured the participant's fixations. We do not consider the initial animation to have a detrimental effect on the gaze patterns or number of fixations; thus we did not account for this baseline when observing gaze patterns.
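The expression timing above (a one-second onset ramp to full AU intensity, followed by a constant hold for the one-minute trial) can be sketched as a blend-shape weight schedule. This is a minimal illustration in Python rather than the actual Unity animation; the AU key names, the data layout, and the linear ramp shape are our own assumptions, with the happy-face AU choice (AU 6 + AU 12 at 1.0) taken from the text.

```python
# Blend-shape weight schedule for one trial: the expression ramps
# linearly from neutral (0) to full intensity (1.0) over the first
# second, then holds at full intensity for the rest of the 60 s trial.

RAMP_S = 1.0    # onset duration (s): time to reach maximum intensity
TRIAL_S = 60.0  # trial length (s): expression held until the trial ends

# Target AU weights per expression. "happy" follows the text
# (AU 6 cheek raiser + AU 12 lip corner puller, both at 1.0);
# "neutral" is the (adjusted) default face, so no extra AUs.
EXPRESSIONS = {
    "happy": {"AU6_cheek_raiser": 1.0, "AU12_lip_corner_puller": 1.0},
    "neutral": {},
}

def blend_weights(expression, t):
    """Return the blend-shape weights at time t (seconds) into a trial."""
    targets = EXPRESSIONS[expression]
    if t <= 0.0:
        scale = 0.0            # trial not started: neutral face
    elif t < RAMP_S:
        scale = t / RAMP_S     # linear onset over the first second
    elif t <= TRIAL_S:
        scale = 1.0            # hold at maximum intensity
    else:
        scale = 0.0            # trial over: back to neutral
    return {au: w * scale for au, w in targets.items()}
```

At t = 0.5 s the happy face is at half intensity; from t = 1 s to the end of the minute it stays at the full AU values.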
Figure 2: A diagram showing action unit 6, cheek raiser (left two images; permission to use the image was obtained from iMotions). The rightmost two images show AU 6 applied to the Ready Player Me avatar used in the experiment as an agent (https://bit.ly/3bKN9a).

Figure 3: (a) Cube used for calibrating the eye gaze pre-experiment; (b) height-adjusting the participant while obstructing the avatar's face; (c), (d) examples of the neutral and angry faces used for the experiment. The gaze ray is shown for illustrative purposes only and was omitted in the actual run of the experiment. The action units (AUs) for both the happy and angry faces were set to 1 to obtain the maximum effect.

To reduce the chances of a participant experiencing the uncanny valley, the avatar's blinking was animated, with the interval between blinks randomized between 0.5 and 4 seconds. The agent's gaze was fixed during the run of this experiment. The agent was also given a breathing animation using Mixamo to give it a more realistic feel [17]. No interactions other than the varying facial expressions were added to the agent: in this experiment we focused on the relationship between the user's anxiety and the user's gaze patterns rather than on designing a complex interaction system, so a simple design was used. Future work will feature a more interactive avatar.

3.5. Calibration

To ensure that eye and facial feature tracking worked correctly, we used a cube display to calibrate and confirm the participant's gaze prior to the experiment's start. The agent's height was adjusted to match the participant's height in every trial. The agent's face was only revealed once the VIVE Pro Eye was calibrated, as shown in Fig. 3. We calibrated each participant's eye gaze using the VIVE's internal calibration software before running the experiment. The participant was then positioned to see the front view of the agent, but was free to move their head and gaze and to look either at the avatar's face or body for the duration of the one-minute trial. No background objects were visible, so that the participant would focus on the agent. Participants were seated during the experiment, received no instructions as to where to look when facing the agent, and were left to interact naturally with the agent using eye and head movements once the experiment started. The participant's location did not change.

3.6. Measurements

Eye Metrics We implemented a gaze ray to determine the intersection of the participant's gaze with the avatar. To determine the number of collisions between the gaze ray and parts of the avatar, we added colliders to the face, eyes, and body (Fig. 5). When the ray did not collide with the avatar, the sample was recorded as ‘other.’ We defined a constant gaze on the same body part for 150 ms as a “fixation.” The number of fixations on each body part was counted within the one minute that the participant looked at the avatar. Fixations were collected for each facial expression (neutral, angry, and happy), and the process was repeated per participant. By analyzing the number of fixations on each body part, we expected to find patterns related to the participant's current state, the participant's preference for the face, and the facial expression that the avatar displayed to the participant.

Figure 4: The face mask (highlighted on the left) used to detect collisions/fixations.

Additionally, we added a face mask, not visible to the user, to roughly measure the number of fixations on the face. We counted the number of fixations surpassing 150 ms on the colliders added to this face mask, and divided the fixations into an upper-face region and a lower-face region: any collider above the lower eye was considered upper face, while colliders below were considered lower face. If the upper-face collisions are higher in the results, the user accepts the avatar; if the lower-face collisions are higher, the user rejects the avatar, according to the source.

Questionnaires The participants answered questionnaires at different points during the experiment. Some questionnaires, such as the SAM and the questionnaires about the acceptability of the avatar, were repeated more than once. To identify at which point of the experiment the participant answered each questionnaire, we assigned codes as follows:

B : Before seeing any avatar.

AH : After seeing the happy facial expression agent.
AA : After seeing the angry facial expression agent.

AN : After seeing the neutral facial expression agent.

All three avatars were shown to each participant in a counterbalanced order. The following permutations were used: (AA, AH, AN); (AA, AN, AH); (AH, AA, AN); (AH, AN, AA); (AN, AA, AH); (AN, AH, AA).

Figure 5: How the number of fixations was measured using colliders placed on the avatar: (a) headf, the number of times the participant fixated on the avatar's head; (b) bodyf, the number of times the participant fixated on the avatar's body; (c) eyef, the number of times the participant fixated on the avatar's eyes; and (d) other, the number of fixations outside of the avatar. Collisions on the avatar were detected using convex colliders.

Pre-questionnaire Before the experiment, participants reported their demographics and their current fatigue and anxiety. They also reported their general anxiety level on a 9-point Likert scale.

Self-Assessment Manikin (SAM) A SAM questionnaire with a 9-point Likert scale was used to measure valence, arousal, and dominance; Fig. 6 shows a sample of the valence and arousal questionnaires used. The SAM was presented before the experiment and after every face, to measure how the avatar affected the participant.

Acceptability Questionnaire This was a more detailed questionnaire about how people felt about the avatar, coded AH, AA, or AN after the corresponding facial expression. It consisted of several questions answered on a 9-point Likert scale, scored from 1 to 9 to match the format of the SAM. From this questionnaire, we only used two questions for further analyses:

Q1 : I felt comforted by the avatar.

Q2 : I felt disturbed by the avatar.

Figure 6: The [a] Valence and [b] Arousal questionnaires from the SAM.

The acceptability questionnaire was used to measure the participant's acceptance (contextual comfort) of the virtual agent's expression after each trial; it was answered three times per participant, once after each face. From its answers, we created a variable called acceptability, computed as follows:

Acceptability = Score(Q1) − Score(Q2)

A positive or zero value indicated that the agent was accepted, while a negative value indicated that the agent was rejected by the user. We calculated the acceptability per face per participant. This score was then used to test for correlations between the acceptability of each facial expression and the participant's fixations on different locations of the embodied agent (head, body, eyes, etc.).

The acceptability questionnaire was conducted after the participant viewed each expression of the agent for one minute. For instance, after the participant views the happy expression for one minute, the experiment stops and the participant answers the SAM and acceptability questionnaires; this procedure is then repeated for the other two expressions. The questionnaires are not validated, which might be a limitation of this study.

4. Analysis and Results

Despite being given no instructions on where to direct their gaze, the majority of participants gazed at the agent's face or body. However, the gaze patterns changed according to the facial expression shown, the participant's acceptability score, and the anxiety score, as detailed below. Deviations away from the face changed according to the acceptability and anxiety scores.

SPSS was used to analyze the data. To test H1, we measured whether the agent's facial expression had an effect on the
participant's affect, as follows. The SAM questionnaire was taken four times, labelled with the experiment codes described in the Questionnaires portion of Sec. 3.6: before the experiment (B), after the neutral agent (AN), after the happy agent (AH), and after the angry agent (AA), for each participant.

A Shapiro-Wilk test showed a significant departure from normality, with W(98) = 0.94, p < 0.01; W(98) = 0.94, p < 0.01; and W(98) = 0.95, p < 0.01 for valence, arousal, and dominance, respectively. We thus ran the Kruskal-Wallis test as a non-parametric test of whether there were significant differences between conditions B, AH, AA, and AN. Kruskal-Wallis rank-sum tests carried out on valence, arousal, and dominance showed statistically significant differences between the different embodied agent expressions and the baseline in valence (χ²(3) = 21.48, p < 0.001) and arousal (χ²(3) = 8.867, p < 0.05). The mean rank valence scores were 41.07, 45.16, 28.67, and 62.5 for B, AN, AH, and AA, respectively. For arousal, the mean rank scores were 39.98, 41.61, 57.6, and 37.84 for B, AN, AH, and AA, respectively. The results for dominance showed no statistical significance.

(Fig. 6 panel notes. [a] Valence: a higher value means that a person is content, while a lower value means that a person is upset. [b] Arousal: measures the state of alertness; a higher value indicates higher alertness.)

Pairwise Wilcoxon rank-sum tests with Benjamini-Hochberg (BH) p-value adjustment were carried out for the SAM valence and arousal scores, since these showed significance. The experiment codes (B, AN, AH, AA) represent the experiment stages as detailed in Sec. 3.6. The score pairs (B, AN), (B, AH), and (B, AA) were compared to each other, representing the comparison between the baseline state and the participant's state after viewing agents with different emotions, to test whether the different emotional agents had a significant effect on the participant's affect. The differences between the scores of the face pairs (AN, AH), (AN, AA), and (AH, AA) were also compared.

Figure 7: Boxplot showing [a] Valence and [b] Arousal values entered by participants after viewing each facial expression (AA, AH, AN) compared to the baseline, B. The starred brackets show statistically significant pairings from the Wilcoxon rank-sum tests.

The significant results for valence and arousal are summarized using starred brackets in Fig. 7 and were as follows. Valence was higher after introducing the happy agent (AH) compared to the baseline (B) (p < 0.05, Cohen's d = 0.25), and conversely lower than the baseline after introducing the angry agent (AA) (p < 0.01, Cohen's d = 0.25). There was no significant difference in valence between the baseline condition and the neutral agent (AN). The valence score was also lower after viewing the angry face than after the neutral face (p < 0.05, Cohen's d = 0.25) and, conversely, higher after the happy face than after the neutral face (p < 0.05, Cohen's d = 0.25). The differences in valence values were also significant for (AA, AH) with (p < 0.01, Cohen's d = 0.25) and (AN, AH) with (p < 0.01, Cohen's d = 0.25); N = 15 for all tests concerning valence and arousal.

We tested H2 and H3 to see whether anxiety and acceptability affect where the participant looks most on the avatar, and whether participants avoid the agent's face or eyes as a result. We compared these results to previous findings for socially anxious individuals. Kendall's τb correlation was used for the correlation tests, since the sample size is small and the normality tests showed that the sample did not follow a normal distribution, as before.

To test H2, when running the tests on all participants, we found no correlation between the acceptability score and the number of fixations on any portion of the agent. We ran another Kendall's τb correlation using only anxious participants.

Table 1
Correlation values between the acceptability score and fixations on different portions of the agent for N = 13 (anxious participants) (*: p < 0.05, **: p < 0.01).
pair was significant (𝑝 < 0.01, Cohen’s 𝑑 = 0.25).                                 headf    bodyf     eyesf    other
    The results for arousal were as follows: the difference     bodyf                 .04
between arousal values for the following score pairs with       eyesf                .51*    −.22
the following experiment codes showing statistical sig-         other               .59**      .21       .16
nificance: (B, AH) with (𝑝 < 0.05, Cohen’s 𝑑 = 0.25);           Acceptability         .31      .03     .51*      .22
ious participants. The results are summarized in Table 1.
A positive correlation between the participant’s accept-
ability score and the number of fixations on the eye were
found which was statistically significant (𝜏 𝑏 = 0.510,
𝑝 < 0.05). The acceptability score was standardized
between −1 and 1.
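The testing pipeline above (a Shapiro-Wilk normality check, a Kruskal-Wallis omnibus test, then pairwise Wilcoxon rank-sum tests with Benjamini-Hochberg adjustment) can be sketched with SciPy as follows. The ratings are hypothetical placeholders and the `benjamini_hochberg` helper is written here for illustration; this is not the study's analysis code.

```python
# Sketch of the non-parametric pipeline described above, using SciPy.
# The ratings here are illustrative stand-ins, NOT the study's data.
from itertools import combinations
from scipy import stats

# Hypothetical SAM valence ratings (1-9) per condition, N = 15 each.
ratings = {
    "B":  [5, 4, 6, 5, 5, 4, 6, 5, 4, 5, 6, 5, 4, 5, 5],
    "AN": [5, 5, 4, 6, 5, 5, 4, 5, 6, 5, 5, 4, 5, 5, 6],
    "AH": [7, 6, 8, 7, 6, 7, 8, 6, 7, 7, 6, 8, 7, 7, 6],
    "AA": [3, 2, 4, 3, 3, 2, 3, 4, 3, 2, 3, 3, 4, 2, 3],
}

# 1) Shapiro-Wilk: test the pooled scores for departure from normality.
pooled = [x for xs in ratings.values() for x in xs]
w, p_norm = stats.shapiro(pooled)
print(f"Shapiro-Wilk: W = {w:.3f}, p = {p_norm:.4f}")

# 2) Kruskal-Wallis omnibus test across the four conditions.
h, p_kw = stats.kruskal(*ratings.values())
print(f"Kruskal-Wallis: H = {h:.3f}, p = {p_kw:.4f}")

# 3) Pairwise Wilcoxon rank-sum tests, then Benjamini-Hochberg adjustment.
pairs = list(combinations(ratings, 2))
raw_p = [stats.ranksums(ratings[a], ratings[b]).pvalue for a, b in pairs]

def benjamini_hochberg(pvals):
    """Return BH-adjusted p-values (step-up procedure)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    prev = 1.0
    for rank, i in enumerate(reversed(order)):
        k = m - rank                      # 1-based rank of this p-value
        prev = min(prev, pvals[i] * m / k)
        adjusted[i] = prev
    return adjusted

for (a, b), p_adj in zip(pairs, benjamini_hochberg(raw_p)):
    print(f"({a}, {b}): BH-adjusted p = {p_adj:.4f}")
```

Rank-based tests are used throughout because the Shapiro-Wilk step rejects normality; the BH step controls the false discovery rate across the six pairwise comparisons.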
   To investigate H3, a Kendall's τb correlation was run to determine the relationship between the participant's anxiety score and the number of fixations on a certain portion of the avatar, as shown in Table 2, with N = 45, regardless of the face used. There was a strong, positive correlation between the participant's anxiety score and the number of fixations on the agent's body per minute, which was statistically significant (τb = 0.30, p < 0.001). Additionally, there was a strong negative correlation between the participant's anxiety score and the number of fixations on the agent's eyes (τb = −0.22, p < 0.01). The results are summarized in Table 2.

Table 2
Correlation values between anxiety score and fixations on different portions of the agent for N = 45 (*: p < 0.05, **: p < 0.01)

                   headf    bodyf     eyef    other
   Anxiety Score    −.12    .30**    −.22*      .20
   headf                   −.32**    .56**     −.03
   bodyf                            −.50**     .24*
   eyef                                         .27*

   A Kendall's τb correlation was also run to determine the relationship between the anxiety score of the participant and the number of fixations on a certain section of the agent, among 15 participants, per facial expression, to observe whether there were any patterns for specific facial expressions. We found no significant correlations between the anxiety score and the body fixations when presenting the participants with the happy face. However, we found a strong correlation between the number of body fixations (bodyf) and the anxiety score when participants were presented with the neutral face, which was statistically significant (τb = 0.411, p < 0.05). There was also a strong, statistically significant correlation between the number of fixations detected outside the agent and the anxiety score when participants viewed the angry face (τb = 0.420, p < 0.05). The values were Benjamini-Hochberg corrected. The results are summarized in the scatter plots in Fig. 8.

Figure 8: Scatter plots illustrating the correlation between the participant's anxiety score and (a) the number of fixations detected outside of the agent's face when observing the avatar with the angry expression, and (b) the number of fixations on the agent's body when observing the agent with the neutral expression, over one minute.

   We ran two additional Kendall's τb correlation tests to see if there was a correlation between the anxiety score and whether the participant gazed at the upper or lower part of the avatar's face, first regardless of the face with N = 45, then per facial expression (angry, neutral, happy) with N = 15. Regardless of the face presented, there was a strong negative correlation between the anxiety score and the upper face, with statistical significance (τb = −0.263, p < 0.05). For the results per facial expression, only the happy expression showed a strong negative correlation between the anxiety score and the upper face, with statistical significance (τb = −0.408, p < 0.05).


5. Discussion

In H1, we suggested that the agent affects the user's valence and arousal. The results in Fig. 7 show that the agent has a significant effect on the valence and arousal of the user according to the emotion displayed. This shows that the model of the avatar actually works to affect the participant. The differences in the participant's valence and arousal when viewing the neutral agent compared to the baseline were not statistically significant. This is expected, because no emotion was conveyed with the neutral facial expression. The angry and happy facial expressions significantly affected the valence and arousal of the participants compared to the baseline. This showed that the facial expressions of the agents affected the participants' valence and arousal. The angry facial expression also induced a lower valence score, while the happy facial expression induced a higher valence. This confirmed that the facial expressions of the avatar were perceived correctly, and supports H1.
   In H2, we predicted that the user's acceptability of the agent's emotions affects gaze patterns on the agent. SAD individuals do not want to confront the emotions of others, thus exhibiting behaviors such as avoiding the face or being hyper-vigilant to cope with emotional displays.
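A Kendall's τb correlation analysis against fixation counts, like those reported in Tables 1 and 2, can be reproduced in outline with SciPy, whose `kendalltau` computes the tie-corrected τb variant by default. The anxiety scores and fixation counts below are hypothetical placeholders, not the study's measurements.

```python
# Sketch of a Kendall's tau-b correlation between anxiety scores and
# fixation counts per agent region, as in Tables 1 and 2. SciPy's
# kendalltau computes the tie-corrected tau-b variant by default.
# All data here are hypothetical placeholders, not the study's data.
from scipy import stats

# Hypothetical per-participant anxiety scores and fixations per minute
# on each region of the agent (headf, bodyf, eyesf, other).
anxiety = [12, 30, 25, 8, 40, 22, 35, 15, 28, 18, 33, 10, 26, 20, 38]
fixations = {
    "headf": [14, 9, 10, 15, 6, 11, 8, 13, 9, 12, 7, 16, 10, 11, 6],
    "bodyf": [3, 10, 8, 2, 14, 7, 12, 4, 9, 5, 11, 2, 8, 6, 13],
    "eyesf": [9, 4, 5, 10, 2, 6, 3, 8, 5, 7, 3, 11, 5, 6, 2],
    "other": [1, 5, 4, 1, 7, 3, 6, 2, 4, 2, 5, 1, 4, 3, 6],
}

for region, counts in fixations.items():
    tau, p = stats.kendalltau(anxiety, counts)   # tau-b by default
    flag = "*" if p < 0.05 else ""
    print(f"{region}: tau_b = {tau:+.3f}, p = {p:.4f}{flag}")
```

A rank correlation such as τb is the appropriate choice here, rather than Pearson's r, because the samples are small and the normality tests failed; the resulting p-values could then be Benjamini-Hochberg adjusted as in the valence and arousal analysis.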
   In addition to the anxiety score, we measured whether the participant's acceptance or rejection of the agent's emotions played a factor in avoidant gaze patterns. We created the acceptability score as a measure of the individual participant's preference as the agent's facial expressions were varied. Individuals with SAD are anxious at the same time as they do not accept facial expressions. Thus, we measured whether there is a correlation between the acceptability score for anxious participants and the number of fixations on specific parts of the agent. A positive correlation was found between fixations on the agent's eyes and the acceptability score in anxious participants. This shows that even anxious participants are more likely to gaze at the agent's eyes if they accept the agent's facial expression. Conversely, the opposite is true: the person avoids the gaze of the agent if they find the agent's emotions unacceptable. This matches the literature on SAD individuals avoiding eye gaze for displays of affect, and supports H2.
   The acceptability score can be used to change how agents' faces are studied. It can be used in the future to study the root cause of SAD, or even cultural differences in social norms, when both accepting and reacting to varying facial expressions. The acceptability measure can also be used as an extra factor in future SAD studies.
   For H3, we hypothesized that the user's overall anxiety affects gaze patterns on the agent. In this study, we took the number of fixations on each part of the agent as an indication of acceptance or avoidance of the facial expressions of the avatar. On one hand, an increased number of fixations on the eyes or head suggests that the participant accepts the avatar. This is consistent with previous studies [5, 10]. On the other hand, an increased number of fixations on the body or outside of the avatar indicates that the user is avoiding the agent.
   We also explored the relationship between agent avoidance and the participants' anxiety level. The initial anxiety score represents an approximation of the participants' overall state before facing the agent. We found significant correlations between the user's initial anxiety and body fixations, invariant of facial expression, as shown in Table 2. When analyzing the correlations on a per-facial-expression basis, there was a strong correlation between the anxiety score and the number of fixations on the body of the agent or outside of the body when presented with the neutral and angry expressions, respectively. When viewing the angry expression, the fixations were completely outside the body, showing that the participant is more likely to avoid the agent completely the more negative the expression is.
   We also found a strong negative correlation between the participants' anxiety score and the number of fixations on the agent's eyes, invariant of facial expression. This indicates that the participant avoids the agent's eyes and is more likely to look at the agent's body or outside of the agent when anxious, matching the pattern of those with SAD as per [5, 10], and supports H3.
   When analyzing the correlation between the anxiety score and the user's fixations on the upper and lower parts of the face for all the faces, we found a strong negative correlation between the upper face and the anxiety score. This indicated that the anxious users avoided the agent's eyes, as cited in [5, 10]. When analyzing the correlations per facial expression, we only found a strong negative correlation between user fixations on the upper face and the user's anxiety score with the happy agent. This is probably because the participants were otherwise avoiding the face for the neutral and angry agents. This indicates that anxious users are more likely to look at the agent's face if it has positive affect, despite avoiding the eyes.


6. Conclusions

The study observes the correlation between the anxiety of normative users, their acceptability of the agent's emotions, and the respective gaze patterns on an agent with varying emotions. Our results suggest that individuals with anxiety in the moment have gaze patterns similar to those with SAD. The similarities between normative individuals facing anxiety in the moment, their gaze patterns, and their acceptability of the agent can unlock a better understanding of the way SAD individuals operate. The techniques used for SAD individuals can also be used to accommodate normative anxious individuals facing social situations in VR.
   In this model, the participant is allowed to look away from the face to other sections of the embodied agent, including the body, which emulates an actual social situation in VR. Whether the users gazed at the upper or lower parts of the agent's face was also analyzed. The more negative the expression, the further the anxious participant strayed from the agent's eyes, then the face, and then the entire body, respectively. These findings are an important indication for designing future systems. For example, nonverbal agents with positive affect might be a better choice for an anxious normative individual, as participants were still more likely to gaze at the face regardless of their anxiety. The studies also show that users are more likely to gaze at the agent's eyes if they accept the agent's emotional display.
   A Metaverse VR avatar and a VIVE tracker were used in the experiment. The technique can easily be applied in a more ecologically valid setting to find avoidance patterns of anxious users in real time. This is useful for adjusting a non-verbal agent's expression to accommodate the user's anxiety and facial preference. Other sensors, e.g., a heart rate sensor or a facial tracker, can also be added to find stronger patterns. The studies suffer from some limitations due to the limited participant count and non-validated questionnaires.
References

 [1] D. Monteiro, H.-N. Liang, J. Wang, L. Wang, X. Wang, Y. Yue, Evaluating the effects of a cartoon-like character with emotions on users' behaviour within virtual reality environments, in: 2018 IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR), IEEE Computer Society, 2018, pp. 229–236.
 [2] E. Hasegawa, N. Isoyama, D. V. Monteiro, N. Sakata, K. Kiyokawa, The effects of speed-modulated visual stimuli seen through smart glasses on work efficiency after viewing, Sensors 22 (2022) 2272.
 [3] H. Lee, H. Kim, D. V. Monteiro, Y. Goh, D. Han, H.-N. Liang, H. S. Yang, J. Jung, Annotation vs. virtual tutor: Comparative analysis on the effectiveness of visual instructions in immersive virtual reality, in: 2019 IEEE International Symposium on Mixed and Augmented Reality (ISMAR), IEEE, 2019, pp. 318–327.
 [4] D. Monteiro, H.-N. Liang, H. Li, Y. Fu, X. Wang, Evaluating the need and effect of an audience in a virtual reality presentation training tool, in: International Conference on Computer Animation and Social Agents, Springer, 2020, pp. 62–70.
 [5] K. Roelofs, P. Putman, S. Schouten, W.-G. Lange, I. Volman, M. Rinck, Gaze direction differentially affects avoidance tendencies to happy and angry faces in socially anxious individuals, Behaviour Research and Therapy 48 (2010) 290–294.
 [6] J. H. Kwon, C. Alan, S. Czanner, G. Czanner, J. Powell, A study of visual perception: Social anxiety and virtual realism, in: Proceedings of the 25th Spring Conference on Computer Graphics, 2009, pp. 167–172.
 [7] R. E. Jack, P. G. Schyns, The human face as a dynamic tool for social communication, Current Biology 25 (2015) R621–R634. URL: https://www.sciencedirect.com/science/article/pii/S0960982215006557. doi:10.1016/j.cub.2015.05.052.
 [8] H.-S. Cha, S.-J. Choi, C.-H. Im, Real-time recognition of facial expressions using facial electromyograms recorded around the eyes for social virtual reality applications, IEEE Access PP (2020) 1–1. doi:10.1109/ACCESS.2020.2983608.
 [9] L. A. Rutter, D. J. Norton, T. A. Brown, Visual attention toward emotional stimuli: Anxiety symptoms correspond to distinct gaze patterns, PLOS ONE 16 (2021) e0250176.
[10] M. Garner, K. Mogg, B. P. Bradley, Orienting and maintenance of gaze to facial expressions in social anxiety, Journal of Abnormal Psychology 115 (2006) 760–770. doi:10.1037/0021-843x.115.4.760.
[11] A. T. Wieckowski, N. N. Capriola-Hall, R. Elias, T. H. Ollendick, S. W. White, Variability of attention bias in socially anxious adolescents: Differences in fixation duration toward adult and adolescent face stimuli, Cognition and Emotion 33 (2019) 825–831.
[12] Wolf3D, Cross-game avatar platform for the metaverse, https://readyplayer.me/, accessed January 14th, 2022.
[13] M. Shin, S. J. Kim, F. Biocca, The uncanny valley: No need for any further judgments when an avatar looks eerie, Computers in Human Behavior 94 (2019) 100–109.
[14] P. Ekman, W. P. Friesen, Measuring facial movement with the Facial Action Coding System, in: P. Ekman (Ed.), Emotion in the Human Face, second ed., Cambridge University Press, 1982, pp. 178–211.
[15] P. Ekman, E. Rosenberg, What the Face Reveals: Basic and Applied Studies of Spontaneous Expression Using the Facial Action Coding System (FACS), second ed., Oxford University Press, 2005.
[16] B. Farnsworth, Facial Action Coding System (FACS) - a visual guidebook, 2021. URL: https://imotions.com/blog/facial-action-coding-system/.
[17] Mixamo, 2022. URL: https://www.mixamo.com/#/.