<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>How Anxiety State and Acceptance of an Embodied Agent Affect User Gaze Patterns</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Nermin Shaltout</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Diego Vilela Monteiro</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Monica Perusquía-Hernández</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jason Orlosky</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kiyoshi Kiyokawa</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Augusta University, School of Computer and Cyber Sciences 100 Grace Hopper Ln</institution>
          ,
          <addr-line>Augusta, GA 30901</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>École Supérieure d'Informatique Electronique Automatique, 38 Rue des Docteurs Calmette et Guérin</institution>
          ,
          <addr-line>53000 Laval</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Nara Institute of Science and Technology</institution>
          ,
          <addr-line>8916-5, Takayama, Ikoma, Nara 630-0192</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Osaka University</institution>
          ,
          <addr-line>1-32 Machikaneyama, Toyonaka, Osaka 560-0043</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In virtual reality (VR), the interactions of users with embodied agents when the users are anxious or when they do not accept an agent are not yet completely understood. Gaze can be indicative of the user's anxiety and acceptance of an embodied agent. An agent's expressions or actions can, in turn, be used to accommodate the user's anxiety. Previous work on social anxiety disorder (SAD) found evidence of avoidant or hyper-vigilant gaze patterns in relation to the agents or people the participants were gazing at. Thus, we investigated whether there are specific gaze patterns for normative individuals experiencing anxiety in the moment when gazing at an embodied agent. We focused mostly on avoidant gaze patterns. Based on evidence of gaze patterns in SAD and autism, we designed an experiment where normative individuals interact with an agent showing neutral, happy, and angry expressions. We aimed to examine whether anxious normative participants have gaze or avoidance patterns similar to those with SAD. We also investigated whether the user's acceptance or preference of the virtual agent's display of emotions had an effect on avoidance via eye gaze. In particular, we investigated the user's gaze patterns in relation to the agent's eyes, face, or body to see if there were similarities to people with SAD. Using correlation analysis, we found a significant positive correlation between the participant's acceptance of the virtual agent's expression and their fixations on the agent's eyes. We also found a significant correlation between fixations on the agent's body and how anxious the participant was at the experiment's start. Later, these results can be used to find a link between acceptability, anxiety, and SAD.</p>
      </abstract>
      <kwd-group>
        <kwd>Virtual Reality</kwd>
        <kwd>Embodied Agent</kwd>
        <kwd>Eye Gaze</kwd>
        <kwd>Anxiety</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>In the field of virtual reality (VR), embodied agents are commonplace as non-player characters (NPCs) or as other users (avatars) in-game. Thus, determining how individuals react to virtual agents is an important topic in the field [<xref ref-type="bibr" rid="ref1">1</xref>]. The adaptation of embodied agent or avatar facial expressions can influence user behavior [<xref ref-type="bibr" rid="ref2">2</xref>], in particular for those who might use the agents for learning [<xref ref-type="bibr" rid="ref3">3</xref>], social support, or feedback [<xref ref-type="bibr" rid="ref4">4</xref>]. Previous studies assessed the gaze patterns of individuals in social situations to understand psychological and emotional patterns. These studies are used to better understand and train people with disorders such as high social anxiety and were usually conducted using still photographs [<xref ref-type="bibr" rid="ref5">5</xref>].</p>
      <p>Individuals with Social Anxiety Disorder (SAD) react differently to facial displays of emotion. This also happens in VR, independently of the avatar fidelity [<xref ref-type="bibr" rid="ref6">6</xref>]. Little is known about how the anxiety of normative individuals (i.e., those without SAD) affects gaze behaviour with respect to facial displays of emotion. The exposure of individuals to virtual situations has also risen with new platforms like VR, and increased further with the advent of COVID-19. Studying the effects of anxiety on interactions with virtual embodied agents is thus important, both for individuals with SAD and for anxious normative individuals.</p>
      <p>VR offers the possibility of presenting dynamic facial stimuli with a wealth of parameters, leading to detailed descriptions of the facial movements required to convey a socio-affective message accurately [<xref ref-type="bibr" rid="ref7">7</xref>]. Furthermore, the advent of biosensors allows real-time reproduction of facial expressions from other users [<xref ref-type="bibr" rid="ref8">8</xref>]. Moreover, given the recent increase in the general public's interest in the metaverse after the 2019 pandemic, we believe it is timely to study the effects of VR agents’ facial expressions on the user's gaze.</p>
      <p>Thus, this study aims to explore different gaze parameters and their effectiveness in determining how comfortable the user is with the agent as it presents different facial expressions, which might be an alternative method for measuring the reaction towards VR agents. The design of the study was inspired by previous works conducted on individuals with SAD. Individuals with SAD usually do not deal well with emotions presented on the face. They tend to avoid gaze when faced with emotional people or their representations. The avoidance might increase with negative emotions. They especially avoid looking at the eyes of individuals displaying emotions [<xref ref-type="bibr" rid="ref9">9</xref>]. Thus, we analyzed the effect of the user’s anxiety and acceptance on the user’s eye gaze patterns when confronted with an embodied agent showing different emotions.</p>
      <p>We hypothesize that we can use gaze location on the agent to measure the degree of comfort towards an agent’s facial expression or the degree of user anxiety. To this aim, eye gaze was measured using the VIVE Pro Eye tracker while participants looked at a VR agent with varying expressions. The main contributions of this paper are:</p>
      <list list-type="bullet">
        <list-item><p>Analyzing the correlation between the general anxiety of a normative user and their gaze patterns on the embodied agent.</p></list-item>
        <list-item><p>Analyzing the correlation between the user's acceptance of the embodied agent’s emotional display and their gaze patterns on the embodied agent.</p></list-item>
        <list-item><p>Comparing the findings to those found in SAD using similar studies.</p></list-item>
      </list>
      <p>APMAR’22: Asia-Pacific Workshop on Mixed and Augmented Reality, Dec. 02-03, 2022, Yokohama, Japan. *Corresponding author. Email: nermeena@gmail.com (N. Shaltout); diego.vilelamonteiro@esie.fr (D. V. Monteiro); m.perusquia@is.naist.jp (M. Perusquía-Hernández); jorlosky@augusta.edu (J. Orlosky); kiyo@is.naist.jp (K. Kiyokawa). ORCID: 0000-0002-1570-3652 (D. V. Monteiro); 0000-0002-0486-1743 (M. Perusquía-Hernández); 0000-0002-0538-6630 (J. Orlosky); 0000-0003-2260-1707 (K. Kiyokawa). © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Prior Work and Hypotheses</title>
      <sec id="sec-2-1">
        <title>2.1. Gaze Analysis Studies Related to Social Anxiety</title>
        <p>Eye metrics are promising tools to assess attitudes towards virtual agents. The main inspiration for this study stems from gaze analysis of individuals with HSA (High Social Anxiety) towards the facial expressions of other individuals in social situations. Previous gaze studies showed that individuals with HSA averted their gaze when shown photos of individuals expressing positive or negative emotions [<xref ref-type="bibr" rid="ref10 ref5">5, 10</xref>]. In such studies, static photos of people presenting happy, sad, and neutral facial expressions were commonly shown while gaze directions and fixations were measured. Thus, we adapted our study to find a relation between gaze direction on the embodied agent displaying emotions and the user’s acceptance of the emotional display.</p>
        <p>Wieckowski et al. explored variability in bias toward social stimuli in the form of vigilant attention and avoidant attention using eye gaze techniques instead of the traditional probe task technique often used to study attention bias in anxious youth with clinical SAD [<xref ref-type="bibr" rid="ref11">11</xref>]. The visual dot probe task involves allowing the users to select between two agent pairs (e.g., angry/neutral, happy/neutral) using their eye gaze. Participants show both avoidance and hyper-vigilance according to the age group, the agent display, and the passage of time during the trial. The bias is measured with the duration of fixations towards angry faces and towards more pleasant faces such as neutral or happy faces. The fixation duration on neutral faces is subtracted from that on angry faces to create a negative or positive bias.</p>
        <p>Based on the above studies, we hypothesize that two factors might affect the gaze patterns of individuals with SAD: the level of anxiety of the person when they are looking at the face, and whether or not the individuals accept the facial expression of the embodied agent. We would also like to observe if this affects normative individuals and if it mimics those with SAD.</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Hypotheses</title>
        <sec id="sec-2-2-1">
          <p>Our hypotheses are as follows.</p>
        </sec>
        <sec id="sec-2-2-2">
          <title>H1 The agent’s facial expressions have an effect on the participant’s self-reported arousal and valence.</title>
        </sec>
        <sec id="sec-2-2-3">
          <title>H2 The user’s acceptance or preference of the agent’s emotional display can be observed in the eye fixation patterns on the agent.</title>
        </sec>
        <sec id="sec-2-2-4">
          <title>H3 The overall anxiety state of the participants could alter the eye fixation patterns on the agent.</title>
          <p>For H1, the Self-Assessment Manikin (SAM) was used to assess whether the avatar’s facial expressions had an effect on the participant’s affect. Regarding H2, when participants answered the questionnaires described in Sec. 3.6 about the different emotional displays of the agent, not all of them accepted the agent’s emotional display in the same manner. For example, while some people highly disliked the happy face, other people were comfortable with it. We assessed if there is a pattern between the acceptability of the emotional display and the number of fixations on the agent. For H3, we assessed if there is a relation between the anxiety of the participant and their fixation behaviors on the agent’s different body parts. We expected the more anxious participants to be avoidant of the agent’s face and eyes. The anxiety state in this case is the user’s default state before and during the experiment.</p>
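          <p>The fixation counts used for H2 and H3 rest on the criterion given in Sec. 3.6: a constant gaze on the same part of the agent for 150 ms. A minimal sketch of that counting step is shown below. It is not the authors' Unity implementation; the sample format, the region labels, and the rule that a run's duration is the time between its first and last sample are assumptions made for illustration.</p>
          <preformat><![CDATA[
```python
from collections import Counter
from itertools import groupby

FIXATION_THRESHOLD_MS = 150  # criterion stated in Sec. 3.6


def count_fixations(samples):
    """Count fixations per region from (timestamp_ms, region) gaze samples.

    Assumption: a run of consecutive samples on the same region is one
    fixation when its first and last samples are at least 150 ms apart.
    """
    counts = Counter()
    for region, run in groupby(samples, key=lambda sample: sample[1]):
        timestamps = [timestamp for timestamp, _ in run]
        if timestamps[-1] - timestamps[0] >= FIXATION_THRESHOLD_MS:
            counts[region] += 1
    return counts


# Hypothetical 20 Hz log: 200 ms on the eyes, a 50 ms glance away, 300 ms on the body.
log = (
    [(t, "eyes") for t in range(0, 250, 50)]
    + [(t, "other") for t in range(250, 350, 50)]
    + [(t, "body") for t in range(350, 700, 50)]
)
fixations = count_fixations(log)
print(dict(fixations))
```
]]></preformat>
          <p>In this hypothetical log the short glance away from the agent stays below the threshold, so only the runs on the eyes and on the body are counted.</p>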
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Experiment</title>
      <sec id="sec-3-1">
        <title>3.1. Participants</title>
        <sec id="sec-3-1-1">
          <p>A total of 21 student volunteers in their early twenties participated in the study: 10 Japanese, 1 Kenyan, 1 German, 2 Nepali, 1 Colombian, 4 Chinese, 1 Thai, and 1 Malaysian. No participant had tried our system before. The participants were asked to wear glasses if their vision was not good. Sources of error were accounted for by removing three participants for whom data were missing (e.g., the VIVE Pro Eye tracking was accidentally disabled for one of the faces). Three participants who were extremely fatigued were excluded using a fatigue score in the pre-questionnaire. After the exclusions, the number of participants was 15. The experiment was approved by the ethics committee of our institution.</p>
          <sec>
            <title>3.2. Experiment Design</title>
            <p>We tested the participant’s eye gaze patterns when presented with different facial expressions from a humanoid agent in VR. There were three conditions corresponding to three facial displays expressed by the virtual agent: a happy, a neutral, and an angry facial expression. The conditions were presented in a within-subjects design, i.e., each participant saw all three faces. We chose to give the agent only facial expressions, to avoid confounding factors caused by other agent behaviors.</p>
          </sec>
          <sec>
            <title>3.3. Procedure</title>
            <p>The participants were exposed to each of the agent’s facial expressions for one minute at a time. There were three runs in total, one for each facial expression. The participants were seated in front of the agent so as to be at the same height as the agent and faced the agent head-on without an angle. Before every run, the agent was adjusted to be the same height as the participant. The participants were seated throughout the course of the experiment and encouraged to use only gaze and head movements. A full-body agent was used so the participants could freely choose whether to gaze at the agent’s face, at its body, or outside of the agent completely. The participants answered questionnaires pre- and post-experiment and after each avatar was displayed. Details are given in the measurements section.</p>
          </sec>
          <sec>
            <title>3.4. Stimuli</title>
            <p>The agent was designed using Ready Player Me [<xref ref-type="bibr" rid="ref12">12</xref>], which is a tool that converts a photograph of a person into an avatar with similar facial features. It is used by players to make agents of themselves in-game. It is currently most popular on platforms such as VR chatting programs. Ready Player Me is also equipped with the ability to map the user’s emotions onto the agent via eye and facial tracking. The readyplayer.me avatar is based on FACS and is usually embodied by people in VR, but we controlled it using the animation module of Unity to introduce more control over the experiment.</p>
            <p>The possibility that there is an uncanny valley in the virtual agent’s appearance is higher with agents that are hyper-realistic [<xref ref-type="bibr" rid="ref13">13</xref>]. Thus we used a semi-realistic avatar. We used an average Asian face to accommodate the majority Asian demographic involved in the experiment. Figure 1 shows the resulting avatar when inputting an average Asian face to the Ready Player Me interface. Ready Player Me was used because its low-poly characteristics make it more likely to be used by people in current virtual chats and metaverse settings. Though readyplayer.me characters are usually used as avatars in VRChat, we use one in this case to animate the agent, as if it were an example user in VR.</p>
            <p>To animate the happy, angry, and neutral emotions, the Facial Action Coding System (FACS) was used [<xref ref-type="bibr" rid="ref14">14</xref>]. The FACS presents action units (AUs) used for coding facial movements without making inferences about the underlying emotions. It is a popular tool in emotion studies to either create faces with a certain expression or to interpret a facial expression. The FACS is now incorporated in most VR chat avatars to enable the avatars to express emotions by encoding certain AU movements. Ready Player Me avatars come equipped with most of the values available in the FACS. We focused on prototypical AUs according to the Basic Emotion Theory [<xref ref-type="bibr" rid="ref15">15</xref>] to animate the avatar, together with guidelines described in Farnsworth’s visual FACS guide [<xref ref-type="bibr" rid="ref16">16</xref>]. For instance, to animate a happy face, we used AU 6 (cheek raiser, Fig. 2) and AU 12 (lip corner puller) with values of 1.0. The agent’s default face with some minor adjustments was used to represent the neutral face. Because the Ready Player Me avatars are designed to look slightly happy by default, we adjusted AU 4 (brow lowerer) and AU 15 (lip corner depressor) to bring the avatar back to its normal state.</p>
            <p>The facial expressions are animated using blend shapes. Three separate faces are shown to the same participant. We refer to them as three different trials with questionnaires in between. All facial expressions start from the neutral expression. It took one second for the facial animation to reach its maximum intensity. The avatar’s expression then remains constant for the rest of the one-minute trial. We measure the participant’s reaction within this minute for the purpose of measuring fixations. We do not consider the initial animation to have a detrimental effect on the gaze patterns or the number of fixations; thus we did not account for the baseline when observing gaze patterns.</p>
          </sec>
        </sec>
        <sec id="sec-3-1-2">
          <p>To reduce the chances of a participant experiencing the uncanny valley, the avatar’s blinking was animated. The interval between blinks was randomized between 0.5 and 4 seconds. The agent’s gaze was fixed during the run of this experiment. The agent was also given a breathing animation using Mixamo to give it a more realistic feel [<xref ref-type="bibr" rid="ref17">17</xref>]. No interactions other than varying facial expressions were added to the agent. In this experiment we focused on the relationship between the user’s anxiety and the user’s gaze patterns rather than on designing a complex interaction system. Thus a simple design was used. Future work will feature a more interactive avatar.</p>
          <sec>
            <title>3.5. Calibration</title>
            <p>To ensure that eye and facial feature tracking worked correctly, we used a cube display to calibrate and confirm the participant’s gaze prior to the experiment’s start. The agent’s height was adjusted to be the same as the participant’s height in every trial. The agent’s face was only revealed once the VIVE Pro Eye was calibrated, as shown in Fig. 3. We calibrated each participant’s eye gaze using the VIVE’s internal calibration software before running the experiment. The participant was then positioned to see the front view of the agent. However, the participant was free to move his/her head or gaze and to look either at the avatar’s face or body for the duration of the one-minute trial. No background objects were visible so that the participant would focus on the agent. Participants were seated during the experiment, received no instructions regarding where to look when facing the agent, and were left to interact naturally with the agent using eye movements and head movements once the experiment started. The participant’s location did not change.</p>
          </sec>
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>3.6. Measurements</title>
        <p><bold>Eye Metrics.</bold> We implemented a gaze ray to determine the intersection of the participant’s gaze with the avatar’s face. To determine the number of collisions between the gaze ray and parts of the face, we added colliders to the face, eyes, and body (Fig. 5). When the ray did not collide with the avatar, it was recorded as ‘other.’ We defined that a constant gaze on the same body part for 150 ms constituted a “fixation.” The number of fixations on each body part was counted within one minute of the participant looking at the avatar. The fixations were collected for each facial expression: Neutral, Angry, and Happy. The process was repeated per participant. By analyzing the number of fixations on each body part, we expected to find a pattern related to the participant’s current state, the participant’s preference of the face, and gaze patterns related to the facial expression that the avatar displayed to the participant.</p>
        <p>Additionally, we added a face mask not visible to the user to roughly measure the number of fixations on the face. We counted the number of fixations surpassing 150 ms on the colliders added to this face mask, and divided the fixations into an upper face region and a lower face region. Any collider above the lower eye was considered upper face, while colliders below it were considered lower face. If the upper face collisions are higher in the results, then the user accepts the avatar; otherwise, if the lower face collisions are higher, the user rejects the avatar according to source.</p>
        <p><bold>Questionnaires.</bold> The participants answered questionnaires at different points during the experiment. Some questionnaires, such as the SAM and the questionnaire about the acceptability of the avatar, were repeated more than once. To identify at which point of the experiment the participant answered the questionnaires, we assigned codes as follows:</p>
        <sec id="sec-3-2-1">
          <title>B : Before seeing any avatar.</title>
        </sec>
        <sec id="sec-3-2-2">
          <title>AH : After seeing the happy facial expression agent.</title>
          <p><bold>Pre-questionnaire.</bold> Before the experiments, participants reported their demographics and their current fatigue and anxiety. They also reported their general anxiety level on a 9-point Likert scale.</p>
          <p><bold>Self Assessment Manikin (SAM).</bold> A SAM questionnaire with a 9-point Likert scale was used to measure valence, arousal, and dominance. Fig. 6 shows a sample of the valence and arousal questionnaires used. The SAM was given to the participant before the experiment and after every face, and was used to measure how the avatar affected the participant.</p>
        </sec>
        <sec id="sec-3-2-3">
          <p>The questionnaire about the avatar was used to measure the participant’s acceptance, i.e., contextual comfort, of the virtual agent’s expression after each trial. It was answered three times per participant, once after each face. From the answers to the acceptability questionnaire, we created a variable known as acceptability from question Q1 (“I felt comforted by the avatar”) and question Q2 (“I felt disturbed by the avatar”) as follows:
            <disp-formula><tex-math>\text{Acceptability} = Q_1 - Q_2</tex-math></disp-formula>
          </p>
          <p>A positive or zero value indicated that the agent was accepted, while a negative value indicated that the agent was rejected by the user. We calculated the acceptability per face per participant. This score was then used to find a correlation between the acceptability of the agent for each facial expression and the participant’s eye fixations on different locations of the embodied agent (head, body, eyes, etc.).</p>
          <p>The acceptability questionnaire was conducted three times per participant, after the participant viewed every expression of the agent for one minute. For instance, after the participant views the happy expression for one minute, the experiment stops and the participant answers the SAM and the acceptability questionnaires. This procedure is then repeated with the other two expressions. The questionnaires are not validated, which might be a limitation of the study.</p>
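          <p>The acceptability variable described above can be sketched in a few lines. The function names and example answers below are hypothetical. The rescaling to the range −1 to 1 is only one plausible reading of the standardization mentioned in the results, assuming division by the largest possible difference on a 9-point scale (9 − 1 = 8).</p>
          <preformat><![CDATA[
```python
def acceptability(q1_comforted: int, q2_disturbed: int) -> int:
    """Acceptability = Q1 - Q2 for one participant and one facial expression."""
    for score in (q1_comforted, q2_disturbed):
        if not 1 <= score <= 9:
            raise ValueError("Likert scores must lie between 1 and 9")
    return q1_comforted - q2_disturbed


def is_accepted(score: int) -> bool:
    """A positive or zero value counts as accepted, a negative one as rejected."""
    return score >= 0


def rescale(score: int) -> float:
    """Assumed rescaling to [-1, 1]: divide by the largest possible difference (8)."""
    return score / 8


# Hypothetical answers of one participant after the happy-face trial (code AH).
ah_score = acceptability(q1_comforted=7, q2_disturbed=3)
print(ah_score, is_accepted(ah_score), rescale(ah_score))
```
]]></preformat>
          <p>One score is computed per participant and per face, which gives three values per participant for the correlation analysis.</p>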
          <p><bold>Acceptability Questionnaire.</bold> This was a more detailed questionnaire about the avatar. The questionnaires were given a code AH, AA, or AN, one after each avatar with a different facial expression. It asked how people felt about the avatar and consisted of several questions answered on a 9-point Likert scale. The scores are on a scale from 1 to 9 to match the format of the SAM. From this questionnaire, we only used two questions for further analyses:</p>
          <list list-type="simple">
            <list-item><p>Q1: I felt comforted by the avatar.</p></list-item>
            <list-item><p>Q2: I felt disturbed by the avatar.</p></list-item>
          </list>
          <p>SAM panel [a], Valence: a higher value means that a person is content, while a lower value means that a person is upset.</p>
          <p><bold>4. Analysis and Results</bold></p>
          <p>Despite being given no instructions on where to direct their gaze, the majority of participants gazed at the agent’s face or body. However, the gaze patterns changed according to the facial expression shown, the participant’s acceptability score, and the anxiety score, as detailed below. Deviations away from the face changed according to the acceptability and anxiety scores. SPSS was used to analyze the data.</p>
          <p>To test H1, we measured whether the agent’s facial expression had an effect on the participant’s affect as follows. The SAM questionnaire was taken four times by each participant and labeled with the experiment codes described in the Questionnaires portion of Sec. 3.6: before the experiment (B), after the neutral agent (AN), after the happy agent (AH), and after the angry agent (AA).</p>
          <p>SAM panel [b], Arousal: measures the state of alertness. A higher value indicates higher alertness.</p>
          <p>A Shapiro-Wilk test showed a significant departure from normality, W(98) = 0.94, p &lt; 0.01; W(98) = 0.94, p &lt; 0.01; and W(98) = 0.95, p &lt; 0.01 for valence, arousal, and dominance, respectively. We thus ran the Kruskal-Wallis test as a non-parametric test to compare if there were significant differences between conditions B, AH, AA, and AN. A Kruskal-Wallis rank-sum test carried out on valence, arousal, and dominance showed a statistically significant difference in valence (χ<sup>2</sup>(3) = 21.48, p &lt; 0.001) and arousal (χ<sup>2</sup>(3) = 8.867, p &lt; 0.05) between the different embodied agent expressions when compared to the baseline. The mean rank valence scores were 41.07, 45.16, 28.67, and 62.5 for B, AN, AH, and AA, respectively.</p>
          <p>For arousal, the mean rank scores were 39.98, 41.61, 57.6,
37.84 for B, AN, AH, and AA, respectively. The results
for dominance showed no statistical significance.</p>
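          <p>For readers who want to see how the statistic and the mean ranks reported above are obtained, the following is a minimal pure-Python sketch of the Kruskal-Wallis H statistic with the usual tie correction. It is not the SPSS procedure used in the study, and the ratings in the example are hypothetical. The statistic is compared against a chi-squared distribution with (number of groups − 1) degrees of freedom, which is 3 for the four conditions B, AN, AH, and AA; the p-value computation is left out.</p>
          <preformat><![CDATA[
```python
from collections import Counter


def average_ranks(values):
    """Rank values 1..N; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def kruskal_wallis(groups):
    """Return the tie-corrected H statistic and the mean rank of each group."""
    pooled = [value for group in groups for value in group]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    weighted_sum = 0.0
    mean_ranks = []
    offset = 0
    for group in groups:
        rank_sum = sum(ranks[offset:offset + len(group)])
        offset += len(group)
        mean_ranks.append(rank_sum / len(group))
        weighted_sum += rank_sum ** 2 / len(group)
    h = 12 / (n_total * (n_total + 1)) * weighted_sum - 3 * (n_total + 1)
    ties = sum(count ** 3 - count for count in Counter(pooled).values())
    return h / (1 - ties / (n_total ** 3 - n_total)), mean_ranks


# Hypothetical 9-point valence ratings for the four codes B, AN, AH, AA.
ratings = [[5, 6, 5, 4], [5, 5, 6, 4], [7, 8, 6, 7], [3, 2, 4, 3]]
h_statistic, mean_ranks = kruskal_wallis(ratings)
print(round(h_statistic, 3), mean_ranks)
```
]]></preformat>
          <p>The sketch assumes every group is non-empty and that the pooled values are not all identical, since the tie correction would otherwise divide by zero.</p>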
          <p>Figure 7: Boxplot showing [a] Valence and [b] Arousal values entered by participants after viewing each facial expression (AA, AH, AN) compared to the baseline, B. The starred brackets show statistically significant pairings using the Wilcoxon rank sum tests.</p>
          <p>Pairwise Wilcoxon rank sum tests with Benjamini-Hochberg (BH) p-value adjustment were carried out for the SAM valence and arousal scores since they showed significance. Experiment codes (B, AN, AH, AA) were used to represent the experiment stage as detailed in Sec. 3.6.</p>
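          <p>The Benjamini-Hochberg adjustment mentioned above can be sketched as follows. This is the standard step-up procedure written in plain Python for illustration, not the code used in the study, and the p-values in the example are hypothetical.</p>
          <preformat><![CDATA[
```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical raw p-values for six pairwise comparisons of the four codes.
raw = [0.004, 0.030, 0.012, 0.200, 0.048, 0.650]
for p_raw, p_adj in zip(raw, benjamini_hochberg(raw)):
    print(f"{p_raw:.3f} -> {p_adj:.3f}")
```
]]></preformat>
          <p>A comparison is then reported as significant when its adjusted p-value falls below the chosen level, for example 0.05.</p>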
          <p>The following score pairs were compared to each other: (B, AN), (B, AH), and (B, AA), representing the comparison between the baseline state and the state of the participant after viewing agents of different emotions. This is to test if the different emotional agents had a significant effect on the affect of the participant. The scores of the following face pairs were also compared to each other: (AN, AH), (AN, AA), and (AH, AA).</p>
          <p>The significant results for valence and arousal are summarized using starred brackets in Fig. 7. Significant results were as follows. The valence was larger after introducing the happy agent (AH) compared to the baseline (B) (p &lt; 0.05, Cohen’s effect size = 0.25). The valence was conversely lower than the baseline after introducing the angry agent (AA) (p &lt; 0.01, Cohen’s effect size = 0.25). There were no significant differences in valence when comparing the score of the baseline condition to the score after introducing the neutral agent (AN). The valence score was also lower after viewing the angry face as compared to after viewing the neutral face (p &lt; 0.05, Cohen’s effect size = 0.25). Conversely, the valence score was higher after the happy face as compared to the neutral face (p &lt; 0.05, Cohen’s effect size = 0.25). The difference in valence values between the AH and AA pair was significant (p &lt; 0.01, Cohen’s effect size = 0.25).</p>
          <p>The results for arousal were as follows. The difference between arousal values was statistically significant for the following score pairs: (B, AH) with (p &lt; 0.05, Cohen’s effect size = 0.25); (AA, AH) with (p &lt; 0.01, Cohen’s effect size = 0.25); and (AN, AH) with (p &lt; 0.01, Cohen’s effect size = 0.25). N = 15 for all the tests concerning valence and arousal.</p>
          <p>We then tested H2 and H3 to see if anxiety and acceptability affect where the participant looks most at the avatar and whether they avoid the agent’s face or eyes as a result. We compared those results to previous findings for socially anxious individuals. Kendall’s τ<sub>b</sub> correlation was used to run the correlation tests since the sample size is small and, as before, the normality tests showed that the sample did not follow a normal distribution.</p>
          <p>To test H2, when running the tests on all the participants, we found no correlation between the acceptability score and the number of fixations on any portion of the agent. We ran another Kendall’s τ<sub>b</sub> correlation using only anxious participants. The results are summarized in Table 1.</p>
          <table-wrap>
            <label>Table 1</label>
            <caption>
              <p>Correlation values between acceptability score and fixations on different portions of the agent for N = 13 (anxious participants) (*: p &lt; 0.05, **: p &lt; 0.01).</p>
            </caption>
            <table>
              <thead>
                <tr><th></th><th>headf</th><th>bodyf</th><th>eyesf</th><th>other</th></tr>
              </thead>
              <tbody>
                <tr><td>bodyf</td><td>.04</td><td></td><td></td><td></td></tr>
                <tr><td>eyesf</td><td>.51*</td><td>−.22</td><td></td><td></td></tr>
                <tr><td>other</td><td>.59**</td><td>.21</td><td>.16</td><td></td></tr>
                <tr><td>Acceptability</td><td>.31</td><td>.03</td><td>.51*</td><td>.22</td></tr>
              </tbody>
            </table>
          </table-wrap>
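          <p>To make the correlation measure concrete, a minimal sketch of Kendall's tau-b computed by direct pair counting is given below. It illustrates the statistic only and is not the SPSS routine used in the study; the scores and fixation counts in the example are hypothetical, and significance testing is omitted.</p>
          <preformat><![CDATA[
```python
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b from concordant, discordant, and tied pairs."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = ties_x_only = ties_y_only = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both variables: dropped from both factors
            if dx == 0:
                ties_x_only += 1
            elif dy == 0:
                ties_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied_in_x = concordant + discordant + ties_y_only
    untied_in_y = concordant + discordant + ties_x_only
    return (concordant - discordant) / sqrt(untied_in_x * untied_in_y)


# Hypothetical acceptability scores and eye-fixation counts for six participants.
acceptability_scores = [-3, 0, 1, 2, 4, 4]
eye_fixations = [2, 5, 4, 7, 9, 8]
print(round(kendall_tau_b(acceptability_scores, eye_fixations), 3))
```
]]></preformat>
          <p>The pairwise loop is quadratic in the number of observations, which is unproblematic for samples of 13 to 45 values. The sketch assumes that neither variable is constant, since the denominator would otherwise be zero.</p>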
          <p>A statistically significant positive correlation was found between the participant’s acceptability score and the number of fixations on the eyes (τ<sub>b</sub> = 0.510, p &lt; 0.05). The acceptability score was standardized between −1 and 1.</p>
          <p>To investigate H3, a Kendall’s τ<sub>b</sub> correlation was run
to determine the relationship between the participant’s
anxiety score and the number of fixations on each
portion of the avatar, regardless of the face used (N = 45),
as shown in Table 2. There was a strong,
positive correlation between the participant’s anxiety score
and the number of fixations on the agent’s body per
minute, which was statistically significant (τ<sub>b</sub> = 0.30,
p &lt; 0.001). Additionally, there was a strong negative
correlation between the participant’s anxiety score and the number
of fixations on the agent’s eyes (τ<sub>b</sub> = −0.22,
p &lt; 0.01). The results are summarized in Table 2.</p>
          <p>Figure 8: Scatter plot illustrating the correlation between the
anxiety score of the participant and the number of fixations
detected outside of the agent’s face when observing the
avatar with the (a) angry expression, as opposed to the number
of fixations on the agent’s body when observing the agent
with the (b) neutral expression, over one minute. Panels: [a] Angry Expression, [b] Neutral Expression.</p>
          <p>A Kendall’s τ<sub>b</sub> correlation was run to determine the
relationship between the anxiety score of the participant
and the number of fixations in a certain section of the
agent, among 15 participants, per facial expression. We
examined whether there were any patterns for specific facial
expressions. We found no significant correlations between
the anxiety score and the body fixations when presenting
the participants with the happy face. However, we
found a strong correlation between the number of body
fixations (bodyf) and the anxiety score when participants
were presented with the neutral face, which was statistically
significant (τ<sub>b</sub> = 0.411, p &lt; 0.05). There was
also a strong correlation between the number of
fixations detected outside the agent and the anxiety score
when participants viewed the angry face, which was statistically
significant (τ<sub>b</sub> = 0.420, p &lt; 0.05).
The values were Benjamini-Hochberg
corrected. The results are summarized in the scatter plot
detailed in Fig. 8.</p>
          <p>We ran two additional Kendall’s τ<sub>b</sub>
correlation tests to see if there was a correlation between
the anxiety score and whether the participant gazed at the upper
or lower part of the avatar’s face, first regardless of the
face with N = 45, then per facial expression (angry,
neutral, happy) with N = 15. Regardless
of the face presented, there was a strong negative
correlation between the anxiety score and fixations on the upper face, with
statistical significance (τ<sub>b</sub> = −0.263, p &lt; 0.05).
Per facial expression, only the happy
expression showed a strong negative correlation between the
anxiety score and the upper face, with statistical significance
(τ<sub>b</sub> = −0.408, p &lt; 0.05).</p>
          <p>5. Discussion</p>
          <p>In H1 we suggested that the agent affects the user’s
valence and arousal. The results in Fig. 7 show that the
agent has a significant effect on the valence and arousal
of the user according to the emotion displayed. This
shows that the model of the avatar actually works to
affect the participant. The differences between valence
and arousal of the participant when viewing the neutral
agent compared to the baseline were not statistically
significant. This is expected because there was no emotion
conveyed with the neutral facial expression. The angry
and happy facial expressions affected the valence and
arousal of the participants significantly compared to the
baseline. This showed that the facial expressions of
the agents affected the participants’ valence and arousal.</p>
          <p>The angry facial expression also induced a lower valence
score while the happy facial expression induced a higher
valence. This confirmed that the facial expressions of the
avatar were perceived correctly and supports H1.</p>
          <p>In H2, we predicted that the user’s acceptability of the agent’s emotions affects gaze patterns on the agent. SAD individuals do not want to confront the emotions of others, thus exhibiting behaviors such as avoiding the face or being hyper-vigilant to cope with emotional display. In addition to the anxiety score, we measured whether the participant’s acceptance or rejection of the agent’s emotions played a role in avoidant gaze patterns. We created the acceptability score as a measure of the individual participant’s preference for the agent’s facial expressions, which varied between participants. Individuals with SAD are also anxious at the time they do not accept facial expressions. Thus, we measured whether there is a correlation between the acceptability score of anxious participants and the number of fixations on specific parts of the agent. A positive correlation was found between fixations on the agent’s eyes and the acceptability score in anxious participants. This shows that even anxious participants are more likely to gaze at the agent’s eyes if they accept the agent’s facial expression. Conversely, participants avoid the gaze of the agent if they find the agent’s emotions unacceptable. This matches the literature on SAD individuals avoiding eye gaze for displays of affection and supports H2.</p>
          <p>The acceptability score can be used to change how agents’ faces are studied. It can be used in the future to study the root cause of SAD or even cultural differences in social norms when both accepting and reacting to varying facial expressions. The acceptability measure can also be used as an extra factor in future SAD studies.</p>
          <p>For H3, we hypothesized that the user’s overall anxiety affects gaze patterns on the agent. In this study, we took the number of fixations on each part of the agent as an indication of acceptance or avoidance of the facial expressions of the avatar. On one hand, an increased number of fixations on the eyes or head suggests that the participant accepts the avatar. This is consistent with previous studies [<xref ref-type="bibr" rid="ref10 ref5">5, 10</xref>]. On the other hand, an increased number of fixations on the body or outside of the avatar indicates that the user is avoiding the agent.</p>
          <p>We also explored the relationship between agent avoidance and the participants’ anxiety level. The initial anxiety score represents an approximation of the participants’ overall state before facing the agent. We found significant correlations between the user’s initial anxiety and body fixations, invariant of facial expression, as shown in Table 2. When analyzing the correlations on a per facial expression basis, there was a strong correlation between the anxiety score and the number of fixations on the body of the agent or outside of the body when presented with the neutral and angry expressions, respectively. When viewing the angry expression, the fixations were completely outside the body, showing that the participant is more likely to avoid the agent completely the more negative the expression is.</p>
          <p>We also found a strong negative correlation between the participants’ anxiety score and the number of fixations on the agent’s eyes, invariant of facial expression. This indicates that the participant avoids the agent’s eyes and is more likely to look at the agent’s body or outside of the agent when anxious, matching the pattern of those with SAD as per [<xref ref-type="bibr" rid="ref10 ref5">5, 10</xref>], and supports H3.</p>
          <p>When analyzing the correlation between the anxiety score and the user’s fixations on the upper and lower parts of all the faces, we found a strong negative correlation between fixations on the upper face and the anxiety score. This indicates that the anxious users avoided the agent’s eyes, as reported in [<xref ref-type="bibr" rid="ref10 ref5">5, 10</xref>]. When analyzing the correlations per facial expression, we only found a strong negative correlation between user fixations on the upper face and the user’s anxiety score with the happy agent. This is probably because the participants were otherwise avoiding the face for the neutral and angry agents. This indicates that anxious users are more likely to look at the agent’s face if it has positive affect, despite avoiding the eyes.</p>
          <p>6. Conclusions</p>
          <p>The study observes the correlation between the anxiety of normative users, their acceptability of the agent’s emotions and the respective gaze patterns on an agent with varying emotions. Our results suggest that individuals with anxiety in the moment have similar gaze patterns to those with SAD. The similarities between normative individuals facing anxiety in the moment, their gaze patterns and their acceptability of the agent can unlock a better understanding of the way SAD individuals operate. The techniques used for SAD individuals can also be used to accommodate normative anxious individuals facing social situations in VR.</p>
          <p>In this model the participant is allowed to look away from the face to other sections of the embodied agent, including the body, which emulates an actual social situation in VR. Whether the users gazed at the upper or lower parts of the agent’s face was also analyzed. The more negative the expression, the further the anxious participant strayed away from the agent’s eyes, then the face, and then the entire body. These findings are an important indication for designing future systems. For example, nonverbal agents with positive affect might be a better choice for an anxious normative individual, as such individuals were still more likely to gaze at the face regardless of their anxiety. The studies also show that users are more likely to gaze at the agent’s eyes if they accept the agent’s emotional display.</p>
          <p>A Metaverse VR avatar and VIVE tracker were used in the experiment. The technique can be easily applied to a more ecologically valid setting to find avoidance patterns of anxious users in real time. This is useful to adjust a non-verbal agent’s expression to accommodate the user’s anxiety and facial preference. Other sensors, e.g. a heart sensor or facial tracker, can also be added to find stronger patterns. The studies suffer from some limitations due to the limited participant count and non-validated questionnaires.</p>
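          <p>To make the suggested real-time use more concrete, the following minimal Python sketch keeps a sliding window of fixations and reports fixations per minute for each region of the agent. It is an illustration under stated assumptions, not the system used in the study: the class name FixationWindow, the 60-second default window, and the input format (a timestamp in seconds plus a region label) are hypothetical, and the region labels are borrowed from Table 1.</p>
          <preformat><![CDATA[
```python
from collections import Counter, deque

# Region labels borrowed from Table 1; how fixations are assigned to
# regions is outside the scope of this sketch.
REGIONS = ("headf", "bodyf", "eyesf", "other")


class FixationWindow:
    """Keeps fixations from the last `window_s` seconds (hypothetical helper)."""

    def __init__(self, window_s=60.0):
        if window_s <= 0:
            raise ValueError("window_s must be positive")
        self.window_s = window_s
        self._events = deque()  # (timestamp in seconds, region label)

    def add(self, timestamp, region):
        """Record one fixation; timestamps are assumed to be non-decreasing."""
        if region not in REGIONS:
            raise ValueError(f"unknown region: {region!r}")
        self._events.append((timestamp, region))
        self._evict(timestamp)

    def _evict(self, now):
        # Drop fixations that are older than the window length.
        while self._events and now - self._events[0][0] > self.window_s:
            self._events.popleft()

    def rates_per_minute(self, now):
        """Return fixations per minute for every region at time `now`."""
        self._evict(now)
        counts = Counter(region for _, region in self._events)
        scale = 60.0 / self.window_s
        return {region: counts[region] * scale for region in REGIONS}


if __name__ == "__main__":
    window = FixationWindow(window_s=30.0)
    # Made-up fixation stream for illustration only.
    for t, region in [(0.0, "eyesf"), (10.0, "bodyf"), (20.0, "bodyf"), (35.0, "other")]:
        window.add(t, region)
    print(window.rates_per_minute(now=35.0))
```
]]></preformat>
          <p>A rising rate for the body or for regions outside the agent, relative to the eyes, could then serve as an input for adjusting the agent’s expression. Any threshold for doing so would have to be determined empirically.</p>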
        </sec>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>D.</given-names>
            <surname>Monteiro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.-N.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yue</surname>
          </string-name>
          ,
          <article-title>Evaluating the effects of a cartoonlike character with emotions on users' behaviour within virtual reality environments</article-title>
          ,
          <source>in: 2018 IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR)</source>
          ,
          <source>IEEE Computer Society</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>229</fpage>
          -
          <lpage>236</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>E.</given-names>
            <surname>Hasegawa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Isoyama</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. V.</given-names>
            <surname>Monteiro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Sakata</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kiyokawa</surname>
          </string-name>
          ,
          <article-title>The effects of speed-modulated visual stimuli seen through smart glasses on work efficiency after viewing</article-title>
          ,
          <source>Sensors</source>
          <volume>22</volume>
          (
          <year>2022</year>
          )
          <fpage>2272</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>H.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. V.</given-names>
            <surname>Monteiro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Goh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Han</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.-N.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. S.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Jung</surname>
          </string-name>
          ,
          <article-title>Annotation vs. virtual tutor: Comparative analysis on the effectiveness of visual instructions in immersive virtual reality</article-title>
          ,
          <source>in: 2019 IEEE International Symposium on Mixed and Augmented Reality (ISMAR)</source>
          , IEEE,
          <year>2019</year>
          , pp.
          <fpage>318</fpage>
          -
          <lpage>327</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>D.</given-names>
            <surname>Monteiro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.-N.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Fu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>Evaluating the need and effect of an audience in a virtual reality presentation training tool</article-title>
          , in: International Conference on Computer Animation and Social Agents, Springer,
          <year>2020</year>
          , pp.
          <fpage>62</fpage>
          -
          <lpage>70</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>K.</given-names>
            <surname>Roelofs</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Putman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Schouten</surname>
          </string-name>
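          ,
          <string-name>
            <given-names>W.-G.</given-names>
            <surname>Lange</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Volman</surname>
          </string-name>
          ,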
          <string-name>
            <given-names>M.</given-names>
            <surname>Rinck</surname>
          </string-name>
          ,
          <article-title>Gaze direction differentially affects avoidance tendencies to happy and angry faces in socially anxious individuals</article-title>
          ,
          <source>Behaviour Research and Therapy</source>
          <volume>48</volume>
          (
          <year>2010</year>
          )
          <fpage>290</fpage>
          -
          <lpage>294</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J. H.</given-names>
            <surname>Kwon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Alan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Czanner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Czanner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Powell</surname>
          </string-name>
          ,
          <article-title>A study of visual perception: Social anxiety and virtual realism</article-title>
          ,
          <source>in: Proceedings of the 25th Spring Conference on Computer Graphics</source>
          ,
          <year>2009</year>
          , pp.
          <fpage>167</fpage>
          -
          <lpage>172</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>R. E.</given-names>
            <surname>Jack</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. G.</given-names>
            <surname>Schyns</surname>
          </string-name>
          ,
          <article-title>The Human Face as a Dynamic Tool for Social Communication</article-title>
          ,
          <source>Current Biology</source>
          <volume>25</volume>
          (
          <year>2015</year>
          )
          <fpage>R621</fpage>
          -
          <lpage>R634</lpage>
          . URL: https://www.sciencedirect.com/science/article/pii/S0960982215006557. doi:
          <pub-id pub-id-type="doi">10.1016/j.cub.2015.05.052</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>H.-S.</given-names>
            <surname>Cha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.-J.</given-names>
            <surname>Choi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.-H.</given-names>
            <surname>Im</surname>
          </string-name>
          ,
          <article-title>Real-time recognition of facial expressions using facial electromyograms recorded around the eyes for social virtual reality applications</article-title>
          , IEEE Access PP (
          <year>2020</year>
          )
          <fpage>1</fpage>
          -
          <lpage>1</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1109/ACCESS.2020.2983608</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>L. A.</given-names>
            <surname>Rutter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. J.</given-names>
            <surname>Norton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. A.</given-names>
            <surname>Brown</surname>
          </string-name>
          ,
          <article-title>Visual attention toward emotional stimuli: Anxiety symptoms correspond to distinct gaze patterns</article-title>
          ,
          <source>PLOS ONE</source>
          <volume>16</volume>
          (
          <year>2021</year>
          )
          <elocation-id>e0250176</elocation-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>M.</given-names>
            <surname>Garner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Mogg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. P.</given-names>
            <surname>Bradley</surname>
          </string-name>
          ,
          <article-title>Orienting and maintenance of gaze to facial expressions in social anxiety</article-title>
          ,
          <source>Journal of Abnormal Psychology</source>
          <volume>115</volume>
          (
          <year>2006</year>
          )
          <fpage>760</fpage>
          -
          <lpage>770</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1037/0021-843x.115.4.760</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>A. T.</given-names>
            <surname>Wieckowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. N.</given-names>
            <surname>Capriola-Hall</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Elias</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. H.</given-names>
            <surname>Ollendick</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. W.</given-names>
            <surname>White</surname>
          </string-name>
          ,
          <article-title>Variability of attention bias in socially anxious adolescents: differences in fixation duration toward adult and adolescent face stimuli</article-title>
          ,
          <source>Cognition and Emotion</source>
          <volume>33</volume>
          (
          <year>2019</year>
          )
          <fpage>825</fpage>
          -
          <lpage>831</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <collab>Wolf3D</collab>
          ,
          <article-title>Cross-game avatar platform for the metaverse</article-title>
          , URL: https://readyplayer.me/, accessed January 14th,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>M.</given-names>
            <surname>Shin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. J.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Biocca</surname>
          </string-name>
          ,
          <article-title>The uncanny valley: No need for any further judgments when an avatar looks eerie</article-title>
          ,
          <source>Computers in Human Behavior</source>
          <volume>94</volume>
          (
          <year>2019</year>
          )
          <fpage>100</fpage>
          -
          <lpage>109</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>P.</given-names>
            <surname>Ekman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. P.</given-names>
            <surname>Friesen</surname>
          </string-name>
          ,
          <article-title>Measuring facial movement with the Facial Action Coding System</article-title>
          , in: P. Ekman (Ed.),
          <source>Emotion in the human face</source>
          , second ed., Cambridge University Press,
          <year>1982</year>
          , pp.
          <fpage>178</fpage>
          -
          <lpage>211</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>P.</given-names>
            <surname>Ekman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Rosenberg</surname>
          </string-name>
          ,
          <article-title>What the face reveals: Basic and Applied Studies of Spontaneous Expression Using the Facial Action Coding System (FACS)</article-title>
          , second ed., Oxford University Press,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>B.</given-names>
            <surname>Farnsworth</surname>
          </string-name>
          ,
          <article-title>Facial action coding system (FACS) - a visual guidebook</article-title>
          ,
          <year>2021</year>
          . URL: https://imotions.com/blog/facial-action-coding-system/.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Mixamo</surname>
          </string-name>
          ,
          <year>2022</year>
          . URL: https://www.mixamo.com/#/.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>