=Paper=
{{Paper
|id=Vol-3297/paper6
|storemode=property
|title=How Anxiety State and Acceptance of an Embodied Agent Affect User Gaze Patterns
|pdfUrl=https://ceur-ws.org/Vol-3297/paper6.pdf
|volume=Vol-3297
|authors=Nermin Shaltout,Diego Vilela Monteiro,Monica Perusquia-Hernandez,Kiyoshi Kiyokawa,Jason Orlosky
|dblpUrl=https://dblp.org/rec/conf/apmar/ShaltoutMPKO22
}}
==How Anxiety State and Acceptance of an Embodied Agent Affect User Gaze Patterns==
How Anxiety State and Acceptance of an Embodied Agent Affect User Gaze Patterns

Nermin Shaltout (1,2,*), Diego Vilela Monteiro (3), Monica Perusquía-Hernández (2), Jason Orlosky (1,4) and Kiyoshi Kiyokawa (2)

(1) Osaka University, 1-32 Machikaneyama, Toyonaka, Osaka 560-0043, Japan
(2) Nara Institute of Science and Technology, 8916-5, Takayama, Ikoma, Nara 630-0192, Japan
(3) École Supérieure d'Informatique Electronique Automatique, 38 Rue des Docteurs Calmette et Guérin, 53000 Laval, France
(4) Augusta University, School of Computer and Cyber Sciences, 100 Grace Hopper Ln, Augusta, GA 30901, USA

Abstract

In virtual reality (VR), the interactions of users with embodied agents when the users are anxious, or when they do not accept an agent, are not yet completely understood. Gaze can be indicative of the user's anxiety and of the acceptability of an embodied agent. An agent's expressions or actions can, in turn, be used to accommodate the user's anxiety. Previous work on social anxiety disorder (SAD) found evidence of avoidant or hyper-vigilant gaze patterns toward the agents or people the participants were gazing at. Thus, we investigated whether there are specific gaze patterns for normative individuals experiencing in-the-moment anxiety when gazing at an embodied agent, focusing mostly on avoidant gaze patterns. Based on evidence of gaze patterns in SAD and autism, we designed an experiment in which normative individuals interact with an agent showing neutral, happy, and angry expressions. We aim to examine whether anxious normative participants show gaze or avoidance patterns similar to those of people with SAD. We also investigated whether the user's acceptability of, or preference for, the virtual agent's display of emotions had an effect on avoidance as expressed through eye gaze. In particular, we examined the user's gaze patterns in relation to the agent's eyes, face, and body to see if there were similarities to people with SAD.
Using correlation analysis, we found a significant positive correlation between the participants' acceptance of the virtual agent's expression and their fixations on the agent's eyes. We also found a significant correlation between fixations on the agent's body and how anxious the participant was at the experiment's start. These results can later be used to find a link between acceptability, anxiety, and SAD.

Keywords: Virtual Reality, Embodied Agent, Eye Gaze, Anxiety

APMAR'22: Asia-Pacific Workshop on Mixed and Augmented Reality, Dec. 02-03, 2022, Yokohama, Japan. * Corresponding author. nermeena@gmail.com (N. Shaltout); diego.vilelamonteiro@esie.fr (D. V. Monteiro); m.perusquia@is.naist.jp (M. Perusquía-Hernández); jorlosky@augusta.edu (J. Orlosky); kiyo@is.naist.jp (K. Kiyokawa). ORCID: 0000-0002-1570-3652 (D. V. Monteiro); 0000-0002-0486-1743 (M. Perusquía-Hernández); 0000-0002-0538-6630 (J. Orlosky); 0000-0003-2260-1707 (K. Kiyokawa). © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073.

1. Introduction

In the field of virtual reality (VR), embodied agents are commonplace as non-player characters (NPCs) or as other users (avatars) in-game. Determining how individuals react to virtual agents is therefore an important topic in the field [1]. The adaptation of an embodied agent's or avatar's facial expressions can influence user behavior [2], in particular for those who might use the agents for learning [3], social support, or feedback [4]. Previous studies assessed the gaze patterns of individuals in social situations to understand psychological and emotional patterns. These studies are used to better understand and train people with disorders such as high social anxiety, and were usually conducted using still photographs [5]. Individuals with Social Anxiety Disorder (SAD) react differently to facial displays of emotion. This happens in VR too, and it happens independently of the avatar's fidelity [6]. Little is known about how the anxiety of normative individuals (i.e., those without SAD) affects gaze behaviour with respect to facial displays of emotion. Exposure of individuals to virtual situations has also risen with new platforms like VR, and increased further with the advent of COVID-19. Studying the effects of anxiety when facing virtual embodied agents is thus important, for SAD as well as for anxious normative individuals.

VR offers the possibility of presenting dynamic facial stimuli with a wealth of parameters, leading to detailed descriptions of the facial movements required to convey a socio-affective message accurately [7]. Furthermore, the advent of biosensors allows real-time reproduction of facial expressions from other users [8]. Moreover, the general public's interest in the metaverse has increased since the 2019 pandemic. We therefore believe it is timely to study the effects of VR agents' facial expressions on user gaze.

This study aims to explore different gaze parameters and their effectiveness in determining how comfortable the user is with an agent as it presents different facial expressions, which might serve as an alternate method for measuring reactions towards VR agents. The design of the study was inspired by previous works on individuals with SAD. Individuals with SAD usually do not deal well with emotions presented on the face. They tend to avert their gaze when faced with emotional people or their representations, and the avoidance might increase with negative emotions. They especially avoid looking at the eyes of individuals displaying emotions [9]. Thus, we analyzed the effect of the user's anxiety and acceptance, when confronted with an embodied agent showing different emotions, on the user's eye gaze patterns.

We hypothesize that gaze location on the agent can be used to measure the degree of comfort towards an agent's facial expression, or the degree of user anxiety. To this aim, eye gaze was measured using the VIVE Pro Eye tracker while participants looked at a VR agent with varying expressions. The main contributions of this paper are:

- Analyzing the correlation between the general anxiety of a normative user and their gaze patterns on the embodied agent.
- Analyzing the correlation between the user's acceptance of the embodied agent's emotional display and their gaze patterns on the embodied agent.
- Comparing the findings to those reported for SAD in similar studies.

2. Prior Work and Hypotheses

2.1. Gaze Analysis Studies Related to Social Anxiety

Eye metrics are promising tools to assess attitudes towards virtual agents. The main inspiration for this study stems from gaze analysis of individuals with High Social Anxiety (HSA) towards the facial expressions of other individuals in social situations. Previous gaze studies showed that individuals with HSA averted their gaze when shown photos of individuals expressing positive or negative emotions [5, 10]. In such studies, static photos of people presenting happy, sad, and neutral facial expressions were commonly shown while gaze directions and fixations were measured. Thus, we adapted our study to find a relation between gaze direction on an embodied agent displaying emotions and the user's acceptance of the emotional display.

Wieckowski et al. explored variability in bias toward social stimuli, in the form of vigilant and avoidant attention, using eye gaze techniques instead of the traditional probe task technique often used to study attention bias in anxious youth with clinical SAD [11]. The visual dot probe task involves allowing users to select between two agent pairs (e.g., angry/neutral, happy/neutral) using their eye gaze. Participants show both avoidance and hyper-vigilance according to age group, agent display, and the passage of time during the trial. The bias is measured with the duration of fixations towards angry faces versus more pleasant faces such as neutral or happy faces: the fixation duration for neutral faces is subtracted from that for angry faces to create a negative or positive bias. Based on the above studies, we hypothesize that individuals with SAD might have two factors that affect their gaze patterns: the level of anxiety of the person when they are looking at the face, and whether or not they accept the facial expression of the embodied agent. We would also like to observe whether this affects normative individuals and whether it mimics those with SAD.

2.2. Hypotheses

Our hypotheses are as follows:

H1: The agent's facial expressions have an effect on the participant's self-reported arousal and valence.

H2: The user's acceptance or preference of the agent's emotional display can be observed in the eye fixation patterns on the agent.

H3: The overall anxiety state of the participants could alter the eye fixation patterns on the agent.

For H1, the Self-Assessment Manikin (SAM) was used to assess whether the facial expressions had an effect on the participant's affect, i.e., whether the avatar's expressions affected the user. Regarding H2, while participants were answering the questionnaires described in Sec. 3.6 about the agent's different emotional displays, not all of them accepted the agent's emotional display in the same manner; e.g., while some people highly disliked the happy face, others were comfortable with it. We assessed whether there is a pattern between acceptability of the emotional display and the number of fixations on the agent. For H3, we assessed whether there is a relation between the anxiety of the participant and their fixation behaviors on the agent's different body parts. We expected the more anxious participants to avoid the agent's face and eyes. The anxiety state in this case is the user's default state before and during the experiment.

3. Experiment

3.1. Participants

A total of 21 student volunteers in their early twenties participated in the study: 10 Japanese, 1 Kenyan, 1 German, 2 Nepali, 1 Colombian, 4 Chinese, 1 Thai, and 1 Malaysian. No participant had tried our system before. Participants were asked to wear glasses if their vision was poor. Sources of error were accounted for by removing three participants with missing data (e.g., the VIVE Pro Eye tracking was accidentally disabled for one of the faces). Three further participants who were extremely fatigued were excluded using a fatigue score in the pre-questionnaire. After the exclusions, the number of participants was 15. The experiment was approved by the ethics committee of our institution.

Figure 1: A diagram showing the resulting avatar (right) created when using an average face (left) on the Ready Player Me avatar creator (https://bit.ly/34C2G).

3.2. Experiment Design

We tested the participants' eye gaze patterns when presented with different facial expressions from a humanoid agent in VR. There were three conditions, corresponding to three facial displays expressed by the virtual agent: a happy, an angry, and a neutral facial expression. The conditions were presented in a within-subjects design, i.e., each participant saw all three faces. We chose to give the agent only facial expressions, to avoid confounding factors caused by other agent behaviors.

3.3. Procedure

The participants were exposed to each of the agent's facial expressions one minute at a time. There were three runs in total, one for each facial expression. The participants were seated in front of the agent so as to be at the same height as the agent, and faced the agent head-on without an angle. Before every run, the agent was adjusted to be the same height as the participant. The participants remained seated throughout the experiment and were encouraged to use only gaze and head movements. A full-body agent was used so that participants could freely choose whether to gaze at the agent's face, body, or outside of the agent completely. The participants answered questionnaires pre- and post-experiment and after each avatar was displayed; details are given in the measurements section.

3.4. Stimuli

The agent was designed using Ready Player Me [12], a tool that converts a photograph of a person into an avatar with similar facial features. It is used by players to make agents of themselves in-game and is currently most popular on platforms such as VR chatting programs. Ready Player Me is also equipped with the ability to map the user's emotions onto the agent via eye and facial tracking. The Ready Player Me avatar is based on FACS and is usually embodied by people in VR, but we controlled it using the animation module of Unity to gain more control over the experiment.

The possibility that the virtual agent's appearance falls into the uncanny valley is higher with agents that are hyper-realistic [13]. Thus we used a semi-realistic avatar. We used an average Asian face to accommodate the majority-Asian demographic involved in the experiment. Figure 1 shows the resulting avatar when inputting an average Asian face to the Ready Player Me interface. Ready Player Me was used because its low-poly characteristics make it more likely to be used by people in current virtual chats and metaverse settings. Though Ready Player Me characters are usually used as avatars in VRChat, we use one in this case to animate the agent, as if it were an example user in VR.

To animate the happy, angry, and neutral emotions, the Facial Action Coding System (FACS) was used [14]. FACS defines action units (AUs) used for coding facial movements without making inferences about the underlying emotions. It is a popular tool in emotion studies, used either to create faces with a certain expression or to interpret a facial expression. FACS is now incorporated in most VR chat avatars to enable them to express emotions by encoding certain AU movements, and Ready Player Me avatars come equipped with most of the values available in FACS. We focused on prototypical AUs according to Basic Emotion Theory [15] to animate the avatar, together with the guidelines described in Farnsworth's visual FACS guide [16]. For instance, to animate a happy face, we used AU 6 (cheek raiser, Fig. 2) and AU 12 (lip corner puller) with values of 1.0. The agent's default face, with some minor adjustments, was used to represent the neutral face: because Ready Player Me avatars are designed to look slightly happy by default, we adjusted the brow lowerer (AU 4) and lip corner depressor (AU 15) to bring the avatar back to a neutral state. The facial expressions are animated using blend shapes.

Three separate faces were shown to the same participant; we refer to them as three different trials, with questionnaires in between. All facial expressions start from the neutral expression, and the facial animation took one second to reach its maximum intensity. The avatar's expression then remains constant for the duration of the one-minute trial. We measure the participant's reactions within that minute for the purpose of counting fixations. We do not consider the initial animation to have a detrimental effect on the gaze patterns or number of fixations, so we did not account for this onset period when observing gaze patterns.
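The one-second onset described above can be sketched as a simple blend-shape ramp. This is an illustrative sketch only: the blend-shape names, the linear ramp profile, and the AUs chosen for the angry face are our assumptions (the paper animates a Ready Player Me avatar with Unity's animation module and does not enumerate the angry AUs).

```python
# Sketch of the expression onset: each expression's action units (AUs) ramp
# linearly from neutral (0) to full intensity (1.0) over one second, then
# hold for the rest of the one-minute trial.
# NOTE: blend-shape names, the linear ramp, and the angry-face AUs are
# illustrative assumptions, not the study's exact Unity animation curves.

EXPRESSIONS = {
    # Happy: AU 6 (cheek raiser) + AU 12 (lip corner puller), as in the paper.
    "happy": {"AU06_cheek_raiser": 1.0, "AU12_lip_corner_puller": 1.0},
    # Angry: prototypical AUs assumed here (not enumerated in the paper).
    "angry": {"AU04_brow_lowerer": 1.0, "AU23_lip_tightener": 1.0},
}

RAMP_SECONDS = 1.0    # time for the animation to reach maximum intensity
TRIAL_SECONDS = 60.0  # expression held constant for the rest of the trial


def blend_weights(expression: str, t: float) -> dict:
    """Return blend-shape weights at time t (seconds) since trial start."""
    scale = min(max(t, 0.0) / RAMP_SECONDS, 1.0)  # linear 0 -> 1, then clamp
    return {au: value * scale for au, value in EXPRESSIONS[expression].items()}


# Halfway through the ramp, the happy AUs are at half intensity:
print(blend_weights("happy", 0.5))
```

In a real Unity implementation these weights would be written to the avatar's skinned mesh each frame; here the function simply makes the ramp-then-hold timing explicit.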
Figure 2: A diagram showing action unit 6, cheek raiser (left two images; permission was obtained from iMotions to use the image). The rightmost two images show AU 6 applied to the Ready Player Me avatar used in the experiment as an agent (https://bit.ly/3bKN9a).

To reduce the chances of a participant experiencing the uncanny valley, the avatar's blinking was animated, with the interval between blinks randomized between 0.5 and 4 seconds. The agent's gaze was fixed during the run of the experiment. The agent was also given a breathing animation using Mixamo to give it a more realistic feel [17]. No interactions other than the varying facial expressions were added to the agent. In this experiment we focused on the relationship between the user's anxiety and the user's gaze patterns rather than on designing a complex interaction system, so a simple design was used. Future work will feature a more interactive avatar.

3.5. Calibration

To ensure that eye and facial feature tracking worked correctly, we used a cube display to calibrate and confirm the participant's gaze prior to the experiment's start. The agent's height was adjusted to match the participant's height in every trial. The agent's face was only revealed once the VIVE Pro Eye was calibrated, as shown in Fig. 3. We calibrated each participant's eye gaze pre-experiment using the VIVE's internal calibration software. The participant was then positioned to see the front view of the agent. However, participants were free to move their heads, gaze freely, and look either at the avatar's face or body for the duration of the one-minute trial. No background objects were visible, so that the participant would focus on the agent. Participants were seated during the experiment, received no instructions as to where to look when facing the agent, and were left to interact naturally with the agent using eye and head movements once the experiment started. The participant's location did not change.

Figure 3: (a) Cube used for calibrating the eye gaze pre-experiment; (b) height-adjusting the participant while obstructing the avatar's face; (c), (d) examples of the neutral and angry faces used for the experiment. The gaze ray is shown for illustrative purposes only and is omitted in the actual run of the experiment. The action unit (AU) values for both the happy and angry faces were set to 1 to obtain the maximum effect.

3.6. Measurements

Eye Metrics. We implemented a gaze ray to determine the intersection of the participant's gaze with the avatar. To count the collisions between the gaze ray and parts of the avatar, we added colliders to the face, eyes, and body (Fig. 5). When the ray did not collide with the avatar, the sample was recorded as 'other.' We defined a constant gaze on the same body part for 150 ms as a "fixation." The number of fixations on each body part was counted within the one minute the participant spent looking at the avatar. Fixations were collected for each facial expression (neutral, angry, and happy), and the process was repeated per participant. By analyzing the number of fixations on each body part, we expected to find patterns related to the participant's current state, the participant's preference for the face, and the facial expression that the avatar displayed.

Figure 5: How the number of fixations is measured using colliders placed on the avatar: (a) headf, the number of times the participant fixated on the avatar's head; (b) bodyf, the number of times the participant fixated on the avatar's body; (c) eyef, the number of times the participant fixated on the avatar's eyes; and (d) other, the number of fixations outside of the avatar. Collisions on the avatar were detected using a convex collider.

Additionally, we added a face mask, not visible to the user, to roughly measure the number of fixations on the face. We counted the number of fixations surpassing 150 ms on the colliders added to this face mask, and divided the fixations into an upper and a lower face region. Any collider above the lower eye was considered upper face, while colliders below were considered lower face. Following the literature, if the upper-face collisions dominate, the user accepts the avatar; if the lower-face collisions dominate, the user rejects the avatar.

Figure 4: The face mask (highlighted, left) used to detect collisions/fixations.

Questionnaires. The participants answered questionnaires at different points during the experiment. Some questionnaires, such as the SAM and the acceptability questionnaires, were repeated more than once. To identify at which point of the experiment the participant answered a questionnaire, we assigned codes as follows:

B: Before seeing any avatar.
AH: After seeing the happy facial expression agent.
AA: After seeing the angry facial expression agent.
AN: After seeing the neutral facial expression agent.

All three avatars were shown to each participant in a counterbalanced order. The following permutations were used: (AA, AH, AN); (AA, AN, AH); (AH, AA, AN); (AH, AN, AA); (AN, AA, AH); (AN, AH, AA).

Pre-questionnaire. Before the experiment, participants reported their demographics and their current fatigue and anxiety. They also reported their general anxiety level on a 9-point Likert scale.

Self-Assessment Manikin (SAM). A SAM questionnaire with a 9-point Likert scale was used to measure valence, arousal, and dominance; Fig. 6 shows a sample of the valence and arousal questionnaires used. SAM was given to the participant before the experiment and after every face, and was used to measure how the avatar affected the participant.

Figure 6: The [a] valence and [b] arousal questionnaires from SAM. [a] Valence: a higher value means that a person is content, while a lower value means that a person is upset. [b] Arousal: measures the state of alertness; a higher value indicates higher alertness.

Acceptability Questionnaire. This was a more detailed questionnaire about the avatar, asking how people felt about it. It consisted of several questions answered on a 9-point Likert scale, with scores from 1 to 9 to match the format of the SAM. The questionnaires were given the code AH, AA, or AN after each avatar with a different facial expression. From this questionnaire, we only used two questions for further analyses:

Q1: I felt comforted by the avatar.
Q2: I felt disturbed by the avatar.

The questionnaire was used to measure the participant's acceptance (contextual comfort) of the virtual agent's expression after each trial, and was answered three times per participant, once after viewing each expression of the agent for one minute. For instance, after the participant views the happy expression for one minute, the experiment stops and the participant answers the SAM and acceptability questionnaires; this procedure is then repeated for the other two expressions. From the answers, we created a variable known as acceptability:

Acceptability = Score(Q1) − Score(Q2)

A positive or zero value indicated that the agent was accepted, while a negative value indicated that the agent was rejected by the user. We calculated the acceptability per face per participant. This score was then used to test for correlations between the acceptability of each facial expression and the participant's fixations on different locations of the embodied agent (head, body, eyes, etc.). The questionnaires are not validated, which might be a limitation of the study.

4. Analysis and Results

Despite being given no instructions on where to direct their gaze, the majority of participants gazed at the agent's face or body. However, the gaze patterns changed according to the facial expression shown, the participant's acceptability score, and the anxiety score, as detailed below. Deviations away from the face changed according to the acceptability and anxiety scores.

SPSS was used to analyze the data. To test H1, we measured whether the agent's facial expression had an effect on the participant's affect, as follows. The SAM questionnaire was taken four times, labelled with the experiment codes described in Sec. 3.6: before the experiment (B), after the neutral agent (AN), after the happy agent (AH), and after the angry agent (AA), for each participant. A Shapiro-Wilk test showed a significant departure from normality, W(98) = 0.94, p < 0.01; W(98) = 0.94, p < 0.01; and W(98) = 0.95, p < 0.01 for valence, arousal, and dominance, respectively. We thus ran the Kruskal-Wallis rank-sum test as a non-parametric test to compare conditions B, AH, AA, and AN. The test showed a statistically significant difference in valence (χ²(3) = 21.48, p < 0.001) and arousal (χ²(3) = 8.867, p < 0.05) between the different embodied agent expressions when compared to the baseline.
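As a concrete sketch of this pipeline, the following SciPy snippet runs a Shapiro-Wilk normality check, the Kruskal-Wallis omnibus test, and the Benjamini-Hochberg-corrected pairwise rank-sum follow-ups used in the paper. The ratings, the random seed, and the effect sizes are simulated placeholders, not the study's data.

```python
# Sketch of the H1 analysis: Shapiro-Wilk normality check, Kruskal-Wallis
# omnibus test over the four conditions (B, AN, AH, AA), then pairwise
# Wilcoxon rank-sum tests with Benjamini-Hochberg (BH) adjustment.
# The ratings below are simulated placeholders, not the study's data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
valence = {
    "B":  rng.integers(4, 7, 15),   # baseline, 15 participants
    "AN": rng.integers(4, 7, 15),   # after the neutral agent
    "AH": rng.integers(6, 10, 15),  # after the happy agent (shifted up)
    "AA": rng.integers(1, 5, 15),   # after the angry agent (shifted down)
}

w, p_norm = stats.shapiro(np.concatenate(list(valence.values())))
h, p_kw = stats.kruskal(*valence.values())
print(f"Shapiro-Wilk W={w:.2f} (p={p_norm:.4f}); Kruskal-Wallis H={h:.2f} (p={p_kw:.4f})")


def bh_adjust(pvals):
    """Benjamini-Hochberg step-up adjustment of a list of p-values."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted, running_min = [0.0] * m, 1.0
    for rank in range(m - 1, -1, -1):            # from largest p to smallest
        i = order[rank]
        running_min = min(running_min, pvals[i] * m / (rank + 1))
        adjusted[i] = running_min
    return adjusted


pairs = [("B", "AN"), ("B", "AH"), ("B", "AA"), ("AN", "AH"), ("AN", "AA"), ("AH", "AA")]
p_adj = bh_adjust([stats.ranksums(valence[a], valence[b]).pvalue for a, b in pairs])
for (a, b), p in zip(pairs, p_adj):
    print(f"{a} vs {b}: BH-adjusted p = {p:.4f}")
```

The BH adjustment is written out explicitly here so the step-up procedure is visible; recent SciPy versions also provide it directly as `scipy.stats.false_discovery_control`.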
The mean rank valence scores were 41.07, 45.16, 28.67, and 62.5 for B, AN, AH, and AA, respectively. For arousal, the mean rank scores were 39.98, 41.61, 57.6, and 37.84 for B, AN, AH, and AA, respectively. The results for dominance showed no statistical significance.

Pairwise Wilcoxon rank-sum tests with Benjamini-Hochberg (BH) p-value adjustment were carried out for the SAM valence and arousal scores, since these showed significance. Experiment codes (B, AN, AH, AA) were used to represent the experiment stage as detailed in Sec. 3.6. The following score pairs were compared: (B, AN), (B, AH), and (B, AA), representing the comparison between the baseline state and the state of the participant after viewing agents of different emotions; this tests whether the different emotional agents had a significant effect on the participant's affect. The differences between the scores of the following face pairs were also compared: (AN, AH), (AN, AA), and (AH, AA).

The significant results for valence and arousal are summarized using starred brackets in Fig. 7, and were as follows. Valence was higher after introducing the happy agent (AH) compared to the baseline (B) (p < 0.05, Cohen's d = 0.25). Conversely, valence was lower than the baseline after introducing the angry agent (AA) (p < 0.01, Cohen's d = 0.25). There was no significant difference in valence when comparing the baseline condition to the neutral agent (AN). The valence score was also lower after viewing the angry face compared to the neutral face (p < 0.05, Cohen's d = 0.25), and conversely higher after the happy face compared to the neutral face (p < 0.05, Cohen's d = 0.25). The difference in valence between the AH and AA pair was significant (p < 0.01, Cohen's d = 0.25). For arousal, the differences were statistically significant for the following pairs: (B, AH) (p < 0.05, Cohen's d = 0.25); (AA, AH) (p < 0.01, Cohen's d = 0.25); and (AN, AH) (p < 0.01, Cohen's d = 0.25). N = 15 for all tests concerning valence and arousal.

Figure 7: Boxplot showing [a] valence and [b] arousal values entered by participants after viewing each facial expression (AA, AH, AN) compared to the baseline, B. The starred brackets show statistically significant pairings from the Wilcoxon rank-sum tests.

We then tested H2 and H3 to see whether anxiety and acceptability affect where the participant looks most on the avatar, and whether they avoid the agent's face or eyes as a result. We compared those results to previous findings for socially anxious individuals. Kendall's τb correlation was used for the correlation tests, since the sample size is small and the normality tests showed that the sample did not follow a normal distribution.

To test H2, when running the tests on all participants, we found no correlation between the acceptability score and the number of fixations on any portion of the agent. We then ran another Kendall's τb correlation using only the anxious participants. The results are summarized in Table 1. A statistically significant positive correlation was found between the participant's acceptability score and the number of fixations on the eyes (τb = 0.510, p < 0.05). The acceptability score was standardized between −1 and 1.

Table 1: Correlation values between acceptability score and fixations on different portions of the agent for N = 13 (anxious participants) (*: p < 0.05, **: p < 0.01).

                headf   bodyf   eyesf   other
bodyf           .04
eyesf           .51*    −.22
other           .59**   .21     .16
Acceptability   .31     .03     .51*    .22
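The correlation analysis above can be sketched with SciPy, whose `kendalltau` computes the tie-corrected tau-b variant by default. The acceptability scores and fixation counts below are invented placeholders, not the study's data.

```python
# Sketch of the H2/H3 correlation analysis: Kendall's tau-b between
# participants' acceptability scores and their fixation counts on the
# agent's eyes. All values are invented placeholders, not the study's data.
from scipy import stats

# Hypothetical standardized acceptability scores for 13 anxious participants...
acceptability = [0.6, -0.2, 0.8, 0.1, -0.5, 0.9, 0.3, -0.1, 0.7, 0.0, 0.4, -0.3, 0.5]
# ...and, in the same participant order, fixation counts on the agent's eyes.
eyesf = [14, 5, 18, 9, 3, 20, 12, 6, 16, 8, 11, 4, 13]

# scipy.stats.kendalltau computes the tau-b variant (tie-corrected) by default.
tau, p = stats.kendalltau(acceptability, eyesf)
print(f"tau-b = {tau:.3f}, p = {p:.4f}")  # positive tau: more eye fixations with higher acceptability
```

In the study this test was repeated for each collider region (headf, bodyf, eyesf, other) and, for H3, with the anxiety score in place of acceptability.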
To investigate H3, a Kendall’s 𝜏 b correlation was run [a] Angry Expression to determine the relationship between the participant’s anxiety score and amount of fixations on a certain por- tion of the avatar, as shown in Table 2, with 𝑁 = 45, regardless of the face used. There was a strong, posi- tive correlation between the participant’s anxiety score of and the number of fixations on the agent body per minute, which was statistically significant (𝜏 𝑏 = 0.30, [b] Neutral Expression 𝑝 < 0.001). Additionally, there was also a strong nega- Figure 8: Scatter plot illustrating the correlation between the tive correlation between the participant’s (𝜏 𝑏 = −0.22, anxiety score of the participant and the number of fixations 𝑝 < 0.01). The results are summarized in Table 2. detected on outside of the agent’s face, when observing the A Kendall’s 𝜏 -b correlation was run to determine the avatar with the (a) angry expression as opposed to number relationship between the anxiety score of the participant of fixations on the agent’s body, when observing the agent with the (b) neutral expression over one minute. and the number of fixations in a certain section of the agent, among 15 participants, per facial expressions. We observed if there are any patterns for specific facial ex- pressions. We found no significant correlations between tion between the anxiety score and the upper face with the anxiety score and the body fixations when present- statistical significance (𝜏 𝑏 = −0.263, 𝑝 < 0.05). For ing the participants with the happy face. However we the result per facial expression, only the happy expres- found a strong correlation between the number of body sion showed a strong negative correlation between the fixations (bodyf) and the anxiety score when participants anxiety and the upper face, with statistical significance were presented with the neutral face, which was statis- (𝜏 𝑏 = −0.408, 𝑝 < 0.05). tically significant (𝜏 𝑏 = 0.411, 𝑝 < 0.05). 
There was also a strong correlation between the number of fixa- tions detected outside the agent and the anxiety score 5. Discussion when participants viewed the angry face, there was also In H1 we suggested that the agent affects the user va- a significant correlation between the number of fixations lence and arousal. The results in Fig. 7 show that the outside the agent when participants were viewing the an- agent has a significant effect on the valence and arousal gry face, which was statistically significant (𝜏 𝑏 = 0.420, of the user according to the emotion displayed. This 𝑝 < 0.05). The values were Benjamini-Hochberg cor- shows that the model of the avatar actually works to rected. The results are summarized in the scatter plot affect the participant. The differences between valence detailed in Fig. 8. We ran two additional Kendall’s 𝜏 -b and arousal of the participant when viewing the neutral correlation tests to see if there was a correlation between agent compared to the baseline were not statistically sig- anxiety score and if the participant gazed at the upper nificant. This is expected because there was no emotion or lower part of the avatar’s face, first regardless of the conveyed with the neutral facial expression. The angry face with 𝑁 = 45, then per facial expression (angry, and happy facial expressions affected the valence and neutral, happy) with 𝑁 = 15. For the results, regardless arousal of the participants significantly compared to the of the face presented there was strong negative correla- baseline. This showed that there the facial expressions of the agents affected the participants’ valence and arousal. The angry facial expression also induced a lower valence Table 2 Correlation values between anxiety score and fixations on score while the happy facial expression induced a higher different portions of the agent for N = 45. (*: 𝑝 < 0.05, **: valence. 
Table 2
Correlation values between anxiety score and fixations on different portions of the agent for N = 45 (*: 𝑝 < 0.05, **: 𝑝 < 0.01).

                headf    bodyf    eyef     other
Anxiety Score   -0.12    .30**    -.22*    0.2
headf                    -.32**   .56**    -.03
bodyf                             -.50**   .24*
eyef                                       .27*

In H2, we predicted that the user's acceptability of the agent's emotions affects gaze patterns on the agent. SAD individuals do not want to confront the emotions of others, and thus exhibit behaviors such as avoiding the face or being hyper-vigilant to cope with the emotional display. In addition to the anxiety score, we therefore measured whether the participant's acceptance or rejection of the agent's emotions played a role in avoidant gaze patterns. We created the acceptability score as a measure of the individual participant's preference among the agent's facial expressions. Individuals with SAD are also anxious when they do not accept facial expressions. Thus, we measured whether there is a correlation between the acceptability score of anxious participants and the number of fixations on specific parts of the agent. A positive correlation was found between fixations on the agent's eyes and the acceptability score in anxious participants. This shows that even anxious participants are more likely to gaze at the agent's eyes if they accept the agent's facial expression. Conversely, the opposite holds: a person avoids the gaze of the agent if they find the agent's emotions unacceptable. This matches the literature on SAD individuals avoiding eye gaze for displays of affect, and supports H2.

The acceptability score can change how agents' faces are studied. It can be used in the future to study the root cause of SAD, or even cultural differences in social norms when both accepting and reacting to varying facial expressions. The acceptability measure can also serve as an extra factor in future SAD studies.

For H3, we hypothesized that the user's overall anxiety affects gaze patterns on the agent. In this study, we took the number of fixations on each part of the agent as an indication of acceptance or avoidance of the facial expressions of the avatar. On one hand, an increased number of fixations on the eyes or head suggests that the participant accepts the avatar, consistent with previous studies [5, 10]. On the other hand, an increased number of fixations on the body or outside of the avatar indicates that the user is avoiding the agent.

We also explored the relationship between agent avoidance and the participants' anxiety level. The initial anxiety score represents an approximation of the participants' overall state before facing the agent. We found significant correlations between the user's initial anxiety and body fixations, invariant of facial expression, as shown in Table 2. When analyzing the correlations on a per-facial-expression basis, there was a strong correlation between the anxiety score and the number of fixations on the body of the agent or outside of the body when presented with the neutral and angry expressions, respectively. When viewing the angry expression, the fixations were completely outside the body, showing that the participant is more likely to avoid the agent entirely the more negative the expression is. We also found a strong negative correlation between the participants' anxiety score and the number of fixations on the agent's eyes, invariant of facial expression. This indicates that the participant avoids the agent's eyes and is more likely to look at the agent's body or outside of the agent when anxious, matching the pattern of those with SAD as per [5, 10], and supporting H3.

When analyzing the correlation between the anxiety score and the user's fixations on the upper and lower parts of the face for all the faces, we found a strong negative correlation between the upper face and the anxiety score. This indicates that the anxious users avoided the agent's eyes, as cited in [5, 10]. When analyzing the correlations per facial expression, we found a strong negative correlation between user fixations on the upper face and the user's anxiety score only with the happy agent, probably because the participants were avoiding the face altogether for the neutral and angry agents. This indicates that anxious users are more likely to look at the agent's face, if it has positive affect, despite avoiding the eyes.

6. Conclusions

This study observed the correlation between the anxiety of normative users, their acceptability of the agent's emotions, and the respective gaze patterns on an agent with varying emotions. Our results suggest that individuals with in-the-moment anxiety have gaze patterns similar to those with SAD. The similarities between normative individuals facing anxiety in the moment, their gaze patterns, and their acceptability of the agent can unlock a better understanding of the way SAD individuals operate. The techniques used for SAD individuals can also be used to accommodate normative anxious individuals facing social situations in VR.

In this model, the participant is allowed to look away from the face to other sections of the embodied agent, including the body, which emulates an actual social situation in VR. Whether the users gazed at the upper or lower parts of the agent's face was also analyzed. The more negative the expression, the further the anxious participant strayed from the agent's eyes, then the face, and then the entire body. These findings are an important indication for designing future systems; e.g., nonverbal agents with positive affect might be a better choice for an anxious normative individual, as such individuals were still more likely to gaze at the face regardless of their anxiety. The studies also show that users are more likely to gaze at the agent's eyes if they accept the agent's emotional display.

A Metaverse VR avatar and a VIVE tracker were used in the experiment. The technique can easily be applied in a more ecologically valid setting to find avoidance patterns of anxious users in real time. This is useful for adjusting a non-verbal agent's expression to accommodate the user's anxiety and facial preference.
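Such real-time accommodation could be sketched as a toy decision rule over per-minute fixation counts. This is a hypothetical illustration only; the region names, the avoidance score, and the threshold are assumptions of the sketch, not part of the study's implementation.

```python
def avoidance_score(fixations):
    """Fraction of fixations that avoid the agent's face (body + outside),
    following the paper's reading of body/outside fixations as avoidance.
    `fixations` maps region names (assumed here) to per-minute counts."""
    on_face = fixations["eyes"] + fixations["face"]
    off_face = fixations["body"] + fixations["outside"]
    total = on_face + off_face
    return off_face / total if total else 0.0

def choose_expression(fixations, threshold=0.5):
    """Toy accommodation rule: switch the nonverbal agent to positive affect
    when the user's gaze looks avoidant, since anxious participants were
    still more likely to look at the face of the happy agent."""
    return "happy" if avoidance_score(fixations) > threshold else "neutral"

# Hypothetical reading: mostly body/outside fixations -> avoidant user.
expression = choose_expression({"eyes": 2, "face": 3, "body": 10, "outside": 5})
```

A deployed system would of course need smoothing over time and per-user calibration; the sketch only shows where the reported gaze measures could plug into agent control.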
Other sensors can also be added to find stronger patterns, e.g., a heart sensor or a facial tracker. The studies suffer some limitations due to the limited participant count and non-validated questionnaires.

References

[1] D. Monteiro, H.-N. Liang, J. Wang, L. Wang, X. Wang, Y. Yue, Evaluating the effects of a cartoon-like character with emotions on users' behaviour within virtual reality environments, in: 2018 IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR), IEEE Computer Society, 2018, pp. 229–236.
[2] E. Hasegawa, N. Isoyama, D. V. Monteiro, N. Sakata, K. Kiyokawa, The effects of speed-modulated visual stimuli seen through smart glasses on work efficiency after viewing, Sensors 22 (2022) 2272.
[3] H. Lee, H. Kim, D. V. Monteiro, Y. Goh, D. Han, H.-N. Liang, H. S. Yang, J. Jung, Annotation vs. virtual tutor: Comparative analysis on the effectiveness of visual instructions in immersive virtual reality, in: 2019 IEEE International Symposium on Mixed and Augmented Reality (ISMAR), IEEE, 2019, pp. 318–327.
[4] D. Monteiro, H.-N. Liang, H. Li, Y. Fu, X. Wang, Evaluating the need and effect of an audience in a virtual reality presentation training tool, in: International Conference on Computer Animation and Social Agents, Springer, 2020, pp. 62–70.
[5] K. Roelofs, P. Putman, S. Schouten, W.-G. Lange, I. Volman, M. Rinck, Gaze direction differentially affects avoidance tendencies to happy and angry faces in socially anxious individuals, Behaviour Research and Therapy 48 (2010) 290–294.
[6] J. H. Kwon, C. Alan, S. Czanner, G. Czanner, J. Powell, A study of visual perception: Social anxiety and virtual realism, in: Proceedings of the 25th Spring Conference on Computer Graphics, 2009, pp. 167–172.
[7] R. E. Jack, P. G. Schyns, The human face as a dynamic tool for social communication, Current Biology 25 (2015) R621–R634. URL: https://www.sciencedirect.com/science/article/pii/S0960982215006557. doi:10.1016/j.cub.2015.05.052.
[8] H.-S. Cha, S.-J. Choi, C.-H. Im, Real-time recognition of facial expressions using facial electromyograms recorded around the eyes for social virtual reality applications, IEEE Access PP (2020) 1–1. doi:10.1109/ACCESS.2020.2983608.
[9] L. A. Rutter, D. J. Norton, T. A. Brown, Visual attention toward emotional stimuli: Anxiety symptoms correspond to distinct gaze patterns, PLoS ONE 16 (2021) e0250176.
[10] M. Garner, K. Mogg, B. P. Bradley, Orienting and maintenance of gaze to facial expressions in social anxiety, Journal of Abnormal Psychology 115 (2006) 760–770. doi:10.1037/0021-843x.115.4.760.
[11] A. T. Wieckowski, N. N. Capriola-Hall, R. Elias, T. H. Ollendick, S. W. White, Variability of attention bias in socially anxious adolescents: differences in fixation duration toward adult and adolescent face stimuli, Cognition and Emotion 33 (2019) 825–831.
[12] Wolf3D, Cross-game avatar platform for the metaverse, 2022. URL: https://readyplayer.me/, accessed January 14th, 2022.
[13] M. Shin, S. J. Kim, F. Biocca, The uncanny valley: No need for any further judgments when an avatar looks eerie, Computers in Human Behavior 94 (2019) 100–109.
[14] P. Ekman, W. P. Friesen, Measuring facial movement with the Facial Action Coding System, in: P. Ekman (Ed.), Emotion in the Human Face, second ed., Cambridge University Press, 1982, pp. 178–211.
[15] P. Ekman, E. Rosenberg, What the Face Reveals: Basic and Applied Studies of Spontaneous Expression Using the Facial Action Coding System (FACS), second ed., Oxford University Press, 2005.
[16] B. Farnsworth, Facial action coding system (FACS) – a visual guidebook, 2021. URL: https://imotions.com/blog/facial-action-coding-system/.
[17] Mixamo, 2022. URL: https://www.mixamo.com/#/.