=Paper=
{{Paper
|id=Vol-3669/paper12
|storemode=property
|title=Self-avatar representation matters: Deciphering user immersion in VR games through Steam reviews
|pdfUrl=https://ceur-ws.org/Vol-3669/paper12.pdf
|volume=Vol-3669
|authors=Dion Deng,Mila Bujić,Wang Chi Lee,Ming Rui Li,Juho Hamari
|dblpUrl=https://dblp.org/rec/conf/gamifin/DengBLLH24
}}
==Self-avatar representation matters: Deciphering user immersion in VR games through Steam reviews==
Self-avatar representation matters: Deciphering user
immersion in VR games through Steam reviews
Dion Deng1, Mila Bujić1, Wangchi Lee2, Mingrui Li3, Juho Hamari1
1 Tampere University, Tampere 33100, Finland
2 The Hong Kong University of Science and Technology, Hong Kong, China
3 Hong Yi Cambridge International School, Changsha, China
Abstract
This study critically examines the influence of self-avatars on user immersion in VR games by
analyzing user reviews from Steam's top 100 VR games. Utilizing the BERT algorithm for text
classification and detailed manual coding on avatar representations, the research addresses the effects
of presence, perspective, visual features, and interactivity of avatars on immersion. Although the
Mann-Whitney U test results were non-significant, effect size analyses revealed practical implications
of avatar characteristics on user immersion. Notably, the study identifies key trends in avatar design
within popular VR games, such as the predominance of first-person perspectives and the relative
importance of hand representations over facial features. These findings suggest a need for a shift in
focus in avatar research towards more user-relevant features. This innovative approach, using user-
generated content, marks a significant departure from traditional experimental methods. It offers a
richer, more ecologically valid understanding of user experiences in VR. The study's insights have
significant implications for future avatar design and research.
Keywords
Avatar, virtual reality, product review, text classification, BERT, content analysis
1. Introduction
In immersive Virtual Reality (VR), self-avatars are the users’ digital embodiment and play an important role in users’ interaction and experience. The design and features of self-avatars are not merely aesthetic choices or tools to operate within virtual environments, but are also instrumental in determining the degree of immersion [1].
The immersion experience in VR is significantly influenced by the user's ability to identify with their avatar [2]. This identification is deeply rooted in the concept of presence, the sensation of being physically located in the virtual environment [2]. One factor that is crucial to the immersion experience is the similarity between the user’s visual appearance and their avatar, which can encompass physical resemblance and behavioral and emotional congruence [3],[4].
In this context, understanding how different attributes of self-avatars, such as their presence, perspective, visual features, and interaction capabilities, influence the user experience becomes paramount. Avatar research in VR has primarily relied on experimental methods, utilizing controlled lab environments to study user interactions and responses [5],[6],[7]. While these studies have provided valuable insights, their limited scope and controlled settings often fail to capture the diverse, real-world experiences of users. Such methods may not fully encompass the wide range of user backgrounds, preferences, and naturalistic behaviors that occur in everyday gaming contexts.
Our research adopts a novel approach by analyzing player feedback through game reviews in the context of avatars in VR. This method leverages the spontaneous, authentic, and varied opinions of the gaming community, providing a broader and more ecologically valid understanding of how self-avatars influence player experiences in real-life settings. User reviews, as a form of naturalistic data, offer insights into the aspects of self-avatars that resonate the most with players and significantly impact their sense of immersion and overall experience [8].
This study explores the role of self-avatars in VR games through reviews, focusing on how their presence, perspective, visual features, and interactive capabilities impact user immersion. Our primary data source is user-generated reviews from the most popular VR games on the Steam platform for online distribution of games (Steam Inc). The methodology
8th International GamiFIN Conference 2024 (GamiFIN 2024), April 2-
5, 2024, Ruka, Finland.
xiaohang.deng@tuni.fi (D. Deng); mila.bujic@tuni.fi (M. Bujić);
juho.hamari@tuni.fi (J. Hamari)
0009-0006-1626-2365 (D. Deng); 0000-0002-4171-4806 (M.
Bujić); 0000-0002-6573-588X (J. Hamari)
© 2024 Copyright for this paper by its authors. The use permitted under
Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073
involves a two-pronged approach: first, employing the BERT (Bidirectional Encoder Representations from Transformers) algorithm to identify and extract reviews specifically mentioning immersion; and second, utilizing a detailed codebook to manually code avatar features within these games. Through this innovative methodology, our study aims to draw a detailed picture of the current landscape of self-avatar design in VR games and how it aligns with users’ experiences of immersion.
1.1. The effects of self-avatar in VR
Self-avatars not only serve as digital representations of players but also significantly influence their psychological experiences in VR [9],[10]. This section, grounded in extensive literature, examines how different aspects of self-avatar representations influence player immersion, aligning with our four leading research questions:
RQ1: How do presence of self-avatar and perspective affect the proportion of positive reviews about immersion of VR games?
RQ2: How do self-avatar’s hand representation and body connectivity affect the proportion of positive reviews about immersion of VR games?
RQ3: How do visual features of self-avatar, including detail level, anthropomorphism, skin color, and body size, affect the proportion of positive reviews about immersion of VR games?
RQ4: How does visual feedback of self-avatar interaction affect the proportion of positive reviews about immersion of VR games?
Presence and Perspective (RQ1): The incorporation of a self-avatar in VR is a fundamental element that significantly enhances the user's sense of embodiment, crucial for fostering a deep sense of presence and immersion within the virtual environment. Embodiment is central to VR experiences, directly influencing users’ engagement and interaction within the virtual world [11],[12].
The choice of perspective, particularly between first-person and third-person views, further modulates this experience of embodiment. First-person perspectives are often associated with a higher sense of embodiment and presence, as they more closely mimic the natural human perception of their embodied perspective, offering a direct and uninterrupted view of the virtual environment from the avatar's eyes [13]. This immersive perspective allows users to directly align their physical movements with those of their avatars, creating a seamless and intuitive interaction that enhances the feeling of being in the virtual world [1].
In contrast, third-person perspectives provide a different type of interaction. While they offer a broader view of the avatar and its surroundings, they can sometimes create a sense of detachment, as the users view their avatars from an external standpoint [14]. However, this perspective can also be beneficial in certain scenarios, such as strategy games or situations where spatial awareness is key. Researchers have found that the choice of perspective can significantly affect how users process information and interact within VR [15].
Hand Representation and Body Connectivity (RQ2): Realistic hand representations can significantly boost the sense of agency and control, a key aspect of immersion. For example, when users see their virtual hands synchronized with their real movements, it enhances the sense of embodiment, resulting in a more engaging and intuitive VR experience [9].
The degree of body connectivity also impacts immersive experience. A fully connected avatar, as opposed to a disembodied hand or partial body representation, can increase the sense of bodily presence in the virtual environment [13]. This sense of a complete body in virtual space is crucial for a coherent experience, as it aligns with our natural perception of our bodies in the real world. The integration of proprioceptive feedback, where the user's movements are accurately reflected in the avatar, further enhances this sense of presence and embodiment [1].
Visual Features of Self-Avatars (RQ3): The visual features of self-avatars, including their level of detail, anthropomorphism, skin color, and body size, play a pivotal role in shaping user experiences in VR. These features significantly influence the degree of identification a user feels with their avatar, which in turn affects their immersion and overall experience.
The level of detail in an avatar’s appearance can dramatically affect the user's sense of presence and immersion. High-resolution textures and detailed avatars can enhance the presence and engagement of the VR experience, leading to a stronger connection between the user and the virtual environment [4]. Detailed avatars enable users to identify more closely with virtual selves, fostering stronger embodiment.
Avatars with human-like features can enhance the social presence and emotional connection in VR, especially in multiplayer or social VR settings [16]. This connection can be particularly potent when avatars exhibit subtle human-like movements and micro-expressions.
The representation of diverse skin colors in avatars allows users from different backgrounds to find avatars that resemble and represent them, enhancing their sense of identity within the VR world [10]. The choice of an avatar’s skin color can also impact the level of empathy and connection users feel with the virtual character.
The congruence of avatar size with the user's real body size can affect how users perceive spatial relationships and interact within the virtual reality environment [4]. This aspect is particularly important in applications where accurate spatial perception is crucial, such as in training simulations, modelling and other similar visualizations.
Our RQ3 explores the principles of Representation Theory [17], which posits that the effectiveness of information systems, such as VR environments, is significantly enhanced when they accurately represent real-life elements. By mirroring real-world characteristics, these avatars serve as authentic extensions of the user's identity within the virtual world. This fidelity in representation fosters a deeper connection and immersion, as users find it easier to
relate to and engage with avatars that closely resemble actual human features and behaviors.
Visual Feedback and Interaction in Self-Avatars (RQ4): The way self-avatars interact with the virtual environment and the corresponding visual feedback they provide are critical in shaping a user’s immersion and overall experience in VR. This aspect of avatar design, encompassing the responsiveness and visual realism of avatar interactions, significantly contributes to the sense of presence and engagement within the virtual world [14].
Realistic interaction mechanics, such as accurate hand tracking and responsive movement, can deepen the user's sense of embodiment and agency within the virtual environment [1]. This visual realism aids in bridging the gap between the worlds, making the VR experience more intuitive and immersive.
The visual feedback from an avatar's actions, such as changes in the environment or reactions from other virtual entities, further amplifies the immersive experience. The feedback provides users with tangible consequences of their actions in VR, reinforcing the sense of being an active participant in the virtual world [12].
1.2. Game reviews as the data source
Avatar research has relied heavily on experimental methods to understand user behavior and experience. While these methods have been instrumental in advancing the field, they come with inherent limitations. Oulasvirta et al. (2016) [5] highlight that experimental settings often fail to replicate the complexity and variability of real-world scenarios, potentially leading to findings that lack ecological validity. Experiments typically involve small, non-representative samples, limiting the generalizability of the findings [6]. The controlled nature of these studies can also result in responses that do not fully capture the spontaneous and authentic reactions of users in naturalistic environments.
Recognizing these limitations, research on avatars has increasingly turned to alternative methods. One method is the analysis of user-generated content, such as user reviews. These reviews offer a rich, unfiltered, and authentic source of user feedback. Unlike the responses elicited in experimental settings, user reviews provide insights into the real-world experiences of a broad and diverse user base. This shift is supported by the growing understanding within avatar studies that user experiences are multi-faceted and context-dependent [8].
Analyzing user reviews addresses several of the limitations inherent in experimental methods. It provides access to a diverse user sample, offering a level of representativeness that is often unattainable in lab-based studies. This is particularly important in VR, where user diversity significantly impacts interaction patterns and experiences [7]. As these reviews are generated in naturalistic settings, they offer a more accurate reflection of how users interact with and perceive technology in their daily lives, thus providing ecological validity that experimental studies might lack.
1.3. Proportion of reviews as a measure
Our methodological choice to quantify the proportion of user reviews addressing positive immersion experience as an indicator of the games' performance on immersion is underpinned by rigorous academic precedent. Quantitative content analysis of user-generated reviews is a well-established approach in the literature, which allows for the objective, systematic, and quantitative examination of communication content [18],[19]. By focusing on the proportion of reviews that mention positive user experience of immersion, our study adopts a metric of salience that has been academically recognized as indicative of the importance or prominence of that topic within the consumer community [20].
This approach is grounded in the notion that the frequency of comments on a specific feature can be reflective of its significance to the user base, a methodological assumption that is supported by the Agenda-Setting Theory in mass communication [21]. This theory posits that the frequency of issues covered by the media influences the perceived importance of these issues among the public [21]. In the context of online reviews, the proportion of mentions can similarly set an 'agenda' by highlighting the features most impactful to users' experiences [22].
The reliance on proportional data is bolstered by research that suggests the volume and valence of mentions in reviews can act as proxies for consumer attitudes and satisfaction levels [23]. The significance of this method is further emphasized in [24],[25], which demonstrate a strong correlation between the proportion of review mentions of certain attributes and the consumer ratings of products.
By utilizing the proportion of topic mentions rather than the presence or absence of such mentions, we mitigate the risk of over-representing outlier opinions and instead capture a more balanced view of the collective sentiment. This is in line with previous findings [26], which highlight the robustness of proportional measures in depicting a more accurate reflection of the consensus among the user base.
Considering these theoretical and empirical foundations, our methodology is academically sound and provides a nuanced lens through which to assess the collective evaluation of a game's specific features by its users. The proportion of topic-specific reviews thus serves as a quantitative measure that is indicative of the overall perceived immersion. The method of conducting quantitative analysis with the proportion of positive reviews on immersion is introduced in the next section.
2. Methods
2.1. Data collection
From the Steam store, we narrowed the games by VR support (VR only) and language (English supported) to make sure our data is highly related to our research questions about VR and to avoid difficulties caused by multi-language text in the algorithm training. 2,938
games fulfilled the criteria. We selected the top 100 games by number of user reviews as our sample, which gave us sufficient review data for the language model training.
To collect the reviews, we used Steam’s official API, which provided data on all the reviews for games on Steam, including the review text, publication date, the reviewers' time spent on the game, etc. [7]. Our data collection was conducted on 25 Oct 2023, and a total of 282,847 reviews from 100 games was collected.
2.2. Data annotations
2.2.1. Game reviews
To train a text classification algorithm to detect reviews that reported positive user experience of immersion, we randomly selected 2,500 reviews (25 for each game) from our dataset as the training data. We annotated the reviews related to positive user experience of immersion as 1, and others as 0, guided by a codebook (Table 1). In constructing this codebook, we aligned our inclusion and exclusion criteria with established games and theories, and prior empirical studies about immersion. Our purpose was to encompass a comprehensive range of elements that contribute to immersive experiences, as delineated by the following five aspects:
Terminology (Immersion and its related terms): Lombard and Ditton's (1997) [27] theory of presence emphasizes the psychological state where users feel immersed in a virtual environment. By focusing on direct references to 'immersion' or its synonyms, the codebook aligns with this theoretical framework, capturing users' perceived sense of being in the virtual world. For example, "The immersion in this VR game is really amazing; I totally forgot about the outside world" was included, and "I spent a lot of time immersed in this game" was excluded, because there "immersed" refers to time spent and does not describe the sense of immersion.
Specificity (Detailing specific experiences or emotions): Csikszentmihalyi’s (1991) [28] concept of flow in gaming posits that immersion is often accompanied by detailed descriptions of experiences and emotions. This criterion ensures that the reviews analyzed are not just superficial mentions of immersion but reflect a deeper, flow-like engagement with the VR game. For example, we included "In this game, I totally felt the presence of being the character, as if I was really in that world.", and excluded "I really enjoy this game, it's fun to play," which merely shares a general experience.
Game Design (Influence of game design elements on immersion): The Mechanics-Dynamics-Aesthetics (MDA) framework proposed by Hunicke, LeBlanc, and Zubek (2004) [29] illustrates how game mechanics influence player dynamics, including immersion. This criterion captures how users perceive and articulate the influence of game design on their immersive experiences. For example, we included "The 3D sound effects in the game made me feel like I was truly in another world" and excluded "The graphics of the game are beautiful," which did not mention how it affects immersion.
Interactivity (Enhancement of immersion through game's interactive features): The emphasis on interactivity aligns with Witmer and Singer’s (1998) [30] Immersion Tendency Questionnaire, which suggests that interactive features of a game significantly contribute to the immersion experience. By coding for mentions of enhancement of immersion through interactivity, the codebook captures this aspect of the VR experience. For example, "The gesture control made me completely immersed in the game's actions" was included, and "The game controls are smooth," which did not mention immersion, was excluded.
Real-World Comparison (Comparisons with real-world sensations): This criterion is based on the concept of 'Place Illusion' in VR [31], which argues that realistic, immersive VR experiences often lead to comparisons with real-world sensations. By coding for such comparisons, the codebook identifies instances where the immersive experience is strong enough to elicit real-world analogies, indicating a high level of presence. For example, we included "When I put on the VR headset, I completely forgot the outside world, as if I was in the game.", and excluded "This game made me forget my daily troubles," which did not specifically involve the immersive experience.
Finally, 146 (5.84%) reviews were annotated as 1 (related to the positive user experience of immersion).
Table 1
Annotation criteria of game reviews
Criteria | Inclusion | Exclusion
Terminology | direct references to "immersion" or its synonyms, indicating an explicit discussion of the immersive experience. | mention "immersion" only in a literal or non-contextual sense
Specificity | detailing specific experiences or emotions that convey a sense of immersion | general personal experiences not directly related to immersion
Game Design | analyze the influence of game design elements on immersion | general assessments of game design lacking a direct connection to immersion
Interactivity | emphasize the enhancement of immersion through the game's interactive features | focusing solely on interactivity without linking to immersion
Real-World Comparison | compare the gaming experience to real-world sensations to accentuate immersion's intensity | comparisons with the real world that do not specifically emphasize immersion
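The sampling step described in Section 2.2.1 (25 randomly chosen reviews per game, 2,500 in total) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names `sample_per_game` and `positive_share`, the `(game_id, text)` tuple layout, the fixed seed, and the toy data are all hypothetical.

```python
import random
from collections import defaultdict


def sample_per_game(reviews, per_game=25, seed=0):
    """Draw up to `per_game` reviews at random from each game.

    `reviews` is an iterable of (game_id, text) pairs; this layout is a
    hypothetical stand-in for the collected review data.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    by_game = defaultdict(list)
    for game_id, text in reviews:
        by_game[game_id].append(text)

    sample = []
    for game_id in sorted(by_game):
        texts = by_game[game_id]
        k = min(per_game, len(texts))  # a game may have fewer reviews than requested
        sample.extend((game_id, t) for t in rng.sample(texts, k))
    return sample


def positive_share(labels):
    """Share of annotated reviews labelled 1 (positive immersion experience)."""
    return sum(labels) / len(labels)


# Toy data: 100 games with 40 placeholder reviews each (not the study's data).
toy_reviews = [(g, f"review {i} of game {g}") for g in range(100) for i in range(40)]
training_sample = sample_per_game(toy_reviews)

# The paper reports 146 of the 2,500 sampled reviews labelled 1.
reported_share = positive_share([1] * 146 + [0] * 2354)
```

With the reported counts, `reported_share` reproduces the 5.84% figure given above; the low share of positive labels is the class imbalance that the later per-class metrics have to account for.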
2.2.2. Game data
To classify avatar visual representations in the 100 VR games, we designed another codebook based on established games and theories, and prior empirical studies about avatar representation and embodiment. To observe self-avatar representations in each game, we searched YouTube with the keyword “game name + full gameplay” and watched at least two gameplay videos, each for a minimum of five minutes. After fully understanding every feature of the self-avatar representation, we annotated them based on the codebook (Table 2). For a further explication and descriptive data of the avatars’ feature annotation, see section 3. Results.
In our annotation of avatars within the selected VR games, we paid particular attention to the aspect of personalization. For each feature of the avatar, such as skin color and body size, we assessed whether the game allowed players to personalize these elements. If a game offered the option for players to customize these aspects of their avatar, we labeled it as 'personalized'. Most of the games in our study do not have personalized self-avatars. Consequently, in our quantitative analysis, any data related to these customizable features were treated as missing due to insufficient data.
Table 2
Codebook of the avatar representations
Variables | Annotation rules
Presence | Visible or invisible avatar
First-person | First-person perspective support or not
Third-person | Third-person perspective support or not
Body type | Hands-only avatar or full-body/upper-body avatar
Hand representation | Realistic or unrealistic hand representation
Hand accessories | With or without accessories on hands
Hand-body connection | Hand-body connects or not
Detail level | Include detailed textures or not
Anthropomorphism | With or without anthropomorphic features
Skin color (race) | Dark or light-skin
Skin color (transparency) | Skin color is semitransparent or non-transparent
Body size | Congruent avatar model size with human or incongruent (much bigger/smaller) avatar
Interactivity | Provide visual feedback caused by the avatar’s interactivity or not
2.2.3. Reliability of annotation
To ensure the accuracy and consistency of our manual coding process, we conducted a preliminary annotation exercise with three independent coders. For the review data, three coders were tasked with analyzing a subset of 100 reviews, while for the avatar features, three coders each coded the characteristics of 10 games. Inter-rater reliability was evaluated using Cohen’s κ, which measures the level of agreement between coders beyond what would be expected by chance [32]. A score of .81 for the review data indicated good agreement, whereas for the avatar features a score of .68 suggested substantial agreement. Discrepancies in coding were reviewed in a series of consensus meetings where the coders discussed each disagreement until a unanimous decision was reached.
2.3. Topic detection
To classify the sentiment of reviews, we employed a state-of-the-art text classification algorithm, BERT (Bidirectional Encoder Representations from Transformers) [33]. BERT is particularly well-suited for natural language processing tasks due to its deep learning architecture that considers the context from both the left and the right side of a token.
To train the model, we utilized a labeled dataset, where each review was pre-classified as either positive (1) or negative (0) based on the criteria in the codebook. The model was fine-tuned on this dataset, iterating through the corpus to learn the complex patterns associated with the sentiment expressed in gaming reviews.
We assessed the performance of our final BERT model using several evaluation metrics. The Receiver Operating Characteristic Area Under the Curve (ROC AUC) was 0.9679, indicating an excellent ability of the model to discriminate between the positive and negative classes. The ROC AUC is a performance measurement for classification problems at various threshold settings, where a score of 1 represents a perfect model and a score of 0.5 represents a model with no discriminative power.
In terms of precision, recall, and F1-score, which are critical metrics for classification problems, our model achieved the following results:
For class 0 (negative reviews), the model had a precision of 0.9916, meaning that 99.16% of the negative classifications were correct. The recall was 0.9834, indicating that 98.34% of the actual negative instances were correctly identified. The F1-score, the harmonic mean of precision and recall, was 0.9875.
For class 1 (positive reviews), the model achieved a precision of 0.7647 and a recall of 0.8667, resulting in an F1-score of 0.8125. This shows that while the model was slightly less precise in identifying positive reviews, it was robust in retrieving a high proportion of all relevant instances.
The overall accuracy of the model was 0.9766, demonstrating that it correctly classified 97.66% of the reviews. The macro average F1-score, which gives equal weight to both classes, was 0.9000, and the
weighted average F1-score, which accounts for class imbalance, was 0.9772.
3. Results
As the size is relatively small for our dataset, we selected the Mann-Whitney U test as a primary method of analysis. This non-parametric test is best suited to compare differences between two independent groups when the sample sizes are small and the data are not assumed to be normally distributed. We also utilized Cliff’s Delta as an effect size measure that gives a more relevant estimate of the scale of observed disparities in nonparametric situations. This approach supplements what we found with substantial practical understanding beyond just statistical significance. The results showed that some avatar features have impacts on perceived immersion. In particular, the realism of hand representation had a large effect size, and thus it can be concluded that how hands are represented in video games may considerably contribute to immersive experience. In contrast, detailed textures and skin color transparency reflected small to medium effect sizes, showing that they had more modest impacts on immersion. The size of the avatar and the provision of visual feedback from interactions also had medium effect sizes, indicating their considerable role in improving immersive experience. These results indicate that certain avatar features may have different effects on the immersion of players in VR spaces. Table 3 summarizes the details for each feature in terms of statistics and effect sizes.
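To make the two statistics named above concrete, the following sketch computes a Mann-Whitney U statistic and Cliff's Delta directly from their pairwise definitions. It is an illustration under stated assumptions rather than the analysis code used in the study: the function names and the toy per-game proportions are hypothetical, ties are counted as half a pair in U (a common convention), and no p-value is computed (a library routine such as `scipy.stats.mannwhitneyu` would normally supply that).

```python
def mann_whitney_u(x, y):
    """U statistic for sample x: pairs (xi, yj) with xi > yj, ties counted as 0.5."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


def cliffs_delta(x, y):
    """Cliff's Delta: P(x > y) - P(x < y), estimated over all cross-group pairs."""
    greater = sum(1 for xi in x for yj in y if xi > yj)
    less = sum(1 for xi in x for yj in y if xi < yj)
    return (greater - less) / (len(x) * len(y))


# Toy per-game proportions of positive immersion reviews (made-up values).
group_1 = [0.032, 0.041, 0.018, 0.027, 0.050]
group_2 = [0.026, 0.015, 0.022]

u_1 = mann_whitney_u(group_1, group_2)   # 13 of the 15 pairs favour group_1
delta = cliffs_delta(group_1, group_2)   # (13 - 2) / 15
```

Because both quantities are built from the same pairwise comparisons, Cliff's Delta depends only on how often one group exceeds the other, not on how large the differences are, which is why it suits small samples of skewed proportions.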
Table 3
Summary of the data analyses
Variable | Group 1: median proportion of positive reviews on immersion | Group 2: median proportion of positive reviews on immersion | Mann-Whitney U | Asymp. Sig. | Effect size (Cliff's d)
Presence of Avatar | Visible: 3.2% (N = 84, SD = .025) | Invisible: 2.6% (N = 16, SD = .017) | 765 | .384 | .138*
First-Person Perspective | Supported: 2.7% (N = 92, SD = .025) | Not supported: 3.7% (N = 8, SD = .017) | 481 | .153 | .307**
Third-Person Perspective | Supported: 2.5% (N = 19, SD = .178) | Not supported: 3.0% (N = 81, SD = .253) | 882 | .325 | .146*
Body Type | Hands-only: 2.6% (N = 24, SD = .198) | Full-body/upper-body: 2.5% (N = 61, SD = .269) | 840 | .294 | .148*
Hand Realism | Realistic: 2.4% (N = 49, SD = .025) | Unrealistic: 2.6% (N = 34, SD = .026) | 890 | .068 | .601***
Hand Accessories | With accessories: 2.5% (N = 28, SD = .027) | Without accessories: 2.4% (N = 29, SD = .022) | 366 | .528 | -.099
Hand-Body Connectivity | Connected: 2.3% (N = 18, SD = .020) | Not connected: 2.6% (N = 65, SD = .027) | 666 | .374 | .138*
Detail Level | Includes detailed textures: 3.2% (N = 22, SD = .028) | Excludes detailed textures: 2.5% (N = 63, SD = .024) | 590 | .304 | -.149*
Anthropomorphism | With anthropomorphic features: 2.8% (N = 75, SD = .023) | Without anthropomorphic features: 3.3% (N = 25, SD = .028) | 1037 | .431 | .106*
Skin Color - Light vs. Dark | Light skin: 3.1% (N = 13, SD = .032) | Dark skin: 2.1% (N = 6, SD = .013) | 20 | .106 | -.487**
Skin Color - Transparency | Non-transparent: 2.5% (N = 40, SD = .026) | Semi-transparent: 2.0% (N = 4, SD = .024) | 172 | .403 | .194*
Avatar Size | Congruent size: 3.0% (N = 15, SD = .021) | Incongruent size: 2.2% (N = 6, SD = .009) | 26 | .154 | -.422**
Interaction Visual Feedback | With visual feedback: 4.3% (N = 79, SD = .026) | Without visual feedback: 2.5% (N = 6, SD = .018) | 344 | .068 | .451**
Note. * small effect size (0.1 < |d| < 0.3), ** medium effect size (0.3 < |d| < 0.5), *** large effect size (|d| > 0.5).
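The U statistic and Cliff's d are linked by the standard identity d = 2U / (n1 · n2) − 1, and the thresholds in the note map |d| to a verbal label. The sketch below applies both to three rows of Table 3 as a worked example. It rests on assumptions: the function names are hypothetical, the "negligible" label for |d| ≤ 0.1 is not part of the note, and the table does not state which group's U is reported, which determines the sign of d. The check is illustrative and is shown for these three rows only.

```python
def cliffs_delta_from_u(u, n1, n2):
    """Cliff's d from a Mann-Whitney U statistic and the two group sizes."""
    return 2.0 * u / (n1 * n2) - 1.0


def effect_size_label(d):
    """Verbal label using the thresholds given in the note to Table 3."""
    size = abs(d)
    if size > 0.5:
        return "large"
    if size > 0.3:
        return "medium"
    if size > 0.1:
        return "small"
    return "negligible"  # label assumed here; the note does not name this band


# (U, N of Group 1, N of Group 2) as printed in Table 3.
rows = {
    "Presence of Avatar": (765, 84, 16),
    "Skin Color - Light vs. Dark": (20, 13, 6),
    "Avatar Size": (26, 15, 6),
}

recomputed = {name: round(cliffs_delta_from_u(*values), 3) for name, values in rows.items()}
labels = {name: effect_size_label(d) for name, d in recomputed.items()}
```

For these rows the recomputed values are .138, -.487, and -.422, matching the printed effect sizes and their small, medium, and medium markers.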
4. Discussion
By analyzing user reviews from a selection of the most popular VR games on Steam, we found insights that challenge conventional understandings and potentially inspire novel perspectives in avatar research in the context of immersive virtual reality. Our discussions explore the multifaceted findings of our study, interpreting the implications of our results.
4.1. Overall results of the Mann-Whitney U and effect sizes
The predominance of non-significant results in our Mann-Whitney U tests initially appears to suggest a limited influence of avatar characteristics on player immersion. However, focusing on the effect sizes, rather than solely on statistical significance, offers a more nuanced understanding of our findings. Effect sizes provide insight into the magnitude of differences, and are less impacted by sample size, which is particularly informative in studies like ours where the sample size is relatively small [32]. However, it is important to note that the precision and confidence intervals of these effect sizes are still influenced by the sample size.
These specific effect size results from our study carry important implications for future avatar research and design. The impact of avatar presence and perspective highlights the need for more targeted research to understand which aspects of avatar design resonate with different user demographics. For instance, future studies might explore how individual player characteristics, such as prior VR experience or personal preferences, interact with avatar features to affect immersion.
Moreover, our findings about the medium effect of first-person perspective on immersion suggest that VR game designers might consider offering players the option to choose their preferred perspective. This customization could cater to diverse player preferences, potentially enhancing the immersive experience for a broader user base.
Furthermore, the impact of skin color and avatar size in our study, which exhibited considerable effect sizes, warrants special attention in design considerations. These factors were among the closest to reaching statistical significance, indicating their potential substantial influence on immersion in VR experiences. This suggests that even seemingly minor aspects of avatar design, like skin tone and body size, can have a profound impact on how users perceive and interact with the virtual environment.
4.2. Trends in avatar design for VR games
The descriptive analysis of the top 100 VR games on Steam provides a revealing glimpse into current trends in VR game design, particularly regarding avatar representation. A striking observation from our data is the scarcity of third-person perspectives, including indirect forms such as mirror reflections, in popular VR games. This trend has significant implications for the direction of avatar research.
Most avatar research has placed emphasis on visual attributes such as gender, age, and other identity markers. However, our findings suggest a disconnect between these research foci and real-world VR games. Because most popular VR games offer neither third-person perspectives nor mirrors, features like facial appearance, gender, or age are less perceived. This raises questions about the relevance of such visual cues in first-person VR environments, where users primarily interact with the game world through their avatars' hands and actions.
Given this context, a shift in research focus appears necessary. Hand and lower-body representations in VR seem to be more critical for user immersion and interaction. This is supported by studies such as [9], which highlight the importance of hand representation in VR for enhancing the sense of control and embodiment. Additionally, research by [13] underscores the significance of embodiment in first-person VR experiences, further validating the need to focus on aspects directly experienced by the user. Therefore, future avatar research might consider prioritizing the study of hand and full-body representations, exploring how their design, realism, and functionality contribute to immersion.
4.3. Avatar-user visual similarity and immersion
Our findings make a significant contribution to the understanding of avatar-user similarity and its impact on immersion in VR games. The nuances revealed through our analysis underscore the importance of similarity in fostering a deeper sense of immersion.
The preference for more detailed and realistic avatars aligns with the theory that higher fidelity in avatar design enhances the player's ability to relate to and identify with their virtual counterpart. This is supported by studies like that of [34], which found that users respond more positively to avatars that resemble their real-life appearance. Our findings extend this notion, suggesting that a detailed, high-fidelity avatar can act as an extension of the self within the virtual environment, thereby enhancing the immersive experience.
The observation regarding the preference for light-skinned avatars also points to a deeper aspect of user-avatar similarity, but it could be reflective of the demographic composition of the VR gaming community. This phenomenon is echoed in the work of [10], which demonstrated how skin color in avatars could influence the user's experiences and reactions in a virtual environment. Their results suggest that congruence in physical characteristics, such as skin color, between the avatar and the user can intensify the immersion, possibly due to enhanced identification.
Additionally, body size emerged as an influential aspect of avatar-user similarity. Our findings suggest that avatars with body sizes that closely match or are perceived as ideal by users can practically impact immersion. This is supported by research from [4], which demonstrated that the physical dimensions of
avatars, including their height and build, can affect the user’s psychological responses in virtual interactions. An avatar with a relatable body size can create a more compelling and convincing representation of the user in the virtual world, contributing to a heightened sense of presence and immersion.
These insights suggest that VR developers should consider incorporating customizable avatars that can adapt to diverse user preferences, thereby enriching the overall user experience in virtual environments.
4.4. Methodological innovations and contributions to avatar studies
The methodology employed in this study represents a significant paradigm shift in avatar research, particularly in the field of VR. By harnessing the power of user-generated content in the form of Steam reviews, we have successfully introduced a novel approach to understanding how self-avatars impact user experiences in VR games. This method transcends the limitations of experimental research, offering a more authentic and comprehensive view of player perceptions and interactions with avatars.
Our approach, which combines the advanced natural language processing capabilities of BERT with meticulous manual coding, enables us to extract and analyze nuanced player feedback on a scale previously unachievable. This dual-method strategy effectively balances the need for large-scale data analysis with the subtlety of human interpretation, setting a new standard for research in this field. The use of BERT, in particular, exemplifies the cutting edge of computational linguistics, offering unprecedented precision in identifying and classifying relevant user sentiments [33].
The significance of this methodology lies not just in its technical prowess but also in its ability to capture the diverse and multifaceted experiences of users. Unlike controlled experimental settings, our approach taps into a rich vein of real-world user interactions, encompassing a broad spectrum of opinions and experiences.
Furthermore, the insights garnered through this method offer invaluable implications for VR game design and avatar development. By understanding user preferences and perceptions as expressed organically in reviews, developers can tailor avatar designs more effectively to enhance user immersion and satisfaction. This user-centered approach to design is increasingly recognized as vital in creating engaging and impactful VR experiences [35].
5. Limitations and future research agenda
While our study provides valuable insights into self-avatar characteristics in VR games, it is crucial to acknowledge its limitations and outline potential avenues for future research.
5.1. Sample size and scope of analysis
The primary limitation of our study is the sample size (n=100), which constrains the depth and breadth of our analysis. With a larger dataset, more robust statistical methods like linear regression or factor analysis could be employed to uncover deeper insights [36]. The selected games are the most popular on Steam, which indicates that the findings in this study might not be generalizable to a broader range of VR experiences.
Expanding the study to include other VR platforms, such as Meta Quest, can provide a broader perspective. By analyzing user reviews across different platforms, researchers can capture a more diverse range of user experiences and preferences, as noted by [35] in their discussion on user experience research.
Future research should also consider longitudinal studies to track changes in user preferences and perceptions over time, as VR technology continues to evolve [37].
5.2. Sentiment classification
The initial plan to classify sentiment in immersion-related reviews encountered a limitation due to the small size of the training dataset. This small sample size restricts the robustness and generalizability of any sentiment classification model we could develop.
Despite the small dataset, our annotated data revealed a significant majority of positive immersion-related reviews (93.15%, n=136). This high proportion of positive sentiment is promising, suggesting that players generally perceive immersion aspects of VR games favorably. However, as [38] noted, sentiment analysis in complex domains like gaming can benefit from more nuanced classification, capturing a spectrum of sentiments rather than a binary positive/negative division.
For future research, expanding the dataset for training the sentiment model is crucial. A larger and more varied set of reviews would enable the development of a more sophisticated sentiment analysis model that can accurately differentiate between positive and negative sentiments regarding immersion.
Additionally, future studies should consider employing advanced machine learning techniques that can handle imbalanced datasets, as often seen in user-generated content where certain sentiments may dominate. Techniques such as SMOTE (Synthetic Minority Over-sampling Technique) or ensemble learning methods can address the imbalance, as recommended by [39].
5.3. Observational limitations
The use of YouTube gameplay videos as a primary source for observing avatar characteristics in VR games presents several limitations that need consideration in future research.
While YouTube offers accessibility and a wide range of content, relying on gameplay videos for
detailed observations can lead to incomplete or skewed data. Gameplay videos are often edited and curated, potentially omitting crucial aspects of the gaming experience that are pertinent to avatar research. This limitation aligns with the concerns raised by [40], who note that online video content may not always represent the full spectrum of user experiences due to selective editing.
Another limitation is the potential for bias in the selection of videos. Content creators may have specific preferences or play styles that do not represent the average player's experience. This issue is highlighted by [41], who discusses how content creator biases can influence the portrayal of digital experiences in online videos.
To address these limitations, future studies could incorporate direct gameplay observation through platforms that offer unedited and comprehensive gameplay experiences. For instance, using data from beta testing sessions or developer-provided gameplay footage could yield more accurate and representative insights into avatar characteristics. Additionally, as suggested by [42], incorporating player interviews or surveys alongside gameplay observation can provide a more holistic understanding of player experiences and perceptions.
5.4. Impact of game type diversity
The diversity of game types in our sample of 100 VR games, ranging from RPGs to sports, music videos, and shooting games, introduces a potential confounding variable in our analysis of avatar representations. The variation in game genres can significantly influence how avatars are designed and interacted with, potentially impacting user perceptions of immersion.
The heterogeneous nature of game genres in our dataset could have diluted the specificity of our findings regarding avatar representations. As highlighted by [43], different game genres cater to different player expectations and experiences, which can significantly affect how players perceive and interact with avatars. For instance, the role and representation of avatars in an RPG might be fundamentally different from those in a sports game, leading to varied impacts on user immersion.
To address this issue, future research should consider focusing on specific game genres to control genre-related variance. This approach would allow for a more nuanced understanding of how avatar representations influence immersion within a particular gaming context. By isolating the variable of game type, researchers can more accurately assess the impact of avatars on the user experience.
Acknowledgements
This research was partially supported by the Academy of Finland (342144; ’POSTEMOTION’).
References
[1] M. Slater, B. Spanlang, M. V. Sanchez-Vives, and O. Blanke, “First person experience of body transfer in virtual reality,” PLoS One, vol. 5, no. 5, 2010, doi: 10.1371/journal.pone.0010564.
[2] M. V. Sanchez-Vives and M. Slater, “From presence to consciousness through virtual reality,” Nature Reviews Neuroscience, vol. 6, no. 4, 2005, doi: 10.1038/nrn1651.
[3] J. N. Bailenson, J. Blascovich, A. C. Beall, and J. M. Loomis, “Equilibrium theory revisited: Mutual gaze and personal space in virtual environments,” Presence: Teleoperators and Virtual Environments, vol. 10, no. 6, 2001, doi: 10.1162/105474601753272844.
[4] N. Yee and J. Bailenson, “The proteus effect: The effect of transformed self-representation on behavior,” Hum Commun Res, vol. 33, no. 3, 2007, doi: 10.1111/j.1468-2958.2007.00299.x.
[5] A. Oulasvirta and K. Hornbæk, “HCI research as problem-solving,” in Conference on Human Factors in Computing Systems - Proceedings, 2016. doi: 10.1145/2858036.2858283.
[6] M. Muller and S. Kogan, “Grounded theory method in hci and cscw,” Cambridge: IBM Center for Social …, 2010.
[7] D. Deng, M. Bujic, and J. Hamari, “Understanding Multi-platform Social VR Consumer Opinions: A Case Study in VRChat Using Topics Modeling of Reviews,” in Lecture Notes in Business Information Processing, 2023. doi: 10.1007/978-3-031-32302-7_4.
[8] C. Lu, X. Li, T. Nummenmaa, Z. Zhang, and J. Peltonen, “Patches and Player Community Perceptions: Analysis of No Man’s Sky Steam Reviews,” 2020.
[9] F. Argelaguet, L. Hoyet, M. Trico, and A. Lécuyer, “The role of interaction in virtual embodiment: Effects of the virtual hand representation,” in Proceedings - IEEE Virtual Reality, 2016. doi: 10.1109/VR.2016.7504682.
[10] T. C. Peck, S. Seinfeld, S. M. Aglioti, and M. Slater, “Putting yourself in the skin of a black avatar reduces implicit racial bias,” Conscious Cogn, vol. 22, no. 3, pp. 779–787, 2013, doi: 10.1016/j.concog.2013.04.016.
[11] M. Slater and S. Wilbur, “A framework for immersive virtual environments (FIVE): Speculations on the role of presence in virtual environments,” Presence: Teleoperators and Virtual Environments, vol. 6, no. 6, 1997, doi: 10.1162/pres.1997.6.6.603.
[12] F. Biocca, “The Cyborg’s Dilemma: Progressive Embodiment in Virtual Environments [1],” Journal of Computer-Mediated Communication, vol. 3, no. 2, 2006, doi: 10.1111/j.1083-6101.1997.tb00070.x.
[13] K. Kilteni, R. Groten, and M. Slater, “The Sense of Embodiment in virtual reality,” Presence: Teleoperators and Virtual Environments, vol. 21, no. 4, 2012, doi: 10.1162/PRES_a_00124.
[14] A. McMahan, “Immersion, engagement, and presence: A method for analyzing 3-d video games,” in The Video Game Theory Reader, 2013. doi: 10.4324/9780203700457-10.
[15] R. Tamborini and P. Skalski, “The role of presence in the experience of electronic games,” in Playing Video Games: Motives, Responses, and Consequences, 2006. doi: 10.4324/9780203873700.
[16] S. J. Ahn, “Embodied Experiences in Immersive Virtual Environments: Effects on Pro-Environmental Attitude and Behavior,” Dissertation: Stanford University, no. May, 2011.
[17] J. Burleson, V. Grover, J. B. Thatcher, and H. Sun, “A Representation Theory Perspective on the Repurposing of Personal Technologies for Work-Related Tasks,” J Assoc Inf Syst, vol. 22, no. 6, 2021, doi: 10.17705/1jais.00707.
[18] M. Shelley and K. Krippendorff, “Content Analysis: An Introduction to its Methodology,” J Am Stat Assoc, vol. 79, no. 385, 1984, doi: 10.2307/2288384.
[19] K. A. Neuendorf, The Content Analysis Guidebook. 2020. doi: 10.4135/9781071802878.
[20] D. Riffe, S. Lacy, and F. Fico, Analyzing media messages: Using quantitative content analysis in research. 2014. doi: 10.4324/9780203551691.
[21] M. E. Mccombs and D. L. Shaw, “The agenda-setting function of mass media,” Public Opin Q, vol. 36, no. 2, 1972, doi: 10.1086/267990.
[22] T. Tang, E. Fang, and F. Wang, “Is neutral really neutral? The effects of neutral user-generated content on product sales,” J Mark, vol. 78, no. 4, 2014, doi: 10.1509/jm.13.0301.
[23] N. Archak, A. Ghose, and P. G. Ipeirotis, “Deriving the pricing power of product features by mining consumer reviews,” Management Science, vol. 57, no. 8, 2011, doi: 10.1287/mnsc.1110.1370.
[24] C. Vásquez, “Right now versus back then: Recency and remoteness as discursive resources in online reviews,” Discourse, Context and Media, vol. 9, pp. 5–13, Sep. 2015, doi: 10.1016/j.dcm.2015.05.010.
[25] Y. Zhang, Y. Goh, and Q. Wang, “Unraveling the Effect of Competing Product Reviews on Consumer Choice and the Moderating Role of Consumer-Reviewer Peer Types,” IEEE Trans Eng Manag, vol. 70, no. 10, p. 3315, 2023, doi: 10.1109/TEM.2021.
[26] R. Ullah, N. Amblee, W. Kim, and H. Lee, “From valence to emotions: Exploring the distribution of emotions in online product reviews,” Decis Support Syst, vol. 81, pp. 41–53, Jan. 2016, doi: 10.1016/j.dss.2015.10.007.
[27] M. Lombard and T. Ditton, “At the heart of it all: The concept of presence,” Journal of Computer-Mediated Communication, vol. 3, no. 2, 1997, doi: 10.1111/j.1083-6101.1997.tb00072.x.
[28] P. H. Mirvis and M. Csikszentmihalyi, “Flow: The Psychology of Optimal Experience,” The Academy of Management Review, vol. 16, no. 3, 1991, doi: 10.2307/258925.
[29] R. Hunicke, M. Leblanc, and R. Zubek, “MDA: A formal approach to game design and game research,” in AAAI Workshop - Technical Report, 2004.
[30] B. G. Witmer and M. J. Singer, “Measuring presence in virtual environments: A presence questionnaire,” Presence: Teleoperators and Virtual Environments, vol. 7, no. 3, 1998, doi: 10.1162/105474698565686.
[31] M. Slater, D. Banakou, A. Beacco, J. Gallego, F. Macia-Varela, and R. Oliva, “A Separate Reality: An Update on Place Illusion and Plausibility in Virtual Reality,” Frontiers in Virtual Reality, vol. 3, 2022, doi: 10.3389/frvir.2022.914392.
[32] J. Cohen, Statistical power analysis for the behavioral sciences. Hillsdale, NJ: Lawrence Erlbaum Associates, 1988.
[33] J. Devlin, M. W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of deep bidirectional transformers for language understanding,” in NAACL HLT 2019 - 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference, 2019.
[34] J. Bailenson, K. Patel, A. Nielsen, R. Bajcsy, S. H. Jung, and G. Kurillo, “The effect of interactivity on learning physical actions in virtual reality,” Media Psychol, vol. 11, no. 3, 2008, doi: 10.1080/15213260802285214.
[35] M. Hassenzahl and N. Tractinsky, “User experience - A research agenda,” Behaviour and Information Technology, vol. 25, no. 2, 2006, doi: 10.1080/01449290500330331.
[36] A. Field, Discovering Statistics using SPSS Statistics, vol. 66. 2009.
[37] S. Faisal, P. Cairns, and A. Blandford, “Building for users not for experts: Designing a visualization of the literature domain,” in Proceedings of the International Conference on Information Visualisation, 2007. doi: 10.1109/IV.2007.32.
[38] B. Pang and L. Lee, “Opinion mining and sentiment analysis,” Foundations and Trends in Information Retrieval, vol. 2, no. 1–2, 2008, doi: 10.1561/1500000011.
[39] N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, “SMOTE: Synthetic minority over-sampling technique,” Journal of Artificial Intelligence Research, vol. 16, 2002, doi: 10.1613/jair.953.
[40] M. Sjöblom and J. Hamari, “Why do people watch others play video games? An empirical study on the motivations of Twitch users,” Comput Human Behav, vol. 75, 2017, doi: 10.1016/j.chb.2016.10.019.
[41] R. Rosalen, “YouTube: Online video and participatory culture,” New Media Soc, vol. 21, no. 9, 2019, doi: 10.1177/1461444819859476.
[42] M. Consalvo and N. Dutton, “Game analysis: Developing a methodological toolkit for the qualitative study of games,” Game Studies, vol. 6, no. 1, 2006.
[43] N. Wardrip-Fruin and P. Harrigan, First person: new media as story, performance, and game, vol. 42, no. 02. 2006. doi: 10.5860/choice.42-0713.