=Paper=
{{Paper
|id=Vol-3669/paper12
|storemode=property
|title=Self-avatar representation matters: Deciphering user immersion in VR games through Steam reviews
|pdfUrl=https://ceur-ws.org/Vol-3669/paper12.pdf
|volume=Vol-3669
|authors=Dion Deng,Mila Bujić,Wang Chi Lee,Ming Rui Li,Juho Hamari
|dblpUrl=https://dblp.org/rec/conf/gamifin/DengBLLH24
}}
==Self-avatar representation matters: Deciphering user immersion in VR games through Steam reviews==
<pdf width="1500px">https://ceur-ws.org/Vol-3669/paper12.pdf</pdf>
<pre>
                         Self-avatar representation matters: Deciphering user
                         immersion in VR games through Steam reviews
                         Dion Deng1, Mila Bujić1, Wangchi Lee2, Mingrui Li3, Juho Hamari1
                         1 Tampere University, Tampere 33100, Finland
                         2 The Hong Kong University of Science and Technology, Hong Kong, China
                         3 Hong Yi Cambridge International School, Changsha, China


                                           Abstract
                                           This study critically examines the influence of self-avatars on user immersion in VR games by
                                           analyzing user reviews from Steam's top 100 VR games. Utilizing the BERT algorithm for text
                                           classification and detailed manual coding on avatar representations, the research addresses the effects
                                           of presence, perspective, visual features, and interactivity of avatars on immersion. Although the
                                           Mann-Whitney U test results were non-significant, effect size analyses revealed practical implications
                                           of avatar characteristics on user immersion. Notably, the study identifies key trends in avatar design
                                           within popular VR games, such as the predominance of first-person perspectives and the relative
                                           importance of hand representations over facial features. These findings suggest a need for a shift in
                                           focus in avatar research towards more user-relevant features. This innovative approach, using user-
                                           generated content, marks a significant departure from traditional experimental methods. It offers a
                                           richer, more ecologically valid understanding of user experiences in VR. The study's insights have
                                           significant implications for future avatar design and research.

                                           Keywords
                                           Avatar, virtual reality, product review, text classification, BERT, content analysis

                                                                                                                        environments to study user interactions and
                         1. Introduction                                                                                responses [5],[6],[7]. While these studies have
                                                                                                                        provided valuable insights, their limited scope and
                         In immersive Virtual Reality (VR), self-avatars are the                                        controlled settings often fail to capture the diverse,
                         users’ digital embodiment that play an important role                                          real-world experiences of users. Such methods may
                         in users’ interaction and experience. The design and                                           not fully encompass the wide range of user
                         features of self-avatars are not merely aesthetic                                              backgrounds, preferences, and naturalistic behaviors
                         choices or tools to operate within the virtual                                                 that occur in everyday gaming contexts.
                         environments, but also instrumental in determining                                                 Our research adopts a novel approach by analyzing
                         the degree of immersion [1].                                                                   player feedback through game reviews in the context
                             The immersion experience in VR is significantly                                            of avatars in VR. This method leverages the
                         influenced by the user's ability to identify with their                                        spontaneous, authentic, and varied opinions of the
                         avatar [2]. This identification is deeply rooted in the                                        gaming community, providing a broader and more
                         concept of presence, the sensation of being physically                                         ecologically valid understanding of how self-avatars
                         located in the virtual environment [2]. One important                                          influence player experiences in real-life settings. User
                         factor that is crucial to immersion experience is the                                          reviews, as a form of naturalistic data, offer insights
                         similarity between the user’s visual appearance and                                            into the aspects of self-avatars that resonate the most
                         their avatar, which can encompass physical                                                     with players and significantly impact their sense of
                         resemblance and behavioral and emotional                                                       immersion and overall experience [8].
                         congruence [3],[4].                                                                                This study tries to explore the role of self-avatars
                             In this context, understanding how different                                               in VR games through reviews, focusing on how their
                         attributes of self-avatars, such as their presence,                                            presence, perspective, visual features, and interactive
                         perspective, visual features, and interaction                                                  capabilities impact user immersion. Our primary data
                         capabilities influence the user experience becomes                                             source is user-generated reviews from the most
                         paramount. Avatar research in VR has primarily relied                                          popular VR games on the Steam platform for online
                         on experimental methods, utilizing controlled lab                                              distribution of games (Steam Inc). The methodology

                         8th International GamiFIN Conference 2024 (GamiFIN 2024), April 2-
                         5, 2024, Ruka, Finland.
                             xiaohang.deng@tuni.fi (D. Deng); mila.bujic@tuni.fi (M. Bujić);
                         juho.hamari@tuni.fi (J. Hamari)
                              0009-0006-1626-2365 (D. Deng); 0000-0002-4171-4806 (M.
                         Bujić); 0000-0002-6573-588X (J. Hamari)
                                        © 2024 Copyright for this paper by its authors. The use permitted under
                                        Creative Commons License Attribution 4.0 International (CC BY 4.0).
                                        CEUR Workshop Proceedings (CEUR-WS.org)


                         ISSN 1613-0073
involves a two-pronged approach: firstly, employing               affect how users process information and interact
the BERT (Bidirectional Encoder Representations                   within VR [15].
from Transformers) algorithm to identify and extract                  Hand Representation and Body Connectivity
reviews specifically mentioning immersion; and                    (RQ2): Realistic hand representations can
secondly, utilizing a detailed codebook to manually               significantly boost the sense of agency and control, a
code avatar features within these games. Through this             key aspect of immersion. For example, when users see
innovative methodology, our study aims to provide a               their virtual hands synchronized with their real
detailed picture of the current landscape of self-avatar          movements, it enhances the sense of embodiment,
design in VR games and how it aligns with users’                  resulting in a more engaging and intuitive VR
experiences of immersion.                                         experience [9].
                                                                      The degree of body connectivity also impacts
                                                                  immersive experience. A fully connected avatar, as
    1.1. The effects of self-avatar in VR                         opposed to a disembodied hand or partial body
                                                                  representation, can increase the sense of bodily
Self-avatars not only serve as digital representations of         presence in the virtual environment [13]. This sense of
players but also significantly influence their                    a complete body in virtual space is crucial for a
psychological experiences in VR [9],[10]. This section,           coherent experience, as it aligns with our natural
grounded in extensive literature, examines how                    perception of our bodies in the real world. The
different aspects of self-avatar representations                  integration of proprioceptive feedback, where the
influence player immersion, aligning with our four                user's movements are accurately reflected in the
leading research questions:                                       avatar, further enhances this sense of presence and
    RQ1: How do presence of self-avatar and                       embodiment [1].
perspective affect the proportion of positive reviews                 Visual Features of Self-Avatars (RQ3): The
about immersion of VR games?                                      visual features of self-avatars, including their level of
    RQ2: How do self-avatar’s hand representation                 detail, anthropomorphism, skin color, and body size,
and body connectivity affect the proportion of positive           play a pivotal role in shaping user experiences in VR.
reviews about immersion of VR games?                              These features significantly influence the degree of
    RQ3: How do visual features of self-avatar,                   identification a user feels with their avatar, which in
including detail level, anthropomorphism, skin color,             turn affects their immersion and overall experience.
and body size affect the proportion of positive reviews               The level of detail in an avatar’s appearance can
about immersion of VR games?                                      dramatically affect the user's sense of presence and
    RQ4: How does visual feedback of self-avatar                  immersion. High-resolution textures and detailed
interaction affect the proportion of positive reviews             avatars can enhance the presence and engagement of
about immersion of VR games?                                      the VR experience, leading to a stronger connection
                                                                  between the user and the virtual environment [4].
    Presence and Perspective (RQ1): The                           Detailed avatars enable users to identify more closely
incorporation of a self-avatar in VR is a fundamental             with virtual selves, fostering stronger embodiment.
element that significantly enhances the user's sense of               Avatars with human-like features can enhance the
embodiment, crucial for fostering a deep sense of                 social presence and emotional connection in VR,
presence and immersion within the virtual                         especially in multiplayer or social VR settings [16].
environment. Embodiment is central to VR                          This connection can be particularly potent when
experiences, directly influencing users’ engagement               avatars exhibit subtle human-like movements and
and interaction within the virtual world [11],[12].               micro-expressions.
    The choice of perspective, particularly between                   The representation of diverse skin colors in
first-person and third-person views, further                      avatars allows users from different backgrounds to
modulates this experience of embodiment. First-                   find avatars that resemble and represent them,
person perspectives are often associated with a higher            enhancing their sense of identity within the VR world
sense of embodiment and presence, as they more                    [10]. The choice of an avatar’s skin color can also
closely mimic the natural human perception of their               impact the level of empathy and connection users feel
embodied perspective, offering a direct and                       with the virtual character.
uninterrupted view of the virtual environment from                    The congruence of avatar size with the user's real
the avatar's eyes [13]. This immersive perspective                body size can affect how users perceive spatial
allows users to directly align their physical movements           relationships and interact within the virtual reality
with those of their avatars, creating a seamless and              environment [4]. This aspect is particularly important
intuitive interaction that enhances the feeling of being          in applications where accurate spatial perception is
in the virtual world [1].                                         crucial, such as in training simulations, modelling and
    In contrast, third-person perspectives provide a              other similar visualizations.
different type of interaction. While they offer a broader             Our RQ3 explores the principles of Representation
view of the avatar and its surroundings, they can                 Theory [17], which posits that the effectiveness of
sometimes create a sense of detachment, as the users              information systems, such as VR environments, is
view their avatars from an external standpoint [14].              significantly enhanced when they accurately represent
However, this perspective can also be beneficial in               real-life elements. By mirroring real-world
certain scenarios, such as strategy games or situations           characteristics, these avatars serve as authentic
where spatial awareness is key. Researchers have                  extensions of the user's identity within the virtual
found that the choice of perspective can significantly            world. This fidelity in representation fosters a deeper
                                                                  connection and immersion, as users find it easier to


relate to and engage with avatars that closely resemble               1.3. Proportion of reviews as a
actual human features and behaviors.
    Visual Feedback and Interaction in Self-Avatars                        measure
(RQ4): The way self-avatars interact with the virtual
environment and the corresponding visual feedback                 Our methodological choice to quantify the proportion
they provide are critical in shaping a user’s immersion           of user reviews addressing positive immersion
and overall experience in VR. This aspect of avatar               experience as an indicator of the games' performance
design, encompassing the responsiveness and visual                on immersion is underpinned by rigorous academic
realism of avatar interactions, significantly contributes         precedent. Quantitative content analysis of user-
to the sense of presence and engagement within the                generated reviews is a well-established approach in
virtual world [14].                                               the literature, which allows for the objective,
    Realistic interaction mechanics, such as accurate             systematic, and quantitative examination of
hand tracking and responsive movement, can deepen                 communication content [18][19]. By focusing on the
the user's sense of embodiment and agency within the              proportion of reviews that mention positive user
virtual environment [1]. This visual realism aids in              experience of immersion, our study adopts a metric of
bridging the gap between the worlds, making the VR                salience that has been academically recognized as
experience more intuitive and immersive.                          indicative of the importance or prominence of that
    The visual feedback from an avatar's actions, such            topic within the consumer community [20].
as changes in the environment or reactions from other                 This approach is grounded in the notion that the
virtual entities, further amplifies the immersive                 frequency of comments on a specific feature can be
experience. The feedback provides users with tangible             reflective of its significance to the user base, a
consequences of their actions in VR, reinforcing the              methodological assumption that is supported by the
sense of being an active participant in the virtual world [12].   Agenda-Setting Theory in mass communication [21].
                                                                  This theory posits that the frequency of issues covered
                                                                  by the media influences the perceived importance of
    1.2. Game reviews as the data                                 these issues among the public [21]. In the context of
         source                                                   online reviews, the proportion of mentions can
                                                                  similarly set an 'agenda' by highlighting the features
Avatar research has relied heavily on experimental                most impactful to users' experiences [22].
methods to understand user behavior and experience.                   The reliance on proportional data is bolstered by
While these methods have been instrumental in                     research that suggests the volume and valence of
advancing the field, they come with inherent                      mentions in reviews can act as proxies for consumer
limitations. Oulasvirta et al. (2016) [5] highlight that          attitudes and satisfaction levels [23]. The significance
experimental settings often fail to replicate the                 of this method is further emphasized in [24],[25]
complexity and variability of real-world scenarios,               which demonstrate a strong correlation between the
potentially leading to findings that lack ecological              proportion of review mentions of certain attributes
validity. Experiments typically involve small, non-               and the consumer ratings of products.
representative samples, limiting the generalizability of              By utilizing the proportion of topic mentions
the findings [6]. The controlled nature of these studies          rather than the presence or absence of such mentions,
can also result in responses that do not fully capture            we mitigate the risk of over-representing outlier
the spontaneous and authentic reactions of users in               opinions and instead capture a more balanced view of
naturalistic environments.                                        the collective sentiment. This is in line with previous
    Recognizing these limitations, research on avatars            findings [26], which highlight the robustness of
has increasingly turned to alternative methods. One               proportional measures in depicting a more accurate
method is the analysis of user-generated content, such            reflection of the consensus among the user base.
as user reviews. These reviews offer a rich, unfiltered,              Considering these theoretical and empirical
and authentic source of user feedback. Unlike the                 foundations, our methodology is academically sound
responses elicited in experimental settings, user                 and provides a nuanced lens through which to assess
reviews provide insights into the real-world                      the collective evaluation of a game's specific features
experiences of a broad and diverse user base. This shift          by its users. The proportion of topic-specific reviews
is supported by the growing understanding within the              thus serves as a quantitative measure that is indicative
avatar studies that user experiences are multi-faceted            of the overall perceived immersion. The method of
and context-dependent [8].                                        conducting quantitative analysis with the proportion
    Analyzing user reviews addresses several of the               of positive reviews on immersion is introduced in the
limitations inherent in experimental methods. It                  next section.
provides access to a diverse user sample, offering a
level of representativeness that is often unattainable in
lab-based studies. This is particularly important in VR,
                                                                  2. Methods
where user diversity significantly impacts interaction
patterns and experiences [7]. As these reviews are                    2.1. Data collection
generated in naturalistic settings, they offer a more
accurate reflection of how users interact with and                From the Steam store, we narrowed the games by VR
perceive technology in their daily lives, thus providing          support (VR only) and language (English supported) to
ecological validity that experimental studies might               make sure our data is highly related to our research
lack.                                                             questions about VR and to avoid difficulties caused by
                                                                  multi-language text in the algorithm training. 2,938
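
As a minimal illustration of the measure described in Section 1.3, the sketch below computes, for each game, the share of its reviews flagged as reporting positive immersion. The function name, the (game, label) input layout, and the toy data are assumptions made for illustration; they are not taken from the authors' pipeline.

```python
from collections import defaultdict


def immersion_proportions(labelled_reviews):
    """Return {game: share of reviews labelled 1} for (game, label) pairs."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for game, label in labelled_reviews:
        totals[game] += 1
        positives[game] += int(label == 1)
    return {game: positives[game] / totals[game] for game in totals}


# Hypothetical toy data: label 1 = review reports positive immersion.
toy = [
    ("Game A", 1), ("Game A", 0), ("Game A", 0), ("Game A", 1),
    ("Game B", 0), ("Game B", 0), ("Game B", 1), ("Game B", 0),
]
proportions = immersion_proportions(toy)
```

Because the measure is a proportion rather than a raw count, games with very different review volumes remain comparable on the same 0 to 1 scale.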


games fulfilled the criteria. We selected the top 100               game are beautiful," which did not mention how it
games by number of user reviews as our sample,                      affects immersion.
which gave us sufficient review data for the                            Interactivity (Enhancement of immersion
language model training.                                            through game's interactive features): The emphasis on
     To collect the reviews, we used Steam’s official API,          interactivity aligns with Witmer and Singer’s (1998)
which provided data on all the reviews for games in                 [30] Immersion Tendency Questionnaire, which
Steam, including the review text, publication date,                 suggests that interactive features of a game
the reviewer's playtime in the game, etc. [7]. Our                  significantly contribute to the immersion experience.
data collection was conducted on 25 Oct 2023, and a                 By coding for mentions of enhancement of immersion
total of 282,847 reviews from the 100 games was                     through interactivity, the codebook captures this
collected.                                                          aspect of the VR experience. For example, "The gesture
                                                                    control made me completely immersed in the game's
                                                                    actions" was included, and "The game controls are
     2.2. Data annotations                                          smooth," which did not mention immersion was
                                                                    excluded.
          2.2.1. Game reviews                                           Real-World Comparison (Comparisons with real-
                                                                    world sensations): This criterion is based on the
To train a text classification algorithm to detect                  concept of 'Place Illusion' in VR [31], which argues that
reviews that reported positive user experience of                   realistic, immersive VR experiences often lead to
immersion, we randomly selected 2,500 reviews (25                   comparisons with real-world sensations. By coding for
for each game) from our dataset as the training data.               such comparisons, the codebook identifies instances
    We annotated the reviews related to positive user               where the immersive experience is strong enough to
experience of immersion as 1, and others as 0 with the              elicit real-world analogies, indicating a high level of
guide of a codebook (Table 1). In constructing this                 presence. For example, we included "When I put on the
codebook, we aligned our inclusion and exclusion                    VR headset, I completely forgot the outside world, as if
criteria with established games and theories, and prior             I was in the game.", and excluded "This game made me
empirical studies about immersion. Our purpose was                  forget my daily troubles," which did not specifically
to encompass a comprehensive range of elements that                 involve the immersive experience.
contribute to immersive experiences, as delineated by                   Finally, 146 (5.84%) reviews were annotated as 1
the following five aspects:                                         (related to the positive user experience of immersion).
    Terminology (Immersion and its related terms):
                                                                     Table 1
Lombard and Ditton's (1997) [27] theory of presence
                                                                     Annotation criteria of game reviews
emphasizes the psychological state where users feel
immersed in a virtual environment. By focusing on                      Criteria             Inclusion             Exclusion
direct references to 'immersion' or its synonyms, the               Terminology direct references to               mention
codebook aligns with this theoretical framework,                                     "immersion" or its      "immersion" only
capturing users' perceived sense of being in the virtual                                  synonyms,          in a literal or non-
world. For example, "The immersion in this VR game is                               indicating an explicit    contextual sense
really amazing; I totally forgot about the outside                                     discussion of the
world" was included, and "I spent a lot of time                                            immersive
immersed in this game," was excluded, where                                               experience.
"immersed" refers to time spent, not describing the                  Specificity       detailing specific    general personal
sense of immersion.                                                                     experiences or        experiences not
    Specificity (Detailing specific experiences or                                      emotions that        directly related to
emotions): Csikszentmihalyi’s (1991) [28] concept of                                  convey a sense of          immersion
flow in gaming posits that immersion is often                                             immersion
accompanied by detailed descriptions of experiences                     Game              analyze the              general
and emotions. This criterion ensures that the reviews                  Design         influence of game       assessments of
analyzed are not just superficial mentions of
                                                                                     design elements on         game design
immersion but reflect a deeper, flow-like engagement
                                                                                          immersion           lacking a direct
with the VR game. For example, we included "In this
                                                                                                               connection to
game, I totally felt the presence of being the character,
as if I was really in that world.", and excluded "I really                                                       immersion
enjoy this game, it's fun to play," which is just a general         Interactivity       emphasize the        focusing solely on
experience share.                                                                      enhancement of           interactivity
    Game Design (Influence of game design elements                                   immersion through       without linking to
on immersion): The Mechanics-Dynamics-Aesthetics                                          the game's             immersion
(MDA) framework proposed by Hunicke, LeBlanc, and                                    interactive features
Zubek (2004) [29] illustrates how game mechanics                    Real-World      compare the gaming       comparisons with
influence player dynamics, including immersion. This                Comparison        experience to real-      the real world
criterion captures how users perceive and articulate                                 world sensations to        that do not
the influence of game design on their immersive                                           accentuate             specifically
experiences. For example, we included "The 3D sound                                      immersion's             emphasize
effects in the game made me feel like I was truly in                                       intensity             immersion
another world" and excluded "The graphics of the


         2.2.2. Game data
To classify avatar visual representations in the 100 VR games, we designed another codebook based on
established games and theories, and prior empirical studies about avatar representation and embodiment.
To observe self-avatar representations in each game, we searched YouTube with the keyword “game name +
full gameplay” and watched at least two of the gameplay videos, spending a minimum of five minutes on
each video. After thoroughly understanding every feature of the self-avatar representation, we annotated
the games based on the codebook (Table 2). For further explication and descriptive data of the avatars’
feature annotation, see section 3. Results.
    In our annotation of avatars within the selected VR games, we paid particular attention to
personalization. For each feature of the avatar, such as skin color and body size, we assessed whether
the game allowed players to personalize these elements. If a game offered the option for players to
customize these aspects of their avatar, we labeled it as 'personalized'. Most of the games in our study
do not have personalized self-avatars. Consequently, in our quantitative analysis, any data related to
these customizable features were treated as missing due to insufficient data.

Table 2
Codebook of the avatar representations
     Variables                    Annotation rules
     Presence                     Visible or invisible avatar
     First-person                 First-person perspective support or not
     Third-person                 Third-person perspective support or not
     Body type                    Hands-only avatar or full-body/upper-body avatar
     Hand representation          Realistic or unrealistic hand representation
     Hand accessories             With or without accessories on hands
     Hand-body connection         Hand-body connects or not
     Detail level                 Include detailed textures or not
     Anthropomorphism             With or without anthropomorphic features
     Skin color (race)            Dark or light-skin
     Skin color (transparency)    Skin color is semitransparent or non-transparent
     Body size                    Congruent avatar model size with human or incongruent (much bigger/smaller) avatar
     Interactivity                Provide visual feedback caused by the avatar’s interactivity or not

         2.2.3. Reliability of annotation
To ensure the accuracy and consistency of our manual coding process, we conducted a preliminary
annotation exercise with three independent coders. For the review data, three coders were tasked with
analyzing a subset of 100 reviews, while for the avatar features, three coders each coded the
characteristics of 10 games. Inter-rater reliability was evaluated using Cohen’s kappa (κ), which
measures the level of agreement between coders beyond what would be expected by chance [32]. A score of
.81 for the review data indicated good agreement, and a score of .68 for the avatar features suggested
substantial agreement. Discrepancies in coding were reviewed in a series of consensus meetings where the
coders discussed each disagreement until a unanimous decision was reached.

    2.3. Topic detection

To classify the sentiment of reviews, we employed BERT (Bidirectional Encoder Representations from
Transformers), a state-of-the-art text classification algorithm [33]. BERT is particularly well-suited
for natural language processing tasks due to its deep learning architecture, which considers the context
on both the left and the right side of a token.
    To train the model, we utilized a labeled dataset, where each review was pre-classified as either
positive (1) or negative (0) based on the criteria in the codebook. The model was fine-tuned on this
dataset, iterating through the corpus to learn the complex patterns associated with the sentiment
expressed in gaming reviews.
    We assessed the performance of our final BERT model using several evaluation metrics. The Receiver
Operating Characteristic Area Under the Curve (ROC AUC) was 0.9679, indicating an excellent ability of
the model to discriminate between the positive and negative classes. The ROC AUC is a performance
measurement for classification problems at various threshold settings, where a score of 1 represents a
perfect model and a score of 0.5 represents a model with no discriminative power.
    In terms of precision, recall, and F1-score, which are critical metrics for classification problems,
our model achieved the following results:
    For class 0 (negative reviews), the model had a precision of 0.9916, meaning that 99.16% of the
negative classifications were correct. The recall was 0.9834, indicating that 98.34% of the actual
negative instances were correctly identified. The F1-score, the harmonic mean of precision and recall,
was 0.9875.
    For class 1 (positive reviews), the model achieved a precision of 0.7647 and a recall of 0.8667,
resulting in an F1-score of 0.8125. This shows that while the model was less precise in identifying
positive reviews, it was robust in retrieving a high proportion of all relevant instances.
    The overall accuracy of the model was 0.9766, demonstrating that it correctly classified 97.66% of
the reviews. The macro average F1-score, which gives equal weight to both classes, was 0.9000, and the


weighted average F1-score, which accounts for class imbalance, was 0.9772.


3. Results

Because our dataset is relatively small, we selected the Mann-Whitney U test as the primary method of
analysis. This non-parametric test is well suited to comparing two independent groups when the sample
sizes are small and the data are not assumed to be normally distributed. We also used Cliff’s Delta as
an effect size measure, which gives a more relevant estimate of the scale of observed disparities in
non-parametric situations. This approach supplements our findings with practical understanding beyond
statistical significance alone. The results showed that some avatar features have impacts on perceived
immersion. In particular, the realism of hand representation had a large effect size, which suggests
that how hands are represented in video games may contribute considerably to the immersive experience.
In contrast, detailed textures and skin color transparency reflected small to medium effect sizes,
showing that they had more modest impacts on immersion. The size of the avatar and the provision of
visual feedback from interactions also had medium effect sizes, indicating their considerable role in
improving the immersive experience. These results indicate that certain avatar features may have
different effects on the immersion of players in VR spaces. Table 3 summarizes the details for each
feature in terms of statistics and effect sizes.
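The two statistics named in this section can be illustrated with a minimal Python sketch. It is not the authors' analysis script: the per-game proportions are made-up placeholder values, and the label "negligible" for |d| of 0.1 or less is an assumption, since the note to Table 3 only defines small, medium, and large. The sketch computes only the U statistic; in practice the p-value would come from a statistics package such as `scipy.stats.mannwhitneyu`.

```python
from itertools import product


def mann_whitney_u(x, y):
    """U statistic for sample x: each pair with x > y counts 1, each tie counts 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a, b in product(x, y))


def cliffs_delta(x, y):
    """Cliff's delta: share of pairs with x > y minus share of pairs with x < y."""
    greater = sum(1 for a, b in product(x, y) if a > b)
    less = sum(1 for a, b in product(x, y) if a < b)
    return (greater - less) / (len(x) * len(y))


def magnitude(d):
    """Label |d| using the thresholds from the note to Table 3."""
    size = abs(d)
    if size > 0.5:
        return "large"
    if size > 0.3:
        return "medium"
    if size > 0.1:
        return "small"
    return "negligible"  # assumed label; not defined in the paper


# Hypothetical per-game proportions of positive immersion reviews.
group_1 = [0.05, 0.04, 0.03, 0.06]
group_2 = [0.01, 0.02, 0.035]

u_stat = mann_whitney_u(group_1, group_2)
delta = cliffs_delta(group_1, group_2)
print(f"U = {u_stat}, Cliff's delta = {delta:.3f} ({magnitude(delta)})")
```

Counting pairs directly keeps the definition visible; for 100 games the quadratic cost is negligible.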

Table 3
Summary of the data analyses

                              Median proportion of positive   Median proportion of positive
                              reviews on immersion            reviews on immersion
Variable                      in Group 1                      in Group 2                         Mann-Whitney U   Asymp. Sig.   Effect size (Cliff's d)
Presence of Avatar            Visible                         Invisible                          765              .384          .138*
                              3.2% (N = 84, SD = .025)        2.6% (N = 16, SD = .017)
First-Person Perspective      Supported                       Not supported                      481              .153          .307**
                              2.7% (N = 92, SD = .025)        3.7% (N = 8, SD = .017)
Third-Person Perspective      Supported                       Not supported                      882              .325          .146*
                              2.5% (N = 19, SD = .178)        3.0% (N = 81, SD = .253)
Body Type                     Hands-only                      Full-body/upper-body               840              .294          .148*
                              2.6% (N = 24, SD = .198)        2.5% (N = 61, SD = .269)
Hand Realism                  Realistic                       Unrealistic                        890              .068          .601***
                              2.4% (N = 49, SD = .025)        2.6% (N = 34, SD = .026)
Hand Accessories              With accessories                Without accessories                366              .528          -.099
                              2.5% (N = 28, SD = .027)        2.4% (N = 29, SD = .022)
Hand-Body Connectivity        Connected                       Not connected                      666              .374          .138*
                              2.3% (N = 18, SD = .020)        2.6% (N = 65, SD = .027)
Detail Level                  Includes detailed textures      Excludes detailed textures         590              .304          -.149*
                              3.2% (N = 22, SD = .028)        2.5% (N = 63, SD = .024)
Anthropomorphism              With anthropomorphic features   Without anthropomorphic features   1037             .431          .106*
                              2.8% (N = 75, SD = .023)        3.3% (N = 25, SD = .028)
Skin Color - Light vs. Dark   Light skin                      Dark skin                          20               .106          -.487**
                              3.1% (N = 13, SD = .032)        2.1% (N = 6, SD = .013)
Skin Color - Transparency     Non-transparent                 Semi-transparent                   172              .403          .194*
                              2.5% (N = 40, SD = .026)        2.0% (N = 4, SD = .024)
Avatar Size                   Congruent size                  Incongruent size                   26               .154          -.422**
                              3.0% (N = 15, SD = .021)        2.2% (N = 6, SD = .009)
Interaction Visual Feedback   With visual feedback            Without visual feedback            344              .068          .451**
                              4.3% (N = 79, SD = .026)        2.5% (N = 6, SD = .018)
Note. * small effect size (0.1 < |d| < 0.3), ** medium effect size (0.3 < |d| < 0.5), *** large effect size (|d| > 0.5).
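Cliff's d is tied to the Mann-Whitney U statistic by the identity d = 2U / (n1 * n2) - 1, where U is the statistic for Group 1 and n1, n2 are the group sizes. The Python sketch below applies this identity to the values printed in Table 3 as a rough cross-check. It assumes that the reported U refers to Group 1 and that the reported N are the group sizes used in each test, neither of which the paper states explicitly. The tolerance of 0.001 is an arbitrary allowance for rounding.

```python
def cliffs_d_from_u(u, n1, n2):
    """Recover Cliff's d from the Group 1 U statistic and the two group sizes."""
    return 2 * u / (n1 * n2) - 1


# (variable, U, N of Group 1, N of Group 2, reported Cliff's d), copied from Table 3.
rows = [
    ("Presence of Avatar", 765, 84, 16, 0.138),
    ("First-Person Perspective", 481, 92, 8, 0.307),
    ("Third-Person Perspective", 882, 19, 81, 0.146),
    ("Body Type", 840, 24, 61, 0.148),
    ("Hand Realism", 890, 49, 34, 0.601),
    ("Hand Accessories", 366, 28, 29, -0.099),
    ("Hand-Body Connectivity", 666, 18, 65, 0.138),
    ("Detail Level", 590, 22, 63, -0.149),
    ("Anthropomorphism", 1037, 75, 25, 0.106),
    ("Skin Color - Light vs. Dark", 20, 13, 6, -0.487),
    ("Skin Color - Transparency", 172, 40, 4, 0.194),
    ("Avatar Size", 26, 15, 6, -0.422),
    ("Interaction Visual Feedback", 344, 79, 6, 0.451),
]

TOLERANCE = 0.001  # assumed rounding allowance

flagged = []
for name, u, n1, n2, reported in rows:
    recomputed = cliffs_d_from_u(u, n1, n2)
    differs = abs(recomputed - reported) > TOLERANCE
    if differs:
        flagged.append(name)
    marker = "  <- differs from reported value" if differs else ""
    print(f"{name:<28} reported {reported:+.3f}  recomputed {recomputed:+.3f}{marker}")
```

A flagged row would be a candidate for re-checking against the original data. A flag may also simply mean that one of the assumptions above does not hold for that row.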


4. Discussion

By analyzing user reviews from a selection of the most popular VR games on Steam, we found insights that
challenge conventional understandings and potentially inspire novel perspectives in avatar research in
the context of immersive virtual reality. Our discussion explores the multifaceted findings of our study
and interprets the implications of our results.

    4.1. Overall results of the Mann-Whitney U and effect sizes

The predominance of non-significant results in our Mann-Whitney U tests initially appears to suggest a
limited influence of avatar characteristics on player immersion. However, focusing on the effect sizes,
rather than solely on statistical significance, offers a more nuanced understanding of our findings.
Effect sizes provide insight into the magnitude of differences and are less affected by sample size,
which is particularly informative in studies like ours where the sample size is relatively small [32].
However, it is important to note that the precision and confidence intervals of these effect sizes are
still influenced by the sample size.
    These specific effect size results from our study carry important implications for future avatar
research and design. The impact of avatar presence and perspective highlights the need for more targeted
research to understand which aspects of avatar design resonate with different user demographics. For
instance, future studies might explore how individual player characteristics, such as prior VR
experience or personal preferences, interact with avatar features to affect immersion.
    Moreover, our findings about the medium effect of first-person perspective on immersion suggest that
VR game designers might consider offering players the option to choose their preferred perspective. This
customization could cater to diverse player preferences, potentially enhancing the immersive experience
for a broader user base.
    Furthermore, the impact of skin color and avatar size in our study, which exhibited considerable
effect sizes, warrants special attention in design considerations. These factors were among the closest
to reaching statistical significance, indicating their potential substantial influence on immersion in
VR experiences. This suggests that even seemingly minor aspects of avatar design, like skin tone and
body size, can have a profound impact on how users perceive and interact with the virtual environment.

    4.2. Trends in avatar design for VR games

The descriptive analysis of the top 100 VR games on Steam provides a revealing glimpse into current
trends in VR game design, particularly regarding avatar representation. A striking observation from our
data is the scarcity of third-person perspectives, including indirect forms such as mirror reflections,
in popular VR games. This trend has significant implications for the direction of avatar research.
    Most avatar research has placed emphasis on visual attributes such as gender, age, and other
identity markers. However, our findings suggest a disconnect between these research foci and real-world
VR games. In the absence of third-person perspectives or mirrors in most popular VR games, features like
facial appearance, gender, or age are less perceptible. This raises questions about the relevance of
such visual cues in first-person VR environments, where users primarily interact with the game world
through their avatars' hands and actions.
    Given this context, a shift in research focus appears necessary. Hand and lower-body representations
in VR seem to be more critical for user immersion and interaction. This is supported by studies such as
[9], which highlight the importance of hand representation in VR for enhancing the sense of control and
embodiment. Additionally, research by [13] underscores the significance of embodiment in first-person VR
experiences, further validating the need to focus on aspects directly experienced by the user.
    Therefore, future avatar research might consider prioritizing the study of hand and full-body
representations, exploring how their design, realism, and functionality contribute to immersion.

    4.3. Avatar-user visual similarity and immersion

Our findings make a significant contribution to the understanding of avatar-user similarity and its
impact on immersion in VR games. The nuances revealed through our analysis underscore the importance of
similarity in fostering a deeper sense of immersion.
    The preference for more detailed and realistic avatars aligns with the theory that higher fidelity
in avatar design enhances the player's ability to relate to and identify with their virtual counterpart.
This is supported by studies like that of [34], which found that users respond more positively to
avatars that resemble their real-life appearance. Our findings extend this notion, suggesting that a
detailed, high-fidelity avatar can act as an extension of the self within the virtual environment,
thereby enhancing the immersive experience.
    The observation regarding the preference for light-skinned avatars also points to a deeper aspect of
user-avatar similarity, but it could be reflective of the demographic composition of the VR gaming
community. This phenomenon is echoed in the work of [10], which demonstrated how skin color in avatars
could influence the user's experiences and reactions in a virtual environment. Their results suggest
that congruence in physical characteristics, such as skin color, between the avatar and the user can
intensify the immersion, possibly due to enhanced identification.
    Additionally, body size emerged as an influential aspect of avatar-user similarity. Our findings
suggest that avatars with body sizes that closely match or are perceived as ideal by users can
practically impact immersion. This is supported by research from [4], which demonstrated that the
physical dimensions of


avatars, including their height and build, can affect the user’s psychological responses in virtual
interactions. An avatar with a relatable body size can create a more compelling and convincing
representation of the user in the virtual world, contributing to a heightened sense of presence and
immersion.
    These insights suggest that VR developers should consider incorporating customizable avatars that
can adapt to diverse user preferences, thereby enriching the overall user experience in virtual
environments.

     4.4. Methodological innovations and contributions to avatar studies

The methodology employed in this study represents a significant paradigm shift in avatar research,
particularly in the field of VR. By harnessing the power of user-generated content in the form of Steam
reviews, we have introduced a novel approach to understanding how self-avatars impact user experiences
in VR games. This method transcends the limitations of experimental research, offering a more authentic
and comprehensive view of player perceptions and interactions with avatars.
     Our approach, which combines the advanced natural language processing capabilities of BERT with
meticulous manual coding, enables us to extract and analyze nuanced player feedback on a scale
previously unachievable. This dual-method strategy effectively balances the need for large-scale data
analysis with the subtlety of human interpretation, setting a new standard for research in this field.
The use of BERT, particularly, exemplifies the cutting edge of computational linguistics, offering
unprecedented precision in identifying and classifying relevant user sentiments [33].
     The significance of this methodology lies not just in its technical prowess but also in its ability
to capture the diverse and multifaceted experiences of users. Unlike controlled experimental settings,
our approach taps into a rich vein of real-world user interactions, encompassing a broad spectrum of
opinions and experiences.
     Furthermore, the insights garnered through this method offer invaluable implications for VR game
design and avatar development. By understanding user preferences and perceptions as expressed
organically in reviews, developers can tailor avatar designs more effectively to enhance user immersion
and satisfaction. This user-centered approach to design is increasingly recognized as vital in creating
engaging and impactful VR experiences [35].

5. Limitations and future
   research agenda

While our study provides valuable insights into self-avatar characteristics in VR games, it is crucial
to acknowledge its limitations and outline potential avenues for future research.

    5.1. Sample size and scope of analysis

The primary limitation of our study is the sample size (n=100), which constrains the depth and breadth
of our analysis. With a larger dataset, more robust statistical methods like linear regression or factor
analysis could be employed to uncover deeper insights [36]. The selected games are the most popular on
Steam, which means that the findings in this study might not be generalizable to a broader range of VR
experiences.
    Expanding the study to include other VR platforms, such as Meta Quest, can provide a broader
perspective. By analyzing user reviews across different platforms, researchers can capture a more
diverse range of user experiences and preferences, as noted by [35] in their discussion on user
experience research.
    Future research should also consider longitudinal studies to track changes in user preferences and
perceptions over time, as VR technology continues to evolve [37].

    5.2. Sentiment classification

The initial plan to classify sentiment in immersion-related reviews encountered a limitation due to the
small size of the training dataset. This small sample size restricts the robustness and generalizability
of any sentiment classification model we could develop.
    Despite the small dataset, our annotated data revealed a significant majority of positive
immersion-related reviews (93.15%, n=136). This high proportion of positive sentiment is promising,
suggesting that players generally perceive immersion aspects of VR games favorably. However, as [38]
noted, sentiment analysis in complex domains like gaming can benefit from more nuanced classification,
capturing a spectrum of sentiments rather than a binary positive/negative division.
    For future research, expanding the dataset for training the sentiment model is crucial. A larger and
more varied set of reviews would enable the development of a more sophisticated sentiment analysis model
that can accurately differentiate between positive and negative sentiments regarding immersion.
    Additionally, future studies should consider employing advanced machine learning techniques that can
handle imbalanced datasets, as often seen in user-generated content where certain sentiments may
dominate. Techniques such as SMOTE (Synthetic Minority Over-sampling Technique) or ensemble learning
methods can address the imbalance, as recommended by [39].

    5.3. Observational limitations

The use of YouTube gameplay videos as a primary source for observing avatar characteristics in VR games
presents several limitations that need consideration in future research.
    While YouTube offers accessibility and a wide range of content, relying on gameplay videos for


detailed observations can lead to incomplete or
skewed data. Gameplay videos are often edited and
                                                                 References
curated, potentially omitting crucial aspects of the
gaming experience that are pertinent to avatar                   [1]  M. Slater, B. Spanlang, M. V. Sanchez-Vives, and O.
research. This limitation aligns with the concerns                    Blanke, “First person experience of body transfer
raised by [40], who note that online video content may                in virtual reality,” PLoS One, vol. 5, no. 5, 2010,
not always represent the full spectrum of user                        doi: 10.1371/journal.pone.0010564.
experiences due to selective editing.                            [2] M. V. Sanchez-Vives and M. Slater, “From
    Another limitation is the potential for bias in the               presence to consciousness through virtual
selection of videos. Content creators may have specific               reality,” Nature Reviews Neuroscience, vol. 6, no.
preferences or play styles that do not represent the                  4, 2005, doi: 10.1038/nrn1651.
average player's experience. This issue is highlighted
by [41], who discusses how content creator biases can
influence the portrayal of digital experiences in online
videos.
    To address these limitations, future studies could
incorporate direct gameplay observation through
platforms that offer unedited and comprehensive
gameplay experiences. For instance, using data from
beta testing sessions or developer-provided gameplay
footage could yield more accurate and representative
insights into avatar characteristics. Additionally, as
suggested by [42], incorporating player interviews or
surveys alongside gameplay observation can provide a
more holistic understanding of player experiences and
perceptions.

    5.4. Impact of game type diversity

The diversity of game types in our sample of 100 VR
games, ranging from RPGs to sports, music videos, and
shooting games, introduces a potential confounding
variable in our analysis of avatar representations. The
variation in game genres can significantly influence
how avatars are designed and interacted with,
potentially impacting user perceptions of immersion.
    The heterogeneous nature of game genres in our
dataset could have diluted the specificity of our
findings regarding avatar representations. As
highlighted by [43], different game genres cater to
different player expectations and experiences, which
can significantly affect how players perceive and
interact with avatars. For instance, the role and
representation of avatars in an RPG might be
fundamentally different from those in a sports game,
leading to varied impacts on user immersion.
    To address this issue, future research should
consider focusing on specific game genres to control
genre-related variance. This approach would allow for
a more nuanced understanding of how avatar
representations influence immersion within a
particular gaming context. By isolating the variable of
game type, researchers can more accurately assess the
impact of avatars on the user experience.

Acknowledgements

This research was partially supported by the Academy
of Finland (342144; ’POSTEMOTION’).

[3] J. N. Bailenson, J. Blascovich, A. C. Beall, and J. M.
     Loomis, “Equilibrium theory revisited: Mutual
     gaze and personal space in virtual
     environments,” Presence: Teleoperators and
     Virtual Environments, vol. 10, no. 6, 2001, doi:
     10.1162/105474601753272844.
[4] N. Yee and J. Bailenson, “The proteus effect: The
     effect of transformed self-representation on
     behavior,” Hum Commun Res, vol. 33, no. 3, 2007,
     doi: 10.1111/j.1468-2958.2007.00299.x.
[5] A. Oulasvirta and K. Hornbæk, “HCI research as
     problem-solving,” in Conference on Human
     Factors in Computing Systems - Proceedings,
     2016. doi: 10.1145/2858036.2858283.
[6] M. Muller and S. Kogan, “Grounded theory
     method in hci and cscw,” Cambridge: IBM Center
     for Social …, 2010.
[7] D. Deng, M. Bujic, and J. Hamari, “Understanding
     Multi-platform Social VR Consumer Opinions: A
     Case Study in VRChat Using Topics Modeling of
     Reviews,” in Lecture Notes in Business
     Information Processing, 2023. doi:
     10.1007/978-3-031-32302-7_4.
[8] C. Lu, X. Li, T. Nummenmaa, Z. Zhang, and J.
     Peltonen, “Patches and Player Community
     Perceptions: Analysis of No Man’s Sky Steam
     Reviews,” 2020.
[9] F. Argelaguet, L. Hoyet, M. Trico, and A. Lécuyer,
     “The role of interaction in virtual embodiment:
     Effects of the virtual hand representation,” in
     Proceedings - IEEE Virtual Reality, 2016. doi:
     10.1109/VR.2016.7504682.
[10] T. C. Peck, S. Seinfeld, S. M. Aglioti, and M. Slater,
     “Putting yourself in the skin of a black avatar
     reduces implicit racial bias,” Conscious Cogn, vol.
     22, no. 3, pp. 779–787, 2013, doi:
     10.1016/j.concog.2013.04.016.
[11] M. Slater and S. Wilbur, “A framework for
     immersive virtual environments (FIVE):
     Speculations on the role of presence in virtual
     environments,” Presence: Teleoperators and
     Virtual Environments, vol. 6, no. 6, 1997, doi:
     10.1162/pres.1997.6.6.603.
[12] F. Biocca, “The Cyborg’s Dilemma: Progressive
     Embodiment in Virtual Environments [1],”
     Journal of Computer-Mediated Communication,
     vol. 3, no. 2, 2006, doi:
     10.1111/j.1083-6101.1997.tb00070.x.
[13] K. Kilteni, R. Groten, and M. Slater, “The Sense of
     Embodiment in virtual reality,” Presence:
     Teleoperators and Virtual Environments, vol. 21,
     no. 4. 2012. doi: 10.1162/PRES_a_00124.
[14] A. McMahan, “Immersion, engagement, and
     presence: A method for analyzing 3-d video


     games,” in The Video Game Theory Reader, 2013.
     doi: 10.4324/9780203700457-10.
[15] R. Tamborini and P. Skalski, “The role of presence
     in the experience of electronic games,” in Playing
     Video Games: Motives, Responses, and
     Consequences, 2006. doi: 10.4324/9780203873700.
[16] S. J. Ahn, “Embodied Experiences in Immersive
     Virtual Environments: Effects on
     Pro-Environmental Attitude and Behavior,”
     Dissertation: Stanford University, no. May, 2011.
[17] J. Burleson, V. Grover, J. B. Thatcher, and H. Sun,
     “A Representation Theory Perspective on the
     Repurposing of Personal Technologies for
     Work-Related Tasks,” J Assoc Inf Syst, vol. 22,
     no. 6, 2021, doi: 10.17705/1jais.00707.
[18] M. Shelley and K. Krippendorff, “Content
     Analysis: An Introduction to its Methodology,” J
     Am Stat Assoc, vol. 79, no. 385, 1984, doi:
     10.2307/2288384.
[19] K. A. Neuendorf, The Content Analysis
     Guidebook. 2020. doi: 10.4135/9781071802878.
[20] D. Riffe, S. Lacy, and F. Fico, Analyzing media
     messages: Using quantitative content analysis in
     research. 2014. doi: 10.4324/9780203551691.
[21] M. E. McCombs and D. L. Shaw, “The agenda-
     setting function of mass media,” Public Opin Q,
     vol. 36, no. 2, 1972, doi: 10.1086/267990.
[22] T. Tang, E. Fang, and F. Wang, “Is neutral really
     neutral? The effects of neutral user-generated
     content on product sales,” J Mark, vol. 78, no. 4,
     2014, doi: 10.1509/jm.13.0301.
[23] N. Archak, A. Ghose, and P. G. Ipeirotis,
     “Deriving the pricing power of product features
     by mining consumer reviews,” Management
     Science, vol. 57, no. 8. 2011. doi:
     10.1287/mnsc.1110.1370.
[24] C. Vásquez, “Right now versus back then:
     Recency and remoteness as discursive resources
     in online reviews,” Discourse, Context and Media,
     vol. 9, pp. 5–13, Sep. 2015, doi:
     10.1016/j.dcm.2015.05.010.
[25] Y. Zhang, Y. Goh, and Q. Wang, “Unraveling the
     Effect of Competing Product Reviews on
     Consumer Choice and the Moderating Role of
     Consumer-Reviewer Peer Types,” IEEE Trans
     Eng Manag, vol. 70, no. 10, p. 3315, 2023, doi:
     10.1109/TEM.2021.
[26] R. Ullah, N. Amblee, W. Kim, and H. Lee, “From
     valence to emotions: Exploring the distribution
     of emotions in online product reviews,” Decis
     Support Syst, vol. 81, pp. 41–53, Jan. 2016, doi:
     10.1016/j.dss.2015.10.007.
[27] M. Lombard and T. Ditton, “At the heart of it all:
     The concept of presence,” Journal of Computer-
     Mediated Communication, vol. 3, no. 2. 1997. doi:
     10.1111/j.1083-6101.1997.tb00072.x.
[28] P. H. Mirvis and M. Csikszentmihalyi, “Flow: The
     Psychology of Optimal Experience,” The
     Academy of Management Review, vol. 16, no. 3,
     1991, doi: 10.2307/258925.
[29] R. Hunicke, M. Leblanc, and R. Zubek, “MDA: A
     formal approach to game design and game
     research,” in AAAI Workshop - Technical Report,
     2004.
[30] B. G. Witmer and M. J. Singer, “Measuring
     presence in virtual environments: A presence
     questionnaire,” Presence: Teleoperators and
     Virtual Environments, vol. 7, no. 3, 1998, doi:
     10.1162/105474698565686.
[31] M. Slater, D. Banakou, A. Beacco, J. Gallego, F.
     Macia-Varela, and R. Oliva, “A Separate Reality:
     An Update on Place Illusion and Plausibility in
     Virtual Reality,” Frontiers in Virtual Reality, vol.
     3. 2022. doi: 10.3389/frvir.2022.914392.
[32] J. Cohen, Statistical power analysis for the
     behavioural sciences. Hillsdale, NJ: Lawrence
     Erlbaum Associates. 1988.
[33] J. Devlin, M. W. Chang, K. Lee, and K. Toutanova,
     “BERT: Pre-training of deep bidirectional
     transformers for language understanding,” in
     NAACL HLT 2019 - 2019 Conference of the North
     American Chapter of the Association for
     Computational Linguistics: Human Language
     Technologies - Proceedings of the Conference,
     2019.
[34] J. Bailenson, K. Patel, A. Nielsen, R. Bajcsy, S. H.
     Jung, and G. Kurillo, “The effect of interactivity on
     learning physical actions in virtual reality,”
     Media Psychol, vol. 11, no. 3, 2008, doi:
     10.1080/15213260802285214.
[35] M. Hassenzahl and N. Tractinsky, “User
     experience - A research agenda,” Behaviour and
     Information Technology, vol. 25, no. 2, 2006, doi:
     10.1080/01449290500330331.
[36] A. Field, Discovering Statistics using SPSS
     Statistics, vol. 66. 2009.
[37] S. Faisal, P. Cairns, and A. Blandford, “Building for
     users not for experts: Designing a visualization of
     the literature domain,” in Proceedings of the
     International Conference on Information
     Visualisation, 2007. doi: 10.1109/IV.2007.32.
[38] B. Pang and L. Lee, “Opinion mining and
     sentiment analysis,” Foundations and Trends in
     Information Retrieval, vol. 2, no. 1–2, 2008, doi:
     10.1561/1500000011.
[39] N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P.
     Kegelmeyer, “SMOTE: Synthetic minority
     over-sampling technique,” Journal of Artificial
     Intelligence Research, vol. 16, 2002, doi:
     10.1613/jair.953.
[40] M. Sjöblom and J. Hamari, “Why do people watch
     others play video games? An empirical study on
     the motivations of Twitch users,” Comput Human
     Behav, vol. 75, 2017, doi:
     10.1016/j.chb.2016.10.019.
[41] R. Rosalen, “YouTube: Online video and
     participatory culture,” New Media Soc, vol. 21,
     no. 9, 2019, doi: 10.1177/1461444819859476.
[42] M. Consalvo and N. Dutton, “Game analysis:
     Developing a methodological toolkit for the
     qualitative study of games,” Game Studies, vol. 6,
     no. 1, 2006.
[43] N. Wardrip-Fruin and P. Harrigan, First person:
     new media as story, performance, and game, vol.
     42, no. 02. 2006. doi: 10.5860/choice.42-0713.



</pre>