=Paper= {{Paper |id=Vol-2995/paper6 |storemode=property |title=People in the Context - an Analysis of Game-based Experimental Protocol |pdfUrl=https://ceur-ws.org/Vol-2995/paper6.pdf |volume=Vol-2995 |authors=Krzysztof Kutt,Laura Żuchowska,Szymon Bobek,Grzegorz J. Nalepa |dblpUrl=https://dblp.org/rec/conf/ijcai/KuttZBN21 }} ==People in the Context - an Analysis of Game-based Experimental Protocol== https://ceur-ws.org/Vol-2995/paper6.pdf
Twelfth International Workshop Modelling and Reasoning in Context (MRC) @IJCAI 2021                                                        46




          People in the Context – an Analysis of Game-based Experimental Protocol

Krzysztof Kutt1∗, Laura Żuchowska2, Szymon Bobek1,2 and Grzegorz J. Nalepa1,2
1 Jagiellonian Human-Centered Artificial Intelligence Laboratory (JAHCAI) and Institute of Applied Computer Science, Jagiellonian University, Kraków, Poland
2 Department of Applied Computer Science, AGH University of Science and Technology, Kraków, Poland
krzysztof.kutt@uj.edu.pl, szymon.bobek@uj.edu.pl, gjn@gjn.re


Abstract

The paper provides insights into two main threads of analysis of the BIRAFFE2 dataset: the associations between personality and physiological signals, and the generation and processing of game logs. Alongside the presentation of results, we propose the generation of event-marked maps as an important step in the exploratory analysis of game data. The paper concludes with a set of guidelines for using games as a context-rich experimental environment.

1 Introduction and Motivation

The development of a good personalised intelligent assistant that behaves in a natural way requires a proper toolbox as a base [Nalepa et al., 2019]. In order to be user-friendly, an assistant should not only perform its task, but also respond to the user's changing emotions. This is due to our natural tendency to anthropomorphize interfaces – the user will assume that the assistant will react appropriately, e.g., understand that the nervousness is due to a mistake committed. Such affective information can be extracted from a range of physiological signals, particularly those obtained through low-cost wearable devices that will make this technology available to everyone. Finally, it is important to note that emotions do not happen in a void: they always depend on the context a person is in [Prinz, 2006], so it is also important to collect information about the user's current situation (e.g., activity, weather conditions, time of day).

An important step in establishing the above-outlined framework for personalized assistants is the collection of the right data. This, in turn, strictly depends on the development of appropriate research environments and experimental protocols. Such issues are addressed in the BIRAFFE (Bio-Reactions and Faces for Emotion-based Personalization) series of experiments [Kutt et al., 2021]. Their main objective is to develop methods for emotion recognition using a range of contextual information and physiological signals such as cardiac activity (ECG), electrodermal response (EDA), hand movements (accelerometer) or changes in facial expression. In order to ensure that the research is highly ecological in measurement and easily extendable to wider research groups, wearable, portable and affordable-for-all devices are used.

A key aspect of the BIRAFFE experiments is the use of games as the experimental environment. They were chosen as a trade-off between a stimulus-rich, complex, near-natural environment and the need to control and record as much context as possible to provide the most detailed post-experimental analyses. The latest version of the experiment, BIRAFFE2 [Kutt et al., 2020]1, used a game consisting of three independent levels. The aim of the first was to evoke positive emotions. The second was intended to induce irritation and frustration, e.g., through impaired control. Finally, the third level was a neutral maze. A detailed description of the games is presented in [Żuchowska et al., 2020].

This paper provides insights into the core analyses of the BIRAFFE2 dataset on contextual information processing in affective games. The first thread, presented in Sect. 2, focuses on the analysis of the relationship between physiological signals and the so-called "Big Five" personality traits. The existence of such relationships in the data will allow further work to create emotion prediction models that will be moderated and personalised through the identification of personality profiles. The second thread, described in Sect. 3, addresses accurate game logging and the possibility of reconstructing an entire game from such stored logs. The article concludes in Sect. 4 with a set of lessons learned regarding the implementation of games as an experimental environment.

∗ Corresponding Author
1 The entire dataset from the BIRAFFE2 experiment is available under CC licence on the Zenodo platform, DOI:10.5281/zenodo.3865859.

Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

2 Physiological Signals and Personality

Before undertaking the analyses, three features were calculated for the ECG signal using the HeartPy library [van Gent et al., 2019]: heart rate (number of heart beats per minute), mean of successive differences between R-R intervals (MoSD) and breathing rate. Also, to group the valence and arousal scores into a discrete variable, 16 clusters were introduced, as shown in Fig. 1.

In order to find correlations and dependencies between physiological data (the ECG signal was chosen as an illustration) and personality traits (each on [1, 10] scale), several
approaches to statistical analysis were applied. Firstly, basic descriptive statistics were calculated to find outliers and possible extreme values. As can be seen in Tab. 1-2, the data were distributed proportionally in terms of mean, median and standard deviation, which indicates a promising start for further analysis.

The second analysis investigated correlations between features. Although the results did not show any strong dependencies between them (see Fig. 2), they indicated the existence of potentially interesting relationships worthy of further analysis and research. Namely, in terms of the associations between personality and widget responses, valence and arousal are related to distinct traits. For arousal, the highest values are for openness and conscientiousness. On the other hand, the traits most strongly related to valence are agreeableness and extraversion. When considering the correlations between physiological reactions and widget responses, among heart rate, MoSD and breathing rate, the highest values were noted for heart rate, for both valence and arousal. Among the personality traits, heart rate was most strongly correlated with conscientiousness (−0.17) and extraversion (−0.11). For MoSD, the highest values (and the highest inter-correlations in general, i.e., correlations between different data sources) were for extraversion (0.23) and conscientiousness (−0.19). Finally, breathing rate correlated most strongly with extraversion.

The last statistical analysis consisted of two ANOVAs, for valence and arousal (see Tab. 3-4), which indicated several strong associations. What seems most interesting is the strong relationship between heart rate and valence, which is somewhat at odds with most approaches, in which heart rate is used to predict arousal, while other signals such as EDA are mostly used for valence [Dzedzickis et al., 2020].

[Figure 1: a 4 × 4 grid of clusters over the valence-arousal plane; both axes (Valence horizontal, Arousal vertical) have ticks at 1, 3, 5, 7 and 9. Cluster numbers by row, from high to low arousal, and from low to high valence within a row:]

          12 13 14 15
           8  9 10 11
           4  5  6  7
           0  1  2  3

Figure 1: Valence and arousal scores grouped into clusters.

           Personality Trait        Mean       SD      Median
           Conscientiousness        5.68     2.61           6
           Openness                 5.48     2.22           6
           Agreeableness            6.25     2.39           6
           Extraversion             5.84     2.29           7
           Neuroticism              5.38     2.45           5

Table 1: Descriptive statistics for personality traits.

           ECG characteristic        Mean       SD
           Heart rate [BPM]         80.92    16.21
           MoSD [ms]                34.41    37.89
           Breathing rate [Hz]       0.10     0.12

Table 2: Descriptive statistics for ECG characteristics.

  Independent var.              df              MS                    F                    p
  Conscientiousness              1                3.85              0.60               0.44
  Openness                       1                5.24              0.82               0.37
  Agreeableness                  1               68.57             10.70              0.001
  Extraversion                   1                6.12              0.95               0.33
  Neuroticism                    1                0.02             0.004               0.95
  Heart rate                     1               59.97              9.35              0.002
  MoSD                           1                0.87              0.14               0.71
  Breathing rate                 1                2.91              0.45               0.50
  Residual Error             10881                6.41

Table 3: ANOVA model for valence with five personality traits and three ECG-related characteristics as independent variables.

3 Game Logs and Questionnaires

As noted in the introduction, one motivation for using games as an experimental environment is the ability to frequently sample and log the entire player context. Properly prepared logs should allow the reconstruction of both the level map (the same for each subject) and the course of the entire game for each player. Indeed, this is possible for the games studied. As part of the log analyses, a number of maps were generated and verified by comparison with the games and with recorded screencasts of the gameplay. These maps can also be used for aggregated analyses, e.g., by plotting all events of one type and then inspecting them visually. Fig. 3 shows all the death locations of the protagonist in the first level. One can notice a very high number of deaths in the central room – this is consistent with the observations made during the experiment: this is the first room, where players are just getting familiar with the game interface.

Another part of the analysis was the examination of answers from the Game Experience Questionnaire (GEQ) [IJsselsteijn et al., 2013], a survey completed by each participant at the end of the experiment. The results make it possible to assess whether, according to the subjects themselves, the games had an impact on their emotional state. The results are represented by a 7-factor structure. Five of the factors were further analysed, as they were the most relevant to the assumed game differences:
   • Challenge – I felt time pressure/I had to put a lot of effort,
   • Tension – I was irritated/I feel angry,
   • Negative affect – I felt bad/made me bored,
   • Positive affect – I felt good/made me happy,
   • Competence – I felt competent/skillful.

[Figure 2: correlation heatmap with a colour scale from −1.00 to 1.00. Cell values are listed below; columns follow the row order: O = Openness, C = Conscientiousness, E = Extraversion, A = Agreeableness, N = Neuroticism, HR = Heart rate, MoSD, BR = Breathing rate, Val = Valence, Aro = Arousal, Clu = Cluster.]

                     O        C        E        A        N        HR       MoSD     BR       Val      Aro      Clu
  Openness           1        0.14     0.26     0.083    0.038    0.0085   0.093    0.09     0.016    0.049    0.045
  Conscientiousness  0.14     1        0.16     0.12     -0.22    -0.17    -0.19    -0.12    0.011    0.042    0.05
  Extraversion       0.26     0.16     1        0.35     -0.38    -0.11    0.23     0.14     0.023    -0.0079  0.0006
  Agreeableness      0.083    0.12     0.35     1        -0.011   -0.046   -0.067   0.059    0.039    -0.033   -0.025
  Neuroticism        0.038    -0.22    -0.38    -0.011   1        0.07     -0.023   0.092    -0.0031  0.014    0.0095
  Heart rate         0.0085   -0.17    -0.11    -0.046   0.07     1        0.39     0.32     0.029    -0.056   -0.035
  MoSD               0.093    -0.19    0.23     -0.067   -0.023   0.39     1        0.74     0.014    -0.01    -0.0086
  Breathing rate     0.09     -0.12    0.14     0.059    0.092    0.32     0.74     1        0.021    -0.012   -0.015
  Valence            0.016    0.011    0.023    0.039    -0.0031  0.029    0.014    0.021    1        -0.12    0.19
  Arousal            0.049    0.042    -0.0079  -0.033   0.014    -0.056   -0.01    -0.012   -0.12    1        0.9
  Cluster            0.045    0.05     0.0006   -0.025   0.0095   -0.035   -0.0086  -0.015   0.19     0.9      1

Figure 2: Correlation matrix for five personality traits, three ECG-related characteristics and widget responses represented as Valence, Arousal and Cluster.

  Independent var.              df              MS                    F                    p
  Conscientiousness              1               60.43             14.50            < 0.001
  Openness                       1              101.03             24.25            < 0.001
  Agreeableness                  1               44.51             10.68              0.001
  Extraversion                   1                9.61              2.31               0.13
  Neuroticism                    1               13.14              3.15               0.08
  Heart rate                     1              138.06             33.13            < 0.001
  MoSD                           1               11.27              2.71               0.10
  Breathing rate                 1                1.64              0.39               0.53
  Residual Error             10881                4.17

Table 4: ANOVA model for arousal with five personality traits and three ECG-related characteristics as independent variables.
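
The Cluster variable of Fig. 1 and its behaviour in the correlation matrix of Fig. 2 can be illustrated with a minimal sketch. It assumes that the widget scores lie in [1, 9] (read off the axis ticks of Fig. 1), that each cluster is a 2 × 2 cell of that plane, and that a boundary value (3, 5 or 7) belongs to the upper cell; the function names and the synthetic input are illustrative and do not come from the BIRAFFE2 code or data.

```python
from math import sqrt


def cluster_id(valence, arousal, low=1.0, high=9.0, bins=4):
    """Map a (valence, arousal) pair in [low, high] to a cluster 0..15.

    Cells are numbered row by row starting from the low-arousal,
    low-valence corner, as in Fig. 1. Assigning boundary values to the
    upper cell is an assumption of this sketch.
    """
    width = (high - low) / bins
    v_bin = min(int((valence - low) // width), bins - 1)
    a_bin = min(int((arousal - low) // width), bins - 1)
    return a_bin * bins + v_bin


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Synthetic responses that cover the plane evenly (not BIRAFFE2 data).
scale = [1 + 0.5 * i for i in range(17)]  # 1.0, 1.5, ..., 9.0
pairs = [(v, a) for v in scale for a in scale]
valence = [v for v, _ in pairs]
arousal = [a for _, a in pairs]
cluster = [cluster_id(v, a) for v, a in pairs]

r_valence = pearson(valence, cluster)
r_arousal = pearson(arousal, cluster)
print(f"valence vs cluster: {r_valence:.2f}")
print(f"arousal vs cluster: {r_arousal:.2f}")
```

With such a row-major numbering, arousal selects the row and valence only the position within the row, so the cluster index follows arousal much more closely than valence. This is in line with the Cluster correlations in Fig. 2 (0.9 with arousal, 0.19 with valence).
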
   The factors were compared with each other to examine the players' feelings. In the first game, the subject was expected to feel happy (high positive affect, low negative affect, low tension) and not challenged (high competence, low challenge). The purpose of the second stage was the opposite – high negative affect, tension and challenge, with low competence and positive affect. Such a large difference is more likely to have an impact, as the contrast hits the player suddenly. Based on the GEQ results (see Fig. 4), one can state that everything worked as planned.

   Competence during the first gameplay was fairly high, while tension stayed low, making the subjects feel calm enough to let their guard down while still being entertained by the gameplay. The extreme difficulty and pressure-building environment of the second stage made the experience hard to enjoy. A very similar result can be seen in the Negative affect/Tension comparison. About 95% of the participants agreed that the second level left them irritated, and 83.5% were not happy during and after the game. This cannot be said about the first stage, where, according to the answers, only 30% of subjects felt somewhat irritated. The same holds for positive affect in both stages – the first kept the participants' emotions at a very high level of happiness, while the second reduced it to a low one.

Figure 3: Map for Stage 1 recreated from game logs with all death-related events marked as dots.

4 Discussion and Lessons Learned

As a summary of the analyses presented, we propose a set of guidelines concerning the issues one should pay attention to when creating games with the intention of using them as context-rich experimental environments:





                               Factor: Challenge                                      pects, should be preferred to one large level that com-
               4                                                                      bines all experimental manipulations. The levels anal-
Factor level
           ysed achieved their objectives well, as shown by the results of the GEQ
           questionnaire in Sect. 3.

        3. Logs should be collected as densely as possible, according to the specifics of
           the game being developed. All features necessary to reproduce the gameplay should
           be recorded. In the analyses carried out, the logs proved sufficiently detailed to
           reproduce the progress of the game. However, the data lacked information on the
           type of death in the second level, which would be useful to compare with the
           emotions felt at the time of death. This information can still be recovered, e.g.,
           from the recorded screencasts, but doing so will require a fair amount of data
           processing.

        4. Maps with events marked on them are a useful tool for exploratory analysis of game
           logs. There are a number of studies concerning the analysis of game logs (e.g.,
           [Cheong et al., 2008]), including those related to the evaluation of social science
           theories [Shim et al., 2011]. However, to the best of our knowledge, data
           visualisation in the form of maps (as in Fig. 3) has not been done as part of such
           analyses. We believe that this is a valuable approach to quickly assess the validity
           of the data and to propose hypotheses that have not been considered before.

These findings will be incorporated into the preparation of the next experiment in the
BIRAFFE series, planned for Autumn 2021.
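
As an illustration of recommendations 3 and 4, the following minimal Python sketch bins logged events into a grid over the level map and renders the counts as text. The record layout (`LogEvent` with `timestamp`, `x`, `y`, `event`, `detail`), the function names, and the cell size are assumptions made for this example and do not reflect the actual BIRAFFE2 log format. The `detail` field shows how a cause of death could be logged directly rather than recovered later from screencasts.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class LogEvent(NamedTuple):
    """Hypothetical log record; NOT the actual BIRAFFE2 log schema."""
    timestamp: float  # seconds since the start of the level
    x: float          # player position in map units
    y: float
    event: str        # event type, e.g. "death" or "pickup"
    detail: str = ""  # extra context, e.g. the cause of a death


def bin_events(events: Iterable[LogEvent], event: str, cell_size: float) -> Counter:
    """Count events of one type per square grid cell of side `cell_size`."""
    counts: Counter = Counter()
    for record in events:
        if record.event == event:
            cell = (int(record.x // cell_size), int(record.y // cell_size))
            counts[cell] += 1
    return counts


def render_map(counts: Counter, width: int, height: int) -> str:
    """Draw the grid as text: '.' for empty cells, a digit (capped at 9) otherwise."""
    rows = []
    for row in range(height - 1, -1, -1):  # print the top row (largest y) first
        cells = []
        for col in range(width):
            n = counts.get((col, row), 0)
            cells.append(str(min(n, 9)) if n else ".")
        rows.append("".join(cells))
    return "\n".join(rows)


if __name__ == "__main__":
    # Made-up records used only to exercise the functions above.
    log = [
        LogEvent(1.0, 0.5, 0.5, "death", "fall"),
        LogEvent(2.0, 0.7, 0.2, "death", "enemy"),
        LogEvent(3.0, 2.5, 1.5, "death", "enemy"),
        LogEvent(4.0, 2.5, 1.5, "pickup"),
    ]
    deaths = bin_events(log, "death", cell_size=1.0)
    print(render_map(deaths, width=3, height=2))
```

A text grid keeps the sketch free of plotting dependencies; in practice the same per-cell counts could be overlaid on an image of the level, and grouping by `detail` would give one map per cause of death.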

Acknowledgements

The research has been supported by a grant from the Priority Research Area Digiworld under
the Strategic Programme Excellence Initiative at the Jagiellonian University.
   The authors are also grateful to Academic Computer Centre CYFRONET AGH and Jagiellonian
University for granting access to the computing infrastructure built in the projects
No. POIG.02.03.00-00-028/08 “PLATON – Science Services Platform” and
No. POIG.02.03.00-00-110/13 “Deploying high-availability, critical services in Metropolitan
Area Networks (MAN-HA)”.

  [Figure 4 plot: one panel per GEQ factor, including Tension, Negative affect, Positive
  affect, and Competence; x-axis: Subject; y-axis: Factor level.]
  Figure 4: GEQ factors for the first and second level (green and yellow, respectively).
  Horizontal lines mark the average values.

        1. It is important to take into account the features of the subjects in the
           contextual information set. In line with the results obtained from the BIRAFFE1
           [Kutt et al., 2021] and DEAP [Zhao et al., 2019] datasets, the analyses summarised
           in Sect. 2 indicate interesting relationships between personality traits and
           physiological signals. Merging several such pieces of subject-related contextual
           information will allow a more accurate analysis, leading to better modelling of a
           person’s behaviour in the considered environment.

        2. The set of stimuli should be well balanced so that there are neither too many
           (which will make analysis difficult) nor too few (the environment will not be
           interesting for the subject). Small levels, each focusing on selected as-

References

[Cheong et al., 2008] Yun-Gyung Cheong, Arnav Jhala, Byung-Chull Bae, and Robert Michael
   Young. Automatically generating summary visualizations from game logs. In Christian
   Darken and Michael Mateas, editors, AIIDE 2008. The AAAI Press, 2008.
[Dzedzickis et al., 2020] Andrius Dzedzickis, Arturas Kaklauskas, and Vytautas Bucinskas.
   Human emotion recognition: Review of sensors and methods. Sensors, 20(3):592, 2020.
[IJsselsteijn et al., 2013] Wijnand A. IJsselsteijn, Yvonne A. W. de Kort, and Karolien
   Poels. The Game Experience Questionnaire. Technische Universiteit Eindhoven, 2013.





[Kutt et al., 2020] Krzysztof Kutt, Dominika Drążyk, Maciej Szelążek, Szymon Bobek, and
   Grzegorz J. Nalepa. The BIRAFFE2 experiment. Study in bio-reactions and faces for
   emotion-based personalization for AI systems. CoRR, abs/2007.15048, 2020.
[Kutt et al., 2021] Krzysztof Kutt, Dominika Drążyk, Szymon Bobek, and Grzegorz J. Nalepa.
   Personality-based affective adaptation methods for intelligent systems. Sensors,
   21(1):163, 2021.
[Nalepa et al., 2019] Grzegorz J. Nalepa, Krzysztof Kutt, and Szymon Bobek. Mobile platform
   for affective context-aware systems. Future Generation Computer Systems, 92:490–503,
   March 2019.
[Prinz, 2006] Jesse J. Prinz. Gut Reactions. A Perceptual Theory of Emotion. Oxford
   University Press, Oxford, 2006.
[Shim et al., 2011] Kyong Jin Shim, Nishith Pathak, Muhammad Aurangzeb Ahmad, Colin DeLong,
   Zoheb Borbora, Amogh Mahapatra, and Jaideep Srivastava. Analyzing human behavior from
   multiplayer online game logs: A knowledge discovery approach. IEEE Intell. Syst.,
   26(1):85–89, 2011.
[van Gent et al., 2019] Paul van Gent, Haneen Farah, Nicole van Nes, and Bart van Arem.
   Analysing noisy driver physiology real-time using off-the-shelf sensors: Heart rate
   analysis software from the Taking the Fast Lane project. Journal of Open Research
   Software, 7(1):32, 2019.
[Zhao et al., 2019] Sicheng Zhao, Amir Gholaminejad, Guiguang Ding, Yue Gao, Jungong Han,
   and Kurt Keutzer. Personalized emotion recognition by personality-aware high-order
   learning of physiological signals. ACM Trans. Multim. Comput. Commun. Appl.,
   15(1s):14:1–14:18, 2019.
[Żuchowska et al., 2020] Laura Żuchowska, Krzysztof Kutt, Krzysztof Geleta, Szymon Bobek,
   and Grzegorz J. Nalepa. Affective games provide controlable context. Proposal of an
   experimental framework. In Jörg Cassens, Rebekah Wegener, and Anders Kofod-Petersen,
   editors, Proceedings of the Eleventh International Workshop Modelling and Reasoning in
   Context co-located with the 24th European Conference on Artificial Intelligence,
   MRC@ECAI 2020, Santiago de Compostela, Galicia, Spain, August 29, 2020, volume 2787 of
   CEUR Workshop Proceedings, pages 45–50. CEUR-WS.org, 2020.




Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).