=Paper=
{{Paper
|id=Vol-2995/paper6
|storemode=property
|title=People in the Context - an Analysis of Game-based Experimental Protocol
|pdfUrl=https://ceur-ws.org/Vol-2995/paper6.pdf
|volume=Vol-2995
|authors=Krzysztof Kutt,Laura Żuchowska,Szymon Bobek,Grzegorz J. Nalepa
|dblpUrl=https://dblp.org/rec/conf/ijcai/KuttZBN21
}}
==People in the Context - an Analysis of Game-based Experimental Protocol==
Twelfth International Workshop Modelling and Reasoning in Context (MRC) @IJCAI 2021

Krzysztof Kutt<sup>1,∗</sup>, Laura Żuchowska<sup>2</sup>, Szymon Bobek<sup>1,2</sup> and Grzegorz J. Nalepa<sup>1,2</sup>

<sup>1</sup> Jagiellonian Human-Centered Artificial Intelligence Laboratory (JAHCAI) and Institute of Applied Computer Science, Jagiellonian University, Kraków, Poland

<sup>2</sup> Department of Applied Computer Science, AGH University of Science and Technology, Kraków, Poland

krzysztof.kutt@uj.edu.pl, szymon.bobek@uj.edu.pl, gjn@gjn.re

<sup>∗</sup> Corresponding author

Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

===Abstract===
The paper provides insights into two main threads of analysis of the BIRAFFE2 dataset: the associations between personality and physiological signals, and the generation and processing of game logs. Alongside the presentation of results, we propose the generation of event-marked maps as an important step in the exploratory analysis of game data. The paper concludes with a set of guidelines for using games as a context-rich experimental environment.

===1 Introduction and Motivation===
The development of a good personalised intelligent assistant that behaves in a natural way requires a proper toolbox as its base [Nalepa et al., 2019]. To be user-friendly, an assistant should not only perform its task but also respond to the user's changing emotions. This is due to our natural tendency to anthropomorphize interfaces: the user will assume that the assistant will react appropriately, e.g., understand that nervousness is due to a mistake just made. Such affective information can be extracted from a range of physiological signals, particularly those obtained through low-cost wearable devices, which will make this technology available to everyone. Finally, it is important to note that emotions do not happen in a void. They always depend on the context a person is in [Prinz, 2006], so it is also important to collect information about the user's current situation (e.g., activity, weather conditions, time of day).

An important step in establishing the above-outlined framework for personalized assistants is the collection of the right data. This, in turn, strictly depends on the development of appropriate research environments and experimental protocols. Such issues are addressed in the BIRAFFE (Bio-Reactions and Faces for Emotion-based Personalization) series of experiments [Kutt et al., 2021]. Their main objective is to develop methods for emotion recognition using a range of contextual information and physiological signals such as cardiac activity (ECG), electrodermal response (EDA), hand movements (accelerometer) or changes in facial expression. To ensure that the research is highly ecological in measurement and easily extendable to wider research groups, wearable, portable and affordable-for-all devices are used.

A key aspect of the BIRAFFE experiments is the use of games as the experimental environment. They were chosen as a trade-off between a stimulus-rich, complex, near-natural environment and the need to control and record as much context as possible to provide the most detailed post-experimental analyses. The latest version of the experiment, BIRAFFE2 [Kutt et al., 2020]<sup>1</sup>, used a game consisting of three independent levels. The aim of the first was to evoke positive emotions. The second was intended to induce irritation and frustration, e.g., through impaired control. Finally, the third level was a neutral maze. A detailed description of the games is presented in [Żuchowska et al., 2020].

This paper provides insights into the core analyses of the BIRAFFE2 dataset on contextual information processing in affective games. The first thread, presented in Sect. 2, focuses on the analysis of the relationship between physiological signals and the so-called "Big Five" personality traits. The existence of such relationships in the data will allow further work to create emotion prediction models that are moderated and personalised through the identification of personality profiles. The second thread, described in Sect. 3, addresses accurate game logging and the possibility of reconstructing an entire game from such stored logs. The article concludes in Sect. 4 with a set of lessons learned regarding the implementation of games as an experimental environment.

<sup>1</sup> The entire dataset from the BIRAFFE2 experiment is available under a CC licence on the Zenodo platform, DOI:10.5281/zenodo.3865859.

===2 Physiological Signals and Personality===
Before undertaking the analyses, three features were calculated for the ECG signal using the HeartPy library [van Gent et al., 2019]: heart rate (number of heart beats per minute), mean of successive differences between R-R intervals (MoSD) and breathing rate. Also, to group the valence and arousal scores into a discrete variable, 16 clusters were introduced, as presented in Fig. 1.

[Figure 1: Valence and arousal scores grouped into clusters. Horizontal axis: Valence (1 to 9); vertical axis: Arousal (1 to 9); the cells of the 4 × 4 grid are numbered 0 to 15, row by row, starting from the lowest-arousal row.]

In order to find correlations and dependencies between physiological data (the ECG signal was chosen as an illustration) and personality traits (each on a [1, 10] scale), several approaches to statistical analysis were made. Firstly, basic descriptive statistics were calculated to find outliers and possible extrema. As can be seen in Tab. 1-2, the data were distributed proportionally in terms of mean, median and standard deviation, which indicates a promising start for further analysis.

{| class="wikitable"
|+ Table 1: Descriptive statistics for personality traits.
! Personality trait !! Mean !! SD !! Median
|-
| Conscientiousness || 5.68 || 2.61 || 6
|-
| Openness || 5.48 || 2.22 || 6
|-
| Agreeableness || 6.25 || 2.39 || 6
|-
| Extraversion || 5.84 || 2.29 || 7
|-
| Neuroticism || 5.38 || 2.45 || 5
|}

{| class="wikitable"
|+ Table 2: Descriptive statistics for ECG characteristics.
! ECG characteristic !! Mean !! SD
|-
| Heart rate [BPM] || 80.92 || 16.21
|-
| MoSD [ms] || 34.41 || 37.89
|-
| Breathing rate [Hz] || 0.10 || 0.12
|}

The second analysis was aimed at investigating correlations between features. Although the results did not show any strong dependencies between them (see Fig. 2), they indicated the existence of potentially interesting relationships worthy of further research. Namely, in terms of the associations between personality and widget responses, valence and arousal are related to distinct traits. For arousal, the highest values are for openness (0.049) and conscientiousness (0.042). Valence, on the other hand, is most strongly associated with agreeableness (0.039) and extraversion (0.023). Considering the correlations between physiological reactions and widget responses, among heart rate, MoSD and breathing rate, the highest values were noted for heart rate, for both valence and arousal. Among the personality traits, heart rate correlated most strongly with conscientiousness (-0.17) and extraversion (-0.11). For MoSD, the highest values (and the highest inter-correlations in general, i.e., correlations between different data sources) were for extraversion (0.23) and conscientiousness (-0.19). Finally, breathing rate correlated most strongly with extraversion (0.14).

{| class="wikitable"
|+ Figure 2: Correlation matrix for five personality traits, three ECG-related characteristics and widget responses represented as Valence, Arousal and Cluster.
! Variable !! Openness !! Conscientiousness !! Extraversion !! Agreeableness !! Neuroticism !! Heart rate !! MoSD !! Breathing rate !! Valence !! Arousal !! Cluster
|-
! Openness
| 1 || 0.14 || 0.26 || 0.083 || 0.038 || 0.0085 || 0.093 || 0.09 || 0.016 || 0.049 || 0.045
|-
! Conscientiousness
| 0.14 || 1 || 0.16 || 0.12 || -0.22 || -0.17 || -0.19 || -0.12 || 0.011 || 0.042 || 0.05
|-
! Extraversion
| 0.26 || 0.16 || 1 || 0.35 || -0.38 || -0.11 || 0.23 || 0.14 || 0.023 || -0.0079 || 0.0006
|-
! Agreeableness
| 0.083 || 0.12 || 0.35 || 1 || -0.011 || -0.046 || -0.067 || 0.059 || 0.039 || -0.033 || -0.025
|-
! Neuroticism
| 0.038 || -0.22 || -0.38 || -0.011 || 1 || 0.07 || -0.023 || 0.092 || -0.0031 || 0.014 || 0.0095
|-
! Heart rate
| 0.0085 || -0.17 || -0.11 || -0.046 || 0.07 || 1 || 0.39 || 0.32 || 0.029 || -0.056 || -0.035
|-
! MoSD
| 0.093 || -0.19 || 0.23 || -0.067 || -0.023 || 0.39 || 1 || 0.74 || 0.014 || -0.01 || -0.0086
|-
! Breathing rate
| 0.09 || -0.12 || 0.14 || 0.059 || 0.092 || 0.32 || 0.74 || 1 || 0.021 || -0.012 || -0.015
|-
! Valence
| 0.016 || 0.011 || 0.023 || 0.039 || -0.0031 || 0.029 || 0.014 || 0.021 || 1 || -0.12 || 0.19
|-
! Arousal
| 0.049 || 0.042 || -0.0079 || -0.033 || 0.014 || -0.056 || -0.01 || -0.012 || -0.12 || 1 || 0.9
|-
! Cluster
| 0.045 || 0.05 || 0.0006 || -0.025 || 0.0095 || -0.035 || -0.0086 || -0.015 || 0.19 || 0.9 || 1
|}

The last statistical analysis consisted of two ANOVAs, one for valence and one for arousal (see Tab. 3-4), which indicated several strong associations. What seems most interesting is the strong relationship between heart rate and valence, which is somewhat in opposition to most approaches, in which heart rate is used to predict arousal while other signals such as EDA are mostly used for valence [Dzedzickis et al., 2020].

{| class="wikitable"
|+ Table 3: ANOVA model for valence with five personality traits and three ECG-related characteristics as independent variables.
! Independent var. !! df !! MS !! F !! p
|-
| Conscientiousness || 1 || 3.85 || 0.60 || 0.44
|-
| Openness || 1 || 5.24 || 0.82 || 0.37
|-
| Agreeableness || 1 || 68.57 || 10.70 || 0.001
|-
| Extraversion || 1 || 6.12 || 0.95 || 0.33
|-
| Neuroticism || 1 || 0.02 || 0.004 || 0.95
|-
| Heart rate || 1 || 59.97 || 9.35 || 0.002
|-
| MoSD || 1 || 0.87 || 0.14 || 0.71
|-
| Breathing rate || 1 || 2.91 || 0.45 || 0.50
|-
| Residual Error || 10881 || 6.41 || ||
|}

{| class="wikitable"
|+ Table 4: ANOVA model for arousal with five personality traits and three ECG-related characteristics as independent variables.
! Independent var. !! df !! MS !! F !! p
|-
| Conscientiousness || 1 || 60.43 || 14.50 || &lt; 0.001
|-
| Openness || 1 || 101.03 || 24.25 || &lt; 0.001
|-
| Agreeableness || 1 || 44.51 || 10.68 || 0.001
|-
| Extraversion || 1 || 9.61 || 2.31 || 0.13
|-
| Neuroticism || 1 || 13.14 || 3.15 || 0.08
|-
| Heart rate || 1 || 138.06 || 33.13 || &lt; 0.001
|-
| MoSD || 1 || 11.27 || 2.71 || 0.10
|-
| Breathing rate || 1 || 1.64 || 0.39 || 0.53
|-
| Residual Error || 10881 || 4.17 || ||
|}

===3 Game Logs and Questionnaires===
As noted in the introduction, one motivation for using games as an experimental environment is the ability to frequently sample and log the entire player context. Properly prepared logs should allow the reconstruction of both the level map (the same for each subject) and the course of the entire game for each player. Indeed, this is possible for the games studied. As part of the log analyses, a number of maps were generated and verified by comparison with the games and with recorded screencasts of the gameplay. These maps can also be used for aggregated analyses, e.g., by plotting all events of one type and then inspecting the result visually. Fig. 3 shows all the death locations of the protagonist in the first level. One can notice a very high number of deaths in the central room. This is consistent with the observations made during the experiment: it is the first room, where players are just getting familiar with the game interface.

[Figure 3: Map for Stage 1 recreated from game logs with all death-related events marked as dots.]

Another part of the analysis was the examination of answers to the Game Experience Questionnaire (GEQ) [IJsselsteijn et al., 2013], a survey taken by each participant at the end of the experiment. The results make it possible to assess whether, according to the subjects themselves, the games had an impact on their emotional state. The results are represented by a 7-factor structure. Five of the factors were analysed further, as they were the most relevant to the assumed game differences:
* Challenge: I felt time pressure / I had to put a lot of effort into it,
* Tension: I was irritated / I felt angry,
* Negative affect: I felt bad / it made me bored,
* Positive affect: I felt good / it made me happy,
* Competence: I felt competent / skillful.

The factors were compared to each other in order to dig into the feelings of the players. The expectation for the first game was that the subject would feel happy (high positive affect, low negative affect, low tension) and not challenged (high competence, low challenge). The purpose of the second stage was the opposite: high negative affect, tension and challenge, with low competence and positive affect. Such a large difference is more likely to have an impact, as the contrast hits the player suddenly. Based on the GEQ results (see Fig. 4), one can state that everything worked as planned.

[Figure 4: GEQ factors for the first and second level (green and yellow, respectively). Horizontal lines mark the average values. Panels: Challenge, Tension, Negative affect, Positive affect, Competence. Horizontal axis: Subject (0 to 100); vertical axis: Factor level (0 to 4).]

Competence during the first gameplay was fairly high while tension stayed low, making the subject feel calm enough to let their guard down but still be entertained by the gameplay. The extreme difficulty and pressure-building environment of the second stage made the experience hard to enjoy. A very similar result can be seen in the Negative affect/Tension comparison. About 95% of the participants agreed that the second level left them irritated, and 83.5% were not happy during and after the game. This cannot be said about the first stage, where, according to the answers, only 30% of subjects felt somewhat irritated. The same holds for positive affect in both stages: the first kept the participants at a very high level of happiness, while the second reduced it to a low level.

===4 Discussion and Lessons Learned===
As a summary of the analyses presented, we propose a set of guidelines concerning the issues one should pay attention to when creating games with the intention of using them as context-rich experimental environments:
# It is important to take into account the features of the subjects in the contextual information set. In line with the results obtained from the BIRAFFE1 [Kutt et al., 2021] and DEAP [Zhao et al., 2019] datasets, the analyses summarised in Sect. 2 indicate interesting relationships between personality traits and physiological signals. Merging several such pieces of subject-related contextual information will allow a more accurate analysis, leading to better modelling of a person's behaviour in the considered environment.
# The set of stimuli should be well balanced so that there are neither too many (which will make analysis difficult) nor too few (the environment will not be interesting for the subject). Small levels, each focusing on selected aspects, should be preferred to one large level that combines all experimental manipulations. The levels analysed achieved their objectives well, as shown by the results of the GEQ questionnaire in Sect. 3.
# Logs should be collected as densely as possible, according to the specifics of the game being developed. All features necessary to reproduce the gameplay should be recorded. In the analyses carried out, it was found that the logs were sufficiently detailed to reproduce the progress of the game. However, the data lacked information on the type of death in the second level, which would be useful to compare with the emotions felt at the time of death. This information can still be recovered, e.g., from the recorded screencasts, but doing so will require a fair amount of data processing.
# Maps with events marked on them are a useful tool for exploratory analysis of game logs. There are a number of studies concerning the analysis of game logs (e.g., [Cheong et al., 2008]), including those related to the evaluation of social science theories [Shim et al., 2011]. However, to the best of our knowledge, data visualisation in the form of maps (as in Fig. 3) has not been done as part of such analyses. We believe that this is a valuable approach to quickly assess the validity of the data and to propose hypotheses that have not been considered before.

These findings will be incorporated into the preparation of the next experiment in the BIRAFFE series, planned for Autumn 2021.

===Acknowledgements===
The research has been supported by a grant from the Priority Research Area Digiworld under the Strategic Programme Excellence Initiative at the Jagiellonian University.

The authors are also grateful to Academic Computer Centre CYFRONET AGH and Jagiellonian University for granting access to the computing infrastructure built in the projects No. POIG.02.03.00-00-028/08 "PLATON – Science Services Platform" and No. POIG.02.03.00-00-110/13 "Deploying high-availability, critical services in Metropolitan Area Networks (MAN-HA)".

===References===
* [Cheong et al., 2008] Yun-Gyung Cheong, Arnav Jhala, Byung-Chull Bae, and Robert Michael Young. Automatically generating summary visualizations from game logs. In Christian Darken and Michael Mateas, editors, AIIDE 2008. The AAAI Press, 2008.
* [Dzedzickis et al., 2020] Andrius Dzedzickis, Arturas Kaklauskas, and Vytautas Bucinskas. Human emotion recognition: Review of sensors and methods. Sensors, 20(3):592, 2020.
* [IJsselsteijn et al., 2013] Wijnand A. IJsselsteijn, Yvonne A. W. de Kort, and Karolien Poels. The Game Experience Questionnaire. Technische Universiteit Eindhoven, 2013.
* [Kutt et al., 2020] Krzysztof Kutt, Dominika Drążyk, Maciej Szelążek, Szymon Bobek, and Grzegorz J. Nalepa. The BIRAFFE2 experiment. Study in bio-reactions and faces for emotion-based personalization for AI systems. CoRR, abs/2007.15048, 2020.
* [Kutt et al., 2021] Krzysztof Kutt, Dominika Drążyk, Szymon Bobek, and Grzegorz J. Nalepa. Personality-based affective adaptation methods for intelligent systems. Sensors, 21(1):163, 2021.
* [Nalepa et al., 2019] Grzegorz J. Nalepa, Krzysztof Kutt, and Szymon Bobek. Mobile platform for affective context-aware systems. Future Generation Computer Systems, 92:490–503, March 2019.
* [Prinz, 2006] Jesse J. Prinz. Gut Reactions. A Perceptual Theory of Emotion. Oxford University Press, Oxford, 2006.
* [Shim et al., 2011] Kyong Jin Shim, Nishith Pathak, Muhammad Aurangzeb Ahmad, Colin DeLong, Zoheb Borbora, Amogh Mahapatra, and Jaideep Srivastava. Analyzing human behavior from multiplayer online game logs: A knowledge discovery approach. IEEE Intell. Syst., 26(1):85–89, 2011.
* [van Gent et al., 2019] Paul van Gent, Haneen Farah, Nicole van Nes, and Bart van Arem. Analysing noisy driver physiology real-time using off-the-shelf sensors: Heart rate analysis software from the taking the fast lane project. Journal of Open Research Software, 7(1):32, 2019.
* [Zhao et al., 2019] Sicheng Zhao, Amir Gholaminejad, Guiguang Ding, Yue Gao, Jungong Han, and Kurt Keutzer. Personalized emotion recognition by personality-aware high-order learning of physiological signals. ACM Trans. Multim. Comput. Commun. Appl., 15(1s):14:1–14:18, 2019.
* [Żuchowska et al., 2020] Laura Żuchowska, Krzysztof Kutt, Krzysztof Geleta, Szymon Bobek, and Grzegorz J. Nalepa. Affective games provide controlable context. Proposal of an experimental framework. In Jörg Cassens, Rebekah Wegener, and Anders Kofod-Petersen, editors, Proceedings of the Eleventh International Workshop Modelling and Reasoning in Context co-located with the 24th European Conference on Artificial Intelligence, MRC@ECAI 2020, Santiago de Compostela, Galicia, Spain, August 29, 2020, volume 2787 of CEUR Workshop Proceedings, pages 45–50. CEUR-WS.org, 2020.
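
Sect. 2 groups the valence and arousal scores into 16 clusters (Fig. 1). The grouping can be illustrated with a minimal sketch, assuming that both scores lie on a [1, 9] scale, that each axis is cut into four equal bins of width 2, and that clusters are numbered 0 to 15 row by row starting from the lowest-arousal row, as the axis labels of Fig. 1 suggest. The function name `va_cluster` and the treatment of scores that fall exactly on a bin edge are illustrative assumptions, not details taken from the paper.

```python
def va_cluster(valence: float, arousal: float) -> int:
    """Map a (valence, arousal) pair on an assumed [1, 9] scale to a cluster id 0..15."""

    def axis_bin(score: float) -> int:
        if not 1.0 <= score <= 9.0:
            raise ValueError(f"score {score} is outside the assumed [1, 9] range")
        # Four bins of width 2; the upper edge (9) is folded into the last bin.
        # Scores on an inner edge (3, 5, 7) go to the higher bin: an assumption.
        return min(int((score - 1.0) // 2.0), 3)

    # Assumed numbering: the arousal bin selects the row, the valence bin the column.
    return 4 * axis_bin(arousal) + axis_bin(valence)


if __name__ == "__main__":
    for v, a in [(1.0, 1.0), (5.0, 5.0), (9.0, 9.0), (8.2, 2.5)]:
        print(f"valence={v}, arousal={a} -> cluster {va_cluster(v, a)}")
```

Because the arousal bin is multiplied by 4, the cluster id varies much more with arousal than with valence, which is in line with the Cluster row of Fig. 2 (0.9 with arousal, 0.19 with valence).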
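
Two of the ECG features named in Sect. 2, heart rate and MoSD, can be illustrated directly on a list of R-R intervals. The paper computes its features with HeartPy; the sketch below does not call HeartPy and is not its implementation. It assumes that R-R intervals are given in milliseconds and that MoSD is the mean of the absolute successive differences, since the paper does not spell out the formula. The function names are hypothetical.

```python
from statistics import mean


def heart_rate_bpm(rr_ms):
    """Average heart rate in beats per minute from R-R intervals in milliseconds."""
    if not rr_ms:
        raise ValueError("at least one R-R interval is needed")
    return 60000.0 / mean(rr_ms)


def mosd_ms(rr_ms):
    """Mean of absolute successive R-R differences in ms (assumed definition of MoSD)."""
    if len(rr_ms) < 2:
        raise ValueError("at least two R-R intervals are needed")
    return mean(abs(b - a) for a, b in zip(rr_ms, rr_ms[1:]))


if __name__ == "__main__":
    # Made-up R-R intervals, for illustration only.
    rr = [800, 810, 790, 800]
    print("heart rate [BPM]:", heart_rate_bpm(rr))
    print("MoSD [ms]:", mosd_ms(rr))
```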
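
The correlation analysis in Sect. 2 (Fig. 2) compares every pair of variables. A minimal sketch of such a matrix is given below, assuming Pearson's correlation coefficient; the paper does not name the coefficient, so this choice is an assumption. The column names and values in the demo are made up and are not BIRAFFE2 data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long samples with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    norm = sqrt(sum(d * d for d in dx) * sum(d * d for d in dy))
    if norm == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sum(a * b for a, b in zip(dx, dy)) / norm


def correlation_matrix(columns):
    """Pairwise correlations for a dict that maps a column name to a list of values."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names} for a in names}


if __name__ == "__main__":
    # Made-up values, for illustration only.
    data = {
        "extraversion": [3, 5, 7, 9],
        "mosd": [20.0, 28.0, 41.0, 55.0],
        "heart_rate": [90, 85, 80, 70],
    }
    for name, row in correlation_matrix(data).items():
        print(name, {other: round(r, 2) for other, r in row.items()})
```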
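
The event-marked maps proposed in Sect. 3 and in guideline 4 of Sect. 4 can be approximated by counting events of one type per map cell. The sketch below assumes a hypothetical CSV log with the columns `timestamp`, `event`, `x` and `y`; the actual BIRAFFE2 log schema may differ, and the sample rows are made up. It prints a coarse text map rather than the dot plot of Fig. 3.

```python
import csv
import io
from collections import Counter

# Hypothetical log excerpt; column names and values are assumptions.
SAMPLE_LOG = """timestamp,event,x,y
0.5,move,1.0,1.0
1.2,death,5.2,5.9
3.4,death,5.7,5.1
7.0,death,0.3,9.8
"""


def event_cells(log_text, event_name="death", cell_size=1.0):
    """Count events of one type per square map cell with side `cell_size`."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(log_text)):
        if row["event"] != event_name:
            continue
        col = int(float(row["x"]) // cell_size)
        line = int(float(row["y"]) // cell_size)
        counts[(col, line)] += 1
    return counts


def render(counts, width, height):
    """Draw the map as text: '.' for empty cells, the count (capped at 9) otherwise."""
    rows = []
    for y in reversed(range(height)):  # the row with the highest y is printed first
        rows.append("".join(
            str(min(counts[(x, y)], 9)) if (x, y) in counts else "."
            for x in range(width)
        ))
    return "\n".join(rows)


if __name__ == "__main__":
    print(render(event_cells(SAMPLE_LOG), width=10, height=10))
```

Cells with high counts point to hotspots such as the central room discussed in Sect. 3, and the same function can be reused for other event types by changing `event_name`.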