From Simulation to Reality and Back Again: A Hybrid Approach to Estimate the Compliance of ESM Study Participants to Different ESM Protocols

Alireza Khanshan1,2,∗
1 Department of Industrial Design, Eindhoven University of Technology, 5612 AZ, De Zaale, Eindhoven, North Brabant, Netherlands
2 Eindhoven Artificial Intelligence Systems Institute, 5612 AZ, De Zaale, Eindhoven, North Brabant, Netherlands

Abstract
Sustaining sufficient compliance in long-running Experience Sampling Method (ESM) studies has remained a challenge. Participants of such studies usually drop out after a few weeks due to response fatigue, technical difficulties, the intrusiveness of the prompts, and changes in their motivation. One common approach to ensure higher compliance is to tailor the timing of the prompts. Different tailoring approaches that take into account the personal context of participants have been proposed, such as considering calendar events, ESM device usage patterns, or information derived from physiological sensors. Recently, the application of reinforcement learning (RL) in this domain has shown promise in learning the right timing of the prompts. However, RL agents require repeated inquiries at the beginning of their learning process, which participants may find intrusive and which may result in early dropouts. To overcome this problem, agents can be pre-trained on prior knowledge and thereby avoid a “cold start”. Although real-life data about compliance in ESM studies is insufficient, agents could instead be trained with generated data that imitates real-life events. Accordingly, psychological theories should be involved in the simulation process that generates ESM-related data. We present our hybrid approach, which utilizes both historical ESM data and synthesized data backed by psychological theories to provide sufficient data to train ML models and RL agents to predict the opportune moments of ESM prompts.
Keywords
User Simulation, Experience Sampling Method, Computational Modeling, Reinforcement Learning, Cognitive Modeling, Tailored Interaction, Adaptive Notification

ACM SIGCHI Symposium on Engineering Interactive Computing Systems, June 21–24, 2022, Sophia Antipolis, France
∗ Corresponding author.
Email: a.khanshan@tue.nl (A. Khanshan)
Web: https://khanshan.com/ (A. Khanshan)
ORCID: 0000-0002-9112-4695 (A. Khanshan)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

1. Introduction
The Experience Sampling Method (ESM) is a research procedure that enables studying what people do, feel, and think during their daily lives through systematic self-reports [1]. In the context of ESM, addressing the low compliance of participants in long-running studies is challenging [1]. An effective solution should be able to sustain sufficient response rates and reduce dropouts. Accordingly, researchers have argued that appropriate sampling regimes, tailored content, and adopting the right modalities could address this challenge [2]. Notably, it has been hypothesized that the timing and frequency of sampling, if chosen appropriately, can help retain participation longer. More specifically, the impact of various sampling regimes, such as random, interval-based, and context-dependent sampling, on the compliance of ESM study participants has been explored empirically [3]. Unfortunately, most such studies so far have produced insights that are specific to an application domain or to special cohorts, or that represent a limited sample of the target population. Moreover, such studies are expensive and slow to conclude, and they still cannot cover all possible cohorts and contexts.
Furthermore, running studies with poorly configured sampling regimes could one day be considered unethical, just as it is considered unethical to run studies with poorly selected sample sizes [4].

Besides hypothesizing and testing, studies have been carried out that learn the optimal ESM protocol over time via machine learning. Specifically, reinforcement learning approaches have been applied to find the right timing, or opportune moments, for sending prompts to study participants [5, 6]. Prior work has shown that adapting the timing of the prompts through RL increases the compliance of participants [5]. However, it still takes time for an RL agent to learn a participant’s favorable protocol (the cold start problem). Thus, participants are likely to drop out early because the initial decisions of the agent are most probably burdensome (repetitive and at inopportune moments due to lack of knowledge). Conversely, if a pre-trained agent is applied (warm start), participants are less likely to be disturbed because the agent can make more sensible decisions with fewer inquiries.

To provide the training data that enables a warm start of RL agents, or to train ML models in general, it seems promising, besides the obvious use of historical data, to leverage simulation to generate data that estimates the reactions of participants to prompts. Such a simulation should be model-driven and based on psychological theories, as open historical data in this domain is quite scarce. However, incorporating the right psychological theories in such a process requires further research. Additionally, researchers in charge of ESM studies should be able to apply their domain knowledge of the target participants to fine-tune the execution of these simulations while considering ethical norms. Finally, ESM studies executed with the suggested warm start parameters should return their empirical feedback to the machine learning infrastructure.
To the best of our knowledge, such a hybrid framework that combines simulation data with real-life feedback has never been studied in settings where human decision-making behavior as well as ethical norms are taken into account. Accordingly, we have formulated our main research question as “How to design a system that estimates compliance of ESM study participants to different ESM protocols?”. To address it, we aim to deliver the first open self-learning ESM framework that avoids the cold start problem and considers ethical norms in the domain of health.

2. Related work
In the context of Human-Technology Interaction, approaches to overcome the cold start problem of reinforcement learning agents have been proposed that utilize cognitive models to generate human decision-making data, which then facilitates the warm start of the reinforcement learning agents [7]. More practically, the effectiveness of such an approach to optimize adaptive notifications has been shown in the context of mobile health intervention systems as well [5]. In such works, the decision-making process of participants has been modeled by incorporating different psychological theories, mainly related to memory accessibility [8, 9] and option generation [10]. Meanwhile, providing a more flexible and configurable model could contribute to generating data for a variety of domains and cohorts. Using models to simulate certain behaviors to avoid greedy reinforced decisions has also been explored, for example in adapting user interfaces in the context of Human-Computer Interaction [11]. More generally, in cognitive psychology, the human decision-making process has also been modeled as evidence accumulation and decision threshold crossing [12]. Then again, in social research, the mathematical formalization of psychological theories and computational modeling are key to developing models and tools that represent different human behaviors [13].
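To make the evidence-accumulation account of decision-making [12] more concrete, the following minimal Python sketch simulates a noisy random walk that stops once it crosses one of two thresholds. The function name, the symmetric bounds, the labels "respond" and "ignore", and all parameter values are assumptions made for illustration; they are not taken from the cited work.

```python
import random


def accumulate_to_threshold(drift, threshold, noise_sd=1.0, max_steps=1000, rng=None):
    """Random-walk evidence accumulation between two symmetric bounds.

    Returns (choice, steps). choice is "respond" if the upper bound is
    crossed, "ignore" if the lower bound is crossed, and None if neither
    bound is reached within max_steps.
    """
    rng = rng or random.Random()
    evidence = 0.0
    for step in range(1, max_steps + 1):
        # Each step adds a constant drift plus Gaussian noise.
        evidence += drift + rng.gauss(0.0, noise_sd)
        if evidence >= threshold:
            return "respond", step
        if evidence <= -threshold:
            return "ignore", step
    return None, max_steps


def respond_rate(drift, threshold, trials=500, seed=0):
    """Fraction of simulated decisions that end at the 'respond' bound."""
    rng = random.Random(seed)
    outcomes = [accumulate_to_threshold(drift, threshold, rng=rng)[0] for _ in range(trials)]
    return outcomes.count("respond") / trials
```

In such a sketch, a larger drift stands for stronger evidence in favor of responding, and a higher threshold stands for a more cautious decision that takes more steps on average.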
In the context of ESM and finding opportune moments to prompt, besides modeling the human decision-making process, estimating the cost of interruptions by modeling the user’s attention has been explored [14, 15, 16, 17]. Since such works are designed primarily for maximizing the economic utility of prompts, we need to extend them for use in health-related studies, where values other than economic utility are at stake.

3. Research Question
We aim to enable the incorporation of psychological theories that can guide the simulation process and estimate human decision-making during health-related ESM studies. Beyond theories that focus on economic principles, we incorporate medical importance and societal values, as they are typically overlooked in applied computing. In addition, we intend to apply RL to predict the opportune moments of interruption. Hence, by developing a hybrid simulation approach that includes both historical and generated data, the cold start problem of RL agents can be alleviated. Lastly, besides the mentioned framework, an open, available, and secure research tool to carry out ESM studies is being developed by the author as well. Overall, we can break down our research objective into the following sub-research questions (sub-RQs):
• How can ESM protocol variables and collected contextual and behavioral data of participants during ESM studies turn into sufficiently rich data for machine learning?
• Which psychological theories are relevant to simulate the human decision-making model in a health-related ESM context?
• What are the design considerations of a software tool that researchers from different domains can use to run their simulations, share their knowledge, and benefit from smarter ESM protocols?
• How can self-learning ESM protocols increase the compliance of participants?

4. Proposed approach and methodology
We describe our proposed approach and applied methodologies with respect to each of the sub-RQs above.

4.1. How can ESM protocol variables and collected contextual and behavioral data of participants during ESM studies turn into sufficiently rich data for machine learning?
To facilitate controlled data collection, we have created our ESM tool, Experiencer [18], to set up protocols, collect self-reports, and gather fine-grained contextual data from commodity-level smartwatch sensors. We also anticipate integrating interoperable ESM protocol definition languages such as executable HTML [19] for easier conversion of protocol definitions into simulation instructions and ML features. Accordingly, we have been running multiple ESM experiments to monitor compliance, analyze ESM protocol parameters, and gather sensor data as well as participants’ behavior data to kick-start the ML training process. Even though such data provides an accountable perspective on ESM compliance, it is still insufficient to build transferable ML models. Thus, complementary data should be generated from a computational model of the human decision-making process and contextual factors.

4.2. Which psychological theories are relevant to simulate the human decision-making model in the ESM context?
Following the common approaches in the literature, we model the decision-making process via dynamic Bayesian networks. Such a graphical representation helps associate our variables with their conditional dependencies and captures their evolution over time. We build on the proposed models of attention-sensitive alerting, the impact of memory aids, option generation, and the course of retention to model the decision-making process. The core of our model is an Action (a self-report) following an external memory aid (an ESM prompt). In addition, our model accounts for personal context (e.g., physiological signals and calendar events) as well as the participant’s instrumental attitude [8] and perceived burden [20]. Accordingly, models such as COM-B [21] and Fogg [22] can help abstract the Bayesian network.
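As an illustration of how such a network could drive a simulator, the sketch below performs forward sampling from a very small two-slice dynamic Bayesian network in Python. The variables (burden, busy, positive_attitude), the network structure, and every probability are hypothetical placeholders chosen for readability; they are not the model or the parameters of this work.

```python
import random
from dataclasses import dataclass

# All probabilities below are illustrative placeholders, not fitted values.

# P(burden_t = "high" | burden_{t-1}, prompted_t)
P_HIGH_BURDEN = {
    ("low", False): 0.05, ("low", True): 0.20,
    ("high", False): 0.60, ("high", True): 0.85,
}

# P(self-report | burden_t, busy_t, positive_attitude)
P_RESPOND = {
    ("low", False, True): 0.90, ("low", False, False): 0.60,
    ("low", True, True): 0.50, ("low", True, False): 0.25,
    ("high", False, True): 0.55, ("high", False, False): 0.30,
    ("high", True, True): 0.20, ("high", True, False): 0.05,
}


@dataclass
class Participant:
    positive_attitude: bool  # static node: instrumental attitude
    burden: str = "low"      # hidden node that evolves over time


def step(participant, prompted, busy, rng):
    """Advance one time slice; return True if a self-report is produced."""
    p_high = P_HIGH_BURDEN[(participant.burden, prompted)]
    participant.burden = "high" if rng.random() < p_high else "low"
    if not prompted:
        return False  # no prompt (memory aid), hence no self-report
    key = (participant.burden, busy, participant.positive_attitude)
    return rng.random() < P_RESPOND[key]


def simulate(schedule, positive_attitude=True, seed=0):
    """schedule: list of (prompted, busy) pairs, one per time slice."""
    rng = random.Random(seed)
    participant = Participant(positive_attitude)
    return [step(participant, prompted, busy, rng) for prompted, busy in schedule]


def response_rate(responses):
    """Share of time slices that produced a self-report."""
    return sum(responses) / len(responses)
```

For example, `response_rate(simulate([(True, False)] * 100))` estimates compliance for a hypothetical protocol that prompts in every slice while the participant is never busy. A researcher could replace the placeholder tables with values derived from theory or from historical data.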
Furthermore, we aim to test the effectiveness and the correctness of the aforementioned human simulator by developing an evaluation process. This can be achieved by data-driven approaches in which the simulated data is compared against real-life historical data. To cover corner cases, black-box or white-box testing could also be incorporated, but that requires further research.

4.3. What are the design considerations of a software tool that researchers from different domains can use to run their simulations, share their knowledge, and benefit from smarter ESM protocols?
To generate data and estimate compliance with a specific protocol, we intend to introduce a framework in which the researcher can either hand-engineer the parameters of the model and generate data while considering the characteristics of the target participants, or reuse the parameters of past studies. In our framework, we define compliance by response rate and set the goal of finding opportune moments to prompt. Accordingly, after learning through the simulation, the trained model (e.g., a warmed-up RL agent) is ready to be used in the wild. Data collected during such ESM studies are transferred back to our data repository, where they are leveraged to update the models incrementally. Meanwhile, the system allows the creation of multiple models for different contexts and different data sets. Subsequently, such models can be aggregated (e.g., by bagging), potentially increasing the quality of the predictions. In addition, we allow fine-tuning the ratio of synthesized and historical data to form a hybrid training set that yields high-performance models.

4.4. How can self-learning ESM protocols increase the compliance of participants?
Indeed, the focus of our approach is to predict the right time of prompting such that compliance is not hampered.
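The interplay of the hybrid training set (Section 4.3) and a warm-started agent that picks prompting times can be sketched as follows. This is a simplified illustration under stated assumptions: an epsilon-greedy bandit over discrete time slots stands in for the RL agent, records are hypothetical (slot, answered) pairs, and the names build_hybrid_set and SlotBandit are invented for this example.

```python
import random


def build_hybrid_set(historical, synthetic, synthetic_ratio, size, rng):
    """Sample `size` records with replacement from two non-empty lists.

    Roughly `synthetic_ratio` of the records come from the simulator
    output, the rest from historical study data.
    """
    n_synthetic = round(size * synthetic_ratio)
    picked = [rng.choice(synthetic) for _ in range(n_synthetic)]
    picked += [rng.choice(historical) for _ in range(size - n_synthetic)]
    rng.shuffle(picked)
    return picked


class SlotBandit:
    """Epsilon-greedy bandit over time slots; reward 1 means the prompt was answered."""

    def __init__(self, n_slots, epsilon=0.1):
        self.counts = [0] * n_slots
        self.values = [0.0] * n_slots  # running mean response rate per slot
        self.epsilon = epsilon

    def update(self, slot, answered):
        self.counts[slot] += 1
        self.values[slot] += (answered - self.values[slot]) / self.counts[slot]

    def warm_start(self, records):
        """Pre-train on (slot, answered) pairs before meeting a real participant."""
        for slot, answered in records:
            self.update(slot, answered)

    def choose(self, rng):
        """Explore with probability epsilon, otherwise pick the best-known slot."""
        if rng.random() < self.epsilon:
            return rng.randrange(len(self.values))
        return max(range(len(self.values)), key=self.values.__getitem__)


# Hypothetical data: slot index and whether the prompt was answered (1) or not (0).
rng = random.Random(42)
historical = [(0, 0), (1, 1), (2, 0)]
synthetic = [(0, 0), (1, 1), (1, 1), (2, 1), (2, 0)]

agent = SlotBandit(n_slots=3, epsilon=0.0)
agent.warm_start(build_hybrid_set(historical, synthetic, 0.5, 200, rng))
print("first slot proposed after warm start:", agent.choose(rng))
```

During a live study, the same `update` call would be fed with real responses, so the estimates obtained from the warm start are gradually corrected by empirical feedback. Varying `synthetic_ratio` corresponds to tuning the mix of synthesized and historical data.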
Rather than classic prompting regimes (random and interval-based [23]), ML models, particularly RL agents, can be used to predict opportune moments by learning from participants’ behavior over time. Such a process is possible not only during the run time of a study but also at design time, in that transferable models learned from other studies or from model-driven simulation can be integrated to improve the prediction quality. In addition, by discovering participants’ response patterns, the questions and/or answers themselves can be predicted. Thus, instead of a generic questionnaire, a variant based on ML predictions can be presented, which potentially contributes to higher compliance as well since it is personalized. Ecological validity needs to be maintained as well: randomness in the prompting schedule and a minimum number of inquiries per day need to be satisfied. Clearly, a high response rate may be achieved with fewer prompts, but a researcher oftentimes requires a minimum number of responses within specific time frames.

5. Current status of doctoral work
The research started in early 2020 by exploring how systems used in longitudinal monitoring studies could adapt and evolve to enhance user experience, especially in the context of ESM in the public health domain. Specifically, we have been working on Experiencer [18], an ESM tool that exploits commodity-level smartwatches to maximize compliance by finding opportune moments for prompts based on the context derived from physiological sensors (sub-RQs 1 and 3). Accordingly, we have published our latest findings, titled “Assessing the Influence of Physical Activity Upon the Experience Sampling Response Rate on Wrist-Worn Devices” [24], in the International Journal of Environmental Research and Public Health. Meanwhile, the article that introduces the Experiencer software is to be submitted soon.
We have created a context-sensitive tool that allows for recording sensor data, configuring sensors, remote device management, event-based data collection, various sampling regimes, and a dynamic user interface. It also optimizes device data storage and data transactions over the network while being open, available, and compliant with standard privacy measures (sub-RQ 3). With respect to the topics discussed in this paper, we have built our initial dynamic Bayesian network, which incorporates the psychological theories to imitate the human decision-making process and also considers personal context (sub-RQ 2). Preliminary calculations are being carried out to mathematically formalize the model. These will be followed by prototyping the human simulator, building a framework that could be used by researchers, and ultimately testing it in the wild (sub-RQ 4).

References
[1] R. Larson, M. Csikszentmihalyi, The Experience Sampling Method, in: M. Csikszentmihalyi (Ed.), Flow and the Foundations of Positive Psychology: The Collected Works of Mihaly Csikszentmihalyi, Springer Netherlands, Dordrecht, 2014, pp. 21–34. URL: https://doi.org/10.1007/978-94-017-9088-8_2. doi:10.1007/978-94-017-9088-8_2.
[2] H. A. A. Spelt, J. H. D. M. Westerink, L. Frank, J. Ham, W. A. IJsselsteijn, Physiology-based personalization of persuasive technology: a user modeling perspective, User Modeling and User-Adapted Interaction (2022). URL: https://doi.org/10.1007/s11257-021-09313-8. doi:10.1007/s11257-021-09313-8.
[3] N. van Berkel, J. Goncalves, L. Lovén, D. Ferreira, S. Hosio, V. Kostakos, Effect of experience sampling schedules on response rate and recall accuracy of objective self-reports, International Journal of Human-Computer Studies 125 (2019) 118–128. URL: https://www.sciencedirect.com/science/article/pii/S1071581918306797. doi:10.1016/j.ijhcs.2018.12.002.
[4] P. Bacchetti, L. E. Wolf, M. R. Segal, C. E. McCulloch, Ethics and Sample Size, American Journal of Epidemiology 161 (2005) 105–110. URL: https://doi.org/10.1093/aje/kwi014. doi:10.1093/aje/kwi014. PDF: https://academic.oup.com/aje/article-pdf/161/2/105/712714/kwi014.pdf.
[5] S. Wang, C. Zhang, B. Kröse, H. van Hoof, Optimizing Adaptive Notifications in Mobile Health Interventions Systems: Reinforcement Learning from a Data-driven Behavioral Simulator, Journal of Medical Systems 45 (2021) 102. URL: https://doi.org/10.1007/s10916-021-01773-0. doi:10.1007/s10916-021-01773-0.
[6] S. Wang, K. Sporrel, H. van Hoof, M. Simons, R. D. D. de Boer, D. Ettema, N. Nibbeling, M. Deutekom, B. Kröse, Reinforcement Learning to Send Reminders at Right Moments in Smartphone Exercise Application: A Feasibility Study, International Journal of Environmental Research and Public Health 18 (2021) 6059. URL: https://www.mdpi.com/1660-4601/18/11/6059. doi:10.3390/ijerph18116059.
[7] C. Zhang, S. Wang, H. Aarts, M. Dastani, Using Cognitive Models to Train Warm Start Reinforcement Learning Agents for Human-Computer Interactions, arXiv:2103.06160 [cs] (2021). URL: http://arxiv.org/abs/2103.06160.
[8] R. Tobias, Changing behavior by memory aids: A social psychological model of prospective memory and habit development tested with dynamic field data, Psychological Review 116 (2009) 408–438. doi:10.1037/a0015512.
[9] D. C. Rubin, S. Hinton, A. Wenzel, The precise time course of retention, Journal of Experimental Psychology: Learning, Memory, and Cognition 25 (1999) 1161–1176. doi:10.1037/0278-7393.25.5.1161.
[10] B. Kamphorst, A. Kalis, Why option generation matters for the design of autonomous e-coaching systems, AI & Society 30 (2015) 77–88. URL: https://doi.org/10.1007/s00146-013-0532-5. doi:10.1007/s00146-013-0532-5.
[11] K. Todi, G. Bailly, L. Leiva, A. Oulasvirta, Adapting User Interfaces with Model-based Reinforcement Learning, in: Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, CHI ’21, Association for Computing Machinery, New York, NY, USA, 2021, pp. 1–13. URL: https://doi.org/10.1145/3411764.3445497. doi:10.1145/3411764.3445497.
[12] J. R. Busemeyer, S. Gluth, J. Rieskamp, B. M. Turner, Cognitive and Neural Bases of Multi-Attribute, Multi-Alternative, Value-based Decisions, Trends in Cognitive Sciences 23 (2019) 251–263. URL: https://www.sciencedirect.com/science/article/pii/S1364661318302845. doi:10.1016/j.tics.2018.12.003.
[13] M. Guimaraes, L. Emmendorfer, D. Adamatti, Persuasive agent based simulation for evaluation of the dynamic threshold line and trigger classification from the Fogg Behavior Model, Simulation Modelling Practice and Theory 83 (2018) 18–35. URL: https://www.sciencedirect.com/science/article/pii/S1569190X18300017. doi:10.1016/j.simpat.2018.01.001.
[14] E. J. Horvitz, A. Jacobs, D. Hovel, Attention-Sensitive Alerting, arXiv:1301.6707 [cs] (2013). URL: http://arxiv.org/abs/1301.6707.
[15] E. Horvitz, P. Koch, J. Apacible, BusyBody: creating and fielding personalized models of the cost of interruption, in: Proceedings of the 2004 ACM conference on Computer supported cooperative work, CSCW ’04, Association for Computing Machinery, New York, NY, USA, 2004, pp. 507–510. URL: https://doi.org/10.1145/1031607.1031690. doi:10.1145/1031607.1031690.
[16] E. Horvitz, C. Kadie, T. Paek, D. Hovel, Models of attention in computing and communication: from principles to applications, Communications of the ACM 46 (2003) 52–59. URL: https://doi.org/10.1145/636772.636798. doi:10.1145/636772.636798.
[17] E. Horvitz, J. Apacible, Learning and reasoning about interruption, in: Proceedings of the 5th international conference on Multimodal interfaces, ICMI ’03, Association for Computing Machinery, New York, NY, USA, 2003, pp. 20–27. URL: https://doi.org/10.1145/958432.958440. doi:10.1145/958432.958440.
[18] A. Khanshan, Experiencer, 2022. URL: https://experiencer.eu/.
[19] N. Batalas, V.-J. Khan, M. Franzen, P. Markopoulos, M. aan het Rot, Formal representation of ambulatory assessment protocols in HTML5 for human readability and computer execution, Behavior Research Methods 51 (2019) 2761–2776. URL: https://doi.org/10.3758/s13428-018-1148-y. doi:10.3758/s13428-018-1148-y.
[20] M. Csikszentmihalyi, R. Larson, Validity and Reliability of the Experience-Sampling Method, in: M. Csikszentmihalyi (Ed.), Flow and the Foundations of Positive Psychology: The Collected Works of Mihaly Csikszentmihalyi, Springer Netherlands, Dordrecht, 2014, pp. 35–54. URL: https://doi.org/10.1007/978-94-017-9088-8_3. doi:10.1007/978-94-017-9088-8_3.
[21] S. Michie, M. M. van Stralen, R. West, The behaviour change wheel: A new method for characterising and designing behaviour change interventions, Implementation Science 6 (2011) 42. URL: https://doi.org/10.1186/1748-5908-6-42. doi:10.1186/1748-5908-6-42.
[22] B. Fogg, Fogg behavior model, 2019. URL: https://behaviormodel.org.
[23] P. A. E. G. Delespaul, Technical note: Devices and time-sampling procedures, in: The experience of psychopathology: Investigating mental disorders in their natural settings, Cambridge University Press, New York, NY, US, 1992, pp. 363–373. doi:10.1017/CBO9780511663246.033.
[24] A. Khanshan, P. Van Gorp, R. Nuijten, P. Markopoulos, Assessing the Influence of Physical Activity Upon the Experience Sampling Response Rate on Wrist-Worn Devices, International Journal of Environmental Research and Public Health 18 (2021) 10593. URL: https://www.mdpi.com/1660-4601/18/20/10593. doi:10.3390/ijerph182010593.