From Simulation to Reality and Back Again: A Hybrid
Approach to Estimate the Compliance of ESM Study
Participants to Different ESM Protocols
Alireza Khanshan 1,2,∗
1 Department of Industrial Design, Eindhoven University of Technology, 5612 AZ, De Zaale, Eindhoven, North Brabant, Netherlands
2 Eindhoven Artificial Intelligence Systems Institute, 5612 AZ, De Zaale, Eindhoven, North Brabant, Netherlands


Abstract
Sustaining sufficient compliance in long-running Experience Sampling Method (ESM) studies remains
a challenge. Participants of such studies usually drop out after a few weeks due to response
fatigue, technical difficulties, the intrusiveness of the prompts, and changes in their motivation. One
common approach to ensure higher compliance is to tailor the timing of the prompts. Different tailoring
approaches that take into account the personal context of participants have been proposed, such as
considering calendar events, ESM device usage patterns, or information derived from physiological
sensors. Recently, the application of reinforcement learning (RL) in this domain has shown
promise in learning the right timing of the prompts. However, RL agents require repeated inquiries at
the beginning of their learning process, which is intrusive from the participants’ point of view and may
result in early dropouts. To overcome this problem, agents can be pre-trained on prior knowledge and thus
avoid a “cold start”. Although real-life data about compliance in ESM studies is insufficient, agents could
instead train with generated data that imitates real-life events. Accordingly, psychological theories should be
involved in the simulation process that generates ESM-related data. We present our hybrid approach
that utilizes both historical ESM data and synthesized data backed by psychological theories to provide
sufficient data to train ML models and RL agents to predict the opportune moments of ESM prompts.

Keywords
User Simulation, Experience Sampling Method, Computational Modeling, Reinforcement Learning,
Cognitive Modeling, Tailored Interaction, Adaptive Notification


1. Introduction
The Experience Sampling Method (ESM) is a research procedure that enables studying what
people do, feel, and think during their daily lives through systematic self-reports [1]. In the
context of ESM, addressing the low compliance of participants in long-running studies is
challenging [1]. An effective solution should be able to sustain sufficient response rates and
lower dropouts. Accordingly, researchers have argued that appropriate sampling regimes,
tailored content, and adopting the right modalities could address such a challenge [2]. Notably,
it has been hypothesized that timing and frequency of sampling, if chosen appropriately, can
help retain participation longer. More specifically, the impact of various sampling regimes
(random, interval-based, and context-dependent) on the compliance of ESM study participants
has been explored empirically [3]. Unfortunately, most such studies so far have produced
insights that are specific to an application domain or to special cohorts, or that represent a limited
sample of the target population. Moreover, such studies are expensive, take time to conclude,
and still cannot cover all possible cohorts and contexts. Finally,
running studies with poorly configured sampling regimes could one day be considered unethical,
just as it is considered unethical to run studies with poorly selected sample sizes [4].

ACM SIGCHI Symposium on Engineering Interactive Computing Systems, June 21–24, 2022, Sophia Antipolis, France
∗ Corresponding author.
Email: a.khanshan@tue.nl (A. Khanshan)
Website: https://khanshan.com/ (A. Khanshan)
ORCID: 0000-0002-9112-4695 (A. Khanshan)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (http://ceur-ws.org), ISSN 1613-0073

   Besides hypothesizing and testing, studies have been carried out that learn the optimal ESM
protocol over time via machine learning. Specifically, reinforcement learning approaches have
been applied to find the right timing or opportune moments for sending prompts to study
participants [5, 6]. Prior works have shown that adapting the timing of the prompts through
RL increases the compliance of participants [5]. However, it still takes time for an RL agent to learn
a participant’s favorable protocol (the cold start problem). Thus, participants are likely to drop out
early because the initial decisions of the agent are most probably burdensome (repetitive and at
inopportune moments due to lack of knowledge). Conversely, if a pre-trained agent is applied
(warm start), participants are less likely to be disturbed (because the agent can make more
sensible decisions with fewer inquiries).
   To provide the aforementioned training data to enable warm start of RL agents or to train
ML models in general, besides the obvious use of historical data, it seems promising to leverage
simulation to generate data that estimates the reactions of participants to prompts. Such a
simulation should be model-driven and based on psychological theories as open historical
data in this domain is quite scarce. However, incorporating the right psychological theories in
such a process requires further research. Additionally, researchers in charge of ESM studies
should be able to apply their domain knowledge of the target participants to fine-tune the
execution of the aforementioned simulations while considering ethical norms. Finally, ESM
studies executed with the suggested warm start parameters should return the empirical feedback
to the machine learning infrastructure. To the best of our knowledge, such a hybrid framework
that combines simulation data with real-life feedback has never been studied in settings where
human decision-making behavior, as well as ethical norms, are taken into account. Accordingly,
we have formulated our main research question as “How to design a system that estimates
compliance of ESM study participants to different ESM protocols?”. To address this question, we
aim to deliver the first open self-learning ESM framework that avoids the cold start problem
and considers ethical norms in the domain of health.


2. Related work
In the context of Human-Technology Interaction, approaches to overcome the cold start problem
of reinforcement learning agents have been proposed that utilize cognitive models to generate
human decision-making data which then facilitates the warm start of the reinforcement learning
agents [7]. More practically, the effectiveness of such an approach to optimize adaptive
notifications has been shown in the context of mobile health intervention systems as well [5]. In
such works, the decision-making process of participants has been modeled by the incorporation
of different psychological theories mainly related to memory accessibility [8, 9] and option
generation [10]. Meanwhile, providing a more flexible and configurable model could contribute
to generating data for a variety of domains and cohorts.
   Using models to simulate certain behaviors to avoid greedy reinforced decisions has also been
explored, for example in adapting user interfaces in the context of Human-Computer Interaction [11].
More generally, in cognitive psychology, the human decision-making process has been modeled
by evidence accumulation and decision-threshold crossing [12]. Then again, in social
research, a mathematical formalization of psychological theories and computational
modeling are key to developing models and tools that represent different human behaviors [13].
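   To make the idea of evidence accumulation with threshold crossing concrete, the following is a minimal sketch of a noisy random walk between two decision bounds. It is an illustration only, not the model of [12]: the function name, the parameters (`drift`, `threshold`, `noise_sd`), and the mapping of the two bounds to "respond" and "ignore" are assumptions introduced here.

```python
import random


def accumulate_to_threshold(drift, threshold, noise_sd, max_steps, rng):
    """Random-walk evidence accumulation between two decision bounds.

    Hypothetical illustration: positive evidence favors answering a prompt,
    negative evidence favors ignoring it.
    Returns (decision, steps_taken).
    """
    evidence = 0.0
    for step in range(1, max_steps + 1):
        # Each step adds a constant drift plus Gaussian noise.
        evidence += drift + rng.gauss(0.0, noise_sd)
        if evidence >= threshold:
            return "respond", step
        if evidence <= -threshold:
            return "ignore", step
    # Neither bound was reached within the allotted number of steps.
    return "undecided", max_steps


rng = random.Random(42)
decision, steps = accumulate_to_threshold(
    drift=0.1, threshold=1.0, noise_sd=0.3, max_steps=100, rng=rng
)
print(decision, steps)
```

   In such a sketch, a larger drift makes one outcome both more likely and faster, while a higher threshold trades speed for consistency.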
   In the context of ESM and finding opportune moments to prompt, besides modeling the human
decision-making process, estimating the cost of interruptions by modeling the user’s attention
has been explored [14, 15, 16, 17]. Since such works are designed primarily for maximizing the
economic utility of prompts, we need to extend them for use in health-related studies, where
values other than economic utility are at stake.


3. Research Question
We aim to enable the incorporation of psychological theories that can guide the simulation
process and estimate human decision-making during health-related ESM studies. Beyond
theories that focus on economic principles, we incorporate medical importance and societal
values, as they are typically overlooked in applied computing. In addition, we intend to
apply RL to predict the opportune moments of interruption. By developing a hybrid
simulation approach that includes both historical and generated data, we aim to alleviate the
cold start problem of RL agents. Lastly, besides this framework, an open, available, and
secure research tool to carry out ESM studies is being developed by the author. Overall,
we can break down our research objective into the following sub-research questions:

    • How can ESM protocol variables and collected contextual and behavioral data of participants
      during ESM studies turn into sufficiently rich data for machine learning?
    • Which psychological theories are relevant to simulate the human decision-making model
      in a health-related ESM context?
    • What are the design considerations of a software tool that researchers from different
      domains can use to run their simulations, share their knowledge, and benefit from smarter
      ESM protocols?
    • How can self-learning ESM protocols increase the compliance of participants?


4. Proposed approach and methodology
To describe our proposed approach and applied methodologies, we explain them with respect
to the sub-RQs above:
4.1. How can ESM protocol variables and collected contextual and behavioral
     data of participants during ESM studies turn into sufficiently rich data for
     machine learning?
To facilitate controlled data collection, we have created our ESM tool, Experiencer [18], to set
up protocols, collect self-reports, and gather fine-grained contextual data from commodity-level
smartwatch sensors. We also anticipate integrating interoperable ESM protocol definition
languages such as executable HTML [19] for easier conversion of protocol definitions into
simulation instructions and ML features. Accordingly, we have been running multiple ESM
experiments to monitor compliance, analyze ESM protocol parameters, and gather sensor data as
well as data on participants’ behavior to kick-start the ML training process. Even though such
data provides an accountable perspective on ESM compliance, it is still insufficient to build
transferable ML models. Thus, complementary data should be generated given a computational
model of the human decision-making process and contextual factors.
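   As an illustration of how protocol variables and contextual data might be turned into ML features, consider the sketch below. It is not the Experiencer data model: the record type, its field names, and the chosen sensor readings are hypothetical and only show one possible flattening into a numeric feature vector with a binary compliance label.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class PromptRecord:
    """One ESM prompt and its outcome (all field names are hypothetical)."""
    sent_at: datetime           # when the prompt was delivered
    regime: str                 # "random", "interval", or "context"
    prompts_so_far_today: int   # prompts already sent on that day
    heart_rate_bpm: float       # example smartwatch reading at prompt time
    step_count_last_10min: int  # example activity reading
    responded: bool             # True if a self-report was completed


REGIMES = ("random", "interval", "context")


def to_features(rec: PromptRecord) -> tuple[list[float], int]:
    """Flatten a record into a numeric feature vector and a binary label."""
    # One-hot encoding of the sampling regime (a protocol variable).
    one_hot = [1.0 if rec.regime == r else 0.0 for r in REGIMES]
    features = [
        rec.sent_at.hour + rec.sent_at.minute / 60.0,  # time of day in hours
        float(rec.sent_at.weekday()),                  # 0 = Monday
        float(rec.prompts_so_far_today),
        rec.heart_rate_bpm,
        float(rec.step_count_last_10min),
        *one_hot,
    ]
    return features, int(rec.responded)


record = PromptRecord(datetime(2022, 6, 21, 14, 30), "interval", 3, 72.0, 120, True)
x, y = to_features(record)
print(x, y)
```

   The same flattening could be applied to simulated records, so that historical and generated data share one feature space.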

4.2. Which psychological theories are relevant to simulate the human
     decision-making model in a health-related ESM context?
Following the common approaches in the literature, we model the decision-making process via
dynamic Bayesian networks. Such a graphical representation helps associate our variables with
their conditional dependencies and captures their evolution over time. We build on the proposed
models of attention-sensitive alerting [14], the impact of memory aids [8], option generation [10],
and the time course of retention [9] to model the decision-making process. The center of our model
is an Action (a self-report) following an external memory aid (an ESM prompt). In addition,
our model accounts for personal context (e.g., physiological signals and calendar events) as well as
the participant’s instrumental attitude [8] and perceived burden [20]. Accordingly, models such as
COM-B [21] and the Fogg Behavior Model [22] can help abstract the Bayesian network. Furthermore,
we aim to test the effectiveness and the correctness of the aforementioned human simulator by
developing an evaluation process. This can be achieved by data-driven approaches where the simulated
data is compared against real-life historical data. Covering corner cases may additionally call for
black-box or white-box testing, but that requires further research.
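   The following sketch shows, under strong simplifying assumptions, what unrolling such a dynamic Bayesian network over a sequence of prompts could look like. It is not the author's model: perceived burden is the only state carried between time slices, the response probability is a hand-picked logistic function, and every weight and parameter name is a placeholder chosen for illustration.

```python
import math
import random


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def simulate_participant(n_prompts, attitude, busy_prob, rng,
                         burden_gain=0.05, burden_decay=0.9):
    """Simulate a participant's responses to a sequence of ESM prompts.

    State carried between time slices: perceived burden.
    Per-slice variables: busy (context) and responded (the Action).
    All weights below are illustrative assumptions, not fitted values.
    """
    burden = 0.0
    responses = []
    for _ in range(n_prompts):
        busy = rng.random() < busy_prob  # personal context in this slice
        # P(respond | attitude, burden, busy) as a logistic function.
        p = sigmoid(1.0 + 2.0 * attitude - 2.5 * burden - (1.5 if busy else 0.0))
        responses.append(rng.random() < p)
        # Transition: burden decays over time and grows with every prompt.
        burden = burden_decay * burden + burden_gain
    return responses


rng = random.Random(0)
motivated = simulate_participant(200, attitude=0.8, busy_prob=0.3, rng=rng)
reluctant = simulate_participant(200, attitude=-0.8, busy_prob=0.3, rng=rng)
print(sum(motivated) / len(motivated), sum(reluctant) / len(reluctant))
```

   A data-driven evaluation could then compare summary statistics of such simulated sequences, for instance the response rate per time of day, against the same statistics in historical data.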

4.3. What are the design considerations of a software tool that researchers
     from different domains can use to run their simulations, share their
     knowledge, and benefit from smarter ESM protocols?
To generate data and estimate compliance with a specific protocol, we intend to introduce a
framework where the researcher can either hand-engineer the parameters of the model and
generate data while considering the characteristics of target participants or reuse the parameters
of past studies. In our framework, we define compliance as the response rate and set the goal of
finding opportune moments to prompt. After learning through the simulation, the
trained model (e.g., a warmed-up RL agent) is ready to be used in the wild. Data collected during
such ESM studies is transferred back to our data repository and used to update the models
incrementally. Meanwhile, the system allows the creation of multiple models for different
contexts and different data sets. Such models can then be aggregated (e.g., by bagging)
to potentially increase the quality of the predictions. In addition, we allow fine-tuning the ratio
of synthesized to historical data to form a hybrid training set that yields high-performance
models.
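   A minimal sketch of the two notions used above, compliance as response rate and a hybrid training set with a tunable share of synthesized data, is given below. The function names, the record layout (dictionaries with `responded` and `source` keys), and the choice of sampling with replacement are assumptions made for illustration rather than a description of the framework's implementation.

```python
import random


def response_rate(records):
    """Compliance as response rate: answered prompts divided by sent prompts."""
    if not records:
        return 0.0
    return sum(1 for r in records if r["responded"]) / len(records)


def hybrid_training_set(historical, synthetic, synthetic_ratio, size, rng):
    """Draw `size` records, a fraction `synthetic_ratio` of them synthetic.

    Sampling is with replacement, so a small historical pool can still
    fill its share of the training set.
    """
    if not 0.0 <= synthetic_ratio <= 1.0:
        raise ValueError("synthetic_ratio must be in [0, 1]")
    n_syn = round(size * synthetic_ratio)
    n_hist = size - n_syn
    sample = []
    if n_hist:
        sample += rng.choices(historical, k=n_hist)
    if n_syn:
        sample += rng.choices(synthetic, k=n_syn)
    rng.shuffle(sample)
    return sample


rng = random.Random(1)
# Toy pools: historical data is scarce, synthetic data is plentiful.
historical = [{"source": "historical", "responded": i % 2 == 0} for i in range(10)]
synthetic = [{"source": "synthetic", "responded": i % 3 == 0} for i in range(500)]
train = hybrid_training_set(historical, synthetic, synthetic_ratio=0.3, size=100, rng=rng)
print(len(train), response_rate(train))
```

   Sweeping `synthetic_ratio` and checking model quality on held-out historical data is one way the ratio could be tuned.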

4.4. How can self-learning ESM protocols increase the compliance of
     participants?
The focus of our approach is to predict the right time of prompting such that
compliance is not hampered. Rather than classic prompting regimes (random and
interval-based [23]), ML models, particularly RL agents, can be used to predict opportune moments by
learning from participants’ behavior over time. Such a process is possible not only during
the run time of a study but also at design time, in that transferable models learned from other
studies or from model-driven simulation can be integrated to improve the prediction quality.
Furthermore, by discovering participants’ response patterns, the questions and/or answers
themselves can be predicted. Thus, instead of a generic questionnaire, a variant based on ML
predictions can be presented, which potentially contributes to higher compliance as well because
it is personalized. At the same time, ecological validity needs to be maintained. Namely,
randomness in the prompting schedule and a minimum number of inquiries per day need to
be satisfied. Clearly, a high response rate may be achieved with fewer prompts, but a researcher
oftentimes requires a minimum number of responses within specific time frames.
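   The difference between a cold and a warm start can be sketched with a deliberately simple stand-in for an RL agent: an epsilon-greedy bandit over time-of-day slots. This is not the agent used in the cited works or in our framework; the class, its method names, the slot labels, and the toy simulated log are all hypothetical.

```python
import random


class PromptBandit:
    """Epsilon-greedy bandit over prompt slots (a simple RL stand-in)."""

    def __init__(self, slots, epsilon=0.1):
        self.slots = list(slots)
        self.epsilon = epsilon
        self.counts = {s: 0 for s in self.slots}
        self.values = {s: 0.0 for s in self.slots}  # estimated response rates

    def update(self, slot, responded):
        """Incremental-mean update of the slot's estimated response rate."""
        self.counts[slot] += 1
        self.values[slot] += (float(responded) - self.values[slot]) / self.counts[slot]

    def warm_start(self, simulated_log):
        """Pre-train on (slot, responded) pairs, e.g. produced by a simulator."""
        for slot, responded in simulated_log:
            self.update(slot, responded)

    def choose(self, rng):
        """Explore with probability epsilon, otherwise pick the best-known slot."""
        if rng.random() < self.epsilon:
            return rng.choice(self.slots)
        return max(self.slots, key=lambda s: self.values[s])


# Toy simulated log in which evening prompts are answered more often.
simulated_log = [
    ("morning", False), ("morning", False), ("morning", False), ("morning", True),
    ("evening", True), ("evening", True), ("evening", True), ("evening", False),
]
agent = PromptBandit(["morning", "afternoon", "evening"], epsilon=0.1)
agent.warm_start(simulated_log)
print(agent.choose(random.Random(0)), agent.values)
```

   Keeping `epsilon` above zero retains some randomness in the schedule, which relates to the ecological validity requirement; a minimum number of prompts per day would have to be enforced separately.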


5. Current status of doctoral work
The research started in early 2020 by exploring how systems used in longitudinal monitoring
studies could adapt and evolve to enhance user experience, especially in the context of ESM in
the public health domain. Specifically, we have been working on Experiencer [18], as an ESM
tool that exploits commodity-level smartwatches to maximize compliance by finding opportune
moments of prompts based on the context derived from physiological sensors (sub-RQs 1 and 3).
Accordingly, we have published our latest findings titled “Assessing the Influence of Physical
Activity Upon the Experience Sampling Response Rate on Wrist-Worn Devices” [24] in the
International Journal of Environmental Research and Public Health.
   Meanwhile, the article that introduces the Experiencer software is to be submitted soon. We have
created a context-sensitive tool that supports recording sensor data, configuring sensors,
remote device management, event-based data collection, various sampling regimes, and a dynamic
user interface. It also optimizes device data storage and data transactions over the network
while being open, available, and compliant with standard privacy measures (sub-RQ 3).
   With respect to the topics discussed in this paper, we have built our initial dynamic Bayesian
network, which incorporates the psychological theories to imitate the human decision-making
process and also considers personal context (sub-RQ 2). Preliminary calculations are being
carried out to mathematically formalize the model. These will be followed by prototyping
the human simulator, building a framework that could be used by researchers, and ultimately
testing it in the wild (sub-RQ 4).
References
 [1] R. Larson, M. Csikszentmihalyi, The Experience Sampling Method, in: M. Csikszentmihalyi
     (Ed.), Flow and the Foundations of Positive Psychology: The Collected Works of Mihaly
     Csikszentmihalyi, Springer Netherlands, Dordrecht, 2014, pp. 21–34.
     URL: https://doi.org/10.1007/978-94-017-9088-8_2. doi:10.1007/978-94-017-9088-8_2.
 [2] H. A. A. Spelt, J. H. D. M. Westerink, L. Frank, J. Ham, W. A. IJsselsteijn, Physiology-based
     personalization of persuasive technology: a user modeling perspective, User Modeling
     and User-Adapted Interaction (2022). URL: https://doi.org/10.1007/s11257-021-09313-8.
     doi:10.1007/s11257-021-09313-8.
 [3] N. van Berkel, J. Goncalves, L. Lovén, D. Ferreira, S. Hosio, V. Kostakos, Effect of
     experience sampling schedules on response rate and recall accuracy of objective
     self-reports, International Journal of Human-Computer Studies 125 (2019) 118–128.
     URL: https://www.sciencedirect.com/science/article/pii/S1071581918306797.
     doi:10.1016/j.ijhcs.2018.12.002.
 [4] P. Bacchetti, L. E. Wolf, M. R. Segal, C. E. McCulloch, Ethics and Sample Size,
     American Journal of Epidemiology 161 (2005) 105–110.
     URL: https://doi.org/10.1093/aje/kwi014. doi:10.1093/aje/kwi014.
 [5] S. Wang, C. Zhang, B. Kröse, H. van Hoof, Optimizing Adaptive Notifications in Mobile
     Health Interventions Systems: Reinforcement Learning from a Data-driven Behavioral
     Simulator, Journal of Medical Systems 45 (2021) 102.
     URL: https://doi.org/10.1007/s10916-021-01773-0. doi:10.1007/s10916-021-01773-0.
 [6] S. Wang, K. Sporrel, H. van Hoof, M. Simons, R. D. D. de Boer, D. Ettema, N. Nibbeling,
     M. Deutekom, B. Kröse, Reinforcement Learning to Send Reminders at Right Moments in
     Smartphone Exercise Application: A Feasibility Study, International Journal of
     Environmental Research and Public Health 18 (2021) 6059.
     URL: https://www.mdpi.com/1660-4601/18/11/6059. doi:10.3390/ijerph18116059.
 [7] C. Zhang, S. Wang, H. Aarts, M. Dastani, Using Cognitive Models to Train Warm Start
     Reinforcement Learning Agents for Human-Computer Interactions, arXiv:2103.06160 [cs]
     (2021). URL: http://arxiv.org/abs/2103.06160.
 [8] R. Tobias, Changing behavior by memory aids: A social psychological model of prospective
     memory and habit development tested with dynamic field data, Psychological Review
     116 (2009) 408–438. doi:10.1037/a0015512.
 [9] D. C. Rubin, S. Hinton, A. Wenzel, The precise time course of retention, Journal of
     Experimental Psychology: Learning, Memory, and Cognition 25 (1999) 1161–1176.
     doi:10.1037/0278-7393.25.5.1161.
[10] B. Kamphorst, A. Kalis, Why option generation matters for the design of autonomous
     e-coaching systems, AI & Society 30 (2015) 77–88.
     URL: https://doi.org/10.1007/s00146-013-0532-5. doi:10.1007/s00146-013-0532-5.
[11] K. Todi, G. Bailly, L. Leiva, A. Oulasvirta, Adapting User Interfaces with Model-based
     Reinforcement Learning, in: Proceedings of the 2021 CHI Conference on Human Factors
     in Computing Systems, CHI ’21, Association for Computing Machinery, New York, NY,
     USA, 2021, pp. 1–13. URL: https://doi.org/10.1145/3411764.3445497.
     doi:10.1145/3411764.3445497.
[12] J. R. Busemeyer, S. Gluth, J. Rieskamp, B. M. Turner, Cognitive and Neural Bases
     of Multi-Attribute, Multi-Alternative, Value-based Decisions, Trends in Cognitive
     Sciences 23 (2019) 251–263.
     URL: https://www.sciencedirect.com/science/article/pii/S1364661318302845.
     doi:10.1016/j.tics.2018.12.003.
[13] M. Guimaraes, L. Emmendorfer, D. Adamatti, Persuasive agent based simulation for
     evaluation of the dynamic threshold line and trigger classification from the Fogg Behavior
     Model, Simulation Modelling Practice and Theory 83 (2018) 18–35.
     URL: https://www.sciencedirect.com/science/article/pii/S1569190X18300017.
     doi:10.1016/j.simpat.2018.01.001.
[14] E. J. Horvitz, A. Jacobs, D. Hovel, Attention-Sensitive Alerting, arXiv:1301.6707 [cs] (2013).
     URL: http://arxiv.org/abs/1301.6707.
[15] E. Horvitz, P. Koch, J. Apacible, BusyBody: creating and fielding personalized models
     of the cost of interruption, in: Proceedings of the 2004 ACM conference on Computer
     supported cooperative work, CSCW ’04, Association for Computing Machinery, New York,
     NY, USA, 2004, pp. 507–510. URL: https://doi.org/10.1145/1031607.1031690.
     doi:10.1145/1031607.1031690.
[16] E. Horvitz, C. Kadie, T. Paek, D. Hovel, Models of attention in computing and
     communication: from principles to applications, Communications of the ACM 46 (2003) 52–59.
     URL: https://doi.org/10.1145/636772.636798. doi:10.1145/636772.636798.
[17] E. Horvitz, J. Apacible, Learning and reasoning about interruption, in: Proceedings
     of the 5th international conference on Multimodal interfaces, ICMI ’03, Association for
     Computing Machinery, New York, NY, USA, 2003, pp. 20–27.
     URL: https://doi.org/10.1145/958432.958440. doi:10.1145/958432.958440.
[18] A. Khanshan, Experiencer, 2022. URL: https://experiencer.eu/.
[19] N. Batalas, V.-J. Khan, M. Franzen, P. Markopoulos, M. aan het Rot, Formal representation
     of ambulatory assessment protocols in HTML5 for human readability and computer
     execution, Behavior Research Methods 51 (2019) 2761–2776.
     URL: https://doi.org/10.3758/s13428-018-1148-y. doi:10.3758/s13428-018-1148-y.
[20] M. Csikszentmihalyi, R. Larson, Validity and Reliability of the Experience-Sampling
     Method, in: M. Csikszentmihalyi (Ed.), Flow and the Foundations of Positive
     Psychology: The Collected Works of Mihaly Csikszentmihalyi, Springer Netherlands,
     Dordrecht, 2014, pp. 35–54. URL: https://doi.org/10.1007/978-94-017-9088-8_3.
     doi:10.1007/978-94-017-9088-8_3.
[21] S. Michie, M. M. van Stralen, R. West, The behaviour change wheel: A new method for
     characterising and designing behaviour change interventions, Implementation Science 6
     (2011) 42. URL: https://doi.org/10.1186/1748-5908-6-42. doi:10.1186/1748-5908-6-42.
[22] B. Fogg, Fogg behavior model, 2019. URL: https://behaviormodel.org.
[23] P. A. E. G. Delespaul, Technical note: Devices and time-sampling procedures, in:
     The experience of psychopathology: Investigating mental disorders in their natural
     settings, Cambridge University Press, New York, NY, US, 1992, pp. 363–373.
     doi:10.1017/CBO9780511663246.033.
[24] A. Khanshan, P. Van Gorp, R. Nuijten, P. Markopoulos, Assessing the Influence of Physical
     Activity Upon the Experience Sampling Response Rate on Wrist-Worn Devices,
     International Journal of Environmental Research and Public Health 18 (2021) 10593.
     URL: https://www.mdpi.com/1660-4601/18/20/10593. doi:10.3390/ijerph182010593.