Promoting Trustworthy AI in mHealth: a Gamified Approach to Value-Sensitive Design

Maria Inês Ribeiro1,*,†, Laura Genga2,†, Monique Simons3,‡ and Pieter Van Gorp4,†

1 Eindhoven University of Technology, Eindhoven, Netherlands
2 Wageningen University & Research, Wageningen, Netherlands

HHAI-WS 2024: Workshops at the Third International Conference on Hybrid Human-Artificial Intelligence (HHAI), June 10–14, 2024, Malmö, Sweden
* Corresponding author.
m.i.da.graca.jorge.da.silva.ribeiro@tue.nl (M. I. Ribeiro); l.genga@tue.nl (L. Genga); monique.simons@wur.nl (M. Simons); P.M.E.v.Gorp@tue.nl (P. Van Gorp)
https://orcid.org/0009-0001-7746-4685 (M. I. Ribeiro); https://orcid.org/0000-0001-8746-8826 (L. Genga); https://orcid.org/0000-0003-4693-9980 (M. Simons); https://orcid.org/0000-0001-5197-3986 (P. Van Gorp)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073

Abstract

The rise of mobile health (mHealth) apps leveraging AI and wearables to promote healthy lifestyles is accompanied by growing ethical concerns among the public, developers, and policymakers. While AI guidelines exist to mitigate these concerns, translating them into practical design requirements remains challenging. This research proposes a gamified approach to help bridge the gap between theory and practice in Value-Sensitive Design (VSD) for AI applications in mHealth. This approach aims to facilitate the development of trustworthy AI by aligning design with stakeholder ethical values. Using the design science methodology, we developed a card game to improve stakeholder participation, foster an understanding of AI in mHealth, and facilitate in-depth ethical discussions. Pilot-testing with 19 peer researchers showed active engagement and motivation of players through self-discovery. The findings highlight the game's potential to elicit ethical discussions and promote an understanding of AI's real-world implications. Future iterations could explore digital, blended, or survey formats to enhance engagement, accessibility, and depth of insights, catering to diverse stakeholder preferences. This gamified approach to VSD holds promise as a tool for supporting the development of trustworthy AI technologies in healthcare, aligned with stakeholder values. Further validation with broader stakeholder groups and a longitudinal impact assessment are needed.

Keywords
mHealth, Trustworthy AI, Value-Sensitive Design

1. Introduction

In recent years, mHealth apps that track our sleep patterns, heart rate, and activity levels have become increasingly popular to promote healthy lifestyle behaviors. While these personalized health tools powered by AI technology and wearables data hold immense potential, recent ethical incidents raise concerns and spark alarm among the public, developers, and policymakers [1, 2]. Ethical frameworks and regulations are emerging to mitigate these concerns and ensure trustworthy AI development. For instance, the High-Level Expert Group on AI from the European Commission (EC) advocates for a human-centered approach grounded in the ethical principles of respect for human autonomy, prevention of harm, fairness, and explicability [3]. A self-assessment checklist is available as a tool for AI developers to implement these principles [4]. Yet, checklist-based approaches lack practical implementation details, leaving developers to navigate complex ethical dilemmas and tensions between diverse stakeholder ethical values [5]. In mHealth apps, conflicts between data privacy and personalized lifestyle recommendations are particularly evident. For example, an app may collect health and lifestyle data to predict opportunistic moments for suggesting a walk.
This app could bring health benefits through increased physical activity, but it also poses privacy risks, as sensitive health data could be exposed. What is then more valuable: data privacy or health benefits?

To address these complex trade-offs, several design approaches can inform the integration of stakeholder values in the design process of AI technology. While User-Centered Design prioritizes user experience and Privacy by Design focuses on data protection, VSD offers a more comprehensive framework, analyzing AI ethics across individual, group, and societal levels, and aiming at the symbiotic evolution of technology and societal norms [6, 7, 8]. Multiple methods have been employed to elicit values in VSD, such as the Value Scenario method, which emphasizes technology implications in narrated use cases, or the Value-oriented Mock-up, Prototype, or Field Deployment method, which investigates value implications in real-world contexts [9]. Despite these efforts, current VSD methods face several limitations, often addressing only one or two of the following key challenges: (1) recruiting and engaging stakeholders in focus groups; (2) providing stakeholders with sufficient technical and ethical understanding of AI; and (3) eliciting ethical discussions that allow for translating abstract findings into actionable requirements for AI developers [6].

A gamified tool seems intuitively capable of addressing these challenges simultaneously. First, games are inherently engaging, attracting and retaining stakeholder participation better than traditional methods. Second, they may simulate complex scenarios and provide immediate feedback, helping stakeholders grasp AI's technical and ethical dimensions without prior expertise. Third, the structured yet flexible nature of games allows for quantitative tracking of decisions and actions, providing concrete data for actionable design requirements. This research proposes to support VSD with a gamified approach.
We developed and pilot-tested a card game to elicit and explore stakeholder values regarding the use of AI in mHealth apps. This approach seeks to provide practical insights that can enhance the effectiveness of VSD in guiding the development of trustworthy AI technology.

The remainder of this paper is structured as follows: Section 2 outlines the methods used to develop and pilot-test the gamified approach; Section 3 presents the key findings from the pilot tests; Section 4 discusses the adherence of the game to its objectives and potential future directions for refining the game; and Section 5 concludes with a summary of key points and suggestions for further research.

2. Methods

In this study, we employed the design science methodology framework to develop and refine a game exploring stakeholder values and ethical considerations in using AI for mHealth apps [10]. The overall goal was to provide a practical tool to support VSD. This preliminary version of the game was designed for a general population, assessing its acceptance of using private data to generate personalized lifestyle recommendations. This section outlines the game objectives, design, and pilot testing.

2.1. Game Objectives

The game aimed to achieve the following objectives:

Objective 1: Enhance Recruitment and Engagement. Leverage gamification to create an interactive and captivating experience for stakeholders during focus groups.

Objective 2: Provide Understanding of AI. Present concrete examples of AI applications and implications to guide the definition of AI design requirements by assessing ethical concerns about specific uses of AI.

Objective 3: Elicit In-Depth Ethical Discussions. Engage stakeholders in structured discussions on AI in mHealth to gain insights into specific ethical considerations relevant to AI design and development.

2.2. Game Design

The game design adheres to the Mechanics-Dynamics-Aesthetics framework to create an engaging exploration of ethical considerations for AI in mHealth apps, centered on the trade-off between data privacy and personalized lifestyle recommendations [11].

To enhance recruitment and engagement (Objective 1), the game offers intrinsic and extrinsic rewards. At the beginning of a game session, players were motivated to embark on a self-discovery journey fostering reflection on personal ethical values (intrinsic reward) while earning AI user-type badges (extrinsic reward). This AI user type was defined based on the prevalence of each participant's ethical concerns, categorized according to the four ethical principles of trustworthy AI defined by the EC [3]. During the game, participants encountered multiple ethical dilemmas that required them to weigh competing values and priorities when interacting with mHealth apps. The game provided a safe and comfortable social environment for open and honest discussions about ethical concerns, promoting community and empathy among players.

The core game mechanics revolved around Black Cards presenting AI-generated lifestyle recommendations together with five possible human reactions (Figure 1), aiming to promote understanding of AI's real-world implications (Objective 2). Each prompt was linked to at least one AI development decision, e.g. 'Is it acceptable to use GPS location to recommend convenient and nearby walking routes?'. Players individually chose a White Card, labeled from A to E, reflecting their preferred reaction to the AI prompt, and placed it face down on the table. A moderator facilitated discussion in each round as players shared and debated their choices (Objective 3). Encrypted color coding on Score Keeping Cards tracked player decisions. At game over, each player earned the AI user-type badge reflecting their ethical priorities based on gameplay, revealed by the Game Over Card.

Hi, human!
Your sedentary streak is on! How about we break free and take a brisk walk? I've already charted a route near your current location.

Hi, AI Buddy!
A. Whoa, you are right! How cool is your new routing feature. Let's go in 15 minutes.
B. Sounds nice, but how did you decide that's what I need right now? Looks like just guesswork, pal.
C. Does everyone get this special treatment, or am I the chosen one? Let's make sure the wellness party is open to all, not just the tech-savvy.
D. Why do you always know where I am?
E. While I appreciate your initiative, I will do whatever I want. And that is very human.

Figure 1: Black Card sample presenting an AI-generated health recommendation and five human reactions.

2.3. Pilot Testing

We conducted two 90-minute focus groups to pilot-test the game. A total of 19 researchers were recruited through convenience sampling at our affiliated universities (Eindhoven University of Technology and Wageningen University & Research). Both sessions followed a similar agenda: an introduction explaining the game and its objectives (10 minutes), gameplay (six rounds or 45 minutes), and feedback (35 minutes). Observations during gameplay from both focus groups were used to evaluate the game's adherence to its key objectives. The feedback sessions served to ideate new game mechanics, dynamics, and aesthetics and to refine the game design for future iterations. Focus Group 1 first engaged in spontaneous feedback and then collaborated in a co-creation task using the gamification model canvas to refine the game design [12]. Focus Group 2 participated in a semi-structured discussion with predefined questions to guide the co-creation process.

3. Results

We briefly report the most significant findings and the related participants' suggestions for game refinement offered in the focus groups.

Finding 1: Motivation through Self-Discovery. Participants found that uncovering their AI user type was a strong motivator for participation.
While some players found the assigned type aligned with their values, others desired more rounds for a clearer picture. One participant recommended using a subset of AI scenarios in digital format as a teaser to recruit players.

Finding 2: Player Engagement and Relatedness. Participants expressed joy in gameplay, reporting higher engagement when scenarios resonated with personal experiences. The alignment of AI prompts with personal interests significantly influenced their reactions and investment in the gameplay. Participants suggested avoiding overly specific prompts (e.g., detailed activities or timing) and incorporating open-ended response options to encourage imagination and enhance connection to the scenarios.

Finding 3: Ethical Discussion. Participants engaged in discussions prompted by the game's ethical dilemmas, noting the challenge of selecting a single answer from limited choices. To address this, they proposed implementing a ranking score system to allow for more nuanced responses. Some participants were unsure about the benefits of the discussions for uncovering their AI user type. They suggested clarifying discussion goals and offering incentives for active participation. Additionally, participants recommended using a centralized moderator and an AI voice for reading prompts to streamline gameplay and enhance immersion. Finally, participants emphasized the need for a safe environment among players; they were worried that introverted players would not give their input. It was suggested to cluster stakeholders in dedicated sessions.

Finding 4: Contextual Clarity. Participants highlighted the need for additional context surrounding AI prompts to facilitate informed decision-making. They proposed introducing a game board element displaying complementary information.

Finding 5: Understanding of AI in mHealth.
The game facilitated an understanding of AI's real-world implications, as it revealed participants' varied comfort levels with sharing different types of data (e.g., social media vs. physiological) with an AI-driven mHealth tool.

Finding 6: Influence of Phrasing. Participants identified potential bias from prompt wording and tone. They recommended neutral language while acknowledging humor's role in fostering curiosity and engagement.

Finding 7: Digital Format. While some participants appreciated the physical format, transitioning to an online platform was viewed favorably. This could enable new mechanics such as nuanced scoring and virtual moderation. Participants believed that an online version would be more accessible and inclusive, potentially reaching a wider audience beyond physical group settings.

4. Discussion

4.1. Game Adherence to Objectives

Pilot testing demonstrated the serious game's potential to achieve its key objectives.

First, the game promises to attract participants (Objective 1) through gamification elements like badges and personalized user types, offering an enjoyable and stimulating self-discovery experience. Moving forward, clarifying discussion goals and rewarding participation could enhance engagement further.

Second, the data privacy concerns raised during gameplay highlight the game's ability to guide players in learning the implications of AI in mHealth (Objective 2). A storytelling dynamic holds the potential to further contextualize different uses of AI in mHealth. Hence, this gamified approach seems suitable for including stakeholders with low AI literacy in the design process. In addition to assessing ethical trade-offs between data privacy and potential health outcomes, this gamified tool could be leveraged in other stages of the design process, e.g. to assess the ease of use of prototypes or to evaluate stakeholders' feeling of empowerment in the co-creation of new technology.
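To illustrate how gameplay choices could be turned into quantitative design input, the AI user-type scoring described in Section 2.2 can be sketched as follows. This is a minimal sketch, not the implemented game logic: the mapping of White Card choices B–E onto the EC's four ethical principles (suggested by the reactions in Figure 1) and the "trusting adopter" label for concern-free players are our own illustrative assumptions.

```python
from collections import Counter

# Hypothetical mapping of White Card choices to the EC's four ethical
# principles; choice A (enthusiastic acceptance) signals no concern.
CHOICE_TO_PRINCIPLE = {
    "B": "explicability",
    "C": "fairness",
    "D": "prevention of harm",
    "E": "human autonomy",
}

def ai_user_type(choices):
    """Return the most prevalent ethical concern over a game's rounds,
    or the (hypothetical) 'trusting adopter' label when the player
    never voiced a concern."""
    concerns = Counter(
        CHOICE_TO_PRINCIPLE[c] for c in choices if c in CHOICE_TO_PRINCIPLE
    )
    if not concerns:
        return "trusting adopter"
    return concerns.most_common(1)[0][0]

# Example: six rounds dominated by location-privacy reactions.
print(ai_user_type(["A", "D", "D", "B", "E", "D"]))  # -> prevention of harm
```

A tally of this kind could also feed the ranking score system suggested in Finding 3, where players rank reactions instead of picking a single card.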
Despite such opportunities, there is a need to explore how such a gamified approach could be scaled without cumbersome effort in adapting the game to new use cases.

Third, the structured format encourages stakeholders to debate the ethical implications of AI technologies (Objective 3). When players shared their decision-making process across the different ethical human reactions, they provided nuanced insights that may inform AI developers in making design decisions. In future work, each game card or AI prompt could be linked to an AI development decision, where quantitative analysis of the players' choices may translate stakeholders' values into actionable insights in alignment with Trustworthy AI principles.

4.2. Future Directions

Three possible future directions emerge to refine the game in subsequent design iterations:

1. In-person Digital Approach: Moving the game to a digital platform while preserving its engaging elements could enhance accessibility and scalability. A digital version could introduce nuanced scoring mechanisms and virtual moderation, and incorporate additional contextual information to improve the game's effectiveness and reduce response bias.

2. Blended Approach: Combining elements of the paper-based game with digital components offers the advantages of both formats. This approach could maintain tangible interaction with physical cards while integrating online features for enhanced scoring, moderation, and broader engagement across different settings. It would cater to diverse preferences and maximize the game's impact.

3. Digital Survey Approach: A digital survey format could target stakeholders who may be unwilling to dedicate time to gameplay but whose input remains valuable for AI system design.
While this approach could scale distribution and provide more representative data, it offers fewer gamification opportunities for engagement (Objective 1) and may sacrifice the nuanced personal values that emerge from meaningful discussions during gameplay (Objective 3), which are not trivially realized in online settings.

In future research, choosing the most suitable approach depends on the desired engagement, accessibility, and depth of insights needed for ethical AI design and development, where A/B testing could provide further insight. Further exploration and refinement are crucial for maximizing the game's potential. Future validation efforts should involve broader testing with diverse stakeholder groups beyond academic researchers, as well as longitudinal studies to assess the game's impact on stakeholders' attitudes and decision-making processes over time, establishing it as a reliable tool for promoting responsible and trustworthy AI development.

5. Conclusion

By combining gamified engagement, a deeper understanding of AI applications, and in-depth ethical discussions, this gamified approach shows promise as a tool to support the development of trustworthy AI in mHealth aligned with stakeholder values. Further refinement efforts could explore a fully digital format prioritizing accessibility and nuanced scoring, a blended physical-digital approach, or even a streamlined online survey, depending on the desired balance between engagement, accessibility, and depth of stakeholder insights gleaned.

Acknowledgments

The authors gratefully acknowledge the contributions of the focus group researchers for their valuable feedback shaping the next game design iteration.

References

[1] A. Kankanhalli, Q. Xia, P. Ai, X. Zhao, Understanding personalization for health behavior change applications: A review and future directions, AIS Transactions on Human-Computer Interaction 13 (2021) 316–349. doi:10.17705/1thci.00152.

[2] S.
McGregor, Preventing repeated real world AI failures by cataloging incidents: The AI Incident Database, in: Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, Thirty-Third Conference on Innovative Applications of Artificial Intelligence, The Eleventh Symposium on Educational Advances in Artificial Intelligence, Virtual Event, AAAI Press, 2021, pp. 15458–15463. doi:10.1609/AAAI.V35I17.17817.

[3] European Commission, Directorate-General for Communications Networks, Content and Technology, Ethics guidelines for trustworthy AI, Publications Office, 2019. doi:10.2759/346720.

[4] European Commission, Directorate-General for Communications Networks, Content and Technology, The Assessment List for Trustworthy Artificial Intelligence (ALTAI) for self assessment, Publications Office, 2020. doi:10.2759/002360.

[5] C. Huang, Z. Zhang, B. Mao, X. Yao, An overview of artificial intelligence ethics, IEEE Transactions on Artificial Intelligence 4 (2022) 799–819. doi:10.1109/TAI.2022.3194503.

[6] L. van Velsen, G. Ludden, C. Grünloh, The limitations of user- and human-centered design in an eHealth context and how to move beyond them, Journal of Medical Internet Research 24 (2022) e37341. doi:10.2196/37341.

[7] P. Schaar, Privacy by design, Identity in the Information Society 3 (2010) 267–274. doi:10.1007/s12394-010-0055-x.

[8] B. Friedman, P. H. Kahn, A. Borning, A. Huldtgren, Value sensitive design and information systems, in: N. Doorn, D. Schuurbiers, I. van de Poel, M. E. Gorman (Eds.), Early engagement and new technologies: Opening up the laboratory, Springer Netherlands, 2013, pp. 55–95. doi:10.1007/978-94-007-7844-3_4.

[9] B. Friedman, D. G. Hendry, A. Borning, A survey of value sensitive design methods, Foundations and Trends® in Human–Computer Interaction 11 (2017) 63–125. doi:10.1561/1100000015.

[10] R. J.
Wieringa, The design cycle, in: Design Science Methodology for Information Systems and Software Engineering, Springer Berlin Heidelberg, 2014, pp. 27–34. doi:10.1007/978-3-662-43839-8_3.

[11] R. Hunicke, M. LeBlanc, R. Zubek, MDA: A formal approach to game design and game research, in: Proceedings of the Nineteenth AAAI Conference on Artificial Intelligence, volume 4, San Jose, CA, USA, 2004.

[12] J. L. R. Robledo, F. N. Lucena, S. J. Arenas, Gamification as a strategy of internal marketing, Intangible Capital 9 (2013) 1113–1144. doi:10.3926/ic.455.