=Paper=
{{Paper
|id=Vol-3926/paper11
|storemode=property
|title=One Spell Fits All: A Generative AI Game as a Tool for Research in AI Creativity and Sustainable Design
|pdfUrl=https://ceur-ws.org/Vol-3926/paper11.pdf
|volume=Vol-3926
|authors=Tom Tucek,Kseniia Harshina,Georgia Samaritaki,Dipika Rajesh
|dblpUrl=https://dblp.org/rec/conf/exag/TucekHSR24
}}
==One Spell Fits All: A Generative AI Game as a Tool for Research in AI Creativity and Sustainable Design==
Tom Tucek¹,*,†, Kseniia Harshina¹,*,†, Georgia Samaritaki²,† and Dipika Rajesh³,†
¹ University of Klagenfurt, Klagenfurt, 9020, Austria
² University of Amsterdam
³ University of California, Santa Cruz
Abstract
This paper presents "One Spell Fits All", an AI-native game prototype where the player, playing as a witch, solves villagers’ problems
using magical conjurations. We show how, beyond being a standalone game, "One Spell Fits All" could serve as a research platform to
explore several key areas in AI-driven and AI-native game design. These areas include AI creativity, user experience in predominantly
AI-generated content, and the energy efficiency of locally running versus cloud-based AI models. By leveraging smaller, locally running
generative AI models, including LLMs and diffusion models for image generation, the game dynamically generates and evaluates content
without the need for external APIs or internet access, offering a sustainable and responsive gameplay experience. This paper explores
the application of LLMs in narrative video games, outlines a game prototype’s design and mechanics, and proposes future research
opportunities that can be explored using the game as a platform.
Keywords
Generative AI, AI-driven Gameplay, Local AI Models, AI and Creativity, Player Experience, Procedural Content Generation
11th Experimental Artificial Intelligence in Games Workshop, November 19, 2024, Lexington, Kentucky, USA.
Email: Tom.Tucek@aau.at (T. Tucek); Kseniia.Harshina@aau.at (K. Harshina); g.samaritaki@uva.nl (G. Samaritaki); dirajesh@ucsc.edu (D. Rajesh)
URL: https://github.com/YenR/OneSpellFitsAll (T. Tucek)
ORCID: 0009-0001-1277-1473 (T. Tucek); 0009-0001-4491-4619 (K. Harshina); 0009-0006-8467-2374 (G. Samaritaki); 0009-0007-5357-5560 (D. Rajesh)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073
1. Introduction
In "One Spell Fits All" (OSFA), players take on the role of a witch, capable of conjuring an infinite variety of items using text-based input (see Figures 1 and 2). NPC-villagers visit the witch with a wide range of problems, and the player is tasked with finding exactly one solution that solves all of their problems at once, using a single text input. The game uses locally running AI models to generate and evaluate game content, providing a sustainable alternative with lower latency compared to cloud-based AI services. Furthermore, we believe that this video game can serve as a research platform, as the game can provide an environment to investigate various aspects of generative AI, such as AI creativity, user experience for AI-generated content, and energy efficiency, all of which can be compared across different models or implementations. This paper discusses the game’s integration of generative AI tools and outlines the potential research directions that can be pursued using this platform.
2. Related Work
Incorporating AI into interactive narratives has been a long-term objective in video game research [1]. This goal has seen significant advancements with the recent surge in generative AI tools, particularly large language models (LLMs). Furthermore, the use of LLMs in video games has been shown to increase the amount and the quality of possible player interactions [2, 3, 4, 5].
Significant advances for generative AI integrated into games have been shown with examples like "1001 Nights" [6, 7], a game which explores dynamic content generation by creating images of weapons based on the stories told by the player. The authors of the game also argue for the use of the term "AI-native" games, as opposed to AI-powered games, for video games that use generative models as an integral core feature. We adopt this term to describe OSFA.
The integration of AI in video games has been explored in literature, with foundational texts such as those by Yannakakis and Togelius [8] providing a comprehensive overview of how artificial intelligence can enhance game design through techniques like procedural content generation (PCG), player modelling, and dynamic difficulty adjustment. OSFA builds upon these concepts by applying local AI models to create a responsive and sustainable game environment. Elements of PCG [9] have formed the base of OSFA, to showcase how AI can autonomously create complex and engaging game content.
3. Game Design and Mechanics
The core gameplay loop of OSFA revolves around the player’s text input. Taking on the role of a witch, players have to address the various needs of NPC-villagers by using magical conjurations. In each round, the system feeds a solution item from a list of predefined solutions to the LLM, which then reverse-engineers problems. These problems are then posed as requests from the villagers. Players have complete creative freedom regarding their input, which is funnelled into various models.
3.1. Gameplay Overview
In OSFA, an increasing number of villagers visit the player’s hut, each with a specific problem that needs to be solved. The player must provide a solution through text input, to conjure an item that solves all of the present villagers’ problems at once. The game progresses in turns, with each turn representing a new challenge through a new set of villagers. The player’s success is measured by their ability to satisfy the villagers, which is reflected in the game’s scoring system. Figures 1 and 2 showcase the in-game environment where the witch summons items based on player input (top-right) and the villagers voicing their problems.
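The turn structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the game's Unity implementation: `generate_problems_stub` and `evaluate_stub` are hypothetical stand-ins for the local LLM and the similarity check, and the scoring and villager-growth rules are invented placeholders, since the paper does not specify them.

```python
from typing import Callable, List, Tuple


def generate_problems_stub(keyword: str, count: int) -> List[str]:
    """Hypothetical stand-in for the local LLM: one problem per villager."""
    return [
        f"Villager {i + 1} has a problem that a {keyword} would solve"
        for i in range(count)
    ]


def evaluate_stub(player_input: str, keyword: str) -> bool:
    """Hypothetical stand-in for the similarity check: exact match only."""
    return player_input.strip().lower() == keyword.strip().lower()


def run_session(
    keywords: List[str],
    player: Callable[[List[str]], str],
    generate: Callable[[str, int], List[str]] = generate_problems_stub,
    evaluate: Callable[[str, str], bool] = evaluate_stub,
) -> Tuple[int, int, List[Tuple[List[str], bool]]]:
    """Play one turn per predefined solution keyword.

    Returns (score, villagers, log), where log holds the problems shown
    and whether the single player input solved them in each turn.
    """
    villagers = 1
    score = 0
    log = []
    for keyword in keywords:
        # The keyword is the hidden solution; problems are derived from it.
        problems = generate(keyword, villagers)
        # The player sees only the problems and answers with one text input.
        player_input = player(problems)
        solved = evaluate(player_input, keyword)
        log.append((problems, solved))
        if solved:
            score += villagers  # assumed rule: one point per satisfied villager
            villagers += 1      # assumed rule: success attracts one more villager
    return score, villagers, log
```

Passing the generator and evaluator as parameters mirrors the idea that each model can be replaced independently; a real LLM call or an embedding-based check would slot in without changing the loop.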
Figure 1: Screenshot of OSFA, showing villagers and their problems. (The problems read: "Catching many small fish for dinner" and "Collecting fallen apples from [a] large tree".)
Figure 2: Screenshot of OSFA, showing the witch conjuring a chair and the villagers being satisfied with the solution. Note that this is not the solution to the problems in the previous screenshot.
3.2. Problem and Solution Generation
One of the innovative aspects of OSFA is the use of AI to dynamically generate problems and evaluate solutions at runtime. The game currently employs the Mistral-7B-Instruct model, via the LLM for Unity library (https://github.com/undreamai/LLMUnity), to generate as many unique problems as there are villagers present, based on a provided keyword. This ensures that the challenges presented to the player are varied, contextually relevant, and have at least one correct solution for all problems. Once the player provides a solution, the game tests for validity and then uses the all-MiniLM-L6-v2 transformer model (https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) to evaluate the solution based on a similarity score to the original keyword. Future iterations of the game will explore the use of LLMs for evaluating the solution as well. The game then generates a visual representation of the solution, using a custom diffusion model (https://huggingface.co/megaaziib/aziibpixelmix) and ComfyUI (https://github.com/comfyanonymous/ComfyUI), which is then presented to the player and the villagers.
3.3. Villager Interactions
The interactions between the player and the NPC-villagers are central to the game’s narrative and mechanics. Villagers respond to the player’s solutions based on the AI’s evaluation, and their satisfaction is reflected in the game’s scoring system. The scoring system, in turn, affects how many villagers come back to the player. These interactions are designed to create a dynamic and responsive gameplay loop, in which the player must think creatively and strategically to meet all of the villagers’ needs.
3.4. Artistic Elements
OSFA features a distinct pixel art style that complements its AI-driven gameplay. The game’s art is a mix of free assets and AI-generated art, such as the sprite of the witch. The locally running model creates pixel art assets in real time based on player input, which match the rest of the game stylistically. Furthermore, the background music and the theme song for the main menu have also been created using generative AI tools.
4. Technical Workflow
Figure 3 illustrates the workflow of OSFA, where various AI components generate and evaluate game content dynamically in the game’s core loop.
The workflow is detailed as follows:
1. Villagers enter the witch’s store, bringing their unique problems for the witch (the player) to solve.
2. The game uses an LLM to generate a problem for each villager. This AI model with custom instructions takes a sentence (in our case, a keyword) as input and produces problems that can be solved with the keyword.
3. Each turn, the player must address all of the villagers’ problems at once. The player inputs text: an item to be conjured by the witch to solve the problems.
4. The game then evaluates the player’s solution using the all-MiniLM-L6-v2 transformer model. This model checks the similarity between the player’s input and the original keyword to determine if the problem has been correctly addressed. Note that this step could also be given to the LLM instead, which would then be able to evaluate the solution on a per-villager basis.
5. The game generates an image representing the solution. This image visually depicts the outcome of the player’s input.
6. The game then calculates the player’s score based on the success of their solution. The villagers give visual feedback on whether the solution was considered successful or not, leave the store, and a new loop begins.
5. AI Integration
OSFA uses multiple AI models running locally and at the same time to create a dynamic gameplay experience. The integration of these models is central to the game’s ability to generate problems, evaluate solutions, and provide visual feedback. This section gives more details on how various AI components are implemented and interact within the game. Notably, the game’s architecture was designed with modularity in mind, which means that each model can be swapped out separately.
The Mistral-7B-Instruct model, used as the LLM within the game, is responsible for generating the unique problems that each villager presents to the player. The model operates locally on the player’s computer, limiting energy consumption and eliminating the need for external API calls. The use
of such a local implementation can also reduce the latency between player input and feedback, as well as improve data privacy, as no data is sent to third parties.
Figure 3: Workflow diagram showing the game loop for OSFA and the interaction between AI components and user input in the game.
To evaluate the player’s solutions, OSFA integrates the all-MiniLM-L6-v2 transformer model. After the player provides a solution to a villager’s problem, the model calculates a similarity score compared to a ’perfect’ solution, thus determining whether the player’s input sufficiently addresses the problem. Currently, all villagers’ problems are evaluated at once, but future implementations will aim to evaluate them on a case-by-case basis (e.g., by using an LLM instead of the sentence-transformer model), thus allowing for partially correct answers as well.
Based on the player’s input, the conjured items are generated using a pixel-art diffusion model and a custom ComfyUI workflow. This tool is integrated into the game’s AI pipeline, allowing it to dynamically create pixel art assets based on the input in real time.
The AI models in OSFA are implemented within a framework that supports real-time interaction and processing. The models run locally on the player’s machine and the game was made using the Unity Game Engine, using scripts, plugins, and libraries to control the AI models. By running the AI components locally, the game minimizes latency and increases security through the lack of network communication, although the player’s hardware specification can still cause delays.
6. User Experience and Feedback
Initial testing indicates that players appreciate the dynamic problem-solving elements, though challenges such as model latency were encountered and mitigated through optimization techniques. The game was developed and showcased during a game jam event, where it won second place. More than ten people at a time engaged with the game, often collaborating to guess and influence the player’s decisions. This collective engagement fostered curiosity and a fun, cooperative atmosphere. Players found it challenging to discover the correct solutions but remained persistent, continually attempting new ideas. When the villagers left satisfied with a solution, it elicited cheers and a sense of accomplishment among the group. Participants particularly enjoyed the generated images, highlighting them as a standout feature. However, some areas of improvement were noted. There was ambiguity when players felt their solutions were appropriate but unacknowledged by the system. Enhancing the feedback mechanism to provide clearer guidance when solutions are rejected could significantly improve user experience.
7. Future Work
Future work will focus on evaluating and enhancing the AI-driven creativity within the game, assessing user experience and satisfaction through comprehensive playtesting, and analyzing the cost-benefit ratio of using local AI models compared to cloud-based solutions. We aim to ensure an engaging and satisfying player experience while at the same time promoting sustainable AI practices in game development.
7.1. AI Creativity
Creativity, especially in the context of generative AI, is a rich and interdisciplinary research field [10, 11, 12]. As OSFA uses an LLM to generate villagers’ wishes and requests based on specific keywords, we believe that it opens up new possibilities for exploring AI creativity. The ability of AI to generate novel and varied problems from similar inputs can be regarded as a form of creativity, presenting several interesting research questions. The first step could be defining and quantifying AI creativity in the context of the game prototype. One approach could be to develop a creativity score to evaluate the LLM’s performance in generating unique and contextually appropriate problems. This score would be based on the similarity of problems generated by the AI across various playthroughs and different
versions of the game.
To explore this, data on AI-generated problems needs to be collected across multiple sessions and then compared against several types of solutions: those manually crafted by human designers, those generated solely by the LLM, and a mix of human and AI-generated content. This comparison could reveal how closely AI outputs align with expected creative standards, as well as how AI can complement human creativity.
In addition to a creativity score, other metrics can help to better understand and measure AI creativity. Novelty can assess how often the AI produces entirely new problems not seen in previous playthroughs. Diversity measures the range of different problem types generated in response to similar keywords within a single playthrough. Consistency can evaluate the AI’s ability to maintain high-quality, contextually relevant problems across different scenarios. Adaptability assesses how well the AI adjusts its problem generation to varying game contexts or player actions, ensuring that the problems remain appropriate and challenging.
This leads to another question: how do different LLMs compare in terms of their creative outputs? By systematically comparing these metrics across various LLMs, we could gain insights into which models are better suited for generating creative content in games. This could also inform the future development of AI systems designed specifically for creative tasks. Developing these metrics and comparing different models should deepen our understanding of how AI can be used to generate novel and engaging content. This research could have broader implications for AI applications in creative industries, such as game design, storytelling, and digital art.
7.2. User Experience and Satisfaction
Understanding user experience (UX) and satisfaction can help in evaluating the success of OSFA, especially given its integration of AI-generated content. This section explores methods to assess how players interact with and perceive the game, particularly in the context of its AI elements. One approach to evaluating UX in OSFA is heuristic evaluation. By having experts review the game based on established usability principles, it’s possible to identify both strengths and potential issues within the game. Key considerations include ensuring that AI-generated content maintains consistency and adheres to the game’s internal rules.
Beyond heuristic evaluation, another important aspect is understanding how players’ awareness of AI-generated content influences their enjoyment. It is worth investigating whether players perceive the AI elements (referring to both the pre-generated content such as music and visuals, as well as dynamically generated content) as positive additions to the game or if knowing that most of the content is AI-generated affects their perception of the game’s creativity and quality. For instance, surveys or interviews could be conducted to explore whether this awareness impacts players’ overall satisfaction and engagement and how AI-driven design challenges their expectations. In summary, understanding user experience and satisfaction in a predominantly AI-generated game like OSFA can help answer other important research questions regarding players’ perceptions of AI content.
7.3. Energy Efficiency of Local AI Models vs. Cloud-Based Solutions
As AI-native games like OSFA continue to develop, an important question arises: How does the utilization of local AI models compare to cloud-based solutions in terms of energy efficiency? This question is particularly relevant for our game, which operates AI models locally. While it is generally accepted that running smaller LLMs locally would be faster and more power efficient than remote calls to larger cloud-based models (such as GPT-4), this claim still warrants further investigation, especially in terms of broader scalability and environmental impact. A more detailed comparative analysis could explore the trade-offs between local and cloud-based solutions, considering the quality of the generated content, latency, and also the total energy consumption involved. This includes power requirements for both local machines and cloud-based servers, as well as the energy costs of data transmission between clients and remote servers. The goal would be to determine which approach offers greater energy efficiency, considering factors such as scalability, environmental impact, and the trade-offs between performance, quality, latency, and energy use.
Understanding these differences can have significant implications for game development. If local AI models prove to be more energy-efficient while guaranteeing sufficient quality of experience, they could become the preferred choice for sustainable game design. However, if their quality remains lacking, and cloud-based solutions offer better efficiency, this will influence how AI resources are deployed in future projects as well. Exploring the energy efficiency of AI models is important for many reasons, not only for improving game performance but also for advancing sustainability in the AI and video game industry. By comparing the energy use of local AI models with server-based and cloud-based solutions, this research could guide future developments in AI-driven game design, promoting more environmentally conscious practices.
8. Conclusion
OSFA demonstrates the potential of locally-run AI models in game development, offering both an engaging gameplay experience and a versatile platform for research. By reducing reliance on cloud-based services, the game opens up new avenues for studying AI creativity, UX, and energy efficiency. This paper outlines the game’s design and mechanics, while also proposing several research directions that can be explored using the game as a platform, thus contributing to the broader discourse on AI in games and digital creativity.
References
[1] M. O. Riedl, V. Bulitko, Interactive narrative: An intelligent systems approach, AI Magazine 34 (2013) 67–77.
[2] T. Ashby, B. K. Webb, G. Knapp, J. Searle, N. Fulda, Personalized quest and dialogue generation in role-playing games: A knowledge graph- and language model-based approach, in: Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 2023, pp. 1–20.
[3] S. Värtinen, P. Hämäläinen, C. Guckelsberger, Generating role-playing game quests with GPT language models, IEEE Transactions on Games (2022).
[4] Y. Wang, Q. Zhou, D. Ledo, Storyverse: Towards co-authoring dynamic plot with LLM-based character simulation via narrative planning, arXiv preprint arXiv:2405.13042 (2024).
[5] Latitude, AI Dungeon, https://play.aidungeon.io/, 2019. Accessed: 2024-09-27.
[6] Y. Sun, Z. Li, K. Fang, C. H. Lee, A. Asadipour, Language as reality: A co-creative storytelling game experience in 1001 nights using generative AI, Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment 19 (2023) 425–434. URL: https://ojs.aaai.org/index.php/AIIDE/article/view/27539. doi:10.1609/aiide.v19i1.27539.
[7] Ada Eden, 1001 Nights AI: Discover generative stories, 2024. URL: https://www.1001nights.ai/, accessed: 2024-08-30.
[8] G. N. Yannakakis, J. Togelius, Artificial intelligence and games, Springer, 2018.
[9] D. Gravina, A. Khalifa, A. Liapis, J. Togelius, G. N. Yannakakis, Procedural content generation through quality diversity, in: 2019 IEEE Conference on Games (CoG), IEEE, 2019, pp. 1–8.
[10] J. Rowe, D. Partridge, Creativity: A survey of AI approaches, Artificial Intelligence Review 7 (1993) 43–70.
[11] R. Wingström, J. Hautala, R. Lundman, Redefining creativity in the era of AI? Perspectives of computer scientists and new media artists, Creativity Research Journal 36 (2024) 177–193. doi:10.1080/10400419.2022.2107850.
[12] L.-C. Lu, S.-J. Chen, T.-M. Pai, C.-H. Yu, H.-y. Lee, S.-H. Sun, LLM discussion: Enhancing the creativity of large language models via discussion framework and role-play, arXiv preprint arXiv:2405.06373 (2024).