Tackling Generation of Combat Encounters in Role-playing Digital Games

                                         Matouš Kozma1 , Vojtěch Černý1 , and Jakub Gemrot1

                                         Faculty of Mathematics and Physics, Charles University
                                             Ke Karlovu 3, Praha 2, 121 16, Czech Republic
                                    mattkozma@hotmail.com, {cerny,gemrot}@gamedev.cuni.cz

Abstract: Procedural content generation (PCG) has been                    be described as how to select a group of monsters that a
used in digital games since the early 1980s. Here we fo-                  player should face while keeping the game engaging for
cus on a new problem of generating personalized combat                    the player.
encounters in role playing video games (RPG). A game                         One of the current widely accepted psychological mod-
should provide a player with combat encounters of ade-                    els of engagement in games is flow [3, 4]. Csikszentmiha-
quate difficulties, which ideally should be matching the                  lyi defines flow as:"A state in which people are so involved
player’s performance in order for a game to provide ade-                  in an activity that nothing else seems to matter; the expe-
quate challenge to the player. In this paper, we describe                 rience is so enjoyable that people will continue to do it
our own reinforcement learning algorithm that estimates                   even at great cost, for the sheer sake of doing it.” [5]. It
difficulties of combat encounters during game runtime,                    is thought that a player can enter the flow state only if a
which can be them used to find next suitable combat en-                   game supplies a challenge that is adequate to the player’s
counter of desired difficulty in a stochastic hill-climbing               ability [6]. Wrt. player skill, if the challenge is too high,
manner. After a player finishes the encounter, its result is              a player may experience anxiety, fear of failure or frustra-
propagated through the matrix to update the estimations                   tion, while if the challenge is too low, a player may expe-
of not only the presented combat encounter, but also simi-                rience routine, boredom or loss of interest (Fig. 1).
lar ones. To test our solution, we conducted a preliminary
study with human players on a simplified RPG game we
have developed. The data collected suggests our algorithm
can adapt the matrix to the player performance fast from
little amounts of data, even though not precisely.
Keywords: Procedural generation, Video games, Role-
playing games, Encounter generation, Difficulty adapta-
tion
                                                                          Figure 1: Relation of the player’s skill to the challenge
1    Introduction                                                         provided by the game, depicting the flow channel area.

Many games today implement some form of procedural                           Focusing on RPG combat encounters, the challenge of
content generation (PCG) [1, 2], where some parts of a                    their design is how to compose a group of monsters play-
game are generated by an algorithm. PCG is a colloquial                   ers should face in order for them to stay in the flow chan-
term referring to creation of digital content through rules               nel (Fig. 1). This is problematic as the player’s ability is
instead by hand. In games, it has been used for generating,               not known when the game starts. Whereas we can fathom
e.g. dungeon level layouts (Rogue by Michael Toy and                      how difficult a given combat encounter is, e.g., by statis-
Glenn Wichman, 1980) to entire galaxies (Elite Danger-                    tics over monsters combat values (total health, damage per
ous by Frontier Developments, 2015). This can greatly re-                 second, etc.), we do not know how proficient a player is at
duce the work required from designers or artists. It can im-              first or how fast they can improve in playing the game. We
prove the replay value of the game, as, e.g., the levels, dun-            propose to solve this problem using reinforcement learn-
geons, characters, or even dialogues can be generated dur-                ing, i.e., to create a system that given the observation of
ing playtime to vary a player’s game-play experience every                player’s performance in the game can generate next com-
run. This paper deals with the PCG of combat encoun-                      bat encounters of appropriate difficulty.
ters for role-playing games (RPGs). Combat encounter in                      The rest of the paper is structured as follows. In Section
RPG refers to any situation in which a player, who is in                  2, we examine work related to ours. In Section 3, we for-
control of virtual characters usually called heroes, faces an             mulate the combat encounter generation problem (CEGP)
opposition of enemy non-player characters (NPC) usually                   and requirements on its solutions, namely the ability to
called monsters, which result in a combat between char-                   adapt to concrete game instances and the fact the solu-
acters. The problem of combat encounter generation can                    tion must be able to adapt to the player fast. In Section
     Copyright ©2021 for this paper by its authors. Use permitted under   4, we detail our approach tackling CEGP. In Section 5, we
Creative Commons License Attribution 4.0 International (CC BY 4.0).       present an implementation of our approach for an example
RPG game we have created to test it. In Section 6, we de-       3   The Problem
scribe and discuss the results of an experiment with human
subjects that was conducted to test our approach. Finally       Here we define several terms commonly used in RPG
in Section 7, we conclude the paper with notes on future        games, which we will use throughout the paper; defined
work.                                                           terms are in italics. Some terms are formulated in an open-
                                                                ended manner as different games will differ in details.
2     Related Works                                                For the purpose of this paper, we will understand a game
                                                                as a double (W, S) of a game world W and a game state S.
                                                                   A game world as a double (L, P) of game locations L
While procedural content generation in games is a well-
                                                                and actor prototypes P. A game state consists of spawned
studied field, it seems that existing research does not focus
                                                                actors only.
on generating combat encounters. The works we found
about content generation in RPGs focus either on generat-          Game locations L is a set of all possible positions in
ing levels [7, 8], or quests [9, 10].                           the game world, which actors may occupy. In some games
                                                                locations could be a grid of discrete tiles, whereas in others
   We tried looking at other genres for inspiration, yet we
                                                                it could be a more complex 3D structure. The number
were unable to find any relevant research. It seems that
                                                                of these atomic locations can be in thousands per game
even in genres such as platformers[11] or roguelikes[12]
                                                                level for grid-based games, or infinite in case of 3D worlds
the main focus is on generating the level, not on generating
                                                                where location is represented by floating point numbers.
monsters or combat encounters in general.
                                                                   An actor prototype, p ∈ P, is a prototype of a character
                                                                that can be spawned as an actor into the game. An actor
2.1   Dynamic Difficulty                                        prototype is a tuple of all the attribute values that describe
                                                                the class of an actor, e.g., its health, damage per second,
The encounters we generate must be appropriately diffi-         size, skills, etc. These attributes are game specific, but we
cult for the player. Therefore, techniques for runtime ad-      assume health and damage per second to be present at least
justment of difficulty might be relevant to us.                 to allow for combat between actors.
   Xue et al.[13] describe the results of adding dynamic           Actor, a = (pa , sa , la , ca ), is a spawned actor prototype
difficulty to a match-three game. For our purposes, the         that can be present within the game and a part of the game
most important conclusion is that we should attempt to          state. It consists of actor prototype pa definition, its cur-
maximize engagement throughout the entire game, not fo-         rent state sa of pa attributes (initial values set according to
cus on creating encounters of some specific difficulty.         pa ), location la ∈ L an actor occupies in the world and a
   Missura and Gärtner[14] introduced an algorithm that         controller ca , which is either a human or an AI. We label
could split players into groups and then generate chal-         the set of all possible actors as A.
lenges specifically for that group. However, this approach         A hero(r) ::= {r ∈ A | cr = human} is an actor a player
required a lot of data about many players playing the game,     currently controls. A party(R) ::= P({r ∈ R | hero(r)}) is
which we do not have, so this approach could not be used.       a set of heroes the player may control. All such parties are
                                                                denoted as ρ.
2.2   Algorithm Evaluation                                         An enemy(e) ::= {e ∈ A | ce = AI} is an actor a player
                                                                does not control and the party has to fight. enemies(E) ::=
As the goal of the algorithm will be generating encounters      P({e | enemy(e)}) is then a group of enemies a player may
suitably difficult for the players, its evaluation must in-     face in the game, all such enemy groups are denoted as ε.
volve real people playing the encounters generated by the          A combat area, O = {l ∈ L | ∀l1 , l2 ∈ O : path(l1 , l2 ) ⊆
algorithm. We have chosen to also measure the player’s          O}, represents some continuous area in the game world
feeling of flow [15] as the measure of quality for the algo-    where some combat encounter may take place.
rithm.                                                             A combat encounter is a triple, c = (Ec , Rc , Oc ), that
   In psychology, flow is a state where the person is fully     represents a situation where a party must defeat some en-
concentrating on some task. It is also known as being “in       emies. It consists of enemies Ec the party Rc has to fight
the zone”. A person in a flow state loses track of time and     inside some combat area Oc . The set of all possible combat
she focuses completely on the activity she is doing. To         encounters is labelled as C. In particular, actor prototypes
enter the flow state, the person must feel competent at the     are not the only part of c as also their runtime states are
task and the task must be appropriately difficult for her.      included. As such, C can be thought of as game states in
   To measure the player’s feeling of flow, we use the Flow     the space of combat; as actors are exchanging damage, this
Short Scale [16, 17]. This survey measures the flow on a 7-     space is traversed until either side is eliminated (their total
point scale and a person’s perceived difficulty of the task.    health reaches zero).
However, these data were not conclusive and we do not re-          We model the player y = (sy ), as a single number sy ∈
port them and focus on difficulty estimation of encounters      [0; 1] denoting the skill of the player. 0 means no skill,
only.                                                           which can be thought of as not issuing any actions to
heroes during any combat encounter. 1 refers to an opti-        seen as training data, provided during the gameplay. How-
mal play, i.e., player can achieve the best outcome possible    ever, it is not realistic to expect a player to play even ten
regardless the encounter given. We denote set of all kinds      tutorial encounters just to calibrate the model of the player,
of player abilities as Y .                                      which is still very few still given the size of C.
   We define the difficulty of an encounter c ∈ C as a func-
tion D : C ×Y →< 0; 1 >, where its value states how many
percent of party health will be lost during encounter as        Video game context. RPG games are (typically) played
played through by the player of given skill. 0 means the        in real-time and there are usually several to tens of com-
party Rc will defeat enemies Ec at Oc without any health        bat encountered prepared by game designers in each game
losses and 1 means the party Rc will certainly be elimi-        level. Therefore, the algorithm solving the CEGP must
nated by the enemies Ec regardless player actions. Impor-       compute fast in order not to interrupt the gameplay and it
tantly, there can be encounters that cannot be won even         should not consume an unnecessary amount of memory.
by optimal players (enemies are too strong for the party to
defeat).                                                        4     Methodology
   The combat encounter generation problem (CEGP) is
than an optimization problem: given a set of all possible
                                                                Our current approach to solving the CEGP is to use re-
combat encounters C for a given party R, a player y, and
                                                                inforcement learning for estimating the difficulty function
a desired difficulty d, find cmin ∈ C such as ∀c ∈ C : |d −
                                                                D, while heuristically reducing the size of the combat en-
D(cmin , y)| ≤ |d − D(c, y)|.
                                                                counters set C. If we can contract the set, it would make
                                                                both the estimation of D and the search for suitable c ∈ C
3.1   Problem Complexity                                        much easier. For looking up an encounter of some required
                                                                difficulty d, we are softening the requirement of optimal-
The CEGP consists of two sub-problems: 1) how to eval-          ity and use stochastic hill-climbing to look for a combat
uate the difficulty D; 2) even if D is known, how to search     encounter c of difficulty that differs up-to δ from d, i.e.,
for appropriate c ∈ C fast. The challenge of solving them       |d − D(c, y)| < δ .
stems from the problem formulation itself (e.g., the size
of C) and from the video game context, which poses addi-
tional requirements on the CEGP solution.                       4.1   Estimating Encounter Difficulties

                                                                Looking at a combat encounter definition c = (Ec , Rc , Oc ),
Difficulty evaluation. The skill of a player is not known       we disregard Oc (simplification) and maintain a sparse dif-
beforehand. As the result, we cannot compute precise val-       ficulty matrix M indexed by enemies and a party. The ma-
ues of D even via simulation. Moreover, player’s skill          trix then holds estimated difficulties of a combat encoun-
can change throughout the game as the player improves in        ters Mer given enemies Ee and the party Rr for the player
playing the game. Therefore, once observed difficulty of        y that is currently playing the game.
a concrete c may become irrelevant in future as the player         The size of the matrix is huge. Adapting only single ma-
skill changes. Finally, if the game is played in real-time,     trix element after each encounter would not result in fast
the skill may fluctuate as player’s reflexes improve.           adaptation to the player. Therefore, we propose to adapt
                                                                multiple values from the matrix given a single observation
                                                                using some difficulty adjustment function.
Problem size. Considering solving the CEGP for
                                                                   A difficulty adjustment function, which we denote by
the game Dragon Age: Origins (Bioware, 2009)
                                                                ∆(dm , Em , Rm , do , Eo , Ro , α) should return a new difficulty
featuring 140 enemy actor prototypes (according to
                                                                value for an encounter between enemies Em and the party
https://dragonage.fandom.com) trying to generate an op-
                                                                Rm , which was thought to be of dm difficulty, given the ob-
timal encounter that consists of about 7 enemies in the
                                                                served difficulty do from just finished encounter between
game, we would need to go through 1407 possible groups
                                                                enemies Eo and the party Ro ; α is a learning factor.
of enemies. Moreover, every enemy group can be spawned
                                                                   Given the observation of do for some encounter
into the game to a different locations, which could af-
                                                                c(Eo , Ro , Oo ), we update each matrix element with Mer =
fect difficulty of the encounter greatly. E.g., if a group of
                                                                ∆(Mer , Ee , Rr , do , Eo , Ro , α).
archers is spawned behind a strong knight that is blocking
                                                                   To be able to construct ∆, we define two heuristic func-
the path towards them, it would pose an extra challenge
                                                                tions, which computes similarities of enemies and parties.
to the player than if the group of archers could be easily
                                                                Enemies similarity function is defined as SE : ε ×ε → [0; 1]
reached. The size of C is thus enormous.
                                                                and party similarity function is defined as SR : ρ × ρ →
                                                                [0; 1]. 0 value means total difference, while 1 means equal-
Training data. The only direct way to model skill of a          ity of elements.
player playing an instance of the game is through observa-         We then compute ∆(dm , Em , Rm , do , Eo , Ro , α) = dm +
tion of his combat encounters. Such observations can be         (do − dm ) ∗ max{0, 1 − SE (Em , Eo ) − SR (Rm , Ro )} ∗ α. The
 1: procedure U PDATE M ATRIX (M, do , Eo , Ro , α)               1: procedure G EN E NCOUNTER (R, O, d, M, δ , itermax )
 2:    for each (Em , Rm ) ∈ M do                                 2:    i=0
 3:        dm = M[Em , Em ]                                       3:    Ebest = Ecurr = {pick random enemy for O}
 4:       M[Em , Em ] = ∆(dm , Em , Rm , do , Eo , Ro , α)        4:    δbest = δcurr = |M[Ecurr , R] − d|
 5:    end for                                                    5:    while δ ≤ δbest & i < itermax do
 6: end procedure                                                 6:        if M[Ecurr , R] − d ≤ 0 then
                                                                  7:            remove random enemy from Ecurr
Figure 2: Pseudocode of the method updating difficulty            8:        else
matrix M after each combat encounter result. Inputs are:          9:            add random enemy for O into Ecurr
M - difficulty matrix, do - observed difficulty of some en-      10:        end if
counter that just finished, Eo - initial enemies of the en-      11:        δcurr = |M[Ecurr , R] − d|
counter, Ro - party that entered the encounter, α - learning     12:        if δcurr ≤ δbest then
rate.                                                            13:            Ebest = Ecurr
                                                                 14:            δbest = δcurr
                                                                 15:        end if
function updates the estimation of its error scaled accord-      16:        i++
ing to the similarity of enemies and the party of the matrix     17:    end while
element to real enemies and party participating in combat        18:    return c(Ebest , R, O)
that was just observed (and multiplied by the learning rate      19: end procedure
α).
   Updating matrix is then done by updating all of its ele-      Figure 3: Pseudocode of the method for generating an en-
ments (Fig. 2) using ∆ function. This is the place where         counter. Inputs are: R - current party, O - combat area, for
similarity functions are used the most as they allow us to       which we generate the encounter, d - required difficulty,
update multiple elements of the matrix.                          M - the difficulty matrix, δ - difficulty tolerance, itermax -
                                                                 maximum number of search iterations
4.2   Difficulty Matrix Initialization
                                                                 5     Implementation
Ideally, initialization of M should be based on real-data
by letting players play the game to obtain observations.         In order to test our approach, we implemented a simple
However, the amount of required observations is usually          PC game of the RPG genre, which is mouse-controlled.
so high that it is probably unreasonable to require it even      Its design is inspired by gameplay mechanics of well-
from big video game making companies. Instead, we pro-           known, if a bit dated, RPG games such as Baldur’s gate
pose to implement a simple reactive player that can play         (BioWare, 1998) and Planescape Torment (Black Isle Stu-
through combat encounters and use it to obtain first dif-        dios, 1999). A player is leading a party of three preset
ficulty estimations for the matrix. We do this only for a        heroes through a dungeon, which is a procedurally gen-
selected subset of enemies and parties.                          erated maze consisting of rooms, which constitute com-
                                                                 bat areas where encounters with enemies take place. We
                                                                 tried to design the game to be fun and to have a certain
4.3   Generating an Encounter                                    gameplay depth. Heroes represent three RPG archetypes:
                                                                 knight (high health, melee attack only), ranger (low health,
Having matrix M estimating D for player y, enemies and           strong ranged attack) and cleric (healer and area con-
party similarity functions SE , and SR we can implement          troller). Each hero, apart form the attack action, has three
stochastic hill-climbing algorithm for finding a combat en-      different skills they can use to affect the fight. A player
counter to present as a challenge to the player.                 can level heroes between encounters, i.e., improve their
   The algorithm (Fig. 3) is getting the party R, the combat     health and dealt damage per second. For enemies, there
area O and required encounter difficulty d as an input. It is    are 4 archetypes. Some of them having multiple instances
supposed to find a suitable set of enemies E in order to re-     for variety. Each archetype is designed to behave differ-
turn an encounter c(E, R, O), whose difficulty differs from      ently in the combat in order to prolong the learning curve
a required d by an amount, which is smaller or equal to          for the player. Additional information can be found in a
δ . It works in a stochastic hill-climbing manner. It keeps      previous master’s thesis [18]. Due to the time and budget
adding/removing random enemies to/from the E trying to           constraints, we used some free and some paid sound and
match required d. For estimation of D, it queries passed         graphical assets of the pixel-art style (Fig. 4).
difficulty matrix M. If particular element M[E, R] is not           The combat. Every time the party enters a new room,
present within M, it uses similarity functions SE , SR to pick   i.e., opens the door, an encounter is generated inside given
the most similar element of M. For practical real-time con-      the difficulty assigned by the designer (us) to the room.
straints, the algorithm does not perform more then itermax       Then, the party enters, the door closes behind it, and the
iterations before terminating.                                   party has to fight spawned enemies until one of the sides
                                                                Figure 5: Visualization of initial values of difficulty matrix
                                                                of the game. The x-axis is the party ordered according to
                                                                their estimated strength, the y-axis represents enemies or-
                                                                dered according to their estimated strength. Colors depict
                                                                different difficulty categories: green is [0;0.5), yellow is
                                                                [0.5;1.5), red is [1.5;2.5), black is [2.5;3].

                                                                initial estimation of encounter difficulties. Our only hy-
                                                                pothesis therefore was that the algorithm would be able to
Figure 4: Cropped game screenshot with labels overlay
                                                                adapt initial difficulty matrix to the ability of players, i.e.,
depicting important game elements.
                                                                it would provide better difficulty estimates.

is eliminated. The combat happens in real-time, though          6.2   Design
the player is able to pause it anytime to analyze the sit-      Each participant was given the build of the game to play.
uation and reconsider their heroes’ actions. Enemies are        The data about encounters was collected by the game itself
driven by a simple reactive AI, that chooses their attack       and stored within online database. The game consist of
according to a script associated with the enemy archetype.      three levels.
Heroes also attack automatically if player does not inter-         1. Tutorial level. A short and easy level with 4 combat
vene, i.e., orders them to use some skill, changes their tar-   encounters, which were fixed and created by us. The level
get or moves them around the room.                              existed for two reasons. First, it was designed to teach the
   The difficulty range. As the party consist of 3 heroes,      player the basic mechanics of the game. Second, while
we modified the range of difficulty to be [0;3]; a difficulty   the algorithm was not generating enemies in this level, it
was then the summed portions of health points lost during       was still estimating the difficulty of these static encounters
the encounter.                                                  and was updating the matrix. Therefore, after the tutorial
   Difficulty matrix. To generate the initial sparse matrix,    level the matrix could have matched the player skill more
we generated 83572 scenarios, in each of which a random         closely.
party was fighting a random group of monsters. In these            2. Static levels. Two levels with encounters fixed and
scenarios the party was controlled by a very simple script      created by us. Unlike the tutorial level, the difficulty ma-
doing random actions. This proved to be a better approx-        trix was not being modified during this phase.
imation of player behavior than a script making tactical           3. Generated levels. Two levels with encounters being
decisions, which was too strong and produced estimations        generated according to the matrix. The matrix got updated
suitable for a skillful player rather then a beginner. The      after every combat encounter.
results of these scenarios were recorded and used as the           Participants were split into two even groups A and B
initial matrix to be modified at runtime. We set the learn-     randomly. Group A was called static-first as their mem-
ing rate α = 0.75 for UpdateMatrix method (Fig. 2).             bers played the game in the sequence 1-2-3. Group B was
   Visualization of the initial difficulty matrix is provided   called generated-first as their members played the game
in Fig. 5.                                                      in the sequence 1-3-2. The difference wrt. to difficulty
                                                                estimation was that for static-first group, there was a gap
6     Experiment                                                in the matrix adaptation between levels 1 and 3. Here we
                                                                were interested to see if there is going to be difference be-
To test our combat encounter generator we conducted a           tween groups as participants of the static-group should be
pilot experiment with human subjects playing the game.          more experienced when they enter the third level.

                                                                6.3   Data Collected
6.1   Hypothesis
                                                                We collected both subjective categorical data from partic-
As there is no prior work to compare our results to, we         ipants (both quantitative and qualitative) and objective nu-
designed the experiment to be exploratory only. We were         merical data concerning combat encounters, collected au-
interested to see if the algorithm can adapt initial estima-    tomatically by the game. Originally, we were also inter-
tion of combat difficulties to the player and improve the       ested how would the players rate the game and collected
                                                                we measured errors of difficulty prediction as the differ-
              Table 1: Participants profile.
                                                                ence between predicted difficulty and observed difficulty
          Group Size          Male       Female
                                                                (negative value means the encounter was more difficulty
          A        5          4          1                      then it should have been). Data for groups A and B re-
          B        6          5          1                      spectively are summarized by Fig. 6 and Fig. 7.

data on flow via the Flow Short Scale [16, 17] calibrated
questionnaire [18], however these data was not conclusive.
   Regarding the encounters, for each encounter in levels
1 and 3, the game recorded the enemies spawned, esti-
mated difficulty by the initial version of the matrix and the
current version of matrix and the result of the encounter,
which is the final observed difficulty. We also generated a
visualization of the difficulty matrix state after every ma-
trix update for every participant.


6.4   Participants

The experiment was carried out during COVID-19 out-
break in May, 2020. Therefore, all experiments happened
online in an uncontrolled environment, typically partici-
pants’ homes. We had 22 participants out of which a total
of 11 participants were not able to finish the level 3 (ei-
ther because of a bug in a game, lack of time or not being
able to understand all game mechanics) and thus we de-          Figure 6: Difficulty prediction errors for group A; left part
cided to discard their data from the analysis. In total, we     depicts value for adapted matrix and the right part for the
analysed data from 11 participants (Table 1); all but two       initial state of the matrix.
participants reported they play games more than 2 hours
per week (majority more then 10), but these two reported
that they played games regularly in the past. We consider
both groups to consist of experienced game players only.
Participants were split into groups randomly, they were of
similar age (20-25).


6.5   Results and Discussions

Even though all the participants finished level 3, they
played a different amount of combat encounters. This was
the result of the difficulty we preset for some rooms, where
some participants have their parties eliminated and were
thus forced to replay the level.
   Together, we assembled 260 combat encounter observa-
tions. The first 11 observations are irrelevant as they come
from the first combat in level 1 where the matrix has been
in the initial setting. In total, we analyse 249 combat en-
counters for which we can observe the differences between
the adapted matrix (referred to as ADAPTED in graphs)
and its initial version (referred to as INIT in graphs). We     Figure 7: Difficulty prediction errors for group B; left part
mainly analysed with estimation errors, i.e., difference be-    depicts value for adapted matrix and the right part for the
tween difficulty prediction of a matrix and the real diffi-     initial state of the matrix.
culty observed within the game .
   To visualize the data, we are using R version 4.0.5             For both groups, mean values between ADAPTED and
together with the package ggstatsplot v0.8.0 and its            INIT differ. Interestingly, mean of INIT matrix errors for
ggstatsplot for group comparisons.                              group A is higher then in group B. This is probably the
   First, we were interested to see whether the algorithm       effect of "static-first" ordering in group A. Majority of
was able to provide better difficulty estimations. For that,    data collected for group A happens only after participants
played level 2. As the result, they are more skilled and
INIT matrix is more off then for group B.
  Then we looked how errors of ADAPTED matrices be-
tween groups looked like (Fig. 8).


                                                               Figure 9: Difficulty prediction differences between
                                                               ADAPTED and INIT matrices for both groups respec-
                                                               tively. Positive number means the INIT matrix pro-
                                                               vided better prediction whereas negative number means
                                                               ADAPTED matrix provided better prediction.
Figure 8: Difficulty prediction errors of ADAPTED matri-
ces between groups.
                                                               Table 2: Errors by participant. Columns: participant group
   Interestingly they are not much different, which is a bit   and ID, number of encounters observed, mean and std.
surprising. Given the previous observation, that INIT ma-      dev. of ADAPTED matrix errors, mean and std. dev. of
trix has a larger error for group A, we expected the same      INIT matrix errors, how many times was ADAPTED ma-
would be true for ADAPTED matrices between groups A            trix prediction better then INIT matrix prediction.
and B. However, that’s not the case. It seems that missing        Group, #Enc Err.                 Err. INIT       Err.
observation from level 2 in group A did not influence the         ID                ADAPTED                        A<I
ability of the algorithm to adapt to boosted players’ abil-       A-1       16      0.38 ± 0.89 0.82 ± 0.98        8
ities. The only difference, visually speaking, is the data        A-2       16      0.14 ± 0.77 0.56 ± 0.78        10
distribution, which is more packed for group B.                   A-3       20      0.14 ± 0.22 0.61 ± 0.92        13
   Finally, we checked the error differences between              A-4       44      0.21 ± 0.96 1.46 ± 1.09        35
ADAPTED and INIT matrices for both groups respec-                 A-5       24      0.51 ± 0.81 1.89 ± 1.05        20
tively (Fig. 9).                                                  B-1       16      0.22 ± 0.61 0.89 ± 0.79        12
   Interestingly, the analysis shows that the adapted ma-         B-2       33      0.19 ± 0.78 0.72 ± 0.84        20
trix provided correct predictions more frequently (173/249        B-3       20      0.40 ± 0.76 1.23 ± 0.83        17
times in total 69.5%, 86 / 120 times for group A 71.7%,           B-4       25      0.25 ± 0.89 0.72 ± 0.94        16
85 / 129 times for group B 65.9%). Also means are dif-            B-5       16      0.08 ± 0.53 0.13 ± 0.58        11
ferent for both groups. We think this is the influence of         B-6       19      0.08 ± 0.01 -0.08 ± 0.59 11
statics-first vs. generated-first level again as INIT matrix
had larger error for group A due to longer gameplay. Re-
sults per participants are summarized in Table 2.
                                                               games, in which CEGP arises. Moreover, if the game is
   We do not comment on statistical significance of find-
                                                               not solving the CEGP during runtime, it must be solved
ings as reported by graphs (e.g. Welch test and others) as
                                                               by designers by hand during design time. We proposed an
not all samples are independent because multiple samples
                                                               approach for solving CEGP consisting of two parts: a) an
(estimation errors) are grouped according to players.
                                                               estimator of combat encounters difficulties, which needs
                                                               to be able to adapt to players’ ability online, and b) a gen-
7   Conclusion                                                 erator that can propose next encounters given the estima-
                                                               tion of their difficulties, which need to generate encoun-
In this paper, we discussed a new procedural content gen-      ters fast. Our first implementation is making use of a) a
eration problem: the Combat Encounter Generation Prob-         custom reinforcement learning algorithm, which attempts
lem. We proposed an open-ended formalization of RPG            to estimate difficulties of combat encounters despite the
limited number of observations, and b) utilize stochastic          [7] R. Van Der Linden, R. Lopes, and R. Bidarra, “Procedu-
hill-climbing for finding a combat encounter of required               ral generation of dungeons,” IEEE Transactions on Com-
difficulty. We presented and discussed preliminary results             putational Intelligence and AI in Games, vol. 6, no. 1, pp.
of the pilot study with human subjects. Even though the                78–89, 2013.
number of players is small, which is limiting the results,         [8] O. Nepožitek, “Procedural 2d map generation for computer
the number of combat encounters observed is moderate                   games,” Master thesis, Univerzita Karlova, Matematicko-
(249). The data shown that the algorithm was indeed able               fyzikální fakulta, 2018.
to adapt the difficulty estimation of encounters to players        [9] J.-M. Vanhatupa, “Guidelines for personalizing the player
though the prediction was not always better wrt. initial               experience in computer role-playing games,” in Proceed-
                                                                       ings of the 6th International Conference on Foundations of
setting of the matrix as computed from synthetic simula-
                                                                       Digital Games, 2011, pp. 46–52.
tions using simple AI as human player surrogates. Given
                                                                  [10] J. Doran and I. Parberry, “Towards procedural quest gener-
the fact that the number of encounters is huge and our ap-
                                                                       ation: A structural analysis of rpg quests,” Dept. Comput.
proach is rather simplistic, we deem our results as promis-            Sci. Eng., Univ. North Texas, Tech. Rep. LARC-2010, vol. 2,
ing.                                                                   2010.
                                                                  [11] G. Smith, M. Treanor, J. Whitehead, and M. Mateas,
                                                                       “Rhythm-based level generation for 2d platformers,” in
7.1   Future Work                                                      Proceedings of the 4th international Conference on Foun-
                                                                       dations of Digital Games, 2009, pp. 175–182.
Encouraged by the results of our pilot experiment, we see         [12] N. Shaker, A. Liapis, J. Togelius, R. Lopes, and
many directions for future work.                                       R. Bidarra, “Constructive generation methods for dungeons
   First, it would be interesting to employ the algorithm              and levels,” in Procedural Content Generation in Games.
in the context of a different game, ideally existing one,              Springer, 2016, pp. 31–55.
though this is usually very challenging technically. Sec-         [13] S. Xue, M. Wu, J. Kolen, N. Aghdaie, and K. A. Zaman,
ond, if a (large enough) data set on how human players are             “Dynamic difficulty adjustment for maximized engagement
playing a given game is available, it might be possible to             in digital games,” in Proceedings of the 26th International
model human-like play in order to experiment with the al-              Conference on World Wide Web Companion, 2017, pp.
gorithm without the need of experiments with human sub-                465–471.
jects. Finally, proposed algorithm is a skeleton that can be      [14] O. Missura and T. Gärtner, “Player modeling for intelligent
configured at many places (e.g., creating different similar-           difficulty adjustment,” in International Conference on Dis-
ity functions or providing different abstractions for combat           covery Science. Springer, 2009, pp. 197–211.
encounters), thus it would be interesting to observe how it       [15] M. Csikszentmihalyi and I. S. Csikszentmihalyi, Optimal
behaves under different configurations.                                experience: Psychological studies of flow in consciousness.
                                                                       Cambridge university press, 1992.
                                                                  [16] S. Engeser and F. Rheinberg, “Flow, performance and mod-
Acknowledgement This work was supported by the                         erators of challenge-skill balance,” Motivation and Emo-
Charles University grant SVV-260588.                                   tion, vol. 32, no. 3, pp. 158–172, 2008.
                                                                  [17] F. Rheinberg, R. Vollmeyer, and S. Engeser, “Die erfassung
                                                                       des flow-erlebens,” 2003.
References                                                        [18] M. Kozma, “Procedural generation of combat encounters
                                                                       in role playing video games,” Master’s thesis, Charles Uni-
 [1] N. Shaker, J. Togelius, and M. J. Nelson, Procedural con-         versity, 2020.
     tent generation in games. Springer, 2016.
 [2] J. Togelius, G. N. Yannakakis, K. O. Stanley, and
     C. Browne, “Search-based procedural content generation:
     A taxonomy and survey,” IEEE Transactions on Compu-
     tational Intelligence and AI in Games, vol. 3, no. 3, pp.
     172–186, 2011.
 [3] M. Csikszentmihalyi, S. Abuhamdeh, and J. Nakamura,
     “Flow,” in Flow and the foundations of positive psychol-
     ogy. Springer, 2014, pp. 227–238.
 [4] J. H. Salisbury and P. Tomlinson, “Reconciling csikszent-
     mihalyi’s broader flow theory: with meaning and value in
     digital games,” Transactions of the Digital Games Research
     Association (ToDIGRA), vol. 2, no. 2, pp. 55–77, 2016.
 [5] M. Csikszentmihalyi, Flow: The Psychology of Optimal
     Experience. Harper Perennial Modern Classics, 2008.
 [6] J. Chen, “Flow in games (and everything else),” Communi-
     cations of the ACM, vol. 50, no. 4, pp. 31–34, 2007.