<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jean-Baptiste Hervé</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Oliver Withington</string-name>
          <email>o.withington@qmul.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marion Hervé</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Laurissa Tokarchuk</string-name>
          <email>laurissa.tokarchuk@qmul.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Christoph Salge</string-name>
          <email>christophsalge@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="editor">
          <string-name>PCG Evaluation, Generative AI, Minecraft, GDMC</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>AIIDE Workshop on Experimental Artificial Intelligence in Games</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Queen Mary University of London</institution>
          ,
          <addr-line>London</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Hertfordshire</institution>
          ,
          <addr-line>Hatfield</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Workshop Proceedings</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>With growing interest in Procedural Content Generation (PCG) it becomes increasingly important to develop methods and tools for evaluating and comparing alternative systems. There is a particular lack regarding the evaluation of generative pipelines, where a set of generative systems work in series to make iterative changes to an artifact. We introduce a novel method called Generative Shift for evaluating the impact of individual stages in a PCG pipeline by quantifying the impact that a generative process has when it is applied to a pre-existing artifact. We explore this technique by applying it to a very rich dataset of Minecraft game maps produced by a set of alternative settlement generators developed as part of the Generative Design in Minecraft Competition (GDMC), all of which are designed to produce appropriate settlements for a pre-existing map. While this is an early exploration of this technique we find it to be a promising lens to apply to PCG evaluation, and we are optimistic about the potential of Generative Shift to be a domain-agnostic method for evaluating generative pipelines.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <sec id="sec-1-1">
        <p>Procedural content generation (PCG) refers to the automatic creation of pieces of content, usually called ‘artifacts’. Common artifacts include game levels, equipment, and even 3D models. PCG is used to provide many different assets, at any scale, without requiring each one to be fully authored by hand. Academic game research has studied PCG from different perspectives. Recent publications cover new generation techniques [
          <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
          ], different kinds of artifacts [3, 4, 5, 6, 7, 8], and design questions [9]. One area of particular interest is the automated evaluation of generated content [10, 11, 12, 13, 14]. However, the commonly used techniques, such as Expressive Range Analysis (ERA) [10], are not well suited to intermediary steps of the generation. They also focus on the entirety of an artifact, even though its sub-components and their interactions are also good indicators of the player’s perception [15].
        </p>
      </sec>
      <sec id="sec-1-2">
        <p>These techniques also rely on the use of user-defined metrics, usually based on the user’s intuition or previous work. But most metrics don’t align with the perceived quality of generated content [13, 16, 17], nor do they have any guarantees to capture meaningful variation. Furthermore, many metrics are hard for end-users, such as game designers, to interpret [12]. These issues, combined with the availability of many different metrics, also make the selection and visualization of metrics difficult.</p>
        <p>owithington.co.uk (O. Withington); ORCID 0000-0002-7007-5193 (O. Withington)</p>
        <sec id="sec-1-2-1">
          <title>1.1. Key Contributions and Overview</title>
          <p>In this paper, we make three conceptual contributions to
the area of PCG system evaluation, which are as follows:
1. Evaluation of intermediary PCG artifacts and the
impact of generative steps - Allowing designers
to interrogate the impact of individual steps in a
generative process.
2. The application of dimensionality reduction to
artifact metrics - Freeing designers from having
to narrowly define the metrics that interest them
when visualizing generative spaces.</p>
        </sec>
      </sec>
      <sec id="sec-1-3">
        <p>3. ERA of locations within a single generated level, rather than of levels in totality - Allowing for the analysis of 3D environments that more closely aligns with how they are experienced by players, i.e. from a specific location in the game space.</p>
      </sec>
      <sec id="sec-1-4">
        <p>While each of these contributions has potential utility in isolation, they can also be combined into a single technique which we call Generative Shift Analysis (GSA).</p>
      </sec>
      <sec id="sec-1-5">
        <p>This technique allows for the qualitative and quantitative evaluation of the effect that a generator has on a base artefact in aggregate, while also allowing a user to highlight the most extremely affected individual locations within the generated environment. After introducing this technique we show how it can be usefully applied to gain insight from real world generative systems, in our case a set of alternative Minecraft settlement generators. We argue that GSA can be used as the basis for techniques which are better suited for evaluating PCG systems and the virtual environments they produce in the form that these systems actually exist in contemporary game development.</p>
        <p>© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073.</p>
        <p>composite [15]. Some of ERA’s limitations have been addressed by further studies, which offer to expand it through better representations of the Expressive Range and easier metric selection [11, 19, 20]. But these techniques have yet to be adopted.</p>
        <sec id="sec-1-5-1">
          <title>2.2. Dimensionality Reduction</title>
        </sec>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2. Related work</title>
      <sec id="sec-2-1">
        <title>2.1. Expressive Range Analysis</title>
        <p>Within the field of PCG, a truly qualitative evaluation of
artifacts is challenging [18, 14]. The range of
possibilities of a given generator is quite complex to represent,
hence Expressive Range Analysis (ERA) [10] is commonly
employed. The Expressive Range is bounded by selected
metrics, which form its dimensions. All the possible
artifacts (or at least a large sample of them) are plotted according
to the metrics, leading to an analysis of their distribution.</p>
        <p>The analysis of the Expressive Range of a generator can
be useful in order to understand its behavior and how the
artifacts are spread among the dimensions. Therefore,
the usefulness of an Expressive Range depends mostly
on the relevance of the dimensions by which it is defined.</p>
        <p>Usually, the dimensions used are automatically computed
metrics applied to the whole artifact. They do not
necessarily need to capture something associated with quality,
but there is often an underlying assumption that higher
values in certain dimensions are preferred. More
importantly, the metrics should capture meaningful differences,
so that artifacts lying in different areas of the expressive range
appear different to the relevant players. Metrics of
evaluation can also be used by designers to tune the generators
and optimize certain aspects of the generated artifacts.</p>
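        <p>The plotting step described above can be sketched as a two-dimensional histogram over a pair of metrics. This is only a minimal illustration, not the procedure of [10]: the function names, the two placeholder metrics and the random data are assumptions made for the example, and NumPy is assumed to be available.</p>
        <preformat><![CDATA[
```python
import numpy as np


def expressive_range_histogram(metric_a, metric_b, bins=10):
    """Count how many artifacts fall into each cell of a 2D metric grid."""
    a = np.asarray(metric_a, dtype=float)
    b = np.asarray(metric_b, dtype=float)
    counts, a_edges, b_edges = np.histogram2d(a, b, bins=bins)
    return counts, a_edges, b_edges


def occupied_fraction(counts):
    """Fraction of grid cells that contain at least one artifact."""
    return float(np.count_nonzero(counts)) / counts.size


# Hypothetical metric values for 500 generated artifacts.
rng = np.random.default_rng(0)
metric_one = rng.uniform(0.0, 1.0, size=500)
metric_two = rng.uniform(0.0, 1.0, size=500)

counts, _, _ = expressive_range_histogram(metric_one, metric_two, bins=10)
coverage = occupied_fraction(counts)
```
]]></preformat>
        <p>Each cell of the returned grid holds the number of artifacts whose two metric values fall in that cell, which is the quantity a typical ERA heatmap colours. The occupied fraction is one crude, assumed summary of how widely a generator spreads over the chosen dimensions.</p>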
        <p>However, ERA still has limitations. Firstly, a computed difference between two artifacts does not necessarily lead to a perceived difference for the player [14]. The chosen metrics themselves lack embodiment, and several experiments have already been conducted to critically examine commonly used metrics and their relevance [13, 16, 17]. In most cases, the Expressive Range is defined as a 2D space, mostly for ease of interpretation, but this ultimately prevents in-depth analysis of the interactions between several metrics [11]. As a consequence, picking metric pairs that impart meaning can be time consuming and adds additional complexity [19]. The use of the metrics is also not well defined, and depending on their nature, a “good” artifact might need to maximise a metric or target a specific value depending on its purpose [16]. Another limitation is the reliance on evaluating artifacts independently and as atomic objects, while other results suggest generated artifacts are in reality experienced as composite [15].</p>
        <p>One of the ideas explored in this paper is using a dimensionality reduction (DR) algorithm, principal component analysis (PCA), to compress and combine a set of quantitative metrics. DR algorithms are a family of approaches for compressing high dimensional data into lower dimensional space while maintaining as much of the information present in the original data as possible. They are commonly used for exploratory data analysis as well as a pre-processing step in deep learning implementations.</p>
        <p>DR algorithms have found use in many different contexts in prior PCG research, most commonly as a visualisation tool. They can be applied directly to the encoded representations of game content, as in the work of Justesen et al. [21], which used PCA to produce visualisations of the distribution of their generated levels. Withington and Tokarchuk applied PCA and other DR algorithms to generated levels to explore whether this could be used as a useful alternative to ERA without the need for metric calculation [19]. Chang and Smith applied a different DR algorithm, t-SNE, to images from playthroughs of game levels rather than the encoded levels themselves to visualise reachable play states as part of their Differentia tool [22]. They can also be used as a generative step, as in the work of Summerville and Mateas, who used PCA as an intermediary step in their approach for procedurally generating Zelda levels [23].</p>
        <p>While this is the first application of unaugmented DR algorithms to combine sets of metrics that we are aware of, it does bear significant resemblance to another work from Withington and Tokarchuk which used Convolutional Neural Networks (CNNs) to produce compressed representations of sets of metrics for evaluating game levels [24]. Our approach of using PCA on its own lacks the sophistication of CNNs but has the advantage of being significantly simpler to calculate and implement.</p>
        <p>2.3. GDMC Competition</p>
        <p>The GDMC is a yearly competition in which teams submit a settlement generator [25] for Minecraft, which is a computer program that can add or remove blocks from a given Minecraft map without human intervention. All the generated settlements are then sent to the jury. The jury includes experts in various fields, such as AI, Game Design, or urbanism. Each judge scores the settlements in each of the following categories: Adaptability, Functionality, Narrative, and Aesthetic. Adaptability is how well the settlement is suited for its location - how well it adapts to the terrain, both on a large and small scale. Functionality is about what affordances the settlement provides, both to the Minecraft player and the simulated villagers. It covers various aspects, such as food, production, navigability, security, etc. Narrative reflects how well the settlement itself tells an evocative story about its own history, and who its inhabitants are. Finally, Aesthetic is a rating of the overall look of the settlements. In the competition, the rating of each category is computed for each generator by averaging (mean) across all judges’ scores.</p>
        <p>The GDMC has been used as a test bed for PCG evaluation studies [26, 15, 16]. Minecraft is an interesting test subject in that regard. It is mostly known for its open-ended nature and has been compared to LEGO on a computer. Even though the game offers a main objective, it is mostly used as a sandbox game. Many players use the block mechanic to terraform the game world, create structures such as houses, castles, or cities, and play the game according to self-imposed goals and challenges. Since the art style and setting of Minecraft are very generic, the game affords free creation of almost any kind of artifact, with only the player’s imagination setting the limits. Automatic evaluation of all the components of a Minecraft map is both challenging and relevant in the larger context of PCG evaluation [15].</p>
        <p>2.4. Isovists and Game Spaces</p>
        <p>Isovists have been developed with the intent of capturing the perception of space [27]. Given a bounded environment, the isovist of a point [27] is the set of all the points visible from it. As a result, we can compute the isovist of every position in the space. Each isovist is characterized by a range of scalar values such as its size, shape, intervisibility, etc. [27, 28, 29, 30]. These properties have been shown to have some correlation with how a given environment is experienced and appreciated [31, 32].</p>
        <p>Space in general can be analyzed to understand and even predict human and social behavior [33, 34, 35]. Similar analyses have already been made for video games [36, 37], connecting the 3D space to different layers of experience, such as narrative, gameplay, social interactions, etc.</p>
        <p>Hervé and Salge [26] already applied isovist theory to Minecraft, with results suggesting that isovists could be used as a new way to evaluate PCG artifacts. Rather than a holistic approach, isovists allow us to focus on one specific spot of the map, and extract the experience for that given location. In the rest of this paper, we will be using the data coming from the Hervé and Salge study.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <p>In this section we explore and explain our process for generating and synthesising isovist information from Minecraft maps, as well as our methods for using this data from pre and post settlement generation to produce useful insights about the properties of the generator. While the method used here is very specific to this domain, we argue that our techniques or at least a subset of them could be useful when analysing almost any content generator within games.</p>
      <p>3.1. Data Gathering and Preparation</p>
      <p>3.1.1. Map Data Used - GDMC Entries</p>
      <p>To evaluate the novel techniques and approaches that we explore and explain in this paper, we apply them to the 2022 entries to the Generative Design in Minecraft competition. This dataset is highly useful for several reasons. Firstly the map data is publicly available (found at gendesignmc.engineering.nyu.edu/results), meaning it can be used both by us and by other researchers looking to modify our methods with the same data. The code used to produce them is also publicly available for most generators, meaning we can in future both explore the relationship between the architecture of the generator’s code and its output using this paper’s techniques, as well as exploring the extent to which our findings about individual generators extend to their use on alternative base maps.</p>
      <p>One of the most important and unique aspects of this dataset though is the presence of the base maps. As discussed in Section 2.3, the way the GDMC is organised is that entry generators have to be designed to produce settlements on any pre-existing terrain, then a set of actual maps to be used as test beds is decided by the GDMC organisers and used for all entries. Firstly this means that we can directly compare the alternative generators as they all use the same base map to generate on. More excitingly however, it lets us do pre and post generation analysis on the settlements, meaning we can drill into what specifically a given generator changed about the map. As discussed earlier this more closely mirrors the way that PCG systems are implemented in the game industry, i.e. as a generative pipeline of processes that iterate on a single artefact rather than the one shot generation of complete artefacts.</p>
      <p>3.1.2. Isovist Calculation</p>
      <sec id="sec-3-1">
        <p>As discussed in Section 2.4, the concept of an isovist comes from the domain of real world space analysis. The concept was brought over to the domain of game level evaluation in ‘Automated Isovist Computation for Minecraft’ [26], and it is the method from that paper that we use to produce our base metrics in this work. For a more detailed explanation of how the isovist metrics are calculated please see A, or [26]. Hervé and Salge introduced 13 distinct metrics which can be calculated for a given standable location in a Minecraft map. These are all inspired by the concept of isovists and are intended to capture some aspect of the player’s experience of the space. These isovist metrics can then be calculated for either all or a subset of the standable locations within a specific area of a Minecraft map, giving us a set of locations and the associated isovist metrics for them.</p>
        <p>3.1.3. Isovist Metric Compression</p>
        <p>To facilitate visual inspection and comparison between isovist sets we wanted to reduce the dimensionality of the 13 metric dataset down to two dimensions, similar to more conventional ERA except that each point represents a location within an artefact rather than a different artefact. There are several options for producing or selecting a 2D version of the data (see Section 2.1). In this work we compress the assembled set of metrics using PCA. PCA calculates the linear combinations of underlying features which explain the maximal variance in the data, meaning we can take the two most explanatory components and visualise them in a 2D plot while maintaining a robust amount of variance in the underlying isovist metric data.</p>
        <p>Before applying PCA we pre-process the metric data by centering it, removing the mean of each metric and scaling to unit variance. This is standard practice when applying PCA to ensure no underlying variable has an outsized impact due to having a large range of values. In our results we report the actual contributions of each underlying metric to the principal components themselves to explore the extent to which all metrics are actually represented. These principal components were calculated for the assembled set of all generated maps.</p>
      </sec>
      <sec id="sec-3-2">
        <p>3.2. Generator Insight Methods</p>
        <p>In this subsection we describe the methods that we use to gather insights about individual maps and map generators, as well as methods highlighting noteworthy isovists within the generated settlements. We aim to describe the technical process for arriving at these insights, as well as what useful information we gather from them about the underlying generators.</p>
        <p>3.2.1. Principal Component Expressive Range Analysis</p>
        <p>The first method we explore for gaining insight into our roster of Minecraft settlement generators is to visualise the derived isovist principal components in a scatterplot, with the position for each location in the settlement or map determined by the top two PC values for the isovist metrics at the location. This is very similar to conventional ERA in the form popularised by Smith and Whitehead [10] except with two distinctions. Firstly, all of the data points are derived from locations within a single map rather than each point representing a complete artefact. Secondly, we are effectively visualising diversity in terms of all 13 of the isovist metrics due to the PCA preprocessing, rather than being limited to a choice of two metrics as traditional ERA is.</p>
        <p>As in conventional ERA these plots give us an idea of the relative diversity present in alternative content generators by how widely distributed the isovist data from a map is in this space. It can also give us insight into the extent to which two generators are producing similar output by examining the similarity between distribution shapes in these heatmaps. Heatmap similarity is a strong indicator that the effect two generators are having on a base map is broadly alike, at least in terms of their isovist metric values. While we lose the benefit of conventional ERA of highlighting exactly what combinations of metric values are likely and unlikely to be produced, we argue this is offset by the gain in being able to qualitatively understand diversity in terms of multiple metrics in the form of principal components.</p>
        <p>3.2.2. Principal Components as Map Overlays</p>
        <p>The second technique we use for gaining insights into the generators is overlaying values derived from applying PCA to the actual Minecraft maps themselves. By taking an overhead view of the map and then colour coding locations within that map based on the principal component values found at that location we can further qualitatively evaluate a generated space, and focus our investigation of it. If a designer wanted a generated space which provided a consistent experience while traversing it as a player, they might look for smoothly changing values in their maps, whereas if they wanted a map with a high amount of notable locations which stood out from their surroundings, then they would hope to see many locations with radically different values to their surroundings. Most importantly however for our use case, they support useful and direct comparison between generated settlements and the base map. Using this approach we can see both the extent to which a generator changed the base map, but also where these changes were located.</p>
        <p>In this paper we produce these overlays by first taking an overhead view of the Minecraft map from in game. We then take the values for only the first principal component as this is the one that explains the most variance in the underlying metric data, and then localise these values on the map based on the location of the isovist that produced them. These can then be colour coded on a continuous scale based on their value. This is a direct evolution of one of the approaches used by Hervé and Salge in the paper that introduced the isovist metrics to Minecraft [26], except they looked at the values for individual metrics. Note that because we are viewing the map from vertically above and only taking the highest isovist for each location, we averaged values for every coordinate pair, starting from a vertical threshold corresponding to the common ground level. This means that any influence a generator is having on the map below the top layer of blocks, such as generating cave networks beneath the surface, would likely not be visible.</p>
        <p>3.2.3. Calculating Generative Shift</p>
        <p>The final and most innovative technique we explore is what we term Generative Shift Analysis (GSA), by which we mean the change in characteristics of a location before and after a stage in a generative pipeline, in our case the generation of a settlement. More specifically we are interested in highlighting both the general trend of the shift caused by a given generative step, as well as highlighting the locations which experienced the largest shift in their isovist metrics pre and post generation. This gives designers a view of locations that experienced the biggest change in their characteristics as a result of a generative step.</p>
        <p>To calculate this we take the top two principal component data points for locations in a generated settlement, and pair them up with the data from the matching locations in the base map. This gives us a set of pairs of 2D vectors, with one representing the combined isovist metric information for that location before a settlement was generated, and the second representing the same location afterwards. We can then calculate the 2D vector that takes us between these two points, with its magnitude acting as a heuristic for how much a location was changed by the settlement being generated in terms of its isovist metrics. This distance is what we refer to as ‘generative shift’. We then rank locations by those which experienced the largest shift, allowing us to identify the top X most transformed locations by a given generator. These locations can then be manually inspected in game.</p>
        <p>There is a significant amount of qualitative information that we hope to gain from applying GSA to our data set. As discussed in Section 2.1, an issue with the majority of techniques for understanding and visualising generative spaces is that they only afford a high level abstracted view. Here instead we provide a justification for examining specific locations within a generated settlement, with a view to gaining insight into the generator as a whole. By seeing the extremes of what a generative step can do to the base artefact it gives a designer a concrete idea of how transformative a generative step can be overall. Depending on a designer’s goals this could either increase their confidence that the generator was working well, for example by not having too large an influence on an already desirable base map, or it could highlight that the generator is having too extreme an impact.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experimental Details</title>
      <sec id="sec-4-1">
        <p>In this section we describe the set up of the pilot experiments that we ran to explore our concept of generative shift in the context of Minecraft settlement generation.</p>
        <sec id="sec-4-1-1">
          <title>4.1. Isovists Calculated</title>
          <p>Our experiment runs on the 20 entries from 1 (of 2) map from the 2021 GDMC competition. The map we have focused on is the ‘volcano map’ used in the 2021 GDMC as well as in the following year’s competition.</p>
          <p>We compute isovists on surfaces where a player can stand, at a height matching the camera position for the given location and a radius of 256 blocks (the default view distance in Minecraft). For each visible coordinate, we check the type of block, whether it is transparent, and whether it is a location where a player can stand [26].</p>
          <p>The default map, without any structure, has 92560 possible surfaces, and a single isovist takes several seconds to compute, depending on its size. We therefore subsampled our data as follows: for each height value (Y coordinate in Minecraft), we compute 1 isovist out of every 10 possible, picked randomly. This subsampling significantly reduces the computation time, and previous tests showed it did not affect the global results [26].</p>
        </sec>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Highlighted Maps</title>
        <p>While we have access to 20 different settlement generators, calculating the generative shift for all 20 would deliver a potentially overwhelming amount of information to dig through without further clarifying the strengths and weaknesses of the technique. As a result we opted to focus on only two generators. Generators 6 and 15 were the two highest performing generators: Generator 6 had the highest score for the Aesthetic criterion and Generator 15 was the overall highest scoring entry across all evaluation criteria. To get an idea of the character of the two generated settlements as well as the base map see Figure 1.</p>
        <p>We felt that if we were only going to evaluate a
subsample of generators then it made sense to select high
performing generators as any insights into how they
function might be more generally useful than insights
gained into less high performing systems. However this
decision is somewhat arbitrary, and exploring the extent
to which our generative technique approach works for
more diverse generators is a natural extension of this
work.</p>
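        <p>The generative shift calculation used for these generators (pairing the top two principal component values of a location before and after generation, taking the magnitude of the vector between them, and ranking locations by that magnitude) can be sketched as follows. This is a minimal illustration under stated assumptions, not the released analysis code: the function names and the toy values are hypothetical, and the two input arrays are assumed to be already paired row by row.</p>
        <preformat><![CDATA[
```python
import numpy as np


def generative_shift(base_pcs, generated_pcs):
    """Return per-location shift vectors and their magnitudes.

    Row i of each array holds the top two principal component values
    for the same location before and after generation.
    """
    base = np.asarray(base_pcs, dtype=float)
    generated = np.asarray(generated_pcs, dtype=float)
    if base.shape != generated.shape:
        raise ValueError("base and generated arrays must be paired row by row")
    vectors = generated - base
    magnitudes = np.linalg.norm(vectors, axis=1)
    return vectors, magnitudes


def most_shifted(magnitudes, top_x):
    """Indices of the top_x locations with the largest shift, largest first."""
    order = np.argsort(magnitudes)[::-1]
    return order[:top_x]


# Toy paired data: three locations, two principal components each.
base = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
generated = [[3.0, 4.0], [1.0, 1.0], [2.0, 1.0]]

vectors, magnitudes = generative_shift(base, generated)
ranking = most_shifted(magnitudes, top_x=2)
mean_shift = vectors.mean(axis=0)  # general trend of the shift
```
]]></preformat>
        <p>The ranked indices point at the locations a designer would inspect in game, while the mean shift vector is one assumed way to summarise the general trend of a generative step.</p>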
        <sec id="sec-4-2-1">
          <title>4.3. Location Selection</title>
          <p>To reduce the compute and processing time required to
pair up and calculate the generative shift between isovists
from the same location in the base map and generated
settlements, we opted to only find pairings for 2% of map
locations and then calculate the generative shift for these
only. This 2% was selected stochastically during the
pairing process. This sub-sampling was only required in this
case due to the specific process used for gathering isovist
data which was inherited from [26]. This did not produce
data that was easy to pair in both the base maps and
generated settlements, and therefore required large numbers
of Euclidean distance calculations to find matched pairs.
A revised data gathering process which simultaneously
gathered isovist data from the same location in both maps
could avoid this limitation.</p>
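          <p>The pairing step described above, in which a random fraction of locations is matched to the base map through Euclidean distance calculations, could look roughly like the sketch below. The function name, the exact-match distance threshold and the toy coordinates are assumptions made for illustration, and the brute-force search is chosen for clarity rather than speed.</p>
          <preformat><![CDATA[
```python
import numpy as np


def pair_sampled_locations(base_xyz, generated_xyz, fraction=0.02,
                           max_distance=0.0, seed=0):
    """Pair a random fraction of generated-map locations with base-map ones.

    Returns (base_index, generated_index) tuples for sampled locations whose
    nearest base-map location lies within max_distance blocks.
    """
    base = np.asarray(base_xyz, dtype=float)
    generated = np.asarray(generated_xyz, dtype=float)
    rng = np.random.default_rng(seed)

    n_sample = max(1, int(round(fraction * len(generated))))
    sampled = rng.choice(len(generated), size=n_sample, replace=False)

    pairs = []
    for g in sampled:
        # Euclidean distance from this location to every base-map location.
        distances = np.linalg.norm(base - generated[g], axis=1)
        b = int(np.argmin(distances))
        if distances[b] <= max_distance:
            pairs.append((b, int(g)))
    return pairs


# Toy data: the same 100 coordinates in both maps, in a different order.
grid = np.array([[x, 64, z] for x in range(10) for z in range(10)])
shuffled = grid[np.random.default_rng(1).permutation(len(grid))]
pairs = pair_sampled_locations(grid, shuffled, fraction=0.1)
```
]]></preformat>
          <p>Every sampled location is compared against all base-map locations, which illustrates why the number of distance calculations grows quickly with map size and why gathering both isovists at the same location in one pass would remove the need for this search.</p>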
          <p>While the 2% threshold was largely arbitrary and based
on the resources available for our pilot experiment, the
intention for GSA is that it will still give useful insights
even when only sampling a fraction of possible locations.
In fact for most 3D game domains which are structured
as continuous spaces rather than the large cube voxels
of Minecraft, some form of sub-sampling would always
be required.</p>
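          <p>One concrete sub-sampling scheme is the one used for the isovist computation in Section 4.1, where one location in ten is picked at random for each height value. A minimal sketch of that idea follows; the function name, the rule of keeping at least one location per height and the toy coordinates are assumptions, not details taken from [26].</p>
          <preformat><![CDATA[
```python
from collections import defaultdict

import numpy as np


def subsample_by_height(locations, keep_one_in=10, seed=0):
    """Randomly keep roughly one location in keep_one_in at each height (Y)."""
    rng = np.random.default_rng(seed)
    by_height = defaultdict(list)
    for index, (_, y, _) in enumerate(locations):
        by_height[y].append(index)

    kept = []
    for y in sorted(by_height):
        indices = by_height[y]
        # Assumption: keep at least one location per height level.
        n_keep = max(1, len(indices) // keep_one_in)
        chosen = rng.choice(len(indices), size=n_keep, replace=False)
        kept.extend(indices[i] for i in chosen)
    return sorted(kept)


# Toy data: 100 locations at Y=64, 50 at Y=65 and 5 at Y=70.
locations = ([(x, 64, 0) for x in range(100)]
             + [(x, 65, 0) for x in range(50)]
             + [(x, 70, 0) for x in range(5)])
kept = subsample_by_height(locations)
```
]]></preformat>
          <p>Sampling per height level rather than over the whole map keeps sparsely populated heights represented, which matters when only a small fraction of locations is retained.</p>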
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Results and Discussion</title>
      <sec id="sec-5-1">
        <p>4.4. Software</p>
        <p>The code for extracting the isovist data from the Minecraft maps was written in Python, and relies on the GDMC’s framework (https://github.com/avdstaaij/gdpc) to collect the game’s data, and CUDA for computational acceleration (https://developer.nvidia.com/cuda-python).</p>
        <p>The code for pairing up isovists, compressing their metrics to a 2D vector using PCA, calculating their generative shift and visualising it was written in Python and is available at https://github.com/KrellFace/gen_shift_analysis. To apply PCA to the isovist metrics we used the scikit-learn Python package (https://scikit-learn.org).</p>
        <p>This section discusses the strengths, weaknesses and interesting aspects of the approach we have introduced and explored, as well as highlighting valuable directions for future inquiry that we intend to pursue ourselves or that we feel would be useful for the field.</p>
        <p>On the application of PCA to sets of metrics, we found this to be intuitive and easy to apply, and we believe it could be useful in other contexts in which synthesising multiple metrics is desirable, not just when applying our approach of Generative Shift. The metrics themselves proved relatively compressible despite all being relevant, with the top 2 PCs explaining over 65% of the total variance in the data (PC 1 explained 52% of the variance and PC 2 13.3%). Being able to combine multiple phenotypic metrics let us understand the changes caused by a generator in a more holistic way and it directed us to a more diverse set of significantly altered locations (Figure 2) than we believe would have been discoverable with a smaller set. It also had the unexpected benefit of confirming to an extent that all 13 of the isovist metrics were capturing different aspects of location diversity, as we can see in the relatively even contribution between different metrics.</p>
        <p>A direction for future study is exploring whether we can quantify the amount of generative shift that a designer wants to see produced by a system, and whether this could be used to guide a generative process. In future work we aim to explore whether such a concept could be operationalised in a generative system by using generative shift as a pseudo fitness function to produce optimal amounts of changes to a pre-existing artefact.</p>
        <p>Another future direction we are excited about is exploring
the relationship between player experience and the smoothness of
change in isovist metric values. This builds on the map overlay
heat-maps in Figure 2, which allow us to easily visualise how the
compressed metric values shift geographically within the
map. Our hypothesis is that maps with isovist metric
values that only change gradually as a player navigates
the space are likely to be more relaxing, or perhaps even
boring, than maps which present the player with more
sudden changes in isovist metrics when moving. This
is something we aim to explore in future, possibly with
user studies using this same data set.</p>
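        <p>To make the pipeline discussed in this section concrete, the following Python sketch compresses a table of isovist metrics to two principal components and measures how far each paired location moves in that 2D space. It is a sketch under stated assumptions, not the released implementation: PCA is computed with NumPy's SVD rather than scikit-learn, the metrics are standardised and pooled across both maps before fitting, and generative shift is taken to be the Euclidean displacement between the paired 2D points. The function names and the synthetic data are hypothetical.</p>
        <preformat>
```python
import numpy as np


def fit_pca_2d(metrics):
    """Fit a 2-component PCA on standardised metrics using SVD."""
    mean = metrics.mean(axis=0)
    std = metrics.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant metrics
    z = (metrics - mean) / std
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    explained = (s ** 2) / np.sum(s ** 2)  # explained variance ratio per PC
    return mean, std, vt[:2], explained[:2]


def project(metrics, mean, std, components):
    """Project metrics onto the fitted principal components."""
    return ((metrics - mean) / std) @ components.T


def generative_shift(base_metrics, generated_metrics):
    """Return per-location 2D shift vectors, their magnitudes, and the
    variance explained by the two components.

    Assumption: row i of both arrays describes the same paired location.
    """
    pooled = np.vstack([base_metrics, generated_metrics])
    mean, std, components, explained = fit_pca_2d(pooled)
    base_2d = project(base_metrics, mean, std, components)
    generated_2d = project(generated_metrics, mean, std, components)
    shift = generated_2d - base_2d
    return shift, np.linalg.norm(shift, axis=1), explained


# Synthetic example: 200 paired locations with 13 metrics each, where the
# generator is imagined to have altered only the first 20 locations.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 13))
generated = base.copy()
generated[:20] += 3.0

shift, magnitude, explained = generative_shift(base, generated)
print("variance explained by PC 1 and PC 2:", explained)
print("most shifted locations:", np.argsort(magnitude)[-5:])
```
        </preformat>
        <p>Ranking locations by the magnitude of their shift is one way to pick out the significantly altered locations, and the same magnitudes could be drawn as a map overlay.</p>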
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>In this paper we introduced Generative Shift Analysis,
a novel approach for analysing generated 3D spaces
through combined isovist metrics and the
changes produced in them by a generative process that
uses a pre-existing virtual environment as its base. While
this is only an early exploration of these concepts, we
conclude that this is a potentially valuable avenue for
PCG research to explore, and one that we are excited to
expand upon.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgements</title>
      <sec id="sec-7-1">
        <p>We thank the reviewers for providing insightful and detailed feedback. This work was supported in part by the EPSRC Centre for Doctoral Training in Intelligent Games &amp; Games Intelligence (IGGI) [EP/S022325/1].</p>
        <p>[2] A. Summerville, S. Snodgrass, M. Guzdial, C. Holmgård, A. K. Hoover, A. Isaksen, A. Nealen, J. Togelius, Procedural content generation via machine learning (pcgml), IEEE Transactions on Games 10 (2018) 257–270.</p>
        <p>[3] A. Khalifa, M. C. Green, D. Perez-Liebana, J. Togelius, General video game rule generation, in: 2017 IEEE Conference on Computational Intelligence and Games (CIG), 2017, pp. 170–177. doi:10.1109/CIG.2017.8080431.</p>
        <p>[4] J. M. Font, T. Mahlmann, D. Manrique, J. Togelius, A card game description language, in: European Conference on the Applications of Evolutionary Computation, Springer, 2013, pp. 254–263.</p>
        <p>[5] T. Mahlmann, Modelling and generating strategy games mechanics, IT University of Copenhagen, Center for Computer Games Research, 2013.</p>
        <p>[6] J. Togelius, R. De Nardi, S. M. Lucas, Towards automatic personalised content creation for racing games, in: 2007 IEEE Symposium on Computational Intelligence and Games, IEEE, 2007, pp. 252–259.</p>
        <p>[7] A. Pantaleev, In search of patterns: Disrupting rpg classes through procedural content generation, in: Proceedings of the third workshop on Procedural Content Generation in Games, 2012, pp. 1–5.</p>
        <p>[8] J. Doran, I. Parberry, A prototype quest generator based on a structural analysis of quests from four MMORPGs, ACM International Conference Proceeding Series (2011). doi:10.1145/2000919.2000920.</p>
        <p>[9] G. Smith, What do we value in procedural content generation?, in: Proceedings of the 12th International Conference on the Foundations of Digital Games, 2017, pp. 1–2.</p>
        <p>[10] G. Smith, J. Whitehead, Analyzing the expressive range of a level generator, in: Proceedings of the 2010 Workshop on Procedural Content Generation in Games - PCGames ’10, ACM Press, Monterey, California, 2010, pp. 1–7. URL: http://portal.acm.org/citation.cfm?doid=1814256.1814260. doi:10.1145/1814256.1814260.</p>
        <p>[11] A. Summerville, Expanding expressive range: Evaluation methodologies for procedural content generation, in: Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, volume 14, 2018.</p>
        <p>[12] M. Cook, J. Gow, S. Colton, Danesh: Helping bridge the gap between procedural generators and their output (2016).</p>
        <p>[13] J. Mariño, W. Reis, L. Lelis, An empirical evaluation of evaluation metrics of procedurally generated mario levels, in: Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, volume 11, 2015.</p>
        <p>[14] K. Compton, So you want to build a generator (2016).</p>
        <p>[15] J.-B. Hervé, C. Salge, H. Warpefelt, An examination of the hidden judging criteria in the generative design in minecraft competition (2023).</p>
        <p>[16] J.-B. Hervé, C. Salge, Comparing pcg metrics with human evaluation in minecraft settlement generation (2021).</p>
        <p>[17] A. Summerville, J. R. Mariño, S. Snodgrass, S. Ontañón, L. H. Lelis, Understanding mario: an evaluation of design metrics for platformers, in: Proceedings of the 12th international conference on the foundations of digital games, 2017, pp. 1–10.</p>
        <p>[18] C. Lamb, D. G. Brown, C. L. A. Clarke, Evaluating computational creativity: An interdisciplinary tutorial, ACM Comput. Surv. 51 (2018). URL: https://doi.org/10.1145/3167476. doi:10.1145/3167476.</p>
        <p>[19] O. Withington, L. Tokarchuk, Compressing and Comparing the Generative Spaces of Procedural Content Generators, in: 2022 IEEE Conference on Games (CoG), IEEE, Beijing, China, 2022, pp. 143–150. URL: https://ieeexplore.ieee.org/document/9893615/. doi:10.1109/CoG51982.2022.9893615.</p>
        <p>[20] O. Withington, L. Tokarchuk, The right variety: Improving expressive range analysis with metric selection methods, in: Proceedings of the 18th International Conference on the Foundations of Digital Games, 2023, pp. 1–11.</p>
        <p>[21] N. Justesen, R. R. Torrado, P. Bontrager, A. Khalifa, J. Togelius, S. Risi, Illuminating Generalization in Deep Reinforcement Learning through Procedural Level Generation, arXiv:1806.10729 [cs, stat] (2018). URL: http://arxiv.org/abs/1806.10729, arXiv:1806.10729.</p>
        <p>[22] K. Chang, A. Smith, Differentia: Visualizing Incremental Game Design Changes, Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment 16 (2020) 175–181. URL: https://ojs.aaai.org/index.php/AIIDE/article/view/7427. doi:10.1609/aiide.v16i1.7427.</p>
        <p>[23] A. Summerville, M. Mateas, Sampling Hyrule: Multi-Technique Probabilistic Level Generation for Action Role Playing Games, Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment 11 (2021) 63–67. URL: https://ojs.aaai.org/index.php/AIIDE/article/view/12817. doi:10.1609/aiide.v11i3.12817.</p>
        <p>[24] O. Withington, L. Tokarchuk, Visualising Generative Spaces Using Convolutional Neural Network Embeddings, 2022. URL: http://arxiv.org/abs/2210.17464, arXiv:2210.17464 [cs].</p>
        <p>[25] C. Salge, M. C. Green, R. Canaan, J. Togelius, Generative design in minecraft (gdmc) settlement generation competition, in: Proceedings of the 13th International Conference on the Foundations of Digital Games, 2018, pp. 1–10.</p>
        <p>[26] J.-B. Hervé, C. Salge, Automated Isovist Com</p>
      </sec>
    </sec>
    <sec id="sec-8">
      <title>A. Set-based definitions of isovist metrics</title>
      <p>Given a point and its isovist, the point is called the centroid
of the isovist. The lines connecting the centroid and the boundary of
the isovist are referred to as radials. The isovist is also
characterised by its visible area and its perimeter. It is worth
noting that the isovist is defined by “real-surfaces”, which are
defined by Benedikt as “opaque, material, visible surface, […]”.</p>
      <sec id="sec-8-1">
        <p>[Residue of a table or equation block: the labels “mean Radials”, “Drift” and “Vista Length” survive, but Equations (7)–(9) are not recoverable from the source.]</p>
        <p>The visible head spaces are all positions an avatar could
occupy such that its head would be visible from the current
position. The same positions offset by −2 are all the standable
blocks supporting those head spaces. A further set provides the
perimeter of blocks that limit our view, and contains all blocks
visible from the current position. The walkable blocks are obtained
by flood filling from the standable block supporting the current
position, within a given number of steps. We use the usual
Minecraft movement rules, which allow moving up by one
block per lateral step and dropping down to lower
levels. Note that avatar height, movement rules and vision
sensors could all affect those basic sets. The
following properties can now be computed by operating
on those sets alone, without having to recompute them.</p>
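        <p>The flood fill used to obtain the walkable blocks can be sketched as a breadth-first search. This is an illustrative sketch under stated assumptions, not the implementation used to gather the isovist data: blocks are represented as a set of solid (x, y, z) coordinates with y as the vertical axis, the avatar is assumed to be two blocks tall, and the function names and the max_drop limit are hypothetical. Clearance checks are simplified compared with real Minecraft movement.</p>
        <preformat>
```python
from collections import deque


def is_standable(solid, pos):
    """A block is standable if it is solid and the two blocks above it are air."""
    x, y, z = pos
    return pos in solid and (x, y + 1, z) not in solid and (x, y + 2, z) not in solid


def walkable_blocks(solid, start, max_steps, max_drop=3):
    """Breadth-first flood fill over standable blocks within max_steps moves.

    Each move is one lateral step (4-neighbourhood) that may climb at most one
    block or drop at most max_drop blocks (max_drop is a hypothetical limit).
    """
    if not is_standable(solid, start):
        return set()
    reached = {start}
    queue = deque([(start, 0)])
    while queue:
        (x, y, z), steps = queue.popleft()
        if steps == max_steps:
            continue  # step budget exhausted for this branch
        for dx, dz in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            # Scan the neighbouring column from one block up, downwards.
            for ny in range(y + 1, y - max_drop - 1, -1):
                target = (x + dx, ny, z + dz)
                if is_standable(solid, target):
                    if target not in reached:
                        reached.add(target)
                        queue.append((target, steps + 1))
                    break
                if target in solid:
                    break  # solid but not standable: treat the column as a wall
    return reached


# Synthetic example: a flat 5 x 5 floor at height 0, starting from the centre.
floor = {(x, 0, z) for x in range(5) for z in range(5)}
print(len(walkable_blocks(floor, (2, 0, 2), max_steps=1)))  # 5
print(len(walkable_blocks(floor, (2, 0, 2), max_steps=2)))  # 13
```
        </preformat>
        <p>Because the search is breadth-first, each block is reached by the smallest possible number of moves, so the step limit is applied consistently.</p>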
      </sec>
      <sec id="sec-8-2">
        <p>The function (.) counts how many different types of
blocks are in a set.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J.</given-names>
            <surname>Togelius</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. N.</given-names>
            <surname>Yannakakis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. O.</given-names>
            <surname>Stanley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Browne</surname>
          </string-name>
          ,
          <article-title>Search-based procedural content generation: A taxonomy and survey</article-title>
          ,
          <source>IEEE Transactions on Computational Intelligence and AI in Games</source>
          <volume>3</volume>
          (
          <year>2011</year>
          )
          <fpage>172</fpage>
          -
          <lpage>186</lpage>
          . doi:10.1109/TCIAIG.2011.2148116.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Summerville</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Snodgrass</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Guzdial</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Holmgård</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. K.</given-names>
            <surname>Hoover</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Isaksen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nealen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Togelius</surname>
          </string-name>
          , Procedural content generation via machine learning (pcgml), IEEE Transactions on Games 10 (2018) 257–270.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>