<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Comparison of Monte Carlo Tree Search Variants</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Peter Guba</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jakub Gemrot</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Charles University, Faculty of Mathematics and Physics</institution>
          ,
          <addr-line>Ke Karlovu 3, 121 16 Praha 2</addr-line>
          ,
          <country country="CZ">Czech Republic</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <abstract>
        <p>Monte Carlo Tree Search (MCTS) is a popular game AI algorithm that searches the state space of a game while using randomised simulations to evaluate new states. Many papers have been published about various adjustments of the original algorithm; however, work that compares these adjustments is rare. This lack of data can make it difficult to decide which algorithm to use without manual evaluation, which is time-consuming. The aim of this paper is therefore twofold: first, to create such a comparison, and second, to introduce a new variant, Weighted Propagation MCTS, which is based on the idea that one should be able to gather more information from a playout by taking into account all the states encountered during its computation. We chose three settings in which we compare our implementations - Chess, µRTS, and the 4X game Children of the Galaxy.</p>
      </abstract>
      <kwd-group>
        <kwd>Monte Carlo Tree Search</kwd>
        <kwd>MCTS</kwd>
        <kwd>games</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        Since MCTS does not rely on any game-specific knowledge, it has been successfully used in a wide
variety of games, such as Go [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], Total War: Rome II, Ms. Pac-Man [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], and Poker [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
        Due to its practical usefulness and theoretical properties, the algorithm has been a long-standing
subject of academic interest, with two surveys being published that list numerous adjustments of the
original algorithm that have been proposed since [
        <xref ref-type="bibr" rid="ref8 ref9">8, 9</xref>
        ].
      </p>
      <p>
        Our literature review found four papers that compare a larger number of MCTS variants [
        <xref ref-type="bibr" rid="ref12 ref13 ref14 ref15">12, 13, 14,
15</xref>
        ]. The first two do so in the context of single-player games from the general video game AI (GVG-AI)
framework, and the third was done on simultaneous-move games. There are two main differences
between these three papers and ours. First, we conduct our experiments on sequential two-player
games, and second, we allow for the use of domain-specific knowledge.
      </p>
      <p>The fourth paper, by Hsueh et al., compares four different variants and their combinations in Chinese
dark chess. This is a sequential two-player game, similar to the ones we use, and the authors also
allowed for the use of heuristics. Their work is therefore similar to ours, but whereas their aim was
to improve an agent for playing Chinese dark chess, ours is to create a more general comparison,
which is why we use multiple games.</p>
    </sec>
    <sec id="sec-3">
      <title>3. MCTS</title>
      <p>MCTS explores the state-action space of a game while trying to balance the exploration of new moves
and the exploitation of moves already deemed good. It does this by building a partial search tree in which
each node corresponds to a game state. It operates in four steps, which repeat in a loop until a terminal
condition is reached, after which a move is picked. The individual steps work as follows.
• Selection: The algorithm traverses the already explored part of the search tree from the root until
it reaches a node with children that have not been visited yet. At every node, it decides which
action to perform using a strategy called the tree policy.
• Expansion: A new child node is created and appended to the node selected in the previous step.
• Simulation: A simulation called a playout is run from the newly created node. The moves in the
simulation are made according to the algorithm’s default policy until a terminal state is reached
or a cut-off condition is triggered. A numeric score is obtained from the last reached state.
• Backpropagation: The score is then used to adjust the scores of nodes within the partial search
tree on the path from the newly created node to the root node.
• Termination: The four steps described above repeat until some terminal condition is reached;
in most applications, this is the exceeding of a time limit. Afterwards, a move is picked using
some strategy that is usually different from the tree policy, as exploration no longer plays a role.</p>
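      <p>To make the loop concrete, the steps above can be sketched as follows. This is a minimal illustration on a toy counting game; the node fields, toy rules, and win condition are our own assumptions, not part of any implementation evaluated in this paper.</p>
      <preformat>
```python
import math
import random

class Node:
    def __init__(self, state, parent=None):
        self.state = state
        self.parent = parent
        self.children = []
        self.untried = list(legal_moves(state))  # moves not yet expanded
        self.visits = 0
        self.total = 0.0

def legal_moves(state):
    # toy game: add 1 or 2 to a counter; the game ends at 10 or above
    return [] if state >= 10 else [1, 2]

def uct_value(child, parent_visits, c=1.0):
    # UCB1 tree policy: mean reward plus an exploration term
    return child.total / child.visits + c * math.sqrt(math.log(parent_visits) / child.visits)

def mcts(root_state, playouts=200, rng=random.Random(0)):
    root = Node(root_state)
    for _ in range(playouts):
        node = root
        # Selection: descend through fully expanded nodes via the tree policy
        while not node.untried and node.children:
            node = max(node.children, key=lambda ch: uct_value(ch, node.visits))
        # Expansion: append one new child, if possible
        if node.untried:
            move = node.untried.pop(rng.randrange(len(node.untried)))
            child = Node(node.state + move, parent=node)
            node.children.append(child)
            node = child
        # Simulation: random default policy until a terminal state
        state = node.state
        while legal_moves(state):
            state = state + rng.choice(legal_moves(state))
        score = 1.0 if state == 10 else 0.0  # toy win condition: land exactly on 10
        # Backpropagation: update statistics up to the root
        while node is not None:
            node.visits += 1
            node.total += score
            node = node.parent
    # Termination: return the move with the best average score
    best = max(root.children, key=lambda ch: ch.total / ch.visits)
    return best.state - root_state
```
      </preformat>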
    </sec>
    <sec id="sec-4">
      <title>4. MCTS Variants</title>
      <p>
        Our original motivation was to improve combat AI in the game Children of the Galaxy (described in
Section 5.2). To that end, we searched a number of papers for MCTS variants to implement, including
an extensive survey of MCTS techniques [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. The only selection criterion we used (besides the variants having to be
adjustments of MCTS) was whether they were applicable to our environment.
      </p>
      <p>Due to space constraints, we only present short overviews of the chosen variants here; for details,
we refer the reader to the original papers.</p>
      <sec id="sec-4-1">
        <title>4.1. Upper Confidence bounds applied to Trees [16]</title>
        <p>Upper Confidence bounds applied to Trees (UCT) is the most well-known variant of MCTS. It balances
exploration and exploitation using the UCB1 policy, which picks the node i that maximises</p>
        <p>X̄_i + C · √(log n_p / n_i) (1)</p>
        <p>where X̄_i is the current mean reward at node i, n_i is the number of visits of that node, n_p is the number
of visits of its parent and C is a tunable parameter.</p>
        <p>This formula was developed for the Multi-Armed Bandit (MAB) problem where it is popular due to
its regret bounds. As the decision made at every node during the selection step can be considered an
instance of MAB, it can be applied here.</p>
        <p>UCT’s default policy consists of picking random moves. Its evaluation function assigns the final
states of playouts a score of 1, 0, or 0.5, depending on whether the result was a win, a loss, or a draw/the
game did not finish. The algorithm returns the move with the best average score. The parameter C for
the UCB formula was set to 1.</p>
        <p>Note that the default policy, way of picking moves, and way of evaluating final states used here
are not a part of UCT’s specification – UCT only defines the tree policy. We decided to go with these
approaches because they can be considered the most standard implementations of these parts of MCTS,
just like UCB1 can be considered the most standard implementation of its tree policy.</p>
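        <p>As an illustration, Equation (1) translates directly into code; the function and argument names below are ours.</p>
        <preformat>
```python
import math

def ucb1(mean_reward, visits, parent_visits, c=1.0):
    # mean reward of the node plus an exploration term weighted by C
    return mean_reward + c * math.sqrt(math.log(parent_visits) / visits)
```
        </preformat>
        <p>During selection, the child maximising this value is chosen; with C = 1 and a parent visit count of 1, the exploration term vanishes and only the mean reward remains.</p>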
        <p>All the following algorithms are based on this version of UCT and only differ in explicitly stated
ways.</p>
      </sec>
      <sec id="sec-4-7">
        <title>4.2. SR+CR MCTS [<xref ref-type="bibr" rid="ref17">17</xref>]</title>
        <p>SR and CR in the name of this variant stand for “simple regret” and “cumulative regret” respectively.
These are measures of the difference between the reward for an optimal decision and the reward for a
decision made based on some strategy. Simple regret is defined as follows:</p>
        <p>sr_n = r* − r_{i_n} (2)</p>
        <p>where r* is the expected reward for choosing the optimal node, i_n is the node that would be picked
at iteration n and r_{i_n} is its expected reward. It is the regret for choosing a suboptimal action at round
n. Cumulative regret, on the other hand, is computed over all rounds:</p>
        <p>R_n = Σ_{t=1}^{n} (r* − r_{i_t}) (3)</p>
        <p>The UCB1 formula that is used by UCT aims to minimise the cumulative regret of all the arm pulls in
an MAB setting. This causes the algorithm to exploit the best moves at the root far more than it needs
to, since it only actually picks a move to play at the end of its run. To remedy this issue, this variant
uses a policy at the root that minimises simple regret (otherwise, it uses UCB1).</p>
        <p>An example of such a policy is the ε-greedy policy, which picks the current best move with probability
ε and each other move with probability (1 − ε)/(k − 1), where k is the number of moves. Unlike UCB1, it
does not require every child of a node to be tried once before it can be applied, allowing early exploitation.</p>
      </sec>
      <sec id="sec-4-8">
        <title>4.3. VOI-aware MCTS [<xref ref-type="bibr" rid="ref17">17</xref>]</title>
        <p>Like SR+CR MCTS, this variant tries to minimise simple regret when picking actions at the root. It
does this by approximating the value of information (VOI) that can be gained from choosing different
actions. As this is impossible to compute exactly, the authors used the myopic assumption that the
algorithm will only sample one of the available actions and used it to create the following equations for
approximating the VOI of different nodes:</p>
        <p>VOI_α ≈ (X̄_β / (n_α + 1)) · exp(−2(X̄_α − X̄_β)² n_α) (4)</p>
        <p>VOI_i ≈ ((1 − X̄_α) / (n_i + 1)) · exp(−2(X̄_α − X̄_i)² n_i), i ≠ α (5)</p>
        <p>where α = argmax_i X̄_i and β = argmax_{i≠α} X̄_i, that is, they denote the current best and second best
actions, respectively. The node with the highest VOI is always selected at the root and UCB1 is used
otherwise.</p>
        <p>For the details of the derivation of these equations, please see the original paper, as it is beyond the
scope of this work.</p>
      </sec>
      <sec id="sec-4-9">
        <title>4.4. UCB1-Tuned MCTS [<xref ref-type="bibr" rid="ref18">18</xref>]</title>
        <p>UCB1-Tuned is a formula that provides tighter bounds on the uncertainty of observations than UCB1
by using an estimate of the rewards’ variance. This variant uses it as its tree policy.</p>
      </sec>
      <sec id="sec-4-10">
        <title>4.5. Sigmoid MCTS [<xref ref-type="bibr" rid="ref19">19</xref>]</title>
        <p>This variant combines two common ways of evaluating playouts - win-or-lose and final score. The
former is the one used in our UCT implementation. In the latter, some more complex metric is used,
which gives a more nuanced evaluation of state quality. To combine them, the authors apply a sigmoid
function to the final score x:</p>
        <p>s(x) = 1 / (1 + e^(−kx)) (6)</p>
        <p>This function maps the value to something between win-or-lose and final score. How close it is to
either is determined by the constant parameter k.</p>
      </sec>
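      <p>A minimal sketch of the sigmoid squashing from Equation (6), with our own function name; k is the constant parameter from the paper:</p>
      <preformat>
```python
import math

def sigmoid_eval(final_score, k=1.0):
    # squashes an unbounded final score into (0, 1); a large k approaches
    # a hard win-or-lose signal, a small k keeps the score's nuance
    return 1.0 / (1.0 + math.exp(-k * final_score))
```
      </preformat>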
      <sec id="sec-4-2">
        <title>4.6. Relative Bonus MCTS [20]</title>
        <p>Instead of just backpropagating an evaluation of a playout’s final state, this variant applies a bonus to it
computed from the playout’s length and the depth at which it starts. The idea is that the longer the playout,
the less reliable the information obtained. A sigmoid function, the shape of which can again be altered using
a parameter labelled k, is used when computing the bonus to bound and shape its values.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.7. Qualitative Bonus MCTS [20]</title>
        <p>Like Relative Bonus MCTS, but the bonus is derived from the quality of the last state of the playout.</p>
      </sec>
      <sec id="sec-4-4">
        <title>4.8. Relative-Qualitative Bonus MCTS [20]</title>
        <p>A combination of the previous two variants - both bonuses are computed and added to the playout
score. The sigmoid functions for both bonuses use the same value of k.</p>
      </sec>
      <sec id="sec-4-12">
        <title>4.9. MCTS-HP</title>
        <p>Like Sigmoid MCTS and Qualitative Bonus MCTS, this variant utilizes some more nuanced metric for
state quality evaluation. The score obtained from the evaluation is then backpropagated and normalised
at every node by the maximum possible value of the metric in that node.</p>
      </sec>
      <sec id="sec-4-5">
        <title>4.10. Feedback Adjustment Policy MCTS [21]</title>
        <p>This variant partitions playouts into groups based on the time at which they were computed (the later,
the better) and assigns a multiplicative factor to each group (every playout from a group is counted as
if it were that many playouts). The partitioning is based on the assumption that playouts that are performed
later offer more information.</p>
        <p>The authors suggested two schemes for both the partitioning of playouts and the computation of
multiplicative factors - linear (lin) and exponential (exp). The schemes for the two tasks are independent
(i.e. one can use a linear scheme for partitioning and an exponential scheme to compute the multiplicative
factors, or vice versa).</p>
        <p>It is important to note that, unlike the other variants, this one relies on knowing the number of
playouts beforehand - information that is not available when the algorithm’s execution is bounded by a
time limit.</p>
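        <p>The grouping and factor schemes can be sketched as follows; the number of groups, the exact boundaries, and the factor values are illustrative assumptions of ours, as the variant only specifies that both the segmentation and the multiplication can be linear (lin) or exponential (exp).</p>
        <preformat>
```python
def fap_factor(playout_index, total_playouts, groups=4,
               segmentation="lin", multiplication="exp"):
    # position of the playout within the (known) total playout budget
    frac = playout_index / total_playouts
    if segmentation == "lin":
        group = min(int(frac * groups), groups - 1)
    else:
        # exp segmentation: group boundaries at 1/2, 3/4, 7/8, ... of the run
        group = 0
        while group + 1 != groups and frac >= 1.0 - 0.5 ** (group + 1):
            group += 1
    if multiplication == "lin":
        return group + 1   # factors 1, 2, 3, ...
    return 2 ** group      # factors 1, 2, 4, ...
```
        </preformat>
        <p>A playout's score is then backpropagated with this factor, so later playouts carry more weight.</p>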
      </sec>
      <sec id="sec-4-6">
        <title>4.11. Weighted Propagation MCTS</title>
        <p>Our proposed variant, Weighted Propagation (WP) MCTS, is based on the idea that if a good heuristic
for evaluating states is available, it should be possible to extract more useful information out of a playout
than just the value of its final state by taking into account how the value of states evolved through time.</p>
        <p>To put this formally, a playout is a sequence of states s_1, s_2, ..., s_n. In standard MCTS, the evaluation
function is of the form f(s_n). We propose to use a function that takes into account all the encountered
states - f(s_1, s_2, ..., s_n). In this paper, we specifically use</p>
        <p>w(i, n) = (w_1(i) + w_2(i, n)) / 2 (7)</p>
        <p>w_1(i) = min(b_1^(i−1), 10000) (8)</p>
        <p>w_2(i, n) = min(b_2^(n−i), 10000) (9)</p>
        <p>f(s_1, s_2, ..., s_n) = (Σ_{i=1}^{n} w(i, n) · h(s_i)) / (u · Σ_{i=1}^{n} w(i, n)) (10)</p>
        <p>where h(s_i) is a heuristic function used to evaluate states, w(i, n) is a weight function computed from
the position of the state in the playout, b_1 and b_2 are parameters of the algorithm, and u is an upper
bound on the heuristic value of a state. The upper bound is computed every time the algorithm is
started as the maximum heuristic value of the state if one player’s pieces/units are discarded. This
corresponds to the player destroying their opponent without suffering any losses and ensures that the
values are always between 0 and 1.</p>
        <p>Our weight function takes into account two criteria - first, the later a state occurs in a playout, the
less probable it is to actually be encountered. Second, the earlier a state occurs in a playout, the less
information it gives us about how the game can evolve. Our function can therefore be separated into
two functions, which are then averaged.</p>
        <p>Both of these functions are exponentials with the base determined by a parameter of the algorithm
and the exponent determined by the data.</p>
        <p>The 10000 upper bound on the weights was added as small bases could otherwise cause overflow.</p>
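        <p>The weighted evaluation described above can be sketched as follows; the heuristic h and the upper bound u are environment-specific parameters, and the function itself is our own illustration.</p>
        <preformat>
```python
def wp_eval(states, h, u, b1=10.0, b2=10.0, cap=10000.0):
    # states: the playout s_1 ... s_n; h: state heuristic; u: upper bound on h
    n = len(states)
    def w(i):
        w1 = min(b1 ** (i - 1), cap)  # favours late states (more informative)
        w2 = min(b2 ** (n - i), cap)  # favours early states (more likely reached)
        return (w1 + w2) / 2.0
    weights = [w(i) for i in range(1, n + 1)]
    total = sum(weights[i - 1] * h(states[i - 1]) for i in range(1, n + 1))
    # dividing by u times the weight sum keeps the result between 0 and 1
    return total / (u * sum(weights))
```
        </preformat>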
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Test Environments</title>
      <p>We implemented and tested all presented MCTS variants in 3 environments with different characteristics:
chess (turn-based, small branching factor), Children of the Galaxy (turn-based, large branching factor),
and µRTS (real-time, large branching factor).</p>
      <p>The code for these environments, as well as for some auxiliary programs we used, can be found in
our GitHub repository (https://github.com/peter-guba/MCTSTesting).</p>
      <table-wrap id="fig1">
        <label>Figure 1</label>
        <caption>
          <p>The abbreviations of the tested variants and the parameter values chosen for chess (X indicates there were no parameters to set).</p>
        </caption>
        <table>
          <thead>
            <tr><th>Variant</th><th>Abbreviation</th><th>Chess</th></tr>
          </thead>
          <tbody>
            <tr><td>Feedback Adjustment Policy MCTS</td><td>FAP</td><td>segmentation: exp, multiplication: lin, 100</td></tr>
            <tr><td>MCTS-HP</td><td>HP</td><td>X</td></tr>
            <tr><td>Qualitative Bonus MCTS</td><td>QB</td><td>k = 100</td></tr>
            <tr><td>Relative Bonus MCTS</td><td>RB</td><td>k = 10</td></tr>
            <tr><td>Relative-Qualitative Bonus MCTS</td><td>RQB</td><td>k = 1</td></tr>
            <tr><td>Sigmoid MCTS</td><td>Sig</td><td>k = 1</td></tr>
            <tr><td>SR+CR MCTS</td><td>SR</td><td>policy: ε-greedy, ε = 0.75</td></tr>
            <tr><td>UCT</td><td>UCT</td><td>X</td></tr>
            <tr><td>UCB1-Tuned MCTS</td><td>U-T</td><td>X</td></tr>
            <tr><td>VOI-aware MCTS</td><td>VOI</td><td>X</td></tr>
            <tr><td>WP MCTS</td><td>WP</td><td>b_1 = 10, b_2 = 10</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <sec id="sec-5-0">
        <title>5.1. Chess</title>
        <p>We assume that chess is a well-enough-known game to need no introduction. We will therefore only
explain the way we implemented the final score heuristic mentioned in Section 4.5 and also used in
Sections 4.7, 4.8, 4.9 and 4.11. All pieces are assigned static values, which are summed up for every
player, and the difference between these two sums is computed. We assigned the following values:
100 for a pawn, 300 for a knight, 320 for a bishop, 500 for a rook, 900 for the queen and 1200 for the king.</p>
      </sec>
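      <p>The chess heuristic amounts to a material difference; a direct sketch with the piece values listed above:</p>
      <preformat>
```python
# static piece values: pawn, knight, bishop, rook, queen, king
PIECE_VALUES = {"P": 100, "N": 300, "B": 320, "R": 500, "Q": 900, "K": 1200}

def material_score(own_pieces, opponent_pieces):
    # sum the values per player and return the difference
    own = sum(PIECE_VALUES[p] for p in own_pieces)
    opp = sum(PIECE_VALUES[p] for p in opponent_pieces)
    return own - opp
```
      </preformat>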
      <sec id="sec-5-1">
        <title>5.2. Children of the Galaxy (CotG)</title>
        <p>CotG belongs to a subgenre of strategy games called 4X games. In the game, the player tries to conquer
a galaxy using various means, from technological research to military expansion. It can be divided into
discrete subtasks, of which we target only one - combat between units. This is the most critical part of
CotG, as any higher-level decision making (where to attack, where to defend, etc.) is bounded by the
quality of combat outcome estimations.</p>
        <p>Like chess, CotG is divided into discrete turns. Its matches are carried out on a hexagonal grid and
involve units that the player can manipulate. Each unit has a certain number of hit points (HP) as well
as an attack range. Once an enemy unit is within a unit’s attack range, the unit can attack it, thereby
decreasing its HP. When this reaches 0, it is removed from the game. The game ends when at least one
of the players has no more units. In each turn, the player can move as many units as they like, but each
unit only has a limited distance by which it can move in one turn.</p>
        <p>For our experiments, we used a CotG combat simulator created by Šmejkal [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. The final score heuristic implementation here is similar to the one we used in chess, except
instead of piece values we use the remaining HP of units.</p>
      </sec>
      <sec id="sec-5-2">
        <title>5.3. µRTS [<xref ref-type="bibr" rid="ref22">22</xref>]</title>
        <p>µRTS is a simple real-time strategy (RTS) game implementation created by Ontañón and intended for
reinforcement learning experiments that involve RTS games. The game is played in real time and,
similar to CotG, requires balancing between various tasks, such as resource gathering, fighting, and
research. Again, each of these tasks can be controlled by a separate system, therefore we only apply the
tested algorithms to combat, which in this game takes place on a square grid. Like in CotG, it involves
the player moving units that can attack one another, and the game ends when at least one of the players
has no more units.</p>
        <p>Due to its real-time nature, we had to alter our algorithms to take action durations into account. We
did this using the approach described by Churchill and Buro in [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ].</p>
        <p>The final score heuristic used here is the same as the one used in CotG.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Script-based search</title>
      <p>
        Strategy games where players battle using large numbers of units are known to have large branching
factors, making them very challenging for algorithms like MCTS, which rely on traversing a substantial
portion of the game tree. One technique for addressing this is restricting possible unit actions using
scripts that process the current game state and output a small number of actions (usually only one) for
a given unit to execute next [
        <xref ref-type="bibr" rid="ref23 ref24">23, 24</xref>
        ]. We chose this approach instead of using an adaptation of MCTS,
such as NaiveMCTS [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], as we wanted to keep the algorithms consistent across all environments. For
our scripts, we chose No-Overkill-Attack-Value (NOKAV) and Kiter, as these are commonly used for
creating these kinds of action abstractions.
      </p>
      <p>In CotG, the scripts are chosen at every turn and then discarded. In µRTS, they are followed until
they are either completed or determined impossible to follow further (for example if a unit’s target is
destroyed).</p>
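      <p>As a rough illustration of such a script, a No-Overkill-Attack-Value style target choice might look as follows; the value and hit-point bookkeeping are our own simplifying assumptions, not the exact rules used in the cited implementations.</p>
      <preformat>
```python
def nokav_target(unit_damage, targets):
    # targets: list of (value, remaining_hp) pairs; ignore targets already
    # assigned lethal damage this turn (avoiding "overkill")
    live = [t for t in targets if t[1] > 0]
    if not live:
        return None
    # prefer targets this attack can finish off, then pick the most valuable
    killable = [t for t in live if unit_damage >= t[1]]
    pool = killable if killable else live
    return max(pool, key=lambda t: t[0])
```
      </preformat>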
    </sec>
    <sec id="sec-7">
      <title>7. Experiments</title>
      <p>Our aim was to determine whether some of the discussed MCTS variants could be said to perform
better than others. To do this, we created tests in which we pitted them against each other and, for
each game, ran them in a round-robin tournament while gathering data about their performance. In the
following subsections, we describe how our tests and whole experiment were designed, the specific
settings, values, and tools we used, and finally, we present our results.</p>
      <sec id="sec-7-1">
        <title>7.1. Test Design</title>
        <p>Our tests consisted of a series of 1 vs 1 matches. Each test was parameterised by two algorithms with
specified parameter settings and a playout setting that was identical for both algorithms. We selected 3
playout settings - 1000, 5000, and 10000.</p>
        <p>The same tests were run in each environment with a limit on the number of rounds (for chess and
CotG) or game ticks (for µRTS) set to 1000. We played 18 matches per test.</p>
        <p>In CotG and µRTS, these matches were separated into 3 categories based on the number of units that
each team had at its disposal - 4 vs 4, 8 vs 8, and 16 vs 16 - and multiple matches were conducted in
each category to get more reliable results. We refer to the number of units as the combat setting. In
chess, we ran the same number of matches as in the other two environments, but all were started from
the default chess starting position.</p>
        <p>The reason why we decided to restrict playouts rather than time was that such results are independent
of the underlying software (the operating system and its configuration) and hardware they are run on.
Any paper published now about MCTS that restricts algorithm runtime will be outdated in ten years,
while results based on playout restrictions will remain replicable. This approach should also ensure
that the agents perform the same on all computers.</p>
        <p>It could be argued that this unfairly advantages our algorithm, as it should be one of the most
computationally demanding ones. We therefore conducted an extra set of experiments, identical to the
ones described above, except with varying numbers of playouts for WP MCTS.</p>
      </sec>
      <sec id="sec-7-2">
        <title>7.2. Experiment Design</title>
        <p>Due to time constraints and the problem of combinatorial expansion, we had to limit the number of
experiments. We started by listing all the possible algorithm settings we wanted to try (by algorithm
setting, we mean an algorithm with all its parameters specified), producing 117. We then tested each of
these against UCT, which, as mentioned in Section 4.1, can be considered the default implementation of
MCTS. We ran these tests in each of our testing environments. Based on the results, we picked one
algorithm setting for every MCTS variant and every test environment (so the algorithm settings used
in different environments were allowed to differ) and tested every possible pair.</p>
        <p>In the second round of testing, we ran 56 repetitions of every test in every environment. This gave
us 3 environments * 3 playout settings * 55 algorithm pairings * 56 repetitions = 27,720 tests (498,960
matches) in total.</p>
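        <p>The totals above follow directly from the eleven tested variants; a quick check of the arithmetic:</p>
        <preformat>
```python
from math import comb

pairings = comb(11, 2)         # 55 unordered pairings of the 11 variants
tests = 3 * 3 * pairings * 56  # environments * playout settings * pairings * repetitions
matches = tests * 18           # 18 matches per test
```
        </preformat>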
      </sec>
      <sec id="sec-7-3">
        <title>7.3. Experiment Specification</title>
        <p>The chosen parameter values for each algorithm are shown in Figure 1, as well as the abbreviations
that we will be using. X indicates there were no parameters to set.</p>
        <p>
          Our experiments were run using MetaCentrum [
          <xref ref-type="bibr" rid="ref25">25</xref>
          ] - a Czech virtual organisation that offers resources
for grid computing. We created test files, each of which contained 3 tests with the same algorithms but
different playout settings, and assigned each of the tests one processor and up to 85 GB of memory,
depending on the testing environment. The chess and CotG testing environments were written in C#
targeting .NET 6. µRTS was written in Java using JDK 8.
        </p>
        <p>We now go over experiment details specific to each environment.</p>
        <sec id="sec-7-3-1">
          <title>7.3.1. Chess</title>
          <p>All games were started from the default chess starting position. The player who moved first alternated
between matches. A match was considered finished when one of the standard chess terminating
conditions occurred or when the round limit was reached. If the match terminated without either of
the kings being captured, it was labelled a draw.</p>
        </sec>
        <sec id="sec-7-3-2">
          <title>7.3.2. CotG</title>
          <p>We used two types of units - Destroyers and Battleships. The former is a unit with high damage but
short attack range and low HP, while the latter has medium attack range, lower damage, but high HP
and shields. While the game offers other units, these two provided enough diversity for our purposes.</p>
          <p>In every combat setting, half of the units were always of one type and half of the other. Their positions
were randomly generated for one team and for each combat setting beforehand, symmetrically assigned
for the other team, and then remained constant.</p>
          <p>The first player alternated between matches. A match was considered over either when all the units
of at least one of the players were destroyed, or the round limit was exceeded. Unlike in chess, exceeding
this limit here did not necessarily mean a draw. Instead, the winner was decided based on the remaining
HP of the players - the player with more HP was labelled the winner.</p>
        </sec>
      </sec>
      <sec id="sec-7-3-3">
        <title>7.3.3. µRTS</title>
        <p>The setup in this environment was quite similar to CotG. We also used two types of units - heavy
(slower, higher damage) and light (faster, lower damage). Half the units were again assigned each type,
and their positions were randomly generated for one side, symmetrically assigned for the other and
then remained constant.</p>
        <p>As both players move at the same time in µRTS, there was no switching of who got to go first.</p>
        <p>The terminal conditions and way of determining the winner were the same as in CotG.</p>
      </sec>
      <sec id="sec-7-4">
        <title>7.4. Results</title>
        <p>For all three testing environments, we show the numbers of wins of every algorithm against every other
algorithm. For CotG and µRTS, we also show the accumulated numbers of wins for every algorithm
and combat setting. We omit splitting the results by number of playouts, as that did not show any
significant differences in algorithm rankings.</p>
        <sec id="sec-7-4-3">
          <title>7.4.1. Chess</title>
          <p>The number of wins scored by algorithms against one another in chess can be seen in Figure 2. As
there were no draws, we omit showing them. There are two notable performers. The first is Sigmoid
MCTS, which always performed poorly. This is unsurprising, as, unlike in CotG and µRTS, the winning
condition here is not eliminating all the opponent’s pieces. The heuristic that we used here therefore
misleads the algorithm.</p>
          <p>The other is WP MCTS, which achieved a high win rate against every other variant. Other
well-performing variants are MCTS-HP and FAP MCTS. It is interesting to note that WP MCTS actually
scored better against these than against some overall worse-performing ones, including UCT. It may be
that the moves these variants choose are more similar to those WP MCTS chooses, and it is therefore
able to model them better.</p>
          <p>Also noteworthy is the fact that, although every algorithm except Sigmoid MCTS beat UCT in
one-on-one matches, some variants, like Qualitative Bonus MCTS and VOI-aware MCTS, scored fewer
wins overall.</p>
        </sec>
        <sec id="sec-7-4-4">
          <title>7.4.2. CotG</title>
          <p>The top part of Figure 3 shows the number of wins scored by every algorithm against every other
algorithm in CotG. The data here is more evenly distributed than in chess, which is expected, as there is
often less of a clear distinction between optimal and suboptimal moves in CotG. However, like in chess,
there were no draws.</p>
          <p>The worst-performing variants are UCT, UCB1-Tuned MCTS, and VOI-aware MCTS. In the case of
UCT, this is to be expected, as every other variant was designed to be an improvement over it. The
best-performing variants, on the other hand, are WP MCTS, MCTS-HP, and Sigmoid MCTS. These rely
on the final score metric described in Section 5.2, which evidently provides much useful information in
this context.</p>
          <p>The bottom part of Figure 3 shows the number of wins of each algorithm in each combat setting. In
the first two, the results are quite evenly distributed. This indicates that if the algorithms can sufficiently
explore the search tree in a game with many equally good moves, their differences diminish. In the
last combat setting, the differences suddenly become much more pronounced. We think this is because the
search trees are so large that, in the beginning, the algorithms cannot even properly explore their first
levels; therefore, the ones that can extract the most information out of a single playout and the ones
that do not need to finish searching the first level before proceeding to the next fare much better.</p>
        </sec>
        <sec id="sec-7-4-5">
          <title>7.4.3. µRTS</title>
          <p>In µRTS, the results were even more evenly spaced than in CotG. As seen in the top part of Figure 5,
the difference between the worst result and the best one is only 170 points, far smaller than in CotG
(879) or chess (2302). What’s more, unlike in chess and CotG, some number of matches ended in draws
in this setting, as shown in the bottom part of Figure 5. This could be due to the persistence of script
assignments mentioned in Section 6 - since scripts in µRTS are followed until they are completed or
deemed impossible to follow further, it is possible that they are picked much less frequently, which
means smaller search trees, which can be searched faster and allow for less strategising.</p>
          <p>The data in Figure 4 is quite evenly spaced in all combat settings, unlike in CotG. The ordering of the
algorithms also differs greatly from one combat setting to the next, leading us to believe that the tested
MCTS variants are quite equal, but perform slightly better or worse depending on conditions such as
the initial unit positions.</p>
          <p>Our algorithm performed poorly in this environment, scoring the lowest number of wins overall. The
reasons for this are unclear, especially in the 16 vs 16 combat setting, where MCTS-HP and Sigmoid
MCTS - two other variants that use the same heuristic for evaluating states as WP MCTS - performed
the best.</p>
          <p>One possibility is that the heuristic we use is biased, and in smaller search trees, which the other
algorithms can sufficiently explore, they can find better moves. If that were the case, however, we
would expect to see the same behaviour in CotG, which was not observed. More research is required to
identify the cause with certainty.</p>
        </sec>
        <sec id="sec-7-4-1">
          <title>7.4.4. Extra WP MCTS tests</title>
          <p>Figures 6 and 7 show the results of WP MCTS in the extra tests that we ran to see how well it would
fare with a reduced number of playouts (against the other algorithms with 10,000).</p>
          <p>In chess, cutting the number of playouts down to 5,000 did not lead to a significant decrease in win
rate. At 1,000 playouts, the algorithm’s performance clearly declined, but it nonetheless won more times
than it lost in every pairing.</p>
          <p>In CotG, giving the algorithm 5,000 playouts actually produced an improvement in some cases, though
this could be a random fluctuation. Only two of the other variants managed to outperform it at this setting.
At 1,000 playouts, the algorithm’s win rate decreased for most pairings, though it still outperformed three
out of the 10 tested variants.</p>
          <p>In µRTS, performance was quite uniform regardless of the number of playouts, and the algorithm
actually fared better at 1,000 playouts than at 10,000. This further supports our hypothesis that the
search trees here are smaller and offer fewer opportunities for strategising.</p>
        </sec>
        <sec id="sec-7-4-2">
          <title>7.4.5. Overall assessments</title>
          <p>• FAP MCTS - One of the most robust variants, though never one of the best-performing ones. It does
not seem to perform well in combat settings with a small number of units, though this could be due to
us using the same parameter values in all combat settings.</p>
          <p>• MCTS-HP - Also a very robust variant; it performed very well in every testing environment.</p>
          <p>• Qualitative Bonus MCTS - One of the worst-performing variants; it did not achieve any noteworthy
results. This is surprising because it uses the same metrics for evaluating game states as MCTS-HP,
WP MCTS and Sigmoid MCTS, all of which outperformed it significantly.</p>
          <p>• Relative Bonus MCTS - An average-performing variant that also did not achieve any noteworthy
results.</p>
          <p>• Relative-Qualitative Bonus MCTS - Also average-performing. Strangely, it often performed
slightly worse than Relative Bonus MCTS alone. Perhaps having separate sigmoid functions for the
two bonuses could provide an improvement.</p>
          <p>• Sigmoid MCTS - Due to the heuristic used to evaluate states, it did not perform well in chess, but
otherwise it achieved a high win rate. Still, if a heuristic for judging state quality is available,
MCTS-HP or WP MCTS seem like better choices.</p>
          <p>• SR+CR MCTS - The most robust variant. Going by the total number of wins, it was among the
top 4 in every environment. It was also the only variant for which we used the same settings in
every environment after the initial round of testing.</p>
          <p>• UCB1-Tuned MCTS - Usually achieved results very similar to UCT. As it differs only by a slight
adjustment of the UCB formula, this is unsurprising.</p>
          <p>• VOI-aware MCTS - Performed surprisingly poorly, given that its motivation is the same as that of
SR+CR MCTS and that it even outperformed that variant in the original paper. Its disadvantage
may lie in the fact that, like the other variants, it needs to fully expand the root node before
moving on to the next level of the tree.</p>
          <p>• WP MCTS - Performed the best in chess and CotG and the worst in µRTS. However, as the range of
results achieved in µRTS was quite small, we do not consider this a substantial failure.</p>
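          <p>The "slight adjustment" separating UCB1-Tuned from UCT's UCB1 rule can be stated directly. The sketch below gives the standard forms of both selection formulas from the bandit literature; it is not code from our implementation, and the function names are ours:</p>

```python
import math

def ucb1(mean, n_child, n_parent, c=math.sqrt(2)):
    # Classic UCB1: empirical mean plus a fixed-constant
    # exploration bonus that shrinks as the child is visited.
    return mean + c * math.sqrt(math.log(n_parent) / n_child)

def ucb1_tuned(mean, variance, n_child, n_parent):
    # UCB1-Tuned replaces the fixed constant with a variance-based
    # bound, capped at 1/4 (the maximum variance of a [0, 1] reward).
    v = variance + math.sqrt(2 * math.log(n_parent) / n_child)
    return mean + math.sqrt((math.log(n_parent) / n_child) * min(0.25, v))
```

          <p>Because the two formulas coincide except for the size of the exploration bonus, very similar tree policies (and hence very similar results, as observed above) are to be expected.</p>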
        </sec>
      </sec>
    </sec>
    <sec id="sec-8">
      <title>8. Conclusions and Future Work</title>
      <p>In this paper, we have compared 11 different variants of the Monte Carlo Tree Search algorithm in three
different environments. One of these was a novel algorithm called WP MCTS, which proved to be the
best-performing algorithm in two of the environments.</p>
      <p>As the environments used in this paper are freely available on our GitHub, they can be used by other
researchers, and their results will be comparable to ours.</p>
      <p>For future work, we might further explore aspects of these algorithms, such as their behaviour in
situations where they control even more units, their performance in other environments, or how the
optimal algorithm settings change with different numbers of playouts. We would also like to further
investigate the lack of variability of our results in µRTS.</p>
      <p>Regarding our algorithm, we would like to look into how well it performs when equipped with different
weight and evaluation functions, as well as further explore the idea of getting the most out of playouts
by analysing all of their states.</p>
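      <p>The idea of analysing all playout states can be sketched as a weighted average of per-state evaluations. The function below is purely illustrative: the exponential decay weighting and the name <monospace>weighted_playout_value</monospace> are assumptions for this sketch, not WP MCTS's actual weight function, which is defined earlier in the paper:</p>

```python
def weighted_playout_value(state_values, decay=0.9):
    """Illustrative weighting scheme: aggregate the heuristic values
    of every state visited during a playout, weighting states closer
    to the playout's end more heavily (they are less speculative)."""
    n = len(state_values)
    # Earliest state gets weight decay**(n-1), the last state weight 1.
    weights = [decay ** (n - 1 - i) for i in range(n)]
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, state_values)) / total
```

      <p>Different decay rates (or entirely different weight functions) trade off how much an early, uncertain advantage contributes relative to the playout's final outcome, which is exactly the design space the paragraph above proposes to explore.</p>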
    </sec>
    <sec id="sec-9">
      <title>Acknowledgments</title>
      <p>Computational resources were provided by the e-INFRA CZ project (ID:90254), supported by the
Ministry of Education, Youth and Sports of the Czech Republic.</p>
    </sec>
    <sec id="sec-10">
      <title>Declaration on Generative AI</title>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>G. V. D.</given-names>
            <surname>Broeck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Driessens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ramon</surname>
          </string-name>
          ,
          <article-title>Monte-Carlo Tree Search in Poker using expected reward distributions</article-title>
          ,
          <source>in: Asian Conference on Machine Learning</source>
          ,
          <year>2009</year>
          , pp.
          <fpage>367</fpage>
          -
          <lpage>381</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>T.</given-names>
            <surname>Kozelek</surname>
          </string-name>
          ,
          <article-title>Methods of MCTS and the game Arimaa</article-title>
          , Master's thesis, Charles University,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>T.</given-names>
            <surname>Pepels</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. H.</given-names>
            <surname>Winands</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lanctot</surname>
          </string-name>
          ,
          <article-title>Real-time Monte Carlo Tree Search in Ms Pac-Man</article-title>
          ,
          <source>IEEE Transactions on Computational Intelligence and AI in Games</source>
          <volume>6</volume>
          (
          <year>2014</year>
          )
          <fpage>245</fpage>
          -
          <lpage>257</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>D.</given-names>
            <surname>Silver</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Maddison</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Guez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Sifre</surname>
          </string-name>
          , G. Driessche,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schrittwieser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Antonoglou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Panneershelvam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lanctot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Dieleman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Grewe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Nham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Kalchbrenner</surname>
          </string-name>
          , I. Sutskever,
          <string-name>
            <given-names>T.</given-names>
            <surname>Lillicrap</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Leach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kavukcuoglu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Graepel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Hassabis</surname>
          </string-name>
          ,
          <article-title>Mastering the game of Go with deep neural networks and tree search</article-title>
          ,
          <source>Nature</source>
          <volume>529</volume>
          (
          <year>2016</year>
          )
          <fpage>484</fpage>
          -
          <lpage>489</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>D.</given-names>
            <surname>Silver</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Hubert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schrittwieser</surname>
          </string-name>
          , I. Antonoglou,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Guez</surname>
          </string-name>
          , . . . ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Hassabis</surname>
          </string-name>
          ,
          <article-title>Mastering chess and shogi by self-play with a general reinforcement learning algorithm</article-title>
          ,
          <source>arXiv preprint arXiv:1712.01815</source>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>S.</given-names>
            <surname>Ba</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Hiraoka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Onishi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Nakata</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Tsuruoka</surname>
          </string-name>
          ,
          <article-title>Monte Carlo Tree Search with variable simulation periods for continuously running tasks</article-title>
          ,
          <source>2019 IEEE 31st International Conference on Tools with Artificial Intelligence</source>
          (
          <year>2019</year>
          )
          <fpage>416</fpage>
          -
          <lpage>423</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>R.</given-names>
            <surname>Coulom</surname>
          </string-name>
          ,
          <article-title>Efficient selectivity and backup operators in Monte-Carlo Tree Search</article-title>
          , in: International conference on computers and games, volume
          <volume>26</volume>
          ,
          <year>2006</year>
          , pp.
          <fpage>72</fpage>
          -
          <lpage>83</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>C. B.</given-names>
            <surname>Browne</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Powley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Whitehouse</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Lucas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. I.</given-names>
            <surname>Cowling</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rohlfshagen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Colton</surname>
          </string-name>
          ,
          <article-title>A survey of Monte Carlo Tree Search methods</article-title>
          ,
          <source>IEEE Transactions on Computational Intelligence and AI in games 4</source>
          (
          <year>2012</year>
          )
          <fpage>1</fpage>
          -
          <lpage>43</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Świechowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Godlewski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Sawicki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mańdziuk</surname>
          </string-name>
          ,
          <article-title>Monte Carlo Tree Search: A review of recent modifications and applications</article-title>
          ,
          <source>Artificial Intelligence Review</source>
          <volume>56</volume>
          (
          <year>2023</year>
          )
          <fpage>2497</fpage>
          -
          <lpage>2562</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>S.</given-names>
            <surname>Ontanón</surname>
          </string-name>
          ,
          <article-title>The combinatorial Multi-Armed Bandit problem and its application to real-time strategy games</article-title>
          ,
          <source>in: Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment</source>
          , volume
          <volume>9</volume>
          ,
          <year>2013</year>
          , pp.
          <fpage>58</fpage>
          -
          <lpage>64</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>P.</given-names>
            <surname>Šmejkal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gemrot</surname>
          </string-name>
          ,
          <article-title>Engaging turn-based combat in the Children of the Galaxy videogame</article-title>
          ,
          <source>in: Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment</source>
          , volume
          <volume>14</volume>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>F.</given-names>
            <surname>Frydenberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. R.</given-names>
            <surname>Andersen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Risi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Togelius</surname>
          </string-name>
          ,
          <article-title>Investigating MCTS modifications in general video game playing</article-title>
          ,
          <source>IEEE Conference on Computational Intelligence and Games</source>
          (
          <year>2015</year>
          )
          <fpage>107</fpage>
          -
          <lpage>113</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>D. J.</given-names>
            <surname>Soemers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. F.</given-names>
            <surname>Sironi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Schuster</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. H.</given-names>
            <surname>Winands</surname>
          </string-name>
          ,
          <article-title>Enhancements for real-time Monte-Carlo Tree Search in general video game playing</article-title>
          ,
          <source>IEEE Conference on Computational Intelligence and Games</source>
          (
          <year>2016</year>
          )
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Tak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lanctot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. H.</given-names>
            <surname>Winands</surname>
          </string-name>
          ,
          <article-title>Monte Carlo Tree Search variants for simultaneous move games</article-title>
          ,
          <source>2014 IEEE Conference on Computational Intelligence and Games</source>
          (
          <year>2014</year>
          )
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>C. H.</given-names>
            <surname>Hsueh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. C.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. J.</given-names>
            <surname>Tseng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. J.</given-names>
            <surname>Yen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. C.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <article-title>An analysis for strength improvement of an MCTS-based program playing chinese dark chess</article-title>
          ,
          <source>Theoretical Computer Science</source>
          <volume>644</volume>
          (
          <year>2016</year>
          )
          <fpage>63</fpage>
          -
          <lpage>75</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>L.</given-names>
            <surname>Kocsis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Szepesvári</surname>
          </string-name>
          ,
          <article-title>Bandit based Monte-Carlo planning</article-title>
          ,
          <source>in: European conference on machine learning</source>
          ,
          <year>2006</year>
          , pp.
          <fpage>282</fpage>
          -
          <lpage>293</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>D.</given-names>
            <surname>Tolpin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Shimony</surname>
          </string-name>
          ,
          <article-title>MCTS based on simple regret</article-title>
          ,
          <source>in: Proceedings of the AAAI Conference on Artificial Intelligence</source>
          , volume
          <volume>26</volume>
          ,
          <year>2012</year>
          , pp.
          <fpage>570</fpage>
          -
          <lpage>576</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>P.</given-names>
            <surname>Auer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Cesa-Bianchi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Fischer</surname>
          </string-name>
          ,
          <article-title>Finite-time analysis of the Multiarmed Bandit problem</article-title>
          ,
          <source>Machine learning 47</source>
          (
          <year>2002</year>
          )
          <fpage>235</fpage>
          -
          <lpage>256</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>K.</given-names>
            <surname>Shibahara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Kotani</surname>
          </string-name>
          ,
          <article-title>Combining final score with winning percentage by sigmoid function in Monte-Carlo simulations</article-title>
          ,
          <source>in: 2008 IEEE Symposium On Computational Intelligence and Games</source>
          ,
          <year>2008</year>
          , pp.
          <fpage>183</fpage>
          -
          <lpage>190</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>T.</given-names>
            <surname>Pepels</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Tak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lanctot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. H.</given-names>
            <surname>Winands</surname>
          </string-name>
          ,
          <article-title>Quality-based rewards for Monte-Carlo Tree Search simulations</article-title>
          , in: ECAI,
          <year>2014</year>
          , pp.
          <fpage>705</fpage>
          -
          <lpage>710</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>F.</given-names>
            <surname>Xie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <article-title>Backpropagation modification in Monte-Carlo game tree search</article-title>
          ,
          <source>in: 2009 Third International Symposium on Intelligent Information Technology Application</source>
          , volume
          <volume>2</volume>
          ,
          <year>2009</year>
          , pp.
          <fpage>125</fpage>
          -
          <lpage>128</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>S.</given-names>
            <surname>Ontañón</surname>
          </string-name>
          ,
          <article-title>Informed Monte Carlo Tree Search for real-time strategy games</article-title>
          ,
          <source>in: 2016 IEEE Conference on Computational Intelligence and Games (CIG)</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>D.</given-names>
            <surname>Churchill</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Buro</surname>
          </string-name>
          ,
          <article-title>Portfolio Greedy Search and simulation for large-scale combat in StarCraft</article-title>
          ,
          <source>in: Proceedings of the Conference on Computational Intelligence in Games</source>
          ,
          <year>2013</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>D.</given-names>
            <surname>Churchill</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Buro</surname>
          </string-name>
          ,
          <article-title>Hierarchical Portfolio Search: Prismata's Robust AI architecture for games with large search spaces</article-title>
          ,
          <source>in: Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment</source>
          , volume
          <volume>11</volume>
          ,
          <year>2015</year>
          , pp.
          <fpage>16</fpage>
          -
          <lpage>22</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Šustr</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Sitera</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mulač</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ruda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Antoš</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Hejtmánek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Holub</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Salvet</surname>
          </string-name>
          , L. Matyska,
          <article-title>MetaCentrum, the Czech virtualized NGI</article-title>
          ,
          <source>In EGEE Technical Forum</source>
          (
          <year>2009</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>