<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Cameron Browne</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dennis J. N. J. Soemers</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Eric Piette</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Data Science and Knowledge Engineering (DKE) Maastricht University</institution>
          ,
          <addr-line>Bouillonstraat 8-10 Maastricht, 6211 LH</addr-line>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This short paper describes an ongoing research project that requires the automated self-play learning and evaluation of a large number of board games in digital form. We describe the approach we are taking to determine relevant features for biasing MCTS playouts in arbitrary games played on arbitrary geometries. Benefits of our approach include efficient implementation, the potential to transfer learnt knowledge to new contexts, and the potential to explain strategic knowledge embedded in features in human-comprehensible terms.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>The Digital Ludeme Project1 is a five-year research project,
recently launched at Maastricht University, that aims to
model the world’s traditional strategy games in a single,
playable digital database. This database will be used to find
relationships between games and their component ludemes,2
in order to develop a model of the evolution of games
throughout recorded human history and to chart their spread
across cultures worldwide. This paper describes a new
approach to defining strategic features for general games that
we are implementing in order to achieve the project’s aims.</p>
      <sec id="sec-1-1">
        <title>Statement of the Problem</title>
        <p>
          While there exists substantial archaeological evidence for
ancient games (i.e. those played before 500AD) the rule sets
explaining how these games were actually played are
typically lost; our modern understanding of ancient and early
games is based primarily on modern reconstructions that
have proved to be less than reliable
          <xref ref-type="bibr" rid="ref15">(Murray 1952)</xref>
          .
        </p>
        <p>
          A key challenge in this project will be to evaluate
reconstructions of ancient rule sets and to improve them where
possible. The general game system we are developing for
this project, called LUDII, must therefore be able to:
1. Model the full range of traditional strategy games.
2. Play them at a competent human level.
3. Evaluate rule set reconstructions for: a) historical
authenticity, and b) quality of play.
4. Improve rule set reconstructions where needed.
1 Funded by a €2m ERC Consolidator Grant (http://ludeme.eu).
2 Units of game-related information
          <xref ref-type="bibr" rid="ref4">(Borvo 1977)</xref>
          .
        </p>
        <p>We aim to model the 1,000 most influential traditional
games throughout history, each of which may have multiple
interpretations, each of which may require hundreds of
variant rule sets to be tested. This is therefore not just a
mathematical/computational challenge but also a logistical one, as
LUDII may need to be able to learn to play and evaluate up
to a million candidate rule sets over the project’s five years.</p>
        <p>This paper outlines work-in-progress for achieving this
ambitious goal, through the self-play learning of simple
lightweight features that embody strategic knowledge about
the games being modelled. The features we propose are
geometry-independent, facilitating their transfer to new
contexts, and represent basic strategies that may be understood
and explained in human-comprehensible terms.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Geometric Piece Features</title>
      <p>
        The LUDII general game system employs Monte Carlo tree
search (MCTS)
        <xref ref-type="bibr" rid="ref6">(Browne et al. 2012)</xref>
        as its core method
for AI move planning, which has proven to be a
superior approach for general games in the absence of
domain-specific knowledge
        <xref ref-type="bibr" rid="ref10 ref13">(Finnsson and Björnsson 2010)</xref>
        . While
MCTS performance can vary significantly if strictly random
playouts are used
        <xref ref-type="bibr" rid="ref7">(Browne 2013)</xref>
        , the reliability of Monte
Carlo simulations can be improved by biasing playouts with
domain-relevant features.
      </p>
      <p>
        There is a significant amount of prior research that
describes the use of local features or patterns for game-playing
agents tailored towards a specific game. Some of the most
common approaches for generating features are:
Manually constructing features using expert knowledge.
This was, for example, done by
        <xref ref-type="bibr" rid="ref12">Gelly et al. (2006)</xref>
        for the
game of Go, and by
        <xref ref-type="bibr" rid="ref23">Sturtevant and White (2007)</xref>
        for the
game of Hearts.
      </p>
      <p>
        Exhaustively generating all possible patterns up to a
certain restriction (often limited to, e.g., areas of 3×3 cells).
This was, for example, done by
        <xref ref-type="bibr" rid="ref11">Gelly and Silver (2007)</xref>
        for the game of Go, and
        <xref ref-type="bibr" rid="ref14">Lorentz and Zosa IV (2017)</xref>
        for
the game of Breakthrough.
        <xref ref-type="bibr" rid="ref23">Sturtevant and White (2007)</xref>
        also exhaustively generated combinations (for instance
pairs, triples, etc.) of the manually selected features for
the game of Hearts as described above.
      </p>
      <p>
        Supervised learning based on databases of human expert
games. This has been particularly common in the game
of Go
        <xref ref-type="bibr" rid="ref2 ref20 ref21 ref24 ref5 ref8 ref9">(Enderton 1991; Stoutamire 1991; van der Werf et
al. 2003; Bouzy and Chaslot 2005; Stern, Herbrich, and
Graepel 2006; Araki et al. 2007; Coulom 2007)</xref>
        .
      </p>
      <p>
        Other approaches include Gradual Focus (Skowronski,
Björnsson, and Winands 2009), a genetic programming
approach to find useful patterns
        <xref ref-type="bibr" rid="ref10 ref13">(Hoock and Teytaud 2010)</xref>
        ,
and the use of deep convolutional neural networks for
learning relevant features through self-play (Silver et al. 2016).
      </p>
      <p>
        The features we are interested in here are geometric piece
patterns of arbitrary topology. For example, Figure 1 shows
a pattern of pieces that represents a beneficial move for
connection games played on the hexagonal grid. If Black has
just played move 1 to intrude into the virtual connection
of two White pieces (left), then White should reply
immediately with move 2 to complete that connection. Biasing
MCTS playouts to play such moves with higher probability
was found to dramatically improve AI playing strength for
connection games such as Hex and Y
        <xref ref-type="bibr" rid="ref16">(Raiko and Peltonen
2008)</xref>
        .
      </p>
    </sec>
    <sec id="sec-3">
      <title>Feature Definition</title>
      <p>Each feature x defines:
1. A pattern px which specifies one or more elements that
must be present or absent in certain locations in game
states for feature x to hold.
2. An action ax which may be encouraged or discouraged in
game states where x holds.</p>
      <p>For most board games, actions can be sufficiently
specified using one or two integers: a location to move “to”, and
in some games a location to move “from”. For example, in
games such as Hex and Go a single board location is
sufficient to uniquely identify any action in any particular game
state. In games such as Draughts or Chess, the current
location of the piece to be moved is also required.</p>
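As an illustrative sketch of this definition (the class and field names below are our own, not LUDII's actual API), a feature pairing a pattern with a one- or two-integer action could be represented as:

```java
// Illustrative sketch only; not LUDII's actual classes.
import java.util.List;

class Feature {
    // Pattern p_x: element constraints that must hold in the game state.
    final List<String> patternElements;  // e.g. ["o", ".", "!P2"]
    // Action a_x: a "to" location, plus a "from" location where needed.
    final int toLocation;                // sufficient alone for Hex or Go
    final int fromLocation;              // -1 for placement games; set for Chess/Draughts
    // Learnt weight; negative values discourage the action.
    double weight;

    Feature(List<String> patternElements, int fromLocation, int toLocation,
            double weight) {
        this.patternElements = patternElements;
        this.fromLocation = fromLocation;
        this.toLocation = toLocation;
        this.weight = weight;
    }

    boolean isPlacement() { return fromLocation < 0; }
}
```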
      <sec id="sec-3-1">
        <title>Syntax</title>
        <p>Each pattern consists of a set of elements. We define the
following element types for each potential feature element,
relative to the current mover:
“-” Off-board location (for locating edges and corners).
“.” Empty board location.
“o” Friendly piece (i.e. of the mover’s colour).
“x” Enemy piece (i.e. not of the mover’s colour).
“Pn” Piece belonging to player n.
“In” Piece with index n in the game definition.3</p>
        <p>We also define “!” (i.e. “not”) versions of these element
types. For example, “!P2” means that the specified location
does not contain a piece belonging to Player 2. Locations can
simultaneously have multiple qualifiers, e.g. a location
qualified with both “x” and “!I3” would match an enemy piece at
that location unless it has index 3 in the game definition.</p>
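A minimal sketch of how these element types might be tested, assuming a simple integer encoding of cell contents (0 for empty, n for player n's piece, -1 for off-board) that is our own illustration rather than LUDII's representation; the component-index type “In” is omitted, as it needs a richer state encoding than one integer per cell. A location with several qualifiers matches only if all of them hold, mirroring the “x” plus “!I3” example above:

```java
// Sketch of element-type matching, relative to the current mover.
// Cell contents encoded as ints: 0 = empty, 1..N = piece of player n,
// -1 = off-board. Names and encoding are illustrative assumptions.
class ElementMatcher {
    static boolean matches(String element, int cellContent, int mover) {
        // "!" prefix negates the remainder of the element type.
        if (element.startsWith("!"))
            return !matches(element.substring(1), cellContent, mover);
        switch (element.charAt(0)) {
            case '-': return cellContent == -1;                        // off-board
            case '.': return cellContent == 0;                         // empty
            case 'o': return cellContent == mover;                     // friendly
            case 'x': return cellContent > 0 && cellContent != mover;  // enemy
            case 'P': return cellContent == Integer.parseInt(element.substring(1));
            default:  throw new IllegalArgumentException(element);
        }
    }

    // All qualifiers at a location must hold simultaneously.
    static boolean matchesAll(java.util.List<String> qualifiers,
                              int cellContent, int mover) {
        for (String q : qualifiers)
            if (!matches(q, cellContent, mover)) return false;
        return true;
    }
}
```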
      </sec>
      <sec id="sec-3-2">
        <title>Topology</title>
        <p>Rather than tying feature topology to any particular board
or grid type, feature elements are described by their
relative locations on the underlying graph of the game board
(or more accurately its dual, which defines orthogonal cell
adjacencies). Starting with a given anchor location and
direction, the relative location of each element is defined as a
walk through the underlying board graph. We assume that
the orthogonal neighbours of each board location are listed
in consecutive clockwise order, including null placeholders
for off-board steps.</p>
        <p>Each walk is defined as a sequence of adjacent steps
through the underlying board graph, where:
0 denotes a step forwards in the current direction,
−1/3 denotes a/3 counterclockwise turns in a cell with a
sides (i.e. one turn in a triangle), then a step forwards,
+2/4 denotes 2a/4 clockwise turns in a cell with a sides
(i.e. two turns in a square), then a step forwards, etc.</p>
        <p>Figure 2 shows the relative steps through cells with
different numbers of sides. Fractional turns are rounded to the
nearest sensible fraction for any given cell (e.g., a turn of 1/4
in a triangle becomes equal to a turn of 1/3). The basic
mechanism is similar in principle to the use of turtle commands
in computer graphics.</p>
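The rounding rule above can be sketched as follows, assuming directions are indices into each cell's clockwise-ordered neighbour list (names are illustrative, not LUDII's API):

```java
// Sketch of resolving one step of a walk. Directions are indices into a
// cell's clockwise-ordered neighbour list; a turn is a fraction of a full
// revolution (positive = clockwise), rounded to the nearest whole number
// of side-turns for the current cell.
class Walk {
    static int turnedDirection(int dir, double turn, int sides) {
        int sideTurns = (int) Math.round(turn * sides);  // nearest sensible fraction
        return Math.floorMod(dir + sideTurns, sides);
    }
}
```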
        <p>Note that cells with an odd number of sides, such as the
triangular and pentagonal cells shown in Figure 2, do not
have a forwards (0) direction. Such cases may be handled
by instantiating two pattern instances per ambiguous step:
one with 0 replaced by −1/a and one with 0 replaced by +1/a
for that step in a cell with a sides.</p>
        <p>For example, a knight move may be described using this
notation as {0, 0, 1/4}, as shown in Figure 3 (left). This
description can be mapped directly to other grids of arbitrary
topology to produce plausible results, such as the
semi-regular 3.4.6.4 tiling (right). Note that the turn of 1/4 is
ambiguous in the semi-regular grid, and is therefore split into
two possible patterns.</p>
        <p>3 The game definition maintains a unique index for each type of
equipment it involves, including containers (boards, piles, hands,
decks, etc.) and components (pieces, tiles, dice, cards, etc.).</p>
        <p>For convenience, the number of valid rotations and
reflections for each pattern may also be specified, so that all
possible instances of a given pattern may be generated across the
board from a single description.</p>
      </sec>
      <sec id="sec-3-3">
        <title>Reactive vs Proactive Patterns</title>
        <p>We distinguish between reactive patterns that trigger actions
directly in response to the previous player’s last move,4 and
proactive patterns that can trigger actions anywhere across
the board regardless of the previous player’s last move. This
distinction is similar to the distinction between “responsive”
and “non-responsive” patterns used in (Silver et al. 2016).</p>
        <p>For example, the bottom row of Figure 4 shows the Hex
pattern described earlier in Figure 1 (left) and a reactive
pattern that describes this situation (middle). The top row shows
a proactive pattern that encourages Hex players to play two
disjoint steps away from existing friendly pieces. For
clarity in the diagrams, friendly pieces are denoted as white
discs and enemy pieces as black discs (the last move made
is dotted). White dots indicate empty cells, edges between
elements indicate cell adjacency, and green “+” symbols
indicate the preferred action triggered by each pattern. The
rightmost column of Figure 4 shows how these patterns map
easily from the hexagonal grid to the square grid.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Implementation</title>
      <p>The LUDII system and associated feature mechanisms are
implemented in Java 8. Internal game states are defined
using a custom BitSet class, called a ChunkSet, that
compresses the required state information into the minimum
number of bits, based on each game’s definition.
4 There may be more than one opponent.</p>
      <p>For example, if a game involves no more equipment than
a board and uniform pieces in N colours, then the game state
is described by a ChunkSet subdivided into chunks of B
bits per board cell, where B is the lowest power of 2 that
provides enough bits to represent every possible state per
cell (including state 0 for the empty cells).5</p>
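A minimal sketch of this packing scheme (our own illustration; LUDII's actual ChunkSet will differ in detail):

```java
// Minimal sketch of ChunkSet-style packing: each cell gets B bits, where B
// is the lowest power of 2 large enough to distinguish every per-cell state.
// Because 64 is divisible by any such B (for B <= 64), no chunk straddles
// two long words. Names are illustrative, not LUDII's actual ChunkSet.
class PackedState {
    final int bitsPerCell;
    final long[] words;

    PackedState(int numCells, int statesPerCell) {
        int b = 1;
        while ((1L << b) < statesPerCell)
            b <<= 1;                       // round bit count up to a power of 2
        this.bitsPerCell = b;
        this.words = new long[(numCells * b + 63) / 64];
    }

    void set(int cell, int value) {
        int bit = cell * bitsPerCell;
        long mask = ((1L << bitsPerCell) - 1) << (bit % 64);
        words[bit / 64] = (words[bit / 64] & ~mask)
                        | ((long) value << (bit % 64));
    }

    int get(int cell) {
        int bit = cell * bitsPerCell;
        return (int) ((words[bit / 64] >>> (bit % 64)) & ((1L << bitsPerCell) - 1));
    }
}
```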
      <p>For each game, we seek to derive an optimal feature set F ,
which is the set of features that outperform all other known
feature sets for that game on average, within a given
computational budget.</p>
      <sec id="sec-4-1">
        <title>Feature Instantiation</title>
        <p>As each game is loaded, an instance of every possible valid
rotation, reflection and translation is pre-generated once for
all component features fx of that game’s optimal feature
set F , including combinations of valid element types within
each feature. Each feature definition can therefore generate
hundreds of instances.</p>
        <p>Each instance is defined using the same custom BitSet
class as the game state. Hence, each feature instance can
be matched to the given game state efficiently using bitwise
parallel operations, such that only a few bitwise operations
need to be applied per instance match test, regardless of the
feature’s complexity.</p>
      </sec>
      <sec id="sec-4-2">
        <title>Feature Application</title>
        <p>Feature sets are applied to bias MCTS playouts as follows:
1. For each player move, all legal moves are generated and
initialised with equal probability of being selected.
2. Each reactive feature instance corresponding to the
previous player’s last move is then checked for a match to
the current game state; if a match is detected, then
the feature’s weight is added to the probability of the
feature’s action being selected (note that feature weights can
be negative for detrimental moves).
3. Each proactive feature instance is then applied across the
board and checked for a match to the current game state;
if a match is detected, then the feature’s weight is
added to the probability of the feature’s action being
selected.
4. The move to be made is then selected randomly according
to the resulting biased distribution.
5 Chunk sizes are set to the lowest power of 2 to avoid issues
with chunks straddling consecutive long member variables.</p>
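The final selection step can be sketched as follows. How scores driven non-positive by negative feature weights are handled is an assumption of ours: this sketch floors each move's score at a small epsilon so every legal move keeps a non-zero chance of selection (illustrative, not LUDII's API):

```java
// Sketch of sampling a move from the biased distribution built in the
// steps above. Each move's score is its initial value plus the weights of
// all matched features; negative-weighted moves are floored at epsilon
// (an assumption) so they remain selectable.
import java.util.Random;

class BiasedPlayout {
    static int sampleMove(double[] scores, Random rng) {
        double total = 0.0;
        double[] adjusted = new double[scores.length];
        for (int i = 0; i < scores.length; i++) {
            adjusted[i] = Math.max(scores[i], 1e-6);  // floor detrimental moves
            total += adjusted[i];
        }
        double r = rng.nextDouble() * total;          // point in [0, total)
        for (int i = 0; i < adjusted.length; i++) {
            r -= adjusted[i];
            if (r <= 0.0) return i;
        }
        return adjusted.length - 1;                   // guard for rounding drift
    }
}
```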
        <p>Features can similarly be used in other parts of MCTS,
such as its selection phase. Note that reactive features are
much more efficient to apply than proactive ones, as only
those patterns relevant to the previous player’s last move
need be tested. We therefore seek to generate reactive
features, and to replace proactive features with their reactive
equivalents, where possible.</p>
      </sec>
      <sec id="sec-4-3">
        <title>Feature Generation</title>
        <p>Manually generating features using expert knowledge, and
extracting them through supervised learning from human
expert games, are not suitable solutions for the current task,
in which a wide variety of unfamiliar games must be
supported. Exhaustive generation of features can be done more
easily, but has clear drawbacks due to the massive volume of
features it can generate.</p>
        <p>One advantage of describing games by their component
ludemes is that these may be exploited to generate a
plausible set of candidate feature patterns based on the specified
rules and equipment. Such patterns may not be suitable
features in and of themselves, but represent “minimum”
patterns which must be present in every feature – for example,
every Hex or Go feature must contain an empty “to” action
location – allowing significant reductions in the number of
possible patterns to be generated. For many games, this kind
of knowledge can be extracted automatically from ludemes.</p>
        <p>
          The subsequent fine-tuning of features to obtain the
optimal feature set F for a given game is still a
work-in-progress. Suffice it to say that frequent pattern mining
          <xref ref-type="bibr" rid="ref1">(Aggarwal and Han 2014)</xref>
          approaches to iteratively building
features from large numbers of randomly generated
self-play games have not proved effective. Learning feature sets
against a random opponent can have disastrous – and
sometimes amusing – results, yielding unexpected behaviours
that merely thwart the random player. There seems to be
no escape from evaluating feature sets using more
time-consuming but realistic tests against intelligent AI
opponents, with MCTS as the baseline yardstick.
        </p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Features and Strategies</title>
      <p>An attractive aspect of the proposed approach is that learnt
features have the potential to encode simple strategies for the
games to which they apply. For example, the features shown
on the top row of Figure 5 encourage the player to make
lines of four pieces of their colour, while the features shown
on the bottom row discourage the player from making lines
of three pieces of their colour (the red “–” symbols indicate
negative weights that discourage such actions).</p>
      <p>The features shown in Figure 6 encourage the player to
form groups of three pieces of their colour, by encouraging
singleton pieces to grow in any direction then discouraging
growth beyond group size 3. Similarly, the features shown in
Figure 7 encourage the player to form long, thin groups of
their pieces, by encouraging singleton pieces to grow in any
direction, then encouraging friendly pairs to extend at the
ends but discouraging growth at the common points adjacent
to both pieces (which would create shorter, thicker groups).</p>
      <sec id="sec-5-1">
        <title>Feature Explanation</title>
        <p>An additional benefit of the ludemic model for games is that
features applicable to a given game may be readily
transferred as candidate features for related games defined by
similar ludemes. Any feature that has proved relevant for a
given context is a good starting point for similar contexts.</p>
        <p>Further, we can potentially invert the causal relationship
between ludemes and features to reverse-engineer
comprehensible explanations of the learnt strategies that optimal
feature sets represent, by determining which ludemes are
responsible for given features. This relationship between a
game’s ludemes and its derived features is further
strengthened by the fact that initial candidate feature sets are derived
directly from the game’s ludemic description during the feature
generation process.</p>
        <p>Each ludeme in the LUDII Ludeme Library corresponds to
a Java class with an appropriate name, providing a source of
convenient plain-English captions for learnt concepts. Even
if the ludeme names do not provide sufficient
explanation in themselves, they provide hints for further
post-processing steps to find geometric relationships within the
feature patterns. It is plausible that simple descriptions such
as “make lines of 4” or “make groups of 3” may be derived
from such feature sets. Implementing such mechanisms for
explainable AI in the context of strategy learning for general
games will be a key focus of future work.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Conclusion</title>
      <p>Lightweight features based on geometric piece patterns have
a number of advantages for our work on the Digital Ludeme
Project. They improve MCTS playing strength, map readily
to other geometries, encode simple strategies, can be
associated with the underlying ludemic descriptions of games,
and have the potential to help explain learnt strategies in
human-comprehensible terms. We will continue developing
and testing this approach as the LUDII general game
system matures to support an increasing range of game types,
and will investigate the automated extraction and explanation
of relevant strategies from such learnt features.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgements</title>
      <p>This research is part of the European Research
Council-funded Digital Ludeme Project (ERC Consolidator Grant
#771292) being run by Cameron Browne. We would also
like to acknowledge the RIKEN Institute’s Advanced
Intelligence Project (AIP), especially Kazuki Yoshizoe, for their
generous support of prior research that led to this work.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Aggarwal</surname>
            ,
            <given-names>C. C.</given-names>
          </string-name>
          , and Han,
          <string-name>
            <surname>J</surname>
          </string-name>
          ., eds.
          <year>2014</year>
          . Frequent Pattern Mining. Springer.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Araki</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Yoshida</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Tsuruoka</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Tsujii</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <source>In Proceedings of the 2007 IEEE Symposium on Computational intelligence and Games</source>
          ,
          <volume>189</volume>
          -
          <fpage>195</fpage>
          . IEEE.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Borvo</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <year>1977</year>
          .
          <article-title>Anatomie d'un jeu de cartes: L'Aluette ou</article-title>
          le Jeu de Vache. Nantes: Librairie Nantaise Yves Vachon.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Bouzy</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Chaslot</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          <year>2005</year>
          .
          <article-title>Bayesian generation and integration of K-nearest-neighbor patterns for 19x19 Go</article-title>
          . In Kendall, G., and
          <string-name>
            <surname>Lucas</surname>
          </string-name>
          , S., eds.,
          <source>Proceedings of the 2005 IEEE Symposium on Computational Intelligence in Games</source>
          ,
          <volume>176</volume>
          -
          <fpage>181</fpage>
          . IEEE.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>Browne</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Powley</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Whitehouse</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Lucas</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ; Cowling,
          <string-name>
            <surname>P. I.</surname>
          </string-name>
          ; Rohlfshagen,
          <string-name>
            <given-names>P.</given-names>
            ;
            <surname>Tavener</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            ;
            <surname>Perez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            ;
            <surname>Samothrakis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            ; and
            <surname>Colton</surname>
          </string-name>
          ,
          <string-name>
            <surname>S.</surname>
          </string-name>
          <year>2012</year>
          .
          <article-title>A survey of Monte Carlo tree search methods</article-title>
          .
          <source>IEEE Transactions on Computational Intelligence and AI in Games</source>
          <volume>4</volume>
          (
          <issue>1</issue>
          ):
          <fpage>1</fpage>
          -
          <lpage>49</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <surname>Browne</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <year>2013</year>
          .
          <article-title>A problem case for UCT</article-title>
          .
          <source>IEEE Transactions on Computational Intelligence and AI in Games</source>
          <volume>5</volume>
          (
          <issue>1</issue>
          ):
          <fpage>69</fpage>
          -
          <lpage>74</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <surname>Coulom</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <year>2007</year>
          .
          <article-title>Computing “ELO ratings” of move patterns in the game of Go</article-title>
          .
          <source>ICGA Journal</source>
          <volume>30</volume>
          (
          <issue>4</issue>
          ):
          <fpage>198</fpage>
          -
          <lpage>208</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <surname>Enderton</surname>
            ,
            <given-names>H. D.</given-names>
          </string-name>
          <year>1991</year>
          .
          <article-title>The Golem Go program</article-title>
          .
          <source>Technical Report CMU-CS-92-101</source>
          , School of Computer Science, Carnegie-Mellon University.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <surname>Finnsson</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          , and Björnsson,
          <string-name>
            <surname>Y.</surname>
          </string-name>
          <year>2010</year>
          .
          <article-title>Learning simulation control in general game-playing agents</article-title>
          .
          <source>In Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence</source>
          ,
          <fpage>954</fpage>
          -
          <lpage>959</lpage>
          . AAAI Press.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <surname>Gelly</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Silver</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <year>2007</year>
          .
          <article-title>Combining online and offline knowledge in UCT</article-title>
          .
          <source>In Proceedings of the 24th International Conference on Machine Learning</source>
          ,
          <fpage>273</fpage>
          -
          <lpage>280</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <surname>Gelly</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Munos</surname>
          </string-name>
          , R.; and
          <string-name>
            <surname>Teytaud</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          <year>2006</year>
          .
          <article-title>Modification of UCT with patterns in Monte-Carlo Go</article-title>
          .
          <source>Technical Report RR-6062</source>
          , INRIA, Paris.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <surname>Hoock</surname>
            ,
            <given-names>J.-P.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Teytaud</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          <year>2010</year>
          .
          <article-title>Bandit-based genetic programming</article-title>
          .
          <source>In Esparcia-Alcázar, A. I.; Ekárt, A.;</source>
          <string-name>
            <surname>Silva</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Dignum</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Uyar</surname>
          </string-name>
          , A. Ş., eds.,
          <source>European Conference on Genetic Programming</source>
          , volume
          <volume>6021</volume>
          of Lecture Notes in Computer Science,
          <volume>268</volume>
          -
          <fpage>277</fpage>
          . Springer.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <surname>Lorentz</surname>
            ,
            <given-names>R. J.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Zosa IV</surname>
            ,
            <given-names>T. E.</given-names>
          </string-name>
          <year>2017</year>
          .
          <article-title>Machine learning in the game of Breakthrough</article-title>
          . In Winands, M. H. M.; van den Herik, H.; and
          <string-name>
            <surname>Kosters</surname>
          </string-name>
          , W. A., eds.,
          <source>Advances in Computer Games</source>
          , volume
          <volume>10664</volume>
          of Lecture Notes in Computer Science,
          <fpage>140</fpage>
          -
          <lpage>150</lpage>
          . Springer.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <surname>Murray</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          <year>1952</year>
          .
          <article-title>A History of Board-Games Other Than Chess</article-title>
          . Clarendon Press.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <surname>Raiko</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Peltonen</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <year>2008</year>
          .
          <article-title>Application of UCT search to the connection games of Hex, Y, *Star, and Renkula!</article-title>
          . In
          <source>Proceedings of the Finnish Artificial Intelligence Conference</source>
          ,
          <fpage>89</fpage>
          -
          <lpage>93</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <surname>Silver</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , et al.
          <year>2016</year>
          .
          <article-title>Mastering the game of Go with deep neural networks and tree search</article-title>
          .
          <source>Nature</source>
          <volume>529</volume>
          (
          <issue>7587</issue>
          ):
          <fpage>484</fpage>
          -
          <lpage>489</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <surname>Skowronski</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Björnsson</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Winands</surname>
            ,
            <given-names>M. H. M.</given-names>
          </string-name>
          <year>2009</year>
          .
          <article-title>Automated discovery of search-extension features</article-title>
          . In van den Herik, H. J., and
          <string-name>
            <surname>Spronck</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , eds.,
          <source>Advances in Computer Games</source>
          , volume
          <volume>6048</volume>
          of Lecture Notes in Computer Science. Springer, Berlin, Heidelberg.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <string-name>
            <surname>Stern</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Herbrich</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Graepel</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          <year>2006</year>
          .
          <article-title>Bayesian pattern ranking for move prediction in the game of Go</article-title>
          . In Cohen, W. W., and
          <string-name>
            <surname>Moore</surname>
          </string-name>
          , A., eds.,
          <source>Proceedings of the 23rd International Conference on Machine Learning</source>
          ,
          <fpage>873</fpage>
          -
          <lpage>880</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <string-name>
            <surname>Stoutamire</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <year>1991</year>
          .
          <article-title>Machine learning, game play, and Go</article-title>
          .
          <source>Technical Report TR 91-128</source>
          , Center for Automation and Intelligent Systems Research, Case Western Reserve University.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          <string-name>
            <surname>Sturtevant</surname>
            ,
            <given-names>N. R.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>White</surname>
            ,
            <given-names>A. M.</given-names>
          </string-name>
          <year>2007</year>
          .
          <article-title>Feature construction for reinforcement learning in Hearts</article-title>
          . In van den Herik, H. J.; Ciancarini, P.; and
          <string-name>
            <surname>Donkers</surname>
            ,
            <given-names>H. H. L. M.</given-names>
          </string-name>
          , eds.,
          <source>Computers and Games</source>
          , volume
          <volume>4630</volume>
          of Lecture Notes in Computer Science,
          <fpage>122</fpage>
          -
          <lpage>134</lpage>
          . Springer.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          <string-name>
            <surname>van der Werf</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Uiterwijk</surname>
            ,
            <given-names>J. W. H. M.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Postma</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>van den Herik</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <year>2003</year>
          .
          <article-title>Local move prediction in Go</article-title>
          . In Schaeffer, J.; Müller, M.; and Björnsson, Y., eds.,
          <source>Computers and Games</source>
          , volume
          <volume>2883</volume>
          of Lecture Notes in Computer Science. Springer.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>