    Deep neural network design for learning Kriegspiel, an
                 imperfect information game

    Yasar Mahomed Abbas¹[0000-0003-0445-6956], Anban Pillay¹[0000-0001-7160-6972],
       Brett van Niekerk¹[0000-0003-1050-4256] and Franziska Pannach²[0000-0003-4216-8410]
               ¹ University of KwaZulu-Natal, Westville, Durban 3630, South Africa
                       ² University of Göttingen, 37073 Göttingen, Germany
                                    Yasar.tm44@gmail.com



        Abstract. State-of-the-art Artificial Intelligence (AI) systems perform well on
        various types of games with perfect information. However, in many real-life
        settings only limited information about opponents is available. In this paper, an
        architecture for an agent for an imperfect information game, Kriegspiel chess, is
        proposed. The architecture uses Deep Reinforcement Learning and learning by
        self-play. We encode the board state and information on previous moves as an
        8×8×27 layered input to a neural network. To select the best possible action, a
        Deep Counterfactual Regret Minimization algorithm is used. The neural networks
        are trained through self-play in a tournament setting.

        Keywords: Kriegspiel, Imperfect Information, Self-Play, Information State,
        Counterfactual Regret.


1       Introduction

In imperfect information games, players only observe their information state, U_t, and
generally do not know the exact game state, S_t, they are in. While the game state is
the state of the entire game, e.g. all pieces on a chessboard at a particular time, the
information state contains only the information that is available to a player at that
time. The player may form beliefs, P(S_t | U_t), which are generally affected by fellow
players’ strategies at preceding states S_k, k < t.
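
To make the distinction concrete, the following sketch (a minimal illustration under
assumed conventions; the function name and board encoding are hypothetical, not the
paper’s implementation) derives a player’s information state U_t from the full game
state S_t in Kriegspiel by masking the opponent’s pieces:

    # Hypothetical sketch: deriving a Kriegspiel information state U_t from
    # the full game state S_t. The board encoding is assumed: an 8x8 integer
    # array with positive values for White's pieces, negative for Black's,
    # and 0 for empty squares.
    import numpy as np

    def information_state(game_state: np.ndarray, player: int) -> np.ndarray:
        """Return what `player` (+1 White, -1 Black) can observe of the board.

        In Kriegspiel a player sees only their own pieces, so every square
        occupied by the opponent is indistinguishable from an empty one.
        """
        u = game_state.copy()
        u[np.sign(u) == -player] = 0  # hide the opponent's pieces
        return u

    # Example: the initial position. Many distinct game states S_t map to the
    # same information state U_t, which is why a belief P(S_t | U_t) over the
    # set of consistent game states must be maintained.
    full = np.zeros((8, 8), dtype=int)
    full[0, :] = -np.array([4, 2, 3, 5, 6, 3, 2, 4])  # Black's back rank
    full[1, :] = -1                                    # Black's pawns
    full[6, :] = 1                                     # White's pawns
    full[7, :] = np.array([4, 2, 3, 5, 6, 3, 2, 4])    # White's back rank
    print(information_state(full, player=+1))          # Black's side appears empty

In actual Kriegspiel the information state also accumulates the referee’s
announcements (e.g. rejected move attempts, captures, and checks) and the player’s own
move history; the sketch above captures only the piece-visibility aspect.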