Algorithm Based on Reward and Punishment Technique for a Checkers Player

Alicja Winnicka
Faculty of Applied Mathematics, Silesian University of Technology, Kaszubska 23, 44-100 Gliwice, Poland

Abstract
Board games need algorithmic players that pose a challenge to people. The only way to increase playability for people is to create an algorithm that is intelligent enough to beat a human. In this paper we implement an algorithm based on reward and punishment, which decides every turn which move is the best, considering the situation on the board.

Keywords
Game, algorithm, reward and punishment, board, checkers, artificial intelligence

1. Introduction

Game development is important due to many different factors, above all entertainment. It is noteworthy that players drive this development by placing greater and greater requirements on the story as well as on the quality of the game. This means that newer games need more computing power, so the development of the equipment that allows them to run is driven very fast. The plot and the world surrounding the hero are also important, and quite often both elements are generated by algorithms that must be constantly developed and improved. In addition, the behavior of opponents, commonly called artificial intelligence, is important for the whole game. All these elements are constantly being developed in order to improve the quality of multimedia game production. Improvements in networks for remote gaming are also envisaged [1].

The development of game analysis strategies and of artificial intelligence marks milestones in this field. One of the biggest is the algorithm that defeated the world champion in the game of Go, which is considered much more difficult than chess [2]. Other types of games are also analyzed by researchers, as can be seen in [3], where the authors proposed a new strategy for solving fuzzy matrix games. In [4], the authors described a sequential model for the prediction of poker moves. Again in [5], the idea of using eye gaze to play a card game with the use of artificial neural networks was presented. In fact, neural networks have proved to be a very powerful tool for predictions in different fields [6, 7, 8]. Recent years have brought different types of reality games, virtual [9] or augmented [10, 11], which are still being improved. In [12], the authors presented the results of their analysis showing that games help students learn. Again in [13], the idea of gamification is described with some results based on conducted experiments.

In this paper, we propose a solution for playing checkers, where the current state of the game is analyzed on the basis of the reward and punishment technique.

2. Checkers' rules

Checkers is one of the best-known classic board games based on strategic thinking. It was probably invented in the 12th century, and since then many variants of the board and of play have been created. The basic variant, and the only one that is a sport discipline, is International Checkers, also called Polish Checkers, where the board is 10 × 10 tiles and each player has 20 checkers. However, the variant most played by people is the English one, with an 8 × 8 board and 12 checkers per player.

The board is divided into black and white tiles arranged alternately; in our program they are grey and yellow for better recognition of the checkers, which are black and white. We used the English variant with the checkers set on the black tiles. According to the assumptions, each player has 12 checkers set on the opposite edges of the board in three rows. This arrangement of positions is shown in Fig. 1. The goal of the game is to take all of the enemy's checkers off the board – the winner is then the player who still has at least one checker on the board.
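The starting arrangement described above can be sketched as follows. This is an illustrative encoding, not the paper's actual implementation; the grid representation and the 'w'/'b'/'.' symbols are our assumptions.

```python
# Hypothetical sketch of the English-variant setup described above:
# pieces occupy only the dark (playable) tiles, three rows per player,
# 12 checkers each, on an 8x8 board.

def initial_board():
    """Return an 8x8 grid: 'w'/'b' for checkers, '.' for empty tiles."""
    board = [['.'] * 8 for _ in range(8)]
    for y in range(8):
        for x in range(8):
            if (x + y) % 2 == 1:          # dark (playable) tiles only
                if y < 3:
                    board[y][x] = 'b'     # black in the top three rows
                elif y > 4:
                    board[y][x] = 'w'     # white in the bottom three rows
    return board

board = initial_board()
assert sum(row.count('w') for row in board) == 12
assert sum(row.count('b') for row in board) == 12
```

Which player's checkers sit at the top is an assumption here; the paper only fixes that each side starts with three rows of 12 checkers on the dark tiles.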
IVUS 2020: Information Society and University Studies, 23 April 2020, KTU Santaka Valley, Kaunas, Lithuania
Alicja.Lidia.Winnicka@gmail.com (A. Winnicka)
© 2020 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073

This game requires strategic thinking, considering the possible movements of the enemy and, what is most important, the ability to change the plan depending on the situation on the board.

Figure 1: Starting positions of the checkers.
Figure 2: Board with marked checker.

We assume that checkers can move only diagonally on the black tiles, one tile per turn. The first exception is the situation when we want to take an enemy checker in the neighborhood: the tile behind it must be empty, and our checker jumps over the enemy's one, so it moves over two tiles. Changing a checker into a "king" is the second exception: a king can move over the whole board diagonally on the black tiles.

One of these movements is treated as one turn, but after taking an enemy checker the player can move again in the same turn. The second player's turn starts when the first player makes a move without a capture. According to the description above, we had to set a few basic assumptions:

• checkers can move backward,
• checkers can take other checkers backward,
• taking checkers is not obligatory,
• checkers can take kings,
• a king can move along a diagonal line through the whole board,
• the board is 8 × 8,
• checkers move on the black tiles,
• the player with white checkers starts.

3. Algorithm

The purpose of our algorithm is to find the best move during one turn. We may divide it into a few parts.

3.1. First Step

Let us assume that a checker is on (x, y). Depending on its position, it can have at most four directions to move, each of which is checked and classified as possible or not possible. A direction is classified as not possible when our checker is on the edge of the board or the checked tile is not empty. For example, the checker marked in red in Fig. 2 has two options to move – left-up and right-up – because the two tiles below it are occupied by other white checkers. Because our coordinates start in the top left corner, these tiles are at (x − 1, y − 1) and (x + 1, y − 1).

3.2. Second Step

After finding all the options for a turn, we have to calculate which of them is the best one. This calculation is based on the sum of the rewards and punishments connected with these movements. As a reward, we treat being close to changing into a king and taking an enemy checker. Punishments mean possibly being taken in the next enemy move. Let us set s(x, m, n) as this sum for one movement. According to our function, it will be a sum of three values:

• reward for being closer to becoming a king: A basic assumption in this option is the fact that being closer to becoming a king means being closer to the enemy's edge of the board. Additionally, it is important how many checkers the player has. It is more urgent to have a king when the number of checkers is smaller – it would be very useful to have a king when there is only one checker left on the board. We needed a function which grows with the tile counter x – the number of tiles between the player's edge of the board and the checker – and which has bigger values for a smaller number of checkers m.
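The classification of directions from the first step (Section 3.1) can be sketched in Python as follows. This is a hypothetical sketch: the function name and the board encoding ('.' for an empty tile) are our assumptions, not the paper's code.

```python
# Minimal sketch of Section 3.1: classify the four diagonal directions
# of a checker at (x, y) as possible or not possible. A direction is
# not possible when the target falls off the board or the tile is used.

def possible_moves(board, x, y):
    """Return target tiles reachable by a one-tile diagonal move."""
    moves = []
    for dx, dy in [(-1, -1), (1, -1), (-1, 1), (1, 1)]:  # four diagonals
        nx, ny = x + dx, y + dy
        # possible only if the target is on the 8x8 board and empty
        if 0 <= nx < 8 and 0 <= ny < 8 and board[ny][nx] == '.':
            moves.append((nx, ny))
    return moves
```

All four directions are checked because, by the assumptions above, checkers can also move backward; on an empty board a central checker therefore has four options, while one in a corner has only one.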
We defined the function:

d(x, m) = |sin(x) / m|    (1)

• reward for taking an enemy checker: The second reward concerns the situation when the player can take an enemy checker. Because this is a goal of the game, the player should be focused on it – so if the player can take an enemy checker, it must be a priority, unless the win is in danger. Additionally, this move gives the possibility of another move in the same turn, so another value is added to the reward – all these rewards are summed as the reward of this turn. It is worth noticing that the fewer enemy checkers there are, the better the player's situation is. According to this, we assumed that it is more urgent to take enemy checkers when there are fewer of them – to finish the game quickly. We needed a function dependent on the number of enemy checkers n and of player's checkers m, and we defined a simple function that grows as the numbers of checkers decrease:

f(m, n) = 1 / (m n),  m, n ∈ [1, 12]    (2)

• punishment for being taken: Being taken must be punished – the situation is similar to taking an enemy checker. If the move causes the possibility of being taken by an enemy checker, the value of s has to be smaller, to discourage it. To balance the difference between taking and being taken, we used the same function as the punishment, but subtracted it:

k(m, n) = 1 / (m n),  m, n ∈ [1, 12]    (3)

In this situation our s is defined by the following equation:

s(x, m, n) = d(x, m) + f(m, n) − k(m, n) + r    (4)

where x ∈ [0, 8] is the number of tiles from the player's edge, m ∈ [1, 12] and n ∈ [1, 12] are respectively the numbers of player's and enemy's checkers left on the board, and r ∈ (0, 1/(mn)) is a random value.

The value of s is calculated for each possible move in this turn (so for each possibility of each checker), and next the algorithm finds the best of them, which is equivalent to the biggest one. This value is treated as the best option to move considering the situation on the board.

The random value r was added to make small differences between possibilities which are simple moves towards the end of the board and for which x is the same; without it, their values of s would be equal and the algorithm would become predictable, because it would always choose the first of the options with the same value. With r, small differences arise between them.

The whole algorithm of one turn is described above.

4. Experiments

During the experiments, we focused on the impact of the numbers of enemy's and player's checkers and, additionally, on the position of the checked tile. Fig. 3 shows the dependence of the reward on the position of the checked tile. On the OX axis we placed the tiles from the player's edge to the enemy's edge, and on the OY axis the calculated reward s. In this figure we can see two considered options for the move:

• a move towards the enemy's edge without taking or being taken,
• taking the eighth enemy's checker.

Figure 3: Comparison of reward depending on the player's and enemy's checkers.

The first one is our d(x, m) and is assigned to the blue curves in Fig. 3; they change with the number of the player's checkers. This function corresponds to the need of having a king and increases with a decreasing number of player's checkers. The red curve shows the reward for taking the eighth enemy's checker.

We noticed that taking the enemy's checker is more rewarded in any place of the board when the player has more than five checkers. The situation is more complicated when a player has five or fewer checkers – for five checkers and from the fifth tile onwards, the algorithm will choose to go further to become a king.

This algorithm decides about the priorities during the play, but it does not always choose the best option from a human's point of view. An example of a bad decision is shown in Fig. 4.

Figure 4: Example of choosing not the best option.

Here we have the same two options as mentioned above, but this time, after taking one enemy checker, the player can take another two, decreasing the number of enemy checkers to 10. The algorithm will do so when the player has more than 3 checkers, but for 1, 2 or 3 checkers it will still go to the edge of the board, ignoring the opportunity to take three enemy checkers. Another situation is visible at the bottom of Fig. 4: even when the player has his full set of 12 checkers and the enemy has 12 too, the player will choose to go further instead of taking one enemy checker.

One of the most popular algorithms for games is the minimax algorithm. It finds the best option for the chosen player by considering the possible moves of both players and picking the best option from every combination of moves. It ensures that literally the best of all moves will be found, but it simultaneously requires much more computing power during each turn – also to check whether the best option is still the best one, because the enemy may make a move that was not included in the chosen series of moves. This means that our algorithm is less exact than, for example, the minimax algorithm, but it is faster and better suited for a quick game.

5. Conclusions

In this paper, we presented a reward and punishment based algorithm as a possible algorithm for deciding how to move in checkers. Our algorithm turned out to be good enough to play against an intelligent person, although with a tendency to lose checkers at the beginning of the game. It is much better at choosing how to move when the numbers of player's and enemy's checkers change dynamically during the game.

In future papers, we will focus on increasing the effectiveness of the algorithm at the beginning of the game.

References

[1] G. Ciccarella, R. Giuliano, F. Mazzenga, F. Vatalaro, A. Vizzarri, Edge cloud computing in telecommunications: Case studies on performance improvement and TCO saving, IEEE Int. Conference on Fog and Mobile Edge Computing (FMEC 2019), Rome, Italy, Jun. 2019, pp. 113–120.
[2] D. Silver, J. Schrittwieser, K. Simonyan, I. Antonoglou, A. Huang, A. Guez, T. Hubert, L. Baker, M. Lai, A. Bolton, et al., Mastering the game of Go without human knowledge, Nature 550 (2017) 354–359.
[3] A. Mansoori, M. Eshaghnezhad, S. Effati, Recurrent neural network model: a new strategy to solve fuzzy matrix games, IEEE Transactions on Neural Networks and Learning Systems 30 (2019) 2538–2547.
[4] R. Radziukas, R. Maskeliūnas, R. Damaševičius, Prediction of poker moves using sequential model and TensorFlow, International Conference on Information and Software Technologies, Springer, 2019, pp. 516–525.
[5] Q. Hu, S. Yean, J. Liu, S. Lee, D. Rajan, R. Phattharaphon, Using eye gaze to play mind card-game using neural network, Proceedings of the Fifth International Conference on Electronics and Software Science (ICESS2019), Japan, 2019, p. 54.
[6] F. Beritelli, G. Capizzi, G. Lo Sciuto, C. Napoli, F. Scaglione, Rainfall estimation based on the intensity of the received signal in a LTE/4G mobile terminal by using a probabilistic neural network, IEEE Access 6 (2018) 30865–30873.
[7] F. Bonanno, G. Capizzi, G. Sciuto, C. Napoli, Wavelet recurrent neural network with semi-parametric input data preprocessing for micro-wind power forecasting in integrated generation systems, 2015, pp. 602–609.
[8] G. Capizzi, C. Napoli, S. Russo, M. Woźniak, Lessening stress and anxiety-related behaviors by means of AI-driven drones for aromatherapy, in: CEUR Workshop Proceedings, volume 2594, 2020, pp. 7–12.
[9] D. Połap, K. Kęsik, M. Woźniak, Accident prevention system during immersion in virtual reality, in: International Conference on Multimedia and Network Information Systems, Springer, 2018, pp. 565–573.
[10] D. Połap, Human-machine interaction in intelligent technologies using the augmented reality, Information Technology and Control 47 (2018) 691–703.
[11] D. Połap, M. Woźniak, C. Napoli, E. Tramontana, Real-time cloud-based game management system via cuckoo search algorithm, International Journal of Electronics and Telecommunications 61 (2015) 333–338.
[12] J. Hamari, D. J. Shernoff, E. Rowe, B. Coller, J. Asbell-Clarke, T. Edwards, Challenging games help students learn: An empirical study on engagement, flow and immersion in game-based learning, Computers in Human Behavior 54 (2016) 170–179.
[13] M. Sailer, J. U. Hense, S. K. Mayr, H. Mandl, How gamification motivates: An experimental study of the effects of specific game design elements on psychological need satisfaction, Computers in Human Behavior 69 (2017) 371–380.
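For completeness, the scoring rule of Section 3.2 (Eqs. (1)–(4)) can be sketched in Python as follows. This is a hypothetical rendering, not the author's code: since f and k are the same function, Eq. (4) evaluated unconditionally would reduce to d + r, so the sketch assumes, as the surrounding text suggests, that the capture reward and the being-taken punishment enter the sum only for moves where those events actually occur.

```python
import math
import random

def d(x, m):
    """Reward for being closer to becoming a king (Eq. 1)."""
    return abs(math.sin(x) / m)

def f(m, n):
    """Reward for taking an enemy checker (Eq. 2)."""
    return 1.0 / (m * n)

def k(m, n):
    """Punishment for being taken (Eq. 3)."""
    return 1.0 / (m * n)

def s(x, m, n, takes=False, taken=False):
    """Score of one move (Eq. 4): x tiles from the player's edge,
    m player's and n enemy's checkers left on the board.
    The flags mark whether the move captures / can be captured."""
    r = random.uniform(0.0, 1.0 / (m * n))  # small random tie-breaker
    total = d(x, m) + r
    if takes:
        total += f(m, n)
    if taken:
        total -= k(m, n)
    return total

# The turn's move is then the candidate with the largest s, e.g.:
# best = max(candidates, key=lambda c: s(c.x, m, n, c.takes, c.taken))
```

The random term r reproduces the tie-breaking role described in the paper: moves with the same x would otherwise receive identical scores, making the choice predictable.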