Algorithm Based on Reward and Punishment Technique for a Checkers Player

Alicja Winnicka
Faculty of Applied Mathematics, Silesian University of Technology, Kaszubska 23, 44-100 Gliwice, Poland

Abstract
Board games need algorithmic players that pose a challenge to people. The only way to increase playability for people is to create an algorithm that is intelligent enough to beat a human. In this paper we implement an algorithm based on reward and punishment, which decides every turn which move is the best, considering the situation on the board.

Keywords
Game, algorithm, reward and punishment, board, checkers, artificial intelligence

1. Introduction

Game development is important due to many different factors, above all entertainment. It is noteworthy that players drive this development by placing greater and greater requirements on the story as well as on the quality of the game. This means that newer games need more computing power, so the development of the equipment that allows them to run is driven very fast. The plot and the world surrounding the hero are also important, and quite often both elements are generated by algorithms that must be constantly developed and improved. In addition, the behavior of opponents, commonly called artificial intelligence, is important for the whole game. All these elements are constantly being developed in order to improve the quality of multimedia game production. Improvements in networks for remote gaming are also envisaged [1].

The development of game analysis strategies and of artificial intelligence marks milestones in this field. One of the biggest is the algorithm that defeated the world champion in the game of Go, which is considered much more difficult than chess [2]. Other types of games are also analyzed by researchers, as can be seen in [3], where the authors proposed a new strategy for solving fuzzy matrix games. In [4], the authors described a sequential model for the prediction of poker moves. Again in [5], the idea of using eye gaze to play a card game with the use of artificial neural networks was presented. In fact, neural networks have proved to be a very powerful tool for predictions in different fields [6, 7, 8]. Recent years have brought different types of reality games, virtual [9] or augmented [10, 11], which are still being improved. In [12], the authors presented the results of their analysis showing that games help students learn. Again in [13], the idea of gamification is described with some results based on conducted experiments.

In this paper, we propose a solution for playing checkers, where the current state of the game is analyzed on the basis of the reward and punishment technique.

2. Checkers' rules

Checkers is one of the best-known classic board games based on strategic thinking. It was probably invented in the 12th century, and since then many variants of the board and of play have been created. The basic variant, and the only one that is a sport discipline, is International Checkers, also called Polish Checkers, where the board is 10 × 10 tiles and each player has 20 checkers. However, the variant most played by people is the English one, with an 8 × 8 board and 12 checkers per player.

The board is divided into black and white tiles arranged alternately; in our program they are grey and yellow for better recognition of the checkers, which are black and white. We used the English variant with the checkers set on the black tiles. According to the assumptions, each player has 12 checkers set on the opposite edges of the board in three rows. This arrangement of positions is shown in Fig. 1. The goal of the game is to take all of the enemy's checkers off the board – the winner is then the player who still has at least one checker on the board.
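The starting arrangement described above can be sketched as follows. This is an illustrative encoding, not the paper's actual implementation; the grid representation and the 'w'/'b'/'.' symbols are our assumptions.

```python
# Hypothetical sketch of the English-variant setup described above:
# pieces occupy only the dark (playable) tiles, three rows per player,
# 12 checkers each, on an 8x8 board.

def initial_board():
    """Return an 8x8 grid: 'w'/'b' for checkers, '.' for empty tiles."""
    board = [['.'] * 8 for _ in range(8)]
    for y in range(8):
        for x in range(8):
            if (x + y) % 2 == 1:          # dark (playable) tiles only
                if y < 3:
                    board[y][x] = 'b'     # black in the top three rows
                elif y > 4:
                    board[y][x] = 'w'     # white in the bottom three rows
    return board

board = initial_board()
assert sum(row.count('w') for row in board) == 12
assert sum(row.count('b') for row in board) == 12
```

Which player's checkers sit at the top is an assumption here; the paper only fixes that each side starts with three rows of 12 checkers on the dark tiles.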
IVUS 2020: Information Society and University Studies, 23 April 2020, KTU Santaka Valley, Kaunas, Lithuania
Alicja.Lidia.Winnicka@gmail.com (A. Winnicka)
© 2020 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073

This game requires strategic thinking, considering the possible movements of the enemy and, what is most important, the ability to change the plan depending on the situation on the board.

Figure 1: Starting positions of the checkers.
Figure 2: Board with marked checker.

We assume that checkers can move only diagonally on the black tiles, one tile per turn. The first exception is the situation when we want to take an enemy checker in the neighborhood: the tile behind it must be empty, and our checker jumps over the enemy's one, so it moves over two tiles. Changing a checker into a "king" is the second exception: a king can move over the whole board diagonally on the black tiles.

One of these movements is treated as one turn, but after taking an enemy checker the player can move again in the same turn. The second player's turn starts when the first player makes a move without a capture. According to the description above, we had to set a few basic assumptions:

• checkers can move backward,
• checkers can take other checkers backward,
• taking checkers is not obligatory,
• checkers can take kings,
• a king can move along a diagonal line through the whole board,
• the board is 8 × 8,
• checkers move on the black tiles,
• the player with white checkers starts.

3. Algorithm

The purpose of our algorithm is to find the best move during one turn. We may divide it into a few parts.

3.1. First Step

Let us assume that a checker is on (x, y). Depending on its position, it can have at most four directions to move, each of which is checked and classified as possible or not possible. A direction is classified as not possible when our checker is on the edge of the board or the checked tile is not empty. For example, the checker marked in red in Fig. 2 has two options to move – left-up and right-up – because the two tiles below it are occupied by other white checkers. Because our coordinates start in the top left corner, these tiles are at (x − 1, y − 1) and (x + 1, y − 1).

3.2. Second Step

After finding all the options for a turn, we have to calculate which of them is the best one. This calculation is based on the sum of the rewards and punishments connected with these movements. As a reward, we treat being close to changing into a king and taking an enemy checker. Punishments mean possibly being taken in the next enemy move. Let us set s(x, m, n) as this sum for one movement. According to our function, it will be a sum of three values:

• reward for being closer to becoming a king: A basic assumption in this option is the fact that being closer to becoming a king means being closer to the enemy's edge of the board. Additionally, it is important how many checkers the player has. It is more urgent to have a king when the number of checkers is smaller – it would be very useful to have a king when there is only one checker left on the board. We needed a function which grows with the tile counter x – the number of tiles between the player's edge of the board and the checker – and which has bigger values for a smaller number of checkers m.
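The classification of directions from the first step (Section 3.1) can be sketched in Python as follows. This is a hypothetical sketch: the function name and the board encoding ('.' for an empty tile) are our assumptions, not the paper's code.

```python
# Minimal sketch of Section 3.1: classify the four diagonal directions
# of a checker at (x, y) as possible or not possible. A direction is
# not possible when the target falls off the board or the tile is used.

def possible_moves(board, x, y):
    """Return target tiles reachable by a one-tile diagonal move."""
    moves = []
    for dx, dy in [(-1, -1), (1, -1), (-1, 1), (1, 1)]:  # four diagonals
        nx, ny = x + dx, y + dy
        # possible only if the target is on the 8x8 board and empty
        if 0 <= nx < 8 and 0 <= ny < 8 and board[ny][nx] == '.':
            moves.append((nx, ny))
    return moves
```

All four directions are checked because, by the assumptions above, checkers can also move backward; on an empty board a central checker therefore has four options, while one in a corner has only one.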
We defined the function:

d(x, m) = |sin(x) / m|    (1)

• reward for taking an enemy checker: The second reward concerns the situation when the player can take an enemy checker. Because this is a goal of the game, the player should be focused on it – so if the player can take an enemy checker, it must be a priority, unless the win is in danger. Additionally, this move gives the possibility of another move in the same turn, so another value is added to the reward – all these rewards are summed as the reward of this turn. It is worth noticing that the fewer enemy checkers there are, the better the player's situation is. According to this, we assumed that it is more urgent to take enemy checkers when there are fewer of them – to finish the game quickly. We needed a function dependent on the number of enemy checkers n and of player's checkers m, and we defined a simple function that grows as the numbers of checkers decrease:

f(m, n) = 1 / (m n),  m, n ∈ [1, 12]    (2)

• punishment for being taken: Being taken must be punished – the situation is similar to taking an enemy checker. If the move causes the possibility of being taken by an enemy checker, the value of s has to be smaller, to discourage it. To balance the difference between taking and being taken, we used the same function as the punishment, but subtracted it:

k(m, n) = 1 / (m n),  m, n ∈ [1, 12]    (3)

In this situation our s is defined by the following equation:

s(x, m, n) = d(x, m) + f(m, n) − k(m, n) + r    (4)

where x ∈ [0, 8] is the number of tiles from the player's edge, m ∈ [1, 12] and n ∈ [1, 12] are respectively the numbers of player's and enemy's checkers left on the board, and r ∈ (0, 1/(mn)) is a random value.

The value of s is calculated for each possible move in this turn (so for each possibility of each checker), and next the algorithm finds the best of them, which is equivalent to the biggest one. This value is treated as the best option to move considering the situation on the board.

The random value r was added to make small differences between possibilities which are simple moves towards the end of the board and for which x is the same; without it, their values of s would be equal and the algorithm would become predictable, because it would always choose the first of the options with the same value. With r, small differences arise between them.

The whole algorithm of one turn is described above.

4. Experiments

During the experiments, we focused on the impact of the numbers of enemy's and player's checkers and, additionally, on the position of the checked tile. Fig. 3 shows the dependence of the reward on the position of the checked tile. On the OX axis we placed the tiles from the player's edge to the enemy's edge, and on the OY axis the calculated reward s. In this figure we can see two considered options for the move:

• a move towards the enemy's edge without taking or being taken,
• taking the eighth enemy's checker.

Figure 3: Comparison of reward depending on the player's and enemy's checkers.

The first one is our d(x, m) and is assigned to the blue curves in Fig. 3; they change with the number of the player's checkers. This function corresponds to the need of having a king and increases with a decreasing number of player's checkers. The red curve shows the reward for taking the eighth enemy's checker.

We noticed that taking the enemy's checker is more rewarded in any place of the board when the player has more than five checkers. The situation is more complicated when a player has five or fewer checkers – for five checkers and from the fifth tile onwards, the algorithm will choose to go further to become a king.

This algorithm decides about the priorities during the play, but it does not always choose the best option from a human's point of view. An example of a bad decision is shown in Fig. 4.

Figure 4: Example of choosing not the best option.

Here we have the same two options as mentioned above, but this time, after taking one enemy checker, the player can take another two, decreasing the number of enemy checkers to 10. The algorithm will do so when the player has more than 3 checkers, but for 1, 2 or 3 checkers it will still go to the edge of the board, ignoring the opportunity to take three enemy checkers. Another situation is visible at the bottom of Fig. 4: even when the player has his full set of 12 checkers and the enemy has 12 too, the player will choose to go further instead of taking one enemy checker.

One of the most popular algorithms for games is the minimax algorithm. It finds the best option for the chosen player by considering the possible moves of both players and picking the best option from every combination of moves. It ensures that literally the best of all moves will be found, but it simultaneously requires much more computing power during each turn – also to check whether the best option is still the best one, because the enemy may make a move that was not included in the chosen series of moves. This means that our algorithm is less exact than, for example, the minimax algorithm, but it is faster and better suited for a quick game.

5. Conclusions

In this paper, we presented a reward and punishment based algorithm as a possible algorithm for deciding how to move in checkers. Our algorithm turned out to be good enough to play against an intelligent person, although with a tendency to lose checkers at the beginning of the game. It is much better at choosing how to move when the numbers of player's and enemy's checkers change dynamically during the game.

In future papers, we will focus on increasing the effectiveness of the algorithm at the beginning of the game.

References

[1] G. Ciccarella, R. Giuliano, F. Mazzenga, F. Vatalaro, A. Vizzarri, Edge cloud computing in telecommunications: Case studies on performance improvement and TCO saving, IEEE Int. Conference on Fog and Mobile Edge Computing (FMEC 2019), Rome, Italy, Jun. 2019, pp. 113–120.
[2] D. Silver, J. Schrittwieser, K. Simonyan, I. Antonoglou, A. Huang, A. Guez, T. Hubert, L. Baker, M. Lai, A. Bolton, et al., Mastering the game of Go without human knowledge, Nature 550 (2017) 354–359.
[3] A. Mansoori, M. Eshaghnezhad, S. Effati, Recurrent neural network model: a new strategy to solve fuzzy matrix games, IEEE Transactions on Neural Networks and Learning Systems 30 (2019) 2538–2547.
[4] R. Radziukas, R. Maskeliūnas, R. Damaševičius, Prediction of poker moves using sequential model and TensorFlow, International Conference on Information and Software Technologies, Springer, 2019, pp. 516–525.
[5] Q. Hu, S. Yean, J. Liu, S. Lee, D. Rajan, R. Phattharaphon, Using eye gaze to play mind card-game using neural network, Proceedings of the Fifth International Conference on Electronics and Software Science (ICESS2019), Japan, 2019, p. 54.
[6] F. Beritelli, G. Capizzi, G. Lo Sciuto, C. Napoli, F. Scaglione, Rainfall estimation based on the intensity of the received signal in a LTE/4G mobile terminal by using a probabilistic neural network, IEEE Access 6 (2018) 30865–30873.
[7] F. Bonanno, G. Capizzi, G. Sciuto, C. Napoli, Wavelet recurrent neural network with semi-parametric input data preprocessing for micro-wind power forecasting in integrated generation systems, 2015, pp. 602–609.
[8] G. Capizzi, C. Napoli, S. Russo, M. Woźniak, Lessening stress and anxiety-related behaviors by means of AI-driven drones for aromatherapy, in: CEUR Workshop Proceedings, volume 2594, 2020, pp. 7–12.
[9] D. Połap, K. Kęsik, M. Woźniak, Accident prevention system during immersion in virtual reality, in: International Conference on Multimedia and Network Information Systems, Springer, 2018, pp. 565–573.
[10] D. Połap, Human-machine interaction in intelligent technologies using the augmented reality, Information Technology and Control 47 (2018) 691–703.
[11] D. Połap, M. Woźniak, C. Napoli, E. Tramontana, Real-time cloud-based game management system via cuckoo search algorithm, International Journal of Electronics and Telecommunications 61 (2015) 333–338.
[12] J. Hamari, D. J. Shernoff, E. Rowe, B. Coller, J. Asbell-Clarke, T. Edwards, Challenging games help students learn: An empirical study on engagement, flow and immersion in game-based learning, Computers in Human Behavior 54 (2016) 170–179.
[13] M. Sailer, J. U. Hense, S. K. Mayr, H. Mandl, How gamification motivates: An experimental study of the effects of specific game design elements on psychological need satisfaction, Computers in Human Behavior 69 (2017) 371–380.
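For completeness, the scoring rule of Section 3.2 (Eqs. (1)–(4)) can be sketched in Python as follows. This is a hypothetical rendering, not the author's code: since f and k are the same function, Eq. (4) evaluated unconditionally would reduce to d + r, so the sketch assumes, as the surrounding text suggests, that the capture reward and the being-taken punishment enter the sum only for moves where those events actually occur.

```python
import math
import random

def d(x, m):
    """Reward for being closer to becoming a king (Eq. 1)."""
    return abs(math.sin(x) / m)

def f(m, n):
    """Reward for taking an enemy checker (Eq. 2)."""
    return 1.0 / (m * n)

def k(m, n):
    """Punishment for being taken (Eq. 3)."""
    return 1.0 / (m * n)

def s(x, m, n, takes=False, taken=False):
    """Score of one move (Eq. 4): x tiles from the player's edge,
    m player's and n enemy's checkers left on the board.
    The flags mark whether the move captures / can be captured."""
    r = random.uniform(0.0, 1.0 / (m * n))  # small random tie-breaker
    total = d(x, m) + r
    if takes:
        total += f(m, n)
    if taken:
        total -= k(m, n)
    return total

# The turn's move is then the candidate with the largest s, e.g.:
# best = max(candidates, key=lambda c: s(c.x, m, n, c.takes, c.taken))
```

The random term r reproduces the tie-breaking role described in the paper: moves with the same x would otherwise receive identical scores, making the choice predictable.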