A Game-Theoretic Intelligent Agent for the Board Game Football Strategy Sean McCulloch Ohio Wesleyan University stmccull@owu.edu Abstract  They often are based on interesting underlying the- While much work has been done in the development of in- oretical or mathematical ideas which can poten- telligent agents for “classic” board games (e.g., Chess, tially be exploited by the agent. Checkers, Othello), there have been many games developed There have been some efforts to design agents to play these in recent years that have not received as much attention. games. The most common approach is to manage the large These games usually include large amounts of randomness state space created by these games, either by using clever or non-public information and often have interesting under- variants of classical search methods (Heyden 2009) lying theoretical structure. In this paper, we investigate a (Schadd et al 2007) (Schadd and Winands 2009) or by ap- game called Football Strategy, which can be viewed as hav- plying multiple agents to the problem (Johansson 2006) or ing similar structure to a normal-form game. We discuss the by keeping a simpler model of the game to reduce the com- game-theoretic techniques used to arrive at a mixed strategy plexity (Thomas 2003). Another approach is to develop a to play the game. We also discuss the methods used to in- rule-based system derived from specific rules and strategies corporate information about the game which are not easily of a given game. These rules can then either be applied di- represented in game-theoretic ways into our agent. Our rectly to choose a move, or serve as the basis of an evalua- agent was evaluated by playing against one of the world’s tion function in a search (Hall et al 2004) (Heyden 2009). top players, and gave him a competitive game. These approaches obviously have some inherent limita- . tions. Often, attempts to reduce a game’s complexity result in removing components that were necessary to play the game well. Additionally, a rule-based system is only as Introduction good as the rules it contains. If there is no rule to apply to a For decades, the Artificial Intelligence community has specific situation (for example, to react to a new strategy by worked on the development of intelligent agents to play an opponent), then the quality of the agent will suffer. “classic” games like Chess, Checkers, and Othello. In re- Our approach is to design an agent that exploits the un- cent years, many new games have been published that have derlying mathematical nature of a game. In this way, the gained widespread popularity. Often, these games have two agent’s performance can be based on well-known rules of common factors that make them interesting to study from an math, and the existence of mathematical theorems and tools intelligent agent perspective: can be used to aid the agent. In this way, our agent’s play  They often have a large state space, or branching can approach “optimal” (in a mathematical sense). factor. Sometimes this is due to randomness (cards In this paper we concentrate on a game called Football or dice), sometimes this is due to non-public infor- Strategy, originally published by Strategy Game Company mation (different players each having secret vic- in 1958, but more widely published by Avalon Hill in sev- tory conditions, for example), and sometimes this eral editions starting in 1962. (For details on the game, see is due to a large variety of potential moves availa- BGG 2015a.) The game represents an American football ble to the player. Whatever the cause, traditional game between two generic teams who each call plays that tree-based searching is often impractical. are resolved in a two-dimensional matrix. The offensive player chooses a number representing a column, and the de- fensive player selects a letter representing a row. A small Copyright retained by author version of the matrix is included in Figure 1. In the actual weighted yardage table is then solved using linear program- chart, there are 20 offensive plays and 10 defensive plays, ming to generate a mixed strategy. for a total of 200 entries. The resulting mixed strategy will give the agent a proba- FUMBLE bility of choosing each offensive (or defensive) play that will maximize (or minimize) the expected performance of the offensive player. Table 1 shows a sample mixed strategy for an offensive player, and Table 2 a sample mixed strategy for the defensive player. Both tables show the same situa- FUMBLE FUMBLE tion (first and ten, on the offense’s twenty yard line.) Low- numbered offensive plays tend to be running plays, and high numbers tend to be passing plays. Low-lettered defensive plays tend to be defenses against runs, and high-lettered de- fensive plays tend to be defenses against passes. In this case, Figure 1: A subsection of a Football Strategy chart the intersection of the highest probability offensive play call In each play of the game, the offensive player and defensive and the highest probability defensive call is 4E, which player each secretly and simultaneously choose a letter or would result in a gain of five yards for the offense. How- number. The intersection of those choices in the chart de- ever, most of the other defensive play calls that have posi- termines the result of the play. In most cases, the chart gives tive probability (A,B,C,D,and H) give lower gains against the gain (or loss) of yardage as the result of the play. For the offensive play call of 4, sometimes even resulting in example, the combination of offense play “1” and defense losses of yardage for the offense. play “D” (we will abbreviate such a combination as “1D”) is a gain of one yard. In some cases, other results can hap- Offensive play number Probability of Play Call pen-- penalties, or as in the case of play 2B, a turnover, giv- 1 0% ing the other team the ball. The normal rules of American 2 .29% football apply (for scoring, first downs, et cetera.) 3 12.9% While some combinations of offensive and defensive 4 53.92% plays lead to the same outcome (for example, plays 4B and 5 12.3% 3D above both result in a gain of two yards), the chart has 6 0% over 60 unique outcomes, leading to a branching factor that 7 0% is too large to be addressed with traditional methods. By 8 0% way of comparison, Chess has an average branching factor 9 2.04% of about 35. 10 0% 11 0% Game Theoretic Approach 12 0% 13 10.21% In general, the chart given above can be seen as a zero-sum 14 0% normal form game, in the game-theoretic sense. In these 15 0% games, a mixed strategy is a probability that each player will 16 0% choose one of each available options. A Nash Equilibrium 17 5.98% (Nash 1950) in this case is a mixed strategy that cannot im- 18 2.23% prove any player’s utility, given the other player’s strategy. 19 0% Von Neuman has shown (von Neumann and Morgenstern 1944) that a mixed strategy Nash Equilibrium exists for all 20 0% zero-sum normal form games. The mixed strategy for each Table 1: An example of an offensive mixed strategy player can be computed using linear programming (for ex- ample, see Mendelson 2004.) It is interesting to note that there are several play calls that We have developed an agent that uses these techniques to have zero probability-- they will never be called. This result play the game of Football Strategy. The game chart gives indicates that the solver has determined that for the given the basic utility function measured in yards gained by the game situation (defined by a combination of yard line, offensive player. The agent then modifies the values in the down, distance to a first down, score, and time remaining in chart by adding (or subtracting) values to each table entry the game), those play calls are dominated by others. It is the based on how the play affects the overall game. The new case that each offensive and defensive play has a game situ- ation in which it will be called with a nonzero probability. Defensive play letter Probability of Play Call is lost). To further encourage shorter yardage gains, we give A .24% an additional bonus (defined as a fixed percentage of the B 3.41% overall first down bonus) for a play that results in a favora- C 5.94% ble second or third down situation. D 7.72% The second constant gives a large addition to the utility E 43.9% score when the play results in a touchdown. Since scoring a F 10.64% touchdown is the main goal of the offense (and preventing G 12.72% one is the main goal of the defense), this bonus should be H 15.89% large enough to have a major impact on play calling, but not so large that it causes the agent to become imbalanced when I 0% only a few risky plays give any chance of scoring a touch- J 0% down on a single play. Table 2: An example of a defensive mixed strategy Additionally, it is possible for the offense to kick a field goal. In practice, this is usually done when the offense is on The fact that the mixed strategy generated is a Nash Equi- fourth down, and relatively close to the other team’s end librium means that this probability gives the optimal ex- zone. Field goal attempts in the game are resolved by a ran- pected gain if we assume the opposing player is also using dom die roll based on the location on the field that the at- the Nash Equilibrium mixed strategy for their side. If the tempt is made from. Scoring a touchdown gives the offense opponent deviates from the equilibrium strategy, the ex- six or seven (usually seven, depending on a random die roll) pected gain may improve. points, while kicking a field goal gives the offense three points. To simulate this, the extra utility added to the chart Adapting the Game Matrix on plays that result in a successful field goal is set to 3/7 the Simply defining the utility of the game matrix in terms of value gained by a touchdown. yards gained from each play is not sufficient to play Foot- The third constant gives a large penalty to the utility score ball Strategy—or any simulation of a football game— intel- when the play results in a fumble or interception. While ligently. The agent must manage several additional aspects balancing the values of all of the constants has been chal- that relate to the rules of football: lenging, it has been especially challenging to balance the in-  While on offense, the offensive player has four at- terplay between the bonus utility added by scoring a touch- tempts to gain a total of ten yards (a “first down”,) down versus the negative utility added by creating a turno- which will reset the counter of four attempts. Of- ver. Since a play that is neither a turnover nor a touchdown ten, the offensive player just uses three of those at- is usually better for the offensive player (since they will typ- tempts, using the final attempt to give up posses- ically have at least one more chance to make another offen- sion to the opponent further downfield (“punting”.) sive play), we have found that setting the turnover penalty  As shown in play 2B above, some combinations of higher than the touchdown bonus works well. plays result in the offense immediately losing pos- The final two constants are more rarely relevant. First, if session of the ball. the offense loses enough yardage to end the play in their own  The ultimate goal of the offensive player (and what end zone, this results in a “safety,” giving the other team two the defensive player is trying to prevent) is not points and the ball. This can only arise in rare situations. merely to gain yardage-- points are scored either by Finally, we added a special constant for turning the ball over crossing the opponents goal line, or by attempting on fourth down. We found that if we applied the normal to kick a field goal (which in this game is resolved turnover penalty described above to the fourth down situa- randomly, with increasing probability given to at- tion, the agent would never find punting to be a viable op- tempts made closer to the opponent’s goal line) tion, since even a remote chance at a first down is better than the guaranteed turnover that punting represents. As a result, As a result, the utility matrix in the game must be modified giving a smaller penalty to the utility score on fourth down to encapsulate a more accurate utility value based on these turnovers led to less radical decision-making by the agent. situations. To do this, we added five constants to the utility value in several situations. The first constant value is added when the offensive Strategic Considerations player makes a first down. To encourage the agent to choose The mixed strategy, with the additional constants described plays that gain some yardage, rather than merely trying a above, works well in a tactical sense to find the optimal low-probability attempt to get the entire ten yards in one move in a single play of the game in isolation. However, to play, we add a prorated portion of the first down bonus for win the actual game, other strategic factors are necessary to each yard gained (and subtract a similar amount if yardage add to the complexity of the decision. The fact that the of- ing, this is the number of plays left in the game. If the de- fense has several attempts to reach a first down, the time re- fensive player is currently losing, there is a minimum num- maining in the game, and the current score all impact the ber of plays that must elapse before they can get possession choices made by players, and an agent who does not adapt of the ball and try to score. to the changing nature of the game will not be successful. Next, the agent calculates the number of yards that need The two major strategic aspects we have added to the to be gained by the losing player to have a chance at winning game involve the special case of fourth down, and adjusting the game. If a field goal is sufficient to win, we calculate the play call in reference to the current time remaining and the yardage to the next position on the field that changes the score of the game. In keeping with the ideals of designing odds of making a successful field goal; otherwise we calcu- an agent based on mathematical principles, we have made late the yardage needed to score a touchdown. an effort to allow the agent to create its own strategy by im- From these values, we can generate an average “yards per plementing these strategic factors as modifications to the play” that the player needs to make to reach the goal of win- game matrix. ning the game. The elapsed time and yardage of each play considered by the agent will have an effect on this average. Fourth Down The weighted change to this average yards per play will be On fourth down, the offensive player has three choices: punt added to the play’s utility value. the ball to the other team (losing possession but forcing the other player to gain more yards to score), try to gain a first Evaluation down (losing possession at the current yard line if the at- tempt fails), or try to kick a field goal (represented in this Since Football Strategy is not as widely played as other game as a random chance to score points based on how close games, and to our knowledge there are no computer pro- the offense is to the end zone). There are two punting plays, grams that play it, we evaluated the performance of the pro- one that can be called at any time, and one that can only be gram against several humans of varying ability, ranging called in “punt formation,” which the offense must declare from novices to experts. After seeing encouraging results before both players choose their plays. The offense is al- against these players, we also tested the agent in several lowed to call a non-punt play in punt formation (a “fake games against Bruce Reiff, the world’s top player of Foot- punt,”) but suffers a penalty on any yardage gained. ball Strategy, as rated by the Boardgame Player’s Associa- The agent determines the best option between these three tion (BPA 2014) (BPA 2015). choices by finding the expected utility of all three possible Mr. Reiff played against our agent in a series of three play call situations (calling a normal play, calling a play games. He won all three games, but each was within a from punt formation, and attempting a field goal). The agent touchdown, and in one game, his victory required a success- will choose the mixed strategy that has the highest utility. ful field goal attempt on the last play of the game. Despite An interesting consequence of this method is that the agent losing all three games, we feel that our agent can play at a will occasionally decide that going into punt formation and level competitive with the world’s top players. After our fake punting is the correct move, just like real human play- official experiment, Mr. Reiff played a few “recreational” ers would. games against our agent, and our agent did manage to get some wins against him. Clock Management As an additional experiment, we set up some games where Mr. Reiff was allowed to see the agent’s mixed strat- The game of football is timed and scored, and the ultimate egy probabilities before choosing a play, though not the goal is to have the most points when time runs out. As a agent’s actual play call. Mathematically, if the utility values result, teams that are losing towards the end of the game of- are computed correctly, this should not give the human ten need to take more risks to gain more yards quickly, and player any advantage, since it is an equilibrium strategy. teams that are winning can play more conservatively and However, the extra strategic elements added to the game, as choose slower plays that consume more time. Teams get well as the effects of the constant values added to the utili- three time-outs each half which are used to cause the previ- ties, made the game slightly exploitable, and the human ous play to consume less time, allowing both teams the abil- player won by a wide margin. ity to conserve the clock, if desired. To simulate this, once the time remaining gets below a certain threshold, the agent estimates the number of offen- sive plays the player currently losing the game will have, based on the time left in the game and the number of time- outs each player has. If the offensive player is currently los- Future Work References There are several possible avenues of future improvement BGG. 2015a. Football Strategy. Available online at that can be added to the agent. http://boardgamegeek.com/boardgame/951/football-strategy We would like to generate the values of the five constants BGG 2015b. Paydirt. Available online at used to adjust the values placed in the game matrix in a bet- http://boardgamegeek.com/boardgame/1498/paydirt ter way. Right now we have just used trial-and-error in play- BPA. 2014. Boardgame Players Association Football Strategy Rankings. Available online at http://boardgamers.org/year- ing against humans. We would instead like to make these book14/fbspge.htm values more rooted in the underlying mathematical princi- BPA. 2015. Boardgame Players Association. Available online at ples of the game. http://www.boardgamers.org. We would like to integrate calling time-outs into the util- Hall L.; Gordon A.; James R.; and Newall L. 2004. A Lightweight ity function, especially towards the end of each half of the Rule-Based AI Engine for Mobile Games. In Proceedings of the game. Currently, time-outs are called outside of the game 2004 ACM SIGCHI International Conference on Advances in matrix decision making, using a set of pre-defined rules. In- Computer Entertainment Technology, 284-289. New York, NY: corporating them into the utility matrix would allow for Association of Computing Machinery. more strategic decision making by the agent. Heyden C. 2009. Implementing a Computer Player in Carcassone. We would like to explore ways in which the agent can Master’s Thesis. Department of Knowledge Engineering, Maas- tricht University, Maastricht, Netherlands. plan a sequence of downs to realize short-term goals. For Johnansson, S. 2006. On Using Multi-agent Systems in Playing example, a “second and short” situation- where it is second Board Games. In Proceedings of the Fifth International Joint Con- down and very few yards are needed to obtain a first down- ference on Autonomous Agents and Multiagent Systems, 569-576. is often a good time to attempt a risky play like a long pass. New York, NY: Association of Computing Machinery If it is successful, there is a large gain of yardage. If it is Mendelson, E. 2004. Introducing Game Theory and its Applica- unsuccessful, it is likely that the short yardage required for tions. Boca Raton FL:CRC Press. a first down can be gained on the next two plays. Currently, Nash , J. 1950. Equilibrium Pointes in n-person Games. Proceed- the agent as constructed merely sees a high probability for ings of the National Academy of Sciences, 36(1):48-49. obtaining a first down and chooses that over the higher-po- Schadd, M,.; Winands M.; Bergsma M; and Uiterwijk J. 2007. tential play. Solving Fanorona. ICGA Journal, 30(1):43-45. Humans tend to play Football Strategy by predicting Schadd, M; and Winands M. 2009. Quiescence Search for tendencies of play calls made by the opponent. It would be Stratego. In Proceedings of the 21st Benelux Conference on Artifi- cial Intelligence. Eindhoven, the Netherlands. interesting to explore whether a learning element could be added to the game to allow the agent to make similar deci- Thomas, R. 2003. Real-time Decision Making for Adversarial En- vironments Using a Plan-based Heuristic. PhD diss., Department sions. of Computer Science, Northwestern University, Evanston IL. A similar game to Football Strategy exists, called Paydirt von Neumann, J., and Morgenstern, O. 1944. Theory of Games (BGG 2015b). This game uses a chart of play calls similar and Economic Behavior. Princeton University Press. to the one found in Football Strategy, but the charts repre- sent actual historical teams. The intersection of an offensive and defensive play call is resolved probabilistically based on the historical team’s abilities. The underlying ideas behind our current agent should translate well to a program playing this similar game. We would like to extend this idea of exploiting the under- lying mathematical structure of a game to other games. While there are not many games that are so directly related to normal-form games, many other games use concepts from probability, graph theory, or auctions that can be exploited in similar ways. Acknowledgements We would like to thank Julide Yazar for an invaluable edu- cation into the basics of Game Theory. We would also like to thank Bruce Reiff for being our resident Football Strat- egy expert and experimental subject.