=Paper=
{{Paper
|id=Vol-1353/paper_17
|storemode=property
|title=A Game-Theoretic Intelligent Agent for the Board Game Football Strategy
|pdfUrl=https://ceur-ws.org/Vol-1353/paper_17.pdf
|volume=Vol-1353
|dblpUrl=https://dblp.org/rec/conf/maics/McCulloch15
}}
==A Game-Theoretic Intelligent Agent for the Board Game Football Strategy==
<pdf width="1500px">https://ceur-ws.org/Vol-1353/paper_17.pdf</pdf>
<pre>
                 A Game-Theoretic Intelligent Agent for the Board Game
                                                  Football Strategy
                                                      Sean McCulloch
                                                     Ohio Wesleyan University
                                                        stmccull@owu.edu


                               Abstract                                      They often are based on interesting underlying the-
While much work has been done in the development of in-                       oretical or mathematical ideas which can poten-
telligent agents for “classic” board games (e.g., Chess,                      tially be exploited by the agent.
Checkers, Othello), there have been many games developed            There have been some efforts to design agents to play these
in recent years that have not received as much attention.           games. The most common approach is to manage the large
These games usually include large amounts of randomness             state space created by these games, either by using clever
or non-public information and often have interesting under-         variants of classical search methods (Heyden 2009)
lying theoretical structure. In this paper, we investigate a        (Schadd et al 2007) (Schadd and Winands 2009) or by ap-
game called Football Strategy, which can be viewed as hav-          plying multiple agents to the problem (Johansson 2006) or
ing similar structure to a normal-form game. We discuss the         by keeping a simpler model of the game to reduce the com-
game-theoretic techniques used to arrive at a mixed strategy        plexity (Thomas 2003). Another approach is to develop a
to play the game. We also discuss the methods used to in-           rule-based system derived from specific rules and strategies
corporate information about the game which are not easily           of a given game. These rules can then either be applied di-
represented in game-theoretic ways into our agent. Our              rectly to choose a move, or serve as the basis of an evalua-
agent was evaluated by playing against one of the world’s           tion function in a search (Hall et al 2004) (Heyden 2009).
top players, and gave him a competitive game.                          These approaches obviously have some inherent limita-
   .                                                                tions. Often, attempts to reduce a game’s complexity result
                                                                    in removing components that were necessary to play the
                                                                    game well. Additionally, a rule-based system is only as
                          Introduction                              good as the rules it contains. If there is no rule to apply to a
For decades, the Artificial Intelligence community has              specific situation (for example, to react to a new strategy by
worked on the development of intelligent agents to play             an opponent), then the quality of the agent will suffer.
“classic” games like Chess, Checkers, and Othello. In re-              Our approach is to design an agent that exploits the un-
cent years, many new games have been published that have            derlying mathematical nature of a game. In this way, the
gained widespread popularity. Often, these games have two           agent’s performance can be based on well-known rules of
common factors that make them interesting to study from an          math, and the existence of mathematical theorems and tools
intelligent agent perspective:                                      can be used to aid the agent. In this way, our agent’s play
      They often have a large state space, or branching            can approach “optimal” (in a mathematical sense).
          factor. Sometimes this is due to randomness (cards           In this paper we concentrate on a game called Football
          or dice), sometimes this is due to non-public infor-      Strategy, originally published by Strategy Game Company
          mation (different players each having secret vic-         in 1958, but more widely published by Avalon Hill in sev-
          tory conditions, for example), and sometimes this         eral editions starting in 1962. (For details on the game, see
          is due to a large variety of potential moves availa-      BGG 2015a.) The game represents an American football
          ble to the player. Whatever the cause, traditional        game between two generic teams who each call plays that
          tree-based searching is often impractical.                are resolved in a two-dimensional matrix. The offensive
                                                                    player chooses a number representing a column, and the de-
                                                                    fensive player selects a letter representing a row. A small

Copyright retained by author
version of the matrix is included in Figure 1. In the actual      weighted yardage table is then solved using linear program-
chart, there are 20 offensive plays and 10 defensive plays,       ming to generate a mixed strategy.
for a total of 200 entries.                                          The resulting mixed strategy will give the agent a proba-
                           FUMBLE                                 bility of choosing each offensive (or defensive) play that
                                                                  will maximize (or minimize) the expected performance of
                                                                  the offensive player. Table 1 shows a sample mixed strategy
                                                                  for an offensive player, and Table 2 a sample mixed strategy
                                                                  for the defensive player. Both tables show the same situa-
                        FUMBLE
                          FUMBLE                                  tion (first and ten, on the offense’s twenty yard line.) Low-
                                                                  numbered offensive plays tend to be running plays, and high
                                                                  numbers tend to be passing plays. Low-lettered defensive
                                                                  plays tend to be defenses against runs, and high-lettered de-
                                                                  fensive plays tend to be defenses against passes. In this case,
 Figure 1: A subsection of a Football Strategy chart
                                                                  the intersection of the highest probability offensive play call
In each play of the game, the offensive player and defensive      and the highest probability defensive call is 4E, which
player each secretly and simultaneously choose a letter or        would result in a gain of five yards for the offense. How-
number. The intersection of those choices in the chart de-        ever, most of the other defensive play calls that have posi-
termines the result of the play. In most cases, the chart gives   tive probability (A,B,C,D,and H) give lower gains against
the gain (or loss) of yardage as the result of the play. For      the offensive play call of 4, sometimes even resulting in
example, the combination of offense play “1” and defense          losses of yardage for the offense.
play “D” (we will abbreviate such a combination as “1D”)
is a gain of one yard. In some cases, other results can hap-       Offensive play number             Probability of Play Call
pen-- penalties, or as in the case of play 2B, a turnover, giv-    1                                 0%
ing the other team the ball. The normal rules of American          2                                 .29%
football apply (for scoring, first downs, et cetera.)              3                                 12.9%
   While some combinations of offensive and defensive              4                                 53.92%
plays lead to the same outcome (for example, plays 4B and          5                                 12.3%
3D above both result in a gain of two yards), the chart has        6                                 0%
over 60 unique outcomes, leading to a branching factor that        7                                 0%
is too large to be addressed with traditional methods. By          8                                 0%
way of comparison, Chess has an average branching factor           9                                 2.04%
of about 35.
                                                                   10                                0%
                                                                   11                                0%
             Game Theoretic Approach                               12                                0%
                                                                   13                                10.21%
In general, the chart given above can be seen as a zero-sum        14                                0%
normal form game, in the game-theoretic sense. In these            15                                0%
games, a mixed strategy is a probability that each player will
                                                                   16                                0%
choose one of each available options. A Nash Equilibrium
                                                                   17                                5.98%
(Nash 1950) in this case is a mixed strategy that cannot im-
                                                                   18                                2.23%
prove any player’s utility, given the other player’s strategy.
                                                                   19                                0%
Von Neuman has shown (von Neumann and Morgenstern
1944) that a mixed strategy Nash Equilibrium exists for all        20                                0%
zero-sum normal form games. The mixed strategy for each            Table 1: An example of an offensive mixed strategy
player can be computed using linear programming (for ex-
ample, see Mendelson 2004.)                                          It is interesting to note that there are several play calls that
   We have developed an agent that uses these techniques to       have zero probability-- they will never be called. This result
play the game of Football Strategy. The game chart gives          indicates that the solver has determined that for the given
the basic utility function measured in yards gained by the        game situation (defined by a combination of yard line,
offensive player. The agent then modifies the values in the       down, distance to a first down, score, and time remaining in
chart by adding (or subtracting) values to each table entry       the game), those play calls are dominated by others. It is the
based on how the play affects the overall game. The new           case that each offensive and defensive play has a game situ-
                                                                  ation in which it will be called with a nonzero probability.
 Defensive play letter           Probability of Play Call          is lost). To further encourage shorter yardage gains, we give
 A                               .24%                              an additional bonus (defined as a fixed percentage of the
 B                               3.41%                             overall first down bonus) for a play that results in a favora-
 C                               5.94%                             ble second or third down situation.
 D                               7.72%                                The second constant gives a large addition to the utility
 E                               43.9%                             score when the play results in a touchdown. Since scoring a
 F                               10.64%                            touchdown is the main goal of the offense (and preventing
 G                               12.72%                            one is the main goal of the defense), this bonus should be
 H                               15.89%                            large enough to have a major impact on play calling, but not
                                                                   so large that it causes the agent to become imbalanced when
 I                               0%
                                                                   only a few risky plays give any chance of scoring a touch-
 J                               0%
                                                                   down on a single play.
  Table 2: An example of a defensive mixed strategy                   Additionally, it is possible for the offense to kick a field
                                                                   goal. In practice, this is usually done when the offense is on
   The fact that the mixed strategy generated is a Nash Equi-
                                                                   fourth down, and relatively close to the other team’s end
librium means that this probability gives the optimal ex-
                                                                   zone. Field goal attempts in the game are resolved by a ran-
pected gain if we assume the opposing player is also using
                                                                   dom die roll based on the location on the field that the at-
the Nash Equilibrium mixed strategy for their side. If the
                                                                   tempt is made from. Scoring a touchdown gives the offense
opponent deviates from the equilibrium strategy, the ex-
                                                                   six or seven (usually seven, depending on a random die roll)
pected gain may improve.
                                                                   points, while kicking a field goal gives the offense three
                                                                   points. To simulate this, the extra utility added to the chart
Adapting the Game Matrix                                           on plays that result in a successful field goal is set to 3/7 the
   Simply defining the utility of the game matrix in terms of      value gained by a touchdown.
yards gained from each play is not sufficient to play Foot-           The third constant gives a large penalty to the utility score
ball Strategy—or any simulation of a football game— intel-         when the play results in a fumble or interception. While
ligently. The agent must manage several additional aspects         balancing the values of all of the constants has been chal-
that relate to the rules of football:                              lenging, it has been especially challenging to balance the in-
      While on offense, the offensive player has four at-         terplay between the bonus utility added by scoring a touch-
          tempts to gain a total of ten yards (a “first down”,)    down versus the negative utility added by creating a turno-
          which will reset the counter of four attempts. Of-       ver. Since a play that is neither a turnover nor a touchdown
          ten, the offensive player just uses three of those at-   is usually better for the offensive player (since they will typ-
          tempts, using the final attempt to give up posses-       ically have at least one more chance to make another offen-
          sion to the opponent further downfield (“punting”.)      sive play), we have found that setting the turnover penalty
      As shown in play 2B above, some combinations of             higher than the touchdown bonus works well.
          plays result in the offense immediately losing pos-         The final two constants are more rarely relevant. First, if
          session of the ball.                                     the offense loses enough yardage to end the play in their own
      The ultimate goal of the offensive player (and what         end zone, this results in a “safety,” giving the other team two
          the defensive player is trying to prevent) is not        points and the ball. This can only arise in rare situations.
          merely to gain yardage-- points are scored either by     Finally, we added a special constant for turning the ball over
          crossing the opponents goal line, or by attempting       on fourth down. We found that if we applied the normal
          to kick a field goal (which in this game is resolved     turnover penalty described above to the fourth down situa-
          randomly, with increasing probability given to at-       tion, the agent would never find punting to be a viable op-
          tempts made closer to the opponent’s goal line)          tion, since even a remote chance at a first down is better than
                                                                   the guaranteed turnover that punting represents. As a result,
As a result, the utility matrix in the game must be modified       giving a smaller penalty to the utility score on fourth down
to encapsulate a more accurate utility value based on these        turnovers led to less radical decision-making by the agent.
situations. To do this, we added five constants to the utility
value in several situations.
   The first constant value is added when the offensive                           Strategic Considerations
player makes a first down. To encourage the agent to choose
                                                                   The mixed strategy, with the additional constants described
plays that gain some yardage, rather than merely trying a
                                                                   above, works well in a tactical sense to find the optimal
low-probability attempt to get the entire ten yards in one
                                                                   move in a single play of the game in isolation. However, to
play, we add a prorated portion of the first down bonus for
                                                                   win the actual game, other strategic factors are necessary to
each yard gained (and subtract a similar amount if yardage
add to the complexity of the decision. The fact that the of-      ing, this is the number of plays left in the game. If the de-
fense has several attempts to reach a first down, the time re-    fensive player is currently losing, there is a minimum num-
maining in the game, and the current score all impact the         ber of plays that must elapse before they can get possession
choices made by players, and an agent who does not adapt          of the ball and try to score.
to the changing nature of the game will not be successful.           Next, the agent calculates the number of yards that need
   The two major strategic aspects we have added to the           to be gained by the losing player to have a chance at winning
game involve the special case of fourth down, and adjusting       the game. If a field goal is sufficient to win, we calculate
the play call in reference to the current time remaining and      the yardage to the next position on the field that changes the
score of the game. In keeping with the ideals of designing        odds of making a successful field goal; otherwise we calcu-
an agent based on mathematical principles, we have made           late the yardage needed to score a touchdown.
an effort to allow the agent to create its own strategy by im-       From these values, we can generate an average “yards per
plementing these strategic factors as modifications to the        play” that the player needs to make to reach the goal of win-
game matrix.                                                      ning the game. The elapsed time and yardage of each play
                                                                  considered by the agent will have an effect on this average.
Fourth Down                                                       The weighted change to this average yards per play will be
On fourth down, the offensive player has three choices: punt      added to the play’s utility value.
the ball to the other team (losing possession but forcing the
other player to gain more yards to score), try to gain a first                            Evaluation
down (losing possession at the current yard line if the at-
tempt fails), or try to kick a field goal (represented in this    Since Football Strategy is not as widely played as other
game as a random chance to score points based on how close        games, and to our knowledge there are no computer pro-
the offense is to the end zone). There are two punting plays,     grams that play it, we evaluated the performance of the pro-
one that can be called at any time, and one that can only be      gram against several humans of varying ability, ranging
called in “punt formation,” which the offense must declare        from novices to experts. After seeing encouraging results
before both players choose their plays. The offense is al-        against these players, we also tested the agent in several
lowed to call a non-punt play in punt formation (a “fake          games against Bruce Reiff, the world’s top player of Foot-
punt,”) but suffers a penalty on any yardage gained.              ball Strategy, as rated by the Boardgame Player’s Associa-
   The agent determines the best option between these three       tion (BPA 2014) (BPA 2015).
choices by finding the expected utility of all three possible        Mr. Reiff played against our agent in a series of three
play call situations (calling a normal play, calling a play       games. He won all three games, but each was within a
from punt formation, and attempting a field goal). The agent      touchdown, and in one game, his victory required a success-
will choose the mixed strategy that has the highest utility.      ful field goal attempt on the last play of the game. Despite
An interesting consequence of this method is that the agent       losing all three games, we feel that our agent can play at a
will occasionally decide that going into punt formation and       level competitive with the world’s top players. After our
fake punting is the correct move, just like real human play-      official experiment, Mr. Reiff played a few “recreational”
ers would.                                                        games against our agent, and our agent did manage to get
                                                                  some wins against him.
Clock Management                                                     As an additional experiment, we set up some games
                                                                  where Mr. Reiff was allowed to see the agent’s mixed strat-
The game of football is timed and scored, and the ultimate
                                                                  egy probabilities before choosing a play, though not the
goal is to have the most points when time runs out. As a
                                                                  agent’s actual play call. Mathematically, if the utility values
result, teams that are losing towards the end of the game of-
                                                                  are computed correctly, this should not give the human
ten need to take more risks to gain more yards quickly, and
                                                                  player any advantage, since it is an equilibrium strategy.
teams that are winning can play more conservatively and
                                                                  However, the extra strategic elements added to the game, as
choose slower plays that consume more time. Teams get
                                                                  well as the effects of the constant values added to the utili-
three time-outs each half which are used to cause the previ-
                                                                  ties, made the game slightly exploitable, and the human
ous play to consume less time, allowing both teams the abil-
                                                                  player won by a wide margin.
ity to conserve the clock, if desired.
   To simulate this, once the time remaining gets below a
certain threshold, the agent estimates the number of offen-
sive plays the player currently losing the game will have,
based on the time left in the game and the number of time-
outs each player has. If the offensive player is currently los-
                      Future Work                                                           References

There are several possible avenues of future improvement          BGG. 2015a. Football Strategy. Available online at
that can be added to the agent.                                   http://boardgamegeek.com/boardgame/951/football-strategy
   We would like to generate the values of the five constants     BGG         2015b.    Paydirt.          Available    online     at
used to adjust the values placed in the game matrix in a bet-     http://boardgamegeek.com/boardgame/1498/paydirt
ter way. Right now we have just used trial-and-error in play-     BPA. 2014. Boardgame Players Association Football Strategy
                                                                  Rankings.      Available online at http://boardgamers.org/year-
ing against humans. We would instead like to make these
                                                                  book14/fbspge.htm
values more rooted in the underlying mathematical princi-
                                                                  BPA. 2015. Boardgame Players Association. Available online at
ples of the game.                                                 http://www.boardgamers.org.
   We would like to integrate calling time-outs into the util-
                                                                  Hall L.; Gordon A.; James R.; and Newall L. 2004. A Lightweight
ity function, especially towards the end of each half of the      Rule-Based AI Engine for Mobile Games. In Proceedings of the
game. Currently, time-outs are called outside of the game         2004 ACM SIGCHI International Conference on Advances in
matrix decision making, using a set of pre-defined rules. In-     Computer Entertainment Technology, 284-289. New York, NY:
corporating them into the utility matrix would allow for          Association of Computing Machinery.
more strategic decision making by the agent.                      Heyden C. 2009. Implementing a Computer Player in Carcassone.
   We would like to explore ways in which the agent can           Master’s Thesis. Department of Knowledge Engineering, Maas-
                                                                  tricht University, Maastricht, Netherlands.
plan a sequence of downs to realize short-term goals. For
                                                                  Johnansson, S. 2006. On Using Multi-agent Systems in Playing
example, a “second and short” situation- where it is second
                                                                  Board Games. In Proceedings of the Fifth International Joint Con-
down and very few yards are needed to obtain a first down-        ference on Autonomous Agents and Multiagent Systems, 569-576.
is often a good time to attempt a risky play like a long pass.    New York, NY: Association of Computing Machinery
If it is successful, there is a large gain of yardage. If it is   Mendelson, E. 2004. Introducing Game Theory and its Applica-
unsuccessful, it is likely that the short yardage required for    tions. Boca Raton FL:CRC Press.
a first down can be gained on the next two plays. Currently,      Nash , J. 1950. Equilibrium Pointes in n-person Games. Proceed-
the agent as constructed merely sees a high probability for       ings of the National Academy of Sciences, 36(1):48-49.
obtaining a first down and chooses that over the higher-po-       Schadd, M,.; Winands M.; Bergsma M; and Uiterwijk J. 2007.
tential play.                                                     Solving Fanorona. ICGA Journal, 30(1):43-45.
   Humans tend to play Football Strategy by predicting            Schadd, M; and Winands M. 2009. Quiescence Search for
tendencies of play calls made by the opponent. It would be        Stratego. In Proceedings of the 21st Benelux Conference on Artifi-
                                                                  cial Intelligence. Eindhoven, the Netherlands.
interesting to explore whether a learning element could be
added to the game to allow the agent to make similar deci-        Thomas, R. 2003. Real-time Decision Making for Adversarial En-
                                                                  vironments Using a Plan-based Heuristic. PhD diss., Department
sions.                                                            of Computer Science, Northwestern University, Evanston IL.
   A similar game to Football Strategy exists, called Paydirt
                                                                  von Neumann, J., and Morgenstern, O. 1944. Theory of Games
(BGG 2015b). This game uses a chart of play calls similar         and Economic Behavior. Princeton University Press.
to the one found in Football Strategy, but the charts repre-
sent actual historical teams. The intersection of an offensive
and defensive play call is resolved probabilistically based on
the historical team’s abilities. The underlying ideas behind
our current agent should translate well to a program playing
this similar game.
   We would like to extend this idea of exploiting the under-
lying mathematical structure of a game to other games.
While there are not many games that are so directly related
to normal-form games, many other games use concepts from
probability, graph theory, or auctions that can be exploited
in similar ways.


                  Acknowledgements
We would like to thank Julide Yazar for an invaluable edu-
cation into the basics of Game Theory. We would also like
to thank Bruce Reiff for being our resident Football Strat-
egy expert and experimental subject.

</pre>