<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>An Artificial Intelligence that plays for competitive Scrabble</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Fulvio Di Maria</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alberto Strade Alma Mater Studiorum</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>alberto.strade@live.it</string-name>
        </contrib>
      </contrib-group>
      <abstract>
        <p>An effective program created to play the worldwide famous crossword game SCRABBLE. It uses an efficient data structure that allows fast moves searching possible words trough the lexicon, while an heuristic function determinates the best word to be played, by means of probabilities and sensible considerations. Here is considered the two players version of the game.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Scrabble is a famous board game, which requires a good knowledge of the native
language and a good sense of strategy. On the contrary of other games like Chess and
Go, Scrabble is an incomplete information game, since the opponent’s rack of tiles is
secret.</p>
      <p>
        One of the first computer programs that played Scrabble was MONTY (
        <xref ref-type="bibr" rid="ref1">1</xref>
        ), which
used both strategic and tactical concepts, but was also a bit slow and never managed to
beat Scrabble experts. Others attempt were done, some using a straightforward strategy
(playing the longest word or the one which gives more points), others approaching
with a bit of protective strategy, like trying not to make available bonus squares to
the opponent. In 1988 Appel and Jacobson managed, using a DWAG data structure
(Directed Acyclic Word Graph) (
        <xref ref-type="bibr" rid="ref2">2</xref>
        ), to create the fastest and most efficient program of
their time, surpassed later only by a variant called GALLAD.
      </p>
      <p>
        While finding all the legal moves is a non-trivial problem on itself, because it
implies a CSP (Constraint Satisfaction Problem),(
        <xref ref-type="bibr" rid="ref3">3</xref>
        ) being able to choose the best move
to perform is equally difficult. Bayes’ probability theorem allows to create a good
heuristic function (
        <xref ref-type="bibr" rid="ref4">4</xref>
        ), choosing the best word evaluating also rack residues.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Scrabble Basis</title>
      <p>The goal of the game is to make more points than the opponent. The board is composed
of a 15X15 square rack where words has to be placed, like in crossword puzzles. There
are 120 tiles representing the letters, and each of them reports the value of that letter;
there also are 2 blank tiles, which can be used as any letter of the alphabet. Once
they are put on the board, they become a copy of the chosen tile and cant be replaced,
nor they have a value when calculating the score. Each word must cross at least one
letter that was already on the board, and all the perpendicular words that come out
from these crosses must be legal words. There are four types of bonus square: 3W
triplicates the score of the entire word, 2W doubles it, 3L triplicate the value of the
specific letter, while 2L doubles that value. When a player put his word, he draws from
the bag until his rack returnes to 7 tiles. In the program at the moment we use a 8 tiles
rack, giving human and cpu more possibilities to form longer words and making the
game more interesting. Although it implies more computational load, the algorithm is
quite efficient, so that this change does not dramatically affect the performance of the
program. If a player manages to use all the tiles of the rack, he receives a 50 points
bonus. After the tiles of the bag are finished the game turns in a complete information
problem, but it ends only when one of the players runs out of tiles. Since we can divide
each legal play in across and down, we can talk from now only about the across plays,
because the board is symmetric.
3</p>
    </sec>
    <sec id="sec-3">
      <title>Structures and Lexicon representation</title>
      <p>Any across word must have some newly placed tiles and at least a tile already present
on the board. Therefore, we can use this property in order to find legal words to add
on the board. Lets call the leftmost newly covered square adjacent to a tile already
on the board the anchorsquare for that word. So, all the possible anchors are the
squares which are adjacent to the filled squares. This way we can reduce the problem
of generating all the legal moves by dividing it for each row: for each of it, given the
rack, the anchors and the tiles already placed, we can generate all the legal plays.</p>
      <p>
        In order to perform a faster search, we need a structure that stores all the words of
the given lexicon (which contains all the words of a dictionary, but also the declinations
of the verbs, the plurals and so on). So we used a trie (
        <xref ref-type="bibr" rid="ref5">5</xref>
        ), a tree which edges are marked
by letters. If two words start with the same letters, then they share a portion of their
path towards the ending node, which is marked as terminal. Note that all the leaves
of the trie are terminal nodes, while the contrary is not always true. This structure can
save a lot of words with a memory use proportional to O(n) bytes, resulting in a very
efficient solution.
4
      </p>
    </sec>
    <sec id="sec-4">
      <title>Possible words generation</title>
      <p>The word generation can be easily divided in two phases: the detection of all the
possible tiles anchored to the left of an anchor square (we call these tiles the le f t part of
the word); and for each of them, the search of all the possible right parts of the words,
considering the right part all the tiles at the right of the anchor and the anchor square
itself. Consider that the left part can be formed by tiles already on the board, or tiles
from the rack, but not both. So, if the square adjacent on the left to the anchor is empty,
we must put a left part from the rack (or even have a empty left part), then extend the
word starting from the anchor square. If there are any tiles on the left of an anchor we
can simply consider that as the left part and perform only the ”extend right” phase.</p>
      <p>If the left part is all made of tiles already on the board, then we have already the
left part, and we have only to note which is; otherwise, we have to find all the left parts.
Since the anchor is the leftmost point of adjacency, the left part cant cover an anchor
square, and that reduces the length of the left parts of a word anchored to a square,
which will be found pruning the trie, according to the constraints of the tiles in our
rack.</p>
      <p>Once we have all the left parts, we can extend it on the right, adding the tiles one by
one, according to the constraints, which are the residual tiles on the rack and the tiles
already placed on the board to the right of the anchor square, which must be included
in the word (if is not possible, we will immediately delete that istance of right part).
Now we have all the possible words that can be played.
5</p>
    </sec>
    <sec id="sec-5">
      <title>Heuristic</title>
      <p>After the program has determined all the possible words that can be placed, an heuristic
function has the duty to choose the most suitable word to play. In the program, the
decision is based on 3 parameters, which influence the choice with different weights.</p>
      <p>The formula used to express the fitness of each word can be expressed as F =
s p + c, where s is the score resulting from playing that word on the board; p is a non
negative penalty score, which is greater if the move is weak from a defensive point of
view; and c is a combo score, representing the possibility to put a 8 letter word on the
table the next turn, which gives a 50 points bonus.
5.1</p>
      <sec id="sec-5-1">
        <title>Penalty Score</title>
        <p>Penalty score P is a score which defines how bad is the move from a defensive point
of view, in other words it describes the bonus squares that the actual move allows the
opponent to use. For each letter it places for the possible word, the program controls
along the vertical axis if there are any bonus squares, starting from 7 squares above
the word to 7 squares under the word. The malus is higher if it includes a triple score
square, decreasing for a double word square, a triple letter square and a double letter
square, but it is also conditioned by the distance of that square from the word (since it
takes a longer word from the opponent to get advantage of the particular bonus square).
Moreover, the malus is then multiplied by a number n, with 0 &lt; n &lt; 1, which starts
from 1 and decreases every time a letter from another word is encountered during the
exploration (it takes a lot of effort to get a word long enough to reach the bonus square
using a letter already on the board).</p>
        <p>The penalty score then is the sum of all the maluses obtained from the exploration
of the space up and down of each letter. This number can usually vary from 0 (perfect
play from a defensive point of view) to 10 or more. The average value however, is
about 6/7.
5.2</p>
      </sec>
      <sec id="sec-5-2">
        <title>Bingo Probability</title>
        <p>Before explaining the meaning of the c score, we must define an octet as a string which
contains in lexicographic order the letters of a 8-word letter (eg. the word ABBAGLIO
has as correspondent optet AABBGILO), and residual rack similarly. The value of c,
given that in our lexicon there are over 10000 words with 8 letters, is expressed as
c =
10000
å pi k
i=0
where p is the probability of having, after the draw, all the letters of the octet i, and k is
a corrective value that is set to 6 after some experimental games. The value of pi can
result easier to understand with an example.</p>
        <p>Given a rack, lexicographically ordinate, fABBEILMUg, assuming that the
program is evaluating the fitness of the word fLUMEg as the first word of the game,
the residual rack will be fABBIg. As for the word fABBAGLIOg, the difference
between its octet and the residual rack is fAABBGILOg fABBIg = fAGLOg, so it
needs to draw these 4 letters in order to have the complete word in hand. Lets
consider now the remaining tiles pile, which is given by all the tiles in the game minus
the ones that are on the board and in the residual rack. It will be something like this:
fAAAAA...GG..LLLL...OOOOOO...ZZg. Note that this is not the exact count of the
remaining tiles, since we don’t nknow which tiles the opponent has in his rack. Given
the quantity available of each letter in the tiles pile, the probability of the desired draw
is
" Z
Õ
i=A
ai
d1i
#
=
t
d2
where a is the quantity of tiles of that letter that stills available, d1 is the quantity
of those tiles that have to be drawn, t is the total number of tiles and d2 the number
of tiles to draw from the bag. In this case the pi for the word ABBAGLIO will be
[ 91 12 14 112 ]= 948 .</p>
        <p>To enhance performances of c, since binomial is not a simple operation, it is
possible to create a ”binomial matrix” which stores all the binomial coefficients we will
need during the game. It is also possible to check before the calculation if there are any
chances for that word to be formed drawing from the bag. If there is not, the method
skips to the next word and assign pi=0.</p>
        <p>Given this algorithm, is very easy to change it to make it fit with the original version
of the Scrabble game, since we have just to change the octet with a septet in the
program. The number of 7-letters words is higher than the number of 8-letters ones (15000
against 10000), but the chance to skip a good number of checks makes it possible not
to degrade the performances of the program. Actually, the program takes between 3-7
seconds to choose a move every turn, depending of the openings on the board and the
quantity of possible words to test. If the program wouldn’t use any heuristic it would
take less than a second to find the best word to play (the one that gives the best score).
6</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Improvements</title>
      <p>Others strategies could be used to enhance quality of play of our program. It could
evaluate the usege of large spots of the board, making it difficult for the opponent to
find space where to place his word. It could change its strategy according to the score
difference between its and the opponent’s one. Since the ”fitness” score is given as a
sum of 3 values, giving each of them a different weight will create different styles of
play. Prioritizing the score function, the AI will make more aggressive moves, trying
to get as points as possible, leaving, however, good spots for the opponent; giving
an higher value to the penalty score we will make the program produce more careful
moves, trying not to give the opponent the chance to get a good score; emphasizing
the octet function there will be more combo moves, maybe putting shorter words on
the table in order to get a ”bingo play”. With the current set of values, the program
can easily reach the 500 points threshold, offering a good challenge even for the most
competitive players.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Cosma</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <article-title>Jasckson: Introducing MONTY plays scrabble</article-title>
          .
          <source>Scrabble Player News</source>
          ,
          <fpage>7</fpage>
          -
          <lpage>10</lpage>
          (
          <year>1983</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Appel</surname>
          </string-name>
          ,
          <source>Jacobson: The World's Fastest Scrabble Program. Communications of the ACM</source>
          <volume>31</volume>
          (
          <issue>3</issue>
          ),
          <fpage>572</fpage>
          -
          <lpage>578</lpage>
          (
          <year>1988</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Russel</surname>
          </string-name>
          ,
          <article-title>Norvig Ariticial Intelligence, a modern approch (3rd Edition)</article-title>
          .
          <source>Pearson</source>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Richards</surname>
          </string-name>
          , Amir:Opponent modeling in Scrabble.
          <source>Proceeding of the Twentieth International Joint Conference on Artificial Intelligence</source>
          ,
          <fpage>1482</fpage>
          -
          <lpage>1487</lpage>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <source>[5] Fredkin: Trie Memory. CACM 3.9</source>
          ,
          <fpage>490</fpage>
          -
          <lpage>500</lpage>
          (
          <year>1960</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>