-

Logical and Psychological Analysis of Deductive Mastermind

Nina Gierasimczuk

Nina.Gierasimczuk@gmail.com 1

Han van der Maas

Maartje Raijmakers

0 0 Department of Psychology, University of Amsterdam 1 Institute for Logic, Language and Computation, University of Amsterdam

The paper proposes a way to analyze logical reasoning in a deductive version of the Mastermind game implemented within the Math Garden educational system. Our main goal is to derive predictions about the cognitive di culty of game-plays, e.g., the number of steps needed for solving the logical tasks or the working memory load. Our model is based on the analytic tableaux method, known from proof theory. We associate the di culty of the Deductive Mastermind game-items with the size of the corresponding logical tree derived by the tableau method. We discuss possible empirical hypotheses based on this model, and preliminary results that prove the relevance of our theory.

Deductive Mastermind Mastermind game deductive reasoning analytic tableaux Math Garden educational tools

Computational and logical analysis has already proven useful for the investigations into the cognitive di culty of linguistic and communicative tasks (see [1{5]). We follow this line of research by adapting formal logical tools to directly analyze the di culty of non-linguistic logical reasoning. Our object of study is a deductive version of the Mastermind game. Although the game has been used to investigate the acquisition of complex skills and strategies in the domain of reasoning about others [6], as far as we know, it has not yet been studied for the di culty of the accompanying deductive reasoning. Our model of reasoning is based on the proof-theoretical method of analytic tableaux for propositional logic, and it gives predictions about the empirical di culty of game-items. In the remaining part of this section we give the background of our work: we explain the classical and static versions of the Mastermind game, and describe the main principles of the online Math Garden system. In Section 2 we introduce Deductive Mastermind as implemented within Math Garden. Section 3 gives a logical analysis of Deductive Mastermind game-items using the tableau method. Finally, Section 4 discusses some hypotheses drawn on the basis of our model, and gives preliminary results. In the end we brie y discuss the directions for further work.

Mastermind Game

Mastermind is a code-breaking game for two players. The modern game with pegs was invented in 1970 by Mordecai Meirowitz, but the game resembles an earlier pen and paper game called Bulls and Cows. The Mastermind game, as known today, consists of a decoding board, code pegs of k colors, and feedback pegs of black and white. There are two players, the code-maker, who chooses a secret pattern of ` code pegs (color duplicates are allowed), and the code-breaker, who guesses the pattern, in a given n rounds. Each round consists of code-breaker making a guess by placing a row of ` code pegs, and of code-maker providing the feedback of zero to ` key pegs: a black key for each code peg of correct color and position, and a white key for each peg of correct color but wrong position. After that another guess is made. Guesses and feedbacks continue to alternate until either the code-breaker guesses correctly, or n incorrect guesses have been made. The code-breaker wins if he obtains the solution within n rounds; the code-maker wins otherwise. Mastermind is an inductive inquiry game that involves trials of experimentation and evaluation. As such it leads to the interesting question of the underlying logical reasoning and its di culty. Existing mathematical results on Mastermind do not provide any answer to this problem|most focus has been directed at nding a strategy that allows winning the game in the smallest number of rounds (see [7{10]).

Static Mastermind [11] is a version of the Mastermind game in which the goal is to nd the minimum number of guesses the code-breaker can make all at once at the beginning of the game (without waiting for the individual feedbacks), and upon receiving them all at once completely determine the code in the next guess. In the case of this game some strategy analysis has been conducted [12], but, more importantly, Static Mastermind has been given a computational complexity analysis [13]. The corresponding Static Mastermind Satis ability Decision Problem has been de ned in the following way.

De nition 1 (Mastermind Satis ability Decision Problem).

Input A set of guesses G and their corresponding scores.

Question Is there at least one valid solution? Theorem 1. Mastermind Satis ability Decision Problem is NP-complete. This result gives an objective computational measure of the di culty of the task. NP-complete problems are believed to be cognitively hard [14{16], hence this result has been claimed to indicate why Mastermind is an engaging and popular game. It does not however give much insight into the di culty of reasoning that might take place while playing the game, and constructing the solution. 1.2

Math Garden

This work has been triggered by the idea of introducing a dedicated logical reasoning training in primary schools through the online Math Garden system (Rekentuin.nl or MathsGarden.com, see [17]). Math Garden is an adaptive training environment in which students can play various educational games especially designed to facilitate the development of their abstract thinking. Currently, it consists of 15 arithmetic games and 2 complex reasoning games. Students play the game-items suited for their level. A di cultly level is appropriate for a student if she is able to solve 70% items of this level correctly. The di culty of tasks and the level of the students' play are being continuously estimated according to the Elo rating system, which is used for calculating the relative skill levels of players in two-player games such as chess [18]. Here, the relative skill level is computed on the basis of student v. game-item opposition: students are rated by playing, and items are rated by getting played. The rating depends on accuracy and speed of problem solving [19]. The rating scales go from in nity to + in nity. If a child has the same rating as an item this means that the child can solve the item with probability :5. In general, the absolute values of the ratings have no straightforward meaning. Hence, one result of children playing within Math Garden is a rating of all items, which gives item di culty parameters. At the same time every child gets a rating that re ects her or his reasoning ability. One of the goals of this project is to understand the empirically established item di culty parameters by means of a logical analysis of the items. Figure 5 in the Appendix depicts the educational and research context of Math Garden. 2

Deductive Mastermind

Now let us turn to Deductive Mastermind, the game that we designed for the Math Garden system (its corresponding name within Math Garden is Flowercode). Figure 1 shows an example item, revealing the basic setting of the game. Each item consists of a decoding board ( 1 ), short feedback instruction ( 2 ), the domain of owers to choose from while constructing the solution ( 3 ) and the timer in the form of disappearing coins ( 4 ). The goal of the game is to guess the right sequence of owers on the basis of the clues given on the decoding board. Each row of owers forms a conjecture that is accompanied by a small board on the right side of it. The dots on the board code the feedback about the conjecture: one green dot for each ower of correct color and position, one orange dot for each ower of correct color but wrong position, and one red dot for each ower that does not appear in the secret sequence at all. In order to win, the player is supposed to pick the right owers from the given domain ( 3 ), place them in the right order on the decoding board, right under the clues, and press the OK button. She must do so before the time is up, i.e., before all the coins ( 4 ) disappear. If the guess is correct, the number of coins that were left as she made the guess is added to her overall score. If her guess is wrong, the same number is subtracted from it. In this way making fast guesses is punished, but making a fast correct response is encouraged [19].

Unlike the classical version of the Mastermind game, Deductive Mastermind does not require the player to come up with the trial conjectures. Instead, Flowercode gives the clues directly, ensuring that they allow exactly one correct solution. Hence, Deductive Mastermind reduces the complexity of classical Mastermind by changing from an inductive inference game into a very basic logical-reasoning game. On the other hand, when compared to Static Mastermind, Deductive Mastermind di ers with respect to the goal. In fact, by guaranteeing the existence of exactly one correct solution, Deductive Mastermind collapses the postulated complexity from Theorem 1, since the question of the Static Mastermind Satisability Problem becomes void. It is fair to say that Deductive Mastermind is a combination of the classical Mastermind game (the goal of the game is the same: nding a secret code) and the Static Mastermind game (it does not involve the trial-and-error inductive inference experimentation). Its very basic setting allows access to atomic logical steps of non-linguistic logical reasoning. Moreover, Deductive Mastermind is easily adaptable as a single-player game, and hence suitable for the Math Garden system. The simple setting provides educational justi cation, as the game trains very basic logical skills.

The game has been running within the Math Garden system since November 2010. It includes 321 game-items, with conjectures of various lengths (1-5 owers) and number of colors (from 2 to 5). By January 2012, 2,187,354 items had been played by 28,247 primary school students (grades 1-6, age: 6-12 years) in over 400 locations (schools and family homes). This extensive data-collecting process allows analyzing various aspects of training, e.g., we can access the individual progress of individual players on a single game, or the most frequent mistakes with respect to a game-item. Most importantly, due to the studentitem rating system mentioned in Section 1.2, we can infer the relative di culty of game-items. Within this hierarchy we observed that the present game-item domain contains certain \gaps in di culty"|it turns out that our initial di culty estimation in terms of non-logical aspects (e.g., number of owers, number of colors, number of lines, the rate of the hypotheses elimination, etc.) is not precise, and hence the domain of items that we generated does not cover the whole di culty space. Providing a logical apparatus that predicts and explains the di culty of Deductive Mastermind game-items can help solving this problem and hence also facilitate the training e ect of Flowercode (see Appendix, Figure 7).

A Logical Analysis

Each Deductive Mastermind game-item consists of a sequence of conjectures. De nition 2. A conjecture of length l over k colors is any sequence given by a total assignment, h : f1; : : : ; `g ! fc1; : : : ; ckg. The goal sequence is a distinguished conjecture, goal : f1; : : : ; `g ! fc1; : : : ; ckg.

Every non-goal conjecture is accompanied by a feedback that indicates how similar h is to the given goal assignment. The three feedback colors: green, orange, and red, described in Section 2, will be represented by letters g; o; r. De nition 3. Let h be a conjecture and let goal be the goal sequence, both of length l over k colors. The feedback f for h with respect to goal is a sequence a b c zg :}:|: g{ oz :}:|: o{ zr :}:|: r{ = gaobrc; where a; b; c 2 f0; 1; 2; 3; : : :g and a + b + c = `. The feedback consists of: { exactly one g for each i 2 G, where G = fi 2 f1; : : : `g j h(i) = goal(i)g. { exactly one o for every i 2 O, where O = fi 2 f1; : : : ; `gnG j there is a j 2 f1; : : : ; `gnG; such that i 6= jand h(i) = goal(j)g.

{ exactly one r for every i 2 f1; : : : ; `gn(G[O). 3.1

The informational content of the feedback

How to logically express the information carried by each pair (h; f )? To shape the intuitions let us rst give a second-order logic formula that encodes any feedback sequence gaobrc for any h with respect to any goal: ^ 9O 9G f1; : : : `g(card(G)=a ^ 8i2G h(i)=goal(i) ^ 8i2=G h(i)6=goal(i) f1; : : : `gnG (card(O)=b ^ 8i2O 9j2f1; : : : `gnG(j6=i ^ h(i)=goal(j)) ^ 8i2f1; : : : `gn(G[O) 8j2f1; : : : `gnG h(i)6=goal(j))):

Since the conjecture length, `, is xed for any game-item, it seems sensible to give a general method of providing a less engaging, propositional formula for any instance of (h; f ). As literals of our Boolean formulae we take h(i) = goal(j), where i; j 2 f1; : : : `g (they might be viewed as propositional variables pi;j , for i; j 2 f1; : : : `g). With respect to sets G, O, and R that induce a partition of g f1; : : : `g, we de ne 'G; 'oG;O; 'rG;O, the propositional formulae that correspond to di erent parts of feedback, in the following way: { 'gG := Vi2G h(i)=goal(i) ^ Vj2f1;:::;`gnG h(j)6=goal(j), { 'oG;O := Vi2O(Wj2f1;:::;`gnG;i6=j h(i) = goal(j)), { 'rG;O := Vi2f1;:::`gn(G[O);j2f1;:::`gnG;i6=j h(i) 6= goal(j). Observe that there will be as many substitutions of each of the above schemes of formulae, as there are ways to choose the corresponding sets G and O. To x the domain of those possibilities we set G := fGjG f1; : : : ; `g ^ card(G)=ag, and, if G f1; : : : ; `g, then OG = fOjO f1; : : : ; `gnG ^ card(O)=bg. Finally, we can set Bt(h; f ), the Boolean translation of (h; f ), to be given by: Bt(h; f ) := _ ('gG ^

_ ('oG;O ^ 'rG;O)): G2G

O2OG Example 1. Let us take ` = 2 and (h; f ) such that: h( 1 ):=c1, h( 2 ):=c2; f :=or. Then G=f;g, Of;g=ff1g; f2gg. The corresponding formula, Bt(h; f ), is: (goal( 1 )6=c1 ^ goal( 2 )6=c2) ^ ((goal( 1 )=c2 ^ goal( 2 )6=c1) _ (goal( 2 )=c1 ^ goal( 1 )6=c2))

Each Deductive Mastermind game-item consists of a sequence of conjectures together with their respective feedbacks. Let us de ne it formally. De nition 4. A Deductive Mastermind game-item over ` positions, k colors and n lines, DM (l; k; n), is a set f(h1; f1); : : : ; (hn; fn)g of pairs, each consisting of a single conjecture together with its corresponding feedback. Respectively, Bt(DM (l; k; n)) = Bt(f(h1; f1); : : : ; (hn; fn)g) = fBt(h1; f1); : : : ; Bt(hn; fn)g. Hence, each Deductive Mastermind game-item can be viewed as a set of Boolean formulae. Moreover, by the construction of the game-items we have that this set is satis able, and that there is a unique valuation that satis es it. Now let us focus on a method of nding this valuation. 3.2

Analytic Tableaux for Deductive Mastermind

In proof theory, the analytic tableau is a decision procedure for propositional logic [20{22]. The tableau method can determine the satis ability of nite sets of formulas of propositional logic by giving an adequate valuation. The method builds a formulae-labeled tree rooted at the given set of formulae and unfolding breaks these formulae into smaller formulae until contradictory pairs of literals are produced or no further reduction is possible. The rules of analytic tableaux for propositional logic, that are relevant for our considerations are as follows.3 ' ^ ', ^ ' ' _ _ By our considerations from Section 3 we can now conclude that applying the analytic tableaux method to the Boolean translation of a Deductive Mastermind game-item will give the unique missing assignment goal. In the rest of the paper we will focus on 2-pin Deductive Mastermind game-items (where ` = 2), in particular, we will explain the tableau method in more detail on those simple examples. 3 We do not need the rule for negation because in our formulae only literals are negated. 2-pin Deductive Mastermind Game-items Since the possible feedbacks consist of letters g (green), o (orange), and r (red), in principle for the 2-pin Deductive Mastermind game-items we get six possible feedbacks: gg; oo; rr; go; gr; or. From those: gg is excluded as non-applicable; go is excluded because there are only two positions. Let us take a pair (h; f ), where h( 1 )=ci, h( 2 )=cj , then, depending on the feedback, the corresponding boolean formulae are given in Figure 2. We can compare the complexity of those feedbacks just by looking at their tree representations created from the boolean translations via the tableau method. Those Feedback Boolean translation oo goal( 1 ) 6= ci ^ goal( 2 ) 6= cj ^ goal( 1 ) = cj ^ goal( 2 ) = ci rr goal( 1 ) 6= ci ^ goal( 2 ) 6= cj ^ goal( 1 ) 6= cj ^ goal( 2 ) 6= ci gr (goal( 1 ) = ci ^ goal( 2 ) 6= cj ) _ (goal( 2 ) = cj ^ goal( 1 ) 6= ci) or (goal( 1 ) 6= ci ^ goal( 2 ) 6= cj ) ^

((goal( 1 ) = cj ^ goal( 2 ) 6= ci) _ (goal( 2 ) = ci ^ goal( 1 ) 6= cj )) ci; cj oo ci; cj

rr goal( 1 )6=ci goal( 1 )6=ci goal( 2 )6=cj goal( 2 )6=cj goal( 1 )=cj goal( 1 )6=cj goal( 2 )=ci goal( 2 )6=ci ci; cj gr goal( 1 )=ci goal( 2 )=cj goal( 2 )6=cj goal( 1 )6=ci ci; cj or goal( 1 )6=c1 goal( 2 )6=c2 goal( 2 )=c1 goal( 1 )=c2 goal( 1 )6=c2 goal( 2 )6=c1 goal( 2 )6=c2 goal( 2 )6=c1 representations clearly indicate that the order of the tree-di culty for the four possible feedbacks is: oo<rr<gr<or. As the feedbacks oo; rr are conjunctions, they do not require branching (the other two include disjunctions, and as such demand reasoning by cases). Unlike rr, the feedback oo in fact gives the solution immediately. Within the two remaining rules, gr requires less memory to store the information in each branch.

We will now brie y discuss the tableau method on an example. Let us consider the following Deductive Mastermind game-item (Figure 3). The tree on the left stands for the reasoning which corresponds to analyzing the conjectures as given, from top to bottom. The rst branching gives the result of applying the gr feedback. In the next level of the tree we apply the oo feedback to the second conjecture. We must rst do so assuming the left branch of the rst conjecture to be true. This leads to a contradiction|on this branch we get that goal( 2 )=c1 and goal( 2 )6=c1. Then we move to the right branch of the rst conjecture. This assumption leads to discovering the right assignment, goal( 1 )=c2 and goal( 2 )=c1, there is no contradiction on this branch. This tableau procedure is required to build the whole tree for the game-item. That is not always necessary. The right-most part of Figure 3 shows what happens if you chose to start the analysis from the second conjecture. We rst apply the feedback oo to the second conjecture. We immediately get: goal( 1 )=c2 goal( 2 )=c1. The full goal( 1 )=c1 goal( 2 )6=c1 Bt(h2; f2)

oo goal( 1 )=c2 goal( 2 )=c1 goal( 2 )=c1 goal( 1 )6=c1 Bt(h2; f2)

oo goal( 1 )=c2 goal( 2 )=c1

Bt(h2; f2) Bt(h1; f1)

oo goal( 1 )=c2 goal( 2 )=c1 Bt(h1; f1) unique assignment with no contradiction. We can stop the computation at this point|if the other conjecture contradicted this assignment, then it would mean that the two conjectures must be inconsistent and hence not satis able. This contradicts the setting of our game.

The tree might not always give us the complete valuation explicitly. In some game-items it is required to use a ower that did not appear in the conjectures. This is so in the example in Figure 4; the right-most branch does not give a contradiction, it does assign color 1 to the second position of the goal conjecture. In such case we draw the remaining color, c3 (a tulip in the picture), as the missing value of the rst position of the goal conjecture, i.e., goal( 1 )=c3. Bt(h1; f1) Bt(h2; f2)

gr goal( 1 )=c1 goal( 2 )6=c1 Bt(h2; f2) gr goal( 2 )=c1 goal( 1 )6=c1 Bt(h2; f2)

gr goal( 1 )=c2 goal( 2 )=c1 goal( 2 )6=c1 goal( 1 )6=c2 goal( 1 )=c2 goal( 2 )=c1 goal( 2 )6=c1 goal( 1 )6=c2

Hypotheses and Preliminary Results

Normatively speaking, the full tree generated by the tableau method for the set of formulae corresponding to a Deductive Mastermind game-item gives its ideal reasoning scheme and thus the size of the tree can be thought of as an abstract complexity measure. Obviously, the shape and the size of the tree for each Deductive Mastermind item depends to some extent on the order in which the formulae are analyzed (see Figure 3). Hence, using the tableau method it is even possible to analyze whether and in what way the students apply reasoning strategies, i.e., how they manipulate with the elements of the task in order to optimize the size of the reasoning tree (i.e., the length of the computation). In this way, items' logical di culty can be expressed via the size of their minimal trees. The empirical data, resulting from children playing the Deductive Mastermind game in Math Garden, includes item ratings. In the rst analysis of this data we aimed at relating the item ratings to the parameters of the trees. To this end we de ne a computational method based on the tableau method. The computational method makes two assumptions. First, the formulae are not processed from top to bottom, but instead the order depends on the length of the rule that is associated with the feedback. That is, feedback is processed in the following order: oo, rr, gr, or. Ties are solved by processing the top formula rst. Second, the computational method is stopped once a consistent solution is found, assuming that there exists at least one solution. Based on these principles and the tableau method we programmed a recursive algorithm for calculating the type and number of steps until solution for each item is reached. We de ne 4 measures of theoretically derived item di culties; the required number of oo, rr, gr, and rr steps, which together might predict item di culty. Below we will describe the empirically derived item ratings of the 2-pin items and we will show how these relate to the theoretically derived measures of item di culties. Method Participants were 28,247 students from grades 1-6, of age: 6-12 years. Together, they played 2,187,354 items between November 2010 and January 2012. From the total of 321 items in the Master Mind game, 100 items have two pins. From these 100 items, 10 items involved 2 colors, 30 items involved 3 colors, 30 items involved 4 colors and 30 items involved 5 colors. Results To test the relation between empirically derived item ratings and theoretically derived measures of item di culty we did a regression analysis that includes the basic features of the items (number of colors and number of guesses) and the required number of oo, rr, gr, and or analysis steps as predicted by the tableau-based computational algorithm (F (6; 93)=33, p < :0001, R2=:66). All these factors but one (i.e., number of gr evaluations) were signi cant in predicting item di culties: number of colours ( =1:07, p=:02), number of hypotheses ( =1:75, p<:01), number of oo feedbacks ( = 5:1, p<:001), number of rr feedbacks ( = 3:19; p<:0001), number of gr evaluations (ns), number of or evaluations ( =1:6, p<:0001). Note that among the rules, only the required number of or steps increases item di culty. A second aspect of the observed item ratings to explain is the shape of its distribution. The distribution of the item di culties shows a remarkable property for the 2-pin items (see Appendix, Figure 6). The distribution is bimodal, meaning that there is a cluster of more di cult items (item ratings >0) and a cluster of easier items (item ratings <0). In the cluster of easy items, there are items with 2, 3, 4, and 5 colors. The cluster of di cult items consists of items with 3, 4 and 5 colors. The two clusters could be expressed fully in terms of model-relevant features. It appeared that items are easy in the following two cases: ( 1 ) no or feedback and no gr feedback; ( 2 ) no or feedback, at least one gr feedback, and all colors are included in at least one of the conjectures. Items are di cult otherwise. This shows that or steps make the item relatively di cult, but only if or is required to solve the item. A second aspect that makes an item di cult is the exclusion of one of the colors in the solution from the hypotheses rows. 5

Conclusions and Future Work

In this paper we proposed a way to use a proof-theoretical concept to analyze the cognitive di culty of logical reasoning. The rst hypotheses that we have drawn from the model gave a reasonably good prediction of item-di culties, but the t is not perfect. However, it must be noted that several non-logical factors may play a role in the item di culty as well. For example, the motor requirements to give the answer also introduce some variation in item di culty, which depends not only on accuracy but also on speed. The minimal number of clicks required to give an answer varies between items. We did not take these aspects into account so far. In particular, the two di culty clusters observed in the empirical study can be explained with the use of our method.

Our further work can be split into two main parts. We will continue this study on the theoretical level by analyzing various complexity measures that can be obtained on the basis of the tableau system. We also plan to compare the t with empirical data of the tableau-derived measures and other possible logical formalizations of level di culty (e.g., a resolution-based model). On the empirical side we rst would like to extend our analysis to game-items that consist of conjectures of higher length. This will allow comparing di culty of tasks of di erent size and measure the trade-o between the size and the structure of the tasks. We will also study this model in a broader context of other Math Garden games. This would allow comparing individual abilities within Deductive Mastermind and the abilities within other games that have been correlated for instance with the working memory load. Finally, it would be interesting to see whether the children are really using the proposed di culty order of feedbacks in nding an optimal tableau|this could be checked in an eye-tracking experiment (see [23, 24]).

Appendix

The appendix contains three gures. The rst one illustrates the educational and research context of the Math Garden system (Figure 5); the second shows the distribution of the empirical di culty of all items in Deductive Mastermind (Figure 6); and the third gives the frequency of players and game-items with respect to the overall rating (Figure 7). We refer to those gures in the proper text of the article.

Student game playing

Investigators data Math Garden.com instruction new items, tasks report instruction methods

Teacher

1. Szymanik , J.: Quanti ers in TIME and SPACE. Computational Complexity of Generalized Quanti ers in Natural Language . PhD thesis , Universiteit Van Amsterdam ( 2009 )

2. Gierasimczuk , N. , Szymanik , J.: Branching quanti cation v. two-way quanti cation . Journal of Semantics 26 ( 4 ) ( 2009 ) 367 { 392

3. Szymanik , J. , Zajenkowski , M. : Comprehension of simple quanti ers. Empirical evaluation of a computational model . Cognitive Science 34 ( 3 ) ( 2010 ) 521 { 532

4. Zajenkowski , M. , Styla , R. , Szymanik , J.: A computational approach to quanti ers as an explanation for some language impairments in schizophrenia . Journal of Communication Disorders 44 ( 6 ) ( 2011 ) 595 { 600

Van

Rooij , I. , Kwisthout , J. , Blokpoel , M. , Szymanik , J. , Wareham , T. , Toni , I. : Intentional communication: Computationally easy or di cult? Frontiers in Human Neuroscience 5 ( 2011 ) 1 { 18

6. Verbrugge , R. , Mol , L. : Learning to apply theory of mind . Journal of Logic, Language and Information 17 ( 4 ) ( 2008 ) 489 { 511

7. Knuth , D.E. : The computer as master mind . Journal of Recreational Mathematics 9 ( 1 ) ( 1977 ) 1{ 6

8. Irving , R.W.: Towards an optimum Mastermind strategy . Journal of Recreational Mathematics 11 ( 1978 -79) 81 { 87

9. Koyama , M. , Lai , T. : An optimal Mastermind strategy . Journal of Recreational Mathematics 25 ( 1993 ) 251 | 256

10. Kooi , B. : Yet another Mastermind strategy . ICGA Journal 28 ( 1 ) ( 2005 ) 13 { 20

11. Chvatal , V.: Mastermind. Combinatorica 325 { 329 ( 1983 )

12. Greenwell , D.L. : Mastermind . Journal of Recreational Mathematics 30 ( 1999 -2000) 191 { 192

13. Stuckman , J. , Zhang, G.: Mastermind is NP-complete . INFOCOMP Journal of Computer Science 5 ( 2006 ) 25 { 28

14. Van Rooij , I. : The tractable cognition thesis . Cognitive Science 32 ( 2008 ) 939 { 984

15. Szymanik , J.: Computational complexity of polyadic lifts of generalized quanti ers in natural language . Linguistics and Philosophy 33 ( 3 ) ( 2010 ) 215 { 250

16. Ristad , E.S.: The Language Complexity Game . The MIT Press ( 1993 )

17. Van der Maas , H.L.J. , Klinkenberg , S. , Straatemeier , M. : Rekentuin.nl: combinatie Van oefenen en toetsen . Examens ( 2010 ) 10 { 14

18. Elo , A. : The Rating of Chessplayers, Past and Present . Arco ( 1978 )

19. Klinkenberg , S. , Straatemeier , M. , Van der Maas , H.L.J.: Computer adaptive practice of maths ability using a new item response model for on the y ability and di culty estimation . Computers in Education 57 ( 2011 ) 1813 { 1824

20. Beth , E.W. : Semantic entailment and formal derivability . Mededelingen Van de Koninklijke Nederlandse Akademie Van Wetenschappen . Afdeling Letterkunde 18 ( 13 ) ( 1955 ) 309 { 342

21. Smullyan , R.: First-order logic . Springer-Verlag, Berlin ( 1968 )

22. Van Benthem , J. : Semantic tableaus . Nieuw Archief voor Wiskunde 22 ( 1974 ) 44 { 59

23. Ghosh , S. , Meijering , B. , Verbrugge , R. : Logic meets cognition: Empirical reasoning in games . In Boissier, O. , Fallah-Seghrouchni , A.E. , Hassas , S. , Maudet , N., eds. : MALLOW, CUER Workshop Proceedings ( 2010 )

24. Ghosh , S. , Meijering , B. : On combining cognitive and formal modeling: a case study involving strategic reasoning . In Van Eijck, J. , Verbrugge , R., eds. : Proceedings of the Workshop on Reasoning About Other Minds: Logical and Cognitive Perspectives (RAOM-2011) , Groningen, CEUR Workshop Proceedings ( 2011 )