<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Logical and Psychological Analysis of Deductive Mastermind</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Nina Gierasimczuk</string-name>
          <email>Nina.Gierasimczuk@gmail.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Han van der Maas</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maartje Raijmakers</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Psychology, University of Amsterdam</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Institute for Logic, Language and Computation, University of Amsterdam</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>The paper proposes a way to analyze logical reasoning in a deductive version of the Mastermind game implemented within the Math Garden educational system. Our main goal is to derive predictions about the cognitive di culty of game-plays, e.g., the number of steps needed for solving the logical tasks or the working memory load. Our model is based on the analytic tableaux method, known from proof theory. We associate the di culty of the Deductive Mastermind game-items with the size of the corresponding logical tree derived by the tableau method. We discuss possible empirical hypotheses based on this model, and preliminary results that prove the relevance of our theory.</p>
      </abstract>
      <kwd-group>
        <kwd>Deductive Mastermind</kwd>
        <kwd>Mastermind game</kwd>
        <kwd>deductive reasoning</kwd>
        <kwd>analytic tableaux</kwd>
        <kwd>Math Garden</kwd>
        <kwd>educational tools</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>Computational and logical analysis has already proven useful for the
investigations into the cognitive di culty of linguistic and communicative tasks (see
[1{5]). We follow this line of research by adapting formal logical tools to directly
analyze the di culty of non-linguistic logical reasoning. Our object of study is
a deductive version of the Mastermind game. Although the game has been used
to investigate the acquisition of complex skills and strategies in the domain of
reasoning about others [6], as far as we know, it has not yet been studied for
the di culty of the accompanying deductive reasoning. Our model of reasoning
is based on the proof-theoretical method of analytic tableaux for propositional
logic, and it gives predictions about the empirical di culty of game-items. In
the remaining part of this section we give the background of our work: we
explain the classical and static versions of the Mastermind game, and describe the
main principles of the online Math Garden system. In Section 2 we introduce
Deductive Mastermind as implemented within Math Garden. Section 3 gives a
logical analysis of Deductive Mastermind game-items using the tableau method.
Finally, Section 4 discusses some hypotheses drawn on the basis of our model,
and gives preliminary results. In the end we brie y discuss the directions for
further work.</p>
    </sec>
    <sec id="sec-2">
      <title>Mastermind Game</title>
      <p>Mastermind is a code-breaking game for two players. The modern game with
pegs was invented in 1970 by Mordecai Meirowitz, but the game resembles an
earlier pen and paper game called Bulls and Cows. The Mastermind game, as
known today, consists of a decoding board, code pegs of k colors, and feedback
pegs of black and white. There are two players, the code-maker, who chooses a
secret pattern of ` code pegs (color duplicates are allowed), and the code-breaker,
who guesses the pattern, in a given n rounds. Each round consists of code-breaker
making a guess by placing a row of ` code pegs, and of code-maker providing the
feedback of zero to ` key pegs: a black key for each code peg of correct color and
position, and a white key for each peg of correct color but wrong position. After
that another guess is made. Guesses and feedbacks continue to alternate until
either the code-breaker guesses correctly, or n incorrect guesses have been made.
The code-breaker wins if he obtains the solution within n rounds; the code-maker
wins otherwise. Mastermind is an inductive inquiry game that involves trials of
experimentation and evaluation. As such it leads to the interesting question of
the underlying logical reasoning and its di culty. Existing mathematical results
on Mastermind do not provide any answer to this problem|most focus has
been directed at nding a strategy that allows winning the game in the smallest
number of rounds (see [7{10]).</p>
      <p>Static Mastermind [11] is a version of the Mastermind game in which the goal
is to nd the minimum number of guesses the code-breaker can make all at once
at the beginning of the game (without waiting for the individual feedbacks), and
upon receiving them all at once completely determine the code in the next guess.
In the case of this game some strategy analysis has been conducted [12], but,
more importantly, Static Mastermind has been given a computational
complexity analysis [13]. The corresponding Static Mastermind Satis ability Decision
Problem has been de ned in the following way.</p>
    </sec>
    <sec id="sec-3">
      <title>De nition 1 (Mastermind Satis ability Decision Problem).</title>
      <p>Input A set of guesses G and their corresponding scores.</p>
      <p>Question Is there at least one valid solution?
Theorem 1. Mastermind Satis ability Decision Problem is NP-complete.
This result gives an objective computational measure of the di culty of the task.
NP-complete problems are believed to be cognitively hard [14{16], hence this
result has been claimed to indicate why Mastermind is an engaging and popular
game. It does not however give much insight into the di culty of reasoning that
might take place while playing the game, and constructing the solution.
1.2</p>
    </sec>
    <sec id="sec-4">
      <title>Math Garden</title>
      <p>This work has been triggered by the idea of introducing a dedicated logical
reasoning training in primary schools through the online Math Garden system
(Rekentuin.nl or MathsGarden.com, see [17]). Math Garden is an adaptive
training environment in which students can play various educational games especially
designed to facilitate the development of their abstract thinking. Currently, it
consists of 15 arithmetic games and 2 complex reasoning games. Students play
the game-items suited for their level. A di cultly level is appropriate for a
student if she is able to solve 70% items of this level correctly. The di culty of tasks
and the level of the students' play are being continuously estimated according
to the Elo rating system, which is used for calculating the relative skill levels
of players in two-player games such as chess [18]. Here, the relative skill level is
computed on the basis of student v. game-item opposition: students are rated by
playing, and items are rated by getting played. The rating depends on accuracy
and speed of problem solving [19]. The rating scales go from in nity to +
in nity. If a child has the same rating as an item this means that the child can
solve the item with probability :5. In general, the absolute values of the ratings
have no straightforward meaning. Hence, one result of children playing within
Math Garden is a rating of all items, which gives item di culty parameters. At
the same time every child gets a rating that re ects her or his reasoning ability.
One of the goals of this project is to understand the empirically established item
di culty parameters by means of a logical analysis of the items. Figure 5 in the
Appendix depicts the educational and research context of Math Garden.
2</p>
      <sec id="sec-4-1">
        <title>Deductive Mastermind</title>
        <p>
          Now let us turn to Deductive Mastermind, the game that we designed for the
Math Garden system (its corresponding name within Math Garden is
Flowercode). Figure 1 shows an example item, revealing the basic setting of the game.
Each item consists of a decoding board (
          <xref ref-type="bibr" rid="ref1">1</xref>
          ), short feedback instruction (
          <xref ref-type="bibr" rid="ref2">2</xref>
          ), the
domain of owers to choose from while constructing the solution (
          <xref ref-type="bibr" rid="ref3">3</xref>
          ) and the
timer in the form of disappearing coins (
          <xref ref-type="bibr" rid="ref4">4</xref>
          ). The goal of the game is to guess
the right sequence of owers on the basis of the clues given on the decoding
board. Each row of owers forms a conjecture that is accompanied by a small
board on the right side of it. The dots on the board code the feedback about
the conjecture: one green dot for each ower of correct color and position, one
orange dot for each ower of correct color but wrong position, and one red dot
for each ower that does not appear in the secret sequence at all. In order to
win, the player is supposed to pick the right owers from the given domain (
          <xref ref-type="bibr" rid="ref3">3</xref>
          ),
place them in the right order on the decoding board, right under the clues, and
press the OK button. She must do so before the time is up, i.e., before all the
coins (
          <xref ref-type="bibr" rid="ref4">4</xref>
          ) disappear. If the guess is correct, the number of coins that were left as
she made the guess is added to her overall score. If her guess is wrong, the same
number is subtracted from it. In this way making fast guesses is punished, but
making a fast correct response is encouraged [19].
        </p>
        <p>Unlike the classical version of the Mastermind game, Deductive Mastermind
does not require the player to come up with the trial conjectures. Instead,
Flowercode gives the clues directly, ensuring that they allow exactly one correct solution.
Hence, Deductive Mastermind reduces the complexity of classical Mastermind
by changing from an inductive inference game into a very basic logical-reasoning
game. On the other hand, when compared to Static Mastermind, Deductive
Mastermind di ers with respect to the goal. In fact, by guaranteeing the existence
of exactly one correct solution, Deductive Mastermind collapses the postulated
complexity from Theorem 1, since the question of the Static Mastermind
Satisability Problem becomes void. It is fair to say that Deductive Mastermind is a
combination of the classical Mastermind game (the goal of the game is the same:
nding a secret code) and the Static Mastermind game (it does not involve the
trial-and-error inductive inference experimentation). Its very basic setting
allows access to atomic logical steps of non-linguistic logical reasoning. Moreover,
Deductive Mastermind is easily adaptable as a single-player game, and hence
suitable for the Math Garden system. The simple setting provides educational
justi cation, as the game trains very basic logical skills.</p>
        <p>The game has been running within the Math Garden system since
November 2010. It includes 321 game-items, with conjectures of various lengths (1-5
owers) and number of colors (from 2 to 5). By January 2012, 2,187,354 items
had been played by 28,247 primary school students (grades 1-6, age: 6-12 years)
in over 400 locations (schools and family homes). This extensive data-collecting
process allows analyzing various aspects of training, e.g., we can access the
individual progress of individual players on a single game, or the most frequent
mistakes with respect to a game-item. Most importantly, due to the
studentitem rating system mentioned in Section 1.2, we can infer the relative di culty
of game-items. Within this hierarchy we observed that the present game-item
domain contains certain \gaps in di culty"|it turns out that our initial di
culty estimation in terms of non-logical aspects (e.g., number of owers, number
of colors, number of lines, the rate of the hypotheses elimination, etc.) is not
precise, and hence the domain of items that we generated does not cover the
whole di culty space. Providing a logical apparatus that predicts and explains
the di culty of Deductive Mastermind game-items can help solving this problem
and hence also facilitate the training e ect of Flowercode (see Appendix, Figure
7).</p>
      </sec>
      <sec id="sec-4-2">
        <title>A Logical Analysis</title>
        <p>Each Deductive Mastermind game-item consists of a sequence of conjectures.
De nition 2. A conjecture of length l over k colors is any sequence given by
a total assignment, h : f1; : : : ; `g ! fc1; : : : ; ckg. The goal sequence is a
distinguished conjecture, goal : f1; : : : ; `g ! fc1; : : : ; ckg.</p>
        <p>Every non-goal conjecture is accompanied by a feedback that indicates how
similar h is to the given goal assignment. The three feedback colors: green,
orange, and red, described in Section 2, will be represented by letters g; o; r.
De nition 3. Let h be a conjecture and let goal be the goal sequence, both of
length l over k colors. The feedback f for h with respect to goal is a sequence
a b c
zg :}:|: g{ oz :}:|: o{ zr :}:|: r{ = gaobrc;
where a; b; c 2 f0; 1; 2; 3; : : :g and a + b + c = `. The feedback consists of:
{ exactly one g for each i 2 G, where G = fi 2 f1; : : : `g j h(i) = goal(i)g.
{ exactly one o for every i 2 O, where O = fi 2 f1; : : : ; `gnG j there is a j 2
f1; : : : ; `gnG; such that i 6= jand h(i) = goal(j)g.</p>
        <p>{ exactly one r for every i 2 f1; : : : ; `gn(G[O).
3.1</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>The informational content of the feedback</title>
      <p>How to logically express the information carried by each pair (h; f )? To shape
the intuitions let us rst give a second-order logic formula that encodes any
feedback sequence gaobrc for any h with respect to any goal:
^ 9O
9G f1; : : : `g(card(G)=a ^ 8i2G h(i)=goal(i) ^ 8i2=G h(i)6=goal(i)
f1; : : : `gnG (card(O)=b ^ 8i2O 9j2f1; : : : `gnG(j6=i ^ h(i)=goal(j))
^ 8i2f1; : : : `gn(G[O) 8j2f1; : : : `gnG h(i)6=goal(j))):</p>
      <p>Since the conjecture length, `, is xed for any game-item, it seems sensible to
give a general method of providing a less engaging, propositional formula for any
instance of (h; f ). As literals of our Boolean formulae we take h(i) = goal(j),
where i; j 2 f1; : : : `g (they might be viewed as propositional variables pi;j , for
i; j 2 f1; : : : `g). With respect to sets G, O, and R that induce a partition of
g
f1; : : : `g, we de ne 'G; 'oG;O; 'rG;O, the propositional formulae that correspond
to di erent parts of feedback, in the following way:
{ 'gG := Vi2G h(i)=goal(i) ^ Vj2f1;:::;`gnG h(j)6=goal(j),
{ 'oG;O := Vi2O(Wj2f1;:::;`gnG;i6=j h(i) = goal(j)),
{ 'rG;O := Vi2f1;:::`gn(G[O);j2f1;:::`gnG;i6=j h(i) 6= goal(j).
Observe that there will be as many substitutions of each of the above schemes
of formulae, as there are ways to choose the corresponding sets G and O. To x
the domain of those possibilities we set G := fGjG f1; : : : ; `g ^ card(G)=ag,
and, if G f1; : : : ; `g, then OG = fOjO f1; : : : ; `gnG ^ card(O)=bg. Finally,
we can set Bt(h; f ), the Boolean translation of (h; f ), to be given by:
Bt(h; f ) :=
_ ('gG ^</p>
      <p>_ ('oG;O ^ 'rG;O)):
G2G</p>
      <p>
        O2OG
Example 1. Let us take ` = 2 and (h; f ) such that: h(
        <xref ref-type="bibr" rid="ref1">1</xref>
        ):=c1, h(
        <xref ref-type="bibr" rid="ref2">2</xref>
        ):=c2; f :=or.
Then G=f;g, Of;g=ff1g; f2gg. The corresponding formula, Bt(h; f ), is:
(goal(
        <xref ref-type="bibr" rid="ref1">1</xref>
        )6=c1 ^ goal(
        <xref ref-type="bibr" rid="ref2">2</xref>
        )6=c2) ^ ((goal(
        <xref ref-type="bibr" rid="ref1">1</xref>
        )=c2 ^ goal(
        <xref ref-type="bibr" rid="ref2">2</xref>
        )6=c1) _ (goal(
        <xref ref-type="bibr" rid="ref2">2</xref>
        )=c1 ^ goal(
        <xref ref-type="bibr" rid="ref1">1</xref>
        )6=c2))
      </p>
      <p>Each Deductive Mastermind game-item consists of a sequence of conjectures
together with their respective feedbacks. Let us de ne it formally.
De nition 4. A Deductive Mastermind game-item over ` positions, k colors
and n lines, DM (l; k; n), is a set f(h1; f1); : : : ; (hn; fn)g of pairs, each
consisting of a single conjecture together with its corresponding feedback. Respectively,
Bt(DM (l; k; n)) = Bt(f(h1; f1); : : : ; (hn; fn)g) = fBt(h1; f1); : : : ; Bt(hn; fn)g.
Hence, each Deductive Mastermind game-item can be viewed as a set of Boolean
formulae. Moreover, by the construction of the game-items we have that this set
is satis able, and that there is a unique valuation that satis es it. Now let us
focus on a method of nding this valuation.
3.2</p>
    </sec>
    <sec id="sec-6">
      <title>Analytic Tableaux for Deductive Mastermind</title>
      <p>
        In proof theory, the analytic tableau is a decision procedure for propositional
logic [20{22]. The tableau method can determine the satis ability of nite sets
of formulas of propositional logic by giving an adequate valuation. The method
builds a formulae-labeled tree rooted at the given set of formulae and unfolding
breaks these formulae into smaller formulae until contradictory pairs of literals
are produced or no further reduction is possible. The rules of analytic tableaux
for propositional logic, that are relevant for our considerations are as follows.3
' ^
',
^
'
' _
_
By our considerations from Section 3 we can now conclude that applying the
analytic tableaux method to the Boolean translation of a Deductive Mastermind
game-item will give the unique missing assignment goal. In the rest of the paper
we will focus on 2-pin Deductive Mastermind game-items (where ` = 2), in
particular, we will explain the tableau method in more detail on those simple
examples.
3 We do not need the rule for negation because in our formulae only literals are negated.
2-pin Deductive Mastermind Game-items Since the possible feedbacks consist
of letters g (green), o (orange), and r (red), in principle for the 2-pin Deductive
Mastermind game-items we get six possible feedbacks: gg; oo; rr; go; gr; or. From
those: gg is excluded as non-applicable; go is excluded because there are only two
positions. Let us take a pair (h; f ), where h(
        <xref ref-type="bibr" rid="ref1">1</xref>
        )=ci, h(
        <xref ref-type="bibr" rid="ref2">2</xref>
        )=cj , then, depending on
the feedback, the corresponding boolean formulae are given in Figure 2. We can
compare the complexity of those feedbacks just by looking at their tree
representations created from the boolean translations via the tableau method. Those
Feedback Boolean translation
oo goal(
        <xref ref-type="bibr" rid="ref1">1</xref>
        ) 6= ci ^ goal(
        <xref ref-type="bibr" rid="ref2">2</xref>
        ) 6= cj ^ goal(
        <xref ref-type="bibr" rid="ref1">1</xref>
        ) = cj ^ goal(
        <xref ref-type="bibr" rid="ref2">2</xref>
        ) = ci
rr goal(
        <xref ref-type="bibr" rid="ref1">1</xref>
        ) 6= ci ^ goal(
        <xref ref-type="bibr" rid="ref2">2</xref>
        ) 6= cj ^ goal(
        <xref ref-type="bibr" rid="ref1">1</xref>
        ) 6= cj ^ goal(
        <xref ref-type="bibr" rid="ref2">2</xref>
        ) 6= ci
gr (goal(
        <xref ref-type="bibr" rid="ref1">1</xref>
        ) = ci ^ goal(
        <xref ref-type="bibr" rid="ref2">2</xref>
        ) 6= cj ) _ (goal(
        <xref ref-type="bibr" rid="ref2">2</xref>
        ) = cj ^ goal(
        <xref ref-type="bibr" rid="ref1">1</xref>
        ) 6= ci)
or (goal(
        <xref ref-type="bibr" rid="ref1">1</xref>
        ) 6= ci ^ goal(
        <xref ref-type="bibr" rid="ref2">2</xref>
        ) 6= cj ) ^
      </p>
      <p>
        ((goal(
        <xref ref-type="bibr" rid="ref1">1</xref>
        ) = cj ^ goal(
        <xref ref-type="bibr" rid="ref2">2</xref>
        ) 6= ci) _ (goal(
        <xref ref-type="bibr" rid="ref2">2</xref>
        ) = ci ^ goal(
        <xref ref-type="bibr" rid="ref1">1</xref>
        ) 6= cj ))
ci; cj
oo
ci; cj
      </p>
      <p>
        rr
goal(
        <xref ref-type="bibr" rid="ref1">1</xref>
        )6=ci goal(
        <xref ref-type="bibr" rid="ref1">1</xref>
        )6=ci
goal(
        <xref ref-type="bibr" rid="ref2">2</xref>
        )6=cj goal(
        <xref ref-type="bibr" rid="ref2">2</xref>
        )6=cj
goal(
        <xref ref-type="bibr" rid="ref1">1</xref>
        )=cj goal(
        <xref ref-type="bibr" rid="ref1">1</xref>
        )6=cj
goal(
        <xref ref-type="bibr" rid="ref2">2</xref>
        )=ci goal(
        <xref ref-type="bibr" rid="ref2">2</xref>
        )6=ci
ci; cj
gr
goal(
        <xref ref-type="bibr" rid="ref1">1</xref>
        )=ci goal(
        <xref ref-type="bibr" rid="ref2">2</xref>
        )=cj
goal(
        <xref ref-type="bibr" rid="ref2">2</xref>
        )6=cj goal(
        <xref ref-type="bibr" rid="ref1">1</xref>
        )6=ci
ci; cj
or
goal(
        <xref ref-type="bibr" rid="ref1">1</xref>
        )6=c1 goal(
        <xref ref-type="bibr" rid="ref2">2</xref>
        )6=c2
goal(
        <xref ref-type="bibr" rid="ref2">2</xref>
        )=c1 goal(
        <xref ref-type="bibr" rid="ref1">1</xref>
        )=c2
goal(
        <xref ref-type="bibr" rid="ref1">1</xref>
        )6=c2 goal(
        <xref ref-type="bibr" rid="ref2">2</xref>
        )6=c1
goal(
        <xref ref-type="bibr" rid="ref2">2</xref>
        )6=c2 goal(
        <xref ref-type="bibr" rid="ref2">2</xref>
        )6=c1
representations clearly indicate that the order of the tree-di culty for the four
possible feedbacks is: oo&lt;rr&lt;gr&lt;or. As the feedbacks oo; rr are conjunctions,
they do not require branching (the other two include disjunctions, and as such
demand reasoning by cases). Unlike rr, the feedback oo in fact gives the solution
immediately. Within the two remaining rules, gr requires less memory to store
the information in each branch.
      </p>
      <p>
        We will now brie y discuss the tableau method on an example. Let us
consider the following Deductive Mastermind game-item (Figure 3). The tree on
the left stands for the reasoning which corresponds to analyzing the conjectures
as given, from top to bottom. The rst branching gives the result of applying
the gr feedback. In the next level of the tree we apply the oo feedback to the
second conjecture. We must rst do so assuming the left branch of the rst
conjecture to be true. This leads to a contradiction|on this branch we get that
goal(
        <xref ref-type="bibr" rid="ref2">2</xref>
        )=c1 and goal(
        <xref ref-type="bibr" rid="ref2">2</xref>
        )6=c1. Then we move to the right branch of the rst
conjecture. This assumption leads to discovering the right assignment, goal(
        <xref ref-type="bibr" rid="ref1">1</xref>
        )=c2
and goal(
        <xref ref-type="bibr" rid="ref2">2</xref>
        )=c1, there is no contradiction on this branch. This tableau
procedure is required to build the whole tree for the game-item. That is not always
necessary. The right-most part of Figure 3 shows what happens if you chose to
start the analysis from the second conjecture. We rst apply the feedback oo
to the second conjecture. We immediately get: goal(
        <xref ref-type="bibr" rid="ref1">1</xref>
        )=c2 goal(
        <xref ref-type="bibr" rid="ref2">2</xref>
        )=c1. The full
goal(
        <xref ref-type="bibr" rid="ref1">1</xref>
        )=c1
goal(
        <xref ref-type="bibr" rid="ref2">2</xref>
        )6=c1
Bt(h2; f2)
      </p>
      <p>
        oo
goal(
        <xref ref-type="bibr" rid="ref1">1</xref>
        )=c2
goal(
        <xref ref-type="bibr" rid="ref2">2</xref>
        )=c1
goal(
        <xref ref-type="bibr" rid="ref2">2</xref>
        )=c1
goal(
        <xref ref-type="bibr" rid="ref1">1</xref>
        )6=c1
Bt(h2; f2)
      </p>
      <p>
        oo
goal(
        <xref ref-type="bibr" rid="ref1">1</xref>
        )=c2
goal(
        <xref ref-type="bibr" rid="ref2">2</xref>
        )=c1
      </p>
      <p>Bt(h2; f2)
Bt(h1; f1)</p>
      <p>
        oo
goal(
        <xref ref-type="bibr" rid="ref1">1</xref>
        )=c2
goal(
        <xref ref-type="bibr" rid="ref2">2</xref>
        )=c1
Bt(h1; f1)
unique assignment with no contradiction. We can stop the computation at this
point|if the other conjecture contradicted this assignment, then it would mean
that the two conjectures must be inconsistent and hence not satis able. This
contradicts the setting of our game.
      </p>
      <p>
        The tree might not always give us the complete valuation explicitly. In some
game-items it is required to use a ower that did not appear in the conjectures.
This is so in the example in Figure 4; the right-most branch does not give a
contradiction, it does assign color 1 to the second position of the goal conjecture.
In such case we draw the remaining color, c3 (a tulip in the picture), as the
missing value of the rst position of the goal conjecture, i.e., goal(
        <xref ref-type="bibr" rid="ref1">1</xref>
        )=c3.
Bt(h1; f1)
Bt(h2; f2)
      </p>
      <p>
        gr
goal(
        <xref ref-type="bibr" rid="ref1">1</xref>
        )=c1
goal(
        <xref ref-type="bibr" rid="ref2">2</xref>
        )6=c1
Bt(h2; f2)
gr
goal(
        <xref ref-type="bibr" rid="ref2">2</xref>
        )=c1
goal(
        <xref ref-type="bibr" rid="ref1">1</xref>
        )6=c1
Bt(h2; f2)
      </p>
      <p>
        gr
goal(
        <xref ref-type="bibr" rid="ref1">1</xref>
        )=c2 goal(
        <xref ref-type="bibr" rid="ref2">2</xref>
        )=c1
goal(
        <xref ref-type="bibr" rid="ref2">2</xref>
        )6=c1 goal(
        <xref ref-type="bibr" rid="ref1">1</xref>
        )6=c2
goal(
        <xref ref-type="bibr" rid="ref1">1</xref>
        )=c2 goal(
        <xref ref-type="bibr" rid="ref2">2</xref>
        )=c1
goal(
        <xref ref-type="bibr" rid="ref2">2</xref>
        )6=c1 goal(
        <xref ref-type="bibr" rid="ref1">1</xref>
        )6=c2
      </p>
      <sec id="sec-6-1">
        <title>Hypotheses and Preliminary Results</title>
        <p>
          Normatively speaking, the full tree generated by the tableau method for the
set of formulae corresponding to a Deductive Mastermind game-item gives its
ideal reasoning scheme and thus the size of the tree can be thought of as an
abstract complexity measure. Obviously, the shape and the size of the tree for
each Deductive Mastermind item depends to some extent on the order in which
the formulae are analyzed (see Figure 3). Hence, using the tableau method it is
even possible to analyze whether and in what way the students apply reasoning
strategies, i.e., how they manipulate with the elements of the task in order to
optimize the size of the reasoning tree (i.e., the length of the computation). In
this way, items' logical di culty can be expressed via the size of their minimal
trees. The empirical data, resulting from children playing the Deductive
Mastermind game in Math Garden, includes item ratings. In the rst analysis of
this data we aimed at relating the item ratings to the parameters of the trees.
To this end we de ne a computational method based on the tableau method.
The computational method makes two assumptions. First, the formulae are not
processed from top to bottom, but instead the order depends on the length of
the rule that is associated with the feedback. That is, feedback is processed in
the following order: oo, rr, gr, or. Ties are solved by processing the top formula
rst. Second, the computational method is stopped once a consistent solution is
found, assuming that there exists at least one solution. Based on these principles
and the tableau method we programmed a recursive algorithm for calculating
the type and number of steps until solution for each item is reached. We de ne
4 measures of theoretically derived item di culties; the required number of oo,
rr, gr, and rr steps, which together might predict item di culty. Below we will
describe the empirically derived item ratings of the 2-pin items and we will show
how these relate to the theoretically derived measures of item di culties.
Method Participants were 28,247 students from grades 1-6, of age: 6-12 years.
Together, they played 2,187,354 items between November 2010 and January
2012. From the total of 321 items in the Master Mind game, 100 items have
two pins. From these 100 items, 10 items involved 2 colors, 30 items involved 3
colors, 30 items involved 4 colors and 30 items involved 5 colors.
Results To test the relation between empirically derived item ratings and
theoretically derived measures of item di culty we did a regression analysis that
includes the basic features of the items (number of colors and number of guesses)
and the required number of oo, rr, gr, and or analysis steps as predicted by
the tableau-based computational algorithm (F (6; 93)=33, p &lt; :0001, R2=:66).
All these factors but one (i.e., number of gr evaluations) were signi cant in
predicting item di culties: number of colours ( =1:07, p=:02), number of
hypotheses ( =1:75, p&lt;:01), number of oo feedbacks ( = 5:1, p&lt;:001), number
of rr feedbacks ( = 3:19; p&lt;:0001), number of gr evaluations (ns), number of
or evaluations ( =1:6, p&lt;:0001). Note that among the rules, only the required
number of or steps increases item di culty. A second aspect of the observed
item ratings to explain is the shape of its distribution. The distribution of the
item di culties shows a remarkable property for the 2-pin items (see Appendix,
Figure 6). The distribution is bimodal, meaning that there is a cluster of more
di cult items (item ratings &gt;0) and a cluster of easier items (item ratings &lt;0).
In the cluster of easy items, there are items with 2, 3, 4, and 5 colors. The cluster
of di cult items consists of items with 3, 4 and 5 colors. The two clusters could
be expressed fully in terms of model-relevant features. It appeared that items
are easy in the following two cases: (
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) no or feedback and no gr feedback; (
          <xref ref-type="bibr" rid="ref2">2</xref>
          )
no or feedback, at least one gr feedback, and all colors are included in at least
one of the conjectures. Items are di cult otherwise. This shows that or steps
make the item relatively di cult, but only if or is required to solve the item. A
second aspect that makes an item di cult is the exclusion of one of the colors
in the solution from the hypotheses rows.
5
        </p>
      </sec>
      <sec id="sec-6-2">
        <title>Conclusions and Future Work</title>
        <p>In this paper we proposed a way to use a proof-theoretical concept to analyze
the cognitive di culty of logical reasoning. The rst hypotheses that we have
drawn from the model gave a reasonably good prediction of item-di culties,
but the t is not perfect. However, it must be noted that several non-logical
factors may play a role in the item di culty as well. For example, the motor
requirements to give the answer also introduce some variation in item di culty,
which depends not only on accuracy but also on speed. The minimal number of
clicks required to give an answer varies between items. We did not take these
aspects into account so far. In particular, the two di culty clusters observed in
the empirical study can be explained with the use of our method.</p>
        <p>Our further work can be split into two main parts. We will continue this
study on the theoretical level by analyzing various complexity measures that
can be obtained on the basis of the tableau system. We also plan to compare
the t with empirical data of the tableau-derived measures and other possible
logical formalizations of level di culty (e.g., a resolution-based model). On the
empirical side we rst would like to extend our analysis to game-items that
consist of conjectures of higher length. This will allow comparing di culty of tasks
of di erent size and measure the trade-o between the size and the structure
of the tasks. We will also study this model in a broader context of other Math
Garden games. This would allow comparing individual abilities within Deductive
Mastermind and the abilities within other games that have been correlated for
instance with the working memory load. Finally, it would be interesting to see
whether the children are really using the proposed di culty order of feedbacks in
nding an optimal tableau|this could be checked in an eye-tracking experiment
(see [23, 24]).</p>
      </sec>
      <sec id="sec-6-3">
        <title>Appendix</title>
        <p>The appendix contains three gures. The rst one illustrates the educational
and research context of the Math Garden system (Figure 5); the second shows
the distribution of the empirical di culty of all items in Deductive Mastermind
(Figure 6); and the third gives the frequency of players and game-items with
respect to the overall rating (Figure 7). We refer to those gures in the proper
text of the article.</p>
        <p>Student
game playing</p>
        <p>Investigators
data
Math Garden.com
instruction
new items,
tasks
report
instruction
methods</p>
        <p>Teacher</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Szymanik</surname>
          </string-name>
          , J.:
          <article-title>Quanti ers in TIME and SPACE. Computational Complexity of Generalized Quanti ers in Natural Language</article-title>
          .
          <source>PhD thesis</source>
          , Universiteit Van Amsterdam (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Gierasimczuk</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Szymanik</surname>
          </string-name>
          , J.:
          <article-title>Branching quanti cation v. two-way quanti cation</article-title>
          .
          <source>Journal of Semantics</source>
          <volume>26</volume>
          (
          <issue>4</issue>
          ) (
          <year>2009</year>
          )
          <volume>367</volume>
          {
          <fpage>392</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Szymanik</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zajenkowski</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Comprehension of simple quanti ers. Empirical evaluation of a computational model</article-title>
          .
          <source>Cognitive Science</source>
          <volume>34</volume>
          (
          <issue>3</issue>
          ) (
          <year>2010</year>
          )
          <volume>521</volume>
          {
          <fpage>532</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Zajenkowski</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Styla</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Szymanik</surname>
            ,
            <given-names>J.:</given-names>
          </string-name>
          <article-title>A computational approach to quanti ers as an explanation for some language impairments in schizophrenia</article-title>
          .
          <source>Journal of Communication Disorders</source>
          <volume>44</volume>
          (
          <issue>6</issue>
          ) (
          <year>2011</year>
          )
          <volume>595</volume>
          {
          <fpage>600</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>Van</given-names>
            <surname>Rooij</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            ,
            <surname>Kwisthout</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            ,
            <surname>Blokpoel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Szymanik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            ,
            <surname>Wareham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            ,
            <surname>Toni</surname>
          </string-name>
          ,
          <string-name>
            <surname>I.</surname>
          </string-name>
          :
          <article-title>Intentional communication: Computationally easy or di cult? Frontiers in Human Neuroscience 5 (</article-title>
          <year>2011</year>
          )
          <volume>1</volume>
          {
          <fpage>18</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Verbrugge</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mol</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Learning to apply theory of mind</article-title>
          .
          <source>Journal of Logic, Language and Information</source>
          <volume>17</volume>
          (
          <issue>4</issue>
          ) (
          <year>2008</year>
          )
          <volume>489</volume>
          {
          <fpage>511</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Knuth</surname>
            ,
            <given-names>D.E.</given-names>
          </string-name>
          :
          <article-title>The computer as master mind</article-title>
          .
          <source>Journal of Recreational Mathematics</source>
          <volume>9</volume>
          (
          <issue>1</issue>
          ) (
          <year>1977</year>
          ) 1{
          <fpage>6</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Irving</surname>
          </string-name>
          , R.W.:
          <article-title>Towards an optimum Mastermind strategy</article-title>
          .
          <source>Journal of Recreational Mathematics</source>
          <volume>11</volume>
          (
          <year>1978</year>
          -79)
          <fpage>81</fpage>
          {
          <fpage>87</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Koyama</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lai</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>An optimal Mastermind strategy</article-title>
          .
          <source>Journal of Recreational Mathematics</source>
          <volume>25</volume>
          (
          <year>1993</year>
          )
          <volume>251</volume>
          |
          <fpage>256</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Kooi</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Yet another Mastermind strategy</article-title>
          .
          <source>ICGA Journal</source>
          <volume>28</volume>
          (
          <issue>1</issue>
          ) (
          <year>2005</year>
          )
          <volume>13</volume>
          {
          <fpage>20</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Chvatal</surname>
          </string-name>
          , V.:
          <source>Mastermind. Combinatorica</source>
          <volume>325</volume>
          {
          <fpage>329</fpage>
          (
          <year>1983</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Greenwell</surname>
            ,
            <given-names>D.L.</given-names>
          </string-name>
          :
          <string-name>
            <surname>Mastermind</surname>
          </string-name>
          .
          <source>Journal of Recreational Mathematics</source>
          <volume>30</volume>
          (
          <year>1999</year>
          -2000)
          <volume>191</volume>
          {
          <fpage>192</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Stuckman</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , Zhang, G.:
          <article-title>Mastermind is NP-complete</article-title>
          .
          <source>INFOCOMP Journal of Computer Science</source>
          <volume>5</volume>
          (
          <year>2006</year>
          )
          <volume>25</volume>
          {
          <fpage>28</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Van Rooij</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>The tractable cognition thesis</article-title>
          .
          <source>Cognitive Science</source>
          <volume>32</volume>
          (
          <year>2008</year>
          )
          <volume>939</volume>
          {
          <fpage>984</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Szymanik</surname>
          </string-name>
          , J.:
          <article-title>Computational complexity of polyadic lifts of generalized quanti ers in natural language</article-title>
          .
          <source>Linguistics and Philosophy</source>
          <volume>33</volume>
          (
          <issue>3</issue>
          ) (
          <year>2010</year>
          )
          <volume>215</volume>
          {
          <fpage>250</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Ristad</surname>
            ,
            <given-names>E.S.:</given-names>
          </string-name>
          <article-title>The Language Complexity Game</article-title>
          . The MIT Press (
          <year>1993</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Van der Maas</surname>
            ,
            <given-names>H.L.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Klinkenberg</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Straatemeier</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Rekentuin.nl: combinatie Van oefenen en toetsen</article-title>
          .
          <source>Examens</source>
          (
          <year>2010</year>
          )
          <volume>10</volume>
          {
          <fpage>14</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Elo</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>The Rating of Chessplayers, Past and Present</article-title>
          .
          <source>Arco</source>
          (
          <year>1978</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Klinkenberg</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Straatemeier</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Van der Maas</surname>
            ,
            <given-names>H.L.J.:</given-names>
          </string-name>
          <article-title>Computer adaptive practice of maths ability using a new item response model for on the y ability and di culty estimation</article-title>
          .
          <source>Computers in Education</source>
          <volume>57</volume>
          (
          <year>2011</year>
          )
          <year>1813</year>
          {
          <fpage>1824</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Beth</surname>
            ,
            <given-names>E.W.</given-names>
          </string-name>
          :
          <article-title>Semantic entailment and formal derivability</article-title>
          . Mededelingen
          <string-name>
            <surname>Van de Koninklijke Nederlandse Akademie Van Wetenschappen</surname>
          </string-name>
          .
          <source>Afdeling Letterkunde</source>
          <volume>18</volume>
          (
          <issue>13</issue>
          ) (
          <year>1955</year>
          )
          <volume>309</volume>
          {
          <fpage>342</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Smullyan</surname>
          </string-name>
          , R.:
          <article-title>First-order logic</article-title>
          . Springer-Verlag, Berlin (
          <year>1968</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Van Benthem</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Semantic tableaus</article-title>
          .
          <source>Nieuw Archief voor Wiskunde</source>
          <volume>22</volume>
          (
          <year>1974</year>
          )
          <volume>44</volume>
          {
          <fpage>59</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Ghosh</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Meijering</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Verbrugge</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Logic meets cognition: Empirical reasoning in games</article-title>
          . In Boissier,
          <string-name>
            <given-names>O.</given-names>
            ,
            <surname>Fallah-Seghrouchni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.E.</given-names>
            ,
            <surname>Hassas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            ,
            <surname>Maudet</surname>
          </string-name>
          , N., eds.
          <source>: MALLOW, CUER Workshop Proceedings</source>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Ghosh</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Meijering</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>On combining cognitive and formal modeling: a case study involving strategic reasoning</article-title>
          . In Van Eijck,
          <string-name>
            <given-names>J.</given-names>
            ,
            <surname>Verbrugge</surname>
          </string-name>
          , R., eds.
          <source>: Proceedings of the Workshop on Reasoning About Other Minds: Logical and Cognitive Perspectives (RAOM-2011)</source>
          , Groningen, CEUR Workshop Proceedings (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>