Reasoning about Spatial Consistency

Marco Ragni (ragni@cs.uni-freiburg.de)
Department of Psychology, University of Giessen, Giessen, Germany
Department Foundations of AI, Technical Faculty, University of Freiburg, Freiburg, Germany

Tobias Sonntag (sonntag@hotmail.de)
University of Freiburg, Freiburg, Germany

P.N. Johnson-Laird (phil@princeton.edu)
Department of Psychology, Princeton University, NJ, USA
Department of Psychology, New York University, USA

Abstract

The consistency of spatial descriptions is relevant to tasks ranging from navigation to architecture. In contrast to studies of deduction, in which a conclusion is drawn from premises, there have been only a few investigations into how human reasoners decide whether or not a description is consistent. We report results corroborating the theory that reasoners usually make such judgments by relying on a single initial mental model of the description. As a result, the task is difficult if the initial model must be revised in favor of an alternative model of the assertions. In particular, the model construction process and the way in which information is integrated into a model can explain errors in evaluating problems as consistent. Implications for other theories of reasoning are discussed.

Keywords: Consistency; Spatial Relational Reasoning; Mental Model Theory

Introduction

Inconsistency in a set of beliefs or assertions is dangerous, and can have disastrous consequences. Its importance raises the question of how individuals assess consistency, that is, whether the assertions or beliefs can all hold at the same time. We have investigated this problem using descriptions of spatial layouts, which have everyday analogs in architecture, route finding, and design. Consider, for example, the following problem about a fruit and veg stall:

(1) The box of apples is left of the box of pears.
    The box of kiwis is right of the box of pears.
    The box of apples is right of the box of kiwis.

Can these three assertions all be true at the same time?

The answer to this question is “No”, as there is no arrangement of the three boxes in a line integrating all information in the description. We therefore refer to a set of assertions as consistent if it has at least one model satisfying all the assertions in the set. A general test for inconsistency based on formal logic works as follows: Choose any assertion from the set of assertions, and prove that its negation follows from the remaining assertions in the set. Only if no such proof exists is the set consistent. Hence, consistency depends on the failure of an exhaustive search for all possible proofs, a process that is computationally intractable. An alternative process could be based on the new paradigm of probabilistic logic. Adams (1998) formulated a notion of p-consistency, according to which a set of assertions is consistent if each assertion in the set can have a high probability. No psychologists, as far as we know, have endorsed this procedure. One difficulty is to specify how people determine the relative constraints on the probabilities of assertions in a set. So, we need an alternative account. In the next section, we therefore describe the mental model theory (the “model” theory for short) and we derive its predictions for assessments of consistency. We then report an experiment that tested these predictions. Finally, we discuss the implications of these results.

The mental model theory of consistency

Consider the following problem:

(2) The apple is to the left of the pear.
    The pear is to the left of the kiwi.
    The pear is to the left of the orange.
    The orange is to the left of the mango.
    The kiwi is to the left of the orange.

Can these five assertions all be true at the same time?

As you read these assertions, you can construct a model of the corresponding spatial arrangement:

    apple pear kiwi orange mango

You may have formed a visual image of the arrangement, or your representation may have been more abstract. It needs only to represent the spatial relations among the objects (Goodwin & Johnson-Laird, 2005; Knauff, 2013). You may have noticed that the orange can initially be located either to the left or to the right of the kiwi, but the final assertion resolves the indeterminacy. The example illustrates a temporary spatial indeterminacy (e.g., Johnson-Laird & Byrne, 1991): although the set of assertions yields a determinate arrangement, during their interpretation more than one arrangement is possible. Likewise, reasoners often initially construct a preferred mental model, and neglect other possible mental models (e.g., Ragni & Knauff, 2013).

This leads us to the first research question: Although all assertions form a determinate description, does the indeterminacy during the construction process have an influence on reasoning performance? If so, this would not only support a model-based approach, but also show that the model construction process is a relevant factor in deciding consistency. Preferred models are constructed incrementally, i.e., during the construction process each new piece of premise information is taken into account as it arrives. Such a model construction process saves working memory capacity, since each bit of information is immediately processed and integrated into the model (Johnson-Laird & Byrne, 1991). For the different construction principles, please refer to Table 1.

Table 1. Construction principles for the first four assertions leading to three possible models in the indeterminate problem (2). The fifth assertion eliminates all models but the right-most insertion model. The asterisk denotes the initial model.

Insertion principle         Model
Right-most insertion*       apple pear kiwi orange mango
Mix-Left/Right insertion    apple pear orange kiwi mango
Left-most insertion         apple pear orange mango kiwi

Consequently, participants may construct the following mental model for the first four assertions of Problem 2:

    apple pear kiwi orange mango
    ...

where the ellipsis denotes implicit models, in this case the other models that can be found in Table 1. The fifth assertion is consistent with these possibilities because it holds in the explicit mental model. The model that is present to the participants (as opposed to the implicit models, i.e., all other models in Table 1) therefore yields the response “yes” (all assertions are consistent). In contrast, consider the following problem:

(3) The apple is to the left of the pear.
    The kiwi is to the right of the pear.
    The orange is to the right of the pear.
    The orange is to the left of the mango.
    The mango is to the left of the kiwi.

Again, can these five assertions all be true at the same time? This set of five assertions is not consistent with the initial model (apple pear kiwi orange mango). The fifth assertion ‘The mango is to the left of the kiwi’ forces the reasoner to revise the recently constructed initial model to the model

    apple pear orange mango kiwi

as the fifth assertion conflicts with the previously built model. The fifth assertion (in problem 3) does not hold in the preferred or initially constructed model (built after assertions 1-4, cp. Table 1), but only in the model constructed according to the leftmost insertion principle. In this sense we would have a mismatch between the initial model (apple pear kiwi orange mango) and the fifth assertion (“The mango is to the left of the kiwi”). Of course, all five statements are consistent. Thus, the theory of mental models predicts that reasoners may have some difficulty in inferring that problems such as 3 are consistent, as the participants will have a conflict with the initial model they constructed after the first four assertions. The fifth assertion does not correspond with the initial model of the four assertions and so individuals should respond “no”, in contrast to Problem 2.

Human reasoners tend to evaluate a given set of assertions as inconsistent if it does not match the initially built mental model, which is constructed according to the right-most insertion principle (see Table 1), an alternative name for the first-free-fit principle (Ragni & Knauff, 2013). This initial mental model is the central explanation pattern in reasoning towards consistency. Even if a description is consistent, a failure to build this model will result in an erroneous answer.

If participants construct initial models, then the way information is integrated should have an influence as well. This effect has been investigated for deductive reasoning as the premise order effect. Knauff, Rauh, Schlieder and Strube (1998) conducted an experiment to test the empirical differences of continuous, semi-continuous, and discontinuous premise orders in spatial relational reasoning (following Ehrlich & Johnson-Laird, 1982). The continuous order and the semi-continuous order led to 60% correctness and the discontinuous order to only 50%. The premise order effect is explained by the effort needed to construct a mental representation of the premises. In continuous and semi-continuous descriptions, two successive premises share a common middle term. Since this is not the case in discontinuous premise orders, these premises are more difficult to process, and we will leave these problems out. Again, if participants integrate information successively, then reasoners struggle more when drawing valid conclusions from a set of assertions that cannot be successively integrated into one initial model. In the continuous premise order condition (cp. Table 2), each assertion (but the first) contains one newly introduced object.

Both aspects, the way and the kind of construction of the initial mental model, are the two main predictions of the mental model theory to explain human evaluation of consistency and will be investigated in the following.

The experiment

The participants’ task was to evaluate whether or not spatial descriptions were consistent. Half of the problems were determinate and half of them had a local indeterminacy that the fifth assertion resolved. One third of the indeterminate problems asked for the preferred model (no mental model revisions necessary), one third asked for the alternative models with a revision distance 1 from the preferred model, and one third asked for the alternative models with a revision distance 2 from the preferred model.

The indeterminate descriptions allowed for three possible arrangements of the objects. This indeterminacy appears, however, only in the first four assertions; after the fifth assertion, each problem description is determinate. The indeterminate problems differed in the revision distance of the initial model needed to integrate the last assertion: it can require zero, one, or two spatial operations from the initially built mental model (see Table 1). The third variable is the sequence, i.e., whether the first two assertions are continuous or semi-continuous (see Table 2). We manipulated this by exchanging the first two assertions. We counterbalanced the problems regarding the type of relation, i.e., half of the problems used horizontal relations like left and right and half of the problems vertical relations (above and under).

Table 2. Determinate (left column) and indeterminate (right column) one-dimensional problems.
We obtained a semi-continuous premise order by exchanging the first two premises, which implies a directional change during the postulated model building rather than the fleshing out of a sub-model. All of these problems were also available in an inconsistent and vertical version.

Premise number   Determinate problems            Indeterminate problems

Continuous premise order
1                The apple left of the pear      The apple left of the pear
2                The pear left of the kiwi       The pear left of the kiwi
3                The kiwi left of the orange     The pear left of the orange
4                The orange left of the mango    The orange left of the mango
5                The pear left of the mango      The pear left of the mango

Semi-continuous premise order
2                The pear left of the kiwi       The pear left of the kiwi
1                The apple left of the pear      The apple left of the pear
3                The kiwi left of the orange     The pear left of the orange
4                The orange left of the mango    The orange left of the mango
5                The pear left of the mango      The pear left of the mango

Participants

We tested 27 logically naive participants (15f/12m; mean age 27 years) online via Amazon’s Mechanical Turk, and paid them a nominal fee for their participation.

Materials

The problems consisted of five assertions stating spatial relations of five objects. Each of the first four successive assertions introduced a new object, which was randomly inserted from a list of either fruits (apple, peach, orange, etc.) or “breakfast items” (toast, bagel, biscuit, etc.). The forty problems differed in five independent variables with respect to consistency, determinacy of the description, premise order, distance, and type of relation. Half of the problems were consistent (e.g., Problem 2) and half of the problems inconsistent (e.g., Problem 1).

Design and Procedure

Our 40 problems differed in the three variables outlined above: determinate vs. indeterminate problems, consistent vs. inconsistent, and the sequence of the premises. In order to examine possible effects of revising initial models, we only considered the consistent indeterminate tasks and differentiated the type of conflict with the initial model. First, only for consistent tasks can we interpret a negative response as an indicator of difficulty. Second, for these problems a conflict with the initial model does not mean a conflict with consistency. In the condition ‘0’ (the rightmost insertion principle in Table 1), the representation of an initial model was possible up to the last introduced assertion. In the 1-step-revised-model condition (the mix-left/right-insertion principle in Table 1), a revision of the model only required the revision of two objects of the initial model. In the 2-step-revised-model condition (the leftmost-insertion principle in Table 1), more operations were needed to revise the initial model and detect the problem’s consistency. Assuming the representation of an initial model, we expected increasing difficulty related to the number of operations necessary.

Each participant received all 40 problems and acted as their own control. Each problem and each assertion was presented self-paced. After the participants received the fifth assertion, which was either consistent or inconsistent with the four preceding assertions, they had to judge the consistency of each problem. Participants could verify (“y”) or reject (“n”) all the assertions by answering the question “Could all of these assertions be true at the same time?” The problems were presented in a randomized order. Reading and response times as well as the correctness of answers were recorded and analyzed as dependent variables.
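The notions used in this design (consistency as the existence of at least one model, the three models left open by the first four assertions, and the revision distance between models) can be made concrete with a small sketch. It is an illustration under stated assumptions, not the procedure used in the study and not a claim about how reasoners proceed: it enumerates arrangements by brute force, the function names `models`, `is_consistent`, and `revision_distance` are hypothetical, and counting adjacent swaps is only one plausible reading of "revision distance" that happens to reproduce the values 0, 1, and 2 for the three models in Table 1.

```python
from itertools import permutations

# Each assertion is a pair (a, b), read as "a is to the left of b".
# "x is right of y" is encoded as (y, x).
PROBLEM_1 = [("apple", "pear"), ("pear", "kiwi"), ("kiwi", "apple")]
PROBLEM_2 = [("apple", "pear"), ("pear", "kiwi"), ("pear", "orange"),
             ("orange", "mango"), ("kiwi", "orange")]
PROBLEM_3 = [("apple", "pear"), ("pear", "kiwi"), ("pear", "orange"),
             ("orange", "mango"), ("mango", "kiwi")]


def models(assertions):
    """Return every left-to-right arrangement that satisfies all assertions."""
    objects = sorted({name for pair in assertions for name in pair})
    found = []
    for order in permutations(objects):
        position = {name: i for i, name in enumerate(order)}
        if all(position[a] < position[b] for a, b in assertions):
            found.append(order)
    return found


def is_consistent(assertions):
    """A description is consistent if it has at least one model."""
    return len(models(assertions)) > 0


def revision_distance(initial, target):
    """Adjacent swaps needed to turn `initial` into `target`.

    Assumed measure: the number of pairs ordered differently in the two
    arrangements, which equals the minimal number of adjacent swaps.
    """
    rank = {name: i for i, name in enumerate(target)}
    seq = [rank[name] for name in initial]
    return sum(1
               for i in range(len(seq))
               for j in range(i + 1, len(seq))
               if seq[i] > seq[j])


if __name__ == "__main__":
    initial = ("apple", "pear", "kiwi", "orange", "mango")  # right-most insertion
    print("Problem 1 consistent:", is_consistent(PROBLEM_1))
    print("Problem 2 consistent:", is_consistent(PROBLEM_2))
    print("Problem 3 consistent:", is_consistent(PROBLEM_3))
    # Models left open by the first four assertions of Problem 2 (cp. Table 1).
    for model in models(PROBLEM_2[:4]):
        print(" ".join(model), "distance:", revision_distance(initial, model))
```

Run on the first four assertions of Problem 2, the sketch yields the three arrangements listed in Table 1, and the fifth assertion of Problem 2 or Problem 3 then leaves exactly one of them. The exhaustive enumeration is chosen only for brevity; it says nothing about the incremental insertion principles themselves.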
We also manipulated the determinacy of the description, so that half of the problems’ descriptions were determinate (see Table 2), i.e., they allowed for only one model after four presented assertions, and half of them were indeterminate, i.e., they allowed for three possible arrangements of the objects.

Results and Discussion

Six participants were excluded from the analysis, as their accuracy did not differ significantly from chance. The remaining 21 participants solved an average of 85% of all problems correctly. Table 3 presents the percentages of correct responses in each of the 4 main conditions.

Reasoners correctly identified as consistent determinate descriptions (90%) more often than descriptions that were locally indeterminate (72%, Wilcoxon test, z = 3.08, p < .01 [1-tailed]) and the same pattern holds for the reaction times (29.0 vs. 23.4, Wilcoxon test, z = 2.88, p < .01 [1-tailed], see Figure 1). The participants identified inconsistent problems (89%) in general significantly more often than consistent ones (82%, z = 2.12, p < .05 [2-tailed], r = -.46).

Table 3. Aspects of the experimental problems, correctness in percentage and response times in seconds. Half of the problems are determinate and in half of the problems the first four assertions were indeterminate and allowed for three models (initial model, 1-step revised initial model, and 2-step revised initial model).

Problems                          Correctness in %   Response times
Overall mean                      85                 26.8s
Determinate descriptions          90                 29.0s
Indeterminate descriptions        72                 23.4s
Models after 4th premise:
  Initial model                   93                 24.3s
  1-step revised initial model    80                 30.7s
  2-step revised initial model    49                 31.3s
Continuous assertions             85                 25.1s
Semi-continuous assertions        84                 28.4s

In the indeterminate condition the first four assertions allowed for three different models (cp. Table 1), while with the fifth assertion the set of assertions was again a determinate description and, hence, allowed for one model only. The more transformation steps the fifth assertion requires from the initial model, the lower were the correctness rates and the higher the response times. If no revision step is necessary (i.e., if the fifth assertion is consistent with the initial model), the correctness is 93%; if the fifth assertion requires one revision step of the initial model (cp. Table 1), accuracy is lower and response times are longer than in the initial model condition (80%, Wilcoxon test, z = 1.98, p < .01; response times: 24.3s vs. 30.7s, Wilcoxon test, z = 1.68, p < .05). The same pattern holds if two model revisions are necessary (correctness 49%, Wilcoxon test, z = 3.36, p < .001; response times: 24.3s vs. 31.3s, Wilcoxon test, z = 1.90, p < .05). The manipulation of the first two assertions’ order (the continuous order vs. semi-continuous order, cp. Table 2) did not affect mean correctness to a significant extent (85% vs. 84%, Wilcoxon test, z = .34, p = .735). However, participants needed significantly more time to generate answers in the semi-continuous case (when the first two assertions were exchanged; 25.1s vs. 28.4s, Wilcoxon test, z = 2.17, p < .05 [1-tailed], r = -.46). This delay can be traced back to longer reading times for the third assertion in the semi-continuous order in contrast to the continuous order. This again supports a continuous integration of information into a model during the reasoning process. In accordance with previous results (Ragni, Fangmeier & Schleipen, 2007), orientation of relations differed between the postulated vertical model building (87%) prompted by the relation above, and the supposed horizontal model building prompted by the horizontal relation left (83%, Wilcoxon test, z = 1.77, p < .05 [1-tailed]).

Figure 1. Correctness rates (left axis, black bar) and response times (right axis, grey bar) for consistent problems compared for the fifth assertion true in the initial-model condition (0), 1-step revised-model condition (1), and 2-step revised-model condition (2) with ascending transformation distance to the initial model implicated by the first four assertions.

General Discussion

In contrast to an evaluation in formal logic, the order of assertions has a major effect on human evaluations of consistency. This finding occurs even when a description is determinate and consistent, e.g.:

    The apple is to the left of the pear.
    The pear is to the left of the kiwi.
    The pear is to the left of the orange.
    The orange is to the left of the mango.
    The kiwi is to the left of the orange.

Can all five of these assertions be true at the same time?

Reasoners can construct a model of the first two assertions:

    apple pear kiwi

But how are they to interpret the third assertion? The orange could be to the right of the kiwi or it could be between the pear and the kiwi. The final assertion in the description resolves the indeterminacy, but nevertheless its local occurrence impedes the evaluation of the description as consistent. Indeed, if a subsequent assertion contradicts an initial model, then the chances increase that reasoners will err and evaluate the description as inconsistent.

Similar difficulties occur when the referents in a description are ordered discontinuously, e.g.:

    The apple is to the left of the pear.
    The orange is to the left of the mango.
    The pear is to the left of the orange, etc.

The second assertion cannot be integrated into the model of the first assertion until reasoners interpret the third assertion. This discontinuity contributes to the difficulty of evaluating consistency. But theories that do not postulate the construction of mental models have difficulty in explaining the phenomenon.

A byproduct of our investigation was the finding that human reasoners find it slightly easier to work in a vertical direction, e.g., A is above B, than in a horizontal direction, e.g., A is to the left of B (see Ragni, Fangmeier, & Schleipen, 2007, for similar results). Why the difference occurs remains an open question, but studies of spatial orientation have shown that individuals are less likely to confuse vertical relations than left-to-right relations (Sholl & Egeth, 1981).

The two main alternatives to the model theory, mental logic and probability logic, have not addressed reasoning about the consistency of spatial descriptions. The only general method for testing consistency in mental logic is to negate one assertion, and to try to prove that it follows from the remaining assertions (e.g., Rips, 1994). If it does, then the description is inconsistent; if it doesn’t, then the description is consistent provided that one has made an exhaustive search and the logic is complete. The case of consistency can lead to a potentially exponential blow-up of the applications of rules governing the transitivity of spatial relations. In contrast, an inconsistency can be discovered in a single proof that the negated assertion follows from the remaining assertions. The notion that naïve reasoners grasp these principles seems unlikely. Moreover, the account fails to explain the difference in difficulty between two consistent problems: problem 2 was evaluated correctly on 93% of trials, whereas problem 3 was evaluated correctly on only 49% of trials. The model theory predicts the difference because problem 2 is straightforward whereas problem 3 has a fifth assertion calling for reasoners to revise their model of the earlier assertions. Neither problem yields a proof that the negation of one assertion follows from the other assertions.

The difference in difficulty between problems in a vertical dimension and those in a horizontal dimension makes sense in the model theory: a confusion between left and right should be echoed in the construction of models. But it is inexplicable for theories based on formal rules: there is no reason why the transitivity of above should be easier to grasp than the transitivity of left. One candidate explanation is the concept of lexical marking, according to which marked items are harder to work with than unmarked items (see, e.g., Clark, 1969; Evans, Newstead, & Byrne, 1993). Both left and right are marked, whereas above is unmarked and below is marked. The difference might potentially account for our result. Taken together, the mental logic theory leaves open the questions about the reasoning performance differences in indeterminate cases. For deductive reasoning, Van der Henst (2002) proposed to extend the set of reasoning rules by rules for indeterminacy. This would, however, not help in our case, as all five assertions together are a determinate description of the problems and thus do not require any mental logic rules of indeterminacy.

Probabilistic approaches can explain deductions from conditional and quantified premises (e.g., Oaksford & Chater, 2001). But the evaluation of consistency challenges this approach (Kunze et al., 2011). As we have argued in the introduction, psychologists have not applied the notion of probabilistic consistency (Adams, 1998) to human reasoning. As in the case of theories based on logic, it is not obvious how it can explain our principal results. A further difficulty is to account for how people estimate the relative probabilities of spatial assertions.

In contrast to its alternatives, the model theory provides a simple explanation of the phenomena. If, and only if, individuals can build a model of a set of assertions, then they judge them to be consistent. An initial model may clash with a subsequent assertion. Reasoners may then search for an alternative model to accommodate the assertion. Even if they find one, the task is harder than when the initial model accommodates all the subsequent assertions. Likewise, the task will be harder when there is a discontinuity in the referents. Reasoners have to bear in mind two separate spatial relations, which they can integrate only in the light of a subsequent assertion. Once again, this factor adds to the difficulty of evaluating consistency.

In conclusion, reasoning about the consistency of descriptions is important in everyday life. The model theory provides an account of how naïve reasoners carry out this task, and our investigation has corroborated its main predictions.

Acknowledgements

The research was partially supported by the DFG with a Heisenberg grant (RA 1934/3-1) and a DFG research grant (RA 1934/2-1) in the SPP 1516 “New frameworks of Rationality” to the first author. The authors are grateful to Stephanie Schwenke for proof-reading and Eva-Maria Steinlein for comments and discussions.

References

Braine, M. D. S., & O'Brien, D. P. (1998). Mental logic. Mahwah, NJ: Erlbaum.
Byrne, R. M., & Johnson-Laird, P. N. (1989). Spatial reasoning. Journal of Memory & Language, 28, 564-575.
Clark, H. H. (1969). Linguistic processes in deductive reasoning. Psychological Review, 76, 387-404.
Ehrlich, K., & Johnson-Laird, P. N. (1982). Spatial descriptions and referential continuity. Journal of Verbal Learning and Verbal Behavior, 21, 296-306.
Evans, J. S. B. T., Newstead, S. E., & Byrne, R. M. J. (1993). Human reasoning: The psychology of deduction. Lawrence Erlbaum Associates.
Goodwin, G. P., & Johnson-Laird, P. N. (2005). Reasoning about relations. Psychological Review, 112, 468-493.
Johnson-Laird, P. N. (2006). How we reason. New York, NY: Oxford University Press.
Johnson-Laird, P. N., & Byrne, R. (1991). Deduction. Hillsdale, NJ: Erlbaum.
Knauff, M. (2013). Space to reason. Cambridge, MA: MIT Press.
Knauff, M., Rauh, R., Schlieder, C., & Strube, G. (1998). Continuity effect and figural bias in spatial relational inference. Proceedings of the Twentieth Annual Conference of the Cognitive Science Society, 573-578.
Kunze, N., Khemlani, S., Lotstein, M., & Johnson-Laird, P. (2010). Illusions of consistency in quantified assertions. In Proceedings of the 31st Cognitive Science Conference (573-578). Mahwah: Erlbaum.
Oaksford, M., & Chater, N. (2001). The probabilistic approach to human reasoning. Trends in Cognitive Sciences, 5, 349-357.
Ragni, M., Fangmeier, T., & Schleipen, S. (2007). What about negation in spatial reasoning? Proceedings of the 28th Annual Cognitive Science Conference, 1409-1414.
Ragni, M., Fangmeier, T., Webber, L., & Knauff, M. (2007). Preferred mental models: How and why they are so important in human reasoning with spatial relations. Spatial Cognition V: Reasoning, Action, Interaction, 175-190.
Ragni, M., & Knauff, M. (2013). A theory and a computational model of spatial reasoning with preferred mental models. Psychological Review, 120(3), 561-588.
Rips, L. J. (1994). The psychology of proof: Deductive reasoning in human thinking. Cambridge, MA: The MIT Press.
Sholl, M. J., & Egeth, H. E. (1981). Right-left confusion in the adult: A verbal labeling effect. Memory & Cognition, 9, 339-350.
Van der Henst, J. (2002). Mental model theory versus the inference rule approach in relational reasoning. Thinking & Reasoning, 8, 193-203.