=Paper=
{{Paper
|id=Vol-1283/paper14
|storemode=property
|title=Seeing is Believing: Formalising False-Belief Tasks in Dynamic Epistemic Logic
|pdfUrl=https://ceur-ws.org/Vol-1283/paper_14.pdf
|volume=Vol-1283
|dblpUrl=https://dblp.org/rec/conf/ecsi/Bolander14
}}
==Seeing is Believing: Formalising False-Belief Tasks in Dynamic Epistemic Logic==
Thomas Bolander, Technical University of Denmark

Abstract. In this paper we show how to formalise false-belief tasks like the Sally-Anne task and the second-order chocolate task in Dynamic Epistemic Logic (DEL). False-belief tasks are used to test the strength of the Theory of Mind (ToM) of humans, that is, a human's ability to attribute mental states to other agents. Having a ToM is known to be essential to human social intelligence, and hence likely to be essential to social intelligence of artificial agents as well. It is therefore important to find ways of implementing a ToM in artificial agents, and to show that such agents can then solve false-belief tasks. In this paper, the approach is to use DEL as a formal framework for representing ToM, and use reasoning in DEL to solve false-belief tasks. In addition to formalising several false-belief tasks in DEL, the paper introduces some extensions of DEL itself: edge-conditioned event models and observability propositions. These extensions are introduced to provide better formalisations of the false-belief tasks, but are expected to have independent future interest.

1 Introduction

Social intelligence is the ability to understand others and the social context effectively and thus to interact with other agents successfully. Research has suggested that Theory of Mind (ToM) may play an important role in explaining social intelligence. ToM is the ability to attribute mental states—beliefs, intentions, etc.—to oneself and others and to understand that others might have mental states that are different from one's own [23]. The strength of a human child's ToM is often tested with a false-belief task such as the Sally-Anne task [29].

Example 1 (The Sally-Anne task). The Sally-Anne task is illustrated in Figure 7 in the appendix. It is based on a story with two agents, Sally and Anne, that has the following 5 steps, corresponding to the 5 pictures in Figure 7:

0. Sally and Anne are in a room. Sally is holding a marble. There is a basket and a box in the room.
1. Sally puts the marble into the basket.
2. Sally leaves the room.
3. Anne transfers the marble to the box.
4. Sally comes back.

When used as a cognitive test for children, the child is told or shown the story in the figure. At the end, the child is asked "where does Sally believe the marble to be?" Passing the test means answering "in the basket", since Sally didn't see Anne transfer the marble from the basket to the box, and hence Sally has the false belief that it is still in the basket. If the child answers "in the box", where in fact the marble is, the child has failed the test. Children under the age of 4, as well as autistic children, are generally unable to pass the Sally-Anne test [29, 9].

To create AI agents with social intelligence, it seems relevant to consider the possibility of equipping such agents with a ToM, and to test them using false-belief tasks. The idea here is that for an AI agent, e.g. a robot, to be considered truly 'socially intelligent', it should at least be able to pass these false-belief tasks. Hence it becomes important to find ways of formalising ToM and false-belief tasks in a way that will allow computers to do the required reasoning. The goal of the present paper is to present one such possible formalisation, using the framework of Dynamic Epistemic Logic (DEL). We will now explain why DEL is a fairly natural choice here.
First of all, we need a formalism that can represent the beliefs of other agents, e.g. the beliefs of Sally, Sally's beliefs about Anne, etc. This naturally leads one to consider an epistemic logic (or, more precisely, a doxastic logic, but we will here still refer to it as epistemic). Basic epistemic logic is however only sufficient to model static states of affairs, like "at this particular instant, Sally believes the marble to be in the basket." In the false-belief tasks we also need to be able to model the dynamics: how the beliefs of the involved agents change as actions occur, e.g. when Anne secretly transfers the marble. This is where DEL comes into the picture: it has a natural way to deal with static states of beliefs (the epistemic models of DEL), a natural way to describe actions with epistemic and/or world-changing effects (the event models of DEL), and a simple way of calculating the result of executing an action in a state (the product update of DEL).

Below we will first, in Section 2, briefly present the qualities we aim for in our false-belief task formalisations. Next, in Section 3, we introduce the required parts of DEL, and then apply it to formalise the Sally-Anne task in Section 4. The formalisation turns out not to be entirely satisfactory, and hence we will, in Section 5, introduce an extension of DEL that gives more appropriate formalisations. The improved formalisations are in Section 6.

2 Robustness and faithfulness

Above we claim that DEL is a fairly natural choice for the formalisation of false-belief tasks. This of course doesn't imply that it is the only natural choice. Indeed, there are several existing formalisations of false-belief tasks in the literature, using different formal frameworks. Figure 8 of the appendix gives a brief overview of the full formalisations and implemented systems we know of. The Sally-Anne task is usually referred to as a first-order false-belief task since it only involves first-order belief attribution: the child has to attribute beliefs to Sally, but not, say, to Sally's beliefs about Anne's beliefs (which would be second-order belief attribution). Most of the existing formalisations can only deal with first-order or at most second-order false-belief tasks. We wish to be more general, and at the same time have formalisations that stay as close as possible to the informal versions of the tasks, and so propose the following two criteria:

Robustness. The formalism should not only be able to deal with one or two selected false-belief tasks, but with as many as possible, with no strict limit on the order of belief attribution.

Faithfulness. Each action of the false-belief story should correspond to an action in the formalism in a natural way, and it should be fairly straightforward, not requiring ingenuity, to find out what that action of the formalism is.

One can distinguish approaches to formalising false-belief tasks that seek to: 1) provide formal models of human reasoning; 2) provide the basis for a reasoning engine of autonomous agents. These two are of course not always disjoint aims, as discussed by Rineke Verbrugge [27]. In this paper, however, we are exclusively concerned with the second aim. We will hence not be concerned with whether our formalisation has any correspondence with the cognitive processes of humans solving false-belief tasks.

3 Dynamic epistemic logic

In this section we will introduce the required basics of dynamic epistemic logic (DEL).
The less technically inclined, or interested, reader can browse very quickly through the definitions and instead focus on the examples that illustrate the workings of the formalism in relation to the Sally-Anne task. Basic familiarity with epistemic logic, but not necessarily DEL, is expected. All definitions in this section are well-known and standard in DEL. The particular variant presented here is adopted from van Ditmarsch and Kooi [17].

Epistemic Models

Throughout this article, P is an infinite, countable set of atomic propositions (propositional symbols), and A is a non-empty finite set of agents. We will most often use lower case letters p, q, r, ... for atomic propositions and capital letters A, B, C, ... for agents. Variables ranging over agents will be denoted i, j, k, ... . The epistemic language L(P, A) is generated by the following BNF:

φ ::= p | ¬φ | φ ∧ φ | B_i φ

where p ∈ P and i ∈ A. The intended interpretation of a formula B_i φ is "agent i believes φ". The formula φ ∨ ψ is an abbreviation of ¬(¬φ ∧ ¬ψ), and we define ⊤ as an abbreviation for p ∨ ¬p and ⊥ as an abbreviation for p ∧ ¬p for some arbitrarily chosen p ∈ P. The semantics of L(P, A) is defined through epistemic models.

Fig. 1. Two states and an action: the state (M, w0) with a single world w0 : t (S, A-loop); the state (M′, w′0) with worlds w′0 : x (A-loop) and w′1 : t (S, A-loop) and an S-edge from w′0 to w′1; and the action (E, e0) with events e0 : ⟨⊤, ¬t ∧ x⟩ (A-loop) and e1 : ⟨⊤, ⊤⟩ (S, A-loop) and an S-edge from e0 to e1.

Definition 1 (Epistemic models and states). An epistemic model of L(P, A) is M = (W, R, V), where
– W, the domain, is a set of worlds;
– R : A → 2^(W×W) assigns an accessibility relation R(i) to each agent i ∈ A;
– V : P → 2^W assigns a set of worlds to each atomic proposition.
The relation R(i) is usually abbreviated R_i, and we write w R_i v when (w, v) ∈ R_i. For w ∈ W, the pair (M, w) is called a state of L(P, A), and w is referred to as the actual world.

The truth conditions (that is, the definition of (M, w) ⊨ φ for models M, worlds w and formulas φ ∈ L(P, A)) are standard and provided in Figure 9 of the appendix.

Example 2. We will now illustrate the notion of a state relative to the Sally-Anne task of Example 1. The example states are (M, w0) and (M′, w′0) of Figure 1. Here we have two atomic propositions, x and t, where x is intended to mean "the marble is in the box", and t means "the marble is in the basket". We use the agent symbols S and A for Sally and Anne, respectively. In (M, w0) and (M′, w′0), and states in general, each world is marked by its name followed by a list of the atomic propositions true at that world (which may be empty if none holds true). Sometimes we will drop names on worlds and just label them by the list of true propositions. Edges are labelled with the names of the relevant accessibility relations (agents). In the figures, the actual world is specially marked.

Consider (M, w0). The actual world is w0, that is, the marble is in the basket (t holds). The loop at w0 for S and A means that Sally and Anne consider the actual world w0 possible, and the absence of other edges means that they only consider w0 possible. Hence we have e.g. (M, w0) ⊨ B_S t ∧ B_A t ∧ B_S B_A t ∧ B_A B_S t: both Sally and Anne believe the marble to be in the basket, and they both believe each other to have this belief. The state (M, w0) corresponds to the situation before Anne has transferred the marble to the box.

Consider now (M′, w′0). This corresponds to the situation after Anne has transferred the marble in Sally's absence. The actual world now satisfies x. In the actual world, w′0, Anne only considers w′0 possible (signified by the loop labelled A at w′0): she knows the marble to be in the box. However, Sally doesn't have such a loop at w′0, rather she has an edge going to w′1 where t holds. This means that in the actual world Sally only considers it possible that the actual world is in fact w′1. Hence she has a false belief that the marble is in the basket (a false belief that t holds). Formally, (M′, w′0) ⊨ B_S t.
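To make the semantics concrete, the following minimal Python sketch (not part of the paper; the names EpistemicModel and holds are our own, and formulas are encoded as nested tuples) represents epistemic states as labelled graphs and implements the truth conditions of Figure 9, applied to the two Sally-Anne states of Figure 1.

```python
# Illustrative sketch only (assumed encoding, not from the paper).
# Formulas: ("top",), ("bottom",), ("atom", p), ("not", f), ("and", f, g), ("bel", i, f).

class EpistemicModel:
    def __init__(self, worlds, rel, val):
        self.worlds = worlds   # set of world names
        self.rel = rel         # agent -> set of (w, v) accessibility pairs
        self.val = val         # world -> set of atoms true in that world

def holds(model, w, formula):
    """Truth conditions of Figure 9: does (M, w) |= formula?"""
    tag = formula[0]
    if tag == "top":
        return True
    if tag == "bottom":
        return False
    if tag == "atom":
        return formula[1] in model.val[w]
    if tag == "not":
        return not holds(model, w, formula[1])
    if tag == "and":
        return holds(model, w, formula[1]) and holds(model, w, formula[2])
    if tag == "bel":
        i, f = formula[1], formula[2]
        return all(holds(model, v, f) for (u, v) in model.rel[i] if u == w)
    raise ValueError(f"unknown formula {formula!r}")

B = lambda i, f: ("bel", i, f)
t = ("atom", "t")

# (M, w0) of Figure 1: a single world where t holds, with an S, A-loop.
M = EpistemicModel(worlds={"w0"},
                   rel={"S": {("w0", "w0")}, "A": {("w0", "w0")}},
                   val={"w0": {"t"}})

# (M', w0') of Figure 1: the actual world satisfies x, but Sally only considers w1' (t) possible.
Mp = EpistemicModel(worlds={"w0p", "w1p"},
                    rel={"S": {("w0p", "w1p"), ("w1p", "w1p")},
                         "A": {("w0p", "w0p"), ("w1p", "w1p")}},
                    val={"w0p": {"x"}, "w1p": {"t"}})

assert holds(M, "w0", B("S", t)) and holds(M, "w0", B("A", B("S", t)))
assert holds(Mp, "w0p", B("S", t)) and not holds(Mp, "w0p", t)   # Sally's false belief
```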
We have now seen how we can use states to model the beliefs of Sally and Anne before and after the marble is moved. But we also need a way to model the act of moving the marble. This is done using DEL event models, presented next.

Event Models

DEL introduces the concept of event model (or action model) for modeling the changes to states brought about by the execution of actions [6, 5]. We here use a variant that includes postconditions [16, 10, 11], which means that actions can have both epistemic effects (changing the beliefs of agents) and ontic effects (changing the physical facts of the world).

Definition 2 (Event models and actions). An event model of L(P, A) is E = (E, Q, pre, post), where
– E, the domain, is a finite non-empty set of events;
– Q : A → 2^(E×E) assigns an accessibility relation Q(i) to each agent i ∈ A;
– pre : E → L(P, A) assigns to each event a precondition, which can be any formula in L(P, A);
– post : E → L(P, A) assigns to each event a postcondition. Postconditions are conjunctions of propositional literals, i.e., conjunctions of atomic propositions and their negations (including ⊤ and ⊥).
The relation Q(i) is generally abbreviated Q_i. For e ∈ E, (E, e) is called an action of L(P, A), and e is referred to as the actual event.

Example 3. Consider the action (E, e0) of Figure 1. Labeling events by the pair ⟨φ1, φ2⟩ means that the event has precondition φ1 and postcondition φ2. Hence the actual event, e0, corresponds to the action of making t false and x true, that is, it is the act of transferring the marble from the basket to the box. The event e1 has trivial pre- and postconditions, meaning that it is a 'skip' action representing that nothing happens. Looking at the edges of the action, we see that Anne only considers it possible that the marble is transferred (the loop at e0), whereas Sally only considers it possible that nothing happens (she only has an edge from the actual event to the 'skip' event e1). Hence the model encodes an action where the marble is actually transferred from the basket to the box, Anne is aware of this, but Sally thinks that nothing happens. It hence encodes step 3 of the Sally-Anne task, cf. Example 1.

Product Update

Assume given a state (M, w0) and an action (E, e0). The product update yields a new state (M, w0) ⊗ (E, e0) representing the situation after the action (E, e0) has been executed in the state (M, w0).

Definition 3 (Product update). Let (M, w0) be a state and (E, e0) an action, where M = (W, R, V) and E = (E, Q, pre, post), and where (M, w0) ⊨ pre(e0). The product update of (M, w0) with (E, e0) is defined as the state (M, w0) ⊗ (E, e0) = ((W′, R′, V′), (w0, e0)), where
– W′ = {(w, e) ∈ W × E | (M, w) ⊨ pre(e)}
– R′_i = {((w, e), (v, f)) ∈ W′ × W′ | w R_i v and e Q_i f}
– (w, e) ∈ V′(p) iff post(e) ⊨ p or ((M, w) ⊨ p and post(e) ⊭ ¬p).
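Continuing the same illustrative Python sketch (reusing EpistemicModel, holds, B and t from the previous listing; the class and function names are ours, and postconditions are simplified to explicit add/delete sets of atoms rather than conjunctions of literals), Definition 3 might be implemented and tested on the action (E, e0) of Figure 1 as follows.

```python
# Sketch of event models with postconditions and the product update of Definition 3.

class EventModel:
    def __init__(self, events, rel, pre, post):
        self.events = events   # set of event names
        self.rel = rel         # agent -> set of (e, f) accessibility pairs
        self.pre = pre         # event -> precondition formula
        self.post = post       # event -> (atoms made true, atoms made false)

TOP = ("top",)

def product_update(model, w0, action, e0):
    """(M, w0) ⊗ (E, e0): worlds are the pairs (w, e) with (M, w) |= pre(e)."""
    assert holds(model, w0, action.pre[e0])
    worlds = {(w, e) for w in model.worlds for e in action.events
              if holds(model, w, action.pre[e])}
    rel = {i: {((w, e), (v, f)) for (w, e) in worlds for (v, f) in worlds
               if (w, v) in model.rel[i] and (e, f) in action.rel[i]}
           for i in model.rel}
    val = {}
    for (w, e) in worlds:
        add, delete = action.post[e]
        val[(w, e)] = (model.val[w] - delete) | add
    return EpistemicModel(worlds, rel, val), (w0, e0)

# The action (E, e0) of Figure 1: e0 makes t false and x true, e1 is 'skip';
# Anne observes the transfer (loop at e0), Sally thinks nothing happens (edge to e1).
E = EventModel(events={"e0", "e1"},
               rel={"S": {("e0", "e1"), ("e1", "e1")},
                    "A": {("e0", "e0"), ("e1", "e1")}},
               pre={"e0": TOP, "e1": TOP},
               post={"e0": ({"x"}, {"t"}), "e1": (set(), set())})

x = ("atom", "x")
M2, w2 = product_update(M, "w0", E, "e0")
assert holds(M2, w2, x) and holds(M2, w2, B("S", t))   # isomorphic to (M', w0') of Figure 1
```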
Example 4. Referring again to Figure 1, we can calculate the product update (M, w0) ⊗ (E, e0). Intuitively, the calculation works like this. For each event in E, we first find the worlds in M that satisfy the precondition of the event. Each such matching world-event pair will become a world in the resulting model. Since both e0 and e1 have the trivial precondition ⊤, both have their precondition satisfied in the world w0. This gives us two matching world-event pairs (w0, e0) and (w0, e1) that will become the worlds of the new model. Now we have to use the postconditions of the events in order to figure out what the labels of these new worlds will be. In (w0, e0) we have paired w0 with e0. This means that we should take the existing label of w0 and then update it according to the postcondition of e0. The label of w0 is t and the postcondition of e0 is ¬t ∧ x. The postcondition ¬t ∧ x will force t to become false and x to become true, so the label of (w0, e0) will be x. The label of (w0, e1) is the same as that of w0, since e1 has the trivial postcondition ⊤. So the updated model (M, w0) ⊗ (E, e0) will have the two worlds (w0, e0) : x and (w0, e1) : t. Now we only need to find the edges connecting these two worlds. There will be an A-loop at (w0, e0), since there is both an A-loop at w0 in M and an A-loop at e0 in E. Similarly there will be an {S, A}-loop at (w0, e1). Finally, we need to check the edges between (w0, e0) and (w0, e1). Since there is an S-loop at w0 and an S-edge from e0 to e1, we get an S-edge from (w0, e0) to (w0, e1). In total, the product update (M, w0) ⊗ (E, e0) becomes the two-world model with (w0, e0) : x (A-loop), (w0, e1) : t (S, A-loop) and an S-edge from (w0, e0) to (w0, e1). Note that the resulting model is isomorphic to (M′, w′0) of Figure 1. Since (M, w0) represents the situation before Anne transfers the marble, and (M′, w′0) represents the situation afterwards (cf. Example 2), (E, e0) correctly captures the action of transferring the marble in Sally's absence.

4 Formalising the Sally-Anne task in DEL

We now have all the necessary ingredients for our first formalisation of the Sally-Anne task. Consider again the 5 steps of the Sally-Anne story presented in Example 1. The first step, step 0, describes the initial state, whereas the rest, 1–4, describes a sequence of actions. We will now show how to represent step 0 as a state and steps 1–4 as actions. We use the same symbols as in the previous examples, except we add a new atomic proposition p meaning "Sally is present in the room with Anne".

Fig. 2. The states and actions in the DEL formalisation of Sally-Anne: the initial state s1 is a single world labelled p with an S, A-loop; a1 is a single event ⟨⊤, t⟩ with an S, A-loop; a2 is a single event ⟨⊤, ¬p⟩ with an S, A-loop; a3 consists of the events ⟨⊤, ¬t ∧ x⟩ (A-loop) and ⟨⊤, ⊤⟩ (S, A-loop) with an S-edge from the former to the latter; a4 is a single event ⟨⊤, p⟩ with an S, A-loop.

Fig. 3. The DEL-formalisation of the Sally-Anne task: the states s1, ..., s5 obtained by successively executing a1, ..., a4 in s1.

The following 5-step list, corresponding to the list of Example 1, shows the relevant states and actions:

0. Sally is in the room, holding the marble: state s1 of Figure 2.
1. Sally puts the marble into the basket: action a1 of Figure 2.
2. Sally leaves the room: action a2 of Figure 2.
3. Anne transfers the marble to the box: action a3 of Figure 2.
4. Sally re-enters: action a4 of Figure 2.

Figure 3 calculates the result of executing the action sequence a1, ..., a4 in s1, that is, s_{i+1} = s_i ⊗ a_i for all i = 1, ..., 4, and hence s5 = s1 ⊗ a1 ⊗ ··· ⊗ a4. The first two actions, a1 and a2, are very simple. As seen from Figure 3, executing a1 in the initial state s1 simply adds the proposition t to the actual world (in s2), signifying that now the marble is in the basket.
Executing a2 in the resulting state s2 amounts to deleting p from the actual world: in s3 Sally is no longer present in the room. The action a3, the most complex one, has already been discussed in Example 3, and in Example 4 we carefully checked that s4 = s3 ⊗ a3. The final action, a4, simply adds p to every world of the model, corresponding to the fact that Sally returns to the room, and this is observed by both agents.

What is important is now of course to check what holds in s5, the model resulting from executing a1, ..., a4 in s1. From Figure 3 we can see that s5 ⊨ ¬t ∧ B_S t, that is, Sally mistakenly believes the marble to be in the basket. Assume an agent presented with steps 0–4 of the original informal story is able to formalise the steps as s1, a1, ..., a4, and is afterwards asked "where does Sally believe the marble to be". Then that agent can first calculate the final state s5 = s1 ⊗ a1 ⊗ ··· ⊗ a4 and conclude that s5 ⊨ B_S t holds. From this the agent can answer "in the basket", hence passing the Sally-Anne test!

5 Extending the DEL formalism

So far so good, or at least it seems that way. But a closer look shows that there are two problems with the DEL-formalisation that need to be addressed. The first is: where do the event models come from? How is an agent supposed to get from the informal steps of the story to the formalisations s1, a1, ..., a4? It seems to require ingenuity to come up with the right event models to formalise the informal action descriptions, in particular action a3. Hence the proposed solution doesn't yet really satisfy the faithfulness criterion presented in Section 2.

The second problem with the formalisation can be illustrated by considering a shortened version of the Sally-Anne task where Sally doesn't leave the room, that is, it only includes steps 0, 1 and 3 of Example 1. These steps ought to have the same formalisations as before, that is, s1, a1 and a3, respectively. Hence the situation after the shortened Sally-Anne story should correspond to s1 ⊗ a1 ⊗ a3. However, consulting Figure 3 it can be checked that s1 ⊗ a1 ⊗ a3 = s5 (since a2 only makes p false, and a4 makes it true again). Hence, an agent presented with the shortened Sally-Anne story would conclude that s1 ⊗ a1 ⊗ a3 ⊨ B_S t, implying that Sally ends up believing the marble to be in the basket. This is clearly not correct, since in this version she never left the room!

In the following we will propose an improved formalisation that solves both of these problems. We start out by analysing the source of the second problem, which is in the formalisation of a3 (see Figure 2). As explained in Example 3, a3 "encodes an action where the marble is actually transferred from the basket to the box, Anne is aware of this, but Sally thinks that nothing happens". All this is clearly not part of step 3 of the story, which simply states "Anne transfers the marble to the box". The problem with a3 is that who observes the action taking place is hardcoded into the event model. In most real-life cases, who observes an action depends on the state in which the action is applied. This is also the case in the Sally-Anne story: whether Sally observes the marble being moved depends on whether she is in the room. So the edges of the event model for action a3 ought to depend on whether Sally is present, that is, whether p holds or not. This leads us to a more general type of event model like a^edge_3 of Figure 4.
Fig. 4. Two generalised variants of the action a3 in Sally-Anne: a^edge_3 and a^obs_3, each with events e0 : ⟨⊤, ¬t ∧ x⟩ and e1 : ⟨⊤, ⊤⟩. In a^edge_3 the loop at e0 is labelled A:⊤ and S:p, the edge from e0 to e1 is labelled S:¬p, and the loop at e1 is labelled A:⊤ and S:⊤. In a^obs_3 the conditions p and ¬p are replaced by S^A and ¬S^A.

Here the label A:⊤ (A gets ⊤) at the loop of e0 means that A unconditionally has an edge here: Anne unconditionally observes the event e0. The other label, S:p, at the loop of e0 means that there is an edge here for agent S if p is true: Sally observes the event e0 if she is present in the room. Similarly, the label S:¬p on the edge from e0 to e1 means that if Sally is not in the room (¬p) then she thinks that nothing (e1) happens. This is a new type of event model, called an edge-conditioned event model, to be defined formally in the next subsection.

With edge-conditioned event models we can solve the second problem mentioned above. We now have an event model that will behave correctly both if applied in a state where Sally is present (p holds) and in a state where Sally is not present (p doesn't hold). If a^edge_3 is applied in a state where p holds, from e0 Sally will only consider e0 possible (have a loop at e0), but if p doesn't hold, from e0 she will only consider e1 possible (have an edge from e0 to e1). Hence, if p holds she observes the event e0, otherwise she doesn't. Using edge-conditioned event models also brings us a step closer to satisfying the faithfulness criterion. In almost all existing false-belief tasks, all ontic actions have the same structure as a^edge_3, and we can hence define a generic event model for all such ontic actions (which we will do later in Section 5). However, it is still not quite satisfactory to use ad hoc symbols like p to state that a certain agent is present. This leads us to our next new idea. In addition to the propositional symbols P, we add to the language a new set of propositional symbols i^j (i sees j) for each pair of agents i ≠ j. The intended meaning of i^j is that agent i observes the actions of agent j. Using such symbols we can replace the event model a^edge_3 by a^obs_3, see Figure 4. The meaning of the label S:S^A at the loop of e0 is that agent S observes the event e0 if S currently sees A (S^A is the case). We will now define these new technical constructs formally, and afterwards apply them to give an improved formalisation of the Sally-Anne task.

Edge-conditioned event models

Definition 4 (Edge-conditioned event models). An edge-conditioned event model of L(P, A) is E = (E, Q, pre, post), where E, pre and post are defined as for standard event models (Definition 2), and Q : A → (E × E → L(P, A)) assigns to each agent i a mapping Q(i) from pairs of events into formulas of L(P, A). The mapping Q(i) is generally abbreviated Q_i. For e ∈ E, (E, e) is called an edge-conditioned action of L(P, A).

For standard event models (Definition 2), e Q_i f means that event f is accessible from event e by agent i, and we draw an edge from e to f labelled i in the graph of the event model. In edge-conditioned event models, accessibility has become conditioned by a formula: Q_i(e, f) = φ means that f is accessible from e by i under condition φ. When Q_i(e, f) = φ, we draw an edge from e to f labelled i:φ in the graph of the event model (except when Q_i(e, f) = ⊥). We already saw an example of such an edge-conditioned event model: a^edge_3 of Figure 4.
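Under the same illustrative encoding, an edge-conditioned event model can be represented by letting each edge carry a condition formula (Definition 4); edges not listed implicitly have condition ⊥. Here is a rough, non-authoritative rendering of a^edge_3 from Figure 4 (the class name EdgeConditionedEventModel and the dictionary layout are our own choices; TOP, x and t come from the earlier listings).

```python
# Sketch of edge-conditioned event models: accessibility conditioned by formulas.

class EdgeConditionedEventModel:
    def __init__(self, events, cond, pre, post):
        self.events = events   # set of event names
        self.cond = cond       # agent -> dict mapping (e, f) to a condition formula
        self.pre = pre         # event -> precondition formula
        self.post = post       # event -> (atoms made true, atoms made false)

p = ("atom", "p")

# a3^edge of Figure 4: Anne unconditionally observes e0; Sally observes e0 if she is
# present (p), and otherwise thinks that nothing (the skip event e1) happens.
a3_edge = EdgeConditionedEventModel(
    events={"e0", "e1"},
    cond={"A": {("e0", "e0"): TOP, ("e1", "e1"): TOP},
          "S": {("e0", "e0"): p, ("e0", "e1"): ("not", p), ("e1", "e1"): TOP}},
    pre={"e0": TOP, "e1": TOP},
    post={"e0": ({"x"}, {"t"}), "e1": (set(), set())})
```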
Note that edge-conditioned event models naturally generalise standard event models: any standard event model E = (E, Q, pre, post) can be equivalently represented as an edge-conditioned event model E′ = (E, Q′, pre, post) by simply letting Q′_i(e, f) = ⊤ for all (e, f) ∈ Q_i and Q′_i(e, f) = ⊥ otherwise. We also have to generalise the notion of product update:

Definition 5 (Edge-conditioned product update). Let a state (M, w0) and an edge-conditioned action (E, e0) be given, where M = (W, R, V) and E = (E, Q, pre, post), and where (M, w0) ⊨ pre(e0). The product update of (M, w0) with (E, e0) is defined as the state (M, w0) ⊗ (E, e0) = ((W′, R′, V′), (w0, e0)), where W′ and V′ are defined as in the standard product update (Definition 3) and
R′_i = {((w, e), (v, f)) ∈ W′ × W′ | w R_i v and (M, w) ⊨ Q_i(e, f)}.

The only difference to the standard product update is that the R′_i relations have become parametrised by the Q_i(e, f) formulas. There is an i-edge from a world-event pair (w, e) to a world-event pair (v, f) iff there is an i-edge from w to v in the epistemic model, and the condition Q_i(e, f) for having an edge from e to f in the event model is true in w. It can be shown that any edge-conditioned event model induces a standard event model in a canonical way, but the induced standard event model might be exponentially bigger. In technical terms, it can be shown that edge-conditioned event models are exponentially more succinct than standard event models (we will prove this and other interesting properties of edge-conditioned event models in a future paper). In particular, our generic event models for ontic actions and observability change (to be presented in Section 5) are going to consist of 2 events each, whereas the same actions using only standard event models would contain 2^(n−1) + 1 events, where n is the number of agents!

Observability propositions

We now define a new language L_obs(P, A) extending L(P, A) by the addition of observability propositions of the form i^j:

φ ::= p | i^j | ¬φ | φ ∧ φ | B_i φ,

where p ∈ P, i, j ∈ A and i ≠ j. As noted above, the intended meaning of i^j is that "agent i observes all actions performed by agent j". The reason we do not include the reflexive propositions i^i is that we will assume that all agents always observe their own actions, so i^i is implicitly always true. This assumption can of course be relaxed, but we will not consider that here. In the expression i^j we call i the observer and j the observed. Given a formula φ, we use π1(φ) to denote the set of agents occurring as observers in φ, that is, π1(φ) = {i | i^j is a subformula of φ for some j}. For instance we have π1(i^j ∧ ¬k^l) = {i, k} (note that k is in the set even though the formula k^l occurs negated).

The idea of introducing observability propositions in the context of DEL was first introduced in [15]. They, however, only use a simpler type of proposition h_i with the intended meaning "agent i observes all actions" (agent i is in a state of paying attention to everything that happens). Here we need something more fine-grained, in particular for our later formalisation of the chocolate task (Section 6), where we need to be able to represent that an agent i is observing the actions of an agent j without j observing the actions of i.
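Definition 5 changes only how the edges of the updated model are computed: an i-edge between world-event pairs survives exactly when the edge condition holds in the source world. A sketch in the same illustrative encoding, together with observability atoms i^j (encoded as ordinary atoms ('sees', i, j)) and the observer set π1, might look as follows; ec_product_update, sees and observers are our own names. The final assertions check the a^edge_3 behaviour and the π1 example from the text.

```python
# Sketch of the edge-conditioned product update (Definition 5) plus observability atoms.

def ec_product_update(model, w0, action, e0):
    """(M, w0) ⊗ (E, e0) for edge-conditioned actions: keep the i-edge from (w, e) to
    (v, f) only if (M, w) |= Q_i(e, f); missing edges count as condition ⊥."""
    assert holds(model, w0, action.pre[e0])
    worlds = {(w, e) for w in model.worlds for e in action.events
              if holds(model, w, action.pre[e])}
    rel = {i: {((w, e), (v, f)) for (w, e) in worlds for (v, f) in worlds
               if (w, v) in model.rel[i]
               and holds(model, w, action.cond.get(i, {}).get((e, f), ("bottom",)))}
           for i in model.rel}
    val = {}
    for (w, e) in worlds:
        add, delete = action.post[e]
        val[(w, e)] = (model.val[w] - delete) | add
    return EpistemicModel(worlds, rel, val), (w0, e0)

def sees(i, j):
    """The observability proposition i^j, encoded as an ordinary atom."""
    return ("atom", ("sees", i, j))

def observers(formula):
    """π1(φ): the set of agents occurring as observers in φ (even under negation)."""
    if formula[0] == "atom":
        a = formula[1]
        return {a[1]} if isinstance(a, tuple) and a[0] == "sees" else set()
    result = set()
    for part in formula[1:]:
        if isinstance(part, tuple):   # sub-formulas; agent names are plain strings
            result |= observers(part)
    return result

assert observers(("and", sees("i", "j"), ("not", sees("k", "l")))) == {"i", "k"}

# a3^edge behaves differently depending on whether Sally is present (p) or absent:
present = EpistemicModel({"w"}, {"S": {("w", "w")}, "A": {("w", "w")}}, {"w": {"t", "p"}})
absent = EpistemicModel({"w"}, {"S": {("w", "w")}, "A": {("w", "w")}}, {"w": {"t"}})
s_p, w_p = ec_product_update(present, "w", a3_edge, "e0")
s_a, w_a = ec_product_update(absent, "w", a3_edge, "e0")
assert holds(s_p, w_p, B("S", x)) and holds(s_a, w_a, B("S", t))
```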
Ontic actions and observability change

The previous definitions of edge-conditioned event models and product update extend to the language L_obs(P, A) in the obvious way (after all, we only added some additional atomic propositions). We can now finally define two generic types of edge-conditioned actions that are sufficient to formalise a number of different false-belief tasks of varying order. The first action type is an ontic action do(i, φ): agent i makes φ true. Step 1 of the Sally-Anne task is for instance going to be formalised by do(S, t): Sally makes t true. The second is an observability-changing action oc(φ) for changing who observes whom. For instance step 2 of the Sally-Anne task, where Sally leaves the room, is going to be formalised by oc(¬S^A ∧ ¬A^S): Sally stops observing Anne (¬S^A), and Anne stops observing Sally (¬A^S).

Definition 6. We define the following edge-conditioned actions on L_obs(P, A).
– do(i, φ): for each agent i and each conjunction of propositional literals φ, this is the ontic action shown at the top of Figure 5.
– oc(φ): for each conjunction of observability literals φ (observability propositions and their negations), this is the observability-changing action shown at the bottom of Figure 5.

Fig. 5. The edge-conditioned actions do(i, φ) and oc(φ). Both consist of the actual event e0 : ⟨⊤, φ⟩ and the 'skip' event e1 : ⟨⊤, ⊤⟩, and in both the loop at e1 is labelled j:⊤ for all j ∈ A. In do(i, φ), the loop at e0 is labelled i:⊤ and j:j^i for each j ∈ A∖{i}, and the edge from e0 to e1 is labelled j:¬j^i for each j ∈ A∖{i}. In oc(φ), the loop at e0 is labelled j:⊤ for each j ∈ π1(φ) and j:⋁_{k∈π1(φ)} j^k for each j ∉ π1(φ), and the edge from e0 to e1 is labelled j:⋀_{k∈π1(φ)} ¬j^k for each j ∉ π1(φ).

These new actions need a little explanation. Consider first do(i, φ). As mentioned, this is an action where agent i makes φ true (since the actual event e0 has postcondition φ). From the labels at the loop of e0 we can see that the agents who observe the action taking place, and hence come to believe φ, are: 1) agent i itself (since we have i:⊤); 2) any other agent who is observing agent i (since we have j:j^i for each j ∈ A∖{i}). The agents who are not observing i will think that nothing happens (the label j:¬j^i on the edge to e1). This also explains the title of the paper, "Seeing is believing": if agent j sees agent i, j^i, then j comes to believe any formula φ that i brings about.

The action oc(φ) is a bit more complicated, but follows the same principle (note that the two event models only differ in their edge labels). Looking at the labels of the loop at e0, we can see that the agents observing the observability change are: 1) any agent whose observer status is affected by the action (since we have j:⊤ for each j ∈ π1(φ)); 2) any other agent who is observing at least one of the aforementioned agents (since we have j:⋁_{k∈π1(φ)} j^k for each j ∉ π1(φ)). This means that if i is currently observing j, and j either starts or stops to observe k, then i will also observe this change. One could imagine intricate situations where this wouldn't hold, but for our purposes it is sufficient. The event model oc(φ) is a generalisation of the event models +S and −S introduced in [15].

A few final remarks before turning to present our improved formalisations of false-belief tasks. In standard DEL there is no explicit notion of agency, that is, an action simply happens without any need to say who did it. But in our do action we need to include the agent performing it as a parameter, since what will be observed by the other agents depends on it. To save space, we have chosen not to include an announcement action in our framework, even though this would be very simple: simply take the event model for do(i, φ) and put φ in the precondition instead of the postcondition of e0.
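To illustrate Definition 6 and Figure 5, here is how the two generic actions might be constructed in the running Python sketch (make_do, make_oc, conj and disj are our own names; as before, a postcondition is passed as add/delete sets rather than as a conjunction of literals, so the φ of do(i, φ) and oc(φ) is given in that simplified form).

```python
# Sketch of the generic edge-conditioned actions do(i, φ) and oc(φ) of Definition 6 / Figure 5.

def conj(formulas):
    """Finite conjunction (⊤ for the empty conjunction)."""
    out = TOP
    for f in formulas:
        out = ("and", out, f)
    return out

def disj(formulas):
    """Finite disjunction, via the abbreviation φ ∨ ψ = ¬(¬φ ∧ ¬ψ)."""
    return ("not", conj([("not", f) for f in formulas]))

def make_do(agents, i, add, delete):
    """do(i, φ): agent i makes φ true; any j ≠ i observes e0 exactly when j^i holds,
    and otherwise thinks that nothing (e1) happens."""
    cond = {j: {("e1", "e1"): TOP} for j in agents}
    cond[i][("e0", "e0")] = TOP
    for j in agents - {i}:
        cond[j][("e0", "e0")] = sees(j, i)
        cond[j][("e0", "e1")] = ("not", sees(j, i))
    return EdgeConditionedEventModel(
        events={"e0", "e1"}, cond=cond,
        pre={"e0": TOP, "e1": TOP},
        post={"e0": (add, delete), "e1": (set(), set())})

def make_oc(agents, add, delete):
    """oc(φ): change who sees whom; φ is a conjunction of observability literals,
    given as the sets of (observer, observed) pairs to add and to delete."""
    pi1 = {i for (i, j) in add | delete}                      # π1(φ)
    cond = {j: {("e1", "e1"): TOP} for j in agents}
    for j in agents:
        if j in pi1:
            cond[j][("e0", "e0")] = TOP
        else:
            watching = [sees(j, k) for k in pi1]
            cond[j][("e0", "e0")] = disj(watching)
            cond[j][("e0", "e1")] = conj([("not", f) for f in watching])
    return EdgeConditionedEventModel(
        events={"e0", "e1"}, cond=cond,
        pre={"e0": TOP, "e1": TOP},
        post={"e0": ({("sees", i, j) for (i, j) in add},
                     {("sees", i, j) for (i, j) in delete}),
              "e1": (set(), set())})
```

For instance, step 2 of the Sally-Anne task, oc(¬S^A ∧ ¬A^S), would correspond to make_oc({"S", "A"}, add=set(), delete={("S", "A"), ("A", "S")}) in this encoding.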
6 New formalisations of false-belief tasks

Example 5 (Formalising the Sally-Anne task). Given the generic actions from the previous section, it is now quite straightforward to provide a new formalisation of the Sally-Anne task using these actions:

0. Sally is in the room with Anne, holding the marble: state s1 consisting of a single world labelled S^A, A^S with an S, A-loop.
1. Sally puts the marble into the basket: a1 = do(S, t).
2. Sally leaves the room: a2 = oc(¬S^A ∧ ¬A^S).
3. Anne transfers the marble to the box: a3 = do(A, ¬t ∧ x).
4. Sally re-enters: a4 = oc(S^A ∧ A^S).

Note that we no longer use the atomic proposition p, as we now have a more generic way to deal with observability through our observability propositions. Similar to the previous formalisation in Section 4, it can now be checked that s1 ⊗ a1 ⊗ ··· ⊗ a4 ⊨ B_S t, hence again the formalisation gives the right answer to the Sally-Anne test. We should also note that now we have s1 ⊗ a1 ⊗ a3 ⊨ B_S x, so if Sally doesn't leave the room, she will not get a false belief. Thus we have successfully solved the problem of the shortened Sally-Anne task that was discussed in the beginning of Section 5. We will not show the detailed calculations, as we will save that for the next example, which formalises a more complex false-belief task.

Example 6 (Formalising the second-order chocolate task). We now consider a compact version of the second-order chocolate task presented in [18, 4]. It is illustrated in Figure 10 in the appendix. It has the following steps:

0. John and Mary are in a room. There is a chocolate bar in the room.
1. John puts the chocolate into the drawer.
2. John leaves the room.
3. John starts peeking into the room through the window, without Mary seeing.
4. Mary transfers the chocolate to the box.

The child taking the test is now asked "where does Mary believe that John believes the chocolate to be?" It is a second-order task since this question concerns second-order belief attribution (Mary's beliefs about John's beliefs). The correct answer is "in the drawer", since Mary is not aware that John was peeking while she moved the chocolate. It is immediate that steps 1 and 4 above are ontic actions, and steps 2 and 3 are observability-changing actions. Let us use atomic propositions d for "the chocolate is in the drawer" and x for "the chocolate is in the box". We use agent symbols J for John and M for Mary. Step 1, "John puts the chocolate into the drawer", must then be the ontic action do(J, d). Step 2, "John leaves the room", must be the observability change oc(¬J^M ∧ ¬M^J) (John stops observing Mary and Mary stops observing John). Step 3 is again an observability change, but this time it is simply oc(J^M): John starts observing Mary. Finally, step 4 is the ontic action do(M, ¬d ∧ x).

Figure 6 calculates the result of executing the action sequence of steps 1–4 in the initial state described by step 0. The actions in the figure show the applied instances of do(i, φ) and oc(φ) calculated from Figure 5. Some of the states and actions contain grey nodes. These are nodes that are not accessible from the initial world/event, and can hence be ignored (by bisimulation contraction, to be technically precise).

Fig. 6. The DEL-formalisation of the second-order chocolate task: the states s1, ..., s5 and the applied actions for the steps 1. John puts chocolate in drawer, 2. John leaves room, 3. John peeks through window, 4. Mary moves chocolate.

Before going into the detailed calculations of Figure 6, let us have a look at the resulting model s5. This is the model in which it should be checked where Mary believes John believes the chocolate to be. Clearly we have s5 ⊨ B_M B_J d, so the agent's answer will be "in the drawer", hence passing the false-belief test.
But s5 can do more than just answer this question, in fact it is a full description of the final situation, including all beliefs to arbitrary order. Concerning observability, we can for instance see that s5 ⊨ J^M ∧ B_M ¬J^M ∧ B_J B_M ¬J^M: John sees Mary, Mary believes he doesn't, and John knows this. We can also imagine a third-order version of the task, where the question is "where does John believe that Mary believes that John believes the chocolate to be", and by consulting s5 we immediately get the answer "in the drawer": s5 ⊨ B_J B_M B_J d.

The most interesting part of the calculation in Figure 6 is the last step, s5 = s4 ⊗ do(M, ¬d ∧ x), so we will explain this in more detail. Calculating the product s4 ⊗ do(M, ¬d ∧ x) follows the same strategy as in Example 4. First we find the matching world-event pairs, which, in this case, is all four world-event combinations (w0, e0), (w0, e1), (w1, e0) and (w1, e1), since both e0 and e1 have trivial preconditions (see Figure 6, where do(M, ¬d ∧ x) is the event model of step 4). In the world-event pairs containing e0, the postcondition of e0 is enforced, that is, d is made false and x true. The other world-event pairs simply inherit their label from the first element of the pair. Hence the four worlds of the resulting model s5 are (w0, e0) : J^M, x; (w0, e1) : J^M, d; (w1, e0) : x; (w1, e1) : d. Now for the interesting part, the edges. At (w0, e0) we get a J-loop, since there is a J-loop at w0 and the condition for having a J-loop at e0 is J^M, which is satisfied in w0. This should be contrasted with the situation at (w1, e0): here we also have a J-loop at the world of the pair, w1, but now the condition J^M for having a J-loop at the event of the pair is not satisfied in the world of the pair. At (w1, e0) we hence only get an M-loop (since both w1 and e0 unconditionally have such a loop). We leave the calculation of the rest of the edges to the (enthusiastic) reader.

Note that to get from s4 to s5 we only have to apply an instance of a generic edge-conditioned action with 2 events. This situation is much better than what can be achieved with standard event models. In Proposition 1 in the appendix we prove that there is no standard event model a with 2 events such that s5 = s4 ⊗ a. This implies that the smallest standard event model that can produce s5 from s4 is a 3-event model: an event ⟨⊤, ¬d ∧ x⟩ with a J-loop and an M-edge to a second event ⟨⊤, ¬d ∧ x⟩, which in turn has an M-loop and a J-edge to a 'skip' event ⟨⊤, ⊤⟩ with a J, M-loop. The problem with this event model is that it is already a 'second-order model' that fully encodes the structure of the model s5 we wish to obtain. Hence if we had to formalise the second-order chocolate task using standard event models, we would have to formalise the step "Mary moves the chocolate" as this event model that already fully encodes the final structure achieved at the end of the story. This would certainly be very far from achieving the faithfulness criterion introduced in Section 2. So indeed the edge-conditioned event models make a real difference to the formalisation of false-belief tasks.
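As a final sanity check of the sketch, the whole chocolate-task sequence can be run end to end with the helpers from the previous listings (again purely illustrative and not the paper's own implementation); the assertions correspond to the queries discussed for s5 above.

```python
# End-to-end run of the second-order chocolate task in the illustrative sketch.
agents = {"J", "M"}

# Step 0: John and Mary are in the room and see each other.
s = EpistemicModel(worlds={"w"},
                   rel={"J": {("w", "w")}, "M": {("w", "w")}},
                   val={"w": {("sees", "J", "M"), ("sees", "M", "J")}})
w = "w"

steps = [
    make_do(agents, "J", add={"d"}, delete=set()),                # 1. John puts chocolate in drawer
    make_oc(agents, add=set(), delete={("J", "M"), ("M", "J")}),  # 2. John leaves the room
    make_oc(agents, add={("J", "M")}, delete=set()),              # 3. John peeks, unseen by Mary
    make_do(agents, "M", add={"x"}, delete={"d"}),                # 4. Mary moves chocolate to the box
]
for a in steps:
    s, w = ec_product_update(s, w, a, "e0")

d = ("atom", "d")
assert holds(s, w, B("M", B("J", d)))            # Mary believes John believes it is in the drawer
assert holds(s, w, B("J", B("M", B("J", d))))    # the third-order query
assert holds(s, w, ("and", sees("J", "M"), B("M", ("not", sees("J", "M")))))
```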
7 Conclusion, related work and future work

In this paper we have shown how to formalise two false-belief tasks—a first- and a second-order one—in an extension of dynamic epistemic logic. In the end, we were able to express the formalisations rather compactly:

– Sally-Anne task: do(S, t), oc(¬S^A ∧ ¬A^S), do(A, ¬t ∧ x), oc(S^A ∧ A^S).
– Chocolate task: do(J, d), oc(¬J^M ∧ ¬M^J), oc(J^M), do(M, ¬d ∧ x).

We started out expressing two overall criteria for our formalisations of false-belief tasks: robustness and faithfulness. To be robust, the formalism should be able to formalise false-belief tasks of arbitrary order. We claim to have such robustness in our current formalism, but proving it formally is future work. Nevertheless, we have been able to show that we could go from a formalisation of a first-order false-belief task to a second-order one at no extra cost, which as discussed above is not the case in standard DEL (and not in most other frameworks either). To have faithfulness, we required that it should be relatively straightforward to get from the informal action descriptions of the false-belief task to the corresponding formalised actions. We believe we have taken a big step closer towards achieving this. If the (semi-)informal description says "agent i makes φ true" it is our action do(i, φ). If the informal description says, e.g., "now agent i starts observing j without agent j noticing" it is oc(i^j). The formalisation step can of course still not be fully automated, but we are much closer than if we just had to build all the relevant event models from scratch, which was where this paper started.

There is of course also a limit to the types of false-belief tasks that can be dealt with using only do and oc. In particular, a lot of the existing false-belief tasks involve untruthful announcements, such as the 'ice-cream task' [22], the 'birthday puppy task' [26] and the 'clown-in-the-park task' [28]. These cannot be dealt with in the current framework. To be able to deal with untruthful announcements and the revision of false beliefs, we need another type of model called plausibility models [7]. We plan to show how these models can be used to formalise the aforementioned false-belief tasks in a future paper.

In our approach, observability amounts to 'who sees who', that is, it is a relation between agents. Other approaches to modelling observability can be found in e.g. [14, 20, 8, 15]. In these approaches, observability is instead connected either to propositions [14, 20], particular actions [8] or all actions [15]. The paper [24] uses a similar approach to observability as we do, but in a more complex 2-dimensional dynamic epistemic logic. In the papers [14, 8], observability is encoded using axioms instead of being encoded into the states as we do. For us, it is very important to encode observability directly into the states to be able to deal with higher-order observability ('you don't see me seeing you'). Even though edge-conditioned event models are an original idea of this paper, they are close in spirit to the generalized arrow updates of [21]. However, arrow updates are rather an alternative to event models, whereas our edge-conditioned event models are a straightforward generalisation of event models. Furthermore, arrow updates are purely epistemic (without postconditions), and would hence not be able to represent the ontic actions of the false-belief tasks.
Solving false-belief tasks using DEL as we do in this paper is part of a larger research effort in epistemic planning: combining automated planning with DEL to integrate higher-order social cognition into intelligent planning agents [11, 1]. Combining the ideas of [11, 1] with the ideas of this paper will allow us to devise algorithms not only for analysing false beliefs (as is done in the false-belief tasks), but also for synthesising them. It could e.g. be that Anne plans to deceive Sally by asking her to go outside and then moving the marble meanwhile. This is a case of epistemic planning where the goal is to achieve a state where Sally does not know the location of the marble.

References

1. Andersen, M.B., Bolander, T., Jensen, M.H.: Conditional epistemic planning. Lecture Notes in Artificial Intelligence 7519, 94–106 (2012), proceedings of JELIA 2012
2. Arkoudas, K., Bringsjord, S.: Toward formalizing common-sense psychology: An analysis of the false-belief task. In: Ho, T.B., Zhou, Z.H. (eds.) PRICAI. Lecture Notes in Computer Science, vol. 5351, pp. 17–29. Springer (2008)
3. Arslan, B., Verbrugge, R., Taatgen, N., Hollebrandse, B.: Teaching children to attribute second-order false beliefs: A training study with feedback. Submitted.
4. Arslan, B., Taatgen, N., Verbrugge, R.: Modeling developmental transitions in reasoning about false beliefs of others. In: Proc. of the 12th International Conference on Cognitive Modelling (2013)
5. Baltag, A., Moss, L.: Logics for epistemic programs. Synthese 139(2), 165–224 (2004)
6. Baltag, A., Moss, L.S., Solecki, S.: The logic of public announcements and common knowledge and private suspicions. In: TARK. pp. 43–56 (1998)
7. Baltag, A., Smets, S.: A qualitative theory of dynamic interactive belief revision. In: Bonanno, G., van der Hoek, W., Wooldridge, M. (eds.) Logic and the Foundations of Game and Decision Theory (LOFT 7). Texts in Logic and Games, vol. 3, pp. 13–60. Amsterdam University Press (2008)
8. Baral, C., Gelfond, G., Son, T.C., Pontelli, E.: An action language for reasoning about beliefs in multi-agent domains. In: Proceedings of the 14th International Workshop on Non-Monotonic Reasoning (2012)
9. Baron-Cohen, S., Leslie, A.M., Frith, U.: Does the autistic child have a theory of mind? Cognition 21(1), 37–46 (1985)
10. van Benthem, J., van Eijck, J., Kooi, B.: Logics of communication and change. Information and Computation 204(11), 1620–1662 (2006)
11. Bolander, T., Andersen, M.B.: Epistemic planning for single- and multi-agent systems. Journal of Applied Non-Classical Logics 21, 9–34 (2011)
12. Braüner, T.: Hybrid-logical reasoning in false-belief tasks. In: Schipper, B. (ed.) Proceedings of the Fourteenth Conference on Theoretical Aspects of Rationality and Knowledge (TARK). pp. 186–195 (2013)
13. Breazeal, C., Gray, J., Berlin, M.: Mindreading as a foundational skill for socially intelligent robots. In: Robotics Research, pp. 383–394. Springer (2011)
14. Brenner, M., Nebel, B.: Continual planning and acting in dynamic multiagent environments. Autonomous Agents and Multi-Agent Systems 19(3), 297–331 (2009)
15. van Ditmarsch, H., Herzig, A., Lorini, E., Schwarzentruber, F.: Listen to me! Public announcements to agents that pay attention—or not. In: Logic, Rationality, and Interaction, pp. 96–109. Springer (2013)
16. van Ditmarsch, H., van der Hoek, W., Kooi, B.: Dynamic epistemic logic with assignment. In: Dignum, F., Dignum, V., Koenig, S., Kraus, S., Singh, M.P., Wooldridge, M. (eds.) Autonomous Agents and Multi-agent Systems (AAMAS 2005). pp. 141–148. ACM (2005)
17. van Ditmarsch, H., Kooi, B.: Semantic results for ontic and epistemic change. In: Bonanno, G., van der Hoek, W., Wooldridge, M. (eds.) Logic and the Foundations of Game and Decision Theory (LOFT 7). pp. 87–117. Texts in Logic and Games 3, Amsterdam University Press (2008)
18. Flobbe, L., Verbrugge, R., Hendriks, P., Krämer, I.: Children's application of theory of mind in reasoning and language. Journal of Logic, Language and Information 17(4), 417–442 (2008), special issue on formal models for real people, edited by M. Counihan
19. Frith, U.: Mind blindness and the brain in autism. Neuron 32(6), 969–979 (2001)
20. van der Hoek, W., Troquard, N., Wooldridge, M.: Knowledge and control. In: The 10th International Conference on Autonomous Agents and Multiagent Systems - Volume 2. pp. 719–726. International Foundation for Autonomous Agents and Multiagent Systems (2011)
21. Kooi, B., Renne, B.: Generalized arrow update logic. In: Proceedings of the 13th Conference on Theoretical Aspects of Rationality and Knowledge. pp. 205–211. ACM (2011)
22. Perner, J., Wimmer, H.: "John thinks that Mary thinks that...": Attribution of second-order beliefs by 5- to 10-year-old children. Journal of Experimental Child Psychology 39(3), 437–471 (1985)
23. Premack, D., Woodruff, G.: Does the chimpanzee have a theory of mind? Behavioral and Brain Sciences 1(4), 515–526 (1978)
24. Seligman, J., Liu, F., Girard, P.: Facebook and the epistemic logic of friendship. In: Schipper, B. (ed.) Proceedings of the Fourteenth Conference on Theoretical Aspects of Rationality and Knowledge (TARK). pp. 229–238 (2013)
25. Sindlar, M.P.: In the Eye of the Beholder: Explaining Behavior through Mental State Attribution. Ph.D. thesis, Universiteit Utrecht (2011)
26. Sullivan, K., Zaitchik, D., Tager-Flusberg, H.: Preschoolers can attribute second-order beliefs. Developmental Psychology 30(3), 395 (1994)
27. Verbrugge, R.: Logic and social cognition. Journal of Philosophical Logic 38(6), 649–680 (2009)
28. Wahl, S., Spada, H.: Children's reasoning about intentions, beliefs and behaviour. Cognitive Science Quarterly 1(1), 3–32 (2000)
29. Wimmer, H., Perner, J.: Beliefs about beliefs: Representation and constraining function of wrong beliefs in young children's understanding of deception. Cognition 13(1), 103–128 (1983)

Appendix

Proposition 1. Let (M, w0) and (M′, w′0) be the following two models: (M, w0) has the worlds w0 : J^M, d (with a J-loop and an M-edge to w1) and w1 : d (with a J, M-loop); (M′, w′0) has the worlds w′0 : J^M, x (with a J-loop and an M-edge to w′1), w′1 : x (with an M-loop and a J-edge to w′2) and w′2 : d (with a J, M-loop). There exists no standard event model E with only 2 events e0 and e1 such that (M, w0) ⊗ (E, e0) contains (M′, w′0) as a submodel.

Proof. Assume the opposite. Then since (M′, w′0) is a submodel of (M, w0) ⊗ (E, e0), all of the worlds w′0, w′1 and w′2 of (M′, w′0) must be among the pairs (w0, e0), (w0, e1), (w1, e0) and (w1, e1) of (M, w0) ⊗ (E, e0). First we can conclude that w′0 = (w0, e0), since the actual worlds and events have to match. Since w0 has the label J^M, d whereas w′0 has the label J^M, x, e0 must be an event with a postcondition including ¬d and x as conjuncts. Since w′2 doesn't satisfy x, w′2 can then not be of the form (·, e0) (that is, it is not (w, e0) for any w). Therefore w′2 must be of the form (·, e1). Since w′2 doesn't satisfy x, e1 must then be an event that doesn't have x as a (positively occurring) conjunct in its postcondition.
Since w′1 satisfies x but neither w0 nor w1 satisfies it, we can conclude that w′1 cannot be of the form (·, e1). Hence, we must have w′1 = (w0, e0) or w′1 = (w1, e0), but since w′0 = (w0, e0) and w′0 ≠ w′1 we can conclude w′1 = (w1, e0). We have now concluded w′0 = (w0, e0) and w′1 = (w1, e0). Since there is a J-loop at w′0 and w′0 = (w0, e0), e0 must then also have a J-loop (cf. the definition of product update). Similarly, since there is an M-loop at w′1 and w′1 = (w1, e0), e0 must also have an M-loop. We can conclude that e0 contains a J, M-loop. Now since w1 contains a J, M-loop, we get that w′1 = (w1, e0) must also contain a J, M-loop. But looking at the world w′1 of (M′, w′0) we see it only has an M-loop, and hence we have a contradiction, completing the proof.

The proof given is slightly intricate, but the intuition is rather clear: w′0 and w′1 must necessarily be updates with the same event (e0), but there is no way a standard such event can produce a J-loop at w′0 and an M-loop at w′1 without having both a J-loop and an M-loop itself. But if it has both, then when updating w1 with this event we get a J, M-loop at either w′0 or w′1. The reason that it works with edge-conditioned event models is of course that they allow us to let the edges of events depend on the world they are applied in, so in this case it is not a problem to have a single event e0 which produces one type of loop in one world and another type of loop in another.

Fig. 7. An illustration of the Sally-Anne false-belief task, adapted from [19].

Fig. 8. Existing full formalisations/implementations of false-belief tasks, ordered chronologically. The numbers in the 'h-o reas.' column refer to the highest level of belief attribution the formalism/system allows (∞ if there is no upper bound).

system/reference | year | formalism/platform | h-o reas. | other features
CRIBB [28] | 2000 | Prolog | 2 | goal recognition, plan recognition
Edd Hifeng [2] | 2008 | event calculus | 1 | Second Life avatar
Leonardo [13] | 2011 | C5 agent architecture | 1 | goal recognition, learning
[25] | 2011 | extension of PDL, implementation in 2APL | 1 | goal recognition
ACT-R agent [4] | 2013 | ACT-R cognitive architecture | 1 | learning
[12] | 2013 | hybrid logic | 1 | temporal reasoning

Fig. 9. Truth conditions for the epistemic language, where M = (W, R, V) is an epistemic model, i ∈ A, w ∈ W and φ, ψ ∈ L(P, A):
(M, w) ⊨ p iff w ∈ V(p)
(M, w) ⊨ ¬φ iff (M, w) ⊭ φ
(M, w) ⊨ φ ∧ ψ iff (M, w) ⊨ φ and (M, w) ⊨ ψ
(M, w) ⊨ B_i φ iff for all v ∈ W, if w R_i v then (M, v) ⊨ φ

Fig. 10. An illustration of the second-order chocolate task. Illustrations © Avik Kumar Maitra, with kind permission of the authors of [4, 3].