Dynamic Epistemic Logic with ASP Updates: Application to Conditional Planning

Pedro Cabalar (a), Jorge Fandinno (b) and Luis Fariñas del Cerro (c)

(a) University of Corunna, SPAIN
(b) University of Potsdam, GERMANY
(c) Institut de Recherche en Informatique de Toulouse, University of Toulouse, CNRS, FRANCE

EELP'20: Second Workshop on Epistemic Extensions of Logic Programming, September 19th, 2020, Rende, Italy
email: cabalar@udc.es (P. Cabalar); fandinno@uni-potsdam.de (J. Fandinno); farinas@irit.fr (L. Fariñas del Cerro)

Abstract
Dynamic Epistemic Logic (DEL) is a family of multimodal logics that has proved to be very successful for epistemic reasoning in planning tasks. In this logic, the agent's knowledge is captured by modal epistemic operators, whereas the system evolution is described in terms of (some subset of) dynamic logic modalities in which actions are usually represented as semantic objects called event models. In this paper, we study a variant of DEL, that we call DEL[ASP], where actions are syntactically described by using an Answer Set Programming (ASP) representation instead of event models. This representation directly inherits high-level expressive features like indirect effects, qualifications, state constraints, defaults, or recursive fluents that are common in ASP descriptions of action domains. Besides, we illustrate how this approach can be applied to obtain conditional plans in single-agent, partially observable domains where knowledge acquisition may be represented as indirect effects of actions.

Keywords
Answer Set Programming, Dynamic Epistemic Logic, Epistemic Logic Programs, Epistemic Specifications, Conditional Planning, Equilibrium Logic, Non-Monotonic Reasoning

Introduction

Automated planning is the field of Artificial Intelligence concerned with the generation of strategies to achieve a goal in a given dynamic domain. A planner usually starts from a formal representation of the domain, a particular instance of the problem and the goal to achieve. The planner output is some strategy, expressed in terms of actions that cause the state transitions needed to reach the goal. The most common situation is that such a strategy is just a sequence of actions called a plan. In Classical Planning [1] some simplifying restrictions are assumed: the system has a finite number of states, the world is fully observable, and the transition relation is deterministic and static (i.e. transitions are only caused by the execution of actions). However, a rational agent may easily face planning problems that require relaxing these assumptions. For instance, a robot may not possess all the information about the environment, either because its sensors have a limited scope, or because its actions may have non-deterministic effects that require observation for finding out the real outcome. Removing the assumptions of determinism and a fully observable world naturally leads to two important questions [2]: (i) what does a plan look like in this new context? and (ii) how to represent the changes in the agent's knowledge along the plan execution? Regarding (i), two new categories of plans have been defined in this context: conformant plans and conditional plans.
A conformant plan is a sequence of actions that guarantees achieving the goal regardless unknown values of the fluents in the initial situation or the precise effect of the non-deterministic actions. If we further allow sensing actions (acquiring knowledge from the environment) then the structure of a sequential plan is not reasonable any more: a conditional plan may contain “if-then-else” constructs that allow the agent to follow different strategies depending on the knowledge she acquired when executing the plan. Approaches to both conformant and conditional planning have been broadly studied in the literature [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27]. With respect to question (ii), several approaches studied the effects of sensing actions [4, 6, 18, 19, 21, 22, 28, 29, 30, 31]. One prominent line of research is based on Dynamic Epistemic Logic (DEL) [32, 33], a multi-modal approach where the agent’s knowledge is captured by modal epistemic operators whereas the system evolution is described in terms of (some subset of) dynamic logic [34] modalities. For instance, the DEL expression [watch](K rain ∨ K ∼rain) represents that, after any possible outcome of sensing action watch, the agent knows whether rain holds or not. Different variants of DEL have been successfully applied to the problem of planning with non-deterministic, partially observable multi-agent domains [18, 19, 21, 22, 35]. Although DEL has proved to be very convenient for epistemic reasoning in planning, it shows some important drawbacks when analysed from a Knowledge Representation (KR) viewpoint. This is because, for representing actions and their effects, DEL uses the so-called event models [36, 37], that inherit some of the expressive limitations of the STRIPS planning language [38]. In particular, event models do not allow some important KR features, like the treatment of indirect effects, action qualifications, state constraints or recursive fluents, that are quite common in modern representation of action theories. One popular KR formalism that naturally covers these expressive features is Answer Set Programming (ASP) [39, 40, 41], a well-established paradigm for problem solving and non- monotonic reasoning based on the stable models semantics [42, 43]. The use of ASP for classical planning was introduced in [44, 45], leading to a methodology adopted by different high-level action languages (see [46] and references there) and, more recently, to a temporal extension of ASP [47]. Besides, there exists a growing list of ASP applications [48], many of them dealing with classical planning problems. When moving to conformant planning, though, the application of ASP is still under an exploratory stage. Most of the attempts in this direction relied on a extension called epistemic specifications [49] that incorporate modal constructs (called subjective literals) for representing the agent’s knowledge. However, the semantic interpretation of this formalism is still under debate and only some preliminary implementations are still available – see [50] for a recent survey. On the other hand, the use of ASP to obtain conditional plans was still an unexplored territory. In this paper, we study the case of single-agent planning and combine both approaches, DEL and (epistemic) ASP, to exploit the advantages of both formalisms in a single language. Our proposal, called DEL[ASP], relies on replacing event models by epistemic logic programs. 
In that way, the basic event to describe the transition between two epistemic models becomes an ASP epistemic specification, while we keep the same dynamic logic operators for temporal reasoning among transitions. On the one hand, with respect to DEL, the new approach provides all the expressive power of ASP for action domains: indirect effects, qualifications, state constraints, defaults, or recursive fluents are directly inherited from ASP. Moreover, when a classical planning scenario (represented in ASP) becomes partially observable, the new approach allows keeping the scenario representation untouched, possibly adding new epistemic rules to describe the effects of sensing actions. On the other hand, with respect to (non-temporal) epistemic ASP, dynamic operators provide a comfortable way for explicitly representing, and formally reasoning about conformant and conditional plans. The rest of the paper is organised as follows. In the next section, we provide some background on the formalisms that conform our proposal. After these preliminaries, we introduce the formalism of DEL[ASP] and explain its behaviour using some examples. Then, we study the representation of conditional plans in this formalism. Finally, we discuss some related work and conclude the paper. Preliminaries In this section, we provide some background on planning in DEL, planning in ASP, and the ASP extension of epistemic specifications, since these three components will be present in DEL[ASP] up to some degree. In the case of DEL, we will present a slight generalisation of [19, 21] that admits abstract updating objects. These objects correspond to event models for standard DEL, which we denote here as DEL[E], and will become epistemic specifications for DEL[ASP]. For the case of epistemic logic programs, we will use a recent logical formalisation [51] that avoids the problem of self-supported conclusions present in the original semantics [49]. This logic, called Founded Autoepistemic Equilibrium Logic (FAEEL) is a combination of Pearce’s Equilibrium Logic [52], a well-known logical characterisation of stable models, with Moore’s Autoepistemic Logic (AEL) [53], one of the most representative approaches among modal non-monotonic logics. Dynamic Epistemic Logic with Abstract Updating Objects Given a set of propositional symbols P and a set of updating objects O, a (dynamic epistemic) formula ϕ is defined according to the following grammar: ϕ ::= ⊥ | p | ϕ ∧ ϕ | ϕ ∨ ϕ | ϕ → ϕ | ∼ϕ | K ϕ | [o]ϕ where p ∈ P is a proposition and o ∈ O an updating object. The modal epistemic operator K represents the (planning) agent’s knowledge: formula K ϕ means that “the agent knows ϕ.” The symbol “∼” stands here for classical negation (we reserve the symbol “¬” for intuitionistic negation later on). A formula ϕ is called objective if the operator K does not occur in it. It is called subjective if it has at least some proposition and every proposition is in the scope of K. As usual, we define the following abbreviations: ϕ ↔ ψ def = (ϕ → ψ) ∧ (ψ → ϕ), (ϕ ← ψ) = (ψ → ϕ), and > = ∼⊥. We also define the dual of K as follows: K̂ ϕ def def def = ∼K ∼ϕ. We keep the Boolean operators ∨, ∧, →, ⊥ and avoid defining ones in terms of the others, since this will not be valid when we use an intuitionistic reading later on. By LE (P) we denote the language containing all dynamic epistemic formulas over P. 
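To make this syntax concrete, the following is a minimal illustrative sketch (ours, not part of the paper's formalisation) of the grammar above as a Python abstract syntax tree. The constructor names (Atom, K, Box, khat, iff) are our own choices, and the updating object carried by Box is left abstract, exactly as in the grammar.

```python
# Illustrative AST for dynamic epistemic formulas (assumed names, not from the paper).
from collections import namedtuple

Bot     = namedtuple("Bot", [])                 # ⊥
Atom    = namedtuple("Atom", ["name"])          # p ∈ P
And     = namedtuple("And", ["left", "right"])
Or      = namedtuple("Or", ["left", "right"])
Implies = namedtuple("Implies", ["left", "right"])
Neg     = namedtuple("Neg", ["arg"])            # classical negation ∼
K       = namedtuple("K", ["arg"])              # "the agent knows ..."
Box     = namedtuple("Box", ["obj", "arg"])     # [o]ϕ for an updating object o

def iff(a, b):                                  # ϕ ↔ ψ := (ϕ → ψ) ∧ (ψ → ϕ)
    return And(Implies(a, b), Implies(b, a))

def khat(phi):                                  # K̂ ϕ := ∼K ∼ϕ, the dual of K
    return Neg(K(Neg(phi)))

TOP = Neg(Bot())                                # ⊤ := ∼⊥

# The introduction's example [watch](K rain ∨ K ∼rain): after any outcome of the
# sensing action watch, the agent knows whether rain holds.
watch = Box("watch", Or(K(Atom("rain")), K(Neg(Atom("rain")))))
```

Nothing hinges on this particular encoding; it is only meant to make concrete the role of updating objects inside the dynamic modality [o]ϕ before the semantics is introduced.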
We provide next an abstract semantics that just relies on two basic concepts: epistemic models that represent the agent’s knowledge; and the updating evaluation, a pair of generic functions that describe how updating objects cause transitions among those models. Definition 1 (Epistemic Model). Given a (possibly infinite) set of propositional symbols P and a (possibly infinite) set of possible worlds W, a model is a triple M = hW, K, V i where – W ⊆ W is a finite set of worlds, – K ⊆ W × W is an accessibility relation on W , and – V : W −→ 2P is a valuation function. D(M) = W denotes the domain of M. An epistemic model is a model where K is an equivalence relation: it is further called information cell if K is an universal relation. A belief model is a model where K = W × W 0 with W 0 = W \ {w0 } for some w0 ∈ W . By M we denote the set of all possible epistemic models over P and W. We assume that the modeller coincides with the planning agent (the one whose knowledge is captured by the epistemic models). This is usually called an internal point of view, as opposed to the external one where the modeller is a different agent, an omniscient and external observer who can differentiate the actual world and knows its configuration [54, 55]. Adopting the internal orientation translates in the lack of a designated world (all worlds in the model are equally possible). A second consequence is that, even for single-agent epistemic models, we cannot replace the equivalence relation by a universal one. Before going into further technical details, let us introduce the following scenario from [21], which will be our running example throughout the paper. Example 1. After following carefully laid plans, a thief has almost made it to her target: the vault containing the invaluable Pink Panther diamond. Standing outside the vault, she now deliberates on how to get her hands on the diamond. She knows the light inside the vault is off, and that the Pink Panther is on either the right or the left pedestal inside. Obviously, the diamond cannot be on both the right and left pedestal, but nonetheless the agent may be uncertain about its location. Note that the thief is perfectly capable of moving in the darkness and take whatever is on top any of the pedestals, but she is not able to know whether the diamond has been taken or not. It is assumed that there are four possible actions: move, flick, take_left and take_right. The action move changes the location of the thief from outside the vault (∼v) to inside the vault (v) and vice-versa. The action flick turns on the light (l). Furthermore, if the thief is in the vault (v) and the light is on (l), the thief can see (s) where the Pink Panther is. Finally, actions take_left and take_right respectively take the diamond (d) from the left (∼r) or right (r) pedestal if the diamond is in the intended pedestal. The set of propositions for the example is P = {v, l, r, s, d}. Figure 1 depicts three consecutive epistemic models, M0 , M1 and M2 , respectively corresponding to the initial state of Example 1 and the resulting states after performing the sequence of actions move and then flick. For improving readability, we represent the world valuations as strings of propositions and underline those that are false. 
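Before walking through these models world by world, here is a small Python sketch (ours, purely illustrative) of Definition 1 as a data structure, instantiated on the initial model M0 of Example 1. The class name EpistemicModel and the helper is_information_cell are our own; the world and atom names follow the example.

```python
# Illustrative encoding of Definition 1 and of the initial model M0 (Figure 1a).
from itertools import product

class EpistemicModel:
    def __init__(self, worlds, access, val):
        self.worlds = set(worlds)        # W: finite set of worlds
        self.access = set(access)        # K: accessibility relation on W
        self.val = dict(val)             # V: world -> set of atoms true at that world

    def is_information_cell(self):
        # information cell: K is the universal relation on W
        return self.access == set(product(self.worlds, self.worlds))

# M0: thief outside the vault, light off, diamond position unknown.
# w1 makes r true (diamond on the right pedestal); w2 makes it false.
W0 = {"w1", "w2"}
M0 = EpistemicModel(
    worlds=W0,
    access=product(W0, W0),              # universal relation: one information cell
    val={"w1": {"r"}, "w2": set()},
)
print(M0.is_information_cell())          # True
```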
For instance, the valuation for w1 in Figure 1a is depicted M0 M1 w1 : vlrsd w2 : vlrsd w1 : vlrsd w2 : vlrsd (a) Initial epistemic model (b) Result of move on M0 M2 w1 : vlrsd w2 : vlrsd (c) Result of flick on M1 Figure 1: Sequence of epistemic models M0 , M1 , M2 that result from actions move and then flick starting in the initial state M0 of Example 1. as vlrsd and corresponds to the set of true atoms {r}, making false all the rest. The initial model, M0 represents the triple hW0 , K0 , V0 i where we have two worlds W0 = {w1 , w2 } universally connected, that is, K0 = W0 × W0 = {(w1 , w2 ), (w2 , w1 ), (w1 , w1 ), (w2 , w2 )} with valuations V0 (w1 ) = {r} and V0 (w2 ) = ∅. This means that the agent knows that the thief is outside the vault (v), the light is off (l), she cannot see where the Pink Panther is (s) and she does not have the diamond (d). The two connected worlds reveal that she does not know whether the Pink Panther is on the right pedestal (world w1 ) or on the left one (world w2 ). The epistemic model M1 in Figure 1b reflects the same configuration, with the only exception that now the thief is in the vault (v), as a result of moving inside. That is, M1 = hW0 , K0 , V1 i with V1 satisfying V1 (w1 ) = {v, r} and V0 (w2 ) = {v}. Finally, M2 shows the epistemic model that results from performing action flick on M1 . In this third model, two relevant changes occur: first, l and s became true in both worlds, since flicking turns on the light, and then, the thief can see the interior of the vault. Second, more importantly, the two worlds became completely disconnected, something that reveals that the agent has now complete knowledge of the world configuration in the two possible cases, w1 and w2 . Formally, we have that M2 = hW0 , K2 , V2 i with K2 = {(w1 , w1 ), (w2 , w2 )} and with V2 satisfying V2 (w1 ) = {v, l, r, s} and V2 (w2 ) = {v, l, s}. As we can see, the accessibility relation needs not be universal: for instance, we had (w1 , w2 ) 6∈ K2 in M2 above. In general, when (w, w0 ) 6∈ K we say that the two worlds are indistinguishable at plan-time, given epistemic model M = hW, K, V i with {w1 , w2 } ⊆ W . In the example, model M2 tells us that, before executing the plan, the agent cannot tell which of the two possibilities represented by worlds w1 (diamond on the right) and w2 (diamond on the left) will correspond to the actual world. However, once she executes the plan, she will acquire that knowledge in both cases: w1 and w2 are disconnected, so uncertainty in the agent’s knowledge is completely removed. As a result, at that point, she will be able to make a decision whether she should perform the take_right or take_left action. If, on the contrary, two worlds w, w0 are connected (w, w0 ) ∈ K in some epistemic model, we say that they are indistinguishable at run-time. This expresses a higher degree of uncertainty, since the agent has no way to tell which world corresponds to “reality” either before or during the plan execution. For instance, at model M1 we have (w1 , w2 ) ∈ K1 meaning that, during the plan execution, the agent will not be able to decide (at that point) whether actions take_right or take_left will work. Until now, we have only described the information captured by epistemic models and pre- sented an example with transitions among them, but did not specify how those transitions were obtained. For that purpose, we will only assume, by now, the existence of a pair of generic functions called updating evaluation defined below. 
Definition 2 (Updating evaluation). Given a set of models M over set of worlds W and a set of up- dating objects O, an updating evaluation is a pair h⊗, Ri of partial functions ⊗ : M × O −→ M and R : M × O −→ 2W×W satisfying R(M, o) ⊆ D(W ) × D(⊗(M, o)). Function ⊗ takes an initial model M and some updating object o and provides a successor model M0 = ⊗(M, o). We will usually write ⊗ in infix notation and assume that it is left associative so that M⊗o stands for ⊗(M, o) and M⊗o1 ⊗o2 stands for (M⊗o1 )⊗o2 . Relation R(M, o) matches worlds from the initial model M and its successor model M0 = M ⊗ o. We will also write RM,o instead of R(M, o). At this point, we have all the elements for defining the satisfaction of dynamic epistemic formulas. Definition 3 (Satisfaction). Let h⊗, Ri be an updating evaluation. Then, given an epistemic model M = hW, K, V i and world w ∈ W , satisfaction of formulas is given by the following recursive definition: – M, w 6|= ⊥, – M, w |= p iff p ∈ V (w), – M, w |= ϕ1 ∧ ϕ2 iff M, w |= ϕ1 and M, w |= ϕ2 , – M, w |= ϕ1 ∨ ϕ2 iff M, w |= ϕ1 or M, w |= ϕ2 , – M, w |= ϕ1 → ϕ2 iff M, w 6|= ϕ1 or M, w |= ϕ2 , – M, w |= ∼ϕ iff M, w 6|= ϕ, – M, w |= K ϕ iff M, w0 |= ϕ for all w0 with (w, w0 ) ∈ K, and – M, w |= [o]ϕ iff M ⊗ o and RM,o are defined and M ⊗ o, w0 |= ϕ holds for all w0 with (w, w0 ) ∈ RM,o . As usual, we write M |= ϕ iff M, w |= ϕ for every world w ∈ W . Furthermore, given a theory Γ, we write M |= Γ iff M |= ϕ for every formula ϕ ∈ Γ. We say that theory Γ entails formula ψ, also written Γ |= ψ, iff M |= Γ implies M |= ψ for any epistemic model M ∈ M. It is easy to see that the semantics for the dynamic-free fragment of the language (i.e., without [·] operator) corresponds to modal logic S5 (see [56] for instance). Dynamic Epistemic Logic with Event Model Updates: DEL[E] Let us now see how these definitions apply to the case in which updating objects are event models [36]. The following is an adaptation of the definition from [21]. A first peculiarity of event models is that, when making an update M ⊗ o = M0 , the resulting epistemic model M0 uses world names of the form (w, e) where w is a world from the updated epistemic model M and e is a world (or event) from the event model o. For this reason, along this section, we assume that the global set of available world names W is closed under formation of pairs. In other words, W satisfies (w, w0 ) ∈ W for all w, w0 ∈ W. For instance, given a unique “atomic” world name w0 ∈ W, the set W would contain infinitely many pairs (w0 , w0 ), ((w0 , w0 ), w0 ), (w0 , (w0 , w0 )), . . . and so on. Definition 4 (Event Model). An event model over P and W is a quadruple E = hE, K, pre, posti where – E ⊆ W is a finite set of worlds called events, – pre : E −→ LE (P) assigns to each event a precondition, and – post : E −→ (P −→ LE (P)) assigns to each event a postcondition, for some propositions in P. – K ⊆E×E By D(E) = E we denote the domain of E. A pair hE, ei with e ∈ E is called a pointed event model. Definition 5 (Event updating evaluation). Let M = hW, K, V i be an epistemic model and E = hE, K̂, pre, posti an event model, both over P and W. 
The product update M ⊗ E def= hW′, K′, V′i is another epistemic model where
– W′ = { (w, e) ∈ W × E | M, w |= pre(e) } ⊆ W is a set of worlds,
– K′ = { ((w1, e1), (w2, e2)) ∈ W′ × W′ | (w1, w2) ∈ K and (e1, e2) ∈ K̂ },
– V′((w, e)) = { p ∈ P | M, w |= post(e)(p) } for every (w, e) ∈ W′.
Given a pointed event model hE, ei, the event updating evaluation is a pair h⊗, Ri with
– M ⊗ hE, ei def= M ⊗ E,
– R(M, hE, ei) def= { (w, w′) ∈ W × W′ | w′ = (w, e) }.
For simplicity, we will usually write [E, e]ϕ instead of [hE, ei]ϕ. We will also use the following shorthands:
[E]ϕ def= ⋀ e∈D(E) [E, e]ϕ    and    hEiϕ def= ∼[E] ∼ϕ
The following result shows that, indeed, the semantics described above coincides with the semantics from [21] for the case of event models.

Proposition 1. Let M be an epistemic model, w ∈ D(M) be a world in M, hE, ei be a pointed event model and ϕ ∈ LE (P) be a formula. Then,
– M, w |= [E, e]ϕ iff M, w |= pre(e) implies M ⊗ E, (w, e) |= ϕ.

Proof. By definition, we have M, w |= [E, e]ϕ iff M ⊗ hE, ei, w′ |= ϕ for all w′ with (w, w′) ∈ R(M, hE, ei) iff M ⊗ E, w′ |= ϕ for all w′ with (w, w′) ∈ R(M, hE, ei). Note now that, by definition, we have either R(M, hE, ei) ∩ ({w} × D(M ⊗ E)) = {(w, (w, e))} or R(M, hE, ei) ∩ ({w} × D(M ⊗ E)) = ∅. Hence, the above holds iff M ⊗ E, (w, e) |= ϕ or R(M, hE, ei) ∩ ({w} × D(M ⊗ E)) = ∅, iff M ⊗ E, (w, e) |= ϕ or M, w 6|= pre(e), iff M, w |= pre(e) implies M ⊗ E, (w, e) |= ϕ.

The following result helps in understanding the semantics of (non-pointed) event models.

Proposition 2. Let M be an epistemic model, w ∈ D(M) be a world in M, E be an event model and ϕ ∈ LE (P) be a formula. Then,
– M, w |= [E]ϕ iff M ⊗ E, (w, e) |= ϕ for every e ∈ D(E) such that M, w |= pre(e),
– M, w |= hEiϕ iff M, w |= pre(e) and M ⊗ E, (w, e) |= ϕ for some e ∈ D(E).

Proof. The first statement follows directly from the definitions. For the second, we have M, w |= hEiϕ iff M, w |= ∼[E] ∼ϕ iff M, w 6|= [E] ∼ϕ iff M, w |= pre(e) does not imply M ⊗ E, (w, e) |= ∼ϕ for some e ∈ D(E) iff M, w |= pre(e) and M ⊗ E, (w, e) 6|= ∼ϕ for some e ∈ D(E) iff M, w |= pre(e) and M ⊗ E, (w, e) |= ϕ for some e ∈ D(E).

Going back to our running example, Figure 2 depicts the event models corresponding to the actions of Example 1. For instance, Figure 2a depicts the event model of the action take_left with a single event e1 whose precondition is v ∧ ∼d and whose postcondition {d 7→ ∼r} states that, in the next state, d is assigned the truth value that formula ∼r had in the previous state. More interestingly, Figure 2c depicts the event model of the action flick, which has two events e1 and e2 with the same postcondition but different preconditions. The precondition of e1 makes it applicable when the thief is in the vault and the diamond is on the right pedestal, while the precondition of e2 is analogous but for the left pedestal.1 In this sense, when the action flick is executed in the epistemic model M1 (Figure 1b), it follows that only w1 satisfies the precondition of e1 and only w2 satisfies the precondition of e2. As a result, we can see that M1 ⊗ flick has two worlds, that is, D(M1 ⊗ flick) = {(w1, e1), (w2, e2)}. Furthermore, since events e1 and e2 are disconnected, we also get that worlds (w1, e1) and (w2, e2) are disconnected. In fact, the epistemic model M1 ⊗ flick is isomorphic to the model M2 depicted in Figure 1c and can be obtained just by renaming each world wi in M2 as (wi, ei).
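As a companion to this discussion, the following self-contained Python sketch (ours, not the authors' implementation) computes the product update of Definition 5 for M1 and the flick event model of Figure 2c. Preconditions and postconditions are encoded as Python functions over valuations (sets of true atoms); atoms without an explicit postcondition are assumed to keep their previous value, which is the convention suggested by the figures. The function name product_update and this encoding are our own.

```python
# Product update M1 ⊗ flick (Definition 5), illustrative sketch with assumed names.
from itertools import product

ATOMS = {"v", "l", "r", "s", "d"}

# Epistemic model M1 (Figure 1b): thief inside the vault, light off, r unknown.
W1 = {"w1", "w2"}
K1 = set(product(W1, W1))                            # universal relation
V1 = {"w1": {"v", "r"}, "w2": {"v"}}

# Event model flick (Figure 2c): two events, same postcondition, disconnected.
E    = {"e1", "e2"}
K_ev = {("e1", "e1"), ("e2", "e2")}                  # e1 and e2 are not connected
pre  = {"e1": lambda val: "v" in val and "r" in val,       # v ∧ r
        "e2": lambda val: "v" in val and "r" not in val}   # v ∧ ∼r
post = {e: {"l": lambda val: True, "s": lambda val: True} for e in E}

def product_update(W, K, V, E, K_ev, pre, post):
    W2 = {(w, e) for w in W for e in E if pre[e](V[w])}
    K2 = {(we1, we2) for we1 in W2 for we2 in W2
          if (we1[0], we2[0]) in K and (we1[1], we2[1]) in K_ev}
    V2 = {(w, e): {p for p in ATOMS
                   if post[e].get(p, lambda val, q=p: q in val)(V[w])}
          for (w, e) in W2}
    return W2, K2, V2

W2, K2, V2 = product_update(W1, K1, V1, E, K_ev, pre, post)
print(sorted((w, sorted(v)) for w, v in V2.items()))
# [(('w1', 'e1'), ['l', 'r', 's', 'v']), (('w2', 'e2'), ['l', 's', 'v'])]
print((("w1", "e1"), ("w2", "e2")) in K2)            # False: the two worlds get disconnected
```

Running the sketch yields exactly the two disconnected worlds of M2 in Figure 1c, matching the isomorphism noted above.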
1 Note how we must specify the diamond’s location in both preconditions, although the only real physical requirement for flick is being inside the vault. This need for specifying unrelated preconditions may obviously become a representational problem. e1 : hv ∧ ∼d, {d 7→ ∼r}i e1 : hv ∧ ∼d, {d 7→ r}i (a) take_left (b) take_right e1 : hv ∧ r, {l 7→ >, s 7→ >}i e2 : hv ∧ ∼r, {l 7→ >, s 7→ >}i (c) flick e1 : hv ∨ ∼l, {v 7→ ∼v}i e2 : h∼v ∧ l ∧ ∼r, {v 7→ ∼v, s 7→ >}i e2 : h∼v ∧ l ∧ r, {v 7→ ∼v, s 7→ >}i (d) move Figure 2: Event models corresponding to the actions of Example 1. The first element of the pair assigned to each world corresponds to its preconditions while the second one corresponds to its postconditions. For instance, in (a) the precondition is v ∧ ∼d and the postcondition {d 7→ ∼r}. This postcondition means that, in the next state, d takes the value that formula ∼r had in the previous state. e1 : hv ∧ r, {l 7→ >, s 7→ >}i M02 e2 : hv ∧ ∼r, {l 7→ >, s 7→ >}i w1 : vlrsd w2 : vlrsd (a) flick0 (b) Figure 3: (a) Event model corresponding to a variation of the action flick of Example 1 without observing the position of the diamond. (b) Epistemic model obtained after executing the action flick0 in the model M1 . Note that the existence of two disconnected events in the action flick encodes the observa- tion that happens when the light is turned on, that is, the agent obtains the knowledge about the actual place of the diamond. For instance, if we consider the action flick0 depicted in Figure 3a, and obtained from the action flick by connecting events e1 and e2 , we can see that M1 ⊗ flick0 is isomorphic to the epistemic model M02 depicted in Figure 3b. Model M02 only differs from M2 in that worlds w1 and w2 are now connected, revealing that the agent cannot tell where is the diamond. In other words, flick0 encodes the same ontic changes in the world than flick but does not capture the agent’s observation about the position of the diamond. Finally, Figure 4a (model M3 ) corresponds to an state where the thief is inside the vault with the diamond in her possession. Intuitively, this model represent the result of executing action take_left or take_right according to the agent’s knowledge about the position of the diamond, whereas Figure 4b represents model M4 = M3 ⊗ move, that is, the result of moving (outside the vault) afterwards. M3 M4 w1 : vlrsd w2 : vlrsd w1 : vlrsd w2 : vlrsd (a) (b) Figure 4: Epistemic models representing (a) the state corresponding to execution of action take_left or take_right according to the agent’s knowledge about the position of the diamond and (b) the result of executing move on M3 in (a). Planning in Answer Set Programming In this subsection, we informally describe the ASP methodology for representing problems of classical planning: for a more formal approach we refer to [45, 57]. Our purpose is, by now, merely introductory, trying to illustrate the main representational features of ASP planning that are relevant for the current discussion. For this reason, we delay the introduction of a formal semantics for later on, when epistemic ASP is introduced. ASP specifications or logic programs are sets of rules of the form: a ← b1 , . . . , bn , not c1 , . . . , not cm where ← is a reversed implication, so its left hand side is called the rule head and its right hand side receives the name of rule body. The rule head a is a proposition or the symbol ⊥: when the latter happens, the rule is a constraint forbidding that the body holds. 
The elements in the body (bi and not cj ) are called literals, where bi and cj are propositions. The ordering among literals is irrelevant: in fact, commas just stand for conjunctions. Operator not represents default negation: we read not cj as “there is no evidence on cj ” or “there is no way to prove cj ”. We will also use non-deterministic rules of the form: m {a1 ; . . . ; ak } n ← Body where m, n ≥ 0 meaning that, when Body holds, we can arbitrarily add a subset of atoms from {a1 , . . . , ak } of cardinality c with n ≤ c ≤ m. ASP allows a second kind of negation called strong or explicit that we will represent ∼p. From a practical point of view, we can assume that “∼p” is a new kind of atom and that models cannot make p and ∼p true simultaneously. For a simple representation of rules describing transitions we partly adopt the syntax of [57] and assume that, for each proposition p, we handle a second atom “•p” that stands for p at the immediately previous situation. In temporal ASP, actions are represented as regular propositions in P: the rest of non-action propositions in P are called fluents. Taking all these considerations, the behaviour of action take_left can be encoded in ASP as the following three rules: d ← take_left (1) ⊥ ← take_left, ∼•v (2) ⊥ ← take_left, •r (3) where (1) describes its direct effect (grasping the diamond) whereas the other two rules describe the preconditions: (2) forbids executing take_left when the thief was not in the vault and (3) forbids its execution when the diamond is in the right pedestal. Analogously, the following three rules encode the action take_right: d ← take_right (4) ⊥ ← take_right, ∼•v (5) ⊥ ← take_right, ∼•r (6) Similarly, actions flick and move are respectively represented by the rules: l ← flick (7) ⊥ ← flick, ∼•v (8) v ← move, ∼•v (9) ∼v ← move, •v (10) Rule (7) states the postcondition of flick, that is, the light is turned on, while rule (8) states its precondition, that is, we forbid its execution when being outside vault. Rules (9) and (10) together state the postconditions of move: its execution just flips the truth value of v. To illustrate the use of indirect effects, we can just assert that seeing the diamond (s) just depends on being in the vault (v) with the light on (l), regardless of the actions that have been performed to reach that situation. This is naturally captured by the single ASP rule: s ← v, l (11) Default negation allows a natural representation of the inertia law, that is, a fluent normally remains unchanged, unless there is evidence on the contrary. We divide the set of fluents into inertial F I = {v, l, r, d} and non-inertial fluents F N = {s}. Inertia is then guaranteed by the pair of rules: f ← •f, not ∼f (12) ∼f ← ∼•f, not f (13) for each inertial fluent f ∈ F I . In our running example, the fluent (s) is considered false by default, that is, the following rule: ∼s ← not s (14) sating that, unless (s) is proved, we should consider that its explicit negation (∼s) is added to the set of conclusions. If we consider now the following simplification of Example 1 where the value of all fluents in the initial situation are known, we can use ASP to obtain a plan to achieve the thief’s goal. Example 2 (Example 1 continued). Consider now the case where the thief is outside the vault (∼v) and already knows the Pink Panther is inside the vault on the right (r) pedestal. 1 #program dynamic. d :- take_left. :- take_left, not 'v. :- take_left, not -'r. d :- take_rigth. 6 :- take_rigth, not 'v. :- take_rigth, not 'r. 
l :- flick. :- flick, not 'v. v :- move, -'v. 11 -v :- move, 'v. s :- v,l. v :- 'v, not -v. -v :- -'v, not v. d :- 'd, not -d. 16 -d :- -'d, not d. l :- 'l, not -l. -l :- -'l, not l. r :- 'r, not -r. -r :- -'r, not r. 21 {take_left; take_rigth; flick; move}1. #program always. -s :- not s. 26 #program i n i t i a l . -v. r. -d. -l. #program final . :- not -v. 31 :- not d. #show take_left/0. #show take_rigth/0. #show flick/0. 36 #show move/0. Listing 1: Program corresponding to Example 2 in the syntax of the solver telingo. Listing 1 shows the full encoding representing Example 2 in the syntax of the ASP solver telingo.2 In this syntax, ← is represented as :-, •p as 'p and ∼p as -p. By copying that encoding into a file called pink.lp and executing “telingo pink.lp” we can obtain a plan for this example. As we said before, an important difference between ASP and event models is the treatment of indirect effects. In the example, note how s was captured by the ASP rule (11), which only depends on other fluents (v and l) but does not refer to the original actions that caused their 2 https://github.com/potassco/telingo. values. There is no flexible way to express this feature when using event models: the value of s must be expressed as a direct effect of actions flick and move, that respectively determine the values of l and v. If new actions could alter the values of those fluents, directly or indirectly, then their effect on s should also be included in their post-conditions. This is, in fact, an instance of the well-known ramification problem [58]. The ramification problem may also occur for epistemic effects, if we are not careful enough for their treatment. For instance, the encoding of Example 1 in [21] did not use our fluent s (where the ramification is evident), but transferred the problem to the epistemic effect of knowing the position of the diamond (r). Again, this epistemic effect must be specified as a direct consequence of some action, something that does not always seem reasonable. In the rest of the paper, we develop an extension of DEL and ASP, that we denote DEL[ASP] where the ontic and epistemic effects of actions can be described both in a direct or indirect way, as needed. In particular, in DEL[ASP], the observation of the diamond position when the thief is in the illuminated vault can be expressed by the following rule analogous to (11): O r ← v, l (15) Here, O r is a new construct whose intuitive meaning is that “the actual value of the fluent r is observed (by the agent).” Note that we just replace the fluent s in (11), whose intuitive meaning is that the agent sees the position of the diamond, by this new construct O r, which makes this observation affect the agent’s beliefs. Epistemic Logic Programs As explained before, we will use FAEEL for the interpretation of epistemic specifications, the epistemic extension of ASP. FAEEL inherits both the ASP capabilities for knowledge representa- tion and the AEL capabilities for introspective reasoning. For the sake of coherence, we adapt the definitions of [51] to the use of Kripke structures. We also add strong negation [52, 59] to FAEEL, which for simplicity, is restricted to be used only in front of atoms, something that suffices for the goals of this paper and is usual in the ASP literature [43, 60]. Autoepistemic formulas are defined according to the following grammar: ϕ ::= ⊥ | p | ∼p | ϕ1 ∧ ϕ2 | ϕ1 ∨ ϕ2 | ϕ1 → ϕ2 | L ϕ where p ∈ P is a proposition. 
We assume that ASP notation is transformed into its logical counterpart: not F is represented as ¬F , commas in the body are replaced by conjunctions and rules G ← F are written from left to right F → G. Intuitively, the new construct, L ϕ, is read as “it is reasonable (for the planning agent) to believe ϕ.” Weak or intuitionistic negation is defined as usual: ¬ϕ def = ϕ → ⊥. The knowledge modality is defined as true belief: K ϕ def = ϕ∧L ϕ. We also introduce the following abbreviations: def Up = ¬p ∧ ¬∼p def Op = (p → L ¬¬p) ∧ (∼p → L ¬¬∼p) ∧ (U p → L ¬¬U p) whose respective intuitive meanings are that the value of proposition p ∈ P is undefined and that the actual value of proposition p ∈ P is observed. Note that when an atom p is observed, the agent’s beliefs have to agree with the actual value of the atom p. The use of double negation here implies that only the agent’s beliefs will be modified, without providing any justification for believing p. Besides, we assume all previous abbreviations too, that is, (ϕ ← ψ) def = (ψ → ϕ), ϕ ↔ ψ def= (ϕ → ψ) ∧ (ψ → ϕ), and > def = ⊥ → ⊥. An autoepistemic theory Γ is a set of autoepistemic formulas as defined above. When a theory is a singleton, we will usually write just ϕ instead of {ϕ}. A literal L is either a proposition p ∈ P or a proposition preceded by strong negation ∼p and by Lit def = P ∪ {∼p | p ∈ P} we denote the set of all literals over the signature P. We define next an (autoepistemic) HT-model, as a combination of modal epistemic logic with the logic of here-and-there (HT) [61], an intermediate logic with two intuitionistic worlds, h (standing for “here”) and t (read as “there”) satisfying h ≤ t. Definition 6 (HT-Model). Given a set of propositional symbols P, an HT-model is a quadruple M = hW, K, V h , V t i where – W is a set of worlds, – K ⊆ W × W is an accessibility relation on W , and – V x : W −→ 2Lit is a valuation with x ∈ {h, t} such that V h (w) ⊆ V t (w) for all w ∈ W . D(M) = W denotes the domain of M. A belief HT-model is an HT-model where K = K × K0 with K0 = K \ {w0 } for some distinguish world w0 ∈ K. A HT-model M = hW, K, V h , V t i is called total iff V h = V t . Furthermore, by Mt def = hW, K, V , V i we denote the total model corresponding to M. Satisfaction of autoepistemic t t formulas is then given by the following recursive definition: – M, w 6|= ⊥, – M, w |= L iff L ∈ V h (w) for any L ∈ Lit, – M, w |= ϕ1 ∧ ϕ2 iff M, w |= ϕ1 and M, w |= ϕ2 , – M, w |= ϕ1 ∨ ϕ2 iff M, w |= ϕ1 or M, w |= ϕ2 , – M, w |= ϕ1 → ϕ2 iff both M, w 6|= ϕ1 or M, w |= ϕ2 and Mt , w 6|= ϕ1 or Mt , w |= ϕ2 , – M, w |= L ϕ iff M, w0 |= ϕ for all w0 with (w, w0 ) ∈ K As usual, we say that M is an HT-model of some theory Γ, in symbols M |= Γ, iff M, w |= ϕ for every world w ∈ D(M) and every formula ϕ ∈ Γ. As mentioned before, when Γ = {ϕ} is a singleton we will omit the brackets, so that M |= ϕ stands for M |= {ϕ} and holds iff M, w |= ϕ for every world w ∈ D(M). Definition 7 (Bisimulation between HT-models). Let M1 = hW1 , K1 , V1h , V1t i and M2 = hW2 , K2 , V2h , V2t i be two HT-models. Given some binary relation Z ⊆ W1 × W2 , we write M1 Z M2 iff – every (w1 , w2 ) ∈ Z satisfies V t (w1 ) = V t (w2 ) and V h (w1 ) ⊆ V h (w2 ), – for every (w1 , w10 ) ∈ K1 , there is (w2 , w20 ) ∈ K2 such that (w10 , w20 ) ∈ Z, – for every (w2 , w20 ) ∈ K2 , there is (w1 , w10 ) ∈ K1 such that (w10 , w20 ) ∈ Z. We write M1  M2 iff there is a total relation Z s.t. M1 Z M2 . 
We also say that M1 and M2 are bisimilar, in symbols M1 ≈ M2 , iff there is a total relation Z s.t M1 Z M2 and M2 Z M1 . As usual, we write M1 ≺ M2 iff M1  M2 and M1 6≈ M2 . Definition 8 (Equilibrium model). A total belief HT-model M of some theory Γ is said to be an equilibrium model of Γ iff there is no other belief HT-model M0 of Γ such that M0 ≺ M. Given some information cell M = hW, K, V i and some set of literals I ⊆ Lit, by M + I we denote the total belief HT-model M0 = hW 0 , K0 , V 0 , V 0 i where W 0 = {w0 } ∪ W with w0 ∈ / W, K0 = W 0 × W and V 0 : W 0 −→ 2Lit satisfies V 0 (w) = V (w) for all w ∈ W and V 0 (w0 ) = I. Definition 9 (World view). Given a set of propositions P, an information cell M = hW, K, V i over Lit is called a world view of some theory Γ iff the following two conditions hold: – M + V (w) is an equilibrium model of Γ, for every world w ∈ W , – M+I is not an equilibrium model of Γ for every set of literals I ⊆ Lit satisfying I 6= V (w) for all w ∈ W , and – either p ∈ / V (w) or ∼p ∈ / V (w) for all p ∈ P and w ∈ W . We say that a theory Γ is consistent iff it has some world view and by WV[Γ] we denote the set of all world views of Γ. Example 3 (Example 1 continued). For instance, the formula ϕ0 = ∼v ∧ ∼l ∧ (r ∨ ∼r) ∧ ∼s ∧ ∼d (16) has a unique world view that is depicted in Figure 5. Note that every propositional theory has a unique world view [51] that corresponds to the set of all answer sets of the theory. Furthermore, since ϕ0 contains no negation, its answer sets coincide with its minimal classical models when we treat each strong negated literal ∼p as a new atom. Dynamic Epistemic Logic with ASP Updates: DEL[ASP] In this section, we present the major contribution of this paper, DEL[ASP], an instance of the abstract DEL framework where updating objects correspond to logic programs. Our motivation is twofold: on the one hand, to allow unrestricted use of indirect effects (both ontic and epistemic); on the other hand, to preserve the ASP representation of non-epistemic planning problems without need of any adjustment or modification. We illustrate these two objectives through our running example. Mwv 0 w1 : {∼v, ∼l, r, ∼s, ∼d} w2 : {∼v, ∼l, ∼r, ∼s, ∼d} Figure 5: Unique world view of the formula ϕ0 = ∼v ∧ ∼l ∧ (r ∨ ∼r) ∧ ∼s ∧ ∼d. Characterising information cells in FAEEL Let us start by showing how any information cell can be represented by some autoepistemic formula in FAEEL. Note that world views are an information cell over Lit, so they represent a kind of three valued epistemic models where each proposition p can be true p ∈ V (w), false ∼p ∈ V (w) or undefined p, ∼p ∈ / V (w). We will show here how (two-valued) information cells over P can be simply represented as propositional formulas in FAEEL, allowing to map these three valued epistemic models into standard two valued ones. Example 4 (Example 3 continued). Continuing with our running example, we can see now that this model satisfies either p ∈ V (w) or ∼p ∈ V (w) for every proposition p ∈ {v, l, r, s, d} and world w ∈ {w1 , w2 }. Hence, we can map this model into a (two-valued) information cell by con- sidering as true every proposition p ∈ V (w) and as false every proposition satisfying ∼p ∈ V (w). It is easy to see that the obtained information cell is precisely the model M0 depicted in Figure 1a, that is, the epistemic model corresponding to the initial situation of Example 1. In this sense, we can use the formula ϕ0 to represent the initial state of this example. Definition 10. 
Given W somewinformation cell M = hW, K, V i, its characteristic (autoepistemic) formula is ϕM def = w∈W M ϕ where ϕ w is defined as follows: M  ^   ^  ϕwM def = p ∧ ∼p p∈V (w) p∈P\V (w) Definition 11 (Bisimulation). Given two models M1 = hW1 , K1 , V1 i and M2 = hW2 , K2 , V2 i, we say that they are bisimilar, in symbols M1 ≈ M2 , if and only if hW1 , K1 , V1 , V1 i ≈ hW2 , K2 , V2 , V2 i. Definition 12. Given a set of propositions P ⊆ P, we say that an HT-model M = hW, K, V h , V t i over Lit is P -classical iff every world w ∈ W and proposition p ∈ P satisfy that either p ∈ V h (w) or ∼p ∈ V h (w) holds. A theory Γ is P -classical iff it is consistent and, in addition, every world view is P -classical. Definition 13. Given a set of propositions P ⊆ P and any P -classical total HT-model M = hW, K, V t , V t i over Lit, by M ↓ P = hW, K, V i we denote the model over P where V : W −→ 2P is a valuation satisfying V (w) = V t (w) ∩ P for every world w ∈ W . Proposition 3. Let M be an information cell over P. Then, ϕM has a unique world view M0 and we have that M and M0 ↓ P are bisimilar. Mwv 1 w1 : •{∼v, ∼l, r, ∼s, ∼d} w2 : •{∼v, ∼l, ∼r, ∼s, ∼d} ∪ {move, v, ∼l, r, ∼s, ∼d} ∪ {move, v, ∼l, ∼r, ∼s, ∼d} Figure 6: Unique world view of the program Γ1 . Proof. First note that, since ϕM is a propositional formula, it has a unique world view M0 [51, Proposition 3]. Let M = hW, K, V i and M0 = hW 0 , K0 , V 0 i. Then, we have that w0 ∈ W 0 iff V 0 (w0 ) is a stable model of ϕM . Note also that the stable models of ϕM are exactly its classical models understood as sets of literals. Hence, for every w0 ∈ W 0 , there is some w ∈ W such that V 0 (w0 ) = V (w)∪∼(P\V (w)) and vice-versa. Consequently, M and M0 ↓ P are bisimilar. Example 5 (Example 4 continued). Continuing with our running example, we have ϕM0 = ϕw w2 M0 ∨ ϕM0 with 1 ϕw M0 = ∼v ∧ ∼l ∧ r ∧ ∼s ∧ ∼d 1 ϕw M0 = ∼v ∧ ∼l ∧ ∼r ∧ ∼s ∧ ∼d 2 By applying distributivity of conjunctions over disjunctions, it is easy to see that ϕM0 is classically 0 is the unique world view of ϕM0 and, (and intuitionistically) equivalent to (16). As a result, Mwv as expected from Proposition 3, it can be checked that it satisfies Mwv 0 ↓ P = M0 . Epistemic Model Updates with FAEEL In this section, we show how autoepistemic equilibrium logic can be used to define epistemic model updates just by using an extended signature. Given a set of propositions S ⊆ P, we define •S def = { •p | p ∈ S ∩ P } and Pbi = P ∪ •P where •p intuitively means that p is true in the previous state. It will also be convenient to use • in front of any propositional formula ϕ such that •ϕ is as an abbreviation for the formula obtained by writing • in front of every proposition occurring in ϕ. Example 6 (Example 3 continued). Let Γpink be a theory containing formulas (1)-(15) and let Γ1 = Γpink ∪ {move, •ϕM0 }. This program has a unique world view shown in Figure 6. Note that, if we disregard all the information corresponding to the previous situation (that is all literals preceded by •) and the action move, then we have the same information as the epistemic model M1 in Figure 1a. In other words, Γ1 encodes the transition that occurs between the epistemic models M0 and M1 when executing action move. As shown in the example above, we can represent the transition between two epistemic models as an autoepistemic theory. Let us now formalise this intuition. We begin introducing some auxiliary definitions. Given a set of epistemic models S = {M1 , M2 , . . . 
} where each Mi is a model over a set of atoms P of the F form Mi = hWi , Ki , Vi i and satisfying Wi ∩ Wj = ∅ for all Mi , Mj ∈ S with i 6= j, by S def = hW, K, V i, we denote an epistemic model where – W 0 = { Wi | Mi ∈ S }, S – K0 = { Ki | Mi ∈ S }, and S – V 0 : W 0 −→ 2P with V 0 (w) = Vi (w) for all w ∈ Wi and all Mi ∈ S. As usual, if S = {M1 , M2 }, we just write M1 t M2 instead of {M1 , M2 }. F Definition 14.F Let P be a set of atoms and Γ be some P -classical autoepistemic theory. Then, by = { M ↓ P | M ∈ WV[Γ] } we denote the epistemic model capturing all the world mwv(Γ, P ) def views of Γ projected into the vocabulary P . If Γ is not P -classical, we assume that mwv(Γ, P ) is not defined. In other words, for every P -classical autoepistemic theory Γ, mwv(Γ, P ) is the two-valued epistemic model that has an information cell for every world view of Γ such that the valuation of every proposition p ∈ P in every world in mwv(Γ, P ) corresponds to the valuation of that proposition in that same world in the corresponding world view. Recall that, in a P -classical theory, all its worlds views satisfy either p or ∼p for every proposition p ∈ P . This is necessary so it is possible to map three-valued world views into two-valued epistemic models. We could remove this restriction by allowing three-valued epistemic models, but we have decided to stay as close as possible to the original DEL semantics, which is only two-valued. Example 7 (Example 6 continued). Note that if P ⊆ P, then the epistemic model mwv(Γ, P ) always corresponds to the current situation, discarding all information about the previous one. In this sense, if we consider the program Γ1 of Example 6 and the epistemic model M1 of Figure 1b, we have that mwv(Γ1 , P ) = M1 where P = {v, d, r, l, s} is the set of fluent of Example 1. As a further example, consider now the theory Γ2 = Γpink ∪ {flick, •ϕM1 }. Then, Γ2 has two world views which correspond to the two cell informations in the epistemic model M2 depicted in Figure 1c. This explains why we need to join together all the world views of the theory in a single epistemic model. Every world view, which becomes an information cell, represents the knowledge the agent will have after executing the action flick, while the set of all world views represent the knowledge the agent had before executing it. Let us now define the transition between states borrowing the notion of product update from DEL. Definition 15. Given an information cell M over P ⊆ P and a theory Γ over Pbi such that Γ ∪ {•ϕM } is Pbi -classical, we define: = mwv(Γ ∪ {•ϕM }, P ). – the product update of M with respect to Γ as the epistemic model M ⊗ Γ def – the binary relation RM,Γ ⊆ D(M) × D(M ⊗ Γ) s.t. (w, w0 ) ∈ RM,Γ iff M0 , w0 |= •ϕw M with M0 = mwv(Γ ∪ {•ϕM }, •P). Example 8 (Example 7 continued). It is now easy to see that M0 ⊗ Γ3 = mwv(Γ3 ∪ {•ϕM0 }, P ) = mwv(Γ1 , P ) = Mwv 1 ↓ P = M1 M1 ⊗ Γ4 = mwv(Γ4 ∪ {•ϕM0 }, P ) = mwv(Γ2 , P ) = M2 M00 M00 ⊗ Γ3 M00 ⊗ Γ3 ⊗ Γ5 M00 ⊗ Γ3 ⊗ Γ5 ⊗ Γ3 w1 : vlrsd w1 : vlrsd w1 : vlrsd w1 : vlrsd (a) (b) (c) (d) Figure 7: Epistemic models corresponding execution of the sequence of actions hmove, take_right, movei in initial state of Example 2, represented here by the model M00 in (a). with Γ3 = Γpink ∪{move} and Γ4 = Γpink ∪{flick} respectively being the theories representing the execution of the action move and flick according to program Γpink , Mwv 1 the epistemic model depicted in Figure 6 and M1 and M2 the epistemic models depicted in Figure 1b and Figure 1c. 
In other words, M1 is the result of executing action move in epistemic model M0 according to the de- 1 , w |= •ϕM . scription provided by Γpink . Furthermore, for each world w ∈ {w1 , w2 }, we have Mwv w In its turn, this implies M1 ↓ •P, w |= •ϕM and, thus, that RM0 ,Γ3 = {(w1 , w1 ), (w2 , w2 )} wv w is the identity3 , that is, it maps each world in M0 to a world in M1 with the same name. Simi- larly, M2 is the result of executing the action flick in the epistemic model M1 according to the description provided by Γpink and we can check that RM1 ,Γ4 is also the identity. We define now the updating evaluation for ASP epistemic specifications for epistemic models. In a nutshell, this evaluation is the result of combining the evaluation for each individual information cell in the model. Definition 16 (ASP updating evaluation). Given any epistemic model M and theory Γ, the ASP updating evaluation is a pair h⊗, Ri satisfying G M ⊗ Γ def = { M0 ⊗ Γ | M0 ∈ cell(M) } [ RM,Γ def = { M0 ⊗ Γ | M0 ∈ cell(M) } We can now directly apply Definition 3 to obtain the satisfaction of DEL[ASP] formulas, that is, DEL formulas in which the updating objects are autoepistemic theories. Example 9 (Example 2 continued). Let us now resume the simplified version of our running ex- ample introduced in Example 2. In this case, the initial situation can be represented by an epistemic model M00 depicted in Figure 7a. Then, it can be checked that M00 |= K[Γ3 ][Γ5 ][Γ3 ](∼v ∧ d) holds with Γ5 = Γpink ∪ {take_right}. In other words, the thief knows that after executing the sequence of actions hmove, take_right, movei she will be out of the vault with the diamond. 3 Note that, for the sake of clarity, the names of worlds have been chosen so that RM0 ,Γ3 is the identity, but this is not necessarily the case. In fact, worlds from M0 and M1 could be disjoint or even be switched so that w1 could be called w2 and vice-versa. M1 ⊗ Γ6 M1 ⊗ Γ6 ⊗ Γ7 w1 : vlrsd w2 : vlrsd w1 : vlrsd w2 : vlrsd (a) (b) Figure 8: Epistemic models corresponding to (a) the execution of try_take_left in the model M1 = M0 ⊗ Γ6 and (b) the execution of try_take_right in the resulting state. That is, this sequence of actions is a valid plan that achieves the goal of getting out of the vault with the diamond regardless of the actual initial situation. This means that this is a conformant plan. For the sake of completeness, Figures 7b, c and d respectively depict the epistemic models M00 ⊗ Γ3 , M00 ⊗ Γ3 ⊗ Γ5 and M00 ⊗ Γ3 ⊗ Γ5 ⊗ Γ3 . Example 10 (Example 1 continued). As a further example, consider another variation of Exam- ple 1 where we have actions try_take_left and try_take_right that are similar to take_left and try_take_left, but that can be executed even when the diamond is not in the right location, having no effect in such case. This can be represented by a theory Γ0pink obtained from Γpink by replacing rules (1-6) by the following rules: d ← try_take_left ∧ • ∼r (17) ⊥ ← try_take_left ∧ ∼•v (18) d ← try_take_right ∧ •r (19) ⊥ ← try_take_right ∧ ∼•v (20) Now, we can check that M0 |= K[Γ3 ][Γ6 ][Γ7 ][Γ3 ](∼v ∧ d) (21) holds with Γ6 = Γ0pink ∪ {try_take_left} and Γ7 = Γ0pink ∪ {try_take_right}. Recall that M0 ⊗ Γ3 = M1 is the epistemic model depicted in Figure 1a. We also can check that M0 ⊗ Γ3 ⊗ Γ6 ⊗Γ7 ⊗Γ3 = M4 is the epistemic model depicted in Figure 4b. Figure 8 depicts the epistemic mod- els corresponding to intermediate states. With these models, we can see that (21) holds. 
That is, the thief knows that after executing the sequence of actions hmove, try_take_left, try_take_right, movei she will be outside the vault with the diamond. Therefore, this sequence of actions constitutes a conformant plan for this problem. Note that the thief achieves her goal without ever getting to know where the diamond actually was.

Conditional Planning in DEL[ASP]

In this section, we show how to use DEL[ASP] to represent conditional plans. Let us start by defining what a plan is, introducing the following plan language from [21].

Definition 17 (Plan Language). Given disjoint sets of actions A and fluents F, a plan is an expression π built with the following grammar:
π ::= a | skip | if K ϕ then π else π | π; π
where a ∈ A and ϕ is a formula over F. We write (if K ϕ then π) as a shorthand for the plan (if K ϕ then π else skip).

As mentioned in the introduction, conditional plans contain "if-then-else" structures that allow the agent to apply different strategies depending on the knowledge she has obtained along the execution of the plan. For instance,

move ; flick ; if K r then take_right else take_left ; move     (22)

is a conditional plan for the problem laid out in Example 1. It is a plan since, as we will prove next, the thief eventually takes the diamond out in all possible outcomes, and it is conditional because the third step contains an alternative decision. If the thief acts according to her knowledge about the diamond position at that point, the plan is guaranteed to succeed. We will show that, in fact, after executing the actions move and flick, the thief knows that she will know where the diamond is. Let us now formalise these intuitive ideas by providing a translation from plans into DEL[ASP] as follows:

Definition 18 (Translation). Let A ⊆ P and F ⊆ P be a pair of disjoint sets of propositions, respectively corresponding to actions and fluents. The translation of a plan π over A applied to a formula ψ over F, with respect to a theory Γ over Pbi, is denoted as J π KΓ ψ and is recursively defined as:

J a KΓ ψ def= hΓ ∪ {a}i> ∧ [Γ ∪ {a}]ψ
J skip KΓ ψ def= ψ
J π; π′ KΓ ψ def= J π KΓ (J π′ KΓ ψ)
J if K ϕ then π else π′ KΓ ψ def= (K ϕ → J π KΓ ψ) ∧ (∼K ϕ → J π′ KΓ ψ)

where ϕ is any formula over F. As a first remark, note that the translation J a KΓ of an action a is always made by adding a constant theory Γ that defines the behaviour of the action domain (fixing the transition relation). As a result, each elementary action in the plan becomes a complete autoepistemic theory Γ ∪ {a} in the translation. When Γ is clear from the context, we will simply write J π K instead of J π KΓ. The conjunct [Γ ∪ {a}] ψ requires that ψ becomes true in any resulting state, whereas hΓ ∪ {a}i> ensures that action a is indeed executable. We check next that the evaluation of plan (22) corresponds to what we have already seen in Example 8. For the sake of clarity, we gather together all rules of the theory Γpink in Figure 9.

Example 11 (Example 8 continued). Going on with our running example, let us consider plans J π KΓpink, simply denoted as J π K.
We have seen that M0 ⊗ Γ3 = M1 where Γ3 = d ← take_left (1) s ← v, l (11) ⊥ ← take_left, ∼•v (2) O r ← v, l (15) ⊥ ← take_left, •r (3) ∼s ← not s (14) d ← take_right (4) v ← •v, not ∼v ⊥ ← take_right, ∼•v (5) ∼v ← ∼•v, not v ⊥ ← take_right, ∼•r (6) l ← •l, not ∼l l ← flick (7) ∼l ← ∼•l, not l ⊥ ← flick, ∼•v (8) r ← •r, not ∼r v ← move, ∼•v (9) ∼r ← ∼•r, not r ∼v ← move, •v (10) d ← •d, not ∼d ∼d ← ∼•d, not d Figure 9: Theory Γpink : the left column contains the direct effects and preconditions of actions while the right one contains the indirect effects and the inertia axioms. Γpink ∪ {move} was the theory representing the execution of action move according to Γpink . We have also seen that RM0 ,Γ3 = {(w1 , w1 ), (w2 , w2 )} is the identity. Similarly, given theory Γ8 = Γpink ∪ {flick} representing the execution of action flick, we have M1 ⊗ Γ8 = M2 and RM1 ,Γ8 = {(w1 , w1 ), (w2 , w2 )}. Figure 10 shows these three models together with the corresponding relations RM0 ,Γ3 and RM1 ,Γ8 . Looking at this figure, we observe that M2 , w1 |= v ∧ K r and, thus, also M2 , w1 |= v ∧ (K r ∨ K ∼r). From this we can conclude that M1 , w1 |= [Γ8 ](v ∧ (K r ∨ K ∼r)). Note that M1 |= hΓ8 i> holds and, thus, it follows M1 , w1 |= J flick K(v ∧ (K r ∨ K ∼r)) Now we can check that M0 , w1 |= [Γ3 ]J flick K(v ∧ (K r ∨ K ∼r)) and M0 |= hΓ3 i> hold and, thus, we can conclude  M0 , w1 |= J move K J flick K(v ∧ (K r ∨ K ∼r)) An analogous reasoning, allow us to see that the same holds for M0 , w2 and, thus, we obtain  M0 |= KJ move K J flick K(v ∧ (K r ∨ K ∼r)) By definition, these two facts imply M0 |= KJ move; flick K(v ∧ (K r ∨ K ∼r)) (23) In other words, the thief knows that, after executing actions move and flick, she will be inside the vault and that she will know where the diamond is. So she will be ready for the next step: using her knowledge to decide what is the suitable action to continue the plan. Let us now continue with the thief’s reasoning process after the execution of the first two actions. w1 : vlrsd w1 : vlrsd w1 : vlrsd w1 : vlrsd w1 : vlrsd w2 : vlrsd w2 : vlrsd w2 : vlrsd M0 M1 M2 M31 M41 move flick take_right move Figure 10: Execution of the sequence of actions hmove, flick, take_right, movei starting at M0 , w1 of Example 1. We have Mi+1 = Mi ⊗ (Γpink ∪ {ai }) with ai the corresponding action in the sequence. The dotted arrows depict the R relation associated with the update of Mi with respect to Γpink ∪ {ai }. Note that action take_right is not executable in M2 , w2 and, as a result, w2 has no associated world in M3 . Example 12 (Example 11 continued). We will show now that M2 , w1 |= J if K r then take_right else take_left; move K(v ∧ d) (24) is satisfied. First note that M31 ⊗ Γ3 = M41 RM2 ,Γ3 = {(w1 , w1 )} and that M41 , w1 |= ∼v ∧ d and, thus, we get M41 , w1 |= J move K(∼v ∧ d). Let now Γ9 def = Γpink ∪ {take_right}. Then, we have M2 ⊗ Γ9 = M31 RM2 ,Γ9 = {(w1 , w1 )} from where we get M2 , w1 |= [Γ9 ]J move K(∼v ∧ d). Furthermore, this implies M2 , w1 |= J take_right KJ move K(∼v ∧ d), which in its turn implies M2 , w1 |= K r → J take_right KJ move K(∼v ∧ d) Note that M2 , w1 |= K r and, thus, M2 , w1 |= ∼K r → J take_right KJ move K(∼v ∧ d) also follows. As a result, we can see that (24) holds. Now we follow the reasoning from Example 11 to show that M0 , w1 |= J (22) K(∼v∧d). That is, (22) is a plan that achieves the goal of Example 1 in the case that the diamond is in the right pedestal. 
Analogously, Figure 11 shows the models needed to prove M0, w2 |= ⟦(22)⟧(∼v ∧ d), that is, when the diamond was on the left. As a result, we obtain M0 |= K ⟦(22)⟧(∼v ∧ d). In other words, the thief knows that after executing (22), she will succeed in her goal: being outside of the vault with the diamond.

Figure 11: Execution of the sequence of actions ⟨move, flick, take_left, move⟩ starting at M0, w2 of Example 1. This figure is analogous to Figure 10 but replacing action take_right by take_left and models M3^1 and M4^1 by M3^2 and M4^2, respectively.

To conclude this section, we formalise the concepts of planning task and planning solution.

Definition 19 (Planning task). Given the disjoint sets of actions A ⊆ P and fluents F ⊆ P, a planning task is a triple Π = ⟨Γ0, Γ, ϕg⟩ where Γ0 is a theory over P \ A defining the initial state, Γ is a theory over Pbi defining the interpretation of actions, and ϕg is the goal formula over F.

Definition 20 (Planning solution). A plan π is a conditional solution for the planning task Π = ⟨Γ0, Γ, ϕg⟩ iff mwv(Γ0, F) |= ⟦π⟧Γ ϕg. A conditional solution without occurrences of the “if-then-else” construct is called a conformant solution.

In particular, Example 1 can now be formalised as the planning task Π = ⟨Γ0, Γpink, ϕg⟩ where Γ0 is a singleton containing (16), describing the initial situation, and ϕg = ∼v ∧ d. Then, we can see that (22) is a conditional solution for the planning task Π. We can also formalise Example 2 as the planning task Π′ = ⟨Γ′0, Γpink, ϕg⟩ where Γ′0 contains the single formula:

ϕ′0 = ∼v ∧ ∼l ∧ r ∧ ∼s ∧ ∼d     (25)

that describes the corresponding initial situation. It can be checked that (22) is also a conditional solution for Π′, though this example also has the simpler (conformant) solution:

move ; take_right ; move     (26)

Finally, Example 10 becomes the task Π = ⟨Γ0, Γ′pink, ϕg⟩ for which

move ; try_take_right ; try_take_left ; move     (27)

is a conformant solution.
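As a companion to Definitions 19 and 20, the following fragment extends the earlier sketch (same illustrative names Plan, Atom, Skip, Seq, IfK, translate, plan_22; it is meant to be appended to that file). It represents a planning task as a triple, distinguishes conformant from conditional plans syntactically, and builds the formula that mwv(Γ0, F) would have to satisfy; the entailment check itself depends on the DEL[ASP] semantics and is left abstract here as a user-supplied function.

# Continuation of the previous sketch: planning tasks (Definition 19) and
# the solution condition of Definition 20, with the entailment left abstract.
from dataclasses import dataclass
from typing import Callable

@dataclass
class PlanningTask:
    gamma_0: str      # theory describing the initial state, e.g. "{(16)}"
    gamma: str        # theory interpreting the actions, e.g. "Γpink"
    goal: str         # goal formula over fluents, e.g. "(∼v ∧ d)"

def is_conformant(plan: "Plan") -> bool:
    """A plan is conformant when it contains no if-then-else construct."""
    if isinstance(plan, (Atom, Skip)):
        return True
    if isinstance(plan, Seq):
        return is_conformant(plan.first) and is_conformant(plan.second)
    return False      # an IfK node makes the plan conditional

def solution_formula(task: PlanningTask, plan: "Plan") -> str:
    """Formula that mwv(Γ0, F) must satisfy for `plan` to solve `task`."""
    return translate(plan, task.gamma, task.goal)

def is_solution(task: PlanningTask, plan: "Plan",
                entails: Callable[[str, str], bool]) -> bool:
    # `entails(initial_theory, formula)` stands for mwv(Γ0, F) |= formula
    return entails(task.gamma_0, solution_formula(task, plan))

# The task of Example 1: initial theory {(16)}, action theory Γpink, goal ∼v ∧ d.
task = PlanningTask("{(16)}", "Γpink", "(∼v ∧ d)")
print(is_conformant(plan_22))             # False: (22) is a conditional plan
print(solution_formula(task, plan_22))    # formula to check against mwv(Γ0, F)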
Conclusions and Future Work

As discussed in [62], the traditional DEL[E] approach with event model updates is a semantic approach, where states and actions are represented as semantic objects, epistemic and event models respectively. On the other hand, DEL[ASP] is a syntactic approach, where states and actions are represented as knowledge bases, that is, sets of formulas known to be true. Semantic and syntactic approaches are mutually dual: the semantic approach models ignorance (the more ignorance, the bigger the state), whereas the syntactic approach models knowledge (the more knowledge, the bigger the knowledge base). The generalisation of DEL for abstract updating objects can easily accommodate both approaches: it suffices to allow both event models and epistemic programs to occur in the dynamic operator, and to select the corresponding updating evaluation.

Another interesting observation is that both DEL[E] and ASP can be considered as generalisations of the STRIPS planning language in orthogonal directions. On the one hand, DEL[E] allows planning in domains where the world is not fully observable, the effects of actions are not necessarily deterministic, and where sensing actions may allow the agent to gain knowledge about the actual state of the world. On the other hand, ASP introduces high-level KR features like the treatment of indirect effects, action qualifications, state constraints or recursive fluents (for motivation about the need for such features we refer to [63]). The approach presented here, DEL[ASP], combines the strengths of both generalisations, so that it is possible to use high-level KR features in non-fully observable or non-deterministic domains where observing the world may be needed to achieve a valid plan.

Similar to our approach, the action language mAL [64] also combines the treatment of indirect effects and action qualifications with the possibility of defining sensing actions. The main handicap of mAL with respect to DEL[ASP] is that the former only allows ramifications on the ontic effects, but not on the epistemic ones, as we did for instance with rule (15). In mAL, as in DEL[E], this indirect observation needs to be encoded as a direct effect of all actions that may affect those fluents. On the other hand, an advantage of both DEL[E] and mAL is that they can be applied to domains that involve several agents and in which those agents may even hold false beliefs [22], while, so far, DEL[ASP] is only able to deal with domains involving a single agent. Extending DEL[ASP] to cover these domains is a matter of future work. It will also be interesting to study the relation between DEL[ASP] and Temporal ASP [57] and the possibility of extending the latter with an epistemic modality to deal with non-fully observable or non-deterministic domains.

Regarding the computation of planning solutions in DEL[ASP], it is worth mentioning that the algorithm based on planning trees described in [21] for DEL[E] is general enough and does not really depend on the kind of updating object used. In this sense, we can apply that same algorithm with the only variation of using the ASP updating evaluation when we expand the tree. Then, solutions can be retrieved from the planning tree in exactly the same way as described there.

Acknowledgements

This work has been partially funded by the Centre International de Mathématiques et d'Informatique de Toulouse through contract ANR-11-LABEX-0040-CIMI within the program ANR-11-IDEX-0002-02, grant 2016-2019 ED431G/01 CITIC Center (Xunta de Galicia, Spain), and grant TIN 2017-84453-P (MINECO, Spain).

References

[1] M. Ghallab, D. S. Nau, P. Traverso, Automated planning - theory and practice, Elsevier, 2004.
[2] P. H. Tu, T. C. Son, C. Baral, Reasoning and planning with sensing actions, incomplete information, and static causal laws using answer set programming, Theory and Practice of Logic Programming 7 (2007) 377–450.
[3] M. A. Peot, D. E. Smith, Conditional nonlinear planning, in: Artificial Intelligence Planning Systems, Elsevier, 1992, pp. 189–197.
[4] K. Golden, D. S. Weld, Representing sensing actions: The middle ground revisited, in: KR, Morgan Kaufmann, 1996, pp. 174–185.
[5] L. Pryor, G. Collins, Planning for contingencies: A decision-based approach, J. Artif. Intell. Res. 4 (1996) 287–339.
[6] J. Lobo, G. Mendez, S. R. Taylor, Adding knowledge to the action description language A, in: B. Kuipers, B. L. Webber (Eds.), Proceedings of the Fourteenth National Conference on Artificial Intelligence and Ninth Innovative Applications of Artificial Intelligence Conference, AAAI 97, IAAI 97, July 27-31, 1997, Providence, Rhode Island, USA, AAAI Press / The MIT Press, 1997, pp. 454–459.
[7] A. L. Blum, M. L. Furst, Fast planning through planning graph analysis, Artificial Intelligence 90 (1997) 281–300.
[8] D. E. Smith, D. S. Weld, Conformant graphplan, in: AAAI/IAAI, 1998, pp. 889–896.
[9] K. Golden, Leap before you look: Information gathering in the PUCCINI planner, in: AIPS, AAAI, 1998, pp. 70–77.
[10] D. S. Weld, C. R. Anderson, D. E. Smith, Extending graphplan to handle uncertainty & sensing actions, in: AAAI/IAAI, 1998, pp. 897–904.
[11] J. Rintanen, Constructing conditional plans by a theorem-prover, Journal of Artificial Intelligence Research 10 (1999) 323–352.
[12] B. Bonet, H. Geffner, Planning with incomplete information as heuristic search in belief space, in: Proceedings of the Fifth International Conference on Artificial Intelligence Planning Systems, AAAI Press, 2000, pp. 52–61.
[13] C. Castellini, E. Giunchiglia, A. Tacchella, SAT-based planning in complex domains: Concurrency, constraints and nondeterminism, Artificial Intelligence 147 (2003) 85–117.
[14] T. Eiter, W. Faber, N. Leone, G. Pfeifer, A. Polleres, A logic programming approach to knowledge-state planning, II: The DLVK system, Artificial Intelligence 144 (2003) 157–211.
[15] A. Cimatti, M. Roveri, P. Bertoli, Conformant planning via symbolic model checking and heuristic search, Artificial Intelligence 159 (2004) 127–206.
[16] D. Bryce, S. Kambhampati, D. E. Smith, Planning graph heuristics for belief space search, Journal of Artificial Intelligence Research 26 (2006) 35–99.
[17] J. Hoffmann, R. I. Brafman, Conformant planning via heuristic forward search: A new approach, Artificial Intelligence 170 (2006) 507–541.
[18] B. Löwe, E. Pacuit, A. Witzel, Planning based on dynamic epistemic logic, Technical Report, Citeseer, 2010.
[19] T. Bolander, M. B. Andersen, Epistemic planning for single and multi-agent systems, Journal of Applied Non-Classical Logics 21 (2011) 9–34.
[20] E. Pontelli, T. C. Son, C. Baral, G. Gelfond, Answer set programming and planning with knowledge and world-altering actions in multiple agent domains, in: Correct Reasoning, Springer, 2012, pp. 509–526.
[21] M. B. Andersen, T. Bolander, M. H. Jensen, Conditional epistemic planning, in: JELIA, volume 7519 of Lecture Notes in Computer Science, Springer, 2012, pp. 94–106.
[22] T. Bolander, Seeing is believing: Formalising false-belief tasks in dynamic epistemic logic, in: ECSI, volume 1283 of CEUR Workshop Proceedings, CEUR-WS.org, 2014, pp. 87–107.
[23] F. Kominis, H. Geffner, Beliefs in multiagent planning: From one agent to many, in: R. I. Brafman, C. Domshlak, P. Haslum, S. Zilberstein (Eds.), Proceedings of the Twenty-Fifth International Conference on Automated Planning and Scheduling, ICAPS 2015, Jerusalem, Israel, June 7-11, 2015, AAAI Press, 2015, pp. 147–155.
[24] F. Kominis, H. Geffner, Multiagent online planning with nested beliefs and dialogue, in: L. Barbulescu, J. Frank, Mausam, S. F. Smith (Eds.), Proceedings of the Twenty-Seventh International Conference on Automated Planning and Scheduling, ICAPS 2017, Pittsburgh, Pennsylvania, USA, June 18-23, 2017, AAAI Press, 2017, pp. 186–194.
[25] M. C. Cooper, A. Herzig, F. Maffre, F. Maris, P. Régnier, Simple epistemic planning: Generalised gossiping, in: G. A. Kaminka, M. Fox, P. Bouquet, E. Hüllermeier, V. Dignum, F. Dignum, F. van Harmelen (Eds.), ECAI 2016 - 22nd European Conference on Artificial Intelligence, 29 August-2 September 2016, The Hague, The Netherlands, volume 285 of Frontiers in Artificial Intelligence and Applications, IOS Press, 2016, pp. 1563–1564.
[26] M. C. Cooper, A. Herzig, F. Maffre, F. Maris, P. Régnier, A simple account of multi-agent epistemic planning, in: G. A. Kaminka, M. Fox, P. Bouquet, E. Hüllermeier, V. Dignum, F. Dignum, F. van Harmelen (Eds.), ECAI 2016 - 22nd European Conference on Artificial Intelligence, 29 August-2 September 2016, The Hague, The Netherlands, volume 285 of Frontiers in Artificial Intelligence and Applications, IOS Press, 2016, pp. 193–201.
[27] T. Bolander, T. Engesser, R. Mattmüller, B. Nebel, Better eager than lazy? How agent types impact the successfulness of implicit coordination, in: M. Thielscher, F. Toni, F. Wolter (Eds.), Principles of Knowledge Representation and Reasoning: Proceedings of the Sixteenth International Conference, KR 2018, Tempe, Arizona, 30 October - 2 November 2018, AAAI Press, 2018, pp. 445–453.
[28] R. C. Moore, A formal theory of knowledge and action, Technical Report, SRI International, Menlo Park, CA, Artificial Intelligence Center, 1984.
[29] M. Thielscher, Representing the knowledge of a robot, in: KR, 2000, pp. 109–120.
[30] T. C. Son, C. Baral, Formalizing sensing actions—a transition function based approach, Artificial Intelligence 125 (2001) 19–91.
[31] R. B. Scherl, H. J. Levesque, Knowledge, action, and the frame problem, Artificial Intelligence 144 (2003) 1–39.
[32] H. van Ditmarsch, W. van der Hoek, B. Kooi, Dynamic epistemic logic, volume 337, Springer Science & Business Media, 2007.
[33] H. van Ditmarsch, B. Kooi, Semantic results for ontic and epistemic change, Logic and the foundations of game and decision theory (LOFT 7) 3 (2008) 87–117.
[34] V. Pratt, Semantical considerations on Floyd-Hoare logic, in: Proceedings of the Seventeenth Annual Symposium on Foundations of Computer Science (SFCS'76), IEEE Computer Society Press, 1976, pp. 109–121.
[35] H. P. van Ditmarsch, W. van der Hoek, B. P. Kooi, Dynamic epistemic logic with assignment, in: Proceedings of the Fourth International Joint Conference on Autonomous Agents and Multiagent Systems, ACM, 2005, pp. 141–148.
[36] A. Baltag, L. S. Moss, S. Solecki, The logic of public announcements and common knowledge and private suspicions, in: TARK, Morgan Kaufmann, 1998, pp. 43–56.
[37] A. Baltag, L. S. Moss, Logics for epistemic programs, Synthese 139 (2004) 165–224.
[38] R. E. Fikes, N. J. Nilsson, STRIPS: A new approach to the application of theorem proving to problem solving, Artificial Intelligence 2 (1971) 189–208.
[39] V. W. Marek, M. Truszczyński, Stable models and an alternative logic programming paradigm, in: K. R. Apt, V. W. Marek, M. Truszczyński, D. Warren (Eds.), The Logic Programming Paradigm, Artificial Intelligence, Springer Berlin Heidelberg, 1999, pp. 375–398.
[40] I. Niemelä, Logic programs with stable model semantics as a constraint programming paradigm, Annals of Mathematics and Artificial Intelligence 25 (1999) 241–273.
[41] C. Baral, Knowledge Representation, Reasoning and Declarative Problem Solving, Cambridge University Press, 2010.
[42] M. Gelfond, V. Lifschitz, The stable model semantics for logic programming, in: Proc. of the 5th Intl. Conference on Logic Programming (ICLP'88), 1988, pp. 1070–1080.
[43] M. Gelfond, V. Lifschitz, Classical negation in logic programs and disjunctive databases, New Generation Comput. 9 (1991) 365–386.
[44] V. Lifschitz, Answer set planning, in: D. de Schreye (Ed.), Proceedings of the International Conference on Logic Programming (ICLP'99), MIT Press, 1999, pp. 23–37.
[45] V. Lifschitz, Answer set programming and plan generation, Artificial Intelligence 138 (2002) 39–54. Knowledge Representation and Logic Programming.
[46] J. Lee, V. Lifschitz, F. Yang, Action language BC: preliminary report, in: F. Rossi (Ed.), IJCAI 2013, Proceedings of the 23rd International Joint Conference on Artificial Intelligence, Beijing, China, August 3-9, 2013, IJCAI/AAAI, 2013, pp. 983–989.
[47] P. Cabalar, R. Kaminski, T. Schaub, A. Schuhmann, Temporal answer set programming on finite traces, Theory and Practice of Logic Programming 18 (2018) 406–420.
[48] E. Erdem, M. Gelfond, N. Leone, Applications of ASP, AI Magazine 37 (2016) 53–68.
[49] M. Gelfond, Strong introspection, in: T. L. Dean, K. McKeown (Eds.), Proceedings of the AAAI Conference, volume 1, AAAI Press/The MIT Press, 1991, pp. 386–391.
[50] A. P. Leclerc, P. T. Kahl, A survey of advances in epistemic logic program solvers, CoRR abs/1809.07141 (2018). URL: http://arxiv.org/abs/1809.07141. arXiv:1809.07141.
[51] P. Cabalar, J. Fandinno, L. Fariñas del Cerro, Founded world views with autoepistemic equilibrium logic, Under consideration for 15th International Conference on Logic Programming and Nonmonotonic Reasoning (LPNMR 2019), CoRR abs/1902.07741 (2019). URL: https://arxiv.org/abs/1902.07741.
[52] D. Pearce, A new logical characterisation of stable models and answer sets, in: J. Dix, L. M. Pereira, T. C. Przymusinski (Eds.), Non-Monotonic Extensions of Logic Programming, NMELP 1996, Bad Honnef, Germany, September 5-6, 1996, Selected Papers, volume 1216 of Lecture Notes in Computer Science, Springer, 1996, pp. 57–70.
[53] R. C. Moore, Semantical considerations on nonmonotonic logic, Artificial Intelligence 25 (1985) 75–94.
[54] G. Aucher, An internal version of epistemic logic, Studia Logica 94 (2010) 1–22.
[55] R. Fagin, Reasoning about knowledge, MIT Press, 1995.
[56] J. Y. Halpern, Reasoning about knowledge: an overview, in: Theoretical aspects of reasoning about knowledge, Elsevier, 1986, pp. 1–17.
[57] P. Cabalar, R. Kaminski, T. Schaub, A. Schuhmann, Temporal answer set programming on finite traces, Theory and Practice of Logic Programming 18 (2018) 406–420.
[58] H. Kautz, The logic of persistence, in: Proceedings of the 5th National Conference of Artificial Intelligence, 1986, pp. 401–405.
[59] D. Nelson, Constructible falsity, J. Symbolic Logic 14 (1949) 16–26.
[60] V. Lifschitz, L. R. Tang, H. Turner, Nested expressions in logic programs, Ann. Math. Artif. Intell. 25 (1999) 369–389.
[61] A. Heyting, Die formalen Regeln der intuitionistischen Logik, in: Sitzungsberichte der Preussischen Akademie der Wissenschaften, Deutsche Akademie der Wissenschaften zu Berlin, 1930, pp. 42–56. Reprint in Logik-Texte: Kommentierte Auswahl zur Geschichte der Modernen Logik, Akademie-Verlag, 1986.
[62] T. Bolander, A gentle introduction to epistemic planning: The DEL approach, in: M4M@ICLA, volume 243 of EPTCS, 2017, pp. 1–22.
[63] V. Lifschitz, Answer set programming and plan generation, Artificial Intelligence 138 (2002) 39–54.
[64] C. Baral, G. Gelfond, E. Pontelli, T. C. Son, Reasoning about the beliefs of agents in multi-agent domains in the presence of state constraints: The action language mAL, in: J. Leite, T. C. Son, P. Torroni, L. van der Torre, S. Woltran (Eds.), Computational Logic in Multi-Agent Systems, Springer Berlin Heidelberg, Berlin, Heidelberg, 2013, pp. 290–306.