Non-Well-Founded Set Based Multi-Agent Action Language⋆

Francesco Fabiano1, Idriss Riouak1, Agostino Dovier1, and Enrico Pontelli2
1 Dipartimento DMIF, Università di Udine, Udine, Italy
2 Computer Science Department, New Mexico State University, Las Cruces NM, USA

Abstract. As research in multi-agent domains continues to grow, it becomes increasingly important to investigate the relations between the agents in such systems: not only to reason about the agents' perception of the world, but also about each agent's knowledge of her own and of the others' knowledge. This type of study is referred to as epistemic reasoning. In certain domains, e.g., economy, security, justice and politics, reasoning about others' beliefs can lead to winning strategies or help in changing a group of agents' view of the world. In this work we formalize the epistemic planning problem with a state description based on non-well-founded set theory. This semantics permits characterizing the planning problem in terms of set operations, rather than in terms of reachability as the state of the art suggests. In doing so we hope to provide a clearer semantics and to lay the basis for exploiting properties of set-based operations in multi-agent epistemic planning.

Keywords: Epistemic Reasoning · Planning · Multi-agent · Action languages · Non-well-founded sets · Possibilities.

1 Introduction

Multi-agent planning and epistemic knowledge have recently gained attention from several research communities. Efficient autonomous systems that can reason in these domains could lead to winning strategies in various fields such as economy [3], security [4], justice [25] and politics [10], and can be exploited by self-driving cars and other autonomous devices that control several aspects of our daily life. Epistemic planners are interested not only in the state of the world but also in the knowledge (or beliefs) of the agents.
Some problems can be expressed in less expressive languages and need less powerful, and usually faster, planners: for example, [18, 24] deal with problems where dynamic common knowledge and unbounded nested knowledge, respectively, are not needed. On the other hand, to the best of our knowledge, only a few systems [21, 22] can reason about epistemic knowledge in multi-agent domains without these limitations, i.e., using the language L^C_AG presented in Section 2. Such systems, which can reason on the full extent of L^C_AG, base their concept of state on Kripke structures.

⋆ This research is partially supported by the Università di Udine PRID ENCASE project, and by GNCS-INdAM 2017 and 2019 projects.

Using a Kripke structure as state has a negative impact on the performance of the planner. First of all, to store all the necessary states, solvers require a large amount of memory. Moreover, to perform operations such as entailment or the application of the transition function, the states must be represented explicitly. That is why, as the research on epistemic reasoning advances [2, 5, 7], it is interesting to analyze alternative representations for the states that could lead to more efficient operations on the search space. In this work we formalize the epistemic planning problem where the states are represented by possibilities (introduced in Section 4), which are based on non-well-founded set theory. This representation allows us to describe the language through set-based operations and also to exploit some of the results from this field, such as the concept of bisimulation, to add important features for the multi-agent epistemic planning (MEP) community. The paper is organized as follows: Section 2 presents the concept of epistemic planning. In Section 3 we introduce what, in our opinion, is the most complete action language for MEP that bases its states on the concept of Kripke structure.
The background is then concluded in Section 4, where we describe possibilities, an interesting approach that combines non-well-founded set theory with epistemic logic. In Section 5 we introduce our possibility-based semantics of mAρ (an action language for MEP) and we show a comparison with the state-of-the-art action language mA∗. We finally conclude in Section 6 with future work.

2 Epistemic reasoning

In this section we briefly introduce the background that is necessary to understand multi-agent epistemic reasoning. As defined in [16, 27], where the concept of multi-agent propositional epistemic logic is fully explored, epistemic logic is the logic of the knowledge and belief that different agents have about the world and about the beliefs of each other. Let AG be a set of agents and let F be a set of propositional variables, called fluents. Each world is described by a subset of elements of F (intuitively, those that are "true" in the world); we also refer to each world as an interpretation. With each agent ag ∈ AG we associate a modal operator Bag (where B stands for belief), and we represent the beliefs of an agent as belief formulae in a logic extended with these operators. Moreover, group operators are also introduced in epistemic logic, such as Eα and Cα, that intuitively represent the belief of a group of agents α and the common knowledge of α, respectively. To be more precise, as in [5], we have that:

Definition 1 (fluent formulae, atoms and literals). A fluent formula is a propositional formula built using the propositional variables in F and the traditional propositional operators ∧, ∨, ⇒, ¬, etc. We will use ⊤ and ⊥ to indicate True and False, respectively. A fluent atom is a formula composed of just an element f ∈ F, while a fluent literal is either a fluent atom f ∈ F or its negation ¬f.

Definition 2 (belief formula).
A belief formula is defined as follows:
– a fluent formula is a belief formula;
– let ϕ be a belief formula and ag ∈ AG; then Bag ϕ is a belief formula;
– let ϕ1, ϕ2 and ϕ3 be belief formulae; then ¬ϕ3 and ϕ1 op ϕ2 are belief formulae, where op ∈ {∧, ∨, ⇒};
– all the formulae of the form Eα ϕ or Cα ϕ are belief formulae, where ϕ is itself a belief formula and ∅ ≠ α ⊆ AG.

Let us denote with L^C_AG the language of the belief formulae over the sets F and AG. On the other hand, we define L_AG as the language over belief formulae that does not allow the use of C. In [14], it is pointed out how these two languages differ in expressiveness and in complexity.

Example 1. Let us consider the formula Bag1 Bag2 ϕ. This formula expresses that agent ag1 believes that agent ag2 believes that ϕ is true; instead, Bag1 ¬ϕ expresses that agent ag1 believes that ϕ is false.

The classical way of providing a semantics for the language of epistemic logic is in terms of Kripke models [20]:

Definition 3 (Kripke structure). A Kripke structure is a tuple ⟨S, π, B1, ..., Bn⟩ such that:
– S is a set of worlds;
– π : S → 2^F is a function that associates an interpretation of F to each element of S;
– for 1 ≤ i ≤ n, Bi ⊆ S × S is a binary relation over S.

A pointed Kripke structure is a pair (M, s) where M is a Kripke structure as defined above and s ∈ S, where s represents the real world. As in [5], we will refer to a pointed Kripke structure (M, s) as a state. Following the notation of [5], we will indicate with M[S], M[π], and M[i] the components S, π, and Bi of M, respectively.

Definition 4 (entailment w.r.t. a Kripke structure).
Given the belief formulae ϕ, ϕ1, ϕ2, an agent agi, a group of agents α, a Kripke structure M = ⟨S, π, B1, ..., Bn⟩, and the worlds s, t ∈ S:
(i) (M, s) |= ϕ if ϕ is a fluent formula and π(s) |= ϕ;
(ii) (M, s) |= Bagi ϕ if for each t such that (s, t) ∈ Bi it holds that (M, t) |= ϕ;
(iii) (M, s) |= ¬ϕ if (M, s) ̸|= ϕ;
(iv) (M, s) |= ϕ1 ∨ ϕ2 if (M, s) |= ϕ1 or (M, s) |= ϕ2;
(v) (M, s) |= ϕ1 ∧ ϕ2 if (M, s) |= ϕ1 and (M, s) |= ϕ2;
(vi) (M, s) |= Eα ϕ if (M, s) |= Bagi ϕ for all agi ∈ α;
(vii) (M, s) |= Cα ϕ if (M, s) |= Eα^k ϕ for every k ≥ 0, where Eα^0 ϕ = ϕ and Eα^(k+1) ϕ = Eα(Eα^k ϕ).

We will not describe the properties of Kripke structures any further, since they are not strictly needed to present the contribution of this paper. The reader interested in a more detailed description can refer to [14].

3 The action language mA∗

The mA∗ [5] action language is a generalization of the single-agent action languages, extensively studied in the literature [8, 26], to the case of multi-agent domains for epistemic planning. The language has a declarative, English-like syntax and an event-model-based semantics that permits reasoning about beliefs. The semantics of mA∗ is based on the assumption that agents are truthful. The language is built over a signature (AG, F, A), where AG is a finite set of agent identifiers, F is a set of fluents, and A is a set of actions. We will introduce only the basic features of mA∗; the remaining details about the language can be found in [5]. The following will be used as a working example throughout the paper:

Example 2 (Three Agents and the Coin Box). Three agents, A, B, and C, are in a room; in the middle of the room there is a box containing a coin.
It is common knowledge that:
– none of the agents knows whether the coin lies heads or tails up;
– the box is locked and one needs a key to open it;
– agent A has the key of the box;
– in order to learn whether the coin lies heads or tails up, an agent can peek into the box, but this requires the box to be open;
– if one agent is looking at the box and a second agent peeks into the box, then the first agent will observe this fact and will be able to conclude that the second agent knows the status of the coin; on the other hand, the first agent's knowledge about which face of the coin is up does not change;
– distracting an agent causes her not to look at the box;
– signaling an agent to look at the box causes that agent to look at the box;
– announcing that the coin lies heads or tails up makes this common knowledge among the agents that are listening.

Agent A would like to know whether the coin lies heads or tails up. She would also like to let agent B know that she knows this fact. However, she would like to keep this information secret from C. This can be achieved by: i) distracting C from looking at the box; ii) signaling B to look at the box if B is not looking at it; iii) opening the box; and iv) peeking into the box. For this domain we have that AG = {A, B, C}, while the set of fluents F consists of:
– heads: the coin lies heads up;
– key(ag): agent ag has the key of the box;
– opened: the box is open; and
– look(ag): agent ag is looking at the box.

Let ag ∈ AG; the set of actions A comprises:
– open: an agent opens the box;
– peek: an agent peeks into the box;
– signal(ag): an agent signals agent ag to look at the box;
– distract(ag): an agent distracts agent ag;
– shout tails: an agent announces that the coin lies tails up.
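For concreteness, the signature (AG, F, A) of Example 2 can be spelled out as plain data. The Python encoding below is only an illustrative sketch; the identifier names are ours and are not part of the mA∗ syntax.

```python
# Signature (AG, F, A) of the "Three Agents and the Coin Box" domain (Example 2).
# Illustrative encoding only; names are ours, not part of mA*.

AGENTS = {"A", "B", "C"}

# Fluents: heads, opened, and the parametric key(ag) and look(ag).
FLUENTS = ({"heads", "opened"}
           | {f"key({ag})" for ag in AGENTS}
           | {f"look({ag})" for ag in AGENTS})

# Actions, including the parametric signal(ag) and distract(ag).
ACTIONS = ({"open", "peek", "shout_tails"}
           | {f"signal({ag})" for ag in AGENTS}
           | {f"distract({ag})" for ag in AGENTS})

# Action instances AI = A x AG, written action<ag> in the paper.
ACTION_INSTANCES = {(a, ag) for a in ACTIONS for ag in AGENTS}

print(len(FLUENTS))           # 8 fluents
print(len(ACTION_INSTANCES))  # 27 action instances
```

Even in this tiny domain, the set of action instances is already the full product A × AG; this is why limiting the number of agents or fluents, as discussed later, has such an impact on the search space.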
In [5], the authors distinguish three types of actions, as follows (some examples of action execution can be found in the Appendix):
– World-altering actions (also called ontic): used to modify certain properties (i.e., fluents) of the world, e.g., the actions open or distract(ag) of Example 2.
– Sensing actions: used by an agent to refine her beliefs about the world, e.g., the action peek.
– Announcement actions: used by an agent to affect the beliefs of other agents, e.g., the action shout tails of Example 2.

Given an action instance a ∈ AI, where AI is the set A × AG of all the possible action instances, a fluent literal f ∈ F, a fluent formula φ and a belief formula ϕ, we can quickly introduce the syntax adopted in mA∗. Executability conditions are captured by statements of the form:
executable a if ϕ
For ontic actions we have:
a causes f if ϕ
Sensing action statements have the form:
a determines f
Finally, announcement actions are expressed as follows:
a announces φ.

In multi-agent domains the execution of an action might or might not change the beliefs of an agent. This is because, in such domains, each action instance associates an observability relation to each agent. For example, agent C, who becomes oblivious after being distracted by agent A, is not able to see the execution of the action open⟨A⟩. On the other hand, watching an agent execute a sensing or an announcement action can change the beliefs of the watcher; e.g., agent B, who is watching agent A sensing the status of the coin, will know that A knows the status of the coin without knowing the status herself. Table 1 summarizes the possible observability relations for each type of action.

Action type    | Full observers | Partial observers | Oblivious
World-altering | ✓              | ✗                 | ✓
Sensing        | ✓              | ✓                 | ✓
Announcement   | ✓              | ✓                 | ✓

Table 1: Action types and observability relations.
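The constraint of Table 1 can also be checked mechanically when building a domain. The sketch below (naming is ours) validates that an assignment of observability classes to agents is admissible for a given action type:

```python
# Admissible observability classes per action type (cf. Table 1).
# World-altering actions admit no partial observers: an agent aware of
# an ontic action must also know its effects on the world.
ADMISSIBLE = {
    "world_altering": {"full", "oblivious"},
    "sensing":        {"full", "partial", "oblivious"},
    "announcement":   {"full", "partial", "oblivious"},
}

def frame_ok(action_type: str, observability: dict) -> bool:
    """Check that every agent's observability class is admissible
    for the given action type."""
    return all(cls in ADMISSIBLE[action_type]
               for cls in observability.values())

# open<A> after distract(C)<A>: C is oblivious, A and B are full observers.
print(frame_ok("world_altering",
               {"A": "full", "B": "full", "C": "oblivious"}))   # True
# A partial observer of an ontic action is not admissible.
print(frame_ok("world_altering",
               {"A": "full", "B": "partial", "C": "oblivious"}))  # False
```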
Partial observability for world-altering actions is not admitted since, whenever an agent is aware of the execution of an ontic action, she must know its effects on the world as well. For brevity, we refer the reader to [5] for the definition of the transition function in mA∗.

4 Non-well-founded sets and Possibilities

This section defines the concept of possibility (originally introduced in [15]), based on non-well-founded set theory, and aims to provide the reader with enough information to understand what possibilities are. For a more thorough introduction the reader is referred to [1, 6, 16], and to [12] for a logic programming point of view on non-well-founded sets and their equivalence.

4.1 Non-well-founded set theory fundamentals

We start by giving some fundamental definitions of non-well-founded set theory. First of all, a well-founded set is described in [1] as follows:

Definition 5 (well-founded set). Let E be a set, E′ one of its elements, E″ any element of E′, and so on. A descent is the sequence of steps from E to E′, E′ to E″, etc. A set is well-founded (or ordinary) when it only gives rise to finite descents.

Well-founded set theory states that all the sets in the sense of Definition 5 can be represented in the form of graphs, called pictures (as shown in Figure 1). To formalize this concept of 'picture of a set', however, it is necessary to introduce the concept of decoration:

Definition 6 (decoration and picture).
– A decoration of a graph G = (V, E) is a function δ that assigns to each node n ∈ V a set δn in such a way that the elements of δn are exactly the sets assigned to the successors of n, i.e., δn = {δn′ | (n, n′) ∈ E}.
– If δ is a decoration of a pointed graph (G, n), then (G, n) is a picture of the set δn.

[Figure 1a: pictures of the von Neumann ordinals, where 0 = ∅; 1 = {∅}; 2 = {∅, {∅}}; 3 = {∅, {∅}, {∅, {∅}}}. Figure 1b: alternative pictures of the von Neumann ordinals 2 and 3.]
Figure 1: Well-founded sets represented through graphs [1].

Moreover, in well-founded set theory, Mostowski's lemma holds: "each well-founded graph(3) is a picture of exactly one set". On the other hand, following [1], a non-well-founded, or extraordinary set in the sense of Mirimanoff, is a set that respects Definition 7.

Definition 7 (non-well-founded set). A set is non-well-founded (or extraordinary) when among its descents there are some which are infinite.

In fact, when the Foundation Axiom(4) is substituted by the Anti-Foundation Axiom (AFA), expressed by Aczel in [1] as "Every graph has a unique decoration", the following consequences become true:
– every graph is a picture of exactly one set (AFA as it is formulated in [16]);
– non-well-founded sets exist, given that a non-well-founded pointed graph has to be a picture of a non-well-founded set.

(3) A well-founded graph is a graph that does not contain an infinite path n → n′ → n″ → ... of successors.
(4) Expressed in [16] as "Only well-founded graphs have decorations".

[Figure 2a: standard picture of Ω. Figure 2b: unfolding of the picture of Ω.]
Figure 2: Representation of the non-well-founded set Ω = {Ω} [1].

In [1, 16] it is pointed out how non-well-founded sets can also be expressed through systems of equations; this concept will help us formalize the notion of state in our action language. A quick example of this representation is given by the set Ω = {Ω} (Figure 2): we can, in fact, informally define this set by the (singleton) system of equations x = {x}. Systems of equations and their solutions are described more formally in [16] as follows:

Definition 8 (system of equations). For each class of atoms(5) 𝒳, a system of equations in 𝒳 is a class τ of equations x = X, where x ∈ 𝒳 and X ⊆ 𝒳, such that τ contains exactly one equation x = X for each x ∈ 𝒳. A solution to a system of equations τ is a function δ that assigns to each x ∈ τ(𝒳)(6) a set δx such that δx = {δy | y ∈ X}, where x = X is an equation of τ.
If δ is the solution to a system of equations τ, then the set {δx | x ∈ τ(𝒳)} is called the solution set of that system. Since both graphs and systems of equations are representations for non-well-founded sets, it is natural to investigate their relationship; in particular, it is interesting to point out how from a graph G = (V, E) it is possible to construct a system of equations τ and vice versa. The nodes in G, in fact, can be taken as the set of atoms τ(𝒳) and, for each node v ∈ V, an equation is given by v = {v′ | (v, v′) ∈ E}. Since each graph has a unique decoration, each system of equations has a unique solution. This is also true when we consider bisimilar systems of equations: in fact, we can collapse them into their minimal representation thanks to the concept of maximum bisimulation, as introduced in [12]. Bisimilar labeled graphs (or Kripke structures) therefore have a unique solution as well, since we collapse their representations into the minimal one. This idea will be further expanded in Section 5.2.

4.2 Possibilities

Let us introduce the notion of possibility as in [15]:

Definition 9 (possibilities). Let AG be a set of agents and F a set of propositional variables:
– A possibility u is a function that assigns to each propositional variable f ∈ F a truth value u(f) ∈ {0, 1} and to each agent ag ∈ AG an information state u(ag) = σ.
– An information state σ is a set of possibilities.

In Section 5.1 we will use this concept to describe a 'state' of the planning problem. The intuition behind this idea is that a possibility u is a possible interpretation of the world and of the agents' beliefs; in fact, u(f) specifies the truth value of the fluent f in u, and u(A) is the set of all the interpretations that agent A considers possible in u. Moreover, a possibility can be pictured as a decoration of a labeled graph and therefore as the unique solution to a system of equations for possibilities.
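For a well-founded graph, the unique decoration of Definition 6 (the content of Mostowski's lemma) can be computed by structural recursion on the successors. Below is a minimal Python sketch on the picture of the ordinal 3 from Figure 1; the graph encoding is ours.

```python
# Pointed graph for the von Neumann ordinal 3 = {0, 1, 2} (cf. Figure 1):
# EDGES maps each node to the set of its successors.
EDGES = {
    "3": {"0", "1", "2"},
    "2": {"0", "1"},
    "1": {"0"},
    "0": set(),
}

def decoration(node: str, edges=EDGES) -> frozenset:
    """Unique decoration of a well-founded graph:
    delta(n) = { delta(n') | (n, n') in E }."""
    return frozenset(decoration(s, edges) for s in edges[node])

zero = decoration("0")  # frozenset(): the empty set
one = decoration("1")   # {0}
two = decoration("2")   # {0, {0}}
assert decoration("3") == frozenset({zero, one, two})  # 3 = {0, 1, 2}
```

On a non-well-founded graph such as the one for x = {x}, this recursion would not terminate: AFA still guarantees a unique solution there, but computing it requires coinductive reasoning (bisimulation) rather than recursion, which is exactly the direction taken in the rest of the section.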
(5) Objects that are not sets and have no further set-theoretic structure.
(6) τ(𝒳) denotes the class of atoms 𝒳 over which τ is described.

A possibility represents the solution to the minimal system of equations into which all bisimilar systems of equations are collapsed; that is, the possibilities that represent decorations of bisimilar labeled graphs are bisimilar and can be represented by the minimal one. This shows that a whole class of bisimilar labeled graphs and, therefore, of bisimilar Kripke structures, used by mA∗ as states, can be represented by a single possibility.

Definition 10 (equations for possibilities). Given a set of agents AG and a set of propositional variables F, a system of equations for possibilities in a class of possibilities 𝒳 is a set of equations such that for each x ∈ 𝒳 there exists exactly one equation of the form x(f) = i, where i ∈ {0, 1}, for each f ∈ F, and of the form x(ag) = X, where X ⊆ 𝒳, for each ag ∈ AG. A solution to a system of equations for possibilities is a function δ that assigns to each atom x a possibility δx in such a way that if x(f) = i is an equation then δx(f) = i, and if x(ag) = σ is an equation then δx(ag) = {δy | y ∈ σ}.

5 The action language mAρ

The research on multi-agent epistemic domains, both in logic and in planning, already comprises several theoretical studies [2, 5, 7, 13, 14, 16, 17, 23, 27, 28] as well as a variety of solvers [18, 19, 21, 22, 24, 29], even if, to the best of our knowledge, only [13, 21] can reason without limitations on domains described by L^C_AG. Nevertheless, multi-agent epistemic solvers still have to reason on domains where the number of fluents and/or agents is limited; this keeps the length of the solutions of the planning problems small enough not to require an excessive quantity of resources (i.e., time and memory). For these reasons, the demand for computational resources (for example, with respect to classical planning) is one of the central problems in MEP.
To reduce this gap, several approaches can be used: i) as in [18, 19, 24, 29], the planning domain can be limited to a less expressive class of problems; on the other hand, when generality is required, ii) heuristics can effectively speed up the resolution process, as shown in [21]; finally, another approach is iii) to consider alternative representations to Kripke structures, which is what mAρ tries to do. Changing the structure is especially important because a different state representation can lead to a better use of the resources, and its properties can be exploited to introduce important functionalities; e.g., mAρ can rely on the concept of bisimulation, which is an equality criterion for non-well-founded sets, to introduce the notion of visited states.

5.1 State

As our main contribution we introduce a modified version of mA∗, called mAρ. The difference lies in how a state is defined: in mA∗ a state is represented as a Kripke structure, while in mAρ a state is a possibility (Section 4.2). For simplicity, we maintain the syntax used in mA∗. The strict connection between these two structures is highlighted in [16] by the fact that a solution to a system of equations for possibilities (Definition 10) represents a decoration of a labeled graph, which is essentially a Kripke model. In [16], it is also stated that a "possibility corresponds with a whole class of bisimilar, but structurally different, Kripke models". Let us define the usage of possibilities as states in an MEP domain where the set of agents is AG and the set of fluents is F. A possibility is a function that assigns to each propositional variable f ∈ F a truth value u(f) ∈ {0, 1} and to each agent ag ∈ AG a set of possibilities u(ag). A state in MEP has to encode the truth value of the fluents, as in classical planning, and also the beliefs of the agents about the fluents and about the beliefs themselves. We described in Section 3 how Kripke structures represent this information.
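A system of equations for possibilities can be written directly as data: each possibility name maps to the set of its true fluents and to each agent's set of plausible possibilities. The following Python sketch (encoding and names are ours) represents the initial state of Example 2, assuming the coin actually lies tails up:

```python
# Initial state of Example 2 as a system of equations for possibilities:
# every agent is looking, A has the key, the box is closed, and no agent
# knows whether the coin lies heads up.  Encoding and names are ours.
AGENTS = {"A", "B", "C"}

STATE = {
    # u: the real world; the coin lies tails up ('heads' is false).
    "u":  ({"look(A)", "look(B)", "look(C)", "key(A)"},
           {ag: {"u", "u1"} for ag in AGENTS}),
    # u1: as u, but with the coin lying heads up.
    "u1": ({"look(A)", "look(B)", "look(C)", "key(A)", "heads"},
           {ag: {"u", "u1"} for ag in AGENTS}),
}
POINTED = "u"  # the designated (real) possibility

fluents, beliefs = STATE[POINTED]
print("heads" in fluents)     # False: the coin actually lies tails up
print(sorted(beliefs["A"]))   # ['u', 'u1']: A considers both plausible
```

Note that the equations are mutually recursive (u and u1 refer to each other), which is unproblematic here precisely because, as discussed in Section 4, such circular systems have a unique solution under AFA.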
In Figure 3 we use a possibility, expanded into a system of equations, to encode a state (of the domain in Example 2). An equation is represented in the form u = {(ag1, σ), (ag2, σ′), ..., f, f′, ...}, where ag1, ag2 ∈ AG, σ, σ′ are sets of possibilities and f, f′ ∈ F. When we write (ag, σ) we mean that, in u, agent ag believes that the possibilities in σ are plausible. On the other hand, a fluent f is true in u if and only if f is present in the equation.

w  = {(ag, {w, w′}), (C, {v, v′}), look(ag), key(A), opened}
w′ = {(ag, {w, w′}), (C, {v, v′}), look(ag), key(A), opened, heads}
v  = {(A, {v, v′}), (B, {v, v′}), (C, {v, v′}), look(ag), key(A)}
v′ = {(A, {v, v′}), (B, {v, v′}), (C, {v, v′}), look(ag), key(A), heads}

where ag ∈ {A, B}.

(a) System of equations for possibilities. (b) Decoration of the pointed labeled graph (G, w) [graph omitted].
Figure 3: Representation of the possibility w after the execution of the actions distract(C)⟨A⟩; open⟨A⟩ on the initial state of Example 2. The possibility is expanded into its system of equations for clarity.

It is clear that a possibility corresponds to a decoration of a pointed labeled graph and therefore to a unique Kripke model up to bisimulation. The representation through possibilities allows, in our opinion, a clearer and more concise view of the state. That is because each state is represented by a single possibility; e.g., the state of Figure 3 is represented by w = {(ag, {w, w′}), (C, {v, v′}), look(ag), key(A), opened}, where ag ∈ {A, B}.

5.2 State equality through bisimulation

One of the reasons we chose to use possibilities as states is to exploit this new structure to introduce new functionalities in MEP. In fact, having the states described as possibilities, which are strongly related to non-well-founded sets, helps us introduce the concept of visited states, a core idea in planning.
As said before, a possibility, in the sense of a decoration, represents all the Kripke structures that are bisimilar to the decoration itself. This means that with possibilities we can exploit bisimulation to capture the idea of equality between states: given two bisimilar decorations (or labeled graphs), these, even with structural differences, are represented by the same possibility. On the other hand, this is not true when it comes to Kripke structures. This idea is best described through a graphical representation; we will therefore use Figure 4 to explain this concept. As an example, let us introduce two new actions: flip and tell(ag). The first one is an ontic action with which an agent inverts the position of the coin; the observability of this action depends on the fluents look(ag). On the other hand, tell(ag) means that an agent announces to ag the position in which she thinks the coin lies, while all the other agents are oblivious. Assuming that the coin lies tails up, and given the sequence of action instances ∆ = peek⟨B⟩; distract(B)⟨C⟩; flip⟨C⟩; tell(B)⟨C⟩(7), we show in Figure 4a the result of applying ∆ to a slightly modified initial state of Example 2 in which C_{A,B,C}(opened ∧ ¬look(A)) holds. Figure 4b shows a Kripke structure that has structural differences with respect to the one in Figure 4a. This means that these two Kripke structures represent two different states in mA∗, even if they are intuitively the same. On the contrary, if we think in terms of possibilities, both Kripke structures of Figure 4 are represented by the possibility r1 = {(A, {s0, s1}), (B, {r1}), (C, {r1}), look(C), key(A), opened, heads}.

[Graphs omitted.] (a) The resulting Kripke structure after the execution of ∆. (b) A Kripke structure bisimilar to the one in Figure 4a.
Figure 4: Bisimilar Kripke structures.
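The check "has a bisimilar state already been visited?" can be implemented over finite labeled graphs with a naive partition-refinement procedure. The sketch below is ours (it is not the maximum-bisimulation algorithm of [12], just the textbook fixpoint computation) and decides whether two worlds of a labeled graph are bisimilar:

```python
def bisimilar(worlds, val, rel, agents, s, t):
    """Decide whether worlds s and t are bisimilar, by naive partition
    refinement: start from the partition induced by the interpretations
    (val[w], hashable) and split blocks until no agent relation
    (rel[(ag, w)] = set of ag-successors of w) distinguishes two worlds
    in the same block.  Each round refines the partition, so the loop
    stabilizes after at most |worlds| rounds."""
    def renumber(sig):
        ids = {v: i for i, v in enumerate(sorted(set(sig.values()), key=repr))}
        return {w: ids[sig[w]] for w in worlds}

    block = renumber({w: val[w] for w in worlds})
    while True:
        sig = {w: (block[w],
                   tuple(tuple(sorted({block[v] for v in rel.get((ag, w), ())}))
                         for ag in sorted(agents)))
               for w in worlds}
        new = renumber(sig)
        if len(set(new.values())) == len(set(block.values())):
            return new[s] == new[t]
        block = new

# Two structurally different pictures of the same beliefs (toy structure,
# one agent): world b2 duplicates world b.
worlds = {"a", "b", "b2"}
val = {"a": "p", "b": "q", "b2": "q"}
rel = {("ag", "a"): {"b", "b2"}, ("ag", "b"): {"b"}, ("ag", "b2"): {"b2"}}
print(bisimilar(worlds, val, rel, {"ag"}, "b", "b2"))  # True: same possibility
print(bisimilar(worlds, val, rel, {"ag"}, "a", "b"))   # False: different valuations
```

Applied to the disjoint union of the two structures of Figure 4, the two pointed worlds would end up in the same block, matching the fact that both structures denote the possibility r1.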
(7) We recall that in [5] an action instance is represented as action⟨ag⟩, where ag is the agent that executes action.

Thanks to the use of possibilities we can identify bisimilar decorations, which were considered different in mA∗, and, therefore, check whether a state has already been visited. Next we define the concept of entailment and, finally, the transition function of mAρ.

5.3 Entailment

We now introduce the concept of entailment w.r.t. possibilities.

Definition 11 (entailment w.r.t. possibilities). Given the belief formulae ϕ, ϕ1, ϕ2, the fluent formulae φ, φ1, φ2, a fluent f, an agent ag, a group of agents α, and a possibility u:
(i) u |= f if u(f) = 1;
(ii) u |= ¬φ if u ̸|= φ;
(iii) u |= φ1 ∨ φ2 if u |= φ1 or u |= φ2;
(iv) u |= φ1 ∧ φ2 if u |= φ1 and u |= φ2;
(v) u |= Bag ϕ if for each v ∈ u(ag), v |= ϕ;
(vi) u |= ¬ϕ if u ̸|= ϕ;
(vii) u |= ϕ1 ∨ ϕ2 if u |= ϕ1 or u |= ϕ2;
(viii) u |= ϕ1 ∧ ϕ2 if u |= ϕ1 and u |= ϕ2;
(ix) u |= Eα ϕ if u |= Bag ϕ for all ag ∈ α;
(x) u |= Cα ϕ if u |= Eα^k ϕ for every k ≥ 0, where Eα^0 ϕ = ϕ and Eα^(k+1) ϕ = Eα(Eα^k ϕ).

5.4 Transition function

Finally, we introduce the transition function of mAρ. In defining this transition function we made some assumptions: i) we assume that the given initial state also specifies which world is the pointed one (as in [5]); this allows us to relax the description of Φ in the case of announcement or sensing actions. Moreover, ii) we do not take into consideration the case in which an agent can have false beliefs(8), because it is still an open question how to deal with them in MEP; we will try to address this problem in future work. The transition function relative to the domain D is Φ_D : AI × Σ → Σ ∪ {∅}, where Σ is the set of all the possibilities. For the sake of readability, the abbreviations used in the following definition are explained at the end of it.

Definition 12 (transition function for mAρ). Let a ∈ AI and a possibility u be given.
If a is not executable in u, then Φ_D(a, u) = ∅; otherwise, if a is executable in u:

– If a is an ontic action instance, then Φ_D(a, u) = v, where:
  • v(f) = u(f) if f ≠ caused(a), and v(f) = caused(a) if f = caused(a);
  • v(ag) = u(ag) if ag ∈ O_D, and v(ag) = ⋃_{w ∈ u(ag)} Φ_D(a, w) if ag ∈ F_D.

– If a senses the fluent f, then Φ_D(a, u) = v, where:
  • v = ∅ if sensed(a)[f] ≠ u(f);
  • v(ag) = u(ag) if ag ∈ O_D and sensed(a)[f] = u(f);
  • v(ag) = ⋃_{w ∈ u(ag)} Φ_D(a, w) if ag ∈ F_D and sensed(a)[f] = u(f);
  • v(ag) = ⋃_{w ∈ u(ag)} (Φ_D(a, w) ∪ Φ_D(¬a, w)) if ag ∈ P_D and sensed(a)[f] = u(f).

– If a announces the fluent formula φ, then Φ_D(a, u) = v, where:
  • v = ∅ if u ̸|= φ;
  • v(ag) = u(ag) if ag ∈ O_D and u |= φ;
  • v(ag) = ⋃_{w ∈ u(ag)} Φ_D(a, w) if ag ∈ F_D and u |= φ;
  • v(ag) = ⋃_{w ∈ u(ag)} (Φ_D(a, w) ∪ Φ_D(¬a, w)) if ag ∈ P_D and u |= φ.

where:
– caused(a) is the fluent literal modified by the action instance a;
– F_D, P_D, O_D identify the fully observant, partially observant and oblivious agents, respectively;
– sensed(a)[f] represents the truth value of the fluent f determined by a;
– ¬a describes the action that senses the opposite of a, namely sensed(a)[¬f].

(8) An agent ag has a false belief about ϕ in state s if s |= ϕ and s |= Bag ¬ϕ.

Note that sensing and announcement actions generate the empty set when the action effects do not respect the truth values of the fluents in the possibility where the action is executed. That is because epistemic actions (i.e., announcements and sensing) cannot change the state of the world but only the beliefs of the agents. As an example we present the execution of the sequence distract(C)⟨A⟩; open⟨A⟩ on the initial state of Example 2, encoded by the possibility u in Figure 5a.

u  = {(ag, {u, u′}), look(ag), key(A)}
u′ = {(ag, {u, u′}), look(ag), key(A), heads}
(a) The initial state u.

v  = {(ag, {v, v′}), look(A), look(B), key(A)}
v′ = {(ag, {v, v′}), look(A), look(B), key(A), heads}
(b) The state v after executing distract(C)⟨A⟩.
w  = {(A, {w, w′}), (B, {w, w′}), (C, {v, v′}), look(A), look(B), key(A), opened}
w′ = {(A, {w, w′}), (B, {w, w′}), (C, {v, v′}), look(A), look(B), key(A), opened, heads}
(c) The state w after executing distract(C)⟨A⟩; open⟨A⟩.
Figure 5: Execution of distract(C)⟨A⟩; open⟨A⟩ in Example 2 (ag ∈ {A, B, C}).

6 Conclusions and Future Works

In this paper we investigated an alternative to Kripke structures as a representation for multi-agent epistemic planning states. In doing so we presented mAρ, an action language for MEP based on possibilities, thereby grounding MEP in non-well-founded set theory. Moreover, we exploited possibilities to define a stronger concept of equality on states, collapsing all the bisimilar states into the same possibility. Finally, since mAρ makes an implicit state representation more direct, using possibilities helps in reducing the dimension of the search space. In the near future we intend to implement a planner for mAρ and to study alternative representations. In particular we plan to: i) exploit more set-based operations, especially for the entailment of group operators; ii) formalize the concept of non-consistent belief for mAρ; iii) investigate more thoroughly the connection between Kripke structures and non-well-founded sets; iv) examine the concept of bisimulation as equality between epistemic states; and finally v) consider other alternatives to Kripke structures, e.g., OBDDs [9, 11].

References

1. Aczel, P.: Non-well-founded sets. CSLI Lecture Notes, 14 (1988)
2. Aucher, G., Bolander, T.: Undecidability in epistemic planning. pp. 27–33 (2013)
3. Aumann, R., Brandenburger, A., et al.: Epistemic conditions for Nash equilibrium. Econometrica 63(5), 1161–1180 (1995)
4. Balliu, M., Dam, M., Le Guernic, G.: Epistemic temporal logic for information flow security. In: Proceedings of the ACM SIGPLAN 6th Workshop on Programming Languages and Analysis for Security, pp. 6:1–6:12. PLAS '11, ACM, New York, NY, USA (2011).
https://doi.org/10.1145/2166956.2166962
5. Baral, C., Gelfond, G., Pontelli, E., Son, T.C.: An action language for multi-agent domains: Foundations. CoRR abs/1511.01960 (2015), http://arxiv.org/abs/1511.01960
6. Barwise, J., Etchemendy, J.: The Liar: An Essay on Truth and Circularity. Oxford University Press (1987)
7. Bolander, T., Jensen, M., Schwarzentruber, F.: Complexity results in epistemic planning. In: Proc. of IJCAI. pp. 2791–2797 (2015)
8. Bolander, T., Andersen, M.B.: Epistemic planning for single- and multi-agent systems. Journal of Applied Non-Classical Logics 21(1), 9–34 (2011)
9. Bryant, R.E.: Symbolic Boolean manipulation with ordered binary-decision diagrams. ACM Computing Surveys (CSUR) 24(3), 293–318 (1992)
10. Carbonell Jr., J.G.: POLITICS: Automated ideological reasoning. Cognitive Science 2(1), 27–51 (1978)
11. Cimatti, A., Roveri, M.: Conformant planning via symbolic model checking. Journal of Artificial Intelligence Research 13, 305–338 (2000)
12. Dovier, A.: Logic programming and bisimulation. In: Vos, M.D., Eiter, T., Lierler, Y., Toni, F. (eds.) ICLP. CEUR Workshop Proceedings, vol. 1433. CEUR-WS.org (2015), http://ceur-ws.org/Vol-1433
13. van Eijck, J.: Dynamic epistemic modelling. CWI. Software Engineering [SEN] (2004)
14. Fagin, R., Halpern, J.Y.: Reasoning about knowledge and probability. Journal of the ACM (JACM) 41(2), 340–367 (1994)
15. Gerbrandy, J., Groeneveld, W.: Reasoning about information change. Journal of Logic, Language and Information 6(2), 147–169 (1997). https://doi.org/10.1023/A:1008222603071
16. Gerbrandy, J.D.: Bisimulations on Planet Kripke. Institute for Logic, Language and Computation, University of Amsterdam (1999)
17. Van der Hoek, W., Meyer, J.: A complete epistemic logic for multiple agents. In: Epistemic Logic and the Theory of Games and Decisions, pp. 35–68. Springer (1997)
18. Huang, X., Fang, B., Wan, H., Liu, Y.: A general multi-agent epistemic planner based on higher-order belief change. In: Proc. of IJCAI. pp.
1093–1101 (2017)
19. Kominis, F., Geffner, H.: Beliefs in multiagent planning: From one agent to many. In: Proc. of ICAPS. pp. 147–155 (2015)
20. Kripke, S.A.: Semantical considerations on modal logic. Acta Philosophica Fennica 16, 83–94 (1963)
21. Le, T., Fabiano, F., Son, T.C., Pontelli, E.: EFP and PG-EFP: Epistemic forward search planners in multi-agent domains. In: Proc. of the Twenty-Eighth International Conference on Automated Planning and Scheduling (ICAPS) (2018)
22. Liu, Q., Liu, Y.: Multi-agent epistemic planning with common knowledge. In: Proc. of IJCAI. pp. 1912–1920 (2018)
23. Moore, R.C.: Reasoning about knowledge and action. In: Readings in Artificial Intelligence, pp. 473–477. Elsevier (1981)
24. Muise, C.J., Belle, V., Felli, P., McIlraith, S.A., Miller, T., Pearce, A.R., Sonenberg, L.: Planning over multi-agent epistemic states: A classical planning approach. In: Proc. of AAAI. pp. 3327–3334 (2015)
25. Prakken, H.: Logical Tools for Modelling Legal Argument: A Study of Defeasible Reasoning in Law, vol. 32. Springer Science & Business Media (2013)
26. Russell, S., Norvig, P.: Artificial Intelligence: A Modern Approach. Prentice Hall Press, Upper Saddle River, NJ, USA, 3rd edn. (2009)
27. Van Ditmarsch, H., van der Hoek, W., Kooi, B.: Dynamic Epistemic Logic, vol. 337. Springer Science & Business Media (2007)
28. Van Harmelen, F., Lifschitz, V., Porter, B.: Handbook of Knowledge Representation, vol. 1. Elsevier (2008)
29. Wan, H., Yang, R., Fang, L., Liu, Y., Xu, H.: A complete epistemic planner without the epistemic closed world assumption. In: Proc. of IJCAI (2015)

A  mA∗ and mAρ comparison

For a clearer comparison, we show the execution, in both mA∗ and mAρ, of the action instance sequence ∆c = distract(C)⟨A⟩; open⟨A⟩; peek⟨A⟩, which leads to the desired goal in the domain expressed in Example 2. With this we want to give a graphical account of both the transition functions and the state-spaces defined by the two languages.
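Before the figures, the transition function ΦD defined above can be sketched in code. The following is a minimal, illustrative Python sketch of the ontic case only, not the authors' implementation: the names Possibility, ontic_transition and negate are ours, a possibility is a mutable object whose belief map may be cyclic, and a memo table stands in for the corecursive definition on non-well-founded states.

```python
# Illustrative sketch only: the names (Possibility, ontic_transition,
# negate) and the literal encoding are ours, not the paper's code.

class Possibility:
    """A possibility: the fluent literals that hold, plus, for each agent,
    the set of possibilities she considers (the structure may be cyclic,
    mirroring non-well-founded sets)."""
    def __init__(self, fluents=()):
        self.fluents = set(fluents)   # fluent literals that are true
        self.beliefs = {}             # agent -> set of Possibility

def negate(f):
    """Negation on fluent literals, written here with a '-' prefix."""
    return f[1:] if f.startswith("-") else "-" + f

def ontic_transition(u, caused, fully_obs, oblivious, memo=None):
    """Ontic case of Phi_D: make caused(a) true, update the beliefs of
    fully observant agents recursively, and leave the beliefs of the
    oblivious agents untouched.  The memo table ties the knot on cyclic
    (non-well-founded) states, so the recursion terminates."""
    memo = {} if memo is None else memo
    if id(u) in memo:                 # this possibility was already updated
        return memo[id(u)]
    v = Possibility()
    memo[id(u)] = v                   # register v before recursing (cycles!)
    v.fluents = (u.fluents - {negate(caused)}) | {caused}
    for ag, worlds in u.beliefs.items():
        if ag in fully_obs:           # ag sees the action: update her worlds
            v.beliefs[ag] = {ontic_transition(w, caused, fully_obs,
                                              oblivious, memo)
                             for w in worlds}
        else:                         # oblivious: beliefs are unchanged
            v.beliefs[ag] = set(worlds)
    return v
```

Executing an ontic action with effect opened on a self-referential possibility updates the fluent set and rebinds the fully observant agents' belief sets to the new possibility, analogously to how v turns into w in Figure 5.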
Each state in mA∗ will be represented by a Kripke structure, while in mAρ it will be a possibility (expanded into its respective system of equations for clarity). The observability relations of each action instance in ∆c are expressed in the following table:

     distract(C)⟨A⟩   open⟨A⟩   peek⟨A⟩
FD   A, B, C          A, B      A
PD   -                -         B
OD   -                C         C

Table 2: Observability relations of the action instances in ∆c.

The initial state, based on Example 2, is defined by the conditions:

initially C(key(A))
initially C(¬key(B))
initially C(¬key(C))
initially C(¬opened)
initially C(¬Bag heads ∧ ¬Bag ¬heads)   for ag ∈ {A, B, C}
initially C(look(ag))                    for ag ∈ {A, B, C}
initially ¬heads

Finally, the goal, which both the states in Figure 9 entail, is expressed by the following formulae:

BA ¬heads ∧ BA(BB(BA heads ∨ BA ¬heads))
BB(BA heads ∨ BA ¬heads) ∧ (¬BB heads ∧ ¬BB ¬heads)
BC[ ⋀_{ag ∈ {A, B, C}} (¬Bag heads ∧ ¬Bag ¬heads) ]

The plan for Example 2

(a) The initial Kripke structure (M0, s0): worlds s0 and s1, linked to each other and to themselves by edges labelled A, B, C, with
M0[π](s0) = {look(ag), key(A)}
M0[π](s1) = {look(ag), key(A), heads}
where ag ∈ {A, B, C}.

(b) The initial possibility u:
u  = {(ag, {u, u′}), look(ag), key(A)}
u′ = {(ag, {u, u′}), look(ag), key(A), heads}
where ag ∈ {A, B, C}.

Figure 6: The initial state.

(a) The Kripke structure (M1, p0), obtained after the execution of distract(C)⟨A⟩ in (M0, s0) (Figure 6a): worlds p0 and p1, linked to each other and to themselves by edges labelled A, B, C, with
M1[π](p0) = {look(A), look(B), key(A)}
M1[π](p1) = {look(A), look(B), key(A), heads}

(b) Possibility v, obtained after the execution of distract(C)⟨A⟩ in u (Figure 6b):
v  = {(ag, {v, v′}), look(A), look(B), key(A)}
v′ = {(ag, {v, v′}), look(A), look(B), key(A), heads}
where ag ∈ {A, B, C}.

Figure 7: Execution of distract(C)⟨A⟩.

(a) The Kripke structure (M2, q0), obtained after the execution of open⟨A⟩ in (M1, p0) (Figure 7a): worlds q0 and q1, linked to each other and to themselves by edges labelled A, B, with C-edges from q0 and q1 to the old worlds p0 and p1 (still linked by edges labelled A, B, C), and with
M2[π](q0) = {look(ag), key(A), opened}
M2[π](q1) = {look(ag), key(A), opened, heads}
M2[π](p0) = M1[π](p0)
M2[π](p1) = M1[π](p1)
where ag ∈ {A, B}.

(b) Possibility w, obtained after the execution of open⟨A⟩ in v (Figure 7b):
w  = {(ag, {w, w′}), (C, {v, v′}), look(ag), key(A), opened}
w′ = {(ag, {w, w′}), (C, {v, v′}), look(ag), key(A), opened, heads}
where ag ∈ {A, B} and v, v′ are the possibilities represented in Figure 7b.

Figure 8: Execution of open⟨A⟩.

(a) The Kripke structure (M3, r0), obtained after the execution of peek⟨A⟩ in (M2, q0): worlds r0 and r1, each with a reflexive loop labelled A, B, linked to each other by an edge labelled B, with C-edges to the old worlds p0 and p1 (still linked by edges labelled A, B, C), and with
M3[π](r0) = {look(ag), key(A), opened}
M3[π](r1) = {look(ag), key(A), opened, heads}
M3[π](p0) = M1[π](p0)
M3[π](p1) = M1[π](p1)
where ag ∈ {A, B}.

(b) Possibility z, obtained after the execution of peek⟨A⟩ in w (Figure 8b):
z  = {(A, {z}), (B, {z, z′}), (C, {v, v′}), look(ag), key(A), opened}
z′ = {(A, {z′}), (B, {z, z′}), (C, {v, v′}), look(ag), key(A), opened, heads}
where ag ∈ {A, B} and the possibilities v, v′ are represented in Figure 7b.

Figure 9: Execution of peek⟨A⟩.
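The goal formulae above are verified by entailment on possibilities. The following is a minimal, illustrative Python sketch, not the paper's implementation: the tuple encoding of formulae and the names P and entails are ours, and only fluents, ¬, ∧ and the belief operator Bag are covered. Since each recursive call strictly shrinks the formula, the check terminates even though a possibility may be cyclic.

```python
# Illustrative sketch only: the formula encoding and the names P and
# entails are ours, not the paper's code.

class P:
    """A possibility: the set of true fluents plus an
    agent -> set-of-possibilities belief map (may be cyclic)."""
    def __init__(self, fluents=()):
        self.fluents = set(fluents)
        self.beliefs = {}

def entails(u, phi):
    """Entailment u |= phi for a fragment of the epistemic language:
    a fluent is a string, negation is ('not', f), conjunction is
    ('and', f, g), and belief is ('B', ag, f) for B_ag f."""
    if not isinstance(phi, tuple):        # atomic fluent
        return phi in u.fluents
    op = phi[0]
    if op == 'not':
        return not entails(u, phi[1])
    if op == 'and':
        return entails(u, phi[1]) and entails(u, phi[2])
    if op == 'B':                         # B_ag f: f holds in every world
        return all(entails(w, phi[2]) for w in u.beliefs.get(phi[1], ()))
    raise ValueError("unknown connective: %r" % op)
```

On the possibility z of Figure 9b, for instance, this check accepts BA ¬heads and rejects both BB heads and BB ¬heads, matching the second goal formula.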