Non-Well-Founded Set Based Multi-Agent Action Language⋆

Francesco Fabiano1, Idriss Riouak1, Agostino Dovier1, and Enrico Pontelli2
1 Dipartimento DMIF, Università di Udine, Udine, Italy
2 Computer Science Department, New Mexico State University, Las Cruces NM, USA

Abstract. As research in multi-agent domains continues to grow, it becomes increasingly important to investigate the relations between the agents in such systems: not only to reason about the agents' perception of the world, but also about each agent's knowledge of her own and of the others' knowledge. This type of study is referred to as epistemic reasoning. In certain domains, e.g., economy, security, justice and politics, reasoning about others' beliefs can lead to winning strategies or help in changing a group of agents' view of the world. In this work we formalize the epistemic planning problem with a state description based on non-well-founded set theory. This semantics permits characterizing the planning problem in terms of set operations, rather than in terms of reachability as the state of the art suggests. In doing so we hope to provide a clearer semantics and to lay the basis for exploiting properties of set-based operations in multi-agent epistemic planning.

Keywords: Epistemic Reasoning · Planning · Multi-agent · Action languages · Non-well-founded sets · Possibilities.

1 Introduction

Multi-agent planning and epistemic knowledge have recently gained attention from several research communities. Efficient autonomous systems that can reason in these domains could lead to winning strategies in various fields such as economy [3], security [4], justice [25] and politics [10], and can be exploited by self-driving cars and other autonomous devices that control several aspects of our daily life. Epistemic planners are interested not only in the state of the world but also in the knowledge (or beliefs) of the agents.
Some problems can be expressed in less expressive languages and need less powerful, and usually faster, planners: for example, [18, 24] deal with problems where dynamic common knowledge and unbounded nested knowledge, respectively, are not needed. On the other hand, to the best of our knowledge, only a few systems [21, 22] can reason about epistemic knowledge in multi-agent domains without these limitations, i.e., using the language L^C_AG presented in Section 2. Such systems, which can reason on the full extent of L^C_AG, base their concept of state on Kripke structures.

⋆ This research is partially supported by the Università di Udine PRID ENCASE project, and by GNCS-INdAM 2017 and 2019 projects.

Using a Kripke structure as state has a negative impact on the performance of the planner. First of all, to store all the necessary states, solvers require a large amount of memory. Moreover, to perform operations such as entailment or the application of the transition function, the states must be represented explicitly. That is why, as the research on epistemic reasoning advances [2, 5, 7], it is interesting to analyze alternative representations for the states that could lead to more efficient operations on the search space. In this work we formalize the epistemic planning problem where the states are represented by possibilities (introduced in Section 4), which are based on non-well-founded set theory. This representation allows us to describe the language through set-based operations and also to exploit some of the results from this field, such as the concept of bisimulation, to add important features for the multi-agent epistemic planning (MEP) community. The paper is organized as follows: Section 2 presents the concept of epistemic planning. In Section 3 we introduce what, in our opinion, is the most complete action language for MEP that bases its states on the concept of Kripke structure.
The background is then concluded in Section 4, where we describe possibilities, an interesting approach that combines non-well-founded set theory with epistemic logic. In Section 5 we introduce our possibility-based semantics of mAρ (an action language for MEP) and we show a comparison with the state-of-the-art action language mA∗. We finally conclude in Section 6 with future work.

2 Epistemic reasoning

In this section we briefly introduce the background that is necessary to understand multi-agent epistemic reasoning. As defined in [16, 27], where the concept of multi-agent propositional epistemic logic is fully explored, epistemic logic is the logic of the knowledge and belief that different agents have about the world and about the beliefs of each other. Let AG be a set of agents and let F be a set of propositional variables, called fluents. Each world is described by a subset of elements of F (intuitively, those that are "true" in the world); we also refer to each world as an interpretation. With each agent ag ∈ AG we associate a modal operator Bag (where B stands for belief), and we represent the beliefs of an agent as belief formulae in a logic extended with these operators. Moreover, group operators are also introduced in epistemic logic, such as Eα and Cα, that intuitively represent the belief of a group of agents α and the common knowledge of α, respectively. To be more precise, as in [5], we have that:

Definition 1 (fluent formulae, atoms and literals). A fluent formula is a propositional formula built using the propositional variables in F and the traditional propositional operators ∧, ∨, ⇒, ¬, etc. We will use ⊤ and ⊥ to indicate True and False, respectively. A fluent atom is a formula composed of just an element f ∈ F, while a fluent literal is either a fluent atom f ∈ F or its negation ¬f.

Definition 2 (belief formula).
A belief formula is defined as follows:
– a fluent formula is a belief formula;
– let ϕ be a belief formula and ag ∈ AG; then Bag ϕ is a belief formula;
– let ϕ1, ϕ2 and ϕ3 be belief formulae; then ¬ϕ3 and ϕ1 op ϕ2 are belief formulae, where op ∈ {∧, ∨, ⇒};
– all the formulae of the form Eα ϕ or Cα ϕ are belief formulae, where ϕ is itself a belief formula and ∅ ≠ α ⊆ AG.

Let us denote with L^C_AG the language of the belief formulae over the sets F and AG. On the other hand, we define L_AG as the language over belief formulae that does not allow the use of C. In [14], it is pointed out how these two languages differ in expressiveness and in complexity.

Example 1. Let us consider the formula Bag1 Bag2 ϕ. This formula expresses that agent ag1 believes that agent ag2 believes that ϕ is true; instead, Bag1 ¬ϕ expresses that agent ag1 believes that ϕ is false.

The classical way of providing a semantics for the language of epistemic logic is in terms of Kripke models [20]:

Definition 3 (Kripke structure). A Kripke structure is a tuple ⟨S, π, B1, ..., Bn⟩ such that:
– S is a set of worlds;
– π : S → 2^F is a function that associates an interpretation of F to each element of S;
– for 1 ≤ i ≤ n, Bi ⊆ S × S is a binary relation over S.

A pointed Kripke structure is a pair (M, s) where M is a Kripke structure as defined above and s ∈ S, where s represents the real world. As in [5], we will refer to a pointed Kripke structure (M, s) as a state. Following the notation of [5], we will indicate with M[S], M[π], and M[i] the components S, π, and Bi of M, respectively.

Definition 4 (entailment w.r.t. a Kripke structure).
Given the belief formulae ϕ, ϕ1, ϕ2, an agent agi, a group of agents α, a Kripke structure M = ⟨S, π, B1, ..., Bn⟩, and the worlds s, t ∈ S:
(i) (M, s) |= ϕ if ϕ is a fluent formula and π(s) |= ϕ;
(ii) (M, s) |= Bagi ϕ if for each t such that (s, t) ∈ Bi it holds that (M, t) |= ϕ;
(iii) (M, s) |= ¬ϕ if (M, s) ̸|= ϕ;
(iv) (M, s) |= ϕ1 ∨ ϕ2 if (M, s) |= ϕ1 or (M, s) |= ϕ2;
(v) (M, s) |= ϕ1 ∧ ϕ2 if (M, s) |= ϕ1 and (M, s) |= ϕ2;
(vi) (M, s) |= Eα ϕ if (M, s) |= Bagi ϕ for all agi ∈ α;
(vii) (M, s) |= Cα ϕ if (M, s) |= Eα^k ϕ for every k ≥ 0, where Eα^0 ϕ = ϕ and Eα^(k+1) ϕ = Eα(Eα^k ϕ).

We will not describe the properties of Kripke structures any further, since they are not strictly needed to present the contribution of this paper. The reader interested in a more detailed description can refer to [14].

3 The action language mA∗

The mA∗ [5] action language is a generalization of the single-agent action languages, extensively studied in the literature [8, 26], to the case of multi-agent domains for epistemic planning. The language has a declarative, English-like syntax and an event-model-based semantics that permits reasoning about beliefs. The semantics of mA∗ is based on the assumption that agents are truthful. The language is built over a signature (AG, F, A), where AG is a finite set of agent identifiers, F is a set of fluents, and A is a set of actions. We will introduce only the basic features of mA∗; the remaining details about the language can be found in [5]. The following will be used as a working example throughout the paper:

Example 2 (Three Agents and the Coin Box). Three agents, A, B, and C, are in a room; in the middle of the room there is a box containing a coin.
It is common knowledge that:
– none of the agents knows whether the coin lies heads or tails up;
– the box is locked and one needs a key to open it;
– agent A has the key of the box;
– in order to learn whether the coin lies heads or tails up, an agent can peek into the box, but this requires the box to be open;
– if one agent is looking at the box and a second agent peeks into the box, then the first agent will observe this fact and will be able to conclude that the second agent knows the status of the coin; on the other hand, the first agent's knowledge about which face of the coin is up does not change;
– distracting an agent causes her not to look at the box;
– signaling an agent to look at the box causes that agent to look at the box;
– announcing that the coin lies heads or tails up makes this common knowledge among the agents that are listening.

Agent A would like to know whether the coin lies heads or tails up. She would also like to let agent B know that she knows this fact. However, she would like to keep this information secret from C. This can be achieved by: i) distracting C from looking at the box; ii) signaling B to look at the box if B is not looking at it; iii) opening the box; and iv) peeking into the box. For this domain we have that AG = {A, B, C}, while the set of fluents F consists of:
– heads: the coin lies heads up;
– key(ag): agent ag has the key of the box;
– opened: the box is open; and
– look(ag): agent ag is looking at the box.

Let ag ∈ AG; the set of actions A comprises:
– open: an agent opens the box;
– peek: an agent peeks into the box;
– signal(ag): an agent signals agent ag to look at the box;
– distract(ag): an agent distracts agent ag;
– shout tails: an agent announces that the coin lies tails up.
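For concreteness, the signature (AG, F, A) of Example 2 can be spelled out as plain data. The Python encoding below is only an illustrative sketch; the identifier names are ours and are not part of the mA∗ syntax.

```python
# Signature (AG, F, A) of the "Three Agents and the Coin Box" domain (Example 2).
# Illustrative encoding only; names are ours, not part of mA*.

AGENTS = {"A", "B", "C"}

# Fluents: heads, opened, and the parametric key(ag) and look(ag).
FLUENTS = ({"heads", "opened"}
           | {f"key({ag})" for ag in AGENTS}
           | {f"look({ag})" for ag in AGENTS})

# Actions, including the parametric signal(ag) and distract(ag).
ACTIONS = ({"open", "peek", "shout_tails"}
           | {f"signal({ag})" for ag in AGENTS}
           | {f"distract({ag})" for ag in AGENTS})

# Action instances AI = A x AG, written action<ag> in the paper.
ACTION_INSTANCES = {(a, ag) for a in ACTIONS for ag in AGENTS}

print(len(FLUENTS))           # 8 fluents
print(len(ACTION_INSTANCES))  # 27 action instances
```

Even in this tiny domain, the set of action instances is already the full product A × AG; this is why limiting the number of agents or fluents, as discussed later, has such an impact on the search space.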
In [5], the authors distinguish three types of actions, as follows (some examples of action execution can be found in the Appendix):
– World-altering actions (also called ontic): used to modify certain properties (i.e., fluents) of the world, e.g., the actions open or distract(ag) of Example 2.
– Sensing actions: used by an agent to refine her beliefs about the world, e.g., the action peek.
– Announcement actions: used by an agent to affect the beliefs of other agents, e.g., the action shout tails of Example 2.

Given an action instance a ∈ AI, where AI is the set A × AG of all the possible action instances, a fluent literal f ∈ F, a fluent formula φ and a belief formula ϕ, we can quickly introduce the syntax adopted in mA∗. Executability conditions are captured by statements of the form:
executable a if ϕ
For ontic actions we have:
a causes f if ϕ
Sensing action statements have the form:
a determines f
Finally, announcement actions are expressed as follows:
a announces φ.

In multi-agent domains the execution of an action might or might not change the beliefs of an agent. This is because, in such domains, each action instance associates an observability relation to each agent. For example, agent C, who becomes oblivious after being distracted by agent A, is not able to see the execution of the action open⟨A⟩. On the other hand, watching an agent execute a sensing or an announcement action can change the beliefs of the watcher; e.g., agent B, who is watching agent A sensing the status of the coin, will know that A knows the status of the coin without knowing the status herself. Table 1 summarizes the possible observability relations for each type of action.

Action type    | Full observers | Partial observers | Oblivious
World-altering | ✓              | ✗                 | ✓
Sensing        | ✓              | ✓                 | ✓
Announcement   | ✓              | ✓                 | ✓

Table 1: Action types and observability relations.
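The constraint of Table 1 can also be checked mechanically when building a domain. The sketch below (naming is ours) validates that an assignment of observability classes to agents is admissible for a given action type:

```python
# Admissible observability classes per action type (cf. Table 1).
# World-altering actions admit no partial observers: an agent aware of
# an ontic action must also know its effects on the world.
ADMISSIBLE = {
    "world_altering": {"full", "oblivious"},
    "sensing":        {"full", "partial", "oblivious"},
    "announcement":   {"full", "partial", "oblivious"},
}

def frame_ok(action_type: str, observability: dict) -> bool:
    """Check that every agent's observability class is admissible
    for the given action type."""
    return all(cls in ADMISSIBLE[action_type]
               for cls in observability.values())

# open<A> after distract(C)<A>: C is oblivious, A and B are full observers.
print(frame_ok("world_altering",
               {"A": "full", "B": "full", "C": "oblivious"}))   # True
# A partial observer of an ontic action is not admissible.
print(frame_ok("world_altering",
               {"A": "full", "B": "partial", "C": "oblivious"}))  # False
```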
Partial observability for world-altering actions is not admitted since, whenever an agent is aware of the execution of an ontic action, she must know its effects on the world as well. For brevity, we refer the reader to [5] for the definition of the transition function in mA∗.

4 Non-well-founded sets and Possibilities

This section defines the concept of possibility (originally introduced in [15]), based on non-well-founded set theory, and aims to provide the reader with enough information to understand what possibilities are. For a more thorough introduction the reader is referred to [1, 6, 16], and to [12] for a logic programming point of view on non-well-founded sets and their equivalence.

4.1 Non-well-founded set theory fundamentals

We start by giving some fundamental definitions of non-well-founded set theory. First of all, a well-founded set is described in [1] as follows:

Definition 5 (well-founded set). Let E be a set, E′ one of its elements, E″ any element of E′, and so on. A descent is the sequence of steps from E to E′, E′ to E″, etc. A set is well-founded (or ordinary) when it only gives rise to finite descents.

Well-founded set theory states that all the sets in the sense of Definition 5 can be represented in the form of graphs, called pictures (as shown in Figure 1). To formalize this concept of 'picture of a set', however, it is necessary to introduce the concept of decoration:

Definition 6 (decoration and picture).
– A decoration of a graph G = (V, E) is a function δ that assigns to each node n ∈ V a set δn in such a way that the elements of δn are exactly the sets assigned to the successors of n, i.e., δn = {δn′ | (n, n′) ∈ E}.
– If δ is a decoration of a pointed graph (G, n), then (G, n) is a picture of the set δn.

[Figure 1a: pictures of the von Neumann ordinals, where 0 = ∅; 1 = {∅}; 2 = {∅, {∅}}; 3 = {∅, {∅}, {∅, {∅}}}. Figure 1b: alternative pictures of the von Neumann ordinals 2 and 3.]
Figure 1: Well-founded sets represented through graphs [1].

Moreover, in well-founded set theory, Mostowski's lemma holds: "each well-founded graph(3) is a picture of exactly one set". On the other hand, following [1], a non-well-founded, or extraordinary set in the sense of Mirimanoff, is a set that respects Definition 7.

Definition 7 (non-well-founded set). A set is non-well-founded (or extraordinary) when among its descents there are some which are infinite.

In fact, when the Foundation Axiom(4) is substituted by the Anti-Foundation Axiom (AFA), expressed by Aczel in [1] as "Every graph has a unique decoration", the following consequences become true:
– every graph is a picture of exactly one set (AFA as it is formulated in [16]);
– non-well-founded sets exist, given that a non-well-founded pointed graph has to be a picture of a non-well-founded set.

(3) A well-founded graph is a graph that does not contain an infinite path n → n′ → n″ → ... of successors.
(4) Expressed in [16] as "Only well-founded graphs have decorations".

[Figure 2a: standard picture of Ω. Figure 2b: unfolding of the picture of Ω.]
Figure 2: Representation of the non-well-founded set Ω = {Ω} [1].

In [1, 16] it is pointed out how non-well-founded sets can also be expressed through systems of equations; this concept will help us formalize the notion of state in our action language. A quick example of this representation is given by the set Ω = {Ω} (Figure 2): we can, in fact, informally define this set by the (singleton) system of equations x = {x}. Systems of equations and their solutions are described more formally in [16] as follows:

Definition 8 (system of equations). For each class of atoms(5) 𝒳, a system of equations in 𝒳 is a class τ of equations x = X, where x ∈ 𝒳 and X ⊆ 𝒳, such that τ contains exactly one equation x = X for each x ∈ 𝒳. A solution to a system of equations τ is a function δ that assigns to each x ∈ τ(𝒳)(6) a set δx such that δx = {δy | y ∈ X}, where x = X is an equation of τ.
If δ is the solution to a system of equations τ, then the set {δx | x ∈ τ(𝒳)} is called the solution set of that system. Since both graphs and systems of equations are representations for non-well-founded sets, it is natural to investigate their relationship; in particular, it is interesting to point out how from a graph G = (V, E) it is possible to construct a system of equations τ and vice versa. The nodes in G, in fact, can be taken as the set of atoms τ(𝒳) and, for each node v ∈ V, an equation is given by v = {v′ | (v, v′) ∈ E}. Since each graph has a unique decoration, each system of equations has a unique solution. This is also true when we consider bisimilar systems of equations: in fact, we can collapse them into their minimal representation thanks to the concept of maximum bisimulation, as introduced in [12]. Bisimilar labeled graphs (or Kripke structures) therefore have a unique solution as well, since we collapse their representations into the minimal one. This idea will be further expanded in Section 5.2.

4.2 Possibilities

Let us introduce the notion of possibility as in [15]:

Definition 9 (possibilities). Let AG be a set of agents and F a set of propositional variables:
– A possibility u is a function that assigns to each propositional variable f ∈ F a truth value u(f) ∈ {0, 1} and to each agent ag ∈ AG an information state u(ag) = σ.
– An information state σ is a set of possibilities.

In Section 5.1 we will use this concept to describe a 'state' of the planning problem. The intuition behind this idea is that a possibility u is a possible interpretation of the world and of the agents' beliefs; in fact, u(f) specifies the truth value of the fluent f in u, and u(A) is the set of all the interpretations that agent A considers possible in u. Moreover, a possibility can be pictured as a decoration of a labeled graph and therefore as the unique solution to a system of equations for possibilities.
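For a well-founded graph, the unique decoration of Definition 6 (the content of Mostowski's lemma) can be computed by structural recursion on the successors. Below is a minimal Python sketch on the picture of the ordinal 3 from Figure 1; the graph encoding is ours.

```python
# Pointed graph for the von Neumann ordinal 3 = {0, 1, 2} (cf. Figure 1):
# EDGES maps each node to the set of its successors.
EDGES = {
    "3": {"0", "1", "2"},
    "2": {"0", "1"},
    "1": {"0"},
    "0": set(),
}

def decoration(node: str, edges=EDGES) -> frozenset:
    """Unique decoration of a well-founded graph:
    delta(n) = { delta(n') | (n, n') in E }."""
    return frozenset(decoration(s, edges) for s in edges[node])

zero = decoration("0")  # frozenset(): the empty set
one = decoration("1")   # {0}
two = decoration("2")   # {0, {0}}
assert decoration("3") == frozenset({zero, one, two})  # 3 = {0, 1, 2}
```

On a non-well-founded graph such as the one for x = {x}, this recursion would not terminate: AFA still guarantees a unique solution there, but computing it requires coinductive reasoning (bisimulation) rather than recursion, which is exactly the direction taken in the rest of the section.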
(5) Objects that are not sets and have no further set-theoretic structure.
(6) τ(𝒳) denotes the class of atoms 𝒳 over which τ is described.

A possibility represents the solution to the minimal system of equations into which all bisimilar systems of equations are collapsed; that is, the possibilities that represent decorations of bisimilar labeled graphs are bisimilar and can be represented by the minimal one. This shows that a whole class of bisimilar labeled graphs and, therefore, of bisimilar Kripke structures, used by mA∗ as states, can be represented by a single possibility.

Definition 10 (equations for possibilities). Given a set of agents AG and a set of propositional variables F, a system of equations for possibilities in a class of possibilities 𝒳 is a set of equations such that for each x ∈ 𝒳 there exists exactly one equation of the form x(f) = i, where i ∈ {0, 1}, for each f ∈ F, and of the form x(ag) = X, where X ⊆ 𝒳, for each ag ∈ AG. A solution to a system of equations for possibilities is a function δ that assigns to each atom x a possibility δx in such a way that if x(f) = i is an equation then δx(f) = i, and if x(ag) = σ is an equation then δx(ag) = {δy | y ∈ σ}.

5 The action language mAρ

The research on multi-agent epistemic domains, both in logic and in planning, already comprises several theoretical studies [2, 5, 7, 13, 14, 16, 17, 23, 27, 28] as well as a variety of solvers [18, 19, 21, 22, 24, 29], even if, to the best of our knowledge, only [13, 21] can reason without limitations on domains described by L^C_AG. Nevertheless, multi-agent epistemic solvers still have to reason on domains where the number of fluents and/or agents is limited; this keeps the length of the solutions of the planning problems small enough not to require an excessive quantity of resources (i.e., time and memory). For these reasons, the demand for computational resources (for example, with respect to classical planning) is one of the central problems in MEP.
To reduce this gap, several approaches can be used: i) as in [18, 19, 24, 29], the planning domain can be limited to a less expressive class of problems; on the other hand, when generality is required, ii) heuristics can effectively speed up the resolution process, as shown in [21]; finally, another approach is iii) to consider alternative representations to Kripke structures, which is what mAρ tries to do. Changing the structure is especially important because a different state representation can lead to a better use of the resources, and its properties can be exploited to introduce important functionalities; e.g., mAρ can rely on the concept of bisimulation, which is an equality criterion for non-well-founded sets, to introduce the notion of visited states.

5.1 State

As our main contribution we introduce a modified version of mA∗, called mAρ. The difference lies in how a state is defined: in mA∗ a state is represented as a Kripke structure, while in mAρ a state is a possibility (Section 4.2). For simplicity, we maintain the syntax used in mA∗. The strict connection between these two structures is highlighted in [16] by the fact that a solution to a system of equations for possibilities (Definition 10) represents a decoration of a labeled graph, which is essentially a Kripke model. In [16], it is also stated that a "possibility corresponds with a whole class of bisimilar, but structurally different, Kripke models". Let us define the usage of possibilities as states in an MEP domain where the set of agents is AG and the set of fluents is F. A possibility is a function that assigns to each propositional variable f ∈ F a truth value u(f) ∈ {0, 1} and to each agent ag ∈ AG a set of possibilities u(ag). A state in MEP has to encode the truth value of the fluents, as in classical planning, and also the beliefs of the agents about the fluents and about the beliefs themselves. We described in Section 3 how Kripke structures represent this information.
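A system of equations for possibilities can be written directly as data: each possibility name maps to the set of its true fluents and to each agent's set of plausible possibilities. The following Python sketch (encoding and names are ours) represents the initial state of Example 2, assuming the coin actually lies tails up:

```python
# Initial state of Example 2 as a system of equations for possibilities:
# every agent is looking, A has the key, the box is closed, and no agent
# knows whether the coin lies heads up.  Encoding and names are ours.
AGENTS = {"A", "B", "C"}

STATE = {
    # u: the real world; the coin lies tails up ('heads' is false).
    "u":  ({"look(A)", "look(B)", "look(C)", "key(A)"},
           {ag: {"u", "u1"} for ag in AGENTS}),
    # u1: as u, but with the coin lying heads up.
    "u1": ({"look(A)", "look(B)", "look(C)", "key(A)", "heads"},
           {ag: {"u", "u1"} for ag in AGENTS}),
}
POINTED = "u"  # the designated (real) possibility

fluents, beliefs = STATE[POINTED]
print("heads" in fluents)     # False: the coin actually lies tails up
print(sorted(beliefs["A"]))   # ['u', 'u1']: A considers both plausible
```

Note that the equations are mutually recursive (u and u1 refer to each other), which is unproblematic here precisely because, as discussed in Section 4, such circular systems have a unique solution under AFA.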
In Figure 3 we use a possibility, expanded into a system of equations, to encode a state (of the domain in Example 2). An equation is represented in the form u = {(ag1, σ), (ag2, σ′), ..., f, f′, ...}, where ag1, ag2 ∈ AG, σ, σ′ are sets of possibilities and f, f′ ∈ F. When we write (ag, σ) we mean that, in u, agent ag believes that the possibilities in σ are plausible. On the other hand, a fluent f is true in u if and only if f is present in the equation.

w  = {(ag, {w, w′}), (C, {v, v′}), look(ag), key(A), opened}
w′ = {(ag, {w, w′}), (C, {v, v′}), look(ag), key(A), opened, heads}
v  = {(A, {v, v′}), (B, {v, v′}), (C, {v, v′}), look(ag), key(A)}
v′ = {(A, {v, v′}), (B, {v, v′}), (C, {v, v′}), look(ag), key(A), heads}

where ag ∈ {A, B}.

(a) System of equations for possibilities. (b) Decoration of the pointed labeled graph (G, w) [graph omitted].
Figure 3: Representation of the possibility w after the execution of the actions distract(C)⟨A⟩; open⟨A⟩ on the initial state of Example 2. The possibility is expanded into its system of equations for clarity.

It is clear that a possibility corresponds to a decoration of a pointed labeled graph and therefore to a unique Kripke model up to bisimulation. The representation through possibilities allows, in our opinion, a clearer and more concise view of the state. That is because each state is represented by a single possibility; e.g., the state of Figure 3 is represented by w = {(ag, {w, w′}), (C, {v, v′}), look(ag), key(A), opened}, where ag ∈ {A, B}.

5.2 State equality through bisimulation

One of the reasons we chose to use possibilities as states is to exploit this new structure to introduce new functionalities in MEP. In fact, having the states described as possibilities, which are strongly related to non-well-founded sets, helps us introduce the concept of visited states, a core idea in planning.
As said before, a possibility, in the sense of a decoration, represents all the Kripke structures that are bisimilar to the decoration itself. This means that with possibilities we can exploit bisimulation to capture the idea of equality between states: given two bisimilar decorations (or labeled graphs), these, even with structural differences, are represented by the same possibility. On the other hand, this is not true when it comes to Kripke structures. This idea is best described through a graphical representation; we will therefore use Figure 4 to explain this concept. As an example, let us introduce two new actions: flip and tell(ag). The first one is an ontic action with which an agent inverts the position of the coin; the observability of this action depends on the fluents look(ag). On the other hand, tell(ag) means that an agent announces to ag the position in which she thinks the coin lies, while all the other agents are oblivious. Assuming that the coin lies tails up, and given the sequence of action instances ∆ = peek⟨B⟩; distract(B)⟨C⟩; flip⟨C⟩; tell(B)⟨C⟩(7), we show in Figure 4a the result of applying ∆ to a slightly modified initial state of Example 2 in which C_{A,B,C}(opened ∧ ¬look(A)) holds. Figure 4b shows a Kripke structure that has structural differences with respect to the one in Figure 4a. This means that these two Kripke structures represent two different states in mA∗, even if they are intuitively the same. On the contrary, if we think in terms of possibilities, both Kripke structures of Figure 4 are represented by the possibility r1 = {(A, {s0, s1}), (B, {r1}), (C, {r1}), look(C), key(A), opened, heads}.

[Graphs omitted.] (a) The resulting Kripke structure after the execution of ∆. (b) A Kripke structure bisimilar to the one in Figure 4a.
Figure 4: Bisimilar Kripke structures.
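The check "has a bisimilar state already been visited?" can be implemented over finite labeled graphs with a naive partition-refinement procedure. The sketch below is ours (it is not the maximum-bisimulation algorithm of [12], just the textbook fixpoint computation) and decides whether two worlds of a labeled graph are bisimilar:

```python
def bisimilar(worlds, val, rel, agents, s, t):
    """Decide whether worlds s and t are bisimilar, by naive partition
    refinement: start from the partition induced by the interpretations
    (val[w], hashable) and split blocks until no agent relation
    (rel[(ag, w)] = set of ag-successors of w) distinguishes two worlds
    in the same block.  Each round refines the partition, so the loop
    stabilizes after at most |worlds| rounds."""
    def renumber(sig):
        ids = {v: i for i, v in enumerate(sorted(set(sig.values()), key=repr))}
        return {w: ids[sig[w]] for w in worlds}

    block = renumber({w: val[w] for w in worlds})
    while True:
        sig = {w: (block[w],
                   tuple(tuple(sorted({block[v] for v in rel.get((ag, w), ())}))
                         for ag in sorted(agents)))
               for w in worlds}
        new = renumber(sig)
        if len(set(new.values())) == len(set(block.values())):
            return new[s] == new[t]
        block = new

# Two structurally different pictures of the same beliefs (toy structure,
# one agent): world b2 duplicates world b.
worlds = {"a", "b", "b2"}
val = {"a": "p", "b": "q", "b2": "q"}
rel = {("ag", "a"): {"b", "b2"}, ("ag", "b"): {"b"}, ("ag", "b2"): {"b2"}}
print(bisimilar(worlds, val, rel, {"ag"}, "b", "b2"))  # True: same possibility
print(bisimilar(worlds, val, rel, {"ag"}, "a", "b"))   # False: different valuations
```

Applied to the disjoint union of the two structures of Figure 4, the two pointed worlds would end up in the same block, matching the fact that both structures denote the possibility r1.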
(7) We recall that in [5] an action instance is represented as action⟨ag⟩, where ag is the agent that executes action.

Thanks to the use of possibilities we can identify bisimilar decorations, which were considered different in mA∗, and, therefore, check whether a state has already been visited. Next we define the concept of entailment and, finally, the transition function of mAρ.

5.3 Entailment

We now introduce the concept of entailment w.r.t. possibilities.

Definition 11 (entailment w.r.t. possibilities). Given the belief formulae ϕ, ϕ1, ϕ2, the fluent formulae φ, φ1, φ2, a fluent f, an agent ag, a group of agents α, and a possibility u:
(i) u |= f if u(f) = 1;
(ii) u |= ¬φ if u ̸|= φ;
(iii) u |= φ1 ∨ φ2 if u |= φ1 or u |= φ2;
(iv) u |= φ1 ∧ φ2 if u |= φ1 and u |= φ2;
(v) u |= Bag ϕ if for each v ∈ u(ag), v |= ϕ;
(vi) u |= ¬ϕ if u ̸|= ϕ;
(vii) u |= ϕ1 ∨ ϕ2 if u |= ϕ1 or u |= ϕ2;
(viii) u |= ϕ1 ∧ ϕ2 if u |= ϕ1 and u |= ϕ2;
(ix) u |= Eα ϕ if u |= Bag ϕ for all ag ∈ α;
(x) u |= Cα ϕ if u |= Eα^k ϕ for every k ≥ 0, where Eα^0 ϕ = ϕ and Eα^(k+1) ϕ = Eα(Eα^k ϕ).

5.4 Transition function

Finally, we introduce the transition function of mAρ. In defining this transition function we made some assumptions: i) we assume that the given initial state also specifies which world is the pointed one (as in [5]); this allows us to relax the description of Φ in the case of announcement or sensing actions. Moreover, ii) we do not take into consideration the case in which an agent can have false beliefs(8), because it is still an open question how to deal with them in MEP; we will try to address this problem in future work. The transition function relative to the domain D is Φ_D : AI × Σ → Σ ∪ {∅}, where Σ is the set of all the possibilities. For the sake of readability, the abbreviations used in the following definition are explained at the end of it.

Definition 12 (transition function for mAρ). Let a ∈ AI and a possibility u be given.
If a is not executable in u, then Φ_D(a, u) = ∅; otherwise, if a is executable in u:

– If a is an ontic action instance, then Φ_D(a, u) = v, where:
  • v(f) = u(f) if f ≠ caused(a), and v(f) = caused(a) if f = caused(a);
  • v(ag) = u(ag) if ag ∈ O_D, and v(ag) = ⋃_{w ∈ u(ag)} Φ_D(a, w) if ag ∈ F_D.

– If a senses the fluent f, then Φ_D(a, u) = v, where:
  • v = ∅ if sensed(a)[f] ≠ u(f);
  • v(ag) = u(ag) if ag ∈ O_D and sensed(a)[f] = u(f);
  • v(ag) = ⋃_{w ∈ u(ag)} Φ_D(a, w) if ag ∈ F_D and sensed(a)[f] = u(f);
  • v(ag) = ⋃_{w ∈ u(ag)} (Φ_D(a, w) ∪ Φ_D(¬a, w)) if ag ∈ P_D and sensed(a)[f] = u(f).

– If a announces the fluent formula φ, then Φ_D(a, u) = v, where:
  • v = ∅ if u ̸|= φ;
  • v(ag) = u(ag) if ag ∈ O_D and u |= φ;
  • v(ag) = ⋃_{w ∈ u(ag)} Φ_D(a, w) if ag ∈ F_D and u |= φ;
  • v(ag) = ⋃_{w ∈ u(ag)} (Φ_D(a, w) ∪ Φ_D(¬a, w)) if ag ∈ P_D and u |= φ.

where:
– caused(a) is the fluent literal modified by the action instance a;
– F_D, P_D, O_D identify the fully observant, partially observant and oblivious agents, respectively;
– sensed(a)[f] represents the truth value of the fluent f determined by a;
– ¬a describes the action that senses the opposite of a, namely sensed(a)[¬f].

(8) An agent ag has a false belief about ϕ in state s if s |= ϕ and s |= Bag ¬ϕ.

Note that sensing and announcement actions generate the empty set when the action effects do not respect the truth values of the fluents in the possibility where the action is executed. That is because epistemic actions (i.e., announcements and sensing) cannot change the state of the world but only the beliefs of the agents. As an example we present the execution of the sequence distract(C)⟨A⟩; open⟨A⟩ on the initial state of Example 2, encoded by the possibility u in Figure 5a.

u  = {(ag, {u, u′}), look(ag), key(A)}
u′ = {(ag, {u, u′}), look(ag), key(A), heads}
(a) The initial state u.

v  = {(ag, {v, v′}), look(A), look(B), key(A)}
v′ = {(ag, {v, v′}), look(A), look(B), key(A), heads}
(b) The state v after executing distract(C)⟨A⟩.
w  = {(A, {w, w′}), (B, {w, w′}), (C, {v, v′}), look(A), look(B), key(A), opened}
w′ = {(A, {w, w′}), (B, {w, w′}), (C, {v, v′}), look(A), look(B), key(A), opened, heads}
(c) The state w after executing distract(C)⟨A⟩; open⟨A⟩.
Figure 5: Execution of distract(C)⟨A⟩; open⟨A⟩ in Example 2 (ag ∈ {A, B, C}).

6 Conclusions and Future Works

In this paper we investigated an alternative to Kripke structures as a representation for multi-agent epistemic planning states. In doing so we presented mAρ, an action language for MEP based on possibilities, thereby grounding MEP in non-well-founded set theory. Moreover, we exploited possibilities to define a stronger concept of equality on states, collapsing all the bisimilar states into the same possibility. Finally, since mAρ makes an implicit state representation more direct, using possibilities helps in reducing the dimension of the search space. In the near future we intend to implement a planner for mAρ and to study alternative representations. In particular we plan to: i) exploit more set-based operations, especially for the entailment of group operators; ii) formalize the concept of non-consistent belief for mAρ; iii) investigate more thoroughly the connection between Kripke structures and non-well-founded sets; iv) examine the concept of bisimulation as equality between epistemic states; and finally v) consider other alternatives to Kripke structures, e.g., OBDDs [9, 11].

References

1. Aczel, P.: Non-well-founded sets. CSLI Lecture Notes, 14 (1988)
2. Aucher, G., Bolander, T.: Undecidability in epistemic planning. pp. 27–33 (2013)
3. Aumann, R., Brandenburger, A., et al.: Epistemic conditions for Nash equilibrium. Econometrica 63(5), 1161–1180 (1995)
4. Balliu, M., Dam, M., Le Guernic, G.: Epistemic temporal logic for information flow security. In: Proceedings of the ACM SIGPLAN 6th Workshop on Programming Languages and Analysis for Security, pp. 6:1–6:12. PLAS '11, ACM, New York, NY, USA (2011).
https://doi.org/10.1145/2166956.2166962
5. Baral, C., Gelfond, G., Pontelli, E., Son, T.C.: An action language for multi-agent domains: Foundations. CoRR abs/1511.01960 (2015), http://arxiv.org/abs/1511.01960
6. Barwise, J., Etchemendy, J.: The Liar: An Essay on Truth and Circularity. Oxford University Press (1987)
7. Bolander, T., Jensen, M., Schwarzentruber, F.: Complexity results in epistemic planning. In: Proc. of IJCAI. pp. 2791–2797 (2015)
8. Bolander, T., Andersen, M.B.: Epistemic planning for single- and multi-agent systems. Journal of Applied Non-Classical Logics 21(1), 9–34 (2011)
9. Bryant, R.E.: Symbolic Boolean manipulation with ordered binary-decision diagrams. ACM Computing Surveys (CSUR) 24(3), 293–318 (1992)
10. Carbonell Jr., J.G.: POLITICS: Automated ideological reasoning. Cognitive Science 2(1), 27–51 (1978)
11. Cimatti, A., Roveri, M.: Conformant planning via symbolic model checking. Journal of Artificial Intelligence Research 13, 305–338 (2000)
12. Dovier, A.: Logic programming and bisimulation. In: Vos, M.D., Eiter, T., Lierler, Y., Toni, F. (eds.) ICLP. CEUR Workshop Proceedings, vol. 1433. CEUR-WS.org (2015), http://ceur-ws.org/Vol-1433
13. van Eijck, J.: Dynamic epistemic modelling. CWI. Software Engineering [SEN] (2004)
14. Fagin, R., Halpern, J.Y.: Reasoning about knowledge and probability. Journal of the ACM (JACM) 41(2), 340–367 (1994)
15. Gerbrandy, J., Groeneveld, W.: Reasoning about information change. Journal of Logic, Language and Information 6(2), 147–169 (1997). https://doi.org/10.1023/A:1008222603071
16. Gerbrandy, J.D.: Bisimulations on Planet Kripke. Institute for Logic, Language and Computation, University of Amsterdam (1999)
17. Van der Hoek, W., Meyer, J.: A complete epistemic logic for multiple agents. In: Epistemic Logic and the Theory of Games and Decisions, pp. 35–68. Springer (1997)
18. Huang, X., Fang, B., Wan, H., Liu, Y.: A general multi-agent epistemic planner based on higher-order belief change. In: Proc. of IJCAI. pp.
1093–1101 (2017)
19. Kominis, F., Geffner, H.: Beliefs in multiagent planning: From one agent to many. In: Proc. of ICAPS. pp. 147–155 (2015)
20. Kripke, S.A.: Semantical considerations on modal logic. Acta Philosophica Fennica 16, 83–94 (1963)
21. Le, T., Fabiano, F., Son, T.C., Pontelli, E.: EFP and PG-EFP: Epistemic forward search planners in multi-agent domains. In: Proc. of the Twenty-Eighth International Conference on Automated Planning and Scheduling (ICAPS) (2018)
22. Liu, Q., Liu, Y.: Multi-agent epistemic planning with common knowledge. In: Proc. of IJCAI. pp. 1912–1920 (2018)
23. Moore, R.C.: Reasoning about knowledge and action. In: Readings in Artificial Intelligence, pp. 473–477. Elsevier (1981)
24. Muise, C.J., Belle, V., Felli, P., McIlraith, S.A., Miller, T., Pearce, A.R., Sonenberg, L.: Planning over multi-agent epistemic states: A classical planning approach. In: Proc. of AAAI. pp. 3327–3334 (2015)
25. Prakken, H.: Logical Tools for Modelling Legal Argument: A Study of Defeasible Reasoning in Law, vol. 32. Springer Science & Business Media (2013)
26. Russell, S., Norvig, P.: Artificial Intelligence: A Modern Approach. Prentice Hall Press, Upper Saddle River, NJ, USA, 3rd edn. (2009)
27. Van Ditmarsch, H., van der Hoek, W., Kooi, B.: Dynamic Epistemic Logic, vol. 337. Springer Science & Business Media (2007)
28. Van Harmelen, F., Lifschitz, V., Porter, B.: Handbook of Knowledge Representation, vol. 1. Elsevier (2008)
29. Wan, H., Yang, R., Fang, L., Liu, Y., Xu, H.: A complete epistemic planner without the epistemic closed world assumption. In: Proc. of IJCAI (2015)

A  mA∗ and mAρ comparison

For a clearer comparison, we show the execution, in both mA∗ and mAρ, of the action instance sequence ∆c = distract(C)⟨A⟩; open⟨A⟩; peek⟨A⟩, which leads to the desired goal in the domain expressed in Example 2. With this we want to give a graphical account of both the transition functions and the state-spaces defined by the two languages.
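Before the figures, the transition function ΦD defined above can be sketched in code. The following is a minimal, illustrative Python sketch of the ontic case only, not the authors' implementation: the names Possibility, ontic_transition and negate are ours, a possibility is a mutable object whose belief map may be cyclic, and a memo table stands in for the corecursive definition on non-well-founded states.

```python
# Illustrative sketch only: the names (Possibility, ontic_transition,
# negate) and the literal encoding are ours, not the paper's code.

class Possibility:
    """A possibility: the fluent literals that hold, plus, for each agent,
    the set of possibilities she considers (the structure may be cyclic,
    mirroring non-well-founded sets)."""
    def __init__(self, fluents=()):
        self.fluents = set(fluents)   # fluent literals that are true
        self.beliefs = {}             # agent -> set of Possibility

def negate(f):
    """Negation on fluent literals, written here with a '-' prefix."""
    return f[1:] if f.startswith("-") else "-" + f

def ontic_transition(u, caused, fully_obs, oblivious, memo=None):
    """Ontic case of Phi_D: make caused(a) true, update the beliefs of
    fully observant agents recursively, and leave the beliefs of the
    oblivious agents untouched.  The memo table ties the knot on cyclic
    (non-well-founded) states, so the recursion terminates."""
    memo = {} if memo is None else memo
    if id(u) in memo:                 # this possibility was already updated
        return memo[id(u)]
    v = Possibility()
    memo[id(u)] = v                   # register v before recursing (cycles!)
    v.fluents = (u.fluents - {negate(caused)}) | {caused}
    for ag, worlds in u.beliefs.items():
        if ag in fully_obs:           # ag sees the action: update her worlds
            v.beliefs[ag] = {ontic_transition(w, caused, fully_obs,
                                              oblivious, memo)
                             for w in worlds}
        else:                         # oblivious: beliefs are unchanged
            v.beliefs[ag] = set(worlds)
    return v
```

Executing an ontic action with effect opened on a self-referential possibility updates the fluent set and rebinds the fully observant agents' belief sets to the new possibility, analogously to how v turns into w in Figure 5.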
Each state in mA∗ will be represented by a Kripke structure, while in mAρ it will be a possibility (expanded into its respective system of equations for clarity). The observability relations of each action instance in ∆c are expressed in the following table:

     distract(C)⟨A⟩   open⟨A⟩   peek⟨A⟩
FD   A, B, C          A, B      A
PD   -                -         B
OD   -                C         C

Table 2: Observability relations of the action instances in ∆c.

The initial state, based on Example 2, is defined by the conditions:

initially C(key(A))
initially C(¬key(B))
initially C(¬key(C))
initially C(¬opened)
initially C(¬Bag heads ∧ ¬Bag ¬heads)   for ag ∈ {A, B, C}
initially C(look(ag))                    for ag ∈ {A, B, C}
initially ¬heads

Finally, the goal, which both the states in Figure 9 entail, is expressed by the following formulae:

BA ¬heads ∧ BA(BB(BA heads ∨ BA ¬heads))
BB(BA heads ∨ BA ¬heads) ∧ (¬BB heads ∧ ¬BB ¬heads)
BC[ ⋀_{ag ∈ {A, B, C}} (¬Bag heads ∧ ¬Bag ¬heads) ]

The plan for Example 2

(a) The initial Kripke structure (M0, s0): worlds s0 and s1, linked to each other and to themselves by edges labelled A, B, C, with
M0[π](s0) = {look(ag), key(A)}
M0[π](s1) = {look(ag), key(A), heads}
where ag ∈ {A, B, C}.

(b) The initial possibility u:
u  = {(ag, {u, u′}), look(ag), key(A)}
u′ = {(ag, {u, u′}), look(ag), key(A), heads}
where ag ∈ {A, B, C}.

Figure 6: The initial state.

(a) The Kripke structure (M1, p0), obtained after the execution of distract(C)⟨A⟩ in (M0, s0) (Figure 6a): worlds p0 and p1, linked to each other and to themselves by edges labelled A, B, C, with
M1[π](p0) = {look(A), look(B), key(A)}
M1[π](p1) = {look(A), look(B), key(A), heads}

(b) Possibility v, obtained after the execution of distract(C)⟨A⟩ in u (Figure 6b):
v  = {(ag, {v, v′}), look(A), look(B), key(A)}
v′ = {(ag, {v, v′}), look(A), look(B), key(A), heads}
where ag ∈ {A, B, C}.

Figure 7: Execution of distract(C)⟨A⟩.

(a) The Kripke structure (M2, q0), obtained after the execution of open⟨A⟩ in (M1, p0) (Figure 7a): worlds q0 and q1, linked to each other and to themselves by edges labelled A, B, with C-edges from q0 and q1 to the old worlds p0 and p1 (still linked by edges labelled A, B, C), and with
M2[π](q0) = {look(ag), key(A), opened}
M2[π](q1) = {look(ag), key(A), opened, heads}
M2[π](p0) = M1[π](p0)
M2[π](p1) = M1[π](p1)
where ag ∈ {A, B}.

(b) Possibility w, obtained after the execution of open⟨A⟩ in v (Figure 7b):
w  = {(ag, {w, w′}), (C, {v, v′}), look(ag), key(A), opened}
w′ = {(ag, {w, w′}), (C, {v, v′}), look(ag), key(A), opened, heads}
where ag ∈ {A, B} and v, v′ are the possibilities represented in Figure 7b.

Figure 8: Execution of open⟨A⟩.

(a) The Kripke structure (M3, r0), obtained after the execution of peek⟨A⟩ in (M2, q0): worlds r0 and r1, each with a reflexive loop labelled A, B, linked to each other by an edge labelled B, with C-edges to the old worlds p0 and p1 (still linked by edges labelled A, B, C), and with
M3[π](r0) = {look(ag), key(A), opened}
M3[π](r1) = {look(ag), key(A), opened, heads}
M3[π](p0) = M1[π](p0)
M3[π](p1) = M1[π](p1)
where ag ∈ {A, B}.

(b) Possibility z, obtained after the execution of peek⟨A⟩ in w (Figure 8b):
z  = {(A, {z}), (B, {z, z′}), (C, {v, v′}), look(ag), key(A), opened}
z′ = {(A, {z′}), (B, {z, z′}), (C, {v, v′}), look(ag), key(A), opened, heads}
where ag ∈ {A, B} and the possibilities v, v′ are represented in Figure 7b.

Figure 9: Execution of peek⟨A⟩.
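The goal formulae above are verified by entailment on possibilities. The following is a minimal, illustrative Python sketch, not the paper's implementation: the tuple encoding of formulae and the names P and entails are ours, and only fluents, ¬, ∧ and the belief operator Bag are covered. Since each recursive call strictly shrinks the formula, the check terminates even though a possibility may be cyclic.

```python
# Illustrative sketch only: the formula encoding and the names P and
# entails are ours, not the paper's code.

class P:
    """A possibility: the set of true fluents plus an
    agent -> set-of-possibilities belief map (may be cyclic)."""
    def __init__(self, fluents=()):
        self.fluents = set(fluents)
        self.beliefs = {}

def entails(u, phi):
    """Entailment u |= phi for a fragment of the epistemic language:
    a fluent is a string, negation is ('not', f), conjunction is
    ('and', f, g), and belief is ('B', ag, f) for B_ag f."""
    if not isinstance(phi, tuple):        # atomic fluent
        return phi in u.fluents
    op = phi[0]
    if op == 'not':
        return not entails(u, phi[1])
    if op == 'and':
        return entails(u, phi[1]) and entails(u, phi[2])
    if op == 'B':                         # B_ag f: f holds in every world
        return all(entails(w, phi[2]) for w in u.beliefs.get(phi[1], ()))
    raise ValueError("unknown connective: %r" % op)
```

On the possibility z of Figure 9b, for instance, this check accepts BA ¬heads and rejects both BB heads and BB ¬heads, matching the second goal formula.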