                                                  Proceedings of the
     2nd Workshop on Artificial Intelligence and Formal Verification, Logics, Automata and Synthesis (OVERLAY),
                                                  September 25, 2020




     Reasoning in Multi-Agent Conformant Planning
                over Transition Systems∗

                                        Peipei Wu¹ and Yanjun Li¹
                           ¹College of Philosophy, Nankai University, China


                                                    Abstract

                Reasoning about actions and information is one of the most active areas of
            research in artificial intelligence. In this paper, we study reasoning in multi-
            agent conformant planning. We introduce a formalism to trace the update of
            information in multi-agent conformant planning over transition systems. We
            propose a dynamic epistemic logical framework EALn to capture reasoning in such
            scenarios and present an upper bound on the time complexity of multi-agent
            conformant planning in terms of EALn .


1     Introduction
Conformant planning, which is a branch of artificial intelligence (AI), deals with devising an action
sequence to achieve a goal in the presence of uncertainty about the initial state (see [15]). Tracking the
update of the uncertainty during the execution of actions plays an important role. In single-agent settings,
the update of uncertainty is direct and intuitive. However, the situation becomes much more involved in
multi-agent settings, since nested beliefs need to be considered. How to intuitively represent and track the
update of uncertainty in multi-agent planning is one of the main challenges in the AI community.
    Instead of using transition systems as the underlying formalism of planning, in recent years, there has
been a growing interest in handling multi-agent planning in the dynamic epistemic logic (DEL) framework
(see e.g., [5, 12, 2, 3, 17, 14, 10, 6, 13]). In the literature, this is usually called epistemic planning. In epistemic
planning, states are epistemic models, actions are event models and the state transitions are implicitly
encoded by the update product which computes a new epistemic model based on an epistemic model and
an event model. One advantage of this approach is its expressiveness in handling partially observable
actions, such as private announcements. However, this expressiveness comes at a price: as shown in [5, 4],
multi-agent epistemic planning is undecidable in general. Moreover, if only fully observable actions are
considered, the DEL formalism of event models is more complex and less intuitive than transition systems.
In [11, 16], another approach is proposed that uses the core idea of DEL, but not its formalism, to track
the uncertainty change over transition systems in single-agent conformant planning; however, this approach
cannot be applied to handle multi-agent planning.
    This paper proposes a semantic-driven dynamic epistemic framework for reasoning about knowledge
and action in multi-agent conformant planning. The main contributions of this paper are summarized as
follows: 1) We introduce an update formalism to track information change in multi-agent conformant
planning over transition systems. The update formalism can be seen as a generalized version of the
   ∗ This work is supported by the Fundamental Research Funds for the Central Universities 63202061 and 63202926.

Corresponding author: Yanjun Li
                 Copyright © 2020 for this paper by its authors.
                 Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).


approach introduced in [11], although the generalization is not straightforward. In [11], the agent's belief is
represented by a set of states that the agent cannot distinguish. This representation no longer works
in multi-agent settings, since nested beliefs need to be considered. In this paper, we use a
non-well-founded notion, possibility, to represent belief states in multi-agent planning, and we define the
update on possibilities to capture information change. 2) We propose a logical framework EALn with
DEL-style update semantics and expressive language. In the literature of epistemic planning, such as [6],
the logical language is an extension of propositional logic with knowledge modalities. The language of EALn
has not only knowledge modalities but also action modalities, which allows us to express more complicated
planning goals. Moreover, we can talk about the interaction between action and knowledge in the logical
language. This gives a more in-depth understanding of reasoning about action and information in
multi-agent planning. 3) We present an upper bound on the time complexity of multi-agent conformant
planning in terms of EALn. We show that multi-agent conformant planning with a goal expressed by an
EALn-formula is in double exponential time.


2      Information change in multi-agent conformant planning
Let Act be a set of actions, P be a set of proposition variables, and Ag be a set of agents.
    A transition system T is a triple ⟨S^T, {R_a^T | a ∈ Act}, V^T⟩, where S^T is a non-empty set of states,
for each a ∈ Act, R_a^T is a function from S^T to P(S^T), and V^T : P → P(S^T) is an assignment function.
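For concreteness, such a triple can be encoded directly as data. The following Python sketch is ours (the paper gives no code); the names TransitionSystem, R, V, and succ are our own conventions, and the same sketch is reused in the examples below.

    from dataclasses import dataclass

    @dataclass
    class TransitionSystem:
        states: set   # S^T
        R: dict       # R[a][s] is the set of a-successors of state s (may be empty)
        V: dict       # V[p] is the set of states where proposition p holds

        def succ(self, a, s):
            """R_a^T(s), defaulting to the empty set."""
            return self.R.get(a, {}).get(s, set())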
    We use possibilities on transition systems, a variant of the notion used in [8, 7], to represent
agents' epistemic uncertainty about states. Before introducing the notion of possibility, let us first
introduce some auxiliary notions.
    Let T be a transition system. A Kripke structure N on T is a triple ⟨W^N, {⇢_i^N | i ∈ Ag}, L^N⟩,
where W^N is a non-empty set of possible worlds, for each i ∈ Ag, ⇢_i^N is a binary relation on W^N, and
L^N : W^N → S^T is a function that labels each possible world with a state in T. For each w ∈ W^N, we
say that (N, w) is a pointed Kripke structure.
    A Kripke structure N on T models agents’ uncertainty over states in T . By [1], we know that a
pointed Kripke structure can be represented by the following notion from non-well-founded set theory.
    Let N be a Kripke structure on a transition system T. A decoration d of N is a function that assigns
to each world w ∈ W^N a function d(w) such that

     • d(w)(T) = L^N(w) (i.e., d(w) assigns to T the state with which L^N labels w), and

     • for each i ∈ Ag, d(w)(i) = {d(w′) | w ⇢_i w′} (i.e., d(w) assigns to each agent i the (non-well-
       founded) set of functions associated with the worlds reachable from w by ⇢_i).

If d is a decoration of N and w is a world in N, we say that d(w) is the solution of w in N, and that (N, w) is
a picture of d(w).
    Now we are ready to introduce the concept of possibility.
Definition 1 (Possibility). Let T be a transition system. A possibility α on T is a function that assigns
to T a state in S^T, and to each agent i ∈ Ag a set of possibilities.
    As shown in [8], there is a close relationship between these (non-well-founded) possibilities on T and
Kripke structures on T . Each Kripke structure N has a unique decoration that assigns to each world in
N a possibility, and each possibility has a picture. For example, let T0 be the transition system depicted
in Figure 1. A possibility α1 on T0 is presented in Figure 2a. The pointed Kripke structure (N , w1 ) on T0 ,
depicted in Figure 2b, is a picture of α1 .
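As a concrete illustration, here is our reconstruction of T0 and of the Kripke structure N of Figure 2b, encoded with the sketch introduced above; the world names w1 and w2 and the relations follow our reading of the figures, not code from the paper.

    # The transition system T0 of Figure 1: an a-transition from s1 to t1,
    # a-transitions from s2 to both t1 and t2, and p true at t1 only.
    T0 = TransitionSystem(
        states={"s1", "s2", "t1", "t2"},
        R={"a": {"s1": {"t1"}, "s2": {"t1", "t2"}}},
        V={"p": {"t1"}},
    )

    # The Kripke structure N on T0 (Figure 2b): two worlds labelled s1 and s2,
    # reflexive i- and j-edges, and j-edges between w1 and w2 in both directions.
    N_worlds = {"w1", "w2"}
    N_edges = {
        "i": {("w1", "w1"), ("w2", "w2")},
        "j": {("w1", "w1"), ("w2", "w2"), ("w1", "w2"), ("w2", "w1")},
    }
    N_label = {"w1": "s1", "w2": "s2"}
    # (N, w1) is then a picture of the possibility α1, and (N, w2) of α2.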
    The choice of possibilities over Kripke structures as the representation of uncertainty provides several
advantages (see [9]). One of them is that the update on possibilities is natural and elegant. The update on
possibilities, defined below, tracks the change of agents' uncertainty when an action is executed.

Definition 2 (Update). Let α be a possibility on T. The update of α with an action a ∈ Act, denoted α|a,
is a set of possibilities on T such that α′ ∈ α|a if and only if α′(T) ∈ R_a^T(α(T)) and α′(i) = ⋃_{β∈α(i)} β|a
for each i ∈ Ag.
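Possibilities are non-well-founded objects, so a program cannot build them directly; by the picture correspondence above, however, a possibility can be represented by a pointed Kripke structure on T, and the update of Definition 2 can be computed on that representation. The following sketch is ours (it anticipates the N†-style construction used at the end of Section 3 and builds on the earlier sketch).

    @dataclass
    class EpistemicState:
        worlds: set   # W^N
        edges: dict   # edges[i] is the relation of agent i, a set of world pairs
        label: dict   # L^N : W^N -> S^T

    def update(T, M, a):
        """Update the structure M with action a: every world w labelled s branches
        into a world (w, s') for each s' in R_a^T(s); the i-edges follow the old
        i-edges on the first components, and the new label of (w, s') is s'.
        The possibility pictured by (M, w), updated with a, is then pictured by
        (update(T, M, a), (w, s')) for each s' in T.succ(a, M.label[w])."""
        new_worlds = {(w, s2) for w in M.worlds for s2 in T.succ(a, M.label[w])}
        new_edges = {
            i: {(u, v) for u in new_worlds for v in new_worlds if (u[0], v[0]) in rel}
            for i, rel in M.edges.items()
        }
        new_label = {u: u[1] for u in new_worlds}
        return EpistemicState(new_worlds, new_edges, new_label)

On the structure of Figure 2b, the update with a produces the worlds (w1, t1), (w2, t1), and (w2, t2); these points picture the possibilities α11, α21, and α22 discussed below.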


Figure 1: Transition system T0, with states s1, s2, t1, t2, an a-transition from s1 to t1, a-transitions from s2 to t1 and to t2, and p true at t1 only.

Figure 2: Possibility α1 on T0. (a) System of equations of α1 (and α2). (b) Picture of α1: (N, w1), where w1 is labelled s1 and w2 is labelled s2, both worlds carry reflexive i- and j-edges, and w1 and w2 are connected by j-edges in both directions.


    As an example, the update of the possibility α1 of Figure 2a with a results in the singleton set {α11},
where α11 is depicted in Figure 3. Moreover, the update of α2 of Figure 2a, whose picture is (N, w2)
in Figure 2b, is the set {α21, α22}, both of which are depicted in Figure 3. The pointed Kripke
structures (N′, w21) and (N′, w22) are the pictures of α21 and α22, respectively. We would like to point
out that the difference between α11(i) and α21(i) corresponds to the fact that the information state of
agent i at state t1 depends on how he/she gets there, while the identity of α11(j) and α21(j) reflects the
fact that the information state of agent j at t1 is the same, because j cannot distinguish s1 from s2.

Figure 3: Possibility α11 on T0. (a) System of equations of α11, α21, and α22. (b) Picture of α11: (N′, w11), where w11 and w21 are labelled t1 and w22 is labelled t2; the j-relation connects all three worlds (and is reflexive), while the i-relation is reflexive on all three worlds and additionally connects w21 and w22 in both directions.


3     The logic EALn and its application
In this section, we first introduce the language and the semantics of EALn and then use it to capture
reasoning in multi-agent conformant planning. At the end of the section, we provide an upper bound for
multi-agent conformant planning in terms of EALn .
Definition 3 (Language). The language L_EALn is defined by the following BNF:

                                            φ ::= p | ¬φ | φ ∧ φ | [a]φ | K_i φ,

where p ∈ P, a ∈ Act, and i ∈ Ag. We use ⊥, ∨, →, and ⟨·⟩ as the usual abbreviations.
   The formula [a]φ intuitively means that φ holds if the action a is successfully executed. The formula K_i φ
expresses that agent i knows that φ. Since the language of EALn contains both action modalities and
knowledge modalities, one can talk about the interaction between action and knowledge in the
object language.
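For the mechanical checks sketched later in this section, formulas of L_EALn can be encoded, for instance, as nested tuples; this is our convention, not the paper's.

    def Prop_(p):      return ("p", p)            # p
    def Not(phi):      return ("not", phi)        # ¬φ
    def And(phi, psi): return ("and", phi, psi)   # φ ∧ ψ
    def Box(a, phi):   return ("box", a, phi)     # [a]φ
    def K(i, phi):     return ("K", i, phi)       # K_i φ

    def Dia(a, phi):   return Not(Box(a, Not(phi)))   # ⟨a⟩φ as an abbreviation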
   Formulas of L_EALn will be interpreted on dynamic epistemic models. A dynamic epistemic model is a
pair (T, α), where T is a transition system and α is a possibility on T such that for each i ∈ Ag, α is in
α(i) and α′ ∈ α(i) implies α′(i) = α(i).


Definition 4 (Update semantics). Given a dynamic epistemic model (T, α) and a formula φ ∈ L_EALn,
the satisfaction relation ⊨ is defined as follows:

                                    T, α ⊨ p       ⟺  α(T) ∈ V^T(p)
                                    T, α ⊨ ¬φ      ⟺  T, α ⊭ φ
                                    T, α ⊨ φ ∧ ψ   ⟺  T, α ⊨ φ and T, α ⊨ ψ
                                    T, α ⊨ K_i φ   ⟺  for all α′ ∈ α(i): T, α′ ⊨ φ
                                    T, α ⊨ [a]φ    ⟺  for all α′ ∈ α|a: T, α′ ⊨ φ

We say that φ is valid, denoted ⊨ φ, if T, α ⊨ φ for each dynamic epistemic model (T, α).
   Within EALn, we can express the epistemic states of agents and their evolution in multi-agent planning.
Take the following example. Let T0 and α1 be as depicted in Figure 1 and Figure 2, respectively. We then
have the following statements (they can also be checked mechanically; see the sketch after the list).
  • T0, α1 ⊨ [a](K_i p ∧ ¬K_j p). When the initial possibility is α1, after the action a is executed, agent i
     knows that p, while agent j does not know that p.
  • T0, α2 ⊨ [a](¬K_i p ∧ ¬K_j p). When the initial possibility is α2, after the action a is executed, neither
     i nor j knows that p.
  • For all φ ∈ L_EALn, T0, α1 ⊨ [a]K_j φ iff T0, α2 ⊨ [a]K_j φ. Whether the initial possibility is α1 or α2,
     after the action a is executed, the knowledge of j is the same.
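The following sketch (again our code, building on the earlier sketches) implements the clauses of Definition 4 on the picture representation of Section 2: a dynamic epistemic model (T, α) is given by T together with a pointed structure (M, w) picturing α.

    def sat(T, M, w, phi):
        """T, α ⊨ φ, where α is the possibility pictured by (M, w)."""
        op = phi[0]
        if op == "p":
            return M.label[w] in T.V.get(phi[1], set())
        if op == "not":
            return not sat(T, M, w, phi[1])
        if op == "and":
            return sat(T, M, w, phi[1]) and sat(T, M, w, phi[2])
        if op == "K":                      # K_i φ: all i-accessible points satisfy φ
            i, psi = phi[1], phi[2]
            return all(sat(T, M, v, psi) for (u, v) in M.edges[i] if u == w)
        if op == "box":                    # [a]φ: all points of the update satisfy φ
            a, psi = phi[1], phi[2]
            Ma = update(T, M, a)
            return all(sat(T, Ma, (w, s2), psi) for s2 in T.succ(a, M.label[w]))
        raise ValueError(f"unknown connective: {phi!r}")

    # Checking the first statement above: T0, α1 ⊨ [a](K_i p ∧ ¬K_j p).
    M = EpistemicState(N_worlds, N_edges, N_label)
    goal = Box("a", And(K("i", Prop_("p")), Not(K("j", Prop_("p")))))
    assert sat(T0, M, "w1", goal)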
   Within EALn , we can capture reasoning about action and information change in multi-agent conformant
planning.

Lemma 1 (Perfect Recall). For each dynamic epistemic model (T, α), we have that T, α ⊨ K_i[a]φ → [a]K_i φ
for each i ∈ Ag and each a ∈ Act.
Proof. Assume that T, α ⊨ K_i[a]φ. We will show that T, α ⊨ [a]K_i φ. Take any α′ ∈ α|a and any β′ ∈ α′(i).
To show that T, α ⊨ [a]K_i φ, by the update semantics, we only need to show that T, β′ ⊨ φ.
    Since α′ ∈ α|a, by Definition 2, it follows that α′(i) = ⋃_{β∈α(i)} β|a. Since β′ ∈ α′(i), we then have
that there is some β ∈ α(i) such that β′ ∈ β|a. Since T, α ⊨ K_i[a]φ and β ∈ α(i), it follows that
T, β ⊨ [a]φ. Moreover, since β′ ∈ β|a, it follows that T, β′ ⊨ φ. Therefore, we have shown that T, β′ ⊨ φ
for each α′ ∈ α|a and each β′ ∈ α′(i), namely T, α ⊨ [a]K_i φ. Thus, we have shown that if T, α ⊨ K_i[a]φ
then T, α ⊨ [a]K_i φ.
Lemma 2 (No Miracles). For each dynamic epistemic model (T, α), we have that T, α ⊨ ⟨a⟩K_i φ → K_i[a]φ
for each i ∈ Ag and each a ∈ Act.
Proof. Assume that T, α ⊨ ⟨a⟩K_i φ. We will show that T, α ⊨ K_i[a]φ. By the update semantics, we only
need to show that T, β′ ⊨ φ for each α′ ∈ α(i) and each β′ ∈ α′|a.
   Since T, α ⊨ ⟨a⟩K_i φ, it follows that there is some γ ∈ α|a such that T, γ ⊨ K_i φ. Since γ ∈ α|a, by
Definition 2, it follows that γ(i) = ⋃_{β∈α(i)} β|a. Take any α′ ∈ α(i). We then have that α′|a ⊆ γ(i).
Take any β′ ∈ α′|a. We then have that β′ ∈ γ(i). Since T, γ ⊨ K_i φ, it follows that T, β′ ⊨ φ. Thus, we
have shown that T, β′ ⊨ φ for each α′ ∈ α(i) and each β′ ∈ α′|a, namely T, α ⊨ K_i[a]φ. Thus, we have
shown that if T, α ⊨ ⟨a⟩K_i φ then T, α ⊨ K_i[a]φ.
    Intuitively, perfect recall means that if the agent cannot distinguish two states after doing action a,
then he/she could not distinguish them before. No miracles means that if the agent cannot distinguish two
states and the same action is executed on both states, then he/she cannot distinguish the resulting states.
These properties are depicted in Figure 4.
    We now introduce multi-agent conformant planning in terms of EALn. First, we introduce some
auxiliary notation. Let Σ be a set of possibilities. We use Σ|a to denote the set ⋃_{α∈Σ} α|a. Let σ ∈ Act*
be an action sequence a1 · · · an. We use α|σ to denote the information state (· · · ((α|a1)|a2) · · · )|an.

Figure 4: Perfect recall and no miracles.

Definition 5 (Multi-agent conformant planning). A multi-agent conformant planning problem P is a
tuple ⟨T, α, G, {Act_i | i ∈ G}, φ_g⟩, where T is a transition system, α is the initial possibility on T, G
is a group of agents, Act_i ⊆ Act is a set of actions that agent i can perform, and φ_g ∈ L_EALn is the
goal formula. A solution to P is a finite sequence of actions σ = a1 · · · an ∈ (⋃_{i∈G} Act_i)* such that
1) the sequence σ is applicable¹ in the state α′(T) for each α′ ∈ ⋂_{i∈G} α(i), and 2) T, β ⊨ φ_g for each
α′ ∈ ⋂_{i∈G} α(i) and each β ∈ α′|σ.
    Let ⦇a⦈φ be an abbreviation of the formula ⟨a⟩⊤ ∧ [a]φ. It can be shown that an action sequence
a1 · · · an is applicable on the state α(T) iff T, α ⊨ ⦇a1⦈ · · · ⦇an⦈⊤. We then have the following theorem,
which means that solutions can be verified in EALn.

Theorem 3. Let P = ⟨T, α, G, {Act_i | i ∈ G}, φ_g⟩ be a planning problem. An action sequence a1 · · · an
is a solution for P iff T, α′ ⊨ ⦇a1⦈ · · · ⦇an⦈φ_g for each α′ ∈ ⋂_{i∈G} α(i).
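In the same style, Theorem 3 turns solution checking into model checking. A sketch of the corresponding check, with the obvious recursive reading of ⦇a1⦈ · · · ⦇an⦈φ_g (our code, building on the sketches above):

    def group_alternatives(M, w, G):
        """The points picturing the possibilities in the intersection of α(i) over i in G."""
        return {v for v in M.worlds if all((w, v) in M.edges[i] for i in G)}

    def holds_after(T, M, w, plan, goal):
        """T, α ⊨ ⦇a1⦈ ... ⦇an⦈ goal for the possibility α pictured by (M, w)."""
        if not plan:
            return sat(T, M, w, goal)
        a, rest = plan[0], plan[1:]
        succs = T.succ(a, M.label[w])
        if not succs:                  # ⟨a⟩⊤ fails: a is not executable here
            return False
        Ma = update(T, M, a)
        return all(holds_after(T, Ma, (w, s2), rest, goal) for s2 in succs)

    def is_solution(T, M, w, G, plan, goal):
        """Theorem 3: the check must pass at every group alternative.
        (Membership of the plan's actions in the Act_i is not checked here.)"""
        return all(holds_after(T, M, v, plan, goal)
                   for v in group_alternatives(M, w, G))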

   In the remainder of this paper, we show an upper bound on the time complexity of multi-agent conformant
planning in terms of EALn.
    Let T = ⟨S, {R_a | a ∈ Act}, V⟩ be a transition system and N = ⟨W, {⇢_i | i ∈ Ag}, L⟩ be a
Kripke structure on T. The Kripke structure N† = ⟨W^{N†}, {⇢_i^{N†} | i ∈ Ag}, L^{N†}⟩ is defined as follows:
W^{N†} = {(w, L(w)) | w ∈ W}, ⇢_i^{N†} = {((w, L(w)), (w′, L(w′))) | w ⇢_i w′}, and L^{N†}(w, L(w)) = L(w).
If N is a Kripke structure on T, so is N†. It is obvious that (N, w) is isomorphic to (N†, (w, L(w))).
Therefore, if (N, w) is a picture of a possibility α, so is (N†, (w, L(w))).
    Given a transition system T and a Kripke structure N on T, let the Kripke structure N† be defined as
above. We now define the update of N† with an action a ∈ Act, denoted N†|a. The Kripke structure
N†|a = ⟨W^{N†|a}, {⇢_i^{N†|a} | i ∈ Ag}, L^{N†|a}⟩ is defined as follows: W^{N†|a} = {(w, s′) ∈ W × S | there is
(w, s) ∈ W^{N†} such that (s, s′) ∈ R_a}, ⇢_i^{N†|a} = {((w, s), (w′, s′)) | w ⇢_i w′}, and L^{N†|a}(w, s) = s. It is
evident that N†|a is also a Kripke structure on T. We have seen that if (N, w) is a picture of a possibility
α, then so is (N†, (w, L(w))). If α′ ∈ α|a, it follows that α(T) R_a α′(T). Let α(T) = s and α′(T) = s′. Since
(N, w) is a picture of α, it follows that L(w) = s, and then we have that (w, s) ∈ W^{N†}.
Since s R_a s′, by the definition of N†|a, we know that (w, s′) ∈ W^{N†|a}. With the bisimulation principle
(see [8]), it can be shown that (N†|a, (w, s′)) is a picture of α′. Moreover, it can be shown that for each
action sequence a1 · · · an and each β ∈ α|a1 · · · |an, the pointed Kripke structure (N†|a1 · · · |an, (w, β(T)))
is a picture of β. Since W^{N†|a1···|an} ⊆ W × S, it follows that the number of possibilities that can be
generated from the initial possibility α is bounded by the size of the powerset of W × S.
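The construction above suggests a compact representation for search: it suffices to keep the set of pairs (w, s) of an original world and its current state, since the edges and labels of N†|a1 · · · |an are determined by those pairs. A sketch of this update, in our notation:

    def flat_update(T, worlds, a):
        """Worlds of the N†|a-style update of a 'flattened' structure: each pair
        (w, s) branches into (w, s') for every s' in R_a(s).  Edges and labels are
        recovered from the original structure and the second components."""
        return frozenset((w, s2) for (w, s) in worlds for s2 in T.succ(a, s))

Since the result is always a subset of W × S, iterating flat_update can produce at most 2^(|W|·|S|) distinct world sets, which is the bound used in Theorem 4 below.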
    Given a possibility α, let the size of α be the size of the minimal pointed Kripke structure (N, w) such
that (N, w) is a picture of α. Given a multi-agent conformant planning problem P = ⟨T, α, G, {Act_i | i ∈
G}, φ_g⟩, let the size of P be the product of the size of T, the size of α, and the size of φ_g.
Theorem 4. Multi-agent conformant planning in terms of EALn is in double exponential time.
Proof idea. Given a multi-agent conformant planning problem P, we can do a breadth-first search with
duplicate checking on possibilities to find a plan (see the sketch after the footnote). Since we have shown
above that the number of possibilities that can be generated from the initial possibility α is bounded by an
exponential in the size of P, it follows that the depth of the search tree is bounded by an exponential in the
size of P. Therefore, the time of the breadth-first search is doubly exponential in the size of P. Moreover,
by Theorem 3, we know that the verification of a plan can be reduced to model checking in EALn. Since
model checking in EALn is in polynomial time, it follows that searching for a plan for P is in double
exponential time.
   ¹ a1 · · · an is applicable on a state s iff R_{a1}(s) ≠ ∅, and for each 1 ≤ k < n, t ∈ R_{a1···ak}(s) implies R_{a_{k+1}}(t) ≠ ∅.
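A sketch of the search described in the proof idea, assembled from the pieces above (our code, not the authors'; duplicate checking uses the flattened world sets, so the frontier is bounded as explained after the N†|a construction):

    from collections import deque

    def flat_state(M, worlds):
        """The Kripke structure on T determined by a flattened world set."""
        return EpistemicState(
            worlds=set(worlds),
            edges={i: {(u, v) for u in worlds for v in worlds if (u[0], v[0]) in rel}
                   for i, rel in M.edges.items()},
            label={u: u[1] for u in worlds},
        )

    def conformant_plan(T, M, w, G, actions, goal):
        """Breadth-first search with duplicate checking for a conformant plan;
        (M, w) pictures the initial possibility α, and actions is a subset of
        the union of the Act_i for i in G."""
        alts = group_alternatives(M, w, G)
        start = frozenset((v, M.label[v]) for v in M.worlds)
        frontier, seen = deque([(start, [])]), {start}
        while frontier:
            worlds, plan = frontier.popleft()
            state = flat_state(M, worlds)
            # Goal test: φ_g must hold in every possibility reachable from a
            # group alternative via the plan executed so far (cf. Theorem 3).
            if all(sat(T, state, u, goal) for u in worlds if u[0] in alts):
                return plan
            for a in actions:
                # The next action must be executable in every state the group
                # still considers possible (applicability, footnote 1).
                if not all(T.succ(a, s) for (v, s) in worlds if v in alts):
                    continue
                nxt = flat_update(T, worlds, a)
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, plan + [a]))
        return None                        # no conformant plan exists

On the running example, conformant_plan(T0, M, "w1", {"i"}, ["a"], K("i", Prop_("p"))) returns ["a"].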


References
 [1] P. Aczel. Non-well-founded sets, volume 14 of CSLI lecture notes series. CSLI, 1988.
 [2] M. B. Andersen, T. Bolander, and M. H. Jensen. Conditional epistemic planning. In Logics in
     Artificial Intelligence, pages 94–106. Springer, 2012.

 [3] G. Aucher. DEL-sequents for regression and epistemic planning. Journal of Applied Non-Classical
     Logics, 22(4):337–367, 2012.
 [4] G. Aucher and T. Bolander. Undecidability in epistemic planning. In Proceedings of IJCAI ’13, pages
     27–33, 2013.
 [5] T. Bolander and M. B. Andersen. Epistemic planning for single and multi-agent systems. Journal of
     Applied Non-Classical Logics, 21(1):9–34, 2011.
 [6] T. Bolander, M. H. Jensen, and F. Schwarzentruber. Complexity results in epistemic planning. In
     Proceedings of IJCAI ’15, pages 2791–2797, 2015.
 [7] F. Fabiano, A. Burigana, A. Dovier, and E. Pontelli. EFP 2.0: A multi-agent epistemic solver with
     multiple e-state representations. In J. C. Beck, O. Buffet, J. Hoffmann, E. Karpas, and S. Sohrabi,
     editors, Proceedings of the Thirtieth International Conference on Automated Planning and Scheduling,
     Nancy, France, October 26-30, 2020, pages 101–109. AAAI Press, 2020.
 [8] J. Gerbrandy and W. Groeneveld. Reasoning about Information Change. Journal of Logic, Language
     and Information, 6(2):147–169, 1997.

 [9] W. Groeneveld. Logical Investigations into Dynamic Semantics. PhD thesis, University of Amsterdam,
     1995.
[10] M. H. Jensen. Epistemic and Doxastic Planning. PhD thesis, Technical University of Denmark, 2014.
[11] Y. Li, Q. Yu, and Y. Wang. More for free: a dynamic epistemic framework for conformant planning
     over transition systems. Journal of Logic and Computation, 27(8):2383–2410, 2017.

[12] B. Löwe, E. Pacuit, and A. Witzel. DEL planning and some tractable cases. In LORI’11, pages
     179–192. Springer, 2011.
[13] C. Muise, V. Belle, P. Felli, S. McIlraith, T. Miller, A. R. Pearce, and L. Sonenberg. Planning
     over multi-agent epistemic states: A classical planning approach. In The 29th AAAI Conference on
     Artificial Intelligence, 2015.
[14] P. Pardo and M. Sadrzadeh. Strong planning in the logics of communication and change. In Declarative
     Agent Languages and Technologies X, pages 37–56. Springer, 2013.
[15] D. E. Smith and D. S. Weld. Conformant graphplan. In Proceedings of the Fifteenth National/Tenth
     Conference on Artificial Intelligence/Innovative Applications of Artificial Intelligence, AAAI ’98/IAAI
     ’98, page 889–896, USA, 1998. American Association for Artificial Intelligence.
[16] Q. Yu, Y. Li, and Y. Wang. A dynamic epistemic framework for conformant planning. In R. Ramanu-
     jam, editor, Proceedings Fifteenth Conference on Theoretical Aspects of Rationality and Knowledge,
     TARK 2015, Carnegie Mellon University, Pittsburgh, USA, June 4-6, 2015, volume 215 of EPTCS,
     pages 298–318, 2015.

[17] Q. Yu, X. Wen, and Y. Liu. Multi-agent epistemic explanatory diagnosis via reasoning about actions.
     In Proceedings of IJCAI’13, pages 1183–1190, 2013.