=Paper=
{{Paper
|id=Vol-2785/paper6
|storemode=property
|title=Reasoning in Multi-Agent Conformant Planning over Transition Systems
|pdfUrl=https://ceur-ws.org/Vol-2785/paper6.pdf
|volume=Vol-2785
|authors=Peipei Wu,Yanjun Li
|dblpUrl=https://dblp.org/rec/conf/overlay/WuL20
}}
==Reasoning in Multi-Agent Conformant Planning over Transition Systems ==
Proceedings of the 2nd Workshop on Artificial Intelligence and Formal Verification, Logics, Automata and Synthesis (OVERLAY), September 25, 2020 Reasoning in Multi-Agent Conformant Planning over Transition Systems∗ Peipei Wu1 and Yanjun Li1 1 College of Philosophy, Nankai University, China Abstract Reasoning about actions and information is one of the most active areas of research in artificial intelligence. In this paper, we study the reasoning in multi- agent conformant planning. We introduce a formalism to trace the update of information in multi-agent conformant planning over transition systems. We propose a dynamic epistemic logical framework EALn to capture reasoning in such scenarios and present an upper bound on the time complexity of multi-agent conformant planning in terms of EALn . 1 Introduction Conformant planning, which is a branch of artificial intelligence (AI), deals with devising an action sequence to achieve a goal in the presence of uncertainty about the initial state (see [15]). Tracking the update of the uncertainty during the execution of actions plays an important role. In single-agent settings, the update of uncertainty is direct and intuitive. However, the situation becomes much more involved in multi-agent settings, since nested beliefs need to be considered. How to intuitively represent and track the update of uncertainty in multi-agent planning is one of the main challenges in the AI community. Instead of using transition systems as the underlying formalism of planning, in recent years, there has been a growing interest in handling multi-agent planning in dynamic epistemic logic (DEL) framework (see e.g., [5, 12, 2, 3, 17, 14, 10, 6, 13]). Usually, it is called epistemic planning in literature. In epistemic planning, states are epistemic models, actions are event models and the state transitions are implicitly encoded by the update product which computes a new epistemic model based on an epistemic model and an event model. One advantage of this approach is its expressiveness in handling partially observable actions, such as private announcements. However, this expressiveness comes at a price, as shown in [5, 4], multi-agent epistemic planning is undecidable in general. Moreover, if only fully observable actions are considered, the DEL formalism of event models is more complex and less intuitive than transition systems. In [11, 16], another approach is proposed which uses the core-idea of DEL but not its formalism to track the uncertainty change over transition systems in single-agent conformant planning, but this approach cannot be applied to handle multi-agent planning. This paper proposes a semantic-driven dynamic epistemic framework for reasoning about knowledge and action in multi-agent conformant planning. The main contributions of this paper are summarized as follows: 1) We introduce an update formalism to track information change in multi-agent conformant planning over transition systems. The update formalism can be seen as a generalized version of the ∗ This work is supported by the Fundamental Research Funds for the Central Universities 63202061 and 63202926. Corresponding author: Yanjun Li Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). 36 Peipei Wu and Yanjun Li approach introduced in [11], but it is not a directly generalized result. In [11], the agent’s belief is represented by a set of states that the agent cannot distinguish. This representation does not work anymore in multi-agent settings, since nested beliefs need to be considered. In this paper, we use a non-well-founded notion, possibility, to represent belief states in multi-agent planning, and we define the update on possibilities to capture information change. 2) We propose a logical framework EALn with DEL-style update semantics and expressive language. In the literature of epistemic planning, such as [6], the logical language is an extension of proposition logic with knowledge modalities. The language of EALn has not only knowledge modalities but also action modalities, which allows us to express more complicated planning goals. Moreover, we can talk about the interaction between action and knowledge in the logical language. It would give us a more in-depth understanding of reasoning about action and information in multi-agent planning. 3) We present an upper bound on the time complexity of multi-agent conformant planning in terms of EALn . We show that multi-agent conformant planning with goal expressed by an EALn -formula is in double exponential time. 2 Information change in multi-agent conformant planning Let Act be a set of actions, P be a set of proposition variables, and Ag be a set of agents. A transition system T is a triple hS T , {RaT | a ∈ Act}, V T i, where S T is a non-empty set of states, for each a ∈ Act, RaT is a function from S to P(S), and V T : P → P(S T ) is an assignment function. We use possibility on transition systems, which is a variant of the notion used in [8, 7], to represent agents’ epistemic uncertainty about states. Before introducing the notion of possibility, let us start by introducing some auxiliary notions. i Let T be a transition system. A Kripke structure N on T is a triple hW N , {99K | i ∈ Ag}, LN i, i where W N is a non-emptyset of possible worlds, for each i ∈ Ag, 99K is a binary relation on W , and LN : W N → S T is a function that labels each possible world with a state in T . For each w ∈ W N , we say (N , w) is a pointed Kripke structure. A Kripke structure N on T models agents’ uncertainty over states in T . By [1], we know that a pointed Kripke structure can be represented by the following notion from non-well-founded set theory. Let N be a Kripke structure on a transition system T . A decoration d of N is a function that assigns to each world w ∈ W a function d(w) such that • d(w)(T ) = LN (w) (i.e., d(w) assigns to T a state that is the one with which L labels w), and i • for each i ∈ Ag, d(w)(i) = {d(w0 ) | w 99K w0 } (i.e., d(w) assigns to each agent i the (non-well- i founded) set of functions associated with worlds reachable from w by 99K ). If d is a decoration of N and w is a world in N , we say that d(w) is the solution of w in N , and (N , w) is a picture of d(w). Now we are ready to introduce the concept of possibility. Definition 1 (Possibility). Let T be a transition system. A possibility α on T is a function that assigns to T a state in S T , and to each agent i ∈ Ag a set of possibilities. As shown in [8], there is a close relationship between these (non-well-founded) possibilities on T and Kripke structures on T . Each Kripke structure N has a unique decoration that assigns to each world in N a possibility, and each possibility has a picture. For example, let T0 be the transition system depicted in Figure 1. A possibility α1 on T0 is presented in Figure 2a. The pointed Kripke structure (N , w1 ) on T0 , depicted in Figure 2b, is a picture of α1 . The choice of possibilities over Kripke structures as uncertainty representation provides several advantages (see [9]). One of them is that update on possibility is natural and elegant. The update on possibilities, defined in the following, tracks the change of agents’ uncertainty when an action is executed. Definition 2 (Update). Let α be a possibility on T . The update of α with an action a ∈ Act, Sdenoted α|a , is a set of possibilities on T such that α ∈ α| if and only if α (T ) ∈ Ra (α(T )) and α (i) = β∈α(i) β|a . 0 a 0 0 Reasoning in Multi-Agent Conformant Planning over Transition Systems 37 α1 = {(i, (α1 , α1 )), (j, (α1 , α1 )), (j, (α1 , α2 )), s1 } s1 a / t1 : p α2 = {(i, (α2 , α2 )), (j, (α2 , α1 )), < (j, (α2 , α2 )), s2 } a (a) System of equations of α1 s2 a / t2 w1 :s1 o / w2 :s2 j i,j i,j 6 h Figure 1: Transition system T0 (b) Picture of α1 : (N , w1 ) Figure 2: Possibility α1 on T0 As an example, the update of the possibility α1 depicted in Figure 2a will result in a unit possibility set α11 , which is depicted in Figure 3. Moreover, the update of α2 in Figure 2a, whose picture is (N , w2 ) in Figure 2b, is the set of possibilities α21 and α22 , both of which are depicted in Figure 3. The pointed Kripke structures (N 0 , w21 ) and (N 0 , w22 ) are the pictures of α21 and α22 respectively. Moreover, we would like to point out that the difference between α11 (i) and α21 (i) corresponds to the fact that the information state of agent i at state t1 is different depending on how he/she gets there, and the identity between α11 (j) and α21 (j) reflects the fact that the information state of agent j at t1 is the same because he/she intuitively cannot distinguish s1 and s2 . α11 = {(i, (α11 , α11 )), (j, (α11 , α11 )), (j, (α11 , α21 )), (j, (α11 , α22 )), t1 } i,j α = {(i, (α21 , α21 )), (i, (α21 , α22 )), 21 (j, (α21 , α11 )), (j, (α21 , α21 )), (j, (α21 , α22 )), t1 } > w11 :t1 ` j j α22 = {(i, (α22 , α21 )), (i, (α22 , α22 )), ~ w21 :t1 o / w22 :t2 i,j (j, (α22 , α11 )), (j, (α22 , α21 )), (j, (α22 , α22 )), t2 } i,j i,j 6 h (a) System of equations of α11 (b) Picture of α11 : (N 0 , w11 ) Figure 3: Possibility α11 on T0 3 The logic EALn and its application In this section, we will firstly introduce the language and the semantics of EALn and then use it to capture reasoning in multi-agent conformant planning. At the end of the section, we provide an upper bound for multi-agent conformant planning in terms of EALn . Definition 3 (Language). The language LEALn is defined by the following BNF: φ ::= p | ¬φ | φ ∧ φ | [a]φ | Ki φ, where p ∈ P, a ∈ Act, and i ∈ Ag. We use ⊥, ∨, →, h·i as the usual abbreviations. The formula [a]φ intuitively means that φ holds if the action a is successfully done. The formula Ki φ expresses that the agent i knows that φ. Since the language of EALn contains both action modalities and knowledge modalities, this allows one to talk about the interaction between action and knowledge in the object language. Formulas of LEALn will be interpreted on dynamic epistemic models. A dynamic epistemic model is a pair (T , α), where T is a transition system and α is a possibility on T such that for each i ∈ Ag, α is in α(i) and α0 ∈ α(i) implies α0 (i) = α(i). 38 Peipei Wu and Yanjun Li Definition 4 (Update semantics). Given a dynamic epistemic model (T , α) and a formula φ ∈ LEALn , the satisfaction relation is defined as follows: T , α p ⇐⇒ α(T ) ∈ V (p) T , α ¬φ ⇐⇒ T , α 2 φ T , α φ ∧ ψ ⇐⇒ T , α φ and T , α ψ T , α Ki φ ⇐⇒ for all α0 ∈ α(i) : T , α0 φ T , α [a]φ ⇐⇒ for all α0 ∈ α|a : T , α0 φ We say that φ is valid, denoted as φ, if T , α φ for each dynamic epistemic model T , α. Within EALn , we can express the epistemic state of agents and its evolvement in multi-agent planning. Take the following example. Let T0 and α1 be depicted as Figure 1 and Figure 2 respectively. We then have the following statements. • T0 , α1 [a](Ki p ∧ ¬Kj p). When the initial possibility is α1 , after the action a is executed, agent i knows that p, and agent j does not know that p. • T0 , α2 [a](¬Ki p ∧ ¬Kj p). When the initial possibility is α2 , after the action a is executed, neither i not j knows that p. • For all φ ∈ LEALn , T0 , α1 [a]Kj φ iff T0 , α2 [a]Kj φ. No matter the initial possibility is α1 or α2 , after the action a is executed, the knowledge of j is the same. Within EALn , we can capture reasoning about action and information change in multi-agent conformant planning. Lemma 1 (Perfect Recall). For each dynamic epistemic model (T , α), we have that T , α Ki [a]φ → [a]Ki φ for each i ∈ Ag and each a ∈ Act. Proof. Assume that T , α Ki [a]φ. We will show that T , α [a]Ki φ. Taka any α0 ∈ α|a and any β 0 ∈ α0 (i). To show that T , α [a]Ki φ, by the update semantics, we onlySneed to show that T , β 0 φ. Since α0 ∈ α|a , by Definition 2, this follows that α0 (i) = β∈α(i) β|a . Since β 0 ∈ α0 (i), we then have that there is some β ∈ α(i) such that β 0 ∈ β|a . Since T , α Ki [a]φ and β ∈ α(i), this follows that T , β [a]φ. Moreover, since β 0 ∈ β|a , this follows that T , β 0 φ. Therefore, we have shown that T , β 0 φ for each α0 ∈ α|a and each β 0 ∈ α0 (i), namely T , α [a]Ki φ. Thus, we have shown that if T , α Ki [a]φ then T , α [a]Ki φ. Lemma 2 (No Miracles). For each dynamic epistemic model (T , α), we have that T , α haiKi φ → Ki [a]φ for each i ∈ Ag and each a ∈ Act. Proof. Assume that T , α haiKi φ. We will show that T , α Ki [a]φ. By the update semantics, we only need to show that T , β 0 φ for each α0 ∈ α(i) and each β 0 ∈ α0 |a . Since T , α haiKi φ, this follows that S there isa some γ ∈ α| 0 such that T , γ Ki φ. Since γ 0 ∈a α| , by a a Definition 2, this follows that γ(i) = β∈α(i) β| . Take any α ∈ α(i). We then have that α | ⊆ γ(i). Take any β 0 ∈ α0 |a . We then have that β 0 ∈ γ(i). Since T , γ Ki φ, this follows that T , β 0 φ. Thus, we have shown that T , β 0 φ for each α0 ∈ α(i) and each β 0 ∈ α0 |a , namely T , α Ki [a]φ. Thus, we have shown that if T , α haiKi φ then T , α Ki [a]φ. Intuitively, perfect recall means that if the agent cannot distinguish two states after doing action a, then he/she could not distinguish them before. No miracles mean that if the agent cannot distinguish two states and the same action is executed on both states, then he/she cannot distinguish the resulting states. These can be depicted in Figure 4. We now introduce multi-agent conformant planning in terms of EALn . First, S we introduce some auxiliary notations. Let Σ be a set of possibilities. We use Σ|a to denote the set α∈Σ α|a . Let σ ∈ Act∗ be an action sequence a1 · · · an . We use α|σ to denote the information state (· · · ((α|a1 )|a2 ) · · · )|an . Definition 5 (Multi-agent conformant planning). A multi-agent conformant planning problem P is a tuple hT , α, G, {Acti | i ∈ G}, φg i, where T is a transition system, α is the initial possibility on T , G Reasoning in Multi-Agent Conformant Planning over Transition Systems 39 s1 o / s3 s1 o / s3 i i s1 perfect a −−−−→ a a no miracles ←−−−−−−− a a recall s2 o i / s4 s2 o i / s4 s2 s4 Figure 4: perfect recall and no miracles is a group of agents, Acti ⊆ Act is a set of actions that the agent i can perform, S φg ∈ L∗EAL is the n goal formula. A solution to P is a finite sequence of actions σ = a1 · · · an ∈ ( i∈G Act i ) such that 1) theTsequence σ is applicable 1 in the state α0 (T ) for each α0 ∈ i∈G α(i), and 2) T , β φg for each T α0 ∈ i∈G α(i) and each β ∈ α0 |σ . Let LaMφ be an abbreviation of the formula hai> ∧ [a]φ. It can be shown that an action sequence a1 · · · an is applicable on the state α(T ) iff T , α La1 M · · · Lan M>. We then have the following theorem, which means that the solution can be verified in EALn . Theorem 3. Let P = hT , α, G, {Acti | i ∈ G}, φg i be a planning problem. An action sequence a1 · · · an is a solution for P iff T , α0 La1 M · · · Lan Mφg for each α0 ∈ i∈G α(i). T In this paper’s remainder, we show an upper bound on the time complexity of multi-agent conformant planning in terms of EALn . i Let T = hS, {Ra | a ∈ Act}, V i be a transition system and N = hW, {99K | i ∈ Ag}, Li be a † i † Kripke structure on T . The Kripke structure N † = hW N , {99K N † | i ∈ Ag}, LN i is defined as follows: † i i † W N = {(w, L(w)) | w ∈ W }, 99K N † = {((w, L(w)), (w0 , L(w0 ))) | w 99K w0 }, and LN (w, L(w)) = L(w). If N is a Kripke structure on T , so is N † . It is obvious that (N , w) is isomorphic to (N † , (w, L(w))). Therefore, if (N , w) is a picture of a possibility α, so is (N † , (w, L(w))). Given a transition system T and a Kripke structure N on T , let the Kripke structure N † is defined as above. We now define the update of N † with an action a ∈ Act, denoted as N † |a . The Kripke structure † a i † a † a N † |a = hW N | , {99K N † |a | i ∈ Ag}, LN | i is defined as follows: W N | = {(w, s0 ) ∈ W × S | there is † i i † a (w, s) ∈ W N such that (s, s0 ) ∈ RaT }, 99K N † |a = {((w, s), (w0 , s0 )) | w 99K w0 }, and LN | (w, s) = s. It is evident that N † |a is also a Kripke structure on T . We have known that if (N , w) is a picture of a possibility α, so is (N † , (w, L(w))). If α0 ∈ α|a , this follows that α(T )RaT α0 (T ). Let α(T ) = s and α0 (T ) = s0 . Since † (N , w) is a picture of a possibility α, this follows that L(w) = s, and then we have that (w, s) ∈ W N . † a Since sRaT s0 , by the definition of N † |a , we know that (w, s0 ) ∈ W N | . With the bisimulation principle (see [8]), it can be shown that (N † |a , (w, s0 )) is a picture of α0 . Moreover, it can be shown that for each action sequence a1 · · · an and each β ∈ α|a1 · · · |an , the pointed Kripke structure N † |a1 · · · |an , (w, β(T )) † a1 an is picture of β. Since W N | ···| ⊆ W × S, this follows that the number of possibilities that can be generated by the initial possibility α is bounded by the size of the powerset of W × S. Given a possibility α, let the size of α be the size of the minimal pointed Kripke structure (N , w) such that (N , w) is a picture of α. Given a multi-agent conformant planning problem P = hT , α, G, {Acti | i ∈ G}, φg i, let the size of P be the product of the size of T , the size of α, and the size of φg . Theorem 4. Multi-agent conformant planning in terms of EALn is in double exponential time. Proof idea. Given a multi-agent conformant planning P , we can do a breadth-first search with duplicate checking on possibilities to find a plan. Since we have shown above that the number of possibilities that can be generated by the initial possibility α is bounded by the exponent of the size of P , this follows that the depth of the searching tree is bounded by the exponent of the size of P . Therefore, the time of the breadth-first search is in double exponent of the size of P . Moreover, by Theorem 3, we know that the verification of a plan can be reduced to model checking in EALn . Since model checking in EALn is in polynomial time, this follows that searching a plan for P is in double exponential time. 1 a · · · a is applicable on a state s iff R (s) 6= ∅, and for each 1 ≤ k < n, t ∈ R 1 n a1 a1 ···ak (s) implies Rak+1 (t) 6= ∅. 40 Peipei Wu and Yanjun Li References [1] P. Aczel. Non-well-founded sets, volume 14 of CSLI lecture notes series. CSLI, 1988. [2] M. B. Andersen, T. Bolander, and M. H. Jensen. Conditional epistemic planning. In Logics in Artificial Intelligence, pages 94–106. Springer, 2012. [3] G. Aucher. DEL-sequents for regression and epistemic planning. Journal of Applied Non-Classical Logics, 22(4):337–367, 2012. [4] G. Aucher and T. Bolander. Undecidability in epistemic planning. In Proceedings of IJCAI ’13, pages 27–33, 2013. [5] T. Bolander and M. B. Andersen. Epistemic planning for single and multi-agent systems. Journal of Applied Non-Classical Logics, 21(1):9–34, 2011. [6] T. Bolander, M. H. Jensen, and F. Schwarzentruber. Complexity results in epistemic planning. In Proceedings of IJCAI ’15, pages 2791–2797, 2015. [7] F. Fabiano, A. Burigana, A. Dovier, and E. Pontelli. EFP 2.0: A multi-agent epistemic solver with multiple e-state representations. In J. C. Beck, O. Buffet, J. Hoffmann, E. Karpas, and S. Sohrabi, editors, Proceedings of the Thirtieth International Conference on Automated Planning and Scheduling, Nancy, France, October 26-30, 2020, pages 101–109. AAAI Press, 2020. [8] J. Gerbrandy and W. Groeneveld. Reasoning about Information Change. Journal of Logic, Language and Information, 6(2):147–169, 1997. [9] W. Groeneveld. Logical Investigations into Dynamic Semantics. PhD thesis, University of Amsterdam, 1995. [10] M. H. Jensen. Epistemic and Doxastic Planning. PhD thesis, Technical University of Denmark, 2014. [11] Y. Li, Q. Yu, and Y. Wang. More for free: a dynamic epistemic framework for conformant planning over transition systems*. Journal of Logic and Computation, 27(8):2383–2410, 06 2017. [12] B. Löwe, E. Pacuit, and A. Witzel. DEL planning and some tractable cases. In LORI’11, pages 179–192. Springer, 2011. [13] C. Muise, V. Belle, P. Felli, S. McIlraith, T. Miller, A. R. Pearce, and L. Sonenberg. Planning over multi-agent epistemic states: A classical planning approach. In The 29th AAAI Conference on Artificial Intelligence, 2015. [14] P. Pardo and M. Sadrzadeh. Strong planning in the logics of communication and change. In Declarative Agent Languages and Technologies X, pages 37–56. Springer, 2013. [15] D. E. Smith and D. S. Weld. Conformant graphplan. In Proceedings of the Fifteenth National/Tenth Conference on Artificial Intelligence/Innovative Applications of Artificial Intelligence, AAAI ’98/IAAI ’98, page 889–896, USA, 1998. American Association for Artificial Intelligence. [16] Q. Yu, Y. Li, and Y. Wang. A dynamic epistemic framework for conformant planning. In R. Ramanu- jam, editor, Proceedings Fifteenth Conference on Theoretical Aspects of Rationality and Knowledge, TARK 2015, Carnegie Mellon University, Pittsburgh, USA, June 4-6, 2015, volume 215 of EPTCS, pages 298–318, 2015. [17] Q. Yu, X. Wen, and Y. Liu. Multi-agent epistemic explanatory diagnosis via reasoning about actions. In Proceedings of IJCAI’13, pages 1183–1190, 2013.