=Paper=
{{Paper
|id=None
|storemode=property
|title=Backward Planning in the Logics of Communication and Change
|pdfUrl=https://ceur-ws.org/Vol-918/111110231.pdf
|volume=Vol-918
|dblpUrl=https://dblp.org/rec/conf/at/PardoS12
}}
==Backward Planning in the Logics of Communication and Change==
Pere Pardo¹ and Mehrnoosh Sadrzadeh²
¹ Institut d’Investigació en Intel·ligència Artificial (IIIA - CSIC)
² Dept. Computer Science, Oxford University
Abstract. In this contribution we study how to adapt backward plan search to the Logics of Communication and Change (LCC). These are dynamic epistemic logics with common knowledge, modeling the way in which announcements, sensing and world-changing actions modify the beliefs of agents or the world itself. The proposed LCC planning system greatly expands the social complexity of the scenarios involving cognitive agents that can be solved; for example, goals or plans may consist of a certain distribution of beliefs and ignorance among agents. Our results include soundness and completeness of backward plan search (under BFS), both for deterministic planning and strong non-deterministic planning.
1 Introduction
Practical rationality, or decision-making, is a key component of autonomous agents such as humans, and correspondingly it has been studied extensively. Research on this topic has been conducted in several fields (game theory, planning, decision theory, etc.), each focusing on a different aspect: strategic decision-making, propositional means-ends analysis, and uncertainty, respectively. Moreover, some of these models of rationality also provide a foundation for the notion of agreement (game-theoretic approaches to negotiation, multi-agent planning, etc.). Indeed, the process of reaching an agreement seems to be a particular case of (multi-agent) practical decision-making.
While these models of practical rationality are well understood, they were (understandably) designed with a considerably low level of expressivity in the object language. For instance, game theory abstracts from propositional representations of states, actions and goals. Planning algorithms take them into account, but not fully in the object language, which makes it difficult to extend these algorithms to reason about others’ beliefs and actions. All this contrasts with the area of logic, where logics with increasing expressivity have been characterized, though abilities or motivational attitudes have only more recently been dealt with, in BDI, logics of strategic ability, etc. In the present contribution we mainly focus on combining dynamic epistemic logic and plan search [9], [5].
Especially relevant to the topic of cognitive agents are the notions of belief, action, goal, norm, etc. The first two are the target of dynamic epistemic logics (DEL), a recent family of logics for reasoning about agents’ communications and observations. We focus on the so-called Logics of Communication and Change (LCC) [10], which generalize many previously known DEL logics with physical actions and common knowledge. In particular, LCC logics include a rich variety of epistemic actions from the literature on DEL, including several types of communicative actions (truthful or lying, public or private announcements). This is done with the help of action models, which assign each action an accessibility relation describing how the action appears to other agents, and hence what they learn after its execution.

(Presented at AT2012, 15-16 October 2012, Dubrovnik, Croatia. Copyright held by the author(s).)
These logics may offer a semantic understanding of the notion of agreement between agents, since reaching an agreement requires these agents to have common knowledge about it. The same can be said about the epistemic aspects involved in the (intentional) process of reaching such an agreement. To this end, we address planning problems expressible in LCC. This is done by introducing goals, which can be an arbitrary formula in the language of LCC. Planning domains consist, as usual, of an initial state formula, available actions and a goal formula, and the task consists in finding some structure of actions that makes the goal formula true when executed in the initial state. Instead of a semantic [3], [11], [7] or a proof-theoretic approach [1], we suggest reducing plan search to search in the space of proofs. In more detail, the search methods operate in the space of LCC formulas that are about plans (their success and executability). As in those works, an LCC plan is built while keeping in mind what agents can learn during its execution.
In the present contribution, we extend previous results on Backward Planning by
describing simpler algorithms for deterministic and strong non-deterministic planning.
These are shown to be sound and complete: if the algorithm terminates, it outputs a
solution plan; and if a successful plan exists, then the algorithm will find such a solution.
Motivating example. Our aim, then, is to endow LCC reasoning agents with planning
capacities for these logics, so they can achieve their goals in scenarios where other
agents have similar cognitive and acting abilities. In particular, LCC planning seems
necessary for an agent whose goals consist in (or depend on) a certain distribution of
knowledge, ignorance and false beliefs among agents. To illustrate the kind of rational
behavior an LCC planner can exhibit, consider the following example.
Example 1. Agent a has just bet some prize with agent b that the next coin toss will land heads (h); a knows she can sense and even flip the coin without b ever suspecting it. Given a sensing action that tells a whether h holds or not, a successful plan seems to be: toss the coin; if h is sensed, then announce it to b; otherwise flip the coin and then announce h.
Structure of the paper. The paper is structured as follows: after a review of the related work in Section 2, we briefly present the dynamic epistemic logics LCC in Section 3. Then, in Section 4, we present an algorithm for deterministic planning in LCC based on Breadth First Search (BFS) and show it is sound and complete. The rest of the paper is devoted to non-deterministic planning: in Section 5 we present an extension of LCC with action composition and choice. Finally, in Section 6 we propose an algorithm for (strong) non-deterministic planning and show its soundness and completeness. The paper concludes with a summary and some topics for future work.
2 Related Work
Several logics address the related problem of plan existence (rather than plan search) for similarly expressive languages, for example arbitrary public announcement logics (APAL) [4] and logics for strategic ability. These logics contain modal operators ⟨C⟩ for agents or coalitions, with formulas ⟨C⟩ϕ expressing: there is a plan or strategy allowing the agents in C to enforce ϕ.
More in line with the present aims, among logics for action guidance we find the family of BDI and related logics for intention (or some other motivational attitude). While these logics usually allow for considerable expressivity w.r.t. intentions (and their interaction with beliefs), they are not completely understood, and in fact planning methods have been suggested for BDI agent architectures. In particular, Bolander et al. [3] suggest the use of LCC planning to this end for (a limited fragment of) BDI. In this work [3], the authors study incremental forward planning based on the semantics of update models. By incremental we mean that the plan is built in a step-wise fashion: plans are refined (with further actions) until a solution is found. The framework of [3] is shown to be sound and complete under Breadth First Search, and hence the proposed planning system for LCC becomes semi-decidable. Here we study the opposite direction of search, i.e. backward planning (from the goals to the initial state). The reason is that in forward planning the relevant actions to be considered are those that are currently executable, rather than the actions enforcing the current goals, as in backward planning. This makes a difference in LCC, since many actions may exist (in the action model) which are everywhere executable, so forward planning would typically face the state explosion problem. Along this line, Aucher [1] (and related papers) introduces regression methods for the fragment of LCC without common knowledge. The present work differs from the latter two by making plan search incremental (as in [3], [11]), while allowing for backward search. The proposed algorithms easily adapt to search for optimal plans in backward planning.
3 Preliminaries: The Logics of Communication and Change
We briefly recall the basics of LCC logics [10]. Since these are built by adding an
action model U on top of (an epistemic reading of) propositional dynamic logic PDL,
we recall the latter first. The syntax of PDL, denoted LPDL , is as follows:
ϕ ::= p | ¬ϕ | ϕ1 ∧ ϕ2 | [π]ϕ
π ::= a | ?ϕ | π1 ; π2 | π1 ∪ π2 | π ∗
As usual, the symbols ⊥, ∨, ↔ and hπi are defined from the above as abbreviations.
In the epistemic reading of PDL, denoted E·PDL, the set of atomic programs a is just the set of agents Ag, and we read the program modality [a] as agent a believes (or knows) that. (E·PDL, and LCC, are rather abstract about the particular properties of the accessibility relation Ra of agent a that distinguish belief from knowledge.) With the PDL program constructors (composition “;”, choice “∪” and the Kleene star “(·)∗”), we can model:

nested beliefs: [a; b] reads agent a believes that agent b believes that
group belief: [B], or [a ∪ b], reads the agents in B (= {a, b}) believe that
common knowledge: [B∗], or [(a ∪ b)∗], reads the agents in B have common knowledge that
An E·PDL model M = (W, ⟨Ra⟩a∈Ag, V) contains, as usual, a set of worlds W, a relation Ra for each agent a, and a valuation V : Var → P(W).
An LCC logic adds to an E·PDL language a modality [U, e] for each pointed action model U, e with distinguished (actual) action e. These new operators [U, e] read: after each execution of action e it is the case that. An action model is a tuple U = (E, R, pre, post) containing:
– E = {e0, . . . , en−1}, a set of actions
– R : Ag → P(E × E), a map assigning a relation Ra to each agent a ∈ Ag
– pre : E → LPDL, a map assigning a precondition pre(e) to each action e
– post : E × Var → LPDL, a map assigning a post-condition post(e)(p), also written p^post(e), to each e ∈ E and p ∈ Var
The semantics of LCC consists in computing M, w |= [U, e]ϕ in terms of the product update of M, w and U, e. Their product update is (again) an E·PDL pointed model M ◦ U, (w, e), where M ◦ U = (W′, ⟨R′a⟩a∈Ag, V′) is as follows. The set W′ consists of those worlds (w, e) such that M, w |= pre(e), so executing e will lead to the corresponding state (w, e). The relation (w, e) R′a (v, f) holds iff both w Ra v and e Ra f hold. Finally, V′(p) = {(w, e) ∈ W′ | M, w |= post(e)(p)}, i.e. the truth-value of p after executing e depends on that of post(e)(p) before the execution.
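The product-update construction above can be made concrete for finite models. The following sketch is ours, not the paper's: all names (product_update, epre, epost, erel) are illustrative assumptions, and postconditions are restricted to keep/true/false, matching the post(e)(p) ∈ {p, ⊤, ⊥} restriction adopted below.

```python
# A minimal sketch of the product update M ◦ U for finite models.
# Postconditions use 'keep' / True / False for post(e)(p) ∈ {p, ⊤, ⊥}.

def product_update(worlds, rel, val, events, epre, epost, erel):
    """Compute M ◦ U = (W', R', V').

    worlds: list of world names; rel[a]: set of (w, v) pairs;
    val[w]: dict atom -> bool; events: list of event names;
    epre[e]: predicate on a world's valuation (the precondition);
    epost[e]: dict atom -> 'keep' | True | False; erel[a]: set of (e, f) pairs.
    """
    # W': surviving (world, event) pairs, i.e. those satisfying pre(e)
    W2 = [(w, e) for w in worlds for e in events if epre[e](val[w])]
    # R': (w,e) R'_a (v,f) iff w R_a v and e R_a f
    R2 = {a: {((w, e), (v, f)) for (w, e) in W2 for (v, f) in W2
              if (w, v) in rel[a] and (e, f) in erel[a]}
          for a in rel}
    # V': truth value of p after e, per post(e)(p)
    V2 = {(w, e): {p: val[w][p] if epost[e].get(p, 'keep') == 'keep'
                   else epost[e][p]
                   for p in val[w]}
          for (w, e) in W2}
    return W2, R2, V2

# Truthful public announcement of p: one event, pre = p, no ontic effect.
worlds = ["w1", "w2"]
rel = {"a": {(w, v) for w in worlds for v in worlds}}  # a cannot tell w1, w2 apart
val = {"w1": {"p": True}, "w2": {"p": False}}
W2, R2, V2 = product_update(worlds, rel, val,
                            ["e"], {"e": lambda v: v["p"]}, {"e": {}},
                            {"a": {("e", "e")}})
# only ("w1", "e") survives: after the announcement, a believes p
```

In the usage example, the ¬p-world is discarded because it fails the precondition, so the updated model validates [a]p.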
Here, though, we will follow [14] and restrict postconditions to post(e)(p) ∈ {p, ⊤, ⊥}. This makes the truth-value of p after e either: that before e, or simply true or false regardless of its truth-value before e. (This framework is shown in [14] to be as expressive as the general case above, provided choice of actions is considered; see Section 6.) The PDL semantics [ · ] for E·PDL-formulas can be extended to a semantics for LCC as follows:
[[U, e]ϕ]M = {w ∈ W | if M, w |= pre(e) then (w, e) ∈ [ϕ]M ◦U }.
In [10], the authors define program transformers TijU(π) that provide a mapping between E·PDL programs. Given any combination of ontic or epistemic actions (e.g. public and private announcements), the transformers provide a complete set of reduction axioms, reducing LCC to E·PDL. These axioms push the U, e-modalities inside the formula, until the basic case [U, e]p is turned into an E·PDL formula. See [10] for the details of the TU(π) mapping.
Definition 1. A calculus for LCC is given by the axioms and rules for PDL plus [U, e]⊤ ↔ ⊤ and the following:

[U, e]p ↔ (pre(e) → post(e)(p))
[U, e]¬ϕ ↔ (pre(e) → ¬[U, e]ϕ)
[U, e](ϕ1 ∧ ϕ2) ↔ ([U, e]ϕ1 ∧ [U, e]ϕ2)
[U, ei][π]ϕ ↔ ⋀_{j=0}^{n−1} [TijU(π)][U, ej]ϕ

and necessitation rules for each action model modality: if ⊢ ϕ then ⊢ [U, e]ϕ.
The completeness of this calculus is shown by reducing LCC to E·PDL. The translation, simultaneously defined for formulas and programs (resp. t and r), is:

t([U, e]p) = t(pre(e)) → p^post(e)
t([U, e]¬ϕ) = t(pre(e)) → ¬t([U, e]ϕ)
t([U, ei][π]ϕ) = ⋀_{j=0}^{n−1} [TijU(r(π))] t([U, ej]ϕ)
t([U, e][U, f]ϕ) = t([U, e] t([U, f]ϕ))

r is trivial for all programs, and t is also trivial for the remaining types of formulas. The translation assigns to each LLCC formula an equivalent formula in LE·PDL.
4 Backward deterministic LCC planning.
In [8], we studied LCC planning with the help of generalized frame axioms for LCC (actually, theorems of this logic) used to study the regression of an open goal under an action. Here we propose a simpler approach, based on the above reduction of LCC into E·PDL, which moreover allows for better plan search methods. We will simply use expressions of the form

t([U, e]ϕ ∧ ⟨U, e⟩⊤) → [U, e]ϕ ∧ ⟨U, e⟩⊤

These give rise to a goal-regression function: a current goal ϕ is mapped into t([U, e]ϕ ∧ ⟨U, e⟩⊤); the latter is the new goal we need to address for an execution of e to end up in a ϕ-world.
Definition 2. Given some LCC logic for an action model U, a planning domain is a triple M = (ϕT, A, ϕG), where ϕT, ϕG are consistent E·PDL formulas describing, resp., the initial and goal states, and A ⊆ E is the subset of actions available to the agent.
A solution to M is a sequence π = (f1, . . . , fm) of actions in A such that

|= ϕT → [U, f1] . . . [U, fm]ϕG and |= ϕT → ⟨U, f1⟩ . . . ⟨U, fm⟩⊤
Definition 3. Given some planning domain M = (ϕT, A, ϕG), the (initial) empty plan is the pair π∅ = (∅, ϕG); and if π = (π, ϕgoals(π)) is a plan, then π(e) = (π ⌢ ⟨e⟩, ϕgoals(π(e))), defined by the goal ϕgoals(π(e)) = t([U, e]ϕgoals(π) ∧ ⟨U, e⟩⊤), is also a plan. A plan π(e) is a leaf iff ϕgoals(π(e)) is inconsistent, or |= ϕgoals(π(e)) → ϕgoals(π).

Leaves are plans not worth considering, either because they are unexecutable or because they do not contribute to the ultimate goals ϕG. Also note that actions e, as defined above, are deterministic, in the sense that |= [U, e](ϕ ∨ ψ) ↔ [U, e]ϕ ∨ [U, e]ψ. Thus, deterministic planning simply consists in “restricting” the actions e ∈ E to our current action models; we call it simply LCC planning. Later we extend LCC with composition ⊗ and choice ∪ to study the non-deterministic case. A BFS plan search algorithm for deterministic planning is the following.
Theorem 1. BFS is sound and complete for LCC backward planning: the output π of the algorithm in Fig. 1 is a solution for (ϕT, A, ϕG), and if such a solution exists, then the algorithm terminates with such an output.
1. LET Plans = ⟨π∅⟩ and ϕ = ϕG.
2. SELECT the first element π of Plans. DELETE π from Plans.
3. IF π is a leaf, REPEAT step 2. IF |= ϕT → ϕgoals(π), then OUTPUT π.
Otherwise SET Plans ← Plans ⌢ ⟨π(e) | e ∈ A⟩. REPEAT step 2.
Fig. 1. BFS algorithm for deterministic planning with input (ϕT, A, ϕG).
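The loop of Fig. 1 can be sketched in a purely propositional (STRIPS-like) simplification, where goals are sets of atoms, `regress` stands in for the goal regression t([U, e]ϕgoals(π) ∧ ⟨U, e⟩⊤), and entailment checks reduce to set inclusion. All names here are our illustrative assumptions, not the paper's machinery.

```python
from collections import deque

# Backward BFS over plans (action-name list, open goals), as in Fig. 1,
# in a propositional simplification: a goal is a set of atoms.

def regress(goal, action):
    """STRIPS-style stand-in for goal regression through an action."""
    pre, add, dele = action["pre"], action["add"], action["del"]
    if not (add & goal) or (dele & goal):
        return None               # irrelevant or destructive: a 'leaf'
    return frozenset((goal - add) | pre)  # new open goals

def backward_bfs(initial, actions, goal):
    plans = deque([([], frozenset(goal))])
    seen = set()
    while plans:
        seq, g = plans.popleft()
        if g <= initial:          # plays the role of |= phi_T -> goals(pi)
            return seq            # OUTPUT pi
        for name, act in actions.items():
            g2 = regress(g, act)
            if g2 is None or g2 in seen:
                continue          # skip leaves and repeated goal sets
            seen.add(g2)
            # regression adds e *before* the remaining suffix of the plan
            plans.append(([name] + seq, g2))
    return None

actions = {
    "toss": {"pre": {"have_coin"}, "add": {"heads"}, "del": set()},
    "get":  {"pre": set(), "add": {"have_coin"}, "del": set()},
}
plan = backward_bfs({"at_table"}, actions, {"heads"})
# plan == ["get", "toss"]
```

Note that because search runs from the goals, each regression step prepends the new action: the first action found is the last one executed.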
4.1 Optimal planning.
We may consider as well a cost function cost : E → R+ assigning to e the cost of an
execution of e. 4 This gives a measure of the total cost of a plan cost(π) = Σe∈π cost(e).
Recall search for optimal plans (e.g. A*) can be defined from BFS in Fig. 1 by
adding an extra step in step 3: after updating Plans, this set is re-ordered into a sequence
of plans with increasing cost. That is, π is re-ordered before π 0 in Plans if and only if
cost(π) ≤ cost(π 0 ). Since by definition cost(e) > 0, and the set E is finite, there can
be no infinite sequence of actions with decreasing cost. This implies (using [9] Thm 1)
that A* is complete and admissible for deterministic backward planning in LCC using
the (trivial) heuristic estimation h(π(e)) = cost(π) + cost(e).
Corollary 1. A* is sound, and complete for LCC backward deterministic planning. It
is also admissible: the output of A* is an optimal solution π for (ϕT , A, ϕG ), if some
solution exists.
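The cost-ordered re-ordering of Sect. 4.1 amounts to expanding plans in order of increasing cost, i.e. a priority queue in place of the FIFO queue. The sketch below uses the same propositional simplification as before; all names are illustrative assumptions, not the paper's notation.

```python
import heapq

# Cost-ordered backward search (the A*-style variant of Sect. 4.1),
# under a propositional simplification of goals and regression.

def regress(goal, action):
    pre, add, dele = action["pre"], action["add"], action["del"]
    if not (add & goal) or (dele & goal):
        return None
    return frozenset((goal - add) | pre)

def optimal_backward(initial, actions, goal, cost):
    frontier = [(0, 0, [], frozenset(goal))]  # (cost, tiebreak, plan, goals)
    best, counter = {}, 1
    while frontier:
        c, _, seq, g = heapq.heappop(frontier)
        if g <= initial:
            return seq, c                     # cheapest plan popped first
        for name, act in actions.items():
            g2 = regress(g, act)
            # keep only the cheapest known way to reach each goal set
            if g2 is None or best.get(g2, float("inf")) <= c + cost[name]:
                continue
            best[g2] = c + cost[name]
            heapq.heappush(frontier, (c + cost[name], counter, [name] + seq, g2))
            counter += 1
    return None

actions = {
    "shout": {"pre": set(), "add": {"informed"}, "del": set()},
    "email": {"pre": set(), "add": {"informed"}, "del": set()},
}
plan, total = optimal_backward(set(), actions, {"informed"},
                               {"shout": 5, "email": 1})
# plan == ["email"], total == 1: the cheaper action is preferred
```

Since every action has strictly positive cost, the first plan popped with satisfied goals is optimal, mirroring the admissibility claim of Corollary 1.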
Fig. 2. (Left) Announcing ¬p as an insider in Example 2. (Right) Announcing ¬p disguised as a fortune-teller.

(Footnote: For example, the energy cost of voicing an announcement that ϕ into a crowd, ϕ!^a_B, can be defined in terms of |B| and the length of ϕ.)
Example 2. Suppose you are a high-ranked employee a⋆ in Phil. Inc. and want to sell some sensitive information, ¬p = Phil. Inc. is not doing well, to a close acquaintance a1, an important investor in Phil. Inc. That ¬p is commonly unknown, except to you and to a0, also an important investor in Phil. Inc. Knowing ¬p, a0 wants to sell stocks before a1 does, but she prefers to wait for the price to increase a bit more. As long as [a0]⟨a1⟩p holds, you can sell the info, so your actions e should preserve this formula while causing [a1]¬p. To make things more difficult, say agents a0, a1 are sitting next to each other.
Figure 2 (Left) models your truthful announcement of ¬p, presenting yourself as the insider a⋆; that is, ¬p!^a⋆_a1. Intuitively, this would not preserve [a0]⟨a1⟩p; and indeed, the available actions would either destroy the old goal or would not be successful for the new goal.
Figure 2 (Right) models ¬p!^a⋆/f_a1: your truthful announcement that ¬p when disguised as a fortune-teller. You expect a1 will identify you as a⋆, while a0 will think it common belief that you are a fortune-teller (and that no one can tell whether a fortune-teller is right or not). The new goal raised by this refinement is already true in any ϕT state. So this action already constitutes a solution to the problem.
5 LCC with composition and choice
In this section we introduce non-deterministic actions into planning models, and study the corresponding logic LCC⊗∪. This will be used in Section 6 to find an algorithm for non-deterministic planning in this logic. First, we expand any LCC logic with action composition ⊗, and later we add choice ∪. These operations model the following executions:
– composition e ⊗ f models an execution of e followed by an execution of f, and
– choice e ∪ f models non-deterministic actions: each execution of e ∪ f instantiates either as an execution of e or as an execution of f.
We first introduce composition ⊗ in the action models U and define the logic LCC⊗, which can be reduced to LCC. Then we introduce choice ∪ and its semantics, the latter being based on the semantics of LCC⊗.
5.1 LCC⊗ : extending LCC with composition of actions.
On the one hand, the introduction of composition ⊗ proceeds as usual. We simply consider that the action model U is now closed under finite composition ⊗. To do so, we must first define the composition of an action model with itself, U ⊗ U, also denoted U² = (E², R², pre², post²).

Definition 4. Let U = (E, R, pre, post) be an action model. We define U ⊗ U = (E × E, R ◦ R, pre, post) as follows:

pre(e ⊗ e′) = pre(e) ∧ [U, e]pre(e′)

post(e ⊗ e′)(p) = post(e)(p) if post(e′)(p) = p
post(e ⊗ e′)(p) = post(e′)(p) if post(e′)(p) ∈ {⊤, ⊥}
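The postcondition clause of Definition 4 can be read as sequential overwriting: the later action's effect wins unless it leaves p untouched, in which case the earlier effect persists. A tiny sketch under our own encoding ('keep' stands for post(·)(p) = p; the booleans stand for ⊤/⊥):

```python
def compose_post(post_e, post_f, atoms):
    """post(e ⊗ f)(p) per Definition 4, with 'keep' encoding
    post(·)(p) = p and True/False encoding ⊤/⊥."""
    out = {}
    for p in atoms:
        later = post_f.get(p, 'keep')
        # if f leaves p untouched, e's effect persists; otherwise f wins
        out[p] = post_e.get(p, 'keep') if later == 'keep' else later
    return out

# e makes p true; f leaves p alone but makes q false
post_e = {"p": True}
post_f = {"q": False}
combined = compose_post(post_e, post_f, {"p", "q"})
# combined == {"p": True, "q": False}
```

With this reading, closure under ⊗ stays within the restricted {p, ⊤, ⊥} postconditions, which is what makes Lemma 1 below straightforward.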
Lemma 1. We have the following isomorphism:

M ◦ (U ⊗ U) ≅ (M ◦ U) ◦ U.
Definition 5. The axioms and rules corresponding to LCC⊗ are those of LCC plus:
axiom K and Necessitation for e ⊗ f and the reduction axiom:
[U, e ⊗ f]ϕ ↔ [U, e][U, f]ϕ
The LCC calculus extends to LCC⊗ using this reduction axiom. Moreover, the previous Lemma 1 extends to the general update U⊗ with compositions e ⊗ · · · ⊗ f of at most n actions in E, for any fixed finite number n. (This is done by taking disjoint unions of update models with n′ compositions, for each n′ ≤ n.) The extensions of LCC for these compositional action models (defined by the corresponding axioms) will also be denoted LCC⊗. Having such a bound n on composition means in practice that we will not know a priori which LCC⊗ is actually needed to solve the planning problem; only after the plan search algorithm terminates can we identify this logic. A similar remark applies to the logical extensions LCC⊗∪ studied next.
5.2 LCC⊗∪ : extending LCC with composition and choice.
Now we introduce the choice of actions e ∪ f, denoting an indeterminate choice between e and f. Note that if the action e ∪ f is available (in A), then in general e, f will be in E but not in A. As suggested in [10], we adopt a (multi-pointed) semantics and axioms to model choice in an extension of LCC. (Note that in the previous semantics we cannot extend post(·)(p) to the choice of ontic actions e ∪ f if these disagree about post(·)(p).)
Definition 6. Given an epistemic model M and an action model U, let Wd ⊆ W and Ed = {f1, . . . , fk} ⊆ E. Then M, Wd and U, Ed are multi-pointed models. We define

M, Wd |= ϕ iff M, w |= ϕ for each w ∈ Wd
M, w |= [U, Ed]ϕ iff ⋀_{f∈Ed} M, w |= [U, f]ϕ
iff M, w |= pre(f) implies M ◦ U, (w, f) |= ϕ, for each f ∈ Ed

Choice will be represented interchangeably as Ed, {e, . . . , f}, or e ∪ · · · ∪ f.
In summary, the semantics of update with a multi-pointed action model U, e ∪ f can be defined by the multi-pointed epistemic model M ◦ U, {(w, e), (w, f)}. The semantics of [U, e ∪ f] from Definition 6 then becomes: M, w |= [U, {e, f}]ϕ iff

M, w |= pre(e) ⇒ M ◦ U, (w, e) |= ϕ and M, w |= pre(f) ⇒ M ◦ U, (w, f) |= ϕ

So in case M, w |= pre(e) ∧ pre(f), the two RHS conditions reduce to satisfaction in a multi-pointed epistemic model: M ◦ U, {(w, e), (w, f)} |= ϕ. In fact, for simplicity, here we only consider the case pre(e) = pre(f) for any (available) non-deterministic action e ∪ f.
Also note that e ∪ f is not an action in the set E of the action model U or U⊗. The syntax of LCC⊗∪ extends that of LCC⊗ with the further type of formulas [U, Ed]ϕ, for each Ed ⊆ E (whose elements have identical preconditions). In [10], the next reduction axiom was suggested for ∪:

[U, Ed]ϕ ↔ ⋀_{e∈Ed} [U, e]ϕ
Definition 7. The axioms and rules corresponding to LCC⊗∪ are those of LCC⊗ plus: axiom K and Necessitation for [U, e ∪ . . . ∪ f], and the previous reduction axiom, i.e.:

[U, e ∪ . . . ∪ f]ϕ ↔ [U, e]ϕ ∧ . . . ∧ [U, f]ϕ

Fact 1. The usual LCC reduction axioms for [U, e] are derivable for [U, e ∪ f], except for [U, e ∪ f]¬ϕ ↔ (pre(e ∪ f) → ¬[U, e ∪ f]ϕ).
Proposition 1. The axioms and rules from Def. 7 are sound and complete for LCC⊗∪ .
6 Non-Deterministic Planning
Now we turn into strong planning for domains with non-deterministic actions, i.e. plan-
ning involving actions with (truly) disjunctive effects
|= [U, f0 ∪ f1 ]p ∨ q, but with 6|= [U, f0 ∪ f1 ]p and 6|= [U, f0 ∪ f1 ]q
(For this, let post(f0 )(p) = post(f1 )(q) = >, and post(f0 )(q) = q and post(f1 )(p) =
p). A strong solution for a given planning domain is a plan such that all of its executions
in the initial state lead to a goal state. Thus, ignoring preconditions, the above action
f0 ∪ f1 is a strong solution to (ϕT , {f0 ∪ f1 }, ϕG ), if ϕG = p ∨ q; and it is a weak
solution if ϕG = p.) Consider the action of tossing a coin, whose effect is heads or
tails, denoted h ∨ ¬h. We represent it as a choice between two actions: tossh , toss¬h ,
with postconditions
post(tossh )(h) = >, and post(toss¬h )(h) = ⊥
Note both tossh and toss¬h are deterministic actions in E, though neither of them will
be in A. The reason is that they are not individually available to the agent: only toss(h)∪
toss(¬h) models this agent’s abilities and hence only the latter can be used during plan
search. On the other hand, e ∪ f is an element of A but not of E. For simplicity, we
only define the choice between two actions f0 ∪ f1 . The following definitions easily
generalize to the choice of finitely many actions e ∪ · · · ∪ f.
First we must redefine solution for the non-deterministic case; the reasons for this are the generalization of the available actions A to the non-deterministic case, and the limited reduction axioms that characterize the U, e ∪ f modalities.
Definition 8. Given an action model U in LCC, let A be a set containing actions e in E and actions of the form f0 ∪ f1, where f0, f1 ∈ E. We define a set A⊗∪ of A-modalities in the extended language of LCC⊗∪ as follows:

(A) (E ∩ A)^{<ω} ⊆ A⊗∪
(∪) if e, e′ ∈ A⊗∪, then e ∪ e′ ∈ A⊗∪
(⊗L) if e ∈ A⊗∪ and e′ ∈ (E ∩ A)^{<ω}, then e′ ⊗ e ∈ A⊗∪
(⊗R) if e ∈ A⊗∪ and e′ ∈ (E ∩ A)^{<ω}, then e ⊗ e′ ∈ A⊗∪

If e does not make use of (⊗R), we say it is in search form. If e does not make use of (⊗L) or (⊗R), we say it is in branching form.
See Figure 3 for an illustration of these and other modalities. Note a modality U, e in
branching form essentially consists in a conjunction of LCC⊗ modalities. The output
of a plan search algorithm for LCC⊗∪ will be an A-modality U, e in tree form. But let
us define first when an A-modality is a solution for a planning domain M.
Definition 9. Let the LCC⊗∪ logic of some action model U (= U⊗) be given, plus a set of available actions A. Let M = (ϕT, A, ϕG) be some planning domain definable in this logic. We say an action e ∈ A⊗∪ is a solution for M iff

(1) |= ϕT → [U, e]ϕG and (2) |= ⋀_{e⃗ a path in e} (ϕT → ⟨U, e⃗⟩⊤)

The only difference with Definition 2 lies in executability: we check that every possible instantiation e⃗ of e does indeed lead to some state (and so, by (1), the actual instantiation will lead to a goal state).
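Condition (2) quantifies over all paths (deterministic instantiations) of the non-deterministic modality. In a simplified state-transition reading, checking a strong solution amounts to checking executability and goal satisfaction on every such path. The sketch below is ours; plan steps are lists of alternative actions, and `execute` is an illustrative toy coin domain.

```python
from itertools import product

# Strong-solution check in the spirit of Definition 9: every path
# (deterministic instantiation) must be executable and reach the goal.

def all_paths(plan):
    """All deterministic instantiations of a choice-step plan."""
    return [list(p) for p in product(*plan)]

def is_strong_solution(plan, execute, initial, goal_holds):
    """execute(state, action) -> successor state, or None if pre fails."""
    for path in all_paths(plan):
        state = initial
        for action in path:
            state = execute(state, action)
            if state is None:
                return False      # condition (2) fails: not executable
        if not goal_holds(state):
            return False          # condition (1) fails on this path
    return True

def execute(state, action):       # toy coin domain (our encoding)
    if action == "toss_h":
        return state | {"h"}
    if action == "toss_t":
        return state - {"h"}
    return None

toss = [["toss_h", "toss_t"]]     # the choice toss_h ∪ toss_t
strong = is_strong_solution(toss, execute, frozenset(), lambda s: True)      # goal h ∨ ¬h
only_h = is_strong_solution(toss, execute, frozenset(), lambda s: "h" in s)  # goal h
# strong is True; only_h is False: the toss is only a weak solution for h
```

This mirrors the coin example of Section 6: the choice is a strong solution for the disjunctive goal but merely a weak one for h alone.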
For (strong) non-deterministic planning, the idea is to reduce the search for a non-deterministic output Π to a tree of deterministic solutions Π = ({π, . . .}, ≺). In more detail, the planning domain M = (ϕT, A, ϕG) will split into a tree of planning domains Mξ, each with its own deterministic solution πξ. For this, we enrich the notation πξ = (πξ, ϕgoals(πξ)) into a 4-tuple as follows:

plan = ( action seq., open goals, initial state, original goals )
πξ = ( πξ, ϕgoals(πξ), ϕTξ, ϕGξ )

The tree structure of these deterministic plans is given by the successor relation ≺. A plan πξ and any of its successors πξ′ are related by some formula they share: πξ ≺ πξ′ means that the initial state ϕTξ′ in Mξ′ is part of the goals ϕGξ for Mξ. Besides the tree structure of these components πξ, these and their planning domains Mξ will be totally ordered by some lexicographic ordering. The refinement of πξ with a non-deterministic action f0 ∪ f1 gives rise to the following components:

πξ(f0): open goals t([U, f0]ϕgoals(πξ) ∧ ⟨U, f0⟩⊤), original goals ϕGξ
πξ1: ( ∅, ϕGξ, “[U, f1](·)”, ϕGξ )
πξ2: ( ∅, ϕgoals(πξ(f0)) ∧ ϕgoals(πξ1(f1)), ϕTξ, · ), with the former as open goals

≺+ = ≺ ∪ {⟨πξ2, πξ⟩, ⟨πξ2, πξ1⟩}.
Example 3. Let the output Π for some (ϕT, A, ϕG) and A be as in Figure 3. The leftmost figure represents the output of the BFSnd algorithm, built incrementally according to the lexicographic ordering: Π = ⟨π∅, π1, π2, π21, π22, π221, π222⟩. Arrows also indicate the direction of backward search, with horizontal arrows representing the introduction of a non-deterministic action.
See Figure 4 for a description of the BFSnd algorithm for non-deterministic planning. If a plan Π is the output of BFSnd, then we also call its ≺-minimal element, say πξ2, πroot.
1. LET Plans = ⟨Π∅⟩.
2. SELECT the first element Π of Plans. DELETE Π from Plans.
3. SELECT the [. . . ]
Fig. 4. BFSnd algorithm for non-deterministic planning.
Now we proceed to show that BFSnd is also complete for non-deterministic plan-
ning in LCC⊗∪ .
Lemma 2. For a given action model U and a set of available actions A, each modality U, e in A⊗∪ is logically equivalent to a modality U, e⋆ in A⊗∪ in search form.

Using this Lemma, we may simply assume in what follows that the solution, to be found by BFSnd, is in search form.

Theorem 3. Let an LCC logic of some action model U, a set A, and a planning domain M expressible in the extended logic LCC⊗∪ be given. Let U, e be a modality with e ∈ A⊗∪ in search form. If U, e is a solution for M, then the algorithm BFSnd for M terminates with an output Π.
Let us note that from Theorems 2 and 3 we can conclude that backward planning for LCC⊗∪ logics is at least semi-decidable. For the case of (incremental) forward planning, this result is shown in [3]. To illustrate non-deterministic planning, let us finally solve Example 1.
Fig. 5. Plan search in Example 1. A single plan π0 in the plan set (Left) is refined by a non-deterministic action (Center), splitting π0 into π0, π1, π2. These plans are later solved, as shown in (Right).
Example 4. Recall Example 1, where the planner agent a must show heads, denoted h, to win the prize. Among the actions A of agent a we find: getting the coin; tossing the coin; flipping the coin to heads (with the precondition that a knows that ¬h); a set of announcements, including justified truthful announcements ϕ!!^a_b (the latter require that the announcing agent knows the truth of ϕ); and secret observations, for heads h?a and tails ¬h?a. (Their secrecy consists in b mistaking either action for skipa.) In Fig. 5 (Left), the BFSnd algorithm searches first for a single plan π0. This particular π0 demands ϕgoals(π0) = h, and the only action possibly enforcing h is to toss the coin, which can also lead to ¬h. Refining with this action is shown in (Center). Finally, (Right) shows the output of BFSnd that solves this problem.
Before concluding, let us remark that the previous A∗ search for optimal planning easily extends to the non-deterministic case. Among the several possible options to define the total cost of a non-deterministic plan, consider e.g. the pessimistic and optimistic estimations based on

cost(e ∪ e′) = max{cost(e), cost(e′)} and cost(e ∪ e′) = min{cost(e), cost(e′)}

Thus, the cost of a plan is the max or min cost of its paths. For each of these cost aggregation functions (max, min), one can redefine the BFSnd algorithm so that, instead of stopping when the first output Π is found (with cost k), it continues the search until all plans Π′ ∈ Plans have at least some path whose cost is ≥ k. Hence, optimal planning can be extended to strong non-deterministic planning in LCC⊗∪.
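Under these max/min aggregations, the two plan costs can be computed directly from the paths of the non-deterministic plan. A small sketch with our own names (steps are choice sets, as before):

```python
from itertools import product

def all_paths(plan):
    """Deterministic instantiations of a plan whose steps are choice sets."""
    return [list(p) for p in product(*plan)]

def plan_costs(plan, cost):
    """(pessimistic, optimistic) total cost of a non-deterministic plan:
    the max and min summed cost over its paths."""
    totals = [sum(cost[a] for a in path) for path in all_paths(plan)]
    return max(totals), min(totals)

plan = [["toss_h", "toss_t"], ["announce"]]
cost = {"toss_h": 2, "toss_t": 1, "announce": 1}
worst, best = plan_costs(plan, cost)
# worst == 3 (pessimistic), best == 2 (optimistic)
```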
7 Conclusions and Future Work
We presented backward planning algorithms for a planner-reasoner agent, enabling this agent to find deterministic or strong (non-deterministic) plans in multi-agent scenarios. The logics considered here are dynamic epistemic logics with ontic actions, thus allowing the plans to contain communicative actions, sensing or the usual fact-changing actions. We hope the proposed work may serve for practical reasoning in communicative agents, and in particular provide a logical foundation for the modeling (and computing) of agreements among motivated agents. As for future work, several directions seem interesting, such as the introduction of dynamic operators for belief revision and, at a more applied level, the study of heuristic criteria for optimal plans.
Acknowledgements: This work has been funded by projects AT (CSD 2007-022); Lo-
MoReVI (FFI2008-03126-E/FILO FP006); ARINF (TIN2009-14704-C03-03); and the
GenCat 2009-SGR-1434 and EPSRC grant EP/J002607/1.
References
1. G. Aucher. DEL-sequents for progression, Journal of Applied Non-Classical Logics 21(3-4):
289–321 (2011)
2. A. Baltag, L. Moss and S. Solecki. The logic of public announcements, common knowledge and private suspicions, Proc. of 7th Conf. on Theoretical Aspects of Rationality and Knowledge TARK’98, pp. 43–56 (1998)
3. T. Bolander and M. Birkegaard Andersen. Epistemic planning for single- and multi-agent
systems Journal of Applied Non-Classical Logics, in press (2011)
4. T. French and H. van Ditmarsch. Undecidability for arbitrary public announcement logic,
Advances in Modal Logic AiML 2008, F. Wolter, H. Wansing, M. de Rijke and Zakharyaschev
(eds.), Vol. 7, pp. 23–42, CSLI Stanford (2008)
5. M. Ghallab, D. Nau and P. Traverso. Automated Planning: Theory and Practice, Morgan
Kaufmann, San Francisco, USA (2004)
6. B. Kooi. Expressivity and completeness for public update logics via reduction axioms, Jour-
nal of Applied Non-Classical Logics, 17(2): 231–253 (2007)
7. B. Löwe, E. Pacuit and A. Witzel. Planning based on dynamic epistemic logic. Tech. Report
(2010)
8. P. Pardo and M. Sadrzadeh. Planning in the Logics of Communication and Change, Proc.
of Autonomous Agents and Multiagent Systems AAMAS 2012, V. Conitzer, M. Winikoff, L.
Padgham and W. van der Hoek (eds.) Valencia, Spain (2012)
9. J. Pearl. Heuristics: Intelligent Search Strategies for Computer Problem Solving, Addison-Wesley (1984)
10. J. van Benthem, J. van Eijck and B. Kooi. Logics of Communication and Change, Informa-
tion and Computation 204:1620–1662 (2006)
11. W. van der Hoek and M. Wooldridge. Tractable Multiagent Planning for Epistemic Goals
Proc. of Autonomous Agents and Multiagent Systems AAMAS 2002, pp. 1167–1174 (2002)
12. H. van Ditmarsch, W. van der Hoek and B. Kooi. Dynamic Epistemic Logic, Springer (2008)
13. H. van Ditmarsch, J. van Eijck, F. Sietsma and Y. Wang. On the logic of lying, Games, Actions and Social Software, J. van Eijck and R. Verbrugge (eds.), LNAI-FoLLI, Springer (in press)
14. H. van Ditmarsch, B. Kooi. Semantic results for ontic and epistemic change Proceedings of
Logic and the Foundations of Game and Decision Theory LOFT 7 G. Bonanno, W. van der
Hoek and M. Wooldridge (eds.) pp. 87–117 (2008)