=Paper=
{{Paper
|id=Vol-451/paper-6
|storemode=property
|title=Efficient Plan Adaptation through Replanning Windows and Heuristic Goals
|pdfUrl=https://ceur-ws.org/Vol-451/paper08gerevini.pdf
|volume=Vol-451
|dblpUrl=https://dblp.org/rec/conf/rcra/GereviniS08
}}
==Efficient Plan Adaptation through Replanning Windows and Heuristic Goals==
Alfonso E. Gerevini (1) and Ivan Serina (2)
(1) DEA, Università degli Studi di Brescia, via Branze 38, 25123 Brescia, Italy, gerevini@ing.unibs.it
(2) Free University of Bozen – Bolzano, Viale Stazione 16, 39042 Bressanone, Italy, ivan.serina@unibz.it
Abstract
Fast plan adaptation is important in many AI applications. From a theoretical
point of view, in the worst case adapting an existing plan to solve a new problem
is no more efficient than a complete regeneration of the plan. However, in practice
plan adaptation can be much more efficient than plan generation, especially when
the adapted plan can be obtained by performing a limited amount of changes to
the original plan. In this paper, we investigate a domain-independent method
for plan adaptation that modifies the original plan by replanning within limited
temporal windows containing portions of the plan that need to be revised. Each
window is associated with a particular replanning subproblem that contains some
“heuristic goals” facilitating the plan adaptation, and that can be solved using
different planning methods. An experimental analysis shows that, in practice,
adapting a given plan for solving a new problem using our techniques can be much
more efficient than replanning from scratch.
1 Introduction
Fast plan adaptation is important in many AI applications requiring a plan reasoning
module. A typical plan adaptation task consists of modifying a previously generated
plan in order to use it for solving a new problem which is “similar” to the original
one, in the sense that only a few facts in the initial and goal states are different. This
process can be either off-line (e.g., adapting a plan retrieved from a plan library before
its execution), or on-line (e.g., adapting a plan during a “mixed-initiative” construction
of it [3], or during its execution).
From a theoretical point of view, in the worst case adapting an existing plan is
no more efficient than a complete regeneration of the plan [15]. However, in practice,
we expect that in many cases plan adaptation should be much easier than a complete
replanning, because the adapted plan can be obtained by performing a limited amount
of changes to the original plan. Our general goal is the development of plan adaptation
methods that perform efficiently especially in such cases.
Adj is a domain-independent planning system based on planning graphs [1], that
uses a collection of systematic and heuristic search techniques for solving plan adapta-
tion tasks by replanning certain portions of the original plan.1 In this context, a plan is
1 Adj is available from the authors upon request.
Proceedings of the 15th International RCRA workshop (RCRA 2008):
Experimental Evaluation of Algorithms for Solving Problems with Combinatorial Explosion
Udine, Italy, 12–13 December 2008
a partially ordered set of STRIPS-actions, where each action is associated with a time
step and unordered actions are associated with the same time step (indicating that
these actions can be executed in any relative order). A first version of Adj using the
IPP planner [13] was presented in [8]. In this paper, we propose a significant extension
of the original version of Adj based on new heuristics for fast replanning, and we in-
vestigate the use of a different style of planning for the replanning phase. Moreover, we
introduce a plan optimization phase in Adj, and we present new experimental results
evaluating our approach.
Adj is based on identifying and resolving the inconsistencies (with respect to the
new problem) that are present in the original (input) plan by revising limited portions
of the plan. An inconsistency is an action-precondition that is not satisfied, a goal
of the new problem that is not achieved, or a pair of mutually exclusive actions [1].
The portions of the plan that are revised are delimited by some temporal windows on
the plan containing inconsistencies at specific time steps. Each replanning window is
associated with a particular replanning (sub)problem that is solved using a systematic
or heuristic search algorithm (in our implementation we used the search algorithms
of IPP [13] and FF [10], but other algorithms could be used as well). During the
adaptation process a temporal window can be incrementally enlarged until, in the
worst case, it contains the full plan; a complete replanning is then performed. When
a subplan for solving the replanning problem associated with a temporal window W is
computed, the actions of the plan under adaptation that are in W are replaced with the
new subplan. This technique is complete, in the sense that if the replanning algorithm
is sound and complete and the new problem is solvable, then the revision of the input
plan for the old problem leads to a correct plan for the new problem.
In the rest of the paper, first we give a general description of our plan-adaptation
method, followed by a more detailed description; then we present the results of an
experimental evaluation showing that, in practice, our approach can be much more
efficient than replanning from scratch.
2 Plan Adaptation, Replanning Windows and Heuristic Goals
In this section we describe our plan adaptation technique. We assume that the reader
is familiar with the Graphplan-style of planning [1].
2.1 A General Description of adjust-plan
Given a plan π0 for a planning problem Π0 and a new problem Π, differing from Π0 in
some initial or goal fact(s), the plan adjustment process of Adj consists of three main
phases:
1. Analysis of the input plan to determine a subset of the actions in π0 that are
applicable for solving Π.
2. Identification of the set S of inconsistencies that are present in π0 with respect to
Π (an inconsistency is a pair of mutually exclusive actions, an action with some
unachieved precondition(s), or a goal of Π which is not achieved).
3. Revision of π0 to repair S obtaining a valid plan for Π.
The first phase is performed by mapping each action of π0 to an action node of the
planning graph G for the new problem (if such a node exists). This mapping considers
three cases: (a) the number of time steps involved in π0 is the same as the number of
levels in G; (b) the number of time steps involved in π0 is higher than the number of
levels in G; (c) the number of the time steps involved in π0 is smaller than the number
of levels in G. Each of these cases will be discussed in the last part of Section 2.
The second and the third phases of the adaptation process are accomplished by an
algorithm, adjust-plan, that is correct and complete: the adjusted plan is a valid
plan for Π, and if there exists a plan for Π, then an adjusted plan is computed. Figure
1 describes the main steps of adjust-plan, while some important additional details
are given in the next subsections. The first step of adjust-plan identifies and removes
from π0 the actions that are not applicable in the context of the new problem. Then
step 2 identifies the earliest time step at which π0 contains an inconsistency. Since we
are considering deterministic domains, this can be accomplished in polynomial time by
simulating the execution of π0 (analogously, the facts that are necessarily
true at any time step can be determined in polynomial time).
Then adjust-plan processes the time step i identified at step 2, trying to remove
the inconsistencies by replanning from time i - 1 to time i. If there exists no plan,
or a certain search limit is exceeded, then the replanning window is enlarged (e.g.,
we replan from i - 1 to i + 1). The process is iterated until either a (sub)plan is
found, the window has been enlarged up to include all the actions of the plan and
the (complete) replanning has failed, or the search has reached a predefined CPU-
time limit (max-adjust-time). Note that in our current implementation of adjust-
plan, during replanning (step 5) the actions of π that are present in the replanning
window are ignored (a new planning graph for the replanning problem is constructed).
The replanning process at step 5 is performed by first constructing the corresponding
replanning graph, and then searching it by using the backtracking scheme of IPP or FF.
At step 6 of adjust-plan the replanning window can be increased going either
backward in time (i.e., init-time is decreased), forward in time (i.e., goal-time is
increased), or both. Enlarging a replanning window corresponds to revising a larger
portion of the plan under adaptation. Such a portion will be replaced by the subplan
solving the replanning problem associated with the enlarged window (when this is
found). Adj has a default value for max-adjust-time that can be modified by the
user. In principle, if max-adjust-time is set to a sufficiently high value, then
adjust-plan can enlarge a replanning window until it reaches the earliest and latest
time steps of the plan, which amounts to a complete search.
Algorithm: adjust-plan
Input: a plan π0 with some inconsistencies w.r.t. Π, a CPU-time limit max-adjust-time.
Output: either a correct plan or fail.
1. Let π be the plan obtained from π0 by removing the actions that are not applicable for
solving Π;
2. Identify the earliest time step i in π containing an inconsistency; if there is no such a
time step, then return π;
3. If i is the latest time step of π, then set init-time to i − 1 and goal-time to i,
otherwise set init-time to i and goal-time to i + 1;
4. While CPU-time ≤ max-adjust-time
5. Replan using as initial facts F (init-time) and as goals G(goal-time), where
F (init-time) is the set of facts that are true at time init-time, and
G(goal-time) is a set containing the preconditions of the actions in π at time
goal-time;
6. If there is no plan from F (init-time) to G(goal-time), or a search limit is
exceeded, then decrease init-time or increase goal-time (i.e., we enlarge the
replanning window), otherwise replace the existing actions between init-time
and goal-time in π with the new subplan, and goto 2.
7. Return fail.
Figure 1: High-level description of adjust-plan.
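As an illustration, the control loop of adjust-plan can be sketched in Python as follows. This is a minimal sketch, not the actual implementation: the `applicable` test, the `earliest_inconsistency` check, and the `replan` procedure are hypothetical stand-ins for the components described above, and the plan is simplified to a sequence with one action per time step.

```python
import time

def adjust_plan(plan, applicable, earliest_inconsistency, replan,
                max_adjust_time=60.0):
    """Sketch of the adjust-plan loop of Figure 1."""
    # Step 1: remove the actions that are not applicable for the new problem.
    plan = [a for a in plan if applicable(a)]
    start = time.monotonic()
    while time.monotonic() - start <= max_adjust_time:   # step 4
        # Step 2: earliest time step with an inconsistency (None if valid).
        i = earliest_inconsistency(plan)
        if i is None:
            return plan
        # Step 3: initial one-step replanning window around the inconsistency.
        init_t, goal_t = (i - 1, i) if i == len(plan) else (i, i + 1)
        while True:
            # Step 5: replan from F(init-time) to G(goal-time).
            subplan = replan(plan, init_t, goal_t)
            if subplan is not None:
                # Replace the window's actions with the new subplan.
                plan = plan[:init_t] + subplan + plan[goal_t:]
                break  # back to step 2
            # Step 6: enlarge the replanning window.
            if init_t > 0:
                init_t -= 1
            elif goal_t < len(plan):
                goal_t += 1
            else:
                return None  # complete replanning failed (step 7)
    return None  # max-adjust-time exceeded (step 7)
```

In the sketch the window is enlarged backward first and then forward, but, as noted above, the real algorithm may decrease init-time, increase goal-time, or both.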
Finally, during replanning within a particular window we impose a search limit that
is automatically increased when the replanning window is enlarged.2 The motivation
of this heuristic is that when a replanning problem associated with a certain window is
hard to solve, it can be easier to revise a larger portion of the plan (instead of devoting
considerable effort to the revision of a restricted part). While this does not affect
completeness, in practice it can significantly speed up the computation of an adapted
plan.
2.2 A Detailed Description of adjust-plan
The input plan π0 of adjust-plan is a plan for solving an old problem that we would
like to adapt for solving a new problem Π, where the initial state or the goal state
are different from the corresponding states in the old problem. We indicate with IΠ
the initial state of Π, with π the plan under adaptation for solving Π, and with G the
planning graph of Π. For the moment we assume that G is the planning graph for
Π such that the last level of G contains all the goals of Π with no mutually exclusive
relations between them. Moreover, we assume that length(G) = length(π0 ), i.e., that
2 In the current implementation this limit is defined by constraining either the maximum number m
of time steps or the number of actions in the solution of the replanning problem, depending on the style
of planning (either parallel or sequential) that we use for solving the replanning problem. For parallel
planning, m is initially set to 3, and then it is automatically increased by 2 each time the replanning
window is enlarged by one level; for sequential planning, m is initially set to 4, and then increased by
3 times the number of actions in the replanning window.
the number of levels in G (excluding the final goal level) is the same as the number of
time steps of π0 . In the last part of Section 2, we will generalise to the other cases.
The definition and computation of replanning window use the following notion of
applicable action:
Definition 1 (Applicable action). An action Ak of π0 at a certain time step k is
applicable for solving Π if and only if Ak is a node of G at the action-level k.
Note that, by definition of planning graph [1], when the fact corresponding to a
precondition of an action Ak of π0 is absent at the fact-level k of G, also the action
node corresponding to Ak is absent at the action-level k of G.
Before starting the adaptation process we remove from π0 the actions that are not
applicable, and we set π to the resulting plan. During the adaptation process π is
incrementally revised by replacing subplans within certain temporal windows with new
subplans.
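Definition 1 reduces to a membership test on the planning graph. The sketch below uses a hypothetical encoding in which the graph is abstracted as a list giving the set of action names present at each action-level, and the plan has one action per time step; inapplicable actions are replaced by None so that the remaining actions keep their time steps.

```python
def remove_inapplicable(plan, action_levels):
    """Per Definition 1, the action of pi0 at time step k is applicable
    for the new problem iff it appears as a node at action-level k of
    the planning graph; all other actions are removed from the plan."""
    return [a if a in action_levels[k] else None
            for k, a in enumerate(plan)]
```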
Definition 2 (Replanning window). Given two time steps i and j of a plan π under
adaptation for solving Π, a replanning window for π is a pair ⟨F(i), G(j)⟩ such that:
• 0 ≤ i < j and if i > 1, then the subplan of π from time step 0 to time step i − 1 is
a valid subplan for achieving the preconditions of the actions at time i − 1 from
IΠ ;
• F (i) is the set of positive facts that are true at time step i in π;
• G(j) is a set of (sub)goals either consisting of the final goals of Π (if j is the time
step corresponding to the final goals of Π), or containing the preconditions of the
actions at time j (otherwise).
The time steps i and j of a replanning window ⟨F(i), G(j)⟩ are called the replanning
start time and the replanning end time, respectively; F(i) is called the replanning initial
state, and G(j) the replanning goal set. Note that when the replanning goal set
does not correspond to the goals of Π, Definition 2 requires that G(j) contains the
action-preconditions at time j, but it does not specify which other replanning goals
should be included (if any); moreover, these extra
goals are irrelevant for ensuring correctness and completeness [8]. On the other hand,
as will be discussed in the next section, they can be useful for making the adaptation
process more efficient.
Step 2 of adjust-plan guarantees that each replanning window contains the ear-
liest inconsistencies of π. This allows computing an exact assessment of the facts F (i)
that are true at the replanning start time. (If the plan contained an inconsistency
preceding the current replanning start time i, then we could compute only an approx-
imation of the facts that are true at time i, because these can depend on how the
preceding inconsistency will be repaired.) In particular, F (i) can be computed in poly-
nomial time by simulating the execution of the actions in π preceding time i, i.e., F (i)
can be incrementally computed according to the following definition:
F(i) = IΠ if i = 0, and F(i) = Ei ∪ F+i−1 if i > 0,
where Ei is the set formed by the positive effects of the actions of π at time step i − 1,
and F+i−1 is the subset of F(i − 1) formed by the facts that are not deleted by the
actions of π at time i − 1.
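The incremental computation of F(i) amounts to simulating STRIPS execution. The encoding below is a hypothetical one chosen for the sketch: each action is a dict with 'add' and 'del' fact sets, and plan[t] is the (possibly empty) list of parallel actions at time step t.

```python
def forward_state(initial_state, plan, i):
    """Compute F(i), the set of facts true at time step i:
    F(0) = I_Pi, and F(t) = E_t u F+_{t-1} for t > 0, where E_t collects
    the positive effects of the actions at time t - 1 and F+_{t-1} keeps
    the facts of F(t - 1) not deleted by those actions."""
    facts = set(initial_state)
    for t in range(i):
        adds = set().union(*[a['add'] for a in plan[t]]) if plan[t] else set()
        dels = set().union(*[a['del'] for a in plan[t]]) if plan[t] else set()
        facts = (facts - dels) | adds
    return facts
```

As the text notes, this simulation runs in polynomial time, which is what makes an exact assessment of the replanning initial state cheap to obtain.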
In [8] we prove that adjust-plan is sound and complete, provided that max-adjust-time
is set to a sufficiently high value and that a sound and complete algorithm is used for
the replanning phase.
2.3 Heuristic Replanning Goals
According to Definition 2, the replanning goal set can contain some goals in addition
to the preconditions of the actions that are planned at the replanning end time. In the
following Σj indicates this set of action-preconditions, while Ωj indicates the (possibly
empty) set of the remaining goals of G(j), i.e., G(j) = Σj ∪ Ωj .
Including a non-empty Ω-set in G(j) can be useful because some actions in the
subsequent part of the plan might require that the Ω-facts hold in the replanning goal
state (even though they are not preconditions of actions at time j). This need arises, for
example, when an action A of π0 in the replanning window has some effect φ that is
not required by any other action in the replanning window, but that persists up to
a time step after the replanning end time (j) to achieve a precondition of an action
B, that otherwise would not be satisfied.3 The inclusion of φ in G(j) is useful be-
cause the (sub)plan found during replanning might not necessarily contain A (all the
actions in the replanning window are always replaced by the new subplan solving the
corresponding replanning problem). Thus, if φ were not in G(j) and, after the subplan
replacement, φ did not hold at time j, then adjust-plan would identify an inconsis-
tency (the unsatisfied φ-precondition of B) at a time later than j, and hence another
replanning phase would be activated.4
The Ω goal set of G(j) can be seen as an assessment of the set Ω∗j of the facts that
should hold at time j for achieving those preconditions of actions in the subsequent
part of π that otherwise would be unsatisfied (and so these actions would not be
reusable for solving Π). Note that in general computing an exact assessment of Ω∗j
would not be feasible, because the subsequent part of π can change incrementally, as a
consequence of repeated replanning phases to cope with inconsistencies at times later
than j. Fortunately, including an exact assessment of Ω∗ in G(j) is not necessary for
ensuring the correctness and completeness of the method [8]. Instead, the inclusion
of the Ω-goals should be seen as a heuristic aimed at reusing as many actions of the
original plan as possible, and at obtaining a fast adaptation.
We have developed definitions of Ω-goals, which are based on the assumption that
Π can be solved by a plan which is similar to π, i.e., that the adaptation of π0 requires
a limited number of changes. In the rest of the paper, we present two such definitions
and we experimentally compare them in order to analyze their importance and
effectiveness in practice.
3 Using causal-link planning terminology, we can say that π0 contains the causal link A −φ→ B.
4 Reducing the amount of replanning can be crucial not only for efficiency, but also for reusing as
much as possible the input plan, which is important, for example, in mixed-initiative planning.
We will use the following notation for associating a time step in the plan under
adaptation π to a time step in the input plan π0. Let t be a time step of π; t̂ denotes
the time step corresponding to t in π0 before any modification. I.e., t̂ is defined as
t̂ = t − Σ_{s=1..k} δs, where k is the number of subplan-replacements performed by the
algorithm, and δs is the difference between the number of time steps involved in the
new replanned subplan and the number of time steps in the corresponding existing
subplan for the s-th replacement.
2.4 Forward Ω Goal set
We now give the definition of forward Ω replanning goals (ΩF-goal set), which uses
the notion of persistent fact with respect to π0 and to the planning graph of the new
problem. ΩFj is a set of facts that we include in G(j) to obtain a replanning
goal state that is similar to the corresponding state at time ĵ in π0. The reason for this
is that, if π0 is correct for the old problem Π0, then each state reached by π0 contains
a set of facts that is an exact assessment of its Ω∗-set. Hence, if we can solve Π using a
plan that is similar to π0, then we can expect that ΩFj is a good approximation of Ω∗j.
Definition 3 (Persistent fact). A fact fk is persistent in π0 and G at time k if and
only if fk is a node of G at the fixpoint level such that the corresponding noop-action is
not mutually exclusive with any applicable action of π0 .
In the following definition endtime(π0 ) indicates the last time step of plan π0 .
Definition 4 (Forward Ω Goal set – ΩF). Let k be a time step of π0, i the earliest time
step where π0 has an inconsistency, j the end time of the current replanning window
in π, and Γk the following set of facts:
Γk = F(k) if k ≤ i, and Γk = Ek ∪ Γ−k−1 if k > i,
where Ek is the set formed by the positive effects of the applicable actions of π0 at time
k, and Γ−k−1 is the subset of Γk−1 formed by the facts that are persistent in π0 and G at
time k − 1. ΩFj is the subset of Γĵ consisting of the facts that are persistent in π0 and
G at time ĵ, if ĵ < endtime(π0); the empty set if ĵ = endtime(π0).
Note that the Γ-sets in the definition of the ΩF-goals of G(j), as well as the Σ-goals,
can contain facts that are mutually exclusive in G at level ĵ. This can be the case
because the set Ej of action-effects, as well as the set Σj of action-preconditions, can
contain facts that are mutually exclusive. This can happen even if the original plan from
which we start the adaptation is valid (for the old problem), because the new problem
Π can have a different initial state, introducing mutual exclusion relations that were
not present in the planning graph of the old problem. During the computation of ΩFj ,
if a certain Γx contains a subset of mutually exclusive facts, then we impose that not
more than one of them (randomly chosen) belongs to Γx+1 (for x < j).
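Definition 4 can be sketched as the iteration below. This is a simplified illustration: the `effects` and `persistent` callbacks are hypothetical stand-ins for the quantities Ek and the persistence test defined above, and the mutex filtering on the Γ-sets just described is omitted.

```python
def forward_omega_goals(F_i, i, j_hat, effects, persistent, end_time):
    """Sketch of Definition 4: Gamma_k = F(k) for k <= i, and
    Gamma_k = E_k u Gamma-_{k-1} for k > i; Omega^F_j keeps the facts
    of Gamma_{j_hat} that are persistent at j_hat."""
    if j_hat == end_time:
        return set()          # empty set at the last time step of pi0
    gamma = set(F_i)          # Gamma_i = F(i)
    for k in range(i + 1, j_hat + 1):
        # Gamma-_{k-1}: facts of Gamma_{k-1} persistent at time k - 1.
        kept = {f for f in gamma if persistent(f, k - 1)}
        gamma = set(effects(k)) | kept
    return {f for f in gamma if persistent(f, j_hat)}
```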
Figure 2 gives an example illustrating the elements of the sets F (i), Γj , ΩFj and
G(j). The Figure shows a portion of the planning graph G for a problem Π in the
7
blocks world domain, that we are trying to solve by adapting a plan π0 (for the sake
of clarity the information in the planning graph that is not relevant for the example is
omitted). Initially we have π = π0 . We assume that all the actions of π0 are applicable,
that the earliest inconsistency in π0 is at time i and that the current replanning window
is hF (i), G(i + 1)i. Moreover, we assume that, after the application of the actions of π
preceding i, the set of positive facts in the replanning initial state is the same as Γi :5
Γi = F (i) = {hold C, on A B, on D E, clear D, clear A}.
Note that the facts clear B and ontab D are present in G, but after the execution of
π up to time i − 1 they are false. stack C B is the only action of π at the time i. This
action has two preconditions: hold C and clear B. clear B is not satisfied at the level
i, and this causes an inconsistency in π at the time step i. The elements of Γi+1 are
the union of:
• the set of the positive effects of the action stack C B: {clear C, on C B, arm empty};
• the subset Γ−i of the facts in Γi that at time i + 1 continue to be true (i.e. the
facts in Γi that are persistent in π0 and G): {clear D, on D E, clear A}.6
hold C does not belong to Γ−i (nor to Γi+1), as the corresponding no-op is mutually
exclusive with stack C B (i.e., hold C is not persistent in π0 and G at time i). Similarly,
on A B does not belong to Γi+1. The Σ-set and the ΩF-set forming G(j) are obtained
as follows:7 Σi+1 = {arm empty, clear D, ontab D}, ΩFi+1 = {on C B, clear C, clear A}.
2.5 Causal Ω Goal set
The previously defined ΩF -set contains all the facts that are persistent at a specific
time step. Unfortunately, this set could contain some facts that are not necessary for
the correctness of the remaining actions of the plan, which could make solving the
local planning problem harder than necessary. If we consider the example of Figure 2,
fact clear C at time step i + 1 is not necessary for the applicability of pickup D and
stack D A.
In order to improve the definition of the heuristic goal set, we can exploit the
causal link representation of the plan under adaptation [16]: the existence of a causal
link a −f→ b “crossing” a time step l indicates that the truth of fact f at l is necessary
for the correctness of π. More precisely, the set Lπ of the causal links in π is defined as:
Lπ = { (a −f→ b) | a, b ∈ π ∧ f ∈ add(a) ∧ f ∈ prec(b) ∧ Levπ(a) < Levπ(b) ∧
∄c ∈ π s.t. f ∈ del(c) ∧ Levπ(a) ≤ Levπ(c) < Levπ(b) },
where Levπ(x) corresponds to the time step of the action x in π. Moreover, we define
the set Lπl ⊆ Lπ of the causal links in π crossing a time step l as follows:
Lπl = { (a −f→ b) ∈ Lπ | Levπ(a) < l ≤ Levπ(b) }.
5 Since in this example k̂ = k for any step k, to simplify the notation k̂ is indicated with k.
6 on D E, clear D and clear A are the only ones in F(i) whose corresponding no-ops at the level i
in the planning graph are not mutually exclusive either with stack C B or with each other.
7 arm empty and on D E in Γi+1 are not in ΩFi+1 because they are not persistent in π and G at i + 1.
[Figure 2 here: a portion of the planning graph at levels i and i + 1, showing the fact
nodes hold_C, clear_B, on_A_B, clear_D, ontab_D, on_D_E, clear_A, clear_C, on_C_B
and arm_empty, the actions stack_C_B and pick_up_D with their no-ops, the mutual
exclusion (m.e.) relations among them, and the initial and goal block configurations.]
Goals = {on C B, on D A}, π = {. . . ; i : stack C B; i + 1 : pickup D; i + 2 : stack D A}.
Figure 2: Example illustrating the definitions of F (i), Γj , ΩFj and G(j), for j = i + 1. The
actions of π at the times i and i + 1 are stack C B and pickup D respectively. We use solid
boxes for representing the facts belonging to F (i), and dashed boxes for the facts of G that
are not in F(i). The facts belonging to G(j) are indicated with gray boxes, while the facts
belonging to ΩFj are enclosed in a box.
The f-facts involved in the causal links of Lπl can be used to derive a new definition of
the Ω-goal set at time step l:
ΩCl = { f | ∃(a −f→ b) ∈ Lπl }.
For the example of Figure 2, we have the following causal links crossing time step i + 1
(ainit and aend are special actions whose effects are the facts true in the initial state,
and whose preconditions are the problem goals, respectively):
Lπi+1 = { (stack C B −on C B→ aend), (stack C B −arm empty→ pickup D),
(ainit −clear D→ pickup D), (ainit −clear A→ stack D A) },
and so we have ΩCi+1 = {on C B, clear A, arm empty, clear D}.
Unlike ΩFi+1, ΩCi+1 does not contain fact clear C, but it includes clear A,
which is necessary for the applicability of stack D A (note that facts arm empty and
clear D already belong to Σi+1). It is important to observe that, since π could be
an invalid plan, there could be some conflicts between facts in ΩCl and in Σ. In these
cases, in order to avoid having conflicting replanning goals, it is necessary to select a
subset of the computed replanning goals. We do this heuristically: having
to choose between two facts that are mutually exclusive, we prefer the fact that (a) is
used by an action that is temporally nearest to the time step of the replanning goals
and (b) whose truth satisfies the greatest number of action preconditions.
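The causal-link extraction and ΩC can be sketched for a sequential plan as follows. The action encoding (dicts with 'pre', 'add' and 'del' sets) is hypothetical; virtual ainit/aend actions can be appended by the caller to encode the initial state and the goals, and for simplicity only the most recent producer of each precondition is recorded, which does not change the resulting fact set.

```python
def causal_links(plan):
    """Return triples (ta, f, tb): the action at step ta adds fact f,
    f is a precondition of the action at step tb, and no action from
    ta (included) up to tb deletes f."""
    links = []
    for tb, b in enumerate(plan):
        for f in b['pre']:
            for ta in range(tb - 1, -1, -1):
                if f in plan[ta]['add'] and f not in plan[ta]['del']:
                    links.append((ta, f, tb))
                    break
                if f in plan[ta]['del']:
                    break  # f is clobbered before tb: no link
    return links

def causal_omega(plan, l):
    """Omega^C_l = { f | some causal link a -f-> b crosses step l }."""
    return {f for ta, f, tb in causal_links(plan) if ta < l <= tb}
```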
2.6 Additional Details
To conclude the description of our plan-adaptation method, we consider the cases in
which the number of time steps nt involved in π0 is different from the number of levels
[Figure 3 here: the replanning windows W1, . . . , W4 of the subplans replaced during
adaptation, positioned along the plan levels 0, i1, j1, . . . , i4, j4, together with the
alternative optimization windows O1, . . . , O10 and Oend considered during the
optimization process.]
Figure 3: Example illustrating the iterative optimization procedure. W1···4 are the replanning
windows of the subplans replaced during adaptation. Oi are the alternative replanning windows
considered during the optimization process so as to find a series of plans each of which is
characterized by a smaller number of time steps/actions with respect to the previous plan.
nl in the planning graph G.
When nt > nl , the applicable actions of π0 are determined by first extending G to
have the same number of levels as the time steps of π0 , and then applying a definition
of an applicable action analogous to Definition 1.
When nt < nl , the mapping between plan-actions and graph-nodes in the definition
of applicable actions is done by considering the last nt levels of G. E.g., we try to map
actions at time nt to nodes at the last level of G, actions at time nt − 1 to nodes at the
penultimate level of G, and so on. If for an action A of π0 there is no corresponding
action node in G, then we consider A an action that cannot be applied for solving Π,
and we remove it from π. If nt < nl , we say that a fact fk is persistent in π0 and G if and
only if fk is a node of G at level l = k + length(G) − length(π0 ), and the corresponding
noop-action at level l is not mutually exclusive with any applicable action of π0 at
time k.
2.7 Plan Optimisation
Concerning plan quality, clearly, the plan adaptation techniques that we have presented
can find a first solution that is not optimal with respect to the number of actions or
time steps. If we are interested in finding good quality plans, which is important in
many applications, we can extend our method by identifying a “suitable” alternative
replanning window O for each previous replanning phase, and repeating the adaptation
process in order to find a new subplan with a lower number of actions/time steps w.r.t.
the corresponding number of actions/time steps in O.
The iterative optimization process illustrated in the example of Figure 3 produces a
series of optimized plans, each of which involves a smaller number of either time steps
or actions. The longer this process is, the better the final plan can be. This process
can be interrupted at any time to return the best plan generated so far.
The computation of an optimized plan is based on considering alternative replan-
ning windows for the subplans that were previously inserted, during either the plan
adaptation phase or the plan optimization process. For each of these subplans, we
have to consider a replanning window that is larger than the one previously used; if,
for this enlarged window, we find a subplan involving a smaller number of time steps
(or actions) w.r.t. the current subplan in the replanning window under consideration,
then the plan actions inside the replanning window are replaced with those in the newly
generated subplan.
For instance, if we have replanned from F(î) to G(ĵ), obtaining a subplan with k
time steps, then we will optimize from F(i′) to G(j′) with j′ > ĵ and i′ ≤ î, imposing
on the solution subplan a maximum number of time steps equal to (j′ − ĵ) + k + (î − i′).
If there is no plan from F(i′) to G(j′) that satisfies such a constraint, then we
either decrease i′ or increase j′ (i.e. we enlarge the replanning window); otherwise,
we revise π0 with the new subplan, and then we consider another replanning window
previously processed by the plan adaptation/optimization process for the possible
improvement of the corresponding subplan. In principle, each replanning window can
be enlarged so as to contain the entire current plan, i.e., F(i′) = IΠ (the initial state)
and G(j′) = GΠ. In this case, if we solve the replanning problems using an optimal
planner, like IPP, the generated plan is guaranteed to be optimal.
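One optimization step can be sketched as follows. This is a simplified illustration of the scheme above: `replan_bounded` is a hypothetical planner that returns a subplan with at most the given number of time steps (or None), the window is enlarged in both directions at once, and a strictly tighter bound is imposed so that any returned subplan shortens the plan.

```python
def optimize_window(plan, replan_bounded, i_hat, j_hat, k):
    """Try to improve the k-step subplan previously inserted between
    steps i_hat and j_hat: enlarge the window to [i2, j2] and ask for a
    subplan strictly shorter than the (i_hat - i2) + k + (j2 - j_hat)
    steps it would replace."""
    i2, j2 = i_hat, j_hat
    while i2 > 0 or j2 < len(plan):
        if i2 > 0:
            i2 -= 1            # enlarge the window backward ...
        if j2 < len(plan):
            j2 += 1            # ... and/or forward
        bound = (i_hat - i2) + k + (j2 - j_hat) - 1
        subplan = replan_bounded(plan, i2, j2, bound)
        if subplan is not None:
            return plan[:i2] + subplan + plan[j2:]  # strictly shorter plan
    return plan  # no improvement found for this window
```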
3 Experimental Results
We present the results for some experiments aimed at testing the efficiency of our plan
adaptation approach, which is implemented in the Adj system. The general goal of
our experiments was to confirm the hypothesis that plan adaptation using Adj can be
much faster than replanning from scratch, for which we used two types of planners: the
optimal parallel planner IPP based on planning graphs, and the satisficing state-based
sequential planner FF based on heuristic search techniques. These well-known planners
are among the best available for these styles of planning. We tested our system using a
large collection of known planning problems in different known domains: Rocket [20],
Logistics [12], Gripper [4], DriverLog and Zenotravel [14]. For each of these problems
we designed some modifications (to the initial state or to the goals) and we solved the
modified versions using a solution plan for the original problems as part of the input.8
Concerning the comparison of Adj with IPP, we used a set of variants of some
benchmarks from the 1st International Planning Competition (IPC). For this experi-
ment, IPP was also used by Adj to solve the planning problems associated with the
replanning windows. Figure 4 compares the CPU-times consumed by Adj with the
CPU-times consumed by IPP for solving the same set of problem variants, which are
named on the x-axis of the plots using numbers. Each problem variant contains a few
changes to the facts of the initial or goal state(s) of the original problem, making
the input plan solving the original problem invalid for the revised problem.9 In general,
these results show that solving a problem by plan adaptation using Adj can be much
faster than by replanning from scratch, by up to four orders of magnitude. (For example,
Adj with ΩC solved the Logistics_c problem 14 in 0.66 seconds and with ΩF in 0.75
seconds, while IPP required 1871 seconds.) The main reason for this observed behaviour
8 The machine we used for the tests was an Intel(R) Xeon(TM) CPU 2.40GHz, with 1GB of RAM.
9 The formalisation of the test problems is available from http://www.ing.unibs.it/∼serina/adaptation/adjust-problems.tar
[Figure 4 plots (CPU time in seconds, logscale, per problem variant); variants solved — Rocket_b: ΩC 30, ΩF 30, IPP 24; Logistics_c: ΩC 45, ΩF 45, IPP 20; Gripper_12: ΩC 30, ΩF 30, IPP 9.]
Figure 4: CPU-seconds (logarithmic scale) required by Adj and IPP for solving variants of
Rocket b, Logistics c, and Gripper 12.
[Figure 5 plots (CPU time in seconds, logscale, per problem variant); variants solved — Logistics70 (log70): ΩC 175, ΩF 175, FF 175; DriverLog pfile20: ΩC 146, ΩF 61, FF none; Zenotravel pfile20: ΩC 96, ΩF 47, FF 87.]
Figure 5: CPU-seconds (logarithmic scale) required by Adj and corresponding generation
CPU-time using FF for the variants of Logistics70, DriverLog Strips pfile20 and Zenotravel
Strips Handcoded pfile20.
is that, very often, the planning subproblems defined by the replanning windows are
much easier than the full problem considered by IPP, and they can be solved by
shorter (sub)plans. In general, the performance of Adj depends on the number of
replanning windows considered, on their size, and on the complexity of the corresponding
replanning problems; however, we observed that the first two factors seem less
crucial than the third one.
From the results in Figure 4 we can also observe that, for this experiment, the
performances of Adj with ΩC and with ΩF are comparable. We believe that the main
reason here is the simplicity of the test problems, for which the use of ΩF is sufficiently
effective. On the other hand, we observe that the overhead due to the computation of
causal links in ΩC does not slow down the overall performance.
The problems used for the experimental analysis presented in the next part of this
section are variants of hard problems from the 2nd and 3rd IPCs. None of these
problems can be solved by IPP within the CPU-time and memory limits imposed in
our experiments. Since FF was the winner of the 2nd IPC (domain-independent track),
for these problems we used FF both as the reference planner for the replanning from
scratch and as the planner for solving the local replanning problems in Adj. For this
experiment, the mutually exclusive relations used by Adj were computed by running the
DISCOPLAN system [6]. Figure 5-a shows the results for several variants of the very
hard problem logistics-70 (from the “additional problems track” of the 2nd IPC). We
can observe that plan adaptation is about two orders of magnitude faster than planning
from scratch. The performances of ΩC and ΩF are again very similar; sometimes the less
accurate ΩF heuristic is even more effective than the ΩC heuristic, but we observed that
often the solutions generated using ΩF are not as good as those obtained using ΩC .
Figure 5-b gives results for the DriverLog domain from the 3rd IPC [14]. We created
a set of variants of the pfile20 problem (the most difficult STRIPS problem of
the “standard track” of the 3rd IPC). These modifications concern the initial and/or
final positions of the involved drivers and packages. FF solved none of these problem
variants within the 3600 CPU-second limit. Unlike in the previous cases, in
this experiment we observed that the ΩC goal sets are very useful compared to the ΩF
goal sets: Adj with ΩC solves 146 of the 175 problem variants, while Adj with ΩF solves
only 61 variants. Moreover, Adj with ΩC is generally faster than with ΩF , and it finds
solutions of better quality. Interestingly, ΩF can solve two problems that ΩC cannot
solve. We think the reason is that Adj with ΩC can remove some useless actions
from the plan; this allows the system to find better-quality solutions, but for these
problems it makes processing the final replanning windows more difficult.
Figure 5-c gives results for a set of variants of the pfile20 “handcoded” problem in
the Zenotravel STRIPS domain (the hardest problem in this domain, originally designed
for testing the performance of domain-dependent planners). Here again we can observe
a generally good behavior of ΩC , with which Adj solves 96 of the 105 problem variants,
while with ΩF it solves only 47 variants. FF solves 87 variants (using at most 1800
CPU-seconds). Overall, Adj with ΩC is one or two orders of magnitude faster than
FF, although the first solution produced by Adj can contain more actions than the
corresponding solution produced by FF.
As previously observed, the first plan computed by Adj can involve a number
of time steps or actions that is higher than necessary. In Figure 6, we give some
experimental results on the effectiveness of the plan optimisation phase of Adj.
On the x-axis, we have the CPU-seconds for finding a solution; on the y-axis, we have
plan qualities, in terms of either number of time steps (levels) or number of actions, for
the various solutions that are incrementally generated during the optimisation process.
The curves in Figure 6 indicate that Adj very quickly computes a first solution
that is worse than the optimal one. The optimisation process can then
incrementally derive additional solutions of better quality, up to a solution that, in
terms of the number of time steps or actions involved, is similar to the optimal one. For
example, for variant 24 of Logistics_a, Adj computes a first solution involving 16
time steps in 1.24 seconds. The optimization process then derives additional solutions
involving 15, 14, 13, 12, and 11 time steps, using 1.8, 4.2, 142, 1599, and 1800 CPU-seconds,
respectively, while IPP computes an optimal solution in 1430 CPU-seconds.
Moreover, Figure 6 also compares the performance of FF and of Adj with either
ΩC or ΩF . Although usually the first solution computed by Adj is not better than
the one generated by FF, we observe that, for the test problems considered in this
[Figure 6 plots: time steps (levels) vs. CPU-seconds for Logistics_a variant 24 (ADJUST-PLAN, IPP, FF), and number of actions vs. CPU-seconds for the log70-I1-G1-n4 and log70-I5-G1-n1 variants (ΩC , ΩF , FF).]
Figure 6: Optimisation phase for variant num. 24 of Logistics a considering Adj vs. IPP and
variants 34 and 151 of Logistics70 Untyped considering Adj vs. FF
experiment, Adj quickly generates additional plans, deriving a solution that is better
than the solution computed by FF.
4 Conclusions and Future Work
We have presented a method and some heuristics for efficient domain-independent
plan adaptation that are implemented in the Adj system. An experimental analysis
indicates that solving a planning problem by adapting an existing plan for a similar
problem using Adj can be dramatically more efficient than replanning from scratch
using two well-known planning approaches.
In the literature, other methods and systems for plan adaptation have been pro-
posed, such as priar [11], prodigy/analogy [20] and spa [9]. These systems use
underlying planning algorithms and data structures that are significantly different from
those implemented in Adj. Another important difference concerns the input informa-
tion. While Adj uses a very simple and general plan description that can be easily
generated from the output of many planners (or even written by hand), the other men-
tioned systems use specific additional information (such as the “refinement” decisions
taken during the search process in spa) constraining the input plan to be a plan gen-
erated by a specific planner. Other recent related systems are Far-off [17] and van
der Krogt & de Weerdt’s system [19], each of which is based on techniques that are
significantly different from Adj’s techniques.
We are currently studying a new version of Adj using weaker but computationally
more efficient data structures, as well as an extension to support numeric and temporal
domains [14]. Other current and future work includes further techniques for determin-
ing replanning goals, and additional experiments, including, in particular, a comparison
with the performance of other related plan adaptation systems. (A preliminary compar-
ison with the two most related systems mentioned above indicates that Adj performs
more efficiently.) Moreover, we are studying the integration of LPG [5] and SGPlan
[21] into Adj for solving the problems associated with Adj’s replanning windows.
Finally, we intend to develop a complete case-based planning system, which selects
an existing plan from a library and adapts it to solve the current problem using Adj.
Since adapting an existing plan is PSPACE-complete [15], using an exact general
method to decide whether adapting a plan from the library gives a computational
advantage over complete replanning from scratch seems practically infeasible. However,
this question can be addressed by using heuristic methods such as the polynomial
technique presented in [17].
References
[1] A. Blum and M.L. Furst. Fast planning through planning graph analysis. Artificial Intel-
ligence, 90:281–300, 1997.
[2] B. Drabble. Excalibur — A program for planning and reasoning with processes. Artificial
Intelligence, 62(1):1–40, 1993.
[3] G. Ferguson and J. Allen. Towards a mixed-initiative planning assistant. In Proceedings
of AIPS-96, 1996.
[4] M. Fox and D. Long. The detection and exploitation of symmetry in planning problems.
In Proceedings of IJCAI-99, pages 956–961, 1999.
[5] A. Gerevini, A. Saetti, and I. Serina. Planning through stochastic local search and temporal
action graphs. Journal of Artificial Intelligence Research (JAIR), 20:239–290, 2003.
[6] A. Gerevini and L. Schubert. Inferring state constraints in DISCOPLAN: Some new results.
In Proceedings of the AAAI-00, pages 761–767. AAAI press/The MIT Press, 2000.
[7] A. Gerevini and I. Serina. Fast planning through greedy action graphs. In Proceedings of
the AAAI-99, pages 503–510. AAAI Press/MIT Press, July 1999.
[8] A. Gerevini and I. Serina. Fast plan adaptation through planning graphs: Local and sys-
tematic search techniques. In Proceedings of the AIPS-00, pages 112–121. AAAI Press/MIT
Press, 2000.
[9] S. Hanks and D.S. Weld. A domain-independent algorithm for plan adaptation. Journal
of Artificial Intelligence Research (JAIR), 2:319–360, 1995.
[10] J. Hoffmann. FF: The fast-forward planning system. AI Magazine, 22(3):57–62, 2001.
[11] Subbarao Kambhampati and James A. Hendler. A validation-structure-based theory of
plan modification and reuse. Artificial Intelligence, 55:193–258, 1992.
[12] H.A. Kautz and B. Selman. Pushing the envelope: Planning, propositional logic, and
stochastic search. In Howard Shrobe and Ted Senator, editors, Proceedings of the AAAI-
96, pages 1194–1201. AAAI Press, 1996.
[13] J. Koehler, B. Nebel, J. Hoffmann, and Y. Dimopoulos. Extending planning graphs to
an ADL subset. In Fourth European Conference on Planning (ECP’97), pages 273–285.
Springer Verlag, 1997.
[14] D. Long and M. Fox. The 3rd international planning competition: Results and analysis.
Journal of Artificial Intelligence Research (JAIR), Forthcoming, 2003.
[15] B. Nebel and J. Koehler. Plan reuse versus plan generation: A complexity-theoretic
perspective. Artificial Intelligence, 76:427–454, 1995.
[16] J.S. Penberthy and D.S. Weld. UCPOP: A sound, complete, partial order planner for
ADL. In Proceedings of KR’92, pages 103–114, Boston, MA, 1992. Morgan Kaufmann.
[17] Flavio Tonidandel and Marcio Rillo. The far-off system: A heuristic search case-based
planning. In Proceedings of AIPS, pages 302–311. AAAI Press, 2002.
[18] D.R. Traum, J.F. Allen, G. Ferguson, P.A. Heeman, Chung-Hee Hwang, T. Kato, N.
Martin, M. Poesio, and L. K. Schubert. Integrating natural language understanding and
plan reasoning in the TRAINS-93 conversation system. In Working Notes of the AAAI
Spring Symposium on Active NLP, pages 63–67, Stanford, CA, 1994.
[19] R. van der Krogt and M. de Weerdt. Plan repair as an extension of planning. In Proceedings
of ICAPS’05, pages 161–170. AAAI, 2005.
[20] M. Veloso. Planning and learning by analogical reasoning, volume 886 of Lecture Notes in
Artificial Intelligence and Lecture Notes in Computer Science. Springer-Verlag Inc., New
York, NY, USA, 1994.
[21] C.-W. Hsu, B.W. Wah, R. Huang, and Y.X. Chen. New features in SGPlan for handling soft
constraints and goal preferences in PDDL3.0. In Abstract Booklet of the Fifth International
Planning Competition, ICAPS’06, 2006.