An Empirical Analysis of Some Heuristic Features for
                  Planning with Local Search in LPG?

                       Alfonso E. Gerevini1 , Alessandro Saetti1 , and Ivan Serina2
               1
                   DEA, Università degli Studi di Brescia, via Branze 38, 25123 Brescia, Italy
                               {gerevini,saetti}@ing.unibs.it
           2
               Free University of Bozen – Bolzano, Viale Stazione 16, 39042 Bressanone, Italy
                                      ivan.serina@unibz.it


                                                       Abstract
            LPG is a planner that performed very well in the third and fourth International plan-
        ning competitions. The system is based on a stochastic local search procedure, and it
        incorporates several heuristic features. In this paper we experimentally analyze the most
        important of them with the goal of understanding and evaluating their impact on the per-
        formance of the planner. In particular, we examine three heuristic functions for evaluating
        the search neighborhood and some settings of the “noise” parameter, that randomizes the
        next search step for escaping from local minima. Moreover, we present and analyze ad-
        ditional heuristic techniques for restricting the search neighborhood and for selecting the
        next inconsistency to handle. The experimental results show that the use of such tech-
        niques significantly improves the performance of the planner.


1       Introduction
The results of the 3rd and 4th International planning competitions [13, 12] showed that LPG
is an efficient planner for PDDL2.2 domains [7, 8, 9]. The system is based on a stochastic
local search procedure, called Walkplan, that is similar to the well-known Walksat procedure
for solving SAT problems [15].
     As in any local search scheme, the definition of the search neighborhood (the set of possible
successor states) and the heuristic function for evaluating its elements are crucial features for
the effectiveness of Walkplan. When the number of the elements in the neighborhood is high,
its evaluation can be computationally expensive, and a technique for pruning some elements
can be very effective. Moreover, in an iterative-repair approach, the strategy to select the
next flaw to handle (inconsistency in LPG, unsatisfied clause in Walksat) may also affect the
performance of the search.
     In order to escape from local minima, in Walkplan as in Walksat, if every element in the
neighborhood is worse than the current state (according to an heuristic function), then with
some probability (called “noise”) an element of the neighborhood is randomly chosen, instead
of selecting the best one. In general, the value of the noise can significantly affect the per-
formance of the search. In LPG the noise value can be either statically set by the user, or
automatically set to an initial default value that is dynamically changed during search.
     This paper has two main contributions:
    ?
    This paper has been already published in the Proceedings of the Fourteenth International Conference on Auto-
mated Planning & Scheduling (ICAPS-04).


Proceedings of the 16th International RCRA workshop (RCRA 2009):
Experimental Evaluation of Algorithms for Solving Problems with Combinatorial Explosion
Reggio Emilia, Italy, 12 December 2009
        • we propose some techniques for effectively restricting the search neighborhood of Walk-
          plan, and for selecting the next inconsistency to handle;

        • we experimentally analyze the main heuristic features for local search in LPG with the
          goal of understanding and evaluating their impact on the performance of the planner.

    In addition to the techniques for neighborhood restriction and inconsistency selection, we
analyze three heuristic functions for evaluating the neighborhood elements, that we introduced
in previous work [6, 7, 8], and the noise setting. We focus our analysis on simple STRIPS
domains.
    The second section gives the necessary background on LPG and Walkplan; the third section
presents the techniques for the neighborhood restriction and the inconsistency selection; the
fourth section presents and discusses the results of our experimental analysis; finally the last
section gives the conclusions.


2        Background on LPG
In this section, we give an overview of LPG’s plan representation, search space, search algo-
rithm and main heuristic features [6, 7, 8].

2.1       Plan Representation: Linear Action Graphs
In our framework, plans are represented through action graphs [6] that are particular subgraphs
of planning graphs [1]. Given a planning graph G for a planning problem Π, an action graph
(A-graph) for Π is a subgraph A of G such that, if a is an action node of G in A, then also the
fact nodes of G corresponding to the preconditions and positive effects of a are in A, together
with the edges connecting them to a. An action graph can contain some inconsistencies, i.e.,
action preconditions that are not supported, or pairs of action nodes involved in mutex rela-
tions.1 A precondition node q at a level i is supported in an action graph A of G if in A there is
an action node (or a no-op node) at level i − 1 representing an action with a positive effect q.2
An action graph without inconsistencies represents a valid plan that we call a solution graph.
    In the current version of LPG, plans for STRIPS problems are represented through a subclass
of A-graphs called linear action graph with no-ops propagation (LA-graphs).3 A linear action
graph A is an A-graph in which each action level contains at most one action node and any
number of no-op nodes. Moreover, if a is an action node of A at a level l, then, for any positive
effect e of a and any level l0 > l of A, the no-op node of e at a level l0 is in A, unless there
is another action node at a level l00 (l < l00 ≤ l0 ) that is mutex with the no-op node. In any
LA-graph, the only inconsistencies that the search procedure needs to manage explicitly are
the unsupported preconditions. Although an LA-graph cannot contain more than one action
    1
     LPG considers only pairs of actions that are globally mutex, i.e. that hold at every level of G [8].
    2
     We assume that the goal nodes of G (i.e., the problem goals) represent the preconditions of a special action
aend , which is the last action in any valid plan; the fact nodes of the first level of G (i.e., the initial facts of the
problem) represent the effects of a special action astart , which is the first action in any valid plan. astart and aend
belong to any A-graph of G.
   3
     In order to handle domains involving time and numerical variables (levels 2 and 3 of PDDL 2.1), LPG uses some
extensions of LA-graphs that, however, will not be considered in this paper.


                                                           2
per level, a plan with parallel actions (a partial order plan) can be derived from a solution
graph by considering the causal dependencies and mutex relations between the actions in the
graph. LA-graphs offer some advantages with respect to A-graph that are discussed in [8]. All
experiments presented in this paper were conducted using this representation.

2.2   Stochastic Local Search: Walkplan
Given a planning problem Π, LPG uses local search for searching a solution LA-graph in the
space of the LA-graphs for Π. The general scheme consists of two main steps. First we
construct an initial LA-graph; then we iteratively apply some graph modifications to transform
the initial LA-graph into a solution graph. In the current version of LPG, the default initial
graph is the empty LA-graph with the “fixed-point level” of the underlying planning graph as
the last level.
     At each search step, LPG selects an inconsistency (unsupported precondition) in the current
LA-graph. As will be shown, the strategy for selecting the next inconsistency to handle can
have a significant impact on the overall performance. In the next sections we will present and
analyze some possible strategies that are implemented in LPG.
     In order to resolve the selected inconsistency, we can either add an action node that supports
it, or we can remove an action node that is connected to that fact node by a precondition edge.
When we add an action node to a level l, the LA-graph is extended by one level, all action
nodes from l are shifted forward by one level, and the new action is inserted at level l (a more
detailed description is given in [8]).
     Given a linear action graph A and an inconsistency σ in A, the neighborhood N (σ, A) of A
for σ is the set of LA-graphs obtained from A by applying a graph modification that resolves
σ. At each step of the local search scheme, the elements of the neighborhood are evaluated
according to an heuristic function estimating their quality, and an element with the best quality
is then chosen as the next possible LA-graph (search state). Evaluating all elements in the
neighborhood can be computationally very expensive, because the neighborhood could contain
many LA-graphs and an accurate evaluation of each of them could require significant CPU-
time. For this reason, as we will show, it is important that the evaluation of the neighborhood
elements is combined with a technique that reduces its size.
     The default search strategy used by LPG is called Walkplan. Walkplan is similar to Walk-
sat, a well-known local search method for solving propositional satisfiability problems [15]. In
Walkplan the best element in the neighborhood is the LA-graph which has the lowest decrease
of quality with respect to the current LA-graph, i.e., it does not consider possible improve-
ments. Given an LA-graph A and an inconsistency σ, if there is a modification for σ that does
not decrease the quality of A, then the resulting LA-graph is chosen as the next LA-graph;
otherwise, with a probability p one of the graphs in N (σ, A) is randomly chosen, and with
probability 1 − p the next LA-graph is the best element in N (σ, A) according to an action
evaluation function E (an element with the lowest evaluation). As in Walksat, p is called noise
factor, and its value may have a significant impact on the search. When N (σ, A) contains more
than one graph with the best evaluation, LPG chooses randomly one of them. Finally, if after
a certain number of search steps (max steps) a solution LA-graph is not reached, the current
LA-graph is reinitialized, and the search is repeated up to a user-defined maximum number of
times (max restarts).


                                                3
2.3     Heuristic Evaluation of the Search Neighborhood
The neighborhood evaluation function has two parts evaluating: the search cost of an LA-graph
in the neighborhood (i.e., the number of search steps required to reach a solution graph); the
quality (or execution cost) of the (partial) plan represented by the LA-graph. In this section
we focus on the first part. At the end of the section we briefly describe how execution cost is
evaluated.
    In the design of a neighborhood evaluation function, there is an important tradeoff to con-
sider between accuracy of the evaluation and the computational cost of the evaluation. An
accurate evaluation of the elements in the neighborhood could lead to a valid plan within few
search steps. However, when the neighborhood contains many elements, the evaluation could
slow down the search excessively. On the other hand, a less accurate evaluation of the neigh-
borhood is faster to compute, but since it is less informative it could lead to a valid plan only
after many search steps. We developed three evaluation functions trying to identify an appro-
priate balance between informativeness and efficiency of their computation: E0 , EH and Eπ .
In the rest of this section we describe each of them, while in the section about the experimental
results we analyze their impact on the performance of Walkplan. For each E ∈ {E0 , EH , Eπ },
E(a, A)i is the evaluation of the LA-graph derived from the current graph A by adding the ac-
tion a to it (also called the “cost of adding a to A”). Similarly, E(a, A)r is the cost of removing
a from A.
    The simplest function that we consider was proposed in [6] and is defined as follows.
Heuristic function E0 :
      E0 (a, A)i = |P (a, A)| + |T hreats(a, A)|
      E0 (a, A)r = |U nsup(a, A)|
where Threats(a, A) is the set of supported precondition facts in A that become unsupported
by adding a to A; P (a, A) is the set of the precondition facts of a that are not supported in
A; Unsup(a, A) is the set of supported precondition facts in A that become unsupported by
removing a from A.
     While computing E0 in the context of LA-graphs is quite fast, a more accurate evaluation
could be more effective. In fact, it can be the case that, although the insertion of a new action
ai leads to fewer new unsupported preconditions than those introduced by an alternative action
aj , the unsupported preconditions of ai are more difficult to satisfy (support) than those of
aj in the context of the current partial plan. For this reason, subsequently we proposed two
alternative, more informative functions, EH [7] and Eπ [8].
     EH is a refinement of E0 in which, instead of just counting the number of the unsupported
preconditions of a and those in Unsup(a, A), we estimate the search cost of achieving them.
More precisely, EH is defined as follows.
Heuristic function EH :
      EH (a, A)i = M AX H(f, A) + |T hreats(a, A)|
                   f ∈P re(a)

      EH (a, A)r =      M AX          H(f, A − a)
                     f ∈U nsup(a,A)

where Pre(a) is the set of the preconditions of a and H(f, A) is the heuristic cost of supporting


                                                    4
f , which is recursively defined in the following way:
             
              0      if f is supported
             
                  0
                          f                             0
    H(f, A) = H(f , A) if a is no-op with precondition f
              M AX H(f 0 , A) + |T hreats(af , A)| + 1
             
             
                  f 0 ∈P re(af )

where

        af = ARGM IN E0 (a0 , A)i
                    
              {a0 ∈Af }

and Af is the set of action nodes of the underlying planning graph at the levels preceding f
that have f as one their effect nodes. Informally, the heuristic cost of an unsupported fact f is
determined by considering all the actions at a level preceding f whose addition would support
it. Among these actions, we choose the one with the best evaluation (af ) according to the
basic action evaluation function E0 . H(f, A) is recursively computed by summing the highest
heuristic cost of supporting a precondition f 0 of af in A (H(f 0 , A)) and the number of sup-
ported precondition facts in A that become unsupported by adding af to A (|Threats(af , A)|).
The last term “+1” takes account of the insertion of af to support f .
    Eπ [8] was designed to improve the accuracy of EH . It was used by LPG in the 3rd and 4th
planning competitions. In the evaluation of the addition of a new action a, Eπ estimates the
search cost of achieving all preconditions of a (Pre(a)) in the context of the current LA-graph,
while EH considers the maximum over the search costs of its preconditions. Moreover, instead
of just counting the number of action preconditions in A that would become unsupported when
adding a (Threats(a, A)), Eπ estimates the search cost of re-achieving such conditions after
the addition of a. More precisely, in Eπ the search costs are estimated by computing a relaxed
plan πr for achieving Pre(a) and Threats(a, A), and counting the number of actions in πr . In
addition, Eπ counts the number of the action preconditions in A that are subverted by an action
a0 in πr (Threats(a0 , A)). Eπ is formally defined as follows.
Heuristic function Eπ :
 Eπ (a, A)i = |π(a, A)i | +                               0
                               P
                                   a0 ∈π(a,A)i |T hreats(a , A)|

 Eπ (a, A)r = |π(a, A)r | +                               0
                                   P
                                   a0 ∈π(a,A)r |T hreats(a , A)|

where

   • π(a, A)i is an estimate of a minimal set of actions forming a relaxed plan achieving
     Pre(a) and Threats(a, A);

   • π(a, A)r is an estimate of a minimal set of actions forming a relaxed plan achieving
     Unsup(a, A).

    The plans of Eπ are relaxed because their validity do not consider the negative effects of
the actions. However, negative effects are considered in the heuristic selection of the actions
forming a relaxed plan (more details below). The initial state I(l) of the (relaxed) problem of
achieving either Pre(a) or Unsup(a, A) is the state obtained by applying the actions in A up


                                                   5
RelaxedPlan(G, I(l), A)
 Input: A set of goal facts (G), the set of facts that are true after executing the actions of the current
      LA-graph up to level l (I(l)), a possibly empty set of actions (A);
 Output: An estimated minimal set of actions required to achieve G.
1. G ←G−I(l);
        S       Acts ← A;
2. F ← a∈Acts Add(a);
3. while G − F 6= ∅
4. g ← a fact in G − F ;
5. bestact ← Bestaction(g);
6. Rplan ← RelaxedPlan(P re(bestact), I(l), Acts);
7. Acts ← S Rplan ∪ {bestact};
8. F ← a∈Acts Add(a);
9. return Acts.
Figure 1: Algorithm for computing a relaxed plan achieving a set of action preconditions from
the state I(l).


to level l−1 (ordered according to their corresponding levels). The initial state from which we
achieve Threats(a, A) is the state obtained by applying a to I(l).
     π(a, A)i and π(a, A)r are computed by an algorithm called RelaxedPlan (see Figure 1).
For π(a, A)i , RelaxedPlan is run twice, first to achieve Pre(a) and then to achieve Threats(a, A).
The set of actions identified by the first run is given as input to the second run, so that the re-
laxed subplan for Threats(a, A) can reuse the actions in the subplan for Pre(a).
     RelaxedPlan constructs a plan through a backward process from the input goal set G to the
input initial state. The action chosen to achieve a (sub)goal g, Bestaction(g), is determined by
considering for each fact f an estimate of the minimum number of actions required to achieve
f from I(l) (N um acts(f, l)). A detailed description of this reachability information and of
its computation is given in [8].
More formally, Bestaction(g) is defined as
                                                                   
                                                              0
       ARGMIN          MAX N um acts(p, l) + |T hreats(a , A)| ,
       {a0 ∈Ag }   p∈P re(a0 )−F

where F is the set of positive effects of the actions currently in the relaxed plan (Acts), and Ag
is the set of actions with the effect g and with all preconditions reachable from the initial state.
For a more detailed description of Bestaction, RelaxedPlan and Eπ , the interested reader may
see [8].
     Finally, we briefly describe the heuristic evaluation of the execution cost associated with the
plan represented by an LA-graph in the search neighborhood. The action evaluation function
is a normalized linear combination of the search cost and the execution cost, that for STRIPS
domains is defined as the number of actions in the plan. In E0i the execution cost is modeled
by a “+1” term, while in E0r we do not have this term; in EH it is modeled by a term equal to
the maximum depth of the tree of action nodes identified by H; finally, in Eπ it is modeled by
a term equal to the number of the actions in the relaxed plan [8].


                                                    6
3       Additional Heuristic Features
In this section we propose some additional heuristic features that have a significant impact
on the performance of LPG. In particular, we present techniques for restricting the search
neighborhood and selecting the next inconsistency to handle.

3.1     Neighborhood Restriction
In general, the effectiveness of a heuristic function evaluating the elements in the search neigh-
borhood can be significantly affected by the size of the neighborhood. If this is too large, an
accurate evaluation might require too much time, and a less accurate (but computationally more
efficient) function could perform better. Since LPG’s basic search neighborhood can be very
large, we considered some alternative restricted neighborhoods, and we compared the perfor-
mance of E0 , EH and Eπ using them. The results of this experiment are presented in the next
section. In this section, first we overview the basic neighborhood of LPG, and then we present
some techniques to restrict it.
     The basic neighborhood N (p, A) of an LA-graph A for an unsupported precondition node
p is the set of the LA-graphs that can be derived from A by adding an action node supporting
p, or by removing the action node with precondition p. Suppose that p is a precondition node
at a level l of A. In order to support p we can add an action a with positive effect p to any level
l0 ≤ l, provided that p can be propagated to l by adding a no-op for p to the levels between l0
and l. (The propagation is not possible if at any of such intermediate levels there is an action
node that is mutex with the no-op of p.) In the basic definition of N (p, A) for LA-graphs,
we have an LA-graph for each of these graph modifications. Moreover, N (p, A) contains the
LA-graph obtained by removing the action node of which p is a precondition.4
     While any restriction of the basic neighborhood makes its evaluation faster, clearly not
every restriction can speed up the search of a solution graph. In particular, we would like to
remove from the neighborhood only the bad elements (those requiring more search steps to
reach a solution graph).
     In order to select the elements forming a restricted search neighborhood, LPG pre-evaluates
the candidate elements of the basic neighborhood using the same method used in Eπ to select
the actions forming a relaxed plan. Specifically, if A0 is the LA-graph derived by adding an
action a0 to a level l of A, the quality of A0 is defined as the maximum over
        N um acts(q, l) + |T hreats(a0 , A)|

for every unsupported precondition q of a0 . Clearly, this evaluation is much faster to compute
than Eπ , but it is also less accurate.
    We considered four strategies for determining the number of the elements forming the
restricted neighborhood: NRk , NRk% , NRA , and NRL . They differ from the basic neighborhood
because they contain only a subset of the LA-graphs obtained by adding an action node to
the current LA-graph. NRk , NRk% and NRA exploit reachability information that is available
    4
     Since at any level there can be at most one action node (plus any number of no-ops), when we remove an
action node, the corresponding action level becomes “empty” (i.e., it contains only no-ops). If the LA-graph
contains adjacent empty levels, and in order to support p a certain action node can be added to any of these levels,
then N (p, A) contains only one of the resulting graphs.


                                                         7
when the heuristic evaluation function is Eπ ; NRL is a simpler technique that can be used in
combination with any of the heuristic functions that we have described.
    In NRk , N (p, A) contains only the k best pre-evaluated elements. Note that if there are
many candidate elements of good quality according to the pre-evaluation, it could be the case
that NRk removes elements that actually have better quality according to the heuristic evalua-
tion function.
    In NRk% , the search neighborhood contains only those elements, whose pre-evaluation is
worse than the best element by a factor less than or equal to k% (k is an input parameter
for both NRk and NRk% ). NRk% was designed to remove from the neighborhood only those
elements that the pre-evaluation considers significantly bad, and so to reduce the probability of
erroneously removing elements that the heuristic evaluation function would consider of good
quality.
    In NRA , for each action a with effect p, N (p, A) contains only one LA-graph (while the
basic neighborhood contains an LA-graph for any level where a can be added and p can be
propagated up to the level of the inconsistency). The best level where adding a is decided ac-
cording to the pre-evaluation above. If there is more than one element with the best evaluation,
NRA chooses the one where a is put at the highest level. NRA is the restriction used by LPG in
the third and fourth IPCs.
    NRL is a simplified version of NRA in which the best level where we add a is just the last
level where a can be added, i.e. the same level of the unsupported precondition p.
    Finally, we considered a fifth strategy in which we combine all three restriction techniques
that use reachability information. We call such a strategy NRall .

3.2    Inconsistency Selection
In partial-order causal-link (POCL) planning, the choice of the next inconsistency to repair
at each search step can significantly affect the performance of the search process (e.g., [5,
14]). Although the search methods of LPG and POCL-planners are radically different (and
hence we cannot directly import results from POCL planning into our framework), preliminary
experimental tests showed that this is the case also for LPG: if we change the basic random
selection strategy of Walkplan, the performance of the planner can be significantly different.5
Thus, we designed some alternative strategies with the aim of improving the performance of
the random strategy (indicated with Σrand ).
     In this section we introduce two of them, Σlif o and Σlev− , and in the next section we
evaluate their performance with respect to the default random strategy.
     Σlif o is a simple standard strategy that handles the inconsistencies according to a last-in
first-out discipline. Σlev− prefers the inconsistency at the lowest level t of the current LA-
graph; if more than one of such inconsistencies are present at level t, then Σlev− chooses
randomly one of them. The rationale of Σlev− is related to the definition of the evaluation
function Eπ , for which it was initially designed. As we have seen, Eπ (a, A)i is defined using a
relaxed plan πr for achieving the preconditions of a from the state I(l), where I(l) is the state
obtained by executing the actions at the levels preceding the level l of the selected inconsistency
starting from the initial state. Moreover, the actions forming πr are selected using reachability
   5
     The random selection of the next inconsistency to handle in Walkplan is the analogue of the random selection
of the next unsatisfied clause to handle in Walksat [15].


                                                       8
information for their preconditions that are dynamically computed from I(l). By removing the
inconsistency at the earliest level l of the current LA-graph, we guarantee that I(l) is the state
s reached by the actions at the levels preceding l. On the contrary, if we randomly select an
inconsistency at any level l, and there are other inconsistencies before l, then I(l) can be only
an estimate of s. Such an estimate could change in the next LA-graphs of the search (because
some actions are added/removed to deal with inconsistencies at levels before l). Hence, the
evaluation of Eπ (a, A)i at a certain search step could change radically at a next search step.
This could lead Eπ combined with Σrand to make incorrect evaluations more often than when
Eπ is combined with Σlev− . For similar reasons, we conjecture that also E0 and EH can
benefit from the use of Σlev− . In this paper we will restrict the experimental analysis of Σlev−
by considering only Eπ . Σlev− is the inconsistency selection strategy used by LPG in the third
IPC.


4       Experimental Results
In this section we present some experimental results illustrating the performance of the heuris-
tic features for LPG described in the previous sections. We will use the test problems of the
3rd IPC [13]. For lack of space we will show the experimental results for only the 102 prob-
lems of the S TRIPS domain variants (Depots, DriverLog, Rovers, Satellite and
Zenotravel).6 Similar experimental results were obtained for the temporal domain variants
of the 3rd IPC.
     LPG is an incremental planner, in the sense that it produces a sequence of valid plans, each
of which improves the quality of the previous ones. We tested LPG in terms of both the CPU-
time required to find a solution (LPG-speed) and the quality of the best plan computed by the
incremental process using five CPU-minutes (LPG-quality).7
     We compare different options of an heuristic feature using Friedman’s statistical test [4].
Note that when there are more than two options to compare, Friedman’s test is more accurate
than Wilcoxon’s test [17], that Long and Fox used to compare the performance of pairs of
planners [13].
     The first observation that we derived from our experimental results is that the assumptions
of the Friedman’s test procedure [4] are satisfied for every heuristic feature analyzed in this
paper.

Observation 1. The noise value, the heuristic evaluation of the search neighborhood, the
neighborhood restriction and the inconsistency selection are statistically meaningful features
for the performance of LPG.

    6
      These domains are described at www.dur.ac.uk/ d.p.long/competition.html. We have not con-
sidered the Freecell domain because currently LPG solves the problems of this domain only by using the alter-
native best-search mode [8].
    7
      Since Walkplan is a stochastic procedure, we ran LPG five times for problem tested. The tests were conducted
on a PIII Intel 866 MHz with 512 Mbytes of RAM. In every run the max steps parameter of Walkplan was set to
500, and it was automatically increased by 10% at each search restart.


                                                        9
                             Results using NoNR    E0    EH         Eπ     Eπ−T
                             Problems Solved     72.5%  86.3% 99.0%         87.3%
                             Speed Mean Rank      12.5    8.4      10.2      11.0
                             Quality Mean Rank    12.1   11.0       8.9      10.0
                                   CPU-time: F riedman(E0 , EH , Eπ , Eπ−T )


                                      EH        Eπ          Eπ−T       E0

                                                                            D = 1.9
                                  Plan Quality: F riedman(E0 , EH , Eπ , Eπ−T )


                                      Eπ        Eπ−T          EH       E0

                                                                            D = 1.3


Table 1: Results of Friedman’s test on the performance of Walkplan using four evaluation
functions (E0 , EH , Eπ , and Eπ−T ) without neighborhood restriction: percentage of problems
solved, CPU-time and plan quality ranks.

4.1    Heuristic Evaluation of the Search Neighborhood
In this section we comment on the performance of LPG with different heuristic evaluations of
the search neighborhood: E0 , EH , Eπ , and Eπ without considering the terms counting the
threats (Eπ−T ). Eπ−T is tested in order to show the importance considering the threats in the
evaluation of Eπ . Note that this is an important difference in the construction of the relaxed
plans computed by LPG and those computed by FF [10] and the first version of SAPA [2].8
    EH and Eπ are more accurate than E0 . On the other hand, as expected, we observed that
E0 is the fastest to compute, EH is slower than E0 , and Eπ is the slowest one.
    Table 1 shows the performance of different neighborhood evaluation functions in terms of
both CPU-time and plan quality when no restriction is imposed on the neighborhood. The
graphs in each table should be interpreted as follows. An edge connecting an option T to
another option T 0 indicates that T performs statistically better than T 0 . For instance, in Table
1 EH performs better than E0 in terms of CPU-time. If an edge connects a cluster of options
to a certain option T , as in Table 3, it means that any option in the cluster is statistically better
than T . Each graph is annotated with the corresponding D-value.

Observation 2. Without any restriction of the search neighborhood, in terms of CPU-time,
EH is statistically better than both E0 and Eπ−T , and Eπ is statistically better only than E0 .

The fact that Eπ does not perform better than EH is somewhat disappointing. It shows that
without restricting the search neighborhood, a more accurate but computationally more expen-
sive function does not pay off.

Observation 3. A complete examination of the search neighborhood using Eπ can be too
expensive.
   8
    The last version of SAPA [3] considers static mutex relations in order to improve the makespan of the relaxed
plan, and to check if the relaxed plan is a valid plan for the planning problem.


                                                       10
                              Heuristic   Speed Mean Rank   Quality Mean Rank
                              Function    NoNR     NRL      NoNR       NRL
                              E0           25.2     27.1     22.7      25.2
                              EH           17.7     20.6     20.3      25.1
                              Eπ           21.5     13.5     16.3      17.4
                              Eπ−T         23.3     15.2     18.5      18.7


Table 2: Mean ranks of CPU-time and plan quality using different heuristic evaluations of the
search neighborhood with/out neighborhood restriction (NRL /NoNR).

The previous observation suggests that the use of an accurate evaluation should be combined
with a restriction of the search neighborhood. In fact, if we combine the use of Eπ with a
restriction of the search neighborhood, we obtain a different picture. In particular, if we use
the simple restriction NRL introduced in the previous section, we have that Walkplan with Eπ
performs more efficiently than with any other evaluation function (see Table 2).
    Regarding plan quality, as expected, the most accurate heuristic evaluation functions (Eπ
and Eπ−T ) performs better than the other evaluation functions. Moreover, the results of Fried-
man’s test show that Eπ−T is worse than Eπ (see the mean ranks of tables 1 and 2), which
demonstrates the importance of taking threats into account.

Observation 4. Without any restriction of the search neighborhood, in terms of plan quality,
Eπ is statistically better than both EH and E0 , while Eπ−T is better than E0 .

4.2    Noise Value for Walkplan
It is well known that the performance of a stochastic local search procedure (like Walkplan)
can depend on the value of its noise parameter, that is used for escaping from local minima
[15]. In this section we empirically analyze the performance of Walkplan using different noise
values, including the special case in which the noise is set to zero. Since the impact of the
noise value may also depend on which heuristic function is used to evaluate the neighborhood,
the experimental analysis for the noise parameter was conducted using two heuristic functions:
the simplest function (E0 ) and the most accurate one (Eπ ).
     Tables 3 and 4 show the results of Friedman’s test for E0 and Eπ , respectively, using
different static values for the noise, as well as a dynamic value (Ndyn ) that is defined in the
following way. The search process starts with a low noise value and it considers the variance
of the number of inconsistencies in the last t visited LA-graphs. The idea is that a low variance
indicates the presence of a local minimum; in this case, the noise value is increased to facilities
escaping from the minimum. The variance is checked every t search step (in our tests we used
t = 25). If it is less than one, the noise value is increased by a certain factor (we used 1.5).
However, it can increase only up to a maximum value that depends on which neighborhood
evaluation function is used.9 If the variance is greater than one, the noise value is set to its
initial default value.

Observation 5. Using E0 , the dynamic noise performs statistically better than low noise val-
ues, and slightly better than high noise values, in terms of both CPU-time and plan quality.
   9
    Such maximum values were empirically determined by observing the performance of each evaluation function
using various noise values for all test problems considered in this paper.


                                                    11
                     Results using E0       N0     N10        N20       N50         Ndyn
                     Problems Solved        2%    59.8%      68.6% 64.7%            72.5%
                     CPU-time Mean Rank 19.1       14.5       11.9      10.0         9.5
                     Quality Mean Rank      19.1   12.9       11.5      11.2         10.3
                              CPU-time: Friedman(N0 , N10 , N20 , N50 , Ndyn )


                                Ndyn      N50        N20      N10       N0

                                                                                   D = 2.3
                            Plan Quality: Friedman(N0 , N10 , N20 , N50 , Ndyn )


                              Ndyn        N50      N20          N10          N0


                                                                                   D = 1.9


Table 3: Results of Friedman’s test on the performance of Walkplan using E0 with different
noise settings (0%, 10%, 20%, 50% and dynamic noise): percentage of problems solved, CPU-
time and plan quality ranks.

                      Results using Eπ   N0     N10       N20       N50       Ndyn
                      Problems Solved  93.1%   97.1% 94.1% 71.6%              98.0%
                      CPU-time Rank      9.7    10.5      13.7      20.9       10.2
                      Quality Rank       11.3   11.2      13.1      18.9       10.5
                            CPU-time: F riedman(N0 , N10 , N20 , N50 , Ndyn )


                                N0        Ndyn      N10         N20       N50

                                                                                  D = 2.6
                           Plan Quality: F riedman(N0 , N10 , N20 , N50 , Ndyn )


                             Ndyn         N10       N0           N20       N50


                                                                                  D = 1.9


Table 4: Results of Friedman’s test on the performance of Walkplan using Eπ with different
noise settings (0%, 10%, 20%, 50% and dynamic): percentage of problems solved, CPU-time
and plan quality ranks.


Observation 6. Using Eπ , low noise values and the dynamic noise perform statistically better
than high noise values, in terms of both CPU-time and plan quality.

The reason why the dynamic noise performs better than with the other static noise settings can
be intuitively explained by the fact that this method tends to make random choices only when
needed, i.e., only when the search has reached (or is near to) a local minimum. A high value of
the noise setting facilitates escaping from local minima, but obviously it could affect negatively
the performance by introducing unsupported preconditions that are difficult to achieve in the
context of the current LA-graph, as well as many threats. This determines an increment of the
total CPU-time and slows down the incremental process for finding good quality plans.

                                                    12
                    Results using Eπ        NoNR   NRL    NRA      NRk      NRk%       NRall
                    Problems Solved        92.2%   99% 98%        96.1% 98.0%          99.0%
                    CPU-time Mean Rank      21.3   12.9   13.2     14.8     17.7        13.0
                    Quality Mean Rank       17.1   18.6   14.9     13.9     16.0        12.5
                             CPU-time: Friedman(NoNR, NRL , NRA ,NRk ,NRk% ,NRall )


                           NRL      NRall      NRA         NRk       NRk%       NoNR

                                                                                      D = 3.3
                             Quality: Friedman(NoNR, NRL , NRA ,NRk ,NRk% ,NRall )


                           NRall      NRk       NRA        NRk%     NoNR        NRL


                                                                                      D = 2.2


Table 5: Results of Friedman’s test on the performance of Walkplan using Eπ with the neigh-
borhood restriction techniques (NoNR, NRL , NRA , NRk , NRk % and NRall ): percentage of
problems solved, CPU-time and plan quality ranks.

Observation 7. The performance of Walkplan using E0 with 0% of noise is statistically worse
than with the other noise values analyzed, in terms of both CPU-time and plan quality.
Observation 8. The performance of Walkplan using Eπ with 0% of noise is not statistically
different from the performance with low noise values, in terms of both CPU-time and plan
quality.
    The reason why Walkplan using Eπ performs better with low noise values seems mainly
related to the very good accuracy of the neighborhood evaluation of Eπ . In particular, we
experimentally observed that, while E0 can often lead to local minima, Eπ rarely does so.10
Hence, a high value of the noise when using Eπ can often “destruct” the search towards the so-
lution graph. On the contrary, when using E0 , performing random choices among the elements
of the neighborhood is more often useful to abandon portions of the search space containing
local minima.

4.3    Neighborhood Restriction
As we have seen, the effectiveness of Eπ can be improved by restricting the search neighbor-
hood using NRL . In this section we focus on Eπ , and we analyze the relative importance of the
neighborhood restriction options that we introduced in the previous section.
    Table 5 shows the results of Friedman’s test for Walkplan with no neighborhood restriction
(NoNR), NRL , NRA , NRk , NRk% , and NRall . For NRk we used k = 10, and for NRk% we used
k = 300.
Observation 9. In terms of CPU-time, Walkplan with any of the neighborhood restrictions
analyzed performs statistically better than with no restriction.
  10
   Hoffmann observed a similar behavior for the relaxed-plan heuristic used by FF, that he tested on different
domains [11]. Moreover, the heuristics of LPG and FF, as well as their search spaces, are different.


                                                      13
                           Results (Eπ with NRA ) Σrand     Σlev−      Σlif o
                           Problems Solved        97.1%     98.0%      94.1%
                           Speed Mean Rank         9.2        6.0         8.8
                           Quality Mean Rank       8.4        6.3         9.3
                               CPU-time: Friedman(Σrand , Σlev− , Σlif o )


                                     Σlev−        Σlif o   Σrand

                                                                       D = 1.4
                             Plan Quality: Friedman(Σrand , Σlev− , Σlif o )


                                     Σlev−       Σrand      Σlif o

                                                                       D = 0.9


Table 6: Results of Friedman’s test on the performance of Walkplan using Eπ with NRA and
one of the inconsistency selection strategies Σrand , Σlev− and Σlif o : percentage of problems
solved, CPU-time and plan quality ranks.


An accurate evaluation of every neighborhood element using Eπ is computationally expensive.
For this reason, Walkplan with a restricted neighborhood visits a portion of the search space
that can be significantly larger than the portion visited with no restriction in the same amount
of CPU-time. This makes Eπ more effective in terms of both CPU-time and plan quality.

Observation 10. NRall is statistically the best neighborhood restriction among those consid-
ered, in terms of both CPU-time and plan quality.

     The removal of some neighborhood elements can be harmful for the local search process,
because it can eliminate the only elements that could take the search away from a local mini-
mum in which it is trapped. NRk and NRk% appear to be more sensitive to this negative side
effect than the other restrictions considered.
     NRL performs badly in terms of plan quality. We believe that the reason for this behavior
is that NRL does not pre-evaluate the neighborhood elements. The restriction of NRL is fast to
compute, but it can easily exclude LA-graphs from which, within the same CPU-time, could
reach a solution of better quality.

4.4   Inconsistency Selection
In this section we present results concerning a comparison of three inconsistency selection
strategies for Walkplan. For lack of space we consider only the use of Eπ . The results that we
obtained using the other neighborhood evaluation functions are similar.
    Table 6 shows the results of Friedman’s test for Σlev− , Σrand , and Σlif o , from which we
can derive the following observation.

Observation 11. Walkplan with Eπ and Σlev− performs statistically better than with Eπ and
either Σrand , or Σlif o , both in terms of CPU-time and plan quality.


                                                  14
5    Conclusions
We have proposed some techniques for restricting the search neighborhood of Walkplan, and
for selecting the next inconsistency to handle. These techniques have been experimentally
evaluated together with some options for other important heuristic features of LPG (the noise
value and the neighborhood evaluation function).
    Our analysis is based on Friedman’s statistical test, and it shows that the features investi-
gated are very useful for improving the search in terms of required CPU-time or quality of the
solutions. In particular, restricting the search neighborhood using any of the techniques that
we have proposed is essential to exploit an accurate neighborhood evaluation function like Eπ .
Moreover, the analysis identifies options for the noise setting and the inconsistency selection
that perform better than others (Ndyn and Σlev− , respectively).
References
[1] Blum, A., and Furst, M. Fast planning through planning graph analysis. Artificial Intelligence.
   90:281–300. 1997.
[2] Do, M., and Kambhampati, S. Sapa: A domain-independent heuristic metric temporal planner. In
   Proc. of ECP-01. 2001.
[3] Do, M., and Kambhampati, S. Sapa: A scalable multi-objective heuristic metric temporal planner.
   JAIR 20:155–194. 2003.
[4] Friedman, M. The use of ranks to avoid the assumptions of normality implicit in the analysis of
   variance. JASA 32:675–701. 1937.
[5] Gerevini, A., and Schubert, L. Accelerating partial-order planners: some techniques for effective
   search control and pruning. JAIR 5:95–137. 1996.
[6] Gerevini, A., and Serina, I. Fast planning through greedy action graphs. In Proc. of AAAI-99. 1999.
   AAAI Press/MIT Press.
[7] Gerevini, A., and Serina, I. LPG: A planner based on local search for planning graphs with action
   costs. In Proc. of AIPS-02. 2002.
[8] Gerevini, A., Saetti, A., and Serina, I. Planning through stochastic local search and temporal action
   graphs in LPG. JAIR 20:239–290. 2003.
[9] Gerevini, A., Saetti, A., and Serina, I. An approach to temporal planning and scheduling in domains
   with predictable exogenous events. Journal of Artificial Intelligence Research (JAIR), 25:187–231,
   2006.
[10] Hoffmann, J., and Nebel, B. The FF planning system: Fast plan generation through heuristic
   search. JAIR:14:253–302. 2001.
[11] Hoffmann, J. Local search topology in planning benchmarks: an empirical analysis. In Proc. of
   IJCAI-01. 2002.
[12] Hoffmann, J., and Edelkamp, S. The deterministic part of IPC-4: An overview. JAIR, 24:519–579,
   2005.
[13] Long, D., and Fox, M. The 3rd international planning competition: Results and analysis. JAIR
   20:1–59. 2003.
[14] Pollack, M.E. and Joslin, D. and Paolucci, M. Flaw Selection Strategies for Partial-Order Planning.
   JAIR 6:223–262. 1997.
[15] Selman, B.; Kautz, H.; and Cohen, B. Noise strategies for improving local search. In Proc. of
   AAAI-94. 1994.
[16] Siegel, S.; Castellan J. Nonparametric Statistics for the Behavioral Sciences McGraw Hill. 1988.
[17] Ury, H., K. A comparison of four procedures for multiple comparison among means - pairwise
   contrast for arbitrary sample sizes. Tecnometrics 18:89–97. 1976.


                                                   15