<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Heuristics for Numeric Planning: A Preliminary Study</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Valerio Borelli</string-name>
          <email>valerio.borelli@unibs.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alfonso Emilio Gerevini</string-name>
          <email>alfonso.gerevini@unibs.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Enrico Scala</string-name>
          <email>enrico.scala@unibs.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ivan Serina</string-name>
          <email>ivan.serina@unibs.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <kwd-group>
          <kwd>Heuristic Search</kwd>
          <kwd>Automated Planning</kwd>
          <kwd>Mixed Discrete and Continuous Domains</kwd>
        </kwd-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Sapienza University of Rome</institution>, <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Brescia</institution>, <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Heuristic search is a key technique in almost all types of automated planning approaches. Various works have shown that black-box approaches, such as neural networks and deep neural networks, can be used to learn a heuristic competitive with the state-of-the-art heuristics for classical planning problems [1, 2, 3]. However, little to no work has been done on numeric planning problems. In our work we investigate whether similar methods can also be applied to numeric planning problems, and how they can be improved in a numeric planning context.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>1.1. Planning Framework</title>
      <p>
        The starting point for our work has been how the model defined in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] can be applied to a numeric planning setting.
The model takes a state as input and returns a heuristic value as output; such a value can be
used as a heuristic function for a specific planning problem.
      </p>
      <p>We will now briefly discuss all the components that must be defined.
A numeric planning problem is a tuple Π = &lt;V, s0, A, G&gt;, where V is the set of variables, which
can be either propositional or numeric, s0 is the initial state of the problem, A is a set of actions,
either numeric or propositional, and G is the goal of the problem, which consists of a set of
propositional and numeric goal conditions.</p>
      <p>An action a ∈ A
is a pair &lt;pre(a), eff(a)&gt;, in which pre(a) represents the set of preconditions
that must be verified in a state s before executing the action, and eff(a) represents the set of
effects of the action, which are applied to the current state to reach the next state.
A plan π = &lt;a1, ..., an&gt; is a sequence of actions, and is considered a solution for the planning
problem if applying the sequence of actions, starting from the initial state of the problem, leads
to a final state in which all the goals of the problem are satisfied.</p>
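      <p>The action model (preconditions and effects) and the notion of a plan being a solution can be made concrete with a small sketch. The following Python snippet is purely illustrative: the state representation and function names are our own assumptions, not the implementation used in this work.</p>

```python
# Illustrative sketch of the action model (preconditions/effects) and of
# checking whether a plan is a solution. All names here are hypothetical.

def is_applicable(state, action):
    """An action is applicable if all its preconditions hold in the state."""
    return all(pre(state) for pre in action["pre"])

def apply_action(state, action):
    """Applying an action's effects to the current state yields the next state."""
    new_state = dict(state)
    for eff in action["eff"]:
        eff(new_state)  # each effect updates a copy of the state
    return new_state

def validates(plan, initial_state, goals):
    """A plan is a solution if, executed from the initial state, every action
    is applicable in turn and the final state satisfies all goal conditions."""
    state = dict(initial_state)
    for action in plan:
        if not is_applicable(state, action):
            return False
        state = apply_action(state, action)
    return all(goal(state) for goal in goals)
```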
    </sec>
    <sec id="sec-3">
      <title>1.2. Model of the network</title>
      <p>
        The starting model for the neural network, as previously said, is the one defined in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
The model is a multi-label classification network [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], in which the inputs of the network are states
extracted from valid plans. The outputs are represented with a unary encoding, which
amounts to a multi-label approach with ordered labels [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], and represent heuristic
values.
      </p>
      <p>Regarding hyperparameters, the model has three hidden layers, with a number of neurons that
scales in equal-size steps from the input layer size to the output layer size. Therefore, the number of
trainable weights is strictly correlated with the number of non-static facts of the planning problem
and the maximum heuristic value that we want as output.</p>
      <p>All the layers in the network use sigmoid activation functions, without regularization.</p>
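      <p>The layer sizing and the unary output encoding described above can be sketched as follows; both helper names are hypothetical, and the exact interpolation used in the original model may differ.</p>

```python
def hidden_layer_sizes(n_input, n_output, n_hidden=3):
    """Three hidden layers whose sizes step down in equal increments
    from the input layer size to the output layer size."""
    step = (n_input - n_output) / (n_hidden + 1)
    return [round(n_input - step * (i + 1)) for i in range(n_hidden)]

def unary_encode(h, h_max):
    """Unary (ordered multi-label) target: output unit i is on iff h > i,
    so a heuristic value of 3 with h_max = 5 becomes [1, 1, 1, 0, 0]."""
    return [1 if h > i else 0 for i in range(h_max)]
```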
    </sec>
    <sec id="sec-4">
      <title>1.3. Creation of the samples</title>
      <p>
        In order to generate enough samples to train the neural network, for each task we use a
random walk to generate new initial states for that task. Then, for each initial state generated
in this way, the teacher search, which in our case is ENHSP [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] using GBFS as search algorithm
and h<sup>add</sup> [7] as heuristic function, solves the task. For each plan found this way, one state is
selected randomly and used as a sample for the neural network.
      </p>
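      <p>The random-walk generation of new initial states, and the random selection of one sample state per plan, can be sketched as below; the function names and the flat successor interface are assumptions for illustration, not ENHSP's API.</p>

```python
import random

def random_walk_initial_state(state, applicable_actions, apply_action, length,
                              rng=random):
    """Produce a new initial state by applying `length` randomly chosen
    applicable actions starting from the original initial state."""
    for _ in range(length):
        actions = applicable_actions(state)
        if not actions:
            break  # dead end: stop the walk early
        state = apply_action(state, rng.choice(actions))
    return state

def sample_state(plan_states, rng=random):
    """Pick one training sample at random among the states along a plan."""
    return rng.choice(plan_states)
```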
      <p>Greedy best-first search (GBFS) [Doran and Michie, 1966] is a suboptimal search
algorithm. It takes a greedy approach to explore the state space, always prioritizing the node with
the lowest heuristic value.
h<sup>add</sup> [8] is an inadmissible heuristic that considers all the subgoals of a problem to be independent.
In our case we use a version of h<sup>add</sup> generalized to numeric planning problems [7].</p>
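      <p>A minimal greedy best-first search can be sketched as follows; this is a generic GBFS over an abstract successor function, written for illustration, not ENHSP's implementation.</p>

```python
import heapq

def gbfs(initial, is_goal, successors, h):
    """Greedy best-first search: always expand the open node with the
    lowest heuristic value, ignoring accumulated path cost."""
    frontier = [(h(initial), initial)]
    parent = {initial: None}  # state -> (predecessor, action), None for the root
    while frontier:
        _, state = heapq.heappop(frontier)
        if is_goal(state):
            plan = []
            while parent[state] is not None:  # walk back to the initial state
                state, action = parent[state]
                plan.append(action)
            return list(reversed(plan))
        for action, nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = (state, action)
                heapq.heappush(frontier, (h(nxt), nxt))
    return None  # no plan found
```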
    </sec>
    <sec id="sec-5">
      <title>1.4. Preliminary Results</title>
      <p>Preliminary results on the previously described models show that, compared to classical
planning, the heuristics learned this way perform much worse than traditional numeric
planning heuristics.</p>
      <p>In particular, while in classical planning the learned heuristics, despite being slower than
traditional ones, still manage to expand a smaller number of nodes, in the numeric case this
difference does not exist with the current model.</p>
      <p>Preliminary experiments have been conducted on two numeric domains: block grouping and
counters [9]. The first consists of grouping blocks of the same colour in the same cell, while the
latter aims to find a specific numeric value for each defined counter, with the purpose of satisfying
all the numeric constraints between the counters.</p>
      <p>The results, as said before, show that the described model cannot learn a competitive heuristic.
In particular, in the case of block grouping, the learned heuristic works worse than h<sup>add</sup> both in
terms of time and in terms of expanded nodes. In the case of counters the results are even
worse: the model systematically fails to predict the heuristic value, and therefore,
using the resulting heuristic, the planner often cannot find a plan.</p>
    </sec>
    <sec id="sec-6">
      <title>2. Future Work</title>
      <p>In order to assess whether what was done for classical planning can also be achieved for numeric
planning, there are two possible directions, complementary to each other.</p>
      <p>The first direction is to understand whether the model can be improved, first and foremost via
hyperparameter optimization.</p>
      <p>Given that the starting model was built to work with boolean values, it is natural to think that a
similar model with numeric values could have very different best configurations.
Furthermore, we have no guarantee that a multi-label classification network is still the best
solution. Different approaches may work better in this case, such as a regression model, or a
multi-class classification model with a one-hot encoding to interpret the output.</p>
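      <p>For comparison with the unary encoding, a one-hot target treats each heuristic value as a separate class; the helpers below are a hypothetical sketch of this alternative, not something taken from the current model.</p>

```python
def one_hot_encode(h, h_max):
    """One-hot target: exactly one active class per heuristic value,
    with values above h_max clipped to the last class."""
    target = [0] * (h_max + 1)
    target[min(h, h_max)] = 1
    return target

def decode_one_hot(outputs):
    """Interpret the network output as the class with the highest score."""
    return max(range(len(outputs)), key=outputs.__getitem__)
```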
      <p>Lastly, different approaches to generating the samples might be taken into consideration, such as
taking as samples all the states of the generated plans.</p>
      <p>The second direction is to find numeric problems in which traditional heuristics struggle to guide the
planner correctly. For example, one domain that is particularly challenging for traditional
heuristics is called Settlers [10]. This domain revolves around the management of resources,
with the objective of constructing a variety of structures in specific locations. Its difficulty lies
in the fact that actions often have complex interactions and dependencies, which traditional
heuristics often fail to detect, due to the approximations on which they are based.</p>
      <p>[7] E. Scala, P. Haslum, S. Thiébaux, et al., Heuristics for numeric planning via subgoaling
(2016).</p>
      <p>[8] P. Haslum, H. Geffner, Admissible heuristics for optimal planning, in: AIPS,
2000, pp. 140–149.</p>
      <p>[9] G. Frances, H. Geffner, Modeling and computation in planning: Better heuristics from
more expressive languages, in: Proceedings of the International Conference on Automated
Planning and Scheduling, volume 25, 2015, pp. 70–78.</p>
      <p>[10] D. Long, M. Fox, The 3rd international planning competition: Results and analysis, Journal
of Artificial Intelligence Research 20 (2003) 1–59.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>P.</given-names>
            <surname>Ferber</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Helmert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hoffmann</surname>
          </string-name>
          ,
          <article-title>Neural network heuristics for classical planning: A study of hyperparameter space</article-title>
          ,
          <source>in: ECAI</source>
          <year>2020</year>
          , IOS Press,
          <year>2020</year>
          , pp.
          <fpage>2346</fpage>
          -
          <lpage>2353</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>R.</given-names>
            <surname>Karia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Srivastava</surname>
          </string-name>
          ,
          <article-title>Learning generalized relational heuristic networks for modelagnostic planning</article-title>
          ,
          <source>in: Proceedings of the AAAI Conference on Artificial Intelligence</source>
          , volume
          <volume>35</volume>
          ,
          <year>2021</year>
          , pp.
          <fpage>8064</fpage>
          -
          <lpage>8073</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>W.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Trevizan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Thiébaux</surname>
          </string-name>
          ,
          <article-title>Learning domain-independent planning heuristics with hypergraph networks</article-title>
          ,
          <source>in: Proceedings of the International Conference on Automated Planning and Scheduling</source>
          , volume
          <volume>30</volume>
          ,
          <year>2020</year>
          , pp.
          <fpage>574</fpage>
          -
          <lpage>584</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>G.</given-names>
            <surname>Tsoumakas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Katakis</surname>
          </string-name>
          ,
          <article-title>Multi-label classification: An overview</article-title>
          ,
          <source>International Journal of Data Warehousing and Mining (IJDWM) 3</source>
          (
          <year>2007</year>
          )
          <fpage>1</fpage>
          -
          <lpage>13</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Cheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Pollastri</surname>
          </string-name>
          ,
          <article-title>A neural network approach to ordinal regression</article-title>
          ,
          <source>in: 2008 IEEE international joint conference on neural networks (IEEE world congress on computational intelligence)</source>
          , IEEE,
          <year>2008</year>
          , pp.
          <fpage>1279</fpage>
          -
          <lpage>1284</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>E.</given-names>
            <surname>Scala</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Haslum</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Thiébaux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ramirez</surname>
          </string-name>
          ,
          <article-title>Interval-based relaxation for general numeric planning</article-title>
          ,
          <source>in: ECAI</source>
          <year>2016</year>
          , IOS Press,
          <year>2016</year>
          , pp.
          <fpage>655</fpage>
          -
          <lpage>663</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>