<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Heuristics for Numeric Planning: A Preliminary Study</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Valerio Borelli</string-name>
          <email>valerio.borelli@unibs.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alfonso Emilio Gerevini</string-name>
          <email>alfonso.gerevini@unibs.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Enrico Scala</string-name>
          <email>enrico.scala@unibs.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ivan Serina</string-name>
          <email>ivan.serina@unibs.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <kwd-group>
          <kwd>Heuristic Search</kwd>
          <kwd>Automated Planning</kwd>
          <kwd>Mixed Discrete and Continuous Domains</kwd>
        </kwd-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Sapienza University of Rome</institution>, <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Brescia</institution>, <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Heuristic search is a key technique in almost all types of automated planning approaches. Various works have shown that black-box approaches, such as neural networks and deep neural networks, can be used to learn a heuristic competitive with the state-of-the-art heuristics for classical planning problems [1, 2, 3]. However, little to no work has been done on numeric planning problems. In our work we investigate whether similar methods can also be applied to numeric planning problems, and how they can be improved in a numeric planning context.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>1.1. Planning Framework</title>
      <p>
        The starting point for our work has been how the model defined in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] can be applied to a numeric planning setting.
The model takes a state as input and returns a heuristic value as output; such a value can be
used as a heuristic function for a specific planning problem.
      </p>
      <p>We will now briefly discuss all the components that must be defined.
A numeric planning problem is a tuple Π = &lt;V, s0, A, G&gt;, where V is the set of variables, which
can be either propositional or numeric, s0 is the initial state of the problem, A is a set of actions,
either numeric or propositional, and G is the goal of the problem, which consists of a set of
propositional and numeric goal conditions.</p>
      <p>An action a ∈ A
is a pair &lt;pre(a), eff(a)&gt;, in which pre(a) represents the set of preconditions
that must be verified in a state s before executing the action, and eff(a) represents the set of
effects of the action, which are applied to the current state to reach the next state.
A plan π = &lt;a1, ..., an&gt; is a sequence of actions, and is considered a solution for the planning
problem if applying the sequence of actions, starting from the initial state of the problem, leads
to a final state in which all the goals of the problem are satisfied.</p>
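      <p>The action model (preconditions and effects) and the notion of a plan being a solution can be made concrete with a small sketch. The following Python snippet is purely illustrative: the state representation and function names are our own assumptions, not the implementation used in this work.</p>

```python
# Illustrative sketch of the action model (preconditions/effects) and of
# checking whether a plan is a solution. All names here are hypothetical.

def is_applicable(state, action):
    """An action is applicable if all its preconditions hold in the state."""
    return all(pre(state) for pre in action["pre"])

def apply_action(state, action):
    """Applying an action's effects to the current state yields the next state."""
    new_state = dict(state)
    for eff in action["eff"]:
        eff(new_state)  # each effect updates a copy of the state
    return new_state

def validates(plan, initial_state, goals):
    """A plan is a solution if, executed from the initial state, every action
    is applicable in turn and the final state satisfies all goal conditions."""
    state = dict(initial_state)
    for action in plan:
        if not is_applicable(state, action):
            return False
        state = apply_action(state, action)
    return all(goal(state) for goal in goals)
```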
    </sec>
    <sec id="sec-3">
      <title>1.2. Model of the network</title>
      <p>
        The starting model for the neural network, as previously said, is the one defined in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
The model is a multi-label classification network [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], in which the inputs of the network are states
extracted from valid plans. The outputs are represented with a unary encoding, which
amounts to a multi-label approach with ordered labels [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], and represent heuristic
values.
      </p>
      <p>Regarding hyperparameters, the model has three hidden layers, with a number of neurons that
scales in equal-size steps from the input layer size to the output layer size. Therefore, the number of
trainable weights is strictly correlated with the number of non-static facts of the planning problem
and the maximum heuristic value that we want as output.</p>
      <p>All the layers in the network use sigmoid activation functions, without regularization.</p>
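      <p>The layer sizing and the unary output encoding described above can be sketched as follows; both helper names are hypothetical, and the exact interpolation used in the original model may differ.</p>

```python
def hidden_layer_sizes(n_input, n_output, n_hidden=3):
    """Three hidden layers whose sizes step down in equal increments
    from the input layer size to the output layer size."""
    step = (n_input - n_output) / (n_hidden + 1)
    return [round(n_input - step * (i + 1)) for i in range(n_hidden)]

def unary_encode(h, h_max):
    """Unary (ordered multi-label) target: output unit i is on iff h > i,
    so a heuristic value of 3 with h_max = 5 becomes [1, 1, 1, 0, 0]."""
    return [1 if h > i else 0 for i in range(h_max)]
```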
    </sec>
    <sec id="sec-4">
      <title>1.3. Creation of the samples</title>
      <p>
        In order to generate enough samples to train the neural network, for each task we use a
random walk to generate new initial states for that task. Then, for each initial state generated
in this way, the teacher search, which in our case is ENHSP [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] using GBFS as search algorithm
and h<sup>add</sup> [7] as heuristic function, solves the task. For each plan found this way, one state is
selected randomly and used as a sample for the neural network.
      </p>
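      <p>The random-walk generation of new initial states, and the random selection of one sample state per plan, can be sketched as below; the function names and the flat successor interface are assumptions for illustration, not ENHSP's API.</p>

```python
import random

def random_walk_initial_state(state, applicable_actions, apply_action, length,
                              rng=random):
    """Produce a new initial state by applying `length` randomly chosen
    applicable actions starting from the original initial state."""
    for _ in range(length):
        actions = applicable_actions(state)
        if not actions:
            break  # dead end: stop the walk early
        state = apply_action(state, rng.choice(actions))
    return state

def sample_state(plan_states, rng=random):
    """Pick one training sample at random among the states along a plan."""
    return rng.choice(plan_states)
```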
      <p>Greedy best-first search (GBFS) [Doran and Michie, 1966] is a suboptimal search
algorithm. It takes a greedy approach to explore the state space, always prioritizing the node with
the lowest heuristic value.
h<sup>add</sup> [8] is an inadmissible heuristic that considers all the subgoals of a problem to be independent.
In our case we use a version of h<sup>add</sup> generalized to numeric planning problems [7].</p>
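      <p>A minimal greedy best-first search can be sketched as follows; this is a generic GBFS over an abstract successor function, written for illustration, not ENHSP's implementation.</p>

```python
import heapq

def gbfs(initial, is_goal, successors, h):
    """Greedy best-first search: always expand the open node with the
    lowest heuristic value, ignoring accumulated path cost."""
    frontier = [(h(initial), initial)]
    parent = {initial: None}  # state -> (predecessor, action), None for the root
    while frontier:
        _, state = heapq.heappop(frontier)
        if is_goal(state):
            plan = []
            while parent[state] is not None:  # walk back to the initial state
                state, action = parent[state]
                plan.append(action)
            return list(reversed(plan))
        for action, nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = (state, action)
                heapq.heappush(frontier, (h(nxt), nxt))
    return None  # no plan found
```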
    </sec>
    <sec id="sec-5">
      <title>1.4. Preliminary Results</title>
      <p>Preliminary results on the previously described models show that, compared to classical
planning, the heuristics learned this way perform much worse than traditional numeric
planning heuristics.</p>
      <p>In particular, while in classical planning the learned heuristics, despite being slower than
traditional ones, still manage to expand a smaller number of nodes, in the numeric case this
difference does not exist with the current model.</p>
      <p>Preliminary experiments have been conducted on two numeric domains: block grouping and
counters [9]. The first consists of grouping blocks of the same colour in the same cell, while the
latter aims to find a specific numeric value for each defined counter, with the purpose of satisfying
all the numeric constraints between the counters.</p>
      <p>The results, as said before, show that the described model cannot learn a competitive heuristic.
In particular, in the case of block grouping, the learned heuristic works worse than h<sup>add</sup> both in
terms of time and in terms of expanded nodes. In the case of counters the results are even
worse: the model systematically fails to predict the heuristic value, and therefore,
using the resulting heuristic, the planner often cannot find a plan.</p>
    </sec>
    <sec id="sec-6">
      <title>2. Future Work</title>
      <p>In order to assess whether what was done for classical planning can also be achieved for numeric
planning, there are two possible directions, complementary to each other.</p>
      <p>The first direction is to understand whether the model can be improved, first and foremost via
hyperparameter optimization.</p>
      <p>Given that the starting model was built to work with boolean values, it is natural to think that a
similar model with numeric values could have very different best configurations.
Furthermore, we have no guarantee that a multi-label classification network is still the best
solution. Different approaches may work better in this case, such as a regression model, or a
multi-class classification model with a one-hot encoding to interpret the output.</p>
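      <p>For comparison with the unary encoding, a one-hot target treats each heuristic value as a separate class; the helpers below are a hypothetical sketch of this alternative, not something taken from the current model.</p>

```python
def one_hot_encode(h, h_max):
    """One-hot target: exactly one active class per heuristic value,
    with values above h_max clipped to the last class."""
    target = [0] * (h_max + 1)
    target[min(h, h_max)] = 1
    return target

def decode_one_hot(outputs):
    """Interpret the network output as the class with the highest score."""
    return max(range(len(outputs)), key=outputs.__getitem__)
```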
      <p>Lastly, different approaches to generating the samples might be taken into consideration, such as
taking as samples all the states of the generated plans.</p>
      <p>The second direction is to find numeric problems in which traditional heuristics struggle to guide the
planner correctly. For example, one domain that is particularly challenging for traditional
heuristics is called Settlers [10]. This domain revolves around the management of resources,
with the objective of constructing a variety of structures in specific locations. Its difficulty lies
in the fact that actions often have complex interactions and dependencies, which traditional
heuristics often fail to detect, due to the approximations on which they are based.</p>
      <p>[7] E. Scala, P. Haslum, S. Thiébaux, et al., Heuristics for numeric planning via subgoaling
(2016).</p>
      <p>[8] P. Haslum, H. Geffner, Admissible heuristics for optimal planning, in: AIPS,
2000, pp. 140–149.</p>
      <p>[9] G. Frances, H. Geffner, Modeling and computation in planning: Better heuristics from
more expressive languages, in: Proceedings of the International Conference on Automated
Planning and Scheduling, volume 25, 2015, pp. 70–78.</p>
      <p>[10] D. Long, M. Fox, The 3rd international planning competition: Results and analysis, Journal
of Artificial Intelligence Research 20 (2003) 1–59.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>P.</given-names>
            <surname>Ferber</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Helmert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hoffmann</surname>
          </string-name>
          ,
          <article-title>Neural network heuristics for classical planning: A study of hyperparameter space</article-title>
          ,
          <source>in: ECAI</source>
          <year>2020</year>
          , IOS Press,
          <year>2020</year>
          , pp.
          <fpage>2346</fpage>
          -
          <lpage>2353</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>R.</given-names>
            <surname>Karia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Srivastava</surname>
          </string-name>
          ,
          <article-title>Learning generalized relational heuristic networks for modelagnostic planning</article-title>
          ,
          <source>in: Proceedings of the AAAI Conference on Artificial Intelligence</source>
          , volume
          <volume>35</volume>
          ,
          <year>2021</year>
          , pp.
          <fpage>8064</fpage>
          -
          <lpage>8073</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>W.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Trevizan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Thiébaux</surname>
          </string-name>
          ,
          <article-title>Learning domain-independent planning heuristics with hypergraph networks</article-title>
          ,
          <source>in: Proceedings of the International Conference on Automated Planning and Scheduling</source>
          , volume
          <volume>30</volume>
          ,
          <year>2020</year>
          , pp.
          <fpage>574</fpage>
          -
          <lpage>584</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>G.</given-names>
            <surname>Tsoumakas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Katakis</surname>
          </string-name>
          ,
          <article-title>Multi-label classification: An overview</article-title>
          ,
          <source>International Journal of Data Warehousing and Mining (IJDWM) 3</source>
          (
          <year>2007</year>
          )
          <fpage>1</fpage>
          -
          <lpage>13</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Cheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Pollastri</surname>
          </string-name>
          ,
          <article-title>A neural network approach to ordinal regression</article-title>
          ,
          <source>in: 2008 IEEE international joint conference on neural networks (IEEE world congress on computational intelligence)</source>
          , IEEE,
          <year>2008</year>
          , pp.
          <fpage>1279</fpage>
          -
          <lpage>1284</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>E.</given-names>
            <surname>Scala</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Haslum</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Thiébaux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ramirez</surname>
          </string-name>
          ,
          <article-title>Interval-based relaxation for general numeric planning</article-title>
          ,
          <source>in: ECAI</source>
          <year>2016</year>
          , IOS Press,
          <year>2016</year>
          , pp.
          <fpage>655</fpage>
          -
          <lpage>663</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>