<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Learning Sound and Complete Preconditions in Complex Real-World Domains</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>René Heesch</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Björn Ludwig</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jonas Ehrhardt</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alexander Diedrich</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Oliver Niggemann</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>HSU-AI Institute for Artificial Intelligence, Helmut-Schmidt-University</institution>
          ,
          <addr-line>Hamburg</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Institute of Automation, Helmut-Schmidt-University</institution>
          ,
          <addr-line>Hamburg</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <abstract>
<p>In this paper, we address the problem of learning sound and complete action preconditions in complex real-world planning domains, the remaining bottleneck in the N3PCP pipeline. Such planning domains involve hybrid state spaces including discrete and numerical variables. We propose a dependency-aware learning approach that captures interdependencies between discrete and numerical variables by constructing distinct convex hulls over the numerical subspace for each discrete state configuration. This provides a more accurate representation of hybrid preconditions in real-world domains than existing approaches for learning preconditions. We empirically compare our method against two other approaches: an exact baseline method that ensures soundness but lacks completeness, and a generalized variant of N-SAM that achieves completeness but compromises soundness. We evaluate our approach across multiple planning problems and domains which are based on a real-world industrial system, demonstrating the practical benefits of our approach. An additional theoretical analysis confirms that, under standard convexity assumptions and sufficient coverage of discrete configurations within the training data, our proposed dependency-aware method guarantees both completeness and soundness.</p>
      </abstract>
      <kwd-group>
        <kwd>Planning</kwd>
        <kwd>N3PCP</kwd>
        <kwd>SMT</kwd>
        <kwd>Neural Networks</kwd>
        <kwd>Action Model Learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Automated planning is a key capability for the autonomy of intelligent systems [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Planning enables
systems to generate plans, which are action sequences that transition a system from an initial state to
a goal state [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. In many real-world domains, this involves reasoning over hybrid state spaces that
combine symbolic (discrete) and continuous (numerical) components, e.g., in robotics, autonomous
systems, and industrial manufacturing [
        <xref ref-type="bibr" rid="ref2 ref3 ref4">2, 3, 4</xref>
        ]. In such settings, the complexity of planning lies not
exclusively in the branching factor of the search space, but also in the need to determine valid control
parameters for actions and the difficulty of modeling system behavior mathematically [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>
        Traditional planning approaches depend on manually specified domain models, which define action
preconditions, effects, and causal relations. However, constructing such models in complex, dynamic,
and heterogeneous domains is time-consuming and error-prone [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. This modeling bottleneck
significantly limits the scalability and adaptability of planning systems in complex real-world domains.
      </p>
      <sec id="sec-1-1">
        <title>Neural Network–enriched Numerical Planning with Control Parameters (N3PCP)</title>
        <p>
          To address this bottleneck, N3PCP was recently introduced as a hybrid planning framework that leverages neural networks
to approximate the effects of actions based on empirical data [
          <xref ref-type="bibr" rid="ref2 ref7">7, 2</xref>
          ]. By learning effect models from
observations or simulations, N3PCP eliminates the need for explicit, hand-crafted transition models.
While early approaches employed eager and exact evaluation strategies [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ], later work adopted lazy
evaluation methods that trade off completeness for generality and computational efficiency [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. However,
both existing methods still rely on manually defined action preconditions, limiting their generality and
automation.
        </p>
        <p>In this paper, we address this remaining bottleneck in the N3PCP pipeline by proposing an approach
for learning action preconditions from data. We focus on domains where only observational data are
available, i.e., states to which actions have been successfully applied, without access to examples of
invalid plans. This leads to our first research question:</p>
        <p>RQ1: Is it possible to learn action preconditions in real-world domains exclusively from observations
of the states in which an action was applied?</p>
        <p>
          To motivate this question, we consider a concrete and challenging application: production planning
in modular manufacturing systems (cf. Figure 1). In particular, we focus on a modular real-world system
that refines automotive glass components via polyurethane foaming. Modular manufacturing systems
require the coordinated execution of production steps, each guided by machine-level control parameters,
to reliably transform raw materials into finished products [
          <xref ref-type="bibr" rid="ref7">8, 7</xref>
          ]. The hybrid nature of the task, combining
symbolic reasoning and numerical parameter identification, makes it a typical real-world use
case for automated planning frameworks like N3PCP [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. Moreover, rapid reconfiguration is essential
to cope with increasing product variability and reduced batch sizes [9, 10].
        </p>
        <p>In such hybrid domains, two properties are essential: soundness, which ensures that any generated
plan is executable, and completeness, which ensures that all valid plans can be found [11]. These
properties depend not only on the planner but also on the quality of the domain model, i.e., the learned
action preconditions. This raises our second research question:</p>
        <p>RQ2: How does precondition learning afect the soundness and completeness of planning algorithms
in N3PCP domains?</p>
        <p>To address these questions, we propose a novel approach to learn dependency-aware preconditions
from observational data. We compare this approach with two other approaches to precondition learning:
an exact, memorization-based baseline that assures soundness, and a variant of N-SAM [12] that aims for
completeness. We assess the methods in two ways: empirically, using data from an industrial production
system, and theoretically, via a formal analysis of their soundness and completeness.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        Automated planning relies on formal domain descriptions. These descriptions include action models,
which specify the conditions that must be satisfied in a state to make an action applicable to this state,
i.e., the preconditions, and how the action transforms the state, i.e., the effects. As manually
creating such descriptions is tedious and error-prone [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], multiple approaches proposed learning action
models based on observations from plan executions [
        <xref ref-type="bibr" rid="ref6">13, 12, 14, 15, 16, 6</xref>
        ]. Yet, most existing approaches
are restricted to propositional representations, and only limited work focuses on learning action models
in numerical domains.
      </p>
      <p>PlanMiner [16, 15] infers action models with numeric preconditions and effects from noisy or partial
plan traces by employing a combination of preprocessing, regression, and classification techniques
to learn logical and arithmetic relations among state variables. However, the approach lacks formal
guarantees on soundness: a plan found using the learned model may not be valid in the real world.</p>
      <p>Addressing this limitation, N-SAM [12] extends the propositional SAM framework to numeric
domains by learning safe numeric action models, adopting a two-stage learning process. The safe
action models guarantee that generated plans execute as expected, with a total deviation in numeric
effects bounded by at most a threshold ε ∈ ℝ. The algorithm first infers propositional preconditions
using techniques for classical planning problems [17]. In detail, to ensure soundness, the propositional
preconditions are conservatively defined, i.e., as the intersection of all observed applicable states, which
guarantees that no invalid plans are produced. Then it defines numeric preconditions by constructing a
convex hull over the numeric values observed in all states where the action was applied and expressing
this hull as a set of linear inequalities. The convex hull defines a conservative approximation of the
region in which the action is known to be applicable. However, learning a convex hull for an action
with n numeric parameters requires at least n + 1 independent samples. Furthermore, the algorithm
requires information about the set of numeric state variables involved in each action’s preconditions
and effects.</p>
      <p>N-SAM* [14] enhances the N-SAM algorithm by reducing the sample requirement to one observation
per action and by removing the requirement of prior knowledge about the involved state variables, while still
maintaining the previously introduced safety guarantee. The authors build a linear combination of the
observed samples using the Gram-Schmidt process, thereby enabling preconditions to be learned from
fewer than n + 1 samples for an action with n numerical parameters.</p>
      <p>Both neural networks and control parameters affect only action effects, not preconditions. Hence,
although N-SAM [12] and N-SAM* [14] were designed for numeric planning domains, their
precondition-learning methods could, in principle, be adapted to N3PCP problems.</p>
      <p>However, these approaches learn models expressed in the Planning Domain Definition Language
(PDDL), which has been shown to lack the expressiveness required for complex real-world applications
such as production planning [18]. Moreover, their methodology treats propositional and numerical
variables independently during precondition learning. As a result, they cannot capture cross-dependencies
between these variable types, e.g., for cases where the feasible numeric state space is further constrained
by specific propositional configurations. Additionally, the conservative strategy for learning the
propositional preconditions leads to a loss of completeness, potentially excluding valid behaviors that were
simply not observed in all instances.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Background</title>
      <p>All planning approaches to Neural Network-enriched Numerical Planning with Control Parameters
(N3PCP) are grounded in the planning as satisfiability paradigm, specifically by using Satisfiability
Modulo Theories (SMT) as the underlying computational framework.</p>
      <sec id="sec-3-1">
        <title>3.1. Logical Preliminaries</title>
        <p>We operate within the setting of many-sorted first-order logic. Terms are either constants,
individual variables, or applications of n-ary function symbols to n terms. Atoms are either propositional
variables or n-ary predicate symbols applied to n terms. Formulae are constructed from atoms using
standard Boolean connectives (¬, ∧, ∨, →, ↔) and quantifiers (∀, ∃) applied to one or more variables and
a subformula. We follow the standard semantic terminology of interpretation, model, satisfiability, and
validity [19].</p>
        <p>
          In SMT, the interpretation of a given set of symbols is constrained by a given background theory, e.g.,
the theory of nonlinear arithmetic with transcendental functions (NTA). To solve N3PCP problems,
numerical constants, real-valued variables, standard function symbols +, −, ×, ÷, comparison predicates
&lt;, ≤, &gt;, ≥, and transcendental functions such as tan are considered [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ].
        </p>
        <p>For planning, we employ a symbolic representation of infinite-state transition systems [20]. A
transition system is defined as the tuple ⟨X, I(X), T(X, X′)⟩, where:
• X is a set of state variables;
• I(X) is an SMT formula characterizing the set of initial states (i.e., a state s is initial if s ⊧ I);
• T(X, X′) is the transition relation, with X denoting the state before and X′ denoting the state after
the transition.</p>
        <p>A transition from state s to s′ exists if the combined valuation over X ∪ X′ that assigns X according to s
and X′ according to s′ satisfies T, i.e., (s, s′) ⊧ T. In this case, s′ is a successor of s, and s a predecessor of
s′. In a transition system Γ, a trace is a sequence of states s₀, s₁, … such that s₀ ⊧ I and for all i, sᵢ₊₁ is a
successor of sᵢ. To express properties over such traces, we use Linear Temporal Logic [21]. The model
checking problem Γ ⊧ φ states that all traces of Γ satisfy a temporal formula φ.</p>
        <p>Among various verification techniques, we focus on Bounded Model Checking (BMC) [22], which
reduces the reachability problem to satisfiability over a bounded horizon. BMC proceeds by unrolling
the transition relation for k steps. The following SMT formula is constructed:
I(s₀) ∧ T(s₀, s₁) ∧ … ∧ T(sₖ₋₁, sₖ) ∧ G(sₖ),
(1)
where G(sₖ) specifies the goal condition. The formula is satisfiable if there exists a trace of k transitions
from the initial state to a state satisfying G. A satisfying assignment to the variables s₀, s₁, …, sₖ
represents the corresponding plan.</p>
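        <p>To illustrate the unrolling in Eq. (1), the following minimal sketch enumerates bounded traces of a toy transition system by brute force instead of calling an SMT solver; the system, its actions, and all names are invented for this example.</p>
        <preformat>
```python
from itertools import product

# Toy transition system (all names invented): state = (mode, level).
# "toggle" flips the mode; "fill" raises the level while the mode is on.
STATES = [(m, l) for m in ("off", "on") for l in range(4)]
ACTIONS = ("toggle", "fill")

def initial(s):            # I(s0): the system starts switched off and empty
    return s == ("off", 0)

def trans(s, a, t):        # T(s, s'): labelled transition relation
    mode, lvl = s
    if a == "toggle":
        return t == ("on" if mode == "off" else "off", lvl)
    if a == "fill":
        return mode == "on" and lvl in (0, 1, 2) and t == (mode, lvl + 1)
    return False

def goal(s):               # G(sk): reach at least level 2
    return s[1] >= 2

def bmc(k):
    """Check I(s0) ∧ T(s0,s1) ∧ … ∧ T(s_{k-1},s_k) ∧ G(s_k) by enumeration."""
    for states in product(STATES, repeat=k + 1):
        if not (initial(states[0]) and goal(states[k])):
            continue
        for acts in product(ACTIONS, repeat=k):
            if all(trans(states[i], acts[i], states[i + 1]) for i in range(k)):
                return list(acts)
    return None

print(bmc(2))  # None: no 2-step plan exists
print(bmc(3))  # ['toggle', 'fill', 'fill']
```
        </preformat>
        <p>In practice, an SMT solver replaces this enumeration; the shape of the unrolled formula is the same.</p>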
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Real-World State Spaces</title>
        <p>In AI planning with propositional and numeric variables, the state space factors as the Cartesian product
of a propositional valuation space and a numeric assignment space [23]. Let P be the set of propositional
variables and X the set of numeric variables. Define
Val(P) ∶= {ν ∣ ν ∶ P → {⊥, ⊤}}
(2)
and
Ω ∶= ∏_{x∈X} Dₓ ⊆ ℝ^|X|,
(3)
where each Dₓ ⊆ ℝ is typically a closed interval. A state is a pair (ν, ω) with ν ∈ Val(P) and ω ∈ Ω, and
the state space is
S = Val(P) × Ω.
(4)</p>
        <p>If the real-world state space also includes finite discrete variables D = {dᵢ}ᵢ∈I with domains Dᵢ, these
are encoded into the propositional part via a standard injective encoding, obtaining an expanded set P′
and a bijection ∏ᵢ∈I Dᵢ × Val(P) ≅ Val(P′). Hence, w.l.o.g., we use the factored state space of Eq. (4) and refer only to the
propositional part.</p>
        <p>Definition 1. For u, v ∈ ℝⁿ, the line segment [u, v]
is [u, v] ∶= {(1 − λ)u + λv ∣ λ ∈ [0, 1]}.</p>
        <p>For states (ν, u), (ν, v) ∈ S with the same propositional valuation ν, the corresponding segment in S is
{(ν, (1 − λ)u + λv) ∣ λ ∈ [0, 1]}. If the propositional parts differ, no convex combination is defined in S.</p>
        <p>Definition 2. Let A ⊆ Val(P) × ℝⁿ. For ν ∈ Val(P), define the numeric fiber
A(ν) ∶= {ω ∈ ℝⁿ ∣ (ν, ω) ∈ A}. We say that A is fiberwise convex if each A(ν) is convex in ℝⁿ.</p>
        <p>In many applications, numeric variables are bounded physical quantities, i.e., ω ∈ Ω = ∏_{x∈X} Dₓ with
intervals Dₓ = [ℓₓ, uₓ]. Each Dₓ is convex, and thus Ω is an axis-aligned box and convex. Accordingly, we
assume throughout that for every fixed propositional assignment ν, the feasible numeric fiber A(ν)
is convex. This guarantees that for any two feasible numeric states under the same ν, their linear
interpolations remain feasible.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Problem Definition</title>
      <p>
        Following the formalization of [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], the Neural Network-enriched Numerical Planning with Control
Parameters (N3PCP) problem is defined as a tuple
Π = ⟨P, X, A, Θ, F, E, I, G⟩,
where each component contributes to describing hybrid planning tasks that involve both symbolic and
numerical reasoning under parametric control. This formalism is particularly motivated by real-world
applications such as production planning in modular manufacturing systems [
        <xref ref-type="bibr" rid="ref7">8, 7</xref>
        ], where tasks involve
discrete process steps and machine-level parameter tuning.
      </p>
      <p>Let P denote the set of propositional (Boolean) state variables and X the set of numerical state
variables, with |X| = n, n ∈ ℕ. Each state s ∈ S is characterized by a valuation of the propositional variables
in P and of the real-valued (or bounded-precision) variables in X. We assume a fixed ordering of X.
A denotes the finite set of actions, and Θ the set of control parameters associated with these actions.</p>
      <p>
        Each action models a transition of the system from one state to another, influenced by a specific valuation
of the parameters Θ. In the context of production systems, actions represent different production steps
such as cutting, assembling, or coating, while the control parameters correspond to machine-level
settings like feed rate, temperature, or pressure [
        <xref ref-type="bibr" rid="ref2 ref7">7, 2</xref>
        ].
      </p>
      <p>Each action a ∈ A is specified in one of two ways:
• A symbolic transition function fₐ ∈ F, represented as an SMT formula fₐ(sᵢ, Θ, sᵢ₊₁) over the
current state sᵢ, the control parameters Θ, and the successor state sᵢ₊₁.
• An executable transition function eₐ ∈ E, which is a loop-free imperative program that computes
a successor state sᵢ₊₁ given (sᵢ, Θ). In practice, this may be implemented by a neural network
trained to approximate physical or empirical system dynamics, or by a simulation of a particular
production step.</p>
      <p>The symbolic representation allows for logical reasoning, constraint satisfaction, and artificial data
generation. The executable form enables forward simulation, including black-box models where
analytical descriptions are infeasible.</p>
      <p>The control parameters Θ are subject to domain constraints. Let ValidCPₐ(Θ) denote the admissible
space of control parameter valuations for action a. These constraints may be, for instance, interval
bounds:
l ≤ θ ≤ u, ∀θ ∈ Θ,
(5)
encoding physical or safety limitations, e.g., torque limits and temperature ranges in the context of
production systems, where l describes the lower bound and u the upper bound, l, u ∈ ℝ.</p>
      <p>Additionally, not all actions are applicable in all states. We define the applicability set of an action
a ∈ A as:
S_in(a) = {s ∈ S ∣ preₐ(s) = true},
(6)
where preₐ ∶ S → {true, false} is a precondition predicate that determines whether the action a can be
applied in state s. These predicates may be manually specified or learned from empirical execution data,
such as successful and failed application attempts of a across different states.</p>
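      <p>A precondition predicate and the applicability set it induces can be sketched as follows; the action name, state variables, and bounds are hypothetical and serve only to illustrate the definitions above.</p>
      <preformat>
```python
# Hypothetical precondition predicate pre_a for a "coat" action: the part must
# be cleaned (propositional) and the surface temperature must lie in a range
# (numeric). All variable names and bounds are invented for illustration.
def pre_coat(state):
    return state["cleaned"] and state["temp_c"] >= 18.0 and 35.0 >= state["temp_c"]

def valid_cp_coat(theta):
    # Interval bounds l ≤ θ ≤ u on a control parameter (feed rate)
    return theta["feed_rate"] >= 0.1 and 2.5 >= theta["feed_rate"]

def s_in(pre, states):
    """Applicability set S_in(a) = {s ∈ S ∣ pre_a(s) = true}."""
    return [s for s in states if pre(s)]

observed = [
    {"cleaned": True, "temp_c": 21.0},   # applicable
    {"cleaned": False, "temp_c": 21.0},  # not cleaned
    {"cleaned": True, "temp_c": 80.0},   # temperature out of range
]
print(s_in(pre_coat, observed))  # keeps only the first state
```
      </preformat>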
      <p>Moreover, we assume that the symbolic and executable transition functions fₐ and eₐ are defined
and valid only on their respective domains of applicability:
dom(fₐ), dom(eₐ) ⊆ S_in(a) × ValidCPₐ.
(7)
This ensures that logical inference and simulation are only performed under feasible conditions.</p>
      <p>I denotes the set of initial state conditions, encoded as an SMT formula over P and X, describing
feasible system configurations at the outset of planning. The goal condition G is similarly an SMT
formula over P and X, specifying the desired properties of the final state. Unlike classical planning with
discrete goals, N3PCP problems allow goal descriptions and action preconditions over continuous
state variables. These are not expressed as equalities but as sets of inequalities, where the numerical
values have to lie within a certain range, reflecting the properties of real-world domains, e.g., target
dimensions and quality metrics.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Solution</title>
      <p>In this section, we introduce our novel solution for learning dependency-aware preconditions solely from
observations of the planning domain. We derive our dependency-aware approach from two baseline
approaches, one emphasizing exclusively the soundness of the learned preconditions, and the other
emphasizing their completeness. While the first baseline is the most restrictive
and conservative one, with the highest possible incompleteness, the second baseline is the most optimistic
approach, aiming for completeness at the cost of soundness. Our proposed dependency-aware approach
is, under reasonable assumptions, both sound and complete.</p>
      <p>
        For all approaches, a set 𝒟ₐ of observed applicable states {s₁, …, sₘ} for an action a is given. Following
the vector-based state-space representation presented in Heesch et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], we represent the assignments
to all variables in P and X, i.e., the states, in the form of vectors.
      </p>
      <sec id="sec-5-1">
        <title>5.1. Soundness First: Learning Exact Preconditions</title>
        <p>Our first baseline is sound with respect to the training data. Given the observed applicable states
𝒟ₐ = {s₁, …, sₘ} ⊆ S, we define the exact precondition as a pure enumeration of those states:
φ_exact = ⋁_{i=1}^{m} cᵢ,
(8)
where each clause cᵢ is the conjunction of the variable assignments of state sᵢ.</p>
        <p>This DNF exactly accepts the training examples and rejects all others. The corresponding algorithm
is given in Alg. 1.</p>
        <p>Algorithm 1 Learn Exact Preconditions
Require: Dataset 𝒟ₐ = {s₁, s₂, …, sₘ} of states s ∈ S where action a was applied
Ensure: Symbolic precondition φ_exact
1: Initialize φ_exact ← ∅
2: for all sᵢ ∈ 𝒟ₐ do
3:   Construct clause cᵢ ← ⋀_{v∈P∪X} (v = sᵢ[v])
4:   φ_exact ← φ_exact ∨ cᵢ
5: end for
6: return φ_exact</p>
        <p>The algorithm first initializes an empty clause φ_exact (cf. Alg. 1 l. 1). Subsequently, the algorithm
iterates through the whole dataset 𝒟ₐ. For each state sᵢ, a clause cᵢ is constructed (cf. Alg. 1 l. 2). The
clause cᵢ is a conjunction assigning each variable in P and X the corresponding value of sᵢ (cf. Alg. 1 l.
3). This clause is then added to the clause φ_exact via a disjunction (cf. Alg. 1 l. 4). At the end, the clause
φ_exact is returned (cf. Alg. 1 l. 6).</p>
        <p>While Alg. 1 may be reasonable in propositional domains, it becomes problematic in continuous or
hybrid domains. In such domains, this method is unlikely to provide a complete representation, as it
only enumerates the observed states from the data set. It cannot cover the infinite space of possible
continuous values. As a result, φ_exact is sound, but its completeness with respect to possible unseen but valid
states is limited to the dataset and to propositional domains.</p>
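        <p>A minimal sketch of Alg. 1 in Python, representing each clause as a dictionary of variable assignments; the state variables are invented for illustration.</p>
        <preformat>
```python
# Minimal sketch of Alg. 1: phi_exact is a DNF that enumerates the observed
# states; a clause is modelled as a dict of variable assignments. The state
# variables below are illustrative, not taken from the real system.
def learn_exact(dataset):
    phi_exact = []                 # l. 1: empty disjunction
    for s in dataset:              # l. 2: iterate over the dataset
        clause = dict(s)           # l. 3: conjunction v = s[v] for all v
        phi_exact.append(clause)   # l. 4: phi_exact ∨ clause
    return phi_exact               # l. 6

def holds(phi_exact, state):
    # The DNF accepts exactly the training states and rejects all others.
    return any(state == clause for clause in phi_exact)

D_a = [{"valve": True, "pressure": 1.2}, {"valve": True, "pressure": 1.5}]
phi = learn_exact(D_a)
print(holds(phi, {"valve": True, "pressure": 1.2}))  # True: observed
print(holds(phi, {"valve": True, "pressure": 1.3}))  # False: unseen, though possibly valid
```
        </preformat>
        <p>The second query makes the incompleteness concrete: a numerically close but unobserved state is rejected.</p>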
      </sec>
      <sec id="sec-5-2">
        <title>5.2. Completeness First: Generalized Preconditions</title>
        <p>Our second baseline approximates the preconditions by generalizing from observed data, emphasizing
completeness rather than soundness. This involves inferring broader parts of the state space from
sampled data, e.g., intervals or convex regions, over the numerical variables X. This relaxation may come at the
expense of soundness when the space of applicable states is not convex, e.g., some states which satisfy
φ_general may not allow action a in practice.</p>
        <p>Let xᵢ ∈ X be a continuous state variable. If the action a was applied in states s⁰ and s¹ where all other
variables are equal and only xᵢ differs, and if a was successfully executed in both cases, we assume a is
applicable to any state s satisfying:
xᵢ(s) ∈ [xᵢ(s⁰), xᵢ(s¹)],
(9)
where xᵢ(s) denotes the assignment to xᵢ in state s.</p>
        <p>Building on this idea and inspired by Mordoch et al. [12], we separate propositional and numerical
variables when learning preconditions. For the numerical variables, we calculate the convex hull of the
numerical components of the states in the data set 𝒟ₐ of successful executions of a. The linear inequalities
of the hull’s supporting hyperplanes then define the numerical precondition. Unlike Mordoch et al.
[12], our method does not require prior manual identification of action-relevant numeric variables. For
propositional variables, we use a straightforward disjunctive strategy to increase completeness: we
collect the distinct propositional assignments under which a was observed, and take their disjunction
as the propositional precondition.</p>
        <p>Algorithm 2 Learn Generalized Preconditions
Require: Dataset 𝒟ₐ of states s where action a was applied
Ensure: Symbolic precondition φ_general
1: Initialize Φ ← ∅
2: Initialize matrix M ← [ ]
3: for all s ∈ 𝒟ₐ do
4:   νₛ ← propositional subvector of s
5:   if νₛ ∉ Φ then
6:     Φ ← Φ ∪ {νₛ}
7:   end if
8:   Append numerical subvector of s to M
9: end for
10: φ(P) ← ⋁_{νᵢ∈Φ} νᵢ
11: H ← ConvexHull(M)
12: φ(X) ← GetHalfspaceInequalities(H)
13: φ_general ← φ(P) ∧ φ(X)
14: return φ_general</p>
        <p>The algorithm begins by initializing two data structures: an empty set Φ to collect all unique
propositional configurations observed in the dataset 𝒟ₐ (cf. Alg. 2, l. 1), and an empty matrix M to store
the numerical components of the corresponding states (cf. Alg. 2, l. 2).</p>
        <p>For each state s ∈ 𝒟ₐ, the propositional subvector νₛ is extracted (cf. Alg. 2, l. 4). If this configuration
has not yet been encountered, it is added to Φ (cf. Alg. 2, l. 5–7). Simultaneously, the numerical
subvector is appended to the matrix M for subsequent processing (cf. Alg. 2, l. 8).</p>
        <p>Once all states have been processed, the propositional precondition φ(P) is constructed as a disjunction
over the elements of Φ (cf. Alg. 2, l. 10):
φ(P) = ⋁_{νᵢ∈Φ} νᵢ,
(10)
ensuring that the resulting precondition evaluates to true precisely for the propositional configurations
under which the action has been previously observed. This avoids the overly conservative strategy of
assuming propositional variables must be fixed across all samples, as in [12].</p>
        <p>Subsequently, a convex hull H is computed over the matrix M containing the numerical state
components (cf. Alg. 2, l. 11). The set of defining linear inequalities for H is then extracted and used to define
the numerical precondition φ(X) (cf. Alg. 2, l. 12). The final action precondition φ_general is returned as
the conjunction of its propositional and numerical components (cf. Alg. 2, l. 13–14).</p>
        <p>Using a convex hull to describe the numerical component reduces incompleteness by generalizing
over observed numeric values, while maintaining soundness under the assumption that preconditions
are independent of unobserved propositional contexts. However, this assumption introduces a potential
loss of soundness when the propositional and numerical components are not independent. Specifically,
by applying a single convex hull over all numerical data points, without accounting for the associated
propositional configurations, the approach cannot capture cross-dependencies, such as when certain
propositional conditions constrain the feasible numeric subspace. Consequently, the combined
precondition φ_general = φ(P) ∧ φ(X) may admit states that were never observed together in the data, potentially
leading to plans with actions that could not be applied.</p>
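        <p>The generalized scheme can be sketched for two numeric variables as follows; the hull is computed with the monotone-chain algorithm for self-containment (a library routine such as scipy.spatial.ConvexHull would be used in practice), and all state variables are invented. The last query also makes the soundness caveat concrete: a configuration and a numeric point that never co-occurred in the data are still accepted.</p>
        <preformat>
```python
# Sketch of the generalized precondition for two numeric variables: the
# propositional part is the disjunction of observed configurations, the
# numeric part the convex hull of all observed numeric points. Assumes at
# least three affinely independent numeric samples, mirroring the n + 1
# sample requirement discussed in the text. All names are invented.
def cross(o, a, b):
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(pts):
    # Andrew's monotone chain; returns vertices in counter-clockwise order
    pts = sorted(set(pts))
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and 0 >= cross(lower[-2], lower[-1], p):
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and 0 >= cross(upper[-2], upper[-1], p):
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def in_hull(hull, p):
    # p must lie on the inner side of every edge (each edge = one halfspace)
    n = len(hull)
    return all(cross(hull[i], hull[(i + 1) % n], p) >= 0 for i in range(n))

def learn_general(dataset):
    configs = {tuple(sorted(s["prop"].items())) for s in dataset}  # Phi
    hull = convex_hull([s["num"] for s in dataset])                # H
    def phi(state):
        return tuple(sorted(state["prop"].items())) in configs and \
               in_hull(hull, state["num"])
    return phi

D_a = [
    {"prop": {"on": True}, "num": (0.0, 0.0)},
    {"prop": {"on": True}, "num": (4.0, 0.0)},
    {"prop": {"on": False}, "num": (0.0, 4.0)},
]
phi_gen = learn_general(D_a)
print(phi_gen({"prop": {"on": True}, "num": (1.0, 1.0)}))   # True: inside hull
print(phi_gen({"prop": {"on": True}, "num": (5.0, 5.0)}))   # False: outside hull
print(phi_gen({"prop": {"on": False}, "num": (3.0, 0.5)}))  # True, yet never observed together
```
        </preformat>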
      </sec>
      <sec id="sec-5-3">
        <title>5.3. Dependency-Aware Preconditions</title>
        <p>To address the drawbacks of our two baselines, we introduce our novel dependency-aware precondition
formulation. Instead of treating propositional and continuous constraints independently, we model the
precondition as a conditional structure. Let νₛ and ωₛ denote the propositional and numeric subvectors
of the state s, respectively. Let Φ ⊆ dom(νₛ) be the set of observed propositional configurations (from
states where action a was applied). For each νᵢ ∈ Φ, we fit a convex region C(νᵢ) over the numeric
components observed under configuration νᵢ.</p>
        <p>Formally, the dependency-aware precondition for action a is defined as:
φ_depend = ⋁_{νᵢ∈Φ} (νᵢ ∧ C(νᵢ)),
(11)
where νᵢ is the propositional component and C(νᵢ) the numeric component. The numeric condition
C(νᵢ) is defined by the set of linear inequalities (i.e., halfspaces) that describe
the convex hull over all numeric vectors observed in states with the propositional configuration νᵢ.</p>
        <p>Algorithm 3 Dependency-Aware Preconditions
Require: Dataset 𝒟ₐ of states s where action a was applied; propositional indices ℐ_P, numeric indices ℐ_X
Ensure: Symbolic precondition φ_depend
1: Initialize φ_depend ← ∅
2: Initialize map G ∶ ν ↦ list of numeric vectors
3: for all s ∈ 𝒟ₐ do
4:   ν ← s[ℐ_P]
5:   ω ← s[ℐ_X]
6:   Append ω to G[ν]
7: end for
8: Initialize list clauses ← [ ]
9: for all νᵢ ∈ keys(G) do
10:   M ← G[νᵢ]
11:   H ← ConvexHull(M)
12:   C(νᵢ) ← GetHalfspaceInequalities(H)
13:   φ(νᵢ) ← conjunction of variable assignments matching νᵢ
14:   Append (φ(νᵢ) ∧ C(νᵢ)) to clauses
15: end for
16: return φ_depend ← ⋁_{c∈clauses} c</p>
        <p>The algorithm begins by partitioning the dataset 𝒟ₐ into groups of states that exhibit identical
propositional configurations (cf. Alg. 3, l. 2–7). For each such group, a convex hull is computed over
the continuous components of the state vectors (cf. Alg. 3, l. 10–11), and the corresponding set of linear
inequalities C(νᵢ) is derived (cf. Alg. 3, l. 12). These inequalities define the feasible numeric region
associated with the propositional configuration νᵢ. The propositional part φ(νᵢ) is constructed as a
conjunction of variable assignments that match configuration νᵢ (cf. Alg. 3, l. 13). The two components
are then conjoined into a single clause that is added to φ_depend as a disjunction (cf. Alg. 3, l. 14–16).</p>
        <p>By explicitly associating each propositional configuration with a corresponding region in the numeric
state space, this approach avoids the conservative treatment of propositional variables required by
other approaches [17, 12]. As a result, it mitigates the incompleteness typically introduced by requiring
propositional preconditions to hold uniformly across all observed states. Simultaneously, under the
assumption that the numeric values observed under each configuration form a convex region, the
approach preserves soundness, ensuring that plans generated using the learned model remain executable
in the real domain.</p>
        <p>We recognize several limitations of this method. Most notably, constructing convex hulls requires a
suficient number of observed state samples for each propositional configuration in which the action
was applied. Furthermore, the complexity of the resulting preconditions can increase the
computational complexity, both during model learning and at planning time, where more detailed applicability
conditions must be evaluated.</p>
        <p>Nonetheless, in domains where complexity arises from the of actions rather than the number of
objects, e.g., production planning [8], capturing these dependencies is critical for deriving both sound
and complete action models. Importantly, we assume that precondition learning is performed entirely
ofline, prior to planning. This design choice alleviates concerns about runtime overhead, making it
feasible to deploy expressive, dependency-aware preconditions in practical planning systems without
compromising online eficiency.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Evaluation</title>
      <p>Our evaluation is twofold. To address the first research question, we analyze the performance of our
proposed approache empirically. To address the second research question, we perform a theoretical
analysis of the presented methods in terms of soundness and completeness.</p>
      <sec id="sec-6-1">
        <title>6.1. Empirical Evaluation</title>
        <p>To investigate whether data-driven methods can efectively infer action preconditions based solely on
observations of states in which the action was executed (RQ1), we focus on a representative action
from an industrial production system. Thereby, we do not only focus on learning preconditions, but
also the subsequent planning task, to account for the increasing complexity of the planning problem by
learning complex precondition structures as in the generalized and the dependency-aware approach.</p>
        <p>As evaluation metric, we chose the algorithmic runtime. Specifically, we included both, the learning
of the action preconditions and the planning with these learned preconditions in the recording of
runtime. Soundness and completeness are proven theoretically in Section 6.2.
6.1.1. Experimental Setup
We evaluated the algorithms on a real-world production system (cf. Figure 1). The production system
refines automotive glass components using polyurethane foaming. In the foaming cell, a robot positions
mechanical inserts onto a pretreated glass pane fixed in a foaming mold. This insert placement is a
critical step before the polyurethane injection and must satisfy strict constraints related to temperature
and configuration to ensure both product integrity and functional reliability.</p>
        <p>Our evaluation centers on the robot’s action model. An expert-defined model exists for each phase
of the process, including an action for placing the inserts. Our objective is to infer the preconditions of
this specific action exclusively from 1000 observations of states where the action was applied. These
observations were generated by randomly sampling state transitions from the expert model. In the
minimal complexity level, the states are described by fiteen state variables: thirteen propositional
and two numerical variables1. To assess the scalability, we introduced up to four additional numerical
state variables, incrementally adding additional preconditions to the planning domain, resulting in five
increasingly complex planning problems and domains. Following the experimental design guidelines
for Machine Learning research by [24], we generated the data for each level of domain complexity with
eight diferent seeds. For avoiding random efects, we repeated each experiment ten times. The source
1We removed the control parameters and the neural network which is usually representing the actions’ efect in N3PCP
problems to reduce the complexity of the planning problems. Including these components would significantly increase
computational complexity, and thereby obscure a meaningful comparison of the approaches to learn preconditions.
code and the action-model modeled by experts are available at https://github.com/RHeesch/Learning_
Hybrid_Preconditions.</p>
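The experimental protocol described above (several seeds per complexity level, repeated runs, runtime recording) could be organized along these lines; `run_protocol`, `learn_preconditions`, and `sample_states` are hypothetical stand-ins, not the released code.

```python
import random
import statistics
import time


def run_protocol(learn_preconditions, sample_states,
                 complexity_levels, n_seeds=8, n_repeats=10, n_obs=1000):
    """Illustrative harness: for each domain complexity level, generate data
    with several seeds and repeat each learning run to average out random
    effects. Returns mean and standard deviation of the learning runtime."""
    results = {}
    for level in complexity_levels:
        runtimes = []
        for seed in range(n_seeds):
            random.seed(seed)                      # one dataset per seed
            states = sample_states(level, n_obs)
            for _ in range(n_repeats):             # repeated measurement
                start = time.perf_counter()
                learn_preconditions(states)
                runtimes.append(time.perf_counter() - start)
        results[level] = (statistics.mean(runtimes), statistics.stdev(runtimes))
    return results
```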
        <p>We ran our experiments on a system with an AMD Ryzen 9 9950X 16-core processor and 96 GB RAM.
The planning problems, including the learned preconditions, were encoded in the Unified Planning
Framework (UPF)[25] and solved using the LPG planner [26]. We evaluated the resulting plans, which
were generated based on the learned preconditions, against the expert-defined domain model using the
UPF and the TAMER planner [27].
6.1.2. Results
Table 1 summarizes the runtimes for learning action preconditions. We report mean and standard
deviation of the runtime for each algorithm across the increasing domain complexities. Regarding
precondition learning (cf. Table 1), all approaches exhibited similar runtimes in the domain with two
numerical state variables. The exact method showed only modest increases as domain complexity grew,
whereas the generalized and dependency-aware variants scaled exponentially, likely due to the overhead
of constructing convex hulls and deriving the corresponding systems of linear inequalities.</p>
        <p>Table 2 reports the mean planning times and standard deviation. All plans are valid with respect
to expert-modeled domain description. Planning runtimes increased exponentially with domain
complexity across all approaches. For the 15-variable domain, planning was initially much faster using
the generalized and dependency-aware preconditions compared to the exact preconditions. However,
as complexity increased, planning times for both variants soon exceeded those of the exact method,
with the generalized approach exhibiting the highest planning times in the most complex domain.
The dependency-aware variant initially performed less favorably than the generalized one, but at last
surpassed it in the two most complex domains.</p>
      </sec>
      <sec id="sec-6-2">
        <title>6.2. Theoretical Evaluation</title>
        <p>For evaluating the soundness and completeness of the diferent approaches, we consider a hybrid state
space  =  ×  , where  denotes the propositional component and  the continuous (numeric)
component. For any action  , we define its true applicability region as   = { ∈  ∣  is truly applicable in } .
We assume access to a finite training set   = { 1, … ,   } ⊆   , representing observed states where action
 was applicable.</p>
        <p>The learned preconditions are sound, if the planning algorithm can find a plan that is executable in
the real world, based on learned preconditions. The learned preconditions are complete, if the planning
algorithm can find every single plan, that is an executable plan in the real world, based on the learned
preconditions [11].</p>
        <p>Exact Preconditions:
Theorem 1. The exact preconditions are sound, i.e., for all  ∈  , with  exact holds  ∈   .
Proof 1. Let   be the complete set of applicable states for action  . For the training set   ⊆   holds and
∀ ∈  exact ∶  ∈   . Thus, ∀ ∈  exact ∶  ∈   .</p>
        <p>Since the exact preconditions only allow the action to be applied to states to which the action was
previously applied, the learned exact preconditions are sound by construction.</p>
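As a sketch, the exact preconditions amount to a membership test against the memorized training states. The helpers `learn_exact` and `exact_applicable` are illustrative names of our own, assuming states are represented as hashable tuples.

```python
def learn_exact(states):
    """Exact preconditions (sketch): memorize the observed states verbatim."""
    return {tuple(s) for s in states}


def exact_applicable(pre_exact, state):
    """The action is declared applicable only in previously observed states."""
    return tuple(state) in pre_exact
```

This makes the soundness argument tangible: every accepted state was observed, hence truly applicable, while any unobserved (but possibly applicable) state is rejected.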
        <p>Theorem 2. The exact preconditions are complete with respect to the training set   .
Proof 2. Proof by contradiction: We assume  ∗ ∈   exists with  is applicable for  ∗, where  ∗ ∉  exact,
but by definition it holds ∀ ∈   ∶  ∈  exact.</p>
        <p>Exact preconditions are complete only if the training set   fully enumerates   . However, in general,
when   is continuous or infinite, this condition is not met, making exact preconditions incomplete in
practical scenarios.</p>
        <p>Generalized Preconditions:
Theorem 3. In general, the generalized preconditions are not sound, i.e., ∃ ∗ ∈  
general
∶  ∗ ∉   .</p>
        <p>Proof 3. Proof by contradiction: Let   ∶= { 1,  2} with  1 = [true, false, 10],  2 = [true, true, 15]. We
assume   = { 1,  2}. Creating  general from   gives the set  general = {[true, ,  ] |  ∈ { true, false},  ∈
[10, 15]}. Let  3 = [true, false, 15], then  3 ∈  general, but  3 ∉   . So ∃ ∗ ∈  general ∶  ∗ ∉   .
The generalized preconditions are not, in general, sound. In particular, generalization within the
numerical space across all propositional configurations may introduce additional states into the precondition
set that do not lie within the true applicability region of the action.</p>
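The counterexample of Proof 3 can be replayed directly. The helper `general_applicable` below is illustrative only: it encodes the generalized precondition learned from s_1 and s_2, with the second proposition left unconstrained and the numeric variable generalized to the interval [10, 15].

```python
def general_applicable(state):
    """Generalized precondition learned from s1 = (True, False, 10) and
    s2 = (True, True, 15): first proposition fixed to True, second proposition
    unconstrained, numeric value anywhere in [10, 15]."""
    p1, p2, x = state
    return p1 is True and 10 <= x <= 15


# True applicability region assumed in Proof 3:
A_a = {(True, False, 10), (True, True, 15)}

s3 = (True, False, 15)
assert general_applicable(s3)  # admitted by the generalized preconditions
assert s3 not in A_a           # but not truly applicable: unsound
```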
        <p>Theorem 4. In general, the generalized preconditions are not complete, i.e., ∃ ∗ ∈   ∶  ∗ ∉  
general.</p>
        <p>Proof 4. Proof by contradiction: Let   ∶= { 1,  2,  3} with  1 = [true, false],  2 = [true, true] and  3 =
[false, true]. We assume   = { 1,  2}. Creating  general from   gives the set  general = { 1,  2}. Then
 3 ∉  general, but  3 ∈   . So ∃ ∈   ∶  ∉  general.</p>
        <p>The generalized preconditions are not generally complete, as propositional configurations that fall
within the true applicability region may be absent from the training data and, consequently, excluded
from the learned preconditions.</p>
        <p>Theorem 5. If there is only one single propositional configuration () within   and the numerical state
space represented in   ( ) ⊆   ( ) and   ( ) is convex, then the generalized preconditions are sound.
Proof 5. Let ∀ ∗ ∈   ∶  ∗() = () , then ∀ ∗ ∈   ∶  ∗() = () and therefore ∀ ∗ ∈  
() . By assumption,   ( ) is convex and ∀( ) ∈   ( ) ∶ ( ) ∈   ( ) , so  ( ) ∶=
general
  ( ) . Then ∀ ∗( ) ∈  ( ) ∶  ∗( ) ∈   ( ) . Therefore, ∀ ∈   ∶  ∈   .
general ∶  ∗() =
Conv(  ( )) ⊆
Theorem 6. If there is only one single propositional configuration () within   and the numerical state
space represented in   ( ) is the convex hull of   ( ) , then the generalized preconditions are complete.
Proof 6. Let ∀ ∗ ∈   ∶  ∗() = () , then ∀ ∗ ∈   ∶  ∗() = () and therefore ∀ ∗ ∈  general ∶  ∗() =
() . By assumption,   ( ) is convex and represented by   ( ) , so  ( ) ∶= Conv(  ( )) =   ( ) . Then
∀ ∗( ) ∈   ( ) ∶  ∗( ) ∈ Conv(  ( )) . Therefore, ∀ ∈   ∶  ∈  general.</p>
        <p>This generalization admits all convex combinations of observed numeric values. Assuming that the
true numeric applicability region is globally convex, the generalized preconditions are both sound and
complete. However, in non-convex domains, particularly where diferent propositional configurations
admit distinct convex regions, soundness may be violated.</p>
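Membership in the convex hull of the observed numeric values, which is exactly what the generalized preconditions test in the numeric dimension, can be decided by a small linear feasibility program. The helper `in_convex_hull` is an illustrative sketch, assuming SciPy is available.

```python
import numpy as np
from scipy.optimize import linprog


def in_convex_hull(points, x):
    """Check x in Conv(points) via an LP feasibility problem:
    find lambda >= 0 with sum(lambda) = 1 and points.T @ lambda = x."""
    points = np.asarray(points, dtype=float)
    n, d = points.shape
    A_eq = np.vstack([points.T, np.ones((1, n))])  # d coordinate rows + sum row
    b_eq = np.append(np.asarray(x, dtype=float), 1.0)
    res = linprog(c=np.zeros(n), A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * n)
    return res.status == 0  # feasible => x is a convex combination


# Observed numeric values under a single propositional configuration:
D_x = [[10.0], [15.0]]
assert in_convex_hull(D_x, [12.5])      # convex combination of observations
assert not in_convex_hull(D_x, [16.0])  # outside Conv(D(x)): rejected
```

Unlike the exact preconditions, this test accepts unobserved interior points, which is where both the completeness gain and the potential unsoundness of the generalized approach originate.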
        <p>Dependency-Aware Preconditions:</p>
        <sec id="sec-6-2-1">
          <title>Theorem 7. The dependency-aware preconditions are also sound for non-globally convex hulls, where</title>
          <p>diferent propositional configurations () ∈  () allow for diferent convex hulls  () , when ∀() ∈
  ∶ Conv( ,() ) ⊆  ,() .</p>
          <p>Proof 7. By definition ∀() ∈   ∶ Conv( ,()</p>
        </sec>
        <sec id="sec-6-2-2">
          <title>Theorem 5.</title>
          <p>) ⊆  ,() . Therefore, for each fixed () ∈   it holds</p>
        </sec>
        <sec id="sec-6-2-3">
          <title>Theorem 8. The dependency-aware preconditions are complete for non-globally convex hulls, where</title>
          <p>diferent propositional configurations () ∈  () allow for diferent convex hulls  () , when the
dataset   represents these hulls.</p>
          <p>Proof 8. For each fixed () ∈   by assumption,  ,() ( ) is convex and ∀() ∈   ∶ Conv( ,()
 ,() ( ) . Therefore, for each fixed () ∈   it holds Theorem 6.
( )) =</p>
          <p>By exploiting the structure of propositional configurations to construct independent convex
approximations of applicability regions, the dependency-aware preconditions also achieve soundness and
completeness under only localized convexity assumptions.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>7. Discussion</title>
      <p>The empirical and theoretical evaluations ofer complementary insights for answering our two posed
research questions.</p>
      <p>The experimental results show that it is possible to learn action preconditions in a real-world domain
exclusively from observational data in which the action has previously been successfully applied.
All methods supported both learning and subsequent planning based on the learned preconditions,
confirming the eficiency of data-driven precondition learning in hybrid domains without expert input
(cf. RQ1). Nevertheless, there were clear diferences between the three approaches in terms of runtime
performance and scalability (cf. Table 1).</p>
      <p>Interestingly, for smaller domains, the generalized approach exhibited slower runtime growth than
the dependency-aware variant. Yet, in the most complex domain, the dependency-aware method
outperformed the generalized one in terms of runtime. This suggests that the overhead introduced by
dependency modeling may become advantageous in higher-complexity settings.</p>
      <p>The results regarding the planning runtimes (cf. Table 2) reflect a core limitation of convex hull-based
representations: as numerical preconditions are encoded via systems of linear inequalities derived from
hulls, computational demands grow with state space complexity. This can be mitigated by incorporating
domain knowledge, e.g., by constraining the set of considered variables during learning, as proposed by
[12]. In addition, preconditions can be represented more eficiently by bounded intervals in areas with
axis- or box-shaped spaces, as it is often the case in most real-world applications.</p>
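A sketch of the interval-based alternative mentioned here: per-dimension [min, max] bounds replace the half-space system of a hull. The names `box_precondition` and `in_box` are illustrative, and the box is exact only when the true numeric region is itself axis-aligned.

```python
import numpy as np


def box_precondition(vectors):
    """Axis-aligned bounding box: one [min, max] interval per dimension.
    Cheaper to learn and to evaluate than half-space systems derived from
    convex hulls, at the cost of over-approximating non-box regions."""
    vectors = np.asarray(vectors, dtype=float)
    return vectors.min(axis=0), vectors.max(axis=0)


def in_box(box, x):
    """Membership test: x lies within every per-dimension interval."""
    lo, hi = box
    return bool(np.all(lo <= x) and np.all(x <= hi))


box = box_precondition([[10.0, 0.5], [15.0, 1.5], [12.0, 1.0]])
assert in_box(box, np.array([11.0, 0.75]))
assert not in_box(box, np.array([16.0, 1.0]))
```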
      <p>It is also notable that planning runtimes exhibited high standard deviation, indicating sensitivity to
the random seed. This variation is likely due to diferences in the distribution of numerical variables,
which can influences the structure of the state space.</p>
      <p>Two key requirements must be considered when learning action models from data for planning in
complex real-world domains: soundness and completeness. To learn preconditions fulfilling both of
these requirements, especially in continuous or high-dimensional numerical domains, where it is not
possible for training data to completely cover all valid states, is challenging.</p>
      <p>The exact approach guarantees soundness by declaring an action applicable only in states that were
explicitly observed during training. However, the approach does not account for the mentioned problem
of missing data for every single state within the numerical state space and is therefore not complete.</p>
      <p>By contrast, the generalized approach is complete under the assumptions that (i) the dataset includes
all valid propositional configurations, (ii) the numerical state space can be approximated by a convex
hull, and (iii) boundary states are represented in the dataset. Nonetheless, this method risks
overgeneralization in both the propositional and numerical dimensions. In the propositional space, the method
does not restrict preconditions to propositions true in all observed states, compromising soundness.
In the numerical space, even if the outer shape is convex, the internal structure may contain holes or
disallowed subregions, e.g., if certain configurations of the propositional state space only allow for
a certain subspace of the numerical state space. Consequently, the generalized approach does not
guarantee soundness in either component, which is confirmed by experimental observations where
plans were generated that are invalid with respect to the expert-modeled domain.</p>
      <p>Our novel dependency-aware approach maintains the completeness of the generalized approach
under the same assumptions but addresses its key shortcomings by modeling dependencies among
propositional variables and possibly resulting constraints in the numerical space. It also guarantees
soundness under the reasonable assumption that, for each propositional configuration, the corresponding
set of applicable numerical states forms a convex region.</p>
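A one-dimensional sketch of this behavior, with illustrative helper names: learning one numeric interval (the 1-D convex hull) per propositional configuration rejects the state s_3 from Proof 3 that the generalized preconditions wrongly admit.

```python
from collections import defaultdict


def learn_dependency_aware_1d(states):
    """One numeric interval (1-D convex hull) per propositional configuration."""
    hulls = defaultdict(lambda: (float("inf"), float("-inf")))
    for *props, x in states:
        lo, hi = hulls[tuple(props)]
        hulls[tuple(props)] = (min(lo, x), max(hi, x))
    return dict(hulls)


def depend_applicable(hulls, state):
    """Applicable iff the configuration was observed AND the numeric value
    lies in that configuration's own interval."""
    *props, x = state
    interval = hulls.get(tuple(props))
    return interval is not None and interval[0] <= x <= interval[1]


hulls = learn_dependency_aware_1d([(True, False, 10), (True, True, 15)])
assert depend_applicable(hulls, (True, False, 10))
assert not depend_applicable(hulls, (True, False, 15))  # Proof 3's s3: rejected
```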
      <p>Accordingly, the dependency-aware method ensures both soundness and completeness of planning in
N3PCP domains, provided that the conditions (i)-(iii) are satisfied (cf. RQ2).</p>
    </sec>
    <sec id="sec-8">
      <title>8. Conclusion</title>
      <p>In this paper, we addressed the challenge of learning action preconditions in complex real-world domains
characterized by hybrid, i.e., discrete and continuous, state spaces, using only observational data from
states in which the actions were executed. We introduced a novel dependency-aware approach that
constructs separate convex hulls in the numerical subspace for each discrete state configuration, thereby
explicitly modeling the dependencies between discrete and continuous components of the state space. In
contrast to methods determining the discrete preconditions based on conservative learning paradigms
to ensure soundness, the proposed approach guarantees completeness with respect to the discrete state
configurations within the training data. Moreover, by capturing structural dependencies between the
discrete and continuous components, the approach also guarantees soundness, under the assumption
that the valid numerical states associated with each discrete configuration form a convex region.</p>
      <p>Future work will focus on addressing the computational challenges introduced by the convex-hull
representations, which significantly impact planner performance in large-scale domains. We also plan
to conduct a broader empirical evaluation across diverse real-world domains and integrate the learned
models into planning systems capable of handling N3PCP problems.</p>
    </sec>
    <sec id="sec-9">
      <title>Acknowledgments</title>
      <p>This research as part of the projects EKI and LaiLa is funded by dtec.bw – Digitalization and Technology
Research Center of the Bundeswehr, which we gratefully acknowledge. dtec.bw is funded by the
European Union – NextGenerationEU.</p>
    </sec>
    <sec id="sec-10">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used ChatGPT in order to: Grammar and spelling
check, Paraphrase and reword. After using this tool/service, the author reviewed and edited the content
as needed and take full responsibility for the publication’s content.
[8] A. Köcher, R. Heesch, N. Widulle, A. Nordhausen, J. Putzke, A. Windmann, O. Niggemann,
A research agenda for ai planning in the field of flexible production systems, 2022 IEEE 5th
International Conference on Industrial Cyber-Physical Systems (ICPS) (2022) 1–8.
[9] E. Jarvenpaa, N. Siltala, M. Lanz, Formal resource and capability descriptions supporting rapid
reconfiguration of assembly systems, in: 2016 IEEE International Symposium on Assembly and
Manufacturing (ISAM), IEEE, 2016. doi:10.1109/isam.2016.7750724.
[10] H. Kagermann, J. Helbig, A. Hellinger, W. Wahlster, Recommendations for implementing the
strategic initiative INDUSTRIE 4.0: Securing the future of German manufacturing industry; final
report of the Industrie 4.0 Working Group, Forschungsunion, 2013.
[11] S. J. Russell, P. Norvig, Artificial intelligence: a modern approach, Pearson, 2016.
[12] A. Mordoch, B. Juba, R. Stern, Learning safe numeric action models, in: Proceedings of the AAAI</p>
      <p>Conference on Artificial Intelligence, volume 37, 2023, pp. 12079–12086.
[13] D. Aineto, E. Scala, Action Model Learning with Guarantees, in: Proceedings of the 21st
International Conference on Principles of Knowledge Representation and Reasoning, 2024, pp. 801–811.
doi:10.24963/kr.2024/75.
[14] A. Mordoch, E. Scala, R. Stern, B. Juba, Safe learning of PDDL domains with conditional effects, in:
Proceedings of the International Conference on Automated Planning and Scheduling, volume 34,
2024, pp. 387–395.
[15] J. Á. Segura-Muros, R. Pérez, J. Fernández-Olivares, Discovering relational and numerical
expressions from plan traces for learning action models, Applied Intelligence 51 (2021) 7973–7989.
[16] J. Á. Segura-Muros, J. Fernández-Olivares, R. Pérez, Learning numerical action models from noisy
input data, arXiv preprint arXiv:2111.04997 (2021).
[17] B. Juba, H. S. Le, R. Stern, Safe learning of lifted action models, in: Proceedings of the International
Conference on Principles of Knowledge Representation and Reasoning, volume 18, 2021, pp.
379–389.
[18] A. Bunte, P. Wunderlich, N. Moriz, P. Li, A. Mankowski, A. Rogalla, O. Niggemann, Why symbolic
ai is a key technology for self-adaption in the context of cpps, in: 2019 24th IEEE International
Conference on Emerging Technologies and Factory Automation (ETFA), IEEE, 2019, pp. 1701–1704.
[19] D. van Dalen, Logic and structure (3. ed.), Universitext, Springer, 1994.
[20] A. Cimatti, A. Griggio, S. Mover, M. Roveri, S. Tonetta, Verification modulo theories, Formal</p>
      <p>Methods Syst. Des. 60 (2022) 452–481.
[21] Z. Manna, A. Pnueli, The modal logic of programs, in: ICALP, volume 71 of Lecture Notes in</p>
      <p>Computer Science, Springer, 1979, pp. 385–409.
[22] A. Biere, A. Cimatti, E. M. Clarke, O. Strichman, Y. Zhu, Bounded model checking, Adv. Comput.</p>
      <p>58 (2003) 117–148.
[23] M. Helmert, Decidability and undecidability results for planning with numerical state variables.,
in: AIPS, 2002, pp. 44–53.
[24] D. Vranješ, J. Ehrhardt, R. Heesch, L. Moddemann, H. S. Steude, O. Niggemann, Design principles for
falsifiable, replicable and reproducible empirical machine learning research, in: 35th International
Conference on Principles of Diagnosis and Resilient Systems (DX 2024), Schloss Dagstuhl–Leibniz-Zentrum
für Informatik, 2024, pp. 7–1.
[25] A. Micheli, A. Bit-Monnot, G. Röger, E. Scala, A. Valentini, L. Framba, A. Rovetta, A. Trapasso,
L. Bonassi, A. E. Gerevini, L. Iocchi, F. Ingrand, U. Köckemann, F. Patrizi, A. Saetti, I. Serina,
S. Stock, Unified planning: Modeling, manipulating and solving ai planning problems in python,
SoftwareX 29 (2025) 102012. doi:10.1016/j.softx.2024.102012.
[26] A. Gerevini, A. Saetti, I. Serina, An empirical analysis of some heuristic features for planning
through local search and action graphs, Fundam. Inform. 107 (2011) 167–197. doi:10.3233/FI-2011-399.
[27] A. Valentini, A. Micheli, A. Cimatti, Temporal planning with intermediate conditions and effects,
in: Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, 2020, pp. 9975–9982.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Ghallab</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Nau</surname>
          </string-name>
          , P. Traverso,
          <source>Automated Planning and Acting</source>
          , Cambridge University Press,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
            <surname>Ehrhardt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Heesch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Niggemann</surname>
          </string-name>
          ,
          <article-title>Learning process steps as dynamical systems for a subsymbolic approach of process planning in cyber-physical production systems</article-title>
          ,
          <source>in: Artificial Intelligence. ECAI 2023 International Workshops</source>
          , Springer Nature Switzerland, Cham,
          <year>2024</year>
          , pp.
          <fpage>332</fpage>
          -
          <lpage>345</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Hausknecht</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Stone</surname>
          </string-name>
          ,
          <article-title>Deep reinforcement learning in parameterized action space</article-title>
          ,
          <year>2016</year>
          . doi:10.48550/ARXIV.1511.04143.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>R.</given-names>
            <surname>Heesch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Cimatti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ehrhardt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Diedrich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Niggemann</surname>
          </string-name>
          ,
          <article-title>A lazy approach to neural numerical planning with control parameters</article-title>
          ,
          <source>in: Frontiers in Artificial Intelligence and Applications</source>
          , volume
          <volume>392</volume>
          :
          <source>ECAI</source>
          <year>2024</year>
          ,
          <year>2024</year>
          . doi:10.3233/FAIA241000.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>E.</given-names>
            <surname>Savaş</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Fox</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Long</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Magazzeni</surname>
          </string-name>
          ,
          <article-title>Planning using actions with control parameters</article-title>
          ,
          <source>in: Proceedings of the Twenty-second European Conference on Artificial Intelligence</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>1185</fpage>
          -
          <lpage>1193</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Grand</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Pellier</surname>
          </string-name>
          , H. Fiorino,
          TempAMLSI:
          <article-title>Temporal action model learning based on STRIPS translation</article-title>
          ,
          <source>Proceedings of the International Conference on Automated Planning and Scheduling</source>
          <volume>32</volume>
          (
          <year>2022</year>
          )
          <fpage>597</fpage>
          -
          <lpage>605</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>R.</given-names>
            <surname>Heesch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ehrhardt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Niggemann</surname>
          </string-name>
          ,
          <article-title>Integrating machine learning into an SMT-based planning approach for production planning in cyber-physical production systems</article-title>
          ,
          <source>in: Artificial Intelligence. ECAI 2023 International Workshops</source>
          , Springer Nature Switzerland, Cham,
          <year>2024</year>
          , pp.
          <fpage>318</fpage>
          -
          <lpage>331</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>