<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Learning Sound and Complete Preconditions in Complex Real-World Domains</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>René Heesch</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Björn Ludwig</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jonas Ehrhardt</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alexander Diedrich</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Oliver Niggemann</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>HSU-AI Institute for Artificial Intelligence, Helmut-Schmidt-University</institution>
          ,
          <addr-line>Hamburg</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Institute of Automation, Helmut-Schmidt-University</institution>
          ,
          <addr-line>Hamburg</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <abstract>
<p>In this paper, we address the problem of learning sound and complete action preconditions in complex real-world planning domains, the remaining bottleneck in the N3PCP pipeline. Such planning domains involve hybrid state spaces including discrete and numerical variables. We propose a dependency-aware learning approach that captures interdependencies between discrete and numerical variables by constructing distinct convex hulls over the numerical subspace for each discrete state configuration. This provides a more accurate representation of hybrid preconditions in real-world domains than existing approaches for learning preconditions. We empirically compare our method against two other approaches: an exact baseline method that ensures soundness but lacks completeness, and a generalized variant of N-SAM that achieves completeness but compromises soundness. We evaluate our approach across multiple planning problems and domains which are based on a real-world industrial system, demonstrating the practical benefits of our approach. An additional theoretical analysis confirms that, under standard convexity assumptions and sufficient coverage of discrete configurations within the training data, our proposed dependency-aware method guarantees both completeness and soundness.</p>
      </abstract>
      <kwd-group>
        <kwd>Planning</kwd>
        <kwd>N3PCP</kwd>
        <kwd>SMT</kwd>
        <kwd>Neural Networks</kwd>
        <kwd>Action Model Learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Automated planning is a key capability for the autonomy of intelligent systems [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Planning enables
systems to generate plans, which are action sequences that transition a system from an initial state to
a goal state [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. In many real-world domains, this involves reasoning over hybrid state spaces that
combine symbolic (discrete) and continuous (numerical) components, e.g., in robotics, autonomous
systems, and industrial manufacturing [
        <xref ref-type="bibr" rid="ref2 ref3 ref4">2, 3, 4</xref>
        ]. In such settings, the complexity of planning lies not
exclusively in the branching factor of the search space, but also in the need to determine valid control
parameters for actions and the difficulty of modeling system behavior mathematically [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>
        Traditional planning approaches depend on manually specified domain models, which define action
preconditions, effects, and causal relations. However, constructing such models in complex, dynamic,
and heterogeneous domains is time-consuming and error-prone [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. This modeling bottleneck
significantly limits the scalability and adaptability of planning systems in complex real-world domains.
      </p>
      <sec id="sec-1-1">
        <title>Neural Network–enriched Numerical Planning with Control Parameters (N3PCP)</title>
        <p>
          To address this bottleneck, N3PCP was recently introduced as a hybrid planning framework that leverages neural networks
to approximate the effects of actions based on empirical data [
          <xref ref-type="bibr" rid="ref2 ref7">7, 2</xref>
          ]. By learning effect models from
observations or simulations, N3PCP eliminates the need for explicit, hand-crafted transition models.
While early approaches employed eager and exact evaluation strategies [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ], later work adopted lazy
evaluation methods that trade off completeness for generality and computational efficiency [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. However,
both existing methods still rely on manually defined action preconditions, limiting their generality and
automation.
        </p>
        <p>In this paper, we address this remaining bottleneck in the N3PCP pipeline by proposing an approach
for learning action preconditions from data. We focus on domains where only observational data are
available, i.e., states to which actions have been successfully applied, without access to examples of
invalid plans. This leads to our first research question:</p>
        <p>RQ1: Is it possible to learn action preconditions in real-world domains exclusively from observations
of the states in which an action was applied?</p>
        <p>
          To motivate this question, we consider a concrete and challenging application: production planning
in modular manufacturing systems (cf. Figure 1). In particular, we focus on a modular real-world system
that refines automotive glass components via polyurethane foaming. Modular manufacturing systems
require the coordinated execution of production steps, each guided by machine-level control parameters,
to reliably transform raw materials into finished products [
          <xref ref-type="bibr" rid="ref7">8, 7</xref>
          ]. The hybrid nature of the task, combining
symbolic reasoning and numerical parameter identification, makes it a typical real-world use
case for automated planning frameworks like N3PCP [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. Moreover, rapid reconfiguration is essential
to cope with increasing product variability and reduced batch sizes [9, 10].
        </p>
        <p>In such hybrid domains, two properties are essential: soundness, which ensures that any generated
plan is executable, and completeness, which ensures that all valid plans can be found [11]. These
properties depend not only on the planner but also on the quality of the domain model, i.e., the learned
action preconditions. This raises our second research question:</p>
        <p>RQ2: How does precondition learning afect the soundness and completeness of planning algorithms
in N3PCP domains?</p>
        <p>To address these questions, we propose a novel approach to learn dependency-aware preconditions
from observational data. We compare this approach with two other approaches to precondition learning:
an exact, memorization-based baseline that assures soundness, and a variant of N-SAM [12] that aims for
completeness. We assess the methods in two ways: empirically, using data from an industrial production
system, and theoretically, via a formal analysis of their soundness and completeness.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        Automated planning relies on formal domain descriptions. These descriptions include action models,
which specify the conditions that must be satisfied in a state to make an action applicable to this state,
i.e., the preconditions, and how the action transforms the state, i.e., the effects. As manually
creating such descriptions is tedious and error-prone [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], multiple approaches proposed learning action
models based on observations from plan executions [
        <xref ref-type="bibr" rid="ref6">13, 12, 14, 15, 16, 6</xref>
        ]. Yet, most existing approaches
are restricted to propositional representations, and only limited work focuses on learning action models
in numerical domains.
      </p>
      <p>PlanMiner [16, 15] infers action models with numeric preconditions and effects from noisy or partial
plan traces by employing a combination of preprocessing, regression, and classification techniques
to learn logical and arithmetic relations among state variables. However, the approach lacks formal
guarantees on soundness: a plan found using the learned model may not be valid in the real world.</p>
      <p>Addressing this limitation, N-SAM [12] extends the propositional SAM framework to numeric
domains by learning safe numeric action models, adopting a two-stage learning process. The safe
action models guarantee that generated plans execute as expected, with a total deviation in numeric
effects bounded by at most a threshold ε ∈ ℝ. The algorithm first infers propositional preconditions
using techniques for classical planning problems [17]. In detail, to ensure soundness, the propositional
preconditions are conservatively defined, i.e., as the intersection of all observed applicable states, which
guarantees that no invalid plans are produced. Then it defines numeric preconditions by constructing a
convex hull over the numeric values observed in all states where the action was applied and expressing
this hull as a set of linear inequalities. The convex hull defines a conservative approximation of the
region in which the action is known to be applicable. However, learning a convex hull for an action
with n numeric parameters requires at least n + 1 independent samples. Furthermore, the algorithm
requires information about the set of numeric state variables involved in each action’s preconditions
and effects.</p>
      <p>N-SAM* [14] enhances the N-SAM algorithm by reducing the sample requirement to one observation
per action and by removing the requirement of prior knowledge about the involved state variables, while still
maintaining the previously introduced safety guarantee. The authors build a linear combination of the
observed samples using the Gram-Schmidt process, thereby enabling preconditions to be learned from
fewer than n + 1 samples for an action with n numerical parameters.</p>
      <p>Both neural networks and control parameters affect only action effects, not preconditions. Hence,
although N-SAM [12] and N-SAM* [14] were designed for numeric planning domains, their
precondition-learning methods could, in principle, be adapted to N3PCP problems.</p>
      <p>However, these approaches learn models expressed in the Planning Domain Definition Language
(PDDL), which has been shown to lack the expressiveness required for complex real-world applications
such as production planning [18]. Moreover, their methodology treats propositional and numerical
variables independently during precondition learning. As a result, they cannot capture cross-dependencies
between these variable types, e.g., for cases where the feasible numeric state space is further constrained
by specific propositional configurations. Additionally, the conservative strategy for learning the
propositional preconditions leads to a loss of completeness, potentially excluding valid behaviors that were
simply not observed in all instances.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Background</title>
      <p>All planning approaches to Neural Network-enriched Numerical Planning with Control Parameters
(N3PCP) are grounded in the planning as satisfiability paradigm, specifically by using Satisfiability
Modulo Theories (SMT) as the underlying computational framework.</p>
      <sec id="sec-3-1">
        <title>3.1. Logical Preliminaries</title>
        <p>We operate within the setting of many-sorted first-order logic. Terms are either constants,
individual variables, or applications of n-ary function symbols to n terms. Atoms are either propositional
variables or n-ary predicate symbols applied to n terms. Formulae are constructed from atoms using
standard Boolean connectives (¬, ∧, ∨, →, ↔) and quantifiers (∀, ∃) applied to one or more variables and
a subformula. We follow the standard semantic terminology of interpretation, model, satisfiability, and
validity [19].</p>
        <p>
          In SMT, the interpretation of a given set of symbols is constrained by a given background theory, e.g.,
the theory of nonlinear arithmetic with transcendental functions (NTA). To solve N3PCP problems,
numerical constants, real-valued variables, standard function symbols +, −, ×, ÷, comparison predicates
&lt;, ≤, &gt;, ≥, and transcendental functions such as tan are considered [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ].
        </p>
        <p>For planning, we employ a symbolic representation of infinite-state transition systems [20]. A
transition system is defined as the tuple ⟨X, I(X), T(X, X′)⟩, where:
• X is a set of state variables;
• I(X) is an SMT formula characterizing the set of initial states (i.e., a state s is initial if s ⊧ I);
• T(X, X′) is the transition relation, with X denoting the state before and X′ denoting the state after
the transition.</p>
        <p>A transition from state s to s′ exists if the combined valuation over X ∪ X′ that assigns X according to s
and X′ according to s′ satisfies T, i.e., (s, s′) ⊧ T. In this case, s′ is a successor of s, and s a predecessor of
s′. In a transition system Γ, a trace is a sequence of states s₀, s₁, … such that s₀ ⊧ I and for all i, sᵢ₊₁ is a
successor of sᵢ. To express properties over such traces, we use Linear Temporal Logic [21]. The model
checking problem Γ ⊧ φ states that all traces of Γ satisfy a temporal formula φ.</p>
        <p>Among various verification techniques, we focus on Bounded Model Checking (BMC) [22], which
reduces the reachability problem to satisfiability over a bounded horizon. BMC proceeds by unrolling
the transition relation for k steps. The following SMT formula is constructed:
I(s₀) ∧ T(s₀, s₁) ∧ … ∧ T(sₖ₋₁, sₖ) ∧ G(sₖ),
(1)
where G(sₖ) specifies the goal condition. The formula is satisfiable if there exists a trace of k transitions
from the initial state to a state satisfying G. A satisfying assignment to the variables s₀, s₁, …, sₖ
represents the corresponding plan.</p>
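        <p>To illustrate the unrolling in Eq. (1), the following minimal sketch enumerates bounded traces of a toy transition system by brute force instead of calling an SMT solver; the system, its actions, and all names are invented for this example.</p>
        <preformat>
```python
from itertools import product

# Toy transition system (all names invented): state = (mode, level).
# "toggle" flips the mode; "fill" raises the level while the mode is on.
STATES = [(m, l) for m in ("off", "on") for l in range(4)]
ACTIONS = ("toggle", "fill")

def initial(s):            # I(s0): the system starts switched off and empty
    return s == ("off", 0)

def trans(s, a, t):        # T(s, s'): labelled transition relation
    mode, lvl = s
    if a == "toggle":
        return t == ("on" if mode == "off" else "off", lvl)
    if a == "fill":
        return mode == "on" and lvl in (0, 1, 2) and t == (mode, lvl + 1)
    return False

def goal(s):               # G(sk): reach at least level 2
    return s[1] >= 2

def bmc(k):
    """Check I(s0) ∧ T(s0,s1) ∧ … ∧ T(s_{k-1},s_k) ∧ G(s_k) by enumeration."""
    for states in product(STATES, repeat=k + 1):
        if not (initial(states[0]) and goal(states[k])):
            continue
        for acts in product(ACTIONS, repeat=k):
            if all(trans(states[i], acts[i], states[i + 1]) for i in range(k)):
                return list(acts)
    return None

print(bmc(2))  # None: no 2-step plan exists
print(bmc(3))  # ['toggle', 'fill', 'fill']
```
        </preformat>
        <p>In practice, an SMT solver replaces this enumeration; the shape of the unrolled formula is the same.</p>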
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Real-World State Spaces</title>
        <p>In AI planning with propositional and numeric variables, the state space factors as the Cartesian product
of a propositional valuation space and a numeric assignment space [23]. Let P be the set of propositional
variables and X the set of numeric variables. Define
Val(P) ∶= {ν ∣ ν ∶ P → {⊥, ⊤}}
(2)
and
Ω ∶= ∏_{x∈X} Dₓ ⊆ ℝ^|X|,
(3)
where each Dₓ ⊆ ℝ is typically a closed interval. A state is a pair (ν, ω) with ν ∈ Val(P) and ω ∈ Ω, and
the state space is
S = Val(P) × Ω.
(4)</p>
        <p>If the real-world state space also includes finite discrete variables D = {dᵢ}ᵢ∈I with domains Dᵢ, these
are encoded into the propositional part via a standard injective encoding, obtaining an expanded set P′
and a bijection ∏ᵢ∈I Dᵢ × Val(P) ≅ Val(P′). Hence, w.l.o.g., we use the factored state space of Eq. (4) and refer only to the
propositional part.</p>
        <p>Definition 1. For u, v ∈ ℝⁿ, the line segment [u, v]
is [u, v] ∶= {(1 − λ)u + λv ∣ λ ∈ [0, 1]}.</p>
        <p>For states (ν, u), (ν, v) ∈ S with the same propositional valuation ν, the corresponding segment in S is
{(ν, (1 − λ)u + λv) ∣ λ ∈ [0, 1]}. If the propositional parts differ, no convex combination is defined in S.</p>
        <p>Definition 2. Let A ⊆ Val(P) × ℝⁿ. For ν ∈ Val(P), define the numeric fiber
A(ν) ∶= {ω ∈ ℝⁿ ∣ (ν, ω) ∈ A}. We say that A is fiberwise convex if each A(ν) is convex in ℝⁿ.</p>
        <p>In many applications, numeric variables are bounded physical quantities, i.e., ω ∈ Ω = ∏_{x∈X} Dₓ with
intervals Dₓ = [ℓₓ, uₓ]. Each Dₓ is convex, and thus Ω is an axis-aligned box and convex. Accordingly, we
assume throughout that for every fixed propositional assignment ν, the feasible numeric fiber A(ν)
is convex. This guarantees that for any two feasible numeric states under the same ν, their linear
interpolations remain feasible.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Problem Definition</title>
      <p>
        Following the formalization of [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], the Neural Network-enriched Numerical Planning with Control
Parameters (N3PCP) problem is defined as a tuple
Π = ⟨P, X, A, Θ, F, E, I, G⟩,
where each component contributes to describing hybrid planning tasks that involve both symbolic and
numerical reasoning under parametric control. This formalism is particularly motivated by real-world
applications such as production planning in modular manufacturing systems [
        <xref ref-type="bibr" rid="ref7">8, 7</xref>
        ], where tasks involve
discrete process steps and machine-level parameter tuning.
      </p>
      <p>Let P denote the set of propositional (Boolean) state variables and X the set of numerical state
variables, with |X| = n, n ∈ ℕ. Each state s ∈ S is characterized by a valuation of the propositional variables
in P and of the real-valued (or bounded-precision) variables in X. We assume a fixed ordering of X.
A denotes the finite set of actions, and Θ the set of control parameters associated with these actions.</p>
      <p>
        Each action models a transition of the system from one state to another, influenced by a specific valuation
of the parameters Θ. In the context of production systems, actions represent different production steps
such as cutting, assembling, or coating, while the control parameters correspond to machine-level
settings like feed rate, temperature, or pressure [
        <xref ref-type="bibr" rid="ref2 ref7">7, 2</xref>
        ].
      </p>
      <p>Each action a ∈ A is specified in one of two ways:
• A symbolic transition function fₐ ∈ F, represented as an SMT formula fₐ(sᵢ, Θ, sᵢ₊₁) over the
current state sᵢ, the control parameters Θ, and the successor state sᵢ₊₁.
• An executable transition function eₐ ∈ E, which is a loop-free imperative program that computes
a successor state sᵢ₊₁ given (sᵢ, Θ). In practice, this may be implemented by a neural network
trained to approximate physical or empirical system dynamics, or by a simulation of a particular
production step.</p>
      <p>The symbolic representation allows for logical reasoning, constraint satisfaction, and artificial data
generation. The executable form enables forward simulation, including black-box models where
analytical descriptions are infeasible.</p>
      <p>The control parameters Θ are subject to domain constraints. Let ValidCPₐ(Θ) denote the admissible
space of control parameter valuations for action a. These constraints may be, for instance, interval
bounds:
l ≤ θ ≤ u, ∀θ ∈ Θ,
(5)
encoding physical or safety limitations, e.g., torque limits and temperature ranges in the context of
production systems, where l describes the lower bound and u the upper bound, l, u ∈ ℝ.</p>
      <p>Additionally, not all actions are applicable in all states. We define the applicability set of an action
a ∈ A as:
S_in(a) = {s ∈ S ∣ preₐ(s) = true},
(6)
where preₐ ∶ S → {true, false} is a precondition predicate that determines whether the action a can be
applied in state s. These predicates may be manually specified or learned from empirical execution data,
such as successful and failed application attempts of a across different states.</p>
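      <p>A precondition predicate and the applicability set it induces can be sketched as follows; the action name, state variables, and bounds are hypothetical and serve only to illustrate the definitions above.</p>
      <preformat>
```python
# Hypothetical precondition predicate pre_a for a "coat" action: the part must
# be cleaned (propositional) and the surface temperature must lie in a range
# (numeric). All variable names and bounds are invented for illustration.
def pre_coat(state):
    return state["cleaned"] and state["temp_c"] >= 18.0 and 35.0 >= state["temp_c"]

def valid_cp_coat(theta):
    # Interval bounds l ≤ θ ≤ u on a control parameter (feed rate)
    return theta["feed_rate"] >= 0.1 and 2.5 >= theta["feed_rate"]

def s_in(pre, states):
    """Applicability set S_in(a) = {s ∈ S ∣ pre_a(s) = true}."""
    return [s for s in states if pre(s)]

observed = [
    {"cleaned": True, "temp_c": 21.0},   # applicable
    {"cleaned": False, "temp_c": 21.0},  # not cleaned
    {"cleaned": True, "temp_c": 80.0},   # temperature out of range
]
print(s_in(pre_coat, observed))  # keeps only the first state
```
      </preformat>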
      <p>Moreover, we assume that the symbolic and executable transition functions fₐ and eₐ are defined
and valid only on their respective domains of applicability:
dom(fₐ), dom(eₐ) ⊆ S_in(a) × ValidCPₐ.
(7)
This ensures that logical inference and simulation are only performed under feasible conditions.</p>
      <p>I denotes the set of initial state conditions, encoded as an SMT formula over P and X, describing
feasible system configurations at the outset of planning. The goal condition G is similarly an SMT
formula over P and X, specifying the desired properties of the final state. Unlike classical planning with
discrete goals, N3PCP problems allow goal descriptions and action preconditions over continuous
state variables. These are not expressed as equalities but as sets of inequalities, where the numerical
values have to lie within a certain range, reflecting the properties of real-world domains, e.g., target
dimensions and quality metrics.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Solution</title>
      <p>In this section, we introduce our novel solution for learning dependency-aware preconditions solely from
observations of the planning domain. We derive our dependency-aware approach from two baseline
approaches, one emphasizing exclusively the soundness of the learned preconditions, and the other
emphasizing their completeness. While the first baseline is the most restrictive
and conservative one, with the highest possible incompleteness, the second baseline is the most optimistic
approach, aiming for completeness at the cost of soundness. Our proposed dependency-aware approach
is, under reasonable assumptions, both sound and complete.</p>
      <p>
        For all approaches, a set 𝒟ₐ of observed applicable states {s₁, …, sₘ} for an action a is given. Following
the vector-based state-space representation presented in Heesch et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], we represent the assignments
to all variables in P and X, i.e., the states, in the form of vectors.
      </p>
      <sec id="sec-5-1">
        <title>5.1. Soundness First: Learning Exact Preconditions</title>
        <p>Our first baseline is sound with respect to the training data. Given the observed applicable states
𝒟ₐ = {s₁, …, sₘ} ⊆ S, we define the exact precondition as a pure enumeration of those states:
φ_exact = ⋁_{i=1}^{m} cᵢ,
(8)
where each clause cᵢ is the conjunction of the variable assignments of state sᵢ.</p>
        <p>This DNF exactly accepts the training examples and rejects all others. The corresponding algorithm
is given in Alg. 1.</p>
        <p>Algorithm 1 Learn Exact Preconditions
Require: Dataset 𝒟ₐ = {s₁, s₂, …, sₘ} of states s ∈ S where action a was applied
Ensure: Symbolic precondition φ_exact
1: Initialize φ_exact ← ∅
2: for all sᵢ ∈ 𝒟ₐ do
3:   Construct clause cᵢ ← ⋀_{v∈P∪X} (v = sᵢ[v])
4:   φ_exact ← φ_exact ∨ cᵢ
5: end for
6: return φ_exact</p>
        <p>The algorithm first initializes an empty clause φ_exact (cf. Alg. 1 l. 1). Subsequently, the algorithm
iterates through the whole dataset 𝒟ₐ. For each state sᵢ, a clause cᵢ is constructed (cf. Alg. 1 l. 2). The
clause cᵢ is a conjunction assigning each variable in P and X the corresponding value of sᵢ (cf. Alg. 1 l.
3). This clause is then added to the clause φ_exact via a disjunction (cf. Alg. 1 l. 4). At the end, the clause
φ_exact is returned (cf. Alg. 1 l. 6).</p>
        <p>While Alg. 1 may be reasonable in propositional domains, it becomes problematic in continuous or
hybrid domains. In such domains, this method is unlikely to provide a complete representation, as it
only enumerates the observed states from the data set. It cannot cover the infinite space of possible
continuous values. As a result, φ_exact is sound, but its completeness with respect to possible unseen but valid
states is limited to the dataset and to propositional domains.</p>
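        <p>A minimal sketch of Alg. 1 in Python, representing each clause as a dictionary of variable assignments; the state variables are invented for illustration.</p>
        <preformat>
```python
# Minimal sketch of Alg. 1: phi_exact is a DNF that enumerates the observed
# states; a clause is modelled as a dict of variable assignments. The state
# variables below are illustrative, not taken from the real system.
def learn_exact(dataset):
    phi_exact = []                 # l. 1: empty disjunction
    for s in dataset:              # l. 2: iterate over the dataset
        clause = dict(s)           # l. 3: conjunction v = s[v] for all v
        phi_exact.append(clause)   # l. 4: phi_exact ∨ clause
    return phi_exact               # l. 6

def holds(phi_exact, state):
    # The DNF accepts exactly the training states and rejects all others.
    return any(state == clause for clause in phi_exact)

D_a = [{"valve": True, "pressure": 1.2}, {"valve": True, "pressure": 1.5}]
phi = learn_exact(D_a)
print(holds(phi, {"valve": True, "pressure": 1.2}))  # True: observed
print(holds(phi, {"valve": True, "pressure": 1.3}))  # False: unseen, though possibly valid
```
        </preformat>
        <p>The second query makes the incompleteness concrete: a numerically close but unobserved state is rejected.</p>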
      </sec>
      <sec id="sec-5-2">
        <title>5.2. Completeness First: Generalized Preconditions</title>
        <p>Our second baseline approximates the preconditions by generalizing from observed data, emphasizing
completeness rather than soundness. This involves inferring broader parts of the state space from
sampled data, e.g., intervals or convex regions, over the numerical variables X. This relaxation may come at the
expense of soundness when the space of applicable states is not convex, e.g., some states which satisfy
φ_general may not allow action a in practice.</p>
        <p>Let xᵢ ∈ X be a continuous state variable. If the action a was applied in states s⁰ and s¹ where all other
variables are equal and only xᵢ differs, and if a was successfully executed in both cases, we assume a is
applicable to any state s satisfying:
xᵢ(s) ∈ [xᵢ(s⁰), xᵢ(s¹)],
(9)
where xᵢ(s) denotes the assignment to xᵢ in state s.</p>
        <p>Building on this idea and inspired by Mordoch et al. [12], we separate propositional and numerical
variables when learning preconditions. For the numerical variables, we calculate the convex hull of the
numerical components of the states in the data set 𝒟ₐ of successful executions of a. The linear inequalities
of the hull’s supporting hyperplanes then define the numerical precondition. Unlike Mordoch et al.
[12], our method does not require prior manual identification of action-relevant numeric variables. For
propositional variables, we use a straightforward disjunctive strategy to increase completeness: we
collect the distinct propositional assignments under which a was observed, and take their disjunction
as the propositional precondition.</p>
        <p>Algorithm 2 Learn Generalized Preconditions
Require: Dataset 𝒟ₐ of states s where action a was applied
Ensure: Symbolic precondition φ_general
1: Initialize Φ ← ∅
2: Initialize matrix M ← [ ]
3: for all s ∈ 𝒟ₐ do
4:   νₛ ← propositional subvector of s
5:   if νₛ ∉ Φ then
6:     Φ ← Φ ∪ {νₛ}
7:   end if
8:   Append numerical subvector of s to M
9: end for
10: φ(P) ← ⋁_{νᵢ∈Φ} νᵢ
11: H ← ConvexHull(M)
12: φ(X) ← GetHalfspaceInequalities(H)
13: φ_general ← φ(P) ∧ φ(X)
14: return φ_general</p>
        <p>The algorithm begins by initializing two data structures: an empty set Φ to collect all unique
propositional configurations observed in the dataset 𝒟ₐ (cf. Alg. 2, l. 1), and an empty matrix M to store
the numerical components of the corresponding states (cf. Alg. 2, l. 2).</p>
        <p>For each state s ∈ 𝒟ₐ, the propositional subvector νₛ is extracted (cf. Alg. 2, l. 4). If this configuration
has not yet been encountered, it is added to Φ (cf. Alg. 2, l. 5–7). Simultaneously, the numerical
subvector is appended to the matrix M for subsequent processing (cf. Alg. 2, l. 8).</p>
        <p>Once all states have been processed, the propositional precondition φ(P) is constructed as a disjunction
over the elements of Φ (cf. Alg. 2, l. 10):
φ(P) = ⋁_{νᵢ∈Φ} νᵢ,
(10)
ensuring that the resulting precondition evaluates to true precisely for the propositional configurations
under which the action has been previously observed. This avoids the overly conservative strategy of
assuming propositional variables must be fixed across all samples, as in [12].</p>
        <p>Subsequently, a convex hull H is computed over the matrix M containing the numerical state
components (cf. Alg. 2, l. 11). The set of defining linear inequalities for H is then extracted and used to define
the numerical precondition φ(X) (cf. Alg. 2, l. 12). The final action precondition φ_general is returned as
the conjunction of its propositional and numerical components (cf. Alg. 2, l. 13–14).</p>
        <p>Using a convex hull to describe the numerical component reduces incompleteness by generalizing
over observed numeric values, while maintaining soundness under the assumption that preconditions
are independent of unobserved propositional contexts. However, this assumption introduces a potential
loss of soundness when the propositional and numerical components are not independent. Specifically,
by applying a single convex hull over all numerical data points, without accounting for the associated
propositional configurations, the approach cannot capture cross-dependencies, such as when certain
propositional conditions constrain the feasible numeric subspace. Consequently, the combined
precondition φ_general = φ(P) ∧ φ(X) may admit states that were never observed together in the data, potentially
leading to plans with actions that could not be applied.</p>
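        <p>The generalized scheme can be sketched for two numeric variables as follows; the hull is computed with the monotone-chain algorithm for self-containment (a library routine such as scipy.spatial.ConvexHull would be used in practice), and all state variables are invented. The last query also makes the soundness caveat concrete: a configuration and a numeric point that never co-occurred in the data are still accepted.</p>
        <preformat>
```python
# Sketch of the generalized precondition for two numeric variables: the
# propositional part is the disjunction of observed configurations, the
# numeric part the convex hull of all observed numeric points. Assumes at
# least three affinely independent numeric samples, mirroring the n + 1
# sample requirement discussed in the text. All names are invented.
def cross(o, a, b):
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(pts):
    # Andrew's monotone chain; returns vertices in counter-clockwise order
    pts = sorted(set(pts))
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and 0 >= cross(lower[-2], lower[-1], p):
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and 0 >= cross(upper[-2], upper[-1], p):
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def in_hull(hull, p):
    # p must lie on the inner side of every edge (each edge = one halfspace)
    n = len(hull)
    return all(cross(hull[i], hull[(i + 1) % n], p) >= 0 for i in range(n))

def learn_general(dataset):
    configs = {tuple(sorted(s["prop"].items())) for s in dataset}  # Phi
    hull = convex_hull([s["num"] for s in dataset])                # H
    def phi(state):
        return tuple(sorted(state["prop"].items())) in configs and \
               in_hull(hull, state["num"])
    return phi

D_a = [
    {"prop": {"on": True}, "num": (0.0, 0.0)},
    {"prop": {"on": True}, "num": (4.0, 0.0)},
    {"prop": {"on": False}, "num": (0.0, 4.0)},
]
phi_gen = learn_general(D_a)
print(phi_gen({"prop": {"on": True}, "num": (1.0, 1.0)}))   # True: inside hull
print(phi_gen({"prop": {"on": True}, "num": (5.0, 5.0)}))   # False: outside hull
print(phi_gen({"prop": {"on": False}, "num": (3.0, 0.5)}))  # True, yet never observed together
```
        </preformat>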
      </sec>
      <sec id="sec-5-3">
        <title>5.3. Dependency-Aware Preconditions</title>
        <p>To address the drawbacks of our two baselines, we introduce our novel dependency-aware precondition
formulation. Instead of treating propositional and continuous constraints independently, we model the
precondition as a conditional structure. Let νₛ and ωₛ denote the propositional and numeric subvectors
of the state s, respectively. Let Φ ⊆ dom(νₛ) be the set of observed propositional configurations (from
states where action a was applied). For each νᵢ ∈ Φ, we fit a convex region C(νᵢ) over the numeric
components observed under configuration νᵢ.</p>
        <p>Formally, the dependency-aware precondition for action a is defined as:
φ_depend = ⋁_{νᵢ∈Φ} (νᵢ ∧ C(νᵢ)),
(11)
where νᵢ is the propositional component and C(νᵢ) the numeric component. The numeric condition
C(νᵢ) is defined by the set of linear inequalities (i.e., halfspaces) that describe
the convex hull over all numeric vectors observed in states with the propositional configuration νᵢ.</p>
        <p>Algorithm 3 Dependency-Aware Preconditions
Require: Dataset 𝒟ₐ of states s where action a was applied; propositional indices ℐ_P, numeric indices ℐ_X
Ensure: Symbolic precondition φ_depend
1: Initialize φ_depend ← ∅
2: Initialize map G ∶ ν ↦ list of numeric vectors
3: for all s ∈ 𝒟ₐ do
4:   ν ← s[ℐ_P]
5:   ω ← s[ℐ_X]
6:   Append ω to G[ν]
7: end for
8: Initialize list clauses ← [ ]
9: for all νᵢ ∈ keys(G) do
10:   M ← G[νᵢ]
11:   H ← ConvexHull(M)
12:   C(νᵢ) ← GetHalfspaceInequalities(H)
13:   φ(νᵢ) ← conjunction of variable assignments matching νᵢ
14:   Append (φ(νᵢ) ∧ C(νᵢ)) to clauses
15: end for
16: return φ_depend ← ⋁_{c∈clauses} c</p>
        <p>The algorithm begins by partitioning the dataset 𝒟ₐ into groups of states that exhibit identical
propositional configurations (cf. Alg. 3, l. 2–7). For each such group, a convex hull is computed over
the continuous components of the state vectors (cf. Alg. 3, l. 10–11), and the corresponding set of linear
inequalities C(νᵢ) is derived (cf. Alg. 3, l. 12). These inequalities define the feasible numeric region
associated with the propositional configuration νᵢ. The propositional part φ(νᵢ) is constructed as a
conjunction of variable assignments that match configuration νᵢ (cf. Alg. 3, l. 13). The two components
are then conjoined into a single clause that is added to φ_depend as a disjunction (cf. Alg. 3, l. 14–16).</p>
        <p>By explicitly associating each propositional configuration with a corresponding region in the numeric
state space, this approach avoids the conservative treatment of propositional variables required by
other approaches [17, 12]. As a result, it mitigates the incompleteness typically introduced by requiring
propositional preconditions to hold uniformly across all observed states. Simultaneously, under the
assumption that the numeric values observed under each configuration form a convex region, the
approach preserves soundness, ensuring that plans generated using the learned model remain executable
in the real domain.</p>
        <p>We recognize several limitations of this method. Most notably, constructing convex hulls requires a
suficient number of observed state samples for each propositional configuration in which the action
was applied. Furthermore, the complexity of the resulting preconditions can increase the
computational complexity, both during model learning and at planning time, where more detailed applicability
conditions must be evaluated.</p>
        <p>Nonetheless, in domains where complexity arises from the of actions rather than the number of
objects, e.g., production planning [8], capturing these dependencies is critical for deriving both sound
and complete action models. Importantly, we assume that precondition learning is performed entirely
ofline, prior to planning. This design choice alleviates concerns about runtime overhead, making it
feasible to deploy expressive, dependency-aware preconditions in practical planning systems without
compromising online eficiency.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Evaluation</title>
      <p>Our evaluation is twofold. To address the first research question, we analyze the performance of our
proposed approache empirically. To address the second research question, we perform a theoretical
analysis of the presented methods in terms of soundness and completeness.</p>
      <sec id="sec-6-1">
        <title>6.1. Empirical Evaluation</title>
        <p>To investigate whether data-driven methods can efectively infer action preconditions based solely on
observations of states in which the action was executed (RQ1), we focus on a representative action
from an industrial production system. Thereby, we do not only focus on learning preconditions, but
also the subsequent planning task, to account for the increasing complexity of the planning problem by
learning complex precondition structures as in the generalized and the dependency-aware approach.</p>
        <p>As evaluation metric, we chose the algorithmic runtime. Specifically, we included both, the learning
of the action preconditions and the planning with these learned preconditions in the recording of
runtime. Soundness and completeness are proven theoretically in Section 6.2.
6.1.1. Experimental Setup
We evaluated the algorithms on a real-world production system (cf. Figure 1). The production system
refines automotive glass components using polyurethane foaming. In the foaming cell, a robot positions
mechanical inserts onto a pretreated glass pane fixed in a foaming mold. This insert placement is a
critical step before the polyurethane injection and must satisfy strict constraints related to temperature
and configuration to ensure both product integrity and functional reliability.</p>
        <p>Our evaluation centers on the robot’s action model. An expert-defined model exists for each phase
of the process, including an action for placing the inserts. Our objective is to infer the preconditions of
this specific action exclusively from 1000 observations of states where the action was applied. These
observations were generated by randomly sampling state transitions from the expert model. In the
minimal complexity level, the states are described by fiteen state variables: thirteen propositional
and two numerical variables1. To assess the scalability, we introduced up to four additional numerical
state variables, incrementally adding additional preconditions to the planning domain, resulting in five
increasingly complex planning problems and domains. Following the experimental design guidelines
for Machine Learning research by [24], we generated the data for each level of domain complexity with
eight diferent seeds. For avoiding random efects, we repeated each experiment ten times. The source
1We removed the control parameters and the neural network which is usually representing the actions’ efect in N3PCP
problems to reduce the complexity of the planning problems. Including these components would significantly increase
computational complexity, and thereby obscure a meaningful comparison of the approaches to learn preconditions.
code and the action-model modeled by experts are available at https://github.com/RHeesch/Learning_
Hybrid_Preconditions.</p>
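The experimental protocol described above (several seeds per complexity level, repeated runs, runtime recording) could be organized along these lines; `run_protocol`, `learn_preconditions`, and `sample_states` are hypothetical stand-ins, not the released code.

```python
import random
import statistics
import time


def run_protocol(learn_preconditions, sample_states,
                 complexity_levels, n_seeds=8, n_repeats=10, n_obs=1000):
    """Illustrative harness: for each domain complexity level, generate data
    with several seeds and repeat each learning run to average out random
    effects. Returns mean and standard deviation of the learning runtime."""
    results = {}
    for level in complexity_levels:
        runtimes = []
        for seed in range(n_seeds):
            random.seed(seed)                      # one dataset per seed
            states = sample_states(level, n_obs)
            for _ in range(n_repeats):             # repeated measurement
                start = time.perf_counter()
                learn_preconditions(states)
                runtimes.append(time.perf_counter() - start)
        results[level] = (statistics.mean(runtimes), statistics.stdev(runtimes))
    return results
```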
        <p>We ran our experiments on a system with an AMD Ryzen 9 9950X 16-core processor and 96 GB RAM.
The planning problems, including the learned preconditions, were encoded in the Unified Planning
Framework (UPF)[25] and solved using the LPG planner [26]. We evaluated the resulting plans, which
were generated based on the learned preconditions, against the expert-defined domain model using the
UPF and the TAMER planner [27].
6.1.2. Results
Table 1 summarizes the runtimes for learning action preconditions. We report mean and standard
deviation of the runtime for each algorithm across the increasing domain complexities. Regarding
precondition learning (cf. Table 1), all approaches exhibited similar runtimes in the domain with two
numerical state variables. The exact method showed only modest increases as domain complexity grew,
whereas the generalized and dependency-aware variants scaled exponentially, likely due to the overhead
of constructing convex hulls and deriving the corresponding systems of linear inequalities.</p>
        <p>Table 2 reports the mean planning times and standard deviation. All plans are valid with respect
to expert-modeled domain description. Planning runtimes increased exponentially with domain
complexity across all approaches. For the 15-variable domain, planning was initially much faster using
the generalized and dependency-aware preconditions compared to the exact preconditions. However,
as complexity increased, planning times for both variants soon exceeded those of the exact method,
with the generalized approach exhibiting the highest planning times in the most complex domain.
The dependency-aware variant initially performed less favorably than the generalized one, but at last
surpassed it in the two most complex domains.</p>
      </sec>
      <sec id="sec-6-2">
        <title>6.2. Theoretical Evaluation</title>
        <p>For evaluating the soundness and completeness of the diferent approaches, we consider a hybrid state
space  =  ×  , where  denotes the propositional component and  the continuous (numeric)
component. For any action  , we define its true applicability region as   = { ∈  ∣  is truly applicable in } .
We assume access to a finite training set   = { 1, … ,   } ⊆   , representing observed states where action
 was applicable.</p>
        <p>The learned preconditions are sound, if the planning algorithm can find a plan that is executable in
the real world, based on learned preconditions. The learned preconditions are complete, if the planning
algorithm can find every single plan, that is an executable plan in the real world, based on the learned
preconditions [11].</p>
        <p>Exact Preconditions:
Theorem 1. The exact preconditions are sound, i.e., for all  ∈  , with  exact holds  ∈   .
Proof 1. Let   be the complete set of applicable states for action  . For the training set   ⊆   holds and
∀ ∈  exact ∶  ∈   . Thus, ∀ ∈  exact ∶  ∈   .</p>
        <p>Since the exact preconditions only allow the action to be applied to states to which the action was
previously applied, the learned exact preconditions are sound by construction.</p>
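As a sketch, the exact preconditions amount to a membership test against the memorized training states. The helpers `learn_exact` and `exact_applicable` are illustrative names of our own, assuming states are represented as hashable tuples.

```python
def learn_exact(states):
    """Exact preconditions (sketch): memorize the observed states verbatim."""
    return {tuple(s) for s in states}


def exact_applicable(pre_exact, state):
    """The action is declared applicable only in previously observed states."""
    return tuple(state) in pre_exact
```

This makes the soundness argument tangible: every accepted state was observed, hence truly applicable, while any unobserved (but possibly applicable) state is rejected.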
        <p>Theorem 2. The exact preconditions are complete with respect to the training set   .
Proof 2. Proof by contradiction: We assume  ∗ ∈   exists with  is applicable for  ∗, where  ∗ ∉  exact,
but by definition it holds ∀ ∈   ∶  ∈  exact.</p>
        <p>Exact preconditions are complete only if the training set   fully enumerates   . However, in general,
when   is continuous or infinite, this condition is not met, making exact preconditions incomplete in
practical scenarios.</p>
        <p>Generalized Preconditions:
Theorem 3. In general, the generalized preconditions are not sound, i.e., ∃ ∗ ∈  
general
∶  ∗ ∉   .</p>
        <p>Proof 3. Proof by contradiction: Let   ∶= { 1,  2} with  1 = [true, false, 10],  2 = [true, true, 15]. We
assume   = { 1,  2}. Creating  general from   gives the set  general = {[true, ,  ] |  ∈ { true, false},  ∈
[10, 15]}. Let  3 = [true, false, 15], then  3 ∈  general, but  3 ∉   . So ∃ ∗ ∈  general ∶  ∗ ∉   .
The generalized preconditions are not, in general, sound. In particular, generalization within the
numerical space across all propositional configurations may introduce additional states into the precondition
set that do not lie within the true applicability region of the action.</p>
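The counterexample of Proof 3 can be replayed directly. The helper `general_applicable` below is illustrative only: it encodes the generalized precondition learned from s_1 and s_2, with the second proposition left unconstrained and the numeric variable generalized to the interval [10, 15].

```python
def general_applicable(state):
    """Generalized precondition learned from s1 = (True, False, 10) and
    s2 = (True, True, 15): first proposition fixed to True, second proposition
    unconstrained, numeric value anywhere in [10, 15]."""
    p1, p2, x = state
    return p1 is True and 10 <= x <= 15


# True applicability region assumed in Proof 3:
A_a = {(True, False, 10), (True, True, 15)}

s3 = (True, False, 15)
assert general_applicable(s3)  # admitted by the generalized preconditions
assert s3 not in A_a           # but not truly applicable: unsound
```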
        <p>Theorem 4. In general, the generalized preconditions are not complete, i.e., ∃ ∗ ∈   ∶  ∗ ∉  
general.</p>
        <p>Proof 4. Proof by contradiction: Let   ∶= { 1,  2,  3} with  1 = [true, false],  2 = [true, true] and  3 =
[false, true]. We assume   = { 1,  2}. Creating  general from   gives the set  general = { 1,  2}. Then
 3 ∉  general, but  3 ∈   . So ∃ ∈   ∶  ∉  general.</p>
        <p>The generalized preconditions are not generally complete, as propositional configurations that fall
within the true applicability region may be absent from the training data and, consequently, excluded
from the learned preconditions.</p>
        <p>Theorem 5. If there is only one single propositional configuration () within   and the numerical state
space represented in   ( ) ⊆   ( ) and   ( ) is convex, then the generalized preconditions are sound.
Proof 5. Let ∀ ∗ ∈   ∶  ∗() = () , then ∀ ∗ ∈   ∶  ∗() = () and therefore ∀ ∗ ∈  
() . By assumption,   ( ) is convex and ∀( ) ∈   ( ) ∶ ( ) ∈   ( ) , so  ( ) ∶=
general
  ( ) . Then ∀ ∗( ) ∈  ( ) ∶  ∗( ) ∈   ( ) . Therefore, ∀ ∈   ∶  ∈   .
general ∶  ∗() =
Conv(  ( )) ⊆
Theorem 6. If there is only one single propositional configuration () within   and the numerical state
space represented in   ( ) is the convex hull of   ( ) , then the generalized preconditions are complete.
Proof 6. Let ∀ ∗ ∈   ∶  ∗() = () , then ∀ ∗ ∈   ∶  ∗() = () and therefore ∀ ∗ ∈  general ∶  ∗() =
() . By assumption,   ( ) is convex and represented by   ( ) , so  ( ) ∶= Conv(  ( )) =   ( ) . Then
∀ ∗( ) ∈   ( ) ∶  ∗( ) ∈ Conv(  ( )) . Therefore, ∀ ∈   ∶  ∈  general.</p>
        <p>This generalization admits all convex combinations of observed numeric values. Assuming that the
true numeric applicability region is globally convex, the generalized preconditions are both sound and
complete. However, in non-convex domains, particularly where diferent propositional configurations
admit distinct convex regions, soundness may be violated.</p>
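Membership in the convex hull of the observed numeric values, which is exactly what the generalized preconditions test in the numeric dimension, can be decided by a small linear feasibility program. The helper `in_convex_hull` is an illustrative sketch, assuming SciPy is available.

```python
import numpy as np
from scipy.optimize import linprog


def in_convex_hull(points, x):
    """Check x in Conv(points) via an LP feasibility problem:
    find lambda >= 0 with sum(lambda) = 1 and points.T @ lambda = x."""
    points = np.asarray(points, dtype=float)
    n, d = points.shape
    A_eq = np.vstack([points.T, np.ones((1, n))])  # d coordinate rows + sum row
    b_eq = np.append(np.asarray(x, dtype=float), 1.0)
    res = linprog(c=np.zeros(n), A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * n)
    return res.status == 0  # feasible => x is a convex combination


# Observed numeric values under a single propositional configuration:
D_x = [[10.0], [15.0]]
assert in_convex_hull(D_x, [12.5])      # convex combination of observations
assert not in_convex_hull(D_x, [16.0])  # outside Conv(D(x)): rejected
```

Unlike the exact preconditions, this test accepts unobserved interior points, which is where both the completeness gain and the potential unsoundness of the generalized approach originate.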
        <p>Dependency-Aware Preconditions:</p>
        <sec id="sec-6-2-1">
          <title>Theorem 7. The dependency-aware preconditions are also sound for non-globally convex hulls, where</title>
          <p>diferent propositional configurations () ∈  () allow for diferent convex hulls  () , when ∀() ∈
  ∶ Conv( ,() ) ⊆  ,() .</p>
          <p>Proof 7. By definition ∀() ∈   ∶ Conv( ,()</p>
        </sec>
        <sec id="sec-6-2-2">
          <title>Theorem 5.</title>
          <p>) ⊆  ,() . Therefore, for each fixed () ∈   it holds</p>
        </sec>
        <sec id="sec-6-2-3">
          <title>Theorem 8. The dependency-aware preconditions are complete for non-globally convex hulls, where</title>
          <p>diferent propositional configurations () ∈  () allow for diferent convex hulls  () , when the
dataset   represents these hulls.</p>
          <p>Proof 8. For each fixed () ∈   by assumption,  ,() ( ) is convex and ∀() ∈   ∶ Conv( ,()
 ,() ( ) . Therefore, for each fixed () ∈   it holds Theorem 6.
( )) =</p>
          <p>By exploiting the structure of propositional configurations to construct independent convex
approximations of applicability regions, the dependency-aware preconditions also achieve soundness and
completeness under only localized convexity assumptions.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>7. Discussion</title>
      <p>The empirical and theoretical evaluations ofer complementary insights for answering our two posed
research questions.</p>
      <p>The experimental results show that it is possible to learn action preconditions in a real-world domain
exclusively from observational data in which the action has previously been successfully applied.
All methods supported both learning and subsequent planning based on the learned preconditions,
confirming the eficiency of data-driven precondition learning in hybrid domains without expert input
(cf. RQ1). Nevertheless, there were clear diferences between the three approaches in terms of runtime
performance and scalability (cf. Table 1).</p>
      <p>Interestingly, for smaller domains, the generalized approach exhibited slower runtime growth than
the dependency-aware variant. Yet, in the most complex domain, the dependency-aware method
outperformed the generalized one in terms of runtime. This suggests that the overhead introduced by
dependency modeling may become advantageous in higher-complexity settings.</p>
      <p>The results regarding the planning runtimes (cf. Table 2) reflect a core limitation of convex hull-based
representations: as numerical preconditions are encoded via systems of linear inequalities derived from
hulls, computational demands grow with state space complexity. This can be mitigated by incorporating
domain knowledge, e.g., by constraining the set of considered variables during learning, as proposed by
[12]. In addition, preconditions can be represented more eficiently by bounded intervals in areas with
axis- or box-shaped spaces, as it is often the case in most real-world applications.</p>
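A sketch of the interval-based alternative mentioned here: per-dimension [min, max] bounds replace the half-space system of a hull. The names `box_precondition` and `in_box` are illustrative, and the box is exact only when the true numeric region is itself axis-aligned.

```python
import numpy as np


def box_precondition(vectors):
    """Axis-aligned bounding box: one [min, max] interval per dimension.
    Cheaper to learn and to evaluate than half-space systems derived from
    convex hulls, at the cost of over-approximating non-box regions."""
    vectors = np.asarray(vectors, dtype=float)
    return vectors.min(axis=0), vectors.max(axis=0)


def in_box(box, x):
    """Membership test: x lies within every per-dimension interval."""
    lo, hi = box
    return bool(np.all(lo <= x) and np.all(x <= hi))


box = box_precondition([[10.0, 0.5], [15.0, 1.5], [12.0, 1.0]])
assert in_box(box, np.array([11.0, 0.75]))
assert not in_box(box, np.array([16.0, 1.0]))
```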
      <p>It is also notable that planning runtimes exhibited high standard deviation, indicating sensitivity to
the random seed. This variation is likely due to diferences in the distribution of numerical variables,
which can influences the structure of the state space.</p>
      <p>Two key requirements must be considered when learning action models from data for planning in
complex real-world domains: soundness and completeness. To learn preconditions fulfilling both of
these requirements, especially in continuous or high-dimensional numerical domains, where it is not
possible for training data to completely cover all valid states, is challenging.</p>
      <p>The exact approach guarantees soundness by declaring an action applicable only in states that were
explicitly observed during training. However, the approach does not account for the mentioned problem
of missing data for every single state within the numerical state space and is therefore not complete.</p>
      <p>By contrast, the generalized approach is complete under the assumptions that (i) the dataset includes
all valid propositional configurations, (ii) the numerical state space can be approximated by a convex
hull, and (iii) boundary states are represented in the dataset. Nonetheless, this method risks
overgeneralization in both the propositional and numerical dimensions. In the propositional space, the method
does not restrict preconditions to propositions true in all observed states, compromising soundness.
In the numerical space, even if the outer shape is convex, the internal structure may contain holes or
disallowed subregions, e.g., if certain configurations of the propositional state space only allow for
a certain subspace of the numerical state space. Consequently, the generalized approach does not
guarantee soundness in either component, which is confirmed by experimental observations where
plans were generated that are invalid with respect to the expert-modeled domain.</p>
      <p>Our novel dependency-aware approach maintains the completeness of the generalized approach
under the same assumptions but addresses its key shortcomings by modeling dependencies among
propositional variables and possibly resulting constraints in the numerical space. It also guarantees
soundness under the reasonable assumption that, for each propositional configuration, the corresponding
set of applicable numerical states forms a convex region.</p>
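A one-dimensional sketch of this behavior, with illustrative helper names: learning one numeric interval (the 1-D convex hull) per propositional configuration rejects the state s_3 from Proof 3 that the generalized preconditions wrongly admit.

```python
from collections import defaultdict


def learn_dependency_aware_1d(states):
    """One numeric interval (1-D convex hull) per propositional configuration."""
    hulls = defaultdict(lambda: (float("inf"), float("-inf")))
    for *props, x in states:
        lo, hi = hulls[tuple(props)]
        hulls[tuple(props)] = (min(lo, x), max(hi, x))
    return dict(hulls)


def depend_applicable(hulls, state):
    """Applicable iff the configuration was observed AND the numeric value
    lies in that configuration's own interval."""
    *props, x = state
    interval = hulls.get(tuple(props))
    return interval is not None and interval[0] <= x <= interval[1]


hulls = learn_dependency_aware_1d([(True, False, 10), (True, True, 15)])
assert depend_applicable(hulls, (True, False, 10))
assert not depend_applicable(hulls, (True, False, 15))  # Proof 3's s3: rejected
```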
      <p>Accordingly, the dependency-aware method ensures both soundness and completeness of planning in
N3PCP domains, provided that the conditions (i)-(iii) are satisfied (cf. RQ2).</p>
    </sec>
    <sec id="sec-8">
      <title>8. Conclusion</title>
      <p>In this paper, we addressed the challenge of learning action preconditions in complex real-world domains
characterized by hybrid, i.e., discrete and continuous, state spaces, using only observational data from
states in which the actions were executed. We introduced a novel dependency-aware approach that
constructs separate convex hulls in the numerical subspace for each discrete state configuration, thereby
explicitly modeling the dependencies between discrete and continuous components of the state space. In
contrast to methods determining the discrete preconditions based on conservative learning paradigms
to ensure soundness, the proposed approach guarantees completeness with respect to the discrete state
configurations within the training data. Moreover, by capturing structural dependencies between the
discrete and continuous components, the approach also guarantees soundness, under the assumption
that the valid numerical states associated with each discrete configuration form a convex region.</p>
      <p>Future work will focus on addressing the computational challenges introduced by the convex-hull
representations, which significantly impact planner performance in large-scale domains. We also plan
to conduct a broader empirical evaluation across diverse real-world domains and integrate the learned
models into planning systems capable of handling N3PCP problems.</p>
    </sec>
    <sec id="sec-9">
      <title>Acknowledgments</title>
      <p>This research as part of the projects EKI and LaiLa is funded by dtec.bw – Digitalization and Technology
Research Center of the Bundeswehr, which we gratefully acknowledge. dtec.bw is funded by the
European Union – NextGenerationEU.</p>
    </sec>
    <sec id="sec-10">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used ChatGPT in order to: Grammar and spelling
check, Paraphrase and reword. After using this tool/service, the author reviewed and edited the content
as needed and take full responsibility for the publication’s content.
[8] A. Köcher, R. Heesch, N. Widulle, A. Nordhausen, J. Putzke, A. Windmann, O. Niggemann,
A research agenda for ai planning in the field of flexible production systems, 2022 IEEE 5th
International Conference on Industrial Cyber-Physical Systems (ICPS) (2022) 1–8.
[9] E. Jarvenpaa, N. Siltala, M. Lanz, Formal resource and capability descriptions supporting rapid
reconfiguration of assembly systems, in: 2016 IEEE International Symposium on Assembly and
Manufacturing (ISAM), IEEE, 2016. doi:10.1109/isam.2016.7750724.
[10] H. Kagermann, J. Helbig, A. Hellinger, W. Wahlster, Recommendations for implementing the
strategic initiative INDUSTRIE 4.0: Securing the future of German manufacturing industry; final
report of the Industrie 4.0 Working Group, Forschungsunion, 2013.
[11] S. J. Russell, P. Norvig, Artificial intelligence: a modern approach, Pearson, 2016.
[12] A. Mordoch, B. Juba, R. Stern, Learning safe numeric action models, in: Proceedings of the AAAI</p>
      <p>Conference on Artificial Intelligence, volume 37, 2023, pp. 12079–12086.
[13] D. Aineto, E. Scala, Action Model Learning with Guarantees, in: Proceedings of the 21st
International Conference on Principles of Knowledge Representation and Reasoning, 2024, pp. 801–811.
doi:10.24963/kr.2024/75.
[14] A. Mordoch, E. Scala, R. Stern, B. Juba, Safe learning of PDDL domains with conditional effects, in:
Proceedings of the International Conference on Automated Planning and Scheduling, volume 34,
2024, pp. 387–395.
[15] J. Á. Segura-Muros, R. Pérez, J. Fernández-Olivares, Discovering relational and numerical
expressions from plan traces for learning action models, Applied Intelligence 51 (2021) 7973–7989.
[16] J. Á. Segura-Muros, J. Fernández-Olivares, R. Pérez, Learning numerical action models from noisy
input data, arXiv preprint arXiv:2111.04997 (2021).
[17] B. Juba, H. S. Le, R. Stern, Safe learning of lifted action models, in: Proceedings of the International
Conference on Principles of Knowledge Representation and Reasoning, volume 18, 2021, pp.
379–389.
[18] A. Bunte, P. Wunderlich, N. Moriz, P. Li, A. Mankowski, A. Rogalla, O. Niggemann, Why symbolic
ai is a key technology for self-adaption in the context of cpps, in: 2019 24th IEEE International
Conference on Emerging Technologies and Factory Automation (ETFA), IEEE, 2019, pp. 1701–1704.
[19] D. van Dalen, Logic and structure (3. ed.), Universitext, Springer, 1994.
[20] A. Cimatti, A. Griggio, S. Mover, M. Roveri, S. Tonetta, Verification modulo theories, Formal</p>
      <p>Methods Syst. Des. 60 (2022) 452–481.
[21] Z. Manna, A. Pnueli, The modal logic of programs, in: ICALP, volume 71 of Lecture Notes in</p>
      <p>Computer Science, Springer, 1979, pp. 385–409.
[22] A. Biere, A. Cimatti, E. M. Clarke, O. Strichman, Y. Zhu, Bounded model checking, Adv. Comput.</p>
      <p>58 (2003) 117–148.
[23] M. Helmert, Decidability and undecidability results for planning with numerical state variables.,
in: AIPS, 2002, pp. 44–53.
[24] D. Vranješ, J. Ehrhardt, R. Heesch, L. Moddemann, H. S. Steude, O. Niggemann, Design principles for
falsifiable, replicable and reproducible empirical machine learning research, in: 35th International
Conference on Principles of Diagnosis and Resilient Systems (DX 2024), Schloss Dagstuhl–Leibniz-Zentrum
für Informatik, 2024, pp. 7–1.
[25] A. Micheli, A. Bit-Monnot, G. Röger, E. Scala, A. Valentini, L. Framba, A. Rovetta, A. Trapasso,
L. Bonassi, A. E. Gerevini, L. Iocchi, F. Ingrand, U. Köckemann, F. Patrizi, A. Saetti, I. Serina,
S. Stock, Unified planning: Modeling, manipulating and solving ai planning problems in python,
SoftwareX 29 (2025) 102012. doi:10.1016/j.softx.2024.102012.
[26] A. Gerevini, A. Saetti, I. Serina, An empirical analysis of some heuristic features for planning
through local search and action graphs, Fundam. Inform. 107 (2011) 167–197. doi:10.3233/FI-2011-399.
[27] A. Valentini, A. Micheli, A. Cimatti, Temporal planning with intermediate conditions and effects,
in: Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, 2020, pp. 9975–9982.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Ghallab</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Nau</surname>
          </string-name>
          , P. Traverso,
          <source>Automated Planning and Acting</source>
          , Cambridge University Press,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
            <surname>Ehrhardt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Heesch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Niggemann</surname>
          </string-name>
          ,
          <article-title>Learning process steps as dynamical systems for a subsymbolic approach of process planning in cyber-physical production systems</article-title>
          ,
          <source>in: Artificial Intelligence. ECAI 2023 International Workshops</source>
          , Springer Nature Switzerland, Cham,
          <year>2024</year>
          , pp.
          <fpage>332</fpage>
          -
          <lpage>345</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Hausknecht</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Stone</surname>
          </string-name>
          ,
          <article-title>Deep reinforcement learning in parameterized action space</article-title>
          ,
          <year>2016</year>
          . doi:10.48550/ARXIV.1511.04143.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>R.</given-names>
            <surname>Heesch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Cimatti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ehrhardt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Diedrich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Niggemann</surname>
          </string-name>
          ,
          <article-title>A lazy approach to neural numerical planning with control parameters</article-title>
          ,
          <source>in: Frontiers in Artificial Intelligence and Applications</source>
          , volume
          <volume>392</volume>
          :
          <source>ECAI</source>
          <year>2024</year>
          ,
          <year>2024</year>
          . doi:10.3233/FAIA241000.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>E.</given-names>
            <surname>Savaş</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Fox</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Long</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Magazzeni</surname>
          </string-name>
          ,
          <article-title>Planning using actions with control parameters</article-title>
          ,
          <source>in: Proceedings of the Twenty-second European Conference on Artificial Intelligence</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>1185</fpage>
          -
          <lpage>1193</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Grand</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Pellier</surname>
          </string-name>
          , H. Fiorino,
          TempAMLSI:
          <article-title>Temporal action model learning based on STRIPS translation</article-title>
          ,
          <source>Proceedings of the International Conference on Automated Planning and Scheduling</source>
          <volume>32</volume>
          (
          <year>2022</year>
          )
          <fpage>597</fpage>
          -
          <lpage>605</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>R.</given-names>
            <surname>Heesch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ehrhardt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Niggemann</surname>
          </string-name>
          ,
          <article-title>Integrating machine learning into an SMT-based planning approach for production planning in cyber-physical production systems</article-title>
          ,
          <source>in: Artificial Intelligence. ECAI 2023 International Workshops</source>
          , Springer Nature Switzerland, Cham,
          <year>2024</year>
          , pp.
          <fpage>318</fpage>
          -
          <lpage>331</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>