
Object2Plan: An Ontological Approach to Automated Generation of Assembly Plans from Objects

Jona Thai

Michael Grüninger

Department of Mechanical and Industrial Engineering, University of Toronto, Ontario, Canada

M5S 3G8

2026

Robotic object assembly remains a core challenge in automated planning, demanding precise reasoning about both object structure and task execution. Existing symbolic approaches (e.g., PDDL) and data-driven methods (e.g., LLMs) each fall short: the former lack scalability, while the latter lack semantic grounding. Despite the central role of parthood and connection, no formal mereological foundation currently unifies object structure and assembly planning. This paper introduces a mereological framework for object assembly based on CISCO, a non-classical mereology for connected induced substructures. We define parallel mereologies for object components and assembly (sub)activities and prove them order-isomorphic. Within this framework, all possible assembly plans can be embedded within a “master plan,” linking the semantics of physical composition with those of task execution and enabling more principled reasoning over the space of possible assembly plans.

Keywords: robotic assembly, mereotopology, robotics, task planning

1. Introduction

Object assembly is a deceptively simple task. Even something as trivial as joining Part A with Part B can become impossible without an understanding of the parthood and connection relations at play. These relations underpin not only the physical structure of objects but also the logic of the actions required to assemble them. Consequently, a substantial body of research has focused on developing reliable robotic assembly systems, which are central to applications ranging from high-mix manufacturing to consumer service robotics.

Two dominant paradigms currently exist in robotic assembly planning: symbolic, rule-based approaches (e.g., the Planning Domain Definition Language, PDDL) and data-driven machine learning methods (e.g., large language models, neural nets) [1][2]. The former offer reliability and interpretability but become brittle with scale. The latter exhibit impressive generalization ability but can be unreliable when faced with statistical long-tail events. Both paradigms, however, encounter significant limitations when addressing complex assemblies, namely those involving numerous subassemblies, intricate contact relations, and combinatorially large task spaces. As a result, current research efforts often seek hybrid strategies, combining symbolic representations with data-driven inference to balance generalization with rigor.

Despite these advances, several aspects of assembly planning remain surprisingly underexplored. For one, despite the undeniable role that parthood and connection play in robotic assembly, there are no explicit axiomatizations of mereology in the assembly planning literature. Without formal axioms as a backbone, data structures can only capture the taxonomy of an assembly plan, not its semantics. Another overlooked area is the characterization of the relationship between an object and its assembly plan, and by extension the relationship among all possible assembly plans for a given object. Existing approaches typically focus on generating a single feasible or optimal plan, but seldom on the structural space of possible plans as a whole. Mathematically defining this space of potential assembly plans yields insights that can be used to develop better algorithms for constraining the search space.

This paper addresses the above-listed gaps by introducing a formal mereological framework that unifies the representation of objects, task plans, and their occurrences. We define a mereology of object components in parallel with a mereology of assembly (sub)activities using CISCO, a non-classical mereology for connected induced substructures [3]. We prove that the object component mereology is order-isomorphic to the mereology of assembly activities. Within this framework, individual assembly plan occurrences (i.e., specific sequences or realizations of assembly actions) map surjectively onto the mereology of assembly activities. Put simply, all possible task plan sequences can be effectively “embedded” within a master task plan. In doing so, we connect the semantics of physical composition with the semantics of task execution, offering a unified foundation for reasoning about assembly at both the object and activity levels.

2. Related Work

Regardless of whether the approach is symbolic or learning-based, representation lies at the core of robotic assembly planning. Accordingly, we provide a brief overview of the current state of the art in object and task plan representations within this domain.

2.1. Object Representation

Given the three-dimensional nature of most assembly objects, the most common representation is through 3D computer-aided design (CAD) files. Two predominant forms of 3D object representation within CAD systems are constructive solid geometry (CSG) and boundary representation (B-rep) [4][5]. CSG models objects as compositions of primitive solids (e.g., cubes, cylinders, spheres) combined through Boolean operations such as union, difference, and intersection. In contrast, B-rep defines an object through its boundary surfaces (vertices, edges, and faces), enabling precise surface-level modeling suitable for visualization, manufacturing, and simulation.

Due to its higher fidelity, B-rep is currently the dominant CAD form. One of the most notable 3D model datasets that utilizes B-rep is the Autodesk Fusion360 Assembly Dataset, which provides a large corpus of real-world CAD assemblies with rich parametric and hierarchical information [6]. Of the many data formats available, of particular interest is the assembly graph model, in which the vertices of the graph are components and the edges denote parent-child relationships. This allows the model to capture the relationship between components and subcomponents through a parthood hierarchy [7].

Other datasets following similar paradigms include the NVIDIA Omniverse Automate Dataset, which focuses on industrial assembly scenes with annotated kinematic and physical properties, and the Assembly101 Dataset, which provides RGB-D video sequences paired with 3D object models for studying human-robot collaborative assembly [8][9].

It is worth noting that, at their core, all of these approaches seek to model parthood and connection within objects, which suggests that these relations are central to robotic assembly.

2.2. Task Plan Representation

A critical challenge in robotic assembly lies in representing task-level plans in a manner that enables efficient search by automated planners while maintaining sufficient geometric and kinematic specificity to support both generalization across tasks and formal verification. The representation must strike a delicate balance: overly abstract formulations hinder grounding in the physical domain, whereas overly detailed encodings rapidly become intractable for symbolic planners.

The lingua franca of symbolic task planning has long been declarative languages such as the Planning Domain Definition Language (PDDL) and its predecessor, the Stanford Research Institute Problem Solver (STRIPS) [10][11]. In these frameworks, a planning problem is defined in terms of objects, initial states, and goal states, while the domain specifies the available actions and predicates. Despite their expressiveness and clarity, such definitions are often ad hoc and fail to scale gracefully as the complexity of the solution space increases. The brittleness of purely symbolic formulations has therefore motivated extensive research into more robust or hybrid representations.

One notable direction, particularly relevant to robotic assembly, is Assembly Sequence Planning (ASP). ASP addresses the problem of determining a feasible and efficient order for assembling multiple components, often represented as a directed graph of assembly operations. Precedence constraints—derived from geometric, physical, or functional dependencies between parts—govern the generation of these sequences. Classical ASP approaches leverage heuristic search and constraint reasoning, while more recent work integrates geometric reasoning and physical feasibility analysis to ensure executable plans.
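As a minimal illustration of how precedence constraints induce an assembly sequence, the following Python sketch topologically sorts a small constraint graph. The part names and constraints are hypothetical, and the sketch omits the geometric and physical feasibility checks that practical ASP systems perform.

```python
from graphlib import TopologicalSorter

# Hypothetical precedence constraints: each part maps to the set of parts
# that must already be in place before it can be assembled.
precedence = {
    "base": set(),
    "shaft": {"base"},
    "gear": {"shaft"},
    "cover": {"gear", "base"},
}


def assembly_sequence(constraints):
    """Return one ordering of parts that respects every precedence constraint."""
    return list(TopologicalSorter(constraints).static_order())


sequence = assembly_sequence(precedence)
```

Because these particular constraints form a single chain, only one ordering is admissible; with looser constraints the sorter would return just one of several valid sequences.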

An alternative paradigm, assembly-by-disassembly, adopts a physics-based simulation perspective [2]. Instead of symbolically defining assembly rules, it simulates disassembly actions to infer feasible assembly orders. With the availability of large-scale datasets capturing realistic assembly and disassembly trajectories, machine learning models, including deep neural networks, can generalize across a wide range of geometries and joint types. While this approach offers flexibility and data-driven adaptability, it continues to face limitations in modeling complex assemblies involving multiple contact types, hierarchical structures, and intricate connection relations.

More recently, neurosymbolic approaches have emerged as a promising middle ground. These methods aim to combine the interpretability and reliability of symbolic task planning with the adaptability and perception robustness of learning-based systems. By embedding symbolic constraints (such as ontologies) within learned representations or, conversely, grounding symbolic plans in continuous perception and action spaces, neurosymbolic frameworks strive to achieve both formal structure and empirical flexibility [12][13][1]. In the context of robotic assembly, such approaches hold potential for unifying geometric reasoning, task-level planning, and sensory feedback within a coherent and semantically grounded framework.

2.3. Object-Task Plan Representation

Looking at the literature, we can conclude that accurately representing parthood, connection, and hierarchy is paramount to robotic assembly. This holds across all of these diverse methods. Yet none of these approaches explicitly utilizes mereology: variables are captured through data structures, but their semantics remain unclear. Relations suggestive of parthood or connectivity (e.g., part-of, attached-to, contained-in) are often included, yet lack correspondence to a formal axiomatization. Formal specification allows models to be characterized, increasing reliability and explainability. The goal of this paper is to address that gap.

3. Object2Plan

The key contribution of this paper is the formalization of the relationship between the parts of an object and the parts (subactivities) of a task plan. Our approach is best illustrated by the diagram in Figure 1. The primary idea is that the mereology of connected components of the object is isomorphic to the mereology of activities that assemble the object. Moreover, the mereology of assembly activities describes the set of all possible occurrences of an assembly plan for the object.

3.1. Mereologies for Objects and Assembly Activities

Classical mereology is based on the assumption that any two underlapping elements have a sum, yet when we consider the problem of object assembly, this assumption is not valid. Instead, we find that mereological sums must always be connected objects.

We also observe that existing approaches to assembly planning represent an object as a simple graph; in this case, the components of the object should correspond to connected induced subgraphs of that graph. The work in [3] introduced the parthood and connection structure on the set of connected induced subgraphs of a graph and used this as the basis of the representation theorem for a new mereotopology, $T_{cisco\_mt}$, in which the sum of two elements exists iff they are connected.
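To make the connected-induced-subgraph view concrete, the following Python sketch enumerates the components of a small contact graph and computes sums only when the result is connected. The four-part graph and the helper names (`neighbours`, `is_connected`, `components`) are illustrative assumptions and are not taken from [3]; `comp_sum` here is a plain function that merely mimics the intended reading of the relation.

```python
from itertools import combinations

# Hypothetical object with four parts: b is in contact with a, c, and d.
EDGES = {("a", "b"), ("b", "c"), ("b", "d")}
VERTICES = {"a", "b", "c", "d"}


def neighbours(v):
    """Vertices adjacent to v in the contact graph."""
    return {y for x, y in EDGES if x == v} | {x for x, y in EDGES if y == v}


def is_connected(vs):
    """True iff the subgraph induced by the non-empty vertex set vs is connected."""
    vs = set(vs)
    if not vs:
        return False
    seen, frontier = set(), [next(iter(vs))]
    while frontier:
        v = frontier.pop()
        if v in seen:
            continue
        seen.add(v)
        frontier.extend((neighbours(v) & vs) - seen)
    return seen == vs


def components():
    """All connected induced subgraphs, each identified by its vertex set."""
    return {
        frozenset(c)
        for k in range(1, len(VERTICES) + 1)
        for c in combinations(sorted(VERTICES), k)
        if is_connected(c)
    }


def comp_sum(x, y):
    """Sum of two components: their union if it is connected, otherwise None."""
    z = frozenset(x | y)
    return z if is_connected(z) else None
```

In this toy graph, the parts a and b have the sum {a, b}, whereas a and c have no sum because they are not in contact; this is the behaviour that separates the setting from classical mereology.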

We therefore propose two distinct mereologies that are each logically synonymous with $T_{cisco\_mt}$, but with different signatures. The mereology on physical objects, $T_{object\_component}$, has the signature {object(x), componentOf(x,y), comp_sum(x,y,z), comp_overlaps(x,y), comp_covers(x,y)}:

Definition 1. $T_{object\_component}$ is the following set of sentences:

\begin{align*}
&\forall x, y\ \ \mathit{componentOf}(x, y) \supset \mathit{object}(x) \land \mathit{object}(y) \tag{1}\\
&\forall x\ \ \mathit{object}(x) \supset \mathit{componentOf}(x, x) \tag{2}\\
&\forall x, y\ \ \mathit{componentOf}(x, y) \land \mathit{componentOf}(y, x) \supset (x = y) \tag{3}\\
&\forall x, y, z\ \ \mathit{componentOf}(x, y) \land \mathit{componentOf}(y, z) \supset \mathit{componentOf}(x, z) \tag{4}\\
&\forall x, y\ \ \mathit{comp\_covers}(x, y) \supset \exists z\, (\mathit{atom}(z) \land \mathit{comp\_sum}(z, y, x) \land \neg \mathit{componentOf}(z, y)) \tag{5}\\
&\forall x, y\ \ \mathit{comp\_overlaps}(x, y) \supset \exists z\, \mathit{comp\_sum}(x, y, z) \tag{6}\\
&\forall x, y\ \ \neg \mathit{componentOf}(x, y) \supset \exists z\, (\mathit{componentOf}(z, x) \land \neg \mathit{comp\_overlaps}(z, y)) \tag{7}
\end{align*}

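The mereology on assembly activities, $T_{assembly\_activity}$, has the signature {assembly_activity(x), subassembly(x,y), assembly_covers(x,y), assembly_overlaps(x,y), assembly_sum(x,y,z)}:

Definition 2. $T_{assembly\_activity}$ is the following set of sentences:

\begin{align*}
&\forall x, y\ \ \mathit{subassembly}(x, y) \supset \mathit{assembly\_activity}(x) \land \mathit{assembly\_activity}(y) \tag{8}\\
&\forall x\ \ \mathit{assembly\_activity}(x) \supset \mathit{subassembly}(x, x) \tag{9}\\
&\forall x, y\ \ \mathit{subassembly}(x, y) \land \mathit{subassembly}(y, x) \supset (x = y) \tag{10}\\
&\forall x, y, z\ \ \mathit{subassembly}(x, y) \land \mathit{subassembly}(y, z) \supset \mathit{subassembly}(x, z) \tag{11}\\
&\forall x, y\ \ \mathit{assembly\_covers}(x, y) \supset \exists z\, (\mathit{atom}(z) \land \mathit{assembly\_sum}(z, y, x) \land \neg \mathit{subassembly}(z, y)) \tag{12}\\
&\forall x, y\ \ \mathit{assembly\_overlaps}(x, y) \supset \exists z\, \mathit{assembly\_sum}(x, y, z) \tag{13}\\
&\forall x, y\ \ \neg \mathit{subassembly}(x, y) \supset \exists z\, (\mathit{subassembly}(z, x) \land \neg \mathit{assembly\_overlaps}(z, y)) \tag{14}
\end{align*}

3.2. Multigeometry

To formally represent the isomorphism between the mereology of object components $\mathbb{Q}$ and the mereology of assembly subactivities $\mathbb{A}$, together with their occurrences $O$, we consider a class of structures known as multigeometries, which represent homomorphisms between mereologies.

Definition 3. $\mathbb{Q} \oplus \mathbb{I} \oplus \mathbb{A}$ is an assembly multigeometry iff

1. $\mathbb{Q} = \langle P, \preceq \rangle$ such that $\mathbb{Q} \in \mathfrak{M}^{cisco}$;
2. $\mathbb{A} = \langle L, \leq \rangle$ such that $\mathbb{A} \in \mathfrak{M}^{cisco}$;
3. $\mathbb{I} = \langle P, L, \mathbf{I} \rangle$ such that $\mathbb{I} \in \mathfrak{M}^{bijection\_bipartite}$;
4. $N^{\mathbb{I}}(U^{\mathbb{Q}}(x)) \subseteq U^{\mathbb{A}}(N^{\mathbb{I}}(x))$.

We denote the class of assembly multigeometries by $\mathfrak{M}^{assembly\_plan}$.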

By conditions 1 and 2, there exist two partial orderings $\mathbb{Q}$ and $\mathbb{A}$. Condition 3 guarantees that the incidence relation $\mathbf{I}$ represents a mapping $\mu : \mathbb{Q} \to \mathbb{A}$. Condition 4 guarantees that this mapping is an order isomorphism.

Theorem 1. Let $Iso(\mathbb{Q}, \mathbb{A})$ denote the set of all order isomorphisms between the partial orderings $\mathbb{Q}$ and $\mathbb{A}$. If

$$\mathcal{P} = \{\mu : \mu \in Iso(\mathbb{Q}, \mathbb{A}),\ \mathbb{Q}, \mathbb{A} \in \mathfrak{M}^{cisco}\}$$

then there is a bijection $\varphi : \mathcal{P} \to \mathfrak{M}^{assembly\_plan}$ such that $\varphi(\mu) = \mathbb{Q} \oplus \mathbb{I} \oplus \mathbb{A}$ iff $\mu : \mathbb{Q} \to \mathbb{A}$ and $\mu(x) = N^{\mathbb{I}}(x) \cap L$.

This guarantees the property that each assembly multigeometry $\mathbb{Q} \oplus \mathbb{I} \oplus \mathbb{A}$ corresponds to an order isomorphism that maps the object component mereology $\mathbb{Q}$ to the assembly activity mereology $\mathbb{A}$. The incidence structure $\mathbb{I}$ represents this order isomorphism in the sense that the subordering induced by the non-atoms of the object component mereology is isomorphic to the atoms of the subactivity mereology.
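For finite structures, the set of order isomorphisms between two partial orderings can be computed by brute force. The Python sketch below does this for two three-element orderings; the element names and the function `order_isomorphisms` are hypothetical, and the exhaustive search over permutations is chosen only for readability, so it is practical for very small examples only.

```python
from itertools import permutations


def order_isomorphisms(p_elems, p_leq, l_elems, l_leq):
    """Return every bijection mu such that x <= y in the first ordering
    iff mu(x) <= mu(y) in the second. Orderings are sets of ordered pairs."""
    p_elems, l_elems = sorted(p_elems), sorted(l_elems)
    if len(p_elems) != len(l_elems):
        return []
    found = []
    for image in permutations(l_elems):
        mu = dict(zip(p_elems, image))
        if all(
            ((x, y) in p_leq) == ((mu[x], mu[y]) in l_leq)
            for x in p_elems
            for y in p_elems
        ):
            found.append(mu)
    return found


# Hypothetical object ordering: two atomic parts p1, p2 and their sum p12.
q_elems = {"p1", "p2", "p12"}
q_leq = {(x, x) for x in q_elems} | {("p1", "p12"), ("p2", "p12")}

# Hypothetical activity ordering with the same shape.
a_elems = {"l1", "l2", "l12"}
a_leq = {(x, x) for x in a_elems} | {("l1", "l12"), ("l2", "l12")}

isos = order_isomorphisms(q_elems, q_leq, a_elems, a_leq)
```

Here two isomorphisms exist, differing only in which atomic part is matched with which atomic activity; under the bijection of Theorem 1, each would correspond to a distinct assembly multigeometry.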

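Combining all of the axioms together gives us the complete ontology for Object2Plan:

Definition 4. $T_{assembly\_plan}$ is the extension of $T_{object\_component} \cup T_{assembly\_activity}$ with the following set of sentences:

\begin{align*}
&(\forall x, a)\ \mathit{assemble}(x, a) \supset \mathit{object}(x) \land \mathit{assembly\_activity}(a) \tag{15}\\
&(\forall p)\ \mathit{object}(p) \supset \neg \mathit{assembly\_activity}(p) \tag{16}\\
&(\forall l_1, l_2, p)\ (\mathit{assembly\_activity}(l_1) \land \mathit{assembly\_activity}(l_2) \land \mathit{object}(p)\\
&\qquad \land\ \mathit{assemble}(p, l_1) \land \mathit{assemble}(p, l_2)) \supset (l_1 = l_2) \tag{17}\\
&(\forall p_1, p_2, l)\ (\mathit{assembly\_activity}(l) \land \mathit{object}(p_1) \land \mathit{object}(p_2)\\
&\qquad \land\ \mathit{assemble}(p_1, l) \land \mathit{assemble}(p_2, l)) \supset (p_1 = p_2) \tag{18}\\
&(\forall x, y, l_1, l_2)\ (\mathit{componentOf}(x, y) \land \mathit{assembly\_activity}(l_1) \land \mathit{assembly\_activity}(l_2)\\
&\qquad \land\ \mathit{assemble}(x, l_1) \land \mathit{assemble}(y, l_2)) \supset \mathit{subassembly}(l_1, l_2) \tag{19}\\
&(\forall x)\ (\mathit{object}(x) \supset (\exists y)(\mathit{assembly\_activity}(y) \land \mathit{assemble}(x, y))) \tag{20}\\
&(\forall x, y, l_1, l_2)\ (\mathit{assemble}(x, l_1) \land \mathit{assemble}(y, l_2) \land \mathit{object}(x) \land \mathit{object}(y)\\
&\qquad \land\ \mathit{subassembly}(l_1, l_2)) \supset \mathit{componentOf}(x, y) \tag{21}\\
&(\forall l)\ (\mathit{assembly\_activity}(l) \supset (\exists p)(\mathit{object}(p) \land \mathit{assemble}(p, l))) \tag{22}
\end{align*}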

These axioms specify the mereologies on both the objects being assembled and also the activities that assemble the objects. The guarantee that these two mereologies are isomorphic provides a way of evaluating the correctness of assembly plans with respect to the mereological structure of the object.
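One way such an evaluation could look for a finite structure is sketched below in Python: the function checks axioms (15)-(22) directly against explicitly listed relations. The function name `is_valid_assembly_map`, the set-of-pairs encoding, and the bolt-and-nut data are assumptions made for illustration; the sketch does not check the mereology axioms (1)-(14) themselves.

```python
def is_valid_assembly_map(objects, activities, component_of, subassembly, assemble):
    """Check axioms (15)-(22) on a finite structure.

    component_of and subassembly are sets of ordered pairs; assemble is a
    set of (object, activity) pairs.
    """
    # (15): assemble relates objects to activities only.
    if any(x not in objects or a not in activities for x, a in assemble):
        return False
    # (16): nothing is both an object and an activity.
    if objects & activities:
        return False
    # (17) and (20): every object is assembled by exactly one activity.
    for x in objects:
        if len({a for y, a in assemble if y == x}) != 1:
            return False
    # (18) and (22): every activity assembles exactly one object.
    for a in activities:
        if len({x for x, b in assemble if b == a}) != 1:
            return False
    to_activity = dict(assemble)
    # (19) and (21): componentOf holds iff subassembly holds between the images.
    return all(
        ((x, y) in component_of) == ((to_activity[x], to_activity[y]) in subassembly)
        for x in objects
        for y in objects
    )


# Hypothetical two-part object and the activities that assemble it.
objects = {"bolt", "nut", "bolt+nut"}
activities = {"place_bolt", "place_nut", "fasten"}
component_of = {(o, o) for o in objects} | {("bolt", "bolt+nut"), ("nut", "bolt+nut")}
subassembly = {(a, a) for a in activities} | {("place_bolt", "fasten"), ("place_nut", "fasten")}

good = {("bolt", "place_bolt"), ("nut", "place_nut"), ("bolt+nut", "fasten")}
bad = {("bolt", "fasten"), ("nut", "place_nut"), ("bolt+nut", "place_bolt")}
```

The mapping `good` satisfies all eight axioms, while `bad` is rejected because it sends a component of the whole object to an activity that is not a subassembly of the activity assigned to the whole.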

3.3. Ontological Consequences

Identity criteria for objects cannot be reduced merely to their primitive components, since two distinct objects may be composed of the same basic elements yet remain different due to the ways those elements are connected. What distinguishes them is not the underlying parts, but the structure of their composition—the mereological relations that hold among those parts. In this sense, even if a mapping or even an isomorphism exists between their components, the objects remain distinct because their identities are determined by their composition. This distinction becomes especially important when contrasting well-defined classes of activity with non-deterministic activities, since the criteria for identity must account not only for the presence of components but also for the specific structural and relational configurations that constitute the object.

4. Discussion & Future Work

This paper provides a formal mereological framework that unifies the representation of objects, task plans, and their occurrences. Specifically, a mereology for object components and a mereology for assembly activities are defined using CISCO, a non-classical mereology for connected induced substructures. We proved that the object mereology is order-isomorphic to the assembly activity mereology, where the subordering induced by the non-atoms of the object component mereology is isomorphic to the atoms of the subactivity mereology. For future work, we intend to extend this work to cover a wider range of parthood and connection relation types, and to use our insights to guide the development of robotic assembly algorithms. In particular, we intend to define the relationship between assembly plan occurrences and the assembly plan mereology, with the requirement that there be a 1-1 correspondence between occurrences of the plan and maximal chains in the mereology of assembly activities.
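To give a sense of what the maximal chains mentioned above look like, the following Python sketch enumerates them for the connected vertex sets of a toy contact graph. The graph, the function name `maximal_chains`, and the reading of each chain as a part-by-part build order are assumptions for illustration only; the formal link between chains and plan occurrences is left to future work, and the sketch assumes the contact graph is connected.

```python
# Hypothetical contact graph: three parts in a row, a-b-c.
CONTACTS = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}


def maximal_chains(contacts):
    """Yield every maximal chain of connected vertex sets, from a single
    part up to the whole object, ordered by inclusion."""
    everything = frozenset(contacts)

    def extend(chain):
        current = chain[-1]
        if current == everything:
            yield chain
            return
        # Each step up the chain adds exactly one part in contact with `current`.
        frontier = set().union(*(contacts[v] for v in current)) - current
        for v in sorted(frontier):
            yield from extend(chain + [current | {v}])

    for v in sorted(contacts):
        yield from extend([frozenset({v})])


chains = list(maximal_chains(CONTACTS))
```

For this three-part row there are four maximal chains, for example {a}, {a, b}, {a, b, c}; no chain passes through {a, c}, since that set is not connected.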

5. Declaration on Generative AI

During the preparation of this work, the author(s) used ChatGPT in order to paraphrase and reword text and to translate APA-style citations into BibTeX format. After using this tool/service, the author(s) reviewed and edited the content as needed and take(s) full responsibility for the publication’s content.

References

[1] Kwon, Kim, Y. J. Kim, Fast and accurate task planning using neuro-symbolic language models and multi-level goal decomposition, in: IEEE International Conference on Robotics and Automation (ICRA), 2025.

[2] Tian, Xu, Li, Luo, Sueda, Li, K. D. Willis, W. Matusik, Assemble them all: Physics-based planning for generalizable assembly by disassembly, ACM Transactions on Graphics 41 (2022).

[3] Grüninger, Chui, Ru, Thai, A mereology for connected structures, in: Formal Ontology in Information Systems: Proceedings of the 11th International Conference (FOIS 2020), volume 330 of Frontiers in Artificial Intelligence and Applications, IOS Press, 2020, pp. 171-185. doi:10.3233/FAIA200670.

[4] A. A. G. Requicha, H. B. Voelcker, Constructive solid geometry, Technical Report TM-25, University of Rochester, Production Automation Project, 1977.

[5] B. G. Baumgart, A polyhedron representation for computer vision, in: Proceedings of the National Computer Conference, 1975, pp. 589-596.

[6] K. D. D. Willis, Y. Pu, J. Luo, H. Chu, T. Du, J. G. Lambourne, A. Solar-Lezama, W. Matusik, Fusion 360 gallery: A dataset and environment for programmatic CAD construction from human design sequences, ACM Transactions on Graphics (TOG) 40 (2021).

[7] K. D. Willis, P. K. Jayaraman, Chu, Tian, Li, Grandi, Sanghi, Tran, J. G. Lambourne, Solar-Lezama, Matusik, Joinable: Learning bottom-up assembly of parametric CAD joints, arXiv preprint arXiv:2111.12772 (2021).

[8] Tang, I. Akinola, Xu, Wen, Handa, K. V. Wyk, Fox, G. S. Sukhatme, Ramos, Narang, Automate: Specialist and generalist assembly policies over diverse geometries, in: Robotics: Science and Systems (RSS), 2024.

[9] Sener, Chatterjee, Shelepov, He, Singhania, Wang, A. Yao, Assembly101: A large-scale multi-view video dataset for understanding procedural activities, CVPR (2022).

[10] McDermott, Ghallab, Howe, Knoblock, Ram, Veloso, Weld, Wilkins, PDDL - The Planning Domain Definition Language, Version 1.2, Technical Report CVC TR-98-003, Yale Center for Computational Vision and Control, 1998. AIPS-98 Planning Competition Committee.

[11] R. E. Fikes, N. J. Nilsson, STRIPS: A new approach to the application of theorem proving to problem solving, Artificial Intelligence 2 (1971) 189-208. doi:10.1016/0004-3702(71)90010-5.

[12] Beßler, Pomarlan, Beetz, OWL-enabled assembly planning for robotic agents, in: Proceedings of the 2018 International Conference on Autonomous Agents (AAMAS '18), Stockholm, Sweden, 2018. Finalist for the Best Robotics Paper Award.

[13] Du, Li, Du, Su, Fu, Zhan, Zhao, Wang, Fast task planning with neuro-symbolic relaxation, arXiv preprint arXiv:2507.15975 (2025).