<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>International Workshop on Ontologies for Autonomous Robotics, December</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Ontology-Guided Multi-Modal Perception for Trusted and Explainable Robotic Action</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Giovanni De Gasperis</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sante Dino Facchini</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tayyab Rehman</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dipartimento di Ingegneria e Scienze dell'Informazione e Matematica, Università degli Studi dell'Aquila</institution>
          ,
          <addr-line>Località Vetoio, L'Aquila, 67100</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <volume>10</volume>
      <issue>2025</issue>
      <fpage>0000</fpage>
      <lpage>0001</lpage>
      <abstract>
        <p>Deep neural models play a central role in robotic perception and navigation, yet their black-box nature limits interpretability, verifiability, and safety. This work introduces an ontology-guided multi-modal perception framework that integrates neural segmentation with symbolic reasoning to enable trustworthy autonomous behaviour. Dense visual observations are converted into RDF/OWL-Lite knowledge graphs encoding spatial relations and normative constraints, which are evaluated through SHACL/SPARQL validation to detect violations and generate factual and contrastive explanations. A clearance-aware A* planner exploits this validated knowledge to favour trajectories that maximise geometric safety while preserving semantic compliance, thereby ensuring consistency between perception, safety constraints, and downstream decision making. Experiments on the GOOSE dataset demonstrate real-time performance, achieving reasoning latency below 10 ms, over 95% SHACL rule conformance, and more than 15% improvement in minimum clearance compared to a baseline planner. The resulting neuro-symbolic loop provides a reproducible pipeline that tightly couples perception, safety reasoning, and planning. Beyond navigation, the framework ofers a practical foundation for manipulation, monitoring, and other autonomous tasks requiring explainable, verifiable, and safety-aware decision making.</p>
      </abstract>
      <kwd-group>
        <kwd>Ontology-Guided Robotics</kwd>
        <kwd>Multi-Modal Robotic Perception</kwd>
        <kwd>Trusted Human-Robot Interaction</kwd>
        <kwd>Neuro-Symbolic Architectures</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction and State of the Art</title>
      <p>
        The rapid adoption of autonomous robots in safety-critical domains such as urban navigation, healthcare,
and industrial logistics demands perception and decision pipelines that are accurate, transparent, and
norm-compliant. Deep neural models dominate robotic perception and planning for their ability to
process complex sensory data, yet their opaque nature limits interpretability and safety assurance.
Unsafe or unexplained behaviors, including spatial violations or norm breaches, remain a persistent risk.
Advances in symbolic reasoning and ontological modeling enable domain knowledge representation,
constraint validation, and formal explanation, as shown in prior neuro-symbolic vision work on
environmental safety and event detection [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. However, traditional symbolic systems struggle to
process noisy, high-dimensional perception outputs in real time. The emerging field of neuro-symbolic
robotics seeks to bridge this gap by combining deep learning with ontology-based reasoning to achieve
interpretable and trustworthy autonomy [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. In robotics specifically, the taxonomy “Neuro-Symbolic
Robotics” categorizes how symbolic reasoning is integrated and highlights the open challenges of
scalability, real-time reasoning, and grounding from perception [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Several recent works support this integration paradigm. The Teriyaki framework of Capitanelli
and Mastrogiovanni [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] bridges symbolic task planning with Large Language Models (LLMs) to generate task
plans, improving scalability over purely symbolic planners. Neusis, by Cai et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], presents a compositional neuro-symbolic framework for Unmanned Aerial Vehicle
(UAV) search missions, combining perception, probabilistic world modeling, and hierarchical planning;
it demonstrates gains over neural baselines but does not emphasize normative constraint checking.
      </p>
      <p>
        VisualPredicator, proposed by Liang et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] introduces neuro-symbolic predicates for robot planning,
improving interpretability and generalization, but is evaluated in simulated domains without full safety
enforcement. Furthermore, the VQ-CNMP framework by Aktas et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] addresses neuro-symbolic
skill learning in bi-level planning settings, discovering high-level skills from demonstrations. In the
area of robot policy explanation, neuro-symbolic generation of explanations for robot policies with
Weighted Signal Temporal Logic (wSTL) produces interpretable explanations of learned policies in
simulated environments [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. The NeSyPack framework of Li et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] implements a hierarchical
neuro-symbolic approach for bimanual logistics packing, integrating symbolic reasoning and modular
decomposition. In the broader robotics literature, the paper “Learning Neuro-Symbolic Abstractions
for Robot Planning and Learning” by Shah [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] proposes learning abstractions to support hierarchical
planning with performance guarantees. On the symbolic side, advances in semantic-web technologies
enable richer validation and inference over Resource Description Framework (RDF) and Web Ontology
Language (OWL) knowledge graphs. Work on explanations for non-validation in SHACL [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] and on the interplay between validation and inference in SHACL [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] explores how SHACL and SPARQL rules can capture constraints beyond OWL expressivity.
“Enabling Efficient and Semantic-Aware Constraint Validation in Knowledge Graphs” [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] focuses on
performance optimization, while “A SHACL-Based Approach for Enhancing Automated Compliance
Checking" [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] integrates SHACL shapes and SPARQL rules for domain compliance. In robotics, “A
Survey of Ontology-Enabled Processes for Dependable Robot Autonomy” [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] highlights the role of
ontologies in ensuring robustness and explainability.
      </p>
      <p>Nevertheless, current methods do not jointly address neural perception grounding, ontological
validation, violation detection, and explainable reasoning within a unified and reproducible framework.
Many remain confined to simulations or lack normative safety evaluation. The research questions we
aim to answer are the following: RQ1: How can neural perception be semantically grounded to
enable ontology-based reasoning in robotic systems? RQ2: Can SHACL-driven validation improve
safety assurance and interpretability without affecting real-time performance? RQ3: How effective
is a unified neuro-symbolic pipeline across diverse sensing modalities? The main contributions of this
work are: (i) a unified neuro-symbolic framework linking deep perception and ontology-based safety
reasoning; (ii) a reproducible toolkit for ontology mapping, SHACL validation, and explanation analysis;
(iii) experimental results showing reduced safety violations and real-time reasoning performance
on robotic benchmarks. The work is organized as follows: Section 2 presents a review of the
literature, and Section 3 describes the methods used to define the framework. Section 4 illustrates an
implementation of the paradigm and discusses the results. Finally, Conclusions, Acknowledgments,
and Declarations conclude the article.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Works and Literature Review</title>
      <p>This section reviews recent advances in neuro-symbolic robotics and ontology-guided validation that
are most relevant to our framework. The literature search was conducted using major scientific
databases (Google Scholar, IEEE Xplore, ACM Digital Library, and arXiv, last accessed Oct 2025), using
keywords including “neuro-symbolic robotics”, “ontology reasoning”, “SHACL validation”, and “explainable
autonomy”. After screening 86 initial results, 27 papers were examined in detail and 10 were selected for
their methodological relevance, empirical validation, and connection to autonomous systems. Table 1
summarizes these representative works.</p>
      <p>
        Recent contributions in neuro-symbolic robotics aim to integrate perceptual grounding with
symbolic reasoning. Notable examples include Imperative Learning [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], which fuses neural sensing with
symbolic structures for navigation and multi-robot coordination, and VisualPredicator [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], which
learns abstract neuro-symbolic predicates to support higher-level planning. While these approaches
enhance interpretability, they lack formal rule enforcement or safety validation. Similarly, iWalker [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]
demonstrates robust humanoid locomotion by combining perception and planning, yet does not
incorporate ontology-based constraints. Yuasa et al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] extract weighted temporal-logic requirements
from robot policies to improve interpretability, although the explanations remain descriptive and do not
prevent safety violations. These works collectively highlight the need for methods that bridge low-level
perception with explicit normative reasoning.
      </p>
      <p>
        Parallel efforts have explored ontology-driven validation in safety-critical domains. HERON [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]
employs OWL, SPARQL and SHACL to enforce safety in healthcare robotics, though it is not linked
to pixel-level perception. Studies such as Robaldo et al. [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] and Anim et al. [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] investigate
SHACL–SPARQL interactions and temporal compliance rules, offering insights into constraint expressivity but
outside robotics. SHACL-based consistency checking has also been applied to smart grid behaviors [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ],
demonstrating applicability to structured environments while remaining domain-specific. Beyond
validation, research on explanation and recovery is gaining momentum. Cornelio and Diab [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]
introduce a neuro-symbolic replanning framework integrating ontologies and large language models for
task-level recovery, yet it does not validate sensory assertions. Nawaz et al. [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] survey neuro-symbolic
AI integration strategies and emphasize gaps in datasets, reasoning benchmarks, and
perception-grounded constraint checking. These findings collectively motivate the need for unified systems that
map perceptual evidence into symbolic representations, validate safety norms, and generate transparent
explanations—a direction directly addressed by our proposed framework.
      </p>
      <sec id="sec-2-1">
        <title>No violation preven</title>
        <p>tion
safety No perception link</p>
      </sec>
      <sec id="sec-2-2">
        <title>Task-level</title>
        <p>checks
Improves inter- Not applied to
pretability; integrates robotics or real-time
ontology with neural validation
models
Dynamic graph cor- Not used in robotics
rectness
Temporal and aggre- Non-robotic domain
gate rules
Consistency in smart Domain-specific
grids
Online failure han- No sensory
validadling tion
Identifies reasoning No implementation
gaps</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Proposed Methodology</title>
      <p>The proposed framework is designed to perform ontology-guided reasoning for safety-aware robotic
navigation, using deep perceptual features extracted from the GOOSE dataset. The overall system
combines semantic perception, symbolic knowledge representation, constraint-based validation, and
explanation synthesis into a single reasoning cycle.</p>
      <sec id="sec-3-1">
        <title>3.1. GOOSE Dataset</title>
        <p>
          The GOOSE dataset [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ] provides synchronised RGB streams, depth maps, and robot telemetry at 30 FPS
across 12 indoor and semi-outdoor scenarios (≈ 25k frames). Each frame includes semantic safety labels
(Safe, Caution, Restricted), with violations manually verified. For each observation O_t, the perception
module extracts a variable set of scene elements E_t = {e_1, . . . , e_n}, which is then processed by the
ontology-based reasoning pipeline to infer frame-level safety states.
        </p>
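        <p>For concreteness, the following sketch shows one way such an observation and its scene elements could be represented in Python; the field names are illustrative assumptions, not the dataset schema:</p>
        <preformat>
from dataclasses import dataclass, field

@dataclass
class SceneElement:
    """One detected scene element e_i of E_t (illustrative fields)."""
    label: str                 # instance name, e.g. "robot1"
    category: str              # semantic class, e.g. "Robot", "RestrictedZone"
    relations: list = field(default_factory=list)  # e.g. [("inside", "RestrictedZone_03")]

@dataclass
class Observation:
    """One synchronised GOOSE frame O_t with its extracted elements E_t."""
    t: int                     # frame index in the 30 FPS stream
    safety_label: str          # "Safe" | "Caution" | "Restricted"
    elements: list = field(default_factory=list)  # E_t = [e_1, ..., e_n]
</preformat>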
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Perception-to-Ontology Mapping</title>
        <p>A deep encoder–decoder network based on a modified SegFormer backbone is trained to generate dense
segmentation masks and object-level bounding boxes for each frame in the GOOSE dataset. The network
outputs structured scene representations distinguishing multiple semantic categories such as robot,
wall, and restricted_zone. To make these perceptual outputs amenable to symbolic reasoning,
each detected region is assigned a class label and converted into an assertion within the OWL-Lite
ontology O. The ontology O consists of 31 classes organised into four
semantic families: (i) static structures (Wall, Door, Column, Corridor), (ii) dynamic agents (Robot,
Pedestrian), (iii) regulatory zones (SafeZone, CautionZone, RestrictedZone), and (iv) spatial
relations (inside, near, overlaps, occludes). These categories were chosen because they directly
correspond to the GOOSE annotation scheme and to normative indoor-navigation safety constraints
(e.g., maintaining safe distance, avoiding restricted areas). This alignment ensures that the ontology
captures the necessary entities and relations required for safety reasoning while remaining lightweight
for real-time execution. Each perception element e_i ∈ E_t is translated into an RDF triple r_{t,i} = (s, p, o),
where t denotes the time step and i indexes the i-th detected element. s denotes the subject entity
(e.g., robot), p the predicate or relation type (e.g., inside, near), and o the object entity. The
mapping function ℳ : E_t → G_t = {r_{t,1}, r_{t,2}, . . . , r_{t,n}} converts all perceptual elements from frame t into
RDF triples defining the symbolic world state G_t at time t. This structured representation provides the
foundation for subsequent SHACL-based validation and rule-driven enrichment.</p>
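        <p>As a minimal sketch of this mapping, assuming a hypothetical namespace and detection record (the released toolkit may differ), ℳ can be realised with rdflib:</p>
        <preformat>
from rdflib import Graph, Namespace
from rdflib.namespace import RDF

# Hypothetical namespace for the OWL-Lite ontology O (illustrative only).
ONTO = Namespace("http://example.org/goose-safety#")

def map_frame_to_rdf(detections, t):
    """Mapping function M: E_t -&gt; G_t, converting detected scene elements
    into RDF triples that define the symbolic world state at time t."""
    g = Graph()
    g.bind("onto", ONTO)
    for i, det in enumerate(detections):
        subj = ONTO[f"{det['label']}_{t}_{i}"]
        g.add((subj, RDF.type, ONTO[det["class"]]))  # e.g. onto:Robot
        for pred, obj in det.get("relations", []):   # e.g. ("inside", "RestrictedZone_03")
            g.add((subj, ONTO[pred], ONTO[obj]))
    return g

# Example: one robot detected inside a restricted zone at frame t = 1342.
frame = [{"label": "robot1", "class": "Robot",
          "relations": [("inside", "RestrictedZone_03")]}]
g_t = map_frame_to_rdf(frame, t=1342)
print(g_t.serialize(format="turtle"))
</preformat>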
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Ontological Constraint Reasoning</title>
        <p>The ontology includes domain-specific SHACL shapes S that formalise the safety restrictions of
the GOOSE navigation environment. Each shape encodes either a spatial relationship (e.g., “robot
inside restricted area”) or a kinematic threshold (e.g., velocity constraints). In total, the rule
set comprises 14 shapes: five describe spatial violations (e.g., RobotInsideRestrictedZone,
UnsafeCautionTraversal), four encode kinematic norms such as bounded acceleration and
turn-rate, and five capture derived semantic relations used for enrichment. These shapes were selected
to reflect common indoor navigation guidelines and the dataset’s own safety annotations.
Temporal and kinematic restrictions are added via SHACL–SPARQL filters to control speed, distance, and
changes in direction. Before validation, a set of SHACL–SPARQL rules R = {ρ_1, ρ_2, . . . , ρ_k} is applied
to infer implicit relationships. These rules include proximity
inference (e.g., deriving near(?a,?b) when Euclidean distance is below 0.5 m), approach-direction
estimation from velocity vectors, and multi-frame trend detection for adversarial movement patterns.
Applying these rules produces an enriched symbolic graph G′_t at time t that incorporates context not
directly observable from a single frame. To augment symbolic validation with geometric awareness,
spatial reasoning is performed directly on segmentation masks. A violation is triggered whenever the
Intersection-over-Union (IoU) between the robot’s mask M_R and any restricted-zone mask M_Z exceeds
a predefined threshold τ. Formally:</p>
        <p>(Robot inside RestrictedZone) ⇒ Violation = True, (1)
FILTER: (?speed &gt; 0.5 AND ?zone = CautionZone), (2)
G′_t = G_t ∪ {ρ(G_t) | ρ ∈ R}, (3)
IoU(M_R, M_Z) = |M_R ∩ M_Z| / |M_R ∪ M_Z| &gt; τ ⇒ Violation = True. (4)
Here, t indexes the current time step, G_t is the symbolic world state before inference, and G′_t is the
updated graph after applying rule-based enrichment. M_R and M_Z denote the segmentation masks of
the robot and a restricted zone, respectively, and τ is a fixed geometric safety threshold. Equation (4)
operationalizes spatial violation detection by measuring the normalized overlap between semantic
regions. Symbolic inference and geometric consistency checking together enable precise identification
of unsafe behaviors while maintaining alignment between visual observations and ontological safety
constraints.</p>
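        <p>A minimal validation sketch, assuming pySHACL and a simplified SHACL–SPARQL shape in the spirit of RobotInsideRestrictedZone (the namespace and property names are assumptions, not the released rule set):</p>
        <preformat>
from pyshacl import validate
from rdflib import Graph

# Simplified, illustrative shape: flag any Robot asserted inside a RestrictedZone.
SHAPES_TTL = """
@prefix sh:   &lt;http://www.w3.org/ns/shacl#&gt; .
@prefix onto: &lt;http://example.org/goose-safety#&gt; .

onto:RobotInsideRestrictedZoneShape a sh:NodeShape ;
    sh:targetClass onto:Robot ;
    sh:sparql [
        sh:message "Robot inside RestrictedZone" ;
        sh:select \"\"\"
            SELECT $this WHERE {
                $this &lt;http://example.org/goose-safety#inside&gt; ?z .
                ?z a &lt;http://example.org/goose-safety#RestrictedZone&gt; .
            }\"\"\" ;
    ] .
"""

# Toy world state G'_t for a violating frame.
DATA_TTL = """
@prefix onto: &lt;http://example.org/goose-safety#&gt; .
onto:robot1_1342 a onto:Robot ; onto:inside onto:RestrictedZone_03 .
onto:RestrictedZone_03 a onto:RestrictedZone .
"""

shapes = Graph().parse(data=SHAPES_TTL, format="turtle")
data = Graph().parse(data=DATA_TTL, format="turtle")
conforms, _, report = validate(data, shacl_graph=shapes)
print(conforms)  # False: the SPARQL constraint matches the unsafe assertion
print(report)    # human-readable report that feeds the explanation generator
</preformat>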
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Explanation Generation</title>
        <p>The framework provides factual and contrastive explanations for each detected violation v, combining
symbolic reasoning with frame-level visual evidence. The factual explanation φ_f(v) reports the SHACL
constraint that failed and enumerates the RDF assertions responsible for non-conformance. In contrast,
the contrastive explanation φ_c(v) identifies the minimal modification Δ* to the symbolic world
state that would restore conformance, such as removing an unsafe spatial relation or adjusting a
velocity bound. To support interpretability, symbolic violations are linked to visual cues extracted
from segmentation masks, enabling the system to highlight the geometric source of a failure (e.g.,
robot–restricted-zone overlap). A deterministic template-based generator converts SHACL metadata
into concise natural-language descriptions that remain formally traceable. For example: “At frame 1342,
Robot1 entered RestrictedZone-03 at 1.28 m/s; reducing speed to ≤ 0.5 m/s would satisfy Constraint-RZ-04.”
All explanation instances are logged together with their supporting triples and overlays, forming an
auditable history of safety-relevant events.</p>
        <p>φ_f(v) = { r ∈ G′_t | ¬conforms(r, S) },
φ_c(v) = Δ* = arg min_Δ |Δ| such that valid(G′_t ∖ Δ, S) = 1.</p>
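        <p>The template-based generator can be sketched as follows; the violation record fields and the wording are illustrative assumptions patterned on the example above:</p>
        <preformat>
from dataclasses import dataclass

@dataclass
class Violation:
    """Minimal record of a detected violation v (fields are assumptions)."""
    frame: int
    subject: str
    constraint: str
    failing_triples: list  # RDF assertions responsible for non-conformance
    fix: str               # minimal modification (Delta*) restoring conformance

def factual(v: Violation) -&gt; str:
    """phi_f(v): name the failed SHACL constraint and its supporting triples."""
    facts = "; ".join(" ".join(t) for t in v.failing_triples)
    return f"At frame {v.frame}, {v.subject} violated {v.constraint} ({facts})."

def contrastive(v: Violation) -&gt; str:
    """phi_c(v): state the minimal change that would restore conformance."""
    return f"{v.subject} would satisfy {v.constraint} if {v.fix}."

v = Violation(frame=1342, subject="Robot1", constraint="Constraint-RZ-04",
              failing_triples=[("Robot1", "inside", "RestrictedZone-03")],
              fix="its speed were reduced to 0.5 m/s or less")
print(factual(v))
print(contrastive(v))
</preformat>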
      </sec>
      <sec id="sec-3-5">
        <title>3.5. Planner Integration</title>
        <p>The trajectory generation module uses a standard grid-based A* planner operating on the semantic
free-space map inferred from the perception and ontology layers. A* is adopted because it provides
deterministic behaviour, transparent cost expansion, and reliable performance in the outdoor road
scenes of the GOOSE dataset. The baseline configuration uses an 8-connected grid with a Euclidean
heuristic, producing reproducible routes over the occupancy representation derived from semantic
segmentation. A safety-aware extension of A* is introduced by augmenting the edge cost with a
proximity penalty derived from SHACL-validated spatial relations, including near RestrictedZone
and inside RestrictedZone. This yields a clearance-aware objective in which the transition cost
increases as the robot approaches semantically unsafe regions. The underlying search procedure remains
unchanged; instead, the symbolic safety information reshapes the cost landscape and biases the planner
toward trajectories that maintain greater geometric margins. The resulting trajectories are subsequently
re-validated and explained by the neuro-symbolic layer, ensuring tight integration between perception,
ontology, safety reasoning, and motion planning.</p>
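        <p>A compact sketch of the clearance-aware search, assuming a boolean occupancy grid and a per-cell penalty map distilled from the SHACL-validated near/inside relations (an illustration, not the evaluated implementation):</p>
        <preformat>
import heapq, math

def astar(grid, start, goal, penalty):
    """8-connected A* with Euclidean heuristic. `penalty[cell]` adds the
    proximity cost derived from SHACL-validated spatial relations."""
    def h(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])
    moves = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]
    frontier = [(h(start, goal), 0.0, start, None)]
    parents, best_g = {}, {start: 0.0}
    while frontier:
        _, g, cur, par = heapq.heappop(frontier)
        if cur in parents:
            continue  # already expanded with a better cost
        parents[cur] = par
        if cur == goal:  # reconstruct the path back to start
            path = []
            while cur is not None:
                path.append(cur)
                cur = parents[cur]
            return path[::-1]
        for dx, dy in moves:
            nxt = (cur[0] + dx, cur[1] + dy)
            if not (0 &lt;= nxt[0] &lt; len(grid) and 0 &lt;= nxt[1] &lt; len(grid[0])):
                continue
            if grid[nxt[0]][nxt[1]]:
                continue  # occupied cell
            # Clearance-aware edge cost: step length plus semantic penalty.
            ng = g + math.hypot(dx, dy) + penalty.get(nxt, 0.0)
            if ng &lt; best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                heapq.heappush(frontier, (ng + h(nxt, goal), ng, nxt, cur))
    return None

grid = [[0] * 10 for _ in range(10)]          # 0 = free, 1 = occupied
near_rz = {(4, y): 2.0 for y in range(10)}    # band flagged near a RestrictedZone
print(astar(grid, (0, 0), (9, 9), near_rz))   # route pays extra to cross the band
</preformat>
        <p>Because only the edge costs change, the search procedure itself is untouched: the penalty map reshapes the cost landscape exactly as described above.</p>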
      </sec>
      <sec id="sec-3-6">
        <title>3.6. Evaluation and Computational Design</title>
        <p>The reasoning engine operates in a streaming configuration, processing 500-frame batches while
maintaining near real-time performance. System behaviour is characterised using three metrics: Violation
Rate (VR), Explanation Coverage (EC), and Reasoning Latency (RL):
VR = |V| / |F|, (5)
EC = (|E| / |V|) × 100, (6)
RL = (1 / |Q|) Σ_{q ∈ Q} time(q). (7)
Here, V denotes the set of detected violations, E the corresponding explanations, F the set of processed
frames, and Q the reasoning
operations such as rule applications and SHACL validation queries. The SHACL engine employs
incremental validation, re-evaluating only the triples that change at each frame to reduce computational
overhead. Visual overlays of detected violations and their associated explanations are produced to
support qualitative assessment of safety events. Figure 1 illustrates the overall neuro-symbolic pipeline
integrating perception, symbolic mapping, constraint validation, and explanation generation.</p>
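        <p>Under these definitions the metrics reduce to a few lines; the argument names and toy values below are assumptions:</p>
        <preformat>
def metrics(violations, explanations, op_times_ms, n_frames):
    """Violation Rate, Explanation Coverage and Reasoning Latency, Eqs. (5)-(7)."""
    vr = len(violations) / n_frames                           # VR = |V| / |F|
    ec = 100.0 * len(explanations) / max(len(violations), 1)  # EC in percent
    rl = sum(op_times_ms) / len(op_times_ms)                  # mean time per operation
    return vr, ec, rl

# Toy example: 59 violations over a 500-frame batch, all of them explained,
# with per-operation validation latencies in milliseconds.
print(metrics(range(59), range(59), [4.2, 7.9, 6.1], 500))
</preformat>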
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results and Discussion</title>
      <p>
        All activities were conducted using Google Colab Pro with a single NVIDIA T4 GPU (16 GB). The software
stack comprised Python 3.10, PyTorch 2.3, OpenCV, rdflib, and pySHACL. The perception module utilised
DeepLabV3 and a ResNet-50 encoder, resulting in low-latency segmentation (&lt; 200 ms/frame). The
evaluation was performed on the GOOSE dataset [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ], comprising about 25,000 annotated frames, split
into 70% training, 20% validation, and 10% test. Symbolic reasoning was implemented incrementally,
averaging a validation delay of ≤ 10 ms per query, ensuring real-time feasibility.
      </p>
      <sec id="sec-4-1">
        <title>4.1. Quantitative Evaluation</title>
        <p>The quantitative analysis examines how ontology-informed cost shaping influences trajectory geometry
and computational efficiency. The clearance-aware planner increases spatial safety margins while
preserving the violation behaviour of the baseline. Minimum and average clearances improve by
approximately 15% and 5.9%, respectively, with only a 2.8% increase in mean planning time (Table 2).
This indicates that integrating semantic safety information leads to trajectories that naturally maintain
larger buffers from restricted zones without compromising real-time performance. This work adopts
a standard grid-based A* planner with 8-connected motion and a Euclidean heuristic as the baseline,
following recent guidance for benchmarkable autonomous navigation tasks [
<xref ref-type="bibr" rid="ref23">23</xref>
]. The clearance-aware
variant augments each edge cost with a proximity-dependent penalty derived from SHACL-validated
spatial relations. As a result, the search procedure remains unchanged, but the cost landscape is
reshaped to bias expansions toward geometrically safer regions. Importantly, the violation rate remains
unchanged at 11.8% for both planners. This is expected, as both methods are evaluated on identical
trajectory sets and the semantic violations measured by SHACL are insensitive to small geometric
adjustments. The primary effect of the modified planner is therefore an improvement in geometric
caution rather than a reduction in violation frequency. To characterise overall behaviour, four metrics
are reported: violation rate, mean planning time, minimum clearance, and average clearance. Together,
these quantify the trade-off between navigational robustness and computational overhead. As shown
in Table 2, the clearance-aware variant maintains real-time feasibility while generating trajectories that
exhibit consistently larger geometric safety margins.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Results and Safety Evaluation</title>
        <p>
          The framework achieves over 95% SHACL rule conformance while improving geometric safety margins
by more than 15% compared to the baseline planner, all with real-time reasoning latency below 10 ms.
These results show that ontology-guided validation can be integrated into the perception loop without
degrading computational performance. Figures 3 and 4 summarize clearance, violation trends, and
SHACL compliance. Mean clearance strongly predicts violation rate (R² = 0.85), confirming that greater
standoff distance reduces unsafe events. VIS imagery yields the most stable performance, whereas NIR
shows slightly higher violation rates under illumination shifts. SHACL validation remains consistent
across constraint types, with only minor variation in clearance-based checks. To complement the
quantitative analysis, Figure 5 presents representative qualitative examples, including a normal scene, a
constraint-satisfying trajectory, and a restricted-zone violation. These examples illustrate how semantic
constraints manifest in the image space and how the validator localizes and highlights unsafe behaviour.
Compared with existing neuro-symbolic methods, the approach provides a tighter coupling between
pixel-level perception and formal constraint reasoning. Prior systems such as VisualPredicator [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] and
Imperative Learning [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ] improve interpretability but lack formal safety validation, while HERON [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]
enforces SHACL only at the task level. In contrast, our system embeds constraint checking directly into
the perception–planning loop, enabling transparent, fine-grained safety assessment. All code, ontology
models, and datasets are publicly available on Zenodo.
        </p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion and Future Work</title>
      <p>This work presented an ontology-guided multi-modal perception framework that unifies deep semantic
segmentation with symbolic safety reasoning to enable trustworthy robotic autonomy. By integrating
DeepLabV3–ResNet50 perception with SHACL-based constraint validation, the system achieved
verifiable trajectory compliance and near real-time reasoning performance on the GOOSE dataset. The
clearance-aware A* planner improved geometric safety margins by approximately 15% with negligible
runtime overhead, demonstrating the feasibility of embedding ontology reasoning directly within deep
perception pipelines. Future research will extend the approach toward richer multi-sensor fusion,
combining VIS/NIR imagery with LiDAR depth data to support full 3-D spatial reasoning and volumetric
clearance validation. We also plan to integrate uncertainty-aware risk modelling and temporal SHACL
rules for handling dynamic environments, and to deploy the framework on real robotic platforms for
hardware-in-the-loop evaluation. Broader benchmarking across manipulation datasets such as RH20T,
together with the release of an open and reproducible ontology toolkit, will further strengthen
robustness, explainability, and transparency in autonomous robotic systems. Finally, while all experiments
were conducted on a high-performance GPU for consistency, future work will investigate runtime
behaviour on lower-power hardware to assess deployment feasibility in real-world environments.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Declaration on the Use of Generative AI</title>
      <p>Generative AI tools were not used for the generation of scientific content, data analysis, or results
presented in this paper. Generative AI was only used, if at all, for minor language editing and proofreading
purposes. All ideas, methods, experiments, and conclusions are solely the responsibility of the authors.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments and Declarations</title>
      <p>This work was funded by “Bando a cascata Spoke 6 - MOST-UniMore” – NextGeneration EU, within the
Italian National PNRR, under the grant IMPACT (CUP: E93C22001070001) and PNRR Next Generation
EU, pursuant to Art. 8 of DM 630/2024, in collaboration with SPEE Srl of L’Aquila, Italy.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Rafanelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Costantini</surname>
          </string-name>
          , G. De Gasperis,
          <article-title>Neural-logic multi-agent system for flood event detection</article-title>
          ,
          <source>Intelligenza Artificiale</source>
          <volume>17</volume>
          (
          <year>2023</year>
          )
          <fpage>19</fpage>
          -
          <lpage>35</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>B. C.</given-names>
            <surname>Colelough</surname>
          </string-name>
          , W. Regli,
          <article-title>Neuro-symbolic ai in 2024: A systematic review</article-title>
          ,
          <source>arXiv preprint arXiv:2501.05435</source>
          (
          <year>2025</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>E.</given-names>
            <surname>Ugur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ahmetoglu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Nagai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Taniguchi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Saveriano</surname>
          </string-name>
          , E. Oztop,
          <article-title>Neuro-symbolic robotics</article-title>
          (
          <year>2025</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Capitanelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Mastrogiovanni</surname>
          </string-name>
          ,
          <article-title>A framework for neurosymbolic robot action planning using large language models</article-title>
          ,
          <source>Frontiers in Neurorobotics</source>
          <volume>18</volume>
          (
          <year>2024</year>
          )
          <fpage>1342786</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Cai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. R.</given-names>
            <surname>Cardenas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Leo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Backman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ghorbanali</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Datta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Qu</surname>
          </string-name>
          , et al.,
          <article-title>Neusis: A compositional neuro-symbolic framework for autonomous perception, reasoning, and planning in complex uav search missions</article-title>
          ,
          <source>IEEE Robotics and Automation Letters</source>
          (
          <year>2025</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Weller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. B.</given-names>
            <surname>Tenenbaum</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Silver</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. F.</given-names>
            <surname>Henriques</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Ellis</surname>
          </string-name>
          ,
          <article-title>Visualpredicator: Learning abstract world models with neuro-symbolic predicates for robot planning</article-title>
          ,
          <source>arXiv preprint arXiv:2410.23156</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>H.</given-names>
            <surname>Aktas</surname>
          </string-name>
          , E. Ugur,
          <article-title>Vq-cnmp: Neuro-symbolic skill learning for bi-level planning</article-title>
          ,
          <source>arXiv preprint arXiv:2410.10045</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M.</given-names>
            <surname>Yuasa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. S.</given-names>
            <surname>Sreenivas</surname>
          </string-name>
          , H. T. Tran,
          <article-title>Neuro-symbolic generation of explanations for robot policies with weighted signal temporal logic</article-title>
          ,
          <source>arXiv preprint arXiv:2504.21841</source>
          (
          <year>2025</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>B.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Liu</surname>
          </string-name>
          , C. Liu,
          <article-title>Nesypack: A neuro-symbolic framework for bimanual logistics packing</article-title>
          ,
          <source>arXiv preprint arXiv:2506.06567</source>
          (
          <year>2025</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>N.</given-names>
            <surname>Shah</surname>
          </string-name>
          ,
          <article-title>Learning neuro-symbolic abstractions for robot planning and learning</article-title>
          ,
          <source>in: Proceedings of the AAAI Conference on Artificial Intelligence</source>
          , volume
          <volume>38</volume>
          ,
          <year>2024</year>
          , pp.
          <fpage>23417</fpage>
          -
          <lpage>23418</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>S.</given-names>
            <surname>Ahmetaj</surname>
          </string-name>
          , et al.,
          <article-title>Reasoning about explanations for non-validation in shacl</article-title>
          ,
          <source>in: 18th International Conference on Principles of Knowledge Representation and Reasoning (KR)</source>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>J.</given-names>
            <surname>Ke</surname>
          </string-name>
          ,
          <article-title>Enabling efficient and semantic-aware constraint validation in knowledge graphs</article-title>
          ,
          <source>in: European Semantic Web Conference</source>
          , Springer,
          <year>2024</year>
          , pp.
          <fpage>104</fpage>
          -
          <lpage>114</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>J.</given-names>
            <surname>Anim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Robaldo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. Z.</given-names>
            <surname>Wyner</surname>
          </string-name>
          ,
          <article-title>A shacl-based approach for enhancing automated compliance checking with rdf data</article-title>
          ,
          <source>Information</source>
          <volume>15</volume>
          (
          <year>2024</year>
          )
          <fpage>759</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>E.</given-names>
            <surname>Aguado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Gomez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hernando</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Rossi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Sanz</surname>
          </string-name>
          ,
          <article-title>A survey of ontology-enabled processes for dependable robot autonomy</article-title>
          ,
          <source>Frontiers in Robotics and AI</source>
          <volume>11</volume>
          (
          <year>2024</year>
          )
          <fpage>1377897</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>C.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Ji</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Geng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Ren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Fu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhan</surname>
          </string-name>
          , et al.,
          <article-title>Imperative learning: A self-supervised neuro-symbolic learning framework for robot autonomy</article-title>
          ,
          <source>The International Journal of Robotics Research</source>
          (
          <year>2024</year>
          )
          <fpage>02783649251353181</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>X.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Fu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Xiong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>iwalker: Imperative visual planning for walking humanoid robot</article-title>
          ,
          <source>arXiv preprint arXiv:2409.18361</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>P.</given-names>
            <surname>Ioannidou</surname>
          </string-name>
          , I. Vezakis,
          <string-name>
            <given-names>M.</given-names>
            <surname>Haritou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Petropoulou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. T.</given-names>
            <surname>Miloulis</surname>
          </string-name>
          , I. Kouris,
          <string-name>
            <given-names>K.</given-names>
            <surname>Bromis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. K.</given-names>
            <surname>Matsopoulos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. D.</given-names>
            <surname>Koutsouris</surname>
          </string-name>
          ,
          <article-title>Healthcare robotics' ontology (heron): An upper ontology for communication, collaboration and safety in healthcare robotics</article-title>
          ,
          <source>in: Healthcare</source>
          , volume
          <volume>13</volume>
          ,
          MDPI
          ,
          <year>2025</year>
          , p.
          <fpage>1031</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>L.</given-names>
            <surname>Robaldo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Batsakis</surname>
          </string-name>
          ,
          <article-title>On the interplay between validation and inference in shacl: An investigation on the time ontology</article-title>
          ,
          <source>Semantic Web</source>
          <volume>15</volume>
          (
          <year>2024</year>
          )
          <fpage>567</fpage>
          -
          <lpage>599</lpage>
          . Early access, published online 2024-02-17.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>M.</given-names>
            <surname>Larhrib</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Escribano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Cerrada</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. J.</given-names>
            <surname>Escribano</surname>
          </string-name>
          ,
          <article-title>An ontological behavioral modeling approach with shacl, sparql, and rdf applied to smart grids</article-title>
          ,
          <source>IEEE Access 12</source>
          (
          <year>2024</year>
          )
          <fpage>82041</fpage>
          -
          <lpage>82056</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>C.</given-names>
            <surname>Cornelio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Diab</surname>
          </string-name>
          ,
          <article-title>Recover: A neuro-symbolic framework for failure detection and recovery</article-title>
          ,
          <source>in: 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)</source>
          , IEEE,
          <year>2024</year>
          , pp.
          <fpage>12435</fpage>
          -
          <lpage>12442</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>U.</given-names>
            <surname>Nawaz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Anees-ur Rahaman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Saeed</surname>
          </string-name>
          ,
          <article-title>A review of neuro-symbolic ai integrating reasoning and learning for advanced cognitive systems</article-title>
          ,
          <source>Intelligent Systems with Applications</source>
          (
          <year>2025</year>
          )
          <fpage>200541</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>P.</given-names>
            <surname>Mortimer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Hagmanns</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. G. T.</given-names>
            <surname>Luettel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Petereit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.-J.</given-names>
            <surname>Wuensche</surname>
          </string-name>
          ,
          <article-title>The goose dataset for perception in unstructured environments</article-title>
          (
          <year>2024</year>
          ). URL: https://arxiv.org/abs/2310.16788.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>A.</given-names>
            <surname>Alexander</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Venkatesan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mounsef</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Ramanujam</surname>
          </string-name>
          ,
          <article-title>A comprehensive survey of path planning algorithms for autonomous systems and mobile robots: Traditional and modern approaches</article-title>
          ,
          <source>IEEE Access</source>
          (
          <year>2025</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>