<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Softening Ontological Reasoning with Large Language Models</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Teodoro</forename><surname>Baldazzi</surname></persName>
							<email>teodoro.baldazzi@uniroma3.it</email>
							<affiliation key="aff0">
								<orgName type="institution">Università Roma Tre</orgName>
								<address>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Davide</forename><surname>Benedetto</surname></persName>
							<email>davide.benedetto93@gmail.com</email>
							<affiliation key="aff0">
								<orgName type="institution">Università Roma Tre</orgName>
								<address>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Luigi</forename><surname>Bellomarini</surname></persName>
							<email>luigi.bellomarini@bancaditalia.it</email>
							<affiliation key="aff1">
								<orgName type="institution">Banca d&apos;Italia</orgName>
								<address>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Emanuel</forename><surname>Sallinger</surname></persName>
							<email>sallinger@dbai.tuwien.ac.at</email>
							<affiliation key="aff2">
								<orgName type="institution">TU Wien</orgName>
								<address>
									<country key="AT">Austria</country>
								</address>
							</affiliation>
							<affiliation key="aff3">
								<orgName type="institution">University of Oxford</orgName>
								<address>
									<country key="GB">United Kingdom</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Adriano</forename><surname>Vlad</surname></persName>
							<email>adriano.vlad@gmail.com</email>
							<affiliation key="aff2">
								<orgName type="institution">TU Wien</orgName>
								<address>
									<country key="AT">Austria</country>
								</address>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff4">
								<address>
									<settlement>Bucharest</settlement>
									<country key="RO">Romania</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Softening Ontological Reasoning with Large Language Models</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">16D42C1D28FAC0C0A8A9FA49E2721132</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T18:27+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Ontological reasoning</term>
					<term>Language models</term>
					<term>Knowledge graphs</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Logic-based Knowledge Graphs (KGs) and Knowledge Representation and Reasoning (KRR) have emerged as fundamental methodologies in many data-intensive areas, fostering trust and accountability for effective decision-making. However, the knowledge captured by such approaches is often restricted by the rigidity of their structured rule-based formalisms. More recently, the rising adoption of Large Language Models (LLMs) has introduced a new layer of semantic understanding and flexibility in human-data interaction. Yet, these models are inherently limited in reasoning capabilities and lack systematic and explainable outcomes due to their opaque nature. To address today's challenge of combining the strengths of both technologies, we propose a novel neurosymbolic solution that leverages the power of LLMs to "soften" rule activations, enhancing adaptability in ontological reasoning while preserving the robustness and transparency of KRR systems.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>In recent years, the widespread interest in querying and exploiting large volumes of data has catalyzed the development of increasingly mature, efficient, and scalable solutions capable of capturing and reasoning over real-world scenarios. In this context, ensuring the transparency of data-driven processes is paramount to provide high levels of trustworthiness and accountability in decision-making, especially over critical domains such as finance and biomedicine <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b1">2]</ref>. Powered by logic-based Knowledge Representation and Reasoning (KRR) formalisms, such intelligent systems are fully explainable <ref type="bibr" target="#b2">[3]</ref>, as they provide factual conclusions augmented with the consequent logical steps that led to them through the inference. Among these formalisms, logic programming-based database query languages, such as Datalog and its extensions <ref type="bibr" target="#b3">[4]</ref>, are a yardstick, thanks to their effective trade-off between expressive power and computational complexity. Leveraging such languages, factual data from corporate databases can be combined with business-level definitions as ontologies in Knowledge Graphs (KGs), and further augmented via ontological reasoning <ref type="bibr" target="#b4">[5]</ref>.</p><p>However, ontological reasoning systems are constrained by the rigid nature of KRR formalisms at their foundation, which limits their adaptability to the complexities of real-world data. Indeed, these systems typically rely on query-based interactions, operating at a low level and often proving challenging for non-specialists to use effectively. 
Moreover, all inputs and outputs are confined to structured formats such as facts, n-tuples, or triples, and the generation of new knowledge through rule activation is restricted to what can be syntactically captured by predefined logical predicates and via precise bindings to actual values. This rigidity fundamentally clashes with the inherent ambiguity of unstructured or raw data that may not fit into predefined categories. Together with the incompleteness, inconsistencies, and inaccuracies that might affect such data, these issues inhibit the applicability of KRR in real-world scenarios where understanding the semantic meaning of information is crucial. Consequently, we are observing a critical need for solutions that enable more semantic-aware and flexible reasoning capabilities in such systems.</p><p>As an intuitive example, let us consider the natural language (NL) sentence "Through a series of five transactions, E. Musk has acquired 52% of Twitter in October 2022, after previously expressing interest in the platform during several interviews." and the logical rule Owns(owner,owned,shares), shares &gt; 0.5 → Controls(owner,owned), stating that "a financial entity owning more than 50% of the shares of another one controls it". While a human would readily understand that Elon Musk now controls Twitter, automatically inferring this result presents significant challenges for a KRR system. Indeed, it should first recognize that, despite the unstructured nature of the input, the rule's body could bind to it, given the close semantic relationship between acquisition and ownership. Then, it would need to correctly map the arguments of the Owns predicate to the corresponding portions of the input, i.e., identifying E. Musk as the owning entity, Twitter as the owned entity, and 52% as the shares involved in the ownership. 
Moreover, information such as the number of transactions and the time-frame, while not affecting rule activation, still provides relevant context in the financial domain the example belongs to and should not be filtered out, whereas details like Musk's prior expressions of interest can be omitted. Finally, the rule would be activated, producing as output Controls(E. Musk,Twitter), ideally augmented with such contextually-relevant details as additional metadata to enrich explainability.</p><p>The demand for solutions that enable more adaptable and flexible reasoning to navigate the intricacies and ambiguities of real-world data has gained further traction with the recent breakthrough of AI-based chatbots and Large Language Models (LLMs) <ref type="bibr" target="#b5">[6]</ref>, which has marked a significant turning point in the field of Natural Language Processing (NLP) and a pivotal shift in the access to data and knowledge towards more human-friendly and high-level paradigms. Today, LLMs such as OpenAI's GPT <ref type="bibr" target="#b6">[7]</ref> and Meta's Llama <ref type="bibr" target="#b7">[8]</ref> are effectively adopted to address a plethora of tasks across multiple domains <ref type="bibr" target="#b8">[9]</ref>. Following the development of techniques such as chain-of-thought prompting <ref type="bibr" target="#b9">[10]</ref>, recent attempts have been carried out to employ LLMs for complex data analyses as well as multi-step reasoning <ref type="bibr" target="#b10">[11,</ref><ref type="bibr" target="#b11">12]</ref>. Yet, despite the advancements in the field and the proficiency of these models in handling semantic relationships within natural language, concerns persist due to their intrinsic opacity and unpredictability <ref type="bibr" target="#b12">[13]</ref>. 
Indeed, they often fall short in providing systematic, explainable outcomes necessary for big data processing and robust decision-making in high-stakes domains <ref type="bibr" target="#b13">[14,</ref><ref type="bibr" target="#b14">15]</ref>. This paper addresses the challenge of synergistically combining the robustness and transparency of KRR systems with the power of LLMs in understanding the semantic meaning of NL knowledge. We propose a neurosymbolic solution that leverages LLMs to augment the ontological reasoning process with real-world semantic flexibility, injecting "softness" into rule activations. Specifically, we operate in the context of the Vadalog <ref type="bibr" target="#b15">[16]</ref> system, a Datalog-based reasoning engine for KGs which finds many industrial applications <ref type="bibr" target="#b16">[17]</ref>. The semantics of a set Σ of Vadalog rules can be defined in an operational way via the well-known chase <ref type="bibr" target="#b17">[18]</ref> procedure. Given an input database, this algorithmic tool expands it with new facts entailed via the application of the rules in Σ, until all of them are satisfied. Intuitively, a rule is applied when an exact binding is identified, i.e., a set of mappings of the variables in the rule's body to the constants of structured facts in the database.</p><p>With the goal of extending the traditional chase mechanism to address the complexities of unstructured data, our approach leverages a pre-trained Llama 3 model to act as a semantic unifier, responsible for identifying bindings in the chase between rules and such data. In practice, given the next rule to be applied via the reasoner, both the rule in its natural language form and the candidate facts to activate it on are passed to the LLM. The model leverages its semantic understanding capabilities to generate bindings as sets of mappings from the variables of the rule body to the proper excerpts of the NL facts. 
These mappings then undergo a validation phase, which includes a feedback loop to confirm their correctness and coherence and to address potential hallucinations. Once validated, the resulting binding is provided to the reasoner, which employs it to attempt rule activation. If all the conditions in the rule are satisfied, a new fact is inferred and additional details of the parent NL facts are preserved as chase metadata. Finally, the newly produced fact is verbalized into natural language via a dedicated module and a termination check is performed, again leveraging the LLM to ensure that the knowledge it provides has not already been generated at a previous step of the reasoning. If the check passes, the fact is added as a new input in the chase, and the procedure continues until no more bindings can be identified. A high-level summary of the pipeline, illustrated in Figure <ref type="figure" target="#fig_0">1</ref>, will guide our discussion.</p><p>More in detail, our contributions can be summarized as follows.</p><p>• We present a novel soft chase technique that extends logic rule bindings and termination control of traditional chase methodologies to unstructured data, leveraging the semantic awareness of LLMs and a deterministic verbalization of logic facts into NL.</p><p>• We deliver such an approach in a new neurosymbolic KRR-centered architecture (powered by Vadalog, but compatible with any chase-based reasoner) to enable more adaptable and flexible ontological reasoning while preserving robustness and explainability.</p><p>• We discuss a preliminary experimental evaluation confirming the validity of our approach and comparing standard chase with its soft counterpart, powered by pre-trained and Retrieval-Augmented Generation (RAG) <ref type="bibr" target="#b18">[19]</ref>-enriched versions of the LLM.</p><p>Overview. In Section 2 we provide essential background notions. In Section 3 we present our proposed neurosymbolic architecture. 
A preliminary experimental evaluation is provided in Section 4. Section 5 discusses related work. We draw our conclusions in Section 6.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Chase-based Ontological Reasoning in the Vadalog System</head><p>To guide the rest of our discussion, we first lay out some preliminary notions on ontological reasoning over KGs, with a specific focus on the Vadalog system and the chase procedure at its foundation. Relational foundations. Let C and V be disjoint countably infinite sets of constants and variables, respectively. A (relational) schema S is a finite set of relation symbols (or predicates) with associated arity. A term is either a constant or a variable. An atom over S is an expression of the form 𝑅(𝑣 ¯), where 𝑅 ∈ S is of arity 𝑛 &gt; 0 and 𝑣 ¯ is an 𝑛-tuple of terms. A database (instance) over S associates to each symbol in S a relation of the respective arity over the domain of constants. The members of the relations are called tuples or facts. Given two conjunctions of atoms ς 1 and ς 2 , we define a homomorphism from ς 1 to ς 2 as a mapping ℎ :</p><formula xml:id="formula_0">C ∪ V → C ∪ V s.t. ℎ(𝑡) = 𝑡 if 𝑡 ∈ C and for each atom 𝑎(𝑡 1 , . . . , 𝑡 𝑛 ) ∈ ς 1 , then ℎ(𝑎(𝑡 1 , . . . , 𝑡 𝑛 )) = 𝑎(ℎ(𝑡 1 ), . . . , ℎ(𝑡 𝑛 )) ∈ ς 2 .</formula><p>Vadalog syntax. Vadalog is a declarative language for ontological reasoning. It is based on Warded Datalog ± , a member of the Datalog family that, at the price of very mild syntactic restrictions, extends Datalog with existential quantifiers and guarantees PTIME data complexity for query answering <ref type="bibr" target="#b19">[20]</ref>. A Warded Datalog ± program consists of a set of tuples (or facts) and tuple-generating dependencies (TGDs) of the form ∀𝑥 ¯∀𝑦 ¯(𝜑(𝑥 ¯, 𝑦 ¯)→∃𝑧 ¯𝜓(𝑥 ¯, 𝑧 ¯)), where 𝜑(𝑥 ¯, 𝑦 ¯) (the body) and 𝜓(𝑥 ¯, 𝑧 ¯) (the head) are conjunctions of atoms over the respective predicates, 𝑥 ¯, 𝑦 ¯ are vectors of universally quantified variables and constants, and 𝑧 ¯ is a vector of existentially quantified variables. Quantifiers can be omitted and conjunction is denoted by comma. 
In this context, Vadalog extends the Warded fragment with features of practical utility to address real-world scenarios <ref type="bibr" target="#b15">[16]</ref>. Support for aggregate functions, namely sum, prod, min, max and count, is achieved by means of monotonic aggregations <ref type="bibr" target="#b20">[21]</ref>. Other relevant extensions include negations and negative constraints of the form 𝜑(𝑥 ¯, 𝑦 ¯) →⊥, where 𝜑(𝑥 ¯, 𝑦 ¯) is a conjunction of atoms and ⊥ denotes the truth constant false to model disjointness or non-membership, as well as expressions in rule bodies, modelled with comparison (&gt;, &lt;, ≥, ≤, ≠) and algebraic (+, −, * , /, etc.) operators. Chase Procedure. KRR approaches model KGs as the combination Σ(𝐷) of an extensional component, essentially the ground business data in a database 𝐷, and an intensional component, which formally describes the business knowledge as a set Σ of rules in a declarative language such as Vadalog. Performing ontological reasoning over the KG augments it with new inferred knowledge derived from the application of the rules over the input data. Specifically, the semantics of a Vadalog program can be defined in an operational way with the chase procedure <ref type="bibr" target="#b17">[18]</ref>. It enforces the satisfaction of a set Σ of rules over a database 𝐷, incrementally augmenting 𝐷 with facts entailed via the application of the rules over 𝐷, until fixpoint. While Vadalog guarantees that such fixpoint exists when only the core features are used <ref type="bibr" target="#b15">[16]</ref>, the joint presence of algebraic operations and recursion must be carefully handled, as even simple Datalog programs can in general be non-terminating <ref type="bibr" target="#b21">[22]</ref>. 
A TGD 𝜎 : 𝜑(𝑥 ¯, 𝑦 ¯)→𝜓(𝑥 ¯, 𝑧 ¯) is applicable to 𝐷 if: (i) there exists a homomorphism 𝜃 (technically known as binding) such that 𝜃 (𝜑(𝑥 ¯, 𝑦 ¯)) ⊆ 𝐷, that is, if there exists a set of mappings from the terms of 𝜑(𝑥 ¯, 𝑦 ¯) to the constants of facts in 𝐷 such that each term maps to exactly one constant, and (ii) 𝜃 (𝜓(𝑥 ¯, 𝑧 ¯)) is a fact not already present in 𝐷. If such a binding 𝜃 exists, then 𝜃 (𝜓(𝑥 ¯, 𝑧 ¯)), derived by applying these mappings to the conclusion of the TGD, is added to 𝐷 via a chase step. The chase graph G(𝐷, Σ) is the directed acyclic graph with the facts from the chase Σ(𝐷) as nodes and an edge from a node 𝑛 to a node 𝑚 if 𝑚 derives from 𝑛 (and possibly other facts) via a chase step <ref type="bibr" target="#b3">[4]</ref>. Dedicated works <ref type="bibr" target="#b22">[23,</ref><ref type="bibr" target="#b23">24]</ref> have thoroughly explored chase termination <ref type="bibr" target="#b21">[22]</ref> in Vadalog in the presence of recursion and algebraic operations. Vadalog reasoner. The Vadalog system is a state-of-the-art ontological reasoning engine that leverages the theoretical underpinnings of the chase procedure and the vast experience of the database community on provenance to power efficient, scalable, and explainable reasoning tasks over critical business domains and large KGs <ref type="bibr" target="#b15">[16]</ref>. To achieve this, it adopts a streaming data processing architecture based on the pipes and filters style <ref type="bibr" target="#b15">[16,</ref><ref type="bibr" target="#b24">25]</ref>. Here, the set of rules Σ and the queries are translated into active data scans (linear scans for linear TGDs, join scans for join TGDs, and an output scan for the query), connected by intermediate buffers in a processing pipeline. 
The reasoning process is performed as a data stream along the pipeline, where each filter (i.e., scan) reads tuples from the respective parent, from the output scan down to the external data stores that inject ground facts into the pipeline. Interactions between scans occur by means of primitives such as next(), which fetches facts from the parent stream, if present. Since, for each filter, multiple parent filters may be available, Vadalog selects which one to invoke by employing specific routing strategies (round-robin, shortest path, etc.) that manage a priority queue of the sources. This methodology allows Vadalog to keep track of the provenance of each result, derived from one or more chase steps. Unlike traditional semi-naive approaches <ref type="bibr" target="#b21">[22]</ref>, Vadalog generalizes the volcano iterator model <ref type="bibr" target="#b25">[26]</ref>, operating in a pull-based query-driven fashion in which, ideally, facts are materialized only at the end of the chase and if they contributed to the reasoning task.</p></div>
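To make the preliminaries concrete, the binding and chase-step mechanics above can be sketched in a few lines of Python. This is a naive-evaluation toy, not the Vadalog system's pull-based streaming architecture; it covers only plain TGDs without existentials, aggregations, or comparison operators, and all predicate and variable names are illustrative:

```python
# A fact is a pair (predicate, args); rule bodies and heads are atoms
# (predicate, terms), where terms starting with '?' are variables and
# everything else is a constant.

def match_atom(atom, fact, binding):
    """Try to extend `binding` into a homomorphism mapping `atom` onto `fact`."""
    pred, terms = atom
    fact_pred, fact_args = fact
    if pred != fact_pred or len(terms) != len(fact_args):
        return None
    extended = dict(binding)
    for term, const in zip(terms, fact_args):
        if term.startswith("?"):
            if extended.get(term, const) != const:  # each variable maps to one constant
                return None
            extended[term] = const
        elif term != const:                         # constants must match exactly
            return None
    return extended

def chase(facts, rules):
    """Naive chase: apply every rule until fixpoint (no new facts entailed)."""
    db = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            bindings = [{}]
            for atom in body:                       # enumerate bindings atom by atom
                bindings = [b2 for b in bindings for f in db
                            if (b2 := match_atom(atom, f, b)) is not None]
            head_pred, head_terms = head
            for b in bindings:
                new = (head_pred, tuple(b.get(t, t) for t in head_terms))
                if new not in db:                   # chase step
                    db.add(new)
                    changed = True
    return db
```

Note that a join rule fires only when the constants bound to the shared join variable coincide syntactically; this exact-match requirement is precisely the rigidity that the soft chase of the next section relaxes.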
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Neurosymbolic Reasoning by Softening the Chase</head><p>The input blocks of the soft chase pipeline in Figure <ref type="figure" target="#fig_0">1</ref> are a set 𝐷 of data, a set Σ of reasoning rules expressed in Vadalog, and a glossary 𝐺. Without loss of generality, we define 𝐷 as the collection of structured data from relational databases 𝐷 𝑠 and unstructured data from natural language sources 𝐷 𝑢 , all connected to Vadalog for the reasoning task. The glossary 𝐺 lists the predicates in Σ and their corresponding natural language descriptions.</p><p>Let us first introduce our running example. Here, 𝐷 contains a collection of acquisitions and ownerships of companies' shares by financial entities in the market, both persons and other companies.</p></div>
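As a minimal illustration of the role of 𝐺, the glossary can be thought of as a predicate-to-template map used to verbalize structured facts into NL; the templates below are hypothetical stand-ins, not the paper's actual glossary entries:

```python
# Hypothetical glossary G: each predicate maps to an NL template over its
# arguments (illustrative entries only).
GLOSSARY = {
    "Owns": "a financial entity {0} owns {2:.0%} shares of another financial entity {1}",
    "SignificantShares": "{0} owns significant shares of {1}",
}

def verbalize(fact):
    """Deterministically render a logical fact (predicate, args) as an NL sentence."""
    predicate, args = fact
    return GLOSSARY[predicate].format(*args)
```

For example, `verbalize(("Owns", ("Google LLC", "DeepMind", 0.7)))` produces a sentence about Google LLC owning 70% of DeepMind's shares, which can then be handed to the LLM alongside the verbalized rules.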
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Example 1.</head><p>The following set Σ contains the Vadalog rules governing who has decision power in a financial entity, based on who owns, directly or indirectly via intermediaries, a significant amount of shares of the financial entity <ref type="bibr" target="#b26">[27]</ref>: Owns(𝑥, 𝑦, 𝑠) → OwnedShares(𝑥, 𝑦, 𝑦, 𝑠) (𝜎 1 ) SignificantShares(𝑥, 𝑧), Owns(𝑧, 𝑦, 𝑠) → OwnedShares(𝑥, 𝑧, 𝑦, 𝑠) (𝜎 2 ) OwnedShares(𝑥, _, 𝑦, 𝑠), 𝑡𝑠 = msum(𝑠), 𝑡𝑠 &gt; 0.2 → SignificantShares(𝑥, 𝑦) (𝜎 3 ). We consider "what are all the entailed significant shares?" as the ontological reasoning task. Note that the example is not intended to reflect real-world dynamics.</p><p>In pure KRR settings, the set Σ(𝐷 𝑠 ) is computed via the standard chase: starting from Σ(𝐷 𝑠 ) = 𝐷 𝑠 , it augments Σ(𝐷 𝑠 ) with facts derived from the application of the rules in Σ up to fixpoint. Figure <ref type="figure">2</ref> illustrates the chase graph derived from the activation of Σ over 𝐷 𝑠 . Specifically, rule 𝜎 1 generates OwnedShares(Elon Musk, Tesla, Tesla, 0.19), OwnedShares(Google LLC, DeepMind, DeepMind, 0.7), and OwnedShares(BlackRock, Google, Google, 0.4), representing the direct ownerships entailed from the input facts. Then, SignificantShares(Google LLC, DeepMind) and SignificantShares(BlackRock, Google) are inferred via rule 𝜎 3 , whereas Elon Musk does not own significant shares of Tesla directly. Note that we cannot automatically derive, via rule 𝜎 2 and rule 𝜎 3 , that BlackRock owns significant shares of DeepMind indirectly through Google, as rule 𝜎 2 does not activate on the join argument ⟨Google LLC, Google⟩. Reasoning with the soft chase. Let us now extend Example 1 by taking into account an additional source of information apart from 𝐷 𝑠 . For instance, consider the following input NL data 𝐷 𝑢 = {"E. Musk bought 21% additional shares of Tesla in 2023", "Andy Jassy is CEO of Amazon since 2021"}. 
Indeed, in this instance relevant information would be lost via the standard chase due to the absence of syntactic bindings from the rule bodies to NL knowledge in 𝐷 𝑢 . Thus, we extend binding identification by introducing the soft chase, in which an LLM acts as a semantic unifier between rule bodies and Σ(𝐷), 𝐷 = 𝐷 𝑠 ∪ 𝐷 𝑢 , injecting NL understanding capabilities into the reasoning process.</p><p>Specifically, the soft chase comprises five distinct phases, discussed below for Example 1 with the aid of Algorithm 1 and Figure <ref type="figure" target="#fig_0">1</ref>.</p></div>
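Since Algorithm 1 is not reproduced here, the following Python skeleton gives one possible reading of its five phases, with every LLM-backed component (binding identification, validation, semantic duplicate detection) passed in as a pluggable stub; all function names are illustrative, and the routing strategy is not modeled:

```python
MAX_ATTEMPTS = 3  # cap on the feedback loop between unifier and validator

def soft_chase(facts, rules, identify_binding, validate, activate,
               verbalize, is_semantic_duplicate):
    """Skeleton of the soft chase; the last five arguments are pluggable
    components (LLM-backed ones are stubs in this sketch)."""
    chase_db = set(facts)                          # 1. initialization
    changed = True
    while changed:
        changed = False
        for rule in rules:                         # rule selection
            for fact in list(chase_db):
                mappings, feedback = None, None
                for _ in range(MAX_ATTEMPTS):      # 2./3. identification + validation
                    mappings = identify_binding(rule, fact, feedback)
                    if not mappings:
                        break                      # no binding identified
                    ok, feedback = validate(rule, mappings)
                    if ok:
                        break
                    mappings = None                # retry with validator feedback
                if not mappings:
                    continue
                new_fact = activate(rule, mappings)          # 4. rule activation
                if new_fact is None or new_fact in chase_db: # applicability check
                    continue
                if not is_semantic_duplicate(verbalize(new_fact), chase_db):
                    chase_db.add(new_fact)         # 5. termination check passed
                    changed = True
    return chase_db
```

The skeleton makes the control flow explicit: the unifier may be re-invoked up to `MAX_ATTEMPTS` times with the validator's feedback before the rule is considered unable to bind to the current data.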
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Initialization and rule selection.</head><p>As in a standard chase procedure, we begin by initializing the set Σ(𝐷) of chase facts to the ground ones in 𝐷 (line 2 in the algorithm). Next, we consider the data from Σ(𝐷) to activate the rules in Σ and generate new knowledge. The current rule and the data to check for bindings are fetched via the next() primitive in Vadalog, if present, according to a routing strategy. Let us assume that we are employing the default round-robin strategy. Let us also assume that each rule features both its logical form and a natural language description, easily produced as a preprocessing step by deterministically verbalizing the atoms in body and head into a "Since {body}, then {head}" sentence <ref type="bibr" target="#b27">[28]</ref> according to select-project-join semantics and looking up the glossary 𝐺. Similarly, if the input facts belong to the 𝐷 𝑠 database, they are verbalized as well.</p></div><div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Binding identification.</head><p>The goal of this step is to identify the possible binding of the current rule body with the input facts. To achieve this, the LLM is employed, acting as a semantic unifier to generate a set of variable-to-constant mappings. Specifically, we operate with a pre-trained model, augmented only with some manually defined examples of mappings in a few-shot learning fashion to increase accuracy and limit hallucinations, both in the actual task and in the output format.</p><p>Here we observe distinct behaviours according to the type of the rule. Indeed, if the rule is linear, i.e., it features a single atom in the body (such as 𝜎 1 in Example 1), the model only verifies whether there exists a set of mappings from the verbalized atom to excerpts of the NL fact (line 7). For instance, the NL fact "E. Musk bought 21% additional shares of Tesla in 2023" maps to the verbalized form of atom Owns(𝑥,𝑦,𝑠), that is, "A financial entity 𝑥 owns 𝑠% shares of another financial entity 𝑦". 
If a possible binding is identified, the LLM returns as output the structured set of mappings from the rule body to the fact, e.g., { 𝑥 → E. Musk, 𝑦 → Tesla, 𝑠 → 0.21 }, together with the details around the time-frame as additional metadata. Otherwise, it returns the empty set.</p><p>If instead the rule involves a join, first the model performs the same binding identification as in the linear case, for each individual atom in the body. Then, it further processes the resulting sets of mappings to check whether the values corresponding to the join variables match semantically (line 10), in which case the mappings are returned as output. For instance, the input fact "Google LLC owns 70% shares of DeepMind" (the NL version of Owns(Google LLC, DeepMind, 0.7)) and "BlackRock owns significant shares of Google" (the NL version of SignificantShares(BlackRock, Google)) match on the join argument ⟨Google LLC, Google⟩, unlike the standard chase approach discussed above.</p></div><div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Binding validation.</head><p>After generating the candidate mappings 𝑖 mappings , a validation step occurs. Specifically, it first performs a deterministic check to ensure that all the variables in the body have been mapped to exactly one constant (e.g., an excerpt of the NL fact). This step is required to comply with the definition of binding as a homomorphism introduced in Section 2. Then, a separate LLM is employed as well, acting as a validator to confirm the response of the binding identification phase in a feedback loop fashion (lines 11-18). Indeed, if the candidate mappings do not pass the check, the cause of the issue is provided to the semantic unifier, which is tasked with repeating the step. A limit is enforced on the maximum number of attempts before considering the rule as unable to be bound to the current data.</p></div>
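The interplay between the semantic unifier and the deterministic part of binding validation could be sketched as follows; note that the few-shot prompt layout and the JSON output convention are our assumptions, as the paper does not fix a concrete prompt or answer format:

```python
import json

# Hypothetical few-shot example shown to the semantic unifier; entity names
# are illustrative.
FEW_SHOT = (
    'Rule body: "A financial entity x owns s% shares of another financial entity y"\n'
    'Fact: "Acme bought 60% of Beta Corp in 2019"\n'
    'Mappings: {"x": "Acme", "y": "Beta Corp", "s": "0.60"}\n'
)

def build_prompt(verbalized_body, nl_fact):
    """Assemble the few-shot prompt asking for variable-to-excerpt mappings."""
    return (f'{FEW_SHOT}\nRule body: "{verbalized_body}"\n'
            f'Fact: "{nl_fact}"\nMappings:')

def check_mappings(llm_answer, body_variables):
    """Deterministic part of binding validation: parse the model's answer and
    verify that each body variable maps to exactly one constant; on failure,
    return feedback for the unifier to retry with."""
    try:
        mappings = json.loads(llm_answer)
    except json.JSONDecodeError:
        return None, "output is not valid JSON"
    if set(mappings) != set(body_variables):
        return None, f"map exactly the variables {sorted(body_variables)}"
    return mappings, None
```

The feedback string returned by `check_mappings` is what, in the feedback loop, would be handed back to the semantic unifier before it repeats the step.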
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Rule activation.</head><p>If the set of mappings is not empty after validation, the logic rule can be deterministically activated via the Vadalog reasoner according to the binding. Before this, standard applicability checks of the rule occur, verifying the pre-existence of the unified head in Σ(𝐷). If that is not the case, and if additional conditions that might be present in the rule, such as selections and negations, are satisfied, the rule is activated and the new logic fact 𝑖 ′ is inferred (line 20). Then, the fact is verbalized via the dedicated module and according to the glossary (line 22). For instance, from the binding { 𝑥 → E. Musk, 𝑦 → Tesla, 𝑠 → 0.21 }, rule 𝜎 1 generates the fact OwnedShares(E. Musk, Tesla, Tesla, 0.21), verbalized into "E. Musk owns 21% shares of Tesla directly", with the specific time-frame of the parent fact as additional chase metadata, i.e., "acquisition occurred in 2023".</p></div><div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Termination check.</head><p>Finally, the resulting fact 𝑖 ′ undergoes a semantic termination check to ensure that it is not already present in the chase instance Σ(𝐷). In the soft chase, this step, essential to prevent loops in recursive settings, goes beyond standard applicability checks, as it limits the redundancy of inferred knowledge throughout the reasoning by pruning facts whose semantic meaning has already been derived in a previous step. Thus, the semantic unifier is employed once again and the verbalized version of 𝑖 ′ is semantically compared with the ones of the facts in Σ(𝐷) (line 23). Such a phase needs to be properly handled to prevent the removal of relevant facts. For instance, in our running example the fact "E. Musk owns 21% shares of Tesla directly" must be added to Σ(𝐷), thus it must not be pruned due to "Elon Musk owns 19% shares of Tesla". To address this, the LLM is enriched with specific examples, and the chase metadata of the compared facts is taken into account as well. 
If 𝑖 ′ passes the check, it is added to Σ(𝐷) and the soft chase begins a new iteration, until fixpoint. Extending soft chase with RAG. To further specialize the LLM into the domain of interest for the reasoning task, thus enabling a more accurate semantic unification throughout the procedure, we can also make available additional knowledge and terminology via RAG mechanisms. RAG enhances the model's contextual understanding by retrieving relevant documents or data points that contain specific information related to the concepts (i.e., the atoms and the facts) involved in the binding at hand. As further discussed in the next section, this proved to have a significant impact in practical settings. For instance, in pure soft chase the fact SignificantShares(Andy Jassy, Amazon) is inferred from the NL input "Andy Jassy is CEO of Amazon since 2021" via rule 𝜎 1 , due to the mapping { 𝑥 → Andy Jassy, 𝑦 → Amazon, 𝑠 → 0.51 }, and then rule 𝜎 3 . In this instance, the LLM incorrectly assumes that being CEO of a company entails owning the majority of its shares. We can prevent this incorrect inference by explicitly specifying, in the domain knowledge provided via RAG, that, in the absence of additional information, a CEO does not necessarily own any shares of the company at all. Figure <ref type="figure" target="#fig_2">4</ref> illustrates the soft chase graph for Example 1. It can be observed how the soft chase variants augment the resulting chase instance with multiple relevant facts derived from the LLM's semantic understanding of the domain.</p></div>
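As a toy illustration of the RAG enrichment step, the sketch below ranks domain snippets by word overlap with the fact at hand and prepends the top ones to the unifier's prompt; a production setup would instead use dense embeddings and a vector index, so both functions are illustrative stand-ins:

```python
def retrieve(query, documents, k=2):
    """Toy retriever: rank domain snippets by bag-of-words overlap with the query.
    (Stand-in for a real dense-embedding retriever.)"""
    query_words = set(query.lower().split())
    ranked = sorted(documents,
                    key=lambda d: -len(query_words.intersection(d.lower().split())))
    return ranked[:k]

def enrich_prompt(base_prompt, query, domain_docs, k=2):
    """Prepend the retrieved domain knowledge to the semantic unifier's prompt."""
    context = "\n".join(retrieve(query, domain_docs, k))
    return f"Domain knowledge:\n{context}\n\n{base_prompt}"
```

With a snippet such as "a CEO does not necessarily own any shares of the company" in the retrieved context, the unifier has the domain knowledge needed to avoid the spurious 𝑠 → 0.51 mapping discussed above.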
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Preliminary Experimental Evaluation</head><p>We integrated our proposed pipeline with the Vadalog system, although it is compatible with any chase-based ontological reasoner. A full-scale evaluation of the architecture is beyond the scope of this work. Instead, in this section we provide a preliminary comparison of standard and soft chase (in its pure and RAG-powered versions) over an instance of Example 1. Setup. The experiments were conducted over a KG comprising ownership relationships between companies and persons as financial entities, represented using various nomenclatures such as full names, stock symbols, phrases, or common abbreviations. The KG featured inherent ambiguities and synonymous terms, reflecting real-world complexities and inconsistencies typical of semi-structured and unstructured corporate data. Moreover, natural language sentences describing ownership and acquisition facts were provided separately as input, simulating the scenario introduced in the previous section. We employed a pre-trained Llama 3 70B model as the semantic unifier. Goal and Metrics. The primary goal of this evaluation is to assess the extent to which the injection of "softness" enhances the standard chase by recognizing similar entities and relationships according to real-world semantics. This enables augmenting the inference capabilities of the traditional approach while also preventing the generation of redundant data that represents the same knowledge in different syntactic forms. We conducted the experiments both before and after integrating the LLM with detailed knowledge of the domain of interest via RAG, with the purpose of further improving the model's accuracy in recognizing domain-specific entities and relationships. 
We compared the three approaches (standard chase, soft chase, and soft chase with RAG) according to the following metrics:</p><p>• precision, i.e., the fraction of inferred significant shares that are correct;</p><p>• recall, i.e., the fraction of correct significant shares that are inferred;</p><p>• F1 score, i.e., the harmonic mean of precision and recall;</p><p>• false positive (FP) shares, i.e., the fraction of inferred significant shares that are incorrect. For this evaluation, the correct instances of significant shares were determined through a manually curated golden set, in which domain experts verified the correctness of the inferred relationships. Discussion. Results are illustrated in Figure <ref type="figure" target="#fig_3">5</ref>. The standard chase featured full precision by definition, as it derived significant shares solely through strict logical binding with structured facts. However, this precision came at the cost of recall, as the standard chase was unable to bind rules to unstructured input, leading to missed inferences throughout the reasoning process. For instance, it failed to derive the direct relationship OwnedShares(E. Musk, Tesla, Tesla, 0.21) from the input knowledge "E. Musk bought 21% additional shares of Tesla in 2023", and consequently did not infer that Elon Musk holds significant shares of Tesla. On the other hand, the soft chase demonstrated lower precision but higher recall, as it leveraged the LLM to recognize and semantically unify unstructured concepts with structured relationships. Furthermore, the introduction of RAG significantly improved both precision and recall, reducing the generation of incorrect facts such as significantShares(Andy Jassy, Amazon) from the input "Andy Jassy is CEO of Amazon since 2021".
Indeed, the domain-specific knowledge provided by RAG effectively mitigated LLM hallucinations, reducing false positive bindings and consequently the incorrect inference of significant shares in the soft chase approach. </p></div>
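The four metrics can be computed directly from the set of inferred significantShares facts and the manually curated golden set. The Python sketch below (with made-up facts, not the paper's actual data) shows the computation; note that the FP share, as defined here, is the complement of precision.

```python
def evaluate(inferred: set, gold: set) -> dict:
    """Precision, recall, F1, and FP share of inferred facts vs. a gold set."""
    tp = len(inferred & gold)                          # correct inferences
    precision = tp / len(inferred) if inferred else 1.0
    recall = tp / len(gold) if gold else 1.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    fp_share = len(inferred - gold) / len(inferred) if inferred else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "fp": fp_share}

gold = {"significantShares(E. Musk, Tesla)",
        "significantShares(J. Bezos, Amazon)"}
soft = {"significantShares(E. Musk, Tesla)",
        "significantShares(Andy Jassy, Amazon)"}  # hallucinated inference
print(evaluate(soft, gold))
# precision 0.5, recall 0.5, f1 0.5, fp 0.5
```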
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Related Work</head><p>Neurosymbolic methodologies are currently at the forefront of both academic and industrial research due to their potential in developing more intelligent, versatile, and explainable AI applications <ref type="bibr" target="#b28">[29]</ref>. In this context, the integration of logic-based KGs and, more broadly, KRR approaches with LLMs has shown significant promise <ref type="bibr" target="#b29">[30,</ref><ref type="bibr" target="#b30">31]</ref>. Among the distinct forms of hybrid interactions between the two paradigms <ref type="bibr" target="#b31">[32]</ref>, studies have primarily focused on enriching LLMs with domain-specific knowledge encapsulated in KGs <ref type="bibr" target="#b27">[28]</ref>, as well as employing these models for tasks such as KG construction from unstructured text <ref type="bibr" target="#b32">[33]</ref> and exploration <ref type="bibr" target="#b33">[34]</ref>.</p><p>A recent line of research involves equipping LLMs with foundational reasoning skills, modeling the implicit structural information within the text and performing explicit logical reasoning over it to deduce conclusions <ref type="bibr" target="#b34">[35]</ref>. However, while these approaches improve reasoning capabilities, they often lack the robust, transparent reasoning structures that KRR systems inherently provide. To address this, frameworks such as LOGIC-LM have been introduced, which first translate natural language problems into symbolic formulations using LLMs and then employ a deterministic symbolic solver for inference <ref type="bibr" target="#b35">[36]</ref>.</p><p>To the best of our knowledge, this is the first approach that goes beyond the pure combination of LLMs with symbolic solvers to translate and solve specific logical problems. 
Our proposal is designed to seamlessly integrate LLMs within a KRR-centric framework to enhance ontological reasoning with semantic understanding throughout the whole process, injecting human-like flexibility for complex real-world tasks while also preserving the inherent transparency of the paradigm.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Conclusion</head><p>In this paper, we addressed the limitations of traditional ontological reasoning systems, particularly their inherent rigidity in managing the intricacies and ambiguities of natural language data. We proposed a novel neurosymbolic approach that integrates Large Language Models as semantic interpreters between logic rules and such unstructured knowledge, enhancing the flexibility and robustness of rule activations. Our preliminary experiments demonstrate the effectiveness of our solution in preserving correctness and explainability while significantly improving adaptability. As future work, we aim to further refine the underlying formalism of our proposal and tackle challenges related to accuracy and scalability, particularly critical when processing large amounts of text as input knowledge for complex reasoning tasks. We believe this approach lays the foundation for deeper and more synergistic interactions between KRR systems and LLMs, fostering human-like reasoning in real-world contexts.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Neurosymbolic reasoning pipeline for LLM-powered soft chase.𝐷 represents input data collected from relational databases and natural language sources connected to the ontological reasoning system. Σ denotes the set of logic rules to be applied on 𝐷. Σ(𝐷) refers to the original data augmented with new knowledge inferred by applying the rules in Σ throughout the reasoning process.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Vadalog processing pipeline of soft chase for Example 1. Green nodes are linear rules, the blue one is a join rule, and the red one is the output of the reasoning task. Solid edges are logical dependencies between the rules, and dashed ones denote an interaction with the semantic unifier of the type specified in the label (bind(), join(), check_termination()).</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: Instance of soft chase graph for Example 1. Red nodes and edges denote an incorrect derivation due to LLM hallucination, prevented in the version featuring RAG support.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 5 :</head><label>5</label><figDesc>Figure 5: Comparison of precision, recall, F1 score, and FP shares for standard chase, soft chase, and soft chase with RAG, evaluated on an instance of Example 1.</figDesc></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>The work on this paper was partially supported by the Vienna Science and Technology Fund (WWTF) [10.47379/ICT2201, 10.47379/ VRG18013, 10.47379/NXT22018]; and the Christian Doppler Research Association (CDG) JRC LIVE.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Reasoning on company takeovers: From tactic to strategy</title>
		<author>
			<persName><forename type="first">L</forename><surname>Bellomarini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Bencivelli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Biancotti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Blasi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">P</forename><surname>Conteduca</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Gentili</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Data Knowl. Eng</title>
		<imprint>
			<biblScope unit="volume">141</biblScope>
			<biblScope unit="page">102073</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<title level="m" type="main">Reasoning over health records with Vadalog: a rule-based approach to patient pathways</title>
		<author>
			<persName><forename type="first">O</forename><forename type="middle">P</forename><surname>Dwyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Baldazzi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Davies</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Sallinger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Vlad</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
	<title level="a" type="main">Argumentation approaches for explainable AI in medical informatics</title>
		<author>
			<persName><forename type="first">L</forename><surname>Caroprese</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Vocaturo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Zumpano</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.iswa.2022.200109</idno>
		<ptr target="https://doi.org/10.1016/j.iswa.2022.200109" />
	</analytic>
	<monogr>
		<title level="j">Intelligent Systems with Applications</title>
		<imprint>
			<biblScope unit="volume">16</biblScope>
			<biblScope unit="page">200109</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">A general datalog-based framework for tractable query answering over ontologies</title>
		<author>
			<persName><forename type="first">A</forename><surname>Calì</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Gottlob</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Lukasiewicz</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.websem.2012.03.001</idno>
	</analytic>
	<monogr>
		<title level="j">J. Web Semant</title>
		<imprint>
			<biblScope unit="volume">14</biblScope>
			<biblScope unit="page" from="57" to="83" />
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Swift logic for big data and knowledge graphs: Overview of requirements, language, and system</title>
		<author>
			<persName><forename type="first">L</forename><surname>Bellomarini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Gottlob</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Pieris</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Sallinger</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Theory and Practice of Computer Science: 44th International Conference on Current Trends in Theory and Practice of Computer Science</title>
				<meeting><address><addrLine>Krems, Austria</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2018-02-02">January 29-February 2, 2018</date>
			<biblScope unit="page" from="3" to="16" />
		</imprint>
	</monogr>
	<note>Proceedings 44</note>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">The genai is out of the bottle: generative artificial intelligence from a business model innovation perspective</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">K</forename><surname>Kanbach</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Heiduk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Blueher</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Schreiter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Lahmann</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Review of Managerial Science</title>
		<imprint>
			<biblScope unit="page" from="1" to="32" />
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<title level="m" type="main">Improving language understanding by generative pre-training</title>
		<author>
			<persName><forename type="first">A</forename><surname>Radford</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Narasimhan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Salimans</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Sutskever</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<author>
			<persName><forename type="first">H</forename><surname>Touvron</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Martin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Stone</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Albert</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Almahairi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Babaei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Bashlykov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Batra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Bhargava</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Bhosale</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2307.09288</idno>
		<title level="m">Llama 2: Open foundation and fine-tuned chat models</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">A survey on evaluation of large language models</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Zhu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Yi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Wang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Transactions on Intelligent Systems and Technology</title>
		<imprint>
			<biblScope unit="volume">15</biblScope>
			<biblScope unit="page" from="1" to="45" />
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Large language models are zero-shot reasoners</title>
		<author>
			<persName><forename type="first">T</forename><surname>Kojima</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">S</forename><surname>Gu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Reid</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Matsuo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Iwasawa</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">NeurIPS</title>
		<imprint>
			<biblScope unit="volume">35</biblScope>
			<biblScope unit="page" from="22199" to="22213" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<author>
			<persName><forename type="first">D</forename><surname>Shu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Jin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Du</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhang</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2403.07311</idno>
		<title level="m">Knowledge graph large language model (kg-llm) for link prediction</title>
				<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Chain-of-thought prompting elicits reasoning in large language models</title>
		<author>
			<persName><forename type="first">J</forename><surname>Wei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Schuurmans</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Bosma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Xia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Chi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><forename type="middle">V</forename><surname>Le</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Zhou</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">NeurIPS</title>
		<imprint>
			<biblScope unit="volume">35</biblScope>
			<biblScope unit="page" from="24824" to="24837" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<monogr>
		<author>
			<persName><forename type="first">H</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Deng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Cai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Yin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Du</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2309.01029</idno>
		<title level="m">Explainability for large language models: A survey</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Answering the &quot;why&quot; in answer set programming-a survey of explanation approaches</title>
		<author>
			<persName><forename type="first">J</forename><surname>Fandinno</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Schulz</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Theory and Practice of Logic Programming</title>
		<imprint>
			<biblScope unit="volume">19</biblScope>
			<biblScope unit="page" from="114" to="203" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Teaching the fate community about privacy</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">I</forename><surname>Hong</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Commun. ACM</title>
		<imprint>
			<biblScope unit="volume">66</biblScope>
			<biblScope unit="page" from="10" to="11" />
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Vadalog: A modern architecture for automated reasoning with large knowledge graphs</title>
		<author>
			<persName><forename type="first">L</forename><surname>Bellomarini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Benedetto</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Gottlob</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Sallinger</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IS</title>
		<imprint>
			<biblScope unit="volume">105</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Knowledge graphs and enterprise AI: the promise of an enabling technology</title>
		<author>
			<persName><forename type="first">L</forename><surname>Bellomarini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Fakhoury</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Gottlob</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Sallinger</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ICDE</title>
				<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="26" to="37" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">A proof procedure for data dependencies</title>
		<author>
			<persName><forename type="first">C</forename><surname>Beeri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">Y</forename><surname>Vardi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of the ACM (JACM)</title>
		<imprint>
			<biblScope unit="volume">31</biblScope>
			<biblScope unit="page" from="718" to="741" />
			<date type="published" when="1984">1984</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Retrieval-augmented generation for knowledge-intensive NLP tasks</title>
		<author>
			<persName><forename type="first">P</forename><surname>Lewis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Perez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Piktus</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Petroni</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">NeurIPS</title>
		<imprint>
			<biblScope unit="volume">33</biblScope>
			<biblScope unit="page" from="9459" to="9474" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Beyond SPARQL under OWL 2 QL entailment regime: rules to the rescue</title>
		<author>
			<persName><forename type="first">G</forename><surname>Gottlob</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Pieris</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Twenty-Fourth International Joint Conference on Artificial Intelligence</title>
				<imprint>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Optimizing recursive queries with monotonic aggregates in deals</title>
		<author>
			<persName><forename type="first">A</forename><surname>Shkapsky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Zaniolo</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE 31st International Conference on Data Engineering</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="867" to="878" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<monogr>
		<author>
			<persName><forename type="first">S</forename><surname>Abiteboul</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Hull</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Vianu</surname></persName>
		</author>
		<title level="m">Foundations of databases</title>
				<imprint>
			<publisher>Addison-Wesley Reading</publisher>
			<date type="published" when="1995">1995</date>
			<biblScope unit="volume">8</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">The vadalog system: Datalog-based reasoning for knowledge graphs</title>
		<author>
			<persName><forename type="first">L</forename><surname>Bellomarini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Sallinger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Gottlob</surname></persName>
		</author>
		<idno type="DOI">10.14778/3213880.3213888</idno>
		<ptr target="https://doi.org/10.14778/3213880.3213888" />
	</analytic>
	<monogr>
		<title level="j">Proc. VLDB Endow</title>
		<imprint>
			<biblScope unit="volume">11</biblScope>
			<biblScope unit="page" from="975" to="987" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Reasoning in warded Datalog+/- with harmful joins</title>
		<author>
			<persName><forename type="first">T</forename><surname>Baldazzi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Bellomarini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Sallinger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Atzeni</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">SEBD</title>
		<imprint>
			<biblScope unit="page" from="292" to="299" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Ontological reasoning over shy and warded Datalog+/- for streaming-based architectures</title>
		<author>
			<persName><forename type="first">T</forename><surname>Baldazzi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Bellomarini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Favorito</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Sallinger</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Symposium on Practical Aspects of Declarative Languages</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2024">2024</date>
			<biblScope unit="page" from="169" to="185" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">The volcano optimizer generator: Extensibility and efficient search</title>
		<author>
			<persName><forename type="first">G</forename><surname>Graefe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">J</forename><surname>Mckenna</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ICDE, IEEE Computer Society</title>
				<imprint>
			<date type="published" when="1993">1993</date>
			<biblScope unit="page" from="209" to="218" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">Distributed company control in company shareholding graphs</title>
		<author>
			<persName><forename type="first">A</forename><surname>Gulino</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ceri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Gottlob</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Sallinger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Bellomarini</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE 37th International Conference on Data Engineering (ICDE)</title>
				<meeting><address><addrLine>Los Alamitos, CA, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="2637" to="2648" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<analytic>
		<title level="a" type="main">Fine-tuning large enterprise language models via ontological reasoning</title>
		<author>
			<persName><forename type="first">T</forename><surname>Baldazzi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Bellomarini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ceri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Colombo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Gentili</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Sallinger</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Joint Conference on Rules and Reasoning</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="86" to="94" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b28">
	<analytic>
		<title level="a" type="main">Neurosymbolic AI: the 3rd wave</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">D</forename><surname>Garcez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">C</forename><surname>Lamb</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Artificial Intelligence Review</title>
		<imprint>
			<biblScope unit="volume">56</biblScope>
			<biblScope unit="page" from="12387" to="12406" />
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b29">
	<monogr>
		<author>
			<persName><forename type="first">X</forename><forename type="middle">L</forename><surname>Dong</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2308.14217</idno>
		<title level="m">Generations of knowledge graphs: The crazy ideas and the business impact</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b30">
	<monogr>
		<title level="m" type="main">Is neuro-symbolic AI meeting its promise in natural language processing? A structured review</title>
		<author>
			<persName><forename type="first">K</forename><surname>Hamilton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Nayak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Bozic</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Longo</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2202.12205</idno>
		<ptr target="https://arxiv.org/abs/2202.12205" />
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b31">
	<monogr>
		<author>
			<persName><forename type="first">S</forename><surname>Pan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Luo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Wu</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2306.08302</idno>
		<title level="m">Unifying large language models and knowledge graphs: A roadmap</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b32">
	<monogr>
		<author>
			<persName><forename type="first">M</forename><surname>Trajanoska</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Stojanov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Trajanov</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2305.04676</idno>
		<title level="m">Enhancing knowledge graph construction using large language models</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b33">
	<analytic>
		<title level="a" type="main">&quot;Please, Vadalog, tell me why&quot;: Interactive explanation of Datalog-based reasoning</title>
		<author>
			<persName><forename type="first">T</forename><surname>Baldazzi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Bellomarini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ceri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Colombo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Gentili</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Sallinger</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">EDBT</title>
				<imprint>
			<date type="published" when="2024">2024</date>
			<biblScope unit="page" from="834" to="837" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b34">
	<monogr>
		<author>
			<persName><forename type="first">S</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Wei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Fan</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2301.08913</idno>
		<title level="m">Unifying structure reasoning and language model pre-training for complex reasoning</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b35">
	<monogr>
		<author>
			<persName><forename type="first">L</forename><surname>Pan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Albalak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">Y</forename><surname>Wang</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2305.12295</idno>
		<title level="m">Logic-LM: Empowering large language models with symbolic solvers for faithful logical reasoning</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
