<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Study on Contradiction Detection Using a Neuro-Symbolic Approach</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Alessia Donata Camarda</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giovambattista Ianni</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Calabria</institution>
          ,
<addr-line>Rende, 87036</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
<p>Research related to automated contradiction processing is a hot topic in several scientific communities. Many solutions offer limited explainability due to the exclusive usage of neural models or machine learning approaches. Also, although reproducibility can be achieved by controlling randomness in these models, their inherent complexity and lack of transparency often hinder adoption in domains that require high levels of determinism and some elaboration tolerance. This problem can be addressed with the introduction of neuro-symbolic approaches, where part of the problem is solved by exploiting logical formalisms. In this paper, we propose a neuro-symbolic pipeline whose purpose is to identify simple contradictions within sentences expressed in natural language. The contribution of the non-explainable side of the pipeline is confined to the conversion from natural language and to the extraction of commonsense knowledge, while reasoning and knowledge derivation are delegated to a symbolic reasoner based on Answer Set Programming. We describe the proposed architecture, present a simplified implementation, and report on early experiments.</p>
      </abstract>
      <kwd-group>
        <kwd>Knowledge representation and reasoning</kwd>
        <kwd>Neuro-Symbolic AI</kwd>
        <kwd>Digital Forensics</kwd>
        <kwd>Logic Programming</kwd>
        <kwd>Natural Language Processing</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Contradiction management [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ] is a longstanding hot topic, especially in the Natural Language Processing field [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ], as it cuts across many other tasks such as speech recognition, text classification, natural-language understanding, and textual entailment classification. Contradiction management, herein intended as the set of research issues arising when dealing with the detection, automated processing, symbolic representation, and usage of contradictions, has been an object of study in many other fields under different terminology. In particular, dealing with contradictions is a fundamental yet challenging issue in the legal and forensics fields [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], where adding decision support tools for analyzing possibly contradictory evidence and/or possibly conflicting testimonies is as strategic as it is challenging.
      </p>
      <p>Specifically, the concept of contradiction and, more generally, the notion of anomaly play a prominent role in the forensics field. When solving a case, evidence and testimonies typically do not admit a straightforward interpretation and require proper processing: the search for new evidence that can confirm or contradict previous information and/or testimonies also plays an important role.</p>
      <p>
        Contradictions have been addressed from various perspectives also in the field of Knowledge Representation and Reasoning. Examples of this line of research include inconsistency management through argumentation frameworks [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] or in the realm of logic programming and
description logic [
        <xref ref-type="bibr" rid="ref10 ref7 ref8 ref9">7, 8, 9, 10</xref>
        ].
      </p>
      <p>
        Contradiction management is often approached with neural models [
        <xref ref-type="bibr" rid="ref11 ref12 ref13">11, 12, 13</xref>
        ], where recognizable patterns within sentences are learned: these models are often influenced by the presence of specific trigger words (e.g., "however", "but", etc.) and have limited ability to directly detect the semantic presence or absence of contradictions.
      </p>
      <p>The usage of logical formalisms can, in principle, help solve some known issues in contradiction management, such as adding forms of explainability, ensuring repeatable consistency checks, supporting conflict resolution strategies and/or generating all the non-conflicting scenarios, and facilitating the integration of heterogeneous knowledge sources under a unified symbolic format.</p>
      <p>However, the fruitful adoption of knowledge representation tools in the realm of contradiction management is prevented by several research gaps, among which are:
• Conversion of natural language into logical statements. As widely investigated in the natural language processing field, humans speak or write ambiguously: they omit subjects and context, they imply multiple subjects and predicates in the same sentence, and so on. A transformation from natural language to logical statements should preserve meaning, resolve ambiguities, retain context whenever applicable, and support reasoning.
• Formalizing contradictory knowledge. Although inconsistency is a first-class citizen in logical representations, and can be dealt with in several ways, contradictions occurring in natural language exhibit several shades that have no obvious mapping to the sharp inconsistencies modeled in formal logic. For instance, in natural language two propositions might be contradictory because of the conflicting semantics they convey and not because one is the direct negation of the other. However, modeling semantics requires additionally modeling a non-negligible portion of background knowledge.</p>
      <p>
        One of the alternative options that can be explored to manage contradictions is the use of neuro-symbolic architectures, i.e., combinations of neural models and symbolic approaches. This kind of solution is becoming increasingly widespread, as it allows the introduction of explainable and deterministic modules within combined evaluation pipelines that would otherwise be totally probabilistic and difficult to explain. Such approaches are ideal in areas where explainability plays an important role, such as the legal and forensic ones. In this paper, we propose to confine the role of Large Language Models (LLMs) to that of factual knowledge extraction tools: the extracted information is then processed in a subsequent symbolic reasoning stage that deals with structured knowledge expressed in the form of rules. As a symbolic declarative programming language we focus on Answer Set Programming [
        <xref ref-type="bibr" rid="ref14 ref15">14, 15</xref>
        ], the prominent logic programming formalism widely adopted in the state of the art, especially in applications involving non-monotonic reasoning and knowledge representation.
      </p>
      <p>Our contribution can be summarized in the following points:
• We propose a neuro-symbolic pipeline whose final output identifies the presence of contradictions within sentence pairs.
• We inspect the possibility of leveraging LLMs as extractors of ground terminological knowledge, possibly reusable for modeling contradictory statements.
• We explore a method for using LLMs as translators from natural language to logical formalisms, based on providing a fixed predicate format to the language model at hand. This approach has been made possible by recent improvements in LLMs that allow specifying the predicate format in prompts, without fine-tuning or training.
• As modeling all contradictions inherent in natural language is an ambitious goal, we identify a proper class of contradictory statements which nonetheless covers a reasonable range of practical cases, under an assumption of “context atomicity”.
• We report on first experiments performed in this respect, and discuss possible extensions and contradiction modelings.</p>
      <p>The rest of the paper is structured as follows. Section 2 provides some background details and reviews related work. Sections 3, 4 and 5 discuss in detail our proposed neuro-symbolic pipeline and the results obtained by experimenting with it. Section 6 provides some insights into possible extended modelings that analyze more specifically some different types of contradictions. Finally, Section 7 outlines our conclusions and some future work.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Background and related work</title>
      <p>
        Contradiction classification An analysis of categories of contradictions can be found in [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. De Marneffe et al. start their analysis with a simple question: “What is a contradiction?”. The answer they give is the following: “contradiction occurs when two sentences are extremely unlikely to be true simultaneously”. Although this answer is far from a strict logical definition of contradiction (i.e., sentences A and B are contradictory if there is no possible world in which A and B are both true), it comes very close to the way humans reason and speak. The taxonomy proposed by de Marneffe et al. can be summarized as follows:
• Antonymy-Based Contradictions (AC): these arise when key words in two statements are antonyms or have opposing meanings.
• Negation-Based Contradictions (NEC): these occur when one statement explicitly negates another.
• Numeric Contradictions (NUC): these arise when numerical values (such as quantities, dates, or statistics) do not match.
• Factive Contradictions (FC): these contradictions arise from factive verbs (e.g., "know," "realize," "confirm"), which presuppose the logical truth of the statement that is being affirmed.
• Structural Contradictions (SC): these contradictions arise due to differences in sentence structure, often leading to different interpretations.
• Lexical Contradictions (LC): these occur when two sentences use different words that imply opposing facts.
• World Knowledge Contradictions (WKC): these contradictions rely on conflicting general world knowledge.
      </p>
      <p>
        A similar classification is presented by Wu et al. [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. This taxonomy is more suited to AI-based pattern detection than to formal logic. Specifically, some of the proposed classes do not have clear logical structures and are harder to formalize. An example is the class of “Scope-Based Contradictions”, which occur when the scope of an event or fact is narrowed down or broadened in a way that contradicts the first statement. In [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] a classification of contradictions theorized by Aristotle is revisited. However, it does not fully reflect the contradictions of natural language as we have described them above.
Large Language Models A Large Language Model (LLM) is a machine learning model which can generate new content in response to a given prompt, usually in human-like language. LLMs have become popular due to the large number of tasks they can perform, e.g., text generation, translation, summarization, rewriting, and classification. LLMs are trained on vast amounts of text data from various sources, e.g., books, articles, and websites, which are usually cleaned from noise, standardized, and tokenized into smaller units. Thanks to their internal structure, LLMs can learn the importance of different words and capture complex relationships and dependencies within the text. They learn to predict the next word or sequence of words in a given text based on the context provided by the preceding words. After the training phase, LLMs can be further trained (fine-tuned) on specific tasks or domains to improve performance and to adapt their parameters to the particular requirements of the task at hand.
LLMs as semantic parsers and knowledge generators. Large Language Models have also been
used as generators of structured semantic knowledge with the aim of automating the generation of facts
or logical rules so as to facilitate the knowledge designer’s work. Some approaches generate structured
knowledge from texts [
        <xref ref-type="bibr" rid="ref19 ref20 ref21 ref22">19, 20, 21, 22</xref>
        ] while others receive images or videos as input [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ]. The main
task of these studies is to exploit LLMs as semantic parsers: therefore some of these approaches discuss
the possibility of fine-tuning the model in question to improve the generation capabilities and reduce
the possibility of errors. In particular, the work of [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ] provides a pipeline in which input tests are also
provided to LLMs in order to better handle errors and possibly improve the generation. LLMs have
been used for the population of ontologies whose schema already exists [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ]; and to derive entities,
associations and properties, in order to populate a new schema starting from texts and tables [
        <xref ref-type="bibr" rid="ref26">26, 27</xref>
        ].
Legal and Forensic Ontologies As the forensic field is a driving domain when considering
contradiction management, we explored the panorama of forensic ontologies in order to assess existing
knowledge models and possibly integrate our work in this context. Several ontologies have been
implemented to standardize and, to some extent, automate the representation of knowledge in forensics
and law. Since these two fields are closely connected, these ontologies often share overlapping concepts.
However, their scope and purpose are different. The Unified Cybersecurity Ontology (UCO) [28] was
designed to support information integration and situational awareness in cybersecurity systems. UCO
aggregates data from various cybersecurity tools for comprehensive analysis and supports automated
threat intelligence sharing. UCO has been extended by the Cyber-investigation Analysis Standard
Expression (CASE) ontology [29], which has become one of the most widely adopted standards in digital
forensics. CASE focuses on the representation of investigative actions and artifacts, and it helps in
standardizing the reporting of cyber investigations. Another important ontology is the Cyber Forensics
Ontology for Cyber Criminal Investigation [30], which aims to classify cybercrimes and examine the
collection of digital evidence. In the legal domain, several well-established ontologies exist. These
include:
• PROV-O [31], which models provenance information and describes the origins and history of
data and entities;
• LKIF (Legal Knowledge Interchange Format) Core Ontology [32], which represents basic legal
concepts such as norms, legal acts, and roles;
• OPJK (Ontology of Professional Judicial Knowledge) [33], which aggregates knowledge relevant
to legal professionals, particularly for legal education or frequent questions.
      </p>
      <p>However, none of the aforementioned ontologies explicitly model complex legal concepts such as
the content of testimony or depositions, despite their importance in both investigative and judicial
processes.</p>
      <p>Answer Set Programming Answer Set Programming (ASP) is a declarative language used in logic programming; although its origins date back to around 1993, it is nowadays widely used in artificial intelligence, planning, robotics and, more recently, digital forensics [34, 35, 36]. We assume the reader is familiar with the basics of modeling in Answer Set Programming, and we refer to [37] for further information on the topic.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Contradiction modeling and detection framework</title>
      <p>We next illustrate our framework and the types of contradictions we deal with.</p>
      <p>Our neuro-symbolic pipeline is depicted in Figure 1. The framework includes a Large Language Model (LLM) and an ASP solver.</p>
      <p>At runtime, the LLM plays both the role of commonsense knowledge extractor and that of parser converting knowledge into logical statements. The LLM is given as input a pair of claims (C1, C2), whose core semantic content is translated from natural language into ASP facts (instance knowledge). The LLM is also prompted to produce logical facts that model possible incompatibilities present in the sentences, useful for identifying contradictions (reusable knowledge).</p>
      <p>Reusable knowledge produced by the LLM can be used to populate a knowledge base. The content of the knowledge base, optionally including also reusable knowledge collected in the past, is provided as input to the ASP solver together with the instance knowledge. The solver also receives as input a logic program, which includes fixed knowledge and rules useful for identifying the contradictions possibly present in the input. The program rules are written by humans and are not dynamically generated.</p>
      <p>The output of the solver is an answer set which contains instances of the contradiction predicate, indicating whether the input to the pipeline is contradictory.</p>
      <sec id="sec-3-1">
        <title>3.1. Types of contradictions</title>
        <p>We assume we deal with pairs of claims (C1, C2) where C1 and C2 are in the form (id, s, p, o), where s represents the subject of the claim, p is the predicate, and o is the object; id is a unique identifier associated with each claim. We also assume that claims refer to events that happened at the same time, and that one specific main atomic event involved a single actor performing a single atomic action.</p>
        <p>We are particularly interested in three main types of incompatibilities:
• Incompatible objects:
– Given an object o, o is incompatible with respect to a predicate p. For example: given the predicate stab and the object gun, these are incompatible, as one cannot stab someone with a gun. This type of inconsistency is modeled using the logical fact incompatible_object(p, o), which is expected to be produced by the LLM at hand according to its commonsense knowledge.
– Given objects o1 and o2, these are incompatible simply because they are different, and only one object can be identified as the object of the claim for an atomic action. For example, it is contradictory to state that an atomic action was performed by Alice with one object and then to state that the same atomic action was performed using a different one. This type of incompatibility is handled by dedicated logic rules.
• Incompatible predicates: predicates p1 and p2 can be conflicting when they refer to the same atomic action and are neither synonymous nor equal. For example, the predicate stab and the predicate shoot are incompatible, as we expect one exclusively shoots or stabs someone in the same atomic action. This is modeled by asking the LLM at hand to produce assertions of the form incompatible_predicates(p1, p2), whenever applicable.
• Incompatible subjects: given subjects s1 and s2, these are incompatible simply because they are different. For example: stating that an action was performed by Alice and then stating that it was performed by Bob leads to a contradiction, due to the fact that Alice and Bob are two different people and cannot have performed the same action at that moment.
• Treatment of referrals to the same objects and/or predicates: the same object in the atomic action can be referred to with more than one word, possibly synonymous; also, couples of predicates can have acceptable semantic similarities. This knowledge is provided by the LLM at hand by prompting it to use just one name for all words denoting the same object and/or action.</p>
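        <p>The incompatibility checks above can be emulated in a few lines of Python over (id, s, p, o) tuples. This is a hypothetical illustration, not our implementation: in the pipeline these checks are realized by ASP rules, and the two knowledge tables stand in for the incompatible_object and incompatible_predicates facts expected from the LLM.</p>

```python
# Sketch of the incompatibility checks over (id, subject, predicate, object)
# tuples; contradictions are reported as tuples mimicking the f(...) terms.

def find_contradictions(statements, incompatible_object, incompatible_predicates):
    found = set()
    # Single-claim check: the object cannot be used with the predicate.
    for (i, s, p, o) in statements:
        if (p, o) in incompatible_object:
            found.add(("f", i))
    # Pairwise checks over distinct claims about the same atomic action.
    for (i1, s1, p1, o1) in statements:
        for (i2, s2, p2, o2) in statements:
            if i1 >= i2:
                continue
            if s1 == s2 and p1 == p2 and o1 != o2:  # incompatible objects
                found.add(("f", i1, i2))
            if s1 != s2 and p1 == p2 and o1 == o2:  # incompatible subjects
                found.add(("f", i1, i2))
            if s1 == s2 and o1 == o2 and (
                (p1, p2) in incompatible_predicates
                or (p2, p1) in incompatible_predicates
            ):
                found.add(("f", i1, i2))  # incompatible predicates
    return found
```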
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experimental setting</title>
      <p>
        We performed some experiments over a curated version of the dataset proposed by [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. The dataset contains possibly conflicting claims of testimonies regarding simplified crime events. The training set of the original dataset contained 1019 instances, while the test set contained 813 tuples. The dataset was then refined to remove duplicates and to comply with our assumption of action atomicity. Specifically, sentences describing sequences of actions were modified to describe a single action, and some misclassified tuples were corrected. At the end of the cleaning phase, the dataset contains 31 instances, among which 16 are contradictory sentence pairs and the remaining ones are neutral. We tested our pipeline with a choice of two LLMs and an ASP solver. The LLMs of choice were GPT o1 and GPT o3-mini, which were queried via their API interface, while the ASP solver of choice was DLV [38]. The LLM was prompted with a text divided into two sections, as shown in Figures 2 and 3. The prompt consists of a fixed system prompt and a dynamic user prompt where the two input sentences C1 and C2 are fed as parameters.
      </p>
      <p>You are an expert in Answer Set Programming (ASP). Do not output comments or anything other than
facts and rules. In your response, you must use the following atoms:
"statement(ID, S, P, O)", where "ID" is the ID of the sentence, "S" is the main subject, "P" is the
main predicate, and "O" is the object of the sentence;
"weapon(ID, W)", where "ID" is the ID of the sentence, and "W" is the weapon involved in the
predicate P;
"incompatible_object(P, O)", where "O" is an object and "P" is a predicate of the same sentence that
cannot be done using the object O;
"incompatible_predicates(P1, P2)", where "P1" and "P2" are two different predicates which are not
synonyms of each other.</p>
      <p>Example of sentences:
Sentence1: "Yesterday, Will said that he saw Mary killing Peter."
Sentence2: "But in front of the judge, Will swore that he saw Mary beating Peter."
Example of output: statement(s1, mary, kill, peter). statement(s2, mary, beat, peter).
incompatible_predicates(kill, beat).</p>
      <p>Return ASP facts that model the event involved in the following sentences and their possible
incompatibilities. If necessary, when you find two words that are similar or synonyms, keep only one
of the two words.
"Sentence1: {C1}"
"Sentence2: {C2}"</p>
      <p>The prompts were designed by taking into account the following criteria:
• Specifying the format of the predicates is necessary to keep the knowledge generated by the model consistent; otherwise, the LLM could provide different predicates for each answer, and it would not be possible to reuse the knowledge obtained.
• Specifying that we do not want comments or anything else in the response is necessary to immediately obtain an executable result that can be parsed by an ASP solver. Otherwise, we would first have to clean the model output and then feed it as input to the solver.
• Including an example is of great importance given the complexity of the request. It gives the model a clearer view of the request and of the output we expect.</p>
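      <p>As an implementation detail, the dynamic user prompt can be assembled by plain string substitution. The sketch below is hypothetical (the template abbreviates the user prompt reported above; the helper name is ours):</p>

```python
# Minimal sketch of building the dynamic user prompt; C1 and C2 are the
# placeholders for the two input sentences, as in the prompt shown above.

USER_TEMPLATE = (
    "Return ASP facts that model the event involved in the following sentences "
    "and their possible incompatibilities. If necessary, when you find two words "
    "that are similar or synonyms, keep only one of the two words.\n"
    '"Sentence1: {C1}"\n'
    '"Sentence2: {C2}"'
)

def build_user_prompt(c1, c2):
    return USER_TEMPLATE.format(C1=c1, C2=c2)
```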
      <p>The results obtained were then given as input to the ASP solver to compute the answer sets, together
with the fixed ASP rules described next.</p>
      <sec id="sec-4-1">
        <title>4.1. Answer Set Programming rules</title>
        <p>Symbolic reasoning and contradiction entailment are delegated to the Answer Set Programming module. As we delve deeper into the context of our example (i.e., the forensic one), this module acquires great importance: it becomes the explainable, deterministic, and reproducible part of the pipeline. The output obtained from the LLM becomes the input for the ASP solver, along with some human-written rules that model the simplest cases of contradiction. Since we are considering the base cases, the rules need no auxiliary predicates beyond those already discussed in the previous sections. Below we report the logic program we adopted for our experiments. The rules aim to entail atoms of the form contradiction(T) as new knowledge. The term T is built with a function symbol f that allows us to keep track of all the statements considered contradictory. In fact, incompatibilities can be found between pairs of sentences but also within single sentences, which is why we need to cover both cases while representing them with a single atom. If an incompatibility is identified in a single sentence, T takes the value f(id); if the incompatibility occurs between two statements, then T takes the form f(id1, id2). The last four rules model our atomicity assumption, i.e., they deal with the possibility that the two input testimonies differ from each other although referring to the same event. This is a reasonable assumption in the forensic context, where one should take into account that a first testimony could differ greatly from a second, e.g., when the same person is accused of having performed two different, mutually exclusive actions.
% Rules about incompatible objects
contradiction(f(ID1)) :- statement(ID1, S, P, O), additional_info(ID1, weapon, W),
                         incompatible_objects(P, O).
contradiction(f(ID1)) :- statement(ID1, S, P, O), additional_info(ID1, weapon, W),
                         incompatible_objects(P, W).
contradiction(f(ID1, ID2)) :- statement(ID1, S, P, O1), statement(ID2, S, P, O2),
                              ID1 != ID2, O1 != O2.
contradiction(f(ID1, ID2)) :- statement(ID1, S, P, O), statement(ID2, S, P, O),
                              additional_info(ID1, weapon, W1), ID1 != ID2,
                              additional_info(ID2, weapon, W2), W1 != W2.
% Rules about incompatible predicates
contradiction(f(ID1, ID2)) :- statement(ID1, S, P1, O), statement(ID2, S, P2, O),
                              incompatible_predicates(P1, P2), P1 != P2, ID1 != ID2.
% Rules about incompatible subjects
contradiction(f(ID1, ID2)) :- statement(ID1, S1, P, O), statement(ID2, S2, P, O),
                              ID1 != ID2, S1 != S2.
% Rules about changes from the first deposition
contradiction(f(ID1, ID2)) :- statement(ID1, S1, P1, O), statement(ID2, S2, P2, O),
                              ID1 != ID2, S1 != S2,
                              incompatible_predicates(P1, P2), P1 != P2.
contradiction(f(ID1, ID2)) :- statement(ID1, S, P1, O1), statement(ID2, S, P2, O2),
                              ID1 != ID2, O1 != O2,
                              incompatible_predicates(P1, P2), P1 != P2.
contradiction(f(ID1, ID2)) :- statement(ID1, S, P1, O), statement(ID2, S, P2, O),
                              additional_info(ID1, weapon, W1), W1 != W2,
                              additional_info(ID2, weapon, W2), ID1 != ID2,
                              incompatible_predicates(P1, P2), P1 != P2.
contradiction(f(ID1, ID2)) :- statement(ID1, S1, P, O1), statement(ID2, S2, P, O2),
                              ID1 != ID2, O1 != O2, S1 != S2.</p>
        <p>Table 1 (fragment), Syntactic Error (SE) rates: GPT o3-mini 0.03; GPT o1 0.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Results</title>
      <p>Pairs of sentences, either contradictory or neutral (i.e., deemed compatible), were given as input to the two LLMs of choice. The results obtained were compared with the known ground truth, while we manually monitored the output of the LLMs and of the ASP solver in order to assess the quality of the pipeline. In general, we noticed few differences between the output of GPT o1 and GPT o3-mini.</p>
      <p>In particular, we noticed that:
• Both models struggle to represent negative expressions as triples (s, p, o). For example, the sentence “he omitted to have killed” is translated as “he killed”, so the final predicate is "kill" in both cases, thus failing to capture the contradiction between the two sentences.
• GPT o3-mini tries to use all predicates present in the prompt, even when they are not needed. For example, if in two depositions the subjects are incompatible (in the first the subject is Alice, while in the second the subject is Bob), the model still tries to represent this incompatibility using the incompatibility predicates of the prompt, even though none of them models this case.
• Both models attempt to model background information that does not need to be modeled. For example, if a deposition contains the sentence "Mike confesses to killing Jessica but forensic scientists say it was Ross," the models will also attempt to model the information that Ross was the culprit.
• Both models struggle to represent the information of some sentences as triples (s, p, o). For example, if one statement states that the subject ran west and the next statement states that he ran east, the models will use the geographic information as an attribute of the predicate, modeling it as "run_west" and "run_east" instead of attributing this information to the object.
</p>
      <p>Quantitative results are shown in Table 1. The table is structured as follows:
• The first column identifies the model used. The second column reports the percentage of instances returned by the LLMs that present Syntactic Errors (SE).
• The third, fourth, fifth and sixth columns report, respectively, the rate of True Positives (instances classified as contradictory that actually are), False Positives (instances classified as contradictory that are not), True Negatives (instances classified as neutral that actually are), and False Negatives (instances classified as neutral that are not). Note that by “classified as contradictory” we mean that, given the output of the LLM as input to the ASP solver, the solver detects the presence of a contradiction (i.e., the atom contradiction(t) for some t is entailed in the answer set).
• The remaining columns report standard evaluation metrics used to assess the models’ performance. ACC (Accuracy) indicates the rate of correct predictions over the total number of instances. PRE (Precision) measures the rate of instances correctly identified as positive out of all instances predicted as positive. REC (Recall) represents the rate of positive instances correctly identified by the model. F1 (F1-Score) is the harmonic mean of precision and recall.
From the obtained results, we can see that GPT o1 was the better performer of the two models. It did not return any instances containing syntactic errors and also achieved higher evaluation metrics. This suggests that GPT o1 is more reliable and effective in terms of output format and commonsense knowledge extraction. The results are not entirely surprising: GPT o1 is a significantly larger and more powerful model than GPT o3-mini.</p>
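      <p>For reference, the metric columns can be recomputed from the four outcome counts. The snippet below is the standard computation, not code from our repository:</p>

```python
# Standard evaluation metrics from the four outcome counts:
# accuracy, precision, recall, and F1 (harmonic mean of precision and recall).

def metrics(tp, fp, tn, fn):
    total = tp + fp + tn + fn
    acc = (tp + tn) / total if total else 0.0
    pre = tp / (tp + fp) if (tp + fp) else 0.0
    rec = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * pre * rec / (pre + rec) if (pre + rec) else 0.0
    return acc, pre, rec, f1
```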
      <p>The code and the obtained results are available at the following repository: https://github.com/DeMaCS-UNICAL/CILC-contrdetect.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Possible extensions and alternative modelings</title>
      <p>Our contradiction model leaves room for refinement by lifting some of our simplifying assumptions.
For instance, looking more closely at the modeling of incompatible predicates, one can see that they
involve two main settings: couples of predicates can either have completely different and unrelated
meanings, or be deemed to have opposite meanings. The latter case models typical contradictions based
on antonymies, as mentioned in Sec. 2. If we decide to explicitly model antonym-based contradictions
in ASP, it is necessary to introduce some terminological meta-predicate representing couples of
antonyms: e.g., we could use the atom antonyms(X, Y), where X and Y are antonyms. Since this is
a-priori information that should be available even before running the ASP program, instances of such
an atom could be taken from some knowledge base. Such knowledge could be extracted from LLMs, or from
fixed linguistic ontologies, like WordNet [39] or ConceptNet [40], a semantic network which includes
common-sense knowledge. Leveraging such knowledge bases could further refine the neurosymbolic
approach, thus reducing the impact that the LLM has on the entire pipeline and shifting it towards the
reasoning phase.
</p>
      <p>ab_contradiction(X, Y) :- sentence(X, S, P, O1), sentence(Y, S, P, O2),
                          O1 != O2, antonyms(O1, O2).
ab_contradiction(X, Y) :- sentence(X, S, P1, O), sentence(Y, S, P2, O),
                          P1 != P2, antonyms(P1, P2).
ab_contradiction(X, Y) :- sentence(X, S1, P, O), sentence(Y, S2, P, O),
                          S1 != S2, antonyms(S1, S2).</p>
      <p>Contradictions based on antonyms can be considered a sub-case of a wider class of lexical
contradictions, which occur when two words have incompatible meanings. One might consider modeling such
contradictions as well, so as to cover a larger number of situations. Consider the following example:
• Sentence 1: "The cat is a kitten."
• Sentence 2: "The cat is an adult."
In this case, kitten and adult are not direct antonyms, but they refer to different, mutually exclusive
stages of life. To express when two concepts have incompatible lexical meanings, we can model an ASP
predicate opposite_lexical_meaning which stores this information. Like the antonyms predicate,
opposite_lexical_meaning also makes the presence of a knowledge base necessary.
</p>
      <p>lc_contradiction(X, Y) :- sentence(X, S, P1, O), sentence(Y, S, P2, O),
                          P1 != P2, opposite_lexical_meaning(P1, P2).
lc_contradiction(X, Y) :- sentence(X, S, P, O1), sentence(Y, S, P, O2),
                          O1 != O2, opposite_lexical_meaning(O1, O2).
lc_contradiction(X, Y) :- sentence(X, S1, P, O), sentence(Y, S2, P, O),
                          S1 != S2, opposite_lexical_meaning(S1, S2).</p>
      <p>Nonetheless, a lexical contradiction does not occur only when two subjects, two objects, or
two relations have opposite meanings while the rest of the sentence is unchanged between
the two. Consider the following example:
• Sentence 1: "The government banned this product."
• Sentence 2: "The product is available for purchase in stores."
In this case, the object of the first sentence is the subject of the second and it is the relations of both
sentences that have opposite meanings. So, we can consider cases like the following:
</p>
      <p>lc_contradiction(X, Y) :- sentence(X, S1, P1, O), sentence(Y, O, P2, O2),
                          P1 != P2, opposite_lexical_meaning(P1, P2).</p>
      <p>However, these cases are more complex and therefore fall outside the scope of this work.</p>
    </sec>
    <sec id="sec-7">
      <title>7. Conclusions</title>
      <p>In this paper we provided a study on how to model basic cases of contradiction by exploiting Large
Language Models as knowledge extractors/translators while delegating symbolic reasoning aspects
to an ASP solver. The obtained results can be considered promising enough to continue studying and
deepening this approach, especially for more complex cases involving sequences of events or a greater
number of subjects.</p>
      <p>Our pipeline could be modified so as to build the knowledge base without relying on the
available instance data. In fact, the LLM could be prompted to provide its commonsense knowledge of
specific subdomains directly in the form of predicates such as incompatible_predicates.
For example, one could prompt the model with the following request: “Return all pairs of actions whose
meaning is opposite using the predicate incompatible_predicates”.</p>
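      <p>Facts returned by such a prompt would likely need syntactic vetting before being handed to the solver. A minimal sketch (the validator and its regular expression are ours; only well-formed binary facts over lowercase constants pass):
```python
import re

# Sketch: keep only well-formed binary ASP facts from an LLM reply before
# passing them to the solver; any other line is discarded. The accepted
# predicate names follow the examples in the text.
FACT = re.compile(
    r"^(incompatible_predicates|antonyms)"
    r"\(\s*[a-z][a-z0-9_]*\s*,\s*[a-z][a-z0-9_]*\s*\)\.$"
)

def vet_facts(llm_reply: str) -> list[str]:
    lines = (ln.strip() for ln in llm_reply.splitlines())
    return [ln for ln in lines if FACT.match(ln)]

reply = """incompatible_predicates(open, close).
Sure! Here are the pairs you asked for:
incompatible_predicates(Buy, sell).
incompatible_predicates(raise, lower)."""
# chatty text and the uppercase (variable-like) term are filtered out
print(vet_facts(reply))
```</p>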
      <p>Furthermore, it could be interesting to try other translation methods, e.g. producing terminological
assertions expressed in OWL or RDFS, or using entity and relationship extraction techniques, thus
further reducing the role of LLMs. Moreover, in the future, it could be useful to explore the
formalization of the different types of contradictions present in the wild, in order to use
such modelings for new ASP programs.</p>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgments</title>
      <p>This work was partially supported by project SERICS (PE00000014) and FAIR (PE0000013) under the
MUR National Recovery and Resilience Plan funded by the European Union - NextGenerationEU.</p>
    </sec>
    <sec id="sec-9">
      <title>Declaration on Generative AI</title>
      <p>Besides the usage of generative AI as described in the scientific contribution of the paper, that is, for the
generation of logical formalisms to be used as input for the symbolic module, the authors used GPT-4
for grammar and spelling checking and for rewriting specific sentences. After using these tools, the
authors reviewed and edited the content as needed and take full responsibility for the publication’s content.</p>
      <p>[27] S. A. Gómez, P. R. Fillottrani, Leveraging Large Language Models for Ontology-Based Data Access: A Preliminary Analysis, 2024.
[28] Z. Syed, A. Padia, T. Finin, M. L. Mathews, A. Joshi, UCO: A Unified Cybersecurity Ontology, in: AAAI Workshop: Artificial Intelligence for Cyber Security, volume WS-16-03 of AAAI Technical Report, AAAI Press, 2016.
[29] E. Casey, S. Barnum, R. Griffith, J. Snyder, H. M. A. van Beek, A. Nelson, Advancing coordinated cyber-investigations and tool interoperability using a community developed specification language, Digit. Investig. 22 (2017) 14–45.
[30] H. Park, S. Cho, H. Kwon, Cyber Forensics Ontology for Cyber Criminal Investigation, in: eForensics, volume 8 of Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, Springer, 2009, pp. 160–165.
[31] T. Prudhomme, G. De Colle, A. Liebers, A. Sculley, P. Xie, S. Cohen, J. Beverley, A semantic approach to mapping the Provenance Ontology to Basic Formal Ontology, Scientific Data 12 (2025).
[32] R. Hoekstra, J. Breuker, M. D. Bello, A. Boer, The LKIF Core Ontology of Basic Legal Concepts, in: LOAIT, volume 321 of CEUR Workshop Proceedings, CEUR-WS.org, 2007, pp. 43–63.
[33] V. R. Benjamins, J. Contreras, P. Casanovas, M. Ayuso, M. Bécue, L. Lemus, C. Urios, Ontologies of Professional Legal Knowledge as the Basis for Intelligent IT Support for Judges, Artif. Intell. Law 12 (2004) 359–378.
[34] S. Costantini, F. A. Lisi, R. Olivieri, DigForASP: A European Cooperation Network for Logic-based AI in Digital Forensics, in: CILC, volume 2396 of CEUR Workshop Proceedings, CEUR-WS.org, 2019, pp. 138–146.
[35] E. Erdem, M. Gelfond, N. Leone, Applications of Answer Set Programming, AI Mag. 37 (2016) 53–68.
[36] G. Ianni, F. Pacenza, J. Zangari, Incremental maintenance of overgrounded logic programs with tailored simplifications, Theory Pract. Log. Program. 20 (2020) 719–734.
[37] V. Lifschitz, Answer Set Programming, Springer, 2019.
[38] M. Alviano, F. Calimeri, C. Dodaro, D. Fuscà, N. Leone, S. Perri, F. Ricca, P. Veltri, J. Zangari, The ASP system DLV2, in: LPNMR, volume 10377 of Lecture Notes in Computer Science, Springer, 2017, pp. 215–221.
[39] G. A. Miller, WordNet: A Lexical Database for English, in: HLT, Morgan Kaufmann, 1992.
[40] R. Speer, C. Havasi, ConceptNet 5, Adv. Math. Commun. 1 (2012).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>C.</given-names>
            <surname>Condoravdi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Crouch</surname>
          </string-name>
          , V. de Paiva,
          <string-name>
            <given-names>R.</given-names>
            <surname>Stolle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. G.</given-names>
            <surname>Bobrow</surname>
          </string-name>
          ,
          <article-title>Entailment, intensionality and text understanding</article-title>
          ,
          <source>in: Proceedings of the HLT-NAACL 2003 Workshop on Text Meaning</source>
          ,
          <year>2003</year>
          , pp.
          <fpage>38</fpage>
          -
          <lpage>45</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Harabagiu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hickl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. F.</given-names>
            <surname>Lacatusu</surname>
          </string-name>
          , Negation, Contrast and Contradiction in Text Processing, in: AAAI, AAAI Press,
          <year>2006</year>
          , pp.
          <fpage>755</fpage>
          -
          <lpage>762</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>I.</given-names>
            <surname>Dagan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Roth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sammons</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Zanzotto</surname>
          </string-name>
          ,
          <source>Recognizing Textual Entailment: Models and Applications</source>
          ,
          <source>Synthesis Lectures on Human Language Technologies</source>
          , Morgan &amp; Claypool Publishers,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S. R.</given-names>
            <surname>Bowman</surname>
          </string-name>
          , G. Angeli,
          <string-name>
            <given-names>C.</given-names>
            <surname>Potts</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Manning</surname>
          </string-name>
          ,
          <article-title>A large annotated corpus for learning natural language inference</article-title>
          ,
          <source>in: EMNLP, The Association for Computational Linguistics</source>
          ,
          <year>2015</year>
          , pp.
          <fpage>632</fpage>
          -
          <lpage>642</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S.</given-names>
            <surname>Surana</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Dembla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bihani</surname>
          </string-name>
          ,
          <source>Identifying Contradictions in the Legal Proceedings Using Natural Language Models, SN Comput. Sci. 3</source>
          (
          <year>2022</year>
          )
          <fpage>187</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>P. M.</given-names>
            <surname>Dung</surname>
          </string-name>
          ,
          <article-title>On the Acceptability of Arguments and its Fundamental Role in Nonmonotonic Reasoning, Logic Programming and n-Person Games</article-title>
          , Artif. Intell.
          <volume>77</volume>
          (
          <year>1995</year>
          )
          <fpage>321</fpage>
          -
          <lpage>358</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Schlobach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Cornet</surname>
          </string-name>
          ,
          <article-title>Non-Standard Reasoning Services for the Debugging of Description Logic Terminologies</article-title>
          , in: IJCAI, Morgan Kaufmann,
          <year>2003</year>
          , pp.
          <fpage>355</fpage>
          -
          <lpage>362</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>G.</given-names>
            <surname>Qi</surname>
          </string-name>
          , W. Liu,
          <string-name>
            <given-names>D. A.</given-names>
            <surname>Bell</surname>
          </string-name>
          ,
          <article-title>A revision-based approach to handling inconsistency in description logics</article-title>
          ,
          <source>Artif. Intell. Rev</source>
          .
          <volume>26</volume>
          (
          <year>2006</year>
          )
          <fpage>115</fpage>
          -
          <lpage>128</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>C.</given-names>
            <surname>Hewitt</surname>
          </string-name>
          , Inconsistency Robustness in Logic Programs,
          <year>2015</year>
          . arXiv:0904.3036.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>H. A.</given-names>
            <surname>Blair</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. S.</given-names>
            <surname>Subrahmanian</surname>
          </string-name>
          , Paraconsistent Logic Programming,
          <source>Theor. Comput. Sci</source>
          .
          <volume>68</volume>
          (
          <year>1989</year>
          )
          <fpage>135</fpage>
          -
          <lpage>154</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M.</given-names>
            <surname>Asif</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. A.</given-names>
            <surname>Khan</surname>
          </string-name>
          , W. Song,
          <article-title>Evaluating large language models for optimized intent translation and contradiction detection using KNN in IBN</article-title>
          ,
          <source>IEEE Access 13</source>
          (
          <year>2025</year>
          )
          <fpage>20316</fpage>
          -
          <lpage>20327</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>V.</given-names>
            <surname>Dragos</surname>
          </string-name>
          ,
          <article-title>Detection of contradictions by relation matching and uncertainty assessment</article-title>
          ,
          <source>in: KES</source>
          , volume
          <volume>112</volume>
          of Procedia Computer Science, Elsevier,
          <year>2017</year>
          , pp.
          <fpage>71</fpage>
          -
          <lpage>80</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>M.</given-names>
            <surname>Pielka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Rode</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Pucknat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Deußer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Sifa</surname>
          </string-name>
          ,
          <article-title>A linguistic investigation of machine learning based contradiction detection models: An empirical analysis and future perspectives</article-title>
          , in: ICMLA, IEEE,
          <year>2022</year>
          , pp.
          <fpage>1649</fpage>
          -
          <lpage>1653</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>T.</given-names>
            <surname>Eiter</surname>
          </string-name>
          , G. Ianni, T. Krennwallner,
          <article-title>Answer Set Programming: A Primer, in: Reasoning Web</article-title>
          , volume
          <volume>5689</volume>
          of Lecture Notes in Computer Science, Springer,
          <year>2009</year>
          , pp.
          <fpage>40</fpage>
          -
          <lpage>110</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>W.</given-names>
            <surname>Faber</surname>
          </string-name>
          ,
          <article-title>An Introduction to Answer Set Programming and Some of Its Extensions</article-title>
          , in: RW, volume
          <volume>12258</volume>
          of Lecture Notes in Computer Science, Springer,
          <year>2020</year>
          , pp.
          <fpage>149</fpage>
          -
          <lpage>185</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>M.-C.</given-names>
            <surname>de Marneffe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. N.</given-names>
            <surname>Rafferty</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Manning</surname>
          </string-name>
          , Finding Contradictions in text, in: ACL, The Association for Computer Linguistics,
          <year>2008</year>
          , pp.
          <fpage>1039</fpage>
          -
          <lpage>1047</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>X.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Niu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Rahman</surname>
          </string-name>
          ,
          <article-title>Topological Analysis of Contradictions in text</article-title>
          , in: SIGIR, ACM,
          <year>2022</year>
          , pp.
          <fpage>2478</fpage>
          -
          <lpage>2483</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>A. E.</given-names>
            <surname>Gärtner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Göhlich</surname>
          </string-name>
          ,
          <article-title>Automated requirement contradiction detection through formal logic and llms</article-title>
          ,
          <source>Autom. Softw. Eng</source>
          .
          <volume>31</volume>
          (
          <year>2024</year>
          )
          <fpage>49</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>A.</given-names>
            <surname>Rajasekharan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zeng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Padalkar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <article-title>Reliable Natural Language Understanding with Large Language Models and Answer Set Programming</article-title>
          ,
          <source>in: ICLP</source>
          , volume
          <volume>385</volume>
          <source>of EPTCS</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>274</fpage>
          -
          <lpage>287</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>M. A. B.</given-names>
            <surname>Santana</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Kareem</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Ricca</surname>
          </string-name>
          ,
          <article-title>Towards Automatic Composition of ASP Programs from Natural Language Specifications</article-title>
          , in: IJCAI, ijcai.org,
          <year>2024</year>
          , pp.
          <fpage>6198</fpage>
          -
          <lpage>6206</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>S.</given-names>
            <surname>Caruso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Dodaro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Maratea</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mochi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Riccio</surname>
          </string-name>
          ,
          <article-title>CNL2ASP: Converting Controlled Natural Language Sentences into ASP, Theory Pract</article-title>
          . Log. Program.
          <volume>24</volume>
          (
          <year>2024</year>
          )
          <fpage>196</fpage>
          -
          <lpage>226</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>E.</given-names>
            <surname>Coppolillo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Calimeri</surname>
          </string-name>
          , G. Manco,
          <string-name>
            <given-names>S.</given-names>
            <surname>Perri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Ricca</surname>
          </string-name>
          , LLASP:
          <article-title>Fine-tuning Large Language Models for Answer Set Programming</article-title>
          ,
          <source>in: KR</source>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>J.</given-names>
            <surname>Suchan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bhatt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. A.</given-names>
            <surname>Walega</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Schultz</surname>
          </string-name>
          ,
          <article-title>Visual Explanation by High-Level Abduction: On Answer-Set Programming Driven Reasoning About Moving Objects</article-title>
          , in: AAAI, AAAI Press,
          <year>2018</year>
          , pp.
          <fpage>1965</fpage>
          -
          <lpage>1972</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>A.</given-names>
            <surname>Kalyanpur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Saravanakumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Barres</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chu-Carroll</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Melville</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. A.</given-names>
            <surname>Ferrucci</surname>
          </string-name>
          , LLM-ARC:
          <article-title>Enhancing LLMs with an Automated Reasoning Critic</article-title>
          ,
          <source>CoRR abs/2406.17663</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>G.</given-names>
            <surname>Ciatto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Agiollo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Magnini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Omicini</surname>
          </string-name>
          ,
          <article-title>Large language models as oracles for instantiating ontologies with domain-specific knowledge</article-title>
          ,
          <source>Knowl. Based Syst</source>
          .
          <volume>310</volume>
          (
          <year>2025</year>
          )
          <fpage>112940</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>A.</given-names>
            <surname>Oarga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Bran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lederbauer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Schwaller</surname>
          </string-name>
          ,
          <article-title>Scientific Knowledge Graph and Ontology Generation using Open Large Language Models</article-title>
          , in: NeurIPS 2024 Workshop Foundation Models for Science: Progress, Opportunities, and Challenges
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>