<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <article-id pub-id-type="doi">10.3233/FAIA240896</article-id>
      <title-group>
        <article-title>Argumentation-based Explainable Recommender System with ARES</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Riccardo Felici</string-name>
          <email>riccardofelici7@gmail.com</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Emanuele De Angelis</string-name>
          <email>emanuele.deangelis@iasi.cnr.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alessio Ferrato</string-name>
          <email>alessio.ferrato@uniroma3.it</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maurizio Proietti</string-name>
          <email>maurizio.proietti@iasi.cnr.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giuseppe Sansonetti</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Francesca Toni</string-name>
          <email>f.toni@imperial.ac.uk</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>CNR-IASI</institution>
          ,
          <addr-line>Rome</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Imperial</institution>
          ,
          <addr-line>London</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Roma Tre University</institution>
          ,
          <addr-line>Rome</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2009</year>
      </pub-date>
      <volume>392</volume>
      <fpage>199</fpage>
      <lpage>218</lpage>
      <abstract>
        <p>Traditional recommender systems lack transparency, limiting user trust. This paper presents ARES (ARgumentation-based Explainable recommender System), which offers traceable recommendations with explicit reasoning paths. For explainability, ARES relies upon ABALearn, a system that learns Assumption-Based Argumentation (ABA) frameworks from positive and negative examples, given background knowledge. Argumentative explanations are reformulated into natural language via a Large Language Model, grounded in ABA logic to prevent hallucinations. The system uses an iterative learning mechanism, guided by ABALearn and facilitated by an interactive chatbot, to dynamically adapt user profiles.</p>
      </abstract>
      <kwd-group>
        <kwd>Explainable Recommender Systems</kwd>
        <kwd>Assumption-based Argumentation</kwd>
        <kwd>Traceable and Iterative Learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Recommender Systems (RSs) are widely used and highly effective tools for guiding users through
vast amounts of information and products, offering personalized suggestions across various domains.
However, RSs are often black boxes: they provide suggestions without explaining why, which limits the
trust of users who do not understand the reasons behind a suggestion [1]. This work seeks to
address this critical transparency gap by developing an Explainable Recommender System (XRS) [2],
built upon Assumption-based Argumentation (ABA) frameworks [3, 4].</p>
      <p>We propose ARES (ARgumentation-based Explainable recommender System), whose core contribution
is using ABA frameworks to provide traceable recommendations, in which every step of the reasoning
process, including rules and assumptions leading to a recommendation, is fully explicit and verifiable.
This intrinsic traceability supports a complete reconstruction of the entire reasoning process, from the
initial background knowledge to the final recommendation, ensuring a very high degree of transparency.
The argumentative explanations are then reformulated in natural language using a Large Language
Model (LLM), making them more understandable to the user. Unlike many LLM-based XRSs that might generate
plausible but unfaithful explanations, ARES ensures that natural language generation is directly based
on the rigorous logic of ABA, significantly reducing the presence of hallucinations [5] and keeping
explanations aligned with the actual formal reasoning.</p>
      <p>ARES features an iterative learning mechanism, which allows the user profiles to be dynamically
updated based on preferences and feedback, making suggestions increasingly accurate. The learning
process is driven by ABALearn [6, 7], an automated logic-based learning system designed to infer
ABA frameworks from positive and negative examples, and background knowledge. The interactive
chatbot enables the learnt ABA frameworks to evolve continuously, integrating new user feedback and
preferences without requiring a full retraining from scratch. This represents a clear advantage over
traditional one-shot learning approaches, which lack such flexibility.</p>
      <p>We validate our method in the complex and subjective field of perfumery, showing that ARES can
deliver recommendations that are transparent, personalized, and capable of adapting over time, while
achieving performance comparable to or competitive with standard baselines.</p>
      <p>Related work In the context of RS, various approaches have been developed to enhance
explainability and user trust. Xian et al. [8] introduced CAFE, a CoArse-to-FinE neural symbolic reasoning
framework that generates user profiles as coarse sketches of behaviors, guiding a path-finding process
to derive reasoning paths for recommendations as fine-grained predictions. This method emphasizes
the importance of incorporating symbolic reasoning into RS to improve interpretability. Similarly,
Tan et al. [9] proposed CountER, a counterfactual explainable recommendation model that utilizes
counterfactual reasoning from causal inference to generate minimal changes on item aspects, creating a
counterfactual item where the recommendation decision is reversed. This approach aids in providing
clear explanations by highlighting what would need to change for a different recommendation outcome.</p>
      <p>Several argumentation-based RSs have been proposed in the literature to date. Among these, Rago et al.
[10] use quantitative tripolar argumentation frameworks, rather than ABA frameworks, generated
automatically from data without any need for knowledge to be manually incorporated, and Rago et al.
[11] draw recommendations for a variety of products from their textual reviews, but with quantitative
bipolar argumentation rather than ABA, and without integrating user profiles. Furthermore, Briguez
et al. [12] use a further form of argumentation (Defeasible Logic Programming) to formulate the
conditions under which a movie should be recommended to a given user, but without any learning
from examples/background knowledge. To the best of our knowledge, the proposed methodology is the
first to use learnt ABA frameworks for explainable recommendations.</p>
      <p>Paper Structure In Section 2 we present background on ABA frameworks and ABALearn. In Section 3
we present our argumentative approach used for the development and implementation of the ARES
recommender system. In Section 4 we describe the implementation of iterative learning with ABALearn
via a chatbot interface. In Section 5 we present the results of the experimental evaluation, before
concluding and discussing future work in Section 6.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Background</title>
      <sec id="sec-2-1">
        <title>2.1. Assumption-Based Argumentation Frameworks</title>
        <p>An ABA framework [3, 4] is a tuple ⟨ℒ, ℛ, 𝒜, ‾⟩, where:
∙ ℒ is a set of sentences;
∙ ℛ is a set of inference rules of the form s₀ ← s₁, …, sₙ, with sᵢ ∈ ℒ, for i = 0, …, n;
∙ 𝒜 ⊆ ℒ is a non-empty set of assumptions;
∙ ‾ : 𝒜 → ℒ is a contrary function, mapping each assumption to its contrary in ℒ.</p>
        <p>In a rule s₀ ← s₁, …, sₙ, the sentence s₀ is the head of the rule and s₁, …, sₙ is the body of the
rule. If the body of a rule is empty, we call the rule a fact. In this work, we focus on flat ABA frameworks,
where assumptions are not allowed in the heads of rules. In general, elements of ℒ can be any sentences, but
in this paper we restrict to ABA frameworks where ℒ is a finite set of ground atoms. However, in the
spirit of logic programming, we use schemata to write sentences, rules, assumptions and contraries,
using variables that range over a given universe of constants.</p>
        <p>Example 1 (Dream Fragrance). To illustrate the fundamental concepts of ABA, let us consider a
simple scenario, aimed at determining the liking of a perfume. We define an ABA framework  =
⟨ℒ, ℛ, ,− ⟩ as follows, where  and  range over a suitable universe to describe perfumes:
∙ ℒ = {(), _ (,  ),  ( ), ( ), _( )};
∙ ℛ = {() ← _ (,  ),  ( ), ( ), _() ←,
_ (_ , ) ←, _ (, ) ←,
 () ←,  () ←};
∙  = {( )};
∙ ( ) = _( ).</p>
        <p>Intuitively, a perfume  is liked if it contains a floral ingredient  , unless  is , and a
particular fragrance (_ ) contains  and  ingredients, both floral.</p>
        <p>We often write facts as rules with equalities in the body, e.g., in the earlier example, we may write
 () ← as  () ←  = .</p>
        <p>Given an ABA framework, an argument for a claim s ∈ ℒ is a deduction of s constructed from a finite
set of assumptions A ⊆ 𝒜 by applications of rules in ℛ [4]. The set A constitutes the support for the argument.</p>
        <p>The acceptability of arguments depends on their ability to defend themselves against possible “attacks”: an argument
a₁ attacks an argument a₂ if the claim of a₁ is the contrary of an assumption in the support of
a₂. In this paper, the notion of acceptability we focus on (for flat ABA frameworks) is given in terms
of stable extensions [3, 4], which determine accepted (and rejected) arguments and their associated
claims as follows. A set ∆ of arguments is a stable extension if (i) no argument in ∆ attacks any
argument in ∆ (i.e. ∆ is conflict-free) and (ii) every argument not in ∆ is attacked by an argument
in ∆ (i.e. ∆ “attacks” all arguments it does not contain, thus pre-emptively “defending” itself against
attacks). We say that an ABA framework is satisfiable if it admits at least one stable extension, and
unsatisfiable otherwise. We also say that a sentence is (credulously) accepted in a stable extension ∆ of
an ABA framework if it is the claim of an argument in ∆.</p>
        <p>Example 2 (Dream Fragrance, Cont.). Given  in Example 1, there is an argument 1 for claim
(_ ) with support {()} , and another argument 2 for the same claim with
support {()} . 1 is not attacked by any other argument of  . In contrast, 2 is attacked by
the argument 3 for _() with support the empty set of assumptions. 1 and 3 belong to
the unique stable extension of  (and thus (_ ) is accepted), and 2 does not.</p>
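        <p>As an informal illustration of the stable extension semantics, the following Python sketch enumerates the stable sets of assumptions of a small ground flat ABA framework by brute force. It works at the level of assumption sets rather than arguments, relying on the standard correspondence between the two views for flat frameworks. The atoms (like(p), ok(a), not_ok(b), and so on) and the helper names closure and stable_extensions are hypothetical: they are not taken from Example 1 or from any existing ABA tool, and the enumeration is only practical for toy inputs.</p>
        <preformat>
```python
from itertools import combinations


def closure(rules, assumptions):
    """Forward-chain: all sentences derivable from the given assumptions."""
    derived = set(assumptions)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if head not in derived and all(b in derived for b in body):
                derived.add(head)
                changed = True
    return derived


def stable_extensions(rules, assumptions, contrary):
    """Enumerate stable sets of assumptions by brute force (toy sizes only)."""
    result = []
    pool = sorted(assumptions)
    for k in range(len(pool) + 1):
        for subset in combinations(pool, k):
            chosen = set(subset)
            derived = closure(rules, chosen)
            # (i) conflict-free: the set derives no contrary of its own members
            conflict_free = all(contrary[a] not in derived for a in chosen)
            # (ii) every assumption left out has its contrary derived
            attacks_rest = all(contrary[a] in derived for a in assumptions - chosen)
            if conflict_free and attacks_rest:
                result.append((chosen, derived))
    return result


# Hypothetical ground framework (all names invented for illustration)
rules = [
    ("like(p)", ("ingredient_of(p,a)", "floral(a)", "ok(a)")),
    ("like(p)", ("ingredient_of(p,b)", "floral(b)", "ok(b)")),
    ("ingredient_of(p,a)", ()),
    ("ingredient_of(p,b)", ()),
    ("floral(a)", ()),
    ("floral(b)", ()),
    ("not_ok(b)", ()),  # fact for the contrary of ok(b)
]
assumptions = {"ok(a)", "ok(b)"}
contrary = {"ok(a)": "not_ok(a)", "ok(b)": "not_ok(b)"}

extensions = stable_extensions(rules, assumptions, contrary)
```
        </preformat>
        <p>In this toy framework the only stable set of assumptions is {ok(a)}: the assumption ok(b) is attacked by the fact not_ok(b), while like(p) remains accepted through the ingredient a.</p>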
        <p>An ABA argument can be represented and visualized as an argument tree in a “hierarchical” manner.
An argument tree is a finite tree whose root node corresponds to the claim, the internal nodes represent
the atoms derived by applying rules in the intermediate steps, and the leaves are the assumptions in the
support of the argument. The structure of these trees is particularly interesting because it allows us to
visualize and trace the reasoning paths that the inferential process has taken to deduce the claim. For
an argument tree to be considered well-formed, it must be finite and acyclic [13].</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Learning ABA Frameworks</title>
        <p>
          Learning ABA frameworks aims at generating understandable rules for decision-making, thus helping
to promote transparency and interpretability, which are crucial aspects for overcoming the black box
nature of many traditional machine learning models. In ABALearn [6, 14, 15] the learning process takes
as input: (1) background knowledge, in the form of a satisfiable ABA framework F = ⟨ℒ, ℛ, 𝒜, ‾⟩,
(2) positive examples ℰ⁺ ⊆ ℒ, and (3) negative examples ℰ⁻ ⊆ ℒ, and derives an ABA framework F′ =
⟨ℒ′, ℛ′, 𝒜′, ‾′⟩, with ℛ ⊆ ℛ′, 𝒜 ⊆ 𝒜′, and ‾ ⊆ ‾′, such that (i) F′ admits a stable extension ∆, (ii) all
positive examples are accepted in ∆, and (iii) no negative example is accepted in ∆.
        </p>
        <p>
          ABALearn learns ABA frameworks automatically by making use of transformation rules, including: (1)
rote learning, which, given a positive example p(t), introduces a new rule p(X) ← X = t; (2) folding,
which, given rules H ← B₁, B₂ and K ← B₁, derives the new rule H ← K, B₂; (3) assumption introduction,
which, given a rule H ← B, introduces an assumption α, with contrary c_α, and derives the new rule
H ← B, α; and (4) fact subsumption, which deletes any fact of the form p(t) ← (or p(X) ← X = t) if
there is an accepted argument with claim p(t) in the ABA framework ⟨ℒ, ℛ ∖ {p(t) ←}, 𝒜, ‾⟩.
        </p>
        <p>The ABALearn algorithm follows an iterative strategy based on four steps:
1. Generating initial rules. This step applies rote learning to learn facts from positive examples.
2. Generalising facts. This step selects a fact obtained by rote learning and applies fact subsumption.
If the fact is not subsumed, it applies folding with the goal of generating a new, more general
rule that makes no explicit reference to the constants occurring in the ABA framework.
3. Introducing new assumptions. This step applies assumption introduction to any rule obtained
by step 2 if it supports an argument for a negative example.
4. Learning facts for contraries. This step applies rote learning to derive facts for the contraries
of the new assumptions introduced by step 3.</p>
        <p>
          The ABALearn strategy consists in following the above steps according to the pattern: (1); (2; 3; 4)*.
Example 3 (Learning ABA frameworks). We now show how the first two rules of ℛ in Example 1
can be learnt from the facts in ℛ. We assume that the examples are: ℰ + = {(_ )}
and ℰ − = {()}. By rote learning, we get1
        </p>
        <p>1. () ←  = _ 
By repeatedly folding, we get:</p>
        <p>2. () ← _ (,  ),  ( )
Now, we have derived an ABA framework with a single stable extension, where the positive example
(_ ) is accepted, but also the negative example {()} is accepted.
To avoid the acceptance of the negative example, by assumption introduction, we add the assumption
() to the body of rule  2 and, by rote learning, we add the fact _() ←  =  for the
contrary of (), thereby getting the set ℛ of rules shown in Example 1. 2</p>
        <p>For the stable extension semantics, ABALearn is implemented as ASP-ABAlearn
(available at https://github.com/ABALearn/aba_asp) using the SWI-Prolog [16] system and the Clingo [17]
Answer Set Programming (ASP) solver. The central idea of ASP is to solve a given computational problem by specifying it as a set
of rules, called ASP program, whose models, called answer sets, represent solutions to the computational
problem. In particular, by translating an ABA framework into an ASP program, ABALearn can take
advantage of the mapping between stable extensions and answer sets, thereby reducing some reasoning
tasks required by rote learning and fact subsumption to computing answer sets of the ASP encoding.
Indeed, by inspecting the answer sets of the ASP encoding of an ABA framework, we can learn facts to
(i) accept positive examples (at step 1), and (ii) attack assumptions to reject negative examples (at step
4), as well as remove facts (at step 2) whose corresponding claims already belong to the stable extension.</p>
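        <p>The mapping between flat ABA frameworks and ASP programs can be sketched as follows. The Python function below, aba_to_asp (a hypothetical helper, not part of ASP-ABAlearn), prints each ABA rule as an ASP rule and encodes each assumption as holding unless its contrary is derived, using negation as failure. This is a common encoding for flat ABA under stable semantics and is given here under that assumption; the encoding actually used by ASP-ABAlearn may differ in its details. The ground atoms are taken from the dummy item of Example 4.</p>
        <preformat>
```python
def aba_to_asp(rules, contrary):
    """Render a ground flat ABA framework as ASP program text.

    rules    -- list of (head, body) pairs, body being a tuple of atoms
    contrary -- dict mapping each assumption to its contrary atom
    """
    lines = []
    for head, body in rules:
        if body:
            lines.append(f"{head} :- {', '.join(body)}.")
        else:
            lines.append(f"{head}.")
    # An assumption holds unless its contrary can be derived.
    for assumption, opposite in sorted(contrary.items()):
        lines.append(f"{assumption} :- not {opposite}.")
    return "\n".join(lines)


# Ground instance of the first rule of Example 4 for the dummy item p1
rules = [
    ("like(p1)", ("ingredient_of(p1,t1)", "sweet(t1)", "alpha_1(p1,t1)")),
    ("ingredient_of(p1,t1)", ()),
    ("sweet(t1)", ()),
]
contrary = {"alpha_1(p1,t1)": "c_alpha_1(p1,t1)"}

program = aba_to_asp(rules, contrary)
print(program)
```
        </preformat>
        <p>The resulting text can be passed to an ASP solver such as Clingo; under this encoding, the answer sets of the program play the role of the stable extensions of the framework.</p>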
        <p>Although the above-mentioned mapping would allow us to recast our method for learning (flat) ABA
frameworks as a method for learning ASP programs, we believe that working with the ABA
representation gives us several advantages. First of all, it reflects more naturally the argumentative approach to
the learning process, by which the learnt rules can be considered to be defeasible, and hence they can
be modified when conflicting conclusions or exceptions arise. This aspect is especially significant when
ABA frameworks are learnt incrementally [18]. Furthermore, the adoption of this formalism allows the
direct use of tools for learning ABA frameworks that have been recently developed [6].</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. ARES</title>
      <p>The ARES architecture in Figure 1a is designed to support an iterative, learning-based process. The
system has been developed to provide recommendations in the perfume domain, but its structure is
general and could be adapted to a wide range of different domains. It consists
of several interconnected modules that manage the different stages of the recommendation process:
• User Profiling Module: Manages the collection of user information and its preparation. It
captures explicit and implicit preferences, converting them into positive and negative examples
(ℰ⁺, ℰ⁻), and background knowledge (BK) rules in the ABALearn syntax.
• ABALearn Module: Represents the core of ARES. It receives as input the examples (ℰ⁺, ℰ⁻) and
the background knowledge BK. Its task is to learn and continuously update the ABA framework
that constitutes the user preference profile.
• Recommender Module: Uses the learnt ABA framework to generate personalized
recommendations. It also receives specific queries describing the desired features in the perfume sought.
• Explanation Module: Generates natural language explanations, using a Large Language Model,
and argument trees for the recommendations produced. Input includes recommended item details
and logic evidence derived from the answer set via the inferential process.</p>
      <p>Notes to Example 3: 1. We label rules with identifiers for ease of reference.
2. No applications of fact subsumption have been necessary in this example.</p>
      <p>User Profiling The initial phase of ARES is devoted to acquiring information from the user and
constructing her/his personalized profile, represented as an ABA framework. Explicit preferences, such
as liked or disliked perfumes indicated directly by the user, are translated into positive and negative
examples that will be used to train the ABA framework. Implicit preferences result from the selection
of evocative images related to olfactory scenarios or ingredient groups. These selections are used to
generate rules that reflect assumptions about the user’s preferences. These rules, which model the
user’s implicit preferences, constitute the dynamic component of the ARES background knowledge,
which evolves with interaction. For example, if the user likes an image associated with a woody scenario,
the system generates a rule stating that the user likes scents containing woody ingredients. To ensure
the robustness and flexibility of the learnt rules, dummy items may be introduced among the positive
examples: these items prevent a rule from being invalidated in later stages of learning
when it is attacked and covers no real examples. In parallel, fragrance domain
information, such as ingredients, olfactory scenarios, and designers, is extracted from a dataset and
converted to ground facts in the ABA framework. These facts form the static component of the
background knowledge, that is, the set of facts describing the domain.</p>
      <p>[Figure 1: (a) Recommender System; (b) Chatbot]</p>
      <p>Example 4 (ABA representation of the user profile). We show below an ABA framework
representing the profile of a user who likes sweet ingredients. This preference is supported by the first rule
and a positive dummy example like(p1). The rule is defeasible, as it contains an assumption alpha_1(A,B)
for whose contrary c_alpha_1(A,B) ABALearn can learn rules, if needed to capture exceptions. The
user’s preferences are completed by a positive example like(meltine) and a negative example like(velvette).
We use the Prolog-like syntax accepted by the ABALearn system, with ‘:-’ to indicate ‘←’. The ABA
framework representation also includes declarations of assumptions and their contraries.
<preformat>% Rules from the user profile for dummy example
like(A) :- ingredient_of(A,B), sweet(B), alpha_1(A,B).
sweet(A) :- A=t1.
ingredient_of(A, B) :- A=p1, B=t1.
% Rules from the perfume dataset for non-dummy examples
oriental(A) :- A=vanilla.
ingredient_of(A, B) :- A=meltine, B=vanilla.
sweet(A) :- A=sugar.
ingredient_of(A, B) :- A=velvette, B=sugar.
% Assumptions and contraries
assumption(alpha_1(A,B)).
contrary(alpha_1(A,B),c_alpha_1(A,B)) :- assumption(alpha_1(A,B)).
% Positive examples: [like(p1),like(meltine)]
% Negative examples: [like(velvette)]</preformat></p>
      <p>Examples and background knowledge are given as input to ABALearn, which produces an ABA
framework that represents the user profile, and is capable of supporting arguments for or against liking
certain perfumes. In particular, the learnt framework accepts all positive examples, does not accept any
negative example, and is also able to decide the acceptance of unseen items.</p>
      <p>Example 5 (Learnt ABA framework). From the ABA framework and the positive and negative examples
in Example 4, ABALearn generates a rule for the contrary of the assumption alpha_1(A,B). This rule
captures an exception to the rule representing a preference for sweet scents, and excludes the ingredient
sugar to avoid accepting the negative example like(velvette). A rule for like is also generated to
cover the positive example like(meltine), stating the preference for oriental ingredients. The output is:
<preformat>c_alpha_1(A,B) :- ingredient_of(A,B), B=sugar.
like(A) :- ingredient_of(A,B), oriental(B).</preformat></p>
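      <p>To check by hand that the learnt rules behave as described, the following Python sketch evaluates them over the ground facts of Example 4. It is only an illustration and not how ABALearn or Clingo perform the check: it exploits the fact that, in this particular framework, the rule for the contrary does not depend on any assumption, so each assumption can simply be taken to hold unless its contrary is derivable. The Python function names mirror the predicates of the example.</p>
      <preformat>
```python
# Ground facts from Example 4 (dummy item p1 plus two real perfumes)
ingredient_of = {("p1", "t1"), ("meltine", "vanilla"), ("velvette", "sugar")}
sweet = {"t1", "sugar"}
oriental = {"vanilla"}


def c_alpha_1(item, ingredient):
    # Learnt exception: c_alpha_1(A,B) :- ingredient_of(A,B), B=sugar.
    return (item, ingredient) in ingredient_of and ingredient == "sugar"


def alpha_1(item, ingredient):
    # The assumption holds unless its contrary is derivable.
    return not c_alpha_1(item, ingredient)


def like(item):
    for owner, ingredient in ingredient_of:
        if owner != item:
            continue
        # like(A) :- ingredient_of(A,B), sweet(B), alpha_1(A,B).
        if ingredient in sweet and alpha_1(item, ingredient):
            return True
        # like(A) :- ingredient_of(A,B), oriental(B).
        if ingredient in oriental:
            return True
    return False


accepted = {item: like(item) for item in ("p1", "meltine", "velvette")}
print(accepted)
```
      </preformat>
      <p>Both positive examples, like(p1) and like(meltine), are accepted, while the negative example like(velvette) is rejected because its only sweet ingredient is sugar.</p>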
      <p>The learning process is inherently dynamic: when new information or feedback is collected from the
user, it is used to update the examples or background knowledge, triggering a re-execution of ABALearn.
However, as we will see in detail in Section 4, by rendering learnt rules defeasible, ABALearn is able
to modify existing rules and add exceptions without having to rerun the training from scratch. This
mechanism allows refining the ABA framework and adapting the user profile dynamically over time.</p>
      <p>Recommender The recommendation generation phase uses the learnt ABA framework to identify
and suggest (unseen) perfumes that the user may like. The process begins by capturing the user-specified
recommendation goal, such as seasonality or desired scent intensity. A preliminary filtering step is
then applied to reduce the pool of candidate perfumes. This filter uses the characteristics desired by the
user and compares them with the attributes in the dataset to create an initial ranking. This ranking
is based on how consistent each perfume is with the user’s specified preferences and objective scent
attributes; it does not consider the ABA framework user profile. For each candidate perfume that has
passed preliminary filtering, its attributes are transformed into ground ABA facts. The user’s ABA
framework and the candidate perfume facts are translated into an ASP program, and Clingo determines
which perfumes specified by the user’s goal belong to the answer set of that program, and hence are
claims accepted in an ABA framework’s stable extension. An analysis of the answer set produced by
Clingo makes it possible to identify which specific rules of the ABA framework were used to derive
the recommended claims from the facts. The number of rules used contributes to a similarity score,
indicating the degree of compatibility between the perfume and the user profile. This score is combined
with the preliminary rank to obtain a final rank, computed for item i as:</p>
      <p>Ranking(i) = 0.5 · Arg(i) + 0.5 · Heur(i)    (1)</p>
      <p>where Arg(i) represents the argumentative contribution, quantified by the number of
supporting rules derived by Clingo, and Heur(i) combines heuristic parameters capturing the
relevance of item i to the request. These parameters include ratings, matching features, genre compatibility
and differences in intensity.</p>
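      <p>A minimal sketch of the equal-weight combination in Equation (1) is given below. The source does not specify how the count of supporting rules is scaled before it is combined with the heuristic score, so the normalisation by the maximum count is an assumption of this sketch, as are the function names, the heuristic values, and the candidate items other than amberlush.</p>
      <preformat>
```python
def normalise(counts):
    """Scale rule counts to [0, 1] (an assumption of this sketch)."""
    top = max(counts.values(), default=0)
    if top == 0:
        return {item: 0.0 for item in counts}
    return {item: value / top for item, value in counts.items()}


def final_rank(arg_score, heur_score, weight=0.5):
    """Equal-weight combination of argumentative and heuristic scores."""
    return weight * arg_score + (1.0 - weight) * heur_score


# Hypothetical inputs: supporting-rule counts and heuristic scores in [0, 1]
rules_used = {"amberlush": 2, "candidate_b": 1, "candidate_c": 0}
heuristic = {"amberlush": 0.6, "candidate_b": 0.9, "candidate_c": 0.8}

arg = normalise(rules_used)
scores = {item: final_rank(arg[item], heuristic[item]) for item in rules_used}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```
      </preformat>
      <p>With these illustrative inputs, an item supported by more rules can outrank an item with a higher heuristic score, which is the intended effect of giving the argumentative contribution half of the weight.</p>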
      <p>Although the similarity formula presented here has been calibrated specifically for perfumery, the
modularity of the ARES architecture allows alternative, more generic similarity metrics to be integrated.
This modular approach ensures that the system’s high-level structure remains unchanged, enabling
ARES to adapt effectively to different scenarios and domains. While the current choice aims at
balancing generality and accuracy for the specific use case, it also paves the way for
future extensions with broader applicability.</p>
      <p>Explanation The recommendations generated are not simply suggestions, but are supported by an
explicit formal reasoning process, where each step is verifiable. This is made possible by the system’s
ability to construct argument trees and reasoning paths, defined as the specific paths within the argument
tree, which link facts in the background knowledge (such as user preferences and item features) to the
final recommendation claim. This transparency makes it possible to determine which features or rules
influenced the decision. To translate these reasoning paths into user-understandable explanations, the
system makes use of the Gemini 2.0 Flash model. The LLM receives the extracted reasoning paths as
input, along with other formal argumentation evidence, and, guided by a carefully formulated prompt,
generates a natural language explanation. This prompt is designed to constrain the LLM to produce
consistent descriptions that are faithful to the argumentative process, ensuring that the explanation is
firmly linked to the actual data and inferences, thus preventing the generation of hallucinations.</p>
      <p>This approach differs from some recent trends in Explainable AI, such as Chain of Thought (CoT)
prompting [19]. Although CoT prompting aims to explicate the intermediate steps of a model’s reasoning,
recent studies [20] have shown that LLMs can generate CoTs that are plausible but not always sound, and
that do not always reflect the actual process of constructing an answer, thus creating an illusion of reasoning [21]. In
our system, however, the faithfulness of the reasoning chain is guaranteed: the reasoning paths are not
an arbitrary textual construction, but are derived directly from facts and rules of the ABA framework.
Thus, the generated explanation is verifiable and corresponds faithfully to the actual inferential process,
offering robust and reliable explainability.</p>
      <p>Example 6 (Recommendation, reasoning paths and explanation). Reasoning paths are extracted
from a single argument tree generated by the recommendation for the item amberlush. With reference
to Examples 4 and 5, Figure 2 shows a path that supports the like rule for sweet ingredients and a
path for oriental ingredients. The prompt uses the argument tree and the recommended item details to
generate a natural language explanation.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Iterative Learning</title>
      <p>The ARES chatbot architecture in Figure 1b serves as an advanced conversational interface for user
interaction. Its main purpose is to facilitate the collection of information and feedback from the user in
natural language, overcoming the limitations of traditional structured interfaces. It allows the user to
rate and ask questions about perfumes, and provide comments indicating liking or disliking.
Information gathered through the chatbot is used to dynamically update the user’s profile and refine future
recommendations. This architecture is built by emulating an agentic style [22], where the system adapts
its behavior based on the user’s request.</p>
      <p>• NLP Parsing: Receives user textual prompts as input and uses Natural Language Processing
(NLP) techniques to parse the text, identify expressed communicative intent, and extract relevant
entities (item name, class and sentiment). The implementation leverages the capabilities of an
LLM to classify user requests [23].
• Operational Routines: Based on the output of the NLP Parsing Module, the message is routed
to one of the specialized operational routines. Each routine is designed to perform specific
operations in response to particular categories of user requests.
• Routine Integration Module: This module represents the point of convergence of the various
operational routines. It is responsible for aggregating and standardizing the outputs generated by
the individual specialized routines, ensuring a consistent, processable format for integration at
later stages of the learning cycle. The new structured information is then passed on to the User
Profiling and ABALearn modules.</p>
      <p>The interaction through the chatbot enables the iterative learning of the ABA framework representing
the user’s profile. Information and feedback acquired through the conversational interface is formalized
in the format needed by the ABALearn engine, with feedback management routines playing a key role
in this formalization. When the user expresses a positive sentiment toward a specific feature, the system
generates a new like rule within the user’s ABA framework. Similarly to the case of the user profiling
module, dummy items are introduced to serve as explicit positive examples. The handling of negative
sentiment, on the other hand, is more complex and is addressed through an integrated approach with
ABALearn: information about disliked items is formalized through the creation of a dummy item with
the negative feature, which is added to the set of negative examples provided in input to ABALearn.
This mechanism allows the learning engine to identify existing rules that enforce liking of the dummy
item and automatically generate the necessary contrary to prevent such derivation. Finally, feedback
related to specific perfumes results in a direct update of the lists of positive and negative examples in
the user profile, allowing the user to change her/his mind about previously expressed preferences.</p>
      <p>Once all the collected and formalized information has updated the inputs, the learning process is
re-executed. This iterative cycle, unlike static models, allows the ABA framework to evolve dynamically,
incorporating new knowledge from more recent user interactions, with the goal of progressively
improving the accuracy and relevance of future recommendations. This feature of ARES can be seen as
a realisation of contestable learning [24, 18], which gives users the ability to interact with the system
and question its decisions or recommendations. In other words, users can question the rules that lead
to the acceptance of an undesired claim and, conversely, allow the system to learn new rules that lead to
the acceptance of a desired claim. The redress of the system after contestation is obtained by making
the previously learnt rules defeasible, and then learning new rules, without re-learning from scratch.
Example 7 (Contestation: Ingredient with negative sentiment). After the amberlush
recommendation received in Example 6, the user may contest the system by marking amber as an undesirable
feature. To redress the ABA framework after this contestation, a new dummy item p2 is created, together
with rules specifying that p2 has the amber ingredient and that amber is an oriental type of scent. Then,
ABALearn identifies any rule that can be used for entailing like(p2). This is the previously learnt rule
like(A) ← ingredient_of(A, B), oriental(B) (see Example 5). Now, by applying the assumption
introduction transformation, ABALearn adds a new assumption alpha_2(A, B), thus rendering the rule
defeasible and, by rote learning, also adds a rule for the contrary c_alpha_2(A, B) of the assumption,
thus introducing an exception to the general rule:
% Background knowledge for dummy item p2
ingredient_of(A, B) :- A=p2, B=amber.
oriental(B) :- B=amber.
% Learnt rules
like(A) :- ingredient_of(A,B), oriental(B), alpha_2(A,B).
c_alpha_2(A, B) :- ingredient_of(A, B), B=amber.</p>
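      <p>The transformation applied in Example 7 can be mimicked at the level of rule syntax with a short Python sketch. This is only an illustration under stated assumptions, not ABALearn's code: the Rule class and the introduce_assumption function are hypothetical names, atoms are kept as plain strings, and the body of the contrary rule is supplied by the caller instead of being learnt from the negative example.</p>
      <preformat>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule representation: a head and a tuple of body atoms."""

    head: str
    body: tuple

    def __str__(self) -> str:
        return f"{self.head} :- {', '.join(self.body)}."


def introduce_assumption(rule, index, variables, exception_body):
    """Make `rule` defeasible and build a rule for the assumption's contrary."""
    args = ",".join(variables)
    assumption = f"alpha_{index}({args})"
    # The original rule now fires only if the assumption is not defeated.
    defeasible = Rule(rule.head, rule.body + (assumption,))
    # The contrary holds in the exceptional case given by `exception_body`.
    contrary = Rule(f"c_alpha_{index}({args})", tuple(exception_body))
    return defeasible, contrary


learnt = Rule("like(A)", ("ingredient_of(A,B)", "oriental(B)"))
defeasible, contrary = introduce_assumption(
    learnt, 2, ("A", "B"), ("ingredient_of(A,B)", "B=amber")
)
print(defeasible)
print(contrary)
```
</preformat>
      <p>Up to whitespace, the two printed rules match the learnt rules listed in Example 7: the general rule is kept but guarded by alpha_2, and the exception for amber is carried by the rule for c_alpha_2.</p>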
    </sec>
    <sec id="sec-5">
      <title>5. Evaluation</title>
      <p>ARES was evaluated to quantify the effectiveness and performance of the
proposed model in comparison with established methods. The evaluation
employed standard quantitative metrics and a cross-validation protocol based on the Leave-One-Out
Cross-Validation technique [25]. This technique involves temporarily removing a single item from
each user’s profile for use as a test item, training the system on the reduced profile and evaluating its
ability to correctly predict the omitted item. For comparison, several Collaborative Filtering algorithms
were used, including KNN (User-based Nearest-Neighbor), SVD (Singular Value Decomposition), NMF
(Non-negative Matrix Factorization) and CoClustering.</p>
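      <p>The leave-one-out protocol can be sketched in Python as follows. The sketch makes several assumptions that are not taken from the evaluation itself: a user profile is a dictionary from item names to ratings, every rated item is held out in turn, and a trivial mean-rating predictor (train_mean, predict_mean) stands in for ARES or a collaborative filtering algorithm. All function and item names are hypothetical.</p>
      <preformat>
```python
from statistics import mean


def leave_one_out(profile, train, predict):
    """Hold out each rated item in turn; return (item, true, predicted) triples."""
    results = []
    for item, rating in profile.items():
        # Reduced profile: everything except the held-out test item.
        reduced = {i: r for i, r in profile.items() if i != item}
        model = train(reduced)
        results.append((item, rating, predict(model, item)))
    return results


# Toy stand-in for a recommender: always predict the mean training rating.
def train_mean(reduced_profile):
    return mean(reduced_profile.values())


def predict_mean(model, item):
    return model


profile = {"perfume_a": 5.0, "perfume_b": 3.0, "perfume_c": 1.0}
results = leave_one_out(profile, train_mean, predict_mean)
for item, true_rating, predicted in results:
    print(item, true_rating, predicted)
```
</preformat>
      <p>Passing the training and prediction steps as functions keeps the protocol independent of the model under test, so the same loop can be reused for each algorithm being compared.</p>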
      <p>The evaluation metrics adopted include Mean Absolute Error (MAE) and Root Mean Square Error
(RMSE), which measure the accuracy of numerical predictions, and Precision@n, Recall@n, and
FMeasure@n, which assess the quality of recommendations in terms of relevance and completeness.</p>
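      <p>For reference, the metrics named above can be computed as in the following Python sketch. The definitions are the usual textbook ones and are assumed here, since the exact variants used in the experiments are not spelled out: Precision@n divides the number of relevant items in the top n by n, Recall@n divides it by the number of relevant items, and the F-measure is their harmonic mean. Function names and the sample data are hypothetical.</p>
      <preformat>
```python
import math


def mae(pairs):
    """Mean absolute error over (true, predicted) rating pairs."""
    return sum(abs(t - p) for t, p in pairs) / len(pairs)


def rmse(pairs):
    """Root mean square error over (true, predicted) rating pairs."""
    return math.sqrt(sum((t - p) ** 2 for t, p in pairs) / len(pairs))


def precision_recall_f(recommended, relevant, n):
    """Precision@n, Recall@n and F-measure@n for one ranked list."""
    hits = sum(1 for item in recommended[:n] if item in relevant)
    precision = hits / n
    recall = hits / len(relevant) if relevant else 0.0
    # Harmonic mean; defined as 0 when there are no hits.
    f_measure = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f_measure


pairs = [(5.0, 4.0), (3.0, 3.0), (1.0, 3.0)]
ranking = ["a", "b", "c", "d"]
relevant = {"a", "c", "e"}
print(mae(pairs), rmse(pairs))
print(precision_recall_f(ranking, relevant, 2))
```
</preformat>
      <p>MAE and RMSE score the predicted ratings themselves, while the three ranking metrics only look at which items appear in the top n, which is why a system can do well on the former and still show a lower recall.</p>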
      <p>The experimental results in Table 1 show that ARES performs in line with state-of-the-art
algorithms in terms of MAE and RMSE, indicating good predictive accuracy. However, a lower recall
was observed for ARES than for the collaborative algorithms, suggesting that in highly subjective
domains such as perfumery, collective preferences may offer added value over purely content-based analysis.
Overall, our experiments show that our approach does not sacrifice too much performance with respect
to state-of-the-art systems, while providing transparency and explainability. A comparison with other
explainable RS, e.g., CAFE [8] and CountER [9], is left for future work.</p>
      <p>Besides explainability, a distinctive feature of ARES is its support for contestability (see Section 4).
The effects of contestability were evaluated through an iterative simulation that compared ARES with
the KNN algorithm. The simulation tracked the mean absolute error over
several iterations for a single user, across five simulation runs. Figure 3 shows that
ARES converges to a lower absolute error, demonstrating its ability to adapt
and to represent preferences consistently. In contrast, the KNN model showed less stable
behavior, with the absolute error fluctuating across iterations without clear convergence. This result
suggests that our approach is particularly suitable in scenarios where user preferences evolve over
time, supporting its validity from a dynamic perspective.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusions</title>
      <p>We explored the application of ABALearn for the development of explainable recommender systems,
addressing the problem of opacity inherent in traditional models. We introduced ARES
(ARgumentation-based Explainable recommendation System), a system that generates traceable and understandable
recommendations based on an explicit reasoning process. Our architecture integrates a formal
representation via ABA frameworks, which model the user’s preference profile through rules, assumptions
and contraries, and an LLM to translate argumentative explanations into natural language. The linking
of explanations to formal derivations in ABA ensures the faithfulness of the inferential process and
prevents the generation of hallucinations.</p>
      <p>A distinctive feature of ARES is its iterative and adaptive learning mechanism, enabled by a
conversational chatbot interface. This dynamic interaction enables the system to continuously update
the user profile based on explicit and implicit feedback, allowing the ABA framework to evolve and
refine recommendations over time.</p>
      <p>Future Work. We envisage several directions for the future development of ARES. A first direction
concerns the implementation of a hybrid model that combines the ARES content-based approach with
collaborative filtering. Instead of considering only the interaction history, we could exploit the similarity
between users, based on the similarity between the ABA frameworks generated for each of them, e.g.,
by comparing extensions (or answer sets of the corresponding ASP representations). This would enable
us to take advantage of the personalisation inherent in content-based user profiles and the ability of
collaborative filtering to detect common patterns among users with similar logical preference structures.</p>
      <p>A second line of development focuses on presenting logical reasoning as a chain of thought in natural
language. Currently, explanations are generated by the LLM from the extracted reasoning paths. We
could further explore automatic textual generation of these reasoning paths, making them even more
understandable and narrative for the user. The ultimate goal is to provide a chain of reasoning that is
not only plausible, but whose soundness and faithfulness to the inference process is ensured by direct
derivation from the rules of the ABA framework, unlike approaches generating an illusion of reasoning.</p>
      <p>Finally, ARES focuses on the perfumery domain, but its design principles and architecture are general.
We plan to implement other instances of the system in different domains. It is reasonable to expect that
in domains where the features of the items to be recommended are more objective (e.g., technological
or financial products), the benefits of ARES may be even more pronounced.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>We acknowledge support from the Royal Society, UK (IEC\R2\222045). Toni was funded by the ERC (grant
agreement No. 101020934) and by J.P. Morgan and the RAEng, UK, under the Research Chairs Fellowships
scheme (RCSRF2021\11\45). De Angelis and Proietti were supported by the MUR PRIN 2022 Project
DOMAIN funded by the EU-NextGenerationEU (2022TSYYKJ, CUP B53D23013220006, PNRR, M4.C2.1.1),
the PNRR MUR project PE0000013-FAIR (CUP B53C22003630006), and the INdAM - GNCS Project
Argomentazione Computazionale per apprendimento automatico e modellazione di sistemi intelligenti
(CUP E53C24001950001). De Angelis and Proietti are members of the INdAM-GNCS research group.</p>
    </sec>
    <sec id="sec-8">
      <title>Declaration on Generative AI</title>
      <p>The authors have not employed any Generative AI tools.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Said</surname>
          </string-name>
          ,
          <article-title>On explaining recommendations with large language models: a review</article-title>
          ,
          <source>Frontiers in Big Data</source>
          <volume>7</volume>
          (
          <year>2025</year>
          ). doi:10.3389/fdata.2024.1505284.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <article-title>Explainable recommendation: A survey and new perspectives</article-title>
          ,
          <source>Foundations and Trends® in Information Retrieval</source>
          <volume>14</volume>
          (
          <year>2020</year>
          )
          <fpage>1</fpage>
          -
          <lpage>101</lpage>
          . doi:10.1561/1500000066.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Bondarenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Dung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kowalski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Toni</surname>
          </string-name>
          ,
          <article-title>An abstract, argumentation-theoretic approach to default reasoning</article-title>
          ,
          <source>Artificial Intelligence</source>
          <volume>93</volume>
          (
          <year>1997</year>
          )
          <fpage>63</fpage>
          -
          <lpage>101</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>