<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Natural XAI: Generating Feasible, Actionable, and Causally-Aware Counterfactual Explanations in Natural Language</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Pedram Salimi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Robert Gordon University</institution>
          ,
          <addr-line>Garthdee Rd, Aberdeen, AB10 7GJ</addr-line>
          ,
          <country country="UK">United Kingdom</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Counterfactual explanations have become a significant component of eXplainable AI (XAI), offering intuitive “what if” scenarios. However, typical numeric or tabular outputs can be opaque to non-technical audiences. Additionally, many counterfactual methods ignore causal relationships or suggest inactionable changes such as “be younger by five years,” raising concerns over realism and ethics. To address these issues, we propose a holistic approach that integrates a Feature Actionability Taxonomy (FAT) and causal discovery into counterfactual generation, thereby ensuring realistic, ethically sound, and semantically transparent explanations in natural language. We further introduce an interactive, agentic workflow enabling users to iteratively refine constraints. Through extensive user studies, pilot evaluations, and synergy with Case-Based Reasoning (CBR), this approach yields explanations that are accessible, trust-enhancing, and practically useful in domains such as healthcare, finance, and education.</p>
      </abstract>
      <kwd-group>
        <kwd>Explainable AI</kwd>
        <kwd>Counterfactual Explanations</kwd>
        <kwd>Causality</kwd>
        <kwd>Natural Language Generation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Our contributions comprise four pillars: (i) a Feature Actionability Taxonomy (FAT), specifying
which features can realistically be changed; (ii) causally-aware counterfactual generation, ensuring that
recommended interventions respect real-world dependencies; and (iii) Natural Language Generation (NLG), presenting counterfactuals in user-friendly textual
form. Finally, we propose an agentic framework, in which users can iteratively refine or reject certain
suggestions, receiving updated counterfactuals each time. This paper outlines our progress towards a
more transparent, practical, and interactive XAI, referred to here as Natural-XAI.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Motivation and Related Work</title>
      <p>Modern decision-support systems increasingly rely on complex, “black-box” models. Trust and
adoption depend heavily on effective explanation. Counterfactual methods stand out for their simplicity and
actionability: rather than simply stating “your loan was denied because of X,” a counterfactual might
say “if your monthly income increased by $200, your loan would have been approved.”</p>
      <p>
        However, user trust can degrade if suggested feature changes are not plausible or ethically sound.
For instance, suggestions like “be ten years younger” or “change your race” are not only inactionable
but also unethical in many contexts. Prior works such as DiCE [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] address diversity in counterfactual
explanations, while FACE [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] focuses on data manifold feasibility. Causality-oriented approaches (e.g.
[
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] endeavour to align recommendations with real-world cause-and-effect. Meanwhile, template-based
NLG [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] has shown promise for generating explanations that resonate more strongly with end-users
compared to raw numeric or tabular outputs.
      </p>
      <p>Despite these advancements, few solutions comprehensively unify actionability, causality, and natural
language. Our approach bridges these dimensions, while also employing Case-Based Reasoning (CBR),
which complements counterfactuals by providing exemplars of similar cases and how they differ from
the current instance.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Proposed Approach</title>
      <p>To address the limitations of purely numeric or tabular counterfactuals, our proposed framework
introduces a holistic pipeline that progressively addresses actionability, presentation, causal coherence,
and iterative user engagement. We begin by specifying how features can and cannot be changed in the
Feature Actionability Taxonomy (FAT). Next, we incorporate these constraints into a template-based
Natural Language Generation (NLG) module to communicate potential changes in a user-friendly
manner. We then extend the counterfactual generation process to handle causal dependencies so that
recommended interventions remain realistic. Finally, we embed all these components into an agentic
workflow, allowing iterative dialogue and refinement of constraints. The sections that follow describe
these four core pillars in turn.</p>
      <sec id="sec-3-1">
        <title>3.1. Feature Actionability Taxonomy (FAT)</title>
        <p>A foundational element of our approach is the Feature Actionability Taxonomy (FAT), which
classifies features into three categories based on how realistically they can be modified:</p>
        <list list-type="bullet">
          <list-item><p>Directly Mutable (DM): easily adjustable features, such as the requested loan amount.</p></list-item>
          <list-item><p>Indirectly Mutable (IM): features that are changeable, but only through more extended or
intricate actions (e.g. educational level, occupation type).</p></list-item>
          <list-item><p>Immutable (Non-sensitive or Sensitive): characteristics such as age or gender, which typically
cannot or should not be changed for ethical or practical reasons.</p></list-item>
        </list>
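        <p>To make the taxonomy concrete, the minimal Python sketch below shows one way such a mapping
and filter could be encoded; the feature names, the FAT dictionary, and the filter_deltas helper are
illustrative assumptions rather than the system’s actual implementation.</p>
        <preformat>
from enum import Enum

class Actionability(Enum):
    DIRECTLY_MUTABLE = "DM"      # e.g. requested loan amount
    INDIRECTLY_MUTABLE = "IM"    # e.g. education level, occupation type
    IMMUTABLE = "IMM"            # sensitive or non-sensitive, e.g. age, gender

# Hypothetical FAT for a loan-approval dataset.
FAT = {
    "loan_amount": Actionability.DIRECTLY_MUTABLE,
    "education": Actionability.INDIRECTLY_MUTABLE,
    "age": Actionability.IMMUTABLE,
    "gender": Actionability.IMMUTABLE,
}

def filter_deltas(deltas):
    """Drop suggested feature changes that touch immutable features."""
    return {f: d for f, d in deltas.items()
            if FAT.get(f) is not Actionability.IMMUTABLE}

# "Reduce your age by 10 years" is filtered out before any text is generated.
print(filter_deltas({"age": -10, "loan_amount": 200}))   # {'loan_amount': 200}
</preformat>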
        <p>
          FAT was defined using a data-driven methodology, examining features extracted from
six datasets [
          <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
          ] related to Fair AI. These datasets span three distinct domains, with each feature analysed to
determine suitable actionability categories [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ].
        </p>
        <p>
          By explicitly encoding each feature’s mutability, FAT automatically filters out suggestions that violate
real-world or ethical constraints, thereby increasing user trust in the system [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. Figure 1 illustrates this
taxonomy, showing how each feature is funnelled into a relevant branch, thus ensuring that impossible
or unethical recommendations, for example “reduce your age by 10 years”, are never generated [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ].
        </p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Natural Language Generation (NLG)</title>
        <p>
          We employ a three-stage, template-based approach to generate user-friendly counterfactual explanations
in natural language [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. First, sentence planning leverages the Feature Actionability Taxonomy (FAT) to
identify appropriate templates based on whether a feature is directly mutable, indirectly mutable, or
immutable. Second, surface realisation populates these templates with user-specific details while taking
into account thematic preferences (e.g. emphasising feasibility or positive language). For instance,
immutable features are described using templates that convey encouragement rather than negative
warnings, following recommendations from psychology research [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]. Finally, discourse planning groups
and orders the individual sentences by category (e.g. mutable vs. immutable) and by feature-based
priority (e.g. a SHAP ranking), generating a coherent paragraph-level explanation.
        </p>
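        <p>The sketch below walks a toy example through the three stages; the templates, feature names, and
importance scores are illustrative assumptions, not the actual templates used in our system.</p>
        <preformat>
# Sentence planning: one template per FAT category.
TEMPLATES = {
    "DM": "If you {verb} your {feature} by {delta}, the outcome would change.",
    "IM": "Working towards a different {feature} could, over time, change the outcome.",
    "IMMUTABLE": "Although your {feature} cannot change, other factors can.",
}

def realise(feature, category, delta=None, verb="increase"):
    """Surface realisation: populate the selected template with user details."""
    return TEMPLATES[category].format(feature=feature, delta=delta, verb=verb)

def explain(changes, importance):
    """Discourse planning: order sentences by importance, mutable ones first."""
    ranked = sorted(changes, key=lambda c: -importance.get(c["feature"], 0.0))
    mutable = [realise(**c) for c in ranked if c["category"] != "IMMUTABLE"]
    fixed = [realise(**c) for c in ranked if c["category"] == "IMMUTABLE"]
    return " ".join(mutable + fixed) + " Good luck with your loan!"

print(explain(
    [{"feature": "monthly income", "category": "DM", "delta": "$200"},
     {"feature": "age", "category": "IMMUTABLE"}],
    importance={"monthly income": 0.8, "age": 0.3},
))
</preformat>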
        <p>Figure 2 illustrates the full pipeline, from matching features to templates
through to sentence sorting and domain-specific epilogues. For example, a finance application might
conclude with “Good luck with your loan!” while a healthcare scenario might close with “Stay healthy!”
This layered approach ensures that explanations are both contextually accurate and reassuring in tone.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Causally-Aware Counterfactual Explanations</title>
        <p>
          While FAT addresses the actionability of individual features, many real-world scenarios involve intricate
causal dependencies. Changing one feature (e.g. “Education Level”) may, in fact, affect others (e.g.
“Occupation,” “Income”), which in turn could impact the ultimate prediction (e.g. “Loan Approval”). To
capture these relationships, we integrate causal discovery, for instance, the DECI framework [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] or
domain-informed causal graphs. In this work, the causal graph is learned with DECI, a deep end-to-end
causal discovery method that uses a Bayesian approach to learn causal relationships from observational
data. Moreover, when partial causal knowledge is available, domain experts can directly adjust or constrain
edges in the causal graph. These human-in-the-loop edits ensure that only plausible relationships feed
into the counterfactual discovery.
        </p>
        <p>Once such a causal graph is in place, our counterfactual discovery engine intervenes on chosen
features while propagating changes through the causal network, ensuring each recommended scenario
respects real cause-and-effect pathways. This stands in contrast to naive methods that treat all features
as mutually independent, potentially yielding contradictory suggestions (e.g. recommending both fewer
working hours and a higher income). By linking causal modelling with FAT, we ensure that only
permissible features are altered, and do so in a way that maintains internal consistency across all
features.</p>
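        <p>As a simple illustration of this propagation, the toy structural model below intervenes on one
feature and cascades the change through a hand-written graph; the graph, linear coefficients, and feature
values are hypothetical stand-ins for a structure learned by DECI or supplied by domain experts.</p>
        <preformat>
# Toy DAG: education -> occupation -> income
GRAPH = {
    "education": ["occupation"],
    "occupation": ["income"],
    "income": [],
}
# Toy linear effects: a unit change in the parent shifts the child by COEF.
COEF = {("education", "occupation"): 0.5, ("occupation", "income"): 2.0}

def propagate(values, feature, new_value):
    """Intervene on `feature` and cascade the change to its descendants."""
    values = dict(values)
    delta = new_value - values[feature]
    values[feature] = new_value
    for child in GRAPH[feature]:
        values = propagate(values, child,
                           values[child] + COEF[(feature, child)] * delta)
    return values

state = {"education": 12, "occupation": 6, "income": 30}
# Raising education also raises occupation level and income, rather than
# treating the three features as mutually independent.
print(propagate(state, "education", 16))
# {'education': 16, 'occupation': 8.0, 'income': 34.0}
</preformat>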
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Agentic and Iterative Workflow</title>
        <p>Finally, we embed these components in an agentic workflow, transitioning from static explanations
to dynamic, user-driven conversations. The workflow is both interactive, meaning the user can add or
relax constraints in natural language, and iterative, meaning the system regenerates counterfactuals
until the user is satisfied. After the system presents a counterfactual recommendation in natural
language, the user can refine constraints or discard infeasible changes by updating the Feature
Actionability Taxonomy embedded in the workflow (see the sketch after this list). Concretely:</p>
        <list list-type="bullet">
          <list-item><p>Rejecting a Recommendation: users may specify, “I cannot reduce my monthly outgoings below
$1,000 due to fixed costs.”</p></list-item>
          <list-item><p>Prioritising Feasibility: the system re-executes the counterfactual search, guided by both FAT and
the user’s updated constraints, and then re-runs the NLG to produce revised textual explanations.</p></list-item>
        </list>
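        <p>A minimal sketch of this loop follows. The search and render functions are stand-ins for the
FAT- and causality-guided counterfactual engine and the NLG module described earlier; their bodies, like
the feature names, are illustrative assumptions rather than the system’s actual implementation.</p>
        <preformat>
def search(instance, constraints):
    """Stub counterfactual search that honours user-supplied lower bounds."""
    cf = {"monthly_outgoings": 800,
          "monthly_income": instance["monthly_income"] + 200}
    for feature, floor in constraints.items():
        cf[feature] = max(cf.get(feature, floor), floor)
    return cf

def render(cf):
    """Stub NLG: turn the counterfactual dict into one sentence."""
    parts = [f"set {k} to {v}" for k, v in cf.items()]
    return "If you " + " and ".join(parts) + ", the loan would be approved."

def agentic_session(instance, constraints=None):
    """Present counterfactuals, absorb user constraints, and regenerate."""
    constraints = dict(constraints or {})
    while True:
        cf = search(instance, constraints)
        print(render(cf))
        reply = input("Constraint as 'feature,minimum', or 'ok' to accept: ")
        if reply.strip().lower() == "ok":
            return cf
        feature, floor = reply.split(",")
        constraints[feature.strip()] = float(floor)  # e.g. "monthly_outgoings,1000"

agentic_session({"monthly_income": 2500})
</preformat>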
        <p>This interactive loop ensures each subsequent iteration of recommendations is increasingly tailored
to the user’s personal limitations and priorities. Consequently, Natural Language Counterfactual
Explanations evolve from a one-of advisory statement into an iterative conversation, fostering transparency,
trust, and practical usability.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Progress and Preliminary Results</title>
      <p>Thus far, our work has centred on three key pillars (ensuring actionability, incorporating causal
insights, and enabling interactive user refinement) and has yielded promising preliminary outcomes.</p>
      <p>First, we developed a Feature Actionability Taxonomy (FAT) to categorise features as directly mutable,
indirectly mutable, or immutable/sensitive. Using insights from a focused user study, we then crafted a
template-based NLG module that transforms raw numeric deltas into readable, context-rich sentences
(e.g. emphasising time frames and feasibility cues). Early feedback showed that these text-based
explanations were perceived as significantly more transparent and actionable than purely tabular
representations.</p>
      <p>Next, to address the risk of contradictory or unrealistic suggestions, we integrated causal discovery
(e.g. via DECI) into the counterfactual search process. This ensures that interventions on one feature
(e.g. increasing “Education Level”) properly cascade to related variables (e.g. “JobType” or “Income”),
thus reflecting genuine real-world cause-and-effect. Pilot experiments demonstrated a measurable
reduction in contradictory recommendations, particularly in financial scenarios, when compared to
correlation-only methods.</p>
      <p>Finally, we introduced an agentic workflow that treats users as active participants rather than passive
recipients of one-shot explanations. After receiving an initial set of textual counterfactuals, users can
impose additional constraints (“I cannot reduce my expenses below $1,000”) or request alternative actions.
The system promptly regenerates revised solutions, again expressed via the NLG module. Preliminary
user testing suggests that iterative refinement can boost clarity and trust.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion and Future Directions</title>
      <p>In moving towards truly Natural XAI, our framework unifies a Feature Actionability Taxonomy (FAT),
causal integration, and a template-based NLG approach within an agentic workflow. This design ensures
that counterfactual explanations remain feasible, ethically grounded, and responsive to user feedback.
As the next major step, we plan a user study that deploys this interactive system in finance, healthcare,
and education scenarios, each featuring its own domain-specific features and constraints. Participants
will provide iterative feedback on the system’s recommendations (e.g. indicating infeasible changes)
and then observe how the workflow adapts the generated counterfactuals in real time. We will evaluate
how iterative refinement influences trust, clarity, and overall satisfaction, testing whether our method
significantly outperforms traditional single-shot explanations. Ultimately, we aim to demonstrate that
combining actionability, causality, and user engagement not only enhances the transparency of AI
decisions but also offers a meaningful path towards ethically and practically sound recourse.</p>
      <p>Acknowledgements. The author would like to thank the supervisory team, domain experts who
provided guidance on ethical considerations, and participants in pilot studies whose feedback has
substantially shaped the direction of this research.</p>
    </sec>
    <sec id="sec-6">
      <title>Declaration on Generative AI</title>
      <p>During preparation of this work, the authors used ChatGPT for the purpose of: grammar and spelling
check, paraphrase and reword. After using this tool, the authors reviewed and edited the content and
take full responsibility for the publication’s content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>R. K.</given-names>
            <surname>Mothilal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Tan</surname>
          </string-name>
          ,
          <article-title>Explaining machine learning classifiers through diverse counterfactual explanations</article-title>
          ,
          <source>in: Proc. Conf. on Fairness, Accountability, and Transparency</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>607</fpage>
          -
          <lpage>617</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>R.</given-names>
            <surname>Poyiadzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Sokol</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Santos-Rodriguez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>De Bie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Flach</surname>
          </string-name>
          ,
          <article-title>Face: feasible and actionable counterfactual explanations</article-title>
          ,
          <source>in: Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>344</fpage>
          -
          <lpage>350</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.-H.</given-names>
            <surname>Karimi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Schölkopf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Valera</surname>
          </string-name>
          ,
          <article-title>Algorithmic recourse: from counterfactual explanations to interventions</article-title>
          ,
          <source>in: Proceedings of the 2021 ACM conference on fairness, accountability, and transparency</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>353</fpage>
          -
          <lpage>362</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>P.</given-names>
            <surname>Salimi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Wiratunga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Corsar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Wijekoon</surname>
          </string-name>
          ,
          <article-title>Towards feasible counterfactual explanations: A taxonomy guided template-based nlg method</article-title>
          ,
          <source>in: ECAI</source>
          <year>2023</year>
          , IOS Press,
          <year>2023</year>
          , pp.
          <fpage>2057</fpage>
          -
          <lpage>2064</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>D.</given-names>
            <surname>Dua</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Graff</surname>
          </string-name>
          ,
          <article-title>UCI machine learning repository</article-title>
          ,
          <year>2017</year>
          . URL: http://archive.ics.uci.edu/ml.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>T.</given-names>
            <surname>Le Quy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Roy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Iosifidis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Ntoutsi</surname>
          </string-name>
          ,
          <article-title>A survey on datasets for fairness-aware machine learning</article-title>
          ,
          <source>WIREs Data Mining and Knowledge Discovery</source>
          <volume>12</volume>
          (
          <year>2022</year>
          )
          e1452
          . URL: https://wires.onlinelibrary.wiley.com/doi/abs/10.1002/widm.1452. doi:10.1002/widm.1452.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>P.</given-names>
            <surname>Salimi</surname>
          </string-name>
          ,
          <article-title>Addressing trust and mutability issues in XAI utilising case-based reasoning</article-title>
          ,
          <source>ICCBR Doctoral Consortium</source>
          <volume>1613</volume>
          (
          <year>2022</year>
          )
          <fpage>0073</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>E.</given-names>
            <surname>Reiter</surname>
          </string-name>
          ,
          <article-title>An architecture for data-to-text systems</article-title>
          ,
          <source>in: Proc. 11th European Workshop on NLG (ENLG 07)</source>
          , DFKI GmbH, Saarbrücken, Germany,
          <year>2007</year>
          , pp.
          <fpage>97</fpage>
          -
          <lpage>104</lpage>
          . URL: https://aclanthology.org/W07-2315.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>R.</given-names>
            <surname>Burieva</surname>
          </string-name>
          ,
          <article-title>The effectiveness of teaching writing to the students with the technique “rewards and positive reinforcement”</article-title>
          , Academic research in educational sciences (
          <year>2020</year>
          )
          <fpage>229</fpage>
          -
          <lpage>232</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>T.</given-names>
            <surname>Geffner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Antoran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Foster</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Gong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Kiciman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lamb</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kukla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hilmkil</surname>
          </string-name>
          , et al.,
          <article-title>Deep end-to-end causal inference</article-title>
          ,
          <source>in: NeurIPS 2022 Workshop on Causality for Real-world Impact</source>
          ,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>