<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Explainable NLLP: Advancements in Explainable AI for Natural Legal Language Processing</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Lucas Resck</string-name>
          <email>lucas.domingues@fgv.edu.br</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Felipe Moreno-Vera</string-name>
          <email>felipe.moreno@fgv.br</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tobias Veiga</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gerardo Paucar</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ezequiel Fajreldines</string-name>
          <email>ezequiel.santos@fgv.br</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Guilherme Klafke</string-name>
          <email>guilherme.klafke@fgv.br</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Luis Gustavo Nonato</string-name>
          <email>gnonato@icmc.usp.br</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jorge Poco</string-name>
          <email>jorge.poco@fgv.br</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Getulio Vargas Foundation</institution>
          ,
          <addr-line>Rio de Janeiro</addr-line>
          ,
          <country country="BR">Brazil</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Getulio Vargas Foundation</institution>
          ,
          <addr-line>São Paulo</addr-line>
          ,
          <country country="BR">Brazil</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of São Paulo</institution>
          ,
          <addr-line>São Carlos</addr-line>
          ,
          <country country="BR">Brazil</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2026</year>
      </pub-date>
      <abstract>
        <p>Despite the increasing application of machine learning and NLP methods in the legal domain, there has been limited effort to enhance the understanding and transparency of these algorithms. This paper addresses this gap by presenting a survey on Explainable AI (XAI) applied to Natural Legal Language Processing (NLLP). To our knowledge, this survey represents the first comprehensive examination of the intersection of XAI, Law, and NLP. Building upon prior surveys focused on partial intersections of these domains, we propose a taxonomy for classifying papers based on the NLLP task, explanation type, and technique employed. Additionally, we delve into discussions surrounding Explainable NLLP, considering perspectives related to ethics, current open issues, and future work. Our analysis reveals that the categorized papers generally do not thoroughly examine the ethical implications of the explainability principle in NLP within the legal field. Furthermore, they do not discuss the role and value of explanations, nor do they effectively utilize their respective XAI techniques to offer insights into the limitations of NLP systems.</p>
      </abstract>
      <kwd-group>
        <kwd>Explainable AI</kwd>
        <kwd>Natural Language Processing</kwd>
        <kwd>Law</kwd>
        <kwd>Survey</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Natural Language Processing (NLP) falls under the umbrella of Artificial Intelligence (AI). It is dedicated
to facilitating interaction between computers and human language. It aims to empower machines to
comprehend, interpret, and generate human language in a manner that is not only meaningful but
also contextually relevant. While there has been substantial growth in the application of Machine
Learning (ML) and NLP methods within the legal domain, often referred to as “LegalAI” [
        <xref ref-type="bibr" rid="ref1 ref2 ref3">1, 2, 3</xref>
        ],
relatively little has been done to enhance the comprehension and transparency of algorithms such as
legal document summarization [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ], legal document classification [
        <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
        ], and predictive analytics for
legal outcomes [
        <xref ref-type="bibr" rid="ref8">8, 9, 10, 11</xref>
        ].
      </p>
      <p>With the growing power and complexity of ML and NLP algorithms, the demand for transparency
in these systems has never been more critical. Transparency in ML applications entails the capacity
to comprehend, interpret, and expound upon the decisions and predictions made by these algorithms,
a vital aspect within the legal domain. Within this context, Transparency in ML and Explainable
Artificial Intelligence (XAI) are closely intertwined concepts, both striving to render AI and ML systems
more understandable, interpretable, and accountable. Together, they tackle ethical, regulatory, and
user trust concerns in AI and facilitate the widespread integration of AI technologies across various
fields, particularly in Natural Legal Language Processing (NLLP). Within legal NLP, the fusion of ML
transparency and XAI is indispensable for upholding fairness, compliance, and trustworthiness. This
approach benefits legal professionals, stakeholders, and the public by providing insights into AI-driven
legal decisions and enabling AI’s responsible and ethical use within the legal domain.</p>
      <p>
        Nonetheless, while a few works encompass the study, review, and synthesis of XAI &amp; NLP [12, 13]
or NLP &amp; Law [
        <xref ref-type="bibr" rid="ref1 ref3">1, 3</xref>
        ], none of these delve into pertinent subjects or trends in XAI, Legal NLP, or the
intersection of both, such as trustworthiness, fairness, and ethics. Hence, we recognize the significance
of investigating Legal, NLP, and XAI. This intersection is paramount because the legal domain imposes
specific constraints and requisites concerning explanations and justifications. In this vein, we advocate
for thoroughly exploring the prevailing trends in techniques and explanations applied in NLLP. To
address this, we present this survey, focusing on covering and addressing these topics and structuring
them through developing a taxonomy rooted in XAI and NLLP. In this work, we treat explainability and
interpretability interchangeably, as is common in the literature [14, 15], despite existing debate [16].
Main Contributions. This work encompasses a survey focusing on applying Explainable AI in
Natural Legal Language Processing. Additionally, we identify papers that explicitly address ethics,
particularly within the context of XAI. Our primary contributions in this study are outlined as follows:
• We introduce a taxonomy for systematically categorizing papers based on the NLLP task and the
specific explanation type and technique employed.
• We analyze the prevailing research trends in types of explanation and the utilization of XAI.
• We consider perspectives related to ethics in XAI and the legal domain, emphasizing the necessity
of addressing ethical concerns and pointing out current challenges.
      </p>
      <p>• We comprehensively discuss the existing open issues within the realm of XAI applied to NLLP.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Previous Surveys</title>
      <p>
        Researchers have made significant progress in summarizing and classifying the forefront of XAI, yielding
an extensive body of literature that addresses this challenge from various perspectives and within
diverse domains [
        <xref ref-type="bibr" rid="ref1 ref3">17, 1, 12, 13, 3, 18</xref>
        ]. In a recent paper, Schwalbe and Finzel [17] have consolidated all
these prior efforts into a unified taxonomy.
      </p>
      <p>
        This survey classifies advances in XAI within the specific Natural Legal Language Processing domain.
To our knowledge, this represents the inaugural survey at the crossroads of XAI, Law, and NLP. We
extend upon pertinent prior works that have approached the convergence of NLP and Law [
        <xref ref-type="bibr" rid="ref1 ref3">1, 3</xref>
        ] and
NLP and XAI [12, 13] to construct a comprehensive taxonomy encompassing XAI, Law, and NLP.
      </p>
      <p>Atkinson et al. [18] analyze explanation methodologies in AI as applied to law. However, their focus
is predominantly on conventional automation-based systems, such as rule- and case-based ones. While
they touch upon explainability in machine learning, it is done with a critical perspective. We delve into
these critiques and their implications in Section 5.</p>
      <p>Danilevsky et al. [13] present a survey on applying XAI in NLP. This work is complemented by an
interactive browser-based system for exploring the study [12]. This body of work organizes explanations
and encompasses various modalities through which explanations are extracted and visualized. Drawing
inspiration from the efforts of Danilevsky et al. [13], we propose our own taxonomy, particularly concerning
the categorization of explanations (local vs. global, self-explanatory vs. post-hoc) and methods of
explainability (e.g., feature importance, surrogate, among others).</p>
      <p>
        Additionally, some works scrutinize the intersection of NLP and the legal domain, a field referred to
as LegalAI by Zhong et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Specifically, this research categorizes and illustrates several methods
based on embeddings and symbols. It also delineates several applications of LegalAI. Finally, Katz et al.
[
        <xref ref-type="bibr" rid="ref1">1</xref>
] provide a comprehensive overview of the current state of legal NLP. In addition to their extensive
analysis of hundreds of related papers, they propose a broad taxonomy centered around engineering tasks
in NLP.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <sec id="sec-3-1">
        <title>3.1. Search for Papers</title>
        <p>To curate pertinent literature, we conducted a thorough search on Google Scholar and Semantic
Scholar, employing keywords such as “xai legal nlp,” “legal nlp,” “legal decision prediction,” “nlp
legal judgment prediction xai,” and “nlp legal judgment prediction interpretable,” without date-range
restrictions. The gathered papers underwent screening based on their titles, abstracts, keywords, and
full text. To ensure a focused selection of literature, papers were screened for relevance to the
intersection of XAI, NLP, and Law. For instance, we included any papers that employed explainability
or interpretability techniques to enhance the understanding of NLP models in legal contexts.</p>
        <p>Additionally, we scrutinized the bibliographies of each selected paper from the initial search,
incorporating those identified as pertinent into our list for meticulous examination. Following a comprehensive
review of the selected papers, we arrived at a set of 40 documents, which are thoroughly discussed and
organized within the proposed taxonomy (see Section 3.2). The resulting survey considers works from
a diversity of venues, including the Association for Computational Linguistics Anthology (ACL, AACL,
COLING, EMNLP, NAACL, etc.), AI &amp; Law venues (Artificial Intelligence and Law and ICAIL), and
preprint repositories (arXiv and SSRN).</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Taxonomy</title>
        <p>To propose a taxonomy for the convergence of NLP, XAI, &amp; Law, we have built upon prior efforts in
categorizing papers within the realms of NLP &amp; Law and NLP &amp; XAI (Section 2).</p>
        <p>Explanation Type: We adhere to the approach outlined by Danilevsky et al. [13] and Qian et al. [12],
organizing explanation methods into the following classifications:
• Local vs. Global: This pertains to whether the explanation is specific to a particular instance or
provides an overview of the model’s behavior across the entire set of instances.
• Self-explaining vs. Post-hoc: This distinguishes whether the explanation is derived directly from
the model or obtained through a post-processing step.</p>
        <p>It is worth noting that only a limited number of works rely on global explanations (as shown in
Table 1). Consequently, while global explanations constitute a pertinent category, our ensuing discussion
will primarily center on local explanation methods.</p>
        <p>Explainability Technique: Diverse approaches exist for integrating XAI methods into a legal NLP
pipeline, encapsulated by the various explainability methods employed. We also draw upon Danilevsky
et al. [13]’s work for classification:
• Feature importance: This XAI method scrutinizes and assigns importance scores to the features
utilized in the prediction process, such as employing attention mechanisms [19].
• Surrogate model: In this approach, another model, typically simpler and interpretable,
approximates the decision-making process of the original model and serves as a stand-in for explanations,
as exemplified by LIME [20].
• Example-driven: Other examples are used to justify the prediction.
• Provenance-based: This method is employed when the decision-making process involves a
sequence of derivation steps, some or all presented as part of the explanation.
• Declarative induction: Human-readable representations like trees [21] serve as explanations in
this category.</p>
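        <p>To make the feature-importance category concrete, the following minimal Python sketch estimates per-word importance by occlusion: each word is removed in turn and the resulting drop in a black-box score is recorded. The classifier, cue words, and weights are invented for illustration and do not come from any surveyed system.</p>

```python
# Illustrative sketch of the "feature importance" category:
# occlusion-style scoring against a toy black-box classifier.
# The classifier below is a stand-in assumption, not a surveyed model.

def toy_legal_classifier(words):
    """Toy black box: scores how 'appeal-like' a sentence is."""
    cues = {"appeal": 2.0, "court": 1.0, "granted": 1.5}
    return sum(cues.get(w.lower(), 0.0) for w in words)

def occlusion_importance(words, score_fn):
    """Importance of word i = score drop when word i is removed."""
    base = score_fn(words)
    return {
        i: base - score_fn(words[:i] + words[i + 1:])
        for i in range(len(words))
    }

sentence = "The court granted the appeal".split()
scores = occlusion_importance(sentence, toy_legal_classifier)
# Cue-bearing words ("court", "granted", "appeal") receive positive
# importance; the remaining words receive zero.
```

        <p>Attention mechanisms, gradients, and occlusion all instantiate the same idea: a numeric relevance score attached to each input feature.</p>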
        <p>
          It is essential to note that these categories are not mutually exclusive. For instance, the LIME
technique falls under both feature importance and surrogate model. When this happens, we apply the
most pertinent one.
        </p>
        <p>
NLP Task: Research at the intersection of NLP and Law leverages NLP techniques to address legal
challenges. Hence, it is crucial to classify these studies based on their specific NLP tasks. We employ the
comprehensive and succinct taxonomy proposed by Katz et al. [1, Table 1] for this purpose. However, our
analysis revealed that most of the works fall within the "Classification" category, encompassing Outcome
Prediction, Legal Area Classification, and Topic Modeling. It is worth noting that this prevalence is not
arbitrary. While Katz et al. [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] present a broad spectrum of legal NLP tasks, those beyond classification,
machine summarization, and text generation, such as “resources,” tend to be less reliant on machine
learning, if not entirely independent. Consequently, they pose challenges when applying XAI methods.
Conversely, machine summarization and text generation are comparatively less common [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ].
        </p>
        <p>
          Nonetheless, certain studies (e.g., [41]) are labeled with additional categories beyond classification,
such as information retrieval and resources. A few others are labeled independently of classification,
such as machine summarization [
          <xref ref-type="bibr" rid="ref17">33</xref>
          ] and text generation [39].
        </p>
        <p>Ethical Issues: The ethical implications of applying machine learning to NLP are paramount,
particularly concerning the choice of explainability methods. We identify studies that address these ethical
concerns, emphasizing those that do so within the context of XAI.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Taxonomy Discussion</title>
      <p>This section offers an overview of the primary XAI techniques employed in each respective XAI type.
It is worth noting that the global post-hoc XAI type is omitted due to its absence in the reviewed
literature. Furthermore, we present noteworthy observations applicable to all the examined studies.
Table 1 thoroughly categorizes the works based on the XAI type, explainability method, and NLLP task.</p>
      <sec id="sec-4-1">
        <title>4.1. Local Post-hoc</title>
        <p>
          This XAI type encompasses notable XAI techniques such as LIME and input gradient methods, including
Integrated Gradients [
          <xref ref-type="bibr" rid="ref29">58</xref>
          ] and Grad-CAM [
          <xref ref-type="bibr" rid="ref30">59</xref>
          ]. Recent studies employing LIME include those conducted
by Resck et al. [41] and Bhambhoria et al. [43]. Similarly, Benedetto et al. [42] and Semo et al. [50]
have undertaken investigations utilizing input gradient techniques. In several studies within this XAI
type, researchers have employed a combination of, or explored, at least two different approaches (types
or techniques) to provide explanations, e.g., Górski and Ramakrishna [45] and Norkute et al. [
          <xref ref-type="bibr" rid="ref21">37</xref>
          ].
Noteworthy is the work by Benedetto et al. [42], which distinguishes itself by offering explanations
at the sentence level and by conducting comparisons against ground truth. Conversely, other works
primarily generate explanations at the word level. Information retrieval frameworks, e.g., text similarity,
are employed by Resck et al. [41] and Landthaler et al. [
          <xref ref-type="bibr" rid="ref26">55</xref>
          ] — the retrieval is explained with additional
text similarity and LIME, respectively. In machine summarization, Norkute et al. [
          <xref ref-type="bibr" rid="ref21">37</xref>
          ] also explore
whether adding textual similarity highlights as an explanation can help users evaluate the summarization
of legal documents.
        </p>
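        <p>The local post-hoc idea behind LIME can be roughly illustrated as follows. This is a deliberately simplified, self-contained sketch: the black box and all names are assumptions, and the presence/absence score averaging stands in for LIME's locally weighted linear surrogate. It is not the actual LIME library nor any surveyed pipeline.</p>

```python
import random

def toy_black_box(words):
    # Stand-in black box (an assumption of this sketch): sums
    # hand-picked cue weights over the words it receives.
    cues = {"breach": 1.0, "damages": 1.0, "dismissed": -2.0}
    return sum(cues.get(w, 0.0) for w in words)

def lime_style_weights(words, predict, n_samples=2000, seed=0):
    """Perturbation-based local explanation: the weight of word i is
    the mean black-box score over perturbations keeping word i minus
    the mean over perturbations masking it (a drastic simplification
    of LIME's locally weighted linear surrogate)."""
    rng = random.Random(seed)
    kept = [[] for _ in words]
    dropped = [[] for _ in words]
    for _ in range(n_samples):
        mask = [rng.choice((True, False)) for _ in words]
        score = predict([w for w, m in zip(words, mask) if m])
        for i, m in enumerate(mask):
            (kept if m else dropped)[i].append(score)
    return [
        sum(kept[i]) / max(len(kept[i]), 1)
        - sum(dropped[i]) / max(len(dropped[i]), 1)
        for i in range(len(words))
    ]

sentence = "the claim for damages was dismissed".split()
weights = lime_style_weights(sentence, toy_black_box)
# "damages" receives a clearly positive weight, "dismissed" a clearly
# negative one, and filler words stay near zero.
```

        <p>Because the explanation is computed only from queries to the trained model, any classifier can be explained this way after the fact, which is precisely what makes the technique post-hoc.</p>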
        <p>In one particularly interesting approach, given a primary model tasked with a main classification,
a secondary model autonomously computes additional pertinent classes or text segments capable of
elucidating the prediction. Representative works have emanated from the competitions COLIEE 2019
[47] and 2020 [46]. Due to the independent nature of the secondary model from the main model’s
predictions, the former can generate predictions in advance of the latter. Consequently, the nomenclature
post-hoc may not entirely encapsulate the essence of this technique.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Local Self-explaining</title>
        <p>
          This XAI type primarily employs attention weights of deep learning architectures, e.g.,
Transformer-based [22] and LSTM-based [
          <xref ref-type="bibr" rid="ref22">38</xref>
          ] models, as its main approach — an exception is the work by Zhou et al.
[
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], which employs classifier weights, commonly used for global explanations but aimed at individual
samples. Typically, local self-explaining methods, except for provenance-based, emphasize the word
level [
          <xref ref-type="bibr" rid="ref16 ref17 ref22">32, 41, 33, 38</xref>
          ]. However, in the study by Zhao et al. [22], attention scores are sometimes extended
to encompass entire sentences, thereby providing an alternative explanation at the textual level. For
instance, a whole factual statement may be deemed significant at the sentence level. In contrast,
mentioning a concept may hold importance at the word level.
        </p>
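        <p>For intuition, the attention-weight mechanism these works read explanations from can be sketched in a few lines. The tokens and logit values below are invented for illustration; a real model derives the logits from learned query-key interactions. Attention logits over tokens are normalized with a softmax, and the resulting weights are displayed directly as word-level highlights.</p>

```python
import math

def softmax(logits):
    # Subtract the maximum logit for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy attention logits for each token (illustrative values only):
tokens = ["defendant", "breached", "the", "contract"]
logits = [1.0, 3.0, 0.1, 2.5]

weights = softmax(logits)
# The weights sum to 1; "breached" gets the largest attention weight
# and would be rendered as the strongest highlight.
```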
        <p>
          In the context of legal summarization, Nielsen et al. [
          <xref ref-type="bibr" rid="ref17">33</xref>
          ] and Norkute et al. [
          <xref ref-type="bibr" rid="ref21">37</xref>
          ] have explored the
use of attention highlights as explanations. When evaluating legal document summarization, attention
highlights improved completion time, trust, and preference [
          <xref ref-type="bibr" rid="ref21">37</xref>
          ]; the use of attention highlights did not
affect the temporal allocation of user attention, though there is evidence that the spatiotemporal
allocation of attention was affected [
          <xref ref-type="bibr" rid="ref17">33</xref>
          ]. Similarly, Ye et al. [39] explored attention scores to interpret the text generation of court
views, which are analogous to natural language explanations for charge predictions.
        </p>
        <p>
          In a different approach, secondary models predict other relevant and interpretable labels, which
subsequently serve as features for the primary model responsible for the main prediction task. This
approach encompasses a substantial body of work [
          <xref ref-type="bibr" rid="ref10 ref11 ref15 ref9">25, 26, 23, 27, 31</xref>
          ] and is a subset of provenance-based
methods.
        </p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Global Self-explaining</title>
        <p>
While less commonly observed in the reviewed literature, this XAI type offers valuable insights. Aletras
et al. [11] and Medvedeva et al. [
          <xref ref-type="bibr" rid="ref28">57</xref>
          ] employ the feature importance of an SVM model, achieved through
an analysis of the SVM kernel weights. Similarly, de Arriba-Pérez et al. [
          <xref ref-type="bibr" rid="ref27">56</xref>
          ] and González-González
et al. [21] leverage declarative induction within a Random Forest model. This entails identifying,
for any given class, all the tree paths from root to leaf that contribute to the score of the respective
class. Both methodologies apply the models to text that has undergone preprocessing using TF-IDF.
Additionally, Strickson and De La Iglesia [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] directly analyze the most important TF-IDF features. The
inherent simplicity of these techniques plays a crucial role in generating thoroughly explainable models.
Remarkably, they consistently demonstrate commendable performance despite the anticipated trade-off
between interpretability and performance [
          <xref ref-type="bibr" rid="ref31 ref32 ref33">60, 61, 62</xref>
          ].
        </p>
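        <p>The declarative-induction step of reading off, for a given class, every root-to-leaf path that votes for that class can be sketched over a hand-built toy tree. The features, branches, and class labels below are invented for illustration; the surveyed works derive such trees from TF-IDF-preprocessed text with Random Forests.</p>

```python
# Toy decision tree as nested tuples (invented for illustration):
#   internal node: (feature, low_subtree, high_subtree), branching on
#   a low vs. high TF-IDF value of the feature
#   leaf: predicted class label (a string)
TREE = (
    "appeal",
    ("contract", "civil", "contract-law"),
    ("granted", "appeal-denied", "appeal-granted"),
)

def paths_for_class(node, target, prefix=()):
    """Enumerate every root-to-leaf path whose leaf predicts `target`.
    Each path is a tuple of (feature, branch) decisions, i.e., a
    human-readable rule explaining the class globally."""
    if isinstance(node, str):  # leaf node
        return [prefix] if node == target else []
    feature, low, high = node
    return (
        paths_for_class(low, target, prefix + ((feature, "low"),))
        + paths_for_class(high, target, prefix + ((feature, "high"),))
    )

rules = paths_for_class(TREE, "appeal-granted")
# Yields one rule: TF-IDF of "appeal" is high AND "granted" is high.
```

        <p>Because every prediction is the conjunction of such readable conditions, the model is explainable by construction, which is the appeal of this XAI type despite its simplicity.</p>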
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Ethics Discussion</title>
      <p>In this work, we identify papers that explicitly address ethical concerns, particularly within the context
of XAI (Section 3.2). This section discusses these works, the necessity of addressing ethics in XAI, and
the ethical implications of applying XAI in the legal domain.</p>
      <sec id="sec-5-1">
        <title>5.1. Ethics Mentions</title>
        <p>
          The ongoing discourse on ethics guidelines for developing and operating AI systems emphasizes
the importance of explainability, transparency, and accountability as ethical principles within the AI
domain. Systematic and scoping reviews substantiate that these principles rank among the most frequently
referenced in this field [
          <xref ref-type="bibr" rid="ref34 ref35 ref36">63, 64, 65</xref>
          ]. Floridi and Cowls [
          <xref ref-type="bibr" rid="ref37">66</xref>
          ] assert that “explicability,” construed as
“intelligibility” and “accountability,” stands as the singular novel structural principle that has been
appended to the established quartet of bioethics principles — “beneficence, non-maleficence, autonomy,
and justice.”
        </p>
        <p>
          Given the prominence of these ethical tenets, it is surprising that only ten papers broach the ethical
implications of AI applications, with a mere five specifically addressing the ethical facets of XAI
[
          <xref ref-type="bibr" rid="ref10 ref11 ref17">42, 33, 26, 27, 49</xref>
          ]. Among these, two delve into how their proposed XAI solutions ameliorate ethical
concerns, particularly regarding fairness and non-discrimination in legal cases, outperforming similar
techniques [
          <xref ref-type="bibr" rid="ref11">42, 27</xref>
          ]. Wu et al. [
          <xref ref-type="bibr" rid="ref10">26</xref>
          ] proactively include a disclaimer elucidating that their framework
ought to be perceived as an auxiliary tool for judges rather than an automatic decision-making system,
a distinction made on ethical grounds. Valvoda and Cotterell [49] suggest careful use of their work to
automate legal decisions, given that their results indicate unaligned precedents between models and
judges. Nielsen et al. [
          <xref ref-type="bibr" rid="ref17">33</xref>
          ] call for legal ethicists’ attention to a specific experimental result.
        </p>
        <p>
          It is not obvious why most categorized papers fail to fully address the ethical concerns of
their work in a domain as sensitive as law. Perhaps the lack of ethical discussion stems from the focus
on technical aspects, the main goal in NLP and machine learning research, which could erroneously
suggest that no ethical issues exist [
          <xref ref-type="bibr" rid="ref38">67</xref>
          ]. Additionally, authors may be discouraged by the lack of a dedicated
space in targeted venues. For instance, there was no extra space for ethical considerations, limitations,
and impact statements in *ACL publications until 2021 [
          <xref ref-type="bibr" rid="ref38">67</xref>
          ].
        </p>
      </sec>
      <sec id="sec-5-2">
        <title>5.2. Ethical Implications</title>
        <p>Notably, the categorized papers generally do not furnish exhaustive accounts of the ethical implications
of the explainability principle in the context of NLP within the legal domain. For instance, there exists
potential for discerning between a “legal explanation” and a “model explanation” (refer to Section 6.1),
given the longstanding academic discourse on what constitutes sound legal reasoning or a morally and
legally sound decision. This discussion is introduced by Atkinson et al. [18] (not categorized, as it is a survey), in alignment with Robbins
[68], who provides a more nuanced perspective on the “explicability principle” itself and critiques the
prevailing notion that it should encompass an explication of the algorithm’s decision-making process.
The author contends in favor of elucidating results rather than processes. Subsequently, Robbins [68]
expounds on two overarching approaches to XAI and addresses certain misconceptions about this
principle. Even the critiques articulated by Robbins [68] and Atkinson et al. [18] do not fully establish
the value of an explanation in the realm of legal decision-making (Section 6.1).</p>
        <p>
          Additional challenges and considerations in explaining AI within the legal domain include the need
for judges to maintain control over automated decision-making systems and fully understand their
processes [
          <xref ref-type="bibr" rid="ref10">26</xref>
          ]. This requirement is critical for AI models to function as supportive tools rather than
replace human judgment, thereby reducing the risk of discrimination and bias inherent in models
and datasets, which can be particularly damaging in sensitive areas such as family law. Ensuring
transparency and fairness in legal case decisions is essential to avoid unjust outcomes [
          <xref ref-type="bibr" rid="ref11">42, 27</xref>
          ].
        </p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Open Issues and Future Work</title>
      <p>Despite advances in XAI within NLLP, several unresolved challenges persist. This section outlines some
of these issues and suggests potential research directions. We explore the role and value of explanations
in the legal domain, propose ways to enhance NLP systems using explanations, and highlight limitations
in the current literature.</p>
      <sec id="sec-6-1">
        <title>6.1. Role and Value of Explanations</title>
        <p>
          Researchers exert significant efforts to keep XAI in step with the dynamic landscape of NLP. However,
a more concerted effort is needed to contemplate XAI’s role and implications in the legal domain specifically.
Explanations are central in most automated decision regulations [
          <xref ref-type="bibr" rid="ref34 ref35 ref36">63, 64, 65</xref>
          ], being deemed critical
for ensuring quality control, accountability, and justice [
          <xref ref-type="bibr" rid="ref37">66</xref>
          ]. The ethical consequences of algorithms
impacting decisions on critical legal matters make the need for clear and interpretable explanations
even more pressing, as they help alleviate moral concerns. Explanations are pivotal for several legal
stakeholders: they empower judges in their decision-making process [
          <xref ref-type="bibr" rid="ref10">26</xref>
          ], assist lawyers and other
experts in the analysis of court understandings [41], support the evaluation of AI systems by model
creators [
          <xref ref-type="bibr" rid="ref21">37</xref>
          ], and provide users with the ability to understand AI-driven decisions, supporting the
“right to explanation” [69] and fostering trust in the model. However, this area remains underexplored in the literature.
While complex, a stronger focus on XAI’s legal and ethical implications could enhance its perceived
importance and the resources devoted to advancing it.
        </p>
        <p>
          XAI aims to elucidate, in terms we can understand, how or why an algorithm arrives at a specific
conclusion. Although this conclusion may align with or contribute to a legal evaluation, the factors influencing
algorithmic decisions often diverge significantly from the reasoning employed by legal practitioners.
For instance, an AI model’s decision-making process may differ from a judge’s, even though they can
agree on the decision itself. This discrepancy raises a fundamental question regarding the role of XAI
within the legal framework: Should explanations be confined to ensuring the decision is robust, meaning
any other legal operator would arrive at the same conclusion, or should they also provide insights
into the critical legal factors at play? In other words, should the explanation elucidate the juridical
reasoning or the machine learning model’s decision-making process? The former is indispensable to
legal reasoning, given that societal shifts or alterations in the interpretation of legal principles may lead
to different conclusions compared to well-established legal precedents. Meanwhile, the latter is crucial
for understanding the model’s inner workings and ensuring that it is not biased or discriminatory
[
          <xref ref-type="bibr" rid="ref11">42, 27</xref>
          ]. In this sense, Atkinson et al. [18] and Robbins [68] argue that the explanation should focus
on elucidating the legal outcome rather than the AI’s internal processes. To support this position, Robbins [68]
assumes that the only object requiring an explanation is the juridical decision. However, as we argue
in this work, an explanation has value beyond simply justifying a legal decision and is essential
to different stakeholders. Understanding both the legal reasoning and the model’s decision-making
process is crucial for ensuring transparency, accountability, and trust in AI systems.
        </p>
      </sec>
      <sec id="sec-6-2">
        <title>6.2. Directions for Improvement with XAI</title>
        <p>Most studies reviewed do not leverage XAI techniques to expose the limitations of NLP systems. A
notable exception is Bhambhoria et al. [43], where the authors observe that the Longformer model
[70] is less reliable and more susceptible to spurious correlations than the XGBoost
model [71], despite the former’s greater accuracy. Ideally, XAI insights could be used to identify specific
scenarios where an NLP model excels or struggles. This would allow researchers to improve model
performance while providing users with crucial information about the contexts in which the model is
most dependable — a particularly important consideration in the legal domain, where the stakes of a
model’s failure can be significant.</p>
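<p>To make this concrete, consider the following minimal sketch (illustrative only; the helper name and toy data are hypothetical, not drawn from any reviewed work). Once an attribution method flags a token the model appears to rely on spuriously, one can compare the model’s error rate on test samples that do and do not contain that token, exposing a scenario where the model struggles:</p>

```python
def error_rate_by_slice(records, spurious_token):
    """records: list of (tokens, is_correct) pairs for a labeled test set.

    Splits the set on whether a token flagged by an attribution method
    (e.g., LIME or SHAP) is present, and returns the error rate on each slice.
    """
    on = [ok for toks, ok in records if spurious_token in toks]
    off = [ok for toks, ok in records if spurious_token not in toks]
    rate = lambda xs: 1 - sum(xs) / len(xs) if xs else float("nan")
    return rate(on), rate(off)

# Toy test set: (tokens of the document, 1 if the model classified it correctly).
toy = [
    (["habeas", "corpus"], 1),
    (["habeas", "denied"], 0),
    (["contract", "breach"], 1),
    (["contract", "term"], 1),
]
print(error_rate_by_slice(toy, "habeas"))  # (0.5, 0.0)
```

<p>A large gap between the two rates suggests the flagged token marks a context in which the model is unreliable, which is exactly the kind of information legal users need before trusting a prediction.</p>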
        <p>Further insights into the limitations of NLP systems could be gleaned by incorporating model
confidence scores into the analysis, especially given that most works fall under the “Classification”
category (Table 1). Understanding a model’s limitations within this context is of paramount importance.
An instance of misclassification with low confidence is expected. Conversely, a misclassification by an
overconfident model poses greater risks. With the aid of XAI techniques, the former can be linked to a
deficiency of relevant features for the model. In contrast, the latter can be attributed to a feature that
the model misuses, potentially revealing issues in the training process or model selection. Moreover,
confidence scores play a crucial role in score calibration, a vital aspect of providing users of an NLP
system with interpretable probabilities of model accuracy. Unfortunately, not all models exhibit
wellcalibrated curves, presenting a challenging hurdle. Thus, demonstrating the relevance of calibration
methods and results in the context of XAI is imperative. Most of the reviewed works make no mention
of this crucial aspect — notable exceptions are Resck et al. [41] and Semo et al. [50]. Accurate probability
estimates from machine learning classifiers help legal stakeholders assess their confidence in the model’s
decisions, preventing overreliance on incorrect predictions.</p>
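<p>A standard way to quantify (mis)calibration is the expected calibration error (ECE): predictions are binned by confidence, and the gap between each bin’s average confidence and its empirical accuracy is averaged, weighted by bin size. The following is a minimal, self-contained sketch with toy numbers (not taken from any reviewed system):</p>

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: bin predictions by confidence; return the bin-size-weighted
    mean of |empirical accuracy - average confidence| over the bins."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # clamp conf == 1.0
        bins[idx].append((conf, ok))
    total = len(confidences)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(ok for _, ok in bucket) / len(bucket)
        ece += (len(bucket) / total) * abs(accuracy - avg_conf)
    return ece

# A well-calibrated toy model: 80% confidence, correct 80% of the time.
print(expected_calibration_error([0.8] * 10, [1] * 8 + [0] * 2))  # close to 0.0
# An overconfident one: 90% confidence, correct only half the time.
print(expected_calibration_error([0.9] * 10, [1] * 5 + [0] * 5))  # close to 0.4
```

<p>Reporting a metric of this kind alongside accuracy would let legal stakeholders judge whether a model’s stated probabilities can be taken at face value.</p>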
      </sec>
      <sec id="sec-6-3">
        <title>6.3. Limitations</title>
        <p>
          Several limitations in the reviewed literature are worth highlighting. A key issue is the small number
of works focusing on global explanations, and in particular the absence of global post-hoc explanations.
Global methods help users understand the model’s overall behavior, which matters to legal stakeholders,
while post-hoc methods are well suited to black-box models that are not inherently interpretable. Other
surveys, such as that of Danilevsky et al. [13], have also noted the scarcity of global explanations.
Typically, post-hoc explanations — such as LIME [20], SHAP [72], and input gradients [
          <xref ref-type="bibr" rid="ref29">58</xref>
          ] — are employed to explain the
model’s decision-making process using a specific sample, which may partly explain the absence of
global post-hoc explanations.
        </p>
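<p>One common route from local to global post-hoc explanations is aggregation: averaging the magnitude of per-sample attributions across a corpus to obtain a corpus-level feature ranking. The sketch below is a hypothetical illustration of this idea (function name and toy attributions are ours, not from any cited method):</p>

```python
from collections import defaultdict

def aggregate_local_explanations(local_explanations):
    """local_explanations: list of {token: attribution} dicts, one per sample,
    as a local method such as LIME or SHAP would produce.

    Returns tokens ranked by mean absolute attribution across all samples
    in which they received a weight, i.e., a simple global view."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for expl in local_explanations:
        for token, weight in expl.items():
            sums[token] += abs(weight)
            counts[token] += 1
    return sorted(
        ((tok, sums[tok] / counts[tok]) for tok in sums),
        key=lambda kv: kv[1],
        reverse=True,
    )

# Toy local explanations for two documents.
toy = [
    {"appeal": 0.7, "denied": -0.4},
    {"appeal": 0.5, "granted": 0.3},
]
print(aggregate_local_explanations(toy))  # "appeal" ranks first
```

<p>Aggregation of this kind inherits the limitations of the underlying local method, but it offers a low-cost starting point for the global post-hoc analyses currently missing from the literature.</p>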
        <p>
          Another area for improvement is the evaluation of explanations in NLLP. While benchmarks for
explanations exist [73], they are not standardized, especially within the legal domain. The lack of a
consistent benchmarking framework hinders the evaluation, validation, and comparison of different explanation
methods. This is a significant gap that must be addressed to advance the field. Finally, the effectiveness
of certain types of explanations, such as attention scores, widely used by local self-explaining methods
[
          <xref ref-type="bibr" rid="ref12 ref16 ref17 ref18 ref21 ref22">32, 34, 22, 33, 37, 28, 38, 39</xref>
          ], has been debated in previous work [74, 75].
        </p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>7. Conclusion</title>
      <p>This paper presents a comprehensive survey on the intersection of Explainable AI and Natural Legal
Language Processing. We compile a wide range of studies that apply explainability techniques to NLLP
tasks and categorize them based on a taxonomy derived from previous research, including explanation
types, techniques, and NLP tasks. Through this categorization, we identify trends in how XAI is
being applied to NLLP. We also examine works incorporating ethical considerations and discuss
the implications of using XAI in the legal domain. Our findings indicate that most papers do not fully
address the ethical concerns associated with their research. We outline the challenges and emphasize
the need to prioritize ethical considerations when applying XAI to legal contexts. Finally, we discuss
open issues and propose directions for future research, particularly focusing on the role and value
of explanations in the legal domain and potential strategies for enhancing NLP systems with more
effective explanations.</p>
    </sec>
    <sec id="sec-8">
      <title>Declaration on Generative AI</title>
      <p>AI tools (e.g., Gemini) were employed to assist with specific tasks, including text refinement, enhancing
overall workflow efficiency. The authors meticulously reviewed all AI-assisted outputs and bear full
responsibility for the final content of this manuscript.
[9] V. G. F. Bertalan, E. E. S. Ruiz, Predicting judicial outcomes in the brazilian legal system using
textual features, in: DHandNLP@PROPOR, 2020. URL: https://api.semanticscholar.org/CorpusID:
218906340.
[10] A. Lage-Freitas, H. Allende-Cid, O. V. Santana, L. de Oliveira-Lage, Predicting brazilian court
decisions, PeerJ Computer Science 8 (2019). URL: https://api.semanticscholar.org/CorpusID:165164045.
[11] N. Aletras, D. Tsarapatsanis, D. Preoţiuc-Pietro, V. Lampos, Predicting judicial decisions of the
european court of human rights: A natural language processing perspective, PeerJ computer
science 2 (2016) e93.
[12] K. Qian, M. Danilevsky, Y. Katsis, B. Kawas, E. Oduor, L. Popa, Y. Li, XNLP: A Living Survey for XAI
Research in Natural Language Processing, in: 26th International Conference on Intelligent User
Interfaces - Companion, IUI ’21 Companion, Association for Computing Machinery, New York,
NY, USA, 2021, pp. 78–80. URL: https://dl.acm.org/doi/10.1145/3397482.3450728. doi:10.1145/
3397482.3450728.
[13] M. Danilevsky, K. Qian, R. Aharonov, Y. Katsis, B. Kawas, P. Sen, A Survey of the State of
Explainable AI for Natural Language Processing, in: Proceedings of the 1st Conference of the
Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International
Joint Conference on Natural Language Processing, Association for Computational Linguistics,
Suzhou, China, 2020, pp. 447–459. URL: https://aclanthology.org/2020.aacl-main.46.
[14] H. Zhao, H. Chen, F. Yang, N. Liu, H. Deng, H. Cai, S. Wang, D. Yin, M. Du, Explainability for
Large Language Models: A Survey, ACM Transactions on Intelligent Systems and Technology 15
(2024) 20:1–20:38. URL: https://dl.acm.org/doi/10.1145/3639372. doi:10.1145/3639372.
[15] L. Resck, I. Augenstein, A. Korhonen, Explainability and Interpretability of Multilingual Large</p>
      <p>Language Models: A Survey, 2025. URL: https://openreview.net/forum?id=KQjVhM2YhN.
[16] L. H. Gilpin, D. Bau, B. Z. Yuan, A. Bajwa, M. Specter, L. Kagal, Explaining Explanations: An
Overview of Interpretability of Machine Learning, in: 2018 IEEE 5th International Conference
on Data Science and Advanced Analytics (DSAA), IEEE, Turin, Italy, 2018, pp. 80–89. URL: https:
//ieeexplore.ieee.org/abstract/document/8631448. doi:10.1109/DSAA.2018.00018.
[17] G. Schwalbe, B. Finzel, A comprehensive taxonomy for explainable artificial intelligence: a
systematic survey of surveys on methods and concepts, Data Mining and Knowledge Discovery
(2023). URL: https://doi.org/10.1007/s10618-022-00867-8. doi:10.1007/s10618-022-00867-8.
[18] K. Atkinson, T. Bench-Capon, D. Bollegala, Explanation in AI and law: Past, present and
future, Artificial Intelligence 289 (2020) 103387. URL: https://linkinghub.elsevier.com/retrieve/pii/
S0004370220301375. doi:10.1016/j.artint.2020.103387.
[19] L. K. Branting, C. Pfeifer, B. Brown, L. Ferro, J. Aberdeen, B. Weiss, M. Pfaff, B. Liao, Scalable
and explainable legal prediction, Artificial Intelligence and Law 29 (2021) 213–238. URL: https:
//doi.org/10.1007/s10506-020-09273-1. doi:10.1007/s10506-020-09273-1.
[20] M. T. Ribeiro, S. Singh, C. Guestrin, "Why Should I Trust You?": Explaining the Predictions of Any
Classifier, in: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge
Discovery and Data Mining, KDD ’16, Association for Computing Machinery, New York, NY,
USA, 2016, pp. 1135–1144. URL: https://doi.org/10.1145/2939672.2939778. doi:10.1145/2939672.
2939778.
[21] J. González-González, F. de Arriba-Pérez, S. García-Méndez, A. Busto-Castiñeira, F. J.
González-Castaño, Automatic explanation of the classification of Spanish legal judgments in
jurisdiction-dependent law categories with tree estimators, Journal of King Saud University - Computer and
Information Sciences 35 (2023) 101634. URL: https://www.sciencedirect.com/science/article/pii/
S131915782300188X. doi:10.1016/j.jksuci.2023.101634.
[22] Q. Zhao, T. Gao, N. Guo, Legal Judgment Prediction Via Legal Knowledge Fusion and Prompt</p>
      <p>Learning, 2023. URL: https://papers.ssrn.com/abstract=4341600. doi:10.2139/ssrn.4341600.
[23] R. Bhambhoria, H. Liu, S. Dahan, X. Zhu, Interpretable low-resource legal decision making, in:</p>
      <p>Proceedings of the AAAI Conference on Artificial Intelligence, volume 36, 2022, pp. 11819–11827.
[24] L. Li, L. Zhao, P. Nai, X. Tao, Charge prediction modeling with interpretation enhancement driven
by double-layer criminal system, World Wide Web 25 (2022) 381–400. URL: https://doi.org/10.1007/s11280-021-00873-8. doi:10.1007/s11280-021-00873-8.
classification of legislative contents, in: Digital Libraries for Open Knowledge: 23rd International
Conference on Theory and Practice of Digital Libraries, TPDL 2019, Oslo, Norway, September
9-12, 2019, Proceedings 23, Springer, 2019, pp. 238–252.
[39] H. Ye, X. Jiang, Z. Luo, W. Chao, Interpretable Charge Predictions for Criminal Cases: Learning to
Generate Court Views from Fact Descriptions, in: M. Walker, H. Ji, A. Stent (Eds.), Proceedings of
the 2018 Conference of the North American Chapter of the Association for Computational
Linguistics: Human Language Technologies, Volume 1 (Long Papers), Association for Computational
Linguistics, New Orleans, Louisiana, 2018, pp. 1854–1864. URL: https://aclanthology.org/N18-1168.
doi:10.18653/v1/N18-1168.
[40] J. Mumford, K. Atkinson, T. Bench-Capon, Combining a Legal Knowledge Model with Machine
Learning for Reasoning with Legal Cases, in: Proceedings of the Nineteenth International
Conference on Artificial Intelligence and Law, ICAIL ’23, Association for Computing Machinery,
New York, NY, USA, 2023, pp. 167–176. URL: https://dl.acm.org/doi/10.1145/3594536.3595158.
doi:10.1145/3594536.3595158.
[41] L. E. Resck, J. R. Ponciano, L. G. Nonato, J. Poco, Legalvis: Exploring and inferring precedent
citations in legal documents, IEEE Transactions on Visualization and Computer Graphics (2023).
[42] I. Benedetto, A. Koudounas, L. Vaiani, E. Pastor, E. Baralis, L. Cagliero, F. Tarasconi, PoliToHFI
at SemEval-2023 task 6: Leveraging entity-aware and hierarchical transformers for legal entity
recognition and court judgment prediction, in: A. K. Ojha, A. S. Doğruöz, G. Da San Martino,
H. Tayyar Madabushi, R. Kumar, E. Sartori (Eds.), Proceedings of the 17th International Workshop
on Semantic Evaluation (SemEval-2023), Association for Computational Linguistics, Toronto,
Canada, 2023, pp. 1401–1411. URL: https://aclanthology.org/2023.semeval-1.194/. doi:10.18653/
v1/2023.semeval-1.194.
[43] R. Bhambhoria, S. Dahan, X. Zhu, Investigating the state-of-the-art performance and explainability
of legal judgment prediction, in: Canadian Conference on AI, 2021.
[44] L. E. R. Domingues, Inferring and explaining potential citations to binding precedents in brazilian
supreme court decisions (2021).
[45] Ł. Górski, S. Ramakrishna, Explainable artificial intelligence, lawyer’s perspective, in: Proceedings
of the Eighteenth International Conference on Artificial Intelligence and Law, 2021, pp. 60–68.
[46] J. Rabelo, M.-Y. Kim, R. Goebel, M. Yoshioka, Y. Kano, K. Satoh, Coliee 2020: methods for legal
document retrieval and entailment, in: New Frontiers in Artificial Intelligence: JSAI-isAI 2020
Workshops, JURISIN, LENLS 2020 Workshops, Virtual Event, November 15–17, 2020, Revised
Selected Papers 12, Springer, 2021, pp. 196–210.
[47] J. Rabelo, M.-Y. Kim, R. Goebel, M. Yoshioka, Y. Kano, K. Satoh, A summary of the coliee 2019
competition, in: New Frontiers in Artificial Intelligence: JSAI-isAI International Workshops,
JURISIN, AI-Biz, LENLS, Kansei-AI, Yokohama, Japan, November 10–12, 2019, Revised Selected
Papers 10, Springer, 2020, pp. 34–49.
[48] R. Chhatwal, P. Gronvall, N. Huber-Fliflet, R. Keeling, J. Zhang, H. Zhao, Explainable Text
Classification in Legal Document Review: A Case Study of Explainable Predictive Coding, in:
2018 IEEE International Conference on Big Data (Big Data), 2018, pp. 1905–1911. URL: https:
//ieeexplore.ieee.org/document/8622073. doi:10.1109/BigData.2018.8622073.
[49] J. Valvoda, R. Cotterell, Towards Explainability in Legal Outcome Prediction Models, in: K. Duh,
H. Gomez, S. Bethard (Eds.), Proceedings of the 2024 Conference of the North American Chapter
of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long
Papers), Association for Computational Linguistics, Mexico City, Mexico, 2024, pp. 7269–7289. URL:
https://aclanthology.org/2024.naacl-long.404/. doi:10.18653/v1/2024.naacl-long.404.
[50] G. Semo, D. Bernsohn, B. Hagag, G. Hayat, J. Niklaus, Classactionprediction: A challenging
benchmark for legal judgment prediction of class action cases in the us, arXiv preprint arXiv:2211.00582
(2022).
[51] S. T.y.s.s, S. Xu, O. Ichim, M. Grabmair, Deconfounding legal judgment prediction for european court
of human rights cases towards better alignment with experts, in: Y. Goldberg, Z. Kozareva, Y. Zhang
(Eds.), Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing,
Association for Computational Linguistics, Abu Dhabi,
United Arab Emirates, 2022, pp. 4509–4516. URL: https://aclanthology.org/2022.emnlp-main.299.
doi:10.18653/v1/2022.emnlp-main.299.
[68] S. Robbins, A Misdirected Principle with a Catch: Explicability for AI, Minds and
Machines 29 (2019) 495–514. URL: https://doi.org/10.1007/s11023-019-09509-3. doi:10.1007/
s11023-019-09509-3.
[69] A. Selbst, J. Powles, “Meaningful Information” and the Right to Explanation, in: Proceedings of
the 1st Conference on Fairness, Accountability and Transparency, volume 81 of Proceedings of
Machine Learning Research, PMLR, New York, 2018, pp. 48–48. URL: https://proceedings.mlr.press/
v81/selbst18a.html, iSSN: 2640-3498.
[70] I. Beltagy, M. E. Peters, A. Cohan, Longformer: The long-document transformer, ArXiv
abs/2004.05150 (2020). URL: https://api.semanticscholar.org/CorpusID:215737171.
[71] T. Chen, T. He, M. Benesty, V. Khotilovich, Y. Tang, H. Cho, K. Chen, R. Mitchell, I. Cano, T. Zhou,
et al., Xgboost: extreme gradient boosting, R package version 0.4-2 1 (2015) 1–4.
[72] S. M. Lundberg, S.-I. Lee, A Unified Approach to Interpreting Model Predictions,
in: Advances in Neural Information Processing Systems 30 (NIPS 2017), volume 30,
Curran Associates, Inc., 2017. URL: https://proceedings.neurips.cc/paper/2017/hash/
8a20a8621978632d76c43dfd28b67767-Abstract.html.
[73] C. Agarwal, S. Krishna, E. Saxena, M. Pawelczyk, N. Johnson, I. Puri, M. Zitnik, H. Lakkaraju,
OpenXAI: towards a transparent evaluation of post hoc model explanations, in: Proceedings of
the 36th International Conference on Neural Information Processing Systems, NIPS ’22, Curran
Associates Inc., Red Hook, NY, USA, 2024, pp. 15784–15799. URL: https://dl.acm.org/doi/10.5555/
3600270.3601418.
[74] S. Jain, B. C. Wallace, Attention is not Explanation, in: J. Burstein, C. Doran, T. Solorio (Eds.),
Proceedings of the 2019 Conference of the North American Chapter of the Association for
Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers),
Association for Computational Linguistics, Minneapolis, Minnesota, 2019, pp. 3543–3556. URL:
https://aclanthology.org/N19-1357. doi:10.18653/v1/N19-1357.
[75] S. Wiegrefe, Y. Pinter, Attention is not not Explanation, in: K. Inui, J. Jiang, V. Ng, X. Wan (Eds.),
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the
9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Association
for Computational Linguistics, Hong Kong, China, 2019, pp. 11–20. URL: https://aclanthology.org/
D19-1002. doi:10.18653/v1/D19-1002.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>D. M.</given-names>
            <surname>Katz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Hartung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Gerlach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Jana</surname>
          </string-name>
          ,
          <string-name>
            <surname>M. J. Bommarito</surname>
            <given-names>II</given-names>
          </string-name>
          ,
          <source>Natural Language Processing in the Legal Domain</source>
          ,
          <year>2023</year>
          . URL: http://arxiv.org/abs/2302.12039. doi:
          <volume>10</volume>
          .48550/arXiv.2302.12039, arXiv:
          <fpage>2302</fpage>
          .12039 [cs].
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kuang</surname>
          </string-name>
          ,
          <article-title>Lk-ib: a hybrid framework with legal knowledge injection for compulsory measure prediction</article-title>
          ,
          <source>Artificial Intelligence and Law</source>
          (
          <year>2023</year>
          )
          <fpage>1</fpage>
          -
          <lpage>26</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Xiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Tu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <article-title>How Does NLP Benefit Legal System: A Summary of Legal Artificial Intelligence, in: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics</article-title>
          , Online,
          <year>2020</year>
          , pp.
          <fpage>5218</fpage>
          -
          <lpage>5230</lpage>
          . URL: https://aclanthology.org/
          <year>2020</year>
          .acl-main.
          <volume>466</volume>
          . doi:
          <volume>10</volume>
          .18653/v1/
          <year>2020</year>
          .acl-main.
          <volume>466</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>M.-Y. Kim</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <string-name>
            <surname>Goebel</surname>
          </string-name>
          ,
          <article-title>Summarization of legal texts with high cohesion and automatic compression rate</article-title>
          ,
          <source>in: JSAI-isAI Workshops</source>
          ,
          <year>2012</year>
          . URL: https://api.semanticscholar.org/CorpusID: 38025582.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>E.</given-names>
            <surname>Chieze</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Farzindar</surname>
          </string-name>
          ,
          <string-name>
            <surname>G. Lapalme,</surname>
          </string-name>
          <article-title>An automatic system for summarization and information extraction of legal information</article-title>
          ,
          <source>in: Semantic Processing of Legal Texts</source>
          ,
          <year>2010</year>
          . URL: https://api. semanticscholar.org/CorpusID:12554475.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>P.</given-names>
            <surname>H. L. de Araujo</surname>
          </string-name>
          , T. E. de Campos,
          <string-name>
            <given-names>F. A.</given-names>
            <surname>Braz</surname>
          </string-name>
          , N. C. da
          <string-name>
            <surname>Silva</surname>
          </string-name>
          ,
          <article-title>Victor: a dataset for brazilian legal documents classification</article-title>
          ,
          <source>in: International Conference on Language Resources and Evaluation</source>
          ,
          <year>2020</year>
          . URL: https://api.semanticscholar.org/CorpusID:219299779.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>N. C.</given-names>
            da
            <surname>Silva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. A.</given-names>
            <surname>Braz</surname>
          </string-name>
          , T. E. de Campos,
          <string-name>
            <given-names>A. L. P.</given-names>
            <surname>Guedes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. B.</given-names>
            <surname>Mendes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. A.</given-names>
            <surname>Bezerra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. B.</given-names>
            <surname>Gusmao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. B. S.</given-names>
            <surname>Chaves</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. G.</given-names>
            <surname>Ziegler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. H.</given-names>
            <surname>Horinouchi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. U.</given-names>
            <surname>Ferreira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. H.</given-names>
            <surname>Inazawa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. H. D.</given-names>
            <surname>Coelho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. V. C.</given-names>
            <surname>Fernandes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Peixoto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. S. M.</given-names>
            <surname>Filho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. P.</given-names>
            <surname>Sukiennik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Rosa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. P. M.</given-names>
            <surname>Silva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. A.</given-names>
            <surname>Junquilho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. H. T.</given-names>
            <surname>Carvalho</surname>
          </string-name>
          ,
          <article-title>Document type classification for brazil's supreme court using a convolutional neural network</article-title>
          ,
          <source>Proceedings of The Tenth International Conference on Forensic Computer Science and Cyber Law</source>
          (
          <year>2018</year>
          ). URL: https://api.semanticscholar.org/CorpusID:69283834.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>B.</given-names>
            <surname>Strickson</surname>
          </string-name>
          ,
          <string-name>
            <surname>B. De La Iglesia</surname>
          </string-name>
          ,
          <article-title>Legal Judgement Prediction for UK Courts</article-title>
          ,
          <source>in: Proceedings of the 3rd International Conference on Information Science and Systems</source>
          , ICISS '20,
          <string-name>
            <surname>Association</surname>
          </string-name>
          for Computing Machinery, New York, NY, USA,
          <year>2020</year>
          , pp.
          <fpage>204</fpage>
          -
          <lpage>209</lpage>
          . URL: https://doi.org/10.1145/3388176.3388183. doi:
          <volume>10</volume>
          .1145/3388176.3388183.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Lyu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Ren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Ren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <article-title>Improving legal judgment prediction through reinforced criminal element extraction</article-title>
          ,
          <source>Information Processing &amp; Management</source>
          <volume>59</volume>
          (
          <year>2022</year>
          )
          <fpage>102780</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Feng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kuang</surname>
          </string-name>
          ,
          <article-title>Towards interactivity and interpretability: A rationale-based legal judgment prediction framework</article-title>
          ,
          <source>in: Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>4787</fpage>
          -
          <lpage>4799</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Tu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <article-title>Iteratively questioning and answering for interpretable legal judgment prediction</article-title>
          ,
          <source>in: Proceedings of the AAAI Conference on Artificial Intelligence</source>
          , volume
          <volume>34</volume>
          ,
          <year>2020</year>
          , pp.
          <fpage>1250</fpage>
          -
          <lpage>1257</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>K.</given-names>
            <surname>Branting</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Weiss</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Brown</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Pfeifer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Chakraborty</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Pfaff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Yeh</surname>
          </string-name>
          ,
          <article-title>Semisupervised methods for explainable legal prediction</article-title>
          ,
          <source>in: Proceedings of the seventeenth international conference on artificial intelligence and law</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>22</fpage>
          -
          <lpage>31</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>H.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Cai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Dai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Dai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Ding</surname>
          </string-name>
          ,
          <article-title>Charge-Based Prison Term Prediction with Deep Gating Network</article-title>
          , in:
          <string-name>
            <given-names>K.</given-names>
            <surname>Inui</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Ng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wan</surname>
          </string-name>
          (Eds.),
          <source>Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)</source>
          , Association for Computational Linguistics, Hong Kong, China,
          <year>2019</year>
          , pp.
          <fpage>6362</fpage>
          -
          <lpage>6367</lpage>
          . URL: https://aclanthology.org/D19-1667. doi:10.18653/v1/D19-1667.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>X.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Ye</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Luo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Chao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <article-title>Interpretable Rationale Augmented Charge Prediction System</article-title>
          , in:
          <string-name>
            <given-names>D.</given-names>
            <surname>Zhao</surname>
          </string-name>
          (Ed.),
          <source>Proceedings of the 27th International Conference on Computational Linguistics: System Demonstrations</source>
          , Association for Computational Linguistics, Santa Fe, New Mexico,
          <year>2018</year>
          , pp.
          <fpage>146</fpage>
          -
          <lpage>151</lpage>
          . URL: https://aclanthology.org/C18-2032.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>K. D.</given-names>
            <surname>Ashley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Brüninghaus</surname>
          </string-name>
          ,
          <article-title>Automatically classifying case texts and predicting outcomes</article-title>
          ,
          <source>Artificial Intelligence and Law</source>
          <volume>17</volume>
          (
          <year>2009</year>
          )
          <fpage>125</fpage>
          -
          <lpage>165</lpage>
          . URL: https://doi.org/10.1007/s10506-009-9077-9. doi:10.1007/s10506-009-9077-9.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>V. G. F.</given-names>
            <surname>Bertalan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. E. S.</given-names>
            <surname>Ruiz</surname>
          </string-name>
          ,
          <article-title>Using attention methods to predict judicial outcomes</article-title>
          ,
          <source>Artificial Intelligence and Law</source>
          <volume>32</volume>
          (
          <year>2024</year>
          )
          <fpage>87</fpage>
          -
          <lpage>115</lpage>
          . URL: https://doi.org/10.1007/s10506-022-09342-7. doi:10.1007/s10506-022-09342-7.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>A.</given-names>
            <surname>Nielsen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Skylaki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Norkute</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Stremitzer</surname>
          </string-name>
          ,
          <article-title>Effects of XAI on legal process</article-title>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>P.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Cao</surname>
          </string-name>
          ,
          <article-title>Interpretable prison term prediction with reinforce learning and attention</article-title>
          ,
          <source>Applied Intelligence</source>
          <volume>53</volume>
          (
          <year>2023</year>
          )
          <fpage>1306</fpage>
          -
          <lpage>1323</lpage>
          . URL: https://doi.org/10.1007/s10489-022-03675-1. doi:10.1007/s10489-022-03675-1.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>I.</given-names>
            <surname>Chalkidis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Fergadiotis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Tsarapatsanis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Aletras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Androutsopoulos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Malakasiotis</surname>
          </string-name>
          ,
          <article-title>Paragraph-level rationale extraction through regularization: A case study on European Court of Human Rights cases</article-title>
          , in:
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rumshisky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zettlemoyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Hakkani-Tur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Beltagy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bethard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Cotterell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Chakraborty</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhou</surname>
          </string-name>
          (Eds.),
          <source>Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          , Association for Computational Linguistics, Online,
          <year>2021</year>
          , pp.
          <fpage>226</fpage>
          -
          <lpage>241</lpage>
          . URL: https://aclanthology.org/2021.naacl-main.22/. doi:10.18653/v1/2021.naacl-main.22.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>V.</given-names>
            <surname>Malik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Sanjay</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. K.</given-names>
            <surname>Nigam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Ghosh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. K.</given-names>
            <surname>Guha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bhattacharya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Modi</surname>
          </string-name>
          ,
          <article-title>ILDC for CJPE: Indian legal documents corpus for court judgment prediction and explanation</article-title>
          , in:
          <string-name>
            <given-names>C.</given-names>
            <surname>Zong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Xia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Navigli</surname>
          </string-name>
          (Eds.),
          <source>Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)</source>
          , Association for Computational Linguistics, Online,
          <year>2021</year>
          , pp.
          <fpage>4046</fpage>
          -
          <lpage>4062</lpage>
          . URL: https://aclanthology.org/2021.acl-long.313/. doi:10.18653/v1/2021.acl-long.313.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>M.</given-names>
            <surname>Norkute</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Herger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Michalak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mulder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <article-title>Towards Explainable AI: Assessing the Usefulness and Impact of Added Explainability Features in Legal Document Summarization</article-title>
          ,
          <source>in: Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems</source>
          , CHI EA '21, Association for Computing Machinery, New York, NY, USA,
          <year>2021</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>7</lpage>
          . URL: https://doi.org/10.1145/3411763.3443441. doi:10.1145/3411763.3443441.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [38]
          <string-name>
            <given-names>D.</given-names>
            <surname>Caled</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Won</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Martins</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Silva</surname>
          </string-name>
          ,
          <article-title>A hierarchical label network for multi-label EuroVoc classification</article-title>
          , Association for Computational Linguistics, ????, pp.
          <fpage>1120</fpage>
          -
          <lpage>1138</lpage>
          . URL: https://aclanthology.org/2022.emnlp-main.74/. doi:10.18653/v1/2022.emnlp-main.74.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [52]
          <string-name>
            <given-names>C. J.</given-names>
            <surname>Mahoney</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Huber-Fliflet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Gronvall</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <article-title>A Framework for Explainable Text Classification in Legal Document Review</article-title>
          ,
          <source>IEEE Computer Society</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>1858</fpage>
          -
          <lpage>1867</lpage>
          . URL: https://www.computer.org/csdl/proceedings-article/big-data/2019/09005659/1hJsCablZfy. doi:10.1109/BigData47090.2019.9005659.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [53]
          <string-name>
            <given-names>B.</given-names>
            <surname>Waltl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Bonczek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Scepankova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Matthes</surname>
          </string-name>
          ,
          <article-title>Semantic types of legal norms in German laws: classification and analysis using local linear explanations</article-title>
          ,
          <source>Artificial Intelligence and Law</source>
          <volume>27</volume>
          (
          <year>2019</year>
          )
          <fpage>43</fpage>
          -
          <lpage>71</lpage>
          . URL: https://doi.org/10.1007/s10506-018-9228-y. doi:10.1007/s10506-018-9228-y.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [54]
          <string-name>
            <given-names>Ł.</given-names>
            <surname>Górski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ramakrishna</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Nowosielski</surname>
          </string-name>
          ,
          <article-title>Towards Grad-CAM based explainability in a legal text processing pipeline (extended version)</article-title>
          ,
          <source>in: International Workshop on AI Approaches to the Complexity of Legal Systems</source>
          , Springer,
          <year>2018</year>
          , pp.
          <fpage>154</fpage>
          -
          <lpage>168</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [55]
          <string-name>
            <given-names>J.</given-names>
            <surname>Landthaler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Glaser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Matthes</surname>
          </string-name>
          ,
          <article-title>Towards Explainable Semantic Text Matching</article-title>
          ,
          <source>Legal Knowledge and Information Systems</source>
          <volume>313</volume>
          (
          <year>2018</year>
          ). doi:10.3233/978-1-61499-935-5-200.
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [56]
          <string-name>
            <given-names>F.</given-names>
            <surname>de Arriba-Pérez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>García-Méndez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. J.</given-names>
            <surname>González-Castaño</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>González-González</surname>
          </string-name>
          ,
          <article-title>Explainable machine learning multi-label classification of Spanish legal judgements</article-title>
          ,
          <source>Journal of King Saud University - Computer and Information Sciences</source>
          <volume>34</volume>
          (
          <year>2022</year>
          )
          <fpage>10180</fpage>
          -
          <lpage>10192</lpage>
          . URL: https://www.sciencedirect.com/science/article/pii/S1319157822003664. doi:10.1016/j.jksuci.2022.10.015.
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [57]
          <string-name>
            <given-names>M.</given-names>
            <surname>Medvedeva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Vols</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wieling</surname>
          </string-name>
          ,
          <article-title>Using machine learning to predict decisions of the European Court of Human Rights</article-title>
          ,
          <source>Artificial Intelligence and Law</source>
          <volume>28</volume>
          (
          <year>2020</year>
          )
          <fpage>237</fpage>
          -
          <lpage>266</lpage>
          . URL: https://doi.org/10.1007/s10506-019-09255-y. doi:10.1007/s10506-019-09255-y.
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [58]
          <string-name>
            <given-names>M.</given-names>
            <surname>Sundararajan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Taly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Yan</surname>
          </string-name>
          ,
          <article-title>Axiomatic attribution for deep networks</article-title>
          ,
          <source>in: International conference on machine learning, PMLR</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>3319</fpage>
          -
          <lpage>3328</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [59]
          <string-name>
            <given-names>R. R.</given-names>
            <surname>Selvaraju</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Cogswell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Vedantam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Parikh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Batra</surname>
          </string-name>
          ,
          <article-title>Grad-CAM: Visual explanations from deep networks via gradient-based localization</article-title>
          ,
          <source>in: Proceedings of the IEEE international conference on computer vision</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>618</fpage>
          -
          <lpage>626</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [60]
          <string-name>
            <given-names>L.</given-names>
            <surname>Resck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. M.</given-names>
            <surname>Raimundo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Poco</surname>
          </string-name>
          ,
          <article-title>Exploring the Trade-off Between Model Performance and Explanation Plausibility of Text Classifiers Using Human Rationales</article-title>
          , in:
          <string-name>
            <given-names>K.</given-names>
            <surname>Duh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Gomez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bethard</surname>
          </string-name>
          (Eds.),
          <source>Findings of the Association for Computational Linguistics: NAACL 2024</source>
          , Association for Computational Linguistics, Mexico City, Mexico,
          <year>2024</year>
          , pp.
          <fpage>4190</fpage>
          -
          <lpage>4216</lpage>
          . URL: https://aclanthology.org/2024.findings-naacl.262. doi:10.18653/v1/2024.findings-naacl.262. Also presented as a poster at the LatinX in NLP workshop at NAACL 2024.
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [61]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Rudra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Anand</surname>
          </string-name>
          ,
          <article-title>Explain and Predict, and then Predict Again</article-title>
          ,
          <source>in: Proceedings of the 14th ACM International Conference on Web Search and Data Mining</source>
          , Association for Computing Machinery, Virtual Event Israel,
          <year>2021</year>
          , pp.
          <fpage>418</fpage>
          -
          <lpage>426</lpage>
          . URL: https://doi.org/10.1145/3437963.3441758. doi:10.1145/3437963.3441758.
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [62]
          <string-name>
            <given-names>B.</given-names>
            <surname>Paranjape</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Joshi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Thickstun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Hajishirzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zettlemoyer</surname>
          </string-name>
          ,
          <article-title>An Information Bottleneck Approach for Controlling Conciseness in Rationale Extraction</article-title>
          ,
          <source>in: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)</source>
          ,
          Association for Computational Linguistics
          , Online,
          <year>2020</year>
          , pp.
          <fpage>1938</fpage>
          -
          <lpage>1952</lpage>
          . URL: https://aclanthology.org/2020.emnlp-main.153. doi:10.18653/v1/2020.emnlp-main.153.
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [63]
          <string-name>
            <given-names>E.</given-names>
            <surname>Prem</surname>
          </string-name>
          ,
          <article-title>From ethical AI frameworks to tools: a review of approaches</article-title>
          ,
          <source>AI and Ethics</source>
          <volume>3</volume>
          (
          <year>2023</year>
          )
          <fpage>699</fpage>
          -
          <lpage>716</lpage>
          . URL: https://link.springer.com/article/10.1007/s43681-023-00258-9.
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [64]
          <string-name>
            <given-names>T.</given-names>
            <surname>Hagendorff</surname>
          </string-name>
          ,
          <article-title>The ethics of AI ethics: An evaluation of guidelines</article-title>
          ,
          <source>Minds and Machines</source>
          <volume>30</volume>
          (
          <year>2020</year>
          )
          <fpage>99</fpage>
          -
          <lpage>120</lpage>
          . URL: https://link.springer.com/article/10.1007/s11023-020-09517-8.
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [65]
          <string-name>
            <given-names>A.</given-names>
            <surname>Jobin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ienca</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Vayena</surname>
          </string-name>
          ,
          <article-title>The global landscape of AI ethics guidelines</article-title>
          ,
          <source>Nature Machine Intelligence</source>
          <volume>1</volume>
          (
          <year>2019</year>
          )
          <fpage>389</fpage>
          -
          <lpage>399</lpage>
          . URL: https://www.nature.com/articles/s42256-019-0088-2.
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [66]
          <string-name>
            <given-names>L.</given-names>
            <surname>Floridi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Cowls</surname>
          </string-name>
          ,
          <article-title>A unified framework of five principles for ai in society</article-title>
          ,
          <source>Harvard Data Science Review</source>
          <volume>1</volume>
          (
          <year>2019</year>
          ). URL: https://hdsr.mitpress.mit.edu/pub/l0jsh9d1/release/8. doi:10.1162/99608f92.8cd550d1.
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          [67]
          <string-name>
            <given-names>L.</given-names>
            <surname>Benotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Blackburn</surname>
          </string-name>
          ,
          <article-title>Ethics consideration sections in natural language processing papers</article-title>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>