<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A normative model of explanation for binary classification legal AI and its implementation on causal explanations of Answer Set Programming</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Evan Iatrou</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Institute of Logic, Language and Computation (University of Amsterdam)</institution>
          ,
          <addr-line>Science Park 107, 1098 XG Amsterdam</addr-line>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this paper, I provide a normative model of explanation for the output of AI algorithms used in legal practice. I focus on binary classification algorithms due to their extensive use in the field. In the last part of the paper, I examine the model's compatibility with causal explanations provided by Answer Set Programming (ASP) causal models. The motivation for proposing this model is the necessity for providing explanations for the output of legal AI. From the multiplicity of arguments supporting that necessity, the proposed model addresses the argument that legal AI's output should be objectionable. That can be achieved only if the explanation of the output has a form that makes it amenable to evaluation by legal practitioners. Hence, I firstly provide a normative model for the explanations used by legal practitioners in their practice (CLMLP) and then I provide the normative model for the explanations of legal AI's outputs (EXPBC) that I base on CLPLP. CLPLP in its turn is based on the Classical Model of Science (CMS) which is the normative model of explanations that every “proper” science should follow according to philosophers throughout history. Following the introduction of EXPBC, I propose three degrees of explainability regarding binary classification explanations according to their fidelity to EXP BC. I further argue that machine learning can not satisfy even the lowest degree of explainability, while rule-based AI - like ASP-based AI - can satisfy the highest degree. Concluding, I propose an ASP methodology in-progress that can use EXPBC to provide causal explanations. In the proposed ASP methodology, I am using causal graphs as models of causal inference as well as a metaphysically neutral interventionist account of causation. CMSLP and EXPBC are normative models that are based on the derivation relation of subsumptivedeductive inference among norms and propositions. On the other hand, the proposed ASP methodology is based on causal - and hence non-subsumptive-deductive - relations among norms and propositions that supervene on the subsumptive-deductive ones. Consequently, the motivation behind proposing this specific ASP methodology is to establish a precedent for a unification of diferent types of explanations (e.g., deductive and causal explanations) as well as to bridge the gaps among computational modelling, actual practice, and the philosophical underpinnings of a domain of expertise (law in this case).</p>
      </abstract>
      <kwd-group>
        <kwd>eol&gt;Classical Model of Science (CMS)</kwd>
        <kwd>legal XAI</kwd>
        <kwd>degrees of explainability</kwd>
        <kwd>causal graphs</kwd>
        <kwd>Answer Set Programming (ASP)</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>1. Introduction</p>
      <p>
        1AI has already multiple applications in legal practice. Public judicial institutions employ AI
to assess the possibility of the defendant recidivating [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Private companies use AI to review
thousands of documents to determine which ones are relevant to a particular case [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. There
are even “robot lawyers” like the DoNotPay app;2 an initiative that started by contesting parking
tickets and now it has expanded to other diverse services - from landlord protection to canceling
a Disney+ subscription.
      </p>
      <p>
        A large amount of the legal AI literature is devoted on predicting the outcome of a case
(e.g., whether the Court will rule in favour of the prosecution [
        <xref ref-type="bibr" rid="ref3">3, 4, 5</xref>
        ]) and whether a legal
document belongs to a certain category (e.g., category1=“The document is relevant to a pending
case.”, category2=“The document is not relevant to a pending case.” [6, 7, 8]). There are also many
cases of AI that are not usually considered by the general public as legal AI while they should.
Such an example with significant impact on the daily lives of billions of people are upload filters
- i.e., AI that “substitutes” legal institutions (e.g., courts) by classifying data as (un)lawful [9]. All
foregoing examples are essentially binary classification tasks: the defendant is or is not guilty, a
document is or is not relevant to a pending case, an uploaded post contains or it does not contain
unlawful hate speech. Considering its wide range of applications, the rest of the paper will be
devoted to binary classification legal AI.
      </p>
      <p>Legal experts advocate that AI used in legal practice must provide an explanation for its output.
One of the central arguments is that the outcome of such an algorithm should be objectionable
[10]. That can be achieved only if the explanation of the output has a form that makes it
amenable to evaluation by legal practitioners. As a result, any normative model of explanations
of legal AI’s output should be constructed based on a normative model of explanations used
in legal practice so as to justify its normativity. That also means that any normative model of
explanations of legal AI’s output should be constructed prior to its implementation to actual AI
models since the latter should follow the former. This is why, in this paper, I do not propose a
normative model derived from the current state-of-the-art legal AI, but on the contrary, it is
a model that the state-of-the-art legal AI should follow. The characterisation “state-of-the-art”
itself can not be attributed without a prior listing of all the normative explainability requirements
from the prospective of the legal practitioner that will actually use the proposed AI model.</p>
      <p>The paper is structured in 5 sections with §1 being the introduction and §5 the conclusion. In
§2, I construct a normative model of legal practice. Specifically, in §2.1, I clarify the meaning
of the term “legal practice” in the context of this paper. In §2.2, I present the Classical Model
of Science (CMS), i.e., a normative model of every “proper science” according to philosophers
throughout history. Finally, in §2.3, I apply the CMS to legal practice in order to construct a
normative model of legal practice that I name CMSLP.</p>
      <p>In §3.1, I use the normative model of legal practice from §2.3 to finally construct a normative
model of explanation for the output of binary classification legal AI. I name this model EXPBC.
In §3.2, I use this model to classify the explanations provided by current legal XAI methods in
three categories. Those categories are ordered in degrees of explainability. I argue that rule-based</p>
    </sec>
    <sec id="sec-2">
      <title>1Original submission</title>
      <p>2https://donotpay.com/ (last visit: 10/04/2022)
AI can satisfy the highest degree of explainability while machine learning can not even satisfy
the lowest degree.</p>
      <p>So far, all proposed normative models (CMSLP, EXPBC) are based on subsumptive-deductive
inference - the principal reasoning method used in legal practice. In §4 - the final section of
the paper, I argue how Answer Set Programming (ASP) - as a paradigmatic case of rule-based
programming - can be employed to: (i) model causal inference based on subsumptive-deductive
inference;3 (ii) use causal inference’s supervenience on subsumption-deduction to satisfy the
highest degree of explainability presented in §3.2. The motivation for this approach to causal
explanations is to establish a precedent for a unification of types of explanations (e.g., deductive
and causal explanations) and to bridge the gaps among computational modelling, actual practice,
and the philosophical underpinnings of a discipline (the discipline of law in this case). Due to
the latter motivation, the paper is written in a way that makes it accessible to all addressed
audiences. The endeavour of addressing multiple disciplines is an emerging challenge for the
practical use of AI and ergo, a secondary issue this paper tries to indirectly tackle not explicitly
but by example. E.g., by avoiding unnecessary AI or philosophical jargon and by providing
examples and intuitions where necessary all the while retaining the content’s quality.</p>
      <sec id="sec-2-1">
        <title>2. A normative model of legal practice: the Classical Model of</title>
      </sec>
      <sec id="sec-2-2">
        <title>Science</title>
        <sec id="sec-2-2-1">
          <title>2.1. Legal practice</title>
          <p>Before constructing a normative model of legal practice, I have to clarify what legal practice
is in the context of this paper. By “legal practice”, I mean the totality of the activities of the legal
experts that are authorised to take part in the process of deciding a judgement (e.g., the lawyers
of the defense and the prosecution, the judges of the authorised court). Those activities can be
for instance the arguments the defence makes during a trial and all of the defence’s preparatory
work before the trial. The authorisation of the legal experts is stipulated by law. E.g., the
European Convention of Human Rights (the Convention) stipulates that the authorised court for
deciding whether there has been a violation of the Convention’s articles is the European Court
of Human Rights (ECtHR).4</p>
          <p>I borrow this construal of legal practice from [11]. My motivation is that according to this
construal, the produced results of legal practice are legally binding. And being legally binding
will be a necessary property for the conception of truth I will employ in the proposed normative
model of legal practice.</p>
        </sec>
        <sec id="sec-2-2-2">
          <title>2.2. The Classical Model of Science (CMS)</title>
          <p>The normative model of legal practice I will provide will be based on the Classical Model of
Science (CMS). CMS is a posterior reconstruction of how the ideal scientific explanation for every
“proper science” should look like according to philosophers throughout history [12]. One may
3My choice of causal inference models is causal graphs based on a metaphysically neutral interventionist account of
causation.
4Article 19 of the Convention.
object to my decision to provide a normative model of legal practice using the CMS since the
latter is about sciences and not about a practice. However, even if legal practice is not a science
whatever that means - my arguments supporting that legal practice fits into the CMS will still
hold since none of those arguments is based on the premiss that legal practice is a science.</p>
          <p>Let’s move to the presentation of the CMS. I will provide a less detailed account of the CMS
which I will call CMSsum. For the construction of EXPBC, there is no need to argue how legal
practice fits into the more detailed version of CMS sum found in [12]. Specifically, EXP BC will be
a minimal model. By “minimal”, I mean that one can further expand and/or precisify parameters
of that model5 using the more detailed version of the CMSsum.6 However, for that to be done
properly, I would have had to exceed the word limit.
(SUM.1) CMSsum is a system S of propositions, concepts (or terms) which satisfy clauses (SUM.2)
to (SUM.6).
(SUM.2) CMSsum has a domain . The propositions, concepts and terms of S are about the entities
of .
(SUM.3) The propositions form a hierarchy  based on their derivation: the propositions situated
higher in the hierarchy are derived from the propositions situated lower.
(SUM.4) The concepts and the terms form hierarchies  and  respectively based on their
derivation: the constituents of a hierarchy situated higher in the hierarchy are derived
from the constituents situated lower.
(SUM.5) All propositions of S are true.
(SUM.6) The domain experts know that the propositions of S are true and they have adequate
knowledge of all concepts and terms of S. For each hierarchy ,∈{,, }, knowledge
about one of its constituents is justified from knowledge about other constituents situated
lower in that hierarchy.
2.3. Applying CMSsum to legal practice</p>
          <p>As noted by [12], for each application of the CMS, CMS’s parameters may have to be
modified to fit the particularities of that application. Below, I perform such modifications
to apply CMSsum to legal practice. In §2.4, I sum up the outcome of those modifications to
introduce a normative model of legal practice which I will name CMSLP.
∙ Applying (SUM.3): Legal reasoning can be construed as rule-based reasoning [13].
One way to perform such a construal is to view a legal inference as an inference whose
premisses consist of particular facts and general rules [14] like the following example borrowed
from [15]:
5By “parameters”, I refer to all the objects and relations that the CMS refers to: propositions, concepts, terms,
derivation relations, the domain , the concepts of truth and knowledge, etc.
6An example of such a pacification is to narrow down the derivation relations among propositions to those of
deduction and causality. That way we exclude all other derivation relation like reasoning by analogy or metaphysical
grounding.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Example 1.</title>
      <p>(1) If  lives in Italy for more than 183 consecutive days over a 12-month period, (general rule)
then  is obliged to pay taxes in Italy on their worldwide income.
(1) Alice lives in Italy for more than 183 consecutive days over a 12-month period. (particular fact)
(2) Alice is obliged to pay taxes in Italy on their worldwide income. (conclusion)
A particular fact is a proposition while a (general) rule is a norm. Intuitively, the diference is
that propositions describe which state of afairs is the case (e.g., “Alice is paying taxes.”) while
norms stipulate which state of afairs should be the case (e.g., “Alice should pay taxes.”). Since
legal inferences - like Example 1 - contain both propositions and norms, we can not explain the
inferences’ conclusions without appealing to the referred norms. Consequently, CMSLP has to
contain norms apart from propositions. Moreover, Example 1’s conclusion is also a rule - not
a general, but a particular one - and hence a norm. In other words, we can derive new norms
from other norms and propositions. Consequently, the hierarchy  in CMSLP contains both
norms and propositions which are ordered based on derivation relations;  ’s constituents
situated higher are derived from those situated lower. E.g., norm 2 is situated higher in 
than norm 1 and proposition 1. The derivation relations whose conclusions are norms are
those of subsumptive-deductive inferences: the particular fact (2) is subsumed by rule (1) and via
deduction the conclusion (3) follows.</p>
      <p>In the context of this paper, the output of a binary classification algorithm is essentially a
judgement. E.g., a court’s judgement about whether there has been a violation of article X or
not. As such, a judgement is the conclusion a legal inference. In Example 1, that conclusion has
the form of a norm. However, we can always construe a judgement as a proposition  and not as
a norm . Specifically,  :=  is a descriptive meta-analysis of  whose existence is “parasitical”
to that of  [16]. E.g, instead of  =“Alice should pay taxes.”, we can always say  =“Alice is
obliged by law to pay taxes.”. Having said that, whenever I discuss about a judgement as part of
legal inference (e.g., Example 1) I will construe it as a norm unless explicitly saying otherwise.</p>
      <p>But what are the norms of  ? In §2.1, I construed legal practice as “the totality of the
activities of the legal experts that are authorised to take part in the process of deciding a judgement”.
The foregoing authorisation of the legal experts is authorised by the laws of a specific legal
system. E.g., a court’s judgement about Alice’s taxes to the Italian State will be based on
the norms of the Italian legal system. At the same time, if Alice’s activities lie outside of
the jurisdiction of French law, then the norms of the French legal system are not applicable
to them. Consequently, diferent legal systems may regulate diferent sets of entities 
(domains). As a result, the practice in each domain will be diferent. In other words, to every
legal system corresponds a diferent CMS LP whose norms are: (i) those that make up the
legal system; (ii) those that can be derived by the norms of (i) in a similar manner with Example 1.
∙ Applying (SUM.4): Subsumption is essentially a decision performed by the authorised
legal experts; the experts decide inter alia whether terms appearing in the particular fact fall
under concepts appearing in a rule. E.g., whether the term “Alice” belongs to (is subsumed by)
the concept of “living in Italy for more than 183 consecutive days over a 12-month period”[14].
Consequently, CSMLP includes hierarchies  and  of the concepts and terms participating
in the subsumption.</p>
      <p>For the application of (SUM.5) to legal practice described right afterwards, it is important to
highlight that the concepts of  are interpretive concepts: an interpretive concept of a domain
 is a concept for which the domain experts can not concede on a specific list of criteria about
whether an entity of  belongs to that concept. The concepts for which domain experts do
concede on such criteria are called criterial concepts. For instance, the domain experts of biology
agree that there is a a unique criterion for whether an entity belongs to the concept of tiger: the
entity has tiger DNA. In law, there is no such thing as a DNA of justice, freedom, dignity and the
rest of legal concepts. To make things worse, the decision of whether an entity belongs to a
legal concept is influenced by the background beliefs (ethical and political) of the authorised
legal practitioners [17]. Hence, the extensions of the concepts of  are decided on subjective
criteria. Taking that into consideration, one could argue that CMSsum is not applicable to legal
practice. Specifically, according to [ 12], the hierarchy  reflects “ real or objective grounds
(aitiai) of things” (ordo essendi). Therefore, since interpretive concepts’ extension are depended
on subjective criteria they can not be grounded on “real or objective” aitiai.</p>
      <p>To respond to that objection, I adopt the construal of legal interpretive concepts found in
[18]. In brief, their argument is that in each tradition of legal practice (e.g., in the tradition
of Italian legal practice) at a given point in time a unique conception  of a legal concept
emerges. The legal practitioners interpret that concept based on their background beliefs.
Then, via their disagreement which is performed according to the methods and customs of
that tradition, they concede at a decision of whether an entity belongs to . Ideally, their
consensus coincides with the ordo essendi. Note that the decision about the extension of a legal
concept is vital for legal practice since the truthfulness of a subsumption depends on that decision.
∙ Applying (SUM.5): The propositions of CMSLP refer to state of afairs among CMSLP’s concepts
and terms. Since there is an ordo essendi about these concepts and terms, there is also an ordo
essendi about their state of afairs and hence, all propositions of CMSLP are true.</p>
      <p>Regarding the truthfulness of CMSLP’s norms, we have to deal with the objection that norms
can not take truth values since they do not attempt to describe actual states of afairs. From all
available responses to that problem, I adopt the response that a norm is true in a given normative
system in virtue of that system including that norm.7 E.g., an Italian law is true in the Italian
jurisdiction in virtue of belonging to the Italian legal system. Apart from the norms that are
included in a normative system, we can infer new norms via subsumptive-deductive inference
as is done in Example 1. Those norms are true in virtue of a truthful subsumption-deduction.</p>
      <p>I will notate the conception of truth I have used so far as truth1. It is the truth reflected in
the ordo essendi. However, next to truth1, we need to include another notion of truth: truth2.
Specifically, as elaborated in §2.1, a judgement  in the context of legal practice is decided by
a group of authorised domain experts. Hence, it may be the case that the experts’ judgement
does not correspond to the ordo essendi. A usual cause of such diferentiations is the mismatch
between the consensus of the authorised legal experts on the extension of a legal concept and
the ordo essendi. Despite that, judgement  is still established as true in virtue of the authority
of the authorised legal experts since their judgements are legally binding [16]. This is truth2.
7To argue in favour of that response, I would have to exceed both the scope and the word limit of this paper. Hence,
I take it as a given.</p>
      <p>In contrast with legal practice, in the practice of empirical sciences, no group of experts has
the authority to establish truths; the domain experts do not hold any authority over the physical
world [16]. For instance, Alice has cancer independently of the experts’ conclusion, while O. J.
is guilty in virtue of the experts’ conclusion (truth2) and independently of whether he indeed
killed his wife (truth1). There are of course institutional ways to challenge truth2 which are
still though appeals to another group of authorised experts - like appealing to a higher court.
On the contrary, in empirical sciences, a conclusion can always be challenged, and by any one
of the domain experts [16].</p>
      <p>In case that the legal concepts of CMSLP have been interpreted by the authorised legal
experts (truth2) in a diferent way than the ordo essendi (truth1), they may also compose a
diferent hierarchy than the hierarchy  of the ordo essendi. I will notate that hierarchy
as ,2. Similarly, since the concepts of ,2 play a decisive role in the subsumption of
particular facts (propositions) by general rules (norms) in legal inferences, those propositions
and norms may compose a diferent hierarchy than  . I will notate that hierarchy as
,2. For instance, it may be the case that according to the ordo essendi, a particular fact
 is subsumed by a norm 1 allowing us to infer another norm 2. In that case, 2 is
situated higher in  than 1 and  . However, if the experts interpret the concepts of
1 diferently than the ordo essendi, it may be the case that  can not be subsumed by 1
and subsequently, we can no longer infer 2. In that case, 2 is not situated higher than 1 and  .
∙ Applying (SUM.6): Since legal practitioners must accept the judgements (conclusions) of the
authorised domain experts any conception of knowledge and justification must be based on truth2.
∙ Applying (SUM.2): As mentioned in ¶4 of (SUM.3)’s application to legal practice,
CMSLP’s domain  consists of all the entities that CMSLP’s norms regulate. The concepts,
terms and propositions of CMSLP are about the entities of .
2.4. CMSLP: a normative model of legal practice</p>
      <p>In this subsection, I introduce CMSLP, i.e., a normative model of legal practice. It is the result
of the application of CMSsum to legal practice as elaborated in §2.3.
(LP.1) CMSLP is a system S of propositions (particular facts), norms (general rules), concepts and
terms which satisfy clauses (LP.2) to (LP.6).
(LP.2) CMSLP’s domain  consists of all the entities that the norms of CMSLP regulate. The
propositions, concepts and terms of CMSLP are about the entities of .
(LP.3) The propositions and norms of CMSLP form a hierarchy  based on their derivation: the
propositions and norms situated higher in the hierarchy are derived from the propositions
and norm situated lower. The derivation of norms from other norms and propositions
has the form of subsumptive-deductive inference.
(LP.4) The concepts and the terms form hierarchies  and  respectively based on their
derivation: the constituents of a hierarchy situated higher in the hierarchy are derived
from the constituents situated lower. The concepts are interpretive and objective (i.e., there
is an ordo essendi).
(LP.5) There are two notions of truth: (i) truth1 that reflects the ordo essendi; (ii) truth2 which
is established by a group of authorised domain experts. The authority of the authorised
experts is authorised by the norms of  . In legal practice, the truthfulness of a judgement
is true in the sense of true2.8 truth2 induces new hierarchies ,2, ,2. truth1 coincides
with truth2 only if ,2 coincides with  .
(LP.6) The domain experts know that the propositions and norms of ,2 are true (truth2) and
they have adequate knowledge of all concepts and terms of ,2 and  respectively. For
each hierarchy ,2, ,2 and  , knowledge about one of its constituents is justified
from knowledge about other constituents situated lower in that hierarchy.</p>
      <sec id="sec-3-1">
        <title>3. A normative model of the explainability of upload filters</title>
        <sec id="sec-3-1-1">
          <title>3.1. The model</title>
          <p>Assume a set of norms  that are legally binding for a set of data  (e.g., hateful tweets).
Then, based on the norms  ,  is segmented into two disjoint sets:  =  ∪  . Data  are
elements of  if and only if they do not violate the norms of  . Equivalently,  are elements
of  if and only if they do violate the norms of  . Since  are norms that are legally binding
for , any judgement on the violation of  by ’s data is true if and only if it coincides with
the judgments of the authorised legal practitioners. Since I want to provide a normative model
and not a descriptive one, I assume that there is an algorithm ℱ that segments (or filters )  to
the two disjoint sets  =  ∪  with 100% accuracy (ideal - normative case). Propositions
of the form  :=  ∈  - where  ∈ { ,  } - are judgements for the legality of . Since 
represents a judgement, we can substitute it by a norm  that represents the same judgement.9</p>
          <p>Since  is a judgement it is part of legal practice. Hence, its truthfulness corresponds to truth2.
In this normative context, truth2 is the judgment to which the authorised legal experts would
have conceded if the judgement had been decided by them and not by ℱ . Moreover, since  is part
of the legal practice, its explanation should be based on the normative model CMSLP. According
to CMSLP, the truthfulness of  is grounded on the hierarchy ,2: there are propositions
and norms situated lower than  in the hierarchy ,2 such that  is derived from them via
subsumption-deduction. Consequently, an explanation of  should consist of the
subsumptivedeductive arguments whose conclusion is . Finally, the user whose data are filtered knows the
normative explanation of ℱ ’s output when they know those subsumptive-deductive arguments.
(BC.1) Assume a dataset , a set of norms  which are legally binding for , and an algorithm
ℱ that segments (filters )  to  =  ∪  with 100% accuracy. Then, EXPBC is the
CMSLP whose norms include  and the norms , where  :=  ∈  and  ∈ { ,  }.
(BC.2) EXPBC’s domain is the dataset .</p>
          <p>8Since CMSLP is normative, one could argue that 2 should coincide with true1; that is the ideal (normative) case.
However, since I want to provide a model of legal practice, I want it to be pragmatically useful. I.e., I want a model
that is applicable to every group of actual legal practitioners. Therefore, the ideal state of afairs in every trial is the
ideal state of afairs relevant to the capabilities of the authorised legal practitioners. Consequently, it may be the
case that even in their ideal (normative) performance, those practitioners are not capable of discovering truth1.
9See ¶3 of (SUM.3)’s application to legal practice in §2.3.
(BC.3) The truthfulness of the partition  =  ∪  is that of truth2.
(BC.4) An explanation of a judgement  consists of:
(BC.4.a) all the propositions and norms that are the premisses of the subsumptive-deductive
inference whose conclusion is 
(BC.4.b) the way those propositions and norms are structured (argumentative structure) to
form the subsumptive-deductive inference of (BC.4.a).
(BC.5) A user whose data are filtered by</p>
          <p>ℱ knows:
(BC.5.a) the propositions and norms described in (BC.4.a)
(BC.5.b) the argumentative structure described in (BC.4.b).
3.2. EXPBC and the current state of AI: machine learning vs. rule-based
programming</p>
          <p>EXPBC is compatible with the the review of the current forms of explanation for the output
of legal AI found in [10]. Based on that review, we can classify diferent explainable legal AI (or
legal XAI ) models into three categories of diferent degree of explainability:10
(degree 1) Explanation that consist only of a subset of the propositions of (BC.4.a).
(degree 2) Explanation that consist only of a subset of the propositions and norms of (BC.4.a).
(degree 3) Explanation that consists of parts of the argumentative structure described in (BC.4.b).</p>
          <p>From the above classification, we can infer that the content of explanations of degree 1 is
contained in the content of explanations of degree 2 whose content is contained in the content of
explanations of degree 3. Hence, explanations of degree 3 contain the same and more information
of explanations of degree 2 which contain the same and more information than explanations
of degree 3. The ideal would be for the XAI model to provide maximum information in its
explanations. I.e., provide an explanation of degree 3 that contains all the propositions and
norms of (BC.4.a) and their complete argumentative structure described in (BC.4.b).</p>
        </sec>
        <sec id="sec-3-1-2">
          <title>3.2.1. EXPBC and machine learning</title>
          <p>Although I find Adrien et al.’s 2021 clustering of types of explanation quite useful, there is an
important divergence between my interpretation of that classification and theirs. In contrast
with them, I do not consider machine learning (ML) algorithms capable of satisfying any of the
three degrees of explainability.</p>
          <p>
            Take for instance the example of [
            <xref ref-type="bibr" rid="ref3">3</xref>
            ] which [10] present as a state-of-the-art case of explanation
of degree 3. Their model is a binary classification ML algorithm. The designing of binary
classification ML algorithms consists of two phases: (a) training phase; (b) testing phase. During
the training phase, the model is “fed” with data from two categories and it extracts patterns that
appear with high frequency in each category. During the testing phase, the model attempts to
identify such patterns in new data and it classifies them accordingly. In case that the classification
of the testing phase has low accuracy, the model is re-trained and so on [19].
10The following is my reconstruction of Adrien et al.’s 2021 classification.
          </p>
          <p>
            The binary classification algorithm that [
            <xref ref-type="bibr" rid="ref3">3</xref>
            ] designed is classifying cases brought before the
ECtHR to: (a)  : cases that have violated an article of the Convention; (b)  : cases that
have not violated that article. For instance, during the training phase of their algorithm, they
“fed” the algorithm with past cases of violations of Article 3:
          </p>
          <p>Article 3 (Prohibition of torture): “No one shall be subjected to torture or to
inhuman or degrading treatment or punishment.”
By doing so, the algorithm identified words that appear with high frequency in those cases (e.g.,
“injury”, “ukraine”, “detainee”, “food”). I.e., the patterns the algorithm was using to distinguish
the two categories “violation” and “no violation” were patterns of words. Afterwards, during
the testing phase, they “fed” the algorithm with new cases of alleged violations of Article
3. Whenever the algorithm encountered one of the aforementioned words in a new case, it
was raising the probability of that case being judged by the ECtHR as a violation of Article
3. Consequently, when the algorithm was classifying a new case  as a violation of Article 3,
the explanation was the particular fact  =“The words 1, 2, ...,  appear in  and they also
appear with high frequency in past cases violating Article 3.”.</p>
          <p>Now facts of the form of  are facts about Article 3 itself. Since the term “Article 3” is
mentioned in  , for  (a particular fact) to stand in a subsumptive relation with Article 3 (a
norm), “Article 3” should either be a term appearing in Article 3 itself (self-reference) or it should
belong to a concept mentioned in Article 3. Neither of the two is the case. In other words,
unless a norm is self-referential - either by including the norm itself as a term or by including a
concept to which the norm belongs to - the possibility of a fact of the form of  standing in a
subsumptive relation with that norm is excluded.11 Hence, such a fact can not be part of the
propositions of the hierarchy ,2 since those propositions stand in subsumptive-deductive
derivation relations with norms of ,2. Consequently, an explanation that includes facts like
 is not an explanation that adheres to the normative model EXPBC at all - not even in degree 1.</p>
        </sec>
        <sec id="sec-3-1-3">
          <title>3.2.2. EXPBC and rule-based programming</title>
          <p>In contrast with ML algorithms, ruled-based models can provide explanations of degree 3. A
rule-based model is a model that consists of rules () ⇒  () and facts  (), where ,  , 
are propositional functions,12  is a variable and  a term without free variables. Whenever
() is true, then for every rule  of the form () ⇒  () we have that  () is true [15].</p>
          <p>When they appear in the code of a programme Π , rules can be interpreted as norms and facts
as propositions [15]. Specifically, () ⇒  () can be interpreted as “Whenever () is the case,
then  () should be the case.” and the fact () can be interpreted as “() is the case.”. Usually,
the output has only facts like  (). When in the output, facts can also be interpreted as norms:
“According to the programme Π ,  () must be the case.”. This flexibility in the interpretation of
the output is on par with the remark about being able to construe a judgement as both a norm
and a proposition mentioned at ¶3 of (SUM.3)’s application to legal practice in §2.3.
11The generalised version of  for all classification ML algorithms is:  =“The patterns 1, 2, ...,  appear in data
 and they also appear with high frequency in past cases of category . Hence,  belongs to .”
12Their arities can vary.</p>
          <p>Let’s see now a formalisation of Example 1 using rule-based programming. Assume that
 () :=“x pays taxes in Italy on their worldwide income.”, () :=“ lives in Italy for more than
183 consecutive days over a 12-month period.”. Then, we can formalize Example 1 as follows:
line 1. () ⇒  ()
line 2. ()
output: { ()}
Clearly, this formalisation provides an explanation of degree 3: it contains the norms (line 1,
output), propositions (line 2), and the derivation relation among them.
4. ASP causal explanations supervening on EXPBC</p>
        </sec>
        <sec id="sec-3-1-4">
          <title>4.1. ASP in a nutshell</title>
          <p>ASP is a rule-based programming method that uses first-order logic (FOL) language. As such,
an ASP programme Π is a set of rules of the form  ⇒ ℎ, where ℎ is an atom and the
body consists of combinations of literals , where each  can either be an atom  or its default
negation ∼ . An atom is said to be proven whenever it appears in the head of rule whose body
is satisfied. When we have the edge case where the body of a rule is empty (i.e., ⇒ ℎ), the
head is considered proven under any circumstances. Hence, we call it a fact [20]. The notion of
a proven atom is central for ASP since the output of any ASP programme Π is logical models
that include only those atoms which have been proven. Those models are called answer sets and
hence the name “answer set programming”. Finally, as a FOL programming method, ASP syntax
includes relations (e.g., ) and functions (e.g.,  ) of arities  and ′ respectively symbolised as
/ and  /′. For instance, a common way of representing graphs in ASP is to employ the
following two relations: (i) a predicate /1 which singles out the atoms which are nodes;
(ii) a binary relation /2 which signifies the existence of a directed edge from its first to its
second argument (e.g., (, ) represents  → ). For a more detailed &amp; quick introduction
to the basics of ASP see [20].</p>
          <p>Let’s see now how the rule-based model of Example 1 proposed in §3.2.2 can be realised via
ASP. Assume that the predicate 183/1 stands for , the predicate 2/1 stands
for  . Then the programme Π 1 = {183() ⇒ 2(), 183()}
has only one answer set: 1 = {2(), 183()}. Specifically, since
183() is a fact it has to belong to every answer set of Π 1 . At the same time, since
the body of the rule 183() ⇒ 2() is satisfied for  = , then its head
2()|= must also belong to every answer set of Π 1 . Since there are no more
proven atoms, the only answer set compatible with Π 1 ends up being that of 1 .
4.2. Causal models of causal structures supervening on ,2</p>
          <p>An argument against CMSLP could be that in the actual legal practice the derivation
relations among the propositions and norms of ,2 are not exhausted in subsumptive-deductive
inference. Indeed, by looking in any book on legal reasoning (see e.g., [21], [22]) one can see
that there is a plurality of reasoning methods used in the actual practice: analogical/evidential
reasoning, deontic logic, counterfactuals, etc. However, CMSLP is not a normative model of how
legal practitioners should reason in their practice (e.g., by using abduction [22]). Instead, it is a
normative model on how legal practitioners should explain the outcome of their practice. I.e., it
is a meta-analysis of how they actually reason to conclude to that outcome.</p>
          <p>Having said that, one can still reconstruct the proposed normative explanations to reflect a
reasoning method diferent than subsumption-deduction in a way that it is still clear which is the
hierarchical relation ,2 among the propositions and norms involved in that reconstruction. In
what follows, I will do so for the case of causal inference using ASP. Note that causal inference
is of prime importance for explanations of legal judgements. The most characteristic such
explanation is the alleged causal relation between the defendants’ actions (alleged cause) to the
applicants’ alleged harm (alleged efect ).</p>
          <p>Before reconstructing the proposed subsumptive-deductive explanations to causal
explanations, I need to decide on the the definition of causal inference; I have to know what I model
before modelling it. There is a diverse plethora of available definitions motivated by diferent
metaphysical conceptualisation of causation. Considering that, I am choosing a metaphysically
neutral definition so as to be compatible with many such conceptualisations.</p>
          <p>Definition 4.1 (Cause).  is a cause of  if: (i) it is possible to intervene on ; (ii) under
some such possible intervention on , changes in the value of  are associated changes in the
value of  . [23, p.3583]</p>
          <p>Since the desideratum is a conceptualisation of causal inference to reflect the hierarchy ,2
the variables  and  of Definition 4.1 will be propositions and norms that are part of that
hierarchy. More precisely, considering clauses (i) and (ii) of Definition 4.1, a proposition/norm
 will be the cause of another proposition/norm  if
4.1.i′ it is possible intervene to the value of : The values of  and  in this situation are truth
values - those of truth2. Moreover, “possibility” in this context is a conceptual possibility;
it does not reflect the actual state of afairs, but it is a counterfactual case of another
non-actual state of afairs. For instance, it may be the case that in the actuality Alice has
lived in Italy for more than 183 consecutive days over a 12-month period, but it is also
possible to conceptualise a counterfactual situation in which Alice left on day 182 for a
conference on logical programming in Haifa, Israel.
4.1.ii′ by intervening in the value of , we intervene in the value of  : This is the point where the
hierarchy ,2 comes into play. ,2 is induced by derivation relations: norms situated
higher are derived from norms and propositions situated lower via subsumption-deduction.
See for instance the left graph of Figure 4.1. The direction of the edges 2 → 1 and
2 → 1 exhibits that 2 is derived from (or grounded on) 1 and 1. Since 2 is “derived”
it is the true conclusion of an argument whose premisses are also true. In other words, all
the variables in the current state of afairs (that of ,2) have the value “true”. Hence,
the only intervention to any such variable that we can make is that of setting it false.
Consequently,  will cause  if whenever we set its truth value to false we have that
 ’s truth value also become false.</p>
          <p>The foregoing conceptualization of causation reflects the hierarchical structure ,2 since 
being the cause of  implies that : (i) is situated lower in the hierarchy than  ; (ii) is one of
the premisses from which  is derived via subsumption-deduction.</p>
          <p>Let’s proceed with modelling this conceptualisation of causality. Assume a causal structure C
that exists in the actual world. As a causal structure I define a collection of causal relata and
the causal relations among them. From 4.1.ii′, we know that such a structure supervenes on the
hierarchy ,2 and that the causal relata are the involved propositions/norms. A causal graph
(C) = ⟨ (C), (C)⟩ is a graph whose nodes  (C) are representations of C’s causal relata
and whose edges (C) are representations of the direct causal relation13 among those relata
and hence, it can serve as a model of C.</p>
          <p>2
1</p>
          <p>1
,2 of Example 1.</p>
          <p>C supervenes on ,2
2
(C)
1</p>
          <p>1</p>
          <p>A prominent method of constructing (C) is to bookeep a list C of all the independencies
between the causal relata of C and then, construct (C) in such a way that expresses those and
only those independencies. I.e., assuming that we also have a list (C) of the independencies
among (C)’s nodes, the ideal is that for every set of independent relata in C their representations
in (C) are also independent and vice versa. That requirement presupposes at least two distinct
formal notions of independence - one for C’s causal relata and one for  (C) - and that both
notions explicate formally the same concept of independence.</p>
          <p>Let’s start with the concept of independence we want to formalise. Assume three disjoint
sets of causal relata , , . Then, an independency is a proposition of the form: “Knowing 
renders  irrelevant to  .” [25, p.5]. We symbolise this proposition as C  |. Similarly, for
three disjoint sets of nodes ′,  ′, ′ an independency is a proposition of the form “Knowing
the value of the variables in ′ renders the values of the variables in ′ irrelevant to the values
of the variables in  ′.” symbolised as (′C) ′|′. Verma &amp; Pearl [26, p.354] provide a more
“formal” definition which can be adjusted to the context of this paper as follows:
Definition 4.2 (Independency IXY|Z). Assume an ordered set of variables  and three
disjoint subsets of   = {1, 2, ..., },  = {1, 2, ..., },  = {1, 2, ..., } such
that all variables of  ∪  are situated lower than  ’s variables in that ordering. Then,  is
independent of  given  if when assigning specific values to ’s variables ({1 = 1, 2 =
2, ...,  = }) the value  for every variable of  ∈  will be the same for any set of
values {1, 2, ..., } we assign to the variables of .
13An informal definition of “ direct cause” is the following:  is a direct cause of  if there is no causal relatum that
mediates between  and  [24, p.20]. A paradigmatic example of that is the following: billiard ball 1 hits billiard
ball 2 which hits billiard ball 3. 1 causes 3 to move, but it is not a direct cause since there is another causal
relatum - that of 2 - that mediates between them.</p>
          <p>Although this definition is more “ formal”, it is still not complete since it leaves many open
questions - which are not in the scope of this paper to be answered - like what kind of orderings
of the set  are acceptable. Despite that, it is still quite an insightful definition since it reveals
two important prerequisites for explicating formally the concept of independence:
(I.1) there is a functional dependence of  to both  and :  =  (, )
(I.2) there needs to be an ordering of the variables such that the variables situated higher in
that ordering can potentially be functionally depended in the variables situated lower in
that ordering.</p>
          <p>From 4.1.ii′, we infer that the ordering of (I.2) is that of the hierarchy ,2 and that the functional
dependence of (I.1) is that of a truth assignment function. In the literature of causal modelling,
C  | is usually probabilistic independence and the functional dependence  =  (, ) is
asymmetric. I.e., although  is functionally dependent on both  and , the inverse does not
hold. The latter needs to be a requirement for our model as well since it represents an inherent
asymmetric property of causal relata: the efect depends on the cause but not the inverse [27].
4.3. ASP modelling of causal graphs &amp; EXPBC</p>
          <p>The construction of causal graphs using ASP14 is performed in the following three steps:
(CM.1) the existence of a direct edge between two nodes ,  ∈  (C) is encoded as an atom
(, )
(CM.2) we place restrictions on those atoms to force them respect the independencies (C)
(CM.3) an answer set solver returns all stable models that satisfy those restrictions. The collection
of edges in each such model is the required causal graph.</p>
          <p>The ASP programme described in those three steps does not output a binary classification,
but causal graphs. However, with a few extra steps we can induce from it a binary classification
algorithm. Assume for instance that we want to use ASP to perform the same binary classification
as in the example used in §3.2.1: the input is the facts of a case and the output is whether or
not there has been been a violation of the Convention’s Article 3. Firstly, we construct an
ASP programme Π (C) that outputs causal graphs based on the process described above. For
that, we use past cases of violations of Article 3 to identify the independencies C  |. Let’s
postpone the discussion on the process of identifying those independencies for the conclusion
of the paper. Now from 4.1.ii′ and Definition 4.2, we can infer that an independency of the form
(C) |, means that if values of ’s variables change to false the values of all  ’s variables
will still remain true as long as the values of variables of  remain true. Having that in mind,
we can construct a binary classification ASP programme Π  in the following way:
Π  = Π (C),2 ∪ Π  ∪ Π 
• where Π (C),2 is an ASP programme that consists of the edges outputted by Π (C)
• Π  is an ASP coding of the facts of a case
14See e.g. [28, 29, 30] and [31, §7.2.1].</p>
          <p>• Π  is a set of rules which given an independence   | requires the variables  to be
in the answer set in case that the variables of  are in Π . A naive such set of rules
could be that if all direct causes of  are true then  has to be true as well. The required
output - the violation of Article 3 - is encoded in Π  as an atom. In case that it appears in
the answer sets, then Π  has outputted that we have a violation of Article 3.
It is part of my Thesis - an ongoing project - to come up with rules for Π  and eventually
construct an example of Π  regarding hate speech cases of the ECtHR.</p>
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>5. Conclusion</title>
        <p>What I would like to remark in the conclusion is that in terms of explainability the most
dificult black box to whiten in the foregoing proposal of causal modelling is the process
of deciding the independencies. The common practice is to consider C to be a probabilistic
independence. However, that leaves room for the same kind of criticism than the one against
machine learning I used in §3.2.1; C is based on frequentistic arguments of the form of statistical
independence tests: “In  out of the  examined cases,  was independent of  when we condition
on  and according to the statistical test  they should be conditionally independent.”.</p>
        <p>In order to overcome this problem we are left without any other choice than inquiring for
different forms of functional (in)dependencies than the probabilistic ones. Conveniently, a plurality
of important theorems that allow probabilistic independencies to be translated successfully to
graph-theoretical independencies have been proven for functional dependencies that satisfy
certain properties (e.g., symmetry   | ↔  | and decomposition ∪  | →   |
[32]) and not for probability functions in particular. Hence, a good point to start is to find
functions that still satisfy those properties and whose semantic interpretation is compatible
with EXPBC. Since the variables of (C) are norms and propositions, it is rather intuitive that
truth functions - not per se those of classical logic - have such semantic interpretations.</p>
      </sec>
      <sec id="sec-3-3">
        <title>Acknowledgments</title>
        <p>Thanks to the guidance of Arianna Betti, Ronald de Haan , and Katrin Schulz as well as the
insightful feedbacks of Derek So (Wing Yi) and of the reviewers of the ASPOCP 2022 Workshop.
[4] M. Medvedeva, M. Vols, M. Wieling, Using machine learning to predict decisions of
the european court of human rights, Artificial Intelligence and Law 28 (2020) 237–266.
doi:doi.org/10.1007/s10506-019-09255-y.
[5] M. C. C. F. V. A. L. S. Masías VH, Valle M, Modeling verdict outcomes using social
network measures: The watergate and caviar network cases., PLoS ONE (2016) e0147248.
doi:10.1371/journal.pone.0147248.
[6] Lawformer: A pre-trained language model for chinese legal long documents, AI Open 2
(2021) 79–84. URL: https://www.sciencedirect.com/science/article/pii/S2666651021000176.
doi:10.1016/j.aiopen.2021.06.003.
[7] G. P. H.-F. N. K. R. Z. J. . Z. H. Chhatwal, R. (2018).
[8] F. Wei, H. Qin, S. Ye, H. Zhao, Empirical study of deep learning for text classification in
legal document review, in: 2018 IEEE International Conference on Big Data (Big Data),
2018, pp. 3317–3320. doi:10.1109/BigData.2018.8622157.
[9] G. Sartor, A. Loreggia, The impact of algorithms for online content filtering or
moderation - upload filters, Study requested by the JURI Committee. European Parliament
Think Tank, 2020. URL: https://www.europarl.europa.eu/thinktank/en/document/IPOL_
STU(2020)657101.
[10] B. Adrien, M. Lognoul, A. de Streel, B. Frénay, Legal requirements on explainability
in machine learning, Artificial Intelligence and Law 29 (2021) 149–169. doi: 10.1007/
s10506-020-09270-4.
[11] Álvaro Núñez Vaquero, Five models of legal science, Revus 19 (2013). doi:https://doi.</p>
        <p>org/10.4000/revus.2449.
[12] W. R. de Jong, A. Betti, The classical model of science: a millennia-old model of scientific
rationality, Synthese 174 (2010) 185–203. doi:10.1007/s11229-008-9417-4.
[13] L. Alexander, E. Sherwin, Demystifying legal reasoning, Cambridge University Press, 2008.
[14] N. MacCormick, Legal deduction, legal predicates and expert systems, International</p>
        <p>Journal for the Semiotics of Law 5 (1992) 181–202. doi:10.1007/BF01101868.
[15] G. Governatori, A. Rotolo, G. Sartor, Logic and the law: philosophical foundations,
deontics, and defeasible reasoning, in: D. Gabbay, J. Horty, X. Parent, R. van der Meyden,
L. van der Torre (Eds.), Handbook of deontic logic and normative systems, volume 2,
College Publications, 2021.
[16] C. E. Alchourrón, Limits of logic and legal reasoning, in: C. Bernal, C. Huerta (Eds.),
Essays in legal philosophy, Oxford University Press, [1992] 2015. doi:10.1093/acprof:
oso/9780198729365.003.0017.
[17] R. Dworkin, Justice for hedgehogs, Belknap Press of Harvard University Press, 2011.
[18] F. Schroeter, L. Schroeter, K. Toh, A new interpretivist metasemantics for fundamental
legal disagreements, Legal Theory 26 (2020) 62–99. doi:10.1017/S1352325220000063.
[19] C. M. Bishop, Pattern recognition and machine learning (Information, science and statistics),</p>
        <p>Springer-Verlag, 2006.
[20] M. Gebser, R. Kaminski, B. Kaufmann, T. Schaub, Answer set solving in practice (2012).</p>
        <p>doi:10.2200/S00457ED1V01Y201211AIM019.
[21] G. Bongiovanni, G. Postema, A. Rotolo, G. Sartor, C. Valentini, D. Walton (Eds.), Handbook
of legal reasoning and argumentation, Springer, Dordrecht, 2018.
[22] D. Walton, Legal argumentation and evidence, The Pennsylvania State University Press,
2002.
[23] J. Woodward, Methodology, ontology, and interventionism, Synthese 192 (2015) 3577–3599.</p>
        <p>doi:10.1007/s11229-014-0479-1.
[24] P. Spirtes, C. Glymour, R. Scheines, Causation, Prediction, and Search, 2 ed., MIT Press,
2000.
[25] J. Pearl, A. Paz, GRAPHOIDS: A Graph-based Logic for Reasoning about Relevance
Relations, University of California computer science department, technical Report 850038
(R-53), 1985.
[26] T. Verma, J. Pearl, Causal networks: semantics and expressiveness, in: Proceedings of
the fourth workshop on uncertainty in artificial intelligence, 1988, pp. 352–359. doi: 10.
48550/ARXIV.1304.2379.
[27] N. Cartwright, Against modularity, the causal markov condition, and any link between
the two: comments on Hausman and Woodward, British Journal for the Philosophy of
Science 53 (2002) 411–453. doi:10.1093/bjps/53.3.411.
[28] Z. Zhalama, J. Zhang, F. Eberhardt, W. Mayer, M. J. Li, Asp-based discovery of
semimarkovian causal models under weaker assumptions, in: Proceedings of the 28th
International Joint Conference on Artificial Intelligence, IJCAI’19, AAAI Press, 2019, p. 1488–1494.
doi:10.5555/3367243.3367245.
[29] A. Hyttinen, F. Eberhardt, M. Järvisalo, Constraint-based causal discovery: conflict
resolution with answer set programming, in: Proceedings of the Thirtieth Conference on
Uncertainty in Artificial Intelligence, UAI’14, AUAI Press, Arlington, Virginia, USA, 2014,
p. 340–349. doi:10.5555/3020751.3020787.
[30] S. Triantafillou, I. Tsamardinos, I. G. Tollis, Learning causal structure from overlapping
variable sets, in: In Proceedings of the 13th International Conference on Artificial Intelligence
and Statistics (AISTATS), 2010, p. 860–867.
[31] J. Peters, D. Janzing, B. Schölkopf, Elements of causal inference: foundations and learning
algorithms, The MIT Press, 2017.
[32] J. Pearl, T. S. Verma, The logic of representing dependencies, in: Proceedings of the 6th
national conference on A.I. (AAAI-87), volume 1, 1987, pp. 374–379.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <article-title>[1] European Commission for the Eficiency of Justice, European ethical Charter on the use of artificial intelligence in judicial systems and their environment</article-title>
          ,
          <source>printed by the Council of Europe</source>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>D.</given-names>
            <surname>Remus</surname>
          </string-name>
          ,
          <string-name>
            <surname>F. S. Levy</surname>
          </string-name>
          ,
          <article-title>Can robots be lawyers? computers, lawyers, and the practice of law</article-title>
          ,
          <source>Georgetown journal of legal ethics 30</source>
          (
          <year>2016</year>
          )
          <volume>501</volume>
          +.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>N.</given-names>
            <surname>Aletras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Tsarapatsanis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Preoţiuc-Pietro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Lampos</surname>
          </string-name>
          ,
          <article-title>Predicting judicial decisions of the european court of human rights: a natural language processing perspective</article-title>
          ,
          <source>PeerJ Comput. Sci. 2</source>
          (
          <year>2016</year>
          )
          <article-title>e93</article-title>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>