A Qualitative Investigation of the Degree of Explainability of Defeasible Argumentation and Non-monotonic Fuzzy Reasoning

Lucas Rizzo and Luca Longo*
The ADAPT global centre of excellence for digital content and media innovation
School of Computing, Dublin Institute of Technology, Dublin, Ireland
lucas.rizzo@mydit.ie, luca.longo@dit.ie*

Abstract. Defeasible argumentation has advanced as a solid theoretical research discipline for inference under uncertainty. Scholars have predominantly focused on the construction of argument-based models for demonstrating non-monotonic reasoning adopting the notions of arguments and conflicts. However, they have marginally attempted to examine the degree of explainability that this approach can offer to explain inferences to humans in real-world applications. Model explanations are extremely important in areas such as medical diagnosis because they can increase human trustworthiness towards automatic inferences. In this research, the inferential processes of defeasible argumentation and non-monotonic fuzzy reasoning are meticulously described, exploited and qualitatively compared. A number of properties have been selected for such a comparison, including understandability, simulatability, algorithmic transparency, post-hoc interpretability, computational complexity and extensibility. Findings show how defeasible argumentation can lead to the construction of inferential non-monotonic models with a higher degree of explainability compared to those built with fuzzy reasoning.

Keywords: Defeasible Argumentation, Non-monotonic Reasoning, Fuzzy Reasoning, Argumentation Theory, Explainable Artificial Intelligence

1 Introduction

Knowledge-driven approaches have been extensively used in the field of Artificial Intelligence (AI) for producing inferential models of reasoning. Among them, fuzzy reasoning [21] and defeasible argumentation [4] possess a higher explanatory capacity when compared to other reasoning approaches for dealing with partial, vague and conflicting information [2, 20]. This is because, intuitively, the inferences produced by these approaches can be better understood by humans, since they manipulate knowledge provided by experts while preserving its natural language. However, to the best of our knowledge, no empirical investigation of their explanatory capacity has been made so far. Model explainability is essential for its adoption and usage. The lower the explanatory capacity of a model, the lower the degree of trust placed by humans in its inferences. Medical diagnosis and autonomous driving are examples of application areas where this often occurs. In these areas, humans need to fully understand model functioning in order to trust its inferences. In the field of Artificial Intelligence, a number of properties have been proposed for evaluating the degree of explainability of inferential models. Some of these include model extensibility [11], its simulatability and its post-hoc interpretability [12]. The aim of this research is to qualitatively analyse the explanatory capacity of non-monotonic fuzzy reasoning and defeasible argumentation. A detailed step-by-step description of their inferential mechanisms is provided and contrasted according to a selection of properties from the literature. Both inferential mechanisms are exploited by adopting a knowledge-base provided by an expert in the field of biomarkers.
This knowledge-base is composed of a set of rules which are brought together and evaluated to predict the mortality risk of elderly individuals. In detail, the research question investigated is: "How do the explanatory capacities provided by defeasible argumentation and non-monotonic fuzzy reasoning relate qualitatively?"

The remainder of this paper is organised as follows. Section 2 first outlines defeasible argumentation and non-monotonic fuzzy reasoning; it then introduces related work on Explainable Artificial Intelligence (XAI), presenting a number of properties useful for assessing model explainability. The design of a comparative research study and the inferential processes of defeasible argumentation and non-monotonic fuzzy reasoning are detailed in Section 3. Section 4 provides a qualitative comparison of the selected properties followed by a discussion, while Section 5 concludes the research study.

2 Related work

Defeasible (non-monotonic) reasoning has emerged as a solid theoretical approach within AI for modelling non-monotonic activities under fragmented, ambiguous and conflicting knowledge. In a non-monotonic reasoning process, conclusions do not necessarily increase monotonically; instead, they can be withdrawn as new information arises [14]. A particular type of defeasible reasoning is argumentation, built upon the notions of arguments and their conflicts [13, 2]. Defeasible argumentation provides the basis for the development of computational models of arguments. Such development spans from the definition of the internal structure of arguments to the resolution of their conflicts and their final accrual towards a rational conclusion.

Another type of non-monotonic reasoning can be achieved by employing fuzzy logic and reasoning. This allows the creation of computational models with a robust representation of linguistic information provided by domain experts, by employing the notion of degree of truth. Fuzzy reasoning consists of a fuzzification module, responsible for assigning to each proposition or linguistic fuzzy term, provided by an expert, a degree of truth; an inference engine, accountable for firing rules and aggregating fuzzy terms; and a defuzzification module, which translates this aggregation using the original natural language employed in the underlying reasoning [15]. This robustness in dealing with vague information has led to 50 years of research endeavour, with a plethora of applications in many domains. However, in order to deal with non-monotonic information, the classical fuzzification-engine-defuzzification process has to be extended with a non-monotonic layer. Unfortunately, not many research studies exist for this purpose. For example, in [6] an average function is proposed for aggregating conclusions from conflicting rules, while in [10] a reduction of non-monotonic rules is achieved by means of a rule base compression method. In this study, the approach proposed in [20] is selected. It employs Possibility Theory [7] as a way of dealing with conflicting rules. In a nutshell, truth values are represented by the notions of possibility and necessity, which indicate respectively the extent to which the data fail to refute a proposition and the extent to which they support it.
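To make this representation concrete, consider a brief worked example with illustrative numbers that are not taken from the study. A conclusion asserted by a rule with necessity Nec = 0.8 and possibility Pos = 1 is supported by the data to degree 0.8 and is not refuted by any evidence. If a conflicting rule refuting that conclusion holds with necessity 0.3, the scheme adopted later (Section 3.2) revises the first necessity to min(0.8, 1 - 0.3) = 0.7: the conclusion is weakened by the conflicting information, but not fully withdrawn.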
Previous studies have attempted to analyse the inferential capacity of defeasible argumentation in the context of other approaches of quantitative reasoning under uncertainty [17-19]. However, so far, such analysis has been carried out only by means of predictive accuracy. It has been demonstrated that the evaluation of predictive accuracy alone might not be sufficient for a model to be employed and trusted by domain experts. For instance, in [5] a model trained to predict the probability of death from pneumonia inferred a lower risk for patients who also had asthma. However, asthma is, in fact, a predictor of a higher risk of death. The inference reflected a pattern of lower risk in the training data, a consequence of the more intrusive treatment received by asthmatic patients. Hence, if we expect defeasible argumentation to be trusted and understood by domain experts, it is also necessary to situate its explanatory capacity in relation to other similar reasoning approaches. The literature on Explainable Artificial Intelligence is vast and contains several properties for explainability analysis [11, 1, 12]. Six of these were selected and considered relevant to the knowledge-driven approaches under scrutiny. Some of them were initially defined in the machine learning context, but we believe they can be borrowed for the analysis of reasoning approaches. Table 1 lists their definitions.

Table 1: Properties for explainability, their definitions and sources.
Understandability / Post-hoc Interpretability: capacity of understanding the inferential process behind a model in order to trust and adopt it as a decision supporting tool / capacity of extracting information from a constructed model and the degree of elucidation of its inferences [1]/[12].
Simulatability: capacity of a human to step through every calculation required to produce a prediction in a reasonable time by employing input and parameters [12].
Extendibility: the easiness of an inferential system to accommodate new input parameters and new output classes [11].
Computational Complexity: complexity of the algorithms employed in the inferential process (computational time needed to produce an inference) [11].
Algorithmic transparency: degree of application of the inferential process to new domains [12].

3 Design and methodology

In order to investigate the explanatory capacity provided by defeasible argumentation and non-monotonic fuzzy reasoning, a knowledge-base was selected and operationalised by employing two mechanisms for non-monotonic reasoning: defeasible argumentation and non-monotonic fuzzy reasoning. This knowledge-base was produced by a clinician. The reasoning models built upon it aimed at predicting the risk of mortality in elderly individuals by using information related to their biomarkers. The first inferential approach, defeasible argumentation, is structured over 5 layers as in [13]: 1) the definition of the structure of arguments, 2) the definition of their conflicts, 3) the evaluation of these conflicts, 4) the definition of the dialectical status of arguments, and 5) their final accrual. The second approach, non-monotonic fuzzy reasoning, is composed of three main parts: 1) a fuzzification module, 2) an inference engine and 3) a defuzzification module. Fig. 1 summarises the design of the research.

Fig. 1: Design of the comparative research study. (The data of one individual determine the activated rules from the set of knowledge-base rules; these feed both approaches, defeasible argumentation (structure of arguments, conflicts of arguments, evaluation of conflicts, dialectical status, accrual of arguments) and non-monotonic fuzzy reasoning (fuzzification, inference engine, defuzzification); the resulting inferences are compared along the properties of Table 1.)
3.1 Data and knowledge-base

Fifty-one biomarkers were described by a clinician and their association with mortality risk levels was provided through rules of the form 'IF premises THEN risk-level'. Some biomarkers were described by natural language terms such as low or high. This also applies to the risk levels (no, low, medium, high and extremely high). Numerical ranges had to be defined for these terms and were used in different ways within the defeasible argumentation and fuzzy reasoning approaches. Contradictions among biomarkers were also made explicit as rules of the form 'IF premises THEN conclusion'. Eventually, some preferences among biomarkers were provided. A contradiction refers to a situation in which some biomarker should not be logically employed, while a preference occurs when a biomarker should be used instead of another biomarker. Since the full knowledge-base contains many rules, contradictions and preferences, it cannot be presented in this paper, but it can be accessed online¹. A dataset² was obtained in a primary health care European hospital and the survival status of the 93 patients was recorded 5 years after data collection. One random individual was picked for a detailed analysis and the associated data can be seen in Table 2. From this information, a set of rules, contradictions and preferences was activated, as shown in Table 3. Note that the activation of rules, contradictions and preferences depends on the patient's data. A rule designed for females will not be activated for males. A contradiction is not evaluated if its premises or conclusion are not activated. Similarly, a preference is evaluated only if both its terms are activated.

Table 2: Data about the biomarkers associated to one elderly individual. A full description can be found online¹.
Age 60, Sex female, Hypert high, DM no, Fglu 5.3, HbA1c 4.17, Chol 8.7, HDL 2.06, statins no,
CVD no, BMI 26.68, w/h 0.88, skinf 32, COPB no, allerd no, draller no, analg no, derm no,
OSP ?, Psy no, MMS 26, CMV 2.6, EBV 170, HPA 10.4, LE 6.94, MO 11.7, NEU 28.8,
CRP 3.8, E 4.42, HB 140, HTC 0.41, MCV 93.2, FE 23.6, ALB 47.7, Clear 2.11, HOMCIS 7.9,
VitB12 445, FOLNA 37.1, INS 8.6, CORTIS 470.8, PRL 86.1, TSH 0.491, FT3 5.57, FT4 12.3, GAMA 12.6,
IGE 46.2, anticoag yes, neo no, Ly 53.6, RF 9, ANA 36.8, Death no.

Table 3: Activated rules, contradictions and preferences from the data in Table 2.
Rules (premises: risk):
HDL high (> 1.0): no risk
ANA high (> 32): low risk
w/h high (> 0.8) and female: low risk
Age ∈ [60, 65]: low risk
Hypert yes: extremely high risk
HbA1c high (> 3.8): low risk
Anticoag yes: medium risk
Chol high (≥ 6.19): extremely high risk
MO high (> 8.6): medium risk
CRP > 3: high risk
Ly high (> 40): medium risk
LE > 6.5 and female: medium risk
FE high (> 18): low risk
BMI medium (∈ [26, 29]): medium risk
Contradictions (premises: conclusion):
no CVD: no Anticoag
INS low (≤ 12.26): w/h low (≤ 0.8)
Preferences:
CRP > LE, CRP > ANA, w/h > BMI, Hypert > Age, MO > LE, LY > LE.
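The following minimal Python sketch illustrates how such activation can be checked against one patient's data; it is not the authors' code, and the encoding of premises as predicates and the subset of rules shown are merely illustrative.

# Activation of knowledge-base rules against one patient's data (cf. Tables 2 and 3).
# Rule names and the predicate encoding are hypothetical.

patient = {"Sex": "female", "Age": 60, "HDL": 2.06, "Chol": 8.7,
           "CRP": 3.8, "LE": 6.94, "ANA": 36.8}

# Each rule: (name, premise over the patient's data, forecasted risk level).
rules = [
    ("HDL high (> 1.0)",    lambda p: p["HDL"] > 1.0,                         "no risk"),
    ("Age in [60, 65]",     lambda p: 60 <= p["Age"] <= 65,                   "low risk"),
    ("Chol high (>= 6.19)", lambda p: p["Chol"] >= 6.19,                      "extremely high risk"),
    ("CRP > 3",             lambda p: p["CRP"] > 3,                           "high risk"),
    ("LE > 6.5 and female", lambda p: p["LE"] > 6.5 and p["Sex"] == "female", "medium risk"),
    # A hypothetical male-specific rule: it is not activated for this female patient.
    ("LE > 6.5 and male",   lambda p: p["LE"] > 6.5 and p["Sex"] == "male",   "medium risk"),
]

activated = [(name, risk) for name, premise, risk in rules if premise(patient)]
print(activated)  # a subset of the activated rules of Table 3; the male-specific rule is absent

Contradictions and preferences would be filtered in the same fashion, being kept only when all the rules they refer to are themselves activated.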
3.2 Non-monotonic fuzzy reasoning inference

Fuzzification module  Rules in the form "IF ... THEN ..." and contradiction rules were constructed from the data in Table 3 and are depicted in Fig. 2-A. Afterwards, fuzzy membership functions (FMFs) were defined for linguistic variables such as BMI low (low body mass index) and FE high (high serum iron). Each category of risk had an associated FMF (Fig. 2-B) with input in the range [0, 100] ⊂ R. Because of this, the input variables (biomarkers) had to be normalised to the same range according to their possible minimum and maximum values. Fig. 2-C depicts examples of FMFs for FE high and FE low. Not all biomarkers had a fuzzy representation provided by the domain expert; these were incorporated into the fuzzy inference as crisp variables (membership degree always 0 or 1). For the case under analysis (the picked patient), the crisp variables are HDL, Hypert, Anticoag, MO, CRP and LE. Due to space limitations, not all FMFs are shown here, but they can be accessed online¹.

¹ http://dx.doi.org/10.6084/m9.figshare.7028480
² https://doi.org/10.6084/m9.figshare.7028516.v1

Inference engine  For each linguistic term provided by the domain expert, and used within rules and exceptions, its membership degree has to be computed by evaluating the associated membership function with a given input (from Table 2). Once the membership degree of each linguistic term in the premises of a rule has been computed, a degree of truth for the whole rule can also be computed. This can be done by employing fuzzy AND and OR operators. The ones selected here are Zadeh³, Product⁴ and Lukasiewicz⁵ (Fig. 2-D). Eventually, contradictions, which in fuzzy reasoning define non-monotonicity, have to be evaluated. This evaluation can be done using Possibility Theory, as proposed by [20] for fuzzy reasoning with rule-based systems. In this case truth values are represented by possibility (Pos) and necessity (Nec), as defined in Section 2. The Nec of a proposition is treated here as its membership grade and Pos is always 1 for all propositions. Under these circumstances (Pos ≥ Nec), the effect on the necessity of a proposition A (Nec(A)) of a set of n propositions Q which refute A is derived in [20] and given by:

Nec(A) = min(Nec(A), ¬Nec(Q_1), ..., ¬Nec(Q_n))    (1)

where ¬Nec(Q) = 1 - Nec(Q). In addition, an order of precedence has to be defined when applying equation (1). In this study, contrary to usual fuzzy control systems, the reasoning is done in a single step with all the activated rules fired at once. Nonetheless, it is possible to organise exceptions in a tree structure in which the consequent of an exception is the antecedent of the next exception. Fig. 2-E illustrates this structure, which allows equation (1) to be applied from the roots to the leaves. The updated truth values of those rules subject to refutation by other rules are listed in Fig. 2-F. The last step of the inference engine is to aggregate all the truth values of the membership functions associated with each risk category (grouped by the same category), by using the fuzzy-OR operator (as per Fig. 2-G). The output of this can be graphically represented (Fig. 2-H).

Defuzzification module  A single defuzzified scalar, representing the final inferred mortality risk, has to be computed. Two common methods are selected: mean of max and centroid. The former returns the average of all x coordinates (mortality risks) whose respective y coordinates (membership grades) are maximum in the graphical representation (Fig. 2-H). The latter returns the coordinates of the centre of gravity of the same graphical representation (the x coordinate being the final scalar). Fig. 2-I lists all the final inferences produced for the patient under analysis.
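To make the non-monotonic step of the inference engine concrete, the following minimal Python sketch (not the authors' implementation) takes the membership degrees of Fig. 2-D (Zadeh column) as given inputs, applies equation (1) along the exception tree and aggregates the results per risk category with Zadeh's fuzzy-OR; for the rules subject to refutation, the updated values correspond to those of Fig. 2-F.

# Necessity (membership grade) of each activated rule and exception premise (Fig. 2-D).
nec = {
    "R1": 1.0, "R2": 0.01, "R3": 0.25, "R4": 0.17, "R5": 1.0, "R6": 0.008,
    "R7": 1.0, "R8": 0.92, "R9": 1.0, "R10": 1.0, "R11": 0.76, "R12": 1.0,
    "R13": 0.25, "R14": 0.45, "no_CVD": 1.0, "low_INS": 0.57,
}

# Exceptions E1-E8 (Fig. 2-A) as (refuting proposition, refuted rule),
# ordered from the roots to the leaves of the exception tree of Fig. 2-E.
exceptions = [
    ("no_CVD", "R7"), ("low_INS", "R3"), ("R10", "R12"), ("R10", "R2"),
    ("R5", "R4"), ("R9", "R12"), ("R11", "R12"), ("R3", "R14"),
]

# Equation (1): Nec(A) = min(Nec(A), 1 - Nec(Q)) for every refuting proposition Q.
for refuter, refuted in exceptions:
    nec[refuted] = min(nec[refuted], 1.0 - nec[refuter])

# Mortality risk category supported by each rule (Fig. 2-A).
conclusion = {
    "R1": "no", "R2": "low", "R3": "low", "R4": "low", "R5": "extremely high",
    "R6": "low", "R7": "medium", "R8": "extremely high", "R9": "medium",
    "R10": "high", "R11": "medium", "R12": "medium", "R13": "low", "R14": "medium",
}

# Zadeh's fuzzy-OR (max) aggregates the truth values per risk category (Fig. 2-G).
aggregated = {}
for rule, risk in conclusion.items():
    aggregated[risk] = max(aggregated.get(risk, 0.0), nec[rule])

print(nec)         # updated truth values; refuted rules match Fig. 2-F
print(aggregated)  # e.g. no: 1.0, low: 0.25, medium: 1.0, high: 1.0, extremely high: 1.0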
³ Given propositions a and b, Zadeh's fuzzy-and and fuzzy-or are min(a, b) and max(a, b).
⁴ Product's fuzzy-and and fuzzy-or are respectively a × b and a + b - a × b.
⁵ Lukasiewicz's fuzzy-and and fuzzy-or are max(a + b - 1, 0) and min(a + b, 1).

(A) IF-THEN rules and exceptions from the activated rules of Table 3.
R1: IF HDL high THEN no risk; R2: IF ANA high THEN low risk; R3: IF w/h high and female THEN low risk; R4: IF Age ∈ [60, 65] THEN low risk; R5: IF Hypert yes THEN extremely high risk; R6: IF HbA1c high THEN low risk; R7: IF Anticoag yes THEN medium risk; R8: IF Chol high THEN extremely high risk; R9: IF MO high THEN medium risk; R10: IF CRP > 3 THEN high risk; R11: IF Ly high THEN medium risk; R12: IF LE > 6.5 and female THEN medium risk; R13: IF FE high THEN low risk; R14: IF BMI medium THEN medium risk.
Exceptions: E1: no CVD refutes R7; E2: INS low refutes R3; E3: R10 refutes R12; E4: R10 refutes R2; E5: R3 refutes R14; E6: R5 refutes R4; E7: R9 refutes R12; E8: R11 refutes R12.
(B) Membership functions for the mortality risk categories (plot omitted).
(C) Example of membership functions for the biomarker iron, FE high and FE low (plot omitted).
(D) Truth values of the IF-THEN rules and of the exceptions' premises, identical across the three fuzzy logics (Zadeh, Lukasiewicz, Product) for this patient: R1 1; R2 0.01; R3 0.25; R4 0.17; R5 1; R6 0.008; R7 1; R8 0.92; R9 1; R10 1; R11 0.76; R12 1; R13 0.25; R14 0.45; no CVD 1; low INS 0.57.
(E) Tree representation of the exceptions, with the truth values before and after applying equation (1) next to each node (plot omitted).
(F) Final truth values of the IF-THEN rules after solving the exceptions: R1 1 (no risk); R2 0 (low risk); R3 0.25 (low risk); R4 0 (low risk); R5 1 (extremely high risk); R6 0.008 (low risk); R7 0 (medium risk); R8 0.92 (extremely high risk); R9 1 (medium risk); R10 1 (high risk); R11 1 (medium risk); R12 0 (medium risk); R13 0.25 (low risk); R14 0.45 (medium risk).
(G) Final aggregated truth values of the mortality risks for the different fuzzy logics (Zadeh, Lukasiewicz, Product): No 1, 1, 1; Low 0.25, 0.59, 0.49; Medium 1, 1, 1; High 1, 1, 1; Extremely high 1, 1, 1.
(H) Graphical representations of the aggregated FMFs of mortality risks (plots omitted).
(I) Defuzzification of the graphical representations in (H) and final inference (Zadeh, Lukasiewicz, Product): Centroid (54.12, 0.31), (51.10, 0.32), (51.77, 0.31); Mean of max 56.25, 56.25, 56.25.

Fig. 2: An illustration of the non-monotonic fuzzy reasoning process for the selected elderly patient. The order of operations is from A to I.
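The defuzzification step can also be sketched in a few lines of Python. Since the actual FMFs are only available online, the sketch below assumes simple triangular membership functions for the five risk categories; it is an illustration of the mechanism (mean of max and centroid over the aggregated output), and its numbers need not match Fig. 2-I exactly.

# Defuzzification under assumed triangular FMFs for the five risk categories.

def triangular(x, a, b, c):
    """Triangular membership function peaking at b, zero outside [a, c]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Hypothetical FMFs over the normalised domain [0, 100].
risk_fmf = {
    "no":             lambda x: triangular(x, -25, 0, 25),
    "low":            lambda x: triangular(x, 0, 25, 50),
    "medium":         lambda x: triangular(x, 25, 50, 75),
    "high":           lambda x: triangular(x, 50, 75, 100),
    "extremely high": lambda x: triangular(x, 75, 100, 125),
}

# Aggregated truth values per category (Zadeh column of Fig. 2-G).
truth = {"no": 1.0, "low": 0.25, "medium": 1.0, "high": 1.0, "extremely high": 1.0}

# Output fuzzy set: each category FMF clipped at its truth value, combined by max.
def output(x):
    return max(min(truth[r], f(x)) for r, f in risk_fmf.items())

xs = [i / 10 for i in range(0, 1001)]   # sample the domain [0, 100]
ys = [output(x) for x in xs]

y_max = max(ys)
mean_of_max = sum(x for x, y in zip(xs, ys) if y == y_max) / ys.count(y_max)
centroid = sum(x * y for x, y in zip(xs, ys)) / sum(ys)

print(mean_of_max, centroid)  # two alternative crisp mortality-risk scores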
3.3 Defeasible argumentation inference

Layer 1 - Definition of the internal structure of arguments  The first step of a defeasible argumentation process is to define a set of arguments. Internally, these are generally composed of a set of premises and a conclusion derivable by applying an inference rule →. A typical version of this is known as a forecast argument, in which, from a set of premises, a conclusion can be reasonably forecasted. Examples can be found in Table 3 (left), where premises reasonably forecast a degree of risk of mortality (as also listed in Fig. 3-A). Note that, in contrast to fuzzy rules, the natural language linguistic terms associated to the premises are not quantitatively exploited. Instead, the premises are evaluated as true or not depending on whether the input values fall within certain ranges.

Layer 2 - Definition of the conflicts of arguments  Given a set of forecast arguments, the next step for modelling an underlying knowledge-base is to define the conflicts between arguments. The goal is to evaluate potential inconsistencies and identify invalid arguments through the notion of attack (conflict). In this research, the notion of undercutting attack [16] is employed for the resolution of conflicts. It defines an exception, whereby the application of the knowledge carried by some argument is no longer allowed. It is formed by a set of premises and an undercutting inference ⇒ to another argument. Examples of undercutting attacks, derived from Table 3 (right), are in Fig. 3-B. All the designed arguments and attacks can now be seen as an argumentation framework (Fig. 3-C).

Layer 3 - Evaluation of the conflicts of arguments  After the formalisation of conflicts, these can be evaluated using different approaches, such as considering the strength of attacks or the notion of preferentiality of arguments [9]. Alternatively, as in this study, conflicts follow a binary relation: if two arguments (attacker and attacked) are activated, the conflict between them is fully considered.

Layer 4 - Definition of the dialectical status of arguments  Given an argumentation framework and a notion of conflict, it is necessary to define the set of defeated arguments. An argument A is defeated by an argument B if there is a valid attack from B to A. A well-known approach has been proposed by [8] in the form of acceptability semantics. A semantics is an algorithm designed to produce a set of acceptable and conflict-free arguments, called an extension. Note that the internal structure of arguments is not considered at this stage. Well-known examples are the grounded and the preferred semantics. In this study, only the former algorithm is illustrated (Fig. 3-D). Fig. 3-E depicts its computed extension.

Layer 5 - Accrual of acceptable arguments  Having a set of acceptable forecast arguments, it is necessary to accrue them in case a final inference is required. If no quantity can be associated to an argument, then the conclusion supported by the highest number of arguments could be chosen as the final inference. In case arguments can be quantitatively evaluated (they carry a value, as in this study), then several approaches can be used, including measures of central tendency such as the average (used in this study). Fig. 3-F illustrates the value associated to each argument and the final inference, which is their average.
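Putting layers 2, 4 and 5 together, the following minimal Python sketch (not the authors' implementation) computes the grounded labelling of the framework of Fig. 3-C, following the pseudo-code of Fig. 3-D, and accrues the accepted forecast arguments by average. The attack relation is encoded from the contradictions and preferences of Table 3, and the numerical values of the risk levels (0, 25, 50, 75, 100) are those of Fig. 3-F.

# Attack relation (attacker -> attacked) of the framework in Fig. 3-C. Facts behind
# undercutting attacks ("no CVD", "INS low") are modelled as unattacked arguments.
attacks = {
    "no_CVD": {"Arg7"}, "INS_low": {"Arg3"},
    "Arg10": {"Arg12", "Arg13"},   # CRP > 3 undercuts LE- and ANA-based arguments
    "Arg3": {"Arg2"},              # w/h preferred over BMI
    "Arg5": {"Arg4"}, "Arg9": {"Arg12"}, "Arg11": {"Arg12"},
}

arguments = {f"Arg{i}" for i in range(1, 15)} | {"no_CVD", "INS_low"}
attackers = {a: {b for b, atk in attacks.items() if a in atk} for a in arguments}

# Grounded semantics: accept the roots, then iteratively reject anything attacked
# by an accepted argument and accept anything whose attackers are all rejected.
accepted = {a for a in arguments if not attackers[a]}
rejected = set()
changed = True
while changed:
    changed = False
    for a in arguments - accepted - rejected:
        if attackers[a] & accepted:
            rejected.add(a); changed = True
        elif attackers[a] <= rejected:
            accepted.add(a); changed = True
undecided = arguments - accepted - rejected

# Accrual by average of the accepted forecast arguments (cf. Fig. 3-F).
value = {"Arg1": 0, "Arg2": 50, "Arg5": 100, "Arg6": 25, "Arg8": 100,
         "Arg9": 50, "Arg10": 75, "Arg11": 50, "Arg14": 25}
scores = [value[a] for a in accepted if a in value]
print(sorted(accepted), sum(scores) / len(scores))   # average of approximately 52.7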
(A) Forecast arguments from the activated IF-THEN rules.
Arg1: HDL > 1.0 → no risk; Arg2: BMI ∈ [26, 29] → medium risk; Arg3: w/h > 0.8 and female → low risk; Arg4: Age ∈ [60, 65] → low risk; Arg5: Hypert yes → extremely high risk; Arg6: HbA1c > 3.8 → low risk; Arg7: Anticoag yes → medium risk; Arg8: Chol ≥ 6.19 → extremely high risk; Arg9: MO > 8.6 → medium risk; Arg10: CRP > 3 → high risk; Arg11: Ly > 40 → medium risk; Arg12: LE > 6.5 and female → medium risk; Arg13: ANA > 32 → low risk; Arg14: FE > 18 → low risk.
(B) Undercutting attacks from the activated contradictions and preferences.
UA1: no CVD ⇒ Arg7; UA2: INS ≤ 12.26 ⇒ Arg3; UA3: CRP > 3 ⇒ Arg12; UA4: CRP > 3 ⇒ Arg13; UA5: w/h > 0.8 and female ⇒ Arg2; UA6: Hypert yes ⇒ Arg4; UA7: MO > 8.6 ⇒ Arg12; UA8: Ly > 40 ⇒ Arg12.
(C) Argumentation framework: graphical representation of the binary attack relations (plot omitted).
(D) Grounded semantics pseudo-code.
Data: abstract argumentation graph (C). Result: set of accepted, rejected and undecided arguments.
find all roots; set all roots as accepted;
if there are no roots then all arguments are undecided and terminate;
repeat
  reject all arguments attacked by an accepted argument;
  accept all arguments that are attacked only by rejected arguments;
until no argument was accepted in the previous step;
if an argument is neither accepted nor rejected then it is undecided.
(E) Grounded semantics: computed extension (plot omitted).
(F) Accrual of the forecast arguments accepted by the grounded semantics (argument, conclusion, value): Arg1, no risk, 0; Arg5, extremely high risk, 100; Arg2, medium risk, 50; Arg6, low risk, 25; Arg8, extremely high risk, 100; Arg9, medium risk, 50; Arg10, high risk, 75; Arg11, medium risk, 50; Arg14, low risk, 25. Average: 52.7.

Fig. 3: An illustration of the defeasible argumentation process for the same elderly patient (order from A to F).

4 Comparison and discussion

A comparative qualitative analysis of the explanatory capacity of the defeasible argumentation and non-monotonic fuzzy reasoning processes is performed by using the properties listed in Table 1 (Section 2).

Understandability/Post-hoc Interpretability
– Non-monotonic fuzzy reasoning - The inferential process is aligned to the expert's knowledge and natural language for most of its parts, which makes it generally intuitively understandable by humans. However, this does not apply to some parts, such as the normalisation of the input values, the selection of the fuzzy logic and the defuzzification mechanism. Some mathematical reasoning is required to select suitable parameters for these parts.
– Defeasible argumentation - The initial reasoning steps (layers 1-3) are built upon the same natural language terms provided by the domain expert in the knowledge-base. In layer 4 the grounded semantics was selected. This particular semantics is not a complex algorithm to understand: intuitively, an argument is only rejected if it is attacked by an accepted argument. In layer 5, the accrual of accepted arguments can be done by an intuitive measure of central tendency (here, the average). In case more complex (less intuitive) semantics, such as the preferred [8] or ranking-based [3] semantics, are employed, then the understandability of the inferential process might be compromised.
Simulatability
– Non-monotonic fuzzy reasoning - Practical applications built upon a small number of simple membership functions could support simulatability. However, with more complex membership functions, a domain expert is not likely to be able to step through their calculation with high precision and in a reasonable time. The same applies to the calculations required within the defuzzification unit (for example, the computation of the centroid).
– Defeasible argumentation - Reasonably, an expert could perform the calculations behind all the steps of the inferential process. However, this would be significantly impacted by the number of arguments in the knowledge-base, the complexity of the selected acceptability semantics and the accrual strategy.

Extendibility
– Non-monotonic fuzzy reasoning - New rules can be added or updated in the light of new information. However, fuzzy membership functions have to be defined, demanding further effort that is not common in human reasoning.
– Defeasible argumentation - New arguments can be constructed from new information and easily plugged into the knowledge-base. They follow the same structure (premises to conclusion), which does not require the definition of mathematical functions and is close to the way humans reason.

Computational complexity
– Non-monotonic fuzzy reasoning - The full inferential process, in the worst case, is linear in the number of rules.
– Defeasible argumentation - Layers 3 and 5 are linear in the number of arguments and attack relations. However, for layer 4 (the application of acceptability semantics for the computation of the dialectical status of arguments), complexity can range from linear (for example, the grounded semantics) to exponential (for example, the preferred semantics) [8].

Algorithmic transparency
– Non-monotonic fuzzy reasoning - The inferential process can be applied across different domains. A knowledge-base is a formalisation of a reasoning activity for a specific underlying domain, thus it can be re-used or extended provided the new domains are similar. However, it is important to highlight that traditional fuzzy reasoning has not been designed for application in those domains requiring non-monotonic reasoning. In fact, in this study, the traditional fuzzy reasoning process has been extended through the incorporation of Possibility Theory in order to deal with non-monotonicity.
– Defeasible argumentation - The inferential process can be applied across different domains. By nature, defeasible argumentation is suitable for application in domains requiring non-monotonic reasoning activities. However, in the absence of conflicts, the inferential process can still be applied as it is.

The analysis of the two reasoning approaches suggests that defeasible argumentation might lead to explanations that are easier for humans to understand, both for a domain expert and for a lay person. In fact, through the comparison performed above, on the one hand, without some comprehension of fuzzy logic and its membership functions, the understandability/post-hoc interpretability and simulatability of non-monotonic fuzzy reasoning, and the extendibility of its models, are compromised. On the other hand, defeasible argumentation tends to use the same natural language terms, provided by the domain expert, throughout the whole inferential process, except in the conflict resolution layer (semantics).
Semantics vary in computational complexity (linear or exponential in the number of arguments), so fuzzy reasoning offers an equal or lower complexity, since its fuzzification-engine-defuzzification layers are always linear in the number of rules. However, Possibility Theory always requires the specification of a precedence order of exceptions in the inference engine of fuzzy reasoning, contrary to acceptability semantics, which do not require any precedence order of attacks for solving conflicts and thus offer a higher algorithmic transparency.

5 Conclusion and future work

Despite theoretical advances in defeasible argumentation, to the best of our knowledge, there is a lack of research devoted to the examination of the degree of explainability that this reasoning approach can offer to illustrate inferences to humans in real-world applications. Therefore, this research focused on a qualitative comparison of the degree of explainability of defeasible argumentation and non-monotonic fuzzy reasoning in a real-world setting: the prediction of mortality of elderly people by using biomarkers. The inferential processes behind the two selected reasoning techniques were meticulously illustrated and exploited. The comparison was performed using six properties for explainability extracted from the literature. A qualitative discussion of these properties shows how defeasible argumentation has a greater potential for tackling the problem of explainability of reasoning activities under uncertain, partial and conflicting information. The contribution of this study is to situate defeasible argumentation among similar approaches for reasoning under uncertainty in terms of degree of explainability.

Acknowledgments

Lucas Middeldorf Rizzo would like to thank CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico) for his Science Without Borders scholarship, proc n. 232822/2014-0.

References

1. Allahyari, H., Lavesson, N.: User-oriented assessment of classification model understandability. In: 11th Scandinavian Conference on Artificial Intelligence (2011)
2. Bench-Capon, T.J., Dunne, P.E.: Argumentation in artificial intelligence. Artificial Intelligence 171(10-15), 619–641 (2007)
3. Bonzon, E., Delobelle, J., Konieczny, S., Maudet, N.: A comparative study of ranking-based semantics for abstract argumentation. In: AAAI. pp. 914–920 (2016)
4. Bryant, D., Krause, P.: A review of current defeasible reasoning implementations. The Knowledge Engineering Review 23(3), 227–260 (2008)
5. Caruana, R., Lou, Y., Gehrke, J., Koch, P., Sturm, M., Elhadad, N.: Intelligible models for healthcare: Predicting pneumonia risk and hospital 30-day readmission. In: Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. pp. 1721–1730. ACM (2015)
6. Castro, J.L., Trillas, E., Zurita, J.M.: Non-monotonic fuzzy reasoning. Fuzzy Sets and Systems 94(2), 217–225 (1998)
7. Dubois, D., Prade, H.: Possibility theory: qualitative and quantitative aspects. In: Quantified Representation of Uncertainty and Imprecision, pp. 169–226 (1998)
8. Dung, P.M.: On the acceptability of arguments and its fundamental role in nonmonotonic reasoning, logic programming and n-person games. Artificial Intelligence 77(2), 321–358 (1995)
9. García, D., Simari, G.: Strong and weak forms of abstract argument defense. Computational Models of Argument: Proceedings of COMMA 2008 172, 216 (2008)
10. Gegov, A., Gobalakrishnan, N., Sanders, D.: Rule base compression in fuzzy systems by filtration of non-monotonic rules. Journal of Intelligent & Fuzzy Systems 27(4), 2029–2043 (2014)
11. Giraud-Carrier, C.: Beyond predictive accuracy: what? In: Proceedings of the ECML-98 Workshop on Upgrading Learning to Meta-Level: Model Selection and Data Transformation. pp. 78–85 (1998)
12. Lipton, Z.C.: The mythos of model interpretability. Queue 16(3), 30:31–30:57 (2018)
13. Longo, L.: Argumentation for knowledge representation, conflict resolution, defeasible inference and its integration with machine learning. In: Machine Learning for Health Informatics, pp. 183–208. Springer (2016)
14. Longo, L., Kane, B., Hederman, L.: Argumentation theory in health care. In: Computer-Based Medical Systems, 25th Int. Symposium on. pp. 1–6. IEEE (2012)
15. Passino, K.M., Yurkovich, S., Reinfrank, M.: Fuzzy control
16. Prakken, H.: An abstract framework for argumentation with structured arguments. Argument and Computation 1(2), 93–124 (2010)
17. Rizzo, L., Longo, L.: Representing and inferring mental workload via defeasible reasoning: a comparison with the NASA Task Load Index and the Workload Profile. In: 1st Workshop on Advances In Argumentation In Artificial Intelligence. pp. 126–140 (2017)
18. Rizzo, L., Majnaric, L., Dondio, P., Longo, L.: An investigation of argumentation theory for the prediction of survival in elderly using biomarkers. In: Int. Conf. on Artificial Intelligence Applications and Innovations. pp. 385–397. Springer (2018)
19. Rizzo, L., Majnaric, L., Longo, L.: A comparative study of defeasible argumentation and non-monotonic fuzzy reasoning for elderly survival prediction using biomarkers. In: AI*IA 2018 - Advances in Artificial Intelligence - XVIIth Int. Conference of the Italian Association for Artificial Intelligence. pp. 197–209 (2018)
20. Siler, W., Buckley, J.J.: Fuzzy expert systems and fuzzy reasoning (2005)
21. Zadeh, L.A.: Fuzzy sets. Information and Control 8(3), 338–353 (1965)