A Qualitative Investigation of the Degree of Explainability of Defeasible Argumentation and Non-monotonic Fuzzy Reasoning

Lucas Rizzo and Luca Longo*
The ADAPT global centre of excellence for digital content and media innovation
School of Computing, Dublin Institute of Technology, Dublin, Ireland
lucas.rizzo@mydit.ie, luca.longo@dit.ie*

Abstract. Defeasible argumentation has advanced as a solid theoretical research discipline for inference under uncertainty. Scholars have predominantly focused on the construction of argument-based models for demonstrating non-monotonic reasoning adopting the notions of arguments and conflicts. However, they have marginally attempted to examine the degree of explainability that this approach can offer to explain inferences to humans in real-world applications. Model explanations are extremely important in areas such as medical diagnosis because they can increase human trustworthiness towards automatic inferences. In this research, the inferential processes of defeasible argumentation and non-monotonic fuzzy reasoning are meticulously described, exploited and qualitatively compared. A number of properties have been selected for such a comparison, including understandability, simulatability, algorithmic transparency, post-hoc interpretability, computational complexity and extensibility. Findings show how defeasible argumentation can lead to the construction of inferential non-monotonic models with a higher degree of explainability compared to those built with fuzzy reasoning.

Keywords: Defeasible Argumentation, Non-monotonic Reasoning, Fuzzy Reasoning, Argumentation Theory, Explainable Artificial Intelligence

1 Introduction

Knowledge-driven approaches have been extensively used in the field of Artificial Intelligence (AI) for producing inferential models of reasoning. Among them, fuzzy reasoning [21] and defeasible argumentation [4] possess a higher explanatory capacity when compared to other reasoning approaches for dealing with partial, vague and conflicting information [2, 20]. This is because, intuitively, the inferences produced by these approaches can be better understood by humans, since they manipulate knowledge provided by experts while preserving its natural language. However, to the best of our knowledge, no empirical investigation of their explanatory capacity has been made so far. Model explainability is essential for its adoption and usage. The lower the explanatory capacity of a model, the lower the degree of trust placed by humans in its inferences. Medical diagnosis and autonomous driving are examples of application areas where this often occurs. In these areas, humans need to fully understand model functioning in order to trust its inferences. In the field of Artificial Intelligence, a number of properties have been proposed for evaluating the degree of explainability of inferential models. Some of these include model extensibility [11], its simulatability and its post-hoc interpretability [12]. The aim of this research is to qualitatively analyse the explanatory capacity of non-monotonic fuzzy reasoning and defeasible argumentation. A detailed step-by-step description of their inferential mechanisms is provided and contrasted according to a selection of properties from the literature. Both inferential mechanisms are exploited by adopting a knowledge-base provided by an expert in the field of biomarkers.
This knowledge-base is composed of a set of rules which are brought together and evaluated to predict the mortality risk of elderly individuals. In detail, the research question investigated is: "How do the explanatory capacities provided by defeasible argumentation and non-monotonic fuzzy reasoning relate qualitatively?"

The remainder of this paper is organised as follows. Section 2 first outlines defeasible argumentation and non-monotonic fuzzy reasoning; it then introduces related work on Explainable Artificial Intelligence (XAI), presenting a number of properties useful for assessing model explainability. The design of a comparative research study and the inferential processes of defeasible argumentation and non-monotonic fuzzy reasoning are detailed in Section 3. Section 4 provides a qualitative comparison of the selected properties followed by a discussion, while Section 5 concludes the research study.

2 Related work

Defeasible (non-monotonic) reasoning has emerged as a solid theoretical approach within AI for modelling non-monotonic activities under fragmented, ambiguous and conflicting knowledge. In a non-monotonic reasoning process, conclusions do not necessarily increase monotonically; instead, they can be withdrawn as new information arises [14]. A particular type of defeasible reasoning is argumentation, built upon the notions of arguments and their conflicts [13, 2]. Defeasible argumentation provides the basis for the development of computational models of arguments. Such development spans from the definition of the internal structure of arguments to the resolution of their conflicts and their final accrual towards a rational conclusion.

Another type of non-monotonic reasoning can be achieved by employing fuzzy logic and reasoning. This allows the creation of computational models with a robust representation of linguistic information provided by domain experts, by employing the notion of degree of truth. Fuzzy reasoning consists of a fuzzification module, responsible for assigning to each proposition or linguistic fuzzy term, provided by an expert, a degree of truth; an inference engine, accountable for firing rules and aggregating fuzzy terms; and a defuzzification module, which translates this aggregation using the original natural language employed in the underlying reasoning [15]. This robustness in dealing with vague information has led to 50 years of research endeavour, with a plethora of applications in many domains. However, in order to deal with non-monotonic information, the classical fuzzification-engine-defuzzification process has to be extended with a non-monotonic layer. Unfortunately, not many research studies exist for this purpose. For example, in [6] an average function is proposed for aggregating conclusions from conflicting rules, while in [10] a reduction of non-monotonic rules is achieved by means of a rule base compression method. In this study, the approach proposed in [20] is selected. It employs Possibility Theory [7] as a way of dealing with conflicting rules. In a nutshell, truth values are represented by the notions of possibility and necessity, which indicate respectively the extent to which the data fail to refute a proposition and the extent to which they support it.
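To make this representation concrete, consider a brief worked example with illustrative numbers that are not taken from the study. A conclusion asserted by a rule with necessity Nec = 0.8 and possibility Pos = 1 is supported by the data to degree 0.8 and is not refuted by any evidence. If a conflicting rule refuting that conclusion holds with necessity 0.3, the scheme adopted later (Section 3.2) revises the first necessity to min(0.8, 1 - 0.3) = 0.7: the conclusion is weakened by the conflicting information, but not fully withdrawn.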
Previous studies have attempted to analyse the inferential capacity of defeasible argumentation in the context of other approaches of quantitative reasoning under uncertainty [17-19]. However, so far, such analysis has been carried out only by means of predictive accuracy. It has been demonstrated that the evaluation of predictive accuracy alone might not be sufficient for a model to be employed and trusted by domain experts. For instance, in [5] a model trained to predict the probability of death from pneumonia inferred a lower risk for patients who also had asthma. However, asthma is, in fact, a predictor of a higher risk of death. The inference reflected a pattern of lower risk in the training data, a consequence of the more intrusive treatment received by asthmatic patients. Hence, if we expect defeasible argumentation to be trusted and understood by domain experts, it is also necessary to situate its explanatory capacity in relation to other similar reasoning approaches. The literature on Explainable Artificial Intelligence is vast and contains several properties for explainability analysis [11, 1, 12]. Six of these were selected and considered relevant to the knowledge-driven approaches under scrutiny. Some of them were initially defined in the machine learning context, but we believe they can be borrowed for the analysis of reasoning approaches. Table 1 lists their definitions.

Table 1: Properties for explainability, their definitions and sources.
Understandability / Post-hoc Interpretability: capacity of understanding the inferential process behind a model in order to trust and adopt it as a decision supporting tool / capacity of extracting information from a constructed model and the degree of elucidation of its inferences [1]/[12].
Simulatability: capacity of a human to step through every calculation required to produce a prediction in a reasonable time by employing input and parameters [12].
Extendibility: the easiness of an inferential system to accommodate new input parameters and new output classes [11].
Computational Complexity: complexity of the algorithms employed in the inferential process (computational time needed to produce an inference) [11].
Algorithmic transparency: degree of application of the inferential process to new domains [12].

3 Design and methodology

In order to investigate the explanatory capacity provided by defeasible argumentation and non-monotonic fuzzy reasoning, a knowledge-base was selected and operationalised by employing two mechanisms for non-monotonic reasoning: defeasible argumentation and non-monotonic fuzzy reasoning. This knowledge-base was produced by a clinician. The reasoning models built upon it aimed at predicting the risk of mortality in elderly individuals by using information related to their biomarkers. The first inferential approach, defeasible argumentation, is structured over 5 layers as in [13]: 1) the definition of the structure of arguments, 2) the definition of their conflicts, 3) the evaluation of these conflicts, 4) the definition of the dialectical status of arguments, and 5) their final accrual. The second approach, non-monotonic fuzzy reasoning, is composed of three main parts: 1) a fuzzification module, 2) an inference engine and 3) a defuzzification module. Fig. 1 summarises the design of the research.

Fig. 1: Design of the comparative research study. (The data of one individual determine the activated rules from the set of knowledge-base rules; these feed both approaches, defeasible argumentation (structure of arguments, conflicts of arguments, evaluation of conflicts, dialectical status, accrual of arguments) and non-monotonic fuzzy reasoning (fuzzification, inference engine, defuzzification); the resulting inferences are compared along the properties of Table 1.)
3.1 Data and knowledge-base

Fifty-one biomarkers were described by a clinician and their association with mortality risk levels was provided through rules of the form 'IF premises THEN risk-level'. Some biomarkers were described by natural language terms such as low or high. This also applies to the risk levels (no, low, medium, high and extremely high). Numerical ranges had to be defined for these terms and were used in different ways within the defeasible argumentation and fuzzy reasoning approaches. Contradictions among biomarkers were also made explicit as rules of the form 'IF premises THEN conclusion'. Eventually, some preferences among biomarkers were provided. A contradiction refers to a situation in which some biomarker should not be logically employed, while a preference occurs when a biomarker should be used instead of another biomarker. Since the full knowledge-base contains many rules, contradictions and preferences, it cannot be presented in this paper, but it can be accessed online¹. A dataset² was obtained in a primary health care European hospital and the survival status of the 93 patients was recorded 5 years after data collection. One random individual was picked for a detailed analysis and the associated data can be seen in Table 2. From this information, a set of rules, contradictions and preferences was activated, as shown in Table 3. Note that the activation of rules, contradictions and preferences depends on the patient's data. A rule designed for females will not be activated for males. A contradiction is not evaluated if its premises or conclusion are not activated. Similarly, a preference is evaluated only if both its terms are activated.

Table 2: Data about the biomarkers associated to one elderly individual. A full description can be found online¹.
Age 60, Sex female, Hypert high, DM no, Fglu 5.3, HbA1c 4.17, Chol 8.7, HDL 2.06, statins no,
CVD no, BMI 26.68, w/h 0.88, skinf 32, COPB no, allerd no, draller no, analg no, derm no,
OSP ?, Psy no, MMS 26, CMV 2.6, EBV 170, HPA 10.4, LE 6.94, MO 11.7, NEU 28.8,
CRP 3.8, E 4.42, HB 140, HTC 0.41, MCV 93.2, FE 23.6, ALB 47.7, Clear 2.11, HOMCIS 7.9,
VitB12 445, FOLNA 37.1, INS 8.6, CORTIS 470.8, PRL 86.1, TSH 0.491, FT3 5.57, FT4 12.3, GAMA 12.6,
IGE 46.2, anticoag yes, neo no, Ly 53.6, RF 9, ANA 36.8, Death no.

Table 3: Activated rules, contradictions and preferences from the data in Table 2.
Rules (premises: risk):
HDL high (> 1.0): no risk
ANA high (> 32): low risk
w/h high (> 0.8) and female: low risk
Age ∈ [60, 65]: low risk
Hypert yes: extremely high risk
HbA1c high (> 3.8): low risk
Anticoag yes: medium risk
Chol high (≥ 6.19): extremely high risk
MO high (> 8.6): medium risk
CRP > 3: high risk
Ly high (> 40): medium risk
LE > 6.5 and female: medium risk
FE high (> 18): low risk
BMI medium (∈ [26, 29]): medium risk
Contradictions (premises: conclusion):
no CVD: no Anticoag
INS low (≤ 12.26): w/h low (≤ 0.8)
Preferences:
CRP > LE, CRP > ANA, w/h > BMI, Hypert > Age, MO > LE, LY > LE.
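The following minimal Python sketch illustrates how such activation can be checked against one patient's data; it is not the authors' code, and the encoding of premises as predicates and the subset of rules shown are merely illustrative.

# Activation of knowledge-base rules against one patient's data (cf. Tables 2 and 3).
# Rule names and the predicate encoding are hypothetical.

patient = {"Sex": "female", "Age": 60, "HDL": 2.06, "Chol": 8.7,
           "CRP": 3.8, "LE": 6.94, "ANA": 36.8}

# Each rule: (name, premise over the patient's data, forecasted risk level).
rules = [
    ("HDL high (> 1.0)",    lambda p: p["HDL"] > 1.0,                         "no risk"),
    ("Age in [60, 65]",     lambda p: 60 <= p["Age"] <= 65,                   "low risk"),
    ("Chol high (>= 6.19)", lambda p: p["Chol"] >= 6.19,                      "extremely high risk"),
    ("CRP > 3",             lambda p: p["CRP"] > 3,                           "high risk"),
    ("LE > 6.5 and female", lambda p: p["LE"] > 6.5 and p["Sex"] == "female", "medium risk"),
    # A hypothetical male-specific rule: it is not activated for this female patient.
    ("LE > 6.5 and male",   lambda p: p["LE"] > 6.5 and p["Sex"] == "male",   "medium risk"),
]

activated = [(name, risk) for name, premise, risk in rules if premise(patient)]
print(activated)  # a subset of the activated rules of Table 3; the male-specific rule is absent

Contradictions and preferences would be filtered in the same fashion, being kept only when all the rules they refer to are themselves activated.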
3.2 Non-monotonic fuzzy reasoning inference

Fuzzification module  Rules in the form "IF ... THEN ..." and contradiction rules were constructed from the data in Table 3 and are depicted in Fig. 2-A. Afterwards, fuzzy membership functions (FMFs) were defined for linguistic variables such as BMI low (low body mass index) and FE high (high serum iron). Each category of risk had an associated FMF (Fig. 2-B) with input in the range [0, 100] ⊂ R. Because of this, the input variables (biomarkers) had to be normalised to the same range according to their possible minimum and maximum values. Fig. 2-C depicts examples of FMFs for FE high and FE low. Not all biomarkers had a fuzzy representation provided by the domain expert; these were incorporated into the fuzzy inference as crisp variables (membership degree always 0 or 1). For the case under analysis (the picked patient), the crisp variables are HDL, Hypert, Anticoag, MO, CRP and LE. Due to space limitations, not all FMFs are shown here, but they can be accessed online¹.

¹ http://dx.doi.org/10.6084/m9.figshare.7028480
² https://doi.org/10.6084/m9.figshare.7028516.v1

Inference engine  For each linguistic term provided by the domain expert, and used within rules and exceptions, its membership degree has to be computed by evaluating the associated membership function with a given input (from Table 2). Once the membership degree of each linguistic term in the premises of a rule has been computed, a degree of truth for the whole rule can also be computed. This can be done by employing fuzzy AND and OR operators. The ones selected here are Zadeh³, Product⁴ and Lukasiewicz⁵ (Fig. 2-D). Eventually, contradictions, which in fuzzy reasoning define non-monotonicity, have to be evaluated. This evaluation can be done using Possibility Theory, as proposed by [20] for fuzzy reasoning with rule-based systems. In this case truth values are represented by possibility (Pos) and necessity (Nec), as defined in Section 2. The Nec of a proposition is treated here as its membership grade and Pos is always 1 for all propositions. Under these circumstances (Pos ≥ Nec), the effect on the necessity of a proposition A (Nec(A)) of a set of n propositions Q which refute A is derived in [20] and given by:

Nec(A) = min(Nec(A), ¬Nec(Q_1), ..., ¬Nec(Q_n))    (1)

where ¬Nec(Q) = 1 - Nec(Q). In addition, an order of precedence has to be defined when applying equation (1). In this study, contrary to usual fuzzy control systems, the reasoning is done in a single step with all the activated rules fired at once. Nonetheless, it is possible to organise exceptions in a tree structure in which the consequent of an exception is the antecedent of the next exception. Fig. 2-E illustrates this structure, which allows equation (1) to be applied from the roots to the leaves. The updated truth values of those rules subject to refutation by other rules are listed in Fig. 2-F. The last step of the inference engine is to aggregate all the truth values of the membership functions associated with each risk category (grouped by the same category), by using the fuzzy-OR operator (as per Fig. 2-G). The output of this can be graphically represented (Fig. 2-H).

Defuzzification module  A single defuzzified scalar, representing the final inferred mortality risk, has to be computed. Two common methods are selected: mean of max and centroid. The former returns the average of all x coordinates (mortality risks) whose respective y coordinates (membership grades) are maximum in the graphical representation (Fig. 2-H). The latter returns the coordinates of the centre of gravity of the same graphical representation (the x coordinate being the final scalar). Fig. 2-I lists all the final inferences produced for the patient under analysis.
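To make the non-monotonic step of the inference engine concrete, the following minimal Python sketch (not the authors' implementation) takes the membership degrees of Fig. 2-D (Zadeh column) as given inputs, applies equation (1) along the exception tree and aggregates the results per risk category with Zadeh's fuzzy-OR; for the rules subject to refutation, the updated values correspond to those of Fig. 2-F.

# Necessity (membership grade) of each activated rule and exception premise (Fig. 2-D).
nec = {
    "R1": 1.0, "R2": 0.01, "R3": 0.25, "R4": 0.17, "R5": 1.0, "R6": 0.008,
    "R7": 1.0, "R8": 0.92, "R9": 1.0, "R10": 1.0, "R11": 0.76, "R12": 1.0,
    "R13": 0.25, "R14": 0.45, "no_CVD": 1.0, "low_INS": 0.57,
}

# Exceptions E1-E8 (Fig. 2-A) as (refuting proposition, refuted rule),
# ordered from the roots to the leaves of the exception tree of Fig. 2-E.
exceptions = [
    ("no_CVD", "R7"), ("low_INS", "R3"), ("R10", "R12"), ("R10", "R2"),
    ("R5", "R4"), ("R9", "R12"), ("R11", "R12"), ("R3", "R14"),
]

# Equation (1): Nec(A) = min(Nec(A), 1 - Nec(Q)) for every refuting proposition Q.
for refuter, refuted in exceptions:
    nec[refuted] = min(nec[refuted], 1.0 - nec[refuter])

# Mortality risk category supported by each rule (Fig. 2-A).
conclusion = {
    "R1": "no", "R2": "low", "R3": "low", "R4": "low", "R5": "extremely high",
    "R6": "low", "R7": "medium", "R8": "extremely high", "R9": "medium",
    "R10": "high", "R11": "medium", "R12": "medium", "R13": "low", "R14": "medium",
}

# Zadeh's fuzzy-OR (max) aggregates the truth values per risk category (Fig. 2-G).
aggregated = {}
for rule, risk in conclusion.items():
    aggregated[risk] = max(aggregated.get(risk, 0.0), nec[rule])

print(nec)         # updated truth values; refuted rules match Fig. 2-F
print(aggregated)  # e.g. no: 1.0, low: 0.25, medium: 1.0, high: 1.0, extremely high: 1.0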
³ Given propositions a and b, Zadeh's fuzzy-and and fuzzy-or are min(a, b) and max(a, b).
⁴ Product's fuzzy-and and fuzzy-or are respectively a × b and a + b - a × b.
⁵ Lukasiewicz's fuzzy-and and fuzzy-or are max(a + b - 1, 0) and min(a + b, 1).

(A) IF-THEN rules and exceptions from the activated rules of Table 3.
R1: IF HDL high THEN no risk; R2: IF ANA high THEN low risk; R3: IF w/h high and female THEN low risk; R4: IF Age ∈ [60, 65] THEN low risk; R5: IF Hypert yes THEN extremely high risk; R6: IF HbA1c high THEN low risk; R7: IF Anticoag yes THEN medium risk; R8: IF Chol high THEN extremely high risk; R9: IF MO high THEN medium risk; R10: IF CRP > 3 THEN high risk; R11: IF Ly high THEN medium risk; R12: IF LE > 6.5 and female THEN medium risk; R13: IF FE high THEN low risk; R14: IF BMI medium THEN medium risk.
Exceptions: E1: no CVD refutes R7; E2: INS low refutes R3; E3: R10 refutes R12; E4: R10 refutes R2; E5: R3 refutes R14; E6: R5 refutes R4; E7: R9 refutes R12; E8: R11 refutes R12.
(B) Membership functions for the mortality risk categories (plot omitted).
(C) Example of membership functions for the biomarker iron, FE high and FE low (plot omitted).
(D) Truth values of the IF-THEN rules and of the exceptions' premises, identical across the three fuzzy logics (Zadeh, Lukasiewicz, Product) for this patient: R1 1; R2 0.01; R3 0.25; R4 0.17; R5 1; R6 0.008; R7 1; R8 0.92; R9 1; R10 1; R11 0.76; R12 1; R13 0.25; R14 0.45; no CVD 1; low INS 0.57.
(E) Tree representation of the exceptions, with the truth values before and after applying equation (1) next to each node (plot omitted).
(F) Final truth values of the IF-THEN rules after solving the exceptions: R1 1 (no risk); R2 0 (low risk); R3 0.25 (low risk); R4 0 (low risk); R5 1 (extremely high risk); R6 0.008 (low risk); R7 0 (medium risk); R8 0.92 (extremely high risk); R9 1 (medium risk); R10 1 (high risk); R11 1 (medium risk); R12 0 (medium risk); R13 0.25 (low risk); R14 0.45 (medium risk).
(G) Final aggregated truth values of the mortality risks for the different fuzzy logics (Zadeh, Lukasiewicz, Product): No 1, 1, 1; Low 0.25, 0.59, 0.49; Medium 1, 1, 1; High 1, 1, 1; Extremely high 1, 1, 1.
(H) Graphical representations of the aggregated FMFs of mortality risks (plots omitted).
(I) Defuzzification of the graphical representations in (H) and final inference (Zadeh, Lukasiewicz, Product): Centroid (54.12, 0.31), (51.10, 0.32), (51.77, 0.31); Mean of max 56.25, 56.25, 56.25.

Fig. 2: An illustration of the non-monotonic fuzzy reasoning process for the selected elderly patient. The order of operations is from A to I.
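The defuzzification step can also be sketched in a few lines of Python. Since the actual FMFs are only available online, the sketch below assumes simple triangular membership functions for the five risk categories; it is an illustration of the mechanism (mean of max and centroid over the aggregated output), and its numbers need not match Fig. 2-I exactly.

# Defuzzification under assumed triangular FMFs for the five risk categories.

def triangular(x, a, b, c):
    """Triangular membership function peaking at b, zero outside [a, c]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Hypothetical FMFs over the normalised domain [0, 100].
risk_fmf = {
    "no":             lambda x: triangular(x, -25, 0, 25),
    "low":            lambda x: triangular(x, 0, 25, 50),
    "medium":         lambda x: triangular(x, 25, 50, 75),
    "high":           lambda x: triangular(x, 50, 75, 100),
    "extremely high": lambda x: triangular(x, 75, 100, 125),
}

# Aggregated truth values per category (Zadeh column of Fig. 2-G).
truth = {"no": 1.0, "low": 0.25, "medium": 1.0, "high": 1.0, "extremely high": 1.0}

# Output fuzzy set: each category FMF clipped at its truth value, combined by max.
def output(x):
    return max(min(truth[r], f(x)) for r, f in risk_fmf.items())

xs = [i / 10 for i in range(0, 1001)]   # sample the domain [0, 100]
ys = [output(x) for x in xs]

y_max = max(ys)
mean_of_max = sum(x for x, y in zip(xs, ys) if y == y_max) / ys.count(y_max)
centroid = sum(x * y for x, y in zip(xs, ys)) / sum(ys)

print(mean_of_max, centroid)  # two alternative crisp mortality-risk scores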
3.3 Defeasible argumentation inference

Layer 1 - Definition of the internal structure of arguments  The first step of a defeasible argumentation process is to define a set of arguments. Internally, these are generally composed of a set of premises and a conclusion derivable by applying an inference rule →. A typical version of this is known as a forecast argument, in which, from a set of premises, a conclusion can be reasonably forecasted. Examples can be found in Table 3 (left), where premises reasonably forecast a degree of risk of mortality (as also listed in Fig. 3-A). Note that, in contrast to fuzzy rules, the natural language linguistic terms associated to the premises are not quantitatively exploited. Instead, the premises are evaluated as true or not depending on whether the input values fall within certain ranges.

Layer 2 - Definition of the conflicts of arguments  Given a set of forecast arguments, the next step for modelling an underlying knowledge-base is to define the conflicts between arguments. The goal is to evaluate potential inconsistencies and identify invalid arguments through the notion of attack (conflict). In this research, the notion of undercutting attack [16] is employed for the resolution of conflicts. It defines an exception, whereby the application of the knowledge carried by some argument is no longer allowed. It is formed by a set of premises and an undercutting inference ⇒ to another argument. Examples of undercutting attacks, derived from Table 3 (right), are in Fig. 3-B. All the designed arguments and attacks can now be seen as an argumentation framework (Fig. 3-C).

Layer 3 - Evaluation of the conflicts of arguments  After the formalisation of conflicts, these can be evaluated using different approaches, such as considering the strength of attacks or the notion of preferentiality of arguments [9]. Alternatively, as in this study, conflicts follow a binary relation: if two arguments (attacker and attacked) are activated, the conflict between them is fully considered.

Layer 4 - Definition of the dialectical status of arguments  Given an argumentation framework and a notion of conflict, it is necessary to define the set of defeated arguments. An argument A is defeated by an argument B if there is a valid attack from B to A. A well-known approach has been proposed by [8] in the form of acceptability semantics. A semantics is an algorithm designed to produce a set of acceptable and conflict-free arguments, called an extension. Note that the internal structure of arguments is not considered at this stage. Well-known examples are the grounded and the preferred semantics. In this study, only the former algorithm is illustrated (Fig. 3-D). Fig. 3-E depicts its computed extension.

Layer 5 - Accrual of acceptable arguments  Having a set of acceptable forecast arguments, it is necessary to accrue them in case a final inference is required. If no quantity can be associated to an argument, then the conclusion supported by the highest number of arguments could be chosen as the final inference. In case arguments can be quantitatively evaluated (they carry a value, as in this study), then several approaches can be used, including measures of central tendency such as the average (used in this study). Fig. 3-F illustrates the value associated to each argument and the final inference, which is their average.
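Putting layers 2, 4 and 5 together, the following minimal Python sketch (not the authors' implementation) computes the grounded labelling of the framework of Fig. 3-C, following the pseudo-code of Fig. 3-D, and accrues the accepted forecast arguments by average. The attack relation is encoded from the contradictions and preferences of Table 3, and the numerical values of the risk levels (0, 25, 50, 75, 100) are those of Fig. 3-F.

# Attack relation (attacker -> attacked) of the framework in Fig. 3-C. Facts behind
# undercutting attacks ("no CVD", "INS low") are modelled as unattacked arguments.
attacks = {
    "no_CVD": {"Arg7"}, "INS_low": {"Arg3"},
    "Arg10": {"Arg12", "Arg13"},   # CRP > 3 undercuts LE- and ANA-based arguments
    "Arg3": {"Arg2"},              # w/h preferred over BMI
    "Arg5": {"Arg4"}, "Arg9": {"Arg12"}, "Arg11": {"Arg12"},
}

arguments = {f"Arg{i}" for i in range(1, 15)} | {"no_CVD", "INS_low"}
attackers = {a: {b for b, atk in attacks.items() if a in atk} for a in arguments}

# Grounded semantics: accept the roots, then iteratively reject anything attacked
# by an accepted argument and accept anything whose attackers are all rejected.
accepted = {a for a in arguments if not attackers[a]}
rejected = set()
changed = True
while changed:
    changed = False
    for a in arguments - accepted - rejected:
        if attackers[a] & accepted:
            rejected.add(a); changed = True
        elif attackers[a] <= rejected:
            accepted.add(a); changed = True
undecided = arguments - accepted - rejected

# Accrual by average of the accepted forecast arguments (cf. Fig. 3-F).
value = {"Arg1": 0, "Arg2": 50, "Arg5": 100, "Arg6": 25, "Arg8": 100,
         "Arg9": 50, "Arg10": 75, "Arg11": 50, "Arg14": 25}
scores = [value[a] for a in accepted if a in value]
print(sorted(accepted), sum(scores) / len(scores))   # average of approximately 52.7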
(A) Forecast arguments from the activated IF-THEN rules.
Arg1: HDL > 1.0 → no risk; Arg2: BMI ∈ [26, 29] → medium risk; Arg3: w/h > 0.8 and female → low risk; Arg4: Age ∈ [60, 65] → low risk; Arg5: Hypert yes → extremely high risk; Arg6: HbA1c > 3.8 → low risk; Arg7: Anticoag yes → medium risk; Arg8: Chol ≥ 6.19 → extremely high risk; Arg9: MO > 8.6 → medium risk; Arg10: CRP > 3 → high risk; Arg11: Ly > 40 → medium risk; Arg12: LE > 6.5 and female → medium risk; Arg13: ANA > 32 → low risk; Arg14: FE > 18 → low risk.
(B) Undercutting attacks from the activated contradictions and preferences.
UA1: no CVD ⇒ Arg7; UA2: INS ≤ 12.26 ⇒ Arg3; UA3: CRP > 3 ⇒ Arg12; UA4: CRP > 3 ⇒ Arg13; UA5: w/h > 0.8 and female ⇒ Arg2; UA6: Hypert yes ⇒ Arg4; UA7: MO > 8.6 ⇒ Arg12; UA8: Ly > 40 ⇒ Arg12.
(C) Argumentation framework: graphical representation of the binary attack relations (plot omitted).
(D) Grounded semantics pseudo-code.
Data: abstract argumentation graph (C). Result: set of accepted, rejected and undecided arguments.
find all roots; set all roots as accepted;
if there are no roots then all arguments are undecided and terminate;
repeat
  reject all arguments attacked by an accepted argument;
  accept all arguments that are attacked only by rejected arguments;
until no argument was accepted in the previous step;
if an argument is neither accepted nor rejected then it is undecided.
(E) Grounded semantics: computed extension (plot omitted).
(F) Accrual of the forecast arguments accepted by the grounded semantics (argument, conclusion, value): Arg1, no risk, 0; Arg5, extremely high risk, 100; Arg2, medium risk, 50; Arg6, low risk, 25; Arg8, extremely high risk, 100; Arg9, medium risk, 50; Arg10, high risk, 75; Arg11, medium risk, 50; Arg14, low risk, 25. Average: 52.7.

Fig. 3: An illustration of the defeasible argumentation process for the same elderly patient (order from A to F).

4 Comparison and discussion

A comparative qualitative analysis of the explanatory capacity of the defeasible argumentation and non-monotonic fuzzy reasoning processes is performed by using the properties listed in Table 1 (Section 2).

Understandability/Post-hoc Interpretability
– Non-monotonic fuzzy reasoning - The inferential process is aligned to the expert's knowledge and natural language for most of its parts, which makes it generally intuitively understandable by humans. However, this does not apply to some parts, such as the normalisation of the input values, the selection of the fuzzy logic and the defuzzification mechanism. Some mathematical reasoning is required to select suitable parameters for these parts.
– Defeasible argumentation - The initial reasoning steps (layers 1-3) are built upon the same natural language terms provided by the domain expert in the knowledge-base. In layer 4 the grounded semantics was selected. This particular semantics is not a complex algorithm to understand: intuitively, an argument is only rejected if it is attacked by an accepted argument. In layer 5, the accrual of accepted arguments can be done by an intuitive measure of central tendency (here, the average). In case more complex (less intuitive) semantics, such as the preferred [8] or ranking-based [3] semantics, are employed, then the understandability of the inferential process might be compromised.
Simulatability
– Non-monotonic fuzzy reasoning - Practical applications built upon a small number of simple membership functions could support simulatability. However, with more complex membership functions, a domain expert is not likely to be able to step through their calculation with high precision and in a reasonable time. The same applies to the calculations required within the defuzzification unit (for example, the computation of the centroid).
– Defeasible argumentation - Reasonably, an expert could perform the calculations behind all the steps of the inferential process. However, this would be significantly impacted by the number of arguments in the knowledge-base, the complexity of the selected acceptability semantics and the accrual strategy.

Extendibility
– Non-monotonic fuzzy reasoning - New rules can be added or updated in the light of new information. However, fuzzy membership functions have to be defined, demanding further effort that is not common in human reasoning.
– Defeasible argumentation - New arguments can be constructed from new information and easily plugged into the knowledge-base. They follow the same structure (premises to conclusion), which does not require the definition of mathematical functions and is close to the way humans reason.

Computational complexity
– Non-monotonic fuzzy reasoning - The full inferential process, in the worst case, is linear in the number of rules.
– Defeasible argumentation - Layers 3 and 5 are linear in the number of arguments and attack relations. However, for layer 4 (the application of acceptability semantics for the computation of the dialectical status of arguments), complexity can range from linear (for example, the grounded semantics) to exponential (for example, the preferred semantics) [8].

Algorithmic transparency
– Non-monotonic fuzzy reasoning - The inferential process can be applied across different domains. A knowledge-base is a formalisation of a reasoning activity for a specific underlying domain, thus it can be re-used or extended provided the new domains are similar. However, it is important to highlight that traditional fuzzy reasoning has not been designed for application in those domains requiring non-monotonic reasoning. In fact, in this study, the traditional fuzzy reasoning process has been extended through the incorporation of Possibility Theory in order to deal with non-monotonicity.
– Defeasible argumentation - The inferential process can be applied across different domains. By nature, defeasible argumentation is suitable for application in domains requiring non-monotonic reasoning activities. However, in the absence of conflicts, the inferential process can still be applied as it is.

The analysis of the two reasoning approaches suggests that defeasible argumentation might lead to explanations that are easier for humans to understand, both for a domain expert and for a lay person. In fact, through the comparison performed above, on the one hand, without some comprehension of fuzzy logic and its membership functions, the understandability/post-hoc interpretability and simulatability of non-monotonic fuzzy reasoning, and the extendibility of its models, are compromised. On the other hand, defeasible argumentation tends to use the same natural language terms, provided by the domain expert, throughout the whole inferential process, except in the conflict resolution layer (semantics).
Semantics vary in computational complexity (linear or exponential in the number of arguments), so fuzzy reasoning offers an equal or lower complexity, since its fuzzification-engine-defuzzification layers are always linear in the number of rules. However, Possibility Theory always requires the specification of a precedence order of exceptions in the inference engine of fuzzy reasoning, contrary to acceptability semantics, which do not require any precedence order of attacks for solving conflicts and thus offer a higher algorithmic transparency.

5 Conclusion and future work

Despite theoretical advances in defeasible argumentation, to the best of our knowledge, there is a lack of research devoted to the examination of the degree of explainability that this reasoning approach can offer to illustrate inferences to humans in real-world applications. Therefore, this research focused on a qualitative comparison of the degree of explainability of defeasible argumentation and non-monotonic fuzzy reasoning in a real-world setting: the prediction of mortality of elderly people by using biomarkers. The inferential processes behind the two selected reasoning techniques were meticulously illustrated and exploited. The comparison was performed using six properties for explainability extracted from the literature. A qualitative discussion of these properties shows how defeasible argumentation has a greater potential for tackling the problem of explainability of reasoning activities under uncertain, partial and conflicting information. The contribution of this study is to situate defeasible argumentation among similar approaches for reasoning under uncertainty in terms of degree of explainability.

Acknowledgments

Lucas Middeldorf Rizzo would like to thank CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico) for his Science Without Borders scholarship, proc n. 232822/2014-0.

References

1. Allahyari, H., Lavesson, N.: User-oriented assessment of classification model understandability. In: 11th Scandinavian Conference on Artificial Intelligence (2011)
2. Bench-Capon, T.J., Dunne, P.E.: Argumentation in artificial intelligence. Artificial Intelligence 171(10-15), 619–641 (2007)
3. Bonzon, E., Delobelle, J., Konieczny, S., Maudet, N.: A comparative study of ranking-based semantics for abstract argumentation. In: AAAI. pp. 914–920 (2016)
4. Bryant, D., Krause, P.: A review of current defeasible reasoning implementations. The Knowledge Engineering Review 23(3), 227–260 (2008)
5. Caruana, R., Lou, Y., Gehrke, J., Koch, P., Sturm, M., Elhadad, N.: Intelligible models for healthcare: Predicting pneumonia risk and hospital 30-day readmission. In: Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. pp. 1721–1730. ACM (2015)
6. Castro, J.L., Trillas, E., Zurita, J.M.: Non-monotonic fuzzy reasoning. Fuzzy Sets and Systems 94(2), 217–225 (1998)
7. Dubois, D., Prade, H.: Possibility theory: qualitative and quantitative aspects. In: Quantified Representation of Uncertainty and Imprecision, pp. 169–226 (1998)
8. Dung, P.M.: On the acceptability of arguments and its fundamental role in nonmonotonic reasoning, logic programming and n-person games. Artificial Intelligence 77(2), 321–358 (1995)
9. García, D., Simari, G.: Strong and weak forms of abstract argument defense. Computational Models of Argument: Proceedings of COMMA 2008 172, 216 (2008)
10. Gegov, A., Gobalakrishnan, N., Sanders, D.: Rule base compression in fuzzy systems by filtration of non-monotonic rules. Journal of Intelligent & Fuzzy Systems 27(4), 2029–2043 (2014)
11. Giraud-Carrier, C.: Beyond predictive accuracy: what? In: Proceedings of the ECML-98 Workshop on Upgrading Learning to Meta-Level: Model Selection and Data Transformation. pp. 78–85 (1998)
12. Lipton, Z.C.: The mythos of model interpretability. Queue 16(3), 30:31–30:57 (2018)
13. Longo, L.: Argumentation for knowledge representation, conflict resolution, defeasible inference and its integration with machine learning. In: Machine Learning for Health Informatics, pp. 183–208. Springer (2016)
14. Longo, L., Kane, B., Hederman, L.: Argumentation theory in health care. In: Computer-Based Medical Systems, 25th Int. Symposium on. pp. 1–6. IEEE (2012)
15. Passino, K.M., Yurkovich, S., Reinfrank, M.: Fuzzy control
16. Prakken, H.: An abstract framework for argumentation with structured arguments. Argument and Computation 1(2), 93–124 (2010)
17. Rizzo, L., Longo, L.: Representing and inferring mental workload via defeasible reasoning: a comparison with the NASA Task Load Index and the Workload Profile. In: 1st Workshop on Advances In Argumentation In Artificial Intelligence. pp. 126–140 (2017)
18. Rizzo, L., Majnaric, L., Dondio, P., Longo, L.: An investigation of argumentation theory for the prediction of survival in elderly using biomarkers. In: Int. Conf. on Artificial Intelligence Applications and Innovations. pp. 385–397. Springer (2018)
19. Rizzo, L., Majnaric, L., Longo, L.: A comparative study of defeasible argumentation and non-monotonic fuzzy reasoning for elderly survival prediction using biomarkers. In: AI*IA 2018 - Advances in Artificial Intelligence - XVIIth Int. Conference of the Italian Association for Artificial Intelligence. pp. 197–209 (2018)
20. Siler, W., Buckley, J.J.: Fuzzy expert systems and fuzzy reasoning (2005)
21. Zadeh, L.A.: Fuzzy sets. Information and Control 8(3), 338–353 (1965)