Towards Knowledge-driven Distillation and Explanation of Black-box Models

Roberto Confalonieri1, Pietro Galliani1, Oliver Kutz1, Daniele Porello2, Guendalina Righetti1 and Nicolas Troquard1
1 Faculty of Computer Science, Free University of Bozen-Bolzano, Italy
2 Dipartimento di Antichità, Filosofia e Storia (DAFIST), Università di Genova, Italy

roberto.confalonieri@unibz.it (R. Confalonieri); pietro.galliani@unibz.it (P. Galliani); oliver.kutz@unibz.it (O. Kutz); daniele.porello@unige.it (D. Porello); guendalina.righetti@unibz.it (G. Righetti); nicolas.troquard@unibz.it (N. Troquard)

International Workshop on Data meets Applied Ontologies in Explainable AI (DAO-XAI 2021), September 18–19, 2021, Bratislava, SK

Abstract
We introduce and discuss a knowledge-driven distillation approach to explaining black-box models by means of two kinds of interpretable models. The first is perceptron (or threshold) connectives, which enrich knowledge representation languages such as Description Logics with linear operators that serve as a bridge between statistical learning and logical reasoning. The second is Trepan Reloaded, an approach that builds post-hoc explanations of black-box classifiers in the form of decision trees enhanced by domain knowledge. Our aim is, firstly, to target a model-agnostic distillation approach exemplified with these two frameworks; secondly, to study how these two frameworks interact on a theoretical level; and, thirdly, to investigate use cases in ML and AI in a comparative manner. Specifically, we envision that user studies will help determine the human understandability of explanations generated using these two frameworks.

Keywords
explainable AI, knowledge distillation, perceptron logic, Trepan Reloaded, decision trees

1. Introduction

Since the development of expert systems in the mid-1980s [1], explainable Artificial Intelligence (xAI) has been promoting decision models that are transparent, i.e., that are able to explain why and how decisions are being made. More recent successes in machine learning technology, together with episodes of unfair and discriminatory decisions taken by black-box models, have brought explainability back into focus [2]. This has led to a plethora of new approaches for explaining black-box models [3], which aim to achieve explainability without sacrificing system performance, and to approaches to knowledge discovery in databases [4], which aim to combine Semantic Web data with the data mining and knowledge discovery process.

Figure 1: Our proposed workflow. A learning algorithm (L.A.) extracts from the ABox component of a knowledge base (𝒦ℬ) and/or from the data of a database (DB) a model (M). This model is then distilled to an interpretable proxy (IP), which may be added to the knowledge base's TBox.

Two of the most important problems that the two areas above have tried to address, and that are currently of great practical and theoretical interest, are the following:

• Interpretable Machine Learning [5, 6]: often, it is not sufficient for a model to make accurate predictions. We also want our model to be interpretable, that is, to provide a simple, human-understandable explanation of why it makes its predictions.
• Integration of Prior Domain Knowledge [7, 8]: classification or regression tasks may not exist in a vacuum, but rather they may concern (and often do concern) topics for which there exists a considerable amount of domain knowledge of the kind that is easily represented in terms of logical knowledge bases. This information may be of use both to direct the learning algorithm and to reformulate or generalise its conclusions.

Nonetheless, the interplay between these two problems has remained mostly unexplored. Only a few approaches have considered how to integrate and use domain knowledge to foster interpretable machine learning and to drive the explanation process (e.g., [9]). Furthermore, very few approaches have addressed explanations from a user point of view [10], in particular, analysing what makes for a good explanation [11], how explanations are perceived and understood by humans, and how to use these findings to measure the understandability of explanations of black-box models.

We propose to address these problems with an approach that may be seen as an instance of knowledge distillation [12]. As shown in Figure 1, we propose to first train a (not necessarily human-interpretable) model on data, and then to approximate the resulting model by reducing it to an interpretable proxy. Depending on the nature of the model first trained and of the proxy, this may be trivial (for example, if the model first trained is simple and interpretable enough to serve as its own proxy), rather less so (e.g., if the trained model is a multi-layer neural network and we wish for our interpretable proxy to be a linear model), or considerably difficult (e.g., if the trained model is a complex ensemble model). In general, then, the best way forward will be a modular, possibly model-agnostic, distillation approach through which an interpretable proxy is extracted by evaluating the trained model on arbitrary inputs and adaptively learning a simpler model that best imitates it (a minimal sketch of this step is given below).

We envision that this 'knowledge distillation' procedure could follow at least two different approaches to approximating the original machine learning model. On the one hand, we could use threshold (or "Tooth") expressions, which extend knowledge representation languages by means of linear classifiers [13]. On the other hand, we could use Trepan Reloaded, a model-agnostic approach that provides symbolic explanations, in the form of decision trees, of a black-box model [14, 15]. Since distillation will make use of the knowledge base as well as of the machine learning model, the resulting interpretable proxy will integrate the model and the logical information of the knowledge base, to which it may later be added, thus closing a symbolic integration cycle. The cycle will foster knowledge reuse and sharing.

As pointed out above, another important aspect, which has nevertheless been almost overlooked, is the evaluation of the human understandability of explanations [10, 15]. Providing explanations that are human-intelligible is essential for communicating the underlying logic of a decision-making process to users in an effective way. Research in the social sciences has extensively studied what constitutes a human-understandable explanation, and how humans conceive and share explanations [10]. Other works have studied and proposed ways in which the understandability of explanations can be measured [16].
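To make the distillation step of Figure 1 more concrete, the following is a minimal, hypothetical sketch in Python using scikit-learn; the dataset and the choice of black-box and proxy models are placeholders and not part of our proposal. An opaque model is trained on data, and a small decision tree is then fitted to imitate its predictions.

```python
# Minimal sketch of model-agnostic distillation into an interpretable proxy.
# Dataset and model choices are illustrative placeholders only.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_breast_cancer(return_X_y=True)

# 1. Train the (not necessarily interpretable) black-box model M.
black_box = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# 2. Query M on the available inputs and use its answers as training labels.
y_oracle = black_box.predict(X)

# 3. Fit a small, interpretable proxy (here, a shallow decision tree) to imitate M.
proxy = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y_oracle)

# Fidelity: how closely the proxy reproduces the black box's behaviour.
print("fidelity:", accuracy_score(y_oracle, proxy.predict(X)))
```

The fidelity score indicates how closely the proxy imitates the black box; the two frameworks discussed below refine this basic recipe by changing the target representation and by bringing domain knowledge into the process.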
In the research area of Description Logics, the work in [17] considers the readability of tree-like proofs as explanations. The authors in [18] study the cognitive complexity of justifications of OWL entailments, and measure their understandability for human users. We will use these works as a basis to design experiments aiming to compare the explanations distilled using threshold expressions and Trepan Reloaded.

2. Explanations via Weighted Threshold Operators

Weighted threshold operators are n-ary logical operators which compute a weighted sum of their arguments and verify whether it reaches a certain threshold. These operators have been extensively studied in the context of circuit complexity theory, and they are also known in the neural network community under the alternative name of perceptrons. In [13], threshold operators were studied in the context of Knowledge Representation, focusing in particular on Description Logics (DLs). In brief, if C1, …, Cn are concept expressions, w1, …, wn ∈ ℝ are weights, and t ∈ ℝ is a threshold, we can introduce a new concept ∇∇_t(C1 : w1, …, Cn : wn) to designate those individuals d such that ∑{wi : Ci applies to d} ≥ t.

In the context of DLs and concept representation, such threshold expressions are natural and useful, as they provide a simple way to describe the class of individuals that satisfy 'enough' of a certain set of desiderata. For example, let us consider the Felony Score Sheet used in the State of Florida (http://www.dc.state.fl.us/pub/scoresheet/cpc_manual.pdf, accessed 13 July 2021), in which various aspects of a crime are assigned points, and a threshold must be reached to decide compulsory imprisonment. For example, possession of cocaine corresponds to 16 points if it is the primary offense and to 2.4 points otherwise, a victim injury describable as "moderate" corresponds to 18 points, and a failure to appear for a criminal proceeding results in 4 points. Imprisonment is compulsory if the total is greater than 44 points and not compulsory otherwise. A knowledge base describing the laws of Florida would need to represent this score sheet as part of the definition of its CompulsoryImprisonment concept, for instance as ∇∇_44(CocainePrimary : 16, ModerateInjuries : 18, …). While it would be possible to also describe it (or any other Boolean function) in terms of more ordinary logical connectives (e.g., by a DNF expression), a definition in terms of threshold expressions is far simpler and more readable. As such, the definition is more transparent and more explainable. We refer the interested reader to [13, 19, 20] for a more in-depth analysis of the properties of this operator.

Having threshold expressions in a knowledge representation language has notable advantages. First, in psychology and cognitive science, the combination of two or more concepts has a more subtle semantics than set-theoretic operations. As shown in [21], threshold operators can represent complex concepts more faithfully with respect to the way in which humans think of them. For this reason, explanations provided using threshold expressions are in principle more accessible to human agents. Second, as illustrated in [19], since a threshold expression is a linear classification model, it is possible to use standard linear classification algorithms (such as the Perceptron algorithm, Logistic Regression, or a linear SVM) to learn its weights and its threshold given a set of assertions about individuals (that is, given an ABox).
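As a small illustration (not taken from [13] or [19], and purely hypothetical in its data), the following Python sketch first evaluates a threshold concept in the style of the felony score sheet above, and then uses off-the-shelf logistic regression to recover the weights and threshold of a ∇∇ expression from Boolean membership assertions. The concept names, the synthetic 'ABox', and the hidden target expression are made up for the example.

```python
# Sketch: evaluating and learning weighted threshold (∇∇) expressions.
import numpy as np
from sklearn.linear_model import LogisticRegression

def satisfies_threshold(individual, weighted_concepts, threshold):
    """individual: dict mapping concept names to booleans. Returns True iff the
    weighted sum of the concepts that apply to the individual reaches the threshold."""
    total = sum(w for concept, w in weighted_concepts.items() if individual.get(concept, False))
    return total >= threshold

# ∇∇_44(CocainePrimary : 16, ModerateInjuries : 18, FailureToAppear : 4, ...)
compulsory_imprisonment = ({"CocainePrimary": 16, "ModerateInjuries": 18, "FailureToAppear": 4}, 44)
print(satisfies_threshold({"CocainePrimary": True, "ModerateInjuries": True},
                          *compulsory_imprisonment))  # False: 16 + 18 = 34 < 44

# Learning a threshold expression from assertions: rows are individuals, columns are
# 0/1 concept memberships, y records whether the target concept applies to them.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 3))                        # membership in C1, C2, C3
y = (3 * X[:, 0] + 2 * X[:, 1] - X[:, 2] >= 2).astype(int)   # hidden target expression

clf = LogisticRegression().fit(X, y)
weights, bias = clf.coef_[0], clf.intercept_[0]
# The learned classifier predicts the positive class when weights . x + bias > 0,
# i.e. (up to the boundary case) the threshold expression ∇∇_t(C1:w1, C2:w2, C3:w3)
# with t = -bias.
print("weights:", weights, "threshold:", -bias)
```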
Extensions of Description Logics involving threshold operators have also been discussed in [22, 23]. The approaches presented in these two papers are, however, different from the one summarised above: the former changes the semantics of DLs by associating graded membership functions with models and requiring them for the interpretation of expressions, while the latter extends the semantics of the DL 𝒜ℒ𝒞 by means of weighted alternating parity tree automata. The approach described above is, in comparison, more direct: no changes are made to the definitions of the models of the DL(s) to which threshold operators are added, and the language is merely extended by means of the above-described operators. This allows for an easier comparison of our approach with better-known DLs, as well as a simpler adaptation of already developed technologies. Provided that the language of the original DL contains the ordinary Boolean operators, adding the threshold operators does not increase the expressive power (as already noted in [13]), nor does it increase the complexity of reasoning [24].

Of particular interest in the context of this paper is the use of threshold operators to represent LIME-style local explanations [25, 26]. Very briefly, the behaviour of a learned model near a certain data point is approximated by a linear function that describes the overall behaviour of the complex model around that point and that can be represented by a threshold expression. By adding such expressions for a suitable number of data points, we could then provide (and add to a knowledge base) a human-readable approximation of the overall behaviour of the machine learning model.

3. Explanations via Decision Trees

In the ML literature, techniques for explaining black-box models are typically classified into local and global methods [3]. Whilst local methods take into account specific examples and provide local explanations, global methods aim to provide an overall approximation of the behaviour of the black-box model. Global explanations are usually preferable to local explanations, because they provide a more general view of the decision-making process of a black box. An attempt to aggregate local explanations into a global one was proposed in [27].

A seminal method for explaining black-box classifiers is Trepan [28]. Trepan is a tree-induction algorithm that recursively extracts decision trees from oracles, in particular from feed-forward neural networks. The algorithm is model-agnostic, and it can in principle be applied to explain any black-box classifier (e.g., a Random Forest). Trepan combines the learning of the decision tree with a trained machine learning classifier (the oracle). At each learning step, the oracle's predicted labels are used instead of the known real labels. The use of this oracle serves two purposes: first, it helps to prevent the tree from overfitting to outliers in the training data; second, and more importantly, it helps to build more accurate trees. To produce enough examples to reliably generate test conditions on lower branches of the tree, Trepan draws extra artificial query instances that are submitted to the neural network as if they were real data. The features of these query instances are based on the distribution of the underlying data. Both the query instances and the original data are submitted to the neural network 'oracle', and its outputs are used to build the tree.
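The sketch below illustrates only the oracle-querying idea just described, extending the generic distillation sketch from the Introduction; it is not the Trepan algorithm itself, which grows the tree best-first with m-of-n splits and more careful models of the feature distributions. The neural network, the dataset, and the simple per-feature sampling scheme are illustrative assumptions.

```python
# Simplified sketch of Trepan-style oracle querying with synthetic instances.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_breast_cancer(return_X_y=True)

# The black-box 'oracle': a small feed-forward neural network.
oracle = make_pipeline(StandardScaler(),
                       MLPClassifier(hidden_layer_sizes=(32,), max_iter=500,
                                     random_state=0)).fit(X, y)

# Draw extra query instances by sampling each feature independently from its
# empirical marginal distribution (a crude stand-in for Trepan's data models).
rng = np.random.default_rng(0)
X_query = np.column_stack([rng.choice(X[:, j], size=2000) for j in range(X.shape[1])])

# Both the original data and the query instances are labelled by the oracle.
X_all = np.vstack([X, X_query])
y_all = oracle.predict(X_all)

# A decision tree is grown on the oracle's labels.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_all, y_all)
print(export_text(tree))  # the extracted, human-readable surrogate
```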
An extension of the Trepan algorithm, called Trepan Reloaded, was proposed in [14] to take explicit knowledge, modelled by means of ontologies, into account. Trepan Reloaded uses a modified information gain that, in the creation of split nodes, gives priority to features associated with more general concepts defined in a domain ontology. This is achieved by means of an information content measure defined using refinement operators [29, 30, 31]. Linking explanations to structured knowledge in the form of ontologies brings multiple advantages. It not only enriches explanations (or the elements therein) with semantic information, thus facilitating effective knowledge transmission to users, but it also creates a potential for supporting the customisation of explanations to specific user profiles [32]. To measure the effect of the ontology on the understandability of explanations, an online user study with human users was conducted. The study showed that decision trees generated by Trepan Reloaded, thus taking domain knowledge into account, were more understandable than those generated without the use of domain knowledge [14, 15].

4. Evaluating Human Understandability of Explanations

Decision trees and threshold expressions appear to have complementary pros and cons as explanatory tools for black-box classifiers. Decision trees have the advantage of having clear visual representations: a human user can easily follow them to understand which factors lead the classifier to reach which conclusion in which circumstances. On the other hand, especially in the case of very large trees, it can be difficult for a user to follow the overall structure of the decision tree or to use it to engage in counterfactual reasoning (e.g., "would the final decision of the classifier have been YES rather than NO if feature C1 had been different?"). Threshold expressions, on the other hand, are arguably of less immediate interpretability for a user, but they have the advantage of specifying clearly which factors influence the decision of the classifier positively or negatively, and to which (comparative) degree, thus making it easier for a user to evaluate the effect that changing certain specific input features would have on the outcome.

Previous work attempting to measure the understandability of symbolic decision models, and of decision trees in particular [33, 34], proposed syntactic complexity measures based on the model's structure. The syntactic complexity of an explanation can be measured, for instance, by counting the number of internal nodes or leaves in the case of decision trees, or by counting the number of symbols used in the case of logical formulas. Having a measure like syntactic complexity, which can be easily computed, is useful from an application perspective: for example, it may be used to prevent excessive complexity when building decision trees and threshold expressions that explain a black box. On the other hand, syntactic complexity does not necessarily capture precisely the understandability of explanations by users.
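As an example of how easily such syntactic measures can be computed, the sketch below counts internal nodes and leaves of a fitted scikit-learn decision tree, and symbols of a weighted threshold expression given as a weight map and a threshold; the particular symbol-counting convention is our own illustrative choice, not one fixed by [33, 34].

```python
# Sketch: simple syntactic-complexity measures for the two explanation formats.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

def tree_complexity(tree_model):
    """Return (number of internal nodes, number of leaves) of a fitted tree."""
    t = tree_model.tree_
    n_leaves = int((t.children_left == -1).sum())   # leaves have no children
    return t.node_count - n_leaves, n_leaves

def threshold_complexity(weighted_concepts):
    """Count the symbols of ∇∇_t(C1:w1, ..., Cn:wn): one operator symbol, one
    threshold, plus a concept name and a weight per argument (our convention)."""
    return 2 + 2 * len(weighted_concepts)

X, y = load_iris(return_X_y=True)
dt = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(tree_complexity(dt))
print(threshold_complexity({"CocainePrimary": 16, "ModerateInjuries": 18}))
```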
Figure 2: Our eventual aim. From the black-box model (M), learned from a database (DB) and/or from the ABox of a knowledge base (𝒦ℬ), we distil, via the knowledge base 𝒦ℬ, a library of interpretable models IP1, IP2, … with different advantages and disadvantages. A recommender system (Rec) then chooses among these models the one to present to the user (Usr) in the form of a narrative generated by a textual translator (Txt). The user in turn interacts with the recommender system.

A direct measure of user understandability is how accurately a user can employ a given explanation to make a decision. Another measure of cognitive difficulty is reaction time (RT), or response latency [35]. RT is a standard measure used by cognitive psychologists and has become a staple measure of complexity in the domain of design and user interfaces [36]. Understandability depends on the cognitive load experienced by users, e.g., in using the decision model to classify instances and in understanding the features in the model itself. However, for practical purposes, human understandability needs to be approximated by an objective measure.

We will compare two characterisations of the understandability of explanations: (i) understandability based on the syntactic complexity of an explanation (number of internal nodes, leaves, symbols used in a weighted formula, etc.), and (ii) understandability based on users' performance and subjective ratings, reflecting, for instance, the cognitive load experienced by users in carrying out tasks using a given explanation format.

We aim to conduct a user study to measure and compare, with human users, the understandability of explanations given in the form of decision trees and of threshold expressions. This can be done in domains where explanations are critical, e.g., justice, finance, or medicine. Conducting and analysing such experiments can provide useful insights into the conditions and tasks under which one representation is deemed more understandable than the other by users.

5. Outlook to Future Work

In this work we briefly introduced two novel and promising approaches to distilling black-box models into explainable models while making use of domain knowledge. As discussed in the previous section, the natural next step consists in investigating experimentally the respective advantages and disadvantages of these two approaches, with an eye towards a characterisation of the scenarios in which either provides models that are more understandable and/or closer to the black-box model than the other.

Much besides that, of course, remains to be done. For instance, to further improve understandability, a further processing step translating either kind of model into a textual description might be worth implementing, e.g., using natural language generation [37] and narratives [38], as well as a way to use background knowledge to adjust the explanatory model to the needs of different stakeholders [32].

Ultimately, we aim to integrate these two approaches (or more) into a unified meta-approach that can use multiple modes of explanation for different aspects of a black-box model, automatically choosing among the available options the ones that are best suited to provide a faithful and understandable representation of that specific aspect of the model. This is an ambitious endeavour, which would culminate in the automated distillation of a single black-box model into a library of explainable models, differing both in kind and in complexity (e.g., number of decision nodes, weights), whose availability to the user is mediated by a recommender system and a textual translator, as sketched in Figure 2.

References

[1] M. R. Wick, W. B. Thompson, Reconstructive expert system explanation, Artificial Intelligence 54 (1992) 33–70.
[2] R. Confalonieri, L. Coba, B. Wagner, T. R. Besold, A historical perspective of explainable artificial intelligence, WIREs Data Mining and Knowledge Discovery 11 (2021). doi:10.1002/widm.1391.
[3] R. Guidotti, A. Monreale, S. Ruggieri, F. Turini, F. Giannotti, D. Pedreschi, A survey of methods for explaining black box models, ACM Comp. Surv. 51 (2018) 1–42.
[4] P. Ristoski, H. Paulheim, Semantic Web in data mining and knowledge discovery: A comprehensive survey, Journal of Web Semantics 36 (2016) 1–22. doi:10.1016/j.websem.2016.01.001.
[5] A. Vellido, J. D. Martín-Guerrero, P. J. Lisboa, Making machine learning models interpretable, in: 20th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, 2012, pp. 163–172.
[6] F. Doshi-Velez, B. Kim, Towards a rigorous science of interpretable machine learning, arXiv preprint arXiv:1702.08608 (2017).
[7] W. Liao, Q. Ji, Learning Bayesian network parameters under incomplete data with domain knowledge, Pattern Recognition 42 (2009) 3046–3056.
[8] S. Mei, J. Zhu, J. Zhu, Robust RegBayes: Selectively incorporating first-order logic domain knowledge into Bayesian models, in: International Conference on Machine Learning, 2014, pp. 253–261.
[9] X. Renard, N. Woloszko, J. Aigrain, M. Detyniecki, Concept tree: High-level representation of variables for more interpretable surrogate decision trees, CoRR abs/1906.01297 (2019). URL: http://arxiv.org/abs/1906.01297.
[10] T. Miller, Explanation in artificial intelligence: Insights from the social sciences, Artificial Intelligence 267 (2019) 1–38. doi:10.1016/j.artint.2018.07.007.
[11] R. Confalonieri, T. R. Besold, T. Weyde, K. Creel, T. Lombrozo, S. Mueller, P. Shafto, What makes a good explanation? Cognitive dimensions of explaining intelligent machines, in: Proc. of the 41st Annual Meeting of the Cognitive Science Society, CogSci 2019, 2019.
[12] Y. Cheng, D. Wang, P. Zhou, T. Zhang, A survey of model compression and acceleration for deep neural networks, arXiv preprint arXiv:1710.09282 (2017).
[13] D. Porello, O. Kutz, G. Righetti, N. Troquard, P. Galliani, C. Masolo, A toothful of concepts: Towards a theory of weighted concept combination, in: Proc. of the 32nd International Workshop on Description Logics, volume 2373, CEUR-WS, 2019.
[14] R. Confalonieri, T. Weyde, T. R. Besold, F. M. del Prado Martín, Trepan Reloaded: A knowledge-driven approach to explaining black-box models, in: Proc. of the 24th European Conference on Artificial Intelligence, volume 325 of Frontiers in Artificial Intelligence and Applications, IOS Press, 2020, pp. 2457–2464. doi:10.3233/FAIA200378.
[15] R. Confalonieri, T. Weyde, T. R. Besold, F. M. del Prado Martín, Using ontologies to enhance human understandability of global post-hoc explanations of black-box models, Artificial Intelligence 296 (2021). doi:10.1016/j.artint.2021.103471.
[16] R. R. Hoffman, S. T. Mueller, G. Klein, J. Litman, Metrics for explainable AI: Challenges and prospects, CoRR abs/1812.04608 (2018).
[17] C. Alrabbaa, F. Baader, S. Borgwardt, P. Koopmann, A. Kovtunova, Finding small proofs for description logic entailments: Theory and practice, in: E. Albert, L. Kovács (Eds.), LPAR 2020: 23rd International Conference on Logic for Programming, Artificial Intelligence and Reasoning, Alicante, Spain, May 22–27, 2020, volume 73 of EPiC Series in Computing, EasyChair, 2020, pp. 32–67. doi:10.29007/nhpp.
[18] M. Horridge, S. Bail, B. Parsia, U. Sattler, Toward cognitive support for OWL justifications, Knowledge-Based Systems 53 (2013) 66–79. doi:10.1016/j.knosys.2013.08.021.
[19] P. Galliani, O. Kutz, D. Porello, G. Righetti, N. Troquard, On knowledge dependence in weighted description logic, in: Proc. of the 5th Global Conference on Artificial Intelligence (GCAI 2019), 2019, pp. 17–19.
[20] P. Galliani, O. Kutz, N. Troquard, Perceptron operators that count, in: M. Homola, V. Ryzhikov, R. Schmidt (Eds.), Proceedings of the 34th International Workshop on Description Logics (DL 2021), CEUR Workshop Proceedings, CEUR-WS.org, 2021.
[21] G. Righetti, D. Porello, O. Kutz, N. Troquard, C. Masolo, Pink panthers and toothless tigers: Three problems in classification, in: Proc. of the 5th Int. Workshop on Artificial Intelligence and Cognition, Manchester, September 10–11, 2019.
[22] F. Baader, G. Brewka, O. F. Gil, Adding threshold concepts to the description logic ℰℒ, in: C. Lutz, S. Ranise (Eds.), Frontiers of Combining Systems, Springer International Publishing, Cham, 2015, pp. 33–48.
[23] F. Baader, A. Ecke, Reasoning with prototypes in the description logic 𝒜ℒ𝒞 using weighted tree automata, in: Language and Automata Theory and Applications, Springer International Publishing, Cham, 2016, pp. 63–75.
[24] P. Galliani, G. Righetti, O. Kutz, D. Porello, N. Troquard, Perceptron connectives in knowledge representation, in: Proceedings of the 22nd International Conference on Knowledge Engineering and Knowledge Management (EKAW 2020), 2020.
[25] M. T. Ribeiro, S. Singh, C. Guestrin, "Why should I trust you?": Explaining the predictions of any classifier, in: Proc. of the 22nd Int. Conf. on Knowledge Discovery and Data Mining, KDD '16, ACM, 2016, pp. 1135–1144.
[26] M. T. Ribeiro, S. Singh, C. Guestrin, Anchors: High-precision model-agnostic explanations, in: AAAI, AAAI Press, 2018, pp. 1527–1535.
[27] M. Setzu, R. Guidotti, A. Monreale, F. Turini, D. Pedreschi, F. Giannotti, GLocalX – From local to global explanations of black box AI models, Artificial Intelligence 294 (2021) 103457. doi:10.1016/j.artint.2021.103457.
[28] M. W. Craven, J. W. Shavlik, Extracting tree-structured representations of trained networks, in: NIPS 1995, MIT Press, 1995, pp. 24–30.
[29] N. Troquard, R. Confalonieri, P. Galliani, R. Peñaloza, D. Porello, O. Kutz, Repairing ontologies via axiom weakening, in: S. A. McIlraith, K. Q. Weinberger (Eds.), Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), New Orleans, Louisiana, USA, February 2–7, 2018, AAAI Press, 2018, pp. 1981–1988.
[30] D. Porello, N. Troquard, R. Peñaloza, R. Confalonieri, P. Galliani, O. Kutz, Two approaches to ontology aggregation based on axiom weakening, in: J. Lang (Ed.), Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI 2018, July 13–19, 2018, Stockholm, Sweden, ijcai.org, 2018, pp. 1942–1948. doi:10.24963/ijcai.2018/268.
[31] R. Confalonieri, P. Galliani, O. Kutz, D. Porello, G. Righetti, N. Troquard, Towards even more irresistible axiom weakening, in: Proc. of the 33rd International Workshop on Description Logics (DL 2020), Online, September 12–14, 2020, volume 2663 of CEUR Workshop Proceedings, CEUR-WS.org, 2020.
[32] M. Hind, Explaining explainable AI, XRDS 25 (2019) 16–19. doi:10.1145/3313096.
[33] J. Huysmans, K. Dejaeger, C. Mues, J. Vanthienen, B. Baesens, An empirical evaluation of the comprehensibility of decision table, tree and rule based predictive models, Decision Support Systems 51 (2011) 141–154.
[34] R. Piltaver, M. Luštrek, M. Gams, S. Martinčić-Ipšić, What makes classification trees comprehensible?, Expert Syst. Appl. 62 (2016) 333–346.
[35] F. C. Donders, On the speed of mental processes, Acta Psychologica 30 (1969) 412–431.
[36] W. Lidwell, K. Holden, J. Butler, Universal Principles of Design, Rockport, 2003.
[37] E. Mariotti, J. M. Alonso, R. Confalonieri, A framework for analyzing fairness, accountability, transparency and ethics: A use-case in banking services, in: 30th IEEE International Conference on Fuzzy Systems, FUZZ-IEEE 2021, Luxembourg, July 11–14, 2021, IEEE, 2021, pp. 1–6. doi:10.1109/FUZZ45933.2021.9494481.
[38] P. Gervás, E. Concepción, C. León, G. Méndez, P. Delatorre, The long path to narrative generation, IBM J. Res. Dev. 63 (2019) 8:1–8:10.