Twelfth International Workshop Modelling and Reasoning in Context (MRC) @IJCAI 2021

Intrinsic, Dialogic, and Impact Measures of Success for Explainable AI

Jörg Cassens¹, Rebekah Wegener²
¹ University of Hildesheim, 31141 Hildesheim, Germany
² Paris Lodron University, 5020 Salzburg, Austria
cassens@cs.uni-hildesheim.de, rebekah.wegener@sbg.ac.at

Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Abstract
This paper presents a brief overview of requirements for the development and evaluation of human-centred explainable systems. We propose three perspectives on evaluation models for explainable AI that include intrinsic measures, dialogic measures and impact measures. The paper outlines these different perspectives and looks at how the separation might be used for explanation evaluation benchmarking and integration into design and development. We propose several avenues for future work.

1 Explanations
Explanations are foundational to social interaction [Lombrozo, 2006], and numerous different approaches to achieving explainability have been proposed recently [Adadi and Berrada, 2018; Arrieta et al., 2019; Doran et al., 2017]. Criticisms of current research trends include that "accounts of explanation typically define explanation (the product) rather than explaining (the process)" [Edwards et al., 2019]. Another criticism is that explanations are currently largely seen as a relatively uniform and definable concept, and even systems that take user goals for explanation into account treat it largely on the system side of development [Biran and Cotton, 2017]. Despite this, a human-centred [Ehsan and Riedl, 2020] perspective on explanation in artificial intelligence is not new [Shortliffe, 1976; Swartout, 1983; Schank, 1986; Leake, 1992, 1995; Mao and Benbasat, 2000]. For example, Gregor and Benbasat [1999] point out that different user groups have different explanation needs.

We have earlier construed contextualised explanations based on user goals [Sørmo et al., 2005]. This has been used to integrate explanatory needs into the system design process [Roth-Berghofer and Cassens, 2005; Cassens and Kofod-Petersen, 2007]. However, we have represented explanation as a static object rather than a dialogic process. This includes the ability of the technical system to make use of explanations as well, at least as part of the theoretical model, even if not in practical applications.

In our understanding, both human and non-human actors in heterogeneous socio-technical systems (or socio-cognitive ones [Noriega et al., 2015]) can be senders and receivers of explanations [Cassens and Wegener, 2019]. For example, a human should be able to "explain away" recommendations made by a diagnostic system in order to enhance its future performance. While we currently focus on the opposite situation, e.g. an artificial actor explaining its choice of recommendations to the human user, frameworks for designing explanation-aware systems should be able to account for different flows of explanations, at least in principle and by extension.

In order to distinguish this from views that see the machine only as the explainer, not the explainee, we make use of the established term explanation awareness [Roth-Berghofer et al., 2007; Roth-Berghofer and Richter, 2008]. Our working definition is as follows:
• Internal View: explanation as part of the reasoning process itself.
  – Example: a recommender system can use domain knowledge to explain the absence or variation of feature values, e.g. relations between countries.
• External View: giving explanations of the found solution, its application, or the reasoning process to the other actors.
  – Example: the user tells said recommender system why they choose an apartment in Norway despite the system suggesting one in Sweden.
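To make the two views concrete, the following is a minimal, purely illustrative Python sketch: the class and function names, the apartment data and the simple price-based reasoner are assumptions made only for this example and are not part of the framework itself. The internal view attaches explanations produced during reasoning; the external view lets a user's explanation flow back into the system.

```python
# Hypothetical sketch of the two views of explanation awareness.
# All names are illustrative inventions, not an existing API.

from dataclasses import dataclass, field


@dataclass
class Recommendation:
    item: str
    features: dict
    # Internal view: explanations generated as part of the reasoning process,
    # e.g. domain knowledge accounting for a missing feature value.
    internal_explanations: list = field(default_factory=list)


def recommend(apartments, domain_knowledge):
    """Toy reasoner: picks the cheapest apartment and records internal
    explanations for absent feature values using domain knowledge."""
    best = min(apartments, key=lambda a: a["price"])
    rec = Recommendation(item=best["name"], features=best)
    for feature, value in best.items():
        if value is None and feature in domain_knowledge:
            rec.internal_explanations.append(
                f"'{feature}' is missing because {domain_knowledge[feature]}"
            )
    return rec


def receive_user_explanation(rec, user_explanation, preferences):
    """External view with reversed flow: the user explains rejecting the
    recommendation, and the system folds that back into its preferences."""
    preferences.setdefault("user_feedback", []).append(
        {"rejected": rec.item, "because": user_explanation}
    )
    return preferences


if __name__ == "__main__":
    apartments = [
        {"name": "Stockholm flat", "price": 900, "balcony": None},
        {"name": "Oslo flat", "price": 1100, "balcony": True},
    ]
    knowledge = {"balcony": "the Swedish listing portal does not record balconies"}
    rec = recommend(apartments, knowledge)
    print(rec.item, rec.internal_explanations)
    print(receive_user_explanation(rec, "I work in Norway", {}))
```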
Fraassen van used to integrate explanatory needs in the system design pro- Fraassen [1980], comprised of cess [Roth-Berghofer and Cassens, 2005; Cassens and Kofod- – Context Awareness (knowing the situation the sys- Petersen, 2007]. However, we have represented explanation tem is in) and as a static object rather than a dialogic process. This includes – Context Sensitivity (acting according to such situ- the ability of the technical system to make use of explanations ation) Kofod-Petersen and Aamodt [2006]; Kofod- as well, at least as part of the theoretical model, even if not in Petersen and Cassens [2011] practical applications. In our understanding, both human and non-human actors • Multimodal, as argued for by e.g. Halliday Halliday in heterogeneous socio-technical systems (or socio-cognitive, [1978] and being [Noriega et al., 2015]) can be senders and receivers of expla- • Construed by user interest, as noted by e.g. Achinstein nations [Cassens and Wegener, 2019]. For example, a human Achinstein [1983]. Copyright c 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). Twelfth International Workshop Modelling and Reasoning in Context (MRC) @IJCAI 2021 42 Given these foundations, can a semiotic model of explanation De Ruyter and Aarts, 2010; Holtzblatt and Beyer, 2016], to as a form of multi-modal dialogic language behaviour in con- name a few). We should consider principles and methods text be used to generate contextually appropriate explanations for (designing and evaluating) explainability as additions to by computational systems? There is an extensive body of re- existing tool kits, agnostic to their use in established design search focusing on generating and using explanations in AI. processes whenever possible (limited by different ontological Currently, what is lacking is: commitments). 1. A theory of the dialogic process rather than a monologic Evaluation is central to Human-Computer Interaction, or product rather: evaluations are central since they typically form a cy- cle and cover a system at various stages. While (formative 2. A cohesive theory of explanation that is: and summative) evaluations are a cornerstone for human cen- • contextually appropriate (e.g. fitting people, topic, tred design, “it is far from being a solved problem” [MacDon- mode and place), ald and Atwood, 2013]. We are generally in need for evalu- • semantically appropriate (e.g. recognised as an ex- ation processes that are suited for emerging types of applica- planation) tions [Poppe et al., 2007] and for sustainable and responsible • lexicogrammatically optimal (best possible multi- systems development [Remy et al., 2018]. modal realisation) But even if current (usability) evaluation methods [Dumas and Salzman, 2006] may ultimately fall short in the con- 3. A framework for integrating explanatory capabilities in text of XAI, they can at least inform first iterations of eval- the whole software development life-cycle, from re- uation standards. In particular when used in combination quirements elicitation over design and implementation with theories and models from other areas, such as linguistics through to its use [Cassens and Wegener, 2008; Halliday, 1978; Wegener et al., 4. A framework for evaluation measures. 2008], psychology [Kaptelinin, 1996], the cognitive sciences [Keil and Wilson, 2000], or philosophy [Achinstein, 1983; We will focus on the last aspect in the remainder of this pa- van Fraassen, 1980]. per. 
Research still seems fragmented, in particular when it comes to measuring the actual effectiveness and efficiency of explanations given to users. We propose to measure explainability along three lines of inquiry. Intrinsic measures deal with the question of whether the system at hand can generate explanations at all. Dialogic measures look at whether the system's output is seen as an explanation by the users. Finally, impact measures ask whether the explanation generated is of any use. These questions should help to elicit and formalise requirements for explanations as well as to find ways to evaluate solutions that are operationalised sufficiently to enable making claims of explainability that can be tested against and to further comparisons between systems and iterations of systems.

Explanations are needed during the whole life cycle of applications, from initial requirements elicitation over design and development processes to using the final system. Therefore, it makes sense to look at frameworks for measuring the efficiency and effectiveness of explanations in the context of whole development and life cycle management processes. While quality measurements for explanation could eventually enable a final system score (for benchmarking purposes [Zhan et al., 2019]), development is a cycle and it is contextual, and the goal is to be able to build "better" systems through "better" development processes, where explanatory success is part of the success metrics. Given existing requirements for transparency, such a perspective on evaluating explanations can also be part of a regulatory framework for ethical AI [Cath, 2018; Coeckelbergh, 2020; Erdélyi and Goldsmith, 2018].

2 Evaluations
Within HCI, a plethora of different instantiations of human-centred development processes exist (e.g. [Beyer and Holtzblatt, 1997; Carroll, 2000; Cooper et al., 2014; De Ruyter and Aarts, 2010; Holtzblatt and Beyer, 2016], to name a few). We should consider principles and methods for (designing and evaluating) explainability as additions to existing tool kits, agnostic to their use in established design processes whenever possible (limited by different ontological commitments).

Evaluation is central to Human-Computer Interaction, or rather: evaluations are central, since they typically form a cycle and cover a system at various stages. While (formative and summative) evaluations are a cornerstone of human-centred design, "it is far from being a solved problem" [MacDonald and Atwood, 2013]. We are generally in need of evaluation processes that are suited to emerging types of applications [Poppe et al., 2007] and to sustainable and responsible systems development [Remy et al., 2018].

But even if current (usability) evaluation methods [Dumas and Salzman, 2006] may ultimately fall short in the context of XAI, they can at least inform first iterations of evaluation standards, in particular when used in combination with theories and models from other areas, such as linguistics [Cassens and Wegener, 2008; Halliday, 1978; Wegener et al., 2008], psychology [Kaptelinin, 1996], the cognitive sciences [Keil and Wilson, 2000], or philosophy [Achinstein, 1983; van Fraassen, 1980].

In this short paper, we cannot explore these contributions in detail, but we will briefly outline a tripartite model for capturing explanatory effectiveness that includes:
• Intrinsic measures: measures that pertain to the ability of a system to generate explanations. Can the system generate explanations?
• Dialogic measures: measures that pertain to the interaction between the system and its users. Does the system's output work as an explanation for its users?
• Impact measures: measures that pertain to the potential, anticipated or actual impact of explanations. Is the explanation generated of any use?

We have separated these measures because each of the three types has different methods for testing, and they cover distinct aspects of what "explanatory success" can mean. It is only by combining these different perspectives that we can get a full picture of the explanatory performance of a system and of the explanations that are a part of that system. While we can think of more perspectives, it is important to keep in mind that quality measures have to have a well-defined scope and they need to be, indeed, measurable [Carvalho et al., 2017]. Furthermore, for them to be able to improve processes in practice, they need to be sufficiently simple to apply.
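As one possible, non-prescriptive way of operationalising the three perspectives across development iterations, the following sketch records scores per perspective instead of collapsing them into a single number; the field names, example metrics and the per-perspective averaging are our own assumptions for illustration, not something the framework prescribes.

```python
# Hedged sketch: one way to record scores along the three proposed perspectives.
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class ExplanatoryEvaluation:
    system: str
    iteration: int
    # Intrinsic: can the system generate explanations at all?
    intrinsic: dict = field(default_factory=dict)
    # Dialogic: does the output work as an explanation for its users?
    dialogic: dict = field(default_factory=dict)
    # Impact: is the generated explanation of any use?
    impact: dict = field(default_factory=dict)

    def summary(self):
        """Per-perspective means, kept separate so a single benchmark number
        never hides which perspective a system fails on."""
        groups = {"intrinsic": self.intrinsic,
                  "dialogic": self.dialogic,
                  "impact": self.impact}
        return {name: (mean(scores.values()) if scores else None)
                for name, scores in groups.items()}


if __name__ == "__main__":
    e = ExplanatoryEvaluation(
        system="demo-recommender", iteration=3,
        intrinsic={"traceable_components": 0.8, "model_interpretability": 0.6},
        dialogic={"recognised_as_explanation": 0.7},
        impact={"decision_quality_gain": 0.15},
    )
    print(e.summary())
```

Keeping the three score groups separate mirrors the argument above: a benchmark score is only useful if it can be decomposed again into the perspective on which a system falls short.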
2.1 Intrinsic Measures
These measure the ability of the system to generate explanations, both generally for the given context of use and specifically with respect to the transparency and interpretability of the system itself or of aspects of the system, such as the ML models and data used as well as algorithmic and other design choices.

If a system or parts of a system are not transparent, then it is unlikely to perform well on either dialogic or impact measures. We can think of intrinsic measures as a baseline for explainable AI: they are a necessary, but not sufficient, condition. From a design process perspective, we will need to look at which components are necessary for explanation generation [Roth-Berghofer and Cassens, 2005]. When evaluating, we might explore the structure, modality and semantic characteristics of the different explanations to ensure that they are optimised for the situation. There are different specific methods that might be useful for intrinsic measures.
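Purely as an illustration of what such a method could look like, the sketch below frames an intrinsic measure as a simple audit checklist. The individual checklist items are examples we derived from the aspects named above (components, model transparency, data provenance, design choices); they are not a validated instrument and would need to be defined per project.

```python
# Illustrative sketch only: a checklist-style intrinsic audit.
INTRINSIC_CHECKLIST = [
    "every user-facing recommendation can be traced to a reasoning step",
    "the ML model (or a surrogate) exposes feature-level attributions",
    "training and input data provenance is documented",
    "algorithmic and design choices are recorded with rationales",
]


def intrinsic_audit(answers):
    """answers maps checklist items to True/False; returns the share of
    satisfied items as a crude baseline score plus the remaining gaps."""
    satisfied = [item for item in INTRINSIC_CHECKLIST if answers.get(item)]
    gaps = [item for item in INTRINSIC_CHECKLIST if not answers.get(item)]
    return len(satisfied) / len(INTRINSIC_CHECKLIST), gaps


if __name__ == "__main__":
    score, gaps = intrinsic_audit({INTRINSIC_CHECKLIST[0]: True,
                                   INTRINSIC_CHECKLIST[1]: True})
    print(score, gaps)
```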
2.2 Dialogic Measures
Here we look at the question of whether that which has been generated actually works as an explanation for the user, in various conditions, situations and contexts. Under investigation is the shared semiotic process of explanation generator and explanation consumer. Different methods are going to be useful for dialogic measures, including user studies, reaction studies, experimental studies, and qualitative and quantitative methods in general. Explanations are inherently dialogic, so we are always going to want to know who is requesting the explanation, who is providing the explanation, and how and why they are providing it. Tracking the exchange of information itself is a way to evaluate because it lets us see the reaction to the explanation.

Trustworthy AI could be an outcome of systems that score highly on dialogic measures. This does not mean that trustworthy systems will score well on impact measures; indeed, human and non-human agents are quite prepared to trust a system that may have negative impacts on their wellbeing. Trust can be engendered through a dialogically well-performing malicious system, and this is what makes impact measures so essential.
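As a hedged sketch of how such tracking of the exchange could be operationalised, the following assumes a hypothetical per-episode log format and derives three simple ratios from it; the event fields and the ratios are illustrative assumptions, not validated instruments.

```python
# Hypothetical sketch of scoring a dialogic measure from a user-study log.
def dialogic_scores(exchange_log):
    """exchange_log: list of dicts, one per explanation episode, e.g.
    {"requested_by": "user", "recognised_as_explanation": True,
     "follow_up_questions": 1, "comprehension_check_passed": True}"""
    n = len(exchange_log)
    if n == 0:
        return {}
    return {
        "recognition_rate": sum(e.get("recognised_as_explanation", False)
                                for e in exchange_log) / n,
        "comprehension_rate": sum(e.get("comprehension_check_passed", False)
                                  for e in exchange_log) / n,
        "mean_follow_ups": sum(e.get("follow_up_questions", 0)
                               for e in exchange_log) / n,
    }


if __name__ == "__main__":
    log = [
        {"requested_by": "user", "recognised_as_explanation": True,
         "follow_up_questions": 0, "comprehension_check_passed": True},
        {"requested_by": "system", "recognised_as_explanation": False,
         "follow_up_questions": 2, "comprehension_check_passed": False},
    ]
    print(dialogic_scores(log))
```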
2.3 Impact Measures
Impact measures look at whether providing explanations offers benefits over the use of the system itself. These can be used both on an individual level and for larger systems.

For example, on the individual level, we might consider an adaptive learning system that offers explanations to further the learning goal [Sørmo et al., 2005] a user might have. While dialogic measures can be used to evaluate whether such an explanation can function as an explanation to the student, it would remain unclear whether the explanation did actually improve learning outcomes.

These measures also look at the impact that the system can have in the world. How can it impact decisions, diagnoses, legal and access outcomes? The impact measures examine the potential, anticipated or actual impact of the system and the ability of the system to explain these repercussions to users in context. Here the concept of contextual AI is important because, as Ehsan and Riedl argue, "if we ignore the socially situated nature of our technical systems, we will only get a partial and unsatisfying picture" [Ehsan and Riedl, 2020]. A good model of context is crucial for evaluating explanatory success [Kofod-Petersen and Cassens, 2007; Wegener et al., 2008]. Ethical AI would be the outcome of a system that scores highly on impact measures. We would of course aim for beneficial and equitable AI, but ethical is at least a good baseline outcome. Here we might expect to see methods such as impact studies and hypothetical, scenario and risk modelling. It would be beneficial to know what the anticipated consequences of the explanation are for everyone involved.
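For the adaptive learning example above, an impact measure could, for instance, compare outcomes between learners who did and did not receive explanations. The sketch below assumes such a between-groups setup with made-up scores; a real study would add significance testing and contextual variables.

```python
# Sketch under assumptions: a between-groups impact measure for the
# adaptive learning example. Data and the plain mean difference are
# illustrative only.
from statistics import mean


def impact_effect(outcomes_with_explanations, outcomes_without):
    """Difference in mean learning outcome between learners who received
    explanations and those who did not (positive = explanations helped)."""
    return mean(outcomes_with_explanations) - mean(outcomes_without)


if __name__ == "__main__":
    with_expl = [0.72, 0.81, 0.69, 0.77]      # e.g. post-test scores
    without_expl = [0.70, 0.74, 0.66, 0.71]
    print(f"estimated impact of explanations: "
          f"{impact_effect(with_expl, without_expl):+.3f}")
```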
3 Related Work
Mohseni et al. [2018] argue that the interdisciplinary nature of explainable artificial intelligence (XAI) "poses challenges for identifying appropriate design and evaluation methodology and consolidating knowledge across efforts". At the same time, this interdisciplinary approach is essential to the success of XAI. We view our suggestion as a way to complement, further consolidate, and operationalise their classification system for different goals in XAI.

Hoffman et al. [2018] propose a process model of explaining and suggest measures that are applicable in the different phases of their conceptual model. This complements our (more abstract) notions of dialogic and (to a lesser degree) impact measures, whereas we see our notion of intrinsic measures as a prerequisite for their model. Both models can be systematically combined, depending on the need for granularity and the aspects covered. Mueller et al. [2021] present some helpful higher-level psychological considerations that can serve as general templates for effective explanations.

Sokol and Flach [2020] introduce fact sheets with an extensive list of properties for different explanatory methods. This is complementary to our approach and could be used to select methods supporting the measures chosen. A survey by Carvalho et al. [2019] on interpretability in machine learning is orthogonal to our model, with their results being useful for the operationalisation of the intrinsic measures (e.g. their comparison of different methods) and the dialogic measures (e.g. the notion of explanation properties).

4 Conclusion
We propose a tripartite perspective on explanation in intelligent systems that aligns with (iterative and contextual) design and development processes of systems such that there is space for formative and summative evaluations. While it enables a final system score (which we propose for benchmarking purposes [Zhan et al., 2019]), development is a cycle and it is contextual, and the goal is to be able to build "better" systems, where explanatory success is part of the success metrics.

We have previously discussed the potential for Ambient Intelligence to be useful for creating explainable AI [Cassens and Wegener, 2019], particularly on the architecture level and with regard to the capabilities subsumed [De Ruyter and Aarts, 2010]. We propose that the core characteristics and general architecture of ambient intelligent systems make them a good framework for developing XAI and that AmI systems themselves have the potential to become explanatory agents that can be mediators between humans and other systems. The concept of mediating explanatory instances has also been explored in the context of virtual explanatory agents [Weitz et al., 2020] or as a user-specific "memory" of explanations [Chaput et al., 2021].

Development of such mediators, concentrating explanatory capabilities in specialised agents that are contextually embedded in our surroundings and have the potential for personalisation and anticipatory interaction, could greatly benefit from a cohesive framework for measuring explanatory success from different perspectives.
References
Peter Achinstein. The Nature of Explanation. Oxford University Press, Oxford, 1983.
Amina Adadi and Mohammed Berrada. Peeking inside the black-box: a survey on explainable artificial intelligence (XAI). IEEE Access, 6:52138–52160, 2018.
Alejandro Barredo Arrieta, Natalia Díaz-Rodríguez, Javier Del Ser, Adrien Bennetot, Siham Tabik, Alberto Barbado, Salvador García, Sergio Gil-López, Daniel Molina, Richard Benjamins, Raja Chatila, and Francisco Herrera. Explainable artificial intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. arXiv preprint: 1910.10045, 2019.
Hugh Beyer and Karen Holtzblatt. Contextual Design: Defining Customer-Centered Systems. Elsevier, 1997.
Or Biran and Courtenay Cotton. Explanation and justification in machine learning: A survey. In IJCAI-17 Workshop on Explainable AI (XAI), 2017.
John M. Carroll. Making Use: Scenario-Based Design of Human-Computer Interactions. MIT Press, 2000.
Rainara Maia Carvalho, Rossana Maria de Castro Andrade, Káthia Marçal de Oliveira, Ismayle de Sousa Santos, and Carla Ilane Moreira Bezerra. Quality characteristics and measures for human–computer interaction evaluation in ubiquitous systems. Software Quality Journal, 25(3):743–795, 2017.
Diogo V. Carvalho, Eduardo M. Pereira, and Jaime S. Cardoso. Machine learning interpretability: A survey on methods and metrics. Electronics, 8(8):832, 2019.
Jörg Cassens and Anders Kofod-Petersen. Explanations and case-based reasoning in ambient intelligent systems. In David C. Wilson and Deepak Khemani, editors, ICCBR-07 Workshop Proceedings, pages 167–176, Belfast, Northern Ireland, 2007.
Jörg Cassens and Rebekah Wegener. Making use of abstract concepts – systemic-functional linguistics and ambient intelligence. In Max Bramer, editor, Artificial Intelligence in Theory and Practice II – IFIP 20th World Computer Congress, IFIP AI Stream, volume 276 of IFIP, pages 205–214, Milano, Italy, 2008. Springer.
Jörg Cassens and Rebekah Wegener. Ambient explanations: Ambient intelligence and explainable AI. In Ioannis Chatzigiannakis, Boris De Ruyter, and Irene Mavrommati, editors, Proceedings of AmI 2019 – European Conference on Ambient Intelligence, LNCS, Rome, Italy, November 2019. Springer.
Corinne Cath. Governing artificial intelligence: ethical, legal and technical opportunities and challenges, 2018.
Rémy Chaput, Amélie Cordier, and Alain Mille. Explanation for humans, for machines, for human-machine interactions? In WS Explainable Agency in Artificial Intelligence at AAAI 2021, pages 145–152, 2021.
Mark Coeckelbergh. AI Ethics. MIT Press, 2020.
Alan Cooper, Robert Reimann, David Cronin, and Christopher Noessel. About Face (fourth edition): The Essentials of Interaction Design. John Wiley & Sons, 2014.
Boris De Ruyter and Emile Aarts. Experience research: a methodology for developing human-centered interfaces. In Handbook of Ambient Intelligence and Smart Environments, pages 1039–1067. Springer, 2010.
Derek Doran, Sarah Schulz, and Tarek R. Besold. What does explainable AI really mean? A new conceptualization of perspectives. arXiv preprint: 1710.00794, 2017.
Joseph S. Dumas and Marilyn C. Salzman. Usability assessment methods. Reviews of Human Factors and Ergonomics, 2(1):109–140, 2006.
Brian J. Edwards, Joseph J. Williams, Dedre Gentner, and Tania Lombrozo. Explanation recruits comparison in a category-learning task. Cognition, 185:21–38, 2019.
Upol Ehsan and Mark O. Riedl. Human-centered explainable AI: Towards a reflective sociotechnical approach. arXiv preprint: 2002.01092, 2020.
Olivia J. Erdélyi and Judy Goldsmith. Regulating artificial intelligence: Proposal for a global solution. In Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, AIES '18, pages 95–101, New York, NY, USA, 2018. Association for Computing Machinery.
Shirley Gregor and Izak Benbasat. Explanations from intelligent systems: Theoretical foundations and implications for practice. MIS Quarterly, 23(4):497–530, 1999.
Michael A. K. Halliday. Language as a Social Semiotic: The Social Interpretation of Language and Meaning. University Park Press, 1978.
Robert R. Hoffman, Shane T. Mueller, Gary Klein, and Jordan Litman. Metrics for explainable AI: Challenges and prospects. arXiv preprint: 1812.04608, 2018.
Karen Holtzblatt and Hugh Beyer. Contextual Design: Design for Life. Morgan Kaufmann, 2016.
Viktor Kaptelinin. Activity theory: Implications for human-computer interaction. In Bonnie A. Nardi, editor, Context and Consciousness, pages 103–116. MIT Press, Cambridge, MA, 1996.
Frank C. Keil and Robert A. Wilson. Explaining explanation. In Explanation and Cognition, pages 1–18. Bradford Books, 2000.
Anders Kofod-Petersen and Agnar Aamodt. Contextualised ambient intelligence through case-based reasoning. In Thomas R. Roth-Berghofer, Mehmet H. Göker, and H. Altay Güvenir, editors, Proceedings of the Eighth European Conference on Case-Based Reasoning (ECCBR 2006), volume 4106 of LNCS, pages 211–225, Berlin, September 2006. Springer.
Anders Kofod-Petersen and Jörg Cassens. Explanations and context in ambient intelligent systems. In Boicho Kokinov, Daniel C. Richardson, Thomas R. Roth-Berghofer, and Laure Vieu, editors, Modeling and Using Context – CONTEXT 2007, volume 4635 of LNCS, pages 303–316, Roskilde, Denmark, 2007. Springer.
Anders Kofod-Petersen and Jörg Cassens. Modelling with problem frames: Explanations and context in ambient intelligent systems. In Michael Beigl, Henning Christiansen, Thomas R. Roth-Berghofer, Kenny R. Coventry, Anders Kofod-Petersen, and Hedda R. Schmidtke, editors, Modeling and Using Context – Proceedings of CONTEXT 2011, volume 6967 of LNCS, pages 145–158, Karlsruhe, Germany, 2011. Springer.
David B. Leake. Evaluating Explanations: A Content Theory. Lawrence Erlbaum Associates, New York, 1992.
David B. Leake. Goal-based explanation evaluation. In Goal-Driven Learning, pages 251–285. MIT Press, Cambridge, 1995.
Tania Lombrozo. The structure and function of explanations. Trends in Cognitive Sciences, 10(10):464–470, 2006.
Craig M. MacDonald and Michael E. Atwood. Changing perspectives on evaluation in HCI: Past, present, and future. In CHI '13 Extended Abstracts on Human Factors in Computing Systems, CHI EA '13, pages 1969–1978, New York, NY, USA, 2013. Association for Computing Machinery.
Ji-Ye Mao and Izak Benbasat. The use of explanations in knowledge-based systems: Cognitive perspectives and a process-tracing analysis. Journal of Management Information Systems, 17(2):153–179, 2000.
Tim Miller. Explanation in artificial intelligence: Insights from the social sciences. Artificial Intelligence, 2018.
Sina Mohseni, Niloofar Zarei, and Eric D. Ragan. A multidisciplinary survey and framework for design and evaluation of explainable AI systems. arXiv preprint: 1811.11839, 2018.
Shane T. Mueller, Elizabeth S. Veinott, Robert R. Hoffman, Gary Klein, Lamia Alam, Tauseef Mamun, and William J. Clancey. Principles of explanation in human-AI systems. In WS Explainable Agency in Artificial Intelligence at AAAI 2021, pages 153–162, 2021.
Pablo Noriega, Julian Padget, Harko Verhagen, and Mark D'Inverno. Towards a framework for socio-cognitive technical systems. In A. Ghose, N. Oren, P. Telang, and J. Thangarajah, editors, Coordination, Organizations, Institutions, and Norms in Agent Systems X, LNCS, pages 164–181. Springer, 2015.
Ronald Poppe, Rutger Rienks, and Betsy Dijk. Evaluating the future of HCI: Challenges for the evaluation of emerging applications. Volume 4451 of LNCS, pages 234–250, 2007.
Christian Remy, Oliver Bates, Jennifer Mankoff, and Adrian Friday. Evaluating HCI research beyond usability. In Extended Abstracts of the 2018 CHI Conference, pages 1–4, 2018.
Thomas R. Roth-Berghofer and Jörg Cassens. Mapping goals and kinds of explanations to the knowledge containers of case-based reasoning systems. In Héctor Muñoz-Avila and Francesco Ricci, editors, Case-Based Reasoning Research and Development – ICCBR 2005, volume 3630 of LNAI, pages 451–464, Chicago, 2005. Springer.
Thomas Roth-Berghofer and Michael M. Richter. On explanation. Künstliche Intelligenz, 22(2):5–7, 2008.
Thomas Roth-Berghofer, Stefan Schulz, David B. Leake, and Daniel Bahls. Explanation-aware computing. AI Magazine, 28(4):122, 2007.
Roger C. Schank. Explanation Patterns – Understanding Mechanically and Creatively. Lawrence Erlbaum, New York, 1986.
Edward H. Shortliffe. Computer-Based Medical Consultations: MYCIN. New York, 1976.
Kacper Sokol and Peter Flach. Explainability fact sheets: a framework for systematic assessment of explainable approaches. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, pages 56–67, 2020.
William R. Swartout. What kind of expert should a system be? XPLAIN: A system for creating and explaining expert consulting programs. Artificial Intelligence, 21:285–325, 1983.
Frode Sørmo, Jörg Cassens, and Agnar Aamodt. Explanation in case-based reasoning – perspectives and goals. Artificial Intelligence Review, 24(2):109–143, October 2005.
Bas C. van Fraassen. The Scientific Image. Clarendon Press, Oxford, 1980.
Rebekah Wegener, Jörg Cassens, and David Butt. Start making sense: Systemic functional linguistics and ambient intelligence. Revue d'Intelligence Artificielle, 22(5):629–645, 2008.
Katharina Weitz, Dominik Schiller, Ruben Schlagowski, Tobias Huber, and Elisabeth André. "Let me explain!": exploring the potential of virtual agents in explainable AI interaction design. Journal on Multimodal User Interfaces, pages 1–12, 2020.
Jianfeng Zhan, Lei Wang, Wanling Gao, and Rui Ren. BenchCouncil's view on benchmarking AI and other emerging workloads. arXiv preprint: 1912.00572, 2019.