ASCODI: An XAI-based interactive reasoning support system for justifiable medical diagnosing

Dominik Battefeld1,∗, Felix Liedeker2, Philipp Cimiano2 and Stefan Kopp1
1 Social Cognitive Systems Group, CITEC, Bielefeld University, Inspiration 1, 33619 Bielefeld, Germany
2 Semantic Computing Group, CITEC, Bielefeld University, Inspiration 1, 33619 Bielefeld, Germany

Abstract
Research has shown that approximately 10% of medical diagnoses are wrong. As a direct consequence, appropriate medical treatment may be delayed or even absent, leading to an increased burden on patients, increased costs for medication, or even harm and death. As cognitive biases contribute to roughly three-quarters of these diagnostic errors, a lot can be gained from increasing a physician's reasoning quality during the diagnostic process. Clinical decision support systems (CDSS), leveraging recent advances in artificial intelligence (AI) and insights from eXplainable AI (XAI), aim at providing accurate predictions and prognoses paired with corresponding ex-post explanations that make the reasoning of the system accessible to humans. Viewing explanations as involving the interactive construction of shared belief, we propose to move from diagnostic decision support to reasoning support which, in its true sense, needs to tailor the timing and content of generated explanations to the state of the reasoning process of physicians to meet their information needs and effectively mitigate the influence of cognitive biases. We claim that, given the uncertain and incomplete information inherent to medical diagnosis, the most effective way not to fall prey to cognitive reasoning errors is to establish and maintain proper justification for each decision throughout the diagnostic process. This paper contributes (1) a conceptual model and desiderata for AI-based interactive reasoning support that enhances reasoning quality through increased justification at every stage of the process, and (2) preliminary work on the development of the assistive, co-constructive differential diagnosis system ASCODI, which provides reactive as well as proactive reasoning support to improve the justification of actions taken and decisions made during and after medical diagnosing. We also present selected use cases of ASCODI concerning its application in supporting the diagnosis of transient loss of consciousness and highlight their connection back to the theoretical concepts established.

Keywords
Interactive CDSS, Reasoning Support, Differential Diagnosis, Diagnosis Justification

ECAI'24: Workshop on Multimodal, Affective and Interactive eXplainable AI (MAI-XAI), October 20, 2024, Santiago de Compostela, Spain
∗ Corresponding author.
dbattefeld@techfak.uni-bielefeld.de (D. Battefeld); fliedeker@techfak.uni-bielefeld.de (F. Liedeker); cimiano@techfak.uni-bielefeld.de (P. Cimiano); skopp@techfak.uni-bielefeld.de (S. Kopp)
0000-0002-5480-0594 (D. Battefeld); 0009-0006-2556-9430 (F. Liedeker); 0000-0002-4771-441X (P. Cimiano); 0000-0002-4047-9277 (S. Kopp)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1. Introduction

Medical diagnoses are key to managing and curing diseases [1], and hinge on a solid categorization of a patient's symptoms. If done incorrectly, the selected categorization and the subsequent ineffective treatment based on it may lead to systemic issues like unnecessary test procedures [2]
or serious patient harm by missing so-called red-flag indications of severe diseases [3]. But given the overwhelming number of diseases to consider, diagnostic tests to conduct, and questions to ask [4], medical diagnostic reasoning over inherently uncertain information is a challenging and error-prone task. As many diagnostic errors arise from faulty information processing and synthesis [5], clinical decision support systems (CDSS), with their contemporary focus on machine learning and AI, have the potential to increase diagnostic accuracy and thus patient safety [4]. But given the high stakes of diagnostic decisions, generating an explanation for uncertain predictions within CDSS should be mandatory [6], lifting the targeted decision support to actual reasoning support [7]. However, a systematic review by Antoniadi et al. found "an overall distinct lack of application of XAI in the context of CDSS and, in particular, a lack of user studies exploring the needs of clinicians" [6, p. 1]. This is, in turn, problematic because cognitive factors contribute to roughly three-quarters of diagnostic errors [5], which has been framed as "the challenge of cognitive science for medical diagnosis" [8]. Over the years, several de-biasing techniques have been studied to mitigate the influence of cognitive reasoning errors by enforcing metacognition [9], checklists [10] or rules of thumb [11], but their effectiveness in empirical studies remains mixed [12].

This paper proposes to combine approaches to decision/reasoning support and de-biasing strategies in a bias-aware interactive reasoning support system. The system aims to increase the objective justification of each action taken or decision made during diagnosing by monitoring the reasoning process of its user in the background. Leveraging these insights enables the system to provide reactive as well as proactive reasoning support tailored to the information needs of its user at any given point within their reasoning process. In this paper, we present our conceptual understanding of interactive reasoning support through six theoretical desiderata and ongoing work on their implementation in the interactive reasoning support system ASCODI, designed for neurologists diagnosing patients with transient loss of consciousness.

2. Related Work

Even before the rise of XAI, the potential of CDSS to improve diagnosis was extensively discussed in the literature [13]. At the same time, however, the usage of CDSS has been limited and remains underwhelming, mainly due to challenges arising from the integration into everyday clinical practice [13, 4]. In some cases, the potential is completely misjudged and CDSS are even perceived by physicians as a threat to their job [14], making the development of co-constructive, "doctor-in-the-loop" [15, p. 2] support systems even more important. In addition, it has been argued both in general [16] and for the special case of AI-based decision support [17] that inherently interpretable white-box models (Bayesian networks [18], decision trees [19], etc.) are preferable to black-box models (deep neural networks [20], etc.). A general need for explainability, as well as interpretability, of machine learning applications has furthermore been recognized [21]. Nonetheless, only a small number of CDSS have been developed with a focus on explainability [6]. Most of the current state-of-the-art systems rely on black-box models with post-hoc explanations [22].
Many current applications of XAI in the field of medical diagnosis revolve around medical image analysis, e.g. the detection of COVID-19 from X-ray images [23]. In this domain, feature attribution methods, which highlight the input features most relevant to the instance to be explained, are particularly widespread [24]. Outside of image analysis, counterfactual explanations (CFs) are widely used because they are close to human-like approaches to explaining [25]. CFs provide a hypothetical, yet similar, counterexample that would change the prediction of a machine learning model, thereby explaining its decision. In practice, different CDSS have utilized CFs, e.g. [26]. A growing number of researchers have recently outlined the importance of human factors in XAI design along with the development of human-centered XAI applications [27, 15, 28], including CDSS [29]. In this context, there is also an increased effort to use interactive and "co-constructive" approaches to explanation processes [30], which has led to the development of multiple interactive XAI tools [31, 32, 33].

The human-centered side of XAI is also influenced by the cognitive biases of physicians that have repeatedly been shown to accompany medical diagnostic reasoning [9, 34, 35, 36, 11]. Here, an anchoring bias (clinging to a hypothesis despite contradictory information), an availability bias (generating hypotheses that readily come to mind), a confirmation bias (neglecting possibly contradictory information during information exploration), premature closure (ending information exploration too early), and overconfidence (perceiving evidential strength as higher than it is) are prominent and prevalent examples [34]. Beyond these more commonly known instances, Croskerry identified a total of 50 cognitive biases associated with diagnosis and medical practice in general [37]. Graber et al. showed that such biases affected 74% of all pathologically confirmed diagnostic errors under review [5]. Today, minimizing the influence of cognitive errors is seen as a major lever to improve the quality of care [4, 8] and ease the burden on the medical system [38].

Multiple solutions have been proposed over the years, from checklists during information exploration [10] to metacognitive approaches like the "dramatic big five", which formalize diagnostic options for acute thoracic pain [11], inducing critical thinking [12] or cognitive forcing strategies as algorithmic step-by-step guides [9], technological interventions like providing visual interpretations of statistics [12], and motivational interventions by holding people personally accountable [12]. Their observed benefits remain mixed: checklists improve reasoning on difficult cases but worsen it on simpler ones [10], and reflection out of self-induced motivation fails to translate into a sustainable effect [39], leading to an improvement in only 50% of cases [12].

We argue that de-biasing strategies will become more effective if they are designed to align with the already existing psychological reasoning process of physicians. Checklists and differential diagnosis generators may present as helpful external sources of information, but as many errors remain of cognitive origin, increasing the cognitive load on physicians by adding even more uncertain sources of information will not suffice.
The ASCODI system is thus designed not to provide support for a physician's decision but to co-constructively form an explainable diagnosis with the physician, taking a collaborative part throughout the process to reduce instead of increase cognitive load.

3. Interactive Reasoning Support: Desiderata

We conceive of interactive reasoning support for medical diagnosing as an ongoing interaction of proactive and reactive actions from both the physician and the support system. To sharpen this concept theoretically, we propose six general desiderata as guiding principles: (1) synchronizing the observable reasoning state between physician and system on a shared dashboard, (2) monitoring and inferring the hidden cognitive reasoning state of the physician, (3) supporting an informed choice of reasoning direction by the physician based on their information needs, (4) providing feedback on the chosen reasoning direction, (5) suggesting alternative reasoning directions, and (6) displaying warnings in case of potential pitfalls.

Desideratum 1: Shared Dashboard
Physician and system should collaborate on a shared dashboard that summarizes the observable state of the reasoning process, i.e. which information about the patient is already known and how this information has already been integrated to form all currently active hypotheses. The idea behind the dashboard draws from the concept of extended cognition [40], where physical information-storing entities like written formulas or diaries can play an active part in a cognitive process, and aims to establish sufficient grounding [41] about the reasoning state between both interaction parties.

Desideratum 2: Monitored Reasoning
The system should monitor the behavior of its user to detect the hidden state of the reasoning process, i.e. assumptions that are not explicitly stated or cognitive biases. This detection can be realized by building a computational cognitive process model [42] that formalizes how the dynamically changing mental state of the user leads to observable actions. Subsequently inverting the process model to predict mental states from observable information queries and hypothesis revisions via inverse planning [43] enables the detection of cognitive biases. This is a crucial capability of interactive reasoning support because the system can only tailor explanations towards the mitigation of specific reasoning errors if the presence of errors is acknowledged in the first place.

Desideratum 3: Information Needs
Physicians should be able to express information needs targeting the diagnostic problem or the interactive system itself and receive an appropriate response from the system, e.g. "Which symptoms cannot be explained by our current hypotheses?" or "Why do you judge hypothesis 𝑋 as less likely than me?". Multiple studies have found and categorized information needs within the diagnostic process, e.g. [44, 45]. Further information needs arise from the interaction with an AI system itself and the requirement to understand the system's actions and decisions [21]. Satisfaction of these needs through justified explanations that physicians can integrate into their reasoning will lead to increased problem understanding, and this in turn will lead to increased reasoning quality [7].

Desideratum 4: Behavioral Feedback
Physicians should receive feedback on their reasoning behavior, e.g. a visual marker to display (dis-)agreement with the current state of each hypothesis, whose validity can be justified on demand by the system.
This leads to an active exchange of arguments where both the physician and the system align their assessment of the current situation to combine the capabilities of AI in statistical reasoning with those of physicians in constructing and contextualizing a "picture of the patient" [7].

Figure 1: A schematic view of the interactive reasoning support for medical diagnosing within ASCODI.

Desideratum 5: Situational Suggestions
Physicians should receive justified suggestions for their reasoning behavior, i.e. unknown information to explore or hypotheses to generate. Productivity in interaction thrives if both parties proactively participate [46], and by suggesting alternative reasoning directions, insufficiently justified restrictions of options, as observable for a confirmation bias (neglecting possibly contradictory information) or an anchoring bias (clinging to a hypothesis despite contradictory information), can be effectively mitigated.

Desideratum 6: Red Lines
Physicians should be warned clearly when crossing red lines within the reasoning process, e.g. trying to reject a viable diagnostic hypothesis or trying to commit to an insufficiently justified diagnosis. While it would be legally and ethically infeasible to set up red lines that cannot be overruled, because physicians are the ones carrying the legal and epistemic responsibility for the outcome of the reasoning process and its implications, expressing the system's clear disagreement with the decision is expected to trigger reflection. By adding this extra reasoning step, the decision - even if unchanged - is promoted to a product of consciously selected thought.

4. System Description: ASCODI

We present the assistive, co-constructive differential diagnosis system (ASCODI) to implement the above-described general desiderata. A schematic overview of the architectural layout is shown in Figure 1. ASCODI provides interactive reasoning support for experienced neurologists during the diagnosis of patients suffering from transient loss of consciousness (TLOC) [47].

Figure 2: The user interface of ASCODI with its three main components for information exploration (left), hypothesis revisions (top right), and system interaction (bottom right).

Diagnosis of TLOC places a high emphasis on the cognitive categorization and contextualization capabilities of physicians because the presence or absence of specific symptoms alone cannot warrant a decision towards or against any differential diagnosis and information is mainly obtained through subjective, personal dialogue rather than objective test results [48] - rendering it an ideal application area. Within the user interface of ASCODI (see Figure 2), physicians can work on one case at a time by exploring information about the patient, generating and refining hypotheses about the presence or absence of specific diseases, and receiving reactive as well as proactive reasoning support.

Clinical Vignettes
Medical cases within ASCODI are stored as pre-defined clinical vignettes. A clinical vignette is a set of variable-value pairs for each queryable piece of information, ranging from biographic information and medical history to current complaints and results of medical examinations. All vignettes are based on publicly available patient data [49], published case reports [50, 51, 52], and confidential electronic health records of patients at the Ruhr-Epileptology in Bochum.
Each piece of information is associated with a certainty on a 4-point Likert scale and a patient response that formulates the value of the variable in natural language. All vignettes consist of the same set of variables, with values chosen to fit the particular medical case. The variables are ordered hierarchically in a tree-like graph structure, e.g. the patient experiencing pain is a parent node while the location and the duration of that pain are two child nodes. Information exploration thus means to traverse the patient graph.

Shared Dashboard
Each diagnostic process within the ASCODI system is carried out on a shared dashboard (see Figure 2). The design of the dashboard is based on a monitoring tool to elicit step-by-step cognitive process trajectories of physicians for empirical data collection [53]. The dashboard summarizes the current state of the reasoning process, i.e. already explored information and the current state of all hypotheses. Every diagnosis starts with a short self-report of the patient about their sex, age, and initial complaints. Then, the physician can prompt the patient for biographic information, medical history, current complaints, and other typical anamnesis questions, conduct sophisticated examinations, and request lab reports by entering the name of a specific variable as stored in the clinical vignette. All of this information can be requested at any time and in any order. Explored information is displayed in semantic categories: child nodes of the root of the patient graph are top-level entries, and detailed information (e.g. the location and duration of experienced pain) is grouped in the corresponding entry. Given each entry, physicians can issue follow-up questions on specific symptoms by querying child nodes of already explored variables to exhaustively explore their manifestation. Variables can also be queried by aliases to account for potential synonyms or slight differences between names of the same variable, and suggestions are displayed based on the text entered to ensure that users find what they are looking for. During information exploration, physicians can generate new hypotheses, adjust the certainty of existing ones, and reject unreasonable hypotheses. By default, all possible hypotheses are viewed as inactive until the user explicitly generates them. Each hypothesis is understood as an argument towards or against the presence of a disease. Thus, the user not only has to generate potential diagnoses but also to connect the already explored information with each hypothesis via drag-and-drop, which creates isolated disease arguments visible to the user and the system.
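To make the hierarchical vignette structure and the alias-based exploration described above more concrete, the following minimal Python sketch shows one possible way such a patient graph could be represented and traversed. The node names, fields, and example values are illustrative assumptions for this paper, not the actual ASCODI data model.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VignetteNode:
    """One queryable variable of a clinical vignette (illustrative sketch)."""
    name: str          # canonical variable name, e.g. "pain"
    value: object      # value chosen to fit the medical case
    certainty: int     # 4-point Likert scale (1 = unsure, 4 = certain)
    response: str      # patient answer in natural language
    aliases: tuple = ()                             # synonyms used for lookup
    children: list = field(default_factory=list)    # follow-up variables

    def find(self, query: str) -> Optional["VignetteNode"]:
        """Depth-first lookup by canonical name or alias (case-insensitive)."""
        q = query.lower()
        if q == self.name.lower() or q in (a.lower() for a in self.aliases):
            return self
        for child in self.children:
            hit = child.find(q)
            if hit is not None:
                return hit
        return None


# A tiny hypothetical vignette fragment: "pain" with two child nodes.
pain = VignetteNode(
    "pain", True, 4, "Yes, I have been having pain.",
    aliases=("ache",),
    children=[
        VignetteNode("pain_location", "chest", 3, "It sits right behind my breastbone."),
        VignetteNode("pain_duration", "2 minutes", 2, "It usually passes after a minute or two."),
    ],
)
root = VignetteNode("patient", None, 4, "", children=[pain])

# Information exploration = traversing the patient graph:
parent = root.find("ache")              # alias resolves to the "pain" parent node
follow_up = root.find("pain_duration")  # child node explored via a follow-up question
print(parent.response, "/", follow_up.response)
```

In ASCODI itself, all vignettes share one fixed variable set; the sketch only illustrates the traversal idea behind follow-up questions and alias-based lookup.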
Monitored Reasoning
The ASCODI system constantly monitors information exploration and hypothesis revisions by the physician to infer the mental state of its user and detect potential cognitive biases. This inference is based on a computational cognitive process model that - given a clinical vignette and a cognitive bias as input - aims to replicate empirical reasoning trajectories of physicians, i.e. sequences of information queries and hypothesis updates. The dynamic problem of repeated exploration and opinion revision is formalized as a Markov decision process (MDP) [54] in which the summed subjective strength of each hypothesis constitutes the reward signal from which the action policy is derived via Monte Carlo tree search (MCTS) [55]. This strength is defined as the Bayesian posterior probability of the hypothesis mediated by the amount of anticipated regret [56], which grows with the severity of the disease. In line with dual-process theory, in which human thinking [57], including medical diagnostic reasoning [58], is divided into a fast, associative route (System I) and a slow, deliberate route (System II), the posterior computation is based on the observable patient information as hard evidence and assumed information that is intuitively derived from previous experience as soft evidence. Experience is formalized based on the MINERVA-DM model [59, 60], which captures past events (i.e. patients) as memory traces (i.e. vectors) whose similarity to the currently observed event enables computations on assumed symptom likelihoods. Once this process model is fully implemented, we aim to run it exhaustively beforehand to collect a sufficient amount of labeled data to train a classifier that inverts the generation of reasoning trajectories: instead of producing a trajectory given a clinical vignette and a cognitive bias, it infers cognitive biases given the reasoning trajectory. We aim to model an anchoring bias (clinging to a hypothesis despite contradictory information), an availability bias (generating hypotheses that readily come to mind), a confirmation bias (neglecting possibly contradictory information during information exploration), premature closure (ending information exploration too early), and overconfidence (perceiving evidential strength as higher than it is), which are prominent examples repeatedly found in empirical research on cognitive biases in diagnostic reasoning [34, 5, 61, 62, 36]. Detected cognitive biases are then forwarded to the explanation policy that tailors the timing and content of explanations to the user.

Explanation Policy & Domain Model
The explanation policy incorporated into the ASCODI system ensures that the most appropriate explanation for a given situation is used. The 'most appropriate' explanation refers to the determination of (1) the type of the explanation, e.g. counterfactual, (2) its form, e.g. written in the system chat, and (3) the timing of the explanation-giving, e.g. instantly. The main purpose of fine-tuning an explanation is to mitigate detected biases that are fed into the explanation policy by the cognitive process model. In addition, the explanation policy is responsible for the tracking of previous user interactions. In particular, previously issued queries and the corresponding explanations given by the ASCODI system are stored to adapt subsequent explanations to the user based on the interaction history. A three-layer Bayesian network (BN) [63] serves as the domain model for explanation generation within the ASCODI system. The BN is trained on data provided by Wardrope et al. [49] as well as annotated and anonymized outpatient letters from the Ruhr-Epileptology in Bochum. A major advantage of a BN is its inherent interpretability and the ability to directly model causal relationships between variables [63]. The feasibility of a similar domain model for explanation generation in a CDSS has been demonstrated in a previous prototype [64]. Additionally, the modular structure of the system renders the domain model interchangeable both in terms of the algorithmic approach (e.g. BNs vs. neural networks) and the disease domain (e.g. neurology vs. cardiology).
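For illustration, the following self-contained Python sketch shows how inference in a toy three-layer Bayesian network (risk factor → disease → symptoms) can answer a likelihood query by brute-force enumeration. The variables, network structure, and probabilities are invented for this example and do not reflect the trained ASCODI domain model, which is learned from the data of Wardrope et al. [49] and outpatient letters.

```python
from itertools import product

# Toy three-layer BN: one risk factor R, one disease node D, two binary symptoms.
P_R = {True: 0.3, False: 0.7}  # e.g. "history of cardiac disease" (illustrative)

# P(D | R); each row sums to 1 (illustrative numbers only).
P_D_given_R = {
    True:  {"syncope": 0.60, "epileptic": 0.25, "psychogenic": 0.15},
    False: {"syncope": 0.35, "epileptic": 0.35, "psychogenic": 0.30},
}

# P(symptom present | D), assuming conditional independence of symptoms given D.
P_S_given_D = {
    "tongue_biting":    {"syncope": 0.05, "epileptic": 0.45, "psychogenic": 0.10},
    "light_headedness": {"syncope": 0.70, "epileptic": 0.15, "psychogenic": 0.30},
}


def posterior_over_diseases(evidence: dict) -> dict:
    """P(D | observed symptoms), computed by enumerating over R and D."""
    joint = {d: 0.0 for d in ("syncope", "epileptic", "psychogenic")}
    for r, d in product(P_R, joint):
        p = P_R[r] * P_D_given_R[r][d]
        for symptom, present in evidence.items():
            p_present = P_S_given_D[symptom][d]
            p *= p_present if present else (1.0 - p_present)
        joint[d] += p  # accumulate P(D = d, evidence)
    z = sum(joint.values())
    return {d: p / z for d, p in joint.items()}


# "What are likely diagnoses for a patient with tongue biting but no light-headedness?"
print(posterior_over_diseases({"tongue_biting": True, "light_headedness": False}))
```

The same posterior computation, carried out in the real domain model, underlies the likelihood-oriented explanations described next.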
Explanations that are part of the ASCODI system can be divided into four groups based on the scope and the algorithms used to calculate the explanation: Suggestions of the next features to be queried by the physician are determined by maximizing mutual information [65]. Explanations for queries covering the likelihood of events (e.g. "What are likely diagnoses for a patient with symptoms 𝑢, 𝑣, 𝑤?") are calculated via Bayesian inference in our domain model. Queries posing what-if questions (e.g. "If the patient had diagnosis 𝑋, what other symptoms would they have?") are answered by counterfactual explanations, and explanations that aim to justify (e.g. "Why do you judge 𝑋 differently from me?") rely on our definition of justification, which is outlined below.

A formal definition of justification is not only relevant for answering user queries but also crucial for providing meaningful feedback or suggestions to the physician and for ensuring compliance with red lines. We propose a combination of a relevance measure of the explanation with a measure for the degree of explored information to be used as justification. The rationale behind this is that, for a decision to be justified, it must be relevant and at the same time it must be certain, i.e. only made if supported by sufficient evidence. The Most Relevant Explanation (MRE) [66] method is utilized as our relevance measure for generated explanations. MRE is the partial instantiation of all possible diagnoses which maximizes the generalized Bayes factor. The requirements regarding the necessary amount of explored information are preliminarily implemented as thresholds for the ratio of explored patient features that are highly correlated with the current hypotheses under consideration.

Table 1: Information needs with corresponding query examples and the underlying implementation in ASCODI. Note: The list of example queries is not exhaustive.

Information Need       | Example Query                                                                                        | Implementation
Diagnostic procedure   | What information should be queried next?                                                             | Mutual information
Differential diagnosis | Could this patient have condition 𝑋? What are likely diagnoses for a patient with symptoms 𝑢, 𝑣, 𝑤?  | Bayesian inference
Disease complication   | If the patient with symptoms 𝑢, 𝑣, 𝑤 has disease 𝑋, what other signs or symptoms would they have?    | Counterfactuals
System behavior        | Why do you judge hypothesis 𝑋 as less likely than me?                                                | Justification

Information Needs
The ASCODI system includes predefined user queries to satisfy the information needs that physicians encounter during the interactive reasoning support. Information needs defined in the literature [44, 67] as well as the most common questions asked by physicians [45, 68] can be grouped into three categories based on the source of the information need: diagnosis, patient information, and treatment. Treatment options, including the prognosis of patients, are beyond the scope of ASCODI, and hence these information needs are not considered here. Information needs concerning patient information are not realized by explicit queries, but rather via the iterative exploration of the clinical vignette. However, the usage of the system itself results in further information needs [21]. Table 1 provides a summary of all information need categories that are part of ASCODI along with their operationalization as queries and their implementation. The queries are based on the prior theoretical work of Van Baalen et al. on diagnostic reasoning support systems [7].
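As a concrete illustration of the justification measure introduced above, the sketch below combines a generalized-Bayes-factor relevance score with an exploration-ratio threshold. The function names, thresholds, and feature sets are hypothetical placeholders; in particular, the full MRE method searches over partial instantiations of all diagnosis variables, whereas this sketch only scores a single hypothesis.

```python
def generalized_bayes_factor(p_e_given_h: float, p_e_given_not_h: float) -> float:
    """GBF(h; e) = P(e | h) / P(e | not h): how strongly the evidence favors hypothesis h."""
    return p_e_given_h / max(p_e_given_not_h, 1e-9)


def exploration_ratio(explored: set, relevant_features: set) -> float:
    """Share of the features highly correlated with the hypothesis that have been explored."""
    return len(explored & relevant_features) / max(len(relevant_features), 1)


def is_justified(p_e_given_h: float, p_e_given_not_h: float,
                 explored: set, relevant_features: set,
                 gbf_threshold: float = 3.0, exploration_threshold: float = 0.6) -> bool:
    """A decision counts as justified only if it is both relevant (high GBF) and
    certain enough (a sufficient share of the relevant evidence has been explored)."""
    relevant = generalized_bayes_factor(p_e_given_h, p_e_given_not_h) >= gbf_threshold
    certain = exploration_ratio(explored, relevant_features) >= exploration_threshold
    return relevant and certain


# Hypothetical check before accepting "epileptic seizure" as the final diagnosis:
print(is_justified(
    p_e_given_h=0.40, p_e_given_not_h=0.05,   # evidence strongly favors the hypothesis (GBF = 8.0)
    explored={"tongue_biting", "postictal_confusion"},
    relevant_features={"tongue_biting", "postictal_confusion", "psychological_history"},
))  # True: GBF >= 3.0 and 2/3 of the relevant features have been explored
```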
System Responses
The interaction between ASCODI and a physician is twofold: on the one hand, a reactive part acts as the direct response to user-triggered queries and, on the other hand, a proactive part can intervene at any point if the system detects a situation that requires action to support the reasoning process or prevent pitfalls. Users can trigger queries not only via the system interface but also at the 'point of interaction' within the system. For example, the query "Why do you judge hypothesis 𝑋 as less likely than me?" can be called via the system interface, but is also available in the dashboard by clicking on hypothesis 𝑋 and requesting an explanation of why the system assigns hypothesis 𝑋 a lower likelihood than the user does. Whereas explanations to user queries are always answered in the system chat, interventions by the proactive system part are presented in different formats and at different locations within the shared dashboard to be able to draw the user's attention to certain areas and to have a further lever for mitigating detected biases. Furthermore, the strength of system responses can be modulated, by using different forms of explanations and notifications, to attract the user's attention in different scenarios.

5. Reasoning Support Use Cases

As described above in detail, ASCODI's reasoning support capabilities range from reactive explanations of user-triggered events to proactive advice to mitigate cognitive biases and reasoning pitfalls. To illustrate the system usage more vividly and highlight how it links back to the theoretical desiderata for interactive reasoning support, we present four selected use cases in which ASCODI (1) answers a user-triggered query to the system, (2) reacts to a user-triggered belief change in one hypothesis, (3) suggests an alternative direction during information exploration, and (4) displays a warning for an unjustified diagnosis.

Query Response
As outlined before, ASCODI is capable of providing explanations for user-triggered queries to satisfy the information needs arising in the diagnostic reasoning process (Desideratum 3). An example of this user interaction can be seen in the bottom right of Figure 2. In the given example, the user selected the query "Which feature would reduce the likelihood of the psychogenic seizure hypothesis?" via the system interface and received the written explanation in the system chat.

Behavioral Feedback
ASCODI constantly monitors how the user explores information and ranks hypotheses. According to the available and explored information, the belief of the domain model is updated and the system itself assigns probabilities to all hypotheses. A possible discrepancy between the user's and the system's ranking of hypotheses is displayed in the dashboard according to the explanation policy and the current state of the user. The 'weakest' form of this notification is realized by green (red) arrows within the hypotheses panel, indicating a higher (lower) ranking by the system, as shown in Figure 2. If the explanation policy determines that the user is in a state that requires a more vehement notification, e.g. because previously made suggestions were ignored, further notifications can be sent to the system chat. Such a notification includes an explanation of why the system ranks a hypothesis higher than the user; an example of this kind of explanation is shown in the system chat in Figure 2.
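The following short sketch illustrates the 'weakest' feedback mechanism described above: per hypothesis, the user's stated certainty is compared with the system's posterior and an arrow marker is emitted when they diverge. The tolerance value, the assumption that user certainty lives on a probability-like scale, and the hypothesis names are illustrative and not taken from the actual ASCODI implementation.

```python
def feedback_arrows(user_certainty: dict, system_posterior: dict,
                    tolerance: float = 0.10) -> dict:
    """Compare user certainty with the system posterior per hypothesis and return
    an up-arrow marker if the system ranks it higher, a down-arrow if lower,
    or no marker if both roughly agree."""
    arrows = {}
    for hypothesis, user_value in user_certainty.items():
        delta = system_posterior.get(hypothesis, 0.0) - user_value
        if delta > tolerance:
            arrows[hypothesis] = "↑ system ranks higher"   # rendered as a green arrow
        elif delta < -tolerance:
            arrows[hypothesis] = "↓ system ranks lower"    # rendered as a red arrow
        else:
            arrows[hypothesis] = ""                        # no marker needed
    return arrows


# Hypothetical snapshot of the hypotheses panel:
user = {"epileptic seizure": 0.8, "psychogenic seizure": 0.1, "syncope": 0.1}
system = {"epileptic seizure": 0.5, "psychogenic seizure": 0.4, "syncope": 0.1}
print(feedback_arrows(user, system))
```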
Situational Suggestion
The ASCODI system can proactively suggest information queries at any point within the reasoning trajectory (see Figure 3). Suggestions are attached to an explanation that motivates why a certain query is deemed reasonable at this point (e.g. to increase the certainty of a hypothesis). Suggestions and explanations are aligned with potentially detected cognitive biases (e.g. suggesting falsifying information for a hypothesis if a potential confirmation bias is detected). The same mechanism is employed to propose overlooked yet feasible hypotheses or already falsified alternatives.

Figure 3: A proactively suggested query during information exploration. Suggestions can be accepted or rejected by the user.

This capability links back to three desiderata: the system first observes the current state of the reasoning process through the shared dashboard, then infers the possibility of a confirmation bias through the monitored reasoning, and then provides an alternative reasoning direction by suggesting unexplored information. The idea behind this suggestion is not to tell a biased person that they are biased but to provide reasonable information that induces reflection.

Diagnosis Safeguard
Once a physician attempts to submit a final diagnosis, ASCODI checks its validity and will display a warning if the decision is unjustified, e.g. if there is a viable alternative hypothesis that has not even been considered. By explicitly mentioning that the seizure of a patient could also resemble a psychogenic instead of an epileptic seizure, and by pointing out that it would be worthwhile to explore the psychological history of the patient more closely to increase the strength of differentiation between these two diagnostic options, ASCODI implements four desiderata: observing the end of the diagnostic process on the dashboard, detecting overconfidence within the user, displaying a warning as a red flag to show major disagreement, and suggesting an alternative reasoning direction by specifying which options are deemed insufficiently explored. The motivation behind this behavior is that it takes more cognitive effort to insist on a decision after someone else has explicitly stated that they would have decided differently, which again is hypothesized to induce reflection.
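To illustrate the safeguard logic described above, the sketch below checks a submitted final diagnosis against viable competing hypotheses and returns warnings if a competitor was never considered or if key differentiating features remain unexplored. The threshold, feature names, and function signature are hypothetical and serve only to make the idea concrete.

```python
def diagnosis_safeguard(final_diagnosis: str,
                        system_posterior: dict,
                        considered: set,
                        explored: set,
                        differentiating_features: dict,
                        viability_threshold: float = 0.2) -> list:
    """Return warnings if committing to `final_diagnosis` is insufficiently justified,
    e.g. a viable alternative was never considered or differentiating evidence is unexplored."""
    warnings = []
    for alternative, p in system_posterior.items():
        if alternative == final_diagnosis or p < viability_threshold:
            continue  # only warn about viable competitors
        if alternative not in considered:
            warnings.append(f"Viable alternative '{alternative}' (p={p:.2f}) was never considered.")
        missing = differentiating_features.get((final_diagnosis, alternative), set()) - explored
        if missing:
            warnings.append(
                f"Exploring {sorted(missing)} would sharpen the differentiation "
                f"between '{final_diagnosis}' and '{alternative}'."
            )
    return warnings


# Hypothetical end-of-process check for the TLOC example from the text:
print(diagnosis_safeguard(
    final_diagnosis="epileptic seizure",
    system_posterior={"epileptic seizure": 0.55, "psychogenic seizure": 0.35, "syncope": 0.10},
    considered={"epileptic seizure", "syncope"},
    explored={"tongue_biting", "eye_closure"},
    differentiating_features={("epileptic seizure", "psychogenic seizure"): {"psychological_history"}},
))
```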
6. Conclusion

This paper presented a conceptual understanding of interactive reasoning support for medical diagnosis in the form of six desiderata as well as ASCODI, an assistive, co-constructive differential diagnosis system, whose architecture draws from this theoretical foundation to build a system that (1) keeps track of its user's reasoning process to detect potential cognitive biases, (2) tailors the timing and content of its explainable responses to the current external and cognitive state of the reasoning process, and (3) proactively takes part in knowledge exploration and hypothesis revision by providing explainable suggestions and safeguards. To become an active part of its user's reasoning process, ASCODI utilizes XAI-based explanation generation techniques to derive justifiable opinions and responses [7]. While the implementation of ASCODI is still ongoing and empirical validation is still pending, we believe that support systems like ASCODI are the logical next step to enhance the reasoning quality of physicians through justifiable decisions along each path chosen. In a time when diagnostic errors are mainly caused by faulty information processing and synthesis [5], such systems present themselves as a promising direction for future work.

Acknowledgments

Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation): TRR 318/1 2021 – 438445824.

References

[1] N. Donner-Banzhoff, Die ärztliche Diagnose: Erfahrung - Evidenz - Ritual, Programmbereich Medizin, 1st ed., Hogrefe, Bern, 2022.
[2] E. S. Berner, M. L. Graber, Overconfidence as a Cause of Diagnostic Error in Medicine, The American Journal of Medicine 121 (2008) S2–S23.
[3] H. C. Sox, M. C. Higgins, D. K. Owens, Medical Decision Making, 2nd ed., John Wiley & Sons, Chichester, West Sussex, UK / Hoboken, New Jersey, 2013.
[4] M. L. Graber, Reaching 95%: decision support tools are the surest way to improve diagnosis now, BMJ Quality & Safety 31 (2022) 415–418.
[5] M. L. Graber, N. Franklin, R. Gordon, Diagnostic Error in Internal Medicine, Archives of Internal Medicine 165 (2005) 1493.
[6] A. M. Antoniadi, Y. Du, Y. Guendouz, L. Wei, C. Mazo, B. A. Becker, C. Mooney, Current Challenges and Future Opportunities for XAI in Machine Learning-Based Clinical Decision Support Systems: A Systematic Review, Applied Sciences 11 (2021) 5088.
[7] S. Van Baalen, M. Boon, P. Verhoef, From clinical decision support to clinical reasoning support systems, Journal of Evaluation in Clinical Practice 27 (2021) 520–528.
[8] P. Croskerry, S. G. Campbell, D. A. Petrie, The challenge of cognitive science for medical diagnosis, Cognitive Research: Principles and Implications 8 (2023) 13.
[9] P. Croskerry, The Importance of Cognitive Errors in Diagnosis and Strategies to Minimize Them, Academic Medicine 78 (2003) 775–780.
[10] T. Shimizu, K. Matsumoto, Y. Tokuda, Effects of the use of differential diagnosis checklist and general de-biasing checklist on diagnostic performance in comparison to intuitive diagnosis, Medical Teacher 35 (2013) e1218–e1229.
[11] M. Gäbler, Denkfehler bei diagnostischen Entscheidungen, Wiener Medizinische Wochenschrift 167 (2017) 333–342.
[12] R. Ludolph, P. J. Schulz, Debiasing Health-Related Judgments and Decision Making: A Systematic Review, Medical Decision Making 38 (2018) 3–13.
[13] K. K. Hall, S. Shoemaker-Hunt, L. Hoffman, S. Richard, E. Gall, E. Schoyer, D. Costar, B. Gale, G. Schiff, K. Miller, T. Earl, N. Katapodis, C. Sheedy, B. Wyant, O. Bacon, A. Hassol, S. Schneiderman, M. Woo, L. LeRoy, E. Fitall, A. Long, A. Holmes, J. Riggs, A. Lim, Making Healthcare Safer III: A Critical Analysis of Existing and Emerging Patient Safety Practices, Agency for Healthcare Research and Quality (US), Rockville (MD), 2020.
[14] C. Krittanawong, The rise of artificial intelligence and the uncertain future for physicians, European Journal of Internal Medicine 48 (2018) e13–e14.
[15] A. Holzinger, G. Langs, H. Denk, K. Zatloukal, H. Müller, Causability and explainability of artificial intelligence in medicine, WIREs Data Mining and Knowledge Discovery 9 (2019) e1312.
[16] C. Rudin, Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead, Nature Machine Intelligence 1 (2019) 206–215.
[17] R. L. Pierce, W. Van Biesen, D. Van Cauwenberge, J. Decruyenaere, S. Sterckx, Explainability in medicine in an era of AI-based clinical decision support systems, Frontiers in Genetics 13 (2022) 903600.
[18] J. Pearl, Causality, Cambridge University Press, 2000.
[19] J. R. Quinlan, Induction of decision trees, Machine Learning 1 (1986) 81–106.
[20] Y. LeCun, Y. Bengio, G. Hinton, Deep learning, Nature 521 (2015) 436–444.
[21] F. Doshi-Velez, B. Kim, Towards A Rigorous Science of Interpretable Machine Learning, 2017. ArXiv:1702.08608 [cs, stat].
[22] V. Arya, R. K. E. Bellamy, P.-Y. Chen, A. Dhurandhar, M. Hind, S. C. Hoffman, S. Houde, Q. V. Liao, R. Luss, A. Mojsilović, S. Mourad, P. Pedemonte, R. Raghavendra, J. Richards, P. Sattigeri, K. Shanmugam, M. Singh, K. R. Varshney, D. Wei, Y. Zhang, One Explanation Does Not Fit All: A Toolkit and Taxonomy of AI Explainability Techniques, 2019. ArXiv:1909.03012 [cs, stat].
[23] S. Rajpal, N. Lakhyani, A. K. Singh, R. Kohli, N. Kumar, Using handpicked features in conjunction with ResNet-50 for improved detection of COVID-19 from chest X-ray images, Chaos, Solitons & Fractals 145 (2021) 110749.
[24] H. Singh, A. N. D. Meyer, E. J. Thomas, The frequency of diagnostic errors in outpatient care: estimations from three large observational studies involving US adult populations, BMJ Quality & Safety 23 (2014) 727–731.
[25] S. Wachter, B. Mittelstadt, C. Russell, Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR, 2018. ArXiv:1711.00399 [cs].
[26] Z. Wang, I. Samsten, P. Papapetrou, Counterfactual Explanations for Survival Prediction of Cardiovascular ICU Patients, in: A. Tucker, P. Henriques Abreu, J. Cardoso, P. Pereira Rodrigues, D. Riaño (Eds.), Artificial Intelligence in Medicine, volume 12721, Springer International Publishing, Cham, 2021, pp. 338–348.
[27] M. Ribera, A. Lapedriza, Can we do better explanations? A proposal of User-Centered Explainable AI, Joint Proceedings of the ACM IUI 2019 Workshops (2019).
[28] K. Sokol, P. Flach, One Explanation Does Not Fit All: The Promise of Interactive Explanations for Machine Learning Transparency, KI - Künstliche Intelligenz 34 (2020) 235–250.
[29] T. A. J. Schoonderwoerd, W. Jorritsma, M. A. Neerincx, K. van den Bosch, Human-centered XAI: Developing design patterns for explanations of clinical decision support systems, International Journal of Human-Computer Studies 154 (2021) 102684.
[30] K. J. Rohlfing, P. Cimiano, I. Scharlau, T. Matzner, H. M. Buhl, H. Buschmeier, E. Esposito, A. Grimminger, B. Hammer, R. Häb-Umbach, I. Horwath, E. Hüllermeier, F. Kern, S. Kopp, K. Thommes, A.-C. Ngonga Ngomo, C. Schulte, H. Wachsmuth, P. Wagner, B. Wrede, Explanation as a Social Practice: Toward a Conceptual Framework for the Social Design of AI Systems, IEEE Transactions on Cognitive and Developmental Systems 13 (2021) 717–728.
[31] J. Wexler, M. Pushkarna, T. Bolukbasi, M. Wattenberg, F. Viegas, J. Wilson, The What-If Tool: Interactive Probing of Machine Learning Models, IEEE Transactions on Visualization and Computer Graphics (2019) 1–1. ArXiv:1907.04135 [cs, stat].
[32] T. Spinner, U. Schlegel, H. Schäfer, M. El-Assady, explAIner: A Visual Analytics Framework for Interactive and Explainable Machine Learning (2019).
[33] H. Baniecki, D. Parzych, P. Biecek, The grammar of interactive explanatory model analysis, Data Mining and Knowledge Discovery (2023).
[34] G. Saposnik, D. Redelmeier, C. C. Ruff, P. N. Tobler, Cognitive biases associated with medical decisions: a systematic review, BMC Medical Informatics and Decision Making 16 (2016) 138.
[35] Z. I. Vally, R. A. Khammissa, G. Feller, R. Ballyram, M. Beetge, L. Feller, Errors in clinical diagnosis: a narrative review, Journal of International Medical Research 51 (2023) 03000605231162798.
[36] J. S. Blumenthal-Barby, H. Krieger, Cognitive Biases and Heuristics in Medical Decision Making: A Critical Review Using a Systematic Search Strategy, Medical Decision Making 35 (2015) 539–557.
[37] P. Croskerry, 50 Cognitive and Affective Biases in Medicine, https://sjrhem.ca/wp-content/uploads/2015/11/CriticaThinking-Listof50-biases.pdf, 2015. Accessed: 2024-09-06.
[38] H. Singh, G. D. Schiff, M. L. Graber, I. Onakpoya, M. J. Thompson, The global burden of diagnostic errors in primary care, BMJ Quality & Safety 26 (2017) 484.
[39] D. M. Berwick, Not again!, BMJ 322 (2001) 247–248.
[40] A. Clark, D. Chalmers, The Extended Mind, Analysis 58 (1998).
[41] H. H. Clark, S. E. Brennan, Grounding in communication, in: Perspectives on Socially Shared Cognition, American Psychological Association, Washington, 1991, pp. 127–149.
[42] J. B. Jarecki, J. H. Tan, M. A. Jenny, A framework for building cognitive process models, Psychonomic Bulletin & Review 27 (2020) 1218–1229.
[43] C. Baker, R. Saxe, J. Tenenbaum, Bayesian Theory of Mind: Modeling Joint Belief-Desire Attribution, Proceedings of the Annual Meeting of the Cognitive Science Society (2011).
[44] R. N. Jerome, N. B. Giuse, K. Wilder Gish, N. A. Sathe, M. S. Dietrich, Information needs of clinical teams: analysis of questions received by the Clinical Informatics Consult Service, Bulletin of the Medical Library Association 89 (2001) 177–185.
[45] J. W. Ely, J. A. Osheroff, P. N. Gorman, M. H. Ebell, M. L. Chambliss, E. A. Pifer, P. Z. Stavri, A taxonomy of generic clinical questions: classification study, BMJ (Clinical research ed.) 321 (2000) 429–432.
[46] R. E. v. Geffen, Proactivity in concert: an interactive perspective on employee proactivity, Library of the University of Amsterdam, 2018.
[47] T. Baumgartner, R. Surges, Synkope, epileptischer oder psychogener Anfall? Der Weg zur richtigen Diagnose, DMW - Deutsche Medizinische Wochenschrift 144 (2019) 835–841.
[48] K. Malmgren, M. Reuber, R. Appleton, Differential diagnosis of epilepsy, Oxford Textbook of Epilepsy and Epileptic Seizures (2012) 81–94.
[49] A. Wardrope, J. Jamnadas-Khoda, M. Broadhurst, R. A. Grünewald, T. J. Heaton, S. J. Howell, M. Koepp, S. W. Parry, S. Sisodiya, M. C. Walker, M. Reuber, Machine learning as a diagnostic decision aid for patients with transient loss of consciousness, Neurology: Clinical Practice 10 (2020) 96–105.
[50] S. A. Haji Seyed Javadi, F. Hajiali, M. Nassiri Asl, Zolpidem Dependency and Withdrawal Seizure: A Case Report Study, Iranian Red Crescent Medical Journal 16 (2014).
[51] B. Hellmich (Ed.), Fallbuch Innere Medizin, 6th ed., Georg Thieme Verlag, Stuttgart, 2020.
[52] R. Gerlach, A. Bickel, Fallbuch Neurologie, 5th ed., Georg Thieme Verlag, Stuttgart / New York, 2021.
[53] D. Battefeld, S. Mues, T. Wehner, P. House, C. Kellinghaus, J. Wellmer, S. Kopp, Revealing the Dynamics of Medical Diagnostic Reasoning as Step-by-Step Cognitive Process Trajectories, in: Proceedings of the Annual Meeting of the Cognitive Science Society, Rotterdam, The Netherlands, 2024.
[54] R. S. Sutton, A. Barto, Reinforcement Learning: An Introduction, Adaptive Computation and Machine Learning, 2nd ed., The MIT Press, Cambridge, Massachusetts / London, England, 2020.
[55] M. Świechowski, K. Godlewski, B. Sawicki, J. Mańdziuk, Monte Carlo Tree Search: a review of recent modifications and applications, Artificial Intelligence Review 56 (2023) 2497–2562.
[56] M. Zeelenberg, R. Pieters, A Theory of Regret Regulation 1.0, Journal of Consumer Psychology 17 (2007) 3–18.
[57] K. Frankish, Dual-Process and Dual-System Theories of Reasoning, Philosophy Compass 5 (2010) 914–926.
[58] P. Croskerry, A Universal Model of Diagnostic Reasoning, Academic Medicine 84 (2009) 1022–1028.
[59] M. R. P. Dougherty, C. F. Gettys, E. E. Ogden, MINERVA-DM: A memory processes model for judgments of likelihood, Psychological Review 106 (1999) 180–209.
[60] R. P. Thomas, M. R. Dougherty, A. M. Sprenger, J. I. Harbison, Diagnostic hypothesis generation and human judgment, Psychological Review 115 (2008) 155–185.
[61] T. Watari, A. Gupta, Y. Amano, Y. Tokuda, Japanese Internists' Most Memorable Diagnostic Error Cases: A Self-reflection Survey, Internal Medicine (2023) 1494–22.
[62] M. F. Loncharich, R. C. Robbins, S. J. Durning, M. Soh, J. Merkebu, Cognitive biases in internal medicine: a scoping review, Diagnosis (2023).
[63] J. G. Richens, C. M. Lee, S. Johri, Improving the accuracy of medical diagnosis with causal machine learning, Nature Communications 11 (2020) 3923.
[64] F. Liedeker, P. Cimiano, A Prototype of an Interactive Clinical Decision Support System with Counterfactual Explanations, in: Proceedings of the xAI-2023 Late-breaking Work, Demos and Doctoral Consortium co-located with the 1st World Conference on eXplainable Artificial Intelligence (xAI-2023), 2023.
[65] F. Liedeker, P. Cimiano, Dynamic Feature Selection in AI-based Diagnostic Decision Support for Epilepsy, 2023. Poster presented at the 1st International Conference on Artificial Intelligence in Epilepsy and Neurological Disorders, Breckenridge, CO, USA.
[66] C. Yuan, H. Lim, T.-C. Lu, Most Relevant Explanation in Bayesian Networks, Journal of Artificial Intelligence Research 42 (2011) 309–352.
[67] J. A. Osheroff, D. E. Forsythe, B. G. Buchanan, R. A. Bankowitz, B. H. Blumenfeld, R. A. Miller, Physicians' information needs: analysis of questions posed during clinical teaching, Annals of Internal Medicine 114 (1991) 576–581.
[68] Y.-H. Seol, D. R. Kaufman, E. A. Mendonça, J. J. Cimino, S. B. Johnson, Scenario-based assessment of physicians' information needs, Studies in Health Technology and Informatics 107 (2004) 306–310.