A Risk-based Approach to Trustworthy AI Systems for Judicial Procedures

Majid Mollaeefar¹,*, Eleonora Marchesini¹, Roberto Carbone¹ and Silvio Ranise¹,²
¹ Fondazione Bruno Kessler, Center for Cybersecurity, Trento, Italy
² Department of Mathematics, University of Trento, Italy

Abstract
In the rapidly evolving landscape of Artificial Intelligence (AI), ensuring the trustworthiness of AI tools deployed in sensitive use cases, such as judicial or healthcare processes, is paramount. Managing AI risks in judicial systems requires a holistic approach that spans technical and ethical considerations as well as legal responsibilities. Such an approach should not only apply risk management frameworks and regulations but also focus on the education and training of legal professionals. To this end, we propose a risk-based approach designed to evaluate and mitigate potential risks associated with AI applications in judicial settings. Our approach is a semi-automated process that integrates both user (i.e., judge) feedback and technical insights to assess an AI tool's alignment with Trustworthy AI principles.

Keywords
Judicial AI, Risk-aware, Trustworthy AI, Trustworthiness Risk Assessment

Ital-IA 2024: 4th National Conference on Artificial Intelligence, organized by CINI, May 29-30, 2024, Naples, Italy
* Corresponding author.
mmollaeefar@fbk.eu (M. Mollaeefar); emarchesini@fbk.eu (E. Marchesini); carbone@fbk.eu (R. Carbone); ranise@fbk.eu (S. Ranise)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073.

1. Introduction

In recent years, the adoption of Artificial Intelligence (AI) technologies has surged across various industries and domains. AI systems now play a pivotal role in making critical decisions, automating tasks, and augmenting human capabilities. However, with the expanding influence and complexity of AI, it is crucial to ensure the development and deployment of Trustworthy AI (TAI) systems. TAI encompasses the creation and implementation of AI technologies adhering to a set of principles that promote transparency, fairness, accountability, and robustness. By designing TAI systems, the aim is to inspire trust among users, stakeholders, and society as a whole: these systems must operate reliably, ethically, and in a manner that respects fundamental rights and values. The significance of TAI cannot be overstated, as it has the potential to address pressing concerns arising from the increasing reliance on AI systems. Three notable reasons why AI systems should be designed with trustworthiness in mind are the following. First, TAI cultivates user confidence and trust by ensuring that personal data is handled responsibly, decisions made by AI systems are fair and unbiased, and privacy is protected. The authors in [1] discuss the theoretical framework of AI trustworthiness, including aspects of privacy preservation and fairness, which are key to fostering user trust. Second, TAI bolsters the accountability and explainability of AI systems. As these systems become integral to decision-making processes, it is essential to comprehend how they reach their conclusions or recommendations. TAI increases transparency and offers mechanisms for interpreting the rationale behind AI-generated decisions, allowing users and stakeholders to hold systems accountable. Cobianchi et al. [2] emphasize the importance of accountability, technical robustness, and transparency in AI applications in surgery, which can be extended to other domains. Third, TAI aids in mitigating risks associated with AI technologies. If developed or deployed irresponsibly, AI systems can introduce numerous risks, including privacy breaches, biased decision-making, safety concerns, and the perpetuation of social inequalities. Addressing these risks is vital to protect individuals, organizations, and society from potential harm and adverse consequences.

The AI Act draft proposal for a Regulation¹ of the European Parliament and of the Council laying down harmonized rules on AI represents the first attempt to enact a horizontal AI regulation. This proposed legal framework, focusing specifically on the use of AI systems, advocates for a technology-neutral definition of AI systems in EU legislation. It emphasizes a risk-based approach in which AI systems are classified with varying obligations proportional to their level of risk. The AI Act categorizes risks into four levels: minimal, limited, high, and unacceptable (the latter are not permitted to be sold on the EU market). It focuses on high-risk AI applications (HRAI) by setting specific requirements and obligations for both users and providers of these applications. This includes a conformity assessment before market placement or service commencement, enforcement measures after market placement, and a governance structure at both European and national levels. The aim is to ensure that obligations are aligned with the risk level associated with each AI system.

¹ https://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=CELEX:52021PC0206

One of the areas where AI has a sizable impact is the legal context, where, for instance, judges can benefit from automated decision-making in judicial proceedings [3, 4], potentially reducing the effort required to search through documents and seek out relevant legal provisions, or supporting them in complex cases where the human capacity to detect patterns is limited [5]. AI tools like ChatGPT, while useful, present several limitations in legal contexts. They may produce inaccurate information, as demonstrated in cases like Roberto Mata vs Avianca², where reliance on ChatGPT led to legal issues due to the citation of non-existent cases. This stresses the necessity for legal professionals, particularly judges, to be acutely aware of the risks associated with their use of HRAI systems. In this paper, we introduce a risk-based approach designed to evaluate and mitigate potential risks associated with the trustworthiness of AI applications in judicial settings.

² https://law.justia.com/cases/federal/district-courts/new-york/nysdce/1:2022cv01461/575368/54/

2. Background

Below we introduce background information needed to understand the approach.

2.1. AI Algorithms

In the realm of AI, the development of algorithms falls into two primary views: traditional and modern. The traditional approach involves human-created models for specific problems or computations, where a limited set of features and a fixed sequence of instructions are employed. This method, exemplified by classical planning in autonomous systems, relies on symbolic representations and a predefined set of rules, necessitating heuristics to navigate the vast potential state spaces. Despite its rigidity, this approach allows for the construction of algorithms that are easily understood and verified by humans. Conversely, the modern perspective, dominated by Machine Learning (ML), leverages large datasets to generate rules for problem-solving. Through processes like training and deployment, algorithms are formulated to classify or interpret data, such as classifying images of dogs and cats. ML-based methods benefit from the ability to tackle complex problems without extensive human ingenuity, employing powerful optimization techniques. However, they face challenges such as potential imprecision, bias in training data, and the complexity of the resulting algorithms, which makes them difficult for humans to comprehend. Strategies to mitigate these issues include performance monitoring, dataset filtering, and developing techniques for better human understanding of ML-generated algorithms. The choice between traditional and modern methods depends on the specific application's needs, including considerations of security and trustworthiness. An effective risk analysis is crucial in determining the suitability of an AI-produced algorithm for a given scenario.

2.2. Trustworthy AI

Trustworthiness is a prerequisite for people and societies to develop, deploy and use AI systems. Without AI systems—and the human beings behind them—being demonstrably worthy of trust, unwanted consequences may ensue, and their uptake might be hindered, preventing the realization of the potentially vast social and economic benefits that they can bring [6]. In the past few decades, the success of ML has primarily been evaluated based on its quantitative accuracy, which has made training AI models much more manageable. Predictive accuracy has also become the standard measure for determining the superiority of an AI product. However, with the widespread use of AI, the limitations of using accuracy as the sole measurement have become apparent, as new challenges have arisen, such as malicious attacks and the misuse of AI. To address these challenges, the AI community has recognized that factors beyond accuracy need to be considered and improved when building an AI system. Recently, a number of enterprises, academic institutions, public-sector bodies, and organizations have identified principles of AI trustworthiness that go beyond accuracy-based measurements [7]. According to [8], the current degree of trustworthiness of an AI system depends on how the user perceives its technical characteristics. Various organizations, including the G20, the EU Parliament, the Global Partnership on AI (GPAI), and the Organisation for Economic Co-operation and Development³ (OECD), have proposed different principles for ensuring trustworthiness in AI systems [9]. The OECD, for instance, has put forward a set of five principles aimed at promoting TAI: (i) inclusive growth, sustainable development and well-being, (ii) human-centered values and fairness, (iii) transparency and explainability, (iv) robustness, security and safety, and (v) accountability. The use of AI is intended to promote human good and well-being, and as such, it should not cause any harm. AI systems must be characterized by fairness, accuracy, and reliability, and should not be discriminatory. To be considered trustworthy, AI systems must be transparent and explainable, meaning they should have the necessary capabilities, functions, and features to achieve user goals, with their algorithms being easily understood by users. Additionally, AI systems must be resilient to threats that may try to exploit their normal behaviors and turn them into harmful ones. In the literature, additional principles have been proposed, such as accuracy [10], acceptance [11], and predictability and performance [12]. The AI HLEG [6] has focused on the concept of TAI, offering guidance in the form of a framework and identifying seven key ethical and technical requirements.

³ https://oecd.ai/

3. Our View on Trustworthiness

In our analysis of the literature on principles of trustworthiness in AI, the commonly agreed-upon principles are accuracy, robustness, privacy, explainability, accountability, and fairness. While these six principles are widely acknowledged in the literature, there are additional considerations that can be incorporated within them.
For instance, the concept of "human in the loop" can be viewed as an aspect of fairness. We differentiate between properties and principles. While both concepts are related and work together to ensure the overall trustworthiness of AI systems, they represent different aspects of the trustworthiness framework. Properties refer to specific characteristics or attributes of an AI system that contribute to ensuring a principle. For instance, integrity, reliability, and data validity can be considered properties relevant to the accuracy principle. Integrity refers to the quality of an AI system being honest and consistent, and maintaining the integrity of the data and algorithms it operates on; it ensures that the AI system is resistant to unauthorized modifications or tampering. Reliability focuses on the consistency and dependability of an AI system's performance: a reliable AI system consistently produces accurate results over time and under different conditions. Data validity refers to the quality and correctness of the data used by an AI system to generate outputs; valid data ensures that the information processed by the AI system is accurate, relevant, and representative of the problem domain. On the other hand, principles represent high-level guidelines or concepts that guide the development and deployment of TAI systems. The relationship between properties and principles lies in how properties contribute to fulfilling the principles. Figure 1 depicts the relationship between properties and six essential principles for TAI, categorized as either technical, ethical, or both. Accuracy and robustness serve as technical principles, whereas fairness and accountability fall within the ethical domain. Located in the center of the figure, privacy and explainability are unique principles that encompass both the technical and ethical facets.

Figure 1: TAI principles and properties relationship. (The figure links each principle to its properties: fairness to unbiasedness, non-discrimination, and diversity; accountability to compliance, auditability, and traceability; explainability to transparency and interpretability; privacy to confidentiality and anonymity; robustness to security, safety, and resiliency; accuracy to integrity, reliability, and data validity.)

3.1. AI Algorithms & Trustworthiness

Trustworthiness in AI is a multifaceted concept, often seen as a relationship between two entities—the AI system and its user. The trustworthiness of an AI system is largely dependent on how it is perceived by the user in terms of its technical characteristics. This perception is influenced by various factors, including the type of AI model, its application context, and the underlying algorithms [13]. Different AI models exhibit variability in how they align with TAI principles. This variation stems from the inherent differences in model structures, training methods, data used, and their intended applications. For example, a model designed for healthcare decision support may prioritize accuracy and privacy, while one for autonomous vehicles might focus more on safety and robustness. The data used to train AI models significantly affects their trustworthiness. A model trained on limited or biased data may exhibit lower trustworthiness due to its potential to generate skewed or unfair results. Additionally, the type of algorithm—whether it is rule-based or learning-based—plays a crucial role in determining the model's reliability, fairness, and transparency [13].

3.2. Algorithm-based Trustworthiness

The relationship between algorithms and TAI principles is a critical aspect of responsible AI development and deployment. TAI principles serve as benchmarks against which the performance and ethical considerations of algorithms can be evaluated. Each algorithm has its own set of advantages and limitations that align or conflict with these principles, making it essential to investigate their compatibility in specific use cases. Since each algorithm has a distinct set of characteristics, their compatibility with TAI principles can differ significantly; in other words, they have different compliance levels. To define Algorithm-based Trustworthiness (ABT) levels, it is essential to consider both the inherent characteristics of each algorithm and the specific attributes related to each AI principle. We define the following qualitative levels for this assessment. High: the algorithm inherently aligns with the AI principle in question, requiring minimal or no additional measures to ensure compliance. Moderate: while the algorithm generally aligns with the principle, additional safeguards or contextual considerations may be necessary. Low: the algorithm poses challenges or risks that make it difficult to align with the AI principle, and significant adjustments or limitations would be required for compliance. To compare rule-based and ML-based AI algorithms, we need to consider some assumptions, such as the consistency of the environment (i.e., static or dynamic), the complexity of the problems, the availability and quality of data, the risk of bias, and the need for transparency and explainability.
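The qualitative levels defined above can be made concrete with a small sketch. The names and the downgrade rule below are illustrative assumptions, not part of the paper's implementation; they only show how ordered Low/Moderate/High levels can be represented and adjusted when a scenario assumption (e.g., a dynamic environment) works against an algorithm:

```python
from enum import IntEnum

class ABTLevel(IntEnum):
    """Qualitative Algorithm-based Trustworthiness levels (Low < Moderate < High)."""
    LOW = 1
    MODERATE = 2
    HIGH = 3

# Assumptions driving the rule-based vs ML-based comparison
# (dictionary keys are hypothetical, chosen for illustration).
scenario_assumptions = {
    "environment": "dynamic",       # static or dynamic
    "problem_complexity": "high",
    "data_quality": "high",         # free of bias and sensitive personal data
    "explanation_required": True,
}

def downgrade(level: ABTLevel) -> ABTLevel:
    """Lower a level by one step, saturating at Low, e.g. when an
    assumption (dynamic environment, high complexity) conflicts with
    the algorithm's inherent strengths."""
    return ABTLevel(max(ABTLevel.LOW, level - 1))

# Example: rule-based systems are highly accurate on well-defined rules,
# but a dynamic, complex judicial setting downgrades accuracy to Moderate.
assert downgrade(ABTLevel.HIGH) is ABTLevel.MODERATE
```

Using an ordered enum keeps the later likelihood and risk combinations simple, since levels can be compared and capped numerically.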
With these considerations, in our judicial case we make the following assumptions: (i) the operational environment for the AI system is dynamic, (ii) the complexity of the problem is high, (iii) high-quality datasets are available, free of bias and sensitive personal information, and (iv) an explanation of the decisions is required. Under these assumptions, in the following we qualitatively evaluate the compatibility of the two distinct types of algorithms with TAI principles.

3.2.1. Rule-based AI

These AI systems are well suited to applications that require small amounts of data and simple, straightforward rules. They exhibit high accuracy thanks to the deterministic outcomes of well-defined rules; however, since we assume a dynamic operational environment and a complex problem, we consider a moderate level for the accuracy principle. These algorithms can be very robust if the rules are well-crafted to handle various edge cases, but they may falter in scenarios not covered by the existing rules; therefore, their robustness can also be considered moderate. They stand out for their high explainability and accountability, as their rule-based nature makes them transparent and easy to understand, even for non-experts.

3.2.2. ML-based AI

These AI systems, particularly suited to environments with abundant data, vary in their alignment with TAI principles. For the sake of simplicity, we focus only on four key supervised ML models: Linear Regression (LR), Decision Trees (DT), Support Vector Machines (SVM), and Neural Networks (NNs). LR is chosen for its fundamental approach to data modeling. DTs offer a more intricate decision-making structure. SVMs are known for their efficiency in high-dimensional spaces, while NNs, especially in deep learning, handle complex tasks like image and language processing. These models collectively represent the diverse capabilities of ML and provide insights into their trustworthiness in dynamic, data-intensive scenarios.

For the accuracy and explainability principles, there is a notable trade-off across the algorithms. In the literature [14, 15], there has been a comprehensive comparison of different ML models in terms of their accuracy and explainability levels. The LR and DT algorithms, while offering high levels of explainability due to their transparent nature, may not achieve the same level of accuracy in complex scenarios as their more sophisticated counterparts. On the other hand, SVMs and neural networks, especially in their advanced forms, are capable of handling complex, high-dimensional data with greater accuracy but often sacrifice explainability, presenting a challenge in understanding the rationale behind their decisions. When it comes to robustness, SVMs are distinguished by their high resilience, particularly against adversarial attacks, thanks to their strong generalization capabilities. NNs, despite their adeptness at complex pattern recognition, exhibit moderate to low robustness and are vulnerable to adversarial examples, requiring specialized methods like adversarial training to enhance their robustness. DTs offer a moderate level of robustness, valued more for their interpretability than their resistance to adversarial examples, while LR models are less robust, particularly in complex datasets and adversarial environments. In terms of accountability, LR models excel due to their straightforward and transparent nature, which makes tracing decisions back to specific data points relatively easy. DTs also score highly in this regard, due to their clear decision-making paths. SVMs, particularly with non-linear kernels, present a more complex picture, offering moderate to low accountability due to the intricacies involved in their decision-making processes. NNs are at the lower end of the spectrum in terms of accountability, often described as "black boxes" due to their complex, layered structures, although techniques like layer-wise relevance propagation (LRP) and SHAP⁴ values are employed to enhance their interpretability. The aspects of fairness and privacy are also pivotal in evaluating the TAI alignment of ML algorithms. The fairness of algorithms such as LR, DTs, SVMs, and NNs is predominantly governed by the nature of their training data. Since these algorithms inherently lack bias, any unfairness in decision-making largely stems from biases present in the training data. This reality highlights the importance of careful data collection and processing, ensuring that the data is representative and free of biases to maintain fairness in the outcomes. Alongside fairness, privacy considerations in these algorithms are crucial, yet they are not intrinsic to the algorithms themselves. Instead, privacy risks are closely tied to how the data is handled. Ensuring the privacy and security of data, especially sensitive personal information, is vital, regardless of the algorithm in use. Effective data handling practices, including anonymization and secure storage, play a critical role in mitigating privacy risks in machine learning applications. Therefore, in both fairness and privacy, the emphasis shifts from the algorithmic design to the careful management of the data they process. In Table 1, we summarize the ABT levels for rule-based and ML-based algorithms.

⁴ https://github.com/shap/shap

Table 1
Qualitative comparison between the algorithms and their alignment with TAI principles. Legend: L = Low, M = Moderate, H = High.

TAI Principles    Rule-based   LR   DT   SVM   NNs
Accuracy          M            L    H    H     H
Robustness        M            L    M    H     M
Explainability    H            H    H    L     L
Accountability    H            H    H    M     L
Privacy           Depends on data handling, not inherent to the model.
Fairness          Depends on the data pipeline.
This comparison, which provides a framework to gauge how various algorithms align with TAI principles, supports the risk assessment process effectively. In the next section, we propose a risk-based approach in which these comparative insights become a vital factor in evaluating AI trustworthiness and assessing risk levels.

4. The Risk-based Approach

The primary goal of this approach is to support judges and legal practitioners with a set of best practices when utilizing AI tools in their judicial work. This includes providing them with a clear understanding of the potential risks associated with these tools and offering actionable suggestions to mitigate those risks, ensuring responsible and informed use of AI in legal settings. The approach is a semi-automated process that requires user interaction at the beginning in order to collect useful information about the AI tool. It assesses risks associated with the use of AI tools, focusing on their alignment with TAI principles and their role in legal contexts. Before diving into the approach, we make some assumptions: (i) the user has some experience using the AI tool, (ii) the user does not know anything about the technical details behind the AI tool, and (iii) the user knows only about the required input and output. Risk is typically defined as a function of two values, Likelihood and Impact (i.e., Risk = f(L, I)). Similarly, we formulate the likelihood as a function of two values, ABT and Control Effectiveness (CE). ABT refers to the degree to which the AI tool's algorithm aligns with TAI principles: it assesses whether the algorithmic design and functionality inherently support or conflict with these principles. For instance, a tool built on deep neural networks has a high level of accuracy in prediction, while its "black-box" nature makes it less explainable (see Table 1). CE, instead, represents the effectiveness of the implemented controls in mitigating risks associated with the AI tool. For example, strict access controls and logging mechanisms increase confidentiality and mitigate the risk to the privacy principle. The combination of these two values produces the Likelihood level, which collectively evaluates the probability of a TAI principle being compromised. The Impact measures the criticality of the use-case scenario in terms of each TAI principle: it assesses the potential consequences of the principle being compromised within the context of the tool's application.

Figure 2 illustrates the proposed approach, organized sequentially into four steps—Data Collection, Data Modeling & Analysis, Risk Evaluation, and Suggestion—which operates in two modes: user-only (M1) or user-plus-developer (M2). The figure employs a color-coded system to differentiate the specific actions and processes associated with each mode: elements highlighted in blue pertain to the User, those in green correspond to the Developer, and the components in black apply to both modes. Below, we explain each step concisely.

Data Collection. Data collection is performed through comprehensive questionnaires that cover multiple factors regarding the development of AI tools. Depending on the involvement of the AI developer, three different questionnaires are provided, i.e., Q1-TAI Implementation, Q2-Criticality, and Q3-Algorithmic.

Data Modeling & Analysis. The results obtained from the questionnaires in the previous step flow into this step as essential inputs. Based on the scenario mode, two models can be generated: (i) the Basic model, which considers the M1 mode, and (ii) the Advanced model, which is enriched by the involvement of both the AI developer and the user. The Advanced model extends beyond user feedback by integrating technical insights, allowing for a more intricate analysis of the AI tool's alignment with TAI principles. Several automated processes in this step are connected to the responses obtained for the questionnaires, namely CE Assessment (P1), ABT Assessment (P2), Algorithmic Estimation (P3), and Criticality Analysis (P4). Below, we provide a brief description of each process. P1. This process analyzes the responses to Q1, determining CE levels for each TAI principle. For each principle, specific properties are identified (as depicted in Figure 1), with each property being assessed through a series of targeted questions. P2. To conduct this analysis, we preliminarily need to identify the algorithm used in the AI system. In M2 mode, this identification is straightforward, as the developer specifies the algorithm. In M1 mode, two scenarios arise: if the tool's documentation is available, the user can specify its algorithm; if not, or if the user is unable to specify the algorithm, the user is prompted to complete Q3, which is part of the subsequent P3 process. P3. This process is performed in the M1 mode and helps us uncover the algorithm through the responses to Q3, which determine whether the algorithm is rule-based or ML-based. P4. This analysis uses the user's responses to Q2. We established a correlation between each question in Q2 and the TAI principles (the correlations are constant in our approach), which aids in assessing the extent to which the principles of TAI may be affected in light of the specific use-case scenarios provided by the user.

Figure 2: The proposed risk-aware approach. (The figure shows the four steps—Data Collection, Data Modeling & Analysis, Risk Evaluation, and Suggestion—with their questionnaires; the CE Assessment, ABT Assessment, Algorithmic Estimation, and Criticality Analysis processes; the resulting CE, ABT, likelihood, impact, and risk levels; and the translation of the risk profile into a suggestion report, distinguishing user-only (M1) from user-plus-developer (M2) actions.)
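The Risk = f(L, I) formulation, with Likelihood derived from ABT and CE, can be sketched as a small qualitative calculation. The combination matrices below are illustrative assumptions (the paper does not prescribe them); they only show one consistent way to compose ordered Low/Moderate/High levels:

```python
# Sketch of the risk computation: Likelihood = g(ABT, CE), Risk = f(L, I).
# Levels are ordered L < M < H; the matrices below are assumptions.
LEVELS = ["L", "M", "H"]

# Likelihood: low ABT alignment and weak controls imply a high probability
# of the principle being compromised, hence the inversion of ABT and CE.
LIKELIHOOD = {(abt, ce): LEVELS[max(2 - LEVELS.index(abt), 2 - LEVELS.index(ce))]
              for abt in LEVELS for ce in LEVELS}

# Risk matrix: the worse of likelihood and impact dominates.
RISK = {(lik, imp): LEVELS[max(LEVELS.index(lik), LEVELS.index(imp))]
        for lik in LEVELS for imp in LEVELS}

def risk(abt: str, ce: str, impact: str) -> str:
    """Combine ABT and CE into a likelihood, then combine it with impact."""
    likelihood = LIKELIHOOD[(abt, ce)]
    return RISK[(likelihood, impact)]

# Example: a deep neural network with low explainability alignment and weak
# controls, used in a highly critical scenario, yields a high risk level.
assert risk(abt="L", ce="L", impact="H") == "H"
```

In this sketch, either a poorly aligned algorithm or weak controls is enough to raise the likelihood, matching the intuition that CE compensates for, but does not erase, algorithmic weaknesses.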
Risk Evaluation. In this step, we conduct likelihood and impact assessments based on the output of the previous step. Depending on the mode, the risk assessment yields varying risk levels. In fact, the difference between the two models lies in the input they provide for assessing likelihood. The Basic model operates under constraints due to the lack of developer involvement, overlooking both (i) detailed algorithmic insights—the tool's documentation may not be available, or the user may be unable to extract information about the tool's algorithm, so the model relies only on a general estimation—and (ii) CE levels. The Advanced model, instead, integrates insights from both actors, providing a comprehensive perspective on the AI tool's trustworthiness.

Suggestion. Upon completing the risk assessment step with either the Basic or the Advanced model, the next step is translating the risk profiles into concrete suggestions. This step aims to empower legal practitioners with actionable insights to enhance their awareness regarding the trustworthiness and reliability of the AI tool within their judicial workflows.

5. Conclusion

We proposed a risk-based approach that offers a systematic method for evaluating and managing potential risks associated with AI applications in judicial contexts. Combining user feedback, particularly from judges, with technical insights, our approach assesses the alignment of AI tools with TAI principles. Through this semi-automated process, we aim to enhance awareness and accountability in AI usage within legal frameworks.

Acknowledgments

This work was partially supported by the JuLIA project, funded by the Justice Programme of the European Union — JuLIA (101046631), JUST – 2021 JTRA.

References

[1] B. Li, P. Qi, B. Liu, S. Di, J. Liu, J. Pei, J. Yi, B. Zhou, Trustworthy AI: From principles to practices, ACM Computing Surveys 55 (2023) 1–46.
[2] L. Cobianchi, J. M. Verde, T. J. Loftus, D. Piccolo, F. Dal Mas, P. Mascagni, A. G. Vazquez, L. Ansaloni, G. R. Marseglia, M. Massaro, et al., Artificial intelligence and surgery: ethical dilemmas and open issues, Journal of the American College of Surgeons 235 (2022) 268–275.
[3] A. Reichman, Y. Sagy, S. Balaban, From a panacea to a panopticon: the use and misuse of technology in the regulation of judges, Hastings LJ 71 (2019).
[4] L. Winmill, Technology in the judiciary: One judge's experience, Drake L. Rev. 68 (2020) 831.
[5] W. De Mulder, P. Valcke, J. Baeck, A collaboration between judge and machine to reduce legal uncertainty in disputes concerning ex aequo et bono compensations, Artificial Intelligence and Law 31 (2023) 325–333.
[6] AI HLEG, High-level expert group on artificial intelligence, 2019.
[7] B. Li, P. Qi, B. Liu, S. Di, J. Liu, J. Pei, J. Yi, B. Zhou, Trustworthy AI: From principles to practices, ACM Computing Surveys 55 (2023) 1–46.
[8] B. Stanton, T. Jensen, et al., Trust and artificial intelligence, preprint (2021).
[9] L. N. Tidjon, F. Khomh, The different faces of AI ethics across the world: a principle-implementation gap analysis, arXiv:2206.03225 (2022).
[10] J. M. Wing, Trustworthy AI, Communications of the ACM 64 (2021) 64–71.
[11] D. Kaur, S. Uslu, K. J. Rittichier, A. Durresi, Trustworthy artificial intelligence: a review, ACM Computing Surveys (CSUR) 55 (2022) 1–38.
[12] S. Thiebes, S. Lins, A. Sunyaev, Trustworthy artificial intelligence, Electronic Markets 31 (2021).
[13] L. N. Tidjon, F. Khomh, Never trust, always verify: a roadmap for trustworthy AI?, arXiv:2206.11981 (2022).
[14] G. Yang, Q. Ye, J. Xia, Unbox the black-box for the medical explainable AI via multi-modal and multi-centre data fusion: A mini-review, two showcases and beyond, Information Fusion 77 (2022) 29–52.
[15] T. A. Abdullah, M. S. M. Zahid, W. Ali, A review of interpretable ML in healthcare: taxonomy, applications, challenges, and future directions, Symmetry 13 (2021) 2439.