=Paper= {{Paper |id=Vol-3762/469 |storemode=property |title=A Risk-based Approach to Trustworthy AI Systems for Judicial Procedures |pdfUrl=https://ceur-ws.org/Vol-3762/469.pdf |volume=Vol-3762 |authors=Majid Mollaeefar,Eleonora Marchesini,Roberto Carbone,Silvio Ranise |dblpUrl=https://dblp.org/rec/conf/ital-ia/MollaeefarMCR24 }} ==A Risk-based Approach to Trustworthy AI Systems for Judicial Procedures== https://ceur-ws.org/Vol-3762/469.pdf
                                A Risk-based Approach to Trustworthy AI Systems for
                                Judicial Procedures
                                Majid Mollaeefar1,* , Eleonora Marchesini1 , Roberto Carbone1 and Silvio Ranise1,2
                                1
                                    Fondazione Bruno Kessler, Center for Cybersecurity, Trento, Italy
                                2
                                    Department of Mathematics, University of Trento, Italy


                                                   Abstract
                                                   In the rapidly evolving landscape of Artificial Intelligence (AI), ensuring the trustworthiness of AI tools deployed in sensitive
                                                   use cases, such as judicial or healthcare processes, is paramount. The management of AI risks in judicial systems necessitates
                                                   a holistic approach that includes various elements, such as technical, ethical considerations, and legal responsibilities. This
                                                   approach should not only involve the application of risk management frameworks and regulations but also focus on the
                                                   education and training of legal professionals. For this, we propose a risk-based approach designed to evaluate and mitigate
                                                   potential risks associated with AI applications in judicial settings. Our approach is a semi-automated process that integrates
                                                   both user (i.e., judge) feedback and technical insights to assess the AI tool’s alignment with Trustworthy AI principles.

                                                   Keywords
                                                   Judicial AI, Risk-aware, Trustworthy AI, Trustworthiness Risk Assessment.



                                1. Introduction                                                         countability and explainability of AI systems. As these
                                In recent years, the adoption of Artificial Intelligence (AI)           systems become integral to decision-making processes, it
                                technologies has surged across various industries and                   is essential to comprehend how they reach their conclu-
                                domains. AI systems now play a pivotal role in making                   sions or recommendations. TAI increases transparency
                                critical decisions, automating tasks, and augmenting hu-                and offers mechanisms for interpreting the rationale be-
                                man capabilities. However, with the expanding influence                 hind AI-generated decisions, allowing users and stake-
                                and complexity of AI, it is crucial to ensure the develop-              holders to hold systems accountable. Cobianchi et al. [2]
                                ment and deployment of Trustworthy AI (TAI) systems.                    emphasize the importance of accountability, technical ro-
                                TAI encompasses the creation and implementation of AI                   bustness, and transparency in AI applications in surgery,
                                technologies adhering to a set of principles that promote               which can be extended to other domains. Third, TAI
                                transparency, fairness, accountability, and robustness. By              aids in mitigating risks associated with AI technologies.
                                designing TAI systems, the aim is to inspire trust among                If developed or deployed irresponsibly, AI systems can
                                users, stakeholders, and society as a whole where these                 introduce numerous risks, including privacy breaches,
                                systems must operate reliably, ethically, and in a man-                 biased decision-making, safety concerns, and the perpet-
                                ner that respects fundamental rights and values. The                    uation of social inequalities. Addressing these risks is
                                significance of TAI cannot be overstated, as it has the                 vital to protect individuals, organizations, and society
                                potential to address pressing concerns that arise from                  from potential harm and adverse consequences.
                                increasing reliance on AI systems. Some notable rea-                       The AI Act draft proposal for a Regulation1 of the Euro-
                                sons why it is critical for AI systems to be designed with              pean Parliament and of the Council laying down harmo-
                                trustworthiness in mind including the following three;                  nized rules on AI represents the first attempt to enact a
                                First, TAI cultivates user confidence and trust by ensur-               horizontal AI regulation. This proposed legal framework,
                                ing that personal data is handled responsibly, decisions                focusing specifically on the use of AI systems, advocates
                                made by AI systems are fair and unbiased, and privacy                   for a technology-neutral definition of AI systems in EU
                                is protected. This is critical for building user confidence             legislation. It emphasizes a risk-based approach where
                                and trust in AI systems. The authors in [1] discuss the                 AI systems are classified with varying obligations pro-
                                theoretical framework of AI trustworthiness, including                  portional to their level of risk. The AI Act categorizes
                                aspects of privacy preservation and fairness, which are                 risks into four levels: minimal, limited, high, and unac-
                                key to fostering user trust. Second, TAI bolsters the ac-               ceptable (i.e., the latter are not permitted to be sold on
                                                                                                        the EU market). It focuses on high-risk AI applications
                                Ital-IA 2024: 4th National Conference on Artificial Intelligence, orga- (HRAI) by setting specific requirements and obligations
                                nized by CINI, May 29-30, 2024, Naples, Italy                           for both users and providers of these applications. This
                                *
                                  Corresponding author.
                                                                                                        includes a conformity assessment before market place-
                                $ mmollaeefar@fbk.eu (M. Mollaeefar); emarchesini@fbk.eu
                                (E. Marchesini); carbone@fbk.eu (R. Carbone); ranise@fbk.eu
                                (S. Ranise)                                                                                                 1
                                                                                                                                                https://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=
                                             © 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License
                                             Attribution 4.0 International (CC BY 4.0).                                                         CELEX:52021PC0206




CEUR
                  ceur-ws.org
Workshop      ISSN 1613-0073
Proceedings
ment or service commencement, enforcement measures                   developing techniques for better human understanding
post-market placement, and a governance structure at                 of ML-generated algorithms. The choice between tra-
both European and national levels. The aim is to ensure              ditional and modern methods depends on the specific
that obligations are aligned with the associated risk level          application’s needs, including considerations of security
of each AI system.                                                   and trustworthiness. An effective risk analysis is crucial
   One of the areas where AI holds a sensible impact is in           in determining the suitability of an AI-produced algo-
the legal context, where for instance judges can benefit             rithm for a given scenario.
from the presence of automated decision-making in ju-
dicial proceedings [3, 4], potentially reducing the effort
                                                                     2.2. Trustworthy AI
                                                                     Trustworthiness is a prerequisite for people and soci-
required to search through documents, seek out relevant
                                                                     eties to develop, deploy and use AI systems. Without
legal provisions, or support them in complex cases where
                                                                     AI systems—and the human beings behind them—being
the human capacity to detect patterns is limited [5]. AI
                                                                     demonstrably worthy of trust, unwanted consequences
tools like ChatGPT, while useful, present several limi-
                                                                     may ensue, and their uptake might be hindered, pre-
tations in legal contexts. They may produce inaccurate
                                                                     venting the realization of the potentially vast social and
information, as demonstrated in cases like Roberto Mata
                                                                     economic benefits that they can bring [6]. In the past
vs Avianca2 , where reliance on ChatGPT led to legal
                                                                     few decades, the success of ML has primarily been evalu-
issues due to the citation of non-existent cases. This
                                                                     ated based on its quantitative accuracy, which has made
stresses the necessity for legal professionals, particularly
                                                                     training AI models much more manageable. Predictive
judges, to be acutely aware of the risks associated with
                                                                     accuracy has also become the standard measure for de-
their use of HRAI Systems. In this paper, we introduce a
                                                                     termining the superiority of an AI product. However,
risk-based approach designed to evaluate and mitigate
                                                                     with the widespread use of AI, the limitations of using
potential risks associated with the trustworthiness of AI
                                                                     accuracy as the sole measurement have become apparent,
applications in judicial settings.
                                                                     as new challenges have arisen, such as malicious attacks
2. Background                                                        and the misuse of AI. To address these challenges, the AI
                                                                     community has recognized that factors beyond accuracy
Below we introduce background information to better                  need to be considered and improved when building an
perceive the approach.                                               AI system. Recently, a number of enterprises, academia,
2.1. AI Algorithms                                                   public sectors, and organizations have identified princi-
In the realm of AI, the development of algorithms falls              ples of AI trustworthiness that go beyond accuracy-based
into two primary views: traditional and modern. The                  measurements [7]. According to [8], the current degree
traditional approach involves human-created models for               of trustworthiness of an AI system is dependent on how
specific problems or computations, where a limited set               the user perceives its technical characteristics. Various
of features and a fixed sequence of instructions are em-             organizations, including the G20, the EU Parliament, the
ployed. This method, exemplified by classical planning               General Partnership on AI (GPAI), and the Organisation
in autonomous systems, relies on symbolic representa-                for Economic Co-operation and Development3 (OECD)
tions and a predefined set of rules, necessitating heuris-           have proposed different principles for ensuring trustwor-
tics to navigate the vast potential state spaces. Despite            thiness in AI systems [9]. The OECD, for instance, has
its rigidity, this approach allows for the construction of           put forward a set of five principles aimed at promoting
algorithms that are easily understood and verified by hu-            TAI: (i) inclusive growth, sustainable development and
mans. Conversely, the modern perspective, dominated                  well-being, (ii) human-centered values and fairness, (iii)
by Machine Learning (ML), leverages large datasets to                transparency and explainability, (iv) robustness, security
generate rules for problem-solving. Through processes                and safety, and (v) accountability. The use of AI is in-
like training and deployment, algorithms are formulated              tended to promote human good and well-being, and as
to classify or interpret data, such as classifying images            such, it should not cause any harm. AI systems must be
of dogs and cats. The ML-based methods benefit from                  characterized by fairness, accuracy, and reliability, and
the ability to tackle complex problems without extensive             should not be discriminatory. To be considered trustwor-
human ingenuity, employing powerful optimization tech-               thy, AI systems must be transparent and explainable,
niques. However, it faces challenges such as potential               meaning they should have the necessary capabilities,
imprecision, bias in training data, and the complexity of            functions, and features to achieve user goals, with their
the resulting algorithms making them difficult for hu-               algorithms being easily understood by users. Addition-
mans to comprehend. Strategies to mitigate these issues              ally, AI systems must be resilient to threats that may
include performance monitoring, dataset filtering, and               try to exploit their normal behaviors and turn them into
                                                                     harmful ones. In the literature, additional principles have
2
    https://law.justia.com/cases/federal/district-courts/new-york/
                                                                     3
    nysdce/1:2022cv01461/575368/54/                                      https://oecd.ai/
                                                                              Unbiasedness
been proposed such as accuracy [10], acceptance [11],                       Non-discrimination     Fairness

predictability and performance [12]. The AI HLEG [6],                           Diversity


has focused on the concept of TAI, offering guidance                           Compliance

in the form of a framework and identifying seven key                           Auditability      Accountability




                                                                                                                              Ethical
ethical and technical requirements.                                            Traceability


                                                                              Transparency

3. Our View on Trustworthiness                                                    Trust          Explainability

                                                                             Interpretability
In our analysis of the literature on finding principles of
trustworthiness in AI, the commonly agreed-upon prin-
                                                                             Confidentiality
                                                                                                    Privacy
                                                                               Anonymity




                                                                                                                  Technical
ciples are accuracy, robustness, privacy, explainability,
accountability, and fairness. While these six principles                         Security



are widely acknowledged in the literature, there are ad-
                                                                                 Safety           Robustness

                                                                                Resiliency

ditional considerations that can be incorporated within                          Integrity

them. For instance, the concept of “human in the loop”                          Reliability        Accuracy

can be viewed as an aspect of fairness. We differentiate be-                  Data Validity


tween properties and principles. While both concepts are
related and work together to ensure the overall trustwor-      Figure 1: TAI principles and properties relationship.
thiness of AI systems, they represent different aspects
of the trustworthiness framework. Properties refer to
specific characteristics or attributes of an AI system that    rithms [13]. Different AI models exhibit variability in
contribute to ensure a principle. For instance, integrity,     how they align with TAI principles. This variation stems
reliability, and data validity can be considered as prop-      from the inherent differences in model structures, train-
erties relevant to the accuracy principle; Integrity refers    ing methods, data used, and their intended applications.
to the quality of an AI system being honest, consistent,       For example, a model designed for healthcare decision
and maintaining the integrity of the data and algorithms       support may prioritize accuracy and privacy, while one
it operates on. It ensures that the AI system is resistant     for autonomous vehicles might focus more on safety and
to unauthorized modifications or tampering. Reliability,       robustness. The data used to train AI models significantly
focuses on the consistency and dependability of an AI          affects their trustworthiness. A model trained on limited
system’s performance. A reliable AI system consistently        or biased data may exhibit lower trustworthiness due to
produces accurate results over time and under different        its potential to generate skewed or unfair results. Addi-
conditions. Data validity refers to the quality and correct-   tionally, the type of algorithm—whether it is rule-based
ness of the data used by an AI system to generate outputs.     or learning-based—plays a crucial role in determining
Valid data ensures that the information processed by the       the model’s reliability, fairness, and transparency [13].
AI system is accurate, relevant, and representative of
the problem domain. On the other hand, principles rep-         3.2. Algorithm-based Trustworthiness
resent high-level guidelines or concepts that guide the    The relationship between algorithms and TAI principles
development and deployment of TAI systems. The rela-       is a critical aspect of responsible AI development and
tionship between properties and principles lies in how     deployment. TAI principles serve as benchmarks against
                                                           which the performance and ethical considerations of al-
properties contribute to fulfilling the principles. Figure 1
depicts the relationship between properties and six essen- gorithms can be evaluated. Each algorithm has its own
tial principles for TAI, categorized into either technical,set of advantages and limitations that align or conflict
ethical, or both. Accuracy and robustness serve as tech-   with these principles, making it essential to investigate
nical principles, whereas fairness and accountability fall their compatibility in specific use cases. Since each al-
within the ethical domain. Located in the center of the    gorithm has a distinct set of characteristics, their com-
figure, privacy, and explainability are unique principles  patibility with TAI principles can differ significantly; in
that encompass both the technical and ethical facets.      other words, they have different compliance levels. To
                                                           define Algorithm-based Trustworthiness (ABT) levels, it
3.1. AI Algorithms & Trustworthiness                       is essential to consider both the inherent characteristics
Trustworthiness in AI is a multifaceted concept, often of each algorithm and the specific attributes related to
seen as a relationship between two entities—the AI sys- each AI principle. We define the following qualitative
tem and its user. The trustworthiness of an AI system is levels for this assessment; High: The algorithm inher-
largely dependent on how it is perceived by the user in ently aligns with the AI principle in question, requiring
terms of its technical characteristics. This perception is minimal or no additional measures to ensure compliance.
influenced by various factors, including the type of AI Moderate: While the algorithm generally aligns with
model, its application context, and the underlying algo- the principle, additional safeguards or contextual consid-
erations may be necessary. Low: The algorithm poses           achieve the same level of accuracy in complex scenar-
challenges or risks that make it difficult to align with      ios as their more sophisticated counterparts. On the
the AI principle, and significant adjustments or limita-      other hand, SVMs and neural networks, especially in
tions would be required for compliance. To conduct a          their advanced forms, are capable of handling complex,
comparison between rule-based and ML-based AI algo-           high-dimensional data with greater accuracy but often
rithms, we need to consider some assumptions such as          sacrifice explainability, presenting a challenge in under-
consistency of environment (i.e., static or dynamic), the     standing the rationale behind their decisions. When it
complexity of problems, availability and quality of data,     comes to robustness, SVMs are distinguished by their
risk of bias, need for transparency, and explainability.      high resilience, particularly against adversarial attacks,
With these considerations, in our judicial case, we take      thanks to their strong generalization capabilities. NNs,
these assumptions; (i) the operational environment for        despite their adeptness at complex pattern recognition,
the AI system is dynamic, (ii) the complexity of the prob-    exhibit moderate to low robustness and are vulnerable to
lem can be considered as high, (iii) the high quality of      adversarial examples, requiring specialized methods like
datasets are available, free of bias and sensitive personal   adversarial training to enhance their robustness. DTs
information, and (iv) the explanation of the decisions is     offer a moderate level of robustness, valued more for
required. With these considerations, in the following,        their interpretability than their resistance to adversarial
we qualitatively evaluate the compatibility of the two        examples, while LR models are less robust, particularly
distinct types of algorithms with TAI principles.             in complex datasets and adversarial environments. In
3.2.1. Rule-based AI                                          terms of accountability, LR models excel due to their
These AI systems are perfectly suited to applications that    straightforward and transparent nature, which makes
require small amounts of data and simple, straightfor-        tracing decisions back to specific data points relatively
ward rules. These algorithms exhibit high accuracy due        easy. DTs also score highly in this regard, due to their
to deterministic outcomes from well-defined rules. How-       clear decision-making paths. SVMs, particularly with
ever, since the assumption of the operational environ-        non-linear kernels, present a more complex picture, offer-
ment is dynamic and the problem is complex, we consider       ing moderate to low accountability due to the intricacies
a moderate level for the accuracy principle. These algo-      involved in their decision-making processes. NNs are at
rithms can be very robust if the rules are well-crafted       the lower end of the spectrum in terms of accountability,
to handle various edge cases. But they may falter in          often described as “black boxes” due to their complex, lay-
scenarios not covered by the existing rules, therefore,       ered structures, although efforts like layer-wise relevance
their robustness can also be considered moderate. These       propagation (LRP) and SHAP4 values are employed to
algorithms stand out for their high explainability and        enhance their interpretability. The aspects of fairness and
accountability, as their rule-based nature makes them         privacy are also pivotal in evaluating the TAI alignment
transparent and easy to understand, even for non-experts.     of ML algorithms. The fairness of algorithms such as
                                                              LR, DTs, SVMs, and NNs is predominantly governed by
3.2.2. ML-based AI                                            the nature of their training data. Since these algorithms
These AI systems, particularly suited for environments        inherently lack bias, any unfairness in decision-making
with abundant data, vary in their alignment with TAI          largely stems from biases present in the training data.
principles. For the sake of simplicity, we focus only on      This reality highlights the importance of precise data
four key supervised ML models; Linear Regression (LR),        collection and processing, ensuring that the data is rep-
Decision Trees (DT), Support Vector Machines (SVM),           resentative and free of biases to maintain fairness in the
and Neural Networks (NNs). LR is chosen for its fun-          outcomes. Alongside fairness, privacy considerations in
damental approach to data modeling. DTs offer a more          these algorithms are crucial, yet they are not intrinsic
intricate decision-making structure. SVMs are known           to the algorithms themselves. Instead, privacy risks are
for their efficiency in high-dimensional spaces, while        closely tied to how the data is handled. Ensuring the pri-
NNs, especially in deep learning, handle complex tasks        vacy and security of data, especially sensitive personal
like image and language processing. These models col-         information, is vital, regardless of the algorithm in use.
lectively represent the diverse capabilities of ML and        Effective data handling practices, including anonymiza-
provide insights into their trustworthiness in dynamic,       tion and secure storage, play a critical role in mitigating
data-intensive scenarios. For accuracy and explainability     privacy risks in machine learning applications. There-
principles, there is a notable trade-off observed across      fore, in both fairness and privacy, the emphasis shifts
the algorithms. In the literature [14, 15], there has been    from the algorithmic design to the careful management
a comprehensive comparison of different ML models in          of the data they process. In Table 1, we summarized the
terms of their accuracy and explainability level. The         ABT levels for rule-based and ML-based algorithms. This
LR and DT algorithms, while offering high levels of ex-
plainability due to their transparent nature, may not         4
                                                                  https://github.com/shap/shap
Table 1                                                                    sesses the potential consequences of the principle being
Qualitative comparison between the algorithms and their                    compromised within the context of the tool’s application.
alignment with TAI principles. Legend; Low, Moderate, High                    Figure 2 illustrates the proposed approach is organized
   TAI Principles   Rule-based           ML-based (Supervised)             sequentially into four steps: Data Collection, Data Model-
   Accuracy             M
                                    LR
                                     L
                                           DT
                                            H
                                                    SVM
                                                     H
                                                                 NNs
                                                                  H
                                                                           ing & Analyzing, Risk Evaluation, and Suggestion which
   Robustness           M            L      H        M            M        operates in two modes: user-only (M1) or user-plus devel-
   Accountability
   Explainability
                        H
                        H
                                    H
                                    H
                                            M
                                            M
                                                     L
                                                     L
                                                                  L
                                                                  L
                                                                           oper (M2). The figure employs a color-coded system to
   Privacy          Depends on data handling, not inherent to the model.   differentiate between the specific actions and processes
   Fairness         Depends on the data pipeline.                          associated with each mode: elements highlighted in blue
                                                                           pertain to the User, those in green correspond to the
                                                                           Developer, and the components in black apply to both
comparison, which provides a framework to gauge how                        modes. Below, we explain each step concisely.
various algorithms align with TAI principles, supports                     Data Collection. The data collection process is going to
the risk assessment process effectively. In the next sec-                  be performed by having comprehensive questionnaires
tion, we will propose a risk-based approach, where these                   that cover multiple factors regarding the development of
comparative insights become a vital factor in evaluating                   AI tools. Depending on the involvement of the AI devel-
AI trustworthiness and assessing risk levels.                              oper, three different questionnaires are provided—i.e., Q1-
4. The Risk-based Approach                                                 TAI Implementation, Q2-Criticality, and Q3-Algorithmic.
                                                                           Data Modeling & Analysis. The results obtained from
The primary goal of this approach is to support judges                     the questionnaires in the previous step flow into this
and legal practitioners with a set of best practices when                  step as essential inputs. Based on the scenario mode, out
utilizing AI tools in their judicial work. This includes pro-              of this step, two models can be generated; (i) the Basic
viding them with a clear understanding of the potential                    model, which considers M1 mode, and (ii) the Advanced
risks associated with these tools and offering actionable                  model, which is enriched with the involvement of both
suggestions to mitigate these risks, ensuring responsible                  the AI developer and the user. The Advanced model
and informed use of AI in legal settings. The approach                     extends beyond user feedback by integrating technical
is a semi-automated process that requires user interac-                    insights, allowing for a more intricate analysis of the AI
tion at the beginning of the approach to collect useful                    tool’s alignment with TAI principles. There are different
information about the AI tool. This approach assesses                      automated processes in this step that are connected to
risks associated with the use of AI tools, focusing on their               each obtained response for the questionnaires, namely,
alignment with TAI principles and their role in legal con-                 CE Assessment (P1), ABT Assessment (P2), Algorithmic
texts. Before diving into the approach, we consider some                   Estimation (P3), and Criticality Analysis (P4). Below, we
assumptions; (i) the user has some experience using the                    provide a brief description of each process; P1. This pro-
AI tool, (ii) the user does not know anything about the                    cess analyses responses to Q1, determining CE levels for
technical details behind the AI tool, (iii) the user knows                 each TAI principle. For each principle, specific properties
only about the required input and output. Typically risk                   are identified (as depicted in Figure 1), with each property
defines as a function of two values Likelihood and Impact                  being assessed through a series of targeted questions. P2.
(i.e., Risk =𝑓 (L,I)). Similarly, we formulate the likelihood              To conduct this analysis, preliminary we need to identify
as function of two values which are ABT and Control                        the algorithm used in the AI system. In M2 mode, this
effectiveness (CE), where the ABT refers to the degree                     identification is straightforward as the developer spec-
to which the AI tool’s algorithm aligns with TAI prin-                     ifies the algorithm. In M1 mode, two scenarios arise: if
ciples. It assesses whether the algorithmic design and                     the tool’s documentation is available and the user can
functionality inherently support or conflict with these                    specify its algorithm; if not or the user is unable to spec-
principles. For instance, the tool utilized with deep neural               ify the algorithm, the user is prompted to complete Q3,
networks has a high level of accuracy in prediction while                  which is part of the subsequent P3 process. P3. This
their “black-box” nature makes them less explainable (see                  process performs in the case of M1 mode, which helps us
Table 1). Instead, the CE represents the effectiveness of                  uncover the algorithm through responding to Q3. The
implemented controls in mitigating risks associated with                   responses obtained from Q3 determine if the algorithm
the AI tool. For example, strict access controls and log-                  is rule-based or ML-based. P4. For this analysis, the
ging mechanisms increase confidentiality mitigate the                      user’s responses to Q2. We made a correlation between
risk to the privacy principle. The combination of these                    each question in Q2 and TAI principles (they are constant
two values produces the Likelihood level which collec-                     in our approach), which aids in assessing the extent to
tively evaluates the probability of a TAI principle being                  which the principles of TAI may be affected in light of
compromised. The Impact measures the criticality of the                    the specific use-case scenarios provided by the user.
use-case scenario in terms of each TAI principle. It as-                   Risk Evaluation. In this step, we conduct likelihood
                             Step 1                                        Step 2                         Step 3                               Step 4

                                                                       Data Modeling &
                          Data Collection                                                              Risk Evaluation                        Suggestion
                                                                           Analysis

                                                                                                                                                                                                  Legend
            Specify mitigation controls                                    CE levels
                                                CE Assessment
                                                                                                                                                                               Questionnaire
                                                                                          Likelihood    Likelihood levels
                                                                          ABT levels     Assessment                                                                              Actor action
                 Specify the algorithm
                                               ABT Assessment                                                                                                                  Process output
                      Yes
Developer
                                                         Uncover the algorithm                                                 Risk      Risk levels          Risk Profile        Automated
                                                                                                                            Assessment                        Translation          process
                Is the tool's         No          Algorithmic
            document available?                   Estimation
  Respond




                                                                                                                                                                                    Report


                                                                                                                                                           Suggestion Report
            Specify the use-case criticality                             Impact scores     Impact         Impact levels                                                        Scenario Mode
                                               Criticality Analysis
                                                                                         Assessment                                                                                   User-only
  User                                                                                                                                                                                User-plus Developer




Figure 2: The proposed risk-aware approach.



and impact assessments based on the previous step out-                 G. R. Marseglia, M. Massaro, et al., Artificial in-
put. Depending on the mode, the risk assessment yields                 telligence and surgery: ethical dilemmas and open
varying risk levels. In fact, the difference between these             issues, Journal of the American College of Surgeons
models lies in the input they provide for assessing likeli-            235 (2022) 268–275.
hood. The basic model operates under constraints due to            [3] A. Reichman, Y. Sagy, S. Balaban, From a panacea
a lack of developer involvement, overlooking both (i) de-              to a panopticon: the use and misuse of technology
tailed algorithmic insights, where it might be possible the            in the regulation of judges, Hastings LJ 71 (2019).
document of the tool is not available or the user may be           [4] L. Winmill, Technology in the judiciary: One
unable to extract information regarding the algorithmic                judge’s experience, Drake L. Rev. 68 (2020) 831.
information of the tool and only rely on a general esti-           [5] W. De Mulder, P. Valcke, J. Baeck, A collaboration
mation, and (ii) CE levels. Instead, the advanced model                between judge and machine to reduce legal uncer-
integrates insights from both actors, providing a compre-              tainty in disputes concerning ex aequo et bono com-
hensive perspective on the AI tool’s trustworthiness.                  pensations, Artificial Intelligence and Law 31 (2023)
Suggestion. Upon completing the risk assessment step                   325–333.
with either the basic or advanced model, the next step             [6] H. AI, High-level expert group on artificial intelli-
is translating the risk profiles into concrete suggestions.            gence, 2019.
This step aims to empower legal practitioners with (ac-            [7] B. Li, P. Qi, B. Liu, S. Di, J. Liu, J. Pei, J. Yi, B. Zhou,
tionable) insights to enhance their awareness regarding                Trustworthy ai: From principles to practices, ACM
the trustworthiness and reliability of the AI tool within              Computing Surveys 55 (2023) 1–46.
their judicial workflows.                                          [8] B. Stanton, T. Jensen, et al., Trust and artificial
                                                                       intelligence, preprint (2021).
5. Conclusion                                                      [9] L. N. Tidjon, F. Khomh, The different faces of ai
                                                                       ethics across the world: a principle-implementation
We proposed a risk-based approach that offers a system-
                                                                       gap analysis, arXiv:2206.03225 (2022).
atic method for evaluating and managing potential risks
                                                                  [10] J. M. Wing, Trustworthy ai, Communications of
associated with AI applications in judicial contexts. Com-
                                                                       the ACM 64 (2021) 64–71.
bining user feedback, particularly from judges, with tech-
                                                                  [11] D. Kaur, S. Uslu, K. J. Rittichier, A. Durresi, Trust-
nical insights, our approach assesses the alignment of AI
                                                                       worthy artificial intelligence: a review, ACM Com-
tools with TAI principles. Through this semi-automated
                                                                       puting Surveys (CSUR) 55 (2022) 1–38.
process, we aim to enhance awareness and accountability
                                                                  [12] S. Thiebes, S. Lins, A. Sunyaev, Trustworthy artifi-
in AI usage within legal frameworks.
                                                                       cial intelligence, Electronic Markets 31 (2021).
Acknowledgments                                                   [13] L. N. Tidjon, F. Khomh, Never trust, always verify:
This work was partially supported by the JuLIA project,                a roadmap for trustworthy ai?, arXiv:2206.11981
funded by the Justice Programme of the European Union                  (2022).
— JuLIA (101046631), JUST – 2021 JTRA.                            [14] G. Yang, Q. Ye, J. Xia, Unbox the black-box for the
                                                                       medical explainable ai via multi-modal and multi-
References                                                             centre data fusion: A mini-review, two showcases
                                                                       and beyond, Information Fusion 77 (2022) 29–52.
 [1] B. Li, P. Qi, B. Liu, S. Di, J. Liu, J. Pei, J. Yi, B. Zhou, [15] T. A. Abdullah, M. S. M. Zahid, W. Ali, A review of
      Trustworthy ai: From principles to practices, ACM                interpretable ml in healthcare: taxonomy, applica-
      Computing Surveys 55 (2023) 1–46.                                tions, challenges, and future directions, Symmetry
 [2] L. Cobianchi, J. M. Verde, T. J. Loftus, D. Piccolo,              13 (2021) 2439.
      F. Dal Mas, P. Mascagni, A. G. Vazquez, L. Ansaloni,