<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>ECLI:NL:GHDHA:</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Investigating the value of qualitative Bayesian networks of complete cases as "double-check" tools on traditional judicial reasoning: An exploratory study</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Leya Hampson</string-name>
          <email>l.l.hampson@rug.nl</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ludi van Leeuwen</string-name>
          <email>l.s.van.leeuwen@rug.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Bernoulli Institute of Mathematics</institution>
          ,
          <addr-line>Computer Science and Artificial Intelligence</addr-line>
          ,
          <institution>University of Groningen</institution>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Transboundary Legal Studies, Faculty of Law, University of Groningen</institution>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2015</year>
      </pub-date>
      <volume>2859</volume>
      <abstract>
        <p>The overarching aim of this study is to empirically examine the feasibility and value of complete-case Bayesian network (BN) modelling within judicial deliberation. Building on critiques concerning the subjectivity and perceived overprecision of quantitative BN approaches, it explores whether qualitative BNs can serve as "double-check" tools on traditional judicial reasoning. Employing a sequential, mixed-methods design comprising independent modelling, structured reflection, and collaborative comparison, two independent modellers constructed both qualitative and quantitative BNs of the same Dutch appellate verdict. The findings show that qualitative models can capture the essential reasoning structure of the court and assist in identifying implicit assumptions, incomplete dependencies and sources of uncertainty. Quantification significantly impacted the structure of the networks and highlighted the importance of precise and stable variable definitions in enhancing transparency and interpretability. While the qualitative BNs did not expose probabilistic fallacies, they aligned with the court's reasoning and revealed interpretive gaps, highlighting their heuristic rather than substitutive value.</p>
      </abstract>
      <kwd-group>
        <kwd>Bayesian networks</kwd>
        <kwd>Legal Reasoning</kwd>
        <kwd>Belief Updating</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        In the Netherlands, an ongoing debate concerns both the feasibility and utility of constructing
quantitative Bayesian Networks (BNs) of complete criminal cases. This debate has been fuelled by three
instances in which such models were presented during trial; twice by the prosecution and once by the
defence. In all three cases, the courts decided not to use their analyses (see [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] for an overview), citing
concerns about the reliability of the Bayesian method [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>A Bayesian model of a complete criminal case can take various forms, depending on several factors: the
employed modelling method (e.g., Bayesian Networks or linear Bayes), the level of detail included, and
whether the approach is qualitative or quantitative. It further depends on the individual(s) constructing
the model (e.g., investigators, experts, prosecution/defence, or the court itself) and on the phase of the
legal process in which the model is developed (e.g., during investigation, at trial, in the deliberation chamber, or
as an element of the written verdict). Finally, the purpose for which the model is constructed may differ:
from guiding investigations and structuring argumentation during trial, to assisting judicial reasoning or
serving as a means of ‘double-checking’ a verdict.</p>
      <p>In the present study, the focus lies on models constructed post-verdict by experts, serving as this
latter “double-check" function. A complete case in this context refers to a model incorporating all
items of evidence explicitly mentioned by the court in its verdict. Any fact or piece of information
cited by the court as part of its reasoning, irrespective of (un-)assigned probative value, is considered
part of the evidential set to be modelled. The inclusion of such information signals it played a role
in the court’s assessment and should therefore be included in the model. Model completeness also
involves addressing the relationships between various pieces of evidence, thus all dependencies should
be explicitly modelled.</p>
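      <p>The structural requirements above can be made concrete in code. The sketch below is our own illustration, not tooling used in the study, and every node name in it is a hypothetical placeholder: it represents a qualitative BN as a set of directed edges, verifies that the structure is acyclic, and flags any evidence item cited in a verdict that is missing from the model.</p>

```python
# Illustrative sketch only: hypothetical node names, not the study's variables.
from collections import defaultdict

# A qualitative BN is just a DAG: directed parent -> child edges, no CPTs yet.
edges = {
    ("H_defendant_guilty", "E_eyewitness_1"),
    ("H_defendant_guilty", "E_eyewitness_2"),
    ("H_defendant_at_scene", "H_defendant_guilty"),
    ("H_defendant_at_scene", "E_cell_site_data"),
}

def is_acyclic(edges):
    """Kahn's algorithm: a Bayesian network structure must contain no cycles."""
    children, indegree = defaultdict(set), defaultdict(int)
    nodes = {n for edge in edges for n in edge}
    for parent, child in edges:
        children[parent].add(child)
        indegree[child] += 1
    frontier = [n for n in nodes if indegree[n] == 0]
    visited = 0
    while frontier:
        node = frontier.pop()
        visited += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                frontier.append(child)
    return visited == len(nodes)

def uncovered_evidence(edges, evidence_items):
    """Completeness check: every evidence item cited in the verdict must be a node."""
    nodes = {n for edge in edges for n in edge}
    return [item for item in evidence_items if item not in nodes]

cited = ["E_eyewitness_1", "E_eyewitness_2", "E_cell_site_data", "E_cctv_footage"]
print(is_acyclic(edges))                 # True
print(uncovered_evidence(edges, cited))  # ['E_cctv_footage']
```

      <p>The second check mirrors the completeness criterion above: any cited item absent from the node set signals an incomplete model.</p>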
      <p>
        Several recurring arguments have been raised against the use of BNs in such comprehensive
applications. Constructing a BN of a criminal case requires both statistical and domain-specific expertise,
including an understanding of forensic evidence, investigative processes, and legal reasoning [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
Furthermore, at present, no uniform method exists for constructing BNs of criminal cases [
        <xref ref-type="bibr" rid="ref2 ref4 ref5 ref6 ref7">4, 5, 2, 6, 7</xref>
        ],
and thus modellers may differ in how they define variables, determine dependencies, and structure
the overall narrative of the network (e.g., employing an idiom-based approach [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] vs a scenario-based
approach [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]). These variations introduce an element of subjectivity [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], but these concerns become
especially pronounced at the quantification stage. Critics question the basis on which numerical
probabilities are assigned to the conditional probability tables – often summarised as the question ‘where do
the numbers come from?’ [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. In practice, the assignment of prior and conditional probabilities relies
heavily on expert judgement or subjective estimation rather than on empirical data [
        <xref ref-type="bibr" rid="ref10 ref11">10, 11</xref>
        ]. Even
when empirical data is available for constructing Bayesian networks, some variables inevitably concern
unique events – such as the probability that a particular defendant performed a specific act – for which
no empirical frequencies can exist [
        <xref ref-type="bibr" rid="ref12 ref8">8, 12</xref>
        ]. Critics argue that quantification can therefore convey a
misleading impression of precision, concealing subjective assumptions behind a facade of mathematical
rigour [
        <xref ref-type="bibr" rid="ref13 ref6">6, 13</xref>
        ]. This perceived precision can, in turn, lend the model undue persuasive force [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Judges,
typically untrained in Bayesian principles, may overvalue numerical outcomes or rely too heavily on
expert interpretation. Such reliance, whilst often unavoidable, can blur the boundary between evidential
assessment and external expertise, potentially undermining judicial independence [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. Moreover, the
modeller’s interpretive framework may subtly shape how the court perceives and weighs the evidence.
      </p>
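      <p>The sensitivity underlying this critique can be demonstrated with a minimal worked example. The numbers below are invented for illustration and come from no actual case: with the likelihoods held fixed, Bayes’ rule produces markedly different posteriors as the subjective prior varies, which is precisely why critics ask where the numbers come from.</p>

```python
# Minimal illustration with invented figures; not drawn from any real case.

def posterior(prior_h, p_e_given_h, p_e_given_not_h):
    """Bayes' rule for a binary hypothesis H after observing evidence E."""
    joint_h = prior_h * p_e_given_h
    joint_not_h = (1 - prior_h) * p_e_given_not_h
    return joint_h / (joint_h + joint_not_h)

# Same evidence strength (likelihood ratio 10), three subjective priors:
for prior in (0.01, 0.10, 0.50):
    print(prior, round(posterior(prior, 0.8, 0.08), 3))
# prints: 0.01 0.092, then 0.1 0.526, then 0.5 0.909
```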
      <p>
        These critiques have led some commentators to question the role that Bayesian reasoning can
and should assume in legal decision-making [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Rather than serving as a comprehensive analytical
framework, it is argued that Bayes should be employed in the qualitative and global sense [
        <xref ref-type="bibr" rid="ref15 ref9">9, 15</xref>
        ];
as a means of double-checking or triangulating existing legal reasoning. In this view, the function
of Bayesian modelling is not to determine the quantitative outcome of a case, but to ensure that no
probabilistic inconsistencies or fallacies have occurred during the deliberation process.
      </p>
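      <p>A concrete instance of the kind of fallacy such a double-check targets is the transposed conditional (the “prosecutor’s fallacy”). The sketch below is our own illustration with invented figures; it assumes, for simplicity, that a guilty suspect always matches the trace, and shows that a rare match probability does not translate directly into near-certain guilt.</p>

```python
# Invented figures for illustration; the simplifying assumption is that
# P(match | guilty) = 1.

def p_innocent_given_match(p_match_given_innocent, prior_innocent):
    """Correct Bayesian inversion of a match probability."""
    p_match = (prior_innocent * p_match_given_innocent
               + (1 - prior_innocent) * 1.0)
    return prior_innocent * p_match_given_innocent / p_match

# A 1-in-10,000 match probability, with 999 of 1,000 candidates innocent a priori:
print(round(p_innocent_given_match(1e-4, prior_innocent=0.999), 3))  # 0.091
```

      <p>Reading the 1-in-10,000 figure as the probability of innocence would be the fallacy; the correct posterior here is roughly nine percent.</p>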
      <p>
        While much of the existing discourse on Bayesian modelling in law has focused on its limitations,
these critiques can overshadow the potential value of the approach. In particular, debates surrounding
subjectivity, numerical precision, and the source of probabilities have come to dominate the discourse,
leaving comparatively little attention to the qualitative insights that the modelling process itself can
offer (see Meester (2020) [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], for instance, who briefly touched upon the qualitative value of the proposed
BN before delivering a 22-page analysis on why the numbers did not work). By focusing narrowly on
the reliability of numerical outputs, critics risk overlooking the interpretive and diagnostic functions
that Bayesian reasoning may serve in structuring legal argumentation and revealing inconsistencies in
evidential assessment.
      </p>
      <p>The current study forms part of a broader project that aims to empirically examine the feasibility
and value of complete-case Bayesian modelling within judicial deliberation. It builds directly on the
theoretical claim that, given the critiques surrounding subjectivity and feasibility, Bayes may have a
more limited—but still valuable—role to play in legal reasoning: not as a quantitative decision-making
tool, but as a qualitative means of checking the coherence of evidential reasoning.</p>
      <p>The study therefore explores whether qualitative Bayesian networks can, in practice, fulfil this
proposed “double-check” function while sidestepping the main critique in the literature: the subjectivity
inherent in quantification. Using a Dutch appellate case as a test example, two independent modellers
constructed both qualitative and quantitative Bayesian Networks (BNs) of the same verdict. This design
allows us to examine how quantification influences the structure and interpretation of the networks,
and whether the qualitative models alone capture the essential reasoning structure necessary for judicial
evaluation.</p>
      <p>Guided by this objective, the present paper makes an initial step toward addressing the following
three exploratory research questions:
1. Do the modellers perceive their models as complete representations of the case?
2. To what extent does quantification impact the structure of the network?
3. Can a qualitative network effectively serve as a “double-check” tool on traditional legal reasoning?
More specifically, we examine the ability of qualitative networks to a) detect errors, b) align with
the court’s reasoning, and c) provide additional value beyond identifying probabilistic fallacies.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Methods</title>
      <sec id="sec-2-1">
        <title>2.1. Design</title>
        <p>The study employed a sequential, mixed-methods design consisting of (i) independent modelling, (ii)
structured reflection, and (iii) collaborative comparison. The two modellers were the authors themselves,
both PhD candidates with prior experience in Bayesian modelling (for more detailed information on
their respective backgrounds and relevant modelling experience please see Appendix A).</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Case material</title>
        <p>To preserve the modellers’ blindness to the case pre-modelling, an independent external expert
selected the case and translated the appellate verdict into English to facilitate both modellers’
understanding of the case (the full translated verdict is available as supplementary material upon
request). The case (Arnhem-Leeuwarden Court of Appeal, 15 May 2025, ECLI:NL:GHARL:2025:3013)
concerns the armed robbery of a supermarket in Nieuw-Dordrecht in March 2021, for which the
defendant was convicted. The evidential material described
in the verdict includes CCTV footage, eyewitness testimonies, gait observations, weapon evidence,
and cell-site data. No individual item carried strong probative value; instead, the conviction relied on
the cumulative effect of multiple weaker indicators. This made the case suitable for examining how
BN modelling may support or “double-check” judicial reasoning in borderline or uncertainty-sensitive
cases. The modellers were restricted to the evidence explicitly mentioned in the court’s verdict. This
reflects judicial practice, in which the judge has the exclusive authority to select and weigh evidence.</p>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Procedure</title>
        <p>The full modelling process consisted of eight structured phases, illustrated in Figure 1. Only data
collected during the first six phases are discussed in the current analysis. The modelling process was
intentionally divided into distinct, sequential steps. While this may not reflect real-world modelling
practice, in which modellers may iterate between qualitative structure and quantitative input, or develop
both simultaneously, it allowed for a more controlled evaluation of the impact of quantification.</p>
        <p>Phase 1 required the modellers to independently read the case and make initial notes. In phase 2, they
individually constructed a qualitative BN of the case, before completing, in phase 3, a structured reflection
on the process of constructing the qualitative model. In phase 4, each modeller expanded their qualitative
structure into a quantitative network, and in phase 5 completed a second structured reflection. In phase
6, the modellers collaboratively compared and discussed their final BNs using a structured discussion
protocol. Phases 7 and 8 involved the joint construction and discussion of a single, agreed-upon BN.</p>
        <p>The semi-structured reflection and discussion protocols (available in the supplementary material),
employed in phases 3, 5, 6 and 8, were developed to introduce a degree of procedural consistency into
this exploratory study, providing a systematic framework for within-subject (and later between-subject)
comparison.</p>
        <p>Throughout all phases, the modellers followed a think-aloud protocol to document the reasoning
behind their modelling choices. All sessions were recorded via Zoom Workplace for Education (Version
6.5.12), which provided video, audio and automatic transcription. The time spent on each step was
systematically registered. All models were constructed using the Hugin software package (Versions 9.5
and 9.6).</p>
        <p>Although both modellers intended to use the same version of Hugin, one inadvertently used the Pro
edition while the other used the free version. This did not have a significant impact on the modelling
process or the planned analysis; all functionalities and comparison measures required for the study are
available in both editions. The main difference concerns the resulting model complexity, which does
not affect the outcomes reported here.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Results</title>
      <p>This section presents the constructed models, contributing to the assessment of model completeness.
We compare the structural changes between the qualitative and quantitative networks, evaluating the
value of qualitative case modelling compared to its quantitative counterpart.</p>
      <sec id="sec-3-1">
        <title>3.1. Model overviews</title>
          <p>While both models were fully quantified, only their structural components are reported here, as the
analysis focuses on the impact of the quantification process itself rather than on the specific numerical
probabilities assigned. The completed conditional probability tables are available upon request and will
be addressed in forthcoming publications. The models presented in this section are reduced in size for
readability; please see Appendix B for the full-scale versions.</p>
        <sec id="sec-3-1-1">
          <title>3.1.1. Modeller 1</title>
          <p>Modeller 1 employed an idiom-based modelling approach, as outlined in [<xref ref-type="bibr" rid="ref5">5</xref>]. The qualitative and
quantitative models can be seen in Figures 2a and 2b, respectively.</p>
          <p>Both networks are centred around the ultimate hypothesis concerning the defendant’s guilt
(H_Defendant_Guilty). Supporting evidence is organised around four sub-hypotheses — relating to
motive (Hm_Defendant_Had_Motive), opportunity (Ho_Defendant_had_opportunity, H_D_at_crime_scene),
gun evidence (H1_Defendent_gun_used_in_robbery, H1_D_gun_used_in_robbery), and car evidence
(H2_RobberyCar_belongs_defendant, H_D_man_in_CCTV) — each serving as a parent to the ultimate
hypothesis. Two items of evidence, the eyewitness testimonies, are directly connected to the ultimate
hypothesis.</p>
          <p>The total time spent modelling was 188 minutes (3.1 hours): 82 minutes (1.4 hours) were spent on
reading the case and making initial notes, 54 minutes (0.9 hours) on building the qualitative structure,
and 52 minutes (0.9 hours) on the quantification process. Hugin was selected due to its ease of use and
accessibility, making it a suitable platform for potential employment in real-world legal settings.</p>
          <p>During the quantification process, several structural changes were made to the original qualitative
network. In the quantified network, an additional sub-hypothesis relating to the gun evidence was
introduced (H_Robbery_gun_is_gold). The total number of evidence nodes decreased by one, while the
number of reliability nodes increased by one. A side-by-side comparison of the basic network properties
can be seen in Table 1.</p>
        </sec>
        <sec id="sec-3-1-2">
          <title>3.1.2. Modeller 2</title>
          <p>Modeller 2 adopted a temporal-narrative (scenario-based) modelling approach, based on [<xref ref-type="bibr" rid="ref4">4</xref>]. The
qualitative and quantitative networks are presented in Figures 3 and 4, respectively.</p>
          <p>The qualitative model features two parallel structures: hypotheses specific to the known
defendant (red nodes) and abstract hypotheses specifying the criminal events without reference to a specific
defendant (yellow), with a set of identification nodes (blue) linking these two layers. The ultimate
hypothesis is DefendantRobsShopThreatensetc; the sub-hypotheses of the scenario are temporally ordered
around this node. The evidence (green) is represented so as to explicitly allow for the possibility that the crime
happened without the defendant being guilty. Reliability and alternative-explanation considerations
are represented in orange.</p>
          <p>The total modelling time was 9.5 hours; 2.5 hours were spent on reading the case, 3 hours on building
the qualitative structure and 4 hours on the quantification process.</p>
          <p>In the quantitative model, the identification nodes were streamlined, and several links within the
abstract criminal scenario were removed. Two evidence nodes related to the tattoo cluster were removed.
Additionally, the structure of the gun evidence cluster underwent substantial modification. A simple
analysis of structural changes between the qualitative and quantified models reveals no difference in
the number of nodes. The number of edges decreased slightly, from 57 to 56. Similarly to Modeller 1,
the number of hypotheses decreased between models, and the number of evidence nodes increased. A
full list of network properties can be seen in Table 1.</p>
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Model revisions during quantification</title>
        <p>To elucidate the model alterations introduced during the quantification process, we analysed the
differences in node composition and link configuration between the two networks. This qualitative analysis
serves to capture changes that are not apparent from numerical comparison alone (e.g., models may
exhibit identical node counts while representing distinct variable sets). Nodes were classified as added
or removed when introduced or omitted between versions. A variable name was considered refined only
when one or more words were changed (minor adjustments, such as abbreviations, renumbering or
variations in capitalisation, were disregarded). A definition change was coded when a node’s conceptual
scope shifted (e.g., when a variable was broadened, narrowed or reframed within the same evidential
theme; this includes negation). Alterations to nodes, whether additions or removals, inherently
necessitate corresponding modifications to the network’s edges. Link changes were accordingly classified as
added, removed, direction change or mediated (i.e., when an existing connection became indirect through
the introduction of an intermediate node).</p>
        <sec id="sec-3-2-1">
          <title>3.2.1. Modeller 1</title>
          <p>Of the 23 variables in the initial qualitative model, 19 remained in the quantitative expansion.
Three variables were added: one hypothesis (H_Robbery_gun_is_gold), one piece of evidence
(E_distinct_features), and a reliability consideration (R_reliability). Two evidence nodes, E_CCTV_footage
and E1_Defendant_had_gold_gun, were removed. Six node labels were refined, and one variable was
redefined to reflect a conceptual shift in meaning. A full qualitative overview of network modifications
is presented in Appendix A; a summarised version presenting the frequency count of each modification
can be seen in Table 2.</p>
          <p>Only one link was removed independently of node changes: the connection between the two
eyewitness testimony nodes, E5_Eyewitness1_description_match and E6_Eyewitness2_description_match. The
removal of one evidence item, E_CCTV_footage, directly led to the deletion of its four links to parent
nodes. Likewise, the removal of a further evidence node, E1_Defendant_had_gold_gun, resulted in the
automatic loss of two additional links. The inclusion of the sub-hypothesis H_robbery_gun_is_gold
not only necessitated the inclusion of a link to the existing network but also reconfigured several
existing connections. Relationships between variables that were previously direct — such as those
between H1_Defendant_gun_used_in_robbery and multiple evidence nodes — became mediated through
this intermediate node. Table 2 outlines these link changes, distinguishing between those arising as a
direct consequence of node alterations and those resulting from independent structural revisions.</p>
        </sec>
        <sec id="sec-3-2-2">
          <title>3.2.2. Modeller 2</title>
          <p>There are 34 variables that are the same across both models; 18 variables in the qualitative
model do not appear in the quantitative model, and vice versa. The differences in node composition can be
summarised as follows: one identification node was removed to streamline the model; eight scenario
(sub-hypothesis) nodes were refined; and two reliability nodes were modified to clarify their meaning.
Within the tattoo evidence cluster, two evidence nodes were removed: the node PerpHasTattoo, previously
mislabelled as evidence, was reclassified as a supporting hypothesis, and DefendantNoTattoo was deemed
redundant, as this condition was already ensured by the CPTs. Several evidence labels were refined for
greater specificity, most notably within the gun evidence cluster.</p>
          <p>There were significant changes in the arcs of the network (for a full overview, see Appendix A). The
main changes concern the abstract scenario structure, including the edges between the abstract scenario
nodes. These were deemed unnecessary in the quantitative model, as the relations they expressed were
captured in the specified scenario.</p>
        </sec>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Discussion protocol analysis</title>
        <p>Table 4 summarises the modellers’ reflections regarding the alignment between their Bayesian networks
(BNs) and the court’s reasoning. Despite being instructed to model the complete case, neither modeller
felt that their BN fully represented all evidence cited in the verdict. Both, however, expressed confidence
that their networks accurately captured the key inferential relationships and mirrored the overall logical
structure of the court’s reasoning. Both modellers reported that the (in)dependencies within their
respective networks were clearly defined, though Modeller 2 qualified this assessment by noting that
full clarity would only be achievable after quantification of the conditional probability tables (CPTs)
(for the complete reflective responses, see the Supplementary Materials, available upon request).</p>
        <p>Similarly, both modellers confirmed that alternative hypotheses, including negations and competing
scenarios, had been considered conceptually, though only Modeller 2 incorporated these alternatives
explicitly into their network structure.</p>
        <p>[Table 4: reflection questions posed to both modellers: Does the BN fully represent all evidence
cited in the court’s verdict? Are all (in)dependencies clearly defined? Is the network sufficiently nuanced
for the analysis it is meant to support, and are all chains of inference visible? Have alternative hypotheses
been considered, including both the negation of the main hypothesis and any competing hypotheses?
Does the structure mirror the logic of the court? Both modellers’ recorded answers were No, Yes, No,
Yes and Yes, respectively.]</p>
        <p>To further explore the potential of qualitative models to serve as a “double-check” on traditional
judicial reasoning, the modellers were asked to reflect on the error-detection capabilities of their
networks. The summarised responses are presented in Table 5. Neither modeller was able to explicitly
identify probabilistic or logical fallacies within the court’s reasoning. Modeller 1 clearly stated that the
network did not implicitly expose any probabilistic fallacies, whereas Modeller 2 emphasised that such
assessments were not possible at the qualitative stage of modelling. Both agreed that the direction of
inference within their networks was plausible and consistent with real-world causal relationships.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Discussion</title>
      <p>[Table 5: the modellers’ summarised reflections, including: “Very much so. Bayesian thinking forces deeper
consideration of (in)dependencies, and visualising these via links clarified relationships I had previously
overlooked.”; “Yes. The match between the defendant’s car and the CCTV car seems a jump; additional
evidence is needed. Likewise, assumptions about the gun require frequency information to justify the
match.”; “I think something is wrong with the gait/movement testimony regarding independence. Yes.
The verdict assumes various sightings relate to the defendant without justification. The alternative
explanation for the phone mast location was not considered. There is also an unjustified assumption
about the identity of the man in the store and the man near the car.”]</p>
      <sec id="sec-4-1">
        <title>4.1. On the feasibility of constructing qualitative models of complete cases</title>
        <p>While this article does not aim to make generalizable claims about the feasibility of BN modelling in
judicial deliberation, two key considerations emerged: model completeness and time constraints.</p>
        <p>Modeller 1 was restricted by the Hugin free version, which allows for a maximum of 50 state nodes.
As a result, subjective decisions had to be made regarding which evidential items to include in the
network, resulting in an incomplete representation of the court’s verdict. This limitation conflicts with
the judicial principle that the judge has the exclusive authority to weigh and select the evidence, making
the Hugin free version (Hugin Lite 9.6), although accessible, an unsuitable platform for complete-case
modelling. Model 2, by contrast, offers a more comprehensive representation of the case. The only
notable omission concerns the car investigation, where not all witness testimonies were explicitly
modelled. However, the reading of the case and construction of the qualitative structure alone required
around 5.5 hours, which — following verbal discussions with legal professionals — may be considered
unfeasible in the context of double-checking the verdict in the deliberation chamber.</p>
        <p>Post-hoc reflection further revealed that neither modeller felt that their networks fully captured all
considerations of (in)dependencies: some were omitted due to uncertainty about how to operationalise
them, while others were only recognised in later mutual discussion. The court’s verdict presented both
modellers with difficulty in this respect: dependencies were rarely explicitly discussed, and missing
information (potentially available to the court but not reflected in the verdict) would have been required
to model such relationships accurately.</p>
        <p>Both modellers’ experiences reflect longstanding discussions in the literature concerning the definition
and attainability of model completeness in Bayesian modelling of legal cases, namely, that legal reasoning,
as expressed in verdicts, is inherently selective and often omits the explicit dependencies required for
full formal representation. The study therefore provides empirical support for these discussions by
showing that the limits to completeness arise not only from technical or time constraints, but also from the
nature of judicial reasoning itself.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Understanding model modifications between networks</title>
        <p>Our second research question aims to examine the extent to which quantification impacts the structure
of a qualitative BN. The extensive structural revisions observed during the quantification process
indicate that the process of assigning probabilities did far more than simply parametrise a
pre-defined network numerically: it reshaped how both modellers conceptualised the case. While it may initially
appear that such modifications arose simply from prolonged cognitive engagement with the case
material, both modellers explicitly rejected this interpretation. They identified quantification itself –
the act of populating conditional probability tables and directly confronting questions of the type What
is the probability of this event, given this evidence? – as the moment that prompted them to re-evaluate
their earlier assumptions and adjust the structure accordingly.</p>
        <p>Modeller 1’s transcript (supplementary material, available upon request) illustrates this process. While completing
the CPT for H_Defendant_gun_used_in_robbery, they paused mid-sentence:
"So, if the robbery gun is gold ... Oh, I have just realised I would like to add another node,
actually, for the gun ..."</p>
        <p>Here, the demand to specify probabilities directly triggered recognition of an unmodelled conceptual
distinction, prompting structural refinement. A similar process occurred during the quantification of
the eyewitness testimonies:
"Ok, now, actually, I am not too sure about this dependent link right here between the two
eyewitnesses. So I am going to remove it ..."</p>
        <p>Modeller 2’s transcript (supplementary material) demonstrates a similar dynamic. While filling in
the CPTs for the car-related CCTV evidence, they repeatedly interrupted themselves to reconsider the
dependencies at play:
"This is ... the probability, given that the guy breaks the car and gets out of the car, what is
the probability that we see that in that location? ... Well, we think that is pretty likely ...
Oh yeah, because this one is still weird, ... it has an extra parent..."</p>
        <p>These reflections show that probability elicitation can act as a diagnostic probe into model coherence
and completeness, highlighting uncertainties which remained hidden during qualitative construction.
Across both modellers, despite the study design, quantification was not experienced as a separate
technical phase but as an integral part of understanding the case. When asked directly whether improved
understanding resulted from quantification or simply from spending more time with the model, both
answered unequivocally: Quantification. As modeller 1 reflected:
"If I just continued thinking about the structure without the numbers, I would not have
realised these things ... It was actually when I went to fill it in, and I was putting up these
questions in my head, ’Ok, if he is a family man, how likely is it that...’, and then I was like,
wait, how am I meant to say that?"
Modeller 2 echoed this point:
"But the only way you can get at this, like, qualitative view is through, ... is by going
through the quantitative view.... I don’t think, at least the way I approach it, I can separate
the qualitative and the quantitative part."</p>
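        <p>The elicitation moment both modellers describe can be made concrete with a small sketch. The node and state names below are hypothetical, loosely echoing the gun-related nodes in Appendix C rather than either modeller’s actual network; the point is only that completing a CPT forces an explicit answer to a “probability of event, given evidence” question for every combination of parent states.</p>

```python
from itertools import product

# Illustrative sketch: enumerate the elicitation questions a CPT demands.
# Node and state names are hypothetical, not the modellers' actual variables.

def cpt_questions(node, parents, states):
    """List every conditional-probability question the CPT for a node poses."""
    questions = []
    for combo in product(*(states[p] for p in parents)):
        given = ", ".join(f"{p}={v}" for p, v in zip(parents, combo))
        for s in states[node]:
            questions.append(f"P({node}={s} | {given})?")
    return questions

states = {
    "H1_gun_robbery": ["true", "false"],
    "E_gun_match": ["match", "no_match"],
}
qs = cpt_questions("E_gun_match", ["H1_gun_robbery"], states)
for q in qs:
    print(q)
```

        <p>Each such question must be answered even where the verdict is silent, which is exactly the point at which both modellers report noticing missing nodes and doubtful dependencies.</p>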
        <p>
          While previous studies have discussed the broader epistemic value of BN modelling [
          <xref ref-type="bibr" rid="ref3 ref6">3, 6</xref>
          ], they do
not distinguish between the qualitative and quantitative phases of the modelling process, nor assess
the individual contribution of each. Our study makes this distinction explicit by comparing networks
before and after quantification. Taken together, the above accounts show that quantification functioned
as a source of conceptual change. This raises an important question for the function of qualitative BNs
as post-hoc "double-check" tools on judicial reasoning: if the act of quantification substantially alters
the structure and, consequently, the interpretation of the model, can an unquantified network truly serve as
a reliable check of the court’s reasoning and hidden assumptions?
        </p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. The importance of clear node definitions</title>
        <p>An important insight emerged during the modelling and discussion phases: proper and consistent
variable definitions are essential for transparency and interpretability in BN modelling of complete
criminal cases. During quantification, modeller 1 refined six variable names (representing 26% of
the total) and redefined one variable, while modeller 2 revised thirteen variable labels (25%) and
redefined two. These are substantial proportions of the variable set. Both modellers encountered
difficulties maintaining conceptual clarity not only across modelling stages but also when
discussing their respective models, repeatedly requiring clarification of each other’s variable meanings
despite working from the same evidential basis (as reflected in the discussion transcript, available
as supplementary material). This issue was explicitly acknowledged by the modellers in their
post-quantification discussion:</p>
        <p>
          Modeller 2: “. . . I think that this is a big problem of Bayesian network [modelling] that . . .
is not discussed, but, . . . , if you make a [node], and you fill out the table with one sort of
interpretation of what that variable means, and then maybe later on you look at it and you
forgot what exactly you [meant], and then you fill out the difference. Or . . . you go on . . .
add another node, and . . . [forget] exactly what you [meant] by the first interpretation. Or
. . . you condition on the parents, but maybe you now consider the parents as broader than
before. . . there is . . . a sort of implicit inconsistency in how you define the nodes.”
Modeller 1: “I . . . mid-quantification . . . changed all my node names, because I said I need
to be more specific. . . . I’d write . . . a general statement – CCTV – what does CCTV
mean?”
This underscores a broader issue rarely reflected in the literature: while BNs make evidential structures
explicit, their interpretability ultimately depends on precise, stable variable semantics. Fenton et al.
(2016) [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] emphasize that the strength of a BN lies in its capacity to represent complex evidential
variables transparently; yet this transparency collapses if variables are ambiguously defined or evolve
mid-modelling. Their later analysis of the Simonshaven case [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] provides a clear example: although the
paper presents a complete BN and describes the modelling process in detail, it offers little explanation
of how individual variables were defined. As a result, the model’s structure can be difficult to interpret,
even for technically informed readers. If Bayesian modelling, whether qualitative or quantitative, is to
serve as an independent, structured “double-check” on the court’s traditional reasoning process, such
clarity and definitional precision are essential.
        </p>
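        <p>One lightweight safeguard against the definitional drift described above is to fix each node’s semantics in an explicit glossary and refuse silent redefinition. The sketch below is our own illustration, not a practice used by either modeller; the node name and wording are hypothetical.</p>

```python
from dataclasses import dataclass

# Illustrative sketch: an explicit node glossary that rejects silent redefinition.
# The example node is hypothetical, not taken from either modeller's network.

@dataclass(frozen=True)
class NodeDefinition:
    name: str         # short, specific label used in the network
    definition: str   # one-sentence semantics, fixed before quantification
    states: tuple     # exhaustive, mutually exclusive states

glossary = {}

def register(node):
    """Record a node definition; raise if the name is redefined differently."""
    if node.name in glossary and glossary[node.name] != node:
        raise ValueError(f"'{node.name}' is already defined differently")
    glossary[node.name] = node

register(NodeDefinition(
    name="E_CCTV_gun_color",
    definition="The CCTV footage shows a gold-coloured gun in the robber's hand.",
    states=("true", "false"),
))
```

        <p>Re-registering an identical definition is harmless, while any change of meaning fails loudly, which is the property Modeller 2’s “implicit inconsistency” complaint calls for.</p>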
      </sec>
      <sec id="sec-4-4">
        <title>4.4. The value of qualitative networks as “double-check" tools</title>
        <p>It is difficult to empirically evaluate whether qualitative networks can serve as double-check tools. In
an attempt to formalise this analysis, in our discussion checklist we evaluated the qualitative
networks based on various criteria, among others: alignment, error detection and value.</p>
        <p>Both modellers reported that constructing the qualitative network substantially enhanced their
understanding of the case. The process encouraged explicit consideration of evidential dependencies
and helped identify areas of uncertainty or missing information. As Modeller 1 noted:
“It forced me to think a lot deeper about the (possible) dependencies between the different
pieces of evidence”
Modeller 2 observed that the BN:</p>
        <p>“identifies the ‘source’ of uncertainty.”
These reflections confirm the interpretive and diagnostic value of qualitative BNs: they make reasoning
structures visible and expose where uncertainty resides. This finding differs from the existing literature
in that we explicitly distinguish between the qualitative and quantitative value of the networks, rather
than treating the BN modelling process as a single tool.</p>
        <p>However, neither modeller identified explicit probabilistic fallacies or logical inconsistencies in
the court’s reasoning. This absence does not necessarily undermine the potential of BNs as
"double-check" tools, as the case itself may have been "perfectly" reasoned, but, together with the observed
structural impact of the quantification, highlights a potential limitation warranting further investigation:
Qualitative modelling alone may be insufficient to detect more subtle probabilistic missteps.</p>
        <p>
          Beyond explicit error detection, both modellers agreed that the qualitative process revealed implicit
assumptions and unspoken leaps in the court’s reasoning. Modeller 1 highlighted the treatment
of the car and gun evidence—particularly the strength of the assumed matches and the absence of
frequency information—as examples where the court appeared to rely on under-specified or weakly
supported inferences. Modeller 2 drew attention to gaps in the treatment of identification evidence,
questioning the implicit assumption that multiple different eyewitness descriptions referred to the same
individual. These observations suggest that qualitative modelling can surface areas where reasoning
relies on implicit or weakly articulated assumptions, even when no formal probabilistic fallacies are
present. This directly aligns with Prakken’s (2020) [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] argument that the use of Bayes in law should not
dictate conclusions but rather support structured dialogue on evidential coherence. The qualitative BN
accomplishes precisely that. It renders visible the tacit dependencies that traditional judicial writing
leaves implicit, thereby allowing others to ask whether the inferential links assumed by the court are
defensible, complete, and mutually consistent.
        </p>
        <p>Together, the findings suggest that the primary value of constructing qualitative BNs lies not in
error detection per se, but in fostering a structured form of reflection on evidential coherence. As
“double-check” tools, their strength is heuristic rather than diagnostic: they do not mechanically verify
the correctness of a verdict but create a structured environment in which the assumptions, dependencies,
and uncertainties embedded in judicial reasoning can be examined explicitly. This reflective capacity is
particularly relevant in appellate or review contexts, where transparency of reasoning is as important
as its substantive outcome.</p>
      </sec>
      <sec id="sec-4-5">
        <title>4.5. Limitations and future research</title>
        <p>The present study involved only two expert modellers (the authors) and focussed on a single criminal case.
Consequently, the insights derived from this work are exploratory in nature and cannot be generalised
to the wider population of legal practitioners or to other case types. It forms part of a broader study
examining the feasibility and value of constructing Bayesian networks of complete criminal cases.
Future work will address several outstanding questions, including whether two independent modellers
analysing the same case can arrive at the same (or similar) outcomes, and whether disagreements —
if any — can be resolved through discussion and model refinement. Further, drawing on the insights
and difficulties encountered in this study, the authors aim to develop and test more formalised BN
comparison tools (including the development of the discussion protocols into a standardised BN
evaluation tool, supported by a taxonomy of model modifications and their potential interpretive
and structural implications). To extend these findings, future research should apply the experimental
approach to additional criminal cases and a broader participant group with varying expertise—from
students to legal practitioners and forensic advisors. Within the wider PMJ project, the long-term goal
is to develop tools that help judges recognise and avoid probabilistic reasoning fallacies. If
complete-case modelling proves impractical due to time or complexity, simplified alternatives become essential.
Ongoing work explores a scenario-based method and a complementary question list designed to support
structured probabilistic reflection.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>This study provides an initial empirical examination of qualitative Bayesian networks as post-hoc
“double-check” tools in legal reasoning. By contrasting qualitative and quantitative models of a single
appellate case, we found that the quantification process significantly influenced not only network
parametrization but also conceptual understanding of the case and structural formulation. Although
qualitative networks alone can clarify evidential dependencies, expose implicit assumptions, and
enhance transparency in reasoning, they thus far appear limited in their capacity to identify probabilistic
fallacies or subtle logical errors without numerical specification. Our findings underscore the dual
nature of Bayesian modelling in legal settings: its strength lies in structuring complex evidential
relations, yet its interpretive reliability depends on clear, consistent node definitions and an awareness
of how quantification reshapes conceptual framing. Based on the current case analysis, and given the
time and expertise required for complete-case modelling, the practical use of qualitative BNs in
judicial deliberation may lie in their role as heuristic or educational tools, supporting judges and legal
practitioners in identifying uncertainty and hidden assumptions rather than in producing decisive
probabilistic outcomes. Future work should extend this approach to additional cases and participants,
standardize reflection and comparison protocols, and further explore hybrid frameworks that balance
qualitative transparency with the analytical rigour of quantified reasoning.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This research was supported by the Dutch Research Council (NWO) under the project ’Preventing
miscarriages of justice (PMJ)’ [Grant number: 406.21.RB.004.] and supported by the Hybrid Intelligence
Center, a 10-year programme funded by the Dutch Ministry of Education, Culture and Science through
the Netherlands Organisation for Scientific Research, https://hybrid-intelligence-centre.nl. The authors
thank the affiliated members of the PMJ project for their valuable support and feedback.</p>
    </sec>
    <sec id="sec-7">
      <title>6. Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used ChatGPT 5.0 in order to: check grammar and
spelling, paraphrase, and reword text. After using this tool, the authors reviewed and edited the content as
needed and assume full responsibility for the content of the publication.</p>
    </sec>
    <sec id="sec-8">
      <title>Appendix A: Modelling experience</title>
      <p>Leya Lisa Hampson is a PhD student at the University of Groningen specializing in the application
of Bayes in legal contexts. She has a background in mathematics and forensic science. She was
first introduced to Bayesian probability during her undergraduate studies in mathematics, where she
developed a strong theoretical foundation in probabilistic reasoning. During her master’s program, she
completed the course ‘Interpreting and Understanding Forensic Evidence’, which focused on Bayesian
modelling in legal contexts, based on the Fenton et al. (2011) textbook. This knowledge was further
applied and deepened in a course on digital evidence, in which Bayesian Networks were used to
model evidential relationships in data-heavy cases. Over the course of her academic work, she has
independently modelled approximately six complete legal cases using the AgenaRisk software. In addition,
she has constructed numerous partial models and idiomatic structures, further reinforcing her practical
experience with both qualitative and quantitative aspects of Bayesian modelling.</p>
      <p>
        Ludi van Leeuwen is a PhD student at the University of Groningen specializing in evaluating Bayesian
network models for legal evidential reasoning. She has a background in artificial intelligence and
philosophy. During her Bachelor’s degree, she compared different types of formalization of evidence
by modelling a legal case [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] in both a Bayesian network, using the scenario idiom [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] as well as in a
case model [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. In her Master’s in AI, her aim shifted to testing and evaluating idioms for Bayesian
networks using artificial ground truths as groundings. In this project, continued into her PhD work,
she has modelled two different (simplified) complete cases as Bayesian networks and has implemented
and evaluated numerous legal idiom structures. She usually models manually in Hugin, and generates
Bayesian networks from artificial data using PyAgrum.
      </p>
    </sec>
    <sec id="sec-9">
      <title>Appendix B: Full Bayesian Network Models</title>
      <p>The full Bayesian networks are provided below for reference.</p>
    </sec>
    <sec id="sec-10">
      <title>Appendix C: Modeller 1: Variable and link modifications between the qualitative and quantitative networks</title>
      <p>[Table not fully recoverable from extraction. Recoverable entries: qualitative links involving E_vehicle_comparison, E_CCTV, H_gun_gold and r_reliability were removed; the node H1_gun_robbery was added; the links to E_silver_gun, E_gun_match and E3_silver_gun became mediated by H1_gun_robbery; the node E_CCTV_silver_gun was added.]</p>
    </sec>
    <sec id="sec-11">
      <title>Appendix D: Modeller 2: Variable and link modifications between the qualitative and quantitative networks</title>
      <p>Columns: qualitative link (parent, child); quantitative link (parent, child); modification; related to node change?
CarIsDefendants PerpParksandExitsCar – – Removed No
– – DefendantParksandExitsCar CarIsDefendants Added No
DefIsPerp1 PerpParksandExitsCar – – Removed No
PerpParksandExitsCar PerpRobsThreatensGunBreaksReg – – Removed No
– – DefendantRobsShopThreatensetc DefIsPerp1 Added No
PerpThreatensWithGunandBreaksRegister PerpReturnsToCarL1T2 – – Removed No
– – DefendantRobsShopThreatensetc DefendantReturnsToCar Added Yes
– – DefendantReturnsToCar PerpReturnsToCarL1T2 Added Yes
– – GunChars GunDefendantMatchesPerpetrator Added Yes
– – CrimeSceneGunCharA GunChars Added Yes
– – GunColorCrimeScene GunChars Added Yes
DefIsPerp2 GunisDefsGun GunDefendantMatchesPerpetrator DefIsPerp1 Direction reversed Yes
GunColorCrimeScene GunColorMatchesSuspectsGun – – Removed No
DefendantGun GunFoundOnNightStand – – Removed No
– – DefendantOwnsGun DefendantHasCharAGun Added Yes
– – DefendantHasGoldCharAGun GoldAGunFoundOnNightstand Added No
DefendantGun OtherCharacteristicsGunMatchesSuspect – – Removed No
OtherCharacteristicsGunMatchesSuspect GunIsDefsGun – – Removed No
</p>
    </sec>
    <sec id="sec-12">
      <title>Appendix E: Independent reflection checklists</title>
      <p>INDEPENDENT REFLECTION OF QUALITATIVE BNs
[Goal: Assessment of the use of qualitative BNs as a double-check for the court’s reasoning]
Instruction: Please complete this checklist independently immediately after finishing the first modelling phase, i.e. once you have constructed a qualitative BN for the case. This checklist is designed to support structured reflection of your model. The prompts serve as cognitive guides and highlight key issues for discussion. You are not expected to answer each question word-for-word or in list form. Instead, use them to organise your reflections and note any relevant insights, modelling decisions, uncertainties or concerns that arose during the modelling process.</p>
      <p>Footnote 1: Any fact or piece of information cited by the court as part of its reasoning, whether directly discussed in terms of probative value or clearly presented as a basis for the final decision, is considered part of the evidential set to be modelled. Even if evidence is not explicitly assigned weight in the verdict, its inclusion indicates that it played a role in the court’s assessment and should therefore be included in the model.</p>
      <p>Modeller 1’s responses (recoverable excerpt):
(3) Did building the qualitative BN allow you to identify any probabilistic fallacies?
(4) Did building the qualitative BN allow you to identify other, non-probabilistic fallacies?
(5) To what extent did Bayesian thinking assist the identification of missing evidence?
(6) To what extent did Bayesian thinking assist the handling of (in)dependencies?
(7) Did building the qualitative model help you expose any jumps in reasoning? If so, did this reveal any (un)acceptable implicit assumptions?
Answer: I find the car match to the defendant a bit of a jump in reasoning. Additional evidence could help expose whether this was a harmful jump or not. I further think that the identification of the gun poses a gap in reasoning; I believe more information on the frequency of this type of gun (rather than just CCTV match statements) is needed to make the assumption that this gun is his.
EXTRA NOTES:
4: QUANTIFICATION
(1) Which parts of the structure would most benefit from quantification?</p>
      <p>Modeller 2’s responses:
1: ALIGNMENT
(1) Does the BN fully represent all evidence cited in the court’s verdict?
Answer: Most of it. Evidence that was not modelled:
- The details of the car (matching wheels/trekhaak/interior colors)
- Footage of the defendant at his house, establishing the car exit
- Alternative explanations for the suspect’s strange gait/exit are not modelled
- Footage of defendant in car at gas station
- Details of personal identification (length, hair, clothing, age)
- Precise specification of gun details except colors (also not in verdict)
- Precise specification of gait analysis
- How the gun was identified in the camera images
(2) Are (in)dependencies clearly defined?
Answer: Yes, but they are difficult to think about without entering the CPTs.
(3) Are all chains of inference visible?
Answer: Reliability of witnesses not always modelled explicitly; chain of inference from perp to defendant is visible. Some nodes have been collated into a single node (PerpParksExitsCar, PerpRobsShop).
(4) Have alternative hypotheses been considered, including both the negation of the main hypothesis and any competing hypotheses?
Answer: Yes, orange nodes for alternative explanations (both on the prosecution side and on the defense side), but not represented in a single node/single alternative story.
(5) Does the structure mirror the logic of the court?
Answer: The evidence is considered in separate clusters; from left to right there is a “timeline”. The identification of the suspect as the defendant is made explicit. I think to a large extent the structure mirrors that of the court.
EXTRA NOTES:
2: VALUE
(1) Did building the qualitative structure increase your understanding of the case?
Answer: Yes, but I was confused sometimes about how to model things (such as “identification”). There seemed to never be a doubt about the facts that occurred, only who did them. The BN identifies the “source” of uncertainty.
(2) Did you detect issues you may have missed using text-based reasoning alone?
Answer: Yes, it is not made explicit if the phone signal of the defendant near the crime scene could also be due to that being near his house. It is not clear how often these similar cars occur, like how many cars were found in the list to be investigated?
(3) Does building the qualitative BN allow for the identification of probabilistic fallacies?
Answer: I think something went wrong in the case with the gait/exit likelihoods, but I’m not sure if I modelled that correctly in the BN either. I will need to add numbers for that. Also, independent witnesses and double-counting seems important here.
(4) Does building the qualitative BN allow for the identification of other, non-probabilistic fallacies?
Answer: Not sure.
(5) To what extent did Bayesian thinking assist the identification of missing evidence?
Answer: There seemed to be missing evidence for the alternative explanations, also the reliability of witness 2, and the whole process seems to hinge on the suspect driving this car and then finding the similar gun.
(6) To what extent did Bayesian thinking assist the handling of (in)dependencies?
Answer: I think something is wrong with the gait/movement testimony re: independence.
(7) Did building the qualitative model expose any jumps in reasoning? If so, did this reveal any (un)acceptable implicit assumptions?
Answer: Doesn’t seem to have established that the defendant actually physically looked like the suspect, apart from that they have similar heights and accents. Also, did not consider the alternative explanation of his phone sending to the mast if that was also where he lived. Also: jumps to assume that the suspect running to and from the car was the same as the man in the store, even though that’s not justified in the verdict.
EXTRA NOTES:
3: ERROR DETECTION
(1) Does the BN implicitly expose any probabilistic fallacies? If so, which one(s)?
Answer: Can’t see that now. Maybe the dependence of the gait/exit (as I can imagine), or a reference class problem for the car…, or neglecting the alternative hypothesis that the phone is near his home.
(2) Are any logical fallacies or other reasoning errors (beyond calculation mistakes) present? If so, which one(s)?
Answer: Not sure.
(3) Is the direction of inference plausible, i.e. are the directions of the links between the nodes consistent with the real-world causal relationships they represent?
Answer: Yes, I mostly used evidence-idiom constructions with some abstract node relating to ‘identification’.
(4) Are any plausible alternative hypotheses ignored?
Answer: Can’t say without quantification.
EXTRA NOTES:
4: QUANTIFICATION
(1) Which parts of the structure would most benefit from quantification?
Answer: All of it! I find it hard to judge whether the structure is correct without seeing if setting some evidence aligns with what I think it should do.
(2) Which assumptions do you expect the quantitative structure to confirm, challenge or clarify?
Answer: I hope it will give me insight into to what extent the evidence is strong enough to carry the “identification” of the perpetrator as the defendant.
(3) Is overall quantification necessary or useful in evaluating whether this model meets the BARD threshold?
Answer: YES
EXTRA NOTES:</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A. R.</given-names>
            <surname>Mackor</surname>
          </string-name>
          ,
          <article-title>Bayesian modelling of criminal cases as a whole: A philosophical reflection on Dutch case law</article-title>
          ,
          <source>Quaestio Facti</source>
          (forthcoming,
          <year>2026</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>H.</given-names>
            <surname>Prakken</surname>
          </string-name>
          ,
          <article-title>A new use case for argumentation support tools: supporting discussions of bayesian analyses of complex criminal cases</article-title>
          .,
          <source>Artif Intell Law</source>
          <volume>28</volume>
          (
          <year>2020</year>
          )
          <fpage>27</fpage>
          -
          <lpage>49</lpage>
          . doi:10.1007/s10506-018-9235-z.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>N.</given-names>
            <surname>Fenton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Neil</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Berger</surname>
          </string-name>
          ,
          <article-title>Bayes and the law</article-title>
          ,
          <source>Annu Rev Stat Appl</source>
          <volume>3</volume>
          (
          <year>2016</year>
          )
          <fpage>51</fpage>
          -
          <lpage>77</lpage>
          . doi:10.1146/annurev-statistics-041715-033428.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>C.</given-names>
            <surname>Vlek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Prakken</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Renooij</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Verheij</surname>
          </string-name>
          ,
          <article-title>Modeling crime scenarios in a Bayesian network</article-title>
          ,
          <source>in: Proceedings of the Fourteenth International Conference on Artificial Intelligence and Law</source>
          ,
          ACM, Rome, Italy,
          <year>2013</year>
          , pp.
          <fpage>150</fpage>
          -
          <lpage>159</lpage>
          . URL: https://dl.acm.org/doi/10.1145/2514601.2514618. doi:10.1145/2514601.2514618.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>D. A.</given-names>
            <surname>Lagnado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Fenton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Neil</surname>
          </string-name>
          ,
          <article-title>Legal idioms: a framework for evidential reasoning</article-title>
          ,
          <source>Argument &amp; Computation</source>
          <volume>4</volume>
          (
          <year>2013</year>
          )
          <fpage>46</fpage>
          -
          <lpage>63</lpage>
          . doi:10.1080/19462166.2012.682656.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>R.</given-names>
            <surname>Meester</surname>
          </string-name>
          ,
          <article-title>The limits of Bayesian thinking in court</article-title>
          ,
          <source>Topics in Cognitive Science</source>
          <volume>12</volume>
          (
          <year>2020</year>
          )
          <fpage>1205</fpage>
          -
          <lpage>1212</lpage>
          . URL: https://onlinelibrary.wiley.com/doi/abs/10.1111/tops.12478. doi:10.1111/tops.12478.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>R. J.</given-names>
            <surname>Allen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Pardo</surname>
          </string-name>
          ,
          <article-title>The problematic value of mathematical models of evidence</article-title>
          .,
          <source>The Journal of Legal Studies</source>
          <volume>36</volume>
          (
          <year>2007</year>
          )
          <fpage>107</fpage>
          -
          <lpage>40</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <article-title>Judicial fact-finding and the Bayesian method: The case for deeper scepticism about their combination</article-title>
          ,
          <source>The International Journal of Evidence &amp; Proof</source>
          <volume>1</volume>
          (
          <year>1996</year>
          )
          <fpage>25</fpage>
          -
          <lpage>47</lpage>
          . doi:10.1177/136571279600100103.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>R.</given-names>
            <surname>Meester</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Stevens</surname>
          </string-name>
          ,
          <article-title>Bayesian reasoning and the prior in court: not legally normative but unavoidable</article-title>
          ,
          <source>Law, Probability and Risk</source>
          <volume>23</volume>
          (
          <year>2024</year>
          ). doi:10.1093/lpr/mgae001.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>C.</given-names>
            <surname>Dahlman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Kolflaath</surname>
          </string-name>
          ,
          <article-title>The problem of the prior in criminal trials</article-title>
          ,
          <source>in: Philosophical Foundations of Evidence Law</source>
          , Oxford Academic,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>R. J.</given-names>
            <surname>Allen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Pardo</surname>
          </string-name>
          ,
          <article-title>Relative plausibility and its critics</article-title>
          ,
          <source>The International Journal of Evidence &amp; Proof</source>
          <volume>23</volume>
          (
          <year>2019</year>
          )
          <fpage>5</fpage>
          -
          <lpage>59</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>C. E.</given-names>
            <surname>Berger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Slooten</surname>
          </string-name>
          ,
          <article-title>The LR does not exist</article-title>
          ,
          <source>Science &amp; Justice</source>
          <volume>56</volume>
          (
          <year>2016</year>
          )
          <fpage>388</fpage>
          -
          <lpage>391</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>K. A.</given-names>
            <surname>Martire</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Edmond</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. J.</given-names>
            <surname>Navarro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. R.</given-names>
            <surname>Newell</surname>
          </string-name>
          ,
          <article-title>On the likelihood of “encapsulating all uncertainty”</article-title>
          ,
          <source>Science &amp; Justice</source>
          <volume>57</volume>
          (
          <year>2017</year>
          )
          <fpage>76</fpage>
          -
          <lpage>79</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>A.</given-names>
            <surname>Biedermann</surname>
          </string-name>
          , T. Lau,
          <article-title>Decisionalizing the problem of reliance on expert and machine evidence</article-title>
          ,
          <source>Law, Probability and Risk</source>
          <volume>23</volume>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>M.</given-names>
            <surname>Sjerps</surname>
          </string-name>
          ,
          <year>2025</year>
          . Personal communication.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>N.</given-names>
            <surname>Fenton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Neil</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Yet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lagnado</surname>
          </string-name>
          ,
          <article-title>Analyzing the Simonshaven case using Bayesian networks</article-title>
          ,
          <source>Topics in Cognitive Science</source>
          <volume>12</volume>
          (
          <year>2020</year>
          )
          <fpage>1092</fpage>
          -
          <lpage>1114</lpage>
          . doi:10.1111/tops.12417.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>L.</given-names>
            <surname>van Leeuwen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Verheij</surname>
          </string-name>
          ,
          <article-title>A Comparison of Two Hybrid Methods for Analyzing Evidential Reasoning</article-title>
          ,
          <source>in: Legal Knowledge and Information Systems</source>
          , Madrid,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>B.</given-names>
            <surname>Verheij</surname>
          </string-name>
          ,
          <article-title>Proof with and without probabilities: Correct evidential reasoning with presumptive arguments, coherent hypotheses and degrees of uncertainty</article-title>
          ,
          <source>Artificial Intelligence and Law</source>
          <volume>25</volume>
          (
          <year>2017</year>
          )
          <fpage>127</fpage>
          -
          <lpage>154</lpage>
          . doi:10.1007/s10506-017-9199-4.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>