<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Towards a Partial Coverage of Neural Network Explanations Using Approximation Networks</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Mathieu Brassart</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Laurent Simon</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Université de Bordeaux</institution>
          ,
          <addr-line>Bordeaux INP, CNRS, LaBRI, UMR 5800, F-33400 Talence</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2026</year>
      </pub-date>
      <abstract>
<p>We introduce a neuro-symbolic verification framework that challenges the conventional trade-off between accuracy and explainability by maintaining model precision while enabling formal verification or explanation for the majority of decisions. Rather than attempting to verify an entire neural network - often computationally intractable - we train it to decompose its decision boundary into smaller, verifiable submodels, allowing for partial coverage with Reduced Ordered Binary Decision Diagram representations. This ensures rigorous verification for the majority of cases, while acknowledging and identifying cases that remain unverifiable. Our method allows us to formally analyze over 94% of decisions in standard benchmarks without sacrificing accuracy. Additionally, our system is designed to identify cases where a decision cannot be formally verified, ensuring that uncertain predictions are flagged for human intervention or additional scrutiny. This guarantees that AI-driven decisions remain not only precise and explainable but also trustworthy. Our approach provides a scalable and practical solution for deploying safe, accountable, and formally verifiable AI, offering a new verification pathway where accuracy and interpretability are simultaneously achievable rather than being at odds.</p>
      </abstract>
      <kwd-group>
        <kwd>Neuro-Symbolic AI</kwd>
        <kwd>Reduced Ordered Binary Decision Diagrams</kwd>
        <kwd>Formal Verification</kwd>
        <kwd>Bounded Decisions</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>The widespread adoption of Artificial Intelligence (AI) and Machine Learning (ML) in critical applications
(such as healthcare, finance, and autonomous systems) has created a growing demand for trustworthy
and verifiable AI. However, modern machine learning models, particularly deep neural networks
and complex ensemble methods, often function as black boxes, making their decisions difficult to
interpret or formally verify. This lack of explainability and verifiability is a major concern in high-stakes
environments, where erroneous or unverifiable predictions can lead to severe consequences.</p>
      <p>
        Existing explainability and verification methods typically rely on post-hoc interpretability techniques
[
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ], inherently interpretable models such as decision trees [
        <xref ref-type="bibr" rid="ref3 ref4 ref5 ref6 ref7 ref8">3, 4, 5, 6, 7, 8</xref>
        ], or formal methods using
Boolean circuits [
        <xref ref-type="bibr" rid="ref10 ref9">9, 10</xref>
        ] and logical reasoning approaches [
        <xref ref-type="bibr" rid="ref11 ref12 ref13">11, 12, 13</xref>
        ]. While these methods offer insights
into model behavior, they fail to scale when applied to large, high-dimensional models, limiting their
practical use in real-world AI systems. Thus, a long-standing challenge in AI verification is the apparent
trade-off between accuracy and explainability, leading to the belief that highly accurate models must
remain opaque and unverifiable. We challenge this notion by demonstrating that formal verification
and explainability can be achieved without degrading performance, by accepting only partial coverage
of explainable decisions. Rather than attempting to verify an entire neural network (which is
often computationally infeasible), we train it to decompose its decision boundary into smaller, verifiable
submodels, leveraging a Reduced Ordered Binary Decision Diagram (ROBDD) representation of neural
network submodules. In practice, our goal is to ensure that a vast majority of decisions can be formally
verified or explained, while cases that cannot be verified are explicitly identified.
      </p>
      <p>
        To do so, we introduce a hierarchical verification structure, where approximation networks, trained
alongside the main model, are able to provide formal guarantees for most decisions while allowing
complex, unverifiable cases to be flagged for additional control. We use a neuro-symbolic [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] approach
with a hybrid model combining Neural Networks and ROBDD representations, enabling efficient
logical reasoning over neural network submodules. ROBDDs are particularly attractive for verification
because they allow polynomial-time queries over Boolean functions, enabling efficient verification of
logical constraints, bias mitigation, and adversarial robustness properties. Our ROBDD representation
is based on prior work [
        <xref ref-type="bibr" rid="ref10 ref9">9, 10</xref>
        ], which considers neural networks as Boolean circuits, and extends it using
binarized neural networks [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] on submodules, thanks to decision boundary decomposition [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. We
ensure that a majority of decisions (over 94% on some examples as we will see) can be formally verified
(i.e. are decided by a submodule represented as a ROBDD), while maintaining full model accuracy. This
new approach challenges the traditional trade-off between accuracy and explainability by offering a
new pathway for verification. Interestingly, our system is capable of identifying cases where a decision
cannot be formally verified, ensuring that uncertain predictions are flagged for human intervention
or additional scrutiny. This selective verification mechanism makes AI systems more reliable and
trustworthy, particularly in safety-critical applications.
      </p>
      <p>The remainder of this paper is structured as follows. Section 2 examines the limitations of existing
formal verification methods for neural networks. Section 3 introduces our approximation-based
verification framework, detailing how smaller submodels constrain the overall decision space. Section
4 presents experimental results demonstrating that our approach preserves accuracy while allowing
a significant proportion of decisions to be formally verified. Finally, we conclude by discussing the
implications of our findings and potential future research directions.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Neural Networks Accessible to Formal Verification</title>
      <p>
        Our approach applies formal methods (FM) to neural networks (NNs) by expressing them as Boolean
circuits [
        <xref ref-type="bibr" rid="ref10 ref9">9, 10</xref>
        ]. However, for any formal method to be applicable, the AI model must first be expressed
in a suitable formalism, close to propositional logic for FM, and, more importantly, must remain of
manageable size. In our framework, we assume that a neural network (or one of its submodules)
is verifiable if it can be expressed as a ROBDD. This assumption is motivated by the computational
properties of ROBDDs, which allow efficient polynomial-time verification of Boolean functions within
the Knowledge Compilation Map [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. Unlike standard neural networks, which function as opaque
black-box models, ROBDDs enable logical and constraint-based reasoning over decision functions. This
makes them a powerful tool for formal AI verification.
      </p>
      <p>
        Once expressed as an ROBDD, a (binarized) neural network allows for precise formal verification
queries that are otherwise infeasible in conventional deep learning models, for instance bias
elimination (ensuring that protected attributes such as gender or ethnicity do not affect predictions), safety
verification in autonomous systems (guaranteeing that the AI model does not enter unsafe states under
any possible input configuration) and even adversarial robustness. However, despite these advantages,
applying ROBDDs to large-scale neural networks remains highly challenging. Converting a neural
network into a ROBDD may require time exponential in the network size and may result in a BDD of
exponential size (for a special family of functions) [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] (Table 1), making it computationally infeasible even for relatively
small networks, and preventing its use in practical deep learning systems. Additionally, the process of
binarizing neural networks to fit into a Boolean circuit framework is itself a challenging task. Previous
studies [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] have highlighted that even when using binarized neural networks (BNNs) [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], the resulting
Boolean structure remains too large for direct formal verification methods. The challenge, therefore,
is not only in expressing neural networks as Boolean circuits, but also in managing the explosion of
state-space complexity that arises during this transformation.
      </p>
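      <p>As an illustration of the kind of query this enables, the following minimal sketch (using the Python
package dd; the variables and the toy decision function are illustrative, not a compiled network) checks
the bias-elimination property mentioned above: once a model is an ROBDD, verifying that a protected
attribute never affects the decision reduces to a polynomial-time substitution-and-compare test.</p>
      <preformat>
# Minimal sketch with the `dd` package; the decision function is a toy example.
from dd.autoref import BDD

bdd = BDD()
bdd.declare('gender', 'f1', 'f2')
model = bdd.add_expr(r'f1 /\ f2')  # stand-in for a compiled decision function

# The decision ignores `gender` iff fixing it to False or True yields the same BDD.
u0 = bdd.let({'gender': False}, model)
u1 = bdd.let({'gender': True}, model)
print('gender-independent:', u0 == u1)
      </preformat>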
      <p>To address this scalability issue, we introduce approximation networks, which allow us to reduce
the complexity of the ROBDD conversion while preserving the benefits of formal verification. Instead
of verifying the full neural network, our approach selectively trains smaller subnetworks, designed to
approximate specific decision boundaries while remaining tractable for ROBDD representation. These
approximation networks act as verifiable surrogates, covering a large proportion of decisions that can
be formally verified, while leaving only a limited set of complex cases unverifiable. By structuring the
decision boundary in this way, we enable partial but scalable verification, ensuring that the majority
of AI decisions remain provably correct while explicitly flagging uncertain cases for further scrutiny.
This approach provides a practical pathway for integrating formal verification into modern AI systems,
balancing computational feasibility with the need for rigorous, logic-based guarantees in machine
learning models.</p>
        <sec id="sec-2-2-1">
          <title>2.1. Evaluating the Complexity of Neural Network Compilation</title>
          <p>While it is possible to train a network directly in its binarized form, this often leads to optimization
challenges due to the non-differentiability of the binarization function. Therefore, in practice, BNNs
are usually trained using floating-point weights and activations, and then binarized afterward for
compilation purposes. Binarized NNs have binary inputs and outputs, but floating-point weights and
biases. We use an activation layer between each fully connected layer that implements the threshold
function σ(x) = 1 if x ≥ 0 and 0 otherwise, ensuring that every neuron input and output is binary (1 or 0).
The binarization by itself is not particularly computationally expensive (if we accept a loss of precision).
The main challenge is to "compile" a BNN into a ROBDD, in order to use formal verification on it.</p>
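          <p>A minimal sketch (assuming PyTorch; the module name is ours) of such an activation layer, with
the hard 0/1 threshold used after binarization and the smooth surrogate used during training (see below):</p>
          <preformat>
import torch
import torch.nn as nn

class ThresholdActivation(nn.Module):
    """Hard threshold at evaluation time, sigmoid surrogate while training."""
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.training:
            return torch.sigmoid(x)   # smooth relaxation so gradients can flow
        return (x >= 0).float()       # threshold: 1 if x >= 0, else 0
          </preformat>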
          <p>
            During the training phase, the activation layers function as sigmoid activations, facilitating faster
training. The compilation method follows these steps: first, each layer is processed sequentially,
considering one neuron at a time, based on the algorithm of [
            <xref ref-type="bibr" rid="ref5">5</xref>
            ]. Each neuron is then converted into
an ROBDD, expressed in terms of the outputs of the previous layer or the input features for the first
layer. Then, by replacing the previous layer’s outputs with the corresponding BDDs, a single BDD is
generated, representing the neural network’s output in terms of the input features (see Algorithm 2.1).
          </p>
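          <p>The pseudocode of Algorithm 2.1 can be restated as a minimal runnable sketch, here assuming the
Python package dd and the convention that each neuron's BDD in layer i is written over variables
h0, h1, ... naming the outputs of layer i-1 (layer 0 is written over the input features):</p>
          <preformat>
from dd.autoref import BDD

def compose_layers(bdd: BDD, layers_bdd):
    """layers_bdd[i]: list of BDD nodes for layer i's neurons."""
    prev_layer = layers_bdd[0]
    for layer in layers_bdd[1:]:
        # Substitute each previous-layer variable h_j by its BDD, so every
        # neuron of this layer becomes a function of the input features.
        subst = {'h{}'.format(j): g for j, g in enumerate(prev_layer)}
        prev_layer = [bdd.let(subst, u) for u in layer]
    return prev_layer  # BDDs of the output neurons over the input features
          </preformat>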
          <p>This algorithm can be illustrated using the example of a binarized neural network in Figure 1: we
sequentially express the outputs (ℎ1, ℎ2, ℎ3, ℎ4) of the first layer in terms of the inputs, and the output
of the second layer in terms of the outputs (ℎ1, ℎ2, ℎ3, ℎ4) of the first layer. At the end of the algorithm, a
single BDD is constructed, representing the output in terms of the inputs.</p>
          <p>[Figure 2: the Global Model, combining approximation networks and a state-of-the-art neural
network over the input features through a Max function.]</p>
          <p>
            Unfortunately, Table 1 illustrates the growth we measured in the computational cost required to rewrite
a binarized neural network as a function of its size (also reported in [
            <xref ref-type="bibr" rid="ref6">6</xref>
            ]). In conclusion, while several prior
approaches have attempted to address this problem, compiling a neural network into an ROBDD suitable
for practical formal verification remains a major computational challenge (our network has only one
hidden layer!).
          </p>
        </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Coverage of a Neural Network Model by Approximation Networks</title>
      <p>We propose a novel structured verification approach that overcomes the limitations discussed earlier by
integrating a logical layer that connects multiple neural networks, with controlled sizes. Our approach
replaces a single large neural network with a Global Model that integrates multiple approximation
networks (see Figure 2). Each approximation network is trained to capture a verifiable subset of
decisions, while we expect the main neural network to remain responsible for handling more complex
cases. The final prediction, denoted as y, is computed as y = max(a1, a2, . . . , an, m), where
ai represents the output of the i-th approximation network and m is the output of the main neural network.
This architecture ensures that if any approximator is confident in its decision, the final model prediction
is immediately verifiable. If no approximator provides a decision, the main neural network takes over,
handling more intricate cases that may not be directly verifiable.</p>
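      <p>A minimal sketch (assuming PyTorch; widths and depths are illustrative, not the configurations
used in our experiments) of this architecture:</p>
      <preformat>
import torch
import torch.nn as nn

class GlobalModel(nn.Module):
    """Approximation networks and a main network merged by an element-wise max."""
    def __init__(self, n_features: int, n_approx: int = 3):
        super().__init__()
        self.approximators = nn.ModuleList(
            nn.Sequential(nn.Linear(n_features, 8), nn.Sigmoid(),
                          nn.Linear(8, 1), nn.Sigmoid())
            for _ in range(n_approx))
        self.main = nn.Sequential(nn.Linear(n_features, 64), nn.ReLU(),
                                  nn.Linear(64, 1), nn.Sigmoid())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        outs = [a(x) for a in self.approximators] + [self.main(x)]
        return torch.stack(outs, dim=0).max(dim=0).values  # y = max(a1..an, m)
      </preformat>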
      <p>The intuition behind our idea is to build on Decision Boundary Decomposition. We take advantage
of the fact that many real-world decision boundaries are locally simple but globally complex: while
the global decision boundary may be highly complex, many local subregions are governed by simple,
well-separated decision rules. This phenomenon aligns with the Pareto principle, where (typically)
80% of the decision complexity arises from only 20% of cases. Our aim is to exploit this structure by
assigning verifiable approximation networks to low-complexity regions, while more intricate cases are
handled by the main neural network. This design follows a Pareto-like pattern, where the majority of
decisions fall into simpler, verifiable subregions, significantly improving formal verification feasibility.</p>
      <p>Note that, in the current work, we focus on explaining positive decisions (cases where the model
responds "yes" or "1"), using the max function to capture verifiable subsets. While our framework is
currently designed for positive predictions, generalizing it to both "yes" and "no" decisions (0/1 outputs)
is a natural extension and remains an ongoing area of research.</p>
      <sec id="sec-3-1">
        <title>3.1. Guarantees on Explainability and Bias through Decision Composition</title>
        <p>Since the final decision inherits the properties of at least one approximator when it takes precedence, we
can ensure global model properties by controlling the behavior of its subcomponents. If all approximators
are designed/proved to be fully explainable or bias-free, any decision one approximator covers will also
be explainable at the global level: suppose that approximator ai satisfies a given formal property P
(e.g., fairness, monotonicity, adversarial robustness); then for any input x where ai(x) = 1, we can
guarantee that the Global Model G also satisfies P on that input, since</p>
        <p>G(x) = max(a1(x), a2(x), . . . , an(x), m(x)) ≥ ai(x)</p>
        <p>For example, consider the case of fairness and explainability, where we construct multiple
approximation networks that adhere to a specific constraint: “The decision must not be influenced by the individual’s
gender.” Our decomposition of the decision-making process across multiple approximation networks
(each one enforcing this fairness constraint) ensures that any decision made by one of these modules
inherently satisfies the condition at the global level. If these approximation networks are trained to
cover all positive cases, the fairness property extends to the entire model, guaranteeing that all decisions
made by at least one approximator remain unbiased.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Training Strategies for the Global Model</title>
        <p>To effectively integrate approximation networks into the global model, different training strategies
may be proposed. One approach is sequential training, where approximation networks are trained
first to handle simpler decisions before training the main neural network to focus on more complex
cases. This method encourages approximation networks to learn the most verifiable decisions,
improving explainability. Alternatively, simultaneous training allows all components to be trained
together, ensuring consistency between models, though it may limit the effectiveness of approximation
networks in capturing simple cases independently. Another option is independent training with
post-integration, where each component is trained separately and later combined through the logical
layer. This last method is only used in Section 4, as it is not relevant for the current experiment (each
component is independently trained with classical methods that do not need to be studied here).</p>
        <p>Gradient Propagation through the Max Layer. For each of these methods, we also consider two
different gradient propagation strategies through the max layer: ALL and MAX. The ALL method
distributes the gradient to all inputs, while the MAX method propagates the gradient only to the
maximum value. These methods may impact the performance of the global model, particularly in terms
of coverage and accuracy.</p>
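        <p>A minimal sketch (assuming PyTorch) of the two strategies: MAX is the default behavior of
torch.max (the gradient flows only to the argmax), while ALL can be sketched with a custom autograd
function that copies the upstream gradient to every input. This is our reading of the two strategies,
not the exact training code.</p>
        <preformat>
import torch

class MaxAllGrad(torch.autograd.Function):
    """Forward: ordinary max over model outputs. Backward: gradient to ALL inputs."""
    @staticmethod
    def forward(ctx, stacked: torch.Tensor) -> torch.Tensor:
        ctx.n = stacked.shape[0]
        return stacked.max(dim=0).values

    @staticmethod
    def backward(ctx, grad_out: torch.Tensor) -> torch.Tensor:
        return grad_out.unsqueeze(0).expand(ctx.n, *grad_out.shape)

# MAX strategy: y = stacked.max(dim=0).values
# ALL strategy: y = MaxAllGrad.apply(stacked)
        </preformat>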
        <p>Loss Function. To optimize the training process, we employed a range of ad hoc loss functions in a
preliminary study. The loss function that yielded the best results is a weighted binary cross-entropy
function (see eq. 1), where ρ is a penalty applied to class 1. When the expected label is 0 but the predicted
label is 1, the loss is ρ times larger than in the opposite case. The idea is to penalize submodules that
wrongly predict class 1 (because the decision will be irrevocable): it is better for submodules to produce a
false negative, which can be recovered by the large NN.</p>
        <p>L = −(ρ · (1 − y) · log(1 − ŷ) + y · log(ŷ))   (1)</p>
        <p>Using this loss function provided better results compared to standard binary cross-entropy,
particularly in terms of recall for class 0 and precision for class 1. However, this approach led to a reduction
in precision for class 0 and a decrease in recall for class 1. Increasing the penalty ρ results in higher
confidence in class 1 predictions but at the expense of reduced coverage (for example, setting ρ = 25
often results in a model predicting only class 0). A suitable value must be chosen to balance precision
and coverage (see below).</p>
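        <p>A minimal sketch (assuming PyTorch; the clamping epsilon is ours) of the weighted loss of eq. 1:</p>
        <preformat>
import torch

def weighted_bce(y_hat: torch.Tensor, y: torch.Tensor, rho: float = 2.0) -> torch.Tensor:
    """Eq. (1): false positives (y = 0, y_hat near 1) cost rho times more."""
    eps = 1e-7
    y_hat = y_hat.clamp(eps, 1 - eps)  # avoid log(0)
    return -(rho * (1 - y) * torch.log(1 - y_hat) + y * torch.log(y_hat)).mean()
        </preformat>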
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Tuning hyper-parameters on the Diabetes Dataset</title>
        <p>
          We based our experimental study on all the available benchmarks in [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] for fairness that fit our
prerequisites. To adjust and test different scenarios for backpropagation mechanisms, loss functions
and training methods of the different NNs within the global model, we first conducted a preliminary
study on a simple and well-known tabular example: the "Diabetes" dataset [
          <xref ref-type="bibr" rid="ref18 ref19">18, 19</xref>
          ] (8 features
and 2 classes). We used a model composed of 3 approximators and 1 neural network, trained with 100
runs of 500 epochs. The results of this preliminary study are presented in Table 2. The second step
is a more comprehensive study on additional datasets, using only the best method from this
preliminary step; it is presented in Section 4.
        </p>
        <p>The evaluation metrics used in our study include several key measures to assess the performance of
the global model and its components. Model Coverage refers to the accuracy of the global model when
predicting positive cases (class "1"). Approximator Coverage measures the accuracy considering only
the predictions made by the approximator network. Relative Coverage represents the proportion of
positive predictions made by the global model that are also predicted by the approximator, computed
as the ratio of approximator coverage to model coverage (Approximator coverage / Model coverage). Neural Network Coverage
reflects the accuracy of the large neural network, though a lower value does not necessarily indicate poor
performance, as the network is designed to handle only complex cases. Additionally, Approximator
Misclassifying Zeros quantifies the percentage of false positives for class 1, capturing errors where the
approximator incorrectly predicts a positive outcome. Finally, Global Model Accuracy accounts for
the overall accuracy of the model across both classes, providing a global measure of system performance.</p>
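        <p>For concreteness, one reading of these definitions, as a minimal sketch over 0/1 prediction vectors
(NumPy arrays; the names and exact formulas are our interpretation of the metrics above):</p>
        <preformat>
import numpy as np

def metrics(y, y_model, y_approx):
    pos = (y == 1)
    model_cov = (y_model[pos] == 1).mean()        # Model Coverage
    approx_cov = (y_approx[pos] == 1).mean()      # Approximator Coverage
    rel_cov = approx_cov / model_cov              # Relative Coverage
    miscl_zeros = (y_approx[y == 0] == 1).mean()  # Approximator Misclassifying Zeros
    accuracy = (y_model == y).mean()              # Global Model Accuracy
    return model_cov, approx_cov, rel_cov, miscl_zeros, accuracy
        </preformat>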
        <p>The results of this preliminary study are presented in Table 2. We observe that the ALL method
performs best. However, it is quite surprising to see that the Simultaneous method also provides strong
results. The main difference is seen in the incorrect classification of zeros by the approximator. This
metric is particularly important because if an approximator incorrectly classifies a 0 as a 1, the large
neural network will not be able to correct this decision, regardless of its size.</p>
        <p>Table 3 shows the results of the same experiment, but with different penalties applied to class
1. We can see that the best results are obtained with a penalty of 2, which provides a good balance
between precision and recall for both classes.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experimentation on Additional Datasets</title>
      <p>
        We selected datasets containing tabular data and providing binary classification problems, and we
binarized their features, which may originally be boolean, floating-point, or categorical. We included the
datasets Compas ([<xref ref-type="bibr" rid="ref20">20</xref>]), Diabetes ([
        <xref ref-type="bibr" rid="ref19">19</xref>
          ]), Titanic ([<xref ref-type="bibr" rid="ref21">21</xref>]), and Adult ([<xref ref-type="bibr" rid="ref22">22</xref>]), taken from [<xref ref-type="bibr" rid="ref23">23</xref>]. Each dataset has
different feature dimensions and levels of binarization. They cover the main benchmarks available
in the Fairness dataset [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. The Compas dataset: 13 features, which expand to 18 binarized features,
F1-score of 0.66 with a neural network. The Diabetes dataset: 8 features, increasing to 29 binarized
features, F1-score of 0.76. The Titanic dataset: 7 features, which are transformed into 16 binarized
features, F1-score of 0.85. The Adult dataset: 14 features, expanded to 113 binarized features, F1-score
of 0.81. An extensive search on Kaggle and other ML repositories was conducted to identify the most
promising solutions, which were selected for comparison (Table 4). This table demonstrates that our
approach yields very good results. Our method achieves F1-scores close to those of the other two
methods, but, as we will see, it has the advantage of being able to provide explanations for and formally
verify a large portion of positive predictions (75% in the worst-case dataset studied (Compas) and 94%
in the best case (Adult)). The worst-case results (Compas) may be disappointing (75% of the positive
predictions can be explained), but this is already a huge improvement compared to the 0% obtained with
the overall neural network used as the state of the art.
      </p>
      <p>One limitation of our current work is that, for simplicity, both the neural network and the
approximators are trained with binarized datasets, which decreases the performance of the large NN (we are
working on this point). Our final observation concerns how the different neural networks are trained.
Table 5 shows that sequential and separate training are the best methods for covering most of
the cases with approximation networks. Between these two methods, sequential training appears to
yield slightly fewer misclassifications and better coverage than separate training. As for simultaneous
training, it generally results in significantly more "0" misclassifications and lower coverage. Additionally,
using the sequential or separate training methods allows for easy extension at a later date by adding
new approximation networks. Another advantage is the ability to change the loss function for each
part of the global model, unlike the simultaneous method, which does not easily support this flexibility.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Discussion</title>
      <p>The increasing deployment of neural networks in critical domains highlights the importance of formal
verification. As these models are often used in safety-critical or ethically sensitive settings, providing
guarantees about their behavior is essential. Formal methods, such as compilation into decision
diagrams, offer a promising path for analyzing and reasoning about the internal logic of neural models
in a principled and rigorous manner. One of the main advantages of ROBDDs lies in their ability to
efficiently count models, which can be leveraged to assess fairness criteria—such as balancing positive
outcomes across protected groups—through exact counting.</p>
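      <p>A minimal sketch (using dd; the toy function and variable names are illustrative) of such an exact
counting query, comparing positive outcomes across the two values of a protected attribute:</p>
      <preformat>
from dd.autoref import BDD

bdd = BDD()
bdd.declare('gender', 'f1', 'f2')
decision = bdd.add_expr(r'(f1 /\ f2) \/ (~f1 /\ gender)')  # toy compiled model

pos_if_g1 = bdd.apply('and', decision, bdd.add_expr('gender'))
pos_if_g0 = bdd.apply('and', decision, bdd.add_expr('~gender'))
print(bdd.count(pos_if_g1, nvars=3), bdd.count(pos_if_g0, nvars=3))
      </preformat>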
      <p>Our approach, however, is not limited to neural networks. In principle, other interpretable
approximators could be used, such as decision trees or small random forests. These models are inherently
explainable and can sometimes yield more compact logical representations. Nonetheless, random forests
are more difficult to integrate into end-to-end differentiable pipelines and typically lack the flexibility
needed for tasks such as knowledge distillation or joint optimization with other components.</p>
      <p>By contrast, using a neural network as the approximation module offers significantly more modeling
freedom. Neural networks can be trained jointly with other subsystems, fine-tuned for specific
downstream tasks, and serve as targets for distillation from larger, more accurate models. This enables a
hybrid strategy in which a complex model is first trained for accuracy and then distilled into a smaller,
verifiable architecture.</p>
      <p>Thus, retaining a neural representation—even if it is eventually binarized for compilation and not
trivially amenable to formal analysis—provides greater flexibility in the learning pipeline. It supports a
wider range of optimization strategies while still enabling formal guarantees through the compilation
step.</p>
      <p>
        An additional benefit of our approach is that it enables the integration of symbolic prior knowledge
through knowledge compilation [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. Logical rules or domain-specific constraints can be encoded
directly into ROBDDs and composed with the compiled neural network. This makes it possible to
combine learned behavior with formally verified knowledge, yielding a hybrid system that is both
data-driven and constraint-aware. Such integration can be particularly valuable in domains where
certain properties must hold by design (e.g., legal compliance, physical laws), and supports reasoning
tasks such as consistency checking, abduction, or symbolic inference over the compiled knowledge
base.
      </p>
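      <p>A minimal sketch (again with dd, on toy formulas) of this composition: a domain rule is compiled to
an ROBDD and conjoined with the compiled model, and consistency checking becomes a test that no
accepted input violates the rule:</p>
      <preformat>
from dd.autoref import BDD

bdd = BDD()
bdd.declare('f1', 'f2', 'unsafe')
model = bdd.add_expr(r'f1 /\ ~unsafe')   # stand-in for the compiled network
rule = bdd.add_expr('~unsafe')           # domain constraint: never unsafe
violations = bdd.apply('and', model, bdd.apply('not', rule))
assert violations == bdd.false           # the model never violates the rule
      </preformat>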
      <p>One final point of discussion concerns a limitation of our current work: we only target positive
decisions. To overcome this, we are investigating the use of the ROBDD from the current approximator
as a routing mechanism. Specifically, the 1/0 output would indicate whether to delegate the input to a
specialized approximator—for handling multiple classes—or to the main neural network. In this setting,
the router functions as a selector, determining whether a given instance is sufficiently simple to be
handled by an auxiliary component, or requires the full capacity of the original model. Our preliminary
results suggest that this routing strategy could achieve similar levels of coverage while extending the
approach beyond binary classification.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>The goal of our proposed method is to enable formal verification of simple cases decided by neural
networks, which can be processed by reasonably sized subnetworks. We demonstrate that it is possible
to integrate small neural networks within a larger neural network model in such a way that if any of
the small networks decides "Yes" for a given decision, the global model will also decide "Yes". We show
that even on non-trivial examples, very small neural networks (NNs) can capture a high proportion of
decisions for which formal verification or explanation can be performed efficiently.</p>
      <p>While our results demonstrate that up to 94% of decisions can be formally verified, there remain
important limitations. The subset of unverifiable decisions (ranging from 6% to 25% depending on
the dataset) may correspond to more complex or edge-case scenarios, potentially requiring additional
scrutiny. Furthermore, our approach relies on a binarization step that can lead to feature explosion,
especially in datasets with continuous variables. Future work will explore hybrid approaches combining
our method with adversarial training techniques to improve coverage in high-complexity cases.</p>
      <p>A key advantage of our method is its ability to identify unverifiable decisions—cases where the
approximation networks fail to provide a provable answer—allowing these instances to be flagged for
further scrutiny. In other words, our approach ensures that the system "knows when it doesn’t know",
a crucial property for safe and trustworthy AI deployment.</p>
      <p>Looking forward, a natural extension of this work is to explore knowledge distillation as a means to
apply our approach to existing neural networks (instead of training the approximator at the same time).
By systematically transferring knowledge from the larger model to smaller, verifiable networks, we
could refine the decision boundary decomposition even further, maximizing the proportion of formally
verifiable decisions. Additionally, extending this framework to handle both "yes" and "no" decisions,
and integrating it with adversarial robustness techniques, could further enhance its applicability in
real-world AI deployment. We also want to study the exact nature of the cases that remain unverifiable.
Are they edge cases, adversarial examples, or ambiguous samples?</p>
      <p>By bridging the gap between machine learning and formal verification, this work paves the way for
scalable, explainable, and provably reliable AI systems. This contributes to the broader goal of building
trustworthy and accountable artificial intelligence.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>This work has been partially funded by the "Chaire IA Digne de Confiance" (Chair on Trustworthy AI),
operated by the Bordeaux University Foundation.</p>
    </sec>
    <sec id="sec-8">
      <title>Declaration on Generative AI</title>
      <p>All the experiments and code were developed without generative AI. The translation of a number of
sentences in this paper and the English grammar and language were improved with text-generation AI.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>G. V.</given-names>
            <surname>den Broeck</surname>
          </string-name>
          , A. Lykov,
          <string-name>
            <given-names>M.</given-names>
            <surname>Schleich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Suciu</surname>
          </string-name>
          ,
          <source>On the tractability of shap explanations</source>
          ,
          <year>2021</year>
          . arXiv:2009.08634.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Nohara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Matsumoto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Soejima</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Nakashima</surname>
          </string-name>
          ,
          <article-title>Explanation of machine learning models using shapley additive explanation and application for real data in hospital</article-title>
          ,
          <source>Computer Methods and Programs in Biomedicine</source>
          <volume>214</volume>
          (
          <year>2022</year>
          )
          <fpage>106584</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>G.</given-names>
            <surname>Audemard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bellart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Bounia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Koriche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-M.</given-names>
            <surname>Lagniez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Marquis</surname>
          </string-name>
          ,
          <article-title>On the Explanatory Power of Boolean Decision Trees</article-title>
          ,
          <source>Data and Knowledge Engineering</source>
          <volume>142</volume>
          (
          <year>2022</year>
          )
          <fpage>102088</fpage>
          . URL: https://hal.science/hal-03939107. doi:10.1016/j.datak.2022.102088.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Shih</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Choi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Darwiche</surname>
          </string-name>
          ,
          <article-title>A symbolic approach to explaining bayesian network classifiers</article-title>
          ,
          <year>2018</year>
          . arXiv:1805.03364.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>H.</given-names>
            <surname>Chan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Darwiche</surname>
          </string-name>
          ,
          <source>Reasoning about bayesian network classifiers</source>
          ,
          <year>2012</year>
          . arXiv:1212.2470.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Hatano</surname>
          </string-name>
          , E. Takimoto,
          <article-title>Boosting-based construction of bdds for linear threshold functions and its application to verification of neural networks</article-title>
          ,
          <year>2023</year>
          . arXiv:2306.05211.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Song</surname>
          </string-name>
          , T. Chen,
          <article-title>Precise quantitative analysis of binarized neural networks: a bdd-based approach</article-title>
          ,
          <source>ACM Transactions on Software Engineering and Methodology</source>
          <volume>32</volume>
          (
          <year>2023</year>
          )
          <fpage>1</fpage>
          -
          <lpage>51</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A.</given-names>
            <surname>Shih</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Darwiche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Choi</surname>
          </string-name>
          ,
          <article-title>Verifying binarized neural networks by Angluin-Style learning</article-title>
          ,
          <source>in: Theory and Applications of Satisfiability Testing - SAT 2019: 22nd International Conference, SAT 2019, Lisbon, Portugal, July 9-12, 2019, Proceedings</source>
          , volume
          <volume>11628</volume>
          , Springer,
          <year>2019</year>
          , p.
          <fpage>354</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A.</given-names>
            <surname>Choi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Shi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Shih</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Darwiche</surname>
          </string-name>
          ,
          <article-title>Compiling neural networks into tractable boolean circuits</article-title>
          ,
          <source>intelligence</source>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>W.</given-names>
            <surname>Shi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Shih</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Darwiche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Choi</surname>
          </string-name>
          ,
          <source>On tractable representations of binary neural networks</source>
          ,
          <year>2020</year>
          . arXiv:2004.02082.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>F.</given-names>
            <surname>Leofante</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Narodytska</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Pulina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Tacchella</surname>
          </string-name>
          ,
          <source>Automated verification of neural networks: Advances, challenges and perspectives</source>
          ,
          <year>2018</year>
          . arXiv:1805.09938.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>J.-A.</given-names>
            <surname>Goulet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. H.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Amiri</surname>
          </string-name>
          ,
          <article-title>Tractable approximate gaussian inference for bayesian neural networks</article-title>
          ,
          <year>2021</year>
          . arXiv:2004.09281.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>A.</given-names>
            <surname>Ignatiev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Narodytska</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Marques-Silva</surname>
          </string-name>
          ,
          <article-title>Abduction-based explanations for machine learning models</article-title>
          ,
          <year>2018</year>
          . arXiv:1811.10656.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>T. R.</given-names>
            <surname>Besold</surname>
          </string-name>
          , A. d. Garcez,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bader</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Bowman</surname>
          </string-name>
          , et al.,
          <article-title>Neural-symbolic learning and reasoning: A survey and interpretation</article-title>
          ,
          <source>Cognitive Computation</source>
          <volume>9</volume>
          (
          <year>2017</year>
          )
          <fpage>395</fpage>
          -
          <lpage>419</lpage>
          . URL: https://doi.org/10.1007/s12559-017-9491-6.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>M.</given-names>
            <surname>Courbariaux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Hubara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Soudry</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>El-Yaniv</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Bengio</surname>
          </string-name>
          ,
          <article-title>Binarized neural networks: Training deep neural networks with weights and activations constrained to +1 or -1</article-title>
          ,
          <year>2016</year>
          . arXiv:1602.02830.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>T.</given-names>
            <surname>Hastie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Tibshirani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Friedman</surname>
          </string-name>
          ,
          <source>The Elements of Statistical Learning</source>
          , Springer,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>A.</given-names>
            <surname>Darwiche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Marquis</surname>
          </string-name>
          ,
          <article-title>A knowledge compilation map</article-title>
          ,
          <source>Journal of Artificial Intelligence Research</source>
          <volume>17</volume>
          (
          <year>2002</year>
          )
          <fpage>229</fpage>
          -
          <lpage>264</lpage>
          . URL: http://dx.doi.org/10.1613/jair.989. doi:10.1613/jair.989.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <surname>R. K. E. Bellamy</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          <string-name>
            <surname>Dey</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Hind</surname>
            ,
            <given-names>S. C.</given-names>
          </string-name>
          <string-name>
            <surname>Hofman</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Houde</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          <string-name>
            <surname>Kannan</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Lohia</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Martino</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Mehta</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Mojsilovic</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Nagar</surname>
            ,
            <given-names>K. N.</given-names>
          </string-name>
          <string-name>
            <surname>Ramamurthy</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Richards</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Saha</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Sattigeri</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Singh</surname>
            ,
            <given-names>K. R.</given-names>
          </string-name>
          <string-name>
            <surname>Varshney</surname>
          </string-name>
          , Y. Zhang,
          <source>AI</source>
          Fairness
          <volume>360</volume>
          :
          <article-title>An extensible toolkit for detecting, understanding, and mitigating unwanted algorithmic bias</article-title>
          ,
          <year>2018</year>
          . URL: https://arxiv.org/abs/
          <year>1810</year>
          .
          <year>01943</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>J. W.</given-names>
            <surname>Smith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. E.</given-names>
            <surname>Everhart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. C.</given-names>
            <surname>Dickson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. C.</given-names>
            <surname>Knowler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. S.</given-names>
            <surname>Johannes</surname>
          </string-name>
          ,
          <article-title>Using the adap learning algorithm to forecast the onset of diabetes mellitus</article-title>
          ,
          <source>in: Proceedings of the Annual Symposium on Computer Applications in Medical Care</source>
          , American Medical Informatics Association,
          <year>1988</year>
          , pp.
          <fpage>261</fpage>
          -
          <lpage>265</lpage>
          . URL: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2245318/.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20] J. Angwin, J. Larson, S. Mattu, L. Kirchner, Machine bias: There’s software used across the country to predict future criminals. And it’s biased against blacks, ProPublica (2016). URL: https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21] British Board of Trade, Report on the loss of the Titanic (S.S.), 1912. URL: https://www.kaggle.com/c/titanic.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22] D. Dua, C. Graff, UCI Machine Learning Repository - Adult Dataset, Technical Report, University of California, Irvine, School of Information and Computer Sciences, 2019. URL: https://archive.ics.uci.edu/ml/datasets/adult.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23] Kaggle, Kaggle: Your machine learning and data science community, 2024. URL: https://www.kaggle.com.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>