<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>October</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>False Positives in Robustness Checking of Neural Networks</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Mohammad Afzal</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ashutosh Gupta</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>R. Venkatesh</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>S. Akshay</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Indian Institute of Technology Bombay</institution>
          ,
          <addr-line>Mumbai</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>TCS Research</institution>
          ,
          <addr-line>Pune</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <volume>26</volume>
      <issue>2025</issue>
      <fpage>0000</fpage>
      <lpage>0002</lpage>
      <abstract>
        <p>Neural networks are increasingly used in safety-critical systems. To ensure their trustworthy deployment, we need to verify that they are robust against minor perturbations. However, the counterexamples found by state-of-the-art neural network verifiers for robustness may not be truly meaningful: a counterexample reported as a robustness violation by a verifier may not be a violation in the view of a domain expert. We refer to such cases as false positives (FP). In this work, we propose a new approach to evaluating the robustness of neural networks that takes the view of a domain expert into account. Our goal is to evaluate the presence or absence of such FPs in state-of-the-art verifiers. However, doing this manually does not scale. Thus, in our experiments, we evaluate the local robustness property based on the notions of FP and TP (true positives), while approximating the domain expert with an ensemble of state-of-the-art neural network classifiers.</p>
      </abstract>
      <kwd-group>
        <kwd>Neural Network Verification</kwd>
        <kwd>False Positive</kwd>
        <kwd>Local Robustness Verification</kwd>
        <kwd>Formal Methods</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Neural networks are increasingly used in safety-critical applications such as flight control [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ],
autonomous vehicles [
        <xref ref-type="bibr" rid="ref2 ref3 ref4 ref5">2, 3, 4, 5</xref>
        ], and medical diagnosis [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. In the domain of computer vision, an
adversarial example is an image that is visually similar to a given input image but is classified differently by the
network. The work by Goodfellow et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] demonstrated the existence of adversarial examples
in classical neural networks by slightly modifying an image in an algorithmic manner. This raises
serious safety concerns for the use of neural networks and has led to a vast research endeavor towards
finding such adversarial examples. The machine learning community [
        <xref ref-type="bibr" rid="ref10 ref11 ref12 ref13 ref14 ref8 ref9">8, 9, 10, 11, 12, 13, 14</xref>
        ] has actively
worked on finding such adversarial examples. Some of the well-known methods are fast gradient sign
method (FGSM) [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and projected gradient descent (PGD) [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. However, these efforts typically fall
short of verifying the absence of such examples.
      </p>
      <p>
        One approach towards addressing this issue has been to use formal verification techniques. By
leveraging formal methods, researchers provide certification of the absence of adversarial examples,
if none exist [
        <xref ref-type="bibr" rid="ref16 ref17 ref18">16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31</xref>
        ]. These methods use abstract
interpretation [32] or constraint solving [33] to analyze the neural network. Many of the works in this
setting [34, 35, 36] focus on the local robustness verification problem, i.e., for a given neural network
and an input image, the network is considered locally robust if all images close to the given image are
classified the same as the original, where distance is defined in terms of a simple distance metric (e.g.,
an $L_p$ norm with $p \in \{1, 2, \infty\}$). In the verification community, such images, which are close to the original but
are classified differently, are more often called counterexamples, and existing work focuses on either
finding counterexamples or providing certification of their absence.
      </p>
      <p>However, in doing so, existing works often lose focus on evaluating the genuineness of the
counterexample. This leads to false positives, i.e., counterexamples that exist from the perspective of
the underlying specification but may not be considered genuine from a domain expert's perspective. In
this work, our aim is to identify false and true positives and study their prevalence in commonly used
image classification benchmarks. Let us start by looking at a few motivating examples.</p>
      <p>[Figure 1: seed images (left) and counterexamples (right). Panel (e): seed predicted 7, ensemble labels {1, 7}, counterexamples predicted 1 and 8. Panel (f): seed predicted 9, ensemble labels {4, 9}, counterexamples predicted 4 and 8.]</p>
        <p>Consider the two images on the left of Figure 1(a)-(b), which are from the MNIST [37] dataset and
have ground truth labels 0 and 2, respectively. The images on the right are counterexamples generated
by the state-of-the-art verifier α,β-CROWN [38] for the image labeled 0 (misclassified as 6), and by
the same verifier for the image labeled 2 (misclassified as 1). A domain expert may not consider these
as actual bugs because the counterexample labeled as 6 also visually resembles a 6, and the network
predicts it as either 0 or 6—which is reasonable. Similarly, the counterexample misclassified as 1 also
resembles a 1, and the network predicts it as either 1 or 2, which is again acceptable. Since the network’s
predictions are visually justifiable, labeling them as counterexamples is, in fact, a false positive.</p>
        <p>On the other hand, a counterexample may be considered a genuine counterexample, which we call
“true positive”, if the network misclassifies the image and the misclassification is not visually justifiable.
In Figure 1(c), the image on the left is a seed image, and the image on the right is a counterexample
reported by the α,β-CROWN verifier. The counterexample is classified as 3 by the network, while a human expert
would say that it is a 1. We present a similar example in Figure 1(d).</p>
        <p>In this work, we recall and define the well-known notions of false positive (FP) and true positive
(TP) in the context of local robustness verification of neural network image classifiers. Our modified
definition requires a domain expert to classify the images and assign the ground truth labels, which
we formally call an oracle. An important point to note is that in our setting, even an oracle may not
always be able to classify the images since images may resemble many labels. Thus it may assign a set
of possible labels to an image. As a result, we may use the oracle in two possible ways: (1) to determine
whether a counterexample produced by the verifier is a true or false positive by checking whether the
classification of the counterexample belongs to one of the oracle-provided labels; and (2) to guide the
verifier to search only for counterexamples that are not classified as one of the oracle-provided labels.</p>
        <p>For example, in Figure 1(e), the oracle may state that the original image could be either a 1 or a 7.
We first ask the verifier to find a counterexample, and it reports one classified as 1 (middle image). We
then ask the verifier to look specifically for a counterexample that is not classified as 1 or 7. If such
a counterexample is found (right image), it qualifies as a true positive. Similarly, in Figure 1(f), the
oracle identifies the original image as possibly 4 or 9. We ask the verifier to find a counterexample that
is not classified as 4 or 9, and it reports 8—a true positive. Otherwise, it would have reported 4 as a
counterexample, which would not be a valid true positive.</p>
        <p>It is an open secret that false positives exist in practice. However, we ask a more precise question:
what is the rate of false positives in practice (RQ1)? To answer this, we evaluate the standard local
robustness property with respect to false positives and true positives. To approximate the domain expert
(oracle), we develop an ensemble-based approach that exploits multiple neural network classifiers, as
we detail below.</p>
        <p>We conduct experiments using the state-of-the-art neural network verifier α,β-CROWN [38] on the
MNIST [37] and CIFAR-10 [39] datasets. The ensemble consists of 9 classifiers for MNIST and 13 for
CIFAR-10, taken from the ERAN benchmark repository [40]. A label is considered valid if at least
30% of the classifiers agree on it. Our experiments reveal that approximately 6.9% and 1.95% of the
counterexamples generated by the verifiers are false positives for the MNIST and CIFAR-10 datasets,
respectively. We also manually checked whether the label assigned by the ensemble to each
counterexample is accurate, which forms our second experimental research question (RQ2). We found that the
false positive rates based on manual analysis were 12.86% for the MNIST benchmarks and 5.65% for
the CIFAR-10 benchmarks. We also ask whether there are counterexamples that are not classified as one of
the ensemble labels (RQ3), which would be considered true positives under the ensemble definition.
To answer this, we modified the verifiers to search for such counterexamples and reran them on both
the MNIST and CIFAR-10 benchmarks. We found that the false positive rate for the MNIST benchmarks
reduced significantly from 12.86% to 3.22%, and the true positives for the same benchmarks increased from
88.63% to 96.78%. In summary, our approach and novel analysis show that local robustness, as
usually considered, can be overly conservative. Thus, verifiers may be unable to declare a model
trustworthy even when the model is actually trustworthy according to domain experts.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Preliminaries and Definitions</title>
      <p>In this section, we present the preliminaries, define the oracle, and explain the classification of
counterexamples as true positives and false positives in robustness verification. As shown in Section 3,
many counterexamples under standard local robustness are false positives. We also provide an
ensemble-based approach to reduce false positives.</p>
      <p>A neural network $N : \mathbb{R}^n \to \mathbb{R}^m$ is a function that takes an $n$-dimensional input and produces an
$m$-dimensional output. Here, $N(x)[i]$ denotes the logit value corresponding to the $i$th output neuron.
The final decision $\hat{N}$ of the neural network is determined using the argmax [41] function, which returns
the index of the maximum output value: $\hat{N}(x) = \mathrm{argmax}(N(x))$.</p>
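      <p>To make the decision rule concrete, the following minimal Python sketch (the 10-class logit vector here is hypothetical) computes the network's decision as the index of the largest logit:</p>
      <preformat>
import numpy as np

def predict(logits: np.ndarray) -> int:
    # N_hat(x) = argmax(N(x)): the index of the maximum logit value
    return int(np.argmax(logits))

# Example: a hypothetical 10-class logit vector; class 7 has the largest logit
logits = np.array([0.1, 2.3, -0.5, 0.0, 1.1, 0.2, 0.3, 3.0, 0.4, 0.9])
assert predict(logits) == 7
      </preformat>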
      <p>Standard local robustness: Most prior works [34, 42, 43, 36, 44, 45] define the local robustness
verification property as follows:
Definition 1. For a neural network $N$, an input $x$, and an input perturbation value $\epsilon$, $N$ is locally robust at $x$
if for every $x'$ such that $d(x', x) \le \epsilon$, we have $\hat{N}(x') = \hat{N}(x)$.</p>
      <p>The above property requires that a neural network's classification remains unchanged under small
perturbations of the input bounded by $\epsilon$. Here, $\epsilon$ is a user-defined parameter and $d$ is a distance metric, often taken
to be some $L_p$ norm, where $p \in \{1, 2, \infty\}$. An input $x'$ is considered a counterexample if $d(x', x) \le \epsilon$
and $\hat{N}(x') \neq \hat{N}(x)$.</p>
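      <p>As a minimal sketch of this counterexample condition (assuming a hypothetical function net that maps an input array to its logit vector, and taking $d$ to be the $L_\infty$ norm):</p>
      <preformat>
import numpy as np

def is_counterexample(net, x, x_prime, eps):
    """Violation condition of Definition 1 under the L-infinity norm.
    `net` is a hypothetical stand-in mapping an input to its logits."""
    within_ball = eps >= np.max(np.abs(x_prime - x))   # d(x', x) within eps
    label_changed = int(np.argmax(net(x_prime))) != int(np.argmax(net(x)))
    return bool(within_ball and label_changed)
      </preformat>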
      <p>Oracle, true positive, and false positive: An oracle is a function that takes an input $x$ and assigns
one or more labels to the input. The oracle models the domain expert's view of the input image $x$.
Let us assume the neural network classifies an input $x$ into one of the classes from the set $[c]$, where
$[c] = \{0, 1, \ldots, c-1\}$.</p>
      <p>Definition 2. An oracle $\mathcal{O}$ is a function $\mathcal{O} : \mathbb{R}^n \to 2^{[c]}$, where $2^{[c]}$ is the power set of $[c]$. The oracle
assigns one or more labels to the input image $x$.</p>
      <p>Intuitively, the oracle $\mathcal{O}$ assigns the ground truth labels to the input image $x$. It may assign more than
one label to an image if the image visually resembles more than one class. For example, in Figure 1(a),
the image on the left is a seed image $x$ which resembles both 0 and 6, so we may have $\mathcal{O}(x) = \{0, 6\}$.
Using the oracle, we can define the notions of true and false positives in the context of robustness
verification of neural networks.</p>
      <p>Definition 3. An input counterexample $x'$ is a true positive (resp. false positive) for a neural network $N$
if $\hat{N}(x') \notin \mathcal{O}(x')$ (resp. $\hat{N}(x') \in \mathcal{O}(x')$).</p>
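      <p>In code, Definition 3 reduces to a single membership test. The sketch below again assumes a hypothetical net function returning logits, together with an oracle-provided label set:</p>
      <preformat>
import numpy as np

def classify_positive(net, oracle_labels, x_prime):
    """Definition 3: a counterexample x' is a true positive iff the
    network's prediction lies outside the oracle's label set O(x')."""
    pred = int(np.argmax(net(x_prime)))
    return "TP" if pred not in oracle_labels else "FP"
      </preformat>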
      <p>In Figure 1(a), the image on the left is a seed image $x$ with $\hat{N}(x) = 0$, and the image on the right is
a counterexample $x'$ reported as per Definition 1 with $\hat{N}(x') = 6$. Since $\mathcal{O}(x') = \{0, 6\}$, we have
$\hat{N}(x') \in \mathcal{O}(x')$. Therefore, it is not a genuine bug, and we consider it a false positive. A false positive
does not violate the oracle's view of local robustness, as the oracle $\mathcal{O}$ considers the counterexample
$x'$ to be similar to the seed image $x$, but it violates the standard property. On the other hand, a true
positive violates both the oracle's view of local robustness and the standard local robustness property.
In Figure 1(c), the image on the left is a seed image $x$ with $\hat{N}(x) = 1$, and the image on the right
is a counterexample $x'$ reported as per the standard property with $\hat{N}(x') = 3$. Since $\mathcal{O}(x') = \{1\}$
and $\hat{N}(x') \notin \mathcal{O}(x')$, this indicates a true counterexample, as the oracle's classification and the neural
network's classification are completely different. We can use true positives and false positives to evaluate
the results of the standard local robustness property.</p>
      <p>Ensemble-Based Approach to Reduce False Positives: As illustrated in Figure 1(a), the original
image has a ground truth label of 0, but it also visually resembles a 6. Therefore, it is unreasonable
to strictly require the classifier not to predict 6; rather, it should be allowed to classify the image as
either 0 or 6. Motivated by this intuition, we propose using an ensemble of state-of-the-art classifiers
for the corresponding dataset, which is formally defined in Section 3. Let $E(x)$ denote the set of labels
assigned to the input $x$ by the ensemble of classifiers.</p>
      <p>Definition 4. For a neural network $N$, an input $x$, and an input perturbation value $\epsilon$, $N$ is considered
ensemble-guided locally robust at $x$ if $\hat{N}(x) \in E(x)$ and for every $x'$ such that $d(x', x) \le \epsilon$, we have
$\hat{N}(x') \in E(x)$.</p>
      <p>We can modify the solver query for the ensemble-aware local robustness property as $\exists x'.\; d(x', x) \le \epsilon
\wedge \bigvee_{i \in [c] \setminus E(x)} \bigwedge_{j \in E(x)} N(x')[j] \le N(x')[i]$. If this query is satisfiable, then we obtain a
counterexample $x'$ on which the network $N$ disagrees with the ensemble prediction on the original input $x$.</p>
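      <p>The satisfying-assignment condition of this query can be checked directly on the logits of a candidate counterexample. The following Python sketch (argument names are illustrative) returns true exactly when some label outside $E(x)$ dominates every label in $E(x)$:</p>
      <preformat>
def violates_ensemble_labels(logits_xp, ensemble_labels, num_classes):
    """Condition of the query above on the logits N(x') of a candidate x':
    some label i outside E(x) dominates every label j in E(x)."""
    outside = [i for i in range(num_classes) if i not in ensemble_labels]
    return any(all(logits_xp[i] >= logits_xp[j] for j in ensemble_labels)
               for i in outside)
      </preformat>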
      <p>The above definition is similar to the property (affinity robustness) presented in [46], where the
expert provides a set of sets of possible labels among which the network must classify. In affinity
robustness, the top-$k$ classes refer to the $k$ classes with the highest logit values in the network
output $N(x)$. Affinity robustness requires that for some $k$, the top-$k$ classes remain the same across both $x$
and $x'$, and moreover, the top-$k$ classes must be a subset of one of the expert-provided sets of labels. In
contrast, our proposed local robustness property requires only that the top-1 (predicted) output of the
neural network on input $x'$ belongs to the set of labels provided by the ensemble. Our property is
incomparable to the notion of affinity robustness.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Experiments</title>
      <p>
        We evaluate the standard robustness property using the state-of-the-art (SOTA) verifier α,β-CROWN [38]
by reporting the number of false positives and true positives on a standard class of benchmarks.
Benchmarks: We considered networks trained on the MNIST [37] and CIFAR-10 [39] datasets. All
networks were obtained from VNNCOMP [
        <xref ref-type="bibr" rid="ref1">47, 1, 48, 49</xref>
        ]. Table 1 provides the details of these networks.
      </p>
      <p>We randomly selected 1000 images from each dataset. For the neural networks trained on MNIST, we
used two different values of the input perturbation $\epsilon$, resulting in approximately 2000 benchmark instances
per network. For CIFAR-10 networks, we used a single value of $\epsilon$, leading to approximately 1000
benchmark instances per network. The values of $\epsilon$ were chosen to match those used in VNNCOMP
and are shown in the last column of Table 1. Many neural networks approximate the human decision-making
process with the goal of reducing human effort. However, it is impractical for humans to act as
oracles for all counterexample images. To address this, we propose using an ensemble of AI models as a
proxy for the oracle, defined as follows.</p>
      <p>
        Definition 5. Let $\mathcal{A}$ be a set of AI models and let the agreement threshold $t$ be a positive integer. We define an
ensemble oracle $E$ such that, for each input $x$, $E(x) = \{\, l \mid |\{ N \in \mathcal{A} \mid \hat{N}(x) = l \}| \ge t \,\}$.
      </p>
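      <p>A minimal Python sketch of this ensemble oracle, assuming each model in the ensemble is a callable returning a logit vector, is:</p>
      <preformat>
from collections import Counter
import numpy as np

def ensemble_oracle(classifiers, x, t):
    """Definition 5: E(x) is the set of labels predicted by at least t
    of the models; each classifier is assumed to return a logit vector."""
    votes = Counter(int(np.argmax(n(x))) for n in classifiers)
    return {label for label, count in votes.items() if count >= t}
      </preformat>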
      <p>
        The AI models can be of various types, such as neural networks, decision trees, or random forests.
These models may be trained on different datasets, which might not always be publicly accessible. By
leveraging such models, we aim to incorporate the collective knowledge of the community into our
analysis. In this work, we use neural network classifiers as our AI models. For the MNIST dataset, we
use $|\mathcal{A}| = 9$ classifiers with a threshold $t = 3$, and for the more complex CIFAR-10 dataset, we use
$|\mathcal{A}| = 13$ classifiers with $t = 4$. The intuition behind these thresholds is that a label is accepted if
approximately 30% of the networks agree on it. If we choose $t$ to be large, we may disallow diversity
of opinion, and if we choose $t$ to be small, a single member of the ensemble may pollute the predictions. We
use all the classifiers in our ensembles from the ERAN repository [40], ranging from simple fully connected
networks to complex classifiers, and from simply trained to adversarially trained ones. To adversarially train the
networks [40], the DiffAI [50] and PGD [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] methods are used. DiffAI uses abstract interpretation to create
a range of safe adversarial examples, helping the model become more robust in different input regions.
PGD (projected gradient descent) is an attack-based method that trains the model on adversarial
examples made by slightly perturbing inputs through gradient steps within a fixed limit. Tables 2 and 3
show details about the selected neural network classifiers.
      </p>
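      <p>For illustration, a common form of the PGD attack (a sketch, assuming a hypothetical grad_fn that returns the loss gradient with respect to the input, and pixel values in [0, 1]) is:</p>
      <preformat>
import numpy as np

def pgd_attack(grad_fn, x, eps, alpha, steps):
    """PGD [15]: repeated signed-gradient ascent steps, projected back
    onto the L-infinity ball of radius eps around the original input x."""
    x_adv = x.copy()
    for _ in range(steps):
        x_adv = x_adv + alpha * np.sign(grad_fn(x_adv))
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project onto the eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)          # keep pixel values valid
    return x_adv
      </preformat>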
      <p>We conduct the following three experiments: we identify the false positives (FP) and true positives
(TP) for the standard robustness property using the ensemble, we manually analyze the accuracy of
the decisions of the ensemble, and we construct the ensemble-based robustness property verifier and
manually validate the resulting false positives and true positives.</p>
      <p>False Positives Analysis Using Ensemble (RQ1): Tables 4(a) and 4(b) present the number of true
positives (TP) and false positives (FP) identified using the ensemble-based oracle defined above. We do not
analyze false negatives or true negatives, as it is generally infeasible to determine in advance whether a
neural network is locally robust around a given input. To empirically evaluate RQ1, we observed false
positive rates of 6.9% for the MNIST benchmarks and 1.95% for the CIFAR-10 benchmarks. The FP rate
is higher for the MNIST benchmarks compared to the CIFAR-10 benchmarks. A potential reason is that
many MNIST images resemble more than one label, whereas CIFAR-10 images tend to be more distinct.
Overall, we observe that the number of false positives is significant and merits further consideration.</p>
      <p>Manual Analysis of the Ensemble (RQ2): To evaluate the accuracy of the ensemble, we conducted a
manual analysis of the false positives (FPs) and true positives (TPs) reported in Tables 4(a) and 4(b).
Since it is impractical to manually examine all positive cases, we focused on the first column of each
table (mnist-net-256×2.onnx and cifar-base-kw.onnx, with $\epsilon = 0.03$). For MNIST, we found
that 13 out of 48 reported FPs were actually TPs, and 44 out of 647 reported TPs were actually FPs. For
CIFAR-10, 4 out of 12 reported FPs were actually TPs, and 27 out of 603 reported TPs were actually
FPs. Overall, this results in a false positive rate of 12.86% for MNIST and 5.67% for CIFAR-10 on the
selected benchmark networks. This analysis demonstrates that although an approximation of the oracle
can reduce human effort, it is not perfect.</p>
      <p>Ensemble-based Property Analysis (RQ3): We guide the robustness property using the previously
described ensemble, allowing misclassifications into ensemble-predicted labels (Section 2), and use
humans as oracles to analyze counterexamples. Since human evaluation for all counterexamples is
impractical, we limit this analysis to the mnist-net-256×2.onnx network with $\epsilon = 0.03$. Table 5
summarizes the number of FPs and TPs. We observed that the false positive rate dropped from 12.86%
to 3.22%, while the true positive rate increased from 88.63% to 96.78%.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusion</title>
      <p>In this paper, we considered true and false positives in the context of local robustness using oracles.
We used an ensemble of classifiers as an approximation of the oracle and analyzed the robustness of
neural networks with respect to this ensemble. We showed that the standard local robustness property
can report many false positives under this oracle and proposed a new ensemble-guided local robustness
property that can help reduce the number of false positives.</p>
    </sec>
    <sec id="sec-5">
      <title>Declaration on Generative AI</title>
      <p>We, the authors of this paper, declare that we used the tools Grammarly and ChatGPT solely for grammar
and spelling checks, as well as for paraphrasing and rewording tasks. After using these tools, we carefully
reviewed, edited, and validated the content. We take full responsibility for the final content of the
publication.</p>
    </sec>
    <sec id="sec-6">
      <title>References (continued)</title>
      <p>[19] C.-H. Cheng, G. Nührenberg, H. Ruess, Maximum resilience of artificial neural networks, in:
International Symposium on Automated Technology for Verification and Analysis, Springer, 2017,
pp. 251–268.
[20] G. Katz, C. Barrett, D. L. Dill, K. Julian, M. J. Kochenderfer, Reluplex: An efficient SMT solver
for verifying deep neural networks, in: International Conference on Computer Aided Verification,
Springer, 2017, pp. 97–117.
[21] G. Katz, D. A. Huang, D. Ibeling, K. Julian, C. Lazarus, R. Lim, P. Shah, S. Thakoor, H. Wu,
A. Zeljić, et al., The Marabou framework for verification and analysis of deep neural networks, in:
International Conference on Computer Aided Verification, Springer, 2019, pp. 443–452.
[22] R. Ehlers, Formal verification of piece-wise linear feed-forward neural networks, in: International
Symposium on Automated Technology for Verification and Analysis, Springer, 2017, pp. 269–286.
[23] X. Huang, M. Kwiatkowska, S. Wang, M. Wu, Safety verification of deep neural networks, in:
International Conference on Computer Aided Verification, Springer, 2017, pp. 3–29.
[24] S. Wang, H. Zhang, K. Xu, X. Lin, S. Jana, C.-J. Hsieh, J. Z. Kolter, Beta-CROWN: Efficient bound
propagation with per-neuron split constraints for neural network robustness verification, Advances
in Neural Information Processing Systems 34 (2021) 29909–29921.
[25] K. Xu, H. Zhang, S. Wang, Y. Wang, S. Jana, X. Lin, C. Hsieh, Fast and complete: Enabling
complete neural network verification with rapid and massively parallel incomplete verifiers, CoRR
abs/2011.13824 (2020). URL: https://arxiv.org/abs/2011.13824. arXiv:2011.13824.
[26] H. Zhang, S. Wang, K. Xu, L. Li, B. Li, S. Jana, C. Hsieh, J. Z. Kolter, General cutting planes for
bound-propagation-based neural network verification, CoRR abs/2208.05740 (2022). URL: https:
//doi.org/10.48550/arXiv.2208.05740. doi:10.48550/arXiv.2208.05740. arXiv:2208.05740.
[27] S. Wang, K. Pei, J. Whitehouse, J. Yang, S. Jana, Formal security analysis of neural networks using
symbolic intervals, in: 27th {USENIX} Security Symposium ({USENIX} Security 18), 2018, pp.
1599–1614.
[28] S. Wang, K. Pei, J. Whitehouse, J. Yang, S. Jana, Efficient formal safety analysis of neural networks,
in: Advances in Neural Information Processing Systems, 2018, pp. 6367–6377.
[29] Y. Y. Elboher, J. Gottschlich, G. Katz, An abstraction-based framework for neural network
verification, in: International Conference on Computer Aided Verification, Springer, 2020, pp. 43–65.
[30] P. Yang, R. Li, J. Li, C.-C. Huang, J. Wang, J. Sun, B. Xue, L. Zhang, Improving neural network
verification through spurious region guided refinement, in: International Conference on Tools
and Algorithms for the Construction and Analysis of Systems, Springer, 2021, pp. 389–408.
[31] X. Lin, H. Zhu, R. Samanta, S. Jagannathan, Art: Abstraction refinement-guided training for
provably correct neural networks, in: 2020 Formal Methods in Computer Aided Design, FMCAD
2020, Haifa, Israel, September 21-24, 2020, IEEE, 2020, pp. 148–157. URL: https://doi.org/10.34727/
2020/isbn.978-3-85448-042-6_22. doi:10.34727/2020/isbn.978-3-85448-042-6\_22.
[32] P. Cousot, R. Cousot, Abstract interpretation: a unified lattice model for static analysis of programs
by construction or approximation of fixpoints, in: Proceedings of the 4th ACM SIGACT-SIGPLAN
symposium on Principles of programming languages, 1977, pp. 238–252.
[33] C. Barrett, C. L. Conway, M. Deters, L. Hadarean, D. Jovanović, T. King, A. Reynolds, C. Tinelli,
cvc4, in: Computer Aided Verification: 23rd International Conference, CAV 2011, Snowbird, UT,
USA, July 14-20, 2011. Proceedings 23, Springer, 2011, pp. 171–177.
[34] G. Singh, T. Gehr, M. Mirman, M. Püschel, M. T. Vechev, Fast and effective robustness
certification, in: S. Bengio, H. M. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi,
R. Garnett (Eds.), Advances in Neural Information Processing Systems 31: Annual
Conference on Neural Information Processing Systems 2018, NeurIPS 2018, December 3-8, 2018,
Montréal, Canada, 2018, pp. 10825–10836. URL: https://proceedings.neurips.cc/paper/2018/hash/
f2f446980d8e971ef3da97af089481c3-Abstract.html.
[35] G. Singh, T. Gehr, M. Püschel, M. T. Vechev, Boosting robustness certification of neural networks,
in: 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA,
May 6-9, 2019, OpenReview.net, 2019. URL: https://openreview.net/forum?id=HJgeEh09KQ.
[36] H. Zhang, T.-W. Weng, P.-Y. Chen, C.-J. Hsieh, L. Daniel, Efficient neural network robustness
certification with general activation functions, Advances in neural information processing systems
31 (2018).
[37] L. Deng, The mnist database of handwritten digit images for machine learning research [best of
the web], IEEE signal processing magazine 29 (2012) 141–142.
[38] H. Zhang, K. Xu, S. Wang, et al., alpha-beta-CROWN: Complete and incomplete neural network
verification, https://github.com/Verified-Intelligence/alpha-beta-CROWN. Accessed: 2025-07-17.
[39] A. Krizhevsky, G. Hinton, et al., Learning multiple layers of features from tiny images (2009).
[40] M. N. Müller, G. Singh, M. Balunovic, G. Makarchuk, A. Ruoss, F. Serre, M. Baader, D. D. Cohen,
T. Gehr, A. Hofmann, J. Maurer, M. Mirman, C. Müller, M. Püschel, P. Tsankov, M. Vechev, ERAN:
ETH robustness analyzer for neural networks, https://github.com/eth-sri/eran. Secure, Reliable,
and Intelligent Systems Lab (SRI), ETH Zurich. Accessed: 2025-07-17.
[41] C. M. Bishop, N. M. Nasrabadi, Pattern recognition and machine learning, volume 4, Springer,
2006.
[42] G. Singh, T. Gehr, M. Püschel, M. Vechev, An abstract domain for certifying neural networks,
Proceedings of the ACM on Programming Languages 3 (2019) 1–30.
[43] G. Singh, R. Ganvir, M. Püschel, M. Vechev, Beyond the single neuron convex barrier for neural
network certification, Advances in Neural Information Processing Systems 32 (2019).
[44] K. Xu, H. Zhang, S. Wang, Y. Wang, S. Jana, X. Lin, C.-J. Hsieh, Fast and Complete: Enabling
complete neural network verification with rapid and massively parallel incomplete verifiers, in:
International Conference on Learning Representations, 2021. URL: https://openreview.net/forum?
id=nVZtXBI6LNn.
[45] D. Zhou, C. Brix, G. A. Hanasusanto, H. Zhang, Scalable neural network verification with
branch-and-bound inferred cutting planes, in: The Thirty-eighth Annual Conference on Neural
Information Processing Systems, 2024.
[46] K. Leino, M. Fredrikson, Relaxing local robustness, Advances in Neural Information Processing
Systems 34 (2021) 17072–17083.
[47] S. Bak, C. Liu, T. Johnson, The second international verification of neural networks competition
(vnn-comp 2021): Summary and results, arXiv preprint arXiv:2109.00498 (2021).
[48] C. Brix, S. Bak, C. Liu, T. T. Johnson, The fourth international verification of neural networks
competition (vnn-comp 2023): Summary and results, 2023. URL: https://arxiv.org/abs/2312.16760.
arXiv:2312.16760.
[49] C. Brix, S. Bak, T. T. Johnson, H. Wu, The fifth international verification of neural networks
competition (vnn-comp 2024): Summary and results, arXiv preprint arXiv:2412.19985 (2024).
[50] M. Mirman, T. Gehr, M. Vechev, Differentiable abstract interpretation for provably robust neural
networks, in: International Conference on Machine Learning, PMLR, 2018, pp. 3578–3586.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Emami</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Castaldi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Banazadeh</surname>
          </string-name>
          ,
          <article-title>Neural network-based flight control systems: Present and future</article-title>
          ,
          <source>Annual Reviews in Control</source>
          <volume>53</volume>
          (
          <year>2022</year>
          )
          <fpage>97</fpage>
          -
          <lpage>137</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Bojarski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. D.</given-names>
            <surname>Testa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dworakowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Firner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Flepp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. D.</given-names>
            <surname>Jackel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Monfort</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Muller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Zieba</surname>
          </string-name>
          ,
          <article-title>End to end learning for self-driving cars</article-title>
          ,
          <source>CoRR abs/1604.07316</source>
          (
          <year>2016</year>
          ). URL: http://arxiv.org/abs/1604.07316.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Anpalagan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Guan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. S.</given-names>
            <surname>Khwaja</surname>
          </string-name>
          ,
          <article-title>Deep learning for object detection and scene perception in self-driving cars: Survey, challenges, and open issues</article-title>
          ,
          <source>Array</source>
          <volume>10</volume>
          (
          <year>2021</year>
          )
          <fpage>100057</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>P. S.</given-names>
            <surname>Chib</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <article-title>Recent advancements in end-to-end autonomous driving using deep learning: A survey</article-title>
          ,
          <source>IEEE Transactions on Intelligent Vehicles</source>
          <volume>9</volume>
          (
          <year>2023</year>
          )
          <fpage>103</fpage>
          -
          <lpage>118</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S.</given-names>
            <surname>Kuutti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Bowden</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Jin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Barber</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Fallah</surname>
          </string-name>
          ,
          <article-title>A survey of deep learning applications to autonomous vehicle control</article-title>
          ,
          <source>IEEE Transactions on Intelligent Transportation Systems</source>
          <volume>22</volume>
          (
          <year>2020</year>
          )
          <fpage>712</fpage>
          -
          <lpage>733</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>F.</given-names>
            <surname>Amato</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>López</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. M.</given-names>
            <surname>Peña-Méndez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Vaňhara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hampl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Havel</surname>
          </string-name>
          ,
          <article-title>Artificial neural networks in medical diagnosis</article-title>
          ,
          <source>Journal of Applied Biomedicine</source>
          <volume>11</volume>
          (
          <year>2013</year>
          )
          <fpage>47</fpage>
          -
          <lpage>58</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>I. J.</given-names>
            <surname>Goodfellow</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Shlens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Szegedy</surname>
          </string-name>
          ,
          <article-title>Explaining and harnessing adversarial examples</article-title>
          , in: Y. Bengio, Y. LeCun (Eds.),
          <source>3rd International Conference on Learning Representations, ICLR</source>
          <year>2015</year>
          , San Diego, CA, USA, May 7-9,
          <year>2015</year>
          , Conference Track Proceedings,
          <year>2015</year>
          . URL: http://arxiv.org/abs/1412.6572.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Shu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Kushman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ermon</surname>
          </string-name>
          ,
          <article-title>Constructing unrestricted adversarial examples with generative models</article-title>
          ,
          <source>Advances in neural information processing systems</source>
          <volume>31</volume>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>I.</given-names>
            <surname>Dunn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Pouget</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Kroening</surname>
          </string-name>
          , T. Melham,
          <article-title>Exposing previously undetectable faults in deep neural networks</article-title>
          ,
          <source>in: Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>56</fpage>
          -
          <lpage>66</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>I.</given-names>
            <surname>Dunn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Hanu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Pouget</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Kroening</surname>
          </string-name>
          , T. Melham,
          <article-title>Evaluating robustness to context-sensitive feature perturbations of diferent granularities</article-title>
          , arXiv preprint arXiv:2001.11055 (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>I.</given-names>
            <surname>Dunn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Pouget</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Melham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Kroening</surname>
          </string-name>
          ,
          <article-title>Adaptive generation of unrestricted adversarial inputs</article-title>
          , arXiv preprint arXiv:1905.02463 (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>J.</given-names>
            <surname>Hayes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Danezis</surname>
          </string-name>
          ,
          <article-title>Learning universal adversarial perturbations with generative models</article-title>
          , in: 2018 IEEE Security and Privacy Workshops (SPW), IEEE,
          <year>2018</year>
          , pp.
          <fpage>43</fpage>
          -
          <lpage>49</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>S.</given-names>
            <surname>Baluja</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Fischer</surname>
          </string-name>
          ,
          <article-title>Learning to attack: Adversarial transformation networks</article-title>
          ,
          <source>in: Proceedings of the AAAI Conference on Artificial Intelligence</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>C.</given-names>
            <surname>Xiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-Y.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <article-title>Generating adversarial examples with adversarial networks</article-title>
          , arXiv preprint arXiv:1801.02610 (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>A.</given-names>
            <surname>Madry</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Makelov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Schmidt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Tsipras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vladu</surname>
          </string-name>
          ,
          <article-title>Towards deep learning models resistant to adversarial attacks</article-title>
          ,
          <source>in: International Conference on Learning Representations</source>
          ,
          <year>2018</year>
          . URL: https://openreview.net/forum?id=rJzIBfZAb.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>A.</given-names>
            <surname>Lomuscio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Maganti</surname>
          </string-name>
          ,
          <article-title>An approach to reachability analysis for feed-forward relu neural networks</article-title>
          ,
          <source>CoRR abs/1706.07351</source>
          (
          <year>2017</year>
          ). URL: http://arxiv.org/abs/1706.07351.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>M.</given-names>
            <surname>Fischetti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Jo</surname>
          </string-name>
          ,
          <article-title>Deep neural networks and mixed integer linear optimization</article-title>
          ,
          <source>Constraints An Int. J.</source>
          <volume>23</volume>
          (
          <year>2018</year>
          )
          <fpage>296</fpage>
          -
          <lpage>309</lpage>
          . URL: https://doi.org/10.1007/s10601-018-9285-6. doi:10.1007/s10601-018-9285-6.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>S.</given-names>
            <surname>Dutta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Jha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sankaranarayanan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Tiwari</surname>
          </string-name>
          ,
          <article-title>Output range analysis for deep feedforward neural networks</article-title>
          ,
          <source>in: NASA Formal Methods Symposium</source>
          , Springer,
          <year>2018</year>
          , pp.
          <fpage>121</fpage>
          -
          <lpage>138</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>