<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>The full version is available at arXiv:</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Eficient Volume Computation for SMT Formulas</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Chennai Mathematical Institute</institution>
          ,
          <addr-line>India IAI, TCG CREST, Kolkata</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2508</year>
      </pub-date>
      <volume>09934</volume>
      <fpage>11</fpage>
      <lpage>17</lpage>
      <abstract>
        <p>Satisfiability Modulo Theory (SMT) has recently emerged as a powerful tool for solving various automated reasoning problems across diverse domains. Unlike traditional satisfiability methods confined to Boolean variables, SMT can reason on real-life variables like bitvectors, integers, and reals. A natural extension in this context is to ask quantitative questions. One such query in the SMT theory of Linear Real Arithmetic (LRA) is computing the volume of the entire satisfiable region defined by SMT formulas. This problem is important in solving diferent quantitative verification queries in software verification, cyber-physical systems, and neural networks. We introduce ttc, an eficient algorithm that extends the capabilities of SMT solvers to volume computation. Our method decomposes the solution space of SMT Linear Real Arithmetic formulas into a union of overlapping convex polytopes, then computes their volumes and calculates their union. Our algorithm builds on recent developments in streaming-mode set unions, volume computation algorithms, and AllSAT techniques. Experimental evaluations demonstrate significant performance improvements over state-of-the-art approaches.</p>
      </abstract>
      <kwd-group>
        <kwd>eol&gt;Satisfiability Modulo Theories</kwd>
        <kwd>Quantitative Reasoning</kwd>
        <kwd>Volume Computation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Satisfiability Modulo Theories (SMT) has revolutionized automated reasoning, serving as the
foundational technology for diverse problems. The power of SMT stems from its ability to reason over diverse
theories, including bitvectors, reals, and integers, extending well beyond the capabilities of traditional
SAT solvers. This versatility has established SMT as the de facto decision procedure not only in formal
verification of software and hardware workflows, but across numerous domains requiring sophisticated
logical reasoning, including security, synthesis, planning, and optimization.</p>
      <p>
        Meanwhile, quantitative reasoning has emerged as a critical advancement in satisfiability solving.
Rather than merely determining whether a Boolean formula can be satisfied, model counting techniques
calculate the number of satisfying assignments - establishing a robust framework for addressing
quantitative challenges like probabilistic inference, software verification, network reliability, neural
network verification, and numerous other problems [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>The natural evolution of these parallel developments leads to the compelling extension to efectively
handle quantitative queries within SMT frameworks. This challenge is nuanced by the diversity of the
underlying theories, each demanding diferent approaches. In discrete domains like bitvectors and linear
integer arithmetic, the problem manifests as model counting. For linear real arithmetic, it transforms
into volume computation or counting distinct regions. Recent years have witnessed remarkable progress
across these domains, yielding both theoretical insights and practical algorithms for bitvectors, linear
integers, and strings. But these approaches are mostly limited to discrete domains, and hardly been
extended to continuous domains. In this work, we address the question: Given an SMT LRA formula,
can we design an eficient volume computation algorithm?</p>
      <p>Our primary contribution is the development of ttc, a novel algorithmic framework that provides an
afirmative answer to this question. The ttc algorithm approximates the volume of SMT LRA formula
solution spaces with provable theoretical guarantees.</p>
      <p>
        We start with a very related and well-studied problem to SMT volume computation: the problem of
volume computation of bounded convex bodies. Sophisticated exact and approximate methods to solve
the problem have been developed in the last few decades. Although the exact volume problem is #P-hard,
the seminal work by Dyer et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] showed a polynomial-time randomized approximation algorithm
(FPRAS) for this problem. Furthermore, advancements have not only improved the asymptotic running
times of these algorithms, but also yielded practically eficient methods that forgo certain theoretical
guarantees to eliminate prohibitive hidden constants in their runtime [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p>A central challenge in applying these ideas to SMT lies in the non-convex nature of the solution spaces
generated by SMT formulas. Unlike convex bodies, non-convex regions are more complex to analyze
due to their irregular shapes and potential discontinuities. A natural strategy to overcome this hurdle is
to partition the non-convex space into a union of convex bodies, where each convex piece can be more
easily managed with existing techniques. Decomposing a non-convex SMT solution space into convex
components introduces its own set of challenges. Current state-of-the-art techniques can’t handle the
union of non-disjoint components. In many cases, the decomposition yields an excessive number of
disjoint components, which may not accurately reflect the underlying structure of the solution space.
In practice, solution spaces manifest as unions of overlapping, non-disjoint polytopic regions with
boundaries and intersections that encode critical constraint information. This overlapping structure
is not merely theoretical - our empirical analysis reveals cases where state-of-the-art decomposition
techniques transform a natural representation of 7 overlapping polytopes into an unwieldy collection
of 20,595 disjoint components, creating unnecessary computational complexity.</p>
      <p>Our approach builds upon recent advancements in counting distinct elements across set unions in
streaming models by Meel et al. (MVC, 2021). The fundamental challenge in adapting the MVC algorithm
to volume computation arises from the inherent diference between discrete and continuous domains.
The MVC algorithm was specifically engineered for discrete settings, while volume computation operates
in continuous space. We overcome this obstacle through a principled discretization approach, efectively
reducing continuous volume computation to the problem of counting lattice points within a carefully
constructed fine-grained lattice space.</p>
      <p>We have implemented ttc and evaluated it on a comprehensive benchmark suite. The results
demonstrate significant gains in scalability and accuracy. Out of a benchmark set of 1131 instances, ttc solved
1112, while the current state of the art can solve only 145.</p>
      <p>Problem Statement Let  be the whole solution space of the given formula  , which has 
dimensions. Let ℬ ⊂ R  be an -dimensional measurable set. The volume of ℬ is defined as  (ℬ) =
∫︀ x, where x denotes the diferential volume element. In Cartesian coordinates, this element is
ℬ
expressed as x = 1 2 · · ·  .</p>
      <p>Applications. The theory of linear real arithmetic has significant applications in the formal
verification of systems with real variables. These include hybrid systems such as cyber-physical systems [6],
and control systems [7] and timed systems [8]. Advanced verification tools like Reluplex [ 9] extend
SMT solving to neural networks by encoding real-valued variables for network inputs. Extending these
approaches to quantitative verification would require LRA solvers with eficient volume computation
capabilities. This follows the established pattern where model counting tools have enabled quantitative
verification advances in software [10, 11] and binarized neural networks [12].</p>
    </sec>
    <sec id="sec-2">
      <title>2. Algorithm and Analysis</title>
      <p>This section presents the core contribution of our work: the ttc algorithm along with its theoretical
analysis.</p>
      <sec id="sec-2-1">
        <title>2.1. Algorithm</title>
        <p>Given an SMT LRA formula  , the ttc algorithm returns an estimate of Volume ( ). Initially, the
algorithm decomposes the solution space of the SMT formula into non-disjoint polytopes and computes
the volume for each polytope. Subsequently, it estimates the volume of the union of the polytopes’
solution space using a sampling-based approach.</p>
        <p>Algorithm 1 ttc(, , )
1: , ← BooleanAbstraction( )
2:  ← toDNF(
3: ← GetPrecision(, , 4 )</p>
        <p>)
4: ′ ←

12
5: Thresh ← max
6:  ← 1 ;  ← ∅
7: for  = 1 to  do
︁(</p>
        <p>24 · (l1n−(24′)/)′2 , 6(ln 6 + ln ))︁
8:
9:
for  ∈  do
 ← Volume(Polytope( ), ′,  ′)
while  ≥</p>
        <p />
        <p>Thresh do
if  ∈ Polytope() then remove  from 
 ← /2</p>
        <sec id="sec-2-1-1">
          <title>Remove every element of  with prob. 1/2</title>
          <p>← Poisson( · )
while  + | | &gt; Thresh do
 ← Poisson( · /2) and  ← /2</p>
        </sec>
        <sec id="sec-2-1-2">
          <title>Remove every element of  with prob. 1/2</title>
          <p>.Append(S)
 ← GenerateSamples(Polytope( ), , )
20: Output | |/</p>
          <p>Since we only have a volume computation algorithm for convex polytopes, but the solution space of
an SMT formula may be non-convex, we decompose the solution space as a union of convex polytopes.
First, we observe that using , the Boolean abstraction of  in DNF form, we can capture the solution
space of  as a union of polytopes. Let  = ⋀︀</p>
          <p>=1 ℓ be a cube in the DNF. Then, the conjunction
we denote by Polytope(). We can show that ⋃︀
⋀︀=1  (ℓ ) of the corresponding linear inequalities defines a (possibly empty) convex polytope, which
=1 Polytope() = Sol( ). This relation shows that
the DNF representation of the Boolean abstraction of an SMT formula can be used to decompose its
solution space into convex polytopes.</p>
          <p>To eficiently compute the union of these polytopes, we leverage recent breakthroughs in streaming
algorithms for set union operations. The central idea is to compute the union of volumes by maintaining
a representative set of points that approximates the total volume. The main caveat of this approach is
that the algorithm requires the underlying sets to be finite, while the volume of a polytope cannot be
measured directly with a finite number of points. To overcome this issue, we consider an axis-parallel
lattice in R with cell side length 10− , defined as
L = 10− Z = {(10− 1, 10− 2, . . . , 10− ) |
 ∈ Z for  = 1, . . . , }. If  is suficiently large, we can use this lattice to establish a relationship
between the number of lattice points contained within a polytope, | ∩ L|, and the volume of the
polytope, Volume ().
in detail.</p>
          <p>We present our algorithm, ttc, in Algorithm 1, and in the subsequent part, we describe the algorithm
In line 1 of ttc, we parse the formula  and construct its Boolean abstraction as a circuit, specifically as
an And-Inverter Graph (AIG). Then, in line 2, ttc converts the circuit to DNF. The circuit representation
enables us to avoid introducing any auxiliary variables. In essence, converting the circuit to DNF is
equivalent to solving a circuit AllSAT problem, for which we leverage recent advances in the literature.</p>
          <p>Based on the desired accuracy  of the volume estimate, we begin by computing the precision
parameter  using GetPrecision in line 3, and threshold value, Thresh, in line 5, which determines
approximately how many points will be maintained during the algorithm’s execution. In the main loop
(lines 7 to 19), we process each cube Polytope() sequentially. For each cube, we first compute the
(approximate) volume of its corresponding polytope using a polytope volume computation algorithm
(line 8). Next, in line 10, the algorithm removes from the current set  all solutions that are already
accounted for. We then determine the number of solutions  that would be sampled from Polytope()
if each solution were independently sampled with probability ; here,  is modeled by a Poisson
distribution. Since we wish to keep the size of  bounded by Thresh, if the sum | | +  exceeds
Thresh, we decrease  and adjust  accordingly - this adjustment is performed by resampling 
from a Poisson distribution with the revised parameter () and by removing elements from  with
probability 1/2 (line 12). Next, we sample  solutions from the cube  uniformly at random and add
to  - this is essentially done by uniformly sampling a lattice point from Polytope() ∩ L. Finally,
the algorithm outputs the final volume estimate | | .</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Experimental Evaluation</title>
      <p>We evaluated our implementation concerning both eficiency and accuracy.</p>
      <p>Baseline. For performance evaluation, we used the current state of the art volume computation
framework SharpSMT, which ofers two distinct modes: the polyvest algorithm, which estimates the volume,
and vinci, which performs exact volume computation. To establish meaningful comparisons, we utilized
polyvest for performance analysis and vinci for accuracy check.</p>
      <p>Benchmarks. As a first step, we sought to rely on benchmarks from SMT-Lib, but these benchmarks
could not be handled by polyvest or vinci owing to them containing extremely thin geometric regions
where dimensions may be constrained to narrow ranges (e.g., (0,10−7 )). In these cases, polyvest and
vinci fail to handle the required precision and incorrectly classify these polytopes as degenerate with
zero volume. Such a study would simply showcase the abilty of ttc to handle constraints requiring high
precision arithmetic but would not allow a more meaningful comparison with SharpSMT. Accordingly,
we focus on the construction of synthetic instances that are disjunctions of intersecting polytopes, with
each polytope defined in H-representation (  ≤  ). In total, our benchmark suite consists of 1131
benchmarks.</p>
      <p>
        1. We vary two parameters: (1) dimension parameter : ranges from 6 to 34, (2) number of polytopes
: ranges from 6 to 42.
2. For each instance, we generate polytopes using three diferent geometric shapes studied by
Cousins and Vempala [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]: (a) Cubes: -dimensional cubes with random bounds, then translated
and rotated. (b) Zonotypes: Minkowski sum of  diferent -dimensional vectors generated
randomly. (c) Simplex: Shapes where all coordinates are nonnegative and sum to at most 1,
defined as { ∈ R  : ∑︀
      </p>
      <p>=1  ≤ 1,   ≥ 0}.</p>
      <p>Environment. We conducted all our experiments on a high-performance computer cluster, with each
node consisting of Intel Xeon Gold 6148 CPUs. We allocated one CPU core and a 5GB memory limit to
each solver instance pair. To adhere to the standard timeout used in model counting competitions, we
set the timeout for all experiments to 3600 seconds. We use values of  = 0.8 and  = 0.2 , in line with
prior work in the model counting community.</p>
      <p>With the above setup, we conduct extensive experiments to understand the following:
RQ1. How does the runtime performance of ttc compare to that of polyvest?
RQ2. How accurate is the count computed by ttc in comparison to the exact count?
Summary of Results. ttc achieves a significant performance improvement over polyvest by finishing on
1112 instances in a benchmark set consisting of 1131, while polyvest could only finish on 145 instances.
polyvest barely finishes on instances with more than 25 polytopes or 20 dimensions, while ttc seamlessly
handles 40 polytopes of 35 dimensions. The accuracy of the approximate count is also noteworthy, with
an average error of a count by ttc of only 0.059.</p>
      <sec id="sec-3-1">
        <title>3.1. Performance of ttc</title>
        <p>Instances Solved. In Table 1, we compare the number of benchmarks that can be solved by polyvest and
ttc. First, it is evident that the polyvest only solved 145 out of the 1131 benchmarks in the test suite,
indicating its lack of scalability. Conversely, ttc solved 1112 instances, demonstrating a substantial
improvement compared to polyvest.</p>
        <p>Solver
vinci
polyvest
ttc</p>
        <p>Solving Time Comparison. A performance evaluation of polyvest and ttc is depicted in Figure 1, which
is a cactus plot comparing the solving time. The -axis represents the number of instances, while the
-axis shows the time taken. A point (, ) in the plot represents that a solver solved  benchmarks out
of the 1131 benchmarks in the test suite in less than or equal to  seconds. The curves for polyvest and
ttc indicate that for a few instances, polyvest was able to give a quick answer, while in the long run, ttc
could solve many more instances given any fixed timeout.</p>
        <p>3500
3000
2500
)
s
(e2000
m
i
t
un1500
R
1000
500
0
0
ttc
polyvest
vinci
200
400</p>
        <p>600
Benchmarks
800
1000</p>
        <p>In Table 1 we also show the PAR-2 score of the solvers, which is the mean runtime over all instances,
assigning a cost of 2 to each instance timed out at  . ttc shows significantly small PAR-2 score. In
Figure 2, we present a comparative analysis of solving times between ttc and polyvest. Each data point
(, ) represents an instance that was solved in  seconds by ttc and  seconds by polyvest.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Quality of Approximation</title>
        <p>In our experimental evaluation, we found the exact volume of 142 benchmarks from vinci, enabling
us to calculate the error made by ttc on these instances. We quantify the error made by ttc by the
parameter  = |−| , where  represents the count from vinci and  from ttc. This measure is the</p>
        <p>102
ttc</p>
        <p>103
observed error, analogous to the theoretical error guarantees provided by ttc. Analysis of all 142 cases
found the median  to be 0.059, geometric mean 0.038, and maximum 0.39, contrasting sharply with a
theoretical guarantee of 0.8. This signifies ttc substantially outperforms its theoretical bounds.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusion and Future Work</title>
      <p>This paper introduces ttc, a scalable approximate SMT volume computation tool that demonstrates
exceptional performance on practical benchmarks. Our approach harnesses probabilistic techniques to
deliver theoretical guarantees on computation results, and empirical results significantly surpassing
theoretical guarantees. Our work suggests several promising research directions. First, many formulas
contain equality constraints that result in zero volume when computing in  dimensions. A natural
extension would be to develop methods for correctly computing ( − ) -dimensional volume in such
cases. Second, while we prioritized performance over strict theoretical guarantees in our implementation,
experimental results consistently demonstrate error rates well below theoretical bounds. This raises the
intriguing question, whether rigorous guarantees can be established for our current implementation
without sacrificing its performance advantages.</p>
    </sec>
    <sec id="sec-5">
      <title>Declaration on Generative AI</title>
      <p>The author has not employed any Generative AI tools.
[5] K. S. Meel, N. Vinodchandran, S. Chakraborty, Estimating the size of union of sets in streaming
models, in: Proc. of PODS, 2021.
[6] I. Koley, S. Dey, D. Mukhopadhyay, S. Singh, L. Lokesh, S. V. Ghotgalkar, CAD Support for
Security and Robustness Analysis of Safety-critical Automotive Software, ACM Transactions on
Cyber-Physical Systems (2023).
[7] A. Cimatti, S. Mover, S. Tonetta, Smt-based verification of hybrid systems, in: Proc. of AAAI, 2012.
[8] A. Cimatti, A. Griggio, B. J. Schaafsma, R. Sebastiani, The mathsat5 SMT solver, in: Proc. of</p>
      <p>TACAS, 2013.
[9] G. Katz, C. Barrett, D. L. Dill, K. Julian, M. J. Kochenderfer, Reluplex: An eficient smt solver for
verifying deep neural networks, in: Proc. of CAV, 2017.
[10] G. Girol, B. Farinier, S. Bardin, Not all bugs are created equal, but robust reachability can tell the
diference, in: Proc. of CAV, 2021.
[11] S. Teuber, A. Weigl, Quantifying software reliability via model-counting, in: Proc. of QEST, 2021.
[12] T. Baluta, S. Shen, S. Shinde, K. S. Meel, P. Saxena, Quantitative verification of neural networks
and its security applications, in: Proc. of CCS, 2019.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Shaw</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Sarkar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. S.</given-names>
            <surname>Meel</surname>
          </string-name>
          ,
          <article-title>Eficient volume computation for smt formulas</article-title>
          ,
          <source>in: Proceedings of Knowledge Representation and Reasoning (KR)</source>
          ,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Shaw</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. S.</given-names>
            <surname>Meel</surname>
          </string-name>
          ,
          <article-title>Model counting in the wild</article-title>
          ,
          <source>in: Proc. of KR</source>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Dyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Frieze</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kannan</surname>
          </string-name>
          ,
          <article-title>A random polynomial-time algorithm for approximating the volume of convex bodies</article-title>
          ,
          <source>Journal of the ACM (JACM) 38</source>
          (
          <year>1991</year>
          )
          <fpage>1</fpage>
          -
          <lpage>17</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>B.</given-names>
            <surname>Cousins</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Vempala</surname>
          </string-name>
          ,
          <article-title>A practical volume algorithm</article-title>
          ,
          <source>Mathematical Programming Computation</source>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>