=Paper= {{Paper |id=Vol-3725/short9 |storemode=property |title=A Bit-vector to Integer Translation with bv2nat and nat2bv |pdfUrl=https://ceur-ws.org/Vol-3725/short9.pdf |volume=Vol-3725 |authors=Max Barth,Matthias Heizmann |dblpUrl=https://dblp.org/rec/conf/smt/BarthH24 }} ==A Bit-vector to Integer Translation with bv2nat and nat2bv== https://ceur-ws.org/Vol-3725/short9.pdf
                         A Bit-vector to Integer Translation with
                         bv2nat and nat2bv⋆
                         Max Barth1,* , Matthias Heizmann2
                         1
                             LMU Munich, Munich, Germany
                         2
                             University of Stuttgart, Stuttgart, Germany


                                         Abstract
                                         In this paper we present a translation from bit-vector formulas to integer formulas. The translation uses the
                                         function symbols bv2nat and nat2bv𝑘 which are both utilized in the theory of fixed-width bit-vectors of
                                         the SMT-LIB [1] language to define the semantics of bit-vector operations. Our translation replaces bit-vector
                                         operations with their semantic definition. This facilitates a more modular application as bit-vector operations
                                         and their semantic definition have the same sort. As a postprocessing step, our translation replaces the composition
                                         bv2nat ∘ nat2bv𝑘 with a modulo operation, and removes redundant modulo operations from the translation
                                         result. The evaluation of our translation shows that we are able to solve 9% more tasks, 10% faster and with 23%
                                         less memory usage compared to a closely related, up-to-date translation approach. Additionally, our translation
                                         supports the translation of quantified formulas and arrays over bit-vectors.

                                         Keywords
                                         Int-blasting, Bit-vectors, Translation from bit-vectors to integers, bv2nat and nat2bv, Translation of quantified
                                         formulas and arrays




                         1. Introduction
                         In many programming languages, integer data types represent only a fixed number of values. A sequence
                         of bits is utilized to represent an integer via two’s complement or a binary encoding. We call such
                         a sequence of bits a bit-vector. The SMT-LIB [1] theory of “FixedSizeBitVectors”1 is well-suited for
                         modeling such programming languages since it offers many function symbols that capture precisely
                         the semantics of operations that occur in programming languages.
                            However, the expressiveness of the bit-vector theory comes at a certain price. Due to their complex
                         semantics, formulas of this theory are rather intractable and there are only few algorithms that handle
                         these formulas directly. Typically, algorithms that work on bit-vector formulas first do a translation,
                         either to propositional logic or to integer arithmetic. The translation to propositional logic is called
                         bit-blasting [2, 3]. Bit-blasting translates each bit of a bit-vector into a propositional logical variable
                         and translates each bit-vector operation into a propositional logical formula that captures exactly the
                         semantics of that operation. The strength of bit-blasting is that we can utilize powerful SAT solvers
                         for deciding satisfiability of the resulting formulas. The alternative to bit-blasting is the translation
                         to integer arithmetic. A recent publication coined the term int-blasting [4] for this translation. Here,
                         bit-vector variables are translated to integer variables and a comprehensive application of modulo
                         operations makes sure that we can establish a connection between models for the bit-vector formulas
                         and models for the integer formulas. In order to model bit-precise bit-vector operations, int-blasting
                         can access individual bits via a combination of integer division (div) and modulo (mod) operations. E.g.,
                         if we translate a bit-vector variable x into an integer variable x’ such that x is the binary encoding of x’,
                         we can access the third least-significant bit as follows: We divide x’ by 4 and take the result modulo two.
                         The result is one iff this bit was set to true. In order to improve the performance, int-blasting-based
                         translations often work with approximations and increase their precision later if required.
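The bit access described above can be sketched in a few lines of Python. The helper name `bit` is an assumption introduced only for illustration. For the non-negative values used here, Python's `//` and `%` agree with the SMT-LIB operations div and mod.

```python
def bit(x: int, i: int) -> int:
    """Return bit i (0 = least significant) of the non-negative integer x."""
    return (x // 2**i) % 2


x = 0b1101              # x' = 13, i.e., the bit-vector 1101
third_lsb = bit(x, 2)   # divide x' by 4, then take the result modulo two
```

The value of `third_lsb` is one exactly when the third least-significant bit of the bit-vector is set.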

                         SMT’24: 22nd International Workshop on Satisfiability Modulo Theories, July 22–23, 2024, Montreal, Canada
                         ⋆
                           This project was supported by the Deutsche Forschungsgemeinschaft (DFG) — 378803395 (ConVeY).
                         *
                           Corresponding author.
                          0009-0002-7716-3898 (M. Barth); 0000-0003-4252-3558 (M. Heizmann)
                                      © 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
                         1
                             https://smt-lib.org/theories-FixedSizeBitVectors.shtml

   Even though the state of the art for deciding the satisfiability of bit-vector formulas is bit-blasting [2],
there exist applications that need a translation from bit-vector formulas to formulas over integers. For
example, in software model checking many techniques only work on integers: loop acceleration [5, 6],
invariant synthesis [7, 8] and synthesis of ranking functions [9, 10, 11]. Other techniques, especially
Craig interpolation, perform better on integers.
   In this paper we present a translation from bit-vectors to integers that is closely related to the
translation of [4]. In order to explain our translation and conceptual differences to [4] we use the
functions nat2bv𝑘 and bv2nat (see Section 3) which implement the binary encoding of natural
numbers and its inverse function, respectively. The translation of [4] ensures the following relation
between models of bit-vector formulas and the models of the resulting integer formulas: If the bit-vector
model maps a term 𝑡 to a bit-vector value 𝑣 then the integer model maps the translation result of 𝑡 to
bv2nat(𝑣). In order to ensure this relation, the translation of [4] adds constraints that require that
integer variables are in a given range and modulo operations such that every integer term is in the
same range as its corresponding bit-vector term.
   Our translation requires only a weaker relation between models of bit-vector formulas and the
models of the resulting integer formulas. If the integer model maps the translation result of the term 𝑡 to
an integer 𝑣′, then the bit-vector model maps 𝑡 to nat2bv𝑘(𝑣′), where 𝑘 is the bit-length of 𝑡. I.e., we do
not require that nat2bv𝑘(𝑣′) is the binary encoding of 𝑣′; we only require that nat2bv𝑘(𝑣′) is the binary
encoding of 𝑣′ mod 2^𝑘. This conceptual difference to [4] allows us to omit constraints on the
translated variables and to omit several modulo operations that the translation of [4]
introduces.
   The translation of [4] introduces modulo operations after each arithmetic operation. Our translation
omits modulo operations after arithmetic operations but introduces modulo operations before each
operation where we need that integer values are in a certain range. E.g., we translate bvult to the
less-than relation < but apply a modulo operation to both operands. We call the translation of [4] eager
int-blasting and our translation lazy int-blasting.
Example 1. For the bit-vector formula (bvult 𝑦 (bvadd 0101 (bvmul 𝑦 0011))), the result of the
eager int-blasting is (< 𝑦′ (mod (+ 5 (mod (· 𝑦′ 3) 2^4)) 2^4)) ∧ (≥ 𝑦′ 0) ∧ (< 𝑦′ 2^4). The result of our
lazy int-blasting is (< (mod 𝑦′ 2^4) (mod (+ 5 (· 𝑦′ 3)) 2^4)).
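Because the bit-vectors in Example 1 have width 4, both translation results can be checked exhaustively. The following Python sketch is only an illustration; the function names `bv_formula`, `eager` and `lazy` are assumptions and not part of the paper. Python's `%` returns a non-negative result for a positive modulus, which matches the SMT-LIB mod operation.

```python
K = 4
M = 2**K  # 2^4 = 16


def bv_formula(y: int) -> bool:
    """Bit-vector semantics of (bvult y (bvadd 0101 (bvmul y 0011))), y in [0, 16)."""
    product = (y * 3) % M        # bvmul wraps around at 2^4
    total = (5 + product) % M    # bvadd wraps around at 2^4
    return y < total             # bvult is an unsigned comparison


def eager(y_int: int) -> bool:
    """Eager result, including the range constraints on y'."""
    return y_int < (5 + (y_int * 3) % M) % M and y_int >= 0 and y_int < M


def lazy(y_int: int) -> bool:
    """Lazy result: modulo only on the operands of the relation."""
    return (y_int % M) < (5 + y_int * 3) % M


# Both translations agree with the bit-vector formula on all values in range.
agree_in_range = all(bv_formula(y) == eager(y) == lazy(y) for y in range(M))

# The lazy result treats any integer y' like the bit-vector nat2bv_4(y').
lazy_on_any_int = all(lazy(v) == bv_formula(v % M) for v in range(-64, 64))
```

The second check reflects the weaker model relation of the lazy translation: an integer outside the range [0, 2^4) is still a valid model value, since it is interpreted modulo 2^4.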
   Technically, our translation proceeds in two steps. In the first step, we inductively translate the
formula and distinguish two cases. If the SMT-LIB semantic definition of the operation that we have
to translate utilizes the functions bv2nat and nat2bv𝑘 , we replace the operation with its semantic
definition. For most of the other operations our translation is similar to the eager int-blasting. The
result of our first translation step contains the functions bv2nat and nat2bv𝑘 only as a composition
bv2nat ∘ nat2bv𝑘 . Our second translation step replaces this composition either by a modulo operation
or by the identity function.
   An additional contribution of this paper is that we present how we translate quantified formulas and
arrays over bit-vectors into integers.
   We have implemented our translation in two ways: as a wrapper script [12] for SMT solvers and
directly in the SMT solver SMTInterpol [13]. Through our evaluation of both implementations, we
demonstrated that our translation does not produce incorrect results on the benchmark set. Moreover,
we conducted a comparison between two settings of our translation implemented in SMTInterpol:
one setting adds modulo operations lazily, while the other adds them eagerly. Our evaluation results
revealed that SMTInterpol is capable of solving 9% more tasks, is 10% faster, and requires 23% less
memory when modulo operations are added lazily compared to eagerly.


2. Related Work
There have been numerous approaches to translating from bit-vectors to integers. Some methods
perform the translation during the verification process e.g. on the source level [14] or during invariant
synthesis [15]. More closely related to our work are translations for SMT formulas [16, 17, 18, 19, 4].
However, our translation approach differs from the related work. Firstly, we incorporate modular
arithmetic lazily to eliminate redundant modulo operations in the translation result. There exist
simplification techniques to eliminate such redundant modulo operations. For example when the SMT
solver Z3 is asked simplify(mod (· (mod (+ 𝑥 𝑦) 256) 𝑧) 256), it returns (mod (· 𝑧 (+ 𝑥 𝑦)) 256).
However, since we do not add the redundant modulo operations in the first place, there is no need for
such a simplification on our translation results. Secondly, no other approach utilizes function symbols
with behavior similar to the functions bv2nat and nat2bv𝑘 . We give a brief overview of the related
approaches.
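The simplification quoted above rests on the fact that an inner modulo operation is redundant under an outer modulo with the same modulus. The following Python sketch checks this identity on random samples. It does not call Z3; the function names and the sampling range are assumptions chosen for illustration.

```python
import random


def with_inner_mod(x: int, y: int, z: int) -> int:
    """(mod (* (mod (+ x y) 256) z) 256)"""
    return (((x + y) % 256) * z) % 256


def without_inner_mod(x: int, y: int, z: int) -> int:
    """(mod (* z (+ x y)) 256)"""
    return (z * (x + y)) % 256


random.seed(0)  # fixed seed so the check is reproducible
samples = [
    tuple(random.randint(-10**6, 10**6) for _ in range(3))
    for _ in range(10_000)
]
inner_mod_redundant = all(
    with_inner_mod(*s) == without_inner_mod(*s) for s in samples
)
```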
   A similar concept to our lazy translation, but in a different research field, can be found in [20], where
the authors apply modulo operations lazily during their translation from BTOR to C. Their evaluation
shows that their lazy translation is faster in general, but not strictly better than their eager translation.
   Griggio et al. introduce a layered SMT solver in [16], where, on the higher layers, the solver performs
a translation from bit-vector formulas to integer formulas. This translation covers a fragment of the
bit-vector functions, including arithmetic functions, extraction, concatenation, left shift, and relations.
Rather than incorporating modulo operations to bound the integers, Griggio’s approach utilizes an
auxiliary variable, such as the variable 𝑣 in the expression (𝑡1 + 𝑡2 − 2^𝑛 · 𝑣) ∧ (0 ≤ 𝑣) ∧ (𝑣 ≤ 1). Each
integer term derived during the translation is within the bounds of its corresponding bit-vector term,
effectively achieving an eager translation.
   Backeman, Rummer, and Zeljic [18] introduce a new calculus for non-linear integer arithmetic, which,
in certain cases, can eliminate quantifiers and extract Craig interpolants. Subsequently, they define a
corresponding calculus for arithmetic bit-vector constraints. Both calculi allow for a flexible switch
between bit-vectors and integers. Initially, integers are not bound by modular arithmetic; instead, the
authors introduce an uninterpreted function symbol that represents the modulo operation. They note
that the remainder operation tends to be a bottleneck for interpolation. If necessary, a definition for
the uninterpreted function symbol can be added to precisely cover the remainder. In contrast to our
approach with uninterpreted function symbols, their uninterpreted function is directly associated with
the modulo operation and does not affect the sort of a term. Furthermore, their approach supports a
translation of quantified formulas, but not of arrays over bit-vectors.
   Recently, the first precise and complete translation for bit-vector formulas with bit-vectors of fixed
size was published in [4]. This work is closely related to their translation for bit-vectors with parametric
bit-width, proposed in their previous work [19]. The precise translation presented in [4] has been
implemented in the cvc5 prover [21]. It employs a translation from bit-vector formulas to non-linear
integer arithmetic formulas with uninterpreted functions and universal quantification. During the
translation, modulo operations are added eagerly and without the use of uninterpreted function symbols.
Our translation approach combines elements from the translation in [4] with the semantics of the
theory of "FixedSizeBitVectors" defined in the SMT-LIB [1].


3. Preliminaries and Notations
The SMT-LIB [1] defines a many-sorted first-order logic with equality. In this paper we use the sorts,
signatures Σ and theories defined in the SMT-LIB. In particular we use the sorts, signatures and theories
of Booleans, fixed-size bit-vectors, integers, and arrays.

Bit-Vectors
The SMT-LIB defines a signature ΣBv and a theory called “FixedSizeBitVectors" for bit-vectors of fixed
size. For every possible bit-vector size 𝑘, that is, every positive integer, ΣBv contains
a unique sort 𝜎𝑘 . We call a term of sort 𝜎𝑘 ∈ ΣBv a bit-vector of size 𝑘 or a bit-vector of width 𝑘. In
the following let 𝑥 be a bit-vector variable, 𝑐 be a bit-vector constant, and 𝑡, 𝑡1 , and 𝑡2 be bit-vector
terms. Furthermore, let the width of bit-vectors 𝑥, 𝑐 and 𝑡 be 𝑘, the width of 𝑡1 be 𝑘1 and the width of
𝑡2 be 𝑘2 . The signature ΣBv as defined in the SMT-LIB contains a set of bit-vector function symbols.
We denote the extract function from 𝑖 to 𝑗 as extract𝑖𝑗 (𝑡), where 𝑖 and 𝑗 are natural numbers with
𝑖, 𝑗 ≥ 0 ∧ 𝑖 ≥ 𝑗. For the other bit-vector functions 𝑓 we use the notation 𝑓 (𝑡1 , 𝑡2 ) instead of the SMT-LIB
notation (𝑓 𝑡1 𝑡2 ).

Integers
The SMT-LIB defines a signature ΣInt and a ΣInt -theory for mathematical integers. The sort 𝜎 ∈ ΣInt
is defined as the set of all integers. The integer signature ΣInt consists of variables, constants and the
usual functions and relations. Let constants 𝑐′ and terms 𝑡′ , 𝑡′1 and 𝑡′2 all have sort Int. As notation for
integer functions we write (𝑡′1 + 𝑡′2 ) instead of the SMT-LIB notation (+ 𝑡′1 𝑡′2 ).
   In [4] the authors introduce a binary function symbol &𝑘 (−, −) for every positive integer 𝑘. The
functions &𝑘 (−, −) are introduced to represent bit-wise and. To this end, they extend the signature
ΣInt and define two theories for this extended signature. We do the same in this paper and refer
to [4] for details.

Functions bv2nat and nat2bv𝑘
In the theory of “FixedSizeBitVectors” the functions bv2nat and nat2bv𝑘 are defined. Given an
arbitrary binary 𝑏 = (𝑏_{𝑘−1}, ..., 𝑏_𝑖, ..., 𝑏_0) and its corresponding natural number 𝑛, the function bv2nat
is defined as follows: bv2nat(𝑏) := Σ_{𝑖=0}^{𝑘−1} 𝑏_𝑖 · 2^𝑖. Furthermore, the function nat2bv𝑘 is defined as:
nat2bv𝑘 (𝑛) := (𝑏_{𝑘−1}, ..., 𝑏_𝑖, ..., 𝑏_0), where 𝑏_𝑖 = (𝑛 div 2^𝑖) mod 2.
   Note, we do not extend our signatures and theories with bv2nat and nat2bv𝑘 . Instead, we treat
them as auxiliary functions and ensure they are eliminated in the translation result. For the sake of
readability, we denote bv2nat(𝑡) and nat2bv𝑘 (𝑡′ ) as 𝑡bv2nat and 𝑡′nat2bv𝑘 , respectively.
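The two definitions can be written down directly in Python. This is a minimal sketch that represents a bit-vector as a tuple of bits with the most significant bit first; the representation and the argument order of `nat2bv` are assumptions made for illustration.

```python
from typing import Tuple


def bv2nat(b: Tuple[int, ...]) -> int:
    """Sum of b_i * 2^i for b = (b_{k-1}, ..., b_0)."""
    k = len(b)
    return sum(bit * 2 ** (k - 1 - pos) for pos, bit in enumerate(b))


def nat2bv(k: int, n: int) -> Tuple[int, ...]:
    """(b_{k-1}, ..., b_0) with b_i = (n div 2^i) mod 2."""
    return tuple((n // 2**i) % 2 for i in reversed(range(k)))


# Composing the two functions reduces n modulo 2^k.
roundtrip_is_mod = all(bv2nat(nat2bv(4, n)) == n % 2**4 for n in range(100))
```

The last line shows why the composition bv2nat ∘ nat2bv𝑘 can be replaced by a modulo operation: for every natural number 𝑛 it yields 𝑛 mod 2^𝑘.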


4. Translation with bv2nat and nat2bv𝑘
Our bit-vector to integer translation maps from the set of ΣBv -formulas to the set of ΣInt -formulas
(extended with &𝑘 (−, −)). We say a translation "translates" a term or formula if it associates that term
or formula with an element of its co-domain. Before we can define the translation, we need to define
some auxiliary functions. First we define a variable mapping 𝜒. Similar to the variable mapping defined
in [4], 𝜒 maps a variable of sort bit-vector to a fresh variable of sort integer. We extend the definition of
𝜒 for arrays over bit-vectors and quantified variables.

Definition 4.1 (Variable Mapping 𝜒). Given a bit-vector formula 𝜑, we define a one-to-one mapping
𝜒 as follows. For every variable and quantified variable 𝑥 that occurs in 𝜑, 𝜒 maps 𝑥 to a fresh
variable 𝑥′ , such that if 𝑥 is of sort Bv, then 𝑥′ is of sort Int. If 𝑥 is of sort Array with arguments of
sort 𝑠 ∈ {Bv, Array}, then 𝑥′ is of sort Array with arguments of sort 𝑠′ ∈ {Int, Array} correspondingly.
Finally, if 𝑥 is of sort Bool, then 𝑥′ is of sort Bool. We write 𝜒(𝑥) = 𝑥′ to denote that 𝜒 maps 𝑥 to 𝑥′ .

   In Table 1 we use the auxiliary function 𝑢𝑡𝑠𝑘 (−) from [4]. For an integer term 𝑡′ , 𝑢𝑡𝑠𝑘 (𝑡′ ) is
an abbreviation for the term 2 · (𝑡′ mod 2^{𝑘−1}) − 𝑡′ , which transforms the unsigned interpretation of a
bit-vector into its signed interpretation. Initially, our translation interprets every bit-vector as unsigned.
In the case of a signed relation, we enclose the arguments of the relation with the function 𝑢𝑡𝑠𝑘 to ensure
that the semantics of the signed relation is preserved.
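The effect of 𝑢𝑡𝑠𝑘 can be checked on small widths. The sketch below follows the definition given in Table 1, 𝑢𝑡𝑠𝑘 (𝑡′ ) := 2 · (𝑡′ mod 2^{𝑘−1}) − 𝑡′ ; the Python names `uts` and `signed_less_than` are assumptions chosen for illustration.

```python
def uts(k: int, t: int) -> int:
    """uts_k(t') = 2 * (t' mod 2^(k-1)) - t', for t' in [0, 2^k)."""
    return 2 * (t % 2 ** (k - 1)) - t


def signed_less_than(k: int, a: int, b: int) -> bool:
    """Signed comparison on the unsigned values a, b in [0, 2^k)."""
    return uts(k, a) < uts(k, b)


# For width 4, the unsigned values 0..15 map to 0..7 followed by -8..-1.
signed_values = [uts(4, v) for v in range(16)]
```

For example, the bit-vector 1111 has the unsigned value 15 and `uts(4, 15)` evaluates to -1, its two's complement value.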
   Finally, we define the translation function 𝑇 that maps ΣBv -formulas 𝜑 of the theory of bit-vectors
to ΣInt -formulas 𝜓 of the theory of integers (extended with &𝑘 (−, −)). To this end, we define
a conversion function 𝐶 in Table 1 (column Lazy) and a replacement function 𝑅 in Table 3. The
translation function 𝑇 is defined as:

                                           𝑇 := 𝜑 ↦ 𝑅(𝐶(𝜑))

Our translation consists of two steps. In step one, 𝐶 replaces bit-vector formulas and terms by their
semantic definition. The conversion function 𝐶 matches a term or formula 𝑒 against the terms and formulas
in the first column. The match is then translated to the term or formula in the middle column, named
Lazy. For a direct comparison, we display the translation steps of the eager translation of [4] in the
third column, named Eager, in gray. Note that in [4] the authors translate a bit-vector variable 𝑣 by adding
constraints of the form (0 ≤ 𝜒(𝑣) < 2^𝑘) to the translation result and do not surround the integer variable
𝜒(𝑣) with a modulo operation as we do in the third column of Table 2. For readability, we split the definition
of function 𝐶 into three functions: 𝐶, 𝐶𝑡 and 𝐶𝑡′ . Functions 𝐶𝑡 and 𝐶𝑡′ are both defined in Table 2.
The conversion function uses bv2nat and nat2bv𝑘 to replace bit-vector formulas and terms by their
semantic definition, but our signature ΣInt and integer theory do not contain bv2nat and nat2bv𝑘 .
In the second step, we use the replacement function 𝑅 to eliminate bv2nat and nat2bv𝑘 : we either
remove the composition bv2nat ∘ nat2bv𝑘 or replace it with a modulo operation. In the
tables defining 𝐶, 𝐶𝑡 , and 𝐶𝑡′ , we have indicated certain bv2nat function calls in blue. If function
𝑅(𝑒) matches 𝑒 to (𝑡′nat2bv𝑘 )bv2nat where bv2nat is marked blue, then we replace 𝑒 with 𝑡′ mod 2^𝑘.
Otherwise, if bv2nat is not marked blue, we replace (𝑡′nat2bv𝑘 )bv2nat with 𝑡′ .
   So far we translate bvand to the uninterpreted function symbol &𝑘 (−, −) in 𝐶 and do not treat it
any further. For a more sophisticated translation of bvand we refer to the literature [4].
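The replacement step can be sketched on a small term representation. In the following Python sketch, terms are nested tuples, `("nat2bv", k, t)` stands for nat2bv𝑘 (𝑡′ ), and `("bv2nat", marked, t)` stands for a bv2nat call whose Boolean flag plays the role of the blue marking. This representation, and the choice of which calls are marked in the example term, are assumptions made so that the output reproduces the lazy result of Example 1; the sketch is not the implementation used in the paper.

```python
def R(e):
    """Eliminate bv2nat(nat2bv_k(t')) pairs from a term built of nested tuples."""
    if not isinstance(e, tuple):
        return e  # variable name or integer constant
    if e[0] == "bv2nat" and isinstance(e[2], tuple) and e[2][0] == "nat2bv":
        _, marked, (_, k, inner) = e
        inner = R(inner)
        # A marked call becomes a modulo operation, an unmarked call disappears.
        return ("mod", inner, 2**k) if marked else inner
    return (e[0],) + tuple(R(arg) for arg in e[1:])


# Output of the first step for (bvult y (bvadd 0101 (bvmul y 0011))).
y = ("nat2bv", 4, "y'")
three = ("nat2bv", 4, 3)
five = ("nat2bv", 4, 5)
mul = ("nat2bv", 4, ("*", ("bv2nat", False, y), ("bv2nat", False, three)))
add = ("nat2bv", 4, ("+", ("bv2nat", False, five), ("bv2nat", False, mul)))
formula = ("<", ("bv2nat", True, y), ("bv2nat", True, add))

result = R(formula)
```

Here `result` is the tuple form of (< (mod 𝑦′ 2^4) (mod (+ 5 (· 𝑦′ 3)) 2^4)): only the two operands of the relation keep a modulo operation.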

      𝐶(𝑒), match 𝑒:    Lazy                                               Eager
      𝑡1 = 𝑡2            𝐶𝑡(𝑡1)bv2nat = 𝐶𝑡(𝑡2)bv2nat                         𝐶𝑡(𝑡1) = 𝐶𝑡(𝑡2)
      bvult(𝑡1, 𝑡2)      𝐶𝑡(𝑡1)bv2nat < 𝐶𝑡(𝑡2)bv2nat                         𝐶𝑡(𝑡1) < 𝐶𝑡(𝑡2)
      bvule(𝑡1, 𝑡2)      𝐶𝑡(𝑡1)bv2nat ≤ 𝐶𝑡(𝑡2)bv2nat                         𝐶𝑡(𝑡1) ≤ 𝐶𝑡(𝑡2)
      bvslt(𝑡1, 𝑡2)      𝑢𝑡𝑠𝑘(𝐶𝑡(𝑡1)bv2nat) < 𝑢𝑡𝑠𝑘(𝐶𝑡(𝑡2)bv2nat)             𝑢𝑡𝑠𝑘(𝐶𝑡(𝑡1)) < 𝑢𝑡𝑠𝑘(𝐶𝑡(𝑡2))
      bvsle(𝑡1, 𝑡2)      𝑢𝑡𝑠𝑘(𝐶𝑡(𝑡1)bv2nat) ≤ 𝑢𝑡𝑠𝑘(𝐶𝑡(𝑡2)bv2nat)             𝑢𝑡𝑠𝑘(𝐶𝑡(𝑡1)) ≤ 𝑢𝑡𝑠𝑘(𝐶𝑡(𝑡2))
      □(𝑡1, ..., 𝑡𝑖)     □(𝐶(𝑡1), ..., 𝐶(𝑡𝑖))    where □ ∈ {∧, ∨, ¬, ⇒, ⇔}

      𝑢𝑡𝑠𝑘(𝑡′) := 2 · (𝑡′ mod 2^{𝑘−1}) − 𝑡′

Table 1: Definition of the Conversion Function 𝐶



4.1. Translation of Arrays and Quantified Formulas
For the translation of quantified formulas and arrays we extend our conversion functions 𝐶 and 𝐶𝑡 with
the conversions outlined in Table 4. In Table 4, let 𝑎 and 𝑏 be arrays over bit-vectors, where
the bit-vectors that represent the indices of 𝑎 and 𝑏 have width 𝑘. Additionally, let 𝑖′ be a quantified
integer variable. The translation of quantified formulas utilizes the variable mapping 𝜒(𝑣) = 𝑣 ′ , where
𝑣 is a quantified bit-vector variable of width 𝑘 and 𝑣 ′ is a quantified integer variable. Additionally,
we ensure that a translated quantified formula is unsatisfiable for values of 𝑣 ′ that are not within the
bounds of 𝑣. To this end, we add the bound (0 ≤ 𝜒(𝑣) < 2^𝑘) as a constraint within the scope of the
quantifier.
   Arrays over bit-vectors 𝑎 are translated to fresh arrays over integers with the help of the one-to-one
mapping 𝜒(𝑎). Since arrays over integers have infinitely many indices and arrays over bit-vectors do not,
we have to add some restrictions. First, we ensure that we only read from and write to indices that are within
the bounds of the corresponding bit-vector array. Second, we have to ensure that if two translated
arrays are equal on every index in the range of the bit-vector array, then they are equal on every index.
To this end, we add a constraint to the translation result that evaluates to false if this condition is violated.
This is achieved by the constraint function Lem in Table 4. Furthermore, we change the translation
function to add constraints for every array equality: 𝑇𝐴𝑟𝑟𝑎𝑦 := 𝜑 ↦ 𝑅(𝐶(𝜑)) ∧ Lem(𝜑).
       𝐶𝑡(𝑒), match 𝑒:     Lazy                                          Eager
       concat(𝑡1, 𝑡2)       𝐶𝑡′(𝑒)nat2bv_{𝑘1+𝑘2}                          𝐶𝑡′(𝑒)
       extract𝑖𝑗(𝑡1)        𝐶𝑡′(𝑒)nat2bv_{𝑖−𝑗+1}                          𝐶𝑡′(𝑒)
       else: 𝑒              𝐶𝑡′(𝑒)nat2bv𝑘                                 𝐶𝑡′(𝑒)

       𝐶𝑡′(𝑒), match 𝑒:    Lazy                                          Eager
       𝑥                    𝜒(𝑥)                                          𝜒(𝑥) mod 2^𝑘
       𝑐                    Σ_{𝑖=0}^{𝑘−1} 𝑐_𝑖 · 2^𝑖, where 𝑐_𝑖 is the 𝑖-th bit of 𝑐 (both columns)

       bvneg(𝑡1)            2^{𝑘1} − 𝐶𝑡(𝑡1)bv2nat                         2^{𝑘1} − 𝐶𝑡(𝑡1)
       bvmul(𝑡1, 𝑡2)        𝐶𝑡(𝑡1)bv2nat · 𝐶𝑡(𝑡2)bv2nat                   (𝐶𝑡(𝑡1) · 𝐶𝑡(𝑡2)) mod 2^𝑘
       bvadd(𝑡1, 𝑡2)        𝐶𝑡(𝑡1)bv2nat + 𝐶𝑡(𝑡2)bv2nat                   (𝐶𝑡(𝑡1) + 𝐶𝑡(𝑡2)) mod 2^𝑘
       bvsub(𝑡1, 𝑡2)        𝐶𝑡(𝑡1)bv2nat − 𝐶𝑡(𝑡2)bv2nat                   (𝐶𝑡(𝑡1) − 𝐶𝑡(𝑡2)) mod 2^𝑘

       bvudiv(𝑡1, 𝑡2)       ite(𝐶𝑡(𝑡2)bv2nat = 0, 2^𝑘 − 1,                ite(𝐶𝑡(𝑡2) = 0, 2^𝑘 − 1,
                            𝐶𝑡(𝑡1)bv2nat div 𝐶𝑡(𝑡2)bv2nat)                𝐶𝑡(𝑡1) div 𝐶𝑡(𝑡2))
       bvurem(𝑡1, 𝑡2)       ite(𝐶𝑡(𝑡2)bv2nat = 0, 𝑡1,                     ite(𝐶𝑡(𝑡2) = 0, 𝐶𝑡(𝑡1),
                            𝐶𝑡(𝑡1)bv2nat mod 𝐶𝑡(𝑡2)bv2nat)                𝐶𝑡(𝑡1) mod 𝐶𝑡(𝑡2))

       bvshl(𝑡1, 𝑡2)        ite(𝐶𝑡(𝑡2)bv2nat = 1,                         ite(𝐶𝑡(𝑡2) = 1,
                            2 · 𝐶𝑡(𝑡1)bv2nat,                             2 · 𝐶𝑡(𝑡1) mod 2^𝑘,
                            ...                                           ...
                            ite(𝐶𝑡(𝑡2)bv2nat = 𝑘 − 1,                     ite(𝐶𝑡(𝑡2) = 𝑘 − 1,
                            2^{𝑘−1} · 𝐶𝑡(𝑡1)bv2nat,                       2^{𝑘−1} · 𝐶𝑡(𝑡1) mod 2^𝑘,
                            0)...)                                        0)...)

       bvlshr(𝑡1, 𝑡2)       ite(𝐶𝑡(𝑡2)bv2nat = 1,                         ite(𝐶𝑡(𝑡2) = 1,
                            𝐶𝑡(𝑡1)bv2nat div 2,                           𝐶𝑡(𝑡1) div 2,
                            ...                                           ...
                            ite(𝐶𝑡(𝑡2)bv2nat = 𝑘 − 1,                     ite((𝐶𝑡(𝑡2) = 𝑘 − 1),
                            (𝐶𝑡(𝑡1)bv2nat div 2^{𝑘−1}),                   𝐶𝑡(𝑡1) div 2^{𝑘−1},
                            0)...)                                        0)...)

       concat(𝑡1, 𝑡2)       𝐶𝑡(𝑡1)bv2nat · 2^{𝑘2} + 𝐶𝑡(𝑡2)bv2nat          𝐶𝑡(𝑡1) · 2^{𝑘2} + 𝐶𝑡(𝑡2)
       extract𝑖𝑗(𝑡1)        𝐶𝑡(𝑡1)bv2nat div 2^𝑗                          𝐶𝑡(𝑡1) div 2^𝑗 mod 2^{𝑖−𝑗+1}
       bvnot(𝑡1)            2^{𝑘1} − 𝐶𝑡(𝑡1)bv2nat + 1                     2^{𝑘1} − 𝐶𝑡(𝑡1) + 1
       bvand(𝑡1, 𝑡2)        &𝑘(𝐶𝑡(𝑡1)bv2nat, 𝐶𝑡(𝑡2)bv2nat)                &𝑘(𝐶𝑡(𝑡1), 𝐶𝑡(𝑡2))

Table 2: Definition of the Conversion Function 𝐶𝑡



        𝑅(𝑒), match 𝑒:
        (𝑡′nat2bv𝑘)bv2nat           →   𝑅(𝑡′) mod 2^𝑘               where bv2nat is marked blue
        (𝑡′nat2bv𝑘)bv2nat           →   𝑅(𝑡′)                       otherwise
        else: 𝑒(𝑡′1, ..., 𝑡′𝑛)      →   𝑒(𝑅(𝑡′1), ..., 𝑅(𝑡′𝑛))

Table 3: Definition of the Replacement Function 𝑅

        𝐶(𝑒), match 𝑒:       Lazy
        ∃ 𝑣.(𝑒)               ∃ 𝐶(𝑣).(𝐶(𝑒) ∧ (0 ≤ 𝜒(𝑣) < 2^𝑘))
        ∀ 𝑣.(𝑒)               ∀ 𝐶(𝑣).((0 ≤ 𝜒(𝑣) < 2^𝑘) ⇒ 𝐶(𝑒))

        𝐶𝑡(𝑒), match 𝑒:
        𝑎                     𝜒(𝑎)
        (select 𝑎 𝑖)          (select 𝐶𝑡(𝑎) 𝐶𝑡(𝑖)bv2nat)
        (store 𝑎 𝑖 𝑣)         (store 𝐶𝑡(𝑎) 𝐶𝑡(𝑖)bv2nat 𝐶𝑡(𝑣))

        Lem(𝑒), match 𝑒:
        𝑎 = 𝑏                 (∀𝑖′.(0 ≤ 𝑖′ < 2^𝑘) ⇒
                              ((select 𝐶𝑡(𝑎) 𝑖′) mod 2^𝑘 =
                              (select 𝐶𝑡(𝑏) 𝑖′) mod 2^𝑘))
                              ⇒ 𝐶𝑡(𝑎) = 𝐶𝑡(𝑏)
        □(𝑡1, ..., 𝑡𝑛)        ⋀_{𝑖=1}^{𝑛} Lem(𝑡𝑖)    where □ ∈ {∧, ∨, ¬, ⇒, ⇔}
        else:                 ⊤

Table 4: Extension for Quantified Formulas and Arrays


5. Implementation
We have two implementations of the lazy translation presented in Section 4. The first implementation
is a wrapper script for SMT solvers, called Ultimate IntBlastingWrapper [12]. It is implemented in
the Ultimate framework2 . The wrapper script does not use the functions bv2nat and nat2bv𝑘 ; instead,
modulo operations are added directly. It supports the translation of bit-wise operations with all features
described in [4], as well as quantified formulas and arrays over bit-vectors.
   The second implementation is in the SMT solver SMTInterpol3 . This implementation is still a work
in progress. So far we use bv2nat and nat2bv𝑘 as uninterpreted functions, but the implementation
does not yet support bit-wise operations, quantified variables, or arrays. We implemented two settings
to translate bit-vectors to integers. The first setting is called Lazy and applies our lazy translation from
Section 4. The second setting is called Eager and applies a translation similar to [4]. Each setting
applies the conversions in its respective column in Table 2. Additionally, we compare our settings
Lazy and Eager with the original implementation (cvc5-int) of [4] in the SMT solver cvc5. We selected
SMTInterpol as the SMT solver for our evaluation because, at the time of writing, it did not support
bit-vectors, and we were already familiar with the tool. Unfortunately, it did not support non-linear
integer arithmetic either. When SMTInterpol encounters a bit-wise operation or non-linear integer
arithmetic, we return an error.


6. Evaluation
We evaluate our two implementations to answer the following research questions:
       • How do the approaches Lazy and Eager compare in performance?
       • How do Lazy and Eager compare to int-blasting in cvc5?

6.1. Evaluation of Ultimate IntBlastingWrapper
We participated with Ultimate IntBlastingWrapper [12] in SMT-COMP 2023. Ultimate
IntBlastingWrapper competed in the Single Query Track on every logic that contains bit-vectors.
In SMT-COMP 2023, Ultimate IntBlastingWrapper returned wrong results on three benchmarks.
Of these three, two are from the category QF_AUFBV and one is from QF_ABV. The wrong results
were caused by a mistake in the translation of equalities between arrays over bit-vectors. This has
been fixed in commit: https://github.com/ultimate-pa/ultimate/commit/928447c7dc8c44e406f0a52121ccf96fcbe4d5b5.

2
    https://ultimate.informatik.uni-freiburg.de and github.com/ultimate-pa/ultimate
3
    https://github.com/ultimate-pa/smtinterpol/tree/Intblasting

[Two scatter plots. Left: CPU time of Eager [s] (x-axis) against CPU time of Lazy [s] (y-axis). Right: Memory used by Eager [MB] (x-axis) against Memory used by Lazy [MB] (y-axis).]

Figure 1: Comparing CPU time and Memory usage of Eager (x-axis) and Lazy (y-axis)

6.2. Evaluation of Int-blasting in SMTInterpol
To evaluate our implementation in SMTInterpol, we ran the settings Lazy and Eager and the
implementation of int-blasting in cvc5 (cvc5-int) from the paper [4]. For cvc5-int we used the cvc5
options --solve-bv-as-int=sum and --nl-ext-tplanes. We ran Lazy, Eager and cvc5-int on a randomly
picked subset of the non-incremental QF_BV benchmarks from the SMT-LIB. From this set we excluded
every benchmark where the expected result is unknown. This left 12302 benchmarks.

Environment
We ran our experiments on a cluster of machines, each with 33 GB of memory and an Intel Xeon E3-1230 v5
CPU with 8 processing units and a frequency of 3.40 GHz, running Ubuntu 22.04 (Linux kernel 5.15.0).
To measure and limit resources we used BenchExec 3.18 [22]. We limited every run to 1 core, 15 min of
CPU time and 15 GB of memory.

How do the approaches Lazy and Eager compare in performance?

                 Lazy     Eager    cvc5-int
 Correct         2961     2698     8409
   SAT           662      425      2135
   UNSAT         2299     2273     6274
 Timeout         2124     2709     3769
 Unsupported     7217     6895     -
 Memory Out      -        -        124
 Total                    12302
Table 5
Overview of the evaluation results

                 Lazy        Eager
 Correct               2679
 ∑ CPU           23000 s     25600 s
 ∑ Memory        252 GB      327 GB
Table 6
Benchmarks where Lazy and Eager are correct

   The evaluation results of Lazy and Eager are displayed in Table 5. SMTInterpol solves more
benchmarks with Lazy than with Eager: Lazy returns 2961 correct results and Eager 2698, a difference
of 263 or about 9% of Lazy's total. Lazy returns a correct result on 282 benchmarks that Eager does
not solve. Among these, Eager times out on 280, and on 2 Eager reports that the formula contains
non-linear integer arithmetic. On the other hand, Eager returns a correct result on 19 benchmarks
where Lazy times out. Furthermore, on 426 benchmarks Lazy returns an error where Eager times out.
All errors, listed as Unsupported in Table 5, are caused by either non-linear integer arithmetic or a
bit-wise operation. To compare the CPU time and memory usage of Lazy and Eager, we analyze the 2679
benchmarks for which both settings return a correct result (see Table 6). On these 2679 benchmarks,
Lazy requires 10% less CPU time (23000 s instead of 25600 s) and 23% less memory (252 GB instead of
327 GB) to decide satisfiability. For a more detailed view, we provide two scatter plots in Figure 1.
Both scatter plots have logarithmic scales; the first shows the CPU time used and the second shows
the memory usage. Every dot below the diagonal line is in favor of Lazy. We observe that in many
cases Lazy requires less CPU time, less memory, or both. Specifically, cases in which Lazy needs
more CPU time or memory than Eager are less frequent and less pronounced than the reverse. However,
it is important to note that Lazy is not strictly superior to Eager.
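The percentages quoted above can be re-derived from the aggregate figures. The following Python sketch is only an arithmetic check on the numbers in Table 5 and Table 6; the variable and function names are illustrative and not part of the evaluation tooling. Note that the 9% figure is computed relative to the number of benchmarks solved by Lazy.

```python
# Aggregate figures copied from Table 5, Table 6, and the surrounding text.
correct = {"Lazy": 2961, "Eager": 2698}          # Table 5, row "Correct"
only_lazy, only_eager = 282, 19                  # solved by one setting only
both_correct = 2679                              # Table 6
cpu_seconds = {"Lazy": 23000, "Eager": 25600}    # Table 6, sum of CPU time
memory_gb = {"Lazy": 252, "Eager": 327}          # Table 6, sum of memory


def percent_less(smaller: float, larger: float) -> int:
    """Rounded percentage by which `smaller` lies below `larger`."""
    return round(100 * (larger - smaller) / larger)


# The size of the overlap is consistent when computed from either side.
assert correct["Lazy"] - only_lazy == both_correct
assert correct["Eager"] - only_eager == both_correct

print(percent_less(correct["Eager"], correct["Lazy"]))          # 9
print(percent_less(cpu_seconds["Lazy"], cpu_seconds["Eager"]))  # 10
print(percent_less(memory_gb["Lazy"], memory_gb["Eager"]))      # 23
```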

How do Lazy and Eager compare to int-blasting in cvc5?
In Table 5, we observe that cvc5-int solves 8409 tasks. Out of these 8409 tasks, Lazy times out on 553,
returns an error on 5166, and correctly solves 2690. On the 2961 tasks where Lazy returns a correct
result, cvc5-int times out 271 times and returns 2690 correct results. Among the 2006 benchmarks
where Lazy, Eager, and cvc5-int all return a correct result, cvc5-int demonstrates significantly lower
CPU time and memory usage. For solving these 2006 tasks, Lazy requires 30380 s, Eager requires
62800 s, and cvc5-int requires 12750 s. Regarding memory usage, Lazy consumes 246.7 GB, Eager
consumes 321.4 GB, and cvc5-int consumes 16.5 GB.
   The comparison between cvc5-int and our implementations shows that cvc5-int solves significantly
more benchmarks. However, cvc5-int is not strictly superior: Lazy solves 271 benchmarks on which
cvc5-int times out.


7. Conclusion
We present a lazy translation from bit-vector formulas to integer formulas that utilizes the functions
bv2nat and nat2bv𝑘. Our translation consists of a conversion function 𝐶 and a replacement function
𝑅. Conversion function 𝐶 replaces bit-vector terms and formulas with their semantic definition,
thus incorporating bv2nat and nat2bv𝑘 into the translation process. This makes our translation
more modular since bit-vector terms and their semantic definition have the same sort. Additionally,
conversion function 𝐶 supports a translation of quantified formulas and arrays over bit-vectors. Within
replacement function 𝑅, we eliminate bv2nat and nat2bv𝑘 from the translation result and introduce
modulo operations instead. This is done in such a way that the number of redundant modulo operations
is reduced. We implemented our lazy translation in SMTInterpol and as a wrapper script for
SMT solvers. Our evaluation of both implementations indicates the correctness of the lazy translation.
Furthermore, it shows that SMTInterpol with our lazy translation is able to solve 9% more tasks,
10% faster and with 23% less memory usage than with a closely related, up-to-date eager translation
approach.
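The two facts that the replacement step relies on can be checked by brute force for small widths. The Python sketch below is an illustration only and is not the implementation of 𝐶 or 𝑅: it models bit-vectors as tuples of bits (most significant bit first), follows the SMT-LIB definitions of bv2nat, nat2bv𝑘 and bvadd for non-negative integers, and uses the hypothetical helper check_width to compare a naive integer translation of a nested addition with one from which the inner modulo has been removed.

```python
from typing import Tuple

BitVec = Tuple[int, ...]  # bits, most significant bit first


def nat2bv(k: int, n: int) -> BitVec:
    """Width-k bit-vector that represents n mod 2**k (n non-negative)."""
    return tuple((n >> i) & 1 for i in reversed(range(k)))


def bv2nat(bits: BitVec) -> int:
    """Natural number encoded by the given bit-vector."""
    value = 0
    for bit in bits:
        value = 2 * value + bit
    return value


def bvadd(s: BitVec, t: BitVec) -> BitVec:
    """Semantic definition of bvadd: nat2bv_k(bv2nat(s) + bv2nat(t))."""
    assert len(s) == len(t)
    return nat2bv(len(s), bv2nat(s) + bv2nat(t))


def check_width(k: int) -> bool:
    """Exhaustively check both identities for all values of width k."""
    m = 2 ** k
    for n in range(4 * m):
        # The composition bv2nat after nat2bv_k is a modulo operation.
        if bv2nat(nat2bv(k, n)) != n % m:
            return False
    for a in range(m):
        for b in range(m):
            for c in range(m):
                bv_result = bv2nat(bvadd(bvadd(nat2bv(k, a), nat2bv(k, b)),
                                         nat2bv(k, c)))
                naive = ((a + b) % m + c) % m   # one modulo per operation
                reduced = (a + b + c) % m       # inner modulo is redundant
                if not (bv_result == naive == reduced):
                    return False
    return True


print(check_width(3))  # True
```

For example, bvadd(nat2bv(4, 9), nat2bv(4, 12)) yields nat2bv(4, 5), since 21 mod 16 = 5.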


References
 [1] C. Barrett, P. Fontaine, C. Tinelli, The SMT-LIB Standard: Version 2.6, Technical Report, Department
     of Computer Science, The University of Iowa, 2017. Available at www.SMT-LIB.org.
 [2] A. Niemetz, M. Preiner, Bitwuzla, in: C. Enea, A. Lal (Eds.), Computer Aided Verification - 35th
     International Conference, CAV 2023, Paris, France, July 17-22, 2023, Proceedings, Part II, volume
     13965 of Lecture Notes in Computer Science, Springer, 2023, pp. 3–17.
     URL: https://doi.org/10.1007/978-3-031-37703-7_1. doi:10.1007/978-3-031-37703-7_1.
 [3] L. Aniva, H. Barbosa, C. Barrett, M. Brain, V. Camillo, G. Kremer, H. Lachnitt, A. Mohamed,
     M. Mohamed, A. Niemetz, et al., cvc5 at the SMT Competition 2023 (2023).
 [4] Y. Zohar, A. Irfan, M. Mann, A. Niemetz, A. Nötzli, M. Preiner, A. Reynolds, C. Barrett, C. Tinelli,
     Bit-precise reasoning via int-blasting, in: B. Finkbeiner, T. Wies (Eds.), Verification, Model
     Checking, and Abstract Interpretation, Springer International Publishing, Cham, 2022, pp. 496–518.
 [5] F. Frohn, A calculus for modular loop acceleration, in: A. Biere, D. Parker (Eds.), Tools and
     Algorithms for the Construction and Analysis of Systems - 26th International Conference, TACAS
     2020, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS
     2020, Dublin, Ireland, April 25-30, 2020, Proceedings, Part I, volume 12078 of Lecture Notes in
     Computer Science, Springer, 2020, pp. 58–76. URL: https://doi.org/10.1007/978-3-030-45190-5_4.
     doi:10.1007/978-3-030-45190-5_4.
 [6] P. Ganty, R. Iosif, F. Konecný, Underapproximation of procedure summaries for integer programs,
     Int. J. Softw. Tools Technol. Transf. 19 (2017) 565–584.
     URL: https://doi.org/10.1007/s10009-016-0420-7. doi:10.1007/s10009-016-0420-7.
 [7] S. Gulwani, S. Srivastava, R. Venkatesan, Program analysis as constraint solving, in: R. Gupta,
     S. P. Amarasinghe (Eds.), Proceedings of the ACM SIGPLAN 2008 Conference on Programming
     Language Design and Implementation, Tucson, AZ, USA, June 7-13, 2008, ACM, 2008, pp. 281–292.
     URL: https://doi.org/10.1145/1375581.1375616. doi:10.1145/1375581.1375616.
 [8] M. Colón, S. Sankaranarayanan, H. Sipma, Linear invariant generation using non-linear constraint
     solving, in: W. A. H. Jr., F. Somenzi (Eds.), Computer Aided Verification, 15th International
     Conference, CAV 2003, Boulder, CO, USA, July 8-12, 2003, Proceedings, volume 2725 of Lecture Notes
     in Computer Science, Springer, 2003, pp. 420–432.
     URL: https://doi.org/10.1007/978-3-540-45069-6_39. doi:10.1007/978-3-540-45069-6_39.
 [9] M. Colón, H. Sipma, Synthesis of linear ranking functions, in: T. Margaria, W. Yi (Eds.), Tools
     and Algorithms for the Construction and Analysis of Systems, 7th International Conference,
     TACAS 2001 Held as Part of the Joint European Conferences on Theory and Practice of Software,
     ETAPS 2001 Genova, Italy, April 2-6, 2001, Proceedings, volume 2031 of Lecture Notes in Computer
     Science, Springer, 2001, pp. 67–81. URL: https://doi.org/10.1007/3-540-45319-9_6.
     doi:10.1007/3-540-45319-9_6.
[10] A. R. Bradley, Z. Manna, H. B. Sipma, Linear ranking with reachability, in: K. Etessami, S. K.
     Rajamani (Eds.), Computer Aided Verification, 17th International Conference, CAV 2005, Edinburgh,
     Scotland, UK, July 6-10, 2005, Proceedings, volume 3576 of Lecture Notes in Computer Science,
     Springer, 2005, pp. 491–504. URL: https://doi.org/10.1007/11513988_48.
     doi:10.1007/11513988_48.
[11] A. Rybalchenko, Constraint solving for program verification: Theory and practice by example,
     in: T. Touili, B. Cook, P. B. Jackson (Eds.), Computer Aided Verification, 22nd International
     Conference, CAV 2010, Edinburgh, UK, July 15-19, 2010. Proceedings, volume 6174 of Lecture Notes
     in Computer Science, Springer, 2010, pp. 57–71. URL: https://doi.org/10.1007/978-3-642-14295-6_7.
     doi:10.1007/978-3-642-14295-6_7.
[12] M. Barth, M. Heizmann, Ultimate IntBlastingWrapper (2023). Available at
     https://smt-comp.github.io/2023/system-descriptions/UltimateIntBlastingWrapper%2BSMTInterpol.pdf.
[13] J. Christ, J. Hoenicke, A. Nutz, SMTInterpol: An interpolating SMT solver, in: A. F. Donaldson,
     D. Parker (Eds.), Model Checking Software - 19th International Workshop, SPIN 2012,
     Oxford, UK, July 23-24, 2012. Proceedings, volume 7385 of Lecture Notes in Computer Science,
     Springer, 2012, pp. 248–254. URL: https://doi.org/10.1007/978-3-642-31759-0_19.
     doi:10.1007/978-3-642-31759-0_19.
[14] Y. C. Liu, C. Pang, D. Dietsch, E. Koskinen, T. Le, G. Portokalidis, J. Xu, Source-level bitwise
     branching for temporal verification of lifted binaries, CoRR abs/2105.05159 (2021).
     URL: https://arxiv.org/abs/2105.05159. arXiv:2105.05159.
[15] A. Gurfinkel, A. Belov, J. Marques-Silva, Synthesizing safe bit-precise invariants, in: E. Ábrahám,
     K. Havelund (Eds.), Tools and Algorithms for the Construction and Analysis of Systems - 20th
     International Conference, TACAS 2014, Held as Part of the European Joint Conferences on Theory
     and Practice of Software, ETAPS 2014, Grenoble, France, April 5-13, 2014. Proceedings, volume
     8413 of Lecture Notes in Computer Science, Springer, 2014, pp. 93–108.
     URL: https://doi.org/10.1007/978-3-642-54862-8_7. doi:10.1007/978-3-642-54862-8_7.
[16] A. Griggio, Effective word-level interpolation for software verification, in: P. Bjesse, A. Slobodová
     (Eds.), International Conference on Formal Methods in Computer-Aided Design, FMCAD ’11,
     Austin, TX, USA, October 30 - November 02, 2011, FMCAD Inc., 2011, pp. 28–36.
     URL: http://dl.acm.org/citation.cfm?id=2157662.
[17] T. Okudono, A. King, Mind the gap: Bit-vector interpolation recast over linear integer arithmetic,
     in: A. Biere, D. Parker (Eds.), Tools and Algorithms for the Construction and Analysis of Systems -
     26th International Conference, TACAS 2020, Held as Part of the European Joint Conferences on
     Theory and Practice of Software, ETAPS 2020, Dublin, Ireland, April 25-30, 2020, Proceedings,
     Part I, volume 12078 of Lecture Notes in Computer Science, Springer, 2020, pp. 79–96.
     URL: https://doi.org/10.1007/978-3-030-45190-5_5. doi:10.1007/978-3-030-45190-5_5.
[18] P. Backeman, P. Rümmer, A. Zeljic, Bit-vector interpolation and quantifier elimination by lazy
     reduction, in: N. S. Bjørner, A. Gurfinkel (Eds.), 2018 Formal Methods in Computer Aided Design,
     FMCAD 2018, Austin, TX, USA, October 30 - November 2, 2018, IEEE, 2018, pp. 1–10. URL:
     https://doi.org/10.23919/FMCAD.2018.8603023. doi:10.23919/FMCAD.2018.8603023.
[19] A. Niemetz, M. Preiner, A. Reynolds, Y. Zohar, C. W. Barrett, C. Tinelli, Towards bit-width-
     independent proofs in SMT solvers, in: P. Fontaine (Ed.), Automated Deduction - CADE 27
     - 27th International Conference on Automated Deduction, Natal, Brazil, August 27-30, 2019,
     Proceedings, volume 11716 of Lecture Notes in Computer Science, Springer, 2019, pp. 366–384. URL:
     https://doi.org/10.1007/978-3-030-29436-6_22. doi:10.1007/978-3-030-29436-6_22.
[20] D. Beyer, P. Chien, N. Lee, Bridging hardware and software analysis with Btor2C: A word-
     level-circuit-to-C translator, in: S. Sankaranarayanan, N. Sharygina (Eds.), Tools and Algorithms
     for the Construction and Analysis of Systems - 29th International Conference, TACAS 2023,
     Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS
     2023, Paris, France, April 22-27, 2023, Proceedings, Part II, volume 13994 of Lecture Notes in
     Computer Science, Springer, 2023, pp. 152–172. URL: https://doi.org/10.1007/978-3-031-30820-8_12.
     doi:10.1007/978-3-031-30820-8_12.
[21] H. Barbosa, C. W. Barrett, M. Brain, G. Kremer, H. Lachnitt, M. Mann, A. Mohamed, M. Mohamed,
     A. Niemetz, A. Nötzli, A. Ozdemir, M. Preiner, A. Reynolds, Y. Sheng, C. Tinelli, Y. Zohar, cvc5: A
     versatile and industrial-strength SMT solver, in: D. Fisman, G. Rosu (Eds.), Tools and Algorithms
     for the Construction and Analysis of Systems - 28th International Conference, TACAS 2022,
     Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS
     2022, Munich, Germany, April 2-7, 2022, Proceedings, Part I, volume 13243 of Lecture Notes in
     Computer Science, Springer, 2022, pp. 415–442. URL: https://doi.org/10.1007/978-3-030-99524-9_24.
     doi:10.1007/978-3-030-99524-9_24.
[22] D. Beyer, S. Löwe, P. Wendler, Reliable benchmarking: Requirements and solutions, STTT 21
     (2019) 1–29. doi:10.1007/s10009-017-0469-y.