A Bit-vector to Integer Translation with bv2nat and nat2bv⋆ Max Barth1,* , Matthias Heizmann2 1 LMU Munich, Munich, Germany 2 University of Stuttgart, Stuttgart, Germany Abstract In this paper we present a translation from bit-vector formulas to integer formulas. The translation uses the function symbols bv2nat and nat2bvπ‘˜ which are both utilized in the theory of fixed-width bit-vectors of the SMT-LIB [1] language to define the semantics of bit-vector operations. Our translation replaces bit-vector operations with their semantic definition. This facilitates a more modular application as bit-vector operations and their semantic definition have the same sort. As a postprocessing our translation replaces the composition bv2nat ∘ nat2bvπ‘˜ with a modulo operation, and removes redundant modulo operations from the translation result. The evaluation of our translation shows that we are able to solve 9% more tasks, 10% faster and with 23% less memory usage compared to a closely related, up-to-date translation approach. Additionally, our translation supports the translation of quantified formulas and arrays over bit-vectors. Keywords Int-blasting, Bit-vectors, Translation from bit-vectors to integers, bv2nat and nat2bv, Translation of quantified formulas and arrays 1. Introduction In many program languages, integer data types represent only a fixed number of values. A sequence of bits is utilized to represent an integer via two’s complement or a binary encoding. We call such a sequence of bits a bit-vector. The SMT-LIB [1] theory of β€œFixedSizeBitVectors"1 is well-suited for modeling such programming languages since it offers many function symbols that capture precisely the semantics operations that occur in programming languages. However, the expressiveness of the bit-vector theory comes at a certain price. Due to their complex semantics, formulas of this theory are rather intractable and there are only few algorithms that handle these formulas directly. Typically, algorithms that work on bit-vector formulas first do a translation, either to propositional logic or to integer arithmetic. The translation to propositional logic is called bit-blasting [2, 3]. Bit-blasting translates each bit of a bit-vector into a propositional logical variable and translates each bit-vector operation into a propositional logical formula that captures exactly the semantics of that operation. The strength of bit-blasting is that we can utilize powerful SAT solvers for deciding statisfiability of the resulting formulas. The alternative to bit-blasting is the translation to integer arithmetic. A recent publication coined the term int-blasting [4] for this translation. Here, bit-vector variables are translated to integer variables and a comprehensive application of modulo operations makes sure that we can establish a connection between models for the bit-vector formulas and models for the integer formulas. In order to model bit-precise bit-vector operations, int-blasting can access individual bits via a combination of integer division (div) and modulo (mod) operations. E.g., if we translate a bit-vector variable x into an integer variable x’ such that x is the binary encoding of x’, we can access the third least-significant bit as follows: We divide x’ by 4 and take the result modulo two. The result is one iff this bit was set to true. In order to improve the performance, int-blasting-based translations often work with approximations and increase their precision later if required. SMT’24: 22st International Workshop on Satisfiability Modulo Theories, July 22–23, 2024, Montreal, Canada ⋆ This project was supported by the Deutsche Forschungsgemeinschaft (DFG) β€” 378803395 (ConVeY). * Corresponding author.  0009-0002-7716-3898 (M. Barth); 0000-0003-4252-3558 (M. Heizmann) Β© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). 1 https://smt-lib.org/theories-FixedSizeBitVectors.shtml CEUR ceur-ws.org Workshop ISSN 1613-0073 Proceedings Even though the state of the art to decide the satisfiability of bit-vector formulas is bit-blasting [2], there exist applications that need a translation from bit-vector formulas to formulas over integers. For example in software model checking many techniques only work on integers: loop acceleration [5, 6], invariant syntheses [7, 8] and syntheses of ranking functions [9, 10, 11]. While other techniques perform better on integers, especially Craig interpolation. In this paper we present a translation from bit-vectors to integers that is closely related to the translation of [4]. In order to explain our translation and conceptual differences to [4] we use the functions nat2bvπ‘˜ and bv2nat (see Section 3) which implement the binary encoding of natural numbers and its inverse function, respectively. The translation of [4] ensures the following relation between models of bit-vector formulas and the models of the resulting integer formulas: If the bit-vector model maps a term 𝑑 to a bitvector value 𝑣 then the integer model maps the translation result of 𝑑 to bv2nat(𝑣). In order to ensure this relation, the translation of [4] adds constraints that require that integer variables are in a given range and modulo operations such that every integer term is in the same range as its corresponding bit-vector term. Our translation only requires a less strict relation between models of bit-vector formulas and the models of the resulting integer formulas. If the integer model maps the translation result of the term 𝑑 to an integer 𝑣 β€² then the bit-vector model maps 𝑑 to nat2bv(𝑣 β€² ) . I.e., we do not require that nat2bv(𝑣 β€² ) is the binary encoding of 𝑣 β€² , we only require that nat2bv(𝑣 β€² ) is the binary encoding of nat2bv(𝑣 β€² mod π‘˜), where π‘˜ is the bit-length of 𝑑. This conceptual difference to [4] allows us to omit constraints on the translated variables and it allows us to omit several modulo operations that the translation of [4] introduces. The translation of [4] introduces modulo operations after each arithmetic operation. Our translation omits modulo operations after arithmetic operations but introduces modulo operations before each operation where we need that integer values are in a certain range. E.g., we translate bvult to the less-than relation < but apply a modulo operation to both operands. We call the translation of [4] eager int-blasting and our translation lazy int-blasting. Example 1. For the bit-vector formula: (bvult 𝑦 (bvadd 0101 (bvmul 𝑦 0011))). The result of the eager int-blasting is: (< 𝑦 β€² (mod (+ 5 (mod (Β· 𝑦 β€² 3) 24 )) 24 )) ∧ (β‰₯ 𝑦 β€² 0) ∧ (< 𝑦 β€² 24 ). The result of our lazy int-blasting is: (< (mod 𝑦 β€² 24 ) (mod (+ 5 (Β· 𝑦 β€² 3)) 24 )). Technically, our translation proceeds in two steps. In the first step, we inductively translate the formula and distinguish two cases. If the SMT-LIB semantic definition of the operation that we have to translate utilizes the functions bv2nat and nat2bvπ‘˜ , we replace the operation with their semantic definition. For most of the other operations our translation is similar to the eager int-blasting. The result of our first translation step contains the functions bv2nat and nat2bvπ‘˜ only as a concatenation bv2nat ∘ nat2bvπ‘˜ . Our second translation step replaces this concatenation either by a modulo operation or by the identity function. An additional contribution of this paper is that we present how we translate quantified formulas and arrays over bit-vectors into integers. We have implemented our translation in two ways: as a wrapper script [12] for SMT solver and directly in the SMT solver SMTInterpol [13]. Through our evaluation of both implementations, we demonstrated that our translation does not produce incorrect results on the benchmark set. Moreover, we conducted a comparison between two settings of our translation implemented in SMTInterpol: one setting adds modulo operations lazily, while the other adds them eagerly. Our evaluation results revealed that SMTInterpol is capable of solving 9% more tasks, is 10% faster, and requires 23% less memory when modulo operations are added lazily compared to eagerly. 2. Related Work There have been numerous approaches to translating from bit-vectors to integers. Some methods perform the translation during the verification process e.g. on the source level [14] or during invariant synthesis [15]. More closely related to our work are translations for SMT formulas [16, 17, 18, 19, 4]. However, our translation approach differs from the related work. Firstly, we incorporate modular arithmetic lazily to eliminate redundant modulo operations in the translation result. There exist simplification techniques to eliminate such redundant modulo operations. For example when the SMT solver Z3 is asked simplify(mod (Β· (mod (+ π‘₯ 𝑦) 256) 𝑧) 256), it returns (mod (Β· 𝑧 (+ π‘₯ 𝑦)) 256). However, since we do not add the redundant modulo operations in the first place, there is no need for such a simplification on our translation results. Secondly, no other approach utilizes function symbols with behavior similar to the functions bv2nat and nat2bvπ‘˜ . We give a brief overview of the related approaches. A similar concept to our lazy translation, but in a different research field, can be found in [20]. Where the authors apply modulo operations lazily during their translation from BTOR to C. Their evaluation shows that their lazy translation is faster in general, but not strictly better than their eager translation. Griggio et al. introduces a layered SMT solver in [16], where on the higher layers, the solver performs translation from bit-vector formulas to integer formulas. This translation is for a fragment of the bit-vector functions, including arithmetic functions, extraction, concatenation, left shift, and relations. Rather than incorporating modulo operations to bound the integers, Griggio’s approach utilizes an auxiliary variable, such as variable 𝑣 in the expression (𝑑1 + 𝑑2 βˆ’ 2𝑛 Β· 𝑣) ∧ (0 ≀ 𝑣) ∧ (𝑣 ≀ 1). Each integer term derived during the translation is within the bounds of its corresponding bit-vector term, effectively achieving an eager translation. Backeman, Rummer, and Zeljic [18] introduce a new calculus for non-linear integer arithmetic, which, in certain cases, can eliminate quantifiers and extract Craig interpolants. Subsequently, they define a corresponding calculus for arithmetic bit-vector constraints. Both calculi allow for a flexible switch between bit-vectors and integers. Initially, integers are not bound by modular arithmetic; instead, the authors introduce an uninterpreted function symbol that represents the modulo operation. They note that the remainder operation tends to be a bottleneck for interpolation. If necessary, a definition for the uninterpreted function symbol can be added to precisely cover the remainder. In contrast to our approach with uninterpreted function symbols, their uninterpreted function is directly associated with the modulo operation and does not affect the sort of a term. Furthermore, their approach supports a translation of quantified formulas, but not of arrays over bit-vectors. Recently, the first precise and complete translation for bit-vector formulas with bit-vectors of fixed size was published in [4]. This work is closely related to their translation for bit-vectors with parametric bit-width, proposed in their previous work [19]. The precise translation presented in [4] has been implemented in the cvc5 prover [21]. It employs a translation from bit-vector formulas to non-linear integer arithmetic formulas with uninterpreted functions and universal quantification. During the translation, modulo operations are added eagerly and without the use of uninterpreted function symbols. Our translation approach combines elements from the translation in [4] with the semantics of the theory of "FixedSizeBitVectors" defined in the SMT-LIB [1]. 3. Preliminaries and Notations The SMT-LIB [1] defines a many-sorted first-order logic with equality. In this paper we use the sorts, signatures Ξ£ and theories defined in the SMT-LIB. In particular we use the sorts, signatures and theories of Booleans, fixed-size bit-vectors, integers, and arrays. Bit-Vectors The SMT-LIB defines a signature Ξ£Bv and a theory called β€œFixedSizeBitVectors" for bit-vectors of fixed size. For every possible bit-vector size π‘˜, that is every positive integer greater than zero, Ξ£Bv contains a unique sort πœŽπ‘˜ . We call a term of sort πœŽπ‘˜ ∈ Ξ£Bv bit-vector of size π‘˜ or bit-vector of width π‘˜. In the following let π‘₯ be a bit-vector variable, 𝑐 be a bit-vector constant, and 𝑑, 𝑑1 , and 𝑑2 be bit-vector terms. Furthermore, let the width of bit-vectors π‘₯, 𝑐 and 𝑑 be π‘˜, the width of 𝑑1 be π‘˜1 and the width of 𝑑2 be π‘˜2 . The signature Ξ£Bv as defined in the SMT-LIB contains a set of bit-vector function symbols. We denote the extract function from 𝑖 to 𝑗 as extract𝑖𝑗 (𝑑), where 𝑖 and 𝑗 are natural numbers with 𝑖, 𝑗 β‰₯ 0 ∧ 𝑖 β‰₯ 𝑗. For the other bit-vector functions 𝑓 we use the notation 𝑓 (𝑑1 𝑑2 ) instead of the SMT-LIB notation (𝑓 𝑑1 𝑑2 ). Integers The SMT-LIB defines a signature Ξ£Int and a Ξ£Int -theory for mathematical integers. The sort 𝜎 ∈ Ξ£Int is defined as the set of all integers. The integer signature Ξ£Int consists of variables, constants and the usual functions and relations. Let constants 𝑐′ and terms 𝑑′ , 𝑑′1 and 𝑑′2 all have sort Int. As notation for integer functions we write (𝑑′1 + 𝑑′2 ) instead of the SMT-LIB notation (+ 𝑑′1 𝑑′2 ). In [4] the authors introduce a binary function symbols &π‘˜ (βˆ’, βˆ’), for every positive integer π‘˜. The functions &π‘˜ (βˆ’, βˆ’) are introduced to represents bit-wise and. Therefore, they extend the signature Ξ£Int and define two theories for this extended signature. We will do the same in this paper and refer to [4] for details. Functions bv2nat and nat2bvπ‘˜ In the theory of β€œFixedSizeBitVectors" the functions bv2nat and nat2bvπ‘˜ are defined. Given an arbitrary binary 𝑏 = (π‘π‘˜βˆ’1 , ..., 𝑏𝑖 , ..., π‘βˆ‘οΈ€ 0 ) and its corresponding natural number 𝑛, the function bv2nat is defined as follows: bv2nat(𝑏) := π‘˜βˆ’1 𝑖 𝑖=0 𝑏𝑖 Β· 2 . Furthermore, the function nat2bvπ‘˜ is defined as: nat2bvπ‘˜ (𝑛) := (π‘π‘˜βˆ’1 , ..., 𝑏𝑖 , ..., 𝑏0 ), where 𝑏𝑖 = 𝑛 div 2𝑖 mod 2. Note, we do not extend our signatures and theories with bv2nat and nat2bvπ‘˜ . Instead, we treat them as auxiliary functions and ensure they are eliminated in the translation result. For the sake of readability, we denote bv2nat(𝑑) and nat2bvπ‘˜ (𝑑′ ) as 𝑑bv2nat and 𝑑′nat2bvπ‘˜ , respectively. 4. Translation with bv2nat and nat2bvπ‘˜ Our bit-vector to integer translation maps from the set of Ξ£Bv -formulas to the set of Ξ£Int -formulas (extended with &π‘˜ (βˆ’, βˆ’)). We say a translation "translates" a term or formula if it associates that term or formula with an element of its co-domain. Before we can define the translation, we need to define some auxiliary functions. First we define a variable mapping πœ’. Similar to the variable mapping defined in [4], πœ’ maps a variable of sort bit-vector to a fresh variable of sort integer. We extend the definition of πœ’ for arrays over bit-vectors and quantified variables. Definition 4.1 (Variable Mapping πœ’). Given a bit-vector formula πœ‘, we define a one-to-one mapping πœ’ as the following. For every variable and quantified variable π‘₯ that occurs in πœ‘, πœ’ maps to a fresh variable π‘₯β€² , such that if π‘₯ is of sort Bv, then π‘₯β€² is of sort Int. If π‘₯ is of sort Array with arguments of sort 𝑠 ∈ {Bv, Array} then π‘₯β€² is of sort Array with arguments of sort 𝑠′ ∈ {Int, Array} correspondingly. Finally, if π‘₯ is of sort Bool, then π‘₯β€² is of sort Bool. We write πœ’ maps π‘₯ to π‘₯β€² as πœ’(π‘₯) = π‘₯β€² . In Table 1 we use the auxiliary function π‘’π‘‘π‘ π‘˜ (βˆ’) from [4]. For a bit-vector term 𝑑′ , π‘’π‘‘π‘ π‘˜ (𝑑′ ) is an abbreviation for the term 2 Β· (𝑑′ mod 2π‘˜βˆ’1 ), which transforms an unsigned bit-vector to a signed bit-vector. Initially, our translation interprets every bit-vector as unsigned. In the case of a signed relation, we enclose the arguments of the relation with the function π‘’π‘‘π‘ π‘˜ . to ensures that the semantics of signed relation is preserved properly. Finally, we define the translation function 𝑇 that maps from Ξ£Bv -formulas πœ‘ of the theory of bit- vectors to Ξ£Int -formulas πœ“ of the theory of integers (extended with &π‘˜ (βˆ’, βˆ’)). Therefore, we define a conversion functions 𝐢 in Table 1 (column Lazy) and a replacement function 𝑅 in Table 3. The translation function 𝑇 is defined as: 𝑇 := πœ‘ ↦→ 𝑅(𝐢(πœ‘)) Our translation consist of two steps. In step one 𝐢 replaces bit-vector formulas and terms by their semantic definition. The conversion functions 𝐢 matches a term or formula 𝑒 to a term or formula in the first column. The match is then translated to the term or formula in the middle column named Lazy. For a direct comparison, we display the translation steps from the Eager translation in [4] in the third column in gray. Note, in [4] the authors translate a bit-vector variable 𝑣 by adding constraints in the form of (0 ≀ πœ’(𝑣) < 2π‘˜ ) to the translation result and do not surround the integer variable πœ’(𝑣) with modulo as we do in the third column of Table 2. For readability reasons, we split the definition of function 𝐢 into three functions: 𝐢, 𝐢𝑑 and 𝐢𝑑′ . Functions 𝐢𝑑 and 𝐢𝑑′ are both defined in Table 2. The conversion function uses bv2nat and nat2bvπ‘˜ to replace bit-vector formulas and terms by their semantic definition, but our signature Ξ£Int and integer theory do not contain bv2nat and nat2bvπ‘˜ . In the second step, we use the replacement function 𝑅 to get rid of bv2nat and nat2bvπ‘˜ . Therefore, we either remove the concatenation bv2nat ∘ nat2bvπ‘˜ or replace it with a modulo operation. In the tables defining 𝐢, 𝐢𝑑 , and 𝐢𝑑′ , we have indicated certain bv2nat function calls in blue. If function 𝑅(𝑒) matches 𝑒 to (𝑑′nat2bvπ‘˜ )bv2nat where bv2nat is marked blue, then we replace 𝑒 with 𝑑′ mod 2π‘˜ . Otherwise, if bv2nat is not marked blue we replace (𝑑′nat2bvπ‘˜ )bv2nat with 𝑑′ . So far we translate bvand to the uninterpreted function symbol &π‘˜ (βˆ’, βˆ’) in 𝐢 and do not treat it any further. For a more sophisticated translation of bvand we refer to literature [4]. Lazy Eager 𝐢(𝑒) : 𝑀 π‘Žπ‘‘π‘β„Ž 𝑒 : 𝑑1 = 𝑑2 𝐢𝑑 (𝑑1 )bv2nat = 𝐢𝑑 (𝑑2 )bv2nat 𝐢𝑑 (𝑑1 ) = 𝐢𝑑 (𝑑2 ) bvult(𝑑1 , 𝑑2 ) 𝐢𝑑 (𝑑1 )bv2nat < 𝐢𝑑 (𝑑2 )bv2nat 𝐢𝑑 (𝑑1 ) < 𝐢𝑑 (𝑑2 ) bvule(𝑑1 , 𝑑2 ) 𝐢𝑑 (𝑑1 )bv2nat ≀ 𝐢𝑑 (𝑑2 )bv2nat 𝐢𝑑 (𝑑1 ) ≀ 𝐢𝑑 (𝑑2 ) bvslt(𝑑1 , 𝑑2 ) π‘’π‘‘π‘ π‘˜ (𝐢𝑑 (𝑑1 )bv2nat ) < π‘’π‘‘π‘ π‘˜ (𝐢𝑑 (𝑑2 )bv2nat ) π‘’π‘‘π‘ π‘˜ (𝐢𝑑 (𝑑1 )) < π‘’π‘‘π‘ π‘˜ 𝐢𝑑 (𝑑2 )) bvsle(𝑑1 , 𝑑2 ) π‘’π‘‘π‘ π‘˜ (𝐢𝑑 (𝑑1 )bv2nat ) ≀ π‘’π‘‘π‘ π‘˜ (𝐢𝑑 (𝑑2 )bv2nat ) π‘’π‘‘π‘ π‘˜ (𝐢𝑑 (𝑑1 )) ≀ π‘’π‘‘π‘ π‘˜ 𝐢𝑑 (𝑑2 )) β–‘(𝑑1 , ...𝑑𝑖 ) β–‘(𝐢(𝑑1 ), ..., 𝐢(𝑑𝑖 )) β–‘ ∈ {∧, ∨, Β¬, β‡’, ⇔} π‘’π‘‘π‘ π‘˜ (𝑑′ ) := 2 Β· (𝑑′ mod 2π‘˜βˆ’1 ) βˆ’ 𝑑′ Table 1 Definition of the Conversion Function 𝐢 4.1. Translation of Arrays and Quantified Formulas For the translation of quantified formulas and arrays we extend our conversion functions 𝐢 and 𝐢𝑑 with the conversions outlined in Table 4. Therefore, let in Table 4 π‘Ž and 𝑏 be arrays over bit-vectors, where the bit-vectors that represent the indices of π‘Ž and 𝑏 have width π‘˜. Additionally, let 𝑖′ be a quantified integer variable. The translation of quantified formulas utilizes the variable mapping πœ’(𝑣) = 𝑣 β€² , where 𝑣 is a quantified bit-vector variable of width π‘˜ and 𝑣 β€² is a quantified integer variable. Additionally, we ensure that a translated quantified formula is unsatisfiable for values of 𝑣 β€² that are not within the bounds of 𝑣. Therefore, we add the bound (0 ≀ πœ’(𝑣) < 2π‘˜ ) as constraints within the scope of the quantifier. Arrays over bit-vectors π‘Ž are translated to fresh arrays over integers with the help of the one-to-one mapping πœ’(π‘Ž). Since arrays over integers have infinitely indices and arrays over bit-vectors don’t, we have to add some limitations. First of all, we ensure to only read from and write to indices that are within the bounds of the corresponding bit-vector array. Secondly, we have to ensure that if two translated arrays are equal on every index in the range of the bit-vector array, then they are equal on every index. Therefore, we add a constraint to the translation result that evaluates to false if this condition is violated. This is achieved by the constraint function Lem in Table 4. Furthermore, we change the translation function to add constraints for every array equality: π‘‡π΄π‘Ÿπ‘Ÿπ‘Žπ‘¦ := πœ‘ ↦→ 𝑅(𝐢(πœ‘)) ∧ Lem(πœ‘). Lazy Eager 𝐢𝑑 (𝑒) : 𝑀 π‘Žπ‘‘π‘β„Ž 𝑒 : concat(𝑑1 , 𝑑2 ) 𝐢𝑑′ (𝑒)nat2bvπ‘˜1 +π‘˜2 𝐢𝑑′ (𝑒) extract𝑖𝑗 (𝑑1 ) 𝐢𝑑′ (𝑒)nat2bvπ‘–βˆ’π‘—+1 𝐢𝑑′ (𝑒) else: 𝑒 𝐢𝑑′ (𝑒)nat2bvπ‘˜ 𝐢𝑑′ (𝑒) 𝐢𝑑′ (𝑒) : 𝑀 π‘Žπ‘‘π‘β„Ž 𝑒 : π‘₯ πœ’(π‘₯) πœ’(π‘₯) mod 2π‘˜ βˆ‘οΈ€π‘˜βˆ’1 𝑐 𝑖=0 𝑐𝑖 Β· 2𝑖 where 𝑐𝑖 is the 𝑖-th bit of 𝑐 bvneg(𝑑1 ) 2π‘˜1 βˆ’ 𝐢𝑑 (𝑑1 )bv2nat 2π‘˜1 βˆ’ 𝐢𝑑 (𝑑1 ) bvmul(𝑑1 , 𝑑2 ) 𝐢𝑑 (𝑑1 )bv2nat Β· 𝐢𝑑 (𝑑2 )bv2nat (𝐢𝑑 (𝑑1 ) Β· 𝐢𝑑 (𝑑2 )) mod 2π‘˜ bvadd(𝑑1 , 𝑑2 ) 𝐢𝑑 (𝑑1 )bv2nat + 𝐢𝑑 (𝑑2 )bv2nat (𝐢𝑑 (𝑑1 ) + 𝐢𝑑 (𝑑2 )) mod 2π‘˜ bvsub(𝑑1 , 𝑑2 ) 𝐢𝑑 (𝑑1 )bv2nat βˆ’ 𝐢𝑑 (𝑑2 )bv2nat (𝐢𝑑 (𝑑1 ) βˆ’ 𝐢𝑑 (𝑑2 )) mod 2π‘˜ ite(𝐢𝑑 (𝑑2 )bv2nat = 0, 2π‘˜ βˆ’ 1, ite(𝐢𝑑 (𝑑2 ) = 0, 2π‘˜ βˆ’ 1, bvudiv(𝑑1 , 𝑑2 ) 𝐢𝑑 (𝑑1 )bv2nat div 𝐢𝑑 (𝑑2 )bv2nat ) 𝐢𝑑 (𝑑1 ) div 𝐢𝑑 (𝑑2 )) ite(𝐢𝑑 (𝑑2 )bv2nat = 0, 𝑑1 , ite(𝐢𝑑 (𝑑2 ) = 0, 𝐢𝑑 (𝑑1 ), bvurem(𝑑1 , 𝑑2 ) 𝐢𝑑 (𝑑1 )bv2nat mod 𝐢𝑑 (𝑑2 )bv2nat ) 𝐢𝑑 (𝑑1 ) mod 𝐢𝑑 (𝑑2 )) ite(𝐢𝑑 (𝑑2 )bv2nat = 1, ite(𝐢𝑑 (𝑑2 ) = 1, 2 Β· 𝐢𝑑 (𝑑1 )bv2nat , 2 Β· 𝐢𝑑 (𝑑1 ) mod 2π‘˜ , ... ... bvshl(𝑑1 , 𝑑2 ) ite(𝐢𝑑 (𝑑2 )bv2nat = π‘˜ βˆ’ 1, ite(𝐢𝑑 (𝑑2 ) = π‘˜ βˆ’ 1, 2π‘˜βˆ’1 Β· 𝐢𝑑 (𝑑1 )bv2nat , 2π‘˜βˆ’1 Β· 𝐢𝑑 (𝑑1 ) mod 2π‘˜ , 0)...) 0)...) ite(𝐢𝑑 (𝑑2 )bv2nat = 1, ite(𝐢𝑑 (𝑑2 ) = 1, 𝐢𝑑 (𝑑1 )bv2nat div 2, 𝐢𝑑 (𝑑1 ) div 2, ... ... bvlshr(𝑑1 , 𝑑2 ) ite(𝐢𝑑 (𝑑2 )bv2nat = π‘˜ βˆ’ 1, ite((𝐢𝑑 (𝑑2 ) = π‘˜ βˆ’ 1), (𝐢𝑑 (𝑑1 )bv2nat div 2π‘˜βˆ’1 ), 𝐢𝑑 (𝑑1 ) div 2π‘˜βˆ’1 , 0)...) 0)...) concat(𝑑1 , 𝑑2 ) 𝐢𝑑 (𝑑1 )bv2nat Β· 2π‘˜2 + 𝐢𝑑 (𝑑2 )bv2nat 𝐢𝑑 (𝑑1 ) Β· 2π‘˜2 + 𝐢𝑑 (𝑑2 ) extract𝑖𝑗 (𝑑1 ) 𝐢𝑑 (𝑑1 )bv2nat div 2𝑗 𝐢𝑑 (𝑑1 ) div 2𝑗 mod 2π‘–βˆ’π‘—+1 bvnot(𝑑1 ) 2π‘˜1 βˆ’ 𝐢𝑑 (𝑑1 )bv2nat + 1 2π‘˜1 βˆ’ 𝐢𝑑 (𝑑1 ) + 1 bvand(𝑑1 , 𝑑2 ) &π‘˜ (𝐢𝑑 (𝑑1 )bv2nat , 𝐢𝑑 (𝑑2 )bv2nat ) &π‘˜ (𝐢𝑑 (𝑑1 ), 𝐢𝑑 (𝑑2 )) Table 2 Definition of the Conversion Function 𝐢𝑑 𝑅(𝑒) : 𝑀 π‘Žπ‘‘π‘β„Ž 𝑒 : (𝑑′nat2bvπ‘˜ )bv2nat β†’ 𝑅(𝑑′ ) mod 2π‘˜ where bv2nat is marked blue (𝑑′nat2bvπ‘˜ )bv2nat β†’ 𝑅(𝑑′ ) otherwise else: 𝑒(𝑑′1 , ..., 𝑑′𝑛 ) β†’ 𝑒(𝑅(𝑑′𝑖 ), ..., 𝑅(𝑑′𝑛 )) Table 3 Definition of the Replacement Function 𝑅 Lazy 𝐢(𝑒) : 𝑀 π‘Žπ‘‘π‘β„Ž 𝑒 : βˆƒ 𝑣.(𝑒) βˆƒ 𝐢(𝑣).(𝐢(𝑒) ∧ (0 ≀ πœ’(𝑣) < 2π‘˜ )) βˆ€ 𝑣.(𝑒) βˆ€ 𝐢(𝑣).((0 ≀ πœ’(𝑣) < 2π‘˜ ) β‡’ 𝐢(𝑒)) 𝐢𝑑 (𝑒) : 𝑀 π‘Žπ‘‘π‘β„Ž 𝑒 : π‘Ž πœ’(π‘Ž) (select π‘Ž 𝑖) (select 𝐢𝑑 (π‘Ž) 𝐢𝑑 (𝑖)bv2nat ) (store π‘Ž 𝑖 𝑣) (store 𝐢𝑑 (π‘Ž) 𝐢𝑑 (𝑖)bv2nat 𝐢𝑑 (𝑣)) Lem(𝑒) : 𝑀 π‘Žπ‘‘π‘β„Ž 𝑒 : (βˆ€π‘–β€² .(0 ≀ 𝑖′ < 2π‘˜ ) β‡’ ((select 𝐢𝑑 (π‘Ž) 𝑖′ ) mod 2π‘˜ = π‘Ž=𝑏 (select 𝐢𝑑 (𝑏) 𝑖′ ) mod 2π‘˜ )) β‡’ 𝐢𝑑 (π‘Ž) = 𝐢𝑑 (𝑏) ⋀︀𝑛 β–‘(𝑑1 , ...𝑑𝑖 ) 𝑖=1 Lem(𝑑𝑖 ) β–‘ ∈ {∧, ∨, Β¬, β‡’, ⇔} else: ⊀ Table 4 Extension for Quantified Formulas and Arrays 5. Implementation We have two implementations for our lazy translation presented in Section 4. The first implementation is a wrapper script for SMT solvers, called Ultimate IntBlastingWrapper [12]. It is implemented in the Ultimate framework2 . The wrapper script does not use the functions bv2nat and nat2bvπ‘˜ instead modulo operations are added directly. It supports a translation of bit-wise operations with all features described in [4], quantified formulas and arrays over bit-vectors. The second implementation is in the SMT solver SMTInterpol3 . This implementation is still work in progress. So far we use bv2nat and nat2bvπ‘˜ as uninterpreted functions, but the implementation does not support bit-wise operations, quantified variables and arrays yet. We implemented two settings to translate bit-vectors to integers. The first setting is called Lazy and it applies our lazy translation in Section 4. The second setting is called Eager and it applies a translation similar to [4]. Each setting applies the conversions in their respective column in Table 2. Additionally, we compare our settings Lazy and Eager with the original implementation (cvc5-int) of [4] in the SMT solver cvc5. We selected SMTInterpol as the SMT solver for our evaluation because, at the point of writing, it did not support bit-vectors, and we were already familiar with the tool. Unfortunately, it did not support non-linear integer arithmetic either. When SMTInterpol encounters a bit-wise operation or non-linear integer arithmetic we return an error. 6. Evaluation We evaluate our two implementations to answer the following research questions: β€’ How does the performance of the approach Lazy and Eager compare? β€’ How does Lazy and Eager compare to int-blasting in cvc5? 6.1. Evaluation of Ultimate IntBlastingWrapper We participated with Ultimate IntBlastingWrapper [12] at the latest SMT-COMP 2023. Ulti- mate IntBlastingWrapper competed in the Single Query Track on every logic that contains bit- 2 https://ultimate.informatik.uni-freiburg.de and github.com/ultimate-pa/ultimate 3 https://github.com/ultimate-pa/smtinterpol/tree/Intblasting Memory used by Lazy [MB] 100 CPU time of Lazy [s] 1000 10 100 1 0.1 10 0.1 1 10 100 10 100 1000 CPU time of Eager [s] Memory used by Eager [MB] Figure 1: Comparing CPU time and Memory usage of Eager (x-axis) and Lazy (y-axis) vectors. The results of Ultimate IntBlastingWrapper in the SMT-COMP 2023 showed wrong results on three benchmarks. Of these three, two are from the category QF_AUFBV and one from QF_ABV. The wrong results were caused by a mistake in the translation of equalities between arrays over bit-vectors. This has been fixed in commit: https://github.com/ultimate-pa/ultimate/commit/ 928447c7dc8c44e406f0a52121ccf96fcbe4d5b5. 6.2. Evaluation of Int-blasting in SMTInterpol To evaluate our implementation in the SMTInterpol, we ran the settings Lazy and Eager and the implementation of int-blasting in cvc5 (cvc5-int) from the paper [4]. For cvc5-int we used the cvc5 options --solve-bv-as-int=sum and --nl-ext-tplanes. We run Lazy, Eager and cvc5-int on a randomly picked subset of the non-incremental QF_BV benchmarks from the SMT-LIB. From the set we excluded every benchmark where the expected result is unknown. There remained 12302 benchmarks. Environment We run our experiments on a cluster of machines with 33 GB of memory and an Intel Xeon E3-1230 v5 CPU with 8 processing units and a frequency of 3.40 GHz that run Ubuntu 22.04 (Linux kernel 5.15.0). To measure and limit resources we use Benchexec 3.18 [22]. We limit every run to 1 core, 15 min of CPU time and 15 GB of memory. How does the performance of the approach Lazy and Eager compare? Lazy Eager cvc5-int Correct 2961 2698 8409 SAT 662 425 2135 Lazy Eager UNSAT 2299 2273 6274 Correct 2679 βˆ‘οΈ€ Timeout 2124 2709 3769 βˆ‘οΈ€CPU 23000 s 25600 s Unsupported 7217 6895 - Memory 252 GB 327 GB Memory Out - - 124 Table 6 Total 12302 Benchmarks where Lazy and Eager are correct Table 5 Overview of the evaluation results The evaluation results of Lazy and Eager are displayed in Table 5. We can see that SMTInterpol solves more benchmarks with Lazy than with Eager. Lazy creates 282 more correct results that is 9% (2698 of 2961). Among these, Eager times out on 280 cases, and on 2 cases, Eager reports that the formula contains non-linear integer arithmetic. On the other hand Eager returns a correct result on 19 benchmarks where Lazy times out. Furthermore, on 426 benchmarks Lazy returns an error where Eager times out. All errors in Table 5 are caused by either non-linear integer arithmetic or a bit-wise operation. To compare the CPU time and memory usage of Lazy and Eager, we analyze the 2679 benchmarks for which both settings return a correct result (see Table 6). When measuring the CPU time and memory used on these 2679 benchmarks, Lazy requires 10% less CPU time (23000 s out of 25600 s) and 23% less memory (252 GB out of 327 GB) to decide their satisfiability. For a more detailed view, we provide two scatter plots in Figure 1. Both scatter plots have logarithmic scales, the first shows the CPU time used and the second plot shows the memory usage. Every dot in the scatter plots below the line is in favor of Lazy. We observe that in many cases, the Lazy approach requires less CPU time and/or memory. Specifically, the CPU time and memory usage of Lazy tends to deviate less frequently and significantly from Eager. However, it is important to note that Lazy is not strictly superior to Eager. How does Lazy and Eager compare to int-blasting in cvc5? In Table 5, we observe that cvc5-int solves 8409 tasks. Out of these 8409 tasks Lazy times out on 553, returns an error on 5166 and solves correctly 2690. On the 2961 tasks where Lazy returns a correct result, cvc5-int times out 271 times and returns 2691 correct results. Among the 2006 benchmarks where Lazy, Eager, and cvc5-int all return a correct result, cvc5-int demonstrates significantly lower CPU time and memory usage. For solving these 2006 tasks, Lazy requires 30380 s, Eager requires 62800 s, and cvc5-int requires 12750 s. Regarding memory usage, Lazy consumes 246.7 GB, Eager consumes 321.4 GB, and cvc5-int consumes 16.5 GB. The comparison between cvc5-int and our implementations shows that cvc5-int solves significantly more benchmarks. However, cvc5-int is not strictly superior, we solve 271 benchmarks on which cvc5-int times out. 7. Conclusion We present a lazy translation from bit-vector formulas to integer formulas that utilizes the functions bv2nat and nat2bvπ‘˜ . Our translation consists of a conversion function 𝐢, and a replacement function 𝑅. Conversion function 𝐢 replaces bit-vector terms and formulas with their semantic definition, thus incorporating bv2nat and nat2bvπ‘˜ into the translation process This makes our translation more modular since bit-vector terms and their semantic definition have the same sort. Additionally, conversion function 𝐢 supports a translation of quantified formulas and arrays over bit-vectors. Within replacement function 𝑅, we eliminate bv2nat and nat2bvπ‘˜ from the translation result and introduce modulo operations instead. This is done in such a way that the amount of redundant modulo operations is reduced. We implemented our lazy translation in the SMTInterpol and as wrapper script for SMT solver. Our evaluation of both implementations indicates the correctness of the lazy translation. Furthermore, it shows that SMTInterpol with our lazy translation is able to solve 9% more tasks, 10% faster and with 23% less memory usage than with a closely related, up-to-date eager translation approach. References [1] C. Barrett, P. Fontaine, C. Tinelli, The SMT-LIB Standard: Version 2.6, Technical Report, Department of Computer Science, The University of Iowa, 2017. Available at www.SMT-LIB.org. [2] A. Niemetz, M. Preiner, Bitwuzla, in: C. Enea, A. Lal (Eds.), Computer Aided Verification - 35th International Conference, CAV 2023, Paris, France, July 17-22, 2023, Proceedings, Part II, volume 13965 of Lecture Notes in Computer Science, Springer, 2023, pp. 3–17. URL: https://doi.org/10.1007/ 978-3-031-37703-7_1. doi:10.1007/978-3-031-37703-7\_1. [3] L. Aniva, H. Barbosa, C. Barrett, M. Brain, V. Camillo, G. Kremer, H. Lachnitt, A. Mohamed, M. Mohamed, A. Niemetz, et al., Cvc5 at the smt competition 2023 (2023). [4] Y. Zohar, A. Irfan, M. Mann, A. Niemetz, A. NΓΆtzli, M. Preiner, A. Reynolds, C. Barrett, C. Tinelli, Bit- precise reasoning via int-blasting, in: B. Finkbeiner, T. Wies (Eds.), Verification, Model Checking, and Abstract Interpretation, Springer International Publishing, Cham, 2022, pp. 496–518. [5] F. Frohn, A calculus for modular loop acceleration, in: A. Biere, D. Parker (Eds.), Tools and Algorithms for the Construction and Analysis of Systems - 26th International Conference, TACAS 2020, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2020, Dublin, Ireland, April 25-30, 2020, Proceedings, Part I, volume 12078 of Lecture Notes in Computer Science, Springer, 2020, pp. 58–76. URL: https://doi.org/10.1007/978-3-030-45190-5_4. doi:10.1007/978-3-030-45190-5\_4. [6] P. Ganty, R. Iosif, F. KonecnΓ½, Underapproximation of procedure summaries for integer pro- grams, Int. J. Softw. Tools Technol. Transf. 19 (2017) 565–584. URL: https://doi.org/10.1007/ s10009-016-0420-7. doi:10.1007/s10009-016-0420-7. [7] S. Gulwani, S. Srivastava, R. Venkatesan, Program analysis as constraint solving, in: R. Gupta, S. P. Amarasinghe (Eds.), Proceedings of the ACM SIGPLAN 2008 Conference on Programming Language Design and Implementation, Tucson, AZ, USA, June 7-13, 2008, ACM, 2008, pp. 281–292. URL: https://doi.org/10.1145/1375581.1375616. doi:10.1145/1375581.1375616. [8] M. ColΓ³n, S. Sankaranarayanan, H. Sipma, Linear invariant generation using non-linear constraint solving, in: W. A. H. Jr., F. Somenzi (Eds.), Computer Aided Verification, 15th International Conference, CAV 2003, Boulder, CO, USA, July 8-12, 2003, Proceedings, volume 2725 of Lecture Notes in Computer Science, Springer, 2003, pp. 420–432. URL: https://doi.org/10.1007/978-3-540-45069-6_ 39. doi:10.1007/978-3-540-45069-6\_39. [9] M. ColΓ³n, H. Sipma, Synthesis of linear ranking functions, in: T. Margaria, W. Yi (Eds.), Tools and Algorithms for the Construction and Analysis of Systems, 7th International Conference, TACAS 2001 Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2001 Genova, Italy, April 2-6, 2001, Proceedings, volume 2031 of Lecture Notes in Computer Science, Springer, 2001, pp. 67–81. URL: https://doi.org/10.1007/3-540-45319-9_6. doi:10.1007/ 3-540-45319-9\_6. [10] A. R. Bradley, Z. Manna, H. B. Sipma, Linear ranking with reachability, in: K. Etessami, S. K. Rajamani (Eds.), Computer Aided Verification, 17th International Conference, CAV 2005, Edinburgh, Scotland, UK, July 6-10, 2005, Proceedings, volume 3576 of Lecture Notes in Computer Science, Springer, 2005, pp. 491–504. URL: https://doi.org/10.1007/11513988_48. doi:10.1007/11513988\ _48. [11] A. Rybalchenko, Constraint solving for program verification: Theory and practice by example, in: T. Touili, B. Cook, P. B. Jackson (Eds.), Computer Aided Verification, 22nd International Conference, CAV 2010, Edinburgh, UK, July 15-19, 2010. Proceedings, volume 6174 of Lecture Notes in Computer Science, Springer, 2010, pp. 57–71. URL: https://doi.org/10.1007/978-3-642-14295-6_7. doi:10.1007/978-3-642-14295-6\_7. [12] M. Barth, M. Heizmann, Ultimate IntBlastingWrapper (2023). Available at https://smt-comp.github. io/2023/system-descriptions/UltimateIntBlastingWrapper%2BSMTInterpol.pdf. [13] J. Christ, J. Hoenicke, A. Nutz, Smtinterpol: An interpolating SMT solver, in: A. F. Donald- son, D. Parker (Eds.), Model Checking Software - 19th International Workshop, SPIN 2012, Oxford, UK, July 23-24, 2012. Proceedings, volume 7385 of Lecture Notes in Computer Science, Springer, 2012, pp. 248–254. URL: https://doi.org/10.1007/978-3-642-31759-0_19. doi:10.1007/ 978-3-642-31759-0\_19. [14] Y. C. Liu, C. Pang, D. Dietsch, E. Koskinen, T. Le, G. Portokalidis, J. Xu, Source-level bitwise branching for temporal verification of lifted binaries, CoRR abs/2105.05159 (2021). URL: https: //arxiv.org/abs/2105.05159. arXiv:2105.05159. [15] A. Gurfinkel, A. Belov, J. Marques-Silva, Synthesizing safe bit-precise invariants, in: E. ÁbrahΓ‘m, K. Havelund (Eds.), Tools and Algorithms for the Construction and Analysis of Systems - 20th International Conference, TACAS 2014, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2014, Grenoble, France, April 5-13, 2014. Proceedings, volume 8413 of Lecture Notes in Computer Science, Springer, 2014, pp. 93–108. URL: https://doi.org/10.1007/ 978-3-642-54862-8_7. doi:10.1007/978-3-642-54862-8\_7. [16] A. Griggio, Effective word-level interpolation for software verification, in: P. Bjesse, A. SlobodovΓ‘ (Eds.), International Conference on Formal Methods in Computer-Aided Design, FMCAD ’11, Austin, TX, USA, October 30 - November 02, 2011, FMCAD Inc., 2011, pp. 28–36. URL: http: //dl.acm.org/citation.cfm?id=2157662. [17] T. Okudono, A. King, Mind the gap: Bit-vector interpolation recast over linear integer arithmetic, in: A. Biere, D. Parker (Eds.), Tools and Algorithms for the Construction and Analysis of Systems - 26th International Conference, TACAS 2020, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2020, Dublin, Ireland, April 25-30, 2020, Proceedings, Part I, volume 12078 of Lecture Notes in Computer Science, Springer, 2020, pp. 79–96. URL: https: //doi.org/10.1007/978-3-030-45190-5_5. doi:10.1007/978-3-030-45190-5\_5. [18] P. Backeman, P. RΓΌmmer, A. Zeljic, Bit-vector interpolation and quantifier elimination by lazy reduction, in: N. S. BjΓΈrner, A. Gurfinkel (Eds.), 2018 Formal Methods in Computer Aided Design, FMCAD 2018, Austin, TX, USA, October 30 - November 2, 2018, IEEE, 2018, pp. 1–10. URL: https://doi.org/10.23919/FMCAD.2018.8603023. doi:10.23919/FMCAD.2018.8603023. [19] A. Niemetz, M. Preiner, A. Reynolds, Y. Zohar, C. W. Barrett, C. Tinelli, Towards bit-width- independent proofs in SMT solvers, in: P. Fontaine (Ed.), Automated Deduction - CADE 27 - 27th International Conference on Automated Deduction, Natal, Brazil, August 27-30, 2019, Proceedings, volume 11716 of Lecture Notes in Computer Science, Springer, 2019, pp. 366–384. URL: https://doi.org/10.1007/978-3-030-29436-6_22. doi:10.1007/978-3-030-29436-6\_22. [20] D. Beyer, P. Chien, N. Lee, Bridging hardware and software analysis with btor2c: A word- level-circuit-to-c translator, in: S. Sankaranarayanan, N. Sharygina (Eds.), Tools and Algorithms for the Construction and Analysis of Systems - 29th International Conference, TACAS 2023, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2022, Paris, France, April 22-27, 2023, Proceedings, Part II, volume 13994 of Lecture Notes in Computer Science, Springer, 2023, pp. 152–172. URL: https://doi.org/10.1007/978-3-031-30820-8_12. doi:10.1007/978-3-031-30820-8\_12. [21] H. Barbosa, C. W. Barrett, M. Brain, G. Kremer, H. Lachnitt, M. Mann, A. Mohamed, M. Mohamed, A. Niemetz, A. NΓΆtzli, A. Ozdemir, M. Preiner, A. Reynolds, Y. Sheng, C. Tinelli, Y. Zohar, cvc5: A versatile and industrial-strength SMT solver, in: D. Fisman, G. Rosu (Eds.), Tools and Algorithms for the Construction and Analysis of Systems - 28th International Conference, TACAS 2022, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2022, Munich, Germany, April 2-7, 2022, Proceedings, Part I, volume 13243 of Lecture Notes in Computer Science, Springer, 2022, pp. 415–442. URL: https://doi.org/10.1007/978-3-030-99524-9_24. doi:10.1007/978-3-030-99524-9\_24. [22] D. Beyer, S. LΓΆwe, P. Wendler, Reliable benchmarking: Requirements and solutions, STTT 21 (2019) 1–29. doi:10.1007/s10009-017-0469-y.