=Paper=
{{Paper
|id=Vol-3185/paper675
|storemode=property
|title=Trail Saving in SMT
|pdfUrl=https://ceur-ws.org/Vol-3185/paper675.pdf
|volume=Vol-3185
|authors=Milan Banković,David Šćepanović
|dblpUrl=https://dblp.org/rec/conf/smt/BankovicS22
}}
==Trail Saving in SMT==
Trail Saving in SMT Milan Banković1,* , David Šćepanović1 1 University of Belgrade, Faculty of Mathematics, Serbia Abstract In this paper we discuss and evaluate the method of trail saving on backjumps in CDCL(T)-based SMT solvers. The method was originally proposed for CDCL-based SAT solvers at the SAT conference in 2020, showing a positive impact on solving SAT instances. Since a SAT solver tends to follow a similar path down the search tree after a backjump, saving the retracted portion of the trail enables speeding up the inference after the backjump by copying the saved inferred literals to the assertion trail, instead of re-propagating them by the unit-propagation mechanism. A similar behaviour may be expected when SMT solvers are concerned, but since the process of theory propagation within SMT solvers is usually even more expensive, the trail saving technique may potentially have even more significant impact in case of SMT instances. Although the experimental evaluation given in this paper shows some potential of the approach, the obtained results are generally mixed, and depend greatly on the chosen benchmark family, even within the same theory. Further analysis might be needed in order to better understand the behaviour of the method and its effects on the entire solving process. Keywords SMT solving, Trail saving, Experimental evaluation 1. Introduction SAT and SMT solvers have become indispensable tools in the previous two decades, with appli- cations spanning from hardware and software verification [1, 2, 3, 4], to proving mathematical theorems [5, 6] and solving different kinds of constraint satisfaction problems [7, 8]. SAT solvers decide on the satisfiability of a given propositional formula, typically in conjunctive normal form (CNF). Although quite simple, propositional language is powerful enough to enable a great num- ber of problems from different areas to be expressed as problems of propositional satisfiability (or validity) and then solved by a SAT solver. This approach has showed tremendous success in the previous years, thanks to very efficient SAT solvers based on the conflict-driven-clause- learning (CDCL) algorithm [9]. On the other hand, SMT solvers decides on the satisfiability of a first-order formula, modulo some given theory of interest. Typical theories used in practical SMT solving are motivated by the main line of application of SMT solvers in the field of software verification, and include equality with uninterpreted functions (EUF), linear real and integer arithmetic, theory of arrays, theory of bitvectors, theory of inductive datatypes, and so on. Most modern SMT solvers uses a CDCL SAT solver for dealing with the propositional structure of a given first-order formula, while the theory reasoning is the responsibility of some dedicated theory-specific procedure [10]. SMT 2022: Satisfiability Modulo Theories, August 11–12, 2022, Haifa, Israel * Corresponding author. " milan@matf.bg.ac.rs (M. Banković); davidscepanovic96@gmail.com (D. Šćepanović) © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings http://ceur-ws.org ISSN 1613-0073 CEUR Workshop Proceedings (CEUR-WS.org) 2 New applications pose new challenges to SAT and SMT solvers, and there is a constant pressure to make SAT and SMT technology even more efficient and robust. Therefore, new implementational and algorithmic enhancements of the existing procedures used in SAT and SMT solvers have become a very popular research direction. One such improvement in the area of SAT solving was presented by Hickey and Bacchus [11] at the SAT conference in 2020. Their work considers saving the retracted portion of the assertion trail on each backjump (that is, non- chronological backtrack). Since SAT solvers tend to follow a similar path while redescending the search tree after a backjump, the saved literals may be used to speed up that redescent, since they can be simply copied to the trail, instead of being re-propagated by the mechanism of unit- propagation [9] . This is especially significant in the context of non-chronological backtracking, because the retracted portions of the trail tend to be very large, and the solver must reproduce a large number of literals at a similar cost as before the backjump. Hickey and Bacchus showed that this approach had a positive impact on the overall performance of the SAT solvers that were used in their experiments [11]. The main goal of this work is to evaluate the potential of using the described technique within SMT solvers. The process of theory propagation [12], which is used as the inference mechanism within SMT solvers, is usually much more time-consuming, compared to the unit propagation of SAT solvers, because of more complex decision procedures (and the corresponding data structures) used for the theory reasoning. Therefore, the trail saving method may be even more efficient in the context of SMT solvers. On the other hand, implementing the trail saving method within an SMT solver may pose additional challenges, since a non-trivial modifications of the theory decision procedures might be needed in order to fully exploit the benefits of the technique. In this paper, we aim to discuss all these issues and also to provide an experimental evaluation that will give us some preliminary answers about the effectiveness of the proposed trail saving method in the area of SMT solving. Up to the authors’ knowledge, such an analysis was not published in the literature so far. The rest of the paper is organized as follows. In Section 2 we present the basic concepts and notions needed in the rest of the text. In Section 3 we describe the basic trail saving method, as presented in [11]. Section 4 considers the usage of the trail saving method within SMT solvers, and also discusses some challenges in that context. In Section 5 we discuss our implementation and present the experimental results obtained by our implementation. Finally, in Section 6 we give some conclusions and mention some further work directions. 2. Background In this paper we assume the standard syntax and semantics of the first order logic with equality, adopting the terminology from [10]. A first order formula 𝐹 is satisfiable with respect to a given theory T (or T-satisfiable) if there is a model of T in which 𝐹 is true. The problem of satisfiability of first order formulae with respect to a given theory is known as the SMT problem. We assume that formulae are in conjunctive normal form (CNF), that is, a formula is a conjunction of clauses, and each clause is a disjunction of literals (first order atoms or their negations). We also assume that all considered formulae are ground, i.e. do not contain variables. 3 We follow the usual approach for solving SMT problems, based on the CDCL(T)1 scheme [12]. That is, we assume that the solver consists of a SAT solving core, implementing the CDCL algorithm [9], and a dedicated procedure called a theory solver, which decides on T-satisfiability of the found propositional model, expressed as a conjunction of ground first order literals. The notation used in the following text is mostly borrowed from [11]. The CDCL algorithm incrementally builds a propositional model of the given formula 𝐹 , by assigning truth values to the literals of 𝐹 . A partial propositional valuation 𝑣 is represented by the (assertion) trail 𝑀 , which is a stack of literals that are true in 𝑣. The trail 𝑀 is partitioned into decision levels 𝑀 [[𝑖]], based on the number of decision literals that precede each literal on 𝑀 (that is, 𝑀 [[0]] is the sequence of literals on 𝑀 before the first decision literal, 𝑀 [[1]] is the sequence of literals starting from the first decision literal up to (but not including) the second decision literal, and so on). The topmost (current) decision level will be denoted by 𝑀 [[𝐿𝑑𝑒𝑒𝑝 ]]. Whenever a decision is made, literals inferred from 𝑀 and the clauses of 𝐹 are pushed on 𝑀 (the process known as unit propagation). The clause responsible for the inference of a literal 𝑙 is called the (propagation) reason of 𝑙, and is denoted by 𝑟𝑒𝑎𝑠𝑜𝑛(𝑙). If a clause becomes falsified, the process of conflict analysis uses the falsified clause (called the conflict reason) and the reasons of propagated literals on 𝑀 to determine the real cause of the conflict (expressed as a backjump clause 𝐶𝑏𝑎𝑐𝑘 ) and backjumps to the appropriate level 𝐿𝑏𝑎𝑐𝑘 , i.e. restores the trail to the state 𝑀 [[0]], 𝑀 [[1]], . . . , 𝑀 [[𝐿𝑏𝑎𝑐𝑘 ]] (denoted by 𝑀 [[0 . . . 𝐿𝑏𝑎𝑐𝑘 ]] for short). The level 𝐿𝑏𝑎𝑐𝑘 is the minimal level such that 𝐶𝑏𝑎𝑐𝑘 is a unit clause with respect to 𝑀 [[0 . . . 𝐿𝑏𝑎𝑐𝑘 ]] (i.e. it would trigger a unit propagation immediately after the backjump). When a CDCL-based SAT solver is coupled with a theory solver for some first order theory T, a conflict may also arise if the theory solver deduces that a subset 𝑅 of literals from 𝑀 is T-unsatisfiable. Such a conflict is called a theory conflict (or T-conflict), and 𝑅 is a T-conflict reason which is used as a starting point for conflict analysis. The theory solver may also T-infer literals (i.e. discover literals of 𝐹 that are T-consequence of some subsets of literals from 𝑀 ). These literals are pushed on 𝑀 (the process known as theory propagation). If a literal 𝑙 is T-inferred from the set 𝐸 ⊆ 𝑀 , then 𝐸 is called (T-propagation) reason of 𝑙. It is again denoted by 𝑟𝑒𝑎𝑠𝑜𝑛(𝑙) and may be used during the conflict analysis in the future. A formula is unsatisfiable if a conflict (propositional or T-conflict) happens when 𝐿𝑑𝑒𝑒𝑝 = 0, and is satisfiable if all literals of 𝐹 have assigned values, none of the clauses are falsified and there is no T-conflict detected by the theory solver. 3. Trail Saving Algorithm In this section we discuss the basic trail saving algorithm [11], in the context of a CDCL- based SAT solver. The algorithm maintains the list of saved literals retracted from the trail 𝑀 , denoted by 𝑀𝑠𝑎𝑣𝑒 (it is initially empty, by assumption). We denote the 𝑖-th literal in 𝑀𝑠𝑎𝑣𝑒 as 𝑀𝑠𝑎𝑣𝑒 [𝑖]. At each backjump, the portion 𝑀 [[𝐿𝑏𝑎𝑐𝑘 + 1 . . . 𝐿𝑑𝑒𝑒𝑝 − 1]] of the trail should be saved to 𝑀𝑠𝑎𝑣𝑒 , i.e. the retracted part of the trail without the conflicting level 𝑀 [[𝐿𝑑𝑒𝑒𝑝 ]] (the conflicting level is not saved, since it would certainly produce the same conflict in the future). Together with the literals, their reasons must also be saved. The saved reason of a 1 In older literature, the name DPLL(T) was used for the same procedure instead. 4 literal 𝑙 will be denoted by 𝑟𝑒𝑎𝑠𝑜𝑛𝑠𝑎𝑣𝑒 (𝑙). In case of a saved decision literal 𝑙, we assume that 𝑟𝑒𝑎𝑠𝑜𝑛𝑠𝑎𝑣𝑒 (𝑙) = 𝑟𝑒𝑎𝑠𝑜𝑛(𝑙) = ∅. In order to preserve the correctness of the CDCL algorithm, the property of reason soundness [11] of the saved trail 𝑀𝑠𝑎𝑣𝑒 must be maintained during the operation of the solver. This property guarantees that if the literals 𝑀𝑠𝑎𝑣𝑒 [0], 𝑀𝑠𝑎𝑣𝑒 [1],. . . ,𝑀𝑠𝑎𝑣𝑒 [𝑖 − 1] are true in 𝑀 , and 𝑀𝑠𝑎𝑣𝑒 [𝑖] is a saved inferred literal, then all the literals from 𝑟𝑒𝑎𝑠𝑜𝑛𝑠𝑎𝑣𝑒 (𝑀𝑠𝑎𝑣𝑒 [𝑖]) distinct from 𝑀𝑠𝑎𝑣𝑒 [𝑖] itself are false in 𝑀 . This means that it is safe to push 𝑀𝑠𝑎𝑣𝑒 [𝑖] to 𝑀 as an inferred literal (if it is undefined in 𝑀 ), or to report a conflict, if 𝑀𝑠𝑎𝑣𝑒 [𝑖] is false in 𝑀 . In the basic variant of the trail saving algorithm [11], the current content of the list 𝑀𝑠𝑎𝑣𝑒 is always replaced by 𝑀 [[𝐿𝑏𝑎𝑐𝑘 + 1 . . . 𝐿𝑑𝑒𝑒𝑝 − 1]]. An alternative, which is proposed as one of the enhancements in [11], is to prepend 𝑀 [[𝐿𝑏𝑎𝑐𝑘 + 1 . . . 𝐿𝑑𝑒𝑒𝑝 − 1]] to the current state of 𝑀𝑠𝑎𝑣𝑒 . In this paper, we consider this enhancement as an integral part of the algorithm. The basic functions of the trail saving algorithm [11] are given in Algorithm 1, and are discussed in the following paragraphs. Saving the trail. The function SaveTrail is invoked by the solver immediately before a backjump to the level 𝐿𝑏𝑎𝑐𝑘 is performed. It first checks whether the prepending of the newly saved portion of the trail to 𝑀𝑠𝑎𝑣𝑒 may spoil the reason soundness of 𝑀𝑠𝑎𝑣𝑒 . As we will discuss later in more detail, this may happen only when a new conflict occurs immediately after a backjump, while we are still at the level 𝐿𝑏𝑎𝑐𝑘 . Therefore, in such cases we must clear 𝑀𝑠𝑎𝑣𝑒 before 𝑀 [[𝐿𝑏𝑎𝑐𝑘 + 1 . . . 𝐿𝑑𝑒𝑒𝑝 − 1]] is prepended, in order to stay on the safe side. For this reason, we keep track of the decision level of the previous backjump as 𝐿𝑜𝑙𝑑 𝑏𝑎𝑐𝑘 , and reset the saved trail if 𝐿𝑑𝑒𝑒𝑝 = 𝐿𝑜𝑙𝑑 𝑏𝑎𝑐𝑘 . After that, 𝑀 [[𝐿𝑏𝑎𝑐𝑘 + 1 . . . 𝐿𝑑𝑒𝑒𝑝 − 1]] is prepended to 𝑀𝑠𝑎𝑣𝑒 . The function SaveTrail also saves the reasons of the saved literals. The variable 𝑝𝑖𝑣𝑜𝑡 represents the position of the next-to-process literal in the list 𝑀𝑠𝑎𝑣𝑒 . This variable is reset at the end of the function SaveTrail. Example 1. Assume that we have, among others, the following set of clauses: {¬𝑥16 , ¬𝑥15 , ¬𝑥5 }, {𝑥9 , ¬𝑥2 , 𝑥8 }, {¬𝑥13 , 𝑥8 , 𝑥10 }, {𝑥19 , 𝑥3 , 𝑥20 }, {𝑥11 , ¬𝑥1 , 𝑥10 }, {𝑥16 , ¬𝑥6 , ¬𝑥15 }, {𝑥14 , 𝑥8 , 𝑥15 }, {¬𝑥12 , ¬𝑥2 , 𝑥10 }, {¬𝑥17 , ¬𝑥18 }, {¬𝑥8 , ¬𝑥7 }, {¬𝑥20 , 𝑥7 }, {¬𝑥22 , ¬𝑥23 }, {¬𝑥22 , ¬𝑥24 }, {𝑥22 , ¬𝑥25 }, {𝑥23 , 𝑥24 , ¬𝑥1 }, {¬𝑥4 , ¬𝑥19 , ¬𝑥21 }, {𝑥19 , 𝑥8 , ¬𝑥9 }, {¬𝑥1 , 𝑥22 , 𝑥25 }. Assume that after the first several decisions the following state of the trail is reached: 𝑀 = 𝑥𝑑0 𝑥𝑑1 𝑥𝑑2 ¬𝑥3 ¬𝑥𝑑4 𝑥5 𝑥6 𝑥𝑑7 ¬𝑥8 𝑥9 ¬𝑥𝑑10 𝑥11 ¬𝑥12 ¬𝑥13 ¬𝑥𝑑14 𝑥15 𝑥16 Note that the literals labelled with 𝑑 in the exponent are the decision literals, so the cur- rent decision level is 𝐿𝑑𝑒𝑒𝑝 = 7. At that moment, a conflict is encountered with the clause {¬𝑥16 , ¬𝑥15 , ¬𝑥5 }. If the reason of ¬𝑥16 is the clause {𝑥16 , ¬𝑥6 , ¬𝑥15 }, the conflict analysis gives us the backjump clause {¬𝑥15 , ¬𝑥5 , ¬𝑥6 }, and the backjump level 𝐿𝑏𝑎𝑐𝑘 = 4. Before the backjump, the portion of the trail consisting of the levels 5 and 6 is saved, so after the backjump, we have the following state: 𝑀 = 𝑥𝑑0 𝑥𝑑1 𝑥𝑑2 ¬𝑥3 ¬𝑥𝑑4 𝑥5 𝑥6 ¬𝑥15 𝑀𝑠𝑎𝑣𝑒 = 𝑥𝑑7 ¬𝑥8 𝑥9 ¬𝑥𝑑10 𝑥11 ¬𝑥12 ¬𝑥13 5 Algorithm 1 Trail saving algorithm 1: SaveTrail(𝐿𝑏𝑎𝑐𝑘 ) 2: if 𝐿𝑑𝑒𝑒𝑝 = 𝐿𝑜𝑙𝑑 𝑏𝑎𝑐𝑘 then 3: 𝑀𝑠𝑎𝑣𝑒 ← ∅ ◁ Case when concatenation may spoil reason-soundness 4: 𝐿𝑜𝑙𝑑 𝑏𝑎𝑐𝑘 ← 𝐿𝑏𝑎𝑐𝑘 ◁ Remember the backjump level 5: 𝑀𝑠𝑎𝑣𝑒 ← 𝑀 [[𝐿𝑏𝑎𝑐𝑘 + 1...𝐿𝑑𝑒𝑒𝑝 − 1]] + 𝑀𝑠𝑎𝑣𝑒 ◁ Conflicting 𝐿𝑑𝑒𝑒𝑝 level is dropped 6: for all 𝑙𝑠𝑎𝑣𝑒 ∈ 𝑀 [[𝐿𝑏𝑎𝑐𝑘 + 1...𝐿𝑑𝑒𝑒𝑝 − 1]] do ◁ Saving reasons 7: 𝑟𝑒𝑎𝑠𝑜𝑛𝑠𝑎𝑣𝑒 (𝑙𝑠𝑎𝑣𝑒 ) ← 𝑟𝑒𝑎𝑠𝑜𝑛(𝑙𝑠𝑎𝑣𝑒 ) 8: 𝑝𝑖𝑣𝑜𝑡 ← 0 9: 10: UseSavedTrail() 11: while 𝑝𝑖𝑣𝑜𝑡 < |𝑀𝑠𝑎𝑣𝑒 | do ◁ For each unprocessed literal 𝑙𝑠𝑎𝑣𝑒 12: 𝑙𝑠𝑎𝑣𝑒 ← 𝑀𝑠𝑎𝑣𝑒 [𝑝𝑖𝑣𝑜𝑡] 13: if 𝑟𝑒𝑎𝑠𝑜𝑛𝑠𝑎𝑣𝑒 (𝑙𝑠𝑎𝑣𝑒 ) = ∅ then ◁ Saved decision literal 14: if 𝑀 |= 𝑙𝑠𝑎𝑣𝑒 then 15: 𝑝𝑖𝑣𝑜𝑡 ← 𝑝𝑖𝑣𝑜𝑡 + 1 ◁ Passing saved literal that is true on 𝑀 16: else ◁ Decisions are not inferred by 𝑀 17: return ∅ 18: else ◁ Saved inferred literal 19: if 𝑀 |= 𝑙𝑠𝑎𝑣𝑒 then 20: 𝑝𝑖𝑣𝑜𝑡 ← 𝑝𝑖𝑣𝑜𝑡 + 1 ◁ Passing saved literal that is true on 𝑀 21: else if 𝑀 |= ¬𝑙𝑠𝑎𝑣𝑒 then ◁ Conflict found on 𝑀𝑠𝑎𝑣𝑒 22: return 𝑟𝑒𝑎𝑠𝑜𝑛𝑠𝑎𝑣𝑒 (𝑙𝑠𝑎𝑣𝑒 ) 23: else ◁ Push saved inferred literal to 𝑀 24: 𝑀 ← 𝑀 + 𝑙𝑠𝑎𝑣𝑒 ; 𝑟𝑒𝑎𝑠𝑜𝑛(𝑙𝑠𝑎𝑣𝑒 ) ← 𝑟𝑒𝑎𝑠𝑜𝑛𝑠𝑎𝑣𝑒 (𝑙𝑠𝑎𝑣𝑒 ) 25: 𝑝𝑖𝑣𝑜𝑡 ← 𝑝𝑖𝑣𝑜𝑡 + 1 26: return ∅ 27: 28: ConfirmPropagatedSavedLiterals() 29: while 𝑝𝑖𝑣𝑜𝑡 > 0 do 30: 𝑝𝑜𝑝_𝑓 𝑟𝑜𝑛𝑡_𝑙𝑖𝑡𝑒𝑟𝑎𝑙(𝑀𝑠𝑎𝑣𝑒 ) ◁ Confirming 𝑀𝑠𝑎𝑣𝑒 changes 31: 𝑝𝑖𝑣𝑜𝑡 ← 𝑝𝑖𝑣𝑜𝑡 − 1 32: 33: FilterSavedTrail() 34: 𝑀𝑛𝑒𝑤 ← ∅ ◁ Filtered version of 𝑀𝑠𝑎𝑣𝑒 35: 𝑖←0 36: while 𝑖 < |𝑀𝑠𝑎𝑣𝑒 | do 37: 𝑙𝑠𝑎𝑣𝑒 ← 𝑀𝑠𝑎𝑣𝑒 [𝑖] 38: if 𝑙𝑠𝑎𝑣𝑒 ∈ / 𝑀𝑛𝑒𝑤 then ◁ Non-duplicates are preserved in 𝑀𝑠𝑎𝑣𝑒 39: 𝑀𝑛𝑒𝑤 ← 𝑀𝑛𝑒𝑤 + 𝑙𝑠𝑎𝑣𝑒 40: if ¬𝑙𝑠𝑎𝑣𝑒 ∈ 𝑀𝑛𝑒𝑤 then 41: break ◁ Cut the portion of 𝑀𝑠𝑎𝑣𝑒 after the first conflicting literal 42: 𝑖←𝑖+1 43: 𝑀𝑠𝑎𝑣𝑒 ← 𝑀𝑛𝑒𝑤 The reasons of saved inferred literals are also saved. For instance, we have 𝑟𝑒𝑎𝑠𝑜𝑛𝑠𝑎𝑣𝑒 (𝑥9 ) = {𝑥9 , ¬𝑥2 , 𝑥8 }, 𝑟𝑒𝑎𝑠𝑜𝑛𝑠𝑎𝑣𝑒 (¬𝑥8 ) = {¬𝑥8 , ¬𝑥7 }, 𝑟𝑒𝑎𝑠𝑜𝑛𝑠𝑎𝑣𝑒 (𝑥11 ) = {𝑥11 , ¬𝑥1 , 𝑥10 }, and fi- nally, 𝑟𝑒𝑎𝑠𝑜𝑛𝑠𝑎𝑣𝑒 (¬𝑥13 ) = {¬𝑥13 , 𝑥8 , 𝑥10 } (we will need these saved reasons in the later examples). 6 Using the saved literals. The function UseSavedTrail uses the literals from the list 𝑀𝑠𝑎𝑣𝑒 to speed up the inference. In the original algorithm [11], this procedure is invoked within the main propagation loop, before processing of each of the watch lists [13]. The function UseSavedTrail returns the conflict reason, if one is discovered during the processing of the saved trail, or ∅ if no conflict occurs. Note that saved decision literals may be passed during the processing of 𝑀𝑠𝑎𝑣𝑒 only if they are already true in 𝑀 – otherwise the processing of 𝑀𝑠𝑎𝑣𝑒 is stopped and may be continued later, if the decision literal in question becomes true in 𝑀 by some other mechanism (decision or propagation). On the other hand, saved inferred literal 𝑙𝑠𝑎𝑣𝑒 is either passed (if 𝑙𝑠𝑎𝑣𝑒 is already true in 𝑀 ), or pushed on 𝑀 (if 𝑙𝑠𝑎𝑣𝑒 is undefined in 𝑀 ), or the conflict is reported (if 𝑙𝑠𝑎𝑣𝑒 is false in 𝑀 ). In the second two cases, 𝑟𝑒𝑎𝑠𝑜𝑛𝑠𝑎𝑣𝑒 (𝑙𝑠𝑎𝑣𝑒 ) is used as a propagation (or conflict) explanation. Example 2. Continuing the previous example, assume that after the next two decides, the following state is reached: 𝑀 = 𝑥𝑑0 𝑥𝑑1 𝑥𝑑2 ¬𝑥3 ¬𝑥𝑑4 𝑥5 𝑥6 ¬𝑥15 𝑥𝑑17 ¬𝑥18 ¬𝑥𝑑19 𝑥20 𝑥7 ¬𝑥*8 𝑥*9 𝑀𝑠𝑎𝑣𝑒 = 𝑥𝑑7 ¬𝑥8 𝑥9 ¬𝑥𝑑10 𝑥11 ¬𝑥12 ¬𝑥13 Note that the literal 𝑥7 is now inferred (from the clause {¬𝑥20 , 𝑥7 }), and the literals ¬𝑥8 and 𝑥9 are copied from 𝑀𝑠𝑎𝑣𝑒 , once 𝑥7 became true in 𝑀 (we mark such literals with * in the exponent). The next literal ¬𝑥10 cannot be copied from 𝑀𝑠𝑎𝑣𝑒 , since it is a saved decision literal undefined in 𝑀 , so it is not a consequence of 𝑀 at this moment. Note also that 𝑀𝑠𝑎𝑣𝑒 is not changed by the function UseSavedTrail, i.e. the literals 𝑥7 , ¬𝑥8 , 𝑥9 are still on 𝑀𝑠𝑎𝑣𝑒 . Confirmation of the propagated saved literals. The function ConfirmPropagated- SavedLiterals is invoked before each decision. Namely, as we have seen in the previous example, the literals from 𝑀𝑠𝑎𝑣𝑒 that are pushed to 𝑀 must remain on 𝑀𝑠𝑎𝑣𝑒 until the propa- gation process at that level is exhausted, in order to maintain the reason-soundness of 𝑀𝑠𝑎𝑣𝑒 (otherwise, in case of a conflict, the SaveTrail function would drop the conflicting topmost level, possibly spoiling the reason-soundness of the remaining literals in 𝑀𝑠𝑎𝑣𝑒 ). When we are assured that there is no conflict at the topmost level, we may safely drop the processed prefix of 𝑀𝑠𝑎𝑣𝑒 , just before the new decision is made. Example 3. Continuing the previous example, recall that the literals 𝑥7 , ¬𝑥8 and 𝑥9 are still on 𝑀𝑠𝑎𝑣𝑒 , since their transfer to 𝑀 is not confirmed yet. In the present state of the trail we again have a conflict, this time with the clause {𝑥19 , 𝑥8 , ¬𝑥9 }. During the conflict analysis, we first resolve the literals 𝑥9 and ¬𝑥8 (using their saved reasons {𝑥9 , ¬𝑥2 , 𝑥8 } and {¬𝑥8 , ¬𝑥7 }), reach- ing the clause {𝑥19 , ¬𝑥2 , ¬𝑥7 }, and then resolve the literal 𝑥7 using its reason {𝑥7 , ¬𝑥20 } (the resolvent being the clause {𝑥19 , ¬𝑥2 , ¬𝑥20 }), and the literal 𝑥20 using its reason {𝑥19 , 𝑥3 , 𝑥20 }, finally obtaining the backjump clause {𝑥19 , ¬𝑥2 , 𝑥3 }. The backjump level is now 𝐿𝑏𝑎𝑐𝑘 = 3, and the new state is: 𝑀 = 𝑥𝑑0 𝑥𝑑1 𝑥𝑑2 ¬𝑥3 𝑥19 𝑀𝑠𝑎𝑣𝑒 = ¬𝑥𝑑4 𝑥5 𝑥6 ¬𝑥15 𝑥𝑑17 ¬𝑥18 𝑥𝑑7 ¬𝑥8 𝑥9 ¬𝑥𝑑10 𝑥11 ¬𝑥12 ¬𝑥13 Note that the levels 4 and 5 from the trail 𝑀 were prepended to 𝑀𝑠𝑎𝑣𝑒 before the backjump. Note also why it is important to keep the used literals on 𝑀𝑠𝑎𝑣𝑒 until their transfer to 𝑀 is 7 confirmed: since the conflicting level is never saved, if the literal ¬𝑥8 was not kept on 𝑀𝑠𝑎𝑣𝑒 when it was copied to 𝑀 , it would be lost after the backjump, and the literal ¬𝑥13 would stay on 𝑀𝑠𝑎𝑣𝑒 with an unsound saved reason {¬𝑥13 , 𝑥8 , 𝑥10 }. Now assume that after the next two decides we have the following state: 𝑀 = 𝑥𝑑0 𝑥𝑑1 𝑥𝑑2 ¬𝑥3 𝑥19 𝑥𝑑21 ¬𝑥4 𝑥*5 𝑥*6 ¬𝑥*15 𝑥𝑑22 ¬𝑥23 ¬𝑥24 𝑀𝑠𝑎𝑣𝑒 = 𝑥𝑑17 ¬𝑥18 𝑥𝑑7 ¬𝑥8 𝑥9 ¬𝑥𝑑10 𝑥11 ¬𝑥12 ¬𝑥13 Namely, after the decision 𝑥21 , the literal ¬𝑥4 was inferred from the clause {¬𝑥4 , ¬𝑥19 , ¬𝑥21 }, which once again unblocked the saved trail and the literals 𝑥5 , 𝑥6 and ¬𝑥15 were copied from 𝑀𝑠𝑎𝑣𝑒 to 𝑀 . Just before the next decision 𝑥22 , the copied literals from 𝑀𝑠𝑎𝑣𝑒 were confirmed on 𝑀 , i.e. they were removed from 𝑀𝑠𝑎𝑣𝑒 . Note that this is a safe operation – since these literals are now on the trail 𝑀 , but not at its topmost level, they will either be saved to 𝑀𝑠𝑎𝑣𝑒 or stay on 𝑀 after the next backjump, preserving the reason soundness of the literals on 𝑀𝑠𝑎𝑣𝑒 . The obtained state of 𝑀 is again conflicting, and this time the conflicting clause is {𝑥23 , 𝑥24 , ¬𝑥1 }. After explaining ¬𝑥24 with the clause {¬𝑥22 , ¬𝑥24 }, and ¬𝑥23 with the clause {¬𝑥22 , ¬𝑥23 }, we obtain the backjump clause {¬𝑥1 , ¬𝑥22 }. The backjump level is now 𝐿𝑏𝑎𝑐𝑘 = 2, and we reach the following state: 𝑀 = 𝑥𝑑0 𝑥𝑑1 ¬𝑥22 𝑀𝑠𝑎𝑣𝑒 = 𝑥𝑑2 ¬𝑥3 𝑥19 𝑥𝑑21 ¬𝑥4 𝑥5 𝑥6 ¬𝑥15 𝑥𝑑17 ¬𝑥18 𝑥𝑑7 ¬𝑥8 𝑥9 ¬𝑥𝑑10 𝑥11 ¬𝑥12 ¬𝑥13 Note that the literals ¬𝑥4 , 𝑥5 , 𝑥6 , ¬𝑥15 that were confirmed on 𝑀 are now again on 𝑀𝑠𝑎𝑣𝑒 . Filtering the saved trail. Note that concatenation of saved portions of the trail may result in duplicates in 𝑀𝑠𝑎𝑣𝑒 . As a consequence, the list 𝑀𝑠𝑎𝑣𝑒 may grow indefinitely. Moreover, con- flicting literals 𝑙 and ¬𝑙 may appear on 𝑀𝑠𝑎𝑣𝑒 , so the portion of 𝑀𝑠𝑎𝑣𝑒 after the two conflicting literals may be unreachable and, therefore, useless. For this reason, we should periodically filter the list 𝑀𝑠𝑎𝑣𝑒 , by removing the duplicates and all the literals after the first conflicting literal in 𝑀𝑠𝑎𝑣𝑒 . This is done by the function FilterSavedTrail. Note that the first conflicting literal ¬𝑙𝑠𝑎𝑣𝑒 is kept on 𝑀𝑠𝑎𝑣𝑒 , since it may help in discovering a conflict. The procedure FilterSavedTrail is invoked whenever the length of 𝑀𝑠𝑎𝑣𝑒 becomes greater than the total number of atoms of the formula 𝐹 . Preserving the reason soundness. It can be argued that the CDCL algorithm enhanced with the trail saving mechanism remains correct, provided that the reason soundness of 𝑀𝑠𝑎𝑣𝑒 is maintained during the operation of the solver. For the proof of this fact, we refer the interested reader to the original work of Hickey and Bacchus [11]. Here we only discuss the unique case when the concatenation of the saved trails may spoil the reason soundness, which is the case when a new conflict happens immediately after a backjump, while we are still at the level 𝐿𝑏𝑎𝑐𝑘 .2 This may happen because 𝐿𝑑𝑒𝑒𝑝 = 𝐿𝑏𝑎𝑐𝑘 in such a case, so the level 𝑀 [[𝐿𝑑𝑒𝑒𝑝 ]] = 𝑀 [[𝐿𝑏𝑎𝑐𝑘 ]] was not saved on the previous backjump (since then it was the backjump level), and it will 2 This issue was not discussed by Hickey and Bacchus [11], but they did implement the appropriate check for such a condition in their solver, which means that they were aware of it. 8 not be saved on the next backjump (since now it is the conflicting level). This can make saved reasons of some of the literals in 𝑀𝑠𝑎𝑣𝑒 invalid after the next backjump.3 The next (and final) example illustrates this phenomenon. Example 4. In the state from the previous example, the clause {¬𝑥1 , 𝑥22 , 𝑥25 } triggers the unit propagation of the literal 𝑥25 , which makes the trail 𝑀 in conflict with the clause {𝑥22 , ¬𝑥25 }. We first resolve the literal 𝑥25 with its reason, and obtain the clause {𝑥22 , ¬𝑥1 }, and then resolve the literal ¬𝑥22 with its reason {¬𝑥1 , ¬𝑥22 }, and obtain the backjump clause {¬𝑥1 }. The backjump level 𝐿𝑏𝑎𝑐𝑘 = 0 this time, so after the backjump we reach the following state: 𝑀 = ¬𝑥1 𝑀𝑠𝑎𝑣𝑒 = 𝑥𝑑0 Note that the previous state of 𝑀𝑠𝑎𝑣𝑒 was cleared before the new portion of 𝑀 (the literal 𝑥0 ) was prepended to it. This is because the conflict happened immediately after the backjump, before the next decide. In such a situation, the literals 𝑥1 and ¬𝑥22 from the conflicting level were neither saved, nor they were kept on 𝑀 , which might compromise the reason soundness of some literals on 𝑀𝑠𝑎𝑣𝑒 . For instance, recall that the saved reason of the literal 𝑥11 was {𝑥11 , ¬𝑥1 , 𝑥10 }, and this reason is no longer sound, so the concatenation of the saved portions of the trail is not possible. 4. Employing Trail Saving in SMT The described trail saving technique may be naturally extended to be used in the context of CDCL(T)-based SMT solvers, since such solvers are driven by a CDCL SAT solving engine. In this section, the CDCL(T) algorithm with the trail saving enabled is presented and discussed in detail. The overall structure of the CDCL(T) algorithm is not changed – the only thing that should be done to enable the trail saving is to invoke the trail saving functions described in the previous section at the appropriate places (which are mostly the same places as in the CDCL algorithm). The main loop of the CDCL(T) algorithm. The main loop of the algorithm is implemented in the function Solve (Algorithm 2). It first invokes UnitPropagate and TheoryPropagate procedures to do the inference. These procedures check for unit/theory propagations and conflicts, and will be discussed later in more detail. If a conflict is encountered, a conflict reason 𝐶 is returned. In that case, the procedure Solve starts the conflict analysis, by invoking the AnalyzeConflict4 function which returns the backjump clause 𝐶𝑏𝑎𝑐𝑘 and the backjump level 𝐿𝑏𝑎𝑐𝑘 . Before backjumping to the level 𝐿𝑏𝑎𝑐𝑘 , the procedure first saves the part of the trail that will be retracted (by calling the function SaveTrail). Then it performs the backjump by restoring 3 Note that there are cases when a conflict happens at 𝐿𝑏𝑎𝑐𝑘 , but the reason-soundness still holds after the concate- nation of the saved trails. This means that the resetting of the saved trail each time a conflict is encountered at 𝐿𝑏𝑎𝑐𝑘 may be a quite conservative strategy. In our implementation, we try to recognize (some of) such cases in order to avoid unnecessary resets of 𝑀𝑠𝑎𝑣𝑒 . 4 The pseudo-code of the function AnalyzeConflict is omitted (as well as of several other functions), in order to save space, since it does not invoke any of the trail saving functions. 9 Algorithm 2 CDCL(T) algorithm with the trail saving enabled: Solve function 1: Solve(𝐹 ) 2: while true do 3: while 𝑀 is changed do ◁ Unit and theory propagation loop 4: 𝐶 ← UnitPropagate() ◁ Returns a conflicting clause, or ∅ 5: if 𝐶 ̸= ∅ then ◁ If a conflict is found, exit the propagation loop 6: break 7: 𝐶 ← TheoryPropagate() ◁ Returns a theory conflict reason, or ∅ 8: if 𝐶 ̸= ∅ then ◁ If a conflict is found, exit the propagation loop 9: break 10: if 𝐶 ̸= ∅ then ◁ A conflict is discovered during the propagation 11: if 𝐿𝑑𝑒𝑒𝑝 = 0 then ◁ A conflict at the level 0 12: return UNSAT 13: (𝐿𝑏𝑎𝑐𝑘 , 𝐶𝑏𝑎𝑐𝑘 ) ← AnalyzeConflict(𝐶, 𝑀 ) ◁ 𝐶𝑏𝑎𝑐𝑘 is the backjump clause 14: SaveTrail(𝐿𝑏𝑎𝑐𝑘 ) ◁ Save the retracted portion of the trail 15: 𝑀 ← 𝑀 [[0...𝐿𝑏𝑎𝑐𝑘 ]] ◁ Backjump the trail to the level 𝐿𝑏𝑎𝑐𝑘 16: RestoreTheorySolverState(𝐿𝑏𝑎𝑐𝑘 ) ◁ Notify the theory solver 17: 𝐿𝑑𝑒𝑒𝑝 ← 𝐿𝑏𝑎𝑐𝑘 18: if |𝑀𝑠𝑎𝑣𝑒 | is greater than the total number of atoms in 𝐹 then 19: FilterSavedTrail() ◁ Filter 𝑀𝑠𝑎𝑣𝑒 if needed 20: 𝐹 ← 𝐹 ∪ {𝐶𝑏𝑎𝑐𝑘 } ◁ Learning the backjump clause triggers unit propagation 21: else 22: if all atoms from 𝐹 are assigned in 𝑀 then 23: return SAT 24: ConfirmPropagatedSavedLiterals() ◁ Confirm the changes of 𝑀𝑠𝑎𝑣𝑒 25: 𝑙𝑑 ← PickBranchingVariable(𝑀, 𝐹 ) 26: 𝑀 ← 𝑀 + {𝑙𝑑 }; 𝑟𝑒𝑎𝑠𝑜𝑛(𝑙𝑑 ) ← ∅ ◁ Make a new decision 27: 𝐿𝑑𝑒𝑒𝑝 ← 𝐿𝑑𝑒𝑒𝑝 + 1 the trail to 𝑀 [[0..𝐿𝑏𝑎𝑐𝑘 ]]. It also tells the theory solver to restore its state to the level 𝐿𝑏𝑎𝑐𝑘 , and performs the saved trail filtering if needed (by invoking the function FilterSavedTrail). Finally, the backjump clause is learned, triggering the next propagation cycle. On the other hand, if no conflict is encountered during the propagation, and there are still atoms from the formula 𝐹 that are unassigned in 𝑀 , we first confirm the changes of the saved trail by invoking the function ConfirmPropagatedSavedLiterals. Then we pick a literal for the next decide and push it to 𝑀 as a decision literal. Propagation functions and using the saved trail. The functions UnitPropagate and TheoryPropagate which are responsible for the propagation are given in Algorithm 3. The function UnitPropagate implements the well-known two-watched-literals algorithm for unit propagation [13]. It processes the literals from 𝑀 one by one in a loop. In each iteration of the loop, it first tries to use the literals from 𝑀𝑠𝑎𝑣𝑒 if possible, by invoking the function UseSavedTrail. If no conflict is found, then the next unprocessed literal 𝑙 is read from 𝑀 and the watch list of its opposite literal ¬𝑙 (which is false in 𝑀 ) is processed in the usual fashion. Note that for each propagated literal 𝑙, its reason (which is a clause of the formula 𝐹 ) is set. We assume that the function TheoryPropagate has a similar global structure, although its precise structure may depend on a concrete theory-specific procedure. It also processes the literals from 𝑀 in a loop, but not necessarily one by one (depending on a concrete procedure, it 10 Algorithm 3 CDCL(T) algorithm with the trail saving enabled: propagation functions 1: UnitPropagate() 2: while there are unprocessed literals on 𝑀 do 3: 𝐶 ← UseSavedTrail() ◁ First, try to use the saved trail 4: if 𝐶 ̸= ∅ then ◁ If a conflict is found, return the conflicting clause 𝐶 5: return 𝐶 6: 𝑙 is next-to-process literal from 𝑀 7: for all 𝐶 ∈ 𝑤𝑎𝑡𝑐ℎ𝑙𝑖𝑠𝑡(¬𝑙) do ◁ process the watchlist of the falsified literal ¬𝑙 8: 𝑙𝑎𝑙𝑡 ← FindAlternativeWatch(𝐶, ¬𝑙) 9: if 𝑙𝑎𝑙𝑡 ̸= ∅ then ◁ A new watch found 10: 𝑙𝑎𝑙𝑡 replaces ¬𝑙 as a watch literal of 𝐶 11: move 𝐶 from 𝑤𝑎𝑡𝑐ℎ𝑙𝑖𝑠𝑡(¬𝑙) to 𝑤𝑎𝑡𝑐ℎ𝑙𝑖𝑠𝑡(𝑙𝑎𝑙𝑡 ) 12: else ◁ All literals of 𝐶 are false, except, possibly, the other watch 𝑙′ 13: if the other watch 𝑙′ of 𝐶 is false in 𝑀 then 14: return 𝐶 ◁ 𝐶 is a conflicting clause 15: else if the other watch 𝑙′ of 𝐶 is undefined in 𝑀 then 16: 𝑀 ← 𝑀 + {𝑙′ } ◁ Unit propagation 17: 𝑟𝑒𝑎𝑠𝑜𝑛(𝑙′ ) = 𝐶 ◁ 𝐶 is the propagation reason 18: return ∅ 19: 20: TheoryPropagate() 21: while there are unprocessed literals on 𝑀 do 22: 𝐶 ← UseSavedTrail() ◁ First, try to use the saved trail 23: if 𝐶 ̸= ∅ then ◁ If a conflict is found, return the conflicting clause 𝐶 24: return 𝐶 25: 𝑙1 , . . . , 𝑙𝑛 is 𝑛 next-to-process literals from 𝑀 ◁ 𝑛 is theory-dependent 26: (𝐶, 𝑀𝑖𝑛𝑓 ) ← DoTheTheoryInference(𝑙1 , . . . , 𝑙𝑛 ) 27: if 𝐶 ̸= ∅ then ◁ If a theory conflict is found, return the theory conflict reason 𝐶 28: return 𝐶 29: for all 𝑙 in 𝑀𝑖𝑛𝑓 do ◁ Otherwise, push the inferred literals 𝑀𝑖𝑛𝑓 to 𝑀 30: 𝑀 ← 𝑀 + {𝑙} 31: 𝑟𝑒𝑎𝑠𝑜𝑛(𝑙) ← 𝑙𝑎𝑧𝑦 ◁ The reason will be generated lazily, if needed 32: return ∅ might process all unprocessed literals at once). The function UseSavedTrail is invoked once per the loop iteration, before the processing of the literals. The theory-specific inference is captured by the function DoTheTheoryInference, which, by assumption, returns either a theory conflict reason 𝐶, or the set of literals 𝑀𝑖𝑛𝑓 that are T-inferred from the prefix of 𝑀 up to the literal 𝑙𝑛 , in case when no theory conflict is encountered. In the latter case, the literals from 𝑀𝑖𝑛𝑓 are pushed to 𝑀 . However, their reasons are not set, since the reasons of T-inferred literals are usually generated by the theory solver lazily, i.e. only when needed, during the conflict analysis. Thus, we set 𝑟𝑒𝑎𝑠𝑜𝑛(𝑙) to a special value 𝑙𝑎𝑧𝑦, which serves as a placeholder for the true reason which may be calculated later, if needed. An issue that deserves a discussion here concerns the exact location and the frequency of invocation of the function UseSavedTrail within the propagation functions. In case of the unit-propagation, we exactly followed the approach from the original trail saving algorithm, presented by Hickey and Bacchus [11]. Moreover, we extended the same approach to the case of the theory propagation. That is, in both cases, the function UseSavedTrail is invoked once 11 per iteration of the propagation loop, before the literals from 𝑀 are processed. The question remains whether this approach is the best approach. The rationale behind the approach is that the saved trail should be consulted whenever there are new literals on 𝑀 , since some of them may unblock the next portion of 𝑀𝑠𝑎𝑣𝑒 (recall that the consumption of 𝑀𝑠𝑎𝑣𝑒 is stopped when a saved decision literal 𝑙 that is undefined in 𝑀 is encountered). Although we cannot guarantee that each iteration of the loop will indeed produce new literals on 𝑀 , it will be the case very often in practice. On the other hand, too frequent invocation of the function UseSavedTrail should not be an issue, since its unfruitful calls are not expensive (the function will return immediately if the next literal on 𝑀𝑠𝑎𝑣𝑒 is a saved decision literal undefined in 𝑀 ). For this reason, we decided to stick to that approach. Further analysis of this issue is left for the future work. Saving the theory propagation reasons. Another issue concerns saving of explanations of T-inferred literals. Recall that the function SaveTrail must save the propagation reasons of the inferred literals, together with the literals themselves. In case of unit-propagated literals, this is easy, since their reasons are the clauses of the formula 𝐹 (or some of the learned clauses), and they are immediately available. On the other side, the reasons of the T-inferred literals may not be known yet at the moment when the function SaveTrail is invoked, since we already said that the theory solvers usually produce reasons of propagations lazily, i.e. only when a reason is required during the conflict analysis. This problem can be resolved in two ways: • The theory solver may be asked to produce the reasons of all saved T-inferred literals eagerly, when the trail is saved during the execution of the SaveTrail function. The produced reasons are then saved in the same way as the reasons of the unit-propagated literals. This approach is very easy to implement, since it does not require any modification of the theory solver. On the other hand, producing all the reasons eagerly may be too expensive, since only a portion of them will be actually needed during the conflict analysis. • Instead of saving the reason of a T-inferred literal 𝑙, we may only save the information that the theory solver is responsible for producing the reason of 𝑙 later.5 The reason will be produced lazily, when it is needed during the conflict analysis. In other words, the idea is to follow the usual approach used in SMT. The main problem with this approach is that the theory solver may not be able to produce the reason of a literal that is, technically, not pushed to 𝑀 as a T-inferred literal during the last propagation cycle, but it is copied from 𝑀𝑠𝑎𝑣𝑒 as a saved T-inferred literal from some previous propagation cycle. This means that the internal state of the theory solver has been changed meanwhile, possibly multiple times over consecutive backjumps. This also holds for the data structures within the theory solver that are used by the reason-generating procedures. Such data structures may not contain information needed for explaining the literals that are not propagated to 𝑀 by the theory solver in that propagation cycle, even if those literals are indeed T-consequences of the trail 𝑀 . In order to resolve this issue, non-trivial modification of the theory solver’s data structures may be needed, making this approach much more difficult to implement. 5 In case of multiple theory solvers, when a combination of theories is considered, this information should contain a reference to the theory solver responsible for producing the reason of 𝑙 in the future. 12 In our work we follow the first approach, due to its simplicity. Examining the second approach is left for the future work. 5. Implementation and Evaluation For the purpose of evaluation, the trail saving method described in previous sections is imple- mented within the SMT solver argosmt6 , which is an open-source CDCL(T)-based SMT solver implemented in the C++ programming language for research purposes. The solver includes two theory solvers: • The EUF theory solver based on the congruent closure procedure described in [14]. This procedure maintains a data structure called the proof forest, which is used for generation of propagation (and conflict) reasons. The state of this data structure is restored when a backjump is performed, so it cannot be used for explaining the propagations from the previous propagation cycles. Therefore, during the trail saving operation, this theory solver must produce all the reasons of the saved literals eagerly, and this is what is done in our implementation. Notice that the eager reason generation approach may induce significant overhead in this case, since the generation of the propagation reasons within the EUF theory solver may be time-consuming. • The linear arithmetic theory solver based on the simplex procedure [15]. This theory solver is optimized for real arithmetic, but it also includes some basic support for integer constraints (based on the branch-and-bound method [15]). Note that this theory solver may be used for solving the difference logic problems, too. Unlike the EUF theory solver, the reasons for theory propagations produced by the simplex-based theory solver are much cheaper to generate. Therefore, the expectations are that the overhead of the eager reason generation approach will not be as significant in this case. We evaluated the implementation on SMT-LIB [16] instances. All instances from all non- incremental QF_LRA, QF_RDL and QF_UF benchmark families were included in the evaluation.7 The experiments were run on a computer with four AMD Opteron 6168 1.6GHz 12-core proces- sors (that is, 48 cores in total), and with 94GB of RAM. For comparison, each instance was evaluated with and without trail saving enabled (the two versions of the solver are denoted by argosmt-ts and argosmt-nts, respectively). In both cases, time limit per instance was 1200 seconds. After the results were obtained, several benchmark families were excluded from the further analysis, either because they were too easy for our solver (all instances were solved in less then a second on average by both argosmt-nts and argosmt-ts) or too hard (no instance was solved in the given time limit by neither of the solver’s variants).8 The final results are presented in Table 1. 6 The solver is available at github: https://github.com/milanbankovic/argosmt. To experiment with the trail saving, checkout the branch ts and then follow the instructions in the file README.md. 7 Instances are available at: https://clc-gitlab.cs.uiowa.edu:2443/SMT-LIB-benchmarks 8 Too easy: keymaera, meti-tarski, sal, spider_benchmarks (QF_LRA), CLEARSY, Rodin (QF_UF). Too hard: tropical- matrix (QF_LRA). 13 Table 1 The results of the evaluation of the trail saving method on the selected SMT-LIB benchmark families. The left side of the table shows the numbers of solved instances and average solving times for argosmt-nts and argosmt-ts. Times are given in seconds. Time limit per instance was 20 minutes. For unsolved instances, twice the timeout was used when the average solving time was calculated. The winners are printed in boldface. The right side of the table shows some trail saving statistics, on average per instance (percent of backjumps by more than one level, average numbers of saved levels and saved literals per SaveTrail call, percents of propagations originating from the saved trail). avg. time (nts) % saved props # saved levels % deep bjmps # solved (nts) avg. time (ts) # solved (ts) # saved lits # instances Family Logic uart QF_LRA 73 61 508 59 545 81.9 11 252 16.7 Heizmann QF_LRA 58 9 2063 9 2055 89.5 10.7 52.4 18.1 tta_startup QF_LRA 72 64 321.3 62 376.4 76.5 11.4 430 12.3 miplib QF_LRA 42 10 1861.5 10 1853.5 77.3 7.6 142.8 5.68 DTP-Sched QF_LRA 91 89 62 89 62.3 99.3 34.3 148.8 8.8 clock_synchro QF_LRA 36 25 749.5 27 658.4 62.5 10.9 83 12 latendresse QF_LRA 18 15 404.7 15 404.7 100 14.6 32.6 0.42 2019-ezsmt QF_LRA 105 72 796.1 73 785.5 82.9 8.07 173.4 4.05 TM QF_LRA 25 19 588.8 20 538.7 61 7.9 180.3 8.1 SV-COMP QF_LRA 94 21 1959.7 27 1834.6 88.9 49.3 214.7 11.8 CoopT2 QF_LRA 202 54 1879 48 1962.8 91 86.7 426 10.5 sc QF_LRA 144 132 246.9 132 247.1 85 14 205.8 14 Ultimate QF_LRA 123 42 1710.1 44 1713.7 92 366.7 818.1 8.5 Total QF_LRA 1083 613 1113.6 615 1112.5 84.1 51.1 287.7 10.7 sal QF_RDL 60 42 765.5 42 764.8 46.3 5.5 231.6 7.2 scheduling QF_RDL 106 51 1306 56 1192 95.1 15.8 56.7 19.8 skdmxa QF_RDL 36 17 1462 16 1513 61.7 69.4 2472 12.3 temporal QF_RDL 51 49 167.2 49 159.9 95.8 87 235.8 10.3 Total QF_RDL 253 159 970.6 163 928.5 80.1 40.5 395.7 13 eq_diamond QF_UF 100 22 1887 21 1907 17 1.2 2.5 0.9 hwbench QF_UF 773 769 17.1 766 33.4 39.7 14.6 286.4 5.5 QG-class QF_UF 6396 6334 34.7 6321 43.7 49.4 4.7 331.2 20.7 SEQ QF_UF 56 48 411.5 41 707.9 59.2 18.6 56.4 20.5 NEQ QF_UF 48 34 818.9 26 1249 51.2 58 120.6 11.7 PEQ QF_UF 47 27 1162 21 1423 44.7 4 13.5 14.4 Total QF_UF 7420 7234 72.9 7196 89.3 48.6 5.6 323.8 19 The results show that the trail saving technique has a positive effect in case of some QF_LRA benchmark families, but there are also QF_LRA benchmark families for which the trail saving algorithm does not improve the solver’s performance (either it does not have any effect, or even induces some degradation of performance). On average, there is no significant difference between argosmt-nts and argosmt-ts on QF_LRA instances. On the other hand, the results are positive in general for QF_RDL instances, where argosmt-ts performs better on three of four benchmark families. Finally, the trail saving technique exhibits a bad performance on 14 QF_UF instances on all benchmark families. These claims are also supported by the plots given in Figure 1. Figure 1: Per instance scatter plots (top) and survival plots (bottom) for: QF_LRA (left), QF_RDL (middle) and QF_UF (right). Times are given in seconds. The consistently bad performance on QF_UF instances is somewhat expected, and may be explained by the expensiveness of the eager reason generation in case of the EUF theory solver. Moreover, the EUF solver is not expected to benefit much from the saved literals, since it must process all these literals again and perform merging of the corresponding congruence classes anyway [14]. On the other hand, prior to the experiments it was expected that the simplex-based arithmetic solver would benefit from the trail saving. Indeed, not only the eager reason generation is cheap, but we can also expect that copying the saved literals from 𝑀𝑠𝑎𝑣𝑒 instead of propagating them again by the theory solver might help in avoiding some very expensive pivoting steps within the simplex procedure [15]. However, such a hypothesis is not supported by our experimental results, since the method does not introduce any consistent improvement on QF_LRA benchmarks. Another hypothesis was that the obtained performance improvement should be proportional to the degree of utilization of the saved trail. In order to test this hypothesis, we calculated some statistics concerning the trail saving (also presented in Table 1), such as how often the trail was saved on backjumps (that is, the percent of backjumps for more than one level), and how many levels and literals were saved on average per backjump. Also, the last column in Table 1 shows the average percent of propagations that actually came from the saved trail, which is probably the most accurate measure of the saved trail’s utilization during the operation of the solver. Surprisingly, we did not find any correlation between any of these features and the performance of the trail saving algorithm, neither at the per-family level nor at the per-instance level. In addition to the main experimental evaluation presented in this section, we have also evaluated our solver on all QF_UFLRA benchmark families from SMT-LIB, in order to see how the trail saving method performs when the combination of the two theories is considered. These 15 results are not presented here, since there were not almost any difference in behaviour between argosmt-nts and argosmt-ts on these instances. We also did some preliminary evaluation on QF_LIA instances, but taken only from several randomly selected benchmark families, which was not sufficient to draw some general con- clusions. However, the obtained results suggest the similar behaviour as on QF_LRA instances (that is, the behaviour greatly depends on the chosen benchmark family9 ). Finally, let us mention that Hickey and Bacchus in their original work [11] also experimented with two additional enhancements of the basic trail saving method (one of them considers the quality of saved reasons and the other considers using the saved decisions as lookaheads for conflicts). We have implemented and evaluated the same enhancements in our solver, but contrary to the results of Hickey and Bacchus, our evaluation on the selected SMT-LIB instances did not show any significant impact of these enhancements. 6. Conclusions and Further Work In this paper, we have discussed the potential of using the method of trail saving within SMT solvers. For the purpose of evaluation, we have implemented the trail saving method within a CDCL(T)-based SMT solver argosmt, equipped with two theory solvers which are among the most commonly implemented within modern SMT solvers – the congruence closure EUF theory solver, and simplex-based linear arithmetic theory solver. Our experimental results show that the trail saving exhibits a consistently bad performance on EUF instances, which is probably caused by the expensiveness of the eager reason generation strategy. On the other hand, the effectiveness of the trail saving on linear arithmetic instances greatly depends on the chosen benchmark family, and further investigation is needed in order to discover the exact traits of the benchmarks that determine the behaviour of the trail saving. Our hypothesis that the efficiency gains should be proportional to the degree of utilization of the saved trail was not supported by the obtained experimental results. This means that the main effect of the trail saving may not be in speeding up the propagation, which was our starting assumption. The main effect is probably the influence on the search, since the saved reasons influence the conflict analysis and the learned backjump clauses. The main line of the further work is, therefore, to perform a more detailed analysis of these effects. Another further research direction is to evaluate the method on other important theories used in SMT (such as theory of arrays, bitvectors and so on). Finally, since the implementation details may significantly influence the obtained experimental results, it might be needed to implement the technique within some state-of-the-art SMT solver, in order to more reliably evaluate the potential of the trail saving method in SMT solving. Acknowledgments This work was partially supported by the Serbian Ministry of Science grant 174021. We are very grateful to the anonymous reviewers whose insightful comments and remarks helped us to make this text much better. 9 For example, trail saving performed well on the convert, rings and tropical_matrix benchmark families, but not on the slacks, mathsat and wisa families. 16 References [1] A. Biere, A. Cimatti, E. M. Clarke, O. Strichman, Y. Zhu, Bounded model checking., in: Handbook of satisfiability, volume 185, IOS Press, 2009, pp. 457–481. [2] D. Kroening, Software verification, in: Handbook of Satisfiability, IOS Press, 2009, pp. 505–532. [3] M. Vujošević-Janičić, V. Kuncak, Development and evaluation of LAV: an SMT-based error finding platform, in: International Conference on Verified Software: Tools, Theories, Experiments, Springer, 2012, pp. 98–113. [4] R. Mukherjee, D. Kroening, T. Melham, Hardware verification using software analyzers, in: 2015 IEEE Computer Society Annual Symposium on VLSI, IEEE, 2015, pp. 7–12. [5] F. Marić, Fast formal proof of the Erdős–Szekeres conjecture for convex polygons with at most 6 points, Journal of Automated Reasoning 62 (2019) 301–329. [6] M. Scheucher, Two disjoint 5-holes in point sets, Computational Geometry 91 (2020) 101670. [7] M. Bofill, M. Palahí, J. Suy, M. Villaret, Solving constraint satisfaction problems with SAT modulo theories, Constraints 17 (2012) 273–303. [8] N. Tamura, M. Banbara, Sugar: A CSP to SAT translator based on order encoding, Pro- ceedings of the Second International CSP Solver Competition (2008) 65–69. [9] J. Marques-Silva, I. Lynce, S. Malik, Conflict-Driven Clause Learning SAT Solvers, in: Handbook of Satisfiability, IOS Press, 2009, pp. 131–155. [10] C. Barrett, R. Sebastiani, S. A. Seshia, C. Tinelli, Satisfiability Modulo Theories, in: Handbook of Satisfiability, IOS Press, 2009, pp. 825–885. [11] R. Hickey, F. Bacchus, Trail saving on backtrack, in: International Conference on Theory and Applications of Satisfiability Testing, Springer, 2020, pp. 46–61. [12] R. Nieuwenhuis, A. Oliveras, C. Tinelli, Solving SAT and SAT Modulo Theories: From an abstract Davis–Putnam–Logemann–Loveland procedure to DPLL(T), Journal of the ACM (JACM) 53 (2006) 937–977. [13] M. W. Moskewicz, C. F. Madigan, Y. Zhao, L. Zhang, S. Malik, Chaff: Engineering an Efficient SAT Solver, in: Annual ACM IEEE Design Automation Conference, ACM, 2001, pp. 530–535. [14] R. Nieuwenhuis, A. Oliveras, Fast congruence closure and extensions., Information and Computation 205 (2007) 557–580. [15] B. Dutertre, L. d. Moura, A fast linear-arithmetic solver for DPLL(T), in: International Conference on Computer Aided Verification, Springer, 2006, pp. 81–94. [16] C. Barrett, P. Fontaine, C. Tinelli, The SMT-LIB Standard: Version 2.6, Technical Re- port, Department of Computer Science, The University of Iowa, 2017. Available at www.SMT-LIB.org. 17