First-Order Instantiation using Discriminating Terms Chad E. Brown1 , Mikoláš Janota1 1 Czech Technical University in Prague, Czech Institute of Informatics, Robotics and Cybernetics, Jugoslávských partyzánů 1580/3, 160 00 Prague 6, Dejvice, Czech Republic Abstract This paper proposes a technique to limit the number of possible terms to be considered in quantifier instantiation. One of the major hurdles that SMT solvers face when dealing with quantifiers is that there are simply too many terms to instantiate with. So even if the right set of terms is available to the solver, meaning they appear in the formula, the solver might not have enough resources to come upon the right combination. This motivates the technique presented in this paper, which instantiates only by a certain type of terms, called discriminating terms. The paper introduces a class of formulas, where the proposed technique has a considerable impact. Keywords SMT, quantifiers, instantiation 1. Introduction Quantifiers represent one of the major challenges for contemporary SMT solvers and since typically they lead to undecidability or extreme computational complexity, they are likely to remain a challenge for times to come. Most commonly, the general techniques for dealing with quantifiers gradually instantiate the quantified part of the formula with ground terms until the resulting ground formula becomes unsatisfiable. The terms to be used in instantiations may be chosen either by syntactic properties (E-matching [1]) or semantic properties (e.g. model-based quantifier instantiation [2]). Interest- ingly, it has been shown that these techniques do not always pay off and simple enumeration of terms gives better results in some cases [3]. This is where this paper comes in. We propose a technique to limit the set of terms to enumerate. Roughly speaking, in the context of first order logic with equality, a term is labeled as discriminating if it participates in a disequality. We have modified the enumeration instantiation algorithm in CVC4 [4] so that only discrimi- nating terms are considered. We further construct a family of formulas where this approach demonstrably helps. SMT 2021: 19th International Workshop on Satisfiability Modulo Theories, July 18-19, 2021, Los Angeles, CA " Mikolas.Janota@cvut.cz (M. Janota) ~ http://people.ciirc.cvut.cz/~janotmik/ (M. Janota)  0000-0003-3487-784X (M. Janota) © 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings http://ceur-ws.org ISSN 1613-0073 CEUR Workshop Proceedings (CEUR-WS.org) 17 2. A Class of Problems Kaminski and Smolka [5] consider an identity that holds over the booleans: 𝑓 (𝑓 (𝑓 (𝑐))) = 𝑓 (𝑐). Here 𝑐 is a boolean and 𝑓 is a unary function on booleans. It is easy to informally see why this identity holds by considering all four possible interpretations of 𝑓 . Obviously the identity also holds if the domain of interest has only one element. Hence one can make an easy first-order problem by including an axiom ∀𝑥𝑦𝑧.𝑥 = 𝑦 ∨ 𝑥 = 𝑧 ∨ 𝑦 = 𝑧 stating there are at most two elements and making the conclusion 𝑓 (𝑓 (𝑓 (𝑐))) = 𝑓 (𝑐). A formal proof would proceed by equational reasoning after instantiating the quantifiers using the subterms of the conjecture: 𝑐, 𝑓 (𝑐), 𝑓 (𝑓 (𝑐)) and 𝑓 (𝑓 (𝑓 (𝑐))). This problem can be made slightly more difficult by including 𝑛 unary functions 𝑔0 , . . . , 𝑔𝑛−1 and writing the conclusion as 𝑓 (𝑓 (𝑓 (𝑔0 (· · · 𝑔𝑛−1 (𝑐) · · · )))) = 𝑓 (𝑔0 (· · · 𝑔𝑛−1 (𝑐) · · · )). The modified problem has 𝑛 + 4 subterms instead of only 4. The subterms of the form 𝑔𝑖 (· · · 𝑔𝑛−1 (𝑐) · · · ) with 𝑖 > 0 are red herrings. The four subterms 𝑓 𝑘 (𝑔0 (· · · 𝑔𝑛−1 (𝑐) · · · )) with 𝑘 ∈ {0, 1, 2, 3} are sufficient to use as instantiations to complete the proof. The problems can also be made more difficult in a different way. Following a proof from [6] we will argue that for each natural number 𝑚 > 0 there are natural numbers 𝑚1 > 0 and 𝑚2 ≥ 0 such that 𝑓 𝑚1 +𝑚2 (𝑐) = 𝑓 𝑚2 (𝑐) if the domain of interest has at most 𝑚 elements. If 𝑚 = 2, we can take 𝑚1 = 2 and 𝑚2 = 1 as above. More generally, we choose 𝑚2 as 𝑚 − 1 and 𝑚1 is the least common multiple (lcm) of the sequence 2, . . . , 𝑚. The reason behind these choices are explored in the following subsection. To construct the family of formulae in question, define atmost𝑚 to be the first-order formula ⋁︁ ∀𝑥0 · · · 𝑥𝑚 . 𝑥𝑖 = 𝑥𝑗 0≤𝑖<𝑗≤𝑚 and let kam𝑛𝑚 be the first-order formula atmost𝑚 ∧ 𝑓 lcm({2,...,𝑚})+𝑚−1 (𝑔0 (· · · 𝑔𝑛−1 (𝑐) · · · )) ̸= 𝑓 𝑚−1 (𝑔0 (· · · 𝑔𝑛−1 (𝑐) · · · )). 2.1. Unsatisfiability of kam𝑛𝑚 Following the proof (and terminology) of Theorem 7 in [6] we show that for an interpretation of size at most 𝑚 the following identity holds: 𝑓 lcm({2,...,𝑚})+𝑚−1 (𝑐) = 𝑓 𝑚−1 (𝑐) The intuition for the equality is as follows. The sequence 𝑐, 𝑓 (𝑐), . . . , 𝑓 𝑘 (𝑐), . . . must even- tually repeat. Both sides of the equation will be in the part that repeats and the length of the repeating part will be a number at most 𝑚. Since this length divides lcm({2, . . . , 𝑚}) we will be able to conclude 𝑓 lcm({2,...,𝑚}) (𝑓 𝑚−1 (𝑐)) = 𝑓 𝑚−1 (𝑐) Let us now expand this idea more carefully. 18 Consider an arbitrary interpretation of the function 𝑓 and the constant 𝑐 assuming that the universe has at most 𝑚 elements. For the purpose of this subsection, we abuse notation by writing 𝑓 (· · · ) for the value of 𝑓 under such interpretation and write 𝑐 for the value of 𝑐 under the interpretation. By the pigeonhole principle there must exist 𝑞1 and 𝑞2 such that 0 ≤ 𝑞2 < 𝑞1 ≤ 𝑚 such that 𝑓 𝑞1 (𝑐) = 𝑓 𝑞2 (𝑐). Let 𝑞1 and 𝑞2 be the least numbers with this property. Following [6] we call 𝑞1 the size and 𝑞2 the prefix. We also call the positive number 𝑞1 − 𝑞2 the lasso and say 𝑓 𝑘 (𝑐) is in the lasso if 𝑘 ≥ 𝑞2 . The sequence can be written as follows: 𝑞2 −1 ⏞ ⏟ 𝑓 0 (𝑐), . . . , 𝑓 𝑞2 −1 (𝑐), 𝑓 𝑞2 (𝑐), . . . , 𝑓 𝑞1 −1 (𝑐), . . . , 𝑓 𝑞2 +𝑗(𝑞1 −𝑞2 ) (𝑐), . . . , 𝑓 𝑞1 −1+𝑗(𝑞1 −𝑞2 ) (𝑐), . . . ⏟ ⏞ ⏟ ⏞ 𝑞1 −𝑞2 𝑞1 −𝑞2 (1) For every 𝑓 𝑘 (𝑐) in the lasso it is clear that 𝑓 𝑗(𝑞1 −𝑞2 ) (𝑓 𝑘 (𝑐)) = 𝑓 𝑘 (𝑐). Since 𝑞2 ≤ 𝑚 − 1 we know 𝑓 𝑚−1 (𝑐) is in the lasso. Since 𝑞1 ≤ 𝑚 we know the lasso is at most 𝑚 and thus divides lcm({2, . . . , 𝑚}). Hence we know 𝑓 lcm({2,...,𝑚}) (𝑓 𝑚−1 (𝑐)) = 𝑓 𝑚−1 (𝑐) as desired. 3. Quasidiscriminating Terms The tableau calculus from [7] restricts first-order quantifier instantiation to so-called discrimi- nating terms, i.e., terms that occur on one side of a disequation on the branch. A consequence of the completeness proof for the tableau calculus is a refinement of Herbrand’s theorem indicating that (in the presence of the tableau calculus rules or analogous rules) instantiating with members of the universe of discriminating terms is sufficient to lead to an inconsistency, if the branch is inconsistent. The change of setting from the tableau calculus of [7] to SMT means discriminating terms are not always sufficient. As a simple example we consider kam02 . Assume we have clause normalized so that there is one quantified formula ∀𝑥𝑦𝑧.𝑥 = 𝑦 ∨ 𝑥 = 𝑧 ∨ 𝑦 = 𝑧 and one disequation 𝑓 (𝑓 (𝑓 (𝑐))) ̸= 𝑓 (𝑐). There are only two discriminating terms: 𝑓 3 (𝑐) and 𝑓 (𝑐). Instantiating 𝑥, 𝑦 and 𝑧 with these two terms will always lead to at least two of 𝑥, 𝑦 and 𝑧 being the same term so that the resulting disjunction will always have a literal that is trivial by reflexivity. In the calculus of [7] there is a decomposition rule that would add 𝑓 (𝑓 (𝑐)) ̸= 𝑐 to the branch since 𝑓 3 (𝑐) ̸= 𝑓 (𝑐) is on the branch. This new disequation means there are now four discriminating terms. As discussed above, these four terms are sufficient to use as instantiations to derive a contradiction. One option would be to extend CVC4 to behave in ways that simulate the additional rules of [7]. In the example above, this would mean when the current propositional model sets the literal 𝑓 3 (𝑐) = 𝑓 (𝑐) to false, CVC4 could mimic the decomposition rule by adding a propositional 19 discriminating 250 enumeration z3 default 200 150 time (s) 100 50 0 0 100 200 300 400 500 instances Figure 1: Cactus plot of the results. clause corresponding to 𝑓 3 (𝑐) = 𝑓 (𝑐) ∨ 𝑓 2 (𝑐) ̸= 𝑐. Further tableau rules that would need to have a similar counterpart are the mating and confrontation rules. We have chosen a simpler, heuristic approach without attempting to maintain completeness. However, a heuristic that restricts to discriminating terms without simulating the tableau rules would be far too restrictive. An intermediate heuristic is to restrict to terms that would be discriminating if the decomposition rule were included. We call these quasidiscriminating terms. As a technical definition, we say a pair (𝑠, 𝑡) is a discriminating pair if the literal 𝑠 = 𝑡 is as- signed false by the propositional model. We recursively define quasidiscriminating pairs (𝑠, 𝑡) as follows: Every discriminating pair is a quasidiscriminating pair. If (𝑓 (𝑠1 , . . . , 𝑠𝑛 ), 𝑓 (𝑡1 , . . . , 𝑡𝑛 )) is a quasidiscriminating pair, then (𝑠𝑖 , 𝑡𝑖 ) is a quasidiscriminating pair for each 𝑖 ∈ {1, . . . , 𝑛} where 𝑠𝑖 and 𝑡𝑖 are not the same term. A term 𝑠 is quasidiscriminating if there is some 𝑡 such that (𝑠, 𝑡) or (𝑡, 𝑠) is a quasidiscriminating pair. We have modified the enumeration instantiation algorithm in CVC4 [4] so that only quasidis- criminating terms are considered. 4. Results The problems used for evaluation are the formulas kam𝑛𝑚 as defined above. The parameters were chosen as follows. The parameter 𝑚 ranges between 4..10 and the value of 𝑛 ranges between 0..99 for 𝑚 ∈ 4..8 and it ranges between 0..9 for 𝑚 ∈ 9..10. Recall that the parameter 𝑚 represents the domain size and 𝑛 the number of the “dummy” terms 𝑔. For the comparison we considered the default version of CVC4, CVC4 run only in the enumeration mode, and Z3. Figure 1 shows a cactus plot for the experiment results under 5-minute timeout (300s). Table 1 breaks down the number of solved instances by the domain size, i.e. by the parameter 𝑚. 20 domain discriminating enumeration Z3 default 4 100 100 75 28 5 100 100 44 0 6 100 100 29 0 7 100 100 24 0 8 100 100 21 0 9 10 10 10 0 10 10 1 10 0 Total (520) 520 511 213 28 Table 1 Results Our strategy using discriminating terms solved all the considered benchmarks very quickly except for the largest ones. CVC4’s enumerative is also performing quite well but the time starts to increase much more quickly and eventually it times out on the largest problems. The default version of CVC4 performs rather poorly; our explanation for this is that the conflict-based instantiation [8] is taking up too much time because of the deep terms. The results for Z3 are surprising because it can successfully solve the largest instances but tends to fail on the smaller ones in a somewhat nonuniform fashion. We have also tried running Vampire [9] but that has timed out on all the considered problems. 5. Summary and Future Work This paper proposes a way to restrict possible candidates term for quantifier instantiation by looking at syntactic properties of the given formula. In particular, we consider only terms that participate in a disequality. We construct a family of formulas where this technique has demonstrably the best results. The presented techniques opens a number of avenues for future work. Since the presented technique disregards theories, a natural generalization would be to include theory-specific pred- icates, other than just disequality. For instance, in the context of arithmetic strict comparisons (<) imply disequality and therefore could be used in a similar fashion. While the technique is clearly performing well on the constructed family of formulas, as of now we don’t have any dividends that is helpful on general formulas. We conjecture that this might happen in problems with many function nesting but also, careful integration with other techniques will be needed. A natural next step to take would be to run the modified CVC4 over the problem sets in the SMT-LIB to obtain data for how often the quasi-discriminating terms technique is helpful and how often it is not. The family of formulas proposed here is interesting on its own, if only because they are easy to understand and yet present a challenge to existing SMT solvers and first-order automated theorem provers. In general, it is unclear whether it is better to instantiate with deeper terms or with more shallow terms. In the provided family, the outermost terms are actually the right ones and the inner ones are “red herrings.” But one can envision scenarios where the opposite is true. So the question is, how to distinguish scenarios like these. 21 Acknowledgments The results were supported by the Ministry of Education, Youth and Sports within the dedicated program ERC CZ under the project POSTMAN no. LL1902. This scientific article is part of the RICAIP project that has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 857306. References [1] D. Detlefs, G. Nelson, J. B. Saxe, Simplify: a theorem prover for program checking, J. ACM 52 (2005) 365–473. doi:10.1145/1066100.1066102. [2] Y. Ge, L. M. de Moura, Complete instantiation for quantified formulas in satisfiability modulo theories, in: Computer Aided Verification, 21st International Conference, CAV, 2009, pp. 306–320. doi:10.1007/978-3-642-02658-4\_25. [3] A. Reynolds, H. Barbosa, P. Fontaine, Revisiting enumerative instantiation, in: Tools and Algorithms for the Construction and Analysis of Systems, volume 10806, 2018, pp. 112–131. doi:10.1007/978-3-319-89963-3\_7. [4] C. W. Barrett, C. L. Conway, M. Deters, L. Hadarean, D. Jovanovic, T. King, A. Reynolds, C. Tinelli, CVC4, in: G. Gopalakrishnan, S. Qadeer (Eds.), Computer Aided Verification - 23rd International Conference, CAV, volume 6806, Springer, 2011, pp. 171–177. doi:10. 1007/978-3-642-22110-1\_14. [5] M. Kaminski, G. Smolka, A finite axiomatization of propositional type theory in pure lambda calculus, in: Reasoning in Simple Type Theory: Festschrift in Honor of Peter B. Andrews on His 70th Birthday, College Publications, 2008, pp. 243–258. [6] M. P. Bonacina, C. A. Lynch, L. de Moura, On deciding satisfiability by theorem proving with speculative inferences, Journal of Automated Reasoning 47 (2011) 161–189. doi:10. 1007/s10817-010-9213-y. [7] C. E. Brown, G. Smolka, Analytic tableaux for simple type theory and its first-order fragment, Logical Methods in Computer Science 6 (2010). doi:10.2168/LMCS-6(2:3)2010. [8] A. Reynolds, C. Tinelli, L. M. de Moura, Finding conflicting instances of quantified formulas in SMT, in: Formal Methods in Computer-Aided Design, FMCAD 2014, Lausanne, Switzer- land, October 21-24, 2014, IEEE, 2014, pp. 195–202. doi:10.1109/FMCAD.2014.6987613. [9] L. Kovács, A. Voronkov, First-Order Theorem Proving and Vampire, in: International Conference on Computer Aided Verification, volume 8044, 2013, pp. 1–35. doi:10.1007/ 978-3-642-39799-8_1. 22