First-Order Instantiation using Discriminating Terms
Chad E. Brown1 , Mikoláš Janota1
1
 Czech Technical University in Prague, Czech Institute of Informatics, Robotics and Cybernetics, Jugoslávských
partyzánů 1580/3, 160 00 Prague 6, Dejvice, Czech Republic


                                         Abstract
                                         This paper proposes a technique to limit the number of possible terms to be considered in quantifier
                                         instantiation. One of the major hurdles that SMT solvers face when dealing with quantifiers is that there
                                         are simply too many terms to instantiate with. So even if the right set of terms is available to the solver,
                                         meaning they appear in the formula, the solver might not have enough resources to come upon the right
                                         combination. This motivates the technique presented in this paper, which instantiates only by a certain
                                         type of terms, called discriminating terms. The paper introduces a class of formulas, where the proposed
                                         technique has a considerable impact.

                                         Keywords
                                         SMT, quantifiers, instantiation


1. Introduction
Quantifiers represent one of the major challenges for contemporary SMT solvers and since
typically they lead to undecidability or extreme computational complexity, they are likely to
remain a challenge for times to come.
   Most commonly, the general techniques for dealing with quantifiers gradually instantiate the
quantified part of the formula with ground terms until the resulting ground formula becomes
unsatisfiable. The terms to be used in instantiations may be chosen either by syntactic properties
(E-matching [1]) or semantic properties (e.g. model-based quantifier instantiation [2]). Interest-
ingly, it has been shown that these techniques do not always pay off and simple enumeration of
terms gives better results in some cases [3].
   This is where this paper comes in. We propose a technique to limit the set of terms to
enumerate. Roughly speaking, in the context of first order logic with equality, a term is labeled
as discriminating if it participates in a disequality.
   We have modified the enumeration instantiation algorithm in CVC4 [4] so that only discrimi-
nating terms are considered. We further construct a family of formulas where this approach
demonstrably helps.


SMT 2021: 19th International Workshop on Satisfiability Modulo Theories, July 18-19, 2021, Los Angeles, CA
" Mikolas.Janota@cvut.cz (M. Janota)
~ http://people.ciirc.cvut.cz/~janotmik/ (M. Janota)
 0000-0003-3487-784X (M. Janota)
                                       © 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
    CEUR
    Workshop
    Proceedings
                  http://ceur-ws.org
                  ISSN 1613-0073       CEUR Workshop Proceedings (CEUR-WS.org)


                                                                                                          17
2. A Class of Problems
Kaminski and Smolka [5] consider an identity that holds over the booleans: 𝑓 (𝑓 (𝑓 (𝑐))) = 𝑓 (𝑐).
Here 𝑐 is a boolean and 𝑓 is a unary function on booleans. It is easy to informally see why this
identity holds by considering all four possible interpretations of 𝑓 . Obviously the identity also
holds if the domain of interest has only one element. Hence one can make an easy first-order
problem by including an axiom

                                     ∀𝑥𝑦𝑧.𝑥 = 𝑦 ∨ 𝑥 = 𝑧 ∨ 𝑦 = 𝑧

stating there are at most two elements and making the conclusion 𝑓 (𝑓 (𝑓 (𝑐))) = 𝑓 (𝑐). A
formal proof would proceed by equational reasoning after instantiating the quantifiers using
the subterms of the conjecture: 𝑐, 𝑓 (𝑐), 𝑓 (𝑓 (𝑐)) and 𝑓 (𝑓 (𝑓 (𝑐))). This problem can be made
slightly more difficult by including 𝑛 unary functions 𝑔0 , . . . , 𝑔𝑛−1 and writing the conclusion
as 𝑓 (𝑓 (𝑓 (𝑔0 (· · · 𝑔𝑛−1 (𝑐) · · · )))) = 𝑓 (𝑔0 (· · · 𝑔𝑛−1 (𝑐) · · · )). The modified problem has 𝑛 + 4
subterms instead of only 4. The subterms of the form 𝑔𝑖 (· · · 𝑔𝑛−1 (𝑐) · · · ) with 𝑖 > 0 are red
herrings. The four subterms 𝑓 𝑘 (𝑔0 (· · · 𝑔𝑛−1 (𝑐) · · · )) with 𝑘 ∈ {0, 1, 2, 3} are sufficient to use
as instantiations to complete the proof.
   The problems can also be made more difficult in a different way. Following a proof from [6]
we will argue that for each natural number 𝑚 > 0 there are natural numbers 𝑚1 > 0 and
𝑚2 ≥ 0 such that 𝑓 𝑚1 +𝑚2 (𝑐) = 𝑓 𝑚2 (𝑐) if the domain of interest has at most 𝑚 elements. If
𝑚 = 2, we can take 𝑚1 = 2 and 𝑚2 = 1 as above. More generally, we choose 𝑚2 as 𝑚 − 1
and 𝑚1 is the least common multiple (lcm) of the sequence 2, . . . , 𝑚. The reason behind these
choices are explored in the following subsection.
   To construct the family of formulae in question, define atmost𝑚 to be the first-order formula
                                                           ⋁︁
                                          ∀𝑥0 · · · 𝑥𝑚 .           𝑥𝑖 = 𝑥𝑗
                                                   0≤𝑖<𝑗≤𝑚

and let kam𝑛𝑚 be the first-order formula

      atmost𝑚 ∧ 𝑓 lcm({2,...,𝑚})+𝑚−1 (𝑔0 (· · · 𝑔𝑛−1 (𝑐) · · · )) ̸= 𝑓 𝑚−1 (𝑔0 (· · · 𝑔𝑛−1 (𝑐) · · · )).

2.1. Unsatisfiability of kam𝑛𝑚
Following the proof (and terminology) of Theorem 7 in [6] we show that for an interpretation
of size at most 𝑚 the following identity holds:

                                  𝑓 lcm({2,...,𝑚})+𝑚−1 (𝑐) = 𝑓 𝑚−1 (𝑐)

  The intuition for the equality is as follows. The sequence 𝑐, 𝑓 (𝑐), . . . , 𝑓 𝑘 (𝑐), . . . must even-
tually repeat. Both sides of the equation will be in the part that repeats and the length of the
repeating part will be a number at most 𝑚. Since this length divides lcm({2, . . . , 𝑚}) we will
be able to conclude
                              𝑓 lcm({2,...,𝑚}) (𝑓 𝑚−1 (𝑐)) = 𝑓 𝑚−1 (𝑐)
Let us now expand this idea more carefully.


                                                     18
   Consider an arbitrary interpretation of the function 𝑓 and the constant 𝑐 assuming that
the universe has at most 𝑚 elements. For the purpose of this subsection, we abuse notation
by writing 𝑓 (· · · ) for the value of 𝑓 under such interpretation and write 𝑐 for the value of
𝑐 under the interpretation. By the pigeonhole principle there must exist 𝑞1 and 𝑞2 such that
0 ≤ 𝑞2 < 𝑞1 ≤ 𝑚 such that 𝑓 𝑞1 (𝑐) = 𝑓 𝑞2 (𝑐). Let 𝑞1 and 𝑞2 be the least numbers with this
property. Following [6] we call 𝑞1 the size and 𝑞2 the prefix. We also call the positive number
𝑞1 − 𝑞2 the lasso and say 𝑓 𝑘 (𝑐) is in the lasso if 𝑘 ≥ 𝑞2 . The sequence can be written as follows:
             𝑞2 −1
  ⏞              ⏟
  𝑓 0 (𝑐), . . . , 𝑓 𝑞2 −1 (𝑐), 𝑓 𝑞2 (𝑐), . . . , 𝑓 𝑞1 −1 (𝑐), . . . , 𝑓 𝑞2 +𝑗(𝑞1 −𝑞2 ) (𝑐), . . . , 𝑓 𝑞1 −1+𝑗(𝑞1 −𝑞2 ) (𝑐), . . .
                                ⏟              ⏞                       ⏟                          ⏞
                                           𝑞1 −𝑞2                                            𝑞1 −𝑞2
                                                                                                   (1)
For every 𝑓 𝑘 (𝑐) in the lasso it is clear that 𝑓 𝑗(𝑞1 −𝑞2 ) (𝑓 𝑘 (𝑐)) = 𝑓 𝑘 (𝑐). Since 𝑞2 ≤ 𝑚 − 1 we
know 𝑓 𝑚−1 (𝑐) is in the lasso. Since 𝑞1 ≤ 𝑚 we know the lasso is at most 𝑚 and thus divides
lcm({2, . . . , 𝑚}). Hence we know

                                        𝑓 lcm({2,...,𝑚}) (𝑓 𝑚−1 (𝑐)) = 𝑓 𝑚−1 (𝑐)

as desired.


3. Quasidiscriminating Terms
The tableau calculus from [7] restricts first-order quantifier instantiation to so-called discrimi-
nating terms, i.e., terms that occur on one side of a disequation on the branch. A consequence of
the completeness proof for the tableau calculus is a refinement of Herbrand’s theorem indicating
that (in the presence of the tableau calculus rules or analogous rules) instantiating with members
of the universe of discriminating terms is sufficient to lead to an inconsistency, if the branch is
inconsistent.
   The change of setting from the tableau calculus of [7] to SMT means discriminating terms
are not always sufficient. As a simple example we consider kam02 . Assume we have clause
normalized so that there is one quantified formula

                                             ∀𝑥𝑦𝑧.𝑥 = 𝑦 ∨ 𝑥 = 𝑧 ∨ 𝑦 = 𝑧

and one disequation 𝑓 (𝑓 (𝑓 (𝑐))) ̸= 𝑓 (𝑐). There are only two discriminating terms: 𝑓 3 (𝑐) and
𝑓 (𝑐). Instantiating 𝑥, 𝑦 and 𝑧 with these two terms will always lead to at least two of 𝑥, 𝑦 and 𝑧
being the same term so that the resulting disjunction will always have a literal that is trivial by
reflexivity. In the calculus of [7] there is a decomposition rule that would add 𝑓 (𝑓 (𝑐)) ̸= 𝑐 to
the branch since 𝑓 3 (𝑐) ̸= 𝑓 (𝑐) is on the branch. This new disequation means there are now four
discriminating terms. As discussed above, these four terms are sufficient to use as instantiations
to derive a contradiction.
   One option would be to extend CVC4 to behave in ways that simulate the additional rules
of [7]. In the example above, this would mean when the current propositional model sets the
literal 𝑓 3 (𝑐) = 𝑓 (𝑐) to false, CVC4 could mimic the decomposition rule by adding a propositional


                                                                19
                                                                           discriminating
                               250                                         enumeration
                                                                           z3
                                                                           default
                               200

                               150
                    time (s)


                               100

                               50

                                0
                                     0   100     200          300    400         500
                                                       instances

Figure 1: Cactus plot of the results.


clause corresponding to 𝑓 3 (𝑐) = 𝑓 (𝑐) ∨ 𝑓 2 (𝑐) ̸= 𝑐. Further tableau rules that would need to
have a similar counterpart are the mating and confrontation rules.
   We have chosen a simpler, heuristic approach without attempting to maintain completeness.
However, a heuristic that restricts to discriminating terms without simulating the tableau rules
would be far too restrictive. An intermediate heuristic is to restrict to terms that would be
discriminating if the decomposition rule were included. We call these quasidiscriminating terms.
   As a technical definition, we say a pair (𝑠, 𝑡) is a discriminating pair if the literal 𝑠 = 𝑡 is as-
signed false by the propositional model. We recursively define quasidiscriminating pairs (𝑠, 𝑡) as
follows: Every discriminating pair is a quasidiscriminating pair. If (𝑓 (𝑠1 , . . . , 𝑠𝑛 ), 𝑓 (𝑡1 , . . . , 𝑡𝑛 ))
is a quasidiscriminating pair, then (𝑠𝑖 , 𝑡𝑖 ) is a quasidiscriminating pair for each 𝑖 ∈ {1, . . . , 𝑛}
where 𝑠𝑖 and 𝑡𝑖 are not the same term. A term 𝑠 is quasidiscriminating if there is some 𝑡 such
that (𝑠, 𝑡) or (𝑡, 𝑠) is a quasidiscriminating pair.
   We have modified the enumeration instantiation algorithm in CVC4 [4] so that only quasidis-
criminating terms are considered.


4. Results
The problems used for evaluation are the formulas kam𝑛𝑚 as defined above. The parameters
were chosen as follows. The parameter 𝑚 ranges between 4..10 and the value of 𝑛 ranges
between 0..99 for 𝑚 ∈ 4..8 and it ranges between 0..9 for 𝑚 ∈ 9..10. Recall that the parameter
𝑚 represents the domain size and 𝑛 the number of the “dummy” terms 𝑔.
   For the comparison we considered the default version of CVC4, CVC4 run only in the
enumeration mode, and Z3. Figure 1 shows a cactus plot for the experiment results under
5-minute timeout (300s). Table 1 breaks down the number of solved instances by the domain
size, i.e. by the parameter 𝑚.


                                                         20
                   domain       discriminating    enumeration     Z3    default
                      4               100             100          75     28
                      5               100             100          44      0
                      6               100             100          29      0
                      7               100             100          24      0
                      8               100             100          21      0
                      9                10              10          10      0
                      10              10                1         10       0
                  Total (520)         520             511         213     28

Table 1
Results


   Our strategy using discriminating terms solved all the considered benchmarks very quickly
except for the largest ones. CVC4’s enumerative is also performing quite well but the time starts
to increase much more quickly and eventually it times out on the largest problems. The default
version of CVC4 performs rather poorly; our explanation for this is that the conflict-based
instantiation [8] is taking up too much time because of the deep terms. The results for Z3 are
surprising because it can successfully solve the largest instances but tends to fail on the smaller
ones in a somewhat nonuniform fashion. We have also tried running Vampire [9] but that has
timed out on all the considered problems.


5. Summary and Future Work
This paper proposes a way to restrict possible candidates term for quantifier instantiation by
looking at syntactic properties of the given formula. In particular, we consider only terms
that participate in a disequality. We construct a family of formulas where this technique has
demonstrably the best results.
   The presented techniques opens a number of avenues for future work. Since the presented
technique disregards theories, a natural generalization would be to include theory-specific pred-
icates, other than just disequality. For instance, in the context of arithmetic strict comparisons
(<) imply disequality and therefore could be used in a similar fashion. While the technique is
clearly performing well on the constructed family of formulas, as of now we don’t have any
dividends that is helpful on general formulas. We conjecture that this might happen in problems
with many function nesting but also, careful integration with other techniques will be needed.
A natural next step to take would be to run the modified CVC4 over the problem sets in the
SMT-LIB to obtain data for how often the quasi-discriminating terms technique is helpful and
how often it is not.
   The family of formulas proposed here is interesting on its own, if only because they are easy
to understand and yet present a challenge to existing SMT solvers and first-order automated
theorem provers. In general, it is unclear whether it is better to instantiate with deeper terms
or with more shallow terms. In the provided family, the outermost terms are actually the right
ones and the inner ones are “red herrings.” But one can envision scenarios where the opposite
is true. So the question is, how to distinguish scenarios like these.


                                                 21
Acknowledgments
The results were supported by the Ministry of Education, Youth and Sports within the dedicated
program ERC CZ under the project POSTMAN no. LL1902. This scientific article is part of the
RICAIP project that has received funding from the European Union’s Horizon 2020 research
and innovation programme under grant agreement No 857306.


References
[1] D. Detlefs, G. Nelson, J. B. Saxe, Simplify: a theorem prover for program checking, J. ACM
    52 (2005) 365–473. doi:10.1145/1066100.1066102.
[2] Y. Ge, L. M. de Moura, Complete instantiation for quantified formulas in satisfiability
    modulo theories, in: Computer Aided Verification, 21st International Conference, CAV,
    2009, pp. 306–320. doi:10.1007/978-3-642-02658-4\_25.
[3] A. Reynolds, H. Barbosa, P. Fontaine, Revisiting enumerative instantiation, in: Tools and
    Algorithms for the Construction and Analysis of Systems, volume 10806, 2018, pp. 112–131.
    doi:10.1007/978-3-319-89963-3\_7.
[4] C. W. Barrett, C. L. Conway, M. Deters, L. Hadarean, D. Jovanovic, T. King, A. Reynolds,
    C. Tinelli, CVC4, in: G. Gopalakrishnan, S. Qadeer (Eds.), Computer Aided Verification
    - 23rd International Conference, CAV, volume 6806, Springer, 2011, pp. 171–177. doi:10.
    1007/978-3-642-22110-1\_14.
[5] M. Kaminski, G. Smolka, A finite axiomatization of propositional type theory in pure
    lambda calculus, in: Reasoning in Simple Type Theory: Festschrift in Honor of Peter B.
    Andrews on His 70th Birthday, College Publications, 2008, pp. 243–258.
[6] M. P. Bonacina, C. A. Lynch, L. de Moura, On deciding satisfiability by theorem proving
    with speculative inferences, Journal of Automated Reasoning 47 (2011) 161–189. doi:10.
    1007/s10817-010-9213-y.
[7] C. E. Brown, G. Smolka, Analytic tableaux for simple type theory and its first-order fragment,
    Logical Methods in Computer Science 6 (2010). doi:10.2168/LMCS-6(2:3)2010.
[8] A. Reynolds, C. Tinelli, L. M. de Moura, Finding conflicting instances of quantified formulas
    in SMT, in: Formal Methods in Computer-Aided Design, FMCAD 2014, Lausanne, Switzer-
    land, October 21-24, 2014, IEEE, 2014, pp. 195–202. doi:10.1109/FMCAD.2014.6987613.
[9] L. Kovács, A. Voronkov, First-Order Theorem Proving and Vampire, in: International
    Conference on Computer Aided Verification, volume 8044, 2013, pp. 1–35. doi:10.1007/
    978-3-642-39799-8_1.


                                               22