<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Selecting Quantifiers for Instantiation in SMT</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jan Jakubův</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mikoláš Janota</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bartosz Piotrowski</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jelle Piepenbrock</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andrew Reynolds</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Computational Logic Center, University of Iowa</institution>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague</institution>
          ,
          <addr-line>Czechia</addr-line>
        </aff>
      </contrib-group>
      <abstract>
<p>This work attempts to choose quantifiers to instantiate based on feedback from the SMT solver. It tackles the challenge that a problem containing many quantified expressions eventually floods the solver with too many generated ground terms. Therefore, we instantiate only some of the quantifiers; the question is, which ones are useful? The SMT solver is modeled as a multi-armed bandit, where each quantifier is a lever producing a positive reward if an instantiation of the quantifier was useful. The issue is that we do not know whether an instantiation was useful until the given problem instance is actually proven (and then it is too late). In this paper we explore several heuristics that attempt to assess the usefulness of quantifier instantiations.</p>
      </abstract>
      <kwd-group>
<kwd>satisfiability modulo theories</kwd>
        <kwd>quantifier instantiation</kwd>
        <kwd>machine learning</kwd>
        <kwd>reinforcement learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Most approaches instantiate quantifiers gradually, meaning that new instantiations are added
after testing that the old ones do not already yield a contradiction in the ground solver. In this
work, we propose to use a dynamic, machine-learning-based heuristic to decide which quantifiers
should be instantiated during solving. We model the problem as a multi-armed bandit (MAB),
which is one of the simplest forms of reinforcement learning [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. A multi-armed bandit is equipped
with a set of levers, which are pulled one at a time. Pulling a lever yields a reward (typically a
real number). The multi-armed bandit problem is to find a strategy that leads to the highest
total reward overall. Our scenario is more complicated because quantifiers influence each other
and the SMT solver has a state, which changes with each instantiation. Hence, we apply
the MAB model as a simplification of the problem and as a first step in this area of research.
      </p>
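      <p>As a toy illustration of the bandit setting (independent of the solver; all names and the epsilon-greedy strategy here are our hypothetical sketch, not the method evaluated later), pulling levers and tracking running mean rewards can be written in a few lines:</p>

```python
import random

def run_bandit(rewards, rounds=1000, eps=0.1, seed=0):
    """Toy epsilon-greedy bandit: rewards[i] is the expected payoff of lever i."""
    rng = random.Random(seed)
    n = [0] * len(rewards)       # times each lever was pulled
    mean = [0.0] * len(rewards)  # running mean reward per lever
    total = 0.0
    for _ in range(rounds):
        if rng.random() < eps:   # explore: pick a random lever
            i = rng.randrange(len(rewards))
        else:                    # exploit: pick the best lever seen so far
            i = max(range(len(rewards)), key=lambda j: mean[j])
        r = rewards[i] + rng.gauss(0, 0.1)   # noisy payoff
        n[i] += 1
        mean[i] += (r - mean[i]) / n[i]      # incremental mean update
        total += r
    return total, n

total, pulls = run_bandit([0.2, 0.8, 0.5])
```

      <p>Exploiting the best running mean most of the time, while occasionally exploring, is the simplest way to trade exploration against exploitation; UCB (Section 3) replaces the random exploration with a principled confidence bonus.</p>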
      <p>The overall setup is shown in Figure 1. Quantifiers are instantiated in a loop that terminates
if the ground solver determines that the ground part is unsatisfiable. Our guidance picks the set
of quantifiers to be instantiated in each round. This guidance is based on feedback obtained
from the ground solver.</p>
      <p>[Figure 1: Overall setup. An input formula enters the SMT solver; the ground solver checks the ground assignment of the instances and returns feedback (satisfiable, unsatisfiable, or an infinite loop), which drives the ML advice used for instantiation.]</p>
      <p>
        In this paper, we only consider the simplest strategy for picking the ground terms that we
instantiate with: enumerative instantiation [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ]. There, the ground terms are enumerated
from the Herbrand universe, while almost completely ignoring the semantics of the formula at
hand. However, more advanced techniques exist [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ] and a similar approach could be integrated
into them. In the standard setting, these techniques would instantiate all quantifiers that are
true in the current ground model. This can rapidly bloat the ground part, which makes it
harder to solve and also renders the space of possible instantiations unwieldy. We
also remark that we have previously applied machine learning to ordering terms for instantiation within
cvc5 [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]; this could also be combined with the approach presented here.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Picking Quantifiers</title>
      <p>In the standard MAB setting, only one lever is pulled at a time. This is not desirable in our
setting, because it could cause an unwieldy number of instantiation rounds. Therefore we
always pick a set of quantifiers (levers) to instantiate in each round. This is done based
on a score assigned to each quantifier, where the score is calculated from the observations
made so far (Section 3). There are multiple ways one could use such scores. In our
experiments, we use a simple strategy where the top p percent of quantifiers are selected in each
round, where p is a fixed parameter.</p>
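      <p>A minimal sketch of this selection rule (the function name and list-based scores are our illustration, not cvc5's internal API):</p>

```python
def select_top_percent(scores, p):
    """Return the indices of the top p percent of quantifiers by score.

    At least one quantifier is always selected so a round can make progress.
    """
    k = max(1, round(len(scores) * p / 100))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]

# with ten quantifiers and p = 20, the two highest-scoring ones are picked
chosen = select_top_percent([0.3, 0.9, 0.1, 0.5, 0.7, 0.2, 0.8, 0.4, 0.6, 0.0], 20)
```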
    </sec>
    <sec id="sec-3">
      <title>3. Scores and Reward Heuristics</title>
      <p>The objective is to assign to each quantifier a score that determines how likely that quantifier is
to be instantiated. This score combines two components. The first component estimates how
useful the quantifier has been in the past. The second component pushes the search towards
quantifiers that have been underused so far. This falls within the well-known paradigm in
reinforcement learning: exploration versus exploitation.</p>
      <p>The current implementation uses two values obtained from the SAT solver as sources of the
usefulness of a quantifier, namely the difficulty and activity measures. Both of these sources provide
an integer statistic for each quantifier, with values evolving in time. We further normalize the
values by dividing them by the current number of quantifier instantiations. Hence we obtain the
utility per quantifier instance. We use these values to estimate the usefulness of quantifiers and to
guide quantifier instantiation.</p>
      <p>In our implementation, difficulty is a mapping from formulas (those from the input or lemmas
generated during solving) to natural numbers, where a formula with higher difficulty contributes
more towards a proof of unsatisfiability. This measure is incrementally updated during the
course of SMT solving. In particular, when a theory solver generates a conflict clause or lemma φ,
we compute the literals ℓ that are propositionally entailed by the negation of φ (e.g., the negation
of all literals in a clause); call this set ilits(φ). For each such ℓ, we increment the difficulty
measure of the (oldest) formula ψ such that the negation of ℓ is in ilits(ψ). We subsequently
track this difficulty measure on the generated lemma φ. Notice that quantifier instantiation
lemmas (∀x⃗. ψ) ⇒ ψ[x⃗ ↦→ t⃗] increment the difficulty of the oldest formula that implies ∀x⃗. ψ. The difficulty
measure of the lemma itself will be incremented whenever literals in the instantiated body
of the quantified formula participate in further conflicts and lemmas. Similarly, we measure
activity, which tracks how many times a formula participated in propositional propagation.</p>
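      <p>The bookkeeping described above can be approximated by the following sketch; the representation (integer literals with negation as a sign flip, and explicit age and counter maps) is our simplification of the solver-internal data structures:</p>

```python
def bump_difficulty(conflict_lits, ilits, age, difficulty):
    """Credit the oldest formula 'responsible' for each conflict literal.

    conflict_lits: literals entailed by the negation of the new conflict/lemma.
    ilits: formula id -> set of literals entailed by the negation of that formula.
    age: formula id -> creation time (smaller means older).
    difficulty: formula id -> running difficulty counter (updated in place).
    """
    for lit in conflict_lits:
        # formulas whose ilits set contains the negation of this literal
        holders = [f for f, lits in ilits.items() if -lit in lits]
        if holders:
            oldest = min(holders, key=lambda f: age[f])
            difficulty[oldest] = difficulty.get(oldest, 0) + 1
    return difficulty
```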
      <p>As a basic estimate of quantifier quality, we train a simple linear regression model to
obtain reasonable weights a and b for the linear combination of normalized activity (nact) and
normalized difficulty (ndif):</p>
      <p>a · nact + b · ndif .</p>
      <p>As training data, we collect statistics of the values of nact and ndif from runs on several
thousand problems evaluated with a 60-second time limit. The statistics contain the values of
nact and ndif for each quantifier in every instantiation round, thus recording the progress
of the values in time. The quantifiers whose instantiations were needed to solve the problem
are marked as positive, the other quantifiers as negative. From this training data, we train a linear
regression classifier. The training yielded the coefficients a = 0.04 and b = 0.1,
which are used for the experimental evaluation in Section 4. The prediction computed
by the linear formula can be directly used as the quality estimate, or it can be used as a reward
in the MAB setting as follows.</p>
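      <p>With the fitted coefficients, the quality estimate amounts to a one-line computation; the function shape and the explicit per-instantiation normalization are our sketch of the computation:</p>

```python
A, B = 0.04, 0.1  # coefficients obtained from the linear-regression fit

def linear_quality(activity, difficulty, num_instantiations):
    """Linear quality estimate from normalized activity and difficulty."""
    n = max(1, num_instantiations)  # avoid division by zero before the first round
    nact = activity / n             # normalized activity
    ndif = difficulty / n           # normalized difficulty
    return A * nact + B * ndif

q = linear_quality(activity=50, difficulty=20, num_instantiations=10)
# 0.04 * 5 + 0.1 * 2 = 0.4
```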
      <p>To include the exploration factor we use the upper confidence bound (UCB) formula, where
the quality q_t(i) of the quantifier i at time step t (quantifier selection round) is defined as
q_t(i) = r̄_t(i) + c · √(log(t) / n_t(i)),
where r̄_t(i) is the mean reward for the quantifier i so far, and n_t(i) is the number of times i was
selected for instantiation up to this point. As rewards, we use predictions of quantifier qualities
obtained from the linear model described above. Finally, c is the confidence value controlling
the level of exploration with respect to exploitation. The idea behind UCB is that the quality is
primarily governed by the reward. The second summand is higher for quantifiers that
were selected less often in the previous instantiation rounds (quantifiers with smaller n_t(i)).
Hence the quality of unselected quantifiers keeps increasing and they will eventually be
selected in some future round. How fast this happens is controlled by the parameter c, where
higher values favor exploration.</p>
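      <p>The UCB score itself is straightforward to compute; treating a never-selected quantifier as maximally promising is a common convention that we adopt in this sketch:</p>

```python
import math

def ucb_quality(mean_reward, t, n_selected, c):
    """UCB score: mean reward plus an exploration bonus that shrinks
    with the number of times the quantifier was already selected."""
    if n_selected == 0:
        return math.inf  # never-selected quantifiers are tried first
    return mean_reward + c * math.sqrt(math.log(t) / n_selected)
```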
    </sec>
    <sec id="sec-4">
      <title>4. Experiments</title>
      <p>
        The above-presented ideas were implemented in the SMT solver cvc5 [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], while turning off all
the other instantiation techniques and focusing only on enumerative instantiation [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ].
A subset of 5000 SMT-LIB benchmark problems is used for the experiments. We deliberately
selected problems with a large number of quantifiers. The experiments were run on a server with
four AMD EPYC 7513 32-core processors @ 2.6 GHz and with 504 GB of memory.
      </p>
      <p>The default technique, where all available quantifiers are instantiated in each round, is used as
the baseline. To assess the performance of our techniques for quantifier selection, we evaluate
several random selection strategies. In each strategy, p% of randomly selected quantifiers are
instantiated, with p ranging from 10%, 20%, up to 90%. Apart from these control strategies, we
evaluate our linear model from Section 3, where the quality of each quantifier is computed as a
linear combination of the two utility values nact and ndif. The linear model is evaluated in two
modes, namely, with and without the use of the UCB formula. As with the random strategies,
we evaluate several variants with different selection ratios p, selecting the p% of the quantifiers
with the highest quality. In the case of the UCB formula, an additional confidence coefficient c is
required to control the level of exploration. We evaluate several values for c between 0.001
and 10.</p>
      <p>The ML approach performs well with short timeouts. With a timeout of 1 s, the baseline
strategy, which instantiates all available quantifiers, solved 1107 problems. The worst of the
random selection strategies, with p = 10%, solved 1016 problems, and the best one, with p = 50%,
already outperformed the baseline with 1171 solved problems. The linear model scored slightly
higher with 1186 problems, in particular its best variant with p = 50%. The UCB formula
led to an almost identical performance with 1190 solved problems, namely its variant with
p = 80% and c = 3. Our techniques perform slightly better than the random selection. The
two random strategies solved 1212 problems together, while the two ML strategies solved 1264.</p>
      <p>The experiment progress is depicted in Figure 2, where the graphs show the number of
problems solvable by each of the strategies in time t. Both graphs use the baseline strategy as
the reference. The left graph compares the baseline to the random selection, while the right
graph compares it to the best ML-based selection technique, namely the linear model with the
UCB formula. We see that the line for random 10% performs more poorly than the baseline. On
the other hand, the random 50% line stays above the baseline up to ∼60 s. The experiment
shows the versions with and without the UCB formula to be of comparable performance. The
right graph, which compares our best ML selection method with the best random selection, supports
our previous observation that the ML selection slightly outperforms the random one. The ML
approach stays above the baseline up to ∼100 s. The methods are slightly complementary: the
version without UCB adds 6.4% to the problems solved by the UCB version.</p>
      <p>While our strategies outperform the random selection, the improvement is rather small. This
is further supported by Figure 3, which compares the strategies in a scatter plot. For each solved
problem P, a point is plotted at the position (t1, t2), where ti is the runtime of strategy i on
problem P. In both graphs, strategy 1 is the baseline, while strategy 2 is the random selection of 50% in
the left graph and the linear model with UCB in the right one. Hence the points below the diagonal
indicate a runtime improvement, that is, problems solved faster than the baseline. For the
ML approach we observe a cluster of points below the diagonal with times below 10 s. For larger
times we observe both improvements and worsening of performance.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusions and Future Work</title>
      <p>
        In this extended abstract we report on our initial experiments that attempt to estimate the
usefulness of quantifiers during search. This contrasts with standard machine learning techniques
that learn from one problem to another (cf. [
        <xref ref-type="bibr" rid="ref6 ref8 ref9">8, 9, 6</xref>
        ]). In our case, the objective is to obtain
feedback from the problem being solved. To our best knowledge, this is the first time such an
approach has been attempted. Eventually, machine learning across multiple instances could be
combined with single-instance learning.
      </p>
      <p>The experiments suggest that guided quantifier selection can lead to a runtime improvement
and hence to more problems solved with very short timeouts. It seems, however, that the
improvement over the baseline strategy decreases with increasing time limits. In our future
research, we plan to investigate whether an improvement for higher time limits can be achieved
with more advanced methods for quantifier selection. In particular, we plan to investigate other
metrics for the usefulness of quantifiers.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>The results were supported by the Ministry of Education, Youth and Sports within the dedicated
program ERC CZ under the project POSTMAN no. LL1902, the European Regional
Development Fund under the Czech project AI&amp;Reasoning no. CZ.02.1.01/0.0/0.0/15_003/0000466, and
Amazon Research Awards. This article is part of the RICAIP project that has received funding
from the European Union’s Horizon 2020 research and innovation programme under grant
agreement No 857306.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>R. S.</given-names>
            <surname>Sutton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. G.</given-names>
            <surname>Barto</surname>
          </string-name>
          , Reinforcement Learning,
          <source>Adaptive Computation and Machine Learning series</source>
          , 2nd ed., Bradford Books, Cambridge, MA,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Reynolds</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Barbosa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Fontaine</surname>
          </string-name>
          ,
          <article-title>Revisiting enumerative instantiation, in: Tools and Algorithms for the Construction and Analysis of Systems</article-title>
          , volume
          <volume>10806</volume>
          ,
          <year>2018</year>
          , pp.
          <fpage>112</fpage>
          -
          <lpage>131</lpage>
          . doi:10.1007/978-3-319-89963-3_7.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Janota</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Barbosa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Fontaine</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Reynolds</surname>
          </string-name>
          ,
          <article-title>Fair and adventurous enumeration of quantifier instantiations</article-title>
          , in: Formal Methods in Computer-Aided Design,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>D.</given-names>
            <surname>Detlefs</surname>
          </string-name>
          , G. Nelson,
          <string-name>
            <given-names>J. B.</given-names>
            <surname>Saxe</surname>
          </string-name>
          ,
          <article-title>Simplify: A theorem prover for program checking</article-title>
          ,
          <source>J. ACM</source>
          <volume>52</volume>
          (
          <year>2005</year>
          )
          <fpage>365</fpage>
          -
          <lpage>473</lpage>
          . doi:10.1145/1066100.1066102.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Ge</surname>
          </string-name>
          , L. M. de Moura,
          <article-title>Complete instantiation for quantified formulas in satisfiability modulo theories</article-title>
          , in: Computer Aided Verification, 21st International Conference, CAV,
          <year>2009</year>
          , pp.
          <fpage>306</fpage>
          -
          <lpage>320</lpage>
          . doi:10.1007/978-3-642-02658-4_25.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Janota</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Piepenbrock</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Piotrowski</surname>
          </string-name>
          ,
          <article-title>Towards learning quantifier instantiation in SMT, in: Theory and Applications of Satisfiability Testing - SAT</article-title>
          <year>2022</year>
          , LIPIcs,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>H.</given-names>
            <surname>Barbosa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. W.</given-names>
            <surname>Barrett</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Brain</surname>
          </string-name>
          , G. Kremer,
          <string-name>
            <given-names>H.</given-names>
            <surname>Lachnitt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mohamed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mohamed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Niemetz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nötzli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ozdemir</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Preiner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Reynolds</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Sheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Tinelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zohar</surname>
          </string-name>
          ,
          <article-title>cvc5: A versatile and industrial-strength SMT solver</article-title>
          , in: D.
          <string-name>
            <surname>Fisman</surname>
          </string-name>
          , G. Rosu (Eds.),
          <article-title>Tools and Algorithms for the Construction and Analysis of Systems -</article-title>
          28th International Conference, TACAS 2022,
          <article-title>Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS</article-title>
          , volume
          <volume>13243</volume>
          of Lecture Notes in Computer Science, Springer,
          <year>2022</year>
          , pp.
          <fpage>415</fpage>
          -
          <lpage>442</lpage>
          . doi:10.1007/978-3-030-99524-9_24.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J.</given-names>
            <surname>Urban</surname>
          </string-name>
          , G. Sutcliffe,
          <string-name>
            <given-names>P.</given-names>
            <surname>Pudlák</surname>
          </string-name>
          , J. Vyskocil,
          <article-title>MaLARea SG1 - machine learner for automated reasoning with semantic guidance</article-title>
          , in: A.
          <string-name>
            <surname>Armando</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Baumgartner</surname>
          </string-name>
          , G. Dowek (Eds.),
          <source>Automated Reasoning, 4th International Joint Conference, IJCAR</source>
          <year>2008</year>
          , Sydney, Australia,
          <source>August 12-15</source>
          ,
          <year>2008</year>
          , Proceedings, volume
          <volume>5195</volume>
          of Lecture Notes in Computer Science, Springer,
          <year>2008</year>
          , pp.
          <fpage>441</fpage>
          -
          <lpage>456</lpage>
          . doi:10.1007/978-3-540-71070-7_37.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J.</given-names>
            <surname>Piepenbrock</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Heskes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Janota</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Urban</surname>
          </string-name>
          ,
          <article-title>Guiding an automated theorem prover with neural rewriting</article-title>
          , in: J.
          <string-name>
            <surname>Blanchette</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <string-name>
            <surname>Kovács</surname>
          </string-name>
          , D. Pattinson (Eds.),
          <source>Automated Reasoning - 11th International Joint Conference, IJCAR 2022, Haifa, Israel, August</source>
          <volume>8</volume>
          -
          <issue>10</issue>
          ,
          <year>2022</year>
          , Proceedings, volume
          <volume>13385</volume>
          of Lecture Notes in Computer Science, Springer,
          <year>2022</year>
          , pp.
          <fpage>597</fpage>
          -
          <lpage>617</lpage>
          . doi:10.1007/978-3-031-10769-6_35.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>