Bayesian Optimisation of Solver Parameters in CBMC

Chaitanya Mangla, Sean B. Holden, and Lawrence C. Paulson
University of Cambridge
{cm772,sbh11,lp15}@cl.cam.ac.uk

François Bobot, Tjark Weber (eds.): SMT 2020, pp. 37–47. Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Abstract

Satisfiability solvers can be embedded in applications to perform specific formal reasoning tasks. CBMC, for example, is a bounded model checker for C and C++ that embeds SMT and SAT solvers to check internally generated formulae. Such solvers will be used solely to evaluate the class of formulae generated by the embedding application and may therefore benefit from domain-specific parameter tuning. We propose the use of Bayesian optimisation for this purpose, which offers a principled approach to black-box optimisation within limited resources. We demonstrate its use by optimising the solver embedded in CBMC specifically for a collection of test harnesses in active industrial use, achieving a significant improvement over the default parameters.

1 Introduction

F* is a programming language aimed at formal program verification (Swamy et al. [17]). Dafny is another such language, with built-in specification constructs and a corresponding static analyser for formal verification (Leino and Moskal [9]). CBMC is a static analysis tool for the C and C++ programming languages that uses bounded model checking to formally verify programs (Clarke et al. [2]). All three systems share an architectural similarity: formal verification is achieved by transforming the properties in question into logical formulae, which are then solved using external tools. This architecture is advantageous because it allows the authors of a variety of tools to exploit the advancing capabilities of high-performance solvers for complex logical reasoning, while letting them focus on tasks specific to their applications.

Modern SAT and SMT solvers rely on a variety of parameterised algorithms, and achieve good performance subject to a choice of good parameters for the target application. The space of parameters may be large; the Lingeling solver, for example, has 323 parameters [1]. Individual parameters can also have a significant influence on the performance of the solver: Huang [5] has shown that the restart policy has a significant impact on the performance of a clause-learning SAT solver. Finally, the best parameter choice is often sensitive to the class of target problems, as Sinz and Iser [15] showed for restart parameters.

Any solver embedded in an application will in effect be used to check the specific class of formulae generated by that application. Could the performance of such a solver be improved by tuning its parameters to the target application? We show this to be true in the case of a test suite that uses CBMC to formally verify properties of a library written in C. In this work, we use Bayesian optimisation to find an improved parameter configuration for the solver in CBMC, tuned specifically for that test suite. Bayesian optimisation (Shahriari et al. [14]) provides a black-box parameter optimisation method that is founded on the principles of Bayesian statistics and is practical to use within limited resource budgets.

Our work joins a large body of prior work on automatic parameter optimisation of satisfiability solvers. Hutter et al.
[6] used ParamILS to optimise the parameters of their SAT solver Spear, yielding speed-ups of 4.5 times on bounded model checking instances and 500 times on software verification problems, in comparison to manually tuned parameters. ParamILS is an automatic parameter optimisation tool for general algorithms, developed by Hutter et al. [8]. ParamILS begins with a random collection of parameter settings, searches the neighbourhood of these parameters, and uses a system of randomisation and restarts to avoid getting trapped in local minima. In addition to being a black-box method, ParamILS is also model-free, since it does not explicitly attempt to model the algorithm it is optimising. In contrast, Bayesian optimisation is a model-based method, in which a probabilistic model of the optimisation target is constructed and improved iteratively, and that model is used to guide the parameter search. The Sequential Model-based Algorithm Configuration (SMAC) method, also developed by Hutter et al. [7], is another model-based parameter optimisation method that has been used to optimise SAT solvers.

Although published research on the use of Bayesian optimisation for SMT solvers is limited, it has been successfully applied in varied disciplines and is a topic of active research in the machine learning community. Shahriari et al. [14] note some interesting applications, such as A/B testing, policy parametrisation for robots, highway traffic congestion management and combinatorial optimisation. In our previous work with Slowik et al. [18], Bayesian optimisation was used to tune the parameters of the SInE heuristic, commonly used for premise selection in first-order logic theorem provers. Recently, Oh et al. [12] published work related to that presented here, on the use of Bayesian optimisation for adapting a static analyser to given programs.

In the following sections, we describe the relevant aspects of CBMC and Bayesian optimisation in detail, followed by our experiments.

2 CBMC

CBMC is a bounded model checker for C and C++. CBMC performs code analysis using symbolic simulation; that is, it compiles code into logical formulae which are then sent to a solver for verification. With this method, assertions in the code can be verified statically. CBMC can also automatically generate assertions to check for common errors, such as buffer overflows, unsafe pointer accesses, memory leaks and arithmetic faults.

Test harness. Any project utilising CBMC will typically define a suite of test harnesses: carefully written C functions that act as entry points for CBMC, along with specifications for loop unrolling, function pointer destinations, variables that should be treated as non-deterministic, and other settings. Each test harness should aim to formally verify some aspect of the code. Test harnesses may be viewed as static analysis unit tests.

aws-c-common. aws-c-common is an open source library of commonly used data structures and algorithms written in the C programming language, developed for use in the Amazon Web Services Software Development Kit and utilised by multiple projects. Given the central role of this library, the developers have committed considerable resources towards its formal verification using CBMC.
At the time of writing, the library contains 166 CBMC test harnesses. An example test harness from aws-c-common is shown in Listing 1. This harness was designed to verify certain properties of aws_string_eq_ignore_case(), a function exported by the library. Of note is the use of nondet_bool() to introduce a non-deterministic Boolean value. Non-deterministic values are also available for other datatypes, for example nondet_ushortint() for unsigned short integers. Such values represent input read from the environment, and CBMC therefore analyses the test harness for every possible choice of those values.

    void aws_string_eq_ignore_case_harness() {
        struct aws_string *str_a = nondet_bool()
            ? ensure_string_is_allocated_bounded_length(MAX_STRING_LEN)
            : NULL;
        struct aws_string *str_b = nondet_bool()
            ? (nondet_bool() ? str_a : NULL)
            : ensure_string_is_allocated_bounded_length(MAX_STRING_LEN);
        if (aws_string_eq_ignore_case(str_a, str_b) && str_a && str_b) {
            assert(aws_string_is_valid(str_a));
            assert(aws_string_is_valid(str_b));
            assert(str_a->len == str_b->len);
        }
    }

Listing 1: An example CBMC test harness from the aws-c-common library.

    aws_string_eq_ignore_case_harness
            struct aws_string *str_a;
            _Bool return_value_nondet_bool;
            return_value_nondet_bool = NONDET(__CPROVER_bool);
            struct aws_string *tmp_if_expr;
            IF !(return_value_nondet_bool != FALSE) THEN GOTO 1
            struct aws_string *return_value_ensure_string_is_allocated_bounded_length;
            ensure_string_is_allocated_bounded_length((size_t)16);
            return_value_ensure_string_is_allocated_bounded_length =
                ensure_string_is_allocated_bounded_length#return_value;
            dead ensure_string_is_allocated_bounded_length#return_value;
            tmp_if_expr = return_value_ensure_string_is_allocated_bounded_length;
            GOTO 2
         1: tmp_if_expr = (struct aws_string *)(void *)0;
         2: str_a = tmp_if_expr;
            ...

Listing 2: A snippet from the goto file generated for the test harness shown in Listing 1 from the aws-c-common library.

Goto files. CBMC uses a modified C compiler to compile source code down to a simplified format that is more amenable to further conversion into logical formulae. One such simplification, for example, is the unrolling of every loop to a specified number of iterations. The output files are called goto files in CBMC parlance. Formal verification of each test harness can thus be viewed as a two-stage process: first the test harness is compiled down to a goto file, and then the goto file is formally verified by conversion into logical formulae that are checked by a solver. This project focuses on optimisation of the solver, so we only need to generate the goto files for each test harness once; only the second stage needs to be re-executed when the solver changes.

The goto files are binaries, but a human-readable text representation can be generated using CBMC. The text version of the goto file generated from the test harness in Listing 1 spans 6231 lines (337 lines if the generated comments are excluded). The large size is partly due to the inclusion in the goto file of other code from the library relevant to the harness. A snippet of that text is shown in Listing 2, giving the first few lines of the code generated for the test harness function.

aws-batch-cbmc. Formal verification using CBMC is a computationally intensive task, but each test harness can be verified independently and in parallel.
The developers of aws-c-common run their suite of CBMC tests as part of their continuous integration process and therefore require a fast turnaround. Their solution is to run each test harness as an independent batch process on a public cloud service, using the system from another open source project called aws-batch-cbmc.

MiniSat. CBMC can use multiple external SMT and SAT solvers to check internally generated formulae, but its default solver is MiniSat, an open source SAT solver by Eén and Sörensson [3]. When using a SAT solver, CBMC performs bit-vector flattening of its formulae before passing them to the solver. CBMC is compiled with MiniSat using its default parameters; for the purposes of this project, we have modified CBMC to accept solver parameters as runtime options. Although the experiments presented here use a SAT solver, our methods apply equally to parameter optimisation of SMT solvers.

3 Bayesian Optimisation

Bayesian Optimisation (BO) is a model-based black-box optimisation method, suitable for optimising targets that are expensive to evaluate and lack an analytical form, as is the case with typical SMT solvers. The optimisation target is modelled as an objective function, which maps from the parameters of the target system to a performance metric. In this work, we often refer to the target system being optimised as the objective, with the target system here being MiniSat. Since the objective is a black box, the shape of the objective function is modelled using Gaussian Process (GP) regression [13], a machine learning method that produces a non-linear regression model using Bayesian statistics. Instead of modelling the objective as a single function, the GP models it as a probability distribution over functions. In effect, every point in the parameter space is mapped by the GP to a probability distribution over the value of the objective function.

Since Bayesian optimisation uses a machine learning model of the objective function, training data must be produced by testing the objective, initially at random. Subsequently, the optimisation process cycles between prediction, testing and training. Using the predictions from the latest model, a point of interest in the parameter space is chosen for testing, based on a criterion explained later in this section. The point is then evaluated on the objective. The result is a new item of data for the GP model to learn from, refining its estimate of the objective. This process is repeated until a pre-defined resource budget is consumed; the budget may be, for example, a fixed number of iterations or an amount of wall-clock time. The persistent goal throughout the process is to search intelligently for better-performing parameters.

Suppose the noisy black-box objective we wish to optimise using Bayesian optimisation is f, such that y = f(x) + ε. Our goal is to find the point x that minimises f(x), where x is vector valued. In our setting, x is a vector of MiniSat parameter values, f measures the time spent in MiniSat, and the noise ε is due to the inherent non-determinism in the underlying system, which influences that measure. After some initial evaluations of f, Bayesian optimisation progresses in the following steps; a code sketch of the loop follows the list.

1. We treat all previous evaluations of f as training data and use a machine learning algorithm to build a predictive distribution p(y|x) of the values of y for any given x. With a GP model, this distribution is Gaussian: p(y|x) = N(y | µ(x), σ²(x)). We build a distribution instead of a point estimate because many different estimates of f can interpolate the points we have evaluated thus far. Two additional observations reinforce the need for this design: our measurements of the objective may be noisy, and the objective may be expensive to evaluate. When the objective is expensive to evaluate, as is the case in the work presented here, models of the objective will be derived from only a limited number of previous evaluations, and so we must account for the wide distribution of functions that could interpolate them. This is achieved by estimating the variance in addition to the mean.

2. We then use this distribution to build an acquisition function α,

   α(x) = E_{p(y|x)}[U(y|x)],

   where U is the utility function, discussed in further detail below. The acquisition function is an inexpensive function that measures the expected utility of collecting data about any point x. The point x′ that maximises α(x) is therefore the point to evaluate next.

3. We evaluate the objective at the point x′, which adds to our training dataset, and we return to step 1. However, if the pre-allocated resources for the optimisation process are exhausted, we terminate; the minimum is then extracted from the collection of points evaluated so far.
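The following self-contained Matlab sketch illustrates this loop on a one-dimensional toy objective. The squared exponential kernel, its length-scale, the noise level and the random-candidate acquisition maximiser are simplifying assumptions made for the example (normcdf and normpdf are from the Statistics and Machine Learning Toolbox); a production framework, such as the one used later in this paper, fits these quantities internally.

    % A minimal Bayesian optimisation loop (steps 1-3) on a noisy toy
    % objective; kernel, length-scale and noise level are assumed here.
    rng(0);
    f    = @(x) (x - 0.3).^2 + 0.05*randn(size(x));   % noisy toy objective
    kfun = @(A,B) exp(-(A - B').^2 / (2*0.1^2));      % squared exponential kernel
    sn2  = 0.05^2;                                    % assumed noise variance

    X = rand(3,1); y = f(X);                          % initial random evaluations
    for it = 1:20
        % Step 1: GP posterior mean and variance at candidate points.
        Xc  = rand(500,1);                            % random acquisition candidates
        Kxx = kfun(X,X) + sn2*eye(numel(X));
        Ks  = kfun(Xc,X);
        mu  = Ks * (Kxx \ y);
        s2  = max(1 - sum((Ks / Kxx) .* Ks, 2), 1e-12);
        % Step 2: expected improvement over the incumbent minimum v.
        v = min(y); s = sqrt(s2); z = (v - mu)./s;
        ei = (v - mu).*normcdf(z) + s.*normpdf(z);
        % Step 3: evaluate the acquisition maximiser and refit.
        [~, j] = max(ei);
        X = [X; Xc(j)]; y = [y; f(Xc(j))];
    end
    [ybest, i] = min(y);
    fprintf('best x = %.3f, best observed y = %.3f\n', X(i), ybest);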
Utility function. Given a model of the objective function, consider the utility of evaluating points in the parameter space. Both the performance the model predicts for a point and its uncertainty about that prediction will vary across the parameter space. In some regions the model may predict good performance with low variance; there it may be useful to exploit the knowledge in the model in search of an optimal point. Alternatively, it may be beneficial to explore regions where the model is highly uncertain, both to reduce that uncertainty and to reveal potentially hidden optimal points. Exploration and exploitation are trade-offs within the pre-determined resource budget allocated for the optimisation process and must be carefully balanced.

At any stage of Bayesian optimisation, the point of interest chosen is the point that maximises the acquisition function, which in turn uses the utility function. The acquisition function maps the model of the objective, which is a distribution over functions, to a single function measuring the expected utility of any point in the parameter space. The utility function is user-specified and can be used to control the balance between exploration and exploitation.

A commonly used utility function is expected improvement. As explained above, we model our objective using GP regression and thereby obtain a distribution over functions f,

p(f) = GP(f; µ, K),

where µ(x) is the mean function and K(x, x′) is the kernel covariance function. Suppose the minimum value of the objective that we have observed so far is v. The expected improvement utility function is then defined as

U(x) = max(0, v − f(x)).

In effect, this function rewards improvements in proportion to their size, but does not reward degradations.
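Taking the expectation of this utility under the Gaussian predictive distribution p(y|x) = N(y | µ(x), σ²(x)) gives the acquisition function in closed form, a standard result (see, e.g., Shahriari et al. [14]):

α_EI(x) = (v − µ(x)) Φ(z(x)) + σ(x) φ(z(x)),   where z(x) = (v − µ(x)) / σ(x),

and Φ and φ denote the standard normal cumulative distribution function and density respectively. The first term rewards points whose predicted mean improves on the incumbent v (exploitation), while the second rewards points with large predictive variance (exploration); this is the quantity computed in the sketch above.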
Discrete parameters. The Bayesian optimisation framework presented thus far is defined for continuous arguments to f; SMT solvers, however, often have discrete parameters. One solution is to map the discrete parameters to integers, yet model them in the GP regression like any other real-valued parameter; when the objective needs to be evaluated, the values supplied for the discrete parameters are rounded to integers. Rounding is a common workaround for integer parameters in BO, but it can lead to problems, since the points that are actually evaluated differ from the points that are modelled in the GP, as shown in work by Garrido-Merchán and Hernández-Lobato [4]. Their suggested solution is to modify the kernel covariance function K(·, ·) in the GP to K′(xi, xj) = K(T(xi), T(xj)), where T(x) rounds all integer-valued variables to the closest integer. Kernel covariance functions are discussed further in Appendix A. In this work, we have used a commercially available BO framework, described below, which provides built-in facilities for integer parameters.
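As an illustration, a minimal Matlab sketch of such a transformed kernel is given below. The squared exponential base kernel and the isInt mask marking which coordinates are integer-valued are assumptions made for this example; only the rounding transformation itself comes from [4].

    % Sketch of the transformed kernel K'(xi, xj) = K(T(xi), T(xj)), where T
    % rounds the integer-valued coordinates (marked by the logical mask isInt).
    % The squared exponential base kernel is an assumption for illustration.
    function k = roundedKernel(xi, xj, isInt)
        T = @(x) x .* ~isInt + round(x) .* isInt;   % round integer coordinates only
        K = @(a, b) exp(-sum((a - b).^2) / 2);      % assumed base kernel
        k = K(T(xi), T(xj));
    end

For example, roundedKernel([1.4 0.7], [1.6 0.7], [true false]) evaluates the base kernel at the rounded points (1, 0.7) and (2, 0.7), so the two inputs are treated as identical even though their raw integer coordinates differ.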
4 Methodology

In this work, we derive a parameter configuration for MiniSat that is optimised for a subset of the test harnesses in aws-c-common.

Goto files. Using aws-batch-cbmc, we completed one batch of the CBMC tests for aws-c-common. This generated goto files for each of the test harnesses, which we subsequently used for all our tests. The system also reported the time each test spent in the solver. The 166 tests spend a total of 4252 seconds in the solver, with a median of 1.2 seconds; 159 tests spend less than 100 seconds each, and another four spend less than 200 seconds. The remaining three tests are notably slower, taking 488, 687 and 1005 seconds respectively in the solver.

Tests and parameters. Given the resources available to us, we limited our sample of test harnesses to the 125 tests that spend the least amount of time in MiniSat. They represent 254 seconds of total time in the solver before optimisation. CBMC uses the SimpSolver in MiniSat, the parameters of which are shown in Table 1. We decided to limit our optimisation task to the SimpSolver-specific parameters, leaving the remaining MiniSat parameters at their defaults. Early experiments revealed that any search for better parameters would keep the three Boolean parameters asymm, rcheck and elim at their default settings of false, false and true respectively. We therefore fixed these Boolean parameters at their default values and attempted to optimise only the remaining four parameters.

Table 1: MiniSat SimpSolver parameters

Parameter     Type           Default  Description
asymm         Bool           False    Shrink clauses by asymmetric branching
rcheck        Bool           False    Check if a clause is already implied
elim          Bool           True     Perform variable elimination
grow          Int (−∞, ∞)    0        Allow a variable elimination step to grow the clause count by this number of clauses
cl-lim        Int [0, ∞)     20       Do not eliminate a variable if it produces a resolvent longer than this limit
sub-lim       Int [0, ∞)     1000     Do not check subsumption against a clause larger than this
simp-gc-frac  Double (0, ∞)  0.5      The fraction of wasted memory allowed before a garbage collection is triggered during simplification

Objective function. Our goal was to find a single parameter configuration for the whole set of test harnesses that reduces the total time spent in evaluating the entire set. In more detail, suppose there are N test harnesses, and functions t_i map from the parameters to the evaluation time of the test harness indexed by i, such that

t_i : (g, c, s, r) ↦ τ,

where g, c, s and r are the MiniSat parameters grow, cl-lim, sub-lim and simp-gc-frac respectively, and τ is the time spent in MiniSat when that test harness is evaluated. For any fixed parameter configuration tuple (g′, c′, s′, r′), the objective function can then simply be Σ_i t_i(g′, c′, s′, r′). However, it is necessary to adjust the objective function with a timeout, since some parameter configurations lead to prohibitively long evaluation times. Therefore, when the objective begins to exceed the pre-defined timeout, we take note and terminate the evaluation.

Implementation. Our implementation of the Bayesian optimisation process for this task is in Matlab [10], a commercially available numerical computing platform and development environment. Matlab 2019 includes a comprehensive Bayesian optimisation framework that was suitable for this task. Since our objective is a measure of elapsed real time on a local machine, small variations can be expected in the objective for any fixed set of parameters; the framework in Matlab can be set to accept such a non-deterministic objective. The framework also accepts integer parameters, another key requirement of our project, since all parameters except the garbage collection fraction are discrete.

In the Matlab implementation of Bayesian optimisation, the objective function may return NaN (not a number) to signify an error. Matlab builds an error model to accommodate errors in the objective, and this model can accommodate non-deterministic errors. We chose to model timeouts in our objective function as errors; this allows the Bayesian optimisation framework to avoid regions of the parameter space that frequently cause timeouts. The choice of the timeout value involves a trade-off. A short timeout is resource efficient, reducing the time spent in regions of the parameter space that perform worse than the original configuration. However, poorly performing parameters are also informative of the shape of the objective function, and when they are modelled as errors due to the timeout, the GP model has less data to learn from. Based on the practical resources available to us, we specified a large timeout of 1200 seconds, more than four times the time taken by MiniSat in the original configuration.

Acquisition function & GP kernel. We used the expected improvement plus acquisition function from the framework in Matlab. This is similar to the expected improvement function described above, with a modification to avoid excessively exploiting one area; we used the default exploration ratio of 0.5. To obtain maximum control over the Bayesian optimisation process in Matlab, we used the bayesopt function. However, the kernel covariance function used by bayesopt is fixed to the Matérn 5/2 kernel, which is briefly discussed in Appendix A. The kernel hyperparameters are optimised internally by Matlab, and options to influence them are unavailable.
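The essence of this setup can be sketched with the following Matlab fragment. The bounds follow Table 2 below, with −1 standing in as one possible encoding of the "no limit" settings of cl-lim and sub-lim, and a small positive lower bound standing in for the open interval on simp-gc-frac; runSuite is a hypothetical placeholder for the code that runs the goto files through CBMC with the given MiniSat parameters. This is a sketch under those assumptions, not our exact harness code.

    % Sketch of a bayesopt configuration for this task. runSuite is a
    % hypothetical placeholder: it receives a one-row table of parameter
    % values, runs the 125 goto files with those MiniSat parameters, and
    % returns the total solver time, or NaN if the 1200 s timeout fires.
    % (Matlab variable names cannot contain hyphens, hence the underscores.)
    vars = [optimizableVariable('grow',         [-10000 10000], 'Type', 'integer'), ...
            optimizableVariable('cl_lim',       [-1 1000],      'Type', 'integer'), ...
            optimizableVariable('sub_lim',      [-1 100000],    'Type', 'integer'), ...
            optimizableVariable('simp_gc_frac', [1e-3 50])];
    results = bayesopt(@runSuite, vars, ...
        'AcquisitionFunctionName',  'expected-improvement-plus', ...
        'ExplorationRatio',         0.5, ...
        'IsObjectiveDeterministic', false, ...
        'MaxObjectiveEvaluations',  210);
    best = results.XAtMinObjective;   % table of the best parameters found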
5 Results

It is necessary to specify bounds on the parameter space for the optimisation process. Our specified bounds are shown in Table 2. These bounds were chosen to be generously large, again based on preliminary experiments. In the case of cl-lim and sub-lim, it is possible to set both parameters to −1 in MiniSat, which removes the clause limit and subsumption limit entirely, and we included that option in our parameter search space.

Table 2: Parameter bounds for optimisation

Parameter     Bounds
grow          [−10000, 10000]
cl-lim        [0, 1000] ∪ {∞}
sub-lim       [0, 100000] ∪ {∞}
simp-gc-frac  (0, 50.0]

After running the optimisation for a total of 210 iterations, which completed in thirty-four hours, the best discovered parameters are as shown in Table 3. In comparison to the original parameters, these parameters reduce the time spent in the solver from 254 seconds to 184 seconds, a reduction of about 28%. Figure 1 plots the minimum objective observed up to each iteration during Bayesian optimisation.

[Figure 1: The minimum observed objective (y-axis, 190–210 seconds) as Bayesian optimisation progresses through each iteration (x-axis, 0–200).]

Table 3: Improved parameters discovered using Bayesian optimisation

Parameter     Value
grow          1615
cl-lim        2
sub-lim       61603
simp-gc-frac  28.32

The times presented here are a measure of the time spent in MiniSat. Each invocation of CBMC to evaluate a goto file requires relatively expensive additional steps, such as the preparation of formulae, which are unaffected by changes to MiniSat parameters.

We have optimised a collection of test harnesses all together to find a single parameter configuration that, applied to all the tests, reduces the total time spent in the solver. An alternative is to optimise each test harness individually, and to save the discovered parameters along with the test harness. When the suite of test harnesses is evaluated, each test harness can then be evaluated with individualised solver parameters. The trade-off between this approach and the one presented here depends on the resources available and the nature of the test harnesses. For a basic comparison, we optimised two randomly selected test harnesses individually, for the same number of iterations. The results, shown in Table 4, show that the optimised parameters discovered for the two harnesses are very different, and that further improvement is achieved.

Table 4: Parameters optimised for two test harnesses individually, with solver times before (Orig.) and after (Opt.) optimisation.

Test harness        grow   cl-lim  sub-lim  simp-gc-frac  Orig.  Opt.  Gain
aws_byte_buf_cat    0      −1      1000     0.5           56s    51s   9%
aws_hash_table_put  −8208  230     36       29.68         78s    59s   24%

6 Future Work

We have focussed on optimisation of the SimpSolver-specific parameters in this work. However, another eleven parameters are available in MiniSat, and tuning all parameters together may yield improved results.

7 Conclusions

We have shown that the use of Bayesian optimisation to tune parameters of the default solver in CBMC is effective, and leads to a significant improvement in our experiments. Bayesian optimisation has been shown to be successful in a variety of optimisation tasks, and our results show that it can be used effectively to tune the parameters of satisfiability solvers.

8 Acknowledgements

This work was developed during an internship at the Automated Reasoning Group of Amazon Web Services, supervised by John Harrison.
Subsequent research was supported by the Engineering and Physical Sciences Research Council Doctoral Training studentship [reference number 1788755].

References

[1] Armin Biere. Yet another local search solver and Lingeling and friends entering the SAT Competition 2014. SAT Competition, 2014(2):65.
[2] Edmund Clarke, Daniel Kroening, and Flavio Lerda. A tool for checking ANSI-C programs. In International Conference on Tools and Algorithms for the Construction and Analysis of Systems, pages 168–176. Springer, 2004.
[3] Niklas Eén and Niklas Sörensson. An extensible SAT-solver. In Enrico Giunchiglia and Armando Tacchella, editors, Theory and Applications of Satisfiability Testing, pages 502–518, Berlin, Heidelberg, 2004. Springer Berlin Heidelberg.
[4] Eduardo C. Garrido-Merchán and Daniel Hernández-Lobato. Dealing with categorical and integer-valued variables in Bayesian optimization with Gaussian processes. Neurocomputing, 380:20–35, 2020.
[5] Jinbo Huang. The effect of restarts on the efficiency of clause learning. In Proceedings of the 20th International Joint Conference on Artificial Intelligence, IJCAI'07, pages 2318–2323, San Francisco, CA, USA, 2007. Morgan Kaufmann Publishers Inc.
[6] Frank Hutter, Domagoj Babić, Holger H. Hoos, and Alan J. Hu. Boosting verification by automatic tuning of decision procedures. In Formal Methods in Computer Aided Design, FMCAD'07, pages 27–34. IEEE, 2007.
[7] Frank Hutter, Holger H. Hoos, and Kevin Leyton-Brown. Sequential model-based optimization for general algorithm configuration. In Carlos A. Coello Coello, editor, Learning and Intelligent Optimization, pages 507–523, Berlin, Heidelberg, 2011. Springer Berlin Heidelberg.
[8] Frank Hutter, Holger H. Hoos, and Thomas Stützle. Automatic algorithm configuration based on local search. In Proceedings of the 22nd National Conference on Artificial Intelligence - Volume 2, AAAI'07, pages 1152–1157. AAAI Press, 2007.
[9] K. Rustan M. Leino and Michał Moskal. Co-induction simply. In Cliff Jones, Pekka Pihlajasaari, and Jun Sun, editors, FM 2014: Formal Methods, pages 382–398, Cham, 2014. Springer International Publishing.
[10] The MathWorks, Inc., Natick, Massachusetts. MATLAB version 9.7.0.1296695 (R2019b) Update 4, 2019.
[11] Ha Quang Minh, Partha Niyogi, and Yuan Yao. Mercer's theorem, feature maps, and smoothing. In International Conference on Computational Learning Theory, pages 154–168. Springer, 2006.
[12] Hakjoo Oh, Hongseok Yang, and Kwangkeun Yi. Learning a strategy for adapting a program analysis via Bayesian optimisation. ACM SIGPLAN Notices, 50(10):572–588, 2015.
[13] Carl Edward Rasmussen and Christopher K. I. Williams. Gaussian Processes for Machine Learning. MIT Press, Cambridge, MA, 2008.
[14] Bobak Shahriari, Kevin Swersky, Ziyu Wang, Ryan P. Adams, and Nando de Freitas. Taking the human out of the loop: A review of Bayesian optimization. Proceedings of the IEEE, 104(1):148–175, 2015.
[15] Carsten Sinz and Markus Iser. Problem-sensitive restart heuristics for the DPLL procedure. In Oliver Kullmann, editor, Theory and Applications of Satisfiability Testing - SAT 2009, pages 356–362, Berlin, Heidelberg, 2009. Springer Berlin Heidelberg.
[16] Jasper Snoek, Hugo Larochelle, and Ryan P. Adams. Practical Bayesian optimization of machine learning algorithms. In Advances in Neural Information Processing Systems, pages 2951–2959, 2012.
[17] Nikhil Swamy, Joel Weinberger, Cole Schlesinger, Juan Chen, and Benjamin Livshits.
Verifying higher-order programs with the Dijkstra monad. ACM SIGPLAN Notices, 48(6):387–398, 2013.
[18] Agnieszka Slowik, Chaitanya Mangla, Mateja Jamnik, Sean Holden, and Lawrence Paulson. Bayesian optimisation for heuristic configuration in automated theorem proving. In Laura Kovács and Andrei Voronkov, editors, Vampire 2018 and Vampire 2019: The 5th and 6th Vampire Workshops, volume 71 of EPiC Series in Computing, pages 45–51. EasyChair, 2020.

A Kernel covariance functions in GP regression

A kernel function is a real-valued function of two vector-valued arguments, k(x, x′) ∈ R, for x, x′ ∈ χ, where χ is some abstract space. Kernels may be viewed as a measure of similarity between the two arguments when the function is symmetric and non-negative. For example, the following kernel measures cosine similarity:

k(x, x′) = xᵀx′ / (‖x‖₂ ‖x′‖₂).

Kernel functions used as covariance functions in Gaussian processes must be positive definite. Such a kernel is called a Mercer kernel. Mercer's theorem (Minh et al. [11]) shows that for such kernels there exists a function φ : χ → R^D such that k(x, x′) = φ(x)ᵀφ(x′). The kernel is effectively computing an inner product of the two arguments in a D-dimensional space, where D is typically larger than the dimensionality of χ and is potentially infinite. For example, consider the following kernel for x, x′ ∈ R²:

k(x, x′) = (1 + xᵀx′)²
         = (1 + x₁x′₁ + x₂x′₂)²
         = 1 + 2x₁x′₁ + 2x₂x′₂ + (x₁x′₁)² + (x₂x′₂)² + 2x₁x′₁x₂x′₂
         = φ(x)ᵀφ(x′)

where

φ(x) = (1, √2·x₁, √2·x₂, x₁², x₂², √2·x₁x₂)ᵀ.

Therefore, this kernel represents a measure of similarity in a six-dimensional space for two-dimensional vectors. The cosine similarity kernel shown above is also a Mercer kernel.

As Snoek et al. [16] explain, the squared exponential kernel is often a default choice for the covariance function in Gaussian process regression:

k_SE(x, x′) = θ₀ exp(−½ r²(x, x′)),   where r²(x, x′) = Σ_{d=1}^{D} (x_d − x′_d)² / θ_d².

However, this may lead to sample functions in the GP that are unrealistically smooth for practical optimisation problems. Instead, Snoek et al. recommend the use of the Matérn 5/2 kernel:

k_M52(x, x′) = θ₀ (1 + √(5 r²(x, x′)) + (5/3) r²(x, x′)) exp(−√(5 r²(x, x′))).
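As a quick sanity check of the feature-map identity derived above, the following Matlab fragment compares the polynomial kernel evaluated directly against the explicit inner product in the six-dimensional feature space; the two test vectors are arbitrary choices for the example.

    % Numerical check that k(x, x') = (1 + x'ᵀx)² equals phi(x)ᵀphi(x') for
    % the two-dimensional polynomial kernel derived above.
    x  = [0.7; -1.2];
    xp = [0.3;  2.0];
    phi = @(u) [1; sqrt(2)*u(1); sqrt(2)*u(2); u(1)^2; u(2)^2; sqrt(2)*u(1)*u(2)];
    k_direct  = (1 + x' * xp)^2;        % kernel evaluated directly
    k_feature = phi(x)' * phi(xp);      % inner product of feature maps
    fprintf('direct = %.6f, via feature map = %.6f\n', k_direct, k_feature);

Both expressions evaluate to 1.4161 for these vectors, as the algebra predicts.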