Random Models of Very Hard 2QBF and
         Disjunctive Programs: An Overview?

       Giovanni Amendola1 , Francesco Ricca1 , and Miroslaw Truszczynski2
                              1
                               University of Calabria, Italy
                           {ricca,amendola}@mat.unical.it
                             2
                               University of Kentucky, USA
                                  mirek@cs.uky.edu


        Abstract. We present an overview of models of random quantified boolean
        formulas and their natural random disjunctive ASP program counter-
        parts that we have recently proposed. The models have a simple struc-
        ture but also theoretical and empirical properties that make them useful
        for further advancement of the SAT, QBF and ASP solvers.

        Keywords: QBF, ASP, Random models


1     Introduction
The study of theoretical and practical properties of random models has received
substantial attention in Computer Science. The results obtained in these studies
have had major practical impact especially in boolean satisfiability (SAT) [1,
13] and constraint satisfaction problems (CP) [10]. Random models are char-
acterized by intriguing phase-transition phenomena that are often associated
with the inherent hardness of instances. Inherently hard instances for SAT and
QBF solvers are essential for designing and testing search methods employed by
solvers [1]. Thus it is important to study models of random propositional formu-
las and QBFs that can reliably generate instances of a desired hardness [9].
    Previous work in this area mainly focused on random formulas in the con-
junctive normal form (CNF), and random prenex-form QBFs with the matrix in
CNF or disjunctive normal form (DNF) (depending on the quantifier sequence).
The fixed-length clause model of k-CNF formulas and its 2QBF extension have
been the subjects of extensive studies. Formulas in the fixed-length clause model
consist of m clauses over a (fixed) set of n variables, each clause with k non-
complementary literals. All formulas are assumed to be equally likely. For that
model it is known that there are reals ρl (k) and ρu (k) such that if m/n < ρl (k),
a formula from the model is almost surely satisfiable (SAT), and if m/n > ρu (k),
almost surely unsatisfiable (UNSAT). It is conjectured that ρl (k) = ρu (k). That
conjecture is still open. However, it holds asymptotically, i.e., the two bounds
converge to each other with k → ∞ [2]. An important empirical property of
these models is called the easy-hard-easy pattern [11]. Basically, instances from
?
    Results originally published in [4]
2       G. Amendola et al.

the phase transition region are difficult to be solved (hard in jargon), and so
they are useful for assessing solver performance; whereas those from regions on
both sides of the phase transition are easy to solve and thus, they are well suited
for solver testing and similar purposes [1].
    The fixed-length clause model was extended to QBFs by Chen and Interian
[6]. In addition to n and m (understood as above), their model includes pa-
rameters controlling the structure of formulas. Once these parameters are fixed,
similar properties as in the case of the k-CNF model emerge. There is a phase
transition region associated with a specific value of the ratio m/n (that does not
depend on n) and the easy-hard-easy pattern can be experimentally verified.
Both the fixed-length clause model for SAT and the Chen and Interian model
for QBF are based on formulas in normal forms. However, many applications give
rise to formulas in non-normal forms motivating studies of solvers of non-normal
form formulas and QBFs, and raising the need of models of random non-normal
form formulas. A first response to that challenge was provided by Navarro and
Voronkov [12, 7] with the fixed-shape model. The model is similar to the exten-
sion of the k-CNF one to QBFs, but fixed shape (and size) non-normal form
formulas are used in place of k-clauses as the key building blocks. Motivated
by the work on random SAT and QBF models, researchers proposed models of
random logic programs, and obtained empirical and theoretical results concern-
ing their properties [16, 14, 15]. Those results could be of substantial interest to
answer set programming (ASP) [5], a popular computational formalism based
on disjunctive logic programs. However, the first results in this area were limited
to non-disjunctive logic programs.
    In [4] we have presented new models of random non-normal form formulas
and 2QBFs. Specifically, we considered disjunctions of t k-CNF formulas (in the
case of QBFs, using them as matrices). We called models generating such formu-
las multi-component. Instances obtained according to our models are different
from the ones originating from the fixed-shape model of Navarro and Voronkov,
as their building blocks (k-CNF formulas) do not have a fixed size. QBFs from
our model have a natural representation as disjunctive logic programs, obtained
by extending the well-known encoding of [8]. Thus, a model of random disjunc-
tive logic programs originates from the multi-component models of QBFs.
    In this paper we give an overview of the random models presented in [4], and
in particular we focus on the theoretical properties of our models by recalling the
theoretical bounds on the location of the phase transition. Experimental results
reported in [4] confirm that our models exhibit the easy-hard-easy pattern in
correspondence of the phase transition. Thus, our results provide new ways to
generate hard and easy instances of SAT, and QBF formulas and ASP programs.


2   Preliminaries

In this section, we recall the fixed-length clause model for random CNF formulas
and the Chen-Interian model for random QBF formulas. We then describe the
models of QBFs and programs proposed by [4]. By C (k, n, m) we denote the
                                 Very Hard 2QBF and Disjunctive Programs            3

set of all k-CNF formulas consisting of m clauses over (some fixed) set of n
propositional variables. Similarly, D(k, n, m) stands for the set of all k-DNF
formulas of m products (conjunctions of non-complementary literals) over an
n-element set of atoms.
The fixed-length clause model. The model is given by the set C (k, n, m)
of CNF formulas, with all formulas assumed equally likely. Formulas from the
model can be generated by selecting m k-literal clauses over a set of n variables
uniformly, independently and with replacement. Let us denote by p(k, n, m) the
probability that a random formula in C (k, n, m) is SAT. We define ρl (k) to be
the supremum over all real numbers x such that for every ρ < x, it holds that
limn→∞ p(k, n, bρnc) = 1. Similarly, we define ρu (k) to be the infimum over
all real numbers x such that for every ρ > x, limn→∞ p(k, n, bρnc) = 0. It is
known that ρl (k) and ρu (k) are well defined. Moreover, ρl (k) ≤ ρu (k) and, it
is conjectured that ρl (k) = ρu (k). The conjecture holds asymptotically, that is,
limk→∞ ρl (k) = limk→∞ ρu (k).
The Chen-Interian model. The model generates QBFs of the form ∀X∃Y F ,
where sets X and Y are disjoint, |X| = A, |Y | = E, and F is a CNF for-
mula with m clauses, each containing a literals with variables in X and e
literals with variables in Y . We write Q(a, e; A, E; m) for the set of all such
QBFs. The Chen-Interian model generates QBFs from Q(a, e; A, E; m), with
all formulas equally likely. Let q(a, e; A, E; m) be the probability that a ran-
dom QBF from Q(a, e; A, E; m) is true. Let r > 0 be a fixed real. We set
νl (a, e; r) to be the supremum over all real numbers x such that for every ν < x,
limn→∞ q(a, e; A, E; bνnc) = 1, where A = brEc and n = A + E. Similarly, we
set νu (a, e; r) to be the infimum over all real numbers x such that for every
ν > x, limn→∞ q(a, e; A, E; bνnc) = 0, again with A = brEc and n = A + E.
Chen and Interian proved that νl (a, e; r) and νu (a, e; r) are well defined. Clearly,
νl (a, e; r) ≤ νu (a, e; r). Whether νl (a, e; r) = νu (a, e; r) is an open problem.
The quantities νl (a, e; r) and νu (a, e; r) delineate the phase-transition region.
For QBFs generated from the model Q(a, e; brEc, E; bνnc) (with fixed r), Chen
and Interian experimentally observed the easy-hard-easy pattern as ν grows,
showed that the hard region is aligned with the phase transition, and that the
same behavior emerges no matter what concrete r is fixed as the ratio A/E.


3    Multi-component models

Random SAT and QBF. Let F be a class of propositional formulas (or a model of
random formula). By t-F we denote the class of all disjunctions of t formulas from
F (or a model generating disjunctions of random formulas from F). Similarly,
if Q is a class (model) of QBFs of the form ∀X∃Y F , where F ∈ F, we write
t-Q for the class (model) of all QBFs of the form ∀X∃Y F , where F ∈ t-F. We
refer to models t-F and t-Q as multi-component. For QBFs we also consider the
dual model to t-Q, based on conjunctions of t DNF formulas. It gives rise to
the multi-component model of disjunctive logic programs via the Eiter-Gottlob
4       G. Amendola et al.

translation [8]. In all cases, we assume that formulas (QBFs, respectively) are
equally likely.
    In our work we focus on models based on the model C(k, n, m). First, we
note that the corresponding multi-component model t-C (k, n, m) has similar
satisfiability properties and that the phase transition regions in the two models
are closely related. Let pt (k, n, m) be the probability that a random formula in
t-C (k, n, m) is SAT.

Theorem 1 Let t ≥ 1 be a fixed integer. Then, for every ρ < ρl (k), it holds that
limn→∞ pt (k, n, bρnc) = 1, and for every ρ > ρu (k), limn→∞ pt (k, n, bρnc) = 0.

    The proof follows from the identity pt (k, n, m) = 1 − (1 − p(k, n, m))t .Thus, if
the phase transition conjecture holds for the single component model C (k, n, m),
it also holds for the multi-component model t-C (k, n, m), and the threshold value
is the same for every t.
    We also considered the multi-component model t-Q(a, e; A, E; m) of QBFs,
with the Chen-Interian model as its single-component. Let qt (a, e; A, E; m) be
the probability that a random QBF from t-Q(a, e; A, E; m) is true (in particular,
q1 (a, e; A, E; m) = q(a, e; A, E; m)). Extending Chen and Interian’s work, we can
prove that the phase transition for different values of t coincide (and coincide
with the phase transition in the Chen-Interian model).

Theorem 2 For every integer t ≥ 1 and real number r > 0: if ν < νl (a, e; r),
it holds that limn→∞ qt (a, e; A, E; bνnc) = 1, and if ν > νu (a, e; r), it holds that
limn→∞ qt (a, e; A, E; bνnc) = 0 (where A = brEc and n = A + E).

    The theorems above describe the situation when t is fixed and n is large.
When n is fixed and t grows, the analysis of pt (k, n, m) and qt (a, e; A, E; m)
shows that the region of the transition from SAT to UNSAT shifts to the right. Of
course, once we stop growing t and start increasing n again, the phase transition
region will move back to the left.

Random Disjunctive Programs. Our model of random disjunctive programs is
based on the translation from QBFs to programs due to Eiter and Gottlob [8].
It works on the model dual to the one we discussed above. The model consists
of QBFs Φ = ∃X∀Y F , where F ∈ D(e, a; E, A; m), the set of formulas dual to
C (e, a; E, A; m). The model clearly has the same properties (modulo a switch
between true and false).
    The Eiter-Gottlob translation has a simple extension to QBFs obtained from
the multi-component models. We write t-Ddlp (e, a; E, A; m) for the set of pro-
grams consisting of the following fixed part: z ∨ z 0 , for each z ∈ Z; y ← w, and
y 0 ← w, for each y ∈ Y ; w ← w1 , . . . , wt , and w ← not w. And the core of
mt Horn rules of the form wh ← z1 , . . . , z` , where h = 1, . . . , t, each wh is the
head of exactly m rules, ` = e + a, and the body of each rule has e atoms z and
z 0 with z ∈ X (and so, also a atoms of the form z and z 0 with z ∈ Y ). Note
that programs in 1-Ddlp (e, a; E, A; m) coincide with those in Ddlp (e, a; E, A; m)
modulo a rewriting, where the rule w ← w1 is removed and w1 is replaced by w.
                                  Very Hard 2QBF and Disjunctive Programs             5

   Programs in t-Ddlp (e, a; E, A; m) correspond to ∃∀ QBFs whose matrix be-
longs to t-D(e, a; E, A; m). They can be seen as the results of translating QBFs
Φ into logic programs Φdlp , where we use variables wi to represent component
DNF formulas in the matrix of Φ. The correspondence Φ 7→ Φdlp preserves the
semantics in the following sense.
Theorem 3 Let Φ = ∃X∀Y F , for F ∈ t-D(e, a; E, A; , m). Then Φ is true if
and only if Φdlp has an answer set, where Φdlp is the disjunctive logic program
in t-Ddlp (e, a; E, A; m) corresponding to Φ.
Empirical properties and conclusion. The experimental results on satisfiability [4,
3], agree with the above-reported theoretical analysis, that is formulas from our
models show the easy-hard-easy pattern and a strong dependence of hardness on
t. Thus, despite their simple structure, our models have theoretical and empirical
properties that make them important for further advancement of solvers.

References
 1. Achlioptas, D.: Random satisfiability. In: Handbook of Satisfiability, pp. 245–270
    (2009)
 2. Achlioptas, D., Moore, C.: The asymptotic order of the random k-sat threshold.
    In: FOCS 2002. pp. 779–788 (2002)
 3. Amendola, G., Ricca, F., Truszczynski, M.: A generator of hard 2qbf formulas and
    ASP programs. In: KR. pp. 52–56. AAAI Press (2018)
 4. Amendola, G., Ricca, F., Truszczynski, M.: Generating hard random boolean for-
    mulas and disjunctive logic programs. In: IJCAI. pp. 532–538. ijcai.org (2017)
 5. Brewka, G., Eiter, T., Truszczynski, M.: Answer set programming at a glance.
    Commun. ACM 54(12), 92–103 (2011)
 6. Chen, H., Interian, Y.: A model for generating random quantified boolean formulas.
    In: IJCAI 2005. pp. 66–71 (2005)
 7. Creignou, N., Egly, U., Seidl, M.: A framework for the specification of random SAT
    and QSAT formulas. In: TAP 2012. LNCS, vol. 7305, pp. 163–168 (2012)
 8. Eiter, T., Gottlob, G.: On the computational cost of disjunctive logic programming:
    Propositional case. AMAI 15(3-4), 289–323 (1995)
 9. Gent, I.P., Walsh, T.: Beyond NP: the QSAT phase transition. In: AAAI. pp.
    648–653 (1999)
10. Mitchell, D.G.: Resolution complexity of random constraints. In: CP 2002. LNCS,
    vol. 2470, pp. 295–309 (2002)
11. Mitchell, D.G., Selman, B., Levesque, H.J.: Hard and easy distributions of SAT
    problems. In: AAAI 1992. pp. 459–465 (1992)
12. Navarro, J., Voronkov, A.: Generation of hard non-clausal random satisfiability
    problems. In: AAAI 2005. pp. 436–442 (2005)
13. Selman, B., Mitchell, D.G., Levesque, H.J.: Generating hard satisfiability problems.
    Artif. Intell. 81(1-2), 17–29 (1996)
14. Wang, K., Wen, L., Mu, K.: Random logic programs: linear model. TPLP 15(6),
    818–853 (2015)
15. Wen, L., Wang, K., Shen, Y., Lin, F.: A model for phase transition of random
    answer-set programs. ACM Trans. Comput. Log. 17(3), 22 (2016)
16. Zhao, Y., Lin, F.: Answer set programming phase transition: A study on randomly
    generated programs. In: ICLP 2003. LNCS, vol. 2916, pp. 239–253 (2003)