Towards Deterministic Decomposable Circuits
                 for Safe Queries

                        Mikaël Monet1 and Dan Olteanu2
            1
              LTCI, Télécom ParisTech, Université Paris-Saclay, France,
          Inria Paris; Paris, France mikael.monet@telecom-paristech.fr
                 2
                   University of Oxford dan.olteanu@cs.ox.ac.uk


      Abstract. There exist two approaches for exact probabilistic inference
      of UCQs on tuple-independent databases. In the extensional approach,
      query evaluation is performed within a DBMS by exploiting the structure
      of the query. In the intensional approach, one first builds a representa-
      tion of the lineage of the query on the database, then computes the
      probability of the lineage. In this paper we propose a new technique to
      construct lineage representations as deterministic decomposable circuits
      in PTIME. The technique can apply to a class of UCQs that has been
      conjectured to separate the complexity of the two approaches. We test
      our technique experimentally, and show that it succeeds on all the queries
      of this class up to a certain size parameter, i.e., over 20 million queries.


1    Introduction

Probabilistic databases [1] have been introduced in answer to the need to cap-
ture data uncertainty and reasoning about it. This uncertainty can come from
various angles: imperfect sensor precision of scientific data, imprecise automatic
processes (e.g., natural language processing, rule mining in knowledge bases),
untrusted data sources (e.g., web crawling), etc. In their simplest and most com-
mon form, probabilistic databases consist of a relational database where each
tuple is annotated with a probability value that is supposed to represent how
confident we are about having this tuple in the database. While a traditional
(deterministic) database can only satisfy or violate a Boolean query, a proba-
bilistic database has a certain probability of satisfying it. Given a Boolean query
Q the probabilistic query evaluation problem for Q (PQE(Q)) then asks for the
probability that the query holds on an input probabilistic database. We measure
the complexity of PQE(Q) as a function of the input database, hence consider-
ing that the Boolean query Q is fixed. This is known as data complexity, and is
motivated by the fact that the queries are usually much smaller than the data.
    Unfortunately, even for very simple queries, PQE(Q) can be intractable.
When Q is a union of conjunctive queries (UCQ), a dichotomy result is provided
by the work of Dalvi and Suciu [2]: either Q is safe and PQE(Q) is PTIME, or Q
is not safe and PQE(Q) is #P-hard. The algorithm to compute the probability
of a safe UCQ exploits the first order structure of the query to find a so called
safe query plan (using extended relational operators that can manipulate prob-
abilities) and can be implemented within a DBMS. This approach is referred to
as extensional query evaluation, or lifted inference.
     A second approach to PQE is intensional query evaluation or grounded in-
ference, and consists of two steps. First, compute a representation of the lineage
of the query Q on the database D, which is a Boolean formula intuitively repre-
senting which tuples of D suffice to satisfy Q. Second, perform weighted model
counting on the lineage to obtain the probability. To ensure that model counting
is tractable, we use the structure of the query to represent the lineage in tractable
formalisms from the field of knowledge compilation, such as read once Boolean
formulas, free or ordered binary decision diagrams (OBDDs, FBDDs), deter-
ministic decomposable normal forms (d-DNNFs), decision decomposable normal
forms (dec-DNNFs), deterministic decomposable circuits (d-Ds), etc. The main
advantage of this approach compared to lifted inference is that the lineage can
help explain the query answer. Moreover, having the lineage in a good knowledge
compilation formalism can be useful for other applications: we could for instance
change the tuples’ probabilities and compute the new result easily, or compute
the most probable state of the database that satisfies the query.
    What we call the q9 conjecture, formulated by Dalvi, Jha, and Suciu [2, 3],
states that for safe queries, extensional query evaluation is strictly more powerful
than the knowledge compilation approach. Or in other words, that there exists
a query which is safe (i.e., can be handled by the extensional approach) whose
lineages on arbitrary databases cannot be computed in PTIME in a knowledge
compilation formalism that allows tractable weighted model counting (i.e., can-
not be handled by the intensional approach). Note that the conjecture depends
on the tractable formalism that we consider. The conjecture has recently been
shown by Beame, Li, Roy, and Suciu [4] to hold for the formalism of dec-DNNFs
(including OBDDs and FBDDs), which captures the traces of modern model
counting algorithms. Another independent result by Bova and Szeider [5] shows
that the conjecture also holds when we consider the class of deterministic struc-
tured negation normal forms (d-SDNNFs), which are d-DNNFs that follow the
structure of a v-tree [6]. However the question is still open for more expressive
formalisms, namely, d-DNNFs and d-Ds. Maybe the conjecture fails for such ex-
pressive formalisms, i.e., maybe the reason why PQE is PTIME for safe queries is
because we can build deterministic decomposable circuits in PTIME for them?
    In this paper we focus on a class of queries (the H-queries) that was conjec-
tured in [2, 3] to separate the two approaches and that was used to prove the
conjecture for dec-DNNFs [4] and d-SDNNFs [5]. Our first contribution is to
develop a new technique to build d-DNNFs and d-Ds in polynomial time for the
H-queries, based on what we call nice Boolean functions. Because we were not
able to prove that this technique works for all the safe H-queries, our second
contribution is to test this technique with the help of the SAT solver Glucose [7]
on all the H queries up to a certain size parameter, that we generated automat-
ically. We found no query on which it does not work. Interestingly, we found a
few queries for which we can build d-Ds with a single internal negation at the
very top, whereas we do not know if we can build d-DNNFs (could these queries
separate UCQ(d-DNNF) and UCQ(d-D)?). We conjecture that this technique
can build d-Ds for all safe H-queries.
   To do this analysis, we had to solve a task of independent interest, namely,
computing explicitly the list of all inequivalent monotone Boolean functions on
7 variables. This task had previously been undertaken by Cazé, Humphries, and
Gutkin [8] and by Stephen and Yusun [9]. We reused parts of the code from [8]
and confirmed the number of such functions: 490, 013, 148.

Paper structure We start our presentation with preliminaries in Section 2. We
then define the H-queries in Section 3 and review what is known about them.
In Section 4 we introduce our technique, and we experimentally demonstrate its
effectiveness in Section 5. Our code and all the functions are available online [10].

2   Preliminaries
We will consider in this work the most commonly used model for probabilistic
databases: the tuple-independent model, where each tuple is annotated with a
probability of being present or absent, assuming independence across tuples:
Definition 1. A tuple-independent (TID) database is a pair (D, π) consisting
of a relational instance D and a function π mapping each tuple t ∈ D to a
rational probability π(t) ∈ [0; 1]. A TID instance (D, π)Qdefines a probability dis-
tribution Pr on D0 ⊆ D, where Pr(D0 ) := t∈D0 π(t)× t∈D\D0 (1−π(t)). Given
                                            Q

a Boolean query Q, the probabilistic query evaluation problem for Q (PQE(Q))
asks, given as input a TID instance (D, π), the probability
                                                         P that Q is satisfied in
the distribution Pr. That is, formally, Pr(Q, (D, π)) := D0 ⊆D s.t. D0 |=Q Pr(D0 ).
    Dalvi and Suciu [2] have shown a dichotomy result on UCQs for PQE: either
Q is safe and PQE(Q) is PTIME, or Q is not safe and PQE(Q) is #P-hard.
Moreover they show that all the safe queries can be handled by the extensional
approach, i.e., by using the structure of the query to compute the probability.
Due to space constraints, we point to [2] for a presentation of their algorithm
to compute the probability of a safe query, though it is not strictly necessary
to understand the current paper. We denote by UCQ(P) the set of safe UCQs
(hence which corresponds to the set of tractable UCQs if P 6= #P).
    By contrast, in the intentional approach, one first computes a representation
of the lineage Lin(Q, D) of the query Q on the instance D:
Definition 2. The lineage of a Boolean query Q over D is a Boolean formula
Lin(Q, D) on the tuples of D mapping each Boolean valuation ν : D → {0, 1} to 1
or 0 depending on whether Dν satisfies Q or not, where Dν := {t ∈ D | ν(t) = 1}.
    The lineage can be represented with any formalism that represents Boolean
functions (Boolean formulas, BDDs, Boolean circuits, etc), but the crucial idea
is to use a formalism that allows tractable probability computation. In this
work we will specifically focus on deterministic decomposable circuits (d-Ds)
and deterministic decomposable normal forms [11] (d-DNNFs).
Definition 3. Let C be a Boolean circuit (featuring and, or, not, and variable
gates). An and-gate g of C is decomposable if for every two input gates g1 6= g2
of g we have Vars(g1 ) ∩ Vars(g2 ) = ∅, where Vars(g) denotes the set of variable
gates that have a directed path to g in C. We call C decomposable if each and-
gate is. An or-gate g of C is deterministic if there is no pair g1 6= g2 of input
gates of g and valuation ν of the variables such that g1 and g2 both evaluate to 1
under ν. We call C deterministic if each or-gate is. A negation normal form
(NNF) is a circuit in which the inputs of not-gates are always variable gates.
    Probability computation is in linear time for d-Ds (hence, for d-DNNFs): to
compute the probability of a d-D, compute by a bottom-up pass the probability
of each gate, where and gates are evaluated using ×, or gates using +, and not
gates using 1−x. While there does not seem to be any interest in using d-DNNFs
rather that d-Ds for probabilistic databases, we are also interested by d-DNNFs
from a knowledge compilation point of view, as it is currently not known if
d-Ds are strictly more succinct than d-DNNFs. We write UCQ(d-DNNF) (resp.,
UCQ(d-D)) to denote the set of UCQs Q such that for any database instance D,
we can compute in polynomial time (in data complexity) a d-DNNF (resp., d-D)
representation of Lin(Q, D). For a study of the intensional approach using weaker
formalisms for Boolean functions (read once formulas, ordered and free binary
decision diagrams), see [3]. Hence we have:

                     UCQ(d-DNNF) ⊆ UCQ(d-D) ⊆ UCQ(P)                                 (1)

    Dalvi, Jha, and Suciu [2, 3] conjectured that the inclusion UCQ(d-D) ⊆
UCQ(P) is strict, i.e., that the extensional approach is strictly more power-
ful than the intensional approach, and proposed a candidate query to separate
these classes (named q9 and that we define in the next section). The purpose of
this paper is to study this conjecture.


3    The H-queries
We define in this section the H-queries and review what is known about them.
The building blocks of these queries are the queries hki , which were first defined
in the work of Dalvi and Suciu to show the hardness of UCQs that are not safe:
Definition 4. Let k ∈ N, k ≥ 1. The queries hki for 0 ≤ i ≤ k are defined by:
 – hk0 = ∃x∃y R(x) ∧ S1 (x, y);
 – hki = ∃x∃y Si (x, y) ∧ Si+1 (x, y) for 1 ≤ i < k;
 – hkk = ∃x∃y Sk (x, y) ∧ T (y).
    We define the H-queries to be combinations of queries hki , as in [4]:
Definition 5. For k ≥ 1, we define the set of variables hki := {0, . . . , k}. Given
a Boolean function φ on variables hki, we define the Boolean query Qφk to be
the query represented by the first order formula φ[0 7→ hk0 , . . . , k 7→ hkk ], i.e., φ
where we substituted each variable i ∈ hki by the formula hki .
   The query class Hk (resp., Hk+ ) is then the set of queries Qφk when φ ranges
over all Boolean functions (resp., monotone Boolean functions) on variables hki.
                                        ∞              ∞
We finally define H (resp., H+ ) to be                    Hk+ ). Observe that the
                                        S              S
                                           Hk (resp.,
                                         k=1             k=1
queries in H+ are in particular UCQs.

Example 1. Let k = 3, and φ9 be the monotone Boolean function (2 ∨ 3) ∧ (0 ∨
3) ∧ (1 ∨ 3) ∧ (0 ∨ 1 ∨ 2). Then Qφ3 9 represents the query q9 = (h32 ∨ h33 ) ∧ (h30 ∨
h33 ) ∧ (h31 ∨ h33 ) ∧ (h30 ∨ h31 ∨ h32 ) ∈ H3+ , which is safe and was conjectured
in [2, 3] not to be in UCQ(d-D).

   To study the H-queries, we need the following notions on Boolean functions:

Definition 6. Let φ be a Boolean function on variables hki. We will always
consider a valuation ν of hki simply as the set of variables that ν maps to 1.
We write SAT(φ) the set of satisfying valuations of φ. We say that φ depends on
variable l ∈ hki if there exists a valuation ν ⊆ hki such that φ(ν∪{l}) 6= φ(ν\{l}).
We write DEP(φ) ⊆ hki for the set of variables on which φ depends. We call φ
and Qφk nondegenerate if DEP(φ) = hki (and degenerate otherwise).

    Then, if φ is degenerate (i.e., does not depend on all its k + 1 variables), Qφk
is safe and is in UCQ(d-DNNF) (in fact, even in UCQ(OBDD)):

Proposition 1 (Theorem 3.12 of [4], or Lemma 3.8 of [12]). Let k ≥ 1,
and Qφk ∈ Hk with DEP(φ) ( hki. Then Qφk ∈ UCQ(d-DNNF).

    This is in contrast to when φ is nondegenerate. Indeed, Beame, Li, Roy, and
Suciu then show [4] that Qφk do not admit polynomial sized decision decompos-
able NNFs (dec-DNNF). A dec-DNNF is a d-DNNF in which the determinism
of or gates is restricted to simply choosing the value of a variable [13, 14]. That
is, each or gate is of the form (v ∧ g) ∨ (¬v ∧ g 0 ) for some variable v. In fact,
they show a lower bound for more general representations than dec-DNNFs,
namely, for what they called Decomposable Logic Decision Diagrams (DLDDs),
which generalise dec-DNNFs in that they allow negations at arbitrary places and
also allow decomposable binary operator gates. When φ is monotone, another
independent lower bound by Bova and Szeider [5] tells us than we cannot im-
pose structuredness either (i.e., use d-SDNNFs) when φ is nondegenerate. These
results mean that for such queries, one cannot restrict too much the expres-
sivity of determinism. The question is then: do the nondegenerate queries have
polynomial sized d-DNNFs (or d-Ds)?
    Let us first see what the dichotomy theorem tells us about H-queries that are
nondegenerate. We shall restrict our attention to monotone functions now, i.e.,
to queries in H+ , because the dichotomy theorem applies only to UCQs, and Qφk
is not a UCQ when φ is not monotone. We need to define the CNF lattice of φ:

Definition 7. Let φ be a monotone Boolean function on variables hki such that
DEP(φ) = hki, and let FCNF = C0 ∧ . . . ∧ Cn be the (unique) minimized CNF
representing φ, where we see each clause
                                       S simply as the set of variables that it
contains. For s ⊆ hni, we define ds :=   Ci . Note that d∅ is ∅, and that we can
                                            i∈s
have ds = ds0 for s 6= s0 . The CNF lattice of φ is the lattice (L, ≤), where L
is {ds | s ⊆ hni}, and where ≤ is reversed set inclusion. In particular, the top
element 1̂ of LCNF is ∅, while its bottom element 0̂ is hki (because φ depends on
all the variables, hence each variable is in at least one clause).
Example 2. The Hasse diagram of the CNF lattice of φ9 is shown in Figure 1
(ignore for now the values at the right inside the nodes).


                                            ∅:1


             {2, 3} : -1      {1, 3} : -1          {0, 3} : -1


        {1, 2, 3} : 1      {0, 2, 3} : 1          {0, 1, 3} : 1    {0, 1, 2} : -1


                                     {0, 1, 2, 3} : 0


         Fig. 1. CNF lattice of φ9 with the values µ(n, 1̂) for each node n.


    The criterion of Dalvi and Suciu is based on the Mobius function [15] of the
CNF lattice L of φ. The Mobius function µ : L × L →  PZ on L is defined on pairs
(u, v) with u ≤ v by µ(u, u) = 1 and µ(u, v) = −         µ(w, v) for u < v. The
                                                           u<w≤v
dichotomy theorem for UCQs then states:
1. If µL (0̂, 1̂) 6= 0, then Qφk is #P-hard. Hence, if P is different from #P, we
   have Qφk ∈   / UCQ(d-D).
2. If µL (0̂, 1̂) = 0, then Qφk is PTIME. We do not know if Qφk ∈ UCQ(d-DNNF)
   or Qφk ∈ UCQ(d-D). But by what precedes, we know that Qφk ∈     / UCQ(dec-DNNF)
   and Qφk ∈   / UCQ(d-SDNNF).
   Our goal is to investigate, when we are in the second case (φ is monotone,
nondegenerate and safe), for which functions φ we can build d-DNNFs or d-Ds.

4   Nice Boolean Functions
In this section we present a technique to prove that some queries Qφk ∈ Hk are in
UCQ(d-DNNF). This will in particular apply to the query q9 (which, we recall,
was conjectured not to be in UCQ(d-D)). Our goal is to rewrite φ as φ0 ∨. . .∨φk ,
where the φi are mutually exclusive (φi ∧ φj ≡ ⊥ for i 6= j) and depend on strict
subsets Si ( hki of hki. When such a rewriting exists, we say that φ is nice:

Definition 8. Let φ be a Boolean function on variables hki. We call φ nice if
there exist strict subsets Si of hki and mutually exclusive Boolean functions (not
                                                                            k
                                                                            W
necessarily monotone) φi for 0 ≤ i ≤ k such that DEP(φi ) = Si and φ ≡         φi .
                                                                               i=0

    Observe that allowing to have more (or less) than k + 1 functions φi would
not change the definition of being nice. Also, note that if φ is degenerate, then
φ is trivially nice.

Example 3. The function φ9 is equivalent to the mutually exclusive disjunction
φ0 ∨ φ1 ∨ φ2 ∨ φ3 , where φ0 ≡ 0 ∧ ¬2 ∧ 3; φ1 ≡ ¬1 ∧ 2 ∧ 3; φ2 ≡ ¬0 ∧ 1 ∧ 3; and
φ3 ≡ 0 ∧ 1 ∧ 2. Moreover for 0 ≤ i ≤ 3 we have DEP(φi ) ( h3i, hence φ9 is nice.
                                                 k
    When φ is nice, we can express Qφk as              Qφk i , thus showing that Qφk ∈
                                                 W
                                                 i=0
UCQ(d-DNNF). Indeed, given a database D, we can use Proposition 1 to con-
struct in PTIME a d-DNNF C φi representing Lin(Qφk i , D) for each i ∈ {0, . . . , k},
                                  k
and then build the d-DNNF C φ =     C φi that represents Lin(Qφk , D). In other
                                 W
                                      i=0
words, the following holds:

Proposition 2. Let k ∈ N and φ be a Boolean function on variables hki. If φ
is nice, then Qφk ∈ UCQ(d-DNNF).

    Hence q9 ∈UCQ(d-DNNF). This result shows that, for all queries Qφk where
φ is nice, we can compute a d-DNNF representation of their lineage in PTIME,
and hence compute their probability efficiently. We do not know to which queries
this technique can be applied. Moreover also we have the following corollary:
Corollary 1. Let k ∈ N and φ be a Boolean function on variables hki. If ¬φ is
nice, then Qφk ∈ UCQ(d-D).
    We will call co-nice a function φ such that ¬φ is nice.


5    Experiments
We have presented a technique that can be used to show that some queries are
in UCQ(d-DNNF) or UCQ(d-D), but we have not characterized the queries to
which it applies. In this section, we present our experiments that show that for
all k ∈ {1, . . . , 6}, every nondegenerate monotone function φ for which Qφk is safe
is either nice or co-nice. Hence all the safe queries in Hk+ for k ∈ {1, . . . , 6} are
in UCQ(d-D). This suggests that UCQ(d-D) = UCQ(P), or at least that any
counterexample query in H+ must be in Hk+ for k ≥ 7.
    We used a machine with 40 x86 64 CPUs of 2.6 GHz and 512 GB RAM.
The code was written in Python 2.7.12 and parallelized using Python’s multi-
processing library. We explain briefly how we generated all the functions in Hk+
for k ∈ {1, . . . , 6} in Section 5.1, then explain how we tested niceness of these
functions in Section 5.2.


5.1   Generating H+
                  k

We started by generating the set R(k) of all monotone Boolean functions on
variables hki up to isomorphism, that is, up to renaming the variables. The
size of R(k) corresponds to the OEIS sequence A003182, which is only known
up to k = 6 (computed in [8] and in [9]). We used parts of the code from [8]
to generate all functions in R(k) for k in {1, . . . , 6}. We then filtered R(k) to
obtain the set of functions that are nondegenerate. Then we tested whether Qφk
is safe by computing the CNF lattice of φ and checking that µ(0̂, 1̂) = 0. Let us
call SND(k) the set of remaining functions (that is, the functions that are safe
and nondegenerate). It took about 2 weeks (using the 40 CPUs) to compute the
explicit lists of all the functions in R(k) and SND(k) for k ∈ {1, . . . , 6}, and the
sizes of these sets can be found in Table 1. We next explain how we tested the
niceness of each function in SND(k).


5.2   Testing Niceness

Let us call boxes the functions φi used in Definition 8. That is, φ is nice if and
only if we can partition its satisfying valuations into k+1 ordered boxes (we allow
some boxes to be empty), where the i-th box has a symmetry around variable i:

Definition 9. Let ν ⊆ hki be a valuation of hki, and l ∈ hki be a variable. We
define the valuation Toggle(ν, l) to be the valuation ν ∪ {l} if l ∈
                                                                   / ν and ν \ {l} if
l ∈ ν. We say that a set B of valuations of hki has a symmetry around variable
l if for every valuation ν ⊆ hki we have ν ∈ B iff Toggle(ν, l) ∈ B.

    To check if φ is nice, we build a CNF Nice(φ) that expresses exactly that
SAT(φ) can be partitioned nicely, i.e., Nice(φ) is satisfiable if and only if φ is
nice. We can then use a SAT solver.

Definition 10. Let k ≥ 1 and φ be a Boolean function on hki. We define the
CNF Nice(φ) as follows. Its set of variables is {xlν | ν ∈ SAT(φ) and l ∈ hki},
where xlν intuitively expresses that ν is put in box l. Its set of clauses is:
                                          k
                                               xlν , expressing the valuation ν must be
                                          W
1. For each ν ∈ SAT(φ), the clause
                                         l=0
   put in at least one box;
                                                                                       0
2. For each ν ∈ SAT(φ) and l, l0 ∈ {0, . . . , k} with l 6= l0 , the clause ¬xlν ∨ ¬xlν ,
   expressing that the valuation ν is in at most one box;
3. For each ν ∈ SAT(φ) and l ∈ {0, . . . , k}, then:
Table 1. Results of our experiments. The meaning of columns is explained after Propo-
sition 3.

    k           |R(k)|      |SND(k)|          |N (k)|    |co-N (k)|     |BAD(k)|
    1                 5              0              0            0              0
    2                10              0              0            0              0
    3                30              2              2            0              0
    4               210             25             25            0              0
    5           16, 353         2, 531         2, 529            2              0
    6     490, 013, 148   21, 987, 161   21, 987, 094           67              0


                        / SAT(φ), the clause ¬xlν ;
    (a) If Toggle(ν, l) ∈
    (b) Else, the clause ¬xlν ∨ xlToggle(ν,l) .
    This ensures that the box l has a symmetry around l.

Proposition 3. φ is nice iff Nice(φ) is satisfiable.

    Now for each function φ in SND(k), we constructed the CNF formula Nice(φ),
and used the SAT solver Glucose [7] to determine if it is satisfiable. If Nice(φ) is
satisfiable then Qφk ∈ UCQ(d-DNNF) and we store φ in N (k). If it is not we give
the formula Nice(¬φ) to Glucose. If this formula is satisfiable then φ is co-nice
and Qφk ∈ UCQ(d-D) (but we do not know if it is in UCQ(d-DNNF)) and we
store φ in co-N (k). If Nice(¬φ) is not satisfiable then φ is in BAD(k) and we do
not know if Qφk is in UCQ(d-D). The results of these experiments are displayed
in Table 1, and, as we found no function in BAD(k), imply:

Proposition 4. All the safe queries in Hk+ for k ∈ {1, . . . , 6} are in UCQ(d-D).

   We give here one of the 2 functions that are in co-N (5), φco−N1 := 24 ∧ 034 ∧
013 ∧ 12 ∧ 15 ∧ 05 ∧ 35 ∧ 23 ∧ 02 ∧ 25 ∧ 014 ∧ 45 where we write, for instance, 014
to mean 0 ∨ 1 ∨ 4. Could φco−N1 separate UCQ(d-DNNF) from UCQ(d-D)?


6       Conclusion
We have introduced a new technique to construct deterministic decomposable
circuits for safe H-queries and have experimentally demonstrated its effectiveness
on the first 20 million such queries. We conjecture that this technique can build
d-Ds for all safe H-queries. We leave open many intriguing questions:
 – For the H-queries, can we use the DNF lattice instead of the CNF lattice to
   decide if the query is safe [16]?
 – Can we show (unconditionally to P 6= #P) that if Qφk is not safe then φ is
   not nice?
 – What is the link between our technique and the notion of d-safety defined
   in [3]?
 – Do the queries in co-N separate UCQ(d-DNNF) from UCQ(d-D)?
    We have put online our code, and the complete list of all the functions studied
in Section 5.2, which we believe could be useful for people studying the H-queries.
It can be used, for instance, to enter by hand a monotone Boolean function φ
and check if Qφk is safe of not, draw its CNF lattice, check if φ is nice, etc.

Acknowledgements. We are grateful to Antoine Amarilli for careful proofreading
of the article (and for the many discussions about the code), to Romain Cazé
for pointing out to us his code to generate monotone Boolean functions, and to
Stephen Tamon for referring us to Romain Cazé. The second author would like
to acknowledge Guy van den Broeck for initial discussions on the q9 conjecture.
The fact that q9 is expressible as a succinct d-D has been already known to him
and has been mentioned in several of his presentations prior to this article. This
work was partly funded by the Télécom ParisTech Research Chair on Big Data
and Market Insights, and by the EPSRC platform grant DBOnto (L012138) that
funded Mikaël’s research visit at Oxford.

References
 1. Suciu, D., Olteanu, D., Ré, C., Koch, C.: Probabilistic Databases. Morgan &
    Claypool (2011)
 2. Dalvi, N., Suciu, D.: The dichotomy of probabilistic inference for unions of con-
    junctive queries. J. ACM 59(6) (2012)
 3. Jha, A., Suciu, D.: Knowledge compilation meets database theory: Compiling
    queries to decision diagrams. Theory of Computing Systems 52(3) (2013)
 4. Beame, P., Li, J., Roy, S., Suciu, D.: Exact model counting of query expressions:
    Limitations of propositional methods. ACM Trans. Database Syst. 42(1) (2017)
 5. Bova, S., Szeider, S.: Circuit treewidth, sentential decision, and query compilation.
    In: PODS. (2017)
 6. Pipatsrisawat, K., Darwiche, A.: New compilation languages based on structured
    decomposability. In: AAAI. (2008)
 7. Audemard, G., Simon, L.: Predicting learnt clauses quality in modern SAT solvers.
    In: IJCAI. (2009)
 8. Cazé, R.D., Humphries, M., Gutkin, B.: Passive dendrites enable single neurons
    to compute linearly non-separable functions. PLOS Computational Biology 9(2)
    (2013) Code available at https://github.com/rcaze/PlosCB2013.
 9. Stephen, T., Yusun, T.: Counting inequivalent monotone boolean functions. Dis-
    crete Applied Mathematics 167 (2014)
10. Monet, M.: mikael-monet.net/en/publications.html#monet2018towards
11. Darwiche, A.: On the tractable counting of theory models and its application to
    truth maintenance and belief revision. J. Applied Non-Classical Logics 11(1-2)
    (2001)
12. Fink, R., Olteanu, D.: Dichotomies for queries with negation in probabilistic
    databases. ACM Trans. Database Syst. 41(1) (2016)
13. Huang, J., Darwiche, A.: DPLL with a trace: From SAT to knowledge compilation.
    IJCAI’05 (2005)
14. Huang, J., Darwiche, A.: The language of search. J. Artif. Int. Res. 29(1) (2007)
15. Stanley, R.P.: Enumerative Combinatorics: Volume 1. 2nd edn. (2011)
16. Monet, M.: Möbius values of CNF and DNF lattices of a monotone boolean func-
    tion. http://cstheory.stackexchange.com/q/39754 (2018)