=Paper=
{{Paper
|id=Vol-2011/paper4
|storemode=property
|title=On the Quest for an Acyclic Graph
|pdfUrl=https://ceur-ws.org/Vol-2011/paper4.pdf
|volume=Vol-2011
|authors=Mikoláš Janota,Radu Grigore,Vasco Manquinho
|dblpUrl=https://dblp.org/rec/conf/aiia/JanotaGM17
}}
==On the Quest for an Acyclic Graph==
On the Quest for an Acyclic Graph
Mikoláš Janota1, Radu Grigore2, and Vasco Manquinho1
1
IST/INESC-ID, Lisbon, Portugal
2
School of Computing, University of Kent, UK
Abstract. The paper aims at finding acyclic graphs under a given set of
constraints. More specifically, given a propositional formula φ over edges
of a fixed-size graph, the objective is to find a model of φ that corresponds
to a graph that is acyclic. The paper proposes several encodings of the
problem and compares them in an experimental evaluation using state-
of-the-art SAT solvers.
1 Introduction
SAT solvers have become popular means of solving computationally hard prob-
lems. However, most modern SAT solvers require the problem to be specified in
conjunctive normal form (CNF). This may pose difficulty in problems compris-
ing of constraints that do not have a straightforward translation to propositional
logic. This paper targets one such constraint, graph acyclicity.
A graph is naturally modeled by a set of Boolean variables xjk expressing the
existence of an edge from j to k. So, for instance, a constraint x12 ∨ x13 expresses
that we’re looking for a graph where the node 1 is connected to at least one of
the nodes 2 and 3. But how do we ensure that we are looking for a graph that
is acyclic?
Acyclicity is a core concept from graph theory and naturally arises in a
number of applications. In planning it is used to ensure causality [42]. Similarly,
it is needed in networks used for computation, e.g. Bayesian networks [49]. More
recently, software verification of concurrent programs relies on relaxed memory
models, which are defined in terms of acyclic relations [34, 51].
There is a large body of work of translating constraints into CNF. Prime
examples are cardinality constraints [5,36,43] pseudo-Boolean constraints [6,17],
XOR-constraints [24] and general CSP [46]. A number of graph-related problems
approached by SAT can be found in the literature. For instance, encoding graph
connectivness property [47], graph coloring [50], calculating Steiner tree [14, 30]
or maximum clique [28]. It is also worth mentioning that graphs and SAT play
an important role in the research on Gene regulatory networks [15].
The paper has the following contributions. It reviews CNF encodings that are
found in the literature, and introduces a number of new ones. All these encodings
were implemented and extensively evaluated. To generate a well-defined testbed
we introduce The Supervisor Problem, which asks to assign supervisors to a
group of persons, where there is a lower bound for each person’s supervisors and
an upper bound on how many persons he or she can supervise. This problem
41
M. Janota et al. On the Quest for an Acyclic Graph
is a generalization of the well-known ordering principle (also known as the GT
family) studied in proof complexity [1, 10, 32, 44].
2 Preliminaries
Our basic setting is the following. We represent a directed graph on n ≥ 1
vertices by Boolean variables xij , associated with edges. A family of graphs
is defined by a formula φn (�x, �y ) over the edge variables �x and some auxiliary
variables �y . Each graph corresponds to a satisfying assignment of ∃�y φn (�x, �y ).
Given such a formula φn , we wish� to construct another � formula ψn , such that the
satisfying assignments of ∃�y ∃�z φn (�x, �y ) ∧ ψn (�x, �z ) correspond to the graphs
from the given family that, in addition, are acyclic. We refer to the formula ψn
as the acylicity checker or, for short, the checker. In the rest of the section, we
make this more precise, and we describe the particular formulas φn that we use
in experiments. The acyclicity checkers ψn are discussed later (§3). Standard
terminology from propositional logic and graph theory is assumed.
2.1 Propositional Logic and SAT Solving
Propositional formulas are constructed from propositional variables by Boolean
connectives (∨, ∧, ⇒) and negation (¬). A propositional formula is in conjunc-
tive normal form (CNF) if it is a conjunction of clauses, where a clause is a
disjunction of literals, where a literal is a propositional variable or its negation.
Whenever convenient, we treat CNF as a set of clauses and a clause as a set
of literals. Observe that the empty set of clauses corresponds semantically to
true and the empty clause corresponds semantically to false. The empty clause
is denoted as ⊥.
Any propositional formula can be converted to an equisatisfiable CNF for-
mula in linear time [41, 48].
The problem of satisfiability (SAT) is to decide for a given propositional
formula whether there is a satisfying assignments for it or not.
The paper requires minimal understanding of SAT solving. We will assume
that a SAT solver accepts a formula in CNF and gives the response “YES” if
the formula is satisfiable and “NO” otherwise. Most modern SAT solvers also
provide an actual satisfying assignment if the answer is “YES”.
When discussing the performance of SAT solvers, sometimes we refer to the
underlying proof system, which is propositional resolution. Given two clauses
x∨C and ¬x∨D their resolution is C ∨D. The two clauses are called antecedents
and the resulting clause the resolvent. Resolution proof of a clause C from a
formula φ is a sequence of clauses C1 , . . . , Ck such that Ck = C and each clause
Ci is either in φ or it is a resolvent of some Cj , Ck with 1 ≤ j < k < i. A
resolution refutation is a resolution proof of the empty clause. A CNF formula
is unsatisfiable if and only if there exists a resolution refutation for it. For most
modern SAT solvers it holds that for any run of the SAT solver that responds
“NO”, there exists a resolution refutation of the input formula linear to the run
42
M. Janota et al. On the Quest for an Acyclic Graph
of the solver. In fact, it even holds that for any resolution refutation there exists
a polynomial run of the solver [39].
2.2 Graphs
Throughout the paper we will be discussing directed graphs. A directed graph
G = (V, E) consists of a set V of vertices and a set E of edges. In this paper, we
work with finite graphs, and we fix V = [n] = {1, . . . , n}. We denote vertices by
i, j, k, . . . and we abbreviate the (directed) edge (i, j) by ij. For an edge ij, we
say that it has source i and target j; conversely, we say that ij is an outgoing
edge of i, and an incoming edge of j. The in-degree of a vertex is its number of
incoming edges, and the out-degree of a vertex is its number of outgoing edges.
A sink is a vertex with out-degree 0.
A graph property is a family G of graphs. In this paper, we consider only
properties Gn that contain graphs with a fixed number n of vertices, and are
defined by a formula φn . More precisely, φn has one variable xij for each pair
(i, j) ∈ V × V , and may also use some auxiliary variables �y . Then, a graph
with edges E is in Gn if and only if it corresponds to a satisfying assignment
of ∃�y φn (�x, �y ). We say that a property is monotone if adding edges to a graph
preserves the property. This corresponds to the Boolean function ∃�y φn (�x, �y )
being monotone. We remark that the property of having a cycle is monotone,
and the intersection of two monotone properties is monotone.
2.3 The No-Sink Family
The graphs with no sinks are described by the following formula:
��� �
φn (�x ) := xij (no-sink)
i∈[n] j∈[n]
Observe that the no-sink property is monotone. Also, the ordering principle says
that graphs with no sink have a cycle.
� It follows that, no �matter what acyclicity
check ψn we use, the formula ∃�y ∃�z φn (�x, �y ) ∧ ψn (�x, �z ) will be unsatisfiable.
Since we know the answer, the no-sink family is a good test case for the cor-
rectness of our implementation. In addition, as we shall see, these unsatisfiable
formulas are not easy for state-of-the art SAT solvers.
2.4 The Supervisor Problem
We define a parameterized family of graphs, which will typically contain both
cyclic and acyclic graphs. The idea is as follows. Consider a group of n people
where each person must have a certain number of supervisors and at the same
time there is a limit on how many people a person can supervise. The supervisor
problem is formalized as follows.
43
M. Janota et al. On the Quest for an Acyclic Graph
Input: A positive integer n and a sequence of pairs of nonnegative integers
(u1 , l1 ), . . . , (un , ln ).
Output: “YES” if and only if there exists an acyclic graph with the vertices
V = [n] such that vertex i has at most ui incoming edges and at least li outgoing
edges.
An instance n, �u, �l of the Supervisor problem reduces to the satisfiability of
the following formula:
�
φn,�u,�l (�x ) := atleast(li ; xi1 , . . . , xin )
i∈[n]
� (supervisor)
∧ atmost(uj ; x1j , . . . , xnj )
j∈[n]
The cardinality constraint atleast(l; �x ) evaluates to 1 when at least l of the
Boolean variables �x evaluate to 1. There are several ways to encode such cardi-
nality constraints in CNF [5, 17, 36, 43]. The cardinality constraint atmost(u; �x )
is analogous.
The Supervisor problem has three interesting special cases: (a) the no-sink
family; (b) the pigeonhole principle; and (c) the DAG realizability problem. We
obtain a description of the no-sink family by requiring the bounds (n, 1) for each
of the n vertices. We obtain a description of the pigeonhole principle by requiring
the bounds (0, 1) for k1 vertices and the bounds (1, 0) for k2 vertices, with
k1 + k2 = n and k1 > k2 . Finally, any instance of the DAG realizability problem
is also an �instance of�the Supervisor problem, which satisfies the additional
n n
constraint i=1 ui = i=1 li .
Since the pigeonhole principle is a special case, it follows that (supervisor)
sometimes requires large resolution refutations [25]. Since the DAG realizability
problem is a special case, it follows that it is NP-complete to find a model
of (supervisor) that corresponds to an acyclic graph [26]. Finally, we observe
that (supervisor) is not monotone.
3 Encodings
We discuss several ways of constructing the acyclicity checker ψn , based on
reachability (§3.1), based on labeling vertices with numbers (§3.2 and §3.3), and
based on oblivious algorithms (§3.4 and §3.5).
3.1 Transitive Closure
A graph G = (V, E) does not have cycles if and only if E + is irreflexive. Here, E +
denotes the irreflexive transitive closure of E. Further, G does not have cycles if
and only if there exists an irreflexive transitive relation R that contains E; that
is, if there exists R ⊆ V × V such that (a) ii ∈ / R for all i ∈ V , (b) ij ∈ R and
jk ∈ R only if ik ∈ R for all i, j, k ∈ V , and (c) ij ∈ R if ij ∈ E for all i, j ∈ V .
44
M. Janota et al. On the Quest for an Acyclic Graph
We represent the relation R by variables yij , and we define ψn by clauses that
directly correspond to the conditions (a), (b), and (c):
� � �
ψn (�x, �y ) := ¬yii ∧ (yij ∧ yjk → yik ) ∧ (xij → yij ) (tc1)
i∈[n] i,j,k∈[n] i,j∈[n]
A possible alternative is the following:
� � �
ψn (�x, �y ) := ¬yii ∧ (yij ∧ xjk → yik ) ∧ (xij → yij ) (tc2)
i∈[n] i,j,k∈[n] i,j∈[n]
In relational language, (tc1) corresponds to the conditions (a) R ∩ id = ∅,
(b) RR ⊆ R, and (c) E ⊆ R. Here, id is defined to be { ii | i ∈ V }. In (tc2), we
replace condition (b) by RE ⊆ R. Juxtaposing relations corresponds to normal
relational composition:
RE := { (i, j) | (i, k) ∈ R and (k, j) ∈ E, for some k }
If the family of graphs satisfying φn is monotone, in the sense that adding
edges preserves φn , then we do not need to use auxiliary variables yij , as we can
simply reuse the variables xij :
� �
ψn (�x ) := ¬xii ∧ (xij ∧ xjk → xik ) (tc3)
i∈[n] i,j,k∈[n]
The conjunction of (tc3) and (no-sink) has Ω(2n ) regular tree resolution proofs
but O(n3 ) dag resolution proofs; see [10, 32, 44] and [31, §7.2.2.2, Theorem R].
Such conjunctions express the ordering principle.
Observe that the formulas in each of (tc1), (tc2), and (tc3) have size Θ(n3 ).
3.2 Binary Labeling
A graph is acyclic when it can be embedded in a linear order; that is, when there
exists a labeling l : V → Z of the vertices such that ij ∈ E implies l(i) < l(j)
for all i, j ∈ V . Recall that V = [n]. Clearly, it is sufficient to use b := �log2 n�
bits for labels, so l : V → {0, 1}b . Accordingly, we use n groups �y1 , . . . , �yn of
auxiliary variables, each containing b Boolean variables. The acyclicity checker
is then [42]:
�� �
ψn (�x, �y1 , . . . , �yn ) := xij → less(�yi , �yj ) (bin)
i,j∈[n]
The implementation of less depends on how we represent numbers. Because we
use binary numbers, we set less := lexb , and define the latter to test lexicographic
ordering of the bits:
lex0 () := 0 (1)
� �
lexb (�y y, �zz) := (¬y ∧ z) ∨ (¬y ∨ z) ∧ lexb−1 (�y , �z ) (2)
45
M. Janota et al. On the Quest for an Acyclic Graph
Converting this circuit to CNF will require some more auxiliary variables, which
we do not denote explicitly. The conversion is standard, and the size of the
resulting CNF is Θ(b).
Observe that the formula in (bin) has size Θ(n2 log n).
3.3 Unary Labeling
We can also use the labeling idea with unary numbers. To represent a number,
instead of �log2 n� bits we use n − 1 bits, and force them to follow the pattern
0∗ 1∗ . We could still compare these numbers by their lexicographic order using lex,
but we can also write directly a CNF formula, thus avoiding the (small) overhead
associated with converting a circuit into CNF.
The acyclicity checker is defined by
�� � �
ψn (�x, �y1 , . . . , �yn ) := xij → less(�yi , �yj ) ∧ unary(�yi ) (unr)
i,j∈[n] i∈[n]
where each �yi uses n − 1 bits.
We could use less := lexn−1 , but we prefer less(�y , �z ) := ∃�u lessunr(�y , �z, �u ),
where
n−1
� n−1
�
� �
lessunr(�y , �z, �u ) := (¬yi ∨ ¬ui ) ∧ (zi ∨ ¬ui ) ∧ ui (3)
i=1 i=1
The auxiliary �u variables will identify a position i where yi = 0 and zi = 1. This
is sufficient because we enforce the pattern 0∗ 1∗ .
n−1
�
unary(�y ) := (yi−1 → yi ) (4)
i=2
Observe that the formula in (unr) has size Θ(n3 ).
3.4 Warshall
The Warshall algorithm computes transitive closure, in cubic time.
Warshall
1 for k ∈ [n]
2 for i ∈ [n]
3 for j ∈ [n]
4 aij := Or(aij , And(aik , akj ))
Initially, aij says if the graph contains an edge ij; and, at the end aij says if the
graphs contains a path from i to j. In particular, at the end, aii says if there is
a cycle containing vertex i.
In a usual implementation, the procedures Or, And are simply performing
the corresponding Boolean operations. Instead, we could replace each of them
46
M. Janota et al. On the Quest for an Acyclic Graph
with a function that constructs the corresponding circuit. Because the Warshall
algorithm is simple enough, we can also build the acyclicity checker directly.
We use n2 (n + 1) auxiliary variables yijk , for i, j ∈ [n] and k ∈ {0, . . . , n}. The
variable yijk will be 1 only if there exists a path from i to j that uses intermediate
vertices only from the set [k].
� � �
ψn (�x, �y ) := ¬yiin ∧ (xij → yij0 ) ∧ (yij(k−1) → yijk )
i∈[n] i,j∈[n] i,j,k∈[n]
� (fw)
∧ (yik(k−1) ∧ ykj(k−1) → yijk )
i,j,k∈[n]
We remark that the formula above is close but not the same as what we would
obtain by replacing the basic operations Or, And with circuit builders. One dif-
ference is that we use implications rather than equivalences: It is possible that
yijk = 1 even if there is no corresponding path. Another difference is that the
last conjunct built by Warshall would be (yik(k−[k≥j]) ∧ ykj(k−[k≥i]) ) ↔ yijk ,
where [p] evaluates to 1 when p is true, and 0 when p is false. Either form is
correct.
The formula in (fw) is similar to the one in (tc1). It has the same size Θ(n3 ),
but uses more auxiliary variables.
3.5 Matrix Multiplication
Let us take stock of the encodings we have so far. Four encodings have size
Θ(n3 ), namely (tc1), (tc2), (unr), and (fw); another encoding has size Θ(n2 log n),
namely (bin). But, size is not the only criterion on which to judge a satisfiability
formula. We note that (tc1), (tc2), and (fw) have the following desirable property:
if the edge variables �x are fixed, then satisfiability is decided by unit propagation.
The smaller encoding (bin) does not have this property. Is it possible to have a
formula of sub-cubic size that is solved by unit propagation once the edge vari-
ables �x are fixed? We answer this question affirmatively by showing an encoding
based on matrix multiplication. We first describe the algorithm. The formula ψn
will be constructed by replacing the basic operations with circuit builders, and
then converting the circuit to CNF.
We will consider n×n matrices of two kinds, over integers and over Booleans.
Integers form a commutative ring, so the corresponding matrix operations are
unrestricted. Booleans do not form a ring: We take addition to be logical-or
and we take multiplication to be logical-and, but there is no subtraction. Since
we use both types of operations and sometimes reinterpret the same matrix as
being over integers or over Booleans, we distinguish Boolean matrices with a hat.
When we reinterpret an integer matrix X as a Boolean matrix X̂, all non-zero
integers collapse to 1.
Let X be the adjacency matrix of a graph, with entries in {0, 1}. Entry (i, j)
of X l is the number of paths of length l from i to j, and entry (i, j) of X̂ l tells
us if there exists a path of length l from i to j. Thus, to decide if the graph
47
M. Janota et al. On the Quest for an Acyclic Graph
�n l
has a cycle it suffices to check the diagonal of the Boolean matrix l=1 X̂ . We
compute this matrix by repeated squaring:
Â0 := X̂ Âk+1 := Â2k (5)
B̂0 := X̂ B̂k+1 := B̂k + ÂB̂k (6)
k � 2k
It is easy to check that Âk = X̂ 2 and B̂k = l=1 X̂ l , and also that B̂k equals
�n l
l=1 X̂ whenever k ≥ log2 n. Thus we can decide if a directed graph is acyclic
in the time needed to perform Θ(log n) multiplications of n × n Boolean matri-
ces. One could perform a naive matrix multiplication in Θ(n3 ) time, giving an
acyclicity checker ψn of size Θ(n3 log n).
Alternatively, to multiply Boolean matrices  and B̂, we multiply them as
integer matrices obtaining C = AB; the product of  and B̂ is then Ĉ. (The
idea of using intermediate integer matrices is old [18, 19, 37].) To perform the
multiplication of integer matrices, we use Strassen’s algorithm [45], which works
in O(n2.81 ) time. To achieve this improved runtime, Strassen’s algorithm makes
essential use of subtraction, which is not available over Booleans. However, the
runtime O(n2.81 ) counts operations over integers, while we must count operations
over bits since we ultimately want to build a circuit. Since the matrices A and B,
which we multiply, have entries from {0, 1}, the product matrix C has entries
from {0, . . . , n}. Some intermediate values in Strassen’s algorithm are outside
this range. Nevertheless, one can check that Θ(log n) bits are sufficient to repre-
sent all intermediate values. So, the size of the circuit is O(n2.81 log n · f (log n)),
where f (w) is the size of a circuit necessary to perform an arithmetic opera-
tion on integers with w bits. In our implementation, we use a simple quadratic
multiplication algorithm, leading to a circuit of size O(n2.81 �log3 n).
In theory, one can use the same approach to achieve O nω log2 n log log n ·
∗ �
8log log n by using better algorithms for the multiplication of matrices [20] and
for the multiplication of integers [27]. But such algorithms are known to not
be practical. Our current implementation based on Strassen is not competitive
either (§4). Improving it seems to require new ideas.
Discussion. The approach we took for both Warshall and for matrix multiplica-
tion was to replace the basic operations in the algorithm with circuit builders.
This approach is fairly general, but it does not apply to all algorithms, just to
the oblivious ones. An algorithm is said to be oblivious when the memory loca-
tions it accesses do not depend on the input data. If an algorithm runs in t(n)
time in the Turing model � of computation,
� then one can construct an oblivious
algorithm that runs in O t(n) log t(n) time, and hence a Boolean circuit of that
size [40]. In contrast, a similar construction does not exist in the deterministic
RAM model of computation [23].
The standard (and optimal) algorithm for finding cycles in directed graphs
is DFS (depth-first search). However, we are not aware of an implementation of
DFS that works in sub-cubic time in the (oblivious, deterministic, multi-tape)
Turing machine model of computation.
48
M. Janota et al. On the Quest for an Acyclic Graph
solver/checker tc1 unr bin fw tc2 solver/checker tc1 unr bin fw tc2
lingeling 49 48 10 38 11 lingeling 436 429 426 435 435
glucose 32 37 14 15 11 glucose 437 434 427 437 437
minisat 25 49 12 12 11 minisat 435 425 424 435 435
minisat-prepro 26 49 11 19 11 minisat-prepro 435 426 422 435 435
(a) GT Family with 49 instances (b) Supervisor Family with 441 instances
Fig. 1. Number of solved instances for sizes 2-50.
4 Experimental Evaluation
Overall setup. Three popular state-of-the-art SAT solvers were used for the
evaluation: Lingeling Version 276 [8], glucose 4.0 [3], and minisat 2.2 (from
github) [16]. Further, minisat was a run in two modes: the default mode, and
with additional preprocessing techniques enabled (asymmetric breaking and re-
dundancy checking). The encodings we consider are Transitive closure (tc1),
Unary labeling (unr), Binary labeling (bin), Warshall (fw), and an alternate ver-
sion of Transitive closure (tc2). The techniques based on matrix multiplication
are not included, because they quickly lead to formulas in the realm of hundred
of megabytes, and more (Figure 3).
The experimental results were obtained on an Intel Xeon 5160 3GHz, with
4GB of memory, time limit 500 seconds, and memory limit 2GB.
Generation of Graph Families. The generation of the no-sink graph family is
trivial: we simply instantiate (no-sink) for different values of n. We also generate
random instances of the Supervisor problem. Each upper bound ui on the in-
degree of vertex i is chosen uniformly at random from the set {0, . . . , n − 1}.
Each lower bound li is 0 with probability p, which is a parameter, and otherwise
is chosen uniformly at random from the set {1, . . . , n − 1}. If the generated
sequence (l1 , u1 ), . . . , (un , ln ) is not satisfied by any directed graph, we throw
it away and try again. This filtering is easy, because it is possible to check in
linear time whether a sequence is satisfiable by some directed graph, using the
Fulkerson–Chen–Anstee theorem [2]; but, as we mentioned, checking whether a
sequence is satisfiable by an acyclic directed graph is NP-complete [26].
For each n ∈ {2, 3, . . . , 50} and for each p ∈ {10%, 20%, . . . , 90%}, we gener-
ate a sequence (l1 , u1 ), . . . , (ln , un ) using the method just described. For each
such sequence, we generate the formula (supervisor). The atmost and atleast
constraints were encoded into CNF using the well-known Sequential Counter
encoding [43].
Presentation of the results. Figure 1 presents the number of instances solved.
Figure 1(a) shows the results for graphs with no sinks, which lead to the GT fam-
ily of formulas when coupled with the acyclicity checker (tc1). Figure 1(b) shows
the results for the general supervisor problem. Figure 2 shows a cactus for the
(no-sink) family for all the considered combinations of solvers and encodings.
49
M. Janota et al. On the Quest for an Acyclic Graph
500
glucose-bin
450 glucose-fw
glucose-tc2
glucose-unr
400 glucose-tc1
lingeling-fw
350 lingeling-tc2
lingeling-tc1
lingeling-bin
300
CPU time (s)
minisat-prepro-fw
minisat-prepro-tc2
250 lingeling-unr
minisat-prepro-tc1
minisat-prepro-bin
200 minisat-prepro-unr
minisat-bin
150 minisat-fw
minisat-tc2
100 minisat-unr
minisat-tc1
50
0
0 5 10 15 20 25 30 35 40 45 50
instances
Fig. 2. Cactus plot for the (no-sink) family.
Recall that if a cactus plot contains a point (i, s) it means there are i instances
such that each was solved within s seconds. (Therefore, the graph is necessarily
increasing.) A more detailed presentation of the results can be found on the
authors’ homepage.3
The first thing to observe is that the supervisor family turns out to be easy
for all the approaches. Indeed, all approaches solved over 90% benchmarks. Even
though the differences are rather small, the tables suggests that the Transitive
closure encoding ((tc1)) is good for all solvers. In contrast, Binary labeling (bin)
appears to be the worst.
In the case of (no-sink) graphs, the results are widely different for each en-
coding but also for each solver. Most notably, lingeling performs extremely well
on (tc1). Both versions of minisat perform well on Unary labeling (unr). Glucose
also performs the best with (unr), but once the graph size gets over 30 nodes,
the performance quickly deteriorates. The worst encoding appears to be Binary
labeling (bin) and the alternative Transitive closure (tc2), with which all solvers
could only handle less than half of all instances. Warshall has a peculiar behavior,
as it performs poorly for all the solvers except lingeling.
3
http://sat.inesc-id.pt/~mikolas/sat-cycles/
50
M. Janota et al. On the Quest for an Acyclic Graph
Discussion. The results are surprisingly uneven across different solvers. In a
particular, lingeling performs extremely well with (tc1), while the other solvers
do not perform very well with the same encoding. It is possible that lingeling
benefits from some simplification technique (pre- or in-processing) that is missing
from the other solvers, but this remains to be understood. Binary labeling (bin) is
overall the most compact but with the poorest performance. This is not entirely
surprising. First, binary labeling does not offer many opportunities for applying
unit propagation. And second, binary labeling makes it virtually impossible to
learn anything about any particular variable.
In contrast, Unary labeling (unr) fares quite well over the considered solvers.
Further, we observed that minisat with preprocessing solves the (no-sink) family
with no conflicts, i.e., merely with preprocessing. This also holds for lingeling
but only up to a certain size of the graph. It is not difficult to see how these
formulas can be completely solved by preprocessing, in particular by failed literal
detection and asymmetric breaking.
In unary labeling (§3.3), the ordering of labels is ensured by the following
constraint. For every vertex i labeled by a number l(i) ∈ {0, . . . , n − 1} there
must be a neighbor j labeled by a number l(j) ∈ {0, . . . , n − 1} such that
l(i) < k ≤ l(j) for some k ∈ {0, . . . , n − 1}. To show that such a labeling does
not exist for no-sink graphs, we reason as follows. If k = n − 1, it immediately
follows that vertex j has no neighbor labeled with a greater number since it has
reached the maximum. Performing this reasoning for every vertex yields that the
labels must be in the interval {0, . . . , n − 1}. Repeating this reasoning n times
leads to a contradiction.
We argue that to perform this reasoning, it is sufficient to use unit propa-
gation on unary labeling. In definition (3), setting uk := 1 imposes exactly the
condition as listed above, i.e., the label of the source node is at most k − 1 and
the target node at least k. It is easy to verify that setting un−1 := 1 yields a
conflict by unit propagation.
This idea is realized in the preprocessing techniques failed literal detec-
tion [12, 35] and asymmetric breaking [38]. More generally, if setting a literal l
to true yields a conflict by unit propagation it is called a failed literal and the
literal is set to false. In this manner, these two techniques acting in tandem
can gradually remove the ui literal corresponding to the highest number, and
eventually derive an empty clause.
Conclusion. The experimental results support the following conclusions. The
combination of choice for checking acyclicity is lingeling with Transitive clo-
sure (tc1). (Do not use the seemingly very similar alternative Transitive clo-
sure (tc2), which leads to poor performance.) Minisat is the second solver of
choice but not with the (tc1) encoding—instead, use (unr). SAT solvers appear
to be good at finding an acyclic graph if it exists; proving that an acyclic graph
does not exist can easily become very difficult.
Formula Size. Figure 3 presents the size of the acyclicity checkers (§3). The cubic
methods (tc1, tc2, unr, fw) are all within a factor of 3 of each-other. The super-
51
M. Janota et al. On the Quest for an Acyclic Graph
109
tc1
acyclicity checker formula size
108 tc2
107 bin
unr
106 fw
mm
105 ss
104
103
102
101
100
1 10 100
graph vertex count
Fig. 3. The size of the acylicity checker formula ψn as a function of the number of
vertices n. Five lines correspond to equations (tc1), (tc2), (bin), (unr), and (fw). The
remaining two correspond to Boolean matrix multiplication methods, using the defini-
tion (mm), and using Strassen’s algorithm � (ss). The size of a CNF formula consisting
of clauses C1 , . . . , Cm is defined to be m + m i=1 |Ci |.
cubic one (mm, which is based on the definition of Boolean matrix multiplication)
is already worse than the cubic ones at n = 3, and it only gets worse for larger n.
There are two sub-cubic methods: ss with size O(n2.81 log3 n), and bin with size
O(n2 log n). The method based on Strassen (ss) generates large formulas even
for n = 10 and, looking at Figure 3, there is no overtaking in sight. Indeed, even
if the constants hidden in the O-notation were just as good for ss as they are
for tc1, the crossing point would be at n ∼ 1032 . The log3 n factor is important,
and it seems that without new ideas this method will never be practical. On the
other hand, the method based on binary labeling (bin), generates the smallest
acyclicity checkers for n ≥ 100.
5 Summary and Future Work
The paper presents a number of encodings for the graph acyclicity constraint into
conjunctive normal form. This is useful when calculating acyclic graphs using
SAT solvers but also other logic-based solvers, such as quantified Boolean for-
mula (QBF) and Satisfiability modulo theories (SMT) solvers. The experimental
evaluation suggests that the performance is highly sensitive both to be encoding
but also to the SAT solver used. This suggests that for practical purposes it is
useful to maintain a portfolio of encodings and solvers.
There are a number of directions for further research. All the encodings that
are asymptotically better than cubic perform poorly in practice. Hence, it is a
challenge for future research to identify encodings with asymptotic complexity
better than cubic but that also perform well.
52
M. Janota et al. On the Quest for an Acyclic Graph
Ultimately, all the considered encodings are not feasible for large graphs.
Indeed, any cubic encoding generates formulas that are too large once the number
of nodes gets over 1000. A possible solution is to consider a lazy approach where
only some portion of the encoding is considered. However, this will probably
require an iterative sequence of calls to the SAT solver. Another approach, tried
recently [9,13], is to integrate specialized algorithms for acyclicity inside the SAT
solver. This is also closely related to loop formulas, which are used in Answer
Set Programming (ASP) to avoid cyclic reasoning cf. [21, 29, 33].
Some questions about properties of the encodings with respect to unit propa-
gation remain open. In particular, is it the case that any partial assignment that
contains a cycle produces a conflict by unit propagation in a given encoding? In
theory this is possible due to monotonicity [7] (adding edges preserves cycles).
We believe this holds for the encodings Transitive closure, Warshall, and matrix
multiplication, but does not hold for Unary and Binary labeling. Hence, this also
begs the question whether the labeling methods can be made more determinis-
tic. Other properties related to unit propagation are also worth exploring, e.g.,
generalized arc consistency and propagation completeness, cf. [4, 7, 11, 22, 24].
The experimental results lead to several interesting questions. Why is lin-
geling so efficient on the transitive closure encoding (tc1)? Why does the alter-
native encoding for transitive closure (tc2) perform so poorly across all solvers?
Do formulas (tc2) lack short resolution refutations?
Last but not least, this research opens avenues for exploring other graph-
related properties, e.g., connectivness and reachability in general.
Acknowledgments
This work was supported by national funds through Fundação para a Ciência e a
Tecnologia (FCT) with reference UID/CEC/50021/2013, and by the PrideMM Web
Interface grant from VeTSS.
References
1. Alekhnovich M, Johannsen J, et al. An exponential separation between regular and
general resolution. Theory of Computing, 3(5):81–102, 2007.
2. Anstee RP. Triangular (0, 1)-matrices with prescribed row and column sums. Dis-
crete Mathematics, 40(1):1–10, 1982.
3. Audemard G, Simon L. Predicting learnt clauses quality in modern SAT solvers.
In International Joint Conference on Artificial Intelligence (IJCAI). 2009.
4. Babka M, Balyo T, et al. Complexity issues related to propagation completeness.
Artif Intell, 203:19–34, 2013.
5. Bailleux O, Boufkhad Y. Efficient CNF encoding of boolean cardinality constraints.
In Principles and Practice of Constraint Programming (CP). Springer, 2003.
6. Bailleux O, Boufkhad Y, et al. A translation of pseudo-boolean constraints to SAT.
Journal on Satisfiability, Boolean Modeling and Computation, 2006.
7. Bessiere C, Katsirelos G, et al. Circuit complexity and decompositions of global
constraints. In International Joint Conference on Artificial Intelligence (IJCAI),
pages 412–418. 2009.
53
M. Janota et al. On the Quest for an Acyclic Graph
8. Biere A. Lingeling, Plingeling and Treengeling entering the SAT competition. Pro-
ceedings of SAT Competition, 2013.
9. Bomanson J, Gebser M, et al. Answer set programming modulo acyclicity. Fundam
Inform, 2016.
10. Bonet ML, Galesi N. Optimality of size-width tradeoffs for resolution. Computa-
tional Complexity, 10(4):261–276, 2001.
11. Bordeaux L, Marques-Silva J. Knowledge compilation with empowerment. In In-
ternational Conference on Current Trends in Theory and Practice of Computer
Science (SOFSEM), pages 612–624. Springer, 2012.
12. Crawford JM, Auton LD. Experimental results on the crossover point in satisfia-
bility problems. In AAAI, volume 93, pages 21–27. 1993.
13. Cuteri B, Dodaro C, et al. Constraints, lazy constraints, or propagators in ASP
solving: An empirical analysis. In International Conference on Logic Programming
(ICLP). 2017. http://arxiv.org/abs/1707.04027.
14. de Oliveira RT, Silva F. SAT and MaxSAT encodings for trees applied to the
Steiner tree problem. In Brazilian Conference on Intelligent Systems (BRACIS),
pages 192–197. IEEE, 2014.
15. Dubrova E, Teslenko M. A SAT-based algorithm for finding attractors in syn-
chronous boolean networks. IEEE/ACM transactions on computational biology
and bioinformatics, 2011.
16. Eén N, Sörensson N. An extensible SAT-solver. In International Conference on
Theory and Applications of Satisfiability Testing (SAT), pages 502–518. 2003.
17. Eén N, Sorensson N. Translating pseudo-boolean constraints into SAT. Journal on
Satisfiability, Boolean Modeling and Computation, 2:1–26, 2006.
18. Fisher MJ, Meyer AR. Boolean matrix multiplication and transitive closure.
Switching and Automata, 1971.
19. Furman ME. Application of a method of fast multiplication of matrices in the
problem of finding the transitive closure of a graph. Soviet Math Dokl, 1970.
20. Gall FL. Powers of tensors and fast matrix multiplication. In ISSAC. 2014.
21. Gebser M, Janhunen T, et al. SAT modulo graphs: Acyclicity. In Logics in Artificial
Intelligence (JELIA). 2014.
22. Gent IP. Arc consistency in SAT. In European Conference on Artificial Intelligence
(ECAI), pages 121–125. IOS Press, 2002.
23. Goldreich O, Ostrovsky R. Software protection and simulation on oblivious RAMs.
J ACM, 43(3):431–473, 1996.
24. Gwynne M, Kullmann O. On SAT representations of XOR constraints. In Language
and Automata Theory and Applications Conference (LATA), pages 409–420. 2014.
25. Haken A. The intractability of resolution. Theoretical Computer Science, 39:297–
308, 1985.
26. Hartung S, Nichterlein A. NP-hardness and fixed-parameter tractability of realizing
degree sequences with directed acyclic graphs. SIAM Journal on Discrete Mathe-
matics, 29(4):1931–1960, 2015.
27. Harvey D, van der Hoeven J, et al. Even faster integer multiplication. Journal of
Complexity, 2016.
28. Ignatiev A, Morgado A, et al. Cardinality encodings for graph optimization prob-
lems. In International Joint Conference on Artificial Intelligence (IJCAI). 2017.
29. Janhunen T, Niemelä I. Compact translations of non-disjunctive answer set pro-
grams to propositional clauses. In Logic Programming, Knowledge Representation,
and Nonmonotonic Reasoning: Essays Dedicated to Michael Gelfond on the Occa-
sion of His 65th Birthday. 2011.
54
M. Janota et al. On the Quest for an Acyclic Graph
30. Jiang Y, Kautz H, et al. Solving problems with hard and soft constraints using
a stochastic algorithm for MAX-SAT. In 1st International Joint Workshop on
Artificial Intelligence and Operations Research. 1995.
31. Knuth DE. The Art of Computer Programming, Volume 4, Fascicle 6: Satisfiability.
Addison-Wesley Professional, 1st edition, 2015.
32. Krishnamurthy B. Short proofs for tricky formulas. Acta Informatica, 22(3):253–
275, 1985.
33. Lifschitz V, Razborov A. Why are there so many loop formulas? ACM Transactions
on Computational Logic (TOCL), 7(2):261–268, 2006.
34. Lustig D, Sethi G, et al. COATCheck: Verifying memory ordering at the hardware-
OS interface. ACM SIGOPS Operating Systems Review, 2016.
35. Lynce I, Marques-Silva J. Probing-based preprocessing techniques for propositional
satisfiability. In ICTAI, pages 105–110. 2003.
36. Marques-Silva J, Lynce I. Towards robust CNF encodings of cardinality con-
straints. In Principles and Practice of Constraint Programming (CP), pages 483–
497. Springer, 2007.
37. Munro I. Efficient determination of the transitive closure of a directed graph. Inf
Process Lett, 1(2):56–58, 1971.
38. Piette C, Hamadi Y, et al. Vivifying propositional clausal formulae. In European
Conference on Artificial Intelligence (ECAI), pages 525–529. 2008.
39. Pipatsrisawat K, Darwiche A. On the power of clause-learning SAT solvers as
resolution engines. Artificial Intelligence, 175(2):512–525, 2011.
40. Pippenger N, Fischer MJ. Relations among complexity measures. J ACM,
26(2):361–381, 1979.
41. Plaisted DA, Greenbaum S. A structure-preserving clause form translation. J
Symb Comput, 2(3):293–304, 1986.
42. Rintanen J, Heljanko K, et al. Planning as satisfiability: parallel plans and algo-
rithms for plan search. Artificial Intelligence, 170(12):1031 – 1080, 2006.
43. Sinz C. Towards an optimal cnf encoding of boolean cardinality constraints. In
Principles and Practice of Constraint Programming (CP), pages 827–831. Springer,
2005.
44. Stålmarck G. Short resolution proofs for a sequence of tricky formulas. Acta
Informatica, 33(3):277–280, 1996.
45. Strassen V. Gaussian elimination is not optimal. Numer Math, 1969.
46. Tamura N, Taga A, et al. Compiling finite linear CSP into SAT. Constraints,
2009.
47. Tavares de Oliveira R, Silva F, et al. On modeling connectedness in reductions
from graph problems to extended satisfiability. In Advances in Artificial Intelligence
(IBERAMIA), pages 381–391. Springer, 2012.
48. Tseitin GS. On the complexity of derivations in the propositional calculus. Studies
in Constructive Mathematics and Mathematical Logic, Part II, 1968.
49. van Beek P, Hoffmann HF. Machine learning of Bayesian networks using constraint
programming. In International Conference on Principles and Practice of Constraint
Programming (CP), pages 429–445. 2015.
50. Van Gelder A. Another look at graph coloring via propositional satisfiability. Dis-
crete Appl Math, 2008.
51. Wickerson J, Batty M, et al. Automatically comparing memory consistency models.
In Symposium on Principles of Programming Languages (POPL). ACM, 2017.
55