=Paper= {{Paper |id=Vol-2011/paper4 |storemode=property |title=On the Quest for an Acyclic Graph |pdfUrl=https://ceur-ws.org/Vol-2011/paper4.pdf |volume=Vol-2011 |authors=Mikoláš Janota,Radu Grigore,Vasco Manquinho |dblpUrl=https://dblp.org/rec/conf/aiia/JanotaGM17 }} ==On the Quest for an Acyclic Graph== https://ceur-ws.org/Vol-2011/paper4.pdf
           On the Quest for an Acyclic Graph

            Mikoláš Janota1, Radu Grigore2, and Vasco Manquinho1
                         1
                            IST/INESC-ID, Lisbon, Portugal
                  2
                      School of Computing, University of Kent, UK



      Abstract. The paper aims at ﬁnding acyclic graphs under a given set of
      constraints. More speciﬁcally, given a propositional formula φ over edges
      of a ﬁxed-size graph, the objective is to ﬁnd a model of φ that corresponds
      to a graph that is acyclic. The paper proposes several encodings of the
      problem and compares them in an experimental evaluation using state-
      of-the-art SAT solvers.


1   Introduction
SAT solvers have become popular means of solving computationally hard prob-
lems. However, most modern SAT solvers require the problem to be speciﬁed in
conjunctive normal form (CNF). This may pose diﬃculty in problems compris-
ing of constraints that do not have a straightforward translation to propositional
logic. This paper targets one such constraint, graph acyclicity.
     A graph is naturally modeled by a set of Boolean variables xjk expressing the
existence of an edge from j to k. So, for instance, a constraint x12 ∨ x13 expresses
that we’re looking for a graph where the node 1 is connected to at least one of
the nodes 2 and 3. But how do we ensure that we are looking for a graph that
is acyclic?
     Acyclicity is a core concept from graph theory and naturally arises in a
number of applications. In planning it is used to ensure causality [42]. Similarly,
it is needed in networks used for computation, e.g. Bayesian networks [49]. More
recently, software veriﬁcation of concurrent programs relies on relaxed memory
models, which are deﬁned in terms of acyclic relations [34, 51].
     There is a large body of work of translating constraints into CNF. Prime
examples are cardinality constraints [5,36,43] pseudo-Boolean constraints [6,17],
XOR-constraints [24] and general CSP [46]. A number of graph-related problems
approached by SAT can be found in the literature. For instance, encoding graph
connectivness property [47], graph coloring [50], calculating Steiner tree [14, 30]
or maximum clique [28]. It is also worth mentioning that graphs and SAT play
an important role in the research on Gene regulatory networks [15].
     The paper has the following contributions. It reviews CNF encodings that are
found in the literature, and introduces a number of new ones. All these encodings
were implemented and extensively evaluated. To generate a well-deﬁned testbed
we introduce The Supervisor Problem, which asks to assign supervisors to a
group of persons, where there is a lower bound for each person’s supervisors and
an upper bound on how many persons he or she can supervise. This problem


                                         41
M. Janota et al. On the Quest for an Acyclic Graph

  is a generalization of the well-known ordering principle (also known as the GT
  family) studied in proof complexity [1, 10, 32, 44].


  2     Preliminaries

  Our basic setting is the following. We represent a directed graph on n ≥ 1
  vertices by Boolean variables xij , associated with edges. A family of graphs
  is deﬁned by a formula φn (�x, �y ) over the edge variables �x and some auxiliary
  variables �y . Each graph corresponds to a satisfying assignment of ∃�y φn (�x, �y ).
  Given such a formula φn , we wish� to construct another      � formula ψn , such that the
  satisfying assignments of ∃�y ∃�z φn (�x, �y ) ∧ ψn (�x, �z ) correspond to the graphs
  from the given family that, in addition, are acyclic. We refer to the formula ψn
  as the acylicity checker or, for short, the checker. In the rest of the section, we
  make this more precise, and we describe the particular formulas φn that we use
  in experiments. The acyclicity checkers ψn are discussed later (§3). Standard
  terminology from propositional logic and graph theory is assumed.


  2.1   Propositional Logic and SAT Solving

  Propositional formulas are constructed from propositional variables by Boolean
  connectives (∨, ∧, ⇒) and negation (¬). A propositional formula is in conjunc-
  tive normal form (CNF) if it is a conjunction of clauses, where a clause is a
  disjunction of literals, where a literal is a propositional variable or its negation.
      Whenever convenient, we treat CNF as a set of clauses and a clause as a set
  of literals. Observe that the empty set of clauses corresponds semantically to
  true and the empty clause corresponds semantically to false. The empty clause
  is denoted as ⊥.
      Any propositional formula can be converted to an equisatisﬁable CNF for-
  mula in linear time [41, 48].
      The problem of satisﬁability (SAT) is to decide for a given propositional
  formula whether there is a satisfying assignments for it or not.
      The paper requires minimal understanding of SAT solving. We will assume
  that a SAT solver accepts a formula in CNF and gives the response “YES” if
  the formula is satisﬁable and “NO” otherwise. Most modern SAT solvers also
  provide an actual satisfying assignment if the answer is “YES”.
      When discussing the performance of SAT solvers, sometimes we refer to the
  underlying proof system, which is propositional resolution. Given two clauses
  x∨C and ¬x∨D their resolution is C ∨D. The two clauses are called antecedents
  and the resulting clause the resolvent. Resolution proof of a clause C from a
  formula φ is a sequence of clauses C1 , . . . , Ck such that Ck = C and each clause
  Ci is either in φ or it is a resolvent of some Cj , Ck with 1 ≤ j < k < i. A
  resolution refutation is a resolution proof of the empty clause. A CNF formula
  is unsatisﬁable if and only if there exists a resolution refutation for it. For most
  modern SAT solvers it holds that for any run of the SAT solver that responds
  “NO”, there exists a resolution refutation of the input formula linear to the run


                                            42
M. Janota et al. On the Quest for an Acyclic Graph

  of the solver. In fact, it even holds that for any resolution refutation there exists
  a polynomial run of the solver [39].


  2.2   Graphs

  Throughout the paper we will be discussing directed graphs. A directed graph
  G = (V, E) consists of a set V of vertices and a set E of edges. In this paper, we
  work with ﬁnite graphs, and we ﬁx V = [n] = {1, . . . , n}. We denote vertices by
  i, j, k, . . . and we abbreviate the (directed) edge (i, j) by ij. For an edge ij, we
  say that it has source i and target j; conversely, we say that ij is an outgoing
  edge of i, and an incoming edge of j. The in-degree of a vertex is its number of
  incoming edges, and the out-degree of a vertex is its number of outgoing edges.
  A sink is a vertex with out-degree 0.
       A graph property is a family G of graphs. In this paper, we consider only
  properties Gn that contain graphs with a ﬁxed number n of vertices, and are
  deﬁned by a formula φn . More precisely, φn has one variable xij for each pair
  (i, j) ∈ V × V , and may also use some auxiliary variables �y . Then, a graph
  with edges E is in Gn if and only if it corresponds to a satisfying assignment
  of ∃�y φn (�x, �y ). We say that a property is monotone if adding edges to a graph
  preserves the property. This corresponds to the Boolean function ∃�y φn (�x, �y )
  being monotone. We remark that the property of having a cycle is monotone,
  and the intersection of two monotone properties is monotone.


  2.3   The No-Sink Family

  The graphs with no sinks are described by the following formula:
                                        ���          �
                            φn (�x ) :=          xij                          (no-sink)
                                          i∈[n]   j∈[n]


  Observe that the no-sink property is monotone. Also, the ordering principle says
  that graphs with no sink have a cycle.
                                       � It follows that, no �matter what acyclicity
  check ψn we use, the formula ∃�y ∃�z φn (�x, �y ) ∧ ψn (�x, �z ) will be unsatisﬁable.
  Since we know the answer, the no-sink family is a good test case for the cor-
  rectness of our implementation. In addition, as we shall see, these unsatisﬁable
  formulas are not easy for state-of-the art SAT solvers.


  2.4   The Supervisor Problem

  We deﬁne a parameterized family of graphs, which will typically contain both
  cyclic and acyclic graphs. The idea is as follows. Consider a group of n people
  where each person must have a certain number of supervisors and at the same
  time there is a limit on how many people a person can supervise. The supervisor
  problem is formalized as follows.


                                          43
M. Janota et al. On the Quest for an Acyclic Graph

  Input: A positive integer n and a sequence of pairs of nonnegative integers
  (u1 , l1 ), . . . , (un , ln ).
  Output: “YES” if and only if there exists an acyclic graph with the vertices
  V = [n] such that vertex i has at most ui incoming edges and at least li outgoing
  edges.
     An instance n, �u, �l of the Supervisor problem reduces to the satisﬁability of
  the following formula:
                                 �
               φn,�u,�l (�x ) :=   atleast(li ; xi1 , . . . , xin )
                              i∈[n]
                                   �                                        (supervisor)
                               ∧       atmost(uj ; x1j , . . . , xnj )
                                j∈[n]


  The cardinality constraint atleast(l; �x ) evaluates to 1 when at least l of the
  Boolean variables �x evaluate to 1. There are several ways to encode such cardi-
  nality constraints in CNF [5, 17, 36, 43]. The cardinality constraint atmost(u; �x )
  is analogous.
      The Supervisor problem has three interesting special cases: (a) the no-sink
  family; (b) the pigeonhole principle; and (c) the DAG realizability problem. We
  obtain a description of the no-sink family by requiring the bounds (n, 1) for each
  of the n vertices. We obtain a description of the pigeonhole principle by requiring
  the bounds (0, 1) for k1 vertices and the bounds (1, 0) for k2 vertices, with
  k1 + k2 = n and k1 > k2 . Finally, any instance of the DAG realizability problem
  is also an �instance of�the Supervisor problem, which satisﬁes the additional
                n          n
  constraint i=1 ui = i=1 li .
      Since the pigeonhole principle is a special case, it follows that (supervisor)
  sometimes requires large resolution refutations [25]. Since the DAG realizability
  problem is a special case, it follows that it is NP-complete to ﬁnd a model
  of (supervisor) that corresponds to an acyclic graph [26]. Finally, we observe
  that (supervisor) is not monotone.


  3     Encodings

  We discuss several ways of constructing the acyclicity checker ψn , based on
  reachability (§3.1), based on labeling vertices with numbers (§3.2 and §3.3), and
  based on oblivious algorithms (§3.4 and §3.5).


  3.1   Transitive Closure

  A graph G = (V, E) does not have cycles if and only if E + is irreﬂexive. Here, E +
  denotes the irreﬂexive transitive closure of E. Further, G does not have cycles if
  and only if there exists an irreﬂexive transitive relation R that contains E; that
  is, if there exists R ⊆ V × V such that (a) ii ∈  / R for all i ∈ V , (b) ij ∈ R and
  jk ∈ R only if ik ∈ R for all i, j, k ∈ V , and (c) ij ∈ R if ij ∈ E for all i, j ∈ V .


                                              44
M. Janota et al. On the Quest for an Acyclic Graph

  We represent the relation R by variables yij , and we deﬁne ψn by clauses that
  directly correspond to the conditions (a), (b), and (c):
                      �         �                          �
      ψn (�x, �y ) :=   ¬yii ∧       (yij ∧ yjk → yik ) ∧    (xij → yij )  (tc1)
                    i∈[n]        i,j,k∈[n]                      i,j∈[n]

  A possible alternative is the following:
                     �            �                         �
     ψn (�x, �y ) :=    ¬yii ∧         (yij ∧ xjk → yik ) ∧   (xij → yij )         (tc2)
                    i∈[n]        i,j,k∈[n]                      i,j∈[n]

  In relational language, (tc1) corresponds to the conditions (a) R ∩ id = ∅,
  (b) RR ⊆ R, and (c) E ⊆ R. Here, id is deﬁned to be { ii | i ∈ V }. In (tc2), we
  replace condition (b) by RE ⊆ R. Juxtaposing relations corresponds to normal
  relational composition:

               RE := { (i, j) | (i, k) ∈ R and (k, j) ∈ E, for some k }

     If the family of graphs satisfying φn is monotone, in the sense that adding
  edges preserves φn , then we do not need to use auxiliary variables yij , as we can
  simply reuse the variables xij :
                               �           �
                   ψn (�x ) :=     ¬xii ∧       (xij ∧ xjk → xik )              (tc3)
                                i∈[n]        i,j,k∈[n]

  The conjunction of (tc3) and (no-sink) has Ω(2n ) regular tree resolution proofs
  but O(n3 ) dag resolution proofs; see [10, 32, 44] and [31, §7.2.2.2, Theorem R].
  Such conjunctions express the ordering principle.
     Observe that the formulas in each of (tc1), (tc2), and (tc3) have size Θ(n3 ).

  3.2   Binary Labeling
  A graph is acyclic when it can be embedded in a linear order; that is, when there
  exists a labeling l : V → Z of the vertices such that ij ∈ E implies l(i) < l(j)
  for all i, j ∈ V . Recall that V = [n]. Clearly, it is suﬃcient to use b := �log2 n�
  bits for labels, so l : V → {0, 1}b . Accordingly, we use n groups �y1 , . . . , �yn of
  auxiliary variables, each containing b Boolean variables. The acyclicity checker
  is then [42]:
                                                      ��                        �
                       ψn (�x, �y1 , . . . , �yn ) :=    xij → less(�yi , �yj )     (bin)
                                             i,j∈[n]

  The implementation of less depends on how we represent numbers. Because we
  use binary numbers, we set less := lexb , and deﬁne the latter to test lexicographic
  ordering of the bits:

                       lex0 () := 0                                                  (1)
                                                  �                            �
                 lexb (�y y, �zz) := (¬y ∧ z) ∨ (¬y ∨ z) ∧ lexb−1 (�y , �z )         (2)



                                             45
M. Janota et al. On the Quest for an Acyclic Graph

  Converting this circuit to CNF will require some more auxiliary variables, which
  we do not denote explicitly. The conversion is standard, and the size of the
  resulting CNF is Θ(b).
     Observe that the formula in (bin) has size Θ(n2 log n).

  3.3     Unary Labeling
  We can also use the labeling idea with unary numbers. To represent a number,
  instead of �log2 n� bits we use n − 1 bits, and force them to follow the pattern
  0∗ 1∗ . We could still compare these numbers by their lexicographic order using lex,
  but we can also write directly a CNF formula, thus avoiding the (small) overhead
  associated with converting a circuit into CNF.
      The acyclicity checker is deﬁned by
                                              ��                       � �
               ψn (�x, �y1 , . . . , �yn ) :=    xij → less(�yi , �yj ) ∧  unary(�yi ) (unr)
                                          i,j∈[n]                            i∈[n]

  where each �yi uses n − 1 bits.
     We could use less := lexn−1 , but we prefer less(�y , �z ) := ∃�u lessunr(�y , �z, �u ),
  where
                                          n−1
                                          �                                          n−1
                                                                                     �
                                                �                               �
               lessunr(�y , �z, �u ) :=             (¬yi ∨ ¬ui ) ∧ (zi ∨ ¬ui ) ∧           ui   (3)
                                          i=1                                        i=1

  The auxiliary �u variables will identify a position i where yi = 0 and zi = 1. This
  is suﬃcient because we enforce the pattern 0∗ 1∗ .
                                                        n−1
                                                        �
                                  unary(�y ) :=               (yi−1 → yi )                      (4)
                                                        i=2

        Observe that the formula in (unr) has size Θ(n3 ).

  3.4     Warshall
  The Warshall algorithm computes transitive closure, in cubic time.
   Warshall
   1 for k ∈ [n]
   2      for i ∈ [n]
   3           for j ∈ [n]
   4                aij := Or(aij , And(aik , akj ))
  Initially, aij says if the graph contains an edge ij; and, at the end aij says if the
  graphs contains a path from i to j. In particular, at the end, aii says if there is
  a cycle containing vertex i.
      In a usual implementation, the procedures Or, And are simply performing
  the corresponding Boolean operations. Instead, we could replace each of them


                                                      46
M. Janota et al. On the Quest for an Acyclic Graph

  with a function that constructs the corresponding circuit. Because the Warshall
  algorithm is simple enough, we can also build the acyclicity checker directly.
  We use n2 (n + 1) auxiliary variables yijk , for i, j ∈ [n] and k ∈ {0, . . . , n}. The
  variable yijk will be 1 only if there exists a path from i to j that uses intermediate
  vertices only from the set [k].
                          �                   �                            �
        ψn (�x, �y ) :=           ¬yiin ∧              (xij → yij0 ) ∧               (yij(k−1) → yijk )
                          i∈[n]              i,j∈[n]                     i,j,k∈[n]
                                  �                                                                       (fw)
                          ∧               (yik(k−1) ∧ ykj(k−1) → yijk )
                              i,j,k∈[n]


  We remark that the formula above is close but not the same as what we would
  obtain by replacing the basic operations Or, And with circuit builders. One dif-
  ference is that we use implications rather than equivalences: It is possible that
  yijk = 1 even if there is no corresponding path. Another diﬀerence is that the
  last conjunct built by Warshall would be (yik(k−[k≥j]) ∧ ykj(k−[k≥i]) ) ↔ yijk ,
  where [p] evaluates to 1 when p is true, and 0 when p is false. Either form is
  correct.
      The formula in (fw) is similar to the one in (tc1). It has the same size Θ(n3 ),
  but uses more auxiliary variables.


  3.5      Matrix Multiplication

  Let us take stock of the encodings we have so far. Four encodings have size
  Θ(n3 ), namely (tc1), (tc2), (unr), and (fw); another encoding has size Θ(n2 log n),
  namely (bin). But, size is not the only criterion on which to judge a satisﬁability
  formula. We note that (tc1), (tc2), and (fw) have the following desirable property:
  if the edge variables �x are ﬁxed, then satisﬁability is decided by unit propagation.
  The smaller encoding (bin) does not have this property. Is it possible to have a
  formula of sub-cubic size that is solved by unit propagation once the edge vari-
  ables �x are ﬁxed? We answer this question aﬃrmatively by showing an encoding
  based on matrix multiplication. We ﬁrst describe the algorithm. The formula ψn
  will be constructed by replacing the basic operations with circuit builders, and
  then converting the circuit to CNF.
      We will consider n×n matrices of two kinds, over integers and over Booleans.
  Integers form a commutative ring, so the corresponding matrix operations are
  unrestricted. Booleans do not form a ring: We take addition to be logical-or
  and we take multiplication to be logical-and, but there is no subtraction. Since
  we use both types of operations and sometimes reinterpret the same matrix as
  being over integers or over Booleans, we distinguish Boolean matrices with a hat.
  When we reinterpret an integer matrix X as a Boolean matrix X̂, all non-zero
  integers collapse to 1.
      Let X be the adjacency matrix of a graph, with entries in {0, 1}. Entry (i, j)
  of X l is the number of paths of length l from i to j, and entry (i, j) of X̂ l tells
  us if there exists a path of length l from i to j. Thus, to decide if the graph


                                                           47
M. Janota et al. On the Quest for an Acyclic Graph

                                                                           �n       l
  has a cycle it suﬃces to check the diagonal of the Boolean matrix           l=1 X̂ . We
  compute this matrix by repeated squaring:

                     Â0 := X̂                   Âk+1 := Â2k                        (5)
                     B̂0 := X̂                   B̂k+1 := B̂k + ÂB̂k                 (6)
                                     k            � 2k
  It is easy to check that Âk = X̂ 2 and B̂k = l=1 X̂ l , and also that B̂k equals
  �n       l
     l=1 X̂ whenever k ≥ log2 n. Thus we can decide if a directed graph is acyclic
  in the time needed to perform Θ(log n) multiplications of n × n Boolean matri-
  ces. One could perform a naive matrix multiplication in Θ(n3 ) time, giving an
  acyclicity checker ψn of size Θ(n3 log n).
      Alternatively, to multiply Boolean matrices Â and B̂, we multiply them as
  integer matrices obtaining C = AB; the product of Â and B̂ is then Ĉ. (The
  idea of using intermediate integer matrices is old [18, 19, 37].) To perform the
  multiplication of integer matrices, we use Strassen’s algorithm [45], which works
  in O(n2.81 ) time. To achieve this improved runtime, Strassen’s algorithm makes
  essential use of subtraction, which is not available over Booleans. However, the
  runtime O(n2.81 ) counts operations over integers, while we must count operations
  over bits since we ultimately want to build a circuit. Since the matrices A and B,
  which we multiply, have entries from {0, 1}, the product matrix C has entries
  from {0, . . . , n}. Some intermediate values in Strassen’s algorithm are outside
  this range. Nevertheless, one can check that Θ(log n) bits are suﬃcient to repre-
  sent all intermediate values. So, the size of the circuit is O(n2.81 log n · f (log n)),
  where f (w) is the size of a circuit necessary to perform an arithmetic opera-
  tion on integers with w bits. In our implementation, we use a simple quadratic
  multiplication algorithm, leading to a circuit of size O(n2.81 �log3 n).
      In theory, one can use the same approach to achieve O nω log2 n log log n ·
      ∗      �
  8log log n by using better algorithms for the multiplication of matrices [20] and
  for the multiplication of integers [27]. But such algorithms are known to not
  be practical. Our current implementation based on Strassen is not competitive
  either (§4). Improving it seems to require new ideas.

  Discussion. The approach we took for both Warshall and for matrix multiplica-
  tion was to replace the basic operations in the algorithm with circuit builders.
  This approach is fairly general, but it does not apply to all algorithms, just to
  the oblivious ones. An algorithm is said to be oblivious when the memory loca-
  tions it accesses do not depend on the input data. If an algorithm runs in t(n)
  time in the Turing model   � of computation,
                                          �     then one can construct an oblivious
  algorithm that runs in O t(n) log t(n) time, and hence a Boolean circuit of that
  size [40]. In contrast, a similar construction does not exist in the deterministic
  RAM model of computation [23].
      The standard (and optimal) algorithm for ﬁnding cycles in directed graphs
  is DFS (depth-ﬁrst search). However, we are not aware of an implementation of
  DFS that works in sub-cubic time in the (oblivious, deterministic, multi-tape)
  Turing machine model of computation.


                                           48
M. Janota et al. On the Quest for an Acyclic Graph

         solver/checker tc1 unr bin fw tc2       solver/checker tc1 unr bin fw tc2
             lingeling   49 48 10 38 11              lingeling   436 429 426 435 435
              glucose    32 37 14 15 11               glucose    437 434 427 437 437
              minisat    25 49 12 12 11               minisat    435 425 424 435 435
          minisat-prepro 26 49 11 19 11           minisat-prepro 435 426 422 435 435
          (a) GT Family with 49 instances       (b) Supervisor Family with 441 instances

                    Fig. 1. Number of solved instances for sizes 2-50.


  4    Experimental Evaluation

  Overall setup. Three popular state-of-the-art SAT solvers were used for the
  evaluation: Lingeling Version 276 [8], glucose 4.0 [3], and minisat 2.2 (from
  github) [16]. Further, minisat was a run in two modes: the default mode, and
  with additional preprocessing techniques enabled (asymmetric breaking and re-
  dundancy checking). The encodings we consider are Transitive closure (tc1),
  Unary labeling (unr), Binary labeling (bin), Warshall (fw), and an alternate ver-
  sion of Transitive closure (tc2). The techniques based on matrix multiplication
  are not included, because they quickly lead to formulas in the realm of hundred
  of megabytes, and more (Figure 3).
      The experimental results were obtained on an Intel Xeon 5160 3GHz, with
  4GB of memory, time limit 500 seconds, and memory limit 2GB.

  Generation of Graph Families. The generation of the no-sink graph family is
  trivial: we simply instantiate (no-sink) for diﬀerent values of n. We also generate
  random instances of the Supervisor problem. Each upper bound ui on the in-
  degree of vertex i is chosen uniformly at random from the set {0, . . . , n − 1}.
  Each lower bound li is 0 with probability p, which is a parameter, and otherwise
  is chosen uniformly at random from the set {1, . . . , n − 1}. If the generated
  sequence (l1 , u1 ), . . . , (un , ln ) is not satisﬁed by any directed graph, we throw
  it away and try again. This ﬁltering is easy, because it is possible to check in
  linear time whether a sequence is satisﬁable by some directed graph, using the
  Fulkerson–Chen–Anstee theorem [2]; but, as we mentioned, checking whether a
  sequence is satisﬁable by an acyclic directed graph is NP-complete [26].
      For each n ∈ {2, 3, . . . , 50} and for each p ∈ {10%, 20%, . . . , 90%}, we gener-
  ate a sequence (l1 , u1 ), . . . , (ln , un ) using the method just described. For each
  such sequence, we generate the formula (supervisor). The atmost and atleast
  constraints were encoded into CNF using the well-known Sequential Counter
  encoding [43].

  Presentation of the results. Figure 1 presents the number of instances solved.
  Figure 1(a) shows the results for graphs with no sinks, which lead to the GT fam-
  ily of formulas when coupled with the acyclicity checker (tc1). Figure 1(b) shows
  the results for the general supervisor problem. Figure 2 shows a cactus for the
  (no-sink) family for all the considered combinations of solvers and encodings.


                                           49
M. Janota et al. On the Quest for an Acyclic Graph


                  500
                                     glucose-bin
                  450                 glucose-fw
                                    glucose-tc2
                                    glucose-unr
                  400               glucose-tc1
                                    lingeling-fw
                  350             lingeling-tc2
                                  lingeling-tc1
                                  lingeling-bin
                  300
   CPU time (s)




                            minisat-prepro-fw
                            minisat-prepro-tc2
                  250             lingeling-unr
                            minisat-prepro-tc1
                            minisat-prepro-bin
                  200       minisat-prepro-unr
                                     minisat-bin
                  150                 minisat-fw
                                    minisat-tc2
                  100               minisat-unr
                                    minisat-tc1
                   50

                    0
                        0   5      10       15     20        25    30    35     40   45   50
                                                         instances


                                Fig. 2. Cactus plot for the (no-sink) family.



  Recall that if a cactus plot contains a point (i, s) it means there are i instances
  such that each was solved within s seconds. (Therefore, the graph is necessarily
  increasing.) A more detailed presentation of the results can be found on the
  authors’ homepage.3
      The ﬁrst thing to observe is that the supervisor family turns out to be easy
  for all the approaches. Indeed, all approaches solved over 90% benchmarks. Even
  though the diﬀerences are rather small, the tables suggests that the Transitive
  closure encoding ((tc1)) is good for all solvers. In contrast, Binary labeling (bin)
  appears to be the worst.
      In the case of (no-sink) graphs, the results are widely diﬀerent for each en-
  coding but also for each solver. Most notably, lingeling performs extremely well
  on (tc1). Both versions of minisat perform well on Unary labeling (unr). Glucose
  also performs the best with (unr), but once the graph size gets over 30 nodes,
  the performance quickly deteriorates. The worst encoding appears to be Binary
  labeling (bin) and the alternative Transitive closure (tc2), with which all solvers
  could only handle less than half of all instances. Warshall has a peculiar behavior,
  as it performs poorly for all the solvers except lingeling.

   3
            http://sat.inesc-id.pt/~mikolas/sat-cycles/



                                                    50
M. Janota et al. On the Quest for an Acyclic Graph

  Discussion. The results are surprisingly uneven across diﬀerent solvers. In a
  particular, lingeling performs extremely well with (tc1), while the other solvers
  do not perform very well with the same encoding. It is possible that lingeling
  beneﬁts from some simpliﬁcation technique (pre- or in-processing) that is missing
  from the other solvers, but this remains to be understood. Binary labeling (bin) is
  overall the most compact but with the poorest performance. This is not entirely
  surprising. First, binary labeling does not oﬀer many opportunities for applying
  unit propagation. And second, binary labeling makes it virtually impossible to
  learn anything about any particular variable.
      In contrast, Unary labeling (unr) fares quite well over the considered solvers.
  Further, we observed that minisat with preprocessing solves the (no-sink) family
  with no conﬂicts, i.e., merely with preprocessing. This also holds for lingeling
  but only up to a certain size of the graph. It is not diﬃcult to see how these
  formulas can be completely solved by preprocessing, in particular by failed literal
  detection and asymmetric breaking.
      In unary labeling (§3.3), the ordering of labels is ensured by the following
  constraint. For every vertex i labeled by a number l(i) ∈ {0, . . . , n − 1} there
  must be a neighbor j labeled by a number l(j) ∈ {0, . . . , n − 1} such that
  l(i) < k ≤ l(j) for some k ∈ {0, . . . , n − 1}. To show that such a labeling does
  not exist for no-sink graphs, we reason as follows. If k = n − 1, it immediately
  follows that vertex j has no neighbor labeled with a greater number since it has
  reached the maximum. Performing this reasoning for every vertex yields that the
  labels must be in the interval {0, . . . , n − 1}. Repeating this reasoning n times
  leads to a contradiction.
      We argue that to perform this reasoning, it is suﬃcient to use unit propa-
  gation on unary labeling. In deﬁnition (3), setting uk := 1 imposes exactly the
  condition as listed above, i.e., the label of the source node is at most k − 1 and
  the target node at least k. It is easy to verify that setting un−1 := 1 yields a
  conﬂict by unit propagation.
      This idea is realized in the preprocessing techniques failed literal detec-
  tion [12, 35] and asymmetric breaking [38]. More generally, if setting a literal l
  to true yields a conﬂict by unit propagation it is called a failed literal and the
  literal is set to false. In this manner, these two techniques acting in tandem
  can gradually remove the ui literal corresponding to the highest number, and
  eventually derive an empty clause.

  Conclusion. The experimental results support the following conclusions. The
  combination of choice for checking acyclicity is lingeling with Transitive clo-
  sure (tc1). (Do not use the seemingly very similar alternative Transitive clo-
  sure (tc2), which leads to poor performance.) Minisat is the second solver of
  choice but not with the (tc1) encoding—instead, use (unr). SAT solvers appear
  to be good at ﬁnding an acyclic graph if it exists; proving that an acyclic graph
  does not exist can easily become very diﬃcult.

  Formula Size. Figure 3 presents the size of the acyclicity checkers (§3). The cubic
  methods (tc1, tc2, unr, fw) are all within a factor of 3 of each-other. The super-


                                          51
M. Janota et al. On the Quest for an Acyclic Graph


                                        109
                                                  tc1
      acyclicity checker formula size

                                        108       tc2
                                        107       bin
                                                  unr
                                        106        fw
                                                  mm
                                        105         ss
                                        104
                                        103
                                        102
                                        101
                                        100
                                              1                  10              100
                                                         graph vertex count

  Fig. 3. The size of the acylicity checker formula ψn as a function of the number of
  vertices n. Five lines correspond to equations (tc1), (tc2), (bin), (unr), and (fw). The
  remaining two correspond to Boolean matrix multiplication methods, using the deﬁni-
  tion (mm), and using Strassen’s algorithm �   (ss). The size of a CNF formula consisting
  of clauses C1 , . . . , Cm is deﬁned to be m + m i=1 |Ci |.




  cubic one (mm, which is based on the deﬁnition of Boolean matrix multiplication)
  is already worse than the cubic ones at n = 3, and it only gets worse for larger n.
  There are two sub-cubic methods: ss with size O(n2.81 log3 n), and bin with size
  O(n2 log n). The method based on Strassen (ss) generates large formulas even
  for n = 10 and, looking at Figure 3, there is no overtaking in sight. Indeed, even
  if the constants hidden in the O-notation were just as good for ss as they are
  for tc1, the crossing point would be at n ∼ 1032 . The log3 n factor is important,
  and it seems that without new ideas this method will never be practical. On the
  other hand, the method based on binary labeling (bin), generates the smallest
  acyclicity checkers for n ≥ 100.

  5                                     Summary and Future Work
  The paper presents a number of encodings for the graph acyclicity constraint into
  conjunctive normal form. This is useful when calculating acyclic graphs using
  SAT solvers but also other logic-based solvers, such as quantiﬁed Boolean for-
  mula (QBF) and Satisﬁability modulo theories (SMT) solvers. The experimental
  evaluation suggests that the performance is highly sensitive both to be encoding
  but also to the SAT solver used. This suggests that for practical purposes it is
  useful to maintain a portfolio of encodings and solvers.
     There are a number of directions for further research. All the encodings that
  are asymptotically better than cubic perform poorly in practice. Hence, it is a
  challenge for future research to identify encodings with asymptotic complexity
  better than cubic but that also perform well.


                                                             52
M. Janota et al. On the Quest for an Acyclic Graph

      Ultimately, all the considered encodings are not feasible for large graphs.
  Indeed, any cubic encoding generates formulas that are too large once the number
  of nodes gets over 1000. A possible solution is to consider a lazy approach where
  only some portion of the encoding is considered. However, this will probably
  require an iterative sequence of calls to the SAT solver. Another approach, tried
  recently [9,13], is to integrate specialized algorithms for acyclicity inside the SAT
  solver. This is also closely related to loop formulas, which are used in Answer
  Set Programming (ASP) to avoid cyclic reasoning cf. [21, 29, 33].
      Some questions about properties of the encodings with respect to unit propa-
  gation remain open. In particular, is it the case that any partial assignment that
  contains a cycle produces a conﬂict by unit propagation in a given encoding? In
  theory this is possible due to monotonicity [7] (adding edges preserves cycles).
  We believe this holds for the encodings Transitive closure, Warshall, and matrix
  multiplication, but does not hold for Unary and Binary labeling. Hence, this also
  begs the question whether the labeling methods can be made more determinis-
  tic. Other properties related to unit propagation are also worth exploring, e.g.,
  generalized arc consistency and propagation completeness, cf. [4, 7, 11, 22, 24].
      The experimental results lead to several interesting questions. Why is lin-
  geling so eﬃcient on the transitive closure encoding (tc1)? Why does the alter-
  native encoding for transitive closure (tc2) perform so poorly across all solvers?
  Do formulas (tc2) lack short resolution refutations?
      Last but not least, this research opens avenues for exploring other graph-
  related properties, e.g., connectivness and reachability in general.


  Acknowledgments
  This work was supported by national funds through Fundação para a Ciência e a
  Tecnologia (FCT) with reference UID/CEC/50021/2013, and by the PrideMM Web
  Interface grant from VeTSS.


  References
   1. Alekhnovich M, Johannsen J, et al. An exponential separation between regular and
      general resolution. Theory of Computing, 3(5):81–102, 2007.
   2. Anstee RP. Triangular (0, 1)-matrices with prescribed row and column sums. Dis-
      crete Mathematics, 40(1):1–10, 1982.
   3. Audemard G, Simon L. Predicting learnt clauses quality in modern SAT solvers.
      In International Joint Conference on Artiﬁcial Intelligence (IJCAI). 2009.
   4. Babka M, Balyo T, et al. Complexity issues related to propagation completeness.
      Artif Intell, 203:19–34, 2013.
   5. Bailleux O, Boufkhad Y. Eﬃcient CNF encoding of boolean cardinality constraints.
      In Principles and Practice of Constraint Programming (CP). Springer, 2003.
   6. Bailleux O, Boufkhad Y, et al. A translation of pseudo-boolean constraints to SAT.
      Journal on Satisﬁability, Boolean Modeling and Computation, 2006.
   7. Bessiere C, Katsirelos G, et al. Circuit complexity and decompositions of global
      constraints. In International Joint Conference on Artiﬁcial Intelligence (IJCAI),
      pages 412–418. 2009.



                                          53
M. Janota et al. On the Quest for an Acyclic Graph

   8. Biere A. Lingeling, Plingeling and Treengeling entering the SAT competition. Pro-
      ceedings of SAT Competition, 2013.
   9. Bomanson J, Gebser M, et al. Answer set programming modulo acyclicity. Fundam
      Inform, 2016.
  10. Bonet ML, Galesi N. Optimality of size-width tradeoﬀs for resolution. Computa-
      tional Complexity, 10(4):261–276, 2001.
  11. Bordeaux L, Marques-Silva J. Knowledge compilation with empowerment. In In-
      ternational Conference on Current Trends in Theory and Practice of Computer
      Science (SOFSEM), pages 612–624. Springer, 2012.
  12. Crawford JM, Auton LD. Experimental results on the crossover point in satisﬁa-
      bility problems. In AAAI, volume 93, pages 21–27. 1993.
  13. Cuteri B, Dodaro C, et al. Constraints, lazy constraints, or propagators in ASP
      solving: An empirical analysis. In International Conference on Logic Programming
      (ICLP). 2017. http://arxiv.org/abs/1707.04027.
  14. de Oliveira RT, Silva F. SAT and MaxSAT encodings for trees applied to the
      Steiner tree problem. In Brazilian Conference on Intelligent Systems (BRACIS),
      pages 192–197. IEEE, 2014.
  15. Dubrova E, Teslenko M. A SAT-based algorithm for ﬁnding attractors in syn-
      chronous boolean networks. IEEE/ACM transactions on computational biology
      and bioinformatics, 2011.
  16. Eén N, Sörensson N. An extensible SAT-solver. In International Conference on
      Theory and Applications of Satisﬁability Testing (SAT), pages 502–518. 2003.
  17. Eén N, Sorensson N. Translating pseudo-boolean constraints into SAT. Journal on
      Satisﬁability, Boolean Modeling and Computation, 2:1–26, 2006.
  18. Fisher MJ, Meyer AR. Boolean matrix multiplication and transitive closure.
      Switching and Automata, 1971.
  19. Furman ME. Application of a method of fast multiplication of matrices in the
      problem of ﬁnding the transitive closure of a graph. Soviet Math Dokl, 1970.
  20. Gall FL. Powers of tensors and fast matrix multiplication. In ISSAC. 2014.
  21. Gebser M, Janhunen T, et al. SAT modulo graphs: Acyclicity. In Logics in Artiﬁcial
      Intelligence (JELIA). 2014.
  22. Gent IP. Arc consistency in SAT. In European Conference on Artiﬁcial Intelligence
      (ECAI), pages 121–125. IOS Press, 2002.
  23. Goldreich O, Ostrovsky R. Software protection and simulation on oblivious RAMs.
      J ACM, 43(3):431–473, 1996.
  24. Gwynne M, Kullmann O. On SAT representations of XOR constraints. In Language
      and Automata Theory and Applications Conference (LATA), pages 409–420. 2014.
  25. Haken A. The intractability of resolution. Theoretical Computer Science, 39:297–
      308, 1985.
  26. Hartung S, Nichterlein A. NP-hardness and ﬁxed-parameter tractability of realizing
      degree sequences with directed acyclic graphs. SIAM Journal on Discrete Mathe-
      matics, 29(4):1931–1960, 2015.
  27. Harvey D, van der Hoeven J, et al. Even faster integer multiplication. Journal of
      Complexity, 2016.
  28. Ignatiev A, Morgado A, et al. Cardinality encodings for graph optimization prob-
      lems. In International Joint Conference on Artiﬁcial Intelligence (IJCAI). 2017.
  29. Janhunen T, Niemelä I. Compact translations of non-disjunctive answer set pro-
      grams to propositional clauses. In Logic Programming, Knowledge Representation,
      and Nonmonotonic Reasoning: Essays Dedicated to Michael Gelfond on the Occa-
      sion of His 65th Birthday. 2011.



                                          54
M. Janota et al. On the Quest for an Acyclic Graph

  30. Jiang Y, Kautz H, et al. Solving problems with hard and soft constraints using
      a stochastic algorithm for MAX-SAT. In 1st International Joint Workshop on
      Artiﬁcial Intelligence and Operations Research. 1995.
  31. Knuth DE. The Art of Computer Programming, Volume 4, Fascicle 6: Satisﬁability.
      Addison-Wesley Professional, 1st edition, 2015.
  32. Krishnamurthy B. Short proofs for tricky formulas. Acta Informatica, 22(3):253–
      275, 1985.
  33. Lifschitz V, Razborov A. Why are there so many loop formulas? ACM Transactions
      on Computational Logic (TOCL), 7(2):261–268, 2006.
  34. Lustig D, Sethi G, et al. COATCheck: Verifying memory ordering at the hardware-
      OS interface. ACM SIGOPS Operating Systems Review, 2016.
  35. Lynce I, Marques-Silva J. Probing-based preprocessing techniques for propositional
      satisﬁability. In ICTAI, pages 105–110. 2003.
  36. Marques-Silva J, Lynce I. Towards robust CNF encodings of cardinality con-
      straints. In Principles and Practice of Constraint Programming (CP), pages 483–
      497. Springer, 2007.
  37. Munro I. Eﬃcient determination of the transitive closure of a directed graph. Inf
      Process Lett, 1(2):56–58, 1971.
  38. Piette C, Hamadi Y, et al. Vivifying propositional clausal formulae. In European
      Conference on Artiﬁcial Intelligence (ECAI), pages 525–529. 2008.
  39. Pipatsrisawat K, Darwiche A. On the power of clause-learning SAT solvers as
      resolution engines. Artiﬁcial Intelligence, 175(2):512–525, 2011.
  40. Pippenger N, Fischer MJ. Relations among complexity measures. J ACM,
      26(2):361–381, 1979.
  41. Plaisted DA, Greenbaum S. A structure-preserving clause form translation. J
      Symb Comput, 2(3):293–304, 1986.
  42. Rintanen J, Heljanko K, et al. Planning as satisﬁability: parallel plans and algo-
      rithms for plan search. Artiﬁcial Intelligence, 170(12):1031 – 1080, 2006.
  43. Sinz C. Towards an optimal cnf encoding of boolean cardinality constraints. In
      Principles and Practice of Constraint Programming (CP), pages 827–831. Springer,
      2005.
  44. Stålmarck G. Short resolution proofs for a sequence of tricky formulas. Acta
      Informatica, 33(3):277–280, 1996.
  45. Strassen V. Gaussian elimination is not optimal. Numer Math, 1969.
  46. Tamura N, Taga A, et al. Compiling ﬁnite linear CSP into SAT. Constraints,
      2009.
  47. Tavares de Oliveira R, Silva F, et al. On modeling connectedness in reductions
      from graph problems to extended satisﬁability. In Advances in Artiﬁcial Intelligence
      (IBERAMIA), pages 381–391. Springer, 2012.
  48. Tseitin GS. On the complexity of derivations in the propositional calculus. Studies
      in Constructive Mathematics and Mathematical Logic, Part II, 1968.
  49. van Beek P, Hoﬀmann HF. Machine learning of Bayesian networks using constraint
      programming. In International Conference on Principles and Practice of Constraint
      Programming (CP), pages 429–445. 2015.
  50. Van Gelder A. Another look at graph coloring via propositional satisﬁability. Dis-
      crete Appl Math, 2008.
  51. Wickerson J, Batty M, et al. Automatically comparing memory consistency models.
      In Symposium on Principles of Programming Languages (POPL). ACM, 2017.




                                           55