1. Introduction

T. Kaiser, D. Král', R. Škrekovski, X. Zhu, The circu- (by hand or using a SAT solver). lar chromatic index of graphs of high girth, Journal Our feeling is that a SAT solver should perhaps be the of Combinatorial Theory, Series B

10.1016/j.jctb.2006.03.002

Circular chromatic index of small snarks

Dušan Bernát

Ján Mazák

0 0 FMFI UK , Bratislava , Slovakia

2024

97 2007 0000 0002

This paper verifies and extends previous computational results on circular chromatic index of small snarks, gives further support for certain conjectures, and discusses various practical aspects of using SAT solvers to solve graph colouring problems (e.g. the impact of solver choice and Boolean formula construction on running time).

eol>cubic graph snark circular chromatic index SAT solver

1. Introduction This paper focuses on circular chromatic index ′,

which is a refinement of chromatic index that allows real The main aim of this article is to popularize the usage of values as colours instead of just integers. For ≥ 1, SAT solvers for computations in combinatorics. For more a circular -edge-colouring of a graph is a mapping than a decade, SAT solvers have been among the most : () → [0, ) such that 1 ≤ | () − ( )| ≤ − 1 eficient tools for problems in discrete mathematics, and for any two incident edges and of . If a graph has there are plenty of them freely available. However, many a circular -edge-colouring, we say that it is circularly researchers in the area, both young and more experi- -edge-colourable. The circular chromatic index ′ of enced, are still not suficiently aware of their advantages is the infimum of all such that has a circular -edgeand the simplicity of their use. We will illustrate the colouring. This infimum is in fact a minimum; for a finite usage of SAT solvers through computations concerned graph it is always attained. Moreover, it is rational, and with edge-colourings of graphs. the only possible candidates for ′() are fractions /

Cubic graphs that do not admit a 3-edge-colouring such that ≤ | ()|. Circular colourings can be used (so-called uncolourable) have been studied for more than for optimization in certain types of scheduling problems, a century, initially in connection with the Four Colour but we are not exploring this in this paper. For a detailed Problem, and more recently in the context of flows and introduction to circular colourings, we refer the reader cycle covers. In this article, a snark is a bridgeless simple to the survey [4]. cubic graph with chromatic index 4. Since the problem of Colourable cubic graphs have ′ = 3, while snarks determining the chromatic index of a cubic graph is NP- have ′ > 3, so ′ is also a “measure of uncolouracomplete, we cannot expect a simple characterization bility”, perhaps the most natural one. Snarks with ′ of snarks, and volumes have been written on various close to 3 are plentiful, whereas snarks with ′ above attempts to gain a better understanding of their structure 10/3 seem rare. This is partly known, partly hypothe[1]. sized. An intuition behind this statement is as follows:

One possible approach is to acknowledge that “not In a 3-edge-colouring, when you fix the colours of two all snarks are of the same dificulty” and introduce a edges at a vertex, the colour of the third edge is uniquely measure that somehow splits snarks into classes with determined—there is no wiggle room. But in a circu“increasing levels of uncolourability” [2]. For instance, lar (3 + )-edge-colouring, there is typically a small inif we consider the minimum number of odd cycles in a terval of length up to for allowed values of the third 2-factor of a cubic graph (its oddness), there are cubic colour. With a bit of flexibility at each vertex, over a large graphs with oddness 2 for every non-negative integer enough subgraph, we can gradually shift the colours into . For = 0, we get colourable graphs, and presumably pretty much anything we want on the edges leaving the as oddness increases, the dificulty of proving theorems subgraph, making it easy to get a circular (3 + )-edgefor such snarks increases: e.g. the 5-flow conjecture was colouring of the whole large graph even for a small value gradually proved for snarks with oddness 0, 2, and 4 [3]. of . Of course, there are situations where this does not apply—for instance, if we have a small subgraph that cannot be coloured with 3 + colours for a small , it will ensure a lower bound on ′ for the whole graph.

The following theorems and conjectures summarize the current state of knowledge about ′ of cubic graphs.

Theorem 1 ([5]). There is no graph with ′ ∈ (11/3, 4).

Theorem 2 ([6, 7]). The Petersen graph has ′ = 11/3. colour and that no two incident edges can be assigned There exists an infinite class of snarks with connectivity 3 colours with diference less than (using clauses like and ′ = 7/2. ¬,0 ∨ ¬,1). Then a SAT solver is used to decide the satisfiability of the problem instance. As explained in Theorem 3 ([8]). For every > 0, there exists an integer Section 1, there are finitely many candidates for fractions such that every snark with girth at least has ′ ≤ 3+. / and some bounds can be set based on the size of the input graph, so it is possible to determine ′ by subTheorem 4 ([6]). The circular chromatic index of every sequently solving several SAT instances. This type of snark with girth at least 6 is at most 7/2. problem is often solved by a binary search (we order the possible fractions from the lowest to the highest, look Theorem 5 ([9]). For every rational ∈ (3, 10/3) ∪ {3 + into the middle etc.), but it would be suboptimal in this /(3 − 1) | ∈ Z+}, there exists a snark with ′ = . case, as we explain in the next paragraph. The time required to solve a particular instance for Conjecture 1 ([6, 7]). The circular chromatic index of a given graph highly depends on : a large value of every snark except the Petersen graph is at most 7/2. If results in a large number of variables, which is the biggest is cyclically 4-connected, ′() < 7/2. determinant of time complexity (Section 4.2 gives more

It is still open if the Petersen graph is the only graph details on this). Our experiments clearly indicate that we with ′ = 11/3. Based on our computational results, we need to reduce the number of instances with large values propose two additional conjectures and one problem. of as much as possible during a search for ′. There are two other factors at play, but both play a minor role. The Conjecture 2. If is a snark with girth at least 6, then first is that the time increases when the tested fraction ′() ≤ 10/3. is near ′—when the value is far below ′, the solver can quickly recognise unsatisfiability; when the fraction

The evidence for Conjecture 2 is somewhat scarce: it is too far above ′, it is easy to find a colouring. The holds for Isaacs snarks with girth 6 (though not for those second is specific to circular colourings: fractions of the of smaller girth), and we verified it for all 42 snarks with form 3 + 1/ are easier to exclude than 3 + 2/, 3 + 3/ girth 6 on at most 38 vertices and an incomplete list of etc. because there is less flexibility in colouring edges snarks on 40 to 44 vertices (all of them being the result incident to a vertex. Based on these observations, we use of a dot product applied to smaller snarks). Weak indirect a heuristic that always starts testing for the fraction with support is also provided by Theorems 3 and 4. the smallest numerator.

Conjecture 3. All cyclically 4-connected snarks have ′() ≤ 17/5.

Problem 4. Characterise cyclically 4-connected snarks

with ′() > 10/3. Is the set >10/3 of such snarks ifnite? Is the set of the values of ′ of such snarks finite?

We have found 17 members of >10/3, described in

more detail in Section 3. 2. Computing ′ In light of the conjectures in Section 1, our computations are primarily focused on proving that there are no snarks with ′ bigger than some value . This is the easier thing to do—one only needs to find a suitable circular -edge-colouring. On the other hand, determining ′ requires proving a lower bound, which takes much more time. Anyway, where computationally feasible, we fully determine ′, not just verify the conjectures.

For a given input graph and a fraction /, we construct a boolean formula that is satisfiable if and only if is circularly /-edge-colourable. This formula uses a variable , for each element (, ) ∈ () × {0, 1, . . . − 1}; it states that each edge gets exactly one

3. Results of computations

The exhaustive lists of snarks with given order and cyclic connectivity we used come from [10]. The counts of cyclically 4-connected snarks with girth 5 up to 30 vertices with particular ′ were presented in [11]. The results of this diploma thesis were not published properly and perhaps are not widely known. We have fully confirmed the results of [11] and extended them for snarks of girth 4, see Table 1 (rows are ordered according to the increasing value of ′).

All the results presented in this section were obtained using the CaDiCaL_vivinst, one of the winning solvers of the 2023 SAT Competition [12]. We used a machine with 20 cores (Intel Xeon Gold 5220R CPU @ 2.20GHz). Because the computations for diferent graphs are completely independent, the jobs within a batch are well suited for parallelisation, yielding the speedup approaching the number of CPU cores. The most demanding batch comprised all snarks of order 30 and girth greater or equal to 4; it took two weeks to complete. We verified the computed indices by using SBVA combined with Kissat [12], but so far only with the same formula-generation algo

We suggest starting with a formula minimizing the chance of a mistake (either conceptual or in implementation—we made plenty of both kinds, so do not underestimate it, and if you can, make sure you verify your implementation against some independently obtained data). Of course, it pays of to employ known tricks allowing translation of disjunctive normal forms into small equisatisfiable conjunctive normal forms 1, thus avoiding the exponential blowup in the number of clauses. 4.2. Number of variables and clauses

Internally, almost all current leading SAT solvers use the

CDCL algorithm. They thus learn new clauses along the way and operate with many clauses that were not part of the original input. Consequently, including some seemrithm; we are working on additional verification and ingly redundant clauses in the input might even help extension of the results to snarks on 32 or more vertices. (though it usually does not). Also, solvers use preproOn the other hand, our code is completely independent of cessing which significantly rewrites the given formula: what Kunertová used, and the results also agree with the- while colouring problems typically yield clauses of size 2 oretically determined indices for graphs that we checked and 3, it seems solvers prefer bigger clauses of size about (e.g. Isaacs and generalized Blanuša snarks), which gives 10 (at least in the couple of situations we were able to us confidence in the results. check). Thanks to technical tricks like watched literals,

Conjecture 3 suggests that non-trivial snarks with the clauses are not manipulated over and over, but are ′ > 10/3 are quite exceptional. This is supported accessed only when relevant. The takeaway here is that also by Table 1 which is sorted by index value. The mem- the number of clauses is of very little practical concern bers of the set >10/3 from Section 1 are below a line: (unless it is really extreme). the Petersen graph, the type 2 Blanuša snark, the flower On the other hand, the number of variables is critisnark on 20 vertices and five other snarks on 22, 28 cal because it afects the depth (and thus the number and 30 vertices. Additional computations for all cycli- of nodes) of the CDCL search tree. For SAT instances cally 4-connected snarks on up to 36 vertices (girth at derived from graphs, the number of variables is typically least 4) identified one more snark of order 34 and two some polynomial dependent on the size of the graph, snarks of order 36 with ′ = 17/5. In the incomplete say, . The resulting theoretical worst-case complexity set of generated snarks of order 38, there were six ad- of roughly 1.3 seems terrifying, but in practice, SAT ditional snarks (three of them have ′ = 27/8, three solvers do much better. We will illustrate this with our of them have ′ = 37/11). All these snarks are avail- experience with several problems on cubic graphs. able at https://github.com/janmazak/research-data (see circular_chromatic_index).

4. SAT solvers for graph theory

4.1. Construction of Boolean formulas

We experimented with diferent ways of constructing

Boolean formulas capturing 3-edge-colouring of cubic graphs. For instance, one can say that for every colour and every two incident edges, at least one of the edges does not have colour . Or that every colour is used on the three edges incident to a vertex at least once; or something based on combinatorial nullstellensatz etc. The running times were very similar, somewhat depended on the input graphs, and our experiences show that arcane ways of formula construction are likely not worth it for smaller-scale computations. • For 3-edge-colouring with integer colours, = 1.

Solvers easily beat other approaches2 and can deal with graphs with hundreds or even thousands of vertices. • For circular edge-colouring, the number of colours linearly depends on the size of the graph, and we need a variable for each pair [edge, colour], so = 2 at worst. The number of clauses is cubic. Solvers work decently for small graphs, but somewhere around 60 edges the computation time significantly increases (from minutes to days per single graph). However, solvers are 1For instance, the Tseytin transformation, or the approach mentioned in https://en.wikipedia.org/wiki/Conjunctive_normal_form# Other_approaches. 2Except tiny graphs and possibly except a recent algorithm based on path-width decomposition [13]. much faster than our backtracking algorithm3 at tually shorter, in theory, the real time might be half of that threshold. the CPU time. The Kissat variants do about 10% better • For Hamiltonian cycles, the best formula con- than the CaDiCaL. The solver lingeling used in previous struction we are aware of uses variables for pairs work is about 7 years old but was only 1.5 times slower. [vertex, order of vertex along the cycle], i.e. = The SeqFROST solver appeared to do much better on our 2. The number of clauses is also cubic. Again, benchmark task than the reference SAT solver. Repeated there is a threshold around 60–75 vertices after runs for each solver gave almost the same results, the which computation takes days for a single graph. diferences in time consumption were all less than 0.5%. In contrast to circular colouring, backtracking In another experiment we let the solvers search for competes with solvers fairly well. ′ of one single graph on 36 vertices with girth 5 (the result is 10/3). The results presented in Table 3 show

In summary, SAT solvers are likely to be all you would a diferent perspective than previous benchmark. For ever need if you can keep the number of variables linear. this kind of task, all solvers except plain Kissat beats the If quadratic, they are helpful, but large graphs (some- CaDiCaL. Even several years old lingeling was almost where above 50–80 edges) are out of their reach, and twice as fast. For MergeSat, the total time consumed you are likely to run into trouble if you want to do com- by two threads is shown, while the running time was putations for sizeable complete lists of graphs of a given only 1304 seconds, which is 44% of the time needed order. by CaDiCaL. Interestingly, SeqFROST4 repeated the best performance as it took only one quarter of the time of 4.3. Which solver should I use? the reference solver.

For computations taking hours, solver selection is not worth the efort. But for longer tasks taking weeks, it is likely to pay of. A reasonable way is to randomly pick a small sample of the instances in question and test them on solvers that are available. A word of warning: solvers might contain rarely-encountered bugs; it has happened in the past for the solvers submitted to the competition. So in situations where you are proving something for all instances, it would be better to stay with a more mainstream and seasoned solver; if you are trying to just find some suitable object that can be verified later, possible bugs in a freshly implemented solver are not a critical problem.

We conducted several simple experiments in order to evaluate diferent SAT solvers. We considered state-of-the-art solvers available from the website of SAT Competition [12]. We included also lingeling, used in the 2017 work [11]. The criterion we used was the overall CPU user time consumed by all the spawned processes. The test case for performance evaluation was the computation of ′ for all cyclically 4-connected snarks on 24 vertices. There are 155 such graphs and the computation for each of them involved solving several SAT instances as explained in Section 2.

The results are shown in Table 2 which shows total run time and comparison to the CaDiCaL_vivinst, denoted ′, which was one of the winners of the competition (the third place in main track). The table also shows 4.4. Should I strive for the best solver the speedup achieved on the 20 CPU machine denoted configuration? 20. At first sight, the MergeSat solver looks like quite We experimented with tweaking configurable solver paa bad choice but as it was always using two computa- rameters a bit (for instance, lingeling has hundreds of tional threads the real time of result delivery was ac- them). The best speedup we were able to get is 2-4x for a single graph and less than 2x for a large set of graphs.

However, it is unclear if the gains are transferable to other input instances. For instance, the distribution of 3The algorithm first constructs a linear ordering of edges such that the next edge in the ordering has the most incident edges among the edges already included in the ordering, and then tries all possible colours for each edge in the ordering subsequently, backtracking when a colouring conflict arises. Our other attempts at backtracking did not yield anything meaningfully faster. 4A winner of SAT Competition 2022, https://gears.win.tue.nl/papers/ sc2022_seqfrost.pdf, https://github.com/muhos/SeqFROST graph features allowing the speedup for smaller graphs solver by repeatedly adding clauses excluding already might be diferent for larger graphs. Additionally, fine- found solutions to the input formula). There is a fair tuning the parameters takes quite some time because chance this might work if the number of solutions is in of the necessary repeated solver runs. Overall, if your thousands or perhaps millions. With billions of solutions computations are going to run for weeks, it might be (or larger input instances in general), solvers tend to run worthwhile to work on this, but for quick verification out of memory or experience a significant slowdown. of conjectures and the like, not really. Also, note that As an example of this method, it is much easier to in Section 4.3, lingeling turned out to be slow for small describe a 2-factor than a Hamiltonian cycle (because instances, but fast for large instances, thus solver config- the conditions encoded into the Boolean formula are uration should be focused on bigger instances primarily local, around a single vertex)—the resulting formula has a or even exclusively. linear number of variables. For each 2-factor computed

Your time might be better spent on trying to break the by an AllSAT solver, we then check if is connected. symmetry of your Boolean formulas [14]. For instance, This approach is comparable to or even somewhat faster for circular edge-colouring, one can pre-colour an edge than direct SAT solving for cubic graphs with around 30 with 0 without a loss of generality, and immediately get vertices, but it fails for graphs above 40 vertices because rid of a handful of variables, potentially gaining a better the number of 2-factors grows exponentially [16]. speedup than anything that could be gained by tweaking One possible reason for the success of generic SAT solver parameters. solvers on graph problems (versus other approaches tai

Our experience with parallelism is that it does not pay lored to solve specific problems) is that a solver can see of to use multiple threads per single SAT instance. It more than is obvious. We have experienced this when seems preferable to solve many instances in parallel, each considering an alternative approach to colourings: inin its own thread. stead of running an exponential algorithm on the whole graph at once, what if we split the graph along a small 4.5. Alternatives to SAT cut, ran the slow algorithm on the much smaller parts separately, and then combined the results? It turned out Kunertová [11] described several alternatives for comput- it is no big win (unless the graph in question contains ing ′, some of them with measurements of computation many isomorphic subgraphs), possibly because all the time. It seems that they all fall far short of SAT solvers small cuts are reflected in the Boolean formula (as subsets (unless we are talking about very small instances, where of clauses barely sharing variables), and thus the solver backtracking is king). These alternatives typically rely can work with all the cuts as it sees fit anyway. Also, on some specific properties of circular colourings, so are leading solvers include tricks like random restarts, so not generalizable and we will not discuss them. One of e.g. if the colouring conflict is in a specific subgraph, the the approaches, though, deserves attention. solver does not get stuck outside of this subgraph for too

For a graph with ′() = , one way to prove long. the lower bound is to refute the existence of an ′-edge- SAT solvers also have a disadvantage: there is no efcolouring for ′ being the largest relevant fraction smaller fective way of reporting computation progress (unlike than . Another is finding all -edge-colourings of and for plain backtracking having at least a somewhat balchecking that each of them contains a so-called tight anced search tree). In case one wants to solve a single cycle. While it is not particularly fruitful for circular particularly hard instance, when does one give up?. . . colourings [11], it might sometimes be the most eficient A note on integer/mixed linear programming (ILP): approach for graphs of limited size. The problem of find- every SAT instance can be expressed as a problem of ing all satisfying assignments to a given Boolean formula ILP, and there are powerful ILP solvers, so ILP might be is called AllSAT and there are specialized solvers for it worth a try for your problem. Our limited experience [15], although not many, and they are not heavily em- shows that ILP is significantly slower, and it comes with ployed, so there are concerns about reliability (but any numerical and rounding problems, so perhaps the best ordinary SAT solver can be easily turned into an AllSAT use of it would be for refuting working hypotheses or for