<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
<article-title>Parallel Computation of Generalized Hypertree Decompositions</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Georg Gottlob</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Cem Okulmus</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Reinhard Pichler</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>TU Wien</institution>
          ,
          <country country="AT">Austria</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Oxford</institution>
          ,
          <country country="UK">UK</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Answering Conjunctive Queries (CQs) and solving Constraint Satisfaction Problems (CSPs) are arguably among the most important tasks in Computer Science. They are classical NP-complete problems. However, they have tractable fragments for instances where the underlying hypergraphs are acyclic. There has been active research for several decades to generalize acyclicity with several notions of decompositions and width [4]. Here we focus on generalized hypertree decompositions (GHDs) and generalized hypertree width (ghw). Deciding if a given CQ or CSP (strictly speaking, the underlying hypergraph H) has ghw(H) ≤ k is NP-complete even for k = 2 [3]. However, it was also shown in [3] that the problem of deciding if a given hypergraph has ghw(H) ≤ k becomes tractable for fixed k under realistic restrictions. One such restriction is the Bounded Intersection Property (BIP): a hypergraph H has intersection width i (denoted as iwidth(H) ≤ i) if the intersection of any two edges in H has at most i vertices. A class C of hypergraphs satisfies the BIP if there exists a constant i such that every hypergraph H ∈ C has iwidth(H) ≤ i. In [2], three different algorithms for checking ghw(H) ≤ k (and, if so, computing a concrete GHD of width k) were implemented and tested on the HyperBench benchmark [2], comprising over 3,000 hypergraphs derived from CQs and CSPs from various sources. All the algorithms thus implemented rely on the observation that hypergraphs of CQs and CSPs stemming from applications tend to have low iwidth.
These GHD-computations were purely sequential, even though one of the algorithms seems to lend itself naturally to parallel processing. More precisely, this GHD-algorithm is based on so-called “balanced separators”: at each step of the top-down construction of a GHD, the algorithm searches for a set of at most k edges {e1, …, ek} (a “balanced separator”) such that the vertex set e1 ∪ ⋯ ∪ ek splits the hypergraph into components whose size (measured in terms of the edges that intersect with each component) is at most half the size of the component processed by the parent node in the GHD. Given that many of the experiments reported in [2] had high run times or were even stopped due to a timeout (with default value 3,600 seconds), we have to look for a different computation strategy. The goal of this work is to provide a parallel computation of GHDs and to move the GHD-computation to a powerful cluster. To this end, we adopt the aforementioned GHD-algorithm based on balanced separators. (This work was supported by the Austrian Science Fund (FWF): P30930-N35.)</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Our approach is based on balanced separators (referred to as “b-seps”, for short, in the sequel), and we implement it in the Go programming language [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Developed at Google in 2009, Go has already seen widespread use by companies such as Dropbox, Cloudflare and Netflix, and by Google itself.
      </p>
      <p>Below, we describe the challenges that we have faced in designing a parallel implementation of the b-seps approach:
– Finding a good design for the main targets of parallelization, namely the search for the next balanced separator and the recursive calls of the GHD-computation for the resulting components;
– Finding a way to partition the work space equally among CPUs;
– Designing the parallel search to support efficient backtracking;
– Splitting resources equally among the search space during recursive calls;
– Introducing subedges of the edges in the given hypergraph (which is needed to exploit the low iwidth(H)) while avoiding an unneeded combinatorial explosion.</p>
      <p>
        Below, we summarize how we have dealt with the above challenges and we
report on first, promising experimental results. We will show that the parallel
approach is indeed able to significantly speed up the expensive GHD-computations
and that it allowed us to solve some cases which were out of reach with the
previous, purely sequential GHD-implementations in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
    </sec>
    <sec id="sec-2">
      <title>Preliminaries</title>
      <p>We assume the reader to be familiar with basic notions such as conjunctive queries (CQs) and their corresponding hypergraphs. A GHD of a hypergraph H = (V(H), E(H)) is a tuple ⟨T, χ, λ⟩, where T = (N, E(T)) is a tree, and χ and λ are labelling functions, which map each node n ∈ N to two sets, χ(n) ⊆ V(H) and λ(n) ⊆ E(H). We denote by B(λ(n)) the set {v ∈ V(H) | v ∈ e, e ∈ λ(n)}. The functions χ and λ have to satisfy the following conditions:
1. For each edge e ∈ E(H), there exists a node n ∈ N such that e ⊆ χ(n).
2. For each vertex v ∈ V(H), {n ∈ N | v ∈ χ(n)} is a connected subtree of T.
3. For each node n ∈ N, we have that χ(n) ⊆ B(λ(n)).</p>
      <p>
        The width of a GHD is the size of the largest λ-label, i.e., the maximum of |λ(n)| over all nodes n; the ghw of H is the minimum width over all GHDs of H. It was shown in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] that for a class of hypergraphs enjoying the BIP, one can add polynomially many subedges of edges in E(H) to ensure B(λ(n)) = χ(n) for every n ∈ N.
      </p>
      <p>For a set of edges S ⊆ E(H), we say two vertices v1, v2 ∈ V(H) are [S]-connected if there is a path of edges e1, …, en ∈ E(H) \ S such that v1 ∈ e1 and v2 ∈ en, and each consecutive pair ei, ei+1 (for i ≤ n − 1) in the path shares a common vertex. We define an [S]-component to be a maximal set of [S]-connected vertices. The size of an [S]-component C is defined as the number of edges e ∈ E(H) such that e ∩ C ≠ ∅. For a hypergraph H and a set of edges S ⊆ E(H), we say that S is a balanced separator (b-sep) if all [S]-components of H have a size of |E(H)|/2 or less.</p>
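      <p>As a concrete reading of this definition, the following Go sketch (our own illustration, not the paper's implementation) checks whether a set of edge indices forms a balanced separator: it groups the edges outside S into [S]-components with a union-find, collects each component's vertex set, and counts how many edges of E(H) intersect it.</p>
      <preformat>
```go
package main

import "fmt"

// Hypergraph: edges as slices of vertex ids.
type Hypergraph struct{ Edges [][]int }

// IsBalancedSep follows the definition literally: [S]-components are
// maximal vertex sets connected by paths of edges in E(H)\S, the size
// of a component is the number of edges of E(H) intersecting it, and a
// b-sep bounds every component size by |E(H)|/2.
func (h Hypergraph) IsBalancedSep(sep map[int]bool) bool {
	n := len(h.Edges)
	parent := make([]int, n)
	for i := range parent {
		parent[i] = i
	}
	var find func(int) int
	find = func(x int) int {
		if parent[x] != x {
			parent[x] = find(parent[x])
		}
		return parent[x]
	}
	// Union non-separator edges that share a vertex.
	owner := map[int]int{} // vertex -> some edge containing it
	for i, e := range h.Edges {
		if sep[i] {
			continue
		}
		for _, v := range e {
			if j, ok := owner[v]; ok {
				parent[find(i)] = find(j)
			} else {
				owner[v] = i
			}
		}
	}
	// Collect the vertex set of each [S]-component.
	comp := map[int]map[int]bool{}
	for i, e := range h.Edges {
		if sep[i] {
			continue
		}
		r := find(i)
		if comp[r] == nil {
			comp[r] = map[int]bool{}
		}
		for _, v := range e {
			comp[r][v] = true
		}
	}
	// Size of a component = edges of E(H) that intersect it.
	for _, C := range comp {
		size := 0
		for _, e := range h.Edges {
			for _, v := range e {
				if C[v] {
					size++
					break
				}
			}
		}
		if 2*size > n {
			return false
		}
	}
	return true
}

func main() {
	// A hypothetical path-shaped hypergraph with 6 edges, so |E(H)|/2 = 3.
	h := Hypergraph{Edges: [][]int{{1, 2}, {2, 3}, {3, 4}, {4, 5}, {5, 6}, {6, 7}}}
	// The middle edge alone is not balanced: edge {3,4} itself still
	// intersects both sides, pushing one component's size to 4 > 3.
	fmt.Println(h.IsBalancedSep(map[int]bool{2: true})) // false
	// Removing the two middle edges leaves components of size exactly 3.
	fmt.Println(h.IsBalancedSep(map[int]bool{2: true, 3: true})) // true
}
```
      </preformat>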
      <p>[Table 1: speedups for the sample hypergraphs Dubois-016.xml.hg, rand-25-10-25-87-24.xml.hg, Nonogram-012-table.xml.hg and Pi-20-10-20-30-17.xml.hg.]</p>
      <p>
        It was shown in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] that, for every GHD ⟨T, χ, λ⟩ of a hypergraph H, there exists a node n ∈ N such that λ(n) is a balanced separator of H. This property can be made use of when searching for a GHD of width k of a hypergraph H: if no such separator exists, then clearly there can be no GHD of H of width k.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Results</title>
      <p>
        The Go programming language [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] has a model of concurrency that is based on
Hoare’s Communicating Sequential Processes [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. The smallest concurrent unit
of computation is called a “goroutine”, essentially a light-weight thread.
      </p>
      <p>For the b-seps algorithm, we looked at two key areas of parallelization: the search for b-seps and the recursive calls. For the search, after testing a number of configurations, we ultimately settled on using two types of goroutines: a number of workers, spawned via a central master goroutine, which starts them and waits on exactly two conditions (whichever happens first): either one of the workers finds a b-sep, or none of them does and they all terminate on their own. In the first case, the master makes sure all workers terminate and then continues the main computation. For the recursive calls, we so far only spawn a single goroutine for each of them and wait for each call to return its result (a GHD of the corresponding subhypergraph).</p>
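      <p>The master/worker scheme just described can be sketched in Go roughly as follows. This is our own minimal illustration of the two wait conditions, not the actual implementation; findBSep is a hypothetical stand-in for a worker's search over its share of candidate separators.</p>
      <preformat>
```go
package main

import (
	"fmt"
	"sync"
)

// findBSep is a hypothetical stand-in for the per-worker search over
// its slice of candidate separators; here, worker 2 "finds" one.
func findBSep(worker int) (sep []int, ok bool) {
	if worker == 2 {
		return []int{0, 3}, true
	}
	return nil, false
}

// master spawns W workers and waits for whichever happens first:
// some worker finds a b-sep, or all workers finish empty-handed.
func master(W int) ([]int, bool) {
	found := make(chan []int, W) // buffered, so workers never block
	var wg sync.WaitGroup
	for w := 0; w < W; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			if sep, ok := findBSep(w); ok {
				found <- sep
			}
		}(w)
	}
	done := make(chan struct{})
	go func() { wg.Wait(); close(done) }()
	select {
	case sep := <-found:
		<-done // make sure all workers terminated before continuing
		return sep, true
	case <-done:
		// All workers finished; a result may still sit in the buffer.
		select {
		case sep := <-found:
			return sep, true
		default:
			return nil, false
		}
	}
}

func main() {
	sep, ok := master(4)
	fmt.Println(sep, ok) // [0 3] true
}
```
      </preformat>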
      <p>To split the work equally among the workers during the search, a simple work balancer (each worker receiving jobs from the central master goroutine) would address this nicely. However, we found that this introduces a considerable synchronization delay when compared with splitting up the work beforehand. We assume here that the choice of edges has only a small influence on the time needed to check if they form a b-sep. Thus we split the search space of size M = C(N, k), the number of ways to choose k of the N edges, by the number of workers W (and if there is some remainder, we simply increase the workload of the first (M mod W) workers by 1). Each worker has its own iterator, thus minimizing the need for communication during the search.</p>
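      <p>The static split can be sketched as follows (our own illustration under the stated assumptions): compute M = C(N, k), give the first M mod W workers one extra combination, and unrank each worker's start index into a concrete k-combination so each worker can iterate independently.</p>
      <preformat>
```go
package main

import "fmt"

// binom computes the binomial coefficient C(n, k); the incremental
// product/division stays exact at every step.
func binom(n, k int) int64 {
	if k < 0 || k > n {
		return 0
	}
	r := int64(1)
	for i := 1; i <= k; i++ {
		r = r * int64(n-k+i) / int64(i)
	}
	return r
}

// splitRanges divides the M = C(N,k) combinations among W workers;
// the first M mod W workers each handle one extra combination.
func splitRanges(N, k, W int) [][2]int64 {
	M := binom(N, k)
	base, rem := M/int64(W), M%int64(W)
	ranges := make([][2]int64, W)
	var start int64
	for w := 0; w < W; w++ {
		size := base
		if int64(w) < rem {
			size++
		}
		ranges[w] = [2]int64{start, start + size}
		start += size
	}
	return ranges
}

// unrank maps an index in [0, C(N,k)) to the corresponding
// k-combination in colexicographic order (combinatorial number
// system), giving each worker an independent starting point.
func unrank(idx int64, N, k int) []int {
	comb := make([]int, k)
	for i := k; i >= 1; i-- {
		n := i - 1
		for binom(n+1, i) <= idx {
			n++
		}
		comb[i-1] = n
		idx -= binom(n, i)
	}
	return comb
}

func main() {
	// 10 edges, separators of k = 3 edges, 4 workers: C(10,3) = 120.
	for _, r := range splitRanges(10, 3, 4) {
		fmt.Println(r, unrank(r[0], 10, 3))
	}
}
```
      </preformat>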
      <p>Our algorithm must support backtracking, i.e., restarting the search and finding another solution (if one exists). If we do not carefully keep track of where each worker left off when the parallel search halted earlier, we could be forced to repeat work. We address this by saving the above-mentioned iterators, used by each worker, in the main thread so as to reuse them during backtracking.</p>
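      <p>A resumable per-worker iterator of this kind might look as follows (a minimal sketch of the idea, not the actual implementation): the iterator keeps its position between calls, so the master can store it and simply continue from the same state when backtracking.</p>
      <preformat>
```go
package main

import "fmt"

// CombIter is a resumable iterator over k-subsets of {0,…,n-1} in
// lexicographic order. The master keeps one per worker, so a later
// backtracking pass resumes exactly where the halted search left off.
type CombIter struct {
	n, k    int
	comb    []int
	started bool
}

func NewCombIter(n, k int) *CombIter { return &CombIter{n: n, k: k} }

// Next advances to the following combination and reports whether one
// exists. The returned slice is reused across calls.
func (it *CombIter) Next() ([]int, bool) {
	if !it.started {
		it.started = true
		it.comb = make([]int, it.k)
		for i := range it.comb {
			it.comb[i] = i
		}
		return it.comb, it.k <= it.n
	}
	// Find the rightmost position that can still be incremented.
	i := it.k - 1
	for i >= 0 && it.comb[i] == it.n-it.k+i {
		i--
	}
	if i < 0 {
		return nil, false // search space exhausted
	}
	it.comb[i]++
	for j := i + 1; j < it.k; j++ {
		it.comb[j] = it.comb[j-1] + 1
	}
	return it.comb, true
}

func main() {
	it := NewCombIter(4, 2)
	c, ok := it.Next()
	fmt.Println(c, ok) // [0 1] true
	// The search halts here (say, a b-sep was found); the iterator
	// keeps its position, so backtracking just calls Next() again.
	c, _ = it.Next()
	fmt.Println(c) // [0 2]
}
```
      </preformat>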
      <p>Another challenge lies in parallelizing the recursive calls in such a way that the resources (CPU cores) are split among them, so as to reduce unnecessary synchronization delays and not focus too much on one branch of the search tree. We have not yet found a satisfactory solution for this. One idea would be to go back to load balancing, and see if it helps with parallelizing the recursive calls, at the cost of taking away resources from the parallel search of b-seps.</p>
      <p>[Figure 1: median speedup, ranging roughly from 3.4 to 4.6, for widths 2, 3 and 4.]</p>
      <p>
        The algorithm needs to compute subedges and consider them as possible
choices for b-seps, as it would otherwise not be complete. The first and obvious
choice is to add all possible subedges in the beginning (adding them globally).
This leads to a combinatorial explosion, and more crucially also leads to many
useless combinations, such as considering multiple subedges from the same
original edge at once. We address this by adopting the local subedge variant from [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]
and making it more restricted by first finding a balanced separator among the
“original” edges of E(H), and if this leads to a reject case, considering subedge
variants of its edges. We also make sure not to repeat the same subedges by
caching the vertex sets that generate them.
      </p>
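      <p>The caching of generating vertex sets mentioned above amounts to mapping each set to a canonical key. A minimal sketch of this idea (our own illustration; names and representation are assumptions, not the actual implementation):</p>
      <preformat>
```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// cacheKey turns a vertex set into a canonical string, so that two
// subedges generated from the same set map to the same key.
func cacheKey(verts []int) string {
	vs := append([]int(nil), verts...) // copy: don't reorder the caller's slice
	sort.Ints(vs)
	parts := make([]string, len(vs))
	for i, v := range vs {
		parts[i] = strconv.Itoa(v)
	}
	return strings.Join(parts, ",")
}

func main() {
	seen := map[string]bool{}
	for _, s := range [][]int{{3, 1, 2}, {2, 3, 1}, {1, 2}} {
		k := cacheKey(s)
		if seen[k] {
			fmt.Println("skip duplicate subedge", s)
			continue
		}
		seen[k] = true
		fmt.Println("new subedge", s, "key", k)
	}
}
```
      </preformat>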
      <p>We used our parallel implementation to look at further instances that can be determined negatively (proving that no GHD of a certain width can exist) or positively (actually producing a GHD of that width). Our test setup was a cluster with 11 machines, each with two 12-core Intel Xeon E5-2650 v4 CPUs clocked at 2.20 GHz. For the tests, each job ran exclusively on a machine, spawning up to 24 goroutines.</p>
      <p>
        Possible speedups of four sample hypergraphs from HyperBench when compared to a sequential execution are showcased in Table 1, where speedup is simply the sequential time divided by the parallel one. (“Sequential execution” here refers to running the same general algorithm, but rewritten so that it does not use any goroutines; this version therefore only uses a single CPU core.) Furthermore, when comparing the execution times of the purely sequential version against parallelizing both the search for b-seps and the recursive calls, we observe an increase in speedups at higher widths, as seen in Figure 1. Additionally, we could produce new results for some previously unsolved instances of the HyperBench benchmark, thus determining their exact ghw. We are positive that with further improvements, we will be able to “fill out the blank spots” for many hypergraphs of the HyperBench benchmark from [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and in the process produce a more robust tool to compute GHDs.
      </p>
    </sec>
    <sec id="sec-4">
      <title>Outlook</title>
      <p>
        This paper is about work in progress. The next step will be to incorporate
further optimizations into our Go implementation such as the following: (1)
Applying various heuristics for ordering the hyperedges (to find b-seps faster),
(2) caching of previous computations, and (3) looking at hybrid solutions which
first apply the recursive splitting into subproblems via the b-seps approach and
then (for sufficiently small subproblems) switch to one of the other (sequential)
GHD-algorithms in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. The ultimate goal is, at least for all hypergraphs in the
HyperBench benchmark where an upper bound on ghw of at most 6 is known
(i.e., slightly more than 1,500 instances), to be able to compute the precise value
of ghw . Currently, for about half of these instances, the exact ghw is still open.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <source>Golang.org</source>
          (Feb
          <year>2019</year>
          ), https://golang.org/
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Fischl</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gottlob</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Longo</surname>
            ,
            <given-names>D.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pichler</surname>
          </string-name>
          , R.:
          <article-title>Hyperbench: A benchmark and tool for hypergraphs and empirical findings</article-title>
          .
          <source>In: Proc. PODS</source>
          <year>2019</year>
          (to appear)
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Fischl</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gottlob</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pichler</surname>
          </string-name>
          , R.:
          <article-title>General and fractional hypertree decompositions: Hard and easy cases</article-title>
          .
          <source>In: Proc. PODS</source>
          <year>2018</year>
          . pp.
          <fpage>17</fpage>
          -
          <lpage>32</lpage>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Gottlob</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Greco</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Leone</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Scarcello</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Hypertree decompositions: Questions and answers</article-title>
          .
          <source>In: Proc. PODS</source>
          <year>2016</year>
          . pp.
          <fpage>57</fpage>
          -
          <lpage>74</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Hoare</surname>
            ,
            <given-names>C.A.R.</given-names>
          </string-name>
          :
          <article-title>Communicating sequential processes</article-title>
          .
          <source>Commun. ACM</source>
          <volume>21</volume>
          (
          <issue>8</issue>
          ),
          <fpage>666</fpage>
          -
          <lpage>677</lpage>
          (
          <year>1978</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>