Workload-Aware Streaming Graph Partitioning

                                Hugo Firth                                                     Paolo Missier
                     School of Computing Science                                     School of Computing Science
                        Newcastle University                                            Newcastle University
                           h.firth@ncl.ac.uk                                         paolo.missier@ncl.ac.uk


ABSTRACT                                                                                                                         q1
                                                                                                                G                                         Q
Partitioning large graphs, in order to balance storage and
                                                                                                                         b
processing costs across multiple physical machines, is be-                   1:a     2:b     3:c         4:d         a       a
coming increasingly necessary as the typical scale of graph                                                              b
data continues to increase. A partitioning, however, may in-                                                                                                  q3
troduce query processing latency due to inter-partition com-                   5:b     6:a         7:d         8:c               q2
                                                                                                                                      a   b       c       d
munication overhead, especially if the query workload ex-                                                            a   b   c                a       b
hibits skew, frequently traversing a limited subset of graph
edges. Existing partitioners are typically workload agnostic
and susceptible to such skew; they minimise the likelihood
of any edge crossing partition boundaries.                               Figure 1: An example graph G with query workload
   We present our progress on LOOM: a streaming graph                    Q
partitioner based upon efficient existing heuristics, which re-
duces inter-partition traversals when executing a stream of
sub-graph pattern matching queries Q. We are able to con-
                                                                         matching over these graphs is common to many modern ap-
tinuously summarise the traversal patterns caused by queries
                                                                         plications, including fraud detection [18], recommender sys-
within a window over Q. We do this using a generalisation
                                                                         tems [7] and genome analysis [4]. Pattern matching can be
over a trie data structure, which we call TPSTry++, to
                                                                         simply discussed in terms of sub-graph isomorphism, where,
compactly encode frequent sub-graphs, or motifs, common
                                                                         given a labelled query graph Gqi and a labelled parent graph
to many query graphs in Q. When the graph-stream be-
                                                                         G, the answer to the query is all sub-graphs in G which
ing partitioned contains a match for a motif, LOOM uses
                                                                         are isomorphic to Gqi ; i.e. all sub-graphs which have the
graph-stream pattern matching to capture it, and place it
                                                                         same structure (vertices, edges, and labels) as Gqi . For in-
wholly within partition boundaries. This increases the like-
                                                                         stance, given the graph and query workload in figure 1, the
lihood that a random query q ∈ Q may be answered within
                                                                         answer to q1 would be the sub-graph of G containing the
a single partition, with no inter-partition communication to
                                                                         vertices 1, 2, 5, 6 and their interconnecting edges. Figure 1
introduce additional latency.
                                                                         also demonstrates query graphs which share common sub-
   Finally, we discuss the potential pitfalls and drawbacks
                                                                         structure. In this work, we exploit the existence of such fre-
which exist with our approach, and detail the work yet to
                                                                         quently reoccurring sub-graphs within a query workload Q
be completed.
                                                                         to improve Q’s performance over large, distributed graphs.
                                                                         We refer to frequent sub-graphs of query graphs as motifs.
Categories and Subject Descriptors                                          Pattern matching is computationally complex and, over
                                                                         “big-graph data”, would prove prohibitively expensive to a
H.2.4 [Database Management]: Systems
                                                                         single commodity machine. Distributed graph partitioning
                                                                         has long been seen as a viable approach to address such scal-
1.    INTRODUCTION                                                       ability issues in graph processing frameworks [11, 12], and
  Recently there has been a proliferation of web hyperlinks,             graph database management systems (GDBMS) [1]. These
social network users, protein interaction networks, and other            systems distribute vertices and computation across multiple
content readily modelled as large graphs. Sub-graph pattern              machines, using a simple hash function to determine ver-
                                                                         tex placement by default. Although a hash-partitioning is
                                                                         efficient to compute and creates partitions with even num-
                                                                         bers of vertices, it ignores vertex locality, and is therefore to
                                                                         create a large number of inter-partition edges. This is unde-
                                                                         sirable, incurring a high communication overhead between
                                                                         partitions for many types of graph workload, including pat-
 c 2016, Copyright is with the authors. Published in the Workshop Pro-   tern matching queries.
ceedings of the EDBT/ICDT 2016 Joint Conference (March 15, 2016, Bor-       The problem of minimising the number of inter-partition,
deaux, France) on CEUR-WS.org (ISSN 1613-0073). Distribution of this
paper is permitted under the terms of the Creative Commons license CC-   or cut, edges in a distributed graph, whilst maintaining an
by-nc-nd 4.0                                                             even distribution of vertices, is known as k-balanced graph
partitioning, which is NP-Hard [3]. Despite this, there ex-              the probable edge traversals caused by a given work-
ist several practical solutions [8, 17, 19, 20] to the problem.          load of sub-graph pattern matching queries Q.
Some, such as the state-of-the-art “offline” partitioner METIS         • We propose a graph-stream pattern matching approach
[8], are memory intensive, their performance suffering over              to transform a stream of vertices and edges G into a
graphs with billions of vertices [19]. They may also have to             sequence of motifs, where each motif represents a sub-
perform expensive repartitioning in the presence of graph                graph in G likely to be traversed during execution of
changes. Others [17,19,20] adopt a simpler streaming graph               a random q ∈ Q.
partitioning model. A graph-stream is an ordering over
the elements of a dynamic, growing graph, often by cre-              The rest of this paper is organised as follows. In the sub-
ation time. Social networks are often viewed as graph-            sequent section we discuss background material and related
streams [16]. Procedures on graph-streams will usually con-       work. In Section 4.2 we present the TPSTry++ datastruc-
sider each graph element, in order, just once and therefore       ture for capturing an intensional representation of graph
likely have very good complexity, regardless of graph size.       traversals from a query workload. In Section 4.3 we present
Streaming graph partitioners typically produce more inter-        a detailed overview of the streaming graph partitioner which
partition edges than METIS, but are much faster.                  accounts for the workloads captured in Section 4.2. In the
   Although these streaming graph partitioners do reduce the      Conclusion we discuss our progress with this work, highlight
number of cut edges and seamlessly handle graph updates,          its limitations, and present some potential avenues for future
they are agnostic to the specific workload being executed         study.
over the graph partitioning: edges to be cut are computed
purely based upon graph structure. For some workloads,            2.    DEFINITIONS
which may traverse a limited subset of edges in a graph, such        A labelled graph G = (V, E, LV , fl ) is of the form: a set
partitionings may incur unnecessary communication over-           of vertices V = {v1 , v2 , ..., vn }, a set of pairwise relationships
head [15, 20]. An example of such a workload is a one of          called edges e = (vi , vj ) ∈ E and a set of vertex labels LV .
pattern matching queries, where the topologies which are          The function fl : V → LV is a surjective mapping of vertices
likely to be traversed are those which correspond to mo-          to labels.
tifs defined in query graphs. In this work we focus on a             A graph motif is simply a sub-graph structure which
different measure of partitioning quality: the probability of     occurs repeatedly within a graph or graph-stream G.
inter-partition traversals which is different from the number        A pattern matching query is defined in terms of sub-
of inter-partition edges, given a workload Q.                     graph isomorphism. Given a pattern graph Q = (VQ , EQ ),
   Whilst systems such as LogGP [20] do attempt to collect        a query should return G0 : a set of sub-graphs of G. For
runtime statistics in order to improve graph partitioning for     each returned sub-graph G0i = (Vi0 , Ei0 ) there should exist
a given workload, they are focused on the Bulk-Synchronous-       a bijective function f such that: (a) for every vertex v ∈
Parallel (BSP) model of computation used by Pregel-like           Vi0 , there exists a corresponding vertex f (v) ∈ VQ ; (b) for
systems for “offline” analytical workloads. Our goal, how-        every edge (v1 , v2 ) ∈ Ei0 , there exists a corresponding edge
ever, is to improve graph partitionings for a given “online”      (f (v1 ), f (v2 )) ∈ EQ ; and (c) for every vertex v ∈ G0i , the
workload of pattern matching queries over a dynamic la-           labels match those of the corresponding vertices in Q, l(v) =
belled graph, as would be common to GDBMS.                        l(f (v)).
                                                                     A graph partitioning is defined as an disjoint family
1.1    Contributions                                              of sets of vertices Pk (V ) = {V1 , V2 , . . . , Vk }. Each set Vi ,
   We describe a novel extension to existing streaming graph      together with its edges Ei (where ei ∈ Ei , ei = (vi , vj ), and
partitioning methods [17], aimed at avoiding introducing          {vi , vj } ⊆ Vi ), is referred to as a partition Si . A partition
unnecessary inter-partition communication overhead for a          forms a proper sub-graph of G such that Si = (Vi , Ei ), Vi ⊆
given query workload. More precisely, let Q be a work-            V and Ei ⊆ E.
load of queries over G, along with the relative frequency
of each query in Q. We are able to efficiently derive the         3.    BACKGROUND & RELATED WORK
most common motifs from the query graphs in Q. Using an             The three main areas of work which relate to our own are:
approach to graph-stream pattern matching [16], we iden-          1) graph partitioning, particularly when workload-aware;
tify those sub-graphs in G which match these motifs, and          2) frequent sub-graph mining; & 3) graph-stream pattern
are therefore likely to be traversed during the execution of      matching. We provide an overview of the first below, but
a random q ∈Q. Having grouped a graph-stream into fre-            defer discussion of the latter two to sections 4.2 and 4.3 re-
quently traversed sub-graphs, we are then able to use the         spectively, for context.
successful Linear Deterministic Greedy heuristic [17] (LDG)
to assign these to a partition, excepting some balance con-       3.1     Graph partitioning
straints. This increases the likelihood that a random q ∈ Q          balanced graph partitioning is an NP-Hard problem with
can be processed without causing inter-partition traversals       application to many areas across distributed systems and
and communication overhead.                                       scientific computing; it has been exhaustively studied in lit-
   Concretely, this work makes the following contributions:       erature since the 1970s [6, 8, 9], and several practical solu-
                                                                  tions exist [8, 19]. One such solution is METIS [8], a re-
   • We extend an existing streaming graph partitioning
                                                                  liable standard for offline, fast partitioning. METIS is a
     approach [17], to account for the probabilities of cross-
                                                                  multilevel technique: it computes a succession of recursively
     ing partition boundaries during execution of a query
                                                                  compressed graphs, partitions the smallest then “projects”
     from a given workload Q.
                                                                  that partitioning onto previous graphs in the sequence, ap-
   • We present an efficient intensional representation of        plying local refinement techniques [9] to the partitioning at
each step. This produces a balanced k-way partitioning on              There are a number of works addressing the related prob-
the original graph, optimised for minimal edge cut.                 lem of workload-aware data partitioning in distributed rela-
   Despite its prevalence, there are soome issues with METIS        tional database systems [5,13]. Schism [5] and SWORD [13]
which makes it unsuitable for our goal of workload-aware            use an a priori workload to generate a hypergraph, where
partitioning of a large, dynamic graph. Firstly, the per-           each edge represents a set of tuples involved in a single
formance of METIS suffers in the presence of graphs with            transaction. This hypergraph is then partitioned using a
more than a few hundred million elements [19]. Secondly, if         version of METIS to achieve a minimal edge-cut. Mapped
a graph partitioning produced with METIS, or other offline          back to the original database, the partitioning represents an
techniques, grows over time then expensive full repartition-        arrangement of records which causes a minimal number of
ing operations will be required to maintain partition qual-         transactions in the captured workload to be distributed.
ity. Finally, METIS may account for a static query work-               Though the goals of these works and our own are similar,
load known a priori, using individual edge-weights to repre-        they are focused on a relational data model, where typical
sent traversal frequency, however tracking this information         workloads overwhelmingly consist of short 1-2 “hop” queries.
is memory intensive, and otherwise non-trivial.                     It is unclear how the techniques described would perform
   The streaming graph partitioning model [17] addresses the        given a workload containing many successions of JOIN op-
first two of these shortcomings. By assigning vertices and          erations, equivalent to the traversals required for sub-graph
edges to a partition as soon as they arrive and not stor-           pattern matching. Furthermore, these works do not consider
ing them to perform introspection of graph structure, such          dynamic graphs at all.
partitioners are able to maintain a small memory footprint.            In [21], Yang et al propose algorithms to efficiently anal-
Thus, streaming partitioners such as Fennel [19], created           yse online query workloads and to dynamically replicate
by Tsourakakis et al, are able to scale to large graphs un-         “hotspots” (clusters of vertices over 2 or more partitions
bounded by the main memory of a host machine. Also, be-             which are being frequently traversed), thereby temporarily
cause element placement is computed “on the fly”, streaming         dissipating network load. Whilst highly effective at dealing
partitioners adapt seamlessly to graph growth, applying the         with unbalanced query workloads, Yang et al focus solely
same placement operation for each new vertex and edge that          upon the replication of vertices and edges using temporary
arrives over time.                                                  secondary partitions. They do not consider workload char-
   It is worth noting, however, that the heuristics used by         acteristics when producing the initial partitioning, nor do
streaming graph partitioners are sensitive to the order of          they consider workload characteristics when producing it.
graph elements in a stream [17]. There are three cate-              This can result in replication mechanisms doing far more
gories of graph ordering commonly considered when evaluat-          work than is necessary over time, adversely affecting the
ing streaming graph partitioners: random, adversarial and           performance of a system. As a result, the partitioning tech-
stochastic. Consider an 2-way partitioning for the graph in         nique we present here could effectively complement many
figure 1, with a vertex ordering of V = (1, 3, 6, 8, 2, 4, 5, 7).   workload aware replication approaches, such as this.
Given no neighbours for the first half of vertices received, a
naive partitioner might greedily place them in a single par-        4.    LOOM PARTITIONING OVERVIEW
tition which, intuitively, causes a final balanced partitioning
                                                                       In this section we provide an intuition of how we are going
with the maximum edge cut: |E|. This is an adversarial
                                                                    to partition a graph-stream to account for a specific work-
ordering. For the streaming graph partitioning approach
                                                                    load. Initially, we describe the efficient streaming-graph par-
described in this work, we will consider stochastic order-
                                                                    titioning heuristic used as a base for our workload-aware
ing; that is, a graph-stream continuously generated by some
                                                                    extensions. The heuristic assesses characteristics of each in-
stochastic process, such as user input.
                                                                    dividual vertex before placing them in an appropriate par-
                                                                    tition. However, by running efficient pattern matching pro-
3.2    Workload aware partitioning                                  cedures against a buffered window over the graph-stream,
  To the best of our knowledge, existing streaming graph            we are able to capture motifs; treating these motifs as single
partitioning solutions do not satisfy our goal for this work:       vertices, we may then use the same heuristic to place them
producing workload-aware graph partitionings, which ac-             wholly within beneficial partitions. If the motifs we cap-
count for the edge traversals patterns of a given “online”          ture correspond to those likely to be frequently traversed
workload of sub-graph pattern matching queries.                     by a known workload Q, then this would increase the likeli-
  Xu et al ’s LogGP [20] tackles workload aware partition-          hood that a random q ∈Q is executed without inter-partition
ing improvement for graphs processed in Pregel-like sys-            traversals. Thus, we subsequently present a method for con-
tems [12], where operations are computed in a vertex-centric        tinuously summarising the motifs most frequently traversed
fashion across a series of supersteps. LogGP collects in-           by a given stream of sub-graph pattern matching queries
formation about the set of vertices accessed in each su-            Q. Finally, we present our chosen pattern matching proce-
perstep, using it to predicts the set to be accessed in the         dure and discuss the issues which currently exist with our
next. Subsequently, this meta-data is incorporated with the         partitioning approach.
original graph, transforming it into a hypergraph. With a
novel streaming hypergraph partitioning technique, LogGP            4.1    Base partitioning heuristic
then repartitions the graph after each superstep, reducing            LOOM’s partitioning is based upon the Linear Deter-
the communication overhead for the next and reducing the            ministic Greedy heuristic (LDG) proposed by Stanton and
overall execution time of an operation. Though LogGP’s              Kliot [17]. LDG is a simple heuristic which seeks to assign
approach is workload-aware, its dependence on supersteps            a new vertex to the partition where it has the most edges,
and vertex-centric computation renders it unsatisfactory for        as this is efficient to compute, and greedily minimises the
our goal.                                                           number of inter-partition edges for each vertex. In order to
                                                 a   b       c                                      In this work we extend the TPSTry data structure from a
                                                         a       b                               trie to a directed acyclic graph, which we call TPSTry++,
                             b   a       b   c
                                                                                                 capable of encoding the features of more complex motifs:
   a             b   a   b
                             a   b       a   b       b                   a   b       c       d
                                                                                                 branches, cycles etc. Encoding the motifs described by in-
                                                 a
                                                     b
                                                             a                   a       b       exact pattern matching queries, such as those including vari-
        a    b   a   b   a
                             a   b       c                                                       able length paths, is considered out of scope for this work.
   b             a   b   c
                                     a                                                              The TPSTry++ is inspired by the work of Ribeiro and
                                                 a   b       c       d
                                                         a
                                                                                                 Silva, who propose G-Tries [14]. A G-Trie is a trie data
        b    c               a   b       c   d
                                                                                                 structure which stores unlabelled graphs in such a way that
   c             b   c   d                                                                       a parent node in the trie represents a sub-graph of its chil-
                                                                                                 dren. Each graph in a G-Trie node is represented in its
         c   d
                                                                                                 canonical form. A canonical form is guaranteed to be equal
   d
                                                                                                 for two graphs which are isomorphic to one another, avoid-
                                                                                                 ing multiple trie branches per graph. In order to discover
                                                                                                 frequent sub-graphs (motifs), Ribeiro and Silva traverse the
             Figure 2: TPSTry++ for Q in fig.1                                                   elements of a graph, constructing the branches of a G-Trie
                                                                                                 as they encounter distinct motifs. A p-value is associated
                                                                                                 with each node, based upon the number of times a particu-
create a partitioning which is balanced in the number of ver-                                    lar motif has been observed.
tices, each partition is given a capacity constraint C. For a                                       This process is similar to how we construct a TPSTry++
given vertex v and partition Si = (Vi , Ei ), the number of v’s                                  from a stream of sub-graph pattern matching queries, how-
edges in Si is weighted by Si ’s free capacity 1 − |VCi | . In this                              ever there are a number of differences. Firstly, as we cap-
way partitions are progressively more penalised the more                                         ture labelled topologies, the TPSTry++ must be a directed
vertices they contain. LDG is an effective heuristic [17, 19],                                   acyclic graph (DAG), rather than a tree, as it may have mul-
reducing the number of edges cut by up to 90%.                                                   tiple possible root nodes: one for each vertex with a distinct
   In LOOM we buffer a sliding window over a graph-stream,                                       label. Secondly, we must use a different method for checking
and use LDG to assign both connected sub-graphs1 and sin-                                        isomorphism between two motifs, as the unlabelled canonical
gle vertices from the buffer to partitions. Stanton and Kliot                                    form used when constructing G-Tries is no longer sufficient.
describe similar extensions in their original work [17]. In par-                                 To match two motifs, we use an efficient algorithm by Song
ticular, the Greedy EvoCut partitioning heuristic is closely                                     et al [16] for computing numerical signatures for graphs.
related to our own. The local partitioning algorithm Evo-                                        This algorithm is proposed as part of a graph-stream pat-
Cut [2] is used to split sub-graphs which occur within the                                       tern matching approach, which we use to detect matches for
stream buffer into small pseudo-partitions, which are then                                       Q’s motifs in a graph-stream G. Both algorithm and pat-
wholly assigned to parent partitions using LDG. In LOOM                                          tern matching approach are detailed in the next section 4.3.
however, we attempt to detect sub-graphs within the stream                                       Note that signature equality constitutes a non-authoritative
buffer which are likely to be frequently traversed by a work-                                    form of isomorphism checking. However, the probability of
load Q, and greedily place those wholly within a partition.                                      signature collisions, and therefore of mistakenly represent-
                                                                                                 ing distinct motifs with a single TPSTry++ node, is shown
4.2    Capturing a query workload                                                                to be very low.
   In order to detect the sub-graphs from stream G which are                                        Figure 2 shows a representation of a TPSTry++ for the
likely to be frequently traversed by a workload Q, we must                                       workload Q in figure 1, without p-values. We capture the
first discover those motifs which occur frequently within the                                    motifs common to Q, along with their frequencies, by ex-
query graphs defined in Q. This act of discovery is a form                                       ecuting a simple co-recursive algorithm, presented in algo-
of frequent sub-graph mining.                                                                    rithm 1, for each query graph Gqi .
   In a previous work by the authors which is submitted
for publication elsewhere, we define the traversal pattern
summary trie (TPSTry). The TPSTry datastructure, in-                                             Algorithm 1 Recompute TPSTry++ for each query q ∈Q
spired by Li et al ’s work [10] to find common traversal paths                                     qG ← the query graph defined by q
amongst sessions of hyperlink click-streams, is an encoding                                        signature(g) ← the signature of a graph g
of the frequent motifs in a workload of path queries. The en-                                      support(g) ← a map of TPSTry++ nodes to p-values
coding is intensional, encoding paths of vertex labels, rather                                     tpstry ← the TPSTry++ for Q
than the vertices themselves, in order to save space. Each                                         g ← some sub-graph of qG, initially a single vertex
node n is associated with the set of queries which could
                                                                                                   weave(qg, tpstry)
cause the path of traversals which n represents. Each node                                           for v in vertices from qG do
n in the trie is additionally associated with a probability                                             g ← new graph with just {v}
P (n), representing the likelihood of a traversal in graph G                                            corecurse(g, tpstry)
along a path whose vertex labels match those of the path                                                   sig ← signature(g)
ε → . . . → n in the TPSTry. Using these probabilities, we                                                 if sig not in tpstry then
are able to estimate the probability of any traversal from a                                                  tpstry ← tpstry + g //Add a g node to TPSTry++
                                                                                                           support(g) ← support(g) + 1
vertex v, given its v label and those of v’s local neighbour-                                              newEdges ← edges incident to g but not in g
hood.                                                                                                      for e in newEdges
1                                                                                                             corecurse(g + e, tpstry) //Traverse through qG
  When assigning sub-graphs, LDG considers the total edges                                           return tpstry
from all vertices, to each partition.
  Any node in the TPSTry which has a p-value above a                                      graph-stream G
user-defined threshold T is denoted frequent, and the node’s
associated sub-graph is considered a motif in Q.                           a      b       c                a     b        c

4.3    Detecting motif matches in a graph-stream
                                                                                                                      c
   The TPSTry++ for Q provides a set of query motifs
which, where they occur in G, are likely to be frequently                         S                              S`
traversed by a random q ∈Q. Given this information, we
must attempt to identify sub-graphs in G which match these
motifs, in order to make sure they cross partition boundaries         Figure 3: Motif matching over the graph-stream
as little as possible. As mentioned, global graph introspec-
tion or pattern matching operations are expensive and limit
                                                                  the new edge e. This computation is similar to algorithm
the scalability of a partitioner [19]. Instead, we use a graph-
                                                                  1, in that we traverse each edge in S 0 , starting with those
stream pattern matching algorithm to capture those sub-
                                                                  incident to vertices in e. After each step we recompute the
graphs in G which match a query motif and occur within a
                                                                  signature for the sub-graph of S 0 which we have traversed so
given window2 over the graph-stream.
                                                                  far. If this recomputed signature is not in the TPSTry++
   Song et al [16] propose a highly efficient algorithm based
                                                                  then we discard the most recent edge, and do not traverse
on number theoretic signatures. Offline, they construct a
                                                                  to its neighbours. We eventually traverse and identify the
“signature” for each query graph Gqi in a workload. This
                                                                  largest sub-graph of S 0 which both contains the edge e and
signature is really a large integer hash, which captures key
                                                                  is a match for a query motif3 .
information about a graph, such as vertices, labels and their
degree, as distinct factors. Subsequently, as an edge e arrives   4.4      Assigning motif matches to partitions
online, the signature for a sub-graph S which contains e is
                                                                     The pattern matching algorithm described in the previous
calculated by multiplying the previous signature of the sub-
                                                                  section will maintain the set of sub-graphs which are cur-
graph S\e by the factor for e. If the signature for S is
                                                                  rently within the graph-stream window, and which match
divisible by the signature for Gqi then there is likely to be a
                                                                  common motifs from a query workload Q. Over time, ver-
match for qi in S.
                                                                  tices and edges will leave the stream window and be assigned
   Song et al demonstrate that if a graph does not have a
                                                                  to partitions using the LDG heuristic, as mentioned in sec-
signature equal to that of a given query graph Gqi , then
                                                                  tion 4.1. When the “oldest” vertex in a motif match is due
it cannot be a match for the query qi . Note that this is
                                                                  to be assigned, we assign the whole matching sub-graph at
a weaker property than a signature match being equivalent
                                                                  once. Other matching sub-graphs which share common sub-
to a graph match, and as such this pattern matching algo-
                                                                  structure with the sub-graph being assigned, as in figure 3,
rithm is non-authoritative; indeed, Song et al must use a
                                                                  will also be assigned to the same partition. This greedy
secondary algorithm to verify matches. However, they also
                                                                  approach is naive, as it risks assigning some motif match-
demonstrate that signature collision is highly unlikely, which
                                                                  ing sub-graphs to sub-optimal partitions because they share
should be sufficient for our purposes of heuristically improv-
                                                                  substructure with another sub-graph which was assigned
ing a partitioning, without further verification.
                                                                  earlier. Furthermore if a set of connected sub-graphs is very
   Recall from algorithm 1 that a signature is computed for
                                                                  large, it is unclear what effect this would have on partition
each motif represented by a node in the TPSTry++. As each
                                                                  balance, even given LDG’s penalty weighting. Evaluating
edge arrives in the graph-stream, if it connects two vertices
                                                                  alternative approaches, including local partitioning of motif
within the stream window to form a sub-graph S, then we
                                                                  matches to separate them across partitions, is a focus of our
compute a signature. If the signature is a match for a node
                                                                  ongoing work.
n in the TPSTry++, then S is a match for a motif. For
                                                                     As stated previously, isolated vertices, or sub-graphs which
a subsequently added edge, the signature for sub-graph S 0
                                                                  do match motifs from Q, are assigned according to the LDG
must match a signature associated with a child of n, or else
                                                                  heuristic.
S 0 is not a match for a motif. Note that the sub-graph S 0
not being a match for a motif does not imply that the newly
added edge is not part of a sub-graph which is a match.           5.     CONCLUSION AND FUTURE WORK
   Figure 3 presents an illustrative example of the above.           We have presented our ongoing work on LOOM: a workload-
The subgraph S 0 is not a match for any node of the TPSTry++      aware streaming graph partitioner. Our primary contri-
in figure 2, however it contains two distinct instances of the    bution is using a generalised trie data structure to iden-
abc motif. The pattern matching algorithm does not de-            tify query motifs, small sub-graphs common to many of
tect this, because sub-graph signatures are iteratively recom-    the query graphs defined in a sub-graph pattern matching
puted with each update, and previous signatures discarded.        workload Q. We have also described how we use an effi-
As a result, we risk assigning the added c labelled vertex        cient graph-stream pattern matching technique to identify
to a different partition than the sub-graph S, creating an        matches for query motifs in a graph-stream G, and greedily
inter-partition edge which is likely to be traversed. In order    assign these matches to partitions to reduce the probability
to avoid this, we adopt the following simple procedure: if an     of inter-partition traversals.
edge e is added to a sub-graph S, and the new sub-graph              As future work we will perform extensive evaluation of the
S 0 is not a match for a motif in the TPSTry++, then we           prototype LOOM architecture, specifically in the presence
incrementally compute a new signature for S 0 , starting with     of a number of different graph-stream orderings, and dif-
2                                                                 ferent query workloads. Furthermore, our choice of greedy
  Stream windows may be defined in terms of time, or ele-
                                                                  3
ment count                                                            This may be none!
assignment semantics, never splitting sub-graphs in G which     [14] P. Ribeiro and F. Silva. G-Tries: a data structure for
match query motifs, risks poor performance when large sub-           storing and finding subgraphs. Data Mining and
graphs are assigned to sub-optimal partitions in order to            Knowledge Discovery, 28(2):337–377, mar 2014.
maintain partition balance. We must propose a local parti-      [15] Z. Shang and J. X. Yu. Catch the Wind: Graph
tioning procedure for large matched sub-graphs which allevi-         workload balancing on cloud. 2013 IEEE 29th
ates this. Finally it would be interesting to extend our base        International Conference on Data Engineering
partitioning heuristic (LDG) to incorporate edge traversal           (ICDE), pages 553–564, apr 2013.
probabilities from the TPSTry++ into the process of select-     [16] C. Song, T. Ge, C. Chen, and J. Wang. Event pattern
ing assignment partitions.                                           matching over graph streams. Proceedings of the
                                                                     VLDB Endowment, 8(4):413–424, dec 2014.
6.   REFERENCES                                                 [17] I. Stanton and G. Kliot. Streaming graph partitioning
                                                                     for large distributed graphs. In Proceedings of the 18th
 [1] Titan - Distributed Graph Database.                             ACM SIGKDD international conference on Knowledge
     http://thinkaurelius.github.io/titan/. Accessed                 discovery and data mining, pages 1222–1230, 2012.
     on: 2015-12-01.                                            [18] H. Tong, B. Gallagher, C. Faloutsos, and
 [2] R. Andersen and Y. Peres. Finding sparse cuts locally           T. Eliassi-Rad. Fast best-effort pattern matching in
     using evolving sets. In Proceedings of the 41st annual          large attributed graphs. In Proceedings of the 13th
     ACM symposium on Symposium on theory of                         ACM SIGKDD international conference on Knowledge
     computing - STOC ’09, page 235, New York, New                   discovery and data mining, page 737, 2007.
     York, USA, 2009. ACM Press.                                [19] C. Tsourakakis, C. Gkantsidis, B. Radunovic, and
 [3] K. Andreev and H. Racke. Balanced Graph                         M. Vojnovic. FENNEL. In Proceedings of the 7th
     Partitioning. Theory of Computing Systems,                      ACM international conference on Web search and
     39(6):929–939, nov 2006.                                        data mining, pages 333–342, 2014.
 [4] G. Brevier, R. Rizzi, and S. Vialette. Pattern             [20] N. Xu, L. Chen, and B. Cui. LogGP. Proceedings of
     Matching in Protein-Protein Interaction Graphs. In              the VLDB Endowment, 7(14):1917–1928, oct 2014.
     Fundamentals of Computation Theory, pages 137–148.         [21] S. Yang, X. Yan, B. Zong, and A. Khan. Towards
     Springer Berlin Heidelberg, Berlin, Heidelberg, 2007.           effective partition management for large graphs. In
 [5] C. Curino, E. Jones, Y. Zhang, and S. Madden.                   Proceedings of the 2012 international conference on
     Schism. Proceedings of the VLDB Endowment,                      Management of Data, pages 517–528. ACM Press,
     3(1-2):48–57, 2010.                                             2012.
 [6] B. Hendrickson and R. Leland. An Improved Spectral
     Graph Partitioning Algorithm for Mapping Parallel
     Computations. SIAM Journal on Scientific
     Computing, 16(2):452–469, mar 1995.
 [7] Z. Huang, W. Chung, T.-H. Ong, and H. Chen. A
     graph-based recommender system for digital library.
     In Proceedings of the 2nd ACM/IEEE-CS joint
     conference on Digital libraries, pages 65–73, 2002.
 [8] G. Karypis and V. Kumar. Multilevel k -way
     Partitioning Scheme for Irregular Graphs. Journal of
     Parallel and Distributed Computing, 47(2):109–124,
     1997.
 [9] B. W. Kernighan and S. Lin. An efficient heuristic
     procedure for partitioning graphs. Bell systems
     technical journal, 49(2):291—-307, 1970.
[10] H. Li and S. Lee. Mining Top-K Path Traversal
     Patterns over Streaming Web Click-Sequences.
     Journal of Information Science and Engineering,
     1133(95):1121–1133, 2009.
[11] Y. Low, D. Bickson, J. Gonzalez, C. Guestrin,
     A. Kyrola, and J. M. Hellerstein. Distributed
     GraphLab. Proceedings of the VLDB Endowment,
     5(8):716–727, apr 2012.
[12] G. Malewicz, M. H. Austern, A. J. Bik, J. C. Dehnert,
     I. Horn, N. Leiser, and G. Czajkowski. Pregel. In
     Proceedings of the 2010 international conference on
     Management of data - SIGMOD ’10, page 135, New
     York, New York, USA, 2010. ACM Press.
[13] A. Quamar, K. A. Kumar, and A. Deshpande.
     SWORD. In Proceedings of the 16th International
     Conference on Extending Database Technology, page
     430. ACM Press, 2013.