=Paper= {{Paper |id=Vol-2578/BigVis7 |storemode=property |title=Graph Waves |pdfUrl=https://ceur-ws.org/Vol-2578/BigVis7.pdf |volume=Vol-2578 |authors=James Abello,Daniel Nakhimovich |dblpUrl=https://dblp.org/rec/conf/edbt/AbelloN20 }} ==Graph Waves== https://ceur-ws.org/Vol-2578/BigVis7.pdf
                                                                    Graph Waves
                                                                       James Abello∗
                                                                    Daniel Nakhimovich∗
                                                                     abelloj@cs.rutgers.edu
                                                                       d.nak@rutgers.edu
                                                                       Rutgers University
                                                                   New Brunswick, New Jersey

Abstract                                                                              known canonical hierarchical decomposition for k-connectivity.
We present efficient algorithmic mechanisms to partition graphs                       The notion of cores was first introduced in [25] and their utiliy was
with up to 1.8 billion edges into subgraphs which are fixed points                    further identified in [5]. One of the main advantages of core based
of degree peeling [3]. Similarly to [1], we are able to process graphs                vertex decompositions is that they can be computed efficiently [10]
with tens of millions of edges in seconds and hundreds of millions                    and they provide an arguably “natural” hierarchical view of any
of edges in minutes. For fixed points that turn out to be larger than a               network. The vertex core decomposition has been used to obtain
desired interactivity parameter we further decompose them with a                      edge partitions where each set of edges in the partition is a max-
novel (to our knowledge) linear algorithm into what we call “graph                    imal edge subgraph that is a fixed point of degree peeling. The
waves and fragments”. This decomposition is used to create span-                      maximality of these decompositions is desirable because it provides
ning views of fixed points that we call “DAG Covers”. We illustrate                   bounds on the connectivity between the decomposed subgraphs [3].
these decompositions by presenting intuitive and interactive visu-                    The pragmatic challenge of computing this iterative edge “layer”
alizations of the meta-structures of a variety of publicly available                  decomposition for graphs with several hundred million edges in a
data sets including social, web, and citation networks [18].                          few minutes has been undertaken in [1]. However, if a particular
                                                                                      layer in the decomposition is large, the navigation and visualiza-
Keywords                                                                              tion of such big fixed points is still a major task. For example, in
graph waves, kcore, graph decomposition, graph visualization                          the Friendster social network there are four fixed points with over
                                                                                      100 million edges. We address this issue by introducing a novel
1    Introduction                                                                     and efficient algorithm that we call the “wave decomposition”. The
How far can we simplify computation and visualization in order                        “waves” produced by the decomposition can be naturally separated
to maintain “large scale” macro structural graph features without                     into connected sub-waves which can in turn be refined into a collec-
losing the ability to interactively extract more detailed connectivity?               tion of “fragments”. For large fragments we propose using maximal
Some progress in this direction appears in [3]. In general, macro                     matching based contractions combined with graph rendering and
structural graph views can be obtained by using some form of                          clustering techniques similar to [9, 30]. Our notion of fragments is
vertex or edge aggregation that can be represented as hierarchy                       quite different to that used in [7]. The ultimate goal of this work is
trees [4, 8, 12, 23, 29]. The choice of suitable representations that                 to obtain a humanly-interpretable, hierarchical description of any
facilitate smooth visual interaction is a subject of active research                  graph.
[13, 19, 20, 22]. In [11] an edge stratification technique is proposed                   First we introduce fixed points and waves in sections 2 and 3. In
that is applied to a given layout of a graph with a few hundred                       sections 4 and 5 we show sample visualizations and runtime results.
vertices and edges. All of these techniques require algorithms with                   Lastly we discuss our contributions and future work in section 6.
running time greater than linear in the number of graph elements.                     2    Fixed Points and Degree Peeling
Our focus is on graph decompositions that are applicable to graphs
                                                                                      We deal with undirected graphs 𝐺 = (𝑉 , 𝐸) with vertex set 𝑉 (𝐺)
of several orders of magnitude larger for which a direct layout is
                                                                                      and edge set 𝐸 (𝐺). We denote by 𝑛 the number of vertices in 𝑉 ,
unavailable due to memory constraints or impractical due to screen
                                                                                      𝑚 the number of edges in 𝐸, and the degree of a vertex 𝑢 ∈ 𝑉 by
size contraints.
                                                                                      deg(u).
   Some hierarchical decompositions have faster techniques. Major
approaches include SPQR-trees, k-core and nuclei based decomposi-                     Definition 1. (Peel Value) The peel value of a vertex 𝑢 ∈ 𝑉 (𝐺)
tions [1, 5, 10, 15, 17, 24] or even simple stochastic sampling [21, 27].             denoted 𝑝𝑒𝑒𝑙𝐺 (𝑢) is the largest 𝑖 ∈ [1, 𝑑𝑒𝑔(𝑢)] such that 𝑢 belongs to
k-connectivity decomposition techniques have been explored in                         a subgraph of 𝐺 of minimum degree 𝑖.
[4, 6, 16] and terrain metaphors have been explored in [2] and [28].                  Definition 2. (Graph Core) The core of 𝐺, 𝑐𝑜𝑟𝑒 (𝐺), sometimes
For large graphs, such techniques are limited because there is no                     called the k-core of 𝐺, is the subgraph induced by the maximal subset
∗ Both authors contributed equally to this research.                                  of vertices of 𝐺 whose peeling value is maximum.
                                                                                         Cores can be intuitively thought of as dense subgraphs. Although,
Copyright ©2020 for this paper by its author(s). Published in the Workshop Proceed-   they are not necessarily maximally dense in any known way, [26]
ings of the EDBT/ICDT 2020 Joint Conference (March 30-April 2, 2020, Copenhagen,      shows that k-cores are 2-approximate locally dense subgraphs.
Denmark) on CEUR-WS.org. Use permitted under Creative Commons License Attribu-           In a previous work, [3] utilizes k-cores to produce an iterative
tion 4.0 International (CC BY 4.0)
                                                                                      decomposition of a graph’s edges into layers of fixed points (which
EDBT 2020, March 30-April 2, 2020, Copenhagen, Denmark                                                              James Abello and Daniel Nakhimovich


we redefine below) with the same peel value. Since the decomposi-           large. To further decompose fixed points we introduce a “wave de-
tion forms a partition of edges, we call vertices that appear in more       composition” which forms an edge partition by iteratively removing
than one fixed point, clone vertices. The notion of clone vertices can      “leaf-like” fragments.
be used to define a diversity measure and are useful for detecting
                                                                            3.1    Wave Decomposition
community overlap [3].
                                                                            Definition 4. We define the boundary of a vertex set 𝑆 ⊂ 𝑉 as
Definition 3. (Fixed Point) A graph 𝐹𝑘 is a fixed point of degree
                                                                            𝜕𝑆 = {𝑣 ∈ 𝑉 : ∃ (𝑢, 𝑣) ∈ 𝐸, 𝑢 ∈ 𝑆, 𝑣 ∉ 𝑆 }. We define the k-
peeling 𝑘 if 𝑐𝑜𝑟𝑒 (𝐹𝑘 ) = 𝑉 (𝐹𝑘 ) and the peel value of each vertex in 𝐹𝑘
                                                                            boundary of S as 𝜕𝑘 𝑆 to be the vertices in 𝜕𝑆 with degree less than k
is 𝑘.
                                                                            restricted to the graph induced by 𝑉 \ 𝑆. We extend these boundary
   From proposition 1 in [3] it is easy to verify that a fixed point        definitions to a collection of disjoint sets by taking their union.
of peel value k must have minimum degree k and average degree
                                                                            Definition 5. (Graph Fragment) Given a seed set of vertices 𝑆 ⊂
strictly less than 2k. In Figure 1 we show the distribution of the
                                                                            𝑉 the fragment generated by S, denoted frag(S), is the set of edges,
average degrees of fixed points in the Patents citation and Friendster
                                                                            (𝑢, 𝑣), such that 𝑢 ∈ 𝑆.
social networks, along with two black lines indicating the minimum
and supremum average degree possible.                                          In Figure 2 (right) we show a decomposition of a fixed point into
                                                                            fragments. Next, we use graph fragments to introduce the notion
                                                                            of graph waves of fixed points.
                                cit-Patents
                                                                            Definition 6. (Graph Waves of Fixed Points) A wave of a fixed
                                                                            point of peel value 𝑘 is the union of fragments in the sequence {frag(𝑆 𝑗 )}𝑚
                                                                                                                                                        𝑗=0
                                                                            with 𝑆 0 being the set of vertices of minimum degree 𝑘 and each sub-
                                                                                                    𝑗
                                                                            sequent 𝑆 𝑗+1 = 𝜕𝑘 (∪𝑖=0𝑆𝑖 ).
                                                                               Notice that a fixed point could consist of only one wave and a
                                                                            wave could consist of only one fragment (i.e. k-regular graphs).
                                                                               Due to space limitations pseudo code of the wave decomposition,
                             com-Friendster                                 Alg.1, is in the appendix. If we let 𝑘 = 1 then 𝐹 1 is a forest and the
                                                                            first fragment is the neighborhood of the leaves of that forest. For
                                                                            𝑘 > 1, the wave decomposition mimics the iterative removal of
                                                                            leaves in a forest by removing collections of k-leaves (a.k.a frag-
                                                                            ments) from k-dense forests (a.k.a fixed points of degree peeling)
                                                                            [14]. The example in Figure 2 shows a coloring corresponding to
                                                                            the edge partition formed by applying the wave decomposition.
                                                                            The wave decomposition was heavily inspired by the method of
                                                                            computing k-cores in [10] and much like the implementation of
                                                                            that algorithm the wave decomposition can be computed in linear
Figure 1: Plots of average degree of fixed points (y-axis) by peel          time and space with respect to the number of edges in a graph.
value (x-axis) with frequency of fixed points of a given peel value
and average degree along the z-axis being emphasized by color (blue
to red) and point size. Note that the x-axis and z-axis are shown on
a log scale.


   What we call fixed points of degree peeling 𝑘, fixed points of peel
value k, or simply fixed points, are in fact extensions of forests called
k-dense forests in [14]. In that paper the authors first formulate the
notion of k-dense cycles as connected graphs where each vertex
is of degree > k. From there it is natural to define a k-dense forest       Figure 2: (Left): A fixed point of peel value 2 whose edges are colored
as a graph without k-dense cycles. Using any algorithm to find              based on the wave decomposition (wave 1: blue, wave 2: green, wave
the k-core of a graph, one can find what [14] calls a k-elimination         3: red). (Right): The same graph sub-divided into fragments: wave 1
order of a fixed point of degree peeling k. Thus, by Lemma 2.9              with 1 blue fragment, wave 2 with 3 fragments (green, yellow, or-
in [14] the definition of a fixed point of degree peeling k and the         ange), and wave 3 with 1 red fragment.
definition of a k-dense forest are equivalent. The importance of
noticing that fixed points are generalizations of trees is that it
motivates a decomposition of fixed points that works by iteratively         3.2    DAG Cover of Waves and Fixed Points
peeling “leaf-like” fragments (subsection 3.1).                             The concept of fragments is powerful in its own right. Consider
                                                                            the ordered collection of vertex sets 𝑆 = {𝑆 0, 𝑆 1, ..., 𝑆 𝑗 } as defined
3    Waves and Fragments                                                    in Def. 6. These sets form a vertex partition of the wave vertices.
Waves are introduced in order to decompose Fixed points. Previous           We construct a directed acyclic graph (DAG) representation of the
works do not address decompositions of fixed points if they are             wave decomposition by directing thoses edges (𝑥, 𝑦) with 𝑥 ∈ 𝑆𝑖
Graph Waves                                                                                    EDBT 2020, March 30-April 2, 2020, Copenhagen, Denmark


and 𝑦 ∈ 𝑆𝑖+1 for 0 ≤ 𝑖 < 𝑗 − 1. We call this a DAG cover of a
wave because it contains (or covers) all the vertices of the wave
plus the edge cuts between consecutive sets in 𝑆. We create a DAG
cover of a fixed point by using its ordered wave decomposition.
Similarly we could create a DAG cover for an entire graph by using
its fixed point decomposition. Moreover, we use this DAG cover in
conjunction with other techniques to produce graph drawings. In
particular, we next illustrate how we use the DAG cover to visualize
stratification of fixed points and their waves.                           Figure 4: A fixed point of peel value 30 from the Patents citation
                                                                          network with 80 vertices and 1547 edges drawn with a force-directed
3.2.1 Visualization of Fixed Points with DAG Cover
                                                                          layout (Left) and with forces applied to the DAG sets (Right).
   We use three different techniques – color, force manipulation,
and edge sparsification – driven by the DAG cover to provide better
interpretability of the structure of fixed points.                        on the order of a minute to render on a machine without a dedi-
   The most visually striking technique is to use the DAG cover           cated GPU whereas the DAG cover visualization rendered nearly
to color the vertices and edges of a graph. First we assign a color       instantly and remained interactive. The fraction of edges in the
to each vertex in the DAG cover based on the sets in 𝑆, then we           DAG cover is clearly less than or equal to the number of edges in
color the edges of the original graph as a function of the color of       the original graph and in practice the reduction often exceeds a
their end points (e.g. hue average). Using a spectrum color scheme        factor of 2. Despite not having all the edges of the original graph,
from blue to red, we show a few examples of fixed points (Figure 3)       the graph drawing of just the DAG cover edges in combination
for which the DAG coloring reveals more clearly a stratification          with the DAG coloring and DAG set forces often still shows the
structure than an uncolored drawing with the same layout.                 general graph stucture.




Figure 3: (Left): A fixed point of peel value 22 from the Patents cita-
tion network with 742 vertices and 12985 edges shown using a basic
force-directed layout in grey. (Right): The same fixed point shown
with the DAG coloring.                                                    Figure 5: A fixed point of peel value 32 from the Friendster social
                                                                          network with 752 vertices and 21,055 edges drawn with a force-
                                                                          directed layout (Left) and with only the 2,686 DAG cover edges
   Another technique we use is to manipulate the forces in a force-       (Right).
directed layout using the information of the DAG set ordering. In a
standard force-directed layout, there is a repulsive force between
each pair of vertices and an attraction between pairs of vertices         4     Visual Exploration
connected by an edge. In addition to those forces, we apply unique        Using the iterative edge core and wave decompositions we have
radial forces to each of the sets of vertices in 𝑆 centered at increas-   designed an interactive graph visualization tool capable of exploring
ingly larger radii based on their order in 𝑆. Furthermore, we remove      large graph data up to 1.8 billion edges. The system offers multiple
forces along the edges which are not part of the DAG cover. This          high level views including a spiral view and a layer view that, in our
tends to separate out the vertex sets in the layout and helps untan-      experience, aid in meaningful graph exploration. These structures
gle hairball looking fixed points when the standard force-directed        allow the user to make intuitive decisions as to which subgraphs
layout does not. Figure 4 shows a fixed point with four sets dis-         (created by our decompositions) to zoom in to for deeper analysis.
played using a standard force-directed layout and a DAG induced           We define interactivity parameters 𝐼𝑒 and 𝐼 𝑣 to be the maximum
force-directed layout. Unlike the standard layout, the DAG induced        number of edges and vertices respectively that a user allows to
layout helps to easily identify independent (or “near-independent”)       be drawn on their screen. These parameters are predominantly
sets of vertices to highlight a “bipartite-like” structure even without   dependent on screen size, GPU power, and user preference for
using color.                                                              interactivity versus displaying more data points/links at a time.
   For a typical, interactive force-directed layout drawing applica-
tion there is a maximum number of edges that can be rendered              4.1    Fixed Points Spiral View
before making the system unresponsive. For such cases we could            To offer users an overall view of the entire collection of fixed points
push the system to draw larger fixed points by only rendering the         appearing in a very large graph we tabulate the fixed points of
edges in the DAG cover. Figure 5 shows a graph rendered before            degree peeling found through the iterative edge core decomposition.
and after filtering by the DAG cover. Note that the full graph took       We then create buckets containing connected fixed points. These
EDBT 2020, March 30-April 2, 2020, Copenhagen, Denmark                                                         James Abello and Daniel Nakhimovich


connected fixed points are grouped together into buckets according
to the number of edges. Each 𝑖 𝑡ℎ bucket has fixed points of size
s such that 𝑙𝑜𝑔𝑖−1 (𝑚) < 𝑠 ≤ 𝑙𝑜𝑔𝑖 (𝑚). This bucketing scheme is
visually represented by a spiral of boxes that we call the spiral view.
It provides direct access to any group of similarly sized fixed points
in the graph data set.
4.1.1 Spiral View
   The spiral view consists of boxes of two types corresponding to
the type of fixed points the bucket contains. If a bucket has many
fixed points we show a polar bar graph of frequencies of fixed
points per unique size. If a bucket has only one fixed point larger
than the interactivity parameter, 𝐼𝑒 , we show a 2D wave map as
described in subsubsection 4.2.1. Notice how in (Figure 6) the size
of the spiral view of the larger Friendster network (1.8 billion edges)
is about the same as that of the smaller Patents citation network
(only 0.16 billion edges). Since the size of the spiral view scales at
most logarithmically with graph size it works well for navigating
graphs at almost any scale.

                                                                          Figure 7: (Top): The layer view of a bucket smaller than 𝐼𝑒 . (Bottom
                                                                          Left) The first 13 layers of the layer ribbon. (Bottom Right): The 2D
                                                                          wave map of a bucket larger than 𝐼𝑒 which consists of a fixed point
                                                                          with 10,585 vertices and 103,101 edges. All of these are produced
                                                                          from the Patents citation network



                                                                          4.2.1 2D Wave map
                                                                             A 2D wave map (Figure 7 bottom right) is a collection of con-
                                                                          centric rings. Each ring represents a wave and is made of multiple
Figure 6: The spiral view of the Patents citation network with            arcs representing connected components (or sub-waves) of a wave.
3,774,768 vertices and 16,518,947 edges (Left) and the Friendster         The thickness of any arc represents the number of fragments in
social network with 65,608,366 vertices and 1,806,067,135 edges           the sub-wave and the arc length is proportional to the number of
(Right). The colors indicate the relative size of the buckets (num-       edges in that sub-wave. The arcs are color coded by increasing
ber of edges increasing from blue to red) and with stronger opacity
                                                                          wave number, from blue to red.
indicating higher density.
                                                                          4.2.2 3D Wave map
                                                                             A 3D wave map (Figure 8) is a “vase” made of multiple conic
4.1.2 Layer View                                                          frustums, each representing a wave. The volume of each frustum is
    The layer view (Figure 7 top) shows a rectangular representa-         proportional to the number of edges in that wave and the area of
tion of buckets. The size of each sub-rectangle encodes the overall       the intersection of two frustums represents the number of shared
number of fixed points in the bucket. When a bucket is clicked,           vertices between those two waves. Recall, waves form an edge
either the fixed points in the bucket are displayed sorted by peel        partition of the fixed point so any two waves may share common
value and size, or a 2D wave map (see subsubsection 4.2.1) is shown       vertices, however only the shared vertices of adjacent waves are
if there is only one very large fixed point (i.e. size greater than       represented in the 3D wave map. Users can either select a wave or
𝐼𝑒 ) in the bucket. Actually, the system only shows one graph for         individual fragments of wave larger than 𝐼𝑒 to view the correspond-
each fixed point of the same degree distribution in a bucket and          ing subgraph. The color coding of the frustums and the color bar is
explicitly indicates the multiplicity of such graphs. For example, if     the same as for the 2D wave map (see subsubsection 4.2.1).
a fixed point is annotated as x16 it means that there are 16 fixed
points with the same degree distribution as the one being shown.          5     Implementation and Results
Furthermore, the user can filter the displayed fixed points by peel
                                                                          5.1    Procedure
value using a layer ribbon (Figure 7 bottom left).
                                                                          The process we use to perform the iterative edge-core decomposi-
4.2    Visualizing Waves and Fragments                                    tion is taken from [1]. The implementation of the wave decomposi-
For fixed points larger than the interactivity parameter 𝐼𝑒 we use        tion was partly inspired by the implementation for computing the
the wave decomposition of the fixed point to partition its edges          k-core of a graph in [10]. The connected components implemen-
into smaller chunks which can be viewed individually. To have an          tation used was from the Boost Graph Library. The visualization
overview of the whole fixed point we use 2D and 3D wave maps,             was developed using D3, three.js, chart.js, plotly.js. To produce our
that give a sense of the distribution of wave sizes in a fixed point.     results we performed the following steps:
Graph Waves                                                                                  EDBT 2020, March 30-April 2, 2020, Copenhagen, Denmark

                                                                           p
                                                                        |𝐸| |𝑉 |. Due to space limitations we include the results of the wave
                                                                        decomposition on the Patents citation and Friendster networks in
                                                                        Table 2 and Table 3 of the appendix.
                                                                           For our experiments (see Table 2 and Table 3) we computed the
                                                                        wave decomposition on all the fixed points in a layer simultaneously
                                                                        as this was actually quicker than filtering by connected components.
                                                                           Although not included in Table 2 and 3, we note that layer 1 of
                                                                        the Patents citation network has 534,601 trees totaling 2,370,043
                                                                        vertices and 1,835,442 edges out of a total 3,774,768 vertices and
                                                                        16,518,947. What is surprising is that about 63% of vertices in this
                                                                        network are part of a forest. In the Friendster network, layer 1 has
                                                                        11,222,669 trees totaling 43,053,668 vertices and 31,830,999 edges.
Figure 8: (Left): The 3D wave map of a fixed point of peel value 11     So similarly about 65% of vertices in Friendster are part of trees.
with 18,208 vertices and 151,196 edges from the Patents citation net-
work. (Right): The 3D wave map of a fixed point of peel value 71
                                                                        5.2    Handling Large Fragments
with 126,835 vertices and 8,339,470 edges from the Friendster social    The sheer size of very large networks is usually dealt with some
network.                                                                form of iterative vertex or edge decomposition. The level of granu-
                                                                        larity of the decomposition may be driven by storage or computa-
                                                                        tional resources coupled with graph structure. The iterative edge
(1) Input: a graph 𝐺 = (𝑉 , 𝐸) represented using an edge list.          core decomposition first introduced in [3] has been used in [1] as a
(2) Pre-processing: each edge 𝑒 ∈ 𝐸 is reversed (i.e. 𝑒 = (𝑢, 𝑣)        useful graph abstraction to make sense of very large graphs. In this
    becomes 𝑒 = (𝑣, 𝑢)) and appended to the original edge list. The     work, we go two steps beyond and decompose “large” fixed points
    modified edge list is then sorted in increasing order removing      into waves and if they are still “large” we decompose them further
    duplicates. Additionally we remove self loops, i.e edges of the     into fragments (see subsection 4.2). It is worth mentioning that
    type 𝑒 = (𝑢, 𝑢).                                                    the overall approach is basically the same, i.e., iterative removal of
(3) Iterative edge-core decomposition: we use the parallelized          vertices that satisfy certain degree conditions. Fragments can be
    implementation of the algorithm in [1] to compute the peel          viewed as the most atomic graph types obtained by plain vanilla
    layers of the graph. This assigns a layer value to each edge in     iterative degree removal methods. However, the main limitation
    the graph. The metadata, like that shown in Table 1 is written      is that these fragments can be “large” too and require specialized
    to a separate output log.                                           methods beyond the degree peeling based approaches.
(4) Connected Components: the connected components of the                   In these situations we propose a variation of the maximal match-
    entire graph, as well as of each layer are computed using the       ing edge contraction approach first suggested in [4]. Namely, we
    Boost Graph Library. The metadata of this calculation (time,        iteratively select a random maximal matching {𝑒 1, 𝑒 2, ..., 𝑒 𝑗 } and
    number of components) is logged as well as the metadata for         contract its edges until the set of vertices remaining have cardinality
    each component (size of component).                                 equal to the interactivity parameter, 𝐼 𝑣 . This contracted graph can
(5) Waves/Fragments: Using the metadata from the previous steps         then be visualized as a metagraph, where each vertex represents
    we compute the wave decomposition of fixed points with more         the subgraph of contracted edges. The combination of contractions
    than 𝐼𝑒 = 216 edges. We also compute the connected components       and matchings is an area of study that deserves further research.
    for each wave. The metadata of this calculation (time, number
    of waves) is logged as well as the metadata for each wave (size,    6     Summary and Future Work
    number of components, fragment distribution). Results for the       Our approach presents a high level computational overview of
    Patents citation network waves are shown in Table 2 and for the     “large graphs” as sequences of “waves”. This is fundamentally dif-
    Friendster social network in Table 3                                ferent to previous approaches in the sense that it is amenable to a
   We chose a wide range of of graphs, varying in both size and         visual representation (i.e. wave maps ) that is closely tied to “intu-
domain (e.g. social, geographic, hyperlink, and co-occurrence net-      itive” hierarchical navigation and exploration. Our next step will
works) [18]. We performed most of our experiments on a single           be to perform user experiments to evaluate the usability of the
computer equipped with an Intel® Core™ i7-8750H CPU clocked at          proposed system. To our knowledge this is the first type of system
2.20GHz with 32GB of RAM. The com-friendster data set however           that offers a high level representation of graphs at the billion edge
was processed on a server with an Intel® Xeon® CPU E5-2620 v2           scale that can be visually explored at different levels of connectivity
clocked at 2.10GHz with 126GB of RAM. Results of the iterative          within a global context.
edge-core decomposition are reported in Table 1, which includes            We believe that graph partitioning processes will become useful
the graph that is decomposed, its vertex and edge count, its highest    tools to address massive graph computations in a divide and conquer
degree, the number of layers and connected component in each            manner. This line of thinking opens up research to investigate which
graph, the highest peel value and number of waves from the decom-       types of massive graph problems can be solved by composing their
position, and the algorithm compute time and I/O time averaged          local fixed point and wave solutions.
over 5 runs. Notice that the computation times of the iterative edge
core decomposition for graphs with at least 1 million edges grows as
EDBT 2020, March 30-April 2, 2020, Copenhagen, Denmark                                                                                      James Abello and Daniel Nakhimovich


Table 1: Results of performing the iterative edge-core decomposition across a number of different graphs varying in size and domain sorted
by number of edges. (CC: number of connected components MP: max peel value, L: number of layers, MW: max number of waves)

           Graph Name                                             |V|             |E |     Max Deg       CC      MP        L     MW       Time (s)      I/O Time (s)
           Gnutella P2P network (p2p-gnutella31)                62586          147892            95       12       6       5        9         0.05              0.05
           Astro Physics (ca-astroph)                           18771          198050           504      289      56      47       10         0.14              0.10
           Amazon co-purchasing (amazon0601)                   403394         2443408          2752        7      10      10       22         1.00              0.98
           California road network (roadNet-CA)               1965206         2766607            12     2638       3       3       63         1.34              1.33
           Berkeley-Stanford web (web-BerkStan)                685230         6649470        84230       676     201      88      314         5.37              3.50
           Patents citation network (cit-Patents)             3774768        16518947           793     3627      64      41       47        18.37             13.33
           Pokec (soc-pokec)                                  1632803        22301964        14854         1      47      29       45        15.62             14.86
           LiveJournal network (LiveJournal)                  3997962        34681189        14815         1     360     119       49        54.04             35.73
           Orkut (com-orkut)                                  3072441       117185083        33313         1     253      91       88        81.62             56.90
           Friendster (com-friendster)                       65608366      1806067135          5214        1     304      72      213      3085.42           2351.32


   We close by mentioning that being able to efficiently compute                                  transactions on visualization and computer graphics 19, 12 (2013), 2596–2605.
the proposed wave decomposition in semi-external memory set-                                 [13] Dezhi Fang, Matthew Keezer, Jacob Williams, Kshitij Kulkarni, Robert Pienta, and
                                                                                                  Duen Horng Chau. 2017. Carina: interactive million-node graph visualization
tings or in a streaming fashion will enhance in a major way the                                   using web browser technologies. In International Conference on World Wide Web
applicability of our approach not only to “large graph visualization”                             Companion. International World Wide Web Conferences Steering Committee,
                                                                                                  775–776.
but to massive graph computation in general.                                                 [14] Gianni Franceschini, Fabrizio Luccio, and Linda Pagli. 2006. Dense trees: a new
   A video demonstrating our current prototype can be accessed at:                                look at degenerate graphs. Journal of Discrete Algorithms 4, 3 (2006), 455–474.
https://dl.dropboxusercontent.com/s/a4kpwv5609op73w/graphwaves_                              [15] Christos Giatsidis, Fragkiskos D Malliaros, Nikolaos Tziortziotis, Charanpal
                                                                                                  Dhanjal, Emmanouil Kiagias, Dimitrios M Thilikos, and Michalis Vazirgiannis.
bigvis.avi. Other formats are available at: https://www.dropbox.                                  2016. A k-core Decomposition Framework for Graph Clustering. arXiv preprint
com/sh/bowfhdrr82ti18u/AAAAFwUZwkq5pQD63Kqyh-Ela?dl=0.                                            arXiv:1607.02096 (2016).
                                                                                             [16] Michelle Girvan and Mark EJ Newman. 2002. Community structure in social and
Acknowledgments                                                                                   biological networks. Proceedings of the national academy of sciences 99, 12 (2002),
                                                                                                  7821–7826.
We thank Qi Dong for helping to develop graph drawing software,                              [17] Humayun Kabir and Kamesh Madduri. 2017. Parallel k-core decomposition on
Yi-Hsiang Lo for interface design, and Haodong Zheng for interface                                multicore platforms. In IEEE International Parallel and Distributed Processing
design and system integration. This work was partially supported                                  Symposium Workshops. IEEE, 1482–1491.
                                                                                             [18] Jure Leskovec and Andrej Krevl. 2014. SNAP Datasets: Stanford Large Network
by NSF grants IIS-1563816 and IIS-1563971.                                                        Dataset Collection. http://snap.stanford.edu/data.
                                                                                             [19] Zhiyuan Lin, Nan Cao, Hanghang Tong, Fei Wang, and U Kang. 2013. Interactive
References                                                                                        multi-resolution exploration of million node graphs. In IEEE Conference on Visual
                                                                                                  Analytics Science and Technology, Poster.
 [1] James Abello, Fred Hohman, Varun Bezzam, and Duen Horng Chau. 2019. Atlas:              [20] Peng Mi, Maoyuan Sun, Moeti Masiane, Yong Cao, and Chris North. 2016. Inter-
     Local Graph Exploration in a Global Context. In Proceedings of the International             active graph layout of a million nodes. In Informatics, Vol. 3. Multidisciplinary
     Conference on Intelligent User Interfaces. ACM.                                              Digital Publishing Institute, 23.
 [2] James Abello and Shankar Krishnan. 2000. Navigating graph surfaces. In Ap-              [21] Quan Hoang Nguyen, Seok-Hee Hong, Peter Eades, and Amyra Meidiana. 2017.
     proximation and Complexity in Numerical Optimization. Springer, 1–16.                        Proxy graph: visual quality metrics of big graph sampling. IEEE transactions on
 [3] James Abello and François Queyroi. 2013. Fixed points of graph peeling. In                   visualization and computer graphics 23, 6 (2017), 1600–1611.
     Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social         [22] Robert Pienta, Fred Hohman, Alex Endert, Acar Tamersoy, Kevin Roundy, Chris
     Networks Analysis and Mining. ACM, 256–263.                                                  Gates, Shamkant Navathe, and Duen Horng Chau. 2018. VIGOR: interactive
 [4] James Abello, Frank Van Ham, and Neeraj Krishnan. 2006. Ask-graphview: a                     visual exploration of graph query results. IEEE Transactions on Visualization and
     large scale graph visualization system. IEEE Transactions on Visualization and               Computer Graphics 24, 1 (2018), 215–225.
     Computer Graphics 12, 5 (2006), 669–676.                                                [23] Loïc Royer, Matthias Reimann, Bill Andreopoulos, and Michael Schroeder. 2008.
 [5] J Ignacio Alvarez-Hamelin, Luca Dall’Asta, Alain Barrat, and Alessandro Vespig-              Unraveling protein networks with power graph analysis. PLoS computational
     nani. 2006. Large scale networks fingerprinting and visualization using the k-core           biology 4, 7 (2008), e1000108.
     decomposition. In Advances in Neural Information Processing Systems. 41–50.             [24] Ahmet Erdem Sariyuce, C Seshadhri, Ali Pinar, and Umit V Catalyurek. 2015.
 [6] Daniel Archambault, Tamara Munzner, and David Auber. 2007. Topolayout:                       Finding the hierarchy of dense subgraphs using nucleus decompositions. In
     multilevel graph layout by topological features. IEEE Transactions on Visualization          Proceedings of the 24th International Conference on World Wide Web. International
     and Computer Graphics 13, 2 (2007).                                                          World Wide Web Conferences Steering Committee, 927–937.
 [7] Alessio Arleo, Oh-Hyun Kwon, and Kwan-Liu Ma. 2017. GraphRay: Distributed               [25] Stephen B Seidman. 1983. Network structure and minimum degree. Social
     pathfinder network scaling. In 2017 IEEE 7th Symposium on Large Data Analysis                networks 5, 3 (1983), 269–287.
     and Visualization (LDAV). IEEE, 74–83.                                                  [26] Nikolaj Tatti. 2019. Density-friendly graph decomposition. ACM Transactions on
 [8] Benjamin Bach, Nathalie Henry Riche, Christophe Hurter, Kim Marriott, and                    Knowledge Discovery from Data (TKDD) 13, 5 (2019), 54.
     Tim Dwyer. 2017. Towards unambiguous edge bundling: Investigating conflu-               [27] Wouter van Heeswijk, George HL Fletcher, and Mykola Pechenizkiy. 2016. On
     ent drawings for network visualization. IEEE Transactions on Visualization &                 structure preserving sampling and approximate partitioning of graphs. In Proceed-
     Computer Graphics (2017), 1–1.                                                               ings of the 31st Annual ACM Symposium on Applied Computing. ACM, 875–882.
 [9] Vladimir Batagelj, Franz J Brandenburg, Walter Didimo, Giuseppe Liotta, Pietro          [28] Yang Zhang, Yusu Wang, and Srinivasan Parthasarathy. 2017. Visualizing at-
     Palladino, and Maurizio Patrignani. 2011. Visual analysis of large graphs using              tributed graphs via terrain metaphor. In Proceedings of the 23rd ACM SIGKDD
     (x, y)-clustering and hybrid visualizations. IEEE transactions on visualization and          International Conference on Knowledge Discovery and Data Mining. ACM, 1325–
     computer graphics 17, 11 (2011), 1587–1598.                                                  1334.
[10] Vladimir Batagelj and Matjaz Zaversnik. 2003. An O(m) algorithm for cores               [29] Hong Zhou, Panpan Xu, Xiaoru Yuan, and Huamin Qu. 2013. Edge bundling in
     decomposition of networks. arXiv preprint cs/0310049 (2003).                                 information visualization. Tsinghua Science and Technology 18, 2 (2013), 145–156.
[11] Emilio Di Giacomo, Walter Didimo, Giuseppe Liotta, Fabrizio Montecchiani,               [30] Michael Zinsmaier, Ulrik Brandes, Oliver Deussen, and Hendrik Strobelt. 2012.
     and Ioannis G Tollis. 2014. Techniques for edge stratification of complex graph              Interactive level-of-detail rendering of large graphs. IEEE Transactions on Visual-
     drawings. Journal of Visual Languages & Computing 25, 4 (2014), 533–543.                     ization and Computer Graphics 18, 12 (2012), 2486–2495.
[12] Tim Dwyer, Nathalie Henry Riche, Kim Marriott, and Christopher Mears. 2013.
     Edge compression techniques for visualization of dense directed graphs. IEEE
Graph Waves                                                                                      EDBT 2020, March 30-April 2, 2020, Copenhagen, Denmark


Appendix                                                                   Table 3: Results of performing the wave decomposition on select
                                                                           layers of the Friendster social network. These are only the layers
                                                                           that contain a fixed point of size at least 222 edges. (L: layer, 𝐹𝑘 : num-
                                                                           ber of fixed points, W: number of waves, F: number of fragments).
                                                                           The highlighted row corresponds to the layer containing the largest
                                                                           fragment of 21,128,481 edges.
  Alg. 1: Wave Decomposition (O(m))
   Input: 𝐹𝑘 = (𝑉 , 𝐸), a fixed point of peel value k.                           L         |V|          |E |       𝐹𝑘     W        F    Time (s)
   Output: 𝑀 = {𝑊1,𝑊2,𝑊3, ...,𝑊𝑚 } where each 𝑊𝑖 ⊂ 𝐸 are                         2    20031131     26084775     39188      8      99       168.9
             waves.                                                              3    20912756     46640174       564     27     564       887.1
   Output: 𝑆 = {𝑆 0, 𝑆 1, 𝑆 2, ..., 𝑆𝑛 } where each 𝑆 𝑗 ⊂ 𝑉 .                    4     1509262      4760942      4816     11     317        27.0
                                                                                 5    14030072     54607126       447     10     536       314.2
 1 function waves(𝐹𝑘 ):
                                                                                 6     6352083     32606979       793     25    1177       233.0
 2    𝑀←∅
                                                                                 8     2882071     19767228       193     14     903       116.1
 3    𝑆←∅                                                                       11    13635761    131376950         5     75    2532      1422.0
 4    𝑖←1                                                                       15     5011964     68883449         9     46    1874       496.0
 5    𝑗 ←0                                                                      17      738488     11279923        16     18     970        50.8
 6    while 𝐸 ≠ ∅ do                                                            21      432014      7931887         5    106     978        40.9
                                                                                25    10460847    229286219         1     17    1127      1511.5
 7       𝑊𝑖 ← ∅
                                                                                29      244598      6430697         2     15     722        23.4
 8       𝑆 𝑗 ← {𝑣 ∈ 𝑉 : deg(𝑣) = 𝑘 }                                            38     3144226    107272464         1    213    3279      1828.4
 9        while 𝑆 𝑗 ≠ ∅ do                                                      40      216081      8035443         2     50     889        31.4
10           𝑆 ← 𝑆 ∪ {𝑆 𝑗 }                                                     53     6878498    322297063         1     10    1260      2543.1
11           𝐸 ← 𝐸 \ frag(𝑆 𝑗 )                                                 65      155562      9282393         1      7     415        31.6
12           𝑊𝑖 ← 𝑊𝑖 ∪ frag(𝑆 𝑗 )                                               71      126835      8339470         1     56     768        30.9
                                                                                89     4570064    367562291         1     49     737      5647.4
13           𝑆 𝑗+1 ← {𝑣 ∈ 𝜕𝑆 𝑗 | deg(𝑣) < 𝑘 }
                                                                                120     908835     97669888         1      2     367       418.0
14           𝑗 ← 𝑗 +1                                                           140     764401     94970270         1      7     443       410.1
15        end                                                                   169     228332     32277749         1     29     357       115.6
16        𝑀 ← 𝑀 ∪ {𝑊𝑖 }                                                         175     197091     32971011         1     40     931       121.1
17       𝑖 ←𝑖 +1                                                                234     157908     32985831         1    128    1188       123.8
                                                                                304      24528      6301889         1      7     203        19.2
18    end
19 end




Table 2: Results of performing the wave decomposition on layers
2 through 13 of the Patents citation network. These are only the
layers that contain a fixed point of size at least 216 edges. (L: layer,
𝐹𝑘 : number of fixed points, W: number of waves, F: number of frag-
ments). The highlighted row corresponds to the layer containing
the largest fragment of 1,356,327 edges.

        L         |V|       |E |     𝐹𝑘   W       F   Time (s)
        2      773841    953738    7682    7     55      2.779
        3     1663386   4097177      37   33    411     18.587
        4      198984    579406     787   15    267      1.652
        5      617644   2457675      42   17    610      8.187
        6      438875   2210942     100   47    996      8.935
        7      246576   1406616      66   19    619       4.02
        8      158620   1051389      85   44    765      3.243
        9       80554    590337      61   25    511       1.46
        10      56961    469522      73   45    579      1.211
        11      20602    168159      80   14    188      0.272
        12      13507    126909      73   12    236      0.192
        13      15043    154245      38   13    284      0.249