Graph Waves James Abello∗ Daniel Nakhimovich∗ abelloj@cs.rutgers.edu d.nak@rutgers.edu Rutgers University New Brunswick, New Jersey Abstract known canonical hierarchical decomposition for k-connectivity. We present efficient algorithmic mechanisms to partition graphs The notion of cores was first introduced in [25] and their utiliy was with up to 1.8 billion edges into subgraphs which are fixed points further identified in [5]. One of the main advantages of core based of degree peeling [3]. Similarly to [1], we are able to process graphs vertex decompositions is that they can be computed efficiently [10] with tens of millions of edges in seconds and hundreds of millions and they provide an arguably “natural” hierarchical view of any of edges in minutes. For fixed points that turn out to be larger than a network. The vertex core decomposition has been used to obtain desired interactivity parameter we further decompose them with a edge partitions where each set of edges in the partition is a max- novel (to our knowledge) linear algorithm into what we call “graph imal edge subgraph that is a fixed point of degree peeling. The waves and fragments”. This decomposition is used to create span- maximality of these decompositions is desirable because it provides ning views of fixed points that we call “DAG Covers”. We illustrate bounds on the connectivity between the decomposed subgraphs [3]. these decompositions by presenting intuitive and interactive visu- The pragmatic challenge of computing this iterative edge “layer” alizations of the meta-structures of a variety of publicly available decomposition for graphs with several hundred million edges in a data sets including social, web, and citation networks [18]. few minutes has been undertaken in [1]. However, if a particular layer in the decomposition is large, the navigation and visualiza- Keywords tion of such big fixed points is still a major task. For example, in graph waves, kcore, graph decomposition, graph visualization the Friendster social network there are four fixed points with over 100 million edges. We address this issue by introducing a novel 1 Introduction and efficient algorithm that we call the “wave decomposition”. The How far can we simplify computation and visualization in order “waves” produced by the decomposition can be naturally separated to maintain “large scale” macro structural graph features without into connected sub-waves which can in turn be refined into a collec- losing the ability to interactively extract more detailed connectivity? tion of “fragments”. For large fragments we propose using maximal Some progress in this direction appears in [3]. In general, macro matching based contractions combined with graph rendering and structural graph views can be obtained by using some form of clustering techniques similar to [9, 30]. Our notion of fragments is vertex or edge aggregation that can be represented as hierarchy quite different to that used in [7]. The ultimate goal of this work is trees [4, 8, 12, 23, 29]. The choice of suitable representations that to obtain a humanly-interpretable, hierarchical description of any facilitate smooth visual interaction is a subject of active research graph. [13, 19, 20, 22]. In [11] an edge stratification technique is proposed First we introduce fixed points and waves in sections 2 and 3. In that is applied to a given layout of a graph with a few hundred sections 4 and 5 we show sample visualizations and runtime results. vertices and edges. All of these techniques require algorithms with Lastly we discuss our contributions and future work in section 6. running time greater than linear in the number of graph elements. 2 Fixed Points and Degree Peeling Our focus is on graph decompositions that are applicable to graphs We deal with undirected graphs 𝐺 = (𝑉 , 𝐸) with vertex set 𝑉 (𝐺) of several orders of magnitude larger for which a direct layout is and edge set 𝐸 (𝐺). We denote by 𝑛 the number of vertices in 𝑉 , unavailable due to memory constraints or impractical due to screen 𝑚 the number of edges in 𝐸, and the degree of a vertex 𝑢 ∈ 𝑉 by size contraints. deg(u). Some hierarchical decompositions have faster techniques. Major approaches include SPQR-trees, k-core and nuclei based decomposi- Definition 1. (Peel Value) The peel value of a vertex 𝑢 ∈ 𝑉 (𝐺) tions [1, 5, 10, 15, 17, 24] or even simple stochastic sampling [21, 27]. denoted 𝑝𝑒𝑒𝑙𝐺 (𝑢) is the largest 𝑖 ∈ [1, 𝑑𝑒𝑔(𝑢)] such that 𝑢 belongs to k-connectivity decomposition techniques have been explored in a subgraph of 𝐺 of minimum degree 𝑖. [4, 6, 16] and terrain metaphors have been explored in [2] and [28]. Definition 2. (Graph Core) The core of 𝐺, 𝑐𝑜𝑟𝑒 (𝐺), sometimes For large graphs, such techniques are limited because there is no called the k-core of 𝐺, is the subgraph induced by the maximal subset ∗ Both authors contributed equally to this research. of vertices of 𝐺 whose peeling value is maximum. Cores can be intuitively thought of as dense subgraphs. Although, Copyright ©2020 for this paper by its author(s). Published in the Workshop Proceed- they are not necessarily maximally dense in any known way, [26] ings of the EDBT/ICDT 2020 Joint Conference (March 30-April 2, 2020, Copenhagen, shows that k-cores are 2-approximate locally dense subgraphs. Denmark) on CEUR-WS.org. Use permitted under Creative Commons License Attribu- In a previous work, [3] utilizes k-cores to produce an iterative tion 4.0 International (CC BY 4.0) decomposition of a graph’s edges into layers of fixed points (which EDBT 2020, March 30-April 2, 2020, Copenhagen, Denmark James Abello and Daniel Nakhimovich we redefine below) with the same peel value. Since the decomposi- large. To further decompose fixed points we introduce a “wave de- tion forms a partition of edges, we call vertices that appear in more composition” which forms an edge partition by iteratively removing than one fixed point, clone vertices. The notion of clone vertices can “leaf-like” fragments. be used to define a diversity measure and are useful for detecting 3.1 Wave Decomposition community overlap [3]. Definition 4. We define the boundary of a vertex set 𝑆 ⊂ 𝑉 as Definition 3. (Fixed Point) A graph 𝐹𝑘 is a fixed point of degree 𝜕𝑆 = {𝑣 ∈ 𝑉 : ∃ (𝑢, 𝑣) ∈ 𝐸, 𝑢 ∈ 𝑆, 𝑣 ∉ 𝑆 }. We define the k- peeling 𝑘 if 𝑐𝑜𝑟𝑒 (𝐹𝑘 ) = 𝑉 (𝐹𝑘 ) and the peel value of each vertex in 𝐹𝑘 boundary of S as 𝜕𝑘 𝑆 to be the vertices in 𝜕𝑆 with degree less than k is 𝑘. restricted to the graph induced by 𝑉 \ 𝑆. We extend these boundary From proposition 1 in [3] it is easy to verify that a fixed point definitions to a collection of disjoint sets by taking their union. of peel value k must have minimum degree k and average degree Definition 5. (Graph Fragment) Given a seed set of vertices 𝑆 ⊂ strictly less than 2k. In Figure 1 we show the distribution of the 𝑉 the fragment generated by S, denoted frag(S), is the set of edges, average degrees of fixed points in the Patents citation and Friendster (𝑢, 𝑣), such that 𝑢 ∈ 𝑆. social networks, along with two black lines indicating the minimum and supremum average degree possible. In Figure 2 (right) we show a decomposition of a fixed point into fragments. Next, we use graph fragments to introduce the notion of graph waves of fixed points. cit-Patents Definition 6. (Graph Waves of Fixed Points) A wave of a fixed point of peel value 𝑘 is the union of fragments in the sequence {frag(𝑆 𝑗 )}𝑚 𝑗=0 with 𝑆 0 being the set of vertices of minimum degree 𝑘 and each sub- 𝑗 sequent 𝑆 𝑗+1 = 𝜕𝑘 (∪𝑖=0𝑆𝑖 ). Notice that a fixed point could consist of only one wave and a wave could consist of only one fragment (i.e. k-regular graphs). Due to space limitations pseudo code of the wave decomposition, com-Friendster Alg.1, is in the appendix. If we let 𝑘 = 1 then 𝐹 1 is a forest and the first fragment is the neighborhood of the leaves of that forest. For 𝑘 > 1, the wave decomposition mimics the iterative removal of leaves in a forest by removing collections of k-leaves (a.k.a frag- ments) from k-dense forests (a.k.a fixed points of degree peeling) [14]. The example in Figure 2 shows a coloring corresponding to the edge partition formed by applying the wave decomposition. The wave decomposition was heavily inspired by the method of computing k-cores in [10] and much like the implementation of that algorithm the wave decomposition can be computed in linear Figure 1: Plots of average degree of fixed points (y-axis) by peel time and space with respect to the number of edges in a graph. value (x-axis) with frequency of fixed points of a given peel value and average degree along the z-axis being emphasized by color (blue to red) and point size. Note that the x-axis and z-axis are shown on a log scale. What we call fixed points of degree peeling 𝑘, fixed points of peel value k, or simply fixed points, are in fact extensions of forests called k-dense forests in [14]. In that paper the authors first formulate the notion of k-dense cycles as connected graphs where each vertex is of degree > k. From there it is natural to define a k-dense forest Figure 2: (Left): A fixed point of peel value 2 whose edges are colored as a graph without k-dense cycles. Using any algorithm to find based on the wave decomposition (wave 1: blue, wave 2: green, wave the k-core of a graph, one can find what [14] calls a k-elimination 3: red). (Right): The same graph sub-divided into fragments: wave 1 order of a fixed point of degree peeling k. Thus, by Lemma 2.9 with 1 blue fragment, wave 2 with 3 fragments (green, yellow, or- in [14] the definition of a fixed point of degree peeling k and the ange), and wave 3 with 1 red fragment. definition of a k-dense forest are equivalent. The importance of noticing that fixed points are generalizations of trees is that it motivates a decomposition of fixed points that works by iteratively 3.2 DAG Cover of Waves and Fixed Points peeling “leaf-like” fragments (subsection 3.1). The concept of fragments is powerful in its own right. Consider the ordered collection of vertex sets 𝑆 = {𝑆 0, 𝑆 1, ..., 𝑆 𝑗 } as defined 3 Waves and Fragments in Def. 6. These sets form a vertex partition of the wave vertices. Waves are introduced in order to decompose Fixed points. Previous We construct a directed acyclic graph (DAG) representation of the works do not address decompositions of fixed points if they are wave decomposition by directing thoses edges (𝑥, 𝑦) with 𝑥 ∈ 𝑆𝑖 Graph Waves EDBT 2020, March 30-April 2, 2020, Copenhagen, Denmark and 𝑦 ∈ 𝑆𝑖+1 for 0 ≤ 𝑖 < 𝑗 − 1. We call this a DAG cover of a wave because it contains (or covers) all the vertices of the wave plus the edge cuts between consecutive sets in 𝑆. We create a DAG cover of a fixed point by using its ordered wave decomposition. Similarly we could create a DAG cover for an entire graph by using its fixed point decomposition. Moreover, we use this DAG cover in conjunction with other techniques to produce graph drawings. In particular, we next illustrate how we use the DAG cover to visualize stratification of fixed points and their waves. Figure 4: A fixed point of peel value 30 from the Patents citation network with 80 vertices and 1547 edges drawn with a force-directed 3.2.1 Visualization of Fixed Points with DAG Cover layout (Left) and with forces applied to the DAG sets (Right). We use three different techniques – color, force manipulation, and edge sparsification – driven by the DAG cover to provide better interpretability of the structure of fixed points. on the order of a minute to render on a machine without a dedi- The most visually striking technique is to use the DAG cover cated GPU whereas the DAG cover visualization rendered nearly to color the vertices and edges of a graph. First we assign a color instantly and remained interactive. The fraction of edges in the to each vertex in the DAG cover based on the sets in 𝑆, then we DAG cover is clearly less than or equal to the number of edges in color the edges of the original graph as a function of the color of the original graph and in practice the reduction often exceeds a their end points (e.g. hue average). Using a spectrum color scheme factor of 2. Despite not having all the edges of the original graph, from blue to red, we show a few examples of fixed points (Figure 3) the graph drawing of just the DAG cover edges in combination for which the DAG coloring reveals more clearly a stratification with the DAG coloring and DAG set forces often still shows the structure than an uncolored drawing with the same layout. general graph stucture. Figure 3: (Left): A fixed point of peel value 22 from the Patents cita- tion network with 742 vertices and 12985 edges shown using a basic force-directed layout in grey. (Right): The same fixed point shown with the DAG coloring. Figure 5: A fixed point of peel value 32 from the Friendster social network with 752 vertices and 21,055 edges drawn with a force- directed layout (Left) and with only the 2,686 DAG cover edges Another technique we use is to manipulate the forces in a force- (Right). directed layout using the information of the DAG set ordering. In a standard force-directed layout, there is a repulsive force between each pair of vertices and an attraction between pairs of vertices 4 Visual Exploration connected by an edge. In addition to those forces, we apply unique Using the iterative edge core and wave decompositions we have radial forces to each of the sets of vertices in 𝑆 centered at increas- designed an interactive graph visualization tool capable of exploring ingly larger radii based on their order in 𝑆. Furthermore, we remove large graph data up to 1.8 billion edges. The system offers multiple forces along the edges which are not part of the DAG cover. This high level views including a spiral view and a layer view that, in our tends to separate out the vertex sets in the layout and helps untan- experience, aid in meaningful graph exploration. These structures gle hairball looking fixed points when the standard force-directed allow the user to make intuitive decisions as to which subgraphs layout does not. Figure 4 shows a fixed point with four sets dis- (created by our decompositions) to zoom in to for deeper analysis. played using a standard force-directed layout and a DAG induced We define interactivity parameters 𝐼𝑒 and 𝐼 𝑣 to be the maximum force-directed layout. Unlike the standard layout, the DAG induced number of edges and vertices respectively that a user allows to layout helps to easily identify independent (or “near-independent”) be drawn on their screen. These parameters are predominantly sets of vertices to highlight a “bipartite-like” structure even without dependent on screen size, GPU power, and user preference for using color. interactivity versus displaying more data points/links at a time. For a typical, interactive force-directed layout drawing applica- tion there is a maximum number of edges that can be rendered 4.1 Fixed Points Spiral View before making the system unresponsive. For such cases we could To offer users an overall view of the entire collection of fixed points push the system to draw larger fixed points by only rendering the appearing in a very large graph we tabulate the fixed points of edges in the DAG cover. Figure 5 shows a graph rendered before degree peeling found through the iterative edge core decomposition. and after filtering by the DAG cover. Note that the full graph took We then create buckets containing connected fixed points. These EDBT 2020, March 30-April 2, 2020, Copenhagen, Denmark James Abello and Daniel Nakhimovich connected fixed points are grouped together into buckets according to the number of edges. Each 𝑖 𝑡ℎ bucket has fixed points of size s such that 𝑙𝑜𝑔𝑖−1 (𝑚) < 𝑠 ≤ 𝑙𝑜𝑔𝑖 (𝑚). This bucketing scheme is visually represented by a spiral of boxes that we call the spiral view. It provides direct access to any group of similarly sized fixed points in the graph data set. 4.1.1 Spiral View The spiral view consists of boxes of two types corresponding to the type of fixed points the bucket contains. If a bucket has many fixed points we show a polar bar graph of frequencies of fixed points per unique size. If a bucket has only one fixed point larger than the interactivity parameter, 𝐼𝑒 , we show a 2D wave map as described in subsubsection 4.2.1. Notice how in (Figure 6) the size of the spiral view of the larger Friendster network (1.8 billion edges) is about the same as that of the smaller Patents citation network (only 0.16 billion edges). Since the size of the spiral view scales at most logarithmically with graph size it works well for navigating graphs at almost any scale. Figure 7: (Top): The layer view of a bucket smaller than 𝐼𝑒 . (Bottom Left) The first 13 layers of the layer ribbon. (Bottom Right): The 2D wave map of a bucket larger than 𝐼𝑒 which consists of a fixed point with 10,585 vertices and 103,101 edges. All of these are produced from the Patents citation network 4.2.1 2D Wave map A 2D wave map (Figure 7 bottom right) is a collection of con- centric rings. Each ring represents a wave and is made of multiple Figure 6: The spiral view of the Patents citation network with arcs representing connected components (or sub-waves) of a wave. 3,774,768 vertices and 16,518,947 edges (Left) and the Friendster The thickness of any arc represents the number of fragments in social network with 65,608,366 vertices and 1,806,067,135 edges the sub-wave and the arc length is proportional to the number of (Right). The colors indicate the relative size of the buckets (num- edges in that sub-wave. The arcs are color coded by increasing ber of edges increasing from blue to red) and with stronger opacity wave number, from blue to red. indicating higher density. 4.2.2 3D Wave map A 3D wave map (Figure 8) is a “vase” made of multiple conic 4.1.2 Layer View frustums, each representing a wave. The volume of each frustum is The layer view (Figure 7 top) shows a rectangular representa- proportional to the number of edges in that wave and the area of tion of buckets. The size of each sub-rectangle encodes the overall the intersection of two frustums represents the number of shared number of fixed points in the bucket. When a bucket is clicked, vertices between those two waves. Recall, waves form an edge either the fixed points in the bucket are displayed sorted by peel partition of the fixed point so any two waves may share common value and size, or a 2D wave map (see subsubsection 4.2.1) is shown vertices, however only the shared vertices of adjacent waves are if there is only one very large fixed point (i.e. size greater than represented in the 3D wave map. Users can either select a wave or 𝐼𝑒 ) in the bucket. Actually, the system only shows one graph for individual fragments of wave larger than 𝐼𝑒 to view the correspond- each fixed point of the same degree distribution in a bucket and ing subgraph. The color coding of the frustums and the color bar is explicitly indicates the multiplicity of such graphs. For example, if the same as for the 2D wave map (see subsubsection 4.2.1). a fixed point is annotated as x16 it means that there are 16 fixed points with the same degree distribution as the one being shown. 5 Implementation and Results Furthermore, the user can filter the displayed fixed points by peel 5.1 Procedure value using a layer ribbon (Figure 7 bottom left). The process we use to perform the iterative edge-core decomposi- 4.2 Visualizing Waves and Fragments tion is taken from [1]. The implementation of the wave decomposi- For fixed points larger than the interactivity parameter 𝐼𝑒 we use tion was partly inspired by the implementation for computing the the wave decomposition of the fixed point to partition its edges k-core of a graph in [10]. The connected components implemen- into smaller chunks which can be viewed individually. To have an tation used was from the Boost Graph Library. The visualization overview of the whole fixed point we use 2D and 3D wave maps, was developed using D3, three.js, chart.js, plotly.js. To produce our that give a sense of the distribution of wave sizes in a fixed point. results we performed the following steps: Graph Waves EDBT 2020, March 30-April 2, 2020, Copenhagen, Denmark p |𝐸| |𝑉 |. Due to space limitations we include the results of the wave decomposition on the Patents citation and Friendster networks in Table 2 and Table 3 of the appendix. For our experiments (see Table 2 and Table 3) we computed the wave decomposition on all the fixed points in a layer simultaneously as this was actually quicker than filtering by connected components. Although not included in Table 2 and 3, we note that layer 1 of the Patents citation network has 534,601 trees totaling 2,370,043 vertices and 1,835,442 edges out of a total 3,774,768 vertices and 16,518,947. What is surprising is that about 63% of vertices in this network are part of a forest. In the Friendster network, layer 1 has 11,222,669 trees totaling 43,053,668 vertices and 31,830,999 edges. Figure 8: (Left): The 3D wave map of a fixed point of peel value 11 So similarly about 65% of vertices in Friendster are part of trees. with 18,208 vertices and 151,196 edges from the Patents citation net- work. (Right): The 3D wave map of a fixed point of peel value 71 5.2 Handling Large Fragments with 126,835 vertices and 8,339,470 edges from the Friendster social The sheer size of very large networks is usually dealt with some network. form of iterative vertex or edge decomposition. The level of granu- larity of the decomposition may be driven by storage or computa- tional resources coupled with graph structure. The iterative edge (1) Input: a graph 𝐺 = (𝑉 , 𝐸) represented using an edge list. core decomposition first introduced in [3] has been used in [1] as a (2) Pre-processing: each edge 𝑒 ∈ 𝐸 is reversed (i.e. 𝑒 = (𝑢, 𝑣) useful graph abstraction to make sense of very large graphs. In this becomes 𝑒 = (𝑣, 𝑢)) and appended to the original edge list. The work, we go two steps beyond and decompose “large” fixed points modified edge list is then sorted in increasing order removing into waves and if they are still “large” we decompose them further duplicates. Additionally we remove self loops, i.e edges of the into fragments (see subsection 4.2). It is worth mentioning that type 𝑒 = (𝑢, 𝑢). the overall approach is basically the same, i.e., iterative removal of (3) Iterative edge-core decomposition: we use the parallelized vertices that satisfy certain degree conditions. Fragments can be implementation of the algorithm in [1] to compute the peel viewed as the most atomic graph types obtained by plain vanilla layers of the graph. This assigns a layer value to each edge in iterative degree removal methods. However, the main limitation the graph. The metadata, like that shown in Table 1 is written is that these fragments can be “large” too and require specialized to a separate output log. methods beyond the degree peeling based approaches. (4) Connected Components: the connected components of the In these situations we propose a variation of the maximal match- entire graph, as well as of each layer are computed using the ing edge contraction approach first suggested in [4]. Namely, we Boost Graph Library. The metadata of this calculation (time, iteratively select a random maximal matching {𝑒 1, 𝑒 2, ..., 𝑒 𝑗 } and number of components) is logged as well as the metadata for contract its edges until the set of vertices remaining have cardinality each component (size of component). equal to the interactivity parameter, 𝐼 𝑣 . This contracted graph can (5) Waves/Fragments: Using the metadata from the previous steps then be visualized as a metagraph, where each vertex represents we compute the wave decomposition of fixed points with more the subgraph of contracted edges. The combination of contractions than 𝐼𝑒 = 216 edges. We also compute the connected components and matchings is an area of study that deserves further research. for each wave. The metadata of this calculation (time, number of waves) is logged as well as the metadata for each wave (size, 6 Summary and Future Work number of components, fragment distribution). Results for the Our approach presents a high level computational overview of Patents citation network waves are shown in Table 2 and for the “large graphs” as sequences of “waves”. This is fundamentally dif- Friendster social network in Table 3 ferent to previous approaches in the sense that it is amenable to a We chose a wide range of of graphs, varying in both size and visual representation (i.e. wave maps ) that is closely tied to “intu- domain (e.g. social, geographic, hyperlink, and co-occurrence net- itive” hierarchical navigation and exploration. Our next step will works) [18]. We performed most of our experiments on a single be to perform user experiments to evaluate the usability of the computer equipped with an Intel® Core™ i7-8750H CPU clocked at proposed system. To our knowledge this is the first type of system 2.20GHz with 32GB of RAM. The com-friendster data set however that offers a high level representation of graphs at the billion edge was processed on a server with an Intel® Xeon® CPU E5-2620 v2 scale that can be visually explored at different levels of connectivity clocked at 2.10GHz with 126GB of RAM. Results of the iterative within a global context. edge-core decomposition are reported in Table 1, which includes We believe that graph partitioning processes will become useful the graph that is decomposed, its vertex and edge count, its highest tools to address massive graph computations in a divide and conquer degree, the number of layers and connected component in each manner. This line of thinking opens up research to investigate which graph, the highest peel value and number of waves from the decom- types of massive graph problems can be solved by composing their position, and the algorithm compute time and I/O time averaged local fixed point and wave solutions. over 5 runs. Notice that the computation times of the iterative edge core decomposition for graphs with at least 1 million edges grows as EDBT 2020, March 30-April 2, 2020, Copenhagen, Denmark James Abello and Daniel Nakhimovich Table 1: Results of performing the iterative edge-core decomposition across a number of different graphs varying in size and domain sorted by number of edges. (CC: number of connected components MP: max peel value, L: number of layers, MW: max number of waves) Graph Name |V| |E | Max Deg CC MP L MW Time (s) I/O Time (s) Gnutella P2P network (p2p-gnutella31) 62586 147892 95 12 6 5 9 0.05 0.05 Astro Physics (ca-astroph) 18771 198050 504 289 56 47 10 0.14 0.10 Amazon co-purchasing (amazon0601) 403394 2443408 2752 7 10 10 22 1.00 0.98 California road network (roadNet-CA) 1965206 2766607 12 2638 3 3 63 1.34 1.33 Berkeley-Stanford web (web-BerkStan) 685230 6649470 84230 676 201 88 314 5.37 3.50 Patents citation network (cit-Patents) 3774768 16518947 793 3627 64 41 47 18.37 13.33 Pokec (soc-pokec) 1632803 22301964 14854 1 47 29 45 15.62 14.86 LiveJournal network (LiveJournal) 3997962 34681189 14815 1 360 119 49 54.04 35.73 Orkut (com-orkut) 3072441 117185083 33313 1 253 91 88 81.62 56.90 Friendster (com-friendster) 65608366 1806067135 5214 1 304 72 213 3085.42 2351.32 We close by mentioning that being able to efficiently compute transactions on visualization and computer graphics 19, 12 (2013), 2596–2605. the proposed wave decomposition in semi-external memory set- [13] Dezhi Fang, Matthew Keezer, Jacob Williams, Kshitij Kulkarni, Robert Pienta, and Duen Horng Chau. 2017. Carina: interactive million-node graph visualization tings or in a streaming fashion will enhance in a major way the using web browser technologies. In International Conference on World Wide Web applicability of our approach not only to “large graph visualization” Companion. International World Wide Web Conferences Steering Committee, 775–776. but to massive graph computation in general. [14] Gianni Franceschini, Fabrizio Luccio, and Linda Pagli. 2006. Dense trees: a new A video demonstrating our current prototype can be accessed at: look at degenerate graphs. Journal of Discrete Algorithms 4, 3 (2006), 455–474. https://dl.dropboxusercontent.com/s/a4kpwv5609op73w/graphwaves_ [15] Christos Giatsidis, Fragkiskos D Malliaros, Nikolaos Tziortziotis, Charanpal Dhanjal, Emmanouil Kiagias, Dimitrios M Thilikos, and Michalis Vazirgiannis. bigvis.avi. Other formats are available at: https://www.dropbox. 2016. A k-core Decomposition Framework for Graph Clustering. arXiv preprint com/sh/bowfhdrr82ti18u/AAAAFwUZwkq5pQD63Kqyh-Ela?dl=0. arXiv:1607.02096 (2016). [16] Michelle Girvan and Mark EJ Newman. 2002. Community structure in social and Acknowledgments biological networks. Proceedings of the national academy of sciences 99, 12 (2002), 7821–7826. We thank Qi Dong for helping to develop graph drawing software, [17] Humayun Kabir and Kamesh Madduri. 2017. Parallel k-core decomposition on Yi-Hsiang Lo for interface design, and Haodong Zheng for interface multicore platforms. In IEEE International Parallel and Distributed Processing design and system integration. This work was partially supported Symposium Workshops. IEEE, 1482–1491. [18] Jure Leskovec and Andrej Krevl. 2014. SNAP Datasets: Stanford Large Network by NSF grants IIS-1563816 and IIS-1563971. Dataset Collection. http://snap.stanford.edu/data. [19] Zhiyuan Lin, Nan Cao, Hanghang Tong, Fei Wang, and U Kang. 2013. Interactive References multi-resolution exploration of million node graphs. In IEEE Conference on Visual Analytics Science and Technology, Poster. [1] James Abello, Fred Hohman, Varun Bezzam, and Duen Horng Chau. 2019. Atlas: [20] Peng Mi, Maoyuan Sun, Moeti Masiane, Yong Cao, and Chris North. 2016. Inter- Local Graph Exploration in a Global Context. In Proceedings of the International active graph layout of a million nodes. In Informatics, Vol. 3. Multidisciplinary Conference on Intelligent User Interfaces. ACM. Digital Publishing Institute, 23. [2] James Abello and Shankar Krishnan. 2000. Navigating graph surfaces. In Ap- [21] Quan Hoang Nguyen, Seok-Hee Hong, Peter Eades, and Amyra Meidiana. 2017. proximation and Complexity in Numerical Optimization. Springer, 1–16. Proxy graph: visual quality metrics of big graph sampling. IEEE transactions on [3] James Abello and François Queyroi. 2013. Fixed points of graph peeling. In visualization and computer graphics 23, 6 (2017), 1600–1611. Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social [22] Robert Pienta, Fred Hohman, Alex Endert, Acar Tamersoy, Kevin Roundy, Chris Networks Analysis and Mining. ACM, 256–263. Gates, Shamkant Navathe, and Duen Horng Chau. 2018. VIGOR: interactive [4] James Abello, Frank Van Ham, and Neeraj Krishnan. 2006. Ask-graphview: a visual exploration of graph query results. IEEE Transactions on Visualization and large scale graph visualization system. IEEE Transactions on Visualization and Computer Graphics 24, 1 (2018), 215–225. Computer Graphics 12, 5 (2006), 669–676. [23] Loïc Royer, Matthias Reimann, Bill Andreopoulos, and Michael Schroeder. 2008. [5] J Ignacio Alvarez-Hamelin, Luca Dall’Asta, Alain Barrat, and Alessandro Vespig- Unraveling protein networks with power graph analysis. PLoS computational nani. 2006. Large scale networks fingerprinting and visualization using the k-core biology 4, 7 (2008), e1000108. decomposition. In Advances in Neural Information Processing Systems. 41–50. [24] Ahmet Erdem Sariyuce, C Seshadhri, Ali Pinar, and Umit V Catalyurek. 2015. [6] Daniel Archambault, Tamara Munzner, and David Auber. 2007. Topolayout: Finding the hierarchy of dense subgraphs using nucleus decompositions. In multilevel graph layout by topological features. IEEE Transactions on Visualization Proceedings of the 24th International Conference on World Wide Web. International and Computer Graphics 13, 2 (2007). World Wide Web Conferences Steering Committee, 927–937. [7] Alessio Arleo, Oh-Hyun Kwon, and Kwan-Liu Ma. 2017. GraphRay: Distributed [25] Stephen B Seidman. 1983. Network structure and minimum degree. Social pathfinder network scaling. In 2017 IEEE 7th Symposium on Large Data Analysis networks 5, 3 (1983), 269–287. and Visualization (LDAV). IEEE, 74–83. [26] Nikolaj Tatti. 2019. Density-friendly graph decomposition. ACM Transactions on [8] Benjamin Bach, Nathalie Henry Riche, Christophe Hurter, Kim Marriott, and Knowledge Discovery from Data (TKDD) 13, 5 (2019), 54. Tim Dwyer. 2017. Towards unambiguous edge bundling: Investigating conflu- [27] Wouter van Heeswijk, George HL Fletcher, and Mykola Pechenizkiy. 2016. On ent drawings for network visualization. IEEE Transactions on Visualization & structure preserving sampling and approximate partitioning of graphs. In Proceed- Computer Graphics (2017), 1–1. ings of the 31st Annual ACM Symposium on Applied Computing. ACM, 875–882. [9] Vladimir Batagelj, Franz J Brandenburg, Walter Didimo, Giuseppe Liotta, Pietro [28] Yang Zhang, Yusu Wang, and Srinivasan Parthasarathy. 2017. Visualizing at- Palladino, and Maurizio Patrignani. 2011. Visual analysis of large graphs using tributed graphs via terrain metaphor. In Proceedings of the 23rd ACM SIGKDD (x, y)-clustering and hybrid visualizations. IEEE transactions on visualization and International Conference on Knowledge Discovery and Data Mining. ACM, 1325– computer graphics 17, 11 (2011), 1587–1598. 1334. [10] Vladimir Batagelj and Matjaz Zaversnik. 2003. An O(m) algorithm for cores [29] Hong Zhou, Panpan Xu, Xiaoru Yuan, and Huamin Qu. 2013. Edge bundling in decomposition of networks. arXiv preprint cs/0310049 (2003). information visualization. Tsinghua Science and Technology 18, 2 (2013), 145–156. [11] Emilio Di Giacomo, Walter Didimo, Giuseppe Liotta, Fabrizio Montecchiani, [30] Michael Zinsmaier, Ulrik Brandes, Oliver Deussen, and Hendrik Strobelt. 2012. and Ioannis G Tollis. 2014. Techniques for edge stratification of complex graph Interactive level-of-detail rendering of large graphs. IEEE Transactions on Visual- drawings. Journal of Visual Languages & Computing 25, 4 (2014), 533–543. ization and Computer Graphics 18, 12 (2012), 2486–2495. [12] Tim Dwyer, Nathalie Henry Riche, Kim Marriott, and Christopher Mears. 2013. Edge compression techniques for visualization of dense directed graphs. IEEE Graph Waves EDBT 2020, March 30-April 2, 2020, Copenhagen, Denmark Appendix Table 3: Results of performing the wave decomposition on select layers of the Friendster social network. These are only the layers that contain a fixed point of size at least 222 edges. (L: layer, 𝐹𝑘 : num- ber of fixed points, W: number of waves, F: number of fragments). The highlighted row corresponds to the layer containing the largest fragment of 21,128,481 edges. Alg. 1: Wave Decomposition (O(m)) Input: 𝐹𝑘 = (𝑉 , 𝐸), a fixed point of peel value k. L |V| |E | 𝐹𝑘 W F Time (s) Output: 𝑀 = {𝑊1,𝑊2,𝑊3, ...,𝑊𝑚 } where each 𝑊𝑖 ⊂ 𝐸 are 2 20031131 26084775 39188 8 99 168.9 waves. 3 20912756 46640174 564 27 564 887.1 Output: 𝑆 = {𝑆 0, 𝑆 1, 𝑆 2, ..., 𝑆𝑛 } where each 𝑆 𝑗 ⊂ 𝑉 . 4 1509262 4760942 4816 11 317 27.0 5 14030072 54607126 447 10 536 314.2 1 function waves(𝐹𝑘 ): 6 6352083 32606979 793 25 1177 233.0 2 𝑀←∅ 8 2882071 19767228 193 14 903 116.1 3 𝑆←∅ 11 13635761 131376950 5 75 2532 1422.0 4 𝑖←1 15 5011964 68883449 9 46 1874 496.0 5 𝑗 ←0 17 738488 11279923 16 18 970 50.8 6 while 𝐸 ≠ ∅ do 21 432014 7931887 5 106 978 40.9 25 10460847 229286219 1 17 1127 1511.5 7 𝑊𝑖 ← ∅ 29 244598 6430697 2 15 722 23.4 8 𝑆 𝑗 ← {𝑣 ∈ 𝑉 : deg(𝑣) = 𝑘 } 38 3144226 107272464 1 213 3279 1828.4 9 while 𝑆 𝑗 ≠ ∅ do 40 216081 8035443 2 50 889 31.4 10 𝑆 ← 𝑆 ∪ {𝑆 𝑗 } 53 6878498 322297063 1 10 1260 2543.1 11 𝐸 ← 𝐸 \ frag(𝑆 𝑗 ) 65 155562 9282393 1 7 415 31.6 12 𝑊𝑖 ← 𝑊𝑖 ∪ frag(𝑆 𝑗 ) 71 126835 8339470 1 56 768 30.9 89 4570064 367562291 1 49 737 5647.4 13 𝑆 𝑗+1 ← {𝑣 ∈ 𝜕𝑆 𝑗 | deg(𝑣) < 𝑘 } 120 908835 97669888 1 2 367 418.0 14 𝑗 ← 𝑗 +1 140 764401 94970270 1 7 443 410.1 15 end 169 228332 32277749 1 29 357 115.6 16 𝑀 ← 𝑀 ∪ {𝑊𝑖 } 175 197091 32971011 1 40 931 121.1 17 𝑖 ←𝑖 +1 234 157908 32985831 1 128 1188 123.8 304 24528 6301889 1 7 203 19.2 18 end 19 end Table 2: Results of performing the wave decomposition on layers 2 through 13 of the Patents citation network. These are only the layers that contain a fixed point of size at least 216 edges. (L: layer, 𝐹𝑘 : number of fixed points, W: number of waves, F: number of frag- ments). The highlighted row corresponds to the layer containing the largest fragment of 1,356,327 edges. L |V| |E | 𝐹𝑘 W F Time (s) 2 773841 953738 7682 7 55 2.779 3 1663386 4097177 37 33 411 18.587 4 198984 579406 787 15 267 1.652 5 617644 2457675 42 17 610 8.187 6 438875 2210942 100 47 996 8.935 7 246576 1406616 66 19 619 4.02 8 158620 1051389 85 44 765 3.243 9 80554 590337 61 25 511 1.46 10 56961 469522 73 45 579 1.211 11 20602 168159 80 14 188 0.272 12 13507 126909 73 12 236 0.192 13 15043 154245 38 13 284 0.249