=Paper=
{{Paper
|id=Vol-3135/dataplat_paper2
|storemode=property
|title=RTGEN : A Relative Temporal Graph GENerator
|pdfUrl=https://ceur-ws.org/Vol-3135/dataplat_paper2.pdf
|volume=Vol-3135
|authors=Maria Massri,Zoltan Miklos,Philippe Raipin,Pierre Meye
|dblpUrl=https://dblp.org/rec/conf/edbt/Massri0RM22
}}
==RTGEN : A Relative Temporal Graph GENerator==
RTGEN : A Relative Temporal Graph GENerator Maria Massri1 , Zoltan Miklos2 , Philippe Raipin1 and Pierre Meye1 1 Orange Labs, Cesson-SΓ©vignΓ©, France 2 University of Rennes CNRS IRISA, Rennes, France Abstract Graph management systems are emerging as an efficient solution to store and query graph-oriented data. To assess the performance and compare such systems, practitioners often design benchmarks in which they use large scale graphs. However, such graphs either do not fit the scale requirements or are not publicly available. This has been the incentive of a number of graph generators which produce synthetic graphs whose characteristics mimic those of real-world graphs (degree distribution, community structure, diameter, etc.). Applications, however, require to deal with temporal graphs whose topology is in constant change. Although generating static graphs has been extensively studied in the literature, generating temporal graphs has received much less attention. In this work, we propose RTGEN a relative temporal graph generator that allows the generation of temporal graphs by controlling the evolution of the degree distribution. In particular, we propose to generate new graphs with a desired degree distribution out of existing ones while minimizing the efforts to transform our source graph to target. Our proposed relative graph generation method relies on optimal transport methods. We extend our method to also deal with the community structure of the generated graphs that is prevalent in a number of applications. Our generation model extends the concepts proposed in the Chung-Lu model with a temporal and community-aware support. We validate our generation procedure through experiments that prove the reliability of the generated graphs with the ground-truth parameters. Keywords Temporal graphs, Graph generation, Optimal transport 1. Introduction class citizen in graph management systems [13, 14, 15, 16]. Most of these systems rely on real-world temporal Graphs are the most natural model to describe real world graphs to evaluate their proposed methods. Real-world interactions and are currently used in a myriad of appli- graphs, however, do not often fit the scale requirements. cation domains such as citation [1], transportation [? ], Therefore, practitioners must rely on a temporal graph and sensor networks [2] to cite just a few. These graphs generator that is able to produce large scale graphs whose are managed by a graph management system whose per- evolution correlates with that of real world temporal formance is usually evaluated through graph-centered graphs. To tackle this challenge, we proposed RTGEN: benchmarks that address different performance metrics a relative temporal graph generator that produces large such as ingestion throughput, space usage and query scale temporal graphs by controlling a number of key execution time. In this context, practitioners refer to real- features that characterises the evolution of real-world world and synthetic graphs to use in the benchmarks. graphs. That is, our generation procedure, controls the Indeed, available graph generation techniques fill the evolution of the degree distribution by extending a very gap between real and synthetically generated graphs by common generation technique [17] referred to as the trying to mimic the characteristics of real graphs such as Chung-Lu model with temporal and community-aware controlling the degree distribution [3, 4, 5, 6, 7, 8]. Besides, support. a number of existing graph generators are community- We model a temporal graph by a sequence of snapshots aware in the sense that they group vertices that are more π = {πΊ0 , . . . , πΊπ } where πΊπ is the graph snapshot at densely connected between each other than they are timestamp π‘π and characterized by a degree distribution with the rest of the graph, in separate or overlapping that is generated from sampling user-defined temporal sub-graphs called communities [9, 10, 11]. parameters. Having this, our relative graph generation Real graphs, however, are dynamic [12] such that their procedure consists of transforming πΊπβ1 into πΊπ by ap- topology is subject to continuous changes. In this context, plying a stream of atomic graph operations with respect a new emphasis is being placed to support time as a first to the desired degree distribution at time instants π‘πβ1 and π‘π . Based on the fact that a strong correlation exists Published in the Workshop Proceedings of the EDBT/ICDT 2022 Joint between successive snapshots [18, 19, 20], we propose Conference (March 29-April 1, 2022), Edinburgh, UK to minimize the number of graph operations that have " maria.massri@orange.com (M. Massri); zoltan.miklos@irisa.fr (Z. Miklos); philippe.raipin@orange.com (P. Raipin); to be applied in order to transform a graph snapshot pierre.meye@orange.com (P. Meye) into its successor. The main idea consists of minimizing Β© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons Li- cense Attribution 4.0 International (CC BY 4.0). the distance between degree distributions of successive CEUR Workshop Proceedings (CEUR-WS.org) snapshots. We achieve this goal by relying on an optimal Given the number of vertices in each graph snap- transport solver which provides a transportation plan ca- shot ππ β {π1 , . . . , ππ }, a stochastic community pable of transforming a "mass" from source distribution matrix π and a sequence of degree distributions to target distribution with a minimum of work. In order {π1 , . . . , ππ }, we generate a sequence of graph snap- to apply the obtained transportation plan, we proposed a shots {πΊ1 , . . . , πΊπ } such that each snapshot πΊπ is straightforward generalization of the well-known Chung- relatively generated by transforming πΊπβ1 . This Luβs model, also known as the CL model, that was first transformation is based on morphing the ππΊπβ1 = discussed in [21] and formalized in [17, 22]. We choose to πΊ πΊ πΊ πΊ {(π₯1 πβ1 , π1 πβ1 ), . . . , (π₯π πβ1 , ππ πβ1 )} into ππ = extend this model for the reasons of simplicity and scal- {(π₯1 , π1 ), . . . , (π₯π , ππ )} and preserving the community π π π π ability. We also extended the CL model to partition the structure that is represented by the stochastic commu- graph into ground-truth communities that coexist with nity matrix π such that ππΊπβ1 = ππΊπ = π . Note the aforementioned time-dependent degree distribution. that each element ππ’π£ of M is equal to the probability Our contributions are validated through experimental of edge creation between the source and target commu- results showing the evolution of the degree distribution nities ππ’ and ππ£ . Figure 1 illustrates the relative graph and community structure with respect to ground-truth generation procedure. Each graph snapshot πΊπ is rela- input parameters. tively generated by transforming its ancestor πΊπβ1 . This The rest of the paper is organized as follows, Section transformation is based on computing a transportation 2 provides an overview of the generation procedure. Sec- matrix π that minimizes the cost of morphing ππΊπβ1 tion 3 introduces the baseline generation procedure of the into ππ . The computation of the transportation matrix CL model. Section 4 describes the proposed community- reduces to an optimal transport problem. Based on the aware extension of the CL model. Section 5 presents computed transportation matrix, each vertex belonging a detailed description of the proposed generation pro- to the graph πΊπβ1 is assigned with a linkage or break- cedure. Section 6 provides an experimental evaluation age probability to indicate the probability of adding or of the synthetically generated temporal graphs. Section removing an edge. This phase is followed by creating or 7 describes the related work. Section 8 concludes the removing edges to or from the graph πΊπβ1 to produce work. the graph πΊπ . These graph updates follows the linkage or breakage probabilities assigned for each of the ver- tices. Finally, the graph πΊπ is computed by applying 2. Overview the generated updates on πΊπβ1 . Note that, the genera- In this section, we describe the overall generation pro- tion procedure depicted in this Figure shows a simplified cedure. Given the characteristics of a series of graph scenario where the number of vertices does not change. snapshots, our relative generation procedure produces However, if that number changes, a phase consisting of the series of graph snapshots {πΊ1 , . . . , πΊπ } whose char- the addition or deletion of vertices should precede the acteristics approximate the given ones. These graph snap- computation of the transportation matrix to assure the shots are relatively computed by applying a number of following constraint: graph updates on each snapshot in order to produce its π π successor snapshot. To clarify, we apply a number of βοΈ βοΈ ππ πΊπ = ππ‘π , β1 β€ π β€ π graph updates on a graph snapshot πΊπβ1 to produce an- π =1 π‘=1 other graph snapshot πΊπ whose characteristics approx- imate the given parameters assigned for the πth graph This constraint implies that the sum of weights of distri- snapshot. butions ππΊπ and ππ should be equal. Formally, we define a graph snapshot πΊπ valid at a time instant π‘π as the tuple {ππΊπ , πΈπΊπ , ππΊπ , ππΊπ } where 3. Graph generation with given ππΊπ is the set of vertices, πΈπΊπ is the set of edges, ππΊπ is a degree distribution and ππΊπ is the density commu- expected degree distribution nity matrix. For instance, we consider ππΊπ of the form In this section, we describe the generation procedure of π , ππ )} as a discrete distribution {(π₯1πΊπ , π1πΊπ ), . . . , (π₯πΊπ πΊπ over N where π₯π refers to the degree of a node and ππ πΊπ πΊπ random static graphs with a given degree distribution. refers to the total number of vertices in the graph whose Random graphs were introduced by ErdΕs and RΓ©nyi total number of edges is equal to π₯πΊ π . A density commu- [23]. The popularity of this model, also known as the πΈπ π nity matrix ππΊπ defines the community structure of the model, stems from its simple generation procedure that generated graphs, each element ππ’π£ of which is equal consists of generating a number of vertices and connect- to the density of edges between the source community ing them by an edge after picking each endpoint with a ππ’ and the target community ππ£ . fixed probability π. However, this model produces graphs whose degree distribution follows a binomial distribution probability ππ£π in the following Equation: ππ£π ππ£π = (1) π· Subsequently, a linkage phase consists of picking |πΈ| = π· 2 pairs of vertices to connect such that for a sufficiently large |πΈ| the random variable denoting the degree of vertex π£π is Poisson distributed with a mean equals to ππ£π . Iterating the linkage phase |πΈ| times where an edge is equally likely to be chosen in both directions for undi- rected graphs, the insertion probability of an edge con- necting vertex π£π and vertex π£π is ππ£π π£π = 2ππ£π ππ£π π· 2 . The edge insertion probability can be rewritten in the more convenient form: ππ£π ππ£π ππ£π π£π = π· For optimisation sake, we gather all vertices shar- Figure 1: Relative graph generation procedure. ing the same degree together in a pool πΎπ = {π£π |π£π β π β§ ππ£π = π} that we use as a subsidiary gen- eration component. Each vertex in a pool is equally likely to be chosen assuring that the aforementioned linkage with a mean degree equals to (π β1)π where π is the to- probability π is not affected for a sufficiently large num- π£π tal number of vertices. Hence, it fails to mimic real-world ber of vertices. After the degree assignment phase, ver- graphs that usually follow a power-law degree distribu- tices are distributed throughout the pools having each tion. To tackle this limitation, the edge configuration the following linkage probability: model [24] consists of generating a random graph whose degree distribution matches, approximately, a given de- π|πΎπ | gree distribution. That is, each vertex is assigned with a ππΎπ = π· number of stubs equal to its desired degree that is drawn independently from the given degree distribution. Hav- Now, instead of picking vertices a pool is first picked ing this, pairs of stubs are linked randomly forming edges It should be highlighted that self-loops or multi-edges between their endpoints. Although this technique ap- can be created since each endpoint of an edge is picked proximately matches any given degree distribution, a independently. The number of these edges, however, is relaxed version known as the Chung-Lu model was in- independent of the number of vertices and thus can be troduced in [21]. This model consists of generating a ran- neglected for large scale graphs. dom graph that approximately matches a given degree distribution relying on a simple generation procedure 4. Community-aware graph that can be considered as a variant of the πΈπ model. For simplicity, we will refer to this model as the CL model in generation with given expected the following description. degree distribution Consider the degree distribution π as the input param- eter to the CL model and the undirected, unweighted and Although the CL model produces graphs with respect unlabeled graph πΊ = {π, πΈ, ππΊ } as the output where to a given degree distribution, it is not aware of the ππΊ denotes the degree distribution of πΊ, π and πΈ de- community structure existing in most real-world graphs. note the set of vertices and edges, respectively. Having Hence, we propose a community-aware extension of the this, the CL model produces a graph πΊ such that ππΊ is CL model based on the stochastic block model (SBM). an approximation of π. The main idea is to pick each Since a community is not quantitatively well defined, endpoint of an edge with a certain probability such that, many definitions where provided in literature. Intuitively, at the end of the generation procedure, the total number one can consider a community as a subgraph which ver- of incident edges to each vertex is close to its assigned tices are more densely connected between each other degree. Hence, the starting phase consists of assigning than they are with the rest of the graph. Letβs consider each vertex π£π β π with a degree ππ£π and a linkage the set of communities πΆ = {ππ } and suppose that a ver- probability ππ£π β ππ£π . Considering that π· is the sum of tex should belong to one community and edges should the degrees extracted from π, we define the CL linkage be differentiated into within and between edges: β’ Given a community ππ , an edge π is called a within define πππ = πππ = 2πππ and πππ = πππ . Furthermore, edge if the source vertex β ππ and the target vertex we assign each community ππ with a within edge creation β ππ . probability πππ ππ , a between edge creation equal to πππ ππ’π‘ β’ Given two communities ππ and ππ , an edge π is and a probability of edge creation πππ such that: called a between edge if the source vertex β ππ |πΆ| and target vertex β ππ or vice versa. πππ = πππ ππ’π‘ βοΈ (2) ππ + πππ = πππ + 0.5 πππ To insure that vertices belonging to a community are π=1,πΜΈ=π more densely connected to each other than they are with We define the linkage probability ππ£π of choosing a vertex the rest of the graph, the within and between edge cre- π£π belonging to community ππ as follows: ation probabilities πππππ and πππ of ππ must satisfy the ππ’π‘ condition πππ ππ’π‘ π π£π ππ > π ππ , βπ π β πΆ. π π£π = ππ , π£π β ππ (3) π·ππ π 4.1. Stochastic block model where π·ππ is the sum of the degrees of vertices belong- ing to community ππ and πππ is the probability of choos- In this section, we formulate the SBM model [9] (also ing ππ . The linkage probability of a vertex is the product known as the planted partition model) which is com- of the probability πππ of choosing the community to monly used for the generation of random graphs with a π which the vertex belongs and the probability π·ππ£π of given community structure. Hence, this generation pro- π cedure only considers controlling the community struc- choosing the vertex π£π in that community. Hence, Equa- ture of the graph and overlooks the resulting degree dis- tion 3 assures the approximation of the community ma- ππ£π tribution. The input of the generation procedure is a trix. However, ππ£π should be equal to π· (Equation 1) stochastic community matrix π , each element πππ of to assure the approximation of the degree distribution. which defines the probability of edge creation between Therefore, we define the following condition in order to the source community ππ and the target community ππ . reduce Equation (3) to Equation (1): The output is a graph πΊ = {π, πΈ, ππΊ } where ππΊ is π·ππ = π·πππ the obtained density community matrix, each element π· ππΊ ππ of which defines the relative edge density between Now, replacing π· by π·ππππ in the original CL linkage π the source community ππ and the target community ππ . probability (Equation 1) which assures the control of the The generation procedure starts with the distribution of degree distribution, we obtain Equation 3 which assures vertices between the planted communities such that each the control of the community structure. Having this, the vertex belongs to a single community. Now, the linkage duality of the linkage probability given in Equations (1) probability between a vertex belonging to community ππ and (3) insures that both requirements are satisfied by and another vertex belonging to community ππ is equal to our generation procedure. πππ . However, the extracted community density matrix For performance amelioration, we consider the selec- ππΊ from the resulting graph πΊ is an approximation of tion of pools rather than vertices such that a pool is local π . That is, each element ππΊ ππ is binomially distributed to one community. That is vertices having the same de- with mean equals to πππ and Poisson distributed with gree variation and belonging to the same community ππ the same mean for a sufficiently large number of edges. are grouped in a pool πΎπππ = {π£π |π£π β ππ β§ ππ£ π = π} such that the probability of a pool selection for edge 4.2. Stochastic block model with given insertion is: π|πΎπππ | degree distribution π πΎ ππ = π π· In this section, we propose a static graph generation procedure which controls both the community structure 4.3. Hierarchical community structure and degree distribution. Given a degree distribution π The specification of the stochastic matrix is not straight- and a stochastic community matrix π , our proposed forward and imposes an exhaustive number of user- model generates a graph πΊ which degree distribution ππΊ defined parameters. Hence, we define an auto-generative is an approximation of π and density community matrix procedure that fills the matrix with no exogenous effort. ππ is an approximation of π . In the following, we Considering a static graph, we construct a stochastic ma- provide a description of our generation mechanism that trix that reflects a hierarchical community structure with extends the stochastic block model depicted in Section only two given parameters. In a hierarchical community 4.1. matrix, communities recursively embed subsequent com- Since the generated graph πΊ is undirected, the matrix munities in a self-similar fashion such that the commu- π is symmetric such that πππ = πππ . Having this, we nity structure is represented by a hierarchical tree where each node represents a community. Each non-leaf node cost: π βοΈ π is expanded into π other nodes until reaching a desired βοΈ min π‘ππ πππ tree height β (Figure 2). The ending recursion results in π π=1 π=1 ππ = πβ leaf-nodes referencing the finest scale communi- ties having a linkage probability πππ proportional to the where πππ = π(π₯π , π¦π ) is a measure of distance between distance between ππ and ππ . The distance between two π₯π and π¦π . The following constraints must hold for the communities, π(ππ , ππ ), is equal to the number of hops optimal flow π : traversed in order to reach the least common ancestor of these communities. In order to satisfy the condition stat- π‘ππ β₯ 0, 1 β₯ π β₯ π, 1 β₯ π β₯ π ing that within edge linkage probability must be higher π π than between linkage probability (πππππ > πππ ), we define ππ’π‘ βοΈ βοΈ π‘ππ β€ ππ , 1 β₯ π β₯ π, π‘ππ β€ ππ , 1 β₯ π β₯ π πππ as follows: π=1 π=1 π β1 πβοΈ Once the optimal flow π is found, the EMD between πππ = 0.5 πππ + π π and π is computed as follows: π=0,πΜΈ=π βοΈπ βοΈπ π=1 π=1 π‘ππ πππ where π is a tunable parameter which calibration steers πΈπ π·(π, π) = βοΈπ βοΈπ π=1 π‘ππ the difference between within and between edge densi- π=1 ties. The effect of varying π is further highlighted in the The EMD is fundamental in our generation procedure Section 6. since it is used to compute the distance between two degree distributions as described in the following Section. 5.2. Baseline relative graph generation In this section, we provide the baseline procedure of transforming a graph πΊ with degree distribution π into πΊβ² with degree distribution πβ² which we refer to as the Baseline relative graph generation. Note that, we use this technique for generating temporal graphs such that πΊ Figure 2: Hierarchical community tree with height β and and πΊβ² corresponds to successive graph snapshots. For branching factor π. generalisation purposes, however, we remove the notion of time in this section. This transformation is enabled by a set of atomic graph operations including the addition and deletion of a vertex or an edge. Following the assumption that temporal graphs gradually evolve, this number of 5. Relative graph generation graph operations between successive snapshots should be minimized which is assured in our model by applying In order to control the evolution of the degree distribu- an optimal transport method. tion of the generated temporal graphs, we propose in this Consider the input graph πΊ = {π, πΈ, π} and de- section an extension of the CL model that is based on the gree distribution πβ² , the generated output graph πΊβ² = optimal transport to compute the minimal distance be- {π β² , πΈ β² , ππΊβ² } such that ππΊβ² is an approximation of πβ² . tween the degree distributions of each pair of successive We define the distance between two degree distributions graph snapshots. π and πβ² as the earth moverβs distance πΈπ π·(π, πβ² ). Consider πΏπ = |π β² | β |π | as the total number of ver- tices to be added to or removed from the graph based 5.1. Earth moverβs distance on whether πΏπ is a positive or negative number, respec- The Earth moverβs distance can be defined as a mea- tively. When adding a new vertex, this vertex is assigned sure of distance over a domain π· between two dis- with a degree equals to 0 and deleting a vertex consists tributions of the form {(π₯1 , π1 ), ..., (π₯π , ππ )} where of removing the vertex with its corresponding incident π₯π β π· and ππ is the density of π₯π . Having this, edges. This transformation phase assures that πΊ and πΊβ² the problem reduces to the computation of an optimal share the same number of vertices, hence, enables the flow (transportation matrix) π = [π‘ππ ] between two transformation of π into πβ² . In order to morph π into distributions π = {(π₯1 , π1 ), ..., (π₯π , ππ )} and π = πβ² , a transportation matrix π is computed, where each {(π¦1 , π1 ), ..., (π¦π , ππ )} such that π‘ππ is the mass trans- row corresponds to a degree π in the set of degrees in the ported between ππ and ππ which minimizes the overall source distribution π and each column corresponds to a degree πβ² in the set of degrees in the target distribution πβ² . Now, each cell consists of the portion of vertices having a degree π for which links are to be inserted or removed in order to be assigned a total number of edges equals to degree πβ² . That is, a vertex π£π , with a degree ππ£π = π, will be assigned a degree variation of πΏππ£π = πβ² β π resulting in a total number of edge insertions and deletions defined as π·+ and π·β , respectively. We assign, for each vertex π£π , a linkage probability π+ π£π or Figure 3: Graph representing the case of a non-possible edge a breakage probability πβ π£π defined as extensions of the breakage. CL linkage probability (1): πΏππ£π desired degree distribution π and the stochastic block π+ π£π = , πΏππ£π > 0 (4) π·+ matrix π . However, the output consists of a graph βπΏππ£π πΊβ² = {π β² , πΈ β² , ππΊβ² , ππΊβ² } where ππΊβ² is an approxima- πβ π£π = , πΏππ£π < 0 (5) tion of π and ππΊβ² is an approximation of π . Recall that π·β We collect vertices sharing the same degree variation the generation procedure depicted in section 4.2 produces πΏπ = πβ² β π into a linkage pool if πΏπ > 0 and in a a graph with a given expected degree distribution and breakage pool if πΏπ < 0. Consider πΎπβπβ² = {π£π |π£π β stochastic community matrix based on the proposed link- π β§ πΏπ£π = πβ² β π} to be the pool containing vertices age probability duality presented in Equations (1) and (3). having a degree π that should be transformed into πβ² . We Indeed, a relative community-aware graph generation is compute the probability of picking a linkage or breakage based on an extension of the aforementioned duality by pool π+ taking into consideration the degree variation of a vertex πΎπβπβ² and ππΎπβπβ² as follows: β instead of the its degree. That is, the following linkage πΏπ|πΎπβπβ² | and breakage probabilities present a straightforward ex- π+ πΎπβπβ² = , πΏπ > 0 tension of Equations (4) and (5): π·+ πΏππ£π βπΏπ|πΎπβπβ² | π+ π£π = πππ , π£π β ππ πβ πΎπβπβ² = , πΏπ < 0 π·π+π π·β However, breaking an edge might be impossible in situ- πΏππ£π πβ π£π = πππ , π£π β ππ ations where the source degree variation πΏπ is negative π·πβπ and the sum of the negative degree variations of its neigh- Where π·π+π and π·πβπ are the total number of edge bors is higher than πΏπ. For the sake of illustration, we insertions and deletions in ππ , respectively. From the present in Figure 3 a graph in which the number of edges transportation matrix defined in section 5.2, we find to remove from a node is higher than the sum of the πππ as the portion of vertices with degree variation number of edges to remove from its neighboring vertices. πΏπ = ππ β ππ . However, finding the portion πππππ of That is, the transformation of this graph implies remov- vertices in community ππ should satisfy three conditions ing 2 edges from vertex π£1 since πΏπ£1 = β2. However, detailed bellow. Each condition π results in a system of the number of the edges that have to be removed from linear equations of the form π΄π π = π΅π where π is a the neighboring vertices of π£1 is equal to πΏπ£2 = β1 since π vector composed of πππππ such that π = {ππππ | β1 β€ π β€ πΏπ£3 = 0 and πΏπ£4 = 1. To overcome this, we repeat the |ππΊ | β§ β1 β€ π β€ |π| β§ 0 β€ π β€ |πΆ|} where ππ is the morphing procedure until EMD(π, πβ² ) reaches a desired total number of communities. threshold. Our simulations have proved that the value of EMD(π, πβ² ) converges rapidly towards the minimum Condition 1: For each community ππ β πΆ, conditions threshold after a tolerable number of iterations. This stating that π·π+π = π·+ πππ and π·πβπ = π·β πππ statement will be further highlighted in Section 6. must hold, where π·+ and π·β are the total number of edge insertions and deletions in all communities 5.3. Relative community-aware graph of πΆ, respectively. Incorporating πππππ in the previous generation condition translates to the following equality: β² β² |ππΊ | |π | |ππΊ | |π | A more complex version of the previously described βοΈ βοΈ ππ βοΈ βοΈ (π π β ππ )πππ = ( (ππ β ππ )πππ )πππ relative graph generation, consists of preserving the π=0 π=0 π=0 π=0 graph community structure in the transformation proce- dure. That is, the input of our community-aware relative where ππΊ and πβ² are the source and target degree distri- graph generator is the graph πΊ = {π, πΈ, ππΊ , ππΊ }, the butions. Condition 2: This condition states that the sum of of adding and removing edges is repeated π·+ and π·β all portions of vertices with degree variation ππ β ππ times, respectively. In each iteration, communities ππ and βππ β ππ‘ in ππ should be equal to the portion πππ π of ππ are picked based on πππ πΆπππ and vertices ππ and vertices in ππ having a degree ππ resulting in the follow- ππ are picked using πππ π ππππ + [π] and πππ π ππππ β [π]. ing equality: Now, an addition or deletion graph update between the π βοΈ chosen vertices is added to the list of logs using functions πππππ = πππ π addEdge and removeEdge whether the vertices where π=0 chosen from the linkage or breakage pools. However, breaking an edge might be impossible in some situations Condition 3: This condition states that the por- as shown in Figure 3. In such a use case, no graph update tion πππ of vertices with degree variation ππ β ππ in is added to the list of logs πΏ. Finally, the EMD distance the graph must be equal to the sum of all portions πππππ π β²is computed between the obtained degree distribution βππ β πΆ. π πΊ and the desired one π. If π β² is higher than π and ππ βοΈ the number of repetitions ππ’π_ππ‘ππ has not yet reached πππππ = πππ πππ₯_ππ‘ππ, the same algorithm is repeated on the newly ππ =0 computed graph snapshot πΊβ² . The computation stops By solving the concatenated system of equations obtained when πβ² is lower than or equal to π or the number of from the previous conditions ππππππ‘(π΄1 , π΄2 , π΄3 )π = repetitions has already been reached. ππππππ‘(π΅1 , π΅2 , π΅3 ), we find the vector π, hence the Algorithm 1: CRGG values of πππππ . Pools are created on a local basis in each Input: πΊ = {π, πΈ, ππΊ , ππΊ }, π, π , π, community such that vertices with the same degree vari- πππ₯_ππ‘ππ, ππ’π_ππ‘ππ ation πΏπ = π β π and belonging to the same community β² Output: πΊβ² = {π β² , πΈ β² , ππΊβ² , ππΊβ² } ππ are collected in a single pool πΎπβπβ² . We compute ππ 1 π β getTransportMatrix(ππΊ , π) ; the probability of picking a linkage or breakage pool 2 X β getVector(ππΊ , π, π , π ) ; ππΎπβπβ² ,ππ and ππΎπβπβ² ,ππ as follows: + β 3 (π· , π· ) β getNumberOfEdges(T) ; + β ππ 4 πππ πΆππ β getCDFComs(π ) ; πΏπ|πΎπβπ β²| π+ πΎπβπβ² ,ππ = + , πΏπ > 0 5 (πππ π ππππ + , πππ π ππππ β ) β initCDFPools ; π· 6 πΏ β ππππ‘πΏπππ () ππ βπΏπ|πΎπβπ β²| 7 for ππ β πΆ do πβ πΎπβπβ² ,ππ = , πΏπ < 0 β π·β 8 (πππ π ππππ +ππ , πππ π ππππ ππ ) β Algorithm CRGG depicts the relative community aware getCDFPools(π, ππ ) ; graph generation procedure. The input parameters are 9 πππ π ππππ + [π] β πππ π ππππ + ππ ; the graph snapshot πΊ, desired degree distribution π, den- 10 πππ π ππππ β [π] β πππ π ππππ β ππ ; sity community matrix π , threshold of the EMD dis- + 11 for π β 0 π‘π π· do tance between ππΊ and π, maximum number of repeti- 12 (ππ , ππ ) β chooseComs(πππ πΆππ) ; tions πππ₯_ππ‘ππ and the current number of repetitions 13 (ππ , ππ ) β chooseVertices(πππ π ππππ + [π], ππ’π_ππ‘ππ. Whereas, the output is a new graph snapshot πΊβ² . Note that, the value of ππ’π_ππ‘ππ is equal to 0 in the πππ π ππππ + [π]) ; first iteration. The transportation matrix π is computed 14 πΏ.addEdge (ππ , ππ ) ; using the function getTransportMatrix by taking the 15 for π β 0 π‘π π·β do degree distributions ππΊ and π as input. The function 16 (ππ , ππ ) β chooseComs(πππ πΆππ) ; getVector, computes π΄ and π΅ based on the Conditions 17 (ππ , ππ ) β chooseVertices(πππ π ππππ β [π], 1, 2 and 3 and solves the system of equations defined by πππ π ππππ β [π]) ; π΄π = π΅ to find the vector π. The total number of edges 18 πΏ.removeEdge (ππ , ππ ) ; to add (π·+ ) and delete (π·β ) are then computed based on 19 πΊ β applyLogs(πΊ, πΏ) ; β² the transporation matrix π . The function getCDFComs 20 π β getEMD(π, ππΊ ) ; β² β² computes the cumulative distribution function πππ πΆππ β² 21 if π β₯ π β§ ππ’π_ππ‘ππ < πππ₯_ππ‘ππ then based on the density community matrix π . Then, vectors 22 ππ’π_ππ‘ππ β ππ’π_ππ‘ππ + 1 ; πππ π ππππ + and πππ π ππππ β representing the cumulative 23 πΊβ² β density functions of the linkage and breakage pools and a πΆπ πΊπΊ(πΊβ² , π, π, ππ’π_ππ‘ππ, πππ₯_ππ‘ππ) ; list of logs (graph updates) πΏ are initialized. The function getCDFPools is used to compute the cumulative distri- 24 else bution functions πππ π ππππ + and πππ π ππππ β based on 25 return πΊβ² ; the probabilities ππΎπβπβ² ,ππ and ππΎπβπβ² ,ππ . The process + β Γ10 4 Γ10 4 15000 10 6 Number of occurrences Number of occurrences Number of occurrences 8 5 10000 4 6 3 4 5000 2 2 1 0 0 0 20 20 0 40 10 40 10 10 8 8 20 8 60 6 60 6 6 Deg 4 Deg 4 Deg 40 4 ree 80 ree 80 ree 2 st amp 2 st amp 2 st amp 100 0 Time 100 0 Time 60 0 Time Figure 4: Gaussian degree distribution Figure 5: Gaussian degree distribution Figure 6: Zipfian degree distribution of a growth only graph of a graph with edge deletions of a growth only graph 5.4. Accuracy of the generation characteristics of the generated temporal graphs. Note procedure that the source code of RTGEN is publicly available1 . Besides the source code, we also provide the instruc- In order to measure how far the characteristics of the tions describing how to use the tool to generate temporal generated graphs are from the ground truth parameters, graphs. For instance, users can pass the input parame- we define two distance metrics ππ and ππ . ters to describe the desired sequence of degree distribu- The first metric ππ measures the inaccuracy of approx- tions or stochastic community matrix and the format of imating the degree distributions of the generated graphs the generated output files to RTGEN using a terminal with the given sequence of degree distributions. That command. RTGEN proposes two output types: snapshot- is, it measures the root mean square of the EMD dis- based and event-based. The snapshot based type consists tances between each degree distribution ππ in the given a sequence of graph snapshots represented each in a sep- sequence {π1 , . . . , ππ } and its corresponding degree dis- arate file. Whereas, the event-based type, consists of tribution ππΊπ in the sequence {ππΊ1 , . . . ππΊπ } extracted generating the sequence of graph updates (events) that from the generated graphs. Having this, ππ is computed we applied between successive snapshots to transform as follows: one snapshot into the next one. βοΈβοΈπ 2 π=1 (πΈπ π·(ππ , ππΊπ )) ππ = 6.0.1. Experimental setup π Whereas, the second metric ππ measures the inaccuracy The experiments were conducted on a single machine of approximating the community density matrix of the equipped with Intel(R) Core(TM) i5-8350U CPU @ generated graphs with a given stochastic matrix. That 1.70GHz 1.90 GHz, 16 GB memory and 500 GB SSD. is, it measure the root mean square of the difference be- We used Go 1.17.5 and Python 3.8.0. Besides, we referred tween the Frobenius norms of the given stochastic matrix to the optimal transport solver proposed in [25]. The π and the stochastic matrix ππΊπ extracted from every graphs shown in this section are visualized using Gephi generated graph snapshot. Having this, ππ is computed tool [26] which offers network visualization facilities and as follows: community detection algorithms [27]. βοΈβοΈπ π=1 (πΉ (π ) β πΉ (ππΊπ )) 2 ππ = 6.0.2. Preliminaries π In the following experiments, we refer to two types of where πΉ (π ) is the Frobenius norm of the stochastic common degree distributions: Gaussian ππΊ and Zipfian community matrix π . We recall that the Frobenius norm ππ that are defined as follows: of a matrix π΄ of dimensions (π,π) is defined as follows: 1 1 π₯βπ 2 ππΊ (π₯) = β πβ 2 ( π ) β― βΈ π βοΈ π βΈβοΈ π π πΉ (π΄) = β· |πππ |2 π=1 π=1 1 ππ (π₯) = π₯ β [0, ππππ₯ ] (π₯ + π£)π 6. Experimental evaluation We consider a special case where the value of a parameter π₯ β N in iteration π depends on the its value in the We conducted a number of experiments to validate the previous iteration π β 1 such as π₯π = π₯πβ1 + πΏπ₯ such efficiency of our generator RTGEN. We also provide an in- sight on how changing the input parameters can steer the 1 https://github.com/MariaMassri/RTGEN that ππ = ππβ1 +πΏπ. This is applied on the parameters of graphs where new nodes join the graph and new connec- the degree distributions π, π, ππππ₯ , π , π£ and π denoting tions are created as the time elapses. the total number of vertices. That is, πΏπ denotes the number of vertices to be added or removed from the graph 6.2. Controlling the community structure in the relative generation process. Note that, RTGEN also generates the first snapshot which implies that the of the generated graphs parameters of the degree distribution of the first snapshot In this experiment, we show the generated community should be given. structure with different parameters of the stochastic com- munity matrix and the effect of varying parameter π of 6.1. Controlling the evolution of the the hierarchical tree. As described in Section 4.3, RTGEN is capable of auto-generating the stochastic community degree distribution matrix representing a hierarchical community structure. In this experiment, we show the evolution of the degree Consider a stochastic community matrix generated by set- distribution of a sequence of graph snapshots generated ting π = 4 and β = 2. As depicted in Equation 2, one can with the relative generation procedure given a set of input tune the parameter π in order to control the within and parameters. Hence, we consider Gaussian and Zipfian between edge densities. Hence, we select three different degree distributions with different parameters and plot- values of π in {2, 4, 8}. Furthermore consider, π = 1000 ted the obtained degree distributions in Figures 4, 5 and to be the total number of vertices and parameters π = 30 6. Figure 4 shows the evolution of the degree distribution and π = 2 to be the parameters of a Gaussian distribu- of a generated sequence of 10 graph snapshots given the tion. Note that, in this experiment, we generate a single following parameters: {π0 = 10πΎ, π0 = 30, π 0 = graph snapshot relying on the generation procedure pro- 2, πΏπ = 10π, πΏπ = 5, πΏπ = 0.1}. By setting πΏπ posed in Section 4.2. The generated graphs are shown in to 5, we increase the average degree by 5 between each Figures 7a, 7b and 7c using the Gephi tool. It can be no- pair of snapshots. This indeed, can model a growth-only ticed that the difference between the within and between graph where the average edge degree tend to regularly edge densities is proportional to π since π β πππ ππ’π‘ ππ β πππ increase as the time elapses. where πππ and πππ are the within and between linkage ππ ππ’π‘ However, some real-world graphs are not growth-only probabilities of a community ππ . Furthermore, Figure 8 in the sense that they are subject to edge deletions. This presents the modularity in function of parameter π which is indeed the case of human-proximity or transportation we vary from 0 to 32. The modularity is a measure to graphs where an important number of short-term con- quantify the goodness of community structure. Its for- nections is only valid during peak hours. To model this mula compares, for all the communities, the fraction of characteristic, RTGEN also supports edge deletions. The edges that falls within the given community with the evolution of the degree distribution with edge deletions expected fraction if edges were distributed at random. is presented in Figure 5. Let the following parameters It is clear from the results that the modularity increases define the evolution of degree distribution for π β [0, 4]: with the increase of π. This is justified by the fact that {π0 = 1π, π0 = 60, π 0 = 4, πΏπ = 0, πΏπ = π is proportional to the difference between within and 5, πΏπ = 0} Whereas the following parameters de- between edge linkage probabilities πππ ππ β πππ . ππ’π‘ fine its evolution for π β [5, 9]: {π0 = 10πΎ, π0 = 80, π 0 = 2, πΏπ = 0, πΏπ = β5, πΏπ = 0}. In- 6.3. Generating graphs with deletions deed, setting πΏπ to β5 indicates that the average degree decreases by a value of 5 between each pair of successive between snapshots graph snapshots. As mentioned in Section 5, the relative graph genera- Since real-world temporal graphs usually exhibit a power tion procedure may incur a number of edge deletions. law degree distribution, we also generated graphs with This can be cumbersome when the number of edges to an evolutionary Zipfian degree distribution composed delete for a given vertex is higher than the total sum of of 10 graph snapshots as shown in Figure 6. For this edges to delete from its neighboring vertices. We solve generated temporal graph, we set the following param- this problem by repeating the generation process until eters {π0 = 50π, π 0 = 2.5, π£ 0 = 10, π0πππ₯ = reaching an acceptable error threshold that is defined by 10, πΏπ = 50π, π = 0, πΏπ£ = 0, πΏππππ₯ = 5}. the EMD between the obtained and desired degree dis- By setting parameter πΏππππ₯ to 5, we consider that the tributions. Figure 12 shows the variation of the number maximum degree of nodes increases by a value of 5 be- of iterations and the execution time of the generation tween each pair of successive snapshots. Whereas, the process in function of the threshold error defined by the value of πΏπ indicates that 50π new nodes join the graph EMD. The obtained results show that our generation pro- between successive snapshots. These parameters reflect cedure converges rapidly to a tolerable threshold. That the growth of a large number of real-world temporal is, a threshold equals to 0.001 can be reached with only 7 (a) k=2 (b) k=4 (c) k=8 Figure 7: A visualization of the generated graphs with a hierarchical Figure 8: Modularity value in function community structure with parameters: π = 4, β = 2, π = 4 and a varying π. of parameter π ranging from 0 to 32. Figure 9: Execution time in Figure 10: ππ in function of the Figure 11: ππ in function of the function of the number of edges. number of edges. number of edges. 6.4. Accuracy of the generation procedure We quantify the accuracy of the generated graphs with the given parameters by computing the distance met- rics ππ and ππ defined in Section 5.4. We generated a sequence of π = 5 snapshots with the following parameters of Gaussian degree distribution: {π0 β {10π, 100π, 500π, 1π }, π0 = 30, π0 = Figure 12: The variation of the number of iterations and 2, πΏπ = 0, πΏπ = 10, πΏπ = 0}. Besides, we con- execution time in function of the EMD. trolled the community structure by fixing the following parameters of a hierarchical tree: β = 2, π = 2, π = 4, π = 0. iterations. By comparing the execution time of 1 iteration Figures 9, 10 and 11 plot the execution time, value of and 7 iterations, we can notice that the difference is lower ππ and ππ in function of the total number of created edges than the execution time of a single iteration. Indeed, the from applying the Gaussian distribution whose parame- execution time resulting from repeating the generation ters are given above. It is clear that the execution time is lower than the first iteration since the majority of mod- increases with the number of the generated edges. The ifications are added in the first iteration and only the distance metric, however, decreases implying that RT- remaining vertices whose linkage probability does not GEN approximates more accurately the given sequence satisfy the sum of the linkage probabilities of its neigh- of degree distribution and community structure as the boring vertices are considered in the next iteration. Note total number of edges grows. that these results are obtained from the generation of two successive snapshots with the following input parameters of a Gaussian degree distribution: {π0 = 500π, π0 = 7. Related work 60, π0 = 2, πΏπ = 0, πΏππ’ = β30, πΏπ = 0}. Synthetic graphs are important for developing bench- marks for assessing the performance of graph-oriented data platforms, when real graphs are not publicly avail- able or expensive to obtain. This has been the incentive updates. DSNG-M (dynamic social network generator to design models and generators, which are very use- based on modularity) [35] is a graph generator that is ful for evaluating the efficiency of graph management capable of generating temporal graphs by flipping the techniques as storage, query evaluation, indexing, parti- direction of edges of a given graph in order to satisfy a tioning, etc. randomly chosen modularity value assigned to a single An extensive work has been posited for the genera- graph snapshot. tion of static graphs. For instance, a special emphasis Some of the aforementioned graph generators produce has been placed to control the degree distribution of the temporal graphs with properties on nodes or vertices, generated graphs. In this context, many graph genera- which we do not address in this paper. None of them, tors were designed such as RTG [3], RMAT [4] and its however, allows the control of the evolution of the degree generalisation Kronecker [5] producing only Power-Law distribution given ground truth parameters that describe distributions. Since real-world graphs are not limited this evolution. This challenge lead to the elaboration of to power-law distributions, BTER [6] and its extension the RTGEN tool that allows the approximation of any Darwini [7] and GMark [8] produce graphs with any user given sequence of degree distributions that describes the defined distribution. evolution of the graph. We firmly believe, that the degree Another graph generation model producing a given distribution is a key feature that characterizes graphs, degree distribution is the CL model, forming the basis hence, it should not be disregarded in graph generation of the RTGEN tool. This model can be regarded as a tools. successor of the Erdos-RΓ©nyi model [23] that is designed for the generation of random graphs and a variant of the edge configuration model of Newman et al. [24]. 8. Conclusion It was extensively discussed and reused [17, 28, 29, 30]. In this paper, we addressed the generation of temporal We choose to extend this model for its simplicity and graphs that represents a critical challenge in the design scalability. of benchmarks specific for evaluating temporal graph Besides, a number of existing graph generators are management systems. That is, we proposed RTGEN, a community-aware in the sense that they collect vertices temporal graph generator that produces a sequence of that are more densely connected between each other graph snapshots whose community structure and evolu- than they are with the rest of the graph, in separate or tion of the degree distribution results from approximating overlapping subgraphs called communities [9, 10, 11]. user defined parameters. This generation procedure con- Although these generators preserve a given community sists of relatively generating a graph snapshot from a structure, they fail to produce a graph with respect to a previous one by applying a number of atomic graph op- given degree distribution. In this paper, we overcome this erations. Our generation technique relies on an Optimal limitation by allowing not only the generation of a given transport solver to approximate a user-defined sequence community structure but also a given degree distribution. of degree distributions while minimizing the number of Despite the extensive work posited on the genera- operations needed to transform one snapshot into its tion of non-temporal graphs, the generation of tempo- successor. We conducted a number of experiments that ral graphs has received much less attention. For in- validated the efficiency and accuracy of our generation stance, DANCer [31] is capable of generating temporal, procedure. In the future, we are planning to include a community-aware property graphs. It separates opera- dynamic community structure to RTGEN. Indeed, the tions performed on communities (macro operations) from communities found in real-world graphs are subject to operations performed on vertices and edges (micro opera- splits, merges, shrinks or expansions which should also tions). ComAwareNetGrowth [32] is a community-aware be modelled in synthetic graphs. graph generator that is capable of creating growth only graphs. APA (Attribute-Aware Preferential Attachment) [33] is a graph generator capable of creating growth-only References property graphs based on a non-conventional triangle closing. Instead of closing a triangle based on a uniform [1] J. R. Clough, T. S. Evans, Time and citation net- probability given as an input parameter, their proposed works, arXiv preprint arXiv:1507.01388 (2015). model consists of closing a triangle based on the simi- [2] M. D. Mueller, D. Hasenfratz, O. Saukh, M. Fierz, larity between the candidate edgeβs endpoints. While C. Hueglin, Statistical modelling of particle number GMark [8] generates static graphs, EGG (Evolving Graph concentration in zurich at high spatio-temporal res- Generator) [34] proposes an extension including evolv- olution utilizing data from a mobile sensor network, ing properties attached to each vertex. EGG, however, Atmospheric Environment 126 (2016) 171β181. disregard the topological changes to the network and [3] L. Akoglu, C. Faloutsos, Rtg: a recursive realistic narrow the temporal evolution of the graph to property graph generator using random typing, in: Joint European Conference on Machine Learning and 15882. Knowledge Discovery in Databases, Springer, 2009, [18] A. G. Labouseur, J. Birnbaum, P. W. Olsen, S. R. pp. 13β28. Spillane, J. Vijayan, J.-H. Hwang, W.-S. Han, The [4] D. Chakrabarti, Y. Zhan, C. Faloutsos, R-mat: A g* graph database: efficiently managing large dis- recursive model for graph mining, in: Proceedings tributed dynamic graphs, Distributed and Parallel of the 2004 SIAM International Conference on Data Databases 33 (2015) 479β514. Mining, SIAM, 2004, pp. 442β446. [19] M. Then, T. Kersten, S. GΓΌnnemann, A. Kemper, [5] J. Leskovec, D. Chakrabarti, J. Kleinberg, C. Falout- T. Neumann, Automatic algorithm transformation sos, Z. Ghahramani, Kronecker graphs: An ap- for efficient multi-snapshot analytics on temporal proach to modeling networks, Journal of Machine graphs, Proceedings of the VLDB Endowment 10 Learning Research 11 (2010) 985β1042. (2017) 877β888. [6] T. G. Kolda, A. Pinar, T. Plantenga, C. Seshadhri, A [20] C. Ren, E. Lo, B. Kao, X. Zhu, R. Cheng, On querying scalable generative graph model with community historical evolving graph sequences, Proceedings structure, SIAM Journal on Scientific Computing of the VLDB Endowment 4 (2011) 726β737. 36 (2014) C424βC452. [21] W. Aiello, F. Chung, L. Lu, A random graph model [7] S. Edunov, D. Logothetis, C. Wang, A. Ching, M. Ka- for power law graphs, Experimental Mathematics biljo, Darwini: Generating realistic large-scale so- 10 (2001) 53β66. cial graphs, arXiv preprint arXiv:1610.00664 (2016). [22] F. Chung, L. Lu, Connected components in ran- [8] G. Bagan, A. Bonifati, R. Ciucanu, G. H. Fletcher, dom graphs with given expected degree sequences, A. Lemay, N. Advokaat, gmark: Schema-driven Annals of combinatorics 6 (2002) 125β145. generation of graphs and queries, IEEE Transac- [23] P. Erdos, A. rΓ©nyi on random graphs i, Publ. Math. tions on Knowledge and Data Engineering 29 (2016) Debrecen 6 (1959) 290β297. 856β869. [24] M. E. Newman, D. J. Watts, S. H. Strogatz, Random [9] P. W. Holland, K. B. Laskey, S. Leinhardt, Stochastic graph models of social networks, Proceedings of the blockmodels: First steps, Social networks 5 (1983) national academy of sciences 99 (2002) 2566β2572. 109β137. [25] R. Flamary, N. Courty, A. Gramfort, M. Z. Alaya, [10] B. Karrer, M. E. Newman, Stochastic blockmodels A. Boisbunon, S. Chambon, L. Chapel, A. Corenflos, and community structure in networks, Physical K. Fatras, N. Fournier, L. Gautheron, N. T. Gayraud, review E 83 (2011) 016107. H. Janati, A. Rakotomamonjy, I. Redko, A. Rolet, [11] B. KamiΕski, P. PraΕat, F. ThΓ©berge, Artificial bench- A. Schutz, V. Seguy, D. J. Sutherland, R. Tavenard, mark for community detection (abcd): Fast ran- A. Tong, T. Vayer, Pot: Python optimal transport, dom graph model with community structure, arXiv Journal of Machine Learning Research 22 (2021) preprint arXiv:2002.00843 (2020). 1β8. URL: http://jmlr.org/papers/v22/20-451.html. [12] P. Holme, Modern temporal network theory: a [26] M. Bastian, S. Heymann, M. Jacomy, Gephi: an open colloquium, The European Physical Journal B 88 source software for exploring and manipulating (2015) 234. networks, in: Third international AAAI conference [13] E. Pitoura, Historical graphs: models, storage, pro- on weblogs and social media, 2009. cessing, in: European Business Intelligence and Big [27] V. D. Blondel, J.-L. Guillaume, R. Lambiotte, E. Lefeb- Data Summer School, Springer, 2017, pp. 84β111. vre, Fast unfolding of communities in large net- [14] Y. Miao, W. Han, K. Li, M. Wu, F. Yang, L. Zhou, works, Journal of statistical mechanics: theory and V. Prabhakaran, E. Chen, W. Chen, Immortal- experiment 2008 (2008) P10008. graph: A system for storage and analysis of tempo- [28] F. Chung, F. R. Chung, F. C. Graham, L. Lu, K. F. ral graphs, ACM Transactions on Storage (TOS) 11 Chung, et al., Complex graphs and networks, 107, (2015) 1β34. American Mathematical Soc., 2006. [15] U. Khurana, A. Deshpande, Storing and analyz- [29] A. Pinar, C. Seshadhri, T. G. Kolda, The similarity ing historical graph data at scale, arXiv preprint between stochastic kronecker and chung-lu graph arXiv:1509.08960 (2015). models, in: Proceedings of the 2012 SIAM Interna- [16] M. Haeusler, T. Trojer, J. Kessler, M. Farwick, tional Conference on Data Mining, SIAM, 2012, pp. E. Nowakowski, R. Breu, Chronograph: A ver- 1071β1082. sioned tinkerpop graph database, in: International [30] M. Winlaw, H. DeSterck, G. Sanders, An in-depth Conference on Data Management Technologies and analysis of the chung-lu model, Technical Report, Applications, Springer, 2017, pp. 237β260. Lawrence Livermore National Lab.(LLNL), Liver- [17] F. Chung, L. Lu, The average distances in random more, CA (United States), 2015. graphs with given expected degrees, Proceedings of [31] O. Benyahia, C. Largeron, B. Jeudy, O. R. ZaΓ―ane, the National Academy of Sciences 99 (2002) 15879β Dancer: Dynamic attributed network with com- munity structure generator, in: Joint European Conference on Machine Learning and Knowledge Discovery in Databases, Springer, 2016, pp. 41β44. [32] F. Gursoy, B. Badur, A community-aware network growth model for synthetic social network genera- tion, arXiv preprint arXiv:1901.03629 (2019). [33] A. Aghasadeghi, J. Stoyanovich, Generating evolv- ing property graphs with attribute-aware preferen- tial attachment, in: Proceedings of the Workshop on Testing Database Systems, 2018, pp. 1β6. [34] K. Alami, R. Ciucanu, E. M. Nguifo, Synthetic graph generation from finely-tuned temporal constraints., in: TD-LSG@ PKDD/ECML, 2017, pp. 44β47. [35] B. Duan, W. Luo, H. Jiang, L. Ni, Dynamic social networks generator based on modularity: Dsng- m, in: 2019 2nd International Conference on Data Intelligence and Security (ICDIS), IEEE, 2019, pp. 167β173.