1. INTRODUCTION

Time Traveling in Graphs using a Graph Database

Konstantinos Semertzidis

ksemer@cs.uoi.gr 0

Evaggelia Pitoura

pitoura@cs.uoi.gr 0 0 Computer Science & Engineering Department, University of Ioannina , Greece

Most graph structured data, such as data created from the web, social, citation and computer networks, evolve over time. In this paper, we assume that we are given the history of a graph in the form of a sequence of graph snapshots. Our goal is to de ne the different types of queries that one can ask regarding the graph history and present an initial approach to storing graph snapshots and processing historical queries in a native graph database. We de ne three general types of historical queries, namely, historical graph queries, historical time queries and historical top-k queries. We present two representations of graph snapshots that use either a single or a multi-edge approach. We evaluate the two approaches experimentally for various types of historical reachability queries.

Graph Database Historical Queries Reachability

1. INTRODUCTION

In recent years, increasing amounts of graph structured data are made available from a variety of sources, such as social, citation, computer and hyperlink networks. Almost all such real-world networks evolve over time. In this paper, we focus on structural evolution, that includes node and edge additions and removals.

There has been a lot of interest on analytical processing and mining of evolving graphs, including among others developing models [ 14 ], discovering communities [ 2 ], and computing measures such as PageRank [ 3 ]. There has been also research on building graph engines tailored to supporting analytical processing in dynamic graphs, such as Kineograph [ 5 ] and Chronos [ 8 ]. Instead in this paper, we focus on online query-based processing of dynamic graphs.

We assume that we are given an evolving graph, in the form of a sequence of graph snapshots corresponding to the state of the graph at different time points. Our goal is to study the types of queries that one can ask on these sequence of snapshots. We call such queries historical queries. Although graph query processing has received a lot of interest, querying an evolving graph has received limited attention.

First, we provide a taxonomy of historical queries. We introduce three categories of historical queries, namely, historical graph queries, historical time queries and historical topk queries. Historical graph queries are graph queries applied at past snapshots. For example, a historical graph reachability query may ask whether two nodes were reachable at some time interval in the past. Historical time queries are graph queries that ask about the time point or the time interval that a query had a speci c result. For example, a historical time reachability query may ask for the time point at which two nodes become reachable for the rst time. Finally, historical top-k queries ask for the nodes that had a property for the longest period of time. For example, a historical top-k reachability query may ask for the k pairs of nodes that remained connected for the longest interval.

We then address the problem of processing historical reachability queries in a native graph database. We study two approaches of storing graph snapshots, using either a single edge or multiple edges to represent connections that appear at different time points. We present algorithms for processing all different types of historical reachability queries using these approaches and experimentally compare their performance. Due to the native support of an edge-type traversal by the graph database, the multiple-edge approach outperforms the single-edge approach. Indexing boosts the performance of both approaches.

Previous work on historical queries is limited and includes only historical graph queries [ 1, 4, 9, 10, 11, 12, 15, 16 ] with the exception of [ 1 ] that studied one instance of a historical time query and the recent work in [ 17 ] that studies historical top-k graph patterns. Also, none of the previous approaches, with the exception of [ 4 ], is built on top of a native graph database. The work in [ 4 ] proposes a general approach for storing an evolving graph in a native graph database, specifically Neo4j, and experimentally tests a number of general graph queries that include past time instances. Our focus here is on structural updates and reachability queries. We include an experimental comparison with their model.

The rest of the paper is structured as follows. In Section 2, we formally de ne historical reachability queries. In Section 3, we introduce two approaches to storing the evolution of graphs in graph databases and algorithms for processing historical reachability queries. In Section 4, we evaluate our approaches by presenting experimental results. In Section 5, we present related work. Section 6 concludes the paper.

HISTORICAL QUERIES

Most real world graphs evolve over time. In this paper, we focus on structural evolution, that includes node and edge additions and removals. We assume that time is discrete and use successive integers to denote successive points in time. Let G = (V; E) be a graph where V is the set of nodes and E the set of edges. We use Gt = (Vt; Et) to denote the graph snapshot at time point t, that is, the set of nodes and edges that exist at time point t.

De nition 1 (evolving graph).

An evolving graph G[ti;tj] in time interval [ti; tj] is a sequence fGti , Gti+1, : : : ; Gtj g of graph snapshots.

One can think of two interpretations of time points. One interpretation is that of actual time, for example time point t may correspond to say February 9, 2015, 5:00am PDT. Another view is operational. In this case, a new time point is created, after a speci c number of inserts and deletes of nodes and edges has occurred. We use the term time granularity to refer to how often a new time point and the corresponding graph snapshot are created. In the case of actual time, granularity may range for example from milliseconds to years, whereas, in the case of operational time, granularity may be at the level of one or more operations.

For a node u (or, edge e), its lifespan denotes the set of time intervals during which u (resp. e) existed in an evolving graph. More formally, given an evolving graph GI = fGti , Gti+1, : : : , Gtj g, the lifespan, L(u), (resp. L(e)) of a node u (resp. edge e) is a set of intervals such that an interval [ti; tj] I belongs to L(u), (resp. L(e)), if and only if, for all ti tm tj, u 2 Vtm (resp. e 2 Etm ).

Let us now discuss the type of queries that one can pose on evolving graphs. We call such queries historical to distinguish them from queries that consider only the current graph snapshot, Gcurr. In the following, we rst present various types of general historical queries and then formally de ne these types for the case of historical reachability queries. Historical Graph Queries. The rst category of historical queries include queries that are similar to current graph queries but refer to past snapshots. Let Q be any type of graph query, e.g., a reachability, shortest distance, or graph pattern query. The corresponding historical query QH on an evolving graph G[ti;tj] is a pair (Q; IQ), where IQ is an interval [tl; tm], ti tl tm tj. Query QH is executed by applying query Q at all graph snapshots Gt, tl t tm of G[ti;tj], and returns as result an appropriately de ned aggregation of these results. For example, let Q be a query that asks for the shortest path distance between nodes u and v and let QH = (Q; [tl; tm]). The shortest path distance between nodes u and v is computed at all graph snapshots Gtl , Gtl+1, : : : Gtm and these tm - tl + 1 distances are appropriately combined to produce the result of QH .

There are three general ways of combining the results: (a) use all snapshots, (b) use only one of the snapshots, and (c) an intermediate case, in which we use r of the involved snapshots. For example, in the case of shortest path queries, we may combine the results by returning in case (a) the average shortest path distance, in case (b) the minimum shortest path distance and in case (c) the minimum shortest path distance in at least r of the snapshots. Let us now de ne formally the three ways of combining results in the case of reachability queries.

De nition 2 (historical reachability query). Given an evolving graph G[ti;tj], a time interval IQ = [tl; tm], ti tl tm tj and a pair of nodes v, u: (a) a conjunctive historical reachability query (Conj) returns true, if there exists a path from u to v in all graph snapshots Gt, tl t tm of G[ti;tj]. (b) a disjunctive historical reachability query (Disj) returns true, if there exists a path from u to v in at least one graph snapshot Gt, tl t tm, of G[ti;tj], (c) an at least r historical reachability query (Least) returns true, if there exists a path from u to v in at least r graph snapshots Gt, tl t tm, of G[ti;tj].

In the special case in which tl = tm, we just apply Q on the single past snapshot Gtl of G[ti;tj]. We call such queries stab (Stab) queries.

Historical Time Queries. Another type of queries pertinent to evolving graphs are queries that focus on the timing aspect. Such queries ask when an event happened. For example, depending on the type of query, we may ask when a speci c graph pattern occurred, when two nodes become reachable, or when their shortest path distance was equal to a given value.

We make a distinction between queries that ask (a) when is the rst time that an event happened, (b) what is the longest continuous interval that the event lasted, or (c) what is the total time that the event occurred. For reachability queries, we have the following types of historical time queries.

De nition 3 (historical time reachability query). Given an evolving graph G[ti;tj], a time interval IQ = [tl; tm], ti tl tm tj and a pair of nodes v, u: (a) a rst time reachability query (First) returns the smallest time point t, tl t tm, such that, there exists a path from u to v in graph snapshot Gt and there is no path from u to v in any Gt′ , tl t′ < t, (b) a longest continuous time reachability query (Intv) returns an interval [tk1 ; tk2 ], tl tk1 tk2 tm, such that, there exists a path from u to v in all graph snapshots Gt, tk1 t tk2 of G[ti;tj] and there is no longer interval that this holds, (c) a longest total time reachability query (Total) returns the time points t, tl t tm, such that there exists a path from u to v in Gt.

Historical Top-k Queries. The last type of historical queries are queries that ask for the top-k nodes that satisfy a condition for the longest time period. Depending on the query, this may mean what is the graph pattern that appears in the majority of graph snapshots, what are the pairs of nodes that remained reachable the longest, or what are the nodes whose distance was below some given value for most of the time points. Again, we make a distinction between queries that ask for nodes that satisfy the property for the longest duration, either (a) continuously or (b) in total. For reachability queries, this gives us the following two types of top-k queries.

De nition 4 (historical top-k reachability query). Given an evolving graph G[ti;tj], a time interval IQ = [tl; tm], ti tl tm tj and a pair of nodes v, u and an integer k, k > 0: (a) a top-k continuous reachability query (Topk i) returns a set of k pairs (u; v) of nodes u and v such that there exists a path from u to v in all graphs Gt in an interval of size d and there is no pair of nodes u′, v′, for which a path from u′ to v′ exists in all graphs in an interval of size larger than d, (b) a top-k total reachability query (topk t) returns the k pairs (u; v) of nodes u and v such that there exists a path from u to v in the largest number of graph snapshots Gt, tl t tm.

HISTORICAL QUERIES ON A GRAPH DATABASE

In this section, we rst present two different approaches of representing an evolving graph within a graph database and then algorithms for processing historical reachability queries using these representations.

As a running example, we use a bibliographical database where nodes correspond to AU T HORS and there is a PUBLISH edge between two AUTHORS if they are co-authors. Time granularity corresponds to years. 3.1

Single and Multi Edge Representation

In the single-edge approach, the lifespan of a node or edge is modeled using a label (i.e., an attribute or property) of the corresponding node or edge. Figure 1 shows an example of edge lifespans represented by a single label of type String. For example, the co-authorship between authors A, C in 2010, and 2012, is represented by the edge with label "2010,2012". Graph evolution can be tracked by processing the node and edge lifespan labels. Thus, to obtain the graph snapshot Gt for a time point t, we get all edges and nodes and then keep only those edges and nodes whose lifespan label contains t.

While in the single-edge approach there is at most one edge between two nodes, the multi-edge approach uses a different edge type between two nodes for each time point of the lifespan of the edge. For instance, in order to represent co-authorship between authors A, B, C in 2010, 2012, and 2014, we use three different labeled types of edges to connect A, B, and C. An example is shown in Figure 2 which depicts co-authorship between A, B and C in 2010, 2012, and 2014 using multiple type of edges. Multiple type of edges provide an efficient way of retrieving graph snapshots, since a graph database supports an efficient traversal of edges of a given type. Note that when a node u is connected by an edge of a type that corresponds to a time point t, it is obvious that u existed in t. Thus to get information about the lifespan of u, we access its edges, which is a fast process, since graph databases provide efficient indexing for this process. Author: A Author: B

PUBLISH Lifespan: 2010, 2012

PUBLISH

Lifespan: 2010, 2012

For both the single-edge and the multi-edge approaches, we explore whether an index on the lifetime property of each node improves the performance of retrieving a snapshot of the graph. We build an index within the graph database (shown in Figure 3) by creating a new node type TIME where each node of the given type has a unique value that corresponds to a speci c time point. A TIME node that denotes a time point t has relations to all AUTHOR nodes that existed at time t. To obtain all nodes that exist in an interval, we get the neighbors of the TIME nodes that correspond to this interval. 3.2

Historical Reachability Query Processing

All graph databases provide a BF Straversal method that visits all edges of a given type. This method starts from a source node, visits all its neighbors at distance 1 by traversing edges of the given type, then all its neighbors at distance 2, and so on.

TIME Year: 2010

TIME Year: 2011

TIME Year: 2012

TIME Year: 2013

TIME Year: 2014 Author: A PUBLISH Type: 2010

PUBLISH Type: 2012

PUBLISH Type: 2014 Author: C

Author: B PUBLISH

Type: 2010 PUBLISH Type: 2012 For answering if two nodes are reachable in the multiedge approach, we use this method. For example, to nd if two nodes are connected in a given interval I, we call the BF Straversal method for each time instant ti in I passing as an argument the edge type that corresponds to ti.

However, we cannot use the BF Straversal method for the single-edge approach, since we need to post-process the lifespan label of each edge to nd the time instants where the nodes were reachable. The BF Straversal method provided by the deployed graph database does not return the path visited but just the reachable nodes. Thus, we implemented our own BFStraversal algorithm which processes the edge lifespans.

The time index can be used similarly in both single-edge and multi-edge approaches to prune some computations. For example, for the Least query that asks whether nodes u and v are reachable at least r time points in I, we can rst check using the index whether both nodes were active at the same time points at least r times. If they were not active, we do not need to traverse the graph.

To summarize, (a) in the single-edge approach, we use our own implementation of the BF Straversal method and (b) in the multiple-edge approach, we invoke the provided BF Straversal method once for each edge type corresponding to the time points in the query interval.

Let us now discuss in more detail how to process the different types of historical reachability queries that ask for a path from u to v in interval I. A Conj query returns true, when there is no ti in I where v is not reachable from u. A Disj query returns true, when the rst ti in I is found in which v is reachable. Finally, for a Least query, we keep a counter c of the time points in I that v is reachable. If the counter reaches the given r, the query returns a positive answer, otherwise it stops when the sum of the counter and the remaining time instants is less than r.

Historical time reachability queries return time points or time intervals. For a First query, we return the rst ti when v is reachable. For a Intv query, we keep a counter c for the consecutive times that v is reachable and a max variable of the current maximum c. For each ti, that v is not reachable, c is reset and max is updated if c > max. The query stops and returns the interval corresponding to the max value, when c and the remaining time instants are less than max. Finally, for a Total query, we return all time points in I where v is reachable.

For the top-k historical reachability queries, we use the time index to obtain the top-k (active) nodes utop, that is the k nodes that exist for the longest period in I. In particular, for each node u we found in a time instant of I we increase a counter that denotes the number of instants that u is active. We also use a Min Heap structure to keep the top-k pairs of nodes that were connected for the longest interval (Topk i query) or for the largest number of time points (topk t query). The Min Heap stores each pair as a triple (u; v; value) where value is a counter that keeps the longest interval or the largest number of times that u and v were connected.

For a Topk i query, we start traversing from each top active node utop for each time instant in I. Intuitively, top active nodes have more active paths to other nodes. For each node v reachable from utop, we increase a counter C(v), each time v is reachable. We also keep a max(v) variable for the maximum C(v). Each time v is not a reachable from utop, C(v) is reset and max(v) is updated if C(v) > max(v). In the end of the traversal from utop, we insert in the Min Heap the triple (utop, v, max(v)) if max(v) is larger than the smallest element of the Min Heap. The query stops when the Min Heap has size k and its minimum element is larger or equal to the lifetime in I of the remaining top active nodes. Processing of a topk t query is similar. The only difference is that in the Min Heap, we store the number of time points instead of the duration of the interval. 4.

EVALUATION

As our graph database, we use Sparksee [ 6 ] that supports fast loading of the graph data and efficient operations that scan all edges in the graph. Sparksee is based on a compact representation that uses bitmaps and highly compressible data structures [ 7 ]. As our dataset, we used the whole DBLP dataset 1 for the interval [1958, 2015]. Each graph snapshot corresponds to a year in this interval.

We ran all queries on a system with a quad-core Intel Core i7-4770 3.4 GHz processor and 32 GB memory. We only use one core for all experiments.

Storage. We created two different graph database instances (GDBs). The rst graph database instance stores our dataset following the single-edge approach, whereas the second follows the multi-edge approach.

Table 1 shows the characteristics of each graph database instance. Single-edge differs from multi-edge in the number of edge types, since the second one uses a different edge type for each time point, which leads to a larger size. The index nodes size states the number of time points which in our case is 58 years. The edge types for single-edge are two, one type represents the PUBLISH edge and the other one the index edge. Multi-edge has 59 types, one type for the index edge and 58 types for each year in [1958, 2015].

Historical Query Processing. To evaluate the performance of both true and false queries, we generated for each query type 250 true and 250 false queries.

For Conj, Disj, Least, First, Topk i, and topk t, the query interval is I = [2005; 2014], for Stab, the time point is a random year within I, for Intv, Topk i the interval is I = [1958; 2015]. For Least, r was randomly chosen from [ 2, 9 ], while in Topk i, and topk t, k is equal to 10.

Table 2 reports the average time of true and false queries, on both graph instances. Also, we report the query time when the time index is used, as well the average execution time of Topk i, and topk t on the multi-edge GDB instance.

Queries that ask for events that have the longest duration, either continuously or in total require the most time to be processed, (since we need to check long time intervals) followed by queries that seek for the largest number of time points that something holds or the longest continuous interval that something holds.

Comparing the GDB instances, we notice that queries are faster when using multiple types of edges to represent the time points. This can be explained by the fact that using a single edge type requires the processing of the edge to nd if a time point is contained in the lifespan label.

A general remark is that false conjunctive queries are faster than true conjunctive queries, since processing stops as soon as a time point is found at which the two nodes 1http://dblp.uni-trier.de/ are not reachable. Analogously, true disjunctive queries are faster than false disjunctive queries, since processing stops as soon as a time point is found at which the two nodes are reachable. Also an observation that holds independently of the graph GDB used to evaluate queries is that the time index boosts query processing. We gain more speed in false queries than true ones, since we can prune traversals from nodes that are not active in a given time point or in the whole interval. For example, if we seek to nd the longest interval during which A and B were connected and there is not any time point that both authors were active, then a false answer is returned without executing any traversal.

Table 3 shows the top 10 authors pairs returned from Topk i, and topk t. Both queries return the same pairs because the authors of each pair were reachable in the whole interval (10 years). Thus, the authors that were connected for the longest interval are the same with the authors that are connected the most time in the past. We clarify that the top-k processing steps that were followed in Topk i, and topk t stop when they nd the rst k pairs that meet the requirement. Hence, ties, i.e., pairs that have the same property may not be reported.

Comparison with the Time-Varying Approach. Finally, we implemented the data model introduced in [ 4 ] and tested its performance for the Stab query. The approach 100000 10000 )s 1000 m ( e iTm 100 10 1

M+I

TMV M+I

TMV S+I TRUE

S+I FALSE in [ 4 ] introduces a speci c node to model the interaction between two nodes at a speci c time point. They also use a hierarchical index to support different time granularities, which is an issue that we do not address here, thus, we do not implement such an index.

We created a new type of node PAPER which denotes the interaction of publishing a paper and an AUTHOR node type for the authors. We connect with each PAPER its authors using an edge type PUBLISH. For the time index, we connect each AUTHOR and PAPER node to the time index nodes to which they belong. For example, if authors A and B wrote a paper P together in t then from the time index node that corresponds to t, we create edges that connect the time index node with the A, B and P nodes. To nd if two authors A and B are reachable, we have to obtain the authors from each PAPER node that A is connected and from them to obtain their co-authors. We repeat this process until we nd B. This process is costly for nding PAPER nodes that were active at a speci c time point, since we check the time index for each PAPER node to see if it was active in that time.

Running a Stab query using this approach requires 67.8 seconds for true queries and 58.5 seconds for false queries (shown in Figure 4), while our best approach requires only 0.43 and 8 milliseconds for answering true and false queries respectively. This can be explained by the fact that their model has not been designed for answering historical reachability queries but for querying the presence of objects in a number of given time points.

RELATED WORK

Although, graph data management has been the focus of much current research, work in processing historical queries is rather limited. The main focus of research on query processing in evolving graphs has been on efficiently storing and retrieving graph snapshots. In this paper, our focus is on de ning different types of historical queries and on processing them in a graph database.

Historical query processing in these approaches requires as a rst step reconstructing the relevant snapshots. Then, queries are processed through an online traversal on each of the snapshots. Various optimizations for reducing the storage and snapshot reconstruction overheads have been proposed. Optimizations include the reduction of the number of snapshots that need to be reconstructed by minimizing the number of deltas applied [ 11 ], using a hierarchical index of deltas and a memory pool [ 10 ], avoiding the reconstruction of all snapshots [ 15 ], and improving performance by parallel query execution and proper snapshot placement and distribution [ 13 ].

Recent work also addresses indexing for historical reachability queries through an index that contains information about strongly connected components membership at various time points [ 16 ], indexing for historical shortest path distance queries [ 1, 9 ] and indexing for durable pattern queries on historical graphs [ 17 ].

Finally, the only work on evolving graphs using a native graph database we are aware of is the work in [ 4 ] where the authors use Neo4j. Their evolving graph besides structural updates, also includes time-varying attributes. They do not provide a taxonomy of historical graph queries, instead they implement a number of historical queries that also include attributes.

CONCLUSIONS

In this paper, we studied queries on evolving graphs. We have proposed a taxonomy of historical graph queries, that are queries that involve past graph snapshots. We presented different approaches of storing and retrieving an evolving graph in a graph database by either using a single-edge with a lifespan attribute or multiple-edge types where each type corresponds to a different time point. We have proposed algorithms for evaluating historical reachability queries and evaluated our approaches using the DBLP dataset.

There are many possible directions for future work. One such direction is exploiting our algorithms towards answering other types of historical queries, such as shortest path ones. Another direction concerns historical queries that take into account attributes on the nodes and edges as well evolving graphs with time-varying attributes.

[1]

Takuya

Akiba , Yoichi Iwata, and

Yuichi

Yoshida . Dynamic and historical shortest-path distance queries on large evolving networks by pruned landmark labeling . In WWW , pages 237 { 248 , 2014 .

[2]

Lars

Backstrom , Daniel P. Huttenlocher, Jon M. Kleinberg , and Xiangyang Lan. Group formation in large social networks: membership, growth, and evolution . In KDD , pages 44 { 54 , 2006 .

[3]

Bahman

Bahmani , Abdur Chowdhury, and

Ashish

Goel . Fast incremental and personalized pagerank . PVLDB , 4 ( 3 ): 173 { 184 , 2010 .

[4]

Ciro

Cattuto , Marco Quaggiotto, Andre Panisson, and

Alex

Averbuch . Time-varying social networks in a graph database: a neo4j use case . In GRADES, page 11 , 2013 .

[5] Raymond

Cheng

, Ji Hong, Aapo Kyrola, Youshan Miao, Xuetian Weng, Ming Wu, Fan Yang, Lidong Zhou , Feng

Zhao , and Enhong

Chen . Kineograph: taking the pulse of a fast-changing and connected world . In EuroSys , pages 85 { 98 , 2012 .

[6]

Sparksee

Graph Database . http://www.sparsity-technologies.com/.

[7]

David

Dominguez-Sal ,

Urbon-Bayes , Aleix Gimenez-Van~o, Sergio Gomez-Villamor, Norbert Mart nez- Bazan , and Josep-Lluis Larriba-Pey. Survey of graph database performance on the HPC scalable graph analysis benchmark . In WAIM , pages 37 { 48 , 2010 .

[8]

Wentao

Han , Youshan Miao,

Kaiwei

Li , Ming Wu , Fan Yang, Lidong Zhou , Vijayan Prabhakaran, Wenguang Chen, and

Enhong

Chen . Chronos: a graph engine for temporal graph analysis . In EuroSys, page 1 , 2014 .

[9]

Wenyu

Huo and

Vassilis J.

Tsotras . Efficient temporal shortest path queries on evolving social graphs . In SSDBM, page 38 , 2014 .

[10]

Udayan

Khurana and

Amol

Deshpande . Efficient snapshot retrieval over historical graph data . In ICDE , pages 997 { 1008 , 2013 .

[11] Georgia

Koloniari

, Dimitris Souravlias, and

Evaggelia

Pitoura . On graph deltas for historical queries . WOSS , 2012 .

[12] Alan

Labouseur , Jeremy Birnbaum, Paul W. Olsen Jr, Sean R. Spillane , Jayadevan Vijayan, Jeong-Hyon Hwang , and Wook-Shin Han . The g* graph database: efficiently managing large distributed dynamic graphs . Distributed and Parallel Databases , 2014 .

[13] Alan

Labouseur , Paul W. Olsen, and Jeong-Hyon Hwang . Scalable and robust management of dynamic graph data . In VLDB , pages 43 { 48 , 2013 .

[14] Jure

Leskovec

, Jon M. Kleinberg , and Christos Faloutsos . Graphs over time: densi cation laws, shrinking diameters and possible explanations . In KDD , pages 177 { 187 , 2005 .

[15] Chenghui

Ren

, Eric Lo, Ben Kao, Xinjie Zhu, and Reynold Cheng. On querying historical evolving graph sequences . PVLDB , 4 ( 11 ): 726 { 737 , 2011 .

[16] Konstantinos

Semertzidis

, Kostas Lillis, and

Evaggelia

Pitoura . Timereach: Historical reachability queries on evolving graphs . In EDBT , pages 121 { 132 , 2015 .

[17]

Konstantinos

Semertzidis and

Evaggelia

Pitoura . Durable graph pattern queries on historical graphs . In ICDE , 2016 (to appear).