Simple and practically efficient fault-tolerant 2-hop cover labelings Feliciano Colella1 , Mattia D’Emidio1 , Guido Proietti2,3 1 Gran Sasso Science Institute (GSSI), L’Aquila, Italy. 2 Dipartimento di Ingegneria e Scienze dell’Informazione e Matematica, Università degli Studi dell’Aquila, Italy. 3 Istituto di Analisi dei Sistemi ed Informatica “Antonio Ruberti”, Consiglio Nazionale delle Ricerche, Rome, Italy. {mattia.demidio, feliciano.colella}@gssi.it, guido.proietti@univaq.it Abstract A labeling scheme is a method to retrieve some global prop- erty of a graph by exploiting (in a distributed fashion) the information stored, in the form of (short) labels, at its vertices. A classic question in the area is that of labeling the vertices in order to infer the distance between any pair of them, or alternatively to retrieve a corresponding shortest path. On such purpose, among the various solutions, a 2-hop cover distance/path-reporting labeling scheme is based on the idea of representing the shortest paths in a graph as the concatenation at an intermediate so-called hub vertex of the two shortest paths emanating from the corresponding end vertices. The difficult point here is to find a small-size set of hub vertices covering a shortest path for each pair of vertices in the graph, since in this way the corresponding labeling will be compact, but practical efficient solutions are known in the literature. In this paper, we present a simple and efficient way of enriching this successful scheme in order to make it resilient to the failure of a subset of at most k ≥ 1 edges in the graph, by exploiting the notion of so-called edge-independent spanning trees. Depending on k, we provide different strategies and analyze the corresponding performances. In addition, to assess the practical effectiveness of our method, we provide an exper- imental evaluation, conducted for the case k = 1. Our data confirms the new method to be really competitive when compared with the per- formance of the benchmark non-fault-tolerant scheme, equipped with a point-of-failure rerouting policy. 1 Introduction A shortest-path query asks the shortest path between two vertices in a graph. Without doubt, answering to this kind of queries is one of the most fundamen- tal operations on graphs, as it has a wide range of applications, e.g. social net- works analysis [20], route planning in road networks [12, 13], intelligent transport systems [5], routing in communication networks [9, 11]. Baseline methods, e.g. Breadth First Search or Dijkstra’s algorithm, yield unsustainable query times to answer such requests in graphs arising from the mentioned applications, which tend to be huge. To overcome this limit, tens of smarter approaches have been proposed in the last decade [2, 12] which adopt the common strategy of preprocessing the graph in advance to speed-up the query phase. Among them, the 2-hop cover distance/path-reporting labeling scheme (2–HCL from now on) is based on the idea of representing the shortest paths of a graph as the concatenation at an intermediate so-called hub vertex of the two shortest paths emanating from the corresponding end vertices. Roughly speaking, the label at each vertex will con- tain the length and the next-hop vertex of a shortest path towards each hub vertex, and to retrieve a shortest path between two vertices it will suffice to join their labels and find the common hub that minimizes the sum of the two distances. The difficult point here is to find a small set of hub vertices covering a shortest path for each pair of vertices in the graph, since in this way the cor- responding labeling will be compact. Indeed, finding a minimum-size set of hub vertices is known to be NP-hard [6]. Nevertheless, in the practice, 2–HCL is currently considered the best scheme since: i) it is general (it can be applied on all kinds of graphs, including directed weighted ones [10, 13]); ii) even in networks with tens of billions of vertices, it combines extremely low query times with affordable space overhead and reason- able preprocessing effort [1]; iii) it is suited to be used in distributed settings [21]; iv) several practical approximation algorithms are known [6] - among them, the recent pruned landmark labeling (pll) [1] achieves considerably better scalability than other methods. A further level of complexity, when dealing with this kind of approaches, is represented by the fact that, in general, real-world networks are prone to failures. In this case, in fact, if a failing edge is on a queried shortest path, then a distance query will return an underestimated values w.r.t. a true shortest path in the damaged graph, while a path-reporting query would actually return an unfeasible path! In general, reprocessing the graph from scratch after each failure is not a reasonable recovery strategy, since it will unavoidably require a non-negligible amount of computational time [10]. Therefore, a desirable objective becomes that of devising a so-called fault-tolerant scheme, i.e. an approach that allows to answer to queries even in presence of a number of (transient) graph failure operations (e.g., edge or vertex removals). This is usually achieved by suitably enriching the underlying data structure and by accordingly modify the query strategy to consider such enrichment. However, if the paths to be reported are constrained to be shortest also in presence of failures, then this might result in storing an unpractical amount of additional information [17]. Hence, in this case, a reasonable compromise is that of relaxing the optimality constraint and devise more compact schemes that return approximate distances (shortest paths, resp.), i.e. distances (paths, resp.) that are guaranteed to be longer than the corresponding true distances (shortest paths, resp.) by at most a given stretch factor, for any possible kind of failure that has to be handled. Examples of this strategy in the literature are various [3, 8], where many structures have been studied in the fault-tolerant approximate setting. However, to the best of our knowledge, no previous work has ever investigated 2–HCL in such a setting. Our Contribution. In this paper, we move forward in this direction and present a simple and efficient way of enriching a 2–HCL in order to make it resilient to the failure of a subset of any k edges. In more details, given an n-vertex and m-edge graph G, we show that such an enrichment can be done by exploiting the notion of so-called edge-independent spanning trees. More precisely: – for k = 1, 2, 3 and G at least (k + 1)-edge connected, the enrichment takes O(m + n), O(n2 ) and O(n3 ) additional time, respectively, and O(n) addi- tional space; – for k > 3 and G at least (2k + 2)-edge connected, the enrichment takes O(k 2 n2 ) additional time and O(kn) additional space. Even though the above solutions have in principle replacement paths which can be linearly (in n and k) stretched (as compared to new shortest paths in the surviving graph), in practice the situation is much better. To assess that, we provide an experimental evaluation, conducted for the case k = 1, showing the quality of our approach in terms of stretch, space requirements and prepro- cessing/query time. Our data shows the new method to be really effective and competitive when compared with the performance of the benchmark non-fault- tolerant scheme, equipped with a point-of-failure rerouting policy. Notice that our results can be extended to weighted graphs as well, and that our solutions are able to answer to the most general path-reporting queries. 2 Background In what follows, we provide the notation and the basic definitions that are used throughout the paper. Given an unweighted undirected graph G = (V, E) with |V | = n vertices and |E| = m edges, we denote by e = (u, v) an edge e ∈ E connecting two vertices u, v ∈ V . Given an edge e ∈ E of G, we denote by G − e (G − F , resp.) the graph obtained from G by removing an edge e (a set of G edges F ⊆ E, resp.). Moreover, we call πxy a shortest path between two vertices G G x, y ∈ V and dxy (or |πxy |) its length (a.k.a. the distance between x and y in G). G G G In addition, given a shortest path πxy we denote by p(x, πxy ) (p(y, πxy ), resp.) G G the predecessor of x (y, resp.) within πxy , i.e. the vertex v such that (x, v) ∈ πxy G ((v, y) ∈ πxy , resp.). Given two simple paths a and b, we denote by p = a ⊕ b the path p obtained by concatenating them. Furthermore, we say that a graph G = (V, E) is k edge connected if |E| > k and there does not exist a set of edges, of cardinality at most k−1, whose removal disconnects the graph. Given a graph G = (V, E) and a distinguished root vertex r ∈ V , then IT = {T1 , T2 , . . . , Tq } is a collection of q edge-independent spanning Ti T trees of G if and only if, for each vertex v ∈ V , and for each i 6= j, πrv and πrvj are pairwise edge-disjoint paths, i.e. they do not share any edge [15]. In what follows, we give a slightly modified definition of 2–HCL, suited for answering to path-reporting queries, inspired by that of [6] for distance queries. Definition 1 (2–HCL). Let G = (V, E) be an undirected graph. Let the path label P (v) of a vertex v ∈ V be a set of tuples of the form hu, dG G uv , p(v, πuv ))i and let P (G) = {P (v)}v∈V (denoted as P when the graph is clear from the context) be a collection of such labels. Let a path query operation on P (G) from a vertex s to a vertex t be defined as: G G G G ( hh, dhs , p(s, πhs )i | h = argmin {dvs + dvt } if P (s) ∩ P (t) 6= ∅ pQuery(s, t, P ) = v∈P (s)∩P (t) ∅ otherwise where, for the sake of brevity, we denote by P (s) ∩ P (t) the set of vertices {v|hv, dG G G G vs , p(v, πvs )i ∈ P (s) ∧ hv, dvt , p(v, πvt )i ∈ P (t)}. Then, P (G) is a 2– HCL of G if and only if, for any pair of vertices s and t of G it satisfies the cover property [6], i.e., if s and t are connected in G and h is the first argument returned by pQuery(s, t, P ), then dG G G st = dhs + dht and h is called a hub that covers s and t, otherwise pQuery(s, t, P ) = ∅. Given the above, it is clear that a 2–HCL P (G) of a graph G can be used G also to retrieve an entire shortest path πst (i.e. the corresponding set of ver- tices/edges) for any given pair of vertices s, t of G by the procedure shown in Algorithm 1. The routine essentially builds incrementally two shortest paths, one from s toward a hub h and the other from t toward the same hub h, and eventually combines them into a path from s to t which is exactly the shortest between s and t, by the cover property. The two paths are computed by the repeated application of the above path query, i.e. by concatenating edges con- necting vertices to the corresponding predecessors. For the sake of clarity, we represent the shortest path as a doubly linked list of edges. Algorithm 1: Computation of a shortest path via P (G). Input : P (G), a pair of vertices x and y of G. G Output: A shortest path p = πxy between x and y. 1 p ← ∅; px ← ∅; py ← ∅; c1 ← x; c2 ← y; 2 if c1 = c2 then return p; 3 l1 ← pQuery(c1 , c2 , P ); 4 l2 ← pQuery(c2 , c1 , P ); 5 h ← l1 .f irst /* h equals l2 .f irst by construction */ 6 while c1 6= l1 .f irst ∨ c2 6= l2 .f irst do 7 if c1 6= l1 .f irst then 8 px .pushF ront((c1 , l1 .third)); 9 c1 ← l1 .third; 10 l1 ← pQuery(c1 , h, P ); 11 if c2 6= v then 12 py .pushBack((c2 , l2 .third)); 13 c2 ← l2 .third; 14 l2 ←← pQuery(c2 , h, P ); 15 p = px ⊕ py ; 16 return p Problem Statement Let G = (V, E) be an unweighted and undirected graph. We aim at computing what we call a k-edge-fault-tolerant path-reporting labeling scheme (k-EFTPL, in short), i.e. a labeling P (G, k) that, for any pair of vertices G s, t of G, can be used to retrieve: i) a shortest path πst between s and t in G; G−F ii) a backup path γst between s and t, i.e. a simple path (if any) connecting s and t in G − F for any F ⊆ E such that |F | ≤ k. The stretch factor of a k-EFTPL P (G, k) is defined as the maximum ratio, for any pair of vertices s, t and for any F ⊆ E such that |F | ≤ k, between the length of a backup path and that of the corresponding shortest path in G − F , i.e.: G−F |γst | G−F α(P (G, k)) = max |π G−F | . Note that, whenever πst does not exist s,t∈V,∀F ⊆E:|F |≤k st |γ G−F | (i.e. s and t are disconnected in G − F ), we assume |πst G−F | to be 1. A k-EFTPL st P (G, k) such that α(P (G, k)) = 1 is called exact while is called approximate otherwise. 3 A linear-stretch approximate k-EFTPL In this section, we show how to build an approximate k-EFTPL P (G, k) for a given input graph G. Our approach essentially consists in a suitable enrichment of a 2–HCL P (G), based on the use of edge-independent trees, as follows. Given G, we start by computing P (G) via a standard method, e.g. pll [1] and, to be able to “resist” to the failure of k edges, we compute k+1 edge-independent trees T1 , T2 , . . . Tk+1 of G and root them all at a same distinguished vertex, say r ∈ V . Note that, such trees guarantess that, should any k edges e1 , e2 , . . . , ek of the graph fail, for any vertex v ∈ V there will exist (at least) one value i ∈ [1, k + 1] such that v and r are connected, in Ti , by a simple path not containing any of the e1 , e2 , . . . , ek edges. This can be trivially proved by contradiction, by elaborating on the well-known results from graph theory, summarized in the following. Theorem 1 ([16]). Let IT be a graph obtained by merging a collection of q + 1 edge-independent spanning trees {Ti }i=1,2,...,q+1 of a graph G. Then, IT is (q+1)- edge connected. Hence, we use such rooted trees to build a set of what we call tree labels that are assigned to the vertices in order to allow the retrieval of a backup path in presence of k edge failures. Intuitively, such labels store the predecessor of a vertex within a shortest path toward the root r, for each considered tree. Depending on the value of k, we make use of different approaches to build the k + 1 edge-independent trees. Clearly, a necessary condition to guarantee that any pair of vertices remains connected even in presence of k edge failures is that G is at least (k + 1)-edge connected. Then, if k ∈ {1, 2, 3} and G is (k + 1)-edge connected, we can compute k + 1 edge-independent trees in polynomial time, with a time complexity of O(m + n), O(n2 ) and O(n3 ), respectively [4, 7, 15]. On the other hand, if k > 3 and G is h-edge connected, with k + 1 ≤ h ≤ 2k + 1, then to the best of our knowledge it is not possible to build k + 1 edge- independent trees in polynomial time. However, if G is at least (2k + 2)-edge connected, then we can build k + 1 edge-disjoint spanning trees of G, which are clearly also edge-independent, by using the approach proposed in [18], running in O(k 2 n2 ) time. From now on, we will suppose to have at our disposal k + 1 edge-independent trees of G, built as aforementioned. Hence, for each vertex, there will always be a tree label containing a viable predecessor to be used to reach r, even in presence of k arbitrary edge failures in G. More in details, given a vertex v ∈ V and an independent tree Ti of G (rooted at a vertex r ∈ V ), we define a tree label entry Ti to be a pair hr, p(v, πvr )i. Such pair essentially encodes for v a path to follow toward the root r of the i-th tree in the form of a predecessor within the tree. Tree label entries are stored within what we call a tree label T (v, k),4 for any v ∈ V . Hence, any vertex is assigned two labels, namely P (v) and T (v, k), to be used in combination in order to be tolerant to the failure of k edges. We call the set {T (v, k)}v∈V a k-tree labeling of G (denoted as T (G, k) or T when clear from the context) while we define P (G, k) to be the union of P (G) and T (G, k). Moreover, we call the chosen r the root of P (G, k). The routine for building the k-tree labeling of a graph G, and therefore P (G, k), is summarized in Algorithm 2. We represent T (v, k) again as a doubly linked list, since it will be convenient for defining a suited query algorithm. Now, we can easily bound the computational complexity of Algorithm 2 as follows. Algorithm 2: Building a k-EFTPL P (G, k) of a graph G = (V, E) Input : Graph G = (V, E), parameter k ≥ 1, a 2–HCL P (G) of G Output: A k-EFTPL P (G, k) of G 1 Compute k + 1 edge-independent trees T1 , T2 , . . . Tk+1 ; 2 Choose a distinguished vertex r ∈ V , root all k trees in r; 3 T (G, k) ← ∅; 4 foreach v ∈ V do T 5 foreach i = 1, 2, . . . , k + 1 do T (v, k).push_back(hr, p(v, πvri )i); 6 T (G, k) ← T (v, k); 7 P (G, k) = P (G) ∪ T (G, k); 8 return P (G, k) Lemma 1. Algorithm 2 takes O(∆+k|V |) time, where ∆ is the worst-case time for computing k + 1 edge-independent trees. Proof. Clearly, performing line 1 takes O(∆) while the execution of lines 2-3 takes constant time. Moreover, the outer loop of line 4 is executed exactly |V | times while the inner loop of line 5 is performed exactly k + 1 times. Since the operation of line 5 can be done in O(1) time,5 the claim follows. t u Lemma 2. Algorithm 2 builds a k-EFTPL P (G, k) whose overall size, in terms of label entries, is O(|P (G)| + k|V |). Proof. To prove the statement, it is enough to bound the size of T (G, k), since the size of P (G, k) is trivially upper bounded by the sum of the sizes of P (G) and T (G, k). At line 3, T (G, k) is empty, and exactly one entry per value of i is added to the tree label T (v, k) of each vertex v ∈ V at line 5. Since i ranges from 1 to k, the claim follows. t u 4 The k parameter stands for the ability of resisting to k edge failures by k + 1 trees. 5 By storing each Ti as an array of predecessors. Given the construction of Algorithm 2, we now provide a query algorithm that can be used for either retrieving a true shortest path, when the graph is not subject to failures, or a backup path otherwise. The procedure, summarized in Algorithm 3, roughly works as follows. Given P (G, k) and a pair of vertices s and t, the routine starts by trying to retrieve the true shortest path, by relying on P (G) and on the corresponding pQuery strategy. As in the previous case, the shortest path is built incrementally and by combining two separate parts. If no failures are currently present in the network, then the algorithm essentially coincides with Algorithm 1 and returns the shortest path. Otherwise, if a number k 0 < k + 1 of failures are occurring on G, then two cases can occur: either the shortest path from s to t includes at least one of such failures or not. To determine that, the algorithm needs essentially to check, at every step of the construction, whether the edge leading to the predecessor on the original shortest path has failed or not. In the negative case, clearly, the algorithm proceeds as nothing happened, i.e. as Algorithm 1, by repeating the application of pQuery. In the affirmative case, instead, the algorithm starts looking for backup paths by using available tree label entries, which essentially encode edges connecting to predecessors in the independent trees (see Lines 4-9 of Algorithm 4). Since a number of failures can affect different trees, tree label entries are scanned sequentially until one non-failed edge if found. Again two sub-paths are built and then, eventually, combined. Algorithm 3: Query algorithm for a k-EFTPL P (G, k). Input : P (G, k), two vertices x and y. Output: A simple path p between x and y. 1 p ← ∅; px ← ∅; py ← ∅; c1 ← x; c2 ← y; 2 l0 ← the 0-th tree label of T (z, k); /* Any tree label entry, first field is always r */ 3 r ← l0 .f irst; 4 if x = y then return p; 5 l1 ← pQuery(c1 , c2 , P ); 6 h ← l1 .f irst /* h equals l2 .f irst by construction */ 7 px ← NavigateFrom(x, h, r); py ← NavigateFrom(y, h, r); 8 if px .tail() 6= py .tail() then 9 if px .tail() 6= h then ForceToRoot(px , r); 10 else ForceToRoot(py , r); 11 else p ← px ⊕ reverse(py ); 12 for e ∈ p do 13 if |e| > 1 then p ← p \ e /* Removes any duplicates */ ; 14 return p; Note that, in this case, the two sub-paths might be not connected since one of the two searches might end up in the original hub, while the other in the root, and viceversa. This happens when one of the two sub-paths reaches the original hub while the other does not. In this case, Algorithm 5 takes care of finding the missing part. In general, we are able to prove the following. Lemma 3. Algorithm 3 always computes a backup path connecting x to y, for any pair x, y ∈ V . Due to space contraints, we postpone the proof of Lemma to a full version of the paper. Theorem 2. Algorithm 3 builds a k-EFTPL P (G, k) with α(P (G, k)) ∈ O(kn). Algorithm 4: Procedure NavigateFrom(z, h, r) Input : P (G, k), a vertex z, its hub h, the root of the trees r. Output: A simple path pz connecting z to either h or r. 1 pz ← ∅; failed ← f alse; i ← 0; /* i-th edge-independent tree in use */ 2 lz ← pQuery(z, h, P ); 3 while z 6= h ∧ z 6= r do 4 if failed = true then /* failure previously encountered, keep going toward r */ 5 Let li be the i-th tree label of T (z, k); 6 if (z, li .second) ∈ E then 7 pz .pushF ront((z, li .second)); 8 z ← li .second; 9 else i ← i + 1; /* change tree */ 10 else 11 if (z, lz .third) 6∈ E then failed ← true; /* failure encountered */ 12 else /* keep navigating toward h */ 13 pz .pushF ront((z, lz .third)); 14 z ← lz .third; 15 lz ← pQuery(z, h, P ); 16 return pz ; Proof. The length of any path found by Algorithm 3 from any node v to the root r is always upper bounded by diammax = maxi=1,...,k+1 2 · diam(Ti ), where diam(Ti ) is the diameter of the i-th edge-independent tree Ti . Since we could find a failed edge up to k times (one for each tree Ti ), we could have to pay diam(Ti ) each time. Therefore we could have k · diam(Ti ), hence, the claim follows. t u Algorithm 5: Procedure ForceToRoot(p, r). Input : P (G, k), a path p, a vertex r. 1 i ← 0; z ← p.tail(); 2 while z 6= r do 3 Let li be the i-th tree label of T (z, k); 4 if (z, li .second) ∈ E then 5 p.pushF ront((z, li .second)); 6 z ← li .second; 7 else i ← i + 1; /* change tree */ 4 Experimental evaluation In this section we provide an experimental evaluation of our approach. In par- ticular, we implemented, in C++: i) the pll approach of [1] for building 2–HCL; ii) Algorithm 2 for building an approximate k-EFTPL, for k = 1. Note that, details on how to build a 1-EFTPL will be given in an extended version of the paper due to space constraints. We embedded all our code within NetworKit 6 , a well-known open-source framework for large-scale network analysis. Then, to assess the performance of our method, we established the following experimental process. First, inspired by other papers on the subject (e.g. [10, 17]), we selected a meaningful set of real-world networks of various types (including road graphs, 6 see networkit.iti.kit.edu peer to peer networks, and AS communication networks7 ) and treated them as undirected, unweighted graphs. Details about inputs are reported in Table 1 where we show, for each network, the corresponding type, and the number of vertices and edges of the largest 2-edge connected component (second, third and fourth columns, resp.). Then, for each network, we ran both pll and Algorithm 2 space per time (seconds) vertex (bytes) Network Type |V | |E| pll khl P (G) T (G, 1) Barabasi (BAR) Synthetic 365 488 734 347 68 500 6.41 5 198 18 Brightkite (BRI) Social 33 187 188 577 5 680 1.75 2 326 18 CA-GrQc (CAG) Collaboration 2 651 10 480 499 0.03 1 271 18 CA-HepTh (CAH) Collaboration 5 898 20 983 1 130 0.03 2 085 18 Caida (CAI) Autonomous Sys. 6 855 13 341 1 650 0.02 1 412 18 com-youtube (CYT) Social 452 060 2 295 072 66 200 5 090 4 136 18 Denmark (DNK) Road 252 416 320 914 147 000 0.75 13 152 18 flickredges (FED) Social 105 512 2 316 450 2 180 165 12 415 18 ForestFire (FF2) Synthetic 1 178 888 13 849 776 212 000 16 100 17 983 18 flickrlinks (FLI) Social 704 985 14 501 930 125 000 17 700 12 525 18 oregon (ORE) Autonomous Sys. 7 218 19 448 638 2.54 286 18 skitter (SKI) Autonomous Sys. 1 443 769 10 830 987 197 000 11 100 10 666 18 WikiVote (WIK) Hyperlinks 4 786 98 456 751 1.12 1 890 18 WikiTalk (WIT) Collaboration 622 315 2 889 703 47 700 38 700 2 951 18 Table 1: Input details and performance of pll and khl w.r.t. time and space, resp. (denoted by khl in the following) and measured the computational time spent, in seconds, to build P (G) and a 1-EFTPL P (G, 1), resp., which is shown in Table 1 (4th and 5th column, resp.). For the sake of completeness, we also measured the average size of P (G) and the overhead required for storing T (G, 1), in bytes, per vertex. Such values are reported in Table 1 as well (6th and 7th column, resp.). Finally, we selected 10 000 pairs of vertices of the graph and, for each pair s, t, we executed the following steps:8 i) we randomly removed an edge e of the graph which, with probability p = 5/100 is chosen to be on the shortest path between s and t in G;9 ii) we ran, on G − e, a modified version of Algorithm 1 that includes a point-of-failure (POI) rerouting procedure, with s and t as input, and measured both the time taken and the length of the output backup path; notice that, a POI rerouting procedure is a common backup method to deal with failures, that essentially retrieve a feasible solution starting from the point where a failure was encountered [19]; in our case, when one of the two sub-paths computed by Algorithm 1 happens to encounter a failed edge, say for instance the path from s fails at edge e = (x, y), then we build a BFS in G − e rooted at x and we stop as soon as t is reached; iii) we ran Algorithm 3 on G − e with s and t as input, and measured both the time taken and the length of the output backup path; iv) we compute the shortest-path between s and t on G − e (e.g. by means of a BFS) and measure the length of the true shortest path between s and t in G − e. We accordingly compute, for pair s, t, the stretch of the path computed by ii) and that of the path computed by iii). For all the considered performance 7 see http://snap.stanford.edu 8 All our software has been compiled with GNU g++ v.5 (O4 opt. level) under Linux and executed on an Intel Xeon c CPU 9 the choice of p is inspired by previous works on failures in networks [14] metrics, we computed average values over the 10 000 executions. The results concerning the query time (in seconds) are reported in Fig. 1b while those related to the average stretch are shown in Fig. 1a. We have omitted the stretch of the point-of-failure strategy, since it was always negligible, close to 1. Regarding the root vertex r of the P (G, 1), note that, from a worst-case point of view, the stretch provided by the approach is independent from the specific chosen vertex, hence it could be selected at random. However, it is clear that the best choice of the root from a practical viewpoint should be based on some centrality measure, since this would tend to induce short backup paths [13]. Unfortunately, computing fine-grained centrality measures, such as, e.g. betweenness centrality, can be rather computationally expensive and therefore can (heavily) impact on the preprocessing time of our approach. Thus, when needed, we choose r to be the vertex of highest degree, since vertex degree has been shown to be a quite robust centrality measure that can be computed in linear time [10]. For the sake of completeness, we also considered an approximation of the betweenness centrality based on sampling, which however always produced worse results in terms of stretch, and thus we omitted them. 1.5 1.45 1.33 1.37 1e+05 1.25 1.23 1.2 1.24 1.24 1.23 1.26 1.2 1.22 1.16 1.1 1.04 1.0 1e+03 0.5 1e+01 0.0 R I G H I T K D 2 I U E I K T R I G H I T K D 2 I U E I K T BR CA FL SK BR CA FL SK FF FF CY WI CY WI DN OR WI DN OR WI BA CA FE GN BA CA FE GN CA CA (a) (b) Figure 1: (a) Estimation of α(P (G, 1)) based on 10 000 measures; (b) Query time of Algorithm 3 (lighter gray) versus query time of Algorithm 1 with POI rerouting. Analysis. The main outcome of our experimental evaluation concerns both aver- age stretch, provided by the new structure P (G, 1), and query time. Regarding the former, despite the O(kn) worst case upper bound of Lemma 2, the P (G, 1) data structure shows a very good behavior in practice, as the measured average stretch is always pretty close to one, and only slightly larger than the stretch exhibited by the rerouting method, which can be considered a benchmark strat- egy w.r.t. quality of reported shortest paths. Regarding the latter, Algorithm 3 results in being extremely (orders of magnitude) faster (see Fig. 1b) w.r.t. the rerouting-based strategy, which require to perform BFSs every time a failed edge is met. This suggests the method to be a viable option in practice whenever tran- sient failures need to managed. This comes at the price of: i) a negligible time overhead for enriching the P (G) data structure (i.e. for performing khl), which can be hundreds of times smaller than the time for executing pll; ii) a constant amount of extra space (few bytes per vertex) for storing T (G, 1), which can be considered negligible w.r.t. the space per vertex required for storing P (G). 5 Conclusion and future work In this paper, we have studied 2-hop cover distance/path-reporting labeling schemes in the fault-tolerant approximate setting. In particular, we have presented a sim- ple and efficient way of enriching this successful scheme in order to make it re- silient to the failure of a subset of k edges in the graph, by exploiting the notion of so-called edge-independent spanning trees. Depending on k, we have provided different strategies and analyzed the corresponding performance. In addition, to assess the practical effectiveness of our method, we have provided an exper- imental evaluation, conducted for the case k = 1, whose results show the new method to be really competitive when compared with the performance of the benchmark non-fault-tolerant scheme, equipped with a point-of-failure rerouting policy. There are several research directions which might deserve further inves- tigation in this area. For instance, it could be interesting to study whether it is possible to design a scheme with a better guarantee on the stretch w.r.t. the one proposed in this paper. Another interesting direction could be that of extending the experimental evaluation of our new method to higher values of k. Bibliography [1] T. Akiba, Y. Iwata, and Y. Yoshida. Fast exact shortest-path distance queries on large networks by pruned landmark labeling. In Proc. of ACM SIGMOD International Conference on Management of Data (SIGMOD 2013), pages 349–360. ACM, 2013. [2] H. Bast, D. Delling, A. Goldberg, M. Müller-Hannemann, T. Pajor, P. Sanders, D. Wagner, and R. F. Werneck. Route planning in transporta- tion networks. arXiv preprint arXiv:1504.05140, 2015. [3] D. Bilò, L. Gualà, S. Leucci, and G. Proietti. Multiple-edge-fault-tolerant approximate shortest-path trees. In Proc. of 33rd Symposium on Theoretical Aspects of Computer Science (STACS), volume 47 of LIPIcs, pages 18:1– 18:14. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2016. [4] J. Cheriyan and S. Maheshwari. Finding nonseparating induced cycles and independent spanning trees in 3-connected graphs. Journal of Algorithms, 9(4):507–537, 1988. [5] A. Cionini, G. D’Angelo, M. D’Emidio, D. Frigioni, K. Giannakopoulou, A. Paraskevopoulos, and C. D. Zaroliagis. Engineering graph-based models for dynamic timetable information systems. In 14th Workshop on Algorith- mic Approaches for Transportation Modelling, Optimization, and Systems (ATMOS), volume 42 of OASICS, pages 46–61. Schloss Dagstuhl - Leibniz- Zentrum fuer Informatik, 2014. [6] E. Cohen, E. Halperin, H. Kaplan, and U. Zwick. Reachability and distance queries via 2-hop labels. SIAM J. Comput., 32(5):1338–1355, 2003. [7] S. Curran, O. Lee, and X. Yu. Finding four independent trees. SIAM J. Comput., 35(5):1023–1058, 2006. [8] A. D’Andrea, M. D’Emidio, D. Frigioni, S. Leucci, and G. Proietti. Path- fault-tolerant approximate shortest-path trees. In Proc. of 22nd Interna- tional Colloquium on Structural Information and Communication Com- plexity (SIROCCO), volume 9439 of Lecture Notes in Computer Science. Springer, 2015. [9] G. D’Angelo, M. D’Emidio, and D. Frigioni. A loop-free shortest-path rout- ing algorithm for dynamic networks. Theoretical Computer Science, 516:1– 19, 2014. [10] G. D’Angelo, M. D’Emidio, and D. Frigioni. Distance queries in large-scale fully dynamic complex networks. In Proc. of 27th International Workshop on Combinatorial Algorithms (IWOCA), volume 9843 of Lecture Notes in Computer Science, pages 109–121. Springer, 2016. [11] G. D’Angelo, M. D’Emidio, D. Frigioni, and V. Maurizio. A speed-up technique for distributed shortest paths computation. In Proc. of 11th International Conference on Computational Science and Its Applications (ICCSA), volume 6783 of Lecture Notes in Computer Science, pages 578– 593. Springer, 2011. [12] G. D’Angelo, M. D’Emidio, D. Frigioni, and C. Vitale. Fully dynamic main- tenance of arc-flags in road networks. In R. Klasing, editor, Proc. of 11th International Symposium on Experimental Algorithms (SEA), volume 7276 of Lecture Notes in Computer Science, pages 135–147. Springer, 2012. [13] D. Delling, A. V. Goldberg, T. Pajor, and R. F. Werneck. Robust distance queries on massive networks. In Proc. of 22th Annual European Symposium on Algorithms (ESA), volume 8737 of Lecture Notes in Computer Science, pages 321–333. Springer, 2014. [14] G. Iannaccone, C.-n. Chuah, R. Mortier, S. Bhattacharyya, and C. Diot. Analysis of link failures in an ip backbone. In Proc. of the 2nd ACM SIG- COMM Workshop on Internet measurment, pages 237–242. ACM, 2002. [15] A. Itai and M. Rodeh. The multi-tree approach to reliability in distributed networks. Inf. Comput., 79(1):43–59, 1988. [16] K. Menger. Zur allgemeinen kurventheorie. Fundamenta Mathematicae, 10(1):96–115, 1927. [17] Y. Qin, Q. Z. Sheng, and W. E. Zhang. SIEF: efficiently answering distance queries for failure prone graphs. In Proc. of 18th International Conference on Extending Database Technology (EDBT), pages 145–156. OpenProceed- ings.org, 2015. [18] J. Roskind and R. E. Tarjan. A note on finding minimum-cost edge-disjoint spanning trees. Mathematics of Operations Research, 10(4):701–708, 1985. [19] N. Santoro. Design and Analysis of Distributed Algorithms (Wiley Series on Parallel and Distributed Computing). Wiley-Interscience, 2006. [20] M. V. Vieira, B. M. Fonseca, R. Damazio, P. B. Golgher, D. de Castro Reis, and B. A. Ribeiro-Neto. Efficient search ranking in social networks. In Proc. of 16th ACM Conference on Information and Knowledge Management (CIKM), pages 563–572. ACM, 2007. [21] X. Zhang and L. Chen. Distance-aware selective online query processing over large distributed graphs. Data Science and Engineering, 2(1):2–21, 2017.