Additively Weighted Voronoi Diagrams for Optimal Sequenced Route Queries∗ Mehdi Sharifzadeh and Cyrus Shahabi Computer Science Department University of Southern California Los Angeles, CA 90089-0781 [sharifza, shahabi]@usc.edu Abstract 8 t1 7 s3 r1 The Optimal Sequenced Route (OSR) query 6 r2 s1 strives to find a route of minimum length start- 5 ing from a given source location and pass- 4 s2 t2 ing through a number of typed locations in a 3 q specific sequence imposed on the types of the 2 r3 t3 locations. In this paper, we propose a pre- 1 s4 computation approach to OSR query in vector 1 2 3 4 5 6 7 8 9 10 11 12 spaces. We exploit the geometric properties of Figure 1: 3 different types of point sets the solution space and theoretically prove its relation to Additively Weighted Voronoi dia- at late night. This type of query is also essential in grams. Our approach recursively accesses these other application domains such as crisis management, diagrams to incrementally build the optimal se- air traffic flow management, supply chain management, quenced route. Our experimental results ver- and video surveillance. ify that our pre-computation approach outper- We call this type of query, where the order of the se- forms the previous index-based approaches in quence in which some points must be visited is enforced terms of query response time. and cannot be changed, the Optimal Sequenced Route or OSR query. Similar types of the planning queries 1 Introduction have received recent attention by the database commu- nity [5, 9, 3, 2] to solve problems in the domains of Suppose that we are planning a Saturday trip in town air traffic flow and supply chain management, which as following: first we intend to visit a shopping center shows the importance of these query types. We use the in the afternoon to check the season’s new arrivals, then first example described above as our running example we plan to dine in an Italian restaurant in early evening, throughout the paper. Figure 1 shows three different and finally, we would like to watch a specific movie at types of point sets shown by white, black and gray cir- late night. Naturally, we intend to drive the minimum cles, which represent shopping centers, Italian restau- overall distance to these destinations. In other words, rants, and theaters, respectively, and a starting point we need to find the locations of the shopping center q (shown by 4). The distance between two points is si , the Italian restaurant rj , and the theater tk that their Manhattan distance. Here, the route (q, s1 , r1 , t1 ) shows the specific movie, which driving toward them (shown with solid lines in the figure) with the length considering the sequence of the plan shortens our trip of 12 units is the optimum answer to our query. One (in terms of distance or time). Note that in this ex- can show that OSR query may not be optimally an- ample, a time constraint enforces the order in which swered by simply performing a series of independent these destinations are to be visited; we usually do not nearest neighbor queries at different locations (e.g., 15- have dinner in the afternoon, or do not go for shopping unit length route (q, s2 , r2 , t2 )). ∗ This research has been funded in part by NSF grants We, for the first time, introduced the OSR query in EEC-9529152 (IMSC ERC), IIS-0238560 (PECASE), IIS-0324955 [8] and proposed two online approaches for vector and (ITR), and unrestricted cash gifts from Google and Microsoft. Any opinions, findings, and conclusions or recommendations ex- metric spaces (e.g., road networks). The R-LORD al- pressed in this material are those of the author(s) and do not gorithm of [8] subsequently performs range queries uti- necessarily reflect the views of the National Science Foundation. lizing an R-tree index structure to filter out the points Proceedings of the third Workshop on STDBM that cannot possibly be on the optimal route, and then Seoul, Korea, September 11, 2006 builds the optimal route in reverse sequence (i.e., from 33 ending to the starting point). In this paper, we exploit including r1 in the diagram of (grey). Now, the final the geometry of the OSR problem space to propose a OSR of q is the route (s1 , r1 , t1 ), which includes s1 , r1 , novel pre-computation algorithm which utilizes Voronoi and t1 in the order of their retrieval. diagrams for processing OSR queries in vector spaces. Our OSR query processing using Voronoi diagrams In our running examples, we use Manhattan distance best suits the applications where the sequences of in R2 which is a representative of the network distance user’s interest are known in advance. This allows pre- if the road network consists of a set of north-south and computation of the corresponding diagrams. Notice east-west roads [1]. that using the OSR-Voronoi family of a sequence M , we can address OSR queries of all suffix sequences 1.1 Contributions of M . Through extensive experiments with a real- world dataset, we show that our approach significantly Assume that we are only interested in the optimal se- outperforms the R-LORD approach proposed in [8] quenced routes that follow a fixed sequence M . For in terms of response time. We also show that the example, assume that this fixed sequence for Figure 1 off-line process used to build the OSR-Voronoi fam- is (white, black, grey). Suppose we can partition the ily of a single sequence is reasonably fast. With a space of points of interest (e.g., R2 ) into regions each given sequence M , this pre-computation phase takes including all the points with a common OSR. For in- P|M | O( i=1 |UMi | log |UMi |) computation steps. stance, in Figure 1, there are many points similar to q for which the optimal sequenced route to a white, then a black, and finally a grey point is the route (s1 , r1 , t1 ). 2 Preliminaries Suppose that our partitioning identifies the region in- 2.1 Formal Problem Definition cluding these points. Furthermore, assume that the partitioning also identifies the region corresponding to In this section, we describe the terms and notations each and every possible route (si , rj , tk ) to a white, a used in [8] to formally define the OSR query. We use black, and a grey point. Assume that we associate and the same notations throughout the paper . store all regions and their corresponding OSRs. There- Let U1 , . . . , Un be n sets, each containing points in a fore, for a given starting point q we can address the OSR d-dimensional space Rd , and D(., .) be a distance met- query with sequence M by simply locating the unique ric defined in Rd where D(., .) obeys the triangular in- region that includes q and reporting its correspond- equality. To illustrate, in the example of Figure 1, U1 , ing optimal sequenced route. In Section 3, we prove U2 , and U3 are the sets of black, white, and gray points, that such partitioning exists as an Additively Weighted representing restaurants, shopping centers and theaters, Voronoi diagram, a specific type of general Voronoi di- respectively. We first define the following four terms. agrams. We theoretically exploit interesting properties Definition 1: Given n, the number of point sets Ui , we of these diagrams which makes them appropriate data say M = (M1 , M2 , . . . , Mm ) is a sequence if and only if structures for efficient OSR query processing. 1 ≤ Mi ≤ n for 1 ≤ i ≤ m. That is, given the point We enhance the above query processing approach by sets Ui , a user’s OSR query is valid only if she asks for exploiting an interesting property of OSRs proved in existing location types. For the example of Figure 1 Lemma 1 (see Section 3). The lemma states that if where n = 3, (2, 1, 2) is a sequence (specifying a shop- (s1 , r1 , t1 ) is q’s optimal route to a white, a black, and ping center, a restaurant, and a shopping center), while a grey point, then (r1 , t1 ) is s1 ’s optimal route to a (3, 4, 1) is not a sequence because 4 is not an existing black and then a grey point. Recursively, (t1 ) is r1 ’s point set. We use sf x(M, i) = (Mi+1 , . . . , Mm ) to refer optimal route to a grey point. Lemma 1 enables us to to the suffix of M with size m − i. avoid storing the complete OSR corresponding to each Definition 2: We say R = (p1 , p2 , . . . , pr ) is a route region of the diagram for a given sequence M ; we store if and only if pi ∈ Rd for each 1 ≤ i ≤ r. We use only the first point of the OSR for each region of the p ⊕R = (p, p1 , . . . , pr ) to denote a new route that starts Voronoi diagram (e.g., s1 for the region corresponding from starting point p and goes sequentially through p1 to (s1 , r1 , t1 )). Instead, we require to compute and store to pr . The route p ⊕ R is the result of adding p to the the Voronoi diagrams corresponding to all suffixes of M . head of route R. We use sf x(R, i) = (pi+1 , . . . , pr ) to For instance, to answer queries with sequence refer to the suffix of R with size r − i. (white, black, grey) in Figure 1, our approach re- Definition 3: We define the length of a route R = quires the diagrams corresponding to three sequences (p1 , p2 , . . . , pr ) as (white, black, grey), (black, grey), and (grey). We re- r−1 fer to these diagrams as the OSR-Voronoi family of se- X L(R) = D(pi , pi+1 ) (1) quence (white, black, grey). Now, we iteratively process i=1 a given OSR query. For the starting point q, we first find s1 , the white point associated with the region in- Note that L(R) = 0 for r = 1. For example, the length cluding q in the diagram of (white, black, grey). Then, of the route (s2 , r2 , s3 ) in Figure 1 is 4 units where D we find r1 , the black point stored with the region in- is the L1 norm (i.e., Manhattan distance). cluding s1 in the diagram of (black, grey). Finally, we Definition 4: Let M = (M1 , M2 , . . . , Mm ) be a se- seek for t1 , the grey point corresponding to the region quence. We refer to the route R = (p1 , p2 , . . . , pm ) as a 34 sequenced route that follows sequence M if and only if pi ∈ UMi where 1 ≤ i ≤ m. In Figure 1, (s2 , r2 , s3 ) is a sequenced route that follows (2, 1, 2) which means that V(p) the route passes only through a white, then a black and finally a white point. p Now, we formally define the OSR query. Definition 5: For a given starting point q in Rd and the sequence M = (M1 , M2 , . . . , Mm ), the Optimal Sequenced Route (OSR) Query, Q(q, M ), is defined Figure 2: The standard Voronoi diagram of 9 points, as finding a sequenced route R = (p1 , . . . , pm ) that fol- the point p and its Voronoi cell V (p) lows M where the value of the following function L is minimum over all the sequenced routes that follow M : L(q, R) = D(q, p1 ) + L(R) (2) Note that L(q, R) is in fact the length of route Rq = q ⊕ R. Throughout the paper, we use Q(q, M ) = (p1 , p2 , . . . , pm ) to denote the answer to the OSR query Q. Without loss of generality, we assume that this opti- mal route is unique for given q and M . For the example in Section 1 where (U1 , U2 , U3 ) = (black, white, gray), M = (2, 1, 3), and D(., .) is the shortest path distance, the answer to the OSR query is Q(q, M ) = (s1 , r1 , t1 ). Figure 3: The AW-Voronoi diagram of 13 points each One special variation of OSR is when the user asks labeled with its positive weight for an optimal sequenced route that ends in a given contains all the points q ∈ Rd for which we have destination q 0 . A special case of this query is where she intends to return to her starting location q. This ∀p0 ∈ S, p0 6= p ⇒ D(q, p)+w(p) ≤ D(q, p0 )+w(p0 ) (4) variation of OSR can simply be addressed by defining a new point set Un+1 = {q 0 } and a new sequence M 0 = Figure 3 illustrates the AW-Voronoi diagram of a set of (M1 , . . . , Mm , n + 1). Consequently, Q(q, M 0 ) will be 13 points labeled with their positive weights. the optimal route following M which ends in q 0 . The Computational Geometry literature generally Table 1 summarizes the notations we use throughout uses negative weights in Equation 4 to define the AW- the paper. Voronoi diagrams. When negative weights are assigned to points p, their AW-Voronoi diagram can be seen as 2.2 Voronoi Diagrams the standard Voronoi diagram of a set of d-dimensional The general Voronoi diagram of a given set S including balls each centered at a point p with a radius equal to points in Rd partitions the space into regions each in- the absolute value of w(p). Here, the set S includes cluding all points with a common closest point in the balls such as b centered at a point p. The distance given set according to a distance metric d(., .). The re- d(q, b) between a ball b and the point q outside b is the gion corresponding to the point p contains all the points smallest Euclidean distance between q and any point in- q ∈ Rd for which we have side b. The literature uses this view to define the AW- Voronoi diagram of the centers of a set of balls (e.g., ∀p0 ∈ S, p0 6= p ⇒ d(q, p) ≤ d(q, p0 ) (3) disks in R2 ) [6]. Each region of the space is assigned to a unique ball which is the common closest ball of all The equality holds for the points on the borders of p’s points inside that region. Figure 4 illustrates defining and p0 ’s regions. Incorporating arbitrary distance met- the AW-Voronoi diagram of five points with negative rics d(., .) in the above equation results in different vari- weights w(p) < 0 using five corresponding disks. Here, ations of Voronoi diagrams. Figure 2 shows the stan- d(q, c), the distance of a point q to a disk c centered at dard Voronoi diagram of nine points in R2 where the p with radius r = −w(p) is defined as D(q, p) − r. As distance metric is Euclidean. We refer to the region the figure shows, point q which is inside the cell of disk V (p) containing the point p as its Voronoi cell. We also c (corresponding to point p) is closer to disk c than to refer to point p as the generator of Voronoi cell V (p). disk c0 . It also shows that some points such as p00 have Now assume that S is a set of points p ∈ Rd each empty cells. That is, no point in the space is closer to assigned a weight w(p) ∈ R. We define the distance the disk c00 than to any other disk. metric d(x, p) as D(x, p) + w(p) where D(., .) denotes The AW-Voronoi diagram demonstrates the follow- a distance metric. Without loss of generality, we as- ing properties: sume that D(., .) is Euclidean distance. The Additively Weighted (AW) Voronoi diagram of the points in S Property AWV-1 With Euclidean distance (i.e. L2 with respect to distance function D(., .) is defined us- norm) as metric D(., .) in Equation 4, the cell edges ing Equation 3. Here, the cell corresponding to point p are either straight lines or hyperbolic curves [6]. 35 Symbol Meaning Symbol Meaning Ui a point set in Rd |Ui | cardinality of the set Ui n number of point sets Ui D(., .) distance function in Rd M a sequence, = (M1 , . . . , Mm ) |M | m, size of sequence M = number of items in M Mi i-th item of M R route (p1 , p2 , . . . , pr ), where pi is a point |R| r, number of points in R pi i-th point in R L(R) length of R q ⊕R route Rp = (q, p1 , . . . , pr ) where R = (p1 , . . . , pr ) sf x(R, i) route (pi+1 , . . . , pr ), a suffix of R sf x(M, i) sequence (Mi+1 , . . . , Mm ), a suffix of M L(q, R) L(q ⊕ R), length of the route q ⊕ R Table 1: Summary of notations AW-Voronoi diagram. We prove that a unique property d(q,c) of optimal sequenced routes enables us to employ this q -w(p) partitioning. First, we prove Lemma 1 which states that the optimal route from pi that follows a suffix of the p original sequence M is the same as the corresponding suffix of the optimal route from q that follows M . p'' c'' Lemma 1. Given the sequence M = (M1 , . . . , Mm ) c and the starting point q, if Q(q, M ) = R = (p1 , . . . , pm ), p' then for any 1 ≤ i < m, we have Q(pi , M 0 ) = sf x(R, i) where M 0 = sf x(M, i). c' Proof. The proof is by contradiction. Assume that Q(pi , M 0 ) = R0 = (p01 , . . . , p0m−i ). Obviously sf x(R, i) = (pi+1 , . . . , pm ) follows sequence M 0 , there- fore we have L(pi , R0 ) < L(pi , (pi+1 , . . . , pm )). We add L(q, (p1 , . . . , pi )) to the both sides of this inequality to get: Figure 4: The AW-Voronoi diagram of five disks (i.e., five points with negative weights). Here, the distance L(q, (p1 , . . . , pi , p01 , . . . p0m−i )) < L(q, (p1 , . . . , pm )) function D(., .) is Euclidean. The above inequality shows that the answer to Q(q, M ) must be (p1 , . . . , pi , p01 , . . . , p0m−i ) which clearly Hence, these curved cells are not convex polygons follows sequence M . This contradicts our assumption as the cells of general Voronoi diagrams. Using that Q(q, M ) = (p1 , . . . , pm ). Manhattan distance (i.e. L1 norm), all edges be- come straight lines. This property of the OSRs implies that only the last point of the optimal sequenced route Q(q, M ) (i.e., pm ) Property AWV-2 The arbitrary numeric values of must necessarily be the closest point of UM to its pre- weights can result in empty cells [6]. We refer to vious point in the route (i.e., pm−1 ). the generator point of an empty cell as a trivial Here, we utilize the special case of Lemma 1 for the point. first point of the route (i.e., p1 for i = 1). For instance, Property AWV-3 The AW-Voronoi diagram of n if the optimal route from point q to a white, a black, points consists of at most O(n) connected edges and a grey point is (s1 , r1 , t1 ), then the optimal route [7]. from the point s1 to a black then a grey point is (r1 , t1 ) (see Figure 1). Therefore, if we identify the white point s1 with which the optimal route from q to a white, a 3 OSR and AW-Voronoi Diagrams black, then a grey point starts (i.e., the first point of Assume that we are only interested in the optimal se- the route) and we know that the optimal route from quenced routes that follow a fixed sequence M (i.e., s1 to a black and then a grey point is (r1 , t1 ), we have Q(q, M ) for points such as q). Suppose we can partition found the optimal route from q (we just append s1 to the space Rd into regions each including all the points q the head of (r1 , t1 )). The following lemma identifies this with a common OSR. That is, for any two points x and first point. y inside the same region, we have Q(x, M ) = Q(y, M ). Lemma 2. Given the sequence M and the starting Assume that for each region, we pre-compute this com- point q, if there exists a point p1 ∈ UM1 so that mon OSR. Furthermore, we store all regions and their ∀p01 ∈ UM1 , p01 6= p1 corresponding OSRs for the given sequence M . There- fore, for a given starting point q we can address the D(q, p1 )+L(p1 , Q(p1 , M 0 )) < D(q, p01 )+L(p01 , Q(p01 , M 0 )) query Q(q, M ) by simply locating the region including (5) q and reporting its corresponding optimal route. where M 0 = sf x(M, 1), then the optimal route Q(q, M ) In this section, we show that the above partitioning starts with p1 which is followed by the points in which facilitates processing OSR queries exists as an Q(p1 , M 0 ) (i.e., Q(q, M ) = p1 ⊕ Q(p1 , M 0 )). 36 Proof. The proof is by contradiction. Assume that Proof. Let the set S = UM1 and w(p) = L(p, Q(p, M 0 )). Q(q, M ) = (p01 , . . . , p0m ) starts with p01 6= p1 . By def- According to the definition of the AW-Voronoi cell of inition, the length of q ⊕ Q(q, M ) is minimum over all p1 given by Equation 4, for the point q inside this cell the routes which follow sequence M . It is clear that the we get ∀p01 ∈ UM1 , p01 6= p1 route p1 ⊕ Q(p1 , M 0 ) follows M , so we have D(q, p1 )+L(p1 , Q(p1 , M 0 )) < D(q, p01 )+L(p01 , Q(p01 , M 0 )) L(q, (p1 ⊕ Q(p1 , M 0 ))) ≥ L(q, Q(q, M )) (6) Therefore, according to Lemma 2 the optimal route Lemma 1 states that we have Q(p01 , M 0 ) = (p02 , . . . , p0m ) Q(q, M ) is p1 ⊕ Q(p1 , M 0 ) which clearly starts with p1 , where M 0 = sf x(M, 1). Hence, we get the generator of the Voronoi cell including q. Q(q, M ) = p01 ⊕ Q(p01 , M 0 ) (7) Given a sequence M , we refer to the AW-Voronoi diagram of the points p in UM1 using the weights Therefore, we replace Q(q, M ) in Equation 6 to get w(p) = L(p, Q(p, M 0 )) where M 0 = sf x(M, 1), as the OSR-Voronoi diagram of the sequence M . Fig- L(q, p1 ⊕ Q(p1 , M 0 )) ≥ L(q, p01 ⊕ Q(p01 , M 0 )) (8) ure 5a illustrates the OSR-Voronoi diagram of se- quence (white, black, grey) (i.e., M = (2, 1, 3)). The Finally, we use the definition of function L() to change distance function D(., .) is Manhattan distance and Equation 8 as follows for the point p01 6= p1 : each white point si is labeled with its weight (i.e., L(si , Q(si , (1, 3))). The OSR-Voronoi diagram has D(q, p1 )+L(p1 , Q(p1 , M 0 )) ≥ D(q, p01 )+L(p01 , Q(p01 , M 0 )) some interesting properties whose proofs are omitted (9) due to lack of space: The above inequality contradicts the one given in the assumption of the lemma. Property OSRV-1 All points in the OSR-Voronoi di- agram of the sequence M (any p ∈ UM1 ) have pos- itive weights. In the example of Figure 1, U1 , U2 , and U3 are the sets of black, white, and gray points, respectively. For Property OSRV-2 Given the sequence M , if for the the sequence M = (2, 1, 3), we have M 0 = (1, 3) and points p and p0 ∈ UM1 we have Q(p, M 0 ) = Q(p0 , M 0 ) = (p2 , . . . , pm ) and D(p, p2 ) = D(p, p0 )+ Q(s1 , M 0 ) = (r1 , t1 ), Q(s2 , M 0 ) = (r3 , t3 ), D(p0 , p2 ) where M 0 = sf x(M, 1), then p is a triv- ial point in OSR-Voronoi diagram of sequence M . Q(s3 , M 0 ) = (r1 , t1 ), Q(s4 , M 0 ) = (r1 , t1 ) That is, the Voronoi cell of p is empty and no OSR D(q, s1 ) + L(s1 , Q(s1 , M 0 )) = 3 + (7 + 2) = 12 that follows M starts with p. D(q, s2 ) + L(s2 , Q(s2 , M 0 )) = 2 + (3 + 9) = 14 4 OSR Query Processing using OSR- 0 D(q, s3 ) + L(s3 , Q(s3 , M )) = 4 + (8 + 2) = 14 Voronoi Diagrams D(q, s4 ) + L(s4 , Q(s4 , M 0 )) = 5 + (9 + 2) = 16 In this section, we describe our OSR query process- ing approach that utilizes OSR-Voronoi diagrams de- According to Lemma 2, the OSR of q to a white, a black fined in Section 3. Assume that we have already built and a grey point starts with s1 followed by Q(s1 , M 0 ) = the OSR-Voronoi diagram of a given sequence M . The (r1 , t1 ). user asks for the OSR Q(q, M ) given a starting point We now strive to find the locus of all points in space q. Subsequently, we locate the Voronoi cell which con- whose OSR that follows a given sequence starts with a tains q in the OSR-Voronoi diagram of M . According point p1 and hence all share a common OSR for that to Theorem 1, the OSR of q starts with the genera- sequence. In Figure 1, for any white point si , there tor of this cell (e.g., s1 in Figure 5a). This point is are points such as q whose optimal routes given the then followed by the result of another OSR query (i.e., sequence (white, black, grey) start with si . Notice that Q(s1 , M 0 )). If we already knew the answer to this sec- the rest of their optimal route is the same (e.g., (r1 , t1 ) ond query, we immediately report user’s desired route for s1 ) and depends only on the location of point si and as s1 ⊕ Q(s1 , M 0 ). Now the problem is that the we need the given sequence (e.g., (black, grey)). the OSR-Voronoi diagram of sequence M 0 to answer the second OSR query. Repeating the same reasoning for Theorem 1. Let M = (M1 , M2 , . . . , Mm ) be a given M 0 verifies that we will require the OSR-Voronoi dia- sequence and M 0 = sf x(M, 1). The locus of all points q grams of all suffixes of the original sequence M . That with a common optimal sequenced route Q(q, M ) which is, to find the OSR which follows M = (M1 , . . . , Mm ) starts with p1 is inside p1 ’s cell in the AW-Voronoi di- we need the OSR-Voronoi diagrams of all sequences agram of the set of points in UM1 where the weight of (Mi , . . . , Mm ) for all 1 ≤ i ≤ m. We refer to these dia- each point p is w(p) = L(p, Q(p, M 0 )). Their common grams as the OSR-Voronoi family of M . For instance, optimal route is Q(q, M ) = p1 ⊕ Q(p1 , M 0 ). to answer queries with sequence (white, black, grey) in 37 8 8 8 t1 7 s3 7 s3 r1 7 r1 6 10 s1 6 r2 s1 2 6 5 9 5 11 5 r2 12 4 s2 4 s2 4 t2 3 q 3 3 9 2 11 2 r3 2 r3 t3 1 s4 1 s4 1 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 Figure 5: The OSR-Voronoi diagrams of a) the sequence (white, black, grey) , b) the sequence (black, grey), and c) the sequence (grey) using the white points in Figure 1 Function ComputeOSR(point q, sequence M ) 4.2 Data Structures Returns route 1. V (p) = the cell containing q in the OSR-Voronoi diagram of M 2. p = the generator of V (p) The OSR-Voronoi diagrams are instances of AW- 3. if |M | = 1 then 4. return (p) Voronoi diagrams whose weights are constrained by 5. else geometry. Therefore, we build them using any al- 6. return p ⊕ ComputeOSR(p, sf x(M, 1)) gorithm for building AW-Voronoi diagrams. The al- Figure 6: Computing OSR for query Q(q, M ) gorithm proposed by Karavelas and Yvinec [4] com- putes the diagram of n points in O(n log n) steps. Figure 1, we require the OSR-Voronoi diagrams of se- Our approach utilizes the OSR-Voronoi diagrams of quences (white, black, grey), (black, grey), and (grey). all suffixes of a sequence M to find the OSRs that Notice that the OSR-Voronoi family of a sequence can follow M . Therefore, to answer queries with se- also be used to answer OSR queries of any suffix of that quence (white, black, grey) in Figure 1, we require sequence. the OSR-Voronoi diagrams of this sequence together with those of sequences (black, grey) and (grey). In general, for a given sequence M , we require |M | 4.1 Query Processing OSR-Voronoi diagrams of all suffixes of M to answer Here, we assume that we have already built the OSR- Q(q, M ). Hence, the pre-computation phase takes P|M | Voronoi family of a given sequence M . Subsequently, O( i=1 |UMi | log |UMi |) computation steps. we employ this set of diagrams to answer the OSR query We incrementally build the OSR-Voronoi family of Q(q, M ). Section 4.2 shows how we recursively build all (white, black, grey) as follows. First, we build the OSR- these diagrams. Voronoi diagram of (grey) shown in Figure 5c which We describe our query processing approach using is simply the standard Voronoi diagram of the set of the example of Figure 1. Notice that here the se- grey points T = {t1 , t2 , t3 } with respect to Manhattan quence is (white, black, grey) and the distance func- distance [6]. Second, for each black point ri we find tion D(., .) is Manhattan distance. We have already the generator grey point of the cell containing ri in this built and loaded the OSR-Voronoi diagrams of se- OSR-Voronoi diagram. In Figure 5c, each black point quences (white, black, grey), (black, grey), and (grey) is connected to its corresponding grey generator by a illustrated in Figures 5a, 5b, and 5c, respectively. No- solid line. We assign the distance between the point tice that the last diagram is the standard Voronoi di- ri and its corresponding grey point to the weight of ri . agram on grey points as the OSR of a point q which For instance, we assign w(r1 ) = 2. Third, we build follows the sequence (grey) is simply the closest grey the OSR-Voronoi diagram of (black, grey) using the set point to q (see Lemma 1 for i = m − 1). First, we lo- of black points R = {r1 , r2 , r3 } and their weights (see cate the cell containing q in the OSR-Voronoi diagram Figure 5b). Similar to the previous step, for each white of (white, black, grey). Following Theorem 1, our OSR point si we locate rx , the generator of the cell which starts with s1 , the generator of this cell (see Figure contains si in the diagram built in this step and assign 5a). The rest of the OSR is the same as the OSR of s1 D(si , rx ) + w(rx ) to the weight of si . For example, the which follows (black, grey). Hence, we next find r1 , the weight of s1 is 7 + 2 = 9. Finally, we build the OSR- generator of the cell containing s1 in the OSR-Voronoi Voronoi diagram of (white, black, grey) using the set of diagram of (black, grey) shown in Figure 5b. Finally, white points S = {s1 , s2 , s3 , s4 } with respect to their we find the closest grey point to r1 as the generator corresponding weights (see Figure 5a). Figure 7 shows of the cell containing r1 in the OSR-Voronoi diagram the pseudo-code of the above algorithm. of (grey) (see Figure 5c). The OSR of point q which Once we computed the OSR-Voronoi family of a follows sequence (white, black, grey) is (s1 , r1 , t1 ) built sequence, we require to store the boundaries of the using the three subsequently retrieved points. Figure Voronoi cells corresponding to each OSR-Voronoi dia- 6 shows the pseudo-code of our OSR query processing gram of the family. However, with Euclidean distance, approach as a recursive function ComputeOSR(point, these boundaries consist of curved edges and hence can- sequence). not be stored as simple polygons. This problem can be 38 Function BuildOSRVoronois(sequence M = (M1 , . . . , Mm )) Points Size Points Size Returns OSR-Voronoi diagrams Hospital 5,314 Building 15,127 1. OSRV ((Mm )) = the standard Voronoi diagram of UMm Summit 69,498 Cemetery 109,557 2. for each p in UMm 3. w(p) = 0 Church 127,949 School 139,523 4. i = m − 1 Populated place 167,203 Institution 319,751 5. while i > 0 do { 6. M 0 = sf x(M, i) Table 2: USGS dataset 7. for each p in UMi { 8. V (p0 ) = the cell containing p in the OSRV (M 0 ) points from this dataset. We used CGAL’s implemen- 9. p0 = the generator of V (p0 ) tation of [4] to implement the OSR-Voronoi diagrams 1 . 10. w(p) = D(p, p0 ) + w(p0 ) } We ran 1000 OSR queries from random starting points 11. OSRV (sf x(M, i − 1)) = the AW-Voronoi diagram of UMi on a DELL Precision 470 with Xeon 3.2 GHz processor 12. i=i−1 } and 3GB of RAM. 13. return OSRV ’s In the first set of experiments, we study the perfor- Figure 7: Incrementally building the data structures mance of OSRV in terms of query time. Figure 8a shows required to answer the OSR query Q(q, M ) the query response time for our OSRV and R-LORD ap- proaches when the number of points in optimal route solved in the two following ways. Each solution guar- (i.e., |M |) varies from 3 to 12. All the required diagrams antees that given a starting point q, we can easily de- fit in main memory, so the reported time represents termine the cell containing q. CPU time. While the figure depicts the experimental results on 250K USGS dataset, the trend is the same 1. The cells are approximated by convex polygons with those of our other datasets with different cardi- and consequently these polygons are stored. The nalities. Figure shows that OSRV always outperforms polygons are also indexed by a spatial index struc- R-LORD. As expected, R-LORD performs slower with ture such as R-tree. Therefore, the cell containing the bigger sequences as it requires more range queries the starting point q is easily retrieved by a spatial on the R-tree. In contrast, OSR query with OSRV has contains() query. This solution introduces an er- almost no cost in time. OSRV’s response time is unno- ror in OSR query processing proportional to the ticeably increasing as OSRV consists of a sequence of approximation error. only |M | point locations on our efficient AW-Voronoi 2. Instead of storing the cells, the neighborhood graph diagrams. of the cell generators is stored (similar to Delaunay Figure 8b illustrates the result of our second set of graph for the general Voronoi diagrams [6]). Ver- experiments on the impact of the cardinality of the tices of the graph are the generators of the OSR- dataset on the efficiency of OSRV. We varied the cardi- Voronoi diagram of M . There is a graph edge nality of our real datasets from 40K to 950K and ran connecting p and p0 iff V (p) and V (p0 ) have com- OSR queries of sequence size |M | = 6. Figure 8b shows mon boundaries. Now, to determine the inclusion that OSRV’s performance is not much affected by the of a starting point q in a cell, we start traversing dataset size and verifies the scalability of OSRV. This the graph from any vertex p. We iteratively visit is while the processing time of R-LORD moderately in- the vertex p0 , the neighbor of the current vertex p creases for larger datasets. For example, R-LORD’s which minimizes D(q, p0 ) + w(p0 ) among p’s neigh- query response time is 0.09 seconds for 40, 000 points, bors and p itself. When no such a vertex is found, and it increases by a factor of 16 when the number of q is inside the Voronoi cell of the current vertex p. points increases to 950, 000 by a factor of 24. Conse- quently, OSRV is the superior approach although R- 5 Performance Evaluation LORD also exhibits a reasonable performance for large datasets. We conducted several experiments to evaluate the per- The last experiment aims to measure the time re- formance of our proposed approach. We compared the quired to build the diagrams in the OSR-Voronoi family query response time of our Voronoi-based approach (de- of a given sequence. Note that this is a batch process noted by OSRV) with that of our R-LORD algorithm which is performed off-line. Figure 8c illustrates the av- proposed in [8]. We also evaluated our new OSRV ap- erage build time for different sizes of the sequence M for proach with respect to its overall time required to build 250K dataset. Moreover, Figure 8d depicts the result the OSR-Voronoi family of a given query. We evaluated of the same experiment for different dataset cardinali- OSRV by investigating the effect of the following two ties when the sequence size is 6. Figure 8c shows that parameters on its performance: 1) size of sequence M it takes only 9 minutes to build 12 OSR-Voronoi dia- in Q(q, M ) (i.e., number of points in the P optimal route), grams needed for the OSR query of a sequence of size n and 2) cardinality of the datasets (i.e., i=1 |Ui |). 12. However, Figure 8d shows that despite its excellent We used a real-world dataset which is obtained from performance for small datasets, OSRV’s procedure for the U.S. Geological Survey (USGS) and consists of building the OSR-Voronoi family of a sequence of size 950, 000 locations of different businesses (e.g., schools) 6 takes about 36 minutes for the 950K dataset. This in the entire country. Table 2 shows the characteristics is still a reasonable performance for an off-line process. of this dataset. In our experiments, we randomly se- lected sets of 40K, 70K, 250K and 500K, and 950K 1 http://www.cgal.org/ 39 0.8 1.6 R-LORD R-LORD 10 40 OSRV OSRV 9 35 Response time (sec) Response time (sec) 0.6 1.2 8 Build time (min) Build time (min) 30 7 6 25 0.4 0.8 5 20 4 15 0.4 3 10 0.2 2 1 5 0.0 0.0 0 0 3 6 9 12 40 70 250 500 950 3 6 9 12 40 70 250 500 950 No of points in route (|M|) Dataset cardinality (K) No of points in route (|M|) Dataset cardinality (K) a) Query cost vs. b) Query cost vs. c) Build cost vs. d) Build cost vs. sequence size (250K) cardinality (|M | = 6) sequence size (250K) cardinality (|M | = 6) Figure 8: Experimental results on the performance of OSRV We believe that upgrading main-memory AW-Voronoi locus of all points with a common optimal sequenced diagrams to their secondary-memory versions will sig- route R is the cell of the first point of R in a AW- nificantly improve OSRV’s off-line performance for large Voronoi diagram whose weights depend on the sequence datasets. M . Based on our theoretical findings, we proposed a novel OSR query processing approach using a family of 6 Related Work pre-computed AW-Voronoi diagrams. Through exper- iments with a real-world dataset, we showed that our In an independent research, Li et al. [5] studied Trip approach significantly outperforms the previous solu- Planning Queries (TPQ), a class of queries similar to tions to OSR query in terms of response time. our OSR query. With a TPQ, the user specifies a While in this paper we adapted AW-Voronoi dia- subset (not a sequence) of location types R and asks grams for OSR queries, we believe that these diagrams for the optimal route from her starting location to a can also be used to answer a variation of OSR query specified destination which passes through at least one where the sequence is not important (TPQ problem [5]). database point of each type in R. In particular, TPQ An orthogonal problem is studying efficient algorithms eliminates the sequence order of OSR to define a new to store the AW-Voronoi diagrams in secondary storage. query. Consequently, finding accurate solutions to TPQ becomes NP-hard as the size of candidate space signif- References icantly grows. The paper proposes several approxima- tion methods to provide near-optimal answers to TPQs. [1] Y. Du, D. Zhang, and T. Xia. The Optimal-Location Query. In Proceedings of SSTD’05, pages 163–180, 2005. In theory, OSR algorithms are able to address address TPQ queries. This can be done by running any OSR [2] C. du Mouza, P. Rigaux, and M. Scholl. Efficient Eval- algorithm |R|! times, each time using a permutation of uation of parameterized pattern queries. In Proceedings of CIKM’05, pages 728–735. ACM, 2005. point types in R as the sequence M and returning the optimal route among all the resulted routes. This expo- [3] M. Hadjieleftheriou, G. Kollios, P. Bakalov, and V. J. Tsotras. Complex Spatio-Temporal Pattern Queries. In nentially complex solution is inefficient in practice. The Proceedings of VLDB’05, pages 877–888, 2005. approximation algorithms of [5] are designed to handle the exponential growth of the problem’s search space. [4] M. I. Karavelas and M. Yvinec. Dynamic Additively Weighted Voronoi Diagrams in 2d. In Proceedings of Terrovitis et al. [9] addressed k-stops shortest path the 10th Annual European Symposium on Algorithms problem for spatial databases. The problem seeks for (ESA’02), pages 586–598, London, UK, 2002. Springer- the optimal path from a starting location to a given des- Verlag. tination which passes through exactly k intermediate [5] F. Li, D. Cheng, M. Hadjieleftheriou, G. Kollios, and points of the location database. The k-stops problem is S.-H. Teng. On Trip Planning Queries in Spatial Data- a specialized case of OSR queries. To be specific, OSR bases. In Proceedings of SSTD’05, pages 273–290. reduces a k-stops problem to query Q(q, M ) where q is Springer, August 2005. the starting location, M = (1, . . . , 1), |M | = k, and U1 [6] A. Okabe, B. Boots, K. Sugihara, and S. N. Chiu. Spa- is the single given point set representing the location tial Tessellations, Concepts and Applications of Voronoi database. The destination is considered by the varia- Diagrams. John Wiley and Sons Ltd., 2nd edition, 2000. tion of the above OSR query described in Section 2.1. [7] Y. Ostrovksy-Berman. Computing Transportation A major drawback is that [9] only considers Euclidean Voronoi Diagrams in Optimal Time. In Proceedings of space. the 21st European Workshop on Computational Geome- try (EWCG’05), pages 159–162, 2005. 7 Conclusions and Future work [8] M. Sharifzadeh, M. Kolahdouzan, and C. Shahabi. The Optimal Sequenced Route. The VLDB Journal, 2006. We studied the problem space of the OSR query in vec- To appear. tor spaces. We exploited the geometric properties of [9] M. Terrovitis, S. Bakiras, D. Papadias, and K. Moura- the solution space and identified the AW-Voronoi di- tidis. Constrained Shortest Path Computation. In Pro- agrams as its corresponding geometric representation. ceedings of SSTD’05, pages 181–199. Springer, August In particular, we proved that given a sequence M the 2005. 40