=Paper=
{{Paper
|id=Vol-1671/paper5
|storemode=property
|title=Identifying Top-K Optimal Locations for Placement of Large-Scale Trajectory-Aware Services
|pdfUrl=https://ceur-ws.org/Vol-1671/paper5.pdf
|volume=Vol-1671
|authors=Shubhadip Mitra
|dblpUrl=https://dblp.org/rec/conf/vldb/Mitra16
}}
==Identifying Top-K Optimal Locations for Placement of Large-Scale Trajectory-Aware Services==
Identifying Top-K Optimal Locations for Placement of
Large-Scale Trajectory-Aware Services
Shubhadip Mitra
Supervised by Arnab Bhattacharya
Dept. of Computer Science and Engineering, Indian Institute of Technology, Kanpur, India
smitr@cse.iitk.ac.in
ABSTRACT
Optimal location problems identify the best sites to set up new facil-
ities for providing service to its users. Majority of the existing work
in this space assumes that the users are static and the datasets are
small. Such assumptions are too restrictive and unrealistic for real-
life services such as setting up of fuel stations, upgradation of cell-
phone base-stations, etc. The placement of such services should,
however, factor in the mobility patterns of its consumers, i.e., the
user trajectories. For example, given a budget of k locations to
set up fuel stations, the objective should be to cover the maximum Figure 1: Illustration of the need of trajectory-aware querying for
number of user trajectories. In this doctoral work, we introduce optimal locations. The lines represent the trajectories of users. We
top-k optimal location problems for large-scale trajectory-aware assume each trajectory belongs to a separate user.
services. Since these problems are NP-hard, we design scalable
techniques with bounded quality guarantees that work directly with data analytics has attracted considerable research attention in recent
user trajectories over city-scale road networks. Empirical evalua- years [12, 13, 18].
tions show that the proposed heuristics are highly efficient, both in To illustrate the need for trajectory-aware planning of services,
terms of space overhead and running time, as well as quite effective, consider Fig. 1. There are seven candidate locations, S1 , . . . , S7 ,
with quality close to that of the optimal. to set up a service, out of which five are either home or office lo-
cations. There are five users commuting between home and office
locations whose trajectories are shown. Using this figure, we next
1. INTRODUCTION AND MOTIVATION motivate three scenarios of trajectory-aware services.
One of the most important problems in planning of services is Scenario 1: A company wants to open two new fuel stations. A
to identify the best locations to set up new facilities (or improve trajectory of a user is satisfied only if it passes through at least
existing facilities) with respect to a given service [4, 20]. Examples one of the fuel stations. If only the static locations are considered,
include setting up new services such as fuel stations, banking, retail i.e., any two out of the five office and residential areas are to be
stores, etc. Such optimal location (OL) problems have also been selected, no combination would satisfy all the trajectories. In con-
extensively studied as Facility Location problems [6, 9]. trast, if we factor in the mobility of the users, choosing S1 and S3
Majority of the existing works on OL problems assume that the as the installation locations satisfies all trajectories. Note that it is
users of the service are static. Such an assumption is often too not enough to simply look at trajectory counts in each possible in-
prohibitive. For example, services such as mobile services, fuel stallation location and then choosing the two most frequent ones.
stations, bill boards, traffic monitoring systems, etc. are widely ac- By that strategy, a location such as S2 would be chosen along with
cessed by users while commuting. Consequently, the placement of S1 . This combination is not effective since the same trajectories
such facilities require taking into consideration the mobility pat- pass through both, thereby reducing each other’s utilities (a pro-
terns (or trajectories) of the users rather than their static locations cess sometimes referred to as cannibalization in economics).
[1–3, 19]. We refer to such services as trajectory-aware services.
A trajectory is simply a sequence of location-time coordinates that Scenario 2: Suppose a mobile service provider wants to set up two
lie on the path of a moving user. Such trajectory data are com- base-stations to provide good quality of experience to the commut-
monly available from GPS traces [10], CDR (Call Detail Records) ing users. In this case, a user is satisfied if it receives good service
data [11] recorded through cellphones, social network check-ins for a large fraction of its trajectory. The selection {S1 , S2 } satis-
[5], etc. Due to more availability of real trajectory data, trajectory fying 3 trajectories is the optimal choice. An alternative selection
such as {S1 , S3 } can offer good quality of service only in a limited
segment of the incident trajectories, thereby not satisfying even a
single trajectory.
Scenario 3: Suppose a company plans to set up two fuel stations
Proceedings of the VLDB 2016 PhD Workshop, September 9, 2016. New
Delhi, India. that minimize the maximum inconvenience, i.e. the extra distance
Copyright (c) 2016 for this paper by its authors. Copying permitted for travelled by a user to avail a service. The optimal solution is {S1 , S3 }
private and academic purposes.
(or {S2 , S3 }) with maximum inconvenience being 0. Any other so- trajectory belongs to a separate user. However, the framework can
lution would have non-zero maximum inconvenience since at least easily generalize to multiple trajectories belonging to a single user.
one user needs to travel non-zero extra distance to reach to the near- Suppose d(vi , vj ) denotes the shortest network distance along a
est fuel station. directed path from node vi to vj , and dr (vi , vj ) denotes the short-
The objective of this doctoral thesis is to develop an efficient est distance of a round-trip starting at node vi , visiting vj , and re-
framework for planning of large-scale trajectory-aware services such turning to vi , i.e., dr (vi , vj ) = d(vi , vj ) + d(vj , vi ). In general,
as the scenarios described above. More specifically, given n can- d(vi , vj ) 6= d(vj , vi ), but dr (vi , vj ) = dr (vj , vi ). With a slight
didate service locations, the proposed models enable the service abuse of notation, assume that dr (Tj , si ) denotes the extra dis-
provider to identify the best k locations to set up new service or tance traveled by the user on trajectory Tj to avail a service at site
improve existing service by factoring in the trajectories of its users. si . Formally, dr (Tj , si ) = min∀vk ,vl ∈Tj {d(vk , si ) + d(si , vl ) −
The models allow the service providers to specify a wide range of d(vk , vl )}.
objectives and constraints depending on the choice of service. To It is convenient for a user to avail a service only if its location is
model various objectives, we associate a utility function Uj with not too far off from her trajectory. Thus, beyond a distance τ , we
each trajectory Tj in the problem. This utility function captures assume that the utility offered by a site si to a trajectory Tj is 0.
how well a trajectory is served by a given set of service locations. We call this user-specified distance τ as the coverage threshold.
The goal is to determine k out of n candidate locations that maxi-
D EFINITION 1 (C OVERAGE ). A site si covers a trajectory
mize the sum of trajectory utilities Uj over all the trajectories.
Tj if the distance dr (Tj , si ) is at most τ , where τ ≥ 0 is the cov-
Motivated by the above three scenarios, we study three classes
erage threshold.
of problems namely, TOPS, TUMP and TIPS, which are described
in Sec. 2, 3 and 4, respectively. For all sites within the coverage threshold τ , the user also spec-
The contributions and significance of this doctoral thesis are sum- ifies a preference function ψ. The preference function ψ(Tj , si )
marized as follows: assigns a score (normalized to [0, 1]) for a trajectory Tj and a site
? Novelty: To the best of our knowledge, this is the first extensive si that indicates how much si is preferred by the user on trajectory
work to study efficient computation of placement of large-scale Tj . Higher values indicate higher preferences with 0 indicating no
trajectory-aware services. The proposed models are highly generic preference. In general, sites that are closer to the trajectory have
and can be easily adapted to meet various service requirements. higher preferences than those farther away.
? Hardness: We show that the TOPS, TUMP and TIPS problems D EFINITION 2 (P REFERENCE F UNCTION ψ). ψ : (T , S) →
are NP-hard. We develop optimal algorithms for each of them and [0, 1] is a real-valued preference function defined as follows:
show why they are impractical. (
? Efficiency: We develop several interesting heuristics for each f (dr (Tj , si )) if dr (Tj , si ) ≤ τ
ψ(Tj , si ) = (1)
of the three problems. The proposed solutions are highly efficient 0 otherwise
both in terms of space overhead and running time.
? Effectiveness: While the proposed heuristics have theoretically where f is a non-increasing function of dr (Tj , si ).
bounded quality guarantees, the empirical evaluations show that The goal of the TOPS query is to report a set of k sites Q ⊆
they are quite close to the optimal. S, |Q| = k, that maximizes the preference score over the set
? Extensive Benchmarking: We benchmark the proposed solu- of trajectories. The preference score of a trajectory Tj over a set
tions through extensive experimental evaluation over real and syn- of sites Q is defined as the utility function Uj for Tj , which is
thetic large-scale datasets. The impact of various parameters such simply the maximum score corresponding to the sites in Q, i.e.,
as budget, coverage threshold, trajectory-length, city geometries, Uj = maxsi ∈Q {ψ(Tj , si )}.
etc. are thoroughly evaluated across multiple large-scale datasets. The generic TOPS query formulation is stated next.
We next describe in detail the TOPS, TUMP and TIPS problems.
P ROBLEM 1 (TOPS). Given a set of trajectories T , a set of
candidate sites S that can host the services, the TOPS problem
2. THE TOPS PROBLEM with query parameters (k, τ, ψ) seeks to report the best k sites,
Referring to Scenario 1, here we consider the problem of OL Q ⊆ S, |Q| =P k, that maximize the sum of trajectory utilities, i.e.,
queries for services such as fuel-stations, ATMs, convenience stores, Q = arg max m j=1 Uj where Uj = maxsi ∈Q {ψ(Tj , si )}.
bill boards, etc., that are typically demanded only intermittently,
and not continuously throughout the trip. We show that the TOPS problem is P NP-hard [16]. Further, we
Consider a road network G = {V, E} over a geographical area also prove that the sum of utilities U = m
j=1 Uj is a non-decreasing
where V = {v1 , . . . , vN } denotes the set of road intersections sub-modular function [16].
(usually referred to as vertices or nodes), and E denotes the road
segments between two adjacent road intersections. To model the 2.1 Algorithms for TOPS
direction of the underlying traffic that passes over a road segment, We propose a greedy approximation algorithm I NC -G REEDY to
we assume that the edges are directed. Assume a set of candidate solve TOPS. This is based on the principle of maximizing marginal
sites S = {s1 , · · · , sn } ⊆ V where a certain service or facility gain. In successive k iterations, it picks a site that offers maximal
can be set up. The set S can be in addition to existing service loca- gain in the utility U . The approximation bound of this algorithm is
tions. Without loss of generality, we can augment the vertices V to max{1−1/e, k/n}. To improve the performance of this algorithm,
include all the sites. Thus, S ⊆ V . we use FM Sketches [7] for efficiently identifying the site offering
The set of trajectories over G is denoted by T = {T1 , · · · , Tm } maximal marginal gain in the utility U . The major drawback of this
where each trajectory Tj is a sequence of nodes Tj = {vj1 , · · · , vjl }, scheme is its high memory overhead of O(mn) which is infeasible
vji ∈ V . The trajectories are usually recorded as GPS traces and for large datasets. To address this limitation, we develop a scalable
may contain arbitrary spatial points on the road network. For our framework N ET C LUS, discussed next.
purpose, each trajectory is map-matched [14] to form a sequence We first state one basic observation. If two sites are close, the sets
of road intersections through which it passes. We assume that each of trajectories they cover are likely to have a high overlap. Hence,
when k n, which is typically the case, the sites chosen in the Bi ∈ B for a time interval of ∆i units and received a throughput of
answer set are likely to be distant from each other. The index struc- ηi bytes per unit of time. Note that ηi can be any metric as long as a
ture, N ET C LUS, is designed based on the above observation. greater ηi denotes better experience (e.g., throughput or packet suc-
Clustering of sites based on similarities between set member- cess rate) when associated to a respective base-station. Henceforth,
ships (such as Jaccard similarity of trajectory sets covered by two for brevity, we refer to this metric as throughput. These trajecto-
sites) will not be useful since they depend on the coverage threshold ries and Φi ’s can be constructed by scanning the active transaction
τ which is available only at run time. Hence, we adopt distance- records maintained by the network operator.
based clustering. If two sites are close to each other, their utilities For ease of notation, we write Bi ∈ Tj if the base-station Bi ∈
as well as the sets of trajectories they cover are likely to be similar. B lies on the trajectory Tj . The length of a trajectory Tj , denoted
Hence, it is unlikely for two sites belonging to the same cluster to by |Tj |, is simply the count of base-stations that lie on it. Suppose
be part of the answer set. Therefore, if I NC -G REEDY is performed d denotes the maximum length of a trajectory.
only on the clusters instead of the sites, the answer sets returned For a trajectory Tj , a base-station Bi ∈ Tj is a bottleneck base-
and their corresponding utilities are likely to be similar. station if it offers a degraded quality of service, e.g., an extremely
Our method follows two main phases: offline and online. In the low upload/download speed, a call-drop, etc. In our model, we
offline phase, clusters are built at multiple resolutions. This forms assume that a base-station Bi acts as a bottleneck w.r.t. a trajec-
the different index instances. A particular index instance is useful tory Tj if the corresponding throughput is less than a threshold,
for a particular range of query coverage thresholds. In the online i.e., ηi < ψ. The value of ψ is computed from a combination
phase, when the query parameters are known, first the appropriate of network parameters. A base-station that is a bottleneck for one
index instance is chosen. The I NC -G REEDY algorithm is then run trajectory may not be a bottleneck for other trajectories since dif-
with the cluster representatives on that instance. We omit the details ferent users may experience different throughputs based on various
in the interest of space. factors such as data plan, time of the day, etc.
Next we outline the solution at a high level. Given the raw GPS Our goal is to maximally improve the mobile user experience
traces of user movements, they are map-matched [14] to the cor- by selectively upgrading k out of n base-stations that act as bottle-
responding road network. From the map-matched trajectories, in necks on some trajectories. Suppose X = {x1 , . . . , xn } denotes
conjunction with the road network, the N ET C LUS index structure the boolean solution vector such that xi = 1 if and only if base-
is built. N ET C LUS performs a multi-resolution clustering of the station Bi is chosen for upgradation and 0 otherwise.
road network following which indexed views of both the network Given a trajectory Tj , we define the weight wji for each base-
and the trajectories are constructed in a compressed format at var- station Bi ∈ Tj , that accounts for the fraction of the total time
ious granularities. This completes the offline phase. In the online that the user (on this trajectory) was connected to the base-station
phase, given the query parameters, the optimum clustering resolu- Bi . More precisely, wji = P ∆i ∆i . Suppose bji denotes a
Bi ∈Tj
tion to answer the query is identified, and the corresponding views bottleneck indicator variable that takes value 1 if the base-station
of the trajectories and road network are analyzed to retrieve the best Bi ∈ Tj is a bottleneck base-station w.r.t. Tj , and 0 otherwise.
k sites for facility locations. Given a trajectory Tj and solution X, we define a bottleneck
2.2 Summary of Experimental Results utility function Wj as follows:
X X
We evaluate our heuristics on a publicly available and widely Wj = wji + wji .xi (2)
used real dataset consisting of GPS traces of taxis from Beijing Bi ∈Tj ,bji =0 Bi ∈Tj ,bji =1
[21, 22]. This dataset has 123, 179 trajectories, and 269, 686 sites.
To study the impact of city geographies, we also evaluate our Wj essentially captures the fraction of the total time when the user
solutions on three synthetic datasets that emulate trajectories in the enjoys acceptable QoE on the trajectory Tj . If all the base-stations
patterns followed in New York, Atlanta and Bangalore. We use on Tj are non-bottleneck, then Wj = 1; otherwise, Wj < 1.
an online traffic generator tool, MNTG (http://mntg.cs.umn.edu/tg/ Henceforth, we consider the bottleneck utility function as the de-
index.php) to generate the traffic traces. fault trajectory utility function.
We observe that N ET C LUS is up to 2 orders of magnitude faster Based on this, we next define a class of trajectories that enjoy
than I NC -G REEDY while yielding solutions that are within 90% of satisfactory QoE after upgradation of the base-stations.
that of I NC -G REEDY. The use of FM sketches for efficiently com-
D EFINITION 3 (γ- BOTTLENECK - FREE TRAJECTORY ). A tra-
puting the site offering maximal marginal utility, yields speed up
jectory Tj ∈ T is γ-bottleneck-free if its utility Wj ≥ γ where Wj
of 3-6 times. Importantly, while I NC -G REEDY goes out of mem-
is given by Eq. (2), and γ ∈ [0, 1] is the bottleneck parameter.
ory for τ ≥ 1.6 km on a 32 GB machine, N ET C LUS requires less
than 3 GB memory for different values of the query parameters. Our aim is to maximize the number of trajectories with high util-
We also find that the longer trajectories are easier to serve than the ity. To do so, given the bottleneck utility function Wj , we map
shorter trajectories, as they can be served through a larger pool of it to a step utility function Uj using a threshold γ (0 ≤ γ ≤ 1),
candidate sites. In addition, N ET C LUS can efficiently absorb dy- henceforth referred to as the bottleneck parameter:
namic updates such as change in trajectories or candidate locations. (
1 if Wj ≥ γ
Uj = (3)
3. THE TUMP PROBLEM 0 otherwise
Referring to Scenario 2, here we focus on cellular services that We now formally state the Trajectory-Aware Macro-Cell Planning
aim to provide good quality of experience (QoE) not just at few Problem, TUMP(γ).
discrete points, but throughout the trip.
Consider a cellular network B = {B1 , . . . , Bn } of n base-stations P ROBLEM 2 (TUMP(γ)). Given a base-station network B
spread across a region. A trajectory Tj is represented as a sequence of size n, a budget parameter k, a bottleneck parameter γ, and a
of tuples of the form Φi = hBi , ∆i , ηi i that captures the user expe- set of m trajectories T = {T1 , . . . , Tm }, each of which has an
rience. The user on this trajectory was connected to the base-station associated utility function Wj , determine the set of k base-stations
P
to upgrade such that the sum of utilities Tj ∈T Uj is maximized, of the two versions of the problem. We are working towards a
where Uj is given by Eq. (3). scalable implementation of these heuristics. For evaluation, we use
the same real and synthetic datasets, as in the case of the TOPS
We show that the TUMP(γ) problem is NP-hard due to a reduc- problem. The preliminary results show that the proposed solutions
tion from the k-Vertex Cover (k-VC) problem [15]. are highly effective and efficient.
Our problem formulation enables the network operator to suit-
ably select the bottleneck parameter γ based on the application re-
Conclusions
quirements. For example, γ = 1 is suitable for real-time appli- With expansion of cities, and growing urban population, more peo-
cations such as voice calls or video conferences whereas γ = 0.8 ple are required to commute longer distances, and thereby gener-
may suffice for video streaming since video players can mask-off ating new demands for various services such as fuel stations, food
certain durations of low connectivity by buffering. Similarly, even joints, mobile services, retail stores, etc. In this light, identifying
γ = 0.5 may be enough for elastic applications such as background the best service-locations is highly critical to the success of these
synchronization of emails. trajectory-aware services. The large-scale road networks and high
volume of trajectories to be served, make these problems extremely
3.1 Algorithms for TUMP challenging. This doctoral thesis introduces three such optimal lo-
Since the TUMP(γ) problem is NP hard, we design four ap- cation problems, namely TOPS, TUMP, and TIPS and develop ef-
proximation algorithms based on linear programming, and greedy ficient and effective frameworks to solve them.
paradigm [15]. Among the proposed schemes, the algorithm that
stands out in terms of quality and practical running times is D EC - 5.[1] O.REFERENCES
Berman, D. Bertsimas, and R. C. Larson. Locating discretionary service
G REEDY. This is a greedy algorithm that works on the principle facilities, ii: maximizing market size, minimizing inconvenience. Operations
of minimizing marginal loss. Initially, we assume that the solution Research, 43(4):623–632, 1995.
comprises of the full set of bottleneck base-stations. Given that [2] O. Berman, R. C. Larson, and N. Fouska. Optimal location of discretionary
the budget is k, the algorithm runs for n − k iterations, where in service facilities. Transportation Science, 26(3):201–211, 1992.
[3] M. Boccia, A. Sforza, and C. Sterle. Flow intercepting facility location:
each iteration, it prunes away a base-station that results in minimal Problems, models and heuristics. Journal of Mathematical Modelling and
loss
k
utility U . The approximation bound of this algorithm is
innthe Algorithms, 8(1):35–79, 2009.
d
/ d . [4] Z. Chen, Y. Liu, R. C.-W. Wong, J. Xiong, G. Mai, and C. Long. Efficient
We show that D EC -G REEDY algorithm can be incrementally ap- algorithms for optimal location queries in road networks. In SIGMOD, pages
123–134, 2014.
plied to an evolving network; i.e., as and when the operator allo- [5] E. Cho, S. A. Myers, and J. Leskovec. Friendship and mobility: user movement
cates additional budget, this algorithm can be applied to incremen- in location-based social networks. In Proceedings of the 17th ACM SIGKDD
tally evolve the network from one generation to another. international conference on Knowledge discovery and data mining, pages
1082–1090. ACM, 2011.
3.2 Summary of Experimental Results [6] Z. Drezner. Facility location: a survey of applications and methods. Springer,
1995.
Among the different algorithms that we design for the TUMP [7] P. Flajolet and G. N. Martin. Probabilistic counting algorithms for data base
problem, D EC -G REEDY provides the optimum balance between applications. J. Computer and System Sciences, 31(2):182–209, 1985.
quality and running time. On thorough empirical evaluation across [8] T. F. Gonzalez. Clustering to minimize the maximum intercluster distance.
Theoretical Computer Science, 38:293–306, 1985.
multiple datasets emulating different city topologies, it is observed [9] H. W. Hamacher and Z. Drezner. Facility location: applications and theory.
that D EC -G REEDY offers significantly higher quality than the other Springer, 2002.
algorithms, especially at low budgets and higher γ, i.e., higher QoE [10] W. He, D. Li, T. Zhang, L. An, M. Guo, and G. Chen. Mining regular routes
requirements. For example, with an upgrade budget of k = 20% from gps data for ridesharing recommendations. In SIGKDD, pages 79–86,
2012.
and γ = 0.8, D EC -G REEDY returns solutions that serve 3-8 times [11] V. Kolar, S. Ranu, A. P. Subramainan, Y. Shrinivasan, A. Telang, R. Kokku, and
more number of users than an approach that uses a greedy location- S. Raghavan. People in motion: Spatio-temporal analytics on call detail records.
based base-station upgrade. In COMSNETS, pages 1–4, 2014.
We also observe that the investment required to provide satis- [12] J.-G. Lee, J. Han, and K.-Y. Whang. Trajectory clustering: a
partition-and-group framework. In Proceedings of the 2007 ACM SIGMOD
factory QoE to mobile users is dependent on the population dis- international conference on Management of data, pages 593–604. ACM, 2007.
tributions and their road-network. Specifically, cities with a dense [13] C. Long, R. C.-W. Wong, and H. Jagadish. Direction-preserving trajectory
central business districts, such as New York, need less budget to simplification. Proceedings of the VLDB Endowment, 6(10):949–960, 2013.
satisfy a large segment of mobile users than cities where businesses [14] Y. Lou, C. Zhang, Y. Zheng, X. Xie, W. Wang, and Y. Huang. Map-matching
for low-sampling-rate gps trajectories. In GIS, pages 352–361, 2009.
are spread out (e.g., Atlanta). [15] S. Mitra, S. Ranu, V. Kolar, A. Telang, A. Bhattacharya, R. Kokku, and
S. Raghavan. Trajectory aware macro-cell planning for mobile users. In
INFOCOM, 2015.
4. THE TIPS PROBLEM [16] S. Mitra, R. Sharma, P. Saraf, A. Bhattacharya, and S. Ranu. Netclus: A
Referring to Scenario 3, the next problem is related to the TOPS scalable indexing framework for trajectory-aware placement of services. In
problem, which we refer to as Trajectory-aware Inconvenience- Under Review at VLDB, 2016.
[17] W. S. Sarle. Finding groups in data: An introduction to cluster analysis. Journal
minimizing Placement of Services (TIPS), stated as follows: of the American Statistical Association, 86(415):830–833, 1991.
P ROBLEM 3 (TIPS). Given a set of trajectories T , a set of [18] R. Song, W. Sun, B. Zheng, and Y. Zheng. Press: A novel framework of
trajectory compression in road networks. Proceedings of the VLDB Endowment,
candidate sites S that can host the services, the TIPS problem with 7(9):661–672, 2014.
query parameter k seeks to report the best k sites, Q ⊆ S, |Q| = [19] T.-H. Wu and J.-N. Lin. Solving the competitive discretionary service facility
k, that minimizes the maximum (average) inconvenience, i.e., the location problem. European Journal of Operational Research, 144(2):366–378,
extra distance travelled by a commuting user in order to avail a 2003.
[20] X. Xiao, B. Yao, and F. Li. Optimal location queries in road network databases.
service at her nearest service location. In ICDE, pages 804–815, 2011.
[21] J. Yuan, Y. Zheng, X. Xie, and G. Sun. Driving with knowledge from the
The maximum (average) inconvenience minimization problem is physical world. In KDD, pages 316–324, 2011.
a generalization of the k-center problem [8] (k-medoidS problem [22] J. Yuan, Y. Zheng, C. Zhang, W. Xie, X. Xie, G. Sun, and Y. Huang. T-drive:
[17]), and is thus, NP Hard. We design multiple heuristics for each driving directions based on taxi trajectories. In SIGSPATIAL GIS, 2010.