=Paper= {{Paper |id=Vol-1088/paper1 |storemode=property |title=Predicting Globally and Locally: A Comparison of Methods for Vehicle Trajectory Prediction |pdfUrl=https://ceur-ws.org/Vol-1088/paper1.pdf |volume=Vol-1088 |dblpUrl=https://dblp.org/rec/conf/ijcai/GrovesNG13 }} ==Predicting Globally and Locally: A Comparison of Methods for Vehicle Trajectory Prediction== https://ceur-ws.org/Vol-1088/paper1.pdf
                  Predicting Globally and Locally: A Comparison of Methods
                               for Vehicle Trajectory Prediction

                             William Groves, Ernesto Nunes, and Maria Gini
                  Department of Computer Science and Engineering, University of Minnesota
                                    {groves, enunes, gini}@cs.umn.edu


                          Abstract                                  global analysis of the data, via clustering, as well as local in-
                                                                    ference using the Bayesian framework. In addition, since our
     We propose eigen-based and Markov-based meth-                  algorithm only uses location and time data, it can be easily
     ods to explore the global and local structure of               generalized to other domains with spatio-temporal informa-
     patterns in real-world GPS taxi trajectories. Our              tion. Our contributions are summarized as follows:
     primary goal is to predict the subsequent path of
     an in-progress taxi trajectory. The exploration of               1. We offer a systematic way of extracting common behav-
     global and local structure in the data differenti-                  ioral characteristics from a large set of observations us-
     ates this work from the state-of-the-art literature                 ing an algorithm inspired by principal component anal-
     in trajectory prediction methods, which mostly fo-                  ysis (EigenStrat) and our LapStrat algorithm.
     cuses on local structures and feature selection. We              2. We compare the effectiveness of methods that explore
     propose four algorithms: a frequency based algo-                    global structure only (FreqCount and EigenStrat), lo-
     rithm FreqCount, which we use as a benchmark,                       cal structure only (MCStrat), and mixed global and lo-
     two eigen-based (EigenStrat, LapStrat), and a                       cal structure (LapStrat). We show experimentally that
     Markov-based algorithm (MCStrat). Pairwise per-                     LapStrat offers competitive prediction power compared
     formance analysis on a large real-world data set re-                to the more local structure-reliant MCStrat algorithm.
     veals that LapStrat is the best performer, followed
     by MCStrat.                                                    2 Related Work
                                                                    Eigendecomposition has been used extensively to analyze and
1 Introduction                                                      summarize the characteristic structure of data sets. The struc-
In order to discover characteristic patterns in large spatio-       ture of network flows is analyzed in [Lakhina et al., 2004],
temporal data sets, mining algorithms have to take into ac-         principal component analysis (PCA) is used to summarize the
count spatial relations, such as topology and direction, as well    characteristics of the flows that pass through an internet ser-
as temporal relations. The increased use of devices that are        vice provider. [Zhang et al., 2009] identify two weaknesses
capable of storing driving-related spatio-temporal informa-         that make PCA less effective on real-world data. i.e. sensi-
tion helps researchers and practitioners gather the necessary       tivity to outliers in the data, and concerns about its interpreta-
data to understand driving patterns in cities, and to design        tion, and present an alternative, Laplacian eigenanalysis. The
location-based services for drivers. To the urban planner, the      difference between these methods is due to the set of relation-
work can help to aggregate driver habits and can uncover al-        ships each method considers: the Laplacian matrix only con-
ternative routes that could help alleviate traffic. Additionally,   siders similarity between close neighbors, while PCA consid-
it also helps prioritize the maintenance of roads.                  ers relationships between all pairs of points. These studies
   Our work combines data mining techniques that discover           focus on the clustering power of the eigen-based methods to
global structure in the data, and local probabilistic methods       find structures in the data. Our work goes beyond summariz-
that predict short-term routes for drivers, based on past driv-     ing the structure of the taxi routes, and uses the eigenanalysis
ing trajectories through the road network of a city.                clusters to predict the subsequent path of an in-progress taxi
   The literature on prediction has offered Markov-based            trajectory.
and other probabilistic methods that predict paths accurately.         Research in travel prediction based on driver behavior has
However, most methods rely on local structure of data, and          enjoyed some recent popularity. [Krumm, 2010] predicts the
use many extra features to improve prediction accuracy. In          next turn a driver will take by choosing with higher likeli-
this paper we use only the basic spatio-temporal data stream.       hood a turn that links more destinations or is more time effi-
We advance the state-of-the-art by proposing the LapStrat           cient. [Ziebart et al., 2008] offer algorithms for turn predic-
algorithm. This algorithm reduces dimensionality and clus-          tion, route prediction, and destination prediction. The study
ters data using spectral clustering to then predict a subse-        uses a Markov model representation and inverse reinforce-
quent path using a Bayesian network. Our algorithm supports         ment learning coupled with maximum entropy to provide ac-
curate predictions for each of their prediction tasks. [Veloso                                                                     S1
                                                                                                                                             e
                                                                                                                                              S1 S2          S2
                                                                                                                                                                      e
                                                                                                                                                                        S2 S3              S3
et al., 2011] proposes a Naive Bayes model to predict that a
                                                                                                                                             e                        e
taxi will visit an area, using time of the day, day of the week,                                                                                 S2 S1                    S3 S2

weather, and land use as features. In [Fiosina and Fiosins,                                                                    e        e                e        e                    e                                e
                                                                                                                                S4 S1   S1 S4            S5 S2    S2 S5                S6 S3                            S3 S6
2012], travel time prediction in a decentralized setting is in-
vestigated. The work uses kernel density estimation to predict                                                                     S4
                                                                                                                                             e
                                                                                                                                              S4 S5          S5
                                                                                                                                                                      e
                                                                                                                                                                        S5 S6              S6
the travel time of a vehicle based on features including length
                                                                                                                                             e                        e
of the route, average speed in the system, congestion level,                                                                                     S5 S4                    S6 S5

number of traffic lights, and number of left turns in the route.
   All these studies use features beyond location to improve                                                             Figure 1: City grid transitions are all rectilinear.
prediction accuracy, but they do not offer a comprehensive
analysis of the structure of traffic data alone. Our work ad-                        value for each edge in the city grid. Each δsl ,sm is a directed
dresses this shortcoming by providing both an analysis of                            edge coefficient indicating that a transition occurred between
commuting patterns, using eigenanalysis, and route predic-                           sl and sm in the trajectory. The policy vectors for this data
tion based on partial trajectories.                                                  set graph have length (|π|) of 1286, based on the number of
                                                                                     edges in the graph. A small sample city grid is in Figure 1. A
3 Data Preparation                                                                   collection of policies Π =< π1 , π2 , . . . , πw > is computed
                                                                                     from a collection of trajectories V :
The GPS trajectories we use for our experiments are taken
from the publicly available Beijing Taxi data set which in-                                  π vi =< δsv1i,s2 , . . . , δsvli,sm , . . . >        (2)
cludes 1 to 5-minute resolution location data for over ten-                                         (           PN −1              vi vi
thousand taxis for one week in 2009 [Yuan et al., 2010]. Bei-                                         1, if j=1 Φ(cj , cj+1 , esl ,sm ) ≥ 1
                                                                                         δsvli,sm =                                               (3)
jing, China is reported to have seventy-thousand registered                                           0 Otherwise
taxis, so this data set represents a large cross-section of all                      A graphical example showing a trajectory converted into a
taxi traffic for the one-week period [Zhu et al., 2012].                             policy is shown in Figure 2. All visited locations for trajec-
    Because the data set contains only location and time in-
formation of each taxi, preprocessing the data into segments                                                                                                                                                                    12
based on individual taxi fares is useful. The data has sufficient                                                                       GPS Waypoints (time ordered)
                                                                                                                                            Policy Grid Transitions
detail to facilitate inference on when a taxi ride is completed:                                                                                                                                                                10




                                                                                                                                                                                                Waypoint Sequence No.
                                                                                            Latitude (grid cell index)



for example, a taxi waiting for a fare will be stopped at a taxi
stand for many minutes [Zhu et al., 2012]. Using these infer-                                                              8                                                                                                    8

ences, the data is separated into taxi rides.
                                                                                                                                                                                                                                6
    To facilitate analysis, the taxi trajectories are discretized
into transitions on a region grid with cells of size 1.5 km ×                                                                                                                                                                   4
1.5 km square. V =< v1 , v2 , . . . , vw > is a collection of
trajectories. We divide it into VTR , VTE , VVA which are the                                                                                                                                                                   2
training, test, and validation sets, respectively. A trajectory
vi is a sequence of N time-ordered GPS coordinates: vi =<                                                                  4
                                                                                                                                              8                                   12
                                                                                                                                                                                                                                0

cv1 i , . . . cvj i , . . . , cvNi >. Each coordinate contains a GPS lat-                                                                   Longitude (grid cell index)
itude and longitude value, cvj i = (xj , yj ). Given a complete
trajectory (vi ), a partial trajectory (50% of a full trajectory)                        Figure 2: A trajectory converted to a policy in the city grid.
can be generated as vipartial =< cv1 i , cv2 i , . . . , cvN/2         i
                                                                           >. The                                                                             partial

last location of a partial trajectory vi           last           vi
                                                           =< cN/2 > is used         tory vipartial are given by θ vi                                                     :
to begin the prediction task.                                                                                            vipartial
                                                                                                 θ                                 =< ωs1 , ωs2 , . . . , ωsm >, with                                                                (4)
    The relevant portion of the city’s area containing the ma-                                                                       (         Pn             vipartial
jority of the city’s taxi trips, called a city grid, is enclosed                                                               ωsi =   1,   if    j=1     I(c j         , si ) ≥ 1                                                   (5)
in a matrix of dimension 17 × 20. Each si corresponds to                                                                               0 Otherwise
the center of a grid square in the euclidean xy-space. The
city graph is encoded as a rectilinear grid with directed edges                      A baseline approach for prediction, FreqCount, uses ob-
(esi sj ) between adjacent grid squares. I(cj , si ) is an indicator                 served probabilities of each outgoing transition from each
function that returns 1 if GPS coordinate cj is closer to grid                       node in the graph. Figure 3 shows the relative frequencies
center si than to any other grid center and otherwise returns                        of transitions between grid squares in the training set. This
0. Equation 1 shows an indicator function to determine if two                        city grid discretization is similar to methods used by others in
GPS coordinates indicate traversal in the graph.                                     this domain [Krumm and Horvitz, 2006; Veloso et al., 2011].

                                       1, if I(cvj i , sl ) ∗ I(ckvi , sm ) = 1      4 Methods
                                     
           vi vi
    Φ(cj , ck , esl sm ) =
                                       0 Otherwise                                   This work proposes four methods that explore either the local
                                                                               (1)   or the global structure or a mix of both to predict short-term
 From trajectory vi , a policy vector πi is created having one                       trajectories for drivers, based on past trajectories.
                                                                                                                                             location probability
Latitude (grid cell index)




                                                                                                           Latitude (grid cell index)




                                                                                                                                                                                    Latitude (grid cell index)
                                8                                                                                                        8                                                                       8




                                4                                                                                                        4                                                                       4
                                                                     8                        12                                                   8                        12                                             8                       12
                                                               Longitude (grid cell index)                                                   Longitude (grid cell index)                                             Longitude (grid cell index)

                                                                                          partial
Figure 4: A sample partial policy π vi                                                                     Figure 5: θ̂, location probabilities from Fre-                          Figure 6: Actual complete trajectory corre-
                                                                                                           qCount method with horizon of 3                                         sponding to Fig. 4, trajectory π vi



                                                          15
                                                                                                                                                            Algorithm 1: Policy Iteration
                                                                                                                                                          Input: Location vector with last location of taxi θ last , a
                             Latitude (grid cell index)




                                                          12                                                                                                      policy list Π, prediction horizon niter
                                                                                                                                                          Output: A location vector containing visit probabilities
                                                          9                                                                                                         for future locations θ̂
                                                                                                                                                            accum
                                                                                                                                                        1 θ        ← θ last
                                                          6                                                                                             2 for π ∈ Π do
                                                                                                                                                        3     t←1
                                                          3
                                                                                                                                                        4     θ 0 ← θ last
                                                                                                                                                        5     while t ≤ niter do
                                                          0
                                                               0         3      6         9          12                                 15              6         θ t =< ωst 1 , ωst 2 , . . . , ωst i , . . . , ωst M >
                                                                             Longitude (grid cell index)                                                7         , where ωst i = maxsj ∈S (ωst−1           j
                                                                                                                                                                                                                 ∗ δsπj ,si )
                                                                                                                                                        8         t←t+1
Figure 3: Visualization of frequency counts for edge transitions in
the training set. Longer arrows indicate more occurrences.                                                                                              9     for Si ∈ S do
                                                                                                                                                                      accum                   accum            t
                                                                                                                                                       10         ωsθi      = max(ωsθi                  , ωsθi )
Benchmark: Frequency Count Method. A benchmark pre-
diction measure, FreqCount, uses frequency counts for tran-                                                                                            11    θ̂ = θ accum
sitions in the training set to predict future actions. The relative
frequency of each rectilinear transition from each location in
the grid is computed and is normalized based on the number                                                                                              EigenStrat: Eigen Analysis of Covariance. This method
of trajectories involving the grid cell. The resulting policy                                                                                           exploits linear relationships between transitions in the grid
matrix is a Markov chain that determines the next predicted                                                                                             which 1) can be matched to partial trajectories for purposes
action based on the current location of the vehicle.                                                                                                    of prediction and 2) can be used to study behaviors in the
   The FreqCount method computes a policy vector based                                                                                                  system. The first part of the algorithm focuses on model gen-
on all trajectories in the training set VT R . π FreqCount contains                                                                                     eration. For each pair of edges, the covariance is computed
first order Markov transition probabilities computed from all                                                                                           using the training set observations. The n largest eigenvectors
trajectories as in Equation 6.                                                                                                                          are computed from the covariance matrix. These form a col-
                                 P           v                                                                                                          lection of characteristic eigen-strategies from training data.
                 πFreqCount        v∈VT R δsi ,sj
               δsi ,sj      = P         PM v                    (6)                                                                                        When predicting for an in-progress trajectory, the algo-
                                v∈VT R    k=1 δsi ,sk                                                                                                   rithm takes the policy generated from a partial taxi trajectory
   The probability of a transition (si → sj ) is computed as                                                                                            π vpredict , a maximum angle to use as the relevancy threshold
the count of the transition si → sj in VT R divided by the                                                                                              α, and the eigen-strategies as Π. Eigen-strategies having an
count of all transitions exiting si in VT R .                                                                                                           angular distance less than α to π vpredict are added to Πrel .
   Policy iteration (Algorithm 1) is applied to the last loca-                                                                                          This collection is then used for policy iteration. Optimal val-
tion of a partial trajectory using the frequency count policy                                                                                           ues for α and dims are learned experimentally.
set ΠFreqCount =< π FreqCount > to determine a basic predic-                                                                                               Eigenpolicies also facilitate exploration of strategic deci-
tion of future actions. This method only considers frequency                                                                                            sions. Figure 7 shows an eigenpolicy plot with a distinct pat-
of occurrence for each transition in the training set, so it is ex-                                                                                     tern in the training data. Taxis were strongly confined to tra-
pected to perform poorly in areas where trajectories intersect.                                                                                         jectories either the inside circle or the perimeter of the circle,
 Algorithm 2: EigenStrat                                                   Algorithm 3: LapStrat
  Input: ΠTR , number of principal components (dims),                       Input: ΠTR , dimension (dims), number of clusters (k),
          minimum angle between policies (α), prediction                            similarity threshold (ǫ), prediction (horizon),
                                                       partial                                            partial
          horizon (horizon), partial policy (π vi              )                    partial policy (π vi          )
  Output: Inferred location vector θ̂                                       Output: Inferred location vector θ̂
1 Generate covariance matrix C|πi |×|πi | (where πi ∈ ΠTR )               1 Generate similarity matrix W|ΠTR |×|ΠTR | where
  between transitions on the grid
                                                                                    
                                                                                      J(πi , πj ), if J(πi , πj ) ≥ ǫ
2 Get the dims eigenvectors of C with largest eigenvalues                   wij =
                                                                                      0             Otherwise
                                               v partial
3 Compute cosine similarity between π i                  and the          2 Generate Laplacian (L): L = D − W and ∀dij ∈ D
  principal components (πj , j = 1 . . . dims):                                    (P
                                                                                        |ΠT R |
                                 partial
                                                                            dij =       z=1 wiz , if i = z
  Πrel = {πj ||cos(πj , π vi             )| > α}
                    vi partial                                                       0                 Otherwise
4 If the cos(πj , π            ) < 0, then flip the sign of the
                                                                          3 Get the dims eigenvectors with smallest eigenvalues
  coefficients for this eigenpolicy. Use Algorithm 1 with
                                                                          4 Use k-means to find the mean centroids (πj , j = 1 . . . k)
  Πrel on vipartial for horizon iterations to compute θ̂                    of k policy clusters
                                                                                                                v partial
                                                                          5 Find all centroids similar to π i             :
                                                                                                      partial
                                                                                                   vi
but rarely between these regions. The two series (positive and              Πrel = {πj |J(πj , π              ) > ǫ}
                                                                                                                   partial
negative) indicate the sign and magnitude of the grid coeffi-             6 Use Algorithm 1 with Πrel on vi                 for horizon
cients for this eigenvector. We believe analysis of this type               iterations to compute θ̂
has great promise for large spatio-temporal data sets.

                                                                          perform k-means to find clusters in the reduced dimension.
                                                positive directions       The optimal value for dims is learned experimentally.
                                               negative directions
       Latitude (grid cell index)




                                    8
                                                                          MCStrat: Markov Chain-Based Algorithm. The Markov
                                                                                                                                   partial
                                                                          chain approach uses local, recent information from vpredict      ,
                                                                          the partial trajectory to predict from. Given the last k edges
                                                                          traversed by the vehicle, the algorithm finds all complete tra-
                                                                          jectories in the training set containing the same k edges to
                                    4                                     build a set of relevant policies Vrel using the match function.
                                                                          match(k, a, b) returns 1 only if at least the last k transitions
                                                                          in the policy generated by trajectory a are also found in b.
                                                                          Using Equation 9, Vrel is used to build a composite single
                                                4                     8
                                                                          relevant policy πrel , that obeys the Markov assumption, so
                                        Longitude (grid cell index)
                                                                          the resulting policy preserves the probability mass.
      Figure 7: An eigenpolicy showing a strategic pattern.                                                    partial
                                                                             Vrel = {vi match(k, π vpredict , π vi ) = 1, vi ∈ VTR } (7)
                                                                                              rel            rel
LapStrat: Spectral Clustering-Inspired Algorithm. Lap-                        πrel =< δsπ1 ,s2 , . . . , δsπi ,sj , . . . >                (8)
Strat (Algorithm 3) combines spectral clustering and                                     P                 v
                                                                             πrel            v∈Vrel δsi ,sj
Bayesian-based policy iteration to cluster policies and infer               δsi ,sj = P            PM v                                    (9)
driver next turns. Spectral clustering operates upon a simi-                           v∈Vrel          k=1 δsi ,sk
larity graph and its respective Laplacian operator. This work             Using the composite πrel , policy iteration is then performed
follows the approach of [Shi and Malik, 2000] using an un-                on the last location vector computed from vpredict .
normalized graph Laplacian. We use Jaccard index to com-
pute the similarity graph between policies. We chose the Jac-             Method Complexity Comparison. A comparison of the
card index, because it finds similarities between policies that           storage complexity of the methods appears in Table 1.
are almost parallel. This is important in cases such as two
highways that only have one meeting point; in this case, if                 Model               Model Construction            Model Storage
the highways are alternative routes to the same intersection,             FreqCount             O(|π|)                        O(|π|)
they should be similar with respect to the intersection point.            EigenStrat            O((|π|)2 )                    O(dims × |π|)
The input to the Jaccard index are two vectors representing
policies generated in Section 3. J(πi , πj ) is the Jaccard sim-           LapStrat             O((|ΠT R |)2 )                O(k × |π|)
ilarity for pair πi and πj . The unnormalized Laplacian is                 MCStrat              O(1)                          O(|ΠT R | × |π|)
computed by subtracting the degree matrix from the similar-
ity matrix in the same fashion as [Shi and Malik, 2000]. We                               Table 1: Space complexity of methods.
choose the dims eigenvectors with smallest eigenvalues, and
5 Results                                                               Method       FreqCount EigenStrat LapStrat MCStrat
Given an in-progress taxi trajectory, the methods presented           FreqCount              n/a                     n/a         n/a
facilitate predictions about the future movement of the ve-           EigenStrat  0.431                              n/a         n/a
hicle. To simulate this task, a collection of partial trajecto-
ries (e.g. Figure 4) is generated from complete trajectories           LapStrat *0.000211 *0.000218                             0.462
in the test set. A set of relevant policy vectors is generated         MCStrat *0.00149 *0.000243                    n/a
using one of the four methods described, and policy itera-
tion is performed to generate the future location predictions.       Table 3: p-values of Wilcoxon signed-rank test pairs. Starred (*) val-
The inferred future location matrix (e.g. Figure 5) is com-          ues indicate the row method achieves statistically significant (0.1%
pared against the actual complete taxi trajectory (e.g. Fig-         significance level) improvement over the column method for a pre-
ure 6). Prediction results are scored by comparing the in-           diction horizon of 6. If n/a, the row method’s mean is not better than
                                                                     the column method.
ferred visited location vector θ̂ against the full location vec-
tor θ vi . The scores are computed using Pearson’s correlation:
score = Cor(θ̂, θ vi ). The scores reported are the aggregate        pose to extend this work using a hierarchical approach which
mean of scores from examples in the validation set.                  simultaneously incorporates global and local predictions to
   The data set contains 100,000 subtrajectories (of approxi-        provide more robust results.
mately 1 hour in length) from 10,000 taxis. The data set is
split randomly into 3 disjoint collections to facilitate experi-     References
mentation: 90% in the training set, and 5% in both the test and      [Fiosina and Fiosins, 2012] J. Fiosina and M. Fiosins. Co-
validation sets. For each model type, the training set is used
                                                                        operative kernel-based forecasting in decentralized multi-
to generate the model. Model parameters are optimized using
                                                                        agent systems for urban traffic networks. In Ubiquitous
the test set. Scores are computed using predictions made on
                                                                        Data Mining, pages 3–7. ECAI, 2012.
partial trajectories from the validation set.
   Results of each method for 4 prediction horizons are shown        [Krumm and Horvitz, 2006] J. Krumm and E. Horvitz. Pre-
in Table 2. The methods leveraging more local information               destination: Inferring destinations from partial trajecto-
near the last location of the vehicle (LapStrat, MCStrat) per-          ries. UbiComp 2006, pages 243–260, 2006.
form better than the methods relying only on global patterns         [Krumm, 2010] J. Krumm. Where will they turn: predicting
(FreqCount, EigenStrat). This is true for all prediction hori-          turn proportions at intersections. Personal and Ubiquitous
zons, but the more local methods have an even greater perfor-           Computing, 14(7):591–599, 2010.
mance advantage for larger prediction horizons.
                                                                     [Lakhina et al., 2004] A. Lakhina,           K. Papagiannaki,
                         Prediction Horizon                             M. Crovella, C. Diot, E. D. Kolaczyk, and N. Taft.
                                                                        Structural analysis of network traffic flows. Perform. Eval.
Method          1          2           4           6                    Rev., 32(1):61–72, 2004.
FreqCount .579 (.141) .593 (.127) .583 (.123) .573 (.122)
EigenStrat .563 (.143) .576 (.134) .574 (.140) .574 (.140)           [Shi and Malik, 2000] J. Shi and J. Malik. Normalized cuts
LapStrat .590 (.144) .618 (.139) .626 (.137) .626 (.137)                and image segmentation. IEEE Trans. on Pattern Analysis
MCStrat .600 (.146) .616 (.149) .621 (.149) .621 (.149)                 and Machine Intelligence, 22(8):888–905, 2000.
                                                                     [Veloso et al., 2011] M. Veloso, S. Phithakkitnukoon, and
Table 2: Correlation (std. dev.) by method and prediction horizon.      C. Bento. Urban mobility study using taxi traces. In Proc.
The best score is in bold.                                              of the 2011 Int’l Workshop on Trajectory Data Mining and
                                                                        Analysis, pages 23–30, 2011.
   Statistical significance testing was performed on the vali-       [Yuan et al., 2010] J. Yuan, Y. Zheng, C. Zhang, W. Xie,
dation set results, as shown in Table 3. The best performing            X. Xie, G. Sun, and Y. Huang. T-drive: driving directions
methods (LapStrat and MCStrat) achieve a statistically sig-             based on taxi trajectories. In Proc. of the 18th SIGSPATIAL
nificant performance improvement over the other methods.                Int’l Conf. on Advances in GIS, pages 99–108, 2010.
However, the relative performance difference between the lo-
cal methods is not significantly different.                          [Zhang et al., 2009] J. Zhang, P. Niyogi, and Mary S.
                                                                        McPeek. Laplacian eigenfunctions learn population struc-
6 Conclusions                                                           ture. PLoS ONE, 4(12):e7928, 12 2009.
The methods presented can be applied to many other spatio-           [Zhu et al., 2012] Y. Zhu, Y. Zheng, L. Zhang, D. Santani,
temporal domains where only basic location and time infor-              X. Xie, and Q. Yang. Inferring Taxi Status Using GPS
mation is collected from portable devices, such as sensor net-          Trajectories. ArXiv e-prints, May 2012.
works as well as mobile phone networks. These predictions            [Ziebart et al., 2008] B. Ziebart, A. Maas, A. Dey, and
assume the action space is large but fixed and observations             J. Bagnell. Navigate like a cabbie: probabilistic reason-
implicitly are clustered into distinct but repeated goals. In           ing from observed context-aware behavior. In Proc. of the
this domain, each observation is a set of actions a driver takes        10th Int’l Conf. on Ubiquitous computing, UbiComp ’08,
in fulfillment of a specific goal: for example, to take a passen-       pages 322–331, 2008.
ger from the airport to his/her home. In future work, we pro-