=Paper= {{Paper |id=Vol-2086/AICS2017_paper_26 |storemode=property |title=Inferring Waypoints in the Absence of Knowledge of Driving Style |pdfUrl=https://ceur-ws.org/Vol-2086/AICS2017_paper_26.pdf |volume=Vol-2086 |authors=Daniel A. Desmond,Ken Brown |dblpUrl=https://dblp.org/rec/conf/aics/DesmondB17 }} ==Inferring Waypoints in the Absence of Knowledge of Driving Style== https://ceur-ws.org/Vol-2086/AICS2017_paper_26.pdf
          Inferring Waypoints in the Absence of
                Knowledge of Driving Style

                      Daniel A. Desmond, Kenneth N. Brown

    Insight Centre for Data Analytics, Department of Computer Science, University
                             College Cork, Cork, Ireland
                 {daniel.desmond, ken.brown}@insight-centre.org



        Abstract. We present an algorithm for predicting intervals which con-
        tain waypoints from a GPS trace of a multi-part trip without having
        access to historical data about the driver or any other aggregated data
        sets. We assume the driver’s driving style is not known, but that it can be
        approximated by one of a set of cost preferences. The method uses a set
        of repeated forward and backward searches along the trace, where each
        of the searches represents one of the driving costs. We evaluate the algo-
        rithm empirically on multi-part trips on real route maps. The algorithm
        selects the results of the search with the fewest number of intervals and
        we achieve over 95% recall on estimating waypoints while the intervals
        cover less than 9% of the trace.


1     Introduction

Recreating intent from activity traces is an important aspect of security, missing
persons search, assisted living, and retail analysis. In the retail analysis case the
inference would require the analysis of large sets of activity traces from many
people, while in assisted living there will be many traces from one person. In
the security and missing persons case the inferences may have to be determined
from a single trace for an individual. In such cases the activity traces may be
trajectories through space with intermediate waypoints intepreted as intent.
    In previous work [1] we showed it was possible to predict intervals which
contain up to 97% of waypoints from a driver’s GPS trace when a driver’s route
choice is based on shortest paths. We now we relax this assumption and consider
the case where the driver’s route choice may be based on different criteria. We
assume that we have a model which is an abstraction of road network, which
contains details of travel times and distances, tolls and road types but no other
information. We present an algorithm which compares the results of a series of
repeated forward and backward searches for shortest paths based on different
cost functions which approximate different route choices. We evaluate the al-
gorithm empirically using randomly generated locations from which multi-trip
routes are generated by an online route planner. We demonstrate that the algo-
rithm generates intervals in the traces which cover over 95% of the waypoints,
and where the intervals cover less than 9% of the trace.
   The remainder of the paper is organised as follows: Section 2 discusses related
work. Our proposed approach to the problem is introduced in section 3. Section
4 describes the form of the experiments. The results of the experiments are
reported in section 5 and section 6 concludes the paper.


2   Related Work

Wardrop [2] proposed that the minimization of travel time was the most im-
portant criterion in route selection and his first principle of route choice states
“the journey times on all routes actually used are equal, and less than those
which would be experienced by a single vehicle on any unused route”. Studies
by Duffel and Kalombatis [3] and Huchingson et al. [4] also found that travel
time was considered to be the most important factor when deciding on which
route to take. However in a more recent review of multiple studies Chen et al. [5]
showed that route choice was based on more than travel time, with the driver’s
preferences, familiarity with the areas being travelled and other judgements also
accounting for how routes were chosen. Zhu and Levinson [6] also carried out
an empirical test of Wardrops’s first principle and found that the majority of
people did not choose the shortest path by time.
    Currently the majority of research based on prediction or inference using
GPS traces revolves around destination prediction and establishing if patterns
exist such as popular or heavily travelled routes in the traces being examined.
These require the use of a number of different types of machine learning methods
such as pattern recognition [7] and Hidden Markov Models [8] among others. All
of these methods require the use of large amounts of historical data to build
up their models and do not use a graph of the geographical area being studied.
Where the inference of waypoints via creating sub-traces from the trace is used,
the ultimate aim is still to predict the destination [9]. The most similar problem
to the one we are tackling here is one studied by Kafsi et al [10], where the aim
is to infer a set of waypoints from a GPS trace. They ignore the time component
when segmenting the trace but use historical data to estimate the waypoints
computing the entropy of conditional Markov trajectories. To the best of our
knowledge, we are the first to present a method for inferring waypoints where
historical data is not required and we only assume that the driver is consistent
in their driving style for the whole trip.
    There exist a multitude of trip planners both on-line and off-line such as
Google maps [11], and those built using openstreetmap data(OSM)[12] data
[13,14]. Openrouteservice [15] is an open source route planner in which it is
possible to create multi-point trips with different travel profiles or cost functions
and download the trace data for the trip.
    In previous work [1], we demonstrated a method for inferring waypoints using
only a GPS trace and no other historical or aggregated data, where the assump-
tion was made that the creator of the trace used shortest paths by time. The
method is based on repeated forward and backwards searches using Dijkstra’s
algorithm [16]. The method is discussed in section 3.
3    Approach

Our hypotheses are

 – Given a multipart trajectory constructed from point-to-point trips, in an un-
   known driving style, the individual destinations (waypoints) can be inferred
   from the results of multiple simulations of different driving styles.
 – The simulated driving style which produces the fewest waypoint estimates
   generates a reliable estimate of the true waypoints.

     Let G = (V, E, t, d, c, H, h) be a strongly connected, multi-weighted, directed
graph embedded in a two-dimensional (2D) space. V is the set of vertices where
each vertex is a location in the space. E is the set of directed edges (vi , vj )
where vi , vj ∈ V and so each edge represents a road segment. t is a function
t : E −→ N+ representing the cost of traversing an edge in seconds. d is a
function d : E −→ N+ representing the cost of traversing an edge in miles. c is
a function c : E −→ N+ representing the cost of using a toll in US dollars. h is
a function h : E −→ H representing the type of road of the edge. H is set of
different road types.
     A trip s is a sequence of vertices and s̄ is the last vertex in s. A multitrip
M is a sequence of trips hs1 , s2 , ..., sj i such that s¯i is the first vertex in si+1 . A
flattened multitrip is a sequence of vertices created by flattening the multitrip.
A trace T = hv1 , v2 , ..., vk i is a sequence of vertices sampled in order from the
flattened multitrip. Given a trace our aim is to reconstruct the individual trips
i.e. the endpoints hs¯1 , s¯2 , ..., sj−1
                                       ¯ i from the multitrip. We allow a relaxation in
which our aim is to output a list of intervals h[a1 , b1 ], [a2 , b2 ], ..., [aj , bj ]i where
s¯i is contained within [ai , bi ].
     For this work we define a driving style to be a route choice preference function
using an appropriate cost function χ. A shortest χ-path is the path with the
smallest χ cost. If χ is based on travel time then the χ least cost path will be
the path that takes the least time. To accommodate inaccuracies in travel costs
in our abstraction of the road network and the driver’s bounded knowledge of
the route taken we introduce an ε-shortest χ-path.

Definition 1. ε-shortest χ-path: Path P from A to B is an ε-shortest χ-path
from A to B if there is no other A, B path Q with χ(Q) ≤ χ(P) - ε.

Definition 2. ε-shortest χ-path(percentage): Path P from A to B is an ε-shortest
χ-path from A to B if there is no other A, B path Q with χ(Q) ≤ ( 100−ε  100 ) ∗
χ(P), where ε is a percentage.

    In our previous work [1] we presented an algorithm and demonstrated its
success at inferring intervals containing waypoints when the driver was following
the shortest path by time. In the current work we do not know the driving
style. Instead we simulate different driving styles (approximated by different cost
functions χ) and infer the intervals which contain waypoints from the results of
the computations. The new algorithm modified for a specific cost function χ is
 Algorithm 1: Waypoint Estimation With Cost Function
   input : Allowable Tolerance ε
   input : Heading Tolerance α
   input : cost function χ
   input : Trace T
   output: List K of intervals
 1 List K
 2 List ST                              // will hold the calculated sub-traces
 3 subT raceStart ←− T [0]
 4 for i ← 2 to last point in T do
 5     previous ←− T [i − 2]
 6     current ←− T [i − 1]
 7     next ←− T [i]
 8     heading1 ←− heading traveled from previous to current
 9     heading2 ←− heading traveled from current to next
10     if difference between heading1 and heading2 is an α-heading change then
11         add interval [previous, next] to K
12         add sub-trace from subT raceStart to previous to ST
13         subT raceStart ←− next
14     end
15 end
16 for st ∈ ST do
17     source ←− st[0]
18     while not at end of st do
19         Searching from source find first point B on st which is not a χ-cost
             ε-shortest path
20         Searching back from B find first point A on reverse search of st which
             is not a χ-cost ε-shortest path
21         add interval [A,B] to K
22         source ←− B
23     end
24 end
25 return K
shown in Algorithm 1. The algorithm infers waypoints using two methods which
are carried out sequentially. In the first method (lines 4-15) we identify abrupt
reversals of direction which would be caused by driving in to an area and leaving
by the reverse route and add these to the intervals. As each point is a location
in 2D space, each successive pair of points in trace T has a direction between
them. Therefore we define a α-heading change as follows

Definition 3. α-heading change: Difference between heading of travel from ti−1
to ti and heading of travel from ti to ti+1 is 180◦ ±α◦ .

    We then split the trace into subtraces, separated by the intervals for abrupt
reversals, where each subtrace is made up of contiguous points of the trace. The
second method (lines 16-24) for inferring waypoints involves searching forward
through a subtrace until we find a point B which is the first point not on a
ε-shortest χ-path, then search backwards from B until we find point A which is
the first point not on a ε-shortest χ-path from B. Interval [A,B] is added to the
list of intervals, and the forward search resumes from point B. When we have
reached the end of the subtraces we return the list of intervals found.
    Based on the results of [1] a search that uses the correct cost function should
produce approximately the correct number of intervals (or fewer if one waypoint
is on the shortest path between its predecessor and successor waypoints). A
simulation using an incorrect cost function χ should introduce extra intervals
to account for deviations from its shortest χ-path, although it is possible that
it could miss a waypoint between the intervals for the same reason as above.
Therefore we suspect that the cost function that returns the fewest intervals is
a good estimate for the true driving style. If more than one function produces
the fewest intervals we need a tiebreaker. Given a list of intervals, we assume
each interval will need to be examined for places of interest, therefore we could
choose wide intervals where we are more likely to find waypoints, which we call
maximum spread, or we could choose narrow intervals where the search time
would be smaller at the cost of missing a waypoint, which we call minimum
spread, where spread is defined as total width of the intervals in seconds. We
define the spread in terms of seconds since the GPS points are timestamped and
we cannot compute distance travelled between GPS points. The results from
both of these options will be compared in section 5. If there is still more than
one function remaining then random choice will be used to select one. Other
options such as selecting by the maximum spread or minimum spread initially
and then using the number of estimates as a tiebreaker were implemented but
gave poor results and so are not discussed further.
    The pseudocode for the Waypoint Estimation Assuming No Driving Style
Information is shown in Algorithm 2. The inputs are the trace T , ε, α, and a
set of cost functions representing different driving styles. Initialize two lists, AK
which will hold the list of each the intervals returned from each of the waypoint
estimation algorithms we are interested in and K to hold the intervals which will
be returned(lines 1-2). A counter C to keep track of the minimum number of
intervals returned by any cost function is initialized to ∞ (line 3). For each of the
cost functions we return a list L of intervals (line 5). If the number of intervals
is equal to C then L is added to AK (lines 6-8). If the number of intervals is
less than C then AK is cleared, L added to the now empty list and C set to the
number of intervals in L (lines 9-13). If AK contains only one list of intervals
then set K equal to this list of intervals (line 16). If AK contains more than
one list, we select the one with minimum or maximum spread. If there are still
ties, we randomly select a list with minimum or maximum spread (lines 17-24).
Return the list K of the intervals found (line 25).




 Algorithm 2: Waypoint Estimation Assuming No Driving Style Infor-
 mation
   input : Allowable Tolerance ε
   input : Heading Tolerance α
   input : Set of Cost functions X
   input : Trace T
   output: List K of intervals
 1 List AK
 2 List K
 3 Integer C ←− ∞
 4 for χ ∈ X do
 5     List L = Waypoint Estimation with Cost Function (T , ε, α, χ)
 6     if L.size = C then
 7          add L to AK
 8     end
 9     if L.size