Safe Temporal Planning for Urban Driving

                   Bence Cserna and William J. Doyle and Tianyi Gu and Wheeler Ruml
                                                   Department of Computer Science
                                                     University of New Hampshire
                                                       Durham, NH 03824 USA
                                               bence, doyle, gu, ruml at cs.unh.edu


                             Abstract                                  vehicles and many more pedestrians. Not only does the plan-
  A self-driving car must always have a plan for safely coming         ner need to consider the spatial location of each part of this
  to a halt. Often, finding these safe plans is treated as an after-   environment but their temporal evolution needs to be built
  thought. In this paper, we demonstrate that techniques explic-       into the planning techniques that are employed. These two
  itly designed for safety can yield higher quality plans and          aspects need to be intelligently combined to produce high
  lower latency than conventional planners in an urban driv-           quality and computationally feasible algorithms to address
  ing setting. We adopt ideas from a previously-proposed safe          the urban driving problem.
  online real-time heuristic search method to the spatiotempo-            Hierarchical methods are state-of-the-art for addressing
  ral state lattices used when planning for autonomous driv-           the difficulty of autonomous urban driving (Paden et al.,
  ing. We experimentally compare our proof-of-concept imple-
                                                                       2016). At the highest level, a vehicle must be able to select
  mentation to conventional methods and find significantly im-
  proved performance while still maintaining passenger com-            a route from a road network based upon its current position
  fort and safety.                                                     in that network and the desired destination of the passenger.
                                                                       This network can be represented as a graph with millions of
                         Introduction                                  weighted edges. Traditional offline algorithms such as A*
A central goal of artificial intelligence is the construction          are computationally infeasible, as planners are expected to
of autonomous systems. It is becoming more common that                 run at a rate of ten times per second. Once a route has been
these systems interact closely with humans. Perhaps the                identified, the vehicle must select a sequence of behaviors
most intimate way that humans can interact with an au-                 to use along that route. These behaviors correspond to situa-
tonomous system is to climb inside it and put their lives in           tions such as ‘highway cruising’ or ‘stopping at a stop sign’
the hands of its control system. Interestingly, it is exactly          and identify the pertinent rules of the road with respect to
such systems, in the form of self-driving automotive mo-               other aspects of the environment, such as pedestrians and
bility systems, that are predicted to become widespread in             other drivers. Next, a motion planning algorithm will decide
the coming decades. It is crucial that the AI systems that de-         the vehicle trajectory used to achieve the next behavior. This
cide on the actions of autonomous vehicles be designed with            motion plan then becomes a reference trajectory for a low-
safety as a fundamental aspect.                                        level controller to achieve using throttle and steering inputs
   Self-driving vehicle technology has many potential ben-             while correcting for errors and inaccuracies from the vehicle
efits for individuals, such as reduced collisions, improved            model. These four components of route planning, behavior
mobility, and reclaiming time spent driving. They also have            selection, motion planning, and control comprise the hierar-
the potential for broad benefits to society and the environ-           chical approach to autonomous driving.
ment, such as reduced car ownership, space devoted to park-               Our work addresses urban driving at the level of motion
ing, and the number of vehicles required. However, there are           planning: we want to send trajectories to the vehicle con-
still significant problems to be surmounted to achieve this vi-        troller such that the motions are comfortable for passengers
sion. For example, in addition to avoiding actual collisions,          while simultaneously ensuring that they are safe. We define
for this technology to be successfully adopted, individuals            a safe trajectory as one that reaches a safe state. In urban
need to feel subjectively safe and comfortable while using             driving, a safe state is one in which the vehicle is stopped
an autonomous vehicle. In this paper, we address the prob-             (Shalev-Shwartz, Shammah, and Shashua, 2017).1 We as-
lem of planning trajectories for autonomous vehicles, taking           sume that the vehicle replans frequently, thus only the first
into account both objective safety, given the predicted tra-           part of a safe trajectory will be executed in most cases before
jectories of nearby vehicles and pedestrians, and subjective           a new trajectory is computed to replace the current plan.
passenger comfort.
                                                                          The planner also needs to take into account the comfort of
Background
Trajectory planning for urban driving has proven to be a                  1
                                                                            For highway driving, a safe state might be one where the car
complex problem because of the inherently dynamic envi-                is pulled over at the side of the road. We do not address this in this
ronment. A typical driver can encounter hundreds of other              paper.
its passengers. Specific ranges of acceleration, when applied         Algorithm 1: SafeTLP
to the autonomous vehicle, create an uncomfortable riding
                                                                       Input: sroot
experience. Allowing a high amount of acceleration to be
                                                                    1 perform B EST-F IRST S EARCH on lattice to find a
chosen by the planner and executed on the vehicle leads to
                                                                        potentially unsafe trajectory T from sroot to a partial
passenger discomfort due to the physical stress the human
                                                                        goal
body undergoes when the vehicle quickly accelerates.
                                                                    2 scurrent ← T.last
   McNaughton et al. (2011) devise acceleration profiles for
                                                                    3 while scurrent exists do
a spatiotemporal lattice. These profiles can be used to con-
                                                                    4     perform B EST-F IRST S EARCH on dsafe
struct a constraint on the planner to maintain the acceleration
                                                                               from the node scurrent to sgoal
of the autonomous vehicle within comfortable ranges. Our
                                                                    5     if sgoal is found then
work addresses these two crucial aspects of the urban driv-
                                                                    6          cache the safe partial path from scurrent to sgoal
ing problem. We construct a planner that can remain safe
                                                                    7          return hsroot . . . scurrent ihscurrent . . . sgoal i
while limiting the control of the vehicle for the passengers
to be comfortable.                                                  8     else
   In this paper, we study a general online method for guar-        9          scurrent ← scurrent .predecessor in P
anteeing that the planner will find a series of safe actions for
                                                                   10 B EST-F IRST S EARCH to find a safe trajectory T from
the agent to execute. We dynamically allow the agent to plan
                                                                       sroot to sgoal
for its current goals and devise a way to balance the safety
                                                                   11 return T
of the trajectories the agent is executing while being as close
to the edge of non-safe action execution as possible. We de-
fine a general problem setting for the use of this technique
and study the use of it in the domain of simple urban driv-        the RRT could be a highly sub-optimal plan (Karaman and
ing. We empirically test this method on a simulated vehicle        Frazzoli, 2011).
with inertia and find that the new techniques dramatically            SafeRTS (Cserna et al., 2017) is a real-time search algo-
outperform the conventional ones.                                  rithm that can guarantee safety. It explicitly tries to prove
                                                                   nodes are safe and finds a plan to a safe node. SafeRTS
                     Previous Work                                 searches for a partial real-time plan that has a frontier node
A state lattice (Pivtoraiko, Knepper, and Kelly, 2009) is a        with the most promising f value and guarantees that plan
discretization of a continuous state space. A Spatiotemporal       could be lead to a safe node in the lattice. The authors intro-
state lattice (Ziegler and Stiller, 2009) is the result of com-    duce a safety heuristic, dsafe (n), that estimates the distance
bining a traditional state lattice with time and velocity di-      through the state space from a given node n forward to the
mensions. The urban driving domain requires the planner to         nearest safe state. As an online search algorithm, SafeRTS
be able to consider both time and space while planning. The        distributes the expansion allowance between exploring the
state lattice gives us a method for searching through a static     best f state and attempting to prove its safety. The safety
environment; however, adding in time and velocity to the           proof performs a best-first search on dsafe .
state space lattice can lead to an exponential blowup of the
size of the search space. Even assuming a modest number                       Safe Temporal Lattice Planning
of possible accelerations applied to a vehicle, the state space    We now turn to applying Cserna et al. (2017)’s notion of safe
can contain nearly 12 million trajectory edges that would          planning to spatiotemporal state lattice planning. The plan-
need to be evaluated during each planning iteration (Piv-          ner’s goal state is a location a predefined distance along the
toraiko, Knepper, and Kelly, 2009). On the other hand, it is       current route with a zero velocity. Our technique, Safe Tem-
a difficult task to compose search heuristics for the complex      poral Lattice Planning (SafeTLP), first performs a best-first-
cost function of urban driving. For example, McNaughton            search from the root state sroot until a partial goal state is
et al. (2011) applied exhaustive search on a spatiotemporal        expanded (line 1 of Alg. 1). A partial goal state is a state that
state lattice.                                                     matches the location of the goal state but may not have the
   CL-RRT (Kuwata et al., 2009) is a real-time motion plan-        correct speed. The priority function of the best-first search
ning algorithm that can guarantee safety. To deal with dy-         prioritizes states with lower distance-to-goal, earlier goal
namic obstacles, in each planning iteration, the algorithm         achievement time, and higher speed.
cleans the motion tree by removing the invalid tree nodes             This naı̈ve solution is inherently unsafe as it does not con-
and saving unconnected subtrees into a stand-by forest. A          sider upcoming obstacles and dead-ends states beyond the
random state is sampled by biasing the nearest tree in the         explored spaces or partial goal. To ensure safety SafeTLP
forest or the goal. If the new sampled state is in one of the      proves that a prefix of the naı̈ve trajectory is safe by con-
subtrees, then it will try to connect the nearest state on the     structing a safe trajectory starting with this prefix. The al-
current motion tree to the sampled state by solving a bound-       gorithm first attempts to prove that the last state of the tra-
ary value problem. Otherwise, it will perform the conven-          jectory is safe (line 4). If the safety proof is not success-
tional RRT extend routine. They guarantee safety by ensur-         ful, SafeTLP falls back to the preceding state on the trajec-
ing that the vehicle is stopped and safe at the end of the         tory(line 9). The last few states in the naı̈ve trajectory may
trajectory. We will refer to these approaches as plan-to-stop.     be skipped if a maximum allowed deceleration would not
However, the random feasible solution that is constructed by       allow the agent to come to a full stop from these states.
                                        Partial Goals
      S root                                                                      naive
                                                                                  SafeTLP
                                                                                  plan-to-stop
                                                        Fast State


                                                                       velocity
                                                        Safe Goal
                                                                                                 distance
                                        Partial Goals
      S root       SA         SB                                                                  (a)                    (b)
               a        b           c                                  Figure 2: (a): An cartoon sketch of algorithm behavior for
                                                    Fast State
                                                                       spatiotemporal planning. (b): Distance-velocity projection
                               D                                       of the plan-to-stop search tree. The horizontal axis repre-
                                                    Safe Goal
                                                                       sents space and the vertical axis velocity. Note that the tem-
                                                                       poral aspect of the states is not captured by the projection.
           Figure 1: SafeTLP search graph example.
                                                                       Theorem 2 In the worst case, SafeTLP expands no more
   As we attempt to prove safety, we store every node en-              than twice as many states as the naı̈ve plan-to-stop method
countered in a special safety closed list. If reencountered            expands in the worst case.
during a subsequent attempt, such nodes are not expanded
as we have already attempted to prove safety from them.                Proof: Recall that, due to the safety closed list, any node
   Similar to SafeRTS’s dsafe heuristic , SafeTLP prioritizes          will be expanded at most once by a safety proof attempt. If
states that are closer to the goal state. The safety proof ex-         all the safety proofs fail, then SafeTLP will perform best-
pands the search tree under the state scurrent (Cserna et al.,         first search initiated from the agent’s current state. This ex-
2017) with expansion ordered based on the distance to goal             haustive search will expand every state at most once, as the
and speed (the lower, the better). If the safety open list be-         standard closed list of best-first search will prevent duplicate
comes empty, the proof is unsuccessful and scurrent is la-             expansions. Thus any state will be expanded at most twice
beled as unsafe. Upon the expansion of the goal node, the              (once by a safety proof attempt and once by the exhaustive
safety search terminates and returns the trajectory leading to         search). The naı̈ve method expands each node at most once.
the goal from scurrent . scurrent is now proven safe and its           2
naı̈ve partial plan is augmented by the discovered safe tra-              In motivating our approach, we present three different ap-
jectory to the goal to form a complete safe plan from the              proaches in Figure 2a. The abstract diagram shows the dis-
agent’s current state to a goal state (line 7).                        tance towards the goal along the horizontal axis and the ve-
   We use Fig. 1 to demonstrate the behavior of SafeTLP.               locity of the vehicle along the vertical axis. Ideally, we want
SafeTLP first expands a search tree (blue on Fig. 1) from              an algorithm which allows the vehicle to travel as fast as
sroot that optimizes for velocity until it reaches a state in          possible towards the goal while guaranteeing that there is an
the partial goal set. In our example the sequence of a, b, and         alternate course of actions to travel to a safe state in the case
c trajectory segments represents a trajectory that leads to a          of an emergency or unexpected situation.
partial goal state spartial goal . Then SafeTLP attempts to find          The basic naı̈ve approach, shown in orange, performs a
a safe trajectory to sgoal from each intermediate state identi-        best-first search towards the goal. By optimizing distance-
fied by the ha, b, ci trajectory starting from the state closest       to-goal, this search tends to accelerate the vehicle until it
to the partial goal. First, it attempts to prove that spartial goal    reaches the maximum velocity and then maintain that speed.
is safe, then it falls back to Sb , Sa , and lastly to sroot . These   This technique is clearly not safe as it does not guarantee a
proofs are independent search trees marked with orange on              way for the vehicle to reach a safe state in case of an emer-
Fig. 1. The proof searches are optimized on dsafe . In our             gency. However, it does fulfill the task of accelerating the
example the safety proof from spartial goal and Sb fails and           vehicle to a very high speed. On the other hand, the cur-
only succeeds from Sa . SafeTLP constructs a safe trajectory           rent state-of-the-art is a more exhaustive approach, shown
to sgoal by appending the prefix hai trajectory that leads to          in green and labeled plan-to-stop. This performs an exhaus-
Sa with the trajectory segment sequence D.                             tive breadth-first search expanding all the possible trajecto-
Theorem 1 SafeTLP is guaranteed to find a safe plan if one             ries for the vehicle to undertake with the goal of completely
exists in the state space.                                             stopping the vehicle at the end as shown in Fig. 2b. This ap-
                                                                       proach is able to find a safe way of reaching the goal at the
Proof: If all safety proofs initiated from the states on the           price of many node expansions and a slow velocity for the
naı̈ve path (line 5 of Alg. 1) fail, SafeTLP will perform a            vehicle. To be able to plan to slow down at the current plan-
best-first search from the agent’s current state (line 10). Be-        ning horizon, it needs to plan to have a lower average veloc-
cause best-first search does not prune states from the search          ity than the naı̈ve approach would find. In essence, we have
tree and the state space is finite, this search will eventually        devised a hybrid approach that performs a best-first search
find paths from sroot to every other reachable state. Best-            to find a trajectory leading to a very high speed, and then
first search is exhaustive and complete in finite state spaces,        performs a search on safety to find a series of safe actions
thus eventually it will identify a path to a safe state if one         leading us towards the goal safely represented by the blue
exists.                                                      2         line. In this fashion, we guarantee that even though we are
                                                                                                            1         2            3     4        environment more quickly and thus be safer.
                                                                                                                                                     Figure 3c present violin plots showing the distribution
                     comfortable acceleration                                                   0.8                 1.0           1.2   1.5       of the average velocity for each test instances. As we can
                     comfortable deceleration                                                   0.8                 1.0           1.2   1.5       see, SafeTLP produces higher performance trajectories than
                      aggressive deceleration                                                   1.6                 1.8           2.0   2.2       plan-to-stop.
  Table 1: Acceleration limit sets (m/s2 ). Inferred from Pow-                                                                                                           Conclusion
  ell and Palacı́n (2015), Martin and Litwhiler (2008), and
  Hoberock (1977)                                                                                                                                 We have introduced a new and more effective method for
                                                                                                                                                  safe action selection in spatiotemporal planning. Previous
                                                                                                                                                  methods are extremely conservative in how they guarantee
                                                               5
            105                                                                                                                                   safety by expanding drastically more nodes than is required.
                                                                                          Average Velocity [m/s]
                                                                                                               8
Expanded Nodes


                                                               4
                                                                                                                                                  Our planner can quickly generate comfortable and safe plans
                                           Planning Time [s]


                                                                                                               7

            104                                                3                                               6                                  online. We demonstrate that planning time is drastically re-
                                                               2                                               5                                  duced while simultaneously being able to achieve a higher
            103                                                1
                                                                                                               4                                  average velocity for the vehicle. In combining techniques
                                                                                                               3                                  from real-time heuristic search and spatiotemporal planning,
                                                               0
                  Plan to Stop   SafeTPL                           Plan to Stop SafeTPL                            Plan to Stop         SafeTPL   we introduced a method for selecting safe and fast plans
                        (a)                                              (b)                                                      (c)
                                                                                                                                                  quickly. We hope this work encourages further development
                                                                                                                                                  in the applicability of safe real-time and online search.
  Figure 3: (a): Distribution of the number of expanded nodes
  during search. (b):Planning time distribution. (c): Average                                                                                                            References
  vehicle velocity distribution.                                                                                                                  Cserna, B.; Doyle, W.; Ramsdell, J.; and Ruml, W. 2017.
                                                                                                                                                    Avoiding dead ends in real-time heuristic search. In Pro-
  accelerating the vehicle very quickly towards the goal, there                                                                                     ceedings of AAAI-18, 1306–1313.
  will be a series of actions the vehicle can take to reach a safe                                                                                Hoberock, L. L. 1977. A survey of longitudinal acceleration
  state in the event of an emergency or unexpected action. This                                                                                     comfort studies in ground transportation vehicles. J. Dyn.
  purpose-designed hybrid algorithm combines the low num-                                                                                           Sys., Meas., and Control 99(2):76–84.
  ber of expansions of the best-first search’s high-velocity ac-                                                                                  Karaman, S., and Frazzoli, E. 2011. Sampling-based algo-
  tions with the guarantee of safety of the exhaustive search.                                                                                      rithms for optimal motion planning. IJRR 30(7):846–894.
  Ultimately, this approach yields a high-quality solution that
  is both fast and guaranteed to be safe.                                                                                                         Kuwata, Y.; Teo, J.; Fiore, G.; Karaman, S.; Frazzoli, E.; and
                                                                                                                                                    How, J. P. 2009. Real-time motion planning with appli-
                                           Empirical Evaluation                                                                                     cations to autonomous urban driving. IEEE Transactions
  We demonstrate planning in simplified spatiotemporal lat-                                                                                         on Control Systems Technology 17(5):1105–1118.
  tice. The goal is 100 meters away from the start position of                                                                                    Martin, D., and Litwhiler, D. 2008. An investigation of
  the vehicle. The road is discretized by half meters, so 200                                                                                       acceleration and jerk profiles of public transportation ve-
  states in total on the distance dimension of the lattice. From                                                                                    hicles. In ASEE, Conference Proceedings.
  each state, the vehicle can apply three actions: accelerate,                                                                                    McNaughton, M.; Urmson, C.; Dolan, J. M.; and Lee, J.-
  maintain the velocity, and decelerate. The following studies                                                                                      W. 2011. Motion planning for autonomous driving with
  Powell and Palacı́n (2015), Martin and Litwhiler (2008), and                                                                                      a conformal spatiotemporal lattice. In ICRA-2011, 4889–
  Hoberock (1977) show that 0.8 to 1.5 m/s2 are comfortable                                                                                         4895.
  acceleration values for urban driving, and people can toler-
  ate up to 2.2 m/s2 without injury. Following this, we design                                                                                    Paden, B.; Čáp, M.; Yong, S. Z.; Yershov, D.; and Frazzoli,
  four sets of comfortable acceleration and deceleration pairs,                                                                                     E. 2016. A survey of motion planning and control tech-
  along with the aggressive deceleration that would be allowed                                                                                      niques for self-driving urban vehicles. IEEE Transactions
  to apply in our approach. Table 1 shows the acceleration and                                                                                      on Intelligent Vehicles 1(1):33–55.
  deceleration sets we used in our experiment. We fixed the                                                                                       Pivtoraiko, M.; Knepper, R. A.; and Kelly, A. 2009. Differ-
  maximum velocity limit at 15 m/s. We vary the starting ve-                                                                                        entially constrained mobile robot motion planning in state
  locity from 0 to 5 m/s with 0.2 increment.                                                                                                        lattices. Journal of Field Robotics 26(3):308–333.
     Figure 3a shows the number of expanded node for the al-                                                                                      Powell, J., and Palacı́n, R. 2015. Passenger stability within
  gorithms to find a safe trajectory. As we can see, SafeTLP                                                                                        moving railway vehicles: limits on maximum longitudinal
  expands many (about three magnitudes) fewer nodes than                                                                                            acceleration. Urban Rail Transit 1(2):95–103.
  plan-to-stop. This is because plan-to-stop performs an ex-
  haustive search of the state space while SafeTLP explicitly                                                                                     Shalev-Shwartz, S.; Shammah, S.; and Shashua, A. 2017.
  reasons about the optimal safe solution. Figure 3b shows the                                                                                      On a formal model of safe and scalable self-driving cars.
  planning time. As expected, SafeTLP consumes much less                                                                                            arXiv preprint arXiv:1708.06374.
  time than plan-to-stop to find a safe plan. In a real appli-                                                                                    Ziegler, J., and Stiller, C. 2009. Spatiotemporal state lat-
  cation setting, fast planning enables the planner to run in a                                                                                     tices for fast trajectory planning in dynamic on-road driv-
  higher frequency, thereby enabling the vehicle to react to the                                                                                    ing scenarios. In IROS-2009, 1879–1884.