Safe Temporal Planning for Urban Driving Bence Cserna and William J. Doyle and Tianyi Gu and Wheeler Ruml Department of Computer Science University of New Hampshire Durham, NH 03824 USA bence, doyle, gu, ruml at cs.unh.edu Abstract vehicles and many more pedestrians. Not only does the plan- A self-driving car must always have a plan for safely coming ner need to consider the spatial location of each part of this to a halt. Often, finding these safe plans is treated as an after- environment but their temporal evolution needs to be built thought. In this paper, we demonstrate that techniques explic- into the planning techniques that are employed. These two itly designed for safety can yield higher quality plans and aspects need to be intelligently combined to produce high lower latency than conventional planners in an urban driv- quality and computationally feasible algorithms to address ing setting. We adopt ideas from a previously-proposed safe the urban driving problem. online real-time heuristic search method to the spatiotempo- Hierarchical methods are state-of-the-art for addressing ral state lattices used when planning for autonomous driv- the difficulty of autonomous urban driving (Paden et al., ing. We experimentally compare our proof-of-concept imple- 2016). At the highest level, a vehicle must be able to select mentation to conventional methods and find significantly im- proved performance while still maintaining passenger com- a route from a road network based upon its current position fort and safety. in that network and the desired destination of the passenger. This network can be represented as a graph with millions of Introduction weighted edges. Traditional offline algorithms such as A* A central goal of artificial intelligence is the construction are computationally infeasible, as planners are expected to of autonomous systems. It is becoming more common that run at a rate of ten times per second. Once a route has been these systems interact closely with humans. Perhaps the identified, the vehicle must select a sequence of behaviors most intimate way that humans can interact with an au- to use along that route. These behaviors correspond to situa- tonomous system is to climb inside it and put their lives in tions such as ‘highway cruising’ or ‘stopping at a stop sign’ the hands of its control system. Interestingly, it is exactly and identify the pertinent rules of the road with respect to such systems, in the form of self-driving automotive mo- other aspects of the environment, such as pedestrians and bility systems, that are predicted to become widespread in other drivers. Next, a motion planning algorithm will decide the coming decades. It is crucial that the AI systems that de- the vehicle trajectory used to achieve the next behavior. This cide on the actions of autonomous vehicles be designed with motion plan then becomes a reference trajectory for a low- safety as a fundamental aspect. level controller to achieve using throttle and steering inputs Self-driving vehicle technology has many potential ben- while correcting for errors and inaccuracies from the vehicle efits for individuals, such as reduced collisions, improved model. These four components of route planning, behavior mobility, and reclaiming time spent driving. They also have selection, motion planning, and control comprise the hierar- the potential for broad benefits to society and the environ- chical approach to autonomous driving. ment, such as reduced car ownership, space devoted to park- Our work addresses urban driving at the level of motion ing, and the number of vehicles required. However, there are planning: we want to send trajectories to the vehicle con- still significant problems to be surmounted to achieve this vi- troller such that the motions are comfortable for passengers sion. For example, in addition to avoiding actual collisions, while simultaneously ensuring that they are safe. We define for this technology to be successfully adopted, individuals a safe trajectory as one that reaches a safe state. In urban need to feel subjectively safe and comfortable while using driving, a safe state is one in which the vehicle is stopped an autonomous vehicle. In this paper, we address the prob- (Shalev-Shwartz, Shammah, and Shashua, 2017).1 We as- lem of planning trajectories for autonomous vehicles, taking sume that the vehicle replans frequently, thus only the first into account both objective safety, given the predicted tra- part of a safe trajectory will be executed in most cases before jectories of nearby vehicles and pedestrians, and subjective a new trajectory is computed to replace the current plan. passenger comfort. The planner also needs to take into account the comfort of Background Trajectory planning for urban driving has proven to be a 1 For highway driving, a safe state might be one where the car complex problem because of the inherently dynamic envi- is pulled over at the side of the road. We do not address this in this ronment. A typical driver can encounter hundreds of other paper. its passengers. Specific ranges of acceleration, when applied Algorithm 1: SafeTLP to the autonomous vehicle, create an uncomfortable riding Input: sroot experience. Allowing a high amount of acceleration to be 1 perform B EST-F IRST S EARCH on lattice to find a chosen by the planner and executed on the vehicle leads to potentially unsafe trajectory T from sroot to a partial passenger discomfort due to the physical stress the human goal body undergoes when the vehicle quickly accelerates. 2 scurrent ← T.last McNaughton et al. (2011) devise acceleration profiles for 3 while scurrent exists do a spatiotemporal lattice. These profiles can be used to con- 4 perform B EST-F IRST S EARCH on dsafe struct a constraint on the planner to maintain the acceleration from the node scurrent to sgoal of the autonomous vehicle within comfortable ranges. Our 5 if sgoal is found then work addresses these two crucial aspects of the urban driv- 6 cache the safe partial path from scurrent to sgoal ing problem. We construct a planner that can remain safe 7 return hsroot . . . scurrent ihscurrent . . . sgoal i while limiting the control of the vehicle for the passengers to be comfortable. 8 else In this paper, we study a general online method for guar- 9 scurrent ← scurrent .predecessor in P anteeing that the planner will find a series of safe actions for 10 B EST-F IRST S EARCH to find a safe trajectory T from the agent to execute. We dynamically allow the agent to plan sroot to sgoal for its current goals and devise a way to balance the safety 11 return T of the trajectories the agent is executing while being as close to the edge of non-safe action execution as possible. We de- fine a general problem setting for the use of this technique and study the use of it in the domain of simple urban driv- the RRT could be a highly sub-optimal plan (Karaman and ing. We empirically test this method on a simulated vehicle Frazzoli, 2011). with inertia and find that the new techniques dramatically SafeRTS (Cserna et al., 2017) is a real-time search algo- outperform the conventional ones. rithm that can guarantee safety. It explicitly tries to prove nodes are safe and finds a plan to a safe node. SafeRTS Previous Work searches for a partial real-time plan that has a frontier node A state lattice (Pivtoraiko, Knepper, and Kelly, 2009) is a with the most promising f value and guarantees that plan discretization of a continuous state space. A Spatiotemporal could be lead to a safe node in the lattice. The authors intro- state lattice (Ziegler and Stiller, 2009) is the result of com- duce a safety heuristic, dsafe (n), that estimates the distance bining a traditional state lattice with time and velocity di- through the state space from a given node n forward to the mensions. The urban driving domain requires the planner to nearest safe state. As an online search algorithm, SafeRTS be able to consider both time and space while planning. The distributes the expansion allowance between exploring the state lattice gives us a method for searching through a static best f state and attempting to prove its safety. The safety environment; however, adding in time and velocity to the proof performs a best-first search on dsafe . state space lattice can lead to an exponential blowup of the size of the search space. Even assuming a modest number Safe Temporal Lattice Planning of possible accelerations applied to a vehicle, the state space We now turn to applying Cserna et al. (2017)’s notion of safe can contain nearly 12 million trajectory edges that would planning to spatiotemporal state lattice planning. The plan- need to be evaluated during each planning iteration (Piv- ner’s goal state is a location a predefined distance along the toraiko, Knepper, and Kelly, 2009). On the other hand, it is current route with a zero velocity. Our technique, Safe Tem- a difficult task to compose search heuristics for the complex poral Lattice Planning (SafeTLP), first performs a best-first- cost function of urban driving. For example, McNaughton search from the root state sroot until a partial goal state is et al. (2011) applied exhaustive search on a spatiotemporal expanded (line 1 of Alg. 1). A partial goal state is a state that state lattice. matches the location of the goal state but may not have the CL-RRT (Kuwata et al., 2009) is a real-time motion plan- correct speed. The priority function of the best-first search ning algorithm that can guarantee safety. To deal with dy- prioritizes states with lower distance-to-goal, earlier goal namic obstacles, in each planning iteration, the algorithm achievement time, and higher speed. cleans the motion tree by removing the invalid tree nodes This naı̈ve solution is inherently unsafe as it does not con- and saving unconnected subtrees into a stand-by forest. A sider upcoming obstacles and dead-ends states beyond the random state is sampled by biasing the nearest tree in the explored spaces or partial goal. To ensure safety SafeTLP forest or the goal. If the new sampled state is in one of the proves that a prefix of the naı̈ve trajectory is safe by con- subtrees, then it will try to connect the nearest state on the structing a safe trajectory starting with this prefix. The al- current motion tree to the sampled state by solving a bound- gorithm first attempts to prove that the last state of the tra- ary value problem. Otherwise, it will perform the conven- jectory is safe (line 4). If the safety proof is not success- tional RRT extend routine. They guarantee safety by ensur- ful, SafeTLP falls back to the preceding state on the trajec- ing that the vehicle is stopped and safe at the end of the tory(line 9). The last few states in the naı̈ve trajectory may trajectory. We will refer to these approaches as plan-to-stop. be skipped if a maximum allowed deceleration would not However, the random feasible solution that is constructed by allow the agent to come to a full stop from these states. Partial Goals S root naive SafeTLP plan-to-stop Fast State velocity Safe Goal distance Partial Goals S root SA SB (a) (b) a b c Figure 2: (a): An cartoon sketch of algorithm behavior for Fast State spatiotemporal planning. (b): Distance-velocity projection D of the plan-to-stop search tree. The horizontal axis repre- Safe Goal sents space and the vertical axis velocity. Note that the tem- poral aspect of the states is not captured by the projection. Figure 1: SafeTLP search graph example. Theorem 2 In the worst case, SafeTLP expands no more As we attempt to prove safety, we store every node en- than twice as many states as the naı̈ve plan-to-stop method countered in a special safety closed list. If reencountered expands in the worst case. during a subsequent attempt, such nodes are not expanded as we have already attempted to prove safety from them. Proof: Recall that, due to the safety closed list, any node Similar to SafeRTS’s dsafe heuristic , SafeTLP prioritizes will be expanded at most once by a safety proof attempt. If states that are closer to the goal state. The safety proof ex- all the safety proofs fail, then SafeTLP will perform best- pands the search tree under the state scurrent (Cserna et al., first search initiated from the agent’s current state. This ex- 2017) with expansion ordered based on the distance to goal haustive search will expand every state at most once, as the and speed (the lower, the better). If the safety open list be- standard closed list of best-first search will prevent duplicate comes empty, the proof is unsuccessful and scurrent is la- expansions. Thus any state will be expanded at most twice beled as unsafe. Upon the expansion of the goal node, the (once by a safety proof attempt and once by the exhaustive safety search terminates and returns the trajectory leading to search). The naı̈ve method expands each node at most once. the goal from scurrent . scurrent is now proven safe and its 2 naı̈ve partial plan is augmented by the discovered safe tra- In motivating our approach, we present three different ap- jectory to the goal to form a complete safe plan from the proaches in Figure 2a. The abstract diagram shows the dis- agent’s current state to a goal state (line 7). tance towards the goal along the horizontal axis and the ve- We use Fig. 1 to demonstrate the behavior of SafeTLP. locity of the vehicle along the vertical axis. Ideally, we want SafeTLP first expands a search tree (blue on Fig. 1) from an algorithm which allows the vehicle to travel as fast as sroot that optimizes for velocity until it reaches a state in possible towards the goal while guaranteeing that there is an the partial goal set. In our example the sequence of a, b, and alternate course of actions to travel to a safe state in the case c trajectory segments represents a trajectory that leads to a of an emergency or unexpected situation. partial goal state spartial goal . Then SafeTLP attempts to find The basic naı̈ve approach, shown in orange, performs a a safe trajectory to sgoal from each intermediate state identi- best-first search towards the goal. By optimizing distance- fied by the ha, b, ci trajectory starting from the state closest to-goal, this search tends to accelerate the vehicle until it to the partial goal. First, it attempts to prove that spartial goal reaches the maximum velocity and then maintain that speed. is safe, then it falls back to Sb , Sa , and lastly to sroot . These This technique is clearly not safe as it does not guarantee a proofs are independent search trees marked with orange on way for the vehicle to reach a safe state in case of an emer- Fig. 1. The proof searches are optimized on dsafe . In our gency. However, it does fulfill the task of accelerating the example the safety proof from spartial goal and Sb fails and vehicle to a very high speed. On the other hand, the cur- only succeeds from Sa . SafeTLP constructs a safe trajectory rent state-of-the-art is a more exhaustive approach, shown to sgoal by appending the prefix hai trajectory that leads to in green and labeled plan-to-stop. This performs an exhaus- Sa with the trajectory segment sequence D. tive breadth-first search expanding all the possible trajecto- Theorem 1 SafeTLP is guaranteed to find a safe plan if one ries for the vehicle to undertake with the goal of completely exists in the state space. stopping the vehicle at the end as shown in Fig. 2b. This ap- proach is able to find a safe way of reaching the goal at the Proof: If all safety proofs initiated from the states on the price of many node expansions and a slow velocity for the naı̈ve path (line 5 of Alg. 1) fail, SafeTLP will perform a vehicle. To be able to plan to slow down at the current plan- best-first search from the agent’s current state (line 10). Be- ning horizon, it needs to plan to have a lower average veloc- cause best-first search does not prune states from the search ity than the naı̈ve approach would find. In essence, we have tree and the state space is finite, this search will eventually devised a hybrid approach that performs a best-first search find paths from sroot to every other reachable state. Best- to find a trajectory leading to a very high speed, and then first search is exhaustive and complete in finite state spaces, performs a search on safety to find a series of safe actions thus eventually it will identify a path to a safe state if one leading us towards the goal safely represented by the blue exists. 2 line. In this fashion, we guarantee that even though we are 1 2 3 4 environment more quickly and thus be safer. Figure 3c present violin plots showing the distribution comfortable acceleration 0.8 1.0 1.2 1.5 of the average velocity for each test instances. As we can comfortable deceleration 0.8 1.0 1.2 1.5 see, SafeTLP produces higher performance trajectories than aggressive deceleration 1.6 1.8 2.0 2.2 plan-to-stop. Table 1: Acceleration limit sets (m/s2 ). Inferred from Pow- Conclusion ell and Palacı́n (2015), Martin and Litwhiler (2008), and Hoberock (1977) We have introduced a new and more effective method for safe action selection in spatiotemporal planning. Previous methods are extremely conservative in how they guarantee 5 105 safety by expanding drastically more nodes than is required. Average Velocity [m/s] 8 Expanded Nodes 4 Our planner can quickly generate comfortable and safe plans Planning Time [s] 7 104 3 6 online. We demonstrate that planning time is drastically re- 2 5 duced while simultaneously being able to achieve a higher 103 1 4 average velocity for the vehicle. In combining techniques 3 from real-time heuristic search and spatiotemporal planning, 0 Plan to Stop SafeTPL Plan to Stop SafeTPL Plan to Stop SafeTPL we introduced a method for selecting safe and fast plans (a) (b) (c) quickly. We hope this work encourages further development in the applicability of safe real-time and online search. Figure 3: (a): Distribution of the number of expanded nodes during search. (b):Planning time distribution. (c): Average References vehicle velocity distribution. Cserna, B.; Doyle, W.; Ramsdell, J.; and Ruml, W. 2017. Avoiding dead ends in real-time heuristic search. In Pro- accelerating the vehicle very quickly towards the goal, there ceedings of AAAI-18, 1306–1313. will be a series of actions the vehicle can take to reach a safe Hoberock, L. L. 1977. A survey of longitudinal acceleration state in the event of an emergency or unexpected action. This comfort studies in ground transportation vehicles. J. Dyn. purpose-designed hybrid algorithm combines the low num- Sys., Meas., and Control 99(2):76–84. ber of expansions of the best-first search’s high-velocity ac- Karaman, S., and Frazzoli, E. 2011. Sampling-based algo- tions with the guarantee of safety of the exhaustive search. rithms for optimal motion planning. IJRR 30(7):846–894. Ultimately, this approach yields a high-quality solution that is both fast and guaranteed to be safe. Kuwata, Y.; Teo, J.; Fiore, G.; Karaman, S.; Frazzoli, E.; and How, J. P. 2009. Real-time motion planning with appli- Empirical Evaluation cations to autonomous urban driving. IEEE Transactions We demonstrate planning in simplified spatiotemporal lat- on Control Systems Technology 17(5):1105–1118. tice. The goal is 100 meters away from the start position of Martin, D., and Litwhiler, D. 2008. An investigation of the vehicle. The road is discretized by half meters, so 200 acceleration and jerk profiles of public transportation ve- states in total on the distance dimension of the lattice. From hicles. In ASEE, Conference Proceedings. each state, the vehicle can apply three actions: accelerate, McNaughton, M.; Urmson, C.; Dolan, J. M.; and Lee, J.- maintain the velocity, and decelerate. The following studies W. 2011. Motion planning for autonomous driving with Powell and Palacı́n (2015), Martin and Litwhiler (2008), and a conformal spatiotemporal lattice. In ICRA-2011, 4889– Hoberock (1977) show that 0.8 to 1.5 m/s2 are comfortable 4895. acceleration values for urban driving, and people can toler- ate up to 2.2 m/s2 without injury. Following this, we design Paden, B.; Čáp, M.; Yong, S. Z.; Yershov, D.; and Frazzoli, four sets of comfortable acceleration and deceleration pairs, E. 2016. A survey of motion planning and control tech- along with the aggressive deceleration that would be allowed niques for self-driving urban vehicles. IEEE Transactions to apply in our approach. Table 1 shows the acceleration and on Intelligent Vehicles 1(1):33–55. deceleration sets we used in our experiment. We fixed the Pivtoraiko, M.; Knepper, R. A.; and Kelly, A. 2009. Differ- maximum velocity limit at 15 m/s. We vary the starting ve- entially constrained mobile robot motion planning in state locity from 0 to 5 m/s with 0.2 increment. lattices. Journal of Field Robotics 26(3):308–333. Figure 3a shows the number of expanded node for the al- Powell, J., and Palacı́n, R. 2015. Passenger stability within gorithms to find a safe trajectory. As we can see, SafeTLP moving railway vehicles: limits on maximum longitudinal expands many (about three magnitudes) fewer nodes than acceleration. Urban Rail Transit 1(2):95–103. plan-to-stop. This is because plan-to-stop performs an ex- haustive search of the state space while SafeTLP explicitly Shalev-Shwartz, S.; Shammah, S.; and Shashua, A. 2017. reasons about the optimal safe solution. Figure 3b shows the On a formal model of safe and scalable self-driving cars. planning time. As expected, SafeTLP consumes much less arXiv preprint arXiv:1708.06374. time than plan-to-stop to find a safe plan. In a real appli- Ziegler, J., and Stiller, C. 2009. Spatiotemporal state lat- cation setting, fast planning enables the planner to run in a tices for fast trajectory planning in dynamic on-road driv- higher frequency, thereby enabling the vehicle to react to the ing scenarios. In IROS-2009, 1879–1884.