efficient interaction between the symbolic and geometric layers. In [DGTN09] task and motion planning is in- terleaved by checking individual high-level action feasibility using semantic attachments [Wey80]. A combined search in the logical and geometric spaces is performed in [CAG09] using a state composed of both the sym- bolic and geometric paths. Srivastava et al. [SFR+ 14] implicitly incorporate geometric variables, performing symbolic-geometric mapping using a planner-independent interface layer. In [KLP13] the current state uncer- tainty is incorporated, modeling the planning problem in the belief space. FFRob [GLPK18] directly conduct task planning by performing search over a sampled finite set of poses, grasps and configurations. The authors of [GLPK18] extend the FF heuristics, incorporating geometric and kinematic planning constraints that pro- vide a tight estimate of the distance to the goal. Iteratively Deepened Task and Motion Planning (IDTMP) is a constraint-based task planning approach that incorporates geometric information to account for the motion feasibility at the task planning level [DKCK18]. In our architecture, the task action costs are computed using a motion planner, similar to the motion planner information that guides the IDTMP task planner. IDTMP performs task-motion interaction using abstraction-refinement functions whereas we use semantic attachments. The scope of TMP is not limited to manipulation problems alone. TMP for navigation is rather ubiquitous in most real world scenarios. Real-world planning problems in large scale environments often require solving several sub-problems. For example, while navigating to a goal, the robot might have to visit other places of interests. Visiting these places of interest is a high-level task that can be addressed using traditional task planners. As such, TMP for navigation essentially involves selecting discrete actions to navigate to different regions, objects or locations of interest in the environment and deciding the order of these visits. Synthesizing the best set of discrete actions for a given objective requires computing the navigation costs for each of these actions. Hence it is inevitable that motion planning be interleaved with task planning to compute the motion costs for each of the discrete actions. Though it can be argued that the motion costs can be approximated a priori and fed to the task planner, in large knowledge-intensive domains such an assumption can be harder to justify, especially in the presence of localization and map uncertainty. UP2TA [MRMB16] develops a unified path planning and task planning framework for mobile robot navigation. An interesting feature of UP2TA is its task planner heuristic, which is a combination of the FF heuristic [Hof03] and the Euclidean distance between the waypoints. Wong et al. [WYYG18] develop a task planning approach that takes into account the optimal traversal costs to synthesize a plan. Similar to UP2TA, they define tasks that are to be performed at different waypoints. However, they pre-compute the optimal path for all pairs of waypoints. In contrast, our recent work [TMB19] consider a general approach where the robot has to reason at a high-level about different objects or locations or regions to navigate to. The objects/locations/regions are instantiated to their geometric counterpart, by considering a set of sampled poses. For example, if a robot has to reach a location close to a chair, the geometric instantiations of this symbolic goal would correspond to a set of poses around the chair. PETLON [LZS18] introduces a TMP approach for navigation that is task-level optimal. However, the action costs returned by their motion planner is the trajectory length and they assume completely observable domains. The approach in [TMB19], in addition to being task-level optimal is more general since it is not limited to any particular cost function and can be easily adapted to support any general formulation. Moreover, in [TMB19] planning is performed in the belief space— a probability distribution function over the robot states. The above discussed TMP approaches focus on single robot scenarios. This paper contributes to the literature in the following directions: (1) We propose a distributed multi-robot task-motion planning for navigating in large knowledge-intensive domains. Motion planning is performed in partially-observable state-spaces with motion and sensing uncertainty and hence the approach falls under the category of multi-robot belief space planning. (2) Our approach is task-level optimal, that is, the task plan cost returned by our approach is lower than any of the other possible task plans. (3) Finally, the TMP framework that embeds a motion planner within a task planner through an interface layer is probabilistically complete. 2 Preliminaries and Problem Definition 2.1 Task-Motion Planning Task planning or classical planning can be defined as the process of finding a discrete sequence of actions from the current state to a desired goal state [GNT16]. The Planning Domain Definition Language (PDDL) [MGH+ 98] being the de facto standard syntax for task planning, we resort to the same for modeling our task domain. Motion planning finds a sequence of collision free poses from a given initial/start pose (position and orientation) to a desired goal pose [Lat91]. Task-Motion Planning (TMP) essentially involves combining discrete and continuous decision-making to facilitate efficient interaction between the two domains. Starting from an initial state, TMP synthesizes a plan to a goal state by a concurrent or interleaved set of discrete actions and continuous collision-free motions. Definition 1. A task-motion domain is a tuple Ψ = (C, Ω, φ, ξ, q0 ) where: • C is the configuration space or the space of possible robot poses; • Ω = (S, A, γ, s0 , Sg ) is the task domain with S being a finite set of states, A a finite set of actions, γ : S × A → S is the state transition function, s0 ∈ S is the initial state and Sg ⊆ S is the set of goal states; • φ : S → 2C , is a function mapping states to the configurations. For example, if s represents the task state— the robot is in a corridor, then φ(s) corresponds to all configurations such that the robot is in the corridor; • ξ : A → 2C , is a function mapping actions to motion plans. Note that motion plan essentially computes a collision free poses in C; • q0 is the initial configuration. Definition 2. The TMP problem for the TMP domain Ψ is to find a sequence of actions a0 , ..., an such that si+1 = γ(si , ai ), sn+1 ∈ Sg and to find a sequence of motion plans τ0 , ..., τn such that for i = 0, ..., n, it holds that (i), τi (0) ∈ φ(si ) and τi (1) ∈ φ(si+1 ), (ii) τi+1 (0) = τi (1) and (iii) τi ∈ ξ(ai ). 2.2 Problem Definition We consider a distributed multi-robot TMP framework where robots operating in a known environment (map given) can observe each other, thereby facilitating collaborative multi-robot localization. When one robot detects another, the resulting localization uncertainty for both the robots is less than when there is no such mutual observations [RB02]. This stems from the integration of multi-robot constraints into the joint robot beliefs. In this work we only consider robots mutually observing themselves at the same time. However, it should be noted that multi-robot constraints can also be formulated for different robots observing the same environment at different . time instances [Ind18]. At any time k, we denote the robot pose (or configuration qk ) by xk = (x, y, θ), the acquired measurement is denoted by zk and the applied control action is denoted as uk . We consider a standard motion xk+1 = f (xk , uk ) + wk , wk ∼ N (0, Rk ), where wk is the random unobservable noise, modeled as a zero mean Gaussian. We also consider an observation model with Gaussian noise, zk = h(xk ) + vk , vk ∼ N (0, Qk ). The motion and observation models can be written probabilistically as p(xk+1 |xk , uk ) and p(zk |xk ), respec- tively. Given an initial distribution p(x0 ), the motion and observation models, the posterior probability distri- bution at time k can be written Z p(xk |Z0:k , U0:k−1 ) = ηp(zk |xk ) p(xk |xk−1 , uk−1 )b[xk−1 ] (1) . . where Z0:k = {z0 , ..., zk }, U0:k−1 = {u0 , ..., uk−1 } and b[xk−1 ] ∼ N (µk−1 , Σk−1 ) is the posterior probability distribution R or belief at time k−1. Similarly, given an action uk , the propagated belief can be written as b[xk+1 ¯ ]= p(xk+1 |xk , uk )b[xk ]. Given the current belief b[xk ] and the control uk , the propagated belief parameters can be computed using the standard Extended Kalman Filter (EKF) prediction. The Jacobian of f (·) with respect to xk will be denoted by Fk . Upon receiving a measurement zk , the posterior belief b[xk+1 ] is computed using the EKF update equations. We now formulate the multi-robot localization problem. For simplicity we consider only two robots r and r0 , but the formulation can be trivially expanded to incorporate R robots. At any time k, we denote the pose of robot r by xrk , the acquired measurement is denoted by zkr and the applied control action is denoted as urk . We first consider the case in which there are no mutual observations between the robots. For two robots r and r0 , the joint belief at time k is given by 0 r0 r0 0 r0 r0 p(xrk , xrk |Z0:k r , Z0:k r , U0:k−1 , U0:k−1 ) = p(xrk |Z0:k r r , U0:k−1 )p(xrk |Z0:k , U0:k−1 ) Z Z 0 0 0 0 0 r0 = ηp(zkr |xrk ) p(xrk |xrk−1 , urk−1 )b[xrk−1 ] · p(zkr |xrk ) p(xrk |xrk−1 , urk−1 )b[xk−1 ] (2) 0 As seen above the joint belief is factorized into individual beliefs of robot r and r0 . Let xk = [xrk , xrk ] be the joint state, then the EKF prediction can be written as 0 0 µ̄k = [f (µrk−1 , urk−1 ), f (µrk−1 , urk−1 ); Σ̄k = Fk−1 Σk−1 Fk−1 T + Rk−1 (3) where Fk−1 , Σk−1 and Rk−1 are diagonal matrices. This renders the predicted covariance matrix Σ̄k diagonal. Since we do not consider mutual observations, the Kalman gain is also a diagonal matrix  r   r   r   r  Fk−1 0 Σk−1 0 Rk−1 0 Kk−1 0 Fk−1 = r 0 ; Σk−1 = 0 ; Rk−1 = 0 ; Kk−1 = 0 (4) 0 Fk−1 0 Σrk−1 0 r Rk−1 0 r Kk−1 giving a diagonal covariance matrix Σk . As such this corresponds to performing the belief propagation and updates for each robot individually [RB02]. Now let us consider the case when robots can mutually observe each 0 other. When robot r observes robot r0 at time k, the measurement constraint will be denoted by ζkr,r . It is assumed that a common reference frame is established so that the robots can communicate relevant information with each other. The joint belief at time k is given by 0 0 0 0 0 0 0 0 0 p(xrk , xrk |Z0:k r r , Z0:k , ζkr,r , U0:k−1 r r , U0:k−1 ) = p(xrk |xrk , Z0:k r , ζkr,r , U0:k−1 r )p(xrk |Z0:k r r , U0:k−1 ) 0 0 0 0 0 0 0 = ηp(zkr |xrk )p(xrk |xrk , Z0:k−1 r , ζkr,r , U0:k−1 r ) · p(zkr |xrk )p(xrk |Z0:k−1 r r , U0:k−1 ) Z Z 0 0 0 0 0 0 0 r0 = ηp(zkr |xrk )p(ζkr,r |xrk , xrk ) p(xrk |xrk−1 , urk−1 )b[xrk−1 ] · p(zkr |xrk ) p(xrk |xrk−1 , urk−1 )b[xk−1 ] (5)  0 0  The measurement likelihood term p ζkr,r |xrk , xrk introduces 0cross correlations in Σk . This is because the measurement Jacobian is computed with respect to xrk−1 and xrk−1 [MPS05] unlike the previous scenario where the measurement Jacobian was computed separately for each r using its corresponding xrk−1 . We assume that 0 0 0 robot r measures the range and bearing of r0 , that is, ζkr,r = [dr,r r,r T k , φk ] where 0 xrk (2) − xrk (2) 0 q r,r 0 dr,r k = (xk r 0 (1) − xr (1))2 + (xr 0 (2) − xr (2))2 ; k k k φ k = arctan( 0 ) − xrk (3) (6) xrk (1) − xrk (1) Thus the Jacobian Hk−1 , the partial derivative of the measurement function with respect to the joint state is  0 0 0 0  (xr (1)−xr (1) (xr (2)−xr (2)) (xrk (1)−xrk (1) (xrk (2)−xrk (2)) − k r,r0 k − k r,r0 k 0 0 0 0 dr,r dr,r Hk−1 =  r0 dk r dk k k (7)   0 0 0 (xk (2)−xk (2)) (xrk (1)−xrk (1)) (xrk (1)−xrk (1)) (xrk (2)−xrk (2))  r,r 0 2 − r,r 0 2 −1 − r,r 0 2 r,r 0 2 0 (dk ) (dk ) (dk ) (dk ) The first time when a mutual observation is incorporated (say at time k), the measurement Jacobian introduces cross correlations in Σk and thereafter the matrices are no longer diagonal. 3 Approach PDDL-based planning frameworks are limited, as they are incapable of handling rigorous numerical calculations. Most approaches perform such calculations via external modules or semantic attachments [Wey80]. Recently, Bernardini et al. [BFLP17] developed a PDDL-based temporal planner to implicitly trigger such external calls via a specialized semantic attachments called external advisors. They classify variables into direct V dir , indirect V ind and free V f ree . V dir and V f ree variables are the normal PDDL function variables whose values are changed in the action effects, in accordance with PDDL semantics. V ind variables are affected by the changes in the V dir variables. A change in a V dir variable invokes the external advisor which in turn computes the V ind variables. The Temporal Relaxed Plan Graph (TRPG) [CCFL10] construction stage of the planner incorporates the indirect variable values for heuristic calculation, thereby synthesizing an efficient goal-directed search. We employ this semantic attachment based approach for the task-motion interface. Task-Motion planning for Navigation: An overview of our approach is shown in Fig. 1b. We define A = {a1 , ..., an } as the finite set of symbolic/discrete actions available to the task planner. We use a sampling based Probabilistic Roadmap (PRM) [KSLO96] to instantiate robot poses for the task actions. To begin with, the initial mean and covariance of the robot’s pose are assumed to be known. This means that, for each robot r the initial state sr0 corresponds to a single pose instantiation q0r . The regions to be navigated to are also instantiated into poses, by sampling from the pose space within each region or task state. For example, let us consider a scenario where robots need to visit different rooms L1, . . . , Ln. The task state sri might specify that the robot r is in room L1 and the goal state sri+1 can be for the robot r to reach room L2. In the considered scenario φ(sri ) and φ(sri+1 ), that is, the mapping from states to configurations, correspond to all possible poses such that r is in rooms L1 and afterwards must be in L2, respectively. In other words, the pose instantiations are the poses that lie inside the rooms and are sampled once the map of the environment is available. Since the set of possible poses is infinite, we randomly sample a set of poses corresponding to each task state si . Note that these pose instantiations for each room are the same for all the robots. Furthermore, this sampling is an independent problem and the pose instantiations are incorporated while building the entire roadmap. Task Planner Motion Planner S A a1 feasible path Action applied to Action calls motion planner optimal expand the state si aj feasible Optimal path returned path si+1 si+1 si+1 an feasible path (a) (b) Figure 1: (a) A fragment of the PDDL room domain. (b) The discrete actions available to the planner are denoted by A = {a1 , a2 , a3 , . . . , an }. Different motion plans are generated for the action that requires appropriate robot motion via an external mod- ule. This module is essentially a motion planner. The optimal path among the feasible motion plans is then selected, returning the optimal cost to the task planner. The corresponding action and the optimal path is the task-motion plan for changing the task state of a robot from si to si+1 . A fragment of the corresponding PDDL domain is shown in Fig. 1a. The external module computes the V ind values and is invoked only when a change occurs in the V dir variables due to action effects. The PDDL keyword increase is overloaded to refer to an encapsulated object [PAFL15] and the external module is called if the PDDL action to be expanded has an effect of the form (increase (vidir ) (vjind )), where vidir ∈ V dir and vjind ∈ V ind . Once such an action ai is expanded by the task planner, the corresponding start and goal states of robot r, that is, sri and sri+1 are communicated to the motion planner. This is facilitated through the function (triggered ?r1 ?from1 ?to1 ?r2 ?from2 ?to2) 1). This specifies that robot r1 is navigating from ?f rom1 to to1 and that robot r2 is navigating from ?f rom2 to to2, where f rom1, f rom2 and to1, to2 are free variables denoting the start and goal states (corresponds to different rooms) of robots r1 and r2 respectively. The function triggered is assigned the value of 1 each time the actions are expanded and re-initialized to 0 once the action duration is completed. This is performed so that the grounded variables are communicated to the motion planner. For each region si , the number of pose instantiations will be denoted by sni and a particular instantiation by sni k . For each robot r, with the pose instantiation of sri as the start node, for each pose instantiation of sri+1 , we simulate a sequence r,n of controls and observations along each edge of the roadmap starting from sri ,nk and ending in si+1j , estimating r,n the beliefs at the each of these nodes using (5). The si+1j that corresponds to the minimum cost is then selected r as the goal pose for robot r for the state si+1 . Thereafter, this instantiation becomes the start node when an expansion is attempted from state sri+1 for robot r. . Though our formulation can be adapted to any generic cost function we use a standard cost function, c = Mu cu + MG cG + MΣ cΣ , where cu is the control usage, cG is the distance to the goal and cΣ is the cost due to uncertainty, defined as trace(Σ), where Σ is the state covariance associated with the robot belief. Mu , MG and MΣ are user-defined weights. The cost of the selected motion plan is then returned to the task planner as the cost of the corresponding action. The variable external returns the motion cost computed by the external module and achieves semantic attachment by passing its value to the task-level cost variable act-cost (see Fig. 1a). The task-motion plan for changing the task state of the robot from the state si to si+1 is the ordered tuple of the action ai and the corresponding optimal path. The tuple is appended for all the task-level actions to generate the complete task-motion plan. Optimality: For a given roadmap, the plan synthesized by our approach is optimal at the task-level. This means that the task plan cost returned by our approach (c∗ ) is lower than any of the other possible task plan costs (c). Let us denote the optimal plan corresponding to c∗ as π ∗ . Suppose that there exists a plan π with associated cost c such that c < c∗ . If π and π ∗ have the same sequence of actions, this is not possible since the action costs are evaluated by the motion planner and for a given roadmap, the motion cost returned is the optimal for each action, giving c∗ ≤ c. If π and π ∗ have a different sequence of actions, the task planner ensures that the returned plan is optimal, giving c∗ ≤ c. Therefore, in both the case, we have c∗ ≤ c. Completeness: We provide a sufficient condition under which the probability of our approach returning a plan approaches one exponentially with the number of samples used in the construction of the roadmap. A task planning problem, Ω = (S, A, γ, s0 , Sg ) is complete if it does contain any dead-ends [HN01], that is there are no states from which goal states cannot be reached. The PRM motion planner is probabilistically complete [KF11], that is the probability of failure decays to zero exponentially with the number of samples used in the construction of the roadmap. Therefore, if the motion planner terminates each time it is invoked then probability of finding a plan, if it exists, approaches one. (a) Map (b) config 1 (c) config 2 Figure 2: (a) Map of the corridor environment. The stars with different colors represent landmarks that aid the robots in better localization. (b) and (c) Pose covariance evolution for robots r (blue trajectory) and r 0 (green trajectory). The belief evolution for a single planning instantiation corresponding to a unique set of simulated observations are shown. Black dots represent the sampled poses and the covariance estimates (only (x,y) portion shown) are shown as red ellipses. (b) config 1 incorporating mutual observations between the robots. (c) No mutual observations considered. (a) (b) (c) Figure 3: (a) Average position estimation errors. This corroborates the single instance of belief evolution as shown in Fig. 2. (b) Average planning time with increasing number of rooms to visit for 2 robots. The planning time is only about 5 seconds when 10 rooms are to be visited. (c) Average planning time in log-scale for different number of collaborating robots. 4 Empirical Evaluation We evaluate our approach in a simulated corridor environment whose map is as shown in Fig. 2a. The robot’s can navigate to rooms L = L1, . . . , L10 that are connected to one another through a corridor. These rooms have doors, which can either be closed or open, connecting them to the corridor. We assume that once the robot is near to a closed door that directly connects a room to the corridor, it is able to open the door– for example using human aid. Navigating to rooms can hence be encoded using a single high-level PDDL action goto room as seen in Fig. 1a. The stars with different colors represent certain unique features assumed to be known and modeled like a printer or trash can that aid the robot’s in better localization. The performance are evaluated on an Intel R Core i7-6500U under Ubuntu 16.04 LTS. We first validate our approach by considering a scenario in which robot r starting at room L1 has to visit room L10 and robot r0 starting at L10 has to visit L1. Fig. 2 shows the planned trajectories with belief evolution (pose covariances) for robots r and r0 . Multi-robot constraints are incorporated in Fig. 2(b) and correspond to config 1 while config 2 as seen in Fig. 2(c) does not consider mutual observations between the robots. Clearly, incorporating mutual observation constraints facilitate improved localization. We ran the same scenario for 25 different planning sessions, each time sampling the initial position of the robots r and r0 from the known initial beliefs. The average position errors at each node along the planned trajectories are shown in Fig. 3(a). This performance evaluation shows the improved estimation accuracy for both the robots while incorporating multi-robot constraints. In particular, for robot r0 in config 2, that is, without multi-robot constraints, it is seen that there is significant pose uncertainty along its path. This is attributed to the lack of landmarks, rendering inaccurate localization. However, incorporating multi-robot constraints significantly improves localization, with the worst case position norm error for r0 reducing by about 90%. Next, we test the scalability for an increasing number of rooms to visit. As the number of rooms to visit increase, the task-level complexity increase as the task completion requires more task-level actions. We ran config 1 for five different scenarios that correspond to visiting 2, 4, 6, 8 and 10 rooms, respectively. For each scenario, 25 different planning sessions are conducted with the rooms to be visited being selected randomly at each run. The selected rooms are then given in the goal condition of the PDDL problem. The average planning time with two robots are shown in Fig. 3(b), the plans being computed in less than 5 seconds in all cases. As seen from the figure, the planning time increase almost linearly as there is only one single high-level action, namely, goto room. Currently, this high-level action models the navigation of both r and r0 and therefore the complexity is directly dependent on the number of rooms. Finally, we test the scalability for an increasing number of collaborating robots. In the considered scenario eight rooms are to be visited with 2, 4 and 6 different robots. For each run, the rooms to be visited are randomly selected and the average time for 25 different planning sessions are plotted in log-scale in Fig. 3(c). It is seen that planning time scales exponentially with an increasing number of collaborating robots. This is quite intuitive as planning is to be performed for all possible robot pairs. 5 Conclusion and Discussion In this paper, we have introduced a multi-robot cooperative navigation approach for TMP under motion and sensing uncertainty. Task-motion interaction is facilitated by means of semantic attachments that return motion costs to the task planner. In this way, the action costs of the task plans are evaluated using a motion planner. The plan synthesized is optimal at the task-level since the overall action cost is less than that of other task plans generated for a given roadmap. It is also shown that our approach is probabilistically complete. Though our approach scales well with an increasing task-level complexity, there is an exponential increase in planning time as the number of cooperative robots that perform the task increases. Caching and reusing plans might help alleviate this complexity in some cases. Presently, our approach fares well only when there are an even number of rooms to visit. For odd number of rooms to visit, the generated plan can force additional robot motions since our task-level action is defined for a pair of robots. Let us consider a scenario with 4 robots r1, . . . , r4 and rooms L1, L3 and L7 to be visited. The synthesized plan might be that r1 visits L1, r2 visits L3 and r3 visits L7, r4 visits Li (where i = 1, . . . , 10). Robot r4 visiting Li is an additional room visit, even though it is not specified in the goal condition. This visit helps to obtain mutual observations between r3 and r4 but in practice we only need r1 visiting L1, r2 visiting L3 and r3 visiting L7. However, it is to be noted that this formulation still preserves the task-level optimality. Decoupling and defining the action for each robot can rescind the additional motions. Yet, the computational challenge associated with the semantic attachment architecture needs to be analyzed and it is an immediate work for future. References [BFLP17] Sara Bernardini, Maria Fox, Derek Long, and Chiara Piacentini. Boosting Search Guidance in Problems with Semantic Attachments. In International Conference on Automated Planning and Scheduling (ICAPS), pages 29–37, Pittsburgh, PA, USA, June 2017. [CAG09] Stéphane Cambon, Rachid Alami, and Fabien Gravot. A hybrid approach to intricate motion, manipulation and task planning. The International Journal of Robotics Research, 28(1):104–126, 2009. [CCFL10] Amanda Jane Coles, Andrew Ian Coles, Maria Fox, and Derek Long. Forward-chaining partial-order planning. In Twentieth International Conference on Automated Planning and Scheduling, 2010. [DGTN09] Christian Dornhege, Marc Gissler, Matthias Teschner, and Bernhard Nebel. Integrating symbolic and geometric planning for mobile manipulation. In Safety, Security & Rescue Robotics (SSRR), IEEE International Workshop on, pages 1–6. IEEE, 2009. [DKCK18] Neil T. Dantam, Zachary K. Kingston, Swarat Chaudhuri, and Lydia E. Kavraki. An Incremental Constraint-Based Framework for Task and Motion Planning. International Journal of Robotics Research, Special Issue on the 2016 Robotics: Science and Systems Conference, 37(10):1134–1151, 2018. [GLPK18] Caelan Reed Garrett, Tomas Lozano-Perez, and Leslie Pack Kaelbling. FFRob: Leveraging symbolic planning for efficient task and motion planning. The International Journal of Robotics Research, 37(1):104–136, 2018. [GNT16] Malik Ghallab, Dana Nau, and Paolo Traverso. Automated planning and acting. Cambridge Uni- versity Press, 2016. [HN01] Jörg Hoffmann and Bernhard Nebel. The FF Planning System: Fast Plan Generation Through Heuristic Search. Journal of Artificial Intelligence Research, 14:253–302, 2001. [Hof03] Jörg Hoffmann. The Metric-FF Planning System: Translating “Ignoring Delete Lists” to Numeric State Variables. Journal of Artificial Intelligence Research, 20:291–341, 2003. [Ind18] Vadim Indelman. Cooperative multi-robot belief space planning for autonomous navigation in un- known environments. Autonomous Robots, 42(2):353–373, 2018. [KF11] Sertac Karaman and Emilio Frazzoli. Sampling-based algorithms for optimal motion planning. The International Journal of Robotics Research, 30(7):846–894, 2011. [KLP13] Leslie Pack Kaelbling and Tomás Lozano-Pérez. Integrated task and motion planning in belief space. The International Journal of Robotics Research, 32(9-10):1194–1227, 2013. [KSLO96] Lydia E Kavraki, Petr Svestka, J-C Latombe, and Mark H Overmars. Probabilistic roadmaps for path planning in high-dimensional configuration spaces. IEEE Transactions on Robotics and Automation, 12(4):566–580, 1996. [Lat91] Jean-Claude Latombe. Robot Motion Planning. Kluwer Academic Publishers, 1991. [LDG+ 18] Fabien Lagriffoul, Neil T. Dantam, Caelan Garrett, Aliakbar Akbari, Siddharth Srivastava, and Lydia E. Kavraki. Platform-independent benchmarks for task and motion planning. Robotics and Automation Letters, 2018. [LZS18] Shih-Yun Lo, Shiqi Zhang, and Peter Stone. Petlon: Planning efficiently for task-level-optimal navi- gation. In Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, pages 220–228. International Foundation for Autonomous Agents and Multiagent Systems, 2018. [MGH+ 98] Drew McDermott, Malik Ghallab, Adele Howe, Craig Knoblock, Ashwin Ram, Manuela Veloso, Daniel Weld, and David Wilkins. PDDL- The Planning Domain Definition Language. In AIPS-98 Planning Competition Committee, 1998. [MPS05] Agostino Martinelli, Frederic Pont, and Roland Siegwart. Multi-robot localization using relative observations. In Proceedings of the 2005 IEEE international conference on robotics and automation, pages 2797–2802. IEEE, 2005. [MRMB16] Pablo Muñoz, Marı́a D R-Moreno, and David F Barrero. Unified framework for path-planning and task-planning for autonomous robots. Robotics and Autonomous Systems, 82:1–14, 2016. [PAFL15] Chiara Piacentini, Varvara Alimisis, Maria Fox, and Derek Long. An extension of metric temporal planning with application to ac voltage control. Artificial intelligence, 229:210–245, 2015. [RB02] Stergios I Roumeliotis and George A Bekey. Distributed multirobot localization. IEEE transactions on robotics and automation, 18(5):781–795, 2002. [SFR+ 14] Siddharth Srivastava, Eugene Fang, Lorenzo Riano, Rohan Chitnis, Stuart Russell, and Pieter Abbeel. Combined task and motion planning through an extensible planner-independent inter- face layer. In Robotics and Automation (ICRA), IEEE International Conference on, pages 639–646. IEEE, 2014. [TMB19] Antony Thomas, Fulvio Mastrogiovanni, and Marco Baglietto. Task-Motion Planning for Navigation in Belief Space. In The International Symposium on Robotics Research, 2019. [Wey80] Richard W. Weyhrauch. Prolegomena to a theory of mechanized formal reasoning. Artificial Intel- ligence, 13, 1980. [WYYG18] Cuebong Wong, Erfu Yang, Xiu-Tian Yan, and Dongbing Gu. Optimal path planning based on a multi-tree T-RRT* approach for robotic task planning in continuous cost spaces. In 2018 12th France-Japan and 10th Europe-Asia Congress on Mechatronics, pages 242–247. IEEE, 2018.