efficient interaction between the symbolic and geometric layers. In [DGTN09] task and motion planning is in-
terleaved by checking individual high-level action feasibility using semantic attachments [Wey80]. A combined
search in the logical and geometric spaces is performed in [CAG09] using a state composed of both the sym-
bolic and geometric paths. Srivastava et al. [SFR+ 14] implicitly incorporate geometric variables, performing
symbolic-geometric mapping using a planner-independent interface layer. In [KLP13] the current state uncer-
tainty is incorporated, modeling the planning problem in the belief space. FFRob [GLPK18] directly conduct
task planning by performing search over a sampled finite set of poses, grasps and configurations. The authors
of [GLPK18] extend the FF heuristics, incorporating geometric and kinematic planning constraints that pro-
vide a tight estimate of the distance to the goal. Iteratively Deepened Task and Motion Planning (IDTMP) is
a constraint-based task planning approach that incorporates geometric information to account for the motion
feasibility at the task planning level [DKCK18]. In our architecture, the task action costs are computed using
a motion planner, similar to the motion planner information that guides the IDTMP task planner. IDTMP
performs task-motion interaction using abstraction-refinement functions whereas we use semantic attachments.
    The scope of TMP is not limited to manipulation problems alone. TMP for navigation is rather ubiquitous in
most real world scenarios. Real-world planning problems in large scale environments often require solving several
sub-problems. For example, while navigating to a goal, the robot might have to visit other places of interests.
Visiting these places of interest is a high-level task that can be addressed using traditional task planners. As
such, TMP for navigation essentially involves selecting discrete actions to navigate to different regions, objects
or locations of interest in the environment and deciding the order of these visits. Synthesizing the best set of
discrete actions for a given objective requires computing the navigation costs for each of these actions. Hence
it is inevitable that motion planning be interleaved with task planning to compute the motion costs for each of
the discrete actions. Though it can be argued that the motion costs can be approximated a priori and fed to
the task planner, in large knowledge-intensive domains such an assumption can be harder to justify, especially in
the presence of localization and map uncertainty. UP2TA [MRMB16] develops a unified path planning and task
planning framework for mobile robot navigation. An interesting feature of UP2TA is its task planner heuristic,
which is a combination of the FF heuristic [Hof03] and the Euclidean distance between the waypoints. Wong et
al. [WYYG18] develop a task planning approach that takes into account the optimal traversal costs to synthesize
a plan. Similar to UP2TA, they define tasks that are to be performed at different waypoints. However, they
pre-compute the optimal path for all pairs of waypoints. In contrast, our recent work [TMB19] consider a
general approach where the robot has to reason at a high-level about different objects or locations or regions to
navigate to. The objects/locations/regions are instantiated to their geometric counterpart, by considering a set
of sampled poses. For example, if a robot has to reach a location close to a chair, the geometric instantiations
of this symbolic goal would correspond to a set of poses around the chair. PETLON [LZS18] introduces a TMP
approach for navigation that is task-level optimal. However, the action costs returned by their motion planner is
the trajectory length and they assume completely observable domains. The approach in [TMB19], in addition to
being task-level optimal is more general since it is not limited to any particular cost function and can be easily
adapted to support any general formulation. Moreover, in [TMB19] planning is performed in the belief space—
a probability distribution function over the robot states.
    The above discussed TMP approaches focus on single robot scenarios. This paper contributes to the literature
in the following directions: (1) We propose a distributed multi-robot task-motion planning for navigating in large
knowledge-intensive domains. Motion planning is performed in partially-observable state-spaces with motion and
sensing uncertainty and hence the approach falls under the category of multi-robot belief space planning. (2)
Our approach is task-level optimal, that is, the task plan cost returned by our approach is lower than any of the
other possible task plans. (3) Finally, the TMP framework that embeds a motion planner within a task planner
through an interface layer is probabilistically complete.

2     Preliminaries and Problem Definition
2.1   Task-Motion Planning
Task planning or classical planning can be defined as the process of finding a discrete sequence of actions from the
current state to a desired goal state [GNT16]. The Planning Domain Definition Language (PDDL) [MGH+ 98]
being the de facto standard syntax for task planning, we resort to the same for modeling our task domain. Motion
planning finds a sequence of collision free poses from a given initial/start pose (position and orientation) to a
desired goal pose [Lat91]. Task-Motion Planning (TMP) essentially involves combining discrete and continuous
decision-making to facilitate efficient interaction between the two domains. Starting from an initial state, TMP
synthesizes a plan to a goal state by a concurrent or interleaved set of discrete actions and continuous collision-free
motions.
Definition 1. A task-motion domain is a tuple Ψ = (C, Ω, φ, ξ, q0 ) where:
  • C is the configuration space or the space of possible robot poses;
  • Ω = (S, A, γ, s0 , Sg ) is the task domain with S being a finite set of states, A a finite set of actions, γ :
    S × A → S is the state transition function, s0 ∈ S is the initial state and Sg ⊆ S is the set of goal states;
  • φ : S → 2C , is a function mapping states to the configurations. For example, if s represents the task state—
    the robot is in a corridor, then φ(s) corresponds to all configurations such that the robot is in the corridor;
  • ξ : A → 2C , is a function mapping actions to motion plans. Note that motion plan essentially computes a
    collision free poses in C;
  • q0 is the initial configuration.
Definition 2. The TMP problem for the TMP domain Ψ is to find a sequence of actions a0 , ..., an such that
si+1 = γ(si , ai ), sn+1 ∈ Sg and to find a sequence of motion plans τ0 , ..., τn such that for i = 0, ..., n, it holds
that (i), τi (0) ∈ φ(si ) and τi (1) ∈ φ(si+1 ), (ii) τi+1 (0) = τi (1) and (iii) τi ∈ ξ(ai ).

2.2    Problem Definition
We consider a distributed multi-robot TMP framework where robots operating in a known environment (map
given) can observe each other, thereby facilitating collaborative multi-robot localization. When one robot detects
another, the resulting localization uncertainty for both the robots is less than when there is no such mutual
observations [RB02]. This stems from the integration of multi-robot constraints into the joint robot beliefs. In this
work we only consider robots mutually observing themselves at the same time. However, it should be noted that
multi-robot constraints can also be formulated for different robots observing the same environment at different
                                                                                                    .
time instances [Ind18]. At any time k, we denote the robot pose (or configuration qk ) by xk = (x, y, θ), the
acquired measurement is denoted by zk and the applied control action is denoted as uk . We consider a standard
motion xk+1 = f (xk , uk ) + wk , wk ∼ N (0, Rk ), where wk is the random unobservable noise, modeled as a zero
mean Gaussian. We also consider an observation model with Gaussian noise, zk = h(xk ) + vk , vk ∼ N (0, Qk ).
   The motion and observation models can be written probabilistically as p(xk+1 |xk , uk ) and p(zk |xk ), respec-
tively. Given an initial distribution p(x0 ), the motion and observation models, the posterior probability distri-
bution at time k can be written                             Z
                                 p(xk |Z0:k , U0:k−1 ) = ηp(zk |xk )     p(xk |xk−1 , uk−1 )b[xk−1 ]                                (1)
                 .                           .
where Z0:k = {z0 , ..., zk }, U0:k−1 = {u0 , ..., uk−1 } and b[xk−1 ] ∼ N (µk−1 , Σk−1 ) is the posterior probability
distribution
R                or belief at time k−1. Similarly, given an action uk , the propagated belief can be written as b[xk+1            ¯ ]=
   p(xk+1 |xk , uk )b[xk ]. Given the current belief b[xk ] and the control uk , the propagated belief parameters can
be computed using the standard Extended Kalman Filter (EKF) prediction. The Jacobian of f (·) with respect
to xk will be denoted by Fk . Upon receiving a measurement zk , the posterior belief b[xk+1 ] is computed using
the EKF update equations.
    We now formulate the multi-robot localization problem. For simplicity we consider only two robots r and r0 ,
but the formulation can be trivially expanded to incorporate R robots. At any time k, we denote the pose of
robot r by xrk , the acquired measurement is denoted by zkr and the applied control action is denoted as urk . We
first consider the case in which there are no mutual observations between the robots. For two robots r and r0 ,
the joint belief at time k is given by
              0
                        r0              r0                                   0
                                                                                 r0     r0
   p(xrk , xrk |Z0:k
                  r
                     , Z0:k    r
                            , U0:k−1 , U0:k−1 ) = p(xrk |Z0:k
                                                          r      r
                                                              , U0:k−1 )p(xrk |Z0:k , U0:k−1   )
                                                      Z                                              Z
                                                                                              0    0       0    0       0
                                                                                                                               r0
                                       = ηp(zkr |xrk ) p(xrk |xrk−1 , urk−1 )b[xrk−1 ] · p(zkr |xrk ) p(xrk |xrk−1 , urk−1 )b[xk−1 ] (2)

                                                                                                                             0
As seen above the joint belief is factorized into individual beliefs of robot r and r0 . Let xk = [xrk , xrk ] be the
joint state, then the EKF prediction can be written as
                                                          0      0
                           µ̄k = [f (µrk−1 , urk−1 ), f (µrk−1 , urk−1 ); Σ̄k = Fk−1 Σk−1 Fk−1
                                                                                           T
                                                                                               + Rk−1                               (3)

where Fk−1 , Σk−1 and Rk−1 are diagonal matrices. This renders the predicted covariance matrix Σ̄k diagonal.
Since we do not consider mutual observations, the Kalman gain is also a diagonal matrix
            r                                r                                   r                                           r            
            Fk−1             0                 Σk−1           0                      Rk−1                    0                     Kk−1    0
    Fk−1 =                r  0       ; Σk−1 =                  0           ; Rk−1 =                          0           ; Kk−1 =           0        (4)
             0           Fk−1                   0           Σrk−1                     0                   r
                                                                                                         Rk−1                       0      r
                                                                                                                                          Kk−1
giving a diagonal covariance matrix Σk . As such this corresponds to performing the belief propagation and
updates for each robot individually [RB02]. Now let us consider the case when robots can mutually observe each
                                                                                                       0
other. When robot r observes robot r0 at time k, the measurement constraint will be denoted by ζkr,r . It is
assumed that a common reference frame is established so that the robots can communicate relevant information
with each other. The joint belief at time k is given by
             0           0       0               0                      0          0                             0       0     0
    p(xrk , xrk |Z0:k
                  r      r
                      , Z0:k , ζkr,r , U0:k−1
                                        r        r
                                              , U0:k−1 ) = p(xrk |xrk , Z0:k
                                                                         r
                                                                             , ζkr,r , U0:k−1
                                                                                        r
                                                                                              )p(xrk |Z0:k
                                                                                                       r      r
                                                                                                           , U0:k−1 )
                                                     0                 0                   0     0               0       0         0
                         = ηp(zkr |xrk )p(xrk |xrk , Z0:k−1
                                                        r
                                                            , ζkr,r , U0:k−1
                                                                       r
                                                                             ) · p(zkr |xrk )p(xrk |Z0:k−1
                                                                                                       r      r
                                                                                                           , U0:k−1 )
                                                        Z                                               Z
                                           0          0                                         0    0        0   0       0
                                                                                                                                 r0
                     = ηp(zkr |xrk )p(ζkr,r |xrk , xrk ) p(xrk |xrk−1 , urk−1 )b[xrk−1 ] · p(zkr |xrk ) p(xrk |xrk−1 , urk−1 )b[xk−1 ]               (5)
                                   0              0
                                                     
 The measurement likelihood term p ζkr,r |xrk , xrk introduces 0cross correlations in Σk . This is because the
measurement Jacobian is computed with respect to xrk−1 and xrk−1 [MPS05] unlike the previous scenario where
the measurement Jacobian was computed separately for each r using its corresponding xrk−1 . We assume that
                                                                               0       0             0
robot r measures the range and bearing of r0 , that is, ζkr,r = [dr,r       r,r T
                                                                       k , φk ] where
                                                                                          0
                                                                                       xrk (2) − xrk (2)
       0
             q
                                                                 r,r 0
  dr,r
   k     =      (xk
                   r 0 (1) − xr (1))2 + (xr 0 (2) − xr (2))2 ;
                              k           k          k         φ k       =     arctan(    0              ) − xrk (3)                                 (6)
                                                                                       xrk (1) − xrk (1)
Thus the Jacobian Hk−1 , the partial derivative of the measurement function with respect to the joint state is
                            0                 0                     0                  0              
                          (xr (1)−xr (1)   (xr (2)−xr (2))       (xrk (1)−xrk (1)   (xrk (2)−xrk (2))
                      − k r,r0 k         − k r,r0 k           0               0                 0     0
                                                                         dr,r             dr,r
             Hk−1 =  r0 dk r                    dk                       k                 k
                                                                                                             (7)
                                                                                                      
                                               0                       0                0
                        (xk (2)−xk (2))    (xrk (1)−xrk (1))      (xrk (1)−xrk (1)) (xrk (2)−xrk (2))
                                                                                                       
                               r,r 0 2   −        r,r 0 2    −1 −          r,r 0 2         r,r 0 2    0
                                       (dk   )              (dk    )                           (dk       )                   (dk   )

   The first time when a mutual observation is incorporated (say at time k), the measurement Jacobian introduces
cross correlations in Σk and thereafter the matrices are no longer diagonal.
3      Approach
PDDL-based planning frameworks are limited, as they are incapable of handling rigorous numerical calculations.
Most approaches perform such calculations via external modules or semantic attachments [Wey80]. Recently,
Bernardini et al. [BFLP17] developed a PDDL-based temporal planner to implicitly trigger such external calls
via a specialized semantic attachments called external advisors. They classify variables into direct V dir , indirect
V ind and free V f ree . V dir and V f ree variables are the normal PDDL function variables whose values are changed
in the action effects, in accordance with PDDL semantics. V ind variables are affected by the changes in the V dir
variables. A change in a V dir variable invokes the external advisor which in turn computes the V ind variables.
The Temporal Relaxed Plan Graph (TRPG) [CCFL10] construction stage of the planner incorporates the indirect
variable values for heuristic calculation, thereby synthesizing an efficient goal-directed search. We employ this
semantic attachment based approach for the task-motion interface.
Task-Motion planning for Navigation: An overview of our approach is shown in Fig. 1b. We define
A = {a1 , ..., an } as the finite set of symbolic/discrete actions available to the task planner. We use a sampling
based Probabilistic Roadmap (PRM) [KSLO96] to instantiate robot poses for the task actions. To begin with,
the initial mean and covariance of the robot’s pose are assumed to be known. This means that, for each robot
r the initial state sr0 corresponds to a single pose instantiation q0r . The regions to be navigated to are also
instantiated into poses, by sampling from the pose space within each region or task state. For example, let us
consider a scenario where robots need to visit different rooms L1, . . . , Ln. The task state sri might specify that
the robot r is in room L1 and the goal state sri+1 can be for the robot r to reach room L2. In the considered
scenario φ(sri ) and φ(sri+1 ), that is, the mapping from states to configurations, correspond to all possible poses
such that r is in rooms L1 and afterwards must be in L2, respectively. In other words, the pose instantiations
are the poses that lie inside the rooms and are sampled once the map of the environment is available. Since the
set of possible poses is infinite, we randomly sample a set of poses corresponding to each task state si . Note
that these pose instantiations for each room are the same for all the robots. Furthermore, this sampling is an
independent problem and the pose instantiations are incorporated while building the entire roadmap.
                                                                                                              Task Planner                                              Motion Planner

                                                                                         S                                      A
                                                                                                                                    a1                                      feasible
                                                                                                                                                                              path


                                                                                                            Action applied to
                                                                                                                                          Action calls motion planner       optimal
                                                                                                            expand the state
                                                                                                si                                  aj                                      feasible
                                                                                                                                          Optimal path returned               path


                                                                                         si+1        si+1              si+1         an
                                                                                                                                                                            feasible
                                                                                                                                                                              path


                             (a)                                                                                                    (b)

Figure 1: (a) A fragment of the PDDL room domain. (b) The discrete actions available to the planner are denoted by A =
{a1 , a2 , a3 , . . . , an }. Different motion plans are generated for the action that requires appropriate robot motion via an external mod-
ule. This module is essentially a motion planner. The optimal path among the feasible motion plans is then selected, returning the optimal
cost to the task planner. The corresponding action and the optimal path is the task-motion plan for changing the task state of a robot from
si to si+1 .

    A fragment of the corresponding PDDL domain is shown in Fig. 1a. The external module computes the V ind
values and is invoked only when a change occurs in the V dir variables due to action effects. The PDDL keyword
increase is overloaded to refer to an encapsulated object [PAFL15] and the external module is called if the PDDL
action to be expanded has an effect of the form (increase (vidir ) (vjind )), where vidir ∈ V dir and vjind ∈ V ind .
Once such an action ai is expanded by the task planner, the corresponding start and goal states of robot r, that
is, sri and sri+1 are communicated to the motion planner. This is facilitated through the function (triggered ?r1
?from1 ?to1 ?r2 ?from2 ?to2) 1). This specifies that robot r1 is navigating from ?f rom1 to to1 and that robot
r2 is navigating from ?f rom2 to to2, where f rom1, f rom2 and to1, to2 are free variables denoting the start and
goal states (corresponds to different rooms) of robots r1 and r2 respectively. The function triggered is assigned
the value of 1 each time the actions are expanded and re-initialized to 0 once the action duration is completed.
This is performed so that the grounded variables are communicated to the motion planner. For each region si ,
the number of pose instantiations will be denoted by sni and a particular instantiation by sni k . For each robot r,
with the pose instantiation of sri as the start node, for each pose instantiation of sri+1 , we simulate a sequence
                                                                                                    r,n
of controls and observations along each edge of the roadmap starting from sri ,nk and ending in si+1j , estimating
                                                         r,n
the beliefs at the each of these nodes using (5). The si+1j that corresponds to the minimum cost is then selected
                                               r
as the goal pose for robot r for the state si+1 . Thereafter, this instantiation becomes the start node when an
expansion is attempted from state sri+1 for robot r.
                                                                                                                   .
    Though our formulation can be adapted to any generic cost function we use a standard cost function, c =
Mu cu + MG cG + MΣ cΣ , where cu is the control usage, cG is the distance to the goal and cΣ is the cost due to
uncertainty, defined as trace(Σ), where Σ is the state covariance associated with the robot belief. Mu , MG and
MΣ are user-defined weights. The cost of the selected motion plan is then returned to the task planner as the
cost of the corresponding action. The variable external returns the motion cost computed by the external module
and achieves semantic attachment by passing its value to the task-level cost variable act-cost (see Fig. 1a). The
task-motion plan for changing the task state of the robot from the state si to si+1 is the ordered tuple of the
action ai and the corresponding optimal path. The tuple is appended for all the task-level actions to generate
the complete task-motion plan.
Optimality: For a given roadmap, the plan synthesized by our approach is optimal at the task-level. This
means that the task plan cost returned by our approach (c∗ ) is lower than any of the other possible task plan
costs (c). Let us denote the optimal plan corresponding to c∗ as π ∗ . Suppose that there exists a plan π with
associated cost c such that c < c∗ . If π and π ∗ have the same sequence of actions, this is not possible since
the action costs are evaluated by the motion planner and for a given roadmap, the motion cost returned is the
optimal for each action, giving c∗ ≤ c. If π and π ∗ have a different sequence of actions, the task planner ensures
that the returned plan is optimal, giving c∗ ≤ c. Therefore, in both the case, we have c∗ ≤ c.
Completeness: We provide a sufficient condition under which the probability of our approach returning a plan
approaches one exponentially with the number of samples used in the construction of the roadmap. A task
planning problem, Ω = (S, A, γ, s0 , Sg ) is complete if it does contain any dead-ends [HN01], that is there are no
states from which goal states cannot be reached. The PRM motion planner is probabilistically complete [KF11],
that is the probability of failure decays to zero exponentially with the number of samples used in the construction
of the roadmap. Therefore, if the motion planner terminates each time it is invoked then probability of finding
a plan, if it exists, approaches one.
                    (a) Map                                         (b) config 1                                  (c) config 2
Figure 2: (a) Map of the corridor environment. The stars with different colors represent landmarks that aid the robots in better localization.
(b) and (c) Pose covariance evolution for robots r (blue trajectory) and r 0 (green trajectory). The belief evolution for a single planning
instantiation corresponding to a unique set of simulated observations are shown. Black dots represent the sampled poses and the covariance
estimates (only (x,y) portion shown) are shown as red ellipses. (b) config 1 incorporating mutual observations between the robots. (c) No
mutual observations considered.


                    (a)                                              (b)                                               (c)
Figure 3: (a) Average position estimation errors. This corroborates the single instance of belief evolution as shown in Fig. 2. (b) Average
planning time with increasing number of rooms to visit for 2 robots. The planning time is only about 5 seconds when 10 rooms are to be
visited. (c) Average planning time in log-scale for different number of collaborating robots.

4    Empirical Evaluation
We evaluate our approach in a simulated corridor environment whose map is as shown in Fig. 2a. The robot’s
can navigate to rooms L = L1, . . . , L10 that are connected to one another through a corridor. These rooms have
doors, which can either be closed or open, connecting them to the corridor. We assume that once the robot is
near to a closed door that directly connects a room to the corridor, it is able to open the door– for example
using human aid. Navigating to rooms can hence be encoded using a single high-level PDDL action goto room
as seen in Fig. 1a. The stars with different colors represent certain unique features assumed to be known and
modeled like a printer or trash can that aid the robot’s in better localization. The performance are evaluated on
an Intel R Core i7-6500U under Ubuntu 16.04 LTS.
   We first validate our approach by considering a scenario in which robot r starting at room L1 has to visit
room L10 and robot r0 starting at L10 has to visit L1. Fig. 2 shows the planned trajectories with belief evolution
(pose covariances) for robots r and r0 . Multi-robot constraints are incorporated in Fig. 2(b) and correspond to
config 1 while config 2 as seen in Fig. 2(c) does not consider mutual observations between the robots. Clearly,
incorporating mutual observation constraints facilitate improved localization. We ran the same scenario for
25 different planning sessions, each time sampling the initial position of the robots r and r0 from the known
initial beliefs. The average position errors at each node along the planned trajectories are shown in Fig. 3(a).
This performance evaluation shows the improved estimation accuracy for both the robots while incorporating
multi-robot constraints. In particular, for robot r0 in config 2, that is, without multi-robot constraints, it is seen
that there is significant pose uncertainty along its path. This is attributed to the lack of landmarks, rendering
inaccurate localization. However, incorporating multi-robot constraints significantly improves localization, with
the worst case position norm error for r0 reducing by about 90%.
   Next, we test the scalability for an increasing number of rooms to visit. As the number of rooms to visit
increase, the task-level complexity increase as the task completion requires more task-level actions. We ran config
1 for five different scenarios that correspond to visiting 2, 4, 6, 8 and 10 rooms, respectively. For each scenario,
25 different planning sessions are conducted with the rooms to be visited being selected randomly at each run.
The selected rooms are then given in the goal condition of the PDDL problem. The average planning time with
two robots are shown in Fig. 3(b), the plans being computed in less than 5 seconds in all cases. As seen from the
figure, the planning time increase almost linearly as there is only one single high-level action, namely, goto room.
Currently, this high-level action models the navigation of both r and r0 and therefore the complexity is directly
dependent on the number of rooms.
   Finally, we test the scalability for an increasing number of collaborating robots. In the considered scenario
eight rooms are to be visited with 2, 4 and 6 different robots. For each run, the rooms to be visited are randomly
selected and the average time for 25 different planning sessions are plotted in log-scale in Fig. 3(c). It is seen that
planning time scales exponentially with an increasing number of collaborating robots. This is quite intuitive as
planning is to be performed for all possible robot pairs.

5   Conclusion and Discussion
In this paper, we have introduced a multi-robot cooperative navigation approach for TMP under motion and
sensing uncertainty. Task-motion interaction is facilitated by means of semantic attachments that return motion
costs to the task planner. In this way, the action costs of the task plans are evaluated using a motion planner.
The plan synthesized is optimal at the task-level since the overall action cost is less than that of other task plans
generated for a given roadmap. It is also shown that our approach is probabilistically complete. Though our
approach scales well with an increasing task-level complexity, there is an exponential increase in planning time
as the number of cooperative robots that perform the task increases. Caching and reusing plans might help
alleviate this complexity in some cases.
   Presently, our approach fares well only when there are an even number of rooms to visit. For odd number of
rooms to visit, the generated plan can force additional robot motions since our task-level action is defined for a
pair of robots. Let us consider a scenario with 4 robots r1, . . . , r4 and rooms L1, L3 and L7 to be visited. The
synthesized plan might be that r1 visits L1, r2 visits L3 and r3 visits L7, r4 visits Li (where i = 1, . . . , 10).
Robot r4 visiting Li is an additional room visit, even though it is not specified in the goal condition. This visit
helps to obtain mutual observations between r3 and r4 but in practice we only need r1 visiting L1, r2 visiting
L3 and r3 visiting L7. However, it is to be noted that this formulation still preserves the task-level optimality.
Decoupling and defining the action for each robot can rescind the additional motions. Yet, the computational
challenge associated with the semantic attachment architecture needs to be analyzed and it is an immediate
work for future.

References
[BFLP17]     Sara Bernardini, Maria Fox, Derek Long, and Chiara Piacentini. Boosting Search Guidance in
             Problems with Semantic Attachments. In International Conference on Automated Planning and
             Scheduling (ICAPS), pages 29–37, Pittsburgh, PA, USA, June 2017.

[CAG09]      Stéphane Cambon, Rachid Alami, and Fabien Gravot. A hybrid approach to intricate motion,
             manipulation and task planning. The International Journal of Robotics Research, 28(1):104–126,
             2009.

[CCFL10]     Amanda Jane Coles, Andrew Ian Coles, Maria Fox, and Derek Long. Forward-chaining partial-order
             planning. In Twentieth International Conference on Automated Planning and Scheduling, 2010.

[DGTN09] Christian Dornhege, Marc Gissler, Matthias Teschner, and Bernhard Nebel. Integrating symbolic
         and geometric planning for mobile manipulation. In Safety, Security & Rescue Robotics (SSRR),
         IEEE International Workshop on, pages 1–6. IEEE, 2009.

[DKCK18] Neil T. Dantam, Zachary K. Kingston, Swarat Chaudhuri, and Lydia E. Kavraki. An Incremental
         Constraint-Based Framework for Task and Motion Planning. International Journal of Robotics
         Research, Special Issue on the 2016 Robotics: Science and Systems Conference, 37(10):1134–1151,
         2018.

[GLPK18] Caelan Reed Garrett, Tomas Lozano-Perez, and Leslie Pack Kaelbling. FFRob: Leveraging symbolic
         planning for efficient task and motion planning. The International Journal of Robotics Research,
         37(1):104–136, 2018.

[GNT16]      Malik Ghallab, Dana Nau, and Paolo Traverso. Automated planning and acting. Cambridge Uni-
             versity Press, 2016.
[HN01]      Jörg Hoffmann and Bernhard Nebel. The FF Planning System: Fast Plan Generation Through
            Heuristic Search. Journal of Artificial Intelligence Research, 14:253–302, 2001.
[Hof03]     Jörg Hoffmann. The Metric-FF Planning System: Translating “Ignoring Delete Lists” to Numeric
            State Variables. Journal of Artificial Intelligence Research, 20:291–341, 2003.
[Ind18]     Vadim Indelman. Cooperative multi-robot belief space planning for autonomous navigation in un-
            known environments. Autonomous Robots, 42(2):353–373, 2018.
[KF11]      Sertac Karaman and Emilio Frazzoli. Sampling-based algorithms for optimal motion planning. The
            International Journal of Robotics Research, 30(7):846–894, 2011.
[KLP13]     Leslie Pack Kaelbling and Tomás Lozano-Pérez. Integrated task and motion planning in belief space.
            The International Journal of Robotics Research, 32(9-10):1194–1227, 2013.
[KSLO96]    Lydia E Kavraki, Petr Svestka, J-C Latombe, and Mark H Overmars. Probabilistic roadmaps
            for path planning in high-dimensional configuration spaces. IEEE Transactions on Robotics and
            Automation, 12(4):566–580, 1996.
[Lat91]     Jean-Claude Latombe. Robot Motion Planning. Kluwer Academic Publishers, 1991.
[LDG+ 18] Fabien Lagriffoul, Neil T. Dantam, Caelan Garrett, Aliakbar Akbari, Siddharth Srivastava, and
          Lydia E. Kavraki. Platform-independent benchmarks for task and motion planning. Robotics and
          Automation Letters, 2018.
[LZS18]     Shih-Yun Lo, Shiqi Zhang, and Peter Stone. Petlon: Planning efficiently for task-level-optimal navi-
            gation. In Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent
            Systems, pages 220–228. International Foundation for Autonomous Agents and Multiagent Systems,
            2018.
[MGH+ 98] Drew McDermott, Malik Ghallab, Adele Howe, Craig Knoblock, Ashwin Ram, Manuela Veloso,
          Daniel Weld, and David Wilkins. PDDL- The Planning Domain Definition Language. In AIPS-98
          Planning Competition Committee, 1998.
[MPS05]     Agostino Martinelli, Frederic Pont, and Roland Siegwart. Multi-robot localization using relative
            observations. In Proceedings of the 2005 IEEE international conference on robotics and automation,
            pages 2797–2802. IEEE, 2005.
[MRMB16] Pablo Muñoz, Marı́a D R-Moreno, and David F Barrero. Unified framework for path-planning and
         task-planning for autonomous robots. Robotics and Autonomous Systems, 82:1–14, 2016.
[PAFL15]    Chiara Piacentini, Varvara Alimisis, Maria Fox, and Derek Long. An extension of metric temporal
            planning with application to ac voltage control. Artificial intelligence, 229:210–245, 2015.
[RB02]      Stergios I Roumeliotis and George A Bekey. Distributed multirobot localization. IEEE transactions
            on robotics and automation, 18(5):781–795, 2002.
[SFR+ 14]   Siddharth Srivastava, Eugene Fang, Lorenzo Riano, Rohan Chitnis, Stuart Russell, and Pieter
            Abbeel. Combined task and motion planning through an extensible planner-independent inter-
            face layer. In Robotics and Automation (ICRA), IEEE International Conference on, pages 639–646.
            IEEE, 2014.
[TMB19]     Antony Thomas, Fulvio Mastrogiovanni, and Marco Baglietto. Task-Motion Planning for Navigation
            in Belief Space. In The International Symposium on Robotics Research, 2019.
[Wey80]     Richard W. Weyhrauch. Prolegomena to a theory of mechanized formal reasoning. Artificial Intel-
            ligence, 13, 1980.
[WYYG18] Cuebong Wong, Erfu Yang, Xiu-Tian Yan, and Dongbing Gu. Optimal path planning based on
         a multi-tree T-RRT* approach for robotic task planning in continuous cost spaces. In 2018 12th
         France-Japan and 10th Europe-Asia Congress on Mechatronics, pages 242–247. IEEE, 2018.