Goal-based Deliberation for Cyber-Physical Systems? Arthur Bit-Monnot University of Sassari, Sassari, Italy afbit@uniss.it Abstract. In the recent years, the field of model-based Artificial Intelli- gence for goal-based autonomy have seen impressive developments in its capacity to tackle large problems. This development however, was done at the price of maintaining coarse abstractions on the system’s model and on the environment. The need to improve the capability of Cyber-Physical Systems to au- tonomously derive and execute complex strategies to fulfill some imposed tasks poses the question of the pertinence of such abstractions for sys- tems that are directly embedded in the physical world. Cyber-Physical Systems are indeed characterized by a continuous state, subject to vari- ous constraints reflecting physical limits, safety margins in the controller as well as formal requirements regarding their operation. In this work, we aim at developing models and algorithms for automated reasoning in the context of Cyber-Physical Systems. The focus is put on the development of planning techniques that account for the con- tinuous state of Cyber-Physical Systems and their particular operating constraints. Keywords: Planning · Cyber-Physical Systems · Constraint-based rea- soning · Robotics · Motion Planning · SMT 1 Context and Objectives The research fields of model-based deliberation have evolved differently depend- ing on whether they target a general view of intelligence (in AI) or systems with a physical embodiment (as in Robotics). This separation of concerns has enabled the development of a wealth of efficient dedicated methods targeting each field’s particular problems but have only been be loosely integrated. Indeed, research in AI, and in particular in planning and scheduling, has been more concerned with the high-level discrete decisions that must be taken to build a successful and high-quality plan while abstracting away most of the lower level details of the system. On the other hand, decision problems for Cyber-Physical Systems have to deal with the physics associated with the state of the system ? The research of Arthur Bit-Monnot has been funded by the EU Commission’s H2020 Programme under grant agreement N.732105 (CERBERO project). and are often deeply concerned with the metric and temporal environment in which the system evolves. A typical use case that illustrates the strong coupling between high-level strategies and low-level state and constraints is the one of planetary exploration rovers. Such systems face a variety of high-level tasks (navigation, observation, manipulation, sampling, communication) that must be scheduled and balanced with lower level constraints regarding terrain, actuators’ limits, bandwidth, or battery level. The cost of deploying and operating such systems mean that (i) operations must be planned in a near optimal way to get the maximal scientific return, and (ii) the system must obey strong safety requirements to make sure it remains functional over long periods of times. The physical and functional limits of the system are key in the definition of a high-level strategy as one must guarantee that a plan can be refined into a safe continuous state trajectory. While research in AI planning and scheduling has been mostly concerned with discrete and abstract models, the continuous numeric state of a Cyber-Physical System (CPS) imposes additional non-independent decision. For instance, re- questing an exploration rover to go to a given point will require planning a feasible path to reach the target location. Such a path is subject to constraints originating from its discrete state (e.g. speed limits when carrying a sampled rock) and of its numerically modeled environment (e.g. location of obstacles, sun exposition). Such constraints impact the feasibility of a given path and of the high-level plan that would require it. In addition, the trajectory taken by the robot impacts the high-level strategy, for instance through an increased battery consumption or arrival time that might impede further activities. As a result, providing plans that have a high probability of success must account for the low-level dynamics and requirements of the system. In addition, the formal requirements for the operation of such systems often span over both the discrete and continuous states of the system and must be accounted for in the decision process as they constitute strong conditions for the validity of a goal-oriented strategy. General Objective The objective of our work is to provide automated reasoning techniques targeting goal-oriented autonomy for Cyber-Physical Systems that account both for high-level objectives and the low-level continuous numeric state. In particular, we aim at tightly integrating constraint-based task planning with solvers accounting for the continuous state of a CPS; providing a unified model and framework for goal-based autonomy with strong guarantees on the system’s operation. 2 State of the Art Work in task planning and scheduling has focused mostly on discrete problems where the state of the system is subject to discrete changes at specific hap- penings. This restriction has enabled the development of efficient techniques centered on the use of heuristic search in planning [6,7] and constraint program- ming in scheduling [11]. Task planners’ high dependency on heuristics is however a shortcoming of forward-search planners as the accuracy of distance-estimators typically depends on many restrictive assumptions on the problems, making the extension of those planners to more expressive settings a very hard task. While constraint-based scheduling techniques offer a more flexible framework, there is still a gap to encode general planning problems as well as low-level dynamics over continuous states [3]. Recent attempts at extending task planning with a more expressive formalism focused on numeric states and processes are still in their infancy and are far from the scalability of the state-of-the-art techniques [5,4,17]. On the other hand, the robotics field has seen a steady progress on planning and controlling the motion of a robot in its physical environments, with now well established methods and tools [13]. Of particular interest to us is the current trend of encoding the problem as constraint optimization problems over the (finely discretized) continuous state trajectory of the system [18,16]. The last few years have seen an important developments on the problem of joint Task and Motion Planning (e.g. [1,9,8]) which is of particular interest to us because it represents a special and well understood sub-class of the problem of planning for Cyber-Physical Systems. However, most of the work has focused on geometrically demanding manipulation problems with minimal interactions regarding task-level requirements [12]. In particular, with one exception [14], such work is focused on a single robot and does not handle time and resources which are critical in the operation of most real world systems. 3 Past and Current Research Our work towards developing a framework for goal-based autonomy for Cyber- Physical Systems (CPS) could be coarsely characterized in three parts: – Develop and improve constraint-based models for discrete task planning, tar- geting scalability, support for incremental online plan adaptation as well as lazy learning as a mean for extension. – Generalize existing work in robotics for handling continuous state as non- linear optimization problems. – Tightly integrate the discrete and continuous solvers through means of lazy learning, the continuous state solver acting as a theory solver with respect to the discrete solver. 3.1 Constraint-based Planning An important requirement for our research project is to have a mean to solve planning and scheduling problems that: 1. can scale to large discrete problem featuring resources, multiple agent and temporally-extended requirements, 2. is adapted for online plan adaptation, and 3. can be extended towards techniques targeting the continuous numeric state. Our work here has been focused on the use of constraint-based models for temporal planning and scheduling. Compared to heuristic search in state-space which is the standard in automated planning research, the proposed constraint- based approach is particularly appealing by (i) its capacity to incrementally incorporate additional constraints and (ii) the fact that it does not rely on elaborate heuristics. The former is critical for the envisioned way of integrated reasoning on continuous state through lazy learning and for handling online plan adaptation. The latter is critical in that the reliance of forward-search planners on extremely accurate heuristics comes as a liability when trying to support complex domains that arise when operating on real CPSs. We developed an encoding for temporal planning where (i) actions are en- coded as optional chronicles and (ii) action parameters and temporal variables are lifted and related by constraints [3], which is inspired by recent results in the field of constraint-based planning and scheduling [2,10]. We showed that such a representation can be exploited by off-the-shelf SMT solvers with performance comparable to state-of-the-art planners on temporal problems [3]. The proximity of the encoding with the one of constraint-based scheduling solvers additionally opens up the possibility of reusing those as an alternative backend [11]. More importantly the use of constraint-based solvers, and SMT in particular, enables the use of clause learning as a way to support more specialized solvers targeting the continuous state of CPSs. 3.2 Handling Numeric Continuous State The key idea for handling continuous state is to provide a constraint satisfaction and optimization solver that acts as a theory solver with respect to the dis- crete planner. The continuous state is subject to a fined-grained discretization resulting in a non-linear constraint satisfaction problem. We build on recent insights from the robotics community in the use of op- timization techniques for trajectory and motion planning [18,16] and generalize those techniques towards more general requirements over the continuous state rather than the physical occupancy of the robot only. Our time-oriented constraint based model allows us to reason at an abstract level on the limited set of happenings at which specific state features of the system are constrained. Here, we detail our scheme to reason on the continuous state evolution between happenings. Our approach is based on the following ideas: – The partial state evolution between two happening is finely discretized. Each sampled partial state results in a set of decision variables representing, e.g., the configuration of a mobile robot or the level of a battery. – A nonlinear constraint satisfaction problem is formulated in which the deci- sion variables in sampled states are related with constraints encoding: • state dynamics, e.g., the distance between two sample poses of a robot must obey velocity and acceleration limits; • constraints originating from the task-level plan, that might be placed on a particular timepoint (e.g. the pose of the robot must be in a given area) or over a temporal interval (e.g. the battery level is linearly increasing during a recharge action); • constraints representing requirements on the hybrid state, e.g., the arm of a robotic manipulator must remain still while navigating; • implicit general constraints, e.g., there must be no collision between any of the robot’s poses and its environment or the battery level must always remain above a given threshold. The resulting problem cannot be expected to be convex in the general case making it untractable to prove the presence or absence of a satisfying assignment. To overcome this, a constraint satisfaction problem is turned into a nonlinear op- timization problem with the objective of minimizing the squared error on each of the constraints. Given an initial assignment, efficient techniques such as Guauss- Newton or Levenberg-Marquardt are available to drive it into a local optimum in which constraint violations are minimized. In practice, the generation of non- linear optimization problems over the variables of the partial states is expected to result in problems with a large number of variables. However, the constraints that result from constraining the state evolution typically relates the variables of a state sample only with the variables of the directly preceding and following state samples. This particular structure yields very sparse optimization problems with a banded structure and can be exploited by nonlinear solvers [15]. This fact has been notably leveraged in the context of trajectory planning to provide fast online trajectory replanning in an environment with dynamic obstacles [16]. This scheme of considering the fulfillment of constraints on the continuous and numeric state globally as a constraint optimization problem has several advantages: – Assignment to numerical parts of the problem is delayed and handled by the specialized solver. A key benefit is that the high-level solver does not need to commit to such values which would require elaborate heuristics. This overcomes one of the major difficulties of joint task and motion planning. – When the problem has a “simple” solution, it can be found very quickly and without backtracking on variable assignments. – On the occurrence of discrepancies between the planned state and the actual one at execution time, a previous solution can be very efficiently adapted to remain within safety bounds by reoptimizing it. In practice, this approach provides us with a formalism to tackle the problem of refining high-level plans into detailed state trajectories that can be checked for feasibility and safety. Importantly, a set of trajectories can be continuously adapted by means of reoptimization to account for the monitoring feedback, providing online adaptation, coordination and collision avoidance. Treating this problem of refining a plan into low-level primitives as a global optimization prob- lem allows us to treat it in a principled way with known optimization techniques. Our current prototype builds on the g 2 o optimization framework [15] and its use for trajectory optimization [16]. We are able to represent partially ordered and explicitly timed state trajectories for several robots, on problems involving navigation, object placement and energy consumption. Our current research focus is on the extraction of a minimized unsatisfiable core in the case where the local optimum results in constraint violations; in order to enable a good bi-directional integration with the high-level planner. While the violated constraints provide some information, the local optimum found results both from the initial assignment and on non-convex constraints that can drive the search into a particular basin. In practice, witnessing a constraint violation means that the combination of a set of constraints and of an initial assignment (i.e. a state trajectory) results in an unfeasible refined plan. Our key challenge is to efficiently explore the space of possible refinements to find a satisfiable state trajectory or provide a probabilistic evidence that a set of constraints should be relaxed in the high-level plan. Such evidence can then be efficiently handled by our constraint-based engine to prune the search space of alternative high-level solutions. References 1. Bidot, J., Karlsson, L., Lagriffoul, F., Saffiotti, A.: Geometric Backtracking for Combined Task and Path Planning in Robotic Systems. Tech. rep. (2013) 2. Bit-Monnot, A.: Temporal and Hierarchical Models for Planning and Acting in Robotics. Ph.D. thesis, Université de Toulouse (2016) 3. Bit-Monnot, A.: A Constraint-based Encoding for Domain-Independent Temporal Planning. In: International Conference on Principles and Practice of Constraint Programming (CP) (2018) 4. Bryce, D., Gao, S., Musliner, D., Goldman, R.: SMT-based Nonlinear PDDL+ Planning. In: AAAI Conference on Artificial Intelligence (2015) 5. Cashmore, M., Fox, M., Long, D., Magazzeni, D.: A Compilation of the Full PDDL+ Language into SMT. In: International Conference on Automated Plan- ning and Scheduling (ICAPS) (2016) 6. Coles, A., Coles, A., Fox, M., Long, D.: Forward-Chaining Partial-Order Planning. In: International Conference on Automated Planning and Scheduling (ICAPS) (2010) 7. Coles, A., Coles, A., Fox, M., Long, D.: COLIN: Planning with Continuous Linear Numeric Change. Journal of Artificial Intelligence Research (JAIR) 44 (2012) 8. Dantam, N.T., Kingston, Z.K., Chaudhuri, S., Kavraki, L.E.: Incremental Task and Motion Planning: A Constraint-Based Approach. In: Robotics: Science and Systems Conference (2016) 9. Garrett, C.R.: FFRob: An Efficient Heuristic for Task and Motion Planning. In: Algorithmic Foundations of Robotics (2015) 10. Laborie, P., Rogerie, J.: Reasoning with Conditional Time-Intervals. In: Interna- tional Florida Artificial Intelligence Research Society Conference (FLAIRS) (2008) 11. Laborie, P., Rogerie, J., Shaw, P., Vilím, P.: IBM ILOG CP optimizer for schedul- ing. Constraints 23(2) (2018) 12. Lagriffoul, F.: On Benchmarks for Combined Task and Motion Planning. In: RSS Workshop on Task and Motion Planning (2016) 13. LaValle, S.M.: Planning Algorithms. Cambridge University Press (2006) 14. Mansouri, M., Pecora, F.: More Knowledge on the Table : Planning with Space , Time and Resources for Robots. In: International Conference on Robotics and Automation (ICRA) (2014) 15. Rainer, K., Grisetti, G., Konolige, K.: g2o : A General Framework for Graph Opti- mization. In: International Conference on Robotics and Automation (ICRA) (2011) 16. Rösmann, C., Hoffmann, F., Bertram, T.: Integrated online trajectory planning and optimization in distinctive topologies. Robotics and Autonomous Systems 88 (2017) 17. Scala, E., Ramirez, M., Haslum, P., Thiebaux, S.: Numeric Planning with Dis- junctive Global Constraints via SMT. In: International Conference on Automated Planning and Scheduling (ICAPS) (2016) 18. Schulman, J., Duan, Y., Ho, J., Lee, A., Awwal, I., Bradlow, H., Pan, J., Patil, S., Goldberg, K., Abbeel, P.: Motion Planning with Sequential Convex Optimization and Convex Collision Checking. International Journal of Robotics Research (IJRR) (2014)