FailRecOnt – An Ontology-Based Framework for
Failure Interpretation and Recovery in Planning
and Execution
Mohammed Diab¹, Mihai Pomarlan², Stefano Borgo³, Daniel Beßler⁴, Jan Rosell¹,
John Bateman² and Michael Beetz⁴
¹ Institute of Industrial and Control Engineering, Universitat Politècnica de Catalunya
² Faculty of Linguistics and Literature, University of Bremen
³ Laboratory for Applied Ontology (LOA), ISTC CNR
⁴ Institute for Artificial Intelligence, University of Bremen


Abstract

Autonomous mobile robot manipulators have the potential to act as robot helpers at home to improve quality of life for various user populations, such as elderly or handicapped people, or to act as robot co-workers on factory floors, helping in assembly applications where collaborating with other operators may be required. However, robotic systems do not show robust performance when placed in environments that are not tightly controlled. An important cause of this is that failure handling often consists of scripted responses to foreseen complications, which leaves the robot vulnerable to new situations and ill-equipped to reason about failure and recovery strategies. Instead of libraries of hard-coded reactions that are expensive to develop and maintain, more sophisticated reasoning mechanisms are needed to handle failure. This requires an ontological characterization of what failure is, what concepts are useful to formulate causal explanations of failure, and integration with knowledge of available resources including the capabilities of the robot as well as those of other potential cooperative agents in the environment, e.g. a human user. We propose the FailRecOnt framework as a step in this direction. We have integrated an ontology for failure interpretation and recovery with a contingency-based task and motion planning framework such that a robot can deal with uncertainty, recover from failures, and deal with human-robot interactions. A motivating example is introduced to justify this proposal, and the proposal has been tested on a challenging scenario.




1. Introduction
Robotic manipulators have established themselves in industry as reliable tools for some complex
but repetitive tasks, but this reliability depends on a carefully designed and controlled environment.
Domestic settings do not have this property, and despite the interest in service robots working in
the home, today's robots remain too brittle because their failure handling is insufficiently robust.

RobOntics 2021: 2nd International Workshop on Ontologies for Autonomous Robotics, held at JOWO 2021: Episode VII, The Bolzano Summer of Knowledge, September 11–18, 2021
mohammed.diab@upc.edu (M. Diab); mihai.pomarlan@uni-bremen.de (M. Pomarlan); stefano.borgo@cnr.it (S. Borgo); danielb@uni-bremen.de (D. Beßler); jan.rosell@upc.edu (J. Rosell); bateman@uni-bremen.de (J. Bateman); beetz@uni-bremen.de (M. Beetz)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073.
Figure 1: A motivating example to justify our approach.




   This paper presents a framework for autonomous behavior which is a step towards a more
automated, reasoning- and knowledge-driven approach to failure handling. Such an approach
needs a concept of what “failure” means, what kinds of failures might happen and why, and what
an appropriate response might be. The reasoning must also be integrated into the perception-
action loop of the robot, and able to guide a plan repair process to resume or repeat a task after a
failure. The main contributions of this paper are:

    • Ontology formalization:¹ Formal definitions of concepts describing failures in terms of
      causal mechanism, location, time of performance, and functional considerations, including
      concepts to describe recovery strategies according to the available resources and the
      required plan repair operations.
    • State interpretation based on observation: Integration of sensing action and reasoning
      action with failure knowledge. Perception flags abnormal events and describes their nature
      in terms of failure conceptualizations.
    • Plan repair guided by reasoning: Failure descriptions inform reasoning processes that pro-
      pose possible recovery strategies via classification, which allows posing repair/replanning
      queries to a contingency task planner.

   The integration of our proposed ontology-based reasoning mechanism and contingency-based
planning has been implemented for a robot and demonstrated in a test scenario motivated by
previous work [1]. The task is to store an object (a cylinder labeled A) in a given tray according
to its color. Cylinder A may be among other objects, each of which has to be manipulated
according to its nature. Figure 1 shows the initial belief state of the mobile manipulation problem:
cylinder A is painted gray because its color is uncertain (it could be red or green); it is not known
whether the can is filled (and hence whether the can should be pushed or picked up), nor whether
the containers are open or closed (and hence whether an object can be placed directly inside the
box). We enhance our previous work with an ontological treatment of failure interpretation and
recovery strategies that allows the robot to fix its plan whether or not a human is available to
assist.

   ¹ https://github.com/ease-crc/failrecont


2. Related Work
An influential, though informal, conceptualization of system dependability is due to Laprie [2]. It
defines failures as a system being unable to operate according to nominal specifications; failures
are caused by error states, in turn caused by faults due to the system, its environment, or its
design. Carlson and Murphy [3] enrich this conceptualization with more human-caused faults and
study failures in deployments of search-and-rescue unmanned ground vehicles. Honig and Oron-
Gilad [4] further enrich the human-caused fault side of the conceptualization and survey existing
failure classifications for robotic systems, especially those due to faults in human-robot interaction.
Ross [5] builds upon this framework and defines errors in terms of recovery strategies. Unlike
our approach, Laprie’s conceptualization is informal, i.e., meant for use by human researchers
and engineers, and treats failures as disruptive and rare events in the life of a system. What we
propose is a formalized account of task failure, i.e., an ontology that a machine can reason with
and use in the frequent eventuality of a particular task – not a system – failing.
   Early detection of erroneous robot behavior has been pursued by model-free methods with
classifiers trained on data from regular operation [6], hidden Markov models [7], predictive
models [8], and models trained from simulation [9, 10].
   An early example of fault-tolerant planning is provided by Jensen et al. [11], in which actions
may have likely/unlikely side effects, some of which lead to failure; the authors investigate
algorithms for finding plans that can recover from some number of such failures.
   Our first steps toward failure interpretation and recovery were presented in [1] and [12].
The former is a contingency-based task and motion planning approach; the basic Contingent-FF
planner [13] has been modified to include human-robot collaboration and perception-based state
observation. The latter work describes several sources of failures in automated planning and
execution – geometric, hardware-related, software-related. The ontology presented there was
formalized on foundations such as DUL [14] and SUMO [15] in order to generalize the approach
and allow it to be widely used in robotic systems. However, that ontology did not describe
recovery strategies that would let the plan automatically repair itself.
   Other investigations into interpreting failures have been carried out, e.g. [16], in which three
layers of knowledge are proposed. Although that study shows how assumptive planning provides a
single mechanism for interpretation under uncertainty, it is less applicable to robot manipulation
planning in cluttered environments. We compare this approach with ours in Sec. 7.
3. Ontological Modelling of Failures
3.1. Conceptualization
Ontologically speaking, the labels “success” and “failure” are perspectival. They do not apply to
actions or events per se but are used by an agent to evaluate the result of its action based on its
expectations. The robot acts with the aim to bring about a desired state of the world, therefore it
needs to establish whether the performed action and the resulting state are as expected. A failure
is a lack of correspondence between the expectation and the actual state of the world.
   In this paper we concentrate on an actual robot platform, and present both robot-specific
and generic knowledge about actions to derive failure detection and recovery decisions. This
illustrates how the approach is applicable to a broad class of cases and how it can work for other
robots or scenarios provided adaptations to the specifics of a robot platform are implemented.

3.2. Formalization
We use the DUL ontology as a foundation². However, we need to enrich the DUL terminology
to cover our scenario. In particular, in DUL the notion of Situation is agent- and language-
independent. In robotics one needs to speak of epistemic situations (“e-situations”), i.e., situations
as accessible to a specific robot, relative to its capabilities to sense and describe the world.
   We write Situation, capitalized and italicized, for the general DUL concept, and “e-situation”
to refer to the robot’s specialized concept. Thus, we call “e-situation” a fragment of physical
reality (a “state of the world”) as it can be sensed and processed by the robot.
   By construction, to each e-situation corresponds a different situation description: an expression
in the robot’s language giving the maximal knowledge the robot may have about the e-situation.
An e-situation S satisfies a description D when S makes the description D true.
   We need to introduce another concept, that of expected transformation. An expected transfor-
mation is a triple ⟨S_i; Act_j; S_f⟩ stating that the execution of action Act_j at state S_i is expected
to terminate in state S_f. Transformations are robot-dependent and normative. They may be
regularly updated as a result of experience in learning robots.
   The execution of an action Act_j at state S_i is deemed successful if it is an instantiation of the
expected transformation ⟨S_i; Act_j; S_f⟩, i.e., if the ending state satisfies S_f. A failure occurs
when an action instantiates a triple ⟨S_i; Act_j; S′⟩ such that (a) the triple is not an expected
transformation, and (b) for any expected transformation ⟨S_i; Act_j; S_f⟩, S′ and S_f cannot both
hold in any possible world state.
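   This definition can be made concrete in a few lines of code. The following is a minimal Python sketch, not part of the released framework: it assumes situation descriptions are modelled as sets of ground literals and that an incompatibility test (whether two literals can never co-hold) is supplied by some other reasoning module; all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet

Literal = str  # a ground literal of the robot's language, e.g. "color(A, red)"

@dataclass(frozen=True)
class ExpectedTransformation:
    """The normative triple <S_i; Act_j; S_f>."""
    pre: FrozenSet[Literal]    # description satisfied by the initial e-situation S_i
    action: str                # identifier of Act_j
    post: FrozenSet[Literal]   # description the ending state is expected to satisfy

def execution_failed(expected: ExpectedTransformation,
                     observed_end: FrozenSet[Literal],
                     incompatible: Callable[[Literal, Literal], bool]) -> bool:
    """True when the observed end state S' both misses the expected
    postcondition S_f (condition (a)) and contains a literal that can never
    co-hold with one of S_f's literals (condition (b))."""
    if expected.post <= observed_end:   # S_f is satisfied: success, no failure
        return False
    # (b): S' and S_f cannot both hold in any possible world state.
    return any(incompatible(o, e)
               for o in observed_end for e in expected.post)
```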
   We will next present how the above discussion is formalized in description logic axioms. We
introduce a relation, manifestsIn, of domain Situation and range Event. In our model, State is a
subclass of Event. A DUL Situation satisfies some Description. The correspondences between
a situation description and an e-situation as state of the world are represented by the chain
isSatisfiedBy ∘ manifestsIn.
   The expected transformation ⟨S_i; Act_j; S_f⟩ becomes in DUL terms an Action that isEventIn-
cludedIn some Situation which hasPostcondition a Situation corresponding to S_f.

   ² DUL is formulated in a fragment of description logic, which we therefore also use for our work here; see the
appendix at https://sir.upc.edu/projects/ontologies/ for a short introduction to its syntax and semantics.
   We derive our Failure concept from DUL’s EventType. A Failure classifies only Actions
(Events with an Agent enacting a Plan pursuing a Goal). Further, Failure classifies only Actions
for which the Agent knows of inconsistencies between expected and actual postconditions. The
mismatch is represented by a new concept we defined, NonrealizedSituation, i.e. Situations
without manifestsIn links to any Events:
PlanExecution ⊑ Situation (DUL axiom)
NonrealizedSituation ⊑ Situation
NonrealizedSituation ⊓ ∃manifestsIn.Event ⊑ ⊥
FailedPlanExecution ≡ PlanExecution ⊓ ∃hasPostcondition.NonrealizedSituation
Failure ≡ EventType ⊓ ∀classifies.(Action ⊓ ∃isEventIncludedIn.FailedPlanExecution)
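   For illustration, these axioms can be written down with the owlready2 OWL library. The following is only a minimal sketch with a placeholder IRI and locally re-declared stand-ins for the DUL terms, not the released FailRecOnt ontology (see footnote 1 for that); the disjointness axiom is rendered as a logically equivalent subclass-of-complement axiom.

```python
from owlready2 import get_ontology, Thing, ObjectProperty, Not

onto = get_ontology("http://example.org/failrec-sketch.owl")  # placeholder IRI

with onto:
    # Local stand-ins for the DUL terms used by the axioms.
    class Event(Thing): pass
    class Action(Event): pass
    class Situation(Thing): pass
    class EventType(Thing): pass
    class manifestsIn(ObjectProperty):
        domain = [Situation]
        range = [Event]
    class hasPostcondition(ObjectProperty): pass
    class isEventIncludedIn(ObjectProperty): pass
    class classifies(ObjectProperty): pass

    class PlanExecution(Situation): pass               # DUL axiom

    class NonrealizedSituation(Situation): pass
    # NonrealizedSituation ⊓ ∃manifestsIn.Event ⊑ ⊥:
    NonrealizedSituation.is_a.append(Not(manifestsIn.some(Event)))

    class FailedPlanExecution(Situation): pass
    FailedPlanExecution.equivalent_to = [
        PlanExecution & hasPostcondition.some(NonrealizedSituation)]

    class Failure(EventType): pass
    Failure.equivalent_to = [
        EventType & classifies.only(
            Action & isEventIncludedIn.some(FailedPlanExecution))]
```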
   We use a feature of open-world semantics: facts not asserted are not false, but unknown. A
robot may know of a Situation without any manifestsIn links to Events in the robot’s history
but, unless that Situation is explicitly asserted to be a NonrealizedSituation, the robot maintains
the possibility that the Situation could manifest into an Event, e.g. one in the future. Decisions
on whether a Situation corresponds to some observed Event should come from a module such
as perception. What our ontology describes is how the decision from perception should be
formalized so that it becomes accessible to further reasoning.
   The input to reasoning about failure recovery is, in our approach, a knowledge graph containing
a snapshot of the robot’s history up to a Failure – the past observed Events, their participants, and
any Situation they might relate to.


4. From Failure Understanding to Recovery Strategies
4.1. Causal Explanations for Failure
The main question for an Agent when detecting a Failure is what to do about it. The answer
is often simple: repeat the last action. Such “reflex” answers are not always good enough, but
they are necessary for an Agent to act fluidly.
   However, simple reflexes will not always work. E.g., a robot wants to turn on a blender, so
it pushes a button; the blender doesn’t start. If the blender is not plugged in, repeated pushing
does nothing. To infer how to act, the Agent needs a causal explanation for why the blender fails
to start. One of the goals of our failure ontology is to define concepts for creating such causal
explanations. Our approach is inspired by Galton’s model of causal relations [17].
   However, because of the nature of Failures, we need to push beyond this model. We are
interested in prevention, and causal explanations of why something did not happen – why the
actual state of the world does not match the Agent’s expectations. The NonrealizedSituation
concept is our way around one difficulty identified by Galton [17]: a prevented Event did not
occur, so there should be no token associated with it in the robot’s history. The expectation for a
state however did exist, modelled as a Situation, and as a NonrealizedSituation when it matches
no actual Events. We define prevents as a relation from a Situation which manifests in some
Event to a NonrealizedSituation:
Figure 2: Examples illustrating our approach to represent task executions and failures: (a) a
successful task execution results in its postconditions manifesting in a state of the world; (b) a
failed task execution has unmanifested postconditions.




prevents⁻ ≡ isPreventedBy
∃prevents.⊤ ⊑ Situation ⊓ ∃manifestsIn.Event
∃isPreventedBy.⊤ ⊑ NonrealizedSituation
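   Continuing the owlready2 sketch from Section 3.2, the two restriction axioms can be read as a complex domain and a range on prevents (OWL itself allows class expressions in both positions); again, this is an illustrative sketch rather than the released ontology.

```python
with onto:
    class prevents(ObjectProperty):
        # ∃prevents.⊤ ⊑ Situation ⊓ ∃manifestsIn.Event
        domain = [Situation & manifestsIn.some(Event)]
        # ∃isPreventedBy.⊤ ⊑ NonrealizedSituation, i.e. the range of prevents
        range = [NonrealizedSituation]
    class isPreventedBy(ObjectProperty):
        inverse_property = prevents
```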

4.2. Recovery Strategies
A recovery strategy is a method a robot can apply to repair or reconstruct a plan that resulted
in a failure. Replanning is a time-consuming operation, so several heuristics might be used
by the robot in different cases. It is relevant to decide, based on knowledge about the failure,
which heuristic may be appropriate. We will say a RecoveryStrategy recoversFrom some Action
classified as a Failure:
RecoveryStrategy ⊑ ∃recoversFrom.(Action ⊓ ∃isClassifiedBy.Failure)
   We can axiomatically encode that “repeat last action” is not appropriate when the failure has a
sustained cause:
RepeatLastAction ⊑ RecoveryStrategy
RepeatLastAction ⊓ ∃recoversFrom.(∃isClassifiedBy.SustainedFailure) ⊑ ⊥
   Depending on the nature of a sustained failure’s cause, we can use the ontology to formulate
new subgoals for a RemoveFailureCause strategy by checking which of the robot-known situations
get classified as UnrealizedPrecondition:
UnrealizedPrecondition ≡ NonrealizedSituation ⊓
     ∃isPreconditionOf.(∃isSettingFor.(∃isClassifiedBy.SustainedFailure))
and hence the unrealized preconditions become goals for new planning queries.
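   Continuing the sketch, one way to turn unrealized preconditions into planner goals is to run a DL reasoner and collect whatever the classifier places under UnrealizedPrecondition. The as_planner_goal mapping below is hypothetical; its real counterpart depends on the planner's goal syntax.

```python
from owlready2 import sync_reasoner

with onto:
    class SustainedFailure(Failure): pass
    class isPreconditionOf(ObjectProperty): pass
    class isSettingFor(ObjectProperty): pass
    class isClassifiedBy(ObjectProperty): pass

    class UnrealizedPrecondition(Situation): pass
    UnrealizedPrecondition.equivalent_to = [
        NonrealizedSituation & isPreconditionOf.some(
            isSettingFor.some(isClassifiedBy.some(SustainedFailure)))]

def as_planner_goal(situation) -> str:
    # Hypothetical mapping from a situation description to the planner's
    # goal syntax; the real mapping is system-specific.
    return f"(achieved {situation.name})"

sync_reasoner()  # DL classification via the Java reasoner bundled with owlready2
new_goals = [as_planner_goal(s) for s in UnrealizedPrecondition.instances()]
```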
  Our ontology defines several other recovery strategies to organize a robot’s response. These
concepts are a common vocabulary between robot modules, even if not all reasoning required to
decide on a strategy’s applicability can be captured in OWL-DL.
   In order to check whether prevents relations hold – and hence whether derived relations such
as impedes hold – we would have to defer to other reasoning modules.

4.3. Competency Questions
Our ontology plays two roles. First, it provides a common vocabulary for a robot’s subsystems to
assert facts, enabling an integrated understanding of an event. Second, it enables reasoning with
those facts and selecting appropriate recovery strategies. To guide development towards achieving
these roles, we use a standard technique based on competency questions [18]. Answering the
competency questions requires a collaboration between several robot subsystems and an OWL
DL reasoner operating on the ontology axioms and the facts asserted by the modules. The DL
reasoning queries are classification or consistency queries: given what is known, what is entailed
about the categorization of a failure or its recovery? What cannot be ruled out? We will go
through these competency questions below.
   Q1: What kind of Failure resulted from an Action? The robot system as a whole needs an
integrated understanding of the error it detects, which is enabled by our ontology through a
classification query returning which types of failure are entailed by the information coming from
subsystems.
   Q2: Is a Failure causing a problem for subsequent activities? Our ontology provides the
vocabulary to organize situations and goals in terms of precondition relations, and for robot
subsystems to assert facts about what situations prevent which others.
   Q3: Why did the Failure happen? We are interested in whether the failure is of type Sustained-
Failure, indicating a recovery strategy addressing its cause is necessary. This is a consistency
query asking whether, given prevention relations asserted by the robot’s subsystems while ad-
dressing the previous question, it is possible the observed failure is not a SustainedFailure.
   Q4: What recovery strategy is appropriate to a failure? A DL classification query can be used
to find out what is entailed about the recovery strategy given the nature of the failure as identified
in the previous competency questions. For a more permissive alternative, consistency queries can
be used to find out what recovery strategies are not ruled out.
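   Continuing the earlier sketches, Q1 and Q4 might look as follows: a classification query reads off the entailed failure types of the EventType individuals classifying an observed action, while the permissive variant of Q4 asserts a candidate strategy and tests consistency. The individuals, the service names, and the recoversFrom property (mirroring Sec. 4.2) are all hypothetical here.

```python
from owlready2 import sync_reasoner, OwlReadyInconsistentOntologyError

with onto:
    class RecoveryStrategy(Thing): pass
    class recoversFrom(ObjectProperty): pass
    action = Action("action_17")          # hypothetical observed action

# Q1 (classification query): which Failure types are entailed for the
# EventType individuals that classify the observed action?
sync_reasoner()
failure_kinds = [kind
                 for event_type in action.isClassifiedBy
                 for kind in event_type.is_a
                 if isinstance(kind, type) and issubclass(kind, Failure)]

# Q4 (consistency query, permissive variant): a strategy is "not ruled out"
# if asserting that it recovers from the failed action keeps the KB consistent.
def not_ruled_out(strategy, failed_action) -> bool:
    strategy.recoversFrom.append(failed_action)
    try:
        sync_reasoner()
        return True
    except OwlReadyInconsistentOntologyError:
        return False
    finally:
        strategy.recoversFrom.remove(failed_action)
```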


5. The Failure-Module in the Robot Architecture
Autonomous manipulation tasks require cooperation between perception, knowledge representa-
tion and reasoning, and planning at both symbolic and geometric levels.
Figure 3: a) FailRecOnt - The proposed framework components. b) Flowchart of the proposed
framework.



5.1. Framework Overview
The proposed framework – FailRecOnt – is composed of three main layers, as shown in Fig. 3-a):
the planning and execution layer, the knowledge layer, and the assistant (low-level) layer.
   The planning and execution layer contains two modules: the task planning module and the task
manager module. The former includes a task planner that computes the sequence of manipulation
actions to be done, which requires a problem and domain description setting the initial scene,
including the state of world entities, and the goal state. The latter provides interfaces to
communicate with the agents/operators (e.g., robots, humans or sensors). It also monitors the
executed actions and returns a failure signal to the recovery module if an error occurs. Moreover,
it holds a procedural structure for each step of an action, formally defined as a workflow that can
be executed automatically by interfacing existing robot control software components: executing
a pickUp, for example, requires a call to the Inverse Kinematics (IK) module to check reachability
for grasping the object and then finding a collision-free path towards a grasping configuration,
while a putDown action must additionally search for available placement room. A sketch of such
a workflow is given below.
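   For concreteness, the pickUp workflow just described could look roughly like this; the module interfaces (ik_module.solve, motion_planner.plan, and the robot handle) are hypothetical stand-ins, not the actual framework API.

```python
from typing import Optional, Sequence

def pick_up_workflow(object_pose, ik_module, motion_planner, robot) -> Optional[str]:
    """Sketch of a pickUp workflow: returns None on success, or a failure
    signal for the recovery module (cf. the task manager above)."""
    # 1. Reachability: does an IK solution exist for some grasp of the object?
    grasp_configs = ik_module.solve(object_pose)          # assumed interface
    if not grasp_configs:
        return "failure: object unreachable"
    # 2. Collision-free path towards one of the grasping configurations.
    for q in grasp_configs:
        path: Optional[Sequence] = motion_planner.plan(robot.current_config(), q)
        if path is not None:
            robot.execute(path)
            robot.close_gripper()
            return None                                   # success
    return "failure: no collision-free path"
```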
   The knowledge layer contains a set of knowledge modules that guide the planning and execution layer:
   1. Guidance module, containing planning information, e.g., related to geometric skills, tuned
      based on the robot’s experience. It answers questions such as: How to grasp an object?
      What are the constraints of a task? If the robot must store an item in a box, for instance,
      it must reason about which type of grasp is feasible (side or top grasp).
   2. Recovery module, providing knowledge to interpret failures and propose recovery strategies,
      such as: 1) asking a human for assistance with tasks the robot cannot solve by itself, or
      2) guiding the robot to recover autonomously, for instance by calling a sensing module
      to sense the current state of the world or by repeating the same action with another
      parameter (e.g., repeating a grasping action with a different angle).
   3. Awareness module, containing geometric knowledge for reasoning to check action feasi-
      bility, and perceptual knowledge that helps the robot understand its workspace, e.g. which
      algorithms and parameters should be used with the available sensors in order to extract
      data. An example of this kind of knowledge is the Perception and Manipulation Knowledge
      (PMK) presented in detail in [19].
  Finally, the assistant layer provides the low-level modules that deal with:
   1. perception issues, e.g. finding out which sensors can be used for a sensing action in a given
      situation, which are handled by the sensing module;
   2. geometric issues, e.g. determining whether a configuration is collision-free or whether an
      inverse kinematics solution exists for a gripper pose, which are handled by the geometric
      module.

5.2. Framework Flowchart
The flowchart shown in Fig. 3-b) illustrates how the modules of the framework are integrated.
First, the robot needs to describe the initial and goal states. The initial state can be
obtained from perception informed by perceptual knowledge [19, 20]. Then, the task planner
computes the sequence of actions to be executed, which is a general plan. This sequence is
obtained at a symbolic level, without any geometric considerations.
   The generated plan is associated with geometric reasoning to generate a feasible path for each
action. When these paths are executed, failures during execution may occur due to, for instance,
uncertainties in motion planning or sensory extraction. The robot needs to monitor the outcome
of each manipulation action in the execution phase. The monitoring process requires sensing
and reasoning actions: the sensing action detects failures without attempting any interpretation,
while the reasoning action interprets potential failures, their possible causes, and potential
recovery strategies.
   The recovery strategies let the robot react to the environmental changes that lead to failures.
Moreover, the general plan generated from the initial state is maintained as-is, to avoid full
re-planning after every environmental change. A sketch of this execute-monitor-recover loop is
given below.
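   A compact rendering of this loop, with all interfaces passed in as hypothetical callables, might look as follows; a real strategy object would implement repairs such as RepeatLastAction or RemoveFailureCause from Sec. 4.2.

```python
def execute_with_monitoring(plan, execute, sense, interpret, recover):
    """Sketch of the Fig. 3-b) loop: execute each action, sense its outcome,
    and on failure ask the reasoner for a recovery strategy. The general plan
    is kept and only repaired locally, instead of re-planning from scratch."""
    queue = list(plan)
    while queue:
        action = queue.pop(0)
        execute(action)
        observation = sense(action)               # sensing action: detect failure
        failure = interpret(action, observation)  # reasoning action: classify it
        if failure is None:
            continue                              # expected postconditions hold
        strategy = recover(failure)               # e.g. RepeatLastAction,
        queue = strategy.repair(queue, action)    # RemoveFailureCause, ask a human
```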


6. Case Study
To illustrate our proposal, some simulation examples are performed using The Kautham Project
for geometric planning [21], which has the ability to report geometric failures. Ontologies are
encoded using the Web Ontology Language (OWL) [22] and designed using the Protégé³ editor.
The modified contingency-based planner presented in [1] is used to generate a plan. Reasoning
for failure interpretation and the potential recovery action(s) is integrated with the planner through
ROS (Robot Operating System)⁴ service-client communication; a sketch of such a service-client
interface is given below.


   ³ http://protege.stanford.edu/
   ⁴ https://www.ros.org
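   As an illustration of the ROS service-client integration, the sketch below uses rospy with the standard std_srvs Trigger service as a stand-in for the project's own service definitions (which are not specified here); the service name and the canned answer are hypothetical.

```python
import rospy
from std_srvs.srv import Trigger, TriggerResponse  # stand-in service type

def handle_interpret(_request):
    # Server side (recovery module): in the real system this would run the
    # DL queries of Sec. 4.3; here a canned answer stands in for the result.
    return TriggerResponse(success=True,
                           message="SustainedFailure; RemoveFailureCause")

rospy.init_node("failrec_reasoner")
rospy.Service("interpret_failure", Trigger, handle_interpret)

# Client side (task manager): called when monitoring flags an error.
rospy.wait_for_service("interpret_failure")
interpret_failure = rospy.ServiceProxy("interpret_failure", Trigger)
answer = interpret_failure()   # answer.message carries the proposed strategy
```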
Figure 4: The sequence of snapshots from the planning process and its relation with FailRecOnt.
The upper image depicts how the failure “box is closed” is classified in the simulated example.
The bottom image shows the manipulation example where the goal is to transfer the gray cylinder
(labeled A) to one of the trays w.r.t. its color.




6.1. Action Description
The following action types are considered: manipulation, sensing, and reasoning.
   Manipulation actions – pushing, transporting objects, opening – require motion, and can be
done by either the robot or a person.
   Sensing actions – checking color, pose, status of a container – do not involve motion, and are
devoted to observing object status. The observation is done at run-time.
   The reasoning actions are devoted to interpreting execution phase failures and providing a
recovery strategy. The reasoning process is done at run-time.
   The contingent Fast-Forward planner uses manipulation actions to find the conditional tree of
plans to execute the task, with reasoning actions associated to the monitoring of the execution of
the manipulation actions.

Figure 5: The conditional plan resulting from the planning process. a) Flowchart describing the
plan obtained by the contingent FF planner. b) Flowchart describing the monitoring process of
the manipulation action Transfer-B-box1.

Figure 6: a) The executable plan. b) The integration of FailRecOnt. Note: a macros wrapper is
used to ground symbolic information from the knowledge layer.
6.2. The Integration of the Recovery Module with a Conditional Plan
The complete conditional tree of plans is shown in Fig. 5. The plan contains branching nodes in
which sensing and/or reasoning actions are assigned to monitor manipulation action outcomes in
a semi-automatic way. To reduce computational cost we only assign sensing/reasoning actions to
manipulations expected to have a high probability of failure.
   The recovery strategies are provided based on the current status of the box, as shown in
Fig. 5-b); they include asking a human for help or having the robot recover from the failure by
itself.


7. Discussion
The advantages of using the FailRecOnt framework are: 1) Generality: FailRecOnt provides
common vocabularies for failure interpretation and recovery strategies in the robotics domain.
2) Shareability: FailRecOnt can be exploited in tasks shared with other operators, e.g. robots or
humans, and with heuristic-based or logic-based planning approaches. 3) Interoperability:
FailRecOnt can be used widely in the robotics domain, in both the planning and execution phases.
   In an alternative approach [16], a new state of the world is assumed when a failure occurs, and
the assumption is then used by a task planner. This approach works well for tasks that involve
testing options in some order of expected value and where being wrong is safe. However, such
assumptions about the world state are not always necessary to proceed.
   Our approach allows reasoning to more carefully characterize what can be proven about a
failure and what constraints may be imposed on recovery strategies, without making assumptions.
Further, we have shown, using a manipulation task in a cluttered environment, how to integrate
our failure modelling with plan repair techniques rather than relying only on replanning and/or
asking for human assistance (although these are also available recovery strategies).


8. Conclusion and Future Work
Robots, like any other agent, sometimes fail. Knowledge-based robots can recover from failure
by reasoning whether to try again, try something else, or move to other tasks. We argued that the
choice must integrate the conceptual (ontology), the planning (task) and the execution (feasibility)
levels. This work raises many issues that we aim to expand in the future: to deepen the ontological
module, enrich the causal explanatory module, improve the search for an optimal match between
what is known about detected failures and recovery strategies, optimize the interconnections
among FailRecOnt submodules, and find a proper way to handle unknown/unmodeled failures.


Acknowledgments
This work was partially funded by the Spanish Government through the project PID2020-
114819GB-I00, by Deutsche Forschungsgemeinschaft (DFG) through the Collaborative Research
Center 1320, EASE, and by European CSA project OntoCommons (GA 958371). M. Diab is
supported by the Spanish Government through an FPI 2017 grant.
References
 [1] A. Akbari, M. Diab, J. Rosell, Contingent task and motion planning under uncertainty for
     human–robot interactions, Applied Sciences 10 (2020) 1665.
 [2] J. Laprie, Dependable computing and fault tolerance: Concepts and terminology, in:
     Twenty-Fifth International Symposium on Fault-Tolerant Computing, ‘Highlights from
     Twenty-Five Years’, 1995, pp. 2–. doi:10.1109/FTCSH.1995.532603.
 [3] J. Carlson, R. Murphy, How UGVs physically fail in the field, IEEE Transactions on
     Robotics 21 (2005) 423–437. doi:10.1109/TRO.2004.838027.
 [4] S. Honig, T. Oron-Gilad, Understanding and resolving failures in human-robot interaction:
     Literature review and model development, Frontiers in Psychology 9 (2018) 861. URL: https:
     //www.frontiersin.org/article/10.3389/fpsyg.2018.00861. doi:10.3389/fpsyg.2018.
     00861.
 [5] R. Ross, R. Collier, G. O’Hare, Demonstrating social error recovery with AgentFactory, in:
     International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS),
     2004, pp. 1424–1425. doi:10.1109/AAMAS.2004.103.
 [6] R. Hornung, H. Urbanek, J. Klodmann, C. Osendorfer, P. van der Smagt, Model-free
     robot anomaly detection, in: IEEE/RSJ International Conference on Intelligent Robots and
     Systems (IROS), 2014, pp. 3676–3683. doi:10.1109/IROS.2014.6943078.
 [7] D. Azzalini, A. Castellini, M. Luperto, A. Farinelli, F. Amigoni, HMMs for anomaly
     detection in autonomous robots, in: International Conference on Autonomous Agents and
     Multiagent Systems (AAMAS), 2020.
 [8] P. Pastor, M. Kalakrishnan, S. Chitta, E. Theodorou, S. Schaal, Skill learning and task
     outcome prediction for manipulation, in: 2011 IEEE International Conference on Robotics
     and Automation, 2011, pp. 3828–3834. doi:10.1109/ICRA.2011.5980200.
 [9] A. Haidu, D. Kohlsdorf, M. Beetz, Learning action failure models from interactive physics-
     based simulations, in: 2015 IEEE/RSJ International Conference on Intelligent Robots and
     Systems (IROS), 2015, pp. 5370–5375. doi:10.1109/IROS.2015.7354136.
[10] A. S. Bauer, P. Schmaus, F. Stulp, D. Leidner, Probabilistic effect prediction through
     semantic augmentation and physical simulation, in: IEEE International Conference on
     Robotics and Automation (ICRA), 2020, pp. 9278–9284. URL: https://elib.dlr.de/134290/.
[11] R. M. Jensen, M. Veloso, R. E. Bryant, Fault tolerant planning: Toward probabilistic
     uncertainty models in symbolic non-deterministic planning, in: ICAPS, 2004.
[12] M. Diab, M. Pomarlan, D. Beßler, A. Akbari, J. Rosell, J. Bateman, M. Beetz, An
     ontology for failure interpretation in automated planning and execution, in: Iberian Robotics
     Conference, Springer, 2019, pp. 381–390.
[13] J. Hoffmann, R. Brafman, Contingent planning via heuristic forward search with implicit
     belief states, in: Proc. ICAPS, 2005.
[14] C. Masolo, S. Borgo, A. Gangemi, N. Guarino, A. Oltramari, WonderWeb Deliverable
     D18 Ontology Library, Technical Report, IST Project 2001-33052 WonderWeb: Ontology
     Infrastructure for the Semantic Web, 2003.
[15] I. Niles, A. Pease, Towards a standard upper ontology, in: Proceedings of the International
     Conference on Formal Ontology in Information Systems (FOIS), ACM, 2001, pp. 2–9.
[16] M. Hanheide, M. Göbelbecker, G. S. Horn, A. Pronobis, K. Sjöö, A. Aydemir, P. Jensfelt,
     C. Gretton, R. Dearden, M. Janicek, H. Zender, G.-J. Kruijff, N. Hawes, J. L. Wyatt,
     Robot task planning and explanation in open and uncertain worlds, Artificial Intelligence
     247 (2017) 119–150. URL: https://www.sciencedirect.com/science/article/pii/
     S000437021500123X. doi:10.1016/j.artint.2015.08.008. Special Issue on AI and Robotics.
[17] A. Galton, States, processes and events, and the ontology of causal relations, in: Formal
     Ontology in Information Systems – Proceedings of the Seventh International Conference,
     FOIS 2012, Graz, Austria, July 24–27, 2012, pp. 279–292.
[18] M. Uschold, M. Gruninger, Ontologies: principles, methods and applications, The Knowl-
     edge Engineering Review 11 (1996) 93–136. doi:10.1017/S0269888900007797.
[19] M. Diab, A. Akbari, M. Ud Din, J. Rosell, PMK – a knowledge processing framework
     for autonomous robotics perception and manipulation, Sensors 19 (2019). URL: http:
     //www.mdpi.com/1424-8220/19/5/1166. doi:10.3390/s19051166.
[20] M. Diab, M. Pomarlan, D. Beßler, A. Akbari, J. Rosell, J. Bateman, M. Beetz,
     SkillMaN – a skill-based robotic manipulation framework based on perception and
     reasoning, Robotics and Autonomous Systems (2020) 103653. URL: http://www.
     sciencedirect.com/science/article/pii/S0921889020304930. doi:10.1016/j.robot.
     2020.103653.
[21] J. Rosell, A. Pérez, A. Akbari, Muhayyuddin, L. Palomo, N. García, The Kautham Project:
     A teaching and research tool for robot motion planning, in: Proceedings of the IEEE
     Emerging Technology and Factory Automation (ETFA), 2014, pp. 1–8.
[22] G. Antoniou, F. van Harmelen, Web Ontology Language: OWL, in: Handbook on
     Ontologies, Springer, 2004, pp. 67–92.