<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Output: The Importance of Modelling Transients in Meal Preparation Tasks</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Michaela Kümpel</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vanessa Hassouna</string-name>
          <email>hassouna@uni-bremen.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alina Hawkin</string-name>
          <email>hawkin@uni-bremen.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michael Beetz</string-name>
          <email>beetz@cs.uni-bremen.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Institute for Artificial Intelligence, University of Bremen</institution>
          ,
          <addr-line>Am Fallturm 1, 28359 Bremen</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>We are moving closer to autonomous robots preparing meals. While restaurant robots in static environments are already successfully performing single actions like making pizza, the goal is to enable robots to perform changing actions in various environments and with any available object. Towards this goal, a methodology for creating actionable knowledge graphs that can be used to parameterise general action plans has been proposed. However, for extended failure handling towards fully automated action execution, we argue that transients need to be considered. A transient can be described as a transitory object in a task that is no longer the same as the input object but not yet the output object of the task. For example, when pouring ingredients into a bowl to make dough, the added ingredients form a mass of ingredients (here: a transient) that only becomes dough through mixing. This work shows how transients can be modelled and how robots can integrate and possibly benefit from this modelling.</p>
      </abstract>
      <kwd-group>
        <kwd>Knowledge Representation</kwd>
        <kwd>Meal Preparation Tasks</kwd>
        <kwd>Transients</kwd>
        <kwd>Agent Application</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>CEUR
ceur-ws.org</p>
    </sec>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>We argue that, for fully automated task execution and extended failure handling, transients need to be considered. A transient is a
transitory object that exists during task execution. It is only apparent in the transitional state
where it is no longer an input object, but not yet an output object.</p>
      <p>Figure 1 shows the motivation behind modelling transients for the example of making
pancakes, using pictures from WikiHow instructions on how to prepare pancakes<sup>1</sup>. Here, we
first have a number of separate ingredients. The ingredients are poured into a bowl and thus
transform into a transient that is neither ingredient nor dough just yet. This transient is mixed
until it transforms into dough. Some part of the dough is taken out of the bowl and poured into
a pan, again forming a transient that is neither dough nor pancake. The transient is flipped
and finally turns into a pancake.</p>
      <p>Considering transients in action execution has the potential to enable robots to apply rules
and thus reason about action execution and improve failure handling. For example, a robot that
knows that the action of mixing dough includes input objects (ingredients), a transient object
(mass of ingredients), and an output object (dough), can assign rules for action execution, such
as setting time limits for mixing dough based on the temperature of the butter.</p>
      <p>This work describes the logic of transients, how they might be modelled and the benefits
for robots if transients are considered. We therefore show how the logic of transients can be
integrated in a robot action plan.</p>
    </sec>
    <sec id="sec-3">
      <title>2. Related Work</title>
      <p>Much of the knowledge needed to perform tasks can be acquired using top-level ontologies such
as the DOLCE+DnS Ultralite (DUL) foundational framework and its definitions of descriptions
and situations [9] in combination with the Socio-physical Model of Activities (SOMA) [4].
SOMA models relations of actions, objects and agents at a given time and space. It is designed to
provide robots with environment and activity knowledge so that they can relate an object to
the task at hand through its dispositions and affordances. SOMA additionally allows for the
integration of image schematic relations like SOURCE_PATH_GOAL (SPG), thereby supporting
robots in reasoning about the functional relationships of objects and enabling a more dynamic
action selection [6]. Actions in SOMA can be broken down into tasks, which in turn can be
broken down into body movements of agents, as shown in [3]. While SOMA includes upper
terms for processes, processes have not been a research focus.</p>
      <p>In contrast to this, the Basic Formal Ontology (BFO) defines occurrents and continuants over
a period of time [10], and has been used to formally model processes, which are defined as
occurrents in BFO [11]. Here, a process relates to an execution plan and can be broken down
into steps.</p>
      <p><sup>1</sup>The WikiHow article on how to prepare pancakes is accessible at https://www.wikihow.com/Make-Pancakes</p>
      <p>Both the SOMA-based and the BFO-based approaches have modelled input objects and output objects for
tasks [2] and processes [11], but have not considered transients.</p>
    </sec>
    <sec id="sec-4">
      <title>3. Logic of Transients</title>
      <p>Let us continue to consider the action “making a pancake”. We first break the action down
into its subtasks: grasping the ingredient container, pouring the ingredient into the bowl and
placing the empty container (these steps are repeated for all ingredients); grasping the whisk,
mixing the dough and placing the whisk; grasping a spoon, scooping some dough out of the bowl,
pouring that dough and placing the spoon; and finally grasping a spatula, flipping the half-baked
pancake, transporting the pancake to a plate and placing the spatula. If we look closely at
these tasks, we can see that a transient does not necessarily belong to a single task (as
might have been expected) but can span more than one task. In particular, when pouring a
second ingredient into the bowl, the ingredients already form a transient. When the pouring
task is completed, the transient still exists. Only after the mixing task is performed is the
transient transformed into the dough. If we consider a cutting action, however, only the cutting
task includes a transient: the moment the knife touches the food object, the object turns into a
transient, and it remains one until the knife touches the supporting surface and the food object
is cut into two pieces.</p>
      <p>Looking at these two examples, it seems that the availability of input and output objects
defines transients. In particular, for the cutting task we can state that the input is a food object
and the output is two food parts. For a pouring task, input and output objects are equal,
unless more than one object is poured or the properties of the pouring destination (e.g. the
heat of the pan) lead to a transformation of the object. Thus, the logic behind transients
can be described formally as in Equations 1, 2, 3 and 4, which are stated in accordance with the
SOMA ontology. In Equation 1, we consider tasks that have an input object and a result object,
where the input object differs from the result object in its form (e.g. transitioning from a
food object to food parts) or its quantitative measure (e.g. one object to two objects). For such
tasks, we can state that the input object transforms into a transient during task execution, since
the task triggers a process in which the transient participates. The process is stopped when the task
is completed.</p>
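      <p>
        <disp-formula id="eq1">
          <label>(1)</label>
          <tex-math><![CDATA[
\forall t, i, o : \; i, o : \mathit{Object},\; t : \mathit{Task} \in \mathit{SOMA} : \\
(\mathit{has\_input\_object}(t, i) \land \mathit{has\_result\_object}(t, o) \land i \neq o) \\
\rightarrow \mathit{Transient}(i) \land \exists p : \mathit{Process} : \;
\mathit{triggers\_process}(t, p) \land \mathit{has\_participant}(p, i) \land \mathit{stops\_process}(o, p)
]]></tex-math>
        </disp-formula>
      </p>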
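      <p>As an illustration only, the condition in Equation 1 can be read as a simple check over a task's input and result objects. The following Python sketch is not part of SOMA or of any existing library: the names FoodObject, Task and infer_transient are hypothetical, and "different in form or quantitative measure" is approximated by comparing a form label and a piece count.</p>
      <preformat>
```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FoodObject:
    """Hypothetical object description: a form label and a piece count."""
    form: str      # e.g. "bread" or "bread part"
    quantity: int  # number of separate pieces


@dataclass(frozen=True)
class Task:
    """Hypothetical task with exactly one input and one result object."""
    name: str
    input_object: FoodObject
    result_object: FoodObject


def infer_transient(task: Task) -> Optional[str]:
    """Return a label for the transient if input and result differ, else None."""
    if task.input_object != task.result_object:
        # The task triggers a process in which the input takes part as a transient.
        return f"transient of {task.input_object.form} during {task.name}"
    return None


# Cutting changes form and quantity; pouring a single object changes neither.
cutting = Task("cutting", FoodObject("bread", 1), FoodObject("bread part", 2))
pouring = Task("pouring", FoodObject("water", 1), FoodObject("water", 1))

print(infer_transient(cutting))
print(infer_transient(pouring))
```
      </preformat>
      <p>In this reading, the cutting task yields a transient while the single pouring task does not, which matches the two examples discussed above.</p>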
      <p>In Equation 2, we consider tasks that have the same input object and result object (like
pouring). As discussed above, these tasks only start a process when performed multiple times
and with different objects for every execution (e.g., if water is poured twice, the result object is still
water and not a transient). We can then state that the objects transform into a transient when</p>
      <p>For modelling transients, we propose to:
• include a relation such as triggers_process that links a task to a process
• integrate processes for every task that meets the criteria formulated in Equations 1, 2, 3
and 4
• integrate transients as participant_of processes
• include a relation such as stops_process that links result objects to processes</p>
      <p>As mentioned before, in DUL an action executes a task and can have a physical object as
a participant. From SOMA we know that cutting is a task performed on an object. In recent
work, input and result objects of tasks were proposed [2]. Regarding processes, in DUL, a
Process ⊑ Event ∈ DUL. Other work proposed input and output objects of processes [11]. In
BFO, a Process ⊑ Occurrent ∈ BFO. Thus, transients can similarly be modelled for both top-level
ontologies.</p>
      <p>Following the proposed modelling approach, we enable agents to infer triggered processes
and thus transient objects, as depicted in Figure 3. As described above, pouring the different
ingredients into a bowl triggers a mixing process, which involves a transient. Similarly, pouring
the dough into the hot pan triggers a baking process.</p>
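      <p>The inference described above can be sketched with a handful of subject-relation-object facts. This is a minimal illustration under stated assumptions, not the authors' knowledge graph: the relation names follow the proposal (triggers_process, participant_of, stops_process), whereas the individual names and the helper functions are hypothetical.</p>
      <preformat>
```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical facts for the pancake example: (subject, relation, object).
FACTS: List[Triple] = [
    ("pour_ingredients", "triggers_process", "mixing_process"),
    ("ingredient_mass", "participant_of", "mixing_process"),
    ("dough", "stops_process", "mixing_process"),
    ("pour_dough_into_hot_pan", "triggers_process", "baking_process"),
    ("half_baked_pancake", "participant_of", "baking_process"),
    ("pancake", "stops_process", "baking_process"),
]


def objects_of(facts: List[Triple], subject: str, relation: str) -> Set[str]:
    """All objects linked to the subject by the given relation."""
    return {o for s, r, o in facts if s == subject and r == relation}


def subjects_of(facts: List[Triple], relation: str, obj: str) -> Set[str]:
    """All subjects linked to the object by the given relation."""
    return {s for s, r, o in facts if r == relation and o == obj}


def transients_for_task(facts: List[Triple], task: str) -> Set[str]:
    """Transients are the participants of every process the task triggers."""
    found: Set[str] = set()
    for process in objects_of(facts, task, "triggers_process"):
        found |= subjects_of(facts, "participant_of", process)
    return found


print(transients_for_task(FACTS, "pour_ingredients"))
print(subjects_of(FACTS, "stops_process", "mixing_process"))
```
      </preformat>
      <p>Keeping the three relations separate lets an agent ask both which transient a task brings about and which result object ends the corresponding process.</p>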
    </sec>
    <sec id="sec-5">
      <title>5. Benefits of Incorporating Transients into NEEMs</title>
      <p>Narrative Enabled Episodic Memories (NEEMs) [12] play a crucial role in helping robots learn
from experience [13] [14]. By integrating knowledge about transients into NEEMs, a detailed
record of transitional states and objects during task execution can be provided. This integration
allows robots to analyse their actions, understand where transients occurred, and determine
the success factors for completing tasks.</p>
      <p>We propose to use NEEMs to improve the robot’s
performance by tracking the duration of the transient state and the related conditions. If the robot is
mixing [15] certain components for the first time, it is not yet known how long the mixing
action should last to transition the mass from the transient to the goal state. The robot would
potentially have to interrupt the mixing process, lift the tool so as not to obstruct the camera,
and perceive the transient’s current state to determine whether the mixing action needs
to be prolonged or the desired state has been reached. If the robot generates NEEMs during
this process, a knowledge database can be filled with the obtained results. Whenever
the robot is then tasked to mix the same components, it has a much better estimate of how
long the mixing process should last. This would reduce the time spent checking the transient’s state,
potentially reducing spillage during the checking-of-current-mixing-state action. Accumulating
this knowledge over time would also allow a robot to learn about the different properties of
ingredients (such as the influence of the temperature of butter on mixing duration) and the
success of different mixing techniques, so that the mixing time for previously mixed ingredients
can be estimated more accurately in advance.
Another benefit of generating NEEMs that consider transient states is that, should an action end
in such a state, it would be possible to analyse why this happened. Maybe
the robot dropped the whisk and could not pick it up again, or the consistency of the transient
became too difficult for the robot to mix, or spillage occurred and the experiment was aborted.
NEEMs provide the potential to analyse such occurrences in hindsight, allowing the planning
system to try to avoid them in the future.</p>
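      <p>To make the idea of estimating mixing duration from experience concrete, the following sketch averages the durations of earlier successful episodes with the same ingredients. It is a deliberately simplified stand-in: real NEEMs are far richer than the hypothetical MixingEpisode record used here, the function name is an assumption, and all numbers are made up for illustration.</p>
      <preformat>
```python
from dataclasses import dataclass
from statistics import mean
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class MixingEpisode:
    """Hypothetical, heavily simplified stand-in for one recorded episode."""
    ingredients: FrozenSet[str]
    butter_temperature_c: float
    mixing_seconds: float
    succeeded: bool


def estimate_mixing_seconds(episodes: List[MixingEpisode],
                            ingredients: FrozenSet[str]) -> Optional[float]:
    """Average the duration of successful past episodes with the same ingredients."""
    durations = [e.mixing_seconds for e in episodes
                 if e.succeeded and e.ingredients == ingredients]
    return mean(durations) if durations else None


# Illustrative values only; they are not measurements.
batter = frozenset({"flour", "milk", "egg", "butter"})
history = [
    MixingEpisode(batter, 8.0, 150.0, True),
    MixingEpisode(batter, 20.0, 90.0, True),
    MixingEpisode(batter, 20.0, 40.0, False),  # aborted, e.g. after spillage
]

print(estimate_mixing_seconds(history, batter))
print(estimate_mixing_seconds(history, frozenset({"flour", "water"})))
```
      </preformat>
      <p>A fuller version could condition the estimate on recorded properties such as butter temperature, and could inspect the failed episodes to see in which transient state they ended.</p>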
    </sec>
    <sec id="sec-6">
      <title>6. Enabling Agents To Reason About Transients</title>
      <p>Transients exist during transitional states (here: processes) in task execution, representing
the intermediate stages between input and output objects. We use the Python version of
the Cognitive Robot Abstract Machine (CRAM) [16], known as PyCRAM, to enable
robotic agents to reason about these transients. PyCRAM assists robots in planning and
executing complex, sequential tasks efficiently. It encapsulates the logic and transitions
within such tasks, which makes it well suited to managing symbolic plans that account for
transients. This is essential for robots that perform intricate tasks, allowing them to
decompose tasks into smaller, manageable steps and adapt to fluctuating conditions. Central
to PyCRAM are action designators that convert symbolic task descriptions into specific ROS
action goals for robots. This structure enables robots to carry out high-level actions with an
awareness of context and flexibility, which is crucial for handling complex tasks effectively.</p>
      <p>Although PyCRAM offers a promising solution for understanding transients, there is still
work to be done to fine-tune the framework to ensure it meets the demands of real-world
applications. Integrating transients into PyCRAM requires careful consideration of action
designators, task planning, and robotic reasoning, especially considering processes that span
several tasks or involve multiple stages and transitional states. For instance, in a mixing task,
a transient occurs when ingredients are combined but have yet to form a consistent mixture.
Action designators that model triggered processes must include additional information on how
tasks relate to processes and how transients transition from one state to another. Processes
can then encompass information like the expected duration of a process, additional rules such
as the influence of temperature on consistency, and desired results. One practical approach
to integrating transients into PyCRAM is establishing rules that explain how transients are
initiated, monitored, and concluded.</p>
      <p>In our past work, “Steps Towards Generalized Manipulation Action Plans - Tackling Mixing
Task” [15], action designators break down the mixing task into various stages, including
preconditions, mix motions, and postconditions. Each of these stages can contain
transients. The mix motions are categorised into circular, elliptical, and orbital movements.</p>
      <p>To incorporate transients, the action designator must describe the mixing process’s initial
conditions, intermediate transitions, and outcomes. For example, a “spiral outwards” motion
leading into the main mixing phase represents a transitional state where the robot approaches
the final mixing pattern. This phase’s transient could be a partial combination of ingredients
that requires further mixing to achieve the desired consistency. To further integrate transients
into PyCRAM for real-world applications, it is essential to establish a robust method for coding
these transitions into the framework. This enhancement involves refining the action designators
to include detailed descriptions of task stages, outlining how transients are initiated, evolve,
and conclude.</p>
      <p>For example, consider the mixing task, which can be divided into several key stages or
transient states:
• Initial Mixing: Starts the mixing at a slow speed to blend ingredients without spillage.
• Main Mixing: Increases speed to ensure thorough batter mixing.
• Consistency Checking: Reduces speed to check the consistency of the batter.
• Finalizing Mix: Completes the mixing process and prepares to conclude the task.</p>
      <p>The concept centres around the introduction of the TransientState class with Conditions, as
depicted in Listing 1:</p>
      <p>Listing 1: Transient Python Class</p>
      <preformat>
class TransientState:
    # Example call:
    # TransientState("Consistency Checking", [], [],
    #                {"speed": "low", "duration": "2 minutes", "check": "visual"})
    def __init__(self, name, entry_conditions, exit_conditions, process_info):
        self.name = name
        self.entry_conditions = entry_conditions  # Conditions to enter this state
        self.exit_conditions = exit_conditions    # Conditions to leave this state
        self.process_info = process_info          # Information like duration, influence factors
    # [...]
      </preformat>
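      <p>As a usage sketch, the four mixing stages listed above could be expressed as TransientState instances. The class is repeated so that the example runs on its own. Only the "Consistency Checking" entry is taken from the example call in Listing 1; the condition strings and the remaining process_info values are illustrative assumptions.</p>
      <preformat>
```python
class TransientState:
    """Same fields as Listing 1, repeated here so the sketch is self-contained."""

    def __init__(self, name, entry_conditions, exit_conditions, process_info):
        self.name = name
        self.entry_conditions = entry_conditions
        self.exit_conditions = exit_conditions
        self.process_info = process_info


# Condition strings and most process_info values are assumptions for illustration.
mixing_states = [
    TransientState("Initial Mixing", ["ingredients in bowl"], ["no dry pockets"],
                   {"speed": "low"}),
    TransientState("Main Mixing", ["no dry pockets"], ["batter looks even"],
                   {"speed": "high"}),
    # Values below follow the example call given in Listing 1.
    TransientState("Consistency Checking", [], [],
                   {"speed": "low", "duration": "2 minutes", "check": "visual"}),
    TransientState("Finalizing Mix", ["consistency reached"], ["whisk lifted"], {}),
]

for state in mixing_states:
    # States without a speed entry report None.
    print(state.name, state.process_info.get("speed"))
```
      </preformat>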
      <p>This object-oriented strategy utilizes the TransientState class to manage various task stages,
with each state capturing essential details such as duration and external influences. The
ActionDesignator class, illustrated in Listing 2, has been expanded from its initial design to
direct the sequence of states required to complete the entire task.</p>
      <p>Listing 2: Action Designator</p>
      <preformat>
from typing import List


class ActionDesignator:
    def __init__(self, task_name, states: List[TransientState]):  # [...] further parameters elided
        self.task_name = task_name
        self.states = states  # List of states representing the transients
        # [...]

    def perform_task(self):
        # Run every transient state through its enter/execute/exit lifecycle
        for state in self.states:
            self.enter_state(state)
            self.execute_state(state)
            self.exit_state(state)
        # [...]
    # [...]
      </preformat>
      <p>The workflow in PyCRAM involves iterating over each state defined in the states list and
managing the lifecycle of each state using three methods: enter_state, execute_state, and
exit_state, with the following functionalities:
• enter_state(state): Prepares the system to enter a given state, potentially verifying
and establishing entry conditions.
• execute_state(state): Manages the actual operations defined for the state, such as
directing robotic actions, monitoring real-time data, and adjusting parameters based on
process_info, within the TransientState class.
• exit_state(state): Concludes the state’s operations, ensures all exit conditions are
met, and readies the system for the transition to the next stage or task completion.</p>
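      <p>The lifecycle described above can be sketched end to end as follows. This is a stand-alone illustration and does not use or mirror the real PyCRAM classes: the bodies of enter_state, execute_state and exit_state, the world dictionary, the log attribute and the "effects" key in process_info are all assumptions introduced to make the example runnable.</p>
      <preformat>
```python
from typing import Callable, Dict, List

World = Dict[str, object]
Condition = Callable[[World], bool]


class TransientState:
    def __init__(self, name: str, entry_conditions: List[Condition],
                 exit_conditions: List[Condition], process_info: dict):
        self.name = name
        self.entry_conditions = entry_conditions
        self.exit_conditions = exit_conditions
        self.process_info = process_info


class ActionDesignator:
    """Stand-alone sketch; it does not use or mirror the real PyCRAM classes."""

    def __init__(self, task_name: str, states: List[TransientState], world: World):
        self.task_name = task_name
        self.states = states
        self.world = world          # assumed symbolic world state
        self.log: List[str] = []    # assumed trace of lifecycle events

    def enter_state(self, state: TransientState) -> None:
        if not all(check(self.world) for check in state.entry_conditions):
            raise RuntimeError(f"entry conditions not met for {state.name}")
        self.log.append(f"enter {state.name}")

    def execute_state(self, state: TransientState) -> None:
        # A real robot would send motion goals here; the sketch only applies
        # the effects listed under the assumed "effects" key.
        self.world.update(state.process_info.get("effects", {}))
        self.log.append(f"execute {state.name}")

    def exit_state(self, state: TransientState) -> None:
        if not all(check(self.world) for check in state.exit_conditions):
            raise RuntimeError(f"exit conditions not met for {state.name}")
        self.log.append(f"exit {state.name}")

    def perform_task(self) -> None:
        for state in self.states:
            self.enter_state(state)
            self.execute_state(state)
            self.exit_state(state)


states = [
    TransientState("Initial Mixing",
                   [lambda w: w["ingredients_in_bowl"]],
                   [lambda w: w["blended"]],
                   {"speed": "low", "effects": {"blended": True}}),
    TransientState("Main Mixing",
                   [lambda w: w["blended"]],
                   [lambda w: w["consistent"]],
                   {"speed": "high", "effects": {"consistent": True}}),
]
designator = ActionDesignator("mixing", states, {"ingredients_in_bowl": True})
designator.perform_task()
print(designator.log)
```
      </preformat>
      <p>Raising an error when entry or exit conditions fail is one possible design choice; it gives a failure-handling layer a clear point at which to react, for example by re-checking the transient or prolonging the mixing.</p>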
    </sec>
    <sec id="sec-7">
      <title>7. Conclusion and Future Work</title>
      <p>This work highlights the critical role of modelling transients in meal preparation tasks, using the
example of pancake making to elucidate the transient states between task start and completion.
By defining the logic of transients and proposing methods for their integration into robotic
reasoning through PyCRAM, we aim to enhance robotic performance in complex, sequential task
execution. We believe that the consideration of transients in the modelling of meal preparation
tasks is crucial for agents that are able to reason about objects, object states, and failures
during task execution. Including transients provides robots with a nuanced understanding of task
processes, allowing for more adaptive responses to dynamic task conditions. This is
particularly important in tasks involving state or composition transformations, such as cooking,
where intermediate states significantly influence the final outcome.</p>
      <p>Future work will involve refining the integration of transients into meal preparation ontologies
and enhancing the robotic action designators to more effectively incorporate and manage these
transient states. The potential for improving robotic interaction with dynamic environments
through a better understanding of transients presents an exciting frontier for cognitive robotics
and practical applications in everyday life.</p>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgments</title>
      <p>The research reported in this paper has been partially supported by the German Federal Ministry
of Education and Research; Project-ID 16DHBKI047 “IntEL4CoRo - Integrated Learning
Environment for Cognitive Robotics”, University of Bremen as well as the German Research Foundation
DFG, as part of Collaborative Research Center (Sonderforschungsbereich) 1320 “EASE -
Everyday Activity Science and Engineering”, University of Bremen (http://www.ease-crc.org/). The
research was conducted in subproject R04 “Cognition-enabled execution of everyday actions”.</p>
      <p>[2] M. Kümpel, J.-P. Töberg, V. Hassouna, P. Cimiano, M. Beetz, Towards a knowledge engineering methodology for flexible robot manipulation in everyday tasks, in: Workshop on Actionable Knowledge Representation and Reasoning for Robots (AKR3) at European Semantic Web Conference (ESWC), 2024.</p>
      <p>[3] M. Kümpel, Actionable Knowledge Graphs - How Daily Activity Applications can Benefit from Embodied Web Knowledge, Ph.D. thesis, University of Bremen, 2024. doi:10.26092/elib/2936.</p>
      <p>[4] D. Beßler, R. Porzel, M. Pomarlan, A. Vyas, S. Höfner, M. Beetz, R. Malaka, J. Bateman, Foundations of the socio-physical model of activities (soma) for autonomous robotic agents, arXiv preprint arXiv:2011.11972 (2020).</p>
      <p>[5] K. Dhanabalachandran, V. Hassouna, M. M. Hedblom, M. Kümpel, N. Leusmann, M. Beetz, Cutting events: Towards autonomous plan adaption by robotic agents through image-schematic event segmentation, in: Proceedings of the 11th Knowledge Capture Conference, 2021, pp. 25–32.</p>
      <p>[6] M. M. Hedblom, M. Pomarlan, R. Porzel, R. Malaka, M. Beetz, Dynamic action selection using image schema-based reasoning for robots, in: Joint Ontology Workshops, 2021.</p>
      <p>[7] M. Beetz, U. Klank, I. Kresse, A. Maldonado, L. Mösenlechner, D. Pangercic, T. Rühr, M. Tenorth, Robotic roommates making pancakes, in: 2011 11th IEEE-RAS International Conference on Humanoid Robots, IEEE, 2011, pp. 529–536.</p>
      <p>[8] D. Danno, S. Hauser, F. Iida, Robotic cooking through pose extraction from human natural cooking using openpose, in: International Conference on Intelligent Autonomous Systems, Springer, 2021, pp. 288–298.</p>
      <p>[9] A. Gangemi, N. Guarino, C. Masolo, A. Oltramari, L. Schneider, Sweetening ontologies with dolce, in: International conference on knowledge engineering and knowledge management, Springer, 2002, pp. 166–181.</p>
      <p>[10] V. Mascardi, V. Cordì, P. Rosso, et al., A comparison of upper ontologies, in: WOA, volume 2007, 2007, pp. 55–64.</p>
      <p>[11] D. Dooley, M. Weber, L. Ibanescu, M. Lange, L. Chan, L. Soldatova, C. Yang, R. Warren, C. Shimizu, H. K. McGinty, et al., Food process ontology requirements, Semantic Web (2022) 1–32.</p>
      <p>[12] J. Winkler, M. Tenorth, A. K. Bozcuoglu, M. Beetz, Cramm – memories for robots performing everyday manipulation activities, Advances in Cognitive Systems 3 (2014) 47–66.</p>
      <p>[13] S. Koralewski, G. Kazhoyan, M. Beetz, Self-specialization of general robot plans based on experience, IEEE Robotics and Automation Letters 4 (2019) 3766–3773. doi:10.1109/LRA.2019.2928771.</p>
      <p>[14] G. Kazhoyan, A. Hawkin, S. Koralewski, A. Haidu, M. Beetz, Learning motion parameterizations of mobile pick and place actions from observing humans in virtual environments, in: 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2020, pp. 9736–9743. doi:10.1109/IROS45743.2020.9341458.</p>
      <p>[15] V. Hassouna, A. Hawkin, M. Beetz, Steps towards generalized manipulation action plans - tackling mixing task, in: Workshop on Actionable Knowledge Representation and Reasoning for Robots (AKR3) at European Semantic Web Conference (ESWC), 2024.</p>
      <p>[16] M. Beetz, G. Kazhoyan, D. Vernon, The cram cognitive architecture for robot manipulation in everyday activities, 2023.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>G.</given-names>
            <surname>Kazhoyan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Stelter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. K.</given-names>
            <surname>Kenfack</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Koralewski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Beetz</surname>
          </string-name>
          ,
          <article-title>The robot household marathon experiment</article-title>
          ,
          <source>in: 2021 IEEE International Conference on Robotics and Automation (ICRA)</source>
          , IEEE,
          <year>2021</year>
          , pp.
          <fpage>9382</fpage>
          -
          <lpage>9388</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>