=Paper=
{{Paper
|id=Vol-3762/543
|storemode=property
|title=The NEMO co-pilot
|pdfUrl=https://ceur-ws.org/Vol-3762/543.pdf
|volume=Vol-3762
|authors=Stefania Costantini,Pierangelo Dell'Acqua,Giovanni De Gasperis,Francesco Gullo,Andrea Rafanelli
|dblpUrl=https://dblp.org/rec/conf/ital-ia/CostantiniDGGR24
}}
==The NEMO co-pilot==
Stefania Costantini¹,²,*,†, Pierangelo Dell'Acqua³,†, Giovanni De Gasperis¹,†, Francesco Gullo¹,† and Andrea Rafanelli⁴,¹,†

¹ Department of Information Engineering, Computer Science and Mathematics, University of L'Aquila, L'Aquila, Italy
² Gruppo Nazionale per il Calcolo Scientifico - INdAM, Roma, Italy
³ Department of Science and Technology, Linköping University, Linköping, Sweden
⁴ Department of Computer Science, University of Pisa, Italy
Abstract
In this work, we describe an agent to be employed in Human-AI Teaming in various, even critical, domains, based upon
affective computing, empathy, and Theory of Mind, and a description of the user profile and of the operational, professional,
and ethical requirements of the domain in which the agent operates. The architecture of the proposed agent encompasses a
Knowledge Graph, a Neural component and a Behaviour Tree. We briefly discuss a case study.
Keywords
Human-AI Interaction, Human-AI Teaming, Trustworthy AI, Responsible AI
1. Introduction

One recent focus in Artificial Intelligence (AI) is building intelligent systems where humans and AI systems form teams, with the aim of exploiting the potentially synergistic relationships between human and automation, thus devising "hybrid" systems where the partners cooperate to perform complex tasks, possibly involving a high degree of risk. As a simple example, in an AI-supported self-driving or assisted-driving vehicle, the AI component can be expected to evaluate and co-manage situations and risks, while the driver can provide the AI component with useful information on practical driving in all conditions and can self-manage the risks when required by the circumstances. Human-automation interaction is, in fact, one of the main themes of Human-centered AI. This issue also falls in the realm of Trustworthy AI, whose requirements include respect for human autonomy, prevention of harm, fairness, and explainability, and of Responsible AI, whose goal is to employ AI in a safe, trustworthy and ethical fashion.

AI and humans, if working together in Human-AI Teaming (HAIT), can produce results exceeding what either can achieve alone, while controlling and improving each other. For instance, a human driver might train to cope with previously unseen situations through co-driving automation, via a cooperative task shared between the human driver and the AI-based system installed on the vehicle. At the same time, AI helps drivers in case of difficulties and immediate risks. In this synergistic relationship, humans may improve automation efficacy and capabilities. At the same time, automation may enhance human performance in a task and compensate for human inadequacies, catching and correcting possible misbehaviors, possibly due to physically or emotionally impaired states, and providing valuable suggestions.

To adopt AI agents in crucial tasks such as, e.g., improving caregiving in medicine and teaching, and to construct effective human-AI teams, agents should be endowed with an emotion recognition and management module, be capable of empathy, and model aspects of the Theory of Mind (ToM), in the sense of being able to reconstruct what someone is thinking or feeling. Modelling a Theory of Mind is often based on forms of "Affective Computing", a set of techniques aimed at eliciting a human's emotional condition from physical signs, so as to enable the system to respond intelligently to human emotional feedback.

In this work, we describe an agent to be employed in HAIT, based upon affective computing, empathy, and Theory of Mind, and a description of the user profile and of the operational, professional and ethical requirements of the domain in which the agent operates. The architecture of the proposed agent encompasses a Knowledge Graph, a Neural component and a Behavior Tree.

Ital-IA 2023: 3rd National Conference on Artificial Intelligence, organized by CINI, May 29–31, 2023, Pisa, Italy
* Corresponding author.
† These authors contributed equally.
stefania.costantini@univaq.it (S. Costantini); pierangelo.dellacqua@liu.se (P. Dell'Acqua); giovanni.degasperis@univaq.it (G. De Gasperis); francesco.gullo@univaq.it (F. Gullo); andrea.rafanelli@phd.unipi.it (A. Rafanelli)
http://www.di.univaq.it/stefcost (S. Costantini); https://dellacqua.se/ (P. Dell'Acqua); https://fgullo.github.io/ (F. Gullo)
ORCID: 0000-0002-5686-6124 (S. Costantini); 0000-0003-3780-0389 (P. Dell'Acqua); 0000-0001-9521-471 (G. De Gasperis); 0000-0002-7052-1114 (F. Gullo); 0000-0001-8626-2121 (A. Rafanelli)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073
2. Background

2.1. Behavior Trees

Behaviour Trees (BTs) were introduced as a tool to enable modular AI in computer games. A behavior tree is essentially a mathematical model of plan execution, where each element (task and action) of a plan is associated with a node in the tree. Their strength comes from their ability to create complex tasks composed of simple tasks, without worrying about how the simple functions are implemented. For a comprehensive survey of BTs in Artificial Intelligence and Robotic applications, see [1, 2].

A BT is a directed acyclic graph consisting of different types of nodes, each one associated with executable code (where such code enacts an element composing a plan). In most cases, a BT is tree-shaped, hence the name. However, unlike a traditional tree, a node in a BT can have multiple parents, allowing the reuse of that part of the tree. The traversal of a behavior tree starts at the top node. When a node is traversed, the associated code is executed, returning one of three states: success, failure, or running.

The critical nodes in a BT include leaf nodes and inner nodes. An action is a leaf node representing a behavior that the character can perform. The action returns success or failure when it completes its execution, depending on the outcome. An action is depicted as a white circle. A condition is a leaf node that checks an internal or external state. It returns either success or failure. A condition is represented as a grey rounded rectangle. A sequence selector is an inner node that typically has several child nodes that are executed sequentially. Once a child node completes its execution successfully, the sequence selector continues executing the next child node. If every child node returns success, then the sequence selector returns success. If one of the child nodes returns failure, the sequence selector immediately returns failure. A sequence selector is depicted as a grey square with an arrow across the links to its child nodes. A priority selector is an inner node. It has a list of child nodes that it tries to execute one at a time until one of the child nodes returns success. If none of the child nodes executes successfully, the priority selector returns failure. A priority selector is represented as a grey circle with a question mark.

2.2. Neural Empathy-Aware Behavior Trees

To consider empathy and mimic human decision-making, in [3] we introduced neural empathy-aware behavior trees (NEABTs) by introducing a selector node called emotional selector, an empathy node, and a neural node.

The emotional selector is a node that orders its child nodes based on the agent's current affective state. The agent elaborates on the affective state during repeated interactions with the user and then tunes its reaction accordingly. Once the ordering has been established, the emotional selector behaves as a priority selector. A white circle with the character E represents an emotional selector. In contrast, an empathy node provides an emotional evaluation of its single child node. An empathy node can only be a child of an emotional selector. Its child can be a leaf or an inner node. A dashed circle with the name of the empathy emotion represents an empathy node.

To enable the integration of deep learning models for emotion recognition and symbolic models for planning and decision-making within emotional behavior trees, we introduced neural nodes. A neural node takes the current state of the environment and agent as input and, using a deep learning model, makes inferences about the emotional state. It contains a model, such as an emotion recognition system, that estimates the emotional state. These estimates are then mapped into the agent's affective state variables that parameterize the emotional selector. The neural node continually updates the agent's internal emotional state, allowing the dynamic adaptation of behavior trees to the emotional context.

2.3. Knowledge Graphs

Knowledge Graphs (KGs) [4, 5] are a particular type of knowledge base [6] where knowledge is organized in a graph-like structure, i.e., with triples that define relationships (edges) among entities (nodes) of interest. KGs are also known as information graphs [7], or heterogeneous information networks [8].

KGs have been extensively used in a plethora of application scenarios, including knowledge completion [9], head/tail prediction [10], rule mining [11], query answering [12], and entity alignment [13, 14, 15]. KGs have also recently emerged as supporting tools for Retrieval-Augmented Generation (RAG) for Large Language Models (LLMs) [16, 17, 18, 19].

A well-established technique that is commonly exploited for tasks on KGs is Knowledge graph embeddings (KGEs) [20, 10]. KGEs generate numerical vector representations for entities and relationships of a KG, thus making them amenable to be processed in downstream tasks where a numerical representation is required (e.g., neural network-based machine-learning tasks). Although KGEs can differ (significantly) from one another in their definition, a shared key aspect of all KGEs is that they are typically defined based on a so-called embedding scoring function, or simply embedding score. This function quantifies how likely a triple exists in the KG, based on the embeddings of the entities and the relationship of that triple. Several KGEs have appeared in the last few years.
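As a rough sketch of the BT node semantics described in Section 2.1 (a hypothetical minimal implementation written for this description, not the authors' code), a sequence selector and a priority selector over action leaves can be written as:

```python
from enum import Enum

class Status(Enum):
    SUCCESS = 1
    FAILURE = 2
    RUNNING = 3

class Action:
    """Leaf node: wraps the executable code, which returns a Status."""
    def __init__(self, fn):
        self.fn = fn
    def tick(self):
        return self.fn()

class Sequence:
    """Sequence selector: succeeds only if every child succeeds,
    and fails as soon as one child fails."""
    def __init__(self, children):
        self.children = children
    def tick(self):
        for child in self.children:
            status = child.tick()
            if status != Status.SUCCESS:
                return status  # FAILURE or RUNNING stops the sequence
        return Status.SUCCESS

class Priority:
    """Priority selector: tries children in order until one succeeds;
    fails only if every child fails."""
    def __init__(self, children):
        self.children = children
    def tick(self):
        for child in self.children:
            status = child.tick()
            if status != Status.FAILURE:
                return status  # first SUCCESS (or RUNNING) wins
        return Status.FAILURE

# A priority selector that falls back to a second plan when the first fails.
tree = Priority([
    Sequence([Action(lambda: Status.FAILURE)]),  # first plan fails
    Action(lambda: Status.SUCCESS),              # fallback succeeds
])
```

Under these semantics, `tree.tick()` returns `Status.SUCCESS`: the failing sequence is skipped and the fallback action is executed. The emotional selector of Section 2.2 behaves like `Priority` after reordering `children` by affective state.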
Figure 1: The NEMO co-pilot framework. (The diagram shows the loop among the Environment and User sensors, the Knowledge Graph with its Domain Knowledge and User Profile, the KG encoder and KG-to-NEABT decoder producing the pair (KG', Env), the NEABT with its neural node "N", emotional selector "E", empathy nodes e1, e2, ... and action subtrees A1, A2, ..., An, the NEABT-to-KG encoder carrying User feedback back to the KG, and the Aggregator presenting the agent's suggested action, the agent action, and the detected user emotion to the User.)
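To make the notion of an embedding score concrete, the following toy scorer follows the translational (TransE-style) assumption; the hand-picked three-dimensional vectors and entity names are purely illustrative (real KGEs learn such vectors by minimizing a loss over the KG):

```python
import math

def transe_score(h, r, t):
    """TransE-style embedding score: the negative Euclidean distance
    ||h + r - t||. Scores closer to 0 mean the triple (h, r, t) is
    deemed more plausible under the translation assumption."""
    return -math.sqrt(sum((hi + ri - ti) ** 2
                          for hi, ri, ti in zip(h, r, t)))

# Hand-picked 3-dimensional embeddings (illustrative values, not trained).
driver   = [0.0, 0.0, 1.0]
vehicle  = [1.0, 0.0, 1.0]
operates = [1.0, 0.0, 0.0]   # the relation acts as a translation vector

good = transe_score(driver, operates, vehicle)  # driver + operates ≈ vehicle
bad  = transe_score(vehicle, operates, driver)  # translation points away
```

Here `good` is 0.0 (a perfect translation) and `bad` is -2.0, so ranking triples by score recovers the plausible one first, which is exactly how such scores are used in head/tail prediction.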
The distinctive features among embeddings are the score function and the optimization loss. Translational embeddings in the TransE [21] family and the recent PairRE [22] assume that the relationship of a triple performs a translation between the entities of that triple. Semantic embeddings, such as DistMult [11] or HolE [23], interpret the relationship as a multiplicative operator. Complex embeddings, such as RotatE [24] and ComplEx [25], use complex-valued vectors and operations in the complex plane. Neural-network embeddings, such as ConvE [26], perform sequences of nonlinear operations.

3. Framework

The architecture of the proposed agent is illustrated in Figure 1. The main components of the architecture are the User, the Environment, a Knowledge Graph (KG), and a Behavior Tree (BT). The overall interaction between such components is described next.

The BT is fed with signals from the Environment, the User, and the KG. Such signals are exploited by the BT to perform its computation and to output (i) an action to be suggested to the User, (ii) an action actually performed by the agent (e.g., an empathetic action), and (iii) the User's emotion detected by its neural node ('N', see below). The threefold BT output passes through an "Aggregator", responsible for suitably aggregating and presenting the three BT outputs to the User. The Aggregator may perform something either very simple (e.g., just derive a textual representation of the three outputs and concatenate them) or more sophisticated (e.g., exploit a large language model (LLM)). The BT's outputs and the User's feedback are used to update the KG. This way, we have a loop-back mechanism in which the KG is exploited by the BT for its internals, and the BT is exploited to update the KG properly.

Next, we describe the User, Environment, KG, and NEABT components in more detail.

User. The User performs reactions and actions based on the signals provided by the NEABT. The user's sensory data flow into the NEABT through a sensor, which represents them in some proper numerical format. Also, the User's feedback—e.g., whether (or to which extent) the User has adopted the Agent's suggested action—is sent back to the KG. The User's reactions/actions are assumed to be determined by all three types of BT output. In particular, the User's emotion detected by the BT at the previous iteration is important for establishing the emotional conditions that most influence the user.

Environment. Signals from the surrounding environment are detected by a sensor, which represents them in some numerical format, making them ready to be processed by the NEABT (along with the KG representation).

KG. The KG contains information about domain knowledge and the user profile. The KG's information is provided to the NEABT in a twofold form. It is first encoded in some proper numerical format and passed to the BT's neural node (see below). The encoding is performed by a KG encoder component, which can be implemented, e.g., with a KGE (see Section 2.3). The KG's encoded information is then decoded into a format suitable for processing by the internal nodes of the BT. A KG-to-BT decoder performs the KG's information decoding. This can be implemented, e.g., as a neural network component whose training can be performed on a ground truth defined through either manual annotation or the agent's historical data. The KG is fed the BT's output and the user feedback. Such data in input to the KG are represented in a format suitable for updating the KG, e.g., a set of KG triples to be added and a set of KG triples to be removed. Such a translation from the BT's and User's signals to KG-updating signals is performed by a further encoder-decoder component. Again, such an encoder-decoder can be implemented as a neural network and trained with a ground truth defined manually or through historical data.

NEABT. The NEMO framework deploys a NEABT as a behavior tree. The BT's neural node receives the KG's information and the user's sensory data and makes inferences about the user's emotional state. These estimates are mapped into the user affective state variables that parametrize the neural node's child, the emotional selector. In turn, the emotional selector passes the values of the affective state variables to its child nodes, the empathy nodes. Each child empathy node provides an empathic evaluation of its subtree. In Figure 1, every subtree has a root node that is a sequence selector with a condition node as a child and several action nodes. The condition child node returns success/failure by performing a test condition upon the input pair (KG', Env). The corresponding action child nodes are executed if the condition node returns success. By doing so, the NEABT can execute actions over the environment. Some of these action nodes define the BT's threefold output.

4. Case Study: Driver Co-Pilot

Here, we envision a case study that involves developing an intelligent agent that actively functions as a "companion" (co-driver) and support system for drivers. The agent will assist drivers by providing interventions in risky situations that may arise due to external circumstances and/or the driver's health condition and emotional state, taking into account emotional aspects that could impact driving performance.

The intelligent agent will also be trained through interaction with the human user, following the recent "Human-AI teaming" paradigm. A human driver could cooperatively train the agent by collaboratively performing various driving-related tasks, even under challenging scenarios. In this synergistic relationship, during the training phase humans enhance the effectiveness of automation (capabilities and performance). At the same time, the agent installed in each vehicle improves human efficiency and compensates for human inadequacies by intercepting and correcting potential erroneous behaviours, possibly resulting from compromised physical or emotional states.

Potential intervention modes for the agent to assist a struggling driver could include automatically activating a (semi-)autonomous driving mode (if available) so the driver can momentarily divert their attention. Alternatively, the agent could more actively engage with the driver to regain attentiveness, such as by recommending stimulating music on a dedicated radio station. In case of health issues, the agent could recommend pulling over to rest or take medication (e.g., for hypertension) or, in critical cases, seek emergency assistance by contacting emergency services.
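Combining the loop-back mechanism of Section 3 with this scenario, one iteration of the loop can be sketched as follows. This is a deliberately simplified, hypothetical illustration: the KG is modeled as a plain set of triples, the NEABT and the NEABT-to-KG encoder-decoder are replaced by trivial hand-written rules, and all triple and action names are invented for the example (the paper leaves these components to learned models):

```python
# One NEMO loop iteration, sketched: the KG is a set of
# (subject, relation, object) triples; the BT's threefold output and the
# user's feedback are turned into triples to add/remove.

kg = {
    ("driver", "profile_prefers", "jazz_radio"),
    ("driver", "health_risk", "hypertension"),
}

def bt_tick(detected_emotion):
    """Stand-in for the NEABT: maps the detected emotion to the threefold
    output (suggested action, agent action, detected emotion)."""
    if detected_emotion == "drowsy":
        return ("pull_over_and_rest", "play_stimulating_music",
                detected_emotion)
    return ("keep_driving", "none", detected_emotion)

def update_kg(kg, suggested, feedback_accepted, emotion):
    """Stand-in for the NEABT-to-KG encoder-decoder: derives the sets of
    triples to add and remove from the BT output and the user feedback."""
    adds = {("driver", "last_detected_emotion", emotion)}
    if feedback_accepted:
        adds.add(("driver", "accepted_suggestion", suggested))
    removes = {t for t in kg if t[1] == "last_detected_emotion"}
    return (kg - removes) | adds

suggested, agent_action, emotion = bt_tick("drowsy")
kg = update_kg(kg, suggested, feedback_accepted=True, emotion=emotion)
```

After the iteration, the KG records that the drowsy state was detected and that the driver accepted the suggestion, so the next BT tick operates on an updated user profile: this is the loop in which the KG feeds the BT and the BT feeds the KG.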
References

[1] M. Colledanchise, P. Ögren, Behavior trees in robotics and AI: An introduction, CRC Press, 2018.
[2] M. Iovino, E. Scukins, J. Styrud, P. Ögren, C. Smith, A survey of behavior trees in robotics and AI, Robotics and Autonomous Systems 154 (2022) 104096.
[3] S. Costantini, P. Dell'Acqua, G. De Gasperis, A. Rafanelli, Empowering emotional behavior trees with neural computation for digital forensic, 15th European Symposium on Computational Intelligence and Mathematics (ESCIM 2024) (in press).
[4] A. Hogan, E. Blomqvist, M. Cochez, C. d'Amato, G. de Melo, C. Gutierrez, S. Kirrane, J. E. L. Gayo, R. Navigli, S. Neumaier, A. N. Ngomo, A. Polleres, S. M. Rashid, A. Rula, L. Schmelzeisen, J. F. Sequeda, S. Staab, A. Zimmermann, Knowledge graphs, ACM CSUR 54 (2022) 71:1–71:37.
[5] G. Weikum, Knowledge graphs 2021: A data odyssey, PVLDB 14 (2021) 3233–3238.
[6] O. Deshpande, D. S. Lamba, M. Tourn, S. Das, S. Subramaniam, A. Rajaraman, V. Harinarayan, A. Doan, Building, maintaining, and using knowledge bases: a report from the trenches, in: SIGMOD, 2013, pp. 1209–1220.
[7] M. Lissandrini, D. Mottin, T. Palpanas, D. Papadimitriou, Y. Velegrakis, Unleashing the power of information graphs, ACM SIGMOD Record 43 (2015) 21–26.
[8] C. Shi, Y. Li, J. Zhang, Y. Sun, S. Y. Philip, A survey of heterogeneous information network analysis, TKDE 29 (2016) 17–37.
[9] X. Wang, L. Chen, T. Ban, M. Usman, Y. Guan, S. Liu, T. Wu, H. Chen, Knowledge graph quality control: A survey, Fundamental Research 1 (2021) 607–626.
[10] S. Ji, S. Pan, E. Cambria, P. Marttinen, S. Y. Philip,
A survey on knowledge graphs: Representation,
acquisition, and applications, Trans. Neural Netw.
Learn. Syst. 33 (2021) 494–514.
[11] B. Yang, S. W.-t. Yih, X. He, J. Gao, L. Deng, Embed-
ding entities and relations for learning and infer-
ence in knowledge bases, in: ICLR, 2015.
[12] Y. Wu, Y. Xu, X. Lin, W. Zhang, A holistic approach
for answering logical queries on knowledge graphs,
in: ICDE, 2023, pp. 2345–2357.
[13] S. S. Bhowmick, E. C. Dragut, W. Meng, Glob-
ally aware contextual embeddings for named entity
recognition in social media streams, in: ICDE, 2023,
pp. 1544–1557.
[14] J. Huang, Z. Sun, Q. Chen, X. Xu, W. Ren, W. Hu,
Deep active alignment of knowledge graph entities
and schemata, PACMMOD 1 (2023) 159:1–159:26.
[15] A. Zeakis, G. Papadakis, D. Skoutas, M. Koubarakis,
Pre-trained embeddings for entity resolution: An
experimental analysis, PVLDB 16 (2023) 2225–2238.
[16] Y. Gao, Y. Xiong, X. Gao, K. Jia, J. Pan, Y. Bi, Y. Dai,
J. Sun, Q. Guo, M. Wang, H. Wang, Retrieval-
augmented generation for large language models:
A survey, CoRR abs/2312.10997 (2023).
[17] S. Hao, T. Liu, Z. Wang, Z. Hu, ToolkenGPT: Aug-
menting frozen language models with massive tools
via tool embeddings, in: NeurIPS, 2023.
[18] X. Wang, Q. Yang, Y. Qiu, J. Liang, Q. He, Z. Gu,
Y. Xiao, W. Wang, KnowledGPT: Enhancing large
language models with retrieval and storage access
on knowledge bases, CoRR abs/2308.11761 (2023).
[19] J. Zhang, Graph-toolformer: To empower LLMs
with graph reasoning ability via prompt augmented
by ChatGPT, CoRR abs/2304.11116 (2023).
[20] Q. Wang, Z. Mao, B. Wang, L. Guo, Knowledge
graph embedding: A survey of approaches and ap-
plications, TKDE 29 (2017) 2724–2743.
[21] A. Bordes, N. Usunier, A. Garcia-Duran, J. Weston,
O. Yakhnenko, Translating embeddings for model-
ing multi-relational data, NeurIPS 26 (2013).
[22] L. Chao, J. He, T. Wang, W. Chu, PairRE: Knowledge
graph embeddings via paired relation vectors, in:
ACL, 2021, pp. 4360–4369.
[23] M. Nickel, V. Tresp, H.-P. Kriegel, A three-way
model for collective learning on multi-relational
data, in: ICML, 2011, pp. 809–816.
[24] Z. Sun, Z. Deng, J. Nie, J. Tang, RotatE: Knowledge
graph embedding by relational rotation in complex
space, in: ICLR, 2019.
[25] T. Trouillon, J. Welbl, S. Riedel, É. Gaussier,
G. Bouchard, Complex embeddings for simple link
prediction, in: ICML, 2016, pp. 2071–2080.
[26] T. Dettmers, P. Minervini, P. Stenetorp, S. Riedel,
Convolutional 2d knowledge graph embeddings, in:
AAAI, 2018.