Towards an ontological core for cognitively justified
robots
Stefano Borgo (1), Roberta Ferrario (1), Claudio Masolo (1) and Daniele Porello (2)
(1) Laboratory for Applied Ontology, ISTC CNR, Trento
(2) Department of Philosophy, University of Genova


Abstract
Robots built to interact with humans in everyday life need to organize, manage and elaborate
information in ways that are not only reliable but also aligned with humans’ understanding of the
environment. The paper argues that cognitively motivated ontologies contribute to the development
of techniques and to the organization of knowledge bases that take us closer to the construction of
robot architectures suitable for smooth interactions with humans.

Keywords
Ontology, Cognitive robotics, DOLCE, Threshold operators


1. Introduction
Robotics has recently paid increasing attention to the development of (everyday,
direct) human-robot collaboration. The envisioned future is that cobots (collaborative robots)
will help humans in their usual activities, e.g., in goods production as well as housekeeping, and
enable smooth social relationships across the human and artificial worlds. It is still unclear how
to design cobots for the variety of activities and related possible interactions that are manifested
in modern socio-technical systems (STS). This research area is even more challenging when
the interaction between robots and humans is not restricted to working scenarios. In everyday
situations, human-robot interaction may span the cognitive dimension in its entirety. This short
paper presents an approach to cognitive robotics that takes advantage of recent research in
applied ontology and knowledge representation (KR).


8th Italian Workshop on Artificial Intelligence and Robotics (AIRO 2021)
Email: stefano.borgo@cnr.it (S. Borgo); roberta.ferrario@cnr.it (R. Ferrario); claudio.masolo@loa.istc.cnr.it
(C. Masolo); daniele.porello@unige.it (D. Porello)
URL: https://www.istc.cnr.it/people/stefano-borgo (S. Borgo); https://www.istc.cnr.it/people/roberta-ferrario
(R. Ferrario); https://www.istc.cnr.it/people/claudio-masolo (C. Masolo); http://www.dif.unige.it/epi/hp/porello/
(D. Porello)
ORCID: 0000-0002-9175-3096 (R. Ferrario)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073


2. One, No One and One Hundred Thousand
One of the central aims of robotics, especially when considered from the perspective of STS, is
to build an object that, when acting, is perceived by humans as a single agent (one), instead of
a combination of several interconnected devices and services (one hundred thousand), an
agent to interact with rather than something to use (no one). Being observationally coherent
and socially predictable is important for artificial embodied agents. A robot usually has a number
of components dedicated to collecting data about the environment through a variety of sensors and
channels (cameras, audio systems, magnetic sensors, wireless communication, etc.), each with its own
level of precision, capabilities and limitations (e.g., reflective or transparent objects may be a
challenge for optical technology) and possible interferences (from other devices and running
processes). Data, raw or pre-processed, coming from different sensors may not be easily
merged, e.g., due to differences in tolerance, processing delays or contextual significance of some
features. Given that every sensor provides a reading of the environment or of a fragment of it, for the
agent to act coherently there must be a step in which data are reliably integrated and an overall
coherent picture of the environment is generated. Methodologies for data integration,
especially when the data themselves have to be read from different perspectives on the world (e.g.,
the physical and the social), are complex. On top of this, methodologies must be rational and robust,
and must produce results that are cognitively acceptable and justifiable. Otherwise the overall
artificial system will not be perceivable as a single agent, and what is built is a machine that falls
short of being a robot (no one). This means that the process of data integration should follow
principles inherited from the cognitive sciences. Data integration alone is obviously not enough to
reach an interpretation of the available information that enables a meaningful interaction with humans.
To reach the latter, the robot’s knowledge should be enriched with input concerning
the socio-cultural environment in which it operates, including such diverse information as
attribution of roles, working practices, social rituals, taboos, etc. Given the heterogeneity of
the information that robots should use when interacting with humans, a methodology for
organizing, deploying, and representing it is mandatory. In the next sections, we propose a
cognitively motivated ontology as an appropriate means to tackle such a complex issue.


3. Cognitive Ontology in the Agent’s Information Flow
Techniques developed in applied ontology and KR are already exploited in robotics to some
extent [1]. One of the purposes of applied ontology is the transparent and robust organization
and classification of data coherently with the conceptualization the observer has of reality [2].
This effort has led to the development of several ontology-based methodologies which, paired with
techniques for decision making, can improve sensor data classification, data analysis and data
integration.
   Generally speaking, a formal ontology is a general, reliable and well-organized conceptual
system. This means that the ontology includes the most usable, domain-independent and widely
applicable concepts (generality); is expressed as a logical theory with Tarskian-style semantics,
with a rich axiomatization and carefully analyzed formal consequences (reliability); and is
constructed following explicitly motivated philosophical principles (justified organization).
Overall, an ontology relies on the symbolic representation of data, and its aim is to formalize
a view of the world, i.e., a way to coherently understand reality and what one expects to be
possible. The ontology DOLCE [3] is an example of such a system. A knowledge base built from
this ontology can model information from the agent’s perspective.
   As noted above, formal ontologies are written in logical languages such as first-order logic or, to
ensure the tractability of the reasoning services, in weaker fragments of first-order logic. A
fundamental family of languages for ontologies is that of Description Logics (DL). E.g., the Web
Ontology Language (OWL 2) [4, 5] corresponds essentially to the 𝒮ℛ𝒪ℐ𝒬(𝒟) DL [6].
   Logic-based ontologies are capable of specifying hard constraints on individuals, e.g. the
rigid distinction between an object and its qualities, or an object and its components. To cope
with concepts learnt from examples or data, one can introduce dedicated operators like the
Weighted Threshold Operators. Informally, these operators take a list of relevant concepts in the
ontology and treat them as features to define the newly learnt concept.
   In [7], we started studying weighted logics to provide cognitively meaningful representations
of concepts in ontologies, while in [8] and [9], we studied threshold operators in the context of
DLs. In brief, if $C_1, \ldots, C_n$ are concept expressions, $w_1, \ldots, w_n \in \mathbb{R}$ are weights,
and $t \in \mathbb{R}$ is a threshold, the new concept $\nabla\!\!\nabla^{t}(C_1 : w_1, \ldots, C_n : w_n)$
(called a ‘Tooth expression’) classifies the individuals $d$ such that
$\sum \{w_i : C_i \text{ applies to } d\} \geq t$. As a toy example, assume we wish
to capture a common-sense definition of mug. We list a number of relevant features with the
associated weights to indicate their relevance for an object to be classified as a mug. This is
expressed by the TBox axiom:

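\[
\mathrm{Mug} \equiv \nabla\!\!\nabla^{2}\bigl(\exists \mathrm{contains}.(\mathrm{Coffee} \sqcup \mathrm{Tea}) : 1,\
\exists \mathrm{hasBase}.\mathrm{Circular} : 1,\ \exists \mathrm{hasPart}.\mathrm{Handle} : 1,\
\exists \mathrm{isLargerThan}.\mathrm{Cup} : 1\bigr) \tag{1}
\]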

That is, a mug is associated with features like: it contains coffee or tea, it has a circular base, it
has a handle, it is larger than some cup. The definition adds that, for an object to be a mug, any two
of these four features suffice (here, 1 is the weight assigned to each feature, and 2 is the threshold).
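
The evaluation of equation (1) can be illustrated with a minimal Python sketch. It assumes that each of the four concept expressions has already been reduced to a Boolean flag for the individual at hand; the function `tooth`, the feature labels and the constant names are hypothetical and do not correspond to any existing DL reasoner.

```python
from typing import Dict, Set


def tooth(weights: Dict[str, float], threshold: float, satisfied: Set[str]) -> bool:
    """Return True when the weights of the satisfied features sum to at least the threshold."""
    score = sum(w for feature, w in weights.items() if feature in satisfied)
    return score >= threshold


# Hypothetical labels standing for the four concept expressions in equation (1).
MUG_FEATURES = {
    "contains_coffee_or_tea": 1.0,
    "has_circular_base": 1.0,
    "has_handle": 1.0,
    "is_larger_than_some_cup": 1.0,
}
MUG_THRESHOLD = 2.0


def is_mug(satisfied: Set[str]) -> bool:
    """Classify an individual, described by its satisfied features, under Mug."""
    return tooth(MUG_FEATURES, MUG_THRESHOLD, satisfied)
```

With these definitions, an individual that has a handle and a circular base is classified as a mug, while one that only has a handle is not.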
   Threshold operators have been extensively studied in the context of circuit complexity theory
(e.g. [10]), and they are also known in the neural network community by the name of perceptrons
(cf. e.g. [11]). Extensions of DLs with threshold operators have been discussed also in [12, 13].
Classifications via the threshold concepts can be construed as knowledge-dependent, cf. [9]: the
values depend on the knowledge base, so the classification depends on the knowledge available
to the agent. This aspect highlights the situated, contextual nature of classification tasks.
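
A small sketch may help to see what knowledge dependence amounts to in practice. As a strong simplifying assumption, the knowledge base is modelled here as a plain set of explicit assertions and no reasoning is performed; the names `known_score`, `classified`, `kb_before` and `kb_after` are illustrative only.

```python
from typing import Dict, Set, Tuple

Assertion = Tuple[str, str]  # (feature label, individual name)


def known_score(kb: Set[Assertion], weights: Dict[str, float], individual: str) -> float:
    """Sum the weights of the features that the knowledge base asserts for the individual."""
    return sum(w for feature, w in weights.items() if (feature, individual) in kb)


def classified(kb: Set[Assertion], weights: Dict[str, float],
               threshold: float, individual: str) -> bool:
    """An individual is classified only on the basis of what the knowledge base contains."""
    return known_score(kb, weights, individual) >= threshold


MUG = {
    "contains_coffee_or_tea": 1.0,
    "has_circular_base": 1.0,
    "has_handle": 1.0,
    "is_larger_than_some_cup": 1.0,
}

# The same individual "a" under two states of knowledge (toy data).
kb_before = {("has_handle", "a")}
kb_after = kb_before | {("has_circular_base", "a")}
```

The object itself has not changed between `kb_before` and `kb_after`; only the agent's knowledge has, and with it the outcome of the classification.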
   The benefits of Tooth expressions are briefly summarized as follows: 𝑖) they provide a
way to compactly define concepts that are human accessible: instead of writing a possibly
long disjunctive normal form (DNF) of concepts, we list the weighted relevant features and
a threshold; 𝑖𝑖) definitions provided in terms of Tooth can be grounded on cognitive theories
of concepts and categorization, like the theory of prototypes or the theory of exemplars, cf.
[14]; 𝑖𝑖𝑖) adding Tooth expressions to rich DLs does not raise the computational complexity
of the reasoning services (cf. [9]); 𝑖𝑣) the weights of the Tooth expressions can be learnt from
examples or past experience of an agent, so Tooth expressions can bridge symbolic knowledge
with experiential data. For details, we refer to [9].
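
Point iv) can be sketched with the classical perceptron update rule, which is one possible learning procedure and not necessarily the one adopted in [9]. The training set below is toy data built for illustration: every combination of four binary features, labelled positive when at least two of them hold, as in the Mug example. The names `train_tooth`, `score` and `examples` are hypothetical.

```python
from itertools import product
from typing import List, Sequence, Tuple

Example = Tuple[Tuple[int, ...], bool]  # (binary feature vector, is it an instance?)


def score(weights: Sequence[float], x: Sequence[int]) -> float:
    """Weighted sum of the features that hold for the individual."""
    return sum(w * xi for w, xi in zip(weights, x))


def train_tooth(examples: Sequence[Example], max_epochs: int = 1000) -> Tuple[List[float], float]:
    """Learn weights and a threshold with the perceptron rule; stop once an epoch is error-free."""
    weights = [0.0] * len(examples[0][0])
    threshold = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, label in examples:
            predicted = score(weights, x) >= threshold
            if predicted != label:
                step = 1.0 if label else -1.0
                weights = [w + step * xi for w, xi in zip(weights, x)]
                threshold -= step  # the threshold plays the role of a negated bias
                mistakes += 1
        if mistakes == 0:
            break
    return weights, threshold


# Toy data (an assumption): positive when at least two of the four features hold.
examples = [(x, sum(x) >= 2) for x in product((0, 1), repeat=4)]
weights, threshold = train_tooth(examples)
```

The learnt weights and threshold need not coincide with the hand-written values of equation (1); what matters is that the resulting Tooth expression classifies the examples in the same way and can still be read as a list of weighted features.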


4. Ontologizing the robot’s architecture
Even if traditional problems in robotics (navigation, object detection, obstacle avoidance, object
grasping, etc.) were completely solved, the types of information that a cobot should be able to
manage would remain impressive [15]. Since our goal in this paper is to suggest an ontology-based
approach to information in robotics, and to exemplify it in the context of information integration,
we now show how this can be implemented in a generic robot architecture (Fig. 1). It should
be clear from the earlier sections that our work is essentially about the meaning of information
(with impacts on its collection, organization and manipulation), which is typically a concern of
the robot’s modules controlled by knowledge representation techniques [16].

[Figure 1: block diagram, not reproduced. Bottom-up chain: Sensorimotor layer; raw data; Data
preprocessing layer (e.g. via parametrized ML, DL); symbolic expressions; Layout and situation
assessment module (TOOTH operator); Interpretation module; selected scenarios; KB core module
(ontologically organized; rules, expectations, beliefs, scenarios-behavior associations) within the
KNOWLEDGE BASE. Further modules: Task planning module; Execution module; Motion and manipulation
planning module. Link labels: world model, beliefs, behaviors, rules (selected info); events, world
model, beliefs, rules, monitoring (selected info); motion-related rules (selected info); parameters;
selected plans; motion plans; basic actions.]

Figure 1: Information flow in a general-purpose robot architecture (adapted from [15]).

   In Fig. 1 (bottom), data is collected by a sensorimotor module and then preprocessed by a
parametric module, typically relying on subsymbolic approaches. The information produced by
this latter module is available in logical format and essentially concerns geometric information and
sensor-dependent features. This information is processed by the layout and situation assessment
module, which identifies objects and the overall layout of the (detectable) environment. The
Tooth expressions described in Section 3 are devised to integrate information at this
step. The layout and situation description is then passed to the interpretation module and to the
core of the knowledge base (which we do not discuss). The role of the interpretation module is
to elicit the interpretation of the situation, which we call a scenario. Briefly put, a situation is the
mere list of objects and their places in an environment (there are people forming a line), while the
scenario is the social interpretation of what is going on (there is a queue), which makes it possible
to associate the relevant rules, behaviors and expectations with the actual state, as described in [15].
The rest of the architecture, which we do not discuss here, describes the connections across the task
and geometric planners and the execution module.
   In this example, the discussion of Section 3 refers to a module where both classical DL
expressions and Tooth expressions are used. While the DL expressions represent factual
knowledge (the ABox) or hard laws that regulate the universe of discourse, the Tooth expressions
define concepts under which entities can be classified on the basis of the data (ABox statements)
returned by the data preprocessing layer.
   First note that, since Tooth expressions are legitimate logical concept constructors, they can
be exploited by, and can exploit, the reasoning capabilities of the main knowledge base module.
For instance, given the definition of Mug in equation (1), from the ontological knowledge
present in the KB core module one can infer that mugs are physical objects, that they are located
in space, that they have container functions, etc. Moreover, the ontological knowledge can
be used to deductively close the ABox returned by the preprocessing layer, thereby strengthening the
classification mechanism. E.g., one can exploit known correlations among qualities to infer
features of objects that the sensors were unable to observe directly.
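
The effect of deductively closing the ABox before classification can be sketched as follows. The sketch assumes only atomic subsumption axioms over feature labels, which is far weaker than what a DL reasoner handles; the axioms in `SUBSUMPTIONS`, the feature labels and the function names are hypothetical.

```python
from typing import Dict, Set


def close_under_subsumption(features: Set[str], subsumptions: Dict[str, Set[str]]) -> Set[str]:
    """Add every feature implied, directly or transitively, by axioms of the form C ⊑ D."""
    closed = set(features)
    frontier = list(features)
    while frontier:
        current = frontier.pop()
        for implied in subsumptions.get(current, set()):
            if implied not in closed:
                closed.add(implied)
                frontier.append(implied)
    return closed


def reaches(weights: Dict[str, float], threshold: float, features: Set[str]) -> bool:
    """Threshold test over the features known for an individual."""
    return sum(w for f, w in weights.items() if f in features) >= threshold


# Hypothetical background knowledge held in the KB core module.
SUBSUMPTIONS = {
    "contains_espresso": {"contains_coffee"},
    "contains_coffee": {"contains_coffee_or_tea"},
    "has_round_rim": {"has_circular_base"},  # an assumed correlation among qualities
}
MUG = {
    "contains_coffee_or_tea": 1.0,
    "has_circular_base": 1.0,
    "has_handle": 1.0,
    "is_larger_than_some_cup": 1.0,
}

observed = {"contains_espresso", "has_round_rim"}  # what the sensors returned (toy data)
closed = close_under_subsumption(observed, SUBSUMPTIONS)
```

On the raw observations no Mug feature is matched, whereas on the closed set two features are inferred and the threshold of equation (1) is reached.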
   Second, Tooth definitions of concepts can be automatically built by running supervised
learning algorithms on the dataset of ABox statements provided by the preprocessing layer (at
a given time), with the major advantage of being, in principle, human-readable and explainable.
Moreover, knowledge about planned actions and expectations can be used to modify or refine
these definitions (for example acting on the weights used by the Tooth expressions) on the
basis of the success of the actions undertaken by the robot. Finally, the Tooth definitions can
be used to guide the robot to focus on the acquisition of specific information with the goal
of disambiguating doubtful yet critical classifications. For instance, suppose that the robot is
unable to classify an object 𝑎 because no threshold has been reached. Starting from the concepts
with the highest classification degree for 𝑎 and possibly from knowledge of what is typically
present in a scenario, the robot can plan to make more observations and even run a check of its
sensors for possible malfunctioning.
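
The disambiguation strategy just described can be sketched as follows. The sketch makes two assumptions that are not in the text: the classification degree is taken to be the score minus the threshold, and the missing features are proposed for observation in order of decreasing weight. The Vase definition and all names are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

Definition = Tuple[Dict[str, float], float]  # (feature weights, threshold)


def degree(definition: Definition, observed: Set[str]) -> float:
    """Classification degree, taken here to be the score minus the threshold (an assumption)."""
    weights, threshold = definition
    return sum(w for f, w in weights.items() if f in observed) - threshold


def plan_observations(definitions: Dict[str, Definition],
                      observed: Set[str]) -> Optional[Tuple[str, List[str]]]:
    """Return the unreached concept closest to its threshold and its unobserved features."""
    unreached = [name for name, d in definitions.items() if degree(d, observed) < 0]
    if not unreached:
        return None
    best = max(unreached, key=lambda name: degree(definitions[name], observed))
    weights, _ = definitions[best]
    missing = sorted((f for f in weights if f not in observed),
                     key=lambda f: weights[f], reverse=True)
    return best, missing


DEFINITIONS: Dict[str, Definition] = {
    "Mug": ({"contains_coffee_or_tea": 1.0, "has_circular_base": 1.0,
             "has_handle": 1.0, "is_larger_than_some_cup": 1.0}, 2.0),
    "Vase": ({"has_circular_base": 1.0, "contains_flowers": 2.0,
              "is_taller_than_wide": 1.0}, 3.0),
}

candidate, to_check = plan_observations(DEFINITIONS, {"has_handle"})
```

Here the object has only been seen to have a handle, so no threshold is reached; Mug is the closest concept, and the robot would plan observations of the three Mug features it has not yet checked.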
   Third, even the robot architecture could be refined to match assumptions largely adopted by
cognitive theories of concepts and categorization. For instance, cognitive theories usually make
a distinction between concepts (e.g., dog, mug, tree), attributes (e.g., color, shape, texture), and
attribute-values (e.g., crimson, round, smooth). The classification of an object under a concept
is determined on the basis of its attribute-values and of the relevance and typicality of these
values. Attributes and attribute-values are strongly linked to the human perceptive system,
while concepts are cognitively more complex. This contrast can be simulated by assuming that
(𝑖) by being closely linked to the available sensors, the preprocessing layer returns assertions
based on DL-concepts representing attribute-values, while (𝑖𝑖) the complex concepts are defined
via Tooth-expressions involving only the DL-concepts corresponding to attribute-values with
weights representing the relevance and typicality for that concept.
   While the approach we have presented is cognitively interesting and adds the flexibility
of a parametrized classification, it also raises a series of problems. The elaboration of the
output of the sensors performed at the preprocessing layer could result in a set of assertions
that are inconsistent with the general knowledge in the KB core module or that generate
inconsistent classifications under Tooth-defined concepts. This happens, for instance, when
an individual satisfies enough features associated with concepts that are deemed disjoint by
an axiom in the KB core module. Several heterogeneous causes can lead to
this kind of conflict: the disjointness axiom might be wrong, the tagging of the data might
be mistaken, the Tooth definition of the concepts might be incomplete, the scenario might
have changed, etc. There are approaches to overcome these conflicts, for instance by applying
judgment aggregation techniques from decision theory [17], which may use meta-information about
the sensors such as their reliability in given contexts (see [18]).
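
The kind of conflict discussed above can at least be detected mechanically, as the following sketch suggests. It covers detection only and does not attempt the aggregation-based resolution of [17]; the Vase definition, the disjointness axiom between Mug and Vase and the function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Definition = Tuple[Dict[str, float], float]  # (feature weights, threshold)


def reached_concepts(definitions: Dict[str, Definition], observed: Set[str]) -> Set[str]:
    """Concepts whose threshold is reached by the observed features."""
    return {
        name
        for name, (weights, threshold) in definitions.items()
        if sum(w for f, w in weights.items() if f in observed) >= threshold
    }


def disjointness_conflicts(definitions: Dict[str, Definition],
                           disjoint_pairs: List[Tuple[str, str]],
                           observed: Set[str]) -> List[Tuple[str, str]]:
    """Disjoint concept pairs under which the same individual ends up classified."""
    reached = reached_concepts(definitions, observed)
    return [(a, b) for a, b in disjoint_pairs if a in reached and b in reached]


DEFINITIONS: Dict[str, Definition] = {
    "Mug": ({"contains_coffee_or_tea": 1.0, "has_circular_base": 1.0,
             "has_handle": 1.0, "is_larger_than_some_cup": 1.0}, 2.0),
    "Vase": ({"has_circular_base": 1.0, "contains_flowers": 2.0,
              "is_taller_than_wide": 1.0}, 3.0),
}
DISJOINT = [("Mug", "Vase")]  # assumed axiom in the KB core module

conflicts = disjointness_conflicts(
    DEFINITIONS, DISJOINT, {"has_handle", "has_circular_base", "contains_flowers"}
)
```

In this toy case the individual reaches both thresholds, so the pair (Mug, Vase) is reported and would have to be handed over to a conflict-resolution step.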


References
 [1] A. Olivares-Alarcos, D. Beßler, A. Khamis, P. Goncalves, et al., A review and comparison
     of ontology-based approaches to robot autonomy, Knowledge Engineering Review 34
     (2019).
 [2] N. Guarino, Formal ontology in information systems, in: N. Guarino (Ed.), Proceedings of
     the Second International Conference on Formal Ontology in Information Systems, IOS
     Press, 1998, pp. 3–15.
 [3] S. Borgo, C. Masolo, Foundational Choices in DOLCE, in: S. Staab, R. Studer (Eds.),
     Handbook on Ontologies, 2nd ed., Springer Verlag, 2009, pp. 361–381.
 [4] W3C OWL Working Group, OWL 2 Web Ontology Language Document Overview, W3C
     Recommendation, 27 October 2009.
 [5] P. Hitzler, M. Krötzsch, B. Parsia, P. F. Patel-Schneider, S. Rudolph, OWL 2 Web Ontology
     Language Primer, W3C Recommendation, 27 October 2009.
 [6] I. Horrocks, P. F. Patel-Schneider, Reducing OWL entailment to description logic
     satisfiability, in: International Semantic Web Conference, Springer, 2003, pp. 17–29.
 [7] C. Masolo, D. Porello, Representing concepts by weighted formulas, in: S. Borgo, P. Hitzler,
     O. Kutz (Eds.), Formal Ontology in Information Systems - Proceedings of the 10th
     International Conference, FOIS 2018, Cape Town, South Africa, 19-21 September 2018, volume
     306 of Frontiers in Artificial Intelligence and Applications, IOS Press, 2018, pp. 55–68.
 [8] D. Porello, O. Kutz, G. Righetti, N. Troquard, P. Galliani, C. Masolo, A toothful of
     concepts: Towards a theory of weighted concept combination, in: Proceedings of the
     32nd International Workshop on Description Logics, volume 2373, CEUR-WS, 2019. URL:
     http://ceur-ws.org/Vol-2373/paper-24.pdf.
 [9] P. Galliani, O. Kutz, D. Porello, G. Righetti, N. Troquard, On knowledge dependence in
     weighted description logic, in: Proc. of the 5th Global Conference on Artificial Intelligence
     (GCAI 2019), 2019, pp. 17–19.
[10] H. Vollmer, Introduction to circuit complexity: a uniform approach, Springer Science &
     Business Media, 2013.
[11] C. M. Bishop, Pattern recognition and machine learning, Springer Science+ Business Media,
     2006.
[12] F. Baader, G. Brewka, O. F. Gil, Adding threshold concepts to the description logic ℰℒ, in:
     C. Lutz, S. Ranise (Eds.), Frontiers of Combining Systems, Springer International Publishing,
     Cham, 2015, pp. 33–48.
[13] F. Baader, A. Ecke, Reasoning with prototypes in the description logic 𝒜ℒ𝒞 using weighted
     tree automata, in: A.-H. Dediu, J. Janoušek, C. Martín-Vide, B. Truthe (Eds.), Language
     and Automata Theory and Applications, Springer International Publishing, Cham, 2016,
     pp. 63–75.
[14] G. Righetti, D. Porello, O. Kutz, N. Troquard, C. Masolo, Pink panthers and toothless tigers:
     Three problems in classification, in: Proc. of the 5th International Workshop on Artificial
     Intelligence and Cognition, Manchester, September 10–11, 2019.
[15] S. Borgo, E. Blanzieri, Trait-based module for culturally-competent robots, International
     Journal of Humanoid Robotics 16 (2019) 1950028.
[16] K. Rajan, A. Saffiotti, Towards a science of integrated AI and robotics, Artificial Intelligence
     247 (2017) 1–9.
[17] D. Porello, N. Troquard, R. Peñaloza, R. Confalonieri, P. Galliani, O. Kutz, Two approaches to
     ontology aggregation based on axiom weakening, in: Proceedings of the Twenty-Seventh
     International Joint Conference on Artificial Intelligence, IJCAI 2018, July 13-19, 2018,
     Stockholm, Sweden, 2018, pp. 1942–1948. doi:10.24963/ijcai.2018/268.
[18] C. Masolo, A. B. Benevides, D. Porello, The interplay between models and observations,
     Applied Ontology 13 (2018) 41–71.