Tangible Interaction and Embodied Cognition
challenged by remote control issues
Vincent Ferrari1, Valentin Braud1,2,3, Laurent Bovet2,3 and Nadine Couture3
1 CREA, Aix-Marseille université, École de l’air et de l’espace, F-13661 Salon-de-Provence, France
2 ELISA Aerospace, F-02100 Saint-Quentin, France
3 Univ. Bordeaux, ESTIA INSTITUTE OF TECHNOLOGY, F-64210 Bidart, France


Abstract
This article argues for considering embodied cognition as a theoretical framework for tangible
interfaces. In turn, tangible interfaces can serve as technical solutions for putting the
principles of embodied cognition into practice. The reflection carried out in this article is
motivated by the context of remote control of UAV systems.

Keywords
Embodied Cognition, Tangible Interaction, Unmanned Aerial System Control, Interdisciplinary


1. Introduction
Efficient control of a remote system (drones, rovers, etc.) requires the operator to build the
best possible spatial representation of the situation. In concrete terms, what does a drone
operator have at their disposal to represent an aeronautical situation that sometimes takes
place thousands of kilometres away? First, the operator processes the information provided by
the ground station interfaces. The quality of these interfaces (that is, of the information
presented) determines the plausibility of the operator’s spatial representation. Second, to
ensure this plausibility, an experienced operator also draws on prior knowledge about unmanned
aerial vehicles (UAVs), and the operator’s level of experience modulates the relevance of the
spatial representation. Finally, the information provided by the interfaces must be in line
with the nature of the operators’ knowledge. Do the interfaces of unmanned aerial system (UAS)
ground stations currently meet these requirements? In our view, the answer is no. We argue
here that, to address UAS remote control issues, particularly those relating to the quality of
the operator’s spatial representation, alternatives to the traditional graphical user
interfaces (GUIs) implemented in ground stations must be provided. From our perspective, with
the support of the embodied cognition theoretical framework, tangible interaction will better
exploit the capabilities of operators in the remote control of UAS. The two disciplines
addressed here, namely embodied cognition theory (hereafter EC) and tangible interaction
(hereafter TI), will challenge each
other. The two disciplines must cohabit and contribute to each other in order to respond to
the very strong constraints of a high-stakes application field, namely drone systems and, more
generally, remotely controlled systems.
   In this particular context, the user (the sensor operator or the pilot) must manage a very
large amount of information in order to reconstruct a distant “reality”. So far, this
reconstruction has been performed through multiple screens, which require purely cognitive
processing of the information and thus rely on the individual’s information processing
capabilities and on the visual modality. It is therefore worth considering an alternative to
this fully visual and cognitive approach, namely TI supported by EC. A gradual shift in the
cognitive sciences towards embodied paradigms of human cognition has inspired us to think
about how EC should theoretically support TI.

Proceedings of ETIS 2022, November 7–10, 2022, Toulouse, France
vincent.ferrari@ecole-air.fr (V. Ferrari); valentin.braud@ecole-air.fr (V. Braud); l.bovet@elisa-aerospace.fr (L. Bovet); n.couture@estia.fr (N. Couture)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073


2. Embodied Cognition
EC is a view of cognition developed in opposition to traditional cognitivism. EC, which is
still a minority view today, rejects the analogy between the functioning of the human brain
and the functioning of a computer, an analogy that was the foundation of cognitivism. The core
of the disagreement between cognitivism and EC lies in the format of knowledge representation.
According to EC, the mind cannot be reduced to the amodal processing of symbolic information.
On the contrary, cognition would be rooted in sensory and motor systems. Cognition would
therefore no longer be abstract and amodal, but rather essentially sensorimotor. In short, EC
considers that the mind must be understood in the context of its body (the ‘sensorimotor
context’), and of its interaction with the environment [1]. Human knowledge should therefore
be sensorimotor in nature. If the cognitive system is sensorimotor in essence, then the
spatial representations elaborated by the human brain to guide actions must also integrate
sensory and motor information. It should be noted here that the notion of ‘spatial representation’
refers to the way in which an individual represents the elements of an environment as well as
their absolute and relative positions [2]. According to the theoretical principles of EC, a spatial
representation is constructed from two sources: an external source, perception, and an internal
source, memory. If the information at the source of spatial representations is embodied (in the
sense that it is rooted in the body), then it is reasonable to think that spatial representations are
also embodied [3].
   To summarise, EC is a model of cognition built on the assumption that it is not possible to
separate the elements of the triptych: action, perception and environment [4].


3. Could Embodied Cognition be a theoretical framework for
   Tangible Interaction design?
A theoretical framework introduces and describes the theory or theories underpinning a research
problem. Thus, theoretical frameworks support research by describing and/or drawing from
relevant theoretical aspects obtained in previous work. We argue that EC should be, in the
future, a theoretical framework for TI. We are at an exploratory level, but we believe that EC can
provide the theoretical elements that would allow interaction designers to better understand
and therefore anticipate what may or may not work if they use a system with tangible interfaces.
Among the concepts we have identified that link EC and TI, we discuss below only the two
that appear to us to be fundamental: affordance and legibility.

3.1. Affordance
Affordance in TI. For Norman [5], affordance is the characteristic of an object that ‘naturally’
suggests to its user how to use it. Affordant props allow intuitive and rich manipulations,
because the manipulation of props in a tangible system relies on the user’s perception of
the environment and exploits the user’s dexterity. Affordant props must therefore both
enable a digital effect analogous to a real effect (metaphor of verb in [6]) and, through their
physical design, suggest to the user all the possibilities of action (for instance, a cup handle
invites us to grasp it) (metaphor of noun in [6]). Fishkin [6] introduced these two kinds of
metaphor to determine whether an interface is tangible. The metaphor of verb, in the sense
of action, helps to answer the question: how does the effect of a user’s action on props
compare to the effect of that same action in the real world? If the action performed by the
user within the tangible system is a faithful reproduction of the action that the user would
perform in the real world, then the system can be said to offer a metaphor of verb. The designer
of a tangible interface also chooses the shape, colour, smell, weight and texture of its props,
and thereby generates sensory links between the user and the manipulated props, and thus
between the user and the data embodied by the props. This is the metaphor of noun (see
footnote 1). The system is considered to offer a metaphor of noun if the appearance of the
object strongly resembles something that the user knows. A tangible interface can offer both a
metaphor of verb and a metaphor of noun. In this case, it makes strong use of analogies of both
appearance and use, but the physical and virtual objects remain different. Finally, when no
analogy is needed any more, the metaphor is said to be full: the virtual system and the real
system are one and the same.
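The metaphor levels described above can be read as a small decision procedure. The following Python sketch is only an illustration of that reading and is not taken from [6]: the names `Metaphor` and `classify_metaphor` and the three boolean inputs are hypothetical, and the `NONE` level (neither analogy applies) is an assumption added here for completeness.

```python
from enum import Enum


class Metaphor(Enum):
    """Metaphor levels as described in the text (names are illustrative)."""
    NONE = "none"                    # assumption: neither analogy applies
    NOUN = "noun"                    # the prop looks like something known
    VERB = "verb"                    # the action mirrors a real-world action
    NOUN_AND_VERB = "noun and verb"  # both analogies, objects remain different
    FULL = "full"                    # no analogy needed any more


def classify_metaphor(looks_like_known_object: bool,
                      acts_like_real_action: bool,
                      needs_analogy: bool = True) -> Metaphor:
    """Map a designer's answers about a prop to a metaphor level."""
    # Full metaphor: the user no longer has to make any analogy,
    # the virtual system and the real system are one and the same.
    if not needs_analogy:
        return Metaphor.FULL
    if looks_like_known_object and acts_like_real_action:
        return Metaphor.NOUN_AND_VERB
    if acts_like_real_action:
        return Metaphor.VERB
    if looks_like_known_object:
        return Metaphor.NOUN
    return Metaphor.NONE
```

For instance, for a hypothetical prop whose shape evokes a familiar object but whose manipulation has no real-world counterpart, `classify_metaphor(True, False)` returns `Metaphor.NOUN`. Reducing each analogy to a yes/no answer is a simplification made for the sketch; in practice the designer's judgement is a matter of degree.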
    What does EC say about affordance? While GUIs rely little, or not at all, on the sensorimotor
abilities of operators, tangible user interfaces, by their physical existence, enable a more
intuitive integration of the digital world into the users’ environment. In other words, the
‘tangibility’ of the props makes it possible to provide affordances that are perceptible by the
human sensorimotor system. The fact that the props can be manipulated makes it easier to engage
the user’s proprioceptive abilities, which is impossible with graphical interfaces. Phillips & Ward
[7] argue that humans perceive the world in the form of affordances, i.e. that the environment
incorporates cues that unambiguously suggest actions to humans. This hypothesis assumes the
existence of a rapid and systematic link between perception and action that optimizes
decision making in the environment. In other words, affordances are visual cues that
carry information about action possibilities in the environment and are automatically
‘caught’ by the human sensorimotor system. Taking the same example as before, a cup
handle placed near a person is an affordance, because the handle invites the person to grasp
it. Note that if the cup is outside the individual’s zone of possible action, the person’s
motor cortex will not be activated. The possibility of action therefore marks the boundary
between peripersonal space (where direct action is possible) and extrapersonal space (where
action becomes possible only after moving) [8].
    1 The author of [6] chose the terms ‘metaphor of noun’ and ‘metaphor of verb’ because cognitive
psychologists claim that nouns and verbs seem to be deeply embedded in our consciousness.
To conclude on affordance by bringing EC and TI together: for EC, perception does not merely
play the passive role of representing information from the environment, as it does in
traditional cognitivist models; it guides action through the processing of affordances. As
noted above, GUIs rely little, or not at all, on the sensorimotor abilities of operators,
whereas the ‘tangibility’ of props provides affordances that the sensorimotor system can
perceive and engages the user’s proprioceptive abilities. Tangible interfaces are thus
consistent with an embodied vision of cognition, in which the function of perception is to
guide action.

3.2. Legibility
Legibility in TI. In 1992, Durrell Bishop designed the marble answering machine, described in [9].
It is the first example of an attempt to link the physical world to the digital world. The concept
is a telephone answering machine with tangible balls (see footnote 2). This system marks the
beginning of digital design. Bishop emphasises that, in the interest of communication, what is
important is not just what objects do, but what they convey, and how they convey it. He considers
that there is “an immediate legibility of the object that contrasts with the illegibility of
computing”. This remark is of course strongly related to the notion of affordance presented above.
We consider legibility in the sense given by D. Bishop and as part of the LAVA heuristics (Legible,
Actionable, Veritable, and Aspirational). For more on the different facets of legibility, see the
seven pages of section A.2.1 in the appendices of [10].
What does EC say about legibility? According to the principles of EC, perception encodes
environmental information according to the possibilities of action, i.e. according to the different
senses and motor actions solicited by the environment (sometimes in the form of affordances).
Considering that perception is based on the sensorimotor system, it seems logical that memory
is also based on this system, since one of its functions is to store the information provided
by perception (i.e. sensorimotor information). Procedural memory (a subdivision of human
memory) groups together the perceptual, motor and cognitive representations that are stored
in long-term memory and can be processed in working memory. These are dynamic
representations that allow the acquisition and performance of various sensorimotor skills. Access
to this memory is automatic (particularly in the presence of affordances). According to Tucker
and Ellis [11], the perception of an object that has already been manipulated reactivates the
possibilities of action on that object. When an object is perceived, it is processed directly.
Glenberg [12] considers that affordances are not only information perceptible in the environment
through possible actions, but are complemented by the observer’s prior knowledge of the object’s
functions or prior experience with the object, which leads to this legibility.
    To conclude on legibility by bringing EC and TI together: unlike graphical interfaces,
which are overloaded with digital information that requires costly ‘translation’ to move from
a symbol to an action, a tangible interface presents digital data in the form of manipulable
and affordant objects to optimise processing. In other words, tangible interfaces modify the
nature of the information to be processed so that it ‘matches’ the sensorimotor nature of the
knowledge stored in the operator’s memory. For the human brain, symbols presented on a
screen that represent the state of a system are not directly legible: they require two
translations, one from symbol to understanding and another from understanding to action. The
embodied approach to cognition reduces this distance by activating modes of processing that
are more physical, more primary, more adaptive, and therefore faster. This is exactly what TI
does: it makes digital data legible for humans by “speaking the language of embodiment”!
    2 The machine physically displays voice messages as beads, which can then be listened to or
deleted in any order. To listen to a message, the user takes the bead and places it in a defined
slot on the machine. The message can then be deleted, or the user can choose to store messages
outside the machine in a container.


4. Conclusion and Perspective
The above discussion of affordance and legibility is a first step, to be continued, towards
showing that EC is a relevant theoretical framework for TI design. Moreover, TI appears to be a
perfect technical solution for putting EC principles into practice. This discussion was
motivated by the context of UAS control, where current GUIs have been designed to provide as
much information as possible to operators. As missions have diversified, the complexity and
the amount of information have increased. Operators reported to us in informal meetings that
it is currently hard to find the information they need quickly and easily. Given that most
operators are former pilots, or at least trained pilots, integrating props into the interface
could improve the operator’s ability to understand the system and its operation. Thus,
by improving the way the information is understood and manipulated by the operators, the
difficulties linked to the interface are reduced, allowing them to focus more on the operational
elements (mission objectives, payload status, communication, etc.).
   The originality of this operational environment is that the digital data within the system
comes from a physical reality: the system is controlled remotely via digital data. Many
other fields of application would benefit from the possibility of embodying digital data
that translate a physical reality, e.g. remote control inside a nuclear power plant, but also
space and aeronautics, i.e. dirty and dangerous environments. Beyond the claimed poetic and
metaphorical aspects of TI, beyond the aestheticism and the often artistic side of TI, we are
convinced, and this has been discussed in [13], that TI is also very useful! For some critical
systems, TI combined with more classical interfaces can improve tasks such as maintaining one’s
drone system, not dropping it on someone, monitoring a nuclear power plant, or planning a
space walk. TI will offer the operator not only the symbolic data of classic graphical interfaces
but also their meaning. These domains definitely need this embodiment, together with a
theoretical framework to anticipate its effects and limits, and we believe that this
framework can be provided by the theory of EC.
   This leads us to ask the following question: can we still talk about ‘tangible interaction’
when we embody digital data that has a physical reality? This question appears to us to be a real
intellectual and theoretical challenge. It is a “mise en abyme”: giving a physical reality to virtual
data that represents a physical reality! This introduces the notion of embodiment of digital data
that represents a physical reality that is not accessible because it is remote.


References
 [1] L. W. Barsalou, Perceptual symbol systems, Behavioral and Brain Sciences 22 (1999)
     577–660. doi:10.1017/S0140525X99002149.
 [2] R. A. Zwaan, G. A. Radvansky, Situation Models in Language Comprehension and Memory
     (1998) 24. doi:10.1037/0033-2909.123.2.162.
 [3] L. Dutriaux, V. Gyselinck, Cognition incarnée : un point de vue sur les représentations
     spatiales, L’Année psychologique 116 (2016) 419–465. doi:10.3917/anpsy.163.0419.
 [4] L. W. Barsalou, Grounded Cognition, Annual Review of Psychology 59 (2008) 617–645.
     doi:10.1146/annurev.psych.59.103006.093639.
 [5] D. A. Norman, Affordance, conventions, and design, Interactions 6 (1999) 38–43.
     doi:10.1145/301153.301168.
 [6] K. Fishkin, A taxonomy for and analysis of tangible interfaces, Personal and Ubiquitous
     Computing 8 (2004). doi:10.1007/s00779-004-0297-4.
 [7] J. C. Phillips, R. Ward, S-R correspondence effects of irrelevant visual affordance: Time
     course and specificity of response activation, Visual Cognition 9 (2002) 540–558.
     doi:10.1080/13506280143000575.
 [8] E. Y. Coello, A. Bartolo, Language and Action in Cognitive Neuroscience (2013) 407.
     doi:10.4324/9780203095508.
 [9] B. Moggridge, Designing Interactions, MIT Press, Cambridge, Mass, 2007.
[10] B. Ullmer, O. Shaer, A. Mazalek, C. Hummels, Weaving Fire into Form: Aspirations for
     Tangible and Embodied Interaction, volume 44, 1 ed., Association for Computing Machinery,
     New York, NY, USA, 2022.
[11] M. Tucker, R. Ellis, On the Relations Between Seen Objects and Components of Potential
     Actions (1998) 17. doi:10.1037/0096-1523.24.3.830.
[12] A. M. Glenberg, What memory is for, Behavioral and Brain Sciences 20 (1997) 1–19.
     doi:10.1017/S0140525X97000010.
[13] P. R. Cohen, D. R. McGee, Tangible multimodal interfaces for safety-critical applications,
     Communications of the ACM 47 (2004) 41–46. doi:10.1145/962081.962103.