Epistemic Interaction – tuning interfaces to provide information for AI support

Alan Dix1,2,*, Ben Wilson1, Matt Roach1, Tommaso Turchi3 and Alessio Malizia3,4

1 Computational Foundry, Swansea University, Wales, UK
2 Cardiff Metropolitan University, Wales, UK
3 Department of Computer Science, University of Pisa, Pisa, Italy
4 Molde University College, Molde, Norway

Abstract
As machine learning integrates deeper into human–computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human–system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human–computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user–system engagements.

Keywords
human-AI interaction, user interface, artificial intelligence, design, adaptive interfaces, user experience

SYNERGY Workshop @ AVI 2024 – 17th International Conference on Advanced Visual Interfaces, June 03–07, 2024, Arenzano (Genoa), Italy
* Corresponding author.
Email: alan@hcibook.com (A. Dix); b.j.m.wilson@swansea.ac.uk (B. Wilson); m.j.roach@swansea.ac.uk (M. Roach); tommaso.turchi@unipi.it (T. Turchi); alessio.malizia@unipi.it (A. Malizia)
Web: https://alandix.com/ (A. Dix)
ORCID: 0000-0002-5242-7693 (A. Dix); 0009-0004-5663-5854 (B. Wilson); 0000-0002-1486-5537 (M. Roach); 0000-0001-6826-9688 (T. Turchi); 0000-0002-2601-7009 (A. Malizia)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1. Introduction

There is a growing expectation that forms of machine learning or other data-driven algorithms will enter virtually every aspect of human–computer interactions. In many examples the user's interactions in the physical or digital world are monitored in order to modify subsequent system behaviour. Usually this is done with minimal or no change to the primary interactions: you act as normal, but these actions are passively sensed and analysed. However, there is the potential for minor shifts in interaction that can substantially increase the information available for system adaptation. That is, we can deliberately design interactions to increase the knowledge available to the system – we call this epistemic interaction.

In the rest of this article we'll place this concept in the context of human–object and human–human communication. We are used to the idea of designing devices and systems so that users have a better understanding of what they are doing through visualisations, feedback and affordances. However, in human–human collaboration we constantly shape our actions so that they offer subtle, often implicit cues to others. In epistemic interaction, we extend this to human–system interactions. Before working through this more theoretical account, we'll consider a concrete example of epistemic interaction.

Figure 1: Central heating controllers: (left) classic design setting target temperature; (right) alternative minimal design with +/– "I want it warmer/cooler" buttons.
2. First example – An intelligent heating control

A traditional thermostatic heating control panel has a temperature knob that is twisted to select the desired temperature. Digital controllers often replace the dial with +/– or up/down arrow buttons to increase or decrease the target temperature (Fig. 1, left). A well-known problem with this sort of control is that many people confuse the rate of warming with the target temperature. For example, if the target temperature is 20 degrees and the room is only 15, the user, feeling cold, might increase the target temperature further even though the heating is already raising the temperature. This is a reasonable misunderstanding, as non-thermostatic systems often have similar-looking controls where the dial sets the amount of heat being produced. This confusion has the potential to become even more pronounced in an intelligent system that may proactively change the target room temperature based on, say, the user's activity or time of day.

Imagine instead a controller that, just as before, has +/– or up/down buttons; however, these are not about setting the target temperature but instead mean "I'm too cold/hot" (see Fig. 1, right). Pressing the '+' button means "I want it warmer than it is now" and maybe double-tapping it means "a lot warmer". If the intelligent system is already increasing the temperature, then the user's input is effectively ignored, although it would reinforce the system's learnt rules. If the system had thought the room was warm enough, then the '+' would tell it both to make it hotter now and to learn this for later.

Note how the small difference in the interaction technique changes the information available to the system to make adaptations. This is epistemic interaction in practice.
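To make this concrete, here is a minimal sketch of the second controller's logic in Python. It is purely illustrative – the class and field names (EpistemicHeatingController, ComfortEvent) are invented for this example, not part of any real controller – but it shows how a single button press both acts now and leaves behind a learning signal.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class ComfortEvent:
    """One 'I'm too cold/hot' press, kept as a learning signal for later adaptation."""
    timestamp: datetime
    room_temp: float
    direction: int        # +1 = "warmer please", -1 = "cooler please"
    strength: int = 1     # e.g. 2 for a double tap meaning "a lot warmer/cooler"

class EpistemicHeatingController:
    def __init__(self, target: float = 20.0):
        self.target = target
        self.history = []   # list of ComfortEvent – the data available for learning

    def press(self, direction: int, room_temp: float, strength: int = 1) -> None:
        # Act now: only nudge the target if the system is not already moving
        # the room temperature in the requested direction.
        already_adjusting = (direction > 0 and self.target > room_temp) or \
                            (direction < 0 and self.target < room_temp)
        if not already_adjusting:
            self.target += direction * strength
        # Learn for later: every press is recorded, even the effectively
        # 'ignored' ones, because it reinforces or contradicts the learnt rules.
        self.history.append(ComfortEvent(datetime.now(), room_temp, direction, strength))

# Room at 15 degrees with a target of 20: the press changes nothing now,
# but still tells the system "the user felt cold at 15 degrees at this time".
controller = EpistemicHeatingController(target=20.0)
controller.press(direction=+1, room_temp=15.0)
```

A conventional '+' button would perform only the first half of press(); the epistemic version differs mainly in keeping the history that later adaptation can draw on.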
3. Human–object interactions – by nature

3.1. Affordances in the natural world

Within HCI, affordances are the most well-known aspect of Gibson's ecological theory of perception [1]: the way in which the perceptible properties of objects, in a sense, announce their potential for human action. In a human–object interaction the objects are passive, and our human perceptual systems have evolved to immediately grasp this action potential; for example, that this stone is small enough to hold and yet large enough to break open a nut.

Of course, the world we live in now is full of human-made artefacts. A hammer, a car or a website rollover has not been around for the hundreds of thousands of years of human development. Sometimes these do recruit natural affordances; for example, the size of a hammer shaft suggests it can be grasped by hand. However, this does not in itself establish its full action potential, in particular its intimate connection to the nail and, ultimately, to joining pieces of wood together.

3.2. Affordance seeking in the constructed world

Happily, we are naturally affordance-seeking creatures, constantly learning the patterns around us, even of radically new classes of thing, so that we can establish new forms of perception–action relationships [2]. Early electric light switches recruited natural affordances: they are finger-sized and project out from the back plate, suggesting they can be manipulated by a fingertip. However, the relationship to turning on and off a remote device or light is learnt. Having learnt this, later electric light switches could be more subtle, and the idea of a switch or button that is pressed to make an effect was borrowed for on-screen buttons and toggles. Now these have themselves become part of children's cultural background, so that the idea of a coloured patch that you touch is part of many young children's implicit understanding before they are ever tall enough to reach a light switch.

3.3. Epistemic action

Another key term in ecological psychology is epistemic action. In both familiar and unfamiliar situations, if the perception of an object or space is in some way ambiguous or incomplete, we perform some sort of physical action in order to gain more knowledge. This may be a shift of our body position, perhaps moving our eyes or head; it may be a movement through space, such as looking round a corner to seek out a landmark; or it may involve manipulating an object, for example picking up a hammer to assess its weight.

Note that both affordance and epistemic action speak about the intimate relationship between perception and action: affordance says that perception is about seeking out action, and epistemic action says that action may be used to enhance perception.

4. Human–system interactions – passive sensing

Sometimes intelligence or AI is very upfront in a design, for example in a medical decision support system. However, in many examples the adaptations and interventions are more subtle and, crucially, the sensing is often passive. For example, while you shop on a website your choices are monitored and later this may be used to offer suggestions. Similarly, activity recognition systems may use vision techniques or IoT-enabled objects to enable context-sensitive services.

The term incidental interaction was coined for situations where the user's actions on a primary task are sensed, and the knowledge and learning from that may be used by the system to modify its behaviour on some subsequent secondary task [3, 4]. Sometimes the primary and secondary tasks are very closely related, for example an intelligent cookery aide watching you add ingredients and prompting you if you appear to have forgotten a stage of the recipe [5, 6]. In other examples the secondary task may be more indirectly related: in the Tiree internet-connected shop-open sign, for example, the act of turning on the illuminated 'open' sign is used by the system to change the display of web information [4].

Crucially, these systems usually passively sense the user's actions. The system may try to learn new patterns, but the primary task is designed to be as optimal as possible for its own purposes. It is the job of the system to make meaning from the interactions. In some ways the intelligent system is behaving like the affordance-seeking human encountering new kinds of objects.

5. Human–object interactions – by design

Norman's adoption of the concept of affordance took a design perspective on human–object interactions when the objects are themselves human-made artefacts [7]. Rather than relying on users having to learn and adapt to potentially arbitrary representations, Norman suggested that designers should deliberately seek to understand users' existing perceptual affordances (termed signifiers in Norman's later work [8]), whether innate or learnt, and then deliberately design systems so that their perceptual characteristics recruit the user's existing palette of affordances. When this was first introduced in the late 1980s, the primary sources of existing affordances were physical ones, such as electric light switches and physical buttons. However, now many will themselves be from our digital culture.
For example, if designing interactions for a large-scale public display or AR system, one might choose mid-air swiping actions that mimic those for in-contact swiping on a mobile phone, and make this affordance perceptually visible through design elements on the distant large display or AR overlay that look similar to those on the phone interface.

Looking back, it is clear that artefacts in the constructed world are not arbitrary, relying totally on humans' affordance- and meaning-seeking nature, but already take forms that recruit natural or existing cultural affordances. In some cases this may be because those that did not have simply fallen into disuse; in other cases it is due to the slow, gradual co-evolution of artefact design and cultural affordances over many generations [9]; and in some cases it is the skill of inspired crafts-folk. Norman and Gaver's early work on affordances [7, 10] brought this understanding to the user interface design world and made it explicit. That is, we deliberately design visual and interactive elements of systems so that they offer appropriate perceptual information to the user.

6. Human–human interactions – cooperation through action

Much of the earliest work on human communication focused on the explicit channels of the spoken or written word (e.g. grammar, semantics) and other symbol systems (e.g. semiotics). Later work highlighted the many side-channels in everyday conversation, from tone of voice to facial expression and eye movement, and the way these amplify or modify the raw words. Similarly, Austin and Searle's speech act theory uncovered the many layers of meaning that lie within and yet beyond the plain content of speech, and also the crucial performative role of communication as action, doing things in and of itself [11, 12].

Notions of repair in conversation analysis [13] and Clark and Brennan's common ground [14] go beyond single statements of speech and show how the patterns of conversation continually act both to carry forward a primary topic and, at the same time, to monitor and offer feedback and confirmations of understanding. For example, when asked "when does it leave?" while standing at the Swansea bus stop, one might answer "the Swansea bus leaves at ten past the hour", implicitly confirming the contextual understanding of the question (the Swansea bus, not some other bus or a train) and thus offering the interlocutor the opportunity, but only if needed, to correct the interpretation.

Analyses of collaborative situations have established that this shaping to allow interpretation goes well beyond direct communication. When physically moving a large object, such as furniture, one might explicitly say things like "can you lift your end a bit?", but much of the communication is felt through one's mutual actions on the object – feedthrough [15]: if you push to the left, your companion feels the pressure and will move slightly to their right. Furthermore, you might explicitly exaggerate your movements so that they are more easily interpreted – onomatopoeic action [16, 17]. Ethnographies of collaboration have shown that this is also true when the collaboration is not so direct.
Notably, Heath and Luff's classic study of the London Underground control room [18] demonstrated how subtle cues, such as half-overheard telephone conversations or the way someone was looking towards the large display, enabled controllers to modify their behaviour or prepare for future, more direct interactions, even if they were not consciously aware of it. One of the key concepts of ethnomethodology is the notion of accountability [19], the way in which actions are "put together as publicly observable, reportable occurrences" [20]. That is, in human–human interactions we are constantly shaping the way we perform an action in order to make it comprehensible to others.

7. Epistemic interaction – sense-able by design

Epistemic interaction is a natural extension of this human–human communication principle. Even if the optimal way to carry out a task would make it invisible to others, we may choose, implicitly or explicitly, to adapt our performance so that it is more apparent. Can we do the same for human–system collaboration, adapting interactions so that they offer the system more information from which to make inferences about our goals, beliefs and behaviours? Taking another parallel, the use of affordances in HCI encourages us to design system appearances and behaviours so that their otherwise invisible action potential is apparent to users. Can we explicitly design interactions that expose otherwise hidden user information and thus allow systems to learn better?

You can probably think of examples where this is done with explicit additional user interactions. For example, after reading help text one is often asked to give a thumbs up/down to say whether the information presented was useful. If this is optional, many users simply skip it, reducing the amount of relevance feedback or adding bias to it. If it is made mandatory or hard to ignore, it risks adding friction to the interface, reducing user engagement and damaging user experience. Looking back to the notions of primary and secondary task from incidental interaction, the relevance feedback is likely to be of use in aiding future interactions, possibly for other users (secondary task), but the user is being asked to put in additional work now (primary task).

The most successful incidental interactions are those where the primary task is not explicitly interrupted, but is sensed in order to improve future interaction. For example, past purchasing behaviour (implicit positive relevance) may be used to produce recommendations; advertisers use click-through rates in a similar way. The design challenge of epistemic interaction is to make small changes to the primary task that do not add noticeably to the perceived effort, but which increase the information available for adaptation.

A–B testing can be seen in this light: small changes are made to the system and users' behaviour is logged; the variant with the better outcomes is then chosen as the preferred long-term design. By definition, there is an expectation that one of the variants will produce slightly worse user outcomes (in terms of efficiency or experience), but the variants are usually close enough that users are utterly unaware of the difference (e.g. pixel-level placement), or do not care. Arguably A–B testing is self-referential, as the information revealed is used precisely to make the design choice, but it does demonstrate that epistemic interaction is not just possible, but widely practised.
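The mechanics are simple, as the following illustrative Python sketch shows (the function names assign_variant, log_outcome and better_variant are invented for this example): each user is deterministically assigned to a variant, their outcomes are logged, and the better-performing design is chosen later. The interaction variant itself is the information-gathering instrument.

```python
import hashlib
from collections import defaultdict

VARIANTS = ("A", "B")   # two near-equivalent designs, e.g. pixel-level placement

def assign_variant(user_id: str) -> str:
    """Deterministically assign a user to a variant (stable across sessions)."""
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    return VARIANTS[int(digest, 16) % len(VARIANTS)]

outcomes = defaultdict(list)   # variant -> list of task successes (1) / failures (0)

def log_outcome(user_id: str, success: bool) -> None:
    outcomes[assign_variant(user_id)].append(1 if success else 0)

def better_variant() -> str:
    """The long-term design choice: the variant with the higher success rate."""
    rates = {v: sum(o) / len(o) for v, o in outcomes.items() if o}
    return max(rates, key=rates.get)

# Example usage
log_outcome("user-17", success=True)
log_outcome("user-42", success=False)
print(assign_variant("user-17"), better_variant())
```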
In some ways this has connections with the expected, sensed, and desired (ESD) framework [21] for (primarily physical) sensing-based systems. In the ESD framework one explores what is currently sensed about an object by the system, what behaviour is expected when an action is made on the physical object, and any other interactions that are desired. The framework creates the space to think about gaps in the sensing (often prompted by expected behaviour) and gaps in action potential (desired) that could be delivered by sensed (or potentially sensed) actions that are not currently mapped to a system response. The domain of ESD is similar to that of implicit interaction [22], where the sensing is primarily about enabling or contextually modifying actions in the primary task. For epistemic interaction we are typically (but not solely) concerned with information for a secondary task; however, this could be the same as the primary task at a future date, as in the intelligent heating example.

Operationalising epistemic interaction has two aspects:

• a design challenge – matching alternative possible interactions to information that would be useful for future interaction adaptations or applications
• a selection criterion – using this fit and the additional value of the information to weigh against other criteria

For the first, one needs to consider different potential interactions and what can be sensed under each alternative, and also think about the potential purposes or needs for information. One can then assess the extent to which the sensed data can feed into different kinds of information, subject of course to the usual assessment of privacy and consent. For the second, the value of the information gained by each alternative needs to be set alongside other criteria such as user experience, interaction efficiency, computational cost and development effort. Clearly, if one alternative has strong advantages on user-centred criteria it would be chosen, but where alternatives are close or uncertain, the information-gathering potential can be used as a decider, as sketched at the end of this section. The intention is to have an end interaction that is a viable and reasonable one for the primary task, but which as a side effect (incidentally) delivers useful information.

One potential problem with such interactions is that the lack of an explicit information-gathering action means the user will be less aware that their actions are being monitored. However, being largely invisible at the moment of gathering does not mean that the monitoring should not be upfront in the overall system design, nor that the information gathered should not be available for scrutiny. In some cases this is a minimal concern, for example where the learning and adaptations purely influence the user's own home. However, where the information is for broader learning and interaction by others, for example the use of learning analytics to improve future students' studies, more explicit consent may be required.
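The selection criterion can be made concrete with a small, purely illustrative Python sketch (the alternatives, scores and the margin value are invented for this example, not taken from any study): each candidate design is scored on user-centred criteria, and the epistemic value of the information it yields is used only as a tie-breaker among designs that are close on user experience.

```python
from dataclasses import dataclass

@dataclass
class Alternative:
    name: str
    user_experience: float   # user-centred score, e.g. from a study (0-10)
    info_value: float        # value of information yielded for adaptation (0-10)

def choose(alternatives, ux_margin: float = 1.0) -> Alternative:
    """Prefer user experience; use information value only as a decider
    among alternatives within ux_margin of the best on user experience."""
    best_ux = max(a.user_experience for a in alternatives)
    close = [a for a in alternatives if best_ux - a.user_experience <= ux_margin]
    return max(close, key=lambda a: a.info_value)

# Hypothetical designs and scores, purely for illustration
designs = [
    Alternative("design A", user_experience=8.0, info_value=2.0),
    Alternative("design B", user_experience=7.5, info_value=7.0),
    Alternative("design C", user_experience=6.0, info_value=8.0),
]
print(choose(designs).name)   # -> "design B": close to A on UX, richer information
```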
8. Example – scrolling vs accordion information displays

As a second practical example, let's consider the choice of interaction technique for showing selected snippets of information in a help system or search results. Various documents have been selected using an algorithm and are available in the form of a title, a paragraph-length snippet (~10 lines) and a link to the full document. Several design alternatives are being considered:

1. a standard search results listing with the title, a small snippet and a link to the full document;
2. a long scrolling page with titles and either a substantial extract or the full text;
3. the same but as an accordion, with a title, a tiny snippet and an open/close control to show the full extract;
4. a variant of (3) where at most one section is open at a time (the previously open section is closed as a new one is opened).

Figure 2 illustrates some of these options using the search-within-the-book feature at https://hcibook.com/e3, which allows paragraph-level searching within a textbook [23]. On the left is option (2), the current interface, with a substantial extract (full paragraph) in a scrollable list. On the right, we see a list of titles plus single-line snippets in (an envisionment of) an accordion-style interface as would be used by options (3) and (4). Option (1) would look rather like the right-hand image, but with links to single-extract pages rather than accordion controls.

Figure 2: Two of the options for search results: (left) scrollable title + extract in search within the book at https://hcibook.com/e3/; (right) envisionment of accordion-style interaction with title + tiny snippet and open/close buttons.

An advantage of (1) is that standard web analytics can be used to measure, for example, click-through and time spent on each page. This can help assess the actual relevance of the information and the information scent of the titles [24]. However, the navigation between pages may become annoying for some users, especially on a mobile device with a poor signal, compared with a single-page design. We might therefore reject design (1), but considering it has led us to think about its advantages and hence the potential for gathering implicit relevance feedback.

Let's say option (2) comes out best in terms of actual user interaction, based on a user study or perhaps the designer's intuition: it is easiest to rapidly scroll up and down the display spotting things of interest. It is possible to study such scrolling patterns from detailed interaction traces [25], but this is hard to do and fairly uncertain, especially as an automated exercise.

However, in options (3) and (4) we have easy-to-harvest information, not unlike the click-through data from (1). Opening an accordion section says "this looks interesting"; the time spent before closing it or opening another section gives a measure of the interest of the underlying snippet; and then click-through and dwell time on the full document can tell us how good the snippet paragraph was as information scent. Option (4) is slightly more informative than (3) because with option (3) the user might first rapidly open several potentially interesting titles, and then scroll up and down them. The trace of openings might then suggest that only the last-opened title had actually been of value.
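To show how such signals might be harvested, here is a small illustrative Python sketch (the Event format and the function relevance_signals are invented for this example; a real implementation would sit on whatever interaction logging the site already has). It turns a log of open, close and click-through events into per-result implicit relevance signals of the kind described above.

```python
from dataclasses import dataclass

@dataclass
class Event:
    """One logged interaction with a result: 'open', 'close' or 'click_through'."""
    time: float       # seconds since the results page loaded
    result_id: str
    kind: str         # "open" | "close" | "click_through"

def relevance_signals(events):
    """Derive per-result implicit signals: was the section opened, how long was
    the snippet dwelt on, and did the user click through to the full document."""
    signals = {}
    open_time = {}
    for ev in sorted(events, key=lambda e: e.time):
        sig = signals.setdefault(ev.result_id,
                                 {"opened": False, "dwell": 0.0, "clicked": False})
        if ev.kind == "open":
            sig["opened"] = True                  # "this looks interesting"
            open_time[ev.result_id] = ev.time
        elif ev.kind == "close" and ev.result_id in open_time:
            sig["dwell"] += ev.time - open_time.pop(ev.result_id)   # snippet dwell
        elif ev.kind == "click_through":
            sig["clicked"] = True                 # snippet had good information scent
    return signals

# Example: open r1, dwell 12s, click through; briefly open r2 and close it again.
log = [Event(1.0, "r1", "open"), Event(13.0, "r1", "close"),
       Event(13.5, "r1", "click_through"),
       Event(20.0, "r2", "open"), Event(22.0, "r2", "close")]
print(relevance_signals(log))
```

Long dwell without click-through (the "sufficiency of snippet" row in Table 1) and click-through after reading the snippet (snippet information scent) then fall out of these signals directly.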
Table 1 summarises the various criteria and the order of the options. While option (2) offers the best immediate user experience, other options, particularly (3) and (4), offer better information. Of course, this information can be used to improve future behaviour and hence future user experience.

Table 1: Criteria for selecting the result display format

Criterion                               | Measured by                                                  | Option order | Comment
User experience                         | Expert analysis or user study                                | 2>3>4>1      | –
Information scent of title              | Click-through to relevant snippets/docs                      | 1>4>3>2      | For (2) impossible to distinguish scent of title and snippet
Sufficiency of snippet                  | Long dwell on snippet, but no click-through                  | 4>3>2>1      | Could also mean confusing but irrelevant
Information scent of snippet paragraph  | Click-through to relevant document after reading the snippet | 4=3=2>1      | –
Relevance of document                   | Dwell time on final document                                 | All equal    | –

If the differences in user experience between (2) and (3) are small, then one might prefer (3) to (2) as a final choice. Of course, one can always obtain a finer-grained understanding of the difference in user experience between the options using A–B testing!

Note too that the user experience differences between the options may also depend on the device and situation. For example, (4) might be a little annoying on a large screen where there is ample screen real estate to open multiple results to read and compare, but on a small-screen mobile device it may be no worse or better than (3). That is, one might exploit differences between devices, making small variations in the interface so that additional information gained on some devices might help the same user, or another user, on a different device at a different time.

9. Summary

This paper has developed the concept of 'epistemic interaction', the way in which user interactions can be subtly redesigned in order to increase the information available for cooperating AI. Epistemic interaction has been placed within the existing rich theoretical context of human–object and human–human communication, including ecological psychology's ideas of affordance and epistemic action, and ethnomethodological concerns with accountability. By explicitly identifying epistemic interaction, we hope to develop operationalisable design advice and have presented some first steps in this direction.

Acknowledgments

Funded by the European Union. Views and opinions expressed are, however, those of the author(s) only and do not necessarily reflect those of the European Union or the European Health and Digital Executive Agency (HaDEA). Neither the European Union nor the granting authority can be held responsible for them. This workshop is supported by the HORIZON Europe project TANGO - Grant Agreement n. 101120763.

References

[1] J. J. Gibson, The ecological approach to visual perception: classic edition, Houghton Mifflin, 1979.
[2] E. J. Gibson, A. D. Pick, An ecological approach to perceptual learning and development, Oxford University Press, USA, 2000.
[3] A. Dix, Beyond intention – pushing boundaries with incidental interaction, in: Proceedings of Building Bridges: Interdisciplinary Context-Sensitive Computing, Glasgow University, volume 9, 2002, pp. 1–6.
[4] A. Dix, Activity modelling for low-intention interaction, in: The Handbook of Formal Methods in Human-Computer Interaction, Springer, 2017, pp. 183–210.
[5] R. Blasco, Á. Marco, R. Casas, D. Cirujano, R. Picking, A smart kitchen for ambient assisted living, Sensors 14 (2014) 1629–1653.
[6] T. Kosch, P. W. Woźniak, E. Brady, A. Schmidt, Smart kitchens for people with cognitive impairments: A qualitative study of design requirements, in: Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, 2018, pp. 1–12.
[7] D. Norman, The design of everyday things: Revised and expanded edition, Basic Books, 2013.
[8] D. A. Norman, The way I see it: Signifiers, not affordances, Interactions 15 (2008) 18–19.
[9] D. Vyas, C. Chisalita, A. Dix, Dynamics of affordances and implications for design, CTIT Technical Reports TR-CTIT-08-36, The University of Twente, Centre for Telematics and Information Technology, 2008. http://doc.utwente.nl/64769/.
[10] W. W. Gaver, Technology affordances, in: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 1991, pp. 79–84.
[11] J. L. Austin, How to do things with words, Harvard University Press, 1975.
[12] J. Searle, Speech Acts: An Essay in the Philosophy of Language, 1969.
[13] D. Frohlich, P. Drew, A. Monk, Management of repair in human-computer interaction, Human–Computer Interaction 9 (1994) 385–425.
[14] H. H. Clark, S. E. Brennan, Grounding in communication, American Psychological Association, 1991.
[15] A. Dix, Computer supported cooperative work: a framework, in: Design Issues in CSCW, Springer, 1994, pp. 9–26.
[16] A. Dix, Language and action (2): from observation to communication, posted on May 18, 2009. https://alandix.com/blog/2009/05/18/language-and-action-2-from-observation-to-communication/.
[17] A. Dix, S. Gill, D. Ramduny-Ellis, J. Hare, Section 15.7 The origins of language, in: TouchIT: Understanding Design in a Physical-Digital World, Oxford University Press, 2022.
[18] C. Heath, P. Luff, Collaboration and control: Crisis management and multimedia technology in London Underground line control rooms, Computer Supported Cooperative Work (CSCW) 1 (1992) 69–94.
[19] H. Garfinkel, Studies in ethnomethodology, in: Social Theory Re-Wired, Routledge, 2023, pp. 58–66.
[20] G. Button, W. Sharrock, The organizational accountability of technological work, Social Studies of Science 28 (1998) 73–102.
[21] S. Benford, H. Schnädelbach, B. Koleva, R. Anastasi, C. Greenhalgh, T. Rodden, J. Green, A. Ghali, T. Pridmore, B. Gaver, et al., Expected, sensed, and desired: A framework for designing sensing-based interaction, ACM Transactions on Computer-Human Interaction (TOCHI) 12 (2005) 3–30.
[22] A. Schmidt, Implicit human computer interaction through context, Personal Technologies 4 (2000) 191–199.
[23] A. Dix, J. Finlay, G. Abowd, R. Beale, Human-Computer Interaction, Pearson Education, 2003.
[24] S. K. Card, P. Pirolli, M. Van Der Wege, J. B. Morrison, R. W. Reeder, P. K. Schraedley, J. Boshart, Information scent as a driver of web behavior graphs: Results of a protocol analysis method for web usability, in: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 2001, pp. 498–505.
[25] A. Dix, Challenge and potential of fine grain, cross-institutional learning data, in: Proceedings of the Third (2016) ACM Conference on Learning at Scale, 2016, pp. 261–264.