Building Supportive Multimodal User Interfaces

José Coelho
LaSIGE, University of Lisbon
Campo Grande, Edifício C6, Piso 3, 1749-016 Lisboa, Portugal
+351 21 750 05 32
jcoelho@lasige.di.fc.ul.pt

Carlos Duarte
LaSIGE, University of Lisbon
Campo Grande, Edifício C6, Piso 3, 1749-016 Lisboa, Portugal
+351 21 750 05 19
cad@di.fc.ul.pt

ABSTRACT
In this paper, we describe and discuss solutions capable of helping in the development of supportive multimodal user interfaces. Based on the specifications and design of the European Union funded project GUIDE (Gentle User Interfaces for Elderly People), we show how it is possible to use several modalities of interaction, as well as to adapt UIs, as a means of providing users with ideal interaction in every application, and of preventing or resolving errors resulting from missed or wrong user-device inputs.

Keywords
Supportive multimodal user interfaces, adaptation, GUIDE, UI translation.

INTRODUCTION
In this paper we introduce some of the mechanisms present in the ongoing GUIDE project that are intended to help developers implement supportive user interfaces.

GUIDE Project
GUIDEi aims to offer multimodal interaction to elderly (and disabled) users, with the goal of simplifying interaction with a television (TV) and set-top box (STB) based system. By pointing at the screen, making gestures, issuing speech commands, interacting with a Tablet PC, using the remote control, interacting with an Avatar, or simply relying on intuition to combine several of these modalities, the GUIDE framework makes it possible to fit interaction to users' characteristics and preferences, and enables impaired users to interact with the TV.

Regarding supportive interaction, the use of Avatars is explored with the goal of offering users a persona they can relate to while interacting with the system. The Avatar works like someone who explains to users the interaction steps needed to execute tasks, and helps them get out of "trouble" after an error has occurred while using the system. Moreover, the existence of generic, as well as content-specific, speech commands makes intuitive interaction a reality in GUIDE. Additionally, pointing interaction through a video-based gesture tracking sensor is assisted by cursor adaptation techniques that make it easier to select content on the screen, further supporting interaction.

This diversity of devices and modalities of interaction offers users the flexibility to use whatever medium they find most appropriate in a given context, while benefiting from visual (text, images, video and animations), audio (speech and other sounds) and haptic (vibration) feedback. These multimodal capabilities are, in fact, the first step towards a supportive interaction.

Considering the variety of differences among elderly users, and their preferences when using a system like this, GUIDE will cluster its users into different User Profiles (UPs) - transparent to every user - where data concerning interaction preferences and constraints is saved. By making use of each UP, GUIDE will try to adapt User Interface (UI) elements to fit every user.

In addition to providing supportive use, the GUIDE framework supports UI adaptation for every running application. Moreover, it aims to provide this support while requesting little extra effort from developers. Since developers cannot be expected to provide different versions of their applications for users with different characteristics, GUIDE will develop tools to "translate" a "standard" UI into UIs tailored to every type of user. The extra effort asked of developers consists in identifying each interactive UI component with WAI-ARIA ii semantic tags. With that information, GUIDE will abstract the UI characteristics and save them in an Application Model (AM), one per application, making run-time adaptation of UI components possible.
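To make the required annotation effort concrete, a fragment like the following shows standard WAI-ARIA roles and properties exposing the semantics of each interactive component, from which an AM could be derived. This is our own illustration: the menu, its labels and ids are hypothetical, not taken from a GUIDE application.

    <!-- Hypothetical application menu annotated with WAI-ARIA roles,
         from which GUIDE could derive its Application Model (AM). -->
    <div id="main-menu" role="menu" aria-label="Main menu">
      <button role="menuitem" aria-label="Watch TV">Watch TV</button>
      <button role="menuitem" aria-label="Video call">Video call</button>
      <button role="menuitem" aria-label="Settings">Settings</button>
    </div>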
Problem Description
Nowadays, most UIs lack the capability of guiding users towards an adequate and efficient interaction [1], when ideally "the UI must guide the user in accomplishing a task the application was designed for" [4] by providing help and appropriate feedback about features, tasks, modalities and contexts of interaction. If a user is not capable of perceiving an application and reacting to errors while interacting, sooner or later he or she is going to abandon it and adopt a more usable application. Unsatisfied users will prefer a supportive interface that can fit and adapt to their characteristics. If this is true for so-called typical users, for elderly users it is even more relevant. Because these users are usually characterized by one or multiple impairments (for example, hearing difficulties, visual incapacity, or motor constraints), adequate interaction is only possible when the system is capable of adapting its UI components and modalities of interaction to these users' specific characteristics.
Therefore, in the development of supportive multimodal user interfaces for elderly or impaired users, several questions need to be answered so that an appropriate application and interaction can be implemented:
•	How to let your users know how to interact?
•	How to know your users?
•	How to help users after a mistake has been identified?
•	How to present content and interaction possibilities in the most suitable way to the users?

In the remainder of this paper, we describe the approaches followed in GUIDE to offer solutions to the questions identified above, by supporting multimodal interaction and UI adaptation for elderly users of a TV and STB based system. Special attention also goes to the way this framework provides every application with the possibility of adapting to different contexts of interaction, and to ideas on how these types of users could personalize UI presentation and interaction while preserving usability.

ANSWERING THE QUESTIONS

How to let your users know how to interact?
For an efficient interaction to be a reality, users need to know the available ways of performing each task. They have to know, to the full extent, all the possibilities and modalities available when confronted with different difficulties and contexts of interaction. Only by understanding how they can interact can they make the most of the interface being presented and understand how to use all the features provided by the application. For example, if a visual interface with a menu is presented on the screen and the user does not know he or she can speak the name of a specific button to select it, a lot of time can be lost performing the task through alternative modalities (the only ones the user knows about), such as selecting the button by pressing remote control keys in a certain sequence, or by pointing at the screen with the remote control.

GUIDE will try to instruct the users before they start interacting with any of the framework's applications. For this, it will use an application called the User Initialization Application (UIA) to give the user a clear understanding of the possible ways of interacting. Users will be guided through the experimentation of the various modalities of interaction available in the framework, like pointing at the screen, issuing speech commands, pressing remote control buttons, etc. For this purpose, the UIA will present on the screen a tutorial with scripted animations of how to perform different gestures, informing the user of the set of speech commands he or she can issue to achieve typical tasks, and providing instructions on how to interact with other components of the system, like the Avatar engine, the Tablet PC, etc. Throughout this process the user has an active role, learning by experimenting with every interaction modality and device.
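Such a tutorial could, for instance, be driven by a simple declarative script, with one entry per modality to demonstrate. The sketch below is ours, not the UIA's actual format; playAnimation and waitForUserTry stand in for unspecified framework facilities.

    // Hypothetical UIA tutorial script: each step names a modality, the
    // scripted animation to play, and the prompt inviting the user to try it.
    const tutorialSteps = [
      { modality: "pointing", animation: "point-at-screen",
        prompt: "Try pointing at the highlighted button." },
      { modality: "speech", animation: "speak-command",
        prompt: "Try saying: 'Show TV guide'." },
      { modality: "remote", animation: "press-ok",
        prompt: "Try pressing the OK key on your remote." }
    ];

    function runTutorial(steps, playAnimation, waitForUserTry) {
      for (const step of steps) {
        playAnimation(step.animation);              // demonstrate the modality
        waitForUserTry(step.modality, step.prompt); // let the user experiment
      }
    }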
How to know your users?
For users to understand an interface and know how to interact with it, it helps considerably if the interface knows the user in advance. Only by knowing beforehand the user's preferred ways of interacting, as well as his or her impairments and difficulties, is it possible to build or adapt the interface for an appropriate and efficient interaction. For example, if the system does not have enough information about the user to know that he or she is blind, and presents a visual interface, no interaction will occur at all, and the system will not be used. As a second example, if the user prefers to interact by pointing and the system presents a simple visual interface that only accepts remote control input, he or she will be less motivated to use and adopt that system (and there is a higher probability of errors during interaction).

GUIDE will try to collect information about its users before they interact with any of the system's applications. To this end, the UIA will also be used for collecting data about users. Every time a new user starts using the system, the UIA is presented on the screen combined with audio output (covering possible situations of severe auditory or visual impairment) and the user is asked to perform a series of tasks probing his or her capabilities. First, the user is "registered" in the application using his or her name and facial and vocal characteristics, so that from that point on, every time he or she wants to use the system, the corresponding UP can be loaded from these properties. Next, the application tries to determine whether the user has a visual impairment by presenting text on the screen and asking for feedback (figure 1), e.g. presenting a sentence, expecting the user to adjust the font until he or she feels comfortable reading it, and then asking the user to read the sentence out loud to make sure he or she is in fact seeing it well. If the user passes this test, different configurations of text font and buttons, as well as several background and text colors, are tried out to understand his or her preferences regarding visual interfaces. If the user fails the test, the font size is raised screen by screen until the system understands how severe the user's visual impairment is.
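The font-size part of this test might be driven by a loop like the one below, a minimal sketch under our own assumptions (the paper does not specify the UIA at this level of detail; showSentence is a stub, and the starting size and step are arbitrary).

    // Sketch of the UIA font-size calibration step. The user grows or
    // shrinks the sample text until comfortable, then confirms; the chosen
    // size is stored in the profile that will seed his or her UP.
    const userProfile = {};

    function showSentence(text, px) {        // stub renderer for the sketch
      console.log("[render at " + px + "px] " + text);
    }

    function calibrateFontSize(onDone) {
      let px = 24;                           // arbitrary starting size
      const sentence = "Please read this sentence aloud";
      showSentence(sentence, px);
      return function handleCommand(cmd) {   // cmd: "bigger" | "smaller" | "confirm"
        if (cmd === "bigger") showSentence(sentence, px += 4);
        else if (cmd === "smaller") showSentence(sentence, px -= 4);
        else if (cmd === "confirm") { userProfile.preferredFontSize = px; onDone(px); }
      };
    }

    // usage:
    const command = calibrateFontSize(px => console.log("saved:", px));
    command("bigger");
    command("confirm"); // stores 28px in the profile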
Figure 1: UIA prototype. Example of a visual test where the user has to read out loud the text presented on the screen, and increase or decrease the text size to his or her preference.

For every other modality of interaction, similar tests are presented to the user, and data about the user's impairments and preferences is collected. For example, the user is asked to perform different gestures, or to point at different locations on the screen, to assess motor capabilities; to repeat out loud what he or she has heard, to assess hearing capabilities (figure 2, top); and to play memory and interpretation "games" designed to test his or her cognitive capabilities (figure 2, bottom).

Figure 2: UIA prototype. Examples of audio (top) and cognitive (bottom) tests presented to GUIDE users.

From the results obtained in GUIDE user trials and from discussions with developers, we also know it is extremely important that the UIA be presented to users in the form of a simple and quick tutorial, so that elderly users do not feel they are being evaluated. If the UIA takes too long, users will lose interest and will not want to use the system.

User information can be collected explicitly with the UIA, but also implicitly through run-time analysis of the user interaction logs. After the user has gone through the whole UIA process, he or she starts interacting with different applications. Information concerning every task performed and every modality used is saved by the system in logs. A rule-based inference engine analyzes this data and draws conclusions about user preferences and difficulties (for example, if the user makes consecutive errors when pointing at the screen to select a menu button, the system concludes he or she has difficulties using that modality, and tries to increase the size of the buttons before suggesting a change of interaction modality). These conclusions enrich the data collected in the first process. All the data collected by the UIA and by the run-time process is saved in a user model and used to adapt every application running on the framework [2].
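The pointing-error rule mentioned above could look roughly like this. It is an illustrative sketch: the log entry and profile fields are invented, and the thresholds are arbitrary.

    // Rule: consecutive pointing errors => enlarge buttons first, and only
    // after persistent failure flag a modality change suggestion.
    function inferPointingDifficulty(log, profile) {
      let streak = 0, maxStreak = 0;
      for (const entry of log) {             // entry: { modality, error }
        if (entry.modality !== "pointing") continue;
        streak = entry.error ? streak + 1 : 0;
        maxStreak = Math.max(maxStreak, streak);
      }
      if (maxStreak >= 3) profile.buttonScale = (profile.buttonScale || 1) * 1.25;
      if (maxStreak >= 6) profile.suggestModalityChange = true;
      return profile;
    }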
How to help users after a mistake has been identified?
A supportive UI is one which tries to be aware, at all times, of whether the user is lost in the interaction, or is making too many interaction errors to enjoy an efficient use of the application. Accordingly, one of the biggest challenges in guiding the user through the interaction is how to identify or perceive that he or she is lost, and when the application or interaction is generating errors. Only after identifying this can the application try to help the user and suggest alternative ways to achieve the desired goal. This is, however, a difficult task, because many dimensions are involved at run-time. If the user misreads or misinterprets the interface's structure and meaning, that by itself can result in interaction mistakes. There are also many possible errors caused by changes in the context of the interaction, such as the physical and social aspects of the environment. For example, if a user is interacting through speech input and the noise in the room increases, the system can fail to interpret the issued command because of the background noise, or a wrong command can be recognized instead (this can also happen when another person speaks to the user during the interaction).

Interaction mistakes will be identified in GUIDE by analyzing the interaction at run-time and watching for unrecognized inputs. Because in this framework users can interact with UIs through different modalities (and devices), individually or in combination, the system has to be alert for many different kinds of error (a detection sketch in code follows the list), such as:
•	Unrecognized commands issued when speech input is performed.
•	Selection of meaningless coordinates (coordinates not related to any interactive UI content) when pointing with a finger.
•	Unrecognized gestures performed by the user.
•	Errors resulting from remote control commands.
•	Repeated errors when interacting with a given device or modality (consecutive errors could suggest that a switch of modalities is required).
•	Long periods with no selection registered, but with screen navigation occurring (which may suggest that the user is lost or does not know what to do).
•	Errors resulting from incomplete fusion of input modalities.
•	Contradictory instructions arriving simultaneously from different modalities.
•	No input received after the system started a task requiring user feedback.
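The detection sketch promised above could take the following shape; the event fields, category names and threshold are our assumptions, not GUIDE's published design.

    // Map raw input events to the error categories listed above, and watch
    // for runs of consecutive errors on any modality.
    function classifyInputEvent(event, ui) {
      switch (event.modality) {
        case "speech":   return event.recognized ? null : "unrecognized-speech-command";
        case "pointing": return ui.interactiveElementAt(event.x, event.y)
                           ? null : "meaningless-coordinates";
        case "gesture":  return event.recognized ? null : "unrecognized-gesture";
        case "remote":   return event.valid ? null : "remote-control-error";
        default:         return null;        // no error detected
      }
    }

    function watchForRepeatedErrors(events, ui, threshold) {
      let consecutive = 0;
      for (const e of events) {
        consecutive = classifyInputEvent(e, ui) ? consecutive + 1 : 0;
        if (consecutive >= threshold) return "suggest-modality-switch";
      }
      return null;
    }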
Additionally, every time a change in the context of interaction occurs, the system has to be alert for periods of inactivity or for unexpected inputs; using the interaction logs, the system tries to prevent some errors from happening when there is a clear understanding of their causes.

A supportive UI has to be capable of helping users every time there is a mistake in the interaction [4]. However, in modern applications help is a capacity "created ad-hoc" [4], meaning it was generated beforehand and does not cover run-time situations not originally foreseen by the designers. For this reason, UI design does not cover every situation where a user needs help responding to UI or interaction difficulties. Helping the user is therefore not something that can easily be done in a predefined manner before the user starts using the system; it requires some run-time "intelligence" from the supportive system or interface. For example, if a user is navigating a menu through speech input and his or her dog enters the room and starts barking, the system will receive a series of consecutive unrecognized inputs, and the user will be in a situation that was not taken care of in the design process, which can result in aborting the interaction with the application.
As GUIDE is strongly based on multimodal interaction, one of its ways of helping users after a mistake has been identified relies on suggesting a change in the modality of interaction. This change is, however, based on each user's preferences and characteristics, first identified by the UIA and by the interaction logs, as well as on the context of interaction and on the task being performed at that moment [2]. So, as the user has already "ranked" the modalities of interaction by preference (and based on constraints), every time repeated errors occur while interacting with one single modality, another modality is suggested to the user, who accepts or rejects it in order to continue the interaction. This will also be the procedure every time a change in the context of interaction happens [2] (for example, when the dog starts barking, the system will not recognize the barks as speech commands - rather, the barks will be interpreted as background noise - and will suggest that the user continue interacting by pointing).
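That suggestion policy can be reduced to a few lines. This is a sketch: the ranking would come from the UP built by the UIA and the interaction logs, and the names are ours.

    // Suggest the best-ranked modality other than the one that keeps failing.
    function suggestAlternative(rankedModalities, failingModality) {
      for (const m of rankedModalities) {    // most preferred first
        if (m !== failingModality) return m;
      }
      return null;                           // no alternative available
    }

    // usage: after repeated speech errors (the barking dog), suggest pointing
    suggestAlternative(["pointing", "speech", "remote"], "speech"); // -> "pointing"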
Another way of helping users is to present them with relevant information related to the context of the error they have just made, such as presenting alternative modalities of interaction and showing how to use them when the context of interaction changes, or showing information related to the task they are performing whenever there are modality recognition errors or long pauses in the interaction (for example, when the user is pointing and trying to select an area of the screen where there are no interactive UI items, showing him or her where the buttons are by highlighting them). However, GUIDE's main focus is helping users proceed with the interaction in an alternative way, even when it is not possible to detect the cause of the error.

Finally, every time an error occurs, the Avatar engine will also be called upon for a more "personal" interaction between the system and the user (meaning the Avatar presents an explanation of the error to the user, shows how changing modalities can solve the problem, or simply points the user to an alternative modality when an error arises). In this way, it is almost as if, together, they can find a solution to the problem or "find a way out" of the mistake.

How to present content and interaction possibilities in the most suitable way to the users?
The main problem in developing interfaces for elderly or disabled users is the great diversity of user characteristics and impairments. It is common for an elderly user to have more than one impairment (for example, poor hearing and poor vision), just as it is usual to observe many differences between individual users. This means that what is good for one user can, at the same time, be inappropriate for several others. For example, an elderly user with hearing difficulties can interact with a visual interface without any problem, but one with severe visual impairments cannot, and needs an interface with audio input and output for efficient interaction. However, it is not to be expected that developers will implement different versions of the same application, so the framework has to ensure that the ways of interaction are adapted to the user's characteristics.

GUIDE will offer elderly users adaptation mechanisms capable of adapting UI elements to each user's characteristics. After the user has gone through the UIA and the system has collected enough information, the user is assigned to one UP [1]. Using the information about each user, GUIDE adapts each UI to fit the UP's interaction patterns. This is only possible because GUIDE asks for extra information during the development of each application: every UI is implemented using the HTML, JavaScript and CSS languages, to which the developers add WAI-ARIA ii annotations providing semantic information about UI components. In this way, for every application, GUIDE will derive and keep an Application Model (AM), which is nothing more than an abstract interface that stores information about the structure of the UI and identifies each UI element present. This facilitates adaptation to different interaction contexts as well as to different types of users (users belonging to different UPs), because every time a user invokes an application, the system takes its application model and, considering the interaction context and the user's characteristics, modifies UI elements that are not appropriate for that user. For instance, when a user with visual impairments invokes an application consisting of a visual menu and some text content, GUIDE consults its AM and, "knowing" the user's characteristics and "observing" no change in the interaction context, loads the UI with the originally defined buttons enlarged, and uses both audio and visual output modalities.
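That run-time step could be sketched as follows, a simplification under our own assumptions about the AM and UP structures, which the paper does not detail.

    // Walk the Application Model and rewrite elements that conflict with the
    // user's profile: here, enlarge buttons and add audio output for a user
    // with a visual impairment.
    function adaptUI(appModel, userProfile) {
      return appModel.elements.map(el => {
        const adapted = { ...el };
        if (userProfile.visualImpairment) {
          if (el.type === "button") adapted.scale = (el.scale || 1) * 1.5;
          adapted.outputs = Array.from(new Set([...(el.outputs || ["visual"]), "audio"]));
        }
        return adapted;
      });
    }

    // usage:
    adaptUI({ elements: [{ type: "button", label: "Watch TV" }] },
            { visualImpairment: true });
    // -> [{ type: "button", label: "Watch TV", scale: 1.5,
    //       outputs: ["visual", "audio"] }]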
Regarding the developers' control over this UI adaptation, GUIDE will adopt one of three adaptation schemes, depending on the degree of freedom the developer gives it to change the application's original properties (CSS and HTML). In "Augmentation", GUIDE is not allowed to change any UI components, and can only overlay output modalities (for example, adding audio output to a visual interface). In "Adjustment", GUIDE has permission to adjust UI component parameters in addition to performing "augmentation" (for example, adding audio output to a visual interface and also changing UI colors for higher contrast). Finally, in "Replacement", the developer gives total control to GUIDE, making the substitution of UI components possible in addition to "augmentation" and "adjustment" (for example, adding or removing buttons, as well as adjusting colors and adding audio output to a visual interface).
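In code, the chosen scheme would act as a gate on what the framework may touch. The scheme names are the paper's; the function and category strings are ours.

    // The adaptation scheme declared by the developer gates what GUIDE may change.
    function allowedAdaptations(scheme) {
      switch (scheme) {
        case "augmentation": return ["overlay-output"];
        case "adjustment":   return ["overlay-output", "adjust-parameters"];
        case "replacement":  return ["overlay-output", "adjust-parameters",
                                     "replace-components"];
        default:             return []; // unknown scheme: change nothing
      }
    }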
Additionally, all interfaces must be capable of listening for user commands at any time during the interaction, so that modifications to the interaction and presentation can be made at run-time if the user is not satisfied with the current configuration. For example, if a user says "bigger buttons" or makes a gesture to increase the volume, the interface must adapt and reflect these changes (by reloading the UI or modifying output parameters).
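A global listener of this kind might be sketched as follows; the command strings and helper names are our assumptions.

    // Run-time command listener: applies presentation changes immediately,
    // whatever the user is currently doing.
    const commandHandlers = {
      "bigger buttons":  p => { p.buttonScale = (p.buttonScale || 1) * 1.25; },
      "increase volume": p => { p.volume = Math.min((p.volume || 5) + 1, 10); }
    };

    function onGlobalCommand(command, profile, reloadUI) {
      const handler = commandHandlers[command];
      if (!handler) return;   // not a recognized global command
      handler(profile);       // update the user's preferences
      reloadUI(profile);      // re-render with the new parameters
    }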
CONCLUSIONS
For the development of supportive multimodal user interfaces to be a reality, we have to make sure that the user's characteristics are known to the application. The application also has to be capable of instructing users in all the ways of interacting with it, and of making sure that adaptation and UI help are presented to users in a personalized fashion. GUIDE's UIA, multimodal interaction, and UI translation and adaptation were presented in this paper as possible solutions that can help in the deployment of supportive user applications without asking much additional effort of developers.

REFERENCES
1. Biswas, P., Langdon, P.: Towards an inclusive world - a simulation tool to design interactive electronic systems for elderly and disabled users. In: Proc. SRII 2011.
2. Coelho, J., Duarte, C.: The Contribution of Multimodal Adaptation Techniques to the GUIDE Interface. In: Stephanidis, C. (ed.): Universal Access in HCI, Part I, HCII 2011, LNCS 6765, pp. 337-346. Springer, Heidelberg (2011).
3. Garcia Frey, A., Calvary, G., Dupuy-Chessa, S.: Xplain: an editor for building self-explanatory user interfaces by model-driven engineering. In: Proc. of the 2nd Int. Symp. on Engineering Interactive Computing Systems (EICS 2010), pp. 41-46. ACM, Berlin, Germany, June 19-23, 2010.
4. Myers, B. A., Weitzman, D. A., Ko, A. J., Chau, D. H.: Answering why and why not questions in user interfaces. In: Proc. CHI '06, pp. 397-406. ACM, New York, NY, USA (2006).



i GUIDE – Gentle User Interfaces for Elderly People. http://www.guide-project.eu/.
ii http://www.w3.org/WAI/intro/aria