=Paper= {{Paper |id=Vol-91/paper-11 |storemode=property |title=Obstacles and Perspectives for Evaluating Mixed Reality Systems Usability |pdfUrl=https://ceur-ws.org/Vol-91/paperU1.pdf |volume=Vol-91 |dblpUrl=https://dblp.org/rec/conf/mixer/BachS04 }} ==Obstacles and Perspectives for Evaluating Mixed Reality Systems Usability== https://ceur-ws.org/Vol-91/paperU1.pdf
Obstacles and Perspectives for Evaluating Mixed Reality Systems Usability

Cédric Bach
INRIA
B.P. 105 Domaine de Voluceau
78153, Le Chesnay, France
+33 1 39 63 51 09
cedric.bach@inria.fr

Dominique L. Scapin
INRIA
B.P. 105 Domaine de Voluceau
78153, Le Chesnay, France
+33 1 39 63 55 07
dominique.scapin@inria.fr

ABSTRACT
The goal of this paper is to survey the main issues with the ergonomic evaluation of MRS (Mixed Reality Systems) and to stimulate discussions for future research. A first point concerns definitions and specificities of MRS within the « reality / virtuality » continuum, and the incorporation of user issues. Another point concerns the combinatory character of the ergonomic knowledge to be applied to MRS entities (reality, « GUIs », « VR », and « MR » specific). A major issue concerns ergonomic evaluation methods, their current state, their advantages and drawbacks, particularly for user testing. Finally, the discussion points at various items which may be part of a future research agenda, such as the need for more usability data, for generic and well controlled experiments; for common testing platforms, for shared recommendations data bases, for design and assessment of inspection methods, for common task taxonomies and for common models of MR entities and situations, etc.; all of this possibly leading to increased knowledge based on shared benchmarks.

Keywords
Mixed reality, virtual environments, ergonomics, usability, evaluation methods, user testing, inspection methods, ergonomic recommendations, research agenda, user-centered approaches, interaction models, task models.

INTRODUCTION
Virtual Environments (VE) are being developed fast and widely, in various contexts (e.g., training, data visualisation, computer-aided design, tourism, art, games, etc.).

Mixed Reality Systems (MRS) follow the same path. For both types of environments, just as was the case for GUIs and for the Web, widespread use will depend on their usability.

Facing the question of establishing a research agenda for contributing to more usable MRS, this paper attempts to draw a limited picture of the current state of knowledge on mixed reality systems, from a user-centred perspective, and to discuss potential research avenues.

First, one needs to discuss definitions: MRS are viewed as a subset of VE, with an ergonomic perspective.

Secondly, through a brief account of VE and MRS studies, we highlight a few items explaining the lack of current knowledge on the usability of such systems, mainly related to their novelty.

Then, the paper mentions briefly a number of evaluation methods that can be applied to MRS, and focuses on the specific methodological challenges with user testing.

Finally, the discussion offers several items of interest for a common research agenda in the area of MRS ergonomics.

DEFINITIONS and SCOPE
For MRS, there is no fully agreed-upon definition so far, but rather a set of features, such as the ones mentioned in the Call For Papers for this (IUI / CADUI-associated) Workshop: “… integration of the physical and digital worlds in a smooth and usable way. This fusion involves the design and development of "mixed reality systems", including augmented reality, augmented virtuality, augmented video, and tangible systems …”. We definitely agree that: “The diversity of terms used to denote these systems is evidence both of the amount of research activity in the field and the lack of a common conceptual framework for that activity”. Obviously, one of the goals of this Workshop should be to progress on definitions and on a conceptual framework.

For VE, there are numerous definitions. Most often, they are techno-centred. A definition that we use for VE is the one for VR (Virtual Reality) from Loeffler & Anderson [14]: “Virtual Reality is a three-dimensional, computer-generated, simulated environment that is rendered in real time according to the behaviour of the user.”

However, this definition should be completed on two points:
- VE can be multi-user (for the evaluation of collaborative VE, see for instance [25])
- VE can be described along a « Real-Mixed-Virtual » continuum, as described by Milgram [16].

That continuum includes in fact a large spectrum of 3D computer-based interactive situations. On that spectrum, a number of terms used in the literature can be located, such
as Virtual Reality, Mixed Reality, Augmented Reality, Desktop or Immersive Environments, Augmented Virtuality, etc. It is even possible to incorporate within that continuum 3D CAD or VRML objects displayed on « classical » computers, which makes them belong to VR, even though very loosely.

The first point here is a matter of joining the techno-centred and the user-centred perspectives. The second point deals with the combinatory nature of MRS ergonomic knowledge (real, « GUI », « VR », and « MR » specific).

People might think that there is an enormous difference between a 3D object to be interacted with in a regular GUI environment, and that same object presented in a CAVE, using a 3D mouse. That difference is certainly there, but only from a technical point of view, not necessarily from a user’s point of view.

With an ergonomic perspective, these two situations must be considered through their capacity to support users in achieving specified task goals, with efficiency, effectiveness and satisfaction. In some cases, an immersive situation can be beneficial (e.g., learning driving / control operations on a high-speed train, which involves physical simulation of the access to the driving instruments), while it may be less appropriate in others (e.g., learning specific procedures for that same train, which involve more cognitive sequence learning steps) where regular GUIs, or even paper instructions, may be more appropriate.

In other words, depending on the task at hand, the usefulness of the support required may be different: in kinetics learning, immersion may make sense; while in cognitive learning, it may not be as helpful.

The same point can be made for MRS, which are constituted of both real and virtual elements within the same 3D interactive environment. As they can be located roughly at the midpoint of the Milgram continuum, they can be considered as a particular form of VE. However, their specificity is to stage a form of fusion between real and virtual worlds.

Even if MRS can be divided into Augmented Reality and Augmented Virtuality [17] depending on the exact physical or computer-based nature of the tasks and interactions involved, they relate to combinations of knowledge from various domains of specialization of ergonomics:

     • Real objects, in the area of physical ergonomics, concerned with human anatomical, anthropometric, physiological and biomechanical characteristics (postures, materials handling, repetitive movements, musculoskeletal disorders, workplace layout, safety and health); in the area of ergonomics of every-day products and consumer products; in some cases, in the area of cognitive ergonomics and organizational ergonomics.

     • Classical interfaces (GUIs), mainly in the area of cognitive ergonomics, concerned with mental processes such as perception, memory, reasoning (e.g., mental workload, decision-making, skilled performance, information presentation, commands and controls, etc.).

     • Virtual environments are concerned as well by cognitive ergonomics, but also by physical ergonomics (e.g., in terms of VE behaviour, presence, cyber sickness, etc.).

     • Fusion of the previous elements, which corresponds to one specificity of MRS: the appropriate correspondence between the various real or virtual elements constituting MRS. This problem, also called « continuity », characterizes the perceptual and cognitive fluidity, from the users’ point of view, between the real world and the virtual world. According to Nigay et al. [17], “The perceptual continuity is verified if the user directly and smoothly perceives the different representations of a given concept. Cognitive continuity is verified if the cognitive processes that are involved in the interpretation of the different perceived representations lead to a unique interpretation of the concept resulting from the combination of the interpreted perceived representations.” Such a concept is a challenge for MRS design that requires optimal matching between various features in terms of presentation, coding, meaning, and behavior. From an ergonomic point of view, one could state that it is not only an issue with interface consistency (e.g., providing information presentation as well as user initiated interactions with the same characteristics across operations and across applications), but also a complex compatibility issue, i.e., matching the various MRS individual elements with their referent, but also matching all related MRS elements with each other, depending on the tasks and context.

ERGONOMIC KNOWLEDGE ON MRS
Previous history of HCI concerning ergonomics knowledge is probably repeating itself: MRS novelty and lack of ergonomic knowledge.

With new technologies, usability concerns start with general questions, hypotheses, debates; then questions are sorted and put to the test through experiments; theories are offered; data on usability grows rapidly as the technology becomes widely available; data is then organized, made available through manuals, guides; finally, style guides, architectures, design and evaluation methods are offered, tested, compared.

This has proven true for GUIs and the Web; it should be the same for VE and MRS. However, there may be ways to accelerate the process by doing user testing early so that usability knowledge is gained rapidly, rather than having simply the technology perfected without user concern. Also, it has been verified that, when moving from GUIs to the Web, one should not « re-invent the wheel », but apply as much as possible sound ergonomic knowledge
that can be transferred to novel environments (information organisation, consistency, level of feedback, etc.).

Implementation and usage of VE and MRS are indeed very recent: they started in the early nineties and have grown at a fast pace only for the last three or four years.

That novelty explains partly why the currently available ergonomic knowledge is relatively scarce, compared to the number of issues that need to be tackled in such complex environments. Such environments are relatively unstable in their implementation and usage; lots of technical problems still need to be solved.

However, some actions can be envisioned to better incorporate a user-centred approach:

      • The recommendation above not to « re-invent the wheel » is particularly true for MRS, as their components are mixed. Ergonomics knowledge therefore needs to be gained for the novel, specific MRS issues (such as « continuity »), but currently available ergonomics knowledge should be applied to their non-specific elements (e.g., real objects, 2D displays, etc.). The amount of knowledge (empirical results, recommendations, standards, etc.) in ergonomics about real objects and « classical » interfaces is of course enormous, and needs to be carefully investigated in terms of applicability depending on context and tasks. Also, concerning ergonomic evaluation, a number of already available methods should be looked at in terms of their applicability and needs for adaptation to the specifics of MRS.

      • Another recommendation above was to incorporate ergonomic concerns early. That is usually done through user testing (with or without hypotheses, on one or several contexts, on one system, or through the comparison of several systems, etc.). That is where a number of methodological problems arise. In order to identify the obstacles and perspectives in methodological terms, the next two sections deal first with ergonomics evaluation methods and secondly with MRS-specific issues when conducting user testing.

ERGONOMIC EVALUATION METHODS
In the context of MRS, which usability evaluation methods are available, and how can such environments be appropriately evaluated?

In the literature, there is no usability method yet specifically designed for MRS, except notations such as ASUR++, a notation for describing, and reasoning about, the design of mobile mixed systems [6].

There has been already some research on how to guide VE design with usability considerations (e.g., in [26]), to consolidate usability dimensions and to design inspection methods, but much work is needed to extend their scope to MRS, to assess them and to compare them.

     • Methods based on heuristics or recommendations are difficult to carry out due to the limited amount of recommendations available, and to the fact that « classical » ergonomic recommendations have not yet been applied or extended to VE. In addition, such recommendations are not presented in a common unitary format, but often as experimental or « best practice » results; see for instance Gabbard [7] [8]; Kaur [12]; Stanney [21]; generally such documents are organized according to major categories of VE objects. Also, based on an extensive literature review, a set of 170 recommendations dedicated to VE have been extracted under a generic format and organized according to usability dimensions [1].

     • Along those lines, with the goal of defining a structured ergonomic inspection method, an adaptation of the Ergonomic Criteria (E. C.) has been proposed and assessed in terms of intrinsic validity [2]. First of all, the application of the E. C. inspection method requires expertise in VE and training on the 20 dimensions covered by these E. C. Also, the method needs further evaluation for VE and extensions to MRS.

A method dedicated to VR, based on the cognitive walkthrough method, has also been designed by Sutcliffe [22]. This method is based on a theory of interaction (Norman [18]). This walkthrough analysis method uses three models (first on goal-oriented task actions, second on exploration and navigation in virtual worlds, third on interaction in response to system initiative) derived from the theory. Each stage of the model is associated with generic design properties. The evaluation method consists of a checklist of questions using the properties and following the steps of the method. That method could possibly be extended to MRS.

There are also methods, mainly for VE, based on recommendations, using computer-based support, such as:

     • I-DOVE (Interactive tool for development of Virtual Environments) [11]. The goal of this prototype, currently being developed as a large-scale web-based application based on several sets of recommendations, is to offer context specific guidance for VE development, and alternative ways of searching and browsing, distinguishing user categories. The initial prototype was based on user interviews and was later evaluated by expert evaluation.

     • Another tool is MAUVE [21]; also developed as a website, it incorporates design guidelines according to several VE categories such as navigation, object manipulation, input, output and so on (based on Gabbard’s [8] taxonomy). A multi-criteria usability matrix is the support for organizing and retrieving recommendations. The evaluation process is supported in two steps: a “traditional” heuristics stage, and prioritization of usability attributes. That capacity of tailoring the evaluation may be interesting for evaluators with
different points of view or organizational goals. The tool has not been evaluated yet.

     • The last tool, which we know of, is a hypertext-based prototype developed by Kaur [13]. This tool presents 45 generic design properties that specify the necessary support from the system for “successful” VE interfaces. Like the previous one, no usability evaluation plan was integrated during the development of the prototype.

There is also the question of adapting, at least partly, to MRS some of the « classical » ergonomics methods. Such methods are too numerous to be all discussed here (see for instance, [3], [4], [9]). However, one can mention three categories of methods that are general enough in their approach to be good candidates for MRS: questionnaires / interviews, inspection methods and user testing. A number of questions must be solved in order to apply them to MRS.

      • Questionnaires and Interviews allow gathering subjective data, often quite important to evaluate visual appeal, preferences, aesthetics, missing functionalities, and also very useful as a means to compare or cross-reference performance data. Such methods are certainly interesting candidates for being applied to MRS, provided that specific lead questions for interviews and questionnaire items are tailored and validated for such environments. For questionnaires, Kalawsky [10] has designed VRUSE for measuring usability of a VR application in terms of users’ attitude and perception. The 100 questionnaire items are organized under 10 usability factors: functionality, input, output, user guidance, consistency, flexibility, simulation fidelity, error correction, presence and overall system usability. This questionnaire has been tested in terms of reliability (Cronbach's alpha > 0.9).

      • Inspection methods are also good candidates for supporting MRS evaluations. The problem there is the need for more data (particularly recommendations) and more data organization in order to cover the range of ergonomic problems related to MRS. Issues regarding recommendations identification and structuring into dimensions need to be looked at carefully; the history of HCI has shown so far that such dimensions can be established and efficiently contribute to evaluation of GUIs, the web, currently with VR ... but specificities of MR need to be taken into consideration (e.g., task compatibility, devices consistency for visualization, documents compatibility, innovative help systems, etc.).

      • User Testing has been the major method in ergonomics and will probably remain as important for MRS. However, in order to apply that method to MRS, a number of methodological problems need to be tackled, including of course, the problems already experienced with a few cases of user testing of VE systems (see next section).

METHODOLOGICAL DIFFICULTIES WITH USER TESTING
User testing is certainly the preferred method to be used, particularly in order to alleviate the current lack of available usability data. However, many methodological problems require solutions.

First of all, let us use a metaphor both historical and aeronautical. Looking back over a century, let us imagine that the following question was asked at that time: « which flying machine constitutes the best way to move rapidly in the air from one point to another: the blimp or the airplane? ». In those times, airplane trials were just starting; it would have been difficult to find users (i.e., pilots) able to fly (usually the few pilots were the airplane designers themselves); the underlying technology was only emerging, and often planes had numerous technical problems or simply just crashed. If at that time one had conducted comparative performance testing, there is no doubt that the blimp would have won over the airplane! However, by now, everyone would agree that airplanes are better than blimps for long distance transport of passengers.

Making the parallel with HCI technology, « classical HCI » would be our blimps and MRS would be our planes. In several ways, MRS are at the same point as planes in the early 20th century:

     • There are very few experts that can operate them.

     • The characteristics of tasks that can be performed in such environments are still quite vague.

     • Learning by trying is still the rule.

     • There are many problems to solve in order to « fly » those environments: on the technical aspects (e.g., computer graphics); on the interaction aspects (e.g., devices and modalities); and on their usability.

That state of affairs can explain why results are often disappointing [19] when classical graphics environments are compared to VE or MRS.

Before conducting user tests or comparing interactive situations, it is best to:

     • First make sure that the environments are already sufficiently well-designed so that well-known usability deficiencies are removed. This is a sound precaution helping to focus user testing on the real « new » usability problems and the test comparisons on the real usability hypotheses, rather than obscuring the picture with unwanted usability problems, both for the user and for the experimental data analyst. This can be achieved by careful assessment of the design, for instance through applying available inspection methods and sets of recommendations.

     • Alleviate as much as possible the various limitations of user testing due to the specificities
of VE and MRS. Some of those limitations are described in the next three paragraphs.

Limitations related to the physical environment
One of the major differences between MRS and traditional interfaces concerns their physical environment. MRS require a more sophisticated environment: the users rarely just sit in front of their computer; they move from one place to another, they talk, they move parts of their body in order to interact.

This raises several problems in evaluation situations, in which it is useful to prepare the experiments so as to avoid some disturbances. For instance, if the MRS application area is an office, it is necessary to limit the interaction zones in order to avoid collisions with chairs, tables, or cables. Other constraints can complicate the situation: for instance, interaction devices can be an obstacle to data collection. The use of stereoscopy does not allow collecting data on video or on a monoscopic monitor, in which case the evaluator must access the stereoscopic data directly, using a tracker or some parallel application software, so that the evaluator does not become another user (even though a passive one) in the scene, which would certainly become a major bias in terms of « presence » [5].

MRS can also be multi-user and require a large experimentation space, or they can be used outdoors, which makes the use of current usability laboratories difficult or even impossible, as these are located in limited spaces to facilitate data extraction.

MRS using video projection can very easily raise the room temperature in the usability laboratories, which can become an important, often underestimated bias in the experiments.

Difficulties in the set up of user testing
First of all, the complexity of interactive situations with MRS may necessitate more resources than usual user testing: several evaluators may be needed to be able to extract the interesting data (e.g., checking on performance, on various modalities, on various media, etc.). Also, the use of video (several recordings, simultaneously on various types of events) may be mandatory, as user behavioural sequences are more complex to extract and describe.

In addition to the complexity of software programming for such environments [23], setting up experiments may also be more complex as it requires more technical specialists to calibrate and tailor several types of technologies and various software supporting the complexity of MRS.

In case of device or system breakdown, it may be more difficult and more time consuming to restart the devices or system, which may jeopardize the outcome and measures in the user testing. Therefore, it is even more important for MRS to be as stable as possible for the duration of the evaluation experiments.

and ways to operate the various parts of the environment. This also leads to questions on how best to describe the tested situation without coaching the subjects too much if one wants to study their “intuitive” performance or preferences.

Along the same line, an additional difficulty is simply that it may be difficult to explain the complexities of MRS, considering that written instructions may not be sufficient beforehand, and impossible to use during the test when several entry or display devices are used together (the user cannot be using an eye-tracker, a data glove, watch a large display, and at the same time walk through a leaflet of task instructions). In addition, when needed, where should a help system be made available?

In some cases, as current technology is unable to support fully and consistently novel interaction paradigms, there is no way to test those new ideas other than using « Wizard of Oz » techniques, which require trained specialists and specific, carefully balanced experimental design.

In other cases, the techniques used for interacting and those used for gathering subjects' data obviously conflict! For instance, the use of thinking aloud cannot work well when MRS use (as they may often do) voice recognition as an input mechanism!

Limitations related to the subjects
A first problem is that the application of MRS is not always directed by application needs, but by the design of new interaction paradigms, which makes the specification of precise and accurate task and user requirements difficult. It is therefore quite difficult to generalize because the users' profile is often ill-defined; sometimes, it is even the technical people that developed the MRS who are tested!

Another difficulty is that, at the current stage, it is still practically impossible to distinguish the subjects in terms of experience with MRS (e.g., novices vs. experts), as is usually done with “classical” HCI. That holds true as well for the skills of human factors specialists in charge of the evaluations!

Experimental design may also encounter difficulties in the number of subjects needed to cover the many potential variables involved in the MRS. For instance, if one wants « simply » to compare various combinations of interactive devices associated to an MRS, such as 3 devices for each one of 3 MRS user channels (e.g., voice, gesture, eye-gaze), one needs 27 different testing situations (i.e., 9 combinations x 3 channels) in order to have all subjects participate in all possible combinations, which can become quite exhausting for the subjects and therefore potentially inconclusive for the experiment, due to fatigue, learning curve, etc.

And if one decides to set up the experimental design with non-repetitive measures, and associating different subjects’ groups to each situation, for instance at a minimum of 5
                                                                  subjects per cell, then the number of subjects required
Another obstacle which is specific to MRS (and to all
                                                                  (135, i.e., 27 x 5) becomes fast very large and consequently
novel applications) is that users may not know at all the
                                                                  quite complex to recruit and manage, not mentioning the
way the situation works, which means extra time and
caution in learning experimental requirements, task goals,
additional difficulty of making sure that the subject population is
homogeneous.
Other limitations can be identified for the subjects, such as the
consequences of « cyber sickness ». Such limitations are both ethical
(concerning the decision to run the user testing) and practical (concerning
the post-experimental arrangements).
In terms of ethics, one can wonder whether it is legitimate to expose
subjects to physiological problems that can be serious, such as fainting
fits or ataxia. In some countries, such as France, a law covers
experimental situations of that nature and requires the experimental
setting to incorporate the presence of certified physicians.
Concerning the management of subjects' potential disorders after the
experiments, protocols are available, e.g., Stanney [20], to make sure that
a secure environment is provided to diagnose and to avoid potential
incidents or accidents. Of course, in the event of « cyber sickness »,
performance measures are completely biased, unless it is the goal of the
experiment to test such disorders! In any case, it is mandatory to have
ways, in an experiment, to diagnose early manifestations of such sickness,
in order to limit its physiological consequences.
DISCUSSION AND PERSPECTIVES
This paper has presented a number of ergonomic issues related to the
evaluation of VE and extensions to MRS.
Among these, it may be fruitful for the future usability of MRS to
investigate and to coordinate a number of research avenues.
In terms of ergonomic knowledge, a common effort should be pursued to
ensure that MRS are, at all stages of design, evaluated for usability; that
such data is sound and generic; and that, whenever possible, that knowledge
is made available as recommendations and shared for further design and
evaluation.
In terms of methods, several points can be made:
     • For user testing, it would be useful to study the design of user
         testing protocols in order to alleviate the various biases
         identified.
     • It is particularly important to make sure that user testing of MRS
         addresses well-identified usability questions rather than just
         testing the environment as it is. Care should be taken to
         “clean up” MRS from well-known problems before testing real
         ergonomic problems on new interaction paradigms.
     • In addition, organizing some kind of efficient communication within
         the MRS research community may help to cross-reference and
         increase common knowledge on usability, e.g., from the various
         user tests performed in different laboratories. This would
         certainly help in the generalization of usability results and lead
         to commonly agreed generic recommendations. Using common testing
         platforms and testing protocols would help to compare and share
         results. Also, this would facilitate the design and sharing of
         common recommendation databases. In that respect, it would also be
         useful to share mechanisms allowing the usable organization,
         storage, and retrieval of such sets of recommendations, using
         dedicated software (e.g., tools for managing multiple guideline
         bases, comparable to MetroWeb [15]).
     • Under the assumption of the combinatory character of the ergonomic
         knowledge to be applied to MRS (real, « classic » & « web »
         computer-based, « VR », and « MR specific »), an effort should be
         made to distinguish the problems and to apply solutions from
         existing knowledge, for instance, on physical parameters of the
         MRS situations (e.g., noise, lighting, temperature, workload,
         perception, cognition, etc.), using knowledge related to
         physiology, psychophysiology, anthropometrics, etc., and of course
         software ergonomics. In that area, a number of results on GUIs,
         Web, etc., are obviously applicable (e.g., information
         presentation in 2D graphics, navigation on the internet, etc.), at
         least partly, to MRS situations.
         Also, issues related to social psychology and organizational
         ergonomics may apply, as such systems may concern overall
         organizational activities, cooperative work, virtual
         organizations, etc. In terms of measurements, one may also want to
         look at measures other than performance, such as preferences, or
         even levels of addiction (e.g., in games), etc.
     • Inspection methods are good candidates as complementary methods to
         user testing (for instance, just before user testing). Ergonomic
         Criteria (op. cit.) have been proposed for VE as a basis for
         ergonomic inspection, but it has been necessary not only to modify
         one criterion (“Significance of codes and behaviour”) and to add
         two new ones (“Grouping-Distinguishing items by behaviour” and
         “Physical workload”), but also to adapt their definitions and
         justifications, and to add more illustrative examples and
         counter-examples, in order to take into account recommendations
         specific to VE. These criteria are also currently being tested in
         terms of comparative efficiency (i.e., the number and quality of
         usability problems diagnosed) against expert evaluation and user
         testing.
         These Ergonomic Criteria could be candidates for further
         adaptation to the specifics of MRS, provided that empirical data
         on such environments become available on a large scale. One can
         already consider that more criteria might be needed simply due to
         the fact that MRS concentrate many ergonomic issues from the
         Reality to the Virtuality, not to mention the specific issue of
         “continuity”.
In terms of MRS-specific questions, considerable attention should be given
to the issue of « continuity ». That issue,
which is certainly a major usability property [24], needs to be further
investigated in order to better define the concept and its components, and
to distinguish the concept of « continuity » from other dimensions related
to guidance, the structuring of information, the compatibility with the
tasks or common practice, the consistency of information presentation, of
modalities, of procedures, etc.
Another issue that seems to carry new types of problems is the design of
help systems: how should they be organized, which type of support should
they use, etc.
In terms of models, it would be interesting to look at and to coordinate
the characteristics of the models used for MRS (e.g., task models,
interaction models, formal models, architectures, etc.), and to identify,
when useful, the potential communication mechanisms between these models.
This has been a recurrent issue for GUIs (also in terms of system lifecycle
development processes); it should not be different for MRS.
Other issues are worth discussing as well, such as the need for application
and task models, and for agreed-upon extended task taxonomies, for
instance, in order to be able to compare evaluation results and to
generalize from one application domain to another.
Partly linked to that issue is the need for shareable classifications of MR
elements and for common MR object models (and a shared vocabulary), not
only to facilitate the analysis of MRS, but also to compare results within
the scientific community and to facilitate the design of strategies for
inspection methods and heuristics.
Discussions on these various items, together with other issues talked about
during the Workshop on "Exploring the design and engineering of Mixed
Reality Systems", should lead to some insight for future research and
contribute to increased knowledge on shared benchmarks. There is certainly
lots of work at hand for this workshop, but also for the following ones!

REFERENCES
1. Bach, C., Scapin, D. L. Recommandations ergonomiques pour l’inspection
   d’environnements virtuels. (Rapport de contrat). Projet EUREKA-COMEDIA,
   INRIA Rocquencourt, France, 2003.
2. Bach, C. & Scapin, D. L. Adaptation of Ergonomic Criteria to
   Human-Virtual Environments Interactions. In Interact'03. IOS Press.
   2003. pp. 880-883.
3. Bastien, J. M. C., Scapin, D. L. Les méthodes ergonomiques : de
   l’analyse à la conception et à l’évaluation. Traité d’ergonomie,
   P. Falzon (Ed.), Masson. 2003.
4. Bastien, J. M. C., and Scapin, D. L. Évaluation des systèmes
   d'information et Critères Ergonomiques. In Systèmes d’information et
   Interactions homme-machine, C. Kolski (Ed.), Hermès. 2001.
5. Bowman, D.A., Gabbard, J.L., and Hix, D. A Survey of Usability
   Evaluation in Virtual Environments: Classification and Comparison of
   Methods. Presence: Teleoperators and Virtual Environments. Vol. 11,
   n°4, 2002, pp. 404-424.
6. Dubois, E., Gray, P.D., Nigay, L., ASUR++: a Design Notation for Mobile
   Mixed Systems. Interacting With Computers, Special Issue on Mobile HCI,
   Paterno, F. (ed), 2003.
7. Gabbard, J.L. Researching Usability Design and Evaluation Guidelines
   for Augmented Reality (AR) Systems. 2001. Available:
   http://www.sv.vt.edu/classes/ESM4714/Student_Proj/class00/gabbard/
8. Gabbard, J.L., & Hix, D. 1. A taxonomy of usability characteristics in
   virtual environments. 1997. Available:
   http://csgrad.cs.vt.edu/~jgabbard/ve/taxonomy/
9. ISO/TS 16982. Ergonomics of human-system interaction – Usability
   methods supporting human-centred design. 2000.
10. Kalawsky, R., VRUSE – a computerised diagnostic tool: for usability
    evaluation of virtual/synthetic environment systems. Applied
    Ergonomics, Elsevier (ed), n°30, 1999, pp. 11-25.
11. Karampelas, P., Grammenos, D., Mourouzis, A. & Stephanidis, C. Towards
    I-dove, an interactive support tool for building and using virtual
    environments with guidelines. Proceedings of HCI, 22-27 June 2003,
    Crete, Greece, vol. 3, pp. 1411-1415.
12. Kaur, K. Designing virtual environments for usability. Ph.D. Thesis,
    City University, London, 1998.
13. Kaur, K. Designing Usable Virtual Environments. 1997. Demo available:
    http://www.soi.city.ac.uk/~dj524/demtool/frame.htm
14. Loeffler, C.E., and Anderson, T. The Virtual Reality Casebook. New
    York: Van Nostrand Reinhold, 1994.
15. MetroWeb: Available:
    http://www.isys.ucl.ac.be/bchi/research/metroweb.htm
16. Milgram, P., Takemura, H., Utsumi, A. and Kishino, F. In SPIE 94,
    Vol. 2351, Telemanipulator and Telepresence Technologies, 1994,
    pp. 282.
17. Nigay, L., Dubois, E., Renevier, P., Pasqualetti, L. and Troccaz, J.
    Mixed Systems: Combining Physical and Digital Worlds. Proceedings of
    HCI, 22-27 June 2003, Crete, Greece, vol. 1, pp. 1203-1207.
18. Norman, D. A., Cognitive engineering. In D. Norman and S. Draper
    (eds), User centered system design: New perspectives on Human Computer
    Interaction (Hillsdale NJ: LEA), 1986, pp. 31-62.
19. Porcher Nedel, L., Dal Sasso Freitas, C. M., Jacon Jacob, L. and
    Soares Pimentas M. Testing the Use of Egocentric Interactive
    Techniques in Immersive Virtual Environments. In Proceedings of
    INTERACT’03 (Zurich, September 2003), IOS Press, pp. 471-478.
20. Stanney, K. M., Kennedy, R. S. and Kingdon, K. Virtual Environment
    Usage Protocols. In Handbook of Virtual Environments, LEA Publishers,
    2002, pp. 721-730.
21. Stanney, K.M., Mollaghasemi, M., and Reeves, L. Development of MAUVE,
    the multi-criteria assessment of usability for virtual environment
    system. (Final Rep., Contract N°. N61339-99-C-0098). Orlando, FL:
    Naval Air Warfare Center Training Systems Division. 2000.
22. Sutcliffe, A. G., and Kaur, K. D. Evaluating the usability of virtual
    reality user interfaces. Behaviour & Information Technology, vol. 19,
    n°6. 2000. pp. 415-426.
23. Träskbäck, M., Koskinen, T. and Nieminen, M. User-Centred Evaluation
    Criteria for Mixed Reality Authoring Application. Proceedings of HCI,
    22-27 June 2003, Crete, Greece, vol. 3, pp. 1263-1267.
24. Trevisan, D., Vanderdonckt, J. and Macq, B. Continuity as a Usability
    Property. Proceedings of HCI, 22-27 June 2003, Crete, Greece, vol. 3,
    pp. 1268-1272.
25. Tromp, J. G., Steed, A. & Wilson, J. R. Systematic Usability
    Evaluation and Design Issues for Collaborative Virtual Environments.
    Presence, Vol. 12, n°3, June 2003. pp. 241-267.
26. Wilson, J. R., Eastgate, R. M., & D'Cruz, M. Structured Development of
    Virtual Environments. In Handbook of Virtual Environments, LEA
    Publishers, 2002, pp. 353-378.