<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Obstacles and Perspectives for Evaluating Mixed Reality Systems Usability.</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Cédric Bach</string-name>
          <email>cedric.bach@inria.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dominique L. Scapin</string-name>
          <email>dominique.scapin@inria.fr</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>INRIA</institution>
          ,
          <addr-line>B.P. 105 Domaine de Voluceau, 78153, Le Chesnay, France, +33 1 39 63 51 09</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>INRIA</institution>
          ,
          <addr-line>B.P. 105 Domaine de Voluceau, 78153, Le Chesnay, France, +33 1 39 63 55 07</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>The goal of this paper is to survey the main issues with the ergonomic evaluation of MRS (Mixed Reality Systems) and to stimulate discussions for future research. A first point concerns definitions and specificities of MRS within the « reality / virtuality » continuum, and the incorporation of user issues. Another point concerns the combinatory character of the ergonomic knowledge to be applied to MRS entities (reality, « GUIs », « VR », and « MR » specific). A major issue concerns ergonomic evaluation methods, their current state, their advantages and drawbacks, particularly for user testing. Finally, the discussion points at various items which may be part of a future research agenda, such as the need for more usability data, for generic and well controlled experiments; for common testing platforms, for shared recommendations data bases, for design and assessment of inspection methods, for common task taxonomies and for common models of MR entities and situations, etc.; all of this possibly leading to increased knowledge based on shared benchmarks.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>Virtual Environments (VE) are being developed rapidly and
widely, in various contexts (e.g., training, data
visualisation, computer-aided design, tourism, art, games,
etc.).</p>
      <p>Mixed Reality Systems (MRS) follow the same path. For
both types of environments, just as was the case for
GUIs and for the Web, widespread use will depend on their
usability.</p>
      <p>Facing the question of establishing a research agenda for
contributing to more usable MRS, this paper attempts to
draw a limited picture of the current state of knowledge on
mixed reality systems, from a user-centred perspective, and
to discuss potential research avenues.</p>
      <p>First, one needs to discuss definitions: MRS are viewed as
a subset of VE, from an ergonomic perspective.</p>
      <p>Secondly, through a brief account of VE and MRS studies,
we highlight a few items explaining the lack of current
knowledge on the usability of such systems, mainly related
to their novelty.</p>
      <p>
        Then, the paper briefly mentions a number of evaluation
methods that can be applied to MRS, and focuses on the
specific methodological challenges of user testing.
Finally, the discussion offers several items of interest for a
common research agenda in the area of MRS ergonomics.
      </p>
      <p>DEFINITIONS and SCOPE</p>
      <p>For MRS, there is so far no fully agreed-upon definition,
but rather a set of features, such as those mentioned in the Call
For Papers for this (IUI / CADUI-associated) Workshop: “
… integration of the physical and digital worlds in a
smooth and usable way. This fusion involves the design
and development of "mixed reality systems", including
augmented reality, augmented virtuality, augmented video,
and tangible systems …”. We definitely agree that: “The
diversity of terms used to denote these systems is evidence
both of the amount of research activity in the field and the
lack of a common conceptual framework for that activity”.
Obviously, one of the goals of this Workshop should be to
progress on definitions and on a conceptual framework.
For VE, there are numerous definitions. Most often, they
are techno-centred. A definition that we use for VE is the
one for VR (Virtual Reality) from Loeffler &amp; Anderson
[
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]: “Virtual Reality is a three-dimensional,
computer-generated, simulated environment that is rendered in real
time according to the behaviour of the user.”
However, this definition should be completed on two
points:
- VE can be multi-user (for the evaluation of collaborative
VE, see for instance [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ])
- VE can be described along a « Real-Mixed-Virtual »
continuum, as described by Milgram [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
      </p>
      <p>That continuum in fact includes a large spectrum of 3D
computer-based interactive situations. On that spectrum, a
number of terms used in the literature can be located, such
as Virtual Reality, Mixed Reality, Augmented Reality,
Desktop or Immersive Environments, Augmented
Virtuality, etc. It is even possible to place within
that continuum 3D CAD or VRML objects displayed on
« classical » computers, which makes them belong to VR,
even if only very loosely.</p>
      <p>The first point here is a matter of joining the techno-centred
and the user-centred perspectives. The second point deals
with the combinatory nature of MRS ergonomic knowledge
(real, « GUI », « VR », and « MR » specific).</p>
      <p>People might think that there is an enormous difference
between a 3D object to be interacted with in a regular GUI
environment, and that same object presented in a CAVE,
using a 3D mouse. That difference is certainly there, but
only from a technical point of view, not necessarily from a
user’s point of view.</p>
      <p>From an ergonomic perspective, these two situations must
be considered through their capacity to support users in
achieving specified task goals, with efficiency, effectiveness
and satisfaction. In some cases, an immersive situation can
be beneficial (e.g., learning driving / control operations on
a high-speed train, which involves physical simulation of
the access to the driving instruments), while it may be less
appropriate in others (e.g., learning specific procedures for
that same train, which involve more cognitive sequence
learning steps) where regular GUIs, or even paper
instructions may be more appropriate.</p>
      <p>In other words, depending on the task at hand, the
usefulness of a given form of support may differ: for kinesthetic
learning, immersion may make sense, while for cognitive
learning it may not be as helpful.</p>
      <p>The same point can be made for MRS, which
consist of both real and virtual elements within the
same 3D interactive environment. As they can be located
roughly at the midpoint of the Milgram continuum, they
can be considered as a particular form of VE. However,
their specificity is to stage a form of fusion between real
and virtual worlds.</p>
      <p>
        Even if MRS can be divided into Augmented Reality and
Augmented Virtuality [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] depending on the exact physical
or computer-based nature of the tasks and interactions
involved, they relate to combinations of knowledge from
various domains of specialization in ergonomics:
      </p>
      <p>Real objects, in the area of physical ergonomics,
concerned with human anatomical,
anthropometric, physiological and biomechanical
characteristics (postures, materials handling, repetitive
movements, musculoskeletal disorders, workplace
layout, safety and health); in the area of the
ergonomics of everyday and consumer products;
and, in some cases, in the areas of cognitive ergonomics
and organizational ergonomics.</p>
      <p>Classical interfaces (GUIs), mainly in the area of
cognitive ergonomics, concerned with mental
processes such as perception, memory and reasoning
(e.g., mental workload, decision-making, skilled
performance, information presentation, commands
and controls, etc.).</p>
      <p>Virtual environments, which concern cognitive
ergonomics as well, but also physical
ergonomics (e.g., in terms of VE behaviour,
presence, cyber sickness, etc.).</p>
      <p>Fusion of the previous elements, which
corresponds to one specificity of MRS: the
appropriate correspondence between the various
real or virtual elements constituting MRS.</p>
      <p>
        This problem, also called « continuity »,
characterizes the perceptual and cognitive fluidity,
from the users’ point of view, between the real
world and the virtual world. According to Nigay
et al. [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], “The perceptual continuity is verified if
the user directly and smoothly perceives the
different representations of a given concept.
Cognitive continuity is verified if the cognitive
processes that are involved in the interpretation
of the different perceived representations lead to a
unique interpretation of the concept resulting
from the combination of the interpreted perceived
representations.”
      </p>
      <p>Such a concept is a challenge for MRS design that
requires optimal matching between various
features in terms of presentation, coding, meaning,
and behavior.</p>
      <p>From an ergonomic point of view, one could state
that it is not only an issue of interface
consistency (e.g., providing information
presentation as well as user-initiated interactions
with the same characteristics across operations and
across applications), but also a complex
compatibility issue, i.e., matching the various
individual MRS elements with their referents, but
also matching all related MRS elements with
each other, depending on the tasks and context.</p>
      <p>ERGONOMIC KNOWLEDGE ON MRS</p>
      <p>The earlier history of HCI with respect to ergonomic knowledge
is probably repeating itself: MRS are novel, and ergonomic
knowledge about them is lacking.</p>
      <p>With new technologies, usability concerns start with
general questions, hypotheses and debates; then questions are
sorted and put to the test through experiments; theories are
offered; data on usability grows rapidly as the technology
becomes widely available; data is then organized and made
available through manuals and guides; finally, style guides,
architectures, and design and evaluation methods are offered,
tested and compared.</p>
      <p>This has proven true for GUIs and the Web; it should
be the same for VE and MRS. However, there may be ways
to accelerate the process by doing user testing early, so that
usability knowledge is gained rapidly, rather than
simply having the technology perfected without concern for users.
Also, as was verified when moving from GUIs
to the Web, one should not « re-invent the wheel », but
apply as much as possible the sound ergonomic knowledge
that can be transferred to novel environments (information
organisation, consistency, level of feedback, etc.).</p>
      <p>Implementation and usage of VE and MRS are indeed very
recent: they started in the early nineties and have only been
growing at a fast pace for the last three or four years.</p>
      <p>That novelty partly explains why the currently available
ergonomic knowledge is relatively scarce, compared to the
number of issues that need to be tackled in such complex
environments. Such environments are relatively unstable in
their implementation and usage; many technical problems
still need to be solved.</p>
      <p>However, some actions can be envisioned to better
incorporate a user-centred approach:</p>
      <p>The recommendation above not to « re-invent the
wheel » is particularly relevant for MRS, as their
components are mixed: ergonomic
knowledge needs to be gained for the novel,
MRS-specific issues (such as « continuity »), but
currently available ergonomic knowledge should be
applied to their non-specific elements (e.g., real
objects, 2D displays, etc.). The amount
of knowledge (empirical results,
recommendations, standards, etc.) in ergonomics about real
objects and « classical » interfaces is of course
enormous, and needs to be carefully investigated in
terms of applicability depending on context and
tasks. Also, concerning ergonomic evaluation, a
number of already available methods should be
looked at in terms of their applicability and need
for adaptation to the specifics of MRS.</p>
      <p>Another recommendation above was to incorporate
ergonomic concerns early. That is usually done
through user testing (with or without hypotheses,
on one or several contexts, on one system, or
through the comparison of several systems, etc.).
That is where a number of methodological
problems arise.</p>
      <p>In order to identify the obstacles and perspectives
in methodological terms, the next two sections
deal first with ergonomics evaluation methods and
secondly with MRS-specific issues when
conducting user testing.</p>
      <p>ERGONOMIC EVALUATION METHODS</p>
      <p>
In the context of MRS, which usability evaluation methods
are available, and how can such environments be
appropriately evaluated?
In the literature, there is as yet no usability method
specifically designed for MRS, except notations such as
ASUR++, a notation for describing and reasoning about
the design of mobile mixed systems [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>
        There has already been some research on how to guide VE
design with usability considerations (e.g., in [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ]), to
consolidate usability dimensions and to design inspection
methods, but much work is needed to extend their scope to
MRS, to assess them and to compare them.
      </p>
      <p>
        Methods based on heuristics or recommendations
are difficult to carry out due to the limited number
of recommendations available and to the fact that
« classical » ergonomic recommendations have not
yet been applied or extended to VE. In addition,
such recommendations are not presented in a
common unitary format, but often as experimental
or « best practice » results; see for instance
Gabbard [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]; Kaur [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]; Stanney[
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]; generally,
such documents are organized according to major
categories of VE objects. Also, based on an
extensive literature review, a set of 170
recommendations dedicated to VE has been
extracted in a generic format and organized
according to usability dimensions [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
        Along those lines, with the goal of defining a
structured ergonomic inspection method, an
adaptation of the Ergonomic Criteria (E. C.) has
been proposed and assessed in terms of intrinsic
validity [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. First of all, applying the E. C.
inspection method requires expertise in VE and
training on the 20 dimensions covered by these
E. C. Also, the method needs further evaluation
for VE and extension to MRS.
      </p>
      <p>
        A method dedicated to VR, based on the cognitive
walkthrough method, has also been designed by Sutcliffe
[
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]. This method is based on a theory of interaction
(Norman [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]). This walkthrough analysis method uses
three models (first on goal-oriented task actions, second on
exploration and navigation in virtual worlds, third on
interaction in response to system initiative) derived from
the theory. Each stage of the model is associated with
generic design properties. The evaluation method consists
of a checklist of questions using the properties and
following the steps of the method. That method could
possibly be extended to MRS.
      </p>
      <p>
        There are also methods, mainly for VE, based on
recommendations, using computer-based support, such as
I-DOVE (Interactive tool for development of
Virtual Environments) [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. The goal of this
prototype, currently being developed as a
large-scale web-based application based on several sets
of recommendations, is to offer context-specific
guidance for VE development, and alternative
ways of searching and browsing, distinguishing
user categories. The initial prototype was based on
user interviews and was later assessed through expert
evaluation.
      </p>
      <p>
        Another tool is MAUVE [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]; also developed as a
website, it incorporates design guidelines
according to several VE categories such as
navigation, object manipulation, input, output and
so on (based on Gabbard’s [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] taxonomy). A
multi-criteria usability matrix is the support for
organizing and retrieving recommendations. The
evaluation process is supported in two steps:
a “traditional” heuristics stage, and a prioritization of
usability attributes. That capacity for tailoring the
evaluation may be interesting for evaluators with
different points of view or organizational goals.
The tool has not been evaluated yet.
      </p>
      <p>
        The last tool that we know of is a
hypertext-based prototype developed by Kaur [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. This tool
presents 45 generic design properties that specify
the necessary support from the system for
“successful” VE interfaces. As with the previous tool,
no usability evaluation plan was integrated during
the development of the prototype.
      </p>
      <p>
        There is also the question of adapting, at least partly, to
MRS some of the « classical » ergonomics methods. Such
methods are too numerous to be all discussed here (see for
instance, [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]). However, one can mention three
categories of methods that are general enough in their
approach to be good candidates for MRS: questionnaires/interviews,
inspection methods and user testing. A number
of questions must be resolved in order to apply them to
MRS.
      </p>
      <p>
        Questionnaires and interviews allow gathering
subjective data, often quite important for evaluating
visual appeal, preferences, aesthetics, missing
functionalities, and also very useful as a means to
compare or cross-reference performance data. Such
methods are certainly interesting candidates for
being applied to MRS, provided that specific lead
questions for interviews and questionnaire items
are tailored and validated for such environments.
For questionnaires, Kalawsky [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] has designed
VRUSE for measuring usability of a VR
application in terms of users’ attitude and
perception. The 100 questionnaire items are
organized under 10 usability factors: functionality,
input, output, user guidance, consistency,
flexibility, simulation fidelity, error correction,
presence and overall system usability. This
questionnaire has been tested in terms of
reliability (Cronbach's alpha &gt; 0.9).
      </p>
      <p>Inspection methods are also good candidates for
supporting MRS evaluations. The problem there
is the need for more data (particularly
recommendations) and more data organization in order to
cover the range of ergonomic problems related to
MRS. Issues regarding the identification of recommendations
and their structuring into dimensions need to be
looked at carefully; the history of HCI has shown
so far that such dimensions can be established and
can contribute efficiently to the evaluation of GUIs, of the
Web, and currently of VR, but the specificities of MR
need to be taken into consideration (e.g., task
compatibility, device consistency for
visualization, document compatibility, innovative help
systems, etc.).</p>
      <p>User Testing has been the major method in
ergonomics and will probably remain as important
for MRS. However, in order to apply that method
to MRS, a number of methodological problems
need to be tackled, including of course the
problems already experienced with a few cases of
user testing of VE systems (see next section).</p>
      <p>METHODOLOGICAL DIFFICULTIES WITH USER TESTING</p>
      <p>User testing is certainly the preferred method to be used,
particularly in order to alleviate the current lack of available
usability data. However, many methodological problems
require solutions.</p>
      <p>First of all, let us use a metaphor that is both historical and
aeronautical. Looking back over a century, let us imagine
that the following question had been asked at that time: « which
flying machine constitutes the best way to move rapidly through
the air from one point to another: the blimp or the
airplane? ». In those times, airplane trials were just
starting; it would have been difficult to find users (i.e.,
pilots) able to fly (usually the few pilots were the airplane
designers themselves); the underlying technology was only
emerging, and planes often had numerous technical
problems or simply crashed. If at that time one had
conducted comparative performance testing, there is no
doubt that the blimp would have won over the airplane!
However, by now, everyone would agree that airplanes are
better than blimps for long-distance transport of passengers.</p>
      <p>Drawing the parallel with HCI technology, « classical
HCI » would be our blimps and MRS would be our
planes. In several ways, MRS are at the same point as
planes in the early 20th century:</p>
    </sec>
    <sec id="sec-2">
      <title>There are very few experts that can operate them.</title>
      <p>The characteristics of tasks that can be performed
in such environments are still quite vague.</p>
    </sec>
    <sec id="sec-3">
      <title>Learning by trying is still the rule.</title>
      <p>There are many problems to solve in order to
« fly » those environments: on the technical
aspects (e.g., computer graphics); on the
interaction aspects (e.g., devices and modalities);
and on their usability.</p>
      <p>
        That state of affairs can explain why results are often
disappointing [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] when classical graphics environments
are compared to VE or MRS.
      </p>
      <p>Before conducting user tests or comparing interactive
situations, it is best to:</p>
      <p>First, make sure that the environments are already
sufficiently well designed that well-known
usability deficiencies have been removed. This is a sound
precaution that helps to focus user testing on the real
« new » usability problems, and the test
comparisons on the real usability hypotheses, rather than
obscuring the picture with unwanted usability
problems, both for the user and for the
experimental data analyst. This can be achieved by careful
assessment of the design, for instance by
applying available inspection methods and sets of
recommendations.</p>
      <p>Second, alleviate as much as possible the various
limitations of user testing due to the specificities
of VE and MRS. Some of those limitations are
described in the next three subsections.</p>
      <p>Limitations related to the physical environment</p>
      <p>One of the major differences between MRS and traditional
interfaces concerns their physical environment. MRS
require a more sophisticated environment: users rarely
just sit in front of their computer; they move from one
place to another, they talk, and they move parts of their body in
order to interact.</p>
      <p>
        This raises several problems in evaluation situations, in
which it is useful to prepare the experiments in order to
avoid some disturbances. For instance, if the MRS
application area is in an office, it is necessary to limit the
interaction zones in order to avoid collisions with chairs,
tables, or cables. Other constraints can complicate the
situation; for instance, interaction devices can be an
obstacle to data collection: the use of stereoscopy does not
allow data to be collected on video or on a monoscopic
monitor, in which case the evaluator must access
the stereoscopic data directly, using a tracker or some parallel
application software, so that the evaluator does not become
another user (even if a passive one) in the scene,
which would certainly become a major bias in terms of
« presence » [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>MRS can also be multi-user and require a large
experimentation space, or they can be used outdoors, which makes
the use of current usability laboratories difficult or even
impossible, since these are located in limited spaces to facilitate
data extraction.</p>
      <p>MRS using video projection can very easily raise the
room temperature in usability laboratories, which can
become an important, often underestimated, bias in the
experiments.</p>
      <p>Difficulties in the set up of user testing</p>
      <p>First of all, the complexity of interactive situations with
MRS may necessitate more resources than usual user
testing: several evaluators may be needed to
extract the data of interest (e.g., checking on performance,
on various modalities, on various media, etc.). Also, the
use of video (several recordings, made simultaneously on various types
of events) may be mandatory, as user behavioural sequences
are more complex to extract and describe.</p>
      <p>
        In addition to the complexity of software programming for
such environments [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ], setting up experiments may also
be more complex, as it requires more technical specialists to
calibrate and tailor the several types of technologies and the various
software supporting the complexity of MRS.
      </p>
      <p>In case of devices or system breakdown, it may be more
difficult and more time consuming to restart the devices or
system, which may jeopardize the outcome and measures in
the user testing. Therefore, it is even more important for
MRS to be as stable as possible for the duration of the
evaluation experiments.</p>
      <p>Another obstacle, which is specific to MRS (and to all
novel applications), is that users may not know at all
how the situation works, which means extra time and
care in learning the experimental requirements, the task goals,
and the ways to operate the various parts of the environment.
This also leads to questions on how best to describe the
tested situation without coaching the subjects too much, if
one wants to study their “intuitive” performance or
preferences.</p>
      <p>Along the same lines, an additional difficulty is simply that
it may be hard to explain the complexities of MRS:
written instructions may not be sufficient
beforehand, and are impossible to use during the test when several
input or display devices are used together (the user cannot
be using an eye-tracker and a data glove, watching a large
display, and at the same time leafing through a booklet of task
instructions). In addition, when help is needed, where should
a help system be made available?</p>
      <p>In some cases, as current technology is unable to support
novel interaction paradigms fully and consistently, there is
no way to test those new ideas other than by using « Wizard of
Oz » techniques, which require trained specialists and
a specific, carefully balanced experimental design.</p>
      <p>In other cases, the techniques used for interacting and those
used for gathering data from subjects obviously conflict! For
instance, thinking aloud cannot work well when
MRS use (as they may often do) voice recognition as an
input mechanism!</p>
      <p>Limitations related to the subjects</p>
      <p>A first problem is that the application of MRS is not
always driven by application needs, but by the design of
new interaction paradigms, which makes it difficult to
specify precise and accurate task and user requirements.
It is therefore quite difficult to generalize, because the user
profile is often ill-defined; sometimes, it is even the
technical people who developed the MRS who are tested!</p>
      <p>Another difficulty is that, at the current stage, it is still
practically impossible to distinguish subjects in terms of experience
with MRS (e.g., novices vs. experts), as is usually done
with “classical” HCI. That holds true as
well for the skills of the human factors specialists in charge of
the evaluations!</p>
      <p>Experimental design may also run into difficulties with the
number of subjects needed to cover the many potential
variables involved in MRS. For instance, if one wants
« simply » to compare various combinations of interactive
devices associated with an MRS, such as 3 devices for each
of 3 user channels (e.g., voice, gesture,
eye-gaze), one needs 27 different testing situations (i.e., 3 x 3 x 3
device combinations) in order to have all subjects
participate in all possible combinations, which can become
quite exhausting for the subjects and therefore potentially
inconclusive for the experiment, due to fatigue, learning
curve, etc.</p>
      <p>And if one decides to set up the experimental design
without repeated measures, assigning a different group of
subjects to each situation, for instance at a minimum of 5
subjects per cell, then the number of subjects required
(135, i.e., 27 x 5) quickly becomes very large and consequently
quite complex to recruit and manage, not to mention the
additional difficulty of making sure that the subject
population is homogeneous.</p>
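      <p>The counts discussed above follow from simple combinatorics, and a short sketch makes them easy to recompute for other designs. The Python fragment below is only an illustration under stated assumptions: the device names are hypothetical placeholders (the text only specifies 3 devices for each of 3 channels), and the helper names factorial_conditions and subjects_needed are introduced here for the example.</p>
      <preformat>
```python
from itertools import product

# Hypothetical device names: the paper only states 3 devices per channel.
channels = {
    "voice": ["voice_a", "voice_b", "voice_c"],
    "gesture": ["gesture_a", "gesture_b", "gesture_c"],
    "eye_gaze": ["gaze_a", "gaze_b", "gaze_c"],
}


def factorial_conditions(channels):
    """Return every combination of exactly one device per channel."""
    names = list(channels)
    return [dict(zip(names, combo)) for combo in product(*channels.values())]


def subjects_needed(n_conditions, per_cell):
    """Design without repeated measures: one separate group per condition."""
    return n_conditions * per_cell


conditions = factorial_conditions(channels)
print(len(conditions))                      # 27 testing situations
print(subjects_needed(len(conditions), 5))  # 135 subjects at 5 per cell
```
      </preformat>
      <p>Changing the number of devices per channel or the group size shows how quickly a full factorial design grows, which is the practical concern raised here.</p>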
      <p>Other limitations can be identified for the subjects such as
the consequences of « cyber sickness ». Such limitations
are both ethical (concerning the decision to run the user
testing) and practical (concerning the post-experimental
arrangements).</p>
      <p>In terms of ethics, one can wonder whether it is legitimate to
expose subjects to physiological problems that can be
serious, such as fainting fits or ataxia. In some countries,
such as France, a law covers experimental situations of that
nature and requires the experimental setting to include
the presence of certified physicians.</p>
      <p>
        Concerning the management of subjects’ potential disorders
after the experiments, protocols are available, e.g., Stanney
[
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], to make sure that a secure environment is provided to
diagnose and to avoid potential incidents or accidents. Of
course, in the event of « cyber
sickness », performance measures are completely biased,
unless the goal of the experiment is to test such
disorders! In any case, it is mandatory to have ways, in an
experiment, to diagnose early manifestations of such
sickness, in order to limit its physiological consequences.
      </p>
      <p>DISCUSSION AND PERSPECTIVES</p>
      <p>This paper has presented a number of ergonomic issues related
to the evaluation of VE and extensions to MRS.
Among these, it may be fruitful for the future usability of
MRS to investigate and to coordinate a number of research
avenues.
      </p>
      <p>In terms of ergonomic knowledge, a common effort
should be pursued to ensure that MRS are, at all stages of
design, evaluated for usability; that such data is sound and
generic; and that, whenever possible, that knowledge is
made available as recommendations and shared for further
design and evaluation.</p>
      <p>In terms of methods, several points can be made:</p>
      <p>For user testing, it would be useful to study the
design of user testing protocols in order to
alleviate the various biases identified.</p>
      <p>It is particularly important to make sure that user
testing of MRS addresses well-identified usability
questions rather than just testing the environment
as it is. Care should be taken to “clean up” MRS of
well-known problems before testing the genuine ergonomic
problems raised by new interaction paradigms.</p>
      <p>In addition, organizing some kind of efficient
communication within the MRS research
community may help to cross-reference and
increase common knowledge on usability, e.g.,
from the various user tests performed in different
laboratories. This would certainly help in the
generalization of usability results and lead to
commonly agreed generic recommendations.</p>
      <p>Using common testing platforms and testing
protocols would help to compare and share results.</p>
      <p>
        This would also facilitate the design and sharing of
common recommendation databases. In that respect,
it would also be useful to share
mechanisms allowing the usable organization,
storage, and retrieval of such sets of
recommendations, using dedicated software (e.g.,
tools for managing multiple guideline bases,
comparable to MetroWeb [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]).
      </p>
      <p>Given the assumed combinatory
character of the ergonomic knowledge to be applied to
MRS (real, « classic » &amp; « web » computer-based,
« VR », and « MR specific »), an effort should be
made to distinguish the problems and to apply
solutions from existing knowledge, for instance on the
physical parameters of MRS situations (noise,
lighting, temperature, workload, perception,
cognition, etc.), using knowledge related to
physiology, psychophysiology, anthropometrics,
etc., and of course software ergonomics. In that
area, a number of results on GUIs, the Web, etc., are
obviously applicable, at least partly, to MRS situations
(e.g., information presentation in 2D graphics,
navigation on the internet).</p>
      <p>Also, issues related to social psychology
and organizational ergonomics may apply, as such
systems may concern overall organizational
activities, cooperative work, virtual organizations, etc.
In terms of measurements, one may also want to
look at measures other than performance, such as
preferences, or even levels of addiction (e.g., in
games).</p>
      <p>Inspection methods are good candidates as
complementary methods to user testing (for
instance, just before user testing). Ergonomic
Criteria (op. cit.) have been proposed for VE as a
basis for ergonomic inspection, but it has been
necessary not only to modify one criterion
(“Significance of codes and behaviour”) and to
add two new ones (“Grouping-Distinguishing
items by behaviour” and “Physical workload”), but
also to adapt their definitions and justifications and to
add more illustrative examples and
counterexamples, in order to take into account
recommendations specific to VE. These criteria are
also currently being tested for their efficiency
(i.e., the number and quality of usability problems
diagnosed) compared with expert evaluation and
user testing.</p>
      <p>These Ergonomic Criteria could be candidates for
further adaptation to the specifics of MRS,
provided that empirical data on such environments
become available on a large scale. One can already
consider that more criteria might be needed, simply
because MRS concentrate many
ergonomic issues from Reality to
Virtuality, not to mention the specific issue of
“continuity”.</p>
      <p>
        In terms of MRS-specific questions, considerable attention
should be given to the issue of « continuity ». That issue,
which is certainly a major usability property [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ], needs to
be further investigated in order to better define the concept
and its components, and to distinguish « continuity »
from other dimensions related to guidance,
the structuring of information, the compatibility with the
tasks or common practice, the consistency of information
presentation, of modalities, of procedures, etc.
      </p>
      <p>Another issue that seems to carry new types of problems is
the design of help systems: how should they be organized,
which type of support should they use, etc.</p>
      <p>In terms of models, it would be interesting to examine and
to coordinate the characteristics of the models used for
MRS (e.g., task models, interaction models, formal
models, architectures), and to identify, when useful, the
potential communication mechanisms between these
models. This has been a recurrent issue for GUIs (also in
terms of system development lifecycle processes); it should
be no different for MRS.</p>
      <p>Other issues are worth discussing as well, such as the need
for application and task models and for agreed-upon extended
task taxonomies, for instance in order to be able to
compare evaluation results and to generalize from one application
domain to another.</p>
      <p>Partly linked to that issue is the need for shareable
classifications of MR elements and for common MR object models
(and a shared vocabulary), not only to facilitate the analysis
of MRS, but also to compare results within the
scientific community and to facilitate the design of strategies for
inspection methods and heuristics.</p>
      <p>Discussions on these various items, together with other
issues talked about during the Workshop on "Exploring the
design and engineering of Mixed Reality Systems", should
lead to some insight for future research and contribute to
increased knowledge on shared benchmarks. There is
certainly lots of work at hand for this workshop, but also for
the following ones!</p>
      <p>http://www.soi.city.ac.uk/~dj524/demtool/frame.htm</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Bach</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Scapin</surname>
            ,
            <given-names>D. L.</given-names>
          </string-name>
          <article-title>Recommandations ergonomiques pour l'inspection d'environnements virtuels</article-title>
          . (Rapport de contrat).
          <source>Projet EUREKA-COMEDIA, INRIA Rocquencourt</source>
          , France,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Bach</surname>
            <given-names>C.</given-names>
          </string-name>
          &amp;
          <string-name>
            <surname>Scapin</surname>
            <given-names>D. L</given-names>
          </string-name>
          .
          <article-title>Adaptation of Ergonomic Criteria to Human-Virtual Environments Interactions</article-title>
          . in Interact'03. IOS Press.
          <year>2003</year>
          . pp.
          <fpage>880</fpage>
          -
          <lpage>883</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Bastien</surname>
            ,
            <given-names>J. M. C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Scapin</surname>
            ,
            <given-names>D. L.</given-names>
          </string-name>
          <article-title>Les méthodes ergonomiques : de l'analyse à la conception et à l'évaluation</article-title>
          . Traité d'ergonomie, P. Falzon (Ed.), Masson.
          <year>2003</year>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Bastien</surname>
            ,
            <given-names>J. M. C.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Scapin</surname>
            ,
            <given-names>D. L.</given-names>
          </string-name>
          <article-title>Évaluation des systèmes d'information et Critères Ergonomiques</article-title>
          . In
          <source>Systèmes d'information et Interactions homme-machine</source>
          , C. Kolski (Ed.), Hermès.
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Bowman</surname>
            ,
            <given-names>D.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gabbard</surname>
            ,
            <given-names>J.L.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Hix</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <article-title>A Survey of Usability Evaluation in Virtual Environments: Classification and Comparison of Methods</article-title>
          .
          <source>Presence: Teleoperators and Virtual Environments</source>
          . Vol.
          <volume>11</volume>
          , n°
          <issue>4</issue>
          ,
          <year>2002</year>
          , pp.
          <fpage>404</fpage>
          -
          <lpage>424</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Dubois</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gray</surname>
            ,
            <given-names>P.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nigay</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <article-title>ASUR++: a Design Notation for Mobile Mixed Systems</article-title>
          .
          <source>Interacting With Computers</source>
          , Special Issue on Mobile HCI,
          <string-name>
            <surname>Paterno</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          (ed),
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Gabbard</surname>
            ,
            <given-names>J.L.</given-names>
          </string-name>
          <article-title>Researching Usability Design and Evaluation Guidelines for Augmented Reality (AR) Systems</article-title>
          .
          <year>2001</year>
          . Available: http://www.sv.vt.edu/classes/ESM4714/Student_Proj/class00/gabbard/
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Gabbard</surname>
            ,
            <given-names>J.L.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Hix</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <article-title>A taxonomy of usability characteristics in virtual environments</article-title>
          .
          <year>1997</year>
          . Available: http://csgrad.cs.vt.edu/~jgabbard/ve/taxonomy/
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9. ISO/TS 16982.
          <article-title>Ergonomics of human-system interaction - Usability methods supporting human-centred design</article-title>
          .
          <year>2000</year>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Kalawsky</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <article-title>VRUSE - a computerised diagnostic tool: for usability evaluation of virtual/synthetic environment systems</article-title>
          .
          <source>Applied Ergonomics</source>
          , Elsevier (ed) n°
          <volume>30</volume>
          ,
          <year>1999</year>
          , pp.
          <fpage>11</fpage>
          -
          <lpage>25</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Karampelas</surname>
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grammenos</surname>
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mourouzis</surname>
            <given-names>A.</given-names>
          </string-name>
          &amp;
          <string-name>
            <surname>Stephanidis</surname>
            <given-names>C.</given-names>
          </string-name>
          <article-title>Towards I-dove, an interactive support tool for building and using virtual environments with guidelines</article-title>
          .
          <source>Proceedings of HCI</source>
          , 22-27 June 2003, Crete, Greece, vol.
          <volume>3</volume>
          , pp.
          <fpage>1411</fpage>
          -
          <lpage>1415</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Kaur</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          <article-title>Designing virtual environments for usability</article-title>
          .
          <source>Ph. D. Thesis</source>
          , City University, London,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Kaur</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          <article-title>Designing Usable Virtual Environments</article-title>
          .
          <year>1997</year>
          . Demo available: http://www.soi.city.ac.uk/~dj524/demtool/frame.htm
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Loeffler</surname>
            ,
            <given-names>C.E.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Anderson</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          <article-title>The Virtual Reality Casebook</article-title>
          . New York: Van Nostrand Reinhold,
          <year>1994</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>15. MetroWeb: Available: http://www.isys.ucl.ac.be/bchi/research/metroweb.htm</mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Milgram</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Takemura</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Utsumi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Kishino</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          in
          <source>SPIE 94</source>
          , Vol.
          <volume>2351</volume>
          , Telemanipulator and Telepresence Technologies
          ,
          <year>1994</year>
          , pp.
          <fpage>282</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Nigay</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dubois</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Renevier</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pasqualetti</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Troccaz</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <article-title>Mixed Systems: Combining Physical and Digital Worlds</article-title>
          .
          <source>Proceedings of HCI</source>
          , 22-27 June 2003, Crete, Greece, vol.
          <volume>1</volume>
          , pp.
          <fpage>1203</fpage>
          -
          <lpage>1207</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Norman</surname>
            ,
            <given-names>D. A.</given-names>
          </string-name>
          ,
          <article-title>Cognitive engineering</article-title>
          . In D. Norman and S. Draper (eds),
          <source>User centered system design: New perspectives on Human Computer Interaction</source>
          (Hillsdale NJ: LEA),
          <year>1986</year>
          , pp.
          <fpage>31</fpage>
          -
          <lpage>62</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Porcher Nedel</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dal Sasso Freitas</surname>
            ,
            <given-names>C. M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jacon Jacob</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Soares Pimentas</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <article-title>Testing the Use of Egocentric Interactive Techniques in Immersive Virtual Environments</article-title>
          .
          In
          <source>Proceedings of INTERACT'03</source>
          (Zurich, September
          <year>2003</year>
          ), IOS Press, pp.
          <fpage>471</fpage>
          -
          <lpage>478</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Stanney</surname>
            ,
            <given-names>K. M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kennedy</surname>
            ,
            <given-names>R. S.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Kingdon</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          <article-title>Virtual Environment Usage Protocols</article-title>
          . in Handbook of Virtual Environments, LEA Publishers,
          <year>2002</year>
          , pp.
          <fpage>721</fpage>
          -
          <lpage>730</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Stanney</surname>
            ,
            <given-names>K.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mollaghasemi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Reeves</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <article-title>Development of MAUVE, the multi-criteria assessment of usability for virtual environment system</article-title>
          . (Final Rep., Contract N°. N61339-99-C-0098). Orlando, FL: Naval Air Warfare Center Training Systems Division.
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Sutcliffe</surname>
            ,
            <given-names>A. G.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Kaur</surname>
            ,
            <given-names>K. D.</given-names>
          </string-name>
          <article-title>Evaluating the usability of virtual reality user interfaces</article-title>
          .
          <source>Behaviour &amp; Information Technology</source>
          , vol.
          <volume>19</volume>
          , n°6.
          <year>2000</year>
          . pp.
          <fpage>415</fpage>
          -
          <lpage>426</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Träskbäck</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Koskinen</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Nieminen</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <article-title>User-Centred Evaluation Criteria for Mixed Reality Authoring Application</article-title>
          .
          <source>Proceedings of HCI</source>
          , 22-27 June 2003, Crete, Greece, vol.
          <volume>3</volume>
          , pp.
          <fpage>1263</fpage>
          -
          <lpage>1267</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Trevisan</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vanderdonckt</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Macq</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <article-title>Continuity as a Usability Property</article-title>
          .
          <source>Proceedings of HCI</source>
          , 22-27 June 2003, Crete, Greece, vol.
          <volume>3</volume>
          , pp.
          <fpage>1268</fpage>
          -
          <lpage>1272</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>Tromp</surname>
            ,
            <given-names>J. G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Steed</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          &amp;
          <string-name>
            <surname>Wilson</surname>
            ,
            <given-names>J. R.</given-names>
          </string-name>
          <article-title>Systematic Usability Evaluation and Design Issues for Collaborative Virtual Environments</article-title>
          . Presence, Vol.
          <volume>12</volume>
          , n°3, June
          <year>2003</year>
          . pp.
          <fpage>241</fpage>
          -
          <lpage>267</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <surname>Wilson</surname>
            ,
            <given-names>J. R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Eastgate</surname>
            ,
            <given-names>R. M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>D'Cruz</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <article-title>Structured Development of Virtual Environments</article-title>
          . in Handbook of Virtual Environments, LEA Publishers,
          <year>2002</year>
          , pp.
          <fpage>353</fpage>
          -
          <lpage>378</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>