=Paper=
{{Paper
|id=Vol-490/paper-10
|storemode=property
|title=Early User-testing Before Programming Improves Software Quality
|pdfUrl=https://ceur-ws.org/Vol-490/paper_10.pdf
|volume=Vol-490
|dblpUrl=https://dblp.org/rec/conf/iused/PetterssonN09
}}
==Early User-testing Before Programming Improves Software Quality==
John Sören Pettersson
Department of Information Systems, Karlstad University, Karlstad, Sweden
+4654 700 2553
John_Soren.Pettersson@kau.se

Jenny Nilsson
Department of Information Systems, Karlstad University, Karlstad, Sweden
+4654 700 1135
Jenny.Nilsson@kau.se
ABSTRACT
This position statement does not focus on usability as such, although it presents data from a software update cycle in which several usability and user-centred methods were used. The important lesson learnt is that a better (more complete) specification before programming results in fewer errors in the code, and that such a specification can be reached by user tests of interactive mockups.

Categories and Subject Descriptors
D.2.1 [Software Engineering]: Requirements/Specifications – Elicitation methods.
D.2.2 [Software Engineering]: Design Tools and Techniques – Evolutionary prototyping, User interfaces.

General Terms
Design, Experimentation, Human Factors.

Keywords
Software quality, Early user-testing, Wizard-of-Oz prototyping.
1. CASE STUDY
Frequent testing of developing software can certainly increase the usability of the program. However, as we found in a case study, the method seems to continuously introduce changed or new requirements, which in turn result in more complex code and thereby more errors. This case study consisted of a large update cycle of a software package in the area of decision support systems for civil protection. The update involved a complete re-programming of the four largest modules. Several smaller updates had been made prior to the large update cycle, and requirements for the update had (as always) been collected from the large user groups. The organisation had routines for collecting requirements from users, client organisations, and other stakeholders.

Their approach thus bore much resemblance to principles found in user-centric approaches such as the MUST method [2]. The organisation had, however, recognised that usability was an issue even if the type of functions provided by the system was requested by client organisations and their employees. They had also included a continuous process of debugging using experienced users and content experts in their update cycles. One can say that the developers were not aware of the methodological critique expressed in one paper as “Close Co-operation with the Customer Does Not Equal Good Usability” [4] (cf. also [1]).

Through an HCI student’s exam work for the organisation, its developers became aware of the Wizard-of-Oz method, by which one can test mocked-up designs as if they were already implemented [3]. A more experienced Wizard (the second author) was hired as a usability expert and design aide and stayed through the three-year update project of the software package.

Due to the size of the project, the Wizard could not pre-test every module: one of the four largest modules was not mocked up in advance. Figure 1 shows the two user-centred processes employed in this large update project (the debugging commenced half a year after programming had started).

Figure 1. Flow of work with and without Early User-Testing. (Two alternative user-centred processes: the EUT path goes from requirements from users through two cycles of early user-testing, with redesign of the UI and some other specifications, to programming with debugging; the No-EUT path goes from a requirements specification directly to programming with debugging. Both paths include repeated evaluations by experienced users, content experts and an HCI expert, and both aim at usable and error-free program modules.)
2. ERROR RATES
The debugging process showed an interesting difference in the number of errors found in the module lacking pre-testing and a comparable pre-tested module. Table 1 indicates both the size (in MB) and the number of files of the two modules. Error rates are given in relation both to size and to number of files. The EUT-developed module has about one-fifth of the error rate of the not-EUT module for the “Prio 1” errors (called “critical errors” in the debugging reports). In total, the error rate for the first module is only half of what was found in the second module.
Table 1. Error rates relative to program size (MB and number of files)

                                            Prio 1          Prio 2 + 3      Prio 1, 2, 3
Program                                     #/MB  #/files   #/MB   #/files  #/MB   #/files
EUT module (1.5 MB, 145 files)              4.67  0.05      68.00  0.70     72.67  0.75
Not-EUT module (2.0 MB, 230 files)          32.50 0.28      101.00 0.88     133.50 1.16
Error rates proportionally (EUT / Not-EUT)  0.14  0.17      0.67   0.80     0.54   0.65

Note: Priority 3 was noted in the error reports as “Next version”, often new requirements, while Priority 1 denoted “critical errors”.
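As a sanity check, the rates in Table 1 can be recomputed from absolute error counts. The counts used in the sketch below (7/102 errors for the EUT module, 65/202 for the not-EUT module) are back-calculated by us from the published per-MB rates and module sizes (e.g. 4.67/MB × 1.5 MB ≈ 7); they are an inference for illustration, not figures quoted from the debugging reports.

```python
# Minimal sketch reproducing Table 1 from inferred raw error counts.
# Counts are back-calculated from the published rates, not taken
# directly from the debugging records.

modules = {
    # name: (size_mb, n_files, prio1_errors, prio23_errors)
    "EUT":     (1.5, 145,  7, 102),
    "Not-EUT": (2.0, 230, 65, 202),
}

def rates(size_mb, n_files, prio1, prio23):
    total = prio1 + prio23
    return {
        "Prio 1 #/MB":      prio1 / size_mb,
        "Prio 1 #/files":   prio1 / n_files,
        "Prio 2+3 #/MB":    prio23 / size_mb,
        "Prio 2+3 #/files": prio23 / n_files,
        "Prio 1,2,3 #/MB":      total / size_mb,
        "Prio 1,2,3 #/files":   total / n_files,
    }

table = {name: rates(*data) for name, data in modules.items()}
for name, row in table.items():
    print(name, {k: round(v, 2) for k, v in row.items()})

# Proportional rates (EUT / Not-EUT), the last row of Table 1:
ratio = {k: round(table["EUT"][k] / table["Not-EUT"][k], 2)
         for k in table["EUT"]}
print("EUT / Not-EUT", ratio)
```

Running this reproduces all three rows of Table 1 to two decimals.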
It is not meaningful to compare program modules without considering the relative complexity of each module. The two other EUT modules were only half the size of the one we selected for this error comparison but contained, relative to their size, many more errors than the modules in Table 1. However, these other modules contained specific, database-related complexities and can only be used for certain comparisons (see 2.2).

2.1 The debugging process
The debugging process commenced nearly a year before the final launch of the new version. The debugging was conducted by three groups which were very familiar with the functional requirements: a group of very experienced users, the HCI expert, and the content managers for the different modules’ databases.

The bug-finding by experienced users sometimes resulted in new requirements coming up. Interestingly, this was also the case for the debugging done by the content experts (who had not been involved in the pre-tests before programming; they had only seen and accepted the requirements specifications).
2.2 New requirements
For the first module in Table 1, only 4 new requirements came up in the extensive debugging process, while for the second module there were 13. This we hold to be the source of many of the other errors: when new functions are introduced into the development process, it is harder for the programmers to maintain clean and easily predictable code.

That early user-testing can capture many requirements was shown by a third module, smaller in size than the two modules in Table 1 (0.7 MB and consisting of only 55 files). This third module mainly consisted of a library, and the content expert for this module found many faults during the debugging process: among these were in effect 24 new requirements. In the HCI expert’s (i.e. the Wizard’s) opinion, most of the new requirements could have been spotted if the content expert had been included in the pre-testing, which could have been done without the Wizard setting up special test scenarios for content experts. This is important when the Wizard-of-Oz method is used, as the method incurs some extra costs when mockups have to be prepared before tests.
3. EARLY USER-TESTING
The much criticized Waterfall model for systems development, where all specifications should be settled before the laborious tasks of modelling and programming take place, admittedly has some advantages, but only if all requirements really can be settled in advance. By early prototyping, designers can approach this goal. In the case study, the Wizard-of-Oz method was used with user interfaces often based on previous versions of the system. What was needed was elaboration of the interaction design and uncovering of interdependences between various function requirements. This was met by the WOz prototyping, which was conducted in two rounds: a first one on a rough design with 8 participants, and a second one six months later on a detailed design with 5 participants. Although the interaction is ‘real’ in WOz experiments, the graphics can be crude in early design phases.

Setting up a WOz environment for testing is laborious, as the Wizard must have control over what the user sees on the monitor (and hears from the loudspeakers), but in our research group we have developed a ‘general-purpose’ WOz system, which we call Ozlab, that facilitates the setting up of tests enormously (cf. e.g. [5]). A WOz set-up also allows designers to probe their own designs and find interaction bugs even before testing.

Still to be evaluated is how much more the error correction cost in comparison with the cost of the Wizard’s work, but from our experiences of this project (and noting the difference in salaries between usability people and programmers…) it seems a safe bet that EUT injected as in Figure 1 pays off very well, to say nothing of how much frustration it saves.
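To make the division of labour concrete: in a Wizard-of-Oz test the participant interacts with what looks like a working interface, while a hidden Wizard decides how the “system” responds. The sketch below is a deliberately minimal, single-terminal illustration of that control loop; it is not Ozlab’s actual architecture or API, and all screen names and commands are invented for the example.

```python
# Minimal sketch of the Wizard-of-Oz control loop: the participant
# acts, and the (hidden) Wizard chooses what the mocked interface
# shows next. Screen names and commands are invented; this is not
# Ozlab's architecture or API.

SCREENS = {
    "start":    "=== Decision Support ===\n[1] New incident  [2] Map view",
    "incident": "--- New incident ---\nType: ___  Location: ___",
    "map":      "--- Map view ---\n(crude placeholder graphics suffice)",
    "error":    "!!! The system did not understand that action.",
}

def show(screen_id: str) -> None:
    """Render the mocked screen on the participant's display."""
    print("\n" + SCREENS[screen_id] + "\n")

def wizard_chooses(participant_action: str) -> str:
    """The Wizard sees the participant's action and picks the response.
    In a real set-up this happens on a second, hidden monitor."""
    print(f"[wizard console] participant did: {participant_action!r}")
    choice = input(f"[wizard console] next screen ({'/'.join(SCREENS)}): ").strip()
    return choice if choice in SCREENS else "error"

def run_session() -> None:
    show("start")
    while True:
        action = input("participant> ").strip()
        if action in {"quit", "exit"}:
            break
        show(wizard_chooses(action))

if __name__ == "__main__":
    run_session()
```

Even in this toy form the point stands: because the Wizard, not code, produces every response, the interaction can be tested before any behaviour is implemented, and designers can also “play system” against their own designs to find interaction bugs.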
4. REFERENCES
[1] Ambler, S.W. 2004. Tailoring Usability into Agile Software Development Projects. In Maturing Usability, eds. Law, Hvannberg & Cockton, pp. 75-95. Springer-Verlag.
[2] Bødker, K., Kensing, F. and Simonsen, J. 2004. Participatory IT Design: Designing for Business and Workplace Realities. MIT Press.
[3] Gould, J.D. and Lewis, C. 1985. Designing for usability: key principles and what designers think. Comm. ACM 28:300-311.
[4] Jokela, T. and Abrahamsson, P. 2004. Usability Assessment of an Extreme Programming Project: Close Co-operation with the Customer Does Not Equal Good Usability. In PROFES 2004 Proceedings, pp. 393-407. Springer-Verlag.
[5] Molin, L. 2004. Wizard-of-Oz Prototyping for Cooperative Interaction Design of Graphical User Interfaces. In Proceedings of the Third Nordic Conference on Human-Computer Interaction, 23-27 October, Tampere, Finland, pp. 425-428.