Are We Overconfident in Our Understanding of Overconfidence?
Raymond R. Panko
Shidler College of Business
University of Hawai`i
2404 Maile Way
Honolulu, HI 96821
001.808.377.1149
Ray@Panko.com


ABSTRACT
In spreadsheet error research, there is a Grand Paradox. Although many studies
have looked at spreadsheet errors and have found, without exception, error
rates that are unacceptable in organizations, organizations continue to ignore
spreadsheet risks. They do not see the need to apply software engineering
disciplines long seen to be necessary in software development, in which error
types and rates are similar to those in spreadsheet development.
Traditionally, this Grand Paradox has been attributed to overconfidence. This
paper introduces other possible approaches for understanding the Grand
Paradox. It focuses on risk blindness, which is our unawareness of errors when
they occur.

Categories and Subject Descriptors
K.8.1: Spreadsheets. D.2.5: Testing and Debugging.

General Terms
Experimentation, Verification.

Keywords
Methodology, Spreadsheet Experiments, Experiments, Inspection, Sampling,
Statistics.

1. INTRODUCTION
Despite overwhelming and unanimous evidence that spreadsheet errors are
widespread and material, companies have continued to ignore spreadsheet error
risks. In the past, this Grand Paradox has been attributed to overconfidence.
Human beings are overconfident in most things, from driving skills to their
ability to create large error-free spreadsheets. In one of the earliest
spreadsheet experiments, Brown and Gould [1] noted that developers were
extremely confident in their spreadsheets' accuracy, although every
participant made at least one undetected error during the development process.
Later experimenters also remarked on overconfidence. Panko [10] conducted an
experiment to see if feedback would reduce overconfidence, as has been the
case in some general overconfidence studies. The study found a statistically
significant reduction in confidence and error rates, but the error rate
reduction was minimal. Goo [3] performed another experiment to see if feedback
could reduce overconfidence and errors. There was some reduction in
overconfidence but no statistically significant reduction in errors.

2. RISK BLINDNESS IN BEHAVIORAL STUDIES
This paper introduces other possible approaches for understanding the Grand
Paradox. It focuses on risk blindness, which is our unawareness of errors when
they occur.

Naatanen and Summala [9] first articulated the idea that humans are largely
blind to risks. Expanding on this idea, Howarth [5] studied drivers who
approached children wanting to cross at an intersection. Fewer than 10% of
drivers took action, and those actions would have come too late if the
children had started crossing the street. Svenson [14] studied drivers
approaching blind bends in a road. Unfamiliar drivers slowed down. Familiar
drivers did not, approaching at speeds that would have made accident avoidance
impossible.

Fuller [2] suggested that risk blindness in experienced people stems from
something like operant conditioning. If we speed in a dangerous area, we get
to our destination faster. This positive feedback reinforces risky speeding
behavior. In spreadsheet development, developers who do not do comprehensive
error checking finish faster and avoid onerous testing work. In contrast,
negative reinforcement in the form of accidents is uncertain and rare.

Even near misses may reinforce risky behavior rather than reduce it. In a
simulation study of ship handling, Habberley, Shaddick, and Taylor [4]
observed that skilled watch officers consistently came hazardously close to
other vessels. In addition, when risky behavior required error-avoiding
actions, watch officers experienced a gain in confidence in their "skills"
because they had successfully avoided accidents. Similarly, in spreadsheet
development, if we catch some errors as we work, we may believe that we are
skilled in catching errors and so have no need for formal post-development
testing.

Another possible explanation comes from modern cognitive/neuroscience.
Although we see comparatively little of what is in front of us well and pay
attention to much less, our brain's constructed reality (CR) gives us the
illusion that we see what is in front of us clearly [11]. To cope with limited
cognitive processing power, the CR construction process includes the editing
of anything irrelevant to the constructed vision. Part of this is not making
us aware of the many errors we make [11]. Error editing makes sense
for optimal performance, but it means that humans have very poor intuition
about their error rates and their ability to avoid errors [11]. For the CR
process this is an acceptable tradeoff, but it makes us confident that what we
are doing works well.

Another explanation from cognitive/neuroscience is System 1 thinking, which
has been discussed in depth by Kahneman [7]. System 1 thinking uses parallel
processing to generate conclusions. It is fast and easy, but its workings are
opaque. If we are walking down a street and a dog on a leash snaps at us, we
jump. This is fast, or System 1, thinking. It is very effective and dominates
nearly all of our actions, but it has drawbacks. First, it gives no indication
that it may be wrong. Unless we actively turn on slow System 2 thinking, which
we cannot do all the time, we will accept System 1 suggestions uncritically.
One problem with doing so is that System 1 thinking, when faced with an
impossible or at least very difficult task, may solve a simpler task and make
a decision on that basis. For instance, if you are told that a bat and ball
cost a dollar and ten cents and that the bat costs a dollar more than the
ball, a typical System 1 thought response is that the ball costs ten cents.
This is wrong, of course: if the ball cost ten cents, the bat would cost $1.10
and the pair $1.20, so the ball must cost five cents. System 1 thinking,
however, tends to solve the simpler problem, $1.10 - $1.00. If we do not force
ourselves to engage in slow and odious System 2 thinking, we are likely to
accept the System 1 alternative problem solution.

This may be why, when developers are asked whether a spreadsheet they have
just completed has errors, they quickly say no, on the basis of something
other than a reasoned assessment of risk. Reithel, Nichols, and Robinson [13]
had participants look at a small poorly formatted spreadsheet, a small nicely
formatted spreadsheet, a large poorly formatted spreadsheet, and a large
nicely formatted spreadsheet. Participants rated their confidence in the four
spreadsheets. Confidence was modest for three of the four spreadsheets. It was
much higher for the large well-formatted spreadsheet. Logically, this does not
make sense. Larger spreadsheets are more likely to have errors than smaller
spreadsheets. This sounds like System 1 alternative problem solving.

3. CONCLUSION
If we are to address the Grand Paradox successfully and convince organizations
and individuals that they need to create spreadsheets more carefully, we must
understand its causes so that we can be persuasive. Beyond that, we must
address the Spreadsheet Software Engineering Paradox: computer scientists and
information systems researchers have focused on spreadsheet creation aspects
of software engineering, largely ignoring the importance and complexity of
testing after the development of modules, functional units, and complete
spreadsheets. In software engineering, it is accepted that reducing errors
during development is good but never comes close to being sufficient on its
own. Commercial software developers spend 30% to 50% of their development
resources on testing [6, 8], and this does not count rework costs after errors
are found. Yet spreadsheet engineering discussions typically downplay or
completely ignore this five-ton elephant in the room. It may be that
spreadsheets are simply newer than software development, but spreadsheets have
been in use for a generation, and strong evidence of error risks has been
around almost that long.

We have only looked at the situation at the individual level. Testing must be
accepted by groups and even corporations. At the group level, this paper has
not explored such theories as the diffusion of innovations. If spreadsheet
testing is mandated, that will reduce risks. However, user developers must
have the freedom to explore their problem spaces by modifying their
spreadsheets as their understanding grows. Testing methods must reflect the
real process of software development.

4. REFERENCES
[1] Brown, P. S. and Gould, J. D. 1987. An experimental study of people
    creating spreadsheets. ACM Transactions on Office Information Systems, 5,
    3 (Nov. 1987), 258-272.
[2] Fuller, R. 1990. Learning to make errors: evidence from a driving
    simulation. Ergonomics, 33, 10/11 (Oct/Nov. 1990), 1241-1250.
[3] Goo, J. M. W. 2002. The effect of feedback on confidence calibration in
    spreadsheet development. Doctoral Dissertation, University of Hawaii.
[4] Habberley, J. S., Shaddick, C. A., and Taylor, D. H. 1986. A behavioural
    study of the collision avoidance task in bridge watchkeeping. College of
    Marine Studies, Southampton, England. Cited in Reason, 1990.
[5] Howarth, C. I. 1988. The relationship between objective risk, subjective
    risk, and behavior. Ergonomics, 31, 527-535. Cited in Wagenaar & Reason,
    1990.
[6] Jones, T. C. 1998. Estimating software costs. McGraw-Hill, New York, NY.
[7] Kahneman, D. 2011. Thinking, fast and slow. Farrar, Straus and Giroux,
    New York, NY.
[8] Kimberland, K. 2004. Microsoft's pilot of TSP yields dramatic results.
    news@sei, No. 2. http://www.sei.cmu.edu/news-at-sei/.
[9] Naatanen, R. and Summala, H. 1976. Road user behavior and traffic
    accidents. North-Holland, Amsterdam. Cited in Wagenaar & Reason, 1990.
[10] Panko, R. R. 2007. Two experiments in reducing overconfidence in
     spreadsheet development. Journal of Organizational and End User
     Computing, 19, 1 (January-March 2007), 1-23.
[11] Panko, R. R. 2013. The cognitive science of spreadsheet errors: Why
     thinking is bad. Proceedings of the 46th Hawaii International Conference
     on System Sciences (Maui, Hawaii, January 7-10, 2013).
[12] Reason, J. 1990. Human error. Cambridge University Press, Cambridge,
     England.
[13] Reithel, B. J., Nichols, D. L., and Robinson, R. K. 1996. An
     experimental investigation of the effects of size, format, and errors on
     spreadsheet reliability perception. Journal of Computer Information
     Systems, 54-64.
[14] Svenson, O. 1977. Risks of road transportation from a psychological
     perspective: A pilot study. Report 3-77, Project Risk Generation and
     Risk Assessment in a Social Perspective, Committee for Future-Oriented
     Research, Stockholm, Sweden. Cited in Fuller, 1990.
[15] Wagenaar, W. A. and Reason, J. T. 1990. Types and tokens in road
     accident causation. Ergonomics, 33, 10/11 (Oct/Nov. 1990), 1365-1375.