Is replication important for HCI?

Christian Greiffenhagen
Loughborough University
c.greiffenhagen@lboro.ac.uk

Stuart Reeves
University of Nottingham
stuart@tropic.org.uk

Abstract
Replication is emerging as a key concern within subsections of the HCI community. In this paper, we explore the relevance of science and technology studies (STS), which has addressed replication in various ways. Informed by this literature, we examine HCI’s current relationship to replication and provide a set of recommendations and points of clarification that a replication agenda in HCI should concern itself with.

Author Keywords
Replication; psychology; science and technology studies; philosophy of science.

ACM Classification Keywords
H.5.m. Information interfaces and presentation (e.g., HCI): Miscellaneous.

Presented at RepliCHI2013. Copyright © 2013 for the individual papers by the papers’ authors. Copying permitted only for private and academic purposes. This volume is published and copyrighted by its editors.

Introduction
Replication is emerging as a concern within subsections of the HCI community. A key motivation for this is a feeling that HCI emphasises novelty over consolidation of research; consolidation that can be achieved via replication. In response, we advocate the relevance to HCI of understandings of ‘replication’ emerging from the philosophy and sociology of science and technology. This paper highlights a collection of rejoinders to the ways in which this programme for replication is currently conceptualised within HCI. In doing so we intend to help the development of an endogenous understanding of replication as a practice that can be a) motivated, b) mature and c) fit for the purposes of HCI.

Replication: Lessons from STS
We believe that debate on replication in HCI can be enriched by STS and the philosophy and sociology of science. In this section we review some of the findings of this literature and their pertinence to HCI.

One of the motivations for replication within HCI is the wish to make HCI more scientific by modelling HCI on other sciences (e.g., “psychology, physics and medicine” [11]). While there is nothing problematic in asking for a field to involve more replication, to frame this in terms of making it more ‘scientific’ is possibly based on a mythical view of ‘good science’ of which “[r]eplication of research is a cornerstone” [11]. This view suggests that ‘science’ is a homogeneous practice, possibly even based around a particular method, ‘the’ scientific method. It also tends to approach replication from the perspective of the philosophy of science rather than from the practice of different sciences.

In contrast, philosophical and sociological studies have shown that ‘science’ refers to a fragile structure of multiple disciplines and multiple methods linked by ‘family resemblances’ only [9, 3]. Not all empirical sciences work with experiments, and the role of experiments may differ between fields.

Complicating this picture is the separation between these varied and autochthonous scientific practices and their rendering into literature. Scientific literature is written in such a way as to promise replicability, a convention emerging from Boyle’s attempts to create scientific records that were publicly accountable and would let ‘anyone’ replicate experimental practices [10]. However, the nature of instructions is such that they are always incomplete [4]; thus scientific instructions must be ‘filled in’ by competent members of the target scientific community in order to enact them as replications. This is one of the reasons why Medawar characterised the scientific paper, somewhat misleadingly, as a ‘fraud’ [8].

STS offers an alternative to the surface view of replication in the natural sciences, in which scientific articles (in particular their ‘method sections’) provide an adequate instruction manual for replication work. Specifically, it problematises the notion of a ‘decisive experiment’ or, by extension, a ‘decisive replication’. At the heart of this problem is what Collins calls the “experimenter’s regress” [1], that is, a circular relation between experimental findings and the instruments used to produce them. Reliable experimental findings rely upon reliable instruments, and vice versa. As a result, a key difficulty of replication and the experimenter’s regress is that, particularly for contested science, there is not necessarily any standard for what is to be considered a valid replication. This raises a problem of principle, since it is not clear whether a ‘failed’ replication is due to a problem with the original experiment or with the subsequent replication (“it is often hard to tell whether an inability to replicate a result is due to a group’s failings or a flaw in the original paper” [5, p. 345]).

Further to this, when we consider the track record of replication in the natural sciences, STS literature argues that the (natural) sciences employ replication for specific, highly motivated and reasoned ends. Thus we find a marked absence of large amounts of replication in the sciences unless we focus on particular issues [1, pp. 210-211]. For instance, Collins’ tracing of the construction of gravitational wave detectors during the 1970s reveals the relevance of replication as an activity for working through what was a contested, controversial domain [2]. In short, ‘doing replication’ is not always seen as a fundamental prerequisite for valid scientific practice, since a vast number of results go unreplicated: instead it emerges as the result of pragmatic action for specific contested cases.

In summary, then, our cursory examination of STS and its related literature highlights that: a) there is no singular form of science or scientific method upon which to model; b) there is no ‘algorithmic’ method for replicating directly from scientific literature (indeed, this is not its purpose); c) ‘absolute’ security of results is problematic in light of the experimenter’s regress; and d) sciences often do not involve replication as a ‘matter of course’, it being difficult and of little value unless motivated (typically via contestation of results).

Replication within HCI
The issue of replication has become a centre of discussion within HCI. In light of STS’s view on replication, we seek to ask what is at stake in this discussion. Why replicate? Or: What are the (different) aims and motivations for replication?

Within HCI, it has been acknowledged that there is not just one kind of replication. For example, Wilson et al. distinguish between four forms: “direct replication” (“driven by the aspirations of strong science”), “conceptual replication” (replication via “alternative methods”), “replicate & extend” (building on prior studies incrementally) and “applied case studies” (replication through application of prior work) [11].

Nuancing this view, we want to start by introducing two different kinds of distinction to help us think about replication.

The first distinction is between what we characterise as textbook replication and frontier replication. By ‘textbook replication’ we refer to replications of well-known studies that are conducted from HCI textbooks, typically as part of undergraduate or graduate education. For instance, these could be replications of well-known usability studies. We distinguish this from ‘frontier replication’, by which we mean replications of ‘ongoing’ or ‘recent’ studies. We see these forms as conceptually and practically incommensurate, as opposed to integral facets (e.g., see the position in [11] on “Benefits of Replication”). Thus, while the primary aim and motivation of textbook replications is learning, the point of frontier replication is often a form of ‘checking’ (which may even be done during the review process). As such we argue that the activities at this ‘frontier’ become the main issue for replication, rather than what is happening ‘in the textbook’.

A second distinction has to do with what may be replic-able and what is actually replic-ated, in which the aims for each are quite different. ‘Being replicated’ concerns the ‘factual’ question of whether a particular study has, actually, been replicated by other researchers or not. We say ‘factual’ since subsequent studies may or may not be seen as valid replications, as in Collins’ study of gravitational wave detectors [2]. We also note again that a lack of actual replications may be related to matters such as experiments being too costly, too time consuming or failing to provide the experimenter with any obvious credit.

In contrast, ‘being replicable’ is motivated by the ‘in principle’ possibility of some other researcher being able to replicate an empirical study. This is often cited as one of the differences between ‘quantitative’ and ‘qualitative’ methods (very problematic descriptions themselves), where the former supposedly produce results that could be replicated (again, ‘in principle’), while the latter supposedly do not. For instance, ethnographic research is often said to be too reliant on the ‘subjective’ insights of the ethnographer, resulting in non-generic and non-replicable findings.

What’s at stake in this distinction? We would argue that the issue of ‘being replicable’ concerns a foundational question, in particular, whether HCI is a science and its preferences for particular methods over others. These questions are not new: psychology, which has strongly informed HCI’s development, has repeatedly foregrounded replication as an explicit agenda, such as in response to perceived experimental biases (e.g., being too ‘WEIRD’ [6]), as well as intentional and unintentional misconduct [12]. In this sense, ‘being replicated’ is probably more common in psychology than in many other sciences because of this explicit concern (now displayed in HCI) for the lack of actual replicated studies (or those ‘seen as’ validly replicated).

Psychology’s own debates around its status as a science are also consonant with these foundational concerns of ‘being replicable’, and in the replication agenda we see HCI grasping towards key epistemological themes which arise in the natural sciences: alongside ‘observation’, ‘measurement’, ‘description’ and ‘reasoning’ is, of course, ‘replication’. If we take HCI as a scientific endeavour (e.g., [11]) then it follows that its concern for replication would be informed by this particular picture of ‘normal science’, or ‘doing what scientists do’. However, this assumes the coherence of ‘science’ as a monolithic practice, as well as mythologising that practice.

In contrast, ‘being replicated’ is a more pragmatic question, which concerns what we can learn from replications and, for example, whether it would be worthwhile to publish more papers based on replication.

In order to focus the discussion of replication in HCI, it would be very helpful if one could gather more examples from different disciplines, from biology to physics, to see whether and how replications are valued in these. Thus we hazard a conjecture: that replication enjoys a special status within psychology (and the debate about replication in HCI is thus a reflection of the influence of psychology, rather than, say, biology, in HCI). But why might that be?

One issue is the scale of the question to be answered through experiment. Some sciences tackle very detailed and small questions through extremely detailed experiments. In other words, there exists a very tight relationship between the data gathered through the experiments and the derived conclusions. Other sciences (e.g., social science) tackle bigger questions and consequently involve a looser relation between data and conclusion.

We would argue that there is a ‘scale’ tension in psychology (and thus HCI) between tackling ‘big’ and ‘minute’ questions, questions that can or cannot be settled through experiments. One possible reason for more replication in psychology is that studies can be questioned more (i.e., findings are more open to interpretation).

Discussion
We have raised some broad issues in the relationship between replication and HCI, and informed this debate through recourse to existing work in STS that has explored replication in the natural sciences.

Firstly, we argue for the importance of increased consultation of literatures normally foreign to HCI, such as that of STS. This is particularly the case for situations where knowledge within the field is out of step with more recent advances in understandings of scientific knowledge. For instance, our discussions on replication (and science) within HCI are largely Popperian or pre-Popperian in form, for example in appeals to ideals such as falsificationism. While we would not argue against such ideals, we contend that the above understanding benefits from expansion: as well as citing Collins, we might also refer to developments by Kuhn, Feyerabend or Lynch that, for instance, encapsulate empirical investigations into practical mundane scientific action [7].

A fundamental question for the desire for replication in HCI is that of the motivation to perform replication in the first place. We need to ask ourselves why we might bother with replication at all and whether there is any value gained from pursuing a replication agenda as a distinctive activity within HCI (which is the position of the workshop call [11]). As we have seen from STS literature, if we feel the need to derive HCI’s programme from the methods and epistemological topics of the natural sciences (e.g., via psychology), then we must do so knowingly in light of findings from STS. Thus we argue for different understandings of replication: a) as an unstable and negotiated practice; b) as a highly motivated activity rather than as an end in itself; and c) as playing an important role in the resolution of scientific controversies. Moving forwards, we would draw attention to the judicious, motivated application of replication, and to the need to specify ‘just why’ and ‘just how’ it is to be pursued. So, we must be clear about the purposes and motivations of any given replication beyond abstractly “validating and understanding contributions” [11].

Finally, we have argued that a mythological view of science tends to be implicit in HCI regarding its status as scientific. This leads us to question the value in positioning HCI as a scientific endeavour. Thus we recommend that it would be helpful to separate the ‘foundational’ question (whether HCI is a science) from the ‘pragmatic’ question (about the specific benefits of replication for HCI).

Acknowledgements
This work is supported by Horizon Digital Economy Research, RCUK grant EP/G065802/1.

References
[1] Collins, H. M. Changing Order: Replication and Induction in Scientific Practice. Beverly Hills & London: Sage, 1985.
[2] Collins, H. M. The seven sexes: A study in the sociology of a phenomenon, or the replication of experiments in physics. Sociology, 9(2):205-224, 1975.
[3] Dupre, J. The disunity of science. Mind 112, 321-346, 1983.
[4] Garfinkel, H. Studies in Ethnomethodology. Prentice-Hall, 1967.
[5] Giles, J. The trouble with replication. Nature, 442:344-347, July 2006.
[6] Henrich, J., Heine, S. J. and Norenzayan, A. The weirdest people in the world? Behavioral and Brain Sciences, 33, pp. 61-83, 2010.
[7] Lynch, M. Scientific Practice and Ordinary Action. Cambridge University Press, 1993.
[8] Medawar, P. B. Is the scientific paper a fraud? The Listener, 70 (12 September): 377-378, 1963.
[9] Putnam, H. The idea of science. Midwest Studies In Philosophy, 15(1):57-64, 1990.
[10] Shapin, S. and Schaffer, S. Leviathan and the Air-Pump: Hobbes, Boyle, and the Experimental Life. Princeton University Press, 1989.
[11] Wilson, M. L., Resnick, P., Coyle, D. and Chi, E. H. RepliCHI: The Workshop. In CHI ’13 Extended Abstracts (CHI EA ’13). ACM, New York, NY, USA, 2013.
[12] Yong, E. Replication studies: Bad copy. Nature, 485(7398):298-300, 2012.