=Paper=
{{Paper
|id=None
|storemode=property
|title=RepliPRI: Challenges in Replicating Studies of Online Privacy
|pdfUrl=https://ceur-ws.org/Vol-976/ppaper3.pdf
|volume=Vol-976
|dblpUrl=https://dblp.org/rec/conf/chi/Patil13
}}
==RepliPRI: Challenges in Replicating Studies of Online Privacy==
Sameer Patil
Helsinki Institute for Information Technology HIIT
Aalto University
Aalto 00076, Finland
sameer.patil@hiit.fi

Abstract
Replication of prior results has recently attracted attention and interest
from the CHI community. This paper focuses on the challenges and issues
faced in carrying out meaningful and valid replications of HCI studies. I
attribute these challenges to two main underlying factors: (i) a domain of
inquiry that simultaneously covers people, social systems, and technology;
and (ii) deficiencies in result reporting and data archiving. Using examples
from investigations of online privacy, I outline how these challenges
manifest themselves in HCI studies. Longitudinal approaches, international
collaboration, and sharing of study instruments could help address these
challenges.

Author Keywords
Replication, Privacy, Cultural differences

ACM Classification Keywords
H.1.2 [User/Machine Systems]: Human factors.

General Terms
Human Factors, Security

Presented at RepliCHI2013. Copyright © 2013 for the individual papers by the
papers' authors. Copying permitted only for private and academic purposes.
This volume is published and copyrighted by its editors.

Introduction
Replication of prior results has recently attracted attention and interest
from the CHI community. The
resulting discussions tackle replication from two important perspectives:
higher-level epistemological debate on the place and merits of replication
in the scientific (publishing) enterprise and the lower-level practical
considerations for replicating previous studies from the literature. Growing
interest in RepliCHI suggests increasing recognition for the value of
replicating prior studies. I hope and anticipate that this trend will foster
continued community discussion on how to justify, appreciate, and reward
replication as a valuable scientific pursuit. Therefore, in this paper I
focus on the latter aspect, viz., challenges and issues faced in carrying
out meaningful and valid replications of HCI studies.

I attribute these challenges to two main factors:

1. Domain of inquiry: A large proportion of HCI studies tackle research
problems where results typically exhibit simultaneous and interacting
influence of individuals, social systems, and technology. Each of these
three factors changes at drastically different rates and magnitudes. For
instance, technology used in a study may become obsolete within months or a
couple of years, while physical and cognitive capabilities of adults change
at much slower rates (and the magnitude of the change is often comparatively
small and predictable). These differences in the evolution trajectories of
humans, cultures, and technology make it difficult to replicate studies at a
later time and to determine and attribute causes behind differences in
results, if any.

2. Insufficient and/or incomplete reporting: Typically the only resource
available for replicating a study is the publication describing the results
of the study. Unfortunately, due to page limits and other editorial reasons,
publications often do not include all information (about methods and/or
data) necessary for carrying out the study the way it was originally
conducted. For instance, instead of including the entire questionnaire
instrument, the publication may include only those questionnaire items that
led to statistically significant results. Similarly, results may be
presented in the aggregate or as percentages, making it difficult to
replicate analyses that require details of individual data points.

In the following section, I outline how I have found these challenges to
manifest themselves in investigation of user preferences and practices
regarding online privacy. I conclude with some thoughts on addressing the
challenges.

Replicating Studies of Online Privacy
When thinking about and carrying out replications of research related to
privacy, I have encountered several practical challenges:

Privacy is a nuanced and complex issue affected by individual
characteristics, context of operation, and the technology under
consideration. For instance, individuals have been classified into different
groups based on their inherent level of privacy concern [7], and privacy
concerns have been shown to exhibit cultural variation [3]. People’s mental
models and understanding of the underlying technology also affect their
preferences and practices regarding privacy [4]. This implies that even when
considering the same technology, replication conducted at a later time ought
to take into account the impact of learning effects on privacy issues.
Replications may also encounter the selection-maturation threat to validity
owing to major external events that occur after the original study, such as
news coverage of privacy breaches. Such events affect the population’s
overall understanding and awareness of privacy issues, thereby potentially
affecting the results of replications of studies that were originally
conducted prior to these events.

The majority of attention in replication has been devoted to replication at
a different (later) time. In the case of privacy, however, it is equally
important to consider replication across different cultures. For example, we
administered a questionnaire simultaneously in the US and India, enabling us
to draw interesting and surprising observations from comparison across
cultures [5]. Our results confirmed earlier findings regarding low levels of
consumer privacy concerns in India. Surprisingly, by examining interpersonal
privacy separately from consumer privacy, we found that interpersonal
privacy concerns in India were not only higher than consumer privacy
concerns but also higher than interpersonal privacy concerns in the US. Our
study considered culture at the broad level of national cultures. However,
it should be noted that for replication purposes “culture” could be
construed to connote any large groups with shared characteristics and/or
values, such as students, engineers, mothers, liberals, etc. Moreover, if
replication across cultures is conducted at a time later than the original
study, then learning effects and maturation threats need to be taken into
account (as discussed above).

In theory, replication with a different cultural sample is a simple case of
re-running the study with subjects drawn from a different culture, with
translation of instruments and study materials, if necessary. In practice,
however, cultural differences pose several hurdles. For instance, the same
word or term may be interpreted differently, leading to the same question
being answered differently. For example, we found that the term “cubicle”
was understood differently in the US and India owing to differences in
office layouts and density. This difference was one of the factors crucial
for understanding the differences in results between the US and India [5].
In other studies, I discovered that the demographic question about
ethnicity, which is commonly asked in the US (and even mandated for
NSF-sponsored studies), was considered potentially offensive and confusing
in Europe. Differences in lifestyle and beliefs can also affect whether
questions and tasks from one study can yield valid results, or even make
sense, when replicated in a different cultural context. For instance, some
privacy studies have asked Western respondents about premarital sex, sexual
practices, extramarital affairs, and number of sexual partners (e.g., [1]).
Such questions are unlikely to produce meaningful results in cultures where
such practices are uncommon and/or forbidden. Resolving this issue can be
complicated when such culturally specific questions comprise parts of
standard scales; using the scale without modifications will not yield
meaningful results, and dropping and/or modifying items in the scale risks
affecting the validity of comparison across studies. Finally, it is also
necessary to consider whether results across cultures are affected by
differences in sampling techniques and sample characteristics. For instance,
although our comparison of the US and India was limited to software
professionals, the mean and median ages of the Indian participants were
lower than those of the US participants.

We found that understanding privacy-related cultural nuance often requires
insights derived from qualitative methods (such as interviews, focus groups,
field visits, etc.) and/or insider knowledge of the culture and its
practices [6]. Currently the CHI community is focused mostly on replication
of studies that employ quantitative methods, such as experiments,
questionnaires, or usability evaluations. Complementing quantitative
replications with qualitative insights has the potential to broaden the
scope of these replication endeavors. Toward this end, it may also be
fruitful to tackle whether and how qualitative studies could be effectively
replicated.

Discussion and Conclusion
The previous section utilized examples from investigations of online privacy
attitudes and behaviors to illustrate some of the challenges and issues in
replicating HCI studies. Online privacy cuts across the individual, the
social, and the technical, in much the same way as many studies in HCI do.
Therefore, I believe that many, if not all, of these concerns are also
likely to arise in HCI investigations of other topics.

The RepliCHI workshop is an important milestone toward developing a
comprehensive compilation and understanding of various challenges involved
in the replication of HCI studies. Moving forward, it is necessary to apply
this knowledge and insight for constructing best practices to follow and
pitfalls to avoid. Toward this end, I offer suggestions that address the two
important considerations outlined in the Introduction, viz., (i) domain of
inquiry that simultaneously covers individuals, social systems, and
technology; and (ii) result reporting and data archiving.

The second of these, in particular, could be easily addressed by requiring
inclusion of full instruments and study protocols as appendices¹. Similarly,
authors of accepted papers could be asked, or even required, to upload the
raw data after taking steps necessary to protect participant anonymity. In
this regard, ACM, IEEE, NSF, and other prominent HCI funding and sponsoring
organizations can follow the lead of the NIH, which mandates raw data
availability. In a similar vein, an open-source-inspired approach could
encourage authors to release the source code of systems and scripts used for
conducting studies and carrying out analyses. An open question regarding
data and code sharing is how to deal with commercialization and intellectual
property issues (especially when corporate entities are involved in
conducting the study)².

¹ This also provides the additional benefit of addressing one of the most
common comments raised in peer reviews: lack of methodological detail.
² Data used by studies conducted by corporations was a hotly debated topic
at the WWW 2012 conference [2].

One approach for addressing the issue of intersection of people and
technology is to encourage longitudinal investigations carried out at
regular intervals over several years. Depending on the details and logistics
of the study, a longitudinal investigation could utilize the same
participants or different participants with the same sampling method and
sample characteristics. The former approach can help examine the impact of
changes in individual characteristics, evolution in lifestyles, and effects
of learning. The latter approach can help illuminate the impact of changes in
technology. For replications across cultures, however, it is perhaps best to
target simultaneous study deployment. Fostering international collaborations
and/or leveraging international students to gain cultural knowledge and
access could help in this regard.

Requiring a replication component in Bachelor’s and Master’s theses could
provide a starting point for repeating studies from the literature,
simultaneously serving a valuable pedagogical purpose by training the next
generation. Further, conferences and journals could explicitly solicit
replications of specific studies. Special conference sessions or journal
sections could be devoted solely to replication studies. Discussions and
follow-up activities from the RepliCHI workshop could lead the way toward
legitimizing and promoting replication as a valuable scientific pursuit
within HCI.

Acknowledgments
I thank Mihir Mahajan and John McCurley for editorial comments.

References
[1] Grossklags, J., and Acquisti, A. When 25 cents is too much: An experiment on willingness-to-sell and willingness-to-protect personal information. In Workshop on the Economics of Information Security (WEIS) (2007).
[2] Markoff, J. Troves of personal data, forbidden to researchers. http://www.nytimes.com/2012/05/22/science/big-data-troves-stay-forbidden-to-social-scientists.html, May 2012.
[3] Milberg, S., Burke, S., Smith, H., and Kallman, E. Values, personal information privacy, and regulatory approaches. Communications of the ACM 38, 12 (1995), 65–74.
[4] Patil, S., and Kobsa, A. Uncovering privacy attitudes and practices in Instant Messaging. In Proceedings of the 2005 International ACM SIGGROUP Conference on Supporting Group Work, GROUP '05, ACM (New York, NY, USA, 2005), 109–112.
[5] Patil, S., Kobsa, A., John, A., and Seligmann, D. Comparing privacy attitudes of knowledge workers in the U.S. and India. In Proceedings of the 3rd International Conference on Intercultural Collaboration, ICIC '10, ACM (New York, NY, USA, 2010), 141–150.
[6] Patil, S., Kobsa, A., John, A., and Seligmann, D. Methodological reflections on a field study of a globally distributed software project. Information and Software Technology 53, 9 (2011), 969–980.
[7] Taylor, H. Most people are “privacy pragmatists” who, while concerned about privacy, will sometimes trade it off for other benefits. The Harris Poll 17 (2003), 19.