=Paper=
{{Paper
|id=Vol-1618/FuturePD_paper3
|storemode=property
|title=Accuracy and Reliability of Personal Data Collection: An
Autoethnographic Study
|pdfUrl=https://ceur-ws.org/Vol-1618/FuturePD_paper3.pdf
|volume=Vol-1618
|authors=Amon Rapp,Alessandro Marcengo,Federica Cena
|dblpUrl=https://dblp.org/rec/conf/um/RappMC16
}}
==Accuracy and Reliability of Personal Data Collection: An Autoethnographic Study==
Amon Rapp, University of Torino, Computer Science Department, C.so Svizzera, 185, Torino Italy, amon.rapp@gmail.com

Alessandro Marcengo, Telecom Italia, Via Reiss Romoli, 274, Torino Italy, alessandro.marcengo@telecomitalia.it

Federica Cena, University of Torino, Computer Science Department, C.so Svizzera, 185, Torino Italy, cena@di.unito.it

ABSTRACT

Accuracy of self-tracking devices is a key problem when dealing with personal data. Different devices may report different measures, and this may affect the users’ perceived reliability of the devices they use. We conducted an autoethnography to investigate how different devices collect data on specific parameters, in order to highlight discrepancies in the measures reported. Results highlight that designers should account for the variability of activities that users may face during their daily practices, as each of them may affect the device’s capability of collecting accurate data.

CCS Concepts

• Human-centered computing➝Human computer interaction.

Keywords

Personal informatics; Quantified Self; Personalization; Autoethnography.

1. INTRODUCTION

Personal Informatics systems are currently appealing to a large number of users, spreading beyond the traditional user group of Quantified Selfers [6]. Quantified Selfers have a deep knowledge of tracking technologies and find solutions for the barriers that they may encounter during data collection and management. However, this is not true for all those people who are interested in and curious about Personal Informatics and may try this kind of technology for the first time [8].

One of the issues that this new user base may encounter is the bewilderment induced by the different possibilities of tracking the same parameter. Thanks to the spread of multiple wearable devices for personal data collection, users can now rely on different instruments to measure the same parameter. Each of them has its own physical structure, uses specific recognition algorithms and is designed to be worn on a certain part of the body, and all these elements may affect the reported measures and thus the data collected by the device. The differences in the collected data that may result from such diversity might affect the user’s perceived accuracy of the gathered data and the consequent perceived reliability of the instrument used.

We carried out a four-week autoethnographic study to investigate how different self-tracking tools may lead to different results in terms of the values of the collected data. The results of the study reveal that: i) the data collected for a specific target parameter differed depending on the tools used, and such differences were primarily due to the position in which these instruments were worn and the activities performed during the day by the ethnographer; ii) the discrepancies among the measures reported by the different tools affected their perceived reliability, pushing the ethnographer to seek strategies to account for the data collected.

2. RELATED WORK

Several studies have examined how users perceive the reliability and accuracy of self-tracking instruments. Kay et al. [3] found that users react negatively to the inaccuracies of their devices, while Lazar et al. [4] emphasized that they do care about the accuracy of the data collected, so that failing to produce accurate information is one of the main reasons for abandoning a specific device. Consolvo et al. [1] listed seven different types of errors that a fitness tracker may produce during its daily use, such as mistaking one activity for another, completely failing to detect an activity, or detecting an activity that did not occur; this kind of error produces frustration in users, directly affecting the instrument’s credibility. Mackinlay [5] highlighted that users put their devices’ accuracy to the test, but often find it difficult to calibrate them due to the scarce visibility of their status. Finally, Yang et al. [9] outlined the various techniques that users employ to evaluate trackers’ accuracy, emphasizing the different perceptions that they may have of accuracy and reliability.

3. METHOD

We used autoethnography to identify discrepancies among diverse trackers and analyze how they may affect the user’s experience. This method considers the ethnographer’s subjective experience worth analyzing and reporting, as valuable as that of other individuals. The autoethnographer continuously observes herself to account for the reality she is interested in explaining [2].

The second author self-examined the use of four different wearable devices to compare the data collected and possibly identify critical issues due to discrepancies in their accuracy and/or reliability. The devices were chosen by taking into account the position in which they are worn, with the goal of exploring the differences in the measures they gather.

We selected: Withings Activité on the right wrist; Misfit Shine as a necklace; Sony SWR30 on the left wrist; Google Fit application running in the background on a Sony Xperia Z3.

The hypothesis was that the recorded data would not be affected by body positioning, with all devices recording approximately the same data. The self-observation session was carried out for four weeks. We provide here a brief summary of the study findings, pointing to Marcengo et al. [7] for a more detailed description.

4. RESULTS AND DISCUSSION

Sleep data analysis showed interesting problems related to the personal style of “going to sleep” in relation to the device used. For instance, the total sleep amount recorded by the Misfit Shine (necklace) is always about thirty minutes higher. This is due to the fact that the Shine treats the lying position as if the user were already sleeping, even if she is reading a book or watching her tablet in bed. So the total sleep amount will always be increased by the activity performed before falling asleep. The device with the best accuracy turns out to be the one worn on the right wrist. This makes it possible to distinguish the activities performed with the right hand while lying in bed from sleeping (for left-handed users the same principle will work for the left wrist).

Steps also showed interesting evidence of the relations between lifestyle and devices. The total step count is strongly biased by the interaction between the location on the body (if wearable) and the activities performed by the user. Indeed, considering the data collected by the Withings Activité (on the right wrist), it is clear that if the user did a lot of public speaking on a specific day (meetings, showing slides, etc.), step counts tend towards high figures due to the gestures involved. Opposite results become evident in different life circumstances. In particular, data became surprisingly low under two conditions. The first is when the user walks pushing a stroller. In this case the device does not log the alternating swing of the hands and does not see the activity as walking. The second occurs if the user carries a moderately heavy bag (e.g. a small suitcase), depending on which hand holds the bag.

If the steps are collected by a phone app, even more distortions toward low figures become evident, because of all the occasions when the phone is not on the body (e.g. weekend, sports, home, etc.). This, to a lesser extent, is also true for wearable devices. On the weekend all data appear distorted by incomplete or peculiar usage of the device due to different life activities (e.g. working in the garden, playing with kids, etc.).

From this evidence some needs for personalization in the design of logging devices and apps emerge. Manufacturers need to consider different designs for the different lifestyles of different types of users with different life patterns (e.g. watching videos in bed, walking with a stroller, carrying a bag, gesturing a lot, etc.). These patterns could be compressed into a few personas that can lead to different versions of the same device or slightly different tracking algorithms on the same device. This personalization may be transferred directly into the user experience by collecting specific aspects and habits that affect the accuracy of the logging system. In certain cases it should also be possible to advise the user about the best body location to wear the device in relation to her personal lifestyle.

5. CONCLUSION

Our study emphasizes the need to consider the idiosyncratic activities that users carry out during their daily practices in order to produce more accurate and thus reliable trackers. Activity recognition algorithms should be tailored to the specific habits of the single individual, as these may be the main culprit for the inaccurate reporting of the target parameters. Personalization, thus, should be not only a matter of the services provided by the new personal informatics technologies, but also a key requirement for the design and implementation of the modalities for collecting the data.

6. REFERENCES

1. Consolvo, S., McDonald, D. W., Toscos, T., Chen, M. Y., Froehlich, J., Harrison, B., Klasnja, P., LaMarca, A., LeGrand, L., Libby, R., Smith, I., Landay, J. A.: Activity sensing in the wild: a field trial of ubifit garden. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '08), 1797–1806 (2008)

2. Ellis, C., Bochner, A.: Autoethnography, personal narrative, and personal reflexivity. In: Handbook of qualitative research (2nd ed.), Norman K. Denzin and Yvonna S. Lincoln (eds.). Sage, Thousand Oaks, CA, 733–768 (2000)

3. Kay, M., Morris, D., schraefel, mc, Kientz, J. A.: There’s no such thing as gaining a pound: reconsidering the bathroom scale user interface. In: ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp '13), 401–410 (2013)

4. Lazar, A., Koehler, C., Tanenbaum, J., Nguyen, D. H.: Why we use and abandon smart devices. In: the 2015 ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp '15). ACM, New York, NY, USA, 635–646 (2015)

5. Mackinlay, M.: Phases of Accuracy Diagnosis: (In)visibility of System Status in the Fitbit. Intersect: The Stanford Journal of Science, Technology and Society 6, 2 (2013)

6. Marcengo, A., Rapp, A.: Visualization of Human Behavior Data: The Quantified Self. In: Huang, L. H., Huang, W. (eds.) Innovative approaches of data visualization and visual analytics. IGI Global, Hershey, PA, 236–265 (2013)

7. Marcengo, A., Rapp, A., Cena, F.: The Falsified Self: Complexities in Personal Data Collection. To appear in: Proceedings of HCI International ’16, Springer (2016)

8. Rapp, A., Cena, F.: Self-monitoring and Technology: Challenges and Open Issues in Personal Informatics. In: HCI International. Universal Access in Human-Computer Interaction. Design for All and Accessibility Practice. Lecture Notes in Computer Science, Volume 8516, 613–622 (2014)

9. Yang, R., Shin, E., Newman, M. N., Ackerman, M. S.: When fitness trackers don't 'fit': end-user difficulties in the assessment of personal tracking device accuracy. In: the 2015 ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp '15). ACM, New York, NY, USA, 623–634 (2015)