Stresscapes: Validating Linkages between Place and Stress Expression on Social Media

Martin Sykora  M.D.SYKORA@LBORO.AC.UK
Centre for Information Management, SBE, Loughborough University UK
Colin Robertson
Geography and Environmental Studies, Wilfrid Laurier University CANADA
Ketan Shankardass
Psychology Department, Wilfrid Laurier University CANADA
Rob Feick
School of Planning, University of Waterloo CANADA
Krystelle Shaughnessy
Psychology Department, University of Ottawa CANADA
Becca Coates
CIM, SBE, Loughborough University UK
Haydn Lawrence
Geography and Environmental Studies, Wilfrid Laurier University CANADA
Thomas W. Jackson
CIM, SBE, Loughborough University UK

Abstract

Understanding how individuals and groups perceive their surroundings and how different physical and social environments may influence their state-of-mind has intrigued researchers for some time. Much of this research has focused on investigating why certain natural and human-built places can engender specific emotive responses (e.g. fear, disgust, joy) and, by extension, how these responses can be considered in place-making activities such as urban planning and design. Developing a better understanding of the linkages between place and emotional state is challenging in part because both cognitive processes and the concept of place are complex, dynamic and multi-faceted and are mediated by a confluence of contextual, individual and social processes. There is evidence to suggest that social media data produced by individuals in situ and in near real-time may provide novel insights into the nature and dynamics of individuals’ responses to their surroundings. The explosion of user-generated digital data and the sensorization of environments, especially in urban settings, provide opportunities to build knowledge of place and state-of-mind linkages that will inform the design and promotion of vibrant place-making by individuals and communities.

In this paper we present a novel study, to be undertaken this summer within the Greater Toronto area in Canada, with 140 recruited participants who are frequent, geo-tagging Twitter users. The goal of the study will be to assess emotional, acute and chronic stress experienced in urban built environments and as expressed during daily activities. An existing automated semantic natural language processing tool will be validated through this study, and it is hoped that the methodology developed can be extrapolated to other urban environments as well, with a second validation study already planned to take place next year in London, United Kingdom.

Proceedings of the 2nd International Workshop on Mining Urban Data, Lille, France, 2015. Copyright © 2015 for this paper by its authors. Copying permitted for private and academic purposes.

1. Introduction

In recent years, automated processing of rich, geo-tagged social media text streams, such as Tweets and Facebook status updates, has received considerable attention in the literature. This is largely motivated by the insights and value that such datasets have been shown to provide (Chew & Eysenbach, 2010; O’Connor et al., 2010; Tumasjan et al., 2010; Abel et al., 2012). Social media streams, in general, allow for observing large numbers of spontaneous, real-time interactions and varied expressions of opinion, which are often fleeting and private (Miller, 2011). Miller (2011) further points out that some social scientists now see an unprecedented opportunity to study human communication across various applications and contexts, something that has been difficult until recently. O’Connor et al. (2010) demonstrated how large-scale trends can be captured from Twitter messages, based on simple sentiment word frequency measures. The researchers evaluated and correlated their Twitter samples against several consumer confidence and political opinion surveys in order to validate their approach, and pointed out the potential of social media as a rudimentary yet powerful polling and survey methodology. In her position paper, De Choudhury (2013) suggests that mental health studies would benefit from employing social media, as it provides an unbiased collection of an individual’s language and behaviour, and Coppersmith et al. (2014) further highlight how social media enables large-scale analyses, which have not previously been possible with traditional methods. Eichstaedt et al. (2015) make a strong argument in favour of employing social media to study heart disease mortality based on psychological characteristics gleaned from Twitter language use. Negative emotional language and expressions of stress play an especially important role. They argue that traditional approaches that use household visits and phone surveys are costly and have limited spatial and temporal precision.

Motivated by this initial evidence, we will be investigating emotional acute and chronic stress as expressed in geo-tagged, in-situ social media language. Our primary focus will be the connection between expressions of stress and the geography of urban built environments, applying geo-spatial analysis methods to define dynamic stress landscapes, or stresscapes, that will help us to understand how stress varies from place to place and from time to time within urban centres. As Schwartz and Halegoua (2014) rightly point out, studies concerning the combination of social media, identity performance, and place are still rare. Hence, we particularly seek to contribute to recent research related to the linkages between place and expressions of personal or social stress. Research on this topic has traditionally focused on the role of either individual or contextual factors; however, it is necessary to investigate the interplay between individuals and the nature of their immediate surroundings. Assessment of stress is normally overly general, which makes it hard to compare the experience of stress across individuals, whereas focusing on the emotional dimensions of the stress response offers a more specific measure for analysis. We will recontextualize social media expressions through spatial modelling and integration with contextual geospatial datasets describing participants’ immediate surroundings. This will lead to new insights into how emotional stress is related to particular conditions (e.g. traffic congestion), place types and designs (e.g. public versus private places, high versus low density) and times (e.g. commuting rush hours) within urban communities.

As far as the authors are aware, this is the first study of its kind. It will look at various forms of stress, their linkages to urban environments, and the validation of a computational social media analysis tool against ‘real’ experiences of acute and chronic stress, using well-established and validated measures from the literature.

The remainder of the paper is organised as follows. Section 2 introduces some background and prior work on stress in urban environments and the computational tool for emotion-based stress detection. Method details and the overall validation study design are presented in Section 3. Section 4 concludes the paper and makes suggestions for future work.

2. Background

Intensive acute and chronic psychological stress appears to play a causal role in the onset of multiple chronic disease outcomes, such as asthma and obesity (Shankardass et al., 2009; 2014), engendering significant costs related to economic productivity, and health and social service spending (Daar et al., 2007). A body of evidence suggests that the built environment shapes how we experience and respond to stress (Shankardass, 2012). However, there is a critical gap in our understanding of how our environments shape our experience of stressors (e.g., social disorder) and influence how we cope with our perceived stress because of the availability (or lack thereof) of resources, e.g., safe park space (Shankardass, 2012). There is a lack of place-based measures of stress to facilitate research on these interrelationships.

This study uses a conceptual framework recently proposed by Shankardass (2012), which builds on Pearlin’s stress process heuristic (Pearlin, 1999), where sources of stress that are perceived as stressful can manifest emotional, behavioural and physiological responses (e.g., negative affect, smoking and endocrine activation, respectively). Two critical mediators of these responses are resource appraisal and coping behaviours, while the neighbourhood built environment can present stressors and offer resources that condition how we cope in space and time. This conceptual framework guides our hypotheses about which confounders and moderators ought to be considered in building a prediction model of emotional stress on stress-related endocrine activation. These include personality differences such as trait anxiety and pessimism (Chang, 2002), which may confound the relationship; low self-esteem (Dumont & Provost, 1999) and low social support (ibid.), each of which may increase the effect of perceived stress on chronic endocrine activation; and sex and gender (Baum & Grunberg, 1991), whose moderating effects on the relationship are hard to predict. Chronic endocrine activation may be more likely where individuals adopt coping styles that do not effectively deal with stressors (e.g., avoidance coping, rather than approach coping or problem-oriented coping).

Taking all this into account, our overall goal is to further develop and validate an ontology of emotional stress (based on the presence of negative affect and the lack of positive affect) that will facilitate measurement through semantic analysis of geo-tagged Twitter posts (Sykora et al., 2013) and assess the predictive validity of perceived psychological stress.

2.1. Detection of Stress from Tweets

There are numerous systems for effective, efficient and accurate sentiment and emotion detection from language. A broader overview of the various approaches is available in Thelwall et al. (2012). One of the popular techniques is based on the use of word and phrase dictionaries with known associated sentiment polarities or emotion categories; however, these dictionaries, although sometimes combined and semi-automatically generated for better cross-domain performance, are relatively flat and lack semantic expressivity. Even recently, Eichstaedt et al. (2015) still used a combination of simple dictionaries to perform their automated tweet analysis.

In this work we employ an ontology-based approach, which is essentially a map of words and phrases with a much richer semantic representation than simple dictionaries. The system we will use is called EMOTIVE and is based on (1) a custom Natural Language Processing (NLP) pipeline, which parses tweets and classifies parts-of-speech tags, and (2) an ontology, in which emotions, related phrases and terms (including a wide set of intensifiers, conjunctions, negators and interjections), and linguistic analysis rules are represented and matched against (Sykora et al., 2013). EMOTIVE automatically detects expressions of eight well-recognised and fine-grained emotions in sparse texts (e.g. Tweets). The system discovers the following range of emotions: anger, disgust, fear, happiness, sadness and surprise (also known as Ekman’s basic emotions; Ekman and Davidson, 1994), together with confusion and shame. At the same time it differentiates emotions by strength (also known as activation level, e.g. fear: ‘uneasy’, ‘fearful’, ‘petrified’). An evaluation of the system against other benchmarks, performed in Sykora et al. (2013), showed excellent results, with a very high F-measure of 0.962. Given the rich representation of emotions and the ontology this is based on, we will link and extend this system to represent stress in its various shapes and forms, with the intention of validating the system against real experiences of stress (see the next section for details on this validation study).

3. Methodology and Study Design

Emotional stress has been conceptualized in different ways, including in terms of negative affect and as a state of distress. Two measures will be used as validation criteria in this study:

• The single-item distress thermometer, which is a simple Likert scale shaped like a vertical thermometer that asks the subject to select a number corresponding to their level of distress (Zwahlen et al., 2008).

• The 10-item negative affect scale from the expanded version of the Positive and Negative Affect Schedule (PANAS-X) (Watson & Clark, 1994).

These measures will be framed using moment instructions, i.e., we will ask whether participants are experiencing distress/negative affect “right now”, that is, at the present moment. The aim will be to collect at least 10 measures of each during the two-week follow-up (see the section on overall design). An algorithm will be used to scan in real time for a series of discrete stress-related terms (still being compiled) across all participants and randomly trigger the study follow-up survey via SMS / text message. This survey will be triggered at roughly evenly-spaced time points across the two-week follow-up period, based on the rate of tweets by the study participants. In order to understand the context of Tweets, a series of questions will also be asked to assess what activity mode the participant was in (e.g., work, play, commute, domestic, study) at the time of the Tweet, and whether and how the surrounding environment influenced the Tweet in any way. Participants will also be requested to automatically geo-tag their Tweets by default for the duration of the study follow-up.

3.1. Recruitment

The study population will include long-term (>3 months at study entry), active (>4 posts per week) Twitter users who are free from anxiety disorders. The planned sample size is 140 participants, which was calculated based on Bland and Altman (1986) and scaled up by 40% for anticipated drop-outs. A pool of potential study participants (i.e. long-term, active Twitter users) will be identified from a database of several million collected Tweets, geo-tagged in the Toronto area. The study will also be limited to participants who live and work in the Greater Toronto area.

3.2. Overall Design

The study can be broken up into three phases:

• a. Running enrollment of study participants (1 month, begins mid-May 2015)

• b. Follow-up period (2 months overall; 2 weeks for each participant)

• c. Study exit and hair sampling (running in parallel to b), and ultimately study takedown by mid-September.

Data about participants will be collected at study entry, specifically socio-demographic and psychological information. Information about the experience of emotional stress and relationship with place will be collected at approximately 10 time points during the follow-up period. Subsequently, at study exit, participants will be asked to complete a checklist of potentially stressful events, in order to understand the influence of major life events during the follow-up period.

3.3. Assessment of Chronic Psychological Stress

The research team has also secured additional funding to augment this summer’s study with a collection of hair samples for cortisol analysis, in order to examine how our devised measure of stress predicts chronic activation and allostatic load (i.e. physiological dysfunction). Participants who agree to this will provide a 0.95 cm hair sample (from the root) at study exit, which will be analysed by immunoassay following a validated protocol (Gow et al., 2010). Because hair grows at a rate of approximately 1.25 cm per month, cortisol embedded in this sample length will reflect a retrospective record of approximately the three prior weeks. Hair cortisol level will be treated as the outcome in regression models with our measure of stress as the predictor.

4. Conclusion and Future Work

There are several key benefits of our study. First, a place-based measure of physiological stress will significantly broaden the potential for research to examine how the neighbourhood environment affects human health and well-being. This could lead to studies that inform the design of neighbourhoods that facilitate stronger prevention and management of stress-related illnesses. Second, the final predictive validation model will create empirical evidence of the inter-relationship amongst emotional and psychological stress, endocrine activation and a range of demographic and psychological traits. The study described in this paper will be repeated, with lessons learned, next year in the city of London. It is hoped that this will strengthen the model and validation, and will also provide a culturally different built environment, with its own characteristics. We hope this will lend itself to some interesting analyses.

Acknowledgments

This work is supported by a Social Sciences and Humanities Research Council (SSHRC) Partnership Development grant and, in part, by an internal Wilfrid Laurier University grant.

References

Abel, F., Hauff, C., Houben, G., Stronkman, R., and Tao, K. Semantics + filtering + search = Twitcident. Exploring information in social web streams. In Proceedings of the 23rd ACM International Conference on Hypertext and Social Media, Milwaukee, USA, 2012.

Baum, A. and Grunberg, N. E. Gender, stress, and health. Health Psychology, 10(2):80–85, 1991.

Bland, J. M. and Altman, D. G. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet, 327(8476):307–310, 1986.

Chang, E. C. Optimism–pessimism and stress appraisal: Testing a cognitive interactive model of psychological adjustment in adults. Cognitive Therapy and Research, 26(5):675–690, 2002.

Chew, C. and Eysenbach, G. Pandemics in the age of Twitter: content analysis of tweets during the 2009 H1N1 outbreak. PLOS One, 5(11):e14118, 2010.

De Choudhury, M. Role of social media in tackling challenges in mental health. In Proceedings of the 2nd International Workshop on Socially-aware Multimedia, pp. 49–52, New York, USA, 2013.

Coppersmith, G., Harman, C., and Dredze, M. Measuring post traumatic stress disorder in Twitter. In Proceedings of the 8th International AAAI Conference on Weblogs and Social Media, Ann Arbor, USA, 2014.

Daar, A. S., Singer, P. A., Persad, D. L., Pramming, S. K., Matthews, D. R., Beaglehole, R., ..., and Bell. Grand challenges in chronic non-communicable diseases. Nature, 450(7169):494–496, 2007.

Dumont, M. and Provost, M. A. Resilience in adolescents: Protective role of social support, coping strategies, self-esteem, and social activities on experience of stress and depression. Journal of Youth and Adolescence, 28(3):343–363, 1999.

Eichstaedt, J. C., Schwartz, H. A., Kern, M. L., Park, G., Labarthe, D. R., Merchant, R. M., ..., and Seligman, M. E. Psychological language on Twitter predicts county-level heart disease mortality. Psychological Science, 26(2):159–169, 2015.

Ekman, P. and Davidson, R. J. (eds.). The Nature of Emotion: Fundamental Questions. Affective Science. Oxford University Press, 1994.

Gow, R., Thomson, S., Rieder, M., Van Uum, S., and Koren, G. An assessment of cortisol analysis in hair and its clinical applications. Forensic Science International, 196(1):32–37, 2010.

Miller, G. Social scientists wade into the tweet stream. Science, 333(6051):1814–1815, 2011.

O’Connor, B., Balasubramanyan, R., Routledge, B., and Smith, N. From tweets to polls: Linking text sentiment to public opinion time series. In Proceedings of the 4th International AAAI Conference on Weblogs and Social Media, Washington D.C., USA, 2010.

Pearlin, L. I. The stress process revisited: Reflections on concepts and their interrelationships. In Handbook of the Sociology of Mental Health. Plenum Press, New York, 1999.

Schwartz, R. and Halegoua, G. R. The spatial self: Location-based identity performance on social media. New Media & Society, 2014. doi: 10.1177/1461444814531364.

Shankardass, K. Place-based stress and chronic disease: A systems view of environmental determinants. In Rethinking Social Epidemiology, pp. 113–136. Springer, Netherlands, 2012.

Shankardass, K., McConnell, R., Jerrett, M., Milam, J., Richardson, J., and Berhane, K. Parental stress increases the effect of traffic-related air pollution on childhood asthma incidence. Proceedings of the National Academy of Sciences, 106:12406–12411, 2009.

Shankardass, K., McConnell, R., Jerrett, M., Lam, C., Wolch, J., Milam, J., Gilliland, F., and Berhane, K. Parental stress increases body mass index trajectory in preadolescents. Pediatric Obesity, 9(6):435–442, 2014.

Sykora, M., Jackson, T. W., O’Brien, A., and Elayan, S. Emotive ontology: Extracting fine-grained emotions from terse, informal messages. IADIS International Journal on Computer Science and Information Systems, 8(2):106–118, 2013.

Thelwall, M., Buckley, K., and Paltoglou, G. Sentiment strength detection for the social web. Journal of the American Society for Information Science and Technology, 63(1):163–173, 2012.

Tumasjan, A., Sprenger, T. O., and Welpe, I. M. Predicting elections with Twitter: What 140 characters reveal about political sentiment. In Proceedings of the 4th International AAAI Conference on Weblogs and Social Media, Washington D.C., USA, 2010.

Watson, D. and Clark, L. A. The PANAS-X: Manual for the positive and negative affect schedule, expanded form. Technical report, University of Iowa, Iowa City, IA, USA, 1994. URL http://www2.psychology.uiowa.edu/faculty/clark/panas-x.pdf.

Zwahlen, D., Hagenbuch, N., Carley, M., Recklitis, C., and Buchi, S. Screening cancer patients’ families with the distress thermometer (DT): a validation study. Psycho-Oncology, 17(10):959–966, 2008.