Identifying First Responder Communities Using Social Network Analysis

John S. Erickson, Katie Chastain, Evan W. Patton, Zachary Fry, Rui Yan, James P. McCusker, Deborah L. McGuinness

Rensselaer Polytechnic Institute, Tetherless World Constellation
110 8th Street, Troy NY 12180 USA
{erickj4}@cs.rpi.edu

Abstract. First responder communities must identify technologies that are effective in performing duties ranging from law enforcement to emergency medical to fire fighting. We aimed to create tools that gather and assist in quickly understanding responders’ requirements using semantic technologies and social network analysis. We describe the design and prototyping of a set of semantically-enabled interactive tools that provide a “dashboard” for visualizing and interacting with aggregated data to perform focused social network analysis and community identification.¹

Keywords: first responders, emergency response, network analysis, topic modeling

1 Introduction

In response to a request from NIST to develop approaches to using social networks and associated technology to improve first responder effectiveness and safety, we used semantic technologies and social network analysis to locate Twitter-based first responder sub-communities and to identify current topics and active stakeholders within those communities. Our objective is to create a repeatable set of Twitter-compatible methods that constitute an initial requirements gathering process.

We report on using social media analysis techniques for the tasks of identifying first responder communities and on examining tools and techniques for identifying potential requirements stakeholders within those networks. Our First Responders Social Network Analysis Workflow (Figure 1)² has helped researchers make sense of the vast quantity of information moving through Twitter.
Identified stakeholders might be engaged by researchers in (for example) participatory design³ tasks that are elements of a requirements gathering methodology. We present first responder-related Twitter data and metadata through interfaces that reduce the overall volume of information, helping researchers keep up with the quickly changing environment of social media.

Fig. 1. Overview of a First Responders Social Network Analysis Workflow

¹ A technical report discussing this work in greater detail may be found at [3]. All tool screenshots mentioned in this paper appear in the tech report in greater detail.
² See also http://tw.rpi.edu/media/latest/workflow2
³ Participatory design studies end user participation in the design and introduction of computer-based systems in the workplace, with the goal of creating a more balanced relationship between the technologies being developed and the human activities they are meant to facilitate. See e.g. [4], citing [5]

2 Identifying First Responder Communities During Disasters

We employed the Twitter Search API⁴ to collect tweets containing one or more hashtags from a list of 17 hashtags identified as relevant by the first responder community.⁵ We report on two events: the anticipated February 2013 Nemo storm and the unanticipated Boston Marathon bombing. A visualization tool allows browsing over time, showing (for example) total tweets for a hashtag while enabling a user to zoom in and explore with finer temporal granularity.

3 Identifying Themes through Topic Modeling

We created a tool to visualize and enable interaction with topic modeling⁶ results, applying MALLET (http://mallet.cs.umass.edu/) across Twitter sample data. The tool presents topics as a pie chart; each “pie slice” represents an emergent topic, with assigned names indicating the most prevalent hashtags occurring in that topic. A popup list of hashtags enables the researcher to view other hashtags that are more loosely related to the topic.

⁴ See e.g.
“Using the Twitter Search API” http://bit.ly/1sY7O
⁵ For the complete list see http://www.sm4em.org/active-hashtags/
⁶ See especially [2]

4 Identifying Hashtags of Interest

Machine learning can help researchers identify hashtags that are topically related to the user’s area of interest but might not be immediately obvious. One of our tools uses co-occurrence to help identify evolving hashtags of interest, relating frequent Twitter posters to hashtags. The intensity of each cell in a matrix indicates the relative frequency with which a given user has tweeted using a particular hashtag. Users are filtered by weighted entropy and a subset is selected to provide the most coverage over hashtags of interest. Researchers may use this tool to develop a fine-grained understanding of topics and to pinpoint users of interest for further requirements gathering.

5 Multi-modal Visualization Tools

Situations may arise where close examination of network dynamics and conversation evolution is necessary. “Multi-modal” data visualizations enable researchers to move seamlessly from macro-scale visualizations to the micro-scale of individual tweets. Fig. 2 shows one level where the propagation and retweeting of themes can be dynamically observed, while another level supports examination of individual tweets and associated media content.⁷

Fig. 2. Multi-modal visualization of Boston Marathon Twitter activity

⁷ Further details of the Twitter dataset used for this visualization may be found in [3]

6 Discussion and Conclusions

Our social network analysis and visualization tools demonstrate methods of passive social network monitoring⁸ intended to help researchers discover topical social network conversations among first responders. These tools have limited ability to connect and engage researchers with individual persons of interest.
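
As an illustration of the user-by-hashtag matrix described in Section 4, the following Python sketch builds per-user hashtag counts, converts them to relative frequencies, and selects a small set of users that covers the hashtags of interest. It is a minimal sketch under stated assumptions, not the tool's implementation: the paper does not define "weighted entropy" or the selection procedure, so the scoring formula (Shannon entropy scaled by log(1 + tweet count)), the choice to keep users at or above a threshold, the greedy coverage strategy, and all names and sample data below are hypothetical.

```python
import math
from collections import Counter, defaultdict


def build_matrix(tweets):
    """Build sparse user-by-hashtag counts from (user, [hashtags]) pairs."""
    counts = defaultdict(Counter)
    for user, hashtags in tweets:
        for tag in hashtags:
            counts[user][tag.lower()] += 1  # treat hashtags case-insensitively
    return counts


def relative_frequencies(row):
    """Cell intensities for one user: share of that user's hashtag uses."""
    total = sum(row.values())
    return {tag: n / total for tag, n in row.items()}


def weighted_entropy(row):
    """ASSUMED score: Shannon entropy (bits) of the user's hashtag
    distribution, scaled by log(1 + number of hashtag uses)."""
    total = sum(row.values())
    entropy = -sum((n / total) * math.log2(n / total) for n in row.values())
    return entropy * math.log1p(total)


def select_users(counts, tags_of_interest, k, min_score=0.0):
    """ASSUMED selection: drop users scoring below min_score, then greedily
    pick up to k users, each adding the most not-yet-covered hashtags."""
    wanted = {t.lower() for t in tags_of_interest}
    pool = {
        user: set(row) & wanted
        for user, row in counts.items()
        if weighted_entropy(row) >= min_score
    }
    chosen, covered = [], set()
    while pool and len(chosen) < k:
        # sorted() keeps tie-breaking deterministic (alphabetical order)
        best = max(sorted(pool), key=lambda u: len(pool[u] - covered))
        if not pool[best] - covered:
            break  # no remaining user adds coverage
        chosen.append(best)
        covered |= pool.pop(best)
    return chosen, covered


# Made-up sample data for illustration only.
tweets = [
    ("alice", ["#SMEM", "#Nemo"]),
    ("alice", ["#smem"]),
    ("bob", ["#Nemo"]),
    ("bob", ["#Nemo"]),
    ("carol", ["#BostonMarathon", "#smem"]),
]
counts = build_matrix(tweets)
chosen, covered = select_users(
    counts, {"#smem", "#nemo", "#bostonmarathon"}, k=2
)
```

Greedy selection is used here only because it is a simple, common way to approximate maximum coverage; a different weighting or selection rule could change which users are surfaced.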
Current and future work includes extending the tools to expose and make actionable more user information, including identifying which individuals are most active on pertinent hashtags and are stakeholders of interest from a requirements gathering perspective. The time-sensitive nature of any Twitter sample dataset requires that visualization tools be adept at filtering over time periods of interest. Current and future work also includes improved support for browsing over time, with emphasis on finding and understanding topic shifts.

Recent events have demonstrated [1] that passive studies using social network data without full user knowledge and consent may backfire. Further studies should carefully examine the social implications of this work and in particular seek to understand at what point, if any, researchers should seek informed consent from potential stakeholder candidates.

7 Acknowledgements

We are grateful to the Law Enforcement Standards Office (OLES) of the U.S. National Institute of Standards and Technology (NIST) for sponsoring this work, and to members of the DHS First Responders Communities of Practice Virtual Social Media Working Group (VSMWG) for numerous helpful discussions.

References

1. Arthur, C.: Facebook emotion study breached ethical guidelines, researchers say. The Guardian (June 2014), http://bit.ly/1kuebVW
2. Blei, D.M.: Probabilistic topic models. Communications of the ACM 55, 77–84 (April 2012), http://bit.ly/1rdOccT
3. Erickson, J.S., Chastain, K., Patton, E., Fry, Z., Yan, R., McCusker, J., McGuinness, D.L.: Technical report: Identifying first responder communities using social network analysis. Tech. rep., RPI (July 2014), tw.rpi.edu/web/doc/tr_firstresponder_communityanalysis
4. Kensing, F., Blomberg, J.: Participatory design: Issues and concerns. Computer Supported Cooperative Work 7, 167–185 (1998), http://bit.ly/1zESj44
5. Suchman, L.: Foreword. In: Schuler, D., Namioka, A. (eds.) Participatory Design: Principles and Practices.
pp. vii–ix (1993)

⁸ Passive monitoring supports constant monitoring of a “default” set of known first responder hashtags, meaning that when unanticipated events such as natural disasters happen, it is likely we’ll have a useful if not perfect sample dataset. Active monitoring, conducted after the fact, supports a deeper examination of user activity, including a focused examination of retweets and an investigation of “spontaneous” hashtags that emerge throughout the event.