Navigating Controversy as a Complex Search Task

Shiri Dori-Hacohen (Center for Intelligent Information Retrieval, University of Massachusetts Amherst), shiri@cs.umass.edu
Elad Yom-Tov (Microsoft Research), eladyt@microsoft.com
James Allan (Center for Intelligent Information Retrieval, University of Massachusetts Amherst), allan@cs.umass.edu

Copyright (c) 2015 for the individual papers by the papers' authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors. ECIR Supporting Complex Search Task Workshop '15, Vienna, Austria. Published on CEUR-WS: http://ceur-ws.org/Vol-1338/.

ABSTRACT

Seeking information on a controversial topic is often a complex task, for both the user and the search engine. There are multiple subtleties involved with information seeking on controversial topics. Here we discuss some of the challenges in addressing these complex tasks, describing the spectrum between cases where there is a clear "right" answer, through fact disputes and moral debates, and discuss cases where search queries have a measurable effect on the well-being of people. We briefly survey the current state of the art, and the many open questions remaining, including both technical challenges and the possible ethical implications for search engine algorithms.

1. INTRODUCTION

With the rise of personalization and the fear that it is creating a "Filter Bubble", that is, exposure to a narrower range of viewpoints [31], navigating controversy is becoming an increasingly challenging task for search engine users and administrators alike. On one hand, by presenting answers to a user's information need [9], search engines feed into confirmation bias and assist users - sometimes unawares - to remain in their own echo chambers. On the other hand, highlighting a controversy outright may have unintended consequences. The subtle differences between fact disputes and their interpretations, between scientific debates and moral stands, further exacerbate these challenges.

Information has a clear effect on the choices people make. The introduction of Fox News, a channel with clear political leanings, was associated with a shift of 3-8% in voting patterns in presidential elections from 1996 to 2000 towards the channel's opinions [10]. In the health domain, queries about celebrities perceived as anorexic were shown to induce queries indicative of eating disorders [45].

Therefore, when a user's information need pertains to a controversial topic, their search task becomes complex, as does the process of presenting the "correct" information. Since search engines match keywords to the retrieved documents, users are often left on their own to find the language used to describe different stances of an argument, in order to issue queries to retrieve information about them, and to classify the returned documents into these different views. Should search engines help users explicitly in this process? Should search engines make users aware of the different aspects of a topic or, alternatively, downweight some views (though this may arguably be viewed as censorship)? One way or another, helping the user navigate the controversial topic, along with its different opinions and stances, is a crucial part of the search engine's role in the case of these complex search tasks, be it implicitly or explicitly.

Some might argue that the search engine's role in the case of controversial topics ends at presenting the results in a simple keyword-based "list of ten links" on a Search Engine Results Page (SERP), and that the search engine has no place to take a moral stand. Even presenting the controversy and the various stances on it may not be a simple choice: if search engines provide comprehensive information on the different stances regarding a topic (e.g. presenting pro-anorexia opinions alongside anorexia treatments), this information may nudge people towards harmful behavior, either by exposing them to wrong or harmful information, or because users may stop perceiving search engines as honest brokers of information.

At the same time, simply providing every result available with no qualification can also be harmful, as disputed claims are allowed to proliferate without any warning to the unsuspecting user. For example, unproven, "quack" medical treatments often put users at risk by warning them not to heed their doctors [1, 4]. With unfounded claims widespread on the web, there are subtle ethical concerns with settling for a "buyer beware" ("caveat emptor") approach. Caplan and Levine raise a similar concern regarding "caveat emptor" in the medical realm: "...researchers have an obligation to do more [in order] to enable patients to make informed choices" [8]. With concerns of life and death in the balance (e.g., in the case of medical controversies), we should not underestimate the impact of such choices on search engine users.

Recent work assumes that trustworthiness should be preserved, for example in the case of knowledge extraction [11]. Some may go as far as arguing that, if technology allows for discernment of trustworthy vs. non-trustworthy sources, the search engine has an obligation to serve the trustworthy results to the users; others may say this is a slippery slope, and may in fact be viewed as censorship.

When discussing "navigating controversy" as a complex search task, there is an additional layer of complexity: beyond the complex task that the user herself is trying to complete, complexity also stems from the search engine's design and algorithmic choices. It's possible that amidst all the websites crawled by an engine, the correct response (if one even exists) is nowhere to be found, or is unfairly biased [42]. Should a search engine operator be concerned with civic or ethical implications of the search results it serves on controversial topics [20]? Should the user always be provided with what they want to see, even if it can be harmful to the user, or to society as a whole? Where should we draw the line between presenting trustworthy information from authoritative sources and discounting incorrect statements, versus presenting opinions on a moral debate?

These questions are open problems. Far from providing the community with a "correct" answer, we'd like to open the discussion on the case of navigating controversy as a complex search task. Here we highlight some of the tasks that users may want to perform when searching for information on controversial topics, including seeking information on controversial topics; understanding different stances or opinions on such topics; and placing results within the context of the larger debate. Even the definition of controversy is an open question, which we will discuss as well.

2. SUPPORTING USERS WITH CONTROVERSIAL QUERIES

In order to account for users' information needs on controversial queries and modulate the results in some way, there is first a technical challenge of recognizing that the query addresses a controversial topic, and determining what is controversial about it. Prior work has shown that it is possible to create classifiers for controversial Wikipedia pages [23, 35, 37] as well as events on Twitter [32]; recently, Dori-Hacohen and Allan demonstrated such a classifier to detect controversial web pages [12, 13].
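As a rough illustration of what page-level detection could look like, consider a minimal sketch under assumed inputs: a page is scored by a nearest-neighbor vote over a handful of seed topics whose controversy labels are taken as given. The seed list, tokenizer, similarity measure and threshold below are invented for the example; this is not the classifier of [12, 13].

```python
# Illustrative sketch only: a k-nearest-neighbor vote over seed topics with
# assumed controversy labels. The seeds, tokenizer, similarity measure and
# threshold are invented here; this is not the classifier of [12, 13].
import re
from typing import List, Tuple

def tokens(text: str) -> set:
    """Lowercase word tokens; a real system would use a proper analyzer."""
    return set(re.findall(r"[a-z]+", text.lower()))

def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if (a | b) else 0.0

def controversy_score(page_text: str,
                      seeds: List[Tuple[str, bool]],
                      k: int = 5) -> float:
    """Fraction of the k most similar seed topics labeled controversial."""
    page = tokens(page_text)
    ranked = sorted(seeds, key=lambda s: jaccard(page, tokens(s[0])), reverse=True)
    top = ranked[:k]
    return sum(1 for _, controversial in top if controversial) / len(top)

# Hypothetical labeled seed topics (text, is_controversial) for the sketch.
seeds = [
    ("mmr vaccine autism claims dispute", True),
    ("abortion debate law morality", True),
    ("gun control legislation rights debate", True),
    ("photosynthesis plants light energy", False),
    ("prime numbers factorization proof", False),
]

score = controversy_score("is the mmr vaccine linked to autism", seeds, k=3)
print(f"controversy score: {score:.2f}, flagged: {score > 0.5}")
```

A real classifier would use richer features and supervised training; the point of the sketch is only the overall shape of the pipeline: score a page, compare the score to a threshold, and decide whether to treat the topic as controversial.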
Controversies can also be detected from a query perspective, if those are available, by finding queries that have semantically opposite meanings [19, 43]. Additionally, some advances have been made in recent years with regards to automatically detecting bias (cf. [33]). The goal of such detection could be to inform the user of the controversy by means of a browser extension or search engine warning [12]. A similar approach was demonstrated with regards to fact disputes [15], a specific type of controversy.
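To make the query-side idea concrete, here is a toy sketch that flags a topic when users issue queries that are identical except for a pair of opposing terms. The opposing-term lexicon and the query log are invented for illustration; the actual methods of [19, 43] are considerably more sophisticated.

```python
# Toy sketch of the query-log perspective: if users issue queries that differ
# only in a pair of opposing terms, the shared remainder is a candidate
# controversial topic. Lexicon and log below are invented for illustration.
from collections import defaultdict

OPPOSING = [("advantages", "dangers"), ("benefits", "risks"), ("for", "against")]

def polar_key(query: str):
    """Map a query to (template, pole) if it contains a polar term."""
    words = query.lower().split()
    for pro, con in OPPOSING:
        for pole, term in (("pro", pro), ("con", con)):
            if term in words:
                template = " ".join(w for w in words if w != term)
                return template, pole
    return None

def candidate_controversies(query_log):
    poles_seen = defaultdict(set)
    for q in query_log:
        key = polar_key(q)
        if key:
            template, pole = key
            poles_seen[template].add(pole)
    return [t for t, poles in poles_seen.items() if poles == {"pro", "con"}]

log = ["advantages of the mmr vaccine",
       "dangers of the mmr vaccine",
       "benefits of jogging"]
print(candidate_controversies(log))  # -> ['of the mmr vaccine']
```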
Assuming one has successfully discovered that a topic is controversial, another challenge is understanding what is controversial about it. In the political sphere, Awadallah et al. demonstrated automatic extraction of politician opinions [3]. Sentiment-based diversification of search on controversial topics has been proposed by Kacimi and Gamper [22], though several researchers have argued that controversy is distinct from sentiment analysis [3, 12, 27].

While frameworks for machine-readable argumentation and "The Argument Web" have been implemented [5], search engines cannot rely on widespread adoption of such tools. Recently, Borra et al. [6] demonstrated an algorithm that detects which topics are most contested within a given Wikipedia page; these and similar advances will be needed in order to present users with explicit stances on controversial topics.

3. THE PROBLEM OF DEFINING CONTROVERSY

How does one define controversy? While there is no one definition of the term controversy, we might use the following definition as an approximation: controversial topics are those that generate strong disagreement among large groups of people. Like the definition of relevance, it's possible that controversy should be defined operationally: whatever people perceive as controversial, is controversial.

However, in line with others' findings [24], our research so far shows that achieving inter-annotator agreement on the "controversy" label is very challenging. Additionally, while intuition and some researchers might suggest that the notion of sentiment should be relevant for controversy (e.g. [32, 39]), others have argued that sentiment is not the right metric by which to measure controversy [3, 12, 27]; opinions on movies and products may contain sentiment, yet lack controversy.
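One way to make the annotation difficulty concrete is to quantify agreement with a chance-corrected measure such as Cohen's kappa. In the sketch below, which uses invented labels rather than data from any annotation study, two annotators agree on 7 of 10 pages, yet kappa comes out at roughly 0.35 once chance agreement is discounted, which illustrates why raw agreement can overstate consensus on a label like "controversial".

```python
# Cohen's kappa for a binary "controversial" label from two annotators.
# The label vectors are invented for illustration only.
def cohens_kappa(a, b):
    assert len(a) == len(b) and len(a) > 0
    n = len(a)
    p_o = sum(x == y for x, y in zip(a, b)) / n      # observed agreement
    p_yes = (sum(a) / n) * (sum(b) / n)              # chance both say "controversial"
    p_no = ((n - sum(a)) / n) * ((n - sum(b)) / n)   # chance both say "not controversial"
    p_e = p_yes + p_no                               # expected (chance) agreement
    return (p_o - p_e) / (1 - p_e) if p_e < 1 else 1.0

annotator_1 = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
annotator_2 = [1, 0, 0, 0, 1, 0, 1, 0, 0, 0]
print(round(cohens_kappa(annotator_1, annotator_2), 2))  # -> 0.35
```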
Likewise, we find some of the definitions of controversy used by others, or the datasets that those definitions lead them to use, to be very problematic (e.g. definitions that confound vandalism and controversy and therefore rate "podcast" as the most controversial topic in Wikipedia [40], or relying on the list of Lamest Edit Wars in Wikipedia as a controversy dataset [7]).

It may perhaps be helpful to break the definition of controversy into several interrelated definitions: for example, bias, disputes, truth value and polarity or intensity of emotion are potentially easier terms to define, but each of them only partially covers controversy. How does controversy relate to these constructs, and how would one proceed to discover the relationships between them?

Additionally, does the scope and context of the controversy matter? For example, do controversies regarding occurrences on American Idol (which may induce edit wars on Wikipedia) matter less than a controversy on the Israeli-Palestinian Conflict? One could argue that the latter is a much more controversial and influential topic; but for the user searching for "American Idol" or, for example, "Joanna Pacitti" (a controversial contestant on the show), perhaps the knowledge that this represents a controversial topic may be just as relevant, in the context of that search.

Though one may have an intuitive understanding of the term "controversy", without a structured definition, our work (as well as others') will not hold as much weight or predictive power.

3.1 Single truth or shades of gray

Information needs vary in the number of answers to them, both correct and incorrect. Some information needs have a single correct answer, while others may have several possible correct answers, requiring a moral judgment or entailing an opinion, e.g. political and religious questions. There are also questions for which there is a single scientifically correct answer, but for which non-scientific responses exist, even though they are factually incorrect. For example, some people claim that the Measles-Mumps-Rubella (MMR) vaccine causes autism; though studies have shown this claim to be incorrect, it is still believed by many people.

This variation in answers requires different treatment in each case. The simplest category is that where the information need has a single, correct answer, which the search engine can provide. The second category is of questions which have a technically correct response, but also an incorrect one which is prevalent on the web. Recent research by White and Hassan has demonstrated this phenomenon in web search results, and specifically in health search [42]. The last category is of questions which have several possible correct answers, among which people may choose by making a moral judgment, for example, topics of abortion, same-sex marriage, and other highly charged issues; religious and political questions often fall under this umbrella.

Selective exposure theory shows that people seek information which affirms their viewpoint and avoid information which challenges it [16]. Exposure to differing viewpoints has been shown to be socially advantageous in reducing the likelihood of adopting polarized views [36] and increasing tolerance for people with other opinions [17]. These advantages have led some to argue that technology could be used to expose people to a broader variety of perspectives, for example by modifying the display of information to nudge people to becoming "open-minded deliberators" [17].

This reasoning has led researchers to try and inform people of the differing views on the topics which they are reading. Providing people with feedback as to how much (on average) their reading was biased towards one or another political opinion had only a small effect on nudging people to read more diverse opinions [28]. Kriplean et al. [26] developed a system for people to explicitly construct and share pro/con lists for a political election in Washington state, but found that opinions did not significantly change after using the system. In another experiment, Oh et al. [30] found that people preferred search results which were clearly delineated as to their leaning. Recently, Yom-Tov et al. [43] showed that people would read opinions opposite to theirs if their language model was appropriately selected. Such an intervention had long-lasting effects on reducing selective exposure. Thus, it is technically possible to provide people with diverse opinions where they have sought only one, but there still remains the question of whether this should be the role of a search engine.

An additional concern is whether claiming that certain facts are "true" or "false" holds any objective meaning. The scope of this paper does not allow a deep dive into the philosophical questions of objectivism vs. moral relativism, and the constructs of objectivity, subjectivity and intersubjectivity.^1 Nonetheless, we can still delineate a few obvious concerns: the choice of which facts are in dispute, or which topics are controversial, can vary significantly with the cultural and social setting in which these questions are evaluated. For example, a user in Israel and a user in Iran may have very different opinions about what holds "true", and either may be offended if the other's worldview was presented as a "fact"; what is fact to one is either highly controversial or simply false to the other, and vice versa. As another example, the research by White and Hassan cited above [42] assumes that the Western world's view of medicine is the only correct one, but users in China may beg to differ. Is a topic therefore only controversial if a user (or culture) believes it to be so? Who, then, can decide when a topic is controversial? How can the system know that a user believes a topic is controversial, and should the system then respond differently than when a user accepts it as "fact"?

^1 Dubois [14] provides an insightful exploration of these concepts with regard to stance taking.

4. OPEN QUESTIONS

Several researchers have claimed that search engines have significant political power [21]. In his book Republic.com 2.0, legal scholar Cass Sunstein argues that a purely consumer-based approach to Internet search is a major risk for democracy [38]. One of deliberative democracy's basic tenets, he argues, is the ability to have a shared set of experiences, and to be exposed to arguments you disagree with. Search engines and social media are increasingly responsible for "Filter Bubbles", wherein click-feedback and personalization lead users to only see what they want, serving to further increase confirmation bias [31]. While this may seem to match individual users' preference, the net effect on society is potentially detrimental. Being exposed only to like-minded people in so-called "echo chambers" serves to increase polarization and reduce diversity [34].^2

^2 We note that, despite our own biases, the values of democracy and diversity of opinion are also culturally predicated, and not necessarily applicable to all search engine users.

Contrary to the common wisdom, some evidence exists that online personalization has not increased the filter bubble [18]. That said, research has shown that exposing users to opposing opinions increases their interest in seeking diverse opinions, and their interest in news in general [43]. There have been suggestions to diversify search results based on sentiment [22], though others argue that presenting the opposite opinion would only help in some cases [2, 29]. Prior bias of people changes the results of a search query, even without personalization. For example, the results for the query "what are the advantages of the MMR vaccine?" are completely different from the results served for the query "what are the dangers of the MMR vaccine?". Moreover, the way people interpret the same information is dependent on their bias, for example in the case of gun control [25] or bias towards vaccines [44]. Thus, if a user seeks information on "how does MMR cause autism?", should a search engine inform the user of the truth, or just satisfy their information need? One possible solution includes highlighting disputed claims [15] or explicitly presenting opposing viewpoints [41], but the problem remains that the user may not trust sources that don't match their existing worldview.
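To illustrate what result-level diversification could mean mechanically, the sketch below interleaves a ranked list round-robin across per-document stance labels, so that no single stance monopolizes the top of the SERP. It is a toy under stated assumptions (the stance labels and documents are invented), not the sentiment-based method of [22] or the language-model selection of [43]; whether such re-ranking should be applied at all is, of course, exactly the open question discussed here.

```python
# Illustrative only: round-robin interleaving of a ranked result list by an
# assumed per-document stance label. The documents and labels are invented;
# this is not the method of [22] or [43].
from collections import OrderedDict, deque

def interleave_by_stance(ranked_results):
    """ranked_results: list of (doc_id, stance) in relevance order."""
    buckets = OrderedDict()                # stance -> queue of docs, by first appearance
    for doc, stance in ranked_results:
        buckets.setdefault(stance, deque()).append(doc)
    diversified = []
    while any(buckets.values()):
        for queue in buckets.values():     # take one result per stance per round
            if queue:
                diversified.append(queue.popleft())
    return diversified

results = [("doc1", "vaccine-critical"), ("doc2", "vaccine-critical"),
           ("doc3", "mainstream-medical"), ("doc4", "vaccine-critical"),
           ("doc5", "mainstream-medical")]
print(interleave_by_stance(results))
# -> ['doc1', 'doc3', 'doc2', 'doc5', 'doc4']
```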
Since search engines (as well as their social media counterparts) are increasingly the dominant medium for seeking information and news, the question then becomes: should search engines reflect what is on the internet and match content to users to maximize their preference, regardless of its truth value, or any concerns about diversity of opinion? Where do we draw the line between fact disputes and moral debates? Should the controversial nature of a topic depend on the social and cultural setting in which it is being evaluated? Should the search engines have a civic duty, and in that case, who decides what that duty is?

There are multiple technical challenges remaining in classifying controversial topics and extracting the opinions about them. However, even if these technical challenges of detecting controversy and stances were solved, there remains the question of if, when and how to present these to the user, based on their information need. As we discussed, there are ethical concerns with a search engine taking action, but also with inaction. It remains to be seen if users would be interested in hearing opposing opinions, or whether interventions would be useful; and finally, it is unclear whether it is within the search engine's purview (or even its duty) to intervene, and if so, how.

Acknowledgments

This work was supported in part by the Center for Intelligent Information Retrieval and in part by NSF grant #IIS-1217281. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect those of the sponsor. We thank Gonen Dori-Hacohen, Myung-ha Jang, Shankar Kumar, Kishore Papineni and David Wemhoener for fruitful conversations. Special thanks to the anonymous reviewers for their insightful comments.
5. REFERENCES

[1] American Cancer Society. Metabolic Therapy, Mar. 2012. Retrieved from http://www.cancer.org/treatment/treatmentsandsideeffects/complementaryandalternativemedicine/dietandnutrition/metabolic-therapy
[2] J. An, D. Quercia, and J. Crowcroft. Why individuals seek diverse opinions (or why they don't). In Proceedings of the 5th Annual ACM Web Science Conference, WebSci '13, pages 15-18, 2013.
[3] R. Awadallah, M. Ramanath, and G. Weikum. Harmony and Dissonance: Organizing the People's Voices on Political Controversies. In WSDM, pages 523-532, Feb. 2012.
[4] S. Barrett and V. Herbert. Twenty-Six Ways to Spot Quacks and Vitamin Pushers, 2014. Retrieved from http://www.quackwatch.org/01QuackeryRelatedTopics/spotquack.html
[5] F. Bex, M. Snaith, J. Lawrence, and C. Reed. ArguBlogging: An application for the Argument Web. Web Semantics: Science, Services and Agents on the World Wide Web, 25:9-15, Mar. 2014.
[6] E. Borra, A. Kaltenbrunner, M. Mauri, E. Weltevrede, D. Laniado, R. Rogers, P. Ciuccarelli, and G. Magni. Societal Controversies in Wikipedia Articles. In Proceedings of CHI 2015, pages 3-6, 2015.
[7] S. Bykau, F. Korn, D. Srivastava, and Y. Velegrakis. Fine-Grained Controversy Detection in Wikipedia, 2015.
[8] A. Caplan and B. Levine. Hope, hype and help: Ethically assessing the growing market in stem cell therapies. Current, 10(5):33-34, 2010.
[9] D. Carmel, E. Yom-Tov, A. Darlow, and D. Pelleg. What makes a query difficult? In Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 390-397. ACM, 2006.
[10] S. DellaVigna and E. Kaplan. The Fox News effect: Media bias and voting. The Quarterly Journal of Economics, 122(3):1187-1234, 2007.
[11] X. L. Dong, E. Gabrilovich, K. Murphy, V. Dang, I. Watts, W. Horn, C. Lugaresi, S. Sun, and W. Zhang. Knowledge-Based Trust: Estimating the Trustworthiness of Web Sources. ArXiv preprint, 2015.
[12] S. Dori-Hacohen and J. Allan. Detecting controversy on the web. In Proceedings of the 22nd ACM International Conference on Information & Knowledge Management, CIKM '13, pages 1845-1848, New York, NY, USA, 2013. ACM.
[13] S. Dori-Hacohen and J. Allan. Automated Controversy Detection on the Web. In ECIR '15, to appear, 2015. Preprint available at: http://maroo.cs.umass.edu/pub/web/getpdf.php?id=1173
[14] J. W. Du Bois. The stance triangle. Stancetaking in discourse: Subjectivity, evaluation, interaction, pages 139-182, 2007.
[15] R. Ennals, B. Trushkowsky, and J. M. Agosta. Highlighting disputed claims on the web. In Proceedings of the 19th International Conference on World Wide Web, WWW '10, page 341, New York, New York, USA, 2010. ACM Press.
[16] D. Frey. Recent research on selective exposure to information. Advances in Experimental Social Psychology, 19:41-80, 1986.
[17] R. K. Garrett and P. Resnick. Resisting political fragmentation on the Internet. Daedalus, 140(4):108-120, 2011.
[18] M. Gentzkow and J. M. Shapiro. Ideological segregation online and offline. Quarterly Journal of Economics, 126:1799-1839, 2011.
[19] K. Gyllstrom and M.-F. Moens. Clash of the typings: finding controversies and children's topics within queries. In Proceedings of the 33rd European Conference on Advances in Information Retrieval, ECIR '11, pages 80-91, Berlin, Heidelberg, 2011. Springer.
[20] L. M. Hinman. Esse est indicato in Google: Ethical and political issues in search engines. International Review of Information Ethics, 3(6):19-25, 2005.
[21] L. D. Introna and H. Nissenbaum. Shaping the Web: Why the Politics of Search Engines Matters. The Information Society, 16(3):169-185, 2000.
[22] M. Kacimi and J. Gamper. MOUNA: Mining Opinions to Unveil Neglected Arguments. In Proceedings of the 21st ACM International Conference on Information and Knowledge Management, CIKM '12, page 2722, New York, New York, USA, Oct. 2012. ACM Press.
[23] A. Kittur, B. Suh, B. A. Pendleton, and E. H. Chi. He Says, She Says: Conflict and Coordination in Wikipedia. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '07, pages 453-462, New York, NY, USA, 2007. ACM Press.
[24] M. Klenner, M. Amsler, and N. Hollenstein. Verb Polarity Frames: a New Resource and its Application in Target-specific Polarity Classification. In Proceedings of the 12th edition of the KONVENS Conference, Vol. 1, Hildesheim. Universität Hildesheim, 2014.
[25] D. Koutra, P. Bennett, and E. Horvitz. Events and Controversies: Influences of a Shocking News Event on Information Seeking. TAIA Workshop at SIGIR, pages 0-3, 2014.
[26] T. Kriplean, J. Morgan, D. Freelon, A. Borning, and L. Bennett. Supporting reflective public thought with ConsiderIt. In Proceedings of the ACM Conference on Computer Supported Cooperative Work (CSCW), pages 265-274. ACM, 2012.
[27] Y. Mejova, A. X. Zhang, N. Diakopoulos, and C. Castillo. Controversy and Sentiment in Online News. Sept. 2014.
[28] S. A. Munson, S. Y. Lee, and P. Resnick. Encouraging Reading of Diverse Political Viewpoints with a Browser Widget. In Proceedings of the International Conference on Weblogs and Social Media, 2013.
[29] S. A. Munson and P. Resnick. Presenting Diverse Political Opinions: How and How Much. In Proc. CHI 2010, CHI '10, pages 1457-1466, New York, NY, USA, 2010. ACM.
[30] A. Oh, H. Lee, and Y. Kim. User evaluation of a system for classifying and displaying political viewpoints of weblogs. In Proc. ICWSM, 2009.
[31] E. Pariser. The Filter Bubble: What the Internet is hiding from you. Penguin Press HC, 2011.
[32] A.-M. Popescu and M. Pennacchiotti. Detecting controversial events from twitter. In Proceedings of the 19th ACM International Conference on Information and Knowledge Management, CIKM '10, pages 1873-1876, 2010.
[33] M. Recasens, C. Danescu-Niculescu-Mizil, and D. Jurafsky. Linguistic Models for Analyzing and Detecting Biased Language. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, 2013.
[34] D. Schkade, C. R. Sunstein, and R. Hastie. What happened on deliberation day? California Law Review, 95(298):915-940, 2007.
[35] H. Sepehri Rad and D. Barbosa. Identifying Controversial Articles in Wikipedia: A Comparative Study. In Proceedings of the 8th Conference on WikiSym, WikiSym '12. ACM, 2012.
[36] A. L. Stinchcombe. Going to Extremes: How Like Minds Unite and Divide. Contemporary Sociology: A Journal of Reviews, 39(2):205-206, 2010.
[37] R. Sumi, T. Yasseri, A. Rung, A. Kornai, and J. Kertész. Edit wars in Wikipedia. In Privacy, Security, Risk and Trust (PASSAT), 2011 IEEE Third International Conference on Social Computing (SocialCom), pages 724-727, 2011.
[38] C. R. Sunstein. Republic.com 2.0. Princeton University Press, 2009.
[39] M. Tsytsarau, T. Palpanas, and K. Denecke. Scalable detection of sentiment-based contradictions. DiversiWeb 2011, 2011.
[40] B.-Q. Vuong, E.-P. Lim, A. Sun, M.-T. Le, H. W. Lauw, and K. Chang. On ranking controversies in Wikipedia: models and evaluation. In Proceedings of the International Conference on Web Search and Web Data Mining, WSDM '08, pages 171-182, New York, NY, USA, 2008. ACM.
[41] V. G. V. Vydiswaran, C. Zhai, D. Roth, and P. Pirolli. BiasTrust: Teaching Biased Users About Controversial Topics. In Proceedings of the 21st ACM International Conference on Information and Knowledge Management, CIKM '12, pages 1905-1909, New York, NY, USA, 2012. ACM.
[42] R. W. White and A. Hassan. Content bias in online health search. ACM Transactions on the Web (TWEB), 8(4):25, 2014.
[43] E. Yom-Tov, S. T. Dumais, and Q. Guo. Promoting civil discourse through search engine diversity. Social Science Computer Review, 2013.
[44] E. Yom-Tov and L. Fernandez-Luque. Information is in the eye of the beholder: Seeking information on the MMR vaccine through an Internet search engine. In Proceedings of the American Medical Informatics Association, 2014.
[45] E. Yom-Tov and d. m. boyd. On the link between media coverage of anorexia and pro-anorexic practices on the web. International Journal of Eating Disorders, 47(2):196-202, 2014.