Visual Sentiment Analysis of RSS News Feeds Featuring the US Presidential Election in 2008 Franz Wanner, Christian Rohrdantz, Florian Mansmann, Daniela Oelke, Daniel A. Keim University of Konstanz, Germany firstname.lastname@uni-konstanz.de ABSTRACT making a purchase decision or afterwards to cope with the The technology behind RSS feeds offers great possibilities product’s shortcomings or praise its functionality. Further- to retrieve more news items than ever. In contrast to these more, politicians want to find out their public reputation, the technical developments, human capabilities to read all these manner the news write about them, and the reaction of the news items have not increased likewise. To bridge this gap, public on these articles. this paper presents a visual analytics tool for conducting semi-automatic sentiment analysis of large news feeds. While Since public opinion polls are an expensive undertaking, our the tool automatically retrieves and analyzes RSS feeds with goal is to offer a semi-automatic approach by mining the respect to positive and negative opinion words, the more de- web for particular key words, conducting sentiment analysis manding news analysis of finding trends, spotting peculiar- on the text to assess how positive or negative a particular ities and putting events into context is left to the human ex- news postings is, and then to present the information in a pert. For a solid analysis the news similarity filter enables visual exploration tool. While our approach is not suitable to highlighting of similar or redundant news items. A case completely replace a thoroughly conducted opinion poll due study about news related to the US presidential election in to the lack of accuracy, it has also some unique advantages, 2008 shows how the visual interface of the tool empowers namely low costs and the possibility to continuously monitor the analyst to draw meaningful conclusions without the ef- a particular subject in real-time. 
Knowing at an early stage fort of reading all news postings. that consumers have a problem with a sub-component of a product gives the company more time to react appropriately Author Keywords and to avoid damage to valuable trade marks. sentiment analysis, opinion mining, information visualiza- tion, visual analytics In this paper, we demonstrate a novel way of using text anal- ysis methods in combination with a visual representation. On the one hand, this system automatically evaluates the ACM Classification Keywords emotional content of a news posting. On the other hand, the H.5.2 Information Interfaces and Presentation: Miscellaneous visual interface empowers the human expert to draw mean- ingful conclusions, to selectively read a few news postings INTRODUCTION with strong emotional content, to discover trends, and to gain The web is the largest information source in the world. One an overview of the development of chosen topic in the me- major aspect of the web is to bring news from all over the dia. world via RSS feeds instantaneously on your screen. Apart from passive usage of the web as a media, web 2.0 technol- To exemplify our tool we have a closer look at the news ogy helps more and more people to actively contribute to this coverage in the web of the 2008 US presidential election. valuable information source by creating content in an easy Out of 50 chosen political RSS newstickers, we retrieved way. There are many possibilities to take an active part in all RSS articles containing at least one of the following key the web: blogs, reviews and other ways to state comments. words: “Obama”, “McCain”, “Biden” and “Palin” as well as “Democrat” and “Republican”. Thereupon, the articles are Analyzing news stories and user generated content is of huge automatically evaluated with respect to the contained posi- importance for many people and organizations. 
Economic tive and negative opinion words, resulting in a normalized analysts, for example, would like to find consumer and pub- sentiment score for each article. lic opinions on their products and services. Likewise, po- tential consumers seek experiences of existing users before For presentation purposes, these articles are then visualized on a daily timeline using symbols to encode the contained key words. The vertical position of each symbol is defined by the article’s sentiment score, which makes strong emo- tional news more visible. Furthermore, we demonstrate an interactive feature to show relations between the news items to track the development of a specific topic. Workshop on Visual Interfaces to the Social and the Semantic Web (VISSW2009), IUI2009, Feb 8 2009, Sanibel Island, Florida, USA. Copy- The rest of this paper is structured as follows: In section right is held by the author/owner(s). 1 Related Work text and sentiment analysis methods and vi- Two further approaches being related to our work are [2] and sual interfaces for them are discussed. The next section Vi- [11]. Both of them analyze blogs and / or newspaper articles sual Sentiment Analysis then presents our processing, visu- with respect to their political orientation. However, none of alization, and interaction approaches for analyzing the news the approaches explores the development over time as we coverage of the 2008 US presidential election. Afterwards, do. Instead they both focus on analyzing the link structure section Results shows how some interesting topics about the between the different blogs respectively the citation patterns candidates and their parties manifest in our visualization. By for newspaper articles. In addition, [11] takes into account summarizing our contributions we draw our conclusions in how emotionally charged a post is. the last section. 
Sentiment Analysis Within the abundant literature that exists in the context of RELATED WORK sentiment analysis and opinion mining, some major tasks can be identified: Text Analysis The visualization and visual analysis of textual data is in- • Classification of the statements of a document (or a sen- creasingly attracting interest in different application domains. tence) as subjective or objective. (e.g. [29, 14]) Many of the early approaches in that area dealt with the vi- sualization of retrieval results (see e.g., VIBE [22] or In- • Classification of a document (or a sentence) as expressing foCrystal [27]). Furthermore, a variety of techniques con- a negative or positive sentiment (or opinion). (e.g. [25, centrate on the visualization of large document collections, 5]) most of which are based on dimensionality-reduction meth- ods (see e.g. WebSOM [23], Galaxies and ThemeScape of • Feature-based opinion mining made up by two successive IN-SPIRET M [30], or [9]). In contrast to this, text feature steps: First, the features (or attributes), that have been visualization techniques visualize single documents in de- commented on, are identified. Secondly, the respective tail and show the distribution of specific text features across opinion that has been expressed on them is detected. (e.g. the text. Prominent examples among these are e.g. TileBars [17, 18, 26, 21, 20]) [16], Seesoft [3], the FeatureLens [6], and Literature Finger- printing [19]. But also [1] and the Compus system of Fekete Note that our approach is not contributing to the area of au- and Dufournaud [7] are worth being mentioned: As opposed tomatic sentiment analysis but makes use of some of its stan- to the other techniques they offer the possibility to visualize dard techniques. However, we contribute to the development several text features at once. of visualizations for sentiment analysis. Related work in this respect includes [10, 24, 13]. 
The visualization, which Relatively few approaches tackle the problem of visualizing shows to have the highest resemblance to our work, can be temporal variations across a set of documents as we do in found in [24]. The authors suggest to use bars to visualize this paper. One example for such an approach is the well- how many positive respectively negative statements – that known ThemeRiver visualization [15] that reveals the devel- comment on one of the analyzed attributes of a product – ex- opment of topics over time in a river-like graphic. Accord- ist within the document corpus. Our work is similar in that ing to the metaphor each topic is represented as one colored we also use the vertical deflection of bars to encode the opin- “current” in the “river” that flows in the direction of the time- ion that is expressed. In contrast to [24] however, in our case line from left to right. To allow for several different themes one bar represents one document instead of the summary of to be displayed at once the currents are stacked on top of all sentences talking about a specific attribute of a product. each other. The thickness of a current at a specific point Moreover, in our visualization the development over time is in time represents the strength of the topic in the associated central, something that is completely omitted in all of the documents. TimeMines [28] and Narratives [8] are exam- above mentioned approaches for sentiment analysis / opin- ples for visualizations that are based on standard line charts. ion mining. In [10] customer reviews are visualized, too, TimeMines automatically determines keywords and judges but a Treemap representation is used to display the result of those keywords with respect to their temporal significance the analysis. Finally, [13] presents an adaptation of the Rose in the context of the corpus. 
Furthermore, keywords that Plot visualizations to illustrate the affective content of a doc- show to have a similar development over time are grouped ument. In addition to positive and negative sentiments, the to form a topic. Narratives presents the development of a documents are also analyzed with respect to the categories specific topic over time and searches for correlated terms. virtue, vice, pleasure, pain, power cooperative, and power conflict. A similar concept is reported in [12]. The system BlogPulse (that can be found at www.blogpulse.com) monitors blogs VISUAL SENTIMENT ANALYSIS and displays timelines that show how many blogs talk about Data Processing a specific topic at a specific point in time. In addition, hot The data we used was gathered from 50 different RSS news topics are detected automatically. All of the mentioned time- feeds, that mainly dealt with the 2008 US presidential elec- oriented approaches have a common limitation: They merely tions. The RSS feeds were retrieved every 30 minutes during display the development of the significance of keywords or a time interval of one month (10/09/2008 - 11/10/2008). For topics over time. Our approach goes beyond that by means every news item in each feed we saved date, title and descrip- of additionally revealing the sentiment of the documents. tion, as well as the id of the feed. Next, noise was eliminated 2 out of the title and description. With noise we refer to strings (negative). Horizontal lines mark the position that a news that do not carry any content, such as URLs or strings con- item would have that is neither positive nor negative. sisting of special characters. The concatenation of title and description was then considered to be the content of the news item. Finally, we filtered out those documents that contained Coloring none of the following signal words: “Obama”, “McCain”, Everything that is solely related to the conservatives (Repub- “Biden”, “Palin”, “Democrat” and “Republican”. 
More than lican party) is colored in red and everything purely related to 23,000 news items contained at least one of the six strings. the liberals (Democratic party) in blue. Gray news objects relate both to the liberals and the conservatives, which basi- Pairwise similarities between news items were calculated by cally means that both camps are mentioned within the news’ applying a similarity measure, which counts the number of content. non-stopwords that two items have in common (normalized by the length of the larger item). Although this is a relatively simple measure it works quite well for the short descriptive Shape texts in the RSS news feeds. The use of different shapes for the object allows us to make a distinction between news items in which the first candi- Another aspect of interest is the sentiment context of a news date of a party was mentioned, the second candidate but not item, which is done by enriching each item with a sentiment the first candidate or none of them but only the name of the score. For this purpose we make use of a freely available party. Figure 1 shows the visual appearance of the different list of words that evoke positive or negative associations [4]. shapes. Please note that we keep the horizontal interruptions We count the number of positive and negative words and that are utilized to mark news items that talk about the sec- evaluate the whole news item as rather positive if it contains ond candidate always at the same vertical position of each in total more positive than negative words. Likewise, the line (regardless of the vertical shift of the object that encodes item is evaluated as rather negative if it contains more neg- the emotional score). This leads to a clear visual pattern of ative than positive words. The absolute relation of positive continuous white horizontal lines, if several neighboring ob- against negative words normalized by the item’s length, pro- jects refer to the second candidates only. 
vides our sentiment score. One important point to mention here is that the appearance of a candidate, e.g., in a negative context, does not necessarily mean, that the item contains negative publicity for the candidate, but simply that he ap- pears in a negatively connoted context. This becomes clear when we consider the example of news telling that racists planned to assassinate Obama (see section “Results”). This was bad news for Obama not about Obama, with a visibly negative connotation. Data Visualization The visualization on the one hand aims to give a meaningful representation of the data and on the other hand is intended to be an appropriate starting point for the interactive explo- ration and discovery of interesting patterns. Figure 4 shows a Figure 1. Symbols used to represent news items according to the ap- screenshot of the visualization. Each line represents one day pearance of certain keywords. and each colored object depicts one news item. The news item’s emotional score is encoded by a vertical displacement of the news item. Colors encode whether the text mentions Opacity the Democratic party, the Republican party or both. Addi- We paint our news objects with a relatively low opacity. That tionally, the shape of the news objects visualizes whether the means they are partly transparent, which comes with two ad- first candidate, the second candidate or only the name of the vantages: First, the problem of overlapping news objects is party itself was mentioned. The following passages describe reduced. In most cases every object is visible and can be each of those aspects in detail. differentiated clearly from its overlapping neighbors. Sec- ondly, if multiple news items are put on top of each other, Placement the overall opacity at this position increases, resulting in an Every news item is represented by an object in a 2D plane. 
object that is less opaque and can therefore be distinguished The position of the object within the plane depends on the from objects that represent just one news item. The situation date the news was published. Thereby, the day it was pub- that several feeds bring the same news nearly at the same lished accounts for the line it will be placed in (as each line moment in time is often the case when the news is very im- represents one day) and the time of day determines its hor- portant. That means that the less opaque news objects of- izontal position within the line. The exact vertical position ten represent news that are more important and surely more depends on the sentiment score of the object. According to widely spread. Figure 2 visually illustrates the above men- this value an object is slightly shifted up (positive) or down tioned design decisions. 3 Republican. To exemplify our Visual Analytics technique, higher α-value: same “Biden in highlighted we picked five interesting discussions in the monitored RSS news item from neutral context” news item feeds. different feeds + Palin abused power in Alaska On Saturday, 10th October, many negative news postings oc- two horizontal curred about Sarah Palin. Almost all articles deal with the sentiment lines represent topic whether Sarah Palin had abused her power in Alaska shift one day or not. As demonstrated in Fig. 5 there is a high density of red shapes with two white bars symbolizing news postings - about Palin. Their positions below the baseline denote that mainly negative emotion words were used in these postings. “Democrats in “McCain in Only one exceptionally positive red news item sticks out in negative context” about one hour positive context” the visualization. A closer look at this posting reveals that it of the day is a response from the McCain-Palin presidential campaign: “Sarah Palin acted ‘within proper and lawful authority’ in removing the state’s public safety commissioner”. Figure 2. 
Semantics of the visualization Fri Oct 10 19:41:49 CST 2008 (Feed 19): Fri Oct 10 22:15:22 CST 2008 (Feed 39): Palin abused power Alaska 'Troopergate' Palin says report says she acted lawfully probe finds: AFP - Republican vice- (Reuters): Reuters - Alaska Gov. Sarah Palin presidential nominee Sarah Palin abused her acted "within proper and lawful authority" in position as Alaska Governor by pressuring removing the state's public safety Interactive Visual Analytics officials to dismiss a state trooper, an commissioner, the McCain-Palin Republican investigator's report said. presidential ticket said on Friday in response The visualization is designed for an interactive data explo- to a state report. ration. There are several possibilities to interact with the tool: • Zooming: Continuous zooming allows to analyze certain parts at a greater level of detail. Fri Oct 10 21:50:40 CST 2008 (Feed 32): Alaska ethics probe says Palin abused her Fri Oct 10 19:24:20 CST 2008 (Feed 49): power: CHILLICOTHE, Ohio (Reuters) - An Alaska panel finds Palin abused power in Alaska ethics inquiry found on Friday that U.S. • Details on demand: When the mouse is dragged over a firing: ANCHORAGE, Alaska (AP) -- A Republican vice presidential candidate Sarah Palin abused her power as the state's legislative committee investigating Alaska news object, a tooltip appears containing date, time, feed Gov. Sarah Palin has found she unlawfully abused her authority in firing the state's governor, casting a cloud over John McCain's controversial choice of running mate for the November 4 election. id, and content of the item. public safety commissioner. The investigative report concludes that a family grudge wasn't the sole reason for firing Public Safety Fri Oct 10 21:06:44 CST 2008 (Feed 18): Commissioner Walter Monegan but says it Probe accuses Palin of abuse of power (AFP): • Similarity search: With a mouse click on a news object, likely was a contributing factor.... 
AFP - Investigators found vice presidential nominee Sarah Palin abused her powers as Alaska governor, dealing another blow to the search for similar news items is started. The news item Republican John McCain's struggling White House bid. itself and every other news object that is related to it is highlighted (please refer to section “Data Processing” for our definition of similarity). Figure 3 shows an example. Figure 5. Media coverage dealing with the topic of Sarah Palin’s abuse of power as a governor of Alaska. • Filtering: The user can select the different candidates / parties he is interested in. Another possibility to reduce Bad news for the Democrats the number of items that are displayed is to select one spe- Approximately one week before the US presidential elec- cific RSS feed. Both filtering mechanisms can be used tion we detected a high appearance of news which included to analyze in detail the behaviour of one specific news “Obama” (see Fig. 6). The sentiment scores of these post- provider respectively the development of news for a sub- ings were mainly negative and dealt with a plot to assassi- set of candidates and/or parties. nate Barack Obama and 102 blacks. Note that the news are bad for him but not about him, meaning that a negative event is related to him in the news postings although the negative opinion words do not refer to him as a person. The used emotion words were so strong, that even in the overview it is possible to recognize the emergence of the negative news of that event on 28th of October (see Fig. 4). Figure 3. After selecting one news item, similar items are highlighted Note that although each RSS posting only consist of a few in yellow enabling the user to track specific topics (low threshold) or sentences, the few contained positive or negative opinion redundant postings (high threshold). words are sufficient to provide clear results. 
Further head- lines of that day discuss the corruption scandal of a Demo- RESULTS cratic senator and result in negative headlines for the Democrats. First of all, we present an overview of all 50 monitored RSS feeds over a time period of 31 days in Fig. 4. A prede- TV debate Obama vs. McCain fined filter displays all news postings containing at least one In the middle of October the final TV debate between the of the terms Obama, McCain, Biden, Palin, Democrat, and Democrat candidate Barack Obama and the Republican can- 4 A B C D E Figure 4. 31 days of the 2008 US presidential election showing a scandal of power abuse by Palin (A), the TV debate McCain vs. Obama (B), assassination plans against Obama (C), the election day (D), and a debate about Palin’s election wardrobe (E). 5 8). These outliers deal with some critical notes about the ex- pensive wardrobe, which was bought by Sarah Palin for her campaign, and her inappropriate use of language describing her critics. Fri Nov 07 15:40:35 CST 2008 (Feed 23): Fri Nov 07 17:56:01 CST 2008 (Feed 31): GOP tries to sort out Palin's donor-funded Palin fires back at leaks questioning her duds: WASHINGTON (AP) -- Republican smarts: WASHINGTON (Reuters) - Alaska Party lawyers are still trying to determine Gov. Sarah Palin fired back on Friday exactly what clothing was purchased for against post-election claims by aides to Alaska Gov. Sarah Palin, what was Republican presidential candidate John returned and what has become of the McCain that she thought Africa was a rest..... country, not a continent, calling the anonymous sources "jerks." 
Mon Oct 27 14:24:25 CST 2008 Mon Oct 27 15:45:26 CST 2008 Mon Oct 27 16:45:39 CST 2008 (Feed 37): (Feed 38): (Feed 31): ATF disrupts skinhead plot to Assassination plot targeting Skinheads held over Obama assassinate Obama (AP): Obama disrupted (AP): AP - Law death plot: WASHINGTON AP - The ATF says it has enforcement agents have broken (Reuters) - Two white broken up a plot to assassinate up a plot by two neo-Nazi supremacist skinheads were Democratic presidential skinheads to assassinate arrested in Tennessee over candidate Barack Obama and Democratic presidential plans to go on a killing spree shoot or decapitate 102 black candidate Barack Obama and and eventually shoot people in a Tennessee murder shoot or decapitate 88 black Democratic presidential spree. people, the Bureau of Alcohol, candidate Barack Obama, court Tobacco Firearms and documents showed on Monday. Explosives said Monday. Fri Nov 07 16:38:59 CST 2008 (Feed 39): Fri Nov 07 16:01:19 CST 2008 (Feed 37): Palin denounces her critics as cowardly Figure 6. Democrats appears in “negative context”. Bad news for Palin denounces her critics as cowardly (AP): AP - Alaska Gov. Sarah Palin called her critics cowards and jerks Friday for Obama, but not about him. (AP): AP - Alaska Gov. Sarah Palin is deriding her anonymously and insisted she striking back at critics of the high-priced wardrobe she wore as the Republican never asked for the expensive wardrobe vice presidential candidate.... purchased for her use on the presidential campaign. didate John McCain was held. As shown in Fig. 7, news postings of the event cover both candidates (gray) and gen- Figure 8. Palin under attack after the elections. erally have low sentiment scores due to the criticism of both candidates against each other. The debate revealed little nov- elty with respect to each candidate’s political plans after the Further trends election. 
Therefore, there were no strong positive statements The Democratic vice presidential candidate Joe Biden, who about the event in the monitored feeds. is represented by blue bars with two interruptions, was not referenced often. As it can be seen in Fig. 4, he appears very rarely compared to the Republican vice presidential candi- date Sarah Palin. A further discovery was that some feeds show daily patterns. For example, one RSS-feed only sent messages in the morn- ing at about 7AM, others broadcast their news during work- ing hours and some feeds even switched the coverage of po- litical events within daily patterns, which is probably due to Wed Oct 15 22:20:20 CST 2008 (Feed 32): Wed Oct 15 22:36:54 CST 2008 two editors each preferring news about one party and taking McCain and Obama battle in contentious debate: HEMPSTEAD, New York (Reuters) (Feed 34): turns in writing news postings. Obama, McCain Get Feisty in - Republican John McCain and Democrat Final Presidential Debate: Barack Obama battled fiercely on Wednesday in their liveliest and most Candidates mix it up on campaign Often, the same news story is broadcasted in many different attacks economics, taxes, "Joe contentious debate, with McCain attacking the plumber." plumber. feeds (e.g., the above mentioned news about Palin’s wardrobe). Obama's tax plan, campaign tone and relationship with a 1960s radical. This is mainly due to the fact that some feeds immediately broadcast the news copied from a particular news agency, whereas other feeds broadcasted this information later. An- Figure 7. TV debate other feed resent the same news posting several times as shown in Fig. 9. Obama wins the election CONCLUSIONS As you can see in annotation D in Fig. 4 the election day is The main contribution of this paper is the combination of a dominated by gray bars. 
This is due to the fact that these sentiment analysis method with a visualization technique re- news postings reported about election results in particular vealing the emotional content of RSS news feeds over time. states, featuring scores of both candidates. In the evening of Through textual filters, we focused our analysis on the 2008 the election day lots of news postings were received about US presidential election featuring positive and negative news the winner Barack Obama. The density of news about the items about the presidential candidates Obama and McCain, Democrats increased rapidly after the result was known and the vice president candidates Biden and Palin and the two dominate the news for several days. major parties. The timeline visualization builds upon three basic elements, first the attribute color denotes the political Palin’s wardrobe party featured in the news article, second, different shapes Although after the election the blue shapes increased im- are used to distinguish between the discussed persons, and mensely, some red negatively rated items stick out (see Fig. third, the emotional score of each RSS news article resulted 6 news items are copied from other news tickers, related RSS postings are often based on the text of the same announce- ment of a newswire and therefore often contain almost iden- tical vocabulary. For the analysis of other content, such as product reviews or the full articles linked in the RSS tick- ers, more complex document similarity measures could be employed. Furthermore, we believe that more sophisticated sentiment analysis methods can be integrated into the pre- sented analysis tool. Acknowledgement This work has been funded by the research center ”Compu- Figure 9. Technical failure or search engine optimization resulting in resending the same news postings over and over again. 
tational Analysis of Linguistic Development” at the University of Konstanz and by the German Research Society (DFG) under the grant GK-1042, Explorative Analysis and Visualization of Large Information Spaces, Konstanz. We thank the anonymous reviewers of the VISSW 2009 for their valuable comments.

in the vertical position of the representative symbol on the time line.

Within the result section, we showed how some emotional discussions manifested in our news visualization: 1) Palin abused power in Alaska, which resulted in many negative news items and her own version sticking out as a highly positive article. 2) The story about assassination plans against Obama dominated the news for several hours with highly negative sentiment scores. 3) The final TV debate consisted of mainly gray elements since reports featured both candidates. In general, the accusations of both candidates against each other resulted in more negative than positive sentiment scores. 4) Obama wins the elections, which is documented by the vast dominance of blue news elements on the eve of election day and the following days. 5) Even after the election a discussion about the expensive wardrobe of Palin fills negative headlines.

The tool’s interaction concept shows the corresponding RSS news articles when the mouse is moved over a symbol on the timeline. To find redundant or similar news items in the process of analyzing particular events, we furthermore implemented a simple document similarity filter, which, after a particular news item is selected, highlights all related news postings surpassing a certain threshold of similarity.

We believe that the presented analysis tool can be used not only to monitor public emotional discussions, but also to evaluate product reviews, to assess public opinions on a particular subject, or to get hints about the reputation of an enterprise. By offering sentiment analysis functionality for a multitude of large RSS feeds in real-time, users of this technique can take early action, such as reacting before a topic dominates news coverage. This strategic dimension of our application is very valuable for public relations specialists and could be implemented in early warning systems. Furthermore, we expect the tool to be useful for monitoring the evolution of popularity of certain products, persons, or views, ultimately answering the question of why a positive public image turned into a negative one.

Future Work
For computing the similarity between news items we used a simple word matching method. Due to the fact that many