=Paper=
{{Paper
|id=Vol-443/paper-7
|storemode=property
|title=Visual Sentiment Analysis of RSS News Feeds Featuring the US Presidential Election in 2008
|pdfUrl=https://ceur-ws.org/Vol-443/paper7.pdf
|volume=Vol-443
}}
==Visual Sentiment Analysis of RSS News Feeds Featuring the US Presidential Election in 2008==
<pdf width="1500px">https://ceur-ws.org/Vol-443/paper7.pdf</pdf>
<pre>
  Visual Sentiment Analysis of RSS News Feeds Featuring
            the US Presidential Election in 2008
       Franz Wanner, Christian Rohrdantz, Florian Mansmann, Daniela Oelke, Daniel A. Keim
                                  University of Konstanz, Germany
                                firstname.lastname@uni-konstanz.de

ABSTRACT                                                                    making a purchase decision or afterwards to cope with the
The technology behind RSS feeds offers great possibilities                  product’s shortcomings or praise its functionality. Further-
to retrieve more news items than ever. In contrast to these                 more, politicians want to find out their public reputation, the
technical developments, human capabilities to read all these                way the news writes about them, and the reaction of the
news items have not increased likewise. To bridge this gap,                 public to these articles.
this paper presents a visual analytics tool for conducting
semi-automatic sentiment analysis of large news feeds. While                Since public opinion polls are an expensive undertaking, our
the tool automatically retrieves and analyzes RSS feeds with                goal is to offer a semi-automatic approach by mining the
respect to positive and negative opinion words, the more de-                web for particular key words, conducting sentiment analysis
manding news analysis of finding trends, spotting peculiar-                 on the text to assess how positive or negative a particular
ities and putting events into context is left to the human ex-              news postings is, and then to present the information in a
pert. For a solid analysis the news similarity filter enables               visual exploration tool. While our approach is not suitable to
highlighting of similar or redundant news items. A case                     completely replace a thoroughly conducted opinion poll due
study about news related to the US presidential election in                 to its lack of accuracy, it also has some unique advantages,
2008 shows how the visual interface of the tool empowers                    namely low costs and the possibility to continuously monitor
the analyst to draw meaningful conclusions without the ef-                  a particular subject in real-time. Knowing at an early stage
fort of reading all news postings.                                          that consumers have a problem with a sub-component of a
                                                                            product gives the company more time to react appropriately
Author Keywords                                                             and to avoid damage to valuable trade marks.
sentiment analysis, opinion mining, information visualiza-
tion, visual analytics                                                      In this paper, we demonstrate a novel way of using text anal-
                                                                            ysis methods in combination with a visual representation.
                                                                            On the one hand, this system automatically evaluates the
ACM Classification Keywords                                                 emotional content of a news posting. On the other hand, the
H.5.2 Information Interfaces and Presentation: Miscellaneous                visual interface empowers the human expert to draw
                                                                            meaningful conclusions, to selectively read a few news postings
INTRODUCTION                                                                with strong emotional content, to discover trends, and to gain
The web is the largest information source in the world. One                 an overview of the development of a chosen topic in the
major aspect of the web is to bring news from all over the                  media.
world via RSS feeds instantaneously to your screen. Apart
from passive usage of the web as a media, web 2.0 technol-                  To exemplify our tool we have a closer look at the news
ogy helps more and more people to actively contribute to this               coverage in the web of the 2008 US presidential election.
valuable information source by creating content in an easy                  Out of 50 chosen political RSS newstickers, we retrieved
way. There are many possibilities to take an active part in                 all RSS articles containing at least one of the following key
the web: blogs, reviews and other ways to state comments.                   words: “Obama”, “McCain”, “Biden” and “Palin” as well as
                                                                            “Democrat” and “Republican”. Thereupon, the articles are
Analyzing news stories and user generated content is of huge                automatically evaluated with respect to the contained posi-
importance for many people and organizations. Economic                      tive and negative opinion words, resulting in a normalized
analysts, for example, would like to find consumer and pub-                 sentiment score for each article.
lic opinions on their products and services. Likewise, po-
tential consumers seek experiences of existing users before                 For presentation purposes, these articles are then visualized
                                                                            on a daily timeline using symbols to encode the contained
                                                                            key words. The vertical position of each symbol is defined
                                                                            by the article’s sentiment score, which makes strong emo-
                                                                            tional news more visible. Furthermore, we demonstrate an
                                                                            interactive feature to show relations between the news items
                                                                            to track the development of a specific topic.
Workshop on Visual Interfaces to the Social and the Semantic Web
(VISSW2009), IUI2009, Feb 8 2009, Sanibel Island, Florida, USA. Copy-       The rest of this paper is structured as follows: In section
right is held by the author/owner(s).

Related Work text and sentiment analysis methods and vi-                Two further approaches being related to our work are [2] and
sual interfaces for them are discussed. The next section Vi-            [11]. Both of them analyze blogs and / or newspaper articles
sual Sentiment Analysis then presents our processing, visu-             with respect to their political orientation. However, none of
alization, and interaction approaches for analyzing the news            the approaches explores the development over time as we
coverage of the 2008 US presidential election. Afterwards,              do. Instead they both focus on analyzing the link structure
section Results shows how some interesting topics about the             between the different blogs or the citation patterns
candidates and their parties manifest in our visualization. By          for newspaper articles. In addition, [11] takes into account
summarizing our contributions we draw our conclusions in                how emotionally charged a post is.
the last section.
                                                                        Sentiment Analysis
                                                                        Within the abundant literature that exists in the context of
RELATED WORK                                                            sentiment analysis and opinion mining, some major tasks
                                                                        can be identified:
Text Analysis
The visualization and visual analysis of textual data is in-            • Classification of the statements of a document (or a sen-
creasingly attracting interest in different application domains.          tence) as subjective or objective. (e.g. [29, 14])
Many of the early approaches in that area dealt with the vi-
sualization of retrieval results (see e.g., VIBE [22] or In-            • Classification of a document (or a sentence) as expressing
foCrystal [27]). Furthermore, a variety of techniques con-                a negative or positive sentiment (or opinion). (e.g. [25,
centrate on the visualization of large document collections,              5])
most of which are based on dimensionality-reduction meth-
ods (see e.g. WebSOM [23], Galaxies and ThemeScape of                   • Feature-based opinion mining made up by two successive
IN-SPIRE™ [30], or [9]). In contrast to this, text feature                steps: First, the features (or attributes), that have been
visualization techniques visualize single documents in de-                commented on, are identified. Secondly, the respective
tail and show the distribution of specific text features across           opinion that has been expressed on them is detected. (e.g.
the text. Prominent examples among these are e.g. TileBars                [17, 18, 26, 21, 20])
[16], Seesoft [3], the FeatureLens [6], and Literature Finger-
printing [19]. But also [1] and the Compus system of Fekete             Note that our approach is not contributing to the area of au-
and Dufournaud [7] are worth being mentioned: As opposed                tomatic sentiment analysis but makes use of some of its stan-
to the other techniques they offer the possibility to visualize         dard techniques. However, we contribute to the development
several text features at once.                                          of visualizations for sentiment analysis. Related work in
                                                                        this respect includes [10, 24, 13]. The visualization that
Relatively few approaches tackle the problem of visualizing             most closely resembles our work can be
temporal variations across a set of documents as we do in               found in [24]. The authors suggest using bars to visualize
this paper. One example for such an approach is the well-               how many positive respectively negative statements – that
known ThemeRiver visualization [15] that reveals the devel-             comment on one of the analyzed attributes of a product – ex-
opment of topics over time in a river-like graphic. Accord-             ist within the document corpus. Our work is similar in that
ing to the metaphor each topic is represented as one colored            we also use the vertical deflection of bars to encode the opin-
“current” in the “river” that flows in the direction of the time-       ion that is expressed. In contrast to [24] however, in our case
line from left to right. To allow for several different themes          one bar represents one document instead of the summary of
to be displayed at once the currents are stacked on top of              all sentences talking about a specific attribute of a product.
each other. The thickness of a current at a specific point              Moreover, in our visualization the development over time is
in time represents the strength of the topic in the associated          central, something that is completely omitted in all of the
documents. TimeMines [28] and Narratives [8] are exam-                  above mentioned approaches for sentiment analysis / opin-
ples for visualizations that are based on standard line charts.         ion mining. In [10] customer reviews are visualized, too,
TimeMines automatically determines keywords and judges                  but a Treemap representation is used to display the result of
those keywords with respect to their temporal significance              the analysis. Finally, [13] presents an adaptation of the Rose
in the context of the corpus. Furthermore, keywords that                Plot visualizations to illustrate the affective content of a doc-
show to have a similar development over time are grouped                ument. In addition to positive and negative sentiments, the
to form a topic. Narratives presents the development of a               documents are also analyzed with respect to the categories
specific topic over time and searches for correlated terms.             virtue, vice, pleasure, pain, power cooperative, and power
                                                                        conflict.
A similar concept is reported in [12]. The system BlogPulse
(that can be found at www.blogpulse.com) monitors blogs                 VISUAL SENTIMENT ANALYSIS
and displays timelines that show how many blogs talk about              Data Processing
a specific topic at a specific point in time. In addition, hot          The data we used was gathered from 50 different RSS news
topics are detected automatically. All of the mentioned time-           feeds, that mainly dealt with the 2008 US presidential elec-
oriented approaches have a common limitation: They merely               tions. The RSS feeds were retrieved every 30 minutes during
display the development of the significance of keywords or              a time interval of one month (10/09/2008 - 11/10/2008). For
topics over time. Our approach goes beyond that by means                every news item in each feed we saved date, title and descrip-
of additionally revealing the sentiment of the documents.               tion, as well as the id of the feed. Next, noise was eliminated


out of the title and description. With noise we refer to strings       (negative). Horizontal lines mark the position that a news
that do not carry any content, such as URLs or strings con-            item would have that is neither positive nor negative.
sisting of special characters. The concatenation of title and
description was then considered to be the content of the news
item. Finally, we filtered out those documents that contained          Coloring
none of the following signal words: “Obama”, “McCain”,                 Everything that is solely related to the conservatives (Repub-
“Biden”, “Palin”, “Democrat” and “Republican”. More than               lican party) is colored in red and everything purely related to
23,000 news items contained at least one of the six strings.           the liberals (Democratic party) in blue. Gray news objects
                                                                       relate both to the liberals and the conservatives, which basi-
Pairwise similarities between news items were calculated by            cally means that both camps are mentioned within the news’
applying a similarity measure, which counts the number of              content.
non-stopwords that two items have in common (normalized
by the length of the larger item). Although this is a relatively
simple measure it works quite well for the short descriptive           Shape
texts in the RSS news feeds.                                           The use of different shapes for the object allows us to make
                                                                       a distinction between news items in which the first candi-
Another aspect of interest is the sentiment context of a news          date of a party was mentioned, the second candidate but not
item, which we capture by enriching each item with a sentiment         the first candidate or none of them but only the name of the
score. For this purpose we make use of a freely available              party. Figure 1 shows the visual appearance of the different
list of words that evoke positive or negative associations [4].        shapes. Please note that we keep the horizontal interruptions
We count the number of positive and negative words and                 that are utilized to mark news items that talk about the sec-
evaluate the whole news item as rather positive if it contains         ond candidate always at the same vertical position of each
in total more positive than negative words. Likewise, the              line (regardless of the vertical shift of the object that encodes
item is evaluated as rather negative if it contains more neg-          the emotional score). This leads to a clear visual pattern of
ative than positive words. The absolute relation of positive           continuous white horizontal lines, if several neighboring ob-
against negative words normalized by the item’s length, pro-           jects refer to the second candidates only.
vides our sentiment score. One important point to mention
here is that the appearance of a candidate, e.g., in a negative
context, does not necessarily mean, that the item contains
negative publicity for the candidate, but simply that he ap-
pears in a negatively connoted context. This becomes clear
when we consider the example of news telling that racists
planned to assassinate Obama (see section “Results”). This
was bad news for Obama, not about Obama, with a visibly
negative connotation.
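
The data processing steps described above (signal-word filter, pairwise
similarity, sentiment score) can be sketched in Python as follows. This is a
minimal sketch under stated assumptions, not the authors' implementation: the
tokenizer, the tiny word sets, and all function names are illustrative
stand-ins (the paper uses the opinion word list of [4]). The score is read
here as (positive - negative) / number of words, which is only one plausible
interpretation of the normalization described in the text, and the "length of
the larger item" is taken to be its number of distinct non-stopwords.

```python
import re

# Illustrative stand-ins; the paper uses a freely available opinion word list [4].
POSITIVE = {"good", "great", "win", "lawful", "proper"}
NEGATIVE = {"bad", "abuse", "abused", "scandal", "plot"}
STOPWORDS = {"a", "an", "the", "of", "in", "on", "and", "to", "is", "her", "his"}
SIGNAL_WORDS = ("Obama", "McCain", "Biden", "Palin", "Democrat", "Republican")


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def is_relevant(text):
    """Keep an item only if it contains at least one signal string."""
    return any(word in text for word in SIGNAL_WORDS)


def sentiment_score(text):
    """(positive - negative) opinion words, normalized by the word count.

    Assumption: this is one reading of the paper's description.
    """
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)


def similarity(text_a, text_b):
    """Shared distinct non-stopwords, normalized by the larger item's size."""
    words_a = set(tokenize(text_a)) - STOPWORDS
    words_b = set(tokenize(text_b)) - STOPWORDS
    larger = max(len(words_a), len(words_b))
    return len(words_a & words_b) / larger if larger else 0.0
```

With such a measure, a low similarity threshold would group items on the same
topic, while a high threshold would only match near-duplicate postings.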

Data Visualization
The visualization on the one hand aims to give a meaningful
representation of the data and on the other hand is intended
to be an appropriate starting point for the interactive explo-
ration and discovery of interesting patterns. Figure 4 shows a         Figure 1. Symbols used to represent news items according to the ap-
screenshot of the visualization. Each line represents one day          pearance of certain keywords.
and each colored object depicts one news item. The news
item’s emotional score is encoded by a vertical displacement
of the news item. Colors encode whether the text mentions              Opacity
the Democratic party, the Republican party or both. Addi-              We paint our news objects with a relatively low opacity. That
tionally, the shape of the news objects visualizes whether the         means they are partly transparent, which comes with two ad-
first candidate, the second candidate or only the name of the          vantages: First, the problem of overlapping news objects is
party itself was mentioned. The following passages describe            reduced. In most cases every object is visible and can be
each of those aspects in detail.                                       differentiated clearly from its overlapping neighbors. Sec-
                                                                       ondly, if multiple news items are put on top of each other,
Placement                                                              the overall opacity at this position increases, resulting in an
Every news item is represented by an object in a 2D plane.             object that is more opaque and can therefore be distinguished
The position of the object within the plane depends on the             from objects that represent just one news item. The situation
date the news was published. Thereby, the day it was                   that several feeds bring the same news nearly at the same
published accounts for the line it will be placed in (as each line     moment in time is often the case when the news is very
represents one day) and the time of day determines its                 important. That means that the more opaque news objects
horizontal position within the line. The exact vertical position       often represent news that is more important and surely more
depends on the sentiment score of the object. According to             widely spread. Figure 2 visually illustrates the above men-
this value an object is slightly shifted up (positive) or down         tioned design decisions.
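
The visual encoding described on this page (color, shape, placement, opacity)
can be summarized in a short Python sketch. All names, pixel sizes, and the
linear scaling of the sentiment shift are hypothetical choices made for
illustration; the paper does not specify them. The opacity function assumes
standard "over" alpha compositing, under which stacking identical
semi-transparent objects raises the combined alpha value, in line with the
"higher α-value" label of Figure 2.

```python
from datetime import date, datetime

# Term groups taken from the paper's six signal words.
DEMOCRATIC = {"first": "Obama", "second": "Biden", "party": "Democrat"}
REPUBLICAN = {"first": "McCain", "second": "Palin", "party": "Republican"}


def mentions(text, party):
    """True if any term of the given party occurs in the text."""
    return any(term in text for term in party.values())


def color(text):
    """Red: Republican only; blue: Democratic only; gray: both camps."""
    dem, rep = mentions(text, DEMOCRATIC), mentions(text, REPUBLICAN)
    if dem and rep:
        return "gray"
    if rep:
        return "red"
    if dem:
        return "blue"
    return None  # no signal word: such items are filtered out beforehand


def shape(text, party):
    """Symbol variant for one party: first candidate takes precedence."""
    if party["first"] in text:
        return "first candidate"
    if party["second"] in text:
        return "second candidate"
    if party["party"] in text:
        return "party name only"
    return None


def position(published, score, first_day,
             row_height=40.0, row_width=960.0, max_shift=15.0):
    """Map publication time and sentiment score to (x, y) screen coordinates.

    The row is chosen by the day, x by the time of day, and the score shifts
    the object up (positive) or down (negative) around the row's baseline.
    """
    row = (published.date() - first_day).days
    seconds = published.hour * 3600 + published.minute * 60 + published.second
    x = row_width * seconds / 86400
    clamped = max(-1.0, min(1.0, score))
    # Screen y grows downward, so a positive score reduces y.
    y = row * row_height - max_shift * clamped
    return x, y


def stacked_opacity(alpha, count):
    """Combined alpha of `count` identical layers under 'over' compositing."""
    return 1.0 - (1.0 - alpha) ** count
```

For example, an item published at noon on the second monitored day with a
neutral score would land in the middle of the second row, exactly on its
baseline.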


                                                                              Republican. To exemplify our Visual Analytics technique,
[Figure 2 (image); its annotation labels are:                                 we picked five interesting discussions in the monitored RSS
  higher α-value: same news item from different feeds                         feeds.
  “Biden in neutral context”
  highlighted news item                                                       Palin abused power in Alaska
  two horizontal lines represent one day                                      On Saturday, 10th October, many negative news postings
  sentiment shift (+ up, - down)                                              occurred about Sarah Palin. Almost all articles deal with the
  “Democrats in negative context”                                             topic whether Sarah Palin had abused her power in Alaska
  about one hour of the day                                                   or not. As demonstrated in Fig. 5 there is a high density of
  “McCain in positive context”]                                               red shapes with two white bars symbolizing news postings
                                                                              about Palin. Their positions below the baseline denote that
                                                                              mainly negative emotion words were used in these postings.
                                                                              Only one exceptionally positive red news item sticks out in
                                                                              the visualization. A closer look at this posting reveals that it
                                                                              is a response from the McCain-Palin presidential campaign:
                                                                              “Sarah Palin acted ‘within proper and lawful authority’ in
                                                                              removing the state’s public safety commissioner”.
               Figure 2. Semantics of the visualization

Interactive Visual Analytics
The visualization is designed for an interactive data
exploration. There are several possibilities to interact with the
tool:

• Zooming: Continuous zooming allows the user to analyze
  certain parts at a greater level of detail.

• Details on demand: When the mouse is dragged over a
  news object, a tooltip appears containing date, time, feed
  id, and content of the item.

• Similarity search: With a mouse click on a news object,
  the search for similar news items is started. The news item
  itself and every other news object that is related to it are
  highlighted (please refer to section “Data Processing” for
  our definition of similarity). Figure 3 shows an example.

                                                                              [Figure 5 (image); news items shown in the figure:

                                                                              Fri Oct 10 19:41:49 CST 2008 (Feed 19): Palin abused power
                                                                              Alaska 'Troopergate' probe finds: AFP - Republican
                                                                              vice-presidential nominee Sarah Palin abused her position as
                                                                              Alaska Governor by pressuring officials to dismiss a state
                                                                              trooper, an investigator's report said.

                                                                              Fri Oct 10 22:15:22 CST 2008 (Feed 39): Palin says report
                                                                              says she acted lawfully (Reuters): Reuters - Alaska Gov.
                                                                              Sarah Palin acted "within proper and lawful authority" in
                                                                              removing the state's public safety commissioner, the
                                                                              McCain-Palin Republican presidential ticket said on Friday
                                                                              in response to a state report.

                                                                              Fri Oct 10 21:50:40 CST 2008 (Feed 32): Alaska ethics probe
                                                                              says Palin abused her power: CHILLICOTHE, Ohio (Reuters) -
                                                                              An Alaska ethics inquiry found on Friday that U.S.
                                                                              Republican vice presidential candidate Sarah Palin abused
                                                                              her power as the state's governor, casting a cloud over
                                                                              John McCain's controversial choice of running mate for the
                                                                              November 4 election.

                                                                              Fri Oct 10 19:24:20 CST 2008 (Feed 49): Alaska panel finds
                                                                              Palin abused power in firing: ANCHORAGE, Alaska (AP) -- A
                                                                              legislative committee investigating Alaska Gov. Sarah Palin
                                                                              has found she unlawfully abused her authority in firing the
                                                                              state's public safety commissioner. The investigative
                                                                              report concludes that a family grudge wasn't the sole
                                                                              reason for firing Public Safety Commissioner Walter Monegan
                                                                              but says it likely was a contributing factor....

                                                                              Fri Oct 10 21:06:44 CST 2008 (Feed 18): Probe accuses Palin
                                                                              of abuse of power (AFP): AFP - Investigators found vice
                                                                              presidential nominee Sarah Palin abused her powers as
                                                                              Alaska governor, dealing another blow to Republican John
                                                                              McCain's struggling White House bid.]

                                                                              Figure 5. Media coverage dealing with the topic of Sarah Palin’s abuse
                                                                              of power as a governor of Alaska.
• Filtering: The user can select the different candidates /
  parties he is interested in. Another possibility to reduce                  Bad news for the Democrats
  the number of items that are displayed is to select one spe-                Approximately one week before the US presidential elec-
  cific RSS feed. Both filtering mechanisms can be used                       tion we detected a high appearance of news which included
  to analyze in detail the behaviour of one specific news                     “Obama” (see Fig. 6). The sentiment scores of these post-
  provider respectively the development of news for a sub-                    ings were mainly negative and dealt with a plot to assassi-
  set of candidates and/or parties.                                           nate Barack Obama and 102 blacks. Note that the news are
                                                                              bad for him but not about him, meaning that a negative event
                                                                              is related to him in the news postings although the negative
                                                                              opinion words do not refer to him as a person.

                                                                              The used emotion words were so strong, that even in the
                                                                              overview it is possible to recognize the emergence of the
                                                                              negative news of that event on 28th of October (see Fig. 4).
Figure 3. After selecting one news item, similar items are highlighted        Note that although each RSS posting only consists of a few
in yellow enabling the user to track specific topics (low threshold) or       sentences, the few contained positive or negative opinion
redundant postings (high threshold).                                          words are sufficient to provide clear results. Further head-
                                                                              lines of that day discuss the corruption scandal of a Demo-
RESULTS                                                                       cratic senator and result in negative headlines for the Democrats.
First of all, we present an overview of all 50 monitored RSS
feeds over a time period of 31 days in Fig. 4. A prede-                       TV debate Obama vs. McCain
fined filter displays all news postings containing at least one               In the middle of October the final TV debate between the
of the terms Obama, McCain, Biden, Palin, Democrat, and                       Democrat candidate Barack Obama and the Republican can-


[Figure 4 (image): overview of the visualization with marked regions A, B, C, D, and E]

Figure 4. 31 days of the 2008 US presidential election showing a scandal of power abuse by Palin (A), the TV debate McCain vs. Obama (B),
assassination plans against Obama (C), the election day (D), and a debate about Palin’s election wardrobe (E).


                                                                   5
                                                                                                         8). These outliers deal with some critical notes about the ex-
                                                                                                         pensive wardrobe, which was bought by Sarah Palin for her
                                                                                                         campaign, and her inappropriate use of language describing
                                                                                                         her critics.
                                                                                                          Fri Nov 07 15:40:35 CST 2008 (Feed 23):         Fri Nov 07 17:56:01 CST 2008 (Feed 31):
                                                                                                          GOP tries to sort out Palin's donor-funded      Palin fires back at leaks questioning her
                                                                                                          duds: WASHINGTON (AP) -- Republican             smarts: WASHINGTON (Reuters) - Alaska
                                                                                                          Party lawyers are still trying to determine     Gov. Sarah Palin fired back on Friday
                                                                                                          exactly what clothing was purchased for         against post-election claims by aides to
                                                                                                          Alaska Gov. Sarah Palin, what was               Republican presidential candidate John
                                                                                                          returned and what has become of the             McCain that she thought Africa was a
                                                                                                          rest.....                                       country, not a continent, calling the
                                                                                                                                                          anonymous sources "jerks."
 Mon Oct 27 14:24:25 CST 2008      Mon Oct 27 15:45:26 CST 2008     Mon Oct 27 16:45:39 CST 2008
 (Feed 37):                        (Feed 38):                       (Feed 31):
 ATF disrupts skinhead plot to     Assassination plot targeting     Skinheads held over Obama
 assassinate Obama (AP):           Obama disrupted (AP): AP - Law   death plot: WASHINGTON
 AP - The ATF says it has          enforcement agents have broken   (Reuters) - Two white
 broken up a plot to assassinate   up a plot by two neo-Nazi        supremacist skinheads were
 Democratic presidential           skinheads to assassinate         arrested in Tennessee over
 candidate Barack Obama and        Democratic presidential          plans to go on a killing spree
 shoot or decapitate 102 black     candidate Barack Obama and       and eventually shoot
 people in a Tennessee murder      shoot or decapitate 88 black     Democratic presidential
 spree.                            people, the Bureau of Alcohol,   candidate Barack Obama, court
                                   Tobacco Firearms and             documents showed on Monday.
                                   Explosives said Monday.


                                                                                                                                                         Fri Nov 07 16:38:59 CST 2008 (Feed 39):
                                                                                                          Fri Nov 07 16:01:19 CST 2008 (Feed 37):        Palin denounces her critics as cowardly
Figure 6. Democrats appears in “negative context”. Bad news for                                           Palin denounces her critics as cowardly        (AP): AP - Alaska Gov. Sarah Palin called
                                                                                                                                                         her critics cowards and jerks Friday for
Obama, but not about him.                                                                                 (AP): AP - Alaska Gov. Sarah Palin is
                                                                                                                                                         deriding her anonymously and insisted she
                                                                                                          striking back at critics of the high-priced
                                                                                                          wardrobe she wore as the Republican            never asked for the expensive wardrobe
                                                                                                          vice presidential candidate....                purchased for her use on the presidential
                                                                                                                                                         campaign.

didate John McCain was held. As shown in Fig. 7, news
postings of the event cover both candidates (gray) and gen-                                                               Figure 8. Palin under attack after the elections.
erally have low sentiment scores due to the criticism of both
candidates against each other. The debate revealed little nov-
elty with respect to each candidate’s political plans after the                                          Further trends
election. Therefore, there were no strong positive statements                                            The Democratic vice presidential candidate Joe Biden, who
about the event in the monitored feeds.                                                                  is represented by blue bars with two interruptions, was not
                                                                                                         referenced often. As it can be seen in Fig. 4, he appears very
                                                                                                         rarely compared to the Republican vice presidential candi-
                                                                                                         date Sarah Palin.

                                                                                                         A further discovery was that some feeds show daily patterns.
                                                                                                         For example, one RSS-feed only sent messages in the morn-
                                                                                                         ing at about 7AM, others broadcast their news during work-
                                                                                                         ing hours and some feeds even switched the coverage of po-
                                                                                                         litical events within daily patterns, which is probably due to
Wed Oct 15 22:20:20 CST 2008 (Feed 32):                   Wed Oct 15 22:36:54 CST 2008                   two editors each preferring news about one party and taking
McCain and Obama battle in contentious
debate: HEMPSTEAD, New York (Reuters)
                                                          (Feed 34):                                     turns in writing news postings.
                                                          Obama, McCain Get Feisty in
- Republican John McCain and Democrat                     Final Presidential Debate:
Barack Obama battled fiercely on
Wednesday in their liveliest and most
                                                          Candidates mix it up on campaign               Often, the same news story is broadcasted in many different
                                                          attacks economics, taxes, "Joe
contentious debate, with McCain attacking                 the plumber."
                                                              plumber.                                   feeds (e.g., the above mentioned news about Palin’s wardrobe).
Obama's tax plan, campaign tone and
relationship with a 1960s radical.
                                                                                                         This is mainly due to the fact that some feeds immediately
                                                                                                         broadcast the news copied from a particular news agency,
                                                                                                         whereas other feeds broadcasted this information later. An-
                                   Figure 7. TV debate                                                   other feed resent the same news posting several times as
                                                                                                         shown in Fig. 9.
Obama wins the election                                                                                  CONCLUSIONS
As you can see in annotation D in Fig. 4 the election day is                                             The main contribution of this paper is the combination of a
dominated by gray bars. This is due to the fact that these                                               sentiment analysis method with a visualization technique re-
news postings reported about election results in particular                                              vealing the emotional content of RSS news feeds over time.
states, featuring scores of both candidates. In the evening of                                           Through textual filters, we focused our analysis on the 2008
the election day lots of news postings were received about                                               US presidential election featuring positive and negative news
the winner Barack Obama. The density of news about the                                                   items about the presidential candidates Obama and McCain,
Democrats increased rapidly after the result was known and                                               the vice president candidates Biden and Palin and the two
dominate the news for several days.                                                                      major parties. The timeline visualization builds upon three
                                                                                                         basic elements, first the attribute color denotes the political
Palin’s wardrobe                                                                                         party featured in the news article, second, different shapes
Although after the election the blue shapes increased im-                                                are used to distinguish between the discussed persons, and
mensely, some red negatively rated items stick out (see Fig.                                             third, the emotional score of each RSS news article resulted


                                                                                                     6
in the vertical position of the representative symbol on the timeline.

Figure 9. Technical failure or search engine optimization resulting in
resending the same news postings over and over again.

Within the result section, we showed how some emotional discussions
manifested in our news visualization: 1) The scandal of power abuse by
Palin in Alaska resulted in many negative news items, with her own
version sticking out as a highly positive article. 2) The story about
assassination plans against Obama dominated the news for several hours
with highly negative sentiment scores. 3) The final TV debate consisted
mainly of gray elements since reports featured both candidates. In
general, the accusations of both candidates against each other resulted
in more negative than positive sentiment scores. 4) Obama won the
election, which is documented by the vast dominance of blue news elements
in the evening of the election day and on the following days. 5) Even
after the election, a discussion about the expensive wardrobe of Palin
filled negative headlines.

The tool’s interaction concept shows the corresponding RSS news articles
when the mouse is moved over a symbol on the timeline. To find redundant
or similar news items in the process of analyzing particular events, we
furthermore implemented a simple document similarity filter, which after
selecting a particular news item highlights all related news postings
surpassing a certain threshold of similarity.

We believe that the presented analysis tool can be used not only to
monitor public emotional discussions, but also to evaluate product
reviews and public opinions on a particular subject, or to get hints
about the reputation of an enterprise. Because the tool offers sentiment
analysis of a multitude of large RSS feeds in real time, its users can
take early action, such as reacting before a topic dominates news
coverage. This strategic dimension of our application is very valuable
for public relations specialists and could be implemented in early
warning systems. Furthermore, we expect the tool to be useful for
monitoring the evolution of popularity of certain products, persons, or
views, ultimately answering the question of why a positive public image
turned into a negative one.

Future Work
For computing the similarity between news items we used a simple word
matching method. Because many news items are copied from other news
tickers, related RSS postings are often based on the text of the same
newswire announcement and therefore often contain almost identical
vocabulary. For the analysis of other content, such as product reviews or
the full articles linked in the RSS tickers, more complex document
similarity measures could be employed. Furthermore, we believe that more
sophisticated sentiment analysis methods can be integrated into the
presented analysis tool.

Acknowledgement
This work has been funded by the research center “Computational Analysis
of Linguistic Development” at the University of Konstanz and by the
German Research Society (DFG) under the grant GK-1042, Explorative
Analysis and Visualization of Large Information Spaces, Konstanz.
We thank the anonymous reviewers of the VISSW 2009 for their valuable
comments.

REFERENCES
 1. A. Abbasi and H. Chen. Categorization and analysis of text in
    computer mediated communication archives using visualization. In
    JCDL ’07: Proceedings of the 2007 conference on Digital libraries,
    pages 11–18, New York, NY, USA, 2007. ACM.
 2. L. A. Adamic and N. Glance. The political blogosphere and the 2004
    U.S. election: divided they blog. In LinkKDD ’05: Proceedings of the
    3rd international workshop on Link discovery, pages 36–43. ACM, 2005.
 3. T. Ball and S. G. Eick. Software Visualization in the Large. IEEE
    Computer, 29(4):33–43, 1996.
 4. V. Buvac. Internet General Inquirer, 2008.
    http://www.webuse.umd.edu:9090/ as retrieved on Nov. 14, 2008.
 5. K. Dave, S. Lawrence, and D. M. Pennock. Mining the peanut gallery:
    opinion extraction and semantic classification of product reviews. In
    WWW ’03: Proceedings of the 12th international conference on World
    Wide Web, pages 519–528. ACM, 2003.
 6. A. Don, E. Zheleva, M. Gregory, S. Tarkan, L. Auvil, T. Clement,
    B. Shneiderman, and C. Plaisant. Discovering interesting usage
    patterns in text collections: integrating text mining with
    visualization. In CIKM ’07: Proceedings of the sixteenth ACM
    conference on Conference on information and knowledge management,
    pages 213–222. ACM, 2007.
 7. J.-D. Fekete and N. Dufournaud. Compus: visualization and analysis of
    structured documents for understanding social life in the 16th
    century. In DL ’00: Proceedings of the fifth ACM conference on
    Digital libraries, pages 47–55, New York, NY, USA, 2000. ACM.
 8. D. Fisher, A. Hoff, G. Robertson, and M. Hurst. Narratives: A
    Visualization to Track Narrative Events as they Develop. In IEEE
    Symposium on Visual Analytics and Technology (VAST 2007), pages
    115–122, 2008.
 9. B. Fortuna, D. Mladenic, and M. Grobelnik. Visualization of Text
    Document Corpus. Informatica Journal, 29(4):497–502, 2005.
10. M. Gamon, A. Aue, S. Corston-Oliver, and E. Ringger. Pulse: Mining
    Customer Opinions from Free Text. In Advances in Intelligent Data
    Analysis VI, pages 121–132. Springer, 2005.
11. M. Gamon, S. Basu, D. Belenko, D. Fisher, M. Hurst, and A. C. König.
    BLEWS: Using Blogs to Provide Context for News Articles. In ICWSM,
    2008.
12. N. Glance, M. Hurst, and T. Tomokiyo. BlogPulse: Automated Trend
    Discovery for Weblogs. In WWW 2004 Workshop on the Weblogging
    Ecosystem. ACM, May 2004.
13. M. L. Gregory, N. Chinchor, P. Whitney, R. Carter, E. Hetzler, and
    A. Turner. User-directed Sentiment Analysis: Visualizing the
    Affective Content of Documents. In Workshop on Sentiment and
    Subjectivity in Text, pages 23–30, 2006.
14. V. Hatzivassiloglou and J. Wiebe. Effects of adjective orientation
    and gradability on sentence subjectivity, 2000.
15. S. Havre, E. Hetzler, P. Whitney, and L. Nowell. ThemeRiver:
    Visualizing Thematic Changes in Large Document Collections. IEEE
    Transactions on Visualization and Computer Graphics, 8(1):9–20, 2002.
16. M. A. Hearst. TileBars: Visualization of Term Distribution
    Information in Full Text Information Access. In Proceedings of the
    Conference on Human Factors in Computing Systems, CHI’95, 1995.
17. M. Hu and B. Liu. Mining and summarizing customer reviews. In
    KDD ’04: Proceedings of the tenth ACM SIGKDD international conference
    on Knowledge discovery and data mining, pages 168–177. ACM, 2004.
18. M. Hu and B. Liu. Mining Opinion Features in Customer Reviews. In
    AAAI, pages 755–760, 2004.
19. D. A. Keim and D. Oelke. Literature Fingerprinting: A New Method for
    Visual Literary Analysis. In IEEE Symposium on Visual Analytics and
    Technology (VAST 2007), pages 115–122, 2007.
20. S.-M. Kim and E. Hovy. Extracting Opinions, Opinion Holders, and
    Topics Expressed in Online News Media Text. In Proceedings of the ACL
    Workshop on Sentiment and Subjectivity in Text, pages 1–8, 2006.
21. N. Kobayashi, K. Inui, Y. Matsumoto, K. Tateishi, and T. Fukushima.
    Collecting Evaluative Expressions for Opinion Extraction. In IJCNLP,
    pages 596–605, 2004.
22. R. R. Korfhage. To see, or not to see – is That the query? In
    SIGIR ’91: Proceedings of the 14th annual international ACM SIGIR
    conference on Research and development in information retrieval,
    pages 134–141. ACM Press, 1991.
23. K. Lagus, T. Honkela, S. Kaski, and T. Kohonen. Self-organizing maps
    of document collections: A new approach to interactive exploration.
    In E. Simoudis, J. Han, and U. Fayyad, editors, Proceedings of the
    Second International Conference on Knowledge Discovery and Data
    Mining, pages 238–243. AAAI Press, 1996.
24. B. Liu, M. Hu, and J. Cheng. Opinion observer: analyzing and
    comparing opinions on the Web. In WWW ’05: Proceedings of the 14th
    international conference on World Wide Web, pages 342–351. ACM, 2005.
25. B. Pang, L. Lee, and S. Vaithyanathan. Thumbs up?: sentiment
    classification using machine learning techniques. In EMNLP ’02:
    Proceedings of the ACL-02 conference on Empirical methods in natural
    language processing, pages 79–86. Association for Computational
    Linguistics, 2002.
26. A.-M. Popescu and O. Etzioni. Extracting product features and
    opinions from reviews. In HLT ’05: Proceedings of the conference on
    Human Language Technology and Empirical Methods in Natural Language
    Processing, pages 339–346. Association for Computational Linguistics,
    2005.
27. A. Spoerri. InfoCrystal: a visual tool for information retrieval &
    management. In CIKM ’93: Proceedings of the second international
    conference on Information and knowledge management, pages 11–20. ACM,
    1993.
28. R. Swan and D. Jensen. TimeMines: Constructing Timelines with
    Statistical Models of Word Usage, 2000.
29. B. Wang, B. Spencer, C. X. Ling, and H. Zhang. Semi-supervised
    Self-training for Sentence Subjectivity Classification, pages
    344–355. Lecture Notes in Computer Science. Springer Berlin /
    Heidelberg, 2008.
30. J. A. Wise, J. J. Thomas, K. Pennock, D. Lantrip, M. Pottier,
    A. Schur, and V. Crow. Visualizing the non-visual: spatial analysis
    and interaction with information from text documents. In
    INFOVIS ’95: Proceedings of the 1995 IEEE Symposium on Information
    Visualization, pages 51–58, 1995.


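The document similarity filter described in the conclusions (after
selecting one news item, all postings whose similarity surpasses a
threshold are highlighted) could be sketched as follows. This is only an
illustration: the paper speaks of a "simple word matching method" without
giving the exact measure, so the Jaccard overlap of word sets used here,
the function names, and the threshold values are assumptions.

```python
import re


def tokenize(text):
    """Lowercase a posting and return its set of distinct word tokens."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def word_overlap(a, b):
    """Jaccard overlap of the word sets of two postings (0.0 to 1.0).

    Assumption: stands in for the paper's unspecified word matching method.
    """
    words_a, words_b = tokenize(a), tokenize(b)
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def similar_postings(postings, selected_index, threshold):
    """Return indices of postings whose overlap with the selected posting
    surpasses the threshold; the selected posting itself is skipped."""
    selected = postings[selected_index]
    return [
        i
        for i, posting in enumerate(postings)
        if i != selected_index and word_overlap(selected, posting) > threshold
    ]


# Headlines taken from Figs. 7 and 8; the thresholds are made-up values.
postings = [
    "Palin denounces her critics as cowardly",
    "Palin denounces her critics as cowardly (AP)",
    "Palin fires back at leaks questioning her smarts",
    "Obama, McCain Get Feisty in Final Presidential Debate",
]
redundant = similar_postings(postings, 0, threshold=0.8)
same_topic = similar_postings(postings, 0, threshold=0.1)
```

With a high threshold only the near-identical resend is selected, whereas
a low threshold also picks up the other Palin headline, which mirrors the
"redundant postings" versus "specific topics" use described for Fig. 3.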

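The three visual elements named in the conclusions (color for the party,
shape for the person, vertical position for the sentiment score) can be
made concrete with a small mapping from a posting to glyph attributes.
The sketch below is hypothetical throughout: the function and field names
are invented, the sentiment score is simply passed in rather than
computed, and only the search terms that are legible in the text are
used. The colors follow the descriptions in the results (blue for
Democrats, red for Republicans, gray for postings covering both); the
concrete shapes are not stated in this part of the paper, so the sketch
only reports which persons a shape would have to encode.

```python
import re

# Search terms named in the text, grouped by party (grouping is an assumption).
PARTY_TERMS = {
    "democrats": ("obama", "biden", "democrat"),
    "republicans": ("mccain", "palin"),
}
PERSONS = ("obama", "mccain", "biden", "palin")

# Colors as described in the results: blue, red, and gray for both parties.
COLORS = {
    frozenset({"democrats"}): "blue",
    frozenset({"republicans"}): "red",
    frozenset({"democrats", "republicans"}): "gray",
}


def encode(text, timestamp, sentiment_score):
    """Map one posting to glyph attributes, or None if no search term occurs."""
    tokens = re.findall(r"[a-z]+", text.lower())
    parties = frozenset(
        party
        for party, terms in PARTY_TERMS.items()
        if any(token.startswith(term) for token in tokens for term in terms)
    )
    if not parties:
        return None  # the posting would be removed by the textual filter
    return {
        "x": timestamp,            # position on the timeline
        "y": sentiment_score,      # vertical position of the symbol
        "color": COLORS[parties],  # political party featured in the posting
        "persons": [p for p in PERSONS if p in tokens],  # would select the shape
    }


# The score values below are placeholders, not results from the paper.
debate = encode("McCain and Obama battle in contentious debate",
                "2008-10-15 22:20:20", -2.0)
wardrobe = encode("Palin denounces her critics as cowardly",
                  "2008-11-07 16:01:19", -1.0)
```

Keeping the encoding separate from the drawing code means that the same
mapping could feed any plotting backend.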
</pre>