       Searching for Diverse Perspectives in News Articles:
          Using an LSTM Network to Classify Sentiment
                                                     Christopher Harris
                                                Department of Computer Science
                                                University of Northern Colorado
                                                      Greeley, CO 80639
© 2018. Copyright for the individual papers remains with the authors. Copying permitted for private and academic purposes. ESIDA '18, March 11, Tokyo, Japan.

ABSTRACT
When searching for emerging news on named entities, many users wish to find articles containing a variety of perspectives. Advances in tools that use Recurrent Neural Networks (RNNs) have made impressive gains in accuracy on NLP tasks such as sentiment analysis. Here we describe and implement a special type of RNN called a Long Short Term Memory (LSTM) network to detect and classify sentiment in a collection of news articles. Using an interactive query interface created expressly for this purpose, we conduct an empirical study in which we ask users to classify sentiment on named entities in articles, and we compare these classifications with those obtained from our LSTM network on the articles in our collection that mention the named entity. Last, we discuss how this analysis can identify outliers and help detect fake news articles.

Author Keywords
Sentiment analysis; RNN; LSTM; named entities; artificial neural networks; news analysis; fake news.

ACM Classification Keywords
I.5.1 [Pattern Recognition]: Models → Neural nets; I.2.7 [Artificial Intelligence] → Natural Language Processing; H.3.3 [Information Systems] → Information retrieval diversity.

INTRODUCTION
Named entities, which we define as information units such as person, organization, and location names, are extremely popular components of user queries. For example, Yin and Shah found that nearly 30% of searches on the Bing search engine were simply a named entity and 71% of searches contained a named entity as part of the query string [13]. Thus, the proper identification and handling of named entities is essential to provide an excellent search experience.

There has been a growing number of voices claiming bias in reporting from media sources, particularly (but not limited to) coverage of named entities in politics and entertainment. News articles covering the same named entity can be reported from a variety of perspectives, some sympathetic to the subject while others are far less so – a phenomenon widely noted during two 2016 events: the U.K. Brexit vote and the U.S. elections. However, there are ways to evaluate and categorize this variation in reporting. Sentiment analysis, which has been widely applied to classifying movie and product reviews, can also be applied to the sentiment used in reporting news articles, particularly those that focus on a specific named entity. Although early approaches to sentiment analysis suffered from poor accuracy, recent advances – particularly those applying deep learning techniques such as Recurrent Neural Networks (RNNs) – have increased its accuracy and can even distinguish the sentiment between different named entities when an article contains references to more than one entity.

It is important for search systems to work with named entities in both informal text (i.e., blog posts) and formal text (i.e., news articles). To this end, it is also important to distinguish these different types of sources to the user. When information on a named entity appears from a verified news source, it carries a different weight (in terms of authenticity) than a blog posting from a non-expert; the user should be made aware of this provenance in the search results and be able to filter the search results based on the verifiability of the news.

With the rise of social media as users' primary news source [9], misleading news articles called fake news have clouded many users' ability to determine whether a news article has merit or whether it is a deliberate attempt to misinform and spread a hoax. Recently, more attention from the NLP community has been placed on identifying fake news, which we define as propaganda disguised as real news that is created to mislead readers and damage a person's, an agency's, or an entity's reputation.

A study conducted following the 2016 election found that 64% of adults indicated that fake news articles caused a great deal of confusion and 23% said they had shared fabricated articles themselves – sometimes by mistake and sometimes intentionally [3]. We believe that sentiment analysis, when done properly, can help separate articles from genuine news sources from fake news. We explore this concept briefly in this paper.

BACKGROUND AND MOTIVATION
Performing queries and obtaining news articles are tasks that rank only behind sending email as the most common internet activities, with 91% and 76% of users reportedly engaging in these activities, respectively [10]. Overall, the internet has grown in importance as a source of information and news on named entities. As of August 2017, 43% of Americans report often obtaining their news online, quickly approaching the 50% who often obtain news by television. This 7% gap has narrowed considerably from the 19% gap between the two sources found only 18 months earlier [5].
The Role of Social Media
Social media platforms such as Facebook and Twitter have transformed how news is created and disseminated. News content on any named entity can be spread among users without substantial third-party filtering, fact-checking, or editorial judgment. It is now possible for a non-expert user with no prior reputation on a news topic to reach as many readers as verified sources such as the Washington Post, CNN, or the BBC [1].

With social media, unsurprisingly, users tend to communicate with others having a similar political ideology, affecting their ability to gain a balanced perspective. Of the Facebook articles involving national news, politics, or world affairs, only 24% of liberals and 35% of conservatives have exposure to other perspectives through shares on social media [2]. Therefore, most social media users who wish to gain a different perspective on a named entity require a convenient yet customizable interface to search these articles and view information on these different perspectives.

Although websites like Allsides (https://www.allsides.com/unbiased-balanced-news) use a bias rating system to illustrate the spectrum of reporting along a liberal-conservative axis, to our knowledge no search interface has been created to classify news articles based on the sentiment used in the text.

Sentiment Analysis
News articles shared on social media are often used to incite affective behavior in readers [7] and are ideal for sentiment classification. Sentiment analysis is an area of Natural Language Processing (NLP) that examines and classifies the affective states and subjective information about a topic or entity. The research question we wish to examine is how well machine-classified sentiment correlates with the sentiment determined by users (which we set as our ground truth). We do this by looking at the subjectivity/objectivity, the polarity, and the magnitude of sentiment in the text of the article at the sentence level, while keeping track of contextual issues such as anaphora resolution. By creating a two-dimensional vector to represent the sentiment for each named entity in each sentence (see Figure 1), we can create an overall vector to match this to the overall sentiment of the article. In Figure 1, the blue lines represent the boundaries between the classifications of sentiment, from very negative to very positive. Note that some of the boundary lines between sentiment ratings (the blue lines) are not strictly vertical; if a word is more objective, the threshold for it to be at the extremes (either very positive or very negative) is lower than when the term is denoted as subjective. We discuss how we classify these terms in the next section.

Figure 1: An example illustrating the vector representation of terms in the phrase "She was excellent at helping others but found the task boring", with polarity along the x-axis and subjectivity along the y-axis. Magnitude is represented as the length of the vector. Vertical blue lines represent the boundaries between sentiment classes, with a tighter range for terms labeled subjective as compared with those labeled objective.

Long Short Term Memory (LSTM) Models
We use the LSTM model introduced by Hochreiter and Schmidhuber [8], subsequently modified to include forget gates as implemented by Gers, Schmidhuber, and Cummins [4] and by Graves [6]. LSTMs have traditionally been applied to machine translation efforts, but here we apply them to classifying sentiment.

With RNNs, a weight matrix is associated with the connections between the neurons of the recurrent hidden layer. The purpose of this weight matrix is to model the synapse between two neurons. During the gradient back-propagation phase of a traditional neural network, the gradient signal can be multiplied many times by this weight matrix, which means it can have a disproportionately strong influence on the learning process.

When the weights in this matrix are small (i.e., the leading eigenvalue of the weight matrix is < 1.0), a situation called vanishing gradients can occur. In this situation, the gradient signal gets so small that learning either becomes very slow or stops completely. This has a negative impact on learning the long-term dependencies in the data. Conversely, when the weights in this matrix are large (i.e., the leading eigenvalue of the weight matrix is > 1.0), the gradient signal can become so large that learning diverges, which is often referred to as exploding gradients.
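A small numerical illustration of this effect (our own sketch, not from the paper) is given below: a gradient vector is repeatedly multiplied by a fixed recurrent weight matrix, mimicking back-propagation through many timesteps. The derivative of the recurrent nonlinearity is ignored for simplicity; only the role of the leading eigenvalue is shown.

import numpy as np

def backprop_norms(leading_eigenvalue, steps=50, size=8, seed=0):
    """Repeatedly multiply a gradient vector by W^T, as happens when a gradient
    is propagated back through `steps` timesteps of a simple RNN, and report
    the gradient norm at a few checkpoints."""
    rng = np.random.default_rng(seed)
    # Random recurrent weight matrix, rescaled so its leading eigenvalue
    # (spectral radius) matches the requested value.
    W = rng.standard_normal((size, size))
    W *= leading_eigenvalue / max(abs(np.linalg.eigvals(W)))
    grad = rng.standard_normal(size)
    norms = []
    for t in range(1, steps + 1):
        grad = W.T @ grad              # one step of back-propagation through time
        if t % 10 == 0:
            norms.append((t, np.linalg.norm(grad)))
    return norms

for ev in (0.9, 1.1):                  # leading eigenvalue below vs. above 1.0
    print(f"leading eigenvalue = {ev}")
    for t, n in backprop_norms(ev):
        print(f"  after {t:2d} steps  |gradient| ~ {n:.3e}")

With the leading eigenvalue at 0.9 the gradient norm collapses toward zero over 50 steps, while at 1.1 it grows by roughly two orders of magnitude, matching the vanishing- and exploding-gradient behavior described above.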
Minimizing the vanishing and exploding gradients is the primary motivation behind the LSTM model. This model introduces a new structure called a memory cell (see Figure 2). A memory cell is comprised of four main elements: (a) an input gate, (b) a neuron with a self-recurrent connection, (c) a forget gate, and (d) an output gate. The self-recurrent connection maintains a weight very close to 1.0. Its purpose is to ensure that, from one timestep to the next, barring any outside interference, the state of a memory cell will remain constant. The gates serve to modulate the interactions between the memory cell itself and its environment. The input gate can allow an incoming signal to alter the state of the memory cell or block it. On the other hand, the output gate can allow the state of the memory cell to affect other neurons. Last, the forget gate modulates the memory cell's self-recurrent connection, allowing the cell to remember or forget its previous state as needed.

Figure 2: Illustration of an LSTM memory cell.

The following equations illustrate how a layer of memory cells is updated at timestep t. We define x_t and h_t as the input to and output of the memory cell layer at time t; W_i, W_f, W_c, W_o are the input weight matrices; U_i, U_f, U_c, U_o are the hidden-state-to-hidden-state weight matrices; b_i, b_f, b_c, b_o are the bias vectors; and σ denotes the logistic sigmoid function. First, we determine the value of the input gate, i_t, and the candidate value for the state of the memory cells at time t, C̃_t:

(1) i_t = σ(W_i x_t + U_i h_(t-1) + b_i)
(2) C̃_t = tanh(W_c x_t + U_c h_(t-1) + b_c)

Next, we compute the value of f_t, the activation of the memory cells' forget gates, at time t:

(3) f_t = σ(W_f x_t + U_f h_(t-1) + b_f)

Given the value of the input gate activation i_t, the forget gate activation f_t, and the candidate state value C̃_t, we can compute C_t, the memory cells' new state, at time t:

(4) C_t = i_t * C̃_t + f_t * C_(t-1)

where * denotes a point-wise (Hadamard) multiplication operator. Once we obtain the new state of the memory cells, we can compute the value of their output gates, o_t, and their outputs, h_t:

(5) o_t = σ(W_o x_t + U_o h_(t-1) + b_o)
(6) h_t = o_t * tanh(C_t)

Our model is a variation of the standard LSTM model; here the activation of a cell's output gate is independent of the memory cell's state C_t. This variation allows us to compute equations (1), (2), (3), and (5) in parallel, improving computational efficiency. This is possible because none of these four equations relies on a result produced by any of the other three. We achieve this by concatenating the four matrices W_* into a single weight matrix W, performing the same concatenation on the four matrices U_* to produce the matrix U, and concatenating the four bias vectors b_* to produce the vector b. Then, the pre-nonlinearity activations can be computed with:

(7) z = W x_t + U h_(t-1) + b

The result is then sliced to obtain the pre-nonlinearity activations for i_t, f_t, C̃_t, and o_t, and the corresponding non-linearities are applied independently to each slice.

Our model is composed of a single LSTM layer followed by an average pooling layer and a logistic regression layer, as illustrated in Figure 3. From an input sequence x_0, x_1, x_2, ..., x_n, the memory cells in the LSTM layer produce a representation sequence h_0, h_1, h_2, ..., h_n. This representation sequence is then averaged over all n timesteps, resulting in a representation h. Last, this representation is fed to a logistic regression layer whose target is the class label associated with the input sequence, which is one of five ordinal levels of sentiment ranging from very positive to very negative. To map the vectorized terms (as seen in Figure 1) to an ordinal value for sentiment, we take the cosine of the term vector.

Figure 3: The model: a single LSTM layer followed by mean pooling over time and logistic regression.
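As a concrete sketch of equations (1)-(7) and the pooling/classification step described above (our own illustration in NumPy, not the authors' implementation; the layer sizes and random weights below are placeholders), the four input matrices, recurrent matrices, and biases are concatenated so that a single matrix product yields z, which is then sliced into the input gate, forget gate, candidate state, and output gate.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def lstm_step(x_t, h_prev, c_prev, W, U, b, n):
    """One memory-cell update, equations (1)-(7). W, U, b hold the
    concatenated parameters for the input gate, forget gate, candidate
    state, and output gate (in that order)."""
    z = W @ x_t + U @ h_prev + b                 # equation (7)
    i = sigmoid(z[0 * n:1 * n])                  # (1) input gate
    f = sigmoid(z[1 * n:2 * n])                  # (3) forget gate
    c_tilde = np.tanh(z[2 * n:3 * n])            # (2) candidate state
    o = sigmoid(z[3 * n:4 * n])                  # (5) output gate
    c = i * c_tilde + f * c_prev                 # (4) new cell state
    h = o * np.tanh(c)                           # (6) cell output
    return h, c

def classify_sequence(xs, W, U, b, W_lr, b_lr, n):
    """Run the LSTM over a sequence, mean-pool the outputs h_0..h_n over
    time, and apply a logistic regression layer over five sentiment classes."""
    h, c, hs = np.zeros(n), np.zeros(n), []
    for x_t in xs:
        h, c = lstm_step(x_t, h, c, W, U, b, n)
        hs.append(h)
    h_mean = np.mean(hs, axis=0)                 # average pooling over timesteps
    return softmax(W_lr @ h_mean + b_lr)         # distribution over 5 classes

# Toy dimensions: 16-dimensional word vectors, 32 memory cells, 5 classes.
rng = np.random.default_rng(0)
d, n, k = 16, 32, 5
W = rng.standard_normal((4 * n, d)) * 0.1
U = rng.standard_normal((4 * n, n)) * 0.1
b = np.zeros(4 * n)
W_lr = rng.standard_normal((k, n)) * 0.1
b_lr = np.zeros(k)
sentence = [rng.standard_normal(d) for _ in range(9)]    # 9 word vectors
print(classify_sequence(sentence, W, U, b, W_lr, b_lr, n))

The tanh in the candidate and output computations follows equations (2) and (6); as noted later in the Training section, a softsign activation can be substituted for tanh.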
INTERFACE COMPONENTS
Figure 4 illustrates the flow of a user query involving a named entity on our interactive query interface. In this section, we describe the major steps and the related interfaces.

Figure 4: Flow diagram showing the major components of the search system.

Data Collection
We use a collection of 433,175 news articles scraped from 211 formal and informal news sources. Of the 211 news sources, 109 are verified sources. We treat as verified those sources that Media Bias/Fact Check rates as having "high" factual reporting. The articles in our collection are on a variety of topics, but all are written in English, have publication dates from 2012-2017, and are available on the internet (although some are available only through paywalls). Figure 5 illustrates the distribution of news articles, news sources, and verified sources for each year in our collection.

The processing of the data in the collection was designed to be done quickly. Using a single server, we were able to index, detect, and classify sentiment for the entire collection of 433,175 articles in approximately 4 minutes, allowing us to handle emergent stream data (e.g., Twitter) with only a minor delay.

Figure 5: Number of articles (top) and number of unique sources (bottom) in our collection, by publication date of the article.

Training of the LSTM Network
The dataset used for training is the recently proposed Stanford Sentiment Treebank [11], which includes fine-grained sentiment labels for 215,154 phrases in the parse trees of 11,855 sentences. In our experiment, we focus on sentiment prediction for complete sentences with respect to the named entities contained within each sentence.

For our LSTM, we use the softsign activation function over tanh; it is faster and there is a smaller probability of saturation (i.e., having a gradient that approaches 0). We trained for 20 epochs (a number determined empirically), using a learning rate of 10^-5, an L2 regularization weight of 0.009, and a dropout value of 1.0.

Interactive Query Interface
Figure 6 shows the interactive query interface used in our study. The query interface is designed to give users as much information as possible to refine their search based on the sentiment of the search results. The interface is divided into two columns. The left column contains an area to enter and refine queries, a checkbox to return only results from verified sources, and several checkboxes to determine the types of sentiment to include, from very negative to very positive. At the bottom of the left-hand column, the most popular terms that appear in the results but were not used in the user query are shown, with color coding to indicate the sentiment of each term.

In the right-hand column, we display the article counts by sentiment and the top-ranked search results. Users are also given the ability to sort the search results based on relevance, date, sentiment, or verified source. Next to each search result, users can see the sentiment our approach has assigned to that article, as well as an indication of whether the article is from a verified source.

We implemented searches on our collection using Indri, a scalable open-source search engine [12]. Indri works well with smaller queries, which are typical of searches on named entities.

Figure 6: The interactive query interface for searching our collection, showing an example query. The sentiment we derive from each article is displayed as the sentiment of that article.

Detecting Ambiguous Named Entities
To ensure we are tracking the correct named entity, when appropriate, we need to disambiguate potentially confounding entities. We use an API from Wikipedia to check for a disambiguation page on the user-provided named entity. If one is found, we obtain the different categories, if any, that are provided by Wikipedia. Figure 7 shows an example of a search on "Michael Jackson" and the categories containing entities named "Michael Jackson". This allows users to narrow their search to the correct entity, reducing the possibility of confounding results from mistakenly grouping disparate entities together.

Figure 7: The disambiguation page for Michael Jackson. Categories are pulled from Wikipedia through their API, allowing the user to find the correct Michael Jackson. Note the shortcut in the upper right-hand side linking to the most popular named entity.
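The paper does not specify which Wikipedia endpoint is used; one possible way to implement this check (a sketch only, using the public MediaWiki API via the third-party requests library) is shown below. MediaWiki marks disambiguation pages with a "disambiguation" entry in their page properties, and the links on such a page give the candidate entities; the further grouping by Wikipedia categories shown in Figure 7 is omitted here.

import requests

API = "https://en.wikipedia.org/w/api.php"

def find_disambiguation_page(entity):
    """Look for a disambiguation page for the user-provided named entity.
    Returns the disambiguation page title, or None if no such page exists."""
    candidates = [entity, f"{entity} (disambiguation)"]
    resp = requests.get(API, params={
        "action": "query",
        "titles": "|".join(candidates),
        "prop": "pageprops",
        "redirects": 1,
        "format": "json",
    }).json()
    for page in resp["query"]["pages"].values():
        if "disambiguation" in page.get("pageprops", {}):
            return page["title"]
    return None

def candidate_entities(disambig_title, limit=100):
    """Return article titles linked from the disambiguation page; these serve
    as the candidate entities offered to the user (continuation handling of
    long link lists is omitted in this sketch)."""
    resp = requests.get(API, params={
        "action": "query",
        "titles": disambig_title,
        "prop": "links",
        "plnamespace": 0,        # article namespace only
        "pllimit": limit,
        "format": "json",
    }).json()
    page = next(iter(resp["query"]["pages"].values()))
    return [link["title"] for link in page.get("links", [])]

page = find_disambiguation_page("Michael Jackson")
if page:
    print(candidate_entities(page)[:10])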
Detecting Verified Sources
As described earlier, we allow users to search only verified news sources or all sources. This allows users to examine both informal and formal sources. We describe how we verify sources in the Data Collection section. Figure 8 shows search results with the "verified sources only" checkbox unchecked, allowing unverified sources.

Figure 8: The interactive query interface for searching our collection, showing search results containing unverified sources.

Applying Sentiment Analysis
We use the LSTM method to detect and classify sentiment for each major named entity in each article, as well as for the main keywords associated with that article. We provide five classes of sentiment, from very negative to very positive. We display this information to the user as the sentiment of the article.
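To make concrete how a two-dimensional term vector like those in Figure 1 can be reduced to one of these five classes, the sketch below takes the cosine of a (polarity, subjectivity) vector, as described in the model section, and buckets it with boundaries that tighten for subjective terms. The specific threshold values are our own illustrative assumptions; the paper does not list them.

import math

def classify_term(polarity, subjectivity):
    """Map a 2-D term vector (polarity on x, subjectivity on y) to one of five
    ordinal sentiment classes. The cosine of the term vector (its angle with
    the polarity axis) drives the decision; the cutoff for the extreme classes
    is relaxed for objective terms, mirroring the slanted boundaries in
    Figure 1. The numeric thresholds below are illustrative guesses."""
    magnitude = math.hypot(polarity, subjectivity)
    if magnitude == 0:
        return "neutral"
    cos = polarity / magnitude                         # cosine of the term vector
    extreme = 0.90 if subjectivity > 0.5 else 0.75     # tighter when subjective
    if cos >= extreme:
        return "very positive"
    if cos <= -extreme:
        return "very negative"
    if cos >= 0.25:
        return "positive"
    if cos <= -0.25:
        return "negative"
    return "neutral"

# "excellent" leans strongly positive; "boring" is mildly negative but subjective.
print(classify_term(polarity=0.9, subjectivity=0.4))   # -> very positive
print(classify_term(polarity=-0.4, subjectivity=0.7))  # -> negative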
EXPERIMENT DESIGN
Sentiment analysis is primarily associated with a named entity, so if multiple entities are described in the article text, each with a different sentiment, this can obscure the true sentiment around each entity if not properly handled. Also, the sentiment of an article is a relative concept – if all articles are negative about a named entity, even a slightly positive article can look very positive in comparison. Our research question is to evaluate whether machine-generated sentiment analysis is a strong predictor of article sentiment from a user's perspective. We accomplish this by evaluating the sentiment ratings the users provide.

Evaluating Sentiment
As with determining relevance in information retrieval, humans are widely known to be better than machines at determining the correctness of article sentiment. We hired 293 crowdworkers from Amazon Mechanical Turk. These crowdworkers performed 600 separate tasks (HITs) to evaluate 1,500 articles (approximately 0.35% of our collection) retrieved by searching on 150 named entities. Each article was evaluated by at least 3 different crowdworkers (a crowdworker could not evaluate an article more than once). The distribution of ratings made by crowdworkers is given in Figure 9. Most raters evaluated 5 articles, and the mean number of articles rated was 15.

Figure 9: The number of articles rated (x-axis) by the number of raters evaluating that number of articles (y-axis).

Figure 10: The interface used to evaluate the article's relevance and to classify the article's sentiment.

Instructions to Users
Each user (crowdworker) is asked to determine whether the article retrieved by their query is relevant to the search criteria. This is used to help refine the search criteria parameters provided to Indri. More importantly, the user is asked to evaluate the sentiment assigned to the article on a five-point scale (see Figure 10). Users were also asked to take a survey on the usability of the interface and the perceived accuracy of the LSTM-classified sentiment.

Intra-Rater Reliability
To evaluate intra-rater reliability, we kept track of each crowdworker's ratings and the articles they rated without identifying them personally. When articles were presented to a crowdworker to rate, the crowdworker was not made aware of the rating previously made by our sentiment analysis model. We also kept track so that a single user could not evaluate any article more than once.

We understand that raters' personal biases can influence their perspective on an article's sentiment. Although we did not attempt to recalibrate each crowdworker's ratings based on the pattern of their ratings, we did check whether any crowdworker consistently selected the sentiment to be very positive or very negative, implying they were rushing through the task instead of evaluating each article thoroughly. Of the 600 tasks, only 3 needed to be repeated due to this behavior.

Fake News
We also wish to determine whether outliers in sentiment on a named entity are good predictors of fake news. For example, if a large percentage of articles for an entity are slightly positive or very positive, those articles with sentiment rated very negative (particularly from unverified sources) are candidates to be fake news articles. To examine the details further, we look at the most negative quotations or facts provided in these articles using a separate process, and we look at the overlap between these sources and other articles in our collection. We briefly report and analyze these findings.

RESULTS AND ANALYSIS
Our primary research question was to examine how well the sentiment analysis provided by our LSTM model correlates with the sentiment ratings made by users. Since each of the articles was evaluated at least 3 times, we took the average rating of the users (rounded to the nearest integer) to be the correct article sentiment.

We computed a Pearson correlation coefficient, r, between the 5 sentiment classes determined by our LSTM network and the 5 sentiment classes provided by the users. There was a positive correlation between the two variables [r = 0.823, n = 1500, p < 0.001]. Therefore, based on the sample of 1,500 news articles evaluated, we believe the sentiment analysis provided by the LSTM model is a reasonably good predictor of an article's sentiment. Table 1 shows the correspondence between the two sets of ratings.

To evaluate fake news articles, we examined named entities where the sentiment was skewed heavily in one direction (either very positive or very negative) and looked at those articles that were extreme outliers, i.e., had a difference in ratings of 3 or more on our 5-point scale. Of the 150 named entities examined in our study, we found 14 that had one or more articles meeting this condition. These 14 named-entity searches yielded 29 articles, of which 28 were unverified news articles.

We ran a separate analysis of the quotations and facts raised in each of these 28 articles and then tried to find them mentioned in the other articles. Of the 31 quotations in these articles about the named entity in question, we found 20 instances where the quotation did not exist in any other article in our collection and 11 instances where the quotation was mentioned but distorted in a way that contorted its context. Of the 89 facts raised in these articles, 77 were not mentioned in any other article, and 12 were mentioned but taken out of context with respect to the other articles in our collection. While we cannot confidently conclude that these articles represent fake news, we believe this approach can help identify articles that have a distinctly different sentiment from other articles and that bring up quotations and facts not mentioned in other articles. We plan to explore this relationship in a future study.

Last, we asked the crowdworkers to provide optional feedback on the interface, both in terms of usability and in terms of the accuracy of sentiment classification, on a five-point Likert scale. Of the 293 crowdworkers, we received feedback from 177 (60.4%). Survey takers scored the interface 3.28 for usability (5 = best), with many commenting that more work needs to be done to reduce its complexity. The survey takers scored the LSTM model's sentiment classification accuracy 4.54, with many providing feedback indicating they concurred with the LSTM model's overall accuracy.

                                   Rating obtained by the LSTM sentiment analysis model
  Average of user ratings              1        2        3        4        5
  (min. of 3 ratings per article)
                              1      122       70        6        3        0
                              2       67      242       77        5        0
                              3        4       84      192       79        5
                              4        0       10       73      250       74
                              5        0        2       11       45       79

Table 1: Correlation of ratings between the average supplied by the users (rows) and those obtained by the sentiment analysis model (columns).
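As a small check on Table 1 (our own verification sketch using NumPy and SciPy, not part of the original analysis), the counts in the table can be expanded back into 1,500 (user rating, LSTM rating) pairs and fed to a standard Pearson correlation; with the rounded user averages shown in the table, the result lands close to the reported r = 0.823.

import numpy as np
from scipy.stats import pearsonr

# Rows: average user rating 1..5; columns: LSTM rating 1..5 (counts from Table 1).
table = np.array([
    [122,  70,   6,   3,   0],
    [ 67, 242,  77,   5,   0],
    [  4,  84, 192,  79,   5],
    [  0,  10,  73, 250,  74],
    [  0,   2,  11,  45,  79],
])

# Expand the contingency table into one (user, lstm) pair per article.
user, lstm = [], []
for i in range(5):
    for j in range(5):
        count = int(table[i, j])
        user.extend([i + 1] * count)
        lstm.extend([j + 1] * count)

assert len(user) == int(table.sum()) == 1500
r, p = pearsonr(user, lstm)
print(f"r = {r:.3f}, p = {p:.2e}")   # close to the reported r = 0.823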
CONCLUSION AND FUTURE WORK
In this paper, we describe an interactive query interface that makes use of sentiment analysis. This allows users performing a named entity search to receive information on the sentiment of each article and therefore to find a wide diversity of opinions on a named entity quickly and easily.

We describe the LSTM model we used and how it can be used to classify the sentiment of news article text into five classes, ranging from very negative to very positive. The advantage of this model is that, even when multiple entities are mentioned in an article, it can match the sentiment to the named entity in question. We have shown that this technique can process news articles quickly, allowing emergent news to be covered with little delay.

We conducted a user study with 293 unique participants to answer our research question. They were instructed to classify the sentiment of 1,500 articles, and we measured how well these classifications correlate with the sentiment obtained from our model. Each article was evaluated by at least 3 users. With a Pearson correlation coefficient of r = 0.823, we found that the users' classification of article sentiment and the classification from the LSTM sentiment analysis tool are strongly correlated.

Combining the sentiment classification techniques with some additional analysis allows us to identify potentially fake news articles. We identified news articles whose ratings were outliers from the majority of the other relevant articles retrieved by the same named entity search. We found that 28 of the 29 articles identified using this approach were suspicious news articles that would need further investigation. We leave this for a future study.

There are some limitations to our work. First, our study only looks at queries on named entities, which are easier to retrieve and analyze semantically than general concepts. Second, the study worked with a collection of 433,175 articles, with 84.1% of these pulled from verified sources. With exposure to more unverified sources, our correlation may be lower, which we leave for future work.

Another limitation has to do with sentence complexity. Our model evaluated sentiment at the sentence level, and we found that proximity to the named entity played a role when more than one named entity was mentioned in a sentence. For example, given "In the 1938 movie Carefree, Fred Astaire performed well but Ralph Bellamy was forgettable.", we would expect our model to provide a positive sentiment for "Fred Astaire", a neutral sentiment for "Carefree", and a negative sentiment for "Ralph Bellamy"; instead, it provided a positive sentiment for both "Fred Astaire" and "Carefree" and a neutral sentiment for "Ralph Bellamy". Evaluating at the phrase level instead of the sentence level would improve the accuracy of our results.

In other future work, we plan to examine the role of images in articles and how they can be analyzed for sentiment as well, including the choice of photos used to represent named entities in news articles. We also plan to examine searches that do not contain named entities and evaluate whether our methods are as accurate as they are with named entities.

REFERENCES
1. Allcott, H., & Gentzkow, M. (2017). Social media and fake news in the 2016 election (No. w23089). National Bureau of Economic Research.
2. Bakshy, E., Messing, S., & Adamic, L. A. (2015). Exposure to ideologically diverse news and opinion on Facebook. Science, 348(6239), 1130-1132.
3. Barthel, M., Mitchell, A., & Holcomb, J. (2016). Many Americans believe fake news is sowing confusion. Pew Research Center. Available at: http://www.journalism.org/2016/12/15/many-americans-believe-fake-news-is-sowing-confusion/
4. Gers, F. A., Schmidhuber, J., & Cummins, F. (1999). Learning to forget: Continual prediction with LSTM. Neural Computation, 12, 2451-2471.
5. Gottfried, J., & Shearer, E. (2017). Americans' online news use is closing in on TV news use. Pew Research Center. Available at: http://www.pewresearch.org/fact-tank/2017/09/07/americans-online-news-use-vs-tv-news-use/
6. Graves, A. (2012). Supervised sequence labelling with recurrent neural networks (Vol. 385). Heidelberg: Springer.
7. Hasell, A., & Weeks, B. E. (2016). Partisan provocation: The role of partisan news use and emotional responses in political information sharing in social media. Human Communication Research, 42(4), 641-661.
8. Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735-1780.
9. Kwak, H., Lee, C., Park, H., & Moon, S. (2010). What is Twitter, a social network or a news media? In Proceedings of the 19th International Conference on World Wide Web (pp. 591-600). ACM.
10. Purcell, K., Brenner, J., & Rainie, L. (2012). Search engine use 2012. Pew Research Center. Available at: http://www.pewinternet.org/files/old-media/Files/Reports/2012/PIP_Search_Engine_Use_2012.pdf
11. Socher, R., Perelygin, A., Wu, J., Chuang, J., Manning, C. D., Ng, A., & Potts, C. (2013). Recursive deep models for semantic compositionality over a sentiment treebank. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (pp. 1631-1642).
12. Strohman, T., Metzler, D., Turtle, H., & Croft, W. B. (2005). Indri: A language model-based search engine for complex queries. In Proceedings of the International Conference on Intelligent Analysis (Vol. 2, No. 6, pp. 2-6).
13. Yin, X., & Shah, S. (2010). Building taxonomy of web search intents for name entity queries. In Proceedings of the 19th International Conference on World Wide Web (pp. 1001-1010). ACM.