=Paper=
{{Paper
|id=Vol-2041/paper4
|storemode=property
|title=Towards an Understanding of Fake News
|pdfUrl=https://ceur-ws.org/Vol-2041/paper4.pdf
|volume=Vol-2041
|authors=Ozlem Ozgobek,Jon Atle Gulla
}}
==Towards an Understanding of Fake News==
<pdf width="1500px">https://ceur-ws.org/Vol-2041/paper4.pdf</pdf>
<pre>
       Towards an Understanding of Fake News
                        Özlem Özgöbek, Jon Atle Gulla


                Email: {ozlem.ozgobek, jon.atle.gulla}@ntnu.no


          Department of Computer Science, NTNU, Trondheim, Norway


      Abstract. Fake news articles are intentionally fabricated to be
      deceptive and can be proven false. Fake news and the spread of
      misinformation are important concepts which may have serious real world
      consequences. Even though the concept has existed for many years,
      advancements in technology have greatly changed the speed of diffusion
      of misinformation and how people consume and produce news. Detecting
      fake news quickly and correctly has therefore become a challenge.
      Today most fact checking is done by professional journalists, but
      research on the automatic detection of fake news is increasing rapidly.
      Linguistic and machine learning techniques are the most frequently used
      techniques for automatic detection. In this paper, we analyze these
      techniques in three main groups: Content based methods, user based
      methods and network based methods. We also give a short introduction to
      the concepts and present some preliminary research results towards an
      understanding of fake news.


   It was not so long ago that the term fake news started to appear frequently
in the media. Even though deception, hoax, clickbait and credibility detection
for news articles have been a focus of researchers for some time, the frequent
use of the term fake news called for a new definition and more specific work
towards detecting it.
   Even though the term fake news is quite new, there was always some news-
papers trying to take more attention from the readers through exaggerated head-
lines and articles containing misinformation [7] [1]. In the internet era, where
every individual have an opportunity to publish and be visible by many others,
it is not surprising that the generation of fake news has increased. One of the
main reasons of generating fake news is the economic gain which can be acquired
by getting more clicks or generating paid fake content [5] for parties who want
to get more clicks. Fake news can cause sudden changes in stock market and this
can easily be converted to an economic gain by the parties who published the
fake news articles [2]. Another common reason for generating fake news is trying
to create a deception and/or a political bias within users in order to get more
supporters. On the other hand where to draw the line between fake news and
expression of opinions is important.

  Copyright held by the author(s). NOBIDS 2017

    There are many reasons why fake news has gained so much attention in the
last couple of years. The 2016 US elections in particular played an important
role in drawing public attention to fake news. The fact that fake news causes
real world problems [4] [6] is another reason why the public reaction has
increased.
    Social media is an important way of getting news, especially for the
younger generation [17]. There are two aspects of news on social media:
traditional news shared on social media, and social media as a source of news
(through users who share events happening nearby). The second aspect is
sometimes used by traditional media houses to generate news articles. Both
aspects, intentionally or unintentionally, can lead fake news to spread even
more.
    According to some research, people are not very good at distinguishing real
news from fake news. In [18] it is suggested that humans can identify only 70%
of fake news. Another study mentions that 75% of people classified fake news
as accurate news [26]. In [23] it is suggested that people tend to classify
news articles that they do not agree with as fake news. The confusion and bias
of readers create a demand for approved news from trusted sources.
    Today fake news detection is mostly done manually by professional
journalists. Fact checking web pages are available in many languages all around
the world [3]. On the other hand, the amount of research on the automatic
detection of fake news is increasing rapidly. Collaboration between researchers
and journalists is increasingly important for the development of automatic fake
news detection systems.
    In this paper we give a brief overview of the state of the art in the
automatic detection of fake news. The paper is structured as follows: In
Section 1 the definition of fake news is discussed from different aspects. In
Section 2 the diffusion mechanisms of fake news are introduced. Commonly used
techniques for the automatic detection of fake news are discussed in Section 3.
A brief summary of the important research findings is given in Section 4.
Finally, the conclusions are given in Section 5.


1 Defining Fake News
Automatic detection of fake news has many challenges. Defining fake news and
setting the right scope for detecting it is one of the basic challenges. The
scope of fake news usually differs between research efforts on fake news
detection. For example, clickbait (content which is specifically designed to
attract more attention from readers) is seen as implicated in the fast spread
of fake news by some researchers [11], while others do not consider this
aspect. Similarly, satirical news is classified as a type of fake news by some
researchers [25], while it is not by others [29]. There is a similar debate
about rumors, conspiracy theories, hyperpartisan news and hoaxes. A more
general term used for fake news and misinformation detection is deception
detection. In [24] the authors divide deceptive news into three main
categories: intentionally fabricated news, large scale hoaxes and humorous news
taken seriously. On the other hand, in [29] a narrower definition of fake news
is adopted, which only includes the news
articles that are intentionally and verifiably false [8]. This definition of
fake news does not include satire, hoaxes, rumors, conspiracy theories and
unintentionally created misinformation.
   Keeping the definition of fake news as narrow as possible is advantageous
for automatic fake news detection research, since it keeps the focus on the
core elements and eliminates ambiguity. But we should also keep in mind that
there are more complicated challenges and it is important to:


  - Draw the line between fake news and expression of opinions.
  - Distinguish humour, satire, fauxtire and irony: Satirical and humorous
    news articles are sometimes classified as fake news because they do not
    contain real events and can sometimes mislead readers. Also, detecting
    humor within a true news story is challenging, and humor can act as a false
    cue when detecting fake news automatically.
  - Consider fake items in a true story: Sometimes a news article contains
    fake elements. These elements do not make the whole story fake but can
    cause deception. They are usually harder to detect than a wholly fake news
    article, and they make it hard to classify the article as fake or not, or
    to decide the level of fake elements in the article.


   In this paper we assume the following definition of fake news: Fake news
articles are intentionally fabricated to be deceptive and can be proven
false.


2 The Diffusion of Fake News
Fake news can diffuse through different media, but social media has the
greatest share in its spread. The very dynamic nature of social media, and the
fact that every individual is also a publisher in it, make social media a
suitable medium for the diffusion of any information. But in social media fake,
inflammatory, emotional and one-sided news spreads quicker than many serious,
real news articles [21]. Understanding the spread mechanisms of fake news helps
us to prevent it from spreading further and even to detect it. The value of
social media as a news source cannot be ignored. Social media has value as a
quick and first hand source of news events [32] with an increased event
coverage [19]. On the other hand, due to the high noise levels and lack of
validation mechanisms, extracting true news from social media is highly
challenging. Still, social media is commonly used as a source of news by
various news web pages, and it has become a prioritized source of news,
especially for the younger generation [17]. Besides, strong personal biases
exist when it comes to the perception of news. According to [23] people tend to
classify the news articles that they do not agree with as fake news. These
strong biases will affect what people share on social media and therefore the
spread of fake news. All of these aspects put social media in a very important
position for fake news diffusion.

      According to [13], fake images presented as news on Twitter were
distributed mainly (86%) through retweets, with only a few spread through
original tweets.
      Social bots are another important factor in understanding the spread
mechanisms of fake news. In [28] the authors suggest that bots play an
important role in spreading misinformation on social networks, that they are
mostly active in the early phases of the spreading, and that they target
influential users who have a bigger chance of spreading the false information.


3 Commonly Used Detection Techniques
With the increase in attention to fake news, there is also a fast increase in
the amount of recent research on this topic. In this section we summarize the
commonly used automatic detection techniques for fake news.


3.1     Content based methods

Content based methods use content cues in order to detect fake news. A content
cue refers to any kind of content related cue for detecting fake news. Most
content cues are textual, so we find it useful to classify content cues as
textual and non-textual cues. A more detailed classification of cues for
clickbait detection is proposed in [11]. Content based methods include the
analysis of any kind of content available in the news, like text, image, video
or sound, in combination with various machine learning techniques.


Textual content Much of the existing research looks for textual cues in order
to detect fake news. This is also how professional journalists approach manual
fake news detection. Understanding and analyzing the text for fake elements is
the most natural approach. In order to detect textual cues, lexical, semantic,
syntactic and pragmatic levels of analysis are usually used. Writing style
analysis is one of the most commonly used techniques for the detection of
misinformation in news [20], [21]. In addition to writing style, some research
addresses the analysis of language use. In [15] the authors suggest a method
that includes stylistic, complexity and psychological feature analysis using
NLP and sentiment analysis. In [14] linguistic (n-gram) features, credibility
features (capitalization, punctuation etc.) and semantic features (DBpedia and
embeddings) are used for analysis. Another work on detecting clickbait proposes
a linguistic approach with different classifiers (e.g. SVM, Random forests
etc.) and reaches an accuracy of 93% [10]. Even though clickbait detection is
not directly related to fake news, it can give us a clue about fake news. Of
course this does not mean that every clickbait headline refers to a fake news
article. Some research does not include full text analysis of news articles.
In [31] the authors suggest a method that analyzes only the headlines. Fake
news detection on social media might be somewhat more challenging due to the
difficulty of reaching the original source of the news and the noisy text. In
[16] the authors propose a method of
detecting fake news by exploiting conflicting information on Twitter. They
suggest that opposing viewpoints can help in detecting fake news. The Fake News
Challenge (see footnote 1) leverages machine learning and different artificial
intelligence techniques for fake news detection. The first part of the
challenge was about stance detection in news. In [22] the authors applied
several neural network architectures and two novel architectural variations to
the stance detection task of the Fake News Challenge.
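
As an illustration of the kind of surface cues mentioned above (n-grams,
capitalization, punctuation), the following is a minimal sketch in Python. The
function names and the particular set of features are our own assumptions for
illustration and are not the feature sets used in [14] or [10]; the resulting
values would still have to be fed to a classifier.

```python
import re


def surface_features(text):
    """Compute a few illustrative surface cues for a headline or article.

    The feature names are hypothetical and chosen only for this sketch.
    """
    tokens = re.findall(r"[A-Za-z']+", text)
    n_tokens = max(len(tokens), 1)  # avoid division by zero on empty text
    # Tokens written entirely in capitals (longer than one letter, to skip "I" or "A").
    all_caps = sum(1 for t in tokens if len(t) > 1 and t.isupper())
    return {
        "all_caps_ratio": all_caps / n_tokens,
        "exclamation_count": text.count("!"),
        "question_count": text.count("?"),
        "avg_word_length": sum(len(t) for t in tokens) / n_tokens,
        "type_token_ratio": len({t.lower() for t in tokens}) / n_tokens,
    }


def word_ngrams(text, n=2):
    """Return the lowercase word n-grams of a text as tuples."""
    tokens = [t.lower() for t in re.findall(r"[A-Za-z']+", text)]
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


if __name__ == "__main__":
    headline = "SHOCKING news: you WON'T believe this!!"
    print(surface_features(headline))
    print(word_ngrams(headline, n=2))
```

Counts of such n-grams and the numeric cues can be combined into one feature
vector per article before training a classifier such as an SVM.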


Non-textual content Online news articles usually include more than just text.
Images, URLs, sound or video files are often available with the news text.
Also the metadata of existing elements, web traffic information and image
captions can be classified as non-textual cues, since they are not a direct
part of the news article [11]. The analysis of such cues can give us
interesting results in the detection of fake news. Image analysis is one of the
most used forms of non-textual content analysis. In [12] the authors proposed
an automatic image verification method for online news articles which has an
accuracy of almost 73%.


3.2     User based methods


User based fake news detection methods depend on the idea that user behaviour
can give clues about misinformation. Any kind of user interaction analysis
falls into this category. User comments can be taken as a textual cue when they
are analyzed without identifying which user they belong to, but analyzing the
comments in relation to particular users can be classified as a user based
method. In [30] the authors detect hoax posts on Facebook by analyzing the
users who liked them. Even though some research points to the unreliability of
people in identifying fake news [23], in [27] crowdsourcing was used to
identify fake news.
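
The idea of judging a post by the users who interacted with it can be sketched
as follows. This is a deliberately simple stand-in and not the method of [30]:
we assume a set of already labeled posts, estimate for each user the share of
liked posts that were hoaxes, and score a new post by averaging over its
likers. The function names and the prior of 0.5 for unseen users are
assumptions made for this example.

```python
from collections import defaultdict


def user_hoax_rates(labeled_likes):
    """Estimate, per user, the share of liked posts that were labeled as hoaxes.

    labeled_likes is an iterable of (user_id, post_is_hoax) pairs.
    """
    counts = defaultdict(lambda: [0, 0])  # user -> [hoax likes, total likes]
    for user, is_hoax in labeled_likes:
        counts[user][0] += int(is_hoax)
        counts[user][1] += 1
    return {user: hoax / total for user, (hoax, total) in counts.items()}


def score_post(likers, rates, prior=0.5):
    """Average the hoax rates of a post's likers; unseen users get the prior."""
    if not likers:
        return prior
    return sum(rates.get(user, prior) for user in likers) / len(likers)


if __name__ == "__main__":
    history = [("a", True), ("a", True), ("b", False), ("c", True), ("c", False)]
    rates = user_hoax_rates(history)
    print(score_post(["a", "c"], rates))
    print(score_post(["b", "z"], rates))
```

A score close to 1 means that the post was mostly liked by users who previously
liked hoaxes; a threshold on this score would be needed to obtain a label.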


3.3     Network based methods


Network based fake news detection methods include web traffic analysis and web
metadata analysis [11] as well as user network analysis. In [13] the authors
model how some tweets go viral on Twitter by analyzing the social network graph
of users. In [9] the authors model a social network as a directed weighted
graph and then calculate the probability of nodes transmitting information to
each other in order to counter the spread of misinformation. They also mention
the source identification problem on a social network, which addresses the
problem of identifying the nodes on a network that are infected by
misinformation. Also, tracking a news item back to its original source can give
us strong clues about the reliability of the news.
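
To make the notion of a directed weighted graph with transmission
probabilities more concrete, the sketch below simulates spreading with a
standard independent cascade model. This model is our assumption for
illustration; the text above does not state which diffusion model is used in
[9]. The graph layout, function names and the number of runs are likewise
hypothetical.

```python
import random


def simulate_cascade(graph, seeds, rng):
    """Run one independent cascade and return the set of informed nodes.

    graph maps a node to a dict {neighbour: transmission probability}.
    Each newly informed node gets one chance to inform each neighbour.
    """
    informed = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for node in frontier:
            for neighbour, prob in graph.get(node, {}).items():
                if neighbour not in informed and rng.random() < prob:
                    informed.add(neighbour)
                    next_frontier.append(neighbour)
        frontier = next_frontier
    return informed


def estimate_reach(graph, seeds, runs=1000, seed=0):
    """Estimate by repeated simulation how often each node gets informed."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(runs):
        for node in simulate_cascade(graph, seeds, rng):
            counts[node] = counts.get(node, 0) + 1
    return {node: count / runs for node, count in counts.items()}


if __name__ == "__main__":
    network = {"a": {"b": 0.5, "c": 0.2}, "b": {"d": 0.9}, "c": {"d": 0.1}}
    print(estimate_reach(network, ["a"]))
```

Nodes with a high estimated reach probability from a suspected source are the
ones where an intervention would matter most under this model.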

1 http://fakenewschallenge.org/


3.4     Hybrid methods

Sometimes two or more of the methods above can be used together in order
to achieve more accurate results in fake news detection. One example is [26],
where the authors consider three generally agreed upon characteristics of fake
news: text, response received and source. They propose an architecture called
CSI (Capture, Score, Integrate). In this architecture recurrent neural networks
are used to capture a representation of the news article, user behaviour over
time is used to score users, and the two outputs are integrated and the result
is used for classification.
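
The three step structure described above can be illustrated with a toy late
fusion of a content value and a user value. This is not the CSI model of [26]:
the capture step there uses a recurrent neural network, whereas the sketch
below uses a plain mean, and the weights and bias are hand-picked assumptions
rather than learned parameters.

```python
import math


def capture(article_features):
    """Placeholder for the capture step: reduce article features to one value.

    The cited architecture uses a recurrent neural network here; this sketch
    only takes the mean of the given numbers.
    """
    if not article_features:
        return 0.0
    return sum(article_features) / len(article_features)


def score(user_scores, engaged_users, default=0.5):
    """Placeholder for the score step: mean score of the engaged users."""
    if not engaged_users:
        return default
    total = sum(user_scores.get(user, default) for user in engaged_users)
    return total / len(engaged_users)


def integrate(article_value, user_value, weights=(2.0, 2.0), bias=-2.0):
    """Combine both values with a logistic function (hand-picked weights)."""
    z = weights[0] * article_value + weights[1] * user_value + bias
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    article_value = capture([0.8, 0.6, 0.7])
    user_value = score({"u1": 0.9, "u2": 0.4}, ["u1", "u2", "u3"])
    print(integrate(article_value, user_value))
```

In a trained system the weights of the final step would be learned jointly
with the two preceding modules instead of being fixed by hand.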


4 Research Findings
Although there is limited work on automatic fake news detection, there are
some interesting preliminary results that may help us shape fake news detection
systems in the future:


    - The textual characteristics of fake news articles share many similarities
      with satire news. Fake and real news are substantially different. [15]
    - News articles with a hyperpartisan world view were successfully
      distinguished from more balanced news. Fake news detection is difficult
      via style analysis alone. [21]
    - A linguistic approach with different classifiers (e.g. SVM, Random
      forests etc.) managed to detect clickbait in online news with an accuracy
      of 93%. [10]
    - The diffusion pattern of information can be useful for detecting hoaxes.
      [30]
    - Opposing viewpoints can help us detect fake news, e.g. by analyzing
      conflicting information on Twitter. [16]
    - Users are quite biased when it comes to detecting fake news personally.
      The perception of what is fake and what is not can be very different from
      one person to another, and fake news related keywords are often used to
      express disagreement. [23]


5 Conclusions
In this paper we give a short overview of the state of the art towards an
understanding of fake news. Fake news is an important concept which may have
serious real world consequences. Even though the scope of fake news differs
(e.g. whether satire or rumors are included as fake news), the challenges of
automatically detecting misinformation exist for all definitions.
Understanding the diffusion mechanisms of fake news is an important step
towards understanding and preventing the spread of misinformation. The
importance of social media in the spread of fake news should not be
underestimated. A deeper understanding of human psychology with regard to fake
news could be helpful for developing tools for the detection and prevention of
misinformation. The existing methods
for automatic fake news detection are mostly based on linguistic and machine
learning techniques. In addition to these methods, image analysis and
crowdsourcing methods have been applied. With the increasing popularity of the
term fake news, research on automatic detection is also increasing rapidly. The
manual fact checking done by professional journalists gives researchers the
opportunity to understand the nature of misinformation and work more
efficiently towards the automatic detection of fake news.


References
 1. Before the internet, irresponsible journalism was blamed for a war and a
    presidential assassination.
    https://timeline.com/yellow-journalism-media-history-8a29e4462ac.
    Accessed: 2017-10-15.
 2. Can 'fake news' impact the stock market?
    https://www.forbes.com/sites/kenrapoza/2017/02/26/can-fake-news-impact-the-stock-market/.
    Accessed: 2017-10-15.
 3. Global fact-checking sites. https://reporterslab.org/fact-checking/.
    Accessed: 2017-10-15.
 4. The real consequences of fake news.
    https://theconversation.com/the-real-consequences-of-fake-news-81179.
    Accessed: 2017-10-15.
 5. This is how facebook's fake-news writers make money.
    https://www.washingtonpost.com/news/the-intersect/wp/2016/11/18/this-is-how-the-internets-fake-news-writers-make-money/.
    Accessed: 2017-10-15.
 6. Washington gunman motivated by fake news 'pizzagate' conspiracy.
    https://www.theguardian.com/us-news/2016/dec/05/gunman-detained-at-comet-pizza-restaurant-was-self-investigating-fake-news-reports.
    Accessed: 2017-10-15.
 7. Yellow journalism: The fake news of the 19th century.
    http://publicdomainreview.org/collections/yellow-journalism-the-fake-news-of-the-19th-century/.
    Accessed: 2017-10-15.
 8. H. Allcott and M. Gentzkow. Social media and fake news in the 2016 election.
    Technical report, National Bureau of Economic Research, 2017.
 9. M. Amoruso, D. Anello, V. Auletta, and D. Ferraioli. Contrasting the spread of
    misinformation in online social networks. In Proceedings of the 16th Conference
    on Autonomous Agents and MultiAgent Systems, pages 1323-1331. International
    Foundation for Autonomous Agents and Multiagent Systems, 2017.
10. A. Chakraborty, B. Paranjape, S. Kakarla, and N. Ganguly. Stop clickbait:
    Detecting and preventing clickbaits in online news media. In Advances in Social
    Networks Analysis and Mining (ASONAM), 2016 IEEE/ACM International Conference
    on, pages 9-16. IEEE, 2016.
11. Y. Chen, N. J. Conroy, and V. L. Rubin. Misleading online content: Recognizing
    clickbait as false news. In Proceedings of the 2015 ACM on Workshop on
    Multimodal Deception Detection, pages 15-19. ACM, 2015.
12. S. Elkasrawi, A. Dengel, A. Abdelsamad, and S. S. Bukhari. What you see is
    what you get? automatic image verification for online news content. In Document
    Analysis Systems (DAS), 2016 12th IAPR Workshop on, pages 114-119. IEEE,
    2016.
13. A. Gupta, H. Lamba, P. Kumaraguru, and A. Joshi. Faking sandy: characterizing
    and identifying fake images on twitter during hurricane sandy. In Proceedings of
    the 22nd international conference on World Wide Web, pages 729-736. ACM, 2013.
14. M. Hardalov, I. Koychev, and P. Nakov. In search of credible news. In
    International Conference on Artificial Intelligence: Methodology, Systems, and
    Applications, pages 172-180. Springer, 2016.
15. B. D. Horne and S. Adali. This just in: Fake news packs a lot in title, uses simpler,
    repetitive content in text body, more similar to satire than real news. arXiv
    preprint arXiv:1703.09398, 2017.
16. Z. Jin, J. Cao, Y. Zhang, and J. Luo. News verification by exploiting conflicting
    social viewpoints in microblogs. In AAAI, pages 2972-2978, 2016.
17. R. Marchi. With facebook, blogs, and fake news, teens reject journalistic
    objectivity. Journal of Communication Inquiry, 36(3):246-262, 2012.
18. V. Pérez-Rosas, B. Kleinberg, A. Lefevre, and R. Mihalcea. Automatic detection
    of fake news. arXiv preprint arXiv:1708.07104, 2017.
19. S. Petrovic, M. Osborne, R. McCreadie, C. Macdonald, I. Ounis, and L. Shrimpton.
    Can twitter replace newswire for breaking news? In ICWSM, 2013.
20. K. Popat, S. Mukherjee, J. Strötgen, and G. Weikum. Credibility assessment
    of textual claims on the web. In Proceedings of the 25th ACM International on
    Conference on Information and Knowledge Management, pages 2173-2178. ACM,
    2016.
21. M. Potthast, J. Kiesel, K. Reinartz, J. Bevendorff, and B. Stein. A stylometric
    inquiry into hyperpartisan and fake news. arXiv preprint arXiv:1702.05638, 2017.
22. N. Rakholia and S. Bhargava. Is it true? Deep learning for stance detection in
    news.
23. M. H. Ribeiro, P. H. Calais, V. A. Almeida, and W. Meira Jr. "Everything I
    disagree with is #fakenews": Correlating political polarization and spread of
    misinformation. arXiv preprint arXiv:1706.05924, 2017.
24. V. L. Rubin, Y. Chen, and N. J. Conroy. Deception detection for news: three types
    of fakes. Proceedings of the Association for Information Science and Technology,
    52(1):1-4, 2015.
25. V. L. Rubin, N. J. Conroy, Y. Chen, and S. Cornwell. Fake news or truth? using
    satirical cues to detect potentially misleading news. In Proceedings of
    NAACL-HLT, pages 7-17, 2016.
26. N. Ruchansky, S. Seo, and Y. Liu. Csi: A hybrid deep model for fake news. arXiv
    preprint arXiv:1703.06959, 2017.
27. R. J. Sethi. Crowdsourcing the verification of fake news and alternative facts. 2017.
28. C. Shao, G. L. Ciampaglia, O. Varol, A. Flammini, and F. Menczer. The spread
    of fake news by social bots. arXiv preprint arXiv:1707.07592, 2017.
29. K. Shu, A. Sliva, S. Wang, J. Tang, and H. Liu. Fake news detection on social media:
    A data mining perspective. ACM SIGKDD Explorations Newsletter, 19(1):22-36,
    2017.
30. E. Tacchini, G. Ballarin, M. L. Della Vedova, S. Moret, and L. de Alfaro. Some
    like it hoax: Automated fake news detection in social networks. arXiv preprint
    arXiv:1704.07506, 2017.
31. W. Wei and X. Wan. Learning to identify ambiguous and misleading news
    headlines. arXiv preprint arXiv:1705.06031, 2017.
32. H. M. Wold, L. Vikre, J. A. Gulla, Ö. Özgöbek, and X. Su. Twitter topic modeling
    for breaking news detection. In WEBIST (2), pages 211-218, 2016.

</pre>