=Paper=
{{Paper
|id=Vol-2699/paper32
|storemode=property
|title=#DemocratsAreDestroyingAmerica: Rumour Analysis on Twitter During COVID-19
|pdfUrl=https://ceur-ws.org/Vol-2699/paper32.pdf
|volume=Vol-2699
|authors=Lin Tian,Xiuzhen Zhang,Jey Han Lau
|dblpUrl=https://dblp.org/rec/conf/cikm/TianZL20
}}
==#DemocratsAreDestroyingAmerica: Rumour Analysis on Twitter During COVID-19==
<pdf width="1500px">https://ceur-ws.org/Vol-2699/paper32.pdf</pdf>
<pre>
#DemocratsAreDestroyingAmerica: Rumour Analysis
on Twitter During COVID-19
Lin Tian (a), Xiuzhen Zhang∗ (a) and Jey Han Lau (b)
(a) RMIT University, Melbourne, Australia
(b) The University of Melbourne, Melbourne, Australia


                                         Abstract
                                         COVID-19 has brought about significant economic and social disruption, and misinformation thrives during this
                                         uncertain period. In this paper, we apply state-of-the-art rumour detection systems that leverage both text content
                                         and user metadata to classify COVID-19 related rumours, and analyse how users, topics and emotions of rumours
                                         differ from non-rumours. We found a number of interesting insights, e.g. rumour-spreading users have a
                                         disproportionately smaller number of followers compared to their followees, rumour topics largely involve politics
                                         (with an abundance of party blaming), and rumours tend to be emotionally charged (anger) but reactions towards
                                         rumours exhibit disapproving sentiments.

                                         Keywords
                                         Rumour Detection, Rumour Analysis, COVID-19, Twitter


1. Introduction

COVID-19, a novel disease that was first identified in China, is an ongoing pandemic that has
brought about a significant impact on the global economy and created hitherto unseen social
disruption. Since late February 2020, the pandemic has come to dominate both traditional news
and social media platforms,1 and misinformation such as fake news, conspiracy theories and
rumours thrives during these uncertain times [1].
   For example, in Italy we saw rumours being spread to blame the outbreak on migrants and
refugees by making the implicit connection between migration/movement and the spread of the
virus.2 Hydroxychloroquine, a drug that was rumoured to be a COVID-19 treatment despite lacking
robust scientific evidence about its effectiveness [2, 3], is another popular topic on social
media.3 These rumours can have serious consequences, e.g. misinformation about
hydroxychloroquine has led to the death of a man in Arizona.4
   Social media provides a perfect platform for misinformation propagation as it is largely
unregulated. To identify misinformation or fake news, we may rely on general fact-checking
websites,5 or COVID-19 specific ones.6 However, due to the evolving circumstances of a pandemic
it is unlikely that fact-checking or debunking websites will have the capacity to keep
themselves up-to-date.
   As such, early detection of potentially malicious rumours and understanding what or how
rumours are being spread during a crisis is an important task [4]. But what is a “rumour”? We
adopt a widely used definition which defines it as a story or a statement with unverified truth
value [5].
   In this paper, we seek to understand what sorts of COVID-19 rumours are being spread on
Twitter. To this end, we train state-of-the-art rumour detection systems on out-of-domain
labelled rumour data and apply them to COVID-19 related tweets to detect rumours. We analyse
several characteristics that differentiate rumours from non-rumours in this COVID-19 data, such
as their propagation patterns, users, topics, and emotions. Our rumour detection systems
leverage both message content and user characteristics, and our analyses reveal a number of
interesting insights. For example, rumour-spreaders tend to have low follower but high followee
counts, rumours tend to talk about politics (mostly party blaming) and are more emotionally
charged (e.g. anger), but reactions towards them are also disproportionately more disapproving.
We also provide a website to share our latest findings and up-to-date rumour tracking data
analysis.7

Title of the Proceedings: Proceedings of the CIKM 2020 Workshops
October 19-20, Galway, Ireland
Editors of the Proceedings: Stefan Conrad, Ilaria Tiddi
email: s3795533@student.rmit.edu.au (L. Tian); ∗ Corresponding author:
xiuzhen.zhang@rmit.edu.au (X. Zhang∗); jeyhan.lau@gmail.com (J.H. Lau)
orcid: 0000-0001-5558-3790 (X. Zhang∗)
© 2020 Copyright for this paper by its authors. Use permitted under Creative Commons License
Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

     1 https://www.vox.com/recode/2020/3/12/21175570/coronavirus-covid-19-social-media-twitter-facebook-google.
     2 https://time.com/5789666/italy-coronavirus-far-right-salvini/.
     3 https://abcnews.go.com/Health/tracking-hydroxychloroquine-misinformation-unproven-covid-19-treatment-ended/story?id=70074235.
     4 https://edition.cnn.com/2020/03/23/health/arizona-coronavirus-chloroquine-death/index.html.
     5 E.g. https://www.snopes.com/ and https://www.factcheck.org/.
     6 E.g. https://www.fema.gov/coronavirus/rumor-control and
       https://www.defense.gov/Explore/Spotlight/Coronavirus/Rumor-Control/

Table 1
Rumour classification training data.

                  Twitter15   Twitter16     PHEME   SemEval
#source tweets        1,490         818     6,425       446
#all tweets         624,458     363,535   105,354    42,195
#users              426,501     251,799    50,593     5,666
#rumours              1,118         613     2,402       446
#non-rumours            372         205     4,022         0

2. Related Work

Rumour detection approaches can generally be categorised into text-based or non-text-based
methods. Text-based methods focus on rumour detection using the textual content, which may
include the original source document/message and user comments/replies. Shu et al. [6] introduce
linguistic features to represent writing styles and other features based on sensational
headlines from Twitter to detect misinformation. To detect rumours as early as possible, Zhou
et al. [7] incorporate reinforcement learning to dynamically decide how many responses are
needed to classify a rumour.
   Non-text-based methods utilise features such as user profiles or propagation patterns for
rumour detection. For example, Gupta et al. [8] propose a semi-supervised approach to evaluate
the credibility of tweets using hand-crafted features based on tweet and user metadata. Castillo
et al. [9] leverage user registration age and number of followers to assess credibility.
Subsequent studies explore more complex features such as belief/intention for rumour prediction
[10], where users are categorised based on their “support” or “deny” attitudes toward a piece of
news.
   In terms of emotion analysis on social media, Larsen et al. [11] propose using principal
component analysis to predict emotions of tweets, and introduce a real-time system that
analyses global and regional emotional signals on Twitter. More recently, Farruque et al. [12]
formulate the emotion detection task as a multi-label classification problem and use an LSTM
model with attention for emotion prediction.
   For analysis of COVID-19 on Twitter, Li et al. [13] explore using multi-lingual BERT [14] to
analyse public mental health using tweets. Sharma et al. [15] present analysis of COVID-19
misinformation based on news sources from fact-checking sites rather than automatic
classification and contrast analysis of rumours versus non-rumours.

3. Methodology and Data

3.1. Rumour Classification

We focus on the detection of rumours vs. non-rumours, rather than the veracity (truthfulness) of
rumours. In other words, truthful, untruthful and unverified rumours are all rumours in our
definition (they exhibit novelty/surprise in terms of content and tend to be spread by users),
while non-rumours are traditional news stories and non-news related conversations. The task of
rumour detection can therefore be formulated as a binary classification problem, and we explore
both textual information and user metadata as input features.
   Consider a set of 𝑛 source tweets 𝑆 = {𝑠_1, 𝑠_2, ..., 𝑠_𝑛}. Each source tweet is associated
with a label 𝑙 indicating whether the tweet is a rumour (𝑙 = 1) or a non-rumour (𝑙 = 0). Each
source tweet 𝑠_𝑖 also has a set of 𝑚 reactions: 𝑅_𝑖 = {𝑟_{𝑖1}, 𝑟_{𝑖2}, ..., 𝑟_{𝑖𝑚}}. Reactions are
retweets, replies and quotes. Each reaction 𝑟_{𝑖𝑗} is represented with a tuple
𝑟_{𝑖𝑗} = (𝑡_{𝑖𝑗}, 𝑢_{𝑖𝑗}), where 𝑡_{𝑖𝑗} is the textual content of the reaction, and 𝑢_{𝑖𝑗} the
metadata features of the user who creates the reaction tweet.
   In terms of rumour classification models, we explore two methods based on: (1) text [16];
and (2) user metadata [17]. The text-based model is implemented with BERT [14] and uses a
pre-trained user stance prediction model to classify the veracity of a rumour. We adapt the
model to our task, which treats rumour classification as a binary classification task. The
user-based model uses a convolutional network to process user metadata features extracted from
their Twitter profile and a recurrent network to combine a set of user features in the
propagation path. We extend the original eight features to sixteen features.8 We limit the
processing of user features in the propagation path to the first 50 users.

     7 https://xiuzhenzhang.github.io/rmit-covid19/
     8 The extended integer user features are: length of user screenname, count of posts and
       favourite posts; and the binary features are: whether the profile is protected, has URL,
       profile image, uses default profile and default profile image.
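
The notation of Section 3.1 can be mirrored in a small data structure. The following is a
minimal Python sketch under stated assumptions, not the authors' implementation: the class and
field names (SourceTweet, Reaction, user_features) and the helper first_k_users are
hypothetical, and it assumes reactions are stored in time order. Only the cap of 50 users comes
from the text.

```python
from dataclasses import dataclass, field
from typing import List

MAX_PATH_USERS = 50  # the text limits the propagation path to the first 50 users


@dataclass
class Reaction:
    """One reaction r_ij = (t_ij, u_ij): a retweet, reply or quote."""
    text: str                   # t_ij: textual content of the reaction
    user_features: List[float]  # u_ij: metadata features of the reacting user


@dataclass
class SourceTweet:
    """One source tweet s_i with its label l and its reactions R_i."""
    text: str
    label: int                  # 1 = rumour, 0 = non-rumour
    reactions: List[Reaction] = field(default_factory=list)


def first_k_users(tweet: SourceTweet, k: int = MAX_PATH_USERS) -> List[List[float]]:
    """Return the user feature vectors of the first k reactions (assumed time-ordered)."""
    return [r.user_features for r in tweet.reactions[:k]]
```

A user-based model along the lines described above would consume the output of first_k_users,
while a text-based model would consume the text fields.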
   To combine both text and user models for rumour detection, we create an ensemble model that
takes the output of both models to make the final prediction. As both models produce a
probability value for the rumour class in each source tweet, we compute the mean probability and
tune a threshold 𝑝 to separate rumours from non-rumours.9

3.2. Labelled Rumour Data

We use Twitter15, Twitter16 [18], PHEME [19], and SemEval2019 [20] as training data to train our
binary rumour classification models. For Twitter15, Twitter16 and PHEME, there are originally 4
classes: truthful rumours, untruthful rumours, unverified rumours and non-rumours; we collapse
the truthful, untruthful and unverified rumours into the rumour class. SemEval2019 focuses on
veracity classification and as such has only 3 classes (truthful, untruthful and unverified);
they are all treated as the rumour class. Statistics of the datasets are presented in Table 1.

3.3. COVID-19 Twitter Data

We use a public COVID-19 Twitter dataset [21] for our analyses.10 We use version 4 of the
dataset, which contains tweets from 1st January 2020 to 5th April 2020. The dataset is regularly
updated, and collects tweets for several languages (English, French, Spanish and German) based
on COVID-19 keywords.
   As we are interested in rumour analyses in English, we filter the data to keep only source
tweets that are in English (based on Twitter metadata) and also have at least 10 replies (since
those with few reactions are of little significance for rumour analysis). Table 2 presents some
statistics of our filtered dataset. We have approximately 30M tweets post-filtering, and 60K of
them are source tweets (the remaining tweets are “reaction tweets”: retweets, replies or
quotes).11

Table 2
Filtered data statistics.

         #tweets   30,077,742
  #source tweets       60,550
          #users    8,692,422
 mean #reactions          497
  max #reactions      165,592
   mean #replies           28
    max #replies        2,177

[Plot: x-axis "Date" (2020-01-15 to 2020-04-01), y-axis "# of Tweets" (0.0 to 1.2, scale 1e6)]
Figure 1: Filtered English Tweets

   Figure 1 shows the volume of filtered English tweets over time. We can see there is some
traffic of COVID-19 related tweets from late January 2020, although it doesn’t really pick up
until mid-March. We suspect the spike of activity may be due to the World Health Organisation
declaring COVID-19 a pandemic on 12th March.12
   In terms of pre-processing, we tokenise the tweets with the TweetTokenizer [22] package of
NLTK, and lowercase and lemmatise all words with the WordNetLemmatizer package, as well as
remove digits, non-Latin characters and @usernames. We also filter stopwords based on an
extended NLTK stopword list, which includes COVID-19 specific stopwords, such as covid19 or
coronavirus. Hyperlinks are encoded with a special token for rumour classification (Section 4.1)
or removed for topic analysis (Section 4.3).

4. Results and Analysis

4.1. Rumour Classification

To assess the quality of the rumour classification models, we first evaluate the in-domain
performance on Twitter15, Twitter16 and PHEME. For each dataset, we randomly split the full data
60%/20%/20% to create the training, validation and test partitions. In-domain classification
performance is presented in Table 3 (in-domain performances are those where “Train” and “Test”
are from the same domain).13

     9 That is, the ensemble model labels a source tweet as rumour if the mean probability ≥ 𝑝.
    10 https://github.com/thepanacealab/covid19_twitter.
    11 Quote is similar to retweet, except that it contains some response to the original tweet.
       Both retweets and quotes are displayed on the user’s home page, while replies are not.
    12 https://twitter.com/WHO/status/1237777021742338049
    13 For the ensemble model, we tune the threshold 𝑝 based on the validation set, and 𝑝 ranges
       from 0.7 to 0.8.
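
The label collapsing of Section 3.2 and the random 60%/20%/20% split of Section 4.1 can be
sketched as follows. This is an illustrative Python sketch, not the authors' code: the label
strings in RUMOUR_CLASSES and the function names are hypothetical (the datasets may name their
classes differently), and the fixed seed is an assumption made only for reproducibility.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")

# Hypothetical label strings; the real datasets may use different names.
RUMOUR_CLASSES = {"true", "false", "unverified"}


def collapse_label(label: str) -> int:
    """Map a fine-grained class to the binary task: 1 = rumour, 0 = non-rumour."""
    return 1 if label in RUMOUR_CLASSES else 0


def split_60_20_20(items: Sequence[T], seed: int = 0) -> Tuple[List[T], List[T], List[T]]:
    """Randomly split items into 60% training, 20% validation and 20% test partitions."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 6 // 10  # integer arithmetic avoids float rounding
    n_val = len(shuffled) * 2 // 10
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])
```

Any remainder from the integer division falls into the test partition, so every item is used
exactly once.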
Table 3
In-domain and cross-domain classification results. “P”, “T15” and “T16” denote the PHEME,
Twitter15, and Twitter16 datasets respectively.

  Test   Train      Model       Accuracy
  T15    T15        user          0.85
  T15    T15        text          0.88
  T15    T15        user+text     0.88
  T15    P+T16      user          0.73
  T15    P+T16      text          0.71
  T15    P+T16      user+text     0.80
  T16    T16        user          0.82
  T16    T16        text          0.86
  T16    T16        user+text     0.92
  T16    P+T15      user          0.75
  T16    P+T15      text          0.78
  T16    P+T15      user+text     0.82
  P      P          user          0.63
  P      P          text          0.92
  P      P          user+text     0.81
  P      T15+T16    user          0.65
  P      T15+T16    text          0.70
  P      T15+T16    user+text     0.78

Overall, we can see the text model does better than the user model, but the ensemble model
(“user+text”) performs best.
   We next evaluate cross-domain performance. Given a test domain (e.g. Twitter15), we train the
rumour classification models using a combination of all out-of-domain data (e.g. Twitter16 and
PHEME), and assess their accuracy on the test domain. This is an arguably more difficult
setting, as there is little or no topic overlap between the different domains.
   Unsurprisingly, we see a dip in accuracy compared to the in-domain performance.
Encouragingly, however, with the ensemble model we are still getting at least 78% accuracy over
all domains, suggesting that the model is robust for cross-domain rumour detection.
   Given these results, we next train an ensemble model on all datasets
(Twitter15+Twitter16+PHEME), and use it to classify tweets on our filtered English COVID-19 data
(Section 3.3).14
   In total, out of the 60K source tweets (Table 2) 15K are classified as rumours. These rumours
(and non-rumours) will serve as the basis for user, topic and emotion analyses in subsequent
experiments.

Table 4
User statistics. Top half of the table is median statistics, bottom half mean.

                                 Rumour    Non-rumour
                   #Follower    151,521       223,651
                  #Following      1,486           976
  #Follower/#Following Ratio         63           121
                       #Post     31,433        28,644
                 Account Age      2,992         3,119
                 Geo Enabled        51%           57%

[Bar chart: rows "Rumours" and "Non-Rumours"; series Retweets, Quotes, Replies; x-axis "Average
Number of Reactions" (0 to 600)]
Figure 2: Reaction types.

[Line plot "Average # of Reactions within 48 Hours": x-axis "Time in Hours" (0 to 50), y-axis
"Average # of Reactions" (0 to 400); series by type: Rumour, Non-Rumour]
Figure 3: Reaction speed.

[Word cloud image]
Figure 4: Bigram word cloud.

    14 We set the threshold 𝑝 to 0.85, which is marginally higher than the thresholds we used in
       the cross-domain experiments to improve precision. Note that the COVID-19 data does not
       include user metadata, so we crawl it using the official Twitter API.
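
The ensemble rule (mean of the two rumour probabilities compared against a threshold 𝑝) and the
tuning of 𝑝 on a validation set can be sketched as follows. This is a minimal Python sketch, not
the authors' implementation: the function names are hypothetical, and selecting 𝑝 by validation
accuracy over a candidate grid is an assumption, since the text does not state the selection
criterion. The default of 0.85 is the value reported for the COVID-19 data.

```python
from typing import List, Sequence


def ensemble_predict(p_text: Sequence[float], p_user: Sequence[float],
                     threshold: float = 0.85) -> List[int]:
    """Label a source tweet as rumour (1) if the mean of the two probabilities >= threshold."""
    if len(p_text) != len(p_user):
        raise ValueError("both models must score the same source tweets")
    return [1 if (pt + pu) / 2 >= threshold else 0 for pt, pu in zip(p_text, p_user)]


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of predictions that match the gold labels."""
    return sum(p == l for p, l in zip(predictions, labels)) / len(labels)


def tune_threshold(p_text: Sequence[float], p_user: Sequence[float],
                   labels: Sequence[int], candidates: Sequence[float]) -> float:
    """Pick the candidate threshold with the highest validation accuracy (assumed criterion)."""
    return max(candidates,
               key=lambda p: accuracy(ensemble_predict(p_text, p_user, p), labels))
```

Raising the threshold trades recall for precision, which matches the stated motivation for
using a slightly higher 𝑝 on the COVID-19 data.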
Table 5
Salient hashtags, unigrams and bigrams in rumour and non-rumour tweets.

  Rumour       Hashtag   #WuhanVirus, #MOG, #OneVoice1, #FoxNews, #DemocratsAreDestroyingAmerica,
                         #KAG2020, #ChinaVirus, #Hydroxychloroquine, #IWillStayAtHome,
                         #ChinaLiedPeopleDied, #MasksNow, #TheMoreYouKnow, #TheResistance,
                         #StopAiringTrump, #VoteRedToSaveAmerica, #WuhanHealthOrganisation,
                         #CCP_is_terrorist, #DemCast, #BillGates, #TrumpIsTheWORSTPresidentEVER,
                         #TrumpOwnsEveryDeath, #5G
               Unigram   trump, pelosi, bill, democrat, fox, gop, american, blame, president,
                         briefing, joe, lie, hoax, medium, fail, governor, response, china, vote,
                         drug, hydroxychloroquine
               Bigram    nancy pelosi, chinese chinese, jared kushner, chinese communist, trump
                         response, held accountable, trump supporter, trish regan, speaker pelosi,
                         joe biden, bill gate, china lie, task gown, deep state, blame trump, fox
                         business
  Non-Rumour   Hashtag   #BREAKING, #StaySafe, #CoronaUpdate, #CoronavirusLockdown,
                         #IndiaFightsCorona, #CoronaOutbreak, #DonaldTrump, #COVID19PH,
                         #COVID19Pandemic, #covid19australia, #TakeResponsibility, #21daylockdown,
                         #CoronavirusPandemic, #Covid19usa, #StayHomeStaySafe, #StayAtHome,
                         #coronapocalypse, #flu, #Italia, #COVID19OhioReady, #COVID_19uk, #masks,
                         #china, #StrongerTogether
               Unigram   positive, confirm, total, india, march, symptom, health, minister, due,
                         nigeria, lockdown, update, death, infect, old, donate, day, negative,
                         cancel, wash, hand, social, hour, announce, today, data, stay, worker,
                         isolation, quarantine
               Bigram    bring total, march march, year old, total number, relief fund, number
                         confirm, patient positive, prime minister, travel history, premier league,
                         wash hand, hubei province, first death, cruise ship, health condition,
                         social care


[Four pie charts of emoji proportions, one per panel]
  (a) Rumour source tweets         (b) Non-rumour source tweets        (c) Rumour reply tweets         (d) Non-rumour reply tweets

Figure 5: Emoji Distribution for rumour vs. non-rumour tweets.


4.2. User Analysis
More than 8M users are involved in the conversations around COVID-19 in our filtered English
dataset (Table 2). We focus only on users who published the source tweets in this analysis.
Table 4 presents some statistics of these users for rumours and non-rumours.
   Interestingly, users who are involved in rumour creation tend to tweet more (higher post
counts) and follow more users but have fewer followers, resulting in a substantially lower
#Follower/#Following ratio. Their accounts are also generally younger (7.7% of rumour accounts
were created during January to May 2020, as opposed to 6.1% for non-rumour accounts).
   Figure 2 presents the average volume of different reactions toward rumours and non-rumours.
While the majority of the reactions for both are retweets, we can see that retweets and quotes
are much more popular as a response to rumours. This suggests that non-rumours tend to attract
more discussion/replies than rumours.
   Rumours tend to have high novelty in their content so as to attract propagation [23], and we
can see this in Figure 3, which shows the average volume of
(a) Rumour
  #FoxNews                         😡 45%, 👍 32%, 😥 16%
  #DemocratsAreDestroyingAmerica   😡 43%, 😠 22%, ✌ 21%
  #ChinaVirus                      😡 68%, 😷 14%, 😠 12%
  #Hydroxychloroquine              😡 42%, ❤ 29%, 🙏 13%, ✨ 12%
  #BillGates                       😡 29%, 💔 21%, 🎶 14%, 😠 10%

(b) Non-Rumour
  #StaySafe                        🙏 42%, ❤ 24%, 👊 14%
  #CoronavirusLockdown             👍 41%, 😡 33%, 😷 16%
  #covid19australia                😡 42%, 👍 39%, 🙏 12%
  #Covid19usa                      😡 41%, 🎶 23%, 👍 13%, 👀 12%
  #Italia                          👊 22%, 🎶 16%, 💪 16%, 😈 8%, 😡 8%

Figure 6: Emoji Distribution for salient hashtags in source tweets


(a) Rumour
  #FoxNews                         😡 23%, 👍 16%, 🙏 15%, ❤ 14%
  #DemocratsAreDestroyingAmerica   😡 23%, 😈 10%, 😳 9%, 😷 8%, ✌ 5%, ❤ 7%
  #ChinaVirus                      😷 25%, 😡 20%, 😈 13%, ❤ 5%
  #Hydroxychloroquine              👍 20%, 😡 19%, 🎶 16%, 🙏 13%, 👀 13%
  #BillGates                       😡 24%, 👀 22%, 🎶 16%, 😈 16%

(b) Non-Rumour
  #StaySafe                        🙏 28%, 😷 11%, 😡 9%, 👍 7%, 👀 8%, ❤ 8%
  #CoronavirusLockdown             😡 17%, 🎶 9%, 🙏 6%, 👍 5%, 😠 5%, ❤ 5%, 😳 4%
  #covid19australia                😡 19%, 🙏 12%, 👍 12%, 😈 12%, 😳 12%
  #Covid19usa                      🙏 26%, 👍 18%, 😡 16%, 🎶 14%, 👏 14%
  #Italia                          ❤ 33%, 🎶 18%, 💪 11%, 👏 9%, 👊 8%

Figure 7: Emoji Distribution for salient hashtags in responses


reactions over time for rumours and non-rumours. Although rumours tend to attract more
reactions in the first 24 hours, we see a convergence after 48 hours.

4.3. Topic Analysis
To understand the popular topics discussed on Twitter, we first present a bigram wordcloud in
Figure 4. We see several broad topics: (1) health advice (social distance, stay home, wash hand,
and wear mask); (2) US politics (president trump and joe biden); (3) UK politics (prime minister,
boris johnson, and herd immunity¹⁵); (4) blame on China (wuhan china and china lie); (5) status
reports (death toll and death rate); (6) healthcare (doctor nurse and health worker); (7) panic
buying (toilet paper¹⁶); and others.
   To better understand the topical difference between rumours and non-rumours, we compute the
log-likelihood ratio [24] of unigrams, bigrams and hashtags, and display the most salient words
in Table 5.¹⁷

   15 https://www.theatlantic.com/health/archive/2020/03/coronavirus-pandemic-herd-immunity-uk-boris-johnson/608065/.
   16 https://www.bbc.com/news/world-australia-53196525.
   17 We include both source tweets and reactions to construct the rumour and non-rumour
      “corpora”, and use NLTK’s BigramAssocMeasures to compute the log-likelihood ratio. To decide
      whether a word is salient for rumour or non-rumour, we look at its normalised frequency.

   To ease readability, we highlight some of the salient words in the table. For rumours, US
politics is one of the major topics, with both parties putting blame on each other
(#DemocratsAreDestroyingAmerica and
#TrumpIsTheWORSTPresidentEVER). Unsurprisingly, Fox News (#FoxNews and fox) is associated with
rumours.¹⁸ China is another topic, and the hashtags/bigrams suggest blaming (#ChinaVirus,
#CCP_is_terrorist, #WuhanHealthOrganisation and china lie). We also see some of the well-known
COVID-19 rumours/hoaxes: #Hydroxychloroquine, #BillGates¹⁹ and #5G.²⁰
   Looking at non-rumours, the topics are very different: they are mostly related to health
advice (#CoronavirusLockdown, #StayHomeStaySafe and wash hand) and status updates (total number,
number confirm), and more neutral/positive in tone (#StrongerTogether and #coronapocalypse).
Politics is rare, although we see prime minister, which may be related to UK politics. Another
interesting non-rumour topic observed here is the cruise ship outbreaks (cruise ship).

   18 https://www.nytimes.com/2020/03/31/opinion/coronavirus-fox-news.html.
   19 https://www.bbc.com/news/52847648.
   20 https://www.reuters.com/article/uk-factcheck-coronavirus-5g/false-claim-coronavirus-is-a-hoax-and-part-of-a-wider-5g-and-human-microchipping-conspiracy-idUSKBN22P22I.

4.4. Emotion Analysis
To understand the public sentiment during the COVID-19 crisis, we explore using an emotion
prediction system to classify the emotion of tweets in our data. We experiment with DeepMoji
[25], a Bi-LSTM with attention model trained on a large number of emoji occurrences in tweets.
We use their pre-trained model to label our data with 63 predefined emojis.
   Figure 5 illustrates the distribution of emojis for source and reply tweets in rumours and
non-rumours. Looking at the emotions of source tweets (Figure 5(a) and (b)), “anger” dominates
both rumours and non-rumours, but substantially more in rumours than non-rumours (54% vs. 34%).
Non-rumours also see more “thumbs up” (encouragement), although the difference is less
pronounced (25% vs. 18%).
   For reply tweets (Figure 5(c) and (d)), we see a similar distribution for the top-3 emotions
(“anger”, “thumbs up” and “mask face”), but the interesting observation here is the emojis for
the rest (left half of the pie chart): the reply tweets for rumours display disapproving
sentiments (e.g. “punch” and “frown”), while those of non-rumours are generally positive and
encouraging in tone (“pray”, “love” and “biceps”).
   We next present the emoji distribution for some of the salient hashtags for the source and
reaction tweets in Figures 6 and 7 respectively, to see how public attitudes towards different
topics vary across rumours and non-rumours. For rumour source tweets (Figure 6(a)), anger
dominates all hashtags, although #ChinaVirus source tweets are substantially “angrier” (68%!).
Anger in non-rumour source tweets (Figure 6(b)) is a little more toned down; interestingly the
dominant emotion for the global lockdown (#CoronavirusLockdown) is more positive than negative
(41% “thumbs up” vs. 33% “angry”).
   Moving over to the emoji distribution for reactions towards rumour tweets (Figure 7(a)), we
see anger in all hashtags, but some of the other emotions are rather curious, e.g. “thumbs up”
(approval) for #Hydroxychloroquine, and “googly eyes” (attention drawing) for #BillGates.
Unsurprisingly though, reactions for all non-rumour hashtags (Figure 7(b)) are dominated by
“prayers” and approval emojis (“thumbs up” and “biceps”), suggesting that despite the general
doom and gloom atmosphere of COVID-19, there is still a sense of positivity.

5. Conclusion
We explored an ensemble model combining text-based and user-based rumour detection models to
classify COVID-19 related rumours on Twitter. We presented a quantitative evaluation to
demonstrate its robustness in cross-domain rumour detection, analysed the users, topics and
emotions of rumours vs. non-rumours, and found a number of insights.

Acknowledgements
This work is partially supported by the Australian Research Council Discovery Project
DP200101441.

References
 [1] S. Vieweg, A. L. Hughes, K. Starbird, L. Palen, Microblogging during two natural hazards
     events: what Twitter may contribute to situational awareness, in: Proceedings of the SIGCHI
     conference on human factors in computing systems, 2010, pp. 1079–1088.
 [2] E. A. Meyerowitz, A. G. Vannier, M. G. Friesen, S. Schoenfeld, J. A. Gelfand,
     M. V. Callahan, A. Y. Kim, P. M. Reeves, M. C. Poznansky, Rethinking the role of
     hydroxychloroquine in the treatment of COVID-19, The FASEB Journal 34 (2020) 6027–6037.
 [3] D. N. Juurlink, Safety considerations with chloroquine, hydroxychloroquine and azithromycin
     in the management of SARS-CoV-2 infection, CMAJ 192 (2020) E450–E453.
 [4] B. Wang, J. Zhuang, Crisis information distribution on Twitter: a content analysis of
     tweets during Hurricane Sandy, Natural hazards 89 (2017) 161–181.
 [5] G. W. Allport, L. Postman, The psychology of rumor. (1947).
 [6] K. Shu, A. Sliva, S. Wang, J. Tang, H. Liu, Fake news detection on social media: A data
     mining perspective, ACM SIGKDD Explorations Newsletter 19 (2017) 22–36.
 [7] K. Zhou, C. Shu, B. Li, J. H. Lau, Early rumour detection, in: Proceedings of the 2019
     Conference of the North American Chapter of the Association for Computational Linguistics:
     Human Language Technologies, Volume 1 (Long and Short Papers), 2019, pp. 1614–1623.
 [8] A. Gupta, P. Kumaraguru, C. Castillo, P. Meier, Tweetcred: Real-time credibility assessment
     of content on Twitter, in: International Conference on Social Informatics, Springer, 2014,
     pp. 228–243.
 [9] C. Castillo, M. Mendoza, B. Poblete, Information credibility on Twitter, in: Proceedings of
     the 20th international conference on World wide web, ACM, 2011, pp. 675–684.
[10] X. Liu, A. Nourbakhsh, Q. Li, R. Fang, S. Shah, Real-time rumor debunking on Twitter, in:
     Proceedings of the 24th ACM International on Conference on Information and Knowledge
     Management, ACM, 2015, pp. 1867–1870.
[11] M. E. Larsen, T. W. Boonstra, P. J. Batterham, B. O’Dea, C. Paris, H. Christensen, We feel:
     mapping emotion on Twitter, IEEE journal of biomedical and health informatics 19 (2015)
     1246–1252.
[12] N. Farruque, C. Huang, O. Zaiane, R. Goebel, Basic and depression specific emotion
     identification in Tweets: multi-label classification experiments, in: The 20th
     International Conference on Intelligent Text Processing and Computational Linguistics
     (CICLing), 2019.
[13] I. Li, Y. Li, T. Li, S. Alvarez-Napagao, D. Garcia, What are we depressed about when we
     talk about COVID19: Mental health analysis on tweets using natural language processing,
     arXiv preprint arXiv:2004.10899 (2020).
[14] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional
     transformers for language understanding, in: Proceedings of the 2019 Conference of the
     North American Chapter of the Association for Computational Linguistics: Human Language
     Technologies, Volume 1 (Long and Short Papers), Minneapolis, Minnesota, 2019,
     pp. 4171–4186.
[15] K. Sharma, S. Seo, C. Meng, S. Rambhatla, A. Dua, Y. Liu, Coronavirus on social media:
     Analyzing misinformation in Twitter conversations, arXiv preprint arXiv:2003.12309 (2020).
[16] L. Tian, X. Zhang, Y. Wang, H. Liu, Early detection of rumours on Twitter via stance
     transfer learning, in: European Conference on Information Retrieval, Springer, 2020,
     pp. 575–588.
[17] Y. Liu, Y.-F. B. Wu, Early detection of fake news on social media through propagation path
     classification with recurrent and convolutional networks, in: Thirty-Second AAAI Conference
     on Artificial Intelligence, 2018.
[18] J. Ma, W. Gao, K.-F. Wong, Detect rumors in microblog posts using propagation structure via
     kernel learning, in: Proceedings of the 55th Annual Meeting of the Association for
     Computational Linguistics (Volume 1: Long Papers), 2017, pp. 708–717.
[19] E. Kochkina, M. Liakata, A. Zubiaga, All-in-one: Multi-task learning for rumour
     verification, arXiv preprint arXiv:1806.03713 (2018).
[20] G. Gorrell, E. Kochkina, M. Liakata, A. Aker, A. Zubiaga, K. Bontcheva, L. Derczynski,
     Semeval-2019 task 7: Rumoureval, determining rumour veracity and support for rumours, in:
     Proceedings of the 13th International Workshop on Semantic Evaluation, 2019, pp. 845–854.
[21] J. M. Banda, R. Tekumalla, G. Wang, J. Yu, T. Liu, Y. Ding, G. Chowell, A large-scale
     COVID-19 Twitter chatter dataset for open scientific research–an international
     collaboration, arXiv preprint arXiv:2004.03688 (2020).
[22] S. Bird, E. Klein, E. Loper, Natural Language Processing with Python, 1st ed., O’Reilly
     Media, Inc., 2009.
[23] S. Vosoughi, D. Roy, S. Aral, The spread of true and false news online, Science 359 (2018)
     1146–1151.
[24] T. E. Dunning, Accurate methods for the statistics of surprise and coincidence,
     Computational linguistics 19 (1993) 61–74.
[25] B. Felbo, A. Mislove, A. Søgaard, I. Rahwan, S. Lehmann, Using millions of emoji
     occurrences to learn any-domain representations for detecting sentiment, emotion and
     sarcasm, in: Proceedings of the 2017 Conference on Empirical Methods in Natural Language
     Processing, Copenhagen, Denmark, 2017, pp. 1615–1625.
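
Illustrative sketch for Section 4.3 (log-likelihood ratio of terms). The paper reports using
NLTK's BigramAssocMeasures for this step; the sketch below is not the authors' code. It computes
Dunning's statistic [24] directly with the Python standard library so that it runs without
dependencies. The function names, the toy token lists and the use of relative frequency as the
"normalised frequency" are assumptions made for illustration only.

```python
import math
from collections import Counter


def log_likelihood_ratio(k_a, n_a, k_b, n_b):
    """Dunning's G^2 for a term seen k_a times in n_a tokens vs. k_b times in n_b tokens."""
    total = n_a + n_b
    term_total = k_a + k_b
    observed = [k_a, n_a - k_a, k_b, n_b - k_b]
    expected = [
        n_a * term_total / total,
        n_a * (total - term_total) / total,
        n_b * term_total / total,
        n_b * (total - term_total) / total,
    ]
    # Cells with a zero count contribute nothing (limit of x * ln x as x -> 0).
    return 2.0 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)


def salient_terms(rumour_tokens, non_rumour_tokens, top_n=10):
    """Rank terms by G^2 and tag each with the corpus where it is relatively more frequent.

    Both token lists are assumed to be non-empty and to hold one kind of term
    (e.g. all unigrams, all bigrams or all hashtags).
    """
    rumour_counts = Counter(rumour_tokens)
    other_counts = Counter(non_rumour_tokens)
    n_r, n_o = len(rumour_tokens), len(non_rumour_tokens)
    scored = []
    for term in set(rumour_counts) | set(other_counts):
        k_r, k_o = rumour_counts[term], other_counts[term]
        score = log_likelihood_ratio(k_r, n_r, k_o, n_o)
        # Assumption: "normalised frequency" is the term count divided by corpus size.
        side = "rumour" if k_r / n_r > k_o / n_o else "non-rumour"
        scored.append((term, side, score))
    # Highest score first; ties are broken alphabetically so the order is reproducible.
    scored.sort(key=lambda item: (-item[2], item[0]))
    return scored[:top_n]


# Toy corpora (hypothetical tokens, not taken from the dataset).
rumour = "china lie china lie fox news stay home".split()
non_rumour = "stay home wash hand stay safe total number".split()
for term, side, score in salient_terms(rumour, non_rumour, top_n=5):
    print(f"{term}\t{side}\t{score:.3f}")
```

A term that is equally frequent in both corpora scores zero, so only terms whose relative
frequency differs between rumours and non-rumours rise to the top of the ranking.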

</pre>