PROTECT: A Pipeline for Propaganda Detection and Classification

Vorakit Vorakitphan, Elena Cabrio, Serena Villata
Université Côte d'Azur, Inria, CNRS, I3S, France
vorakit.vorakitphan@inria.fr, elena.cabrio@univ-cotedazur.fr, villata@i3s.unice.fr

Abstract

Propaganda is a rhetorical technique to present opinions with the deliberate goal of influencing the opinions and the actions of other (groups of) individuals for predetermined misleading ends. The employment of such manipulation techniques in politics and news articles, as well as their subsequent spread on social networks, may lead to threatening consequences for society and its more vulnerable members. In this paper, we present PROTECT (PROpaganda Text dEteCTion), a new system to automatically detect propagandist messages and classify them along with the propaganda techniques employed. PROTECT is designed as a full pipeline that first detects propaganda text snippets in the input text, and then classifies the propaganda technique of each snippet, taking advantage of semantic and argumentation features. A video demo of the PROTECT system is also provided to show its main functionalities.

(Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).)

1 Introduction

Propaganda represents an effective but often misleading communication strategy which is employed to promote a certain viewpoint, for instance in the political context (Lasswell, 1938; Koppang, 2009; Dillard and Pfau, 2009; Longpre et al., 2019). The goal of this communication strategy is to persuade the audience of the goodness of such a viewpoint by means of misleading and/or partial arguments, which is particularly harmful for the more vulnerable public in society (e.g., young or elderly people). Therefore, the ability to detect occurrences of propaganda in political discourse and newspaper articles is of primary importance, and Natural Language Processing methods and technologies play a main role in this context, addressing the propaganda detection and classification task (Da San Martino et al., 2019; Da San Martino et al., 2020a). It is, in particular, important to make this vulnerable public aware of the problem and to provide them with tools able to raise their awareness and develop their critical thinking.
To achieve this ambitious goal, we present in this paper a new tool called PROTECT (PROpaganda Text dEteCTion) to automatically identify and classify propaganda in texts. In the current version, only English text is processed. The tool has been designed with an easy-to-access user interface and a web-service API to ensure a wide public use of PROTECT online. To the best of our knowledge, PROTECT is the first online tool for propagandist text identification and classification with an interface allowing users to submit their own text to be analysed (a video demonstrating the PROTECT tool is available at https://1drv.ms/u/s!Ao-qMrhQAfYtkzD69JPAYY3nSFub?e=oUQbxQ).

PROTECT offers two main functionalities: i) the automatic propaganda detection and classification service, which allows the user to paste or upload a text and returns the text with the propagandist snippets highlighted in different colors depending on the propaganda technique employed, and ii) the propaganda word clouds, which provide an easy-to-grasp visualisation of the identified propagandist text snippets. PROTECT is deployed as a web-service API, allowing users to download the output (the text annotated with the identified propaganda techniques) as a JSON file. The PROTECT tool relies on a pipeline architecture, first detecting the propaganda text snippets, and then classifying each snippet with respect to a specific propaganda technique. We cast this task as a sentence-span classification problem and address it relying on a transformer architecture. Results reach SoTA performances on the tasks of propaganda detection and classification (for a comparison with SoTA algorithms, we refer to (Vorakitphan et al., 2021)).

The paper is structured as follows: first, Section 2 discusses the state of the art in propaganda detection and classification and compares our contribution to the literature.
Then Section 3 describes the pipeline for the detection and classification of propaganda text snippets, as well as the datasets used for the evaluation and the obtained results. Section 4 describes the functionalities of the web interface, followed by the Conclusions.

2 Related Work

In the last years, there has been an increasing interest in investigating methods for textual propaganda detection and classification. Among them, (Barrón-Cedeño et al., 2019) present a system to organize news events according to the level of propagandist content in the articles, and introduce a new corpus (QProp) annotated with the propaganda vs. trustworthy classes, providing information about the source of the news articles. Recently, a web demo named Prta (Da San Martino et al., 2020b) has been proposed, trained on disinformation articles. This demo allows a user to enter a plain text or a URL, but it does not allow users to download the results. Similarly to PROTECT, Prta shows the propagandist messages at the snippet level, with an option to filter the propaganda techniques to be shown based on the confidence rate, and it also analyzes the usage of propaganda techniques on determined topics. The implementation of this system relies on the approach proposed in (Da San Martino et al., 2019).

The most recent approaches for propaganda detection are based on language models, mostly involving transformer-based architectures. The approach that performed best on the NLP4IF'19 sentence-level classification task relies on the BERT architecture with hyperparameter tuning and no activation function (Mapes et al., 2019). (Yoosuf and Yang, 2019) focused first on the pre-processing steps, to provide more information regarding the language model along with the existing propaganda techniques, and then employed the BERT architecture, casting the task as a sequence labeling problem. The systems that took part in the SemEval 2020 Challenge - Task 11 represent the most recent approaches to identify propaganda techniques in given propagandist spans. The most interesting and successful approach (Jurkiewicz et al., 2020) proposes first to extend the training data from a free text corpus as a silver dataset, and second an ensemble model that exploits both the gold and silver datasets during training to achieve the highest scores.

As most of the above mentioned systems, PROTECT also relies on language model architectures for the detection and classification of propaganda messages, empowering them with a rich set of features we identified as pivotal in propagandist text from the computational social science literature (Vorakitphan et al., 2021). In particular, (Morris, 2012) discusses how emotional markers and affect at word- or phrase-level are employed in propaganda text, whilst (Ahmad et al., 2019) show that the most effective technique to extract sentiment for the propaganda detection task is to rely on lexicon-based tailored dictionaries. (Li et al., 2017) show how to detect degrees of strength, from calmness to exaggeration, in press releases. Finally, (Troiano et al., 2018) focus on feature extraction for text exaggeration and show that the main factors include imageability, unexpectedness, and the polarity of a sentence.

3 Propaganda Detection and Classification

PROTECT addresses the task of propaganda technique detection and classification at fragment level, meaning that both the spans and the type of propaganda technique are identified and highlighted in the input sentences. In the following, we describe the datasets used to train and test PROTECT, and the approach implemented in the system to address the task.

3.1 Datasets

To evaluate the approach on which PROTECT relies, we use two standard benchmarks for propaganda detection and classification, namely the NLP4IF'19 (Da San Martino et al., 2019) and SemEval'20 (Da San Martino et al., 2020a) datasets.
The former was made available for the shared task NLP4IF'19 on fine-grained propaganda detection (https://propaganda.qcri.org/nlp4if-shared-task/): 18 propaganda techniques are annotated on 469 articles (293 in the training set, 75 in the development set, and 101 in the test set). As a follow-up, in 2020 SemEval proposed a shared task, T11 (https://propaganda.qcri.org/semeval2020-task11/), reducing the number of propaganda categories with respect to NLP4IF'19 (14 categories, 371 articles in the training set and 75 in the development set). PROTECT detects and classifies the same list of 14 propaganda techniques as in the SemEval task, namely: Appeal to Authority; Appeal to Fear-Prejudice; Bandwagon, Reductio ad Hitlerum; Black-and-White Fallacy; Causal Oversimplification; Doubt; Exaggeration, Minimisation; Flag-Waving; Loaded Language; Name-Calling, Labeling; Repetition; Slogans; Thought-terminating Clichés; Whataboutism, Straw Men, Red Herring.

These classes are not uniformly distributed in the datasets. Loaded Language and Name-Calling, Labeling are the classes with the highest number of instances (representing respectively 32% and 15% of the propagandist messages over all the above-mentioned datasets). The classes with the lowest number of instances are Whataboutism, Red Herring, Bandwagon, and Straw Men, occurring respectively in 1%, 0.87%, 0.29%, and 0.23% of the NLP4IF'19 data. In SemEval'20 T11 such labels were merged, and the resulting classes Whataboutism, Straw Men, Red Herring and Bandwagon, Reductio ad Hitlerum represent respectively 1.33% and 1.29% of the propagandist messages.

3.2 PROTECT Architecture

Given a textual document or a paragraph as input, the system performs two steps. First, it performs a binary classification at token level, to label each token as propagandist or not. Then, it classifies propagandist tokens according to the 14 propaganda categories from the SemEval task (T11). For instance, given the example "Manchin says Democrats acted like babies at the SOTU (video) Personal Liberty Poll Exercise your right to vote.", the snippet "babies" is first classified as propaganda (Step 1), and then more specifically as an instance of the Name-Calling, Labeling propaganda technique (Step 2).

Step 1: Propaganda Snippet Detection. To train PROTECT, we merge the training, development and test sets from NLP4IF'19, and the training set from SemEval'20 T11. The development set from SemEval'20 T11 is instead used to evaluate the system performances (the gold annotations of the SemEval'20 test set are not available, which is why we selected the development set for evaluation). In the preprocessing phase, each sentence is tokenized and each token is tagged with a label according to the IOB format. For the binary classification, we adopt a Pre-trained Language Model (PLM) based on the BERT architecture (bert-base-uncased model) (Devlin et al., 2019). The hyperparameters are a learning rate of 5e-5, a batch size of 8, and a maximum sequence length of 128. For the evaluation, we compute standard classification metrics at the token level (https://scikit-learn.org/stable/modules/generated/sklearn.metrics.precision_recall_fscore_support.html). The results obtained by the binary classifier (macro average over 5 runs) on the SemEval'20 T11 development set are 0.71 precision, 0.77 recall and 0.72 F-measure, using Softmax as activation function (we are aware that sigmoid is usually the default activation function in binary classification; however, in our setting we tested both functions and obtained better performances with Softmax, +0.04 F1 with respect to sigmoid).

We then perform a post-processing step to automatically join tokens labelled with the same propaganda technique into the same textual span. Given that the PLM is applied at token level, each token is processed into sub-words (e.g., "running" is tokenized and cut into two tokens: "run" and "##ing"). Such sub-words can mislead the classifier. For instance, in the sentence "The next day, Biden said, he was informed by Indian press that there were at least a few Bidens in India.", our system detects "least a few Bidens in" as a propagandist snippet, but it misclassifies one sub-word ("at" was not considered as part of "at least", and was therefore excluded from the propagandist snippet).
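To make Step 1 concrete, the following minimal sketch shows how the token-level binary detection and the span-joining post-processing described above can be implemented with the HuggingFace transformers library. It is an illustration under the setup reported in this section (bert-base-uncased, maximum length 128), not the released PROTECT code: the fine-tuning loop (learning rate 5e-5, batch size 8) is omitted, a fine-tuned checkpoint is assumed, and all function names are ours.

```python
# Sketch of Step 1: token-level binary propaganda detection followed by the
# span-joining post-processing. Illustrative code, not the released PROTECT
# implementation; a fine-tuned checkpoint (lr 5e-5, batch size 8) is assumed.
import torch
from transformers import BertTokenizerFast, BertForTokenClassification

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForTokenClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # 0 = non-propaganda, 1 = propaganda
model.eval()

def detect_spans(sentence: str):
    """Label each (sub-word) token, then join consecutive propagandist
    tokens into character-level (start, end) spans of the input sentence."""
    enc = tokenizer(sentence, max_length=128, truncation=True,
                    return_offsets_mapping=True, return_tensors="pt")
    offsets = enc.pop("offset_mapping")[0].tolist()
    with torch.no_grad():
        labels = model(**enc).logits[0].argmax(dim=-1).tolist()
    spans, current = [], None
    for (start, end), label in zip(offsets, labels):
        if start == end:          # [CLS]/[SEP] tokens carry empty offsets
            continue
        if label == 1:
            if current is not None and start <= current[1] + 1:
                current[1] = end  # extend the open span (handles "##" pieces)
            else:
                if current is not None:
                    spans.append(tuple(current))
                current = [start, end]
        elif current is not None:
            spans.append(tuple(current))
            current = None
    if current is not None:
        spans.append(tuple(current))
    return [(s, e, sentence[s:e]) for s, e in spans]

# Token-level evaluation then uses standard metrics, e.g.:
# from sklearn.metrics import precision_recall_fscore_support
# p, r, f1, _ = precision_recall_fscore_support(y_true, y_pred, average="macro")
```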
Step 2: Propaganda Technique Classification. We cast this task as a sentence-span multi-class classification problem. More specifically, both the tokenized sentence and the span are used to feed the transformer-based model RoBERTa (roberta-base pre-trained model, https://huggingface.co/transformers/model_doc/roberta.html) (Liu et al., 2019), which performs both a sentence classification and a span classification. More precisely: i) we input the sentence to the tokenizer with a maximum length of 128 and padding; ii) we input the span provided by the propaganda span-template from the SemEval T11 dataset, with a maximum length of 20 and padding. The RoBERTa tokenizer is applied in both cases. If a sentence does not contain propaganda spans, it is labeled as "none-propaganda".

To take into account context features at sentence level, a BiLSTM is introduced. For each sentence, semantic and argumentation features are extracted following the methodology proposed in (Vorakitphan et al., 2021) and given in input to the BiLSTM model (hyperparameters: hidden size of 256, 1 hidden layer, dropout of 0.1, with a ReLU function at the last layer before the joint loss function). Such features proved to be useful to improve the performances of our approach on propagandist message classification, obtaining SoTA results on some categories (in (Vorakitphan et al., 2021) we provide a comparison of our model with SoTA systems on both the NLP4IF and SemEval datasets).

To combine the results of the sentence-span based RoBERTa with those of the feature-based BiLSTM, we apply the joint loss strategy proposed in (Vorakitphan et al., 2021). Each model produces a loss per batch using the CrossEntropy loss function, and the losses are combined as:

$loss_{joint} = \alpha \times \frac{loss_{sentence} + loss_{span} + loss_{semantic\,argumentation\,features}}{N_{loss}}$

where each loss value is produced by the CrossEntropy function of its classifier ($loss_{sentence}$ and $loss_{span}$ from the RoBERTa models of sentence and span, $loss_{semantic\,argumentation\,features}$ from the BiLSTM model), and $N_{loss}$ is the number of combined losses.
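The sketch below illustrates how these three training signals can be wired together: the two RoBERTa inputs (sentence, maximum length 128; span, maximum length 20), a BiLSTM over the per-sentence feature vectors, and the joint loss of the formula above. It is a sketch under the stated hyperparameters, not the released implementation: module and variable names are ours, the feature extraction of (Vorakitphan et al., 2021) is left abstract, and the scaling factor alpha is treated as a free hyperparameter.

```python
# Sketch of Step 2's joint loss: two RoBERTa classifiers (sentence, span)
# plus a BiLSTM over semantic/argumentation features; illustrative only.
import torch
import torch.nn as nn
from transformers import RobertaTokenizer, RobertaForSequenceClassification

NUM_CLASSES = 15  # 14 propaganda techniques + "none-propaganda"

tok = RobertaTokenizer.from_pretrained("roberta-base")
sent_model = RobertaForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=NUM_CLASSES)
span_model = RobertaForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=NUM_CLASSES)

class FeatureBiLSTM(nn.Module):
    """BiLSTM over per-sentence feature vectors (hidden size 256, 1 layer,
    dropout 0.1, ReLU before the final layer, per the stated settings)."""
    def __init__(self, feat_dim: int):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, 256, num_layers=1,
                            batch_first=True, bidirectional=True)
        self.drop = nn.Dropout(0.1)
        self.out = nn.Linear(2 * 256, NUM_CLASSES)

    def forward(self, feats):            # feats: (batch, steps, feat_dim)
        hidden, _ = self.lstm(feats)
        return self.out(torch.relu(self.drop(hidden[:, -1])))

feat_model = FeatureBiLSTM(feat_dim=32)  # feat_dim depends on the feature set
ce = nn.CrossEntropyLoss()

def joint_loss(sent_logits, span_logits, feat_logits, labels, alpha=1.0):
    """alpha * (loss_sentence + loss_span + loss_features) / N_loss."""
    losses = [ce(sent_logits, labels), ce(span_logits, labels),
              ce(feat_logits, labels)]
    return alpha * sum(losses) / len(losses)

# Dual tokenization as described above: sentence and candidate span are
# encoded separately, with maximum lengths 128 and 20 respectively.
sent_enc = tok("Democrats acted like babies at the SOTU.", max_length=128,
               padding="max_length", truncation=True, return_tensors="pt")
span_enc = tok("babies", max_length=20, padding="max_length",
               truncation=True, return_tensors="pt")
```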
To train the above-mentioned methods for the propaganda technique classification task, we merged the datasets of NLP4IF'19 and SemEval'20 T11 (same setting as in Step 1). We then tested the full pipeline of PROTECT on the development set from SemEval'20 T11: the output of the snippet detection task (Step 1) is provided as a span-pattern to the models performing Step 2. Table 1 reports the results of the full pipeline (Step 1 + Step 2), averaged over 5 runs (we cannot provide a fair comparison of these results with SoTA systems, given that in SemEval the two tasks are evaluated separately and no pipeline results are provided). We can notice, however, that our results in a pipeline are comparable with those obtained in (Vorakitphan et al., 2021) on the two separate tasks.

Propaganda Technique                     PLM: RoBERTa
Appeal to Authority                      0.48
Appeal to Fear-Prejudice                 0.57
Bandwagon, Reductio ad Hitlerum          0.72
Black-and-White Fallacy                  0.38
Causal Oversimplification                0.70
Doubt                                    0.74
Exaggeration, Minimisation               0.67
Flag-Waving                              0.88
Loaded Language                          0.88
Name-Calling, Labeling                   0.85
Repetition                               0.70
Slogans                                  0.72
Thought-terminating Clichés              0.52
Whataboutism, Straw Men, Red Herring     0.55
Average                                  0.67

Table 1: Results on sentence-span classification on the SemEval'20 T11 development set (micro-F1), using the span patterns produced by the binary classification step (Step 1).

Given the high complexity of the propaganda technique classification task and the classes' unbalance, some examples are misclassified by the system. For instance, in the sentence "The Mueller probe saw several within Trump's orbit indicted, but not Trump's family or Trump himself", the system annotated the italicized snippet as "Name Calling,Labeling", while the correct label would have been "Repetition".

4 PROTECT Functionalities

As previously introduced, PROTECT allows a user to input plain text and retrieve the propagandist spans in the message as output by the system. In the current version, two services are provided through the web interface (and the API), described in the following.

4.1 Service 1: Propaganda Techniques Classification

The system accepts plain English text as input, over which the architecture described in Section 3.2 is run. The output consists of an annotated version of the input text, where the propagandist snippets detected by the system are highlighted in different colours, each colour being distinctive of a certain propaganda technique. Figure 1 shows an example of the PROTECT web interface. Checkboxes on the right side of the page provide the key to interpret the colours, and allow the user to check or un-check (i.e., highlight or not) the different propagandist snippets in the text, filtering the results. Faded to dark colours represent the confidence level of the prediction (the darker the colour, the higher the system confidence). Snippets in bold contain multiple propaganda techniques in the same text span, which can be unveiled by hovering with the mouse over the snippet.

[Figure 1: PROTECT Interface: Propaganda Techniques Classification]

As said before, PROTECT can also be used through the provided API, and the annotated text can be downloaded as a JSON file reporting, for each sentence, the detected propagandist snippet(s) at character indices (start to end indices of a snippet), the propaganda technique(s) used, and the confidence score(s).
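For illustration, a downloaded annotation for the "babies" example of Section 3.2 could look as follows; the field names are hypothetical, chosen to mirror the description above rather than the exact PROTECT schema.

```python
# Hypothetical shape of the downloadable JSON output (field names are
# illustrative; only the content follows the description above).
example_annotation = {
    "sentence": "Manchin says Democrats acted like babies at the SOTU.",
    "snippets": [
        {
            "start": 34,                       # character index (inclusive)
            "end": 40,                         # character index (exclusive)
            "text": "babies",
            "technique": "Name Calling,Labeling",
            "confidence": 0.91,                # hypothetical score
        }
    ],
}
```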
4.2 Service 2: Propaganda Word Clouds

The propagandist snippets output by the system can also be displayed as word clouds (see Figure 2), where the size of a word represents the confidence score of the prediction and its colour the propaganda technique (as in Service 1). If multiple techniques are found in the same snippet, the snippet is duplicated in the word cloud. As for the first service, a checkbox on the right side of the word clouds allows the user to select the propagandist techniques to be visualized, and a JSON file with the system predictions can be downloaded. The word cloud service has been added to PROTECT in addition to the standard visualization, to provide a different and informative way to summarise the propaganda techniques used on a topic, and to facilitate their identification.

[Figure 2: PROTECT Interface: Word Cloud]

5 Conclusions

In this paper, we presented PROTECT, a propaganda detection and classification tool. PROTECT relies on a pipeline that detects propaganda snippets in plain text and classifies them by technique. We evaluated the proposed pipeline on standard benchmarks, achieving state-of-the-art results. PROTECT is deployed as a web-service API that accepts plain text as input and returns downloadable annotated text for further usage. In addition, a propaganda word cloud service allows users to gain further insights from such text.

Acknowledgments

This work is partially supported by the ANSWER project PIA FSN2 n. P159564-2661789/DOS0060094 between Inria and Qwant. This work has also been supported by the French government, through the 3IA Côte d'Azur Investments in the Future project managed by the National Research Agency (ANR) with the reference number ANR-19-P3IA-0002.

References

Siti Rohaidah Ahmad, Muhammad Zakwan Muhammad Rodzi, Nurlaila Syafira Shapiei, Nurhafizah Moziyana Mohd Yusop, and Suhaila Ismail. 2019. A review of feature selection and sentiment analysis technique in issues of propaganda. International Journal of Advanced Computer Science and Applications, 10(11).

Alberto Barrón-Cedeño, Israa Jaradat, Giovanni Da San Martino, and Preslav Nakov. 2019. Proppy: Organizing the news based on their propagandistic content. Information Processing & Management, 56(5).

Giovanni Da San Martino, Seunghak Yu, Alberto Barrón-Cedeño, Rostislav Petrov, and Preslav Nakov. 2019. Fine-grained analysis of propaganda in news article. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 5636–5646, Hong Kong, China, November. Association for Computational Linguistics.

Giovanni Da San Martino, Alberto Barrón-Cedeño, Henning Wachsmuth, Rostislav Petrov, and Preslav Nakov. 2020a. SemEval-2020 task 11: Detection of propaganda techniques in news articles. In Proceedings of the Fourteenth Workshop on Semantic Evaluation, pages 1377–1414, Barcelona (online), December. International Committee for Computational Linguistics.

Giovanni Da San Martino, Shaden Shaar, Yifan Zhang, Seunghak Yu, Alberto Barrón-Cedeño, and Preslav Nakov. 2020b. Prta: A system to support the analysis of propaganda techniques in the news. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pages 287–293, Online, July. Association for Computational Linguistics.

Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Minneapolis, MN, USA, June 2-7, 2019, Volume 1 (Long and Short Papers), pages 4171–4186. Association for Computational Linguistics.

James Price Dillard and Michael Pfau. 2009. The Persuasion Handbook: Developments in Theory and Practice. Sage Publications, Inc.

Dawid Jurkiewicz, Łukasz Borchmann, Izabela Kosmala, and Filip Graliński. 2020. ApplicaAI at SemEval-2020 task 11: On RoBERTa-CRF, span CLS and whether self-training helps them. In Proceedings of the Fourteenth Workshop on Semantic Evaluation, pages 1415–1424, Barcelona (online), December. International Committee for Computational Linguistics.
Haavard Koppang. 2009. Social influence by manipulation: A definition and case of propaganda. Middle East Critique, 18:117–143.

Harold Dwight Lasswell. 1938. Propaganda Technique in the World War.

Yingya Li, Jieke Zhang, and Bei Yu. 2017. An NLP analysis of exaggerated claims in science news. In Proceedings of the 2017 EMNLP Workshop: Natural Language Processing meets Journalism, pages 106–111, Copenhagen, Denmark, September. Association for Computational Linguistics.

Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019. RoBERTa: A robustly optimized BERT pretraining approach. CoRR, abs/1907.11692.

Liane Longpre, Esin Durmus, and Claire Cardie. 2019. Persuasion of the undecided: Language vs. the listener. In Proceedings of the 6th Workshop on Argument Mining, pages 167–176, Florence, Italy, August. Association for Computational Linguistics.

Norman Mapes, Anna White, Radhika Medury, and Sumeet Dua. 2019. Divisive language and propaganda detection using multi-head attention transformers with deep learning BERT-based language models for binary classification. In Proceedings of the Second Workshop on Natural Language Processing for Internet Freedom: Censorship, Disinformation, and Propaganda, pages 103–106, Hong Kong, China, November. Association for Computational Linguistics.

Travis Morris. 2012. Extracting and networking emotions in extremist propaganda. In 2012 European Intelligence and Security Informatics Conference, pages 53–59.

Enrica Troiano, Carlo Strapparava, Gözde Özbal, and Serra Sinem Tekiroğlu. 2018. A computational exploration of exaggeration. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 3296–3304, Brussels, Belgium, October–November. Association for Computational Linguistics.

Vorakit Vorakitphan, Elena Cabrio, and Serena Villata. 2021. "Don't discuss": Investigating semantic and argumentative features for supervised propagandist message detection and classification. In Recent Advances in Natural Language Processing (RANLP 2021), Varna (Online), Bulgaria, September.

Shehel Yoosuf and Yin Yang. 2019. Fine-grained propaganda detection with fine-tuned BERT. In Proceedings of the Second Workshop on Natural Language Processing for Internet Freedom: Censorship, Disinformation, and Propaganda, pages 87–91, Hong Kong, China, November. Association for Computational Linguistics.