<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>UniBO at CheckThat! 2024: Multi-lingual and Multi-label Persuasion Technique Detection in News with Data Augmentation and Sequence-Token Classifiers</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Paolo Gajo</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Luca Giordano</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alberto Barrón-Cedeño</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>DIT, Alma Mater Studiorum - Università di Bologna.</institution>
          <addr-line>Forlì</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <abstract>
        <p>With the widespread use of the Internet and the rise of algorithmic journalism, consumers of news are exposed more than ever before to manipulative, propagandistic, and deceptive content. As a result, major public events and debates on relevant topics can be significantly influenced. This creates an increasing demand for automated tools that help experts analyze the news ecosystem. We explored persuasion technique detection in multi-lingual news as part of the CheckThat! Lab Task 3. Our pipeline comprises two parts. The first part is a data augmentation module, which uses a BERT-based model fine-tuned for word-alignment to project labels from source texts to machine-translated target texts. The second one is a persuasion technique classification module, leveraging two fine-tuned BERT-based models: a sequence classifier for detecting sentences containing persuasion techniques and a set of 23 token-level classifiers for specific techniques. Our approach, trained on augmented multilingual data with class weighting and a high decision threshold of 0.9, is competitive in all language settings, showing hints of cross-lingual transfer. Despite the research efforts in this direction, exemplified by shared tasks, detecting persuasion techniques, especially across languages, remains challenging due to their implicit and subtle nature.</p>
      </abstract>
      <kwd-group>
        <kwd>persuasion techniques</kwd>
        <kwd>multi-lingual</kwd>
        <kwd>data augmentation</kwd>
        <kwd>class weighting</kwd>
        <kwd>decision threshold</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Media language and news discourse have always attracted the attention of applied linguists and
sociolinguists, mainly because of four reasons [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]: 1) the media provide an easily accessible source of
language data for research and teaching purposes, 2) the media are important linguistic institutions,
and their language usage reflects and shapes both language use and attitudes in a speech community, 3)
the ways in which the media use language are interesting linguistically in their own right, and 4) the
media are important social institutions. They are crucial presenters of culture, politics, and social life,
shaping as well as reflecting how these are formed and expressed.
      </p>
      <p>
        With the widespread use of the Internet and the rise of algorithmic journalism [
        <xref ref-type="bibr" rid="ref2 ref3 ref4 ref5 ref6">2, 3, 4, 5, 6</xref>
        ],
characterized by huge amounts of data, the application of algorithms in all phases of the journalistic process
(selection, production, distribution and consumption) and by a high degree of interactivity and direct
communication between news producers and consumers, the latter are exposed more than ever before
to manipulative, propagandistic, and deceptive content. As a result, major public events and debates on
important topics can be significantly influenced. This creates an increasing demand for automated tools
that help experts analyze the news ecosystem, detect manipulation attempts, and aid in studying how
events, global issues, and policies are portrayed by the media in different countries and languages. There
has been a growing interest of the NLP community in trying to detect the use of specific propaganda
techniques, as well as the specific span of each instance. This interest is mainly expressed by the
organization of dedicated shared tasks, such as the NLP4IF-2019 Shared Task on Fine-Grained
Propaganda Detection [7], SemEval-2020 Task 11: Detection of Propaganda Techniques in News
Articles [8], SemEval-2021 Task 6: Detection of Persuasion Techniques in Texts and Images [9], the
WANLP-2022 Shared Task on Propaganda Detection in Arabic [10], and, most recently, SemEval-2023
Task 3: Detecting the category, the framing, and the persuasion techniques in online news in a
multilingual setup [11]. Task 3 “Persuasion Techniques” of the CheckThat! Lab 2024 [12] is a further
effort to advance the state of the art in this research direction.
      </p>
      <p>[Figure 1: Taxonomy of the 23 persuasion techniques, including Name Calling or Labeling, Casting
Doubt, Guilt by Association, Questioning the Reputation, Loaded Language, Repetition, Exaggeration
or Minimization, Appeal to Fear/Prejudice, Appeal to Authority, Appeal to Popularity, Appeal to
Values, Appeal to Hypocrisy, Obfuscation/Vagueness/Confusion, Slogans, Appeal to Time, Causal
Oversimplification, False Dilemma or No Choice, Consequential Oversimplification, Strawman, Red
Herring, Whataboutism, Conversation Killer, and Other.]</p>
      <p>Participants in Task 3 “Persuasion Techniques” of the CheckThat! Lab 2024 are given a set of news
articles in multiple languages and a list of 23 persuasion techniques (PT), including logical fallacies (e.g.,
straw man, red herring, bandwagon) and emotional manipulation techniques (e.g., loaded language,
appeal to fear, name calling) that might be used to support flawed argumentation. The aim is to identify
the spans of text in which each technique occurs. Labeled text spans may also overlap.
Therefore, it is set up as a multi-label sequence-tagging task where each sequence can be assigned more
than one class. The evaluation metric is micro-averaged F1, modified to account for partial matching
between the spans. Furthermore, annotation guidelines were provided by the task organizers, which
contain detailed definitions and examples [13].</p>
      <p>Our system can be divided into two parts. The first part of the pipeline comprises the data augmentation
module: a BERT-based model [14] fine-tuned on a word-alignment task. We use this to project the labels
from the source to the target texts, produced by translating the training set into different languages
via machine translation (MT). The second part of the pipeline is the persuasion technique
classification module, i.e. two separate BERT-based models, henceforth referred to as M1 and M2.
M1 is a binary sequence classifier, trained to classify individual sentences as containing a persuasion
technique or not. M2 is a series of 23 token-level classifiers, M2,1, . . . , M2,23, one per persuasion
technique. Leveraging our multilingual MT data augmentation strategy, we trained a single set of
multilingual models and used them to infer on all languages of a holdout validation set and of the test set.
Accordingly, we submitted runs for all test languages. Our system is competitive in all language settings,
showing hints of cross-lingual transfer when training on multi-lingual data and testing on unseen
languages. For reproducibility purposes, we release our code and data on our GitHub repository.1</p>
      <p>The rest of the paper is organized as follows. Section 2 provides an overview of related work on
persuasion techniques and propaganda detection. In Section 3, we describe the training data provided
for the shared task and our data augmentation process. In Section 4 we present our proposed system.
Section 5 reports the performance of the system for our official run. Finally, we conclude in Section 6.</p>
      <p>1 https://github.com/giorluca/checkthat24_DIT</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>Research on automatic persuasion technique detection in news overlaps to a large extent with work on
automatic propaganda detection in news [15, 16, 17]. Early research on propaganda detection focused
exclusively on document-level analysis, ignoring the fine-grained aspects of the task.</p>
      <p>Rashkin et al. [15] developed the TSHP-17 corpus, annotated in a distant supervised manner (i.e.
assigning the label of the news outlet to all articles gathered from that news outlet) at document
level with four classes: trusted, satire, hoax, and propaganda. However, as can be deduced from the
results obtained in the experiment, further verified for reproducibility by Barrón-Cedeño et al. [16] and
mentioned by Da San Martino et al. [18], the predictive model trained on this data (logistic regression
with n-gram representation) failed to generalize, performing well only on articles from sources that the
system was trained on and under-performing when evaluated on articles from unseen news sources.</p>
      <p>Barrón-Cedeño et al. [16] developed the QProp corpus annotated at document-level with two labels
(propaganda vs non-propaganda) and trained different models (e.g., logistic regression and SVMs) on
this data and on the TSHP-17 corpus to predict the two classes, including linguistic features such as
writing style and readability indices. Their findings confirmed that using distant supervision might
introduce bias into the model and lead it to predict the source of the article, rather than to discriminate
propaganda from non-propaganda independently of the news source.</p>
      <p>An alternative line of research has focused on detecting the use of specific propaganda techniques
in text [19, 20, 17]. Habernal et al. [19, 20] developed a corpus with 1.3k arguments annotated with
five fallacies that directly relate to propaganda techniques. A more fine-grained analysis was done by
Da San Martino et al. [17], who developed a corpus of news articles annotated with 18 propaganda
techniques which was used to train a gated deep neural network for sentence-level propaganda detection.
For a survey on computational propaganda detection see Da San Martino et al. [18].</p>
    </sec>
    <sec id="sec-3">
      <title>3. Data</title>
      <p>The training data provided for the task is an existing corpus, consisting of 1,612 news articles in 9
languages annotated with 48K instances of 23 persuasion techniques [11]. The persuasion technique
taxonomy is pictured in Figure 1. A new test dataset of around 500 news articles in Arabic, Bulgarian,
English, Portuguese, and Slovene is provided for this edition [12]. The distribution of training data by
language is provided in Table 1.</p>
      <sec id="sec-3-1">
        <title>3.1. Data Preprocessing</title>
        <p>While inspecting the datasets in the preliminary stages, we noticed that the gold annotations would
sometimes not coincide with the character slices, once visualized as Python strings. This was caused
by the fact that some files were saved using carriage return along with the newline character (\r\n),
while others only contained newline characters (\n). Once read in Python using the ‘r’ reading mode,
this meant that in files containing ‘\r\n’ the newlines counted as only one character, instead of two;
conversely, in those only using ‘\n’ the newlines correctly counted as only one character. In order to
solve this issue, we read all files in ‘rb’ binary mode and decoded prior to data processing, so that \r
characters would be correctly counted in order to feed models the correct text spans.</p>
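The newline issue described above can be sketched as follows (a minimal, self-contained illustration using a temporary file, not the task's actual data):

```python
import os, tempfile

# A file saved with Windows-style line endings, as some task files were.
fd, path = tempfile.mkstemp()
text = "First line\r\nLoaded language here\r\n"
with os.fdopen(fd, "wb") as f:
    f.write(text.encode("utf-8"))

# Default text mode translates '\r\n' to '\n', shifting every character
# offset after a newline and breaking gold span indices.
with open(path, "r", encoding="utf-8") as f:
    translated = f.read()

# Binary mode preserves '\r', so gold character spans stay aligned.
with open(path, "rb") as f:
    raw = f.read().decode("utf-8")

print(len(raw) - len(translated))  # prints 2: one extra character per '\r\n'
os.remove(path)
```

With two `\r\n` sequences in the file, text mode yields a string two characters shorter than the on-disk content, which is exactly the offset drift that breaks the gold annotations.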
        <p>Since the documents are too long to feed as input to the models used here,2 we generate smaller
training samples by splitting documents at the sentence level, obtaining 59,908 total gold training
sentences, as indicated in Table 1. Prior to training, we split the obtained sentence dataset 80/20 into
training and validation instances.</p>
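The document-to-sentence preparation can be sketched as follows (a toy corpus and a naive period-based splitter stand in for the real data and a proper sentence segmenter; the seed value is illustrative):

```python
import random

# Hypothetical mini-corpus: each document is split into sentences,
# which become the individual training samples.
documents = [
    "Sentence one. Sentence two. Sentence three.",
    "Another doc. With more sentences. And one more. Final one.",
]

# Naive sentence splitter, for illustration only.
sentences = [s.strip() + "." for doc in documents
             for s in doc.split(".") if s.strip()]

# 80/20 split into training and validation instances, with a fixed
# seed for reproducibility.
random.seed(42)
random.shuffle(sentences)
cut = int(0.8 * len(sentences))
train, val = sentences[:cut], sentences[cut:]
print(len(train), len(val))
```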
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Data Augmentation</title>
        <p>In order to increase the amount of available training data we augment the training sentences via MT
and label projection. MT is carried out by translating the dataset sentence by sentence from English to
the other training languages with the NLLB 3.3B model [21].</p>
        <p>Following Nagata et al. [22], the PT annotations are then projected onto the translated text by
using mDeBERTa models [23] trained on a word-alignment task with a question-answering classifier
head. Given a source sentence S with characters s_i ∈ S, its translated target sentence T with
characters t_j ∈ T, and an alignment between the span s_{i,i+n}, labeled as ℓ, and the span t_{j,j+m},
with i, j ∈ ℕ and n, m &gt; 0, label projection is the task of assigning the label ℓ to the span t_{j,j+m} [24]. In other words,
given a source span, the model is tasked to find the equivalent span in the translated text. Doing this,
we obtain synthetic annotated data in the target language, which we use to further train both our
sequence and token classifiers. In order to train these word-alignment models, we use XL-WA [25],
a multilingual word-alignment dataset built from WikiMatrix [26].3 The dataset has a balanced
domain distribution and features 14 EN-XX language combinations. Its training set is composed of
silver labels automatically generated, while the development and test sets are manually annotated. We
align each source-target combination of machine-translated data (EN-IT, EN-ES, EN-RU, EN-SL, EN-BG,
EN-PT), where English is always the source gold data, with a different word-alignment model, trained
on the specific language combination contained in XL-WA.</p>
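The projection step can be sketched as follows. The alignment dictionary here is a hypothetical stand-in for the output of the fine-tuned word-alignment model, and the sentences, span offsets, and label name are invented for illustration:

```python
# Sketch of label projection: given a labeled character span in the
# source sentence and an alignment from source spans to target spans,
# copy the label onto the aligned target span.
source = "He uses loaded language constantly."
target = "Usa costantemente un linguaggio carico."
source_labels = [(8, 23, "Loaded_Language")]  # covers "loaded language"

# Hypothetical aligner output: source char span -> target char span.
alignment = {(8, 23): (21, 38)}  # covers "linguaggio carico"

def project_labels(labels, alignment):
    projected = []
    for start, end, tag in labels:
        if (start, end) in alignment:
            t_start, t_end = alignment[(start, end)]
            projected.append((t_start, t_end, tag))
    return projected

target_labels = project_labels(source_labels, alignment)
print(target_labels)
```

The projected tuples constitute the synthetic annotations in the target language that are then fed to the sequence and token classifiers.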
        <p>Ultimately, starting from the original 24,514 gold English sentences indicated in Table 1, we generate
the same amount for each of the six target languages, for a total of 147,084 extra training sentences.
Thus, the total number of training instances amounts to 206,992.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Models</title>
      <sec id="sec-4-1">
        <title>4.1. Word Alignment</title>
        <p>For the word alignment task, used in the data augmentation step, we adopt the approach proposed by
Nagata et al. [22], which treats word alignment as a question answering task using an mDeBERTa [23]
model. In this approach, the source word to be aligned is enclosed within rarely used characters, such
as ‘∙’, and the model is fed both the source sequence S and the target sequence T simultaneously. The
input to the model at the token level is structured as follows:
[CLS] s_1, . . . , ∙, s_i, . . . , s_{i+n}, ∙, . . . , s_{|S|} [SEP] t_1, . . . , t_j, . . . , t_{j+m}, . . . , t_{|T|} [SEP]
Here, the source word to be aligned is represented by the tokens s_i, . . . , s_{i+n}, where s_i ∈ S.
The model is then tasked with predicting the tuple (t_j, t_{j+m}), where t_j ∈ T, which denotes the
boundary indices of the aligned word in the target sequence.</p>
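The input construction can be sketched as follows (plain token strings rather than real tokenizer IDs; a real tokenizer would add [CLS]/[SEP] itself, and the function name and example tokens are ours):

```python
# Sketch of the span-prediction input format of Nagata et al.: the
# source word to align is enclosed in a rarely used marker character,
# and source and target sequences are concatenated as a QA-style pair.
MARK = "∙"

def build_alignment_input(src_tokens, span, tgt_tokens):
    i, j = span  # inclusive token indices of the source word to align
    marked = src_tokens[:i] + [MARK] + src_tokens[i:j + 1] + [MARK] + src_tokens[j + 1:]
    return "[CLS] " + " ".join(marked) + " [SEP] " + " ".join(tgt_tokens) + " [SEP]"

src = ["the", "loaded", "language"]
tgt = ["il", "linguaggio", "carico"]
print(build_alignment_input(src, (1, 1), tgt))
```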
        <p>For each language combination involved in the data augmentation process, we train our models for
up to 3 epochs on each of XL-WA’s languages with a batch size of 16. The optimizer’s learning rate is
set to 3 × 10⁻⁴, and ε is 10⁻⁸. We select the best model based on the exact match metric EM [27]:</p>
        <p>EM = (1 / |P|) · Σ_{(p,g) ∈ P} δ(p, g),   (1)</p>
        <p>where P is the list of (prediction, gold) pairs and δ(p, g) is the Kronecker delta:</p>
        <p>δ(p, g) = 1 if p = g, and 0 if p ≠ g.   (2)</p>
        <p>Before computing EM, we lowercase and strip the predicted and gold strings p and g of excess
punctuation and spacing.</p>
        <p>2 Since the models we use are derived from BERT base, they can only handle 512 tokens in input.
3 https://ai.meta.com/blog/wikimatrix/</p>
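The selection metric can be sketched as follows (the normalization shown — lowercasing plus stripping leading/trailing punctuation and spaces — is our reading of the description; the example pairs are invented):

```python
import string

# Sketch of the exact match metric: the fraction of predicted spans
# identical to the gold span after light normalization.
def normalize(s):
    return s.lower().strip(string.punctuation + " ")

def exact_match(pairs):
    # pairs: list of (predicted span, gold span) strings
    return sum(normalize(p) == normalize(g) for p, g in pairs) / len(pairs)

pairs = [("linguaggio carico", "linguaggio carico"),
         ("Linguaggio carico.", "linguaggio carico"),
         ("carico", "linguaggio carico")]
print(exact_match(pairs))  # 2 of 3 match after normalization
```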
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Persuasion Technique Classification</title>
        <p>The model we use is composed of two separate Transformer networks [28], both based on
mDeBERTa [23]. The first part of the model, M1, is a binary sequence classifier, trained to classify individual
sentences as containing any persuasion technique or not. The second part of the model, M2, is a series
of 23 token-level classifiers, M2,1, . . . , M2,23, one per persuasion technique.</p>
        <p>Sequence classifier. For training, we feed the sequence classifier a balanced subsample of the
sentence dataset, obtained as per Section 3. Specifically, we take all sentences containing at least one PT
(positive instances) and sample an equal number of negative instances from the rest of the training set.
Token classifiers. Prior to training for token classification, we preprocess and label the data using
the BIO annotation scheme [29]. In this scheme, the first word of an entity is assigned a B-{class}
(beginning) label, subsequent words are assigned an I-{class} (inside) label, and words not part of
any entity are assigned an O (outside) label. We follow established methodology by ignoring subword
tokens when calculating cross-entropy loss.4</p>
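The BIO labeling with subword masking can be sketched as follows (the tokens, label set, and entity assignment are invented; -100 is the label id conventionally ignored by cross-entropy in Hugging Face token-classification examples):

```python
# Sketch of BIO labeling with subword pieces masked out of the loss:
# the first piece of each word gets the word's B/I/O label, and
# continuation subwords get -100 so cross-entropy ignores them.
IGNORE = -100
LABELS = {"O": 0, "B-Loaded_Language": 1, "I-Loaded_Language": 2}

# (token, word_id) pairs from a hypothetical subword tokenizer; the
# word id repeats when a word is split into several pieces.
tokens = [("he", 0), ("uses", 1), ("load", 2), ("##ed", 2), ("language", 3)]
entity_words = {2: "B-Loaded_Language", 3: "I-Loaded_Language"}

label_ids, prev_word = [], None
for tok, word_id in tokens:
    if word_id == prev_word:
        label_ids.append(IGNORE)  # subword continuation: ignored in loss
    else:
        label_ids.append(LABELS[entity_words.get(word_id, "O")])
    prev_word = word_id
print(label_ids)
```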
        <p>Since the 23 token classification models are tailored specifically to each PT, we train them on sentences
where only one PT is kept at a time. This means that if a sentence contains a persuasion technique
which the model is not supposed to learn to predict, we set the tokens relative to that persuasion
technique to the outside O label. Just like for the sequence classifier, we balance positive and negative
instances for training.</p>
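The per-technique relabeling can be sketched as follows (label strings and the helper name are ours):

```python
# Sketch of the per-technique relabeling: when training the classifier
# for one persuasion technique, the spans of every other technique are
# collapsed to the outside label O.
def keep_only(technique, word_labels):
    return [lab if lab.endswith(technique) else "O" for lab in word_labels]

word_labels = ["O", "B-Loaded_Language", "I-Loaded_Language",
               "B-Repetition", "I-Repetition", "O"]
print(keep_only("Loaded_Language", word_labels))
```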
        <p>For both M1 and M2, we set the optimizer’s learning rate at 5 × 10⁻⁵, while ε is 10⁻⁸. We train all
models for up to 10 epochs with a patience of 2 epochs, keeping the model with the highest performance
on the validation set.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Reducing False Positives</title>
        <p>As we are using 23 separate token classifiers, the number of predictions being produced ends up being
very large. Since the submission website only accepts TXT files of up to 800 KBytes and our token
classifiers produce too many predictions, our full outputs are not suitable for submission in most
languages. As such, we opt for reducing the number of positive predictions in order to adhere to the
submission size limit. To accomplish this, during training we use a modified, weighted version of the
cross-entropy loss function. Specifically, we empirically assign a weight of 0.5 to the O majority class
(label 0) and a weight of 2.0 to the minority B and I classes (labels 1 and 2). This weighting ensures that
the model pays more attention to correctly predicting the minority classes, thus reducing the overall
number of positive predictions.</p>
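The class weighting can be sketched as follows. The weights 0.5 and 2.0 match the text above; the actual system presumably uses a weighted PyTorch cross-entropy, whereas this is a simplified pure-Python version that takes a plain mean over tokens rather than a weight-normalized reduction:

```python
import math

# Sketch of class-weighted cross-entropy over BIO labels: 0.5 for the
# O majority class (label 0) and 2.0 for the B and I minority classes.
WEIGHTS = [0.5, 2.0, 2.0]

def weighted_ce(probs, gold):
    # probs: per-token probability distributions; gold: gold label ids
    losses = [-WEIGHTS[g] * math.log(p[g]) for p, g in zip(probs, gold)]
    return sum(losses) / len(losses)

# The same model confidence (0.7) costs 4x more on a B token than on
# an O token, steering training toward the minority classes.
print(weighted_ce([[0.7, 0.2, 0.1]], [0]))
print(weighted_ce([[0.2, 0.7, 0.1]], [1]))
```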
        <p>When computing the evaluation metrics, we also apply a threshold to the model’s predictions. We
use the softmax function to convert the logits into probabilities. Then, we set any probability below the
threshold of 0.9 to zero before determining the predicted labels.5 This means that the model only makes
a prediction if it is at least 90% confident, reducing the number of false positives. We did not experiment
with any other parameters besides the loss function weights and the prediction threshold. In addition, since
some of the token classifiers obtained an F1 score of 0 on their class subset of the validation partition
obtained from splitting the training set, we exclude the predictions produced by those models.</p>
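The thresholding step can be sketched as follows (a single-token example; the fallback to the outside label when no class clears the threshold is our reading of "the model only makes a prediction if it is at least 90% confident"):

```python
import math

# Sketch of the decision threshold: convert logits to probabilities
# with softmax, zero out anything below 0.9, and only then take the
# predicted label, defaulting to O when no class is confident enough.
def predict(logits, threshold=0.9, o_label=0):
    exps = [math.exp(z) for z in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    kept = [p if p >= threshold else 0.0 for p in probs]
    if max(kept) == 0.0:
        return o_label  # not confident enough: no positive prediction
    return kept.index(max(kept))

print(predict([0.1, 4.0, 0.2]))  # confident in class 1 -> predicts 1
print(predict([0.5, 1.0, 0.8]))  # all classes below 0.9 -> predicts O
```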
        <p>4 https://huggingface.co/docs/transformers/en/tasks/token_classification
5 We attempted different thresholds by increments of 0.1 until the submission files were small enough for submission.</p>
      </sec>
      <sec id="sec-4-4">
        <title>4.4. Inference</title>
        <p>During inference, we produce the submission predictions following a series of steps. First, after models
M1 and M2 have produced their predictions, we set M2’s predictions to 0-tensors for those indices
where M1’s predictions are 0. Then, we binarize the predictions to {0, 1}, with the original {1, 2} labels
mapping onto 1, and 0 mapping onto 0. Lastly, we assign a character span to each consecutive series of
positive (1) predictions in the prediction tensor, based on the characters corresponding to each token.</p>
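The inference steps can be sketched as follows, using plain lists in place of tensors (the token offsets and predictions are invented for illustration):

```python
# Sketch of the inference steps: gate the token classifier's output by
# the sentence classifier's decision, binarize the BIO labels, and
# merge consecutive positive tokens into character spans.
def to_spans(seq_pred, token_preds, offsets):
    # seq_pred: sentence-level 0/1; token_preds: per-token {0, 1, 2}
    # labels; offsets: (char_start, char_end) per token.
    if seq_pred == 0:
        token_preds = [0] * len(token_preds)  # sentence judged negative
    binary = [1 if p in (1, 2) else 0 for p in token_preds]  # B/I -> 1
    spans, start, end = [], None, None
    for b, (s, e) in zip(binary, offsets):
        if b and start is None:
            start, end = s, e          # open a new span
        elif b:
            end = e                    # extend the current span
        elif start is not None:
            spans.append((start, end)) # close the span on a 0 token
            start, end = None, None
    if start is not None:
        spans.append((start, end))
    return spans

offsets = [(0, 2), (3, 7), (8, 14), (15, 23), (24, 35)]
print(to_spans(1, [0, 0, 1, 2, 0], offsets))  # tokens 2-3 merge into one span
print(to_spans(0, [0, 0, 1, 2, 0], offsets))  # gated off by the sentence classifier
```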
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Results</title>
      <p>In this section, we first report and discuss the results obtained by our model M1 on a holdout validation
set which contains all training languages, obtained by splitting the training set 80/20 into training and
validation data with a set seed (Table 2). Then, we report the results of our 23 token classifiers M2 on the
same holdout validation set, given different training settings (Table 3). We report these results with the
intent of providing insight into the contribution of the data augmentation process to the performance
of the second classifier and to show how it is affected by class weighting and by setting a decision
threshold to reduce false positives. Finally, we present the results achieved by our whole system on
the official test set (Table 4). The binary classifier M1 performs decently, with a macro F1 of 0.757.
Furthermore, as shown by the reported scores in Table 3, the data augmentation more than doubles
M2’s performance. On the other hand, class weighting and setting a decision threshold as high as 0.9,
although necessary as shown above, lower the best performance by 0.12. Since these preliminary tests
conducted on the holdout validation split show that data augmentation improves the performance of
the M2 classifier, even when class weighting and a decision threshold of 0.9 are set, we choose to adopt
data augmentation also for the final system used to predict on the test set for submission. The rationale
behind this decision is based on the assumption that a higher performance on the holdout validation
set would also carry over to the test set.6</p>
      <p>The results for our official test runs are shown in Table 4. Our system performs better than the
baseline7 across all languages and ranks first for all languages except for Arabic. For that language,
Team Mela used a multilingual BERT model which was pre-trained on data in both English and Arabic
[30]. We also observe consistent performance across all languages, with micro average F1 scores ranging
from 0.091 for English to 0.123 for Slovene, possibly showing a robust cross-lingual transfer ability
when training the model on multi-lingual data and testing it on unseen languages.
6 Note that our official submission (last row in Table 3) is not the best because it is constrained by the maximum size accepted
by the submission website for the produced prediction file. Indeed, the used class weights and prediction threshold are
applied in order to reduce the amount of predictions produced by the model.
7 A token classification model followed by heuristics that was kept private by the organizers [12].</p>
      <p>[Table 4 (excerpt): official per-language rankings. English: 1. UniBO, –. PersuasionMultiSpan, 2.
Baseline. Bulgarian: –. PersuasionMultiSpan, 1. UniBO, 2. Baseline.]</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusions</title>
      <p>In this paper, we presented our experiments and findings on persuasion technique detection in news
in multiple languages, which was part of the CheckThat! Lab Task 3. Our system is divided into two
parts. The first part of the pipeline comprises data augmentation: a combination of machine translation
and cross-language label projection. The second part of the pipeline refers to the persuasion technique
classification module, comprising two separate BERT-based models. The first acts as a binary sequence
classifier, trained to classify individual sentences as containing a persuasion technique or not. The
second comprises 23 token-level classifiers, one per persuasion technique.</p>
      <p>We submitted runs for all five test languages. Our final model, trained on the shared task’s training data
augmented via MT and tuned with class weighting and a high decision threshold of 0.9, is competitive in
all language settings and shows hints of cross-lingual transfer capabilities when trained on multi-lingual
data and tested on unseen languages.</p>
      <p>Automatically detecting persuasion techniques in news in a multi-lingual setting still proves to be a
challenging task, given how implicitly such techniques manifest and how subtle their usage can be.
This leaves room for future work, for example investigating the existence of more explicit predictive
features, including but not limited to textual and linguistic indicators or online dissemination patterns,
that are typical of propagandistic news independently of the news source and common across
different languages, and that could be leveraged alongside Transformer models and data augmentation.</p>
      <p>Sligh, A. Sehl (Eds.), The International Encyclopedia of Journalism Studies, Wiley-Blackwell,
Massachusetts, USA, 2018.
[6] J. M. Túñez-López, C. Toural-Bran, A. G. Frazão-Nogueira, From data journalism to robotic
journalism: The automation of news processing, Journalistic metamorphosis: media transformation
in the digital age (2020) 17–28.
[7] G. Da San Martino, A. Barron-Cedeno, P. Nakov, Findings of the NLP4IF-2019 shared task on
fine-grained propaganda detection, in: Proceedings of the second workshop on natural language
processing for internet freedom: censorship, disinformation, and propaganda, 2019, pp. 162–170.
[8] G. Da San Martino, A. Barrón-Cedeño, H. Wachsmuth, R. Petrov, P. Nakov, SemEval-2020 task 11:
Detection of propaganda techniques in news articles, in: Proceedings of the Fourteenth Workshop
on Semantic Evaluation, 2020, pp. 1377–1414.
[9] D. Dimitrov, B. B. Ali, S. Shaar, F. Alam, F. Silvestri, H. Firooz, P. Nakov, G. Da San Martino,
SemEval-2021 task 6: Detection of persuasion techniques in texts and images, in: Proceedings of
the 15th International Workshop on Semantic Evaluation (SemEval-2021), 2021, pp. 70–98.
[10] F. Alam, H. Mubarak, W. Zaghouani, G. Da San Martino, P. Nakov, Overview of the WANLP 2022
shared task on propaganda detection in Arabic, in: Proceedings of the Seventh Arabic Natural
Language Processing Workshop (WANLP), 2022, pp. 108–118.
[11] J. Piskorski, N. Stefanovitch, G. Da San Martino, P. Nakov, SemEval-2023 Task 3: Detecting the
category, the framing, and the persuasion techniques in online news in a multi-lingual setup, in:
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), 2023, pp.
2343–2361.
[12] A. Barrón-Cedeño, F. Alam, T. Chakraborty, T. Elsayed, P. Nakov, P. Przybyła, J. M. Struß, F. Haouari,
M. Hasanain, F. Ruggeri, et al., The CLEF-2024 CheckThat! Lab: Check-Worthiness,
Subjectivity, Persuasion, Roles, Authorities, and Adversarial Robustness, in: European Conference on
Information Retrieval, Springer, 2024, pp. 449–458.
[13] J. Piskorski, N. Stefanovitch, V.-A. Bausier, N. Faggiani, J. Linge, S. Kharazi, N. Nikolaidis, G. Teodori,
B. De Longueville, B. Doherty, et al., News categorization, framing and persuasion techniques:
Annotation guidelines, European Commission, Ispra, JRC132862 (2023).
[14] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional transformers
for language understanding, in: J. Burstein, C. Doran, T. Solorio (Eds.), Proceedings of the 2019
Conference of the North American Chapter of the Association for Computational Linguistics:
Human Language Technologies, Volume 1 (Long and Short Papers), Association for Computational
Linguistics, Minneapolis, Minnesota, 2019, pp. 4171–4186.
[15] H. Rashkin, E. Choi, J. Y. Jang, S. Volkova, Y. Choi, Truth of varying shades: Analyzing language
in fake news and political fact-checking, in: Proceedings of the 2017 conference on empirical
methods in natural language processing, 2017, pp. 2931–2937.
[16] A. Barrón-Cedeño, I. Jaradat, G. Da San Martino, P. Nakov, Proppy: Organizing the news based on
their propagandistic content, Information Processing &amp; Management 56 (2019) 1849–1864.
[17] G. Da San Martino, Y. Seunghak, A. Barrón-Cedeno, R. Petrov, P. Nakov, et al., Fine-grained analysis
of propaganda in news article, in: Proceedings of the 2019 conference on empirical methods
in natural language processing and the 9th international joint conference on natural language
processing (EMNLP-IJCNLP), Association for Computational Linguistics, 2019, pp. 5636–5646.
[18] G. Da San Martino, S. Cresci, A. Barrón-Cedeño, S. Yu, R. Di Pietro, P. Nakov, A survey on
computational propaganda detection, in: Proceedings of the Twenty-Ninth International Conference on
International Joint Conferences on Artificial Intelligence, 2021, pp. 4826–4832.
[19] I. Habernal, R. Hannemann, C. Pollak, C. Klamm, P. Pauli, I. Gurevych, Argotario: Computational
argumentation meets serious games, in: Proceedings of the 2017 Conference on Empirical Methods
in Natural Language Processing: System Demonstrations, 2017, pp. 7–12.
[20] I. Habernal, P. Pauli, I. Gurevych, Adapting serious game for fallacious argumentation to german:
Pitfalls, insights, and best practices, in: Proceedings of the Eleventh International Conference on
Language Resources and Evaluation (LREC 2018), 2018.
[21] NLLB Team, M. R. Costa-jussà, J. Cross, O. Çelebi, M. Elbayad, K. Heafield, K. Heffernan, E. Kalbassi,
J. Lam, D. Licht, J. Maillard, A. Sun, S. Wang, G. Wenzek, A. Youngblood, B. Akula, L. Barrault,
G. M. Gonzalez, P. Hansanti, J. Hoffman, S. Jarrett, K. R. Sadagopan, D. Rowe, S. Spruit, C. Tran,
P. Andrews, N. F. Ayan, S. Bhosale, S. Edunov, A. Fan, C. Gao, V. Goswami, F. Guzmán, P. Koehn,
A. Mourachko, C. Ropers, S. Saleem, H. Schwenk, J. Wang, M. Ai, No Language Left Behind: Scaling
Human-Centered Machine Translation, 2022.
[22] M. Nagata, K. Chousa, M. Nishino, A supervised word alignment method based on cross-language
span prediction using multilingual BERT, in: B. Webber, T. Cohn, Y. He, Y. Liu (Eds.), Proceedings of
the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Association
for Computational Linguistics, Online, 2020, pp. 555–565.
[23] P. He, J. Gao, W. Chen, DeBERTaV3: Improving DeBERTa using ELECTRA-style pre-training with
gradient-disentangled embedding sharing, in: The Eleventh International Conference on Learning
Representations, 2022.
[24] A. Jain, B. Paranjape, Z. C. Lipton, Entity projection via machine translation for cross-lingual NER,
in: K. Inui, J. Jiang, V. Ng, X. Wan (Eds.), Proceedings of the 2019 Conference on Empirical Methods
in Natural Language Processing and the 9th International Joint Conference on Natural Language
Processing (EMNLP-IJCNLP), Association for Computational Linguistics, Hong Kong, China, 2019,
pp. 1083–1092. URL: https://aclanthology.org/D19-1100. doi:10.18653/v1/D19-1100.
[25] F. Martelli, A. S. Bejgu, C. Campagnano, J. Čibej, R. Costa, A. Gantar, J. Kallas, S. Koeva, K. Koppel,
S. Krek, M. Langemets, V. Lipp, S. Nimb, S. Olsen, B. S. Pedersen, V. Quochi, A. Salgado, L. Simon,
C. Tiberius, R.-J. Ureña-Ruiz, R. Navigli, XL-WA: a Gold Evaluation Benchmark for Word
Alignment in 14 Language Pairs, in: F. Boschetti, N. N. Gianluca E. Lebani, Bernardo Magnini (Eds.),
Proceedings of the Ninth Italian Conference on Computational Linguistics (CLiC-it 2023), volume
3596, CEUR-WS, Venice, Italy, 2023.
[26] H. Schwenk, V. Chaudhary, S. Sun, H. Gong, F. Guzmán, WikiMatrix: Mining 135M parallel
sentences in 1620 language pairs from Wikipedia, in: P. Merlo, J. Tiedemann, R. Tsarfaty (Eds.),
Proceedings of the 16th Conference of the European Chapter of the Association for Computational
Linguistics: Main Volume, Association for Computational Linguistics, Online, 2021, pp. 1351–1361.
[27] P. Rajpurkar, R. Jia, P. Liang, Know what you don’t know: Unanswerable questions for SQuAD,
in: I. Gurevych, Y. Miyao (Eds.), Proceedings of the 56th Annual Meeting of the Association for
Computational Linguistics (Volume 2: Short Papers), Association for Computational Linguistics,
Melbourne, Australia, 2018, pp. 784–789.
[28] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, I. Polosukhin,
Attention is All you Need, in: Advances in Neural Information Processing Systems, volume 30,
Curran Associates, Inc., 2017.
[29] L. A. Ramshaw, M. P. Marcus, Text Chunking Using Transformation-Based Learning, Springer
Netherlands, Dordrecht, 1999, pp. 157–176.
[30] S. Nabhani, M. A. R. Riyadh, Mela at CheckThat! 2024: Transferring persuasion detection from
English to Arabic - a multilingual BERT approach, in: [33], 2024.
[31] A. Conneau, K. Khandelwal, N. Goyal, V. Chaudhary, G. Wenzek, F. Guzmán, E. Grave, M. Ott,
L. Zettlemoyer, V. Stoyanov, Unsupervised cross-lingual representation learning at scale, in:
D. Jurafsky, J. Chai, N. Schluter, J. Tetreault (Eds.), Proceedings of the 58th Annual Meeting of the
Association for Computational Linguistics, Association for Computational Linguistics, Online,
2020, pp. 8440–8451.
[32] J. Piskorski, N. Stefanovitch, F. Alam, R. Campos, D. Dimitrov, A. Jorge, S. Pollak, N. Ribin, Z. Fijavž,
M. Hasanain, N. Guimarães, A. F. Pacheco, E. Sartori, P. Silvano, A. Zwitter Vitez, I. Koychev, N. Yu,
P. Nakov, G. Da San Martino, Overview of the CLEF-2024 CheckThat! lab task 3 on persuasion
techniques, in: [33], 2024.
[33] G. Faggioli, N. Ferro, P. Galuščáková, A. García Seco de Herrera (Eds.), Working Notes of CLEF
2024 - Conference and Labs of the Evaluation Forum, CLEF 2024, Grenoble, France, 2024.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Bell</surname>
          </string-name>
          ,
          <article-title>Language and the media</article-title>
          ,
          <source>Annual review of applied linguistics 15</source>
          (
          <year>1995</year>
          )
          <fpage>23</fpage>
          -
          <lpage>41</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>C. W.</given-names>
            <surname>Anderson</surname>
          </string-name>
          ,
          <article-title>Towards a sociology of computational and algorithmic journalism</article-title>
          ,
          <source>New media &amp; society 15</source>
          (
          <year>2013</year>
          )
          <fpage>1005</fpage>
          -
          <lpage>1021</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Coddington</surname>
          </string-name>
          ,
          <article-title>Clarifying journalism's quantitative turn: A typology for evaluating data journalism, computational journalism, and computer-assisted reporting</article-title>
          ,
          <source>Digital journalism 3</source>
          (
          <year>2015</year>
          )
          <fpage>331</fpage>
          -
          <lpage>348</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Graefe</surname>
          </string-name>
          , Guide to automated journalism, Tow Center for Digital Journalism Publications, Columbia University,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>N.</given-names>
            <surname>Thurman</surname>
          </string-name>
          , Personalization of news, in:
          <string-name>
            <given-names>T. P.</given-names>
            <surname>Vos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Hanusch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dimitrakopoulou</surname>
          </string-name>
          , M. Geertsema-
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>