
UPB at ACTI: Detecting Conspiracies using fine tuned Sentence Transformers

Andrei Paraschiv

Mihai Dascalu

University Politehnica of Bucharest, 313 Splaiul Independentei, Bucharest, Romania

English. Conspiracy theories have become a prominent and concerning aspect of online discourse, posing challenges to information integrity and societal trust. As such, we address conspiracy theory detection as proposed by the ACTI @ EVALITA 2023 shared task. The combination of pre-trained sentence Transformer models and data augmentation techniques enabled us to secure first place in the final leaderboard of both sub-tasks. Our methodology attained F1 scores of 85.71% in the binary classification and 91.23% for the fine-grained conspiracy topic classification, surpassing other competing systems.

Keywords: Conspiracy Theory, Content Moderation, Large Language Models, Computational Social Science

1. Introduction

Conspiracy theories distort the shared understanding of reality and erode trust in crucial democratic institutions. By substituting reliable, evidence-based information with dubious, implausible, or blatantly false claims, these theories foster a climate of disagreement regarding facts and give undue weight to personal opinions and anecdotal evidence over established facts and scientifically validated theories. Aaronovitch [2] defines conspiracy theories as ‘the attribution of deliberate agency to something more likely to be accidental or unintended; therefore, it is the unnecessary assumption of conspiracy when other explanations are more probable.’ Due to the rapid spread of information across the internet, coupled with the alarming speed at which false information can proliferate [3], we find ourselves amidst what some have dubbed a "golden age" of conspiracy theories [4].

Being a distinct form of misinformation, conspiracy theories exhibit unique characteristics. Brotherton et al. [5] identified five key attributes commonly found in modern conspiracy theories: government malfeasance, extraterrestrial cover-up, malevolent global conspiracies, personal well-being, and information control.

While embracing conspiracy theories can give individuals a sense of reclaiming power or accessing hidden knowledge, these beliefs can sometimes have negative and dangerous consequences. One recent example is the violent insurrection on the US Capitol on 6 January 2021 driven by conspiracy theories surrounding QAnon and election fraud [6]. Additionally, these theories can serve as powerful tools in the hands of nefarious groups, politicians, or state actors who exploit susceptible communities, manipulating them into taking or endorsing actions that can result in significant and dramatic social repercussions [7, 8].

Building upon the importance of addressing conspiracy theories, efforts have been made to research and develop automated methods for detecting conspiratorial content on various platforms and languages. For instance, as part of the EVALITA 2023 workshop, the organizers of the ACTI shared task introduced a novel approach: the automatic identification of conspiratorial content in Italian language Telegram messages. This initiative aimed to enhance our ability to quickly recognize and respond to conspiracy theories, enabling the promotion of critical thinking and media literacy by providing reliable sources and encouraging evidence-based discourse. Leveraging such advancements can effectively limit the influence of conspiracy theories while fostering a more informed and resilient society.

This paper presents our contribution to the ACTI @ EVALITA 2023 shared task [9]. We focused on employing the power of pretrained Italian language sentence Transformers. To further enhance the performance and address potential biases, we employed Large Language Models (LLMs) to augment the training data, resulting in a more balanced and comprehensive training set. This combination of leveraging pre-trained models and data augmentation techniques formed the foundation of our methodology, enabling us to achieve first place in the final leaderboard of both sub-tasks with F1 scores of 85.71% and 91.23%, respectively.

2. Related Work

Recently, online platforms have often banned (entirely deactivated) communities that breached their increasingly comprehensive guidelines. In 2020 alone, Reddit banned around 2,000 subreddits (the name a community receives on the platform) associated with hate speech. Similarly, Facebook banned 1,500 pages and groups related to the QAnon conspiracy theory [10]. While these decisions are met with enthusiasm (e.g., see Anti-Defamation League [11]), the efficacy of “deplatforming” these online communities has been questioned [12, 13]. When mainstream platforms ban entire communities for their offensive rhetoric, users often migrate to alternative fringe platforms, sometimes created exclusively to host the banned community [14, 15]. Banning, in that context, would not only strengthen the infrastructure hosting these fringe platforms [12] but allow these communities to become more toxic elsewhere [16]. In order to improve the efficacy of such moderation policies, identifying and tracking the propagation of problematic content like conspiracy theories is crucial. For example, the Zika virus outbreak in 2016, coupled with the influence of social networks and the declaration of a public health emergency by the WHO, showed the harm the dissemination of conspiracy theories can generate [17, 18].

The COVID-19 pandemic had a profound impact, emphasizing the dangers associated with the proliferation of conspiracy theories. These theories encompassed a wide range of topics, including the virus’s origin, its spread, the role of 5G networks, and the efficacy and safety of vaccines. With COVID-related lockdowns in place, people became more reliant on social networking platforms such as Twitter, Facebook, and Instagram, which increased their exposure to disinformation and conspiracy theories. MediaEval 2020 [19] focused on a 5G and COVID-19 conspiracy tweets dataset, proposing two shared tasks to address this issue. The first task involved detecting conspiracies based on textual information, while the second task focused on structure-based detection utilizing the retweet graph. Various systems were proposed to tackle these tasks, employing different approaches such as methods relying on Support Vector Machine (SVM) [20], BERT [21], and GNN [22]. In their study, Tyagi and Carley [23] employed an SVM to classify the stance of Twitter users towards climate change conspiracies. Their findings revealed that individuals who expressed disbelief in climate change tend to share a significantly higher number of other types of conspiracy-related messages compared to those who believe in climate change. Furthermore, Amin et al. [24] manually labeled 598 Facebook comments as Covid-19 vaccine conspiracy or neutral and used a BERT-based model in conjunction with Google Perspective API to classify these messages, providing valuable insights into the prevalence of vaccine conspiracy theories on social media platforms.

Tunstall et al. [25] presented a new approach based on Sentence Transformers [26] called SetFit that focused on data-efficient fine-tuning of sentence embeddings, particularly for binary labels. The training of SetFit follows a two-step process. First, it fine-tuned the sentence embeddings in a contrastive manner. This step helped in optimizing the embeddings for the specific classification task. Subsequently, a classification head was trained using finetuned sentence embeddings, enabling effective classification on the training labels. Their approach aimed to enhance the efficiency and performance of fine-tuning sentence embeddings in scenarios with limited data. The efficacy and power of Sentence-Transformers has been shown in multiple tasks spanning from text generation [27, 28] to sentence classification tasks [29, 30, 31]. These models capture the semantic and contextual information of sentences or paragraphs, enabling nuanced representations of textual data. Leveraging such models, Bates and Gurevych [32] used SetFit to propose LAGONN, a hate speech and toxic messages classification framework for content moderation.

3. Method

3.1. Task Description

The ACTI @ EVALITA 2023 organizers put forth two sub-tasks for participants to address. The first sub-task [9] involved binary classification, where participants were provided with a dataset consisting of 1,842 training samples and 460 test samples. The objective was to classify messages as either conspiratorial or non-conspiratorial. The second sub-task focused on fine-grained conspiracy topic classification. Participants were required to classify messages into one of four specific conspiracy topic classes: Covid, QAnon, Flat-Earth, or Russia-conspiracy. A training set of 810 records was provided for this sub-task, while the evaluation test set contained 300 samples. Table 1 shows the class distribution for both sub-tasks.

Sub-Task      Classes               Count
Sub-Task A    Non Conspiratorial      917
Sub-Task A    Conspiratorial          925
Sub-Task B    Covid                   435
Sub-Task B    QAnon                   242
Sub-Task B    Flat-Earth               76
Sub-Task B    Russian                  57

Table 1: ACTI Dataset distribution for the training sets on Sub-task A and B.
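To make the class distribution in Table 1 concrete, the following minimal sketch computes each class's share of the training set and the ratio between the largest and smallest class. The counts are copied from Table 1; the helper names `class_shares` and `imbalance_ratio` are illustrative and not part of any released code.

```python
# Training-set class counts copied from Table 1.
TRAIN_COUNTS = {
    "Sub-Task A": {"Non Conspiratorial": 917, "Conspiratorial": 925},
    "Sub-Task B": {"Covid": 435, "QAnon": 242, "Flat-Earth": 76, "Russian": 57},
}


def class_shares(counts):
    """Return each class's fraction of the training set."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def imbalance_ratio(counts):
    """Ratio between the largest and the smallest class."""
    return max(counts.values()) / min(counts.values())


for task, counts in TRAIN_COUNTS.items():
    shares = class_shares(counts)
    summary = ", ".join(f"{label}: {share:.1%}" for label, share in shares.items())
    print(f"{task} (n={sum(counts.values())}): {summary}; "
          f"largest/smallest = {imbalance_ratio(counts):.2f}")
```

Sub-task A is almost perfectly balanced, whereas in Sub-task B the Covid class is roughly 7.6 times larger than the Russian class.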

The macro F1 score was adopted as a criterion to evaluate the two sub-tasks. During the competition, 30% of the test dataset was immediately evaluated on the Public Leaderboard, giving participants an initial indication of their model’s performance. However, the final evaluation was conducted on the remaining 70% of private entries. These final evaluation scores were then used to compile the Private Leaderboard made public after the conclusion of the competition.

3.2. Sentence Transformer and Data Augmentation

We considered an Italian language Sentence Transformer model for our submissions and trained it contrastively with SetFit¹ as described by Tunstall et al. [25]. Since the training dataset is highly imbalanced between the conspiratorial classes (see Table 1), we integrated a data augmentation step in our classification pipeline, as seen in Figure 1.

Sentence-Transformers are pretrained Transformer models finetuned in a Siamese network, such that semantically similar sentences or paragraphs are projected near each other in the embedding space; in contrast, the distance in the embedding space is maximized for sentence pairs that are different. In our experiments, we used several Italian pretrained Sentence Transformers from the Huggingface Hub⁴, as listed in Table 3. The first step in the SetFit training process involves generating positive and negative triplets. Positive triplets consist of sentences from the same class, while negative triplets contain sentences from different classes. The training data is expanded by including positive and negative triplets, providing a more comprehensive and diverse training set. The Sentence Transformer captures the contextual and semantic information of the messages, providing a powerful feature representation. In the second step, a fully connected classification head is trained on top of the Sentence-Transformer to distinguish between the available classes.
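The triplet generation step can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the SetFit library's code: it assumes a triplet is a pair of sentences plus a 0/1 similarity target, and that each sentence contributes one positive and one negative triplet per iteration. The function name `generate_triplets` and the toy messages are hypothetical.

```python
import random
from collections import defaultdict


def generate_triplets(texts, labels, num_iterations, seed=0):
    """Build (sentence_a, sentence_b, target) triplets for contrastive training.

    target is 1.0 when both sentences share a class and 0.0 otherwise.
    In every iteration, each sentence yields one positive and one negative
    triplet, so the output size grows linearly with num_iterations.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for text, label in zip(texts, labels):
        by_class[label].append(text)

    triplets = []
    for _ in range(num_iterations):
        for text, label in zip(texts, labels):
            # Candidates from the same class (excluding the sentence itself).
            same = [t for t in by_class[label] if t != text]
            # Candidates from every other class.
            other = [t for lab, ts in by_class.items() if lab != label for t in ts]
            if same:
                triplets.append((text, rng.choice(same), 1.0))
            if other:
                triplets.append((text, rng.choice(other), 0.0))
    return triplets


# Toy, made-up messages; the real inputs are Italian Telegram messages.
toy_texts = ["msg covid 1", "msg covid 2", "msg qanon 1", "msg qanon 2"]
toy_labels = ["Covid", "Covid", "QAnon", "QAnon"]
toy_triplets = generate_triplets(toy_texts, toy_labels, num_iterations=5)
print(len(toy_triplets), toy_triplets[0])
```

With 4 sentences and 5 iterations, the sketch yields 40 triplets, which shows how the number of iterations controls the amount of contrastive training data.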

In the data augmentation step, we used an LLM to create paraphrases for our training data using the prompt "riformulare questo testo: [comment_text]" and different seeds to create variations of the answers. In our experiments, we used "text-davinci-003" from the GPT-3 family² and the mT5 model finetuned on Italian language paraphrases³. We set a high temperature (t=0.9) for the LLMs to ensure diverse text generation. The distribution for the augmented dataset is shown in Table 2.
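A minimal sketch of such an augmentation loop is given below. Only the prompt text is taken from the description above. Everything else is an assumption made for illustration: `paraphrase_fn` is a hypothetical wrapper around whichever LLM is used (it would also be the place to set the temperature), and topping up each class to the size of the largest class is one possible balancing policy, not necessarily the one used to produce Table 2.

```python
import random
from collections import Counter

# Prompt quoted in the text; [comment_text] is replaced by the message.
PROMPT_TEMPLATE = "riformulare questo testo: {comment_text}"


def build_prompt(comment_text):
    """Fill the paraphrasing prompt with one training message."""
    return PROMPT_TEMPLATE.format(comment_text=comment_text)


def augment_to_balance(texts, labels, paraphrase_fn, target_per_class=None, seed=0):
    """Add paraphrases until every class has at least `target_per_class` samples.

    paraphrase_fn(prompt, seed) -> str is a hypothetical LLM wrapper.
    If target_per_class is None, the size of the largest class is used
    (an assumption, chosen only to illustrate balancing).
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = target_per_class or max(counts.values())

    out_texts, out_labels = list(texts), list(labels)
    for label, count in counts.items():
        pool = [t for t, lab in zip(texts, labels) if lab == label]
        for i in range(max(0, target - count)):
            source = rng.choice(pool)
            # A different seed per call stands in for "different seeds to
            # create variations of the answers".
            out_texts.append(paraphrase_fn(build_prompt(source), seed=i))
            out_labels.append(label)
    return out_texts, out_labels


def dummy_paraphraser(prompt, seed):
    """Stand-in for an LLM call so the sketch runs offline."""
    return f"{prompt} (variant {seed})"


demo_texts, demo_labels = augment_to_balance(
    ["a", "b", "c", "d"], ["Covid", "Covid", "Covid", "Russian"], dummy_paraphraser
)
print(Counter(demo_labels))
```

Passing the paraphraser in as a callable keeps the balancing logic independent of the model, so the same loop could be run once with "text-davinci-003" and once with the mT5 paraphraser.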

Table 2: Distribution of the augmented dataset. Sub-Task A classes: Non Conspiratorial, Conspiratorial. Sub-Task B classes: Covid, QAnon, Flat-Earth, Russian.

¹ https://github.com/huggingface/setfit
² https://platform.openai.com/docs/models
³ https://huggingface.co/aiknowyou/mt5-base-it-paraphraser

4. Results

Besides experimenting with different pre-trained models, as shown in Table 3, we also performed grid search tuning with several key hyper-parameters, namely the number of iterations, the learning rate, and the number of epochs for training. The number of iterations determined the quantity of generated triplets during training. By adjusting this parameter, we controlled the training data’s size, potentially influencing the model’s ability to generalize and capture important patterns. We set the maximum sequence length for the tokenizer to 512 for all of our experiments. We withheld 20% of the training data to evaluate the performance of the trained models during the development time.
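The development loop described above (a 20% hold-out, macro F1 as the metric, and a grid over the three hyper-parameters) can be sketched with the standard library alone. Several details are assumptions: the split is stratified per class, the grid values other than 5 and 10 iterations, a learning rate of 1e-05, and 1 epoch are placeholders, and `evaluate` is a hypothetical callback that would train a model for one configuration and return its macro F1 on the held-out part.

```python
import itertools
import random
from collections import defaultdict


def stratified_holdout(labels, holdout_fraction=0.2, seed=0):
    """Return (train_idx, dev_idx), holding out the same fraction of every class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    train_idx, dev_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(len(indices) * holdout_fraction)
        dev_idx.extend(indices[:cut])
        train_idx.extend(indices[cut:])
    return sorted(train_idx), sorted(dev_idx)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        scores.append(2 * tp / (2 * tp + fp + fn) if tp else 0.0)
    return sum(scores) / len(scores)


def grid_search(grid, evaluate):
    """Try every combination in `grid`; `evaluate(config)` returns a dev score."""
    best_config, best_score = None, float("-inf")
    keys = list(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        config = dict(zip(keys, values))
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


# Placeholder grid: 2e-5 and 2 epochs are illustrative values only.
GRID = {
    "num_iterations": [5, 10],
    "learning_rate": [1e-5, 2e-5],
    "num_epochs": [1, 2],
}

# Small worked example of the metric on made-up predictions.
print(round(macro_f1([0, 0, 1, 1], [0, 1, 1, 1]), 4))
```

Because macro F1 averages the per-class scores without weighting them by class size, errors on the small Flat-Earth and Russian classes count as much as errors on Covid, which is why it suits the imbalanced Sub-task B.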

The best-performing model differed between the sub-tasks. The best-performing model in the binary classification sub-task was based on "efederici/sentence-BERTino". This model was trained on the "text-davinci-003" augmented dataset for 1 epoch. We used 5 iterations and a learning rate of 1e-05. In contrast, the larger "nickprock/sentence-bert-base-italian-xxl-uncased" model performed best for the fine-grained conspiracy topic classification sub-task. We trained this model on the same dataset for 1 epoch. The learning rate used was 1e-05, and the number of iterations was set to 10. This model yielded the best results in both Leaderboards (see Table 4).

We conducted an ablation study after the competition ended to assess the impact of data augmentation. We trained the best-performing models under different [...]

In the case of sub-task A, the additional data substantially influenced both the Public and Private test results. The augmented dataset led to significant improvements in performance. However, we see a decline in the Private [...]

⁴ https://huggingface.co/models

Table 3

Model
efederici/sentence-BERTino
efederici/sentence-bert-base
efederici/sentence-BERTino-3-64
efederici/mmarco-sentence-BERTino
efederici/sentence-it5-base
efederici/sentence-it5-small
nickprock/sentence-bert-base-italian-uncased
nickprock/sentence-bert-base-italian-xxl-uncased
aiknowyou/aiky-sentence-bertino

Private Leaderboard (table fragment): 81.29%, 83.83%, 82.25%
Private Leaderboard (table fragment): 93.67%, 89.67%, 87.07%

5. Conclusion

In this paper, we described our approach addressing the two sub-tasks in the ACTI @ EVALITA 2023 competition. The challenge focuses on automatically detecting conspiratorial Telegram messages and the classification into four conspiracy topics: Covid, QAnon, Flat-Earth, and Russian conspiracies. Through the utilization of text augmentation techniques and the training of Sentence-Transformers with contrastive learning, we developed robust classifiers. Our best models achieved first place in the Private Leaderboard on both tasks with F1 scores of 85.712% in the binary classification and 91.225% for the fine-grained conspiracy topic classification. This paper contributes to the growing body of research on conspiracy theory detection and emphasizes the effectiveness of leveraging pre-trained models and data augmentation techniques. Our results argue for the potential of these approaches in addressing the challenges posed by conspiracy theories and their propagation in online platforms.

Acknowledgement

This work was supported by a grant of the Ministry of Research, Innovation, and Digitalization, project CloudPrecis, Contract 344/390020/06.09.2021, MySMIS code: 124812, within POC.

References

[1] M. Lai, S. Menini, M. Polignano, V. Russo, R. Sprugnoli, G. Venturi, Evalita 2023: Overview of the 8th evaluation campaign of natural language processing and speech tools for italian, in: Proceedings of the Eighth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2023), CEUR.org, Parma, Italy, 2023.
[2] D. Aaronovitch, Voodoo histories: The role of the conspiracy theory in shaping modern history, NY: Riverhead (2010).
[3] S. Vosoughi, D. Roy, S. Aral, The spread of true and false news online, science 359 (2018) 1146–1151.
[4] H. W. Hanley, D. Kumar, Z. Durumeric, A golden age: Conspiracy theories’ relationship with misinformation outlets, news media, and the wider internet, arXiv preprint arXiv:2301.10880 (2023).
[5] R. Brotherton, C. C. French, A. D. Pickering, Measuring belief in conspiracy theories: The generic conspiracist beliefs scale, Frontiers in psychology 4 (2013) 279.
[6] A. Seitz, Mob at u.s. capitol encouraged by online conspiracy theories, The Associated Press (2021). URL: https://apnews.com/article/donald-trump-conspiracy-theories-michael-pence-media-social-media-daba3f5dd16a431abc627a5cfc922b87.
[7] W. Audureau, Why conspiracy theorists and the kremlin echo each other’s disinformation, Le Monde (2023). URL: https://www.lemonde.fr/en/les-decodeurs/article/2023/03/02/conspiracy-theorists-the-kremlin-echo-each-other-s-disinformation_6017960_8.html.
[8] I. Yablokov, Russian disinformation finds fertile ground in the west, Nature Human Behaviour 6 (2022) 766–767.
[9] G. Russo, N. Stoehr, M. H. Ribeiro, Acti at evalita 2023: Overview of the conspiracy theory identification task, arXiv preprint arXiv:2307.06954 (2023).
[10] B. Collins, B. Zadrozny, Facebook bans qanon across its platforms, https://www.nbcnews.com/tech/tech-news/facebook-bans-qanon-across-its-platforms-n1242339, 2020.
[11] Anti-Defamation League, ADL statement on Facebook’s decision to finally ban QAnon content from platform, https://www.adl.org/news/press-releases/adl-statement-on-facebooks-decision-to-finally-ban-qanon-content-from-platform, 2020.
[12] E. Zuckerman, C. Rajendra-Nicolucci, Deplatforming our way to the alt-tech ecosystem, Knight First Amendment Institute at Columbia University, January 11 (2021).
[13] G. Russo, L. Verginer, M. H. Ribeiro, G. Casiraghi, Spillover of antisocial behavior from fringe platforms: The unintended consequences of community banning, in: Proceedings of the International AAAI Conference on Web and Social Media, volume 17, 2023, pp. 742–753.
[14] C. Dewey, Washington Post: These are the 5 subreddits Reddit banned under its game-changing anti-harassment policy, and why it banned them, https://wapo.st/3AO7pbl, 2016.
[15] G. Russo, M. Horta Ribeiro, G. Casiraghi, L. Verginer, Understanding online migration decisions following the banning of radical communities, in: Proceedings of the 15th ACM Web Science Conference 2023, WebSci ’23, Association for Computing Machinery, New York, NY, USA, 2023, p. 251–259. URL: https://doi.org/10.1145/3578503.3583608. doi:10.1145/3578503.3583608.
[16] M. Horta Ribeiro, S. Jhaver, S. Zannettou, J. Blackburn, G. Stringhini, E. De Cristofaro, R. West, Do platform migrations compromise content moderation? evidence from r/the_donald and r/incels, Proceedings of the ACM on Human-Computer Interaction 5 (2021) 1–24.
[17] A. Ghenai, Y. Mejova, Catching zika fever: Application of crowdsourcing and machine learning for tracking health misinformation on twitter, in: 2017 IEEE International Conference on Healthcare Informatics (ICHI), 2017, pp. 518–518. doi:10.1109/ICHI.2017.58.
[18] M. J. Wood, Propagating and debunking conspiracy theories on twitter during the 2015–2016 zika virus outbreak, Cyberpsychology, behavior, and social networking 21 (2018) 485–490.
[19] K. Pogorelov, D. T. Schroeder, L. Burchard, J. Moe, S. Brenner, P. Filkukova, J. Langguth, Fakenews: Corona virus and 5g conspiracy task at mediaeval 2020., in: MediaEval, 2020.
[20] M. Moosleitner, B. Murauer, G. Specht, Detecting conspiracy tweets using support vector machines., in: MediaEval, 2020.
[21] A. Malakhov, A. Patruno, S. Bocconi, Fake news classification with BERT, in: Working Notes Proceedings of the MediaEval 2020 Workshop, Online, 14-15 December 2020, volume 2882 of CEUR Workshop Proceedings, CEUR-WS.org, 2020. URL: https://ceur-ws.org/Vol-2882/paper38.pdf.
[22] A. Paraschiv, G.-E. Zaharia, D.-C. Cercel, M. Dascalu, Graph convolutional networks applied to fakenews: corona virus and 5g conspiracy, UPB Scientific Bulletin, Series C: Electrical Engineering 83 (2021) 71–82.
[23] A. Tyagi, K. M. Carley, Climate change conspiracy theories on social media, arXiv preprint arXiv:2107.03318 (2021).
[24] M. H. Amin, H. Madanu, S. Lavu, H. Mansourifar, D. Alsagheer, W. Shi, Detecting conspiracy theory against covid-19 vaccines, arXiv preprint arXiv:2211.13003 (2022).
[25] L. Tunstall, N. Reimers, U. E. S. Jo, L. Bates, D. Korat, M. Wasserblat, O. Pereg, Efficient few-shot learning without prompts, arXiv preprint arXiv:2209.11055 (2022).
[26] N. Reimers, I. Gurevych, Sentence-bert: Sentence embeddings using siamese bert-networks, arXiv preprint arXiv:1908.10084 (2019).
[27] A. Amin-Nejad, J. Ive, S. Velupillai, Exploring transformer text generation for medical dataset augmentation, in: Proceedings of the Twelfth Language Resources and Evaluation Conference, 2020, pp. 4699–4708.
[28] G. Russo, N. Hollenstein, C. C. Musat, C. Zhang, Control, generate, augment: A scalable framework for multi-attribute text generation, ArXiv abs/2004.14983 (2020).
[29] J. Hong, J. Park, D. Kim, S. Choi, B. Son, J. Kang, Empowering sentence encoders with prompting and label retrieval for zero-shot text classification, 2023. arXiv:2212.10391.
[30] G. Piao, Scholarly text classification with sentence bert and entity embeddings, in: Trends and Applications in Knowledge Discovery and Data Mining: PAKDD 2021 Workshops, WSPA, MLMEIN, SDPRA, DARAI, and AI4EPT, Delhi, India, May 11, 2021 Proceedings 25, Springer, 2021, pp. 79–87.
[31] G. Russo, C. Gote, L. Brandenberger, S. Schlosser, F. Schweitzer, Disentangling active and passive cosponsorship in the u.s. congress, ArXiv abs/2205.09674 (2022).
[32] L. Bates, I. Gurevych, Like a good nearest neighbor: Practical content moderation with sentence transformers, arXiv preprint arXiv:2302.08957 (2023).