1. Introduction and Motivations

of the Political Ideology Detection in Italian Texts Task

Daniel Russo

drusso@fbk.eu 1 2 5

Salud María Jiménez-Zafra

2 3

José Antonio García-Díaz

joseantonio.garcia8@um.es 2 4

Tommaso Caselli

t.caselli@rug.nl 0 2

Marco Guerini

guerini@fbk.eu 1 2

L. Alfonso Ureña-López

2 3

Rafael Valencia-García

valencia@um.es 2 4

0 CLCG, University of Groningen, Netherlands
1 LanD, Fondazione Bruno Kessler, Via Sommarive 18, Povo, Trento, Italy
2 Processing and Speech Tools for Italian, Sep 7 - 8, Parma, IT
3 SINAI, Universidad de Jaén, Spain
4 UMUTeam, Universidad de Murcia, Spain
5 University of Trento, Italy

This paper presents the PoliticIT 2023 shared task, organised at the EVALITA 2023 workshop. The task aims to extract politicians' ideology information from a set of tweets in Italian, framed as both a binary and a multiclass classification problem. The task is designed to be privacy-preserving and is accompanied by a subtask targeting the identification of self-assigned gender as a demographic trait. PoliticIT attracted 7 teams that registered for the task, submitted results and presented working notes describing their systems. Most teams proposed transformer-based approaches, while some also used traditional machine learning algorithms or a combination of both.

Author profiling, Political ideology, Author analysis, Demographic and psychographic traits

1. Introduction and Motivations

The study of the political discourse on Social Media Platforms is of paramount importance in order to understand where society is heading. Political discourse is by definition ideologically based and political ideologies are spread with discourse [ 1 ]: for this reason, the analysis of the latter cannot go without the understanding of the former.

Political ideology is defined as a psychographic trait that can help comprehend both individual and social behaviour, including moral and ethical values, attitudes, biases, and prejudices [2]. In fact, this trait helps in understanding how individuals think that society should be organised and has a strong relationship with personality traits, as demonstrated in [3]. For instance, they found that conscientiousness is strongly correlated with the right wing, whereas openness to experience and agreeableness were notably more correlated to the left wing.

Moreover, political ideology has a great influence in the daily lives of each citizen. For example, [4] found a correlation between political ideology and the attitude of citizens to vaccination campaigns. Still, citizens react to the political messages they are exposed to. Therefore, studying how politicians spread their ideology using social media discourses is useful to better analyse the policies and perspectives that are proposed on how society should be organized and work.

In this scenario, the PoliticIT shared task organized at EVALITA 2023 [5] aims to extract political ideology [...] [6], targeting author attribution, bot detection, gender detection, and author obfuscation, among others. Other initiatives, such as the PoliticES shared task [7], have focused on capturing other traits such as the political ideology expressed in a message. PoliticIT is a twin task of PoliticES and aims at analysing political ideology while being privacy-preserving. For this reason, a novel methodology of text clustering that creates “virtual users” has been used on top of traditional anonymisation procedures.

0009-0006-9123-5316 (D. Russo); 0000-0003-3274-8825 (S. M. Jiménez-Zafra); 0000-0002-3651-2660 (J. A. García-Díaz); 0000-0003-2936-0256 (T. Caselli); 0000-0003-1582-6617 (M. Guerini)

The rest of the paper is organized as follows. Section 2 describes the PoliticIT shared task. Section 3 presents the dataset provided in the competition. Section 4 summarises the participant approaches. Section 5 shows the results and a discussion thereof. Finally, Section 6 concludes the paper with a discussion of the most interesting outcomes of this task and possible future works.

2. Task Description

The PoliticIT task is structured along three subtasks:

• Subtask A - Self-assigned Gender: given a message, the system must assign a value for the gender of the author. The set of labels has been determined according to the personal web pages of the politicians of the Italian Parliament. The task has been framed as a binary classification task with M for men and F for women.
• Subtask B - Political Ideology (binary): systems are required to determine the political orientation of a message; the binary version of the task presents two macro-categories: Left and Right.
• Subtask C - Political Ideology (multi-class): this subtask presents a more fine-grained set of labels for the political orientation expressed by a given message. In this case, we employed four labels: Left, Moderate-Left, Moderate-Right, and Right.

PoliticIT was organized through CodaLab.1 The run of the task is divided into three phases: (i) Practice, (ii) Evaluation, and (iii) Post-evaluation. In the Practice phase, the participants were initially provided with a subset of the training data in order to familiarise themselves with the training data format. During this phase, we also provided a notebook comprising the code for our Logistic Regression baseline, as a starting point for the development of more efficient systems. The full training set was released in February 2023. Currently, the task is in its post-evaluation phase, where participation is publicly open to other teams and research groups from the community.

3. Datasets and Format

This section provides the reader with an overview of the dataset proposed for the PoliticIT 2023 shared task along with a comprehensive description of the modalities employed for creating it.

3.1. Data Collection

The dataset was collected from the Twitter timelines2 of Italian politicians using the UMUCorpusClassifier [8], following a strategy similar to the one adopted in PoliticES 2022 [7] and in [9]. In particular, the data refer to the politicians from the legislature XIX of the Italian Republic. The list of deputies, senators, and ministers was taken from the institutional websites of the Italian Parliament3 and Government.4 All the politicians’ Twitter accounts were manually retrieved, as they are not reported on the institutional websites. We discarded politicians that did not have a Twitter account or that were highly inactive on this social media (i.e. whose accounts present very few or old tweets). The time window for the corpus compilation had December 2022 as the latest date, but no start date was set. In the first iteration, we compiled 371,822 tweets from 468 politicians between November 2010 and December 2022. The average number of tweets per politician is 794.49 but with a large standard deviation of 847.12, which suggests that not all politicians are equally active on Twitter. Thus, we decided to remove from the dataset those politicians with fewer than 25 tweets, leaving a total of 408 politicians.

To balance the number of tweets per politician, we first removed those tweets that are not written in Italian. To detect the language, we employed the FastText language identification model [10]. Secondly, we removed the documents that shared content from news websites without retweeting. To do this, we discarded tweets that contained mentions of news websites, by detecting linguistic clues within the text, such as the pipe symbol, which is commonly employed by news websites to categorise their content. Thirdly, we selected tweets based on topics. An initial list of topics was extracted with BERTopic [11], a topic modelling technique for the creation of interpretable clusters based on Transformers and c-TF-IDF. In particular, we leveraged the Italian BERT model from [12]. We obtained a list of topics organised into 21 categories. This list was manually checked to introduce additional keywords for categories such as European Union, immigration, energy, feminism, sports, mafia or religion. Next, we identified which topics appeared in each tweet and prioritised those tweets that contained at least one topic. We then selected the tweets according to their topic in order to avoid any possible bias in the dataset.
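The activity threshold and the news-link filter described for the data collection could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the regular expression used to spot the pipe symbol, and the `is_italian` callable (a stand-in for the FastText language identifier) are hypothetical and are not taken from the organisers' code.

```python
import re
from statistics import mean, stdev

MIN_TWEETS = 25  # activity threshold stated in the text

# Assumption: a pipe symbol with text on both sides marks a shared news headline.
NEWS_PIPE = re.compile(r"\S\s*\|\s*\S")


def activity_stats(tweets_by_politician):
    """Return the mean and sample standard deviation of tweets per politician."""
    counts = [len(tweets) for tweets in tweets_by_politician.values()]
    return mean(counts), stdev(counts)


def drop_inactive(tweets_by_politician, min_tweets=MIN_TWEETS):
    """Keep only politicians with at least `min_tweets` tweets."""
    return {
        politician: tweets
        for politician, tweets in tweets_by_politician.items()
        if len(tweets) >= min_tweets
    }


def looks_like_shared_news(text):
    """Heuristic check for the pipe symbol used by news websites."""
    return NEWS_PIPE.search(text) is not None


def filter_tweets(tweets, is_italian):
    """Keep tweets judged Italian by `is_italian` that do not look like shared news."""
    return [t for t in tweets if is_italian(t) and not looks_like_shared_news(t)]
```

In practice `is_italian` would wrap whichever language identifier is used; the topic-based selection step is not covered by this sketch.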

1https://codalab.lisn.upsaclay.fr/competitions/8507

2https://developer.twitter.com/en/docs/twitter-api/v1/tweets/timelines/api-reference/get-statuses-home_timeline

3Chamber of Deputies: https://www.camera.it/leg19/28 Senate of the Republic: https://www.senato.it/leg/19/BGT/Schede/Attsen/Sena.html

4https://www.governo.it/it/ministri-e-sottosegretari

3.2. Data Annotation and Anonymization

We enriched the dataset by assigning to each politician a label indicating their political ideology. Political ideologies have been directly derived from the politicians’ party affiliation. In particular, the mapping from the politician to the political ideology was obtained through a two-step procedure:

1. Automatic labelling of politicians with their current political party affiliation. The party affiliation has been inferred from the parliamentary group to which the parliament party belongs. The data were extracted from the Italian institutional websites on October 31, 2023, thus they do not reflect changes in parliamentary groups following this date.
2. Mapping of the political parties to specific political ideology labels. The set of labels has been identified using Wikipedia. 5 We used four political ideology labels, i.e. left, moderate left, right, moderate right. Parties that are mapped in the centre, or cross-party, were nevertheless assigned one of the four aforementioned labels on the basis of their political alliances and the programme they presented during the 2022 Italian election campaign. The decision to “force” this classification was made to avoid excessive imbalance within each class. Therefore, we labelled “Movimento 5 Stelle” as left, whereas “Azione” and “Italia Viva” as moderate left.

Gender labels were assigned through three different approaches, depending on the source of the data. For the Italian deputies, gender was directly extracted from the institutional website, which allows the filtering of members according to this trait. The website of the Senate of the Republic does not clearly state the gender of the members. In this case, we employed linguistic cues present on the personal page of each senator to infer the gender. Specifically, we looked at the Italian verb “nascere” in its past participle form as “nato” for the male label and “nata” for the female label. Finally, for the ministers, gender was manually assigned as the official Government website does not comprise this information and does not present helpful linguistic cues. In this case, ministers were labelled according to their biological sex.

Subsequently, to build a privacy-compliant approach we took a two-step procedure including anonymization and clustering:

• Anonymized references in text - References to politicians within Twitter mentions were anonymized by replacing them with the token @user. The rest of the in-text mentions were also replaced with the @user token. Consequently, the text traits cannot be guessed trivially by reading a politician’s name and searching for personal information on the Internet. We also replaced the name of the political parties and of their Twitter accounts with the POLITICAL_PARTY token.
• Clustering procedure - Subsequently, we created clusters of texts by mixing some of the extracted tweets in order to prevent ethical and privacy issues related to author profiling on Twitter. All the clusters are composed of tweets written by different politicians that share the same traits under evaluation, i.e. political ideology and self-assigned gender. For this, we divide the politicians into training and test in order to prevent tweets from the same politician from appearing in both training and validation. To generate a cluster, we first set their demographic and psychographic traits, and then randomly pick tweets from users that share these traits. Thereby, each cluster represents a “virtual user”, with their self-assigned gender (male, female) and political spectrum. For the latter, we labelled the data according to two axes: binary (left, right) and multiclass (left, moderate left, moderate right and right). At the end of this process, we obtained 1751 clusters with 80 tweets per cluster. It should be noted that the clusters from the training and test sets are independent to prevent machine learning approaches from identifying the authors rather than the demographic and psychographic traits.

3.3. Data Formats

The training and test sets are produced in a ratio of nearly 75%-25%. Table 1 presents a summary of the distribution of labels per subtask. In no case are the labels evenly distributed. Male politicians are almost double the number of female politicians, and more than 200 politicians are from the left wing than from the right. As regards the multi-class ideology, moderate left and right are the most represented labels.

Ultimately, each entry of the PoliticIT dataset comprises four elements: a cluster id, the self-assigned gender label and the political ideology labels for binary and multiclass classification. The dataset is organised at tweet level. This means that each row represents one tweet. Each line also contains a cluster id to identify the cluster to which the tweet belongs, as well as the demographic and psychographic traits of the cluster. Examples are provided in Table 2. The full dataset, including the gold labels, is available on Codalab 6.
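The anonymization and clustering procedure of Section 3.2, together with the tweet-level layout of Section 3.3, might be sketched as below. This is a simplified illustration and not the organisers' implementation: all function and column names are hypothetical, the name and party lexicons are caller-supplied assumptions, the 25% ratio is applied to politicians as an assumption, and clusters are formed by pooling and shuffling tweets per trait combination, which is only one way to "randomly pick tweets from users that share these traits".

```python
import random
import re
from collections import defaultdict

MENTION = re.compile(r"@\w+")


def anonymize(text, person_names, party_terms):
    """Mask party names/accounts, account mentions and in-text person names."""
    # Party terms first, so party handles are not swallowed by the mention rule.
    for term in sorted(party_terms, key=len, reverse=True):
        text = re.sub(re.escape(term), "POLITICAL_PARTY", text, flags=re.IGNORECASE)
    text = MENTION.sub("@user", text)
    for name in sorted(person_names, key=len, reverse=True):
        text = re.sub(re.escape(name), "@user", text, flags=re.IGNORECASE)
    return text


def split_politicians(politicians, test_ratio=0.25, seed=0):
    """Split at politician level so no author appears in both partitions."""
    rng = random.Random(seed)
    shuffled = sorted(politicians)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


def build_clusters(tweets_by_politician, traits, cluster_size=80, seed=0):
    """Create 'virtual users': fixed-size groups of tweets sharing the same traits.

    `traits` maps politician -> (gender, binary ideology, multiclass ideology).
    Leftover tweets that do not fill a whole cluster are dropped.
    """
    rng = random.Random(seed)
    pool = defaultdict(list)
    for politician, tweets in tweets_by_politician.items():
        pool[traits[politician]].extend(tweets)
    clusters = []
    for trait, tweets in sorted(pool.items()):
        rng.shuffle(tweets)
        for start in range(0, len(tweets) - cluster_size + 1, cluster_size):
            clusters.append((trait, tweets[start:start + cluster_size]))
    return clusters


def to_rows(clusters):
    """Flatten clusters into tweet-level rows carrying the cluster id and labels."""
    rows = []
    for cluster_id, ((gender, binary, multiclass), tweets) in enumerate(clusters):
        for tweet in tweets:
            rows.append({
                "cluster": cluster_id,
                "gender": gender,
                "ideology_binary": binary,
                "ideology_multiclass": multiclass,
                "tweet": tweet,
            })
    return rows
```

Running `build_clusters` separately on the training and test politicians returned by `split_politicians` keeps the two sets of clusters independent, which is the property the text stresses.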

5https://it.wikipedia.org/wiki/Partiti_politici_italiani

6https://codalab.lisn.upsaclay.fr/competitions/8507

4. Systems Overview

A total of seven teams participated in the PoliticIT task, with all teams involved in each of the subtasks. The majority of the participants represented academic institutions. An overview of the system approaches can be found in Table 3, while Section 4.1 provides further details on the systems proposed.

4.1. System Architectures

ExtremITA [13]. The team proposed two systems. The first system is based on Camoscio [14], the Italian version of the Stanford Alpaca model [15], pre-trained to generate text as a response to users’ instructions passed as input. The team performed further fine-tuning on triples <task, input, output>. More precisely, the model used the phrasal forms derived from the training data of all EVALITA 2023 challenges: the task is a linguistic description of the task to be solved, whereas the input-output pairs are task-specific. The second system is based on the IT5 Transformer [16], fine-tuned on input-output pairs. As with the first system, these pairs are the task-specific phrasal forms derived from the training data of all EVALITA 2023 challenges.
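To make the <task, input, output> format more concrete, the following sketch shows one way such triples could be represented and serialised. It is purely illustrative: the task wording, the prompt layout and the function names are assumptions and do not reproduce the ExtremITA team's actual templates.

```python
# Hypothetical task description; the wording used by the team is not given here.
TASK = "Classify the political ideology of the author as left or right."


def make_triple(task, text, label):
    """Build one <task, input, output> training triple."""
    return {"task": task, "input": text, "output": label}


def instruction_prompt(triple):
    """Serialise a triple as a prompt for an instruction-tuned model (assumed layout)."""
    return (
        f"### Task:\n{triple['task']}\n"
        f"### Input:\n{triple['input']}\n"
        "### Output:\n"
    )


def seq2seq_pair(triple):
    """Drop the task description to obtain a plain input-output pair."""
    return triple["input"], triple["output"]
```

The first serialisation corresponds to the instruction-following setting, while `seq2seq_pair` mirrors the input-output pairs used for a text-to-text model.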

INFOTEC-LaBD [17]. The team employed SVM classifiers with linear and nonlinear kernels. Specific attention was given to the data representation. Indeed, the authors employed low-dimensional projections to concisely represent the dataset and its associated labels. These vectors were used for training an SVM classifier.

INGEOTEC. The team implemented different configurations of Bag of Words (BoW) classifiers in all the subtasks. In particular, for gender and political ideology multiclass classification, INGEOTEC employed a stack generalization approach leveraging three BoW classifiers: two BoW classifiers pre-trained on 5M Italian tweets, and a BoW classifier trained on the PoliticIT training set. For the political ideology binary subtask, instead, the team proposed a BoW classifier trained on the training set.

Teeeech. The team proposed three Transformer-based classifiers trained independently of each other. The authors did not specify which Transformer models were used.

Tübingen [18]. The team proposed two main approaches: an SVM-based approach and a Transformer-based approach. The former was the best-performing, i.e., simple linear SVMs with sparse word/character n-gram features, trained separately for each task, only

5. Results and Discussion

Focusing on each subtask, it appears evident that binary classification of political ideologies is relatively easier when compared to the other two subtasks, with the best result being 0.928, obtained by Tübingen. The more fine-grained the distinction of political ideologies is, the more challenging the task. This is not just an effect of having multiple classes and of the distribution of the data, but it also involves the subtleties and nuances in policies across the four groups. In Figure 1 we display the confusion matrices obtained by the Tübingen team for the three classification subtasks. Focusing on the errors of Subtask C, multi-class ideology classification, we can notice that most of the errors concern misclassification of the “extremes” (i.e., Left and Right) into the Moderate Left category rather than of the moderate positions (Moderate Left vs. Moderate Right). This indicates that, at least in their communication on Twitter, differences in the moderate political areas are stronger. A further noticeable result is the fact that messages from politicians on the Right spectrum tend to be assimilated mostly with the Moderate Left and Moderate Right, suggesting that the narratives of moderate groups tend to assimilate issues and expressions of the Right.

The identification of the gender of the author from a tweet is also quite challenging even if it is framed as a binary task. In this case, as illustrated by Figure 1a, most of the errors affect the classification of male politicians into females. The best system, INFOTEC-LaBD, obtains 0.824 macro F1, with a positive Δ from the second best system (Tübingen) of 0.032 points.

6. Conclusions

In this paper, we have summarized the outcomes of the first edition of the PoliticIT task at EVALITA 2023. PoliticIT targets the identification of the political ideology and gender of the author of a tweet. Political ideology is a psychographic trait that can be used to understand individual and social behaviour, and thus contribute to a better understanding of the society. The task introduces an innovative method concerning the anonymization of users to preserve privacy, allowing the investigation of these sensitive topics in a fair and ethical way.

PoliticIT has seen the participation of seven teams, five of whom submitted a full report describing their approach. The results indicate that fine-grained political ideology distinction is more challenging than binary classification between two extreme values. This appears to be due to the absorption of the narratives of the extremes into the moderate positions, especially from the Moderate Left. This is a datum that could provide political scientists and sociologists additional insights into the evolution of the Italian political systems. A further aspect that raises interest concerns the errors in Subtask A, gender classification. In this case, most of the errors are male politicians classified as females. A deep exploration of the communication styles of the two genders and their correlation with political affiliations is a promising path to better understanding this behaviour.

As expected, approaches based on Transformers are the prevailing solutions presented by participating teams, but some of them also used feature-based linear machine learning systems. It is quite impressive that the best performing team, Tübingen, uses linear SVMs with word and character n-grams as features, weighting each feature using TF-IDF. This indicates that simple methods are still competitive and far from being fully overcome by neural approaches.

PoliticIT has been fully run on CodaLab. The task is now in the Post-evaluation phase and it is still open to submissions from other teams, making this resource freely available to the NLP community for research purposes.

Future work will investigate the addition of an extra subtask related to stance detection, to determine which authors are in favor of certain topics and which users are against. We can use this information to define clusters of users and to observe whether there is a relationship between the topics and the political ideology.

Acknowledgments

This work is part of the research projects LaTe4PoliticES (PID2022-138099OB-I00) funded by MCIN/AEI/10.13039/501100011033 and the European Fund for Regional Development (FEDER) - a way to make Europe and LaTe4PSP (PID2019-107652RB-I00/AEI/10.13039/501100011033) funded by MCIN/AEI/10.13039/501100011033. This work is also part of the research projects AIInFunds (PDC2021-121112-I00) and LT-SWM (TED2021-131167B-I00) funded by MCIN/AEI/10.13039/501100011033 and by the European Union NextGenerationEU/PRTR. It also has been partially supported by Project CONSENSO (PID2021-122263OB-C21), Project MODERATES (TED2021-130145B-I00) and Project SocialTox (PDC2022-133146-C21) funded by MCIN/AEI/10.13039/501100011033 and by the European Union NextGenerationEU/PRTR, Project PRECOM (SUBV-00016) funded by the Ministry of Consumer Affairs of the Spanish Government, Project FedDAP (PID2020-116118GA-I00) supported by MICINN/AEI/10.13039/501100011033 and WeLee project (1380939, FEDER Andalucía 2014-2020) funded by the Andalusian Regional Government. Salud María Jiménez-Zafra has been partially supported by a grant from Fondo Social Europeo and the Administration of the Junta de Andalucía (DOC_01073).

References

[1] Fairclough, Critical discourse analysis: The critical study of language, Routledge, 2013.

[2] Verhulst, L. J. Eaves, P. K. Hatemi, Correlation not causation: The relationship between personality traits and political ideologies, American journal of political science 56 (2012) 34-51.

[3] Fatke, Personality traits and political ideology: A first global assessment, Political Psychology 38 (2017) 881-899.

[4] Baumgaertner, J. E. Carlisle, Justwan, The influence of political ideology and trust on willingness to vaccinate, PloS one 13 (2018) e0191728.

[5] M. Lai, S. Menini, M. Polignano, V. Russo, R. Sprugnoli, G. Venturi, EVALITA 2023: Overview of the 8th Evaluation Campaign of Natural Language Processing and Speech Tools for Italian, in: M. Lai, S. Menini, M. Polignano, V. Russo, R. Sprugnoli, G. Venturi (Eds.), Proceedings of the Eighth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2023), CEUR.org, Parma, Italy, 2023.

[6] Bevendorff, Chulvi, G. L. D. L. Peña Sarracén, Kestemont, Manjavacas, I. Markov, Mayerl, Potthast, Rangel, Rosso, et al., Overview of PAN 2021: authorship verification, profiling hate speech spreaders on twitter, and style change detection, in: International Conference of the Cross-Language Evaluation Forum for European Languages, Springer, 2021, pp. 419-431.

[7] J. A. García-Díaz, S. M. Jiménez-Zafra, M.-T. M. Valdivia, F. García-Sánchez, L. A. Ureña-López, R. Valencia-García, Overview of PoliticEs 2022: Spanish Author Profiling for Political Ideology, Procesamiento del Lenguaje Natural 69 (2022) 265-272.

[8] J. A. García-Díaz, Á. Almela, Alcaraz-Mármol, R. Valencia-García, Umucorpusclassifier: Compilation and evaluation of linguistic corpus for natural language processing tasks, Procesamiento del Lenguaje Natural 65 (2020) 139-142.

[9] J. A. García-Díaz, Colomo-Palacios, R. Valencia-García, Psychographic traits identification based on political ideology: An author analysis study on spanish politicians' tweets posted in 2020, Future Generation Computer Systems 130 (2022) 59-74.

[10] Joulin, É. Grave, Bojanowski, T. Mikolov, Bag of tricks for efficient text classification, in: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, 2017, pp. 427-431.

[11] Grootendorst, Bertopic: Neural topic modeling with a class-based tf-idf procedure, arXiv preprint arXiv:2203.05794 (2022).

[12] Schweter, Italian bert and electra models, 2020. URL: https://doi.org/10.5281/zenodo.4263142. doi:10.5281/zenodo.4263142.

[13] C. D. Hromei, D. Croce, V. Basile, R. Basili, ExtremITA at EVALITA 2023: Multi-Task Sustainable Scaling to Large Language Models at its Extreme, EVALITA 2023 Eighth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian (2023).

[14] Santilli, Camoscio: An italian instruction-tuned llama, https://github.com/teelinsan/camoscio, 2023.

[15] Taori, I. Gulrajani, Zhang, Dubois, Li, C. Guestrin, Liang, T. B. Hashimoto, Stanford alpaca: An instruction-following llama model, https://github.com/tatsu-lab/stanford_alpaca, 2023.

[16] Sarti, M. Nissim, It5: Large-scale text-to-text pretraining for italian language understanding and generation, ArXiv preprint 2203.03759 (2022). URL: https://arxiv.org/abs/2203.03759.

[17] Cabrera-Pineda, E. S. Téllez, S. Miranda, INFOTEC-LaBD at PoliticIT: Political Ideology Detection in Italian Texts, EVALITA 2023 Eighth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian (2023).

[18] Çöltekin, Brivio, Can, Tübingen at PoliticIT: Exploring SVMs, Pretrained Language Models, and Linguistic Transfer for Ideology Detection in Social Media, EVALITA 2023 Eighth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian (2023).

[19] Conneau, Khandelwal, Goyal, Chaudhary, Wenzek, Guzmán, É. Grave, Ott, Zettlemoyer, Stoyanov, Unsupervised cross-lingual representation learning at scale, in: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020, pp. 8440-8451.

[20] Erjavec, Ogrodniczuk, Osenova, Ljubešić, Simov, Pančur, Rudolf, Kopp, Barkarson, Steingrímsson, et al., The parlamint corpora of parliamentary proceedings, Language resources and evaluation 57 (2023) 415-448.

[21] Pan, Á. Almela, García-Sánchez, UMUTeam at PoliticIT-EVALITA2023: Evaluating Transformer Model for Detecting Political Ideology in Italian Texts, EVALITA 2023 Eighth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian (2023).

[22] Á. Rodríguez-García, URJC-Team at EVALITA 2023: Political Ideology Detection in Italian Texts Using Transformers Architectures, EVALITA 2023 Eighth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian (2023).

[23] Tiedemann, Parallel data, tools and interfaces in OPUS, in: Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), European Language Resources Association (ELRA), Istanbul, Turkey, 2012, pp. 2214-2218. URL: http://www.lrec-conf.org/proceedings/lrec2012/pdf/463_Paper.pdf.

[24] P. J. Ortiz Suárez, B. Sagot, L. Romary, Asynchronous pipelines for processing huge corpora on medium to low resource infrastructures, Proceedings of the Workshop on Challenges in the Management of Large Corpora (CMLC-7) 2019. Cardiff, 22nd July 2019, Leibniz-Institut für Deutsche Sprache, Mannheim, 2019, pp. 9-16. URL: http://nbn-resolving.de/urn:nbn:de:bsz:mh39-90215. doi:10.14618/ids-pub-9021.

[25] Polignano, Basile, M. de Gemmis, G. Semeraro, V. Basile, AlBERTo: Italian BERT Language Understanding Model for NLP Challenging Tasks Based on Tweets, in: Proceedings of the Sixth Italian Conference on Computational Linguistics (CLiC-it 2019), volume 2481, CEUR, 2019. URL: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85074851349&partnerID=40&md5=7abed946e06f76b3825ae5e294ffac14.