
1613-0073

Community-based Stance Detection

Emanuele Brugnoli

emanuele.brugnoli@sony.com

Donald Ruggiero Lo Sardo

donaldruggiero.losardo@sony.com

Centro Studi e Ricerche Enrico Fermi (CREF), Piazza del Viminale 1, 00184 Rome, Italy
Dipartimento di Fisica - Sapienza Università di Roma, P.le A. Moro 2, 00185 Rome, Italy
Sony Computer Science Laboratories Rome, Joint Initiative CREF-SONY, Piazza del Viminale 1, 00184 Rome, Italy

Keywords: Stance Detection, Polarisation, Social Networks

2024


Stance detection is a critical task in understanding the alignment or opposition of statements within social discourse. In this study, we present a novel stance detection model that labels claim-perspective pairs as either aligned or opposed. The primary innovation of our work lies in our training technique, which leverages social network data from X (formerly Twitter). Our dataset comprises tweets from opinion leaders, political entities and news outlets, along with their followers' interactions through retweets and quotes. By reconstructing politically aligned communities based on retweet interactions, treated as endorsements, we check these communities against common knowledge representations of the political landscape. Our training dataset consists of tweet/quote pairs where the tweet comes from a political entity and the quote either originates from a follower who exclusively retweets that political entity (treated as aligned) or from a user who exclusively retweets a political entity from an opposing ideological community (treated as opposed). This curated subset is used to train an Italian language model based on the RoBERTa architecture, achieving an accuracy of approximately 85%. We then apply our model to label all tweet/quote pairs in the dataset, analyzing its out-of-sample predictions. This work not only demonstrates the efficacy of our stance detection model but also highlights the utility of social network structures in training robust NLP models. Our approach offers a scalable and accurate method for understanding political discourse and the alignment of social media statements.

CEUR ceur-ws.org

1. Introduction

Stance detection is a critical task within the domain of natural language processing (NLP). It involves identifying the position or attitude expressed in a piece of text toward a given target. Stances are typically classified into three primary categories: favor, against, and neutral. This classification enables a detailed description of textual data, facilitating a deeper insight into public opinion and discourse dynamics. The spread of communication platforms such as social media, forums, and online news outlets has resulted in an unprecedented volume of user-generated content. This surge underscores the necessity for automated systems capable of efficiently analyzing and interpreting these vast text corpora. Stance detection addresses this need by providing tools that can systematically assess opinions and reactions embedded within texts, thus offering valuable applications across various fields including social media analysis [3, 4], search engines [5], and linguistics [6].

(D. R. Lo Sardo) Dec 04-06, 2024, Pisa, Italy. ∗Corresponding author.

Stance detection has been explored across various fields with differing definitions and applications. According to Du Bois, stance-taking involves evaluating objects, positioning subjects, and aligning with others in dialogic interactions, emphasizing the sociocognitive aspects and intersubjectivity in discourse [6]. Sayah and Hashemi focus on academic writing, analyzing stance and engagement features like hedges, self-mention, and appeals to shared knowledge to understand communicative styles and interpersonal strategies [8]. Küçük and Can define stance detection as the classification of an author's position towards a target (favor, against, or neutral), highlighting its relevance to related tasks such as argument mining [9]. These diverse approaches underscore the multifaceted nature of stance detection and its applications in enhancing the understanding of social discourse, academic rhetoric, and online content analysis. For a review of recent developments in the field, we refer to Alturayeif et al. [2] and AlDayel et al. [3].

In this work, we propose a novel approach to training stance detection models by leveraging the interactions within highly polarized communities. Our method utilizes tweet/quote pairs from the Italian political debate to construct a robust training set. We operate under the assumption that users who predominantly retweet a particular political profile are likely in agreement with the statements made by that profile. We restricted our analysis to retweets since this form of communication primarily aligns with the endorsement hypothesis [10]. Namely, being a simple re-posting of a tweet, retweeting is commonly thought to express agreement with the claim of the tweet [11]. Further, though retweets might be used with other purposes such as those described by Marsili [12], the repeated nature of the interaction we observe in our networks reduces the probability that the activity falls outside of the endorsement behavior.

Conversely, while quoting a tweet works similarly to retweeting, the function allows users to add their own comments above the tweet. This makes this form of communication controversial regarding the endorsement hypothesis, as agreement or disagreement with the tweet depends on the stance of the added comment. On the other hand, the information social media users see, consume, and share through their news feed heavily depends on the political leaning of their early connections [13, 14]. In other words, while algorithms are highly influential in determining what people see and shaping their on-platform experiences [15], there is significant ideological segregation in political news exposure [16]. It is therefore reasonable to expect that users who almost exclusively retweet a political entity (party, leader, or both) use quote tweets to express agreement with statements posted by that entity and disagreement with statements posted by political entities ideologically distant from their preferred one. Additionally, the quote interaction perfectly encapsulates the stance triangle described by Du Bois [6].

In order to correctly assess political opposition we construct a retweet network and use the Louvain community detection algorithm [17] to characterize leaders and, through label propagation, the followers that align with their views.

Through these community labels we construct a dataset of claim-perspective couples by annotating tweet-quote pairs from profiles that clearly express political alignment as favor and annotating tweet-quote pairs in which the profiles come from different communities as against. Finally, we use a pretrained BERT model for Italian language and fine-tune it to the classification task.

This methodology aims to enhance the accuracy of stance detection models by incorporating real-world patterns of agreement and disagreement observed in polarized online environments. Further, it enables an unsupervised training paradigm that can be scaled to very large datasets.

In the following sections, we will outline the data gathering approach used for the dataset. Subsequently, we will describe the community detection methods employed to identify leaders and users within the Italian political discourse. We will then discuss the model architecture and its training process. In the results section, we will evaluate the model's performance and present our findings. Finally, the conclusion will address potential future developments, the implications of our work, and its limitations.

2. Results

In this study, we focus on a comprehensive set of Italian opinion leaders active on Twitter/X, including the official profiles of major news media outlets as well as prominent politicians and political parties. The profiles of news media outlets are further classified according to assessments provided by NewsGuard, which categorize them as either questionable or reliable sources. This classification is crucial for evaluating the quality of the information these outlets disseminate, particularly regarding their reputation for spreading misinformation. For the selected leaders, we collected all tweets produced from January 2018 to December 2022. The general public (followers) is identified based on their RTs to the content produced by these leaders. See Materials and Methods for details on the data collection process. Using this node configuration, we construct a bipartite network with two layers: leaders and followers, where the links represent the number of RTs by the latter of tweets made by the former. If a group of followers retweets tweets from two different leaders, it indicates that these leaders are likely communicating similar messages or viewpoints. To analyze these relationships more deeply, we perform a monopartite projection onto the leader layer. This projection, detailed in Materials and Methods, simplifies the network by concentrating solely on the leaders and the connections between them that are inferred from their shared followers.

Panel (A) of Figure 1 shows the RT network of leaders aggregated in terms of communities identified through an optimized version of the Louvain algorithm [17]. The a posteriori analysis of the political leaders in each group reveals that the clustering algorithm effectively identified communities that align with the political affiliations of the leaders in each cluster [18, 19]. Specifically, the Left-leaning community includes political entities such as +Europa, Azione, Enrico Letta, and Nicola Fratoianni; the Right-leaning community features leaders from FdI, FI, and Lega; and the Five Star Movement (M5S) community includes key figures like Giuseppe Conte and Luigi Di Maio. An interesting observation from the network configuration is the clustering of questionable news sources. These profiles consistently group within the same community, suggesting a potential alignment or affinity with specific political leanings or ideologies.

Figure 1: (A) Retweet network. (B) Stance network. Node types: political profiles, questionable news sources, reliable news sources.
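As a rough illustration of the monopartite projection onto the leader layer used in this section, the sketch below links two leaders whenever a follower retweeted both, with each length-two path weighted by the product of its two links. Summing these products over shared followers is an assumption, as are the function name and the toy data; the statistical validation with BiWCM is not reproduced here.

```python
from collections import defaultdict
from itertools import combinations


def project_on_leaders(rt_counts):
    """Project a weighted follower-leader bipartite network onto the leaders.

    rt_counts maps (follower, leader) -> number of retweets.
    Each length-two path leader_a - follower - leader_b contributes the
    product of its two link weights; contributions are summed over
    followers (the summation is an assumption of this sketch).
    """
    by_follower = defaultdict(dict)
    for (follower, leader), weight in rt_counts.items():
        if weight > 0:
            by_follower[follower][leader] = weight

    projected = defaultdict(int)
    for leaders in by_follower.values():
        # Sorting gives every undirected leader pair a single canonical key.
        for a, b in combinations(sorted(leaders), 2):
            projected[(a, b)] += leaders[a] * leaders[b]
    return dict(projected)


# Hypothetical toy data: two followers retweet both leader_A and leader_B.
rt_counts = {
    ("u1", "leader_A"): 3,
    ("u1", "leader_B"): 2,
    ("u2", "leader_A"): 1,
    ("u2", "leader_B"): 4,
    ("u3", "leader_C"): 5,
}
leader_links = project_on_leaders(rt_counts)
```

The resulting weighted leader-leader edges are the kind of input a community detection routine such as Louvain would take.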

Leveraging the political bias of followers in our Twitter network, we build a very large dataset of tweet-quote pairs, each annotated with the corresponding stance (favor or against), as better described in Materials and Methods. Since this method assigns the stance to each pair in an unsupervised manner, to ensure that our approach is performing correctly, we randomly selected 500 pairs (250 favor and 250 against) and manually annotated their stance. We then compared the results of the automatic annotation with the manual annotation. The results, shown in Appendix - Table 3, indicate a high level of accuracy in favor and against classifications, with a small number of neutral cases. The dataset serves as training set for fine-tuning UmBERTo [20], an Italian language model based on the RoBERTa architecture [21], to assign stance labels to claim-perspective pairs. The fine-tuning process is performed using 5-fold cross-validation. The optimal performance for each fold is assessed by measuring the accuracy, i.e., the ratio of correctly predicted instances (both true favor and true against) to the total number of instances. The best-trained models from each fold demonstrate nearly identical performance, as shown by the average accuracy and F1-scores reported in the following table.

[Table: average accuracy and F1-scores of the best-trained models on the Training and Test sets.]

The best model from fold 3 is identified as the highest performing and is therefore used in the following analyses. The corresponding confusion matrices for both the training and test sets are provided in Appendix - Table 5.
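The evaluation protocol described above (5-fold cross-validation scored by accuracy) can be sketched in plain Python as follows. The helper names, the shuffling seed, and the toy labels are assumptions made for illustration; the actual fine-tuning step is not shown.

```python
import random


def k_fold_indices(n_items, k=5, seed=0):
    """Split range(n_items) into k (nearly) equal folds.

    Returns a list of (train_indices, test_indices) pairs, one per fold;
    each item appears in exactly one test fold.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


def accuracy(true_labels, predicted_labels):
    """Ratio of correctly predicted instances to the total number of instances."""
    if len(true_labels) != len(predicted_labels) or not true_labels:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical usage on 12 items; a model would be trained on `train`
# and scored on `test` inside the loop.
splits = k_fold_indices(n_items=12, k=5, seed=42)
for train, test in splits:
    assert not set(train) & set(test)
```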

Given the imbalance in the label distribution of the claim-perspective dataset, we use 41,347 pairs (each annotated as favor and previously removed to create a balanced training set) as an additional test set to evaluate the model's performance. The model achieves an accuracy of 83.6% when predicting the stance of these pairs.
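To make the construction of the balanced training set more concrete, the following sketch applies the selection rule given in Materials and Methods (at least 80% of a follower's retweets going to one political entity, and more than 7 retweets of political entities overall), labels pairs as favor or against, and then removes surplus majority-class pairs at random. All function names, the fixed threshold constant, and the toy data are hypothetical; in the paper the activity threshold results from excluding the bottom 80% of an empirical distribution.

```python
import random

MIN_SHARE = 0.80   # minimum share of retweets given to a single political entity
MIN_RETWEETS = 8   # "more than 7" retweets of political entities overall


def preferred_entity(rt_by_entity):
    """Return the entity a follower clearly prefers, or None if there is none."""
    total = sum(rt_by_entity.values())
    if total < MIN_RETWEETS:
        return None
    entity, count = max(rt_by_entity.items(), key=lambda item: item[1])
    return entity if count / total >= MIN_SHARE else None


def stance_label(follower_entity, tweet_entity, community_of):
    """favor: quote of the preferred entity; against: quote of an entity
    from another community; None: pair left out of the dataset."""
    if follower_entity is None:
        return None
    if tweet_entity == follower_entity:
        return "favor"
    if community_of[tweet_entity] != community_of[follower_entity]:
        return "against"
    return None


def balance(pairs, seed=0):
    """pairs: list of (tweet, quote, label). Returns (balanced, removed),
    where `removed` holds the randomly dropped majority-class pairs."""
    favor = [p for p in pairs if p[2] == "favor"]
    against = [p for p in pairs if p[2] == "against"]
    majority, minority = (favor, against) if len(favor) >= len(against) else (against, favor)
    random.Random(seed).shuffle(majority)
    return majority[:len(minority)] + minority, majority[len(minority):]


# Hypothetical example data.
community_of = {"party_x": "left", "party_y": "right", "party_z": "left"}
fan = preferred_entity({"party_x": 18, "party_y": 2})
pairs = [("t", "q%d" % i, "favor") for i in range(5)] + [("t", "r%d" % i, "against") for i in range(2)]
balanced, removed = balance(pairs, seed=1)
```

The removed pairs play the role of the additional favor-only test set mentioned above (142,312 - 100,965 = 41,347 pairs in the paper).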

The model is then applied to classify all the collected tweet-quote pairs based on their stance. Thus, following the same procedure used to construct the RT network of leaders, we develop the stance network and analyze its community structure. In this case, the weight of a link in the bipartite follower-leader network represents the positive difference between the number of favoring and against quotes from a follower on the leader's tweets.
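A minimal sketch of the link weights defined in this paragraph is given below. It reads "positive difference" as keeping only follower-leader links where favoring quotes outnumber against quotes, which is an interpretation rather than something the text states explicitly; the function name and input format are likewise assumptions.

```python
from collections import defaultdict


def stance_link_weights(labelled_quotes):
    """Aggregate predicted stances into follower-leader link weights.

    labelled_quotes: iterable of (follower, leader, stance) triples with
    stance in {"favor", "against"}. The weight is (#favor - #against);
    links with a non-positive difference are dropped (an assumption).
    """
    net = defaultdict(int)
    for follower, leader, stance in labelled_quotes:
        if stance == "favor":
            net[(follower, leader)] += 1
        elif stance == "against":
            net[(follower, leader)] -= 1
        else:
            raise ValueError("unexpected stance label: %r" % (stance,))
    return {edge: weight for edge, weight in net.items() if weight > 0}


# Hypothetical model output for a handful of quotes.
quotes = [
    ("u1", "leader_A", "favor"),
    ("u1", "leader_A", "favor"),
    ("u1", "leader_A", "against"),
    ("u2", "leader_A", "against"),
    ("u2", "leader_B", "favor"),
    ("u3", "leader_B", "favor"),
    ("u3", "leader_B", "against"),
]
weights = stance_link_weights(quotes)
```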

Panel (B) of Figure 1 shows the stance network of leaders aggregated in terms of communities identified through the Louvain algorithm. The node positions in this representation are the same as those in the RT network, providing a consistent framework for comparison. More formally, to evaluate the differences in clustering assignments between nodes present in both the retweet network and the stance network, we perform a clustering comparison. Namely, we use the contingency table [22] associated with both the representations to compute community overlap. Figure 2 shows the comparison results broken down by source type: political entities and news outlets. While clusters C and D of the stance network primarily align with clusters 2 and 3 of the RT network, respectively, clusters A and B of the stance network mainly represent a refinement of cluster 1 from the RT network.
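The clustering comparison can be illustrated with a small contingency table built from two community assignments over the nodes they share. The cluster labels below mirror the naming used in the text (1, 2, 3 and A, B, C, D), but the node assignments are invented, and normalizing each row to shares is an assumption about how the overlap might be summarized.

```python
from collections import Counter


def contingency_table(partition_a, partition_b):
    """Count nodes per (cluster in A, cluster in B), using only shared nodes."""
    shared = partition_a.keys() & partition_b.keys()
    return dict(Counter((partition_a[n], partition_b[n]) for n in shared))


def row_shares(table):
    """For each cluster of A, the fraction of its nodes in each cluster of B."""
    row_totals = Counter()
    for (a, _), n in table.items():
        row_totals[a] += n
    return {(a, b): n / row_totals[a] for (a, b), n in table.items()}


# Hypothetical assignments: RT-network clusters vs stance-network clusters.
rt_clusters = {"l1": 1, "l2": 1, "l3": 2, "l4": 3, "l5": 1}
stance_clusters = {"l1": "A", "l2": "B", "l3": "C", "l4": "D", "l6": "A"}

table = contingency_table(rt_clusters, stance_clusters)
shares = row_shares(table)
```

In this toy case RT cluster 1 is split evenly between stance clusters A and B, the same kind of refinement the paragraph describes.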

This suggests that even in the stance network, the emerging communities align with the political affiliations of the leaders within each cluster.

Although the tweet-quote pairs used to train the model include only tweets from political entities, the result is significant. The training set does not include pairs where the quote comes from a follower who exclusively retweets political entities from the same ideological community as the tweet's author. This demonstrates the model's ability to reconstruct communities through precise classification of textual pairs.

The contingency table for news outlets, while displaying less pronounced patterns overall, still demonstrates clear coherence in classification between the retweet network and the stance network. This is particularly remarkable considering that these profiles were not included in the model's training set. The recovery of the retweet network's community structure within the stance network suggests that the model successfully generalizes across profiles with differing linguistic constraints, with only a minimal loss in accuracy, while still allowing for the reconstruction of group affiliations.

3. Discussion

Stance detection remains a vital yet challenging area in natural language processing (NLP), traditionally limited by the constraints of supervised learning. The availability of large language corpora, where interaction networks can be reconstructed, offers a novel approach that incorporates the social and dynamic aspects of stance, as outlined by Du Bois in his work on the stance triangle [6].

Our model addresses a more complex task compared to other state-of-the-art models. While existing models typically classify a user's stance on specific topics, our model classifies claim-perspective pairs into favor and against categories. This requires a deeper analysis of the relational stance between multiple interacting users and their statements.

Despite this increased complexity, our model achieved results comparable to those of existing state-of-the-art models [23, 24]. This success supports the hypothesis that in-group/out-group determinants, well-documented in opinion dynamics, significantly explain the variation in behaviors [25].

Moreover, our model's ability to reconstruct communities based on the accurate classification of textual pairs (as shown in Figure 2) underscores its potential for community reconstruction in scenarios where the interaction network is not provided.

Importantly, this approach also opens avenues for studying network dynamics based on the probability of agreement between account pairs. This has significant implications for understanding and potentially mitigating coordinated attacks, such as disinformation campaigns and political propaganda. By identifying patterns of agreement and disagreement, we can better detect and analyze the strategies behind these coordinated efforts, enhancing our ability to safeguard democratic processes and public discourse.

4. Materials and Methods

Data Collection. Our dataset comprises approximately 15 million tweets collected by monitoring the activity of 583 profiles that reflect Italian online social dialogue (e.g., La Repubblica, Il Corriere della Sera, Il Giornale). Profiles were selected based on the list of news sites monitored by NewsGuard, a news rating agency dedicated to assigning reliability scores. According to NewsGuard, this list covers approximately 95% of online engagement with news, providing near-comprehensive coverage of news-related dialogue [26].

Additionally, we included Italian political entities in the list of profiles. This inclusion encompasses all major political parties and their leaders (e.g., Giorgia Meloni and Fratelli d'Italia, Elly Schlein and PD, Giuseppe Conte and M5S). For a complete list of the monitored political profiles see Appendix - Table 4.

For each monitored profile, we collected all tweets from January 2018 to December 2022 using the Twitter/X API before the limitations introduced by the new management. We also gathered all retweets (RTs) and quotes (QTs) of this content within the same time frame, limited to those tweets that gained at least 20 RTs or 10 QTs. The following table provides a detailed breakdown of the data matching these criteria.

[Table: breakdown of the collected data by category (News, Politics, Total).]
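The engagement criterion used during data collection (at least 20 RTs or 10 QTs per tweet) amounts to a simple filter, sketched below. The record fields `retweets` and `quotes` are hypothetical names chosen for the example, not fields of any particular API response.

```python
def passes_engagement_filter(n_retweets, n_quotes, min_retweets=20, min_quotes=10):
    """Keep a tweet if it gained at least 20 RTs or at least 10 QTs."""
    return n_retweets >= min_retweets or n_quotes >= min_quotes


# Hypothetical tweet records.
tweets = [
    {"id": "t1", "retweets": 25, "quotes": 0},
    {"id": "t2", "retweets": 3, "quotes": 12},
    {"id": "t3", "retweets": 19, "quotes": 9},
]
kept_ids = [t["id"] for t in tweets if passes_engagement_filter(t["retweets"], t["quotes"])]
```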

Community Detection. In order to reconstruct the discourse communities from the Twitter activity we built a retweet network. In the context of the data collection strategy previously described, most RTs are from a non-monitored user (a follower) to one of the users monitored (a leader), excluding a few RTs from one leader to another (45,299). We can therefore consider this network as a bipartite network, i.e. a network where all links are from one node type to another, with 367 leaders and 934,394 followers, connected through links whose weight equals the number of RTs from the follower to the leader.

To identify communities among leaders we assume that leaders with the same readership are more likely to be in the same political community. We therefore constructed a monopartite network by projecting on the leader layer, i.e. we construct a network from the set of all length-two paths, assigning weights that are the product of the path's links.

We used the Bipartite Weighted Configuration Model (BiWCM) to statistically validate our bipartite projection [27]. BiWCM accounts for weighted interactions and preserves the strength of nodes in both layers, ensuring that our observed co-occurrences are not due to random chance but represent genuine structural patterns in the data. In order to find political communities in the network, we applied the Louvain algorithm 1000 times and selected the solution that maximized modularity, i.e., the strength of division of the network into clusters, with higher values indicating a structure where more edges lie within communities than would be expected by chance [28].

The same procedure was followed to construct the stance network and study its community structure. In this case, the weight of a link in the bipartite follower-leader network indicates the fraction of favoring quotes from the follower to the leader's tweets.

Claim-Perspective Pairs Selection. To construct a dataset of claim-perspective text pairs annotated with the corresponding stance (favor if the perspective supports the claim, against otherwise), we first identified users who clearly expressed an (almost) absolute preference for a single political entity through their retweet activity. Specifically, for each follower, we calculated the distribution of their RTs across the political entities defined in Table 4. Then, we filtered those who allocated at least 80% of their RTs to a single political entity. Some users, although meeting the previous requirement, may not have had a sufficient level of retweet activity during the analyzed period to be considered inclined towards a particular political entity. For example, a user who has only given one retweet to the set of political profiles would appear totally inclined towards a particular entity. To reduce the uncertainty arising from the indiscriminate inclusion of all profiles satisfying the high retweet activity requirement for a single political entity, we calculated for each follower the total number of retweets of content produced by the set of political entities defined in Table 4 and excluded the bottom 80% of the resulting distribution (i.e., we required more than 7 such retweets per follower). For the remaining users, we then assigned the label favor to those quotes of tweets from their preferred political entity and the label against to those quotes of tweets from entities belonging to other political communities, as determined by the community detection analysis. This procedure resulted in the creation of a dataset containing 243,277 unique claim-perspective (tweet-quote) pairs, each annotated with the corresponding stance. Since the label distribution of the dataset was unbalanced towards favor (specifically, 142,312 favor and 100,965 against), we randomly removed 41,347 favor pairs to obtain a balanced training set for the stance model. The removed pairs were later used as an additional test set to evaluate the model's accuracy.

Stance model. We initialized our model starting from UmBERTo [20], an Italian language model based on the RoBERTa architecture [21]. Specifically, we relied on the cased version trained using SentencePiece tokenizer and Whole Word Masking on a large corpus, encompassing around 70 GB of text. This makes it highly effective for various natural language processing tasks in Italian, as it leverages a vast and diverse dataset to understand the nuances of the language [29, 30]. The pretrained model was then fine-tuned on the constructed dataset of tweet-quote pairs to create a tool capable of inferring the stance of claim-perspective text pairs: favor if the perspective agrees with the claim, and against otherwise. To input the text pairs into the pretrained model, we utilized UmBERTo's special tokens. Specifically, we concatenated the tweet and quote as

    <s> + tweet + </s></s> + quote + </s>,

where <s>, </s></s>, and </s> represent the start, separation, and end tokens, respectively. Since we set max_seq_length = 256, which limits the total number of tokens that can be processed by the model, in cases where the concatenated strings exceeded this limit, the longer text between the tweet and the quote was truncated. This ensures that the input remains within the model's processing capacity while preserving as much information as possible from both texts. Conversely, shorter concatenated strings were padded using the special token <pad> until they reached the 256-token limit. Tweets and quotes were preprocessed before being concatenated by removing URLs, mentions, non-UTF-8 characters, line breaks, and tabs.

The pretrained UmBERTo model was imported into Python from the Hugging Face Transformers library [31] as a model for sequence classification. The fine-tuning procedure enabled the model to output the probability distribution over the stance labels by minimizing the cross-entropy loss between the predicted labels and the true labels, effectively learning to classify the stance of claim-perspective pairs. We chose to perform 5-fold cross-validation to ensure the reliability of the results [32]. Namely, the data was first partitioned into 5 equally (or nearly equally) sized segments or folds. Subsequently, 5 iterations of training and testing were performed such that within each iteration a different fold of the data was held out for testing while the remaining 4 folds were used for learning. Thus, for each training-test split, we fine-tuned the UmBERTo model for 4 epochs using a batch size of 64 (for both training and testing) and an improved version of the Adam optimizer [33] with a learning rate of $5 \times 10^{-5}$ and a weight decay of 0.01 for regularization. The chosen hyperparameters are among those recommended in the literature [34, 21].

5. Conclusion

This study introduces a novel stance detection model that advances the understanding of alignment and opposition in social discourse. By leveraging social network data from X (formerly Twitter), we developed a robust training technique that utilizes interactions within politically aligned communities. Our approach involved curating a dataset of tweet/quote pairs, where the quotes are derived from users' interactions with leaders and politicians. This dataset was used to fine-tune a RoBERTa-based Italian language model, which achieved an accuracy of approximately 85%, comparable to state-of-the-art models.

Our findings underscore the efficacy of using social network structures to train NLP models, demonstrating that retweet interactions can serve as reliable indicators of political alignment. This methodology not only enhances the scalability of stance detection but also offers a nuanced understanding of political discourse on social media platforms. By reconstructing and validating politically aligned communities through expert knowledge, our model provides a robust framework for analyzing the alignment of social media statements.

The implications of this work extend beyond stance detection, offering potential applications in monitoring political sentiment, identifying misinformation, and understanding public opinion dynamics. Future research could explore the integration of additional social network features, the capacity of the model to generalize to other domains and interaction types, and how stance propagates within networks.

Additionally, investigating the role of specific linguistic markers like adverbs across different languages and cultures can reveal universal and language-specific determinants of stance.

While our model shows promising results, it also relies heavily on the assumption that retweets are mainly a form of endorsement, and that quotes within one's own political community are all in agreement and that outside of one's political community they are all in disagreement. While the high level of polarization observed in these networks supports the validity of these assumptions, it also restricts the applicability of the model to domains where polarization is evident and these assumptions are valid.

Acknowledgments

We extend our deepest gratitude to Vittorio Loreto, the director of the Sony Computer Science Laboratories (CSL) and Professor at La Sapienza University of Rome, for his invaluable support and sponsorship of this research. His guidance was pivotal for the successful completion of our study. We also thank the anonymous reviewers for their insightful suggestions, which have greatly contributed to enhancing the quality of this work.

[8] L. Sayah, M. R. Hashemi, Exploring stance and engagement features in discourse analysis papers, Theory & Practice in Language Studies (TPLS) 4 (2014).

[9] D. Küçük, F. Can, Stance detection: A survey, ACM Computing Surveys (CSUR) 53 (2020) 1–37.

[10] C. Becatti, G. Caldarelli, R. Lambiotte, F. Saracco, Extracting significant signal of news consumption from social networks: the case of Twitter in Italian political elections, Palgrave Communications 5 (2019). doi:10.1057/s41599-019-0300-3.

[11] D. Boyd, S. Golder, G. Lotan, Tweet, tweet, retweet: Conversational aspects of retweeting on twitter, in: 2010 43rd Hawaii International Conference on System Sciences, 2010, pp. 1–10. doi:10.1109/HICSS.2010.412.

[12] N. Marsili, Retweeting: Its linguistic and epistemic value, Synthese 198 (2021) 10457–10483.

[13] W. Chen, D. Pacheco, K.-C. Yang, F. Menczer, Neutral bots probe political bias on social media, Nature Communications 12 (2021). doi:10.1038/s41467-021-25738-6.

[14] B. Nyhan, J. Settle, E. Thorson, M. Wojcieszak, P. Barberá, A. Y. Chen, H. Allcott, T. Brown, A. Crespo-Tenorio, D. Dimmery, D. Freelon, M. Gentzkow, S. González-Bailón, A. M. Guess, E. Kennedy, Y. M. Kim, D. Lazer, N. Malhotra, D. Moehler, J. Pan, D. R. Thomas, R. Tromble, C. V. Rivera, A. Wilkins, B. Xiong, C. K. de Jonge, A. Franco, W. Mason, N. J. Stroud, J. A. Tucker, Like-minded sources on facebook are prevalent but not polarizing, Nature 620 (2023) 137–144. doi:10.1038/s41586-023-06297-w.

[15] P. Gravino, D. R. Lo Sardo, E. Brugnoli, Cross-platform impact of social media algorithmic adjustments on public discourse, ArXiv (2024). doi:10.48550/arXiv.2405.00008.

[16] S. González-Bailón, D. Lazer, P. Barberá, M. Zhang, H. Allcott, T. Brown, A. Crespo-Tenorio, D. Freelon, M. Gentzkow, A. M. Guess, S. Iyengar, Y. M. Kim, N. Malhotra, D. Moehler, B. Nyhan, J. Pan, C. V. Rivera, J. Settle, E. Thorson, R. Tromble, A. Wilkins, M. Wojcieszak, C. Kiewiet de Jonge, A. Franco, W. Mason, N. Jomini Stroud, J. A. Tucker, Asymmetric ideological segregation in exposure to political news on facebook, Science 381 (2023) 392–398. doi:10.1126/science.ade7138.

[17] V. D. Blondel, J.-L. Guillaume, R. Lambiotte, E. Lefebvre, Fast unfolding of communities in large networks, Journal of Statistical Mechanics: Theory and Experiment 10008 (2008). doi:10.1088/1742-5468/2008/10/P10008.

[18] E. Brugnoli, P. Gravino, D. R. Lo Sardo, V. Loreto, G. Prevedello, Fine-grained clustering of social media: How moral triggers drive preferences and consensus, in: A. P. Rocha, L. Steels, H. J. van den Herik (Eds.), Proceedings of the 16th International Conference on Agents and Artificial Intelligence, ICAART 2024, Volume 3, Rome, Italy, February 24-26, 2024, SCITEPRESS, 2024, pp. 1405–1412. doi:10.5220/0012595000003636.

[19] M. Pratelli, F. Saracco, M. Petrocchi, Entropy-based detection of twitter echo chambers, PNAS Nexus 3 (2024) pgae177. doi:10.1093/pnasnexus/pgae177.

[20] L. Parisi, S. Francia, P. Magnani, Umberto: an italian language model trained with whole word masking, https://github.com/musixmatchresearch/umberto, 2020.

[21] Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, V. Stoyanov, Roberta: A robustly optimized bert pretraining approach, arXiv (2019). doi:10.48550/arXiv.1907.11692.

[22] S. S. Brier, Analysis of contingency tables under cluster sampling, Biometrika 67 (1980) 591–596.

[23] A. Rashed, M. Kutlu, K. Darwish, T. Elsayed, C. Bayrak, Embeddings-based clustering for target specific stances: The case of a polarized turkey, in: Proceedings of the International AAAI Conference on web and social media, volume 15, 2021, pp. 537–548.

[24] S. Shi, K. Qiao, J. Chen, S. Yang, J. Yang, B. Song, L. Wang, B. Yan, Mgtab: A multi-relational graph-based twitter account detection benchmark, arXiv preprint arXiv:2301.01123 (2023).

[25] S. Rathje, J. J. Van Bavel, S. Van Der Linden, Out-group animosity drives engagement on social media, Proceedings of the National Academy of Sciences 118 (2021) e2024292118.

[26] NewsguardTech.com, Social impact report 2021, 2022. Available from https://www.newsguardtech.com/wp-content/uploads/2022/01/NewsGuard-Social-Impact-Report-1.21.22.pdf (accessed Nov 27, 2023).

[27] M. Bruno, D. Mazzilli, A. Patelli, T. Squartini, F. Saracco, Inferring comparative advantage via entropy maximization, Journal of Physics: Complexity 4 (2023) 045011. doi:10.1088/2632-072X/ad1411.

[28] M. E. J. Newman, M. Girvan, Finding and evaluating community structure in networks, Phys. Rev. E 69 (2004) 026113. doi:10.1103/PhysRevE.69.026113.

[29] F. Bianchi, D. Nozza, D. Hovy, FEEL-IT: Emotion and sentiment classification for the Italian language, in: O. De Clercq, A. Balahur, J. Sedoc, V. Barriere, S. Tafreshi, S. Buechel, V. Hoste (Eds.), Proceedings of the Eleventh Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, Association for Computational Linguistics, Online, 2021, pp. 76–83.

Table 3: Manual versus automatic annotation of 500 randomly selected pairs (rows: manual label; columns: automatic label).

Manual     Favor   Against
Favor      221     7
Against    16      209
Neutral    13      34
Σ          250     250

Table 4: Monitored political entities and their Twitter profiles.

Political entity          Twitter profiles
+Europa                   piu_europa, emmabonino
Articolo Uno              articolounodp, robersperanza
Azione                    azione_it, carlocalenda
Cambiamo!                 giovannitoti
Coraggio Italia           coraggio_italia, luigibrugnaro
Democrazia e Autonomia    movimentodema
Europa Verde              europaverde_it, angelobonelli1
FdI                       giorgiameloni, fratelliditalia
FI                        forza_italia, berlusconi
ItalExit                  gparagone
IV                        italiaviva, matteorenzi
Lega                      legasalvini, matteosalvinimi
M5S                       giuseppeconteit, mov5stelle, luigidimaio
[...]                     manifesta_it, maurizio_lupi, pdnetwork, enricoletta, sbonaccini, ellyesse, potere_alpopolo, direzioneprc, si_sinistra, nfratoianni, antoniodepoli, unione_popolare, demagistris

Table 5: Confusion matrices for the training and test sets (rows: actual label; columns: predicted label).

Training set
Actual     Favor     Against   Total
Favor      70,690    10,517    81,207
Against    [...]

Test set
Actual     Favor     Against   Total
Favor      16,929    2,740     19,669
Against    [...]

[1] Küçük, Can, Stance detection: Concepts, approaches, resources, and outstanding issues, in: Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021, pp. 2673–2676.

[2] Alturayeif, Luqman, Ahmed, A systematic review of machine learning techniques for stance detection and its applications, Neural Computing and Applications 35 (2023) 5113–5144.

[3] Aldayel, W. Magdy, It is more than what you say!: Leveraging user online activity for improved stance detection, 2019. URL: https://2019.ic2s2.org/, 5th International Conference on Computational Social Science, IC2S2 2019; Conference date: 17-07-2019 Through 20-07-2019.

[4] Gupta, Mehta, Automatic stance detection for twitter data, in: 2022 1st International Conference on Informatics (ICI), IEEE, 2022, pp. 223–225.

[5] Draws, Natesan Ramamurthy, I. Baldini, Dhurandhar, Padhi, Timmermans, Tintarev, Explainable cross-topic stance detection for search results, in: Proceedings of the 2023 Conference on Human Information Interaction and Retrieval, 2023, pp. 221–235.

[6] J. W. Du Bois, The stance triangle, Stancetaking in discourse: Subjectivity, evaluation, interaction 164 (2007) 139–182.

[7] World Economic Forum, Global Risks Report 2024, Technical Report, World Economic Forum, 2024. URL: https://www.weforum.org/publications/global-risks-report-2024/.

[30] Tamburini, How "bertology" changed the state-of-the-art also for italian nlp, in: Proceedings of the Seventh Italian Conference on Computational Linguistics, CLiC-it 2020, Online, 2020.

[31] Wolf, Debut, Sanh, Chaumond, Delangue, Moi, Cistac, Rault, Louf, Funtowicz, et al., Huggingface's transformers: State-of-the-art natural language processing, arXiv (2019). doi:10.48550/arXiv.1910.03771.

[32] Refaeilzadeh, Tang, H. Liu, Cross-Validation, Springer US, Boston, MA, 2009, pp. 532–538. doi:10.1007/978-0-387-39940-9_565.

[33] Loshchilov, Hutter, Decoupled weight decay regularization, arXiv (2017). doi:10.48550/arXiv.1711.05101.

[34] Devlin, M.-W. Chang, Lee, Toutanova, Bert: Pre-training of deep bidirectional transformers for language understanding, arXiv (2018). doi:10.48550/arXiv.1810.04805.