<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Community-based Stance Detection</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Emanuele Brugnoli</string-name>
          <email>emanuele.brugnoli@sony.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Donald Ruggiero Lo Sardo</string-name>
          <email>donaldruggiero.losardo@sony.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Centro Studi e Ricerche Enrico Fermi (CREF)</institution>
          ,
          <addr-line>Piazza del Viminale 1, 00184 Rome</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Dipartimento di Fisica - Sapienza Università di Roma</institution>
          ,
          <addr-line>P.le A. Moro 2, 00185 Rome</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Sony Computer Science Laboratories Rome, Joint Initiative CREF-SONY</institution>
          ,
          <addr-line>Piazza del Viminale 1, 00184, Rome</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Stance Detection</institution>
          ,
          <addr-line>Polarisation, Social Networks</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <fpage>3</fpage>
      <lpage>10</lpage>
      <abstract>
        <p>Stance detection is a critical task in understanding the alignment or opposition of statements within social discourse. In this study, we present a novel stance detection model that labels claim-perspective pairs as either aligned or opposed. The primary innovation of our work lies in our training technique, which leverages social network data from X (formerly Twitter). Our dataset comprises tweets from opinion leaders, political entities and news outlets, along with their followers' interactions through retweets and quotes. By reconstructing politically aligned communities based on retweet interactions, treated as endorsements, we check these communities against common knowledge representations of the political landscape. Our training dataset consists of tweet/quote pairs where the tweet comes from a political entity and the quote either originates from a follower who exclusively retweets that political entity (treated as aligned) or from a user who exclusively retweets a political entity from an opposing ideological community (treated as opposed). This curated subset is used to train an Italian language model based on the RoBERTa architecture, achieving an accuracy of approximately 85%. We then apply our model to label all tweet/quote pairs in the dataset, analyzing its out-of-sample predictions. This work not only demonstrates the efficacy of our stance detection model but also highlights the utility of social network structures in training robust NLP models. Our approach offers a scalable and accurate method for understanding political discourse and the alignment of social media statements.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>
        Stance detection is a critical task within the domain of
natural language processing (NLP). It involves
identifying the position or attitude expressed in a piece of text
ally, stances are classified into three primary categories:
favor, against, and neutral. This classification enables a
detailed description of textual data, facilitating a deeper
insight into public opinion and discourse dynamics.
nication platforms such as social media, forums, and
online news outlets has resulted in an unprecedented
volume of user-generated content. This surge
underscores the necessity for automated systems capable of
efficiently analyzing and interpreting these vast text
corpora. Stance detection addresses this need by providing
tools that can systematically assess opinions and
reactions embedded within texts, thus offering valuable
applications across various fields including social media
analysis [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ], search engines [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], and linguistics [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
(D. R. Lo Sardo)
Dec 04 — 06, 2024, Pisa, Italy
∗Corresponding author.
      </p>
      <p>
        Stance detection has been explored across various
fields with differing definitions and applications. For Du Bois,
stance-taking involves evaluating objects, positioning
subjects, and aligning with others in dialogic
interactions, emphasizing the sociocognitive aspects and
intersubjectivity in discourse [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Sayah and Hashemi focus
on academic writing, analyzing stance and engagement
features like hedges, self-mention, and appeals to shared
knowledge to understand communicative styles and
interpersonal strategies [8]. Küçük and Can define stance
detection as the classification of an author’s position
towards a target (favor, against, or neutral), highlighting its
tion, and argument mining [9]. These diverse approaches
underscore the multifaceted nature of stance detection
and its applications in enhancing the understanding of
social discourse, academic rhetoric, and online content
analysis. For a review of recent developments in the
field, we refer to Alturayeif et al. [<xref ref-type="bibr" rid="ref2">2</xref>] and AlDayel et al. [<xref ref-type="bibr" rid="ref3">3</xref>].
      </p>
      <p>In this work, we propose a novel approach to training stance detection models by leveraging the interactions within highly polarized communities. Our method utilizes tweet/quote pairs from the Italian political debate to construct a robust training set. We operate under the assumption that users who predominantly retweet a particular political profile are likely in agreement with the statements made by that profile. We restricted our analysis to retweets since this form of communication primarily aligns with the endorsement hypothesis [10].</p>
      <p>Namely, being a simple re-posting of a tweet, retweeting is commonly thought to express agreement with the claim of the tweet [11]. Further, though retweets might be used for other purposes such as those described by Marsili [12], the repeated nature of the interaction we observe in our networks reduces the probability that the activity falls outside of the endorsement behavior.</p>
      <p>Conversely, while quoting a tweet works similarly to retweeting, the function allows users to add their own comments above the tweet. This makes this form of communication controversial regarding the endorsement hypothesis, as agreement or disagreement with the tweet depends on the stance of the added comment. On the other hand, the information social media users see, consume, and share through their news feed heavily depends on the political leaning of their early connections [13, 14]. In other words, while algorithms are highly influential in determining what people see and shaping their on-platform experiences [15], there is significant ideological segregation in political news exposure [16]. It is therefore reasonable to expect that users who almost exclusively retweet a political entity (party, leader, or both) use quote tweets to express agreement with statements posted by that entity and disagreement with statements posted by political entities ideologically distant from their preferred one. Additionally, the quote interaction perfectly encapsulates the stance triangle described by Du Bois [<xref ref-type="bibr" rid="ref6">6</xref>].</p>
      <p>In order to correctly assess political opposition, we construct a retweet network and use the Louvain community detection algorithm [17] to characterize leaders and, through label propagation, the followers that align with their views.</p>
      <p>Through these community labels we construct a dataset of claim-perspective pairs by annotating tweet-quote pairs from profiles that clearly express political alignment as favor and annotating tweet-quote pairs in which the profiles come from different communities as against. Finally, we use a pretrained BERT model for the Italian language and fine-tune it on the classification task.</p>
      <p>This methodology aims to enhance the accuracy of stance detection models by incorporating real-world patterns of agreement and disagreement observed in polarized online environments. Further, it enables an unsupervised training paradigm that can be scaled to very large datasets.</p>
      <p>In the following sections, we will outline the data gathering approach used for the dataset. Subsequently, we will describe the community detection methods employed to identify leaders and users within the Italian political discourse. We will then discuss the model architecture and its training process. In the results section, we will evaluate the model’s performance and present our findings. Finally, the conclusion will address potential future developments, the implications of our work, and its limitations.</p>
      <p>2. Results</p>
      <p>In this study, we focus on a comprehensive set of Italian opinion leaders active on Twitter/X, including the official profiles of major news media outlets as well as prominent politicians and political parties. The profiles of news media outlets are further classified according to assessments provided by NewsGuard, which categorize them as either questionable or reliable sources. This classification is crucial for evaluating the quality of the information these outlets disseminate, particularly regarding their reputation for spreading misinformation. For the selected leaders, we collected all tweets produced from January 2018 to December 2022. The general public (followers) is identified based on their RTs of the content produced by these leaders. See Materials and Methods for details on the data collection process. Using this node configuration, we construct a bipartite network with two layers, leaders and followers, where the links represent the number of RTs by the latter of tweets made by the former. If a group of followers retweets tweets from two different leaders, it indicates that these leaders are likely communicating similar messages or viewpoints. To analyze these relationships more deeply, we perform a monopartite projection onto the leader layer. This projection, detailed in Materials and Methods, simplifies the network by concentrating solely on the leaders and the connections between them that are inferred from their shared followers.</p>
      <p>Panel (A) of Figure 1 shows the RT network of leaders aggregated in terms of communities identified through an optimized version of the Louvain algorithm [17]. The a posteriori analysis of the political leaders in each group reveals that the clustering algorithm effectively identified communities that align with the political affiliations of the leaders in each cluster [18, 19]. Specifically, the Left-leaning community includes political entities such as +Europa, Azione, Enrico Letta, and Nicola Fratoianni; the Right-leaning community features leaders from FdI, FI, and Lega; and the Five Star Movement (M5S) community includes key figures like Giuseppe Conte and Luigi Di Maio. An interesting observation from the network configuration is the clustering of questionable news sources. These profiles consistently group within the same community, suggesting a potential alignment or affinity with specific political leanings or ideologies.</p>
      <p>Figure 1. (A) Retweet network. (B) Stance network. Legend: political profiles, questionable news sources, reliable news sources.</p>
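      <p>The bipartite follower-leader network and its projection onto the leader layer can be illustrated with a minimal Python sketch. It assumes that the link between two leaders accumulates, over every follower who retweeted both, the product of the two retweet counts; summing over shared followers is an assumption made here for illustration, and the statistical validation and Louvain steps are not shown. The function name and the toy data are hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import defaultdict
from itertools import combinations


def project_onto_leaders(retweets):
    """Project a weighted follower-leader network onto the leader layer.

    retweets maps (follower, leader) to the number of RTs.
    Returns a dict mapping an ordered leader pair to its projected weight.
    """
    # Group retweet counts by follower.
    by_follower = defaultdict(dict)
    for (follower, leader), count in retweets.items():
        by_follower[follower][leader] = count

    # Each follower who retweeted two leaders forms a length-two path
    # between them; the path weight is the product of its two links.
    projection = defaultdict(int)
    for leaders in by_follower.values():
        for a, b in combinations(sorted(leaders), 2):
            projection[(a, b)] += leaders[a] * leaders[b]
    return dict(projection)


# Hypothetical toy data: three followers, three leaders.
toy = {
    ("u1", "L1"): 3, ("u1", "L2"): 2,
    ("u2", "L2"): 1, ("u2", "L3"): 4,
    ("u3", "L1"): 1, ("u3", "L2"): 5,
}
weights = project_onto_leaders(toy)
```
]]></preformat>
      <p>In this toy example, leaders L1 and L2 share two followers and receive a projected weight of 3*2 + 1*5 = 11, while L2 and L3 share one follower and receive a weight of 4.</p>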
      <p>Leveraging the political bias of followers in our Twitter
network, we build a very large dataset of tweet-quote
pairs, each annotated with the corresponding stance
(favor or against), as better described in Materials and
Methods. Since this method assigns the stance to each pair
in an unsupervised manner, to ensure that our approach
is performing correctly, we randomly selected 500 pairs
(250 favor and 250 against) and manually annotated their
stance. We then compared the results of the automatic
annotation with the manual annotation. The results, shown
in Appendix - Table 3, indicate a high level of accuracy
in favor and against classifications, with a small number
of neutral cases. The dataset serves as the training set for
fine-tuning UmBERTo [20], an Italian language model
based on the RoBERTa architecture [21], to assign stance
labels to claim-perspective pairs. The fine-tuning process
is performed using 5-fold cross-validation. The optimal
performance for each fold is assessed by measuring the
accuracy, i.e., the ratio of correctly predicted instances
(both true favor and true against) to the total number
of instances. The best-trained models from each fold
demonstrate nearly identical performance, as shown by
the average accuracy and F1-scores reported in the
following table. The best model from fold 3 is identified
as the highest performing and is therefore used in the
following analyses. The corresponding confusion
matrices for both the training and test sets are provided in
Appendix - Table 5.</p>
      <sec id="sec-2-1">
        <title>Training Test</title>
        <p>Given the imbalance in the label distribution of the
claim-perspective dataset, we use 41,347 pairs – each
annotated as favor and previously removed to create a
balanced training set – as an additional test set to
evaluate the model’s performance. The model achieves an
accuracy of 83.6% when predicting the stance of these
pairs.</p>
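        <p>The accuracy measure and the 5-fold partitioning used for evaluation can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the folds are contiguous and unshuffled, and the helper names are hypothetical; the actual fine-tuning pipeline is not reproduced here.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def accuracy(y_true, y_pred):
    """Ratio of correctly predicted instances to the total number of instances."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def k_fold_indices(n_items, k=5):
    """Yield (train_idx, test_idx) pairs with equally or nearly equally sized folds."""
    base, extra = divmod(n_items, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds receive one additional item.
        size = base + (1 if fold < extra else 0)
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = [i for i in range(n_items) if i < start or i >= stop]
        yield train_idx, test_idx
        start = stop
```
]]></preformat>
        <p>Each of the five iterations holds out one fold for testing and uses the remaining four for training; accuracy is then computed on the held-out fold.</p>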
        <p>The model is then applied to classify all the collected
tweet-quote pairs based on their stance. Thus, following
the same procedure used to construct the RT network
of leaders, we develop the stance network and analyze
its community structure. In this case, the weight of a
link in the bipartite follower-leader network represents
the positive difference between the number of favoring
and against quotes from a follower on the leader’s tweets.</p>
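        <p>A minimal sketch of this link weight is given below. It reads "positive difference" as keeping a follower-leader link only when favoring quotes outnumber against quotes; that reading, the function name, and the toy data are assumptions made for illustration.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter


def stance_link_weights(labelled_quotes):
    """Compute follower-leader link weights for the stance network.

    labelled_quotes is an iterable of (follower, leader, label) triples,
    where label is "favor" or "against".
    """
    balance = Counter()
    for follower, leader, label in labelled_quotes:
        balance[(follower, leader)] += 1 if label == "favor" else -1
    # Keep only links where favoring quotes outnumber against quotes.
    return {pair: diff for pair, diff in balance.items() if diff > 0}


# Hypothetical toy data.
quotes = [
    ("u1", "L1", "favor"), ("u1", "L1", "favor"), ("u1", "L1", "against"),
    ("u1", "L2", "against"),
    ("u2", "L2", "favor"), ("u2", "L2", "against"),
]
links = stance_link_weights(quotes)
```
]]></preformat>
        <p>Here only the link from u1 to L1 survives, with weight 1; the other two pairs have a non-positive balance and are dropped.</p>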
        <p>Panel (B) of Figure 1 shows the stance network of leaders
aggregated in terms of communities identified through
the Louvain algorithm. The node positions in this
representation are the same as those in the RT network,
providing a consistent framework for comparison. More
formally, to evaluate the differences in clustering
assignments between nodes present in both the retweet
network and the stance network, we perform a clustering
comparison. Namely, we use the contingency table [22]
associated with both the representations to compute
community overlap. Figure 2 shows the comparison results
broken down by source type: political entities and news
outlets. While clusters C and D of the stance network
primarily align with clusters 2 and 3 of the RT network,
respectively, clusters A and B of the stance network mainly
represent a refinement of cluster 1 from the RT network.</p>
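        <p>The clustering comparison can be sketched as a contingency table that counts, for every pair of cluster labels, the nodes present in both networks. The snippet below is a minimal illustration with hypothetical node names and labels; it is not the analysis code behind Figure 2.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter


def contingency_table(rt_labels, stance_labels):
    """Count nodes per (RT cluster, stance cluster) pair.

    Both arguments map a node to its cluster label; only nodes that
    appear in both networks are compared.
    """
    shared = sorted(set(rt_labels) & set(stance_labels))
    return Counter((rt_labels[n], stance_labels[n]) for n in shared)


# Hypothetical toy partitions: RT clusters 1-3, stance clusters A-D.
rt = {"a": 1, "b": 1, "c": 2, "d": 3, "e": 1}
stance = {"a": "A", "b": "B", "c": "C", "d": "D", "f": "A"}
table = contingency_table(rt, stance)
```
]]></preformat>
        <p>In this toy case RT cluster 1 splits into stance clusters A and B, mirroring the kind of refinement described above, while nodes e and f are ignored because each appears in only one network.</p>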
        <p>This suggests that even in the stance network, the emerging communities align with the political affiliations of the leaders within each cluster.</p>
        <p>Although the tweet-quote pairs used to train the model include only tweets from political entities, the result is significant. The training set does not include pairs where the quote comes from a follower who exclusively retweets political entities from the same ideological community as the tweet’s author. This demonstrates the model’s ability to reconstruct communities through precise classification of textual pairs.</p>
        <p>The contingency table for news outlets, while displaying less pronounced patterns overall, still demonstrates clear coherence in classification between the retweet network and the stance network. This is particularly remarkable considering that these profiles were not included in the model’s training set. The recovery of the retweet network’s community structure within the stance network suggests that the model successfully generalizes across profiles with differing linguistic constraints, with only a minimal loss in accuracy, while still allowing for the reconstruction of group affiliations.</p>
        <p>3. Discussion</p>
        <p>
          Stance detection remains a vital yet challenging area in natural language processing (NLP), traditionally limited by the constraints of supervised learning. The availability of large language corpora, where interaction networks can be reconstructed, offers a novel approach that incorporates the social and dynamic aspects of stance, as outlined by Du Bois in his work on the stance triangle [<xref ref-type="bibr" rid="ref6">6</xref>].
        </p>
        <p>Our model addresses a more complex task compared to other state-of-the-art models. While existing models typically classify a user’s stance on specific topics, our model classifies claim-perspective pairs into favor and against categories. This requires a deeper analysis of the relational stance between multiple interacting users and their statements.</p>
        <p>Despite this increased complexity, our model achieved results comparable to those of existing state-of-the-art models [23, 24]. This success supports the hypothesis that in-group/out-group determinants, well-documented in opinion dynamics, significantly explain the variation in behaviors [25].</p>
        <p>Moreover, our model’s ability to reconstruct communities based on the accurate classification of textual pairs (as shown in Figure 2) underscores its potential for community reconstruction in scenarios where the interaction network is not provided.</p>
        <p>Importantly, this approach also opens avenues for studying network dynamics based on the probability of agreement between account pairs. This has significant implications for understanding and potentially mitigating coordinated attacks, such as disinformation campaigns and political propaganda. By identifying patterns of agreement and disagreement, we can better detect and analyze the strategies behind these coordinated efforts, enhancing our ability to safeguard democratic processes and public discourse.</p>
        <p>4. Materials and Methods</p>
        <p>Data Collection. Our dataset comprises approximately 15 million tweets collected by monitoring the activity of 583 profiles that reflect Italian online social dialogue (e.g., La Repubblica, Il Corriere della Sera, Il Giornale). Profiles were selected based on the list of news sites monitored by NewsGuard, a news rating agency dedicated to assigning reliability scores. According to NewsGuard, this list covers approximately 95% of online engagement with news, providing near-comprehensive coverage of news-related dialogue [26].</p>
        <p>Additionally, we included Italian political entities in the list of profiles. This inclusion encompasses all major political parties and their leaders (e.g., Giorgia Meloni and Fratelli d’Italia, Elly Schlein and PD, Giuseppe Conte and M5S). For a complete list of the monitored political profiles see Appendix - Table 4.</p>
        <p>For each monitored profile, we collected all tweets from January 2018 to December 2022 using the Twitter/X API before the limitations introduced by the new management. We also gathered all retweets (RTs) and quotes (QTs) of this content within the same time frame, limited to those tweets that gained at least 20 RTs or 10 QTs. The following table provides a detailed breakdown of the data matching these criteria.</p>
      </sec>
      <sec id="sec-2-2">
        <title>Category News Politics TOTAL</title>
        <p>Community Detection. In order to reconstruct the ity requirement for a single political entity, we calculated
discourse communities from the twitter activity we built for each follower  the total number of retweets of
cona retweet network. In the context of the data collection tent produced by the set of political entities  defined
strategy previously described, most RTs are from a non- in Table 4 and excluded the bottom 80% of the resulting
monitored user (a follower ) to one of the users monitored distribution (i.e., we imposed |RT ( )| &gt; 7 ). For the
re(a leader ), excluding a few RTs from one leader to another maining users, we then assigned the label favor to those
(45, 299). We can therefore consider this network as a quotes of tweets from their preferred political entity and
bipartite network, i.e. a network where all links are from the label against to those quotes of tweets from entities
one node type to another, with 367 leaders and 934, 394 belonging to other political communities, as determined
followers, connected through links with a weight   by the community detection analysis. This procedure
equal to the number of RTs from the follower  to the resulted in the creation of a dataset containing 243, 277
leader  . unique claim-perspective (tweet-quote) pairs, each
an</p>
        <p>To identify communities among leaders we assume notated with the corresponding stance. Since the label
that leaders with the same readership are more likely distribution of the dataset was unbalanced towards favor
to be in the same political community. We therefore (specifically, 142, 312 favor and 100, 965 against), we
ranconstructed a monopartite network by projecting on the domly removed 41, 347 favor pairs to obtain a balanced
leader layer, i.e. we construct a network from the set training set for the stance model. The removed pairs were
of all length two paths assigning weights that are the later used as additional test set to evaluate the model’s
product of the path’s links. accuracy.</p>
        <p>
          We used the Bipartite Weighted Configuration Model Stance model. We initialized our model starting from
(BiWCM) to statistically validate our bipartite projec- UmBERTo [20], an Italian language model based on the
tion [27]. BiWCM accounts for weighted interactions RoBERTa architecture [21]. Specifically, we relied on the
and preserves the strength of nodes in both layers, en- cased version trained using SentencePiece tokenizer and
suring that our observed co-occurrences are not due to Whole Word Masking on a large corpus, encompassing
random chance but represent genuine structural patterns around 70 GB of text. This makes it highly efective for
in the data. In order to find political communities in various natural language processing tasks in Italian, as
the network, we applied the Louvain algorithm 1000 it leverages a vast and diverse dataset to understand the
times and selected the solution that minimized modu- nuances of the language [
          <xref ref-type="bibr" rid="ref8">29, 30</xref>
          ]. The pretrained model
larity, i.e., the strength of division of the network into was then fine-tuned on the constructed dataset of
tweetclusters, with higher values indicating a structure where quote pairs to create a tool capable of inferring the stance
more edges lie within communities than would be ex- of claim-perspective text pairs: favor if the perspective
pected by chance [28]. agrees with the claim, and against otherwise. To input
        </p>
        <p>
          The same procedure was followed to construct the the text pairs into the pretrained model, we utilized
Umstance network and study its community structure. In BERTo’s special tokens. Specifically, we concatenated
this case, the weight of a link in the bipartite follower- the tweet and quote as
leader network indicates the fraction of favoring quotes
from the follower to the leader’s tweets. &lt;s&gt; + tweet + &lt;/s&gt;&lt;/s&gt; + quote + &lt;/s&gt;,
Claim-Perspective Pairs Selection. To construct a where &lt;s&gt;, &lt;/s&gt;&lt;/s&gt;, and &lt;/s&gt; represent the start,
sepdataset of claim-perspective text pairs annotated with aration, and end tokens, respectively. Since we set
the corresponding stance (favor if the perspective sup- max_seq_length = 256, which limits the total number
ports the claim, against otherwise), we first identified of tokens that can be processed by the model, in cases
users who clearly expressed an (almost) absolute prefer- where the concatenated strings exceeded this limit, the
ence for a single political entity through their retweet longer text between the tweet and the quote was
trunactivity. Specifically, for each follower, we calculated cated. This ensures that the input remains within the
the distribution of their RTs across the political entities model’s processing capacity while preserving as much
defined in Table 4. Then, we filtered those who allocated information as possible from both texts. Conversely,
at least 80% of their RTs to a single political entity. Some shorter concatenated strings were padded using the
speusers, although meeting the previous requirement, may cial token &lt;pad&gt; until they reached the 256-token limit.
not have had a suficient level of retweet activity during Tweets and quotes were preprocessed before being
conthe analyzed period to be considered inclined towards catenated by removing URLs, mentions, non-UTF-8
chara particular political entity. For example, a user who acters, line breaks, and tabs.
has only given one retweet to the set of political profiles The pretrained UmBERTo model was imported into
would appear totally inclined towards a particular entity. Python from the HugginFace Transformers library [
          <xref ref-type="bibr" rid="ref9">31</xref>
          ]
To reduce the uncertainty arising from the indiscriminate as a model for sequence classification. The fine-tuning
inclusion of all profiles satisfying the high retweet activ- procedure enabled the model to output the probability
distribution over the stance labels by minimizing the cross- heavily on the assumption that retweets are mainly a
entropy loss between the predicted labels and the true form of endorsement, and that quotes within one’s own
labels, efectively learning to classify the stance of claim- political community are all in agreement and that outside
perspective pairs. We chose to perform 5-fold cross- of one’s political community they are all in disagreement.
validation to ensure the reliability of the results [
          <xref ref-type="bibr" rid="ref10">32</xref>
          ]. While the high level of polarization observed in these
Namely, the data was first partitioned into 5 equally (or networks support the validity of these assumptions, it
nearly equally) sized segments or folds. Subsequently 5 also restricts the applicability of the model to domains
iterations of training and testing are performed such that where polarization is evident and these assumptions are
within each iteration a diferent fold of the data is held- valid.
out for testing while the remaining 4 folds are used for
learning. Thus, for each training-test split, we fine-tuned
the UmBERTo model for 4 epochs using a batch size of 64 Acknowledgments
(for both training and testing) and an improved version
of the Adam optimizer [
          <xref ref-type="bibr" rid="ref11">33</xref>
          ] with a learning rate of 5 − 5
and a weight decay of 0.01 for regularization. The chosen
hyperparameters are among those recommended in the
literature[
          <xref ref-type="bibr" rid="ref12">34, 21</xref>
          ].
        </p>
        <p>We extend our deepest gratitude to Vittorio Loreto, the
director of the Sony Computer Science Laboratories (CSL)
and Professor at La Sapienza University of Rome, for his
invaluable support and sponsorship of this research. His
guidance was pivotal for the successful completion of our
study. We also thank the anonymous reviewers for their
insightful suggestions, which have greatly contributed
to enhancing the quality of this work.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>5. Conclusion</title>
      <p>This study introduces a novel stance detection model that
significantly advances the understanding of alignment
and opposition in social discourse. By leveraging social
network data from X (formerly Twitter), we developed a
robust training technique that utilizes interactions within
politically aligned communities. Our approach involved
curating a dataset of tweet/quote pairs, where the quotes
are derived from users’ interactions with leaders and
politicians. This dataset facilitated the training of a BERT
model, which achieved a state of the art accuracy of
approximately 85%.</p>
      <p>Our findings underscore the eficacy of using social
network structures to train NLP models, demonstrating
that retweet interactions can serve as reliable indicators
of political alignment. This methodology not only
enhances the scalability of stance detection but also ofers
a nuanced understanding of political discourse on social
media platforms. By reconstructing and validating
politically aligned communities through expert knowledge,
our model provides a robust framework for analyzing the
alignment of social media statements.</p>
      <p>The implications of this work extend beyond stance
detection, ofering potential applications in monitoring
political sentiment, identifying misinformation, and
understanding public opinion dynamics. Future research
could explore the integration of additional social
network features and exploring the capacity of the model
to generalize to other domains, interaction types and
understanding how stance propagates within networks.</p>
      <p>Additionally, investigating the role of specific
linguistic markers like adverbs across diferent languages and
cultures can reveal universal and language-specific
determinants of stance.</p>
      <p>While our model shows promising results, it also relies</p>
      <p>[8] L. Sayah, M. R. Hashemi, Exploring stance and engagement features in discourse analysis papers, Theory &amp; Practice in Language Studies (TPLS) 4 (2014).</p>
      <p>[9] D. Küçük, F. Can, Stance detection: A survey, ACM Computing Surveys (CSUR) 53 (2020) 1–37.</p>
      <p>[10] C. Becatti, G. Caldarelli, R. Lambiotte, F. Saracco, Extracting significant signal of news consumption from social networks: the case of Twitter in Italian political elections, Palgrave Communications 5 (2019). doi:10.1057/s41599-019-0300-3.</p>
      <p>[11] D. Boyd, S. Golder, G. Lotan, Tweet, tweet, retweet: Conversational aspects of retweeting on twitter, in: 2010 43rd Hawaii International Conference on System Sciences, 2010, pp. 1–10. doi:10.1109/HICSS.2010.412.</p>
      <p>[12] N. Marsili, Retweeting: Its linguistic and epistemic value, Synthese 198 (2021) 10457–10483.</p>
      <p>[13] W. Chen, D. Pacheco, K.-C. Yang, F. Menczer, Neutral bots probe political bias on social media, Nature Communications 12 (2021). doi:10.1038/s41467-021-25738-6.</p>
      <p>[14] B. Nyhan, J. Settle, E. Thorson, M. Wojcieszak, P. Barberá, A. Y. Chen, H. Allcott, T. Brown, A. Crespo-Tenorio, D. Dimmery, D. Freelon, M. Gentzkow, S. González-Bailón, A. M. Guess, E. Kennedy, Y. M. Kim, D. Lazer, N. Malhotra, D. Moehler, J. Pan, D. R. Thomas, R. Tromble, C. V. Rivera, A. Wilkins, B. Xiong, C. K. de Jonge, A. Franco, W. Mason, N. J. Stroud, J. A. Tucker, Like-minded sources on facebook are prevalent but not polarizing, Nature 620 (2023) 137–144. doi:10.1038/s41586-023-06297-w.</p>
      <p>[15] P. Gravino, D. R. Lo Sardo, E. Brugnoli, Cross-platform impact of social media algorithmic adjustments on public discourse, ArXiv (2024). doi:10.48550/arXiv.2405.00008.</p>
      <p>[16] S. González-Bailón, D. Lazer, P. Barberá, M. Zhang, H. Allcott, T. Brown, A. Crespo-Tenorio, D. Freelon, M. Gentzkow, A. M. Guess, S. Iyengar, Y. M. Kim, N. Malhotra, D. Moehler, B. Nyhan, J. Pan, C. V. Rivera, J. Settle, E. Thorson, R. Tromble, A. Wilkins, M. Wojcieszak, C. Kiewiet de Jonge, A. Franco, W. Mason, N. Jomini Stroud, J. A. Tucker, Asymmetric ideological segregation in exposure to political news on facebook, Science 381 (2023) 392–398. doi:10.1126/science.ade7138.</p>
      <p>[17] V. D. Blondel, J.-L. Guillaume, R. Lambiotte, E. Lefebvre, Fast unfolding of communities in large networks, Journal of Statistical Mechanics: Theory and Experiment 10008 (2008). doi:10.1088/1742-5468/2008/10/P10008.</p>
      <p>[18] E. Brugnoli, P. Gravino, D. R. Lo Sardo, V. Loreto, G. Prevedello, Fine-grained clustering of social media: How moral triggers drive preferences and consensus, in: A. P. Rocha, L. Steels, H. J. van den Herik (Eds.), Proceedings of the 16th International Conference on Agents and Artificial Intelligence, ICAART 2024, Volume 3, Rome, Italy, February 24-26, 2024, SCITEPRESS, 2024, pp. 1405–1412. doi:10.5220/0012595000003636.</p>
      <p>[19] M. Pratelli, F. Saracco, M. Petrocchi, Entropy-based detection of twitter echo chambers, PNAS Nexus 3 (2024) pgae177. doi:10.1093/pnasnexus/pgae177.</p>
      <p>[20] L. Parisi, S. Francia, P. Magnani, Umberto: an italian language model trained with whole word masking, https://github.com/musixmatchresearch/umberto, 2020.</p>
      <p>[21] Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, V. Stoyanov, Roberta: A robustly optimized bert pretraining approach, arXiv (2019). doi:10.48550/arXiv.1907.11692.</p>
      <p>[22] S. S. Brier, Analysis of contingency tables under cluster sampling, Biometrika 67 (1980) 591–596.</p>
      <p>[23] A. Rashed, M. Kutlu, K. Darwish, T. Elsayed, C. Bayrak, Embeddings-based clustering for target specific stances: The case of a polarized turkey, in: Proceedings of the International AAAI Conference on web and social media, volume 15, 2021, pp. 537–548.</p>
      <p>[24] S. Shi, K. Qiao, J. Chen, S. Yang, J. Yang, B. Song, L. Wang, B. Yan, Mgtab: A multi-relational graph-based twitter account detection benchmark, arXiv preprint arXiv:2301.01123 (2023).</p>
      <p>[25] S. Rathje, J. J. Van Bavel, S. Van Der Linden, Out-group animosity drives engagement on social media, Proceedings of the National Academy of Sciences 118 (2021) e2024292118.</p>
      <p>[26] NewsguardTech.com, Social impact report 2021, 2022. Available from https://www.newsguardtech.com/wp-content/uploads/2022/01/NewsGuard-Social-Impact-Report-1.21.22.pdf (accessed Nov 27, 2023).</p>
      <p>[27] M. Bruno, D. Mazzilli, A. Patelli, T. Squartini, F. Saracco, Inferring comparative advantage via entropy maximization, Journal of Physics: Complexity 4 (2023) 045011. doi:10.1088/2632-072X/ad1411.</p>
      <p>[28] M. E. J. Newman, M. Girvan, Finding and evaluating community structure in networks, Phys. Rev. E 69 (2004) 026113. doi:10.1103/PhysRevE.69.026113.</p>
      <p>[29] F. Bianchi, D. Nozza, D. Hovy, FEEL-IT: Emotion and sentiment classification for the Italian language, in: O. De Clercq, A. Balahur, J. Sedoc, V. Barriere, S. Tafreshi, S. Buechel, V. Hoste (Eds.), Proceedings of the Eleventh Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, Association for Computational Linguistics, Online, 2021, pp. 76–83.</p>
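      <sec id="sec-3-1">
        <title>Manual versus automatic stance labels</title>
        <table-wrap>
          <table>
            <thead>
              <tr><th>Manual</th><th>Automatic: Favor</th><th>Automatic: Against</th></tr>
            </thead>
            <tbody>
              <tr><td>Favor</td><td>221</td><td>7</td></tr>
              <tr><td>Against</td><td>16</td><td>209</td></tr>
              <tr><td>Neutral</td><td>13</td><td>34</td></tr>
              <tr><td>Σ</td><td>250</td><td>250</td></tr>
            </tbody>
          </table>
        </table-wrap>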
      </sec>
      <sec id="sec-3-2">
        <title>Political entity</title>
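        <table-wrap>
          <table>
            <thead>
              <tr><th>Political entity</th><th>Twitter profiles</th></tr>
            </thead>
            <tbody>
              <tr><td>+Europa</td><td>piu_europa, emmabonino</td></tr>
              <tr><td>Articolo Uno</td><td>articolounodp, robersperanza</td></tr>
              <tr><td>Azione</td><td>azione_it, carlocalenda</td></tr>
              <tr><td>Cambiamo!</td><td>giovannitoti</td></tr>
              <tr><td>Coraggio Italia</td><td>coraggio_italia, luigibrugnaro</td></tr>
              <tr><td>Democrazia e Autonomia</td><td>movimentodema</td></tr>
              <tr><td>Europa Verde</td><td>europaverde_it, angelobonelli1</td></tr>
              <tr><td>FdI</td><td>giorgiameloni, fratelliditalia</td></tr>
              <tr><td>FI</td><td>forza_italia, berlusconi</td></tr>
              <tr><td>ItalExit</td><td>gparagone</td></tr>
              <tr><td>IV</td><td>italiaviva, matteorenzi</td></tr>
              <tr><td>Lega</td><td>legasalvini, matteosalvinimi</td></tr>
              <tr><td>M5S</td><td>giuseppeconteit, mov5stelle, luigidimaio</td></tr>
              <tr><td></td><td>manifesta_it</td></tr>
              <tr><td></td><td>maurizio_lupi</td></tr>
              <tr><td></td><td>pdnetwork, enricoletta, sbonaccini, ellyesse</td></tr>
              <tr><td></td><td>potere_alpopolo</td></tr>
              <tr><td></td><td>direzioneprc</td></tr>
              <tr><td></td><td>si_sinistra, nfratoianni</td></tr>
              <tr><td></td><td>antoniodepoli</td></tr>
              <tr><td></td><td>unione_popolare, demagistris</td></tr>
            </tbody>
          </table>
        </table-wrap>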
      </sec>
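      <sec id="sec-3-3">
        <title>Actual versus predicted stance</title>
        <p>Confusion matrices in two panels (rows: actual stance; columns: predicted stance). Values of the predicted Favor column:</p>
        <table-wrap>
          <table>
            <thead>
              <tr><th>Actual</th><th>Predicted Favor (first panel)</th><th>Predicted Favor (second panel)</th></tr>
            </thead>
            <tbody>
              <tr><td>Favor</td><td>70,690</td><td>16,929</td></tr>
              <tr><td>Against</td><td>10,517</td><td>2,740</td></tr>
              <tr><td>Σ</td><td>81,207</td><td>19,669</td></tr>
            </tbody>
          </table>
        </table-wrap>
      </sec>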
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>D.</given-names>
            <surname>Küçük</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Can</surname>
          </string-name>
          ,
          <article-title>Stance detection: Concepts, approaches, resources, and outstanding issues</article-title>
          ,
          <source>in: Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>2673</fpage>
          -
          <lpage>2676</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>N.</given-names>
            <surname>Alturayeif</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Luqman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ahmed</surname>
          </string-name>
          ,
          <article-title>A systematic review of machine learning techniques for stance detection and its applications</article-title>
          ,
          <source>Neural Computing and Applications</source>
          <volume>35</volume>
          (
          <year>2023</year>
          )
          <fpage>5113</fpage>
          -
          <lpage>5144</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Aldayel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Magdy</surname>
          </string-name>
          ,
          <article-title>It is more than what you say!: Leveraging user online activity for improved stance detection</article-title>
          ,
          <year>2019</year>
          . URL: https://2019.ic2s2.org/, 5th International Conference on Computational Social Science, IC2S2 2019; Conference date: 17-07-2019 through 20-07-2019.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mehta</surname>
          </string-name>
          ,
          <article-title>Automatic stance detection for twitter data</article-title>
          ,
          <source>in: 2022 1st International Conference on Informatics (ICI)</source>
          , IEEE,
          <year>2022</year>
          , pp.
          <fpage>223</fpage>
          -
          <lpage>225</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>T.</given-names>
            <surname>Draws</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Natesan Ramamurthy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Baldini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Dhurandhar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Padhi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Timmermans</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Tintarev</surname>
          </string-name>
          ,
          <article-title>Explainable cross-topic stance detection for search results</article-title>
          ,
          <source>in: Proceedings of the 2023 Conference on Human Information Interaction and Retrieval</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>221</fpage>
          -
          <lpage>235</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J. W.</given-names>
            <surname>Du Bois</surname>
          </string-name>
          ,
          <article-title>The stance triangle</article-title>
          ,
          <source>Stancetaking in discourse: Subjectivity, evaluation, interaction</source>
          <volume>164</volume>
          (
          <year>2007</year>
          )
          <fpage>139</fpage>
          -
          <lpage>182</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <collab>World Economic Forum</collab>
          ,
          <source>Global Risks Report 2024</source>
          , Technical Report, World Economic Forum,
          <year>2024</year>
          . URL: https://www.weforum.org/publications/global-risks-report-2024/.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>F.</given-names>
            <surname>Tamburini</surname>
          </string-name>
          ,
          <article-title>How “bertology” changed the state-of-the-art also for italian nlp</article-title>
          ,
          <source>in: Proceedings of the Seventh Italian Conference on Computational Linguistics</source>
          , CLiC-it
          <year>2020</year>
          , Online,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>T.</given-names>
            <surname>Wolf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Debut</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Sanh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chaumond</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Delangue</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Moi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Cistac</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Rault</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Louf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Funtowicz</surname>
          </string-name>
          , et al.,
          <article-title>Huggingface's transformers: State-of-the-art natural language processing</article-title>
          , arXiv (
          <year>2019</year>
          ). doi:10.48550/arXiv.1910.03771.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>P.</given-names>
            <surname>Refaeilzadeh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <article-title>Cross-Validation</article-title>
          , Springer US, Boston, MA,
          <year>2009</year>
          , pp.
          <fpage>532</fpage>
          -
          <lpage>538</lpage>
          . doi:10.1007/978-0-387-39940-9_565.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>I.</given-names>
            <surname>Loshchilov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Hutter</surname>
          </string-name>
          ,
          <article-title>Decoupled weight decay regularization</article-title>
          ,
          <source>arXiv</source>
          (
          <year>2017</year>
          ). doi:10.48550/arXiv.1711.05101.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          ,
          <article-title>Bert: Pre-training of deep bidirectional transformers for language understanding</article-title>
          ,
          <source>arXiv</source>
          (
          <year>2018</year>
          ). doi:10.48550/arXiv.1810.04805.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>