<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>of the Political Ideology Detection in Italian Texts Task</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Daniel Russo</string-name>
          <email>drusso@fbk.eu</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Salud María Jiménez-Zafra</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>José Antonio García-Díaz</string-name>
          <email>joseantonio.garcia8@um.es</email>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tommaso Caselli</string-name>
          <email>t.caselli@rug.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marco Guerini</string-name>
          <email>guerini@fbk.eu</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>L. Alfonso Ureña-López</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rafael Valencia-García</string-name>
          <email>valencia@um.es</email>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>CLCG, University of Groningen</institution>
          ,
          <country country="NL">Netherlands</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>LanD, Fondazione Bruno Kessler</institution>
          ,
          <addr-line>Via Sommarive 18, Povo, Trento</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Processing and Speech Tools for Italian</institution>
          ,
          <addr-line>Sep 7 - 8, Parma, IT</addr-line>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>SINAI, Universidad de Jaén</institution>
          ,
          <country country="ES">Spain</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>UMUTeam, Universidad de Murcia</institution>
          ,
          <country country="ES">Spain</country>
        </aff>
        <aff id="aff5">
          <label>5</label>
          <institution>University of Trento</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper presents the PoliticIT 2023 shared task, organised at the EVALITA 2023 workshop. The task aims to extract politicians' ideology information from a set of tweets in Italian, framed as binary and multiclass classification. The task is designed to be privacy-preserving and is accompanied by a subtask targeting the identification of self-assigned gender as a demographic trait. Seven teams registered for the PoliticIT task, submitted results and presented working notes describing their systems. Most of the teams proposed Transformer-based approaches, while some also used traditional machine learning algorithms or a combination of both.</p>
      </abstract>
      <kwd-group>
        <kwd>Author profiling</kwd>
        <kwd>Political ideology</kwd>
        <kwd>Author analysis</kwd>
        <kwd>Demographic and psychographic traits</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction and Motivations</title>
      <p>
        The study of the political discourse on Social Media
Platforms is of paramount importance in order to understand
where society is heading. Political discourse is by
definition ideologically based and political ideologies are
spread with discourse [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]: for this reason, the analysis
of the latter cannot go without the understanding of the
former.
      </p>
      <sec id="sec-1-1">
        <title>Political ideology is defined as a psychographic trait</title>
        <p>
          that can help comprehend both individual and social
behaviour, including moral and ethical values, attitudes,
biases, and prejudices [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. In fact, this trait helps in understanding
how individuals think that society should be
organised and has a strong relationship with personality
traits, as demonstrated in [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. For instance, they found
that conscientiousness is strongly correlated with the
right wing, whereas openness to experience and
agreeableness were notably more correlated with the left wing.
        </p>
        <p>
          Moreover, political ideology has a great influence on the
daily life of each citizen. For example, [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] found a
correlation between political ideology and the attitude of
citizens to vaccination campaigns. Still, citizens react
to the political messages they are exposed to.
Therefore, studying how politicians spread their ideology through
social media discourse is useful to better analyse the
policies and perspectives that are proposed on how society
should be organized and work.
        </p>
      </sec>
      <sec id="sec-1-2">
        <title>In this scenario, the PoliticIT shared task organized</title>
        <p>
          at EVALITA 2023 [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] aims to extract political ideology
information from a set of tweets in Italian.
        </p>
        <p>
          ORCID: 0009-0006-9123-5316 (D. Russo); 0000-0003-3274-8825
(S. M. Jiménez-Zafra); 0000-0002-3651-2660 (J. A. García-Díaz);
0000-0003-2936-0256 (T. Caselli); 0000-0003-1582-6617 (M. Guerini)
        </p>
        <p>
          [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ], targeting author attribution, bot detection, gender
detection, and author obfuscation, among others. Other
initiatives, such as the PoliticES shared task [7], have
focused on capturing other traits such as the political
ideology expressed in a message. PoliticIT is a
twin task of PoliticES and aims at analysing political
ideology while being privacy-preserving. For this reason, a
novel text clustering methodology that creates “virtual
users” has been used on top of traditional anonymisation
procedures.
        </p>
        <p>The rest of the paper is organized as follows. Section
2 describes the PoliticIT shared task. Section 3 presents
the dataset provided in the competition. Section 4
summarises the participant approaches. Section 5 shows the
results and a discussion thereof. Finally, Section 6
concludes the paper with a discussion of the most interesting
outcomes of this task and possible future work.</p>
        <sec>
          <title>2. Task Description</title>
          <p>The PoliticIT task is structured along three subtasks:</p>
          <p>• Subtask A - Self-assigned Gender: given a message,
the system must assign a value for the gender of the author.
The set of labels has been determined according to the
personal web pages of the politicians of the Italian Parliament.
The task has been framed as a binary classification task
with M for men and F for women.</p>
          <p>• Subtask B - Political Ideology (binary): systems
are required to determine the political orientation
of a message; the binary version of the task
presents two macro-categories: Left and Right.</p>
          <p>• Subtask C - Political Ideology (multi-class): this
subtask presents a more fine-grained set of labels
for the political orientation expressed by a
given message. In this case, we employed four
labels: Left, Moderate-Left, Moderate-Right,
and Right.</p>
          <p>PoliticIT was organized through CodaLab.<sup>1</sup> The run of
the task is divided into three phases: (i) Practice, (ii) Evaluation,
and (iii) Post-evaluation. In the Practice phase,
the participants were initially provided with a subset
of the training data in order to familiarise themselves
with the training data format. During this phase, we
also provided a notebook comprising the code for our
Logistic Regression baseline, as a starting point for the
development of more efficient systems. The full training
set was released in February 2023. Currently, the task is
in its Post-evaluation phase, where participation is publicly
open to other teams and research groups from the
community.</p>
        </sec>
        <sec>
          <title>3. Datasets and Format</title>
          <p>This section provides the reader with an overview of
the dataset proposed for the PoliticIT 2023 shared task
along with a comprehensive description of the modalities
employed for creating it.</p>
          <sec>
            <title>3.1. Data Collection</title>
            <p>The dataset was collected from the Twitter timelines<sup>2</sup> of
Italian politicians using the UMUCorpusClassifier [8],
following a strategy similar to the one adopted in PoliticES
2022 [7] and in [9]. In particular, the data refer to the
politicians of the XIX legislature of the Italian
Republic. The list of deputies, senators, and ministers was taken
from the institutional websites of the Italian Parliament<sup>3</sup>
and Government.<sup>4</sup> All the politicians’ Twitter accounts
were manually retrieved, as they are not reported on the
institutional websites. We discarded politicians that did
not have a Twitter account or that were highly inactive on
this social media platform (i.e. whose accounts present very few or
old tweets). The corpus compilation used December 2022 as
the most recent date, with no start date
set. In the first iteration, we compiled 371,822 tweets
from 468 politicians between November 2010 and
December 2022. The average number of tweets per politician
is 794.49 but with a large standard deviation of 847.12,
which suggests that not all politicians are equally active
on Twitter. Thus, we decided to remove from the dataset
those politicians with fewer than 25 tweets, leaving a total
of 408 politicians.</p>
            <p>
              To balance the number of tweets per politician, we
first removed those tweets that are not written in Italian.
To detect the language, we employed the FastText language
identification model [
              <xref ref-type="bibr" rid="ref9">10</xref>
              ]. Secondly, we removed the
documents that shared content from news websites without
retweeting. To do this, we discarded tweets that
contained mentions of news websites, by detecting linguistic
clues within the text, such as the pipe symbol, which
is commonly employed by news websites to categorise
their content. Thirdly, we selected tweets based on topics.
An initial list of topics was extracted with BERTopic [
              <xref ref-type="bibr" rid="ref12">11</xref>
              ],
a topic modelling technique for the creation of
interpretable clusters based on Transformers and c-TF-IDF. In
particular, we leveraged the Italian BERT model from [
              <xref ref-type="bibr" rid="ref13">12</xref>
              ].
We obtained a list of topics organised into 21 categories.
This list was manually checked to introduce additional
keywords for categories such as European Union,
immigration, energy, feminism, sports, mafia or religion. Next,
we identified which topics appeared in each tweet and
prioritised those tweets that contained at least one topic.
We then selected the tweets according to their topic in
order to avoid any possible bias in the dataset.
            </p>
          </sec>
        </sec>
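        <sec>
          <title>Illustrative sketch: tweet filtering</title>
          <p>The collection filters described in Section 3.1 (dropping politicians with fewer than 25 tweets, then discarding non-Italian tweets and tweets that look like shared news content) can be sketched as follows. This is a minimal sketch under stated assumptions, not the organisers' code: the function names are hypothetical, the language detector is passed in as a callable rather than tied to a specific library, and the news heuristic is reduced to the pipe symbol mentioned in the text.</p>
          <preformat><![CDATA[
```python
from collections import defaultdict
from typing import Callable, Iterable

MIN_TWEETS = 25  # activity threshold stated in the text


def looks_like_news_share(text: str) -> bool:
    """Assumed heuristic: a pipe symbol hints at content copied from a news site."""
    return "|" in text


def filter_corpus(
    tweets: Iterable[tuple[str, str]],
    is_italian: Callable[[str], bool],
    min_tweets: int = MIN_TWEETS,
) -> dict[str, list[str]]:
    """Group (author, text) pairs by author and apply the filters in order.

    1. Drop authors with fewer than `min_tweets` collected tweets.
    2. Drop tweets the supplied detector does not consider Italian.
    3. Drop tweets that look like shared news content.
    """
    by_author: dict[str, list[str]] = defaultdict(list)
    for author, text in tweets:
        by_author[author].append(text)

    kept: dict[str, list[str]] = {}
    for author, texts in by_author.items():
        if len(texts) < min_tweets:
            continue  # author too inactive to be profiled
        kept[author] = [
            t for t in texts if is_italian(t) and not looks_like_news_share(t)
        ]
    return kept
```
]]></preformat>
          <p>In practice, is_italian would wrap whichever language identification model is available; the topic-based selection step is not covered by this sketch.</p>
        </sec>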
      </sec>
      <sec id="sec-1-3">
        <title>1https://codalab.lisn.upsaclay.fr/competitions/8507</title>
      </sec>
      <sec id="sec-1-4">
        <title>2https://developer.twitter.com/en/docs/twitter-api/v1/tweets/timelines/api-reference/get-statuses-home_timeline</title>
        <p>3Chamber of Deputies: https://www.camera.it/leg19/28;
Senate of the Republic: https://www.senato.it/leg/19/BGT/Schede/Attsen/Sena.html</p>
        <p>4https://www.governo.it/it/ministri-e-sottosegretari</p>
        <sec id="sec-1-4-1">
          <title>3.2. Data Annotation and Anonymization</title>
          <p>We enriched the dataset by assigning to each politician
a label indicating their political ideology. Political
ideologies have been directly derived from the politicians’
party affiliation. In particular, the mapping from the
politician to the political ideology was obtained through
a two-step procedure:</p>
          <p>1. Automatic labelling of politicians with their
current political party affiliation. The party
affiliation has been inferred from the parliamentary
group to which the parliament party belongs. The
data were extracted from the Italian institutional
websites on October 31, 2023, thus they do not
reflect changes in parliamentary groups following
this date.</p>
          <p>2. Mapping of the political parties to specific
political ideology labels. The set of labels has been
identified using Wikipedia.<sup>5</sup> We used four political
ideology labels, i.e. left, moderate left, right,
moderate right. Parties that are mapped in the centre,
or cross-party, were nevertheless assigned one
of the four aforementioned labels on the basis of
their political alliances and the programme they
presented during the 2022 Italian election
campaign. The decision to “force” this classification
was made to avoid excessive imbalance within
each class. Therefore, we labelled “Movimento 5
Stelle” as left, whereas “Azione” and “Italia Viva”
were labelled as moderate left.</p>
          <p>Gender labels were assigned through three different
approaches, depending on the source of the data. For the
Italian deputies, gender was directly extracted from the
institutional website, which allows the filtering of members
according to this trait. The website of the Senate
of the Republic does not clearly state the gender of the
members. In this case, we employed linguistic cues present
on the personal page of each senator to infer the gender.
Specifically, we looked at the Italian verb “nascere” in
its past participle form: “nato” for the male label and
“nata” for the female label. Finally, for the ministers,
gender was manually assigned, as the official Government
website does not comprise this information and does not
present helpful linguistic cues. In this case, ministers
were labelled according to their biological sex.</p>
          <p>Subsequently, to build a privacy-compliant approach,
we followed a two-step procedure including anonymization
and clustering:</p>
          <p>• Anonymized references in text - References
to politicians within Twitter mentions were
anonymized by replacing them with the token
@user. The rest of the in-text mentions were also
replaced with the @user token. Consequently,
the text traits cannot be guessed trivially by
reading a politician’s name and searching for personal
information on the Internet. We also replaced the
names of the political parties and of their Twitter
accounts with the POLITICAL_PARTY token.</p>
          <p>• Clustering procedure - Subsequently, we
created clusters of texts by mixing some of the
extracted tweets in order to prevent ethical and
privacy issues related to author profiling on Twitter.
All the clusters are composed of tweets written
by different politicians that share the same traits
under evaluation, i.e. political ideology and
self-assigned gender. To this end, we divided the
politicians into training and test sets in order to prevent
tweets from the same politician from
appearing in both splits. To generate
a cluster, we first set its demographic and
psychographic traits, and then randomly pick tweets
from users that share these traits. Thereby, each
cluster represents a “virtual user”, with its
self-assigned gender (male, female) and political
spectrum. For the latter, we labelled the data
according to two axes: binary (left, right) and multiclass
(left, moderate left, moderate right and right). At
the end of this process, we obtained 1751 clusters
with 80 tweets per cluster. It should be noted
that the clusters from the training and test sets
are independent, to prevent machine learning
approaches from identifying the authors rather than
the demographic and psychographic traits.</p>
        </sec>
      </sec>
      <sec id="sec-1-5">
        <title>3.3. Data Formats</title>
        <p>The training and test sets are produced in a ratio of nearly
75%-25%. Table 1 presents a summary of the distribution
of labels per subtask. In no case are the labels evenly
distributed. Male politicians are almost double the
number of female politicians, and over 200 more politicians
are from the left wing than from the right. As regards
the multi-class ideology, moderate left and right are the
most represented labels.
Ultimately, each entry of the PoliticIT dataset
comprises four elements: a cluster id, the self-assigned gender
label and the political ideology labels for binary and
multiclass classification. The dataset is organised at tweet level.
This means that each row represents one tweet. Each
line also contains a cluster id to identify the cluster
to which the tweet belongs, as well as the demographic
and psychographic traits of the cluster. Examples are
provided in Table 2. The full dataset, including the gold
labels, is available on CodaLab.<sup>6</sup></p>
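        <sec>
          <title>Illustrative sketch: anonymisation and virtual users</title>
          <p>The anonymisation and clustering procedure described in Section 3.2 can be illustrated with a short sketch. It is a simplified reading of the text under stated assumptions, not the organisers' implementation: the function names, the regular expression for mentions, sampling without replacement, and discarding leftover tweets that do not fill a whole cluster are all assumptions. The function is meant to be called separately on the training and test politicians, so that no author contributes to both splits.</p>
          <preformat><![CDATA[
```python
import random
import re

MENTION = re.compile(r"@\w+")  # assumed pattern for Twitter mentions


def anonymise(text, party_names):
    """Mask party names first, then replace every remaining mention with @user."""
    for name in sorted(party_names, key=len, reverse=True):
        text = re.sub(re.escape(name), "POLITICAL_PARTY", text, flags=re.IGNORECASE)
    return MENTION.sub("@user", text)


def build_virtual_users(tweets_by_author, traits_by_author, cluster_size=80, seed=0):
    """Pool tweets of authors sharing the same traits and cut fixed-size clusters.

    `traits_by_author` maps an author to a hashable trait tuple, for example
    (gender, ideology). Leftover tweets that do not fill a cluster are dropped.
    """
    rng = random.Random(seed)
    pools = {}
    for author, texts in tweets_by_author.items():
        pools.setdefault(traits_by_author[author], []).extend(texts)

    clusters = []
    for traits, pool in pools.items():
        rng.shuffle(pool)  # mix authors so a cluster is not a single person
        for start in range(0, len(pool) - cluster_size + 1, cluster_size):
            clusters.append(
                {"traits": traits, "tweets": pool[start:start + cluster_size]}
            )
    return clusters
```
]]></preformat>
          <p>Each returned cluster plays the role of one “virtual user”; assigning it a cluster id and writing one row per tweet would give the tweet-level layout described in Section 3.3.</p>
        </sec>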
      </sec>
      <sec id="sec-1-6">
        <title>5https://it.wikipedia.org/wiki/Partiti_politici_italiani</title>
      </sec>
      <sec id="sec-1-7">
        <title>6https://codalab.lisn.upsaclay.fr/competitions/8507</title>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>4. Systems Overview</title>
      <p>A total of seven teams participated in the PoliticIT task,
with all teams involved in each of the subtasks. The
majority of the participants represented academic
institutions. An overview of the system approaches can be
found in Table 3, while Section 4.1 provides further
details on the systems proposed.</p>
      <sec id="sec-2-1">
        <title>4.1. System Architectures</title>
        <p>
          ExtremITA [
          <xref ref-type="bibr" rid="ref14">13</xref>
          ]. The team proposed two systems. The
first system is based on Camoscio [
          <xref ref-type="bibr" rid="ref15">14</xref>
          ], the Italian version
of the Stanford Alpaca model [
          <xref ref-type="bibr" rid="ref16">15</xref>
          ], pre-trained to
generate text in response to users’ instructions passed as
input. The team performed further fine-tuning on triples
&lt;task, input, output&gt;. More precisely, the model used
the phrasal forms derived from the training data of all
EVALITA 2023 challenges: the task is a linguistic
description of the task to be solved, whereas the input-output
pairs are task-specific. The second
system is based on the IT5 Transformer [
          <xref ref-type="bibr" rid="ref17">16</xref>
          ], for which
fine-tuning was done on input-output pairs. As for the first system,
the model used the phrasal forms derived from the
training data of all EVALITA 2023 challenges, where the
input-output pair is task-specific.
        </p>
        <sec id="sec-2-1-1">
          <title>INFOTEC-LaBD [17].</title>
          <p>The team employed SVM classifiers with linear and nonlinear kernels. Specific attention
was given to the data representation. Indeed, the authors
employed low-dimensional projections to concisely
represent the dataset and its associated labels. These vectors
were used for training an SVM classifier.</p>
          <p>INGEOTEC. The team implemented different
configurations of Bag of Words (BoW) classifiers in all the
subtasks. In particular, for gender and political ideology
multiclass classification, INGEOTEC employed a stacked
generalization approach leveraging three BoW classifiers:
two BoW classifiers pre-trained on 5M Italian tweets,
and a BoW classifier trained on the PoliticIT training set.
Instead, for the political ideology binary subtask, the team
proposed a BoW classifier trained on the training set.</p>
        </sec>
        <sec id="sec-2-1-2">
          <title>Teeeech.</title>
          <p>The team proposed three Transformer-based classifiers trained independently of each other. The authors did not specify which Transformer models were used.</p>
        </sec>
        <sec id="sec-2-1-3">
          <title>Tübingen [18].</title>
          <p>The team proposed two main approaches: an SVM-based approach and a
Transformer-based approach. The former was the best-performing,
i.e., simple linear SVMs with sparse word/character
n-gram features, trained separately for each task, only</p>
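          <sec>
            <title>Illustrative sketch: TF-IDF weighted character n-grams</title>
            <p>To make the idea of sparse character n-gram features with TF-IDF weighting concrete, the following sketch builds such vectors with the Python standard library only. It is not the Tübingen system: the n-gram range, the smoothed IDF formula and the L2 normalisation are assumptions chosen for illustration, and the linear SVM that would be trained on these vectors is omitted.</p>
            <preformat><![CDATA[
```python
import math
from collections import Counter


def char_ngrams(text, n_min=1, n_max=3):
    """All character n-grams of length n_min..n_max, in order of appearance."""
    return [
        text[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(text) - n + 1)
    ]


def tfidf_vectors(docs, n_min=1, n_max=3):
    """Return one sparse {ngram: weight} dict per document, L2-normalised.

    Uses a smoothed IDF, log((1 + N) / (1 + df)) + 1, so that n-grams
    present in every document still receive a non-zero weight.
    """
    counts = [Counter(char_ngrams(d, n_min, n_max)) for d in docs]
    df = Counter(gram for c in counts for gram in c)  # document frequency
    n_docs = len(docs)

    vectors = []
    for c in counts:
        vec = {
            gram: tf * (math.log((1 + n_docs) / (1 + df[gram])) + 1.0)
            for gram, tf in c.items()
        }
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({gram: w / norm for gram, w in vec.items()})
    return vectors
```
]]></preformat>
            <p>Word n-grams could be obtained the same way by splitting each document into tokens before slicing.</p>
          </sec>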
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>5. Results and Discussion</title>
      <p>Focusing on each subtask, it appears evident that binary
classification of political ideologies is relatively easier
when compared to the other two subtasks, with the
best result being 0.928, obtained by Tübingen. The more
fine-grained the distinction of political ideologies, the
more challenging the task. This is not just an effect of
having multiple classes and of the distribution of the data,
but it also involves the subtleties and nuances in
policies across the four groups. In Figure 1 we display the
confusion matrices obtained by the Tübingen team for
the three classification subtasks. Focusing on the errors
of Subtask C, multi-class ideology classification, we can
notice that most of the errors concern misclassification
of the “extremes” (i.e., Left and Right) into the Moderate
Left category rather than of the moderate positions
(Moderate Left vs. Moderate Right). This indicates that,
at least in their communication on Twitter, differences
in the moderate political areas are stronger. A
further noticeable result is the fact that messages from
politicians on the Right spectrum tend to be assimilated
mostly with the Moderate Left and Moderate Right,
suggesting that the narratives of moderate groups tend
to assimilate issues and expressions of the Right.</p>
      <p>The identification of the gender of the author from a
tweet is also quite challenging even if it is framed as a
binary task. In this case, as illustrated by Figure 1a, most
of the errors affect the classification of male politicians
into females. The best system, INFOTEC-LaBD, obtains
0.824 macro F1, with a positive Δ from the second best
system (Tübingen) of 0.032 points.</p>
      <sec>
        <title>6. Conclusions</title>
        <p>In this paper, we have summarized the outcomes of
the first edition of the PoliticIT task at EVALITA 2023.
PoliticIT targets the identification of the political
ideology and gender of the author of a tweet. Political ideology
is a psychographic trait that can be used to understand
individual and social behaviour, and thus contribute to a
better understanding of society. The task introduces
an innovative method concerning the anonymization of
users to preserve privacy, allowing the investigation of
these sensitive topics in a fair and ethical way.</p>
        <p>PoliticIT has seen the participation of seven teams,
five of whom submitted a full report describing their
approach. The results indicate that fine-grained political
ideology distinction is more challenging than binary
classification between two extreme values. This appears to
be due to the absorption of the narratives of the extremes
into the moderate positions, especially from the
Moderate Left. This is a datum that could provide political
scientists and sociologists with additional insights into the
evolution of the Italian political system. A further
aspect that raises interest concerns the errors in Subtask A,
gender classification. In this case, most of the errors are
male politicians classified as females. A deep exploration
of the communication styles of the two genders and their
correlation with political affiliations is a promising path
to better understanding this behaviour.</p>
        <p>As expected, approaches based on Transformers are
the prevailing solutions presented by participating teams, but
some of them also used feature-based linear machine
learning systems. It is quite impressive that the best
performing team, Tübingen, uses linear SVMs with word
and character n-grams as features, weighting each feature
using TF-IDF. This indicates that simple methods are still
competitive and far from being fully overcome by neural
approaches.</p>
        <p>PoliticIT has been fully run on CodaLab. The task
is now in the Post-evaluation phase and it is still open
to submissions from other teams, making this resource
freely available to the NLP community for research
purposes.</p>
        <p>Future work will investigate the addition of an extra
subtask related to stance detection, to determine which
authors are in favor of certain topics and which users are
against. We can use this information to define clusters
of users and to observe whether there is a relationship
between the topics and the political ideology.</p>
        <p>the Junta de Andalucía (DOC_01073).</p>
      </sec>
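      <sec>
        <title>Illustrative sketch: macro-averaged F1</title>
        <p>The macro F1 scores reported in the results, such as the 0.824 obtained by the best gender system, average the per-class F1 with equal weight for every class, which matters here because the labels are not evenly distributed. The following is a minimal sketch of that computation; the function name is hypothetical and this is not the official evaluation script.</p>
        <preformat><![CDATA[
```python
from collections import Counter


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over all labels seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[g] += 1  # gold class gains a false negative

    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)
```
]]></preformat>
        <p>Because every class contributes equally, a system that favours the majority class is penalised more than it would be under plain accuracy.</p>
      </sec>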
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>N.</given-names>
            <surname>Fairclough</surname>
          </string-name>
          ,
          <article-title>Critical discourse analysis: The critical study of language</article-title>
          , Routledge,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>B.</given-names>
            <surname>Verhulst</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. J.</given-names>
            <surname>Eaves</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. K.</given-names>
            <surname>Hatemi</surname>
          </string-name>
          ,
          <article-title>Correlation not causation: The relationship between personality traits and political ideologies</article-title>
          ,
          <source>American journal of political science 56</source>
          (
          <year>2012</year>
          )
          <fpage>34</fpage>
          -
          <lpage>51</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Fatke</surname>
          </string-name>
          ,
          <article-title>Personality traits and political ideology: A first global assessment</article-title>
          ,
          <source>Political Psychology</source>
          <volume>38</volume>
          (
          <year>2017</year>
          )
          <fpage>881</fpage>
          -
          <lpage>899</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>B.</given-names>
            <surname>Baumgaertner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. E.</given-names>
            <surname>Carlisle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Justwan</surname>
          </string-name>
          ,
          <article-title>The influence of political ideology and trust on willingness to vaccinate</article-title>
          ,
          <source>PloS one 13</source>
          (
          <year>2018</year>
          )
          <article-title>e0191728</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M.</given-names>
            <surname>Lai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Menini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Polignano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Sprugnoli</surname>
          </string-name>
          , G. Venturi,
          <string-name>
            <surname>EVALITA</surname>
          </string-name>
          <year>2023</year>
          :
          <article-title>Overview of the 8th Evaluation Campaign of Natural Language Processing and Speech Tools for Italian</article-title>
          , in: M.
          <string-name>
            <surname>Lai</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Menini</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Polignano</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          <string-name>
            <surname>Russo</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <string-name>
            <surname>Sprugnoli</surname>
          </string-name>
          , G. Venturi (Eds.),
          <source>Proceedings of the Eighth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA</source>
          <year>2023</year>
          ), CEUR.org, Parma, Italy,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bevendorf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Chulvi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. L. D. L. Peña</given-names>
            <surname>Sarracén</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kestemont</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Manjavacas</surname>
          </string-name>
          , I. Markov,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mayerl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Rangel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          , et al.,
          <article-title>Overview of PAN 2021: authorship verification, profiling hate speech spreaders on twitter, and style change detection</article-title>
          , in:
          <source>International Conference of the Cross-Language Evaluation Forum for European Languages</source>
          , Springer,
          <year>2021</year>
          , pp.
          <fpage>419</fpage>
          -
          <lpage>431</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J. A.</given-names>
            <surname>García-Díaz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Jiménez-Zafra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-T. M.</given-names>
            <surname>Valdivia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>García-Sánchez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. A.</given-names>
            <surname>Ureña-López</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Valencia-García</surname>
          </string-name>
          ,
          <article-title>Overview of PoliticEs 2022: Spanish Author Profiling for Political Ideology</article-title>
          ,
          <source>Procesamiento del Lenguaje Natural</source>
          <volume>69</volume>
          (
          <year>2022</year>
          )
          <fpage>265</fpage>
          -
          <lpage>272</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J. A.</given-names>
            <surname>García-Díaz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Á.</given-names>
            <surname>Almela</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Alcaraz-Mármol</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Valencia-García</surname>
          </string-name>
          ,
          <article-title>Umucorpusclassifier: Compilation and evaluation of linguistic corpus for natural language processing tasks</article-title>
          ,
          <source>Procesamiento del Lenguaje Natural</source>
          <volume>65</volume>
          (
          <year>2020</year>
          )
          <fpage>139</fpage>
          -
          <lpage>142</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J. A.</given-names>
            <surname>García-Díaz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Colomo-Palacios</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Valencia-García</surname>
          </string-name>
          ,
          <article-title>Psychographic traits identification based on political ideology: An author analysis study on spanish politicians' tweets posted in 2020</article-title>
          ,
          <source>Future Generation Computer Systems</source>
          <volume>130</volume>
          (
          <year>2022</year>
          )
          <fpage>59</fpage>
          -
          <lpage>74</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A.</given-names>
            <surname>Joulin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>É.</given-names>
            <surname>Grave</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bojanowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mikolov</surname>
          </string-name>
          ,
          <article-title>Bag of tricks for efficient text classification</article-title>
          , in:
          <source>Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>427</fpage>
          -
          <lpage>431</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M.</given-names>
            <surname>Grootendorst</surname>
          </string-name>
          ,
          <article-title>Bertopic: Neural topic modeling with a class-based tf-idf procedure</article-title>
          ,
          <source>arXiv preprint arXiv:2203.05794</source>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>S.</given-names>
            <surname>Schweter</surname>
          </string-name>
          ,
          <article-title>Italian bert and electra models</article-title>
          ,
          <year>2020</year>
          . URL: https://doi.org/10.5281/zenodo.4263142. doi:10.5281/zenodo.4263142.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Hromei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Croce</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Basile</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Basili</surname>
          </string-name>
          ,
          <article-title>ExtremITA at EVALITA 2023: Multi-Task Sustainable Scaling to Large Language Models at its Extreme</article-title>
          ,
          <source>EVALITA 2023 Eighth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>A.</given-names>
            <surname>Santilli</surname>
          </string-name>
          ,
          <article-title>Camoscio: An italian instruction-tuned llama</article-title>
          , https://github.com/teelinsan/camoscio,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>R.</given-names>
            <surname>Taori</surname>
          </string-name>
          , I. Gulrajani,
          <string-name>
            <given-names>T.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Dubois</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Guestrin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. B.</given-names>
            <surname>Hashimoto</surname>
          </string-name>
          ,
          <article-title>Stanford alpaca: An instruction-following llama model</article-title>
          , https://github.com/tatsu-lab/stanford_alpaca,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>G.</given-names>
            <surname>Sarti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Nissim</surname>
          </string-name>
          ,
          <article-title>It5: Large-scale text-to-text pretraining for italian language understanding and generation</article-title>
          ,
          <source>ArXiv preprint 2203.03759</source>
          (
          <year>2022</year>
          ). URL: https://arxiv.org/abs/2203.03759.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>H.</given-names>
            <surname>Cabrera-Pineda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. S.</given-names>
            <surname>Téllez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Miranda</surname>
          </string-name>
          ,
          <article-title>INFOTEC-LaBD at PoliticIT: Political Ideology Detection in Italian Texts</article-title>
          ,
          <source>EVALITA 2023 Eighth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>C.</given-names>
            <surname>Çöltekin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Brivio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Can</surname>
          </string-name>
          ,
          <article-title>Tübingen at PoliticIT: Exploring SVMs, Pretrained Language Models, and Linguistic Transfer for Ideology Detection in Social Media</article-title>
          ,
          <source>EVALITA 2023 Eighth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>A.</given-names>
            <surname>Conneau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Khandelwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Chaudhary</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Wenzek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Guzmán</surname>
          </string-name>
          , É. Grave,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ott</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zettlemoyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Stoyanov</surname>
          </string-name>
          ,
          <article-title>Unsupervised cross-lingual representation learning at scale</article-title>
          , in:
          <source>Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>8440</fpage>
          -
          <lpage>8451</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>T.</given-names>
            <surname>Erjavec</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ogrodniczuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Osenova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ljubešić</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Simov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Pančur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rudolf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kopp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Barkarson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Steingrímsson</surname>
          </string-name>
          , et al.,
          <article-title>The parlamint corpora of parliamentary proceedings</article-title>
          ,
          <source>Language resources and evaluation</source>
          <volume>57</volume>
          (
          <year>2023</year>
          )
          <fpage>415</fpage>
          -
          <lpage>448</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>R.</given-names>
            <surname>Pan</surname>
          </string-name>
          , Á. Almela,
          <string-name>
            <given-names>F.</given-names>
            <surname>García-Sánchez</surname>
          </string-name>
          ,
          <article-title>UMUTeam at PoliticIT-EVALITA2023: Evaluating Transformer Model for Detecting Political Ideology in Italian Texts</article-title>
          ,
          <source>EVALITA 2023 Eighth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>M. Á.</given-names>
            <surname>Rodríguez-García</surname>
          </string-name>
          ,
          <article-title>URJC-Team at EVALITA 2023: Political Ideology Detection in Italian Texts Using Transformers Architectures</article-title>
          ,
          <source>EVALITA 2023 Eighth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>J.</given-names>
            <surname>Tiedemann</surname>
          </string-name>
          ,
          <article-title>Parallel data, tools and interfaces in OPUS</article-title>
          , in:
          <source>Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)</source>
          , European Language Resources Association (ELRA), Istanbul, Turkey,
          <year>2012</year>
          , pp.
          <fpage>2214</fpage>
          -
          <lpage>2218</lpage>
          . URL: http://www.lrec-conf.org/proceedings/lrec2012/pdf/463_Paper.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>P. J. Ortiz</given-names>
            <surname>Suárez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Sagot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Romary</surname>
          </string-name>
          ,
          <article-title>Asynchronous pipelines for processing huge corpora on medium to low resource infrastructures</article-title>
          ,
          <source>Proceedings of the Workshop on Challenges in the Management of Large Corpora (CMLC-7) 2019. Cardiff, 22nd July 2019</source>
          , Leibniz-Institut für Deutsche Sprache, Mannheim,
          <year>2019</year>
          , pp.
          <fpage>9</fpage>
          -
          <lpage>16</lpage>
          . URL: http://nbn-resolving.de/urn:nbn:de:bsz:mh39-90215. doi:10.14618/ids-pub-9021.
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>M.</given-names>
            <surname>Polignano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Basile</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>de Gemmis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Semeraro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Basile</surname>
          </string-name>
          ,
          <article-title>AlBERTo: Italian BERT Language Understanding Model for NLP Challenging Tasks Based on Tweets</article-title>
          , in:
          <source>Proceedings of the Sixth Italian Conference on Computational Linguistics (CLiC-it 2019)</source>
          , volume
          <volume>2481</volume>
          , CEUR,
          <year>2019</year>
          . URL: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85074851349&amp;partnerID=40&amp;md5=7abed946e06f76b3825ae5e294ffac14.
        </mixed-citation>
      </ref>
    </ref-list>
    <ack>
      <title>Acknowledgments</title>
      <p>This work is part of the research projects LaTe4PoliticES (PID2022-138099OB-I00) funded by MCIN/AEI/10.13039/501100011033 and the European Fund for Regional Development (FEDER)-a way to make Europe and LaTe4PSP (PID2019-107652RB-I00/AEI/10.13039/501100011033) funded by MCIN/AEI/10.13039/501100011033. This work is also part of the research projects AIInFunds (PDC2021-121112-I00) and LT-SWM (TED2021-131167B-I00) funded by MCIN/AEI/10.13039/501100011033 and by the European Union NextGenerationEU/PRTR. It also has been partially supported by Project CONSENSO (PID2021-122263OB-C21), Project MODERATES (TED2021-130145B-I00) and Project SocialTox (PDC2022-133146-C21) funded by MCIN/AEI/10.13039/501100011033 and by the European Union NextGenerationEU/PRTR, Project PRECOM (SUBV-00016) funded by the Ministry of Consumer Affairs of the Spanish Government, Project FedDAP (PID2020-116118GA-I00) supported by MICINN/AEI/10.13039/501100011033 and WeLee project (1380939, FEDER Andalucía 2014-2020) funded by the Andalusian Regional Government. Salud María Jiménez-Zafra has been partially supported by a grant from Fondo Social Europeo and the Administration of</p>
    </ack>
  </back>
</article>