<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Conference and Labs of the Evaluation Forum, September</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Profiling Cryptocurrency Influencers with Few-Shot Learning Using Data Augmentation and ELECTRA</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Marco Siino</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maurizio Tesconi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ilenia Tinnirello</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Cyber Intelligence Lab, Institute of Informatics and Telematics, National Research Council</institution>
          ,
          <addr-line>Pisa, 56127</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Engineering, University of Palermo</institution>
          ,
          <addr-line>Palermo, 90128</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>1</volume>
      <fpage>8</fpage>
      <lpage>21</lpage>
      <abstract>
<p>With this work we propose an application of the ELECTRA Transformer, fine-tuned on two augmented versions of the same training dataset. Our team developed this framework to take part in the Profiling Cryptocurrency Influencers with Few-Shot Learning task hosted at PAN@CLEF2023. Our strategy consists of an early data augmentation stage followed by the fine-tuning of ELECTRA. In the first stage, we augment the original training dataset provided by the organizers using backtranslation. Using this augmented version of the training dataset, we fine-tune ELECTRA. Finally, using the fine-tuned ELECTRA, we infer the labels of the samples provided in the test set. To develop and test our model we used a two-way validation on the training set: first we evaluate all the metrics on the augmented training set, and then we evaluate them on the original training set. The metrics we considered are Accuracy, Macro F1, Micro F1, Recall and Precision. According to the official evaluator, our best submission reached a Macro F1 value equal to 0.3762.</p>
      </abstract>
      <kwd-group>
        <kwd>cryptocurrency influencers</kwd>
        <kwd>few-shot learning</kwd>
        <kwd>author profiling</kwd>
        <kwd>text classification</kwd>
        <kwd>Twitter</kwd>
        <kwd>data augmentation</kwd>
        <kwd>electra</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>English tweets each were available and classes available for predictions were: (1) subjective
opinion, (2) financial information, (3) advertising, (4) announcement. In this paper we discuss
the framework we used to participate in the first subtask (i.e., low-resource influencer profiling).</p>
      <p>After this introductory section, Section 2 discusses some traditional and deep approaches
to text classification, along with a brief discussion of some of the architectures proposed in
previous editions of PAN. Section 3 describes our method, including
the training and simulation steps. Section 4 details the experimental setup and the
evaluation of our framework, reporting the results obtained. Section 5 introduces some
future work and concludes the paper.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        A comprehensive discussion on the proposed task for PAN@CLEF2023 is conducted in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. To
develop our proposed approaches [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ], we evaluated the best performing methods
participating in the previous shared tasks organized by PAN. We looked at the results of the
winning team at the author profiling task in 2021, where the best performing model was
a shallow CNN presented in [
        <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
        ]. We also considered the winning model at PAN@CLEF2022
where the authors won the challenge thanks to a soft voting ensemble technique that combines
BERTweet models with various loss functions and a BERT feature-based CNN model. In the
2020 edition of the author profiling task [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], based on their most recent 100 tweets, the aim was
to identify the authors likely to disseminate false information. The winners at the shared task
were [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] and [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. On the given test set, their models’ total accuracy was 0.77. The winning
strategies are based on n-grams, an SVM, and an ensemble of other machine learning models.
Other ensemble models have been proposed in later PAN tasks on irony
and stereotype spreader detection [
        <xref ref-type="bibr" rid="ref11 ref12">11, 12</xref>
        ].
      </p>
      <p>
        We also examined a number of contemporary models for text categorization problems. It is
important to note that Explainable Artificial Intelligence (XAI) techniques are increasingly being
used in place of black-box strategies. Several of these graph-based techniques are
applied in real-world applications like text classification [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], traffic prediction [14], computer vision
[15] and social networking [16]. The authors in [17] compare SVM, Naive Bayes, Logistic Regression,
and Recurrent Neural Networks (RNNs), as well as other popular machine learning methods.
Experimental results demonstrate that SVM and Naive Bayes outperform the other approaches on
the dataset employed. Apart from the RNN, however, they do not evaluate CNNs or other deep
learning-based models. In another relevant comparative study [18], scholars assess seven
machine learning methods on three separate datasets: Gradient Boosting, Gaussian Naive
Bayes, SVM, Random Forest, AdaBoost, KNN and a Multi-Layer Perceptron. Gradient Boosting
surpasses the others in terms of accuracy and F1 score, although this work too reports no
additional experiments with deep models.
      </p>
      <p>In [19] the authors address the task of automatically detecting spreaders of COVID-19 fake news
by extending the CoAID dataset [20]. Their stacked Transformer-based
neural network combines a deep learning model with a Transformer's
ability to produce language embeddings.</p>
      <p>In [21], the authors profile fake news spreaders using psycholinguistic and linguistic
features as input to a CNN. The outcomes of their experiments demonstrate how well the suggested
model categorizes users as fake news spreaders. The dataset used for the authors' comparison
was created expressly for their goal. However, only BERT was tested as a Transformer model, and
no further investigation is provided of the performance of deep models. Their model has
also been evaluated on the PAN2020 dataset in [22]. On the English and Spanish datasets, the
tested model achieves a binary accuracy of 0.52 and 0.51, respectively. In the same work [22],
the authors propose a novel model that outperforms the two winning models of PAN@CLEF2020
on both languages by utilizing personality data and visual features.</p>
      <p>In the work conducted in [23], scholars suggest using a CNN for sentiment classification.
Through tests on three well-known datasets, the authors demonstrate that using consecutive
convolutional layers is efficient for categorizing lengthy texts.</p>
      <p>With regard to cryptocurrencies, the authors in [24] develop a number of sequence-to-sequence
hyperbolic models suitable for bubble detection, based on the
power-law dynamics of cryptocurrencies and user activity on social media. The study described
in [25] is intriguing from the standpoint of NLP. The authors use a combination of statistical
models and NLP techniques to examine what happened on social media starting in June 2019,
with a focus on the rise of the Ethereum and Bitcoin prices, in order to better understand the
connections between cryptocurrency values and social media.</p>
      <p>Finally, the survey in [26] gives a succinct rundown of various text classification algorithms.
This overview discusses several methods for extracting text features, dimensionality reduction,
existing algorithms and methods, and evaluation strategies.</p>
      <p>
        Given the performances shown in another international multi-label text classification
challenge [27] and, as discussed in [
        <xref ref-type="bibr" rid="ref14">28, 29</xref>
        ], presuming that conventional natural language processing
methods can truly be outperformed by deep AI models, we decided to employ a Transformer-based
architecture (namely, ELECTRA [
        <xref ref-type="bibr" rid="ref15">30</xref>
        ]). Considering that the proposed task hosted at
PAN@CLEF2023 involves few-shot learning, we also evaluated the augmentation technique
discussed in [
        <xref ref-type="bibr" rid="ref16">31</xref>
        ]. In this work the authors propose a data augmentation technique based on
backtranslation to augment samples in the dataset.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. The Proposed Approach</title>
      <p>
        An empirical experiment with three stages is used to assess the suggested framework. First,
datasets without our augmentation modules are used to construct the baseline of author profiling
models. In the second phase, backtranslation from English to a target language and back to
English is used to create enriched data. For our two submissions we used two different target
languages. The first one is Italian, according to our previous study discussed in [
        <xref ref-type="bibr" rid="ref16">31</xref>
        ]. The second
language we used was German. This choice was motivated by the promising results obtained in
a similar study based on backtranslation [
        <xref ref-type="bibr" rid="ref17">32</xref>
        ]. The backtranslated sample is then concatenated
to the original one. In the final stage, the augmented data are used to train ELECTRA [
        <xref ref-type="bibr" rid="ref15">30</xref>
        ]
and to compare the performances with and without the backtranslation module. In our setting,
each sample is a user's set of tweets, and we hypothesise that semantically enriching the user's
tweets with our proposed modules can improve performance. By augmenting each sample
with one or multiple translations, we aim to increase the diversity and informativeness of the
data and improve the representation of the input, ultimately leading to better classification
performance of different NLP models. Our results outperform the non-augmented baseline,
showing that expanding samples with multiple languages using backtranslation leads to
improved performance in author profiling tasks.
      </p>
      <p>
        No preprocessing is applied to the source text in the training datasets. In Figure 1 we show
the frameworks we used for our two submissions to subtask 1. In the first submission we
augmented the training set by backtranslating through Italian [
        <xref ref-type="bibr" rid="ref16">31</xref>
        ] and in the second submission
we backtranslated through German. In [
        <xref ref-type="bibr" rid="ref16">31</xref>
        ], as a last stage classifier, the authors did not use a
Transformer but a shallow CNN instead.
      </p>
      <p>
        The training of our model is performed on the augmented versions of the datasets. For the
first submission we fine-tuned ELECTRA for 30 epochs on the dataset augmented using the
backtranslation technique with Italian as the pivot language. For the second one we used German as
the target language. We chose ELECTRA both for its interesting performance in terms of training
and inference time and for its results [
        <xref ref-type="bibr" rid="ref15 ref18">30, 33</xref>
        ]. In both cases we backtranslated the samples using the
Google Translate API1. After the training phase, we used the fine-tuned ELECTRA to predict
on the unlabeled test set provided by the task organizers.
      </p>
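      <p>The augmentation step described above can be sketched as follows. This is an illustrative sketch, not our actual code: the translator here is a stub standing in for the Google Translate API call, and the function names are invented for the example.</p>

```python
def backtranslate_augment(text, translate, pivot="it"):
    """Translate a sample into a pivot language (e.g. Italian or German),
    translate it back to English, and concatenate the backtranslation
    to the original sample."""
    pivoted = translate(text, src="en", dest=pivot)
    back = translate(pivoted, src=pivot, dest="en")
    return text + " " + back

# Stub translator used only for this example; a real pipeline would
# call a translation API here instead.
def fake_translate(text, src, dest):
    return text.upper() if dest == "en" else text

print(backtranslate_augment("to the moon", fake_translate))  # → to the moon TO THE MOON
```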
      <sec id="sec-3-1">
        <title>Notes</title>
        <p>1. https://pypi.org/project/googletrans/</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experimental Evaluation</title>
      <sec id="sec-4-1">
        <title>4.1. Experimental Setup</title>
        <p>
          Our training and inference models, developed in TensorFlow using the Simple Transformers2
library, are publicly available as a Jupyter Notebook on GitHub3. For both the training and the
inference phases we made use of ELECTRA. As stated in [
          <xref ref-type="bibr" rid="ref15">30</xref>
          ], ELECTRA
replaces certain tokens with plausible alternatives sampled from a small generator
network, instead of masking the input as in BERT. A discriminative model is then trained to
predict whether each token in the corrupted input was replaced by a generator sample or not,
as opposed to a model that predicts the original identities of the corrupted tokens.
ELECTRA can also be employed as an embedding layer alongside a graph neural network,
as in [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]. In our experiments, the original version of ELECTRA, presented in [
          <xref ref-type="bibr" rid="ref15">30</xref>
          ], was used.
In both submissions we used a batch size of 1. We fine-tuned ELECTRA for 30 epochs; no
improvement was obtained by fine-tuning for more epochs. Furthermore, we repeated the
fine-tuning for five runs.
        </p>
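        <p>As a toy illustration of the replaced-token-detection objective described above, the following sketch (sentence, vocabulary and function names invented for the example) builds a corrupted input and the per-token binary targets the discriminator is trained on.</p>

```python
import random

random.seed(0)

tokens = ["the", "chef", "cooked", "the", "meal"]
vocab = ["the", "chef", "cooked", "meal", "ate", "dog", "ran"]

def corrupt(tokens, positions, vocab):
    """Replace the tokens at the given positions with sampled vocabulary
    items, recording which positions were replaced (1) vs. original (0).
    In ELECTRA the replacements come from a small generator network."""
    corrupted = list(tokens)
    labels = [0] * len(tokens)
    for i in positions:
        candidates = [w for w in vocab if w != tokens[i]]
        corrupted[i] = random.choice(candidates)
        labels[i] = 1
    return corrupted, labels

corrupted, labels = corrupt(tokens, positions=[2], vocab=vocab)
print(corrupted)
print(labels)  # the discriminator predicts these 0/1 targets per token
```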
      </sec>
      <sec id="sec-4-2">
        <title>4.2. The Dataset</title>
        <p>The dataset provided by the PAN organizers consists of a set of Twitter authors and a variable
number of corresponding tweets. For each author in the training set the labels are also provided.
Figure 2 reports the image from the official task website4.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Results</title>
        <p>The official metric used for the author profiling task at PAN@CLEF2023 is the Macro F1. This
metric, used along with the others in the rest of this section, is defined in (1) as the average of
the per-class F1 scores over the number of classes:</p>
        <p>Macro F1 = (Σ F1_c) / #classes (1)</p>
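        <p>Since Macro F1 averages the per-class F1 scores, it can be computed with a few lines of code; the following is a minimal self-contained implementation of the metric (not the official evaluator).</p>

```python
def macro_f1(y_true, y_pred):
    """Macro F1: compute the F1 score of each class separately,
    then average over the number of classes."""
    classes = sorted(set(y_true))
    per_class = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        per_class.append(f1)
    return sum(per_class) / len(per_class)

labels = ["announcement", "announcement", "advertising", "opinion"]
preds = ["announcement", "advertising", "advertising", "opinion"]
print(round(macro_f1(labels, preds), 4))  # → 0.7778
```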
        <p>In Table 1 we report the results of our two submissions on the augmented versions of the
datasets in terms of Macro F1. We report the highest Macro F1 across the 30 epochs of training and
also the median one. The median is calculated over five random initializations and fine-tunings
of ELECTRA. We also report the loss at the end of the training stage.</p>
        <p>In Table 2 we report the results using all the metrics provided by the official evaluator available
on GitHub5 for all the available classes, using the original non-augmented version of the
training set. Finally, in Table 3, we report the results with the metrics already presented in Table
1, but using the original non-augmented version of the training set.</p>
        <p>Although the Macro F1 and the accuracy show that ELECTRA fine-tuned on the Italian
backtranslated version of the dataset outperforms the German one, as can be seen from Table 2,
for three out of five classes the Precision is higher for the submission using German
backtranslation. However, a further investigation of the effect of backtranslation on the
original samples could eventually explain these differences among the classes.
Finally, while on the augmented dataset used for training both fine-tuned ELECTRA models
reach a Macro F1 equal to 0.9937, the version fine-tuned with the Italian backtranslation
appears to generalize better, with a gap of 5-6% in Macro F1 and Accuracy when
evaluated on the original non-augmented training set. On the official test set provided, our best
submission reached a Macro F1 value equal to 0.3762.</p>
        <p>2. https://simpleTransformers.ai/about/ 3. https://github.com/marco-siino/PAN-CRYPTO-2023 4. https://pan.webis.de/clef23/pan23-web/author-profiling.html 5. https://github.com/pan-webis-de/pan-code/tree/master/clef23/profiling-cryptocurrency-influencers</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion and Future Works</title>
      <p>In this paper we have described our submitted model for our participation at the author profiling
task hosted at PAN@CLEF 2023. It consists of a backtranslation layer followed by an expansion
module to expand every sample in the dataset. These augmented versions of the samples are
then provided to ELECTRA both for training and inference phases.</p>
      <p>We intend to assess performance using different backtranslation techniques and other
languages in future studies. For future work we also plan to perform an error analysis on the
authors who were incorrectly classified, to assess their impact on performance for the
considered classification task. Increasing the model's complexity, perhaps by utilizing other recent
generative tools (e.g., ChatGPT), is another way that could eventually boost accuracy in author
profiling tasks. Given the size of the dataset that was provided, additional data augmentation
techniques could also be used. Before the training and testing phases of our model, some
research into the content of each tweet could inform strategies to remove noise (i.e., irrelevant
features) from the input samples. According to our research, enhancing samples with their
respective backtranslations can lead to performance improvements.</p>
      <p>As future work, it would also be interesting to investigate the performance of our approach
on other datasets used for author profiling tasks. Furthermore, it could also be of interest
to evaluate the impact of other languages used in the backtranslation module discussed here.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>We would like to thank anonymous reviewers for their comments and suggestions that have
helped to improve the presentation of the paper.</p>
    </sec>
    <sec id="sec-7">
      <title>CRediT Authorship Contribution Statement</title>
      <p>Marco Siino: Conceptualization, Formal analysis, Investigation, Methodology, Resources,
Software, Validation, Visualization, Writing - original draft, Writing - review &amp; editing. Maurizio
Tesconi: Writing - review &amp; editing. Ilenia Tinnirello: Writing - review &amp; editing.</p>
      <p>and Labs of the Evaluation Forum, CLEF '2022, Bologna, Italy, 2022, pp. 573-583.
[14] Y. Li, R. Yu, C. Shahabi, Y. Liu, Diffusion convolutional recurrent neural network:
Data-driven traffic forecasting, arXiv preprint arXiv:1707.01926 (2017).
[15] P. Pradhyumna, G. Shreya, et al., Graph neural network (gnn) in image and video
understanding using deep learning for computer vision applications, in: 2021 Second
International Conference on Electronics and Sustainable Communication Systems (ICESC), IEEE,
2021, pp. 1183–1189.
[16] M. Siino, M. La Cascia, I. Tinnirello, Whosnext: Recommending twitter users to follow
using a spreading activation network based approach, in: 2020 International Conference
on Data Mining Workshops (ICDMW), IEEE, 2020, pp. 62–70.
[17] E. M. Mahir, S. Akhter, M. R. Huq, et al., Detecting fake news using machine learning and
deep learning algorithms, in: 2019 7th International Conference on Smart Computing &amp;
Communications (ICSCC), IEEE, 2019, pp. 1–5.
[18] A. P. S. Bali, M. Fernandes, S. Choubey, M. Goel, Comparative performance of machine
learning algorithms for fake news detection, in: International conference on advances in
computing and data sciences, Springer, 2019, pp. 420–430.
[19] S. Leonardi, G. Rizzo, M. Morisio, Automated classification of fake news spreaders to break
the misinformation chain, Information 12 (2021) 248.
[20] L. Cui, D. Lee, Coaid: Covid-19 healthcare misinformation dataset, arXiv preprint
arXiv:2006.00885 (2020).
[21] A. Giachanou, B. Ghanem, E. A. Ríssola, P. Rosso, F. Crestani, D. Oberski, The impact of
psycholinguistic patterns in discriminating between fake news spreaders and fact checkers,
Data &amp; Knowledge Engineering 138 (2022) 101960.
[22] R. Cervero, P. Rosso, G. Pasi, Profiling Fake News Spreaders: Personality and Visual
Information Matter, in: International Conference on Applications of Natural Language to
Information Systems, Springer, 2021, pp. 355–363.
[23] H. Kim, Y.-S. Jeong, Sentiment classification using convolutional neural networks, Applied Sciences 9 (2019) 2347.</p>
      <p>
[24] R. Sawhney, S. Agarwal, V. Mittal, P. Rosso, V. Nanda, S. Chava, Cryptocurrency bubble
detection: a new stock market dataset, financial task &amp; hyperbolic models, arXiv preprint
arXiv:2206.06320 (2022).
[25] M. Ortu, S. Vacca, G. Destefanis, C. Conversano, Cryptocurrency ecosystems and social
media environments: An empirical analysis through hawkes’ models and natural language
processing, Machine Learning with Applications 7 (2022) 100229.
[26] K. Kowsari, K. Jafari Meimandi, M. Heidarysafa, S. Mendu, L. Barnes, D. Brown, Text
classification algorithms: A survey, Information 10 (2019) 150.
[27] M. Siino, M. La Cascia, I. Tinnirello, McRock at SemEval-2022 task 4: Patronizing and
condescending language detection using multi-channel CNN, hybrid LSTM, DistilBERT
and XLNet, in: Proceedings of the 16th International Workshop on Semantic Evaluation
(SemEval-2022), Association for Computational Linguistics, Seattle, United States, 2022,
pp. 409–417. URL: https://aclanthology.org/2022.semeval-1.55. doi:10.18653/v1/2022.
semeval-1.55.
[28] H. Wu, Y. Liu, J. Wang, Review of text classification methods on deep learning, CMC-Computers, Materials &amp; Continua 63 (2020) 1309-1321.</p>
    </sec>
    <sec id="sec-8">
      <title>A. Online Resources</title>
      <sec id="sec-8-1">
        <title>The source code of our model is available via</title>
        <p>• GitHub</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bevendorf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Borrego-Obrador</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Chinea-Ríos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Franco-Salvador</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Fröbe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Heini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kredens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mayerl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Pęzik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Rangel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stamatatos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wolska</surname>
          </string-name>
          , E. Zangerle, Overview of PAN 2023:
          <article-title>Authorship Verification, Multi-Author Writing Style Analysis, Profiling Cryptocurrency Influencers, and Trigger Detection</article-title>
          , in: A.
          <string-name>
            <surname>Arampatzis</surname>
            , E. Kanoulas,
            <given-names>T.</given-names>
          </string-name>
          <string-name>
            <surname>Tsikrika</surname>
            ,
            <given-names>A. G.</given-names>
          </string-name>
          <string-name>
            <surname>Stefanos Vrochidis</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Aliannejadi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Vlachos</surname>
          </string-name>
          , G. Faggioli, N. Ferro (Eds.),
          <source>Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Fourteenth International Conference of the CLEF Association (CLEF</source>
          <year>2023</year>
          ), Lecture Notes in Computer Science, Springer,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Chinea-Rios</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Borrego-Obrador</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Franco-Salvador</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Rangel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <article-title>Profiling Cryptocurrency Influencers with Few shot Learning at PAN 2023, in: CLEF 2023 Labs and Workshops</article-title>
          , Notebook Papers,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bevendorf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Chinea-Ríos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Franco-Salvador</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Heini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Körner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kredens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mayerl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Pęzik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Rangel</surname>
          </string-name>
          , et al.,
          <source>Overview of pan</source>
          <year>2023</year>
          :
          <article-title>Authorship verification, multi-author writing style analysis, profiling cryptocurrency influencers, and trigger detection</article-title>
          ,
          <source>in: Advances in Information Retrieval: 45th European Conference on Information Retrieval</source>
          ,
          <string-name>
            <surname>ECIR</surname>
          </string-name>
          <year>2023</year>
          , Dublin, Ireland, April 2-
          <issue>6</issue>
          ,
          <year>2023</year>
          , Proceedings,
          <string-name>
            <surname>Part</surname>
            <given-names>III</given-names>
          </string-name>
          , Springer,
          <year>2023</year>
          , pp.
          <fpage>518</fpage>
          -
          <lpage>526</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Siino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Tesconi</surname>
          </string-name>
          ,
          <string-name>
            <surname>I. Tinnirello</surname>
          </string-name>
          ,
          <article-title>Profiling cryptocurrency influencers with few-shot learning using data augmentation and electra</article-title>
          ,
          <source>in: CLEF 2023 Labs and Workshops</source>
          , Notebook Papers,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M.</given-names>
            <surname>Siino</surname>
          </string-name>
          ,
          <string-name>
            <surname>I. Tinnirello,</surname>
          </string-name>
          <article-title>Xlnet on augmented dataset to profile cryptocurrency influencers</article-title>
          ,
          <source>in: CLEF 2023 Labs and Workshops</source>
          , Notebook Papers,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Siino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Di Nuovo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Tinnirello</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>La Cascia</surname>
          </string-name>
          ,
          <article-title>Detection of hate speech spreaders using convolutional neural networks</article-title>
          ,
          <source>in: PAN 2021 Profiling Hate Speech Spreaders on Twitter@ CLEF</source>
          , volume
          <volume>2936</volume>
          , CEUR,
          <year>2021</year>
          , pp.
          <fpage>2126</fpage>
          -
          <lpage>2136</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>Siino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Di Nuovo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Tinnirello</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>La Cascia</surname>
          </string-name>
          ,
          <article-title>Fake news spreaders detection: Sometimes attention is not all you need</article-title>
          ,
          <source>Information</source>
          <volume>13</volume>
          (
          <year>2022</year>
          )
          <fpage>426</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>F.</given-names>
            <surname>Rangel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Giachanou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. H. H.</given-names>
            <surname>Ghanem</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <article-title>Overview of the 8th author profiling task at PAN 2020: Profiling fake news spreaders on Twitter</article-title>
          ,
          <source>in: CEUR Workshop Proceedings</source>
          , volume
          <volume>2696</volume>
          ,
          Sun SITE Central Europe,
          <year>2020</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>18</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J.</given-names>
            <surname>Pizarro</surname>
          </string-name>
          ,
          <article-title>Using n-grams to detect fake news spreaders on Twitter</article-title>
          ,
          <source>in: CLEF</source>
          ,
          <year>2020</year>
          , p.
          <fpage>1</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>J.</given-names>
            <surname>Buda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Bolonyai</surname>
          </string-name>
          ,
          <article-title>An ensemble model using n-grams and statistical features to identify fake news spreaders on Twitter</article-title>
          ,
          <source>in: CLEF</source>
          ,
          <year>2020</year>
          , p.
          <fpage>1</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>D.</given-names>
            <surname>Croce</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Garlisi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Siino</surname>
          </string-name>
          ,
          <article-title>An SVM ensemble approach to detect irony and stereotype spreaders on Twitter</article-title>
          ,
          <source>in: CEUR Workshop Proceedings</source>
          , volume
          <volume>3180</volume>
          , CEUR,
          <year>2022</year>
          , pp.
          <fpage>2426</fpage>
          -
          <lpage>2432</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>M.</given-names>
            <surname>Siino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Tinnirello</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>La Cascia</surname>
          </string-name>
          ,
          <article-title>T100: A modern classic ensemble to profile irony and stereotype spreaders</article-title>
          ,
          <source>in: CEUR Workshop Proceedings</source>
          , volume
          <volume>3180</volume>
          , CEUR,
          <year>2022</year>
          , pp.
          <fpage>2666</fpage>
          -
          <lpage>2674</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>F.</given-names>
            <surname>Lomonaco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Donabauer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Siino</surname>
          </string-name>
          ,
          <article-title>COURAGE at CheckThat! 2022: Harmful tweet detection using graph neural networks and ELECTRA</article-title>
          ,
          <source>in: Working Notes of CLEF 2022-Conference</source>
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>S.</given-names>
            <surname>Hashida</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Tamura</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Sakai</surname>
          </string-name>
          ,
          <article-title>Classifying tweets using convolutional neural networks with multi-channel distributed representation</article-title>
          ,
          <source>IAENG International Journal of Computer Science</source>
          <volume>46</volume>
          (
          <year>2019</year>
          )
          <fpage>68</fpage>
          -
          <lpage>75</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>K.</given-names>
            <surname>Clark</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-T.</given-names>
            <surname>Luong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q. V.</given-names>
            <surname>Le</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Manning</surname>
          </string-name>
          ,
          <article-title>ELECTRA: Pre-training text encoders as discriminators rather than generators</article-title>
          , arXiv preprint arXiv:2003.10555 (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>S.</given-names>
            <surname>Mangione</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Siino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Garbo</surname>
          </string-name>
          ,
          <article-title>Improving irony and stereotype spreaders detection using data augmentation and convolutional neural network</article-title>
          ,
          <source>in: CEUR Workshop Proceedings</source>
          , volume
          <volume>3180</volume>
          , CEUR,
          <year>2022</year>
          , pp.
          <fpage>2585</fpage>
          -
          <lpage>2593</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>D. R.</given-names>
            <surname>Beddiar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Jahan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Oussalah</surname>
          </string-name>
          ,
          <article-title>Data expansion using back translation and paraphrasing for hate speech detection</article-title>
          ,
          <source>Online Social Networks and Media</source>
          <volume>24</volume>
          (
          <year>2021</year>
          )
          <fpage>100153</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>M.</given-names>
            <surname>Naseer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Asvial</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. F.</given-names>
            <surname>Sari</surname>
          </string-name>
          ,
          <article-title>An empirical comparison of BERT, RoBERTa, and ELECTRA for fact verification</article-title>
          ,
          <source>in: 2021 International Conference on Artificial Intelligence in Information and Communication (ICAIIC)</source>
          , IEEE,
          <year>2021</year>
          , pp.
          <fpage>241</fpage>
          -
          <lpage>246</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>