<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>CEUR Workshop Proceedings</journal-title>
      </journal-title-group>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>News Detection Exploiting Ensemble Learning Techniques</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Farwa Batool</string-name>
          <email>farwa.batool@imtlucca.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giuseppe Lo Re</string-name>
          <email>giuseppe.lore@unipa.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marco Morana</string-name>
          <email>marco.morana@unipa.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mario Tortorici</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Scuola IMT Alti Studi Lucca</institution>
          ,
          <addr-line>Lucca</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Università degli Studi di Palermo</institution>
          ,
          <addr-line>Palermo</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <volume>0</volume>
      <fpage>3</fpage>
      <lpage>8</lpage>
      <abstract>
        <p>Traditional fake news detection methods rely on machine learning models such as Decision Trees, Random Forests, and SVMs, using features such as word count and term frequency. These methods struggle to capture the nuanced features of fake news, especially for evolving online content. To overcome these limitations, this paper presents a Multi-View Ensemble classifier that takes domain knowledge into account during classification, a critical feature for fake news detection. The multi-view approach allows the classifiers to identify patterns in different aspects of a news item, which might be missed by traditional methods. The proposed ensemble method uses a weighted voting strategy to combine the results of multiple classifiers. The introduction of domain knowledge allows the classifiers to generalize better across the rapidly evolving domains of fake news. The weighted voting strategy proved to be much more efficient than other voting approaches, and the achieved results surpassed those of a reference state-of-the-art model.</p>
      </abstract>
      <kwd-group>
        <kwd>Domain knowledge</kwd>
        <kwd>fake news detection</kwd>
        <kwd>ensemble model</kwd>
        <kwd>machine learning</kwd>
        <kwd>scalability</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Over the past decade, digital transformation has radically changed the information landscape, giving rise
to an era of instant access to news and updates through online platforms. While this shift has brought
significant advantages, it has also amplified the dissemination of fake news. Social media in particular has
made it quick and easy to share information, often without proper source verification. Fake
news is designed to appear authentic and is spread for various reasons. According to [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], the
possible motivations include harming individuals, organizations, or other entities; manipulating public
opinion on certain topics; or simply entertainment. Compounding the problem, digital
platforms are unable to control and counter fake news effectively, which poses a significant challenge. The issue
raises a critical question: how can real news be distinguished from fake news effectively and in a timely
manner? Traditional manual verification techniques, although accurate, do not provide the
scalability required to deal with the volume of digital information. As a result, the scientific community
has shifted its focus to automating the news verification process using Artificial Intelligence (AI) and
Machine Learning (ML) models [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. These technologies enable the analysis of large datasets in real
time [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], making automated fake news detection one of the most relevant areas of research.
      </p>
      <p>Previous studies often rely on a single view for classification, overlooking the intricate patterns that
fake news contains. A news item may be more distinctive in certain aspects, such as semantics and
emotions, than in others, such as writing style. Considering only one aspect for all news items
can lead to ineffective model performance, reducing the model’s ability to generalize and accurately
detect fake news. Another gap in the literature is the effective consideration of the domain
to which a news item belongs, whose importance in the classification process is often neglected.
Fake news may contain domain-specific information, which requires an understanding of the context
to be detected as fake. To address these issues, this work introduces an advanced machine learning
model based on ensemble techniques, capable of accurately recognizing fake news and competing with
state-of-the-art models. More precisely, the contributions of this work are the following:</p>
      <list list-type="bullet">
        <list-item><p>To break down each news item into three types of features (semantic, emotional and stylistic),
capturing a comprehensive view of the content.</p></list-item>
        <list-item><p>To develop an ensemble model that integrates well-established fake news detection models,
aiming to maximize classification accuracy.</p></list-item>
        <list-item><p>To integrate domain knowledge into the classification process, emphasizing the most relevant
features in the context of fake news, to obtain more accurate classification results.</p></list-item>
      </list>
      <p>The remainder of this paper is organized as follows: related work is described in Section 2. The
detailed methodology of this work is explained in Section 3. Experimental settings and results are
discussed in Section 4. Conclusions are given in Section 5.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        Extensive research has concentrated on the analysis and detection of fake news [
        <xref ref-type="bibr" rid="ref4 ref5 ref6">4, 5, 6</xref>
        ] based on the
examination of various features, generally classified into social context-based [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], user-based [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], or
content-based [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] characteristics. The first set of features primarily focuses on the propagation dynamics
of news within social networks, including the speed and patterns of its spread. User-based features
involve assessing the credibility of the individuals sharing or posting news. Content-based features,
on the other hand, emphasize attributes of news articles, including their text, linguistic style, and
source credibility.
      </p>
      <p>To deal with this last set, researchers have proposed various techniques, including knowledge
graphs [10], n-grams with Term Frequency-Inverse Document Frequency (TF-IDF) and a Gradient
Boosting classifier [11], and transformers [12]. The most significant progress in effectively extracting
semantic context from texts has been made by using models pre-trained on large corpora, such
as BERT [13] and RoBERTa [14]. Zhang et al. [15] analyzed how influential
the emotions extracted from comments are and how they relate to the emotions extracted
from the news text. The authors define two types of emotions: those conveyed by the person writing
the news (i.e., publisher emotion) and those that the news arouses in users (i.e., social emotion). Dual
emotion, given by the union of publisher and social emotion, is then integrated into existing models for
fake news detection (e.g., BERT), showing significantly superior performance.</p>
      <p>Other works have highlighted the importance of integrating domain knowledge into fake news detection,
as understanding the typical phrasing, tone, and factual structure of a specific domain can significantly
improve the detection of false claims in that area. However, fake news is generally associated with
multiple domains, such as politics, health, entertainment, and education, making it essential to consider
cross-domain features to improve model performance [16]. While substantial research has been
conducted on single-domain fake news detection [17, 18, 19], these models perform poorly
on unseen or new domains. This limitation arises because the models are trained on domain-specific
features and struggle to generalize to diverse domains.</p>
      <p>To address this problem, Qi et al. [20] presented a Multi-domain Visual Neural Network (MVNN)
that uses the visual content of fake news to classify images as fake or real based on the frequency
domain and the pixel domain. Wang et al. [21] proposed soft-label multi-domain fake news
detection (SLFEND), using two Chinese text-based datasets to extract multi-domain features
and an MLP for the classification task. Their experimental results show a significant improvement
on the Weibo dataset. The work in [16] introduced a novel framework that preserves both domain-specific
and cross-domain knowledge using independent embedding spaces. Additionally, an unsupervised
instance selection technique is proposed to optimize the selection of news records for manual labeling.
The results show that the proposed approach improves detection accuracy on cross-domain datasets,
achieving state-of-the-art performance, especially in handling rarely seen domains.</p>
      <p>[Figure 1: Feature extraction overview. Level 0 features: semantic (RoBERTa tokenizer with BPE, producing [CLS], token1, token2, ...), emotional (emotional lexicon from NRC-EL, emotional intensity from NRC-EIL, emotional score from VADER, auxiliary features from Wikipedia emoticons) and stylistic (readability, credibility, attractiveness, sensitivity). Level 1 feature extractors: SemExtractor (embedding layer, convolutional layer, max pooling layer), EmoExtractor and StyExtractor (hidden layers).]</p>
      <p>Nan et al. [22] exploited the power of BERT to generate embeddings from news texts that belong
to different domains. Building on [22] and [15], Zhu et al. proposed an integrative framework called
M3FEND for automatic multi-domain fake news detection [23]. The framework targets two main challenges: domain shift
and the incompleteness of domain labels. The former refers to the fact that fake news can evolve rapidly, resulting in
a change in data distribution, so the generalization capacity of models must be improved.
The latter refers to the fact that a news item is labeled as belonging to only one domain even though it may deal with
topics from multiple fields; for example, a news item from the world of politics could also concern
the field of health care. The model addressed these challenges by applying a multi-view, multi-domain
approach. Three types of features (semantics, emotions and styles) were extracted by a Multi-channel
Multi-view Extractor, which extracts information from different representations of the
news. To enrich the information coming from the domain, a component called Domain Memory Bank
was used to collect and store the relevant characteristics of each domain. A Domain
Adapter aggregated these representations and modeled the discrepancy between domains.</p>
      <p>Although these models perform well, domain alignment and domain
label assignment increase their cost [24]. Additionally, a news item can belong to more than one
domain, but can have limited relevance to other news items [25] in that domain. In this case, forcing
the models to learn and annotate the domain labels causes a domain bias. To address these problems,
we propose a simple ensemble model whose voting strategy assigns weights based on the
domain of each news item. Ensemble methods are preferred because individual classifiers are prone to
risks such as variance, bias or over-fitting. Ensembles, by contrast, have consistently outperformed
individual classifiers in various applications such as sentiment analysis [26], anomaly detection [27],
and intrusion detection [28, 29].</p>
    </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <sec id="sec-3-1">
        <title>3.1. Multi-Domain Feature Extraction</title>
        <p>Similar to [23], the feature extraction method operates at two levels. First, three kinds of basic features,
i.e., semantic, emotional, and stylistic, are extracted from the text. These are then propagated into
three distinct deep extractors to provide higher-level representations. The whole process is
summarized in Figure 1, where the semantic, emotional, and stylistic feature extractions are represented
by blue, orange, and green modules, respectively.</p>
        <p><bold>3.1.1. Level 0 Features</bold></p>
        <p>For the semantic view, the pre-trained RoBERTa (Robustly Optimized BERT Pretraining Approach)
tokenizer is employed, similar to the approach used in [22]. The content of each news item is tokenized
using Byte-Pair Encoding (BPE), which keeps common words intact and breaks rare or
unseen words down into sub-units. Each generated token is mapped to an entry in a dictionary, which also contains special tokens
such as [CLS] for the start of the sequence, [SEP] for the separation between sentences and [PAD] for
padding to ensure a uniform sequence length. A maximum length of 300 is set, reflecting the character
length constraint typical of social media platforms such as X. Additionally, an attention mask is
generated to distinguish actual tokens from padding: the mask is set to 1 for real tokens and 0
for padding tokens.</p>
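        <p>The padding and attention-mask step described above can be sketched as follows. This is only an illustration and not the RoBERTa tokenizer itself: the token ids, the special-token ids (CLS_ID, SEP_ID, PAD_ID) and the helper name encode_fixed_length are hypothetical and chosen for the example; only the maximum length of 300 and the 1/0 mask convention come from the text.</p>
        <preformat><![CDATA[
```python
from typing import List, Tuple

# Hypothetical special-token ids, chosen for illustration only; a real
# tokenizer defines its own values.
CLS_ID, SEP_ID, PAD_ID = 0, 2, 1
MAX_LEN = 300  # maximum sequence length stated in the text


def encode_fixed_length(token_ids: List[int], max_len: int = MAX_LEN) -> Tuple[List[int], List[int]]:
    """Wrap ids with start/separator tokens, truncate, pad, and build the mask."""
    # Leave room for the two special tokens before truncating.
    body = token_ids[: max_len - 2]
    ids = [CLS_ID] + body + [SEP_ID]
    n_real = len(ids)
    n_pad = max_len - n_real
    input_ids = ids + [PAD_ID] * n_pad
    # 1 marks a real token, 0 marks padding.
    attention_mask = [1] * n_real + [0] * n_pad
    return input_ids, attention_mask


if __name__ == "__main__":
    ids, mask = encode_fixed_length([713, 16, 10, 1296])  # made-up token ids
    print(len(ids), sum(mask))
```
]]></preformat>
        <p>In this sketch every encoded item has the same length, and the number of ones in the mask equals the number of real tokens, including the two special tokens.</p>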
        <p>The emotional view captures the feelings of both the author and the reader towards the topic; integrating
this information therefore introduces useful patterns for the detection of fake news. Here,
38 emotional features are extracted and grouped into four categories:</p>
        <list list-type="bullet">
          <list-item><p>Emotional Lexicon: This set captures emotions conveyed through specific
words. The NRC<sup>1</sup> Emotion Lexicon (NRC-EL) is used, which associates words with eight emotions
(anger, fear, anticipation, trust, surprise, sadness, joy and disgust) and two sentiments (negative and
positive). For each word in the lexicon, a flag (with value 0 or 1) denotes its association with a
given emotion.</p></list-item>
          <list-item><p>Emotional Intensity: Each word in this set is given a score (between 0 and 1) depending on the
strength of the emotion it conveys. The NRC Emotion Intensity Lexicon (NRC-EIL) is used in
this step.</p></list-item>
          <list-item><p>Emotional Score: This set measures the emotional impact of the text with numerical
values computed by the Valence Aware Dictionary and sEntiment Reasoner (VADER) package of the NLTK
library.</p></list-item>
          <list-item><p>Auxiliary Features: The aim is to capture the characteristics of non-verbal elements, such as
emoticons, punctuation and capital letters. Emoticons from Wikipedia<sup>2</sup> are used in
this step.</p></list-item>
        </list>
        <p>The stylistic view captures the fact that fake news authors have distinct linguistic patterns, i.e.,
they adopt a particular writing style to convince readers of authenticity and achieve high
engagement. Analyzing these patterns can therefore help in fake news detection [18]. Following
[30], a total of 18 stylistic features are extracted and
grouped into four categories: readability, credibility, attractiveness, and sensitivity of a text.</p>
        <p><sup>1</sup> https://github.com/RMSnow/WWW2021/tree/master/resources/English/NRC</p>
        <p><sup>2</sup> https://en.wikipedia.org/wiki/List_of_emoticons</p>
        <p><bold>3.1.2. Level 1 Features</bold></p>
        <p>The Level 0 features extracted so far are propagated into deep extractors, called SemExtractor, EmoExtractor
and StyExtractor. The SemExtractor module employs a TextCNN model, which receives the Level 0
features as inputs and uses the pre-trained RoBERTa model to generate word embeddings. Convolutional
filters are then applied, followed by a max pooling operation to produce deep semantic features. The
resulting output is a feature vector <inline-formula><tex-math>r_{sem}</tex-math></inline-formula> with a dimension of 320 (64 feature maps × 5 filters). The
EmoExtractor, designed for deep emotional representations, uses a Multilayer Perceptron (MLP)
consisting of three layers. The input to the network is the Level 0 emotional features. The first hidden layer
expands the initial 38-dimensional vector into a 256-dimensional representation, using the ReLU activation
function for nonlinearity. The second hidden layer then further expands the feature vector to a
320-dimensional representation. The output is a feature vector denoted as <inline-formula><tex-math>r_{emo}</tex-math></inline-formula>. Similar to the EmoExtractor,
the StyExtractor also consists of an MLP architecture with three layers. The difference is that the
initial stylistic feature vector is 18-dimensional; its output is denoted as <inline-formula><tex-math>r_{sty}</tex-math></inline-formula>. All the resulting feature vectors are 320-dimensional
representations of the same news item from different points of view.</p>
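        <p>To make the layer sizes of the two MLP extractors concrete, the following sketch runs a forward pass with the dimensions given above (38 or 18 inputs, 256 hidden units, 320 outputs). It is an illustration under several assumptions: the weights are random and untrained, biases are omitted, no activation is applied after the second layer (the text does not specify one), and all function names are hypothetical.</p>
        <preformat><![CDATA[
```python
import random
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def init_layer(n_in: int, n_out: int, rng: random.Random) -> Matrix:
    """Random weights for a dense layer (n_out rows, n_in columns); bias omitted."""
    scale = 1.0 / n_in ** 0.5
    return [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]


def dense(weights: Matrix, x: Vector) -> Vector:
    """Matrix-vector product: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relu(x: Vector) -> Vector:
    """Element-wise ReLU."""
    return [max(0.0, v) for v in x]


def extractor_forward(x: Vector, w1: Matrix, w2: Matrix) -> Vector:
    """input -> 256 units (ReLU) -> 320 units, mirroring the sizes in the text."""
    hidden = relu(dense(w1, x))
    return dense(w2, hidden)


rng = random.Random(2024)
emo_w1, emo_w2 = init_layer(38, 256, rng), init_layer(256, 320, rng)
sty_w1, sty_w2 = init_layer(18, 256, rng), init_layer(256, 320, rng)

# Random stand-ins for the Level 0 emotional (38) and stylistic (18) features.
r_emo = extractor_forward([rng.random() for _ in range(38)], emo_w1, emo_w2)
r_sty = extractor_forward([rng.random() for _ in range(18)], sty_w1, sty_w2)
print(len(r_emo), len(r_sty))  # both views end with the same dimensionality
```
]]></preformat>
        <p>Projecting every view to the same output size is what lets the downstream classifiers treat the three representations uniformly.</p>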
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Multi-view Ensemble Classifier (MEC)</title>
        <p>The proposed model employs a multi-view approach to detect fake news by integrating the three types
of deep features (<inline-formula><tex-math>r_{sem}</tex-math></inline-formula>, <inline-formula><tex-math>r_{emo}</tex-math></inline-formula>, <inline-formula><tex-math>r_{sty}</tex-math></inline-formula>). The goal is to analyze each news item from different perspectives
to obtain a more accurate classification through an ensemble-based system of classifiers. For each
feature group, multiple machine learning models, including Decision Tree, Random Forest, Support
Vector Machine (SVM) and Multilayer Perceptron (MLP), are used as the base learners within
the ensemble. Within each feature group, soft voting then aggregates the probabilities produced by the base
learners, yielding one ensemble per view. Lastly, to combine the predictions of the per-view ensembles, three voting strategies have
been evaluated, namely hard, soft, and weighted voting.</p>
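        <p>In the case of hard voting, each of the n ensembles provides a discrete prediction <inline-formula><tex-math>\hat{y}_i</tex-math></inline-formula> for each sample
x. The final decision is determined by choosing the class <inline-formula><tex-math>\hat{y}</tex-math></inline-formula> that obtains the majority of predictions.
In soft voting, instead, each of the n ensembles returns a probability <inline-formula><tex-math>p_i(y = c \mid x)</tex-math></inline-formula>
indicating that sample
x belongs to class <inline-formula><tex-math>c \in \{0, 1\}</tex-math></inline-formula>. The final prediction is based on the arithmetic mean of the probabilities
provided by each ensemble (equivalently, on their sum, since the constant factor 1/n does not change the arg max):
<disp-formula><tex-math>\hat{y} = \arg\max_{c} \sum_{i=1}^{n} p_i(y = c \mid x).</tex-math></disp-formula></p>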
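        <p>The difference between hard and soft voting can be illustrated with a short sketch. The probabilities below are made up for the example, and the function names hard_vote and soft_vote are hypothetical; the sketch only follows the definitions given above for three per-view ensembles and two classes.</p>
        <preformat><![CDATA[
```python
from collections import Counter
from typing import Sequence


def hard_vote(predictions: Sequence[int]) -> int:
    """Return the class predicted by the majority of ensembles."""
    return Counter(predictions).most_common(1)[0][0]


def soft_vote(probabilities: Sequence[Sequence[float]]) -> int:
    """Average per-class probabilities over ensembles and take the arg max."""
    n = len(probabilities)
    n_classes = len(probabilities[0])
    mean = [sum(p[c] for p in probabilities) / n for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: mean[c])


# Three per-view ensembles (semantic, emotional, stylistic); made-up numbers.
# Each row is [P(class 0), P(class 1)] for the same news item.
probs = [[0.45, 0.55], [0.40, 0.60], [0.90, 0.10]]
labels = [max(range(2), key=lambda c: p[c]) for p in probs]  # [1, 1, 0]

print(hard_vote(labels))  # 1: two of the three ensembles vote for class 1
print(soft_vote(probs))   # 0: the confident third ensemble dominates the mean
```
]]></preformat>
        <p>The example is chosen so that the two strategies disagree: hard voting counts votes only, whereas soft voting also takes the confidence of each ensemble into account.</p>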
        <p>The weighted voting strategy is similar to soft voting, but weights <inline-formula><tex-math>w_d</tex-math></inline-formula> are assigned based on the
domain d to which a given sample belongs. The weights are determined during the validation phase,
based on the intermediate F1-scores and Accuracy metrics [31].</p>
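        <p>For each domain d, a weight vector <inline-formula><tex-math>w_d = [w_d^{sem}, w_d^{emo}, w_d^{sty}]</tex-math></inline-formula> is created, representing the influence of the
semantic, emotional, and stylistic views on classification within that domain. Each vector is normalized
such that the sum of the weights is equal to 1. The vectors obtained for the individual domains form a
weight matrix W consisting of m rows corresponding to the domains and n columns corresponding to
the views. For a three-domain problem, a 3×3 matrix is generated as follows:
<disp-formula><tex-math><![CDATA[W = \begin{bmatrix} w_0^{sem} & w_0^{emo} & w_0^{sty} \\ w_1^{sem} & w_1^{emo} & w_1^{sty} \\ w_2^{sem} & w_2^{emo} & w_2^{sty} \end{bmatrix}]]></tex-math></disp-formula></p>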
        <p>In the proposed experimental scenario, two separate matrices, one for F1-scores and one for
accuracies, are considered. During the final prediction phase, assuming that the news item to
be classified belongs to the j-th domain, the corresponding weight vector <inline-formula><tex-math>w_j</tex-math></inline-formula> is extracted from W.</p>
        <p>[Figure: Multi-view Ensemble Classifier. For each view, the deep features rsem, remo and rsty feed a group of base classifiers whose outputs are combined by soft voting; the per-view outputs are then combined by the selected voting strategy, using the weights wsem, wemo and wsty in the case of weighted voting, to produce the final output.]</p>
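        <p>The domain-dependent weighting can be sketched as follows. The validation F1-scores, the lowercase domain keys and the per-view probabilities are made up for illustration, and deriving each weight row by directly normalizing the validation scores is an assumption: the text only states that the weights are based on validation metrics and sum to 1 for each domain.</p>
        <preformat><![CDATA[
```python
from typing import Dict, List, Sequence


def normalize(scores: Sequence[float]) -> List[float]:
    """Scale validation scores so that the weights of one domain sum to 1."""
    total = sum(scores)
    return [s / total for s in scores]


def weighted_vote(probabilities: Sequence[Sequence[float]], weights: Sequence[float]) -> int:
    """Weighted sum of per-view class probabilities, followed by arg max."""
    n_classes = len(probabilities[0])
    combined = [
        sum(w * p[c] for w, p in zip(weights, probabilities))
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=lambda c: combined[c])


# Hypothetical validation F1-scores per domain, ordered as (sem, emo, sty).
val_f1: Dict[str, List[float]] = {
    "gossipcop": [0.80, 0.70, 0.50],
    "politifact": [0.90, 0.60, 0.60],
    "covid": [0.50, 0.60, 0.90],
}
# Weight matrix W: one normalized row per domain, one column per view.
W = {domain: normalize(scores) for domain, scores in val_f1.items()}

# Per-view probabilities [P(class 0), P(class 1)] for one news item (made up).
probs = [[0.30, 0.70], [0.45, 0.55], [0.80, 0.20]]
print(weighted_vote(probs, W["politifact"]))  # 1: the semantic view weighs most
print(weighted_vote(probs, W["covid"]))       # 0: the stylistic view weighs most
```
]]></preformat>
        <p>With these illustrative numbers, the same per-view probabilities lead to different decisions depending on which row of W is selected, which is the effect the domain-dependent weights are meant to have.</p>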
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experimental Analysis and Discussion</title>
      <p>To conduct the experiments, a custom dataset was created by merging data from two existing datasets:
FakeNewsNet and MM-COVID. FakeNewsNet<sup>3</sup> is one of the most widely used datasets in research and
includes 23194 news items from two domains: entertainment (GossipCop) and politics (PolitiFact).
MM-COVID<sup>4</sup> is a multilingual and multimodal dataset designed for news related to the COVID-19
pandemic. Its news items have been labeled as true or false by reliable fact-checking sources, such as
Snopes and the International Fact-Checking Network (IFCN). The dataset consists of news items in six
main languages: English, Spanish, Portuguese, French, Hindi and Italian. In particular, there are about
4000 news items in English, equally balanced between real and fake. Note that, since the datasets are old,
most users were no longer available and, due to privacy policies, the complete dataset could not be retrieved
for this research. The comparison with state-of-the-art models is therefore also carried out on the available
data only.</p>
      <p>Given the strong imbalance in the dataset, sub-sampling was performed to ensure a
homogeneous distribution of real and false news. The final dataset thus consists of a total of 14000 news items
belonging to three different domains: GossipCop, PolitiFact and COVID. Each news item is characterized
by its id, content, label and domain. The label distribution is 50% fake news and 50% real news.
In terms of domains, 76% of the samples come from GossipCop, 6.2% from PolitiFact and 17.8% from
the COVID dataset.</p>
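      <table-wrap id="tab-1">
        <label>Table 1</label>
        <caption><p>Configurations of the classifiers.</p></caption>
        <table>
          <thead>
            <tr><th>Classifier</th><th>Hyperparameters</th></tr>
          </thead>
          <tbody>
            <tr><td>Random Forest</td><td>n_estimators=1000, max_depth=20</td></tr>
            <tr><td>SVM</td><td>kernel=poly, C=1, gamma=1</td></tr>
            <tr><td>MLP</td><td></td></tr>
          </tbody>
        </table>
      </table-wrap>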
      <p>The experiments were conducted with a 60-20-20 train-test-validation split, and a grid search
with k-fold cross-validation (k = 5) was employed to optimize the hyperparameters of each classification
model. The optimal hyperparameters identified were then used to train the
base classifiers for each feature group. The configurations of the classifiers are listed in Table 1.
Additionally, a fixed random seed (random_state=2024) was set to ensure experimental reproducibility.</p>
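      <p>A minimal sketch of a seeded 60-20-20 split and of 5-fold index generation is given below. It uses only the Python standard library and is not the authors' pipeline: the helper names are hypothetical, the 14000-item size is used purely as an example, and running the cross-validation folds on the training portion is an assumption, since the text does not say on which part the grid search was run.</p>
      <preformat><![CDATA[
```python
import random
from typing import List, Tuple

SEED = 2024  # fixed seed reported in the text


def split_60_20_20(n_items: int, seed: int = SEED) -> Tuple[List[int], List[int], List[int]]:
    """Shuffle indices once and cut them into 60% / 20% / 20% parts."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_first = n_items * 60 // 100   # integer arithmetic avoids float rounding
    n_second = n_items * 20 // 100
    first = indices[:n_first]
    second = indices[n_first:n_first + n_second]
    third = indices[n_first + n_second:]
    return first, second, third


def k_folds(indices: List[int], k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Return k (fit, held-out) index pairs for cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    pairs = []
    for i, held_out in enumerate(folds):
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        pairs.append((fit, held_out))
    return pairs


train, val, test = split_60_20_20(14000)
print(len(train), len(val), len(test))
print(len(k_folds(train)))
```
]]></preformat>
      <p>Because the shuffle is driven by a fixed seed, repeated runs produce the same partition, which is the property the fixed random seed is meant to guarantee.</p>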
      <p>The first experiment aimed to evaluate the performance of the proposed architecture using a balanced
dataset (50% real and 50% fake) of news uniformly taken from the GossipCop, PolitiFact, and COVID
domains (i.e., 33.33% from each dataset). The objective was to evaluate the performance of the model in
the absence of any bias resulting from a non-uniform distribution of samples. The results obtained by
applying the various voting strategies are reported in Table 2 with the best metrics highlighted in bold.</p>
      <p>As can be observed, the PolitiFact domain was consistently classified with high accuracy across
all voting strategies, most likely because the news belonging to this domain has a more structured
nature. Good results are also obtained for the COVID domain, whilst GossipCop posed a
challenge to the model. The reason is probably that the news belonging to this domain is more
unpredictable, so the model would need to be trained on a larger number of samples in
order to capture these aspects. From an overall performance analysis, the proposed weighted voting
strategies (based on both accuracy and F1-score) emerged as the most balanced and best-performing
ones, achieving the highest average accuracy and F1-score values over all domains. At the same time, the
soft voting strategy also proved to be a good alternative, consistently performing well compared with
hard voting, which is the least flexible of the strategies.</p>
      <p>In the second experiment, the entire available dataset was used, which is balanced with respect to the
classification labels (50% real and 50% fake) and imbalanced with respect to the domain labels (GossipCop
76%, COVID 17.8% and PolitiFact 6.2%).</p>
      <p>The results, reported in Table 3, are significantly better for GossipCop, with a slight reduction in
accuracy for the other two domains. This is expected, given both the significant
increase in the number of samples and the unbalanced distribution of the domains. Notably, the
increased accuracy in the GossipCop domain supports the hypothesis from experiment 1: with a
larger number of samples, the model is able to capture the complex patterns, thereby
improving classification performance.</p>
      <p>The third experiment presented a more realistic scenario, in which real news is more
abundant than false news, i.e., 59% vs 41%. The results, presented in Table 4, show that the
weighted voting strategies again outperform the others across multiple datasets. Here, the hard and soft voting
strategies provided competitive performance but were unable to capture nuanced patterns, such
as those in the GossipCop domain, due to its low number of samples. While earlier studies
[32] suggested that weight estimation might be challenging, the proposed approach addresses this
by achieving competitive or better F1-scores and accuracies, demonstrating that weighted voting
performs better than simple voting. This is because different domains exhibit distinct properties
in terms of semantic, emotional and stylistic features. Overall, the analysis suggests that weighted
ensemble methods, especially those based on F1-scores, are well suited for multi-domain fake news
detection, where cross-domain features must be effectively accounted for.</p>
      <p>Finally, to demonstrate the effectiveness of the proposed architecture compared with the
state of the art, Table 5 compares the overall performance of MEC with that of the M3FEND model. It is worth noting
that M3FEND has been tested on the same dataset and features used to evaluate MEC; thus, its results are
slightly different from those reported in [23]. As can be observed, MEC achieves better performance
on all four metrics, with a few exceptions where the values are almost comparable.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>In this study, machine learning techniques were evaluated for their effectiveness in detecting fake news,
leveraging state-of-the-art methodologies. A novel architecture based on ensemble machine learning
techniques was proposed for multi-view fake news recognition. The model analyzes each news item
from three perspectives: semantics, emotions, and writing styles. These perspectives are used to extract
the basic features, which are further processed by deep feature extractors to generate more complex
features. These comprehensive feature vectors provide a multi-faceted representation of news items,
enabling the model to incorporate domain-specific knowledge effectively. The aim of the ensemble model
is to mitigate the bias and errors inherent in the individual classifiers. To this end, the ensemble model
aggregates the predictions from multiple classifiers, providing the results with the highest probability. This
is followed by an advanced voting strategy that integrates domain knowledge. This approach not only provides
robust predictions but also ensures the model’s ability to generalize to diverse domains.</p>
      <p>The results demonstrated the model’s efficacy in various dataset configurations, outperforming
comparable approaches in the literature and setting a new benchmark for accuracy and performance in
fake news detection. The domain-dependent weighted F1 voting strategy showed particular promise
for real-world applications.</p>
      <p>Future research directions include integrating a domain prediction component into the model to
automate the domain identification process, enhancing accuracy while maintaining scalability.
Additionally, expanding datasets to include a wider range of domains and incorporating user comments from
social media platforms could provide richer insights and uncover latent patterns, further advancing the
model’s capabilities.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This work was partially supported by the AMELIS project, within the project FAIR (PE0000013), and by
the ADELE project, within the project SERICS (PE00000014), both under the MUR National Recovery
and Resilience Plan funded by the European Union - NextGenerationEU.</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>The author(s) have not employed any Generative AI tools.</p>
      <p>
[10] J. Z. Pan, S. Pavlova, C. Li, N. Li, Y. Li, J. Liu, Content based fake news detection using knowledge
graphs, in: The Semantic Web–ISWC 2018: 17th International Semantic Web Conference, Monterey,
CA, USA, October 8–12, 2018, Proceedings, Part I 17, Springer, 2018, pp. 669–683.
[11] H. E. Wynne, Z. Z. Wint, Content based fake news detection using n-gram models, in: Proceedings
of the 21st international conference on information integration and web-based applications &amp;
services, 2019, pp. 669–673.
[12] S. Raza, C. Ding, Fake news detection based on news content and social contexts: a
transformer-based approach, International Journal of Data Science and Analytics 13 (2022) 335–362.
[13] J. Devlin, Bert: Pre-training of deep bidirectional transformers for language understanding, arXiv
preprint arXiv:1810.04805 (2018).
[14] Y. Liu, Roberta: A robustly optimized bert pretraining approach, arXiv preprint arXiv:1907.11692
(2019).
[15] X. Zhang, J. Cao, X. Li, Q. Sheng, L. Zhong, K. Shu, Mining dual emotion for fake news detection,
in: Proceedings of the web conference 2021, 2021, pp. 3465–3476.
[16] A. Silva, L. Luo, S. Karunasekera, C. Leckie, Embracing domain differences in fake news:
Cross-domain fake news detection using multi-modal data, in: Proceedings of the AAAI conference on
artificial intelligence, volume 35, 2021, pp. 557–565.
[17] Q. Zhang, Z. Guo, Y. Zhu, P. Vijayakumar, A. Castiglione, B. B. Gupta, A deep learning-based fast
fake news detection model for cyber-physical social services, Pattern Recognition Letters 168
(2023) 31–38.
[18] D. K. Vishwakarma, P. Meel, A. Yadav, K. Singh, A framework of fake news detection on web
platform using convnet, Social Network Analysis and Mining 13 (2023) 24.
[19] Y. Wang, F. Ma, Z. Jin, Y. Yuan, G. Xun, K. Jha, L. Su, J. Gao, Eann: Event adversarial neural
networks for multi-modal fake news detection, in: Proceedings of the 24th acm sigkdd international
conference on knowledge discovery &amp; data mining, 2018, pp. 849–857.
[20] P. Qi, J. Cao, T. Yang, J. Guo, J. Li, Exploiting multi-domain visual information for fake news
detection, in: 2019 IEEE international conference on data mining (ICDM), IEEE, 2019, pp. 518–527.
[21] D. Wang, W. Zhang, W. Wu, X. Guo, Soft-label for multi-domain fake news detection, IEEE Access
11 (2023) 98596–98606. doi:10.1109/ACCESS.2023.3313602.
[22] Q. Nan, J. Cao, Y. Zhu, Y. Wang, J. Li, Mdfend: Multi-domain fake news detection, in: Proceedings
of the 30th ACM International Conference on Information &amp; Knowledge Management, 2021, pp.
3343–3347.
[23] Y. Zhu, Q. Sheng, J. Cao, Q. Nan, K. Shu, M. Wu, J. Wang, F. Zhuang, Memory-guided multi-view
multi-domain fake news detection, IEEE Transactions on Knowledge and Data Engineering 35
(2022) 7178–7191.
[24] H. Liu, W. Wang, H. Li, H. Li, Teller: A trustworthy framework for explainable, generalizable and
controllable fake news detection, arXiv preprint arXiv:2402.07776 (2024).
[25] J. Li, X. Feng, T. Gu, L. Chang, Dual-teacher de-biasing distillation framework for multi-domain
fake news detection, in: 2024 IEEE 40th International Conference on Data Engineering (ICDE),
IEEE, 2024, pp. 3627–3639.
[26] G. Wang, J. Sun, J. Ma, K. Xu, J. Gu, Sentiment classification: The contribution of ensemble
learning, Decision support systems 57 (2014) 77–93.
[27] L. Bilge, D. Balzarotti, W. Robertson, E. Kirda, C. Kruegel, Disclosure: detecting botnet command
and control servers through large-scale netflow analysis, in: Proceedings of the 28th Annual
Computer Security Applications Conference, 2012, pp. 129–138.
[28] V. Agate, F. Concone, A. De Paola, P. Ferraro, S. Gaglio, G. Lo Re, M. Morana, Adaptive ensemble
learning for intrusion detection systems, in: CEUR Workshop Proceedings, volume 3762, CEUR-WS,
2024, pp. 118–123.
[29] V. Agate, F. M. D'Anna, A. De Paola, P. Ferraro, G. Lo Re, M. Morana, A behavior-based intrusion
detection system using ensemble learning techniques, in: ITASEC, 2022, pp. 207–218.
[30] Y. Yang, J. Cao, M. Lu, J. Li, C.-W. Lin, How to write high-quality news on social network?
predicting news quality by mining writing style, arXiv preprint arXiv:1902.00750 (2019).
[31] R. Wardoyo, A. Musdholifah, G. A. Pradipta, I. N. H. Sanjaya, Weighted majority voting by statistical
performance analysis on ensemble multiclassifier, in: 2020 Fifth International Conference on
Informatics and Computing (ICIC), IEEE, 2020, pp. 1–8.
[32] G. Fumera, F. Roli, A theoretical and experimental analysis of linear combiners for multiple
classifier systems, IEEE transactions on pattern analysis and machine intelligence 27 (2005)
942–956.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Zannettou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sirivianos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Blackburn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Kourtellis</surname>
          </string-name>
          ,
          <article-title>The web of false information: Rumors, fake news, hoaxes, clickbait, and various other shenanigans</article-title>
          ,
          <source>Journal of Data and Information Quality (JDIQ)</source>
          <volume>11</volume>
          (
          <year>2019</year>
          )
          <fpage>1</fpage>
          -
          <lpage>37</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>F.</given-names>
            <surname>Batool</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Canino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Concone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Lo Re</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Morana</surname>
          </string-name>
          ,
          <article-title>A black-box adversarial attack on fake news detection systems</article-title>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>F.</given-names>
            <surname>Concone</surname>
          </string-name>
          ,
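          <string-name>
            <given-names>A.</given-names>
            <surname>De Paola</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Lo Re</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Morana</surname>
          </string-name>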
          ,
          <article-title>Twitter analysis for real-time malware discovery</article-title>
          , in: 2017 AEIT International Annual Conference, IEEE,
          <year>2017</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>P. K.</given-names>
            <surname>Verma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Agrawal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Amorim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Prodan</surname>
          </string-name>
          ,
          <article-title>Welfake: word embedding over linguistic features for fake news detection</article-title>
          ,
          <source>IEEE Transactions on Computational Social Systems</source>
          <volume>8</volume>
          (
          <year>2021</year>
          )
          <fpage>881</fpage>
          -
          <lpage>893</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Q.</given-names>
            <surname>Liao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Chai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Han</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Xia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Ding</surname>
          </string-name>
          ,
          <article-title>An integrated multi-task model for fake news detection</article-title>
          ,
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          <volume>34</volume>
          (
          <year>2021</year>
          )
          <fpage>5154</fpage>
          -
          <lpage>5165</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>G.</given-names>
            <surname>Gravanis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vakali</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Diamantaras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Karadais</surname>
          </string-name>
          ,
          <article-title>Behind the cues: A benchmarking study for fake news detection</article-title>
          ,
          <source>Expert Systems with Applications</source>
          <volume>128</volume>
          (
          <year>2019</year>
          )
          <fpage>201</fpage>
          -
          <lpage>213</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>K.</given-names>
            <surname>Shu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <article-title>Beyond news contents: The role of social context for fake news detection</article-title>
          ,
          in:
          <source>Proceedings of the twelfth ACM international conference on web search and data mining</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>312</fpage>
          -
          <lpage>320</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>K.</given-names>
            <surname>Shu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Zafarani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <article-title>The role of user profiles for fake news detection</article-title>
          ,
          in:
          <source>Proceedings of the 2019 IEEE/ACM international conference on advances in social networks analysis and mining</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>436</fpage>
          -
          <lpage>439</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M. R.</given-names>
            <surname>Kondamudi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. R.</given-names>
            <surname>Sahoo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Chouhan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Yadav</surname>
          </string-name>
          ,
          <article-title>A comprehensive survey of fake news in social networks: Attributes, features, and detection approaches</article-title>
          ,
          <source>Journal of King Saud University - Computer and Information Sciences</source>
          <volume>35</volume>
          (
          <year>2023</year>
          )
          <fpage>101571</fpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>