<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
<journal-title>Bologna, Italy</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Awakened at CheckThat! 2022: Fake News Detection using BiLSTM and Sentence Transformer</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ciprian-Octavian Truică</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Elena-Simona Apostol</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Adrian Paschke</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Computer Science and Engineering Department, Faculty of Automatic Control and Computers, University Politehnica of Bucharest</institution>
          ,
          <addr-line>Splaiul Independent</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Information Technology, Uppsala University</institution>
          ,
          <addr-line>Lägerhyddsvägen 1, Uppsala, 75105</addr-line>
          ,
          <country country="SE">Sweden</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Fraunhofer Institute for Open Communication Systems</institution>
          ,
          <addr-line>Berlin, 10589</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <abstract>
<p>In recent years, online social networks and online news venues have become some of the main mediums for spreading news and event-related information. Although using these mediums has increased the speed of accessing information, it has also created a new phenomenon used for propaganda and disinformation: fake news. As fake news has detrimental consequences to society, new technologies need to be developed in order to stop its harmful effects. In this paper, we propose two Bidirectional Long Short-Term Memory (BiLSTM) architectures with sentence transformers to solve two tasks: (1) a multi-class mono-lingual fake news detection task (i.e., mono-lingual task); and (2) a multi-class cross-lingual fake news detection task (i.e., cross-lingual task). For the mono-lingual task, we train and test a BiLSTM model with BART sentence transformers on an English dataset and obtain an accuracy of ∼ 0.53 and an F1-Score of ∼ 0.32. For the cross-lingual task, we train a BiLSTM model with XLM sentence transformers on an English dataset and test the model using transfer learning on a German dataset. For this task, we obtain an accuracy of ∼ 0.28 and an F1-Score of ∼ 0.19.</p>
      </abstract>
      <kwd-group>
<kwd>Fake News Detection</kwd>
        <kwd>Neural Networks</kwd>
        <kwd>Sentence Transformers</kwd>
        <kwd>Transfer Learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        With the digital age, new mass media paradigms for information distribution have been adopted
by the general public. The current paradigms have shifted from the rigorous journalistic standards
imposed by editors to personalized social media where anyone can spread event-related news. This
new approach aggravates the risk of fake news [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
], which has detrimental consequences
to society by facilitating the spread of misinformation in the form of fake news, propaganda,
conspiracy theories, political bias, etc. The practice of spreading misinformation online by
malicious actors needs to be tackled from different points of view, i.e., from a journalistic and
fact-checking perspective to a more technology-based one.
      </p>
      <p>In this paper, we address the problem of detecting fake news from a technological perspective
by using the CheckThat! 2022: Fake News Detection Challenge datasets. To tackle the problem,
we propose two neural networks with sentence transformer models for (1) multi-class
mono-lingual fake news detection; and (2) multi-class cross-lingual fake news detection. We use
transfer learning in order to train a model on an English dataset and test it on German text.</p>
      <p>In this work, we aim to answer the following two research questions:
(1) Does a simple neural network with sentence transformers offer good results for multi-class
mono-lingual fake news detection?
(2) Can cross-lingual sentence transformers be used through transfer learning for multi-class
cross-lingual fake news detection?</p>
      <p>To answer question (1), we propose the use of a Bidirectional Long Short-Term Memory
(BiLSTM) neural network with BART sentence transformers. To answer question (2),
we train a BiLSTM neural network with XLM sentence transformers on English textual data
and use transfer learning to solve a cross-lingual fake news detection task by testing the model
on German textual data.</p>
      <p>This paper is structured as follows. In Section 2, we discuss some of the current literature on
fake news detection. In Section 3, we present our approach for mono-lingual and cross-lingual
fake news detection. In Section 4, we present the datasets, experimental setup, and results.
Finally, in Section 5, we summarize our findings and hint at future work.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        The task of fake news detection has been examined from various perspectives and using various
models from traditional Machine Learning to more elaborate yet more powerful Neural Network
based models. Extensive work in the field of fake news detection has led to many solutions
focusing on either model or data-driven approaches. Among the traditional Machine Learning
models (e.g., Support Vector Machine, Logistic Regression, Decision Trees, AdaBoost, Naïve
Bayes), the model that performs very well in many cases is Multinomial Naïve Bayes [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>
        Several current solutions use complex Deep Neural Network architectures for this task. Many
solutions for multi-class mono-lingual fake news detection show promising results when using
Convolutional Neural Network (CNN) based architectures. FNDNet [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] is such an architecture
that obtains good results in comparison even with recurrent networks, i.e., LSTM.
OPCNN-FAKE [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] is an optimized CNN based solution that uses a hyperopt optimization technique to
adapt the values of parameters for each component layer in order to achieve high performance.
Other Deep Learning solutions focus on recurrent networks, e.g., (Bi)GRU, (Bi)LSTM,
obtaining the best results when also using attention mechanisms [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. BiLSTM based solutions
(e.g., Samantaray and Kumar [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], Trueman et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]) are very promising as this type of recurrent
network is able to capture both past and future information.
      </p>
      <p>
        In multi-class classification, the employed embedding model is very important. As shown
in Ilie et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
], many Deep Learning models show an increase in accuracy when using custom-trained
word embeddings instead of pre-trained ones. Other models use advanced pre-trained
transformers instead of the more classical word embeddings. Different transformer models can
be applied for fake news detection, e.g., BERT (Bidirectional Encoder Representations from
Transformers) [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], RoBERTa (A Robustly Optimized BERT pre-training Approach) [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], BART
(Bidirectional and Autoregressive Transformer) [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. As such, MisRoBÆRTa [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] is a complex
architecture that combines BART and RoBERTa for a multi-class classification task. Another
solution to the multi-class classification problem is proposed by Liu et al. [
        <xref ref-type="bibr" rid="ref11">11</xref>
] that offers a
two-stage BERT-based model.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <p>
        In this section, we present the methodology used for fake news detection. For encoding the text,
we used two sentence transformer [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] approaches. (1) For multi-class fake news detection of
news articles in English, we use BART (Bidirectional and Auto-Regressive Transformers) [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]
sentence transformers. (2) For cross-lingual news articles, we use XLM (Cross-Lingual Language
Model) [
        <xref ref-type="bibr" rid="ref13">13</xref>
] sentence transformers. We employ a BiLSTM (Bidirectional Long Short-Term
Memory) network as the classification model.
      </p>
      <sec id="sec-3-1">
        <title>3.1. Sentence Transformers</title>
        <p>
          Sentence transformers are a modification of the pre-trained BERT networks that use siamese
and triplet network structures to derive semantically meaningful sentence embeddings that
can be compared using cosine-similarity [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]. We construct BART [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] and XLM [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] sentence
transformers for the mono-lingual and cross-lingual classification tasks, respectively.
        </p>
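<p>The cosine-similarity comparison of sentence embeddings described above can be sketched with plain NumPy. The vectors below are illustrative stand-ins for the outputs of a BART or XLM sentence transformer, not real model outputs:</p>

```python
import numpy as np

def cosine_similarity(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical 4-dimensional "sentence embeddings" (real sentence
# transformers produce vectors with hundreds of dimensions).
emb_a = np.array([0.2, 0.8, 0.1, 0.5])
emb_b = np.array([0.25, 0.75, 0.05, 0.55])  # close in direction to emb_a
emb_c = np.array([-0.9, 0.1, 0.7, -0.3])    # pointing elsewhere

print(cosine_similarity(emb_a, emb_b))  # close to 1.0: semantically similar
print(cosine_similarity(emb_a, emb_c))  # much lower: dissimilar
```

<p>Semantically similar sentences map to nearby vectors, so their cosine similarity approaches 1, which is exactly the property the classifier's input space relies on.</p>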
        <p>
          XLM [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] is a Transformer architecture that uses two approaches during pre-training
depending on the type of data. For mono-lingual data, it uses an unsupervised modeling technique such
as Causal Language Modeling (CLM) or Masked Language Modeling (MLM). For cross-lingual
data, XLM employs a supervised modeling technique that combines MLM with Translation
Language Modeling (TLM).
        </p>
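<p>As an illustrative sketch of the MLM objective mentioned above (assuming a toy whitespace tokenizer and the usual 15% masking rate; real XLM pre-training operates on subword units):</p>

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, mask_prob=0.15, seed=42):
    """Randomly replace tokens with [MASK]; the model must predict the originals."""
    rng = random.Random(seed)
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append(MASK)
            targets[i] = tok  # position -> original token to predict
        else:
            masked.append(tok)
    return masked, targets

tokens = "the model learns to predict the hidden words from context".split()
masked, targets = mask_tokens(tokens)
print(masked)   # sequence with some tokens replaced by [MASK]
print(targets)  # the prediction targets at the masked positions
```

<p>TLM extends this idea by concatenating a sentence with its translation and masking tokens in both, so the model can use the other language as context.</p>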
        <p>
          BART [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] is a generalized BERT that uses a transformer-based neural machine translation
architecture. The architecture uses a left-to-right decoder (as in GPT [14] architecture) and a
standard Sequence-to-Sequence bidirectional encoder (as in BERT [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]).
        </p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Classification Models</title>
        <p>For classification, we propose a deep neural network architecture that contains the following
layers: (1) Input layer; (2) BiLSTM layer; and (3) Dense layer.</p>
        <p>The input layer instantiates the neural network. It produces a symbolic tensor-like
object whose size equals the dimension of the sentence transformer embedding.</p>
        <p>
LSTM (Long Short-Term Memory) [15] is a recurrent neural network that processes past
information using two state components: (1) a hidden state for short-term memory; and
(2) an internal cell state for long-term memory. The BiLSTM layer encapsulates both past and
future information through the use of two hidden states. The forward hidden state processes the
past information using a forward LSTM, while the backward hidden state processes the future
information using a backward LSTM. To encode both the past and the future,
the BiLSTM concatenates the forward and backward hidden states into one hidden state at every
time-step. The number of units for this layer can be determined experimentally using ablation
and hyperparameter testing (see [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] for more details). For the LSTM cell, we use the classic
implementation presented in [15, 16]. For this cell, the recurrent activation function is sigmoid,
the kernel weights are initialized using the Glorot uniform linear transformation [17], and the
bias vector is initialized with zeros.
        </p>
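<p>The mechanism described above can be sketched in NumPy: one classic LSTM cell (sigmoid gates, tanh candidate) is run forward and backward over a sequence, and the two hidden states are concatenated at every time-step. The weights are random stand-ins, and for brevity both directions share them, whereas a real BiLSTM learns separate parameters per direction:</p>

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_pass(xs, W, U, b, units):
    """Run a classic LSTM cell over sequence xs; return the hidden state per step."""
    h = np.zeros(units)  # hidden state: short-term memory
    c = np.zeros(units)  # internal cell state: long-term memory
    hs = []
    for x in xs:
        z = W @ x + U @ h + b                         # all four gate pre-activations
        i, f, o, g = np.split(z, 4)
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)  # input / forget / output gates
        c = f * c + i * np.tanh(g)                    # update long-term memory
        h = o * np.tanh(c)                            # update short-term memory
        hs.append(h)
    return hs

rng = np.random.default_rng(0)
units, dim, steps = 3, 5, 4
xs = [rng.standard_normal(dim) for _ in range(steps)]
W = rng.standard_normal((4 * units, dim))
U = rng.standard_normal((4 * units, units))
b = np.zeros(4 * units)  # bias initialized with zeros, as in the text

forward = lstm_pass(xs, W, U, b, units)               # encodes past context
backward = lstm_pass(xs[::-1], W, U, b, units)[::-1]  # encodes future context
bilstm = [np.concatenate([fh, bh]) for fh, bh in zip(forward, backward)]
print(bilstm[0].shape)  # (6,) = 2 * units
```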
        <p>The Dense Layer is a fully connected Perceptron layer used for classification. The number of
units in this layer is equal to the number of classes. The activation function for this layer is the
sigmoid.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experimental Results</title>
      <p>In this section, we present the experimental results of the proposed models for the CheckThat!
2022 Fake News Detection task for both the mono-lingual and cross-lingual challenges.</p>
      <sec id="sec-4-1">
        <title>4.1. Dataset</title>
        <p>The CheckThat! 2022 Task 3 [18, 19] consists of two subtasks as follows: (1) multi-class
monolingual fake news detection of news articles (English) [20, 21] (i.e., mono-lingual task); and (2)
multi-class cross-lingual fake news detection task (German) (i.e., cross-lingual task). The steps
used in the data collection are defined in Shahi [22].</p>
        <p>For the mono-lingual task, the English training data is the same as from the CheckThat! 2021
version [23]. The number of classes for this task is four: false, partially false, other, and true.
The number of labels has been defined after a thorough study of 83 classes was conducted by
fact-checkers [24]. The dataset contains an English training, development, and testing set.</p>
        <p>For the cross-lingual task, a new test dataset in German is introduced. The main focus of this
task is to use transfer learning to detect fake news content in low resource languages. Thus, the
training for this task is done on the English training and development datasets, and then it is
tested on the German testing set. For this task, we use the same labels as for the mono-lingual
task.</p>
        <p>As we used the same training data for both the mono-lingual and cross-lingual tasks, we
concatenated the English training and development sets to train the model. Table 1 shows the
label distribution for the training dataset. We observe that the dataset is highly imbalanced.
Table 2 presents the label distribution for the test dataset.</p>
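<p>The class imbalance noted above can be quantified and, for example, compensated with inverse-frequency class weights, as in the following sketch (the label counts are hypothetical placeholders, not the actual Table 1 figures):</p>

```python
from collections import Counter

# Hypothetical label counts standing in for the Table 1 distribution.
labels = (["false"] * 400 + ["partially false"] * 200
          + ["true"] * 120 + ["other"] * 30)

counts = Counter(labels)
total = sum(counts.values())

# Inverse-frequency class weights: rare classes get larger weights,
# so the loss penalizes their misclassification more heavily.
weights = {cls: total / (len(counts) * n) for cls, n in counts.items()}

for cls in counts:
    print(f"{cls:15s} count={counts[cls]:4d} weight={weights[cls]:.2f}")
```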
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Experimental Setup</title>
        <p>
          For the sentence transformer, we used the pre-trained BART (facebook/bart-large) and XLM
(sentence-transformers/stsb-xlm-r-multilingual) from HuggingFace Transformers [25]. We train
the sentence transformers using the SentenceTransformers Python 3 package [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ].
        </p>
        <p>For classification, the BiLSTM layer uses 100 LSTM units configured as in [15]. The dense
layer contains 4 units (equal to the number of classes) and uses the sigmoid activation function.
We used the ADAM optimizer and a batch size of 64. The model is trained for 100 epochs. To
prevent overfitting, we used an early stopping mechanism that monitors the accuracy during
training. We use Keras with TensorFlow as backend for implementing the neural model.</p>
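<p>The setup described above can be sketched in Keras as follows. This is a minimal reconstruction from the stated configuration (100 BiLSTM units, a 4-unit sigmoid dense layer, ADAM, accuracy-monitored early stopping); the embedding dimension of 1024 matches facebook/bart-large but is an assumption here:</p>

```python
from tensorflow.keras import layers, models, callbacks

EMB_DIM = 1024   # assumed sentence-embedding size (BART-large hidden size)
NUM_CLASSES = 4  # false, partially false, other, true

model = models.Sequential([
    layers.Input(shape=(1, EMB_DIM)),        # one sentence embedding per document
    layers.Bidirectional(layers.LSTM(100)),  # 100 LSTM units per direction
    layers.Dense(NUM_CLASSES, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])

early_stop = callbacks.EarlyStopping(monitor="accuracy", patience=5,
                                     restore_best_weights=True)
# Training call matching the stated configuration (data loading omitted):
# model.fit(X_train, y_train, batch_size=64, epochs=100, callbacks=[early_stop])
print(model.output_shape)  # (None, 4)
```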
        <p>The implementation is available online on GitHub at the following url: https://github.com/
elena-apostol/AwakenedCheckThat2022.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Results</title>
        <p>Table 3 presents the overall results for both the mono-lingual (i.e., both train and test sets use
English texts) and the cross-lingual (i.e., the train set is in English and the test set is in German) tasks.
We observe that, by training the BiLSTM with BART sentence embeddings on English, we
obtain an accuracy of ∼ 0.53 for the mono-lingual task and an accuracy of ∼ 0.28 for the
cross-lingual task. With these results, the Awakened team obtained the 3rd and the 5th place
in the competition for the mono-lingual task and cross-lingual task, respectively. As a general
observation, the results are highly influenced by the dataset’s size and class imbalance.</p>
        <p>For the mono-lingual task, the low performance of the model is directly impacted by two
dataset-related aspects: (1) the dataset size is small, being inadequate for a neural network
approach; and (2) the dataset is highly imbalanced, making misclassification a real challenge.
These two aspects are further emphasized when analyzing the evaluation metrics per class
(Table 4). For the false label, the model obtains ∼ 0.67 precision and ∼ 0.83 recall. Thus,
these results show that the model manages to correctly identify fake news. For the true label,
the model obtains ∼ 0.76 precision and ∼ 0.21 recall. This shows that the model also manages
to discriminate well between true news and the other three types of texts. These results also
show that the contextual, semantic, and syntactic information encoded by the sentence
transformer for the true and false labels is very specific to these classes. Thus, the textual
dissimilarities between fake news and real news are more prominent. The precision and recall
for the partially false (∼ 0.13 precision and ∼ 0.32 recall) and other (∼ 0.04 precision and
∼ 0.03 recall) classes are very small. These results indicate that these textual data are more
similar to the other two classes. Thus, the model does not manage to discriminate correctly
between these two labels and the true and false ones.</p>
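<p>The per-class precision/recall reasoning used above can be made concrete with a small pure-Python example on hypothetical gold and predicted labels (illustrative data, not the competition datasets):</p>

```python
def precision_recall(y_true, y_pred, cls):
    """Per-class precision and recall from parallel label lists."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == cls and t == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == cls and t != cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if p != cls and t == cls)
    precision = tp / (tp + fp) if tp + fp else 0.0  # how clean the predictions are
    recall = tp / (tp + fn) if tp + fn else 0.0     # how many gold items are found
    return precision, recall

# Hypothetical gold and predicted labels for six documents.
y_true = ["true", "true", "false", "false", "false", "other"]
y_pred = ["true", "false", "false", "false", "true", "false"]

print(precision_recall(y_true, y_pred, "true"))
print(precision_recall(y_true, y_pred, "false"))
```

<p>High precision with low recall (as for the true label) means the returned hits are mostly correct but many relevant documents are missed; high recall with low precision (as for the false label) means most relevant documents are found but mixed with many irrelevant ones.</p>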
        <p>For the cross-lingual task, we observe that the model manages to obtain an accuracy of ∼ 0.28
and an F1-Score of ∼ 0.19 (Table 3). These results are also impacted by the transfer learning
algorithm, besides the dataset’s size and the imbalanced labels. Based on these observations,
we can conclude that the multi-lingual sentence transformers do not manage to correctly
find similarities between the English and German texts that are labeled with the same class.
When analyzing the per-class results, we observe that the true labeled German documents are
predicted with a high precision (∼ 0.59), but the recall for these documents is ∼ 0.05.
The interpretation of these values for the true label is that, when the model predicts this
label, it is usually correct, but it misses most of the truly relevant documents. In other words,
for the true labeled documents, the model returns more relevant results than irrelevant ones,
but it does not manage to return most of the relevant results for this label. For the false labeled
German documents, the interpretation of the results is, as expected, the reverse of that for the
true labeled documents. Thus, with a precision of ∼ 0.35 and a recall of ∼ 0.65, the model
manages to return most of the relevant results for this label, but it also returns many irrelevant
ones. The model does not manage to classify any of the German documents in
the test set labeled with other, and it has a very low F1-Score for the prediction of documents
labeled with partially false.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusions</title>
      <p>In this paper, we trained two BiLSTM neural networks with sentence transformers as data
encoding models to detect fake news. The first model is trained and tested on
an English news article dataset for multi-class mono-lingual fake news detection. This model
encodes the textual data using BART sentence transformer. The second model is trained on
the same English dataset and tested on a German dataset for multi-class cross-lingual fake
news detection. The model encodes the textual data using XLM sentence transformer and takes
advantage of transfer learning to solve the task of cross-lingual fake news detection. We use
the first model to answer our first research question (1). With an accuracy of ∼ 0.53 and an
F1-Score of ∼ 0.32, we conclude that the use of simple neural networks with sentence
transformers for mono-lingual fake news detection tasks is worth investigating further. The second
model is used to answer our second research question (2). With an accuracy of ∼ 0.28 and
an F1-Score of ∼ 0.19, we conclude that the BiLSTM with XLM sentence transformer model does
not manage to correctly find similarities between the English and German texts. Although the use
of cross-lingual transformers and transfer learning for multi-class classification could in theory
prove useful, for the multi-class cross-lingual fake news detection task at hand, they perform
poorly. For both mono-lingual and cross-lingual tasks, we observed that: (1) the dataset size
needs to be large to be adequate for a neural network approach; and (2) the dataset needs a
balanced label distribution to mitigate misclassification.</p>
      <p>In future work, we aim to use transformer embeddings instead of sentence transformers.
We also plan to test other cross-lingual transformers for transfer learning in a larger study to
determine whether the conclusions obtained on this small dataset generalize or are an artifact
of this particular data-driven setup.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>The research presented in this paper was supported in part by the German Academic Exchange
Service (DAAD) through the projects "AWAKEN: content-Aware and netWork-Aware faKE
News mitigation" (grant no. 91809005) and "Deep-Learning Anomaly Detection for Human and
Automated Users Behavior" (grant no. 91809358), in part by the German Federal Ministry
of Education and Research (BMBF) project "PANQURA - a technology platform for more
information transparency in times of crisis" under Grant 03COV03F, in part by the European
Union project "FAST-LISA - Fighting hAte Speech Through a Legal, ICT and Sociolinguistic
approach" under Grant 101049342, and in part by the EU CEF project "NORDIS - NORdic
observatory for digital media and information DISorder" under Grant number 2394203.
[14] A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, I. Sutskever, et al., Language models are
unsupervised multitask learners, OpenAI blog 1 (2019) 9.
[15] S. Hochreiter, J. Schmidhuber, Long Short-Term Memory, Neural Computation 9 (1997)
1735–1780. doi:10.1162/neco.1997.9.8.1735.
[16] F. A. Gers, J. Schmidhuber, F. Cummins, Learning to forget: Continual prediction with
LSTM, Neural Computation 12 (2000) 2451–2471. doi:10.1162/089976600300015015.</p>
      <p>
[17] X. Glorot, Y. Bengio, Understanding the difficulty of training deep feedforward neural
networks, in: Y. W. Teh, M. Titterington (Eds.), Proceedings of the Thirteenth International
Conference on Artificial Intelligence and Statistics, volume 9 of Proceedings of Machine
Learning Research, PMLR, Chia Laguna Resort, Sardinia, Italy, 2010, pp. 249–256. URL:
https://proceedings.mlr.press/v9/glorot10a.html.
[18] P. Nakov, A. Barrón-Cedeño, G. D. S. Martino, F. Alam, J. M. Struß, T. Mandl, R. Míguez,
T. Caselli, M. Kutlu, W. Zaghouani, C. Li, S. Shaar, G. K. Shahi, H. Mubarak, A. Nikolov,
N. Babulkov, Y. S. Kartal, J. Beltrán, The CLEF-2022 CheckThat! Lab on Fighting the
COVID-19 Infodemic and Fake News Detection, in: Lecture Notes in Computer Science, Springer
International Publishing, 2022, pp. 416–428. doi:10.1007/978-3-030-99739-7_52.
[19] P. Nakov, A. Barrón-Cedeño, G. Da San Martino, F. Alam, J. M. Struß, T. Mandl, R. Míguez,
T. Caselli, M. Kutlu, W. Zaghouani, C. Li, S. Shaar, G. K. Shahi, H. Mubarak, A. Nikolov,
N. Babulkov, Y. S. Kartal, J. Beltrán, M. Wiegand, M. Siegel, J. Köhler, Overview of the
CLEF-2022 CheckThat! Lab on Fighting the COVID-19 Infodemic and Fake News Detection,
in: Proceedings of the 13th International Conference of the CLEF Association: Information
Access Evaluation meets Multilinguality, Multimodality, and Visualization, CLEF ’2022,
Bologna, Italy, 2022.
[20] G. K. Shahi, D. Nandini, FakeCovid – A Multilingual Cross-domain Fact Check News
Dataset for COVID-19, in: Workshop Proceedings of the 14th International AAAI
Conference on Web and Social Media, ICWSM, 2020, pp. 1–9. URL: http://workshop-proceedings.
icwsm.org/pdf/2020_14.pdf. doi:10.36190/2020.14.
[21] J. Köhler, G. K. Shahi, J. M. Struß, M. Wiegand, M. Siegel, T. Mandl, Overview of the
CLEF-2022 CheckThat! Lab Task 3 on Fake News Detection, in: Working Notes of CLEF
2022—Conference and Labs of the Evaluation Forum, CLEF ’2022, Bologna, Italy, 2022.
[22] G. K. Shahi, AMUSED: An Annotation Framework of Multi-modal Social Media Data,
arXiv preprint arXiv:2010.00502 (2020).
[23] G. K. Shahi, J. M. Struß, T. Mandl, Overview of the CLEF-2021 CheckThat! lab task 3 on
fake news detection, Working Notes of CLEF (2021).
[24] G. K. Shahi, A. Dirkson, T. A. Majchrzak, An exploratory study of COVID-19
misinformation on Twitter, Online Social Networks and Media 22 (2021) 100104.
[25] T. Wolf, L. Debut, V. Sanh, J. Chaumond, C. Delangue, A. Moi, P. Cistac, T. Rault, R. Louf,
M. Funtowicz, J. Davison, S. Shleifer, P. von Platen, C. Ma, Y. Jernite, J. Plu, C. Xu, T. L. Scao,
S. Gugger, M. Drame, Q. Lhoest, A. Rush, Huggingface’s transformers: State-of-the-art
natural language processing, in: Proceedings of the 2020 Conference on Empirical Methods
in Natural Language Processing: System Demonstrations, Association for Computational
Linguistics, 2020, pp. 38–45. doi:10.18653/v1/2020.emnlp-demos.6.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
<mixed-citation>[1] V.-I. Ilie, C.-O. Truică, E.-S. Apostol, A. Paschke, Context-Aware Misinformation Detection: A Benchmark of Deep Learning Architectures Using Word Embeddings, IEEE Access 9 (2021) 162122–162146. doi:10.1109/ACCESS.2021.3132502.</mixed-citation>
      </ref>
      <ref id="ref2">
<mixed-citation>[2] C.-O. Truică, E.-S. Apostol, MisRoBÆRTa: Transformers versus Misinformation, Mathematics 10 (2022) 1–25 (569). doi:10.3390/math10040569.</mixed-citation>
      </ref>
      <ref id="ref3">
<mixed-citation>[3] M. Singh, M. W. Bhatt, H. S. Bedi, U. Mishra, Performance of Bernoulli's Naïve Bayes classifier in the detection of fake news, Materials Today: Proceedings (2020).</mixed-citation>
      </ref>
      <ref id="ref4">
<mixed-citation>[4] R. K. Kaliyar, A. Goswami, P. Narang, S. Sinha, FNDNet – A deep convolutional neural network for fake news detection, Cognitive Systems Research 61 (2020) 32–44. doi:10.1016/j.cogsys.2019.12.005.</mixed-citation>
      </ref>
      <ref id="ref5">
<mixed-citation>[5] H. Saleh, A. Alharbi, S. H. Alsamhi, OPCNN-FAKE: Optimized convolutional neural network for fake news detection, IEEE Access 9 (2021) 129471–129489.</mixed-citation>
      </ref>
      <ref id="ref6">
<mixed-citation>[6] S. Samantaray, A. Kumar, Bi-directional Long Short-Term Memory Network for Fake News Detection from Social Media, in: Intelligent and Cloud Computing, Springer, 2022, pp. 463–470.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>T. E.</given-names>
            <surname>Trueman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Narayanasamy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Vidya</surname>
          </string-name>
          ,
          <article-title>Attention-based C-BiLSTM for fake news detection</article-title>
          ,
          <source>Applied Soft Computing</source>
          <volume>110</volume>
          (
          <year>2021</year>
          )
          <fpage>107600</fpage>
          . doi:
          <pub-id pub-id-type="doi">10.1016/j.asoc.2021.107600</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          ,
          <article-title>BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding</article-title>
          , in:
          <source>Conference of the North American Chapter of the Association for Computational Linguistics</source>
          , ACL,
          <year>2019</year>
          , pp.
          <fpage>4171</fpage>
          -
          <lpage>4186</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.18653/v1/N19-1423</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ott</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Du</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Joshi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Levy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zettlemoyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Stoyanov</surname>
          </string-name>
          ,
          <article-title>RoBERTa: A Robustly Optimized BERT Pretraining Approach</article-title>
          ,
          <year>2019</year>
          . arXiv:1907.11692.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>M.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ghazvininejad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mohamed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Levy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Stoyanov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zettlemoyer</surname>
          </string-name>
          ,
          <article-title>BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension</article-title>
          , in:
          <source>Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics</source>
          , Association for Computational Linguistics,
          <year>2020</year>
          , pp.
          <fpage>7871</fpage>
          -
          <lpage>7880</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.18653/v1/2020.acl-main.703</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>C.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <article-title>A Two-Stage Model Based on BERT for Short Fake News Detection</article-title>
          , in:
          <source>International Conference on Knowledge Science, Engineering and Management</source>
          , Springer,
          <year>2019</year>
          , pp.
          <fpage>172</fpage>
          -
          <lpage>183</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1007/978-3-030-29563-9_17</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>N.</given-names>
            <surname>Reimers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Gurevych</surname>
          </string-name>
          ,
          <article-title>Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks</article-title>
          , in:
          <source>Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)</source>
          , Association for Computational Linguistics,
          <year>2019</year>
          , pp.
          <fpage>3982</fpage>
          -
          <lpage>3992</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.18653/v1/D19-1410</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>A.</given-names>
            <surname>Conneau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Lample</surname>
          </string-name>
          ,
          <article-title>Cross-lingual Language Model Pretraining</article-title>
          , in:
          <source>Advances in Neural Information Processing Systems</source>
          , volume
          <volume>32</volume>
          ,
          <year>2019</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>11</lpage>
          . URL: https://proceedings.neurips.cc/paper/2019/file/c04c19c2c2474dbf5f7ac4372c5b9af1-Paper.pdf.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>