<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>SemanticCuetSync at CheckThat! 2024: Pre-trained Transformer-based Approach to Detect Check-Worthy Tweets</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Symom Hossain Shohan</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Md. Sajjad Hossain</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ashraful Islam Paran</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jawad Hossain</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Shawly Ahsan</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mohammed Moshiul Hoque</string-name>
          <email>moshiul_240@cuet.ac.bd</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Chittagong University of Engineering and Technology</institution>
          ,
          <addr-line>Chattogram - 4349</addr-line>
          ,
          <country country="BD">Bangladesh</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <abstract>
        <p>This paper presents an approach for classifying English, Arabic, and Dutch texts as checkworthy using BERT-based models. The study explores ten baseline models, including LR, MNB, SVM, CNN+LSTM, CNN+BiLSTM, BERT-Base-Uncased, RoBERTa, AraBERTv2, Dutch-RoBERTa, and Dutch-BERT, to address the shared task. The study also investigates SetFit, a few-shot learning framework, to identify checkworthy tweets or texts. Evaluation results show that transformer-based models outperform the other approaches, with RoBERTa achieving the highest F1 score of 75.82% for English tweets, Dehate-BERT scoring 52.55% for Arabic texts, and Dutch-BERT obtaining a maximum score of 58.42% for Dutch texts. Our team ranked 6th overall for English, 5th for Arabic, and 16th for Dutch in the shared task challenge.</p>
      </abstract>
      <kwd-group>
        <kwd>Natural Language Processing</kwd>
        <kwd>Check-Worthiness</kwd>
        <kwd>Fact-checking</kwd>
        <kwd>Tweet-Verification</kwd>
        <kwd>Transformers</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Checkworthy content refers to information that must be confirmed for accuracy, as it may have the
potential to shape the opinions and decisions of others. The rise of social networks has led to an
exponential growth in textual data on the internet, sometimes resulting in the spread of false claims
that can be detrimental to society if left unaddressed. These claims can include political, religious, and
health-related misinformation, which can cause discord in society. Fact-checking is a time-consuming
task that requires extensive research, identification, verification, and expert analysis. Automating this
entire process is a significant challenge, and the first step towards this goal is to determine whether the
information is worth checking in the first place.</p>
      <p>
        With the proliferation of communication and social media platforms, such as Facebook, Twitter, and
Reddit, the dissemination of false information has become increasingly prevalent. A recent study has
suggested that people struggle to differentiate facts from false news [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Intelligent technologies can be
used to support human fact-checkers to identify claims worth fact-checking [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Many studies have
been devoted to developing a fully automated system for fact-checking [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. As social
media data continues to expand daily, it is impractical for human experts to monitor everything
efficiently. Therefore, developing an automatic system has emerged as a practical solution to this problem.
This work proposes a solution to classify English, Arabic, and Dutch texts or tweets as checkworthy,
harnessing the power of BERT-based approaches. The critical contributions of this study are:
        <list list-type="bullet">
          <list-item>
            <p>Introducing a fine-tuned transformer-based model to classify checkworthy texts in three languages (English, Arabic, and Dutch).</p>
          </list-item>
          <list-item>
            <p>Exploring various machine learning (LR, SVM, and MNB), deep learning (CNN, CNN + LSTM, and CNN + BiLSTM), and transformer-based models to find a suitable method for detecting checkworthy texts in multiple languages.</p>
          </list-item>
        </list>
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        Inaccurate news spreads quickly on social media, so checking the authenticity of
posts that surface there has become crucial. Intelligent fact-checking systems have emerged
as a significant area of research to tackle this problem. Trustworthiness detection has been
studied in several domains, such as digital scams [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], the healthcare sector [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], politics [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], and many more fields.
An overview of Task 1 in the fourth edition of the CheckThat! Lab was provided by Shaar et al.
[
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. The task was to predict which tweets involving politics and COVID-19 needed to be verified.
Williams et al. [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] presented a transformer-based solution with data augmentation for this problem,
and it achieved a mean average precision (mAP) of 0.66 for Arabic. Checkworthiness detection in
multimodal content [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] is another active research area. Beyond unimodal approaches, Sadouk et al.
[
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] proposed a multimodal transformer-based model (BERT+ResNet50) to identify checkworthiness in
English, which recorded an F1 score of 0.71, and a transformer-based model (MarBERT) with downsampling,
which recorded an F1 score of 0.61 in Arabic on an image dataset. Meanwhile, Ivanov et al. [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] proposed audio
datasets from past political debates and ensemble techniques for detecting checkworthiness. Their
audio model (wav2vec2.0) achieved an mAP of 0.34 when extra noise was eliminated. Ensembles using
BERT and an audio model outperformed BERT alone, with an mAP of 0.38.
      </p>
      <p>This work addresses a significant gap in the existing literature by comprehensively comparing
machine learning (ML), deep learning (DL), and transformer-based solutions. In addition, it investigates
the use of few-shot models like SetFit for determining checkworthiness in Dutch, Arabic, and English.
This study improves the understanding of the various models’ performance in these distinct languages.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Dataset and Task Description</title>
      <p>The dataset consists of tweets or texts in English, Arabic, and Dutch, along with their
corresponding labels (‘Yes’ for texts worth checking, ‘No’ otherwise). Table 1 shows the distribution
of train, dev, dev-test, and test sets. We trained all models on the training set and evaluated
their performance on the test set. The CLEF 2024 CheckThat! Lab [16, 17, 18] consists of six tasks
[19, 20, 21, 22, 23]. We participated in Task 1 of this shared task. Task 1 [19] focuses on assessing whether
a claim in a tweet or transcription requires further investigation for fact-checking. The traditional
approach for such decisions involves human experts, either professional fact-checkers or annotators,
who evaluate the claim based on various criteria. Table 2 illustrates an example of training data for the
different languages.</p>
    </sec>
    <sec id="sec-4">
      <title>4. System Overview</title>
      <p>This work exploited various ML, DL, and transformer-based approaches across all three languages.
The ML-based techniques include logistic regression (LR), support vector machine (SVM), and multinomial
naive Bayes (MNB). The DL-based techniques include CNN, CNN+long short-term memory (LSTM), and
CNN+bidirectional LSTM (BiLSTM). Lastly, various BERT-based transformers are fine-tuned for each
language for the given task. Figure 1 illustrates the schematic process of checkworthy text detection.</p>
      <p>Textual Feature Extraction: Textual feature extraction is one of the essential steps in natural
language processing, which involves transforming raw textual data into numerical representations.
This numerical representation aids the models in understanding and processing textual data. A Count
Vectorizer is used in the ML models examined in this work. It is a widely used technique for textual
feature extraction that transforms text data into a matrix of token counts. In DL models, tokenization and
padding combine to convert raw texts into structured numerical data. These numerical representations
are then passed through an embedding layer, which captures more advanced features such as semantic
relationships. This study uses the embedding layer instead of Word2Vec [24] or GloVe [25] to allow the
model to learn task-specific embedding during training. Finally, BERT-based tokenizers are employed
for transformer-based models to exploit the BERT architecture.</p>
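      <p>As an illustration of the two feature pipelines described above, the following minimal Python sketch builds a token-count matrix (the kind of representation a count vectorizer produces) and padded token-id sequences (the kind of input an embedding layer consumes). It is not the code used in this work: the whitespace tokenizer, the helper names (tokenize, fit_vocabulary, count_matrix, to_padded_ids), and the toy sentences are assumptions made for the example.</p>
      <preformat><![CDATA[
```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def fit_vocabulary(texts):
    """Map each distinct token to a column index, in sorted order."""
    tokens = sorted({tok for text in texts for tok in tokenize(text)})
    return {tok: idx for idx, tok in enumerate(tokens)}


def count_matrix(texts, vocab):
    """Bag-of-words features: one row per text, one column per vocabulary token."""
    rows = []
    for text in texts:
        counts = Counter(tokenize(text))
        rows.append([counts.get(tok, 0) for tok in vocab])
    return rows


def to_padded_ids(texts, vocab, max_len):
    """Sequence features: 1-based token ids, truncated and post-padded with 0."""
    padded = []
    for text in texts:
        ids = [vocab[tok] + 1 for tok in tokenize(text) if tok in vocab][:max_len]
        padded.append(ids + [0] * (max_len - len(ids)))
    return padded


# Toy sentences for illustration only; they are not taken from the dataset.
texts = ["the vaccine is safe", "the vaccine the vaccine"]
vocab = fit_vocabulary(texts)
print(count_matrix(texts, vocab))
print(to_padded_ids(texts, vocab, max_len=5))
```
]]></preformat>
      <p>The count matrix discards word order, which suits the ML models, whereas the padded id sequences keep it, which the convolutional and recurrent layers rely on.</p>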
      <p>ML Models: Various ML models are examined in this work, such as LR, SVM, MNB, KNN, and RF.
All the hyperparameter settings for these models are illustrated in Table 3.</p>
      <p>CNN: This work employed a CNN model comprising an embedding layer with an output dimension
of 200. The model features two Conv1D layers with 64 and 128 filters, respectively. Both layers used a
kernel size of 2 and ReLU activation. For downsampling, the model incorporates a GlobalMaxPooling1D
layer. Subsequently, a dense layer with 128 units and ReLU activation is followed by a dropout layer
with a rate of 0.5 to prevent overfitting. The output layer has a single unit with sigmoid activation. The
model utilizes the ‘binary_crossentropy’ loss function and the ‘Nadam’ optimizer and was trained with a
batch size of 32 for three epochs.</p>
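      <p>To make the layer sizes above concrete, the following sketch tallies output shapes and trainable parameters for the described CNN using the standard parameter-count formulas for embedding, one-dimensional convolution, and dense layers. The vocabulary size (10000) and sequence length (50) are hypothetical values chosen for the example, and the sketch assumes unpadded convolutions with stride 1, which the text does not state.</p>
      <preformat><![CDATA[
```python
def embedding_params(vocab_size, dim):
    # one dim-sized vector per vocabulary entry
    return vocab_size * dim


def conv1d_params(in_channels, filters, kernel_size):
    # one weight per (kernel position, input channel, filter) plus one bias per filter
    return kernel_size * in_channels * filters + filters


def dense_params(in_units, out_units):
    # full weight matrix plus one bias per output unit
    return in_units * out_units + out_units


def cnn_summary(vocab_size, seq_len):
    """Return (layer name, output shape, trainable parameters) for each layer."""
    return [
        ("embedding", (seq_len, 200), embedding_params(vocab_size, 200)),
        # assumed: no padding and stride 1, so a kernel of size 2 shortens the sequence by 1
        ("conv1d_64", (seq_len - 1, 64), conv1d_params(200, 64, 2)),
        ("conv1d_128", (seq_len - 2, 128), conv1d_params(64, 128, 2)),
        ("global_max_pool", (128,), 0),
        ("dense_128", (128,), dense_params(128, 128)),
        ("dropout_0.5", (128,), 0),
        ("output_sigmoid", (1,), dense_params(128, 1)),
    ]


# Hypothetical vocabulary size and sequence length, for illustration only.
summary = cnn_summary(vocab_size=10000, seq_len=50)
for name, shape, params in summary:
    print(name, shape, params)
print("total", sum(params for _, _, params in summary))
```
]]></preformat>
      <p>Under these assumptions the embedding layer accounts for almost all of the parameters, so the vocabulary size is the main driver of model size.</p>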
      <p>CNN+LSTM: The CNN+LSTM model used in this work has almost the same architecture as the CNN
model, incorporating a single LSTM layer comprising 64 units and a dropout rate of 0.2 for sequence
modeling. Furthermore, the dense layer included in this design features 64 units and utilizes the ReLU
activation function. The remaining hyperparameter configurations are consistent with those employed
in the CNN model.</p>
      <p>CNN+BiLSTM: This model has an architecture similar to the CNN+LSTM model but replaces LSTM
with a Bidirectional LSTM.</p>
      <p>Transformer models for English: This study fine-tuned three transformer-based models for the
task on the English dataset: BERT-Base-Uncased [26], SetFit [27],
and RoBERTa [28]. Text preprocessing was applied before feeding the data into
the transformers. The preprocessing steps include lowercasing, emoji removal, stop word removal,
stemming, contraction expansion, simple Unicode spelling correction, and HTML tag removal. For stop
word removal, the NLTK stopwords list is used. The main purpose of these steps was to
reduce the noise in the dataset and focus on meaningful words. BERT-Base-Uncased
is a pre-trained transformer model with strong performance across various natural language
processing (NLP) tasks, and it demonstrated satisfactory performance on this task.
SetFit, on the other hand, leverages pre-trained transformers with limited labeled data. We explored the
potential of this few-shot learning framework for the given task. In contrast to LLMs, SetFit does not
require manual prompts for classification. Finally, RoBERTa, an optimized version of BERT, is used
for the task and outperforms the other models.</p>
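      <p>The English preprocessing steps listed above can be sketched as a small Python function. This is an illustrative approximation rather than the pipeline used in this work: the stop word set and contraction table are tiny stand-ins (the study uses the NLTK stop word list), naive_stem is a crude placeholder for a real stemmer, the order of the steps is an assumption, and the Unicode spelling correction step is omitted because its details are not given.</p>
      <preformat><![CDATA[
```python
import re

# Tiny stand-ins for illustration; not the resources used in the study.
STOP_WORDS = {"the", "is", "a", "an", "of", "to", "and", "in", "it", "not"}
CONTRACTIONS = {"can't": "cannot", "won't": "will not", "it's": "it is", "don't": "do not"}

HTML_TAG = re.compile(r"<[^>]+>")
# Approximate emoji ranges (pictographs, emoticons, miscellaneous symbols, dingbats).
EMOJI = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")


def naive_stem(token):
    """Crude suffix stripping, standing in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    text = HTML_TAG.sub(" ", text).lower()        # HTML tag removal, lowercasing
    text = EMOJI.sub(" ", text)                   # emoji removal
    for short, full in CONTRACTIONS.items():      # contraction expansion
        text = text.replace(short, full)
    tokens = [tok for tok in text.split() if tok not in STOP_WORDS]  # stop word removal
    return " ".join(naive_stem(tok) for tok in tokens)               # stemming


print(preprocess("<b>Vaccines</b> can't be tested in 2 days \U0001F600"))
```
]]></preformat>
      <p>In this sketch, contractions are expanded before stop words are removed so that a form such as "can't" becomes "cannot" rather than being split at the apostrophe.</p>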
      <p>Transformer models for Arabic: This study also fine-tuned three transformer-based models
on the Arabic dataset: AraBERTV2 [29],
SetFit (few-shot) [27], and Dehate-BERT [30]. As with the English dataset, text preprocessing
was performed. The steps are lowercasing, emoji removal, stop
word removal, stemming, contraction expansion, simple spelling correction using Unicode, HTML tag
removal, punctuation removal, URL removal, whitespace removal, and number removal. Stemming
was performed using ArabicLightStemmer. Finally, normalization was used to convert similar
characters to a standard form.</p>
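      <p>The normalization step mentioned above is not specified further, so the following Python sketch only illustrates one commonly used set of Arabic character normalizations: removing diacritics and tatweel, unifying alef variants, and mapping alef maqsura and ta marbuta to their base forms. The exact mapping is an assumption and may differ from the one applied in this work; the function name normalize_arabic is hypothetical.</p>
      <preformat><![CDATA[
```python
import re

# Assumed mapping: the study does not list which characters it normalizes.
ALEF_VARIANTS = re.compile("[\u0622\u0623\u0625]")  # alef with madda, hamza above, hamza below
DIACRITICS = re.compile("[\u064B-\u0652]")          # fathatan through sukun
TATWEEL = "\u0640"                                  # elongation character


def normalize_arabic(text):
    """Map visually or phonetically similar Arabic characters to one standard form."""
    text = DIACRITICS.sub("", text)
    text = text.replace(TATWEEL, "")
    text = ALEF_VARIANTS.sub("\u0627", text)   # any alef variant becomes bare alef
    text = text.replace("\u0649", "\u064A")    # alef maqsura becomes ya
    text = text.replace("\u0629", "\u0647")    # ta marbuta becomes ha
    return text


# A word starting with alef-with-hamza is reduced to its bare-alef spelling.
print(normalize_arabic("\u0623\u062D\u0645\u062F"))
```
]]></preformat>
      <p>Normalization of this kind reduces spelling variation, so that differently written forms of the same word share one token.</p>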
      <p>AraBERTV2 is the improved version of AraBERT, which leverages the BERT architecture. This model
was trained on a sizeable Arabic dataset and has demonstrated effectiveness in various downstream NLP
tasks, including sentiment analysis, NER, and Arabic question answering. Dehate-BERT is a pre-trained
transformer model primarily designed for hate speech detection, and it outperformed all other models
in Arabic in this specific task.</p>
      <p>Transformer models for Dutch: This study investigated Dutch RoBERTa [31], SetFit, and
Dutch-BERT on the Dutch dataset. Rather than undertaking extensive text preprocessing, this work limited its
processing to removing non-Dutch characters from the texts. Table 4 lists the hyperparameters of the
transformer-based models.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Results and Analysis</title>
      <p>demonstrated the best performance with a precision of 52.06%, recall of 44.58%, and F1-score of 48.03%.
In the transformer category, Dutch-BERT outperformed others with a precision of 48.40%, recall of
73.80%, and the highest F1-score of 58.42%.</p>
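      <p>The precision, recall, and F1 figures reported in this section are linked by the usual definitions: F1 is the harmonic mean of precision and recall for the positive class. The short Python sketch below states these definitions and checks them against the first set of figures quoted above (precision 52.06%, recall 44.58%). The function names and the counts in the second call are hypothetical and serve only as an illustration.</p>
      <preformat><![CDATA[
```python
def precision_recall_f1(tp, fp, fn):
    """Positive-class precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return precision, recall, f1


def f1_from_pr(precision, recall):
    """F1 as the harmonic mean of precision and recall (same scale as the inputs)."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall taken from the text above; prints 48.03, the reported F1-score.
print(round(f1_from_pr(52.06, 44.58), 2))

# Hypothetical counts, for illustration only.
print(precision_recall_f1(tp=8, fp=2, fn=4))
```
]]></preformat>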
      <p>The SetFit model exhibited a precision of 45.31%, recall of 58.44%, and an F1-score of 51.05%. It shows
competitive results compared to transformer-based models such as Dutch-BERT, indicating its possible
usefulness in Dutch language classification tasks.</p>
      <p>In general, transformer-based models surpass both ML and DL models across various languages,
demonstrating the effectiveness of pre-trained language models in numerous natural language
processing applications. Additionally, within each category, specific models show superior performance,
emphasizing the necessity of choosing the appropriate model based on the specific task and language.</p>
      <p>5.1. Error Analysis</p>
      <sec id="sec-5-1">
        <title>Quantitative Analysis</title>
        <p>A comprehensive quantitative and qualitative error analysis is conducted to provide detailed insights
into the proposed model’s performance.</p>
        <p>Figure 2 illustrates the confusion matrix of the best-performing models across English, Arabic, and
Dutch.</p>
        <p>From a total of 341 test cases in English, RoBERTa demonstrates strong performance in identifying
the positive class, with 247 True Positives and only 6 False Positives. This indicates a high precision,
meaning the model is highly accurate when it predicts “Yes.” Additionally, with 57 True Negatives, it
correctly identifies many negative instances. However, there are 31 False Negatives, which indicates
that some positive instances are being missed. RoBERTa shows a balanced approach with notable
proficiency in minimizing false positive predictions, resulting in a high F1 score of 75.82% over
positive samples.</p>
        <p>From a total of 610 test cases in Arabic, Dehate-BERT shows a different pattern in its confusion matrix.
With 215 True Positives and 120 True Negatives, the model accurately identifies many instances from
both classes. However, the model has many False Positives (177) and False Negatives (98). This suggests
that while Dehate-BERT can identify positive instances, it incorrectly classifies many negative instances
as positive, leading to a lower precision. Additionally, the relatively high count of False Negatives
indicates room for improvement in recall, highlighting the need for better distinction between the two
classes.</p>
        <p>On the 1000 Dutch test cases, Dutch-BERT performs moderately, with 446 True Positives and 214
True Negatives. The model, however, has 157 False Positives and 183 False Negatives. This indicates
that while Dutch-BERT can identify positive instances reasonably well, it struggles with precision and
recall. The high number of False Positives suggests a tendency to overpredict the positive class, and the
significant count of False Negatives shows it also misses many positive instances. Consequently,
Dutch-BERT’s overall performance is balanced but shows substantial room for improvement in minimizing
misclassifications to enhance its F1 score.</p>
      </sec>
      <sec id="sec-5-2">
        <title>Qualitative Analysis</title>
        <p>It is clear that the models accurately predicted the labels for examples 2, 3, and 5 but made errors with
examples 1 and 4. For the first example, the sentence’s intent is ambiguous, leading to an incorrect label
prediction by the model. In the case of example 4, although the sentence is checkworthy, the model
mislabeled it due to inadequate training data in the Dutch language, which hindered proper learning.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>This work investigated the various ML, DL, and transformer-based models for identifying checkworthy
tweets or texts in English, Arabic, and Dutch. The results indicate that transformer-based models shine
in this task and exhibit exceptional capability in detecting checkworthy text. Specifically, RoBERTa
excels in English, Dehate-BERT for Arabic, and Dutch-BERT for Dutch, achieving the highest F1 scores
of 75.82%, 52.55%, and 58.42%, respectively. The study recommends that further advancements be made
by increasing the training data and incorporating advanced LLMs and GPT models.</p>
      <p>speeches, and interviews using audio data, in: ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, 2024, pp. 12011–12015.</p>
      <p>[16] A. Barrón-Cedeño, F. Alam, J. M. Struß, P. Nakov, T. Chakraborty, T. Elsayed, P. Przybyła, T. Caselli, G. Da San Martino, F. Haouari, C. Li, J. Piskorski, F. Ruggeri, X. Song, R. Suwaileh, Overview of the CLEF-2024 CheckThat! Lab: Check-worthiness, subjectivity, persuasion, roles, authorities and adversarial robustness, in: L. Goeuriot, P. Mulhem, G. Quénot, D. Schwab, L. Soulier, G. M. Di Nunzio, P. Galuščáková, A. García Seco de Herrera, G. Faggioli, N. Ferro (Eds.), Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Fifteenth International Conference of the CLEF Association (CLEF 2024), 2024.</p>
      <p>[17] A. Barrón-Cedeño, F. Alam, T. Chakraborty, T. Elsayed, P. Nakov, P. Przybyła, J. M. Struß, F. Haouari, M. Hasanain, F. Ruggeri, X. Song, R. Suwaileh, The CLEF-2024 CheckThat! Lab: Check-worthiness, subjectivity, persuasion, roles, authorities, and adversarial robustness, in: N. Goharian, N. Tonellotto, Y. He, A. Lipani, G. McDonald, C. Macdonald, I. Ounis (Eds.), Advances in Information Retrieval, Springer Nature Switzerland, Cham, 2024, pp. 449–458.</p>
      <p>[18] G. Faggioli, N. Ferro, P. Galuščáková, A. García Seco de Herrera (Eds.), Working Notes of CLEF 2024 - Conference and Labs of the Evaluation Forum, CLEF 2024, Grenoble, France, 2024.</p>
      <p>[19] M. Hasanain, R. Suwaileh, S. Weering, C. Li, T. Caselli, W. Zaghouani, A. Barrón-Cedeño, P. Nakov, F. Alam, Overview of the CLEF-2024 CheckThat! lab task 1 on check-worthiness estimation of multigenre content, in: [18], 2024.</p>
      <p>[20] J. M. Struß, F. Ruggeri, A. Barrón-Cedeño, F. Alam, D. Dimitrov, A. Galassi, G. Pachov, I. Koychev, P. Nakov, M. Siegel, M. Wiegand, M. Hasanain, R. Suwaileh, W. Zaghouani, Overview of the CLEF-2024 CheckThat! lab task 2 on subjectivity in news articles, in: [18], 2024.</p>
      <p>[21] J. Piskorski, N. Stefanovitch, F. Alam, R. Campos, D. Dimitrov, A. Jorge, S. Pollak, N. Ribin, Z. Fijavž, M. Hasanain, N. Guimarães, A. F. Pacheco, E. Sartori, P. Silvano, A. V. Zwitter, I. Koychev, N. Yu, P. Nakov, G. Da San Martino, Overview of the CLEF-2024 CheckThat! lab task 3 on persuasion techniques, in: [18], 2024.</p>
      <p>[22] F. Haouari, T. Elsayed, R. Suwaileh, Overview of the CLEF-2024 CheckThat! Lab Task 5 on Rumor Verification using Evidence from Authorities, in: [18], 2024.</p>
      <p>[23] P. Przybyła, B. Wu, A. Shvets, Y. Mu, K. C. Sheang, X. Song, H. Saggion, Overview of the CLEF-2024 CheckThat! lab task 6 on robustness of credibility assessment with adversarial examples (InCrediblAE), in: [18], 2024.</p>
      <p>[24] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, J. Dean, Distributed representations of words and phrases and their compositionality, Advances in neural information processing systems 26 (2013).</p>
      <p>[25] J. Pennington, R. Socher, C. D. Manning, GloVe: Global vectors for word representation, in: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), 2014, pp. 1532–1543.</p>
      <p>[26] J. Devlin, M. Chang, K. Lee, K. Toutanova, BERT: pre-training of deep bidirectional transformers for language understanding, CoRR abs/1810.04805 (2018). URL: http://arxiv.org/abs/1810.04805. arXiv:1810.04805.</p>
      <p>[27] L. Tunstall, N. Reimers, U. E. S. Jo, L. Bates, D. Korat, M. Wasserblat, O. Pereg, Efficient few-shot learning without prompts, arXiv preprint arXiv:2209.11055 (2022).</p>
      <p>[28] Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, V. Stoyanov, RoBERTa: A robustly optimized BERT pretraining approach, CoRR abs/1907.11692 (2019). URL: http://arxiv.org/abs/1907.11692. arXiv:1907.11692.</p>
      <p>[29] W. Antoun, F. Baly, H. Hajj, AraBERT: Transformer-based model for Arabic language understanding, arXiv preprint arXiv:2003.00104 (2020).</p>
      <p>[30] S. S. Aluru, B. Mathew, P. Saha, A. Mukherjee, Deep learning models for multilingual hate speech detection, arXiv preprint arXiv:2004.06465 (2020).</p>
      <p>[31] P. Delobelle, T. Winters, B. Berendt, RobBERT: a Dutch RoBERTa-based Language Model, in: Findings of the Association for Computational Linguistics: EMNLP 2020, Association for Computational Linguistics, Online, 2020, pp. 3255–3265. URL: https://www.aclweb.org/anthology/2020.findings-emnlp.292. doi: 10.18653/v1/2020.findings-emnlp.292.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>F.</given-names>
            <surname>Olan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Jayawickrama</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. O.</given-names>
            <surname>Arakpogun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Suklan</surname>
          </string-name>
          , S. Liu,
          <article-title>Fake news on social media: the impact on society</article-title>
          ,
          <source>Information Systems Frontiers</source>
          <volume>26</volume>
          (
          <year>2024</year>
          )
          <fpage>443</fpage>
          -
          <lpage>458</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Corney</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hasanain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Alam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Elsayed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Barrón-Cedeño</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Papotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Shaar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. D. S.</given-names>
            <surname>Martino</surname>
          </string-name>
          ,
          <article-title>Automated fact-checking for assisting human fact-checkers</article-title>
          ,
          <source>arXiv preprint arXiv:2103.07769</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Meng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Su</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Fan</surname>
          </string-name>
          , J. Han,
          <article-title>A survey on truth discovery</article-title>
          ,
          <source>ACM Sigkdd Explorations Newsletter</source>
          <volume>17</volume>
          (
          <year>2016</year>
          )
          <fpage>1</fpage>
          -
          <lpage>16</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>K.</given-names>
            <surname>Shu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sliva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tang</surname>
          </string-name>
          , H. Liu,
          <article-title>Fake news detection on social media: A data mining perspective</article-title>
          ,
          <source>ACM SIGKDD explorations newsletter 19</source>
          (
          <year>2017</year>
          )
          <fpage>22</fpage>
          -
          <lpage>36</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>D. M.</given-names>
            <surname>Lazer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Baum</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Benkler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. J.</given-names>
            <surname>Berinsky</surname>
          </string-name>
          ,
          <string-name>
            <surname>K. M. Greenhill</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          <string-name>
            <surname>Menczer</surname>
            ,
            <given-names>M. J.</given-names>
          </string-name>
          <string-name>
            <surname>Metzger</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <string-name>
            <surname>Nyhan</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          <string-name>
            <surname>Pennycook</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Rothschild</surname>
          </string-name>
          , et al.,
          <source>The science of fake news, Science</source>
          <volume>359</volume>
          (
          <year>2018</year>
          )
          <fpage>1094</fpage>
          -
          <lpage>1096</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>S.</given-names>
            <surname>Vosoughi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Roy</surname>
          </string-name>
          ,
          <string-name>
            <surname>S. Aral,</surname>
          </string-name>
          <article-title>The spread of true and false news online</article-title>
          , science
          <volume>359</volume>
          (
          <year>2018</year>
          )
          <fpage>1146</fpage>
          -
          <lpage>1151</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>F.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. S.</given-names>
            <surname>Sheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>A unified perspective for disinformation detection and truth discovery in social sensing: a survey, ACM Computing Surveys (CSUR) 55 (</article-title>
          <year>2021</year>
          )
          <fpage>1</fpage>
          -
          <lpage>33</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>X.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Chandramouli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. P.</given-names>
            <surname>Subbalakshmi</surname>
          </string-name>
          ,
          <article-title>Scam detection in Twitter</article-title>
          ,
          <source>in: Data Mining for Service</source>
          , Springer,
          <year>2014</year>
          , pp.
          <fpage>133</fpage>
          -
          <lpage>150</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>S. D.</given-names>
            <surname>Gollapalli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Du</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.-K.</given-names>
            <surname>Ng</surname>
          </string-name>
          ,
          <article-title>Identifying checkworthy cure claims on Twitter</article-title>
          ,
          <source>in: Proceedings of the ACM Web Conference 2023</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>4015</fpage>
          -
          <lpage>4019</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A.</given-names>
            <surname>Patwari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Goldwasser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bagchi</surname>
          </string-name>
          ,
          <article-title>Tathya: A multi-classifier system for detecting check-worthy statements in political debates</article-title>
          ,
          <source>in: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>2259</fpage>
          -
          <lpage>2262</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>S.</given-names>
            <surname>Shaar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hasanain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Hamdan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z. S.</given-names>
            <surname>Ali</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Haouari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nikolov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kutlu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y. S.</given-names>
            <surname>Kartal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Alam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Da San Martino</surname>
          </string-name>
          , et al.,
          <article-title>Overview of the CLEF-2021 CheckThat! Lab task 1 on check-worthiness estimation in tweets and political debates</article-title>
          ,
          <source>in: CLEF (working notes)</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>369</fpage>
          -
          <lpage>392</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>E.</given-names>
            <surname>Williams</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rodrigues</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Tran</surname>
          </string-name>
          ,
          <article-title>Accenture at CheckThat! 2021: Interesting claim identification and ranking with contextually sensitive lexical training data augmentation</article-title>
          ,
          <source>arXiv preprint arXiv:2107.05684</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>F.</given-names>
            <surname>Alam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Barrón-Cedeño</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. S.</given-names>
            <surname>Cheema</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hakimov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hasanain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Míguez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Mubarak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. K.</given-names>
            <surname>Shahi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Zaghouani</surname>
          </string-name>
          , et al.,
          <article-title>Overview of the CLEF-2023 CheckThat! Lab task 1 on checkworthiness in multimodal and multigenre content</article-title>
          ,
          <source>Working Notes of CLEF</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>H. T.</given-names>
            <surname>Sadouk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Sebbak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. E.</given-names>
            <surname>Zekiri</surname>
          </string-name>
          ,
          <article-title>Es-vrai at CheckThat! 2023: Analyzing checkworthiness in multimodal and multigenre</article-title>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>P.</given-names>
            <surname>Ivanov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Koychev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hardalov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <article-title>Detecting check-worthy claims in political debates,</article-title>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>